Trevor Wagner

Project-Focused Software Engineer, QA Automation

Making Acceptance Testing as Boring as Possible: How a Team Moved Analysis and Quality Assurance Left Without Need for a Testable Build

Recently I feel like I've read a lot of blog posts that extol the benefits of moving testing left. As I understand it (and as experience reminds me), the idea is that if you move testing left, you have an opportunity to catch issues earlier because the automated test net (or the set of manual user acceptance testing practices you use) can be employed earlier. So if you test earlier, you'll likely catch issues earlier. The earlier your tests catch issues, the less expensive they are to fix. It works like magic: just test earlier.

While this seems reasonable in theory, experience tells me that testing isn't enough, no matter how early it happens. If it helps, I say this as somebody who has been part of a couple of different (successful) efforts to move testing left. Without a clear understanding of how everything fits together and what to expect, both testers and software engineers are effectively searching in the dark for a usable solution to the problem each has been tasked with solving.

In addition to teams that moved testing left, I have been a member of a team that successfully moved quality left. This was a different experience, especially when I consider that most of the work involved didn't rely so much on testing (to be clear, there was a lot of testing) as on helping to change the team's perspective on how we moved work from idea to implementation to evaluation. More specifically, we made testing our understanding of what we were building an important part of our process: within this sort of testing without testing, we found a way to make reexamining our understanding of what we were building (and how we planned to examine what we had built in user acceptance testing) as vital to delivering with quality as an extensive test net. We also put more sets of eyes (not just those of test engineers) on validating what we were building. A key component in this was how test planning fit into our development processes. It set us apart as a team within our organization, and as I understand it, it helped the company we worked for at the time grow. What I'd like to share in this post is what that looked like.

As a quick aside, I believe I also see an opportunity to explore the benefits and drawbacks of each approach (maybe a good idea for a future post), but that's not for today. For this post what I'd like to focus on is what it looked like when the team I was part of moved quality left.

Finally, before we start I'd like to make a quick confession: I originally completed a draft of this post that focused exclusively on test planning, as though test planning on its own was what made the dream work (I have since edited the post to address this more decisively). The idea that test planning pulled the majority of the weight in our successes doesn't ring true. Because test planning (as well as the way I write test plans) did seem to play a role, though, it may be worth a separate post later that talks about it specifically.

What We Did

When I joined the team we were in the midst of work on a high-value project that ended up lasting three or four months. I was a manual tester at the time. What we delivered was pretty good (and we did a pretty good job of evaluating the finished work product). After that project we formed (sort of normed) as a tiger team for a specific very-high-value project; after that project was over we kept going, delivering new functionality for the company's primary offering at a high level. Sometimes we built on existing functionality, but most of our work was greenfield. I won't go into too much depth on the implementation details, other than to say our work was varied, both from a technical perspective and from a value perspective: although it all related to the primary offering, each project did different things, leveraged different technologies, and related to the offering in different ways.

Early on I suggested in retrospectives (if I recall correctly) that it would be helpful to avoid the sort of design changes the team typically engaged in with stories in-flight. As far as I can recall (this far on: it's been several years), we had a habit of changing design within stories in-flight as a result of feedback received during development, and this habit had a tendency to invalidate the test plans I had written either for myself or for the user acceptance tester helping me execute testing. Basically we had ceremonies to define the work we were doing and review the work we had done, but between those ceremonies anything could change. Because the test plans I'd written were invalidated, we ended up with a lot of discovery work to do in order to get stories across the line once we got to testing them. To resolve this, we agreed as a team to be more deliberate about planning and more dedicated to following through on the work we had committed to.

At another point I remember asking the tech lead if it might be possible to improve the unit and integration tests run against builds before they were published for testing: I recall we were getting a lot of builds where, although the main system ran, the functionality under test (within that system) was not accessible for some reason. As a result, we would wait for builds, run the necessary setup (most of it manual), and find that we were unable to test. After the tech lead agreed to do this, we spent less time on setup for untestable builds.
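
For illustration only (this isn't the team's actual suite, and the host, routes, and marker name here are assumptions of mine), the kind of pre-publication smoke check we were asking for boils down to verifying not just that the build is up, but that the functionality under test is actually reachable:

import pytest
import requests

BASE_URL = "http://build-under-test.internal"  # hypothetical host for the candidate build


@pytest.mark.smoke  # custom marker; register "smoke" in pytest.ini if markers are strict
def test_main_system_is_up():
    # This was rarely the problem: the main system usually ran.
    assert requests.get(f"{BASE_URL}/health", timeout=5).status_code == 200


@pytest.mark.smoke
def test_functionality_under_test_is_reachable():
    # This was the missing check: the new functionality itself is accessible.
    response = requests.get(f"{BASE_URL}/reports/export", timeout=5)  # hypothetical feature route
    assert response.status_code != 404, "feature route missing from this build"
    assert response.status_code < 500, "feature route present but erroring"

Gating publication of a build on a handful of checks like these is cheap compared to the manual setup we were burning on builds that turned out to be untestable.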

At some point later I remember providing feedback that it seemed like we frequently tried completing stories with only a partial vision of what we were building: it seemed like either software engineers or some other subset of the team was missing part of the puzzle. We would build part of the functionality out, hand it over for testing, and find that some external dependency or some part of our design was missing because we hadn't considered it. I requested (or the team suggested in response; I can't remember distinctly) that we identify external dependencies and design gaps proactively and use spike stories to resolve those before a story could be put in flight.

Eventually I found a summary statement for what I was driving at in all of these requests: I wanted to limit the amount of discovery needed for a story in-flight. Specifically, I remember requesting that we make it a goal of our processes that, once a story was handed over for testing, the user acceptance testing be as boring as possible.

Later (if I recall correctly, at the suggestion of a project manager), I suggested that the test plans I wrote be reviewed and approved by the software engineer the story had been assigned to before I started testing. I would write test plans as soon as I could and post them in JIRA; when I suggested this, I requested that the team agree to modify its process so that, if a software engineer had not yet reviewed a test plan I had written, I would decline to test the story (at which point it had no chance of getting across the finish line). I asked whether the team agreed with this.

After a short discussion, the team voted: the motion passed.
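
As a sketch of the rule we voted on (the record fields and helper below are hypothetical, not anything we actually wired into JIRA), the gate amounted to something this simple:

from dataclasses import dataclass
from typing import Optional


@dataclass
class Story:
    key: str
    assignee: str                          # software engineer the story is assigned to
    test_plan_url: Optional[str]           # test plan posted to the ticket, if any
    test_plan_approved_by: Optional[str]   # who signed off on the plan, if anyone


def ready_for_acceptance_testing(story: Story) -> bool:
    # Decline to test until a plan exists and the assigned engineer has approved it.
    if not story.test_plan_url:
        return False
    return story.test_plan_approved_by == story.assignee


# Example: a plan is posted but not yet reviewed, so this story waits.
print(ready_for_acceptance_testing(
    Story("PROJ-123", "engineer_a", "https://jira.example.com/PROJ-123", None)))  # False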

What We Observed

As a team, what I believe we observed was that we delivered a lot of work at high quality. I'm just one former member (with my own perspective), but I believe I recall talking about it on the team, at least in retros. Once we shipped, we didn't seem to need to do a lot of bug fixing except for some very uncommon use cases. In the eighteen months I worked with the team we completed three or four different high-profile projects. As I understand it (having spoken with my cousin, who was an intern at the time), our team had a reputation in Support for high quality. And as I understand it our work got noticed at a higher level: we actually got called out (however indirectly) by executive management at a year-end company event in response to the growth we'd helped encourage in a market segment the company had already saturated.

On the team, I believe we observed (I know I did) that user acceptance testing seemed to proceed faster and with fewer issues than it used to. Bigger picture, things seemed to run more smoothly on the team than before, and it felt like everybody was contributing to it. We could generally take comfort in the understanding that what we'd test was what we'd planned on testing. We also typically didn't run into builds that were not testable. Generally things in testing got predictable enough that in daily stand-up meetings I could set a date I needed to see a test build by (or get a test plan reviewed by) in order to be able to complete testing in the current sprint. Sometimes software engineers would approach me proactively to let me know that something had changed.

In reviews of test plans, I can remember three things generally happening if the test plan wasn't approved without comment:

Software engineers would advise me that I was mistaken about how the functionality being built could be tested and that I should reconsider part of my test plan.

Software engineers reconsidered their designs for implementation in response to feedback I had provided (sometimes as a group), because my test plans had described functionality differently than they had understood it from the user story and discussions, or because I'd unearthed stakeholder concerns that had not yet been considered.

The test plans prompted more-general discussion about what we were trying to deliver (and how we could verify that what we had delivered was what we'd intended to).

I think a couple times we took stories out of development so that we could reevaluate them.

Analysis: What I Believe Happened

The most important thing that I think happened was that we committed to being deliberate (as a team) about both what we were building and how we would determine that we'd built what we intended to. The two actually went hand-in-hand. And to make them work together we had to commit to working as a team.

To be clear, I believe this is different from Waterfall: in Waterfall, you design your work product (including inputs and outputs) on the front end, and everything revolves around making sure that what you built was what was outlined in the design. The design becomes something that everybody should be able to work from independently, because the design is generally (for better or for worse) unquestionable. We adjusted where needed, but we also used teamwork to minimize the amount of discovery that was needed on either side (by the software engineers or by QA). And I can remember user stories in-flight that we actually removed from development because we needed to regroup and understand one side of that relationship or the other better.

We also implemented two boundaries in our development process, each with a rather high bar: one for work we brought into development, and another for work that had been completed. Both focused on making sure we were clear about what we intended to build and what we intended to test.

To use another experience for comparison, what I believe happened was very similar to what I observed working in QC at a print shop: by putting many sets of eyes on examining what we were building (and comparing what we were building to what we had been tasked with building), we reduced the amount of testing and discovery that needed to happen in QC/UAT.

What Seemed to Work

Among other things, here are a few that I would point out worked:

We successfully reduced the amount of complexity or complication we committed to within a story. If something (a design, an environment, a problem, etc.) was so complex that we couldn't articulate how we'd go about providing a solution, then we needed a spike story to investigate. By the time we got to sprint planning, we wanted a story ready to go: no unsatisfied external dependencies, and no pending investigations.

We made testing a first-class citizen. I don't mean Test Engineering here, and I don't mean Software QA. What we made a first-class citizen was the practice of examining and reexamining not just our plans and our work output but our understanding of what we were building, our plans for going about building it, and how we would eventually confirm that we'd built it. By requiring sign-off for test plans in our process, we made it a necessary condition for shipment both that we clearly understood what we were building and that we knew how we'd examine what we'd built in user acceptance testing. And in backlog refinement and sprint planning, I remember offering "if it helps, I can test that" at points where it seemed like others on the team weren't sure how to proceed.

At another point (in what for me was a career highlight) I remember a software engineer staring at me, visibly somewhat concerned, after we'd just committed to the umpteenth one-week sprint with 50 story points (by the way: this velocity only lasted for four months because it was not generally sustainable). When I asked him what was up, he asked me, "How are we going to test all of this?"

I remember saying "Relax: we'll figure it out."

Another thing that seemed to work is that we gave ourselves room to navigate the complexity of what we were trying to build. We actively looked for ambiguities, missed connections, or unsatisfied requirements in what we understood we were building; if we found them, we called them out and made adequate space for them.

In all of the above, we held ourselves accountable to each other. Nobody who raised a concern had reason to expect they'd be brushed off or dismissed.

I believe this also changed somewhat our understanding of the roles we played on the team. On teams that approached this division of labor more conventionally, my experience has been that the work Software QA/Test Engineering is expected to do revolves around validating work output, which to me seems to suggest: software engineers build what they believe is a viable solution, and UAT either produces evidence that the solution is viable or finds reasons why it is not. On the team that's the main focus of this post, the arrangement was that software engineers had developed confidence that their work output was viable, and UAT evaluated that work to gather information (and report) on what worked and what didn't. Beyond this, the team worked together to confirm that what would be evaluated in UAT and what would be developed by software engineers was the same thing.

To me it seems like the difference here is what role QA is expected to play on the team: is QA expected to play the role of goalie (with however much defense), or is there enough room for QA to play forward (or even scrimmage) with software engineers? Was QA a coach for us? Did I play the role of quality ambassador as opposed to quality analyst or quality assurer? Could have been, but again I don't feel like that tells the story completely, from what I can recall. I believe that QA became another set of eyes for the rest of the team as opposed to just the last stop before the sweet release of general availability. But to me it seemed like success was about more sets of eyes on the work than just those made available within QA.

What Seemed Not to Work

This sort of team composition and culture doesn't come naturally to everybody, and even for some of us for whom it worked, it took work to make it work. In my experience it's definitely not conventional. My experience is that the conventional relationship between software engineers and QA practitioners is somewhat reactive, built around a generally expected process flow: after the software engineers implement what the product owner asked them to, it's up to QA to validate the finished product on its last stop between development and delivering value to the organization in a release.

As an aside, this seems more like Waterfall than the alternative that I describe as the subject of this post, just in smaller increments than traditional Waterfall.

What's more, I'm not aware of anybody (myself included) who appreciates being asked to overthink the work they are doing. For anybody accustomed to not spending a lot of time on test plans, I can see where the processes we used (however flexible) seemed like more than the norm. I'll push back on suggestions that we overthought anything, though.

Although the things we did as an alternative seemed to work for most of the team, it didn't work for everybody. Occasionally we had a software engineer join the team who disagreed with the requirement that test plans be agreed on before testing. Or maybe they disagreed with the amount of discussion we undertook in sprint planning or refinement. Whatever it was, sometimes they would post a message or sit up straight in a meeting to say pointedly "Why are we even doing this?"

Also, we were assigned a new product owner every four to six months, and every time we got a new product owner, that product owner seemed focused on helping us do better by finding a way to streamline our operations. For reference: as far as I'm aware we were one of the highest-performing teams in the organization. So any time we got a new product owner, we needed to be patient with each other and seek opportunities to make it clear we were listening to each other. Once we started finding and executing on those opportunities, though, things seemed to go better.

Conclusion

To be clear, I don't believe that the team's successes (even the ones I outline above) necessarily revolved around work I did. I do believe there's a strong case to be made that I helped (which I hope justifies my place in the story this post tells). I believe that the team's successes followed from our willingness to improve, the courage we had to share (and, if needed, challenge each other) when we found something, and the humility we showed in being willing to listen to each other when somebody made a suggestion (even suggestions we didn't like -- myself included). We also had unbeatable help in the form of coaches and leadership in what seemed like the right places. If I had to pick a limited set of things that I believe made that team's work a success, I'd start there. The things that seem to have made that possible aren't, I believe, the sort of things any one person can carry.

What I believe worked about the test plans I wrote is that they helped us envision more clearly (and question more incisively) what we understood we were building, before and as we built it. Conventionally it seems like teams build the thing they understand they were asked to build, then hand that thing to QA (in practice doing the work of QC), at which point the team discusses what they should have built and how it should actually be tested: at that point, the team goes back to the drawing board (reevaluating either solution design or test design), the team makes the best of what they've already built, or UAT adapts to what it can do on the spot with the amount of time left in the sprint. I welcome criticism that I'm being melodramatic here, and it's clear to me that not every team does this, but the approach seems pervasive enough that I've seen it happen (although to be clear: not on this team) even when building endpoints for a REST API from a formal specification.

What we made work as a team was pushing this exchange (which typically happens informally, late in development) left, so that instead of waiting until we were ready to ship to have discussions about whether what we'd built was ready to ship, we started early designing how we'd know we were ready to ship -- not just in terms of what we'd build but in terms of what we'd test. And by doing this we effectively changed the composition of the team without changing many of its members. Our goal wasn't just to get a workable solution across the line; it was to build what we'd expected to. And once we'd reached agreement on that, we tested our design -- in some cases before we wrote any code. And at the same time that we tested our design, we tested our inspection checklists. Test planning ended up being a vital part of all of that, but the humility, courage, and willingness to improve that made all of this work are what I believe made the dream work here.
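
Taking the REST API example above, one concrete way to design "how we'd know we were ready to ship" early is to turn the formal specification into a contract check that can run long before UAT. This is a sketch under assumptions: the endpoint, the schema fragment, and the use of the jsonschema library here are mine for illustration, not the team's actual tooling.

import requests
from jsonschema import ValidationError, validate

# Hypothetical fragment of the formal specification for a single endpoint.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["id", "status", "lineItems"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string", "enum": ["open", "fulfilled", "cancelled"]},
        "lineItems": {"type": "array", "items": {"type": "object"}},
    },
}


def matches_contract(base_url: str, order_id: str) -> bool:
    # Fetch the endpoint and compare the response to the agreed contract,
    # rather than discovering a mismatch during user acceptance testing.
    response = requests.get(f"{base_url}/orders/{order_id}", timeout=10)
    response.raise_for_status()
    try:
        validate(instance=response.json(), schema=ORDER_SCHEMA)
        return True
    except ValidationError as mismatch:
        print(f"Contract mismatch: {mismatch.message}")
        return False

A check like this can be agreed on (and even written) alongside the design, so a mismatch surfaces as a conversation during development rather than as a surprise in UAT.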

Once we got to a point where a solution was ready to complete and ready for user acceptance testing, we tested that, too. That part was normally boring. But that was an important reason for confidence that we were doing it right.