On Software Testing | Ruminations

Over the past several years at the University, we’ve had to hire for a variety of programming and programming-related positions. One of the interviewing practices we’ve adopted is to give candidates a programming assignment which they can complete at home, at their own pace using any resources they can muster. Because these conditions approximate the conditions under which they would normally be working, we find this to be a really good indicator or what sort of work they’ll do if hired. (Besides, it’s one of the practices from The Joel Test! We currently score about 8.5/12.)

One of the jobs we’ve been hiring for is what I refer to as the “Quality Assurance Czar” — someone who will spearhead and organize our efforts to make sure the code we’re fielding is as awesome as we want it to be. The programming test for this includes both writing a small program to compare two hands in poker and, more importantly, writing a suite of tests to verify that the program works correctly. (This is not the exact programming challenge we use. It was actually one we considered, but it’s a thornier problem than one would guess at first glance, and there are a number of published algorithms out there already. We decided to make up our own problem so that applicants couldn’t just google for an answer.)

A few months back, one of my good friends — let’s call him Larry — interviewed for this position. He did a fine job in the interview, and while his code was very good, it didn’t include the comprehensive testing we were looking for. He wrote a small PHP application that generated a couple of random hands and then compared them, relying on the person running the program to verify that the winning hand the program chose was correct. His solution was 100% accurate, but didn’t include any automated tests, as he had a hard time imagining what sort of testing would have been useful in that case. Here’s a note he sent after the position had been filled:

I was real proud of my [Poker] app, the one that was supposed to include a testing approach. Now I’ve tried Selenium, and I still don’t see how to test my app, because it has no user inputs.

What did you guys ever have in mind?

Okay, I can see these tests:

entering the URL brings up the page.

A refresh causes a change to the page (a new checkers arrangement).

But that don’t seem adequate.

He is, of course, completely correct. While this would test a few trivial aspects of the application’s functioning, it would leave the important and juicy bit of code, analyzing the poker hands, completely untested. (This is called “incomplete code coverage” in the argot.)

There are several things one can do to make this application more testable. First off, one needs to create a mechanism to feed the program a specific input. Once you can do that, you can give it a couple of predetermined hands and verify that it chooses the correct one. Doing this with a variety of different hands will demonstrate that the program is working correctly in those cases.

There are, however, about 2,400,000,000,000,000,000 combinations of two 5 card hands. For obvious reasons, we can’t test all of those. How do we choose which ones to check?

Let’s start with easy ones. Verify that a flush beats a pair. Verify that a full house beats a flush. Verify that four of a kind beats three of a kind. Verify that a pair of 8s with hearts and spades beats a pair of 8s with clubs and diamonds.

Once you have the basics covered, we want to look for “edge cases”. These are where our data bumps up against boundaries of some kind. They’re very important because “off by one” errors are very common in programming, and these tests help to expose those errors. In poker, for example, we would want to check that an “A2345” hand would count as a straight, and that a “10JQKA” would also count as a straight, but that “QKA23” would not.

By this point, we’ve accumulated perhaps a few dozen tests — a miniscule percentage of the possible total, but strategically enough chosen that we have confidence that things are working as they should be. If we later discover that we’ve missed something and that our program is inaccurate in some cases, we simply add another test that demonstrates the failure, and then work on our algorithm until all the tests are once again succeeding.

Next, it’s important to try invalid data and make sure that our program handles the error condition as we would expect. Asking the program to evaluate a hand with 5 aces of spades is meaningless, and should generate an appropriate message for the user. (Something like “You’re a dirty cheating dog” might be appropriate.) Having the 17 of acorns as one of the cards in your hand is similarly out of bounds, and should also trigger an appropriate error.

In Larry’s program, all of the logic was stored in a single PHP file. Though it’s a bit more work to set up, I would recommend using a Model-View-Controller design pattern, and splitting the “evaluate the poker hand” portion of the program from the “show the poker hand on screen” portion and the “handle user input” section. By doing so, we can test these portions of the code independently. If we know that the code that evaluates the hand does so correctly, but it’s showing the wrong answer on the screen, we have immediately narrowed the code that might contain the bug to just the bit that’s responsible for presentation.

Finally, once we’re using MVC, it becomes practical to use one of the many excellent unit testing frameworks that encapsulate a lot of hard-won experience doing effective testing. Since Larry was working in PHP, PHPUnit and SimpleTest appear to be decent options. Using an established testing framework has a number of advantages: the code for common testing tasks (like logging into a site) will already be written, it becomes easier to integrate our tests with other systems, test output can be reported in succinct, attractive forms, and other programmers will have an easier time understanding how our tests work.

So, in summary, we’ve updated our program to accept input, fed it a number of test cases and verified that it’s returning the results we expect, adopted MVC and a standard test framework. We now have both a high degree of confidence that our current code is working as expected and a “safety net”: if we decide to refactor the program or change the algorithm we’re using to compare hands, we’ll know immediately if anything we’ve done has broken it.

For a more comprehensive introduction to testing and using it to drive development, see this article. And if you’ve followed this, want to learn more, and are interested in a job, our QA Czar position is open once again!