Testing Gmail’s new look
Editor's note: This post, like yesterday's, is more technical than most posts here, but we thought some of you might find it interesting to look inside how Gmail works.
Yesterday, we talked about how we make changes like the new look to Gmail. The new look is not just visual, but involves completely different code in the interface. Testing a large user interface (UI) change like the one we launched for Gmail is foremost a permutation problem. Because all the Gmail features we wrote while we developed the new UI had to work both there and in the existing UI, we basically needed to double our testing. Plus, the new UI has to work in many browsers and in every language Gmail is available in, which means even more testing -- and by testing, we mean functional testing, latency testing, usability testing... you get the idea! The only way to handle all of these moving parts is through a) test automation, and b) using the new look.
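To get a feel for why this is a permutation problem, here is a minimal sketch in Python that enumerates a test matrix. The browser, language and test-type lists are made-up placeholders for illustration, not Gmail's actual configuration, and `build_test_matrix` is a hypothetical helper.

```python
from itertools import product

# Hypothetical dimensions, chosen only to illustrate the arithmetic.
UI_VARIANTS = ["old", "new"]
BROWSERS = ["chrome", "firefox", "safari", "ie"]
LANGUAGES = ["en", "de", "ja"]
TEST_KINDS = ["functional", "latency"]


def build_test_matrix(ui_variants, browsers, languages, test_kinds):
    """Return every (ui, browser, language, test kind) combination."""
    return list(product(ui_variants, browsers, languages, test_kinds))


# One UI alone already needs browsers x languages x test kinds runs.
old_only = build_test_matrix(["old"], BROWSERS, LANGUAGES, TEST_KINDS)

# Adding a second UI doubles the whole matrix.
full = build_test_matrix(UI_VARIANTS, BROWSERS, LANGUAGES, TEST_KINDS)

print(len(old_only), len(full))
```

Every extra browser or language multiplies the total rather than adding to it, which is why running all of these combinations by hand does not scale.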
We use automated tests as much as possible: we test whether code changes lead to functional regressions, how they affect speed and our servers, whether the UI breaks in any of the major browsers, and more. The scalable build and test infrastructure at Google allows us to run these tests automatically after every single(!) code change. However, a major UI change like this really tests the stability of your tests. If a test relies too much on the structure of the UI or on one specific element, then the test starts failing -- not because the functionality is broken, but because that element is missing or different in the new UI. Luckily, we learned this lesson many years back and most of our tests did not have this problem.
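To illustrate what "relying too much on the structure of the UI" can look like, here is a small Python sketch. It models a UI as a nested dictionary and compares a lookup by position with a lookup by a stable identifier. The tree layout, the ids and both helper functions are assumptions made up for this example, not Gmail's real test code.

```python
def find_by_path(node, path):
    """Brittle lookup: follow child indexes, so it depends on the layout."""
    for index in path:
        node = node["children"][index]
    return node


def find_by_id(node, element_id):
    """Sturdier lookup: depth-first search for a stable id, wherever it sits."""
    if node.get("id") == element_id:
        return node
    for child in node.get("children", []):
        found = find_by_id(child, element_id)
        if found is not None:
            return found
    return None


# Hypothetical old layout: the compose button is the first item in a toolbar.
OLD_UI = {"id": "root", "children": [
    {"id": "toolbar", "children": [{"id": "compose", "label": "Compose Mail"}]},
    {"id": "inbox", "children": []},
]}

# Hypothetical new layout: the button moved into a sidebar, after a logo.
NEW_UI = {"id": "root", "children": [
    {"id": "sidebar", "children": [
        {"id": "logo"},
        {"id": "compose", "label": "COMPOSE"},
    ]},
    {"id": "inbox", "children": []},
]}

# The positional lookup finds the button in the old UI but the logo in the new one.
old_hit = find_by_path(OLD_UI, [0, 0])
new_hit = find_by_path(NEW_UI, [0, 0])

# The id-based lookup finds the button in both layouts.
old_button = find_by_id(OLD_UI, "compose")
new_button = find_by_id(NEW_UI, "compose")
```

A test built on the positional lookup would fail against the new layout even though composing mail still works, while the id-based one keeps passing as long as the button keeps its identifier.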
But even the best automated tests can’t guarantee that everything is working well and that the visuals are pleasing. The only way to find out is to actually use the new look. For Gmail, we have a special environment that gets updated every night with the latest stable code. Almost all Gmail engineers and a handful of other Googlers use this environment for their real Gmail usage. But it turned out that even daily updates were too slow for the rate of code change with the new look. So, we created an environment that updates every hour with the latest stable code. This version of Gmail was used by all engineers who worked on the redesign. It allowed us to test code changes very quickly on the real system. We were able to find many functional and usability issues here. And because we used this system and no engineer likes their email to be broken, issues were fixed very quickly. We can only do this because our automated tests give us very good coverage. When all these tests pass, we can be sure that most of the Gmail functionality is working. However, there could still be usability, color, layout or other problems that tests can't catch.
Gmail’s new look also put a lot of additional load on our testing team. They had to keep up with a high rate of change, test critical functionality quickly and triage a lot of reported issues. Plus, they had to test new features in both the old publicly available UI and the new unlaunched UI. The dedication of our testing team helped us catch bugs early so we could fix them in preparation for launch.
Once we felt that the new look was good enough to be used by others, we turned it on for all Googlers. At Google, we "eat our own dogfood," meaning we use new products and features ourselves before releasing them to the public. Often, this is a very humbling experience. The shiny new features that we just developed and are so proud of are now used by people, including sales teams, managers and other non-engineers, who just want to get their job done. And believe me, Googlers are not shy when it comes to feedback! But for a project like this one, this step is absolutely critical. Our different teams at Google tested Gmail in all kinds of use cases, and the feedback that we received from this phase was invaluable. It helped us put the final touches on the new look and get ready for the usability tests that were previously discussed.
We hope you've enjoyed this look into the design, development and testing of Gmail's new look.