Seven people attended the SLUG[1] meeting on Mon 10 May 2010, titled "Simulating web users".[2] Rob showed us some of the stuff he'd been working on to automate testing of a web application. It raises an interesting point -- when your application is a web site, automated testing is a bit more complicated than a shell script wrapped around your program. You have to be a web browser!
[1] http://wiki.gnhlug.org/twiki2/bin/view/Www/SLUG
[2] http://slug.gnhlug.org/Members/rea/SLUG/slug-meetings/simulating-web-users

Rob briefly demonstrated Selenium[3]. From a distance, it's kind of like expect[4] for a web browser. Selenium has several parts. Selenium IDE is a GUI tool that hooks into Firefox and "records" actions you take as you browse. You can then review and modify the recorded actions. It has several output formats, most of which are programming languages (Java, Python, Ruby, others). The code it generates consists of calls to the Selenium API, in the language of your choice. You can then run that code as a unit test.

The API calls talk to Selenium RC (Remote Control). RC is a Java-based daemon which plays back the actions you recorded with IDE. It starts up the browser you ask for (Firefox, Safari, MSIE) and does what the API calls say to do. The browser runs hidden. RC channels the browser through an HTTP proxy it provides, so that it can inject the JavaScript code that does the "playback". There's also Selenium Grid, which lets you run many RCs.

[3] http://seleniumhq.org/
[4] http://expect.nist.gov/

Rob then moved on to his home-grown solution. He needed to get creative because they didn't really have test cases for what they wanted to test ("load testing with real world usage"). But he did have a bunch of Apache logs. The web app is basically "read only" (it's kind of like a special-purpose Google Maps), so he figured he could use the Apache logs to replay what real users have done -- and replay a bunch of them at once to simulate lots of users.

So he whipped up some Python code. First he had to read the Apache logs. For that, he found an existing log parser, apachelog[5]. It parses each line into a dictionary, where the dictionary keys are the Apache log field symbols (%t for time, %h for client host IP address, etc.). Rob wasn't familiar with those, so he wrapped it in a Python class with get methods.
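To make that concrete, here's a minimal sketch of the idea: a parser that turns one "combined" format log line into a dictionary keyed by those % symbols (mimicking what apachelog does -- this is my own stand-in, not apachelog's actual code), plus a hypothetical getter wrapper of the kind Rob described. The regex, class, and method names are all my invention:

```python
import re

# Stand-in for apachelog: parse one "combined" LogFormat line into a
# dict keyed by the Apache field symbols (%h, %t, %r, %>s, %b).
# The real library builds its regex from the LogFormat string;
# this sketch hardcodes the combined format.
COMBINED = re.compile(
    r'(?P<h>\S+) (?P<l>\S+) (?P<u>\S+) \[(?P<t>[^\]]+)\] '
    r'"(?P<r>[^"]*)" (?P<s>\d{3}) (?P<b>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    m = COMBINED.match(line)
    if m is None:
        raise ValueError("unparseable log line: %r" % line)
    g = m.groupdict()
    return {'%h': g['h'], '%t': g['t'], '%r': g['r'],
            '%>s': g['s'], '%b': g['b']}

class LogEntry:
    """Hypothetical wrapper with get methods, so callers don't have
    to remember the % symbols."""
    def __init__(self, line):
        self._fields = parse_line(line)

    def get_client_ip(self):
        return self._fields['%h']

    def get_time(self):
        return self._fields['%t']

    def get_request(self):
        return self._fields['%r']

    def get_status(self):
        return self._fields['%>s']

entry = LogEntry('127.0.0.1 - - [10/May/2010:19:30:00 -0400] '
                 '"GET /maps/tile/1/2/3 HTTP/1.1" 200 4123')
print(entry.get_client_ip())  # the %h field
```

The wrapper buys you nothing functionally; it just trades the cryptic % keys for self-describing method names, which is exactly the problem Rob said he was solving.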
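And the "replay a bunch of them at once" part might look roughly like this -- a deliberately naive sketch (one thread per client IP, replaying that client's requests in log order), not Rob's actual code. The function names are mine, and fetch is a stand-in parameter; real code would issue the HTTP request there, e.g. with urllib:

```python
import threading
from collections import defaultdict

def group_by_client(entries):
    """Group parsed log entries by client IP (%h), so each simulated
    user replays one real visitor's requests in log order."""
    sessions = defaultdict(list)
    for e in entries:
        sessions[e['%h']].append(e['%r'])
    return sessions

def replay_session(requests, fetch, results):
    # One thread == one simulated user: issue the recorded requests
    # back to back and collect whatever fetch() returns.
    for r in requests:
        results.append(fetch(r))

def replay_all(entries, fetch):
    """Start one thread per client and wait for them all to finish."""
    results = []
    threads = [threading.Thread(target=replay_session,
                                args=(reqs, fetch, results))
               for reqs in group_by_client(entries).values()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Passing fetch in keeps the concurrency separate from the transport, so you can test the threading with a stub before pointing it at a live server. Note this sketch glosses over exactly the complications Rob hit, described next.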
Then Rob moved on to simulating a web user. This turned out to be trickier than it would seem. For one, Apache only writes the log entry when the request is finished, so the log order is not necessarily the order in which the user did things. For another, modern web browsers make multiple requests simultaneously. So Rob had to dive into multi-threaded programming in Python. That proved an adventure in itself. Unfortunately, my understanding of Python is very limited, so most of this part was over my head. It seemed interesting, though!

[5] http://code.google.com/p/apachelog/

Thanks to Rob for once again coming up with an interesting "off the cuff" presentation. See you next month!

-- Ben

_______________________________________________
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/