Seven people attended the SLUG[1] meeting on Mon 10 May 2010, titled
"Simulating web users".[2]  Rob showed us some of the stuff he'd been
working on to automate testing of a web application.  It raises an
interesting point -- when your application is a web site, automated
testing is a bit more complicated than a shell script wrapped around
your program.  You have to be a web browser!

[1] http://wiki.gnhlug.org/twiki2/bin/view/Www/SLUG
[2] http://slug.gnhlug.org/Members/rea/SLUG/slug-meetings/simulating-web-users

  Rob briefly demonstrated Selenium[3].  From a distance, it's kind of
like expect[4] for a web browser.  Selenium has several parts.
Selenium IDE is a GUI thing that hooks into Firefox and "records"
actions you take as you browse.  You can then review and modify the
recorded actions.  It has several output formats, most of which are
programming languages (Java, Python, Ruby, others).  The code it
generates consists of calls to the Selenium API, in the language of
your choice.  You can then run that code as a unit test.  The API
calls talk to Selenium RC (Remote Control).  RC is a Java-based daemon
which plays back the actions you recorded with IDE.  It starts up the
browser you ask for (Firefox, Safari, MSIE) and does what the API
calls say to do.  The browser runs hidden.  RC channels the browser
through an HTTP proxy it provides, so that it can inject JavaScript
code to do the "playback".  There's also Selenium Grid, which lets you
run many RCs.

[3] http://seleniumhq.org/
[4] http://expect.nist.gov/

  Rob then moved on to his home-grown solution.  He needed to get
creative because they didn't really have test cases for what they
wanted to test ("load testing with real world usage").  But he did
have a bunch of Apache logs.  The web app is basically "read only"
(it's kind of like a special-purpose Google Maps), so he figured he
could use the Apache web logs to replay what real users have done.
And replay a bunch of them at once to simulate lots of users.  So he
whipped up some Python code.
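
  To give a rough idea of what "replay a bunch of them at once" can
look like, here is a minimal sketch.  It is not Rob's code.  The
function names, the (client, path) tuples, and the injected fetch
callable are all assumptions made for illustration.  It treats each
client address as one simulated user and replays every user's requests
in its own thread.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_client(requests):
    """Group (client, path) pairs into one ordered path list per client."""
    sessions = defaultdict(list)
    for client, path in requests:
        sessions[client].append(path)
    return dict(sessions)


def replay_session(fetch, paths):
    """Replay one simulated user's requests in order; return the results."""
    return [fetch(path) for path in paths]


def replay_all(fetch, requests, max_users=50):
    """Replay every client's session concurrently, one thread per user.

    fetch is any callable taking a path; it is a hypothetical hook, so
    the caller decides how a request is actually made.
    """
    sessions = group_by_client(requests)
    if not sessions:
        return {}
    workers = min(max_users, len(sessions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            client: pool.submit(replay_session, fetch, paths)
            for client, paths in sessions.items()
        }
        # result() blocks until that user's session has finished.
        return {client: f.result() for client, f in futures.items()}
```

  Against a real server, fetch could be a small function that calls
urllib.request.urlopen() on a base URL plus the path and returns the
response status.  Passing it in as a parameter keeps the replay logic
testable without a network.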

  First he had to read the Apache logs.  For that, he found an
existing log parser in apachelog[5].  It parses each line into a
dictionary, where the dictionary keys are the Apache log field symbols
(%t for time, %h for client host IP address, etc.).  Rob wasn't
familiar with those, so he wrapped it in a Python class with get
methods.  Then Rob moved on to simulating a web user.  This turned out
to be trickier than it would seem.  For one, Apache only writes the
log entry when the request is finished, so the log order is not
necessarily the order the user did things in.  For another, modern web
browsers make multiple requests simultaneously.  So Rob had to dive
into multi-threaded programming in Python.  That proved an adventure
in itself.  Unfortunately, my understanding of Python is very limited,
so most of this part was over my head.  It seemed interesting though!

[5] http://code.google.com/p/apachelog/
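
  The parsing and wrapping steps above can be sketched roughly as
follows.  This is a stand-in, not apachelog's API and not Rob's class:
it uses a standard-library regular expression, assumes the logs are in
Common Log Format, and the LogEntry class and its method names are
hypothetical.  It also assumes %t records when the request was
received (as the Apache 2.x documentation describes), so sorting on it
approximates the order the user did things in.

```python
import re
from datetime import datetime

# Common Log Format: %h %l %u %t "%r" %>s %b
CLF_PATTERN = re.compile(
    r'(?P<h>\S+) (?P<l>\S+) (?P<u>\S+) \[(?P<t>[^\]]+)\] '
    r'"(?P<r>[^"]*)" (?P<s>\d{3}) (?P<b>\S+)'
)


def parse_line(line):
    """Parse one log line into a dict keyed by Apache format symbols."""
    match = CLF_PATTERN.match(line)
    if match is None:
        raise ValueError("not a Common Log Format line: %r" % line)
    g = match.groupdict()
    return {"%h": g["h"], "%l": g["l"], "%u": g["u"], "%t": g["t"],
            "%r": g["r"], "%>s": g["s"], "%b": g["b"]}


class LogEntry:
    """Hypothetical wrapper hiding the format symbols behind get methods."""

    def __init__(self, fields):
        self._fields = fields

    def get_client(self):
        return self._fields["%h"]

    def get_time(self):
        # %t looks like: 10/May/2010:19:02:11 -0400
        # (%b month parsing assumes an English/C locale.)
        return datetime.strptime(self._fields["%t"], "%d/%b/%Y:%H:%M:%S %z")

    def get_path(self):
        # %r is the request line, e.g. 'GET /map HTTP/1.1'
        parts = self._fields["%r"].split()
        return parts[1] if len(parts) > 1 else ""

    def get_status(self):
        return int(self._fields["%>s"])


def in_request_order(lines):
    """Return entries sorted by request time rather than by log position."""
    entries = [LogEntry(parse_line(line)) for line in lines]
    # sorted() is stable, so entries within the same second keep log order.
    return sorted(entries, key=LogEntry.get_time)
```

  Since %t only has one-second resolution, this is an approximation:
requests a browser fired off within the same second cannot be told
apart by timestamp alone.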

  Thanks to Rob for once again coming up with an interesting "off the
cuff" presentation.

  See you next month!

-- Ben