I am creating a web app that has several survey pages where there are a
total of 120 or more radio buttons and check boxes on one page.  I have been
seeing average page render times on the order of 2 to 2.5 seconds (some
times as bad as 5 seconds) with a single user hitting the site.  I was going
to put off my investigation into this problem until the site was in a more
finished form.  But since this topic has come up and because I needed
something to keep me busy until I go on vacation Wednesday, I decided to
look into it now.

After running some tests which I'll describe below, I'm starting to feel
like the performance problem reported in TheServerSide case study was caused
by excessive garbage collection and not by the use of reflection to display
and update form values.  Also their choice to use incremental garbage
collection in their tests probably made the struts app look far worse then
it would have if it had been run with standard garbage collection.

Like others who have posted on this subject recently, my initial guess was
that reflection was a likely cause of the performance bottleneck.  But there
was one thing that seemed to contradict my theory.  While it was true that
it took 2 seconds or more to render the page to the browser; when I clicked
the submit button after checking all check boxes, the form bean was updated
and the main page was rendered almost instantly.  So I decided to figure out
just how much time was being spent doing reflection.  My testing showed that
time spent on reflection to render the page was consuming so little time
that it couldn't even be measured.  In my opinion, the big performance
problem is caused by garbage collection firing much more often then you
would expect.  Perhaps making changes which reduce the number of objects
being created and destroyed for each tag invocation is the best way to
improve performance.


How I Tested For Time Spent On Reflection
-----------------------------------------

To prepare for the tests, I made two test JSP pages.  Both pages consisted
of 150 html:checkbox tags arranged in 15 rows with 10 checkboxes per row.
The first test page contained html:checkbox elements with property values
similar to "survey(questionName).selectedValues[5]".  On the second test
page, I removed all property attributes from the html:checkbox elements thus
leaving them with just a value="<some value>" attribute.  Finally,
struts-html.tld was modified to make the html:checkbox "property" attribute
optional.

Looking at CheckboxTag.java and RequestUtils.java, you can see that if the
"property" attribute is not supplied to the html:checkbox tag, then
"RequestUtils.lookup(pageCOntext, name, property, null)" in
CheckbopxTag.java will simply return the form bean which is retrieved via a
call to pageCOntext.findAttribute().  The beanutils are never called if
property == null.

So how much faster was the second test page (which did not use reflection)
rendered then the first test page which did us e reflection?  The time
difference was too small to be measured.  Both test pages took anywhere from
2 seconds to six seconds to render with the average time being around 3.2
seconds.


So Reflection Wasn't The Problem - More Experiments To Get At The Problem
-------------------------------------------------------------------------

With reflection eliminated, my next suspicion was that the app server
(tomcat 4.0.6 bundled with NetBeans 3.5) was the problem.  I had read that
many performance enhancements had been added since tomcat 4.0.6.  Looking at
the .java code produced by the JSP compiler, I could see room for
optimization but there was nothing that could explain the terrible page
rendering times I was seeing.

So to test it out, I copied the second test page and replaced all
html:checkbox tags with html:messages tags.  In the context of my test page,
the html:messages tag would simply look for messages at Globals.ERROR_KEY.
Since nothing would be found at that key, the tag would just render nothing.

This test page which had 150 html:messages tags on it typically rendered in
10 to 20 milliseconds.

Finally just for the heck of it, I made a test JSP page which just contained
150 <input type="checkbox"...> tags.  This page typically render in 0 to 10
milliseconds.


Effect Of Changing To Incremental Garbage Collection
----------------------------------------------------

Up until now, all tests were performed with "java -server ...".  By now I
was getting convinced that the problem was that garbage collection was
happening every time one of the first two test pages were being rendered and
that is what was killing the performance.  Turning on GC logging confirmed
that.  So, I decided to turn on incremental garbage collection.  Wow, what a
difference.  Things started taking twice as long!  Looking at the GC log, I
could see that the GC was firing pretty much constantly (like you would
expect) but that many of those incremental GC's were taking .2 to .9
seconds.  The result was much worse performance on the first two test pages
and even the third test page (with html:messages tags on it was almost twice
as slow.  The last test page which just contained no struts tags except for
html:form and html:submit still took from 0 to 10 milliseconds but I thick
this was just due to limitations in accuracy of the system timer.


Test Environment
----------------
Intel P4 2.8 GHZ 512MB
Window XP
JDK 1.4.1.4.1_04-b01
Tomcat 4.0.6 running under NetBeans 3.5 (normal execution, not debug)
Struts 1.1


Kurt

-----Original Message-----
From: Craig R. McClanahan [mailto:[EMAIL PROTECTED]
Sent: Saturday, August 02, 2003 5:37 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [BeanUtils] Interesting Microbenchmarks


TheServerSide (www.theserverside.com) recently published a case study that
attempted to analyze the performance characteristics of a variety of
middleware configurations, including two J2EE app servers and Microsoft's
.NET platform.  One of the hot spots that showed up in the Java based
figures was identified as heavy use of reflection (the test app was Struts
based and therefore uses commons-beanutils), but the report did not detail
these findings.

Despite the fact that the test condtions were very artificial (the
database was tuned in a way that it was never the hotspot, which is unlike
virtually any high-use web application in production), it prompted a bit
of discussion over whether using reflection was a good idea.  This in turn
prompted me to write a couple of simple microbenchmarks to measure the
behavior of the most commonly used BeanUtils and PropertyUtils methods for
copying sets of properties -- copyProperties() and populate().

Legend:
======

  "bean" - A plain old Java object (POJO) with eight properties,
           one of each of the standard primitive types and one
           String (org.apache.commons.beanutils.BenchBean)

  "dyna" - An instance of org.apache.commons.beanutils.BasicDynaBean
           with exactly the same properties and types.  The property
           types for primitives in a DynaBean are the wrapper classes
           such as java.lang.Integer.

  "map"  - An instance of HashMap with eight entries pointing at
           exactly the same values.  The values for primitives
           of the corresponding wrapper class types.

  "strs" - An instance of HashMap with eight entries pointing at
           the same values, but all Strings (simulating request
           parameters that are used to populate a form bean in Struts).

  "Dura #1" - Duration for 100,000 loops on my 1ghz desktop (milliseconds)

  "Dura #2" - Duration for 50,000 loops on my 500mhz laptop (milliseconds)

I tested all the legal combinations of an origin object (bean, dyna,
map, or strs) to a destination object (bean or dyna).  The tests were run
on a 1 gigahertz PC running Linux (Red Hat 9) with Sun's JDK 1.4.2, and on
a 500mhz laptop running Win2K with Sun's 1.4.2.  The loop count was set to
100,000 in the PC case, and 50,000 in the laptop case, so the overall
execution times should be roughly the same (since the desktop is roughly
twice as fast).  Execution time was measured in milliseconds.  For each
test, I ran an untimed execution of the specified number of iterations (to
prime HotSpot), followed by a timed execution of the same number of
iterations.

To run these for yourselves, check out the HEAD branch of beanutils (or
grab a source nightly distro dated 20030803 or later), and run "ant
bench".

Results for BeanUtils (which performs type conversions):
=======================================================

Method         Dest Orig Dura #1 Dura #2
-------------- ---- ---- ------- -------

copyProperties bean bean  17,703  17,075
copyProperties dyna bean  17,463  17,074
copyProperties bean dyna  15,413  15,042
copyProperties dyna dyna  13,020  12,077
copyProperties bean map   15,626  15,192
copyProperties dyna map   12,999  12,237
copyProperties bean strs  16,302  16,093
copyProperties dyna strs  13,940  13,129

populate       bean map    7,438   7,481
populate       dyna map    5,082   5,368
populate       bean strs   5,768   5,739
populate       dyna strs   3,865   3,746


Results for PropertyUtils (no type conversions):
===============================================

Method         Dest Orig Dura #1 Dura #2
-------------- ---- ---- ------- -------

copyProperties bean bean   5,595   5,107
copyProperties dyna bean   4,567   4,126
copyProperties bean dyna   3,791   3,675
copyProperties dyna dyna     844     681
copyProperties bean map    3,938   3,755
copyProperties dyna map      931     772


Observations:
============

* The absoute numbers are not particularly interesting, other than the
  fact that they confirm my belief that copying properties (whether
  using reflection or not) is very unlikely to be a bottleneck -- even
  on bargain basement hardware (in today's world), we're talking about
  copying all the properties in anything from 0.8 microseconds up to
  177 microseconds.

* This is a very limited test case, with only one pattern of properties.
  I suspect it will be on the worst case side in terms of the number
  of conversions required, compared to how most people use these methods
  in real applications.

* Increasing the number of properties being copied will (of course)
  increase the execution times, but I suspect it will not change the
  ratios between alternatives much.  I've done a few informal tests
  with more and fewer properties, but this will be worth investigating
  in more detail as well.

* Only the cases where a "bean" was the destination or origin use
  reflection at all.  The "map" case is just doing HashMap gets and puts,
  while the "dyna" case is really the "map" case plus type safety
  checks.

* In all scenarios, using "dyna" was faster than using "bean", ranging
  from statistically insignificant to pretty substantial percentages.
  For Struts based apps in particular, that means using DynaActionForm
  for form beans, with the current BeanUtils code, will give better
  performance than using standard ActionForm bean classes.  It also
  seems that BasicDynaBean would make a pretty good Data Transfer Object
  for moving data between tiers, adding type safety without giving up
  the performance of using just a HashMap.

* One would expect using a "map" as the origin would be roughly
  the same as using a "dyna", because they both iterate over keys and
  values in a HashMap internally.  The fact that using "dyna" was often
  faster is somewhat anomalous, and is likely based on differing amounts
  of optimization inside the BeanUtils and PropertyUtils loops and
  getter/setter implementations.

* Of particular interest to Struts users is the code that populates
  the form bean.  The two lines that describe this use case (populate
  to "bean" from "strs", and populate to "dyna" from "strs") indicate
  that this operation is already highly optimized, running much
  faster than the more general purpose copyProperties() method with the
  same argument types.

* From casual observation of the current source, I suspect the longer
  times for "bean" objects have more to do with the way that
  PropertyDescriptor and Method instances are cached and searched in
  PropertyUtils than they have with the time of the reflection-based
  execution of the property getters and setters.  There is lots of room
  here to optimize.

* It will be very useful to have people try these microbenchmarks in a
  variety of different environments, and create variations of them with
  different kinds of origin and destination beans, to provide a wider
  set of profiles when we attempt optimization on the underlying code.
  Please feel free to post patches to the microbenchmark sources, so that
  everyone can execute exactly the same set of tests.

* It will also be interesting to try this on different JDK generations,
  to confirm whether the improvements in reflection performance that have
  been described for recent versions shows through in the results here.

Craig McClanahan


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to