I am creating a web app that has several survey pages where there are a total of 120 or more radio buttons and check boxes on one page. I have been seeing average page render times on the order of 2 to 2.5 seconds (some times as bad as 5 seconds) with a single user hitting the site. I was going to put off my investigation into this problem until the site was in a more finished form. But since this topic has come up and because I needed something to keep me busy until I go on vacation Wednesday, I decided to look into it now.
After running some tests which I'll describe below, I'm starting to feel like the performance problem reported in TheServerSide case study was caused by excessive garbage collection and not by the use of reflection to display and update form values. Also their choice to use incremental garbage collection in their tests probably made the struts app look far worse then it would have if it had been run with standard garbage collection. Like others who have posted on this subject recently, my initial guess was that reflection was a likely cause of the performance bottleneck. But there was one thing that seemed to contradict my theory. While it was true that it took 2 seconds or more to render the page to the browser; when I clicked the submit button after checking all check boxes, the form bean was updated and the main page was rendered almost instantly. So I decided to figure out just how much time was being spent doing reflection. My testing showed that time spent on reflection to render the page was consuming so little time that it couldn't even be measured. In my opinion, the big performance problem is caused by garbage collection firing much more often then you would expect. Perhaps making changes which reduce the number of objects being created and destroyed for each tag invocation is the best way to improve performance. How I Tested For Time Spent On Reflection ----------------------------------------- To prepare for the tests, I made two test JSP pages. Both pages consisted of 150 html:checkbox tags arranged in 15 rows with 10 checkboxes per row. The first test page contained html:checkbox elements with property values similar to "survey(questionName).selectedValues[5]". On the second test page, I removed all property attributes from the html:checkbox elements thus leaving them with just a value="<some value>" attribute. Finally, struts-html.tld was modified to make the html:checkbox "property" attribute optional. Looking at CheckboxTag.java and RequestUtils.java, you can see that if the "property" attribute is not supplied to the html:checkbox tag, then "RequestUtils.lookup(pageCOntext, name, property, null)" in CheckbopxTag.java will simply return the form bean which is retrieved via a call to pageCOntext.findAttribute(). The beanutils are never called if property == null. So how much faster was the second test page (which did not use reflection) rendered then the first test page which did us e reflection? The time difference was too small to be measured. Both test pages took anywhere from 2 seconds to six seconds to render with the average time being around 3.2 seconds. So Reflection Wasn't The Problem - More Experiments To Get At The Problem ------------------------------------------------------------------------- With reflection eliminated, my next suspicion was that the app server (tomcat 4.0.6 bundled with NetBeans 3.5) was the problem. I had read that many performance enhancements had been added since tomcat 4.0.6. Looking at the .java code produced by the JSP compiler, I could see room for optimization but there was nothing that could explain the terrible page rendering times I was seeing. So to test it out, I copied the second test page and replaced all html:checkbox tags with html:messages tags. In the context of my test page, the html:messages tag would simply look for messages at Globals.ERROR_KEY. Since nothing would be found at that key, the tag would just render nothing. This test page which had 150 html:messages tags on it typically rendered in 10 to 20 milliseconds. Finally just for the heck of it, I made a test JSP page which just contained 150 <input type="checkbox"...> tags. This page typically render in 0 to 10 milliseconds. Effect Of Changing To Incremental Garbage Collection ---------------------------------------------------- Up until now, all tests were performed with "java -server ...". By now I was getting convinced that the problem was that garbage collection was happening every time one of the first two test pages were being rendered and that is what was killing the performance. Turning on GC logging confirmed that. So, I decided to turn on incremental garbage collection. Wow, what a difference. Things started taking twice as long! Looking at the GC log, I could see that the GC was firing pretty much constantly (like you would expect) but that many of those incremental GC's were taking .2 to .9 seconds. The result was much worse performance on the first two test pages and even the third test page (with html:messages tags on it was almost twice as slow. The last test page which just contained no struts tags except for html:form and html:submit still took from 0 to 10 milliseconds but I thick this was just due to limitations in accuracy of the system timer. Test Environment ---------------- Intel P4 2.8 GHZ 512MB Window XP JDK 1.4.1.4.1_04-b01 Tomcat 4.0.6 running under NetBeans 3.5 (normal execution, not debug) Struts 1.1 Kurt -----Original Message----- From: Craig R. McClanahan [mailto:[EMAIL PROTECTED] Sent: Saturday, August 02, 2003 5:37 PM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: [BeanUtils] Interesting Microbenchmarks TheServerSide (www.theserverside.com) recently published a case study that attempted to analyze the performance characteristics of a variety of middleware configurations, including two J2EE app servers and Microsoft's .NET platform. One of the hot spots that showed up in the Java based figures was identified as heavy use of reflection (the test app was Struts based and therefore uses commons-beanutils), but the report did not detail these findings. Despite the fact that the test condtions were very artificial (the database was tuned in a way that it was never the hotspot, which is unlike virtually any high-use web application in production), it prompted a bit of discussion over whether using reflection was a good idea. This in turn prompted me to write a couple of simple microbenchmarks to measure the behavior of the most commonly used BeanUtils and PropertyUtils methods for copying sets of properties -- copyProperties() and populate(). Legend: ====== "bean" - A plain old Java object (POJO) with eight properties, one of each of the standard primitive types and one String (org.apache.commons.beanutils.BenchBean) "dyna" - An instance of org.apache.commons.beanutils.BasicDynaBean with exactly the same properties and types. The property types for primitives in a DynaBean are the wrapper classes such as java.lang.Integer. "map" - An instance of HashMap with eight entries pointing at exactly the same values. The values for primitives of the corresponding wrapper class types. "strs" - An instance of HashMap with eight entries pointing at the same values, but all Strings (simulating request parameters that are used to populate a form bean in Struts). "Dura #1" - Duration for 100,000 loops on my 1ghz desktop (milliseconds) "Dura #2" - Duration for 50,000 loops on my 500mhz laptop (milliseconds) I tested all the legal combinations of an origin object (bean, dyna, map, or strs) to a destination object (bean or dyna). The tests were run on a 1 gigahertz PC running Linux (Red Hat 9) with Sun's JDK 1.4.2, and on a 500mhz laptop running Win2K with Sun's 1.4.2. The loop count was set to 100,000 in the PC case, and 50,000 in the laptop case, so the overall execution times should be roughly the same (since the desktop is roughly twice as fast). Execution time was measured in milliseconds. For each test, I ran an untimed execution of the specified number of iterations (to prime HotSpot), followed by a timed execution of the same number of iterations. To run these for yourselves, check out the HEAD branch of beanutils (or grab a source nightly distro dated 20030803 or later), and run "ant bench". Results for BeanUtils (which performs type conversions): ======================================================= Method Dest Orig Dura #1 Dura #2 -------------- ---- ---- ------- ------- copyProperties bean bean 17,703 17,075 copyProperties dyna bean 17,463 17,074 copyProperties bean dyna 15,413 15,042 copyProperties dyna dyna 13,020 12,077 copyProperties bean map 15,626 15,192 copyProperties dyna map 12,999 12,237 copyProperties bean strs 16,302 16,093 copyProperties dyna strs 13,940 13,129 populate bean map 7,438 7,481 populate dyna map 5,082 5,368 populate bean strs 5,768 5,739 populate dyna strs 3,865 3,746 Results for PropertyUtils (no type conversions): =============================================== Method Dest Orig Dura #1 Dura #2 -------------- ---- ---- ------- ------- copyProperties bean bean 5,595 5,107 copyProperties dyna bean 4,567 4,126 copyProperties bean dyna 3,791 3,675 copyProperties dyna dyna 844 681 copyProperties bean map 3,938 3,755 copyProperties dyna map 931 772 Observations: ============ * The absoute numbers are not particularly interesting, other than the fact that they confirm my belief that copying properties (whether using reflection or not) is very unlikely to be a bottleneck -- even on bargain basement hardware (in today's world), we're talking about copying all the properties in anything from 0.8 microseconds up to 177 microseconds. * This is a very limited test case, with only one pattern of properties. I suspect it will be on the worst case side in terms of the number of conversions required, compared to how most people use these methods in real applications. * Increasing the number of properties being copied will (of course) increase the execution times, but I suspect it will not change the ratios between alternatives much. I've done a few informal tests with more and fewer properties, but this will be worth investigating in more detail as well. * Only the cases where a "bean" was the destination or origin use reflection at all. The "map" case is just doing HashMap gets and puts, while the "dyna" case is really the "map" case plus type safety checks. * In all scenarios, using "dyna" was faster than using "bean", ranging from statistically insignificant to pretty substantial percentages. For Struts based apps in particular, that means using DynaActionForm for form beans, with the current BeanUtils code, will give better performance than using standard ActionForm bean classes. It also seems that BasicDynaBean would make a pretty good Data Transfer Object for moving data between tiers, adding type safety without giving up the performance of using just a HashMap. * One would expect using a "map" as the origin would be roughly the same as using a "dyna", because they both iterate over keys and values in a HashMap internally. The fact that using "dyna" was often faster is somewhat anomalous, and is likely based on differing amounts of optimization inside the BeanUtils and PropertyUtils loops and getter/setter implementations. * Of particular interest to Struts users is the code that populates the form bean. The two lines that describe this use case (populate to "bean" from "strs", and populate to "dyna" from "strs") indicate that this operation is already highly optimized, running much faster than the more general purpose copyProperties() method with the same argument types. * From casual observation of the current source, I suspect the longer times for "bean" objects have more to do with the way that PropertyDescriptor and Method instances are cached and searched in PropertyUtils than they have with the time of the reflection-based execution of the property getters and setters. There is lots of room here to optimize. * It will be very useful to have people try these microbenchmarks in a variety of different environments, and create variations of them with different kinds of origin and destination beans, to provide a wider set of profiles when we attempt optimization on the underlying code. Please feel free to post patches to the microbenchmark sources, so that everyone can execute exactly the same set of tests. * It will also be interesting to try this on different JDK generations, to confirm whether the improvements in reflection performance that have been described for recent versions shows through in the results here. Craig McClanahan --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]