By "startup" I mean the computation needed before executing the user's code (i.e., before entering the main method); see [1][2]. Aleksey's definition is the startup benchmark in SPECjvm2008.

[1] http://www.oracle.com/technology/pub/articles/dev2arch/2004/01/jrockit.html
[2] http://www.ibm.com/developerworks/java/library/os-ecspy1/

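A minimal sketch of the first definition, assuming the runtime provides java.lang.management. The class name StartupProbe and the helper millisBeforeMain are hypothetical and not part of the patch. RuntimeMXBean.getStartTime() only gives an approximate VM start time, so this is a rough stand-in for the time spent before main, not an exact measurement of VM creation.

```java
import java.lang.management.ManagementFactory;

public class StartupProbe {
    /** Elapsed milliseconds between VM start and entry to main (never negative). */
    public static long millisBeforeMain(long vmStartMillis, long mainEntryMillis) {
        return Math.max(0L, mainEntryMillis - vmStartMillis);
    }

    public static void main(String[] args) {
        // Take the timestamp first so that the MXBean lookup is not counted.
        long mainEntry = System.currentTimeMillis();
        // Approximate wall-clock time at which the VM started.
        long vmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        System.out.println("time before main: "
                + millisBeforeMain(vmStart, mainEntry) + " ms");
    }
}
```

The second definition would instead time the whole benchmark run from outside the process, so the two numbers are not directly comparable.
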
On Mon, Dec 22, 2008 at 8:38 AM, Nathan Beyer wrote:
> Can someone give a quick summary of the two different definitions of
> "startup" being discussed?
>
> -Nathan
>
> On Sun, Dec 21, 2008 at 6:22 PM, Wenlong Li wrote:
>> Aleksey,
>>
>> Thx for testing this patch and sharing your experimental results.
>> Yes, I think your results are reasonable. The performance gain of
>> this patch varies across systems.
>>
>> Again, I would like to say we have different definitions of "startup".
>> Maybe I should move the change from the classlib module to the vm module,
>> so that the dependency can be minimized.
>>
>> thx again for the discussion. :)
>> wenlong
>>
>> On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev wrote:
>>> Hi Wenlong,
>>>
>>> I ran some performance experiments with your patch. The test system is:
>>>  - Pentium D 820 2.8 GHz / 2 GB DDR2-667
>>>  - WD 3200KS, 320 GB, 16 MB cache
>>>  - Gentoo Linux x86, 2.6.23
>>>  - Harmony r728459
>>>  - SPECjvm2008
>>>
>>> To recreate the stressful conditions over and over, a simple script
>>> was written [1]. The script invalidates the caches before actually
>>> starting the workload: it re-reads the same 64 MB file a couple of times
>>> to fill the on-HDD cache, invalidating the VFS block caches first to make
>>> sure the data is really requested from the disk.
>>>
>>> On HWA [2] these performance results were produced:
>>>
>>> "cold-start" (invalidate caches):
>>>  clean: (5.24 +- 0.28) secs
>>>  ondemand: (4.49 +- 0.17) secs
>>>
>>> "warm-start" (don't invalidate caches):
>>>  clean: (2.82 +- 0.01) secs
>>>  ondemand: (2.80 +- 0.02) secs
>>>
>>> That is, the on-demand patch does bring a +17% (+- 9%) improvement on HWA
>>> when running with flushed caches, and does not bring any performance
>>> improvement in warm mode.
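For reference, the cold-start percentage quoted above can be reproduced with a short calculation. This is only a sketch: the thread does not say how the uncertainty was derived, and summing the two relative errors is an assumption that happens to match the quoted figure. The class and method names are hypothetical.

```java
public class ColdStartGain {
    /** Relative improvement of the patched run over the clean run, in percent. */
    public static double gainPercent(double cleanSecs, double patchedSecs) {
        return (cleanSecs / patchedSecs - 1.0) * 100.0;
    }

    /** Crude uncertainty (assumption): sum of the two relative errors, in percent. */
    public static double errorPercent(double clean, double cleanErr,
                                      double patched, double patchedErr) {
        return (cleanErr / clean + patchedErr / patched) * 100.0;
    }

    public static void main(String[] args) {
        // Cold-start HelloWorld figures from the quoted measurements.
        double gain = gainPercent(5.24, 4.49);
        double err = errorPercent(5.24, 0.28, 4.49, 0.17);
        System.out.printf("cold start: +%.1f%% (+- %.1f%%)%n", gain, err);
    }
}
```

The same two functions applied to the warm-start figures give a gain well inside the error bound, which is consistent with "no improvement in warm mode".
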
>>>
>>> As I mentioned several times, this test does not reflect the real
>>> performance an end user would perceive, so I took two SPECjvm2008:startup
>>> benchmarks and ran each of them 10x10 times.
>>>
>>> SPECjvm2008:startup.helloworld, "cold start":
>>>  clean: (8.93 +- 0.21) ops/min
>>>  ondemand: (9.04 +- 0.03) ops/min
>>>
>>> SPECjvm2008:startup.compiler.compiler, "cold start":
>>>  clean: (1.44 +- 0.05) ops/min
>>>  ondemand: (1.42 +- 0.04) ops/min
>>>
>>> As you can see, even in a very stressful situation there's no boost. I
>>> find these performance results unconvincing as grounds for changing the
>>> infrastructure of bootclasspath resolution. Am I missing something
>>> important?
>>>
>>> Thanks,
>>> Aleksey.
>>>
>>> [1] run.sh
>>> #!/bin/bash
>>>
>>> R=`pwd`
>>>
>>> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
>>> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
>>> JAVA_OPTS="-Xmx1024M -Xms1024M"
>>>
>>> for T in `seq 1 10`; do
>>>
>>>     echo "*************** EXECUTING ITERATION $T ****************"
>>>
>>>     # invalidate HDD caches
>>>     #  - need to replace all entries in LRU HDD cache
>>>     #  - flush the kernel VFS cache first to ensure the data
>>>     #    would be read from disk
>>>
>>>     echo "Flushing caches"
>>>     for I in `seq 1 5`; do
>>>         sync
>>>         echo 3 > /proc/sys/vm/drop_caches
>>>
>>>         dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>>>     done
>>>
>>>     echo "Executing."
>>>
>>>     # HelloWorld
>>>     /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>>>
>>>     # SPECjvm2008
>>>     #cd $R/benchmarks/storage/SPECjvm2008
>>>     #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>>>
>>>     echo ""
>>> done
>>>
>>> [2] HelloWorld.java
>>> public class HelloWorld {
>>>     public static void main(String[] args) {
>>>         System.out.println("Hello, world!");
>>>     }
>>> }
>>>
>>>
>>> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li wrote:
>>>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov wrote:
>>>>> Wenlong,
>>>>> Thanks for removing the commented code.
>>>>>
>>>>> There are several VMs which make use of the Harmony class library,
>>>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>>>> specific, isn't it? If it is, then it's better to keep related changes
>>>>> in the VM module. If it is not, then it might be a good idea to keep
>>>>> the changes in the class library module unless other VMs already have
>>>>> such an optimization in their code.
>>>> [Wenlong] At this moment you may regard on-demand class parsing as a
>>>> Harmony-VM-specific optimization, but I believe it could be a general
>>>> technique, i.e., it can be easily deployed in other runtime systems.
>>>> The current VM also depends on luniglob.c in working_classlib to get
>>>> all class libraries/modules, i.e., there is already a cross-module
>>>> dependence between classlib and VM. Today, when users want to add a
>>>> new module, they have to change bootclasspath.properties manually;
>>>> with this patch applied, they would revise my added property file
>>>> instead of bootclasspath.properties. I understand that modifying the
>>>> bootclasspath file may be part of a specification.
>>>>>
>>>>> In any case, crossing the module boundary would make class library users
>>>>> think more than once or even write some code. Is it technically
>>>>> possible to prepare a patch which does not change module boundaries?
>>>>> What do you think?
>>>> [Wenlong] Yes, it is possible from a technical perspective, but a little
>>>> complicated. I can think about it. :)
>>>>
>>>>>
>>>>> As for your performance experiments, which particular test are you
>>>>> measuring? It is the bootclasspath-unpretentious "Hello, world", isn't it?
>>>> [Wenlong] My startup means the work executed before running the user's
>>>> computation, that is, the VM creation time. I manually added
>>>> instrumentation code for execution time to JNI_CreateJavaVM in
>>>> JNI.cpp. This startup work is common to all benchmarks. My experiment
>>>> was conducted on both Windows and Linux systems. Please see my previous
>>>> message about the performance gain from this optimization.
>>>>
>>>> Thx,
>>>> Wenlong
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li wrote:
>>>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov wrote:
>>>>>>> Wenlong,
>>>>>>> Have I missed a discussion of the proposed design? I see that you
>>>>>>> expose a new public interface:
>>>>>>> /**
>>>>>>>  * @map the jar with exported package in the pending jar list for
>>>>>>>  * on-demand jar parsing
>>>>>>>  * Key is the jar, and value is the package exported by this jar
>>>>>>>  */
>>>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key,
>>>>>>> const char* value));
>>>>>>>
>>>>>>> Did you mean "Maps" instead of "@map"? Strangely, the word "pending"
>>>>>>> disappeared from the name of the wrapping VMI interface
>>>>>>> SetJarPackageMapping. Why should we extend both OPEN and VMI
>>>>>>> interfaces with the same function? Why did you put your code into
>>>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>>>> thus introducing another dependency between VM and class library?
>>>>>> [Wenlong] The boot class path is defined in luniglob.c in Harmony,
>>>>>> and that file also has a dependence on the VM. In my understanding, my
>>>>>> patch is related to boot class path determination, so I also put my
>>>>>> code in luniglob.c, and use the VMI interface to communicate with the VM.
>>>>>>
>>>>>>>
>>>>>>> + //rcSetProperty = (*vmInterface)->SetJarPackageMapping (vmInterface, jarName, jarValue);
>>>>>>> + /*
>>>>>>> + hymem_free_memory(jarName);
>>>>>>> + hymem_free_memory(jarValue);
>>>>>>> + */
>>>>>>> Should we really commit the commented code?
>>>>>>> Thanks.
>>>>>>
>>>>>> [Wenlong] Please see my latest version of the patch on the list. Such
>>>>>> commented code has been removed.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison wrote:
>>>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>>>> be the grumpy one all the time :-)
>>>>>>>>
>>>>>>>> As I said before, this is good prototyping work...
>>>>>>>>
>>>>>>>> Wenlong Li wrote:
>>>>>>>>> I did the pre-commit test on the patch for on-demand class library
>>>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>>>> works well now.
>>>>>>>>> Can Harmony incorporate this feature?
>>>>>>>>
>>>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>>>
>>>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>>>> seconds to 3 seconds for a cold run, and from 170 ms to 140 ms for a
>>>>>>>>> warm run, on a Core 2 Duo with Windows.
>>>>>>>>
>>>>>>>> Can you tell me how to reproduce the 20+ sec cold start-up? I haven't
>>>>>>>> seen anything like that in my simple tests.
>>>>>>>>
>>>>>>>>> After applying the patch, please note that the way to add new
>>>>>>>>> modules changes.
>>>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>>>> the bootclasspath.properties file. This file now only lists modules
>>>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>>>> eight modules).
>>>>>>>>
>>>>>>>> That would break too much. How about creating a new file rather than
>>>>>>>> re-purposing an existing file with different semantics? This file is
>>>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>>>
>>>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>>>> name and its exported class library. Here is one example:
>>>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>>>> "java.math" means the class libraries this module exports.
>>>>>>>>
>>>>>>>> As we discussed on another thread, it's unclear if the time is spent in
>>>>>>>> following the slow indexing through the classpath/JAR directories, or
>>>>>>>> whether it is the speed of loading bytes once we know what we need. I
>>>>>>>> think that it is premature to abandon the JAR manifest data as the
>>>>>>>> principal source of metadata until we understand the problem this solves.
>>>>>>>>
>>>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>>>> I think it will help guide this approach to a better solution.
>>>>>>>> What tools do you recommend for profiling start-up?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Tim
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> With best regards,
>>>>>>> Alexei Fedotov,
>>>>>>> ZAO «Telecom Express»
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Alexei Fedotov,
>>>>> ZAO «Telecom Express»
>>>>>
>>>>
>>>
>>
>
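
The jar-to-package mapping described in the thread (math.jar=java.math) could be consulted roughly as follows. This is an illustrative sketch only, not the patch's implementation, which lives in native code. The class PendingJars and its methods are hypothetical, and one exported package per jar is an assumption taken from the single example given.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PendingJars {
    /** Inverts jar=package entries into a package -> jar lookup table. */
    public static Map<String, String> packageToJar(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> byPackage = new HashMap<>();
        for (String jar : props.stringPropertyNames()) {
            // Assumption: each jar exports exactly one package.
            byPackage.put(props.getProperty(jar).trim(), jar);
        }
        return byPackage;
    }

    /** Returns the jar to parse for a class, or null if no pending jar exports its package. */
    public static String jarFor(Map<String, String> byPackage, String className) {
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return null; // class in the default package
        }
        return byPackage.get(className.substring(0, lastDot));
    }

    public static void main(String[] args) {
        Map<String, String> index = packageToJar("math.jar=java.math\n");
        System.out.println(jarFor(index, "java.math.BigInteger")); // prints "math.jar"
    }
}
```

With a table like this, a jar only needs to be opened the first time a class from its exported package is requested, which is the on-demand behaviour under discussion.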
