Aleksey,

Thanks for testing this patch and sharing your experimental results. Yes, I think your results are reasonable. The performance gain from this patch varies across systems.
Again, I would like to point out that we have different definitions of "startup". Maybe I should move the change from the classlib module to the VM module, so that the dependency can be minimized.

Thanks again for the discussion. :)

Wenlong

On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev <[email protected]> wrote:
> Hi Wenlong,
>
> I had some performance experiments with your patch. The test system is:
> - Pentium D 820 2.8 GHz / 2 GB DDR2-667
> - WD 3200KS, 320 GB, 16 MB cache
> - Gentoo Linux x86, 2.6.23
> - Harmony r728459
> - SPECjvm2008
>
> To recreate the stressful conditions over and over, a simple script
> was written [1]. The script invalidates the caches before actually
> starting the workload: it re-reads the same 64 MB file a couple of times
> to fill the on-HDD cache, invalidating the VFS block caches first to make
> sure the data is really requested from the disk.
>
> On HWA [2] these performance results were produced:
>
> "cold-start" (invalidate caches):
>   clean: (5.24 +- 0.28) secs
>   ondemand: (4.49 +- 0.17) secs
>
> "warm-start" (don't invalidate caches):
>   clean: (2.82 +- 0.01) secs
>   ondemand: (2.80 +- 0.02) secs
>
> That is, the on-demand patch does bring a +17% (+-9%) improvement on HWA
> when running with flushed caches, and does not bring any performance
> improvement in warm mode.
>
> As I mentioned several times, this test does not reflect the real
> performance an end user would perceive, so I took two SPECjvm2008:startup
> benchmarks and ran each of them 10x10 times.
>
> SPECjvm2008:startup.helloworld, "cold start":
>   clean: (8.93 +- 0.21) ops/min
>   ondemand: (9.04 +- 0.03) ops/min
>
> SPECjvm2008:startup.compiler.compiler, "cold start":
>   clean: (1.44 +- 0.05) ops/min
>   ondemand: (1.42 +- 0.04) ops/min
>
> As you can see, even in a very stressful situation there's no boost. I
> would find these performance results unconvincing to change the
> infrastructure of bootclasspath resolution. Am I missing something
> important?
>
> Thanks,
> Aleksey.
>
> [1] run.sh
> #!/bin/bash
>
> R=`pwd`
>
> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
> JAVA_OPTS="-Xmx1024M -Xms1024M"
>
> for T in `seq 1 10`; do
>
>     echo "*************** EXECUTING ITERATION $T ****************"
>
>     # invalidate HDD caches
>     # - need to replace all entries in LRU HDD cache
>     # - flush the kernel VFS cache first to ensure the data
>     #   would be read from disk
>
>     echo "Flushing caches"
>     for I in `seq 1 5`; do
>         sync
>         echo 3 > /proc/sys/vm/drop_caches
>
>         dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>     done
>
>     echo "Executing."
>
>     # HelloWorld
>     /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>
>     # SPECjvm2008
>     #cd $R/benchmarks/storage/SPECjvm2008
>     #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>
>     echo ""
> done
>
> [2] HelloWorld.java
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello, world!");
>     }
> }
>
>
> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li <[email protected]> wrote:
>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov
>> <[email protected]> wrote:
>>> Wenlong,
>>> Thanks for removing the commented code.
>>>
>>> There are several VMs which make use of the Harmony class library,
>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>> specific, isn't it? If it is, then it's better to keep related changes
>>> in the VM module. If it is not, then it might be a good idea to keep
>>> the changes in the class library module unless other VMs already have
>>> such an optimization in their code.
>> [Wenlong] Though at this moment you may think on-demand class parsing
>> is a specific optimization from your point of view, I believe it could
>> be a general technique, e.g., it can be easily deployed in other
>> runtime systems.
>> The current VM also depends on the luniglobal.c in
>> working_classlib to get all class libraries/modules, i.e., there is
>> already a cross-module dependence between classlib and VM. When users
>> want to add a new module, they have to manually change
>> bootclasspath.properties, while with this patch applied, they should
>> revise my added property file instead of bootclasspath.properties.
>> I understand modifying the bootclasspath file may be a specification.
>>>
>>> In any case crossing the module boundary would make class library users
>>> think more than once or even write some code. Is it technically
>>> possible to prepare a patch which does not change module boundaries?
>>> What do you think?
>> [Wenlong] Yes, it is possible from a technical perspective, but a little
>> complicated. I can think about it. :)
>>
>>>
>>> As for your performance experiments, which particular test are you
>>> measuring? It is bootclasspath-unpretentious "Hello, world", isn't it?
>> [Wenlong] My startup means the work executed before running the user's
>> computation. That is, the VM creation time. I manually added
>> instrumentation code for execution time in JNI_CreateJavaVM of
>> JNI.cpp. This startup work is common to all benchmarks. My experiment
>> was conducted on both Windows and Linux systems. Please see my previous
>> message about the performance gain from this optimization.
>>
>> Thx,
>> Wenlong
>>>
>>> Thanks!
>>>
>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li <[email protected]> wrote:
>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov
>>>> <[email protected]> wrote:
>>>>> Wenlong,
>>>>> Have I missed a discussion of the proposed design?
>>>>> I see that you
>>>>> expose a new public interface:
>>>>> /**
>>>>> * @map the jar with exported package in the pending jar list for
>>>>> on-demand jar parsing
>>>>> * Key is the jar, and value is the package exported by this jar
>>>>> */
>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key,
>>>>> const char* value));
>>>>>
>>>>> Did you mean "Maps" instead of "@map"? Strangely the word "pending"
>>>>> disappeared from the name of the wrapping VMI interface
>>>>> SetJarPackageMapping. Why should we extend both OPEN and VMI
>>>>> interfaces with the same function? Why did you put your code into
>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>> thus introducing another dependency between VM and class library?
>>>> [Wenlong] The boot class path is defined in luniglobal.c in Harmony,
>>>> and it also has dependence with VM. In my understanding, my patch is
>>>> related to boot class path determination, so I also put my code in
>>>> luniglobal.c, and use VMI interface to communicate with VM.
>>>>
>>>>>
>>>>> + //rcSetProperty = (*vmInterface)->SetJarPackageMapping
>>>>> (vmInterface, jarName, jarValue);
>>>>> + /*
>>>>> + hymem_free_memory(jarName);
>>>>> + hymem_free_memory(jarValue);
>>>>> + */
>>>>> Should we really commit the commented code?
>>>>> Thanks.
>>>>
>>>> [Wenlong] Please see my latest version of patch in the list. Such
>>>> commented code has been removed.
>>>>>
>>>>>
>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison <[email protected]>
>>>>> wrote:
>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>> be the grumpy one all the time :-)
>>>>>>
>>>>>> As I said before, this is good prototyping work...
>>>>>>
>>>>>> Wenlong Li wrote:
>>>>>>> I did the pre-commit test on the patch of on-demand class library
>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>> works well now.
>>>>>>> Can Harmony incorporate this feature?
>>>>>>
>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>
>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>> seconds to 3 seconds for cold running, and 170 ms to 140 ms for warm-up
>>>>>>> running on Core 2 Duo with Windows.
>>>>>>
>>>>>> Can you tell me how to reproduce 20+sec cold start-up? I haven't seen
>>>>>> anything like that in my simple tests.
>>>>>>
>>>>>>> After applying the patch, please note there is some change to add new
>>>>>>> modules.
>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>> the bootclasspath.properties file. This file now only saves modules
>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>> eight modules)
>>>>>>
>>>>>> That would break too much. How about creating a new file rather than
>>>>>> re-purposing an existing file with different semantics? This file is
>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>
>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>> name and its exported class library. Here is one example:
>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>> "java.math" means the class libraries this module exports.
>>>>>>
>>>>>> As we discussed on another thread, it's unclear if the time is spent in
>>>>>> following the slow indexing through the classpath/JAR directories, or
>>>>>> whether it is speed of loading bytes once we know what we need. I think
>>>>>> that it is premature to abandon the JAR manifest data as the principal
>>>>>> source of metadata until we understand the problem this solves.
>>>>>>
>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>> I think it will help guide this approach to a better solution.
>>>>>> What tools do you recommend for profiling start-up?
>>>>>>
>>>>>> Regards
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> With best regards,
>>>>> Alexei Fedotov,
>>>>> ZAO "Telecom Express"
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> With best regards,
>>> Alexei Fedotov,
>>> ZAO "Telecom Express"
>>>
>>
>
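
As a rough check of the cold-start HWA figures quoted in this thread, the relative gain and its uncertainty can be recomputed from the two timings. The sketch below is not from the thread: the function name and the choice of error-propagation rules are assumptions made for illustration. Summing the relative errors linearly gives roughly 9%, which matches the quoted "+-9%" if that is how it was derived; combining them in quadrature, which assumes the two measurements are independent, gives a somewhat tighter figure.

```python
import math

def gain_with_error(base, base_err, new, new_err):
    """Return (gain, linear_rel_err, quadrature_rel_err).

    Hypothetical helper: 'gain' is base/new - 1 (how much longer the
    baseline takes than the patched run); the two error values are
    relative uncertainties of the ratio base/new.
    """
    rel_base = base_err / base
    rel_new = new_err / new
    linear = rel_base + rel_new                # worst-case sum of relative errors
    quadrature = math.hypot(rel_base, rel_new) # assumes independent measurements
    return base / new - 1.0, linear, quadrature

# Cold-start timings in seconds: clean (5.24 +- 0.28) vs. ondemand (4.49 +- 0.17).
gain, lin, quad = gain_with_error(5.24, 0.28, 4.49, 0.17)
print(f"cold: gain {gain:+.1%}, linear +-{lin:.1%}, quadrature +-{quad:.1%}")

# Warm-start timings in seconds: clean (2.82 +- 0.01) vs. ondemand (2.80 +- 0.02).
warm_gain, warm_lin, warm_quad = gain_with_error(2.82, 0.01, 2.80, 0.02)
print(f"warm: gain {warm_gain:+.1%}, linear +-{warm_lin:.1%}")
```

For the warm-start timings the computed gain is smaller than its own uncertainty, which is consistent with the conclusion that the patch brings no measurable improvement in warm mode.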
