Guys,
Why do you need instrumentation at all?
You only want to measure the time spent in the JNI_CreateJavaVM method. It
is very simple to write a C program that uses the invocation API to call
JNI_CreateJavaVM and reads a performance counter before and after VM
creation to calculate that time. This will be exactly the "startup time"
as Wenlong defines it. Then you can run it through the script supplied by
Aleksey.
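To make the idea concrete, here is a rough sketch of such a program, not a finished tool. It assumes a POSIX system, where clock_gettime stands in for whatever performance counter your platform offers. The helper names (now_ms, create_vm, time_vm_creation) and the USE_JNI macro are made up for illustration. Without -DUSE_JNI the VM call is replaced by an empty stand-in so the timing code can be tried without a JDK; with -DUSE_JNI you need jni.h on the include path and must link against the VM library.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <stddef.h>
#include <time.h>

#ifdef USE_JNI          /* hypothetical switch: build against the real VM */
#include <jni.h>
#endif

/* Monotonic wall-clock time in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Creates the VM through the invocation API; returns 0 on success. */
static int create_vm(void)
{
#ifdef USE_JNI
    JavaVM *jvm = NULL;
    JNIEnv *env = NULL;
    JavaVMInitArgs args;

    args.version = JNI_VERSION_1_4;
    args.nOptions = 0;            /* no VM options in this sketch */
    args.options = NULL;
    args.ignoreUnrecognized = JNI_FALSE;

    return JNI_CreateJavaVM(&jvm, (void **)&env, &args) == JNI_OK ? 0 : -1;
#else
    return 0;                     /* stand-in: no VM is created */
#endif
}

/* Stores the time spent in create_vm() in *elapsed_ms; returns its status. */
static int time_vm_creation(double *elapsed_ms)
{
    double start = now_ms();
    int rc = create_vm();

    *elapsed_ms = now_ms() - start;
    return rc;
}
```

Call time_vm_creation() from main() and print the result; that number covers VM creation only, with no main class lookup and no shutdown.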
If you also want to include finding the main class, you can look into our
launcher and borrow the code from there.
The invocation API rules. ;)
WBR,
Pavel.
On Tue, Dec 23, 2008 at 3:06 AM, Aleksey Shipilev
<[email protected]> wrote:
> Hi, Wenlong!
>
> That would be terrific.
>
> Can you please do the following runs in both cold and warm modes:
> a. HWA without instrumentation.
> b. HWA with instrumentation which measures the following stages: VM
> creation time, Java main() execution, shutdown sequence.
> c. SPECjvm2008:startup.helloworld without instrumentation.
>
> The instrumentation code should be attached to the corresponding JIRA
> issue, as part of the prerequisites for testing your patch.
>
> Thanks,
> Aleksey.
>
> On Tue, Dec 23, 2008 at 2:45 AM, Wenlong Li <[email protected]> wrote:
>> Aleksey,
>>
>> Do you mind repeating my experiment on your side? Your code below also
>> measures the VM destroy time, which could be very long. You can
>> instrument it in destroy_VM. So my suggestion is to report only
>> the time spent before executing the user's code, e.g., the VM creation
>> time. I can share my timing instrumentation code if needed.
>>
>> I want to repeat your experiment on my side.
>>
>> thx,
>> Wenlong
>>
>> On Mon, Dec 22, 2008 at 4:04 AM, Aleksey Shipilev
>> <[email protected]> wrote:
>>> Hi Wenlong,
>>>
>>> I had some performance experiments with your patch. The test system is:
>>> - Pentium D 820 2.8 GHz / 2 GB DDR2-667
>>> - WD 3200KS, 320 GB, 16 MB cache
>>> - Gentoo Linux x86, 2.6.23
>>> - Harmony r728459
>>> - SPECjvm2008
>>>
>>> To recreate the stressful conditions over and over, I wrote a simple
>>> script [1]. The script invalidates the caches before actually
>>> starting the workload: it re-reads the same 64 MB file a couple of times
>>> to fill the on-HDD cache, invalidating the VFS block caches first to
>>> make sure the data is really requested from the disk.
>>>
>>> On HWA [2] these performance results were produced:
>>>
>>> "cold-start" (invalidate caches):
>>> clean: (5.24 +- 0.28) secs
>>> ondemand: (4.49 +- 0.17) secs
>>>
>>> "warm-start" (don't invalidate caches):
>>> clean: (2.82 +- 0.01) secs
>>> ondemand: (2.80 +- 0.02) secs
>>>
>>> That is, the on-demand patch brings a +17% (+- 9%) improvement on HWA
>>> when running with flushed caches, and no performance
>>> improvement in warm mode.
>>>
>>> As I mentioned several times, this test does not reflect the real
>>> performance an end user would perceive, so I took two
>>> SPECjvm2008:startup benchmarks and ran each of them 10x10 times.
>>>
>>> SPECjvm2008:startup.helloworld, "cold start":
>>> clean: (8.93 +- 0.21) ops/min
>>> ondemand: (9.04 +- 0.03) ops/min
>>>
>>> SPECjvm2008:startup.compiler.compiler, "cold start":
>>> clean: (1.44 +- 0.05) ops/min
>>> ondemand: (1.42 +- 0.04) ops/min
>>>
>>> As you can see, even in a very stressful situation there's no boost. I
>>> find these performance results insufficient to justify changing the
>>> infrastructure of bootclasspath resolution. Am I missing something
>>> important?
>>>
>>> Thanks,
>>> Aleksey.
>>>
>>> [1] run.sh
>>> #!/bin/bash
>>>
>>> R=`pwd`
>>>
>>> JAVA=$R/platforms/builds/harmony-release-clean/jdk/jre/bin/java
>>> #JAVA=$R/platforms/builds/harmony-release-ondemand/jdk/jre/bin/java
>>> JAVA_OPTS="-Xmx1024M -Xms1024M"
>>>
>>> for T in `seq 1 10`; do
>>>
>>>     echo "*************** EXECUTING ITERATION $T ****************"
>>>
>>>     # invalidate HDD caches
>>>     # - need to replace all entries in LRU HDD cache
>>>     # - flush the kernel VFS cache first to ensure the data
>>>     #   would be read from disk
>>>
>>>     echo "Flushing caches"
>>>     for I in `seq 1 5`; do
>>>         sync
>>>         echo 3 > /proc/sys/vm/drop_caches
>>>
>>>         dd if=cachekiller.file of=/dev/null > /dev/null 2>&1
>>>     done
>>>
>>>     echo "Executing."
>>>
>>>     # HelloWorld
>>>     /usr/bin/time $JAVA $JAVA_OPTS -cp benchmarks/ HelloWorld 2>&1
>>>
>>>     # SPECjvm2008
>>>     #cd $R/benchmarks/storage/SPECjvm2008
>>>     #/usr/bin/time $JAVA $JAVA_OPTS -Djava.awt.headless=true -jar SPECjvm2008.jar -ikv -i 10 startup.compiler.compiler 2>&1
>>>
>>>     echo ""
>>> done
>>>
>>> [2] HelloWorld.java
>>> public class HelloWorld {
>>>     public static void main(String[] args) {
>>>         System.out.println("Hello, world!");
>>>     }
>>> }
>>>
>>>
>>> On Sun, Dec 21, 2008 at 6:02 AM, Wenlong Li <[email protected]> wrote:
>>>> On Sat, Dec 20, 2008 at 7:10 PM, Alexei Fedotov
>>>> <[email protected]> wrote:
>>>>> Wenlong,
>>>>> Thanks for removing the commented code.
>>>>>
>>>>> There are several VMs which make use of the Harmony class library,
>>>>> e.g. Harmony VM, J9, Android Dalvik, etc. Your change is Harmony VM
>>>>> specific, isn't it? If it is, then it's better to keep related changes
>>>>> in the VM module. If it is not, then it might be a good idea to keep
>>>>> the changes in the class library module unless other VMs already have
>>>>> such an optimization in their code.
>>>> [Wenlong] At this moment, you can think of on-demand class parsing
>>>> as a specific optimization from your point of view, but I believe it
>>>> could be a general technique, e.g., it can be easily deployed in other
>>>> runtime systems. The current VM also depends on luniglob.c in
>>>> working_classlib to get all class libraries/modules, i.e., there is
>>>> already a cross-module dependency between classlib and VM. Today, when
>>>> users want to add a new module, they have to manually change
>>>> bootclasspath.properties; with this patch applied, they would
>>>> revise my added property file instead of bootclasspath.properties.
>>>> I understand that modifying the bootclasspath file may be the specified
>>>> way to do this.
>>>>>
>>>>> In any case crossing module boundary would make class library users
>>>>> think more than once or even write some code. Is it technically
>>>>> possible to prepare a patch which does not change module boundaries?
>>>>> What do you think?
>>>> [Wenlong] Yes, it is possible from a technical perspective, but a
>>>> little complicated. I can think about it. :)
>>>>
>>>>>
>>>>> As for your performance experiments, which particular test are you
>>>>> measuring? It is bootclasspath-unpretentious "Hello, world", isn't it?
>>>> [Wenlong] By startup I mean the work executed before running the user's
>>>> computation, that is, the VM creation time. I manually added
>>>> timing instrumentation code to JNI_CreateJavaVM in
>>>> JNI.cpp. This startup work is common to any benchmark. My experiment
>>>> was conducted on both Windows and Linux systems. Please see my previous
>>>> message about the performance gain from this optimization.
>>>>
>>>> Thx,
>>>> Wenlong
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Sat, Dec 20, 2008 at 2:19 AM, Wenlong Li <[email protected]> wrote:
>>>>>> On Sat, Dec 20, 2008 at 12:42 AM, Alexei Fedotov
>>>>>> <[email protected]> wrote:
>>>>>>> Wenlong,
>>>>>>> Have I missed a discussion of the proposed design? I see that you
>>>>>>> expose a new public interface:
>>>>>>> /**
>>>>>>> * @map the jar with exported package in the pending jar list for
>>>>>>> on-demand jar parsing
>>>>>>> * Key is the jar, and value is the package exported by this jar
>>>>>>> */
>>>>>>> DECLARE_OPEN(void, vm_properties_set_pending_jar, (const char* key,
>>>>>>> const char* value));
>>>>>>>
>>>>>>> Did you mean "Maps" instead of "@map"? Strangely the word "pending"
>>>>>>> disappeared from the name of the wrapping VMI interface
>>>>>>> SetJarPackageMapping. Why should we extend both OPEN and VMI
>>>>>>> interfaces with the same function? Why did you put your code into
>>>>>>> working_classlib/modules/luni/src/main/native/luni/shared/luniglob.c,
>>>>>>> thus introducing another dependency between VM and class library?
>>>>>> [Wenlong] The boot class path is defined in luniglob.c in Harmony,
>>>>>> which also depends on the VM. In my understanding, my patch is
>>>>>> related to boot class path determination, so I also put my code in
>>>>>> luniglob.c, and use the VMI interface to communicate with the VM.
>>>>>>
>>>>>>>
>>>>>>> + //rcSetProperty = (*vmInterface)->SetJarPackageMapping
>>>>>>> (vmInterface, jarName, jarValue);
>>>>>>> + /*
>>>>>>> + hymem_free_memory(jarName);
>>>>>>> + hymem_free_memory(jarValue);
>>>>>>> + */
>>>>>>> Should we really commit the commented code?
>>>>>>> Thanks.
>>>>>>
>>>>>> [Wenlong] Please see my latest version of patch in the list. Such
>>>>>> commented code has been removed.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 19, 2008 at 6:59 PM, Tim Ellison <[email protected]>
>>>>>>> wrote:
>>>>>>>> I was hoping that somebody else would comment first, so I don't have to
>>>>>>>> be the grumpy one all the time :-)
>>>>>>>>
>>>>>>>> As I said before, this is good prototyping work...
>>>>>>>>
>>>>>>>> Wenlong Li wrote:
>>>>>>>>> I did the pre-commit test on the patch of on-demand class library
>>>>>>>>> parsing (https://issues.apache.org/jira/browse/HARMONY-6039), and it
>>>>>>>>> works well now.
>>>>>>>>> Can Harmony incorporate this feature?
>>>>>>>>
>>>>>>>> I'm not sure it is ready for committing to the head stream yet.
>>>>>>>>
>>>>>>>>> Via on-demand class parsing, we can reduce startup time from 20+
>>>>>>>>> seconds to 3 seconds for a cold run, and from 170 ms to 140 ms for
>>>>>>>>> a warm run on a Core 2 Duo with Windows.
>>>>>>>>
>>>>>>>> Can you tell me how to reproduce 20+sec cold start-up? I haven't seen
>>>>>>>> anything like that in my simple tests.
>>>>>>>>
>>>>>>>>> After applying the patch, please note that the procedure for adding
>>>>>>>>> new modules changes.
>>>>>>>>> (1) If you want to add new modules/libraries, please don't put them in
>>>>>>>>> the bootclasspath.properties file. This file now only lists modules
>>>>>>>>> needed during startup (the VM startup only accesses class libraries in
>>>>>>>>> eight modules)
>>>>>>>>
>>>>>>>> That would break too much. How about creating a new file rather than
>>>>>>>> re-purposing an existing file with different semantics? This file is
>>>>>>>> used by Jikes, IBM VME, the Eclipse plug-in, at least.
>>>>>>>>
>>>>>>>>> (2) For new modules/libraries, please put them in the
>>>>>>>>> modulelibrarymapping.properties file. You should specify the module
>>>>>>>>> name and its exported class library. Here is one example:
>>>>>>>>> math.jar=java.math, where "math.jar" means the module name, and
>>>>>>>>> "java.math" means the class libraries this module exports.
>>>>>>>>
>>>>>>>> As we discussed on another thread, it's unclear whether the time is
>>>>>>>> spent in the slow indexing through the classpath/JAR directories, or
>>>>>>>> in loading the bytes once we know what we need. I think
>>>>>>>> that it is premature to abandon the JAR manifest data as the principal
>>>>>>>> source of metadata until we understand the problem this solves.
>>>>>>>>
>>>>>>>> Can we measure where the time is spent in the current implementation?
>>>>>>>> I think it will help guide this approach to a better solution.
>>>>>>>> What tools do you recommend for profiling start-up?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Tim
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Alexei Fedotov,
>>>>>>> ZAO "Telecom Express"
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Alexei Fedotov,
>>>>> ZAO "Telecom Express"
>>>>>
>>>>
>>>
>>
>