My biggest concern in this thread is what happens when advanced JIT optimizations such as common subexpression elimination and code motion are applied after basic inlining. You may end up in a situation where:
(a) You have to generate hundreds of CML events to describe the movement of code from the different inlined components. Even then, the JVMTI specification for CML is a poor vehicle for exposing multilayered inlining.

(b) You end up with ambiguity, where a given assembly address may correspond to more than one of the inlined methods [e.g., after common subexpression elimination].

Having the JIT preserve full information in the presence of optimizations will be difficult. These complexities may be why the other VMs don't report inlining events today even though they are called out in the spec.

One option would be to report just the outermost method, like the RI, through the standard JVMTI interface and to provide a richer JVMTI extension interface that exposes the full inlining hierarchy in a single event. Standardizing such an enhanced interface might be of value in future JCP discussions of the profiling interface.

Changing gears a bit: as someone who has extensively consumed JVMPI and JVMTI CML and DCG events from different VMs, I can say a few things about consuming them. Different VMs have different levels of rigor with which they report class loads and unloads, method JITs and invalidations. In a well-structured environment, you would be guaranteed that before an address was recycled by a new method, the old method would be unloaded with an unload event. I have seen this rule violated many times. As a consequence, my engineering tools typically treat address overlap with a previous JIT event as a sign that the previous JIT was unloaded without being reported, rather than as a sign that the new JIT is inlined into the old one. Certainly, one could special-case Harmony in the tools to not treat it this way.

I don't have strong feelings regarding sending multiple CMLs for each disjoint portion of the method. It's a bit ambiguous in the JVMTI specification.
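
To make the overlap heuristic concrete, here is a minimal sketch in Python (chosen only for brevity). It is not taken from any actual tool: CodeRegion, LiveCodeMap and their methods are hypothetical names, and only code_addr and code_size correspond to CML parameters. It assumes a region covers code_addr up to, but not including, code_addr + code_size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRegion:
    method: str     # hypothetical label for the method, not a jmethodID
    code_addr: int  # start address reported by the CML event
    code_size: int  # size in bytes reported by the CML event

    @property
    def end(self) -> int:
        # Exclusive end address (an assumption of this sketch).
        return self.code_addr + self.code_size


class LiveCodeMap:
    """Tracks regions believed to be live, tolerating missing unload events."""

    def __init__(self):
        self.live = []

    def on_compiled_method_load(self, region):
        # Any live region overlapping the new one is assumed to have been
        # unloaded without an unload event being reported.
        evicted = [r for r in self.live
                   if r.code_addr < region.end and region.code_addr < r.end]
        self.live = [r for r in self.live if r not in evicted]
        self.live.append(region)
        return evicted

    def on_compiled_method_unload(self, code_addr):
        # A properly reported unload removes the region starting at code_addr.
        self.live = [r for r in self.live if r.code_addr != code_addr]
```

With this policy, a CML for an inlined region that sits inside an already reported outer region evicts the outer method, which is exactly the misreading a tool would have to special-case for a VM that reports inlining this way.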
Sending a separate CML per disjoint region will make it a bit difficult for me to provide statistics that track how many times a given method has been JITted using the occurrences of CML. Imagine a VM that "goes crazy" and accidentally JITs methods many, many times, or JITs new versions back to back (or concurrently). Using CML timings and counts, one can diagnose this and other types of situations. When CMLs are sent for each disjoint region, such statistics no longer have the same meaning, and I am not 100% sure how to do that type of analysis.

One could work around this by taking advantage of the "compile_info" field of CML. This is a VM-specific mechanism to provide more details. To be honest, I've never looked at this value, but it could be a way to report that a CML is an inline or a "partial" method... Just a thought... I think Harmony could do something clever here...

Finally, to speak to Ivan's comment regarding the ordering of events... If the address ranges of the CMLs do NOT overlap, I have no strong feelings about the ordering. If we do figure out a way to use the compile_info field, the ordering should be consistent with the use of that field. If the address ranges DO overlap [e.g., the outermost method claiming the entire address space and the inline events claiming interior chunks], I'd ask that the outermost method CML be sent FIRST and that subsequent ones be treated as "overrides" for portions of the address space.

Thanks,
Chris Elford
Intel SSG/Enterprise Solutions Software Division

-----Original Message-----
From: Ivan Popov [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 19, 2006 10:12 AM
To: [email protected]
Subject: Re: [drlvm][jvmti] Profiling support - Compiled Method Load event

Since DRLVM seems to be the only JVM implementation that reports inlined methods, it was proposed to document the order in which inlined methods are reported with COMPILED_METHOD_LOAD events - from the outermost method to the most deeply inlined ones, or vice versa.
This will allow JVMTI tools to be adjusted to handle inlined methods correctly.

Thanks.
Ivan

On 12/18/06, Eugene Ostrovsky <[EMAIL PROTECTED]> wrote:
> Let me propose the following design:
>
> Compiled method load event. More precise specification for inlined methods.
> 1. CML event must be sent for every method compiled by JIT.
> If compiled method code is placed in several disconnected regions each of
> them must be reported by separate CML event.
> Each region location is described by code_addr and code_size parameters.
> Native addresses to bytecode location correspondence should be described in
> map parameter if this information is available. If compiled method code
> contains code blocks of inlined methods the addresses of those blocks should
> be associated with the location of corresponding invoke bytecode
> instruction.
> 2. If compiled method code region was inlined within the code region of the
> outer method it should be reported by separate CML event. Inlined method
> code region must be enclosed within one of the outer method's code regions.
> I.e. outer.start <= inlined.start < inlined.end <= outer.end (where
> method.start = code_addr, method.end = code_addr + code_size - 1)
> 3. According to #1 and #2 any two of reported regions (R1, R2) may be
> enclosed one by the other (in case of inlining) but must not overlap (i.e.
> R1.start < R2.start <= R1.end < R2.end condition must not be true).
>
> Is it clear enough?
> What do you think?
>
> On 15 Dec 2006 18:36:22 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote:
> >
> > On the 0x240 day of Apache Harmony Eugene Ostrovsky wrote:
> > > I experimented with sun and bea vms.
> >
> > did you try HotSpot for JSE 6 (with attach anytime feature)?
> >
> > > None of them report inlined methods.
> > >
> > > Thus we need to make a decision by ourselves.
> >
> > :) yes
> >
> > > On 14 Dec 2006 16:14:49 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote:
> > > >
> > > > On the 0x23F day of Apache Harmony Eugene Ostrovsky wrote:
> > > > > I'll try to make test to investigate RI behavior.
> > > >
> > > > thank you thank you thank you
> > > >
> > > > > On 13 Dec 2006 16:34:46 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > On the 0x23E day of Apache Harmony George Timoshenko wrote:
> > > > > > > Egor,
> > > > > > >
> > > > > > > thanks for clear scheme.
> > > > > > >
> > > > > > > In your terms I'd do something like this:
> > > > > > >
> > > > > > > * firstly - raise event for X:
> > > > > > >     CompiledMethodLoad(start=X.1.start,
> > > > > > >                        method_size=X.1.size + X.2.size,
> > > > > > >                        addr_loc_map=
> > > > > > >                          [X.1.start -> bcoff1,
> > > > > > >                           X.2.start -> bcoff2])
> > > > > > > * secondly - raise event for Y:
> > > > > > >     CompiledMethodLoad(start=Y.1.start,
> > > > > > >                        method_size=Y.1.size,
> > > > > > >                        addr_loc_map=
> > > > > > >                          [Y.1.start -> bcoff_Y])
> > > > > >
> > > > > > good question!
> > > > > >
> > > > > > IMHO, code_addr and code_size outlines a region where method code is
> > > > > > contained. In that case VM can quickly tell which method the IP
> > > > > > (instruction pointer) belongs to. So, I intentionally suggested
> > > > > > code_size=(X.1.size + Y.1.size + X.2.size) instead of (X.1.size +
> > > > > > X.2.size).
> > > > > >
> > > > > > BTW, Eugene, do you have some important observations of the RI
> > > > > > behaviour for us?
> > > > > >
> > > > > > > > For example, we have
> > > > > > > > some chunks of methods X and Y intermixed like this:
> > > > > > > > "X.1,Y.1,X.2".
> > > > > > > > To overcome this we may:
> > > > > > > > * raise a single event for X:
> > > > > > > >     CompiledMethodLoad(start=X.1.start,
> > > > > > > >                        method_size=X.1.size + Y.1.size + X.2.size,
> > > > > > > >                        addr_loc_map=
> > > > > > > >                          [X.1.start -> bcoff1,
> > > > > > > >                           Y.1.start -> 0,
> > > > > > > >                           X.2.start -> bcoff2])
> > > > > > > > * raise 2 events for X:
> > > > > > > >     CompiledMethodLoad(start=X.1.start,
> > > > > > > >                        method_size=X.1.size,
> > > > > > > >                        addr_loc_map=
> > > > > > > >                          [X.1.start -> bcoff1])
> > > > > > > >     CompiledMethodLoad(start=X.2.start,
> > > > > > > >                        method_size=X.2.size,
> > > > > > > >                        addr_loc_map=
> > > > > > > >                          [X.2.start -> bcoff2])
> > > > > > > >
> > > > > > > > I would highly appreciate if some JVMTI guru steps down from Olympus and
> > > > > > > > tells which of two is the best, or at least says what RI does in that
> > > > > > > > case (or, maybe, RI does not generate non-contiguous blocks?)
> > > > > > > >
> > > > > > > > I like the second approach (raise 2 events)
> > > > > >
> > > > > > --
> > > > > > Egor Pasko
> > > >
> > > > --
> > > > Egor Pasko
> >
> > --
> > Egor Pasko
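
Rule 3 of Eugene's proposal quoted above (any two reported regions are either disjoint or nested, never partially overlapping) can be checked mechanically. The following is only a sketch in Python under the proposal's own convention that end = code_addr + code_size - 1 is inclusive; the function names and the (code_addr, code_size) tuple layout are assumptions made for illustration.

```python
from itertools import combinations


def bounds(code_addr, code_size):
    """Return (start, end) with an inclusive end, as in the proposal."""
    return code_addr, code_addr + code_size - 1


def relation(r1, r2):
    """Classify two (code_addr, code_size) regions.

    Returns "disjoint", "nested" or "partial-overlap".
    Identical regions are classified as "nested".
    """
    s1, e1 = bounds(*r1)
    s2, e2 = bounds(*r2)
    if e1 < s2 or e2 < s1:
        return "disjoint"
    if (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2):
        return "nested"
    # Remaining case: R1.start < R2.start <= R1.end < R2.end, or its mirror.
    return "partial-overlap"


def violations(regions):
    """Return every pair of regions that breaks rule 3."""
    return [(a, b) for a, b in combinations(regions, 2)
            if relation(a, b) == "partial-overlap"]
```

For the "X.1,Y.1,X.2" layout, reporting X.1, Y.1 and X.2 as three separate regions yields only disjoint pairs, while reporting one region for all of X plus a separate one for Y.1 yields a nested pair; both pass the check.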

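
The "outermost first, later events override" ordering requested at the top of this message can be illustrated with a small address lookup. This is a sketch only: resolve and the (method, code_addr, code_size) tuple layout are hypothetical, and a region is assumed to cover code_addr up to, but not including, code_addr + code_size.

```python
def resolve(events, addr):
    """Map an address to a method name.

    events: (method, code_addr, code_size) tuples in arrival order.
    Returns None when no reported region contains addr.
    """
    owner = None
    for method, code_addr, code_size in events:
        if code_addr <= addr < code_addr + code_size:
            owner = method  # a later event overrides an earlier one
    return owner


# Outermost method first, then the inlined chunk inside it.
events = [("outer", 0x1000, 0x100), ("inlined", 0x1040, 0x20)]
print(resolve(events, 0x1010))  # address outside the inlined chunk
print(resolve(events, 0x1050))  # address inside the inlined chunk
```

If the same two events arrive in the opposite order, the outer region overrides the inlined one and every address resolves to the outer method, which is why a documented ordering matters to a consumer that works this way.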