OK, just after I sent the email I realized public/private won't really give JIT more confidence.
I'd try the two different impls approach though. It does introduce more types to load but if that's not an issue, I think perf should be good. At least worth trying for curiosity's sake. :) Sent from my phone On Nov 7, 2012 8:57 AM, "Staffan Larsen" <staffan.lar...@oracle.com> wrote: > > On 7 nov 2012, at 14:53, Vitaly Davidovich <vita...@gmail.com> wrote: > > Staffan, > > When you say you removed all implementation from fileBeginRead, do you > mean you just return null instead of doing the ENABLED check? > > Yes. > > Does making ENABLED private yield zero-cost? May give JIT more confidence > that this field isn't modified via reflection from outside. > > Sorry, that was a typo in my email, the field was private in my tests. > > /Staffan > > The other option is to have two separate implementations of the callback > mechanism hidden inside IOTrace calls. At static init time, you pick one > based on ENABLED. If not enabled you use an empty method that just returns > null. The JIT should handle this case well. > > Sent from my phone > On Nov 7, 2012 7:56 AM, "Staffan Larsen" <staffan.lar...@oracle.com> > wrote: > >> An update on performance. I have written microbenchmarks for file and >> socket i/o and compared the results before my suggested changes and after. >> >> FileChannelRead -4.06% >> FileChannelWrite -1.06% >> FileInputStream 0.92% >> FileOutputStream 1.32% >> RandomAccessFileRead 1.66% >> RandomAccessFileWrite 0.76% >> SocketChannelReadWrite 0.75% >> SocketReadWrite 0.89% >> >> Negative values means that my changes added a regression. I think most of >> these values are within the margin of error in the measurements. The one >> exception is FileChannelRead. I've rerun this many times and it looks >> fairly consistent around a 4% regression. Why there is only a regression >> when reading from a FileChannel, I don't know. >> >> The 4% number is too high I think and as a result I'm looking at >> alternative implementations. As a first experiment I tried changing >> IoTrace.fileReadBegin/End into something like this: >> >> public static final boolean ENABLED = >> Boolean.getProperty("iotrace.enabled"); >> >> public static IoTraceContext fileReadBegin(String path) { >> if (ENABLED) { >> ... >> } >> return null; >> } >> >> This got the regression down to 2%. Still not good. >> >> Removing all implementation from fileReadBegin/End gets me on par with >> the non-instrumented version. >> >> It looks like some form of dynamic class redefinition is need here. We >> could start out with empty implementations of the methods in IoTrace, and >> redefine the class to one that has implementations when tracing is enabled. >> >> >> /Staffan >> >> On 4 nov 2012, at 13:25, Aleksey Shipilev <aleksey.shipi...@oracle.com> >> wrote: >> >> > On 11/03/2012 01:15 AM, Mandy Chung wrote: >> >> On 11/2/2012 1:47 PM, Staffan Larsen wrote: >> >>> On 2 nov 2012, at 21:12, Mandy Chung<mandy.ch...@oracle.com> wrote: >> >>> >> >>>> The *Begin() methods return a "handle" that will be passed to the >> >>>> *End() methods. Have you considered to define a type for it rather >> >>>> than Object? >> >>> Something like an empty interface, just to signal the intent? >> >> >> >> Yes I think so. A marker interface would suffice. >> >> >> >>>> Do you have any performance measurement result that you can share? >> >>> I don't yet have any specific numbers - I'll try to get some. The >> >>> testing I have done indicates that the overhead is negligible. But it >> >>> would be good to confirm this. >> >> >> >> refworkload is easy to run. You probably have talked with the >> >> performance team. You can find out from them which benchmarks, if they >> >> have any, are applicable for IO instrumentation. >> > >> > Performance team here. >> > >> > There are virtually no benchmarks against I/O per se. Looking at the >> > patch, I would think anything doing intensive network I/O would help to >> > quantify the change, which boils down to SPECjbb2012, SPECjEnterprise, >> > and Volano. First two would be hard to run, and the third has terrible >> > run-to-run variance, so you will probably have to quantify the changes >> > with microbenchmarks. It should be easy enough to construct with our >> > micro harness (still not available in OpenJDK). Contact me internally if >> > you want to get that route. >> > >> > General patch review: I do have the preoccupation against interferenced >> > tracing code, and while appreciating the intent for tracing patch, we >> > need to look for performance penalties. The rule of thumb is that >> > HotSpot will optimize away the code guarded by static final flag (or, as >> > Remi points out with jsr292 magic). Doing the calls hoping for HotSpot >> > to inline and figure out the absence of useful work there is not working >> > reliably either. Inline conditionals will cost something if the tracing >> > method is not inlined. >> > >> > Hence, I would rather recommend to switch the uses like this: >> > >> > public int read() throws IOException { >> > Object traceHandle = IoTrace.fileReadBegin(path); >> > int b = read0(); >> > IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1); >> > return b; >> > } >> > >> > ...to something more like: >> > >> > public int read() throws IOException { >> > if (IoTrace.ENABLED) { >> > Object traceHandle = IoTrace.fileReadBegin(path); >> > int b = read0(); >> > IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1); >> > return b; >> > } else { >> > return read0(); >> > } >> > } >> > >> > where >> > >> > class IoTrace { >> > public static final boolean ENABLED = >> > System.getProperty("java.io.trace"); >> > } >> > >> > ...which will demote the flexibility of setListener(), but have much >> > lower runtime overhead. This should be confirmed by microbenchmarking >> > anyway. >> > >> > -Aleksey. >> >> >