On 7 nov 2012, at 14:53, Vitaly Davidovich <vita...@gmail.com> wrote:
> Staffan, > > When you say you removed all implementation from fileBeginRead, do you mean > you just return null instead of doing the ENABLED check? > Yes. > Does making ENABLED private yield zero-cost? May give JIT more confidence > that this field isn't modified via reflection from outside. > Sorry, that was a typo in my email, the field was private in my tests. /Staffan > The other option is to have two separate implementations of the callback > mechanism hidden inside IOTrace calls. At static init time, you pick one > based on ENABLED. If not enabled you use an empty method that just returns > null. The JIT should handle this case well. > > > Sent from my phone > > On Nov 7, 2012 7:56 AM, "Staffan Larsen" <staffan.lar...@oracle.com> wrote: > An update on performance. I have written microbenchmarks for file and socket > i/o and compared the results before my suggested changes and after. > > FileChannelRead -4.06% > FileChannelWrite -1.06% > FileInputStream 0.92% > FileOutputStream 1.32% > RandomAccessFileRead 1.66% > RandomAccessFileWrite 0.76% > SocketChannelReadWrite 0.75% > SocketReadWrite 0.89% > > Negative values means that my changes added a regression. I think most of > these values are within the margin of error in the measurements. The one > exception is FileChannelRead. I've rerun this many times and it looks fairly > consistent around a 4% regression. Why there is only a regression when > reading from a FileChannel, I don't know. > > The 4% number is too high I think and as a result I'm looking at alternative > implementations. As a first experiment I tried changing > IoTrace.fileReadBegin/End into something like this: > > public static final boolean ENABLED = Boolean.getProperty("iotrace.enabled"); > > public static IoTraceContext fileReadBegin(String path) { > if (ENABLED) { > ... > } > return null; > } > > This got the regression down to 2%. Still not good. > > Removing all implementation from fileReadBegin/End gets me on par with the > non-instrumented version. > > It looks like some form of dynamic class redefinition is need here. We could > start out with empty implementations of the methods in IoTrace, and redefine > the class to one that has implementations when tracing is enabled. > > > /Staffan > > On 4 nov 2012, at 13:25, Aleksey Shipilev <aleksey.shipi...@oracle.com> wrote: > > > On 11/03/2012 01:15 AM, Mandy Chung wrote: > >> On 11/2/2012 1:47 PM, Staffan Larsen wrote: > >>> On 2 nov 2012, at 21:12, Mandy Chung<mandy.ch...@oracle.com> wrote: > >>> > >>>> The *Begin() methods return a "handle" that will be passed to the > >>>> *End() methods. Have you considered to define a type for it rather > >>>> than Object? > >>> Something like an empty interface, just to signal the intent? > >> > >> Yes I think so. A marker interface would suffice. > >> > >>>> Do you have any performance measurement result that you can share? > >>> I don't yet have any specific numbers - I'll try to get some. The > >>> testing I have done indicates that the overhead is negligible. But it > >>> would be good to confirm this. > >> > >> refworkload is easy to run. You probably have talked with the > >> performance team. You can find out from them which benchmarks, if they > >> have any, are applicable for IO instrumentation. > > > > Performance team here. > > > > There are virtually no benchmarks against I/O per se. Looking at the > > patch, I would think anything doing intensive network I/O would help to > > quantify the change, which boils down to SPECjbb2012, SPECjEnterprise, > > and Volano. First two would be hard to run, and the third has terrible > > run-to-run variance, so you will probably have to quantify the changes > > with microbenchmarks. It should be easy enough to construct with our > > micro harness (still not available in OpenJDK). Contact me internally if > > you want to get that route. > > > > General patch review: I do have the preoccupation against interferenced > > tracing code, and while appreciating the intent for tracing patch, we > > need to look for performance penalties. The rule of thumb is that > > HotSpot will optimize away the code guarded by static final flag (or, as > > Remi points out with jsr292 magic). Doing the calls hoping for HotSpot > > to inline and figure out the absence of useful work there is not working > > reliably either. Inline conditionals will cost something if the tracing > > method is not inlined. > > > > Hence, I would rather recommend to switch the uses like this: > > > > public int read() throws IOException { > > Object traceHandle = IoTrace.fileReadBegin(path); > > int b = read0(); > > IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1); > > return b; > > } > > > > ...to something more like: > > > > public int read() throws IOException { > > if (IoTrace.ENABLED) { > > Object traceHandle = IoTrace.fileReadBegin(path); > > int b = read0(); > > IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1); > > return b; > > } else { > > return read0(); > > } > > } > > > > where > > > > class IoTrace { > > public static final boolean ENABLED = > > System.getProperty("java.io.trace"); > > } > > > > ...which will demote the flexibility of setListener(), but have much > > lower runtime overhead. This should be confirmed by microbenchmarking > > anyway. > > > > -Aleksey. >