OK, just after I sent the email I realized public/private won't really give
JIT more confidence.

I'd try the two different impls approach though.  It does introduce more
types to load but if that's not an issue, I think perf should be good.  At
least worth trying for curiosity's sake. :)

Sent from my phone
On Nov 7, 2012 8:57 AM, "Staffan Larsen" <staffan.lar...@oracle.com> wrote:

>
> On 7 nov 2012, at 14:53, Vitaly Davidovich <vita...@gmail.com> wrote:
>
> Staffan,
>
> When you say you removed all implementation from fileBeginRead, do you
> mean you just return null instead of doing the ENABLED check?
>
> Yes.
>
> Does making ENABLED private yield zero-cost? May give JIT more confidence
> that this field isn't modified via reflection from outside.
>
> Sorry, that was a typo in my email, the field was private in my tests.
>
> /Staffan
>
> The other option is to have two separate implementations of the callback
> mechanism hidden inside IOTrace calls.  At static init time, you pick one
> based on ENABLED.  If not enabled you use an empty method that just returns
> null.  The JIT should handle this case well.
>
> Sent from my phone
> On Nov 7, 2012 7:56 AM, "Staffan Larsen" <staffan.lar...@oracle.com>
> wrote:
>
>> An update on performance. I have written microbenchmarks for file and
>> socket i/o and compared the results before my suggested changes and after.
>>
>> FileChannelRead       -4.06%
>> FileChannelWrite      -1.06%
>> FileInputStream        0.92%
>> FileOutputStream       1.32%
>> RandomAccessFileRead   1.66%
>> RandomAccessFileWrite  0.76%
>> SocketChannelReadWrite 0.75%
>> SocketReadWrite        0.89%
>>
>> Negative values means that my changes added a regression. I think most of
>> these values are within the margin of error in the measurements. The one
>> exception is FileChannelRead. I've rerun this many times and it looks
>> fairly consistent around a 4% regression. Why there is only a regression
>> when reading from a FileChannel, I don't know.
>>
>> The 4% number is too high I think and as a result I'm looking at
>> alternative implementations. As a first experiment I tried changing
>> IoTrace.fileReadBegin/End into something like this:
>>
>> public static final boolean ENABLED =
>> Boolean.getProperty("iotrace.enabled");
>>
>> public static IoTraceContext fileReadBegin(String path) {
>>     if (ENABLED) {
>>         ...
>>     }
>>     return null;
>> }
>>
>> This got the regression down to 2%. Still not good.
>>
>> Removing all implementation from fileReadBegin/End gets me on par with
>> the non-instrumented version.
>>
>> It looks like some form of dynamic class redefinition is need here. We
>> could start out with empty implementations of the methods in IoTrace, and
>> redefine the class to one that has implementations when tracing is enabled.
>>
>>
>> /Staffan
>>
>> On 4 nov 2012, at 13:25, Aleksey Shipilev <aleksey.shipi...@oracle.com>
>> wrote:
>>
>> > On 11/03/2012 01:15 AM, Mandy Chung wrote:
>> >> On 11/2/2012 1:47 PM, Staffan Larsen wrote:
>> >>> On 2 nov 2012, at 21:12, Mandy Chung<mandy.ch...@oracle.com>  wrote:
>> >>>
>> >>>> The *Begin() methods return a "handle" that will be passed to the
>> >>>> *End() methods.  Have you considered to define a type for it rather
>> >>>> than Object?
>> >>> Something like an empty interface, just to signal the intent?
>> >>
>> >> Yes I think so.  A marker interface would suffice.
>> >>
>> >>>> Do you have any performance measurement result that you can share?
>> >>> I don't yet have any specific numbers - I'll try to get some. The
>> >>> testing I have done indicates that the overhead is negligible. But it
>> >>> would be good to confirm this.
>> >>
>> >> refworkload is easy to run. You probably have talked with the
>> >> performance team. You can find out from them which benchmarks, if they
>> >> have any, are applicable for IO instrumentation.
>> >
>> > Performance team here.
>> >
>> > There are virtually no benchmarks against I/O per se. Looking at the
>> > patch, I would think anything doing intensive network I/O would help to
>> > quantify the change, which boils down to SPECjbb2012, SPECjEnterprise,
>> > and Volano. First two would be hard to run, and the third has terrible
>> > run-to-run variance, so you will probably have to quantify the changes
>> > with microbenchmarks. It should be easy enough to construct with our
>> > micro harness (still not available in OpenJDK). Contact me internally if
>> > you want to get that route.
>> >
>> > General patch review: I do have the preoccupation against interferenced
>> > tracing code, and while appreciating the intent for tracing patch, we
>> > need to look for performance penalties. The rule of thumb is that
>> > HotSpot will optimize away the code guarded by static final flag (or, as
>> > Remi points out with jsr292 magic). Doing the calls hoping for HotSpot
>> > to inline and figure out the absence of useful work there is not working
>> > reliably either. Inline conditionals will cost something if the tracing
>> > method is not inlined.
>> >
>> > Hence, I would rather recommend to switch the uses like this:
>> >
>> >    public int read() throws IOException {
>> >        Object traceHandle = IoTrace.fileReadBegin(path);
>> >        int b = read0();
>> >        IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1);
>> >        return b;
>> >    }
>> >
>> > ...to something more like:
>> >
>> >    public int read() throws IOException {
>> >        if (IoTrace.ENABLED) {
>> >            Object traceHandle = IoTrace.fileReadBegin(path);
>> >            int b = read0();
>> >            IoTrace.fileReadEnd(traceHandle, b == -1 ? -1 : 1);
>> >            return b;
>> >        } else {
>> >            return read0();
>> >        }
>> >    }
>> >
>> > where
>> >
>> >  class IoTrace {
>> >     public static final boolean ENABLED =
>> > System.getProperty("java.io.trace");
>> >  }
>> >
>> > ...which will demote the flexibility of setListener(), but have much
>> > lower runtime overhead. This should be confirmed by microbenchmarking
>> > anyway.
>> >
>> > -Aleksey.
>>
>>
>

Reply via email to