richardstartin commented on issue #7866:
URL: https://github.com/apache/pinot/issues/7866#issuecomment-985992967


   Hi @mqliang, there’s a problem with your benchmark: unless you pass the 
produced value to a Blackhole (or return it, if there’s only one), the JIT 
compiler may notice that the value is never used and eliminate all the code 
that produces it. That is, the code you’re measuring may not execute at all. 
A side effect in the method invocation may prevent dead code elimination, but 
that isn’t guaranteed.
   
   A good way to spot this is to ask yourself how many cycles the measured 
code must have taken at your processor frequency. Assuming your processor 
frequency is 4GHz, 0.5ns is 2 cycles, which isn’t enough to have made even a 
userspace function call to read a clock.
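
   The arithmetic behind this sanity check can be sketched as follows. This is 
only an illustration, not part of the benchmark: the class name `CycleBudget` 
and the method `cycles` are hypothetical, and the 4GHz frequency is the same 
assumption as above.

```java
public final class CycleBudget {
    private CycleBudget() {}

    /** Cycles elapsed in latencyNs nanoseconds at frequencyGhz (1 GHz = 1 cycle per ns). */
    public static double cycles(double latencyNs, double frequencyGhz) {
        return latencyNs * frequencyGhz;
    }

    public static void main(String[] args) {
        // 0.5ns as reported by the benchmark, at an assumed 4GHz clock
        System.out.println(cycles(0.5, 4.0));  // prints 2.0
        // 20ns, the low end of the userspace clock latency quoted in this comment
        System.out.println(cycles(20.0, 4.0)); // prints 80.0
    }
}
```

   A result of a couple of cycles is a strong hint that the measured code was 
optimised away rather than executed.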
   
   You will get different results if you return the value:
   
     @Benchmark
     public long benchmarkThreadCpuTimer() {
       ThreadTimer threadTimer = new ThreadTimer();
       long totalThreadCpuTimeNs = threadTimer.getThreadTimeNs();
       return totalThreadCpuTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemCurrentTimeMillis() {
       long startWallClockTimeMs = System.currentTimeMillis();
       long totalWallClockTimeMs = System.currentTimeMillis() - startWallClockTimeMs;
       long totalWallClockTimeNs = TimeUnit.MILLISECONDS.toNanos(totalWallClockTimeMs);
       return totalWallClockTimeNs;
     }
   
     @Benchmark
     public long benchmarkSystemNanoTime() {
       long startWallClockTimeNs = System.nanoTime();
       long totalWallClockTimeNs = System.nanoTime() - startWallClockTimeNs;
       // Return the elapsed time so that both clock reads feed the result.
       return totalWallClockTimeNs;
     }
   
   System.currentTimeMillis() and System.nanoTime() generally do *not* require 
a syscall because they call into clock_gettime(CLOCK_REALTIME) and 
clock_gettime(CLOCK_MONOTONIC) respectively, which have been in the vDSO for a 
very long time. When these clocks are userspace calls, you should expect 
latencies of 20-30ns. When they are syscalls, expect hundreds of nanoseconds. 
To see what actual syscall overhead looks like, run the fixed benchmark on a 
virtualised instance using a ‘xen’ clocksource and you will see the difference.
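
   Before comparing numbers across machines, it may help to check which 
clocksource the kernel is actually using. The sketch below assumes a Linux 
host that exposes the usual sysfs file 
/sys/devices/system/clocksource/clocksource0/current_clocksource; the class 
and method names are hypothetical, and on other systems the file will simply 
be absent.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public final class ClocksourceCheck {
    // Assumed location of the active clocksource on Linux.
    static final Path DEFAULT =
        Path.of("/sys/devices/system/clocksource/clocksource0/current_clocksource");

    private ClocksourceCheck() {}

    /** Returns the trimmed file contents, or empty if the file cannot be read. */
    public static Optional<String> read(Path file) {
        try {
            return Optional.of(Files.readString(file).trim());
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(read(DEFAULT).orElse("unknown"));
    }
}
```

   A value such as `tsc` usually goes with the fast userspace path, whereas 
`xen` is the case described above where the clock reads become syscalls.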


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


