OSR (on-stack replacement) can be avoided by moving the body of each loop 
into its own method, so that the body gets normal JIT compilation, but this 
is unlikely to explain such a significant step in latency.
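
A minimal sketch of that refactoring, assuming a hot loop whose per-iteration 
work can be pulled out. The names LoopBodyExtraction, processOne and runLoop 
are hypothetical and not taken from the code under discussion; the arithmetic 
is only a stand-in for real work.

```java
final class LoopBodyExtraction {

    // Hypothetical per-iteration work. Because it is an ordinary method,
    // HotSpot can compile it through the normal invocation-counter path
    // once it has been called often enough.
    static long processOne(long acc, int i) {
        return acc + (i ^ (i >>> 3));
    }

    // The loop itself stays trivial: it only calls the extracted body.
    static long runLoop(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc = processOne(acc, i);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(runLoop(1_000_000));
    }
}
```

Running this with -XX:+PrintCompilation should show whether processOne gets 
a normal compilation entry rather than one marked with "%" (the OSR marker 
seen in the output quoted below).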

As Gil mentions, using loopback will give very different results from a real 
network. The Linux kernel bypasses OSI layer 2 for loopback, so there are no 
QDiscs. Nagle, for example, not only does not apply on loopback; disabling 
it WILL actually increase latency a little, really!
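
To compare the two settings yourself, a sketch along these lines toggles 
TCP_NODELAY on a loopback connection using the standard java.net.Socket API. 
The class and method names are hypothetical, and the latency measurement 
itself is left out; this only shows where the switch is flipped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class NoDelayToggle {

    // Opens a loopback connection, sets TCP_NODELAY as requested, and
    // returns the value the socket reports back. A real test would run
    // its ping-pong loop where the comment below indicates.
    static boolean connectWithNoDelay(boolean noDelay) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // true disables Nagle's algorithm, false leaves it enabled.
            client.setTcpNoDelay(noDelay);
            // ... send and time messages here, once per setting ...
            return client.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("noDelay=true  -> " + connectWithNoDelay(true));
        System.out.println("noDelay=false -> " + connectWithNoDelay(false));
    }
}
```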

Have you measured L1 and L2 cache hit and miss rates in each case? Even 
with isolcpus, the shared L3 on Intel is inclusive of the private caches 
(L1 & L2), so if the L3 has to evict lines then they must also be evicted 
from the corresponding L1/L2 caches. You can use CAT (Cache Allocation 
Technology), CoD (Cluster on Die), or separate sockets to help avoid this.

On Thursday, 13 April 2017 16:01:49 UTC+1, J Crawford wrote:
>
> Thanks to everyone who threw in some ideas. I was able to prove that it is 
> *not* a JIT/HotSpot de-optimization.
>
> First I got the following output when I used "-XX:+PrintCompilation 
> -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining":
>
>     Thu Apr 13 10:21:16 EDT 2017 Results: totalMessagesSent=100000 
> currInterval=1 latency=4210 timeToWrite=2514 timeToRead=1680 realRead=831 
> zeroReads=2 partialReads=0
>       *77543  560 % !   4       Client::run @ -2 (270 bytes)   made not 
> entrant*
>     Thu Apr 13 10:21:39 EDT 2017 Results: totalMessagesSent=100001 
> currInterval=30000 latency=11722 timeToWrite=5645 timeToRead=4531 
> realRead=2363 zeroReads=1 partialReads=
>
