On Fri, 23 Dec 2022 22:28:34 GMT, Markus KARG <d...@openjdk.org> wrote:

> I/O has always been much slower than CPU and memory access, and thanks to 
> physical constraints, always will be.
> While CPUs can be shrunk more and more, and can hold more and more memory 
> cache on or nearby a CPU core, the distance between CPU core and I/O device 
> cannot get reduced much: It will stay "far" away.
> Due to this simple logic (and other factors), the spread between performance 
> of CPU and memory access on one hand, and performance of I/O on the other 
> hand, increases with every new CPU generation.
> As a consequence, internal adjustment factors of the JDK need to get revised 
> from time to time to ensure optimum performance on each hardware generation.
> 
> One such factor is the size of the temporary transfer buffer used internally 
> by `InputStream::transferTo`.
> Since its introduction with JDK 9, many years (hence hardware generations) 
> have passed, so it's time to check the appropriateness of that buffer's size.
> 
> Using JMH on a typical, modern cloud platform, it was proven that the current 
> 8K buffer is (much) too small on modern hardware:
> The small buffer clearly stands in the way of faster transfers.
> The ops/s of a simple `FileInputStream.transferTo(ByteArrayOutputStream)` 
> operation on JDK 21 could be doubled (!) by only doubling the buffer size 
> from 8K to 16K, which seems to be a considerable and cheap deal.
> Doubling the buffer further shows only marginal improvements of approx. 1% 
> to 3% per further doubling, which does not justify the additional memory 
> consumption.
> 
> 
> TransferToPerformance.transferTo 8192 1048576 thrpt 25 1349.929 ± 47.057 ops/s
> TransferToPerformance.transferTo 16384 1048576 thrpt 25 2633.560 ± 93.337 ops/s
> TransferToPerformance.transferTo 32768 1048576 thrpt 25 2721.025 ± 89.555 ops/s
> TransferToPerformance.transferTo 65536 1048576 thrpt 25 2855.949 ± 96.623 ops/s
> TransferToPerformance.transferTo 131072 1048576 thrpt 25 2903.062 ± 40.798 ops/s
> 
> 
> Even on small or limited platforms, an investment of 8K additional temporary 
> buffer is very cheap and very useful, as it doubles the performance of 
> `InputStream::transferTo`, in particular for legacy (non-NIO) applications 
> still using `FileInputStream` and `ByteArrayOutputStream`.
> I dare say, even if not proven, that this covers a very considerable (possibly 
> the major) share of existing applications, as NIO was only adopted gradually 
> by programmers.
> 
> For the given reasons, it should be appropriate to change 
> `DEFAULT_BUFFER_SIZE` from 8192 to 16384.

Here are also results for a "modern" architecture: I ran the benchmark in a 
k8s container on an Oracle Cloud ARM64 virtual machine.

With dropCaches=true:
https://jmh.morethan.io/?gist=2db4c3b51073e90c1a84d7eed8e1a988

With dropCaches=false:
https://jmh.morethan.io/?gist=d7dfb5def5899af41722b2768a827006

Here, increasing the buffer from 8k to 16k improves performance by about 10% 
(doing actual I/O) up to 20% (reading from cache).
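
For readers who want to see what the buffer size governs, here is a minimal 
sketch of a `transferTo`-style copy loop with the buffer size as a parameter. 
It is a simplified illustration, not the JDK source and not the benchmark 
code: the class `TransferSketch` and the method `transfer` are hypothetical 
names, and the in-memory streams only stand in for real I/O. A larger buffer 
means fewer `read`/`write` round trips for the same amount of data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class; not part of the JDK or the PR.
final class TransferSketch {

    // Copies everything from 'in' to 'out' through one temporary buffer of
    // the given size and returns the number of bytes copied.
    static long transfer(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long transferred = 0;
        int read;
        // read() returns -1 at end of stream; each iteration moves at most
        // bufferSize bytes, so a larger buffer needs fewer iterations.
        while ((read = in.read(buffer, 0, bufferSize)) >= 0) {
            out.write(buffer, 0, read);
            transferred += read;
        }
        return transferred;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20]; // 1 MiB of zeros as stand-in input
        for (int size : new int[] {8192, 16384}) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long n = transfer(new ByteArrayInputStream(data), out, size);
            System.out.println(size + " byte buffer: " + n + " bytes copied");
        }
    }
}
```

This sketch only shows the mechanics; it says nothing about throughput, which 
depends on the actual source and sink and should be measured with JMH as in 
the results above.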

-------------

PR: https://git.openjdk.org/jdk/pull/11783
