On Fri, 23 Dec 2022 22:28:34 GMT, Markus KARG <d...@openjdk.org> wrote:
> I/O has always been much slower than CPU and memory access, and thanks to
> physical constraints, always will be.
> While CPUs can shrink more and more, and can hold more and more memory
> cache on or near a CPU core, the distance between CPU core and I/O device
> cannot be reduced much: it will stay "far" away.
> Due to this simple logic (and other factors), the spread between performance
> of CPU and memory access on one hand, and performance of I/O on the other
> hand, increases with every new CPU generation.
> As a consequence, internal adjustment factors of the JDK need to be revised
> from time to time to ensure optimum performance on each hardware generation.
>
> One such factor is the size of the temporary transfer buffer used internally
> by `InputStream::transferTo`.
> Since its introduction with JDK 9, many years (hence hardware generations)
> have passed, so it's time to check the appropriateness of that buffer's size.
>
> Using JMH on a typical, modern cloud platform, it was shown that the current
> 8K buffer is (much) too small on modern hardware:
> the small buffer clearly stands in the way of faster transfers.
> The ops/s of a simple `FileInputStream.transferTo(ByteArrayOutputStream)`
> operation on JDK 21 could be doubled (!) by only doubling the buffer size
> from 8K to 16K, which seems to be a considerable gain at low cost.
> Doubling the buffer even more shows only marginal improvements of approx. 1%
> to 3% per doubling of size, which does not justify the additional memory
> consumption.
>
>
> TransferToPerformance.transferTo    8192  1048576  thrpt  25  1349.929 ± 47.057  ops/s
> TransferToPerformance.transferTo   16384  1048576  thrpt  25  2633.560 ± 93.337  ops/s
> TransferToPerformance.transferTo   32768  1048576  thrpt  25  2721.025 ± 89.555  ops/s
> TransferToPerformance.transferTo   65536  1048576  thrpt  25  2855.949 ± 96.623  ops/s
> TransferToPerformance.transferTo  131072  1048576  thrpt  25  2903.062 ± 40.798  ops/s
>
>
> Even on small or limited platforms, an investment of 8K additional temporary
> buffer is very cheap and very useful, as it doubles the performance of
> `InputStream::transferTo`, in particular for legacy (non-NIO) applications
> still using `FileInputStream` and `ByteArrayOutputStream`.
> I dare to say, even if not proven, that this is a very considerable (possibly
> the major) number of existing applications, as NIO was only adopted gradually
> by programmers.
>
> For the given reasons, it should be appropriate to change
> `DEFAULT_BUFFER_SIZE` from 8192 to 16384.

Here are also results for a "modern" architecture: I executed the benchmark in
a k8s container on an Oracle Cloud ARM64 virtual machine.

With dropCaches=true:
https://jmh.morethan.io/?gist=2db4c3b51073e90c1a84d7eed8e1a988

With dropCaches=false:
https://jmh.morethan.io/?gist=d7dfb5def5899af41722b2768a827006

Here, increasing the buffer from 8K to 16K improves performance by about 10%
(doing I/O) up to 20% (reading from cache).

-------------

PR: https://git.openjdk.org/jdk/pull/11783
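
For readers who want to try the effect of the buffer size outside the JDK, here
is a minimal sketch of a read/write copy loop with a caller-chosen buffer size.
It assumes a plain loop over a temporary byte array, as the PR description
suggests. `TransferSketch` and `transfer` are hypothetical names made up for
this illustration, and this is not the JDK's actual code or the JMH benchmark
used above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration class; not part of the JDK.
class TransferSketch {

    // Copies all bytes from in to out through a temporary buffer of the
    // given size and returns the number of bytes copied.
    static long transfer(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long transferred = 0;
        int read;
        // read() returns -1 at end of stream; each iteration moves at most
        // bufferSize bytes, so a larger buffer means fewer read/write calls.
        while ((read = in.read(buffer, 0, bufferSize)) >= 0) {
            out.write(buffer, 0, read);
            transferred += read;
        }
        return transferred;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1048576]; // 1 MiB of in-memory test data
        for (int size : new int[] {8192, 16384}) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long n = transfer(new ByteArrayInputStream(data), out, size);
            System.out.println("bufferSize=" + size + " transferred=" + n);
        }
    }
}
```

Note that this sketch copies from memory only, so it shows the mechanism and
not the I/O cost. To reproduce numbers like the ones above, the source would
have to be a `FileInputStream` measured under JMH.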