[ 
https://issues.apache.org/jira/browse/IGNITE-28853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Werner updated IGNITE-28853:
-----------------------------------
    Labels: IEP-132 ise  (was: IEP-132)

> CompressedMessage: excessive copying and per-message direct buffer 
> allocations on both send and receive paths
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28853
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28853
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Anton Vinogradov
>            Assignee: Dmitry Werner
>            Priority: Major
>              Labels: IEP-132, ise
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CompressedMessage moves the same bytes through memory 3-5 times and allocates
> direct ByteBuffers per message on both sides of the wire. Direct allocation
> is expensive (memory zeroing, Cleaner-based release, potential System.gc()
> inside Bits.reserveMemory), yet no step on the path actually needs a direct
> buffer: data arrives in and leaves through heap arrays.
>
> Send path: compress() copies the whole source buffer into a byte[], deflates
> it via DeflaterOutputStream (512-byte internal buffer, hence many small JNI
> calls) into a ByteArrayOutputStream pre-sized to the *uncompressed* length,
> then copies again via toByteArray(). ChunkedByteReader then copies every 10K
> chunk into a fresh array once more.
>
> Receive path: CompressedMessageSerializer.readFrom() accumulates incoming
> chunks into a 100KB direct ByteBuffer allocated per message (grown by
> doubling, which means another copy), although each chunk is already a fresh
> heap array returned by readByteArray(). uncompress() copies it all back into
> a heap array and inflates via InflaterInputStream.readAllBytes() (internal 8K
> buffers plus a final consolidation copy), even though the exact result size
> is known upfront. DirectMessageReader.readCompressedMessageAndDeserialize()
> then copies the whole uncompressed payload into yet another per-message
> direct buffer, although DirectByteBufferStream fully supports heap buffers.
>
> Fix (wire format unchanged):
> * Internal representation switched to List<byte[]> chunks for both 
> directions, ChunkedByteReader removed.
> * compress(): raw Deflater with setInput(ByteBuffer) (no input copy), 
> deflating straight into wire-ready chunks - compressed bytes are written 
> exactly once.
> * readFrom(): a received chunk is simply added to the list - zero copies, 
> zero direct allocations.
> * uncompress(): raw Inflater fed chunk by chunk into an exact-size 
> byte[dataSize].
> * readCompressedMessageAndDeserialize(): ByteBuffer.wrap(uncompressed) 
> instead of allocateDirect+put+flip.
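
The compress() and uncompress() bullets above could look roughly like the
following. This is a minimal sketch, not the actual patch: the class name,
method signatures and the CHUNK_SIZE constant are assumptions made for
illustration. Only the java.util.zip calls are standard JDK API, and
Deflater.setInput(ByteBuffer) requires Java 11 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative only; names and layout are assumptions, not Ignite code. */
final class ChunkedCompressionSketch {
    /** Hypothetical chunk size; the issue only mentions "10K" chunks. */
    static final int CHUNK_SIZE = 10 * 1024;

    /** Deflates the remaining bytes of src straight into fixed-size chunks. */
    static List<byte[]> compress(ByteBuffer src) {
        Deflater deflater = new Deflater();
        List<byte[]> chunks = new ArrayList<>();
        try {
            deflater.setInput(src); // Java 11+: no intermediate byte[] copy of the input
            deflater.finish();

            byte[] chunk = new byte[CHUNK_SIZE];
            int filled = 0;
            while (!deflater.finished()) {
                filled += deflater.deflate(chunk, filled, chunk.length - filled);
                if (filled == chunk.length) { // chunk is full: hand it over as-is
                    chunks.add(chunk);
                    chunk = new byte[CHUNK_SIZE];
                    filled = 0;
                }
            }
            if (filled > 0)
                chunks.add(Arrays.copyOf(chunk, filled)); // only the short tail is trimmed
        } finally {
            deflater.end(); // release native zlib state
        }
        return chunks;
    }

    /** Inflates the chunks one by one into an array of the known size. */
    static byte[] uncompress(List<byte[]> chunks, int dataSize) throws DataFormatException {
        Inflater inflater = new Inflater();
        byte[] out = new byte[dataSize];
        int written = 0;
        boolean finished;
        try {
            for (byte[] chunk : chunks) {
                inflater.setInput(chunk);
                while (!inflater.finished() && !inflater.needsInput()) {
                    int n = inflater.inflate(out, written, out.length - written);
                    written += n;
                    // No progress with input still pending: output is full or a dictionary is needed.
                    if (n == 0 && !inflater.finished() && !inflater.needsInput())
                        throw new DataFormatException("Payload does not match declared size");
                }
            }
            finished = inflater.finished();
        } finally {
            inflater.end();
        }
        if (!finished || written != dataSize)
            throw new DataFormatException("Truncated stream or wrong declared size");
        return out;
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] data = "hello hello hello ".repeat(1000).getBytes(StandardCharsets.UTF_8);
        List<byte[]> chunks = compress(ByteBuffer.wrap(data));
        byte[] restored = uncompress(chunks, data.length);
        System.out.println(chunks.size() + " chunk(s), round trip ok: " + Arrays.equals(data, restored));
    }
}
```

The sketch uses the default Deflater and Inflater constructors (zlib
wrapper), which is also what DeflaterOutputStream and InflaterInputStream use
by default, so it is at least consistent with the "wire format unchanged"
claim. Whether the real change does the same is not stated in the issue.
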
>
> JMH (GridDhtPartitionsFullMessage receive round-trip with two @Compress map
> fields, JDK 17, M-series):
> * 30 entries: 15.2K +/- 34.8K -> 100.9K +/- 6.2K ops/s (~6.6x; master's huge 
> variance is caused by per-message direct allocations triggering GC storms), 
> heap 66.7K -> 25.2K B/op (-62%).
> * 500 entries: 4.38K -> 5.76K ops/s (+31%), heap 522K -> 431K B/op (-18%).
> * On top of the heap savings, all per-message direct buffer allocations 
> (~365KB/op at 500 entries, invisible to gc.alloc.rate) are eliminated.
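
For the direct-buffer point, the difference between the two patterns named in
the readCompressedMessageAndDeserialize() bullet can be shown in isolation.
This is a hedged sketch: the class and method names are hypothetical, and only
the ByteBuffer calls are standard JDK API.

```java
import java.nio.ByteBuffer;

/** Illustrative only; not Ignite code. */
final class HeapWrapSketch {
    /** Pattern the issue describes as current: native allocation plus a full copy. */
    static ByteBuffer viaDirectCopy(byte[] uncompressed) {
        ByteBuffer buf = ByteBuffer.allocateDirect(uncompressed.length);
        buf.put(uncompressed);
        buf.flip(); // make the written bytes readable
        return buf;
    }

    /** Proposed pattern: a heap buffer backed by the same array, no copy. */
    static ByteBuffer viaWrap(byte[] uncompressed) {
        return ByteBuffer.wrap(uncompressed);
    }
}
```

Both buffers expose the same readable bytes, but the wrapped one shares the
caller's array, so it only suits consumers that accept heap buffers, which
the issue says DirectByteBufferStream does.
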



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
