[
https://issues.apache.org/jira/browse/IGNITE-28520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Abashev updated IGNITE-28520:
----------------------------------
Description:
Background / Problem Statement:
After moving the marshalling methods (prepareMarshal / finishUnmarshal) into
the NIO thread, two related issues emerged:
1. Performance degradation (IGNITE-28473). Marshalling of
   CustomObject/CacheObject now happens in a single NIO worker, whereas
   previously it was done in parallel across user threads.
2. Deadlock in Discovery. The marshaller broadcasts a class registration
   message across the cluster and waits for acknowledgement from all nodes. If
   marshalling happens on the Discovery thread, a deadlock occurs: the thread
   waits for a response to a message it is supposed to process itself.
Root cause: the serializer invokes prepareMarshal / finishUnmarshal directly on
the sending thread (NIO / Discovery), whereas these methods must be executed on
a user thread.
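The Discovery deadlock described above can be reproduced in miniature. The sketch below is not Ignite code: a single-threaded executor stands in for the Discovery worker, and the class and method names are hypothetical. A task running on that worker enqueues a message that only the same worker can process, then waits for the result. A timeout is used so the sketch terminates instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in, not Ignite code: one thread plays the Discovery worker. */
class SelfWaitSketch {
    /** Returns true if the nested wait timed out, i.e. the worker was waiting on itself. */
    static boolean waitsOnItself(long timeoutMs) {
        ExecutorService discovery = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = discovery.submit(() -> {
                // "Marshalling" on the worker enqueues a message that only this worker can process.
                Future<String> ack = discovery.submit(() -> "ack");
                try {
                    ack.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return false; // would mean the ack was processed while we were waiting
                } catch (TimeoutException e) {
                    return true;  // the only worker is busy waiting, so the ack never arrives
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            discovery.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker waited on itself: " + waitsOnItself(200));
    }
}
```

Without the timeout the inner `get()` would block forever, which is the shape of the deadlock reported in this ticket.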
Proposed Solution (Phase 1):
Implement two-phase marshalling for CacheObject fields:
- Phase 1, on the send call thread (user thread): add methods to the generated
  serializer that recursively traverse all @Order-annotated fields, locate
  CacheObject fields (including nested ones and those inside collections),
  invoke prepareMarshal on each, and store the result in a byte[].
- Phase 2, on the NIO sending thread: the serializer reads the pre-computed
  byte[] and writes it to the socket. prepareMarshal is not called.
This ticket covers only CacheObject fields whose serialization code is produced
by the code generator from @Order annotations. Manual code for
MarshallableMessage fields (e.g. GridJobExecuteResponse::marshallUserData) and
encapsulation of byte[] fields are deferred to the next ticket.
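A minimal sketch of the two-phase idea, under stated assumptions: `Payload` and `Envelope` are hypothetical stand-ins for a CacheObject and a message, and the traversal is written by hand here, whereas the ticket proposes generating it from @Order annotations. It is meant only to show the split between preparing on the caller thread and copying bytes on the I/O thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class TwoPhaseSketch {
    /** Hypothetical stand-in for a CacheObject: marshalling is explicit and its result is cached. */
    static final class Payload {
        final String value;
        byte[] bytes; // filled by phase 1

        Payload(String value) { this.value = value; }

        void prepareMarshal() { bytes = value.getBytes(StandardCharsets.UTF_8); }
    }

    /** Hypothetical message with a direct field, a collection, and a nested message. */
    static final class Envelope {
        Payload key;
        List<Payload> values = new ArrayList<>();
        Envelope nested;
    }

    /** Phase 1 (caller thread): walk the fields recursively and marshal every payload. */
    static void prepare(Envelope msg) {
        if (msg == null) return;
        if (msg.key != null) msg.key.prepareMarshal();
        for (Payload p : msg.values) p.prepareMarshal();
        prepare(msg.nested);
    }

    /** Phase 2 (I/O thread): copy pre-computed bytes only; no marshalling happens here. */
    static void write(Envelope msg, ByteArrayOutputStream out) {
        if (msg == null) return;
        if (msg.key != null) copy(msg.key, out);
        for (Payload p : msg.values) copy(p, out);
        write(msg.nested, out);
    }

    static void copy(Payload p, ByteArrayOutputStream out) {
        if (p.bytes == null) throw new IllegalStateException("payload not prepared: " + p.value);
        out.write(p.bytes, 0, p.bytes.length);
    }

    /** Prepares a sample message on the calling thread and writes it on a separate thread. */
    static String roundTrip() {
        Envelope inner = new Envelope();
        inner.key = new Payload("c");
        Envelope msg = new Envelope();
        msg.key = new Payload("a");
        msg.values.add(new Payload("b"));
        msg.nested = inner;

        prepare(msg); // phase 1, on the caller thread

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Thread io = new Thread(() -> write(msg, out), "nio-sketch"); // phase 2
        io.start();
        try {
            io.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

In this sketch, `copy` fails fast if phase 1 was skipped, so a missed field is caught on the I/O thread rather than silently marshalled there.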
Out of scope (next ticket):
- Handling MarshallableMessage fields that require manual code.
- Hiding / encapsulating byte[] fields inside messages.
Acceptance Criteria:
- prepareMarshal / finishUnmarshal for CacheObject fields are invoked only on a
  user thread, never on NIO / Discovery threads.
- The NIO worker only reads pre-computed bytes and writes them to the socket.
- Recursive traversal of @Order-annotated fields correctly handles nested
  CacheObject instances and collections.
- The Discovery deadlock when sending messages with CustomObject is no longer
  reproducible.
- No performance degradation (confirmed by JMH benchmarks, IGNITE-28119).
- Existing tests pass.
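One possible way to check the first criterion in tests is a thread guard at the top of the marshalling methods. The sketch below is an assumption, not part of the ticket: the class is hypothetical, and the thread-name prefixes are placeholders that would need to match the real NIO and Discovery worker names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ThreadGuardSketch {
    /** Assumed name prefixes for NIO and Discovery workers; adjust to the real thread names. */
    static final String[] FORBIDDEN_PREFIXES = {"grid-nio-worker", "tcp-disco"};

    static boolean isTransportThread(String name) {
        for (String prefix : FORBIDDEN_PREFIXES)
            if (name.startsWith(prefix)) return true;
        return false;
    }

    /** Intended to be called at the top of a marshalling method in a test build. */
    static void checkUserThread() {
        String name = Thread.currentThread().getName();
        if (isTransportThread(name))
            throw new IllegalStateException("marshalling on transport thread: " + name);
    }

    /** Runs the guard on a thread with the given name and reports whether it tripped. */
    static boolean guardTrips(String threadName) {
        AtomicBoolean tripped = new AtomicBoolean();
        Thread t = new Thread(() -> {
            try {
                checkUserThread();
            } catch (IllegalStateException e) {
                tripped.set(true);
            }
        }, threadName);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tripped.get();
    }

    public static void main(String[] args) {
        System.out.println(guardTrips("grid-nio-worker-0")); // guard trips on a transport-like name
        System.out.println(guardTrips("pub-1"));             // guard stays quiet on other names
    }
}
```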
was:
When sending the same message to multiple nodes via
`GridIoManager.sendToGridTopic()`, the same `Message` object is passed in a
loop for each recipient. Serialization does not happen in the sender thread —
it happens later, in NIO worker threads
(`DirectNioClientWorker.writeToBuffer()` →
`GridIoMessageMarshallableSerializer.writeTo()`).
This results in the same message being serialized **N times concurrently** —
once per target node, each time in a separate NIO worker thread — even though
the byte representation is identical for all recipients. The result is a
performance drop due to redundant serialization work.
**Call stack:**
GridIoMessageMarshallableSerializer.writeTo()
→ GridNioServer$DirectNioClientWorker.writeToBuffer()
→ processWrite0() → processWrite()
→ AbstractNioClientWorker.processSelectedKeysOptimized()
→ bodyInternal() → body()
**Expected behavior:** the message should be serialized **once** in the thread
where it was constructed, before being enqueued to the NIO layer. NIO workers
should operate on a pre-built byte buffer rather than re-serializing the
mutable message object independently.
**Proposed approach:** use the existing serialization mechanism to produce the
byte representation of the message once, eagerly, before handing it off to the
send queue. NIO workers then write the cached bytes directly to the socket
buffer without invoking the serializer.
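The approach in this earlier description can be sketched as follows. This is an illustration under assumptions, not Ignite's API: `serialize` and `fanOut` are hypothetical, and a plain string stands in for the message. The message is serialized once on the calling thread, and each recipient receives its own read-only view over the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SerializeOnceSketch {
    /** Counts how many times the (hypothetical) serializer actually ran. */
    static final AtomicInteger SERIALIZATIONS = new AtomicInteger();

    /** Hypothetical serializer: a string stands in for the message. */
    static byte[] serialize(String msg) {
        SERIALIZATIONS.incrementAndGet();
        return msg.getBytes(StandardCharsets.UTF_8);
    }

    /** Serializes once on the caller thread and returns one independent view per recipient. */
    static List<ByteBuffer> fanOut(String msg, int recipients) {
        ByteBuffer shared = ByteBuffer.wrap(serialize(msg)).asReadOnlyBuffer();
        List<ByteBuffer> sendQueue = new ArrayList<>();
        for (int i = 0; i < recipients; i++)
            sendQueue.add(shared.duplicate()); // same bytes, separate position and limit
        return sendQueue;
    }

    public static void main(String[] args) {
        List<ByteBuffer> queue = fanOut("hello", 3);
        System.out.println(queue.size() + " buffers, " + SERIALIZATIONS.get() + " serialization(s)");
    }
}
```

Because each `duplicate()` has its own position, several writer threads can drain their buffers independently without copying or re-serializing the payload.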
> Move prepareMarshal / finishUnmarshal out of NIO communication thread — Phase
> 1: CacheObjects
> ---------------------------------------------------------------------------------------------
>
> Key: IGNITE-28520
> URL: https://issues.apache.org/jira/browse/IGNITE-28520
> Project: Ignite
> Issue Type: Task
> Reporter: Alex Abashev
> Assignee: Alex Abashev
> Priority: Minor
> Labels: IEP-132, ise
> Fix For: 2.19