[ 
https://issues.apache.org/jira/browse/IGNITE-28520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Abashev updated IGNITE-28520:
----------------------------------
    Description: 
Background / Problem Statement:
After moving the marshalling methods (prepareMarshal / finishUnmarshal) into 
the NIO thread, two related issues emerged:

- Performance degradation (IGNITE-28473). Marshalling of CustomObject/CacheObject
  now happens in a single NIO worker, whereas previously it was done in parallel
  across user threads.
- Deadlock in Discovery. The marshaller broadcasts a class registration message
  across the cluster and waits for acknowledgement from all nodes. If marshalling
  happens on the Discovery thread, a deadlock occurs: the thread waits for a
  response to a message it is supposed to process itself.
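
The Discovery deadlock can be illustrated with a minimal sketch that is not Ignite code. It assumes a single-threaded executor standing in for the Discovery thread, and the class and method names are hypothetical. A task running on the only worker enqueues a follow-up task to the same worker and then blocks on its result; a short timeout is used here only so that the sketch terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustration only: a single-threaded executor stands in for the Discovery thread. */
final class SelfWaitDeadlockSketch {
    /** Returns true if the worker stalled while waiting for a task queued behind itself. */
    static boolean workerStallsOnOwnQueue() {
        ExecutorService discovery = Executors.newSingleThreadExecutor();

        try {
            Future<Boolean> outer = discovery.submit(() -> {
                // "Marshalling" on the worker: enqueue a registration message to the
                // worker's own queue, then block until it is acknowledged.
                Future<String> ack = discovery.submit(() -> "ack");

                try {
                    // Assumption for the sketch: real code would wait without a timeout.
                    ack.get(200, TimeUnit.MILLISECONDS);

                    return false;
                }
                catch (TimeoutException e) {
                    // The ack task cannot start while this task occupies the only thread.
                    return true;
                }
            });

            return outer.get(5, TimeUnit.SECONDS);
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            discovery.shutdownNow();
        }
    }
}
```

Running the marshalling step on a different thread (a user thread) leaves the worker free to process the acknowledgement, which is the direction the proposal below takes.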

Root cause: the serializer invokes prepareMarshal / finishUnmarshal directly on 
the sending thread (NIO / Discovery), whereas these methods must be executed on 
a user thread.

Proposed Solution (Phase 1):
Implement two-phase marshalling for CacheObject fields:

- Phase 1, on the thread that calls send (a user thread): add methods to the
  generated serializer that recursively traverse all @Order-annotated fields,
  locate CacheObject fields (including nested ones and those inside
  collections), invoke prepareMarshal, and store the result in a byte[].
- Phase 2, on the NIO sending thread: the serializer reads the pre-computed
  byte[] and writes it to the socket. prepareMarshal is not called.
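
The two phases can be sketched as follows. This is a simplified illustration under stated assumptions, not the generated serializer: `SketchCacheObject` and `TwoPhaseSketch` are hypothetical names, a plain `Thread` stands in for the NIO worker, and a `ByteArrayOutputStream` stands in for the socket.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical stand-in for a CacheObject field; not the Ignite interface. */
final class SketchCacheObject {
    private final String val;

    /** Filled in phase 1, only read in phase 2. */
    private byte[] bytes;

    /** Name of the thread that ran prepareMarshal, recorded for the demonstration. */
    String marshalledOn;

    SketchCacheObject(String val) {
        this.val = val;
    }

    /** Phase 1: the potentially expensive step, meant to run on the calling thread. */
    void prepareMarshal() {
        marshalledOn = Thread.currentThread().getName();
        bytes = val.getBytes(StandardCharsets.UTF_8);
    }

    /** Phase 2 accessor: returns the pre-computed bytes and never marshals. */
    byte[] preparedBytes() {
        if (bytes == null)
            throw new IllegalStateException("prepareMarshal was not called");

        return bytes;
    }
}

final class TwoPhaseSketch {
    /** Marshals on the calling thread, then lets a separate thread write the bytes. */
    static byte[] send(List<SketchCacheObject> fields) {
        // Phase 1: runs on the thread that calls send.
        for (SketchCacheObject f : fields)
            f.prepareMarshal();

        ByteArrayOutputStream socket = new ByteArrayOutputStream();

        // Phase 2: the stand-in NIO worker only copies pre-computed bytes.
        Thread nio = new Thread(() -> {
            for (SketchCacheObject f : fields) {
                byte[] b = f.preparedBytes();

                socket.write(b, 0, b.length);
            }
        }, "nio-worker-sketch");

        nio.start();

        try {
            nio.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }

        return socket.toByteArray();
    }
}
```

In this sketch the bytes are published to the worker through `Thread.start()`; a real send queue would need its own safe hand-off between the two threads.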

This phase covers only CacheObject fields whose serialization code is produced
by the code generator via @Order. Manual code for MarshallableMessage fields
(e.g. GridJobExecuteResponse::marshallUserData) and encapsulation of byte[]
fields are deferred to the next ticket.

Out of scope (next ticket):

- Handling MarshallableMessage fields that require manual code.
- Hiding / encapsulating byte[] fields inside messages.


Acceptance Criteria:

- prepareMarshal / finishUnmarshal for CacheObject fields are invoked only on a
  user thread, never on NIO / Discovery threads.
- The NIO worker only reads pre-computed bytes and writes them to the socket.
- Recursive traversal of @Order-annotated fields correctly handles nested
  CacheObject instances and collections.
- The Discovery deadlock when sending messages with CustomObject is no longer
  reproducible.
- No performance degradation (confirmed by JMH benchmarks, see IGNITE-28119).
- Existing tests pass.
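
The traversal criterion can be sketched with runtime reflection. This is an assumption-laden illustration only: the ticket describes generated code rather than reflection, and `SketchOrder`, `LeafObject`, `InnerMsg`, `OuterMsg` and `TraversalSketch` are hypothetical stand-ins for @Order, CacheObject and message classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical stand-in for @Order; the real annotation drives code generation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchOrder {
    int value();
}

/** Hypothetical leaf that must be marshalled on a user thread. */
final class LeafObject {
    boolean prepared;

    void prepareMarshal() {
        prepared = true;
    }
}

/** Hypothetical nested message. */
final class InnerMsg {
    @SketchOrder(0) LeafObject val = new LeafObject();
}

/** Hypothetical top-level message with a direct, a nested and a collection field. */
final class OuterMsg {
    @SketchOrder(0) LeafObject key = new LeafObject();
    @SketchOrder(1) InnerMsg inner = new InnerMsg();
    @SketchOrder(2) List<LeafObject> vals = Arrays.asList(new LeafObject(), new LeafObject());

    /** Not annotated, so the traversal skips it. */
    LeafObject notOrdered = new LeafObject();
}

final class TraversalSketch {
    /** Visits annotated fields recursively; returns the number of leaves prepared. */
    static int prepare(Object node) {
        if (node == null)
            return 0;

        if (node instanceof LeafObject) {
            ((LeafObject)node).prepareMarshal();

            return 1;
        }

        int cnt = 0;

        if (node instanceof Collection) {
            for (Object e : (Collection<?>)node)
                cnt += prepare(e);

            return cnt;
        }

        for (Field f : node.getClass().getDeclaredFields()) {
            if (!f.isAnnotationPresent(SketchOrder.class))
                continue;

            try {
                f.setAccessible(true);

                cnt += prepare(f.get(node));
            }
            catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }

        return cnt;
    }
}
```

Calling `TraversalSketch.prepare(new OuterMsg())` on the sending user thread visits the direct field, the nested message and both collection elements, and leaves the unannotated field untouched.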


  was:
When sending the same message to multiple nodes via 
`GridIoManager.sendToGridTopic()`, the same `Message` object is passed in a 
loop for each recipient. Serialization does not happen in the sender thread — 
it happens later, in NIO worker threads 
(`DirectNioClientWorker.writeToBuffer()` → 
`GridIoMessageMarshallableSerializer.writeTo()`).

This results in the same message being serialized **N times concurrently** — 
once per target node, each time in a separate NIO worker thread — even though 
the byte representation is identical for all recipients. The result is a 
performance drop due to redundant serialization work.

**Call stack:**

    GridIoMessageMarshallableSerializer.writeTo()
      → GridNioServer$DirectNioClientWorker.writeToBuffer()
        → processWrite0() → processWrite()
          → AbstractNioClientWorker.processSelectedKeysOptimized()
            → bodyInternal() → body()

**Expected behavior:** the message should be serialized **once** in the thread 
where it was constructed, before being enqueued to the NIO layer. NIO workers 
should operate on a pre-built byte buffer rather than re-serializing the 
mutable message object independently.

**Proposed approach:** use the existing serialization mechanism to produce the 
byte representation of the message once, eagerly, before handing it off to the 
send queue. NIO workers then write the cached bytes directly to the socket 
buffer without invoking the serializer.


> Move prepareMarshal / finishUnmarshal out of NIO communication thread — Phase 
> 1: CacheObjects
> ---------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28520
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28520
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Alex Abashev
>            Assignee: Alex Abashev
>            Priority: Minor
>              Labels: IEP-132, ise
>             Fix For: 2.19
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
