[ 
https://issues.apache.org/jira/browse/IGNITE-28255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Abashev updated IGNITE-28255:
----------------------------------
    Description: 
Summary:
MarshallerCacheJobRunNodeRestartTest.testJobRun fails intermittently with 
timeout due to NotSerializableException in MarshallerMappingItem

Description:

The test MarshallerCacheJobRunNodeRestartTest.testJobRun hangs and times out 
after 5 minutes (300 000 ms).

TC link: 
https://ci2.ignite.apache.org/test/381112157178694638?currentProjectId=IgniteTests24Java8&branch=%3Cdefault%3E

Failure rate: 2 failures out of 68 runs (~3%), both on aitc-lin15, branch 
refs/heads/master (builds #41053, #41049)

Root cause:

During node restart, TcpDiscoverySpi fails to read a discovery message. The 
sender aborted serialization because MarshallerMappingItem is not 
serializable, and the reader sees the resulting WriteAbortedException:

  [ERROR][tcp-disco-sock-reader-...][TestTcpDiscoverySpi] Failed to read message
  org.apache.ignite.IgniteCheckedException: Failed to deserialize object with given class loader: IsolatedClassLoader{roleName='test'}
      at org.apache.ignite.marshaller.jdk.JdkMarshallerImpl.unmarshal0(JdkMarshallerImpl.java:130)
      ...
  Caused by: java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: org.apache.ignite.internal.processors.marshaller.MarshallerMappingItem
  Caused by: java.io.NotSerializableException: org.apache.ignite.internal.processors.marshaller.MarshallerMappingItem

MarshallerMappingItem is being sent as part of a TcpDiscovery message (likely a 
MarshallerMappingRequest or MarshallerMappingResponse) but does not implement 
Serializable. As a result, the restarting node cannot exchange marshaller 
mappings with the rest of the cluster, causing the worker thread to hang 
indefinitely waiting for the mapping to be resolved.

This ultimately causes GridTestUtils.runMultiThreaded() to block forever at 
Thread.join(), triggering the 5-minute test timeout.
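
The WriteAbortedException / NotSerializableException pair in the log above is 
standard JDK serialization behaviour and can be reproduced outside Ignite. The 
following is a minimal sketch, not Ignite code: PlainItem and Envelope are 
hypothetical stand-ins for MarshallerMappingItem and the discovery message that 
carries it. It assumes the sender's bytes still reach the reader after the 
write has failed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class WriteAbortedDemo {
    /** Hypothetical payload that does NOT implement Serializable. */
    static class PlainItem {
        final String clsName;

        PlainItem(String clsName) {
            this.clsName = clsName;
        }
    }

    /** Hypothetical serializable message that carries the payload. */
    static class Envelope implements Serializable {
        private static final long serialVersionUID = 0L;

        final PlainItem item;

        Envelope(PlainItem item) {
            this.item = item;
        }
    }

    /** Writes the message, ignores the write failure, returns the bytes produced so far. */
    static byte[] writeIgnoringFailure(Object msg) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            try {
                out.writeObject(msg);
            } catch (NotSerializableException e) {
                // Swallowed on purpose: the stream now ends with an exception record.
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the exception thrown on the reading side, or null if the read succeeded. */
    static Exception readFailure(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return null;
        } catch (IOException | ClassNotFoundException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        byte[] data = writeIgnoringFailure(new Envelope(new PlainItem("demo")));
        // Prints the reader-side exception and its cause.
        Exception e = readFailure(data);
        System.out.println(e);
        System.out.println(e == null ? null : e.getCause());
    }
}
```

The reader fails even though its own classes are fine, which matches the log: 
the error is raised in the socket reader, but the defect is on the writing side.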

Stack trace (thread dump at timeout):

  Thread [name="test-runner-#83435%cache.MarshallerCacheJobRunNodeRestartTest%", state=WAITING]
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1304)
    at o.a.i.testframework.GridTestUtils.runMultiThreaded(GridTestUtils.java:1124)
    at o.a.i.i.processors.cache.MarshallerCacheJobRunNodeRestartTest.testJobRun(MarshallerCacheJobRunNodeRestartTest.java:65)

Fix:

MarshallerMappingItem should implement java.io.Serializable (or be converted to 
use Ignite's own serialization mechanism) so it can be properly 
marshalled/unmarshalled during TcpDiscovery message exchange.
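
A rough sketch of the first option (implementing java.io.Serializable) is shown 
below. MappingItem is a hypothetical class; its fields are assumptions made for 
illustration and are not copied from MarshallerMappingItem. The actual fix may 
instead use Ignite's own serialization mechanism, which this sketch does not 
cover.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableItemDemo {
    /** Hypothetical mapping item; field names and types are illustrative assumptions. */
    static final class MappingItem implements Serializable {
        private static final long serialVersionUID = 0L;

        final byte platformId;
        final int typeId;
        final String clsName;

        MappingItem(byte platformId, int typeId, String clsName) {
            this.platformId = platformId;
            this.typeId = typeId;
            this.clsName = clsName;
        }
    }

    /** Serializes and deserializes the object with plain JDK serialization. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    /** Returns true if the item comes back from a round trip with the same field values. */
    static boolean survivesRoundTrip(MappingItem item) {
        try {
            MappingItem copy = roundTrip(item);
            return copy.platformId == item.platformId
                && copy.typeId == item.typeId
                && copy.clsName.equals(item.clsName);
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(new MappingItem((byte) 0, 42, "org.example.Demo")));
    }
}
```

A round-trip check of this kind could also serve as a fast regression test, so 
that the problem does not have to be caught by an intermittent 5-minute timeout.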

Environment:
- Ignite version: 2.18.0-SNAPSHOT#20260317
- JVM: OpenJDK 17.0.8.1+1 Eclipse Adoptium
- OS: Linux 5.4.0-216-generic amd64
- Agent: aitc-lin15

> Fix java.io.NotSerializableException: 
> org.apache.ignite.internal.processors.marshaller.MarshallerMappingItem
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28255
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28255
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Alex Abashev
>            Assignee: Alex Abashev
>            Priority: Major
>              Labels: IEP-132
>             Fix For: 2.19
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
