[ 
https://issues.apache.org/jira/browse/IGNITE-10238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695689#comment-16695689
 ] 

Amelchev Nikita commented on IGNITE-10238:
------------------------------------------

Hi, [~agoncharuk]. 

The problem is in continuous queries.

Continuous queries have two versions of the discovery protocol: one with mutable messages and one with immutable messages.

The discovery protocol with mutable messages has a problem: 
when _GridContinuousProcessor.processStartRequest_ processes a 
_StartRoutineDiscoveryMessage_, it sends additional messages as part of 
unmarshalling (when peerClsLoading is enabled). This may deadlock the discovery threads. 
The processing cannot be moved to the system pool, because the message, once modified 
after unmarshalling, has to be sent on across the ring.
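
To illustrate the kind of stall described above, here is a minimal sketch in plain Java. It is not Ignite code: the single-thread executor named disco and the class DiscoThreadDeadlockSketch are hypothetical stand-ins, and the real thread dump involves several cooperating threads rather than one. The sketch only shows the general pattern, where a handler running on the only processing thread blocks on a result that the same thread would have to produce.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DiscoThreadDeadlockSketch {
    /**
     * Returns true if the outer task fails to finish within timeoutMs,
     * i.e. the single worker thread is stuck waiting on itself.
     */
    static boolean blocksWhenWaitingOnOwnThread(long timeoutMs) throws Exception {
        // Hypothetical stand-in for a single discovery processing thread.
        ExecutorService disco = Executors.newSingleThreadExecutor();
        try {
            Future<String> outer = disco.submit(() -> {
                // The handler queues follow-up work on the same thread...
                Future<String> inner = disco.submit(() -> "metadata");
                // ...and then blocks on it. The only thread is busy running
                // this handler, so the inner task can never start.
                return inner.get();
            });
            try {
                outer.get(timeoutMs, TimeUnit.MILLISECONDS);
                return false; // completed, no stall
            } catch (TimeoutException e) {
                return true; // still blocked after the timeout
            }
        } finally {
            // Interrupt the stuck worker so the JVM can exit.
            disco.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stalled = " + blocksWhenWaitingOnOwnThread(500));
    }
}
```

In this simplified model, moving the blocking call to another pool would remove the stall. As noted above, that is not an option here, because the modified message has to continue across the ring from the discovery thread.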

Also, I talked to Denis, and his changes (async p2p) will not fix this 
problem (IGNITE-3653).

I suggest two possible solutions:

1. Use the discovery protocol with immutable messages (2) when peerClsLoading 
is enabled.
2. Drop the protocol with mutable messages.

Any thoughts?  

> Intermittent Client Nodes suite hang
> ------------------------------------
>
>                 Key: IGNITE-10238
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10238
>             Project: Ignite
>          Issue Type: Test
>            Reporter: Alexey Goncharuk
>            Assignee: Amelchev Nikita
>            Priority: Critical
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.8
>
>
> There are occasional hangs of Client Nodes suite in master. A quick peek at 
> the thread dumps reveals an interesting deadlock (only relevant parts of the 
> thread dump are left):
> {code}
> "disco-notifier-worker-#634%internal.IgniteClientReconnectApiExceptionTest0%" 
> #791 prio=5 os_prio=0 tid=0x00007f990c12d800 nid=0x11b9 waiting on condition 
> [0x00007f991a0eb000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>       at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.metadata(CacheObjectBinaryProcessorImpl.java:656)
>       at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.metadata(CacheObjectBinaryProcessorImpl.java:206)
>       at 
> org.apache.ignite.internal.binary.BinaryContext.metadata(BinaryContext.java:1293)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:2007)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:286)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:185)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984)
>       at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703)
>       at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188)
>       at 
> org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1984)
>       at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:703)
>       at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:188)
>       at 
> org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:874)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1764)
>       at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
>       at 
> org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:313)
>       at 
> org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:101)
>       at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:81)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10131)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10160)
>       at 
> org.apache.ignite.internal.GridEventConsumeHandler.p2pUnmarshal(GridEventConsumeHandler.java:390)
>       at 
> org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1362)
>       at 
> org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:111)
>       at 
> org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:203)
>       at 
> org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:194)
>       at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:725)
>       at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:602)
>       - locked <0x00000007b62859b8> (a java.lang.Object)
>       at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4$$Lambda$17/432384581.run(Unknown Source)
>       at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2665)
>       at 
> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2703)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at java.lang.Thread.run(Thread.java:748)
> "async-callable-runner-1" #876 prio=5 os_prio=0 tid=0x00007f990c26a000 
> nid=0x120b waiting on condition [0x00007f991a2ed000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>       at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:495)
>       at 
> org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:194)
>       at 
> org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1332)
>       at 
> org.apache.ignite.internal.binary.BinaryClassDescriptor.write(BinaryClassDescriptor.java:777)
>       at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal0(BinaryWriterExImpl.java:223)
>       at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:164)
>       at 
> org.apache.ignite.internal.binary.BinaryWriterExImpl.marshal(BinaryWriterExImpl.java:151)
>       at 
> org.apache.ignite.internal.binary.GridBinaryMarshaller.marshal(GridBinaryMarshaller.java:254)
>       at 
> org.apache.ignite.internal.binary.BinaryMarshaller.marshal0(BinaryMarshaller.java:84)
>       at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:57)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10213)
>       at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1387)
>       at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:666)
>       at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:538)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at 
> org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:809)
>       at 
> org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:476)
>       at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor.callAsync(GridClosureProcessor.java:449)
>       at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor.callAsync(GridClosureProcessor.java:420)
>       at 
> org.apache.ignite.internal.IgniteComputeImpl.broadcastAsync0(IgniteComputeImpl.java:635)
>       at 
> org.apache.ignite.internal.IgniteComputeImpl.broadcast(IgniteComputeImpl.java:611)
>       at 
> org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$26.apply(IgniteClientReconnectApiExceptionTest.java:578)
>       at 
> org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$26.apply(IgniteClientReconnectApiExceptionTest.java:574)
>       at 
> org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$36.call(IgniteClientReconnectApiExceptionTest.java:853)
>       at 
> org.apache.ignite.internal.IgniteClientReconnectApiExceptionTest$36.call(IgniteClientReconnectApiExceptionTest.java:851)
>       at 
> org.apache.ignite.testframework.GridTestUtils.lambda$runAsync$2(GridTestUtils.java:1003)
>       at 
> org.apache.ignite.testframework.GridTestUtils$$Lambda$133/962953099.run(Unknown Source)
>       at 
> org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:1299)
>       at 
> org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:84)
> "tcp-disco-msg-worker-#88%internal.IgniteClientReconnectApiExceptionTest0%" 
> #792 prio=10 os_prio=0 tid=0x00007f990c130800 nid=0x11ba waiting on condition 
> [0x00007f997fdfc000]
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>       at 
> org.apache.ignite.internal.util.future.IgniteFutureImpl.get(IgniteFutureImpl.java:134)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:5648)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:5455)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2836)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2610)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7186)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2699)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7117)
>       at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61)
> {code}
> Need to investigate what it is that we are trying to deserialize in the 
> discovery thread. From the binary metadata workflow we should be able to 
> deserialize the value right away.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
