Hello!

It's still hard to say. Can you enable more verbose logging for
org.apache.ignite?
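
A minimal sketch of one way to do this, assuming the node logs through java.util.logging (an assumption; if Log4j or another backend is configured, the equivalent level setting belongs in that backend's configuration file instead). The class name VerboseIgniteLogging is hypothetical and used only for illustration.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VerboseIgniteLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an unreferenced logger can be lost after GC.
    private static final Logger IGNITE_LOGGER = Logger.getLogger("org.apache.ignite");

    /** Raises the org.apache.ignite logger (and the root handlers) to FINE. */
    public static Logger enable() {
        IGNITE_LOGGER.setLevel(Level.FINE);
        // Handlers apply their own level filter, so they must allow FINE as well.
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            handler.setLevel(Level.FINE);
        }
        return IGNITE_LOGGER;
    }

    public static void main(String[] args) {
        Logger log = enable();
        log.fine("verbose logging enabled for org.apache.ignite");
    }
}
```

Calling enable() before the node starts should make the extra output visible from startup onwards.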

Did the cluster eventually get unstuck?

Regards,
-- 
Ilya Kasnacheev


Tue, 18 Dec 2018 at 14:25, [email protected] <[email protected]>:

> Hi Ilya,
>
> Attached is the full log from another Ignite node. The data in the
> cluster is written back to MySQL.
>
> On this node the ERROR occurred at around 2018-12-14 10:38:51.730, but
> the node kept working after that.
>
>
> Regards
> Aaron
>
>
> *From:* Ilya Kasnacheev <[email protected]>
> *Date:* 2018-12-18 18:44
> *To:* user <[email protected]>
> *Subject:* Re: Partition-exchanger blocked after upgrade to 2.7
> Hello!
>
> Unfortunately, it's hard to say what is happening here from such a short
> log snippet. Can you provide full logs?
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Tue, 18 Dec 2018 at 05:51, [email protected] <[email protected]>:
>
>> Hello,
>>
>> After upgrading to 2.7 we are seeing a weird warning. Basically all our
>> Ignite caches run in LOCAL mode on an internal network.
>>
>> Almost all of the configuration is default, but we get an ERROR logged by
>> tcp-disco-msg-worker*. After that the cluster keeps working; no crash
>> happens.
>>
>>
>> [ERROR] 2018-12-17 23:52:55.989 
>> [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] [ig] G - Blocked 
>> system-critical thread has been detected. This can lead to cluster-wide 
>> undefined behaviour [threadName=partition-exchanger, blockedFor=5s]
>>
>> [WARN ] 2018-12-17 23:52:55.989 
>> [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] [ig] G - Thread 
>> [name="exchange-worker-#98%PortfolioEventIgnite%", id=152, 
>> state=TIMED_WAITING, blockCnt=0, waitCnt=10143]
>>
>>     Lock 
>> [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@39b50130
>> , ownerName=null, ownerId=-1]
>>
>>
>> [WARN ] 2018-12-17 23:52:55.998 
>> [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] [ig] FailureProcessor - No 
>> deadlocked threads detected.
>>
>> [WARN ] 2018-12-17 23:52:57.443 [jvm-pause-detector-worker] [ig] 
>> IgniteKernal%PortfolioEventIgnite - Possible too long JVM pause: 1404 
>> milliseconds.
>>
>> [WARN ] 2018-12-17 23:52:57.457 
>> [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] [ig] FailureProcessor - 
>> Thread dump at 2018/12/17 23:52:57 UTC
>>
>>
>> Since the caches are local, I am not sure why the partition-exchanger is
>> still blocking.
>>
>> Also, since we are running on an internal network, the
>> tcp-disco-msg-worker warning should not happen.
>>
>> As for "Possible too long JVM pause: 1404 milliseconds", the GC details
>> from around that time show reasonable pause costs:
>>
>>
>> 2018-12-18T07:44:27.513+0800: 50200.190: [GC pause (G1 Evacuation Pause) 
>> (young), 0.0241404 secs]
>> ....
>> [Times: user=0.19 sys=0.00, real=0.02 secs]
>>
>>
>> 2018-12-18T07:53:21.453+0800: 50734.129: [GC pause (G1 Evacuation Pause) 
>> (young), 0.0221342 secs]
>> ...
>> [Times: user=0.20 sys=0.00, real=0.02 secs]
>>
>>
>>
>> Regards
>> Aaron
>>
>
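
When comparing the "Possible too long JVM pause" warning with the GC log, note that the GC excerpts carry a +0800 offset, while the Ignite log lines appear to be in UTC (the thread dump line at 23:52:57 says "UTC"). A small sketch, assuming the Ignite timestamps really are UTC, that converts a log timestamp to the GC log's offset so the GC log can be searched around the matching instant. The class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class PauseTimeAlignment {
    // Timestamp layout used by the Ignite log lines quoted in this thread.
    private static final DateTimeFormatter IGNITE_LOG_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Interprets an Ignite log timestamp as UTC (assumption) and shifts it to the GC log's offset. */
    static OffsetDateTime toGcLogTime(String igniteTimestamp, ZoneOffset gcLogOffset) {
        LocalDateTime parsed = LocalDateTime.parse(igniteTimestamp, IGNITE_LOG_FORMAT);
        return parsed.atOffset(ZoneOffset.UTC).withOffsetSameInstant(gcLogOffset);
    }

    public static void main(String[] args) {
        // Timestamp of the JVM pause warning, converted to the +0800 offset of the GC log.
        System.out.println(toGcLogTime("2018-12-17 23:52:57.443", ZoneOffset.ofHours(8)));
        // prints 2018-12-18T07:52:57.443+08:00
    }
}
```

Under that assumption, the two GC entries quoted above (07:44:27 and 07:53:21 at +0800) do not fall on the instant of the pause warning, so the GC log lines nearest 07:52:57 would be the ones to check.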
