[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774351#comment-16774351 ]
Pavel Vinokurov commented on IGNITE-11378:
------------------------------------------

The reproducer was updated

> Critical system errors on cluster with enabled peristance
> ---------------------------------------------------------
>
>                 Key: IGNITE-11378
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11378
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>    Affects Versions: 2.7
>            Reporter: Pavel Vinokurov
>            Priority: Major
>         Attachments: CheckpointLockReproducer.java
>
>
> The attached reproducer shows the following exception during streaming data
> to cache:
> [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976
> 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will
> be handled accordingly to configured handler
> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5,
> igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]]
> class org.apache.ignite.IgniteException: GridWorker
> [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false,
> heartbeatTs=1550754912905]
> If the blocked timeout is changed by
> cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and
> restarting several nodes the following critical error occurs:
> [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked
> system-critical thread has been detected. This can lead to cluster-wide
> undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s]
> [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread
> [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0,
> waitCnt=729]
> [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root]
> Critical system error detected. Will be handled accordingly to configured
> handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]],
> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
> igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]]
> class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
> igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
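
When reading the two SYSTEM_WORKER_BLOCKED reports above, it can help to work out how long each worker had gone without a heartbeat when the error was logged. The sketch below does that arithmetic with the JDK only. It is not part of the attached reproducer: the class name HeartbeatGap and the method gapMillis are hypothetical, and the log timestamps are assumed to be local time at UTC+3, which the report does not state.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Ignite or of CheckpointLockReproducer.java.
public class HeartbeatGap {
    // Timestamp format at the start of each log line, e.g. "2019-02-21 16:15:23,202".
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    /** Milliseconds between a worker's last heartbeat and the time the log line was written. */
    public static long gapMillis(String logTimestamp, ZoneOffset logZone, long heartbeatTs) {
        // The log line carries no zone, so the caller has to supply one.
        Instant logged = LocalDateTime.parse(logTimestamp, LOG_TS).toInstant(logZone);
        // heartbeatTs in the exception text is taken to be epoch milliseconds.
        return Duration.between(Instant.ofEpochMilli(heartbeatTs), logged).toMillis();
    }

    public static void main(String[] args) {
        // ASSUMPTION: the node's log clock is UTC+3; the report does not say.
        ZoneOffset zone = ZoneOffset.ofHours(3);

        // data-streamer-stripe-5, first error in the report.
        System.out.println(gapMillis("2019-02-21 16:15:23,202", zone, 1550754912905L));

        // tcp-comm-worker, error seen after setSystemWorkerBlockedTimeout(30_000).
        System.out.println(gapMillis("2019-02-21 16:24:07,168", zone, 1550755410331L));
    }
}
```

Under the UTC+3 assumption the gaps come out at roughly 10.3 s for data-streamer-stripe-5 and 36.8 s for tcp-comm-worker. The second figure is in line with the blockedFor=36s value in the log and exceeds the 30 s timeout set in the report.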