Hi Stephen/Team,

Did you get a chance to look into this?

Thanks and Regards,
Kamlesh Joshi

From: Kamlesh Joshi
Sent: 06 July 2020 14:50
To: user@ignite.apache.org
Subject: RE: [External]Re: Ignite cluster became unresponsive

Hi Stephen,

We started our node with the JVM parameters below. We have also increased the
failureDetectionTimeout, clientFailureDetectionTimeout and networkTimeout
settings to 480000 ms.

-XX:+AggressiveOpts -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC 
-XX:+DisableExplicitGC -XX:+UnlockCommercialFeatures 
-Djava.net.preferIPv4Stack=true -DIGNITE_LONG_OPERATIONS_DUMP_TIMEOUT=600000 

Is there anything else that we should tune?

Also, I think the JVM pause was a result of the error that we encountered,
right? Correct me if I am wrong.

Thanks and Regards,
Kamlesh Joshi

From: Stephen Darlington 
Sent: 06 July 2020 14:09
To: user <user@ignite.apache.org>
Subject: [External]Re: Ignite cluster became unresponsive

There are a few issues here (the blocked thread, the communication error), but
possibly the key one is the JVM pause:

][jvm-pause-detector-worker][IgniteKernal%CustomerCC] Possible too long JVM 
pause: 10133 milliseconds.
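
For context, a pause detector of this kind typically works by having a thread
sleep for a short, fixed interval and then compare the requested sleep time
with the time that actually elapsed. A large excess suggests the whole JVM was
stalled. The sketch below only illustrates that idea; it is not Ignite's
implementation, and the class name, the 50 ms interval and the 500 ms
threshold are assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Illustrative pause detector sketch; NOT Ignite's actual implementation. */
final class PauseDetectorSketch {
    /** Returns how much longer than requested a sleep took (never negative). */
    static long excessMillis(long requestedMillis, long elapsedMillis) {
        return Math.max(0L, elapsedMillis - requestedMillis);
    }

    /** Sleeps once and reports the excess over the requested sleep time. */
    static long measureOnce(long sleepMillis) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(sleepMillis);
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return excessMillis(sleepMillis, elapsed);
    }

    public static void main(String[] args) throws InterruptedException {
        long sleepMillis = 50;      // hypothetical sampling interval
        long thresholdMillis = 500; // hypothetical reporting threshold
        for (int i = 0; i < 20; i++) {
            long pause = measureOnce(sleepMillis);
            if (pause > thresholdMillis)
                System.out.println("Possible long pause: " + pause + " ms");
        }
    }
}
```

Because the detector thread is stalled along with everything else, it cannot
tell what caused the pause, only that one happened.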

This is usually due to garbage collection, but there are a number of other 
possibilities such as slow I/O. Suggest you start with the recommendations on 
the GC tuning documentation page: 
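
Before tuning, it may help to confirm how much time the JVM actually spends in
garbage collection. A minimal sketch using the standard
java.lang.management API is shown here; the class name is hypothetical, and
the numbers are cumulative totals since JVM start, so they show overall GC
cost rather than the length of any single pause.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Prints cumulative GC counts and times for the current JVM. */
final class GcStatsSketch {
    /** Sums accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0)
                total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("Total GC time (ms): " + totalGcMillis());
    }
}
```

To attribute an individual pause such as the 10133 ms one to GC, the GC log
for the same timestamp would still be needed.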


On 4 Jul 2020, at 12:44, Kamlesh Joshi <kamlesh.jo...@ril.com> wrote:

Hi Team,

We encountered the following error in our PROD environment, after which all
traffic was halted for around 10 minutes. We recently upgraded our cluster from
Ignite 2.6.0 to 2.7.6.
Is this related to any existing open defect in this version? Has anyone
observed the same defect before?

Any help or pointers around this will be appreciated.

[2020-07-03T18:17:11,613][ERROR][sys-stripe-36-#37%CustomerCC%][G] Blocked 
system-critical thread has been detected. This can lead to cluster-wide 
undefined behaviour
[threadName=partition-exchanger, blockedFor=480s]
[2020-07-03T18:17:11,613][WARN ][sys-stripe-36-#37%CustomerCC%][G] Thread 
[name="exchange-worker-#344%CustomerCC%", id=391, state=TIMED_WAITING, 
blockCnt=1, waitCnt=2049782]
 ownerName=null, ownerId=-1]

[2020-07-03T18:17:11,620][ERROR][sys-stripe-36-#37%CustomerCC%][] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, 
[type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker 
[name=partition-exchanger, igniteInstanceName=CustomerCC, finished=false, 
org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, 
igniteInstanceName=CustomerCC, finished=false, heartbeatTs=1593780431612]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
][sys-stripe-36-#37%CustomerCC%][FailureProcessor] No deadlocked threads 
][tcp-disco-sock-reader-#201%CustomerCC%][TcpDiscoverySpi] Finished serving 
remote node connection [rmtAddr=/xx.xx.xx.xx:46416, rmtPort=46416
][jvm-pause-detector-worker][IgniteKernal%CustomerCC] Possible too long JVM 
pause: 10133 milliseconds.
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:11764, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:38500, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:41442, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:44178, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:11884, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:39044, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:48756, 
Communication SPI session write timed out (consider increasing 
'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:42190, 
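
To see how often such pauses occur, the pause lines can be extracted from a
node's log file. The following is a rough sketch that assumes the message
format shown in the excerpt above ("Possible too long JVM pause: N
milliseconds"); the class name is hypothetical and the log path is passed as
an argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Extracts JVM pause durations from log lines (format assumed from the excerpt). */
final class PauseLogScanner {
    private static final Pattern PAUSE =
        Pattern.compile("Possible too long JVM\\s+pause:\\s+(\\d+)\\s+milliseconds");

    /** Returns the pause length in ms if the line is a pause warning. */
    static OptionalLong parsePauseMillis(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: PauseLogScanner <path-to-log-file>");
            return;
        }
        // Print every pause warning together with its duration.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            lines.forEach(line -> parsePauseMillis(line)
                .ifPresent(ms -> System.out.println(ms + " ms: " + line)));
        }
    }
}
```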

Thanks and Regards,
Kamlesh Joshi

"Confidentiality Warning: This message and any attachments are intended only 
for the use of the intended recipient(s), are confidential and may be 
privileged. If you are not the intended recipient, you are hereby notified that 
any review, re-transmission, conversion to hard copy, copying, circulation or 
other use of this message and any attachments is strictly prohibited. If you 
are not the intended recipient, please notify the sender immediately by return 
email and delete this message and any attachments from your system.
Virus Warning: Although the company has taken reasonable precautions to ensure 
no viruses are present in this email. The company cannot accept responsibility 
for any loss or damage arising from the use of this email or attachment."

