Hello!
Oct 12 17:49:41 nalrcsvridbq02 kernel: watchdog: BUG: soft lockup - CPU#0
stuck for 38s! [kworker/u256:0:3703404]
This is bad. Your system kernel says your CPU#0 was hanging for 38 seconds.
This is enough to trigger failure detection timeout and kill and instance,
or at least for the
Oct 12 17:47:39 nalrcsvridbq02 Ignite[2031634]: [17:47:39] Possible failure
suppressed accordingly to a configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED,