Re: Ignite in client mode: OutOfMemory error with TCP discovery
Hi,

I don't think this is directly related to the discovery message itself. Even before that you have long JVM pauses, probably caused by full GCs; it looks like you don't have enough heap on the client. What do you do there? What kind of operations do you run? I'd suggest collecting a heap dump and checking which objects consume the memory; an example of the JVM flags for that is below the quoted log.

Evgenii

> On 15 May 2019, at 19:51, mahesh76private wrote:
>
> Hi,
>
> Whenever a client loses connection with the Ignite cluster, it throws an
> OutOfMemoryError. Can you please explain this behavior?
>
> The Ignite client had 1G of heap set via -Xms and -Xmx, and very few
> clients were active. It is quite possible that Ignite clients lose
> connection with the Ignite cluster frequently.
>
> Any insights into the issue will be deeply appreciated.
>
> 15-05-2019 15:30:35.024 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 5063 milliseconds.
> 15-05-2019 15:30:38.958 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3664 milliseconds.
> 15-05-2019 15:30:42.783 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3666 milliseconds.
> 15-05-2019 15:30:46.522 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3630 milliseconds.
> 15-05-2019 15:30:50.163 [exchange-worker-#38] WARN org.apache.ignite.internal.diagnostic.warning - Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=751, minorTopVer=1], node=604f46ce-3693-404c-b180-29b2b6dcf8ed]. Consider changing TransactionConfiguration.txTimeoutOnPartitionMapSynchronization to non default value to avoid this message. Dumping pending objects that might be the cause:
> 15-05-2019 15:30:50.163 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3591 milliseconds.
> 15-05-2019 15:30:50.163 [exchange-worker-#38] WARN org.apache.ignite.internal.diagnostic.warning - Ready affinity version: AffinityTopologyVersion [topVer=751, minorTopVer=0]
> 15-05-2019 15:30:57.519 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3648 milliseconds.
> 15-05-2019 15:31:04.762 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 7242 milliseconds.
> 15-05-2019 15:31:31.284 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 22505 milliseconds.
> 15-05-2019 15:31:38.924 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 11482 milliseconds.
> 15-05-2019 15:31:47.483 [grid-timeout-worker-#23] INFO org.apache.ignite.internal.IgniteKernal.info -
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=604f46ce, uptime=05:40:44.720]
>     ^-- H/N/C [hosts=7, nodes=7, CPUs=56]
>     ^-- CPU [cur=100%, avg=0.63%, GC=30.2%]
>     ^-- PageMemory [pages=0]
>     ^-- Heap [used=2025MB, free=1.08%, comm=2048MB]
>     ^-- Off-heap [used=0MB, free=-1%, comm=0MB]
>     ^-- Outbound messages queue [size=0]
>     ^-- Public thread pool [active=0, idle=0, qSize=0]
>     ^-- System thread pool [active=1, idle=0, qSize=0]
> 15-05-2019 15:31:54.864 [tcp-client-disco-sock-writer-#2] ERROR org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.error - Failed to send message: null
> java.io.IOException: Failed to get acknowledge for message: TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=1fe38eaba61-604f46ce-3693-404c-b180-29b2b6dcf8ed, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
>     at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1398)
>     at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "TcpDiscoverySpi.timer"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ajp-nio-8009-ClientPoller-1"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Catalina-utility-1"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-nio-8080-ClientPoller-1"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "AsyncFileHandlerWriter-1259475182"
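To follow up on the heap-dump suggestion above, here is a minimal sketch of standard HotSpot flags that make the client JVM write a heap dump automatically on OutOfMemoryError and log GC activity. The dump path and the application jar name are placeholders for your environment:

    java -Xms1g -Xmx1g \
         -XX:+HeapDumpOnOutOfMemoryError \
         -XX:HeapDumpPath=/tmp/ignite-client.hprof \
         -verbose:gc -XX:+PrintGCDetails \
         -jar your-client-app.jar

You can also take a dump from an already running client with "jmap -dump:live,format=b,file=/tmp/ignite-client.hprof <pid>" and open it in a heap analyzer such as Eclipse MAT to see which objects dominate the heap.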
Ignite in client mode: OutOfMemory error with TCP discovery
Hi,

Whenever a client loses connection with the Ignite cluster, it throws an OutOfMemoryError. Can you please explain this behavior?

The Ignite client had 1G of heap set via -Xms and -Xmx, and very few clients were active. It is quite possible that Ignite clients lose connection with the Ignite cluster frequently.

Any insights into the issue will be deeply appreciated.

15-05-2019 15:30:35.024 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 5063 milliseconds.
15-05-2019 15:30:38.958 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3664 milliseconds.
15-05-2019 15:30:42.783 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3666 milliseconds.
15-05-2019 15:30:46.522 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3630 milliseconds.
15-05-2019 15:30:50.163 [exchange-worker-#38] WARN org.apache.ignite.internal.diagnostic.warning - Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=751, minorTopVer=1], node=604f46ce-3693-404c-b180-29b2b6dcf8ed]. Consider changing TransactionConfiguration.txTimeoutOnPartitionMapSynchronization to non default value to avoid this message. Dumping pending objects that might be the cause:
15-05-2019 15:30:50.163 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3591 milliseconds.
15-05-2019 15:30:50.163 [exchange-worker-#38] WARN org.apache.ignite.internal.diagnostic.warning - Ready affinity version: AffinityTopologyVersion [topVer=751, minorTopVer=0]
15-05-2019 15:30:57.519 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 3648 milliseconds.
15-05-2019 15:31:04.762 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 7242 milliseconds.
15-05-2019 15:31:31.284 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 22505 milliseconds.
15-05-2019 15:31:38.924 [jvm-pause-detector-worker] WARN org.apache.ignite.internal.IgniteKernal.warning - Possible too long JVM pause: 11482 milliseconds.
15-05-2019 15:31:47.483 [grid-timeout-worker-#23] INFO org.apache.ignite.internal.IgniteKernal.info -
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=604f46ce, uptime=05:40:44.720]
    ^-- H/N/C [hosts=7, nodes=7, CPUs=56]
    ^-- CPU [cur=100%, avg=0.63%, GC=30.2%]
    ^-- PageMemory [pages=0]
    ^-- Heap [used=2025MB, free=1.08%, comm=2048MB]
    ^-- Off-heap [used=0MB, free=-1%, comm=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=1, idle=0, qSize=0]
15-05-2019 15:31:54.864 [tcp-client-disco-sock-writer-#2] ERROR org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.error - Failed to send message: null
java.io.IOException: Failed to get acknowledge for message: TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=1fe38eaba61-604f46ce-3693-404c-b180-29b2b6dcf8ed, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
    at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1398)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "TcpDiscoverySpi.timer"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ajp-nio-8009-ClientPoller-1"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Catalina-utility-1"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-nio-8080-ClientPoller-1"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "AsyncFileHandlerWriter-1259475182"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "jvm-pause-detector-worker"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "grid-nio-worker-tcp-comm-2-#26"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ajp-nio-8009-ClientPoller-0"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "grid-nio-worker-client-listener-1-#30"
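For context, a client like the one described above is typically started along these lines. This is a minimal sketch, not the poster's actual code; the discovery address is a placeholder:

    import java.util.Collections;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    public class ClientStart {
        public static void main(String[] args) {
            // Static IP finder pointing at the server nodes (placeholder address).
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
            ipFinder.setAddresses(Collections.singletonList("server-host:47500..47509"));

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setClientMode(true) // join the topology as a client, not a server node
                .setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(ipFinder));

            // The JVM itself is started with -Xms1g -Xmx1g, as described above.
            try (Ignite ignite = Ignition.start(cfg)) {
                // ... cache operations ...
            }
        }
    }

A client node holds no data but keeps a discovery connection to the servers; that connection is what fails in the TcpDiscoverySpi error above once the heap is exhausted.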