Ping…
From: 张帅 [mailto:satan.stud...@gmail.com] On Behalf Of zhangshuai.u...@gmail.com
Sent: Wednesday, April 12, 2017 5:29 PM
To: user@ignite.apache.org
Subject: OOM when using Ignite as HDFS Cache

Hi there,

I'd like to use Ignite as an HDFS cache in my cluster, but it fails with an OOM error. Could you review my configuration and help me avoid it?

I'm using DUAL_ASYNC mode. The Ignite nodes can find each other and establish the cluster. There are very few changes in default-config.xml; it is attached for your review. The JVM heap size is limited to 1 GB.

Ignite hits the OOM exception while I run the Hadoop benchmark TestDFSIO, writing 4 x 4 GB files. Writing a 4 GB file to HDFS is streamed, so Ignite should be able to handle it. It is acceptable for writes to slow down while Ignite flushes cached data through to HDFS, but it is not acceptable for this to cause a crash or data loss.

The Ignite log is attached as ignite_log.zip; some key messages:

    17/04/12 00:49:17 INFO [grid-timeout-worker-#19%null%] internal.IgniteKernal: Metrics for local node (to disable set 'metricsLogFrequency' to 0)
        ^-- Node [id=9b5dcc35, name=null, uptime=00:26:00:254]
        ^-- H/N/C [hosts=173, nodes=173, CPUs=2276]
        ^-- CPU [cur=0.13%, avg=0.82%, GC=0%]
        ^-- Heap [used=555MB, free=43.3%, comm=979MB]
        ^-- Non heap [used=61MB, free=95.95%, comm=62MB]
        ^-- Public thread pool [active=0, idle=0, qSize=0]
        ^-- System thread pool [active=0, idle=6, qSize=0]
        ^-- Outbound messages queue [size=0]
    17/04/12 00:50:06 INFO [disco-event-worker-#35%null%] discovery.GridDiscoveryManager: Added new node to topology: TcpDiscoveryNode [id=553b5c1a-da0b-43cb-b691-b842352b3105, addrs=[0:0:0:0:0:0:0:1, 10.152.133.46, 10.55.68.223, 127.0.0.1, 192.168.1.1], sockAddrs=[BN1APS0A98852E/10.152.133.46:47500, bn1sch010095221.phx.gbl/10.55.68.223:47500, /0:0:0:0:0:0:0:1:47500, /192.168.1.1:47500, /127.0.0.1:47500], discPort=47500, order=176, intOrder=175, lastExchangeTime=1491983403106, loc=false, ver=2.0.0#20170405-sha1:2c830b0d, isClient=false]
    [00:50:06] Topology snapshot [ver=176, servers=174, clients=0, CPUs=2288, heap=180.0GB]
    ...
    Exception in thread "igfs-client-worker-2-#585%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
        at java.lang.StringBuffer.append(StringBuffer.java:270)
        at java.io.StringWriter.write(StringWriter.java:112)
        at java.io.PrintWriter.write(PrintWriter.java:456)
        at java.io.PrintWriter.write(PrintWriter.java:473)
        at java.io.PrintWriter.print(PrintWriter.java:603)
        at java.io.PrintWriter.println(PrintWriter.java:756)
        at java.lang.Throwable$WrappedPrintWriter.println(Throwable.java:764)
        at java.lang.Throwable.printStackTrace(Throwable.java:658)
        at java.lang.Throwable.printStackTrace(Throwable.java:721)
        at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:60)
        at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
        at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
        at org.apache.log4j.AsyncAppender.append(AsyncAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.error(Category.java:322)
        at org.apache.ignite.logger.log4j.Log4JLogger.error(Log4JLogger.java:495)
        at org.apache.ignite.internal.GridLoggerProxy.error(GridLoggerProxy.java:148)
        at org.apache.ignite.internal.util.IgniteUtils.error(IgniteUtils.java:4281)
        at org.apache.ignite.internal.util.IgniteUtils.error(IgniteUtils.java:4306)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:126)
        at java.lang.Thread.run(Thread.java:745)
    Exception in thread "LeaseRenewer:had...@namenode-vip.yarn3-dev-bn2.bn2.ap.gbl" java.lang.OutOfMemoryError: GC overhead limit exceeded
    Exception in thread "igfs-delete-worker%igfs%9b5dcc35-3a4c-4a90-ac9e-89fdd65302a7%" java.lang.OutOfMemoryError: GC overhead limit exceeded
    Exception in thread "exchange-worker-#39%null%" java.lang.OutOfMemoryError: GC overhead limit exceeded
    ...
    17/04/12 01:40:10 WARN [disco-event-worker-#35%null%] discovery.GridDiscoveryManager: Stopping local node according to configured segmentation policy.

Looking forward to your help.

Regards,
Shuai Zhang
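[Note: the attached default-config.xml is not reproduced in this thread. For readers following along, a minimal Ignite 2.0 Spring XML configuration for IGFS in the DUAL_ASYNC mode described above might look roughly like the sketch below; the IGFS name and the HDFS URI are placeholders, not values taken from the attachment.]

```xml
<!-- Hypothetical sketch only: a minimal IGFS setup with DUAL_ASYNC
     (write-behind) caching in front of HDFS. The igfs name and the
     hdfs:// URI are illustrative placeholders. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="fileSystemConfiguration">
    <list>
      <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
        <!-- Name of the IGFS instance -->
        <property name="name" value="igfs"/>
        <!-- DUAL_ASYNC: writes go to cache first, then asynchronously
             to the secondary file system -->
        <property name="defaultMode" value="DUAL_ASYNC"/>
        <!-- HDFS as the secondary (persistent) file system -->
        <property name="secondaryFileSystem">
          <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
            <constructor-arg value="hdfs://namenode:9000/"/>
          </bean>
        </property>
      </bean>
    </list>
  </property>
</bean>
```

With DUAL_ASYNC, data queued for write-behind accumulates on the Ignite heap until it is flushed to HDFS, which is consistent with a 1 GB heap being exhausted by a 4 x 4 GB TestDFSIO write.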