[ https://issues.apache.org/jira/browse/SPARK-13288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145148#comment-15145148 ]
JESSE CHEN commented on SPARK-13288:
------------------------------------

Maybe "heap exhaustion" is a better term for this. jstat showed the following for 1.6 running the same streaming code:

jstat -gcutil -h20 220288 2000
  S0     S1     E      O      M     CCS    YGC    YGCT   FGC    FGCT     GCT
  0.00   0.00  93.10  99.94  97.55  95.27   523   9.454   675  243.502  252.956
  0.00   0.00  74.31  99.84  97.55  95.27   523   9.454   676  243.859  253.313
  0.00   0.00  48.99  99.84  97.55  95.27   523   9.454   677  244.260  253.714
  0.00   0.00  88.42  99.84  97.55  95.27   523   9.454   677  244.260  253.714
  0.00   0.00  25.62  99.84  97.55  95.27   523   9.454   678  244.520  253.974
  0.00   0.00  57.90  99.84  97.55  95.27   523   9.454   678  244.520  253.974
  0.00   0.00 100.00  99.84  97.55  95.27   523   9.454   679  244.520  253.974
  0.00   0.00  90.96  99.75  97.55  95.27   523   9.454   679  244.877  254.331
  0.00   0.00  63.42  99.75  97.55  95.27   523   9.454   680  245.252  254.706
  0.00   0.00  24.97  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  24.97  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  24.97  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  24.98  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  24.98  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  61.92  99.76  97.55  95.27   523   9.454   681  245.620  255.074
  0.00   0.00  26.54  99.91  97.55  95.27   523   9.454   682  246.149  255.603

Here the old generation ("O") is full and total GC time is high. It is fair to say this JVM is hosed at this point.

The heap dumps I took were from when this was happening. Can you take a look at the objects referencing the byte arrays ([B)? They look very different from 1.5. I need a clue to go further here.

Thanks.

> [1.6.0] Memory leak in Spark streaming
> --------------------------------------
>
>                 Key: SPARK-13288
>                 URL: https://issues.apache.org/jira/browse/SPARK-13288
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.6.0
>         Environment: Bare metal cluster
> RHEL 6.6
>            Reporter: JESSE CHEN
>              Labels: streaming
>
> Streaming in 1.6 seems to have a memory leak.
> Running the same streaming app in Spark 1.5.1 and 1.6, all else being equal, 1.6
> showed gradually increasing processing time.
> The app is simple: 1 Kafka receiver of a tweet stream and 20 executors
> processing the tweets in 5-second batches.
> Spark 1.5.0 handles this smoothly and did not show increasing processing time
> in the 40-minute test, but 1.6 showed increasing time about 8 minutes into
> the test. Please see the chart here:
> https://ibm.box.com/s/7q4ulik70iwtvyfhoj1dcl4nc469b116
> I captured heap dumps in the two versions and compared them. I noticed that
> byte arrays (class [B) use about 50X more space in 1.6.0.
> Here are some top classes in the heap histogram and references.
>
> Heap Histogram
>
> All Classes (excluding platform)
>
> 1.6.0 Streaming
> Class                            Instance Count   Total Size
> class [B                         8453             3,227,649,599
> class [C                         44682            4,255,502
> class java.lang.reflect.Method   9059             1,177,670
>
> 1.5.1 Streaming
> Class                            Instance Count   Total Size
> class [B                         5095             62,938,466
> class [C                         130482           12,844,182
> class java.lang.String           130171           1,562,052
>
> References by Type
>
> 1.6.0: class [B [0x640039e38]
> 1.5.1: class [B [0x6c020bb08]
>
> Referrers by Type
>
> 1.6.0 Streaming
> Class                                Count
> java.nio.HeapByteBuffer              3239
> sun.security.util.DerInputBuffer     1233
> sun.security.util.ObjectIdentifier   620
> [Ljava.lang.Object;                  408
>
> 1.5.1 Streaming
> Class                                Count
> sun.security.util.DerInputBuffer     1233
> sun.security.util.ObjectIdentifier   620
> [[B                                  397
> java.lang.reflect.Method             326
> ----
> The total size of class [B is about 3 GB in 1.6.0 and only about 60 MB in 1.5.1.
> The java.nio.HeapByteBuffer referencing class did not show up among the top
> referrers in 1.5.1.
> I have also placed jstack output for 1.5.1 and 1.6.0 online. You can get them
> here:
> https://ibm.box.com/sparkstreaming-jstack160
> https://ibm.box.com/sparkstreaming-jstack151
> Jesse

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
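
For readers who want to quantify the jstat samples in the comment above, a short Python sketch follows. It is not part of the original report: the helper names (parse_rows, summarize) are hypothetical, the rows are copied from the jstat output as posted, and the 2-second spacing is taken from the 2000 ms interval on the jstat command line.

```python
# Sketch only: summarize the jstat -gcutil samples quoted in the comment.
# Column names follow the header printed in that output.
COLUMNS = ["S0", "S1", "E", "O", "M", "CCS", "YGC", "YGCT", "FGC", "FGCT", "GCT"]

INTERVAL_S = 2.0  # assumption: rows are 2000 ms apart, per the jstat arguments

SAMPLES = """\
0.00 0.00 93.10 99.94 97.55 95.27 523 9.454 675 243.502 252.956
0.00 0.00 74.31 99.84 97.55 95.27 523 9.454 676 243.859 253.313
0.00 0.00 48.99 99.84 97.55 95.27 523 9.454 677 244.260 253.714
0.00 0.00 88.42 99.84 97.55 95.27 523 9.454 677 244.260 253.714
0.00 0.00 25.62 99.84 97.55 95.27 523 9.454 678 244.520 253.974
0.00 0.00 57.90 99.84 97.55 95.27 523 9.454 678 244.520 253.974
0.00 0.00 100.00 99.84 97.55 95.27 523 9.454 679 244.520 253.974
0.00 0.00 90.96 99.75 97.55 95.27 523 9.454 679 244.877 254.331
0.00 0.00 63.42 99.75 97.55 95.27 523 9.454 680 245.252 254.706
0.00 0.00 24.97 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 24.97 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 24.97 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 24.98 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 24.98 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 61.92 99.76 97.55 95.27 523 9.454 681 245.620 255.074
0.00 0.00 26.54 99.91 97.55 95.27 523 9.454 682 246.149 255.603
"""


def parse_rows(text):
    """Turn whitespace-separated jstat rows into dicts keyed by column name."""
    rows = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != len(COLUMNS):
            raise ValueError("unexpected column count: %r" % line)
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def summarize(rows, interval_s=INTERVAL_S):
    """Compare the first and last sample (hypothetical helper, not a jstat feature)."""
    first, last = rows[0], rows[-1]
    return {
        "window_s": (len(rows) - 1) * interval_s,
        "young_gcs": int(last["YGC"] - first["YGC"]),
        "full_gcs": int(last["FGC"] - first["FGC"]),
        "full_gc_s": round(last["FGCT"] - first["FGCT"], 3),
        "min_old_pct": min(r["O"] for r in rows),
    }


summary = summarize(parse_rows(SAMPLES))
print(summary)
```

Over the 16 posted samples this reports 7 full collections, no young collections, and an old generation that never drops below 99.75% occupancy, which matches the "heap exhaustion" reading: full GCs run repeatedly but reclaim almost nothing from the old generation.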