[ https://issues.apache.org/jira/browse/SPARK-35610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Zsolt Piros updated SPARK-35610:
---------------------------------------
    Description: 
I have identified this leak by running the Livy tests (I know Livy is close to the attic, but this leak causes a constant OOM there) and it shows up in our Spark unit tests as well.

The leak can be identified by checking the number of LeakyEntry instances in the case of Scala 2.12.14 (ZipEntry for Scala 2.12.10), which can take up a considerable amount of memory, as those entries are created from the jars on the classpath.

I have my own tool for instrumenting JVM code ([trace-agent|https://github.com/attilapiros/trace-agent]), with which I can run JVM diagnostic commands at specific methods. It is driven by a single text file embedded in the tool's jar, called actions.txt.

In this case the actions.txt content is:
{noformat}
$ unzip -q -c trace-agent-0.0.7.jar actions.txt
diagnostic_command org.apache.spark.repl.ReplSuite runInterpreter cmd:gcClassHistogram,limit_output_lines:8,where:beforeAndAfter,with_gc:true
diagnostic_command org.apache.spark.repl.ReplSuite afterAll cmd:gcClassHistogram,limit_output_lines:8,where:after,with_gc:true
{noformat}

This creates a class histogram at the beginning and at the end of org.apache.spark.repl.ReplSuite#runInterpreter() (after triggering a GC, which might not finish in time, as the GC runs in a separate thread) and one more histogram at the end of the afterAll() method.
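For reference, the same counts can also be checked by hand, without the agent: gcClassHistogram presumably maps to the JVM's built-in GC.class_histogram diagnostic command, so (as a sketch, assuming the forked test JVM's pid is looked up first, e.g. with jps) something like this should show the same growth:
{noformat}
$ jps -l                                                  # find the pid of the forked test JVM
$ jcmd <pid> GC.class_histogram | grep "LeakyEntry\|ZipEntry"
{noformat}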
The histograms on the master branch are the following:
{noformat}
$ ./build/sbt ";project repl;set Test/javaOptions += \"-javaagent:/Users/attilazsoltpiros/git/attilapiros/memoryLeak/trace-agent-0.0.7.jar\"; testOnly" |grep "ZipEntry\|LeakyEntry"
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   394178   18920544   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   394178   18920544   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   591267   28380816   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   591267   28380816   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   788356   37841088   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   788356   37841088   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   985445   47301360   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   985445   47301360   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   1182534   56761632   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   1182534   56761632   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   1379623   66221904   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   1379623   66221904   scala.reflect.io.FileZipArchive$LeakyEntry
   3:   1576712   75682176   scala.reflect.io.FileZipArchive$LeakyEntry
{noformat}

Where the header of the table is:
{noformat}
 num     #instances         #bytes  class name
{noformat}

So the LeakyEntry instances at the end amount to about 75MB (173MB in the case of Scala 2.12.10 and earlier, where the corresponding class is called ZipEntry), but the first item (char/byte arrays) and the second item (strings) in the histogram are also related to this leak:
{noformat}
$ ./build/sbt ";project repl;set Test/javaOptions += \"-javaagent:/Users/attilazsoltpiros/git/attilapiros/memoryLeak/trace-agent-0.0.7.jar\"; testOnly" |grep "1:\|2:\|3:"
   1:   2701   3496112   [B
   2:   21855   2607192   [C
   3:   4885   537264   java.lang.Class
   1:   480323   55970208   [C
   2:   480499   11531976   java.lang.String
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   481825   56148024   [C
   2:   481998   11567952   java.lang.String
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   487056   57550344   [C
   2:   487179   11692296   java.lang.String
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   487054   57551008   [C
   2:   487176   11692224   java.lang.String
   3:   197089   9460272   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   927823   107139160   [C
   2:   928072   22273728   java.lang.String
   3:   394178   18920544   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   927793   107129328   [C
   2:   928041   22272984   java.lang.String
   3:   394178   18920544   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   1361851   155555608   [C
   2:   1362261   32694264   java.lang.String
   3:   591267   28380816   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   1361683   155493464   [C
   2:   1362092   32690208   java.lang.String
   3:   591267   28380816   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   1803074   205157728   [C
   2:   1803268   43278432   java.lang.String
   3:   788356   37841088   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   1802385   204938224   [C
   2:   1802579   43261896   java.lang.String
   3:   788356   37841088   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   2236631   253636592   [C
   2:   2237029   53688696   java.lang.String
   3:   985445   47301360   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   2236536   253603008   [C
   2:   2236933   53686392   java.lang.String
   3:   985445   47301360   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   2668892   301893920   [C
   2:   2669510   64068240   java.lang.String
   3:   1182534   56761632   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   2668759   301846376   [C
   2:   2669376   64065024   java.lang.String
   3:   1182534   56761632   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   3101238   350101048   [C
   2:   3102073   74449752   java.lang.String
   3:   1379623   66221904   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   3101240   350101104   [C
   2:   3102075   74449800   java.lang.String
   3:   1379623   66221904   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   3533785   398371760   [C
   2:   3534835   84836040   java.lang.String
   3:   1576712   75682176   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   3533759   398367088   [C
   2:   3534807   84835368   java.lang.String
   3:   1576712   75682176   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   3967049   446893400   [C
   2:   3968314   95239536   java.lang.String
   3:   1773801   85142448   scala.reflect.io.FileZipArchive$LeakyEntry
[info] - SPARK-26633: ExecutorClassLoader.getResourceAsStream find REPL classes (8 seconds, 248 milliseconds)
Setting default log level to "ERROR".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   1:   3966423   446709584   [C
   2:   3967682   95224368   java.lang.String
   3:   1773801   85142448   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   4399583   495097208   [C
   2:   4401050   105625200   java.lang.String
   3:   1970890   94602720   scala.reflect.io.FileZipArchive$LeakyEntry
   1:   4399578   495070064   [C
   2:   4401040   105624960   java.lang.String
   3:   1970890   94602720   scala.reflect.io.FileZipArchive$LeakyEntry
{noformat}

The last three rows (the char arrays, strings and LeakyEntry instances of the final histogram) add up to about 700MB.
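Since LeakyEntry is an inner class of scala.reflect.io.FileZipArchive, my working assumption is that every interpreter/compiler instance created by the suite opens the classpath jars as FileZipArchive objects which are then never released. The following is only a minimal, self-contained sketch of that pattern (not the actual Spark code path; LeakSketch, openClasspathJars and the retained buffer are made up for illustration, and it needs scala-reflect on the classpath):
{code:scala}
import java.io.File
import scala.collection.mutable.ArrayBuffer
import scala.reflect.io.FileZipArchive

object LeakSketch {
  // Stands in for whatever keeps the per-interpreter state (and with it the archives) reachable.
  private val retained = ArrayBuffer.empty[FileZipArchive]

  // Mimics one interpreter start-up: open every classpath jar as a FileZipArchive and keep it alive.
  def openClasspathJars(): Unit = {
    val jars = sys.props("java.class.path").split(File.pathSeparator).filter(_.endsWith(".jar"))
    jars.foreach { path =>
      val archive = new FileZipArchive(new File(path))
      // Touching the iterator forces the archive to read the jar and cache its entries;
      // those cached objects are the FileZipArchive$LeakyEntry instances seen in the histograms.
      archive.iterator.foreach(_ => ())
      retained += archive
    }
  }

  def main(args: Array[String]): Unit = {
    // Each round plays the role of one ReplSuite#runInterpreter() call: nothing is released,
    // so the number of cached entries only ever grows.
    (1 to 5).foreach { round =>
      openClasspathJars()
      println(s"after round $round: retaining ${retained.size} jar archives")
    }
  }
}
{code}
A class histogram taken after each round should show FileZipArchive$LeakyEntry (ZipEntry on 2.12.10) growing by a roughly constant step, matching the per-test growth of about 197089 instances reported above.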

> Memory leak in Spark interpreter
> ---------------------------------
>
>                 Key: SPARK-35610
>                 URL: https://issues.apache.org/jira/browse/SPARK-35610
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, Tests
>    Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0
>            Reporter: Attila Zsolt Piros
>            Assignee: Attila Zsolt Piros
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org