[jira] [Comment Edited] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
[ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477546#comment-17477546 ] Stanislav Savulchik edited comment on SPARK-27337 at 1/24/22, 2:59 PM: ---

[~vinooganesh] thanks for the response. I haven't tried 3.2.0 yet, but the linked PR looks promising! I did try increasing the shared queue size (spark.scheduler.listenerbus.eventqueue.shared.capacity = 8) to keep up with the incoming event rate and resolve the dropped-events issue. I still observe the effect on metrics after the change, but it seemed at first that the memory leak was gone; I still had to verify that by taking another heap dump.

UPDATE – getting rid of the dropped events in fact does not solve the memory leak.

was (Author: savulchik): [~vinooganesh] thanks for the response. I didn't try the 3.2.0 yet but the linked PR looks promising! Although I tried to increase the shared queue size (spark.scheduler.listenerbus.eventqueue.shared.capacity = 8) in order to keep up with the incoming events rate and resolve the dropped events issue. I still observing the effect on metrics after the change but it seems that the memory leak is gone. Though I still have to verify it but taking another heap dump.

> QueryExecutionListener never cleans up listeners from the bus after
> SparkSession is cleared
> ---
>
> Key: SPARK-27337
> URL: https://issues.apache.org/jira/browse/SPARK-27337
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Vinoo Ganesh
> Priority: Major
> Attachments: Screenshot 2022-01-16 at 23.16.10.png, image001-1.png
>
>
> As a result of
> https://github.com/apache/spark/commit/9690eba16efe6d25261934d8b73a221972b684f3,
> it looks like there is a memory leak (specifically
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala#L131).
>
> Because the Listener Bus on the context still has a reference to the listener
> (even after the SparkSession is cleared), it is never cleaned up. This
> means that if you close and remake Spark sessions fairly frequently, you're
> leaking every single time.

--
This message was sent by Atlassian Jira (v8.20.1#820001)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
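The queue-capacity tuning mentioned in the comment can be sketched as follows. This is a minimal illustration, not code from the thread: the two `spark.scheduler.listenerbus.eventqueue.*` keys are real Spark configs, but the app name and capacity value below are placeholders, and the keys are only read once when the SparkContext starts, so they cannot be changed on a live session.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: sizing the shared listener-bus event queue at startup.
// The capacity value is a placeholder, not a recommendation.
val spark = SparkSession.builder()
  .appName("queue-capacity-sketch") // placeholder name
  .master("local[*]")
  // per-queue override for the "shared" queue; falls back to the global
  // spark.scheduler.listenerbus.eventqueue.capacity when unset
  .config("spark.scheduler.listenerbus.eventqueue.shared.capacity", "20000")
  .getOrCreate()
```

A larger queue only reduces dropped events under bursty load; as the UPDATE in the comment notes, it does not by itself fix the listener leak.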
[jira] [Commented] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
[ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17477546#comment-17477546 ] Stanislav Savulchik commented on SPARK-27337: -

[~vinooganesh] thanks for the response. I haven't tried 3.2.0 yet, but the linked PR looks promising! I did try increasing the shared queue size (spark.scheduler.listenerbus.eventqueue.shared.capacity = 8) to keep up with the incoming event rate and resolve the dropped-events issue. I still observe the effect on metrics after the change, but it seems the memory leak is gone. I still have to verify this by taking another heap dump.
[jira] [Commented] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
[ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476829#comment-17476829 ] Stanislav Savulchik commented on SPARK-27337: -

Hi, I found this ticket while investigating an apparent memory leak in a long-running Spark 3.1.1 driver Java process that executes various jobs posted by an external scheduler. I took a heap dump (jmap -dump:live,file=dump.hprof ) during an idle period when there were no running jobs and opened it with Eclipse Memory Analyzer. I saw a picture similar to the one posted by [~vinooganesh]. [^Screenshot 2022-01-16 at 23.16.10.png]

Every posted job is given a fresh SparkSession instance via the SparkSession#newSession method. After a job is done, its SparkSession instance is no longer referenced and is expected to be garbage collected together with all of its accumulated session state. Apparently, in some cases old SparkSessions are still referenced from an AsyncEventQueue even after manual or scheduled System.gc() calls by the Spark context cleaner – more specifically, from ExecutionListenerBus instances still residing in the listeners queue.

I tried to correlate this with Spark driver metrics, and my current guess is that the stuck ExecutionListenerBus instances are caused by dropped events on the _shared_ queue. I would appreciate it if anyone could verify my reasoning. Thank you.
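The session-per-job pattern described in the comment can be sketched roughly as below. This is a hypothetical illustration of the usage that exhibits the leak, not the reporter's actual code: SparkSession.newSession, listenerManager.register, and the QueryExecutionListener callbacks are real Spark 3.x APIs, but the job logic is a placeholder.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

// Each newSession() gets its own ExecutionListenerBus, which is registered
// on the shared, context-wide listener bus. If that registration is never
// removed, the session and its state remain reachable from the bus and
// cannot be garbage collected, matching what the heap dump showed.
val root = SparkSession.builder().master("local[*]").getOrCreate()

def runJob(id: Int): Unit = {
  val session = root.newSession() // fresh session per posted job
  session.listenerManager.register(new QueryExecutionListener {
    def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = ()
    def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
  })
  session.range(10).count() // placeholder job logic
  // `session` goes out of scope here and is *expected* to be collected
}
```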
[jira] [Updated] (SPARK-27337) QueryExecutionListener never cleans up listeners from the bus after SparkSession is cleared
[ https://issues.apache.org/jira/browse/SPARK-27337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Savulchik updated SPARK-27337: Attachment: Screenshot 2022-01-16 at 23.16.10.png
[jira] [Commented] (SPARK-34651) Improve ZSTD support
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306782#comment-17306782 ] Stanislav Savulchik commented on SPARK-34651: -

[~dongjoon]

bq. Well, it's a different codec technically. Apache Spark has its own ZStandardCodec class instead of Apache Hadoop ZStandardCodec.

I believe you are referring to [org.apache.spark.io.ZStdCompressionCodec|https://github.com/apache/spark/blob/1f6089b165181472e581ed0466694d28ed9b8de8/core/src/main/scala/org/apache/spark/io/CompressionCodec.scala#L213], which is used for Spark-internal purposes like compressing shuffle data, isn't it?

bq. BTW, please note that Apache Hadoop ZStandardCodec exists only at Hadoop 2.9+ (HADOOP-13578) and Apache Spark still supports Hadoop 2.7 distribution.

I guess that is the reason why org.apache.hadoop.io.compress.ZStandardCodec is not on the [list|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CompressionCodecs.scala#L29]. Thank you!

> Improve ZSTD support
> ---
>
> Key: SPARK-34651
> URL: https://issues.apache.org/jira/browse/SPARK-34651
> Project: Spark
> Issue Type: Umbrella
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Dongjoon Hyun
> Priority: Major

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
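The two codecs discussed above are selected at different configuration points, which can be sketched as follows. A minimal illustration, not code from the thread; the config key and option name are real, but the paths are placeholders, and the full-class-name workaround reflects the Spark 3.1 behaviour reported later in this thread.

```scala
import org.apache.spark.sql.SparkSession

// Spark's *internal* codec (org.apache.spark.io.ZStdCompressionCodec),
// used for shuffle and other engine-internal data, is a Spark conf:
val spark = SparkSession.builder()
  .master("local[*]")
  .config("spark.io.compression.codec", "zstd")
  .getOrCreate()

// The *Hadoop* codec used when writing data-source files is chosen via the
// writer's "compression" option. In Spark 3.1 the short name "zstd" is not
// recognized for text files, so the full Hadoop class name is needed:
spark.read.textFile("hdfs://path/to/file.txt") // placeholder path
  .write
  .option("compression", "org.apache.hadoop.io.compress.ZStandardCodec")
  .text("hdfs://path/to/file.txt.zst")
```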
[jira] [Comment Edited] (SPARK-34651) Improve ZSTD support
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304676#comment-17304676 ] Stanislav Savulchik edited comment on SPARK-34651 at 3/19/21, 7:18 AM: ---

[~dongjoon] I noticed that "zstd" is not supported as a short compression codec name for text files in Spark 3.1.1, though I can use it via its full class name {{org.apache.hadoop.io.compress.ZStandardCodec}}:

{code:java}
scala> spark.read.textFile("hdfs://path/to/file.txt").write.option("compression", "zstd").text("hdfs://path/to/file.txt.zst")
java.lang.IllegalArgumentException: Codec [zstd] is not available. Known codecs are bzip2, deflate, uncompressed, lz4, gzip, snappy, none.
  at org.apache.spark.sql.catalyst.util.CompressionCodecs$.getCodecClassName(CompressionCodecs.scala:53)
  at org.apache.spark.sql.execution.datasources.text.TextOptions.$anonfun$compressionCodec$1(TextOptions.scala:37)
  at scala.Option.map(Option.scala:230)
  at org.apache.spark.sql.execution.datasources.text.TextOptions.<init>(TextOptions.scala:37)
  at org.apache.spark.sql.execution.datasources.text.TextOptions.<init>(TextOptions.scala:32)
  at org.apache.spark.sql.execution.datasources.text.TextFileFormat.prepareWrite(TextFileFormat.scala:72)
  at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:133)
  ...

scala> spark.read.textFile("hdfs://path/to/file.txt").write.option("compression", "org.apache.hadoop.io.compress.ZStandardCodec").text("hdfs://path/to/file.txt.zst")
// no exceptions{code}

Source code for the stack frame above: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CompressionCodecs.scala#L29

Should I create a Jira issue to add a short codec name for zstd to the list?
was (Author: savulchik): [~dongjoon] I noticed that "zstd" is not supported as a short compression codec name for text files in spark 3.1.1 though I can use it via its full class name {{org.apache.hadoop.io.compress.ZStandardCodec}} in {code:java} scala> spark.read.textFile("hdfs://path/to/file.txt").write.option("compression", "zstd").text("hdfs://path/to/file.txt.zst") java.lang.IllegalArgumentException: Codec [zstd] is not available. Known codecs are bzip2, deflate, uncompressed, lz4, gzip, snappy, none. at org.apache.spark.sql.catalyst.util.CompressionCodecs$.getCodecClassName(CompressionCodecs.scala:53) ... scala> spark.read.textFile("hdfs://pato/to/file.txt").write.option("compression", "org.apache.hadoop.io.compress.ZStandardCodec").text("hdfs://path/to/file.txt.zst") // no exceptions{code} Source code for the stack frame above [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CompressionCodecs.scala#L29] Should I create a Jira issue to add a short codec name for zstd to the list?
[jira] [Commented] (SPARK-34651) Improve ZSTD support
[ https://issues.apache.org/jira/browse/SPARK-34651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304676#comment-17304676 ] Stanislav Savulchik commented on SPARK-34651: -

[~dongjoon] I noticed that "zstd" is not supported as a short compression codec name for text files in Spark 3.1.1, though I can use it via its full class name {{org.apache.hadoop.io.compress.ZStandardCodec}}:

{code:java}
scala> spark.read.textFile("hdfs://path/to/file.txt").write.option("compression", "zstd").text("hdfs://path/to/file.txt.zst")
java.lang.IllegalArgumentException: Codec [zstd] is not available. Known codecs are bzip2, deflate, uncompressed, lz4, gzip, snappy, none.
  at org.apache.spark.sql.catalyst.util.CompressionCodecs$.getCodecClassName(CompressionCodecs.scala:53)
  ...

scala> spark.read.textFile("hdfs://path/to/file.txt").write.option("compression", "org.apache.hadoop.io.compress.ZStandardCodec").text("hdfs://path/to/file.txt.zst")
// no exceptions{code}

Source code for the stack frame above: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CompressionCodecs.scala#L29

Should I create a Jira issue to add a short codec name for zstd to the list?