[ https://issues.apache.org/jira/browse/SPARK-39866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-39866:
------------------------------------

    Assignee: Apache Spark

> Memory leak when closing a session of Spark Thrift Server
> ----------------------------------------------------------
>
>                 Key: SPARK-39866
>                 URL: https://issues.apache.org/jira/browse/SPARK-39866
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.3
>            Reporter: Liu Shuo
>            Assignee: Apache Spark
>            Priority: Major
>         Attachments: image-2022-07-26-11-54-06-826.png
>
> We are using Spark Thrift Server as a distributed SQL query engine; the
> queries read datasource tables backed by a large number of files.
> When we open multiple sessions, the driver can crash with an OOM.
> We can reproduce it with the following steps:
> # Start Spark Thrift Server.
> # Use beeline to open a new session.
> # Create a new datasource table.
> # Insert one row of data into this table.
> # Open another 5 new sessions, then use a `select` command to scan this table.
> # Close all 6 sessions.
> # Use the jmap command `jmap -histo:live pid > pid.log` to print a heap
> histogram of the driver.
> *Expected result:*
> The cached FileStatus entries should have been cleaned up, so the number of
> HdfsLocatedFileStatus objects should be 0.
> *Actual result:*
> The number of HdfsLocatedFileStatus objects is 6.
> *!image-2022-07-26-11-54-06-826.png!*

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
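The reproduction steps in the report can be sketched as a shell session. This is a hedged sketch, not the reporter's exact script: the table name `leak_test`, the JDBC URL, and the use of `$SPARK_HOME` and `pgrep` are assumptions; adjust them to the actual deployment. It requires a running Spark cluster and is not runnable standalone.

```shell
# Hypothetical repro sketch for SPARK-39866 (paths, URL, and table name are
# assumptions; adjust to your environment).

# Step 1: start Spark Thrift Server (the script ships with the Spark distribution).
"$SPARK_HOME"/sbin/start-thriftserver.sh

# Steps 2-4: open a beeline session, create a datasource table, insert one row.
# Each beeline invocation opens (and on exit closes) one Thrift session.
"$SPARK_HOME"/bin/beeline -u jdbc:hive2://localhost:10000 -e "
  CREATE TABLE leak_test (id INT) USING parquet;
  INSERT INTO leak_test VALUES (1);
"

# Step 5: open another 5 sessions, each scanning the table.
for i in 1 2 3 4 5; do
  "$SPARK_HOME"/bin/beeline -u jdbc:hive2://localhost:10000 \
    -e "SELECT * FROM leak_test;"
done

# Step 6: all 6 sessions are already closed, since each beeline process exited.

# Step 7: dump a live-object histogram of the driver JVM and look for
# HdfsLocatedFileStatus instances that should have been released.
DRIVER_PID=$(pgrep -f HiveThriftServer2 | head -n 1)
jmap -histo:live "$DRIVER_PID" > pid.log
grep HdfsLocatedFileStatus pid.log   # expected 0 entries; the bug leaves 6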