Github user liufengdb commented on the issue: https://github.com/apache/spark/pull/20029

@zuotingbing I took a close look at the related code and think the issue you raised is valid:

1. The hive client created for the [resourceLoader](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala#L45) is used only for `addJar`, which in turn adds the jar to the shared [`IsolatedClientLoader`](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L817). We can therefore just use the shared hive client for this purpose.
2. The other possible reason to use a new hive client is to run [this hive statement](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L818), but I think that is just a leftover from old Spark and should be removed. So overall it is fine to use the shared `client` from `HiveExternalCatalog` without creating a new hive client.
3. Currently, there is no way to clean up the resources created by a [new session of SQLContext/SparkSession](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala#L78). I couldn't understand the design tradeoff behind [this](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L716) (@srowen), so it is not easy to remove the temp dirs when a session is closed.
4. To what extent does Spark need these scratch dirs? Could we make this step optional if it is not used in all deployment modes?
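To illustrate point 1, here is a minimal, self-contained sketch (not the actual Spark classes; `HiveClient`, `SharedHiveClient`, and `SessionResourceLoader` below are simplified stand-ins) of what "reuse the shared client" means: each session's resource loader delegates `addJar` to the one client owned by the external catalog, instead of constructing a fresh isolated client per session that would leave its own scratch state behind.

```scala
// Minimal stand-in for the real HiveClient interface: only the one
// method the session resource loader actually needs.
trait HiveClient {
  def addJar(path: String): Unit
}

// The single client created by the external catalog and shared by
// all sessions (analogous to HiveExternalCatalog.client).
class SharedHiveClient extends HiveClient {
  val jars = scala.collection.mutable.ListBuffer.empty[String]
  override def addJar(path: String): Unit = jars += path
}

// Per-session loader: it receives the shared client rather than
// building its own, so session teardown has no extra client to clean up.
class SessionResourceLoader(client: HiveClient) {
  def addJar(path: String): Unit = client.addJar(path)
}

object Demo extends App {
  val shared = new SharedHiveClient
  // Two "sessions" share one client; jars from both land in one place.
  val session1 = new SessionResourceLoader(shared)
  val session2 = new SessionResourceLoader(shared)
  session1.addJar("/tmp/a.jar")
  session2.addJar("/tmp/b.jar")
  println(shared.jars.mkString(","))
}
```

Under this shape, closing a session leaves nothing client-specific to tear down, which sidesteps the cleanup gap described in point 3 for the resource-loader part of the problem.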