Github user liufengdb commented on the issue: https://github.com/apache/spark/pull/20029

@zuotingbing I took a close look at the related code and think the issue you raised is valid:

1. The hive client created for the [resourceLoader](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala#L45) is used only for `addJar`, which in turn adds the jar to the shared [`IsolatedClientLoader`](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L817). We can therefore just use the shared hive client for this purpose.
2. The other possible reason to use a new hive client is to run [this hive statement](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L818), but I think that is just a leftover from old Spark and should be removed. So overall it is fine to use the shared `client` from `HiveExternalCatalog` without creating a new hive client.
3. Currently, there is no way to clean up the resources created by a [new session of SQLContext/SparkSession](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala#L78). I couldn't understand the design tradeoff behind [this](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L716) (@srowen), so it is not easy to remove the temp dirs when a session is closed.
4. To what extent does Spark need these scratch dirs? Could we make this step optional if it is not used in all deployment modes?
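To illustrate point 1, here is a minimal, self-contained sketch (not the actual Spark classes; `HiveClient`, `SharedHiveClient`, and `SessionResourceLoader` below are simplified stand-ins) of what "reuse the shared client" means: each session's resource loader delegates `addJar` to the one client owned by the external catalog, instead of constructing a fresh isolated client per session that would leave its own scratch state behind.

```scala
// Minimal stand-in for the real HiveClient interface: only the one
// method the session resource loader actually needs.
trait HiveClient {
  def addJar(path: String): Unit
}

// The single client created by the external catalog and shared by
// all sessions (analogous to HiveExternalCatalog.client).
class SharedHiveClient extends HiveClient {
  val jars = scala.collection.mutable.ListBuffer.empty[String]
  override def addJar(path: String): Unit = jars += path
}

// Per-session loader: it receives the shared client rather than
// building its own, so session teardown has no extra client to clean up.
class SessionResourceLoader(client: HiveClient) {
  def addJar(path: String): Unit = client.addJar(path)
}

object Demo extends App {
  val shared = new SharedHiveClient
  // Two "sessions" share one client; jars from both land in one place.
  val session1 = new SessionResourceLoader(shared)
  val session2 = new SessionResourceLoader(shared)
  session1.addJar("/tmp/a.jar")
  session2.addJar("/tmp/b.jar")
  println(shared.jars.mkString(","))
}
```

Under this shape, closing a session leaves nothing client-specific to tear down, which sidesteps the cleanup gap described in point 3 for the resource-loader part of the problem.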