Github user liufengdb commented on the issue:

    https://github.com/apache/spark/pull/20029
  
    @zuotingbing I took a close look at the related code and think the issue you raised is valid:
    
    1. The hiveClient created for the [resourceLoader](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala#L45) is only used for `addJar`, which in turn adds the jar to the shared [`IsolatedClientLoader`](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L817). So we can just use the shared hive client for this purpose (see the first sketch after this list).
    
    2. Another possible reason to use a new hive client is to run [this hive statement](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L818). But I think that is just a leftover from old Spark and should be removed. So overall it is fine to use the shared `client` from `HiveExternalCatalog` without creating a new hive client.
    
    3. Currently, there is no way to clean up the resources created by a [new session of SQLContext/SparkSession](https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLSessionManager.scala#L78). I couldn't understand the design tradeoff behind [this](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L716) (@srowen). So it is not easy to remove the temp dirs when a session is closed (a possible cleanup shape is sketched below).
    
    4. To what extent does Spark need these scratch dirs? Is it possible to make this step optional if it is not used in all deployment modes?
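
    To make points 1 and 2 concrete: `newSession()` hands the new client the same `IsolatedClientLoader`, so adding a jar through a per-session client mutates exactly the same state as adding it through the shared client. A self-contained toy model of that sharing (the class names are simplified stand-ins for the Spark classes linked above, not the real APIs):

    ```scala
    import java.net.URL

    // Stand-in for IsolatedClientLoader: one shared jar registry.
    class LoaderStub {
      private var jars = List.empty[URL]
      def addJar(url: URL): Unit = jars ::= url
      def loadedJars: List[URL] = jars
    }

    // Stand-in for HiveClient: addJar funnels into the shared loader,
    // and newSession() shares that same loader with the new client.
    class ClientStub(val clientLoader: LoaderStub) {
      def addJar(path: String): Unit =
        clientLoader.addJar(new java.io.File(path).toURI.toURL)
      def newSession(): ClientStub = new ClientStub(clientLoader)
    }

    object SharedLoaderDemo extends App {
      val shared = new ClientStub(new LoaderStub)
      shared.newSession().addJar("a.jar") // via a per-session client
      shared.addJar("b.jar")              // via the shared client
      // Both jars land in the one shared loader, so the per-session
      // client (and the scratch dirs it creates) buys nothing for addJar.
      assert(shared.clientLoader.loadedJars.size == 2)
    }
    ```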
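
    On point 3, the mechanics of the cleanup itself would be simple: remember the scratch dir each session creates and delete it on close. What is missing is a per-session close hook to attach this to. A hypothetical, self-contained sketch (`ScratchDirTracker` and its methods are made-up names, not Spark API; in Spark the `closeSession` half would have to live somewhere like `SparkSQLSessionManager.closeSession`):

    ```scala
    import java.nio.file.{Files, Path}
    import scala.collection.JavaConverters._
    import scala.collection.concurrent.TrieMap

    object ScratchDirTracker {
      private val dirsBySession = TrieMap.empty[String, Path]

      // Called when a session is opened: create and record its scratch dir.
      def openSession(sessionId: String): Path = {
        val dir = Files.createTempDirectory(s"scratch-$sessionId-")
        dirsBySession.put(sessionId, dir)
        dir
      }

      // Called when a session is closed: delete whatever it created.
      def closeSession(sessionId: String): Unit =
        dirsBySession.remove(sessionId).foreach(deleteRecursively)

      private def deleteRecursively(p: Path): Unit = {
        if (Files.isDirectory(p)) {
          val children = Files.list(p)
          try children.iterator().asScala.foreach(deleteRecursively)
          finally children.close()
        }
        Files.deleteIfExists(p)
      }
    }
    ```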

