This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 115f777  [SPARK-21449][SQL][FOLLOWUP] Avoid logging undesirable IllegalStateException when the state closes
115f777 is described below

commit 115f777cb0a9dff78497bad9b64daa5da1ee0e51
Author: Kent Yao <y...@apache.org>
AuthorDate: Wed Mar 17 15:21:23 2021 +0800

    [SPARK-21449][SQL][FOLLOWUP] Avoid logging undesirable IllegalStateException when the state closes
    
    ### What changes were proposed in this pull request?
    
    `TmpOutputFile` and `TmpErrOutputFile` are registered in `o.a.h.u.ShutdownHookManager` during creation. `state.close()` deletes them if they are not null and then tries to remove them from `o.a.h.u.ShutdownHookManager`, which throws an IllegalStateException when we invoke it from our own ShutdownHookManager during shutdown.
    In this PR, we delete these files ahead of time with a high-priority hook in Spark and set the references to null, which bypasses the deletion and cancellation in `state.close()`.
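    The root cause is generic JVM behavior rather than anything Hive-specific: deregistering anything from a shutdown-hook registry while shutdown is already in progress raises the same IllegalStateException. A minimal standalone sketch (the class name `ShutdownCancelDemo` is hypothetical, made up for illustration):

    ```java
    // Once JVM shutdown has begun, deregistering a previously registered
    // shutdown hook fails with IllegalStateException("Shutdown in progress"),
    // the same failure mode Hive's ShutdownHookManager.cancelDeleteOnExit hits
    // when state.close() runs inside Spark's own shutdown hook.
    public class ShutdownCancelDemo {
      public static void main(String[] args) {
        Thread cleanup = new Thread(() -> { /* stands in for a deleteOnExit */ });
        Runtime.getRuntime().addShutdownHook(cleanup);
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
          try {
            // Deregistering during shutdown is illegal, just like cancelDeleteOnExit.
            Runtime.getRuntime().removeShutdownHook(cleanup);
          } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
          }
        }));
        // main returns here; both hooks run at JVM exit and the catch branch fires.
      }
    }
    ```

    Deleting the files first and nulling the references sidesteps this entirely, because `state.close()` then never reaches the cancel call.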
    
    ### Why are the changes needed?
    
    W/ or w/o this PR, the deletion of these files is unaffected; we just mute an undesirable error log.
    
    ### Does this PR introduce _any_ user-facing change?
    
    no, this is a follow-up
    
    ### How was this patch tested?
    
    #### The undesirable error log is gone
    ```shell
    spark-sql> 21/03/16 18:41:31 ERROR Utils: Uncaught exception in thread shutdown-hook-0
    java.lang.IllegalStateException: Shutdown in progress, cannot cancel a deleteOnExit
        at org.apache.hive.common.util.ShutdownHookManager.cancelDeleteOnExit(ShutdownHookManager.java:106)
        at org.apache.hadoop.hive.common.FileUtils.deleteTmpFile(FileUtils.java:861)
        at org.apache.hadoop.hive.ql.session.SessionState.deleteTmpErrOutputFile(SessionState.java:325)
        at org.apache.hadoop.hive.ql.session.SessionState.dropSessionPaths(SessionState.java:829)
        at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1585)
        at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
        at org.apache.spark.sql.hive.client.HiveClientImpl.closeState(HiveClientImpl.scala:172)
        at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$new$1(HiveClientImpl.scala:175)
        at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
        at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:188)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1994)
        at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:188)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.Try$.apply(Try.scala:213)
        at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
        at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    (python)  ✘ kentyaohulk  ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316  cd ..
    (python)  kentyaohulk  ~/Downloads/spark  tar zxf spark-3.2.0-SNAPSHOT-bin-20210316.tgz
    (python)  kentyaohulk  ~/Downloads/spark  cd -
    ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316
    (python)  kentyaohulk  ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316  bin/spark-sql --conf spark.local.dir=./local --conf spark.hive.exec.local.scratchdir=./local
    21/03/16 18:42:15 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.242.189.214 instead (on interface en0)
    21/03/16 18:42:15 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    21/03/16 18:42:15 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    21/03/16 18:42:16 WARN SparkConf: Note that spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone/kubernetes and LOCAL_DIRS in YARN).
    21/03/16 18:42:18 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
    21/03/16 18:42:18 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
    21/03/16 18:42:19 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
    21/03/16 18:42:19 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore kentyao127.0.0.1
    Spark master: local[*], Application Id: local-1615891336877
    spark-sql> %
    ```
    
    #### and the deletion is still fine
    
    ```shell
    kentyaohulk  ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316  ls -al local
    total 0
    drwxr-xr-x   7 kentyao  staff  224  3 16 18:42 .
    drwxr-xr-x  19 kentyao  staff  608  3 16 18:42 ..
    drwx------   2 kentyao  staff   64  3 16 18:42 16cc5238-e25e-4c0f-96ef-0c4bdecc7e51
    -rw-r--r--   1 kentyao  staff    0  3 16 18:42 16cc5238-e25e-4c0f-96ef-0c4bdecc7e51219959790473242539.pipeout
    -rw-r--r--   1 kentyao  staff    0  3 16 18:42 16cc5238-e25e-4c0f-96ef-0c4bdecc7e518816377057377724129.pipeout
    drwxr-xr-x   2 kentyao  staff   64  3 16 18:42 blockmgr-37a52ad2-eb56-43a5-8803-8f58d08fe9ad
    drwx------   3 kentyao  staff   96  3 16 18:42 spark-101971df-f754-47c2-8764-58c45586be7e
    kentyaohulk  ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316  ls -al local
    total 0
    drwxr-xr-x   2 kentyao  staff   64  3 16 19:22 .
    drwxr-xr-x  19 kentyao  staff  608  3 16 18:42 ..
    kentyaohulk  ~/Downloads/spark/spark-3.2.0-SNAPSHOT-bin-20210316
    ```
    
    Closes #31850 from yaooqinn/followup.
    
    Authored-by: Kent Yao <y...@apache.org>
    Signed-off-by: Kent Yao <y...@apache.org>
---
 .../apache/spark/sql/hive/client/HiveClientImpl.scala | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
index 800c3ca..35dd2c1 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
@@ -155,7 +155,24 @@ private[hive] class HiveClientImpl(
     }
   }
 
-  ShutdownHookManager.addShutdownHook(() => state.close())
+  private def closeState(): Unit = {
+    // These temp files are registered in o.a.h.u.ShutdownHookManager too during state start.
+    // The state.close() will delete them if they are not null and try remove them from the
+    // o.a.h.u.ShutdownHookManager which causes undesirable IllegalStateException.
+    // We delete them ahead with a high priority hook here and set them to null to bypass the
+    // deletion in state.close().
+    if (state.getTmpOutputFile != null) {
+      state.getTmpOutputFile.delete()
+      state.setTmpOutputFile(null)
+    }
+    if (state.getTmpErrOutputFile != null) {
+      state.getTmpErrOutputFile.delete()
+      state.setTmpErrOutputFile(null)
+    }
+    state.close()
+  }
+
+  ShutdownHookManager.addShutdownHook(() => closeState())
 
   // Log the default warehouse location.
   logInfo(


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
