[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138
[ https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Lin updated SPARK-41313:
-----------------------------
    Description:
SPARK-3900 fixed the IllegalStateException thrown by cleanupStagingDir in the ApplicationMaster's shutdown hook. However, SPARK-21138 accidentally reverted that change when fixing the "Wrong FS" bug. We are now seeing the SPARK-3900 failure reported by our users at LinkedIn, so we need to bring back the fix for SPARK-3900.

The IllegalStateException when creating a new filesystem object comes from a limitation in Hadoop: a shutdown hook cannot be registered while shutdown is already in progress. When a Spark job fails during pre-launch, cleanupStagingDir is called as part of shutdown. If it then creates a filesystem object for the first time, Hadoop tries to register a hook that shuts down KeyProviderCache while creating a ClientContext for DFSClient, and we hit the IllegalStateException.

We should avoid creating a new filesystem object in cleanupStagingDir() when it is called from a shutdown hook. SPARK-3900 introduced that behavior, but SPARK-21138 accidentally undid it. We need to restore that fix in Spark to avoid the IllegalStateException.

  was:
SPARK-3900 fixed the IllegalStateException in cleanupStagingDir in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that change when fixing the "Wrong FS" bug. We need both fixes.

> Combine fixes for SPARK-3900 and SPARK-21138
> --------------------------------------------
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
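The failure mode described in the SPARK-41313 description can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not Spark's or Hadoop's actual code: ToyHookManager, ToyFileSystem, the two cleanup methods, and the "/staging" path are hypothetical names invented for illustration. The toy hook manager refuses new hooks once shutdown has started, and the toy filesystem registers a hook the first time it is constructed, which mirrors the behavior the description attributes to Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

class ShutdownHookSketch {

    /** Toy hook registry (hypothetical): rejects new hooks once shutdown has started. */
    static final class ToyHookManager {
        private final List<Runnable> hooks = new ArrayList<>();
        private boolean shuttingDown = false;

        void addHook(Runnable hook) {
            if (shuttingDown) {
                throw new IllegalStateException("Shutdown in progress, cannot add a hook");
            }
            hooks.add(hook);
        }

        void runShutdown() {
            shuttingDown = true;
            // Iterate over a snapshot so the list is never modified while iterating.
            for (Runnable hook : new ArrayList<>(hooks)) {
                hook.run();
            }
        }
    }

    /** Toy filesystem client (hypothetical): constructing it registers a cleanup hook. */
    static final class ToyFileSystem {
        ToyFileSystem(ToyHookManager manager) {
            manager.addHook(() -> { /* would close cached resources */ });
        }

        boolean delete(String path) {
            return true; // pretend the staging directory was removed
        }
    }

    /** Creates the filesystem for the first time inside the hook: the problematic pattern. */
    static String cleanupWithLazyFileSystem() {
        ToyHookManager manager = new ToyHookManager();
        String[] outcome = {"hook did not run"};
        manager.addHook(() -> {
            try {
                ToyFileSystem fs = new ToyFileSystem(manager); // first creation during shutdown
                outcome[0] = fs.delete("/staging") ? "deleted" : "not deleted";
            } catch (IllegalStateException e) {
                outcome[0] = "failed: " + e.getMessage();
            }
        });
        manager.runShutdown();
        return outcome[0];
    }

    /** Creates the filesystem before shutdown and only reuses it inside the hook. */
    static String cleanupWithEagerFileSystem() {
        ToyHookManager manager = new ToyHookManager();
        ToyFileSystem fs = new ToyFileSystem(manager); // created while hooks are still allowed
        String[] outcome = {"hook did not run"};
        manager.addHook(() -> outcome[0] = fs.delete("/staging") ? "deleted" : "not deleted");
        manager.runShutdown();
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println("lazy:  " + cleanupWithLazyFileSystem());
        System.out.println("eager: " + cleanupWithEagerFileSystem());
    }
}
```

In this toy model the lazy variant reports the IllegalStateException, while the eager variant completes the cleanup, which is the reason the description asks that cleanupStagingDir() not create a new filesystem object when it runs from a shutdown hook.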
[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138
[ https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Lin updated SPARK-41313:
-----------------------------
    Description: SPARK-3900 fixed the IllegalStateException in cleanupStagingDir in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that change when fixing the "Wrong FS" bug. We need both fixes.  (was: SPARK-3900 fixed the IllegalStateException in cleanupStagingDir in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that fix for fixing "Wrong FS" bug. We need both fixes.)

> Combine fixes for SPARK-3900 and SPARK-21138
> --------------------------------------------
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138
[ https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Lin updated SPARK-41313:
-----------------------------
    Description: SPARK-3900 fixed the IllegalStateException in cleanupStagingDir in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that fix for fixing "Wrong FS" bug. We need both fixes.  (was: SPARK-3900 fixed the IllegalStateException in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that fix for fixing "Wrong FS" bug. We need both fixes.)

> Combine fixes for SPARK-3900 and SPARK-21138
> --------------------------------------------
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138
[ https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xing Lin updated SPARK-41313:
-----------------------------
    Summary: Combine fixes for SPARK-3900 and SPARK-21138  (was: Combine fix for SPARK-3900 and SPARK-21138)

> Combine fixes for SPARK-3900 and SPARK-21138
> --------------------------------------------
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> SPARK-3900 fixed the IllegalStateException in the ApplicationMaster's
> shutdown hook. However, SPARK-21138 reverted that fix for fixing "Wrong FS"
> bug. We need both fixes.
[jira] [Created] (SPARK-41313) Combine fix for SPARK-3900 and SPARK-21138
Xing Lin created SPARK-41313:
--------------------------------

Summary: Combine fix for SPARK-3900 and SPARK-21138
Key: SPARK-41313
URL: https://issues.apache.org/jira/browse/SPARK-41313
Project: Spark
Issue Type: Bug
Components: Spark Core, YARN
Affects Versions: 3.4.0
Reporter: Xing Lin

SPARK-3900 fixed the IllegalStateException in the ApplicationMaster's shutdown hook. However, SPARK-21138 reverted that fix for fixing "Wrong FS" bug. We need both fixes.