[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-12-02 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated SPARK-41313:

Description: 
SPARK-3900 fixed the {{IllegalStateException}} thrown by cleanupStagingDir in 
the ApplicationMaster's shutdown hook. However, SPARK-21138 accidentally 
reverted that change when fixing the "Wrong FS" bug. Our users at LinkedIn are 
now hitting the SPARK-3900 failure again, so we need to bring back the fix for 
SPARK-3900.

The {{IllegalStateException}} when creating a new filesystem object is due to a 
limitation in Hadoop: a shutdown hook cannot be registered while shutdown is in 
progress. When a Spark job fails during pre-launch, cleanupStagingDir is called 
as part of shutdown. If it then creates a filesystem object for the first time, 
HDFS tries to register a hook to shut down KeyProviderCache while creating a 
ClientContext for DFSClient, and we hit the {{IllegalStateException}}. We 
should avoid creating a new filesystem object in cleanupStagingDir() when it is 
called from a shutdown hook. SPARK-3900 introduced that behavior, but 
SPARK-21138 accidentally reverted it. We need to bring that fix back to Spark 
to avoid the {{IllegalStateException}}.

  

  was:
spark-3900 fixed the illegalStateException in cleanupStagingDir in 
ApplicationMaster's shutdownhook. However, spark-21138 accidentally 
reverted/undid that change when fixing the "Wrong FS" bug. Now, we are seeing 
spark-3900 reported by our users at Linkedin. We need to bring back the fix for 
spark-3900.

The illegalStateException when creating a new filesystem object is due to the 
limitation in hadoop that we can not register a shutdownhook during shutdown. 
So, when a spark job fails during pre-launch, as part of shutdown, 
cleanupStagingDir would be called. Then, if we attempt to create a new 
filesystem object for the first time, hadoop would try to register a hook to 
shutdown KeyProviderCache when creating a ClientContext for DFSClient. As a 
result, we hit the illegalStateException. We should avoid the creation of a new 
filesystem object in cleanupStagingDir() when it is called in a shutdown hook. 
This was introduced in spark-3900. However, spark-21138 accidentally 
reverted/undid that change. We need to bring back that fix to Spark to avoid 
the illegalStateException.

  


> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Minor
>
> SPARK-3900 fixed the {{IllegalStateException}} thrown by cleanupStagingDir in 
> the ApplicationMaster's shutdown hook. However, SPARK-21138 accidentally 
> reverted that change when fixing the "Wrong FS" bug. Our users at LinkedIn 
> are now hitting the SPARK-3900 failure again, so we need to bring back the 
> fix for SPARK-3900.
> The {{IllegalStateException}} when creating a new filesystem object is due to 
> a limitation in Hadoop: a shutdown hook cannot be registered while shutdown 
> is in progress. When a Spark job fails during pre-launch, cleanupStagingDir 
> is called as part of shutdown. If it then creates a filesystem object for the 
> first time, HDFS tries to register a hook to shut down KeyProviderCache while 
> creating a ClientContext for DFSClient, and we hit the 
> {{IllegalStateException}}. We should avoid creating a new filesystem object 
> in cleanupStagingDir() when it is called from a shutdown hook. SPARK-3900 
> introduced that behavior, but SPARK-21138 accidentally reverted it. We need 
> to bring that fix back to Spark to avoid the {{IllegalStateException}}.
>   
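
The failure mode described above can be illustrated with a small, self-contained 
sketch. It does not use Spark or Hadoop code: HookRegistry and FakeFileSystem are 
hypothetical stand-ins, and the only assumption carried over from the description 
is that a hook registry rejects new hooks once shutdown has started (the JVM's own 
Runtime.addShutdownHook behaves this way and throws IllegalStateException). The 
sketch compares a cleanup hook that creates its filesystem object lazily with one 
that reuses an object created before shutdown.

```java
import java.util.ArrayList;
import java.util.List;

class ShutdownHookSketch {

    /** Hypothetical registry: rejects new hooks once shutdown has started. */
    static final class HookRegistry {
        private final List<Runnable> hooks = new ArrayList<>();
        private boolean shuttingDown = false;

        void addHook(Runnable hook) {
            if (shuttingDown) {
                throw new IllegalStateException("Shutdown in progress, cannot add a hook");
            }
            hooks.add(hook);
        }

        void shutdown() {
            shuttingDown = true;
            // Iterate over a copy so a hook that tries to register cannot disturb the loop.
            for (Runnable hook : new ArrayList<>(hooks)) {
                hook.run();
            }
        }
    }

    /** Hypothetical filesystem client: constructing it registers a hook of its own. */
    static final class FakeFileSystem {
        FakeFileSystem(HookRegistry registry) {
            registry.addHook(() -> { /* would release client-side caches */ });
        }

        boolean delete(String path) {
            return true; // pretend the staging directory was removed
        }
    }

    /** Runs a shutdown whose cleanup hook either reuses or lazily creates the filesystem. */
    static String runScenario(boolean createFsBeforeShutdown) {
        HookRegistry registry = new HookRegistry();
        FakeFileSystem early = createFsBeforeShutdown ? new FakeFileSystem(registry) : null;
        String[] result = {"cleanup did not run"};

        registry.addHook(() -> {
            try {
                // Creating the filesystem here happens during shutdown and is rejected.
                FakeFileSystem fs = (early != null) ? early : new FakeFileSystem(registry);
                result[0] = fs.delete("/tmp/staging") ? "staging dir deleted" : "delete returned false";
            } catch (IllegalStateException e) {
                result[0] = "cleanup failed: " + e.getMessage();
            }
        });

        registry.shutdown();
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runScenario(false)); // filesystem created inside the hook
        System.out.println(runScenario(true));  // filesystem created before shutdown
    }
}
```

In this sketch only the second scenario completes the cleanup, which mirrors the 
approach the description asks to restore: do not create a new filesystem object 
from inside the shutdown hook.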



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-12-01 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-41313:
Priority: Minor  (was: Major)

> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Minor
>
> spark-3900 fixed the illegalStateException in cleanupStagingDir in 
> ApplicationMaster's shutdownhook. However, spark-21138 accidentally 
> reverted/undid that change when fixing the "Wrong FS" bug. Now, we are seeing 
> spark-3900 reported by our users at Linkedin. We need to bring back the fix 
> for spark-3900.
> The illegalStateException when creating a new filesystem object is due to the 
> limitation in hadoop that we can not register a shutdownhook during shutdown. 
> So, when a spark job fails during pre-launch, as part of shutdown, 
> cleanupStagingDir would be called. Then, if we attempt to create a new 
> filesystem object for the first time, hadoop would try to register a hook to 
> shutdown KeyProviderCache when creating a ClientContext for DFSClient. As a 
> result, we hit the illegalStateException. We should avoid the creation of a 
> new filesystem object in cleanupStagingDir() when it is called in a shutdown 
> hook. This was introduced in spark-3900. However, spark-21138 accidentally 
> reverted/undid that change. We need to bring back that fix to Spark to avoid 
> the illegalStateException.
>   






[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-12-01 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-41313:
Target Version/s:   (was: 3.2.4, 3.3.2, 3.4.0)

> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> spark-3900 fixed the illegalStateException in cleanupStagingDir in 
> ApplicationMaster's shutdownhook. However, spark-21138 accidentally 
> reverted/undid that change when fixing the "Wrong FS" bug. Now, we are seeing 
> spark-3900 reported by our users at Linkedin. We need to bring back the fix 
> for spark-3900.
> The illegalStateException when creating a new filesystem object is due to the 
> limitation in hadoop that we can not register a shutdownhook during shutdown. 
> So, when a spark job fails during pre-launch, as part of shutdown, 
> cleanupStagingDir would be called. Then, if we attempt to create a new 
> filesystem object for the first time, hadoop would try to register a hook to 
> shutdown KeyProviderCache when creating a ClientContext for DFSClient. As a 
> result, we hit the illegalStateException. We should avoid the creation of a 
> new filesystem object in cleanupStagingDir() when it is called in a shutdown 
> hook. This was introduced in spark-3900. However, spark-21138 accidentally 
> reverted/undid that change. We need to bring back that fix to Spark to avoid 
> the illegalStateException.
>   






[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-11-29 Thread Xing Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Lin updated SPARK-41313:
Description: 
spark-3900 fixed the illegalStateException in cleanupStagingDir in 
ApplicationMaster's shutdownhook. However, spark-21138 accidentally 
reverted/undid that change when fixing the "Wrong FS" bug. Now, we are seeing 
spark-3900 reported by our users at Linkedin. We need to bring back the fix for 
spark-3900.

The illegalStateException when creating a new filesystem object is due to the 
limitation in hadoop that we can not register a shutdownhook during shutdown. 
So, when a spark job fails during pre-launch, as part of shutdown, 
cleanupStagingDir would be called. Then, if we attempt to create a new 
filesystem object for the first time, hadoop would try to register a hook to 
shutdown KeyProviderCache when creating a ClientContext for DFSClient. As a 
result, we hit the illegalStateException. We should avoid the creation of a new 
filesystem object in cleanupStagingDir() when it is called in a shutdown hook. 
This was introduced in spark-3900. However, spark-21138 accidentally 
reverted/undid that change. We need to bring back that fix to Spark to avoid 
the illegalStateException.

  

  was:spark-3900 fixed the illegalStateException in cleanupStagingDir in 
ApplicationMaster's shutdownhook. However, spark-21138 reverted that change 
when fixing the "Wrong FS" bug. We need both fixes. 


> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> spark-3900 fixed the illegalStateException in cleanupStagingDir in 
> ApplicationMaster's shutdownhook. However, spark-21138 accidentally 
> reverted/undid that change when fixing the "Wrong FS" bug. Now, we are seeing 
> spark-3900 reported by our users at Linkedin. We need to bring back the fix 
> for spark-3900.
> The illegalStateException when creating a new filesystem object is due to the 
> limitation in hadoop that we can not register a shutdownhook during shutdown. 
> So, when a spark job fails during pre-launch, as part of shutdown, 
> cleanupStagingDir would be called. Then, if we attempt to create a new 
> filesystem object for the first time, hadoop would try to register a hook to 
> shutdown KeyProviderCache when creating a ClientContext for DFSClient. As a 
> result, we hit the illegalStateException. We should avoid the creation of a 
> new filesystem object in cleanupStagingDir() when it is called in a shutdown 
> hook. This was introduced in spark-3900. However, spark-21138 accidentally 
> reverted/undid that change. We need to bring back that fix to Spark to avoid 
> the illegalStateException.
>   






[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-11-28 Thread Xing Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Lin updated SPARK-41313:
Description: spark-3900 fixed the illegalStateException in 
cleanupStagingDir in ApplicationMaster's shutdownhook. However, spark-21138 
reverted that change when fixing the "Wrong FS" bug. We need both fixes.   
(was: spark-3900 fixed the illegalStateException in cleanupStagingDir in 
ApplicationMaster's shutdownhook. However, spark-21138 reverted that fix for 
fixing "Wrong FS" bug. We need both fixes. )

> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> spark-3900 fixed the illegalStateException in cleanupStagingDir in 
> ApplicationMaster's shutdownhook. However, spark-21138 reverted that change 
> when fixing the "Wrong FS" bug. We need both fixes. 






[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-11-28 Thread Xing Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Lin updated SPARK-41313:
Description: spark-3900 fixed the illegalStateException in 
cleanupStagingDir in ApplicationMaster's shutdownhook. However, spark-21138 
reverted that fix for fixing "Wrong FS" bug. We need both fixes.   (was: 
spark-3900 fixed the illegalStateException in ApplicationMaster's shutdownhook. 
However, spark-21138 reverted that fix for fixing "Wrong FS" bug. We need both 
fixes. )

> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> spark-3900 fixed the illegalStateException in cleanupStagingDir in 
> ApplicationMaster's shutdownhook. However, spark-21138 reverted that fix for 
> fixing "Wrong FS" bug. We need both fixes. 






[jira] [Updated] (SPARK-41313) Combine fixes for SPARK-3900 and SPARK-21138

2022-11-28 Thread Xing Lin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Lin updated SPARK-41313:
Summary: Combine fixes for SPARK-3900 and SPARK-21138  (was: Combine fix 
for SPARK-3900 and SPARK-21138)

> Combine fixes for SPARK-3900 and SPARK-21138
> 
>
> Key: SPARK-41313
> URL: https://issues.apache.org/jira/browse/SPARK-41313
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, YARN
> Affects Versions: 3.4.0
> Reporter: Xing Lin
> Priority: Major
>
> spark-3900 fixed the illegalStateException in ApplicationMaster's 
> shutdownhook. However, spark-21138 reverted that fix for fixing "Wrong FS" 
> bug. We need both fixes. 


