[ https://issues.apache.org/jira/browse/SPARK-22218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22218:
------------------------------------

    Assignee:     (was: Apache Spark)

> spark shuffle service fails to update secret on application re-attempts
> -----------------------------------------------------------------------
>
>                 Key: SPARK-22218
>                 URL: https://issues.apache.org/jira/browse/SPARK-22218
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, YARN
>    Affects Versions: 2.2.0
>            Reporter: Thomas Graves
>            Priority: Blocker
>
> Running on YARN, if an application has any re-attempts while using the Spark 
> 2.2 shuffle service, the external shuffle service does not update the 
> credentials properly and the re-attempts fail with 
> javax.security.sasl.SaslException.
> A fix that went into 2.2 (SPARK-21494) changed the ShuffleSecretManager to 
> use containsKey 
> (https://git.corp.yahoo.com/hadoop/spark/blob/yspark_2_2_0/common/network-shuffle/src/main/java/org/apache/spark/network/sasl/ShuffleSecretManager.java#L50),
> which is the proper behavior. The problem is that the key is never removed 
> between application re-attempts, so when the second attempt starts, the code 
> sees the key as already present (the application id is the same across 
> attempts) and never updates the secret.
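> A minimal sketch of the problematic registration path follows; the class and 
> field names (SecretRegistrySketch, shuffleSecretMap) are assumptions for 
> illustration, not the exact Spark code:
>
>     import java.util.concurrent.ConcurrentHashMap;
>
>     class SecretRegistrySketch {
>       private final ConcurrentHashMap<String, String> shuffleSecretMap =
>           new ConcurrentHashMap<>();
>
>       // Store the secret only when absent (the SPARK-21494 behavior).
>       // The appId is identical across YARN re-attempts and nothing removes
>       // the entry between attempts, so attempt 2 keeps attempt 1's stale
>       // secret and SASL authentication fails.
>       void registerApp(String appId, String shuffleSecret) {
>         if (!shuffleSecretMap.containsKey(appId)) {
>           shuffleSecretMap.put(appId, shuffleSecret);
>         }
>       }
>     }
>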
> To reproduce this, run something like a word count with the output directory 
> already existing. The first attempt fails because the output directory 
> exists, and the subsequent attempts fail with the maximum number of executor 
> failures. Note that this assumes the second and third attempts run on the 
> same node as the first attempt.
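>
> One possible direction, sketched with the same assumed names: overwrite the 
> entry unconditionally on registration and drop it when the application goes 
> away, so each re-attempt installs its own secret:
>
>     import java.util.concurrent.ConcurrentHashMap;
>
>     class SecretRegistrySketchFixed {
>       private final ConcurrentHashMap<String, String> shuffleSecretMap =
>           new ConcurrentHashMap<>();
>
>       void registerApp(String appId, String shuffleSecret) {
>         // put() replaces any stale value left over from a prior attempt.
>         shuffleSecretMap.put(appId, shuffleSecret);
>       }
>
>       void unregisterApp(String appId) {
>         // Drop the entry when the application goes away.
>         shuffleSecretMap.remove(appId);
>       }
>     }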



