[ 
https://issues.apache.org/jira/browse/SPARK-17664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Ash updated SPARK-17664:
-------------------------------
    Description: 
>From follow logs, task 22 has failed 4 times because of "the driver did not 
>authorize commit". But the strange thing was that I could't find task 22.1. 
>Why? Maybe some synchronization error?

{noformat}
16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.0 in stage 1856.0 (TID 
953902) on executor 10.196.131.13: java.security.PrivilegedActionException 
(null) [duplicate 4]
16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.2 in stage 1856.0 (TID 
954074, 10.215.143.14, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.2 in stage 1856.0 (TID 
954074) on executor 10.215.143.14: java.security.PrivilegedActionException 
(null) [duplicate 5]
16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.3 in stage 1856.0 (TID 
954075, 10.196.131.28, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.3 in stage 1856.0 (TID 
954075) on executor 10.196.131.28: java.security.PrivilegedActionException 
(null) [duplicate 6]
16/09/26 02:14:19 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:19 INFO TaskSetManager: Starting task 22.4 in stage 1856.0 (TID 
954076, 10.215.153.225, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.4 in stage 1856.0 (TID 
954076) on executor 10.215.153.225: java.security.PrivilegedActionException 
(null) [duplicate 7]
16/09/26 02:14:19 ERROR TaskSetManager: Task 22 in stage 1856.0 failed 4 times; 
aborting job
16/09/26 02:14:19 INFO YarnClusterScheduler: Cancelling stage 1856
16/09/26 02:14:19 INFO YarnClusterScheduler: Stage 1856 was cancelled
16/09/26 02:14:19 INFO DAGScheduler: ResultStage 1856 (saveAsHadoopFile at 
TDWProvider.scala:514) failed in 23.049 s
16/09/26 02:14:19 INFO DAGScheduler: Job 76 failed: saveAsHadoopFile at 
TDWProvider.scala:514, took 69.865181 s
16/09/26 02:14:19 ERROR ApplicationMaster: User class threw exception: 
java.security.PrivilegedActionException: org.apache.spark.SparkException: Job 
aborted due to stage failure: Task 22 in stage 1856.0 failed 4 times, most 
recent failure: Lost task 22.4 in stage 1856.0 (TID 954076, 10.215.153.225): 
java.security.PrivilegedActionException: 
org.apache.spark.executor.CommitDeniedException: 
attempt_201609260213_1856_m_000022_954076: Not committed because the driver did 
not authorize commit
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:356)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1723)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1284)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1282)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.spark.executor.CommitDeniedException: 
attempt_201609260213_1856_m_000022_954076: Not committed because the driver did 
not authorize commit
        at 
org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:135)
        at 
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:142)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1311)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1284)
        ... 11 more
{noformat}

  was:
>From follow logs, task 22 has failed 4 times because of "the driver did not 
>authorize commit". But the strange thing was that I could't find task 22.1. 
>Why? Maybe some synchronization error?

----------------------------------------------------------------------------
16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.0 in stage 1856.0 (TID 
953902) on executor 10.196.131.13: java.security.PrivilegedActionException 
(null) [duplicate 4]
16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.2 in stage 1856.0 (TID 
954074, 10.215.143.14, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.2 in stage 1856.0 (TID 
954074) on executor 10.215.143.14: java.security.PrivilegedActionException 
(null) [duplicate 5]
16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.3 in stage 1856.0 (TID 
954075, 10.196.131.28, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.3 in stage 1856.0 (TID 
954075) on executor 10.196.131.28: java.security.PrivilegedActionException 
(null) [duplicate 6]
16/09/26 02:14:19 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
10.196.131.13) as speculatable because it ran more than 5601 ms
16/09/26 02:14:19 INFO TaskSetManager: Starting task 22.4 in stage 1856.0 (TID 
954076, 10.215.153.225, partition 22,PROCESS_LOCAL, 2163 bytes)
16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.4 in stage 1856.0 (TID 
954076) on executor 10.215.153.225: java.security.PrivilegedActionException 
(null) [duplicate 7]
16/09/26 02:14:19 ERROR TaskSetManager: Task 22 in stage 1856.0 failed 4 times; 
aborting job
16/09/26 02:14:19 INFO YarnClusterScheduler: Cancelling stage 1856
16/09/26 02:14:19 INFO YarnClusterScheduler: Stage 1856 was cancelled
16/09/26 02:14:19 INFO DAGScheduler: ResultStage 1856 (saveAsHadoopFile at 
TDWProvider.scala:514) failed in 23.049 s
16/09/26 02:14:19 INFO DAGScheduler: Job 76 failed: saveAsHadoopFile at 
TDWProvider.scala:514, took 69.865181 s
16/09/26 02:14:19 ERROR ApplicationMaster: User class threw exception: 
java.security.PrivilegedActionException: org.apache.spark.SparkException: Job 
aborted due to stage failure: Task 22 in stage 1856.0 failed 4 times, most 
recent failure: Lost task 22.4 in stage 1856.0 (TID 954076, 10.215.153.225): 
java.security.PrivilegedActionException: 
org.apache.spark.executor.CommitDeniedException: 
attempt_201609260213_1856_m_000022_954076: Not committed because the driver did 
not authorize commit
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:356)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1723)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1284)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1282)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.spark.executor.CommitDeniedException: 
attempt_201609260213_1856_m_000022_954076: Not committed because the driver did 
not authorize commit
        at 
org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:135)
        at 
org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:142)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1311)
        at 
org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1284)
        ... 11 more



> Failed to saveAsHadoop when speculate is enabled
> ------------------------------------------------
>
>                 Key: SPARK-17664
>                 URL: https://issues.apache.org/jira/browse/SPARK-17664
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.6.1
>            Reporter: sharkd tu
>
> From follow logs, task 22 has failed 4 times because of "the driver did not 
> authorize commit". But the strange thing was that I could't find task 22.1. 
> Why? Maybe some synchronization error?
> {noformat}
> 16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.0 in stage 1856.0 (TID 
> 953902) on executor 10.196.131.13: java.security.PrivilegedActionException 
> (null) [duplicate 4]
> 16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
> 10.196.131.13) as speculatable because it ran more than 5601 ms
> 16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.2 in stage 1856.0 
> (TID 954074, 10.215.143.14, partition 22,PROCESS_LOCAL, 2163 bytes)
> 16/09/26 02:14:18 INFO TaskSetManager: Lost task 22.2 in stage 1856.0 (TID 
> 954074) on executor 10.215.143.14: java.security.PrivilegedActionException 
> (null) [duplicate 5]
> 16/09/26 02:14:18 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
> 10.196.131.13) as speculatable because it ran more than 5601 ms
> 16/09/26 02:14:18 INFO TaskSetManager: Starting task 22.3 in stage 1856.0 
> (TID 954075, 10.196.131.28, partition 22,PROCESS_LOCAL, 2163 bytes)
> 16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.3 in stage 1856.0 (TID 
> 954075) on executor 10.196.131.28: java.security.PrivilegedActionException 
> (null) [duplicate 6]
> 16/09/26 02:14:19 INFO TaskSetManager: Marking task 22 in stage 1856.0 (on 
> 10.196.131.13) as speculatable because it ran more than 5601 ms
> 16/09/26 02:14:19 INFO TaskSetManager: Starting task 22.4 in stage 1856.0 
> (TID 954076, 10.215.153.225, partition 22,PROCESS_LOCAL, 2163 bytes)
> 16/09/26 02:14:19 INFO TaskSetManager: Lost task 22.4 in stage 1856.0 (TID 
> 954076) on executor 10.215.153.225: java.security.PrivilegedActionException 
> (null) [duplicate 7]
> 16/09/26 02:14:19 ERROR TaskSetManager: Task 22 in stage 1856.0 failed 4 
> times; aborting job
> 16/09/26 02:14:19 INFO YarnClusterScheduler: Cancelling stage 1856
> 16/09/26 02:14:19 INFO YarnClusterScheduler: Stage 1856 was cancelled
> 16/09/26 02:14:19 INFO DAGScheduler: ResultStage 1856 (saveAsHadoopFile at 
> TDWProvider.scala:514) failed in 23.049 s
> 16/09/26 02:14:19 INFO DAGScheduler: Job 76 failed: saveAsHadoopFile at 
> TDWProvider.scala:514, took 69.865181 s
> 16/09/26 02:14:19 ERROR ApplicationMaster: User class threw exception: 
> java.security.PrivilegedActionException: org.apache.spark.SparkException: Job 
> aborted due to stage failure: Task 22 in stage 1856.0 failed 4 times, most 
> recent failure: Lost task 22.4 in stage 1856.0 (TID 954076, 10.215.153.225): 
> java.security.PrivilegedActionException: 
> org.apache.spark.executor.CommitDeniedException: 
> attempt_201609260213_1856_m_000022_954076: Not committed because the driver 
> did not authorize commit
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:356)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1723)
>       at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1284)
>       at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1282)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>       at org.apache.spark.scheduler.Task.run(Task.scala:89)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> Caused by: org.apache.spark.executor.CommitDeniedException: 
> attempt_201609260213_1856_m_000022_954076: Not committed because the driver 
> did not authorize commit
>       at 
> org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:135)
>       at 
> org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:142)
>       at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1311)
>       at 
> org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anon$4.run(PairRDDFunctions.scala:1284)
>       ... 11 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to