[ https://issues.apache.org/jira/browse/SPARK-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-8513:
------------------------------
    Description: 
To reproduce this issue, we need a node with a relatively large number of cores, 
say 32 (e.g., the Spark Jenkins builder is a good candidate).  With such a node, 
the following code should reproduce the issue fairly easily:
{code}
sqlContext.range(0, 10).repartition(32).select('id / 0).write.mode("overwrite").parquet("file:///tmp/foo")
{code}
You may observe log lines similar to the following:
{noformat}
01:58:27.682 pool-1-thread-1-ScalaTest-running-CommitFailureTestRelationSuite WARN FileUtil: Failed to delete file or dir [/home/jenkins/workspace/SparkPullRequestBuilder/target/tmp/spark-a918b285-fa59-4a29-857e-a95e38fa355a/_temporary/0/_temporary]: it still exists.
{noformat}
The reason is that, for a Spark job with multiple tasks, when a task fails 
after multiple retries, the job gets canceled on the driver side.  At the same 
time, all the remaining tasks of this job also get canceled.  However, task 
cancellation is asynchronous, which means some tasks may still be running when 
the job has already been killed on the driver side.
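
To make the asynchrony concrete, here is a tiny Spark-free sketch (all names such as {{AsyncCancelSketch}} and the {{killed}} flag are hypothetical, for illustration only, not Spark scheduler code): the "driver" flips a cancellation flag and returns immediately, while the "task" only notices the flag at its next check and may keep running in the meantime.
{code}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical illustration: cancellation is just a flag the task polls,
// so the cancel request returns before the task actually stops.
object AsyncCancelSketch extends App {
  val killed = new AtomicBoolean(false)

  val task = new Thread(new Runnable {
    override def run(): Unit = {
      while (!killed.get()) Thread.sleep(10)   // keeps working until it notices the flag
      println("task: observed cancellation and stopped")
    }
  })
  task.start()

  Thread.sleep(30)                             // the task is running
  killed.set(true)                             // "cancel the job": returns immediately
  println("driver: cancellation requested; the task may still be running at this point")
  task.join()
}
{code}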

With this in mind, the following execution order may cause the log line 
mentioned above:

# Job {{A}} spawns 32 tasks to write the Parquet file
  Since {{ParquetOutputCommitter}} is a subclass of {{FileOutputCommitter}}, a 
temporary directory {{D1}} is created to hold the output files of the different 
task attempts.
# Task {{a1}} is the first to fail, after several retries, because of the 
division-by-zero error
# Task {{a1}} aborts the Parquet write task and tries to remove its task 
attempt output directory {{d1}} (a sub-directory of {{D1}})
# Job {{A}} gets canceled on the driver side, and all the other 31 tasks also get 
canceled *asynchronously*
# {{ParquetOutputCommitter.abortJob()}} tries to remove {{D1}} by first 
removing all of its child files/directories
  Note that when testing with a local directory, {{RawLocalFileSystem}} simply 
calls {{java.io.File.delete()}} to perform the deletion, and this call can only 
delete empty directories (a sketch of this interleaving follows the list).
# Because tasks are canceled asynchronously, some other task, say {{a2}}, may 
have just been scheduled and creates its own task attempt directory {{d2}} under {{D1}}
# Now {{ParquetOutputCommitter.abortJob()}} finally tries to remove {{D1}} 
itself, but fails because {{d2}} has made {{D1}} non-empty again
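
The following Spark-free sketch reproduces steps 5 to 7 above deterministically (the object, directory, and latch names are hypothetical, for illustration only): one thread plays {{abortJob()}}, removing {{D1}}'s children before {{D1}} itself with {{java.io.File.delete()}}, while another thread plays the straggler task that creates {{d2}} in between, so the final delete of {{D1}} fails.
{code}
import java.io.File
import java.util.concurrent.CountDownLatch

// Hypothetical illustration of the interleaving above, not Spark code.
object TemporaryDirRaceSketch extends App {
  val d1 = new File(sys.props("java.io.tmpdir"), "race-demo-" + System.nanoTime())  // plays D1
  new File(d1, "attempt_1").mkdirs()                                                // plays d1

  val childrenCleared = new CountDownLatch(1)
  val stragglerDone   = new CountDownLatch(1)

  // The straggler task a2: scheduled after cleanup has started, it creates d2 under D1.
  val straggler = new Thread(new Runnable {
    override def run(): Unit = {
      childrenCleared.await()
      new File(d1, "attempt_2").mkdirs()       // plays d2: makes D1 non-empty again
      stragglerDone.countDown()
    }
  })
  straggler.start()

  // abortJob(): remove D1's children first, then D1 itself.
  d1.listFiles().foreach(_.delete())           // attempt_1 is empty, so this succeeds
  childrenCleared.countDown()
  stragglerDone.await()                        // force the unlucky interleaving
  val deleted = d1.delete()                    // java.io.File.delete() only removes empty dirs
  println("Deleted D1? " + deleted + ", still exists: " + d1.exists())  // false, true
  straggler.join()
}
{code}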

Notice that this bug affects all Spark jobs that write files with 
{{FileOutputCommitter}} and those of its subclasses that create temporary directories.

One possible way to fix this issue is to make task cancellation synchronous, 
but this would also increase latency.



> _temporary may be left undeleted when a write job committed with 
> FileOutputCommitter fails due to a race condition
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-8513
>                 URL: https://issues.apache.org/jira/browse/SPARK-8513
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 1.2.2, 1.3.1, 1.4.0
>            Reporter: Cheng Lian
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
