[jira] [Commented] (SPARK-11328) Correctly propagate error message in the case of failures when writing parquet

2015-12-01 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034847#comment-15034847
 ] 

Apache Spark commented on SPARK-11328:
--

User 'nongli' has created a pull request for this issue:
https://github.com/apache/spark/pull/10080

> Correctly propagate error message in the case of failures when writing parquet
> --
>
> Key: SPARK-11328
> URL: https://issues.apache.org/jira/browse/SPARK-11328
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Yin Huai
> Assignee: Nong Li
> Priority: Critical
>
> When saving data to S3 (e.g. saving to parquet), if there is an error during
> the query execution, the partial file generated by the failed task will be
> uploaded to S3, and retries of this task will throw a "file already exists"
> error. This is very confusing to users because they may think that the "file
> already exists" error is what caused the job to fail. They can only find the
> real error in the Spark UI (on the stage page).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-11328) Correctly propagate error message in the case of failures when writing parquet

2015-10-26 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975361#comment-14975361
 ] 

Yin Huai commented on SPARK-11328:
--

The "file already exists" error is thrown from [this line | 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala#L237]
 when we try to create a record writer.
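
To make the failure mode concrete, here is a minimal sketch in plain Python. It is not Spark code: write_task and run_with_retries are hypothetical names, a local temporary directory stands in for S3, and exclusive-create mode ("x") stands in for a record writer that refuses to overwrite existing output. It only shows how the first attempt's real error can be followed by unrelated "file already exists" errors on retry.

```python
import os
import tempfile


def write_task(path, rows):
    """Hypothetical task attempt: create the output file, then write rows."""
    # Mode "x" raises FileExistsError if the file is already there,
    # mimicking a writer that refuses to overwrite existing output.
    with open(path, "x") as out:
        for row in rows:
            if row is None:
                raise ValueError("bad record")  # the real failure
            out.write(f"{row}\n")


def run_with_retries(path, rows, max_attempts=3):
    """Retry the task and collect the error raised by each failed attempt."""
    errors = []
    for _ in range(max_attempts):
        try:
            write_task(path, rows)
            break
        except Exception as exc:
            errors.append(exc)
    return errors


with tempfile.TemporaryDirectory() as tmp:
    errors = run_with_retries(os.path.join(tmp, "part-00000"), ["a", None, "b"])

error_names = [type(e).__name__ for e in errors]
print(error_names)  # ['ValueError', 'FileExistsError', 'FileExistsError']
```

Only the first entry is the real cause. A caller that reports just the last failure would show FileExistsError, which is the confusion described in this issue.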







[jira] [Commented] (SPARK-11328) Correctly propagate error message in the case of failures when writing parquet

2015-10-26 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975556#comment-14975556
 ] 

Yin Huai commented on SPARK-11328:
--

[~nongli] It looks like this issue is also related to 
DirectParquetOutputCommitter. Right now, its abortTask method is a no-op.
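
As a rough illustration of why a no-op abort matters, the following plain-Python sketch compares two hypothetical committers. NoOpCommitter, CleaningCommitter, and run_attempts are made-up names for this example and are not Spark or Hadoop APIs; a local temporary directory stands in for the output location. It is a sketch of the general idea under those assumptions, not a description of how the issue was actually fixed.

```python
import os
import tempfile


class NoOpCommitter:
    """Hypothetical committer whose abort step leaves partial output behind."""

    def abort_task(self, path):
        pass


class CleaningCommitter:
    """Hypothetical committer whose abort step deletes partial output."""

    def abort_task(self, path):
        if os.path.exists(path):
            os.remove(path)


def run_attempts(committer, path, rows, attempts=2):
    """Run the same failing task repeatedly; return each attempt's error type."""
    seen = []
    for _ in range(attempts):
        try:
            with open(path, "x") as out:  # refuses to overwrite existing output
                for row in rows:
                    if row is None:
                        raise ValueError("bad record")  # the real failure
                    out.write(f"{row}\n")
        except Exception as exc:
            committer.abort_task(path)
            seen.append(type(exc).__name__)
    return seen


with tempfile.TemporaryDirectory() as tmp:
    rows = ["a", None]
    noop_errors = run_attempts(NoOpCommitter(), os.path.join(tmp, "noop"), rows)
    clean_errors = run_attempts(CleaningCommitter(), os.path.join(tmp, "clean"), rows)

print(noop_errors)   # ['ValueError', 'FileExistsError']
print(clean_errors)  # ['ValueError', 'ValueError']
```

With the no-op abort, the retry fails on the leftover file; with cleanup on abort, the retry fails with the original error again, so the real cause stays visible.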



