[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-04-14 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-04-13 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/17364
  
merged to master

sorry I forgot to take look at this for a while @steveloughran, thanks for 
the reminder


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-04-10 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
@squito Is this ready to go in? Like I warned, I'm not going to add tests 
for this, not on its own


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-29 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
I don't have a time/plans to do the test here, as it's a fairly complex 
piece of test setup for what a review should show isn't doing anything other 
than guarantee the outcome pf `currentWriter==null` out of all sequential 
execution paths


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-21 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
looking some more, yes,  as `tryWithSafeFinallyAndFailureCallbacks` wraps 
task commit, it guarantees that the original cause doesn't get lost. The 
abortJob code isn't so well guarded, and looks like a failure there my hide a 
previous one (like a commitJob failure).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-21 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/17364
  
> Note that as the exception handler tries to close resources before 
calling committer.abortTask(taskAttemptContext), without this patch a failing 
releaseResources() means that abortTask() isn't invoked, though hopefully 
abortJob will handle tasks too

it doesn't look that way to me -- the `abortTask` is in a `finally`, so it 
should get called no matter what.

You could add a very targeted regression test -- create the 
`ExecuteWriteTask` with mock OutputFormats, have those mocks throw an exception 
on `close`.  Then make sure two calls to `releaseResources` work correctly.  
Your change is obviously correct, but would help avoid future regressions.

in any case, lgtm


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-21 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
Created [SPARK-20045](https://issues.apache.org/jira/browse/SPARK-20045). I 
think there's room to improve resilience in the abort code, primarily to ensure 
that the underlying failure cause doesn't get lost. The codepath there is 
fairly complex  and I'm not going to point at a snippet and say "here". Some 
faulting mock  committer would probably be the actual first step: show the 
problems, then fix.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-21 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
I haven't reviewed that bit of code: make it a separate JIRA and assign to 
me. This one I came across in the HADOOP-2.8.0 RC3 testing; the underlying fix 
there is going in, but the spark code still should be made more resilient to 
failure here. It's always those failure modes which get you —and working with 
S3, that close() can be where the PUT is initiated, so P(fail) > 0.

 Part of [HADOOP-13786] 
(https://issues.apache.org/jira/browse/HADOOP-13786) is MAPREDUCE-6823: making 
it straighforward to define a different implementation of the FileOutputFormat 
committer. That should make it easier to do some fault injection in the commit 
processes, especially all those bits that violate the state machine entirely. 
I'll see what I can do about breaking things :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/17364
  
@steveloughran, maybe this is strictly not related with the problem 
specified in JIRA but do you know if we should do the same thing to 
`SparkHadoopMapReduceWriter`? I remember I had to fix the resource related 
problem identically with `FileFormatWriter` before later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17364
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17364
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74903/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17364
  
**[Test build #74903 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74903/testReport)**
 for PR 17364 at commit 
[`725740b`](https://github.com/apache/spark/commit/725740b49a2b37392699092b1b0e08c63a6152ff).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17364: [SPARK-20038] [SQL]: FileFormatWriter.ExecuteWriteTask.r...

2017-03-20 Thread steveloughran
Github user steveloughran commented on the issue:

https://github.com/apache/spark/pull/17364
  
Note that as [the exception 
handler](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala#L244)
 tries to close resources before calling 
`committer.abortTask(taskAttemptContext)`, without this patch a failing 
`releaseResources()` means that ` abortTask()` isn't invoked, though hopefully 
abortJob will handle tasks too


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org