[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/6833


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-21 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113870538
  
LGTM, thanks for fixing this! Merging to master and branch-1.4.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-21 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113870521
  
@andrewor14 They are not the same. #6864 affects the dynamic partitioning
feature of external data sources, while this one is about dynamic partitions
in Hive.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113351488
  
I think this PR is for HiveQL and #6864 is for common SQL.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113312380
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113312346
  
[Test build #35171 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35171/console) for PR 6833 at commit [`64bbfab`](https://github.com/apache/spark/commit/64bbfab33d748cce3cb1dbad55a86c3991d99899).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113289891
  
[Test build #35171 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35171/consoleFull) for PR 6833 at commit [`64bbfab`](https://github.com/apache/spark/commit/64bbfab33d748cce3cb1dbad55a86c3991d99899).





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113289226
  
Merged build started.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113289197
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-18 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-113287867
  
ok to test. Is this issue the same as the one reported in #6864? @liancheng 





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112946466
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112946390
  
[Test build #35053 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35053/console) for PR 6833 at commit [`64bbfab`](https://github.com/apache/spark/commit/64bbfab33d748cce3cb1dbad55a86c3991d99899).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112914955
  
[Test build #35053 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35053/consoleFull) for PR 6833 at commit [`64bbfab`](https://github.com/apache/spark/commit/64bbfab33d748cce3cb1dbad55a86c3991d99899).





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112914749
  
Merged build started.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112914725
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-17 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112914495
  
ok to test





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread jeanlyn
Github user jeanlyn commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112636409
  
@chenghao-intel, I think it only affects dynamic partitions, because
`SparkHadoopWriter` gets its writer from `OutputFormat.getRecordWriter`, and
most output formats use `FileOutputFormat.getTaskOutputPath` to get the path.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread jeanlyn
Github user jeanlyn commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32592438
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala ---
@@ -230,7 +230,15 @@ private[spark] class SparkHiveDynamicPartitionWriterContainer(
   val path = {
     val outputPath = FileOutputFormat.getOutputPath(conf.value)
--- End diff --

Oh, I'll try it later.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112627047
  
I also hit this issue when using dynamic partitions in HiveContext.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112620496
  
This seems to affect only dynamic partitions in HiveContext. @jeanlyn, can
you confirm that?





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32589270
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala ---
@@ -230,7 +230,15 @@ private[spark] class SparkHiveDynamicPartitionWriterContainer(
   val path = {
     val outputPath = FileOutputFormat.getOutputPath(conf.value)
--- End diff --

+1





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32523129
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveWriterContainers.scala ---
@@ -230,7 +230,15 @@ private[spark] class SparkHiveDynamicPartitionWriterContainer(
   val path = {
     val outputPath = FileOutputFormat.getOutputPath(conf.value)
--- End diff --

I think we just need to replace `FileOutputFormat.getOutputPath` with
`FileOutputFormat.getTaskOutputPath`, because `FileOutputFormat.getTaskOutputPath`
will return `$outputPath/_temporary/${attemptId}/`.
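
A minimal sketch of the path layout described in this comment, written in plain Scala with no Hadoop dependency. The object `AttemptPaths` and its helpers `attemptDir` and `attemptFile` are hypothetical names used only for illustration; they are not part of Spark or Hadoop, and the layout simply follows the `$outputPath/_temporary/${attemptId}/` shape quoted above rather than any specific Hadoop version.

```scala
import java.nio.file.Path

object AttemptPaths {
  // Hypothetical helper: a directory private to one task attempt, following
  // the "$outputPath/_temporary/${attemptId}/" layout quoted in the comment.
  def attemptDir(outputPath: Path, attemptId: String): Path =
    outputPath.resolve("_temporary").resolve(attemptId)

  // Hypothetical helper: the file one attempt would write for a given
  // dynamic partition, placed under that attempt's private directory.
  def attemptFile(
      outputPath: Path,
      attemptId: String,
      partition: String,
      fileName: String): Path =
    attemptDir(outputPath, attemptId).resolve(partition).resolve(fileName)
}
```

Under this assumed layout, two attempts of the same task that target the same partition and file name resolve to different paths, because the attempt id is part of the directory.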





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-16 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32516707
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -197,7 +197,6 @@ case class InsertIntoHiveTable(
   table.hiveQlTable.getPartCols().foreach { entry =>
     orderedPartitionSpec.put(entry.getName, partitionSpec.get(entry.getName).getOrElse(""))
   }
-  val partVals = MetaStoreUtils.getPvals(table.hiveQlTable.getPartCols, partitionSpec)

--- End diff --

Yes, I think you are right.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-15 Thread jeanlyn
Github user jeanlyn commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32492419
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -197,7 +197,6 @@ case class InsertIntoHiveTable(
   table.hiveQlTable.getPartCols().foreach { entry =>
     orderedPartitionSpec.put(entry.getName, partitionSpec.get(entry.getName).getOrElse(""))
   }
-  val partVals = MetaStoreUtils.getPvals(table.hiveQlTable.getPartCols, partitionSpec)

--- End diff --

I think
https://github.com/apache/spark/pull/5876/files#diff-d579db9a8f27e0bbef37720ab14ec3f6L203
should have removed this code. @marmbrus, right?





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-15 Thread jeanlyn
Github user jeanlyn commented on a diff in the pull request:

https://github.com/apache/spark/pull/6833#discussion_r32491951
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
@@ -197,7 +197,6 @@ case class InsertIntoHiveTable(
   table.hiveQlTable.getPartCols().foreach { entry =>
     orderedPartitionSpec.put(entry.getName, partitionSpec.get(entry.getName).getOrElse(""))
   }
-  val partVals = MetaStoreUtils.getPvals(table.hiveQlTable.getPartCols, partitionSpec)

--- End diff --

This code seems to be never used, so I removed it.





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6833#issuecomment-112259097
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-8379][SQL]avoid speculative tasks write...

2015-06-15 Thread jeanlyn
GitHub user jeanlyn opened a pull request:

https://github.com/apache/spark/pull/6833

[SPARK-8379][SQL]avoid speculative tasks write to the same file

Issue link:
[SPARK-8379](https://issues.apache.org/jira/browse/SPARK-8379)
Currently, when we insert data into a dynamic partition with speculative
tasks, we get the following exception:
```
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
Lease mismatch on /tmp/hive-jeanlyn/hive_2015-06-15_15-20-44_734_8801220787219172413-1/-ext-1/ds=2015-06-15/type=2/part-00301.lzo
owned by DFSClient_attempt_201506031520_0011_m_000189_0_-1513487243_53
but is accessed by DFSClient_attempt_201506031520_0011_m_42_0_-1275047721_57
```
This PR writes the data to a temporary directory when using dynamic
partitions, so that speculative tasks do not write to the same file.
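
The idea can be illustrated with a small, self-contained sketch on the local filesystem. This is not the code in this PR and it does not use the Hadoop or Spark APIs; `SpeculativeWriteSketch`, `writeAttempt`, and `commitAttempt` are hypothetical names, and the `_temporary/<attemptId>` layout is an assumption chosen for illustration. Each attempt writes under its own directory, and only the attempt chosen as the winner is moved to the final location.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

object SpeculativeWriteSketch {
  // Each attempt writes under its own directory, so two attempts of the
  // same task never open the same file (assumed layout, for illustration).
  def writeAttempt(outputDir: Path, attemptId: String, relFile: String, data: String): Path = {
    val target = outputDir.resolve("_temporary").resolve(attemptId).resolve(relFile)
    Files.createDirectories(target.getParent)
    Files.write(target, data.getBytes(StandardCharsets.UTF_8))
    target
  }

  // Only the winning attempt's file is moved to the final location;
  // relFile is expected to include the partition directories.
  def commitAttempt(outputDir: Path, attemptId: String, relFile: String): Path = {
    val src = outputDir.resolve("_temporary").resolve(attemptId).resolve(relFile)
    val dst = outputDir.resolve(relFile)
    Files.createDirectories(dst.getParent)
    Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING)
  }
}
```

With this arrangement, an original task and its speculative copy can both run to completion without touching each other's files, and the final partition file only appears once one attempt is committed.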

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jeanlyn/spark speculation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6833


commit e19a3bd77b6b9f44479e51659e244e9809b2963d
Author: jeanlyn 
Date:   2015-06-15T16:38:16Z

avoid speculative tasks write same file



