[GitHub] spark issue #18892: [SPARK-21520][SQL]Improvement a special case for non-det...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18892 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18892: [SPARK-21520][SQL]Improvement a special case for non-det...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18892 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80764/ Test PASSed.
[GitHub] spark issue #18892: [SPARK-21520][SQL]Improvement a special case for non-det...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18892 **[Test build #80764 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80764/testReport)** for PR 18892 at commit [`72e0252`](https://github.com/apache/spark/commit/72e0252bb2d3a9c7d43ed8756d8d7ea34fb80ca5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18965 I think the more important question is: is it a protocol that we are guaranteeing? I'm not sure it is. If not, then I don't think it should be in user-facing docs.
[GitHub] spark issue #18648: [SPARK-21428] Turn IsolatedClientLoader off while using ...
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/18648 @cloud-fan
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133631784 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala --- @@ -68,4 +68,25 @@ class QueryPartitionSuite extends QueryTest with SQLTestUtils with TestHiveSingl sql("DROP TABLE IF EXISTS createAndInsertTest") } } + + test("SPARK-21739: Cast expression should initialize timezoneId " + --- End diff -- Oh, it should select the TimestampType column. Thanks for the reminder, I will fix it.
[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18953 @cloud-fan . The PR is updated. Now, it's minimized as +493 and −247 lines.
[GitHub] spark issue #18953: [SPARK-20682][SQL] Update ORC data source based on Apach...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18953 **[Test build #80771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80771/testReport)** for PR 18953 at commit [`80c80f3`](https://github.com/apache/spark/commit/80c80f34eb4dfb7c94d7875438effab52c71575d).
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18959 **[Test build #80770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80770/testReport)** for PR 18959 at commit [`973402b`](https://github.com/apache/spark/commit/973402bd822c05f8895405fbcaf918edbaad9d23).
[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18965 I see. Anyway, this is what I found when I dug into the wire protocol of Spark RPC, since the wire format is a big part of understanding the message structure. If someone thinks this is not necessary, I can close the PR.
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18959 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80763/ Test PASSed.
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18959 Merged build finished. Test PASSed.
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18959 **[Test build #80763 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80763/testReport)** for PR 18959 at commit [`b33fde8`](https://github.com/apache/spark/commit/b33fde86ecd6a0be5f4a55c408ab10e0ac44101a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq out...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/18959#discussion_r133629453 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruningSuite.scala --- @@ -162,7 +162,13 @@ class PruningSuite extends HiveComparisonTest with BeforeAndAfter { }.head assert(actualOutputColumns === expectedOutputColumns, "Output columns mismatch") - assert(actualScannedColumns === expectedScannedColumns, "Scanned columns mismatch") + + // Scanned columns in `HiveTableScanExec` are generated by the `pruneFilterProject` method + // in `SparkPlanner` that internally uses `AttributeSet.toSeq`. + // Since we change an output order of `AttributeSet.toSeq` in SPARK-18394, + // we need to sort column names for a test below. --- End diff -- Looks good, I'll update soon.
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18968 Merged build finished. Test FAILed.
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18968 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80765/ Test FAILed.
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18968 **[Test build #80765 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80765/testReport)** for PR 18968 at commit [`4604a08`](https://github.com/apache/spark/commit/4604a08e390019f7c3952774dd1b9086be9f2680). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq out...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18959#discussion_r133628398 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruningSuite.scala --- @@ -162,7 +162,13 @@ class PruningSuite extends HiveComparisonTest with BeforeAndAfter { }.head assert(actualOutputColumns === expectedOutputColumns, "Output columns mismatch") - assert(actualScannedColumns === expectedScannedColumns, "Scanned columns mismatch") + + // Scanned columns in `HiveTableScanExec` are generated by the `pruneFilterProject` method + // in `SparkPlanner` that internally uses `AttributeSet.toSeq`. + // Since we change an output order of `AttributeSet.toSeq` in SPARK-18394, + // we need to sort column names for a test below. --- End diff -- How about? > Scanned columns in `HiveTableScanExec` are generated by the `pruneFilterProject` method in `SparkPlanner`. This method internally uses `AttributeSet.toSeq`, in which the returned output columns are sorted by the names and expression ids.
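The ordering contract under discussion can be modeled with a tiny sketch (plain Scala, not Spark's actual `AttributeSet`; the `Attr` case class and `AttrOrdering` object here are hypothetical): sorting attributes by name first, then by expression id, gives `toSeq` a deterministic output regardless of the underlying set's iteration order.

```scala
// Hypothetical simplified model of the deterministic ordering discussed above;
// Spark's real AttributeSet is considerably more involved.
case class Attr(name: String, exprId: Long)

object AttrOrdering {
  // Sort by name, then by expression id, so the resulting sequence is
  // stable even though Set iteration order is unspecified.
  def toOrderedSeq(attrs: Set[Attr]): Seq[Attr] =
    attrs.toSeq.sortBy(a => (a.name, a.exprId))
}
```

With such an ordering in place, a test comparing scanned columns no longer depends on hash-set iteration order, which is the fragility the `PruningSuite` comment is working around.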
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...
Github user neoremind commented on the issue: https://github.com/apache/spark/pull/18964 Not yet, since it is OK to keep the buffer size at the default system value; but to keep it consistent when a user would like to specify it, this makes sense. I also noticed that Spark RPC by default uses Java native serialization: even a request verifying whether an endpoint exists costs about 1 KB of payload, not to mention real endpoint logic, so in the real world it might be useful to profile this. I suggest that providing more RPC monitoring logs or hooks would be beneficial; anyway, that should be discussed in another thread.
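For readers unfamiliar with the option the PR title refers to: `SO_RCVBUF` is an OS-level socket receive-buffer hint. Spark's RPC layer would set it through Netty channel options; the sketch below uses plain `java.net.Socket` from the standard library purely for illustration, and the requested size is an arbitrary example value.

```scala
// Minimal illustration of the SO_RCVBUF socket option (not Spark's actual
// configuration path, which goes through Netty). The OS treats the requested
// size as a hint and may round it up or cap it.
import java.net.Socket

object RcvBufSketch {
  def main(args: Array[String]): Unit = {
    val socket = new Socket()
    try {
      socket.setReceiveBufferSize(256 * 1024) // request a 256 KB receive buffer
      val effective = socket.getReceiveBufferSize // what the OS actually granted
      println(s"requested=262144 effective=$effective")
    } finally {
      socket.close()
    }
  }
}
```

Because the kernel may silently adjust the value, code that cares about the effective size should read it back rather than trust the requested number.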
[GitHub] spark issue #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarchy to ma...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18958 also cc @kiszk for another column vector pr.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133626997 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java --- @@ -505,18 +500,12 @@ public void filterNullsInColumn(int ordinal) { nullFilteredColumns.add(ordinal); } - private ColumnarBatch(StructType schema, int maxRows, MemoryMode memMode) { + public ColumnarBatch(StructType schema, ColumnVector[] columns, int capacity) { this.schema = schema; -this.capacity = maxRows; -this.columns = new ColumnVector[schema.size()]; +this.columns = columns; +this.capacity = capacity; --- End diff -- I found some places referring to `ColumnarBatch.capacity()`, so I'd rather be a little conservative about doing that for now.
[GitHub] spark pull request #18955: [SPARK-21743][SQL] top-most limit should not caus...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18955
[GitHub] spark issue #18967: [SQL] [MINOR] [TEST] Set spark.unsafe.exceptionOnMemoryL...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18967 +1 for updating pom and SparkBuild.scala.
[GitHub] spark issue #18955: [SPARK-21743][SQL] top-most limit should not cause memor...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18955 Thanks! Merging to master.
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18968 **[Test build #80769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80769/testReport)** for PR 18968 at commit [`f5d8ebb`](https://github.com/apache/spark/commit/f5d8ebb20ef73c115118b062c57f0f1372f672a2).
[GitHub] spark issue #18936: [SPARK-21688][ML][MLLIB] make native BLAS the first choi...
Github user VinceShieh commented on the issue: https://github.com/apache/spark/pull/18936 Thanks, Sean and Nick. To @srowen: I think the difference is the finding from our previous investigation that the thread setting in native BLAS impacts the overall performance of a method/algorithm. To @MLnick: Agreed. We know this PR demands a certain amount of benchmark work; since the changes are at a low level of the stack, they will impact several methods, and there are also other native BLAS implementations besides MKL. So we took SVM as an example to show what we might gain from it. Also, given that mllib is only in maintenance mode, please let us know if such a change is not worth the work required. Thanks!
[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18887 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80757/ Test PASSed.
[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18887 Merged build finished. Test PASSed.
[GitHub] spark issue #18956: [SPARK-21726][SQL] Check for structural integrity of the...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18956 The PR that fixes the issue in `PullupCorrelatedPredicates` is submitted at #18968.
[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18887 **[Test build #80757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80757/testReport)** for PR 18887 at commit [`dc642bd`](https://github.com/apache/spark/commit/dc642bd70042da965387916656747ae78acdc192). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18930 **[Test build #80768 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80768/testReport)** for PR 18930 at commit [`5191ed4`](https://github.com/apache/spark/commit/5191ed48a57017b3eeb3336e7ffa4a823dca5c28).
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18968 **[Test build #80767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80767/testReport)** for PR 18968 at commit [`4a47393`](https://github.com/apache/spark/commit/4a47393e4605790c4cdf1a33639cd6595cd35ba8).
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18930 retest this please.
[GitHub] spark issue #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarchy to ma...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18958 **[Test build #80766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80766/testReport)** for PR 18958 at commit [`b6ab633`](https://github.com/apache/spark/commit/b6ab63359e00d7fe0175204f191ff1baa10b789f).
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18930 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80758/ Test FAILed.
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18930 Merged build finished. Test FAILed.
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18930 **[Test build #80758 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80758/testReport)** for PR 18930 at commit [`5191ed4`](https://github.com/apache/spark/commit/5191ed48a57017b3eeb3336e7ffa4a823dca5c28). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133625043

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala ---
@@ -68,4 +68,25 @@ class QueryPartitionSuite extends QueryTest with SQLTestUtils with TestHiveSingl
       sql("DROP TABLE IF EXISTS createAndInsertTest")
     }
   }
+
+  test("SPARK-21739: Cast expression should initialize timezoneId " +
+    "when it is called statically to convert something into TimestampType") {
+    // create table for test
+    sql("CREATE TABLE table_with_timestamp_partition(value int) PARTITIONED by (ts timestamp)")
+    sql("INSERT OVERWRITE TABLE table_with_timestamp_partition " +
+      "partition (ts = '2010-01-01 00:00:00.000') VALUES (1)")
+    sql("INSERT OVERWRITE TABLE table_with_timestamp_partition " +
+      "partition (ts = '2010-01-02 00:00:00.000') VALUES (2)")
+
+    // test for Cast expression in TableReader
+    checkAnswer(sql("select value from table_with_timestamp_partition"),
+      Seq(Row(1), Row(2)))
+
+    // test for Cast expression in HiveTableScanExec
+    checkAnswer(sql("select value from table_with_timestamp_partition " +
+      "where ts = '2010-01-02 00:00:00.000'"), Row(2))
+
+    sql("DROP TABLE IF EXISTS table_with_timestamp_partition")
--- End diff --

Use `withTable`. You can check how we do it in the other test cases.
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133625007

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala ---
@@ -68,4 +68,25 @@ class QueryPartitionSuite extends QueryTest with SQLTestUtils with TestHiveSingl
       sql("DROP TABLE IF EXISTS createAndInsertTest")
     }
   }
+
+  test("SPARK-21739: Cast expression should initialize timezoneId " +
--- End diff --

This test can pass without the change in `TableReader.scala`. We need another test case.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15435 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80759/ Test FAILed.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15435 Merged build finished. Test FAILed.
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15435 **[Test build #80759 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80759/testReport)** for PR 15435 at commit [`a041ea2`](https://github.com/apache/spark/commit/a041ea22d403b0befb6cade619ebfa5251658aba).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18960 Merged build finished. Test PASSed.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18960 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80760/ Test PASSed.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18960 **[Test build #80760 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80760/testReport)** for PR 18960 at commit [`a264e3a`](https://github.com/apache/spark/commit/a264e3aa166d2e83832a82489669893f41ff9749).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133622586

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala ---
@@ -89,14 +91,23 @@ class VectorizedHashMapGenerator(
       |$generatedAggBufferSchema
       |
       |  public $generatedClassName() {
-      |    batch = org.apache.spark.sql.execution.vectorized.ColumnarBatch.allocate(schema,
-      |      org.apache.spark.memory.MemoryMode.ON_HEAP, capacity);
-      |    // TODO: Possibly generate this projection in HashAggregate directly
-      |    aggregateBufferBatch = org.apache.spark.sql.execution.vectorized.ColumnarBatch.allocate(
-      |      aggregateBufferSchema, org.apache.spark.memory.MemoryMode.ON_HEAP, capacity);
-      |    for (int i = 0 ; i < aggregateBufferBatch.numCols(); i++) {
-      |      aggregateBufferBatch.setColumn(i, batch.column(i+${groupingKeys.length}));
+      |    batchVectors = new org.apache.spark.sql.execution.vectorized
+      |      .OnHeapColumnVector[schema.fields().length];
+      |    for (int i = 0; i < schema.fields().length; i++) {
+      |      batchVectors[i] = new org.apache.spark.sql.execution.vectorized.OnHeapColumnVector(
+      |        capacity, schema.fields()[i].dataType());
+      |    }
+      |    batch = new org.apache.spark.sql.execution.vectorized.ColumnarBatch(
+      |      schema, batchVectors, capacity);
+      |
+      |    bufferVectors = new org.apache.spark.sql.execution.vectorized
+      |      .OnHeapColumnVector[aggregateBufferSchema.fields().length];
+      |    for (int i = 0; i < aggregateBufferSchema.fields().length; i++) {
+      |      bufferVectors[i] = batchVectors[i + ${groupingKeys.length}];
       |    }
+      |    // TODO: Possibly generate this projection in HashAggregate directly
--- End diff --

I'm sorry, but I'm not sure because this is from the original code.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133622350

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala ---
@@ -89,14 +91,23 @@ class VectorizedHashMapGenerator(
       |$generatedAggBufferSchema
       |
       |  public $generatedClassName() {
-      |    batch = org.apache.spark.sql.execution.vectorized.ColumnarBatch.allocate(schema,
-      |      org.apache.spark.memory.MemoryMode.ON_HEAP, capacity);
-      |    // TODO: Possibly generate this projection in HashAggregate directly
-      |    aggregateBufferBatch = org.apache.spark.sql.execution.vectorized.ColumnarBatch.allocate(
-      |      aggregateBufferSchema, org.apache.spark.memory.MemoryMode.ON_HEAP, capacity);
-      |    for (int i = 0 ; i < aggregateBufferBatch.numCols(); i++) {
-      |      aggregateBufferBatch.setColumn(i, batch.column(i+${groupingKeys.length}));
+      |    batchVectors = new org.apache.spark.sql.execution.vectorized
--- End diff --

Sure, I'll try it.
[GitHub] spark issue #18929: [MINOR][LAUNCHER]Reuse EXECUTOR_MEMORY and EXECUTOR_CORE...
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/18929 thanks @srowen @jerryshao @vanzin
[GitHub] spark pull request #18929: [MINOR][LAUNCHER]Reuse EXECUTOR_MEMORY and EXECUT...
Github user heary-cao closed the pull request at: https://github.com/apache/spark/pull/18929
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17373 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80762/ Test PASSed.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17373 **[Test build #80762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80762/testReport)** for PR 17373 at commit [`5369b08`](https://github.com/apache/spark/commit/5369b088e7fcb0fa35b0e4c840772cf60515c882).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17373 Merged build finished. Test PASSed.
[GitHub] spark issue #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates should not...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18968 **[Test build #80765 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80765/testReport)** for PR 18968 at commit [`4604a08`](https://github.com/apache/spark/commit/4604a08e390019f7c3952774dd1b9086be9f2680).
[GitHub] spark pull request #18968: [SPARK-21759][SQL] PullupCorrelatedPredicates sho...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/18968

[SPARK-21759][SQL] PullupCorrelatedPredicates should not produce unresolved plans

## What changes were proposed in this pull request?

With the check for structural integrity proposed in SPARK-21726, I found that the optimization rule `PullupCorrelatedPredicates` can produce unresolved plans. For a correlated IN query like:

    Project [a#0]
    +- Filter a#0 IN (list#4 [b#1])
       :  +- Project [c#2]
       :     +- Filter (outer(b#1) < d#3)
       :        +- LocalRelation <empty>, [c#2, d#3]
       +- LocalRelation <empty>, [a#0, b#1]

`PullupCorrelatedPredicates` produces a query plan like:

    'Project [a#0]
    +- 'Filter a#0 IN (list#4 [(b#1 < d#3)])
       :  +- Project [c#2, d#3]
       :     +- LocalRelation <empty>, [c#2, d#3]
       +- LocalRelation <empty>, [a#0, b#1]

Because the correlated predicate involves another attribute `d#3` in the subquery, it has been pulled out and added to the `Project` on top of the subquery. When `list` in `In` contains just one `ListQuery`, `In.checkInputDataTypes` checks whether the number of `value` expressions matches the output size of the subquery. In the example above, there is only one `value` expression while the subquery output has two attributes `c#2, d#3`, so it fails the check and `In.resolved` returns `false`. We should not let `PullupCorrelatedPredicates` produce unresolved plans that fail the structural integrity check.

## How was this patch tested?

Added a test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 SPARK-21759

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/18968.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #18968

commit 4604a08e390019f7c3952774dd1b9086be9f2680
Author: Liang-Chi Hsieh
Date: 2017-08-17T04:16:39Z

    PullupCorrelatedPredicates should not produce unresolved plans.
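The arity check the PR description refers to can be modeled with a minimal Python sketch. This is an illustration of the rule being described, not Spark's actual API; the function name and list-of-attribute-names representation are invented for the example:

```python
# Hypothetical model of the resolution check described above: an IN-subquery
# expression resolves only when the number of value expressions equals the
# number of attributes the subquery outputs (In.checkInputDataTypes).

def in_subquery_resolved(num_values: int, subquery_output: list) -> bool:
    return num_values == len(subquery_output)

# Before PullupCorrelatedPredicates: a#0 IN (subquery outputting [c#2])
print(in_subquery_resolved(1, ["c#2"]))         # True: plan stays resolved

# After the rule pulls (b#1 < d#3) up, the subquery projects [c#2, d#3],
# but there is still a single value expression a#0 -> unresolved plan.
print(in_subquery_resolved(1, ["c#2", "d#3"]))  # False
```

This is why the extra attribute added to the subquery's `Project` trips `In.resolved` even though the rewrite is semantically intentional.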
[GitHub] spark issue #18892: [SPARK-21520][SQL]Improvement a special case for non-det...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18892 **[Test build #80764 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80764/testReport)** for PR 18892 at commit [`72e0252`](https://github.com/apache/spark/commit/72e0252bb2d3a9c7d43ed8756d8d7ea34fb80ca5).
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133618935

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java ---
@@ -40,8 +39,43 @@
   private long lengthData;
   private long offsetData;

-  protected OffHeapColumnVector(int capacity, DataType type) {
-    super(capacity, type, MemoryMode.OFF_HEAP);
+  public OffHeapColumnVector(int capacity, DataType type) {
+    super(capacity, type);
+
+    if (type instanceof ArrayType || type instanceof BinaryType || type instanceof StringType
--- End diff --

Sure, I'll try it.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133618947

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java ---
@@ -491,6 +525,22 @@ public void loadBytes(ColumnVector.Array array) {
     array.byteArrayOffset = 0;
   }

+  /**
+   * Reserve a integer column for ids of dictionary.
+   */
+  @Override
+  public OffHeapColumnVector reserveDictionaryIds(int capacity) {
--- End diff --

Sure, I'll try it.
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133618679

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -227,7 +228,8 @@ class HadoopTableReader(
     def fillPartitionKeys(rawPartValues: Array[String], row: InternalRow): Unit = {
       partitionKeyAttrs.foreach { case (attr, ordinal) =>
         val partOrdinal = partitionKeys.indexOf(attr)
-        row(ordinal) = Cast(Literal(rawPartValues(partOrdinal)), attr.dataType).eval(null)
+        row(ordinal) = Cast(Literal(rawPartValues(partOrdinal)), attr.dataType,
+          Option(SQLConf.get.sessionLocalTimeZone)).eval(null)
--- End diff --

Do you mean a test case for `HadoopTableReader`? That's a little confusing.
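For background on why the diff threads a time zone into the `Cast`: the same partition-key timestamp string denotes different instants under different session time zones, so a statically evaluated cast to `TimestampType` must be told which zone to use. A small Python illustration of that ambiguity (not Spark code; requires Python 3.9+ for `zoneinfo`):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # stdlib since Python 3.9

# The partition value arrives as a plain string, like rawPartValues above.
s = "2010-01-02 00:00:00"
naive = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")

# Interpreting the same wall-clock string in two different session time
# zones yields two different epoch instants, off by the full UTC offset.
utc_epoch = naive.replace(tzinfo=ZoneInfo("UTC")).timestamp()
tokyo_epoch = naive.replace(tzinfo=ZoneInfo("Asia/Tokyo")).timestamp()
assert utc_epoch - tokyo_epoch == 9 * 3600  # Asia/Tokyo is UTC+9, no DST
```

Without an explicit `timeZoneId`, the `Cast` would fall back to whatever default happens to be in scope, which is the bug SPARK-21739 fixes.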
[GitHub] spark issue #18953: [SPARK-20682][SQL] Implement new ORC data source based o...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18953 Hi, @cloud-fan. As you advised, I will replace the old ORC in the current namespace and try to move it to `sql/core` later. Although we cannot switch between the old and new ORC, we can bring back the old ORC from the code if needed. Thanks.
[GitHub] spark issue #18955: [SPARK-21743][SQL] top-most limit should not cause memor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18955 Merged build finished. Test PASSed.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133617360

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnarBatch.java ---
@@ -307,64 +293,73 @@ public void update(int ordinal, Object value) {

     @Override
     public void setNullAt(int ordinal) {
+      assert (columns[ordinal] instanceof MutableColumnVector);
       assert (!columns[ordinal].isConstant);
-      columns[ordinal].putNull(rowId);
+      ((MutableColumnVector) columns[ordinal]).putNull(rowId);
     }

     @Override
     public void setBoolean(int ordinal, boolean value) {
+      assert (columns[ordinal] instanceof MutableColumnVector);
       assert (!columns[ordinal].isConstant);
-      columns[ordinal].putNotNull(rowId);
-      columns[ordinal].putBoolean(rowId, value);
+      ((MutableColumnVector) columns[ordinal]).putNotNull(rowId);
--- End diff --

Sure, I'll add a private getter and update these.
[GitHub] spark issue #18955: [SPARK-21743][SQL] top-most limit should not cause memor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18955 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80756/ Test PASSed.
[GitHub] spark pull request #18958: [SPARK-21745][SQL] Refactor ColumnVector hierarch...
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18958#discussion_r133617361

--- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/MutableColumnVector.java ---
@@ -0,0 +1,599 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.execution.vectorized;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import org.apache.spark.sql.internal.SQLConf;
+import org.apache.spark.sql.types.*;
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * This class adds write APIs to ColumnVector.
+ * It supports all the types and contains put APIs as well as their batched versions.
+ * The batched versions are preferable whenever possible.
+ *
+ * Capacity: The data stored is dense but the arrays are not fixed capacity. It is the
+ * responsibility of the caller to call reserve() to ensure there is enough room before adding
+ * elements. This means that the put() APIs do not check as in common cases (i.e. flat schemas),
+ * the lengths are known up front.
+ *
+ * A ColumnVector should be considered immutable once originally created. In other words, it is not
+ * valid to call put APIs after reads until reset() is called.
+ */
+public abstract class MutableColumnVector extends ColumnVector {
+
+  /**
+   * Resets this column for writing. The currently stored values are no longer accessible.
+   */
+  @Override
+  public void reset() {
+    if (isConstant) return;
+
+    if (childColumns != null) {
+      for (ColumnVector c: childColumns) {
+        c.reset();
+      }
+    }
+    numNulls = 0;
+    elementsAppended = 0;
+    if (anyNullsSet) {
+      putNotNulls(0, capacity);
+      anyNullsSet = false;
+    }
+  }
+
+  public void reserve(int requiredCapacity) {
+    if (requiredCapacity > capacity) {
+      int newCapacity = (int) Math.min(MAX_CAPACITY, requiredCapacity * 2L);
+      if (requiredCapacity <= newCapacity) {
+        try {
+          reserveInternal(newCapacity);
+        } catch (OutOfMemoryError outOfMemoryError) {
+          throwUnsupportedException(requiredCapacity, outOfMemoryError);
+        }
+      } else {
+        throwUnsupportedException(requiredCapacity, null);
+      }
+    }
+  }
+
+  private void throwUnsupportedException(int requiredCapacity, Throwable cause) {
+    String message = "Cannot reserve additional contiguous bytes in the vectorized reader " +
+        "(requested = " + requiredCapacity + " bytes). As a workaround, you can disable the " +
+        "vectorized reader by setting " + SQLConf.PARQUET_VECTORIZED_READER_ENABLED().key() +
+        " to false.";
+
+    if (cause != null) {
+      throw new RuntimeException(message, cause);
--- End diff --

Thanks. I'll update it.
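The `reserve` method in the diff above grows capacity by doubling the requested size, clamped to a hard cap. A minimal Python sketch of that growth policy, separated from the Spark class for clarity (the cap constant is an assumption standing in for the vector's `MAX_CAPACITY`, not the actual value):

```python
MAX_CAPACITY = (1 << 31) - 8  # assumed cap, mirroring a near-Integer.MAX_VALUE limit

def next_capacity(capacity: int, required: int) -> int:
    """Model of the doubling-with-cap policy in MutableColumnVector.reserve."""
    if required <= capacity:
        return capacity              # enough room already, no reallocation
    new_capacity = min(MAX_CAPACITY, required * 2)
    if required <= new_capacity:
        return new_capacity          # grow to 2x the request, or to the cap
    raise RuntimeError(f"Cannot reserve {required} elements")

print(next_capacity(16, 10))    # 16: request fits, capacity unchanged
print(next_capacity(16, 100))   # 200: doubled request
```

Doubling keeps the amortized cost of repeated appends linear, while the cap turns an impossible reservation into the explicit error (with the vectorized-reader workaround hint) rather than an `OutOfMemoryError` deep inside allocation.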
[GitHub] spark issue #18955: [SPARK-21743][SQL] top-most limit should not cause memor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18955 **[Test build #80756 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80756/testReport)** for PR 18955 at commit [`4462778`](https://github.com/apache/spark/commit/44627788b9af15e84ec951543a56c7c9970ef247).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18959 Jenkins, retest this please.
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18959 **[Test build #80763 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80763/testReport)** for PR 18959 at commit [`b33fde8`](https://github.com/apache/spark/commit/b33fde86ecd6a0be5f4a55c408ab10e0ac44101a).
[GitHub] spark issue #18949: [SPARK-12961][CORE][FOLLOW-UP] Remove wrapper code for S...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/18949 ping
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18959 Merged build finished. Test FAILed.
[GitHub] spark issue #17373: [SPARK-12664][ML] Expose probability in mlp model
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17373 **[Test build #80762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80762/testReport)** for PR 17373 at commit [`5369b08`](https://github.com/apache/spark/commit/5369b088e7fcb0fa35b0e4c840772cf60515c882).
[GitHub] spark issue #18959: [SPARK-18394][SQL] Make an AttributeSet.toSeq output ord...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80761/ Test FAILed.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18960 **[Test build #80760 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80760/testReport)** for PR 18960 at commit [`a264e3a`](https://github.com/apache/spark/commit/a264e3aa166d2e83832a82489669893f41ff9749).
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133614757
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala ---
@@ -104,7 +105,7 @@ case class HiveTableScanExec(
     hadoopConf)

   private def castFromString(value: String, dataType: DataType) = {
-    Cast(Literal(value), dataType).eval(null)
+    Cast(Literal(value), dataType, Option(SQLConf.get.sessionLocalTimeZone)).eval(null)
--- End diff --
Here, we can obtain SQLConf directly with `sparkSession.sessionState.conf`
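For context on why the diff threads a session-local timezone into `Cast`: parsing the same timestamp string under different zones yields different instants, so relying on the JVM default timezone makes the cast result depend on where the driver runs. A plain-Java illustration of that sensitivity — hypothetical helper, not Spark code:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimeZoneCast {
    // Interpret a wall-clock timestamp string in an explicit zone.
    // The same string maps to a different epoch instant per zone, which is
    // why a string-to-timestamp cast needs the session timezone passed in
    // rather than silently using the JVM default.
    public static long toEpochSecond(String ts, String zone) {
        return LocalDateTime.parse(ts).atZone(ZoneId.of(zone)).toEpochSecond();
    }
}
```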
[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15435 **[Test build #80759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80759/testReport)** for PR 15435 at commit [`a041ea2`](https://github.com/apache/spark/commit/a041ea22d403b0befb6cade619ebfa5251658aba).
[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...
Github user jmchung commented on a diff in the pull request: https://github.com/apache/spark/pull/18930#discussion_r133614456
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql ---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values '{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --
@gatorsmile @viirya Thank you for taking the time to review the code. The SQL statements are now consistent in style, and the golden file for `json-functions.sql` has also been committed.
[GitHub] spark issue #18930: [SPARK-21677][SQL] json_tuple throws NullPointException ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18930 **[Test build #80758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80758/testReport)** for PR 18930 at commit [`5191ed4`](https://github.com/apache/spark/commit/5191ed48a57017b3eeb3336e7ffa4a823dca5c28).
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133612759
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala ---
@@ -104,7 +105,7 @@ case class HiveTableScanExec(
     hadoopConf)

   private def castFromString(value: String, dataType: DataType) = {
-    Cast(Literal(value), dataType).eval(null)
+    Cast(Literal(value), dataType, Option(SQLConf.get.sessionLocalTimeZone)).eval(null)
--- End diff --
BTW, is it elegant to initialize a `CastSupport` (`DataSourceAnalysis` rule or `DataSourceStrategy`) here, in which we still need to pass `SQLConf`?
[GitHub] spark issue #18902: [SPARK-21690][ML] one-pass imputer
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/18902 I tested on dataframes containing `null`: both `avg` and `stat.approxQuantile` ignore `null`. If a column contains only `null`, they return `null` and `Array.empty[Double]`, respectively. Agreed that we should add more tests for this dependency.
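The behavior described — aggregates skipping `null` values, and an all-`null` column yielding no result — can be illustrated outside Spark with plain Java (hypothetical helper, not Spark's actual `avg`):

```java
import java.util.List;
import java.util.OptionalDouble;

public class NullSkippingMean {
    // Mean over a column, ignoring nulls. An all-null column produces an
    // empty OptionalDouble, mirroring how the comment says avg returns null
    // and approxQuantile returns an empty array in that case.
    public static OptionalDouble mean(List<Double> column) {
        return column.stream()
                .filter(v -> v != null)
                .mapToDouble(Double::doubleValue)
                .average();
    }
}
```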
[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18887 **[Test build #80757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80757/testReport)** for PR 18887 at commit [`dc642bd`](https://github.com/apache/spark/commit/dc642bd70042da965387916656747ae78acdc192).
[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18887 retest this please
[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18930#discussion_r133610560
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql ---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values '{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --
Remember to drop the created view: `DROP VIEW IF EXISTS jsonTable;`
[GitHub] spark pull request #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading r...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18962#discussion_r133610467
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -330,18 +330,45 @@ object SparkSubmit extends CommandLineUtils {
     args.archives = Option(args.archives).map(resolveGlobPaths(_, hadoopConf)).orNull

     // In client mode, download remote files.
+    var localPrimaryResource: String = null
+    var localJars: String = null
+    var localPyFiles: String = null
     if (deployMode == CLIENT) {
-      args.primaryResource = Option(args.primaryResource).map {
+      localPrimaryResource = Option(args.primaryResource).map {
         downloadFile(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
-      args.jars = Option(args.jars).map {
+      localJars = Option(args.jars).map {
         downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
-      args.pyFiles = Option(args.pyFiles).map {
+      localPyFiles = Option(args.pyFiles).map {
         downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
     }

+    if (clusterManager == YARN) {
+      def isNoneFsFileExist(paths: String): Boolean = {
+        Option(paths).exists { p =>
+          p.split(",").map(_.trim).filter(_.nonEmpty).exists { path =>
+            val url = Utils.resolveURI(path)
+            url.getScheme match {
+              case "http" | "https" | "ftp" => true
+              case _ => false
+            }
+          }
+        }
+      }
+
+      // Spark on YARN doesn't support upload remote resources from http, https or ftp server
+      // directly to distributed cache, so print a warning and exit the process.
+      if (isNoneFsFileExist(args.jars) ||
--- End diff --
That kinda looks like a bug. Spark shouldn't be trying to upload files that the distributed cache can handle itself; not sure if there's a programmatic way of figuring out the list of schemes that it supports, though. At worst, Spark shouldn't do anything for those URLs; executors should be able to download directly from http / https if needed.
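The `isNoneFsFileExist` helper in the diff classifies comma-separated resource paths by URI scheme to flag the ones the YARN distributed cache cannot fetch directly (http, https, ftp). A simplified standalone sketch of that predicate — hypothetical class name; the real Scala also resolves bare paths via `Utils.resolveURI` before inspecting the scheme:

```java
import java.net.URI;
import java.util.Arrays;

public class ResourceSchemes {
    // True if any comma-separated path uses a scheme that would have to be
    // downloaded first rather than handed to the YARN distributed cache.
    public static boolean hasNonFsPath(String paths) {
        if (paths == null) return false;
        return Arrays.stream(paths.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .anyMatch(p -> {
                    String scheme = URI.create(p).getScheme();
                    return "http".equals(scheme)
                        || "https".equals(scheme)
                        || "ftp".equals(scheme);
                });
    }
}
```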
[GitHub] spark pull request #18948: Add the validation of spark.cores.max under Strea...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18948#discussion_r133610439
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala ---
@@ -144,6 +144,13 @@ class StreamingContext private[streaming] (
     }
   }

+  if (sc.conf.contains("spark.cores.max")) {
+    val totalCores = sc.conf.getInt("spark.cores.max", 1)
--- End diff --
@jiangxb1987 "spark.cores.max" is a per-application configuration that limits the number of cores this application can request; it is not a per-executor limitation.
> The config spark.cores.max is used to limit the max number of cores that a single executor can require
So if we have 2 receivers in one streaming application, the minimum number should still be > 2; checking against "1" here is not sufficient. And since the receiver count is only known at run time, checking the configuration will not work as expected.
[GitHub] spark issue #18967: [SQL] [MINOR] [TEST] Set spark.unsafe.exceptionOnMemoryL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18967 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80753/ Test PASSed.
[GitHub] spark issue #18967: [SQL] [MINOR] [TEST] Set spark.unsafe.exceptionOnMemoryL...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18967 Merged build finished. Test PASSed.
[GitHub] spark issue #18967: [SQL] [MINOR] [TEST] Set spark.unsafe.exceptionOnMemoryL...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18967 **[Test build #80753 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80753/testReport)** for PR 18967 at commit [`c6cca96`](https://github.com/apache/spark/commit/c6cca96eb2c05fa69b0725cae5002323a5d12589).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #18734: [SPARK-21070][PYSPARK] Attempt to update cloudpic...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18734#discussion_r133609212
--- Diff: python/pyspark/cloudpickle.py ---
@@ -241,11 +338,32 @@ def save_function(self, obj, name=None):
         if getattr(themodule, name, None) is obj:
             return self.save_global(obj, name)

+        # a builtin_function_or_method which comes in as an attribute of some
+        # object (e.g., object.__new__, itertools.chain.from_iterable) will end
+        # up with modname "__main__" and so end up here. But these functions
+        # have no __code__ attribute in CPython, so the handling for
+        # user-defined functions below will fail.
+        # So we pickle them here using save_reduce; have to do it differently
+        # for different python versions.
+        if not hasattr(obj, '__code__'):
+            if PY3:
+                if sys.version_info < (3, 4):
+                    raise pickle.PicklingError("Can't pickle %r" % obj)
+                else:
+                    rv = obj.__reduce_ex__(self.proto)
+            else:
+                if hasattr(obj, '__self__'):
+                    rv = (getattr, (obj.__self__, name))
+                else:
+                    raise pickle.PicklingError("Can't pickle %r" % obj)
+            return Pickler.save_reduce(self, obj=obj, *rv)
+
         # if func is lambda, def'ed at prompt, is in main, or is nested, then
         # we'll pickle the actual function object rather than simply saving a
         # reference (as is done in default pickler), via save_function_tuple.
-        if islambda(obj) or obj.__code__.co_filename == '' or themodule is None:
-            #print("save global", islambda(obj), obj.__code__.co_filename, modname, themodule)
+        if (islambda(obj)
--- End diff --
Just as a side note, it looks like this PR includes https://github.com/cloudpipe/cloudpickle/pull/51 too (SPARK-21753).
[GitHub] spark pull request #18962: [SPARK-21714][CORE][YARN] Avoiding re-uploading r...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/18962#discussion_r133609050
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -330,18 +330,45 @@ object SparkSubmit extends CommandLineUtils {
     args.archives = Option(args.archives).map(resolveGlobPaths(_, hadoopConf)).orNull

     // In client mode, download remote files.
+    var localPrimaryResource: String = null
+    var localJars: String = null
+    var localPyFiles: String = null
     if (deployMode == CLIENT) {
-      args.primaryResource = Option(args.primaryResource).map {
+      localPrimaryResource = Option(args.primaryResource).map {
         downloadFile(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
-      args.jars = Option(args.jars).map {
+      localJars = Option(args.jars).map {
         downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
-      args.pyFiles = Option(args.pyFiles).map {
+      localPyFiles = Option(args.pyFiles).map {
         downloadFileList(_, targetDir, args.sparkProperties, hadoopConf)
       }.orNull
     }

+    if (clusterManager == YARN) {
+      def isNoneFsFileExist(paths: String): Boolean = {
+        Option(paths).exists { p =>
+          p.split(",").map(_.trim).filter(_.nonEmpty).exists { path =>
+            val url = Utils.resolveURI(path)
+            url.getScheme match {
+              case "http" | "https" | "ftp" => true
+              case _ => false
+            }
+          }
+        }
+      }
+
+      // Spark on YARN doesn't support upload remote resources from http, https or ftp server
+      // directly to distributed cache, so print a warning and exit the process.
+      if (isNoneFsFileExist(args.jars) ||
--- End diff --
The code [here](https://github.com/apache/spark/blob/b8ffb51055108fd606b86f034747006962cd2df3/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L346) compares the two FSs and copies src to dest if the FS is different. AFAIK there's no http scheme in Hadoop, so `val srcFs = srcPath.getFileSystem(hadoopConf)` will probably throw an exception.
[GitHub] spark issue #18955: [SPARK-21743][SQL] top-most limit should not cause memor...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18955 LGTM
[GitHub] spark issue #18935: [SPARK-9104][CORE] Expose Netty memory metrics in Spark
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18935 Thanks @zsxwing. I was thinking of exposing the details of Netty's memory allocation for users to monitor and tune; users could filter out unrelated metrics. But maybe you're right that exposing each arena's detail is too verbose and too fine-grained — let me change the code.
[GitHub] spark pull request #18960: [SPARK-21739][SQL]Cast expression should initiali...
Github user DonnyZone commented on a diff in the pull request: https://github.com/apache/spark/pull/18960#discussion_r133607798
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -227,7 +228,8 @@ class HadoopTableReader(
     def fillPartitionKeys(rawPartValues: Array[String], row: InternalRow): Unit = {
       partitionKeyAttrs.foreach { case (attr, ordinal) =>
         val partOrdinal = partitionKeys.indexOf(attr)
-        row(ordinal) = Cast(Literal(rawPartValues(partOrdinal)), attr.dataType).eval(null)
+        row(ordinal) = Cast(Literal(rawPartValues(partOrdinal)), attr.dataType,
+          Option(SQLConf.get.sessionLocalTimeZone)).eval(null)
--- End diff --
OK, I will work on it
[GitHub] spark issue #18964: [SPARK-21701][CORE] Enable RPC client to use ` SO_RCVBUF...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18964 The change looks OK to me. Did you hit an issue in which you had to change the buffer size on the client side?
[GitHub] spark issue #18965: [SPARK-21749][DOC] Add comments for MessageEncoder to ex...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/18965 We usually don't have PRs that only add comments to explain something, so I'm neutral on this change.
[GitHub] spark issue #18957: [SPARK-21744][CORE] Add retry logic for new broadcast in...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18957 If we implemented the retry logic in `DiskBlockManager`, every high-level caller of `DiskBlockManager` would be affected, so I only implemented it in `BroadcastManager`.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18960 Merged build finished. Test PASSed.
[GitHub] spark issue #18957: [SPARK-21744][CORE] Add retry logic for new broadcast in...
Github user caneGuy commented on the issue: https://github.com/apache/spark/pull/18957 @jiangxb1987 From the unit test we can see that if "spark.local.dir" contains both a good disk and a bad disk, retrying the given number of times can skip the bad disk, so the driver will not exit with an exception. And this change has no side effect in the normal scenario.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18960 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80754/ Test PASSed.
[GitHub] spark issue #18960: [SPARK-21739][SQL]Cast expression should initialize time...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18960 **[Test build #80754 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80754/testReport)** for PR 18960 at commit [`492b756`](https://github.com/apache/spark/commit/492b756fde5008854d1351ed423c3897c683c662).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes f...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18855
[GitHub] spark pull request #18963: [SPARK-18464][SQL][backport] support old table wh...
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/18963
[GitHub] spark issue #18855: [SPARK-3151] [Block Manager] DiskStore.getBytes fails fo...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18855 thanks, merging to master!