[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14746#discussion_r76366559
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -105,7 +105,13 @@ case class CreateViewCommand(
 }
 val sessionState = sparkSession.sessionState
 
-if (isTemporary) {
+// 1) CREATE VIEW: create a temp view when users explicitly specify the keyword TEMPORARY;
+// otherwise, create a permanent view no matter whether the temporary view
+// with the same name exists or not.
+// 2) ALTER VIEW: alter the temporary view if the temp view exists; otherwise, try to alter
--- End diff --

Yeah! The only way is to pass a flag. 
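
For readers following the diff, the resolution rules in the quoted comment can be sketched roughly as below. This is only an illustration: `ViewResolutionSketch`, `useTempView`, and the `isAlterView` flag are hypothetical names, not identifiers from `views.scala`. The quoted comment is cut off after "try to alter", so what happens when ALTER VIEW finds no temp view is assumed here to be "do not use a temp view".

```scala
object ViewResolutionSketch {
  // Returns true when the command should act on a temporary view.
  // isAlterView is a hypothetical flag separating ALTER VIEW from CREATE VIEW.
  def useTempView(
      isAlterView: Boolean,
      isTemporary: Boolean,
      tempViewExists: Boolean): Boolean =
    if (isAlterView) {
      // ALTER VIEW: prefer an existing temp view with the same name.
      // The non-temp fallback is an assumption (the source comment is truncated).
      tempViewExists
    } else {
      // CREATE VIEW: only the explicit TEMPORARY keyword selects a temp view,
      // regardless of whether a temp view with that name already exists.
      isTemporary
    }
}
```

With a flag like this, a single code path could serve both commands while keeping their different lookup orders explicit.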


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14572
  
ping @yhuai : )





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/8880
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64451/





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/8880
  
Merged build finished. Test PASSed.





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/8880
  
**[Test build #64451 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64451/consoleFull)**
 for PR 8880 at commit 
[`a9a05c5`](https://github.com/apache/spark/commit/a9a05c5168eede0db26135c0f8f330b451c840ad).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14572
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64452/





[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14572
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14572
  
**[Test build #64452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64452/consoleFull)**
 for PR 14572 at commit 
[`d3a79c8`](https://github.com/apache/spark/commit/d3a79c847b24b5eb3dce0818099d99dd25869b87).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14801
  
**[Test build #64456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64456/consoleFull)**
 for PR 14801 at commit 
[`8bcd946`](https://github.com/apache/spark/commit/8bcd946ac37a36726a8059f2d074551357d2ed2b).





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14801
  
**[Test build #64455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64455/consoleFull)**
 for PR 14801 at commit 
[`3f75605`](https://github.com/apache/spark/commit/3f7560517955fae5b47d093567690e30988a1925).





[GitHub] spark issue #14808: [SPARK-17156][ML][EXAMPLE] Add multiclass logistic regre...

2016-08-25 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/14808
  
This is going to have to be changed after 
[SPARK-17163](https://issues.apache.org/jira/browse/SPARK-17163). Sorry about 
the confusion! We'll still want to make an example with multiclass, though, so 
maybe we can reuse some of this :)





[GitHub] spark issue #14818: [SPARK-17157][SPARKR][WIP]: Add multiclass logistic regr...

2016-08-25 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/14818
  
In fact, we are actually just eliminating the 
`MultinomialLogisticRegression` interface and merging it into the existing 
`LogisticRegression` estimator. So, maybe we won't need a change after all? I'm 
not very familiar with the R side, but basically the existing logistic 
regression will now support multiclass. 

My apologies for the confusion; I hadn't seen this Jira until just now.





[GitHub] spark issue #13231: [SPARK-15453] [SQL] Sort Merge Join to use bucketing met...

2016-08-25 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/13231
  
@tejasapatil any chance you can update it soon? If not, I am interested in 
implementing it.





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14821
  
LGTM, cc @yhuai to confirm.





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14617
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64450/





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14617
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14617
  
**[Test build #64450 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64450/consoleFull)**
 for PR 14617 at commit 
[`838840d`](https://github.com/apache/spark/commit/838840dc3e40b8b10a111d343329f735e76fad36).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14728: [SPARK-17165][SQL] FileStreamSource should not track the...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14728
  
**[Test build #64454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64454/consoleFull)**
 for PR 14728 at commit 
[`9a5ed19`](https://github.com/apache/spark/commit/9a5ed19f3b397b991794a6852aebb2b14c83d635).





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/14802
  
@zsxwing yup I plan to consolidate them.






[GitHub] spark pull request #14814: [SPARK-17242][Document]Update links of external d...

2016-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14814





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14819
  
Do other databases do this?






[GitHub] spark issue #14814: [SPARK-17242][Document]Update links of external dstream ...

2016-08-25 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14814
  
Merging in master/2.0.






[GitHub] spark issue #14618: [SPARK-17030] [SQL] Remove/Cleanup HiveMetastoreCatalog....

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14618
  
**[Test build #64453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64453/consoleFull)**
 for PR 14618 at commit 
[`ebdfad1`](https://github.com/apache/spark/commit/ebdfad1b575650dd5bedc3ab97c5cf1e97fa3072).





[GitHub] spark issue #14818: [SPARK-17157][SPARKR][WIP]: Add multiclass logistic regr...

2016-08-25 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14818
  
cool, thanks for the heads up @sethah - please loop us in for the R side 
changes.





[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14572
  
**[Test build #64452 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64452/consoleFull)**
 for PR 14572 at commit 
[`d3a79c8`](https://github.com/apache/spark/commit/d3a79c847b24b5eb3dce0818099d99dd25869b87).





[GitHub] spark issue #14572: [SPARK-17192] [SQL] Issue Exception when Users Specify t...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14572
  
retest this please





[GitHub] spark issue #14818: [SPARK-17157][SPARKR][WIP]: Add multiclass logistic regr...

2016-08-25 Thread sethah
Github user sethah commented on the issue:

https://github.com/apache/spark/pull/14818
  
This is going to have to wait. We are changing the interface completely. 
See https://issues.apache.org/jira/browse/SPARK-17163. 





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/8880
  
**[Test build #64451 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64451/consoleFull)**
 for PR 8880 at commit 
[`a9a05c5`](https://github.com/apache/spark/commit/a9a05c5168eede0db26135c0f8f330b451c840ad).





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14821
  
cc @cloud-fan @yhuai 





[GitHub] spark issue #14809: [SPARK-17238][SQL] simplify the logic for converting dat...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14809
  
Yeah, that is a bug. We did not get an exception when we read it, but we 
get an error when trying to write it. The error message is confusing:
```
Can only write data to relations with a single path.;
org.apache.spark.sql.AnalysisException: Can only write data to relations with a single path.;
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:167)
```





[GitHub] spark issue #13452: [SPARK-15718][SQL] better error message for writing buck...

2016-08-25 Thread Downchuck
Github user Downchuck commented on the issue:

https://github.com/apache/spark/pull/13452
  
Regarding the reason for disallowing bucket writes: "we have no idea [on 
read] if the data is bucketed or not, so it doesn't make sense to use save to 
write bucketed data"

It's easy enough to pass that information to the reader; it doesn't need to be 
automatic or rely on a metastore or other discovery methods. Something as simple 
as read.sortedBy(cols...).bucketedBy(func Or cols..) would do.
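
To make the suggestion concrete, here is a minimal sketch of what caller-supplied layout hints could look like. Everything in it is hypothetical: `ReadHints`, `sortedBy`, `bucketedBy`, and `bucketedOn` are illustrative names and are not part of Spark's `DataFrameReader`. It only models bucketing by column names, not by a function.

```scala
object ReadHintsSketch {
  // Hypothetical container for layout facts the caller asserts about the files.
  // Nothing here is discovered automatically or read from a metastore.
  final case class ReadHints(
      sortCols: Seq[String] = Nil,
      bucketCols: Seq[String] = Nil) {

    // Declare that the data is already sorted by these columns.
    def sortedBy(cols: String*): ReadHints = copy(sortCols = cols.toList)

    // Declare that the data is already bucketed by these columns.
    def bucketedBy(cols: String*): ReadHints = copy(bucketCols = cols.toList)

    // True only when bucketing was declared and matches the given keys exactly.
    def bucketedOn(keys: Seq[String]): Boolean =
      bucketCols.nonEmpty && bucketCols == keys
  }
}
```

A caller would write something like `ReadHintsSketch.ReadHints().sortedBy("ts").bucketedBy("user_id")`, and the reader would trust the declaration rather than verify it.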





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64449/





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14537
  
**[Test build #64449 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64449/consoleFull)**
 for PR 14537 at commit 
[`9ecb2ed`](https://github.com/apache/spark/commit/9ecb2ed01db1daa19dfe837745d5468cc4990703).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14821
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64448/





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14821
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14821
  
**[Test build #64448 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64448/consoleFull)**
 for PR 14821 at commit 
[`1c9a1e3`](https://github.com/apache/spark/commit/1c9a1e3c608be72ca7c4203ecc0e80f15080eb80).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-25 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/14712#discussion_r76356157
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala ---
@@ -88,14 +89,70 @@ case class AnalyzeTableCommand(tableName: String) extends RunnableCommand {
 }
   }.getOrElse(0L)
 
-// Update the Hive metastore if the total size of the table is different than the size
-// recorded in the Hive metastore.
-// This logic is based on org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats().
-if (newTotalSize > 0 && newTotalSize != oldTotalSize) {
+var needUpdate = false
+val totalSize = if (newTotalSize > 0 && newTotalSize != oldTotalSize) {
+  needUpdate = true
+  newTotalSize
+} else {
+  oldTotalSize
+}
+var numRows: Option[BigInt] = None
+if (!noscan) {
+  val oldRowCount: Long = if (catalogTable.catalogStats.isDefined) {
+catalogTable.catalogStats.get.rowCount.map(_.toLong).getOrElse(-1L)
+  } else {
+-1L
+  }
+  val newRowCount = sparkSession.table(tableName).count()
+  if (newRowCount >= 0 && newRowCount != oldRowCount) {
--- End diff --

If we delete the statistics, we can't tell whether the stats were never 
collected or the table is empty.
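
A minimal sketch of that distinction, assuming a simplified stand-in for the 
`Statistics` class in the diff (`TableStats` and `describe` are hypothetical 
names, not Spark API). Keeping `rowCount` as an `Option` lets `None` mean 
"never collected" while `Some(0)` means "collected, and the table is empty":

```scala
// Hypothetical, simplified stand-in for the Statistics class in the diff.
final case class TableStats(sizeInBytes: BigInt, rowCount: Option[BigInt])

object StatsSketch {
  // Distinguishes "never analyzed" from "analyzed and found empty".
  def describe(stats: Option[TableStats]): String = stats match {
    case None                           => "no statistics recorded"
    case Some(TableStats(_, None))      => "size known, row count not collected"
    case Some(TableStats(_, Some(n))) if n.signum == 0 =>
      "row count collected: table is empty"
    case Some(TableStats(_, Some(n)))   => s"row count collected: $n rows"
  }
}
```

If the stored statistics were deleted instead of kept, the first and third 
cases above would collapse into one.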





[GitHub] spark issue #14809: [SPARK-17238][SQL] simplify the logic for converting dat...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14809
  
Ah, I see. BTW, in your example, when will we throw the exception? When we 
read it? A file-based external table without a path is invalid.





[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

2016-08-25 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/14712
  
@cloud-fan Can you please launch the tests for this PR? Thanks!





[GitHub] spark pull request #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/14819#discussion_r76355514
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/literals.sql ---
@@ -27,6 +27,12 @@ select 9223372036854775807L, -9223372036854775808L;
 -- out of range long
 select 9223372036854775808L;
 
+-- big decimal parsing
--- End diff --

nit: if we move the two new queries to the end of this file, the resulting 
diff of `literals.sql.out` will be smaller.





[GitHub] spark pull request #14537: [SPARK-16948][SQL] Support empty orc table when c...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14537#discussion_r76355262
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -237,21 +237,27 @@ private[hive] class 
HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
   new Path(metastoreRelation.catalogTable.storage.locationUri.get),
   partitionSpec)
 
-val inferredSchema = if (fileType.equals("parquet")) {
-  val inferredSchema =
-defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles())
-  inferredSchema.map { inferred =>
-ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, 
inferred)
-  }.getOrElse(metastoreSchema)
-} else {
-  defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles()).get
+val schema = fileType match {
+  case "parquet" =>
+val inferredSchema =
+  defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles())
+
+// For Parquet, get correct schema by merging Metastore schema 
data types
--- End diff --

To follow the decision we made in 
https://github.com/apache/spark/pull/14207 , I think we should always use the 
metastore schema and not infer it again.

For branch 2.0, we should open another PR to fix 
`OrcFileFormat.inferSchema` so that it does not throw `FileNotFoundException` 
for an empty table.
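
As a rough illustration of the second point, here is a sketch of an inference 
helper that returns `None` for an empty file list instead of throwing. The 
`Schema` alias, `InferSketch`, and the `readSchema` callback are hypothetical 
simplifications, not the real `OrcFileFormat` code:

```scala
object InferSketch {
  // Hypothetical simplified schema: (column name, type name) pairs.
  type Schema = Seq[(String, String)]

  // Reads the schema from the first file if there is one; an empty table
  // yields None, so the caller can fall back to the metastore schema.
  def inferSchema(files: Seq[String], readSchema: String => Schema): Option[Schema] =
    files.headOption.map(readSchema)
}
```

A caller would then write something like 
`inferSchema(files, reader).getOrElse(metastoreSchema)`, which matches the 
`getOrElse(metastoreSchema)` shape already visible in the Parquet branch of 
the diff.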





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14617
  
**[Test build #64450 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64450/consoleFull)**
 for PR 14617 at commit 
[`838840d`](https://github.com/apache/spark/commit/838840dc3e40b8b10a111d343329f735e76fad36).





[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-25 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/14712#discussion_r76354939
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ---
@@ -88,14 +89,70 @@ case class AnalyzeTableCommand(tableName: String) 
extends RunnableCommand {
 }
   }.getOrElse(0L)
 
-// Update the Hive metastore if the total size of the table is 
different than the size
-// recorded in the Hive metastore.
-// This logic is based on 
org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats().
-if (newTotalSize > 0 && newTotalSize != oldTotalSize) {
+var needUpdate = false
+val totalSize = if (newTotalSize > 0 && newTotalSize != 
oldTotalSize) {
+  needUpdate = true
+  newTotalSize
+} else {
+  oldTotalSize
+}
+var numRows: Option[BigInt] = None
+if (!noscan) {
+  val oldRowCount: Long = if (catalogTable.catalogStats.isDefined) 
{
+
catalogTable.catalogStats.get.rowCount.map(_.toLong).getOrElse(-1L)
+  } else {
+-1L
+  }
+  val newRowCount = sparkSession.table(tableName).count()
+  if (newRowCount >= 0 && newRowCount != oldRowCount) {
+numRows = Some(BigInt(newRowCount))
+needUpdate = true
+  }
+}
+// Update the metastore if the above statistics of the table are 
different from those
+// recorded in the metastore.
+if (needUpdate) {
+  sessionState.catalog.alterTable(
+catalogTable.copy(
+  catalogStats = Some(Statistics(
+sizeInBytes = totalSize, rowCount = numRows))),
+fromAnalyze = true)
+
+  // Refresh the cache of the table in the catalog.
+  sessionState.catalog.refreshTable(tableIdent)
+}
+
+  // data source tables have been converted into LogicalRelations
+  case logicalRel: LogicalRelation if 
logicalRel.metastoreTableIdentifier.isDefined =>
--- End diff --

You can run the test case I added, "test table-level statistics for data 
source table created in HiveExternalCatalog".





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14537
  
BTW, @rajeshbalamohan, since you now use the metastore schema directly, the 
PR description no longer looks correct. Can you also update it? Thanks.





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14819
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14819
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64445/
Test PASSed.





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/14617
  
Jenkins, retest this please.





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14819
  
**[Test build #64445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64445/consoleFull)**
 for PR 14819 at commit 
[`fda100f`](https://github.com/apache/spark/commit/fda100f3c42bf82c9d0accafc7230c906e0b8317).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-25 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/14712#discussion_r76354802
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ---
@@ -88,14 +89,70 @@ case class AnalyzeTableCommand(tableName: String) 
extends RunnableCommand {
 }
   }.getOrElse(0L)
 
-// Update the Hive metastore if the total size of the table is 
different than the size
-// recorded in the Hive metastore.
-// This logic is based on 
org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats().
-if (newTotalSize > 0 && newTotalSize != oldTotalSize) {
+var needUpdate = false
+val totalSize = if (newTotalSize > 0 && newTotalSize != 
oldTotalSize) {
+  needUpdate = true
+  newTotalSize
+} else {
+  oldTotalSize
+}
+var numRows: Option[BigInt] = None
+if (!noscan) {
+  val oldRowCount: Long = if (catalogTable.catalogStats.isDefined) 
{
--- End diff --

thanks, this looks more concise!





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14537
  
@gatorsmile Thanks for cc'ing me.

As `spark.sql.hive.convertMetastoreOrc` is set to `false` by default, this 
change looks fine. However, if the config is set to `true` and the schema is 
inconsistent between the metastore and the Orc files, I remember it causing a 
failure when reading the files.

I've implemented two approaches to this issue: #14282 simply disables Orc 
conversion when this case happens, and #14365 does a more complicated schema 
mapping. Once this is merged, I think we should fix the schema inconsistency 
soon.





[GitHub] spark issue #14239: [SPARK-16593] [CORE] [WIP] Provide a pre-fetch mechanism...

2016-08-25 Thread f7753
Github user f7753 commented on the issue:

https://github.com/apache/spark/pull/14239
  
@tgravescs To make it more readable, here are answers to the questions above.
**1. Are you saying that you are loading all the data for all the maps from 
disk into memory and caching it waiting for the reducer to fetch it?**
**2. does it conditionally do this or always do it?**

I use the parameter `spark.shuffle.prepare.open` to switch this mechanism 
on or off, and `spark.shuffle.prepare.count` to control the number of blocks 
to cache. This lets users control the memory used for pre-fetched blocks 
based on their machine conditions.

**3. How exactly does the timing work on this, aren't you going to send the 
prepare immediately before sending the fetch? does the fetch block on waiting 
on the prepare to cache the data?**

I changed the logic of the shuffle message transfer process: each time I 
send a FetchRequest, I also send the next one, so the server side knows 
exactly which blockIds the next fetch loop needs and caches them. In the 
FetchRequest success callback, the cache is released, since all of those 
blocks have been sent to the map side and are no longer used. When the 
`PrepareRequest` arrives, the server takes a thread from the thread pool to 
perform the read (in fact, I use a `FutureTask` to do this). If the 
`FetchRequest` arrives before the data has been fully cached, the request 
blocks as before, but it is still more efficient than before because the data 
is already being loaded into memory before the request actually arrives.
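
The prepare/fetch interaction described above could be sketched roughly as 
follows. This is only an illustration under stated assumptions: 
`PrefetchCache`, `prepare`, `fetch`, and the `loadBlock` callback are 
hypothetical names, not the classes in the PR, and releasing the entry inside 
`fetch` stands in for the release done in the success callback.

```scala
import java.util.concurrent.{Callable, ConcurrentHashMap, Executors, FutureTask}

// Hypothetical server-side cache of blocks announced by a prepare request.
final class PrefetchCache(loadBlock: String => Array[Byte], threads: Int = 2) {
  private val pool = Executors.newFixedThreadPool(threads)
  private val pending = new ConcurrentHashMap[String, FutureTask[Array[Byte]]]()

  // Called when a prepare request names the blocks of the next fetch loop.
  def prepare(blockIds: Seq[String]): Unit = blockIds.foreach { id =>
    val task = new FutureTask[Array[Byte]](new Callable[Array[Byte]] {
      override def call(): Array[Byte] = loadBlock(id)
    })
    // Only the first prepare for a given block schedules the read.
    if (pending.putIfAbsent(id, task) == null) pool.execute(task)
  }

  // Called when the fetch request arrives. Waits for an in-flight read,
  // then drops the cached entry; reads directly if nothing was prepared.
  def fetch(id: String): Array[Byte] = {
    val task = pending.remove(id)
    if (task == null) loadBlock(id) else task.get()
  }

  def shutdown(): Unit = pool.shutdown()
}
```

In this sketch a fetch that arrives early simply waits on `task.get()`, which 
corresponds to the "blocked like before" case, while a fetch that arrives 
after the read has finished returns immediately.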

**4. what testing have you done with this and what size of data? What type 
of load was on the nodes when testing, etc?**

I have implemented and tested this on branches 1.4 and 1.6, using Intel 
HiBench 4.0 terasort with a 1 TB data size. I got about a 30% performance 
improvement on a cluster of 5 nodes, each with 96 GB of memory, a Xeon E5 v3 
CPU, and 7200 RPM disks.

But note that a benchmark like terasort shuffles all the data it reads, so 
in other cases this may not work as well.











[GitHub] spark pull request #14712: [SPARK-17072] [SQL] support table-level statistic...

2016-08-25 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/14712#discussion_r76354055
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
 ---
@@ -88,14 +89,70 @@ case class AnalyzeTableCommand(tableName: String) 
extends RunnableCommand {
 }
   }.getOrElse(0L)
 
-// Update the Hive metastore if the total size of the table is 
different than the size
-// recorded in the Hive metastore.
-// This logic is based on 
org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats().
-if (newTotalSize > 0 && newTotalSize != oldTotalSize) {
+var needUpdate = false
+val totalSize = if (newTotalSize > 0 && newTotalSize != 
oldTotalSize) {
+  needUpdate = true
+  newTotalSize
+} else {
+  oldTotalSize
+}
+var numRows: Option[BigInt] = None
+if (!noscan) {
+  val oldRowCount: Long = if (catalogTable.catalogStats.isDefined) 
{
+
catalogTable.catalogStats.get.rowCount.map(_.toLong).getOrElse(-1L)
+  } else {
+-1L
+  }
+  val newRowCount = sparkSession.table(tableName).count()
+  if (newRowCount >= 0 && newRowCount != oldRowCount) {
+numRows = Some(BigInt(newRowCount))
+needUpdate = true
+  }
+}
+// Update the metastore if the above statistics of the table are 
different from those
+// recorded in the metastore.
+if (needUpdate) {
+  sessionState.catalog.alterTable(
+catalogTable.copy(
+  catalogStats = Some(Statistics(
+sizeInBytes = totalSize, rowCount = numRows))),
+fromAnalyze = true)
+
+  // Refresh the cache of the table in the catalog.
+  sessionState.catalog.refreshTable(tableIdent)
+}
+
+  // data source tables have been converted into LogicalRelations
+  case logicalRel: LogicalRelation if 
logicalRel.metastoreTableIdentifier.isDefined =>
--- End diff --

We reach here when analyzing a data source table with Hive; the table is in 
the form of a LogicalRelation maintained in "cachedDataSourceTables" in 
HiveMetastoreCatalog.





[GitHub] spark issue #14710: [SPARK-16533][CORE] resolve deadlocking in driver when e...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14710
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64442/
Test PASSed.





[GitHub] spark issue #14710: [SPARK-16533][CORE] resolve deadlocking in driver when e...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14710
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14710: [SPARK-16533][CORE] resolve deadlocking in driver when e...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14710
  
**[Test build #64442 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64442/consoleFull)**
 for PR 14710 at commit 
[`380291b`](https://github.com/apache/spark/commit/380291b7122aaf1fab461a07d72f0c285696c967).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64446/
Test PASSed.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Support empty orc table when converti...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14537
  
**[Test build #64446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64446/consoleFull)**
 for PR 14537 at commit 
[`fc14e2d`](https://github.com/apache/spark/commit/fc14e2d95cb95becf90a38e91e7725e483bae835).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14809: [SPARK-17238][SQL] simplify the logic for converting dat...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14809
  
If we do not specify the schema, it will behave as you said. For 
example,
```Scala
sparkSession.catalog.createExternalTable(
  "createdParquetTable",
  "parquet",
  Map.empty[String, String])
```
```
Unable to infer schema for ParquetFormat at . It must be specified manually;
org.apache.spark.sql.AnalysisException: Unable to infer schema for 
ParquetFormat at . It must be specified manually;
```

However, if we specify the schema, we will not get an error.





[GitHub] spark issue #14710: [SPARK-16533][CORE] resolve deadlocking in driver when e...

2016-08-25 Thread angolon
Github user angolon commented on the issue:

https://github.com/apache/spark/pull/14710
  
Thanks for the feedback, @vanzin - all good points. I'll fix them up.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread rajeshbalamohan
Github user rajeshbalamohan commented on the issue:

https://github.com/apache/spark/pull/14537
  
Thanks @gatorsmile. Removed the changes related to OrcFileFormat.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14537
  
**[Test build #64449 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64449/consoleFull)**
 for PR 14537 at commit 
[`9ecb2ed`](https://github.com/apache/spark/commit/9ecb2ed01db1daa19dfe837745d5468cc4990703).





[GitHub] spark pull request #14815: [SPARK-17244] Catalyst should not pushdown non-de...

2016-08-25 Thread sameeragarwal
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/14815#discussion_r76351420
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
 ---
@@ -1386,15 +1386,17 @@ object EliminateOuterJoin extends Rule[LogicalPlan] 
with PredicateHelper {
 object PushPredicateThroughJoin extends Rule[LogicalPlan] with 
PredicateHelper {
   /**
* Splits join condition expressions into three categories based on the 
attributes required
-   * to evaluate them.
+   * to evaluate them. Note that we explicitly exclude non-deterministic 
(i.e., stateful) condition
+   * expressions in canEvaluateInLeft or canEvaluateInRight to prevent 
pushing these predicates on
+   * either side of the join.
*
* @return (canEvaluateInLeft, canEvaluateInRight, haveToEvaluateInBoth)
*/
   private def split(condition: Seq[Expression], left: LogicalPlan, right: 
LogicalPlan) = {
 val (leftEvaluateCondition, rest) =
-condition.partition(_.references subsetOf left.outputSet)
+condition.partition(expr => 
expr.references.subsetOf(left.outputSet) && expr.deterministic)
--- End diff --

Good catch! Didn't realize that relative ordering of these expressions 
could become an issue.
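
For readers following the diff, here is a simplified sketch of the split it 
describes. `Pred` and `SplitSketch` are hypothetical stand-ins for Catalyst's 
`Expression` and the private `split` method, and the sketch does not attempt 
to handle the relative-ordering concern:

```scala
// Hypothetical, simplified model of a join predicate; not Catalyst's Expression.
final case class Pred(sql: String, references: Set[String], deterministic: Boolean)

object SplitSketch {
  // Returns (left-only, right-only, must be evaluated on the join itself).
  // Non-deterministic predicates are never placed in the first two groups.
  def split(
      condition: Seq[Pred],
      leftOutput: Set[String],
      rightOutput: Set[String]): (Seq[Pred], Seq[Pred], Seq[Pred]) = {
    val (leftOnly, rest) =
      condition.partition(p => p.references.subsetOf(leftOutput) && p.deterministic)
    val (rightOnly, both) =
      rest.partition(p => p.references.subsetOf(rightOutput) && p.deterministic)
    (leftOnly, rightOnly, both)
  }
}
```

Note that a predicate with no column references, such as a bare `rand()` 
comparison, has an empty reference set that is a subset of either side, so 
only the `deterministic` check keeps it from being pushed down.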





[GitHub] spark issue #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurrentData...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14821
  
**[Test build #64448 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64448/consoleFull)**
 for PR 14821 at commit 
[`1c9a1e3`](https://github.com/apache/spark/commit/1c9a1e3c608be72ca7c4203ecc0e80f15080eb80).





[GitHub] spark issue #14820: [SparkR][Minor] Fix example of spark.naiveBayes

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14820
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64447/
Test PASSed.





[GitHub] spark issue #14811: [SPARK-17231][CORE] Avoid building debug or trace log me...

2016-08-25 Thread mallman
Github user mallman commented on the issue:

https://github.com/apache/spark/pull/14811
  
Done





[GitHub] spark pull request #14811: [SPARK-17231][CORE] Avoid building debug or trace...

2016-08-25 Thread mallman
Github user mallman closed the pull request at:

https://github.com/apache/spark/pull/14811





[GitHub] spark issue #14820: [SparkR][Minor] Fix example of spark.naiveBayes

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14820
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14820: [SparkR][Minor] Fix example of spark.naiveBayes

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14820
  
**[Test build #64447 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64447/consoleFull)**
 for PR 14820 at commit 
[`607f117`](https://github.com/apache/spark/commit/607f1177cd7800c29ae29edc9548f820f589495d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14816: [SPARK-17245] [SQL] [BRANCH-1.6] Do not rely on Hive's s...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14816
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14816: [SPARK-17245] [SQL] [BRANCH-1.6] Do not rely on Hive's s...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14816
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64443/
Test PASSed.





[GitHub] spark issue #14809: [SPARK-17238][SQL] simplify the logic for converting dat...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14809
  
@gatorsmile can you explain more about this example? I think we will throw 
an exception in `CreateDataSourceTableCommand` when we create a `DataSource` and 
call its `resolveRelation`.





[GitHub] spark issue #14816: [SPARK-17245] [SQL] [BRANCH-1.6] Do not rely on Hive's s...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14816
  
**[Test build #64443 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64443/consoleFull)**
 for PR 14816 at commit 
[`8b57886`](https://github.com/apache/spark/commit/8b57886c0489c759f0308a7b104f5b058204cdcd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14821: [SPARK-17250] [SQL] Remove HiveClient and setCurr...

2016-08-25 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/14821

[SPARK-17250] [SQL] Remove HiveClient and setCurrentDatabase from 
HiveSessionCatalog

### What changes were proposed in this pull request?
This is the first step to remove `HiveClient` from `HiveSessionState`. In 
the metastore interaction, we always use fully qualified names when 
accessing or operating on a table; that is, we always specify the database. Thus, 
it is not necessary to use `HiveClient` to change the active database in the Hive 
metastore.

In `HiveSessionCatalog`, `setCurrentDatabase` is the only function that 
uses `HiveClient`. Thus, we can remove `HiveClient` from it after removing `setCurrentDatabase`.
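
A minimal sketch of why fully qualified names make a metastore-side "current database" unnecessary. The names `TableIdentifier` and `SessionCatalogSketch` here are toy definitions for illustration, not the Spark classes: the session tracks the current database itself and fills it in before any metastore call would be made.

```scala
// Toy identifier: a table name with an optional database (illustrative only).
final case class TableIdentifier(table: String, database: Option[String] = None)

// Hypothetical catalog that resolves the database on the session side.
final class SessionCatalogSketch(initialDb: String = "default") {
  private var currentDb: String = initialDb

  // Changing the current database only touches session-local state.
  def setCurrentDatabase(db: String): Unit = synchronized { currentDb = db }

  // Every identifier handed to the metastore would already carry its database.
  def qualify(name: TableIdentifier): TableIdentifier = synchronized {
    name.copy(database = Some(name.database.getOrElse(currentDb)))
  }
}
```

Under this assumption, a metastore client never needs to know which database is active, because no unqualified name reaches it.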

### How was this patch tested?
The existing test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark setCurrentDB

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14821.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14821


commit 1c9a1e3c608be72ca7c4203ecc0e80f15080eb80
Author: gatorsmile 
Date:   2016-08-26T00:51:57Z

fix







[GitHub] spark pull request #14753: [SPARK-17187][SQL] Supports using arbitrary Java ...

2016-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14753





[GitHub] spark pull request #14786: [SPARK-17212][SQL] TypeCoercion supports widening...

2016-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14786





[GitHub] spark issue #14786: [SPARK-17212][SQL] TypeCoercion supports widening conver...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/14786
  
thanks, merging to master!





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14537
  
You might have forgotten this comment: 
https://github.com/apache/spark/pull/14537#discussion_r76189474





[GitHub] spark issue #14807: [Deploy, Windows]Check before adding double quotes in sp...

2016-08-25 Thread qualiu
Github user qualiu commented on the issue:

https://github.com/apache/spark/pull/14807
  
@srowen @tsudukim @tritab @andrewor14 : Hello, I've updated the PR with a more 
conservative fix; please review it, thanks!  
I had not pushed [my former 
fix](https://github.com/qualiu/spark/tree/submit-cmd-all) until now, because 
the two fixes currently have the same effect and the former involves more 
line changes.





[GitHub] spark issue #14820: [SparkR][Minor] Fix example of spark.naiveBayes

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14820
  
**[Test build #64447 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64447/consoleFull)**
 for PR 14820 at commit 
[`607f117`](https://github.com/apache/spark/commit/607f1177cd7800c29ae29edc9548f820f589495d).





[GitHub] spark pull request #14820: [SparkR][Minor] Fix example of spark.naiveBayes

2016-08-25 Thread junyangq
GitHub user junyangq opened a pull request:

https://github.com/apache/spark/pull/14820

[SparkR][Minor] Fix example of spark.naiveBayes

## What changes were proposed in this pull request?

The original example doesn't work because the features are not categorical. 
This PR fixes this by changing to another dataset.

## How was this patch tested?

Manual test.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/junyangq/spark SPARK-FixNaiveBayes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14820.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14820


commit 607f1177cd7800c29ae29edc9548f820f589495d
Author: Junyang Qian 
Date:   2016-08-26T00:17:15Z

Fix example of naiveBayes.







[GitHub] spark issue #14710: [SPARK-16533][CORE] resolve deadlocking in driver when e...

2016-08-25 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/14710
  
Looks ok, a couple of minor suggestions that from my understanding should 
work now. I guess this is the next best thing without making all of these APIs 
properly asynchronous.

pinging @zsxwing also in case he wants to take a look.





[GitHub] spark pull request #14710: [SPARK-16533][CORE] resolve deadlocking in driver...

2016-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14710#discussion_r76348848
  
--- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala ---
@@ -269,20 +258,22 @@ private[spark] abstract class YarnSchedulerBackend(
   case AddWebUIFilter(filterName, filterParams, proxyBase) =>
 addWebUIFilter(filterName, filterParams, proxyBase)
 
-  case RemoveExecutor(executorId, reason) =>
+  case r @ RemoveExecutor(executorId, reason) =>
 logWarning(reason.toString)
-removeExecutor(executorId, reason)
+driverEndpoint.ask[Boolean](r).onFailure {
+  case e =>
+logError(s"Error requesting driver to remove executor $executorId for reason $reason")
+}
 }
 
 
 override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
   case r: RequestExecutors =>
 amEndpoint match {
   case Some(am) =>
-Future {
-  context.reply(am.askWithRetry[Boolean](r))
-} onFailure {
-  case NonFatal(e) =>
+am.ask[Boolean](r).andThen {
--- End diff --

Similarly here, could you replace `askAmExecutor` with 
`ThreadUtils.sameThreadExecutionContext` and get rid of another thread pool?





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14638
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14638
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64440/
Test PASSed.





[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14638
  
**[Test build #64440 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64440/consoleFull)**
 for PR 14638 at commit 
[`3c9adb3`](https://github.com/apache/spark/commit/3c9adb37f77165d78a3cdd159c554621ddb1985d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/14802
  
It would be great if we could reuse the code in `FileStreamSinkLog` for both 
`FileStreamSource` and `FileStreamSink`.





[GitHub] spark pull request #14710: [SPARK-16533][CORE] resolve deadlocking in driver...

2016-08-25 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14710#discussion_r76347979
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala ---
@@ -220,19 +225,13 @@ private[spark] class StandaloneAppClient(
 endpointRef: RpcEndpointRef,
 context: RpcCallContext,
 msg: T): Unit = {
-  // Create a thread to ask a message and reply with the result.  Allow thread to be
+  // Ask a message and create a thread to reply with the result.  Allow thread to be
   // interrupted during shutdown, otherwise context must be notified of NonFatal errors.
-  askAndReplyThreadPool.execute(new Runnable {
-override def run(): Unit = {
-  try {
-context.reply(endpointRef.askWithRetry[Boolean](msg))
-  } catch {
-case ie: InterruptedException => // Cancelled
-case NonFatal(t) =>
-  context.sendFailure(t)
-  }
-}
-  })
+  endpointRef.ask[Boolean](msg).andThen {
+case Success(b) => context.reply(b)
+case Failure(ie: InterruptedException) => // Cancelled
+case Failure(NonFatal(t)) => context.sendFailure(t)
+  }(askAndReplyExecutionContext)
--- End diff --

Do you need `askAndReplyExecutionContext` anymore? It seems now all the 
heavy lifting is being done in the RPC thread pool, and the `andThen` code 
could just use `ThreadUtils.sameThreadExecutionContext` since it doesn't do 
much.
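
A minimal sketch of the pattern being suggested, using only `scala.concurrent` from the standard library. The `sameThread` context and `ReplyContext` class are hypothetical stand-ins written for this example (they are not Spark's `ThreadUtils.sameThreadExecutionContext` or `RpcCallContext`): the lightweight `andThen` callback runs on whichever thread completes the future, so no dedicated thread pool is needed for it.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import scala.util.control.NonFatal

object AndThenSketch {
  // Runs each callback directly on the calling thread (stand-in for a same-thread context).
  val sameThread: ExecutionContext = new ExecutionContext {
    override def execute(runnable: Runnable): Unit = runnable.run()
    override def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // Hypothetical reply sink standing in for an RPC call context.
  final class ReplyContext {
    @volatile var replied: Option[Boolean] = None
    @volatile var failure: Option[Throwable] = None
    def reply(b: Boolean): Unit = { replied = Some(b) }
    def sendFailure(t: Throwable): Unit = { failure = Some(t) }
  }

  // Forwards the outcome of an already-asynchronous ask to the reply context.
  def askAndReply(ask: Future[Boolean], context: ReplyContext): Future[Boolean] =
    ask.andThen {
      case Success(b) => context.reply(b)
      case Failure(_: InterruptedException) => // cancelled, nothing to report
      case Failure(NonFatal(t)) => context.sendFailure(t)
    }(sameThread)
}
```

The trade-off in this sketch is that the callback must stay cheap and non-blocking, since it occupies the thread that completed the future.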





[GitHub] spark issue #14728: [SPARK-17165][SQL] FileStreamSource should not track the...

2016-08-25 Thread zsxwing
Github user zsxwing commented on the issue:

https://github.com/apache/spark/pull/14728
  
Looks pretty good. Just one comment about `Serializable`.





[GitHub] spark issue #14749: [SPARK-17182][SQL] Mark Collect as non-deterministic

2016-08-25 Thread liancheng
Github user liancheng commented on the issue:

https://github.com/apache/spark/pull/14749
  
@rxin It doesn't fail any tests. I found this issue while working on a related 
code path.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14537
  
**[Test build #64446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64446/consoleFull)**
 for PR 14537 at commit 
[`fc14e2d`](https://github.com/apache/spark/commit/fc14e2d95cb95becf90a38e91e7725e483bae835).





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread rajeshbalamohan
Github user rajeshbalamohan commented on the issue:

https://github.com/apache/spark/pull/14537
  
Fixed the test case name. I haven't changed the parquet code path as I 
wasn't sure whether it would break backward compatibility.





[GitHub] spark issue #14814: [SPARK-17242][Document]Update links of external dstream ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14814
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/6/
Test PASSed.





[GitHub] spark issue #14814: [SPARK-17242][Document]Update links of external dstream ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14814
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14814: [SPARK-17242][Document]Update links of external dstream ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14814
  
**[Test build #6 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/6/consoleFull)**
 for PR 14814 at commit 
[`46bf9ab`](https://github.com/apache/spark/commit/46bf9ab1acf8c1f3e18afe95d971f0cb66ed0c41).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14819
  
**[Test build #64445 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64445/consoleFull)**
 for PR 14819 at commit 
[`fda100f`](https://github.com/apache/spark/commit/fda100f3c42bf82c9d0accafc7230c906e0b8317).





[GitHub] spark issue #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14819
  
cc @JoshRosen 





[GitHub] spark pull request #14819: [SPARK-17246][SQL] Add BigDecimal literal

2016-08-25 Thread hvanhovell
GitHub user hvanhovell opened a pull request:

https://github.com/apache/spark/pull/14819

[SPARK-17246][SQL] Add BigDecimal literal

## What changes were proposed in this pull request?
This PR adds parser support for `BigDecimal` literals. If you append the 
suffix `BD` to a valid number, it will be interpreted as a `BigDecimal`; 
for example, `12.0E10BD` will be interpreted as a `BigDecimal` with scale -9 and 
precision 3. This is useful in situations where you need exact values.
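
The scale and precision quoted above can be checked with `java.math.BigDecimal` directly. The sketch below is only an illustration: `BigDecimalLiteralSketch.parse` is a hypothetical helper that strips the `BD` suffix, not the parser code added by this PR.

```scala
import java.math.BigDecimal

object BigDecimalLiteralSketch {
  // Hypothetical helper: drop the trailing BD suffix and parse the rest exactly.
  def parse(literal: String): BigDecimal = {
    require(literal.toUpperCase.endsWith("BD"), s"not a BD literal: $literal")
    new BigDecimal(literal.dropRight(2))
  }
}
```

For `12.0E10`, the unscaled value is 120 (three digits, hence precision 3) and the scale is 1 - 10 = -9, which matches the description.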

## How was this patch tested?
Added tests to `ExpressionParserSuite`, `ExpressionSQLBuilderSuite` and 
`SQLQueryTestSuite`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hvanhovell/spark SPARK-17246

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14819.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14819


commit fda100f3c42bf82c9d0accafc7230c906e0b8317
Author: Herman van Hovell 
Date:   2016-08-25T23:31:47Z

Add BigDecimal literal to parser.







[GitHub] spark issue #14809: [SPARK-17238][SQL] simplify the logic for converting dat...

2016-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/14809
  
Condition 2 is not always true when condition 1 is `true`. I found an 
exception. 
```Scala
  val schema = StructType(StructField("b", StringType, true) :: Nil)
  sparkSession.catalog.createExternalTable(
"createdParquetTable",
"parquet",
schema,
Map.empty[String, String])
```

I think this is a bug. Do you want to fix it in this PR? Or I can fix it in 
another PR?




