[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14803
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14805
  
**[Test build #64414 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64414/consoleFull)**
 for PR 14805 at commit 
[`92310d9`](https://github.com/apache/spark/commit/92310d91fa0c981d122a1a684a1dfd430f42db5e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14750
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14803
  
**[Test build #64405 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64405/consoleFull)**
 for PR 14803 at commit 
[`2771d71`](https://github.com/apache/spark/commit/2771d71898f187d479cdb0996c96494c0b53a344).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14766: [SPARK-17197] [ML] [PySpark] PySpark LiR/LoR supports tr...

2016-08-25 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/14766
  
Yes, thanks for review. Merged into master.





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14750
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64409/
Test FAILed.





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14750
  
**[Test build #64409 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64409/consoleFull)**
 for PR 14750 at commit 
[`6c9c130`](https://github.com/apache/spark/commit/6c9c1308051de27dae0fa147764399e5ebcff9f0).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14698: [SPARK-17061][SPARK-17093][SQL] `MapObjects` should make...

2016-08-25 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/14698
  
LGTM - merging to master/2.0. Thanks for working on this!





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14802
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64403/
Test PASSed.





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14802
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14802
  
**[Test build #64403 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64403/consoleFull)**
 for PR 14802 at commit 
[`0d9d1e6`](https://github.com/apache/spark/commit/0d9d1e6d59fb68996bf96b5238835a0718a8da1a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14805
  
**[Test build #64414 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64414/consoleFull)**
 for PR 14805 at commit 
[`92310d9`](https://github.com/apache/spark/commit/92310d91fa0c981d122a1a684a1dfd430f42db5e).





[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14805
  
OK, can you perhaps quickly search for other instances of the same in 
Python code? It's worth a skim if you're up for it.





[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14805
  
Jenkins test this please





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14800
  
LGTM as a targeted fix





[GitHub] spark issue #14805: [MINOR][DOCS] Fix minor typos in python example code

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14805
  
Can one of the admins verify this patch?





[GitHub] spark issue #14766: [SPARK-17197] [ML] [PySpark] PySpark LiR/LoR supports tr...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14766
  
OK, so it's just exposing an existing parameter to Python? Seems OK.





[GitHub] spark issue #14744: [SPARK-17178][SPARKR][SPARKSUBMIT] Allow to set sparkr s...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14744
  
**[Test build #64413 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64413/consoleFull)**
 for PR 14744 at commit 
[`bb75190`](https://github.com/apache/spark/commit/bb751907ea0a04af1e6fbf3943ce57aa6c21552b).





[GitHub] spark pull request #14805: [MINOR][DOCS] Fix minor typos in python example c...

2016-08-25 Thread silentsokolov
GitHub user silentsokolov opened a pull request:

https://github.com/apache/spark/pull/14805

[MINOR][DOCS] Fix minor typos in python example code

## What changes were proposed in this pull request?

Fix minor typos in the Python example code in the streaming programming guide.


## How was this patch tested?

N/A




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/silentsokolov/spark fix-typos

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14805.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14805


commit 92310d91fa0c981d122a1a684a1dfd430f42db5e
Author: Dmitriy Sokolov 
Date:   2016-08-25T08:57:03Z

Fix minor typos in python example code







[GitHub] spark issue #14766: [SPARK-17197] [ML] [PySpark] PySpark LiR/LoR supports tr...

2016-08-25 Thread yanboliang
Github user yanboliang commented on the issue:

https://github.com/apache/spark/pull/14766
  
@srowen Would you mind having a look at this one? Thanks!





[GitHub] spark pull request #14744: [SPARK-17178][SPARKR][SPARKSUBMIT] Allow to set s...

2016-08-25 Thread zjffdu
Github user zjffdu commented on a diff in the pull request:

https://github.com/apache/spark/pull/14744#discussion_r76204928
  
--- Diff: docs/configuration.md ---
@@ -1752,6 +1752,14 @@ showDF(properties, numRows = 200, truncate = FALSE)
 Executable for executing R scripts in client modes for driver. Ignored 
in cluster modes.
   
 
+
+  spark.r.shell.command
+  R
+  
+Executable for executing sparkR shell in client modes for driver. 
Ignored in cluster modes. It is the same as environment variable 
SPARKR_DRIVER_R, but take precedence over it.
+spark.r.shell.command is used for interactive mode of 
sparkR (sparkR shell) while spark.r.driver.command is used for the 
batch mode (running sparkR script).
--- End diff --

Got it  :smile:





[GitHub] spark issue #14663: [SPARK-17001] [ML] Enable standardScaler to standardize ...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14663
  
I'll go for this tomorrow if there are no other comments.





[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14804
  
Meh, it's still ambiguous and there's a defined way to disambiguate, so 
it's unfortunate, but I'm OK with a step towards consistency in any event.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14800
  
**[Test build #64412 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64412/consoleFull)**
 for PR 14800 at commit 
[`97dde82`](https://github.com/apache/spark/commit/97dde8292d04df37e0c96ce0a7198fe0da6403f2).





[GitHub] spark issue #14637: [SPARK-16967] move mesos to module

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14637
  
Nice one, LGTM. I'll leave it open for final comments until tomorrow.





[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/14804
  
I think 
[this answer](http://ux.stackexchange.com/questions/13815/files-size-units-kib-vs-kb-vs-kb)
 has a precise definition. AFAIK, in Spark the conversion is 1024-based whether 
the unit is written KB, K, or kb; KiB is not so commonly used. Since we usually 
treat everything as 1024-based, it might not be necessary to differentiate them.
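
To make the convention described in this comment concrete, here is a minimal Python sketch of a size parser that reads every suffix spelling as a power of 1024. The function name `parse_bytes_1024` and the suffix table are hypothetical and written only for illustration; this is not Spark's actual parsing code.

```python
import re

# Hypothetical suffix table: every spelling maps to a power of 1024,
# so "kb", "K" and "KiB" are all treated as the same unit.
_UNIT_EXPONENT = {
    "": 0, "b": 0,
    "k": 1, "kb": 1, "kib": 1,
    "m": 2, "mb": 2, "mib": 2,
    "g": 3, "gb": 3, "gib": 3,
}


def parse_bytes_1024(text: str) -> int:
    """Parse a size such as '500kb' into bytes, always using base 1024."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]*)\s*", text)
    if match is None:
        raise ValueError(f"not a size string: {text!r}")
    number, unit = match.groups()
    try:
        exponent = _UNIT_EXPONENT[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return int(number) * 1024 ** exponent


if __name__ == "__main__":
    for sample in ("500kb", "500KB", "500K", "500KiB"):
        print(sample, parse_bytes_1024(sample))
```

Under this convention all four spellings in the loop give the same byte count, which is the "no need to differentiate" position taken above.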





[GitHub] spark issue #10896: [SPARK-12978][SQL] Skip unnecessary final group-by when ...

2016-08-25 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/10896
  
@hvanhovell could you also give me comments on #13852?





[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work

2016-08-25 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/14433
  
That's a good point, actually. How about we use `args.primaryResource` or 
`args.isR`, which already exist in SparkSubmit?





[GitHub] spark issue #14786: [SPARK-17212][SQL] TypeCoercion supports widening conver...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14786
  
**[Test build #64411 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64411/consoleFull)**
 for PR 14786 at commit 
[`ab754fa`](https://github.com/apache/spark/commit/ab754fa8eb537dcd6ce3f4f3b256f0fba2f2fdcd).





[GitHub] spark pull request #14760: [SPARK-17193] [CORE] HadoopRDD NPE at DEBUG log l...

2016-08-25 Thread srowen
Github user srowen closed the pull request at:

https://github.com/apache/spark/pull/14760





[GitHub] spark issue #14760: [SPARK-17193] [CORE] HadoopRDD NPE at DEBUG log level wh...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14760
  
Merged to master/2.0





[GitHub] spark issue #14796: [SPARK-17229][SQL] PostgresDialect shouldn't widen float...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14796
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64402/
Test PASSed.





[GitHub] spark issue #14796: [SPARK-17229][SQL] PostgresDialect shouldn't widen float...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14796
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14796: [SPARK-17229][SQL] PostgresDialect shouldn't widen float...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14796
  
**[Test build #64402 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64402/consoleFull)**
 for PR 14796 at commit 
[`708343d`](https://github.com/apache/spark/commit/708343d59e238322be25751960bf6e4dca47d98b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #14744: [SPARK-17178][SPARKR][SPARKSUBMIT] Allow to set s...

2016-08-25 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/14744#discussion_r76203143
  
--- Diff: docs/configuration.md ---
@@ -1752,6 +1752,14 @@ showDF(properties, numRows = 200, truncate = FALSE)
 Executable for executing R scripts in client modes for driver. Ignored 
in cluster modes.
   
 
+
+  spark.r.shell.command
+  R
+  
+Executable for executing sparkR shell in client modes for driver. 
Ignored in cluster modes. It is the same as environment variable 
SPARKR_DRIVER_R, but take precedence over it.
+spark.r.shell.command is used for interactive mode of 
sparkR (sparkR shell) while spark.r.driver.command is used for the 
batch mode (running sparkR script).
--- End diff --

I think what I mean is
`spark.r.shell.command is used for interactive mode of SparkR 
while...`
or
`spark.r.shell.command is used for sparkR shell while...`
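
For readers skimming the quoted diff, the precedence it describes can be sketched in a few lines of Python. The function `resolve_r_shell_command` is hypothetical and only mirrors the documented rule (the `spark.r.shell.command` property wins over the `SPARKR_DRIVER_R` environment variable, and `R` is the default); it is not the code from the pull request.

```python
from typing import Mapping


def resolve_r_shell_command(conf: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Pick the executable for the sparkR shell in client mode.

    Order, as described in the quoted docs: the config property first,
    then the environment variable, then the default "R".
    Assumption of this sketch: empty strings count as "not set".
    """
    return conf.get("spark.r.shell.command") or env.get("SPARKR_DRIVER_R") or "R"


if __name__ == "__main__":
    env = {"SPARKR_DRIVER_R": "/usr/bin/R"}
    print(resolve_r_shell_command({"spark.r.shell.command": "/opt/R/bin/R"}, env))
    print(resolve_r_shell_command({}, env))
    print(resolve_r_shell_command({}, {}))
```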






[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14804
  
Ugh, yeah that's wrong in the sense that we are not showing MB, but MiB. 
I'd favor fixing the labels here and in Utils.bytesToString?

Then again, I see that we will also parse input of "500kb" as if it's 
"500KiB", using 1024 not 1000. That's wrong too really. But fixing it means a 
bit of a behavior change. We can support "500KB" or "500kb" but it would now 
mean 500*1000 bytes not 500*1024.

Well, maybe best to be consistently wrong than inconsistently wrong. Anyone 
feel at all inclined to take the hit in behavior change or just leave it?
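
A small Python sketch may help show the size of the discrepancy under discussion. The helper `format_bytes` below is hypothetical (it is not `Utils.bytesToString`); it formats the same byte count once with 1024-based IEC labels and once with 1000-based SI labels, and the last lines show what "500kb" would mean under each reading.

```python
def format_bytes(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count with IEC (1024-based) or SI (1000-based) labels."""
    base = 1024 if binary else 1000
    units = ["B", "KiB", "MiB", "GiB", "TiB"] if binary else ["B", "kB", "MB", "GB", "TB"]
    value = float(num_bytes)
    index = 0
    # Divide down until the value fits the current unit or units run out.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {units[index]}"


if __name__ == "__main__":
    size = 512 * 1024 * 1024
    print(format_bytes(size))                # 1024-based, labelled MiB
    print(format_bytes(size, binary=False))  # 1000-based, labelled MB
    # The behavior change mentioned above: "500kb" read as KiB vs. as kB.
    print(500 * 1024, 500 * 1000)
```

The same number of bytes comes out as 512.0 MiB or 536.9 MB depending on the base, so a 1024-based value carrying an "MB" label is off by roughly 5% at this scale.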





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14801
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64404/
Test FAILed.





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14801
  
Merged build finished. Test FAILed.





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14801
  
**[Test build #64404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64404/consoleFull)**
 for PR 14801 at commit 
[`c400c52`](https://github.com/apache/spark/commit/c400c5292a32549cea80861adfaefeb41f4d90b3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SQLFeatureNotSupportedException(val feature: String)`





[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/14804
  
The log shows memory in 1024-based MB, while the web UI uses 1000-based 
values, so the two differ slightly.

See `Utils#bytesToString`. I think we should unify them.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/14800
  
No problem, thanks for your attention :) Okay, I'll remove this.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14800
  
I am okay with either too. I apologise for the irrelevant comment, @maropu.





[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14804
  
KB = 1000 bytes, KiB = 1024 bytes. According to the suffixes we're using, 
1000 is correct at the moment. Is the display inconsistent with something else 
in the UI or logs?





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/14800
  
Yeah, I see. I also have no strong opinion on this, so either is okay with me.
For now, I'll remove the requirement. What do you think? cc: @HyukjinKwon 





[GitHub] spark pull request #14786: [SPARK-17212][SQL] TypeCoercion supports widening...

2016-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/14786#discussion_r76200868
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -134,6 +134,8 @@ object TypeCoercion {
   Some(DecimalPrecision.widerDecimalType(DecimalType.forType(t), d))
 case (_: FractionalType, _: DecimalType) | (_: DecimalType, _: FractionalType) =>
   Some(DoubleType)
+case (_: TimestampType, _: DateType) | (_: DateType, _: TimestampType) =>
--- End diff --

Ah, sure!





[GitHub] spark issue #14798: [SPARK-17231][CORE] Avoid building debug or trace log me...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14798
  
Seems fine to me. I think you'd be welcome to fix up the other log messages 
you see in these files to use {} placeholders, but that's entirely optional.
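
As an analogy only (the PR itself concerns SLF4J's `{}` placeholders in Java), the same deferred-formatting idea can be sketched with Python's standard `logging` module. The `Expensive` class and the logger name `placeholder_demo` are made up for this illustration. When the level is disabled, the placeholder form never renders its argument, while string concatenation builds the message before the call.

```python
import logging


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self):
        self.rendered = 0

    def __str__(self):
        self.rendered += 1
        return "expensive value"


logger = logging.getLogger("placeholder_demo")
logger.setLevel(logging.INFO)  # DEBUG messages are disabled

payload = Expensive()

# Deferred: the argument is only formatted if the record is actually emitted.
logger.debug("state: %s", payload)
deferred_renders = payload.rendered

# Eager: the message string is built before logger.debug is even called.
logger.debug("state: " + str(payload))
eager_renders = payload.rendered - deferred_renders
```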





[GitHub] spark pull request #14798: [SPARK-17231][CORE] Avoid building debug or trace...

2016-08-25 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/14798#discussion_r76200529
  
--- Diff: common/network-common/src/main/java/org/apache/spark/network/server/TransportChannelHandler.java ---
@@ -29,7 +29,7 @@
 import org.apache.spark.network.protocol.Message;
 import org.apache.spark.network.protocol.RequestMessage;
 import org.apache.spark.network.protocol.ResponseMessage;
-import org.apache.spark.network.util.NettyUtils;
+import static org.apache.spark.network.util.NettyUtils.getRemoteAddress;
--- End diff --

Although I think we sort of avoid static imports as a rule, this seems like 
a good reason to use them.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14800
  
True, but, in the with-replacement case, you're no longer selecting a 
subset to begin with, because an element can appear twice. "Sample" does 
generally mean "take a smaller set" but it also means things like "sampling 
from a distribution". 

I wouldn't feel strongly about it except that we're taking away behavior 
that worked fine. The RDD API, for example, doesn't enforce that the rate 
must be <= 1 (even without replacement, which is wrong).





[GitHub] spark issue #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14804
  
**[Test build #64410 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64410/consoleFull)**
 for PR 14804 at commit 
[`fe78ecc`](https://github.com/apache/spark/commit/fe78ecc2156ff6e842bd22b6b4419f0219a860b6).





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/8880
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64401/
Test PASSed.





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/8880
  
Merged build finished. Test PASSed.





[GitHub] spark pull request #14804: [MINOR][Web UI] Correctly convert bytes in web UI

2016-08-25 Thread jerryshao
GitHub user jerryshao opened a pull request:

https://github.com/apache/spark/pull/14804

[MINOR][Web UI] Correctly convert bytes in web UI

## What changes were proposed in this pull request?

Byte conversion in the web UI should be 1024-based, not 1000-based.

## How was this patch tested?

Manually verified.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jerryshao/apache-spark correct-convert-bytes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14804.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14804


commit fe78ecc2156ff6e842bd22b6b4419f0219a860b6
Author: jerryshao 
Date:   2016-08-25T08:01:12Z

Correctly convert the bytes in UI







[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/8880
  
**[Test build #64401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64401/consoleFull)**
 for PR 8880 at commit 
[`beb4526`](https://github.com/apache/spark/commit/beb45266872cd52f2a64496056989237477305b6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14783: SPARK-16785 R dapply doesn't return array or raw columns

2016-08-25 Thread clarkfitzg
Github user clarkfitzg commented on the issue:

https://github.com/apache/spark/pull/14783
  
Not sure why these timings are so bad. Found out today that by using bytes 
and calling directly into Java's `org.apache.spark.api.r.RRDD` these can be 
improved by 2 orders of magnitude.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/14800
  
In statistical terms, sampling means selecting a `subset` of the whole 
data set, so I think requiring the sample rate to be <= 1 is more reasonable.
See: https://en.wikipedia.org/wiki/Sampling_(statistics)





[GitHub] spark issue #14750: [SPARK-17183][SQL] put hive serde table schema to table ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14750
  
**[Test build #64409 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64409/consoleFull)**
 for PR 14750 at commit 
[`6c9c130`](https://github.com/apache/spark/commit/6c9c1308051de27dae0fa147764399e5ebcff9f0).





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14800
  
@srowen Actually, we already cap the rate at 100% when replacement is 
disabled, so I suggested doing the same when it is enabled. Yes, it seems 
unrelated to the bug this PR is trying to fix. I apologise for the 
irrelevant comment.

Ah, it is capped at 100% when replacement is disabled because exceeding 
that would require replacement. I see. I thought sampling meant taking a 
representative smaller population from a larger one, and therefore that a 
rate of 200% was not sensible.





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/14802
  
This looks somewhat similar to #13513.





[GitHub] spark issue #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() s...

2016-08-25 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/14579
  
I like it personally - if no one has a good reason why not it seems like a 
very reasonable approach.





[GitHub] spark issue #14786: [SPARK-17212][SQL] TypeCoercion supports widening conver...

2016-08-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/14786
  
MySQL and PostgreSQL both support this:

**MySQL**

- Greatest/least

```sql
mysql> SELECT GREATEST(CAST("1990-02-24 12:00:00" AS DATETIME), CAST("1990-02-25" AS DATE));
+-------------------------------------------------------------------------------+
| GREATEST(CAST("1990-02-24 12:00:00" AS DATETIME), CAST("1990-02-25" AS DATE)) |
+-------------------------------------------------------------------------------+
| 1990-02-25 00:00:00                                                           |
+-------------------------------------------------------------------------------+
```

- Union

```sql
mysql> SELECT CAST("1990-02-24 12:00:00" AS DATETIME) UNION SELECT CAST("1990-02-24" AS DATE);
+-----------------------------------------+
| CAST("1990-02-24 12:00:00" AS DATETIME) |
+-----------------------------------------+
| 1990-02-24 12:00:00                     |
| 1990-02-24 00:00:00                     |
+-----------------------------------------+
```

**PostgreSQL**

- Greatest/least

```sql
postgres=# SELECT GREATEST(CAST('1990-02-24 12:00:00' AS TIMESTAMP), CAST('1990-02-25' AS DATE));
      greatest
---------------------
 1990-02-25 00:00:00
(1 row)
```

- Union

```sql
postgres=# SELECT CAST('1990-02-24 12:00:00' AS TIMESTAMP) UNION SELECT CAST('1990-02-24' AS DATE);
      timestamp
---------------------
 1990-02-24 00:00:00
 1990-02-24 12:00:00
(2 rows)
```
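
The widening shown in these results (a date promoted to a timestamp at midnight before comparison or union) can be sketched in plain Python. This is only an illustration of the semantics, not Spark's `TypeCoercion` code; `widen_to_timestamp` and `greatest` are hypothetical helper names.

```python
from datetime import date, datetime, time


def widen_to_timestamp(value):
    """Promote a date to a timestamp at midnight; leave timestamps unchanged."""
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        return value
    if isinstance(value, date):
        return datetime.combine(value, time.min)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def greatest(*values):
    """Compare values after widening them all to timestamps."""
    return max(widen_to_timestamp(v) for v in values)


# Mirrors the GREATEST query above.
result = greatest(datetime(1990, 2, 24, 12, 0, 0), date(1990, 2, 25))

# Mirrors the UNION query above: both rows end up as timestamps.
union_rows = sorted({
    widen_to_timestamp(datetime(1990, 2, 24, 12, 0, 0)),
    widen_to_timestamp(date(1990, 2, 24)),
})
```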





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14801
  
Can we avoid introducing new exception types? It is super annoying to match 
those in Python.






[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14800
  
Also, is it really necessary to limit the sample rate to be <= 1? It's not 
incoherent to want to sample 200% of a data set if it is with replacement.  
You'd just be generating a data set 2x the size drawn from the same empirical 
distribution.
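
A rough Python sketch of that point follows. It is not how Spark implements sampling; `sample_with_replacement` is a hypothetical helper that draws a fixed number of items, chosen here to keep the example deterministic in size. With replacement, a fraction of 2.0 simply yields twice as many draws from the same values.

```python
import random


def sample_with_replacement(data, fraction, seed=0):
    """Draw round(fraction * len(data)) items uniformly, with replacement."""
    rng = random.Random(seed)
    k = round(fraction * len(data))
    return rng.choices(data, k=k)


data = list(range(100))
doubled = sample_with_replacement(data, 2.0)  # 200 draws from 100 values
halved = sample_with_replacement(data, 0.5)   # 50 draws from 100 values
```

Without replacement the same request could not be satisfied, since there are only 100 distinct elements to draw.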





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/8880
  
**[Test build #64408 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64408/consoleFull)**
 for PR 8880 at commit 
[`9f958a4`](https://github.com/apache/spark/commit/9f958a4847af46de18befaede4d08093fe11416f).





[GitHub] spark issue #8880: [SPARK-5682][Core] Add encrypted shuffle in spark

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/8880
  
**[Test build #64407 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64407/consoleFull)**
 for PR 8880 at commit 
[`167d474`](https://github.com/apache/spark/commit/167d47488d9f882ea3baca25e6d7b5656f71babb).





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14617
  
**[Test build #64406 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64406/consoleFull)**
 for PR 14617 at commit 
[`838840d`](https://github.com/apache/spark/commit/838840dc3e40b8b10a111d343329f735e76fad36).





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14800
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64400/
Test PASSed.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14800
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14800: [SPARK-15382][SQL] Fix a bug in sampling with replacemen...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14800
  
**[Test build #64400 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64400/consoleFull)**
 for PR 14800 at commit 
[`81c41d5`](https://github.com/apache/spark/commit/81c41d5c92dd503880fa8ff641743cce25e77514).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2016-08-25 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/14617
  
@mallman I changed the UI based on your comment. Here is the new version, 
with on-heap and off-heap memory usage separated into two columns:

![screen shot 2016-08-25 at 3 28 31 pm](https://cloud.githubusercontent.com/assets/850797/17960463/c64e32b0-6ad8-11e6-9afa-5f3c6bffa68e.png)






[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14803
  
**[Test build #64405 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64405/consoleFull)**
 for PR 14803 at commit 
[`2771d71`](https://github.com/apache/spark/commit/2771d71898f187d479cdb0996c96494c0b53a344).





[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-08-25 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/14803
  
cc @marmbrus 





[GitHub] spark issue #14433: [SPARK-16829][SparkR]:sparkR sc.setLogLevel doesn't work

2016-08-25 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/14433
  
It feels like some overkill unless there are going to be more uses for 
changing logic based on whether it's running a shell. It seems not so bad to 
define `setRootLevel` in Scala as an alias when in the shell, or define 
something in SparkR, or just change the log message to note the two 
possibilities. Is there more need for this logic?





[GitHub] spark pull request #14803: [SPARK-17153][SQL] Should read partition data whe...

2016-08-25 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/14803

[SPARK-17153][SQL] Should read partition data when reading new files in 
filestream without globbing

## What changes were proposed in this pull request?

When reading a file stream from a non-globbing path, the results contain 
`null` for every partition column. E.g.,

case class A(id: Int, value: Int)
val data = spark.createDataset(Seq(
  A(1, 1), 
  A(2, 2), 
  A(2, 3))
) 
val url = "/tmp/test"
data.write.partitionBy("id").parquet(url)
spark.read.parquet(url).show

+-----+---+
|value| id|
+-----+---+
|    2|  2|
|    3|  2|
|    1|  1|
+-----+---+

val s = 
spark.readStream.schema(spark.read.load(url).schema).parquet(url)
s.writeStream.queryName("test").format("memory").start()

sql("SELECT * FROM test").show

+-----+----+
|value|  id|
+-----+----+
|    2|null|
|    3|null|
|    1|null|
+-----+----+

## How was this patch tested?

Jenkins tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 filestreamsource-option

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14803.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14803


commit 2771d71898f187d479cdb0996c96494c0b53a344
Author: Liang-Chi Hsieh 
Date:   2016-08-25T07:13:20Z

Pass path as basePath for partitionSpec creation if path is not globbing.
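
To make the commit message concrete, here is a minimal Python sketch of why the base path matters for partition discovery. It is a simplified model, not Spark's implementation: the helper `partition_values` and the file name `part-00000.parquet` are hypothetical, and it assumes partition columns are read from `name=value` directory segments found between the base path and the file.

```python
from pathlib import PurePosixPath


def partition_values(file_path, base_path):
    """Collect name=value directory segments between base_path and the file.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(base_path))
    values = {}
    # Only directories below base_path are inspected, never the file name.
    for segment in relative.parent.parts:
        name, sep, value = segment.partition("=")
        if sep:
            values[name] = value
    return values


new_file = "/tmp/test/id=1/part-00000.parquet"

# Base path is the table root: the id=1 directory is still visible.
with_base = partition_values(new_file, "/tmp/test")

# Base path is the leaf directory: no name=value segment is left to parse.
without_base = partition_values(new_file, "/tmp/test/id=1")
```

In this model the `id` value is recovered only when the table root is used as the base path, which would be consistent with the all-`null` `id` column in the streaming output above.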







[GitHub] spark issue #11119: [SPARK-10780][ML][WIP] Add initial model to kmeans

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11119
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64398/
Test PASSed.





[GitHub] spark issue #11119: [SPARK-10780][ML][WIP] Add initial model to kmeans

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/11119
  
Build finished. Test PASSed.





[GitHub] spark issue #11119: [SPARK-10780][ML][WIP] Add initial model to kmeans

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/11119
  
**[Test build #64398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64398/consoleFull)**
 for PR 11119 at commit 
[`c40192b`](https://github.com/apache/spark/commit/c40192b0579080f4af572cf6d12bf37942c03866).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14802
  
**[Test build #64403 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64403/consoleFull)**
 for PR 14802 at commit 
[`0d9d1e6`](https://github.com/apache/spark/commit/0d9d1e6d59fb68996bf96b5238835a0718a8da1a).





[GitHub] spark issue #14801: [SPARK-17234] [SQL] Table Existence Checking when Index ...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14801
  
**[Test build #64404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64404/consoleFull)**
 for PR 14801 at commit 
[`c400c52`](https://github.com/apache/spark/commit/c400c5292a32549cea80861adfaefeb41f4d90b3).





[GitHub] spark pull request #14802: [SPARK-17235][SQL] Support purging of old logs in...

2016-08-25 Thread petermaxlee
Github user petermaxlee commented on a diff in the pull request:

https://github.com/apache/spark/pull/14802#discussion_r76191571
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLogSuite.scala
 ---
@@ -155,8 +174,8 @@ class HDFSMetadataLogSuite extends SparkFunSuite with 
SharedSQLContext {
 }
   }
 
-
-  def testManager(basePath: Path, fm: FileManager): Unit = {
+  /** Basic test case for [[FileManager]] implementation. */
+  private def testFileManager(basePath: Path, fm: FileManager): Unit = {
--- End diff --

I renamed this because initially I thought it's a noun meaning "manager for 
testing", rather than "to test the file manager".






[GitHub] spark pull request #14802: [SPARK-17235][SQL] Support purging of old logs in...

2016-08-25 Thread petermaxlee
GitHub user petermaxlee opened a pull request:

https://github.com/apache/spark/pull/14802

[SPARK-17235][SQL] Support purging of old logs in MetadataLog

## What changes were proposed in this pull request?
This patch adds a purge interface to MetadataLog, and an implementation in 
HDFSMetadataLog. The purge function is currently unused, but I will use it to 
purge old execution and file source logs in follow-up patches. These changes 
are required in a production structured streaming job that runs for a long 
period of time.

## How was this patch tested?
Added a unit test case in HDFSMetadataLogSuite.
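
As a rough illustration of the purge interface described above, the following Python sketch models a metadata log keyed by batch id. The class `InMemoryMetadataLog` and its method names are hypothetical stand-ins, not the Scala API in the patch, and treating the threshold as exclusive (entries below it are dropped, the threshold itself is kept) is an assumption of this sketch.

```python
class InMemoryMetadataLog:
    """Toy metadata log: maps a batch id to that batch's metadata.

    Hypothetical illustration; not the HDFSMetadataLog implementation.
    """

    def __init__(self):
        self._entries = {}

    def add(self, batch_id, metadata):
        # Refuse to overwrite an existing batch so each entry is written once.
        if batch_id in self._entries:
            return False
        self._entries[batch_id] = metadata
        return True

    def get(self, batch_id):
        return self._entries.get(batch_id)

    def batch_ids(self):
        return sorted(self._entries)

    def purge(self, threshold_batch_id):
        # Assumption: drop every entry strictly older than the threshold.
        stale = [b for b in self._entries if b < threshold_batch_id]
        for batch_id in stale:
            del self._entries[batch_id]


log = InMemoryMetadataLog()
for batch_id in range(5):
    log.add(batch_id, f"batch{batch_id}")

log.purge(3)
remaining = log.batch_ids()
```

A long-running job could call something like `purge` periodically so that the log does not grow without bound, which is the motivation given in the description.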


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/petermaxlee/spark SPARK-17235

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14802.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14802


commit 0d9d1e6d59fb68996bf96b5238835a0718a8da1a
Author: petermaxlee 
Date:   2016-08-25T07:11:47Z

[SPARK-17235][SQL] Support purging of old logs in MetadataLog







[GitHub] spark issue #14802: [SPARK-17235][SQL] Support purging of old logs in Metada...

2016-08-25 Thread petermaxlee
Github user petermaxlee commented on the issue:

https://github.com/apache/spark/pull/14802
  
@tdas and @zsxwing can you take a look at this? It's a pretty simple change.






[GitHub] spark pull request #14801: [SPARK-17234] [SQL] Table Existence Checking when...

2016-08-25 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/14801

[SPARK-17234] [SQL] Table Existence Checking when Index Table with the Same 
Name Exists

### What changes were proposed in this pull request?
Hive Index tables are not supported by Spark SQL. Thus, we issue an 
exception when users try to access Hive Index tables. When the internal 
function `tableExists` tries to access Hive Index tables, it always gets the 
same error message: ```Hive index table is not supported```. This message could 
be confusing to users, since their SQL operations could be completely unrelated 
to Hive Index tables. For example, when users try to alter a table to a new 
name and there exists an index table with the same name, the expected exception 
should be a `TableAlreadyExistsException`.

This PR made the following changes:
- Introduced a new `AnalysisException` type: 
`SQLFeatureNotSupportedException`. When users try to access an `Index Table`, 
we will issue a `SQLFeatureNotSupportedException`.
- `tableExists` returns `true` when hitting a 
`SQLFeatureNotSupportedException` and the feature is `Hive index table`.
- Add a `requireTableNotExists` check to `SessionCatalog`'s 
`createTable` API; otherwise, the current implementation relies on Hive's 
internal check.
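
The existence check described in the list above can be sketched in Python as follows. This is only a model of the control flow, not the Scala code in the PR: the exception classes are re-declared locally, and `fake_get_table`, the table names, and the `feature` attribute are hypothetical.

```python
class AnalysisException(Exception):
    """Local stand-in for the base analysis error."""


class NoSuchTableException(AnalysisException):
    """Raised by the toy catalog when a name is unknown."""


class SQLFeatureNotSupportedException(AnalysisException):
    """Raised when an object exists but uses an unsupported feature."""

    def __init__(self, feature):
        super().__init__(f"{feature} is not supported")
        self.feature = feature


def table_exists(get_table, name):
    """Return True if the name is taken, even by an unsupported index table."""
    try:
        get_table(name)
        return True
    except NoSuchTableException:
        return False
    except SQLFeatureNotSupportedException as error:
        # An index table cannot be read, but its name is still occupied.
        if error.feature == "Hive index table":
            return True
        raise


def fake_get_table(name):
    # Hypothetical catalog contents used only for this example.
    catalog = {"t1": "table", "idx": "index"}
    if name not in catalog:
        raise NoSuchTableException(name)
    if catalog[name] == "index":
        raise SQLFeatureNotSupportedException("Hive index table")
    return name


regular = table_exists(fake_get_table, "t1")
index = table_exists(fake_get_table, "idx")
missing = table_exists(fake_get_table, "nope")
```

With a check of this shape, a rename onto the name of an existing index table can be reported as "already exists" instead of surfacing the unrelated "not supported" message.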

### How was this patch tested?
Added a test case

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark tableExists

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14801.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14801


commit 1af428b68c4341192bf8f66af7c434a7b89be61d
Author: gatorsmile 
Date:   2016-08-25T06:26:00Z

fix

commit 664d6f1caa9b3d62eafbddb292991def722910ae
Author: gatorsmile 
Date:   2016-08-25T06:34:16Z

improve test cases

commit c400c5292a32549cea80861adfaefeb41f4d90b3
Author: gatorsmile 
Date:   2016-08-25T07:12:57Z

fix







[GitHub] spark pull request #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14746#discussion_r76190937
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -105,7 +105,13 @@ case class CreateViewCommand(
 }
 val sessionState = sparkSession.sessionState
 
-if (isTemporary) {
+// 1) CREATE VIEW: create a temp view when users explicitly specify 
the keyword TEMPORARY;
+// otherwise, create a permanent view no matter 
whether the temporary view
+// with the same name exists or not.
+// 2) ALTER VIEW: alter the temporary view if the temp view exists; 
otherwise, try to alter
--- End diff --

question: how can you tell whether it's CREATE VIEW or ALTER VIEW?





[GitHub] spark pull request #14753: [SPARK-17187][SQL] Supports using arbitrary Java ...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14753#discussion_r76189865
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/TypedImperativeAggregateSuite.scala
 ---
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import java.io.{ByteArrayInputStream, ByteArrayOutputStream, 
DataInputStream, DataOutputStream}
+
+import org.apache.spark.sql.TypedImperativeAggregateSuite.TypedMax
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.{BoundReference, 
Expression, GenericMutableRow, SpecificMutableRow}
+import 
org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate
+import org.apache.spark.sql.execution.aggregate.SortAggregateExec
+import org.apache.spark.sql.expressions.Window
+import org.apache.spark.sql.functions._
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.{AbstractDataType, BinaryType, DataType, 
IntegerType, LongType}
+
+class TypedImperativeAggregateSuite extends QueryTest with 
SharedSQLContext {
+
+  import testImplicits._
+
+  private val random = new java.util.Random()
+
+  private val data = (0 until 1000).map { _ =>
+(random.nextInt(10), random.nextInt(100))
+  }
+
+  test("aggregate with object aggregate buffer") {
+val agg = new TypedMax(BoundReference(0, IntegerType, nullable = 
false))
+
+val group1 = (0 until data.length / 2)
+val group1Buffer = agg.createAggregationBuffer()
+group1.foreach { index =>
+  val input = InternalRow(data(index)._1, data(index)._2)
+  agg.update(group1Buffer, input)
+}
+
+val group2 = (data.length / 2 until data.length)
+val group2Buffer = agg.createAggregationBuffer()
+group2.foreach { index =>
+  val input = InternalRow(data(index)._1, data(index)._2)
+  agg.update(group2Buffer, input)
+}
+
+val mergeBuffer = agg.createAggregationBuffer()
+agg.merge(mergeBuffer, group1Buffer)
+agg.merge(mergeBuffer, group2Buffer)
+
+assert(mergeBuffer.value == data.map(_._1).max)
+assert(agg.eval(mergeBuffer) == data.map(_._1).max)
+
+// Tests low level eval(row: InternalRow) API.
+val row = new GenericMutableRow(Array(mergeBuffer): Array[Any])
+
+// Evaluates directly on row consist of aggregation buffer object.
+assert(agg.eval(row) == data.map(_._1).max)
+  }
+
+  test("supports SpecificMutableRow as mutable row") {
+val aggregationBufferSchema = Seq(IntegerType, LongType, BinaryType, 
IntegerType)
+val aggBufferOffset = 2
+val buffer = new SpecificMutableRow(aggregationBufferSchema)
+val agg = new TypedMax(BoundReference(ordinal = 1, dataType = 
IntegerType, nullable = false))
+  .withNewMutableAggBufferOffset(aggBufferOffset)
+
+agg.initialize(buffer)
+data.foreach { kv =>
+  val input = InternalRow(kv._1, kv._2)
+  agg.update(buffer, input)
+}
+assert(agg.eval(buffer) == data.map(_._2).max)
+  }
+
+  test("dataframe aggregate with object aggregate buffer, should not use 
HashAggregate") {
+val df = data.toDF("a", "b")
+val max = new TypedMax($"a".expr)
+
+// Always uses SortAggregateExec
+val sparkPlan = 
df.select(Column(max.toAggregateExpression())).queryExecution.sparkPlan
+assert(sparkPlan.isInstanceOf[SortAggregateExec])
+  }
+
+  test("dataframe aggregate with object aggregate buffer, no group by") {
+val df = data.toDF("key", "value").coalesce(2)
+val query = df.select(typedMax($"key"), count($"key"), 
typedMax($"value"), count($"value"))
+val maxKey = data.map(_._1).max
+val countKey = data.size
+val maxValue = data.map(_._2).max
+val countValue = data.size
+val expected = Seq(Row(maxKey, 

[GitHub] spark issue #13780: [SPARK-16063][SQL] Add storageLevel to Dataset

2016-08-25 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/13780
  
ping @rxin @marmbrus @davies @gatorsmile for comment on the Python storage 
level issue I mention at 
https://github.com/apache/spark/pull/13780#discussion_r67833027





[GitHub] spark pull request #14537: [SPARK-16948][SQL] Querying empty partitioned orc...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14537#discussion_r76189706
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala ---
@@ -372,6 +373,40 @@ class OrcQuerySuite extends QueryTest with 
BeforeAndAfterAll with OrcTest {
 }
   }
 
+  test("SPARK-16948. Check empty orc tables in ORC") {
--- End diff --

how about `support empty orc table when converting hive serde table to data 
source table`





[GitHub] spark issue #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/cache() s...

2016-08-25 Thread MLnick
Github user MLnick commented on the issue:

https://github.com/apache/spark/pull/14579
  
@nchammas @holdenk @davies @rxin how about the approach of @MechCoder in 
https://github.com/apache/spark/pull/14579#discussion_r74813935?

I think this will work well, so we could raise an error to prevent (almost 
all I think) usages outside of the intended pattern of `with some_rdd.cache() 
as x:` or `with some_rdd_already_cached as x:` 
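
One way to read the pattern above is sketched below in Python. It uses a toy class instead of a real RDD, so `ToyRDD` and its methods are hypothetical, and raising `RuntimeError` when entering an uncached object is an assumption about the proposed check, since the referenced review comment is not reproduced in this thread.

```python
class ToyRDD:
    """Minimal stand-in for an RDD; it only tracks data and a cached flag."""

    def __init__(self, data):
        self.data = list(data)
        self.is_cached = False

    def cache(self):
        self.is_cached = True
        return self

    def unpersist(self):
        self.is_cached = False
        return self

    def map(self, func):
        # Transformations return a new, uncached object.
        return ToyRDD(func(x) for x in self.data)

    def __enter__(self):
        # Assumed guard: only an already cached object may be used in `with`.
        if not self.is_cached:
            raise RuntimeError("enter a with block only on a cached RDD")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Release the cache when the block ends, even after an exception.
        self.unpersist()
        return False


base = ToyRDD(range(3))

# Intended pattern: cache on entry, unpersist automatically on exit.
with base.cache() as cached:
    doubled = cached.map(lambda x: x * 2).data
still_cached = base.is_cached

# Unintended pattern: a derived object is not cached, so entry is rejected.
try:
    with base.map(lambda x: x) as derived:
        rejected = False
except RuntimeError:
    rejected = True
```

Because every transformation in this toy returns a fresh uncached object, the guard in `__enter__` is enough to reject most uses outside the two patterns named above.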





[GitHub] spark pull request #14537: [SPARK-16948][SQL] Querying empty partitioned orc...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14537#discussion_r76189474
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileFormat.scala ---
@@ -54,10 +57,12 @@ class OrcFileFormat extends FileFormat with 
DataSourceRegister with Serializable
   sparkSession: SparkSession,
   options: Map[String, String],
   files: Seq[FileStatus]): Option[StructType] = {
-OrcFileOperator.readSchema(
-  files.map(_.getPath.toUri.toString),
-  Some(sparkSession.sessionState.newHadoopConf())
-)
+// Safe to ignore FileNotFoundException in case no files are found.
+val schema = Try(OrcFileOperator.readSchema(
--- End diff --

@rajeshbalamohan is this change unnecessary for this PR? If so, I'd like to 
revert it to make the PR as small as possible.





[GitHub] spark pull request #14537: [SPARK-16948][SQL] Querying empty partitioned orc...

2016-08-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14537#discussion_r76189247
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -237,21 +237,27 @@ private[hive] class 
HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
   new Path(metastoreRelation.catalogTable.storage.locationUri.get),
   partitionSpec)
 
-val inferredSchema = if (fileType.equals("parquet")) {
-  val inferredSchema =
-defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles())
-  inferredSchema.map { inferred =>
-ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, 
inferred)
-  }.getOrElse(metastoreSchema)
-} else {
-  defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles()).get
+val schema = fileType match {
+  case "parquet" =>
+val inferredSchema =
+  defaultSource.inferSchema(sparkSession, options, 
fileCatalog.allFiles())
+
+// For Parquet, get correct schema by merging Metastore schema 
data types
--- End diff --

Do we have a test for this feature? I think we should make them consistent, 
i.e. parquet conversions should also use the metastore schema.

cc @yhuai @liancheng 





[GitHub] spark pull request #14579: [SPARK-16921][PYSPARK] RDD/DataFrame persist()/ca...

2016-08-25 Thread MLnick
Github user MLnick commented on a diff in the pull request:

https://github.com/apache/spark/pull/14579#discussion_r76189167
  
--- Diff: python/pyspark/rdd.py ---
@@ -188,6 +188,12 @@ def __init__(self, jrdd, ctx, 
jrdd_deserializer=AutoBatchedSerializer(PickleSeri
 self._id = jrdd.id()
 self.partitioner = None
 
+def __enter__(self):
--- End diff --

hmmm, yes this does happen to work, because most operations boil down to 
something like `mapPartitions` which creates a new `PipelinedRDD` which is not 
cached, or a new `RDD` which is again not cached.

I think it will work for `DataFrame` too for a similar reason - most 
operations return a new `DataFrame` instance.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64399/
Test PASSed.





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14537
  
**[Test build #64399 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64399/consoleFull)**
 for PR 14537 at commit 
[`6ff7e5d`](https://github.com/apache/spark/commit/6ff7e5d50de530a71df5c4a4b220a8119ca3a3f6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14698: [SPARK-17061][SPARK-17093][SQL] `MapObjects` should make...

2016-08-25 Thread lw-lin
Github user lw-lin commented on the issue:

https://github.com/apache/spark/pull/14698
  
Thanks @hvanhovell  for the review!
This patch has been updated.





[GitHub] spark issue #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ALTER V...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14746
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64396/
Test PASSed.





[GitHub] spark issue #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ALTER V...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14746
  
Merged build finished. Test PASSed.





[GitHub] spark issue #14746: [SPARK-17180] [SQL] Fix View Resolution Order in ALTER V...

2016-08-25 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/14746
  
**[Test build #64396 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64396/consoleFull)**
 for PR 14746 at commit 
[`c5add2c`](https://github.com/apache/spark/commit/c5add2cbbcc3cbbce1ff09155da27b145c204ee1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #14762: [SPARK-16962][CORE][SQL] Fix misaligned record accesses ...

2016-08-25 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/14762
  
Does your change have any performance impact?





[GitHub] spark pull request #14790: [SPARK-17215][SQL] Method `SQLContext.parseDataTy...

2016-08-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14790





[GitHub] spark issue #14537: [SPARK-16948][SQL] Querying empty partitioned orc tables...

2016-08-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/14537
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64397/
Test PASSed.




