[GitHub] spark pull request: [SPARK-2993] [MLLib] colStats (wrapper around ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1911#issuecomment-52012907
  
QA results for PR 1911:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class MultivariateStatisticalSummarySerialized(val summary: MultivariateStatisticalSummary)
class MultivariateStatisticalSummary(object):

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18414/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: Clean unused code in SortShuffleWriter

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1882#issuecomment-52013113
  
QA results for PR 1882:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18415/consoleFull





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1884#issuecomment-52013158
  
QA results for PR 1884:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18416/consoleFull





[GitHub] spark pull request: [WIP] [SPARK-2468] Netty based block server / ...

2014-08-13 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52013277
  
Jenkins, retest this please.





[GitHub] spark pull request: [WIP] [SPARK-2468] Netty based block server / ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52013347
  
QA results for PR 1907:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18410/consoleFull





[GitHub] spark pull request: [WIP] [SPARK-2468] Netty based block server / ...

2014-08-13 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52013468
  
Jenkins, retest this please.





[GitHub] spark pull request: [WIP] [SPARK-2468] Netty based block server / ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52013487
  
QA tests have started for PR 1907. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18419/consoleFull





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread colorant
Github user colorant commented on the pull request:

https://github.com/apache/spark/pull/1884#issuecomment-52013518
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-2986] [SQL] fixed: setting properties d...

2014-08-13 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/1904#issuecomment-52013574
  
+@liancheng, @marmbrus I think this is a good catch. `SparkSQLCLIDriver` 
currently resorts to `SetProcessor` (of Hive) for all `set` commands, but 
with this PR they will go into the `SparkCliDriver` and eventually resort to 
`SetCommand` (of SparkSQL).

@guowei2 , can you also add a unit test for that?





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1884#issuecomment-52013620
  
It's OK, the Flume test is not related. I am going to merge this into master 
and branch-1.1.





[GitHub] spark pull request: [WIP] [SPARK-2468] Netty based block server / ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52013782
  
QA tests have started for PR 1907. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18420/consoleFull





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1884#issuecomment-52013783
  
QA tests have started for PR 1884. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18421/consoleFull





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1884





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread mengxr
GitHub user mengxr opened a pull request:

https://github.com/apache/spark/pull/1917

[SPARK-3001][MLLIB] Improve Spearman's correlation

The current implementation requires sorting individual columns, which could 
be done with a global sort.
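
As a rough illustration of the idea (a minimal sketch in plain Python, not the code in this pull request and not Spark's API), every column can be ranked with a single sort over (column, value, row) triples instead of one sort per column. The function names below are hypothetical, ties are assumed to receive their average rank, and NaN values are assumed absent.

```python
from itertools import groupby
from math import sqrt


def column_ranks_global_sort(rows):
    """Rank every column using one sort over (column, value, row) triples."""
    n_cols = len(rows[0])
    # One global sort: entries of the same column end up contiguous,
    # ordered by value within that column.
    triples = sorted(
        (j, value, i) for i, row in enumerate(rows) for j, value in enumerate(row)
    )
    ranks = [[0.0] * n_cols for _ in rows]
    for j, col_group in groupby(triples, key=lambda t: t[0]):
        position = 0  # number of entries of this column already ranked
        for _, tie_group in groupby(col_group, key=lambda t: t[1]):
            tied = list(tie_group)
            # Tied values share the average of the 1-based ranks they span.
            avg_rank = position + (len(tied) + 1) / 2.0
            for _, _, i in tied:
                ranks[i][j] = avg_rank
            position += len(tied)
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def spearman(rows, a, b):
    """Spearman correlation of columns a and b: Pearson on their ranks."""
    ranks = column_ranks_global_sort(rows)
    return pearson([r[a] for r in ranks], [r[b] for r in ranks])


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
print(spearman(rows, 0, 1))  # 0.5
```

In a distributed setting the same key layout would presumably let one shuffle-based sort replace a separate sort per column, which appears to be the motivation stated above.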

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mengxr/spark spearman

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1917.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1917


commit 0846e078bc3fd5c12d7c1773f396d07efbbff45b
Author: Xiangrui Meng m...@databricks.com
Date:   2014-08-13T05:59:24Z

first version

commit b98bb18b69f30537fd235ef73d525c0f59f27293
Author: Xiangrui Meng m...@databricks.com
Date:   2014-08-13T06:19:26Z

add comments







[GitHub] spark pull request: [SPARK-2917] [SQL] Avoid table creation in log...

2014-08-13 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/1846#issuecomment-52014423
  
Is this my fault?  Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-2917] [SQL] Avoid table creation in log...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1846#issuecomment-52014594
  
QA tests have started for PR 1846. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18424/consoleFull





[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-13 Thread colorant
Github user colorant commented on the pull request:

https://github.com/apache/spark/pull/1241#issuecomment-52014668
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1241#issuecomment-52014901
  
QA tests have started for PR 1241. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18425/consoleFull





[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

2014-08-13 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/1892#issuecomment-52015074
  
Thanks. Merging in master and branch-1.1.





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/1912#discussion_r16159823
  
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala 
---
@@ -315,6 +315,15 @@ private[spark] object PythonRDD extends Logging {
 JavaRDD.fromRDD(sc.sc.parallelize(objs, parallelism))
   }
 
+  def readBroadcastFromFile(sc: JavaSparkContext, filename: String):
+  Broadcast[Array[Byte]] = {
--- End diff --

does this fit in the line above?





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1912#issuecomment-52015135
  
test this please





[GitHub] spark pull request: [SPARK-1777 (partial)] bugfix: make size of re...

2014-08-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1892





[GitHub] spark pull request: [SPARK-2993] [MLLib] colStats (wrapper around ...

2014-08-13 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/1911#issuecomment-52015369
  
LGTM. Merged into both master and branch-1.1. Thanks for adding this and 
cleaning `MLLibPythonAPI`!!





[GitHub] spark pull request: [SPARK-2993] [MLLib] colStats (wrapper around ...

2014-08-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/1911





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52015516
  
Jenkins, test this please.





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52015796
  
QA tests have started for PR 1917. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18427/consoleFull





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52015802
  
QA results for PR 1917:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18427/consoleFull





[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1031#issuecomment-52016089
  
QA tests have started for PR 1031. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18428/consoleFull





[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1031#issuecomment-52016096
  
QA results for PR 1031:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18428/consoleFull





[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-13 Thread witgo
Github user witgo commented on the pull request:

https://github.com/apache/spark/pull/1031#issuecomment-52016199
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1912#issuecomment-52016367
  
QA tests have started for PR 1912. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18430/consoleFull





[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1022#issuecomment-52016360
  
QA tests have started for PR 1022. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18431/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1022#issuecomment-52016370
  
QA results for PR 1022:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18431/consoleFull


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1241#issuecomment-52016533
  
QA results for PR 1241:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18425/consoleFull





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52016614
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/1912#issuecomment-52016718
  
I was talking to Jenkins when I said test this please, but thanks @davies 
for adding tests too.





[GitHub] spark pull request: Use transferTo when copy merge files in Extern...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1884#issuecomment-52016904
  
QA results for PR 1884:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18421/consoleFull





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52016999
  
QA tests have started for PR 1917. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18432/consoleFull





[GitHub] spark pull request: [SPARK-3001][MLLIB] Improve Spearman's correla...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1917#issuecomment-52017008
  
QA results for PR 1917:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18432/consoleFull





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/1912#issuecomment-52017380
  
LoL, I realized this just after pushing the commit :)





[GitHub] spark pull request: [SPARK-2468] Netty based block server / client...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52018299
  
QA tests have started for PR 1907. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18434/consoleFull





[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1241#issuecomment-52018301
  
QA tests have started for PR 1241. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18435/consoleFull





[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1241#issuecomment-52018311
  
QA results for PR 1241:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18435/consoleFull





[GitHub] spark pull request: [SPARK-2468] Netty based block server / client...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52018524
  
QA results for PR 1907:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18419/consoleFull





[GitHub] spark pull request: [SPARK-2917] [SQL] Avoid table creation in log...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1846#issuecomment-52018533
  
QA results for PR 1846:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
case class CreateTableAsSelect(

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18424/consoleFull





[GitHub] spark pull request: [SPARK-2468] Netty based block server / client...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52018958
  
QA results for PR 1907:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18420/consoleFull





[GitHub] spark pull request: [SPARK-1065] [PySpark] improve supporting for ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1912#issuecomment-52019343
  
QA results for PR 1912:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18430/consoleFull





[GitHub] spark pull request: [SPARK-2969][SQL] Make ScalaReflection be able...

2014-08-13 Thread ueshin
Github user ueshin commented on the pull request:

https://github.com/apache/spark/pull/1889#issuecomment-52019410
  
I noticed that Parquet support currently can't handle a `MapType` containing 
`null` values.
There is the following difference, though:

- Writing and reading an `ArrayType` throws an exception when `containsNull` is 
`true`, even if no `null` value is present.
- Writing and reading a `MapType` works as long as the map has no `null` 
values, regardless of `valueContainsNull`.

Should I modify Parquet support to handle `ArrayType` the same way as 
`MapType`, i.e. write and read regardless of `containsNull` for now? 
(If the data contains `null` values, it throws a runtime exception.)
Would handling `null` values for both `ArrayType` and `MapType` then be the 
next issue?
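
The proposal above amounts to ignoring the declared nullability flag and failing only when an actual `null` is encountered. A minimal plain-Python sketch of that rule follows; it is not Spark's Parquet code, and the function name `check_writable` and the error message are assumptions made for illustration:

```python
def check_writable(value):
    """Accept a list or dict for writing unless it really contains a null.

    The declared flag (containsNull / valueContainsNull) is deliberately
    not consulted; only the data itself is inspected.
    """
    items = value.values() if isinstance(value, dict) else value
    for item in items:
        if item is None:
            # Hypothetical error; stands in for the runtime exception.
            raise ValueError("null elements are not supported yet")
    return value


check_writable([1, 2, 3])         # accepted, whatever the flag says
check_writable({"a": 1, "b": 2})  # accepted, whatever the flag says
```

With this rule, arrays and maps behave the same way: data without nulls round-trips, and data with nulls fails at write time.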





[GitHub] spark pull request: [SPARK-2917] [SQL] Avoid table creation in log...

2014-08-13 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/1846#issuecomment-52020897
  
It seems to have failed for some other reason.





[GitHub] spark pull request: [SPARK-3006] Failed to execute spark-shell in ...

2014-08-13 Thread tsudukim
GitHub user tsudukim opened a pull request:

https://github.com/apache/spark/pull/1918

[SPARK-3006] Failed to execute spark-shell in Windows OS

Modified the order of the options and arguments in spark-shell.cmd

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tsudukim/spark feature/SPARK-3006

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1918


commit 1a32410822f1a0372bbad0695579b8ef5973fe1f
Author: Masayoshi TSUZUKI tsudu...@oss.nttdata.co.jp
Date:   2014-08-13T08:26:01Z

[SPARK-3006] Failed to execute spark-shell in Windows OS

Modified the order of the options and arguments in spark-shell.cmd







[GitHub] spark pull request: [SPARK-3006] Failed to execute spark-shell in ...

2014-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1918#issuecomment-52023691
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2468] Netty based block server / client...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1907#issuecomment-52024735
  
QA results for PR 1907:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18434/consoleFull





[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...

2014-08-13 Thread baishuo
GitHub user baishuo opened a pull request:

https://github.com/apache/spark/pull/1919

[SPARK-3007][SQL]Add Dynamic Partition support to Spark Sql hive

For details, please refer to the comments on 
https://issues.apache.org/jira/browse/SPARK-3007

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/baishuo/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1919.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1919


commit d3e206e1a2fadc271e365462bd93730e31a094eb
Author: baishuo(白硕) vc_j...@hotmail.com
Date:   2014-08-12T17:27:54Z

Update HiveQl.scala

commit b22857a365925a428c41dd3e93d0da3613053071
Author: baishuo(白硕) vc_j...@hotmail.com
Date:   2014-08-12T17:29:36Z

Update SparkHadoopWriter.scala

commit bade51d4726b8c55de83fef5c3e42c48f5af8f59
Author: baishuo(白硕) vc_j...@hotmail.com
Date:   2014-08-12T17:31:01Z

Update InsertIntoHiveTable.scala

commit d211d330550260d93752349682e7c8447691a9e5
Author: baishuo(白硕) vc_j...@hotmail.com
Date:   2014-08-12T17:53:04Z

Update InsertIntoHiveTable.scala







[GitHub] spark pull request: [SPARK-3004][SQL] Added null checking when ret...

2014-08-13 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/1920

[SPARK-3004][SQL] Added null checking when retrieving row set

JIRA issue: [SPARK-3004](https://issues.apache.org/jira/browse/SPARK-3004)

HiveThriftServer2 throws an exception when the result set contains `NULL`. 
We should check `isNullAt` in `SparkSQLOperationManager.getNextRowSet`.

Note that simply using `row.addColumnValue(null)` doesn't work, since Hive 
sets the column type of a null `ColumnValue` to String by default.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark spark-3004

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1920.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1920


commit 221772266928b752594aa99aa1d9656c3c7892b5
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-08-13T07:47:11Z

Fixed SPARK-3004: added null checking when retrieving row set

commit 1b1db1ca768a826224567a6f249ba389c345507f
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-08-13T09:13:05Z

Adding NULL column values in the Hive way







[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...

2014-08-13 Thread baishuo
Github user baishuo commented on the pull request:

https://github.com/apache/spark/pull/1919#issuecomment-52026271
  
I haven't added a related test since I don't know how to write one, but 
I have tested the function with SparkSQLCLIDriver.





[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...

2014-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1919#issuecomment-52026260
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-3004][SQL] Added null checking when ret...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1920#issuecomment-52026411
  
QA tests have started for PR 1920. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18436/consoleFull





[GitHub] spark pull request: [SPARK-2986] [SQL] fixed: setting properties d...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1904#issuecomment-52026786
  
@guowei2 Are you sure `set spark.sql.shuffle.partitions=n` doesn't work? 
Would you mind providing steps to reproduce this issue? I just tried it 
with the most recent master branch (HiveThriftServer2 + beeline):

![sparksql lian laptop local spark 
stages](https://cloud.githubusercontent.com/assets/230655/3903389/53b99c50-22cb-11e4-87c2-13b0ddff277a.png)

Stage 0 and stage 3 indicate that the `set` command works.





[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52026960
  
Build failure was caused by PySpark again :(





[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52026969
  
retest this please





[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52027298
  
QA tests have started for PR 1891. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18437/consoleFull





[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52027481
  
Oh, actually the build failure was caused by a Spark SQL Python API test 
case. It seems that we are comparing floating point numbers directly:

```
Expected:
[Row(byte1=126, byte2=-127, short1=-32767, short2=32766, 
int=2147483646, float=2.1)]
Got:
[Row(byte1=126, byte2=-127, short1=-32767, short2=32766, 
int=2147483646, float=2.1001)]
```

This should be fixed... Sorry @sarutak, we'll come back to this PR later, 
thanks for the patience.
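
A failure of this kind is typical when floating-point values are compared for exact equality. The following is a minimal sketch of a tolerance-based comparison in plain Python, not the actual PySpark test; the helper name `rows_close`, the tolerance, and the assumption that the value passed through single precision are all illustrative:

```python
import math
import struct


def rows_close(expected, got, rel_tol=1e-6):
    """Compare two dict-like rows, using a tolerance for float fields."""
    if expected.keys() != got.keys():
        return False
    for key, want in expected.items():
        have = got[key]
        if isinstance(want, float) or isinstance(have, float):
            if not math.isclose(want, have, rel_tol=rel_tol):
                return False
        elif want != have:
            return False
    return True


# Assumed cause: 2.1 round-tripped through a 32-bit float.
as_float32 = struct.unpack("f", struct.pack("f", 2.1))[0]

expected = {"byte1": 126, "int": 2147483646, "float": 2.1}
got = {"byte1": 126, "int": 2147483646, "float": as_float32}

print(expected == got)            # exact comparison fails
print(rows_close(expected, got))  # tolerant comparison succeeds
```

Integer fields are still compared exactly; only float fields get the tolerance.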





[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-13 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/1878#discussion_r16165816
  
--- Diff: examples/src/main/python/mllib/random_and_sampled_rdds.py ---
@@ -0,0 +1,88 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Randomly generated and sampled RDDs.
--- End diff --

Sure, I can separate them.  I'll call them random_rdds.py and 
sampled_rdds.py





[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-13 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/1878#discussion_r16165936
  
--- Diff: examples/src/main/python/mllib/random_and_sampled_rdds.py ---
@@ -0,0 +1,88 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+Randomly generated and sampled RDDs.
+"""
+
+import sys
+
+from pyspark import SparkContext
+from pyspark.mllib.random import RandomRDDGenerators
+from pyspark.mllib.util import MLUtils
+
+
+
+if __name__ == "__main__":
+    if len(sys.argv) not in [1, 2]:
+        print >> sys.stderr, "Usage: logistic_regression <libsvm data file>"
+        exit(-1)
+    if len(sys.argv) == 2:
+        datapath = sys.argv[1]
+    else:
+        datapath = 'data/mllib/sample_binary_classification_data.txt'
+
+    sc = SparkContext(appName="PythonRandomAndSampledRDDs")
+
+    points = MLUtils.loadLibSVMFile(sc, datapath)
+
+    numExamples = 1 # number of examples to generate
+    fraction = 0.1 # fraction of data to sample
+
+    # Example: RandomRDDGenerators
+    normalRDD = RandomRDDGenerators.normalRDD(sc, numExamples)
+    print 'Generated RDD of %d examples sampled from a unit normal distribution' % normalRDD.count()
--- End diff --

This file shows off different functionality than normalRDD.stats().  
normalRDD.stats() seems very similar to MultivariateStatisticalSummary / 
MultivariateOnlineSummarizer.  Why are normalRDD.stats() and statcounter.py not 
following the MultivariateStatisticalSummary / MultivariateOnlineSummarizer 
APIs (for which there are no Python APIs currently)?
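
For readers unfamiliar with the kind of summary being compared here, the following is a minimal plain-Python sketch of a single-pass, column-wise summarizer using Welford's update. It is not the MLlib `MultivariateOnlineSummarizer` implementation or its API; the class and method names are hypothetical:

```python
class OnlineSummarizer:
    """Hypothetical column-wise mean/variance accumulator (Welford's method)."""

    def __init__(self, num_cols):
        self.count = 0
        self.mean = [0.0] * num_cols
        self.m2 = [0.0] * num_cols  # sum of squared deviations from the mean

    def add(self, row):
        """Fold one row of numbers into the running statistics."""
        self.count += 1
        for j, x in enumerate(row):
            delta = x - self.mean[j]
            self.mean[j] += delta / self.count
            self.m2[j] += delta * (x - self.mean[j])
        return self

    def variance(self):
        """Unbiased sample variance per column (0.0 with fewer than 2 rows)."""
        if self.count < 2:
            return [0.0] * len(self.mean)
        return [m / (self.count - 1) for m in self.m2]


summary = OnlineSummarizer(2)
for row in [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]:
    summary.add(row)

print(summary.count, summary.mean, summary.variance())
```

Each row is visited once and only the count, means, and squared deviations are kept, which is what makes this style of summary suitable for distributed data.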





[GitHub] spark pull request: [SPARK-2862] Use shorthand range notation to a...

2014-08-13 Thread nrchandan
Github user nrchandan commented on the pull request:

https://github.com/apache/spark/pull/1787#issuecomment-52028983
  
@pwendell @srowen This version passes all test cases. I also added a new test 
case (the one specified in JIRA #SPARK-2862).





[GitHub] spark pull request: [SPARK-2986] [SQL] fixed: setting properties d...

2014-08-13 Thread guowei2
Github user guowei2 commented on the pull request:

https://github.com/apache/spark/pull/1904#issuecomment-52031004
  
@liancheng It works with HiveThriftServer2 + beeline.
It doesn't work in spark-sql, nor in my development IDE when running 
SparkSQLCLIDriver.






[GitHub] spark pull request: [SPARK-3004][SQL] Added null checking when ret...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1920#issuecomment-52032154
  
QA results for PR 1920:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18436/consoleFull





[GitHub] spark pull request: [SPARK-2873] [SQL] using ExternalAppendOnlyMap...

2014-08-13 Thread guowei2
Github user guowei2 commented on the pull request:

https://github.com/apache/spark/pull/1822#issuecomment-52032345
  
I've improved the structure and the tests. Is it better now?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52032913
  
QA results for PR 1891:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18437/consoleFull





[GitHub] spark pull request: [SPARK-2969][SQL] Make ScalaReflection be able...

2014-08-13 Thread ueshin
Github user ueshin commented on the pull request:

https://github.com/apache/spark/pull/1889#issuecomment-52033915
  
Or do we have to make Parquet support able to handle `null` values now?
The Parquet format would need to change to apply these changes.





[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-08-13 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request:

https://github.com/apache/spark/pull/1921

[SPARK-3003] FailedStage could not be cancelled by DAGScheduler when 
cancelJob or cancelStage

When a stage changes from running to failed, DAGScheduler cannot cancel it on 
cancelJob or cancelStage, because failJobAndIndependentStages only cancels 
running stages and posts SparkListenerStageCompleted for them.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/YanTangZhai/spark SPARK-3003

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1921.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1921


commit cdef539abc5d2d42d4661373939bdd52ca8ee8e6
Author: YanTangZhai hakeemz...@tencent.com
Date:   2014-08-06T13:07:08Z

Merge pull request #1 from apache/master

update

commit b736bd729713ba6ca23ae901b34cb8523f2d24b2
Author: yantangzhai tyz0...@163.com
Date:   2014-08-13T13:33:24Z

[SPARK-3003] FailedStage could not be cancelled by DAGScheduler when 
cancelJob or cancelStage







[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread kmader
Github user kmader commented on the pull request:

https://github.com/apache/spark/pull/1658#issuecomment-52049280
  
@freeman-lab looks good, I will add it to this pull request if that's OK 
with you. My personal preference would be to keep byteFile for standard 
operations and fixedLengthByteFile for other files, since many standard binary 
formats are not easily partitionable, and trying to read tif, jpg, or even 
hdf5 and raw files under such conditions would be rather difficult to do 
correctly, whereas for text files line-by-line is a common partitioning. 
Perhaps there are other use cases I am not familiar with that speak against 
this, though.





[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1921#issuecomment-52049448
  
QA tests have started for PR 1921. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18438/consoleFull





[GitHub] spark pull request: SPARK-3009: Reverted readObject method in Appl...

2014-08-13 Thread jacek-lewandowski
GitHub user jacek-lewandowski opened a pull request:

https://github.com/apache/spark/pull/1922

SPARK-3009: Reverted readObject method in ApplicationInfo so that Applic...

...ationInfo is initialized properly after deserialization

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jacek-lewandowski/spark branch-1.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1922.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1922


commit 13c1fb3e92f5dcf92054e4e0e19c1274efc09a76
Author: Jacek Lewandowski lewandowski.ja...@gmail.com
Date:   2014-08-13T13:47:38Z

SPARK-3009: Reverted readObject method in ApplicationInfo so that 
ApplicationInfo is initialized properly after deserialization







[GitHub] spark pull request: SPARK-3009: Reverted readObject method in Appl...

2014-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1922#issuecomment-52050526
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL]Excess judgment

2014-08-13 Thread scwf
GitHub user scwf opened a pull request:

https://github.com/apache/spark/pull/1923

[SQL]Excess judgment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/scwf/spark patch-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1923.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1923


commit d032bf930be1816694073f7d45ea92c53813c280
Author: wangfei wangfei_he...@126.com
Date:   2014-08-13T14:06:07Z

[SQL]Excess judgment







[GitHub] spark pull request: [SQL]Excess judgment

2014-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1923#issuecomment-52052524
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SQL]Excess judgment

2014-08-13 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/1923#issuecomment-52053181
  
(PS excess judgment doesn't quite make sense -- I think you mean 
redundant conditional?)
Yes that's redundant although I thought the consensus was not to fix 
trivial things like this unless making changes to the surrounding code anyway?





[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1658#issuecomment-52053774
  
QA results for PR 1658:
- This patch FAILED unit tests.

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18439/consoleFull





[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1658#issuecomment-52053578
  
QA tests have started for PR 1658. This patch DID NOT merge cleanly!
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18439/consoleFull





[GitHub] spark pull request: [SQL]Excess judgment

2014-08-13 Thread scwf
Github user scwf commented on the pull request:

https://github.com/apache/spark/pull/1923#issuecomment-52054526
  
Yes, I mean redundant conditional. Do you mean we should fix all redundant 
conditions?





[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread kmader
Github user kmader commented on a diff in the pull request:

https://github.com/apache/spark/pull/1658#discussion_r16177677
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -511,6 +511,67 @@ class SparkContext(config: SparkConf) extends Logging {
   }
 
   /**
+   * Get an RDD for a Hadoop-readable dataset as byte-streams for each file
+   * (useful for binary data)
+   *
+   * @param minPartitions A suggestion value of the minimal splitting number for input data.
+   *
+   * @note Small files are preferred, large file is also allowable, but may cause bad performance.
+   */
+  def binaryFiles(path: String, minPartitions: Int = defaultMinPartitions):
+  RDD[(String, Array[Byte])] = {
+    val job = new NewHadoopJob(hadoopConfiguration)
+    NewFileInputFormat.addInputPath(job, new Path(path))
+    val updateConf = job.getConfiguration
+    new RawFileRDD(
+      this,
+      classOf[ByteInputFormat],
+      classOf[String],
+      classOf[Array[Byte]],
+      updateConf,
+      minPartitions).setName(path)
+  }
+
+  /**
+   * Get an RDD for a Hadoop-readable dataset as DataInputStreams for each file
+   * (useful for binary data)
+   *
+   *
+   * @param minPartitions A suggestion value of the minimal splitting number for input data.
+   *
+   * @note Care must be taken to close the files afterwards
+   * @note Small files are preferred, large file is also allowable, but may cause bad performance.
+   */
+  @DeveloperApi
+  def dataStreamFiles(path: String, minPartitions: Int = defaultMinPartitions):
+  RDD[(String, DataInputStream)] = {
+    val job = new NewHadoopJob(hadoopConfiguration)
+    NewFileInputFormat.addInputPath(job, new Path(path))
+    val updateConf = job.getConfiguration
+    new RawFileRDD(
+      this,
+      classOf[StreamInputFormat],
+      classOf[String],
+      classOf[DataInputStream],
+      updateConf,
+      minPartitions).setName(path)
+  }
+
+  /**
+   * Load data from a flat binary file, assuming each record is a set of numbers
+   * with the specified numerical format (see ByteBuffer), and the number of
+   * bytes per record is constant (see FixedLengthBinaryInputFormat)
+   *
+   * @param path Directory to the input data files
+   * @return An RDD of data with values, RDD[(Array[Byte])]
+   */
+  def fixedLengthBinaryFiles(path: String): RDD[Array[Byte]] = {
--- End diff --

This has been taken almost directly from 
https://github.com/freeman-lab/thunder/blob/master/scala/src/main/scala/thunder/util/Load.scala
without the extra formatting to load it as a list of doubles.
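
As a rough illustration of the fixed-length idea (this is not the code from the pull request, which is Scala), the following Python sketch splits a flat buffer into constant-size records and then decodes each record as doubles. The function names and the little-endian 8-byte double layout are assumptions made for the example.

```python
import struct
from typing import List, Tuple


def split_fixed_length(data: bytes, record_length: int) -> List[bytes]:
    """Split a flat byte buffer into records of record_length bytes each."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("buffer size is not a multiple of the record length")
    # Every multiple of record_length is a record boundary, which is what
    # makes this layout easy to partition.
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]


def record_to_doubles(record: bytes) -> Tuple[float, ...]:
    """Decode one record as little-endian 8-byte doubles (assumed layout)."""
    count = len(record) // 8
    return struct.unpack("<%dd" % count, record)


# Two records of two doubles each, packed into one flat buffer.
buffer = struct.pack("<4d", 1.0, 2.0, 3.0, 4.0)
records = split_fixed_length(buffer, record_length=16)
```

Keeping the raw bytes per record, as in `split_fixed_length`, leaves the choice of numerical format to the caller; `record_to_doubles` is the kind of extra formatting step the comment above says was left out.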





[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1658#issuecomment-52055084
  
QA tests have started for PR 1658. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18441/consoleFull





[GitHub] spark pull request: [SPARK-3003] FailedStage could not be cancelle...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1921#issuecomment-52055809
  
QA results for PR 1921:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18438/consoleFull





[GitHub] spark pull request: [SPARK-2970] [SQL] spark-sql script ends with ...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1891#issuecomment-52062088
  
@sarutak Just FYI, @yhuai is taking care of the floating-point issue. 
Jenkins was pretty crazy during the last 12 hours. We're trying to calm him 
down...





[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1658#issuecomment-52062694
  
QA results for PR 1658:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class FixedLengthBinaryInputFormat extends FileInputFormat[LongWritable, BytesWritable] {
class FixedLengthBinaryRecordReader extends RecordReader[LongWritable, BytesWritable] {
abstract class StreamBasedRecordReader[T](
class StreamRecordReader(
class StreamInputFormat extends StreamFileInputFormat[DataInputStream] {
abstract class BinaryRecordReader[T](
class ByteRecordReader(
* A class for reading the file using the BinaryRecordReader (as Byte array)
class ByteInputFormat extends StreamFileInputFormat[Array[Byte]] {

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18441/consoleFull





[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1031#issuecomment-52064657
  
QA tests have started for PR 1031. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18442/consoleFull





[GitHub] spark pull request: [SPARK-3011][SQL] _temporary directory should ...

2014-08-13 Thread joesu
GitHub user joesu opened a pull request:

https://github.com/apache/spark/pull/1924

[SPARK-3011][SQL] _temporary directory should be filtered out by 
sqlContext.parquetFile



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/joesu/spark bugfix-spark3011

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1924.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1924


commit f8fc32a7a1e33af3ababd86549ff227c46cd233f
Author: Chia-Yung Su chiay...@appier.com
Date:   2014-08-13T15:52:15Z

filter out tmp dir







[GitHub] spark pull request: [SPARK-3011][SQL] _temporary directory should ...

2014-08-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/1924#issuecomment-52069053
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-2736] PySpark converter and example scr...

2014-08-13 Thread kanzhang
Github user kanzhang commented on the pull request:

https://github.com/apache/spark/pull/1916#issuecomment-52070624
  
@ericgarcia it would be great if you could check whether this patch works for you.





[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread liancheng
GitHub user liancheng opened a pull request:

https://github.com/apache/spark/pull/1925

[SQL] Using safe floating-point numbers in doctest

Test code in `sql.py` tries to compare two floating-point numbers directly, 
and caused [build 
failure(s)](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18365/consoleFull).

[Doctest 
documentation](https://docs.python.org/3/library/doctest.html#warnings) 
recommends using numbers in the form of `I/2**J` to avoid the precision issue.
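
To illustrate the precision issue in plain Python (a sketch, not the actual doctest from `sql.py`): a sum of decimal fractions picks up rounding error, while values of the form `I/2**J` are represented exactly, so their printed form is stable.

```python
# Decimal fractions such as 0.1 and 0.2 have no exact binary representation,
# so their sum is not the double nearest to 0.3.
inexact = 0.1 + 0.2

# 1/2**3 and 1/2**2 are exact in binary floating point, and so is their sum.
exact = 1 / 2**3 + 1 / 2**2

print(repr(inexact))  # 0.30000000000000004
print(repr(exact))    # 0.375
```

A doctest that expects `0.3` for the first expression fails, whereas one that expects `0.375` for the second does not depend on rounding.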

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark fix-pysql-fp-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1925


commit e8059d4115cc286bfe87bad82bbd3b2fe8e16db0
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-08-13T16:07:47Z

Using safe floating-point numbers in doctest







[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1925#issuecomment-52071418
  
QA tests have started for PR 1925. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18443/consoleFull





[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread liancheng
Github user liancheng closed the pull request at:

https://github.com/apache/spark/pull/1925





[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1925#issuecomment-52071469
  
Closing this since it's already fixed in 882da57. /cc @yhuai





[GitHub] spark pull request: [WIP][SPARK-1720] Add the value of LD_LIBRARY_...

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1031#issuecomment-52072285
  
QA results for PR 1031:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18442/consoleFull





[GitHub] spark pull request: [SPARK-3004][SQL] Added null checking when ret...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1920#issuecomment-52074293
  
I believe the build failure is caused by 
[SPARK-3013](https://issues.apache.org/jira/browse/SPARK-3013). Should retest 
this after the issue is fixed.





[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread liancheng
GitHub user liancheng reopened a pull request:

https://github.com/apache/spark/pull/1925

[SQL] Using safe floating-point numbers in doctest

Test code in `sql.py` tries to compare two floating-point numbers directly, 
and caused [build 
failure(s)](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18365/consoleFull).

[Doctest 
documentation](https://docs.python.org/3/library/doctest.html#warnings) 
recommends using numbers in the form of `I/2**J` to avoid the precision issue.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liancheng/spark fix-pysql-fp-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1925.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1925


commit e8059d4115cc286bfe87bad82bbd3b2fe8e16db0
Author: Cheng Lian lian.cs@gmail.com
Date:   2014-08-13T16:07:47Z

Using safe floating-point numbers in doctest







[GitHub] spark pull request: [SQL] Using safe floating-point numbers in doc...

2014-08-13 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/1925#issuecomment-52074671
  
After some thought, I think we still need to use safer floating-point 
numbers (`I/2**J`), since the `2.1...` pattern doesn't cover evil cases like 
`2.0999`. Reopening this. /cc @davies
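
The difference can be checked with doctest's ELLIPSIS option. This is a minimal sketch: the helper `run_doctest` and the literal values are made up for illustration and are not taken from `sql.py`.

```python
import doctest


def run_doctest(source, flags=0):
    """Run the doctest examples in source; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(source, {}, "sketch", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=flags)
    result = runner.run(test, out=lambda text: None)  # discard the failure report
    return (result.failed, result.attempted)


# A value slightly above 2.1 still prints with the prefix "2.1" ...
above = ">>> 2.1000000000000005\n2.1...\n"
# ... but a value slightly below 2.1 prints as 2.0999999999999996.
below = ">>> 2.0999999999999996\n2.1...\n"

above_result = run_doctest(above, doctest.ELLIPSIS)
below_result = run_doctest(below, doctest.ELLIPSIS)
```

Under these assumptions the first example passes and the second fails, which is the case the ellipsis pattern cannot absorb.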





[GitHub] spark pull request: Typo in script

2014-08-13 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/spark/pull/1926

Typo in script

Rename use_conf_dir to user_conf_dir in load-spark-env.sh.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/spark TypoInScript

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/1926.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1926


commit 0c104ada73ef2cd2bee926314e370f8340d18d90
Author: WangTao barneystin...@aliyun.com
Date:   2014-08-13T16:37:36Z

Typo in script







[GitHub] spark pull request: Typo in script

2014-08-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/1926#issuecomment-52075508
  
QA tests have started for PR 1926. This patch merges cleanly.
View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18444/consoleFull




