[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-30 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1241#issuecomment-53950022 Merging this now. I will take care of some minor things myself. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on

[GitHub] spark pull request: [SPARK-2288] Hide ShuffleBlockManager behind S...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1241

[GitHub] spark pull request: Branch 1.1

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1824

[GitHub] spark pull request: [SPARK-1919] Fix Windows spark-shell --jars

2014-08-30 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2211#issuecomment-53950710 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-1919] Fix Windows spark-shell --jars

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2211#issuecomment-53950749 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19516/consoleFull) for PR 2211 at commit

[GitHub] spark pull request: [SPARK-1919] Fix Windows spark-shell --jars

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2211#issuecomment-53951693 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19516/consoleFull) for PR 2211 at commit

[GitHub] spark pull request: [SPARK-2961][SQL] Use statistics to prune batc...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2188#issuecomment-53952977 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19517/consoleFull) for PR 2188 at commit

[GitHub] spark pull request: [SPARK-2961][SQL] Use statistics to prune batc...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2188#issuecomment-53953073 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19518/consoleFull) for PR 2188 at commit

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/2214 [SQL] Refined Thrift server test suite This PR fixes two issues in `HiveThriftServer2Suite` and adds one enhancement: 1. Although metastore, warehouse directories and listening port are

[GitHub] spark pull request: [SPARK-2973][SQL] Lightweight SQL commands wit...

2014-08-30 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/2215 [SPARK-2973][SQL] Lightweight SQL commands without distributed jobs when calling .collect() By overriding `executeCollect()` in physical plan classes of all commands, we can avoid kicking off a

[GitHub] spark pull request: [SPARK-2973][SQL] Lightweight SQL commands wit...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2215#issuecomment-53954081 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19520/consoleFull) for PR 2215 at commit

[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-08-30 Thread watermen
GitHub user watermen opened a pull request: https://github.com/apache/spark/pull/2216 [SPARK-3325] Add a parameter to the method print in class DStream. def print(num: Int = 10) Users can control the number of elements to print. You can merge this pull request into a Git

[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2216#issuecomment-53954443 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2961][SQL] Use statistics to prune batc...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2188#issuecomment-53954582 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19517/consoleFull) for PR 2188 at commit

[GitHub] spark pull request: [SPARK-2961][SQL] Use statistics to prune batc...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2188#issuecomment-53954660 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19518/consoleFull) for PR 2188 at commit

[GitHub] spark pull request: [SPARK-2973][SQL] Lightweight SQL commands wit...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2215#issuecomment-53955654 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19520/consoleFull) for PR 2215 at commit

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53960011 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19521/consoleFull) for PR 2176 at commit

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread lirui-intel
Github user lirui-intel commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53960113 Thanks @rxin , @vanzin for the review. I've added experimental mark in the java doc. I see that mima can automatically exclude DeveloperApi and Experimental classes,

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53961548 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19521/consoleFull) for PR 2176 at commit

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2176#discussion_r16930621 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -574,4 +574,15 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]]

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2184#discussion_r16930734 --- Diff: dev/run-tests-jenkins --- @@ -138,7 +141,7 @@ function post_message () { test_result=$? if [ $test_result -eq 124 ]; then -

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53965382 Minor style note, but otherwise LGTM

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/2184#discussion_r16930998 --- Diff: dev/run-tests-jenkins --- @@ -138,7 +141,7 @@ function post_message () { test_result=$? if [ $test_result -eq 124 ]; then -

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53966846 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19522/consoleFull) for PR 2184 at commit

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53966890 Hmm, looks like I need to fix something now that this doesn't merge cleanly anymore. Investigating.

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53967270 test this please

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53967407 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19523/consoleFull) for PR 2214 at commit

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/2217 [SPARK-3327] Make broadcasted value mutable for caching useful information This PR makes broadcasted value mutable for caching useful information when implementing some algorithms that iteratively

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2217#issuecomment-53967605 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3300][SQL] No need to call clear() and ...

2014-08-30 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/2195#issuecomment-53967657 ok to test please.

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53968973 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19522/consoleFull) for PR 2184 at commit

[GitHub] spark pull request: [SQL] Refined Thrift server test suite

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2214#issuecomment-53969391 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19523/consoleFull) for PR 2214 at commit

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53969665 @pwendell I think we're all set now.

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53969763 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19524/consoleFull) for PR 2184 at commit

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16931674 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16931670 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-53970492 @davies just curious, do all the unit tests run if you do `run-tests` with `pypy`? We should make sure they do, and add a command in there to test this in Jenkins (ask

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-53971034 The ordered categorical features are not binned and the centroids are re-calculated using the entire bin aggregate every level. I can see the improvement in accuracy

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53971091 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19524/consoleFull) for PR 2184 at commit

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1843#issuecomment-53971505 Thanks Marcelo! I've merged this in.

[GitHub] spark pull request: [SPARK-2889] Create Hadoop config objects cons...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1843

[GitHub] spark pull request: [SPARK-1919] Fix Windows spark-shell --jars

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2211#discussion_r16932123 --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala --- @@ -965,11 +966,9 @@ class SparkILoop(in0: Option[BufferedReader], protected val out:

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53972938 Jenkins, test this please

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53972940 Looks good to me pending tests

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1057#issuecomment-53972967 Hey @naftaliharris, would you mind closing this pull request now that this has been fixed in other PRs?

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53973019 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19525/consoleFull) for PR 1992 at commit

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973039 @CodingCat are you still working on this patch? The doc page changed significantly in 1.0, so maybe a lot of this info is still in, but it would be good to look over it and

[GitHub] spark pull request: MetadataCleaner - fine control cleanup documen...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/89#issuecomment-53973053 I agree, we should not expose these to the user given the recent changes. Would it be okay to close this PR?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973073 Sure. Other people told me some of the parameters are not supposed to be configurable, so I paused the work here. I can go through it again to check the

[GitHub] spark pull request: SPARK-2461. Add a toString method to Generaliz...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1388#issuecomment-53973123 @sryza just wondering, will you have time to update this for Python? As I said it would be useful to include.

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973167 @rcompton I believe this has been addressed by Sigmoid's recent work for Pig on Spark: https://issues.apache.org/jira/browse/PIG-4059. Given that, do we still need this

[GitHub] spark pull request: [SPARK-2237][CORE]Add ZLIBCompressionCodec cod...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1121#issuecomment-53973370 @YanjieGao do you see a major tradeoff in compressed size and speed with this codec over our current ones? Also, I'm not sure your patch will compile as written.

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread rcompton
Github user rcompton commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973422 @mateiz no, for the reasons mentioned by Sean as well as the new work by Sigmoid, you don't need this patch.

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread rcompton
Github user rcompton closed the pull request at: https://github.com/apache/spark/pull/915

[GitHub] spark pull request: [SPARK-3205] add EscapedTextInputFormat

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2118#discussion_r16932218 --- Diff: core/src/main/scala/org/apache/spark/input/EscapedTextInputFormat.scala --- @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3205] add EscapedTextInputFormat

2014-08-30 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2118#discussion_r16932221 --- Diff: core/src/main/scala/org/apache/spark/input/EscapedTextInputFormat.scala --- @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1952 removed slf4j Pig conflicts

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/915#issuecomment-53973561 Alright, thanks for taking a look at this.

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread naftaliharris
Github user naftaliharris closed the pull request at: https://github.com/apache/spark/pull/1057

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-53973578 Yeah up to you, you should either update it or close the PR if you think everything is there already.

[GitHub] spark pull request: Check if margin > 0, not if prob > 0.5

2014-08-30 Thread naftaliharris
Github user naftaliharris commented on the pull request: https://github.com/apache/spark/pull/1057#issuecomment-53973576 @mateiz oh yeah, no problem. Thanks again for the fixes!

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973919 Looks good, thanks!

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973932 Actually you missed JavaSparkContext; it has the same issue

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on a diff in the pull request: https://github.com/apache/spark/pull/2125#discussion_r16932280 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impl/DTStatsAggregator.scala --- @@ -0,0 +1,208 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: Documentation update in addFile on how to use ...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53973967 (And please add [SPARK-3318] at the beginning of your PR title)

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974032 @mateiz Thanks, completely forgot to check the javadoc.

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974057 Alright, thanks. Going to merge this.

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53974083 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19526/consoleFull) for PR 2210 at commit

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2210

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-08-30 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-53974143 Apart from the discussion around the correct place for centroid calculations and some minor code style comments, it looks good to me. If it's too much work to change

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53974654 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19525/consoleFull) for PR 1992 at commit

[GitHub] spark pull request: SPARK-3318: Documentation update in addFile on...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2210#issuecomment-53975007 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19526/consoleFull) for PR 2210 at commit

[GitHub] spark pull request: [SPARK-2973][SQL] Lightweight SQL commands wit...

2014-08-30 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2215#discussion_r16932497 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala --- @@ -90,10 +90,9 @@ case class SetCommand( throw new

[GitHub] spark pull request: [SPARK-2890][SQL] Allow reading of data when c...

2014-08-30 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2209#issuecomment-53975089 I actually encountered the error with a jsonRDD, but yeah it could happen with parquet files as well. Your comment about joins though makes me think that we should

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread rezazadeh
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975250 Style changes made. Experimental results below. We run DIMSUM daily on a production-scale ads dataset. After replacing the traditional cosine similarity

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975264 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19527/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [SPARK-2890][SQL] Allow reading of data when c...

2014-08-30 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/2209#issuecomment-53975391 Sounds good. I was not sure how to correctly query those results with ambiguous schemas when I added that check. It seems a more informative logging entry is better than an

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53975519 @mateiz, retest this please; the tests failed because a forked process exited with a nonzero code. https://github.com/apache/spark/pull/2108

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53975553 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19528/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53976026 Jenkins, test this please

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53976069 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19529/consoleFull) for PR 1992 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53976161 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19527/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-53976440 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19528/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [SPARK-2489] [SQL] Parquet support for fixed_l...

2014-08-30 Thread joesu
Github user joesu commented on the pull request: https://github.com/apache/spark/pull/1737#issuecomment-53976708 It's not that straightforward to reuse BinaryType for handling parquet's binary type and fixed_len_byte_array types because these two types are incompatible in the parquet

[GitHub] spark pull request: [SPARK-2489] [SQL] Parquet support for fixed_l...

2014-08-30 Thread joesu
Github user joesu commented on the pull request: https://github.com/apache/spark/pull/1737#issuecomment-53976828 Another way is to include max length information in the BinaryType type, just like the FixedLenByteArray type in this pull request. Thus we can maintain only one binary

[GitHub] spark pull request: [SPARK-2558][DOCS] Add spark.yarn.queue descri...

2014-08-30 Thread kramimus
GitHub user kramimus opened a pull request: https://github.com/apache/spark/pull/2218 [SPARK-2558][DOCS] Add spark.yarn.queue description to YARN doc Put original YARN queue spark-submit arg description in running-on-yarn html table and example command line

[GitHub] spark pull request: [SPARK-2558][DOCS] Add spark.yarn.queue descri...

2014-08-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2218#issuecomment-53977030 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-53977396 Okay I think this is no longer necessary now that we fixed the issue causing lag in processing events. So I'd like to close this issue for now.

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977395 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19529/consoleFull) for PR 1992 at commit

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977519 @mateiz retest this again, tests failed in sparkstreaming, thanks.

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread uncleGen
Github user uncleGen commented on the pull request: https://github.com/apache/spark/pull/1356#issuecomment-53977557 okay!

[GitHub] spark pull request: SPARK-3009: Reverted readObject method in Appl...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1922

[GitHub] spark pull request: [SPARK-3229] spark.shuffle.safetyFraction and ...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2135

[GitHub] spark pull request: Update spark-daemon.sh

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/254

[GitHub] spark pull request: [SPARK-2675]LiveListenerBus Queue Overflow

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1356

[GitHub] spark pull request: Add normalizeByCol method to mllib.util.MLUtil...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1698

[GitHub] spark pull request: [SPARK-3010] fix redundant conditional

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1992#issuecomment-53977588 Jenkins, test this please.

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2184#issuecomment-53977717 Cool - thanks Nick!

[GitHub] spark pull request: [Spark QA] only check code files for new class...

2014-08-30 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2184

[GitHub] spark pull request: [SPARK-3327] Make broadcasted value mutable fo...

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2217#issuecomment-53977801 Hi there, The immutability of broadcast variables might be assumed in other places in the code base. Since this approach requires re-broadcasting the entire

[GitHub] spark pull request: SPARK-2636: Expose job ID in JobWaiter API

2014-08-30 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2176#issuecomment-53977827 I added a comment about the experimental formatting
