[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17577 **[Test build #75629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75629/testReport)** for PR 17577 at commit

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14830 **[Test build #75632 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75632/testReport)** for PR 14830 at commit

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14830 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75632/ Test PASSed. ---

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14830 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17580 Sorry, the Spark Java code style is different from my project team's style. Now I know, and it has been fixed. Use 2-space indentation in general. For function declarations, use 4

[GitHub] spark issue #17557: [SPARK-20208][WIP][R][DOCS] Document R fpGrowth support

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17557 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75633/ Test PASSed. ---

[GitHub] spark issue #17557: [SPARK-20208][WIP][R][DOCS] Document R fpGrowth support

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17557 Merged build finished. Test PASSed.

[GitHub] spark issue #17557: [SPARK-20208][WIP][R][DOCS] Document R fpGrowth support

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17557 **[Test build #75633 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75633/testReport)** for PR 17557 at commit

[GitHub] spark issue #17359: [SPARK-20028][SQL] Add aggreagate expression nGrams

2017-04-09 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/17359 @rxin @cloud-fan @gatorsmile @viirya @tejasapatil Could you please help me review this PR? Or is there anything I can do on this work?

[GitHub] spark issue #16845: [SPARK-19505][Python] AttributeError on Exception.messag...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16845 **[Test build #75631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75631/testReport)** for PR 16845 at commit

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14830 Jenkins retest this please.

[GitHub] spark pull request #17578: [SPARK-20269][Structured Streaming][Examples]add ...

2017-04-09 Thread guoxiaolongzte
GitHub user guoxiaolongzte opened a pull request: https://github.com/apache/spark/pull/17578 [SPARK-20269][Structured Streaming][Examples]add JavaWordCountProducer in steaming examples ## What changes were proposed in this pull request? run example of streaming

[GitHub] spark pull request #17578: [SPARK-20269][Structured Streaming][Examples]add ...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte closed the pull request at: https://github.com/apache/spark/pull/17578

[GitHub] spark pull request #17580: [20269][Structured Streaming][Examples] add JavaW...

2017-04-09 Thread guoxiaolongzte
GitHub user guoxiaolongzte opened a pull request: https://github.com/apache/spark/pull/17580 [20269][Structured Streaming][Examples] add JavaWordCountProducer in steaming examples ## What changes were proposed in this pull request? run example of streaming kafka,currently

[GitHub] spark pull request #17130: [SPARK-19791] [ML] Add doc and example for fpgrow...

2017-04-09 Thread zero323
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17130#discussion_r110540400 --- Diff: examples/src/main/python/ml/fpgrowth_example.py --- @@ -0,0 +1,48 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/17533 @squito Thank you so much for taking a look into this. > we don't want the TSM requesting info from the DAGSCheduler Sorry, I missed this point in the previous change. Now I push the

[GitHub] spark issue #17572: [SPARK-20260][MLLib] String interpolation required for e...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17572 **[Test build #3653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3653/testReport)** for PR 17572 at commit

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17533 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75634/ Test FAILed. ---

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17533 **[Test build #75634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75634/testReport)** for PR 17533 at commit

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17533 Merged build finished. Test FAILed.

[GitHub] spark pull request #17579: [20269][Structured Streaming][Examples] add JavaW...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte closed the pull request at: https://github.com/apache/spark/pull/17579

[GitHub] spark issue #17580: [20269][Structured Streaming][Examples] add JavaWordCoun...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17580 Can one of the admins verify this patch?

[GitHub] spark issue #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for bucket...

2017-04-09 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/17077 This looks like an important improvement that might make sense to try and get in for 2.2 so I'll try and get some reviewing in.

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14830 I guess a rebase would be welcome; I can do it by tomorrow if you want.

[GitHub] spark issue #17557: [SPARK-20208][WIP][R][DOCS] Document R fpGrowth support

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17557 **[Test build #75633 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75633/testReport)** for PR 17557 at commit

[GitHub] spark pull request #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17077#discussion_r110544338 --- Diff: python/pyspark/sql/tests.py --- @@ -2038,6 +2038,61 @@ def test_BinaryType_serialization(self): df =

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Merged build finished. Test FAILed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75629/ Test FAILed. ---

[GitHub] spark issue #16845: [SPARK-19505][Python] AttributeError on Exception.messag...

2017-04-09 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/16845 Jenkins, retest this please.

[GitHub] spark pull request #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for...

2017-04-09 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/17077#discussion_r110538670 --- Diff: python/pyspark/sql/readwriter.py --- @@ -545,6 +545,57 @@ def partitionBy(self, *cols): self._jwrite =

[GitHub] spark pull request #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for...

2017-04-09 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/17077#discussion_r110538647 --- Diff: python/pyspark/sql/readwriter.py --- @@ -545,6 +545,57 @@ def partitionBy(self, *cols): self._jwrite =

[GitHub] spark issue #16845: [SPARK-19505][Python] AttributeError on Exception.messag...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16845 **[Test build #75631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75631/testReport)** for PR 16845 at commit

[GitHub] spark pull request #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for...

2017-04-09 Thread zero323
Github user zero323 commented on a diff in the pull request: https://github.com/apache/spark/pull/17077#discussion_r110542103 --- Diff: python/pyspark/sql/readwriter.py --- @@ -545,6 +545,57 @@ def partitionBy(self, *cols): self._jwrite =

[GitHub] spark pull request #17577: [SPARK-20270][SQL] na.fill should not change the ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17577#discussion_r110545017 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameNaFunctions.scala --- @@ -407,10 +407,11 @@ final class DataFrameNaFunctions

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75630/ Test PASSed. ---

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Merged build finished. Test PASSed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17577 **[Test build #75630 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75630/testReport)** for PR 17577 at commit

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17580 The title, PR description, and motivation have been modified.

[GitHub] spark issue #16845: [SPARK-19505][Python] AttributeError on Exception.messag...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16845 Merged build finished. Test PASSed.

[GitHub] spark issue #16845: [SPARK-19505][Python] AttributeError on Exception.messag...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16845 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75631/ Test PASSed. ---

[GitHub] spark pull request #17579: [20269][Structured Streaming][Examples] add JavaW...

2017-04-09 Thread guoxiaolongzte
GitHub user guoxiaolongzte opened a pull request: https://github.com/apache/spark/pull/17579 [20269][Structured Streaming][Examples] add JavaWordCountProducer in steaming examples ## What changes were proposed in this pull request? run example of streaming kafka,currently

[GitHub] spark issue #17579: [20269][Structured Streaming][Examples] add JavaWordCoun...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17579 Can one of the admins verify this patch?

[GitHub] spark issue #17572: [SPARK-20260][MLLib] String interpolation required for e...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17572 **[Test build #3653 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3653/testReport)** for PR 17572 at commit

[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14830 **[Test build #75632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75632/testReport)** for PR 14830 at commit

[GitHub] spark issue #17580: [20269][Structured Streaming][Examples] add JavaWordCoun...

2017-04-09 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17580 The code style needs to be fixed, and the title. What example is this based on? This is the kind of info that you should put in a pull request.

[GitHub] spark issue #17533: [SPARK-20219] Schedule tasks based on size of input from...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17533 **[Test build #75634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75634/testReport)** for PR 17533 at commit

[GitHub] spark pull request #17077: [SPARK-16931][PYTHON][SQL] Add Python wrapper for...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17077#discussion_r110542557 --- Diff: python/pyspark/sql/readwriter.py --- @@ -545,6 +545,57 @@ def partitionBy(self, *cols): self._jwrite =

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110547379 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala --- @@ -43,17 +43,8 @@ case class LogicalRelation(

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75635/testReport)** for PR 17541 at commit

[GitHub] spark pull request #17580: [20269][Structured Streaming] add class 'JavaWord...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17580#discussion_r110540044 --- Diff: examples/src/main/java/org/apache/spark/examples/streaming/JavaKafkaWordCountProducer.java --- @@ -0,0 +1,76 @@ +/* + * Licensed to

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17577 LGTM except one comment.

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17574 @viirya Based on the code change history, https://github.com/apache/spark/pull/13642 removed the usage of ASM in the test case `SQLMetricsSuite.scala`. Thus, it is safe to remove the test

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568613 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,258 @@ +/* + * Licensed to

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17574 @gatorsmile Thanks for the search. I don't see any usage of it in `sql/core` now. It is only used in `core`, `repl`, `graphx`. So I am wondering if we can completely remove it from the dependency.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75640/testReport)** for PR 17541 at commit

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Merged build finished. Test PASSed.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75640/ Test PASSed. ---

[GitHub] spark pull request #17577: [SPARK-20270][SQL] na.fill should not change the ...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17577

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110574182 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class

[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/17568 ping @cloud-fan

[GitHub] spark issue #17330: [SPARK-19993][SQL] Caching logical plans containing subq...

2017-04-09 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/17330 @cloud-fan Sure Wenchen.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Merged build finished. Test FAILed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17577 **[Test build #75636 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75636/testReport)** for PR 17577 at commit

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75636/ Test FAILed. ---

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75637/testReport)** for PR 17541 at commit

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Merged build finished. Test FAILed.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75637/ Test FAILed. ---

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17577 **[Test build #75638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75638/testReport)** for PR 17577 at commit

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Merged build finished. Test PASSed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75638/ Test PASSed. ---

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 If the user really needs to limit the returned results, why not directly use a `limit` operator?

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17541 retest this please

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110564162 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -359,9 +359,59 @@ abstract class QueryPlan[PlanType <:

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 Btw, I am not sure why you said `SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT` will waste cluster resource. The difference between incremental collect or not is whether the results will be materialized

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75640/testReport)** for PR 17541 at commit

[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17561 @ueshin, about the Jenkins thing, please refer to https://github.com/apache/spark/pull/17469#issuecomment-292663021. It might be helpful.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17577 LGTM, do we still need #15994?

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 My opinion is: in production, users often select without a limit, which often takes the service offline. This is a common situation, so I added the parameter. When

[GitHub] spark pull request #17582: [SPARK-20239][Core] Improve HistoryServer's ACL m...

2017-04-09 Thread jerryshao
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/17582 [SPARK-20239][Core] Improve HistoryServer's ACL mechanism ## What changes were proposed in this pull request? The current SHS (Spark History Server) has two different ACLs: * ACL of

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75641/testReport)** for PR 17582 at commit

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 > In the production, the user often select without a limit, often lead to service offline,this is a general situation, so increase the parameters. If the users tend to select without a

[GitHub] spark pull request #17569: [SPARK-20253][SQL] Remove unnecessary nullchecks ...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17569

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75643/testReport)** for PR 17582 at commit

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17574 Thanks! Merging to master/2.1

[GitHub] spark pull request #17574: [SPARK-20264][SQL] asm should be non-test depende...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17574

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110569002 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,258 @@ +/* + * Licensed to

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110569876 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r110576977 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -67,6 +67,9 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r110576873 --- Diff: core/src/main/java/org/apache/spark/memory/MemoryConsumer.java --- @@ -41,7 +41,7 @@ protected MemoryConsumer(TaskMemoryManager

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568699 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,249 @@ +/* + * Licensed to

[GitHub] spark issue #17528: [MINOR][R] Reorder `Collate` fields in DESCRIPTION file

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17528 I just found some references about the order. It seems there was a question about it - http://stackoverflow.com/questions/18544006/how-do-i-indicate-collate-order-in-roxygen2 and

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110570339 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17574 Meh let's not bother. There isn't any harm in the current setup since it's already a transitive dependency. Why waste time on those?

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/17577 Merged into master. @cloud-fan #15994 is still needed when a user wants to fill NaN with an extremely large default long value. Thanks.

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17580 What is the purpose of adding this example? I think we already have a `KafkaWordCountProducer` for the convenience of Kafka streaming example, and we could use that to send events to Kafka. I

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568482 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,249 @@ +/* + * Licensed to

[GitHub] spark issue #17569: [SPARK-20253][SQL] Remove unnecessary nullchecks of a re...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17569 thanks, merging to master!

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110570287 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17541

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17580 When a user uses Spark to develop a streaming application, they first want to find and learn from the example programs in 'spark \ examples \ src \ main \ java \ org \ apache \ spark \ examples \
