[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17587 **[Test build #75646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75646/testReport)** for PR 17587 at commit [`001921f`](https://github.com/apache/spark/commit/00

[GitHub] spark issue #17587: [SPARK-20274][SQL] support compatible array element type...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17587 cc @liancheng @kiszk --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark pull request #17587: [SPARK-20274][SQL] support compatible array eleme...

2017-04-09 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/17587 [SPARK-20274][SQL] support compatible array element type in encoder ## What changes were proposed in this pull request? This is a regression caused by SPARK-19716. Before SPARK-1

[GitHub] spark issue #17556: [SPARK-16957][MLlib] Use weighted midpoints for split va...

2017-04-09 Thread facaiy
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/17556 ``` Test Result (1 failure / +1) org.apache.spark.storage.TopologyAwareBlockReplicationPolicyBehavior.Peers in 2 racks ``` Does anyone know what this is?

[GitHub] spark issue #17556: [SPARK-16957][MLlib] Use weighted midpoints for split va...

2017-04-09 Thread facaiy
Github user facaiy commented on the issue: https://github.com/apache/spark/pull/17556 Is there something wrong with Spark CI?

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75643/ Test PASSed.

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17582 Merged build finished. Test PASSed.

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75643/testReport)** for PR 17582 at commit [`e56c388`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17580 Because of the API changes of Kafka, we do not want to delete it, but to maintain and modify it. Although it is absolutely a Kafka producer code, but this is part of spark streaming, it i

[GitHub] spark issue #17586: [SPARK-20249][ML][PYSPARK] Add summary for LinearSVCMode...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17586 **[Test build #75645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75645/testReport)** for PR 17586 at commit [`368af49`](https://github.com/apache/spark/commit/36

[GitHub] spark issue #17586: [SPARK-20249][ML][PYSPARK] Add summary for LinearSVCMode...

2017-04-09 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/17586 I haven't added metrics like ROC to this summary yet; I can add them if necessary.

[GitHub] spark pull request #17586: [SPARK-20249][ML][PYSPARK] Add summary for Linear...

2017-04-09 Thread zjffdu
GitHub user zjffdu opened a pull request: https://github.com/apache/spark/pull/17586 [SPARK-20249][ML][PYSPARK] Add summary for LinearSVCModel ## What changes were proposed in this pull request? Add summary for LinearSVCModel so that user can get the training process status

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17580 I would say this is not a Spark program, it is absolutely a Kafka producer code. To maintain a Kafka Producer example in Spark is not a good choice, this is a legacy code. Because of the API chang

[GitHub] spark issue #17585: [SPARK-20273] [SQL] Disallow Non-deterministic Filter pu...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17585 **[Test build #75644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75644/testReport)** for PR 17585 at commit [`be3fb64`](https://github.com/apache/spark/commit/be

[GitHub] spark issue #17585: [SPARK-20273] [SQL] Disallow Non-deterministic Filter pu...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17585 cc @cloud-fan

[GitHub] spark pull request #17585: [SPARK-20273] [SQL] Disallow Non-deterministic Fi...

2017-04-09 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/17585 [SPARK-20273] [SQL] Disallow Non-deterministic Filter push-down into Join Conditions ## What changes were proposed in this pull request? ``` sql("SELECT t1.b, rand(0) as r FROM cachedDat

[GitHub] spark pull request #17187: [SPARK-19847][SQL] port hive read to FileFormat A...

2017-04-09 Thread cloud-fan
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/17187

[GitHub] spark issue #17330: [SPARK-19993][SQL] Caching logical plans containing subq...

2017-04-09 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/17330 @cloud-fan Sure Wenchen.

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/17580 When a user uses Spark to develop a streaming and Kafka application, they first want to find and learn from the example programs in 'spark \ examples \ src \ main \ java \ org \ apache \ spark \ examples \ st

[GitHub] spark issue #17330: [SPARK-19993][SQL] Caching logical plans containing subq...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17330 Hi @dilipbiswal can you update now?

[GitHub] spark issue #17568: [SPARK-20254][SQL] Remove unnecessary data conversion fo...

2017-04-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/17568 ping @cloud-fan

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r110576977 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -67,6 +67,9 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with Logging

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r110576873 --- Diff: core/src/main/java/org/apache/spark/memory/MemoryConsumer.java --- @@ -41,7 +41,7 @@ protected MemoryConsumer(TaskMemoryManager taskMemoryManage

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17541

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17541 thanks for the review, merging to master!

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110574182 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class CostBasedJoinRe

[GitHub] spark issue #17580: [20269][Structured Streaming] add class 'JavaWordCountPr...

2017-04-09 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17580 What is the purpose of adding this example? I think we already have a `KafkaWordCountProducer` for the convenience of Kafka streaming example, and we could use that to send events to Kafka. I thin

[GitHub] spark issue #15899: [SPARK-18466] added withFilter method to RDD

2017-04-09 Thread danielyli
Github user danielyli commented on the issue: https://github.com/apache/spark/pull/15899 I'm simply making an argument for a specific use case, though you're right, it's used for more than just pattern matching.

[GitHub] spark pull request #17577: [SPARK-20270][SQL] na.fill should not change the ...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17577

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/17577 Merged into master. @cloud-fan #15994 is still needed when a user wants to fill in default long value with an extremely large value into NaN. Thanks.

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/17574 Meh let's not bother. There isn't any harm in the current setup since it's already a transitive dependency. Why waste time on those?

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110573027 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -218,28 +220,44 @@ object JoinReorderDP e

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 Sorry, I was wrong. It just increases the user's query time; it does not occupy the resource.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Merged build finished. Test PASSed.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75640/ Test PASSed.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75640/testReport)** for PR 17541 at commit [`295acc9`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17574 You can give it a try.

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110570339 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP e

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110570287 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP e

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110569876 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP e

[GitHub] spark issue #17528: [MINOR][R] Reorder `Collate` fields in DESCRIPTION file

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17528 I just found some references about the order. It seems there was a question about it - http://stackoverflow.com/questions/18544006/how-do-i-indicate-collate-order-in-roxygen2 and https://cran.r

[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17583 Merged build finished. Test PASSed.

[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75642/ Test PASSed.

[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17583 **[Test build #75642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75642/testReport)** for PR 17583 at commit [`9c7e731`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #17569: [SPARK-20253][SQL] Remove unnecessary nullchecks of a re...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17569 thanks, merging to master!

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110569002 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,258 @@ +/* + * Licensed to

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568699 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,249 @@ +/* + * Licensed to

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17574 @gatorsmile Thanks for the search. I don't see any usage of it in `sql/core` now. It is only used in `core`, `repl`, `graphx`. So I am wondering if we can completely remove it from the dependency.

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568613 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,258 @@ +/* + * Licensed to

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17359#discussion_r110568482 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/NGrams.scala --- @@ -0,0 +1,249 @@ +/* + * Licensed to

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17574 @viirya Based on the code change history, https://github.com/apache/spark/pull/13642 removed the usage of ASM in the test case `SQLMetricsSuite.scala`. Thus, it is safe to remove the test depende

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75643/testReport)** for PR 17582 at commit [`e56c388`](https://github.com/apache/spark/commit/e5

[GitHub] spark pull request #17569: [SPARK-20253][SQL] Remove unnecessary nullchecks ...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17569

[GitHub] spark pull request #17574: [SPARK-20264][SQL] asm should be non-test depende...

2017-04-09 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17574

[GitHub] spark issue #17574: [SPARK-20264][SQL] asm should be non-test dependency in ...

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17574 Thanks! Merging to master/2.1

[GitHub] spark issue #17359: [SPARK-20028][SQL] Add aggreagate expression nGrams

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17359 Regarding the performance issue, does this change have significant improvement compared with Hive's?

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17582 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75641/ Test FAILed.

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75641/testReport)** for PR 17582 at commit [`bc1e53a`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17582 Merged build finished. Test FAILed.

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 > When SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT is open, the process is : beeline[get] -> hs2[get] -> executor[ret] -> hs2[ret] ->beeline[ret] and in the process,the executor's resource i

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 In a department, we can not constraint everyone, but when we start ts2 with this parameter, even if the user goes wrong, it does not matter.We have used the SQLConf.THRIFTSERVER_INCREMENTAL_COLLE

[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17583 **[Test build #75642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75642/testReport)** for PR 17583 at commit [`9c7e731`](https://github.com/apache/spark/commit/9c

[GitHub] spark issue #17584: Document Master URL format in high availability set up

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17584 Can one of the admins verify this patch?

[GitHub] spark issue #17583: [SPARK-20271]Add FuncTransformer to simplify custom tran...

2017-04-09 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/17583 No unit test is added for now as I'm not sure if this is something that would interest the community.

[GitHub] spark pull request #17584: Document Master URL format in high availability s...

2017-04-09 Thread MirrorZ
GitHub user MirrorZ opened a pull request: https://github.com/apache/spark/pull/17584 Document Master URL format in high availability set up ## What changes were proposed in this pull request? Add documentation for adding master url in multi host, port format for standalone

[GitHub] spark pull request #17583: [SPARK-20271]Add FuncTransformer to simplify cust...

2017-04-09 Thread hhbyyh
GitHub user hhbyyh opened a pull request: https://github.com/apache/spark/pull/17583 [SPARK-20271]Add FuncTransformer to simplify custom transformer creation ## What changes were proposed in this pull request? Just to share some code I implemented to help easily create a cus

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 > In the production, the user often select without a limit, often lead to service offline,this is a general situation, so increase the parameters. If the users tend to select without a limit

[GitHub] spark issue #17582: [SPARK-20239][Core] Improve HistoryServer's ACL mechanis...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17582 **[Test build #75641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75641/testReport)** for PR 17582 at commit [`bc1e53a`](https://github.com/apache/spark/commit/bc

[GitHub] spark pull request #17582: [SPARK-20239][Core] Improve HistoryServer's ACL m...

2017-04-09 Thread jerryshao
GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/17582 [SPARK-20239][Core] Improve HistoryServer's ACL mechanism ## What changes were proposed in this pull request? Current SHS (Spark History Server) has two different ACLs: * ACL of base

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 My opinion is: In the production, the user often select without a limit, often lead to service offline,this is a general situation, so increase the parameters. When SQLConf.THRIFTSERVER_I

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17577 LGTM, do we still need #15994?

[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17561 @ueshin, regarding the Jenkins issue, please refer to https://github.com/apache/spark/pull/17469#issuecomment-292663021. It might be helpful.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75640/testReport)** for PR 17541 at commit [`295acc9`](https://github.com/apache/spark/commit/29

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 Btw, I am not sure why you said `SQLConf.THRIFTSERVER_INCREMENTAL_COLLECT` will waste cluster resource. The difference between incremental collect or not is whether the results will be materialized a

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110564162 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -359,9 +359,59 @@ abstract class QueryPlan[PlanType <:

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17541 retest this please

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 If the user really needs to limit the returned results, why not use a `limit` operator directly?

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75638/ Test PASSed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17577 Merged build finished. Test PASSed.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17577 **[Test build #75638 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75638/testReport)** for PR 17577 at commit [`3fb1a66`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17581 Previous reviews can be found at #17561.

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17581 cc @viirya, and also cc @dongjoon-hyun @srowen for previous work

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110562951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala --- @@ -359,9 +359,59 @@ abstract class QueryPlan[PlanType <: Que

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17581 ok to test

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17581 Can one of the admins verify this patch?

[GitHub] spark issue #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17581 @ueshin please take a look at this pr, thanks.

[GitHub] spark pull request #17581: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-09 Thread shaolinliu
GitHub user shaolinliu opened a pull request: https://github.com/apache/spark/pull/17581 [SPARK-20248][ SQL]Spark SQL add limit parameter to enhance the reliability. ## What changes were proposed in this pull request? Add a parameter "spark.sql.thriftServer.retainedResults"

[GitHub] spark pull request #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter ...

2017-04-09 Thread shaolinliu
Github user shaolinliu closed the pull request at: https://github.com/apache/spark/pull/17561

[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread shaolinliu
Github user shaolinliu commented on the issue: https://github.com/apache/spark/pull/17561 ok.

[GitHub] spark issue #17577: [SPARK-20270][SQL] na.fill should not change the values ...

2017-04-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17577 LGTM

[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-04-09 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17149#discussion_r110560960 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -285,7 +285,7 @@ private[spark] class HiveExternalCatalog(c

[GitHub] spark issue #17561: [SPARK-20248][ SQL]Spark SQL add limit parameter to enha...

2017-04-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17561 I don't know the reason, but Jenkins doesn't work here. @shaolinliu Could you close this and send another PR to make Jenkins work?

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Merged build finished. Test FAILed.

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75639 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75639/testReport)** for PR 17541 at commit [`295acc9`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75639/ Test FAILed. ---

[GitHub] spark issue #17558: [SPARK-20247][CORE] Add jar but this jar is missing late...

2017-04-09 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17558 @wangyum, the fix in your PR is more like a bug fix, whereas the comment above is actually a feature request; the two do not completely match. I would suggest focusing on thriftserver

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75639 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75639/testReport)** for PR 17541 at commit [`295acc9`](https://github.com/apache/spark/commit/29

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17541 retest this please

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17541 **[Test build #75637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75637/testReport)** for PR 17541 at commit [`295acc9`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17541 Merged build finished. Test FAILed.
