[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19683 **[Test build #85462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85462/testReport)** for PR 19683 at commit

[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19683 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19683 **[Test build #85461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85461/testReport)** for PR 19683 at commit

[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19683 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85461/ Test FAILed.

[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19683 **[Test build #85461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85461/testReport)** for PR 19683 at commit

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread uzadude
Github user uzadude commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158906785 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -73,8 +73,10 @@ case class

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread uzadude
Github user uzadude commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158906660 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala --- @@ -359,12 +359,12 @@ package object dsl { def

[GitHub] spark issue #20062: [SPARK-22892] [SQL] Simplify some estimation logic by us...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20062 **[Test build #85460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85460/testReport)** for PR 20062 at commit

[GitHub] spark pull request #20030: [SPARK-10496][CORE] Efficient RDD cumulative sum

2017-12-27 Thread zhengruifeng
Github user zhengruifeng closed the pull request at: https://github.com/apache/spark/pull/20030

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85454/ Test PASSed.

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Merged build finished. Test PASSed.

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19977 **[Test build #85454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85454/testReport)** for PR 19977 at commit

[GitHub] spark pull request #19527: [SPARK-13030][ML] Create OneHotEncoderEstimator f...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19527#discussion_r158904223 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoderEstimator.scala --- @@ -0,0 +1,519 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158902856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/GenerateExec.scala --- @@ -85,11 +86,20 @@ case class GenerateExec( val

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158891945 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala --- @@ -73,8 +73,10 @@ case class

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158892688 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala --- @@ -359,12 +359,12 @@ package object dsl { def

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-12-27 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 ping @cloud-fan

[GitHub] spark issue #20029: [SPARK-22793][SQL]Memory leak in Spark Thrift Server

2017-12-27 Thread zuotingbing
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/20029 Could you please check this PR? Thanks @liufengdb

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20099 **[Test build #85459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85459/testReport)** for PR 20099 at commit

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85451/ Test PASSed.

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19977 Merged build finished. Test PASSed.

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19977 **[Test build #85451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85451/testReport)** for PR 19977 at commit

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20099 **[Test build #85458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85458/testReport)** for PR 20099 at commit

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20099 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85458/ Test FAILed.

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20099 Merged build finished. Test FAILed.

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20099 **[Test build #85458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85458/testReport)** for PR 20099 at commit

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20096 Merged build finished. Test FAILed.

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20096 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85455/ Test FAILed.

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20096 **[Test build #85455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85455/testReport)** for PR 20096 at commit

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Merged build finished. Test PASSed.

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85453/ Test PASSed.

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20094 **[Test build #85453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85453/testReport)** for PR 20094 at commit

[GitHub] spark issue #20082: [SPARK-22897][CORE]: Expose stageAttemptId in TaskContex...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20082 **[Test build #85457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85457/testReport)** for PR 20082 at commit

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158899707 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + * Licensed

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158899698 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + * Licensed

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158899621 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + * Licensed

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85450/ Test FAILed.

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Merged build finished. Test FAILed.

[GitHub] spark issue #20091: [SPARK-22465][FOLLOWUP] Update the number of partitions ...

2017-12-27 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20091 @jiangxb1987 I am not disagreeing with your hypothesis that default parallelism might not be optimal in all cases within an application (for example, when different RDDs in an application have widely

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158898080 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -150,6 +150,11 @@ abstract class TaskContext extends Serializable { */

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20094 **[Test build #85450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85450/testReport)** for PR 20094 at commit

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158897767 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -42,6 +42,7 @@ import org.apache.spark.util._ */ private[spark]

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158897401 --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala --- @@ -150,6 +150,11 @@ abstract class TaskContext extends Serializable { */

[GitHub] spark pull request #20059: [SPARK-22648][K8s] Add documentation covering ini...

2017-12-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20059

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158897297 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -42,6 +42,7 @@ import org.apache.spark.util._ */ private[spark]

[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19943 **[Test build #85456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85456/testReport)** for PR 19943 at commit

[GitHub] spark issue #20059: [SPARK-22648][K8s] Add documentation covering init conta...

2017-12-27 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20059 Thanks! merging to master.

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158897045 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,442 @@ +/* + *

[GitHub] spark pull request #19954: [SPARK-22757][Kubernetes] Enable use of remote de...

2017-12-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19954

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85452/ Test FAILed.

[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-27 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19954 Thanks! merging to master.

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20094 **[Test build #85452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85452/testReport)** for PR 20094 at commit

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20094 Merged build finished. Test FAILed.

[GitHub] spark pull request #20093: [SPARK-22909][SS]Move Structured Streaming v2 API...

2017-12-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20093

[GitHub] spark issue #20093: [SPARK-22909][SS]Move Structured Streaming v2 APIs to st...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20093 LGTM, merging to master!

[GitHub] spark pull request #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Co...

2017-12-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20036

[GitHub] spark issue #20036: [SPARK-18016][SQL][FOLLOW-UP] Code Generation: Constant ...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20036 thanks, merging to master!

[GitHub] spark issue #20091: [SPARK-22465][FOLLOWUP] Update the number of partitions ...

2017-12-27 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20091 The major concern is that `spark.default.parallelism` is usually set to a relatively small value, so if the safety check fails, the value of `defaultParallelism` can be even smaller than the

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895420 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895416 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895321 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895355 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcReadBenchmark.scala --- @@ -0,0 +1,357 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895273 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895242 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158895185 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20096 **[Test build #85455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85455/testReport)** for PR 20096 at commit

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894878 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #20082: [SPARK-22897][CORE]: Expose stageAttemptId in Tas...

2017-12-27 Thread advancedxy
Github user advancedxy commented on a diff in the pull request: https://github.com/apache/spark/pull/20082#discussion_r158894808 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -42,6 +42,7 @@ import org.apache.spark.util._ */ private[spark]

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894770 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894676 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894581 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894279 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894151 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158894114 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19943 Thank you for the review, @cloud-fan, @viirya, @kiszk, @HyukjinKwon, @henrify.

[GitHub] spark issue #20100: [SPARK-22913][SQL] Improved Hive Partition Pruning

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20100 Can one of the admins verify this patch?

[GitHub] spark pull request #20100: [SPARK-22913][SQL] Improved Hive Partition Prunin...

2017-12-27 Thread ameent
GitHub user ameent opened a pull request: https://github.com/apache/spark/pull/20100 [SPARK-22913][SQL] Improved Hive Partition Pruning Adding support for Timestamp and Fractional column types. The pruning of partitions of these types is being put behind default options that

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r158893226 --- Diff: sql/core/src/test/resources/sql-tests/inputs/typeCoercion/native/concat.sql --- @@ -0,0 +1,93 @@ +-- Concatenate mixed inputs (output type

[GitHub] spark pull request #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2017-12-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19943#discussion_r158893248 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.scala --- @@ -0,0 +1,432 @@ +/* + *

[GitHub] spark issue #19979: [SPARK-22881][ML][TEST] ML regression package testsuite ...

2017-12-27 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19979 Actually, going further than what Bago said: All of the places which use globalCheckFunction assume that Dataset.collect() returns the Rows in a fixed order. We should really fix those unit

[GitHub] spark issue #19977: [SPARK-22771][SQL] Concatenate binary inputs into a bina...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19977 **[Test build #85454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85454/testReport)** for PR 19977 at commit

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20099 Merged build finished. Test FAILed.

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20099 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85449/ Test FAILed.

[GitHub] spark issue #20099: [SPARK-22916][SQL] shouldn't bias towards build right if...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20099 **[Test build #85449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85449/testReport)** for PR 20099 at commit

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-27 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r158892653 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -658,6 +660,33 @@ object TypeCoercion { }

[GitHub] spark pull request #19977: [SPARK-22771][SQL] Concatenate binary inputs into...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19977#discussion_r158892598 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -658,6 +660,33 @@ object TypeCoercion {

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20096 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85448/ Test FAILed.

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20096 Merged build finished. Test FAILed.

[GitHub] spark issue #20096: [SPARK-22908] Add kafka source and sink for continuous p...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20096 **[Test build #85448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85448/testReport)** for PR 20096 at commit

[GitHub] spark issue #20098: [SPARK-22914][DEPLOY] Register history.ui.port

2017-12-27 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20098 CC @vanzin

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20094 **[Test build #85453 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85453/testReport)** for PR 20094 at commit

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20094 **[Test build #85452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85452/testReport)** for PR 20094 at commit

[GitHub] spark issue #20094: [SPARK-20392][SQL][followup] should not add extra Analys...

2017-12-27 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20094 LGTM with two minor comments.

[GitHub] spark pull request #20094: [SPARK-20392][SQL][followup] should not add extra...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20094#discussion_r158891321 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1079,100 +1083,76 @@ class Analyzer( case

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158891168 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala --- @@ -227,4 +227,30 @@ class MiscBenchmark extends

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158891075 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/MiscBenchmark.scala --- @@ -227,4 +227,30 @@ class MiscBenchmark extends

[GitHub] spark issue #19813: [SPARK-22600][SQL] Fix 64kb limit for deeply nested expr...

2017-12-27 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19813 LGTM, great work!

[GitHub] spark pull request #20094: [SPARK-20392][SQL][followup] should not add extra...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20094#discussion_r158890342 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1079,100 +1083,76 @@ class Analyzer( case sa

[GitHub] spark pull request #20094: [SPARK-20392][SQL][followup] should not add extra...

2017-12-27 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20094#discussion_r158889013 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -723,7 +726,7 @@ class Analyzer(

[GitHub] spark pull request #19683: [SPARK-21657][SQL] optimize explode quadratic mem...

2017-12-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19683#discussion_r158890765 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala --- @@ -276,22 +276,24 @@ class PlanParserSuite extends
