[GitHub] spark issue #20091: [SPARK-22465][FOLLOWUP] Update the number of partitions ...

2018-01-18 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20091 Thanks for coding it up, @jiangxb1987! So if I understand it correctly, the cases this PR helps with are: * Max partitioner is not eligible since it is at least an order

[GitHub] spark issue #20295: [WIP][SPARK-23011] Support alternative function form wit...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20295 How do we turn a single group column to a series? just repeat the group column? --- - To unsubscribe, e-mail:

[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20177 can you fix the test? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20026 cc @jerryshao

[GitHub] spark issue #19862: [SPARK-22671][SQL] Make SortMergeJoin shuffle read less ...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19862 I don't agree this is a small change, and users using spark prior to 2.0 won't get this patch, as we don't backport performance improvement patches. Overall this patch won't bring much

[GitHub] spark pull request #20091: [SPARK-22465][FOLLOWUP] Update the number of part...

2018-01-18 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/20091#discussion_r162552121 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -67,31 +69,32 @@ object Partitioner { None } -if

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread fjh100456
Github user fjh100456 commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162551602 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #19583: [WIP][SPARK-22339] [CORE] [NETWORK-SHUFFLE] Push epoch u...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19583 also cc @JoshRosen

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162551462 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...

2018-01-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162551351 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/CompressionCodecSuite.scala --- @@ -0,0 +1,321 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19054: [SPARK-18067] Avoid shuffling child if join keys ...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19054#discussion_r162551159 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -220,45 +220,76 @@ case class

[GitHub] spark pull request #19054: [SPARK-18067] Avoid shuffling child if join keys ...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19054#discussion_r162550613 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala --- @@ -220,45 +220,76 @@ case class

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19175 Merged build finished. Test PASSed.

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19175 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86374/ Test PASSed.

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19175 **[Test build #86374 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86374/testReport)** for PR 19175 at commit

[GitHub] spark pull request #20297: [SPARK-23020][CORE] Fix races in launcher code, t...

2018-01-18 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/20297#discussion_r162550217 --- Diff: launcher/src/main/java/org/apache/spark/launcher/ChildProcAppHandle.java --- @@ -48,14 +48,16 @@ public synchronized void disconnect() {

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-18 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162549759 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -261,37 +263,93 @@ private[spark] class MemoryStore(

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-18 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162548350 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,33 @@ private[spark] class MemoryStore( }

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-18 Thread Ngone51
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162548052 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -233,17 +235,13 @@ private[spark] class MemoryStore( }

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19285 It's just a refactor so I'd like to target it for 2.4

[GitHub] spark pull request #20091: [SPARK-22465][FOLLOWUP] Update the number of part...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20091#discussion_r162549412 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -67,31 +69,32 @@ object Partitioner { None } -if

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/28/ Test

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20316 **[Test build #86377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86377/testReport)** for PR 20316 at commit

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Merged build finished. Test PASSed.

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20316 retest this please

[GitHub] spark pull request #20091: [SPARK-22465][FOLLOWUP] Update the number of part...

2018-01-18 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/20091#discussion_r162548620 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -67,31 +69,32 @@ object Partitioner { None } -if

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20277 **[Test build #86376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86376/testReport)** for PR 20277 at commit

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20277 Merged build finished. Test PASSed.

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/27/ Test

[GitHub] spark issue #19340: [SPARK-22119][ML] Add cosine distance to KMeans

2018-01-18 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19340 @viirya yes, you're right in your analysis. Where in the doc should we put this? @srowen, if you think this is OK, could you please start a build? Thanks.

[GitHub] spark pull request #20297: [SPARK-23020][CORE] Fix races in launcher code, t...

2018-01-18 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/20297#discussion_r162548031 --- Diff: launcher/src/main/java/org/apache/spark/launcher/LauncherConnection.java --- @@ -95,15 +95,15 @@ protected synchronized void send(Message

[GitHub] spark pull request #20324: [SPARK-23091][ML] Incorrect unit test for approxQ...

2018-01-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20324#discussion_r162547532 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -154,24 +154,24 @@ class DataFrameStatSuite extends QueryTest with

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162547306 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala --- @@ -50,7 +50,14 @@ private[sql] trait ColumnarBatchScan

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18277 Merged build finished. Test PASSed.

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86375/ Test PASSed.

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18277 **[Test build #86375 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86375/testReport)** for PR 18277 at commit

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86372/ Test PASSed.

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20277 Merged build finished. Test PASSed.

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20277 **[Test build #86372 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86372/testReport)** for PR 20277 at commit

[GitHub] spark issue #20327: [SPARK-12963][CORE] NM host for driver end points

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20327 Can one of the admins verify this patch?

[GitHub] spark issue #20327: [SPARK-12963][CORE] NM host for driver end points

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20327 Can one of the admins verify this patch?

[GitHub] spark pull request #20327: [SPARK-12963][CORE] NM host for driver end points

2018-01-18 Thread gerashegalov
GitHub user gerashegalov opened a pull request: https://github.com/apache/spark/pull/20327 [SPARK-12963][CORE] NM host for driver end points ## What changes were proposed in this pull request? Driver end points on YARN in the cluster mode are potentially bound to incorrect

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18277 This change looks reasonable to me for now, but I'm also concerned about the behavior change. A note in the release notes should be good, or maybe we need a note in the migration guide in `RDD Programming

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18277 **[Test build #86375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86375/testReport)** for PR 18277 at commit

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Merged build finished. Test PASSed.

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86367/ Test PASSed.

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20316 **[Test build #86367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86367/testReport)** for PR 20316 at commit

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18277 retest this please.

[GitHub] spark issue #20326: [SPARK-23155][DEPLOY] log.server.url links in SHS

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20326 Can one of the admins verify this patch?

[GitHub] spark issue #20326: [SPARK-23155][DEPLOY] log.server.url links in SHS

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20326 Can one of the admins verify this patch?

[GitHub] spark pull request #20326: [SPARK-23155][DEPLOY] log.server.url links in SHS

2018-01-18 Thread gerashegalov
GitHub user gerashegalov opened a pull request: https://github.com/apache/spark/pull/20326 [SPARK-23155][DEPLOY] log.server.url links in SHS ## What changes were proposed in this pull request? Ensure driver/executor log availability via Spark History Server UI even if the

[GitHub] spark issue #20323: [BUILD][MINOR] Fix java style check issues

2018-01-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20323 > If Travis CI can not handle the full traffic of Apache Spark PRs, we may run it for only Java code change PRs. @dongjoon-hyun, do you know if Travis CI supports exclusion/inclusion of

[GitHub] spark issue #20325: [SPARK-22808][DOCS] add insertInto when save hive built ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20325 Can one of the admins verify this patch?

[GitHub] spark issue #20325: [SPARK-22808][DOCS] add insertInto when save hive built ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20325 Can one of the admins verify this patch?

[GitHub] spark issue #19892: [SPARK-22797][PySpark] Bucketizer support multi-column

2018-01-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/19892 I’m generally ok with these small python api wrapper additions getting merged as long as the risk of breaking anything is low - and here it is since it’s just api parity On Fri, 19 Jan

[GitHub] spark pull request #20325: [SPARK-22808][DOCS] add insertInto when save hive...

2018-01-18 Thread brandonJY
GitHub user brandonJY opened a pull request: https://github.com/apache/spark/pull/20325 [SPARK-22808][DOCS] add insertInto when save hive built dataframe ## What changes were proposed in this pull request? based on https://issues.apache.org/jira/browse/SPARK-22808 &

[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20324 Merged build finished. Test FAILed.

[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20324 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86368/ Test FAILed.

[GitHub] spark issue #20324: [SPARK-23091][ML] Incorrect unit test for approxQuantile

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20324 **[Test build #86368 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86368/testReport)** for PR 20324 at commit

[GitHub] spark issue #20298: [SPARK-22976][Core]: Cluster mode driver dir removed whi...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20298 Merged build finished. Test PASSed.

[GitHub] spark issue #20298: [SPARK-22976][Core]: Cluster mode driver dir removed whi...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86365/ Test PASSed.

[GitHub] spark issue #20298: [SPARK-22976][Core]: Cluster mode driver dir removed whi...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20298 **[Test build #86365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86365/testReport)** for PR 20298 at commit

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19285 Are we targeting this to 2.3 or 2.4?

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162536698 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ColumnarBatchScan.scala --- @@ -50,7 +50,14 @@ private[sql] trait ColumnarBatchScan

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162536737 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -127,8 +127,14 @@ class

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86369/ Test FAILed.

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20316 Merged build finished. Test FAILed.

[GitHub] spark issue #20316: [SPARK-23149][SQL] polish ColumnarBatch

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20316 **[Test build #86369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86369/testReport)** for PR 20316 at commit

[GitHub] spark issue #20275: [SPARK-23085][ML] API parity for mllib.linalg.Vectors.sp...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20275 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86373/ Test PASSed.

[GitHub] spark issue #20275: [SPARK-23085][ML] API parity for mllib.linalg.Vectors.sp...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20275 Merged build finished. Test PASSed.

[GitHub] spark issue #20275: [SPARK-23085][ML] API parity for mllib.linalg.Vectors.sp...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20275 **[Test build #86373 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86373/testReport)** for PR 20275 at commit

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162534791 --- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java --- @@ -53,166 +41,83 @@ public int numNulls() {

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19301 I believe this has been fixed, can we close it?

[GitHub] spark issue #19293: [SPARK-22079][SQL] Serializer in HiveOutputWriter miss l...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19293 if it's too hard to write a UT, can we have a code snippet to reproduce this bug and put it in PR description?

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19285 overall looks good

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162534339 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -261,37 +263,93 @@ private[spark] class MemoryStore(

[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162534289 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -162,26 +162,33 @@ private[spark] class MemoryStore( }

[GitHub] spark issue #19892: [SPARK-22797][PySpark] Bucketizer support multi-column

2018-01-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/19892 I mean I think it might have a chance, generally speaking we've allowed outstanding PRs to be merged after the freeze. Since there are outstanding blockers on the branch preventing us from cutting

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19175 **[Test build #86374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86374/testReport)** for PR 19175 at commit

[GitHub] spark issue #18277: [SPARK-20947][PYTHON] Fix encoding/decoding error in pip...

2018-01-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/18277 Jenkins OK to test.

[GitHub] spark issue #19175: [SPARK-21964][SQL]Enable splitting the Aggregate (on Exp...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19175 retest this please

[GitHub] spark issue #19420: [SPARK-22191] [SQL] Add hive serde example with serde pr...

2018-01-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/19420 I love more examples, but is there a place we plan to put this in the documentation?

[GitHub] spark pull request #19420: [SPARK-22191] [SQL] Add hive serde example with s...

2018-01-18 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/19420#discussion_r162532658 --- Diff: examples/src/main/java/org/apache/spark/examples/sql/hive/JavaSparkHiveExample.java --- @@ -124,6 +124,13 @@ public static void main(String[]

[GitHub] spark issue #20260: [SPARK-23039][SQL] Finish TODO work in alter table set l...

2018-01-18 Thread xubo245
Github user xubo245 commented on the issue: https://github.com/apache/spark/pull/20260 I will fix the error in this PR after https://github.com/apache/spark/pull/20249#issuecomment-358720962 is merged.

[GitHub] spark issue #19420: [SPARK-22191] [SQL] Add hive serde example with serde pr...

2018-01-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/19420 Jenkins OK to test

[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17185 I agree it's a valid use case, do you wanna bring it up to date? sorry for the delay!

[GitHub] spark issue #17123: [SPARK-19781][ML] Handle NULLs as well as NaNs in Bucket...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17123 cc @WeichenXu123

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19872#discussion_r162532163 --- Diff: python/pyspark/sql/tests.py --- @@ -4279,6 +4273,425 @@ def test_unsupported_types(self):

[GitHub] spark issue #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to Spark ...

2018-01-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/19876 also maybe @dbtsai ?

[GitHub] spark pull request #20306: [SPARK-23054][SQL][PYSPARK][FOLLOWUP] Use sqlType...

2018-01-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20306

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162531810 --- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnVector.java --- @@ -152,19 +198,11 @@ public final ColumnarRow getStruct(int

[GitHub] spark issue #20306: [SPARK-23054][SQL][PYSPARK][FOLLOWUP] Use sqlType castin...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20306 thanks, merging to master/2.3!

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162531441 --- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java --- @@ -53,166 +41,83 @@ public int numNulls() {

[GitHub] spark pull request #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20277#discussion_r162531459 --- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java --- @@ -53,166 +41,83 @@ public int numNulls() {

[GitHub] spark pull request #20091: [SPARK-22465][FOLLOWUP] Update the number of part...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20091#discussion_r162531289 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -67,31 +69,32 @@ object Partitioner { None } -if

[GitHub] spark issue #20025: [SPARK-22837][SQL]Session timeout checker does not work ...

2018-01-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20025 **[Test build #86370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86370/testReport)** for PR 20025 at commit

[GitHub] spark issue #20025: [SPARK-22837][SQL]Session timeout checker does not work ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20025 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86370/ Test PASSed.

[GitHub] spark issue #20025: [SPARK-22837][SQL]Session timeout checker does not work ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20025 Merged build finished. Test PASSed.

[GitHub] spark issue #18983: [SPARK-21771][SQL]remove useless hive client in SparkSQL...

2018-01-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18983 LGTM, although I'm not very familiar with the thrift server code...

[GitHub] spark issue #20277: [SPARK-23090][SQL] polish ColumnVector

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20277 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/26/ Test
