[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20422 **[Test build #86792 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86792/testReport)** for PR 20422 at commit

[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-29 Thread suchithjn225
Github user suchithjn225 commented on the issue: https://github.com/apache/spark/pull/20177 Done. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20422 Jenkins, ok to test

[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20177 **[Test build #86791 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86791/testReport)** for PR 20177 at commit

[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20430 **[Test build #86790 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86790/testReport)** for PR 20430 at commit

[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20430 Merged build finished. Test PASSed.

[GitHub] spark issue #20430: [SPARK-23263][SQL] Create table stored as parquet should...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20430 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/356/

[GitHub] spark pull request #20430: [SPARK-23263][SQL] Create table stored as parquet...

2018-01-29 Thread wangyum
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/20430 [SPARK-23263][SQL] Create table stored as parquet should update table size if automatic update table size is enabled …update table size is enabled ## What changes were proposed in this

[GitHub] spark pull request #20361: [SPARK-23188][SQL] Make vectorized columar reader...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20361#discussion_r164634591 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java --- @@ -115,13 +116,15 @@

[GitHub] spark pull request #20361: [SPARK-23188][SQL] Make vectorized columar reader...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20361#discussion_r164634402 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/orc/OrcColumnarBatchReader.java --- @@ -49,8 +49,9 @@ * After

[GitHub] spark pull request #20361: [SPARK-23188][SQL] Make vectorized columar reader...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20361#discussion_r164634339 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -400,6 +406,12 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #20361: [SPARK-23188][SQL] Make vectorized columar reader...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20361#discussion_r164634309 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -377,6 +377,12 @@ object SQLConf { .booleanConf

[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20422 I agree with @squito: unless there's a bug in it, it is risky and unnecessary to change the logic in this critical path.

[GitHub] spark pull request #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20426

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20426 LGTM, merging to master/2.3!

[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread yaooqinn
Github user yaooqinn commented on the issue: https://github.com/apache/spark/pull/20422 Thanks, guys, for reviewing. Yes, this is just a minor improvement; the code here did not seem very logical to me when I was trying to do some optimizations for my customer's heavy shuffle case. If

[GitHub] spark pull request #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Us...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20387#discussion_r164633577 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala --- @@ -17,15 +17,149 @@ package

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20426 cc @sameeragarwal @mridulm @cloud-fan

[GitHub] spark issue #20422: [SPARK-23253][Core][Shuffle]Only write shuffle temporary...

2018-01-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20422 Thanks for taking a look at this, @yaooqinn. To clarify -- there is no bug you are trying to fix here, is there? It's just an optimization? From a quick glance I think the change seems correct ...

[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20387 > I thought it was a good thing to push a single node down at a time and not depend on order. The order must be taken care of. For example, we can't push down a limit through Filter, unless

[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20387 > This is a new API... Are you saying you wanna add a new method in `DataFrameReader` that is different than `load`? In Scala, the parameter name is part of the method signature, so for

[GitHub] spark pull request #20335: [SPARK-23088][CORE] History server not showing in...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20335

[GitHub] spark issue #20415: [SPARK-23247][SQL]combines Unsafe operations and statist...

2018-01-29 Thread heary-cao
Github user heary-cao commented on the issue: https://github.com/apache/spark/pull/20415 @hvanhovell, thank you for reviewing it. I tested the code for this PR change, **in FileSourceScanExec->doExecute code:** ``` if (needsUnsafeRowConversion) {

[GitHub] spark issue #20335: [SPARK-23088][CORE] History server not showing incomplet...

2018-01-29 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20335 Since this is a behavior change compared to 2.2/2.3, I will only merge to the master branch. Thanks @pmackles!

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20397 I'm kind of convinced but maybe it's because I wrote this `ReadTask`. Let's get feedback from more people, cc @RussellSpitzer @VincentPoncet @HyukjinKwon @wzhfy @dongjoon-hyun @j-baker

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20428 +1 on `GROUPED AGG` too; we may add a new UDF type when we support partial aggregate.

[GitHub] spark pull request #20408: [SPARK-23189][Core][Web UI] Reflect stage level b...

2018-01-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20408#discussion_r164627546 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -254,6 +255,7 @@ private class LiveExecutor(val executorId: String, _addTime:

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86785/ Test PASSed.

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Merged build finished. Test PASSed.

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20426 **[Test build #86785 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86785/testReport)** for PR 20426 at commit

[GitHub] spark pull request #20427: [SPARK-23260][SPARK-23262][SQL] several data sour...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164623731 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java --- @@ -25,7 +25,7 @@ * session. */

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20427 **[Test build #86789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86789/testReport)** for PR 20427 at commit

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Merged build finished. Test PASSed.

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/355/

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20428 +1 on `GROUPED AGG` to me too, just to be clear.

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread sameeragarwal
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20428 +1 on `GROUPED AGG` as well

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20386 **[Test build #86788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86788/testReport)** for PR 20386 at commit

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20428 Merged build finished. Test PASSed.

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86784/ Test PASSed.

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/354/

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Merged build finished. Test PASSed.

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20428 **[Test build #86784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86784/testReport)** for PR 20428 at commit

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164621094 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java --- @@ -25,7 +25,7 @@ * session. */

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164620326 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java --- @@ -25,7 +25,7 @@ * session. */

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20424 LGTM except two nits and one question.

[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20424#discussion_r164619864 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -191,7 +191,20 @@ private[spark] class

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164619634 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java --- @@ -18,23 +18,23 @@ package org.apache.spark.sql.sources.v2;

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164619561 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java --- @@ -18,23 +18,23 @@ package org.apache.spark.sql.sources.v2;

[GitHub] spark pull request #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sq...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20423

[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20424#discussion_r164619346 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -191,7 +191,20 @@ private[spark] class

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20423 LGTM Thanks! Merged to master/2.3

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20428 The first two changes look good. The last one, maybe `PANDAS GROUP AGG UDF` -> `GROUPED AGG PANDAS UDF`?

[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20424#discussion_r164617708 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -191,7 +191,20 @@ private[spark] class

[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20424#discussion_r164617043 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -191,7 +191,20 @@ private[spark] class

[GitHub] spark pull request #20424: [Spark-23240][python] Better error message when e...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20424#discussion_r164616947 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -191,7 +191,20 @@ private[spark] class

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20429 Merged build finished. Test PASSed.

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20429 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86783/ Test PASSed.

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20429 **[Test build #86783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86783/testReport)** for PR 20429 at commit

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20424 **[Test build #86787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86787/testReport)** for PR 20424 at commit

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20424 retest this please

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20424 Ah, OK. Fixing the error message is fine as a separate improvement.

[GitHub] spark issue #20250: [SPARK-23059][SQL][TEST] Correct some improper with view...

2018-01-29 Thread xubo245
Github user xubo245 commented on the issue: https://github.com/apache/spark/pull/20250 Thanks.

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20424 Merged build finished. Test FAILed.

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86782/ Test FAILed.

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20424 **[Test build #86782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86782/testReport)** for PR 20424 at commit

[GitHub] spark issue #20272: [SPARK-23078] [CORE] [K8s] allow Spark Thrift Server to ...

2018-01-29 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20272 > IIUC there was an issue in launching Thrift Server on YARN cluster mode, and I'm not sure whether it has been fixed (maybe @jerryshao can kindly check that?) Sorry I cannot remember the

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86781/ Test PASSed.

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Merged build finished. Test PASSed.

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20423 **[Test build #86781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86781/testReport)** for PR 20423 at commit

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread sameeragarwal
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20414 Thanks @mridulm, all great points!

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 Ouch... Yea, we have to figure out a way to make it deterministic under hash collisions.

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20399 Merged build finished. Test PASSed.

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20399 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86780/ Test PASSed.

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20399 **[Test build #86780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86780/testReport)** for PR 20399 at commit

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164609599 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java --- @@ -18,23 +18,23 @@ package org.apache.spark.sql.sources.v2;

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20428 Yup, fortunately(?) we are free to rename `SQL_PANDAS_GROUP_AGG_UDF` within 2.4.0 currently but I believe here is a good place to decide based on what I got so far. The proposal seems fine to

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20414 @jiangxb1987 You are correct when the sizes of the maps are the same. But if the map sizes are different, the resulting order can be different - which can happen when requests for additional memory

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 Hey, I searched the `ExternalAppendOnlyMap` and here are the findings: The `ExternalAppendOnlyMap` claims it keeps the sorted content, but it actually uses a `HashComparator` that compares the

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4.2 plus ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20373 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86786/ Test PASSed.

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4.2 plus ...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20373 Merged build finished. Test PASSed.

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4.2 plus ...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20373 **[Test build #86786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86786/testReport)** for PR 20373 at commit

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4.2 plus ...

2018-01-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20373 Yup, I double checked that ahead by manual port. I think that failure might be related with environment which I think is the same I met. If we run the comments as written in the travis, seems

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4.2 plus ...

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20373 I ran cloudpickle_tests locally and verified that the tests pass with the 2 backported fixes applied. I did get an unrelated test failure but that happened even before adding the fixes and

[GitHub] spark pull request #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4....

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/20373#discussion_r164602629 --- Diff: python/pyspark/cloudpickle.py --- @@ -318,6 +329,24 @@ def save_function(self, obj, name=None): Determines what kind of function

[GitHub] spark pull request #20373: [SPARK-23159][PYTHON] Update cloudpickle to v0.4....

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/20373#discussion_r164602283 --- Diff: python/pyspark/cloudpickle.py --- @@ -608,37 +626,22 @@ def save_global(self, obj, name=None, pack=struct.pack): The name of

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20373 **[Test build #86786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86786/testReport)** for PR 20373 at commit

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20373 Merged build finished. Test PASSed.

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20373 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/353/

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20428 * `PANDAS SCALAR UDF` -> `SCALAR PANDAS UDF` This doesn't really change the API so +1 * `PANDAS GROUP MAP UDF` -> `GROUPED MAP PANDAS UDF` The API changes from

[GitHub] spark issue #20408: [SPARK-23189][Core][Web UI] Reflect stage level blacklis...

2018-01-29 Thread attilapiros
Github user attilapiros commented on the issue: https://github.com/apache/spark/pull/20408 @ajbozarth Ok. For tomorrow I can update both the label and the screenshots.

[GitHub] spark pull request #20408: [SPARK-23189][Core][Web UI] Reflect stage level b...

2018-01-29 Thread attilapiros
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20408#discussion_r164594314 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -254,6 +255,7 @@ private class LiveExecutor(val executorId: String,

[GitHub] spark issue #20335: [SPARK-23088][CORE] History server not showing incomplet...

2018-01-29 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/20335 @jiangxb1987 I agree that this is a behavior change, but it's a behavior change back to how it was before we switched the page from scala to js. This LGTM

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/352/

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Merged build finished. Test PASSed.

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20426 **[Test build #86785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86785/testReport)** for PR 20426 at commit

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86779/ Test PASSed.

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Merged build finished. Test PASSed.

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20427 **[Test build #86779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86779/testReport)** for PR 20427 at commit

[GitHub] spark issue #20408: [SPARK-23189][Core][Web UI] Reflect stage level blacklis...

2018-01-29 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/20408 I agree that despite being wordy `Active (Blacklisted in Stages: [...])` looks best, could you update it and post screenshots to confirm? And if @squito says the code looks good I trust

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19575 @icexelloss Thanks for your reply. Welcome your comment in my PR https://github.com/apache/spark/pull/20428
