[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86778/ Test FAILed. ---

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20426 **[Test build #86778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86778/testReport)** for PR 20426 at commit

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20428 **[Test build #86784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86784/testReport)** for PR 20428 at commit

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20423 **[Test build #86781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86781/testReport)** for PR 20423 at commit

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/349/

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20424 **[Test build #86782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86782/testReport)** for PR 20424 at commit

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Merged build finished. Test PASSed.

[GitHub] spark pull request #20408: [SPARK-23189][Core][Web UI] Reflect stage level b...

2018-01-29 Thread attilapiros
Github user attilapiros commented on a diff in the pull request: https://github.com/apache/spark/pull/20408#discussion_r164594314 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -254,6 +255,7 @@ private class LiveExecutor(val executorId: String,

[GitHub] spark issue #20272: [SPARK-23078] [CORE] [K8s] allow Spark Thrift Server to ...

2018-01-29 Thread foxish
Github user foxish commented on the issue: https://github.com/apache/spark/pull/20272 Makes sense. The change LGTM. On Jan 29, 2018 10:23 AM, "Jiang Xingbo" wrote: > IIUC there was an issue in launching Thrift Server on YARN cluster mode,

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/19575 @gatorsmile I don't think any change of naming (group map, group agg, etc) has been agreed upon yet. We can certainly open a PR to discuss it. ---

[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20295 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86773/ Test PASSed. ---

[GitHub] spark issue #20295: [SPARK-23011] Support alternative function form with gro...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20295 Merged build finished. Test PASSed.

[GitHub] spark issue #20424: [Spark-23240][python] Better error message when extraneo...

2018-01-29 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20424 ok to test

[GitHub] spark pull request #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/20428 [SPARK-23261] [PySpark] Rename Pandas UDFs ## What changes were proposed in this pull request? Rename the public APIs and names of pandas udfs. - `PANDAS SCALAR UDF` -> `SCALAR

[GitHub] spark issue #20373: [SPARK-23159][PYTHON] Update cloudpickle to match 0.4.2

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20373 Thanks @ueshin and @HyukjinKwon , those fixes look reasonable to backport so sounds good to me. I'll run some tests too and then add to this PR. ---

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86779/ Test PASSed. ---

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/20428 * `PANDAS SCALAR UDF` -> `SCALAR PANDAS UDF` This doesn't really change the API so +1 * `PANDAS GROUP MAP UDF` -> `GROUPED MAP PANDAS UDF` The API changes from

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86774/ Test PASSed. ---

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20419 Merged build finished. Test PASSed.

[GitHub] spark issue #20425: [SPARK-23259][SQL] Clean up legacy code around hive exte...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20425 **[Test build #86777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86777/testReport)** for PR 20425 at commit

[GitHub] spark issue #20161: [SPARK-21525][streaming] Check error code from superviso...

2018-01-29 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20161 Let's try different people. @zsxwing @squito

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/348/

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20423 retest this please

[GitHub] spark issue #20161: [SPARK-21525][streaming] Check error code from superviso...

2018-01-29 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/20161 LGTM

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Merged build finished. Test FAILed.

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20428 Had an offline discussion with @sameeragarwal and @cloud-fan . To be consistent with the other APIs, we would propose to make the above changes. The major question is about the current

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20399 I am merging this now to master & 2.3

[GitHub] spark pull request #20399: [SPARK-23209][core] Allow credential manager to w...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20399

[GitHub] spark issue #20408: [SPARK-23189][Core][Web UI] Reflect stage level blacklis...

2018-01-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20408 code looks good to me, but let's see what @tgravescs @ajbozarth say. @ajbozarth it is wordy, but I think `Active (Blacklisted in Stages: [...])` is probably the best of the options so far. ---

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20426 **[Test build #86785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86785/testReport)** for PR 20426 at commit

[GitHub] spark issue #20408: [SPARK-23189][Core][Web UI] Reflect stage level blacklis...

2018-01-29 Thread attilapiros
Github user attilapiros commented on the issue: https://github.com/apache/spark/pull/20408 @ajbozarth Ok. For tomorrow I can update both the label and the screenshots.

[GitHub] spark issue #20425: [SPARK-23259][SQL] Clean up legacy code around hive exte...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20425 Merged build finished. Test PASSed.

[GitHub] spark issue #20425: [SPARK-23259][SQL] Clean up legacy code around hive exte...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20425 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86777/ Test PASSed. ---

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20423 **[Test build #86776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86776/testReport)** for PR 20423 at commit

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20429 Merged build finished. Test PASSed.

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/351/

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20429 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/350/

[GitHub] spark issue #20428: [SPARK-23261] [PySpark] Rename Pandas UDFs

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20428 Merged build finished. Test PASSed.

[GitHub] spark issue #20429: [SPARK-23157][SQL] Explain restriction on column express...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20429 **[Test build #86783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86783/testReport)** for PR 20429 at commit

[GitHub] spark pull request #20408: [SPARK-23189][Core][Web UI] Reflect stage level b...

2018-01-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20408#discussion_r164588595 --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala --- @@ -254,6 +255,7 @@ private class LiveExecutor(val executorId: String, _addTime:

[GitHub] spark pull request #20408: [SPARK-23189][Core][Web UI] Reflect stage level b...

2018-01-29 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/20408#discussion_r164584678 --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala --- @@ -594,12 +606,24 @@ private[spark] class AppStatusListener(

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19575 @icexelloss Thanks for your reply. Your comments are welcome in my PR https://github.com/apache/spark/pull/20428

[GitHub] spark issue #20335: [SPARK-23088][CORE] History server not showing incomplet...

2018-01-29 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/20335 @jiangxb1987 I agree that this is a behavior change, but it's a behavior change back to how it was before we switched the page from scala to js. This LGTM ---

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/352/

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20419 **[Test build #86774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86774/testReport)** for PR 20419 at commit

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86776/ Test FAILed. ---

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Merged build finished. Test FAILed.

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20423 Merged build finished. Test PASSed.

[GitHub] spark pull request #20429: [SPARK-23157][SQL] Explain restriction on column ...

2018-01-29 Thread henryr
GitHub user henryr opened a pull request: https://github.com/apache/spark/pull/20429 [SPARK-23157][SQL] Explain restriction on column expression in withColumn() ## What changes were proposed in this pull request? It's not obvious from the comments that any added column must

[GitHub] spark issue #20161: [SPARK-21525][streaming] Check error code from superviso...

2018-01-29 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/20161 lgtm

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164584822 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java --- @@ -25,7 +25,7 @@ * session. */

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20427 **[Test build #86779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86779/testReport)** for PR 20427 at commit

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20427 Merged build finished. Test PASSed.

[GitHub] spark issue #20408: [SPARK-23189][Core][Web UI] Reflect stage level blacklis...

2018-01-29 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/20408 I agree that despite being wordy `Active (Blacklisted in Stages: [...])` looks best, could you update it and post screenshots to confirm? And if @squito says the code looks good I trust

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86775/ Test FAILed. ---

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20386 **[Test build #86775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86775/testReport)** for PR 20386 at commit

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Merged build finished. Test FAILed.

[GitHub] spark pull request #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Descript...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20420

[GitHub] spark issue #20425: [WIP] remove the redundant code in HiveExternalCatlog an...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20425 Merged build finished. Test PASSed.

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20414 @shivaram Thinking more, everything which does a zip (or variants/similar idioms like limit K, etc) on partition might be affected - with random + index in coalesce +

[GitHub] spark issue #20425: [WIP] remove the redundant code in HiveExternalCatlog an...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20425 **[Test build #86777 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86777/testReport)** for PR 20425 at commit

[GitHub] spark issue #20393: [SPARK-23207][SQL] Shuffle+Repartition on a DataFrame co...

2018-01-29 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20393 @sameeragarwal Interesting, this is still assuming that shuffle (after fetch) is stable, right? Is this guaranteed in the face of memory pressure/spills? ---

[GitHub] spark issue #20419: [SPARK-23032][SQL][FOLLOW-UP]Add codegenStageId in comme...

2018-01-29 Thread rednaxelafx
Github user rednaxelafx commented on the issue: https://github.com/apache/spark/pull/20419 @kiszk SGTM and LGTM. Let's ship it! One more question on the side: with the `forceComment = true`, are we fully sure that won't affect the equality of `CodeAndComment`? The whole

[GitHub] spark pull request #20399: [SPARK-23209][core] Allow credential manager to w...

2018-01-29 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20399#discussion_r164539188 --- Diff: core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala --- @@ -75,6 +75,17 @@ private[spark] class

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Merged build finished. Test PASSed.

[GitHub] spark issue #20386: [WIP][SPARK-23202][SQL] Break down DataSourceV2Writer.co...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/342/

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19575 Actually, aggregation can only be executed on grouped data, so `SQL_PANDAS_GROUPED_AGG_UDF` doesn't seem to be very concise. How about `SQL_PANDAS_UDAF`? My only concern is how to support partial

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-29 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20397 > I think the renaming is worth to remove future confusions. What future confusion? I understand that the difference isn't obvious, but making the names less accurate isn't a good

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19575 Thanks @gatorsmile , I made https://issues.apache.org/jira/browse/SPARK-23258 to track changing the `maxRecordsPerBatch` conf and I will externalize it in this PR. > group map ->

[GitHub] spark issue #20420: [SPARK-22916][SQL][FOLLOW-UP] Update the Description of ...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20420 Thanks! Merged to master/2.3. This PR is just to reflect the latest changes in the join selection rule. We can continue the improvement in the comment. ---

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-29 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19575

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20423 **[Test build #86776 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86776/testReport)** for PR 20423 at commit

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19575 Thanks to everyone for contributing and reviewing!

[GitHub] spark issue #20424: [Spark 23240][python] Better error message when extraneo...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20424 Can one of the admins verify this patch?

[GitHub] spark issue #20361: [SPARK-23188][SQL] Make vectorized columar reader batch ...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20361 cc @cloud-fan @sameeragarwal

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Merged build finished. Test PASSed.

[GitHub] spark issue #20426: [SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort f...

2018-01-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/345/

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20397 About the renaming, a lot of people complained to me about why the namings are not consistent, including @rxin . I named it `ReadTask` at the beginning because it really works like a task. But I

[GitHub] spark issue #19575: [SPARK-22221][DOCS] Adding User Documentation for Arrow

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19575 Thanks! I will submit a follow-up PR to rename it. Merged to 2.3 and master.

[GitHub] spark issue #20272: [SPARK-23078] [CORE] [K8s] allow Spark Thrift Server to ...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20272 IIUC there was an issue in launching Thrift Server on YARN cluster mode, and I'm not sure whether it has been fixed (maybe @jerryshao can kindly check that?) Anyway that is not a problem on

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20423 Thanks for submitting this follow-up PR. Originally, I planned to do it with the other renaming work.

[GitHub] spark issue #20326: [SPARK-23155][DEPLOY] log.server.url links in SHS

2018-01-29 Thread gerashegalov
Github user gerashegalov commented on the issue: https://github.com/apache/spark/pull/20326 Thanks for the feedback, Marcelo. I was also going back and forth on where to put this logic. Ideally YARN should provide a permalink similar to AM proxy for logs as well. However, this was faster

[GitHub] spark pull request #20427: [SPARK-23260][SQL] remove V2 from the class name ...

2018-01-29 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/20427#discussion_r164543216 --- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/ReadSupport.java --- @@ -18,23 +18,23 @@ package org.apache.spark.sql.sources.v2;

[GitHub] spark issue #18649: [SPARK-21395][SQL] Spark SQL hive-thriftserver doesn't r...

2018-01-29 Thread debugger87
Github user debugger87 commented on the issue: https://github.com/apache/spark/pull/18649 https://github.com/apache/spark/pull/19721 fixed the same issue, so I will close this one.

[GitHub] spark pull request #18649: [SPARK-21395][SQL] Spark SQL hive-thriftserver do...

2018-01-29 Thread debugger87
Github user debugger87 closed the pull request at: https://github.com/apache/spark/pull/18649

[GitHub] spark pull request #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sq...

2018-01-29 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/20423 [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execution.arrow.maxRecordsPerBatch ## What changes were proposed in this pull request? This is a followup to #19575 which added a

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20414 @mridulm I also agree we should follow @sameeragarwal 's suggestion to let shuffle fetch produce deterministic output, and only do this for a few operations (e.g. repartition/zipWithIndex, do

[GitHub] spark issue #20414: [SPARK-23243][SQL] Shuffle+Repartition on an RDD could l...

2018-01-29 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/20414 @jiangxb1987 Unfortunately I am unable to analyze this in detail, but hopefully I can give some pointers that help! One example I can think of is, for shuffle which uses

[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20387 > I'd suggest that we just propagate the paths parameter to options, and data source implementations are free to interpret the path option however they want, e.g. as table and database names.

[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20387 > [The push-down rule may be run more than once if filters are not pushed through projections] looks weird, do you have a query to reproduce this issue? One of the DataSourceV2 tests hit

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread sameeragarwal
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20399 @vanzin and reviewers -- is this ready to go? We're waiting on RC3 for this. Thanks!

[GitHub] spark pull request #19575: [SPARK-22221][DOCS] Adding User Documentation for...

2018-01-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19575#discussion_r164513778 --- Diff: docs/sql-programming-guide.md --- @@ -1640,6 +1640,133 @@ Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` a

[GitHub] spark issue #20387: [SPARK-23203][SPARK-23204][SQL]: DataSourceV2: Use immut...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20387 I dug into the commit history and recalled why I made these decisions: * having a mutable `DataSourceV2Relation`. This is mostly to avoid adding ever more constructor parameters to

[GitHub] spark issue #20427: [SPARK-23260][SQL] remove V2 from the class name of data...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20427 **[Test build #86779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86779/testReport)** for PR 20427 at commit

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20399 **[Test build #86780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86780/testReport)** for PR 20399 at commit

[GitHub] spark issue #20399: [SPARK-23209][core] Allow credential manager to work whe...

2018-01-29 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20399 I was hoping that one of the other committers who +1'ed the patch would push it instead of me. (Ignoring the info vs. debug discussion.)

[GitHub] spark issue #20397: [SPARK-23219][SQL]Rename ReadTask to DataReaderFactory i...

2018-01-29 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20397 My point is, `ReadTask` is more precise, but `DataReaderFactory` also works (Spark serializes it, sends it to executors, and asks it to create a data reader, so it's reasonable to call it a factory). If we

[GitHub] spark issue #20423: [SPARK-22221][SQL][FOLLOWUP] Externalize spark.sql.execu...

2018-01-29 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20423 cc @gatorsmile

[GitHub] spark pull request #20424: [Spark 23240][python] Better error message when e...

2018-01-29 Thread bersprockets
GitHub user bersprockets opened a pull request: https://github.com/apache/spark/pull/20424 [Spark 23240][python] Better error message when extraneous data in pyspark.daemon's stdout ## What changes were proposed in this pull request? Print more helpful message when daemon
