[GitHub] spark issue #21048: [SPARK-23966][SS] Refactoring all checkpoint file writin...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21048 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21048: [SPARK-23966][SS] Refactoring all checkpoint file writin...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21048 **[Test build #89357 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89357/testReport)** for PR 21048 at commit

[GitHub] spark issue #21048: [SPARK-23966][SS] Refactoring all checkpoint file writin...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21048 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2326/

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21068 **[Test build #89350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89350/testReport)** for PR 21068 at commit

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21068 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89350/ Test FAILed. ---

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21068 Merged build finished. Test FAILed.

[GitHub] spark issue #21063: [SPARK-23886][Structured Streaming][WIP] Update query st...

2018-04-13 Thread jose-torres
Github user jose-torres commented on the issue: https://github.com/apache/spark/pull/21063 I'm not sure isDataAvailable makes sense in the context of continuous processing; it seems fundamentally tied to the microbatch execution model. I think the best option is to just leave it and

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-04-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r181502862 --- Diff: external/kafka-0-10/src/test/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumerSuite.scala --- @@ -0,0 +1,111 @@ +/* + *

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-04-13 Thread koeninger
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r181506863 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,381 @@ +/* + * Licensed

[GitHub] spark pull request #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread rdblue
GitHub user rdblue opened a pull request: https://github.com/apache/spark/pull/21070 SPARK-23972: Update Parquet to 1.10.0. ## What changes were proposed in this pull request? This updates Parquet to 1.10.0 and updates the vectorized path for buffer management changes.

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21070 Could you share the performance number?

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-04-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r181513236 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/FailureWithinTimeIntervalTracker.scala --- @@ -0,0 +1,80 @@ +/* + *

[GitHub] spark pull request #21068: [SPARK-16630][YARN] Blacklist a node if executors...

2018-04-13 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21068#discussion_r181515465 --- Diff: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocatorBlacklistTracker.scala --- @@ -0,0 +1,155 @@ +/* + *

[GitHub] spark issue #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of cached ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20997 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89359/ Test PASSed. ---

[GitHub] spark issue #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of cached ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20997 Merged build finished. Test PASSed.

[GitHub] spark issue #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of cached ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20997 **[Test build #89359 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89359/testReport)** for PR 20997 at commit

[GitHub] spark issue #21053: [SPARK-23924][SQL] Add element_at function

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21053 **[Test build #89356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89356/testReport)** for PR 21053 at commit

[GitHub] spark issue #21053: [SPARK-23924][SQL] Add element_at function

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21053 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89356/ Test FAILed. ---

[GitHub] spark issue #21053: [SPARK-23924][SQL] Add element_at function

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21053 Merged build finished. Test FAILed.

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21044 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89361/ Test PASSed. ---

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21060 Merged build finished. Test PASSed.

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21060 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2323/

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21044 **[Test build #89353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89353/testReport)** for PR 21044 at commit

[GitHub] spark pull request #20560: [SPARK-23375][SQL] Eliminate unneeded Sort in Opt...

2018-04-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20560 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #21034: [SPARK-23926][SQL] Extending reverse function to support...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21034 Merged build finished. Test PASSed.

[GitHub] spark issue #21032: [SPARK-23529][K8s] Support mounting hostPath volumes for...

2018-04-13 Thread madanadit
Github user madanadit commented on the issue: https://github.com/apache/spark/pull/21032 @liyinan926 Sounds good

[GitHub] spark pull request #20888: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

2018-04-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20888#discussion_r181459717 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -152,22 +154,28 @@ class DataFrameRangeSuite extends

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21044 **[Test build #89353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89353/testReport)** for PR 21044 at commit

[GitHub] spark issue #15113: [SPARK-17508][PYSPARK][ML] PySpark treat Param values No...

2018-04-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15113 Is this PR still needed?

[GitHub] spark pull request #21053: [SPARK-23924][SQL] Add element_at function

2018-04-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21053#discussion_r181475737 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +287,106 @@ case class

[GitHub] spark issue #20701: [SPARK-23528][ML] Add numIter to ClusteringSummary

2018-04-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/20701 So we need to update for the changed MimaExcludes. I think it's OK to include this in the model directly if no one objects in the next week or so? Sklearn has this directly in the model return as

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r181478560 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +288,80 @@ case class

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21052 **[Test build #89349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89349/testReport)** for PR 21052 at commit

[GitHub] spark pull request #21048: [SPARK-23966][SS] Refactoring all checkpoint file...

2018-04-13 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/21048#discussion_r181483332 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala --- @@ -0,0 +1,347 @@ +/* + * Licensed to the

[GitHub] spark pull request #21045: [WIP][SPARK-23931][SQL] Adds zip function to spar...

2018-04-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21045#discussion_r181489957 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -87,6 +87,62 @@ case class MapKeys(child:

[GitHub] spark issue #21045: [WIP][SPARK-23931][SQL] Adds zip function to sparksql

2018-04-13 Thread DylanGuedes
Github user DylanGuedes commented on the issue: https://github.com/apache/spark/pull/21045 OK, so it works fine in spark-shell, but in pyspark I got this error: ```shell File "/home/dguedes/Workspace/spark/python/pyspark/sql/functions.py", line 2155, in pyspark.sql.functions.zip

[GitHub] spark issue #21045: [WIP][SPARK-23931][SQL] Adds zip function to sparksql

2018-04-13 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/21045 @DylanGuedes the first suggestion I can give you is: do not use spark-shell for testing, but write UT and run them with a debugger. Then, you can breakpoint to check the generated code (or you can

[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21011 **[Test build #89351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89351/testReport)** for PR 21011 at commit

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-04-13 Thread koeninger
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r181506582 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,381 @@ +/* + * Licensed

[GitHub] spark issue #21043: [SPARK-23963] [SQL] Properly handle large number of colu...

2018-04-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21043 Thanks! Merged to master.

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21060 **[Test build #89363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89363/testReport)** for PR 21060 at commit

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21060 Merged build finished. Test PASSed.

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21044 Merged build finished. Test PASSed.

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21044 **[Test build #89361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89361/testReport)** for PR 21044 at commit

[GitHub] spark pull request #20888: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

2018-04-13 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20888#discussion_r181489845 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -152,22 +154,28 @@ class DataFrameRangeSuite extends

[GitHub] spark issue #20894: [SPARK-23786][SQL] Checking column names of csv headers

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20894 **[Test build #89360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89360/testReport)** for PR 20894 at commit

[GitHub] spark issue #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of cached ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20997 **[Test build #89359 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89359/testReport)** for PR 20997 at commit

[GitHub] spark pull request #20933: [SPARK-23817][SQL]Migrate ORC file format read pa...

2018-04-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20933#discussion_r181507712 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1185,6 +1185,13 @@ object SQLConf { .stringConf

[GitHub] spark pull request #20933: [SPARK-23817][SQL]Migrate ORC file format read pa...

2018-04-13 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/20933#discussion_r181509305 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -368,8 +368,7 @@ case class FileSourceScanExec(

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21060 retest this please

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21070 Upstream benchmarks for buffer management changes are here: https://github.com/apache/parquet-mr/pull/390#issuecomment-338505426 That doesn't show the GC benefit for smaller buffer

[GitHub] spark issue #20888: [SPARK-23775][TEST] Make DataFrameRangeSuite not flaky

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20888 **[Test build #89358 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89358/testReport)** for PR 20888 at commit

[GitHub] spark issue #21063: [SPARK-23886][Structured Streaming][WIP] Update query st...

2018-04-13 Thread jose-torres
Github user jose-torres commented on the issue: https://github.com/apache/spark/pull/21063 I guess we might not even need to make an API change, just document that these flags only mean anything for microbatch execution. In any case that's a separate discussion. ---

[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21011 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89351/ Test PASSed. ---

[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21011 Merged build finished. Test PASSed.

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21044 **[Test build #89361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89361/testReport)** for PR 21044 at commit

[GitHub] spark pull request #20997: [SPARK-19185] [DSTREAMS] Avoid concurrent use of ...

2018-04-13 Thread koeninger
Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/20997#discussion_r181507520 --- Diff: external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala --- @@ -0,0 +1,381 @@ +/* + * Licensed

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21070 **[Test build #89362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89362/testReport)** for PR 21070 at commit

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21070 **[Test build #89362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89362/testReport)** for PR 21070 at commit

[GitHub] spark pull request #21043: [SPARK-23963] [SQL] Properly handle large number ...

2018-04-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21043

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21070 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2327/

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21070 Merged build finished. Test PASSed.

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21070 Merged build finished. Test FAILed.

[GitHub] spark issue #21070: SPARK-23972: Update Parquet to 1.10.0.

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21070 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89362/ Test FAILed. ---

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21060 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2328/

[GitHub] spark issue #21033: [SPARK-19320][MESOS]allow specifying a hard limit on num...

2018-04-13 Thread yanji84
Github user yanji84 commented on the issue: https://github.com/apache/spark/pull/21033 Is there anything else we need to do to merge this change?

[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21011 **[Test build #89340 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89340/testReport)** for PR 21011 at commit

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21060 **[Test build #89352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89352/testReport)** for PR 21060 at commit

[GitHub] spark issue #20988: [SPARK-23877][SQL]: Use filter predicates to prune parti...

2018-04-13 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20988 @cloud-fan or @gatorsmile, could you review this?

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21044 Merged build finished. Test PASSed.

[GitHub] spark issue #20280: [SPARK-22232][PYTHON][SQL] Fixed Row pickling to include...

2018-04-13 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/20280 Awesome, looking forward to the update.

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21068 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89347/ Test FAILed. ---

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21068 **[Test build #89347 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89347/testReport)** for PR 21068 at commit

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21068 Merged build finished. Test FAILed.

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89349/ Test PASSed. ---

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21052 Merged build finished. Test PASSed.

[GitHub] spark pull request #21048: [SPARK-23966][SS] Refactoring all checkpoint file...

2018-04-13 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/21048#discussion_r181486717 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala --- @@ -0,0 +1,347 @@ +/* + *

[GitHub] spark pull request #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-13 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/20938#discussion_r181456175 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +289,160 @@ case class

[GitHub] spark pull request #20888: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

2018-04-13 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20888#discussion_r181460382 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -152,22 +154,28 @@ class DataFrameRangeSuite extends QueryTest with

[GitHub] spark pull request #21057: 2 Improvements to Pyspark docs

2018-04-13 Thread aviv-ebates
Github user aviv-ebates commented on a diff in the pull request: https://github.com/apache/spark/pull/21057#discussion_r181460539 --- Diff: python/pyspark/streaming/kafka.py --- @@ -104,7 +104,7 @@ def createDirectStream(ssc, topics, kafkaParams, fromOffsets=None,

[GitHub] spark issue #21067: [SPARK-23980][K8S] Resilient Spark driver on Kubernetes

2018-04-13 Thread mccheah
Github user mccheah commented on the issue: https://github.com/apache/spark/pull/21067 > We don't have a solid story for checkpointing streaming computation right now, and even if we did, you'll certainly lose all progress from batch jobs. Should probably clarify re:

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21052 **[Test build #89339 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89339/testReport)** for PR 21052 at commit

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21052 Merged build finished. Test PASSed.

[GitHub] spark issue #21052: [SPARK-23799] FilterEstimation.evaluateInSet produces de...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21052 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89339/ Test PASSed. ---

[GitHub] spark issue #21060: [SPARK-23942][PYTHON][SQL][BRANCH-2.3] Makes collect in ...

2018-04-13 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/21060 retest this please

[GitHub] spark issue #21044: [SPARK-9312][ML] Add RawPrediction, numClasses, and numF...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21044 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89353/ Test PASSed. ---

[GitHub] spark pull request #20629: [SPARK-23451][ML] Deprecate KMeans.computeCost

2018-04-13 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/20629#discussion_r181471237 --- Diff: python/pyspark/ml/clustering.py --- @@ -322,7 +323,11 @@ def computeCost(self, dataset): """ Return the K-means cost

[GitHub] spark pull request #20629: [SPARK-23451][ML] Deprecate KMeans.computeCost

2018-04-13 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/20629#discussion_r181470189 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -64,12 +65,12 @@ class ClusteringEvaluator @Since("2.3.0")

[GitHub] spark pull request #20629: [SPARK-23451][ML] Deprecate KMeans.computeCost

2018-04-13 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/20629#discussion_r181470948 --- Diff: mllib/src/main/scala/org/apache/spark/ml/evaluation/ClusteringEvaluator.scala --- @@ -84,12 +85,12 @@ class ClusteringEvaluator @Since("2.3.0")

[GitHub] spark pull request #21061: [SPARK-23914][SQL] Add array_union function

2018-04-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/21061#discussion_r181473537 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala --- @@ -287,3 +288,80 @@ case class

[GitHub] spark issue #21067: [SPARK-23980][K8S] Resilient Spark driver on Kubernetes

2018-04-13 Thread mccheah
Github user mccheah commented on the issue: https://github.com/apache/spark/pull/21067 Looks like there are a lot of conflicts from the refactor that was just merged. In general, though, I don't think this buys us too much. The problem is that when the driver fails, you'll lose

[GitHub] spark pull request #21048: [SPARK-23966][SS] Refactoring all checkpoint file...

2018-04-13 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/21048#discussion_r181480351 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CheckpointFileManager.scala --- @@ -0,0 +1,347 @@ +/* + * Licensed to the

[GitHub] spark issue #21066: [SPARK-23977][CLOUD][WIP] Add commit protocol binding to...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21066 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21065: [SPARK-23979][SQL] MultiAlias should not be a CodegenFal...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21065 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89341/ Test PASSed. ---

[GitHub] spark issue #21065: [SPARK-23979][SQL] MultiAlias should not be a CodegenFal...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21065 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #20888: [SPARK-23775][TEST] Make DataFrameRangeSuite not ...

2018-04-13 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20888#discussion_r181456125 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameRangeSuite.scala --- @@ -152,22 +154,28 @@ class DataFrameRangeSuite extends QueryTest with

[GitHub] spark issue #21032: [SPARK-23529][K8s] Support mounting hostPath volumes for...

2018-04-13 Thread foxish
Github user foxish commented on the issue: https://github.com/apache/spark/pull/21032 @madanadit @liyinan926, can we do a short doc with all the types of volumes and config options and run that through the rest of the community? ---

[GitHub] spark issue #21032: [SPARK-23529][K8s] Support mounting hostPath volumes for...

2018-04-13 Thread madanadit
Github user madanadit commented on the issue: https://github.com/apache/spark/pull/21032 @liyinan926 @foxish I've started a doc [here](https://docs.google.com/document/d/15-mk7UnOYNTXoF6EKaVlelWYc9DTrTXrYoodwDuAwY4/edit?usp=sharing). Feel free to edit and circulate to interested

[GitHub] spark pull request #21069: [SPARK-23920][SQL]add array_remove to remove all ...

2018-04-13 Thread huaxingao
GitHub user huaxingao opened a pull request: https://github.com/apache/spark/pull/21069 [SPARK-23920][SQL]add array_remove to remove all elements that equal element from array ## What changes were proposed in this pull request? add array_remove to remove all elements that

[GitHub] spark issue #21068: [SPARK-16630][YARN] Blacklist a node if executors won't ...

2018-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21068 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional
