[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21721 **[Test build #94097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94097/testReport)** for PR 21721 at commit

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #94101 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94101/testReport)** for PR 21889 at commit

[GitHub] spark issue #21915: [SPARK-24954][Core] Fail fast on job submit if run a bar...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21915 **[Test build #94090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94090/testReport)** for PR 21915 at commit

[GitHub] spark issue #21957: [SPARK-24994][SQL] When the data type of the field is co...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21957 **[Test build #94094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94094/testReport)** for PR 21957 at commit

[GitHub] spark issue #21586: [SPARK-24586][SQL] Upcast should not allow casting from ...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21586 **[Test build #94103 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94103/testReport)** for PR 21586 at commit

[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21911 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #21933: [SPARK-24917][CORE] make chunk size configurable

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21933 **[Test build #94098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94098/testReport)** for PR 21933 at commit

[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21948 **[Test build #94106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94106/testReport)** for PR 21948 at commit

[GitHub] spark issue #20345: [SPARK-23172][SQL] Expand the ReorderJoin rule to handle...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20345 **[Test build #94093 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94093/testReport)** for PR 20345 at commit

[GitHub] spark issue #21975: [WIP][SPARK-25001][BUILD] Fix miscellaneous build warnin...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21975 **[Test build #94096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94096/testReport)** for PR 21975 at commit

[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21911 **[Test build #94108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94108/testReport)** for PR 21911 at commit

[GitHub] spark issue #21966: [SPARK-23915][SQL][followup] Add array_except function

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21966 **[Test build #94099 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94099/testReport)** for PR 21966 at commit

[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21965 **[Test build #94107 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94107/testReport)** for PR 21965 at commit

[GitHub] spark issue #21955: [SPARK-18057][FOLLOW-UP][SS] Update Kafka client version...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21955 **[Test build #94086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94086/testReport)** for PR 21955 at commit

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21981 Merged to master.

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21981 I am not sure. ML is not my area, but I am pretty sure the people you know are basically the people I know. Ping me if it's something minor or trivial like this; I can review and merge. ---

[GitHub] spark issue #21911: [SPARK-24940][SQL] Coalesce and Repartition Hint for SQL...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21911 **[Test build #94108 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94108/testReport)** for PR 21911 at commit

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Merged build finished. Test PASSed.

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94105/ Test PASSed. ---

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21981 **[Test build #94105 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94105/testReport)** for PR 21981 at commit

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952 The regression happens at writing. Looks like when benchmarking writing time, we don't use `df.count`?

[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21965 **[Test build #94107 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94107/testReport)** for PR 21965 at commit

[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21965 Merged build finished. Test PASSed.

[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21965 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21952 I noticed that the benchmark uses `df.count`. Is it possible that column pruning has some issues in master?

[GitHub] spark issue #21965: [SPARK-23909][SQL] Add filter function.

2018-08-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/21965 Jenkins, retest this please.

[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21948 Merged build finished. Test PASSed.

[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21948 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21948 **[Test build #94106 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94106/testReport)** for PR 21948 at commit

[GitHub] spark issue #21948: [SPARK-24991][SQL] use InternalRow in DataSourceWriter

2018-08-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21948 @rdblue I have documented the object reuse behavior and asked data sources to handle it. Please take a look, thanks!

[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...

2018-08-03 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21898 > Also, if there shouldn't exist two active attempts at the same time for a barrier stage, maybe we should store attemptId as a state variable. Basically, if we see a new attempt ID, we should

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/21981 BTW, @HyukjinKwon, do you know who's still reviewing the ML PRs? I have a few old PRs and I really want to know which of them are considered meaningful. ---

[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21979 Merged build finished. Test PASSed.

[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21895 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94084/ Test PASSed. ---

[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94082/ Test PASSed. ---

[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21895 Merged build finished. Test PASSed.

[GitHub] spark issue #21979: [SPARK-25009][CORE]Standalone Cluster mode application s...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21979 **[Test build #94082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94082/testReport)** for PR 21979 at commit

[GitHub] spark issue #21895: [SPARK-24948][SHS] Delegate check access permissions to ...

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21895 **[Test build #94084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94084/testReport)** for PR 21895 at commit

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21952 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94104/ Test PASSed. ---

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952 LGTM

[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...

2018-08-03 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/21898 Also, if there shouldn't exist two active attempts at the same time for a barrier stage, maybe we should store attemptId as a state variable. Basically, if we see a new attempt ID, we should abort

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21952 Merged build finished. Test PASSed.

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21952 **[Test build #94104 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94104/testReport)** for PR 21952 at commit

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/21981 Thanks for the review @HyukjinKwon.

[GitHub] spark issue #21898: [SPARK-24817][Core] Implement BarrierTaskContext.barrier...

2018-08-03 Thread mengxr
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/21898 Here is what I mean: ~~~scala case class ContextBarrierId(stageId: Int, stageAttemptId: Int) class ContextBarrierState(val numTasks: Int) { private var epoch: Int = 0

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21981 **[Test build #94105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94105/testReport)** for PR 21981 at commit

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21981: [SAPRK-25011][ML]add prefix to __all__ in fpm.py

2018-08-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21981 Merged build finished. Test PASSed.

[GitHub] spark issue #21952: [SPARK-24993] [SQL] Make Avro Fast Again

2018-08-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21952 Ah, I can finally reproduce this. It requires allocating the array feature with length 16000. When I reduced it to 1600, the regression was largely relieved. `com.databricks.spark.avro` is faster
