[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95683/ Test PASSed. ---

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95689/testReport)** for PR 22334 at commit

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22334 retest this please

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22333 **[Test build #95683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95683/testReport)** for PR 22333 at commit

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed.

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95682/ Test PASSed. ---

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #95682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95682/testReport)** for PR 21669 at commit

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22112 @tgravescs yes, you are right about the problem here. Instead of asking executors to remove old committed shuffle data, I prefer #6648, which just writes new shuffle data with a different file

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95687/ Test FAILed. ---

[GitHub] spark pull request #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFile...

2018-09-04 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/22324#discussion_r215111327 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -473,6 +476,27 @@ class FileBasedDataSourceSuite extends

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit

[GitHub] spark issue #17174: [SPARK-19145][SQL] Timestamp to String casting is slowin...

2018-09-04 Thread hindog
Github user hindog commented on the issue: https://github.com/apache/spark/pull/17174 I believe another performance impact related to this may be attributed to the `cast` operator failing to match during filter-pushdown, meaning that the filter on the timestamp will NOT get pushed

[GitHub] spark issue #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder should...

2018-09-04 Thread fangshil
Github user fangshil commented on the issue: https://github.com/apache/spark/pull/21310 To summarize our discussion in this pr: Spark-avro is now merged into Spark as a built-in data source. Upstream community is not merging the AvroEncoder to support Avro types in Dataset,

[GitHub] spark pull request #21310: [SPARK-24256][SQL] SPARK-24256: ExpressionEncoder...

2018-09-04 Thread fangshil
Github user fangshil closed the pull request at: https://github.com/apache/spark/pull/21310

[GitHub] spark issue #22324: [SPARK-25237][SQL] Remove updateBytesReadWithFileSize in...

2018-09-04 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22324 ping @srowen @HyukjinKwon

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Ideally the last test should have 50 partitions? Is it because we really need the test data to be at least 50 bytes? Ideally a multiple of 50, I guess. ---

[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...

2018-09-04 Thread bomeng
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 Here is the test code; I am not sure whether it is right --- ``` test("Number of partitions") { sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local")

[GitHub] spark pull request #22320: [SPARK-25313][SQL]Fix regression in FileFormatWri...

2018-09-04 Thread wangyum
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/22320#discussion_r215106921 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala --- @@ -56,7 +56,7 @@ case class

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22138 **[Test build #95688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95688/testReport)** for PR 22138 at commit

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 So Kudu range partitions support arbitrary sized partition intervals, like the example below, where the first and last range partition are six months in size, but the middle partition is one

[GitHub] spark issue #21306: [SPARK-24252][SQL] Add catalog registration and table ca...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21306 Sure, I am looking at the point of view of supporting Kudu. Check out https://kudu.apache.org/docs/schema_design.html#partitioning for some of the details. In particular

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/22332 I also can't find a strong reason to add a new API to `Dataset`... BTW, before adding a new API there, it would be better to discuss it in JIRA before opening a PR, I think. cc: @rxin @cloud-fan @HyukjinKwon

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95687/testReport)** for PR 22334 at commit

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/22334 retest this please

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95684/ Test FAILed. ---

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test FAILed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/21308 @tigerquoll, what we come up with needs to work across a variety of data sources, including those like JDBC that can delete at a lower granularity than partition. For Hive tables, the

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/ ---

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22298 Merged build finished. Test PASSed.

[GitHub] spark pull request #22282: [SPARK-23539][SS] Add support for Kafka headers i...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on a diff in the pull request: https://github.com/apache/spark/pull/22282#discussion_r215092933 --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala --- @@ -88,7 +92,30 @@ private[kafka010] abstract

[GitHub] spark issue #22234: [SPARK-25241][SQL] Configurable empty values when readin...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22234 Did we introduce any behavior change in https://github.com/apache/spark/pull/21273? Does this PR resolve it?

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2848/ ---

[GitHub] spark issue #21308: [SPARK-24253][SQL] Add DeleteSupport mix-in for DataSour...

2018-09-04 Thread tigerquoll
Github user tigerquoll commented on the issue: https://github.com/apache/spark/pull/21308 I am assuming this API was intended to support the "drop partition" use-case. I'm arguing that adding and deleting partitions deal with a concept that is a slightly higher concept than just a

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22298 **[Test build #95686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95686/testReport)** for PR 22298 at commit

[GitHub] spark issue #22298: [SPARK-25021][K8S] Add spark.executor.pyspark.memory lim...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22298 @felixcheung @holdenk I have moved the PySpark example files to a more appropriate location. Any other comments before merge? ---

[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22171 @vinodkc Could you answer the question from @cloud-fan ?

[GitHub] spark issue #22218: [SPARK-25228][CORE]Add executor CPU time metric.

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22218 **[Test build #4331 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4331/testReport)** for PR 22218 at commit

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95685/testReport)** for PR 22313 at commit

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test PASSed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 Retest this please.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22313 At this time, R failure. ``` DONE === Had test warnings or failures; see logs. ``` ---

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Merged build finished. Test FAILed.

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22313 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95680/ Test FAILed. ---

[GitHub] spark issue #22313: [SPARK-25306][SQL] Avoid skewed filter trees to speed up...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22313 **[Test build #95680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95680/testReport)** for PR 22313 at commit

[GitHub] spark issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaData...

2018-09-04 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/spark/pull/22138 @zsxwing If it means code freeze for 2.4 is just around the corner then sure! We can focus on blockers for releasing 2.4, and revisit this again. Let me reflect @gaborgsomogyi review

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Oh, I assumed that it's already dockerized. Sorry, never mind about that @shaneknapp . And, thanks!

[GitHub] spark issue #21756: [SPARK-24764] [CORE] Add ServiceLoader implementation fo...

2018-09-04 Thread dbtsai
Github user dbtsai commented on the issue: https://github.com/apache/spark/pull/21756 add @jerryshao for more feedback. Thanks.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Merged build finished. Test PASSed.

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22334 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22334 **[Test build #95684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95684/testReport)** for PR 22334 at commit

[GitHub] spark pull request #22334: [SPARK-25336][SS]Revert SPARK-24863 and SPARK-247...

2018-09-04 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/22334 [SPARK-25336][SS]Revert SPARK-24863 and SPARK-24748 ## What changes were proposed in this pull request? Revert SPARK-24863 and SPARK-24748 as per discussion in #21721. We will revisit

[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...

2018-09-04 Thread huaxingao
Github user huaxingao commented on the issue: https://github.com/apache/spark/pull/20442 Any more comments? @MLnick @jkbradley

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in Dataset in...

2018-09-04 Thread wmellouli
Github user wmellouli commented on the issue: https://github.com/apache/spark/pull/22332 @mgaido91 Thank you for your suggestion, I updated the PR name, description and sources with a new version using a parameter `atPosition` instead of a flag `atTheEnd`. Let me know what you think

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread shaneknapp
Github user shaneknapp commented on the issue: https://github.com/apache/spark/pull/22333 moving any parts of the spark build infrastructure to use docker is a big project and not happening in the next few months. ---

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22112 Yeah, you would have to be able to handle network partitioning somehow. I don't know how difficult it is, but it's definitely work we may not want to do here. I was trying to clarify and make

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r215070653 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1513,37 +1513,34 @@ private[spark] class DAGScheduler(

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22333 **[Test build #95683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95683/testReport)** for PR 22333 at commit

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22333 Hi, @shaneknapp and @srowen . Can we build and use the zinc-installed docker images in our build system? -

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Merged build finished. Test PASSed.

[GitHub] spark issue #22145: [SPARK-25152][K8S] Enable SparkR Integration Tests for K...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/22145 This PR is waiting on @shaneknapp to migrate to Ubuntu and have R set up on the node responsible for distribution building. This was planned to be done right after the 2.4 cut. ---

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed.

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it's insta...

2018-09-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22333 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2844/ ---

[GitHub] spark pull request #22333: [SPARK-25335][BUILD] Skip Zinc downloading if it'...

2018-09-04 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/22333 [SPARK-25335][BUILD] Skip Zinc downloading if it's installed in the system ## What changes were proposed in this pull request? Zinc is 23.5MB. ``` $ curl -LO
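
The idea in the PR title can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the change the PR actually makes: the function name `resolve_zinc` and the `download-needed` marker are hypothetical, and the only real mechanism used is the POSIX `command -v` lookup on the `PATH`.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PR): print the path of a
# system-wide zinc if one is on the PATH, otherwise print a marker
# telling the caller that a download is still needed.
resolve_zinc() {
  if command -v zinc >/dev/null 2>&1; then
    # zinc is already installed, so reuse it and skip the download.
    command -v zinc
  else
    # zinc was not found; the caller would fall back to downloading it.
    echo "download-needed"
  fi
}

resolve_zinc
```

A build script could branch on the printed value, so the 23.5MB download mentioned above only happens when no system installation is found.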

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/2844/ ---

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22112 > So in order to fix that we would need a way to tell the executors to remove that older committed shuffle data @tgravescs It is also hard to implement such a robust solution for

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21669 This PR has been tested and passed on a local cluster with an integration test that will be merged in a follow-up PR. It passes all three configuration options. It is now in a state that is

[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #95682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95682/testReport)** for PR 21669 at commit

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22192 **[Test build #95681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95681/testReport)** for PR 22192 at commit

[GitHub] spark issue #22192: [SPARK-24918][Core] Executor Plugin API

2018-09-04 Thread bersprockets
Github user bersprockets commented on the issue: https://github.com/apache/spark/pull/22192 retest this please.

[GitHub] spark pull request #21669: [SPARK-23257][K8S][WIP] Kerberos Support for Spar...

2018-09-04 Thread ifilonenko
Github user ifilonenko commented on a diff in the pull request: https://github.com/apache/spark/pull/21669#discussion_r215057079 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -164,7 +164,15 @@ private[spark] class SparkSubmit extends Logging {

[GitHub] spark pull request #22209: [SPARK-24415][Core] Fixed the aggregated stage me...

2018-09-04 Thread ankuriitg
Github user ankuriitg commented on a diff in the pull request: https://github.com/apache/spark/pull/22209#discussion_r215056633 --- Diff: core/src/test/scala/org/apache/spark/status/AppStatusListenerSuite.scala --- @@ -1190,6 +1190,61 @@ class AppStatusListenerSuite extends

[GitHub] spark pull request #22323: [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be t...

2018-09-04 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/22323#discussion_r215040572 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/LocalDirsFeatureStep.scala --- @@ -22,6 +22,7 @@ import

[GitHub] spark pull request #22323: [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be t...

2018-09-04 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/22323#discussion_r215041417 --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala --- @@ -225,6 +225,15 @@ private[spark] object

[GitHub] spark pull request #22323: [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be t...

2018-09-04 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/22323#discussion_r215055594 --- Diff: docs/running-on-kubernetes.md --- @@ -215,6 +215,19 @@ spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.clai

[GitHub] spark pull request #22323: [SPARK-25262][K8S] Allow SPARK_LOCAL_DIRS to be t...

2018-09-04 Thread liyinan926
Github user liyinan926 commented on a diff in the pull request: https://github.com/apache/spark/pull/22323#discussion_r215040130 --- Diff: docs/running-on-kubernetes.md --- @@ -215,6 +215,19 @@ spark.kubernetes.driver.volumes.persistentVolumeClaim.checkpointpvc.options.clai

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215038606 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/image/ImageFileFormatSuite.scala --- @@ -0,0 +1,119 @@ +/* + * Licensed to the

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215037240 --- Diff: mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala --- @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215039097 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -567,6 +567,7 @@ object DataSource extends

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215036263 --- Diff: mllib/src/main/scala/org/apache/spark/ml/source/image/ImageDataSource.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215037968 --- Diff: mllib/src/test/scala/org/apache/spark/ml/source/image/ImageFileFormatSuite.scala --- @@ -0,0 +1,119 @@ +/* + * Licensed to the

[GitHub] spark pull request #22328: [SPARK-22666][ML][SQL] Spark datasource for image...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on a diff in the pull request: https://github.com/apache/spark/pull/22328#discussion_r215036643 --- Diff: mllib/src/main/scala/org/apache/spark/ml/source/image/ImageDataSource.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #22288: [SPARK-22148][Scheduler] Acquire new executors to...

2018-09-04 Thread dhruve
Github user dhruve commented on a diff in the pull request: https://github.com/apache/spark/pull/22288#discussion_r215036162 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala --- @@ -414,9 +425,54 @@ private[spark] class TaskSchedulerImpl(

[GitHub] spark issue #22332: [SPARK-25333][SQL] Ability add new columns in the beginn...

2018-09-04 Thread jaceklaskowski
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/22332 Why not `select($"*", newColumnHere)` or `select(newColumnHere, $"*")`? Somehow I don't think the use case merits overloading `withColumn`. ---

[GitHub] spark issue #22179: [SPARK-23131][SPARK-25176][BUILD] Upgrade Kryo to 4.0.2

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22179 And, @wangyum . Please add `[SPARK-25258]` to the PR title like `[SPARK-25258][SPARK-23131][SPARK-25176]`. SPARK-23131 is the one you created for this PR. Also, the PR description

[GitHub] spark pull request #21638: [SPARK-22357][CORE] SparkContext.binaryFiles igno...

2018-09-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21638#discussion_r215030825 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -47,7 +47,7 @@ private[spark] abstract class

[GitHub] spark issue #22179: [SPARK-23131][SPARK-25176][BUILD] Upgrade Kryo to 4.0.2

2018-09-04 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22179 Although this will give us a different Kryo version (not Hive, ORC), the newly added test cases show the benefit clearly. Also, I checked two new test cases with/without this PR. It looks

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-09-04 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/22112 ok for anyone else trying, I was able to reproduce this consistently with the following code, adding in more repartitions. I have blacklisting, dynamic allocation, and external shuffle service

[GitHub] spark issue #21721: [SPARK-24748][SS] Support for reporting custom metrics v...

2018-09-04 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/21721 Given the uncertainty about how this works across batch, streaming, and CP, and given we are still flushing out the main APIs, I think we should revert this, and revisit when the main APIs are done.

[GitHub] spark pull request #21638: [SPARK-22357][CORE] SparkContext.binaryFiles igno...

2018-09-04 Thread bomeng
Github user bomeng commented on a diff in the pull request: https://github.com/apache/spark/pull/21638#discussion_r215022562 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -47,7 +47,7 @@ private[spark] abstract class StreamFileInputFormat[T]

[GitHub] spark pull request #22295: [SPARK-25255][PYTHON]Add getActiveSession to Spar...

2018-09-04 Thread huaxingao
Github user huaxingao commented on a diff in the pull request: https://github.com/apache/spark/pull/22295#discussion_r215022091 --- Diff: python/pyspark/sql/session.py --- @@ -252,6 +252,16 @@ def newSession(self): """ return self.__class__(self._sc,
