date:20190816

[GitHub] [spark] SparkQA commented on issue #22570: [SPARK-25553][BUILD] Add EmptyInterpolatedStringChecker to scalastyle config

2019-08-16 Thread GitBox

SparkQA commented on issue #22570: [SPARK-25553][BUILD] Add 
EmptyInterpolatedStringChecker to scalastyle config
URL: https://github.com/apache/spark/pull/22570#issuecomment-522110140
 
 
   **[Test build #109227 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109227/testReport)** for PR 22570 at commit [`99968b9`](https://github.com/apache/spark/commit/99968b9960e9393f90e303e937b3f2fb02ae2e3c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on issue #25474: [SPARK-28758][BUILD][SQL] Upgrade Janino to 3.0.15

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25474: [SPARK-28758][BUILD][SQL] 
Upgrade Janino to 3.0.15
URL: https://github.com/apache/spark/pull/25474#issuecomment-522109298
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25474: [SPARK-28758][BUILD][SQL] Upgrade Janino to 3.0.15

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25474: [SPARK-28758][BUILD][SQL] Upgrade 
Janino to 3.0.15
URL: https://github.com/apache/spark/pull/25474#issuecomment-522057766
 
 
   **[Test build #109219 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109219/testReport)** for PR 25474 at commit [`0466b8d`](https://github.com/apache/spark/commit/0466b8d737b33b4862e296d66dc80cb23ea1b05e).



[GitHub] [spark] AmplabJenkins commented on issue #25474: [SPARK-28758][BUILD][SQL] Upgrade Janino to 3.0.15

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25474: [SPARK-28758][BUILD][SQL] Upgrade 
Janino to 3.0.15
URL: https://github.com/apache/spark/pull/25474#issuecomment-522109298
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] dongjoon-hyun commented on issue #25455: [SPARK-28737][CORE] Update Jersey to 2.29

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25455: [SPARK-28737][CORE] Update Jersey to 
2.29
URL: https://github.com/apache/spark/pull/25455#issuecomment-522110149
 
 
   Oops. Sorry. Although #25474 touches the same files, I thought it's okay 
because it touches a different part. Could you rebase once more?



[GitHub] [spark] maryannxue commented on a change in pull request #25471: [SPARK-28753][SQL] Dynamically reuse subqueries in AQE

2019-08-16 Thread GitBox

maryannxue commented on a change in pull request #25471: [SPARK-28753][SQL] 
Dynamically reuse subqueries in AQE
URL: https://github.com/apache/spark/pull/25471#discussion_r314838909
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
 ##
 @@ -119,6 +122,7 @@ case class AdaptiveSparkPlanExec(
 if (isFinalPlan) {
   currentPhysicalPlan.execute()
 } else {
+  currentPhysicalPlan = applyPhysicalRules(currentPhysicalPlan, queryStagePreparationRules)
 
 Review comment:
   We have split the pre-planning rules into two groups now: 
`preprocessingRules` and `queryStagePreparationRules`. The former must run 
before `AdaptiveSparkPlanExec` is created, because otherwise the initial 
physical plan would have logical subquery nodes in it. The latter, though, can 
be deferred until after `AdaptiveSparkPlanExec` is created.
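
Illustration (not part of the original comment): a toy sketch of the two-phase split described above. It is not Spark's implementation. The `Plan` type, the two example rules, and `applyRules` are hypothetical stand-ins; only the group names `preprocessingRules` and `queryStagePreparationRules` come from the comment.

```scala
// Toy model: a "plan" is just a list of node names, and a rule rewrites a plan.
object RulePhasesSketch {
  type Plan = List[String]
  type Rule = Plan => Plan

  // Fold the plan through each rule in order.
  def applyRules(plan: Plan, rules: Seq[Rule]): Plan =
    rules.foldLeft(plan) { (p, rule) => rule(p) }

  // Hypothetical preprocessing rule: replace logical subquery nodes.
  val planSubqueries: Rule =
    _.map(n => if (n == "LogicalSubquery") "PhysicalSubquery" else n)

  // Hypothetical stage-preparation rule: put an exchange on top of the plan.
  val addExchange: Rule = plan => "Exchange" :: plan

  val preprocessingRules: Seq[Rule] = Seq(planSubqueries)
  val queryStagePreparationRules: Seq[Rule] = Seq(addExchange)

  // Phase 1: runs before the adaptive node would be created.
  val initialPlan: Plan =
    applyRules(List("Scan", "LogicalSubquery"), preprocessingRules)

  // Phase 2: can be deferred, for example until the plan is first executed.
  val preparedPlan: Plan = applyRules(initialPlan, queryStagePreparationRules)
}
```

In this sketch the first phase guarantees that no logical node survives into the plan handed to the adaptive wrapper, while the second phase can run at any later point.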



[GitHub] [spark] AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522105668
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109226/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522105668
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109226/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522105659
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #23531: [SPARK-24497][SQL] Support 
recursive SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522105659
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #23531: [SPARK-24497][SQL] Support recursive 
SQL query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522057866
 
 
   **[Test build #109226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109226/testReport)** for PR 23531 at commit [`cfe0a67`](https://github.com/apache/spark/commit/cfe0a67eadc4cbe3aeda786401beef9d472f4ebd).



[GitHub] [spark] SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL query

2019-08-16 Thread GitBox

SparkQA commented on issue #23531: [SPARK-24497][SQL] Support recursive SQL 
query
URL: https://github.com/apache/spark/pull/23531#issuecomment-522105388
 
 
   **[Test build #109226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109226/testReport)** for PR 23531 at commit [`cfe0a67`](https://github.com/apache/spark/commit/cfe0a67eadc4cbe3aeda786401beef9d472f4ebd).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' 
and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-522103413
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14303/
   Test PASSed.



[GitHub] [spark] SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

SparkQA commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as 
input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-522104051
 
 
   **[Test build #109234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109234/testReport)** for PR 25458 at commit [`9e9aac3`](https://github.com/apache/spark/commit/9e9aac3269ed5956afc8a9e6a4491f9e6e747508).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25458: [SPARK-27931][SQL] Accept 'on' 
and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-522103408
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 
'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-522103413
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14303/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25458: [SPARK-27931][SQL] Accept 'on' and 
'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#issuecomment-522103408
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25299: [SPARK-27651][Core] Avoid the 
network when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522102166
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109225/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25299: [SPARK-27651][Core] Avoid the 
network when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522102160
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25299: [SPARK-27651][Core] Avoid the 
network when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522057827
 
 
   **[Test build #109225 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109225/testReport)** for PR 25299 at commit [`af99e79`](https://github.com/apache/spark/commit/af99e79bb7d5bb82203440348c11c7d1e522f397).



[GitHub] [spark] AmplabJenkins commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25299: [SPARK-27651][Core] Avoid the network 
when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522102166
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109225/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25299: [SPARK-27651][Core] Avoid the network 
when shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522102160
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #25299: [SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host

2019-08-16 Thread GitBox

SparkQA commented on issue #25299: [SPARK-27651][Core] Avoid the network when 
shuffle blocks are fetched from the same host
URL: https://github.com/apache/spark/pull/25299#issuecomment-522101866
 
 
   **[Test build #109225 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109225/testReport)** for PR 25299 at commit [`af99e79`](https://github.com/apache/spark/commit/af99e79bb7d5bb82203440348c11c7d1e522f397).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA commented on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

SparkQA commented on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522100789
 
 
   **[Test build #4834 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4834/testReport)** for PR 25476 at commit [`5d6015a`](https://github.com/apache/spark/commit/5d6015abefed7c766f49eac8d7e89228a9246529).



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830758
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDataConsumerSuite.scala
 ##
 @@ -95,10 +100,12 @@ class KafkaDataConsumerSuite extends SharedSQLContext 
with PrivateMethodTester {
   try {
 val range = consumer.getAvailableOffsetRange()
 val rcvd = range.earliest until range.latest map { offset =>
-  val bytes = consumer.get(offset, Long.MaxValue, 1, failOnDataLoss = false).value()
-  new String(bytes)
+  val record = consumer.get(offset, Long.MaxValue, 1, failOnDataLoss = false)
+  val value = new String(record.value(), StandardCharsets.UTF_8)
+  val headers = record.headers().toArray.map(header => (header.key(), header.value())).toSeq
+  (value, headers)
 }
-assert(rcvd == data)
+data === rcvd
 
 Review comment:
   Why remove the assertion?
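
Illustration (not part of the original comment): a minimal plain-Scala sketch of why a bare comparison is not a check. The values are made up. As far as I know, ScalaTest's `===` behaves the same way when written outside `assert(...)`: it yields a `Boolean` that is simply discarded.

```scala
object AssertionSketch {
  val data = Seq("1", "2", "3")
  val rcvd = Seq("1", "2") // deliberately different from data

  // The comparison evaluates to false, but the result is thrown away,
  // so this method returns normally and a test containing it would pass.
  def bareComparison(): Unit = {
    data == rcvd
    ()
  }

  // Wrapping the comparison in assert makes the mismatch fail loudly.
  def assertedComparison(): Unit = assert(data == rcvd)
}
```

The compiler may warn that the bare expression does nothing, which is a useful hint that an assertion was lost.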



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830946
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala
 ##
 @@ -158,19 +157,21 @@ abstract class KafkaRelationSuiteBase extends QueryTest 
with SharedSQLContext wi
 val topic = newTopic()
 testUtils.createTopic(topic, partitions = 3)
 testUtils.sendMessage(
-  topic, ("1", Array(("once", "1".getBytes), ("twice", "2".getBytes))), Some(0)
+  topic, ("1", Seq()), Some(0)
 )
 testUtils.sendMessage(
-  topic, ("2", Array(("once", "2".getBytes), ("twice", "4".getBytes))), Some(1)
+  topic, ("2", Seq(("a", "b".getBytes("UTF-8")), ("c", "d".getBytes("UTF-8")))), Some(1)
 
 Review comment:
   Go ahead and use `UTF-8` rather than the string here
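
Illustration (not part of the original comment): the comment presumably refers to the `StandardCharsets.UTF_8` constant, which the same PR already uses in `KafkaDataConsumerSuite`. That reading is an assumption. A minimal sketch of the two forms:

```scala
import java.nio.charset.StandardCharsets

object CharsetSketch {
  // Charset looked up by name at runtime; a typo in the name would only
  // surface as an exception when this line executes.
  val byName: Array[Byte] = "d".getBytes("UTF-8")

  // Charset passed as a constant; the compiler checks the reference.
  val byConstant: Array[Byte] = "d".getBytes(StandardCharsets.UTF_8)
}
```

Both calls produce the same bytes; the constant form just removes the string lookup.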



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830204
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala
 ##
 @@ -88,7 +92,19 @@ private[kafka010] abstract class KafkaRowWriter(
   throw new NullPointerException(s"null topic present in the data. Use the " +
 s"${KafkaSourceProvider.TOPIC_OPTION_KEY} option for setting a default topic.")
 }
-val record = new ProducerRecord[Array[Byte], Array[Byte]](topic.toString, key, value)
+val record = if (projectedRow.isNullAt(3)) {
+  new ProducerRecord[Array[Byte], Array[Byte]](topic.toString, null, key, value)
+} else {
+  val headerArray = projectedRow.getArray(3)
+  val headers = (0 until headerArray.numElements()).map(
 
 Review comment:
   Total nit, but can you just `.map { i =>`? saves a few parens
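
Illustration (not part of the original comment): a small sketch of the two lambda styles. `headerArray` here is a hypothetical `Vector` standing in for the row's array, not the type used in the PR.

```scala
object MapBracesSketch {
  // Hypothetical stand-in for the header array read from the row.
  val headerArray = Vector("k1" -> "v1", "k2" -> "v2")

  // Parenthesised lambda: the closing paren of map trails the body.
  val withParens = (0 until headerArray.length).map(i =>
    (i, headerArray(i)._1)
  )

  // Brace form suggested in the review: same result, one pair of parens fewer.
  val withBraces = (0 until headerArray.length).map { i =>
    (i, headerArray(i)._1)
  }
}
```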



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830807
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaRelationSuite.scala
 ##
 @@ -70,7 +70,7 @@ abstract class KafkaRelationSuiteBase extends QueryTest with 
SharedSQLContext wi
   protected def createDF(
   topic: String,
   withOptions: Map[String, String] = Map.empty[String, String],
-  brokerAddress: Option[String] = None) = {
+  brokerAddress: Option[String] = None, includeHeaders: Boolean = false) = {
 
 Review comment:
   Nit: put the new param on a new line



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314831207
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala
 ##
 @@ -257,17 +259,34 @@ class KafkaTestUtils(withBrokerProps: Map[String, 
Object] = Map.empty) extends L
   topic: String,
   messages: Array[String],
   partition: Option[Int]): Seq[(String, RecordMetadata)] = {
+sendMessages(topic, messages.map(m => (m, Seq())), partition)
+  }
+
+  /** Send record to the Kafka broker with headers using specified partition */
+  def sendMessage(topic: String,
+  record: (String, Seq[(String, Array[Byte])]),
+  partition: Option[Int]): Seq[(String, RecordMetadata)] = {
+sendMessages(topic, Array(record).toSeq, partition)
+  }
+
+  /** Send the array of records to the Kafka broker with headers using specified partition */
+  def sendMessages(topic: String,
+   records: Seq[(String, Seq[(String, Array[Byte])])],
+   partition: Option[Int]): Seq[(String, RecordMetadata)] = {
 producer = new KafkaProducer[String, String](producerConfiguration)
 val offsets = try {
-  messages.map { m =>
+  records.map { r =>
 
 Review comment:
   Can you make this more readable with something like 
`.map { case (value, header) =>`?
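
Illustration (not part of the original comment): a sketch of destructuring the record tuple in the lambda instead of using `_1` and `_2`. The record shape mirrors the `sendMessages` signature in the diff; the sample data and the string formatting are made up.

```scala
import java.nio.charset.StandardCharsets

object TupleCaseSketch {
  type Header = (String, Array[Byte])

  // Made-up sample records: (value, headers).
  val records: Seq[(String, Seq[Header])] = Seq(
    ("1", Seq.empty),
    ("2", Seq(("a", "b".getBytes(StandardCharsets.UTF_8))))
  )

  // Positional accessors: the reader has to remember what _1 and _2 mean.
  val viaAccessors: Seq[String] = records.map { r => s"${r._1}:${r._2.size}" }

  // Pattern-matching lambda: each tuple field gets a descriptive name.
  val viaCase: Seq[String] = records.map { case (value, headers) =>
    s"$value:${headers.size}"
  }
}
```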



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314829652
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSource.scala
 ##
 @@ -297,17 +298,16 @@ private[kafka010] class KafkaSource(
 }.toArray
 
 // Create an RDD that reads from Kafka and get the (key, value) pair as byte arrays.
-val rdd = new KafkaSourceRDD(
+val rdd = if (includeHeaders) {
+  new KafkaSourceRDD(
   sc, executorKafkaParams, offsetRanges, pollTimeoutMs, failOnDataLoss,
-  reuseKafkaConsumer = true).map { cr =>
-  InternalRow(
-cr.key,
-cr.value,
-UTF8String.fromString(cr.topic),
-cr.partition,
-cr.offset,
-DateTimeUtils.fromJavaTimestamp(new java.sql.Timestamp(cr.timestamp)),
-cr.timestampType.id)
+  reuseKafkaConsumer = true)
+.map(KafkaOffsetReader.toInternalRowWithHeaders(_))
 
 Review comment:
   It really doesn't matter, but you can omit `(_)`
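
Illustration (not part of the original comment): a sketch of passing a method directly to `map` instead of wrapping it in a placeholder lambda. `toRow` is a hypothetical stand-in for `toInternalRowWithHeaders`.

```scala
object EtaSketch {
  // Hypothetical stand-in for the conversion method named in the diff.
  def toRow(n: Int): String = s"row-$n"

  // Explicit placeholder lambda, as written in the diff.
  val explicit: Seq[String] = Seq(1, 2).map(toRow(_))

  // The method reference is converted to a function automatically.
  val short: Seq[String] = Seq(1, 2).map(toRow)
}
```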



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830604
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDataConsumerSuite.scala
 ##
 @@ -62,9 +60,16 @@ class KafkaDataConsumerSuite extends SharedSQLContext with 
PrivateMethodTester {
 
   test("SPARK-23623: concurrent use of KafkaDataConsumer") {
 val topic = "topic" + Random.nextInt()
-val data = (1 to 1000).map(_.toString)
+val data = (1 to 1000).map(i =>
+  (i.toString,
+Array(
 
 Review comment:
   Again just a nit, but can you just make a `Seq` rather than 
`Array(...).toSeq`?
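
   A minimal sketch of the difference, assuming the test data is a sequence of `(headerName, bytes)` pairs; the header names `once` and `twice` are made up for illustration and do not come from the PR.

```scala
object SeqLiteralExample {
  // Building an Array first and converting it afterwards.
  val viaArray: Seq[(String, Array[Byte])] =
    Array(("once", "1".getBytes("UTF-8")), ("twice", "2".getBytes("UTF-8"))).toSeq

  // Building the Seq directly, which skips the intermediate Array.
  val direct: Seq[(String, Array[Byte])] =
    Seq(("once", "1".getBytes("UTF-8")), ("twice", "2".getBytes("UTF-8")))
}
```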



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314830320
 
 

 ##
 File path: 
external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala
 ##
 @@ -88,7 +92,19 @@ private[kafka010] abstract class KafkaRowWriter(
   throw new NullPointerException(s"null topic present in the data. Use the 
" +
 s"${KafkaSourceProvider.TOPIC_OPTION_KEY} option for setting a default 
topic.")
 }
-val record = new ProducerRecord[Array[Byte], Array[Byte]](topic.toString, 
key, value)
+val record = if (projectedRow.isNullAt(3)) {
+  new ProducerRecord[Array[Byte], Array[Byte]](topic.toString, null, key, 
value)
+} else {
+  val headerArray = projectedRow.getArray(3)
+  val headers = (0 until headerArray.numElements()).map(
+i => {
+  val struct = headerArray.getStruct(i, 2)
+  new RecordHeader(struct.getUTF8String(0).toString, 
struct.getBinary(1))
+.asInstanceOf[Header]
 
 Review comment:
   Do you need this cast? Isn't it already a `Header`? I could be missing something.
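
   One situation where such a cast tends to appear (an assumption here, since the rest of the diff is not shown) is when the Scala collection is later converted to a Java collection, which is invariant in its element type. The sketch below uses hypothetical `Header` and `RecordHeader` stand-ins rather than the Kafka classes, and shows a type ascription as a compile-time alternative to `asInstanceOf`. It imports `scala.collection.JavaConverters`, which is deprecated on Scala 2.13+ in favour of `scala.jdk.CollectionConverters`.

```scala
import java.util.{List => JList}
import scala.collection.JavaConverters._

object HeaderWideningExample {
  // Hypothetical stand-ins for Kafka's Header interface and RecordHeader class.
  trait Header { def key: String }
  final class RecordHeader(val key: String, val value: Array[Byte]) extends Header

  // Scala's Seq is covariant, so no cast is needed on the Scala side.
  val scalaHeaders: Seq[Header] =
    (0 until 2).map(i => new RecordHeader(s"h$i", Array(i.toByte)))

  // java.util.List is invariant; the ascription `: Header` widens the element
  // type at compile time, so no runtime cast is involved.
  val javaHeaders: JList[Header] =
    (0 until 2).map(i => (new RecordHeader(s"h$i", Array(i.toByte)): Header)).asJava
}
```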



[GitHub] [spark] srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add support for Kafka headers in Structured Streaming

2019-08-16 Thread GitBox

srowen commented on a change in pull request #22282: [SPARK-23539][SS] Add 
support for Kafka headers in Structured Streaming
URL: https://github.com/apache/spark/pull/22282#discussion_r314828809
 
 

 ##
 File path: docs/structured-streaming-kafka-integration.md
 ##
 @@ -27,6 +27,8 @@ For Scala/Java applications using SBT/Maven project 
definitions, link your appli
 artifactId = spark-sql-kafka-0-10_{{site.SCALA_BINARY_VERSION}}
 version = {{site.SPARK_VERSION_SHORT}}
 
+Please note that to use the headers functionality, your Kafka client version 
should be version 0.11.0.0 or up.
 
 Review comment:
   For Spark, the client is definitely 0.11+ -- do we need this?



[GitHub] [spark] AmplabJenkins commented on issue #25172: [SPARK-28412][SQL] ANSI SQL: OVERLAY function support byte array

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25172: [SPARK-28412][SQL] ANSI SQL: OVERLAY 
function support byte array
URL: https://github.com/apache/spark/pull/25172#issuecomment-522095801
 
 
   Can one of the admins verify this patch?



[GitHub] [spark] AmplabJenkins removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522094886
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109218/
   Test FAILed.



[GitHub] [spark] viirya commented on issue #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel

2019-08-16 Thread GitBox

viirya commented on issue #25442: [SPARK-28722][ML] Change sequential label 
sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#issuecomment-522095651
 
 
   thanks @felixcheung @srowen 



[GitHub] [spark] AmplabJenkins commented on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522094886
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109218/
   Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522094883
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522057733
 
 
   **[Test build #109218 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109218/testReport)**
 for PR 25476 at commit 
[`5d6015a`](https://github.com/apache/spark/commit/5d6015abefed7c766f49eac8d7e89228a9246529).



[GitHub] [spark] AmplabJenkins commented on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522094883
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

SparkQA commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case 
with different pre-shuffle partition numbers
URL: https://github.com/apache/spark/pull/25479#issuecomment-522094416
 
 
   **[Test build #109233 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109233/testReport)**
 for PR 25479 at commit 
[`31436a8`](https://github.com/apache/spark/commit/31436a81f9761b26605eee99ca634e4231ac6191).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25463: [SPARK-28744][SQL][TEST] 
rename SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522093666
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109221/
   Test FAILed.



[GitHub] [spark] SparkQA commented on issue #25476: [SPARK-28759][BUILD] Upgrade scala-maven-plugin to 4.1.1

2019-08-16 Thread GitBox

SparkQA commented on issue #25476: [SPARK-28759][BUILD] Upgrade 
scala-maven-plugin to 4.1.1
URL: https://github.com/apache/spark/pull/25476#issuecomment-522094624
 
 
   **[Test build #109218 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109218/testReport)**
 for PR 25476 at commit 
[`5d6015a`](https://github.com/apache/spark/commit/5d6015abefed7c766f49eac8d7e89228a9246529).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25479: 
[SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition 
numbers
URL: https://github.com/apache/spark/pull/25479#issuecomment-522093644
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14302/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25463: [SPARK-28744][SQL][TEST] 
rename SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522093659
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25479: 
[SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition 
numbers
URL: https://github.com/apache/spark/pull/25479#issuecomment-522093634
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix 
case with different pre-shuffle partition numbers
URL: https://github.com/apache/spark/pull/25479#issuecomment-522093634
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix 
case with different pre-shuffle partition numbers
URL: https://github.com/apache/spark/pull/25479#issuecomment-522093644
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14302/
   Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25463: [SPARK-28744][SQL][TEST] rename 
SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522057739
 
 
   **[Test build #109221 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109221/testReport)**
 for PR 25463 at commit 
[`8bc622f`](https://github.com/apache/spark/commit/8bc622f7332ad7970e54624b6d5b5aa184df3510).



[GitHub] [spark] AmplabJenkins commented on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25463: [SPARK-28744][SQL][TEST] rename 
SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522093666
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109221/
   Test FAILed.



[GitHub] [spark] AmplabJenkins commented on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25463: [SPARK-28744][SQL][TEST] rename 
SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522093659
 
 
   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA commented on issue #25463: [SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession

2019-08-16 Thread GitBox

SparkQA commented on issue #25463: [SPARK-28744][SQL][TEST] rename 
SharedSQLContext to SharedSparkSession
URL: https://github.com/apache/spark/pull/25463#issuecomment-522093317
 
 
   **[Test build #109221 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109221/testReport)**
 for PR 25463 at commit 
[`8bc622f`](https://github.com/apache/spark/commit/8bc622f7332ad7970e54624b6d5b5aa184df3510).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `abstract class DockerJDBCIntegrationSuite extends SharedSparkSession 
with Eventually `
 * `class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with 
SharedSparkSession `
 * `class OrcFilterSuite extends OrcTest with SharedSparkSession `



[GitHub] [spark] srowen closed pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel

2019-08-16 Thread GitBox

srowen closed pull request #25442: [SPARK-28722][ML] Change sequential label 
sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442
 
 
   



[GitHub] [spark] peter-toth commented on a change in pull request #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

peter-toth commented on a change in pull request #25479: 
[SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition 
numbers
URL: https://github.com/apache/spark/pull/25479#discussion_r314821470
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala
 ##
 @@ -82,7 +82,11 @@ case class ReduceNumShufflePartitions(conf: SQLConf) 
extends Rule[SparkPlan] {
   // `ShuffleQueryStageExec` gives null mapOutputStatistics when the input 
RDD has 0 partitions,
   // we should skip it when calculating the `partitionStartIndices`.
   val validMetrics = shuffleMetrics.filter(_ != null)
-  if (validMetrics.nonEmpty) {
+  // We may have different pre-shuffle partition numbers, don't reduce 
shuffle partition number
 
 Review comment:
   Ok, added. Please let me know if it should be more detailed.



[GitHub] [spark] younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] Accept 'on' and 'off' as input and trim input for the boolean data type.

2019-08-16 Thread GitBox

younggyuchun commented on a change in pull request #25458: [SPARK-27931][SQL] 
Accept 'on' and 'off' as input and trim input for the boolean data type.
URL: https://github.com/apache/spark/pull/25458#discussion_r314820823
 
 

 ##
 File path: sql/core/src/test/resources/sql-tests/inputs/pgSQL/boolean.sql
 ##
 @@ -23,7 +23,7 @@ SELECT false AS `false`;
 SELECT boolean('t') AS true;
 
 -- [SPARK-27931] Trim the string when cast string type to boolean type
-SELECT boolean('   f   ') AS `false`;
+SELECT boolean('   f   ') AS `true`;
 
 Review comment:
   Let me dig into this.
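
   For context on what the test line exercises, here is a rough sketch of trimmed, case-insensitive string-to-boolean parsing. The accepted literals (`t`, `true`, `y`, `yes`, `1`, `on` and their negative counterparts) and the `BooleanCastSketch` object are assumptions for illustration only; they are not taken from the PR or from Spark's cast implementation.

```scala
import java.util.Locale

object BooleanCastSketch {
  // Assumed literal sets; the real cast may accept a different list.
  private val trueStrings = Set("t", "true", "y", "yes", "1", "on")
  private val falseStrings = Set("f", "false", "n", "no", "0", "off")

  // Trims surrounding whitespace before matching; None means "not a boolean".
  def parse(input: String): Option[Boolean] = {
    val s = input.trim.toLowerCase(Locale.ROOT)
    if (trueStrings.contains(s)) Some(true)
    else if (falseStrings.contains(s)) Some(false)
    else None
  }
}
```

   Under these assumptions `'   f   '` parses to false, so the column alias in the query is only a label and does not affect the value returned.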



[GitHub] [spark] srowen commented on issue #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel

2019-08-16 Thread GitBox

srowen commented on issue #25442: [SPARK-28722][ML] Change sequential label 
sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#issuecomment-522091782
 
 
   Merged to master



[GitHub] [spark] SparkQA commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

SparkQA commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522087186
 
 
   **[Test build #109232 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109232/testReport)**
 for PR 25439 at commit 
[`4d5965e`](https://github.com/apache/spark/commit/4d5965ecb48685faed63a751100433a273695e5b).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522086533
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14301/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522086529
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522086529
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522086533
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14301/
   Test PASSed.



[GitHub] [spark] peter-toth commented on a change in pull request #25479: [SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition numbers

2019-08-16 Thread GitBox

peter-toth commented on a change in pull request #25479: 
[SPARK-28356][SHUFFLE][FOLLOWUP] Fix case with different pre-shuffle partition 
numbers
URL: https://github.com/apache/spark/pull/25479#discussion_r314814514
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/ReduceNumShufflePartitionsSuite.scala
 ##
 @@ -587,4 +587,22 @@ class ReduceNumShufflePartitionsSuite extends SparkFunSuite with BeforeAndAfterA
      }
      withSparkSession(test, 200, None)
    }
 +
 +  test("Union two datasets with different pre-shuffle partition number") {
 +    val test: SparkSession => Unit = { spark: SparkSession =>
 +      val df1 = spark.range(3).join(spark.range(3), "id").toDF()
 +      val df2 = spark.range(3).groupBy().sum()
 +
 +      val resultDf = df1.union(df2)
 +
 +      checkAnswer(resultDf, Seq((0), (1), (2), (3)).map(i => Row(i)))
 
 Review comment:
   It does. The plan is:
   ```
   AdaptiveSparkPlan(isFinalPlan=false)
   +- Union
      :- Project [id#0L]
      :  +- SortMergeJoin [id#0L], [id#2L], Inner
      :     :- Sort [id#0L ASC NULLS FIRST], false, 0
      :     :  +- Exchange hashpartitioning(id#0L, 5), true
      :     :     +- Range (0, 3, step=1, splits=12)
      :     +- Sort [id#2L ASC NULLS FIRST], false, 0
      :        +- Exchange hashpartitioning(id#2L, 5), true
      :           +- Range (0, 3, step=1, splits=12)
      +- HashAggregate(keys=[], functions=[sum(id#6L)], output=[sum(id)#10L])
         +- Exchange SinglePartition, true
            +- HashAggregate(keys=[], functions=[partial_sum(id#6L)], output=[sum#14L])
               +- Range (0, 3, step=1, splits=12)
   ```
   and the error comes from this assert: 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ReduceNumShufflePartitions.scala#L136
   Honestly I'm not sure why the `Exchange SinglePartition` has `true` 
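
To make the failure mode concrete, here is a minimal sketch of the kind of check the plan above would trip. It assumes the linked assert requires every shuffle in the plan to share a single pre-shuffle partition number. `Node`, `Op`, `Exchange` and `preShufflePartitionCounts` are hypothetical stand-ins for illustration, not Spark's classes.

```scala
object PartitionCheckSketch {
  // Hypothetical, simplified plan model (not Spark's SparkPlan API).
  sealed trait Node { def children: Seq[Node] }
  final case class Op(name: String, children: Seq[Node] = Nil) extends Node
  final case class Exchange(numPartitions: Int, child: Node) extends Node {
    val children: Seq[Node] = Seq(child)
  }

  // Collect the partition count of every exchange found in the plan.
  def preShufflePartitionCounts(plan: Node): Seq[Int] = plan match {
    case Exchange(n, child) => n +: preShufflePartitionCounts(child)
    case other => other.children.flatMap(preShufflePartitionCounts)
  }

  // Same shape as the plan in the comment: a join over two 5-partition
  // exchanges, unioned with an aggregate over a single-partition exchange.
  val join: Node = Op("SortMergeJoin", Seq(
    Op("Sort", Seq(Exchange(5, Op("Range")))),
    Op("Sort", Seq(Exchange(5, Op("Range"))))))
  val agg: Node = Op("HashAggregate", Seq(
    Exchange(1, Op("HashAggregate", Seq(Op("Range"))))))
  val union: Node = Op("Union", Seq(Op("Project", Seq(join)), agg))

  // Two distinct values here, so a "exactly one distinct value" check fails.
  val distinctCounts: Seq[Int] = preShufflePartitionCounts(union).distinct
}
```

In this model the union mixes partition counts 5 and 1, which is the situation the new test is meant to cover.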



[GitHub] [spark] dongjoon-hyun commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix 
StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#issuecomment-522084583
 
 
   Retest this please.



[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522084048
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14300/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522084033
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522084048
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14300/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522084033
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] shaneknapp commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

shaneknapp commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522083248
 
 
   @wangyum please check out the recent updates in 
https://github.com/apache/spark/pull/25423
   
   you'll need to update `dev/run-tests.py` and take out the jdk/java_home 
stuff in `dev/run-tests-jenkins.py`.



[GitHub] [spark] SparkQA commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

SparkQA commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522082372
 
 
   **[Test build #109231 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109231/testReport)**
 for PR 25443 at commit 
[`0ac0b30`](https://github.com/apache/spark/commit/0ac0b30947dc1da33d0ac5ed0c8201a4c3d54c8a).



[GitHub] [spark] SparkQA commented on issue #25477: [WIP][SPARK-28760][SS][TESTS][test-maven] Add Kafka delegation token end-to-end test with mini KDC

2019-08-16 Thread GitBox

SparkQA commented on issue #25477: [WIP][SPARK-28760][SS][TESTS][test-maven] 
Add Kafka delegation token end-to-end test with mini KDC
URL: https://github.com/apache/spark/pull/25477#issuecomment-522082305
 
 
   **[Test build #109230 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109230/testReport)**
 for PR 25477 at commit 
[`15f3b2a`](https://github.com/apache/spark/commit/15f3b2a5acf6138b539886d7151af6af90fc79ba).



[GitHub] [spark] gaborgsomogyi commented on a change in pull request #25477: [WIP][SPARK-28760][SS][TESTS][test-maven] Add Kafka delegation token end-to-end test with mini KDC

2019-08-16 Thread GitBox

gaborgsomogyi commented on a change in pull request #25477: 
[WIP][SPARK-28760][SS][TESTS][test-maven] Add Kafka delegation token end-to-end 
test with mini KDC
URL: https://github.com/apache/spark/pull/25477#discussion_r314809507
 
 

 ##
 File path: external/kafka-0-10-sql/pom.xml
 ##
 @@ -92,6 +92,18 @@
 
   
 
+
+  org.apache.hadoop
 
 Review comment:
   This could be added to the root pom, since MiniKDC can be useful for anyone 
who implements secure tests, but since only the Kafka connector uses it, I've 
left it here for now.



[GitHub] [spark] dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522081698
 
 
   Retest this please.



[GitHub] [spark] dongjoon-hyun closed pull request #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

dongjoon-hyun closed pull request #25478: [SPARK-28755][R][TESTS] Increase 
tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478
 
 
   



[GitHub] [spark] dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25443: 
[WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 
2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-522081966
 
 
   All other issues are resolved.



[GitHub] [spark] dongjoon-hyun commented on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25478: [SPARK-28755][R][TESTS] Increase 
tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522081589
 
 
   Thank you, @HyukjinKwon and @srowen .
   Merged to master.



[GitHub] [spark] maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-16 Thread GitBox

maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add 
a simple cost check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#discussion_r314808742
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
 ##
 @@ -403,6 +450,18 @@ object AdaptiveSparkPlanExec {
   private val executionContext = ExecutionContext.fromExecutorService(
     ThreadUtils.newDaemonCachedThreadPool("QueryStageCreator", 16))
 
+  /**
+   * The temporary [[LogicalPlan]] link for query stages.
+   *
+   * Temp logical links are set for each query stage after its creation. During re-planning, the
 
 Review comment:
   I think we could go either way here (temp or normal link), because we have a 
post-fix for the link in 
https://github.com/apache/spark/pull/25456/files#diff-6954dd8020a9ca298f1fb9602c0e831cR362,
 which is necessary in its own right.



[GitHub] [spark] AmplabJenkins commented on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25478: [SPARK-28755][R][TESTS] Increase 
tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522081071
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109216/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25478: [SPARK-28755][R][TESTS] 
Increase tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522081065
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25478: [SPARK-28755][R][TESTS] 
Increase tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522081071
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109216/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25478: [SPARK-28755][R][TESTS] Increase 
tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522081065
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-16 Thread GitBox

maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add 
a simple cost check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#discussion_r314807909
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
 ##
 @@ -317,22 +345,21 @@ case class AdaptiveSparkPlanExec(
    */
   private def updateLogicalPlan(
       logicalPlan: LogicalPlan,
-      newStages: Seq[(Exchange, QueryStageExec)]): LogicalPlan = {
+      newStages: Seq[QueryStageExec]): LogicalPlan = {
     var currentLogicalPlan = logicalPlan
     newStages.foreach {
-      case (exchange, stage) =>
-        // Get the corresponding logical node for `exchange`. If `exchange` has been transformed
-        // from a `Repartition`, it should have `logicalLink` available by itself; otherwise
-        // traverse down to find the first node that is not generated by `EnsureRequirements`.
-        val logicalNodeOpt = exchange.logicalLink.orElse(exchange.collectFirst {
-          case p if p.logicalLink.isDefined => p.logicalLink.get
-        })
+      case stage if currentPhysicalPlan.find(_.eq(stage)).isDefined =>
 
 Review comment:
   Following this comment 
https://github.com/apache/spark/pull/25456/files#r314805998:
   
   Now that we might have `newStages` from different rounds of stage creation, 
some stages might have been included in newer stages already, so those are not 
"reachable" now and we don't need to worry about them any more. But meanwhile 
we need to make sure we always apply the latest stages first: 
https://github.com/apache/spark/pull/25456/files#diff-6954dd8020a9ca298f1fb9602c0e831cR142.
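
A rough illustration of the reachability point, as a sketch rather than the PR's code: stages are modelled as leaf nodes that hide the sub-plan they wrap, so an older stage that has been absorbed into a newer one can no longer be found from the current plan root. `Plan`, `Stage`, `Operator` and `reachableStages` are hypothetical names; only the reference-equality lookup and the newest-first ordering mirror the diff.

```scala
object StageReplaceSketch {
  // Hypothetical stand-ins for plan nodes (not Spark's QueryStageExec).
  sealed trait Plan { def children: Seq[Plan] }
  final class Operator(val name: String, val children: Seq[Plan]) extends Plan
  // A stage is a leaf from the outside: the wrapped sub-plan is not a child.
  final class Stage(val id: Int, val wrapped: Plan) extends Plan {
    val children: Seq[Plan] = Nil
  }

  // Reference-equality search, analogous to plan.find(_.eq(stage)).isDefined.
  def contains(root: Plan, target: Plan): Boolean =
    (root eq target) || root.children.exists(contains(_, target))

  def reachableStages(plan: Plan, stages: Seq[Stage]): Seq[Stage] =
    stages.filter(s => contains(plan, s))

  // Round 1 creates stage0; round 2 creates stage1, which wraps stage0.
  val stage0 = new Stage(0, new Operator("Range", Nil))
  val stage1 = new Stage(1, new Operator("Join", Seq(stage0)))
  val currentPlan: Plan = new Operator("Project", Seq(stage1))

  // Newest stages are prepended, as in `result.newStages ++ stagesToReplace`.
  val stagesToReplace: Seq[Stage] = Seq(stage1) ++ Seq(stage0)

  // Only stage1 is still reachable; stage0 is hidden inside it and is skipped.
  val applied: Seq[Int] = reachableStages(currentPlan, stagesToReplace).map(_.id)
}
```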
 



[GitHub] [spark] SparkQA commented on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

SparkQA commented on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance 
in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522080503
 
 
   **[Test build #109216 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109216/testReport)**
 for PR 25478 at commit 
[`fcee12a`](https://github.com/apache/spark/commit/fcee12aaefca407e7f32f8b7e44b5b88401f3380).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25472: [SPARK-28756][R] Fix 
checkJavaVersion to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080395
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] SparkQA removed a comment on issue #25478: [SPARK-28755][R][TESTS] Increase tolerance in 'spark.mlp' SparkR test for JDK 11

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25478: [SPARK-28755][R][TESTS] Increase 
tolerance in 'spark.mlp' SparkR test for JDK 11
URL: https://github.com/apache/spark/pull/25478#issuecomment-522057694
 
 
   **[Test build #109216 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109216/testReport)**
 for PR 25478 at commit 
[`fcee12a`](https://github.com/apache/spark/commit/fcee12aaefca407e7f32f8b7e44b5b88401f3380).



[GitHub] [spark] AmplabJenkins removed a comment on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25472: [SPARK-28756][R] Fix 
checkJavaVersion to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080412
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109220/
   Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion 
to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080412
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109220/
   Test PASSed.



[GitHub] [spark] dongjoon-hyun closed pull request #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

dongjoon-hyun closed pull request #25472: [SPARK-28756][R] Fix checkJavaVersion 
to accept JDK8+
URL: https://github.com/apache/spark/pull/25472
 
 
   



[GitHub] [spark] SparkQA removed a comment on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

SparkQA removed a comment on issue #25472: [SPARK-28756][R] Fix 
checkJavaVersion to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522057721
 
 
   **[Test build #109220 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109220/testReport)**
 for PR 25472 at commit 
[`4531ced`](https://github.com/apache/spark/commit/4531ced5892da7aee890c29027cf66b487375b39).



[GitHub] [spark] AmplabJenkins commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion 
to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080395
 
 
   Merged build finished. Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion 
to accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080459
 
 
   Thank you, @HyukjinKwon and @srowen . 
   Merged to master.



[GitHub] [spark] SparkQA commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to accept JDK8+

2019-08-16 Thread GitBox

SparkQA commented on issue #25472: [SPARK-28756][R] Fix checkJavaVersion to 
accept JDK8+
URL: https://github.com/apache/spark/pull/25472#issuecomment-522080120
 
 
   **[Test build #109220 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109220/testReport)**
 for PR 25472 at commit 
[`4531ced`](https://github.com/apache/spark/commit/4531ced5892da7aee890c29027cf66b487375b39).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add a simple cost check for Adaptive Query Execution

2019-08-16 Thread GitBox

maryannxue commented on a change in pull request #25456: [SPARK-28739][SQL] Add 
a simple cost check for Adaptive Query Execution
URL: https://github.com/apache/spark/pull/25456#discussion_r314805998
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
 ##
 @@ -132,21 +135,23 @@ case class AdaptiveSparkPlanExec(
   var result = createQueryStages(currentPhysicalPlan)
   val events = new LinkedBlockingQueue[StageMaterializationEvent]()
   val errors = new mutable.ArrayBuffer[SparkException]()
+  var stagesToReplace = Seq.empty[QueryStageExec]
   while (!result.allChildStagesMaterialized) {
 currentPhysicalPlan = result.newPlan
-currentLogicalPlan = updateLogicalPlan(currentLogicalPlan, result.newStages)
-currentPhysicalPlan.setTagValue(SparkPlan.LOGICAL_PLAN_TAG, currentLogicalPlan)
-executionId.foreach(onUpdatePlan)
-
-// Start materialization of all new stages.
-result.newStages.map(_._2).foreach { stage =>
-  stage.materialize().onComplete { res =>
-if (res.isSuccess) {
-  events.offer(StageSuccess(stage, res.get))
-} else {
-  events.offer(StageFailure(stage, res.failed.get))
-}
-  }(AdaptiveSparkPlanExec.executionContext)
+if (result.newStages.nonEmpty) {
+  stagesToReplace = result.newStages ++ stagesToReplace
 
 Review comment:
   Yes, one important idea I should have put in a code comment here is:
   
   The current logical plan is always updated together with the current 
physical plan, which means that if a new physical plan is not adopted after 
re-optimization, the new logical plan (with stages replaced) is not taken 
either 
(https://github.com/apache/spark/pull/25456/files#diff-6954dd8020a9ca298f1fb9602c0e831cR181).
 That also means the current logical plan lags behind the current physical 
plan, because it does not reflect the new stages created since the last update 
(https://github.com/apache/spark/pull/25456/files#diff-6954dd8020a9ca298f1fb9602c0e831cR188).
 Yet we cannot update the logical plan alone, as all logical links of the 
current physical plan point to the original logical plan it was planned from. 
So, as a fix for this "out-of-date" problem, we keep the logical plan together 
with this `stagesToReplace` list. Each time we re-optimize, we first update the 
logical plan with the stages that haven't been applied to it yet 
(https://github.com/apache/spark/pull/25456/files#diff-6954dd8020a9ca298f1fb9602c0e831cR177),
 and then re-optimize and re-plan on the updated logical plan. If 
the new physical plan is adopted, we take it together with 
the new logical plan and clear the `stagesToReplace` list; otherwise, we keep 
the current logical plan and the list as they are.
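
The bookkeeping described in this comment can be sketched roughly as follows. This is a minimal model, not the code in the PR: `Stage`, `LogicalPlan`, `PhysicalPlan`, `AdaptiveState` and the `replan` callback are hypothetical stand-ins, and "adopt the candidate only if its cost is strictly lower" is an assumed adoption rule used purely for illustration.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's classes.
final case class Stage(id: Int)
final case class LogicalPlan(appliedStages: Set[Int])
final case class PhysicalPlan(cost: Int, logicalLink: LogicalPlan)

final class AdaptiveState(initial: PhysicalPlan) {
  var currentPhysicalPlan: PhysicalPlan = initial
  var currentLogicalPlan: LogicalPlan = initial.logicalLink
  // Stages created since the logical plan was last updated.
  var stagesToReplace: Seq[Stage] = Seq.empty

  // Mirrors `stagesToReplace = result.newStages ++ stagesToReplace` in the diff.
  def onNewStages(newStages: Seq[Stage]): Unit =
    if (newStages.nonEmpty) stagesToReplace = newStages ++ stagesToReplace

  // Returns true if the re-planned physical plan was adopted.
  def reOptimize(replan: LogicalPlan => PhysicalPlan): Boolean = {
    // First apply the pending stages to a copy of the logical plan.
    val updatedLogical =
      LogicalPlan(currentLogicalPlan.appliedStages ++ stagesToReplace.map(_.id))
    val candidate = replan(updatedLogical)
    if (candidate.cost < currentPhysicalPlan.cost) { // assumed adoption rule
      // Adopted: take both plans together and clear the pending list.
      currentPhysicalPlan = candidate
      currentLogicalPlan = updatedLogical
      stagesToReplace = Seq.empty
      true
    } else {
      // Rejected: keep the old logical plan and the pending list as they are.
      false
    }
  }
}
```

The point of the sketch is that the logical plan and the pending list only ever change together with the physical plan, so a rejected re-optimization leaves all three untouched.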



[GitHub] [spark] srowen commented on a change in pull request #25470: [SPARK-28751][Core] Improve java serializer deserialization performance

2019-08-16 Thread GitBox

srowen commented on a change in pull request #25470: [SPARK-28751][Core] 
Improve java serializer deserialization performance
URL: https://github.com/apache/spark/pull/25470#discussion_r314800393
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
 ##
 @@ -92,9 +106,14 @@ private object JavaDeserializationStream {
 }
 
 private[spark] class JavaSerializerInstance(
-counterReset: Int, extraDebugInfo: Boolean, defaultClassLoader: ClassLoader)
+counterReset: Int,
+extraDebugInfo: Boolean,
+defaultClassLoader: ClassLoader,
+useCache: Boolean)
   extends SerializerInstance {
 
+  lazy val resolvedClassesCache = new ConcurrentHashMap[(String, ClassLoader), Class[_]]()
 
 Review comment:
   Also, how big might this map get? I suppose its lifetime is limited, and 
it won't cache massive numbers of classes, hm.



[GitHub] [spark] srowen commented on a change in pull request #25470: [SPARK-28751][Core] Improve java serializer deserialization performance

2019-08-16 Thread GitBox

srowen commented on a change in pull request #25470: [SPARK-28751][Core] 
Improve java serializer deserialization performance
URL: https://github.com/apache/spark/pull/25470#discussion_r314799430
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
 ##
 @@ -92,9 +106,14 @@ private object JavaDeserializationStream {
 }
 
 private[spark] class JavaSerializerInstance(
-counterReset: Int, extraDebugInfo: Boolean, defaultClassLoader: ClassLoader)
+counterReset: Int,
+extraDebugInfo: Boolean,
+defaultClassLoader: ClassLoader,
+useCache: Boolean)
   extends SerializerInstance {
 
+  lazy val resolvedClassesCache = new ConcurrentHashMap[(String, ClassLoader), Class[_]]()
 
 Review comment:
   Rather than making it lazy, maybe instantiate the map only if the cache is 
used?
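
One possible reading of this suggestion, sketched with a hypothetical `SerializerInstanceSketch` class that models only the cache field (the real `JavaSerializerInstance` has more parameters and members): allocate the map eagerly, but only when `useCache` is set, and expose it as an `Option`.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the class in the diff; only the cache field is modeled.
final class SerializerInstanceSketch(useCache: Boolean) {
  // Allocated at construction time, but only when caching is enabled.
  val resolvedClassesCache: Option[ConcurrentHashMap[(String, ClassLoader), Class[_]]] =
    if (useCache) Some(new ConcurrentHashMap[(String, ClassLoader), Class[_]]())
    else None
}
```

Compared with `lazy val`, this avoids the lazy-initialization check on every access, at the cost of callers having to handle the `None` case.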



[GitHub] [spark] srowen commented on a change in pull request #25470: [SPARK-28751][Core] Improve java serializer deserialization performance

2019-08-16 Thread GitBox

srowen commented on a change in pull request #25470: [SPARK-28751][Core] 
Improve java serializer deserialization performance
URL: https://github.com/apache/spark/pull/25470#discussion_r314800107
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala
 ##
 @@ -58,19 +59,32 @@ private[spark] class JavaSerializationStream(
   def close() { objOut.close() }
 }
 
-private[spark] class JavaDeserializationStream(in: InputStream, loader: ClassLoader)
-  extends DeserializationStream {
+private[spark] class JavaDeserializationStream(
+in: InputStream,
+loader: ClassLoader,
+useCache: Boolean,
+serializerInstance: JavaSerializerInstance) extends DeserializationStream {
 
   private val objIn = new ObjectInputStream(in) {
-override def resolveClass(desc: ObjectStreamClass): Class[_] =
+override def resolveClass(desc: ObjectStreamClass): Class[_] = {
+  if (useCache) {
+serializerInstance.resolvedClassesCache.computeIfAbsent(
+  (desc.getName, loader), pair => normalResolve(pair._1))
+  } else {
+normalResolve(desc.getName)
+  }
+}
+
+private def normalResolve(name: String): Class[_] = {
   try {
 // scalastyle:off classforname
-Class.forName(desc.getName, false, loader)
+Class.forName(name, false, loader)
 
 Review comment:
   Yeah, same question. It's native code, so it's hard to see. I suspect it 
does, but we may still be saving some overhead in digging it out of the JVM?
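
For reference, the lookup under discussion can be isolated in a small standalone sketch. `ClassResolverSketch` is a hypothetical helper, not part of Spark; it only shows the `computeIfAbsent` pattern from the diff next to the plain `Class.forName` path. Whether `Class.forName` already caches internally, and how much the extra map saves, is left open here just as it is in the thread.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical helper: memoizes Class.forName per (class name, class loader).
final class ClassResolverSketch(loader: ClassLoader, useCache: Boolean) {
  private val cache = new ConcurrentHashMap[(String, ClassLoader), Class[_]]()

  // Explicit java.util.function.Function so the sketch does not rely on
  // SAM conversion of Scala lambdas.
  private val resolver = new JFunction[(String, ClassLoader), Class[_]] {
    override def apply(key: (String, ClassLoader)): Class[_] =
      Class.forName(key._1, false, key._2)
  }

  def resolve(name: String): Class[_] =
    if (useCache) cache.computeIfAbsent((name, loader), resolver)
    else Class.forName(name, false, loader)

  // Number of entries currently held in the cache.
  def cachedEntries: Int = cache.size()
}
```

Timing `resolve` with `useCache` on and off over a stream that repeats the same class names would be one way to answer the question empirically.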



[GitHub] [spark] dongjoon-hyun commented on issue #25455: [SPARK-28737][CORE] Update Jersey to 2.29

2019-08-16 Thread GitBox

dongjoon-hyun commented on issue #25455: [SPARK-28737][CORE] Update Jersey to 
2.29
URL: https://github.com/apache/spark/pull/25455#issuecomment-522071683
 
 
   Yes. It fails at GitHub posting. It seems to be a GitHub outage.
   ```
   Tests passed.
   Attempting to post to Github...
   Failed to post message to Github.
> urllib_status: Temporary failure in name resolution
> data: {"body": "**[Test build #109205 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109205/testReport)**
 for PR 25455 at commit 
[`57add37`](https://github.com/apache/spark/commit/57add373ba6cc95730f3f57f9e5b2deb13ab121a).\n
 * This patch passes all tests.\n * This patch merges cleanly.\n * This patch 
adds no public classes."}
   ```



[GitHub] [spark] AmplabJenkins removed a comment on issue #25423: [SPARK-28701][test-java11][k8s] adding java11 support for pull request builds

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25423: 
[SPARK-28701][test-java11][k8s] adding java11 support for pull request builds
URL: https://github.com/apache/spark/pull/25423#issuecomment-522070344
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14299/
   Test PASSed.



[GitHub] [spark] AmplabJenkins removed a comment on issue #25423: [SPARK-28701][test-java11][k8s] adding java11 support for pull request builds

2019-08-16 Thread GitBox

AmplabJenkins removed a comment on issue #25423: 
[SPARK-28701][test-java11][k8s] adding java11 support for pull request builds
URL: https://github.com/apache/spark/pull/25423#issuecomment-522070337
 
 
   Merged build finished. Test PASSed.



[GitHub] [spark] AmplabJenkins commented on issue #25423: [SPARK-28701][test-java11][k8s] adding java11 support for pull request builds

2019-08-16 Thread GitBox

AmplabJenkins commented on issue #25423: [SPARK-28701][test-java11][k8s] adding 
java11 support for pull request builds
URL: https://github.com/apache/spark/pull/25423#issuecomment-522070337
 
 
   Merged build finished. Test PASSed.


