[GitHub] [spark] AmplabJenkins removed a comment on issue #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream
AmplabJenkins removed a comment on issue #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream URL: https://github.com/apache/spark/pull/25180#issuecomment-518641069 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13803/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518641897 **[Test build #108703 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108703/testReport)** for PR 25366 at commit [`05a14c5`](https://github.com/apache/spark/commit/05a14c54ea8b5496bf473f3001b5c7ce028e75e9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518571418 **[Test build #108703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108703/testReport)** for PR 25366 at commit [`05a14c5`](https://github.com/apache/spark/commit/05a14c54ea8b5496bf473f3001b5c7ce028e75e9).
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518642723 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108703/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518642718 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518642718 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518642723 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108703/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream
cloud-fan commented on a change in pull request #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream
URL: https://github.com/apache/spark/pull/25180#discussion_r311033591

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/MicroBatchScan.java ##
@@ -21,13 +21,27 @@
 import org.apache.spark.sql.sources.v2.reader.InputPartition;
 import org.apache.spark.sql.sources.v2.reader.PartitionReader;
 import org.apache.spark.sql.sources.v2.reader.PartitionReaderFactory;
-import org.apache.spark.sql.sources.v2.reader.Scan;

 /**
- * A {@link SparkDataStream} for streaming queries with micro-batch mode.
+ * An interface that defines how to scan the data from data source for micro-batch streaming
+ * processing.
+ *
+ * The scanning procedure is:

Review comment: Hi @jose-torres, can you double-check if my explanation is correct?
[GitHub] [spark] cloud-fan commented on a change in pull request #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream
cloud-fan commented on a change in pull request #25180: [SPARK-28423][SQL] Merge Scan and Batch/Stream
URL: https://github.com/apache/spark/pull/25180#discussion_r311033546

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/ContinuousScan.java ##
@@ -19,21 +19,36 @@
 import org.apache.spark.annotation.Evolving;
 import org.apache.spark.sql.sources.v2.reader.InputPartition;
-import org.apache.spark.sql.sources.v2.reader.Scan;

 /**
- * A {@link SparkDataStream} for streaming queries with continuous mode.
+ * An interface that defines how to scan the data from data source for continuous streaming
+ * processing.
+ *
+ * The scanning procedure is:

Review comment: Hi @jose-torres, can you double-check if my explanation is correct?
[GitHub] [spark] SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518650980 **[Test build #108705 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108705/testReport)** for PR 25366 at commit [`5090742`](https://github.com/apache/spark/commit/509074200f0052dbf9da0ee74c3bceb252ef388c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518584133 **[Test build #108705 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108705/testReport)** for PR 25366 at commit [`5090742`](https://github.com/apache/spark/commit/509074200f0052dbf9da0ee74c3bceb252ef388c).
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518651599 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518651599 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518651609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108705/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518651609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108705/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3
AmplabJenkins commented on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3 URL: https://github.com/apache/spark/pull/25289#issuecomment-518652164 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108709/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3
AmplabJenkins commented on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3 URL: https://github.com/apache/spark/pull/25289#issuecomment-518652154 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3
AmplabJenkins removed a comment on issue #25289: [WIP][SPARK-27889][INFRA] Make development scripts under dev/ support Python 3 URL: https://github.com/apache/spark/pull/25289#issuecomment-518652154 Merged build finished. Test FAILed.
[GitHub] [spark] gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
URL: https://github.com/apache/spark/pull/25135#issuecomment-518652788

First of all, thanks @HeartSaVioR for the deep look, it helped!

@zsxwing I've had another round with the Kafka guys on this and here are the conclusions:
* The approach to call `poll(Duration...` in a loop is the suggested solution
* The second approach failed because of the following: Originally the partition assignment was synchronous in the old API (and could hang indefinitely), but the new implementation applies a proper timeout. In the mentioned tests `"kafkaConsumer.pollTimeoutMs" -> "1000"` is set. Since partition assignment takes around 1000 ms on my machine (and apparently on Jenkins as well), whether the test passed was mostly down to a race. The first approach was more on the success side but the second was more on the failing side.

Here is the client log:
```
19/08/05 06:55:27.650 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE NetworkClient: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Sending JOIN_GROUP {group_id=spark-kafka-source-42d70d54-e5ce-45$
19/08/05 06:55:28.758 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE KafkaConsumer: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Closing the Kafka consumer
19/08/05 06:55:28.758 kafka-coordinator-heartbeat-thread | spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2 DEBUG AbstractCoordinator: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Heartbeat thread has closed
19/08/05 06:55:28.758 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE NetworkClient: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Completed receive from node 0 for METADATA with correlation id 6$
19/08/05 06:55:28.759 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE Metadata: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Determining if we should replace existing epoch 0 with new epoch 0
19/08/05 06:55:28.759 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] DEBUG Metadata: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Updating last seen epoch from 0 to 0 for partition failOnDataLoss-0-0
19/08/05 06:55:28.759 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] DEBUG Metadata: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Updated cluster metadata updateVersion 3 to MetadataCache{cluster=Clu$
19/08/05 06:55:28.760 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE NetworkClient: [Consumer clientId=consumer-522, groupId=spark-kafka-source-42d70d54-e5ce-453a-a84c-40cd8126a3ff--1337036482-driver-2] Completed receive from node 2147483647 for JOIN_GROUP with corre$
19/08/05 06:55:28.760 stream execution thread for DontFailOnDataLoss [id = 9396b4b5-bc30-4e1e-8c6a-1d3a9d890ccf, runId = 5b1ee39c-c174-4c42-b0d5-1c4b59061ab3] TRACE Metrics: Removed metric named MetricName [name=connection-count, group=consumer-metrics, description=The current number of active connections., tags={client-id=consumer-522}]
19/08/05 06:55:28.760 data-plane-kafka-network-thread-0-ListenerName(PLAINTEXT)-PLAINTEXT-1 DEBUG Selector: [SocketServer brokerId=0] Connection with /127.0.0.1 disconnected
java.io.EOFException
 at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:96)
 at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:424)
 at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:385)
 at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:651)
 at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:572)
 at org.apache.kafka.common.network.Selector.poll(Selector.java:483)
 at kafka.netw
```
[GitHub] [spark] gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
gaborgsomogyi commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
URL: https://github.com/apache/spark/pull/25135#issuecomment-518654674

To resolve this situation I've mainly considered two things:
* Increase `kafkaConsumer.pollTimeoutMs`
* Introduce an additional timeout for partition assignment

The second approach looks like overkill from my point of view. Such a configuration can be added later if we see the need.
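As a rough illustration of the two timeouts discussed in this thread, the sketch below separates a short per-poll wait from an overall deadline. It deliberately does not use the Kafka client: `pollOnce` is a hypothetical stand-in for one short poll that reports whether the partition assignment has arrived, and `PollLoopSketch`, `pollUntil`, `overallTimeout` and `perPoll` are names made up for this example, not anything from the PR.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class PollLoopSketch {

    /**
     * Calls pollOnce repeatedly until it returns true or overallTimeout elapses.
     * pollOnce is a hypothetical stand-in for one short consumer poll; it is not a Kafka API.
     */
    static boolean pollUntil(BooleanSupplier pollOnce, Duration overallTimeout, Duration perPoll) {
        long deadline = System.nanoTime() + overallTimeout.toNanos();
        while (true) {
            if (pollOnce.getAsBoolean()) {
                return true; // e.g. the assignment showed up
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // overall deadline reached without success
            }
            try {
                // Wait at most one poll interval, and never past the deadline.
                Thread.sleep(Math.min(perPoll.toMillis(), remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Simulated source that only succeeds on the third poll.
        boolean assigned = pollUntil(() -> ++polls[0] >= 3, Duration.ofSeconds(2), Duration.ofMillis(10));
        System.out.println("assigned=" + assigned + ", polls=" + polls[0]);
    }
}
```

With a single fixed timeout close to the time the operation actually takes, the outcome depends on timing; a loop with a larger overall deadline tolerates a slow first attempt without needing a separate configuration.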
[GitHub] [spark] shivusondur commented on issue #25215: [SPARK-28445][SQL][PYTHON] Fix error when PythonUDF is used in both group by and aggregate expression
shivusondur commented on issue #25215: [SPARK-28445][SQL][PYTHON] Fix error when PythonUDF is used in both group by and aggregate expression
URL: https://github.com/apache/spark/pull/25215#issuecomment-518655612

> cc @skonto, @Udbhav30, @shivusondur
> When you guys are available, mind making each followup PR to add udf in group-by clause in each JIRA you guys took?
>
> SPARK-28279 https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/inputs/udf/udf-group-analytics.sql @skonto
>
> SPARK-28280 https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/inputs/udf/udf-group-by.sql @skonto
>
> SPARK-28391 https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/inputs/udf/pgSQL/udf-select_implicit.sql - @Udbhav30
>
> SPARK-28390 https://github.com/apache/spark/blob/master/sql/core/src/test/resources/sql-tests/inputs/udf/pgSQL/udf-select_having.sql @shivusondur
>
> The PR title should usually be `[SPARK-X][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'xxx.sql'` and the procedure to describe PR description will be the same as you guys did before as described in SPARK-27921

@HyukjinKwon I am still getting the error below after adding udf to the group-by values. I also tried it for individual values and hit the same issue.
'''
-- !query 11
SELECT udf(b), udf(c) FROM test_having
GROUP BY udf(b), udf(c) HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
-- !query 11 schema
struct<>
-- !query 11 output
org.apache.spark.sql.AnalysisException
cannot resolve '`b`' given input columns: [CAST(udf(cast(b as string)) AS INT), CAST(udf(cast(c as string)) AS STRING)]; line 2 pos 63
'''
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518657160 **[Test build #108719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108719/testReport)** for PR 25135 at commit [`69c7ca5`](https://github.com/apache/spark/commit/69c7ca5252cdc3cc89ff2e92748d448060f18041).
[GitHub] [spark] cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
URL: https://github.com/apache/spark/pull/24892#discussion_r311048475

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/BlockStoreClient.java ##
@@ -49,12 +50,24 @@ public abstract void fetchBlocks(
 String host,
 int port,
 String execId,
+ int shuffleGenerationId,

Review comment: Does this mean the method can only fetch shuffle blocks? Shall we rename the method?
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518660146 **[Test build #108720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108720/testReport)** for PR 25135 at commit [`82f1520`](https://github.com/apache/spark/commit/82f1520246f9a3304288f7f873967635b3fbe5ee).
[GitHub] [spark] cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
URL: https://github.com/apache/spark/pull/24892#discussion_r311049975

## File path: core/src/main/scala/org/apache/spark/network/BlockTransferService.scala ##
@@ -53,6 +53,34 @@ abstract class BlockTransferService extends BlockStoreClient with Logging {
 */
 def hostName: String

+ /**
+ * Fetch a sequence of shuffle blocks from a remote node asynchronously,
+ * available only after [[init]] is invoked.
+ *
+ * Note that this API takes a sequence so the implementation can batch requests, and does not
+ * return a future so the underlying implementation can invoke onBlockFetchSuccess as soon as
+ * the data of a block is fetched, rather than waiting for all blocks to be fetched.
+ */
+ override def fetchBlocks(
+ host: String,
+ port: Int,
+ execId: String,
+ shuffleGenerationId: Int,
+ blockIds: Array[String],
+ listener: BlockFetchingListener,
+ tempFileManager: DownloadFileManager): Unit
+
+ /**
+ * Fetch a sequence of non-shuffle blocks from a remote node asynchronously.
+ */
+ override def fetchDataBlocks(

Review comment: unnecessary change
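The Scaladoc quoted in this diff describes an API that takes a batch of block ids but reports each block through a listener as soon as it is available, rather than returning a single future for the whole batch. A minimal sketch of that callback shape follows. It is an illustration only: `BatchFetchSketch`, `FetchListener` and the in-memory `remote` map are hypothetical, the failure callback and the block id strings are assumptions of the example, and the loop runs synchronously where the real method is asynchronous.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchFetchSketch {

    /** Hypothetical per-block callback; the listener type named in the diff is BlockFetchingListener. */
    interface FetchListener {
        void onBlockFetchSuccess(String blockId, byte[] data);
        void onBlockFetchFailure(String blockId, Throwable cause);
    }

    /**
     * Takes a whole batch of block ids but reports each block through the listener
     * as soon as it is resolved, instead of returning one future for the batch.
     * The in-memory map stands in for the remote node, and the loop is synchronous.
     */
    static void fetchBlocks(String[] blockIds, Map<String, byte[]> remote, FetchListener listener) {
        for (String blockId : blockIds) {
            byte[] data = remote.get(blockId);
            if (data != null) {
                listener.onBlockFetchSuccess(blockId, data);
            } else {
                listener.onBlockFetchFailure(blockId, new IllegalStateException("missing " + blockId));
            }
        }
    }

    /** Runs a small batch and returns the events in the order the listener saw them. */
    static List<String> demo() {
        Map<String, byte[]> remote = new LinkedHashMap<>();
        remote.put("shuffle_0_0_0", new byte[] {1, 2});
        remote.put("shuffle_0_1_0", new byte[] {3});
        List<String> events = new ArrayList<>();
        fetchBlocks(new String[] {"shuffle_0_0_0", "shuffle_0_9_0", "shuffle_0_1_0"}, remote,
            new FetchListener() {
                @Override public void onBlockFetchSuccess(String blockId, byte[] data) {
                    events.add("ok:" + blockId + ":" + data.length);
                }
                @Override public void onBlockFetchFailure(String blockId, Throwable cause) {
                    events.add("fail:" + blockId);
                }
            });
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because results arrive per block, a caller can start consuming the first block while later ones are still outstanding, and one missing block does not hide the blocks that did arrive.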
[GitHub] [spark] cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
cloud-fan commented on a change in pull request #24892: [SPARK-25341][Core] Support rolling back a shuffle map stage and re-generate the shuffle files
URL: https://github.com/apache/spark/pull/24892#discussion_r311049891

## File path: core/src/main/scala/org/apache/spark/network/BlockTransferService.scala ##
@@ -53,6 +53,34 @@ abstract class BlockTransferService extends BlockStoreClient with Logging {
 */
 def hostName: String

+ /**
+ * Fetch a sequence of shuffle blocks from a remote node asynchronously,
+ * available only after [[init]] is invoked.
+ *
+ * Note that this API takes a sequence so the implementation can batch requests, and does not
+ * return a future so the underlying implementation can invoke onBlockFetchSuccess as soon as
+ * the data of a block is fetched, rather than waiting for all blocks to be fetched.
+ */
+ override def fetchBlocks(

Review comment: unnecessary change
[GitHub] [spark] cloud-fan commented on a change in pull request #25235: [SPARK-28483][Core] Fix canceling a spark job using barrier mode but barrier tasks blocking on BarrierTaskContext.barrier()
cloud-fan commented on a change in pull request #25235: [SPARK-28483][Core] Fix canceling a spark job using barrier mode but barrier tasks blocking on BarrierTaskContext.barrier()
URL: https://github.com/apache/spark/pull/25235#discussion_r311050972

## File path: core/src/main/scala/org/apache/spark/rpc/RpcEndpointRef.scala ##
@@ -46,6 +46,17 @@ private[spark] abstract class RpcEndpointRef(conf: SparkConf)
 */
 def send(message: Any): Unit

+ /**
+ * Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
+ * receive the reply within the specified timeout.
+ * Return a `CancelableFuture` instance which wrap `Future` but with additional `cancel` method.
+ *
+ * This method only sends the message once and never retries.
+ */
+ def askCancelable[T: ClassTag](message: Any, timeout: RpcTimeout): CancelableFuture[T] = {

Review comment: `abort` sounds better
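The doc comment under review describes a future that can also be given up on explicitly. The sketch below shows one way such a wrapper could look, written in Java around `CompletableFuture` purely for illustration. It is not the Scala code from the PR: `CancelableAskSketch`, `await` and the `Supplier`-based `askCancelable` signature are assumptions of this example, and the timeout is applied when waiting rather than when asking.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class CancelableAskSketch {

    /** Hypothetical wrapper: a pending reply plus an explicit way to give up on it. */
    static final class CancelableFuture<T> {
        private final CompletableFuture<T> future;

        CancelableFuture(CompletableFuture<T> future) {
            this.future = future;
        }

        /**
         * Stops waiting for the reply (the review comment suggests the name abort).
         * This completes the future with a CancellationException; with CompletableFuture
         * it does not interrupt a supplier that is already running.
         */
        void cancel() {
            future.cancel(true);
        }

        boolean isCancelled() {
            return future.isCancelled();
        }

        /** Waits up to timeoutMs for the reply; checked failures are rethrown unchecked. */
        T await(long timeoutMs) {
            try {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (CancellationException e) {
                throw e;
            } catch (Exception e) {
                throw new IllegalStateException("ask failed or timed out", e);
            }
        }
    }

    /** Issues the request once (no retries) on a background thread and wraps the pending reply. */
    static <T> CancelableFuture<T> askCancelable(Supplier<T> remoteCall) {
        return new CancelableFuture<>(CompletableFuture.supplyAsync(remoteCall));
    }

    public static void main(String[] args) {
        CancelableFuture<String> reply = askCancelable(() -> "pong");
        System.out.println(reply.await(1000));

        // A reply that never arrives: the caller can still unblock by cancelling.
        CancelableFuture<String> stuck = new CancelableFuture<>(new CompletableFuture<String>());
        stuck.cancel();
        System.out.println("cancelled=" + stuck.isCancelled());
    }
}
```

The second case in `main` mirrors the situation in the PR title: a caller blocked on a reply that will not come needs a handle it can cancel from the outside.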
[GitHub] [spark] gaborgsomogyi commented on a change in pull request #24382: [SPARK-27330][SS] support task abort in foreach writer
gaborgsomogyi commented on a change in pull request #24382: [SPARK-27330][SS] support task abort in foreach writer URL: https://github.com/apache/spark/pull/24382#discussion_r311052060 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/sources/ForeachWriterSuite.scala ## @@ -286,6 +293,71 @@ object ForeachWriterSuite { } } +class ForeachWriterAbortSuite extends StreamTest with SharedSQLContext with BeforeAndAfter { Review comment: > IMHO we even need to have a test which commit is failing and abort is being called +1
[GitHub] [spark] SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518666369 **[Test build #108714 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108714/testReport)** for PR 25047 at commit [`b46b243`](https://github.com/apache/spark/commit/b46b2431cb3dee31180b12b96556562f75a8c961). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
SparkQA removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518634231 **[Test build #108714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108714/testReport)** for PR 25047 at commit [`b46b243`](https://github.com/apache/spark/commit/b46b2431cb3dee31180b12b96556562f75a8c961).
[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518666746 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108714/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518666739 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518666739 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518667432 **[Test build #108719 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108719/testReport)** for PR 25135 at commit [`69c7ca5`](https://github.com/apache/spark/commit/69c7ca5252cdc3cc89ff2e92748d448060f18041). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518667595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108719/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518667586 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518667595 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108719/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518667586 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518657160 **[Test build #108719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108719/testReport)** for PR 25135 at commit [`69c7ca5`](https://github.com/apache/spark/commit/69c7ca5252cdc3cc89ff2e92748d448060f18041).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518666746 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108714/ Test FAILed.
[GitHub] [spark] tgravescs commented on a change in pull request #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
tgravescs commented on a change in pull request #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#discussion_r311062146 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -39,18 +39,17 @@ package object config { private[spark] val SPARK_RESOURCES_COORDINATE = ConfigBuilder("spark.resources.coordinate.enable") .doc("Whether to coordinate resources automatically among workers/drivers(client only) " + -"in Standalone. If not, user should be responsible for assigning different resources " + -"for workers/drivers while using resource discovery script.") +"in Standalone. If false, the user is responsible for configuring different resources " + +"for workers/drivers that run on the same host.") .booleanConf - .createWithDefault(true) + .createWithDefault(false) Review comment: did you mean to turn this to false? If so we need to update the configuration.md to match. I'm ok either way
[GitHub] [spark] SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518671048 **[Test build #108720 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108720/testReport)** for PR 25135 at commit [`82f1520`](https://github.com/apache/spark/commit/82f1520246f9a3304288f7f873967635b3fbe5ee). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
SparkQA removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518660146 **[Test build #108720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108720/testReport)** for PR 25135 at commit [`82f1520`](https://github.com/apache/spark/commit/82f1520246f9a3304288f7f873967635b3fbe5ee).
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518671210 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins commented on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518671219 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108720/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518671219 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108720/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector
AmplabJenkins removed a comment on issue #25135: [SPARK-28367][SS] Use new KafkaConsumer.poll API in Kafka connector URL: https://github.com/apache/spark/pull/25135#issuecomment-518671210 Merged build finished. Test PASSed.
[GitHub] [spark] tgravescs commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
tgravescs commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518671791 test this please
[GitHub] [spark] SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
SparkQA commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518672293 **[Test build #108721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108721/testReport)** for PR 25047 at commit [`b46b243`](https://github.com/apache/spark/commit/b46b2431cb3dee31180b12b96556562f75a8c961).
[GitHub] [spark] tgravescs commented on a change in pull request #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
tgravescs commented on a change in pull request #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#discussion_r311067074 ## File path: core/src/test/scala/org/apache/spark/deploy/worker/WorkerSuite.scala ## @@ -218,6 +243,105 @@ class WorkerSuite extends SparkFunSuite with Matchers with BeforeAndAfter { } } + test("worker could be launched without any resources") { +val worker = makeWorker() +worker.rpcEnv.setupEndpoint("worker", worker) +eventually(timeout(10.seconds)) { + assert(worker.resources === Map.empty) + worker.rpcEnv.shutdown() + worker.rpcEnv.awaitTermination() +} +assertResourcesFileDeleted() + } + + test("worker could load resources from resources file while launching") { +val conf = new SparkConf() +withTempDir { dir => + val gpuArgs = ResourceAllocation(WORKER_GPU_ID, Seq("0", "1")) + val fpgaArgs = +ResourceAllocation(WORKER_FPGA_ID, Seq("f1", "f2", "f3")) + val ja = Extraction.decompose(Seq(gpuArgs, fpgaArgs)) + val f1 = createTempJsonFile(dir, "resources", ja) + conf.set(SPARK_WORKER_RESOURCE_FILE.key, f1) + conf.set(WORKER_GPU_ID.amountConf, "2") + conf.set(WORKER_FPGA_ID.amountConf, "3") + val worker = makeWorker(conf) + worker.rpcEnv.setupEndpoint("worker", worker) + eventually(timeout(10.seconds)) { +assert(worker.resources === Map(GPU -> gpuArgs.toResourceInformation, + FPGA -> fpgaArgs.toResourceInformation)) +worker.rpcEnv.shutdown() +worker.rpcEnv.awaitTermination() + } + assertResourcesFileDeleted() +} + } + + test("worker could load resources from discovery script while launching") { +val conf = new SparkConf() +withTempDir { dir => + val scriptPath = createTempScriptWithExpectedOutput(dir, "fpgaDiscoverScript", +"""{"name": "fpga","addresses":["f1", "f2", "f3"]}""") + conf.set(WORKER_FPGA_ID.discoveryScriptConf, scriptPath) + conf.set(WORKER_FPGA_ID.amountConf, "3") + val worker = makeWorker(conf) + worker.rpcEnv.setupEndpoint("worker", worker) + 
eventually(timeout(10.seconds)) { +assert(worker.resources === Map(FPGA -> + new ResourceInformation(FPGA, Array("f1", "f2", "f3")))) +worker.rpcEnv.shutdown() +worker.rpcEnv.awaitTermination() + } + assertResourcesFileDeleted() +} + } + + test("worker could load resources from resources file and discovery script while launching") { +val conf = new SparkConf() +withTempDir { dir => + val gpuArgs = ResourceAllocation(WORKER_GPU_ID, Seq("0", "1")) + val ja = Extraction.decompose(Seq(gpuArgs)) + val resourcesPath = createTempJsonFile(dir, "resources", ja) + val scriptPath = createTempScriptWithExpectedOutput(dir, "fpgaDiscoverScript", +"""{"name": "fpga","addresses":["f1", "f2", "f3"]}""") + conf.set(SPARK_WORKER_RESOURCE_FILE.key, resourcesPath) + conf.set(WORKER_FPGA_ID.discoveryScriptConf, scriptPath) + conf.set(WORKER_FPGA_ID.amountConf, "3") + conf.set(WORKER_GPU_ID.amountConf, "2") + val worker = makeWorker(conf) + worker.rpcEnv.setupEndpoint("worker", worker) + eventually(timeout(10.seconds)) { +assert(worker.resources === Map(GPU -> gpuArgs.toResourceInformation, + FPGA -> new ResourceInformation(FPGA, Array("f1", "f2", "f3")))) +worker.rpcEnv.shutdown() +worker.rpcEnv.awaitTermination() + } + assertResourcesFileDeleted() +} + } + + test("Workers should avoid resources conflict when launch from the same host") { Review comment: would be nice to add a test with the SPARK_RESOURCES_COORDINATE off to make sure all the resources from file/discovery returned properly
[GitHub] [spark] SparkQA commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
SparkQA commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518674056 **[Test build #108708 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108708/testReport)** for PR 25360 at commit [`245a4d9`](https://github.com/apache/spark/commit/245a4d977e7945fdfef40ff397a00fd47c0cb37c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
SparkQA removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518610572 **[Test build #108708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108708/testReport)** for PR 25360 at commit [`245a4d9`](https://github.com/apache/spark/commit/245a4d977e7945fdfef40ff397a00fd47c0cb37c).
[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518674646 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins commented on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518674653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13804/ Test PASSed.
[GitHub] [spark] gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template
gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template URL: https://github.com/apache/spark/pull/25310#discussion_r311066783 ## File path: .github/PULL_REQUEST_TEMPLATE ## @@ -1,10 +1,40 @@ -## What changes were proposed in this pull request? + -(Please fill in changes proposed in this fix) +### What changes were proposed in this pull request? + -## How was this patch tested? -(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) -(If this patch involves UI changes, please attach a screenshot; otherwise, remove this) +### Why are the changes needed? + -Please review https://spark.apache.org/contributing.html before opening a pull request. +### Does this PR introduce any user-facing change? Review comment: Just for the sake of my own understanding do we want to hunt for breaking changes here?
[GitHub] [spark] gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template
gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template URL: https://github.com/apache/spark/pull/25310#discussion_r311065678 ## File path: .github/PULL_REQUEST_TEMPLATE ## @@ -1,10 +1,40 @@ -## What changes were proposed in this pull request? + -(Please fill in changes proposed in this fix) +### What changes were proposed in this pull request? + -## How was this patch tested? -(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) -(If this patch involves UI changes, please attach a screenshot; otherwise, remove this) +### Why are the changes needed? Review comment: This chapter answers the `what and why to solve?` questions, but I personally always look for something like `how is it solved high level`. If the high-level approach is questionable, it is worth waiting with the deep dive.
[GitHub] [spark] AmplabJenkins commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
AmplabJenkins commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518674834 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108708/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
AmplabJenkins commented on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518674827 Merged build finished. Test PASSed.
[GitHub] [spark] gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template
gaborgsomogyi commented on a change in pull request #25310: [SPARK-28578][INFRA] Improve Github pull request template URL: https://github.com/apache/spark/pull/25310#discussion_r311062252 ## File path: .github/PULL_REQUEST_TEMPLATE ## @@ -1,10 +1,40 @@ -## What changes were proposed in this pull request? +
[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518674646 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
AmplabJenkins removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518674827 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
AmplabJenkins removed a comment on issue #25047: [WIP][SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone URL: https://github.com/apache/spark/pull/25047#issuecomment-518674653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13804/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql'
AmplabJenkins removed a comment on issue #25360: [SPARK-28280][PYTHON][SQL][TESTS][FOLLOW-UP] Add UDF cases into group by clause in 'udf-group-by.sql' URL: https://github.com/apache/spark/pull/25360#issuecomment-518674834 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108708/ Test PASSed.
[GitHub] [spark] DylanGuedes commented on issue #24881: [SPARK-23160][SQL][TEST] Port window.sql
DylanGuedes commented on issue #24881: [SPARK-23160][SQL][TEST] Port window.sql URL: https://github.com/apache/spark/pull/24881#issuecomment-518678362 @dongjoon-hyun I applied the suggestions by @maropu, what do you think?
[GitHub] [spark] SparkQA removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
SparkQA removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518613142 **[Test build #108710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108710/testReport)** for PR 25357 at commit [`a9abe00`](https://github.com/apache/spark/commit/a9abe00bc96b7baac966a494c8bbb0ce0bd30041).
[GitHub] [spark] SparkQA commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
SparkQA commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518680111 **[Test build #108710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108710/testReport)** for PR 25357 at commit [`a9abe00`](https://github.com/apache/spark/commit/a9abe00bc96b7baac966a494c8bbb0ce0bd30041). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
AmplabJenkins commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518680853 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108710/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
AmplabJenkins commented on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518680831 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
AmplabJenkins removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518680853 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108710/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query
AmplabJenkins removed a comment on issue #25357: [SPARK-28617][SQL][TEST] Fix misplacement when comment is at the end of the query URL: https://github.com/apache/spark/pull/25357#issuecomment-518680831 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518683881 **[Test build #108712 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108712/testReport)** for PR 25366 at commit [`7278128`](https://github.com/apache/spark/commit/7278128886ba36deab47b2a5b8c72bec937e711d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
SparkQA removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518631787 **[Test build #108712 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108712/testReport)** for PR 25366 at commit [`7278128`](https://github.com/apache/spark/commit/7278128886ba36deab47b2a5b8c72bec937e711d).
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518684252 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins commented on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518684257 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108712/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518684252 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test.
AmplabJenkins removed a comment on issue #25366: [SPARK-27918][SQL][TEST][FOLLOW-UP] Open comment about boolean test. URL: https://github.com/apache/spark/pull/25366#issuecomment-518684257 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108712/ Test FAILed.
[GitHub] [spark] cloud-fan opened a new pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan opened a new pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368 ## What changes were proposed in this pull request? This is a pure refactor PR, which creates a new class `CatalogManager` to track the registered v2 catalogs and provide the catalog lookup functionality. Later, we can use `CatalogManager` to track the current catalog/namespace. This will be done in another PR. ## How was this patch tested? Existing tests.
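For readers skimming the thread, the idea described in this PR (track registered catalogs, look them up by name, fall back gracefully when the default cannot be loaded) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the PR: `DemoCatalog`, `DemoCatalogManager`, the `load` function and the `defaultName` parameter are hypothetical stand-ins for `CatalogPlugin`, `CatalogManager`, `Catalogs.load(name, conf)` and the default-catalog config.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical stand-in for Spark's CatalogPlugin; only the name is modelled.
final case class DemoCatalog(name: String)

// Illustrative registry, not Spark's class: `load` and `defaultName` are assumed
// parameters standing in for the real loader and the default-catalog config.
class DemoCatalogManager(load: String => DemoCatalog, defaultName: Option[String]) {

  // Cache of catalogs that have already been loaded, keyed by name.
  private val catalogs = mutable.HashMap.empty[String, DemoCatalog]

  // Loads the catalog on first lookup and returns the cached instance afterwards.
  // `synchronized` keeps concurrent lookups from loading the same name twice.
  def catalog(name: String): DemoCatalog = synchronized {
    catalogs.getOrElseUpdate(name, load(name))
  }

  // Returns None, rather than throwing, when no default is configured
  // or the configured default cannot be loaded.
  def defaultCatalog(): Option[DemoCatalog] = defaultName.flatMap { name =>
    try {
      Some(catalog(name))
    } catch {
      case NonFatal(_) => None
    }
  }
}

object DemoCatalogManagerExample {
  def main(args: Array[String]): Unit = {
    var loads = 0
    val manager = new DemoCatalogManager(
      name => { loads += 1; DemoCatalog(name) },
      defaultName = Some("testcat"))

    val first = manager.catalog("testcat")
    val second = manager.catalog("testcat")
    println(first eq second)          // whether both lookups returned the cached instance
    println(loads)                    // how many times the loader actually ran
    println(manager.defaultCatalog()) // the default catalog, if it could be loaded
  }
}
```

Because the second argument of `getOrElseUpdate` is passed by name, the loader runs only on a cache miss, and a loader that throws leaves nothing cached, so a later lookup can retry.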
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#discussion_r311088658 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalog.v2 + +import scala.collection.mutable +import scala.util.control.NonFatal + +import org.apache.spark.internal.Logging +import org.apache.spark.sql.internal.SQLConf + +/** + * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow + * the caller to look up a catalog by name. + */ +class CatalogManager(conf: SQLConf) extends Logging { + + /** + * Tracks all the registered catalogs. + */ + private val catalogs = mutable.HashMap.empty[String, CatalogPlugin] + + /** + * Looks up a catalog by name. + */ + def catalog(name: String): CatalogPlugin = synchronized { +catalogs.getOrElseUpdate(name, Catalogs.load(name, conf)) + } + + /** + * Returns the default catalog specified by config. 
+ */ + def defaultCatalog(): Option[CatalogPlugin] = { +conf.defaultV2Catalog.flatMap { catalogName => Review comment: This is simply moved from https://github.com/apache/spark/pull/25368/files#diff-ddb77374e054250d0d6ac608a8188729L46
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#discussion_r311088911 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalog.v2 + +import scala.collection.mutable +import scala.util.control.NonFatal + +import org.apache.spark.internal.Logging +import org.apache.spark.sql.internal.SQLConf + +/** + * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow + * the caller to look up a catalog by name. + */ +class CatalogManager(conf: SQLConf) extends Logging { + + /** + * Tracks all the registered catalogs. + */ + private val catalogs = mutable.HashMap.empty[String, CatalogPlugin] + + /** + * Looks up a catalog by name. + */ + def catalog(name: String): CatalogPlugin = synchronized { +catalogs.getOrElseUpdate(name, Catalogs.load(name, conf)) + } + + /** + * Returns the default catalog specified by config. 
+ */ + def defaultCatalog(): Option[CatalogPlugin] = { +conf.defaultV2Catalog.flatMap { catalogName => + try { +Some(catalog(catalogName)) + } catch { +case NonFatal(e) => + logError(s"Cannot load default v2 catalog: $catalogName", e) + None + } +} + } + + def v2SessionCatalog(): Option[CatalogPlugin] = { +try { Review comment: https://github.com/apache/spark/pull/25368/files#diff-ddb77374e054250d0d6ac608a8188729L62
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#discussion_r311088911 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ## @@ -0,0 +1,75 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.catalog.v2 + +import scala.collection.mutable +import scala.util.control.NonFatal + +import org.apache.spark.internal.Logging +import org.apache.spark.sql.internal.SQLConf + +/** + * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow + * the caller to look up a catalog by name. + */ +class CatalogManager(conf: SQLConf) extends Logging { + + /** + * Tracks all the registered catalogs. + */ + private val catalogs = mutable.HashMap.empty[String, CatalogPlugin] + + /** + * Looks up a catalog by name. + */ + def catalog(name: String): CatalogPlugin = synchronized { +catalogs.getOrElseUpdate(name, Catalogs.load(name, conf)) + } + + /** + * Returns the default catalog specified by config. 
+ */ + def defaultCatalog(): Option[CatalogPlugin] = { +conf.defaultV2Catalog.flatMap { catalogName => + try { +Some(catalog(catalogName)) + } catch { +case NonFatal(e) => + logError(s"Cannot load default v2 catalog: $catalogName", e) + None + } +} + } + + def v2SessionCatalog(): Option[CatalogPlugin] = { +try { Review comment: This is simply moved from https://github.com/apache/spark/pull/25368/files#diff-ddb77374e054250d0d6ac608a8188729L62
[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518690922 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13805/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518690909 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
SparkQA commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518690899 **[Test build #108715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108715/testReport)** for PR 25243 at commit [`5ac310b`](https://github.com/apache/spark/commit/5ac310b7feeef2a7a092e5e0e7f6ea973dceb2ba). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518691157 cc @rdblue @brkyvz
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518691259 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins commented on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518691275 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108715/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518691259 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
SparkQA removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518636703 **[Test build #108715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108715/testReport)** for PR 25243 at commit [`5ac310b`](https://github.com/apache/spark/commit/5ac310b7feeef2a7a092e5e0e7f6ea973dceb2ba).
[GitHub] [spark] SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
SparkQA commented on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518691919 **[Test build #108722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108722/testReport)** for PR 25368 at commit [`de95b81`](https://github.com/apache/spark/commit/de95b81e0ade4880fc83bba0dafd171db4e312b8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test
AmplabJenkins removed a comment on issue #25243: [SPARK-28498][SQL][TEST] clear the states of SparkSession after each test URL: https://github.com/apache/spark/pull/25243#issuecomment-518691275 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108715/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518690909 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
AmplabJenkins removed a comment on issue #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs URL: https://github.com/apache/spark/pull/25368#issuecomment-518690922 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13805/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #25306: [SPARK-28573][SQL] Convert InsertIntoTable(HiveTableRelation) to DataSource inserting for partitioned table
cloud-fan commented on a change in pull request #25306: [SPARK-28573][SQL] Convert InsertIntoTable(HiveTableRelation) to DataSource inserting for partitioned table
URL: https://github.com/apache/spark/pull/25306#discussion_r311094525
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveCommandSuite.scala ##
@@ -58,7 +58,11 @@ class HiveCommandSuite extends QueryTest with SQLTestUtils with TestHiveSingleto
         |TBLPROPERTIES('prop1Key'="prop1Val", '`prop2Key`'="prop2Val")
       """.stripMargin)
     sql("CREATE TABLE parquet_tab3(col1 int, `col 2` int)")
-    sql("CREATE TABLE parquet_tab4 (price int, qty int) partitioned by (year int, month int)")
+    sql(
+      """
+        |CREATE TABLE parquet_tab4 (price int, qty int) partitioned by (year int, month int)
+        |STORED AS PARQUET
Review comment: is it a necessary change?
[GitHub] [spark] cloud-fan commented on a change in pull request #25306: [SPARK-28573][SQL] Convert InsertIntoTable(HiveTableRelation) to DataSource inserting for partitioned table
cloud-fan commented on a change in pull request #25306: [SPARK-28573][SQL] Convert InsertIntoTable(HiveTableRelation) to DataSource inserting for partitioned table
URL: https://github.com/apache/spark/pull/25306#discussion_r311095687
## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ##
@@ -197,7 +197,9 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
           Some(partitionSchema))
         val logicalRelation = cached.getOrElse {
-          val sizeInBytes = relation.stats.sizeInBytes.toLong
+          val defaultSizeInBytes = sparkSession.sessionState.conf.defaultSizeInBytes
Review comment: cc @wangyum do you think this is fixable?
[GitHub] [spark] cloud-fan commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
cloud-fan commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-518695136 retest this please
[GitHub] [spark] SparkQA commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
SparkQA commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-518695464 **[Test build #108723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108723/testReport)** for PR 25115 at commit [`b96c8a7`](https://github.com/apache/spark/commit/b96c8a7d4c9e09c835f2563c1309e91a47227ce0).
[GitHub] [spark] maryannxue commented on a change in pull request #25328: [SPARK-28595][SQL] explain should not trigger partition listing
maryannxue commented on a change in pull request #25328: [SPARK-28595][SQL] explain should not trigger partition listing
URL: https://github.com/apache/spark/pull/25328#discussion_r311096949
## File path: sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala ##
@@ -49,6 +49,16 @@ class BucketedReadWithoutHiveSupportSuite extends BucketedReadSuite with SharedS
 abstract class BucketedReadSuite extends QueryTest with SQLTestUtils {
   import testImplicits._
+  protected override def beforeAll(): Unit = {
+    super.beforeAll()
+    spark.sessionState.conf.setConf(SQLConf.LEGACY_BUCKETED_TABLE_SCAN_OUTPUT_ORDERING, true)
+  }
+
+  protected override def afterAll(): Unit = {
+    spark.sessionState.conf.unsetConf(SQLConf.LEGACY_BUCKETED_TABLE_SCAN_OUTPUT_ORDERING)
Review comment: Should we do a "store and recover the old conf" instead?
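For readers following the review comment above, here is a minimal sketch of the "store and recover the old conf" pattern it suggests. It is not code from the pull request. `FakeConf`, `ConfRestore`, `withSavedConf`, and the key `example.legacy.flag` are all hypothetical stand-ins invented for illustration, so the block runs without Spark on the classpath. The idea is to remember the previous value (or its absence) before overriding a setting and to put exactly that state back afterwards, rather than unconditionally unsetting the key.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a session configuration; this is NOT Spark's SQLConf.
final class FakeConf {
  private val settings = mutable.Map.empty[String, String]
  def get(key: String): Option[String] = settings.get(key)
  def set(key: String, value: String): Unit = settings.update(key, value)
  def unset(key: String): Unit = { settings.remove(key); () }
}

object ConfRestore {
  // Runs `body` with `key` set to `value`, then restores the prior state,
  // even if `body` throws.
  def withSavedConf[T](conf: FakeConf, key: String, value: String)(body: => T): T = {
    val previous = conf.get(key) // None means the key was not set beforehand
    conf.set(key, value)
    try {
      body
    } finally {
      previous match {
        case Some(old) => conf.set(key, old) // put the old value back
        case None      => conf.unset(key)    // key was absent, so remove it again
      }
    }
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val conf = new FakeConf
    conf.set("example.legacy.flag", "false")
    ConfRestore.withSavedConf(conf, "example.legacy.flag", "true") {
      println(conf.get("example.legacy.flag"))
    }
    println(conf.get("example.legacy.flag"))
  }
}
```

With a plain unset in `afterAll`, a value that an outer suite or session had set explicitly would be lost. Saving the previous value avoids that. Spark's own test utilities offer a `withSQLConf` helper built on the same idea for scoped overrides.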
[GitHub] [spark] cloud-fan commented on a change in pull request #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
cloud-fan commented on a change in pull request #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
URL: https://github.com/apache/spark/pull/25115#discussion_r31109
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ##
@@ -541,6 +541,14 @@ object OverwritePartitionsDynamic {
   }
 }
+case class DeleteFromTable(
+    child: LogicalPlan,
+    condition: Filter) extends Command {
Review comment: I get that we need the `Filter` operator to resolve subqueries, but this is very subtle. Can we add some documentation to explain it?
[GitHub] [spark] maryannxue commented on issue #25328: [SPARK-28595][SQL] explain should not trigger partition listing
maryannxue commented on issue #25328: [SPARK-28595][SQL] explain should not trigger partition listing URL: https://github.com/apache/spark/pull/25328#issuecomment-518696404 LGTM except one minor comment https://github.com/apache/spark/pull/25328/files#r311096949.