[spark] branch branch-3.1 updated: [SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 74fe4fa [SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`

74fe4fa is described below

commit 74fe4fadfa76b15560a78ce53f53319f015819e7
Author: Weiwei Yang
AuthorDate: Tue Oct 19 22:42:06 2021 -0700

[SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`

SPARK-33099 added support for respecting `spark.dynamicAllocation.executorIdleTimeout` in `ExecutorPodsAllocator`. However, when it checks whether a pending executor pod has timed out, it checks against the pod's [startTime](https://github.com/kubernetes/api/blob/2a5dae08c42b1e8fdc1379432d8898efece65363/core/v1/types.go#L3664-L3667), see the code [here](https://github.com/apache/spark/blob/c2ba498ff678ddda034cedf45cc17fbeefe922fd/resource-managers/kubernetes/core/src/main/scala/org/apache/spark [...]
This can be reproduced locally by running the following job:

```
${SPARK_HOME}/bin/spark-submit --master k8s://http://localhost:8001 --deploy-mode cluster --name spark-group-example \
  --master k8s://http://localhost:8001 --deploy-mode cluster \
  --class org.apache.spark.examples.GroupByTest \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.namespace=spark-test \
  --conf spark.kubernetes.executor.request.cores=1 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.shuffle.service.enabled=false \
  --conf spark.kubernetes.container.image=local/spark:3.3.0 \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.3.0-SNAPSHOT.jar \
  1000 1000 100 1000
```

The local cluster doesn't have enough resources to run more than 4 executors, so the remaining executor pods stay pending. The job builds up a task backlog and triggers requests for more executors from K8s:

```
21/10/19 22:51:45 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes for ResourceProfile Id: 0, target: 1 running: 0.
21/10/19 22:51:51 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes for ResourceProfile Id: 0, target: 2 running: 1.
21/10/19 22:51:52 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes for ResourceProfile Id: 0, target: 4 running: 2.
21/10/19 22:51:53 INFO ExecutorPodsAllocator: Going to request 4 executors from Kubernetes for ResourceProfile Id: 0, target: 8 running: 4.
...
21/10/19 22:52:14 INFO ExecutorPodsAllocator: Deleting 39 excess pod requests (23,59,32,41,50,68,35,44,17,8,53,62,26,71,11,56,29,38,47,20,65,5,14,46,64,73,55,49,40,67,58,13,22,31,7,16,52,70,43).
21/10/19 22:52:18 INFO ExecutorPodsAllocator: Deleting 28 excess pod requests (25,34,61,37,10,19,28,60,69,63,45,54,72,36,18,9,27,21,57,12,48,30,39,66,15,42,24,33).
```

At `22:51:45` it starts to request executors, and at `22:52:14` it starts to delete excess executor pods. That is only 29s, but `spark.dynamicAllocation.executorIdleTimeout` is set to 60s: the config was not honored.

### What changes were proposed in this pull request?

Change the check from the pod's `startTime` to its `creationTimestamp`. [creationTimestamp](https://github.com/kubernetes/apimachinery/blob/e6c90c4366be1504309a6aafe0d816856450f36a/pkg/apis/meta/v1/types.go#L193-L201) is the timestamp at which a pod is created on K8s:

```
// CreationTimestamp is a timestamp representing the server time when this object was
// created. It is not guaranteed to be set in happens-before order across separate operations.
// Clients may not set this value. It is represented in RFC3339 form and is in UTC.
```

[startTime](https://github.com/kubernetes/api/blob/2a5dae08c42b1e8fdc1379432d8898efece65363/core/v1/types.go#L3664-L3667) is the timestamp at which the pod is started:

```
// RFC 3339 date and time at which the object was acknowledged by the Kubelet.
// This is before the Kubelet pulled the container image(s) for the pod.
// +optional
```

A pending pod's `startTime` is empty. Here is an example of a pending pod:

```
NAMESPACE   NAME                  READY   STATUS    RESTARTS   AGE
default     pending-pod-example   0/1     Pending   0          2s

kubectl get pod pending-pod-example -o yaml | grep creationTimestamp
---> creationTimestamp: "2021-10-19T16:17:52Z"

// pending pod has no startTime
kubectl get pod pending-pod-example -o yaml | grep
```
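The fix itself lives in Scala inside `ExecutorPodsAllocator`; the Python sketch below is a hypothetical, language-neutral model (a pod as a plain dict, field names mirroring the K8s API) of why the field choice matters: a pending pod always carries a `creationTimestamp`, while its `startTime` stays empty until the kubelet acknowledges it.

```python
from datetime import datetime, timedelta, timezone

def pending_pod_timed_out(pod, now, idle_timeout):
    """Return True if a still-pending pod has outlived the idle timeout.

    Uses metadata.creationTimestamp, which the API server always sets,
    instead of status.startTime, which is empty while the pod is Pending.
    """
    created = pod["metadata"]["creationTimestamp"]
    return now - created > idle_timeout

now = datetime(2021, 10, 19, 22, 52, 14, tzinfo=timezone.utc)
pod = {
    "metadata": {"creationTimestamp": now - timedelta(seconds=90)},
    "status": {"startTime": None},  # Pending: the kubelet has not started it
}
print(pending_pod_timed_out(pod, now, timedelta(seconds=60)))  # True
```

A check based on the (missing) `startTime` cannot measure how long the pod has been pending, which is why the 60s timeout was not honored.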
[spark] branch branch-3.2 updated: [SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 10a958b3 [SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`

10a958b3 is described below

commit 10a958b3480dd536c869247fe3e83e823a51
Author: Weiwei Yang
AuthorDate: Tue Oct 19 22:42:06 2021 -0700

[SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`

SPARK-33099 added support for respecting `spark.dynamicAllocation.executorIdleTimeout` in `ExecutorPodsAllocator`. However, when it checks whether a pending executor pod has timed out, it checks against the pod's [startTime](https://github.com/kubernetes/api/blob/2a5dae08c42b1e8fdc1379432d8898efece65363/core/v1/types.go#L3664-L3667), see the code [here](https://github.com/apache/spark/blob/c2ba498ff678ddda034cedf45cc17fbeefe922fd/resource-managers/kubernetes/core/src/main/scala/org/apache/spark [...]
[spark] branch master updated (b07dd1a -> 041cd5d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from b07dd1a [SPARK-36348][TEST] Complete test_astype for index
add 041cd5d [SPARK-37049][K8S] executorIdleTimeout should check `creationTimestamp` instead of `startTime`

No new revisions were added by this update.

Summary of changes:
 .../spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala      | 8 ++++----
 .../spark/scheduler/cluster/k8s/ExecutorLifecycleTestUtils.scala | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] gengliangwang opened a new pull request #365: Update Spark version in "Link with Spark" section of download page
gengliangwang opened a new pull request #365:
URL: https://github.com/apache/spark-website/pull/365

Update the Spark version from 3.1.2 to 3.2.0 in the "Link with Spark" section of the download page.

Before:
![image](https://user-images.githubusercontent.com/1097932/138031706-cb0dc5a3-6978-48bf-8c7d-f71198652677.png)

After:
![image](https://user-images.githubusercontent.com/1097932/138031744-32ac46c4-e171-4ad7-92cc-3c4fe77b7fd2.png)

I also ran `grep 3.1.2 *.md` to find whether there is any other place we need to update.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
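The `grep 3.1.2 *.md` sweep mentioned above can be sketched as a small script; the file names and the stale version string below are illustrative only, not part of the spark-website repository.

```python
# Sketch of "grep 3.1.2 *.md": report files still mentioning a stale version.
import pathlib
import tempfile

def find_stale_version(root, old_version="3.1.2"):
    """Return (file, line number, line) for every match of old_version."""
    hits = []
    for path in sorted(pathlib.Path(root).glob("*.md")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if old_version in line:
                hits.append((path.name, lineno, line.strip()))
    return hits

# Demo against a throwaway directory with one hypothetical page.
with tempfile.TemporaryDirectory() as tmp:
    page = pathlib.Path(tmp, "downloads.md")
    page.write_text("Link with Spark 3.1.2\nOther text\n")
    print(find_stale_version(tmp))  # [('downloads.md', 1, 'Link with Spark 3.1.2')]
```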
[spark] branch master updated: [SPARK-36348][TEST] Complete test_astype for index
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b07dd1a [SPARK-36348][TEST] Complete test_astype for index

b07dd1a is described below

commit b07dd1aacab9cf2df4ba3e88842a408e3c5c26a8
Author: Yikun Jiang
AuthorDate: Wed Oct 20 12:14:06 2021 +0900

[SPARK-36348][TEST] Complete test_astype for index

### What changes were proposed in this pull request?

Before 3.2, there was a bug:

```
pidx = pd.Index([10, 20, 15, 30, 45, None], name="x")
psidx = ps.Index(pidx)
self.assert_eq(psidx.astype(str), pidx.astype(str))

[left pandas on spark]:
Index(['10.0', '20.0', '15.0', '30.0', '45.0', 'nan'], dtype='object', name='x')
[right pandas]:
Index(['10', '20', '15', '30', '45', 'None'], dtype='object', name='x')
```

Because of that, no test was added for [test_base.py int_with_nan](https://github.com/apache/spark/blob/bcc595c112a23d8e3024ace50f0dbc7eab7144b2/python/pyspark/pandas/tests/indexes/test_base.py#L2249). Now that the bug has been resolved, we complete the test case here.

### Why are the changes needed?

Regression test for SPARK-36348; completes the test case.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Test only

Closes #34335 from Yikun/SPARK-36348.
Authored-by: Yikun Jiang
Signed-off-by: Hyukjin Kwon

---
 python/pyspark/pandas/tests/indexes/test_base.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/pandas/tests/indexes/test_base.py b/python/pyspark/pandas/tests/indexes/test_base.py
index 4003998..a7f19a7 100644
--- a/python/pyspark/pandas/tests/indexes/test_base.py
+++ b/python/pyspark/pandas/tests/indexes/test_base.py
@@ -2243,12 +2243,14 @@ class IndexesTest(PandasOnSparkTestCase, TestUtils):
         pidx = pd.Index([10, 20, 15, 30, 45, None], name="x")
         psidx = ps.Index(pidx)
+        self.assert_eq(psidx.astype(bool), pidx.astype(bool))
+        self.assert_eq(psidx.astype(str), pidx.astype(str))

         pidx = pd.Index(["hi", "hi ", " ", " \t", "", None], name="x")
         psidx = ps.Index(pidx)
         self.assert_eq(psidx.astype(bool), pidx.astype(bool))
-        self.assert_eq(psidx.astype(str).to_numpy(), ["hi", "hi ", " ", " \t", "", "None"])
+        self.assert_eq(psidx.astype(str), pidx.astype(str))

         pidx = pd.Index([True, False, None], name="x")
         psidx = ps.Index(pidx)
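The old mismatch quoted in the commit message came from null-driven float coercion. The pure-Python illustration below models that coercion without calling pandas itself: once a `None` forces integer values to float, stringifying them yields `'10.0'`/`'nan'` instead of `'10'`/`'None'`.

```python
# Model of the old discrepancy: a None in an integer Index forces float64,
# so stringifying the coerced values differs from stringifying the originals.
values = [10, 20, 15, 30, 45, None]

as_float = [float("nan") if v is None else float(v) for v in values]
pandas_on_spark_old = [str(v) for v in as_float]               # via float coercion
pandas_expected = ["None" if v is None else str(v) for v in values]  # direct str()

print(pandas_on_spark_old)  # ['10.0', '20.0', '15.0', '30.0', '45.0', 'nan']
print(pandas_expected)      # ['10', '20', '15', '30', '45', 'None']
```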
[spark] branch master updated (b1aaefb -> 48b3510)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from b1aaefb [SPARK-37002][PYTHON] Introduce the 'compute.eager_check' option
add 48b3510 [SPARK-37044][PYTHON] Add Row to __all__ in pyspark.sql.types

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/types.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (81aa514 -> b1aaefb)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 81aa514 [SPARK-37059][PYTHON][TESTS] Ensure the sort order of the output in the PySpark doctests
add b1aaefb [SPARK-37002][PYTHON] Introduce the 'compute.eager_check' option

No new revisions were added by this update.

Summary of changes:
 python/docs/source/user_guide/pandas_on_spark/options.rst | 7 +++++++
 python/pyspark/pandas/config.py                           | 11 +++++++++++
 2 files changed, 18 insertions(+)
[spark] branch master updated (40f1494 -> 81aa514)
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 40f1494 [SPARK-37041][SQL] Backport HIVE-15025: Secure-Socket-Layer (SSL) support for HMS
add 81aa514 [SPARK-37059][PYTHON][TESTS] Ensure the sort order of the output in the PySpark doctests

No new revisions were added by this update.

Summary of changes:
 python/pyspark/ml/fpm.py        | 20 ++++++++++----------
 python/pyspark/sql/functions.py |  4 ++--
 python/run-tests.py             |  2 +-
 3 files changed, 13 insertions(+), 13 deletions(-)
[spark] branch master updated: [SPARK-37041][SQL] Backport HIVE-15025: Secure-Socket-Layer (SSL) support for HMS
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 40f1494 [SPARK-37041][SQL] Backport HIVE-15025: Secure-Socket-Layer (SSL) support for HMS

40f1494 is described below

commit 40f14942a97d4572178974bcbeea207abb518571
Author: Yuming Wang
AuthorDate: Wed Oct 20 08:28:27 2021 +0800

[SPARK-37041][SQL] Backport HIVE-15025: Secure-Socket-Layer (SSL) support for HMS

### What changes were proposed in this pull request?

This PR backports HIVE-15025: Secure-Socket-Layer (SSL) support for HMS.

### Why are the changes needed?

To make it easier to upgrade Thrift:

```
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java:254:1: error: incompatible types: String cannot be converted to TConfiguration
[error] return new TSocket(host, port, loginTimeout);
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test.

Closes #34312 from wangyum/SPARK-37041.
Authored-by: Yuming Wang
Signed-off-by: Yuming Wang

---
 .../apache/hive/service/auth/HiveAuthFactory.java | 77 ----------------------
 1 file changed, 77 deletions(-)

diff --git a/sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java b/sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
index fbb5230..8d77b23 100644
--- a/sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
+++ b/sql/hive-thriftserver/src/main/java/org/apache/hive/service/auth/HiveAuthFactory.java
@@ -19,17 +19,10 @@ package org.apache.hive.service.auth;
 import java.io.IOException;
 import java.lang.reflect.Field;
 import java.lang.reflect.Method;
-import java.net.InetSocketAddress;
-import java.net.UnknownHostException;
-import java.util.ArrayList;
-import java.util.Arrays;
 import java.util.HashMap;
-import java.util.List;
-import java.util.Locale;
 import java.util.Map;
 import java.util.Objects;

-import javax.net.ssl.SSLServerSocket;
 import javax.security.auth.login.LoginException;
 import javax.security.sasl.Sasl;

@@ -50,10 +43,6 @@ import org.apache.hadoop.security.authorize.ProxyUsers;
 import org.apache.hive.service.cli.HiveSQLException;
 import org.apache.hive.service.cli.thrift.ThriftCLIService;
 import org.apache.thrift.TProcessorFactory;
-import org.apache.thrift.transport.TSSLTransportFactory;
-import org.apache.thrift.transport.TServerSocket;
-import org.apache.thrift.transport.TSocket;
-import org.apache.thrift.transport.TTransport;
 import org.apache.thrift.transport.TTransportException;
 import org.apache.thrift.transport.TTransportFactory;
 import org.slf4j.Logger;

@@ -250,72 +239,6 @@ public class HiveAuthFactory {
     }
   }

-  public static TTransport getSocketTransport(String host, int port, int loginTimeout) {
-    return new TSocket(host, port, loginTimeout);
-  }
-
-  public static TTransport getSSLSocket(String host, int port, int loginTimeout)
-    throws TTransportException {
-    return TSSLTransportFactory.getClientSocket(host, port, loginTimeout);
-  }
-
-  public static TTransport getSSLSocket(String host, int port, int loginTimeout,
-    String trustStorePath, String trustStorePassWord) throws TTransportException {
-    TSSLTransportFactory.TSSLTransportParameters params =
-      new TSSLTransportFactory.TSSLTransportParameters();
-    params.setTrustStore(trustStorePath, trustStorePassWord);
-    params.requireClientAuth(true);
-    return TSSLTransportFactory.getClientSocket(host, port, loginTimeout, params);
-  }
-
-  public static TServerSocket getServerSocket(String hiveHost, int portNum)
-    throws TTransportException {
-    InetSocketAddress serverAddress;
-    if (hiveHost == null || hiveHost.isEmpty()) {
-      // Wildcard bind
-      serverAddress = new InetSocketAddress(portNum);
-    } else {
-      serverAddress = new InetSocketAddress(hiveHost, portNum);
-    }
-    return new TServerSocket(serverAddress);
-  }
-
-  public static TServerSocket getServerSSLSocket(String hiveHost, int portNum, String keyStorePath,
-      String keyStorePassWord, List sslVersionBlacklist) throws TTransportException,
-      UnknownHostException {
-    TSSLTransportFactory.TSSLTransportParameters params =
-        new TSSLTransportFactory.TSSLTransportParameters();
-    params.setKeyStore(keyStorePath, keyStorePassWord);
-    InetSocketAddress serverAddress;
-    if (hiveHost == null || hiveHost.isEmpty()) {
-      // Wildcard bind
-      serverAddress = new InetSocketAddress(portNum);
-    } else {
-      serverAddress = new InetSocketAddress(hiveHost, portNum);
-    }
-    TServerSocket thriftServerSocket =
[GitHub] [spark-website] srowen commented on pull request #363: Fix Spark 3.2.0 download URL
srowen commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946915197

Let's not make the release differ from the profiles, I think. We can rename the profile and leave hadoop-3.2 as a no-op profile. By 3.3.0, who knows, maybe we're on 3.4 or something.
[GitHub] [spark-website] sunchao edited a comment on pull request #363: Fix Spark 3.2.0 download URL
sunchao edited a comment on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946914494

I'll pick up https://github.com/apache/spark/pull/30891 again soon and make sure we'll get this done before Spark 3.3.0.

> Should we keep it as it is? Or figure out a way to fix it?

I think we can fix this in the same PR above, perhaps by just changing `make-distribution.sh`.
[GitHub] [spark-website] sunchao commented on pull request #363: Fix Spark 3.2.0 download URL
sunchao commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946914494

I'll pick up #30891 again soon and make sure we'll get this done before Spark 3.3.0.

> Should we keep it as it is? Or figure out a way to fix it?

I think we can fix this in the same PR above, perhaps by just changing `make-distribution.sh`.
[GitHub] [spark-website] srowen opened a new pull request #364: Add Scala 2.13 build download link
srowen opened a new pull request #364:
URL: https://github.com/apache/spark-website/pull/364

We have a Scala 2.13 build for Hadoop 3.3 - add it as an option to download.
[GitHub] [spark-website] limansky commented on pull request #361: Add 3.2.0 release note and news and update links
limansky commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946857781

I think it would be nice to have both the Hadoop 3 build and the build without Hadoop for 2.13.
[GitHub] [spark-website] srowen commented on pull request #363: Fix Spark 3.2.0 download URL
srowen commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946847970

Change it at Spark 3.3, IMHO. It's just a naming thing: hadoop-3.2 really means "3.2 or later", so it's not crazy.
[GitHub] [spark-website] srowen commented on pull request #361: Add 3.2.0 release note and news and update links
srowen commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946847108

Fair point - I think the problem is the explosion of combinations of artifacts if there are sets for each Scala version, but we did publish a binary release for 2.13 and it should be in the UI. Unless someone's on that already, I can maybe hack in an option. Probably anyone on Scala 2.13 is generally on newer versions of things, so there's not as much point in building for old Hadoop 2 and 2.13. (People can create whatever build they like from the source release, though.)
[GitHub] [spark-website] gengliangwang commented on pull request #363: Fix Spark 3.2.0 download URL
gengliangwang commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946847052

There are some discussions about this on the dev list: https://www.mail-archive.com/dev@spark.apache.org/msg28025.html. The conclusion was to make the change in Spark 3.3.0.

We could try renaming the tarball when finalizing the release, but I was using the release script... Should we keep it as it is? Or figure out a way to fix it?

cc @dongjoon-hyun @sunchao
[GitHub] [spark-website] limansky commented on pull request #361: Add 3.2.0 release note and news and update links
limansky commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946842912

BTW, why is this the only build available for Scala 2.13?
[GitHub] [spark-website] yaooqinn commented on pull request #363: Fix Spark 3.2.0 download URL
yaooqinn commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946828748

> We should really rename the maven profile name...
[GitHub] [spark-website] limansky commented on pull request #361: Add 3.2.0 release note and news and update links
limansky commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946826339

Hi, I've just found that there is no link for the Hadoop 3.3 + Scala 2.13 build on the download page.
[GitHub] [spark-website] cloud-fan commented on pull request #363: Fix Spark 3.2.0 download URL
cloud-fan commented on pull request #363:
URL: https://github.com/apache/spark-website/pull/363#issuecomment-946820514

We should really rename the maven profile name...
[GitHub] [spark-website] gengliangwang merged pull request #363: Fix Spark 3.2.0 download URL
gengliangwang merged pull request #363:
URL: https://github.com/apache/spark-website/pull/363
[spark-website] branch asf-site updated: Fix Spark 3.2.0 download URL (#363)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 920bafa  Fix Spark 3.2.0 download URL (#363)

920bafa is described below

commit 920bafabcbe4f2494085e2e223e3444d2f34e5f8
Author: Gengliang Wang
AuthorDate: Tue Oct 19 23:07:00 2021 +0800

    Fix Spark 3.2.0 download URL (#363)
---
 js/downloads.js      | 2 +-
 site/js/downloads.js | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/js/downloads.js b/js/downloads.js
index 6ab9447..57003bb 100644
--- a/js/downloads.js
+++ b/js/downloads.js
@@ -15,7 +15,7 @@ var sources = {pretty: "Source Code", tag: "sources"};
 var hadoopFree = {pretty: "Pre-built with user-provided Apache Hadoop", tag: "without-hadoop"};
 var hadoop2p7 = {pretty: "Pre-built for Apache Hadoop 2.7", tag: "hadoop2.7"};
 var hadoop3p2 = {pretty: "Pre-built for Apache Hadoop 3.2 and later", tag: "hadoop3.2"};
-var hadoop3p3 = {pretty: "Pre-built for Apache Hadoop 3.3 and later", tag: "hadoop3.3"};
+var hadoop3p3 = {pretty: "Pre-built for Apache Hadoop 3.3 and later", tag: "hadoop3.2"};
 var scala2p12_hadoopFree = {pretty: "Pre-built with Scala 2.12 and user-provided Apache Hadoop", tag: "without-hadoop-scala-2.12"}; // 3.0.0+

diff --git a/site/js/downloads.js b/site/js/downloads.js
index 6ab9447..57003bb 100644
--- a/site/js/downloads.js
+++ b/site/js/downloads.js
@@ -15,7 +15,7 @@ var sources = {pretty: "Source Code", tag: "sources"};
 var hadoopFree = {pretty: "Pre-built with user-provided Apache Hadoop", tag: "without-hadoop"};
 var hadoop2p7 = {pretty: "Pre-built for Apache Hadoop 2.7", tag: "hadoop2.7"};
 var hadoop3p2 = {pretty: "Pre-built for Apache Hadoop 3.2 and later", tag: "hadoop3.2"};
-var hadoop3p3 = {pretty: "Pre-built for Apache Hadoop 3.3 and later", tag: "hadoop3.3"};
+var hadoop3p3 = {pretty: "Pre-built for Apache Hadoop 3.3 and later", tag: "hadoop3.2"};
 var scala2p12_hadoopFree = {pretty: "Pre-built with Scala 2.12 and user-provided Apache Hadoop", tag: "without-hadoop-scala-2.12"}; // 3.0.0+

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
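The change above works because Spark 3.2.0's pre-built packages were published with a `hadoop3.2` suffix even for the Hadoop 3.3-compatible build, so the "pretty" label and the package tag intentionally disagree. A minimal sketch of how a tag feeds the download file name (hypothetical helper; the real mapping lives in `js/downloads.js`):

```python
# Hypothetical sketch: how a package "tag" selects the artifact file name.
# The real logic lives in js/downloads.js; this helper is illustrative only.
def package_filename(version: str, tag: str) -> str:
    if tag == "sources":
        return f"spark-{version}.tgz"
    return f"spark-{version}-bin-{tag}.tgz"

# The "Pre-built for Apache Hadoop 3.3 and later" choice must still point at
# the published "hadoop3.2"-suffixed artifact, hence the tag fix above.
print(package_filename("3.2.0", "hadoop3.2"))  # spark-3.2.0-bin-hadoop3.2.tgz
```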
[GitHub] [spark-website] gengliangwang opened a new pull request #363: Fix Spark 3.2.0 download URL
gengliangwang opened a new pull request #363:
URL: https://github.com/apache/spark-website/pull/363

I made a mistake on the download URL of Spark 3.2.0. This PR is to fix it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [spark-website] gengliangwang merged pull request #361: Add 3.2.0 release note and news and update links
gengliangwang merged pull request #361:
URL: https://github.com/apache/spark-website/pull/361
[GitHub] [spark-website] gengliangwang commented on pull request #361: Add 3.2.0 release note and news and update links
gengliangwang commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946762007

I am merging this one. Thanks for the great suggestions, everyone!
[spark] branch master updated (db89320 -> 3849340)
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

 from db89320  [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh
  add 3849340  [SPARK-36796][BUILD][CORE][SQL] Pass all `sql/core` and dependent modules UTs with JDK 17 except one case in `postgreSQL/text.sql`

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/SparkContext.scala | 19 +
 .../apache/spark/launcher/JavaModuleOptions.java   | 47 ++
 .../spark/launcher/SparkSubmitCommandBuilder.java  |  2 +
 pom.xml                                            | 20 -
 project/SparkBuild.scala                           | 14 ++-
 sql/catalyst/pom.xml                               |  2 +-
 sql/core/pom.xml                                   |  2 +-
 7 files changed, 101 insertions(+), 5 deletions(-)
 create mode 100644 launcher/src/main/java/org/apache/spark/launcher/JavaModuleOptions.java
[spark] branch branch-3.2 updated: [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 9c029fd  [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh

9c029fd is described below

commit 9c029fd5a4cd3dcf7159a2a4272ad6ed629a8596
Author: Gengliang Wang
AuthorDate: Tue Oct 19 16:52:26 2021 +0800

    [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh

    ### What changes were proposed in this pull request?
    In release-tag.sh, the DocSearch facet filter should be updated to the release version before the git tag is created. Otherwise, the facet filter is wrong in the new release docs: https://github.com/apache/spark/blame/v3.2.0/docs/_config.yml#L42

    ### Why are the changes needed?
    Fix a bug in release-tag.sh.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Manual test

    Closes #34328 from gengliangwang/fixFacetFilters.

    Authored-by: Gengliang Wang
    Signed-off-by: Gengliang Wang
    (cherry picked from commit db893207ba444a303b1915afeb90b82ef3808cf8)
    Signed-off-by: Gengliang Wang
---
 dev/create-release/release-tag.sh | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/dev/create-release/release-tag.sh b/dev/create-release/release-tag.sh
index d7e9bf2..55aa2e5 100755
--- a/dev/create-release/release-tag.sh
+++ b/dev/create-release/release-tag.sh
@@ -84,7 +84,8 @@ fi
 # Set the release version in docs
 sed -i".tmp1" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$RELEASE_VERSION"'/g' docs/_config.yml
 sed -i".tmp2" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$RELEASE_VERSION"'/g' docs/_config.yml
-sed -i".tmp3" 's/__version__ = .*$/__version__ = "'"$RELEASE_VERSION"'"/' python/pyspark/version.py
+sed -i".tmp3" "s/'facetFilters':.*$/'facetFilters': [\"version:$RELEASE_VERSION\"]/g" docs/_config.yml
+sed -i".tmp4" 's/__version__ = .*$/__version__ = "'"$RELEASE_VERSION"'"/' python/pyspark/version.py

 git commit -a -m "Preparing Spark release $RELEASE_TAG"
 echo "Creating tag $RELEASE_TAG at the head of $GIT_BRANCH"
@@ -94,18 +95,18 @@ git tag $RELEASE_TAG
 $MVN versions:set -DnewVersion=$NEXT_VERSION | grep -v "no value" # silence logs
 # Remove -SNAPSHOT before setting the R version as R expects version strings to only have numbers
 R_NEXT_VERSION=`echo $NEXT_VERSION | sed 's/-SNAPSHOT//g'`
-sed -i".tmp4" 's/Version.*$/Version: '"$R_NEXT_VERSION"'/g' R/pkg/DESCRIPTION
+sed -i".tmp5" 's/Version.*$/Version: '"$R_NEXT_VERSION"'/g' R/pkg/DESCRIPTION
 # Write out the R_NEXT_VERSION to PySpark version info we use dev0 instead of SNAPSHOT to be closer
 # to PEP440.
-sed -i".tmp5" 's/__version__ = .*$/__version__ = "'"$R_NEXT_VERSION.dev0"'"/' python/pyspark/version.py
+sed -i".tmp6" 's/__version__ = .*$/__version__ = "'"$R_NEXT_VERSION.dev0"'"/' python/pyspark/version.py

 # Update docs with next version
-sed -i".tmp6" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$NEXT_VERSION"'/g' docs/_config.yml
+sed -i".tmp7" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$NEXT_VERSION"'/g' docs/_config.yml
 # Use R version for short version
-sed -i".tmp7" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION"'/g' docs/_config.yml
+sed -i".tmp8" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION"'/g' docs/_config.yml
 # Update the version index of DocSearch as the short version
-sed -i".tmp8" "s/'facetFilters':.*$/'facetFilters': [\"version:$R_NEXT_VERSION\"]/g" docs/_config.yml
+sed -i".tmp9" "s/'facetFilters':.*$/'facetFilters': [\"version:$R_NEXT_VERSION\"]/g" docs/_config.yml

 git commit -a -m "Preparing development version $NEXT_VERSION"
[spark] branch master updated: [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new db89320  [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh

db89320 is described below

commit db893207ba444a303b1915afeb90b82ef3808cf8
Author: Gengliang Wang
AuthorDate: Tue Oct 19 16:52:26 2021 +0800

    [SPARK-37057][INFRA] Fix wrong DocSearch facet filter in release-tag.sh

    ### What changes were proposed in this pull request?
    In release-tag.sh, the DocSearch facet filter should be updated to the release version before the git tag is created. Otherwise, the facet filter is wrong in the new release docs: https://github.com/apache/spark/blame/v3.2.0/docs/_config.yml#L42

    ### Why are the changes needed?
    Fix a bug in release-tag.sh.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Manual test

    Closes #34328 from gengliangwang/fixFacetFilters.

    Authored-by: Gengliang Wang
    Signed-off-by: Gengliang Wang
---
 dev/create-release/release-tag.sh | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/dev/create-release/release-tag.sh b/dev/create-release/release-tag.sh
index d7e9bf2..55aa2e5 100755
--- a/dev/create-release/release-tag.sh
+++ b/dev/create-release/release-tag.sh
@@ -84,7 +84,8 @@ fi
 # Set the release version in docs
 sed -i".tmp1" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$RELEASE_VERSION"'/g' docs/_config.yml
 sed -i".tmp2" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$RELEASE_VERSION"'/g' docs/_config.yml
-sed -i".tmp3" 's/__version__ = .*$/__version__ = "'"$RELEASE_VERSION"'"/' python/pyspark/version.py
+sed -i".tmp3" "s/'facetFilters':.*$/'facetFilters': [\"version:$RELEASE_VERSION\"]/g" docs/_config.yml
+sed -i".tmp4" 's/__version__ = .*$/__version__ = "'"$RELEASE_VERSION"'"/' python/pyspark/version.py

 git commit -a -m "Preparing Spark release $RELEASE_TAG"
 echo "Creating tag $RELEASE_TAG at the head of $GIT_BRANCH"
@@ -94,18 +95,18 @@ git tag $RELEASE_TAG
 $MVN versions:set -DnewVersion=$NEXT_VERSION | grep -v "no value" # silence logs
 # Remove -SNAPSHOT before setting the R version as R expects version strings to only have numbers
 R_NEXT_VERSION=`echo $NEXT_VERSION | sed 's/-SNAPSHOT//g'`
-sed -i".tmp4" 's/Version.*$/Version: '"$R_NEXT_VERSION"'/g' R/pkg/DESCRIPTION
+sed -i".tmp5" 's/Version.*$/Version: '"$R_NEXT_VERSION"'/g' R/pkg/DESCRIPTION
 # Write out the R_NEXT_VERSION to PySpark version info we use dev0 instead of SNAPSHOT to be closer
 # to PEP440.
-sed -i".tmp5" 's/__version__ = .*$/__version__ = "'"$R_NEXT_VERSION.dev0"'"/' python/pyspark/version.py
+sed -i".tmp6" 's/__version__ = .*$/__version__ = "'"$R_NEXT_VERSION.dev0"'"/' python/pyspark/version.py

 # Update docs with next version
-sed -i".tmp6" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$NEXT_VERSION"'/g' docs/_config.yml
+sed -i".tmp7" 's/SPARK_VERSION:.*$/SPARK_VERSION: '"$NEXT_VERSION"'/g' docs/_config.yml
 # Use R version for short version
-sed -i".tmp7" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION"'/g' docs/_config.yml
+sed -i".tmp8" 's/SPARK_VERSION_SHORT:.*$/SPARK_VERSION_SHORT: '"$R_NEXT_VERSION"'/g' docs/_config.yml
 # Update the version index of DocSearch as the short version
-sed -i".tmp8" "s/'facetFilters':.*$/'facetFilters': [\"version:$R_NEXT_VERSION\"]/g" docs/_config.yml
+sed -i".tmp9" "s/'facetFilters':.*$/'facetFilters': [\"version:$R_NEXT_VERSION\"]/g" docs/_config.yml

 git commit -a -m "Preparing development version $NEXT_VERSION"
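What the new `.tmp3` sed line does to `docs/_config.yml` can be mirrored in a short sketch (the release script itself uses sed; the helper name here is hypothetical):

```python
import re

# Mirrors the sed command added in this patch:
#   sed -i".tmp3" "s/'facetFilters':.*$/'facetFilters': [\"version:$RELEASE_VERSION\"]/g" docs/_config.yml
# i.e. rewrite the DocSearch facet-filter line to carry the release version.
def set_facet_filter(config_text: str, release_version: str) -> str:
    return re.sub(
        r"'facetFilters':.*$",
        f"'facetFilters': [\"version:{release_version}\"]",
        config_text,
        flags=re.MULTILINE,  # $ matches at each line end, like sed's per-line model
    )

before = "  'facetFilters': [\"version:3.3.0\"]"
print(set_facet_filter(before, "3.2.0"))  # →   'facetFilters': ["version:3.2.0"]
```

Without this substitution running before `git commit` and `git tag`, the tagged `_config.yml` keeps the previous version in the facet filter, which is exactly the bug seen in the v3.2.0 docs.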
[spark] branch master updated: [SPARK-37017][SQL] Reduce the scope of synchronized to prevent potential deadlock
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 875963a  [SPARK-37017][SQL] Reduce the scope of synchronized to prevent potential deadlock

875963a is described below

commit 875963a28a75532871010fcdcdb916bf093dab34
Author: chenzhx
AuthorDate: Tue Oct 19 14:48:32 2021 +0800

    [SPARK-37017][SQL] Reduce the scope of synchronized to prevent potential deadlock

    ### What changes were proposed in this pull request?
    There is a `synchronized` block in the `CatalogManager.currentNamespace` method. This PR pulls the call to `SessionCatalog.getCurrentDatabase` out of that `synchronized` block to prevent a potential deadlock.

    ### Why are the changes needed?
    In our case, we have implemented an external catalog, and there is a thread that directly calls `SessionCatalog.getTempViewOrPermanentTableMetadata` and holds the lock of `SessionCatalog`. It eventually goes into our external catalog and, unfortunately, then calls some functions of `SparkSession`, e.g. `sql`. When it calls `CatalogManager.currentNamespace`, it tries to take the lock of `CatalogManager`. In the meantime, some query threads execute SQL via the DataFrame interface and take the same two locks in the opposite order. This is how the deadlock occurs.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Uses existing tests.

    Closes #34292 from chenzhx/bug-fix.

    Authored-by: chenzhx
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/connector/catalog/CatalogManager.scala | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala
index 7d8bc4f..0380621 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala
@@ -89,12 +89,16 @@ class CatalogManager(

   private var _currentNamespace: Option[Array[String]] = None

-  def currentNamespace: Array[String] = synchronized {
-    _currentNamespace.getOrElse {
-      if (currentCatalog.name() == SESSION_CATALOG_NAME) {
-        Array(v1SessionCatalog.getCurrentDatabase)
-      } else {
-        currentCatalog.defaultNamespace()
+  def currentNamespace: Array[String] = {
+    val defaultNamespace = if (currentCatalog.name() == SESSION_CATALOG_NAME) {
+      Array(v1SessionCatalog.getCurrentDatabase)
+    } else {
+      currentCatalog.defaultNamespace()
+    }
+
+    this.synchronized {
+      _currentNamespace.getOrElse {
+        defaultNamespace
       }
     }
   }
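The shape of the fix — compute the fallback value before taking this object's lock, then hold the lock only for the cached-value check — can be sketched language-neutrally. This is a minimal illustrative model with hypothetical names, not Spark's actual code:

```python
import threading

# Sketch of the pattern in the patch above: the fallback namespace is computed
# *before* taking this object's lock, so this code path never calls into
# another lock-protected component (the session catalog) while holding it.
class NamespaceHolder:
    def __init__(self, default_namespace_fn):
        self._lock = threading.Lock()
        self._current = None  # mirrors `_currentNamespace: Option[Array[String]]`
        self._default_fn = default_namespace_fn

    def current_namespace(self):
        # default_fn may take the session catalog's lock; it runs OUTSIDE
        # self._lock, so no thread ever holds both locks at once from here.
        default = self._default_fn()
        with self._lock:
            return self._current if self._current is not None else default

holder = NamespaceHolder(lambda: ["default"])
print(holder.current_namespace())  # ['default']
```

The deadlock required thread A to hold lock X while requesting lock Y and thread B to hold Y while requesting X; keeping the nested call outside the lock removes one edge of that cycle.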
[spark] branch branch-3.2 updated: [SPARK-37052][CORE] Spark should only pass --verbose argument to main class when is sql shell
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 34086b0  [SPARK-37052][CORE] Spark should only pass --verbose argument to main class when is sql shell

34086b0 is described below

commit 34086b0c4dff77deaa3e381e63bf4597f100166e
Author: Angerszh
AuthorDate: Tue Oct 19 14:40:52 2021 +0800

    [SPARK-37052][CORE] Spark should only pass --verbose argument to main class when is sql shell

    ### What changes were proposed in this pull request?
    In https://github.com/apache/spark/pull/32163, Spark started passing `--verbose` to the main class so that the spark-sql shell can use the verbose argument too. But the main classes of other shells, such as spark-shell, have interpreters that don't support `--verbose`, so we should only pass `--verbose` for the SQL shell.

    ### Why are the changes needed?
    Fix a bug.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?

    Closes #34322 from AngersZh/SPARK-37052.

    Authored-by: Angerszh
    Signed-off-by: Wenchen Fan
    (cherry picked from commit a6d3a2c84e5cdc642ed57602612f0303585c4b6e)
    Signed-off-by: Wenchen Fan
---
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 8124650..67a601b 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -853,7 +853,7 @@ private[spark] class SparkSubmit extends Logging {
     }
     sparkConf.set(SUBMIT_PYTHON_FILES, formattedPyFiles.split(",").toSeq)

-    if (args.verbose) {
+    if (args.verbose && isSqlShell(childMainClass)) {
       childArgs ++= Seq("--verbose")
     }
     (childArgs.toSeq, childClasspath.toSeq, sparkConf, childMainClass)
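The guarded argument forwarding above can be sketched as follows. Note the SQL-shell class name below is an assumption for illustration; it is not necessarily what `isSqlShell()` checks in Spark:

```python
# Sketch of the fix above: --verbose is forwarded to the child main class only
# when that class is the SQL shell. The class name is an assumed placeholder.
SQL_SHELL = "org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver"

def is_sql_shell(child_main_class: str) -> bool:
    return child_main_class == SQL_SHELL

def child_args(verbose: bool, child_main_class: str) -> list:
    child = []
    if verbose and is_sql_shell(child_main_class):  # was: `if verbose:`
        child.append("--verbose")
    return child

print(child_args(True, "org.apache.spark.repl.Main"))  # [] -- spark-shell is left alone
print(child_args(True, SQL_SHELL))                     # ['--verbose']
```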
[spark] branch master updated (ebca5232 -> a6d3a2c)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

 from ebca5232  [SPARK-36871][SQL][FOLLOWUP] Move error checking from create cmd to parser
  add a6d3a2c   [SPARK-37052][CORE] Spark should only pass --verbose argument to main class when is sql shell

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated: [SPARK-36871][SQL][FOLLOWUP] Move error checking from create cmd to parser
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ebca5232  [SPARK-36871][SQL][FOLLOWUP] Move error checking from create cmd to parser

ebca5232 is described below

commit ebca5232811cb0701d4062ac7ddc21fccc936490
Author: Huaxin Gao
AuthorDate: Tue Oct 19 14:38:49 2021 +0800

    [SPARK-36871][SQL][FOLLOWUP] Move error checking from create cmd to parser

    ### What changes were proposed in this pull request?
    Move error checking from the create command to the parser.

    ### Why are the changes needed?
    Catch errors earlier, and make the code consistent between parsing CreateFunction and CreateView.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Existing tests

    Closes #34283 from huaxingao/create_view_followup.

    Authored-by: Huaxin Gao
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/errors/QueryCompilationErrors.scala  | 26 ---
 .../spark/sql/errors/QueryParsingErrors.scala      | 34 +++
 .../spark/sql/execution/SparkSqlParser.scala       | 39 +++---
 .../spark/sql/execution/command/functions.scala    |  9 -
 .../apache/spark/sql/execution/command/views.scala | 21 +---
 5 files changed, 69 insertions(+), 60 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 385e6b7..eb8985d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -1849,19 +1849,6 @@ object QueryCompilationErrors {
     new AnalysisException("Cannot overwrite a path that is also being read from.")
   }

-  def createFuncWithBothIfNotExistsAndReplaceError(): Throwable = {
-    new AnalysisException("CREATE FUNCTION with both IF NOT EXISTS and REPLACE is not allowed.")
-  }
-
-  def defineTempFuncWithIfNotExistsError(): Throwable = {
-    new AnalysisException("It is not allowed to define a TEMPORARY function with IF NOT EXISTS.")
-  }
-
-  def specifyingDBInCreateTempFuncError(databaseName: String): Throwable = {
-    new AnalysisException(
-      s"Specifying a database in CREATE TEMPORARY FUNCTION is not allowed: '$databaseName'")
-  }
-
   def specifyingDBInDropTempFuncError(databaseName: String): Throwable = {
     new AnalysisException(
       s"Specifying a database in DROP TEMPORARY FUNCTION is not allowed: '$databaseName'")
@@ -2011,19 +1998,6 @@ object QueryCompilationErrors {
     features.map(" - " + _).mkString("\n"))
   }

-  def createViewWithBothIfNotExistsAndReplaceError(): Throwable = {
-    new AnalysisException("CREATE VIEW with both IF NOT EXISTS and REPLACE is not allowed.")
-  }
-
-  def defineTempViewWithIfNotExistsError(): Throwable = {
-    new AnalysisException("It is not allowed to define a TEMPORARY view with IF NOT EXISTS.")
-  }
-
-  def notAllowedToAddDBPrefixForTempViewError(database: String): Throwable = {
-    new AnalysisException(
-      s"It is not allowed to add database prefix `$database` for the TEMPORARY view name.")
-  }
-
   def logicalPlanForViewNotAnalyzedError(): Throwable = {
     new AnalysisException("The logical plan that represents the view is not analyzed.")
   }

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index 3af63f1..090f73d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -391,4 +391,38 @@ object QueryParsingErrors {
   def invalidGroupingSetError(element: String, ctx: GroupingAnalyticsContext): Throwable = {
     new ParseException(s"Empty set in $element grouping sets is not supported.", ctx)
   }
+
+  def createViewWithBothIfNotExistsAndReplaceError(ctx: CreateViewContext): Throwable = {
+    new ParseException("CREATE VIEW with both IF NOT EXISTS and REPLACE is not allowed.", ctx)
+  }
+
+  def defineTempViewWithIfNotExistsError(ctx: CreateViewContext): Throwable = {
+    new ParseException("It is not allowed to define a TEMPORARY view with IF NOT EXISTS.", ctx)
+  }
+
+  def notAllowedToAddDBPrefixForTempViewError(
+      database: String,
+      ctx: CreateViewContext): Throwable = {
+    new ParseException(
+      s"It is not allowed to add database prefix `$database` for the TEMPORARY view name.", ctx)
+  }
+
+  def createFuncWithBothIfNotExistsAndReplaceError(ctx: CreateFunctionContext): Throwable = {
+    new ParseException("CREATE FUNCTION with both IF NOT [...]
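The effect of the refactor above is that the invalid option combination is rejected while the statement is being parsed (as a `ParseException`) instead of later, when the command runs (as an `AnalysisException`). A minimal illustrative model of that "fail at parse time" pattern, with hypothetical names rather than Spark's parser:

```python
# Minimal model of moving a validity check from command execution into parsing.
# Names are illustrative; Spark's real types are ParseException / AnalysisException.
class ParseException(Exception):
    pass

def parse_create_view(if_not_exists: bool, replace: bool) -> dict:
    # The check now runs here, while building the plan, so a bad statement
    # fails fast and never produces a command object at all.
    if if_not_exists and replace:
        raise ParseException(
            "CREATE VIEW with both IF NOT EXISTS and REPLACE is not allowed.")
    return {"ifNotExists": if_not_exists, "replace": replace}

print(parse_create_view(True, False))  # {'ifNotExists': True, 'replace': False}
```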
[GitHub] [spark-website] gengliangwang commented on pull request #361: Add 3.2.0 release note and news and update links
gengliangwang commented on pull request #361:
URL: https://github.com/apache/spark-website/pull/361#issuecomment-946400783

Thanks all for the reviews. I will keep this open for a few more hours.
[GitHub] [spark-website] gengliangwang merged pull request #362: Update DocSearch facet filter of 3.2.0 documentation
gengliangwang merged pull request #362:
URL: https://github.com/apache/spark-website/pull/362
[spark-website] branch asf-site updated: update facetFilters (#362)
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new c734972  update facetFilters (#362)

c734972 is described below

commit c73497265d34e097f46d14976c8513aa3c90939c
Author: Gengliang Wang
AuthorDate: Tue Oct 19 14:13:40 2021 +0800

    update facetFilters (#362)

    There is a bug in updating the DocSearch facet filter in https://github.com/apache/spark/blob/master/dev/create-release/release-tag.sh: the version is not updated to the release version. This PR fixes the search function of https://spark.apache.org/docs/3.2.0/.
---
 site/docs/3.2.0/building-spark.html                   | 2 +-
 site/docs/3.2.0/cloud-integration.html                | 2 +-
 site/docs/3.2.0/cluster-overview.html                 | 2 +-
 site/docs/3.2.0/configuration.html                    | 2 +-
 site/docs/3.2.0/core-migration-guide.html             | 2 +-
 site/docs/3.2.0/graphx-programming-guide.html         | 2 +-
 site/docs/3.2.0/hadoop-provided.html                  | 2 +-
 site/docs/3.2.0/hardware-provisioning.html            | 2 +-
 site/docs/3.2.0/index.html                            | 2 +-
 site/docs/3.2.0/job-scheduling.html                   | 2 +-
 site/docs/3.2.0/migration-guide.html                  | 2 +-
 site/docs/3.2.0/ml-advanced.html                      | 2 +-
 site/docs/3.2.0/ml-ann.html                           | 2 +-
 site/docs/3.2.0/ml-classification-regression.html     | 2 +-
 site/docs/3.2.0/ml-clustering.html                    | 2 +-
 site/docs/3.2.0/ml-collaborative-filtering.html       | 2 +-
 site/docs/3.2.0/ml-datasource.html                    | 2 +-
 site/docs/3.2.0/ml-decision-tree.html                 | 2 +-
 site/docs/3.2.0/ml-ensembles.html                     | 2 +-
 site/docs/3.2.0/ml-features.html                      | 2 +-
 site/docs/3.2.0/ml-frequent-pattern-mining.html       | 2 +-
 site/docs/3.2.0/ml-guide.html                         | 2 +-
 site/docs/3.2.0/ml-linalg-guide.html                  | 2 +-
 site/docs/3.2.0/ml-linear-methods.html                | 2 +-
 site/docs/3.2.0/ml-migration-guide.html               | 2 +-
 site/docs/3.2.0/ml-pipeline.html                      | 2 +-
 site/docs/3.2.0/ml-statistics.html                    | 2 +-
 site/docs/3.2.0/ml-survival-regression.html           | 2 +-
 site/docs/3.2.0/ml-tuning.html                        | 2 +-
 site/docs/3.2.0/mllib-classification-regression.html  | 2 +-
 site/docs/3.2.0/mllib-clustering.html                 | 2 +-
 site/docs/3.2.0/mllib-collaborative-filtering.html    | 2 +-
 site/docs/3.2.0/mllib-data-types.html                 | 2 +-
 site/docs/3.2.0/mllib-decision-tree.html              | 2 +-
 site/docs/3.2.0/mllib-dimensionality-reduction.html   | 2 +-
 site/docs/3.2.0/mllib-ensembles.html                  | 2 +-
 site/docs/3.2.0/mllib-evaluation-metrics.html         | 2 +-
 site/docs/3.2.0/mllib-feature-extraction.html         | 2 +-
 site/docs/3.2.0/mllib-frequent-pattern-mining.html    | 2 +-
 site/docs/3.2.0/mllib-guide.html                      | 2 +-
 site/docs/3.2.0/mllib-isotonic-regression.html        | 2 +-
 site/docs/3.2.0/mllib-linear-methods.html             | 2 +-
 site/docs/3.2.0/mllib-naive-bayes.html                | 2 +-
 site/docs/3.2.0/mllib-optimization.html               | 2 +-
 site/docs/3.2.0/mllib-pmml-model-export.html          | 2 +-
 site/docs/3.2.0/mllib-statistics.html                 | 2 +-
 site/docs/3.2.0/monitoring.html                       | 2 +-
 site/docs/3.2.0/programming-guide.html                | 2 +-
 site/docs/3.2.0/pyspark-migration-guide.html          | 2 +-
 site/docs/3.2.0/quick-start.html                      | 2 +-
 site/docs/3.2.0/rdd-programming-guide.html            | 2 +-
 site/docs/3.2.0/running-on-kubernetes.html            | 2 +-
 site/docs/3.2.0/running-on-mesos.html [...]