Re: Skip single integration test case in Spark on K8s

2022-03-16 Thread Dongjoon Hyun
-user@spark For a cloud backend, you need to exclude the minikube-specific tests and the local-only test (SparkRemoteFileTest). -Dtest.exclude.tags=minikube,local You can find more options including SBT commands here.
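A minimal sketch of how the exclusion flag above might be combined with the Maven invocation from the original question. The module path and the two Kubernetes profiles are taken from that question; its remaining flags are left out because the archived message is truncated, so treat the argument list as an assumption, not the full command. The script only prints the command so it can be reviewed before being run from a Spark checkout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Tags to skip on a cloud backend, as suggested in the reply above.
EXCLUDE_TAGS="minikube,local"

# Assumed subset of the original command; add the other profiles and
# -D options from your own invocation as needed.
MVN_ARGS=(
  install
  -f pom.xml
  -pl resource-managers/kubernetes/integration-tests -am
  -Pkubernetes -Pkubernetes-integration-tests
  "-Dtest.exclude.tags=${EXCLUDE_TAGS}"
)

# Print the command for review instead of executing it.
echo "build/mvn ${MVN_ARGS[*]}"
```

To actually run the tests, replace the final `echo` line with `build/mvn "${MVN_ARGS[@]}"`.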

Re: Apache Spark 3.3 Release

2022-03-16 Thread Andrew Melo
Hello, I've been trying for a bit to get the following two PRs merged and into a release, and I'm having some difficulty moving them forward: https://github.com/apache/spark/pull/34903 - This passes the current python interpreter to spark-env.sh to allow some currently-unavailable customization

Re: Apache Spark 3.3 Release

2022-03-16 Thread Holden Karau
I'd like to add/backport the logging in https://github.com/apache/spark/pull/35881 PR so that when users submit issues with dynamic allocation we can better debug what's going on. On Wed, Mar 16, 2022 at 3:45 PM Chao Sun wrote: > There is one item on our side that we want to backport to 3.3: >

Re: Apache Spark 3.3 Release

2022-03-16 Thread Chao Sun
There is one item on our side that we want to backport to 3.3: - vectorized DELTA_BYTE_ARRAY/DELTA_LENGTH_BYTE_ARRAY encodings for Parquet V2 support (https://github.com/apache/spark/pull/35262) It's already reviewed and approved. On Wed, Mar 16, 2022 at 9:13 AM Tom Graves wrote: > > It looks

Re: Apache Spark 3.3 Release

2022-03-16 Thread Tom Graves
It looks like the version hasn't been updated on master and still shows 3.3.0-SNAPSHOT; can you please update that? Tom On Wednesday, March 16, 2022, 01:41:00 AM CDT, Maxim Gekk wrote: Hi All, I have created the branch for Spark 3.3:

Re: Apache Spark 3.3 Release

2022-03-16 Thread Jacky Lee
I also have a PR that has been ready to merge for a while; can we merge it in 3.3.0? [SPARK-37831][CORE] add task partition id in TaskInfo and Task Metrics https://github.com/apache/spark/pull/35185 On Wed, Mar 16, 2022 at 21:16, Adam Binford wrote: > Also throwing my hat in for two of my PRs that should be

Re: Apache Spark 3.3 Release

2022-03-16 Thread Jacky Lee
I also have a PR that has been ready to merge for a while; can we merge it in 3.3.0? [SPARK-37831][CORE] add task partition id in TaskInfo and Task Metrics https://github.com/apache/spark/pull/35185 On Wed, Mar 16, 2022 at 21:33, beliefer wrote: > +1 Glad to see we will release 3.3.0. > > > At 2022-03-04

Re: Apache Spark 3.3 Release

2022-03-16 Thread beliefer
+1 Glad to see we will release 3.3.0. At 2022-03-04 02:44:37, "Maxim Gekk" wrote: Hello All, I would like to raise the topic of the new Spark release 3.3. According to the public schedule at https://spark.apache.org/versioning-policy.html, we planned to start the code

Re: Apache Spark 3.3 Release

2022-03-16 Thread Adam Binford
Also throwing my hat in for two of my PRs that should be ready and just need final reviews/approval: Removing shuffles from deallocated executors using the shuffle service: https://github.com/apache/spark/pull/35085. This has been requested for several years across many issues. Configurable memory

Skip single integration test case in Spark on K8s

2022-03-16 Thread Pralabh Kumar
Hi Spark team I am running Spark kubernetes integration test suite on cloud. build/mvn install \ -f pom.xml \ -pl resource-managers/kubernetes/integration-tests -am -Pscala-2.12 -Phadoop-3.1.1 -Phive -Phive-thriftserver -Pyarn -Pkubernetes -Pkubernetes-integration-tests \ -Djava.version=8 \

Re: Apache Spark 3.3 Release

2022-03-16 Thread Wenchen Fan
+1 to defining an allowlist of features that we want to backport to branch 3.3. I also have a few in mind: complex type support in the vectorized Parquet reader (https://github.com/apache/spark/pull/34659), refining the DS v2 filter API for JDBC v2 (https://github.com/apache/spark/pull/35768), a few new SQL

Re: Data correctness issue with Repartition + FetchFailure

2022-03-16 Thread Wenchen Fan
It would be great if you can help with it! Basically, we need to propagate the column-level deterministic information and sort the inputs if the partition key lineage has a nondeterministic part. On Wed, Mar 16, 2022 at 5:28 AM Jason Xu wrote: > Hi Wenchen, thanks for the insight. Agree, the previous

Re: Apache Spark 3.3 Release

2022-03-16 Thread Maxim Gekk
Hi All, I have created the branch for Spark 3.3: https://github.com/apache/spark/commits/branch-3.3 Please backport important fixes to it, and if you have any doubts, ping me in the PR. Regarding new features, we are still building the allowlist for branch-3.3. Best regards, Max Gekk On