(spark) branch master updated: [SPARK-47112][INFRA] Write logs into a file in SparkR Windows build
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 0d938de112df [SPARK-47112][INFRA] Write logs into a file in SparkR Windows build
0d938de112df is described below

commit 0d938de112dfe142f40d4c6f86cfa1e6e32210ec
Author: Hyukjin Kwon
AuthorDate: Wed Feb 21 16:39:37 2024 +0900

    [SPARK-47112][INFRA] Write logs into a file in SparkR Windows build

    ### What changes were proposed in this pull request?

    We used to write Log4j logs into `target/unit-tests.log` instead of the console. This seems to be broken in the SparkR Windows job. This PR fixes it.

    ### Why are the changes needed?

    https://github.com/apache/spark/actions/runs/7977185456/job/21779508822#step:10:89

    This writes too many logs, making it difficult to see the real test cases.

    ### Does this PR introduce _any_ user-facing change?

    No, dev-only.

    ### How was this patch tested?

    In my fork.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #45192 from HyukjinKwon/reduce-logs.

    Authored-by: Hyukjin Kwon
    Signed-off-by: Hyukjin Kwon
---
 .github/workflows/build_sparkr_window.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/build_sparkr_window.yml b/.github/workflows/build_sparkr_window.yml
index 07f4ebe91ad2..fbaca36f9f87 100644
--- a/.github/workflows/build_sparkr_window.yml
+++ b/.github/workflows/build_sparkr_window.yml
@@ -71,7 +71,7 @@ jobs:
       run: |
         set HADOOP_HOME=%USERPROFILE%\hadoop-3.3.5
         set PATH=%HADOOP_HOME%\bin;%PATH%
-        .\bin\spark-submit2.cmd --driver-java-options "-Dlog4j.configuration=file:///%CD:\=/%/R/log4j2.properties" --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
+        .\bin\spark-submit2.cmd --driver-java-options "-Dlog4j.configurationFile=file:///%CD:\=/%/R/log4j2.properties" --conf spark.hadoop.fs.defaultFS="file:///" R\pkg\tests\run-all.R
       shell: cmd
       env:
         NOT_CRAN: true

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
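Editor's note: the fix swaps `-Dlog4j.configuration` (the Log4j 1 system property) for `-Dlog4j.configurationFile` (its Log4j 2 equivalent), so the test run picks up `R/log4j2.properties` again and logs land in a file rather than the console. The sketch below shows the same option being passed programmatically through `SparkLauncher`; it is illustrative only — the application resource, the properties path, and a `SPARK_HOME` environment variable are assumptions, not part of the patch.

```scala
import org.apache.spark.launcher.SparkLauncher

// Minimal sketch: point the driver at a Log4j 2 configuration file so that
// test logs go to a file such as target/unit-tests.log instead of the console.
// The app resource and properties path below are placeholders, and SPARK_HOME
// is assumed to be set in the environment.
object LaunchWithFileLogging {
  def main(args: Array[String]): Unit = {
    val handle = new SparkLauncher()
      .setAppResource("R/pkg/tests/run-all.R")
      .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS,
        "-Dlog4j.configurationFile=file:///path/to/R/log4j2.properties")
      .setConf("spark.hadoop.fs.defaultFS", "file:///")
      .startApplication()

    // Poll until the launched application reaches a terminal state.
    while (!handle.getState.isFinal) {
      Thread.sleep(1000)
    }
  }
}
```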
(spark) branch master updated: [SPARK-47113][CORE] Revert S3A endpoint fixup logic of SPARK-35878
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b3e34629080b [SPARK-47113][CORE] Revert S3A endpoint fixup logic of SPARK-35878 b3e34629080b is described below commit b3e34629080bfbbc0615bb16a961b9298c5d4756 Author: Steve Loughran AuthorDate: Tue Feb 20 23:12:11 2024 -0800 [SPARK-47113][CORE] Revert S3A endpoint fixup logic of SPARK-35878 ### What changes were proposed in this pull request? Revert [SPARK-35878][CORE] Add fs.s3a.endpoint if unset and fs.s3a.endpoint.region is null Removing the region/endpoint patching code of SPARK-35878 avoids authentication problems with versions of the S3A connector built with AWS v2 SDK -as is the case in Hadoop 3.4.0. That is: if fs.s3a.endpoint is unset it will stay unset. The v2 SDK does its binding to AWS Services differently, in what can be described as "region first" binding. Spark setting the endpoint blocks S3 Express support and is incompatible with HADOOP-18975 S3A: Add option fs.s3a.endpoint.fips to use AWS FIPS endpoints - https://github.com/apache/hadoop/pull/6277 The change is compatible with all releases of the s3a connector other than hadoop 3.3.1 binaries deployed outside EC2 and without the endpoint explicitly set. ### Why are the changes needed? AWS v2 SDK has a different/complex binding mechanism; it doesn't need the endpoint to be set if the region (fs.s3a.region) value is set. This means the spark code to fix an endpoint is not only un-needed, it causes problems when trying to use specific storage options (S3 Express) or security options (FIPS) ### Does this PR introduce _any_ user-facing change? Only visible on hadoop 3.3.1 s3a connector when deployed outside of EC2 -the situation the original patch was added to work around. All other 3.3.x releases are good. ### How was this patch tested? Removed some obsolete tests. Relying on github and jenkins to do the testing so marking this PR as WiP until they are happy. ### Was this patch authored or co-authored using generative AI tooling? No Closes #45193 from dongjoon-hyun/SPARK-47113. Authored-by: Steve Loughran Signed-off-by: Dongjoon Hyun --- .../org/apache/spark/deploy/SparkHadoopUtil.scala | 10 --- .../apache/spark/deploy/SparkHadoopUtilSuite.scala | 33 -- 2 files changed, 43 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala index 628b688dedba..2edd80db2637 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala @@ -529,16 +529,6 @@ private[spark] object SparkHadoopUtil extends Logging { if (conf.getOption("spark.hadoop.fs.s3a.downgrade.syncable.exceptions").isEmpty) { hadoopConf.set("fs.s3a.downgrade.syncable.exceptions", "true", setBySpark) } -// In Hadoop 3.3.1, AWS region handling with the default "" endpoint only works -// in EC2 deployments or when the AWS CLI is installed. -// The workaround is to set the name of the S3 endpoint explicitly, -// if not already set. See HADOOP-17771. 
-if (hadoopConf.get("fs.s3a.endpoint", "").isEmpty && - hadoopConf.get("fs.s3a.endpoint.region") == null) { - // set to US central endpoint which can also connect to buckets - // in other regions at the expense of a HEAD request during fs creation - hadoopConf.set("fs.s3a.endpoint", "s3.amazonaws.com", setBySpark) -} } private def appendSparkHiveConfigs(conf: SparkConf, hadoopConf: Configuration): Unit = { diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala index 2326d10d4164..9a81cb947257 100644 --- a/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala @@ -39,19 +39,6 @@ class SparkHadoopUtilSuite extends SparkFunSuite { assertConfigMatches(hadoopConf, "orc.filterPushdown", "true", SOURCE_SPARK_HADOOP) assertConfigMatches(hadoopConf, "fs.s3a.downgrade.syncable.exceptions", "true", SET_TO_DEFAULT_VALUES) -assertConfigMatches(hadoopConf, "fs.s3a.endpoint", "s3.amazonaws.com", SET_TO_DEFAULT_VALUES) - } - - /** - * An empty S3A endpoint will be overridden just as a null value - * would. - */ - test("appendSparkHadoopConfigs with S3A endpoint set to empty string") { -val sc = new SparkConf() -val hadoopConf = new Configuration(false) -sc.set("spark.hadoop.fs.s3a.endpoint", "") -
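Editor's note: the practical takeaway of this revert for users on an AWS v2 SDK based S3A connector is to bind by region instead of by endpoint. A minimal sketch follows; the bucket, path, and region are placeholders, and `hadoop-aws` is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sketch: with Hadoop 3.4.0+ (AWS v2 SDK), leave fs.s3a.endpoint
// unset and supply the region directly. Bucket, path, and region are placeholders.
val spark = SparkSession.builder()
  .appName("s3a-region-binding")
  .config("spark.hadoop.fs.s3a.endpoint.region", "eu-west-1")
  // .config("spark.hadoop.fs.s3a.endpoint.fips", "true")  // optional; FIPS (HADOOP-18975) requires no endpoint to be set
  .getOrCreate()

spark.read.parquet("s3a://some-bucket/some/path").show()
```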
(spark) branch master updated: [SPARK-47115][INFRA] Use larger memory for Maven builds
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f4bb6528596b [SPARK-47115][INFRA] Use larger memory for Maven builds f4bb6528596b is described below commit f4bb6528596b6809391664e594be8c7a154529d8 Author: Hyukjin Kwon AuthorDate: Wed Feb 21 16:06:26 2024 +0900 [SPARK-47115][INFRA] Use larger memory for Maven builds ### What changes were proposed in this pull request? This PR proposes to use bigger memory during Maven builds. GitHub Actions runners now have more memory than before (https://docs.github.com/en/actions/using-github-hosted-runners/about-larger-runners/about-larger-runners) so we can increase. https://github.com/HyukjinKwon/spark/actions/runs/7984135094/job/21800463337 ### Why are the changes needed? For stable Maven builds. Some tests consistently fail: ``` *** RUN ABORTED *** An exception or error caused a run to abort: unable to create native thread: possibly out of memory or process/resource limits reached java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached at java.base/java.lang.Thread.start0(Native Method) at java.base/java.lang.Thread.start(Thread.java:1553) at java.base/java.lang.System$2.start(System.java:2577) at java.base/jdk.internal.vm.SharedThreadContainer.start(SharedThreadContainer.java:152) at java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:953) at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1364) at org.apache.spark.rpc.netty.SharedMessageLoop.$anonfun$threadpool$1(MessageLoop.scala:128) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190) at org.apache.spark.rpc.netty.SharedMessageLoop.(MessageLoop.scala:127) at org.apache.spark.rpc.netty.Dispatcher.sharedLoop$lzycompute(Dispatcher.scala:46) ... Warning: The requested profile "volcano" could not be activated because it does not exist. Warning: The requested profile "hive" could not be activated because it does not exist. Error: Failed to execute goal org.scalatest:scalatest-maven-plugin:2.2.0:test (test) on project spark-core_2.13: There are test failures -> [Help 1] Error: Error: To see the full stack trace of the errors, re-run Maven with the -e switch. Error: Re-run Maven using the -X switch to enable full debug logging. Error: Error: For more information about the errors and possible solutions, please read the following articles: Error: [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException Error: Error: After correcting the problems, you can resume the build with the command Error:mvn -rf :spark-core_2.13 Error: Process completed with exit code 1. ``` ### Does this PR introduce _any_ user-facing change? No, dev-only ### How was this patch tested? Will monitor the scheduled jobs. It's a simple memory configuration change. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45195 from HyukjinKwon/bigger-macos. 
Authored-by: Hyukjin Kwon Signed-off-by: Hyukjin Kwon --- .github/workflows/maven_test.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/maven_test.yml b/.github/workflows/maven_test.yml index d63066a521f9..b2f0118d2e7b 100644 --- a/.github/workflows/maven_test.yml +++ b/.github/workflows/maven_test.yml @@ -185,7 +185,7 @@ jobs: - name: Run tests env: ${{ fromJSON(inputs.envs) }} run: | - export MAVEN_OPTS="-Xss64m -Xmx4g -Xms4g -XX:ReservedCodeCacheSize=128m -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN" + export MAVEN_OPTS="-Xss64m -Xmx6g -Xms6g -XX:ReservedCodeCacheSize=128m -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN" export MAVEN_CLI_OPTS="--no-transfer-progress" export JAVA_VERSION=${{ matrix.java }} # Replace with the real module name, for example, connector#kafka-0-10 -> connector/kafka-0-10 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated: Revert "[SPARK-35878][CORE] Revert S3A endpoint fixup logic of SPARK-35878"
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 011b843687d8 Revert "[SPARK-35878][CORE] Revert S3A endpoint fixup logic of SPARK-35878" 011b843687d8 is described below commit 011b843687d8ae36b03e8d3d177b0bf43e7d29b6 Author: Dongjoon Hyun AuthorDate: Tue Feb 20 22:26:56 2024 -0800 Revert "[SPARK-35878][CORE] Revert S3A endpoint fixup logic of SPARK-35878" This reverts commit 36f199d1e41276c78036355eac1dac092e65aabe. --- .../org/apache/spark/deploy/SparkHadoopUtil.scala | 10 +++ .../apache/spark/deploy/SparkHadoopUtilSuite.scala | 33 ++ 2 files changed, 43 insertions(+) diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala index 2edd80db2637..628b688dedba 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala @@ -529,6 +529,16 @@ private[spark] object SparkHadoopUtil extends Logging { if (conf.getOption("spark.hadoop.fs.s3a.downgrade.syncable.exceptions").isEmpty) { hadoopConf.set("fs.s3a.downgrade.syncable.exceptions", "true", setBySpark) } +// In Hadoop 3.3.1, AWS region handling with the default "" endpoint only works +// in EC2 deployments or when the AWS CLI is installed. +// The workaround is to set the name of the S3 endpoint explicitly, +// if not already set. See HADOOP-17771. +if (hadoopConf.get("fs.s3a.endpoint", "").isEmpty && + hadoopConf.get("fs.s3a.endpoint.region") == null) { + // set to US central endpoint which can also connect to buckets + // in other regions at the expense of a HEAD request during fs creation + hadoopConf.set("fs.s3a.endpoint", "s3.amazonaws.com", setBySpark) +} } private def appendSparkHiveConfigs(conf: SparkConf, hadoopConf: Configuration): Unit = { diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala b/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala index 9a81cb947257..2326d10d4164 100644 --- a/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala +++ b/core/src/test/scala/org/apache/spark/deploy/SparkHadoopUtilSuite.scala @@ -39,6 +39,19 @@ class SparkHadoopUtilSuite extends SparkFunSuite { assertConfigMatches(hadoopConf, "orc.filterPushdown", "true", SOURCE_SPARK_HADOOP) assertConfigMatches(hadoopConf, "fs.s3a.downgrade.syncable.exceptions", "true", SET_TO_DEFAULT_VALUES) +assertConfigMatches(hadoopConf, "fs.s3a.endpoint", "s3.amazonaws.com", SET_TO_DEFAULT_VALUES) + } + + /** + * An empty S3A endpoint will be overridden just as a null value + * would. 
+ */ + test("appendSparkHadoopConfigs with S3A endpoint set to empty string") { +val sc = new SparkConf() +val hadoopConf = new Configuration(false) +sc.set("spark.hadoop.fs.s3a.endpoint", "") +new SparkHadoopUtil().appendSparkHadoopConfigs(sc, hadoopConf) +assertConfigMatches(hadoopConf, "fs.s3a.endpoint", "s3.amazonaws.com", SET_TO_DEFAULT_VALUES) } /** @@ -48,8 +61,28 @@ class SparkHadoopUtilSuite extends SparkFunSuite { val sc = new SparkConf() val hadoopConf = new Configuration(false) sc.set("spark.hadoop.fs.s3a.downgrade.syncable.exceptions", "false") +sc.set("spark.hadoop.fs.s3a.endpoint", "s3-eu-west-1.amazonaws.com") new SparkHadoopUtil().appendSparkHadoopConfigs(sc, hadoopConf) assertConfigValue(hadoopConf, "fs.s3a.downgrade.syncable.exceptions", "false") +assertConfigValue(hadoopConf, "fs.s3a.endpoint", + "s3-eu-west-1.amazonaws.com") + } + + /** + * If the endpoint region is set (even to a blank string) in + * "spark.hadoop.fs.s3a.endpoint.region" then the endpoint is not set, + * even when the s3a endpoint is "". + * This supports a feature in hadoop 3.3.1 where this configuration + * pair triggers a revert to the "SDK to work out the region" algorithm, + * which works on EC2 deployments. + */ + test("appendSparkHadoopConfigs with S3A endpoint region set to an empty string") { +val sc = new SparkConf() +val hadoopConf = new Configuration(false) +sc.set("spark.hadoop.fs.s3a.endpoint.region", "") +new SparkHadoopUtil().appendSparkHadoopConfigs(sc, hadoopConf) +// the endpoint value will not have been set +assertConfigValue(hadoopConf, "fs.s3a.endpoint", null) } /** - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated (0b907ed11e6e -> 36f199d1e412)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 0b907ed11e6e [SPARK-46928][SS] Add support for ListState in Arbitrary State API v2
     add 36f199d1e412 [SPARK-35878][CORE] Revert S3A endpoint fixup logic of SPARK-35878

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/deploy/SparkHadoopUtil.scala | 10 ---
 .../apache/spark/deploy/SparkHadoopUtilSuite.scala | 33 --
 2 files changed, 43 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated: [SPARK-46928][SS] Add support for ListState in Arbitrary State API v2
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0b907ed11e6e [SPARK-46928][SS] Add support for ListState in Arbitrary State API v2 0b907ed11e6e is described below commit 0b907ed11e6ec6bc4c7d07926ed352806636d58a Author: Bhuwan Sahni AuthorDate: Wed Feb 21 14:43:27 2024 +0900 [SPARK-46928][SS] Add support for ListState in Arbitrary State API v2 ### What changes were proposed in this pull request? This PR adds changes for ListState implementation in State Api v2. As a list contains multiple values for a single key, we utilize RocksDB merge operator to persist multiple values. Changes include 1. A new encoder/decoder to encode multiple values inside a single byte[] array (stored in RocksDB). The encoding scheme is compatible with RocksDB StringAppendOperator merge operator. 2. Support merge operations in ChangelogCheckpointing v2. 3. Extend StateStore to support merge operation, and read multiple values for a single key (via a Iterator). Note that these changes are only supported for RocksDB currently. ### Why are the changes needed? These changes are needed to support list values in the State Store. The changes are part of the work around adding new stateful streaming operator for arbitrary state mgmt that provides a bunch of new features listed in the SPIP JIRA here - https://issues.apache.org/jira/browse/SPARK-45939 ### Does this PR introduce _any_ user-facing change? Yes This PR introduces a new state type (ListState) that users can use in their Spark streaming queries. ### How was this patch tested? 1. Added a new test suite for ListState to ensure the state produces correct results. 2. Added additional testcases for input validation. 3. Added tests for merge operator with RocksDB. 4. Added tests for changelog checkpointing merge operator. 5. Added tests for reading merged values in RocksDBStateStore. ### Was this patch authored or co-authored using generative AI tooling? No Closes #44961 from sahnib/state-api-v2-list-state. 
Authored-by: Bhuwan Sahni Signed-off-by: Jungtaek Lim --- .../src/main/resources/error/error-classes.json| 18 ++ ...itions-illegal-state-store-value-error-class.md | 41 +++ docs/sql-error-conditions.md | 8 + .../{ValueState.scala => ListState.scala} | 30 +- .../sql/streaming/StatefulProcessorHandle.scala| 10 + .../apache/spark/sql/streaming/ValueState.scala| 2 +- .../v2/state/StatePartitionReader.scala| 2 +- .../sql/execution/streaming/ListStateImpl.scala| 121 .../streaming/StateTypesEncoderUtils.scala | 88 ++ .../streaming/StatefulProcessorHandleImpl.scala| 8 +- .../streaming/TransformWithStateExec.scala | 6 +- .../sql/execution/streaming/ValueStateImpl.scala | 61 +--- .../state/HDFSBackedStateStoreProvider.scala | 27 +- .../sql/execution/streaming/state/RocksDB.scala| 37 +++ .../streaming/state/RocksDBStateEncoder.scala | 96 +- .../state/RocksDBStateStoreProvider.scala | 53 +++- .../sql/execution/streaming/state/StateStore.scala | 53 +++- .../streaming/state/StateStoreChangelog.scala | 48 ++- .../streaming/state/StateStoreErrors.scala | 12 + .../execution/streaming/state/StateStoreRDD.scala | 5 +- .../state/SymmetricHashJoinStateManager.scala | 3 +- .../sql/execution/streaming/state/package.scala| 6 +- .../streaming/state/MemoryStateStore.scala | 11 +- .../streaming/state/RocksDBStateStoreSuite.scala | 56 +++- .../execution/streaming/state/RocksDBSuite.scala | 84 ++ .../streaming/state/StateStoreSuite.scala | 7 +- .../streaming/state/ValueStateSuite.scala | 12 +- .../apache/spark/sql/streaming/StreamSuite.scala | 3 +- .../streaming/TransformWithListStateSuite.scala| 328 + .../sql/streaming/TransformWithStateSuite.scala| 2 +- 30 files changed, 1120 insertions(+), 118 deletions(-) diff --git a/common/utils/src/main/resources/error/error-classes.json b/common/utils/src/main/resources/error/error-classes.json index c1b1171b5dc8..b30b1d60bb4a 100644 --- a/common/utils/src/main/resources/error/error-classes.json +++ b/common/utils/src/main/resources/error/error-classes.json @@ -1380,6 +1380,24 @@ ], "sqlState" : "42601" }, + "ILLEGAL_STATE_STORE_VALUE" : { +"message" : [ + "Illegal value provided to the State Store" +], +"subClass" : { + "EMPTY_LIST_VALUE" : { +"message" : [ + "Cannot write empty list values to State Store for StateName ." +] + }, + "NULL_VALUE" : { +
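Editor's note: to make the first bullet of the PR description concrete, the point of the encoder is that several values can live in one RocksDB value and the store can grow that value by appending, which is what a merge operator does. The sketch below is illustrative only: it length-prefixes each element and ignores the delimiter handling of RocksDB's actual StringAppendOperator, so it is not the encoder added by this PR.

```scala
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

// Illustrative sketch: each element is stored as [4-byte length][payload], so
// appending another encoded element to the end of the existing byte[] (which is
// effectively what a concatenating merge operator does) keeps the value decodable.
object MultiValueCodec {
  def encode(value: Array[Byte]): Array[Byte] = {
    val buf = ByteBuffer.allocate(4 + value.length)
    buf.putInt(value.length)
    buf.put(value)
    buf.array()
  }

  def decodeAll(merged: Array[Byte]): Seq[Array[Byte]] = {
    val out = ArrayBuffer.empty[Array[Byte]]
    val buf = ByteBuffer.wrap(merged)
    while (buf.remaining() >= 4) {
      val len = buf.getInt()
      val payload = new Array[Byte](len)
      buf.get(payload)
      out += payload
    }
    out.toSeq
  }
}

// Two "appends" concatenated the way a merge would accumulate them.
val merged = MultiValueCodec.encode("a".getBytes("UTF-8")) ++
  MultiValueCodec.encode("bc".getBytes("UTF-8"))
assert(MultiValueCodec.decodeAll(merged).map(new String(_, "UTF-8")) == Seq("a", "bc"))
```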
(spark) branch master updated: [SPARK-47095][INFRA][FOLLOW-UP] Remove TTY specific workaround in Maven build
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b96914b637ee [SPARK-47095][INFRA][FOLLOW-UP] Remove TTY specific workaround in Maven build b96914b637ee is described below commit b96914b637ee0692f3c836c2637863704c6b73fa Author: Hyukjin Kwon AuthorDate: Tue Feb 20 21:42:12 2024 -0800 [SPARK-47095][INFRA][FOLLOW-UP] Remove TTY specific workaround in Maven build ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/45171 that broke the scheduled build of macos-14. Here I remove TTY specific workaround in Maven build, and skips `AmmoniteTest` that needs the workaround. We should enable the tests back when the bug is fixed (see https://github.com/apache/spark/pull/40675#issuecomment-1513102087) ### Why are the changes needed? To fix up the build, It fails https://github.com/apache/spark/actions/runs/7979285164 See also https://github.com/apache/spark/pull/45186#discussion_r1496839930 ### Does this PR introduce _any_ user-facing change? No, test-only. ### How was this patch tested? In my fork. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45186 from HyukjinKwon/SPARK-47095-followup. Lead-authored-by: Hyukjin Kwon Co-authored-by: Hyukjin Kwon Signed-off-by: Dongjoon Hyun --- .github/workflows/maven_test.yml | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/.github/workflows/maven_test.yml b/.github/workflows/maven_test.yml index 80898b3a507a..d63066a521f9 100644 --- a/.github/workflows/maven_test.yml +++ b/.github/workflows/maven_test.yml @@ -73,13 +73,19 @@ jobs: connector#kafka-0-10,connector#kafka-0-10-sql,connector#kafka-0-10-token-provider,connector#spark-ganglia-lgpl,connector#protobuf,connector#avro - >- sql#api,sql#catalyst,resource-managers#yarn,resource-managers#kubernetes#core - - >- -connect # Here, we split Hive and SQL tests into some of slow ones and the rest of them. included-tags: [ "" ] excluded-tags: [ "" ] comment: [ "" ] include: + # Connect tests + - modules: connect +java: ${{ inputs.java }} +hadoop: ${{ inputs.hadoop }} +hive: hive2.3 +# TODO(SPARK-47110): Reenble AmmoniteTest tests in Maven builds +excluded-tags: org.apache.spark.tags.AmmoniteTest +comment: "" # Hive tests - modules: sql#hive java: ${{ inputs.java }} @@ -178,13 +184,7 @@ jobs: # Run the tests. - name: Run tests env: ${{ fromJSON(inputs.envs) }} -# The command script takes different options ubuntu vs macos-14, see also SPARK-47095. -shell: '[[ "${{ inputs.os }}" == *"ubuntu"* ]] && script -q -e -c "bash {0}" || script -q -e "bash {0}"' run: | - # Fix for TTY related issues when launching the Ammonite REPL in tests. 
- export TERM=vt100 - # `set -e` to make the exit status as expected due to use `script -q -e -c` to run the commands - set -e export MAVEN_OPTS="-Xss64m -Xmx4g -Xms4g -XX:ReservedCodeCacheSize=128m -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN" export MAVEN_CLI_OPTS="--no-transfer-progress" export JAVA_VERSION=${{ matrix.java }} @@ -193,10 +193,10 @@ jobs: ./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Djava.version=${JAVA_VERSION/-ea} clean install if [[ "$INCLUDED_TAGS" != "" ]]; then ./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Djava.version=${JAVA_VERSION/-ea} -Dtest.include.tags="$INCLUDED_TAGS" test -fae + elif [[ "$MODULES_TO_TEST" == "connect" ]]; then +./build/mvn $MAVEN_CLI_OPTS -Dtest.exclude.tags="$EXCLUDED_TAGS" -Djava.version=${JAVA_VERSION/-ea} -pl connector/connect/client/jvm,connector/connect/common,connector/connect/server test -fae elif [[ "$EXCLUDED_TAGS" != "" ]]; then ./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Djava.version=${JAVA_VERSION/-ea} -Dtest.exclude.tags="$EXCLUDED_TAGS" test -fae - elif [[ "$MODULES_TO_TEST" == "connect" ]]; then -./build/mvn $MAVEN_CLI_OPTS -Djava.version=${JAVA_VERSION/-ea} -pl connector/connect/client/jvm,connector/connect/common,connector/co
(spark) branch master updated: [SPARK-47052][SS] Separate state tracking variables from MicroBatchExecution/StreamExecution
This is an automated email from the ASF dual-hosted git repository. kabhwan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new bffa92c838d6 [SPARK-47052][SS] Separate state tracking variables from MicroBatchExecution/StreamExecution bffa92c838d6 is described below commit bffa92c838d6650249a6e71bb0ef8189cf970383 Author: Jerry Peng AuthorDate: Wed Feb 21 12:58:48 2024 +0900 [SPARK-47052][SS] Separate state tracking variables from MicroBatchExecution/StreamExecution ### What changes were proposed in this pull request? To improve code clarity and maintainability, I propose that we move all the variables that track mutable state and metrics for a streaming query into a separate class. With this refactor, it would be easy to track and find all the mutable state a microbatch can have. ### Why are the changes needed? To improve code clarity and maintainability. All the state and metrics that is needed for the execution lifecycle of a microbatch is consolidated into one class. If we decide to modify or add additional state to a streaming query, it will be easier to determine 1) where to add it 2) what existing state are there. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests should suffice ### Was this patch authored or co-authored using generative AI tooling? No Closes #45109 from jerrypeng/SPARK-47052. Authored-by: Jerry Peng Signed-off-by: Jungtaek Lim --- .../sql/execution/streaming/AsyncLogPurge.scala| 11 +- .../AsyncProgressTrackingMicroBatchExecution.scala | 30 +- .../execution/streaming/MicroBatchExecution.scala | 422 ++--- .../sql/execution/streaming/ProgressReporter.scala | 521 + .../sql/execution/streaming/StreamExecution.scala | 112 +++-- .../streaming/StreamExecutionContext.scala | 233 + .../sql/execution/streaming/TriggerExecutor.scala | 24 +- .../streaming/continuous/ContinuousExecution.scala | 56 ++- .../streaming/ProcessingTimeExecutorSuite.scala| 6 +- 9 files changed, 945 insertions(+), 470 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncLogPurge.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncLogPurge.scala index b3729dbc7b45..aa393211a1c1 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncLogPurge.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncLogPurge.scala @@ -29,11 +29,8 @@ import org.apache.spark.util.ThreadUtils */ trait AsyncLogPurge extends Logging { - protected var currentBatchId: Long - protected val minLogEntriesToMaintain: Int - protected[sql] val errorNotifier: ErrorNotifier protected val sparkSession: SparkSession @@ -47,15 +44,11 @@ trait AsyncLogPurge extends Logging { protected lazy val useAsyncPurge: Boolean = sparkSession.conf.get(SQLConf.ASYNC_LOG_PURGE) - protected def purgeAsync(): Unit = { + protected def purgeAsync(batchId: Long): Unit = { if (purgeRunning.compareAndSet(false, true)) { - // save local copy because currentBatchId may get updated. 
There are not really - // any concurrency issues here in regards to calculating the purge threshold - // but for the sake of defensive coding lets make a copy - val currentBatchIdCopy: Long = currentBatchId asyncPurgeExecutorService.execute(() => { try { - purge(currentBatchIdCopy - minLogEntriesToMaintain) + purge(batchId - minLogEntriesToMaintain) } catch { case throwable: Throwable => logError("Encountered error while performing async log purge", throwable) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala index 206efb9a5450..ec24ec0fd335 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/AsyncProgressTrackingMicroBatchExecution.scala @@ -110,12 +110,12 @@ class AsyncProgressTrackingMicroBatchExecution( } } - override def markMicroBatchExecutionStart(): Unit = { + override def markMicroBatchExecutionStart(execCtx: MicroBatchExecutionContext): Unit = { // check if streaming query is stateful checkNotStatefulStreamingQuery } - override def cleanUpLastExecutedMicroBatch(): Unit = { + override def cleanUpLastExecutedMicroBatch(execCtx: MicroBatchExecutionContext): Unit = { // this is a no op for async progress tracking si
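Editor's note: the shape of this refactor is replacing shared mutable fields (such as the `currentBatchId` that `purgeAsync` above had to copy defensively) with an explicit per-batch context passed into each method. The sketch below illustrates that pattern in isolation; the class and method names are invented for illustration and do not match the real `MicroBatchExecutionContext` added by the PR.

```scala
// Illustrative sketch of the "execution context" pattern: instead of a long-lived
// executor mutating fields like currentBatchId, each batch carries its own
// immutable snapshot of the state it needs. Names here are hypothetical.
final case class BatchExecutionContext(batchId: Long, startTimeMs: Long)

class MicroBatchRunner(minLogEntriesToMaintain: Int) {
  // Pure function of the context: no shared mutable field to copy defensively.
  def purgeAsync(ctx: BatchExecutionContext): Unit = {
    val threshold = ctx.batchId - minLogEntriesToMaintain
    println(s"purging metadata log entries below batch $threshold")
  }

  def runBatch(batchId: Long): Unit = {
    val ctx = BatchExecutionContext(batchId, System.currentTimeMillis())
    // ... plan, execute, commit the batch using ctx ...
    purgeAsync(ctx)
  }
}

new MicroBatchRunner(minLogEntriesToMaintain = 100).runBatch(batchId = 105)
```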
(spark) branch master updated: [SPARK-47111][SQL][TESTS] Upgrade `PostgreSQL` JDBC driver to 42.7.2 and docker image to 16.2
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6e6197073a57 [SPARK-47111][SQL][TESTS] Upgrade `PostgreSQL` JDBC driver to 42.7.2 and docker image to 16.2 6e6197073a57 is described below commit 6e6197073a57a40502f12e9fc0cfd8f2d5f84585 Author: Dongjoon Hyun AuthorDate: Tue Feb 20 19:49:44 2024 -0800 [SPARK-47111][SQL][TESTS] Upgrade `PostgreSQL` JDBC driver to 42.7.2 and docker image to 16.2 ### What changes were proposed in this pull request? This PR aims to upgrade `PostgreSQL` JDBC driver and docker images. - JDBC Driver: `org.postgresql:postgresql` from 42.7.0 to 42.7.2 - Docker Image: `postgres` from `15.1-alpine` to `16.2-alpine` ### Why are the changes needed? To use the latest PostgreSQL combination in the following integration tests. - PostgresIntegrationSuite - PostgresKrbIntegrationSuite - GeneratedSubquerySuite - PostgreSQLQueryTestSuite - v2/PostgresIntegrationSuite - v2/PostgresNamespaceSuite ### Does this PR introduce _any_ user-facing change? No. This is a pure test-environment update. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45191 from dongjoon-hyun/SPARK-47111. Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala | 6 +++--- .../org/apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala | 6 +++--- .../apache/spark/sql/jdbc/querytest/GeneratedSubquerySuite.scala| 6 +++--- .../apache/spark/sql/jdbc/querytest/PostgreSQLQueryTestSuite.scala | 6 +++--- .../org/apache/spark/sql/jdbc/v2/PostgresIntegrationSuite.scala | 6 +++--- .../scala/org/apache/spark/sql/jdbc/v2/PostgresNamespaceSuite.scala | 6 +++--- pom.xml | 2 +- 7 files changed, 19 insertions(+), 19 deletions(-) diff --git a/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala b/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala index 92d1c3761ba8..968ca09cb3d5 100644 --- a/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala +++ b/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala @@ -30,9 +30,9 @@ import org.apache.spark.sql.types.{ArrayType, DecimalType, FloatType, ShortType} import org.apache.spark.tags.DockerTest /** - * To run this test suite for a specific version (e.g., postgres:15.1): + * To run this test suite for a specific version (e.g., postgres:16.2): * {{{ - * ENABLE_DOCKER_INTEGRATION_TESTS=1 POSTGRES_DOCKER_IMAGE_NAME=postgres:15.1 + * ENABLE_DOCKER_INTEGRATION_TESTS=1 POSTGRES_DOCKER_IMAGE_NAME=postgres:16.2 * ./build/sbt -Pdocker-integration-tests * "docker-integration-tests/testOnly org.apache.spark.sql.jdbc.PostgresIntegrationSuite" * }}} @@ -40,7 +40,7 @@ import org.apache.spark.tags.DockerTest @DockerTest class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite { override val db = new DatabaseOnDocker { -override val imageName = sys.env.getOrElse("POSTGRES_DOCKER_IMAGE_NAME", "postgres:15.1-alpine") +override val imageName = sys.env.getOrElse("POSTGRES_DOCKER_IMAGE_NAME", "postgres:16.2-alpine") override val env = Map( "POSTGRES_PASSWORD" -> "rootpass" ) diff --git 
a/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala b/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala index 92c3378b4065..d08be3b5f40e 100644 --- a/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala +++ b/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresKrbIntegrationSuite.scala @@ -25,9 +25,9 @@ import org.apache.spark.sql.execution.datasources.jdbc.connection.SecureConnecti import org.apache.spark.tags.DockerTest /** - * To run this test suite for a specific version (e.g., postgres:15.1): + * To run this test suite for a specific version (e.g., postgres:16.2): * {{{ - * ENABLE_DOCKER_INTEGRATION_TESTS=1 POSTGRES_DOCKER_IMAGE_NAME=postgres:15.1 + * ENABLE_DOCKER_INTEGRATION_TESTS=1 POSTGRES_DOCKER_IMAGE_NAME=postgres:16.2 * ./build/sbt -Pdocker-integration-tests * "docker-integration-tests/testOnly *PostgresKrbIntegrationSuite" * }}} @@ -38,7 +38,7 @@ class PostgresKrbIntegrationSuite extends DockerKrbJDBCIntegrationSuite { o
(spark) branch master updated: [SPARK-47109][BUILD] Upgrade `commons-compress` to 1.26.0
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 434acaf48849 [SPARK-47109][BUILD] Upgrade `commons-compress` to 1.26.0 434acaf48849 is described below commit 434acaf48849263986d83480002d8d969eb33f12 Author: Dongjoon Hyun AuthorDate: Tue Feb 20 18:56:42 2024 -0800 [SPARK-47109][BUILD] Upgrade `commons-compress` to 1.26.0 ### What changes were proposed in this pull request? This PR aims to upgrade `commons-compress` to 1.26.0. ### Why are the changes needed? To bring the latest bug fixes. - https://commons.apache.org/proper/commons-compress/changes-report.html#a1.26.0 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45189 from dongjoon-hyun/SPARK-47109. Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- dev/deps/spark-deps-hadoop-3-hive-2.3 | 2 +- pom.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3 index cc0145e004a0..97205011e265 100644 --- a/dev/deps/spark-deps-hadoop-3-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-3-hive-2.3 @@ -40,7 +40,7 @@ commons-codec/1.16.1//commons-codec-1.16.1.jar commons-collections/3.2.2//commons-collections-3.2.2.jar commons-collections4/4.4//commons-collections4-4.4.jar commons-compiler/3.1.9//commons-compiler-3.1.9.jar -commons-compress/1.25.0//commons-compress-1.25.0.jar +commons-compress/1.26.0//commons-compress-1.26.0.jar commons-crypto/1.1.0//commons-crypto-1.1.0.jar commons-dbcp/1.4//commons-dbcp-1.4.jar commons-io/2.15.1//commons-io-2.15.1.jar diff --git a/pom.xml b/pom.xml index f7acb65b991e..b56fb857ee46 100644 --- a/pom.xml +++ b/pom.xml @@ -192,7 +192,7 @@ 1.1.10.5 3.0.3 1.16.1 -1.25.0 +1.26.0 2.15.1 2.6 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated (8ede494ad6c7 -> 76575ee7481c)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 8ede494ad6c7 [SPARK-46906][SS] Add a check for stateful operator change for streaming
     add 76575ee7481c [MINOR][SQL] Remove `toLowerCase(Locale.ROOT)` for `CATALOG_IMPLEMENTATION`

No new revisions were added by this update.

Summary of changes:
 repl/src/main/scala/org/apache/spark/repl/Main.scala | 5 +
 sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala | 5 ++---
 2 files changed, 3 insertions(+), 7 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch dependabot/maven/org.postgresql-postgresql-42.7.2 deleted (was 3ee67ff7af76)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/org.postgresql-postgresql-42.7.2
in repository https://gitbox.apache.org/repos/asf/spark.git

     was 3ee67ff7af76 Bump org.postgresql:postgresql from 42.7.0 to 42.7.2

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch dependabot/maven/org.apache.commons-commons-compress-1.26.0 deleted (was 3488d39e5396)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/org.apache.commons-commons-compress-1.26.0
in repository https://gitbox.apache.org/repos/asf/spark.git

     was 3488d39e5396 Bump org.apache.commons:commons-compress from 1.25.0 to 1.26.0

The revisions that were on this branch are still contained in other references;
therefore, this change does not discard any commits from the repository.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch dependabot/maven/org.apache.commons-commons-compress-1.26.0 created (now 3488d39e5396)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/org.apache.commons-commons-compress-1.26.0
in repository https://gitbox.apache.org/repos/asf/spark.git

      at 3488d39e5396 Bump org.apache.commons:commons-compress from 1.25.0 to 1.26.0

No new revisions were added by this update.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch dependabot/maven/org.postgresql-postgresql-42.7.2 created (now 3ee67ff7af76)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/org.postgresql-postgresql-42.7.2
in repository https://gitbox.apache.org/repos/asf/spark.git

      at 3ee67ff7af76 Bump org.postgresql:postgresql from 42.7.0 to 42.7.2

No new revisions were added by this update.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated (eb71268a6a38 -> 8ede494ad6c7)
This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from eb71268a6a38 [SPARK-47108][CORE] Set `derby.connection.requireAuthentication` to `false` explicitly in CLIs
     add 8ede494ad6c7 [SPARK-46906][SS] Add a check for stateful operator change for streaming

No new revisions were added by this update.

Summary of changes:
 .../src/main/resources/error/error-classes.json| 7 ++
 docs/sql-error-conditions.md | 7 ++
 .../spark/sql/errors/QueryExecutionErrors.scala| 13
 .../v2/state/metadata/StateMetadataSource.scala| 2 +-
 .../execution/streaming/IncrementalExecution.scala | 61 +++-
 .../execution/streaming/statefulOperators.scala| 2 +-
 .../state/OperatorStateMetadataSuite.scala | 81 +-
 7 files changed, 166 insertions(+), 7 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated (9d9675922543 -> eb71268a6a38)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 9d9675922543 [SPARK-45615][BUILD] Remove undated "Auto-application to `()` is deprecated" compile suppression rules
     add eb71268a6a38 [SPARK-47108][CORE] Set `derby.connection.requireAuthentication` to `false` explicitly in CLIs

No new revisions were added by this update.

Summary of changes:
 .../main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java | 1 +
 1 file changed, 1 insertion(+)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch branch-3.5 updated: [SPARK-47085][SQL][3.5] reduce the complexity of toTRowSet from n^2 to n
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.5 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.5 by this push: new 92a333ada7c5 [SPARK-47085][SQL][3.5] reduce the complexity of toTRowSet from n^2 to n 92a333ada7c5 is described below commit 92a333ada7c56b6f3dacffc18010880e37e66ee2 Author: Izek Greenfield AuthorDate: Tue Feb 20 12:39:24 2024 -0800 [SPARK-47085][SQL][3.5] reduce the complexity of toTRowSet from n^2 to n ### What changes were proposed in this pull request? reduce the complexity of RowSetUtils.toTRowSet from n^2 to n ### Why are the changes needed? This causes performance issues. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Tests + test manually on AWS EMR ### Was this patch authored or co-authored using generative AI tooling? No Closes #45165 from igreenfield/branch-3.5. Authored-by: Izek Greenfield Signed-off-by: Dongjoon Hyun --- .../apache/spark/sql/hive/thriftserver/RowSetUtils.scala | 14 -- .../hive/thriftserver/SparkExecuteStatementOperation.scala | 2 +- 2 files changed, 5 insertions(+), 11 deletions(-) diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala index 9625021f392c..047f0612898d 100644 --- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala +++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala @@ -52,11 +52,7 @@ object RowSetUtils { rows: Seq[Row], schema: Array[DataType], timeFormatters: TimeFormatters): TRowSet = { -var i = 0 -val rowSize = rows.length -val tRows = new java.util.ArrayList[TRow](rowSize) -while (i < rowSize) { - val row = rows(i) +val tRows = rows.map { row => val tRow = new TRow() var j = 0 val columnSize = row.length @@ -65,9 +61,8 @@ object RowSetUtils { tRow.addToColVals(columnValue) j += 1 } - i += 1 - tRows.add(tRow) -} + tRow +}.asJava new TRowSet(startRowOffSet, tRows) } @@ -159,8 +154,7 @@ object RowSetUtils { val size = rows.length val ret = new java.util.ArrayList[T](size) var idx = 0 -while (idx < size) { - val row = rows(idx) +rows.foreach { row => if (row.isNullAt(ordinal)) { nulls.set(idx, true) ret.add(idx, defaultVal) diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala index a9b46739fa66..e6b4c70bb395 100644 --- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala +++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala @@ -114,7 +114,7 @@ private[hive] class SparkExecuteStatementOperation( val offset = iter.getPosition val rows = iter.take(maxRows).toList log.debug(s"Returning result set with ${rows.length} rows from offsets " + - s"[${iter.getFetchStart}, ${offset}) with $statementId") + s"[${iter.getFetchStart}, ${iter.getPosition}) with $statementId") RowSetUtils.toTRowSet(offset, rows, dataTypes, getProtocolVersion, getTimeFormatters) } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
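Editor's note: the quadratic cost came from positional access on a `Seq` inside a while loop — for a linked `List`, `rows(i)` walks `i` elements, so the loop performs on the order of n²/2 element hops, while a single `map`/`foreach` traversal is linear. The committed fix does exactly that, replacing the indexed loops in `RowSetUtils` with one traversal. A self-contained sketch of the difference (timings indicative only):

```scala
// Illustrative sketch: indexed access on an immutable List is O(i) per call,
// so the while-loop variant is O(n^2) overall; one traversal with map is O(n).
def viaIndexing(rows: List[Int]): List[Int] = {
  val out = List.newBuilder[Int]
  var i = 0
  while (i < rows.length) { // both rows.length and rows(i) re-walk the list
    out += rows(i) + 1
    i += 1
  }
  out.result()
}

def viaTraversal(rows: List[Int]): List[Int] = rows.map(_ + 1)

def timed[A](label: String)(f: => A): A = {
  val start = System.nanoTime()
  val result = f
  println(f"$label: ${(System.nanoTime() - start) / 1e6}%.1f ms")
  result
}

val data = List.tabulate(20000)(identity)
assert(timed("indexed while loop")(viaIndexing(data)) ==
  timed("single traversal")(viaTraversal(data)))
```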
(spark) branch master updated (6a5649410d83 -> 9d9675922543)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 6a5649410d83 [SPARK-47098][INFRA] Migrate from AppVeyor to GitHub Actions for SparkR tests on Windows
     add 9d9675922543 [SPARK-45615][BUILD] Remove undated "Auto-application to `()` is deprecated" compile suppression rules

No new revisions were added by this update.

Summary of changes:
 pom.xml | 8
 project/SparkBuild.scala | 6 --
 2 files changed, 14 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
(spark) branch master updated: [SPARK-47098][INFRA] Migrate from AppVeyor to GitHub Actions for SparkR tests on Windows
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6a5649410d83 [SPARK-47098][INFRA] Migrate from AppVeyor to GitHub Actions for SparkR tests on Windows 6a5649410d83 is described below commit 6a5649410d83610777bd3d67c7a6f567215118ae Author: Hyukjin Kwon AuthorDate: Tue Feb 20 08:02:30 2024 -0800 [SPARK-47098][INFRA] Migrate from AppVeyor to GitHub Actions for SparkR tests on Windows ### What changes were proposed in this pull request? This PR proposes to migrate from AppVeyor to GitHub Actions for SparkR tests on Windows. ### Why are the changes needed? Reduce the tools we use for better maintenance. ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? - [x] Tested in my fork ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45175 from HyukjinKwon/SPARK-47098. Authored-by: Hyukjin Kwon Signed-off-by: Dongjoon Hyun --- .github/labeler.yml | 1 - .github/workflows/build_sparkr_window.yml | 81 + README.md | 1 - appveyor.yml | 75 dev/appveyor-guide.md | 186 -- dev/appveyor-install-dependencies.ps1 | 153 dev/sparktestsupport/utils.py | 7 +- project/build.properties | 1 - 8 files changed, 83 insertions(+), 422 deletions(-) diff --git a/.github/labeler.yml b/.github/labeler.yml index 20b5c936941c..7d24390f2968 100644 --- a/.github/labeler.yml +++ b/.github/labeler.yml @@ -21,7 +21,6 @@ INFRA: - changed-files: - any-glob-to-any-file: [ '.github/**/*', - 'appveyor.yml', 'tools/**/*', 'dev/create-release/**/*', '.asf.yaml', diff --git a/.github/workflows/build_sparkr_window.yml b/.github/workflows/build_sparkr_window.yml new file mode 100644 index ..07f4ebe91ad2 --- /dev/null +++ b/.github/workflows/build_sparkr_window.yml @@ -0,0 +1,81 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. 
+# +name: "Build / SparkR-only (master, 4.3.2, windows-2019)" + +on: + schedule: +- cron: '0 17 * * *' + +jobs: + build: +name: "Build module: sparkr" +runs-on: windows-2019 +timeout-minutes: 300 +steps: +- name: Download winutils Hadoop binary + uses: actions/checkout@v4 + with: +repository: cdarlint/winutils +- name: Move Hadoop winutil into home directory + run: | +Move-Item -Path hadoop-3.3.5 -Destination ~\ +- name: Checkout Spark repository + uses: actions/checkout@v4 +- name: Cache Maven local repository + uses: actions/cache@v4 + with: +path: ~/.m2/repository +key: build-sparkr-maven-${{ hashFiles('**/pom.xml') }} +restore-keys: | + build-sparkr-windows-maven- +- name: Install Java 17 + uses: actions/setup-java@v4 + with: +distribution: zulu +java-version: 17 +- name: Install R 4.3.2 + uses: r-lib/actions/setup-r@v2 + with: +r-version: 4.3.2 +- name: Install R dependencies + run: | +Rscript -e "install.packages(c('knitr', 'rmarkdown', 'testthat', 'e1071', 'survival', 'arrow', 'xml2'), repos='https://cloud.r-project.org/')" +Rscript -e "pkg_list <- as.data.frame(installed.packages()[,c(1, 3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]" + shell: cmd +- name: Build Spark + run: | +rem 1. '-Djna.nosys=true' is required to avoid kernel32.dll load failure. +rem See SPARK-28759. +rem 2. Ideally we should check the tests related to Hive in SparkR as well (SPARK-31745). +rem 3. setup-java installs Maven 3.8.7 but does not allow changing its version, so overwrite +rem Maven version as a workaround. +mvn -DskipTests -Psparkr -Djna.nosys=true packag
(spark) branch master updated: [SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8e82887ed25e [SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0 8e82887ed25e is described below commit 8e82887ed25e521e1400edf56109f9d8f5ee3303 Author: Haejoon Lee AuthorDate: Tue Feb 20 07:46:52 2024 -0800 [SPARK-46858][PYTHON][PS][BUILD] Upgrade Pandas to 2.2.0 ### What changes were proposed in this pull request? This PR proposes to upgrade Pandas to 2.2.0. See [What's new in 2.2.0 (January 19, 2024)](https://pandas.pydata.org/docs/whatsnew/v2.2.0.html) ### Why are the changes needed? Pandas 2.2.0 is released, and we should support the latest Pandas. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? The existing CI should pass ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44881 from itholic/pandas_2.2.0. Authored-by: Haejoon Lee Signed-off-by: Dongjoon Hyun --- dev/infra/Dockerfile | 4 +- .../source/migration_guide/pyspark_upgrade.rst | 1 + python/pyspark/pandas/frame.py | 6 +- python/pyspark/pandas/namespace.py | 5 +- python/pyspark/pandas/plot/matplotlib.py | 99 ++ python/pyspark/pandas/resample.py | 24 -- python/pyspark/pandas/series.py| 8 +- python/pyspark/pandas/supported_api_gen.py | 2 +- .../pyspark/pandas/tests/computation/test_melt.py | 13 +-- .../pandas/tests/data_type_ops/test_boolean_ops.py | 8 +- .../pandas/tests/data_type_ops/test_complex_ops.py | 12 ++- .../tests/data_type_ops/test_num_arithmetic.py | 24 +++--- .../pandas/tests/data_type_ops/test_num_mod.py | 6 +- .../pandas/tests/data_type_ops/test_num_mul_div.py | 10 +-- .../pandas/tests/data_type_ops/test_num_ops.py | 6 +- .../pandas/tests/data_type_ops/test_num_reverse.py | 24 +++--- .../pyspark/pandas/tests/frame/test_reshaping.py | 6 +- .../pandas/tests/indexes/test_conversion.py| 4 +- python/pyspark/pandas/tests/resample/test_error.py | 9 ++ python/pyspark/pandas/tests/resample/test_frame.py | 9 +- .../pyspark/pandas/tests/resample/test_missing.py | 4 +- .../pyspark/pandas/tests/resample/test_series.py | 4 +- python/pyspark/pandas/tests/test_namespace.py | 6 +- .../sql/tests/connect/test_connect_basic.py| 1 + .../sql/tests/connect/test_connect_function.py | 63 ++ 25 files changed, 253 insertions(+), 105 deletions(-) diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile index fa663bc6e419..eaeed51f90cd 100644 --- a/dev/infra/Dockerfile +++ b/dev/infra/Dockerfile @@ -91,10 +91,10 @@ RUN mkdir -p /usr/local/pypy/pypy3.8 && \ ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3.8 && \ ln -sf /usr/local/pypy/pypy3.8/bin/pypy /usr/local/bin/pypy3 RUN curl -sS https://bootstrap.pypa.io/get-pip.py | pypy3 -RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.1.4' scipy coverage matplotlib lxml +RUN pypy3 -m pip install numpy 'six==1.16.0' 'pandas<=2.2.0' scipy coverage matplotlib lxml -ARG BASIC_PIP_PKGS="numpy pyarrow>=15.0.0 six==1.16.0 pandas<=2.1.4 scipy plotly>=4.8 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2" +ARG BASIC_PIP_PKGS="numpy pyarrow>=15.0.0 six==1.16.0 pandas<=2.2.0 scipy plotly>=4.8 mlflow>=2.8.1 coverage matplotlib openpyxl memory-profiler>=0.61.0 scikit-learn>=1.3.2" # Python deps for Spark Connect ARG CONNECT_PIP_PKGS="grpcio==1.59.3 grpcio-status==1.59.3 protobuf==4.25.1 googleapis-common-protos==1.56.4" diff --git 
a/python/docs/source/migration_guide/pyspark_upgrade.rst b/python/docs/source/migration_guide/pyspark_upgrade.rst index 9ef04814ef82..1ca5d7aad5d1 100644 --- a/python/docs/source/migration_guide/pyspark_upgrade.rst +++ b/python/docs/source/migration_guide/pyspark_upgrade.rst @@ -69,6 +69,7 @@ Upgrading from PySpark 3.5 to 4.0 * In Spark 4.0, ``Series.dt.week`` and ``Series.dt.weekofyear`` have been removed from Pandas API on Spark, use ``Series.dt.isocalendar().week`` instead. * In Spark 4.0, when applying ``astype`` to a decimal type object, the existing missing value is changed to ``True`` instead of ``False`` from Pandas API on Spark. * In Spark 4.0, ``pyspark.testing.assertPandasOnSparkEqual`` has been removed from Pandas API on Spark, use ``pyspark.pandas.testing.assert_frame_equal`` instead. +* In Spark 4.0, the aliases ``Y``, ``M``, ``H``, ``T``, ``S`` have been deprecated from Pandas API on Spark, use ``YE``, ``ME``, ``h``, ``min``, ``s`` instead respectively. diff --git a/python/pyspark/pandas/frame.py
(spark) branch master updated: [SPARK-47044][SQL] Add executed query for JDBC external datasources to explain output
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e6a3385e27fa [SPARK-47044][SQL] Add executed query for JDBC external datasources to explain output e6a3385e27fa is described below commit e6a3385e27fa95391433ea02fa053540fe101d40 Author: Uros Stankovic AuthorDate: Tue Feb 20 22:03:28 2024 +0800 [SPARK-47044][SQL] Add executed query for JDBC external datasources to explain output ### What changes were proposed in this pull request? Add generated JDBC query to EXPLAIN FORMATTED command when physical Scan node should access to JDBC source to create RDD. Output of Explain formatted with this change from newly added test. ``` == Physical Plan == * Project (2) +- * Scan org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCScan$$anon$14349389d (1) (1) Scan org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCScan$$anon$14349389d [codegen id : 1] Output [1]: [MAX(ID)#x] Arguments: [MAX(ID)#x], [StructField(MAX(ID),IntegerType,true)], PushedDownOperators(Some(org.apache.spark.sql.connector.expressions.aggregate.Aggregation647d3279),None,None,None,List(),ArraySeq(ID IS NOT NULL, ID > 1)), JDBCRDD[0] at $anonfun$executePhase$2 at LexicalThreadLocal.scala:63, org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCScan$$anon$14349389d, Statistics(sizeInBytes=8.0 EiB, ColumnStat: N/A) External engine query: SELECT MAX("ID") FROM "test"."people" WHERE ("ID" IS NOT NULL) AND ("ID" > 1) (2) Project [codegen id : 1] Output [1]: [MAX(ID)#x AS max(id)#x] Input [1]: [MAX(ID)#x] ``` ### Why are the changes needed? This command will allow customers to see which query text is sent to external JDBC sources. ### Does this PR introduce _any_ user-facing change? Yes Customer will have another field in EXPLAIN FORMATTED command for RowDataSourceScanExec node. ### How was this patch tested? Tested using JDBC V2 suite by new unit test. ### Was this patch authored or co-authored using generative AI tooling? No Closes #45102 from urosstan-db/add-sql-query-for-external-datasources. 
Authored-by: Uros Stankovic Signed-off-by: Wenchen Fan --- .../apache/spark/sql/catalyst/trees/TreeNode.scala | 8 ++-- .../spark/sql/execution/DataSourceScanExec.scala | 10 .../datasources/ExternalEngineDatasourceRDD.scala | 26 ++ .../sql/execution/datasources/jdbc/JDBCRDD.scala | 56 -- .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala| 7 +++ 5 files changed, 78 insertions(+), 29 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala index dbacb833ef59..10e2718da833 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala @@ -1000,12 +1000,10 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] val str = if (verbose) { if (addSuffix) verboseStringWithSuffix(maxFields) else verboseString(maxFields) +} else if (printNodeId) { + simpleStringWithNodeId() } else { - if (printNodeId) { -simpleStringWithNodeId() - } else { -simpleString(maxFields) - } + simpleString(maxFields) } append(prefix) append(str) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala index ec265f4eaea4..474d65a251ba 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala @@ -127,6 +127,16 @@ case class RowDataSourceScanExec( } } + override def verboseStringWithOperatorId(): String = { +super.verboseStringWithOperatorId() + (rdd match { + case externalEngineDatasourceRdd: ExternalEngineDatasourceRDD => +"External engine query: " + + externalEngineDatasourceRdd.getExternalEngineQuery + + System.lineSeparator() + case _ => "" +}) + } + protected override def doExecute(): RDD[InternalRow] = { val numOutputRows = longMetric("numOutputRows") diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ExternalEngineDatasourceRDD.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ExternalEngineDatasourceRDD.scala new file mode 100644 index ..14ca824596f9 --- /dev/null +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ExternalEngineDatasourceRDD.sca
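For illustration, a minimal PySpark sketch (not taken from the patch itself) of how the new field surfaces to users; the connection details, table, and column names are placeholders, and it assumes a suitable JDBC driver on the classpath plus a Spark build that contains this change:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder connection details -- any reachable JDBC source will do.
people = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/testdb")
    .option("dbtable", "test.people")
    .option("user", "spark")
    .option("password", "secret")
    .load()
)

# With this change, the formatted plan for the JDBC scan node should include an
# "External engine query:" line showing the SQL text pushed to the external engine.
people.where(people["ID"] > 1).explain(mode="formatted")
```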
(spark) branch master updated: [SPARK-47100][BUILD] Upgrade `netty` to 4.1.107.Final and `netty-tcnative` to 2.0.62.Final
This is an automated email from the ASF dual-hosted git repository. yangjie01 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fb1e7872a3e6 [SPARK-47100][BUILD] Upgrade `netty` to 4.1.107.Final and `netty-tcnative` to 2.0.62.Final fb1e7872a3e6 is described below commit fb1e7872a3e64eab6127f9c2b3ffa42b63162f6c Author: Dongjoon Hyun AuthorDate: Tue Feb 20 17:04:41 2024 +0800 [SPARK-47100][BUILD] Upgrade `netty` to 4.1.107.Final and `netty-tcnative` to 2.0.62.Final ### What changes were proposed in this pull request? This PR aims to upgrade `netty` to 4.1.107.Final and `netty-tcnative` to 2.0.62.Final. ### Why are the changes needed? To bring the latest bug fixes. - https://netty.io/news/2024/02/13/4-1-107-Final.html ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #45178 from dongjoon-hyun/SPARK-47100. Authored-by: Dongjoon Hyun Signed-off-by: yangjie01 --- dev/deps/spark-deps-hadoop-3-hive-2.3 | 50 +-- pom.xml | 4 +-- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3 index dbbddbc54c11..cc0145e004a0 100644 --- a/dev/deps/spark-deps-hadoop-3-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-3-hive-2.3 @@ -192,32 +192,32 @@ metrics-jmx/4.2.25//metrics-jmx-4.2.25.jar metrics-json/4.2.25//metrics-json-4.2.25.jar metrics-jvm/4.2.25//metrics-jvm-4.2.25.jar minlog/1.3.0//minlog-1.3.0.jar -netty-all/4.1.106.Final//netty-all-4.1.106.Final.jar -netty-buffer/4.1.106.Final//netty-buffer-4.1.106.Final.jar -netty-codec-http/4.1.106.Final//netty-codec-http-4.1.106.Final.jar -netty-codec-http2/4.1.106.Final//netty-codec-http2-4.1.106.Final.jar -netty-codec-socks/4.1.106.Final//netty-codec-socks-4.1.106.Final.jar -netty-codec/4.1.106.Final//netty-codec-4.1.106.Final.jar -netty-common/4.1.106.Final//netty-common-4.1.106.Final.jar -netty-handler-proxy/4.1.106.Final//netty-handler-proxy-4.1.106.Final.jar -netty-handler/4.1.106.Final//netty-handler-4.1.106.Final.jar -netty-resolver/4.1.106.Final//netty-resolver-4.1.106.Final.jar +netty-all/4.1.107.Final//netty-all-4.1.107.Final.jar +netty-buffer/4.1.107.Final//netty-buffer-4.1.107.Final.jar +netty-codec-http/4.1.107.Final//netty-codec-http-4.1.107.Final.jar +netty-codec-http2/4.1.107.Final//netty-codec-http2-4.1.107.Final.jar +netty-codec-socks/4.1.107.Final//netty-codec-socks-4.1.107.Final.jar +netty-codec/4.1.107.Final//netty-codec-4.1.107.Final.jar +netty-common/4.1.107.Final//netty-common-4.1.107.Final.jar +netty-handler-proxy/4.1.107.Final//netty-handler-proxy-4.1.107.Final.jar +netty-handler/4.1.107.Final//netty-handler-4.1.107.Final.jar +netty-resolver/4.1.107.Final//netty-resolver-4.1.107.Final.jar netty-tcnative-boringssl-static/2.0.61.Final//netty-tcnative-boringssl-static-2.0.61.Final.jar -netty-tcnative-boringssl-static/2.0.61.Final/linux-aarch_64/netty-tcnative-boringssl-static-2.0.61.Final-linux-aarch_64.jar -netty-tcnative-boringssl-static/2.0.61.Final/linux-x86_64/netty-tcnative-boringssl-static-2.0.61.Final-linux-x86_64.jar -netty-tcnative-boringssl-static/2.0.61.Final/osx-aarch_64/netty-tcnative-boringssl-static-2.0.61.Final-osx-aarch_64.jar -netty-tcnative-boringssl-static/2.0.61.Final/osx-x86_64/netty-tcnative-boringssl-static-2.0.61.Final-osx-x86_64.jar 
-netty-tcnative-boringssl-static/2.0.61.Final/windows-x86_64/netty-tcnative-boringssl-static-2.0.61.Final-windows-x86_64.jar -netty-tcnative-classes/2.0.61.Final//netty-tcnative-classes-2.0.61.Final.jar -netty-transport-classes-epoll/4.1.106.Final//netty-transport-classes-epoll-4.1.106.Final.jar -netty-transport-classes-kqueue/4.1.106.Final//netty-transport-classes-kqueue-4.1.106.Final.jar -netty-transport-native-epoll/4.1.106.Final/linux-aarch_64/netty-transport-native-epoll-4.1.106.Final-linux-aarch_64.jar -netty-transport-native-epoll/4.1.106.Final/linux-riscv64/netty-transport-native-epoll-4.1.106.Final-linux-riscv64.jar -netty-transport-native-epoll/4.1.106.Final/linux-x86_64/netty-transport-native-epoll-4.1.106.Final-linux-x86_64.jar -netty-transport-native-kqueue/4.1.106.Final/osx-aarch_64/netty-transport-native-kqueue-4.1.106.Final-osx-aarch_64.jar -netty-transport-native-kqueue/4.1.106.Final/osx-x86_64/netty-transport-native-kqueue-4.1.106.Final-osx-x86_64.jar -netty-transport-native-unix-common/4.1.106.Final//netty-transport-native-unix-common-4.1.106.Final.jar -netty-transport/4.1.106.Final//netty-transport-4.1.106.Final.jar +netty-tcnative-boringssl-static/2.0.62.Final/linux-aarch_64/netty-tcnative-boringssl-static-2.0.62.Final-linux-aarch_64.jar +netty-tcnati