jason810496 opened a new pull request, #68939: URL: https://github.com/apache/airflow/pull/68939

## Why

Demonstrate and regression-test that the Java SDK can run a real Scala + Apache Spark workload, with task logs routed into Airflow via Log4j 2.

## What

- Add `java-sdk/scala_spark_example`: a standalone Scala + Spark 3.5 (local mode) ETL bundle whose three tasks pass scalar results over XCom and log through Log4j 2 (`airflow-sdk-log4j2`).
- Run it inside the existing `java_sdk` e2e via a second coordinator and queue (`scala-jdk` / `scala`) with its own `jars_root`, keeping the Java example bundle Spark-free.
- Pin the e2e worker JRE to Java 17 and pass Spark's `--add-opens` JVM args.
- Add `TestJavaSDKScalaSparkExample` asserting the tasks succeed and the XComs match the fixed dataset (5 rows, total revenue 1000).

---

##### Was generative AI tooling used to co-author this PR?

- [x] Yes, with help of Claude Code Opus 4.8 following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
