jason810496 opened a new pull request, #68939:
URL: https://github.com/apache/airflow/pull/68939

   ## Why
   
   Demonstrate, and cover with a regression test, that the Java SDK can run a 
real Scala + Apache Spark workload, with task logs routed into Airflow via 
Log4j 2.
   
   ## What
   
   - Add `java-sdk/scala_spark_example`: a standalone Scala + Spark 3.5 (local 
mode) ETL bundle whose three tasks pass scalar results over XCom and log 
through Log4j 2 (`airflow-sdk-log4j2`).
   - Run it inside the existing `java_sdk` e2e test via a second coordinator and 
queue (`scala-jdk` / `scala`) with its own `jars_root`, so the Java example 
bundle stays Spark-free.
   - Pin the e2e worker JRE to Java 17 and pass Spark's `--add-opens` JVM args.
   - Add `TestJavaSDKScalaSparkExample` asserting the tasks succeed and the 
XComs match the fixed dataset (5 rows, total revenue 1000).
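
As a rough illustration of what the last bullet checks, here is a minimal plain-Python sketch of the three-task flow and the expected XCom values. It uses neither Spark nor Airflow. The task names, XCom keys, and individual rows are hypothetical; only the row count (5) and the total revenue (1000) come from this PR.

```python
# Hypothetical stand-in for the fixed dataset. The rows are invented;
# only the row count (5) and the total revenue (1000) come from the PR.
ROWS = [
    {"order_id": 1, "revenue": 100},
    {"order_id": 2, "revenue": 150},
    {"order_id": 3, "revenue": 200},
    {"order_id": 4, "revenue": 250},
    {"order_id": 5, "revenue": 300},
]

# A plain dict standing in for XCom storage (assumption: one scalar per key).
xcom = {}


def extract(rows):
    """Hypothetical first task: publish the row count as a scalar."""
    xcom["row_count"] = len(rows)


def transform(rows):
    """Hypothetical second task: publish the total revenue as a scalar."""
    xcom["total_revenue"] = sum(row["revenue"] for row in rows)


def load():
    """Hypothetical third task: read both scalars and publish a summary."""
    xcom["summary"] = f"{xcom['row_count']} rows, revenue {xcom['total_revenue']}"


extract(ROWS)
transform(ROWS)
load()

# The kind of check the e2e test is described as making.
assert xcom["row_count"] == 5
assert xcom["total_revenue"] == 1000
```

The real test asserts against XComs produced by the Scala tasks; this sketch only mirrors the shape of those assertions.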
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [x] Yes, with the help of Claude Code Opus 4.8, following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

