Hi, All.

Apache Spark 4.1.0 preparation has entered the QA stage, and `branch-4.1` is
now open for bug fixes only.
Please note that stabilizing `branch-4.1` is our top priority as we work
toward releasing version 4.1.0.

Below is the third progress update for Apache Spark 4.1.0.

1. Tasks at Risk
- SPARK-51162 Add the TIME data type
- SPARK-51658 Add geospatial types in Spark

2. Moved to 4.2.0
- SPARK-53528 SPIP: Add llms.txt files to Spark Documentation
- SPARK-53798 Enable operator pushdown in Data Source V2 streaming
- SPARK-51167 Build and Run Spark on Java 25
- SPARK-48515 Enable Arrow optimization for Python UDFs
- SPARK-50532 Support Nano Second Timestamp
- SPARK-54119 Metrics & semantic modeling in Spark
- SPARK-48338 Sql Scripting support for Spark SQL
- SPARK-52011 Reduce HDFS NameNode RPC on vectorized Parquet reader

3. Ready for QA
- SPARK-48094 Reduce GitHub Action usage according to ASF project allowance
- SPARK-51207 Constraints in DSv2
- SPARK-51727 Declarative Pipelines
- SPARK-51982 Prepare and Configure Pandas API on Spark for ANSI Mode
- SPARK-52012 Restore IDE Index with type annotations
- SPARK-52176 Release Apache Spark via GitHub Actions
- SPARK-52214 Python Arrow UDF
- SPARK-52282 Improve SQL User-defined Functions
- SPARK-52625 Monthly preview release
- SPARK-52650 User Defined Type Improvements
- SPARK-52857 Improve `Variant` data type support
- SPARK-52979 Python Arrow UDTF
- SPARK-52984 Pandas on Spark ANSI Improvement
- SPARK-53005 Add ANSI Compliance to Pandas API on Spark
- SPARK-53047 Modernize Spark to leverage the latest Java features
- SPARK-53484 JDBC Driver for Spark Connect
- SPARK-53608 Improve Python Aggregation UDFs
- SPARK-53672 Unified interface for UDF
- SPARK-53736 Real-time Mode in Structured Streaming
- SPARK-53754 Python worker logging infrastructure
- SPARK-53885 Frequency estimation functions
- SPARK-54012 Improve Netty usage patterns
- SPARK-54016 Improve K8s support in Spark 4.1.0
- SPARK-54017 Audit test dependencies in Spark 4.1.0
- SPARK-54248 Changes of the existing configurations in Spark 4.1.0
- SPARK-54249 Improve Spark Event Log, History Server, and Web UI
- SPARK-54262 Drop Python 3.9 Support
- SPARK-54266 Improve JDBC data source
- SPARK-54268 Maintain Project Infra for Spark 4.1.0
- SPARK-54274 Support `MERGE INTO` Schema Evolution
- SPARK-54283 Improve Test Coverage and Stability
- SPARK-54284 Apache Spark 4.1.0 Dependency Audit and Cleanup
- SPARK-54286 Support Python 3.14
- SPARK-54309 Add Metrics to DML Operations
- SPARK-54357 Improve SparkConnect usability and performance
- SPARK-54359 Improve Maven/SBT build
- SPARK-54360 Improve Documentation Content and Management

Thank you, as always, for your continued collaboration and contributions.

Best regards,
Dongjoon Hyun
