Hi, All.

Apache Spark 4.1.0 preparation has entered the QA stage, and `branch-4.1` is now open for bug fixes only. Please note that stabilizing `branch-4.1` is our top priority as we work toward releasing version 4.1.0.
Below is the third progress update for Apache Spark 4.1.0.

1. Tasks at Risk
- SPARK-51162 Add the TIME data type
- SPARK-51658 Add geospatial types in Spark

2. Moved to 4.2.0
- SPARK-53528 SPIP: Add llms.txt files to Spark Documentation
- SPARK-53798 Enable operator pushdown in Data Source V2 streaming
- SPARK-51167 Build and Run Spark on Java 25
- SPARK-48515 Enable Arrow optimization for Python UDFs
- SPARK-50532 Support Nano Second Timestamp
- SPARK-54119 Metrics & semantic modeling in Spark
- SPARK-48338 Sql Scripting support for Spark SQL
- SPARK-52011 Reduce HDFS NameNode RPC on vectorized Parquet reader

3. Ready for QA
- SPARK-48094 Reduce GitHub Action usage according to ASF project allowance
- SPARK-51207 Constraints in DSv2
- SPARK-51727 Declarative Pipelines
- SPARK-51982 Prepare and Configure Pandas API on Spark for ANSI Mode
- SPARK-52012 Restore IDE Index with type annotations
- SPARK-52176 Release Apache Spark via GitHub Actions
- SPARK-52214 Python Arrow UDF
- SPARK-52282 Improve SQL User-defined Functions
- SPARK-52625 Monthly preview release
- SPARK-52650 User Defined Type Improvements
- SPARK-52857 Improve `Variant` data type support
- SPARK-52979 Python Arrow UDTF
- SPARK-52984 Pandas on Spark ANSI Improvement
- SPARK-53005 Add ANSI Compliance to Pandas API on Spark
- SPARK-53047 Modernize Spark to leverage the latest Java features
- SPARK-53484 JDBC Driver for Spark Connect
- SPARK-53608 Improve Python Aggregation UDFs
- SPARK-53672 Unified interface for UDF
- SPARK-53736 Real-time Mode in Structured Streaming
- SPARK-53754 Python worker logging infrastructure
- SPARK-53885 Frequency estimation functions
- SPARK-54012 Improve Netty usage patterns
- SPARK-54016 Improve K8s support in Spark 4.1.0
- SPARK-54017 Audit test dependencies in Spark 4.1.0
- SPARK-54248 Changes of the existing configurations in Spark 4.1.0
- SPARK-54249 Improve Spark Event Log, History Server, and Web UI
- SPARK-54262 Drop Python 3.9 Support
- SPARK-54266 Improve JDBC data source
- SPARK-54268 Maintain Project Infra for Spark 4.1.0
- SPARK-54274 Support `MERGE INTO` Schema Evolution
- SPARK-54283 Improve Test Coverage and Stability
- SPARK-54284 Apache Spark 4.1.0 Dependency Audit and Cleanup
- SPARK-54286 Support Python 3.14
- SPARK-54309 Add Metrics to DML Operations
- SPARK-54357 Improve SparkConnect usability and performance
- SPARK-54359 Improve Maven/SBT build
- SPARK-54360 Improve Documentation Content and Management

Thank you, as always, for your continued collaboration and contributions.

Best regards,
Dongjoon Hyun
