andygrove opened a new pull request, #171: URL: https://github.com/apache/datafusion-site/pull/171
## Summary

Draft blog post for the Apache DataFusion Comet 0.16.0 release, modeled on the 0.15.0 post.

Headline themes:

- **Expanded Spark 4 support**: first-class Spark 4.0.2 and 4.1.1 builds, with a new `spark-4.1` Maven profile (now the default), shared 4.x shims, and a Spark 4.0 / JDK 21 CI profile. Includes an "Adapting to Spark 4 Behavior Changes" subsection that frames each Spark 4 fix in the context of what changed in Spark 4.x compared to 3.x (Variant, collation, `TimestampNTZType`, SPARK-43402, BloomFilter V2, 4.1.1 analyzer refinements), and notes that these were caught because Comet runs the full Spark SQL test suite on every supported version.
- **ANSI SQL semantics**: emphasized as load-bearing for Spark 4 (where ANSI is on by default). Comet executes ANSI semantics natively for supported expressions, so queries with `spark.sql.ansi.enabled=true` continue to be accelerated.
- **Dynamic Partition Pruning for native Parquet scans**: non-AQE DPP (#4011), AQE DPP with broadcast reuse (#4112), and AQE DPP broadcast reuse for Iceberg native scans (#4215).

Other sections cover hash-join improvements (BuildRight+LeftAnti), PartialMerge aggregation, new expressions, native scan improvements, metrics/observability, stability fixes, and supported platforms.

PR/contributor counts (~115 PRs from 17 contributors) and the date in the post are placeholders; please adjust to the actual release date and final changelog numbers before merging.
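
For readers who want to try the ANSI and Dynamic Partition Pruning themes above, the following is a minimal sketch of how the relevant settings might be passed to `spark-submit`. Only `spark.sql.ansi.enabled` is named in the summary. The other keys (`spark.plugins`, `spark.comet.enabled`, `spark.sql.adaptive.enabled`, `spark.sql.optimizer.dynamicPartitionPruning.enabled`) are assumptions drawn from the general Spark and Comet documentation, not from this PR, and `to_conf_flags` is a hypothetical helper introduced purely for illustration.

```python
# Hypothetical helper: renders a settings dict as spark-submit "--conf" flags.
def to_conf_flags(settings):
    flags = []
    # Sort keys so the generated command line is deterministic.
    for key, value in sorted(settings.items()):
        flags.extend(["--conf", f"{key}={value}"])
    return flags


# Assumed setup: only spark.sql.ansi.enabled comes from the PR summary.
settings = {
    "spark.plugins": "org.apache.spark.CometPlugin",  # assumption: Comet plugin class
    "spark.comet.enabled": "true",                    # assumption: Comet master switch
    "spark.sql.ansi.enabled": "true",                 # ANSI mode (default in Spark 4)
    "spark.sql.adaptive.enabled": "true",             # AQE, relevant to AQE DPP
    "spark.sql.optimizer.dynamicPartitionPruning.enabled": "true",
}

flags = to_conf_flags(settings)
print("spark-submit " + " ".join(flags))
```

The Comet JAR still has to be on the driver and executor classpath for any of this to take effect; that step is omitted here because it depends on the Spark and Scala versions in use.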
