alamb commented on code in PR #84:
URL: https://github.com/apache/datafusion-site/pull/84#discussion_r2210222605


##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As always, this version packs in a wide range of
+improvements and fixes. You can find the complete details in the full
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md). We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require adjustments to your code as described in
+the [Upgrade Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0). Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 48.0.0, the default value of this [configuration setting] is now `true`, and DataFusion will collect and store statistics when a table is first created via `CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now boxed: `Expr::WindowFunction` now holds a `Box<WindowFunction>` instead of a `WindowFunction`
+  directly. This change was made to reduce the size of `Expr` and improve performance when planning queries
+  (see [#16207](https://github.com/apache/datafusion/pull/16207) for details).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata handling and
+  prepare for extension types, UDF traits now use [FieldRef] rather than a `DataType`
+  and nullability. `FieldRef` contains the type and nullability, and additionally allows access to
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical expressions return `Field`: Similarly to UDFs, in order to prepare for extension type support, the
+  [PhysicalExpr] trait has been changed to return [Field] rather than `DataType`. To upgrade structs that
+  implement `PhysicalExpr`, you need to implement the `return_field` function.
+
+[PhysicalExpr]: https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with `FileSource::try_pushdown_filters` to support upcoming work on dynamic filter pushdown and physical filter pushdown.
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
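To illustrate why boxing a large variant such as `Expr::WindowFunction` shrinks the enclosing enum, here is a minimal, standalone sketch. It does not use DataFusion's real types: `BigPayload`, `InlineExpr`, and `BoxedExpr` are hypothetical stand-ins invented for this example, and the exact sizes printed depend on the target platform.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large payload (not DataFusion's WindowFunction).
#[allow(dead_code)]
struct BigPayload {
    data: [u64; 32], // 32 * 8 = 256 bytes
}

// Payload stored inline: every value of the enum must reserve room for it,
// even when it is the small `Literal` variant.
#[allow(dead_code)]
enum InlineExpr {
    Literal(i64),
    Window(BigPayload),
}

// Payload stored behind a Box: the variant only holds a pointer,
// and the payload itself lives on the heap.
#[allow(dead_code)]
enum BoxedExpr {
    Literal(i64),
    Window(Box<BigPayload>),
}

fn inline_size() -> usize {
    size_of::<InlineExpr>()
}

fn boxed_size() -> usize {
    size_of::<BoxedExpr>()
}

fn main() {
    println!("inline enum: {} bytes", inline_size());
    println!("boxed enum:  {} bytes", boxed_size());
}
```

The trade-off is one extra heap allocation and pointer indirection when the boxed variant is used, in exchange for a smaller enum that is cheaper to move and clone for all other variants.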
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional unnecessary `Projection`s in queries (PRs [#15787](https://github.com/apache/datafusion/pull/15787), [#15761](https://github.com/apache/datafusion/pull/15761),
+  and [#15746](https://github.com/apache/datafusion/pull/15746) by [xudong963](https://github.com/xudong963)).
+
+- **Accelerated string functions**: The `ascii` function was optimized to significantly improve its performance
+  (PR [#16087](https://github.com/apache/datafusion/pull/16087) by [tlm365](https://github.com/tlm365)). The `character_length` function was optimized, resulting in an
+  [up to 3x](https://github.com/apache/datafusion/pull/15931#issuecomment-2848561984) performance improvement (PR [#15931](https://github.com/apache/datafusion/pull/15931) by [Dandandan](https://github.com/Dandandan)).

Review Comment:
   fyi @Dandandan 



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As 
always, this version packs in a wide range of 
+improvements and fixes. You can find the complete details in the full 
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md).
 We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require 
adjustments to your code as described in
+the [Upgrade 
Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0).
 Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 
48.0.0, the default value of this [configuration setting] is now true, and 
DataFusion will collect and store statistics when a table is first created via 
`CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+Should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now Boxed: `Expr::WindowFunction` is now a 
`Box<WindowFunction>` instead of a `WindowFunction` 
+  directly. This change was made to reduce the size of `Expr` and improve 
performance when planning queries 
+  (see details on [#16207](https://github.com/apache/datafusion/pull/16207)).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata 
handling and 
+  prepare for extension types, UDF traits now use [FieldRef] rather than a 
`DataType`
+  and nullability. `FieldRef` contains the type and nullability, and 
additionally allows access to 
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical Expression return `Field`: Similarly to UDFs, in order to prepare 
for extension type support the 
+  [PhysicalExpr] trait has been changed to return [Field] rather than 
`DataType`. To upgrade structs which 
+  implement `PhysicalExpr` you need to implement the `return_field` function. 
+
+[PhysicalExpr]: 
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with 
`FileSource::try_pushdown_filters` to support upcoming work to push down 
dynamic filters and physical filter pushdown. 
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional 
unnecessary `Projection`s in queries. (PRs 
[#15787](https://github.com/apache/datafusion/pull/15787), 
[#15761](https://github.com/apache/datafusion/pull/15761),
+  and [#15746](https://github.com/apache/datafusion/pull/15746) by 
[xudong963](https://github.com/xudong963)).
+
+- **Accelerated string functions**: The `ascii` function was optimized to 
significantly improve its performance
+  (PR [#16087](https://github.com/apache/datafusion/pull/16087) by 
[tlm365](https://github.com/tlm365)). The `character_length` function was 
optimized resulting in 

Review Comment:
   fyi @tlm365 



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As 
always, this version packs in a wide range of 
+improvements and fixes. You can find the complete details in the full 
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md).
 We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require 
adjustments to your code as described in
+the [Upgrade 
Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0).
 Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 
48.0.0, the default value of this [configuration setting] is now true, and 
DataFusion will collect and store statistics when a table is first created via 
`CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+Should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now Boxed: `Expr::WindowFunction` is now a 
`Box<WindowFunction>` instead of a `WindowFunction` 
+  directly. This change was made to reduce the size of `Expr` and improve 
performance when planning queries 
+  (see details on [#16207](https://github.com/apache/datafusion/pull/16207)).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata 
handling and 
+  prepare for extension types, UDF traits now use [FieldRef] rather than a 
`DataType`
+  and nullability. `FieldRef` contains the type and nullability, and 
additionally allows access to 
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical Expression return `Field`: Similarly to UDFs, in order to prepare 
for extension type support the 
+  [PhysicalExpr] trait has been changed to return [Field] rather than 
`DataType`. To upgrade structs which 
+  implement `PhysicalExpr` you need to implement the `return_field` function. 
+
+[PhysicalExpr]: 
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with 
`FileSource::try_pushdown_filters` to support upcoming work to push down 
dynamic filters and physical filter pushdown. 
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional 
unnecessary `Projection`s in queries. (PRs 
[#15787](https://github.com/apache/datafusion/pull/15787), 
[#15761](https://github.com/apache/datafusion/pull/15761),
+  and [#15746](https://github.com/apache/datafusion/pull/15746) by 
[xudong963](https://github.com/xudong963)).
+
+- **Accelerated string functions**: The `ascii` function was optimized to 
significantly improve its performance
+  (PR [#16087](https://github.com/apache/datafusion/pull/16087) by 
[tlm365](https://github.com/tlm365)). The `character_length` function was 
optimized resulting in 
+  [up to 
3x](https://github.com/apache/datafusion/pull/15931#issuecomment-2848561984) 
performance improvement (PR 
[#15931](https://github.com/apache/datafusion/pull/15931) by 
[Dandandan](https://github.com/Dandandan))
+
+- **Constant aggregate window expressions:** For unbounded aggregate window functions the result is the
+  same for all rows within a partition. DataFusion 48.0.0 avoids unnecessary computation for such queries, resulting in [improved performance by 5.6x](https://github.com/apache/datafusion/pull/16234#issuecomment-2935960865)
+  (PR [#16234](https://github.com/apache/datafusion/pull/16234) by [suibianwanwank](https://github.com/suibianwanwank)).
+
+## Highlighted New Features
+
+### New `datafusion-spark` crate
+
+The DataFusion community has requested [Apache Spark]-compatible functions for many years, but the current built-in function library is most similar to PostgreSQL, which leads to friction. Unfortunately, there are even functions with the same name but different signatures and/or return types in the two systems.
+
+One of the many uses of DataFusion is to enhance (e.g. [Apache DataFusion Comet](https://github.com/apache/datafusion-comet))
+or replace (e.g. [Sail](https://github.com/lakehq/sail)) [Apache Spark](https://spark.apache.org/). To
+support the community requests and the use cases mentioned above, we have introduced a new
+[datafusion-spark] crate for DataFusion with Spark-compatible functions so the
+community can collaborate to build this shared resource. There are several hundred functions to implement, and we are looking for help to [complete datafusion-spark Spark Compatible Functions].
+
+[datafusion-spark]: https://crates.io/crates/datafusion-spark
+[Apache Spark]: https://spark.apache.org
+
+To register all functions in `datafusion-spark` you can use:
+```rust
+use datafusion::prelude::SessionContext;
+
+// Create a new session context
+let mut ctx = SessionContext::new();
+// Register all Spark functions with the context
+datafusion_spark::register_all(&mut ctx)?;
+// Run a query. Note the `sha2` function is now available and
+// has Spark semantics
+let df = ctx.sql("SELECT sha2('The input String', 256)").await?;
+// ...
+```
+Or, to use an individual function, you can do:
+```rust
+use datafusion_expr::{col, lit};
+use datafusion_spark::expr_fn::sha2;
+// Create the expression `sha2(my_data, 256)`
+let expr = sha2(col("my_data"), lit(256));
+// ...
+```
+Thanks to [shehabgamin](https://github.com/shehabgamin) for the initial PR 
[#15168](https://github.com/apache/datafusion/pull/15168) 

Review Comment:
   fyi @shehabgamin



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As 
always, this version packs in a wide range of 
+improvements and fixes. You can find the complete details in the full 
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md).
 We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require 
adjustments to your code as described in
+the [Upgrade 
Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0).
 Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 
48.0.0, the default value of this [configuration setting] is now true, and 
DataFusion will collect and store statistics when a table is first created via 
`CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+Should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now Boxed: `Expr::WindowFunction` is now a 
`Box<WindowFunction>` instead of a `WindowFunction` 
+  directly. This change was made to reduce the size of `Expr` and improve 
performance when planning queries 
+  (see details on [#16207](https://github.com/apache/datafusion/pull/16207)).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata 
handling and 
+  prepare for extension types, UDF traits now use [FieldRef] rather than a 
`DataType`
+  and nullability. `FieldRef` contains the type and nullability, and 
additionally allows access to 
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical Expression return `Field`: Similarly to UDFs, in order to prepare 
for extension type support the 
+  [PhysicalExpr] trait has been changed to return [Field] rather than 
`DataType`. To upgrade structs which 
+  implement `PhysicalExpr` you need to implement the `return_field` function. 
+
+[PhysicalExpr]: 
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with 
`FileSource::try_pushdown_filters` to support upcoming work to push down 
dynamic filters and physical filter pushdown. 
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional 
unnecessary `Projection`s in queries. (PRs 
[#15787](https://github.com/apache/datafusion/pull/15787), 
[#15761](https://github.com/apache/datafusion/pull/15761),
+  and [#15746](https://github.com/apache/datafusion/pull/15746) by 
[xudong963](https://github.com/xudong963)).
+
+- **Accelerated string functions**: The `ascii` function was optimized to 
significantly improve its performance
+  (PR [#16087](https://github.com/apache/datafusion/pull/16087) by 
[tlm365](https://github.com/tlm365)). The `character_length` function was 
optimized resulting in 
+  [up to 
3x](https://github.com/apache/datafusion/pull/15931#issuecomment-2848561984) 
performance improvement (PR 
[#15931](https://github.com/apache/datafusion/pull/15931) by 
[Dandandan](https://github.com/Dandandan))
+
+- **Constant aggregate window expressions:** For unbounded aggregate window 
functions the result is the 
+  same for all rows within a partition. DataFusion 48.0.0 avoids unnecessary 
computation for such queries, resulting in [improved performance by 
5.6x](https://github.com/apache/datafusion/pull/16234#issuecomment-2935960865)
+  (PR [#16234](https://github.com/apache/datafusion/pull/16234) by 
[suibianwanwank](https://github.com/suibianwanwank))
+
+## Highlighted New Features
+
+### New `datafusion-spark` crate
+
+The DataFusion community has requested [Apache Spark]-compatible functions for many years, but the current built-in function library is most similar to PostgreSQL, which leads to friction. Unfortunately, there are even functions with the same name but different signatures and/or return types in the two systems.
+
+One of the many uses of DataFusion is to enhance (e.g. [Apache DataFusion 
Comet](https://github.com/apache/datafusion-comet)) 
+or replace (e.g. [Sail](https://github.com/lakehq/sail)) [Apache 
Spark](https://spark.apache.org/). To 
+support the community requests and the use cases mentioned above, we have 
introduced a new
+[datafusion-spark] crate for DataFusion with spark-compatible functions so the 
+community can collaborate to build this shared resource. There are several 
hundred functions to implement, and we are looking for help to [complete 
datafusion-spark Spark Compatible Functions].
+
+[datafusion-spark]: https://crates.io/crates/datafusion-spark
+[Apache Spark]: https://spark.apache.org
+
+To register all functions in `datafusion-spark` you can use:
+```rust
+use datafusion::prelude::SessionContext;
+
+// Create a new session context
+let mut ctx = SessionContext::new();
+// Register all Spark functions with the context
+datafusion_spark::register_all(&mut ctx)?;
+// Run a query. Note the `sha2` function is now available and
+// has Spark semantics
+let df = ctx.sql("SELECT sha2('The input String', 256)").await?;
+// ...
+```
+Or, to use an individual function, you can do:
+```rust
+use datafusion_expr::{col, lit};
+use datafusion_spark::expr_fn::sha2;
+// Create the expression `sha2(my_data, 256)`
+let expr = sha2(col("my_data"), lit(256));
+// ...
+```
+Thanks to [shehabgamin](https://github.com/shehabgamin) for the initial PR 
[#15168](https://github.com/apache/datafusion/pull/15168) 
+and many others for their help adding additional functions. Please consider 
+helping [complete datafusion-spark Spark Compatible Functions]. 
+
+[Complete datafusion-spark Spark Compatible Functions]: https://github.com/apache/datafusion/issues/15914
+
+### `ORDER BY ALL` SQL support
+
+Inspired by [DuckDB](https://duckdb.org/docs/stable/sql/query_syntax/orderby.html#order-by-all-examples), DataFusion 48.0.0 adds support for `ORDER BY ALL`. This makes it easy to order a result by all of its columns:
+
+```sql
+> set datafusion.sql_parser.dialect = 'DuckDB';
+0 row(s) fetched.
+> CREATE OR REPLACE TABLE addresses AS
+    SELECT '123 Quack Blvd' AS address, 'DuckTown' AS city, '11111' AS zip
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'DuckTown', '11111'
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111'
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111-0001';
+0 row(s) fetched.
+> SELECT * FROM addresses ORDER BY ALL;
++------------------------+-----------+------------+
+| address                | city      | zip        |
++------------------------+-----------+------------+
+| 111 Duck Duck Goose Ln | Duck Town | 11111      |
+| 111 Duck Duck Goose Ln | Duck Town | 11111-0001 |
+| 111 Duck Duck Goose Ln | DuckTown  | 11111      |
+| 123 Quack Blvd         | DuckTown  | 11111      |
++------------------------+-----------+------------+
+4 row(s) fetched.
+```
+Thanks to [PokIsemaine](https://github.com/PokIsemaine) for PR 
[#15772](https://github.com/apache/datafusion/pull/15772)
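
As a rough mental model of the output above, `ORDER BY ALL` behaves like sorting rows by every column, compared left to right in ascending order. The standalone sketch below reproduces the example's ordering using Rust's lexicographic tuple comparison. It is only an illustration, not DataFusion's implementation: the `Row` alias and the `sample_rows` and `order_by_all` helpers are hypothetical names, the sketch assumes plain byte-wise string comparison, and it does not model NULL ordering.

```rust
// Hypothetical row type for this sketch: (address, city, zip).
type Row = (&'static str, &'static str, &'static str);

// The four rows from the `addresses` table in the example.
fn sample_rows() -> Vec<Row> {
    vec![
        ("123 Quack Blvd", "DuckTown", "11111"),
        ("111 Duck Duck Goose Ln", "DuckTown", "11111"),
        ("111 Duck Duck Goose Ln", "Duck Town", "11111"),
        ("111 Duck Duck Goose Ln", "Duck Town", "11111-0001"),
    ]
}

// Sort ascending by every column, left to right. Tuples implement `Ord`
// lexicographically, so `sort()` compares address first, then city, then zip.
fn order_by_all(mut rows: Vec<Row>) -> Vec<Row> {
    rows.sort();
    rows
}

fn main() {
    for (address, city, zip) in order_by_all(sample_rows()) {
        println!("{address} | {city} | {zip}");
    }
}
```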

Review Comment:
   fyi @PokIsemaine



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As 
always, this version packs in a wide range of 
+improvements and fixes. You can find the complete details in the full 
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md).
 We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require 
adjustments to your code as described in
+the [Upgrade 
Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0).
 Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 
48.0.0, the default value of this [configuration setting] is now true, and 
DataFusion will collect and store statistics when a table is first created via 
`CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+Should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now Boxed: `Expr::WindowFunction` is now a 
`Box<WindowFunction>` instead of a `WindowFunction` 
+  directly. This change was made to reduce the size of `Expr` and improve 
performance when planning queries 
+  (see details on [#16207](https://github.com/apache/datafusion/pull/16207)).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata 
handling and 
+  prepare for extension types, UDF traits now use [FieldRef] rather than a 
`DataType`
+  and nullability. `FieldRef` contains the type and nullability, and 
additionally allows access to 
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical Expression return `Field`: Similarly to UDFs, in order to prepare 
for extension type support the 
+  [PhysicalExpr] trait has been changed to return [Field] rather than 
`DataType`. To upgrade structs which 
+  implement `PhysicalExpr` you need to implement the `return_field` function. 
+
+[PhysicalExpr]: 
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with 
`FileSource::try_pushdown_filters` to support upcoming work to push down 
dynamic filters and physical filter pushdown. 
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional 
unnecessary `Projection`s in queries. (PRs 
[#15787](https://github.com/apache/datafusion/pull/15787), 
[#15761](https://github.com/apache/datafusion/pull/15761),

Review Comment:
   fyi @xudong963 



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+---
+layout: post
+title: Apache DataFusion 48.0.0 Released
+date: 2025-07-16
+author: PMC
+categories: [ release ]
+---
+
+<!-- see https://github.com/apache/datafusion/issues/16347 for details -->
+
+We’re excited to announce the release of **Apache DataFusion 48.0.0**! As 
always, this version packs in a wide range of 
+improvements and fixes. You can find the complete details in the full 
+[changelog](https://github.com/apache/datafusion/blob/branch-48/dev/changelog/48.0.0.md).
 We’ll highlight the most
+important changes below and guide you through upgrading.
+
+## Breaking Changes
+
+DataFusion 48.0.0 brings a few **breaking changes** that may require 
adjustments to your code as described in
+the [Upgrade 
Guide](https://datafusion.apache.org/library-user-guide/upgrading.html#datafusion-48-0-0).
 Here are the most notable ones:
+
+
+- `datafusion.execution.collect_statistics` defaults to `true`: In DataFusion 
48.0.0, the default value of this [configuration setting] is now true, and 
DataFusion will collect and store statistics when a table is first created via 
`CREATE EXTERNAL TABLE` or one of the `DataFrame::register_*` APIs.
+
+[configuration setting]: https://datafusion.apache.org/user-guide/configs.html
+
+- `Expr::Literal` has optional metadata: The `Expr::Literal` variant now 
includes optional metadata, which allows 
+  for carrying through Arrow field metadata to support extension types and 
other uses. This means code such as
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar) => ...
+...
+}
+```
+
+should be updated to:
+
+```rust
+match expr {
+...
+  Expr::Literal(scalar, _metadata) => ...
+...
+}
+```
+
+- `Expr::WindowFunction` is now Boxed: `Expr::WindowFunction` is now a 
`Box<WindowFunction>` instead of a `WindowFunction` 
+  directly. This change was made to reduce the size of `Expr` and improve 
performance when planning queries 
+  (see details on [#16207](https://github.com/apache/datafusion/pull/16207)).
+
+- UDFs changed to use `FieldRef` instead of `DataType`: To support metadata 
handling and 
+  prepare for extension types, UDF traits now use [FieldRef] rather than a 
`DataType`
+  and nullability. `FieldRef` contains the type and nullability, and 
additionally allows access to 
+  metadata fields, which can be used for extension types.
+
+[FieldRef]: https://docs.rs/arrow/latest/arrow/datatypes/type.FieldRef.html
+
+- Physical expressions return `Field`: As with UDFs, to prepare for extension type support, the
+  [PhysicalExpr] trait now returns [Field] rather than `DataType`. To upgrade structs that
+  implement `PhysicalExpr`, implement the `return_field` function.
+
+[PhysicalExpr]: 
https://docs.rs/datafusion/latest/datafusion/physical_expr/trait.PhysicalExpr.html
+[Field]: https://docs.rs/arrow/latest/arrow/datatypes/struct.Field.html
+
+- `FileFormat::supports_filters_pushdown` was replaced with `FileSource::try_pushdown_filters`
+  to support upcoming work on dynamic filter pushdown and physical filter pushdown.
+
+- `ParquetExec`, `AvroExec`, `CsvExec`, `JsonExec` removed: `ParquetExec`, 
`AvroExec`, `CsvExec`, and `JsonExec`
+  were deprecated in DataFusion 46 and are removed in DataFusion 48.
+
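The effect of boxing a large enum variant, as described for `Expr::WindowFunction` above, can be sketched with a toy enum. This is only an illustration under stated assumptions: `LargePayload`, `InlineExpr`, `BoxedExpr`, and the 256-byte payload are hypothetical stand-ins, not DataFusion types, and the real `Expr` sizes differ.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for a large variant payload.
/// The 256-byte size is made up for illustration.
pub struct LargePayload {
    pub bytes: [u8; 256],
}

/// Toy expression enum that stores the large payload inline.
/// Every value is as big as the largest variant.
#[allow(dead_code)]
pub enum InlineExpr {
    Literal(i64),
    Window(LargePayload),
}

/// The same toy enum with the large variant boxed.
/// The variant now holds a pointer, so the whole enum shrinks.
#[allow(dead_code)]
pub enum BoxedExpr {
    Literal(i64),
    Window(Box<LargePayload>),
}

/// Returns `(inline_size, boxed_size)` in bytes.
pub fn expr_sizes() -> (usize, usize) {
    (size_of::<InlineExpr>(), size_of::<BoxedExpr>())
}

/// Matching on the boxed variant binds a `&Box<LargePayload>`;
/// field access works through auto-deref.
pub fn payload_len(expr: &BoxedExpr) -> usize {
    match expr {
        BoxedExpr::Window(w) => w.bytes.len(),
        BoxedExpr::Literal(_) => 0,
    }
}
```

Calling `expr_sizes()` shows the boxed enum is smaller than the inline one. The trade-off is one extra heap allocation whenever the boxed variant is constructed.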
+## Performance Improvements
+
+DataFusion 48.0.0 comes with some noteworthy performance enhancements:
+
+- **Fewer unnecessary projections:** DataFusion now removes additional 
unnecessary `Projection`s in queries. (PRs 
[#15787](https://github.com/apache/datafusion/pull/15787), 
[#15761](https://github.com/apache/datafusion/pull/15761),
+  and [#15746](https://github.com/apache/datafusion/pull/15746) by 
[xudong963](https://github.com/xudong963)).
+
+- **Accelerated string functions**: The `ascii` function was optimized to significantly improve
+  its performance (PR [#16087](https://github.com/apache/datafusion/pull/16087) by
+  [tlm365](https://github.com/tlm365)). The `character_length` function was also optimized,
+  resulting in an [up to 3x](https://github.com/apache/datafusion/pull/15931#issuecomment-2848561984)
+  performance improvement (PR [#15931](https://github.com/apache/datafusion/pull/15931) by
+  [Dandandan](https://github.com/Dandandan)).
+
+- **Constant aggregate window expressions:** For unbounded aggregate window functions, the result
+  is the same for all rows within a partition. DataFusion 48.0.0 avoids unnecessary computation
+  for such queries, resulting in [improved performance by 5.6x](https://github.com/apache/datafusion/pull/16234#issuecomment-2935960865)
+  (PR [#16234](https://github.com/apache/datafusion/pull/16234) by
+  [suibianwanwank](https://github.com/suibianwanwank)).
+
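The idea behind the constant aggregate window optimization above can be sketched in plain Rust. This is a simplified model, not DataFusion's implementation: the function names are hypothetical, a partition is modeled as a slice of `i64`, and the aggregate is assumed to be `SUM` over an unbounded frame.

```rust
/// Per-row evaluation: recomputes the partition sum for every row,
/// which costs O(n^2) for a partition of n rows.
pub fn sum_over_partition_naive(values: &[i64]) -> Vec<i64> {
    values
        .iter()
        .map(|_| values.iter().sum::<i64>())
        .collect()
}

/// Constant evaluation: the frame covers the whole partition, so the
/// sum is computed once and repeated for every row, which costs O(n).
pub fn sum_over_partition_constant(values: &[i64]) -> Vec<i64> {
    let total: i64 = values.iter().sum();
    vec![total; values.len()]
}
```

Both functions return the same output for any partition. The second simply avoids redoing work whose result cannot change from row to row.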
+## Highlighted New Features
+
+### New `datafusion-spark` crate
+
+The DataFusion community has requested [Apache Spark]-compatible functions for many years, but
+the current built-in function library is most similar to PostgreSQL, which leads to friction.
+Some functions even share a name across the two systems while differing in signature and/or
+return type.
+
+One of the many uses of DataFusion is to enhance (e.g. [Apache DataFusion Comet](https://github.com/apache/datafusion-comet))
+or replace (e.g. [Sail](https://github.com/lakehq/sail)) [Apache Spark](https://spark.apache.org/). To
+support the community requests and the use cases mentioned above, we have introduced a new
+[datafusion-spark] crate for DataFusion with Spark-compatible functions so the
+community can collaborate to build this shared resource. There are several hundred functions to
+implement, and we are looking for help to [complete datafusion-spark Spark Compatible Functions].
+
+[datafusion-spark]: https://crates.io/crates/datafusion-spark
+[Apache Spark]: https://spark.apache.org
+
+To register all functions in `datafusion-spark` you can use:
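+```rust
+use datafusion::prelude::SessionContext;
+
+// Assumes a Tokio runtime, as is usual for DataFusion programs
+#[tokio::main]
+async fn main() -> datafusion::error::Result<()> {
+    // Create a new session context
+    let mut ctx = SessionContext::new();
+    // Register all Spark functions with the context
+    datafusion_spark::register_all(&mut ctx)?;
+    // Run a query. Note that the `sha2` function is now available and
+    // has Spark semantics
+    let df = ctx.sql("SELECT sha2('The input String', 256)").await?;
+    // ...
+    Ok(())
+}
+```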
+Or, to use an individual function, you can do:
+```rust
+use datafusion_expr::{col, lit};
+use datafusion_spark::expr_fn::sha2;
+// Create the expression `sha2(my_data, 256)`
+let expr = sha2(col("my_data"), lit(256));
+...
+```
+Thanks to [shehabgamin](https://github.com/shehabgamin) for the initial PR 
[#15168](https://github.com/apache/datafusion/pull/15168) 
+and many others for their help adding additional functions. Please consider 
+helping [complete datafusion-spark Spark Compatible Functions]. 
+
+[Complete datafusion-spark Spark Compatible Functions]: 
https://github.com/apache/datafusion/issues/15914
+
+### `ORDER BY ALL` SQL support
+
+Inspired by [DuckDB](https://duckdb.org/docs/stable/sql/query_syntax/orderby.html#order-by-all-examples),
+DataFusion 48.0.0 adds support for `ORDER BY ALL`, which sorts the result by every column in
+the `SELECT` list, from left to right:
+
+```sql
+> set datafusion.sql_parser.dialect = 'DuckDB';
+0 row(s) fetched.
+> CREATE OR REPLACE TABLE addresses AS
+    SELECT '123 Quack Blvd' AS address, 'DuckTown' AS city, '11111' AS zip
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'DuckTown', '11111'
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111'
+    UNION ALL
+    SELECT '111 Duck Duck Goose Ln', 'Duck Town', '11111-0001';
+0 row(s) fetched.
+> SELECT * FROM addresses ORDER BY ALL;
++------------------------+-----------+------------+
+| address                | city      | zip        |
++------------------------+-----------+------------+
+| 111 Duck Duck Goose Ln | Duck Town | 11111      |
+| 111 Duck Duck Goose Ln | Duck Town | 11111-0001 |
+| 111 Duck Duck Goose Ln | DuckTown  | 11111      |
+| 123 Quack Blvd         | DuckTown  | 11111      |
++------------------------+-----------+------------+
+4 row(s) fetched.
+```
+Thanks to [PokIsemaine](https://github.com/PokIsemaine) for PR
+[#15772](https://github.com/apache/datafusion/pull/15772).
+
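The ordering produced by `ORDER BY ALL` in the example above can be modeled as a lexicographic sort over whole rows. The sketch below is only an analogy in plain Rust, not how DataFusion plans the query: `Row` and `order_by_all` are hypothetical names, every column is a string, and comparison is byte-wise and ascending, so SQL collations and `NULL` ordering are not modeled.

```rust
/// One row of the toy `addresses` table: (address, city, zip).
pub type Row = (&'static str, &'static str, &'static str);

/// Sorts rows by every column, left to right, in ascending order.
/// Rust tuples compare lexicographically field by field, which is the
/// same shape as `ORDER BY address, city, zip`.
pub fn order_by_all(mut rows: Vec<Row>) -> Vec<Row> {
    rows.sort();
    rows
}
```

Feeding in the four `addresses` rows from the SQL example yields the same row order as the query output, because `"Duck Town"` sorts before `"DuckTown"` (a space compares lower than `T`) and `"11111"` sorts before `"11111-0001"` (a prefix sorts first).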
+### FFI Support for `AggregateUDF` and `WindowUDF`
+
+This improvement allows user-defined aggregate and user-defined window functions to be used
+across FFI boundaries, which enables shared libraries to pass functions back and forth. This
+feature unlocks:
+
+- Modules that provide DataFusion-based FFI aggregates that can be reused in projects such as
+  [datafusion-python](https://github.com/apache/datafusion-python)
+
+- Using the same aggregate and window functions without recompiling with different DataFusion
+  versions.
+
+This completes the work to add support for all UDF types to DataFusion's FFI 
bindings. Thanks to [timsaucer](https://github.com/timsaucer)

Review Comment:
   fyi @timsaucer 



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+- **Constant aggregate window expressions:** For unbounded aggregate window functions, the result
+  is the same for all rows within a partition. DataFusion 48.0.0 avoids unnecessary computation
+  for such queries, resulting in [improved performance by 5.6x](https://github.com/apache/datafusion/pull/16234#issuecomment-2935960865)
+  (PR [#16234](https://github.com/apache/datafusion/pull/16234) by
+  [suibianwanwank](https://github.com/suibianwanwank)).

Review Comment:
   FYI @suibianwanwank
   



##########
content/blog/2025-07-16-datafusion-48.0.0.md:
##########
@@ -0,0 +1,209 @@
+### FFI Support for `AggregateUDF` and `WindowUDF`
+
+This improvement allows user-defined aggregate and user-defined window functions to be used
+across FFI boundaries, which enables shared libraries to pass functions back and forth. This
+feature unlocks:
+
+- Modules that provide DataFusion-based FFI aggregates that can be reused in projects such as
+  [datafusion-python](https://github.com/apache/datafusion-python)
+
+- Using the same aggregate and window functions without recompiling with different DataFusion
+  versions.
+
+This completes the work to add support for all UDF types to DataFusion's FFI 
bindings. Thanks to [timsaucer](https://github.com/timsaucer)
+for PRs [#16261](https://github.com/apache/datafusion/pull/16261) and 
[#14775](https://github.com/apache/datafusion/pull/14775).
+
+### Reduced size of `Expr` enum
+
+The [Expr] enum is widely used across the DataFusion and downstream codebases. By `Box`ing
+`WindowFunction`s, we reduced the size of `Expr` by almost 50%, from `272` to `144` bytes. This
+reduction improved planning times by between 10% and 20% and reduced memory usage. Thanks to
+[hendrikmakait](https://github.com/hendrikmakait) for 

Review Comment:
   fyi @hendrikmakait


