Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-08 Thread via GitHub


jayzhan211 merged PR #10560:
URL: https://github.com/apache/datafusion/pull/10560


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-08 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2156333141

   It looks pretty nice now! Thanks @alamb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-08 Thread via GitHub


alamb commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2156133115

   > > I have some ideas about additional comments / documentation that I would 
be happy to help add
   > 
   > Sure!
   
   Thank you for your patience @jayzhan211  -- I just pushed a bunch of docs 
and tests (and a small API refinement):
   
   1. Consolidated the example into the expr_api.rs examples
   2. Simplified the api for `filter` from `filter(Box)` to just 
`filter(Expr)`
   3. Added documentation and examples to the trait
   4. Added tests
   5. Checked for SortExprs in `order_by`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2154855300

   > I have some ideas about additional comments / documentation that I would 
be happy to help add
   
   Sure! 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


alamb commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1631147705


##
datafusion/functions-aggregate/src/first_last.rs:
##
@@ -31,20 +31,28 @@ use datafusion_common::{
 use datafusion_expr::function::{AccumulatorArgs, StateFieldsArgs};
 use datafusion_expr::utils::{format_state_name, AggregateOrderSensitivity};
 use datafusion_expr::{
-Accumulator, AggregateUDFImpl, ArrayFunctionSignature, Signature, 
TypeSignature,
-Volatility,
+Accumulator, AggregateExt, AggregateUDFImpl, ArrayFunctionSignature, Expr, 
Signature,
+TypeSignature, Volatility,
 };
 use datafusion_physical_expr_common::aggregate::utils::get_sort_options;
 use datafusion_physical_expr_common::sort_expr::{
 limited_convert_logical_sort_exprs_to_physical, LexOrdering, 
PhysicalSortExpr,
 };
 
-make_udaf_expr_and_func!(
-FirstValue,
-first_value,
-"Returns the first value in a group of values.",
-first_value_udaf
-);
+create_func!(FirstValue, first_value_udaf);
+
+/// Returns the first value in a group of values.
+pub fn first_value(expression: Expr, order_by: Option>) -> Expr {
+if let Some(order_by) = order_by {
+first_value_udaf()
+.call(vec![expression])
+.order_by(order_by)
+.build()
+.unwrap()

Review Comment:
   100% agree
   
   ```suggestion
   // guaranteed to be `Expr::AggregateFunction`
   .unwrap()
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1631125762


##
datafusion/functions-aggregate/src/first_last.rs:
##
@@ -31,20 +31,28 @@ use datafusion_common::{
 use datafusion_expr::function::{AccumulatorArgs, StateFieldsArgs};
 use datafusion_expr::utils::{format_state_name, AggregateOrderSensitivity};
 use datafusion_expr::{
-Accumulator, AggregateUDFImpl, ArrayFunctionSignature, Signature, 
TypeSignature,
-Volatility,
+Accumulator, AggregateExt, AggregateUDFImpl, ArrayFunctionSignature, Expr, 
Signature,
+TypeSignature, Volatility,
 };
 use datafusion_physical_expr_common::aggregate::utils::get_sort_options;
 use datafusion_physical_expr_common::sort_expr::{
 limited_convert_logical_sort_exprs_to_physical, LexOrdering, 
PhysicalSortExpr,
 };
 
-make_udaf_expr_and_func!(
-FirstValue,
-first_value,
-"Returns the first value in a group of values.",
-first_value_udaf
-);
+create_func!(FirstValue, first_value_udaf);
+
+/// Returns the first value in a group of values.
+pub fn first_value(expression: Expr, order_by: Option>) -> Expr {
+if let Some(order_by) = order_by {
+first_value_udaf()
+.call(vec![expression])
+.order_by(order_by)
+.build()
+.unwrap()

Review Comment:
   I think it is fine to unwrap since udaf.call() is guaranteed to be 
Expr::AggregateFunction



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2154728354

   > > I'm not sure if we really need the error check since we don't have any 
now
   > 
   > I guess what I was thinking is that the reason there is no check now is 
because the rust compiler will ensure the types are correct (specifically that 
they are `AggregateUDF`.
   > 
   > With the builder API as it is written now, you can call `order_by` on any 
arbitrary Expr (not just the appropriate `Expr::AggregateFunction` variant)
   > 
   > So I was imagining if any of the methods had been called on an invalid 
Expr, `build()` would return an error
   > 
   > Maybe something like
   > 
   > ```rust
   > /// Traits to help build aggregate functions
   > trait AggregateExt {
   > // return a builder
   > fn order_by(self, order_by: Vec) -> AggregateBuilder;
   > fn filter(self, filter: Box) -> AggregateBuilder;
   > fn null_treatment(self, null_treatment: NullTreatment) -> 
AggregateBuilder;
   > fn distinct(self) -> AggregateBuilder;
   > }
   > 
   > pub struct AggregateBuilder {
   >   // if set to none, builder errors
   >   udf: Option>,
   >   order_by: Vec,
   > 
   > }
   > 
   > impl AggregateBuilder {
   > fn order_by(self, order_by: Vec) -> AggregateBuilder {
   >   self.order_by = order_by;
   >   self
   > }
   > fn filter(self, filter: Box) -> AggregateBuilder {..}
   > fn null_treatment(self, null_treatment: NullTreatment) -> 
AggregateBuilder {..}
   > fn distinct(self) -> AggregateBuilder {..}
   >// builds up any in progress aggregate
   > fn build(self) -> Result {
   >   let Some(udf) = self.udf else {
   > return plan_err!("Expr of type XXX is not an aggregate")
   >   } 
   >   udf.order_by = self.order_by;
   >   ...
   >   Ok(Expr::AggregateFunction(udf))
   >   }
   > }
   > 
   > impl AggregateExt for Expr {
   > fn order_by(self, order_by: Vec) -> AggregateBuilder {
   > match self {
   > Expr::AggregateFunction(mut udaf) => {
   > AggregateBuilder { udf: Some(udaf) }
   > }
   > // wrong type passed -- error when build is called
   > _ => {
   > AggregateBuilder { udf: None }
   > }
   > }
   > }
   > ```
   
   I'm not sure is it worth to introduce build()? just for handling non 
Expr::AggregateFunction cases 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2154683624

   And, I think we need `fn agg_builder(self) -> AggregateBuilder;` to get 
`AggregateBuilder` from Expr
   
   ```rust
   fn agg_builder(self) -> AggregateBuilder {
   match self {
   Expr::AggregateFunction(udaf) => {
   AggregateBuilder::new(Some(udaf))
   }
   _ => AggregateBuilder::new(None)
   }
   }
   ```
   
   The API is like `udaf.call(args).agg_builder().filter().order_by().build()?` 
`udaf.call(args)` is `Expr::AggregateFunction`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2154664764

   > ```rust
   > AggregateExt
   > ```
   
   I don't think I get the point
   
   I think we can get the AggregateFunction from the result of `call`
   ```rust
   pub fn call(&self, args: Vec) -> Expr {
   Expr::AggregateFunction(AggregateFunction::new_udf(
   Arc::new(self.clone()),
   args,
   false,
   None,
   None,
   None,
   ))
   }
   ```
   
   The introduced extension functions are used for modifying AggregateFunction 
*in-place*. Therefore, we don't need additional check for this.
   
   Unless, we want to handle error for arbitrary `Expr`. For example, if 
Expr::ScalarFunction go through the extension functions and we could get a 
unexpected `Expr` silently or even panic. Is it the reason why we need 
`build()`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-07 Thread via GitHub


alamb commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2154554804

   > I'm not sure if we really need the error check since we don't have any now
   
   
   I guess what I was thinking is that the reason there is no check now is 
because the rust compiler will ensure the types are correct (specifically that 
they are `AggregateUDF`.
   
   With the builder API as it is written now, you can call `order_by` on any 
arbitrary Expr (not just the appropriate `Expr::AggregateFunction` variant)
   
   So I was imagining if any of the methods had been called on an invalid Expr, 
`build()` would return an error
   
   Maybe something like
   
   ```rust
   /// Traits to help build aggregate functions
   trait AggregateExt {
   // return a builder
   fn order_by(self, order_by: Vec) -> AggregateBuilder;
   fn filter(self, filter: Box) -> AggregateBuilder;
   fn null_treatment(self, null_treatment: NullTreatment) -> 
AggregateBuilder;
   fn distinct(self) -> AggregateBuilder;
   }
   
   pub struct AggregateBuilder {
 // if set to none, builder errors
 udf: Option>,
 order_by: Vec,
   
   }
   
   impl AggregateBuilder {
   fn order_by(self, order_by: Vec) -> AggregateBuilder {
 self.order_by = order_by;
 self
   }
   fn filter(self, filter: Box) -> AggregateBuilder {..}
   fn null_treatment(self, null_treatment: NullTreatment) -> 
AggregateBuilder {..}
   fn distinct(self) -> AggregateBuilder {..}
  // builds up any in progress aggregate
   fn build(self) -> Result {
 let Some(udf) = self.udf else {
   return plan_err!("Expr of type XXX is not an aggregate")
 } 
 udf.order_by = self.order_by;
 ...
 Ok(Expr::AggregateFunction(udf))
 }
   }
   
   impl AggregateExt for Expr {
   fn order_by(self, order_by: Vec) -> AggregateBuilder {
   match self {
   Expr::AggregateFunction(mut udaf) => {
   AggregateBuilder { udf: Some(udaf) }
   }
   // wrong type passed -- error when build is called
   _ => {
   AggregateBuilder { udf: None }
   }
   }
   }
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-06 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2153499258

   > This is really cool @jayzhan211 -- other than some documentation and 
examples I think this looks great
   > 
   > I left some other comments inline
   > 
   > One thing that I think might be confusing with this API is that this would 
not error:
   > 
   > ```rust
   > use AggregateUDFExprBuilder;
   > let expr = col("a")
   >   .order_by(col("b"))
   > ```
   > 
   > I wonder if we could make it looks more like this:
   > 
   > ```rust
   > let expr = col("a")
   >   .order_by(col("b"))
   >   .build()?
   > ```
   > 
   > To have a change to throw an error
   > 
   > So a more full featured example would look like
   > 
   > ```rust
   > // FIRST_VALUE(ORDER BY b FILTER c > 1)
   > let expr = first_value("a")
   >   .order_by(col("b"))
   >   .filter(col("c").gt(lit(1)) // note no ? here
   >   .build()? // only ? here
   > ```
   
   I'm not sure if we really need the error check since we don't have any now
   
   ```rust
   pub fn new_udf(
   udf: Arc,
   args: Vec,
   distinct: bool,
   filter: Option>,
   order_by: Option>,
   null_treatment: Option,
   ) -> Self {
   Self {
   func_def: AggregateFunctionDefinition::UDF(udf),
   args,
   distinct,
   filter,
   order_by,
   null_treatment,
   }
   }
   ```
   
   What should we do in `build() -> Result`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-06 Thread via GitHub


alamb commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1630300241


##
datafusion/functions-aggregate/src/first_last.rs:
##
@@ -31,20 +31,24 @@ use datafusion_common::{
 use datafusion_expr::function::{AccumulatorArgs, StateFieldsArgs};
 use datafusion_expr::utils::{format_state_name, AggregateOrderSensitivity};
 use datafusion_expr::{
-Accumulator, AggregateUDFImpl, ArrayFunctionSignature, Signature, 
TypeSignature,
-Volatility,
+Accumulator, AggregateUDFExprBuilder, AggregateUDFImpl, 
ArrayFunctionSignature, Expr,
+Signature, TypeSignature, Volatility,
 };
 use datafusion_physical_expr_common::aggregate::utils::get_sort_options;
 use datafusion_physical_expr_common::sort_expr::{
 limited_convert_logical_sort_exprs_to_physical, LexOrdering, 
PhysicalSortExpr,
 };
 
-make_udaf_expr_and_func!(
-FirstValue,
-first_value,
-"Returns the first value in a group of values.",
-first_value_udaf
-);
+create_func!(FirstValue, first_value_udaf);
+
+/// Returns the first value in a group of values.
+pub fn first_value(expression: Expr, order_by: Option>) -> Expr {
+if let Some(order_by) = order_by {

Review Comment:
   that is pretty cool indeed



##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,44 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::{col, AggregateUDFExprBuilder};
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("first_value").unwrap();
+let first_value_builder = first_value_udaf
+.call(vec![col("a")])
+.order_by(vec![col("b")]);

Review Comment:
   I think it would help to add comments to this example explaining what it was 
trying to show
   
   Something like
   ```rust
   // DataFusion also supports more advanced aggregate expressions, like
   // `FIRST_VALUE(a ORDER BY b)`. To create such an expression using the 
expr_api
   // you can use `AggregateUDFExprBuilder` trait as follows:
   ```
   
   Also, I think it would actually be easier to find as a function in 
https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/expr_api.rs
 rather than its own separate example



##
datafusion/expr/src/udaf.rs:
##
@@ -606,3 +607,49 @@ impl AggregateUDFImpl for AggregateUDFLegacyWrapper {
 (self.accumulator)(acc_args)
 }
 }
+
+pub trait AggregateUDFExprBuilder {

Review Comment:
   Also, since it works with `Expr` (not just `AggregateUDF`) I wonder if we 
should call it something more general like `AggregateExprBuilder` 🤔 
   
   Also, I think we should document this trait with some examples, and 
specifically documenting the behavior of what happens  when the Expr is *not* 
an `Expr::AggregateFunction` (the code silently ignores the expr)
   
   Something like this maybe
   
   ```rust
   /// Trait to assist building aggregate expressions.
   ///
   /// # Error handing
   /// (what happens if the Expr is not an aggregate)
   ///
   /// # Example
   /// ```
   /// show it in use
   /// ```
   ```
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-06 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1629552024


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   ```rust
   // geometric_mean(a FILTER b > c)
   let agg = geometric_mean.call(vec![col("a")])
 .filter(col("b").gt(col("c")))
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-06 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1629552024


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   ```
   // geometric_mean(a FILTER b > c)
   let agg = geometric_mean.call(vec![col("a")])
 .filter(col("b").gt(col("c")))
   
   let df = df.aggregate(vec![], vec![agg])?;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-06 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1629550398


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   I found that we don;t even need `builder()` with trait impl



##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   I found that we don't even need `builder()` with trait impl



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626750687


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   It looks good to me to have both style for function call.
   
   We can also have builder for `expr_fn` API
   expr_fn::first_value(args).agg_builder().filter().builder()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626750687


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   It looks good to me to have both style for function call.
   
   How about the API call in expr_fn? Should we introduce builder mode too?
   
   Current state:
   expr_fn::first_value(args, distinct, filter, order_by, null_treatment)
   
   Builder mode:
   expr_fn::first_value_builder(args).build()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626750687


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   It looks good to me to have both style for function call.
   
   How about the API call in expr_fn? UDAF (`geometric_mean`) is not a free 
function that could be called anywherem, it can only be retrieved from function 
registry. API call in expr_fn might be used more frequently.
   
   Should we introduce builder mode?
   
   Current state:
   expr_fn::first_value(args, distinct, filter, order_by, null_treatment)
   
   Builder mode:
   expr_fn::first_value_builder(args).build()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626750687


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   It looks good to me to have both style for function call.
   
   How about the API call in expr_fn? UDAF (`geometric_mean`) is not a free 
function that could be called anywhere. API call in expr_fn might be used more 
frequently.
   
   Should we introduce builder mode?
   
   Current state:
   expr_fn::first_value(args, distinct, filter, order_by, null_treatment)
   
   Builder mode:
   expr_fn::first_value_builder(args).build()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626750687


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   It looks good to me to have both style for function call.
   
   How about the API call in expr_fn? Since, udaf is not public function, it is 
not a free function that could be called anywhere. API call in expr_fn might be 
used more frequently.
   
   Should we introduce builder mode?
   
   Current state:
   expr_fn::first_value(args, distinct, filter, order_by, null_treatment)
   
   Builder mode:
   expr_fn::first_value_builder(args).build()



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-06-04 Thread via GitHub


alamb commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1626296887


##
datafusion-examples/examples/advanced_udaf.rs:
##
@@ -412,7 +412,7 @@ async fn main() -> Result<()> {
 let df = ctx.table("t").await?;
 
 // perform the aggregation
-let df = df.aggregate(vec![], vec![geometric_mean.call(vec![col("a")])])?;
+let df = df.aggregate(vec![], 
vec![geometric_mean.call(vec![col("a")]).build()])?;

Review Comment:
   Sorry for the delay in reviewing this @jayzhan211  -- I have been thinking 
about this PR and I am struggling
   
   On one hand, having a builder allows for exposing the ability to provide 
other types of arguments to aggregate functions (e.g. `ORDER BY` or `FILTER`) 
   
   On the other hand, I feel like this API change (requiring a call to 
`build()` after `geometric_mean.call` makes DataFusion  harder to learn / use 
as most calls to aggregate functions are made with just the arguments. 
   
   I was thinking about what an ideal  user experience would be. What do you 
think about something that permits using the existing `expr_fn` API, but also 
allows easier construction of more advanced aggregate calls
   
   What about something like  this (as today, no need to call build)
   
   ```rust
   // geometric_mean(a)
   let agg = geometric_mean.call(vec![col("a")]);
   let df = df.aggregate(vec![], vec![agg])?;
   ```
   
   And then then to create an `ORDER BY` the user could create and use a 
builder like this:
   
   ```rust
   // geometric_mean(a FILTER b > c)
   let agg = geometric_mean.call(vec![col("a")])
 .agg_builder() // creates a builder that wraps an Expr, no error return
 .filter(col("b").gt(col("c"((
 .build()?; // check here if everything was ok
   
   let df = df.aggregate(vec![], vec![agg])?;
   ```
   
   
   We could probably implement a trait for `Expr` to add `agg_builder` (or 
maybe just add it directly)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-25 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611638831


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   The downside (?) of this version is that it needs an internal function 
`first_value_udaf` (non-public). If the user can't get `first_value_udaf ` 
easily, it is not so trivial to use this expr style, and another small issue is 
`call` always expects `Vec`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-25 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611638831


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   The downside (?) of this version is that it needs an internal function 
`first_value_udaf` (non-public). If the user can't get `first_value_udaf ` 
easily, it is not so trivial to use this expr style, and another small issue is 
`call` always expects vec of expr now.
   
   But, there are no builder functions like `first_value_builder`, so it may 
make less noise to the user (?).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611638831


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   The downside (?) of this version is that it needs an internal function 
(non-public), so if the user can't get AggregateUDF easily, it is not so 
trivial to use this expr style, and another small issue is `call` always 
expects vec of expr now.
   
   But, there are no builder functions like `first_value_builder`, so it may 
make less noise to the user (?).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611644593


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   I'm thinking maybe we change the existing expr::fn to builder mode too.
   
   I expect there are two ways to build Expr
   `expr::fn first_value(arg).order_by().build()`
   and
   `first_value_udaf.call().order_by().build()`
   
   I suggest we force to use builder mode for aggregate and window function 
given there are many arguments to build Expr. Leave scalar function as it is. 
@alamb What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611644593


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   I'm thinking maybe we change the existing expr::fn to builder mode too.
   
   I expect there are two ways to build Expr
   `first_value(arg).order_by.build()`
   and
   `first_value_uda.call().order_by.build()`
   
   I suggest we force to use builder mode for aggregate and window function 
given there are many arguments to build Expr. Leave scalar function as it is. 
@alamb What do you think?



##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   I'm thinking maybe we change the existing expr::fn to builder mode too.
   
   I expect there are two ways to build Expr
   `first_value(arg).order_by.build()`
   and
   `first_value_udad.call().order_by.build()`
   
   I suggest we force to use builder mode for aggregate and window function 
given there are many arguments to build Expr. Leave scalar function as it is. 
@alamb What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611644593


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   I'm thinking maybe we change the existing expr::fn to builder mode too.
   
   There are two ways to build Expr
   `first_value(arg).order_by.build()`
   and
   `first_value_uda.call().order_by.build()`
   
   I suggest we force to use builder mode for aggregate and window function 
given there are many arguments to build Expr. Leave scalar function as it is. 
@alamb What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611638831


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,45 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::{
+execution::{
+config::SessionConfig,
+context::{SessionContext, SessionState},
+},
+functions_aggregate::{expr_fn::first_value, register_all},
+};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+let ctx = SessionContext::new();
+let config = SessionConfig::new();
+let mut state = SessionState::new_with_config_rt(config, 
ctx.runtime_env());
+let _ = register_all(&mut state);
+
+let first_value_udaf = 
state.aggregate_functions().get("FIRST_VALUE").unwrap();
+let first_value_builder = first_value_udaf

Review Comment:
   The downside (?) of this version is that it needs an internal function 
(non-public), so it is not so trivial to use it, and `call` always expects vec 
of expr now.
   
   But, there is no builder function like `first_value_builder`, so it maybe 
make less noise to the user (?).
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1611623369


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::functions_aggregate::expr_fn::{first_value, 
first_value_builder};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {

Review Comment:
   I keep it here since I need `functions-aggregate` dependency if I place it 
in doc test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-23 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2126917265

   Maybe we can just extend function for `AggregateFunction` 🤔 
   
   `pub fn call(&self, args: Vec) -> AggregateFunction`
   
   ``` rust
   impl AggregateFunction {
 fn build() -> Expr
   }
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-22 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1610984960


##
datafusion/functions-aggregate/src/expr_builder.rs:
##
@@ -0,0 +1,89 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::sync::Arc;
+
+use datafusion_expr::{expr::AggregateFunction, Expr};
+use sqlparser::ast::NullTreatment;
+
+/// Builder for creating an aggregate function expression

Review Comment:
   > if we put it in datafusion-functions-aggregate people can't use it unless 
they use the built in aggregates of DataFusion
   
   Nice! I forgot this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-22 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2126012154

   > which means both adding many new functions as well as that they won't work 
for user defined aggregate functions
   
   I think they can build the function with macro in function-aggregate, so 
they can have their function with ExprBuilder extension. But, then I also need 
to move the macro to `datafusion-expr` 🤔
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-22 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2126011843

   > Thank you @jayzhan211 -- this is looking really cool. I have some feeback 
on the API design
   > 
   > One major thing I think that might be worth considering is now to get one 
of these builders. Right now they are implemented as `count_builder` free 
functions, which means both adding many new functions as well as that they 
won't work for user defined aggregate functions
   > 
   > I wonder if it would be possible to add something to the `AggregateUDF` 
instead maybe make an extension trait so people could make a builder for any 
arbitrary aggregate expression,
   > 
   > So today to call a UDF you do
   > 
   > ```rust
   >   // Given an aggregate UDF:
   >   let agg_udf: AggregateUdf = todo!();
   >   // agg(a)
   >   let expr = agg_udf.call(col("a"))
   > ```
   > 
   > Could we implement a builder that works like
   > 
   > ```rust
   >   // agg(a ORDER BY b)
   >   let expr = agg_udf.builder()
   > .args(col("a"))
   > .order_by("b")
   > .build()
   > ```
   > 
   > Or I potentailly keep the "call" syntax:
   > 
   > ```rust
   >   // agg(a ORDER BY b)
   >   let expr = agg_udf.builder()
   > .order_by(col("b"))
   > .call(col("a"))
   > .build()
   > ```
   > 
   > What do you think?
   
   > which means both adding many new functions as well as that they won't work 
for user defined aggregate functions
   
   I think they can build the function with macro in function-aggregate, so 
they can have their function with ExprBuilder extension. But, then I also need 
to move the macro to `datafusion-expr` 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-22 Thread via GitHub


alamb commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1610577152


##
datafusion-examples/examples/udaf_expr.rs:
##
@@ -0,0 +1,31 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion::functions_aggregate::expr_fn::{first_value, 
first_value_builder};
+
+use datafusion_common::Result;
+use datafusion_expr::col;
+
+#[tokio::main]
+async fn main() -> Result<()> {

Review Comment:
   I recommend moving these examples into the doc comments to make them easier 
to find



##
datafusion/functions-aggregate/src/expr_builder.rs:
##
@@ -0,0 +1,89 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::sync::Arc;
+
+use datafusion_expr::{expr::AggregateFunction, Expr};
+use sqlparser::ast::NullTreatment;
+
+/// Builder for creating an aggregate function expression

Review Comment:
   I suggest:
   1. Add a docstring example here
   2. Consider putting this in `datafusion/expr` (not 
`datafusion-functions-aggregate`) -- if we put it in 
`datafusion-functions-aggregate` people can't use it unless they use the built 
in aggregates of DataFusion 🤔 



##
datafusion/functions-aggregate/src/macros.rs:
##
@@ -48,49 +48,98 @@ macro_rules! make_udaf_expr_and_func {
 None,
 ))
 }
+
+create_builder!(
+$EXPR_FN,
+$($arg)*,
+$DOC,
+$AGGREGATE_UDF_FN
+);
+
 create_func!($UDAF, $AGGREGATE_UDF_FN);
 };
-($UDAF:ty, $EXPR_FN:ident, $($arg:ident)*, $distinct:ident, $DOC:expr, 
$AGGREGATE_UDF_FN:ident) => {

Review Comment:
   Instead of `count_distinct_builder` what do you think about adding a 
`distinct` type method instead?
   
   Like
   ```rust
   let agg = count_builder()
 .args(col("a"))
 .distinct()
 .build()?
   ```
   
   🤔 



##
datafusion/functions-aggregate/src/expr_builder.rs:
##
@@ -0,0 +1,89 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::sync::Arc;
+
+use datafusion_expr::{expr::AggregateFunction, Expr};
+use sqlparser::ast::NullTreatment;
+
+/// Builder for creating an aggregate function expression
+///
+/// Has the same arguments from [AggregateFunction]
+pub struct ExprBuilder {

Review Comment:
   I wonder if we should call it `AggBuilder` 🤔  or AggregeExprBuilder 🤔 



##
datafusion/functions-aggregate/Cargo.toml:
##
@@ -39,6 +39,7 @@ path = "src/lib.rs"
 
 [dependencies]
 arrow = { workspace = true }
+concat-idents = "1.1.5"

Review Comment:
   Would it be possible to use https://docs.rs/paste/latest/paste/ (which is 
already a dependency) instead? I think tha

Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-21 Thread via GitHub


alamb commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2123673946

   THanks @jayzhan211  -- I will plan to review this tomorrow. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-18 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1605955848


##
docs/source/user-guide/expressions.md:
##
@@ -304,6 +304,16 @@ select log(-1), log(0), sqrt(-1);
 | rollup(exprs) | Creates 
a grouping set for rollup sets. 
|
 | sum(expr) | 
Сalculates the sum of `expr`.   
|
 
+## Aggregate Function Builder
+
+Another builder expression that ends with `build()`, it is useful if the 
functions has multiple optional arguments
+
+See datafusion-examples/examples/udaf_expr.rs for example usage.
+
+| Syntax | Equivalent to   
|
+| -- | 
--- |
+| first_value_builder(expr).order_by(vec![expr]).build() | first_value(expr, 
Some(vec![expr])) |

Review Comment:
   I think we can add non-trivial examples in doc, but not all of them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1605623477


##
datafusion/proto/tests/cases/roundtrip_logical_plan.rs:
##
@@ -621,9 +623,12 @@ async fn roundtrip_expr_api() -> Result<()> {
 lit(1),
 ),
 array_replace_all(make_array(vec![lit(1), lit(2), lit(3)]), lit(2), 
lit(4)),
-first_value(vec![lit(1)], false, None, None, None),
+first_value(lit(1), Some(vec![lit(2)])),
+first_value_builder(lit(1)).order_by(vec![lit(3)]).build(),
 covar_samp(lit(1.5), lit(2.2)),
+covar_samp_builder(lit(1.5), lit(2.3)).build(),
 covar_pop(lit(1.5), lit(2.2)),
+covar_pop_builder(lit(1.5), lit(2.3)).build(),

Review Comment:
   I think roundtrip tests also cover the testing for these expression APIs.
   
   I expect we add all the functions here for test, and only little functions 
in `datafusion-example`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


alamb commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1605497952


##
datafusion/proto/tests/cases/roundtrip_logical_plan.rs:
##
@@ -31,8 +31,10 @@ use datafusion::datasource::TableProvider;
 use datafusion::execution::context::SessionState;
 use datafusion::execution::runtime_env::{RuntimeConfig, RuntimeEnv};
 use datafusion::execution::FunctionRegistry;
-use datafusion::functions_aggregate::covariance::{covar_pop, covar_samp};
-use datafusion::functions_aggregate::expr_fn::first_value;
+use datafusion::functions_aggregate::expr_fn::{

Review Comment:
   Could we add some documentation with examples somewhere? That would both 
serve to help document and test this feature.
   
   But thank you @jayzhan211  -- this is looking very cool



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


alamb commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2118310034

   This looks super cool @jayzhan211 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


jayzhan211 commented on PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#issuecomment-2117670563

   I plan to add `distinct` macro in the follow up PR for `count`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


jayzhan211 commented on code in PR #10560:
URL: https://github.com/apache/datafusion/pull/10560#discussion_r1605052852


##
datafusion/functions-aggregate/src/macros.rs:
##
@@ -48,49 +48,98 @@ macro_rules! make_udaf_expr_and_func {
 None,
 ))
 }
+
+create_builder!(
+$EXPR_FN,
+$($arg)*,
+$DOC,
+$AGGREGATE_UDF_FN
+);
+
 create_func!($UDAF, $AGGREGATE_UDF_FN);
 };
-($UDAF:ty, $EXPR_FN:ident, $($arg:ident)*, $distinct:ident, $DOC:expr, 
$AGGREGATE_UDF_FN:ident) => {

Review Comment:
   I plan to introduce another macro for distinct, so we have `count_distinct`, 
`count_distinct_builder` expr::fn.
   
   I change to `order_by` because `first_value` needs it. Also, I change the 
expression for `first_value` to single expression, since it does not expect 
variadic args.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



[PR] introduce expr builder for aggregate function [datafusion]

2024-05-17 Thread via GitHub


jayzhan211 opened a new pull request, #10560:
URL: https://github.com/apache/datafusion/pull/10560

   ## Which issue does this PR close?
   
   
   
   Closes #10545.
   
   ## Rationale for this change
   
   
   
   After this PR, there are two kinds of expression API.
   One is the normal one like `count(expr)`, convenient for expression that has 
few arguments.
   Another is builder mode, useful for expressions that expects multiple kind 
of arguments, like `first_value_builder(expr).order_by().build()`
   
   ## What changes are included in this PR?
   
   
   
   ## Are these changes tested?
   
   
   
   ## Are there any user-facing changes?
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org