Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-06 Thread via GitHub


jonahgao commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1667354250


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   Filed #11303



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


jonahgao commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1667250109


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   @jayzhan211 I'll file a new issue about it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


jonahgao merged PR #11256:
URL: https://github.com/apache/datafusion/pull/11256


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


jayzhan211 commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1667215728


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   count for multiple args are not well-supported, maybe we should return error 
for this?
   
   ```
   statement ok
   create table t(a int, b int) as values (1, null), (null, 2), (3, null);
   
   query I
   select count(a, b) from t;
   
   0
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


findepi commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2211376092

   thanks @jonahgao for your review! updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1667103815


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   i didn't know this exists. The code defines `count` as variadic. The 
semantics of 2+-arg count don't seem to be documented and are not easy to grasp 
around effect of nulls.
   
   For now, i will just drop the arg size condition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


jonahgao commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666430879


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   
   > I don't know of `count(a,b)` function. If it is added in the future, would 
it be necessarily always producing non-null values?
   
   It looks like it's already supported. I think it is also non-nullable.
   ```
   DataFusion CLI v39.0.0
   > create table t(a int, b int) as values(NULL, NULL);
   0 row(s) fetched.
   Elapsed 0.007 seconds.
   
   > select count(distinct a, b) from t;
   +-+
   | count(DISTINCT t.a,t.b) |
   +-+
   | 0   |
   +-+
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666408825


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   > I think `args.len() <= 1` is unnecessary, or am I overlooking something?樂
   
   I don't know of `count(a,b)` function. If it is added in the future, would 
it be necessarily always producing non-null values?
   
   > Instead of pattern matching all the possible function name [...]
   
   absolutely agree, that's why i add this comment: 
https://github.com/apache/datafusion/pull/11256#discussion_r1666020300
   thanks for filing the issue
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-05 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666407091


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability

Review Comment:
   copied into https://github.com/apache/datafusion/issues/11274



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


jayzhan211 commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666217226


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   I file #11274 for this



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


jayzhan211 commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666204769


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   Instead of pattern matching all the possible function name, we can add 
another method in UDAFImpl, so we can customize the nullability for each 
function, as least the non-null property also applied to arra_agg, min, max, 
and probably more.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


jonahgao commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666162520


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability
+AggregateFunctionDefinition::UDF(udf)
+if udf.name() == "count" && args.len() <= 1 =>

Review Comment:
   I think `args.len() <= 1` is unnecessary, or am I overlooking something?樂
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666030287


##
datafusion/optimizer/tests/optimizer_integration.rs:
##
@@ -294,6 +294,21 @@ fn eliminate_nested_filters() {
 assert_eq!(expected, format!("{plan:?}"));
 }
 
+#[test]
+fn eliminate_redundant_null_check() {
+let sql = "\
+SELECT col_int32, count(*) c
+FROM test
+GROUP BY col_int32
+HAVING c IS NOT NULL";
+let plan = test_sql(sql).unwrap();
+let expected = "\
+  Projection: test.col_int32, count(*) AS c\
+\n  Aggregate: groupBy=[[test.col_int32]], aggr=[[count(Int64(1)) AS 
count(*)]]\
+\nTableScan: test projection=[col_int32]";
+assert_eq!(expected, format!("{plan:?}"));

Review Comment:
   filed as https://github.com/apache/datafusion/issues/11270



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1666020300


##
datafusion/expr/src/expr_schema.rs:
##
@@ -322,10 +322,16 @@ impl ExprSchemable for Expr {
 }
 }
 Expr::Cast(Cast { expr, .. }) => expr.nullable(input_schema),
-Expr::AggregateFunction(AggregateFunction { func_def, .. }) => {
+Expr::AggregateFunction(AggregateFunction { func_def, args, .. }) 
=> {
 match func_def {
+// TODO: min/max over non-nullable input should be 
recognized as non-nullable
 AggregateFunctionDefinition::BuiltIn(fun) => 
fun.nullable(),
 // TODO: UDF should be able to customize nullability

Review Comment:
   To address this TODO we could extend the pattern here
   
https://github.com/apache/datafusion/blob/4bc322819f28695b6b9593d5bbaea775e7d798b6/datafusion/expr/src/aggregate_function.rs#L84-L88
   
   ie now we have `return_type(input_types, nullability) -> return_type`
   we could have `return_type(input_types, nullability) -> (return_type, 
nullability)`
   
   this is an API for builtin functions, we would need similar API for UDAFs
   
https://github.com/apache/datafusion/blob/5f02c8a0658d9ed012ca4ca91f0417a967fe094a/datafusion/expr/src/udaf.rs#L330
   
   Obviously, this would be a breaking change: both method signature and return 
type would change.
   We can avoid the breaking change by making nullability a separate method, 
but it would be good to first determine the ideal end-state, and only then 
think how to get there.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


findepi commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2209470440

   @jonahgao thanks! 
   I think we may later choose to add a similar optimizer to the one that was 
proposed here for the sake of filter expressions. Let's not add one if not 
needed today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-04 Thread via GitHub


jonahgao commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2209124848

   I think there might be a simpler way to achieve this.
   
   We can declare the count aggregation expr as non-nullable like 
[this](https://github.com/apache/datafusion/commit/21d01196d8e76ad6df145e378c755af528569b0e),
 and then the simplify_expressions rule and the eliminate_filter rule will 
handle subsequent processing.
   
   
https://github.com/apache/datafusion/blob/6e637488188c6620ecd113bf47987bd98f6d7871/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L1450-L1454
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


alamb commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2207336875

   Thank you very much @findepi  -- for both this PR and your other recent 
contributions.  I will indeed review this PR, though I may not get to it until 
tomorrow or Friday


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


findepi commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2207206104

   @alamb PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


findepi commented on code in PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#discussion_r1664640979


##
datafusion/optimizer/tests/optimizer_integration.rs:
##
@@ -294,6 +294,21 @@ fn eliminate_nested_filters() {
 assert_eq!(expected, format!("{plan:?}"));
 }
 
+#[test]
+fn eliminate_redundant_null_check() {
+let sql = "\
+SELECT col_int32, count(*) c
+FROM test
+GROUP BY col_int32
+HAVING c IS NOT NULL";
+let plan = test_sql(sql).unwrap();
+let expected = "\
+  Projection: test.col_int32, count(*) AS c\
+\n  Aggregate: groupBy=[[test.col_int32]], aggr=[[count(Int64(1)) AS 
count(*)]]\
+\nTableScan: test projection=[col_int32]";
+assert_eq!(expected, format!("{plan:?}"));

Review Comment:
   Note: `assert_eq!` names its parameters "left" and "right" and doesn't seem 
to designate which one is expected and which is actual. 
   However, when a test is run from RustRover, the IDE is apparently 
opinionated and currently expects the "right" to be "the expected":
   
   https://github.com/apache/datafusion/assets/144328/989d9c55-f032-49f0-880f-4bc93f672604;>
   
   I put the expected on the left, since this is what the existing code does.
   
   Some RustRover references 
   
   - https://github.com/intellij-rust/intellij-rust/issues/3814
   - https://github.com/intellij-rust/intellij-rust/pull/3848
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


findepi commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2207004248

   Added unit and integration tests. Should be ready to review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


findepi commented on PR #11256:
URL: https://github.com/apache/datafusion/pull/11256#issuecomment-2206765177

   this is a draft because there are no tests in this PR.
   
   still review feedback would be very welcomed -- am i going in the right 
direction?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



[PR] Infer count() aggregation is not null [datafusion]

2024-07-03 Thread via GitHub


findepi opened a new pull request, #11256:
URL: https://github.com/apache/datafusion/pull/11256

   `count([DISTINCT] [expr])` aggregate function never returns null. Infer 
non-nullness of such aggregate expression. This allows elimination of the 
HAVING filter for a query such as
   
   SELECT ... count(*) AS c
   FROM ...
   GROUP BY ...
   HAVING c IS NOT NULL
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org