Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-12 Thread via GitHub


alamb closed issue #15387: Trivial WHERE filter not eliminated when combined 
with CTE
URL: https://github.com/apache/datafusion/issues/15387


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-04 Thread via GitHub


ding-young commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2780253832

   take





Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-04 Thread via GitHub


alamb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2779011846

   Here is a simpler reproducer showing the `x = x` filter is still present and 
can be replaced with `x IS NOT NULL`:
   
   ```sql
   > create table foo (x int)
   as values (1), (2), (null);
   0 row(s) fetched.
   Elapsed 0.041 seconds.
   
   > select * from foo;
   +------+
   | x    |
   +------+
   | 1    |
   | 2    |
   | NULL |
   +------+
   3 row(s) fetched.
   Elapsed 0.002 seconds.
   
   > select * from foo where x = x;
   +---+
   | x |
   +---+
   | 1 |
   | 2 |
   +---+
   2 row(s) fetched.
   Elapsed 0.009 seconds.
   
   
   > select * from foo where x IS NOT NULL;
   +---+
   | x |
   +---+
   | 1 |
   | 2 |
   +---+
   2 row(s) fetched.
   Elapsed 0.002 seconds.
   
   > explain select * from foo where x = x;
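   +---------------+-------------------------------+
   | plan_type     | plan                          |
   +---------------+-------------------------------+
   | physical_plan | ┌───────────────────────────┐ |
   |               | │    CoalesceBatchesExec    │ |
   |               | │                           │ |
   |               | │    target_batch_size:     │ |
   |               | │           8192            │ |
   |               | └─────────────┬─────────────┘ |
   |               | ┌─────────────┴─────────────┐ |
   |               | │        FilterExec         │ |
   |               | │                           │ |
   |               | │     predicate: x = x      │ |
   |               | └─────────────┬─────────────┘ |
   |               | ┌─────────────┴─────────────┐ |
   |               | │      DataSourceExec       │ |
   |               | │                           │ |
   |               | │        bytes: 176         │ |
   |               | │      format: memory       │ |
   |               | │          rows: 1          │ |
   |               | └───────────────────────────┘ |
   |               |                               |
   +---------------+-------------------------------+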
   1 row(s) fetched.
   Elapsed 0.006 seconds.
   ```
   
   
   So a sketch for the solution to this issue is:
   1. Add a rule in `ExprSimplifier` for `<expr> = <expr>` --> `<expr> IS NOT 
NULL` in this match statement: 
https://github.com/apache/datafusion/blob/2cd6ed99dab90ca73497374e860f89e6fe83af1d/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L730-L738
   2. Add tests in slt (perhaps like above)
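   
   The rewrite in step 1 can be sketched in a self-contained way. This is only an illustration under stated assumptions: the `Expr` enum below is a toy type, not DataFusion's `Expr`, and `simplify_filter_predicate`, `col`, and `eq` are hypothetical names, not part of `ExprSimplifier`. The sketch rewrites only the top level of a filter predicate, because `x = x` evaluates to NULL for a NULL input while `x IS NOT NULL` evaluates to false, and the two agree only where NULL and false are treated alike (as in `WHERE`). It also assumes every expression is deterministic.
   
   ```rust
   // Toy expression tree for illustration only; NOT DataFusion's `Expr`.
   #[derive(Debug, Clone, PartialEq)]
   enum Expr {
       Column { name: String, nullable: bool },
       Literal(Option<i64>), // `None` models SQL NULL
       Eq(Box<Expr>, Box<Expr>),
       IsNotNull(Box<Expr>),
       Bool(bool),
   }
   
   // Hypothetical helper: build a column reference.
   fn col(name: &str, nullable: bool) -> Expr {
       Expr::Column { name: name.to_string(), nullable }
   }
   
   // Hypothetical helper: build `l = r`.
   fn eq(l: Expr, r: Expr) -> Expr {
       Expr::Eq(Box::new(l), Box::new(r))
   }
   
   // Whether an expression can evaluate to NULL.
   fn is_nullable(e: &Expr) -> bool {
       match e {
           Expr::Column { nullable, .. } => *nullable,
           Expr::Literal(v) => v.is_none(),
           Expr::Eq(l, r) => is_nullable(l) || is_nullable(r),
           Expr::IsNotNull(_) | Expr::Bool(_) => false,
       }
   }
   
   // Rewrite the top level of a WHERE predicate:
   //   e = e  -->  true            (e can never be NULL)
   //   e = e  -->  e IS NOT NULL   (e may be NULL)
   // Anything else is returned unchanged. Nested occurrences are left
   // alone on purpose, since the rewrite is only valid in filter position.
   fn simplify_filter_predicate(pred: Expr) -> Expr {
       match pred {
           Expr::Eq(l, r) if l == r => {
               if is_nullable(&l) {
                   Expr::IsNotNull(l)
               } else {
                   Expr::Bool(true)
               }
           }
           other => other,
       }
   }
   
   fn main() {
       let x = col("x", true);
       let simplified = simplify_filter_predicate(eq(x.clone(), x));
       println!("{:?}", simplified);
   }
   ```
   
   A real rule would also need to skip volatile expressions (for example a random-number function compared with itself), which this toy model cannot express.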
   
   
   
   





Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-04 Thread via GitHub


alamb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2779013119

   As this now has a description and a suggested implementation, I think it is a 
good first issue for someone who wants to implement an optimization.





Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-04 Thread via GitHub


alamb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2779003076

   I will file a ticket





Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-04 Thread via GitHub


alamb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2778998957

   > But is that possible to evaluate via a simplifier? I'd think that in 
general we don't know that until execution time.
   
   We could simplify `x = x` to `x IS NOT NULL`, which is likely much faster to 
evaluate.
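   
   Why the two predicates select the same rows can be checked with a small model of SQL three-valued logic. This is a sketch, not DataFusion code: `sql_eq`, `sql_is_not_null`, and `filter_rows` are hypothetical helpers, and `Option<i64>` stands in for a nullable `int` column, with `None` playing the role of NULL.
   
   ```rust
   // SQL `a = b`: the result is NULL (None) when either side is NULL.
   fn sql_eq(a: Option<i64>, b: Option<i64>) -> Option<bool> {
       match (a, b) {
           (Some(l), Some(r)) => Some(l == r),
           _ => None,
       }
   }
   
   // SQL `a IS NOT NULL`: always true or false, never NULL.
   fn sql_is_not_null(a: Option<i64>) -> bool {
       a.is_some()
   }
   
   // WHERE keeps a row only when the predicate is TRUE;
   // FALSE and NULL both drop the row.
   fn filter_rows(
       rows: &[Option<i64>],
       pred: impl Fn(Option<i64>) -> Option<bool>,
   ) -> Vec<Option<i64>> {
       rows.iter()
           .copied()
           .filter(|&v| pred(v) == Some(true))
           .collect()
   }
   
   fn main() {
       // Same contents as the `foo` table in the reproducer: 1, 2, NULL.
       let foo = [Some(1), Some(2), None];
   
       let by_eq = filter_rows(&foo, |x| sql_eq(x, x));
       let by_not_null = filter_rows(&foo, |x| Some(sql_is_not_null(x)));
   
       // Both filters drop the NULL row and keep the rest.
       assert_eq!(by_eq, by_not_null);
       println!("{:?}", by_eq); // prints [Some(1), Some(2)]
   }
   ```
   
   The predicates differ only on the NULL row, where `x = x` yields NULL and `x IS NOT NULL` yields false. A filter discards the row in both cases, so the decision can be made from the expression shape alone, without waiting for execution time.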
   





Re: [I] Trivial WHERE filter not eliminated when combined with CTE [datafusion]

2025-04-03 Thread via GitHub


adriangb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2777214161

   But is that possible to evaluate via a simplifier? I'd think that in general 
we don't know that until execution time.
   
   

