CuteChuanChuan opened a new pull request, #19994:
URL: https://github.com/apache/datafusion/pull/19994

   ## Which issue does this PR close?
   
   - Closes #11570.
   
   ## Rationale for this change
   
   The `CaseExpr` implementation is expensive. A common usage pattern 
(particularly in TPC-DS benchmarks) is to protect against divide-by-zero:
   
   ```sql
   CASE WHEN y > 0 THEN x / y ELSE NULL END
   ```
   
   This entire expression can be replaced with a simpler divide operation that 
returns NULL when the divisor is zero, avoiding the overhead of full CASE 
evaluation.
   
   ## What changes are included in this PR?
   
   1. New `EvalMethod::DivideByZeroProtection` variant: a specialization for the 
divide-by-zero protection pattern.
   2. Pattern detection: recognizes expressions such as:
     - `CASE WHEN y > 0 THEN x / y ELSE NULL END`
     - `CASE WHEN y != 0 THEN x / y ELSE NULL END`
     - `CASE WHEN 0 < y THEN x / y ELSE NULL END`
   3. Critical validation: ensures the divisor in the division matches the 
operand being checked (addresses feedback from PR #12049).
   4. Safe division implementation: uses Arrow kernels to perform a division 
that returns NULL on zero:
     - `eq` to create a mask of zero divisors
     - `zip` to replace zeros with ones (avoiding a division error)
     - `div` to perform the division
     - `nullif` to set NULL where the divisor was zero
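
The four kernel steps can be modeled in plain Rust as follows. This is a minimal sketch rather than the PR's code: `safe_divide` is a hypothetical name, `Option<i64>` stands in for a nullable Arrow value, and ordinary iterators stand in for the `eq`, `zip`, `div`, and `nullif` kernels. It mirrors the `y != 0` form; with `y > 0` the original CASE also yields NULL for negative divisors, which this sketch does not model.

```rust
/// Plain-Rust model of the four steps. `Option<i64>` stands in for a nullable
/// Arrow value; this is an illustration, NOT the Arrow kernel API.
fn safe_divide(x: &[Option<i64>], y: &[Option<i64>]) -> Vec<Option<i64>> {
    assert_eq!(x.len(), y.len(), "inputs must have equal length");

    // Step 1 (eq): mask is true where the divisor equals zero.
    let zero_mask: Vec<bool> = y.iter().map(|v| *v == Some(0)).collect();

    // Step 2 (zip): replace zero divisors with one so the division cannot fail.
    let safe_y: Vec<Option<i64>> = y
        .iter()
        .zip(&zero_mask)
        .map(|(v, &is_zero)| if is_zero { Some(1) } else { *v })
        .collect();

    // Step 3 (div): element-wise division; a NULL on either side gives NULL.
    // `checked_div` also returns None on overflow (i64::MIN / -1).
    let quotient: Vec<Option<i64>> = x
        .iter()
        .zip(&safe_y)
        .map(|(a, b)| match (a, b) {
            (Some(a), Some(b)) => a.checked_div(*b),
            _ => None,
        })
        .collect();

    // Step 4 (nullif): force NULL wherever the original divisor was zero.
    quotient
        .into_iter()
        .zip(zero_mask)
        .map(|(q, is_zero)| if is_zero { None } else { q })
        .collect()
}
```

For example, dividing `[10, 7, NULL]` by `[2, 0, 3]` with this sketch yields `[5, NULL, NULL]`: the zero divisor is masked out in step 4, and the NULL dividend propagates through step 3.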
   
   ## Are these changes tested?
   
   Yes, added two new tests:
   - `test_divide_by_zero_protection_specialization`: verifies that the pattern 
is detected and the results are correct.
   - `test_divide_by_zero_protection_specialization_not_applied`: verifies that 
the optimization is NOT applied when the divisor doesn't match the checked 
operand (key feedback from PR #12049).
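
The divisor-matching rule that the second test exercises can be sketched as below. This is an illustrative model under stated assumptions, not the PR's implementation: `Expr`, `Op`, `col`, `binary`, `checked_operand`, and `is_divide_by_zero_protection` are all hypothetical names, and the miniature expression tree is not DataFusion's expression type.

```rust
/// Hypothetical miniature expression tree, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Null,
    Binary(Box<Expr>, Op, Box<Expr>),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Gt,
    Lt,
    NotEq,
    Divide,
}

/// Convenience constructors (hypothetical helpers).
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary(Box::new(left), op, Box::new(right))
}

/// Returns the operand compared against zero if `when` has one of the shapes
/// `y > 0`, `y != 0`, or `0 < y`.
fn checked_operand(when: &Expr) -> Option<&Expr> {
    match when {
        Expr::Binary(l, Op::Gt | Op::NotEq, r) if **r == Expr::Literal(0) => Some(&**l),
        Expr::Binary(l, Op::Lt, r) if **l == Expr::Literal(0) => Some(&**r),
        _ => None,
    }
}

/// True only for `CASE WHEN <check on y> THEN x / y ELSE NULL END` where the
/// divisor is the same expression that the WHEN clause checks.
fn is_divide_by_zero_protection(when: &Expr, then: &Expr, otherwise: &Expr) -> bool {
    let checked = match checked_operand(when) {
        Some(expr) => expr,
        None => return false,
    };
    match (then, otherwise) {
        (Expr::Binary(_, Op::Divide, divisor), Expr::Null) => **divisor == *checked,
        _ => false,
    }
}
```

In this sketch, `CASE WHEN z > 0 THEN x / y ELSE NULL END` is rejected because the checked operand `z` differs from the divisor `y`, so the expression would fall back to ordinary CASE evaluation.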
   
   ## Are there any user-facing changes?
   
   No. This is an internal optimization that produces the same results but with 
better performance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

