Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-24 Thread via GitHub


kazantsev-maksim closed pull request #1864: Chore: implement predicate exprs as 
ScalarUDFImpl
URL: https://github.com/apache/datafusion-comet/pull/1864


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-23 Thread via GitHub


kazantsev-maksim commented on PR #1864:
URL: 
https://github.com/apache/datafusion-comet/pull/1864#issuecomment-2997231061

   > > Ci is failing because in (`iceberg_compat`)`initRecordBatchReader` we 
call `planner.createExpr` for predicates that are pushed down and the 
expressions are no longer there. @andygrove not sure how we should address this.
   > 
   > I didn't realize that @kazantsev-maksim and I were independently looking 
at the same task:
   > 
   > #1887
   > 
   > We do separate expression serialization in the Parquet reader, which would 
need to be brought into alignment as in #1887. The only issue I'm fighting on 
#1887 is that dictionary-encoded columns need to be unpacked first for the DF 
scalar function, which I am looking at fixing.
   
   I can close this PR and work on another task. Looks like @mbutrovich  is 
almost finished. @andygrove 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-23 Thread via GitHub


mbutrovich commented on PR #1864:
URL: 
https://github.com/apache/datafusion-comet/pull/1864#issuecomment-2996336512

   > Ci is failing because in (`iceberg_compat`)`initRecordBatchReader` we call 
`planner.createExpr` for predicates that are pushed down and the expressions 
are no longer there. @andygrove not sure how we should address this.
   
   I didn't realize that @kazantsev-maksim were independently looking at the 
same task:
   
   https://github.com/apache/datafusion-comet/pull/1887
   
   We do separate expression serialization in the Parquet reader, which would 
need to be brought into alignment as in #1887. The only issue I'm fighting on 
#1887 is that dictionary-encoded columns need to be unpacked first for the DF 
scalar function, which I am looking at fixing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-22 Thread via GitHub


kazantsev-maksim commented on code in PR #1864:
URL: https://github.com/apache/datafusion-comet/pull/1864#discussion_r2160393687


##
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##
@@ -952,32 +947,23 @@ object QueryPlanSerde extends Logging with CometExprShim {
   binding,
   (builder, binaryExpr) => builder.setRlike(binaryExpr))
 
-  case StartsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setStartsWith(binaryExpr))
-
-  case EndsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setEndsWith(binaryExpr))
-
-  case Contains(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setContains(binaryExpr))
+  case _: StartsWith =>
+val startsWithExpr = expr.asInstanceOf[StartsWith]

Review Comment:
   Fixed!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-13 Thread via GitHub


parthchandra commented on PR #1864:
URL: 
https://github.com/apache/datafusion-comet/pull/1864#issuecomment-2971784129

   Ci is failing because in  (`iceberg_compat`)`initRecordBatchReader` we call 
`planner.createExpr` for predicates that are pushed down and the expressions 
are no longer there.
   @andygrove not sure how we should address this. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-12 Thread via GitHub


andygrove commented on code in PR #1864:
URL: https://github.com/apache/datafusion-comet/pull/1864#discussion_r2143696076


##
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##
@@ -952,32 +947,23 @@ object QueryPlanSerde extends Logging with CometExprShim {
   binding,
   (builder, binaryExpr) => builder.setRlike(binaryExpr))
 
-  case StartsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setStartsWith(binaryExpr))
-
-  case EndsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setEndsWith(binaryExpr))
-
-  case Contains(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setContains(binaryExpr))
+  case _: StartsWith =>
+val startsWithExpr = expr.asInstanceOf[StartsWith]

Review Comment:
   Same comment for the other expressions being matched



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-12 Thread via GitHub


andygrove commented on code in PR #1864:
URL: https://github.com/apache/datafusion-comet/pull/1864#discussion_r2143694940


##
spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala:
##
@@ -952,32 +947,23 @@ object QueryPlanSerde extends Logging with CometExprShim {
   binding,
   (builder, binaryExpr) => builder.setRlike(binaryExpr))
 
-  case StartsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setStartsWith(binaryExpr))
-
-  case EndsWith(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setEndsWith(binaryExpr))
-
-  case Contains(left, right) =>
-createBinaryExpr(
-  expr,
-  left,
-  right,
-  inputs,
-  binding,
-  (builder, binaryExpr) => builder.setContains(binaryExpr))
+  case _: StartsWith =>
+val startsWithExpr = expr.asInstanceOf[StartsWith]

Review Comment:
   The `asInstanceOf` cast can be avoided here:
   
   ```suggestion
 case startsWithExpr: StartsWith =>
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Chore: implement predicate exprs as ScalarUDFImpl [datafusion-comet]

2025-06-08 Thread via GitHub


codecov-commenter commented on PR #1864:
URL: 
https://github.com/apache/datafusion-comet/pull/1864#issuecomment-2954088958

   ## 
[Codecov](https://app.codecov.io/gh/apache/datafusion-comet/pull/1864?dropdown=coverage&src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 Report
   Attention: Patch coverage is `0%` with `18 lines` in your changes missing 
coverage. Please review.
   > Project coverage is 32.23%. Comparing base 
[(`f09f8af`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/f09f8af64c6599255e116a376f4f008f2fd63b43?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 to head 
[(`e75a290`)](https://app.codecov.io/gh/apache/datafusion-comet/commit/e75a2907fad6f6f10f6055926144552d4574989a?dropdown=coverage&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   > Report is 247 commits behind head on main.
   
   | [Files with missing 
lines](https://app.codecov.io/gh/apache/datafusion-comet/pull/1864?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 | Patch % | Lines |
   |---|---|---|
   | 
[.../scala/org/apache/comet/serde/QueryPlanSerde.scala](https://app.codecov.io/gh/apache/datafusion-comet/pull/1864?src=pr&el=tree&filepath=spark%2Fsrc%2Fmain%2Fscala%2Forg%2Fapache%2Fcomet%2Fserde%2FQueryPlanSerde.scala&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-c3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9jb21ldC9zZXJkZS9RdWVyeVBsYW5TZXJkZS5zY2FsYQ==)
 | 0.00% | [15 Missing and 3 partials :warning: 
](https://app.codecov.io/gh/apache/datafusion-comet/pull/1864?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
 |
   
   Additional details and impacted files
   
   
   ```diff
   @@  Coverage Diff  @@
   ##   main#1864   +/-   ##
   =
   - Coverage 56.12%   32.23%   -23.90% 
   + Complexity  976  651  -325 
   =
 Files   119  130   +11 
 Lines 1174312646  +903 
 Branches   2251 2359  +108 
   =
   - Hits   6591 4076 -2515 
   - Misses 4012 7778 +3766 
   + Partials   1140  792  -348 
   ```
   
   
   
   [:umbrella: View full report in Codecov by 
Sentry](https://app.codecov.io/gh/apache/datafusion-comet/pull/1864?dropdown=coverage&src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
   :loudspeaker: Have feedback on the report? [Share it 
here](https://about.codecov.io/codecov-pr-comment-feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache).
   
:rocket: New features to boost your workflow: 
   
   - :snowflake: [Test 
Analytics](https://docs.codecov.com/docs/test-analytics): Detect flaky tests, 
report on failures, and find test suite problems.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]