Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-28 Thread via GitHub


alamb merged PR #16122:
URL: https://github.com/apache/datafusion/pull/16122


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-28 Thread via GitHub


alamb commented on PR #16122:
URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2917166418

   Thanks @timsaucer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-23 Thread via GitHub


alamb commented on PR #16122:
URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2905409675

   > Can you expand on this a little? Was there a specific metric you were 
watching to see performance improvements or is it looking at the code that we 
roughly have the same number of allocation operations? Or something else?
   
   It was the latter -- mostly I was thinking we'd be able to reuse `FieldRef` 
more than Field. There are some improvements for sure but for some reason I 
expected more. I don't think this is a big deal but wanted to mention it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-23 Thread via GitHub


timsaucer commented on PR #16122:
URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2904018396

   > I had hoped this would be a bigger improvement, but I think it at least 
sets us up for being more efficient / less String cloning going forward
   
   Can you expand on this a little? Was there a specific metric you were 
watching to see performance improvements or is it looking at the code that we 
roughly have the same number of allocation operations? Or something else?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-22 Thread via GitHub


alamb commented on code in PR #16122:
URL: https://github.com/apache/datafusion/pull/16122#discussion_r2103200355


##
datafusion/functions/src/datetime/now.rs:
##
@@ -77,12 +77,13 @@ impl ScalarUDFImpl for NowFunc {
 &self.signature
 }
 
-fn return_field_from_args(&self, _args: ReturnFieldArgs) -> Result {
+fn return_field_from_args(&self, _args: ReturnFieldArgs) -> 
Result {
 Ok(Field::new(

Review Comment:
   So in theory we could update this code to create the `FieldRef` once on 
creation, and then return an `Arc::clone` rather than re-creating the Field 
each time -- perhaps we can do that as some follow on PRs. 



##
datafusion/expr/src/expr_fn.rs:
##
@@ -590,7 +591,7 @@ impl AggregateUDFImpl for SimpleAggregateUDF {
 (self.accumulator)(acc_args)
 }
 
-fn state_fields(&self, _args: StateFieldsArgs) -> Result> {
+fn state_fields(&self, _args: StateFieldsArgs) -> Result> {

Review Comment:
   It is nice that this is now avoiding a deep copy of a bunch of `Field`s 👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



[PR] Shift from Field to FieldRef for all user defined functions [datafusion]

2025-05-20 Thread via GitHub


timsaucer opened a new pull request, #16122:
URL: https://github.com/apache/datafusion/pull/16122

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/datafusion/issues/16121
   
   ## Rationale for this change
   
   With the switch from `DataType` to `Field` we may have some plans that 
create a lot of copy operations. This PR switches from `Field` to `FieldRef` 
for the user defined scalar, window, and aggregate functions.
   
   ## What changes are included in this PR?
   
   Changes to the API for the user defined functions and similar supporting 
operations throughout the code base.
   
   ## Are these changes tested?
   
   Tested via existing unit tests.
   
   ## Are there any user-facing changes?
   None
   
   TODO:
   
   - [ ] Update migration guide to describe changes necessary for end users
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org