Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]
alamb merged PR #16122: URL: https://github.com/apache/datafusion/pull/16122 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]
alamb commented on PR #16122: URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2917166418 Thanks @timsaucer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]
alamb commented on PR #16122: URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2905409675 > Can you expand on this a little? Was there a specific metric you were watching to see performance improvements or is it looking at the code that we roughly have the same number of allocation operations? Or something else? It was the latter -- mostly I was thinking we'd be able to reuse `FieldRef` more than Field. There are some improvements for sure but for some reason I expected more. I don't think this is a big deal but wanted to mention it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]
timsaucer commented on PR #16122: URL: https://github.com/apache/datafusion/pull/16122#issuecomment-2904018396 > I had hoped this would be a bigger improvement, but I think it at least sets us up for being more efficient / less String cloning going forward Can you expand on this a little? Was there a specific metric you were watching to see performance improvements or is it looking at the code that we roughly have the same number of allocation operations? Or something else? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Shift from Field to FieldRef for all user defined functions [datafusion]
alamb commented on code in PR #16122: URL: https://github.com/apache/datafusion/pull/16122#discussion_r2103200355 ## datafusion/functions/src/datetime/now.rs: ## @@ -77,12 +77,13 @@ impl ScalarUDFImpl for NowFunc { &self.signature } -fn return_field_from_args(&self, _args: ReturnFieldArgs) -> Result { +fn return_field_from_args(&self, _args: ReturnFieldArgs) -> Result { Ok(Field::new( Review Comment: So in theory we could update this code to create the `FieldRef` once on creation, and then return an `Arc::clone` rather than re-creating the Field each time -- perhaps we can do that as some follow on PRs. ## datafusion/expr/src/expr_fn.rs: ## @@ -590,7 +591,7 @@ impl AggregateUDFImpl for SimpleAggregateUDF { (self.accumulator)(acc_args) } -fn state_fields(&self, _args: StateFieldsArgs) -> Result> { +fn state_fields(&self, _args: StateFieldsArgs) -> Result> { Review Comment: It is nice that this is now avoiding a deep copy of a bunch of `Field`s 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
[PR] Shift from Field to FieldRef for all user defined functions [datafusion]
timsaucer opened a new pull request, #16122: URL: https://github.com/apache/datafusion/pull/16122 ## Which issue does this PR close? - Closes https://github.com/apache/datafusion/issues/16121 ## Rationale for this change With the switch from `DataType` to `Field` we may have some plans that create a lot of copy operations. This PR switches from `Field` to `FieldRef` for the user defined scalar, window, and aggregate functions. ## What changes are included in this PR? Changes to the API for the user defined functions and similar supporting operations throughout the code base. ## Are these changes tested? Tested via existing unit tests. ## Are there any user-facing changes? None TODO: - [ ] Update migration guide to describe changes necessary for end users -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org