Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-4011850642 I've opened a PR to extend arrow's record_batch! to support variables and expressions. When it is released I think we can continue with @ByteBaker's replacement PR to use upstream. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
ByteBaker commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-3431235544 Noted. Will make the necessary changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
alamb commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-3405889871 I think you should change DataFusion to sync with arrow-rs macro (basically remove the datafusion macro and use the arrow one) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
timsaucer commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-3406163128 I agree with @alamb . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
ByteBaker commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-3404435591 @alamb @buraksenn is this issue still on? Sorry I had been caught up last several months but now I'm able to contribute again. If everyone can agree on one of the two paths - Change arrow-rs to follow the same pattern as in datafusion - Change datafusion to sync with arrow-rs' macro I'll code them. If it's gonna be DF, I'll make necessary code sync and call changes as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
ByteBaker commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2773191102 @alamb @buraksenn I think that would be me, since I made the call to go with array literals than `vec![]` since the former avoids allocation. But now that I see it, should I make a patch in the upstream and change it there? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
alamb commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2776834231 Thanks @ByteBaker -- I don't have any strong opinion / advice here unfortunately -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
alamb commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2544099137 Thanks for looking at this @buraksenn I thought we more or less ported the macro from DataFusion to Arrow so the different is surprising, but I haven't had a chance to look into it in detail. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037:
URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2543312794
I've took another look into this and tried to change it but the issue is
that macro in arrow only accepts array literals such as:
[1,2,3]. However, in the datafusion macro it accepts vectors. One reference
of this macro is in this function:
```
let batches = vec![
create_record_batch(1, 5),
create_record_batch(6, 1),
create_record_batch(7, 5),
];
fn create_record_batch(start_value: i32, num_values: usize) -> RecordBatch {
let end_value = start_value + num_values as i32;
let a_vals: Vec = (start_value..end_value).collect();
let b_vals: Vec = a_vals.iter().map(|v| *v as f64).collect();
record_batch!(("a", Int32, a_vals), ("b", Float64, b_vals)).unwrap()
}
```
As far as I've tried, achieving the same is not possible with macro in arrow
crate in the same way. That's why I think we can update the macro in arrow to
the same in datafusion and then deprecate it here. But since I'm not Rust
expert not sure on this.
Arrow:
https://github.com/apache/arrow-rs/blob/c4dbf0d8af6ca5a19b8b2ea777da3c276807fc5e/arrow-array/src/record_batch.rs#L153-L174
Datafusion:
https://github.com/apache/datafusion/blob/bfabd48de2e0deccbd8ce595936ad24f18baca65/datafusion/common/src/test_util.rs#L338-L357
cc @timsaucer @alamb
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2527809229 > > Should this be done? I saw this #12846. I can delete create_array and record_batch macros to make it use the same as in arrow. Also they are also not identical: > > > > * Datafusion: ($(($name: expr, $type: ident, $values: expr)),*) > > * Arrow: ($(($name: expr, > > t > > y > > p > > e > > : > > i > > d > > e > > n > > t > > , > > [ > > ($values: expr),_])),_) > > > > @timsaucer @alamb ? > > I recommend: > > 1. Switch to use the upstream arrow one > 2. Deprecate (don't remove) the API in datafusion, per https://datafusion.apache.org/library-user-guide/api-health.html Will do as recommended. Thanks @alamb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
alamb commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2527749169 > Should this be done? I saw this #12846. I can delete create_array and record_batch macros to make it use the same as in arrow. Also they are also not identical: > > * Datafusion: ($(($name: expr, $type: ident, $values: expr)),*) > * Arrow: ($(($name: expr, > t > y > p > e > : > i > d > e > n > t > , > [ > ($values: expr),_])),_) > > @timsaucer @alamb ? I recommend: 1. Switch to use the upstream arrow one 2. Deprecate (don't remove) the API in datafusion, per https://datafusion.apache.org/library-user-guide/api-health.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2526291502 Should this be done? I saw this https://github.com/apache/datafusion/pull/12846. I can delete create_array and record_batch macros to make it use the same as in arrow. Also they are also not identical: - Datafusion: ($(($name: expr, $type: ident, $values: expr)),*) - Arrow: ($(($name: expr, $type: ident, [$($values: expr),*])),*) @timsaucer @alamb ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
timsaucer commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2428803455 Sounds great! It will probably take a while as we have to wait for the upstream to make a release, and then DF to update to that release version. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2428505334 I will keep an eye on the update and do the required change if it is okay. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [I] Remove record_batch! macro once upstream updates [datafusion]
buraksenn commented on issue #13037: URL: https://github.com/apache/datafusion/issues/13037#issuecomment-2428504105 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
