Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
gabotechs commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3022021502 > Duration doesn't really map 1:1 with IntervalDay. Duration has a single logical component for time, but IntervalDay has two logical components, time and days. 🤔 it still does not sound too bad to me. If you want to play it safe, one thing we can do is to take other types that are failing, like Float16, attempt to ship a solution based on Substrait UDTs, and if it turns out to be clean then port the same approach it to Duration/IntervalDay -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
jkosh44 commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3016010500 @gabotechs This response from Substrait makes me a little nervous about this approach: https://github.com/substrait-io/substrait/issues/822#issuecomment-3008350100 Duration doesn't really map 1:1 with IntervalDay. Duration has a single logical component for time, but IntervalDay has two logical components, time and days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
alamb merged PR #16503: URL: https://github.com/apache/datafusion/pull/16503 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
alamb commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3005915991 Thanks again @jkosh44 and @gabotechs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
jkosh44 commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3002105716 Just made the suggested updates. I don't 100% know how the substrait plans are used, but I am slightly worried about one thing with this approach. Would it be possible to construct a substrait plan that converts the Duration to an IntervalDay, execute some operation that adds some days to the interval, then attempts and fails to convert it back to a duration? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
alamb commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3004828625 > I don't 100% know how the substrait plans are used, but I am slightly worried about one thing with this approach. Would it be possible to construct a substrait plan that converts the Duration to an IntervalDay, execute some operation that adds some days to the interval, then attempts and fails to convert it back to a duration? I think you can always go back from Duration --> IntervalDay But in any case, I don't think this would be a substrait specific difference -- it would be in the DataFusion engine itself -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
gabotechs commented on code in PR #16503: URL: https://github.com/apache/datafusion/pull/16503#discussion_r2164581001 ## datafusion/substrait/src/variation_const.rs: ## @@ -55,6 +55,8 @@ pub const LARGE_CONTAINER_TYPE_VARIATION_REF: u32 = 1; pub const VIEW_CONTAINER_TYPE_VARIATION_REF: u32 = 2; pub const DECIMAL_128_TYPE_VARIATION_REF: u32 = 0; pub const DECIMAL_256_TYPE_VARIATION_REF: u32 = 1; +pub const DEFAULT_INTERVAL_DAY_TYPE_VARIATION: u32 = 0; +pub const DURATION_INTERVAL_DAY_TYPE_VARIATION: u32 = 1; Review Comment: Nice idea using the type variation capabilities, this looks good to me 👍. Nit 1: maybe prefixing this constants with `_REF` as all the others? Nit 2: maybe adding a comment explaining that the default variation maps to `DataType::Interval(DayTime)` and the other maps to `DataType::Duration`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
gabotechs commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-2998046913 Sure! I'll get it done in a couple of days. Thanks for submitting this! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]
alamb commented on PR #16503: URL: https://github.com/apache/datafusion/pull/16503#issuecomment-2997820357 Thank you @jkosh44 🙏 @gabotechs is there any chance you have time to review this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
