Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-30 Thread via GitHub


gabotechs commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3022021502

   > Duration doesn't really map 1:1 with IntervalDay. Duration has a single 
logical component for time, but IntervalDay has two logical components, time 
and days.
   
   🤔 it still does not sound too bad to me.
   
   If you want to play it safe, one thing we can do is to take other types that 
are failing, like Float16, attempt to ship a solution based on Substrait UDTs, 
and if it turns out to be clean then port the same approach it to 
Duration/IntervalDay


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-28 Thread via GitHub


jkosh44 commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3016010500

   @gabotechs This response from Substrait makes me a little nervous about this 
approach: 
https://github.com/substrait-io/substrait/issues/822#issuecomment-3008350100
   
   Duration doesn't really map 1:1 with IntervalDay. Duration has a single 
logical component for time, but IntervalDay has two logical components, time 
and days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-25 Thread via GitHub


alamb merged PR #16503:
URL: https://github.com/apache/datafusion/pull/16503


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-25 Thread via GitHub


alamb commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3005915991

   Thanks again @jkosh44 and @gabotechs 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-25 Thread via GitHub


jkosh44 commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3002105716

   Just made the suggested updates. 
   
   I don't 100% know how the substrait plans are used, but I am slightly 
worried about one thing with this approach. Would it be possible to construct a 
substrait plan that converts the Duration to an IntervalDay, execute some 
operation that adds some days to the interval, then attempts and fails to 
convert it back to a duration?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-25 Thread via GitHub


alamb commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-3004828625

   > I don't 100% know how the substrait plans are used, but I am slightly 
worried about one thing with this approach. Would it be possible to construct a 
substrait plan that converts the Duration to an IntervalDay, execute some 
operation that adds some days to the interval, then attempts and fails to 
convert it back to a duration?
   
   I think you can always go back from Duration --> IntervalDay
   
   But in any case, I don't think this would be a substrait specific difference 
-- it would be in the DataFusion engine itself


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-24 Thread via GitHub


gabotechs commented on code in PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#discussion_r2164581001


##
datafusion/substrait/src/variation_const.rs:
##
@@ -55,6 +55,8 @@ pub const LARGE_CONTAINER_TYPE_VARIATION_REF: u32 = 1;
 pub const VIEW_CONTAINER_TYPE_VARIATION_REF: u32 = 2;
 pub const DECIMAL_128_TYPE_VARIATION_REF: u32 = 0;
 pub const DECIMAL_256_TYPE_VARIATION_REF: u32 = 1;
+pub const DEFAULT_INTERVAL_DAY_TYPE_VARIATION: u32 = 0;
+pub const DURATION_INTERVAL_DAY_TYPE_VARIATION: u32 = 1;

Review Comment:
   Nice idea using the type variation capabilities, this looks good to me 👍.
   
   Nit 1: maybe prefixing this constants with `_REF` as all the others?
   Nit 2: maybe adding a comment explaining that the default variation maps to 
`DataType::Interval(DayTime)` and the other maps to `DataType::Duration`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-23 Thread via GitHub


gabotechs commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-2998046913

   Sure! I'll get it done in a couple of days. Thanks for submitting this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Add support for Arrow Duration type in Substrait [datafusion]

2025-06-23 Thread via GitHub


alamb commented on PR #16503:
URL: https://github.com/apache/datafusion/pull/16503#issuecomment-2997820357

   Thank you @jkosh44  🙏 
   
   @gabotechs  is there any chance you have time to review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]