Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2643315586
The release has been approved and published to crates.io: https://lists.apache.org/thread/73plk4nhdjby58knd5rvwsmsjzpbcg7s
> With 4 +1 votes (3 binding) the release is approved!
>
> The release is available here:
> https://dist.apache.org/repos/dist/release/datafusion/datafusion-45.0.0
>
> I have also published the release on [crates.io](http://crates.io/): https://crates.io/crates/datafusion/45.0.0
>
> Thanks everyone for your help / support -- I hope this one is epic (and a low drama release)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb closed issue #14008: Release DataFusion `45.0.0`
URL: https://github.com/apache/datafusion/issues/14008
Re: [I] Release DataFusion `45.0.0` [datafusion]
wiedld commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2637421384
@alamb -- release candidate re-tested on InfluxData, and is good.
Re: [I] Release DataFusion `45.0.0` [datafusion]
Omega359 commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2636935066
Beyond the type coercion issues, which I can work around, my testing is working ... as long as I don't compile with the 'release' target. That seems to segfault, I think somewhere in DF code, but I haven't yet been able to get a core dump from the crashing nodes to investigate further. I thought it was specific to Rust 1.84.1, but I retested with 1.83.0 and had the same issue. I don't think it's a blocker, but it is something I'm going to keep trying to narrow down.
Re: [I] Release DataFusion `45.0.0` [datafusion]
Omega359 commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2633412221
To be honest, I don't think I'll have the bandwidth to test a type coercion fix for UDFs myself this week. I'm about to fire off a full run of my application against the 45 branch, but I can likely only do that once this week.
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2632898299
> It would also be really nice to sort out this issue from [@shehabgamin](https://github.com/shehabgamin) but it isn't clear to me we can fix that with a small non-risky patch
>
> * [DataFusion Regression (Starting in v43): Type Coercion for UDF Arguments (Int --> String) #14230](https://github.com/apache/datafusion/issues/14230)
I think we came to a temporary solution here https://github.com/apache/datafusion/pull/14268#issuecomment-2632894156
CC @alamb @Omega359 @jayzhan211
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2632865717
I updated https://github.com/apache/datafusion/issues/14408#issue-2825591642, but I see the same issues as before with:
Test 2: Commit `26058ac` (DataFusion [`45.0.0-rc1`](https://github.com/apache/datafusion/tree/45.0.0-rc1)).
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2632246972
@alamb I'll test on Sail sometime today!
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2632204435
Ok, I think we are ready with a release candidate:
tag here: https://github.com/apache/datafusion/releases/tag/45.0.0-rc1
Release voting thread: https://lists.apache.org/thread/g20ywc9yto8xp07lcllmvgyn8g5z4420
Content
This release candidate is based on commit: 26058ac024095ad8852eb3a8ab707ac09a02e8d7 [1]
The proposed release tarball and signatures are hosted at [2].
The changelog is located at [3].
The standard verification procedure is documented at https://github.com/apache/datafusion/blob/main/dev/release/README.md#verifying-release-candidates.
[1]: https://github.com/apache/datafusion/tree/26058ac024095ad8852eb3a8ab707ac09a02e8d7
[2]: https://dist.apache.org/repos/dist/dev/datafusion/apache-datafusion-45.0.0-rc1
[3]: https://github.com/apache/datafusion/blob/26058ac024095ad8852eb3a8ab707ac09a02e8d7/CHANGELOG.md
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2631959164
Backport PRs (for my own OCD sanity)
- [x] https://github.com/apache/datafusion/pull/14453
- [ ] https://github.com/apache/datafusion/pull/14454
- [ ] https://github.com/apache/datafusion/pull/14455
- [ ] https://github.com/apache/datafusion/pull/14456
- [ ] https://github.com/apache/datafusion/pull/14457
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2631773065
@Omega359 has a fix for https://github.com/apache/datafusion/issues/11911
- https://github.com/apache/datafusion/pull/14449
This afternoon I will make some backport PRs and hopefully get a release candidate created
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2631357227
Update on the release:
# `main` is open for new code (will be `46.0.0`)
I have created a 45.0.0 branch from which to make the 45.0.0 release
- https://github.com/apache/datafusion/tree/branch-45
I would very much like to get fixes for the following issues into the 45 branch, as I think they are regressions that prevented some people from upgrading from 43 --> 44 and the fixes are fairly small:
- [ ] https://github.com/apache/datafusion/pull/14377 (Utf8View regression in 43)
- [ ] https://github.com/apache/datafusion/pull/14385 (regression from 43)
- [ ] https://github.com/apache/datafusion/pull/14387 (regression from 43)
- [ ] https://github.com/apache/datafusion/issues/11911 (Utf8View regression in 43) (mentioned by @Omega359 above)
It would also be really nice to sort out this issue from @shehabgamin, but it isn't clear to me we can fix that with a small non-risky patch
- https://github.com/apache/datafusion/issues/14230
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2631116639
> > Hi [@kevinjqliu](https://github.com/kevinjqliu) -- I am not sure what the plan for datafusion-python is
>
> I am currently working on building the 44 release. I hope to get that done this morning or tomorrow. It's generally run about a month behind this upstream repo
In case anyone is interested, here is the release thread:
- https://lists.apache.org/thread/gkk4lq6k19gcq9xw64mcmbxrnf68o95s
Re: [I] Release DataFusion `45.0.0` [datafusion]
timsaucer commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629425413
> Hi [@kevinjqliu](https://github.com/kevinjqliu) -- I am not sure what the plan for datafusion-python is
I am currently working on building the 44 release. I hope to get that done this morning or tomorrow. It's generally run about a month behind this upstream repo
Re: [I] Release DataFusion `45.0.0` [datafusion]
jayzhan211 commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629384465
> Digging into the code, I see that it's deprecated (it should still work even though it's deprecated). What's strange, however, is that the deprecation warning is not propagating to me as a downstream user. I only found this out due to a failed test.
I found that we can't simply deprecate this: it becomes a breaking change if the user overrides the `is_nullable` function, because it takes `schema: &dyn ExprSchema`, which is not the case for `return_type_from_exprs`
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629381090
Ok, given the feedback what I think we should do is:
1. Complete the testing on our release branch (`branch-45`) and backport any fixes needed
2. Continue development on main as previously planned.
Thanks @shehabgamin for the testing and reports. I'll take a more careful look tomorrow
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629379592
> hey @alamb quick question about the release process, are the releases for datafusion and datafusion-python in lockstep? I see the next release for datafusion is 45.0.0 meanwhile [datafusion-python's main branch is currently on 43.0.0](https://github.com/apache/datafusion-python/blob/8b513906315a0749b9f5cd6f34bf259ab4dd1add/Cargo.toml#L19-L20)
Hi @kevinjqliu -- I am not sure what the plan for datafusion-python is
The main branch seems to be on 44 to me 🤔 https://github.com/apache/datafusion-python/blob/8b513906315a0749b9f5cd6f34bf259ab4dd1add/Cargo.toml#L41-L44
@timsaucer updated it a few weeks ago
- https://github.com/apache/datafusion-python/pull/973
Any chance you can make a PR to test upgrading datafusion-python to 45?
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629314116
Another regression is that implementing the `is_nullable` function in the
`ScalarUDFImpl` trait no longer works. For example:
```
impl ScalarUDFImpl for SparkArray {
    ...
    fn is_nullable(&self, _args: &[Expr], _schema: &dyn ExprSchema) -> bool {
        false
    }
    ...
}
```
Digging into the code, I see that it's deprecated (it should still work even
though it's deprecated). What's strange, however, is that the deprecation
warning is not propagating to me as a downstream user. I only found this out
due to a failed test.
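A minimal, self-contained sketch of how an override of a deprecated trait method can be silently ignored. This is a toy model, not DataFusion code: `UdfLike`, `nullable_v2`, `SparkArrayLike`, and `framework_nullability` are hypothetical names invented for illustration. It assumes the framework has switched to calling a newer hook, and relies on the fact that rustc's deprecation lint is aimed at uses (calls) of a deprecated item, so an implementer who only overrides the old method may see no warning.

```rust
// Toy model only: `UdfLike` is a hypothetical stand-in, not `ScalarUDFImpl`.
trait UdfLike {
    /// Old hook. Deprecated, but implementers can still override it.
    #[deprecated(note = "use `nullable_v2` instead")]
    fn is_nullable(&self) -> bool {
        true
    }

    /// New hook (hypothetical name) that the framework now consults.
    fn nullable_v2(&self) -> bool {
        true
    }
}

struct SparkArrayLike;

impl UdfLike for SparkArrayLike {
    // Only the deprecated method is overridden. The deprecation lint targets
    // callers of the method, so this override compiles quietly.
    fn is_nullable(&self) -> bool {
        false
    }
}

/// Framework-side code that has migrated to the new hook and no longer
/// calls `is_nullable` at all.
fn framework_nullability(udf: &dyn UdfLike) -> bool {
    udf.nullable_v2()
}
```

In this model `framework_nullability(&SparkArrayLike)` returns the default `true` rather than the overridden `false`, so the first sign of trouble is a failing test rather than a compiler diagnostic.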
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629235506
> Derived TPC-DS Query 66 (https://github.com/lakehq/sail/blob/main/python/pysail/data/tpcds/queries/q66.sql) fails on Sail, when it previously did not.
Error is:
```
Exception: Error executing query q66 with error: Internal error: Physical input schema should be the same as the one converted from logical input schema. Differences:
- field nullability at index 7 [#98]: (physical) false vs (logical) true.
```
@alamb I created another PR that's dedicated only to DF 45. Could you please update the issue with this PR: https://github.com/lakehq/sail/pull/365
Additionally, I created a tracking issue: https://github.com/apache/datafusion/issues/14408
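To make the shape of that error concrete, here is a small sketch of a field-by-field nullability comparison between two schemas. It is an assumption-laden toy: `Field` and `nullability_differences` are hypothetical and do not reflect how DataFusion actually represents schemas or performs this check; only the wording of the difference line is modeled on the error above.

```rust
// Toy model only: this `Field` is a hypothetical stand-in, not Arrow's type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Compares two schemas position by position and reports every field whose
/// nullability differs. Fields beyond the shorter schema are ignored.
fn nullability_differences(physical: &[Field], logical: &[Field]) -> Vec<String> {
    physical
        .iter()
        .zip(logical.iter())
        .enumerate()
        .filter(|(_, (p, l))| p.nullable != l.nullable)
        .map(|(i, (p, l))| {
            format!(
                "field nullability at index {} [{}]: (physical) {} vs (logical) {}",
                i, p.name, p.nullable, l.nullable
            )
        })
        .collect()
}
```

A function that reports itself as non-nullable at physical planning time while the logical plan treats its output as nullable would produce exactly one such difference line for the affected column.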
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629154530
Derived TPC-DS Query 66 (https://github.com/lakehq/sail/blob/main/python/pysail/data/tpcds/queries/q66.sql) fails on Sail, when it previously did not.
Re: [I] Release DataFusion `45.0.0` [datafusion]
kevinjqliu commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629109322
Hey @alamb, quick question about the release process: are the releases for datafusion and datafusion-python in lockstep?
I see the next release for datafusion is 45.0.0, while [datafusion-python's main branch is currently on 43.0.0](https://github.com/apache/datafusion-python/blob/8b513906315a0749b9f5cd6f34bf259ab4dd1add/Cargo.toml#L19-L20).
Is the plan to release datafusion 45.0.0 and then upgrade datafusion-python to 45.0.0 too, taking on the new datafusion library version?
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629106680
Thanks @Omega359
The delta-rs upgrade seems to have gone pretty smoothly: https://github.com/delta-io/delta-rs/pull/3175
In terms of releasing DataFusion 45, I think it is better than 44, so I was planning to make the RC tomorrow, but I can wait a day or two if we want to make a push to finish up a few of those issues
> utf8view, i32 comparison no longer worked
I believe this is fixed in the following PR (I am waiting on another committer to approve and if they do I can backport it)
- https://github.com/apache/datafusion/pull/14377
> i32 -> utf8view coercion in regexp_like udf stopped working.
I believe this is tracked by
- https://github.com/apache/datafusion/issues/11911
But it is waiting on the next arrow release (which I just need to scrounge up one more PMC vote to approve and release). It might "just work" after that
- https://github.com/apache/arrow-rs/issues/6929
Re: [I] Release DataFusion `45.0.0` [datafusion]
Omega359 commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629085175
I just upgraded my project to latest main from DF 42. The primary
compilation and test suite issues I encountered after setting
`datafusion.execution.parquet.schema_force_view_types` -> false (I'm going to
fix all my .as_string_array(..) and utf8 datatype assumptions later):
utf8view, i32 comparison no longer worked
```
.with_column(
    STRING_FIELD,
    when(col(STRING_FIELD).eq(lit(83)), lit(82)).otherwise(col(STRING_FIELD))?,
)?
```
switching to the obvious change below worked
```
.with_column(
    STRING_FIELD,
    when(col(STRING_FIELD).eq(lit("83")), lit("82")).otherwise(col(STRING_FIELD))?,
)?
```
i32 -> utf8view coercion in UDFs stopped working. For example:
`regexp_like(field_a, '[^0-9]')`, where field_a is an int field, had to be changed to
`regexp_like(field_a::varchar, '[^0-9]')`
Auto coercion of date/timestamp to utf8 no longer worked. I had to update my
code to use the `to_char` function to properly write out the date/timestamp. Not a
bad thing really.
I'll likely do a full data run with this early next week.
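The `regexp_like` workaround above can be pictured with a toy model of a function that accepts only string input and performs no implicit coercion. Everything here is hypothetical and unrelated to DataFusion's real types: `Value`, `cast_to_utf8` (standing in for `field_a::varchar`), and `has_non_digit` (standing in for `regexp_like(x, '[^0-9]')`) are invented for illustration.

```rust
// Toy model only: these types are not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int32(i32),
    Utf8(String),
}

/// Explicit cast, standing in for `field_a::varchar`.
fn cast_to_utf8(v: &Value) -> Value {
    match v {
        Value::Int32(i) => Value::Utf8(i.to_string()),
        Value::Utf8(s) => Value::Utf8(s.clone()),
    }
}

/// Stand-in for `regexp_like(x, '[^0-9]')` without implicit coercion:
/// returns whether the string contains any character other than 0-9,
/// and rejects non-string input instead of converting it.
fn has_non_digit(v: &Value) -> Result<bool, String> {
    match v {
        Value::Utf8(s) => Ok(s.chars().any(|c| !c.is_ascii_digit())),
        other => Err(format!("expected Utf8 argument, got {:?}", other)),
    }
}
```

Calling `has_non_digit` directly on an `Int32` value fails, while casting first succeeds; with automatic argument coercion the cast would be inserted on the caller's behalf.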
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2629031385
I have created a branch:
- https://github.com/apache/datafusion/tree/branch-45
Let's start merging stuff to main again.
Before I make the RC I want to verify against delta-rs
If we are able to get the fixes for any of the following issues reviewed / merged to main, we can port them to `branch-45`
- https://github.com/apache/datafusion/pull/14377 (Utf8View papercut)
- https://github.com/apache/datafusion/pull/14385 (regression from 43)
- https://github.com/apache/datafusion/pull/14387 (regression from 43)
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2628949060
@wiedld tested with InfluxDB and this upgrade works for us
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2628923232
I have created a PR for a version increase:
- https://github.com/apache/datafusion/pull/14397
Once that is approved / merged I'll create a release-45 branch to make RCs from. We can do final testing / backporting any needed fixes there.
In my mind the following PRs would be great to get into 45, but since they all existed in 44 as well, they shouldn't block the 45 release
- https://github.com/apache/datafusion/pull/14377 (Utf8View papercut)
- https://github.com/apache/datafusion/pull/14385 (regression from 43)
- https://github.com/apache/datafusion/pull/14387 (regression from 43)
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2628544349
Release update: In order to avoid holding up merges to main too much longer, I plan to make a release branch tomorrow and we can backport individual fixes as needed.
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2627806359
Here is a proposed fix for https://github.com/apache/datafusion/issues/13510
- https://github.com/apache/datafusion/pull/14387
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2627581883
I have a proposed fix for https://github.com/apache/datafusion/issues/14154 up now
- https://github.com/apache/datafusion/pull/14385
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2627012274
I went through the list of issues again and, in my opinion, these are the only ones that should be fixed before the release
- [ ] https://github.com/apache/datafusion/issues/13359
- [ ] https://github.com/apache/datafusion/issues/14230
- [ ] https://github.com/apache/datafusion/issues/13510
- [ ] https://github.com/apache/datafusion/issues/14154
We have PRs up for the first two and I will help / work on the second two today
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2616100769
> Some people currently use ValuesExec::try_new_from_batches, so MemoryExec::try_new_as_values wouldn't necessarily be a suitable substitute.
@shehabgamin -- makes sense. I made this PR to try and clarify:
- https://github.com/apache/datafusion/pull/14322
Thank you again for the testing
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2616074861
I am starting to do some ticket triage and prepare for the release
> IMO this should go on the "must fix" list too. I'll make sure to have the PR ready by the end of the weekend.
Done
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2614301913
@alamb @jayzhan211 https://github.com/apache/datafusion/pull/14268 is ready for review!
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2613799483
> Most of the regressions are related to this issue: [#14230](https://github.com/apache/datafusion/issues/14230). I should be able to resolve them well before the `45` release.
It turns out that type coercion for UDF arguments (TypeSignature::Coercible) was not being applied to the majority of types. I adjusted the scope of https://github.com/apache/datafusion/issues/14230 and https://github.com/apache/datafusion/pull/14268 to reflect this.
IMO this should go on the "must fix" list too. I'll make sure to have the PR ready by the end of the weekend.
Re: [I] Release DataFusion `45.0.0` [datafusion]
andygrove commented on issue #14008:
URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2613057921
@alamb I took the liberty of adding https://github.com/apache/datafusion/issues/14277 to the "must fix" list
Re: [I] Release DataFusion `45.0.0` [datafusion]
andygrove commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2612692120 I created an issue to track our progress with upgrading Comet to use DataFusion 45 and linked to it from the PR description: https://github.com/apache/datafusion/issues/14274
Re: [I] Release DataFusion `45.0.0` [datafusion]
jayzhan211 commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2611311173 I see.
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2610743660 > > To replace `ValuesExec`, `try_new_as_values` is the right one to use not `try_new_from_batches` Some people currently use `ValuesExec::try_new_from_batches`, so `MemoryExec::try_new_as_values` wouldn't necessarily be a suitable substitute.
Re: [I] Release DataFusion `45.0.0` [datafusion]
jayzhan211 commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2609768983 > ValuesExec is now deprecated. The deprecation message is a bit confusing though. It currently states: "Use MemoryExec::try_new_as_values instead", but I think should say: "Use MemoryExec::try_new_as_values or MemoryExec::try_new_from_batches instead". Or, just simply: "Use MemoryExec instead". To replace `ValuesExec`, `try_new_as_values` is the right one to use, not `try_new_from_batches`.
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2609034408

My apologies @alamb, the DataFusion upgrade from the latest main branch commit is smoother than I initially thought. After investigating the flood of errors, I discovered that many were resolved by simply updating Sail's `serde-arrow` dependency to Arrow `54`.

Projects without PyO3 or the `pyarrow` feature in DataFusion should experience a seamless upgrade (as of writing). Projects using PyO3 with the `pyarrow` feature enabled will have varying experiences based on their usage of PyO3.

**PyO3 `0.23.3`**

DataFusion `45` upgrades from PyO3 `0.22` to `0.23.3`. This is an exciting change, but may introduce significant breaking changes for PyO3 users. Since these changes vary based on PyO3 usage, I'm not listing Sail's specific changes here. Users can refer to the PyO3 migration guide: https://pyo3.rs/v0.23.0/migration

**DataFusion**

`ValuesExec` is now deprecated. The deprecation message is a bit confusing though. It currently states: "Use `MemoryExec::try_new_as_values` instead", but I think it should say: "Use `MemoryExec::try_new_as_values` or `MemoryExec::try_new_from_batches` instead". Or, just simply: "Use `MemoryExec` instead".

If you'd like to see these changes, they're in my PR that's testing the regression fixes: https://github.com/lakehq/sail/pull/355
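The point about the deprecation message can be illustrated with a minimal, self-contained Rust sketch. It assumes nothing about DataFusion's real definitions: `MemoryExec` and `ValuesExec` below are hypothetical stand-ins with simplified signatures (rows of `i64` instead of Arrow record batches), and the `note` text is the wording proposed in the comment, not necessarily what ships in the release. It only shows how a `#[deprecated]` note that names both constructors would point callers of either style at a matching replacement.

```rust
/// Hypothetical stand-in for the replacement operator (NOT the DataFusion type).
#[derive(Debug, PartialEq)]
pub struct MemoryExec {
    pub rows: Vec<Vec<i64>>,
}

impl MemoryExec {
    /// Build from literal row values, as in a SQL `VALUES (...)` list.
    pub fn try_new_as_values(rows: Vec<Vec<i64>>) -> Result<Self, String> {
        if rows.is_empty() {
            return Err("values list must not be empty".to_string());
        }
        Ok(Self { rows })
    }

    /// Build from already-materialized batches (each batch is a list of rows here).
    pub fn try_new_from_batches(batches: Vec<Vec<Vec<i64>>>) -> Result<Self, String> {
        let rows: Vec<Vec<i64>> = batches.into_iter().flatten().collect();
        if rows.is_empty() {
            return Err("batches must contain at least one row".to_string());
        }
        Ok(Self { rows })
    }
}

/// Hypothetical stand-in for the deprecated operator.
pub struct ValuesExec;

impl ValuesExec {
    // The note names both replacements, so callers of this constructor
    // are not steered toward `try_new_as_values` only.
    #[deprecated(
        since = "45.0.0",
        note = "Use `MemoryExec::try_new_as_values` or `MemoryExec::try_new_from_batches` instead"
    )]
    pub fn try_new_from_batches(batches: Vec<Vec<Vec<i64>>>) -> Result<MemoryExec, String> {
        MemoryExec::try_new_from_batches(batches)
    }
}

fn main() {
    // Old call site: compiles, but the compiler prints the note above
    // unless the lint is silenced as it is here.
    #[allow(deprecated)]
    let old = ValuesExec::try_new_from_batches(vec![vec![vec![1, 2]], vec![vec![3, 4]]]).unwrap();

    // Migrated call site using the batch-based replacement.
    let new = MemoryExec::try_new_from_batches(vec![vec![vec![1, 2]], vec![vec![3, 4]]]).unwrap();

    assert_eq!(old, new);
    println!("{} rows", new.rows.len());
}
```

In this sketch, a caller of the batch-based constructor sees a replacement with the same input shape, which is the concern raised about a note that mentions only `try_new_as_values`.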
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2608409140 > Thanks [@shehabgamin](https://github.com/shehabgamin) -- Can you enumerate these changes (or point me at a PR) so we can see if there is some way to make it less jarring Yeah I'll work on that right now!
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2607619413 > While testing my local Sail code with the latest commit on DataFusion's main branch, I encountered several breaking changes that may make DataFusion 45 a jarring upgrade for some users Thanks @shehabgamin -- Can you enumerate these changes (or point me at a PR) so we can see if there is some way to make it less jarring?
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2606934116 > > I'll take a deeper look into the issue after the weekend. Hope you have a great rest of your weekend! Most of the regressions are related to this issue: https://github.com/apache/datafusion/issues/14230. I should be able to resolve them well before the `45` release. While testing my local Sail code with the latest commit on DataFusion's main branch, I encountered several breaking changes that may make DataFusion 45 a jarring upgrade for some users. Given the previous discussion about wanting to make releases less jarring (https://github.com/apache/datafusion/issues/13334#issuecomment-2552465244), I wanted to bring this to your attention, @alamb. Aside from that, there is one remaining regression I haven't investigated yet, which seems to be related to Parquet.
Re: [I] Release DataFusion `45.0.0` [datafusion]
shehabgamin commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2599492511 As promised, Sail is working on porting relevant tests into DataFusion. A good starting point is a regression our tests caught in DataFusion 43, which still seems to persist in DataFusion 44. A regression was introduced in DataFusion 43.0.0 related to casting to UTF8 in various places. Upgrading to DataFusion 43.0.0 required adding explicit casting in several areas as a workaround. This PR (https://github.com/lakehq/sail/pull/355) comments out those changes to expose the regression through the 12 additional failed tests compared to the main branch. Once I’ve pinpointed the root cause(s) of the regression, I’ll create an issue in DataFusion to track the work. I want to ensure the issue accurately reflects the problem before filing it. I’m happy to address these regressions and port over the tests that cover them in the same PR. Hopefully, we can get this resolved in time for the DataFusion 45 release!
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2593564670 I plan to start assembling the release candidate and testing it the week of Jan 27 (in about 2 weeks' time)
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2589707968 > > > > I am happy to do it again for 45 if no one else would like the opportunity (see what I did there 😆 ) > > Thanks, alamb, I booked 46 in advance! Awesome -- I filed https://github.com/apache/datafusion/issues/14123 to track 46
Re: [I] Release DataFusion `45.0.0` [datafusion]
xudong963 commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2589148761 > > I don't have a preference. I will be traveling around this time though, so perhaps it would make sense for someone else to be release manager for this one. > > I am happy to do it again for 45 if no one else would like the opportunity (see what I did there 😆 ) Thanks, alamb! I booked 46 in advance!
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2588105358 > I don't have a preference. I will be traveling around this time though, so perhaps it would make sense for someone else to be release manager for this one. I am happy to do it again for 45 if no one else would like the opportunity (see what I did there 😆 )
Re: [I] Release DataFusion `45.0.0` [datafusion]
andygrove commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2588095820 > [@andygrove](https://github.com/andygrove) would you like to coordinate this release or would you like me to? (or does anyone else want to do so?) I don't have a preference. I will be traveling around this time though, so perhaps it would make sense for someone else to be release manager for this one.
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2587937242 I also added some issues to the description above that I think would be worth fixing
Re: [I] Release DataFusion `45.0.0` [datafusion]
alamb commented on issue #14008: URL: https://github.com/apache/datafusion/issues/14008#issuecomment-2587875376 @andygrove would you like to coordinate this release or would you like me to? (or does anyone else want to do so?)
