I think we should create a ticket to discuss releasing 48.1.0 (in addition to 49.0.0) -- I can do so later today if no one beats me to it.

On Tue, Nov 7, 2023 at 6:47 AM Raphael Taylor-Davies
<r.taylordav...@googlemail.com.invalid> wrote:

> It will contain breaking dependency updates, including object_store.
>
> I hope to cut it today.
>
> On 07/11/2023 11:43, Andrew Lamb wrote:
> > If the release later in the week doesn't have any breaking API changes,
> > perhaps it can be 48.1.0 (and thus also get the bugfix to datafusion)
> >
> > On Tue, Nov 7, 2023 at 6:41 AM Raphael Taylor-Davies
> > <r.taylordav...@googlemail.com.invalid> wrote:
> >
> >> I intend to cut a new arrow release later this week, I would prefer we
> >> wait for this.
> >>
> >> On 07/11/2023 11:39, Andrew Lamb wrote:
> >>> Perhaps we can create an arrow 48.1.0 patch release to include the fix?
> >>>
> >>> On Tue, Nov 7, 2023 at 12:48 AM Will Jones <will.jones...@gmail.com> wrote:
> >>>> Thanks for the clarification, Raphael. That likely narrows the scope of
> >>>> who is affected. If this bug is present in DataFusion 33, then delta-rs
> >>>> will likely skip upgrading until 34. If we're the only downstream project
> >>>> this parsing issue affects, then I think it's fine to release.
> >>>>
> >>>> On Mon, Nov 6, 2023 at 8:22 PM Raphael Taylor-Davies
> >>>> <r.taylordav...@googlemail.com.invalid> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> To further clarify the bug concerns the serde compatibility feature that
> >>>>> allows converting a serde compatible data structure to arrow [1]. It will
> >>>>> not impact workloads reading JSON.
> >>>>>
> >>>>> I am not sure this is a sufficiently fundamental bug to warrant special
> >>>>> concern, but happy to defer to others.
> >>>>>
> >>>>> Kind Regards,
> >>>>>
> >>>>> Raphael
> >>>>>
> >>>>> [1]: https://docs.rs/arrow/latest/arrow/#serde-compatibility
> >>>>>
> >>>>> On 7 November 2023 03:20:59 GMT, Will Jones <will.jones...@gmail.com> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> There is an upstream bug in arrow-json that can cause the JSON reader to
> >>>>>> return incorrect data for large integers [1]. It was recently fixed by
> >>>>>> Raphael within the last 24 hours, but is not included in any release. The
> >>>>>> bug was introduced in Arrow 48, which this DataFusion release will expose
> >>>>>> users to.
> >>>>>>
> >>>>>> Not sure what the precedent here is, but I think either we should consider
> >>>>>> either (a) seeing if we can release and upgrade Arrow to include the fix,
> >>>>>> or else (b) calling out the regression as a known bug so downstream
> >>>>>> projects can include the path in their applications.
> >>>>>>
> >>>>>> Best,
> >>>>>>
> >>>>>> Will Jones
> >>>>>>
> >>>>>> [1] https://github.com/apache/arrow-rs/issues/5038
> >>>>>> [2] https://github.com/apache/arrow-rs/pull/5042
> >>>>>>
> >>>>>> On Mon, Nov 6, 2023 at 12:25 PM Andrew Lamb <al...@influxdata.com> wrote:
> >>>>>>> +1 (the tests passed for me). I have left a comment on
> >>>>>>> https://github.com/apache/arrow-datafusion/issues/8069
> >>>>>>>
> >>>>>>> On Mon, Nov 6, 2023 at 2:02 PM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>> I filed https://github.com/apache/arrow-datafusion/issues/8069
> >>>>>>>>
> >>>>>>>> On Mon, Nov 6, 2023 at 11:59 AM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>>> I see the same error when I run on my M1 Macbook Air with 16 GB RAM.
> >>>>>>>>>
> >>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
> >>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
> >>>>>>>>> GroupedHashAggregateStream[0] with 1829 bytes already allocated -
> >>>>>>>>> maximum available is 605")
> >>>>>>>>>
> >>>>>>>>> It worked fine on my workstation with 128 GB RAM.
> >>>>>>>>>
> >>>>>>>>> On Mon, Nov 6, 2023 at 11:23 AM L. C. Hsieh <vii...@gmail.com> wrote:
> >>>>>>>>>> Hmm, ran verification script and got one failure:
> >>>>>>>>>>
> >>>>>>>>>> failures:
> >>>>>>>>>>
> >>>>>>>>>> ---- aggregates::tests::run_first_last_multi_partitions stdout ----
> >>>>>>>>>> Error: ResourcesExhausted("Failed to allocate additional 632 bytes for
> >>>>>>>>>> GroupedHashAggregateStream[0] with 1829 bytes already allocated -
> >>>>>>>>>> maximum available is 605")
> >>>>>>>>>>
> >>>>>>>>>> failures:
> >>>>>>>>>> aggregates::tests::run_first_last_multi_partitions
> >>>>>>>>>>
> >>>>>>>>>> test result: FAILED. 557 passed; 1 failed; 1 ignored; 0 measured; 0
> >>>>>>>>>> filtered out; finished in 2.21s
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Nov 6, 2023 at 6:57 AM Andy Grove <andygrov...@gmail.com> wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to propose a release of Apache Arrow DataFusion
> >>>>>>>>>>> Implementation, version 33.0.0.
> >>>>>>>>>>>
> >>>>>>>>>>> This release candidate is based on commit:
> >>>>>>>>>>> 262f08778b8ec231d96792c01fc3e051640eb5d4 [1]
> >>>>>>>>>>> The proposed release tarball and signatures are hosted at [2].
> >>>>>>>>>>> The changelog is located at [3].
> >>>>>>>>>>>
> >>>>>>>>>>> Please download, verify checksums and signatures, run the unit tests,
> >>>>>>>>>>> and vote on the release. The vote will be open for at least 72 hours.
> >>>>>>>>>>>
> >>>>>>>>>>> Only votes from PMC members are binding, but all members of the
> >>>>>>>>>>> community are encouraged to test the release and vote with
> >>>>>>>>>>> "(non-binding)".
> >>>>>>>>>>>
> >>>>>>>>>>> The standard verification procedure is documented at
> >>>>>>>>>>> https://github.com/apache/arrow-datafusion/blob/main/dev/release/README.md#verifying-release-candidates
> >>>>>>>>>>> .
> >>>>>>>>>>>
> >>>>>>>>>>> [ ] +1 Release this as Apache Arrow DataFusion 33.0.0
> >>>>>>>>>>> [ ] +0
> >>>>>>>>>>> [ ] -1 Do not release this as Apache Arrow DataFusion 33.0.0 because...
> >>>>>>>>>>>
> >>>>>>>>>>> Here is my vote:
> >>>>>>>>>>>
> >>>>>>>>>>> +1
> >>>>>>>>>>>
> >>>>>>>>>>> [1]:
> >>>>>>>>>>> https://github.com/apache/arrow-datafusion/tree/262f08778b8ec231d96792c01fc3e051640eb5d4
> >>>>>>>>>>> [2]:
> >>>>>>>>>>> https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-33.0.0-rc1
> >>>>>>>>>>> [3]:
> >>>>>>>>>>> https://github.com/apache/arrow-datafusion/blob/262f08778b8ec231d96792c01fc3e051640eb5d4/CHANGELOG.md