I opened a PR https://github.com/apache/arrow/pull/7566
We should prioritize getting through the other format changes, but we can vote on this in the meantime if there is consensus.

On Fri, Jun 26, 2020 at 2:58 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> I agree, I think we have to do this given the number of changes in flight
> (especially union types).
>
> On Fri, Jun 26, 2020 at 7:29 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > I created a JIRA about this
> >
> > https://issues.apache.org/jira/browse/ARROW-9231
> >
> > This issue is quite important so please take a look.
> >
> > On Thu, Jun 25, 2020 at 8:53 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > On Thu, Jun 25, 2020 at 5:31 AM Antoine Pitrou <anto...@python.org> wrote:
> > > >
> > > >
> > > > On 25/06/2020 at 12:18, Antoine Pitrou wrote:
> > > > >
> > > > > On 25/06/2020 at 00:40, Wes McKinney wrote:
> > > > >> hi folks,
> > > > >>
> > > > >> This has come up in some other contexts, but I believe it would be a
> > > > >> good idea to increment the version number in Schema.fbs starting with
> > > > >> 1.0.0 to separate the pre-1.0 and post-1.0 worlds
> > > > >>
> > > > >> https://github.com/apache/arrow/blob/master/format/Schema.fbs#L22
> > > > >>
> > > > >> Given that we are contemplating a number of changes to assist with
> > > > >> forward compatibility and a breaking serialization change for unions,
> > > > >> this would seem prudent so that we do not risk breaking compatibility
> > > > >> with 0.17.1 and prior.
> > > > >>
> > > > >> Given that there are no major backwards incompatibilities, there
> > > > >> should be no problem with 1.0.0 readers reading data generated by
> > > > >> libraries <= 0.17.1.
> > > > >
> > > > > Actually, it seems that a dense array with top-level null values
> > > > > (represented in 0.17.1 fashion) would need non-trivial rewriting of its
> > > > > offsets and child arrays (at least one child array) to represent the
> > > > > nulls at the child level.
> > > > >
> > > > > This is unless we keep the top-level union null bitmap in C++ and only
> > > > > avoid emitting it on the IPC side. Which would be a slightly weird
> > > > > arrangement, but would limit incompatibilities on the C++ API side.
> > > >
> > > > Actually, if we do this, the same problem will appear on the IPC write
> > > > side (C++-created dense union arrays with a top-level null bitmap will
> > > > need regenerating some of the child buffers).
> > >
> > > I see. Well I think we can shut down this issue by giving up on Union
> > > forward compatibility for V4 / pre-1.0 libraries.
> > >
> > > > Regards
> > > >
> > > > Antoine.
> >
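
To make the rewriting cost discussed in the quoted thread concrete, here is a rough sketch in plain Python (no Arrow library involved) of what moving top-level nulls of a dense union down to the child level could look like. Everything here is an assumption for illustration: the function name push_nulls_to_children is hypothetical, lists stand in for buffers, None stands in for a child-level null, and type ids are assumed to be plain child indices. It is not how any Arrow implementation does it.

```python
def push_nulls_to_children(type_ids, offsets, children, top_valid):
    """Rebuild a dense union so that top-level nulls become child-level nulls.

    type_ids[i]  : index of the child that slot i refers to (assumed to be
                   a plain child index, which is a simplification)
    offsets[i]   : position of slot i's value inside that child
    children[c]  : values of child c, with None meaning a child-level null
    top_valid[i] : False where the top-level bitmap marks slot i as null
    """
    new_children = [[] for _ in children]
    new_offsets = []
    for type_id, offset, valid in zip(type_ids, offsets, top_valid):
        # A top-level null turns into a null entry in the child it points at.
        value = children[type_id][offset] if valid else None
        new_children[type_id].append(value)
        # Appending in slot order keeps each child's offsets increasing.
        new_offsets.append(len(new_children[type_id]) - 1)
    return new_offsets, new_children


# Slot 1 is null at the top level and its offset shares entry 0 of child 1
# with slot 3, so both the offsets and child 1 have to be regenerated.
new_offsets, new_children = push_nulls_to_children(
    type_ids=[0, 1, 0, 1],
    offsets=[0, 0, 1, 0],
    children=[[10, 20], ["b"]],
    top_valid=[True, False, True, True],
)
print(new_offsets)   # [0, 0, 1, 1]
print(new_children)  # [[10, 20], [None, 'b']]
```

The type ids are untouched, but every child that receives a null is rebuilt and the offsets are recomputed, which is why neither the read side nor the write side can do this as a zero-copy pass.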