>
> Could we detect the 4-byte length, incur a penalty copying the memory to
> an aligned buffer, then continue consuming the stream?

I think that is the plan (or at least would be my plan) if we go ahead with
the change.



> (It's probably
> fine if we only write the 8-byte length, since consumers on older
> versions of Arrow could slice from the 4th byte before passing a buffer
> to the reader).

I'm not sure I understand this suggestion:
1.  Wouldn't this cause old readers to miss the last 4 bytes of the buffer
(and provide meaningless bytes at the beginning)?
2.  The current proposal on the other thread is to have the pattern be
<0xffffffff><buffer length><buffer data>
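
To make the two layouts concrete, here is a minimal sketch of how a reader
might tell them apart. The function name `read_prefix` is hypothetical, and
the sketch assumes the length that follows the marker is a 4-byte
little-endian integer; the thread has not settled the width, so treat that as
an assumption rather than the agreed format.

```python
import struct

# Marker from the proposal on the other thread. A legacy length prefix is a
# signed 32-bit value, so a valid positive length never equals this pattern.
CONTINUATION = 0xFFFFFFFF


def read_prefix(buf: bytes, offset: int = 0):
    """Return (metadata_length, metadata_start) for either layout.

    Hypothetical helper for illustration only, not an Arrow API.
    """
    (first,) = struct.unpack_from("<I", buf, offset)
    if first == CONTINUATION:
        # Proposed layout: marker, then the length (ASSUMED 4 bytes here).
        (length,) = struct.unpack_from("<i", buf, offset + 4)
        return length, offset + 8
    # Legacy layout: the first 4 bytes are the length itself.
    return first, offset + 4
```

With this shape, a reader that finds the legacy prefix could still copy the
metadata into an aligned buffer before parsing it, which is the penalty Paul
describes.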

Thanks,
Micah

On Tue, Jul 23, 2019 at 11:43 AM Paul Taylor <ptaylor.apa...@gmail.com>
wrote:

> +1 for a 0.15.0 before 1.0 if we go ahead with this.
>
> I'm curious to hear others' thoughts about compatibility. I think we
> should avoid breaking backwards compatibility if possible. It's common
> for apps/libs to be pinned on specific Arrow versions, and I worry it'd
> cause a lot of work for downstream devs to audit their tool suite for
> full Arrow binary compatibility (and/or require their customers to do
> the same).
>
> Could we detect the 4-byte length, incur a penalty copying the memory to
> an aligned buffer, then continue consuming the stream? (It's probably
> fine if we only write the 8-byte length, since consumers on older
> versions of Arrow could slice from the 4th byte before passing a buffer
> to the reader).
>
> I've always understood the metadata to be a few dozen/hundred KB, a
> small percentage of the total message size. I could be underestimating
> the ratios though -- is it common to have tables w/ 1000+ columns? I've
> seen a few reports like that in cuDF, but I'm curious to hear
> Jacques'/Dremio's experience too.
>
> If copying is feasible, it doesn't seem so bad a trade-off to maintain
> backwards-compatibility. As libraries and consumers upgrade their Arrow
> dependencies, the 4-byte length will be less and less common, and
> they'll be less likely to pay the cost.
>
>
>
> On 7/23/19 2:22 AM, Uwe L. Korn wrote:
> > It is also a good way to test the change in public. We don't want to
> > adjust something like this anymore in a 1.0.0 release. Already doing this
> > in 0.15.0 and then maybe making adjustments due to issues that appear "in
> > the wild" is psychologically the easier way. Users attach a lot of
> > expectations to the magic 1.0, so I would plan to minimize what is
> > changed between 1.0 and pre-1.0. This should also save us maintainers some
> > time, as I would expect different behaviour in bug reports between 1.0 and
> > pre-1.0 issues.
> >
> > Uwe
> >
> > On Tue, Jul 23, 2019, at 7:52 AM, Micah Kornfield wrote:
> >> I think the main reason to do a release before 1.0.0 is if we want to make
> >> the change that would give a good error message for forward incompatibility
> >> (I think this could be done as 0.14.2 since it would just be clarifying an
> >> error message).  Otherwise, I think including it in 1.0.0 would be fine
> >> (it's still not clear to me if there is consensus to fix the issue).
> >>
> >> Thanks,
> >> Micah
> >>
> >>
> >> On Monday, July 22, 2019, Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >>> I'd be satisfied with fixing the Flatbuffer alignment issue either in
> >>> a 0.15.0 or 1.0.0. In the interest of expediency, though, making a
> >>> 0.15.0 with this change sooner rather than later might be prudent.
> >>>
> >>> On Mon, Jul 22, 2019 at 12:35 PM Antoine Pitrou <anto...@python.org>
> >>> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> Recently we've discussed breaking the IPC format to fix a long-standing
> >>>> alignment issue.  See this discussion:
> >>>>
> >>>> https://lists.apache.org/thread.html/8cea56f2069710ac128ff9129c744f0ef96a3e33a4d79d7e820019af@%3Cdev.arrow.apache.org%3E
> >>>> Should we first do a 0.15.0 in order to get those format fixes right?
> >>>> Once that is fine and settled we can move to the 1.0.0 release?
> >>>>
> >>>> Regards
> >>>>
> >>>> Antoine.
