Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

Fan Liya Mon, 21 Sep 2020 23:53:46 -0700

Hi Micah,

Thanks for your summary. Your proposal sounds reasonable to me.


Best,
Liya Fan


On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield <[email protected]>
wrote:

> I wanted to give this thread a bump. Does the proposal I made below sound
> reasonable?
>
> On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield <[email protected]>
> wrote:
>
> > If I read the responses so far it seems like the following might be a good
> > compromise/summary:
> >
> > 1. It does not seem too invasive to support native endianness in
> > implementation libraries, as long as there is appropriate performance
> > testing and CI infrastructure to demonstrate the changes work.
> > 2. It is up to implementation maintainers if they wish to accept PRs that
> > handle byte swapping between different architectures.  (Right now it
> > sounds like C++ is potentially OK with it and for Java at least Jacques is
> > opposed to it?)
> >
> > Testing changes that break big-endian can be a potential drag on developer
> > productivity but there are methods to run locally (at least on more recent
> > OSes).
> >
> > Thoughts?
> >
> > Thanks,
> > Micah
> >
> > On Mon, Aug 31, 2020 at 7:08 PM Fan Liya <[email protected]> wrote:
> >
> >> Thanks to Kazuaki for the survey and to Micah for starting the discussion.
> >>
> >> I do not oppose supporting BE. In fact, I am in general optimistic about
> >> the performance impact (for Java).
> >> IMO, this is going to be a painful path (many byte order related problems
> >> are tricky to debug), so I hope we can make it short.
> >>
> >> It is good that someone is willing to take this on, and I would like to
> >> provide help if needed.
> >>
> >> Best,
> >> Liya Fan
> >>
> >>
> >>
> >> On Tue, Sep 1, 2020 at 7:25 AM Bryan Cutler <[email protected]> wrote:
> >>
> >> > I also think this would be a worthwhile addition and help the project
> >> > expand in more areas. Beyond the Apache Spark optimization use case,
> >> > having Arrow interoperability with the Python data science stack on BE
> >> > would be very useful. I have looked at the remaining PRs for Java and they
> >> > seem pretty minimal and straightforward. Implementing the equivalent record
> >> > batch swapping as done in C++ at [1] would be a little more involved, but
> >> > still reasonable. Would it make sense to create a branch to apply all
> >> > remaining changes with CI to get a better picture before deciding on
> >> > bringing them into the master branch?  I could help out with shepherding
> >> > this effort and assist in maintenance, if we decide to accept.
> >> >
> >> > Bryan
> >> >
> >> > [1] https://github.com/apache/arrow/pull/7507
> >> >
> >> > On Mon, Aug 31, 2020 at 1:42 PM Wes McKinney <[email protected]> wrote:
> >> >
> >> > > I think it's well within the right of an implementation to reject BE
> >> > > data (or non-native-endian), but if an implementation chooses to
> >> > > implement and maintain the endianness conversions, then it does not
> >> > > seem so bad to me.
> >> > >
> >> > > On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau <[email protected]> wrote:
> >> > > >
> >> > > > And yes, for those of you looking closely, I commented on ARROW-245
> >> > > > when it was committed. I just forgot about it.
> >> > > >
> >> > > > It looks like I had mostly the same concerns then that I do now :) Now
> >> > > > I'm just more worried about format sprawl...
> >> > > >
> >> > > > On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau <[email protected]> wrote:
> >> > > >
> >> > > > >> What do you mean?  The Endianness field (a Big|Little enum) was added 4
> >> > > > >> years ago:
> >> > > > >> https://issues.apache.org/jira/browse/ARROW-245
> >> > > > >
> >> > > > >
> >> > > > > I didn't realize that was done, my bad. Good example of format rot from my
> >> > > > > pov.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > >
> >> >
> >>
> >
>
