Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-10-07 Thread Micah Kornfield
In case anyone wants to comment further, I've opened https://github.com/apache/arrow/pull/8374 to canonicalize the details. On Mon, Sep 28, 2020 at 9:08 PM Micah Kornfield wrote: > OK, I will try to update documentation

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-28 Thread Micah Kornfield
OK, I will try to update the documentation to reflect this in the next few days (in particular, it would be good to document which implementations are willing to support byte flipping). On Tue, Sep 22, 2020 at 3:30 AM Antoine Pitrou wrote: > > > On 22/09/2020 at 06:36, Micah Kornfield wrote: > > I

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Antoine Pitrou
On 22/09/2020 at 06:36, Micah Kornfield wrote: > I wanted to give this thread a bump, does the proposal I made below sound > reasonable? It does! Regards Antoine. > > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield > wrote: > >> If I read the responses so far it seems like the

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Kazuaki Ishizaki
Hi Micah, Thank you. Your proposal also sounds reasonable to me. Best Regards, Kazuaki Ishizaki Fan Liya wrote on 2020/09/22 15:51:58: > From: Fan Liya > To: dev , Micah Kornfield > Date: 2020/09/22 15:52 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was:

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-22 Thread Fan Liya
Hi Micah, Thanks for your summary. Your proposal sounds reasonable to me. Best, Liya Fan On Tue, Sep 22, 2020 at 1:16 PM Micah Kornfield wrote: > I wanted to give this thread a bump, does the proposal I made below sound > reasonable? > > On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield >

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-21 Thread Micah Kornfield
I wanted to give this thread a bump: does the proposal I made below sound reasonable? On Sun, Sep 13, 2020 at 9:57 PM Micah Kornfield wrote: > If I read the responses so far it seems like the following might be a good > compromise/summary: > > 1. It does not seem too invasive to support native

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-09-13 Thread Micah Kornfield
If I read the responses so far, it seems like the following might be a good compromise/summary: 1. It does not seem too invasive to support native endianness in implementation libraries, as long as there is appropriate performance testing and CI infrastructure to demonstrate that the changes work. 2.

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Fan Liya
Thanks to Kazuaki for the survey and to Micah for starting the discussion. I do not oppose supporting BE. In fact, I am generally optimistic about the performance impact (for Java). IMO, this is going to be a painful path (many byte-order-related problems are tricky to debug), so I hope we can

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Bryan Cutler
I also think this would be a worthwhile addition and would help the project expand into more areas. Beyond the Apache Spark optimization use case, having Arrow interoperability with the Python data science stack on BE would be very useful. I have looked at the remaining PRs for Java and they seem pretty

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Wes McKinney
I think it's well within the right of an implementation to reject BE data (or non-native-endian), but if an implementation chooses to implement and maintain the endianness conversions, then it does not seem so bad to me. On Mon, Aug 31, 2020 at 3:33 PM Jacques Nadeau wrote: > > And yes, for
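The two options mentioned above (rejecting non-native-endian data, or choosing to convert it) can be sketched roughly as follows. This is only an illustration in plain Python, not code from any Arrow implementation: the function name `read_int32_buffer` and the `allow_swap` flag are hypothetical, and the declared byte order is assumed to arrive as the string "little" or "big".

```python
import struct
import sys

NATIVE = sys.byteorder  # "little" or "big" on the running host


def read_int32_buffer(buf: bytes, declared: str, allow_swap: bool = True) -> list:
    """Decode a buffer of 32-bit signed integers whose byte order is `declared`.

    Hypothetical helper: an implementation that does not want to maintain
    conversions passes allow_swap=False and rejects non-native data.
    """
    if declared not in ("little", "big"):
        raise ValueError("unknown endianness: %r" % (declared,))
    if len(buf) % 4 != 0:
        raise ValueError("buffer length is not a multiple of 4")
    if declared != NATIVE and not allow_swap:
        # Option 1: refuse data that is not in the host's byte order.
        raise NotImplementedError("non-native-endian data is not supported")
    # Option 2 (or the native case): decode using the declared byte order.
    prefix = "<" if declared == "little" else ">"
    return list(struct.unpack("%s%di" % (prefix, len(buf) // 4), buf))
```

A reader that opts in to conversion decodes either byte order to the same values; one that opts out raises on the non-native case and leaves the conversion to the producer.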

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Jacques Nadeau
And yes, for those of you looking closely, I commented on ARROW-245 when it was committed. I just forgot about it. It looks like I had mostly the same concerns then that I do now :) Now I'm just more worried about format sprawl... On Mon, Aug 31, 2020 at 1:30 PM Jacques Nadeau wrote: > What do

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Jacques Nadeau
> > What do you mean? The Endianness field (a Big|Little enum) was added 4 > years ago: > https://issues.apache.org/jira/browse/ARROW-245 I didn't realize that was done, my bad. Good example of format rot from my pov.

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-31 Thread Antoine Pitrou
arks on a little-endian platform to avoid performance regression. >>> >>> [1] https://arrow.apache.org/blog/2017/07/26/spark-arrow/ >>> [2] >>> >> https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html >>> [3] >>>

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-30 Thread Jacques Nadeau
> https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html > > [3] > > > https://databricks.com/jp/blog/2020/06/01/vectorized-r-i-o-in-upcoming-apache-spark-3-0.html > > [4] https://databricks.com/jp/session_na20/wednesday-morning-keynotes > >

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-30 Thread Micah Kornfield
sday-morning-keynotes > [5] https://github.com/apache/arrow/pull/7507#discussion_r46819873 > [6] https://github.com/apache/arrow/pull/7507 > [7] https://github.com/apache/arrow/pull/7940#issuecomment-672690540 > > Best Regards, > Kazuaki Ishizaki > > Wes McKinney wrote on 2020/08/26 21:

RE: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
ney > To: dev , Micah Kornfield > Cc: Fan Liya > Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
ornfield > Cc: Fan Liya > Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e.g. > Java) is impractical due t

RE: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Kazuaki Ishizaki
; Date: 2020/08/26 21:28 > Subject: [EXTERNAL] Re: [DISCUSS] Big Endian support in Arrow (was: > Re: [Java] Supporting Big Endian) > > hi Micah, > > I agree with your reasoning. If supporting BE in some languages (e.g. > Java) is impractical due to performance regressions on L

Re: [DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-26 Thread Wes McKinney
hi Micah, I agree with your reasoning. If supporting BE in some languages (e.g. Java) is impractical due to performance regressions on LE platforms, then I don't think it's worth it. But if it can be handled at compile time or without runtime overhead, and tested / maintained properly on an

[DISCUSS] Big Endian support in Arrow (was: Re: [Java] Supporting Big Endian)

2020-08-25 Thread Micah Kornfield
I'm expanding the scope of this thread since it looks like work has also started for making golang support BigEndian architectures. I think as a community we should come to a consensus on whether we want to support Big Endian architectures in general. I don't think it is a good outcome if some