Hi Wes,
I'm not an expert on this either, my inclination mostly comes from some
research I've done.  I think it is important to distinguish two cases:
1.  unaligned access at the processor instruction level
2.  undefined behavior

From my reading, unaligned access is fine on most modern architectures and
it seems the performance penalty has mostly been eliminated.

Undefined behavior is a compiler/language concept.  The problem is the
compiler can choose to do anything in UB scenarios, not just the "obvious"
translation.  Specifically, the compiler is under no obligation to generate
the unaligned access instructions, and if it doesn't, SEGVs ensue.  Two
examples, both of which relate to SIMD optimizations, are linked below.

I tend to be on the conservative side with this type of thing, but if we
have experts on the ML who can offer a more informed opinion, I would
love to hear it.

[1] http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65709

On Mon, Jul 1, 2019 at 1:41 PM Wes McKinney <wesmck...@gmail.com> wrote:

> The <0xffffffff><int32_t size> solution is downright ugly but I think
> it's one of the only ways that achieves
>
> * backward compatibility (new clients can read old data)
> * opt-in forward compatibility (if we want to go to the labor of doing
> so, sort of dangerous)
> * old clients receiving new data do not blow up (they will see a
> metadata length of -1)
>
> NB 0xFFFFFFFF <length> would look like:
>
> In [13]: np.array([(1 << 32) - 1, 128], dtype=np.uint32)
> Out[13]: array([4294967295,        128], dtype=uint32)
>
> In [14]: np.array([(1 << 32) - 1, 128], dtype=np.uint32).view(np.int32)
> Out[14]: array([ -1, 128], dtype=int32)
>
> In [15]: np.array([(1 << 32) - 1, 128], dtype=np.uint32).view(np.uint8)
> Out[15]: array([255, 255, 255, 255, 128,   0,   0,   0], dtype=uint8)
>
> Flatbuffers are 32-bit limited so we don't need all 64 bits.
>
> Do you know in what circumstances unaligned reads from Flatbuffers
> might cause an issue? I do not know enough about UB but my
> understanding is that it causes issues on some specialized platforms,
> whereas for most modern x86-64 processors and compilers it is not really
> an issue (though perhaps a performance issue).
>
> On Sun, Jun 30, 2019 at 6:36 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > At least on the read side we can make this detectable by using something
> > like <0xffffffff><int32_t size> instead of int64_t.  On the write side
> > we would need some sort of default mode that we could flip on/off if we
> > wanted to maintain compatibility.
> >
> > I should say I think we should fix it.  Undefined behavior is unpaid
> > debt that might never be collected or might cause things to fail in
> > difficult-to-diagnose ways.  And pre-1.0.0 is definitely the time.
> >
> > -Micah
> >
> > On Sun, Jun 30, 2019 at 3:17 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >> On Sun, Jun 30, 2019 at 5:14 PM Wes McKinney <wesmck...@gmail.com> wrote:
> >> >
> >> > hi Micah,
> >> >
> >> > This is definitely unfortunate, I wish we had realized the potential
> >> > implications of having the Flatbuffer message start on a 4-byte
> >> > (rather than 8-byte) boundary. The cost of making such a change now
> >> > would be pretty high since all readers and writers in all languages
> >> > would have to be changed. That being said, the 0.14.0 -> 1.0.0 version
> >> > bump is the last opportunity we have to make a change like this, so we
> >> > might as well discuss it now. Note that particular implementations
> >> > could implement compatibility functions to handle the 4 to 8 byte
> >> > change so that old clients can still be understood. We'd probably want
> >> > to do this in C++, for example, since users would pretty quickly
> >> > acquire a new pyarrow version in Spark applications while they are
> >> > stuck on an old version of the Java libraries.
> >>
> >> NB such a backwards compatibility fix would not be forward-compatible,
> >> so the PySpark users would need to use a pinned version of pyarrow
> >> until Spark upgraded to Arrow 1.0.0. Maybe that's OK
> >>
> >> >
> >> > - Wes
> >> >
> >> > On Sun, Jun 30, 2019 at 3:01 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >> > >
> >> > > While working on trying to fix undefined behavior for unaligned
> >> > > memory accesses [1], I ran into an issue with the IPC specification
> >> > > [2] which prevents us from ever achieving zero-copy memory mapping
> >> > > and having aligned accesses (i.e. clean UBSan runs).
> >> > >
> >> > > Flatbuffer metadata needs 8-byte alignment to guarantee aligned
> >> > > accesses.
> >> > >
> >> > > In the IPC format we align each message to 8-byte boundaries.  We
> >> > > then write an int32_t integer to denote the size of the flatbuffer
> >> > > metadata, followed immediately by the flatbuffer metadata.  This
> >> > > means the flatbuffer metadata will never be 8-byte aligned.
> >> > >
> >> > > Do people care?  A simple fix would be to use int64_t instead of
> >> > > int32_t for the length.  However, any fix essentially breaks all
> >> > > previous client library versions or incurs a memory copy.
> >> > >
> >> > > [1] https://github.com/apache/arrow/pull/4757
> >> > > [2] https://arrow.apache.org/docs/ipc.html
>
