Re: [DISCUSS] Add kernel integer overflow handling

Wes McKinney Wed, 03 Jun 2020 08:47:15 -0700

On Wed, Jun 3, 2020 at 10:44 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > By default an error should probably be raised
>
> I would very strongly recommend keeping the behavior consistent with
> that of analytic DBMSes. I don't think that most analytic DBMS error
> on overflows because it's too computationally expensive to check.
> NumPy doesn't error (by default at least) when you have overflows
> either:
>
> >>> arr = np.array([128, 128, 128], dtype='i1')
> >>> arr + arr
> array([0, 0, 0], dtype=int8)


*facepalm*

>>> arr = np.array([127, 127, 127], dtype='i1')
>>> arr + arr
array([-2, -2, -2], dtype=int8)

>>>


>
> On Wed, Jun 3, 2020 at 10:29 AM Antoine Pitrou <solip...@pitrou.net> wrote:
> >
> > On Wed, 3 Jun 2020 10:47:38 -0400
> > Ben Kietzman <ben.kietz...@rstudio.com> wrote:
> > > https://github.com/apache/arrow/pull/7341#issuecomment-638241193
> > >
> > > How should arithmetic kernels handle integer overflow?
> > >
> > > The approach currently taken in the linked PR is to promote such that
> > > overflow will not occur, for example `(int8, int8)->int16` and `(uint16,
> > > uint16)->uint32`.
> > >
> > > I'm not sure that's desirable. For one thing this leads to inconsistent
> > > handling of 64 bit integer types, which are currently allowed to overflow
> > > since we cannot promote further (NB: that means this kernel includes
> > > undefined behavior for int64).
> >
> > I agree with you.  I would strongly advise against implicit promotion
> > accross arithmetic operations.  We initially did that in Numba and it
> > quickly became a can of worms.
> >
> > The most desirable behaviour IMHO is to keep the original type, so:
> > - (int8, int8) -> int8
> > - (uint16, uint16) -> uint16
> >
> > Then the question is what happens when the actual overflow occurs.  I
> > think this should be directed by a kernel option.  By default an error
> > should probably be raised (letting errors pass and silently produce
> > erroneous data is wrong), but we might want to allow people to bypass
> > overflow checks for speed.
> >
> > Note that even if overflow detection is enabled, it *should* be possible
> > to enable vectorization, e.g. by making overflow detection a separate
> > pass (itself vectorizable).
> >
> > Regards
> >
> > Antoine.
> >
> >

Re: [DISCUSS] Add kernel integer overflow handling

Reply via email to