Re: Incompatibility of all existing pyarrow releases with the next NumPy release

2020-12-07 Thread Wes McKinney
I believe we can do a release that is just focused on the Python
artifacts, yes.

On Mon, Dec 7, 2020 at 6:52 AM Joris Van den Bossche
 wrote:
>
> On Fri, 4 Dec 2020 at 21:11, Uwe L. Korn  wrote:
>
> > Hello all,
> >
> > Today the Kartothek CI turned quite red in
> > https://github.com/JDASoftwareGroup/kartothek/pull/383 /
> > https://github.com/JDASoftwareGroup/kartothek/pull/383/checks?check_run_id=1497941813
> > as the new NumPy 1.20rc1 was pulled in. It simply broke all pyarrow<->NumPy
> > interop, as dtypes returned by NumPy are now instances of numpy.dtype
> > subclasses rather than direct numpy.dtype instances. I reported the issue at
> > https://github.com/numpy/numpy/issues/17913. We are running into that as
> > we build our wheels and conda packages with an older release of NumPy that
> > has a faulty implementation of PyArray_DescrCheck.
> >
> >  (a) For upcoming releases, we can either move our minimal supported NumPy
> > to 1.16.6 or merge the PR over at
> > https://github.com/apache/arrow/pull/8834
> >  (b) Existing conda(-forge) packages can get a repodata patch that adds a
> > numpy<1.20 constraint to them
> >  (c) I'll rebuild the latest but still frequently used pyarrow releases on
> > conda-forge using numpy 1.16.6
> >  (d) Old pyarrow wheels (Python<3.8) though won't be easily fixed and
> > instead will return the confusing "ArrowTypeError: Did not pass numpy.dtype
> > object" error message. Personally, my approach here would be to do
> > nothing and simply direct users to downgrade NumPy if they run into the
> > issue.
> >
> In addition to this last item (pip installs), doing a small 2.0.1 bugfix
> release with this patch would also help a lot, I think. It would at least
> ensure that plain pip installs with latest versions will work (while it
> doesn't solve it for older pyarrow releases of course, in case people
> upgrade numpy in an existing environment, or install numpy with pyarrow
> pinned to an older version).
>
> Does our project governance allow doing a python-only release? (meaning, a
> release branch where the 2.0.1 tag compared to 2.0.0 tag only includes
> changes to the python libraries) That would make it less burdensome to
> resolve part of this situation.
>
>
> > Is anyone objecting to this approach?
> >
> > Cheers
> > Uwe
> >


Re: Incompatibility of all existing pyarrow releases with the next NumPy release

2020-12-07 Thread Joris Van den Bossche
On Fri, 4 Dec 2020 at 21:11, Uwe L. Korn  wrote:

> Hello all,
>
> Today the Kartothek CI turned quite red in
> https://github.com/JDASoftwareGroup/kartothek/pull/383 /
> https://github.com/JDASoftwareGroup/kartothek/pull/383/checks?check_run_id=1497941813
> as the new NumPy 1.20rc1 was pulled in. It simply broke all pyarrow<->NumPy
> interop, as dtypes returned by NumPy are now instances of numpy.dtype
> subclasses rather than direct numpy.dtype instances. I reported the issue at
> https://github.com/numpy/numpy/issues/17913. We are running into that as
> we build our wheels and conda packages with an older release of NumPy that
> has a faulty implementation of PyArray_DescrCheck.
>
>  (a) For upcoming releases, we can either move our minimal supported NumPy
> to 1.16.6 or merge the PR over at
> https://github.com/apache/arrow/pull/8834
>  (b) Existing conda(-forge) packages can get a repodata patch that adds a
> numpy<1.20 constraint to them
>  (c) I'll rebuild the latest but still frequently used pyarrow releases on
> conda-forge using numpy 1.16.6
>  (d) Old pyarrow wheels (Python<3.8) though won't be easily fixed and
> instead will return the confusing "ArrowTypeError: Did not pass numpy.dtype
> object" error message. Personally, my approach here would be to do
> nothing and simply direct users to downgrade NumPy if they run into the
> issue.
>
In addition to this last item (pip installs), doing a small 2.0.1 bugfix
release with this patch would also help a lot, I think. It would at least
ensure that plain pip installs with latest versions will work (while it
doesn't solve it for older pyarrow releases of course, in case people
upgrade numpy in an existing environment, or install numpy with pyarrow
pinned to an older version).

Does our project governance allow doing a python-only release? (meaning, a
release branch where the 2.0.1 tag compared to 2.0.0 tag only includes
changes to the python libraries) That would make it less burdensome to
resolve part of this situation.


> Is anyone objecting to this approach?
>
> Cheers
> Uwe
>


Re: Incompatibility of all existing pyarrow releases with the next NumPy release

2020-12-04 Thread Uwe L. Korn
NumPy's deprecation policy would drop support for the 1.16 series in January:
https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table
That is when I would suggest raising the minimal NumPy in builds here to 1.17,
and we will also raise the version used for builds in conda-forge.

Still, the PR is so trivial that we should merge it. I'm not up to date on the
status of the 2.0.1 release, but this would be an essential patch for that.
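
The version constraints discussed in this thread can be sketched as a small
guard. This is only an illustration: the function and constant names are
hypothetical, and it assumes, as the thread suggests, that builds against
NumPy 1.16.6 or later are unaffected and that the breakage starts with NumPy
1.20 at runtime. Whether every later NumPy release carries the fixed check is
not stated in this thread.

```python
import re

# Assumptions taken from the thread, not from NumPy or pyarrow documentation:
# builds against NumPy >= 1.16.6 have a working PyArray_DescrCheck, and the
# runtime breakage starts with NumPy 1.20.
FIRST_FIXED_BUILD = (1, 16, 6)
FIRST_BREAKING_RUNTIME = (1, 20, 0)


def parse_version(version):
    """Return (major, minor, patch) from strings like '1.20.0rc1' or '1.16'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def interop_likely_broken(build_numpy, runtime_numpy):
    """Hypothetical helper: True if a package built against `build_numpy`
    is expected to hit the dtype check failure under `runtime_numpy`."""
    return (
        parse_version(build_numpy) < FIRST_FIXED_BUILD
        and parse_version(runtime_numpy) >= FIRST_BREAKING_RUNTIME
    )
```

For example, interop_likely_broken("1.14.6", "1.20.0rc1") is true under these
assumptions, which is the case where a user would be directed to downgrade
NumPy or install a rebuilt package.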

On Fri, Dec 4, 2020, at 9:22 PM, Antoine Pitrou wrote:
> 
> 
> Le 04/12/2020 à 21:11, Uwe L. Korn a écrit :
> > Hello all,
> > 
> > Today the Kartothek CI turned quite red in
> > https://github.com/JDASoftwareGroup/kartothek/pull/383 /
> > https://github.com/JDASoftwareGroup/kartothek/pull/383/checks?check_run_id=1497941813
> > as the new NumPy 1.20rc1 was pulled in. It simply broke all
> > pyarrow<->NumPy interop, as dtypes returned by NumPy are now instances of
> > numpy.dtype subclasses rather than direct numpy.dtype instances. I reported
> > the issue at https://github.com/numpy/numpy/issues/17913. We are running into
> > that as we build our wheels and conda packages with an older release of 
> > NumPy that has a faulty implementation of PyArray_DescrCheck.
> > 
> >  (a) For upcoming releases, we can either move our minimal supported NumPy 
> > to 1.16.6 or merge the PR over at https://github.com/apache/arrow/pull/8834
> 
> I would be fine with merging the PR (assuming comments are added to
> explain why things are done that way).  Apparently Numpy 1.16.6 is only
> one year old.
> 
> Regards
> 
> Antoine.
>


Re: Incompatibility of all existing pyarrow releases with the next NumPy release

2020-12-04 Thread Antoine Pitrou



Le 04/12/2020 à 21:11, Uwe L. Korn a écrit :
> Hello all,
> 
> Today the Kartothek CI turned quite red in
> https://github.com/JDASoftwareGroup/kartothek/pull/383 /
> https://github.com/JDASoftwareGroup/kartothek/pull/383/checks?check_run_id=1497941813
> as the new NumPy 1.20rc1 was pulled in. It simply broke all pyarrow<->NumPy
> interop, as dtypes returned by NumPy are now instances of numpy.dtype
> subclasses rather than direct numpy.dtype instances. I reported the issue at
> https://github.com/numpy/numpy/issues/17913. We are running into that as we 
> build our wheels and conda packages with an older release of NumPy that has a 
> faulty implementation of PyArray_DescrCheck.
> 
>  (a) For upcoming releases, we can either move our minimal supported NumPy to 
> 1.16.6 or merge the PR over at https://github.com/apache/arrow/pull/8834

I would be fine with merging the PR (assuming comments are added to
explain why things are done that way).  Apparently Numpy 1.16.6 is only
one year old.

Regards

Antoine.
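
The failure mode described in the quoted message (a type check that accepts
only exact numpy.dtype instances and therefore rejects instances of its
subclasses) can be illustrated with a small pure-Python analogy. This is only a
sketch: the class and function names below are hypothetical stand-ins, not
NumPy's or pyarrow's actual code, and the real check is done in C through
PyArray_DescrCheck.

```python
class DType:
    """Hypothetical stand-in for numpy.dtype."""


class Float64DType(DType):
    """Hypothetical stand-in for a dtype subclass, as returned by NumPy 1.20."""


def descr_check_exact(obj):
    # Analogy for the faulty check: only exact DType instances pass,
    # so instances of subclasses are rejected.
    return type(obj) is DType


def descr_check_subclass(obj):
    # Analogy for the corrected check: subclass instances pass as well.
    return isinstance(obj, DType)


old_style = DType()         # what older NumPy releases would hand back
new_style = Float64DType()  # what NumPy 1.20 would hand back
```

With the exact check, old_style passes and new_style is rejected, which mirrors
the "Did not pass numpy.dtype object" error described in the thread. With the
subclass-aware check, both pass.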