[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Ronald van Elburg
OK. Then we will just wait for 2.x and test then.
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Andrew Nelson
On Mon, 9 Oct 2023 at 16:36, Jerome Kieffer  wrote:

> On Fri, 06 Oct 2023 19:17:22 -
> norbertpiotraduc...@gmail.com wrote:
>
> > Hi,
> > I have an idea for changing numpy.percentile. I think numpy.percentile
> > and numpy.nanpercentile are the same feature, and the only difference is
> > that numpy.nanpercentile ignores NaN values. Wouldn't it be easier
> > if numpy.percentile took an argument specifying whether NaN values
> > should be considered? It would certainly be easier for people who are
> > starting their adventure with the library.


I'd be ambivalent about making this change. There are a whole host of other
`np.nan*` functions; would they all need to be modified as well? e.g.
nanprod, nansum, nanargmin, ..
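
For reference, a small sketch of how the two existing functions differ today. It uses only `np.percentile` and `np.nanpercentile` as they currently exist; the sample array is made up for illustration.

```python
import numpy as np

# Made-up sample with one missing value.
x = np.array([1.0, 2.0, np.nan, 4.0])

# np.percentile propagates NaN: a NaN in the input makes the result NaN.
with_nan = np.percentile(x, 50)

# np.nanpercentile ignores NaN entries and works on the remaining [1, 2, 4].
without_nan = np.nanpercentile(x, 50)

print(with_nan, without_nan)
```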


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Juan Nunez-Iglesias

On Mon, 9 Oct 2023, at 7:07 PM, Andrew Nelson wrote:
> On Mon, 9 Oct 2023 at 16:36, Jerome Kieffer  wrote:
> I'd be ambivalent on making this change. There are a whole host of other 
> `np.nan*` functions, would they all need to be modified as well? e.g. 
> nanprod, nansum, nanargmin, ..

I think obviously, either change all functions or none. The question is whether 
such a change would fit into the overall NumPy 2.0 and array-API plans. 🤷‍♂️


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Matthew Brett
Hi,

Is there any reason to have separate functions - or to keep enforcing
that?  I agree, an equivalent of R's na.rm argument seems like a
very reasonable and useful addition, such as (sorry for the
obviousness):

np.mean(x, dropna=True)

and so on,

Cheers,

Matthew

On Mon, Oct 9, 2023 at 9:17 AM Juan Nunez-Iglesias  wrote:
>
>
> On Mon, 9 Oct 2023, at 7:07 PM, Andrew Nelson wrote:
>
> On Mon, 9 Oct 2023 at 16:36, Jerome Kieffer  wrote:
> I'd be ambivalent on making this change. There are a whole host of other 
> `np.nan*` functions, would they all need to be modified as well? e.g. 
> nanprod, nansum, nanargmin, ..
>
>
> I think obviously, either change all functions or none. The question is 
> whether such a change would fit into the overall NumPy 2.0 and array-API 
> plans. 🤷‍♂️


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread norbertpiotraduckir
Surely you can do this for all of the nan* functions. Why keep them separate 
when NaN handling is the only thing that distinguishes them?  Setting a 
parameter seems handier and more user-friendly.  To me it seems better to do it 
right away in NumPy 2.0.


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Andrew Nelson
On Mon, 9 Oct 2023 at 20:34,  wrote:

> Surely you can do this for all of the nan* functions. Why keep them
> separate when NaN handling is the only thing that distinguishes them?
> Setting a parameter seems handier and more user-friendly.  To me it seems
> better to do it right away in NumPy 2.0.
>

I think I prefer the clearer intent of having nan* functions.


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Mateusz Sokol
Just to mention for visibility: introducing a "nan" option and deprecating
the nan* functions was considered for the 2.0 main namespace refactor, but it
was deemed large enough to be (hopefully) tackled in a separate
story/project.

https://github.com/numpy/numpy/issues/24306#issuecomment-1660073584 (first
bullet point on the list)

On Mon, Oct 9, 2023 at 12:48 PM Andrew Nelson  wrote:

> On Mon, 9 Oct 2023 at 20:34,  wrote:
>
>> Surely you can do this for all of the nan* functions. Why keep them
>> separate when NaN handling is the only thing that distinguishes them?
>> Setting a parameter seems handier and more user-friendly.  To me it seems
>> better to do it right away in NumPy 2.0.
>>
>
> I think I prefer the clearer intent of having nan* functions.


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Matthew Brett
Hi,

On Mon, Oct 9, 2023 at 11:49 AM Andrew Nelson  wrote:
>
> On Mon, 9 Oct 2023 at 20:34,  wrote:
>>
>> Surely you can do this for all of the nan* functions. Why keep them 
>> separate when NaN handling is the only thing that distinguishes them?  
>> Setting a parameter seems handier and more user-friendly.  To me it seems 
>> better to do it right away in NumPy 2.0.
>
>
> I think I prefer the clearer intent of having nan* functions.

Could you say more about why you consider:

np.mean(x, dropna=True)

to be less clear in intent than:

np.nanmean(x)

?  Is it just that someone could accidentally forget that the default
for `np.mean` is not to drop NaNs?  If so - is that a major problem?
We would be introducing `dropna=True` as a non-default option, in a world
that is used to the default.

I must say I have several times found myself thinking - why is there a
separate function for means when dropping NaN?
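
To make the comparison concrete, here is a minimal sketch of the keyword spelling written as a thin wrapper. The wrapper `mean` and its `dropna` argument are hypothetical (NumPy's `np.mean` has no such parameter); only `np.mean` and `np.nanmean` are existing functions.

```python
import numpy as np


def mean(x, dropna=False, **kwargs):
    """Hypothetical wrapper: `dropna` is NOT an argument of the real np.mean."""
    # Dispatch to the existing NaN-aware function only when asked to.
    func = np.nanmean if dropna else np.mean
    return func(x, **kwargs)


x = np.array([1.0, np.nan, 3.0])

default = mean(x)               # NaN propagates, as with np.mean(x)
dropped = mean(x, dropna=True)  # same value as np.nanmean(x)
```

Under this sketch both spellings compute the same thing; the open question in the thread is only which one reads more clearly.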

Cheers,

Matthew


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Nathan
On Mon, Oct 9, 2023 at 12:57 AM Aaron Meurer  wrote:

> Is it possible to convert a NumPy 1 pickle file into a generic pickle
> file that works in both NumPy 1 and 2? As far as I understand, pickle
> is Turing complete, so I imagine it should be theoretically possible,
> but I don't know how easy it would be to actually do this or how it
> would affect the pickle file size.
>

Hi Aaron,

The issue is that the pickle protocol needs a reference to a reconstructor
to recreate numpy types. For ndarray, that function is currently
`numpy.core.multiarray._reconstruct` and in numpy 2 becomes
`numpy._core.multiarray._reconstruct`. For a pickle file containing only an
ndarray, this is the first thing in the pickle file and the import happens
inside of the pickle implementation. I am not aware of a hook that Python
gives us to intercept that path before Python imports it.

So, even if there is a way to correct subsequent paths in the pickle file,
we won't be able to fix the most problematic path that will occur in any
pickle that includes a numpy array. That means some user-visible pain no
matter what. If we can't avoid that, I'd prefer to offer a solution that
will allow people to continue loading old pickle files indefinitely (albeit
with a minor code change). This also gives us a place to put compatibility
fixes for future changes that impact old pickle files.

-Nathan



>
> Aaron Meurer
>
> On Fri, Oct 6, 2023 at 10:17 AM Nathan  wrote:
> >
> > Hi all,
> >
> > As part of the ongoing work on NEP 52 we are getting close to merging
> the pull request that changes numpy.core to numpy._core.
> >
> > While working on this we realized that numpy pickle files include paths
> to np.core in the pickle data. If we do nothing, switching np.core to
> np._core will generate deprecation warnings when loading pickle files
> generated by Numpy 1.x in Numpy 2.x and Numpy 1.x will be unable to read
> Numpy 2.x pickle files. Eventually, when Numpy 2.x completely removes the
> stub np.core module, loading old pickle files will break.
> >
> > The fix we have come up with is to add a new public NumpyUnpickler class
> to both the main branch and the Numpy 1.26 maintenance branch. This allows
> loading pickle files that were generated by Numpy 1.x and 2.x in either
> version without any warnings or errors. Users who are loading old pickle
> files will need to update their code to use NumpyUnpickler or create new
> pickle files and users who generate pickles with numpy 2.x will need to use
> NumpyUnpickler to read the files in numpy 1.x.
> >
> > We are using NumpyUnpickler internally for loading files in the npy file
> format. Users with pickle data saved in npy files won't see warnings. Only
> users who are storing data in pickle files directly and who want pickle
> files written in one numpy version to load correctly in another numpy
> version will run into trouble. The I/O docs already explicitly discourage
> using pickles to share data files between people and organizations like
> this.
> >
> > An alternate approach which would require less work for users would be
> to leave a limited subset of functionality in `np.core` needed for loading
> pickle files undeprecated. We would prefer to avoid doing this both because
> it would leave behind a publicly visible `np.core` module in NumPy's public
> API and because we're not sure if we can come up with a complete set of
> imports that should be allowed without warning from `np.core` without
> missing some corner cases and users will see deprecation warnings when
> loading pickles anyway.
> >
> > See https://github.com/numpy/numpy/pull/24866,
> https://github.com/numpy/numpy/issues/24844, and the discussion in
> https://github.com/numpy/numpy/pull/24634 for more context.
> >
> > -Nathan


[Numpy-discussion] *New Time* Next Documentation team meeting

2023-10-09 Thread Mukulika Pahari
Hi all, sorry for the late notice.

Our next Documentation Team meeting will happen on *Monday, October 9* at *11PM 
UTC*. If this time slot is inconvenient for you to join, please let me know in 
the replies or Slack and we will try to add another time slot.

All are welcome - you don't need to already be a contributor to join. If you 
have questions or are curious about what we're doing, we'll be happy to meet 
you!

If you wish to join on Zoom, use this (updated) link:
https://numfocus-org.zoom.us/j/85016474448?pwd=TWEvaWJ1SklyVEpwNXUrcHV1YmFJQ...

Here's the permanent hackmd document with the meeting notes (still being
updated):
https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

Best wishes,
Mukulika


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Oscar Benjamin
On Mon, 9 Oct 2023 at 17:03, Nathan  wrote:
>
> On Mon, Oct 9, 2023 at 12:57 AM Aaron Meurer  wrote:
>>
>> Is it possible to convert a NumPy 1 pickle file into a generic pickle
>> file that works in both NumPy 1 and 2? As far as I understand, pickle
>> is Turing complete, so I imagine it should be theoretically possible,
>> but I don't know how easy it would be to actually do this or how it
>> would affect the pickle file size.

There are many ways that this could be made to work with the various
options like reduce() etc.

> The issue is that the pickle protocol needs a reference to a reconstructor to 
> recreate numpy types. For ndarray, that function is currently 
> `numpy.core.multiarray._reconstruct` and in numpy 2 becomes 
> `numpy._core.multiarray._reconstruct`. For a pickle file containing only an 
> ndarray, this is the first thing in the pickle file and the import happens 
> inside of the pickle implementation. I am not aware of a hook that Python 
> gives us to intercept that path before Python imports it.
>
> So, even if there is a way to correct subsequent paths in the pickle file, we 
> won't be able to fix the most problematic path that will occur in any pickle 
> that includes a numpy array. That means some user-visible pain no matter 
> what. If we can't avoid that, I'd prefer to offer a solution that will allow 
> people to continue loading old pickle files indefinitely (albeit with a minor 
> code change). This also gives us a place to put compatibility fixes for 
> future changes that impact old pickle files.

Suppose that there is NumPy v1 and that in future there will be NumPy
v2. Also suppose that there will be two NumPy pickle formats fmtA and
a future fmtB. One possibility is that NumPy v1 only reads and writes
fmtA and then NumPy v2 only reads and writes fmtB. One problem with
this is that when NumPy v2 comes out there is no easy way to convert
pickles from fmtA to fmtB for compatibility with NumPy v2. Another
problem with this is that it does not make a nice transition during
any period of time when both NumPy v1 and v2 might be used in
different parts of a software stack.

An alternative is to introduce fmtB as part of the NumPy v1 series.
NumPy could be changed now so that it can read both fmtA and fmtB but
by default it would write fmtB which would be designed ahead of time
so that in future NumPy v2 would be able to read fmtB as well. It
would also be possible to design it so that fmtB would be readable by
older versions of NumPy that were released before fmtB was designed.

Then there is a version of NumPy (v1) which can read fmtA and write to
fmtB. This version of NumPy can be used to convert pickles from fmtA
to fmtB. Then when NumPy v2 is released it can already read any
pickles that were generated by the most recent releases of NumPy v1.x.
Anyone who still has older pickles in fmtA could use NumPy v1 to do
dumps(loads(f)) which would convert from fmtA to fmtB.

In this scenario the only part that does not work is reading fmtA in
NumPy v2 which is unavoidable if numpy.core is removed or renamed in
v2.
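
The dumps(loads(f)) conversion step could look roughly like the sketch below. The helper name `repickle` and the file paths are placeholders, and fmtA/fmtB remain the hypothetical formats from the scenario above; the helper simply reloads a pickle with whatever libraries are installed and writes it back out.

```python
import pickle


def repickle(src_path, dst_path, protocol=pickle.HIGHEST_PROTOCOL):
    """Load a pickle with the installed libraries and write it back out.

    Run under a library version that can read the old format and writes the
    new one; the output then uses whatever that version emits.
    """
    with open(src_path, "rb") as src:
        obj = pickle.load(src)
    with open(dst_path, "wb") as dst:
        pickle.dump(obj, dst, protocol=protocol)
    return obj
```

As with any use of pickle, this should only be run on files from a trusted source.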

--
Oscar


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Nathan
On Mon, Oct 9, 2023 at 2:44 PM Oscar Benjamin 
wrote:

> On Mon, 9 Oct 2023 at 17:03, Nathan  wrote:
> >
> > On Mon, Oct 9, 2023 at 12:57 AM Aaron Meurer  wrote:
> >>
> >> Is it possible to convert a NumPy 1 pickle file into a generic pickle
> >> file that works in both NumPy 1 and 2? As far as I understand, pickle
> >> is Turing complete, so I imagine it should be theoretically possible,
> >> but I don't know how easy it would be to actually do this or how it
> >> would affect the pickle file size.
>
> There are many ways that this could be made to work with the various
> options like reduce() etc.
>
> > The issue is that the pickle protocol needs a reference to a
> reconstructor to recreate numpy types. For ndarray, that function is
> currently `numpy.core.multiarray._reconstruct` and in numpy 2 becomes
> `numpy._core.multiarray._reconstruct`. For a pickle file containing only an
> ndarray, this is the first thing in the pickle file and the import happens
> inside of the pickle implementation. I am not aware of a hook that Python
> gives us to intercept that path before Python imports it.
> >
> > So, even if there is a way to correct subsequent paths in the pickle
> file, we won't be able to fix the most problematic path that will occur in
> any pickle that includes a numpy array. That means some user-visible pain
> no matter what. If we can't avoid that, I'd prefer to offer a solution that
> will allow people to continue loading old pickle files indefinitely (albeit
> with a minor code change). This also gives us a place to put compatibility
> fixes for future changes that impact old pickle files.
>
>
Hi Oscar,


> Suppose that there is NumPy v1 and that in future there will be NumPy
> v2. Also suppose that there will be two NumPy pickle formats fmtA and
> a future fmtB. One possibility is that NumPy v1 only reads and writes
> fmtA and then NumPy v2 only reads and writes fmtB. One problem with
> this is that when NumPy v2 comes out there is no easy way to convert
> pickles from fmtA to fmtB for compatibility with NumPy v2. Another
> problem with this is that it does not make a nice transition during
> any period of time when both NumPy v1 and v2 might be used in
> different parts of a software stack.
>

Doesn't NumpyUnpickler solve this? It will be present in both v1 and v2 and
will allow loading files that reference either np.core or np._core in either
version.
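
For readers unfamiliar with the mechanism, the general technique is a `pickle.Unpickler` subclass that overrides `find_class`. The sketch below is not the NumpyUnpickler from the pull request; the class name `RemappingUnpickler` and the helpers `candidate_modules` and `loads` are made up for illustration, and the fallback order is an assumption.

```python
import io
import pickle

_OLD, _NEW = "numpy.core", "numpy._core"


def candidate_modules(module):
    """Return module paths to try, preferring the new private location."""
    if module == _NEW or module.startswith(_NEW + "."):
        return [module, _OLD + module[len(_NEW):]]
    if module == _OLD or module.startswith(_OLD + "."):
        return [_NEW + module[len(_OLD):], module]
    return [module]


class RemappingUnpickler(pickle.Unpickler):
    """Illustrative stand-in only; not the class proposed in the PR."""

    def find_class(self, module, name):
        last_error = None
        # Try each spelling of the module path until one can be imported.
        for candidate in candidate_modules(module):
            try:
                return super().find_class(candidate, name)
            except (ImportError, AttributeError) as exc:
                last_error = exc
        raise last_error


def loads(data):
    """Unpickle bytes through the remapping unpickler."""
    return RemappingUnpickler(io.BytesIO(data)).load()
```

Code that calls plain `pickle.load` or `pickle.loads` does not go through such a subclass, which is why callers would need the minor code change mentioned earlier in the thread.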


> An alternative is to introduce fmtB as part of the NumPy v1 series.
> NumPy could be changed now so that it can read both fmtA and fmtB but
> by default it would write fmtB which would be designed ahead of time
> so that in future NumPy v2 would be able to read fmtB as well. It
> would also be possible to design it so that fmtB would be readable by
> older versions of NumPy that were released before fmtB was designed.
>
> Then there is a version of NumPy (v1) which can read fmtA and write to
> fmtB. This version of NumPy can be used to convert pickles from fmtA
> to fmtB. Then when NumPy v2 is released it can already read any
> pickles that were generated by the most recent releases of NumPy v1.x.
> Anyone who still has older pickles in fmtA could use NumPy v1 to do
> dumps(loads(f)) which would convert from fmtA to fmtB.
>
> In this scenario the only part that does not work is reading fmtA in
> NumPy v2 which is unavoidable if numpy.core is removed or renamed in
> v2.
>

I agree it would have been better to anticipate this and move the
_reconstruct function to np._core many releases ago. Sadly this was not
done and the next release is Numpy 2.0.

I also want to emphasize that using pickle like this - to share data
between different python installations - is inherently insecure and should
never be done outside of an organization that fully controls all of the
python installations. In that case, the organization can use
NumpyUnpickler. In any other case, I think it's good to perhaps nudge
people away from doing things like this.


>
> --
> Oscar


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Oscar Benjamin
On Mon, 9 Oct 2023 at 21:57, Nathan  wrote:
>
> On Mon, Oct 9, 2023 at 2:44 PM Oscar Benjamin  
> wrote:
>> Suppose that there is NumPy v1 and that in future there will be NumPy
>> v2. Also suppose that there will be two NumPy pickle formats fmtA and
>> a future fmtB. One possibility is that NumPy v1 only reads and writes
>> fmtA and then NumPy v2 only reads and writes fmtB. One problem with
>> this is that when NumPy v2 comes out there is no easy way to convert
>> pickles from fmtA to fmtB for compatibility with NumPy v2. Another
>> problem with this is that it does not make a nice transition during
>> any period of time when both NumPy v1 and v2 might be used in
>> different parts of a software stack.
>
> Doesn't NumpyUnpickler solve this? It will be present in both v1 and v2 and 
> will allow loading files either np.core or np._core in either version.

I guess that makes it possible in some way to convert formats in
either version. I presume though that this still means that a plain
pickle.loads() (and any code built on top of such) would fail in v2.

>> An alternative is to introduce fmtB as part of the NumPy v1 series.
>> NumPy could be changed now so that it can read both fmtA and fmtB but
>> by default it would write fmtB which would be designed ahead of time
>> so that in future NumPy v2 would be able to read fmtB as well. It
>> would also be possible to design it so that fmtB would be readable by
>> older versions of NumPy that were released before fmtB was designed.
>>
>> Then there is a version of NumPy (v1) which can read fmtA and write to
>> fmtB. This version of NumPy can be used to convert pickles from fmtA
>> to fmtB. Then when NumPy v2 is released it can already read any
>> pickles that were generated by the most recent releases of NumPy v1.x.
>> Anyone who still has older pickles in fmtA could use NumPy v1 to do
>> dumps(loads(f)) which would convert from fmtA to fmtB.
>>
>> In this scenario the only part that does not work is reading fmtA in
>> NumPy v2 which is unavoidable if numpy.core is removed or renamed in
>> v2.
>
> I agree it would have been better to anticipate this and move the 
> _reconstruct function to np._core many releases ago. Sadly this was not done 
> and the next release is Numpy 2.0.

Well if the next release is NumPy 2.0 then my suggestion does not
work. There are alternatives but they might not be worth it at this
point.

> I also want to emphasize that using pickle like this - to share data between 
> different python installations - is inherently insecure and should never be 
> done outside of an organization that fully controls all of the python 
> installations. In that case, the organization can use NumpyUnpickler. In any 
> other case, I think it's good to perhaps nudge people away from doing things 
> like this.

Agreed but I guarantee that someone depends on this and is using it in
a way that is reasonable for their own purposes. There might not be
much to be done about it but someone will experience unexpected
breakage and it is worthwhile to contemplate (as you are doing) what
can be done to mitigate that.

--
Oscar


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Nathan
On Mon, Oct 9, 2023 at 3:12 PM Oscar Benjamin 
wrote:

> On Mon, 9 Oct 2023 at 21:57, Nathan  wrote:
> >
> > On Mon, Oct 9, 2023 at 2:44 PM Oscar Benjamin <
> oscar.j.benja...@gmail.com> wrote:
> >> Suppose that there is NumPy v1 and that in future there will be NumPy
> >> v2. Also suppose that there will be two NumPy pickle formats fmtA and
> >> a future fmtB. One possibility is that NumPy v1 only reads and writes
> >> fmtA and then NumPy v2 only reads and writes fmtB. One problem with
> >> this is that when NumPy v2 comes out there is no easy way to convert
> >> pickles from fmtA to fmtB for compatibility with NumPy v2. Another
> >> problem with this is that it does not make a nice transition during
> >> any period of time when both NumPy v1 and v2 might be used in
> >> different parts of a software stack.
> >
> > Doesn't NumpyUnpickler solve this? It will be present in both v1 and v2
> and will allow loading files either np.core or np._core in either version.
>
> I guess that makes it possible in some way to convert formats in
> either version. I presume though that this still means that a plain
> pickle.loads() (and any code built on top of such) would fail in v2.
>

In NumPy 2.0 you would see a deprecation warning about the path in the
pickle file but no crash. Eventually, when we finally remove the stub
np.core, you would see a crash.

However, one thing we can do now, for that one particular symbol that we
know is going to be in every pickle file and probably never elsewhere, is
intercept that one import and, instead of generating a generic warning about
np.core being deprecated, make that specific deprecation warning mention
NumpyUnpickler. I'll make sure this gets done.

We *could* just allow that import to happen without a warning, but then
we're stuck keeping np.core around even longer and we also will still
generate a deprecation warning for an import from np.core if the pickle
file happens to include any other numpy types that might generate imports
in np.core.


>
> >> An alternative is to introduce fmtB as part of the NumPy v1 series.
> >> NumPy could be changed now so that it can read both fmtA and fmtB but
> >> by default it would write fmtB which would be designed ahead of time
> >> so that in future NumPy v2 would be able to read fmtB as well. It
> >> would also be possible to design it so that fmtB would be readable by
> >> older versions of NumPy that were released before fmtB was designed.
> >>
> >> Then there is a version of NumPy (v1) which can read fmtA and write to
> >> fmtB. This version of NumPy can be used to convert pickles from fmtA
> >> to fmtB. Then when NumPy v2 is released it can already read any
> >> pickles that were generated by the most recent releases of NumPy v1.x.
> >> Anyone who still has older pickles in fmtA could use NumPy v1 to do
> >> dumps(loads(f)) which would convert from fmtA to fmtB.
> >>
> >> In this scenario the only part that does not work is reading fmtA in
> >> NumPy v2 which is unavoidable if numpy.core is removed or renamed in
> >> v2.
> >
> > I agree it would have been better to anticipate this and move the
> _reconstruct function to np._core many releases ago. Sadly this was not
> done and the next release is Numpy 2.0.
>
> Well if the next release is NumPy 2.0 then my suggestion does not
> work. There are alternatives but they might not be worth it at this
> point.
>
> > I also want to emphasize that using pickle like this - to share data
> between different python installations - is inherently insecure and should
> never be done outside of an organization that fully controls all of the
> python installations. In that case, the organization can use
> NumpyUnpickler. In any other case, I think it's good to perhaps nudge
> people away from doing things like this.
>
> Agreed but I guarantee that someone depends on this and is using it in
> a way that is reasonable for their own purposes. There might not be
> much to be done about it but someone will experience unexpected
> breakage and it is worthwhile to contemplate (as you are doing) what
> can be done to mitigate that.
>
> --
> Oscar


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Oscar Benjamin
On Mon, 9 Oct 2023 at 22:30, Nathan  wrote:
>
> On Mon, Oct 9, 2023 at 3:12 PM Oscar Benjamin  
> wrote:
>>
>> On Mon, 9 Oct 2023 at 21:57, Nathan  wrote:
>> >
>> > On Mon, Oct 9, 2023 at 2:44 PM Oscar Benjamin  
>> > wrote:
>> >> Suppose that there is NumPy v1 and that in future there will be NumPy
>> >> v2. Also suppose that there will be two NumPy pickle formats fmtA and
>> >> a future fmtB. One possibility is that NumPy v1 only reads and writes
>> >> fmtA and then NumPy v2 only reads and writes fmtB. One problem with
>> >> this is that when NumPy v2 comes out there is no easy way to convert
>> >> pickles from fmtA to fmtB for compatibility with NumPy v2. Another
>> >> problem with this is that it does not make a nice transition during
>> >> any period of time when both NumPy v1 and v2 might be used in
>> >> different parts of a software stack.
>> >
>> > Doesn't NumpyUnpickler solve this? It will be present in both v1 and v2 
>> > and will allow loading files either np.core or np._core in either version.
>>
>> I guess that makes it possible in some way to convert formats in
>> either version. I presume though that this still means that a plain
>> pickle.loads() (and any code built on top of such) would fail in v2.
>
> In Numpy2.0 you would see a deprecation warning about the path in the pickle 
> file but no crash. Eventually, when we finally remove the stub np.core, you 
> would see a crash.

Okay, that makes sense. What happens in the reverse scenario: loading
a pickle generated by NumPy 2.0 using NumPy 1.x?

--
Oscar


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Stephan Hoyer
On Mon, Oct 9, 2023 at 2:29 PM Nathan  wrote:

> However, one thing we can do now, for that one particular symbol that
> we know is going to be in every pickle file and probably never elsewhere,
> is intercept that one import and, instead of generating a generic warning
> about np.core being deprecated, make that specific deprecation warning
> mention NumpyUnpickler. I'll make sure this gets
> done.
>
> We *could* just allow that import to happen without a warning, but then
> we're stuck keeping np.core around even longer and we also will still
> generate a deprecation warning for an import from np.core if the pickle
> file happens to include any other numpy types that might generate imports
> in np.core.
>

My preferred option would be to keep loading of old NumPy pickles working
indefinitely, and also to preserve backwards compatibility for pickles
written in newer versions of NumPy. We can still do the rest of the
numpy.core cleanup, but it's OK if we keep a bit of compatibility code in
NumPy indefinitely.

I don't think warnings would help much in this case, because if somebody is
currently distributing pickled numpy arrays despite all of our warnings not
to do so, they are unlikely to go back and update their old files.

We could keep around numpy.core.multiarray as a minimal stub for only this
purpose, or potentially only define the object
numpy.core.multiarray._reconstruct.
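
For illustration, the stub idea can be sketched with a stand-in module
rather than NumPy itself. This is only a minimal sketch under stated
assumptions, not NumPy's implementation: `install_stub` and
`legacy_fractions` are hypothetical names, and `fractions.Fraction` plays
the role that `_reconstruct` plays in NumPy pickles.

```python
import importlib
import pickle
import sys
import types
from fractions import Fraction


def install_stub(old_name, target_name, names):
    """Register a minimal module under old_name that re-exports only `names`.

    Hypothetical helper: it mimics keeping a tiny compatibility module
    around solely so that old pickles can still resolve their globals.
    """
    target = importlib.import_module(target_name)
    stub = types.ModuleType(old_name)
    for name in names:
        setattr(stub, name, getattr(target, name))
    sys.modules[old_name] = stub
    return stub


# Simulate an "old" pickle whose global lives at a path that no longer exists.
# Protocol 2 stores the module name as newline-terminated text, so a byte
# replacement is enough to rewrite it for this demonstration.
old_pickle = pickle.dumps(Fraction(1, 3), protocol=2).replace(
    b"fractions", b"legacy_fractions"
)

try:
    pickle.loads(old_pickle)
    loaded_without_stub = True
except ModuleNotFoundError:
    loaded_without_stub = False

# With the stub registered, a plain pickle.loads works again,
# including when it is called from inside third-party code.
install_stub("legacy_fractions", "fractions", ["Fraction"])
restored = pickle.loads(old_pickle)
```

The stub exposes nothing except the names that old pickles need, so the
rest of the old namespace can still be removed.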


[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Nathan
On Mon, Oct 9, 2023 at 3:58 PM Oscar Benjamin wrote:

> On Mon, 9 Oct 2023 at 22:30, Nathan  wrote:
> >
> > On Mon, Oct 9, 2023 at 3:12 PM Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> >>
> >> On Mon, 9 Oct 2023 at 21:57, Nathan  wrote:
> >> >
> >> > On Mon, Oct 9, 2023 at 2:44 PM Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> >> >> Suppose that there is NumPy v1 and that in future there will be NumPy
> >> >> v2. Also suppose that there will be two NumPy pickle formats fmtA and
> >> >> a future fmtB. One possibility is that NumPy v1 only reads and writes
> >> >> fmtA and then NumPy v2 only reads and writes fmtB. One problem with
> >> >> this is that when NumPy v2 comes out there is no easy way to convert
> >> >> pickles from fmtA to fmtB for compatibility with NumPy v2. Another
> >> >> problem with this is that it does not make a nice transition during
> >> >> any period of time when both NumPy v1 and v2 might be used in
> >> >> different parts of a software stack.
> >> >
> >> > Doesn't NumpyUnpickler solve this? It will be present in both v1 and
> >> > v2 and will allow loading files that reference either np.core or
> >> > np._core in either version.
> >>
> >> I guess that makes it possible in some way to convert formats in
> >> either version. I presume though that this still means that a plain
> >> pickle.loads() (and any code built on top of such) would fail in v2.
> >
> > In Numpy2.0 you would see a deprecation warning about the path in the
> > pickle file but no crash. Eventually, when we finally remove the stub
> > np.core, you would see a crash.
>
> Okay, that makes sense. What happens in the reverse scenario: loading
> a pickle generated by NumPy 2.0 using NumPy 1.x?


There would be a crash, so people creating these pickles would need to tell
users to load them using NumpyUnpickler. Do you see that being problematic?
It would only impact newly created pickle files and there would be an
immediate fix available - use NumpyUnpickler.load instead of pickle.load.
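
To make the mechanism concrete, here is a minimal sketch of how such an
unpickler could work. It is an assumption-laden illustration, not the
actual NumpyUnpickler: `RemappingUnpickler` and its `aliases` parameter are
hypothetical names, the default alias table is assumed from the np.core /
np._core rename discussed here, and `fractions.Fraction` stands in for a
NumPy object so the example runs without NumPy.

```python
import io
import pickle
from fractions import Fraction


class RemappingUnpickler(pickle.Unpickler):
    """Hypothetical stand-in for the NumpyUnpickler discussed in this thread."""

    def __init__(self, file, aliases=None, **kwargs):
        super().__init__(file, **kwargs)
        # Assumed default: treat np.core and np._core as interchangeable.
        self._aliases = aliases or {
            "numpy.core": "numpy._core",
            "numpy._core": "numpy.core",
        }

    def find_class(self, module, name):
        try:
            # Prefer the path recorded in the pickle if it still resolves.
            return super().find_class(module, name)
        except (ImportError, AttributeError):
            for old, new in self._aliases.items():
                if module == old or module.startswith(old + "."):
                    return super().find_class(new + module[len(old):], name)
            raise  # no alias applies; surface the original error


# Simulate a pickle that records a module path unknown to this interpreter.
# Protocol 2 stores the module name as newline-terminated text, so a byte
# replacement is enough for the demonstration.
data = pickle.dumps(Fraction(3, 4), protocol=2).replace(
    b"fractions", b"moved_fractions"
)

try:
    pickle.loads(data)
    plain_load_ok = True
except ModuleNotFoundError:
    plain_load_ok = False

loaded = RemappingUnpickler(
    io.BytesIO(data), aliases={"moved_fractions": "fractions"}
).load()
```

Only code that constructs the unpickler itself benefits from this; a call
to pickle.load buried inside another library is unaffected.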




[Numpy-discussion] Re: Adding NumpyUnpickler to Numpy 1.26 and future Numpy 2.0

2023-10-09 Thread Oscar Benjamin
On Mon, 9 Oct 2023 at 23:12, Nathan  wrote:
> On Mon, Oct 9, 2023 at 3:58 PM Oscar Benjamin  
> wrote:
>>
>> On Mon, 9 Oct 2023 at 22:30, Nathan  wrote:
>> >
>> > On Mon, Oct 9, 2023 at 3:12 PM Oscar Benjamin  
>> > wrote:
>> >>
>> >> I guess that makes it possible in some way to convert formats in
>> >> either version. I presume though that this still means that a plain
>> >> pickle.loads() (and any code built on top of such) would fail in v2.
>> >
>> > In Numpy2.0 you would see a deprecation warning about the path in the 
>> > pickle file but no crash. Eventually, when we finally remove the stub 
>> > np.core, you would see a crash.
>>
>> Okay, that makes sense. What happens in the reverse scenario: loading
>> a pickle generated by NumPy 2.0 using NumPy 1.x?
>
> There would be a crash, so people creating these pickles would need to tell 
> users to load them using NumpyUnpickler. Do you see that being problematic? 
> It would only impact newly created pickle files and there would be an 
> immediate fix available - use NumpyUnpickler.load instead of pickle.load.

I am sure that it will be problematic for someone but maybe that is
just part of the collateral damage that should be expected with
changes like this. (I don't have any particular opinion about what is
right here.)

Using NumpyUnpickler is not a simple fix for anyone who is using
pickle indirectly, e.g. via another function that calls pickle.load
internally.

--
Oscar


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-09 Thread Andrew Nelson
On Mon, 9 Oct 2023 at 23:50, Matthew Brett  wrote:

> Hi,
>
> On Mon, Oct 9, 2023 at 11:49 AM Andrew Nelson  wrote:
> Could you say more about why you consider:
> np.mean(x, dropna=True)
> to be less clear in intent than:
> np.nanmean(x)
> ?  Is it just that someone could accidentally forget that the default
>

This isn't a deal breaker for me; I just wanted to offer a different
point of view.
The name of the function encodes what it does. With both the operation and
the NaN handling in the function name, it's clear what the function does.

nanmean -> deals with nan when calculating a mean.

-vs-

mean -> calculates a mean
  |
  > oh, it has dropna as a keyword argument, that's how you deal with nan.


Imagine that you have a large codebase and have to find all the
locations where NaNs could affect a mean. There may also be lots of prod,
sum, etc. calls distributed within the codebase. You wouldn't want to search
for `dropna`, because you would get every function that handles NaN. If you
search for nanmean you only get the locations you want.
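
The two spellings can be put side by side in a small sketch. It is plain
Python so it runs without NumPy: `mean` and `nanmean` below are stand-ins
for np.mean and np.nanmean, and the `dropna` keyword is the proposed
spelling from this thread, not an existing NumPy parameter.

```python
import math
from statistics import fmean


def mean(values, dropna=False):
    """Keyword style: one function, NaN handling chosen by a flag."""
    if dropna:
        values = [v for v in values if not math.isnan(v)]
    return fmean(values)


def nanmean(values):
    """Name style: the NaN handling is part of the function name."""
    return mean(values, dropna=True)


data = [1.0, math.nan, 3.0]
plain = mean(data)                    # the NaN propagates into the result
by_keyword = mean(data, dropna=True)  # found by searching for "dropna"
by_name = nanmean(data)               # found by searching for "nanmean"
```

Both styles compute the same value; the difference is only in what a text
search over the call sites will match.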