[Numpy-discussion] Feature query: fetch top/bottom k from array

2022-02-22 Thread Joseph Bolton
Morning,

My apologies if this deviates from the vision of numpy:

I find myself often requiring the indices and/or values of the top (or
bottom) k items in a numpy array.

I am aware of solutions involving partition/argpartition but these are
inelegant.

I am thinking of 1-dimensional arrays, but this concept extends to an
arbitrary number of dimensions.
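
For reference, a minimal sketch of the partition-based approach mentioned above, for 1-dimensional arrays only. The helper name `top_k` and its `largest` parameter are hypothetical and used purely for illustration; this is not an existing NumPy function.

```python
import numpy as np

def top_k(a, k, largest=True):
    """Return (indices, values) of the k largest or smallest items of a 1-D array."""
    a = np.asarray(a)
    if a.ndim != 1:
        raise ValueError("this sketch only handles 1-D input")
    if not 0 < k <= a.size:
        raise ValueError("k must satisfy 0 < k <= a.size")
    if largest:
        # argpartition leaves the indices of the k largest items
        # in the last k slots, in no particular order.
        idx = np.argpartition(a, -k)[-k:]
        # Sort only those k items, largest first.
        idx = idx[np.argsort(a[idx])[::-1]]
    else:
        # The indices of the k smallest items end up in the first k slots.
        idx = np.argpartition(a, k - 1)[:k]
        # Sort only those k items, smallest first.
        idx = idx[np.argsort(a[idx])]
    return idx, a[idx]

indices, values = top_k(np.array([5, 1, 9, 3, 7]), 2)
print(indices, values)
```

The partition step is linear in the array size, and only the k selected items are sorted afterwards.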

Is this a feature that would benefit the numpy package? I am happy to code
it up.

Thanks for your time!

Best regards
Joe
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Feature query: fetch top/bottom k from array

2022-02-22 Thread Joseph Fox-Rabinovitz
Joe,

Could you show an example that you find inelegant and elaborate on how you
intend to improve it? It's hard to discuss without more specific
information.

- Joe

On Tue, Feb 22, 2022, 07:23 Joseph Bolton wrote:

> Morning,
>
> My apologies if this deviates from the vision of numpy:
>
> I find myself often requiring the indices and/or values of the top (or
> bottom) k items in a numpy array.
>
> I am aware of solutions involving partition/argpartition but these are
> inelegant.
>
> I am thinking of 1-dimensional arrays, but this concept extends to an
> arbitrary number of dimensions.
>
> Is this a feature that would benefit the numpy package? I am happy to code
> it up.
>
> Thanks for your time!
>
> Best regards
> Joe


[Numpy-discussion] Re: NEP draft for the future behaviour of scalar promotion

2022-02-22 Thread Sebastian Berg
On Tue, 2022-02-22 at 01:43 -0600, Juan Nunez-Iglesias wrote:
> 
> On Tue, 22 Feb 2022, at 1:01 AM, Stefan van der Walt wrote:
> > it is easier to explain away `x + 1` behaving oddly over `x[0] + 1`
> > behaving oddly
> 
> Is it? I find the two equivalent, honestly.
> 
> > given that we pretend like NumPy scalars do not exist.
> 
> This is the leaky abstraction that I think should be plugged in this
> revamp.
> 
> > This then argues for making explicit to the user that there are
> > scalars involved.  I.e., no more:
> > 
> > In [4]: x = np.array([1, 2, 3])
> > 
> > In [5]: x[0]
> > Out[5]: 1
> > 
> > But rather
> > 
> > Out[5]: np.int64(1)
> 
> Yup. I would be in favour of such a repr change. (And to be clear, it
> is *only* a repr change, not a behaviour change!) I have indeed run
> across this a few times, e.g. trying to encode a single value in json
> only to find that it was a NumPy int64 rather than an int.
> 
> > > > The benefit of these semantics is that you can readily express
> > > > sequences of operations with clean Python code, without having
> > > > to explicitly cast scalars to the appropriate type. Imagine if
> > > > rather than writing this:
> > > > 
> > > > 3 * (x + 1) ** 2
> > > > you had to write this:
> > > > 
> > > > np.int32(3) * (x + np.int32(1)) ** np.int32(2)
> > 
> > And how do you write the much more common
> > 
> > x[0] + 1
> 
> Is it really much more common than arithmetic combining arrays and
> literals? I'd say it's much *less* common, especially in "idiomatic"
> NumPy which tries to avoid Python looping over elements.


I think there are a few use-cases for this (one that comes to mind is
integration, where the integration function is sometimes called on
scalar values).  Especially if you look to new users, who may be using
scalars for lack of experience writing vectorized code.

But mainly, I think it is the sneakiest backcompat break...

The one "middle ground" possibility I see here is that we could limit
the weak logic to Python operators in principle (I know this seems
unpopular).
The main arguments are:
* It seems somewhat straightforward to explain that `np.add(x, 1)`
  behaves more like `np.add(x, np.asarray(1))`
* We can give warnings for operators: At least integer overflows will
  give a warning, notifying users of a potential problem.
* The long notation `np.add(x, np.uint8(1))` isn't so bad if you don't
  have operators.  (or `dtype=x.dtype`)

(I may well be missing a reason for why this doesn't add up at all.)
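
To make the spellings in the list above concrete, here is a small sketch using a `uint8` array (the array `x` and the variable names are just for illustration). The first two forms pin the result dtype explicitly. The result dtype of the `np.asarray(1)` form depends on the promotion rules of the installed NumPy version, so it is printed rather than assumed.

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.uint8)

# Long notation: give the scalar an explicit dtype.
a = np.add(x, np.uint8(1))

# Alternative: request the loop/output dtype explicitly.
b = np.add(x, 1, dtype=x.dtype)

# Wrapping the scalar in an array first: whether this stays uint8
# depends on the promotion rules in effect, so inspect it.
c = np.add(x, np.asarray(1))

print(a.dtype, b.dtype, c.dtype)
```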

Unfortunately, there will always be strange cases.  No matter what we
do, it will not always be clear if a library function calls
`np.asarray()` on the input first, or first uses the input directly.

I do not think that `asarray` should carry along the information that
its input was "weak", as JAX at least can (to me this seems error-prone,
and unlike in JAX, our arrays are not immutable).
So if you want "weak" logic for function input you need to take care to
handle it before calling `np.asarray()`.
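
A rough sketch of what "handle it before calling `np.asarray()`" could look like inside a library function. The function name `add_offset` is hypothetical, and the rule used here (a Python `int` adopts the dtype of an integer array) is only one possible reading of "weak" logic, not a description of what NumPy does.

```python
import numpy as np

def add_offset(x, offset):
    """Add `offset` to `x`, treating a plain Python int as "weak"."""
    x = np.asarray(x)
    # Decide while `offset` is still a Python scalar; once it has gone
    # through np.asarray() the information that it was "weak" is lost.
    if type(offset) is int and np.issubdtype(x.dtype, np.integer):
        # Assumed rule for this sketch: adopt the array's integer dtype.
        offset = np.asarray(offset, dtype=x.dtype)
    else:
        offset = np.asarray(offset)
    return x + offset

result = add_offset(np.array([1, 2, 3], dtype=np.uint8), 1)
print(result, result.dtype)
```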

Cheers,

Sebastian


> 
> > now?  It becomes: x[0] + np.int64(1).
> 
> I would write it as x[0].astype(np.int64) + 1, and indeed I think I
> would find that less confusing, reading the code years later, because
> it would allow me to not even have to think about type promotion.
> 
> > The reason we had value inspection was that it gave us a cushy
> > "best of both worlds"; when going with dtype-only casting, you have
> > to give something up.
> 
> Yes yes, we agree we are giving something up, we merely disagree
> about what is better to give up long term for our community. For me,
> the attractiveness of unified scalar and array semantics, together
> with unified type promotion, beats the attractiveness of hiding
> overflow from users, especially since the hiding can only ever be
> patchy.* I 100% agree with you that it is a tradeoff. But, imho, one
> worth making.
> 
> * e.g. the same user might initially be happy about the result of
> x[0] + 1 matching their infinite-precision expectation, but then be
> surprised by
> 
> x[0] + 1
> -> 256
> 
> y[0] = 1
> x[0] + y[0]
> -> 0  # WTH
> 
> Juan.


[Numpy-discussion] NumPy Development Meeting Wednesday - Triage Focus

2022-02-22 Thread Sebastian Berg
Hi all,

Our bi-weekly triage-focused NumPy development meeting is Wednesday,
February 23rd at 17:00 UTC (9:00am Pacific Time).
Everyone is invited to join in and edit the work-in-progress meeting
topics and notes:
https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg

I encourage everyone to notify us of issues or PRs that you feel should
be prioritized, discussed, or reviewed.

Best regards

Sebastian


