Re: [Numpy-discussion] NEP 31 — Context-local and global overrides of the NumPy API

2019-09-07 Thread sebastian

On 2019-09-07 15:33, Ralf Gommers wrote:

On Sat, Sep 7, 2019 at 1:07 PM Sebastian Berg wrote:


On Fri, 2019-09-06 at 14:45 -0700, Ralf Gommers wrote:







That's part of it. The concrete problems it's solving are threefold:
Array creation functions can be overridden. Array coercion is now
covered. "Default implementations" will allow you to re-write your
NumPy array more easily, when such efficient implementations exist in
terms of other NumPy functions. That will also help achieve similar
semantics, but as I said, they're just "default"...



There may be another very concrete one (that's not yet in the NEP):
allowing other libraries that consume ndarrays to use overrides. An
example is numpy.fft: currently both mkl_fft and pyfftw monkeypatch
NumPy, something we don't like all that much (in particular for
mkl_fft, because it's the default in Anaconda). `__array_function__`
isn't able to help here, because it will always choose NumPy's own
implementation for ndarray input. With unumpy you can support
multiple libraries that consume ndarrays.

Another example is einsum: if you want to use opt_einsum for all
inputs (including ndarrays), then you cannot use np.einsum. And yet
another is using bottleneck
(https://kwgoodman.github.io/bottleneck-doc/reference.html) for
nan-functions and partition. There's likely more of these.

The point is: sometimes the array protocols are preferred (e.g.
Dask/Xarray-style meta-arrays), sometimes unumpy-style dispatch works
better. It's also not necessarily an either or, they can be
complementary.



Let me try to move the discussion from the github issue here (this may
not be the best place): https://github.com/numpy/numpy/issues/14441,
which asked for easier creation functions together with
`__array_function__`.

I think an important note mentioned here is how users interact with
unumpy vs. `__array_function__`.  The former is an explicit opt-in,
while the latter is an implicit choice based on an `array-like`
abstract base class and functional type-based dispatching.

To quote NEP 18 on this: "The downsides are that this would require an
explicit opt-in from all existing code, e.g., import numpy.api as np,
and in the long term would result in the maintenance of two separate
NumPy APIs. Also, many functions from numpy itself are already
overloaded (but inadequately), so confusion about high vs. low level
APIs in NumPy would still persist."
(I do think this is a point we should not just ignore; `uarray` is a
thin layer, but it has a big surface area.)

Now there are things where explicit opt-in is obvious.  And the FFT
example is one of those: there is no way to implicitly choose another
backend (except by just replacing it, i.e. monkeypatching) [1].  And
right now I think these are _very_ different.

Now, for end-users, choosing one array-like over another seems nicer
as an implicit mechanism (why should I not mix sparse, dask and numpy
arrays!?).  This is the promise `__array_function__` tries to make.
Unless convinced otherwise, my guess is that most library authors
would strive for implicit support (i.e. sklearn, skimage, scipy).

Circling back to creation and coercion: in a purely object type
system, these would be classmethods, I guess, but in NumPy and the
libraries above, we are lost.

Solution 1: Create explicit opt-in, e.g. through uarray (NEP-31).
* Requires end-user opt-in.
* Seems cleaner in many ways.
* Requires a full copy of the API.


Bullets 1 and 3 are not required: if we decide to make it the default,
then there's no separate namespace.


It does require explicit opt-in to have any benefits to the user.




Solution 2: Add some coercion "protocol" (NEP-30) and expose a way to
create new arrays more conveniently. This would practically mean
adding an `array_type=np.ndarray` argument.
* _Not_ used by end-users! End users should use dask.linspace!
* Adds a "strange" API somewhere in numpy, and possibly a new
"protocol" (in addition to coercion). [2]

I still feel these solve different issues. The second one is intended
to make array-likes work implicitly in libraries (without end users
having to do anything), while the first seems to force the end user to
opt in, sometimes unnecessarily:

def my_library_func(array_like):
    exp = np.exp(array_like)
    idx = np.arange(len(exp))
    return idx, exp

This function would have all the information needed for implicit
opt-in/array-like support, but cannot do it right now.


Can you explain this a bit more? `len(exp)` is a number, so
`np.arange(number)` doesn't really have any information here.



Right, but as a library author, I want a way to make it use the same
type as `array_like` in this particular function; that is the point!
The end-user already signaled they prefer, say, dask, due to the array
that was actually passed in. (But this is just repeating what is below,
I think.)
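
As a concrete sketch of the creation-dispatch being asked for here:
NumPy later standardized a reference-array argument for creation
functions as the `like=` keyword (NEP 35, available since NumPy 1.20);
take the exact spelling below as illustrative of the idea rather than
part of this proposal.

import numpy as np

def my_library_func(array_like):
    exp = np.exp(array_like)             # dispatches via __array_function__
    idx = np.arange(len(exp), like=exp)  # creation follows exp's array type
    return idx, exp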



This is what I have 

[Numpy-discussion] Re: Dealing with static local variables in Numpy

2023-09-01 Thread Sebastian Berg
On Tue, 2023-08-29 at 08:01 +, Nicolas Holzschuch wrote:
> Hello,
> 
> This is my first post to this group; I'd like to start by expressing
> my appreciation for the amazing work in developing and maintaining
> Numpy. 
> 
> I have a question. Numpy has quite a lot of static local variables
> (variables defined as static inside a function), like this
> (core/src/multiarraymodule.c, line 4483):
>     if (raise_exceptions) {
>         static PyObject *too_hard_cls = NULL;
>         /* ... */
>     }
> 
> I understand that these variables provide local caching and are
> important for efficiency. They do however cause some issues when
> dealing with multiple subinterpreters, where the static local
> variable might have been initialized by one of the subinterpreters,
> and is not reset when accessed by another subinterpreter.
> More globally, they cannot be reset when the Numpy module is
> released, and thus will likely cause an issue if it is reloaded after
> being released.


Right, but in the end these caches are there for a reason (or almost
all of them are), and just removing them does not seem acceptable to
me.

However, there are better ways to solve this.  You can move them into
module state.  In the vast majority of cases that should not be hard:
the patterns are known.  In a few cases it may be harder, but I believe
CPython offers decent solutions now (though I am not sure what they
look like).
I had for a long time hoped that the HPy drive would solve this, but
there is no reason to wait for it.

In any case, contributions to this effect are very much welcome, I have
been hoping they would come for a long time, but I am not excited about
just removing the "static".

- Sebastian



> 
> I have seen the issue mentioned in at least one pull request: 
> https://github.com/numpy/numpy/pull/15169 and in several issues. If I
> understand correctly, the issue is not considered as important
> because subinterpreters are not yet prominent in CPython, static
> local variables provide an important service in caching data locally
> (instead of exposing these variables globally). So the benefits
> outweigh the costs and risks (that would be a huge change to the code
> base).
> 
> I happen to maintain, compile and run a version of Python on iOS (
> https://github.com/holzschu/a-shell/ or 
> https://apps.apple.com/us/app/a-shell/id1473805438), where I have to
> remove all these static local variables, because of the specificity
> of the platform (in order to run Python multiple times, I have to
> release and reset all modules). Right now, I'm maintaining the
> changes to the code base in a separate branch (
> https://github.com/holzschu/numpy/) and not necessarily in a very
> clean way. 
> 
> With the recent renewed interest in subinterpreters, I was wondering
> if there was a way I could contribute these changes back to the main
> numpy branch. I would have to clean up the code, obviously, and
> probably get guidance on how to do it cleanly, but the first question
> is: would there be an interest, or is that something I should keep in
> my separate branch? 
> 
> > From a technical point of view, about 80% of these static local
> > variables are just before a call to npy_cache_import(), and the
> > most efficient way to do it (in terms of lines of code) is just to
> > remove the part where npy_cache_import uses the static local
> > variable. You pay a price in performance, but gain in usability. 
> 
> Best regards,
> Nicolas Holzschuch




[Numpy-discussion] Re: Curious performance different with np.unique on arrays of characters

2023-09-29 Thread Sebastian Berg
On Fri, 2023-09-29 at 11:39 +0200, Klaus Zimmermann wrote:
> Hi,
> 
> one thing that's been on my mind about this discussion:
> 
> Isn't sorting strings simply a much harder job? Particularly Unicode 
> strings?


Yes, but in theory if they are length 1 it is just sorting integers (8
or 64bit) for the current quirky NumPy fixed-length string dtypes. 
Modulo complicated stuff that Python doesn't worry about either [1].

But, of course, that is in theory.  In practice we have a single
implementation that deals with arbitrary string lengths, so the code
does a lot of extra stuff (and it is harder to use fancy tricks, and
our implementation for a lot of these things is very basic).
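
A small sketch of that "in theory" point for length-1 fixed-width
strings (the reinterpreting view is my illustration, not code from
NumPy's sort paths):

import numpy as np

a = np.array(list("banana"), dtype="S1")   # length-1 bytes strings
ints = a.view(np.uint8)                    # same memory, integer dtype
out = np.sort(ints).view("S1")             # fast integer sort, view back
assert (out == np.sort(a)).all()
# length-1 unicode ("<U1" is UCS4) would work the same via np.uint32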

Also, while we do have the flexibility to create it now, we don't
actually have an obvious place to add such a specialization (of
course you can always insert an `if ...` clause somewhere, but that
isn't a nice design).

- Sebastian


[1] In principle you are right: sorting unicode is complicated!  In
practice, that is your problem as a user though.  If you want to deal
with weirder things, you have to normalize the unicode first, etc.



> 
> Cheers
> Klaus
> 
> On 27/09/2023 13:12, Lyla Watts wrote:
> > Could you share the processor you're currently running this on? I
> > ask because np.sort leverages AVX-512 acceleration for sorting
> > np.int32, and I'm curious if that could be contributing to the
> > observed difference in performance. 




[Numpy-discussion] Merging very limited weights support for quantiles/percentiles

2023-10-27 Thread Sebastian Berg
Hi all,

there is a PR to merge very limited support for weights in quantiles,
which, given no further input, I will probably merge based on sklearn
devs saying that they will use it.  This means adding a `weights`
kwarg [1].  See:

https://github.com/numpy/numpy/pull/24254

Limited here means that it would only work for the "inverted_cdf"
method (which is not the default one).

Why is it very limited?  Because this limited version is the only form
that we/I are pretty confident of getting right.

There are various problems with making it more broad:
1. Weights are not clearly defined and can have many meanings, e.g.:
   * frequency weights (repeated observations)
   * probability weights (removing sample biases)
   * "analytic"/"precision" weights (encoding observation
 precision/variance).

2. There is very little to no literature on how to deal with the
   subtleties (in the context of the various types of weights) of:
   * Interpolation (relevant to all interpolating methods)
   * Unbiasing (the main difference between the methods)

The PR adds the most minimal thing, where the weight styles are largely
equivalent (no unbiasing issues, no interpolation). [2]
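
A minimal sketch of the API being merged, assuming the PR lands as
described (the frequency-weight equivalence shown is the property that
makes this restricted case safe):

import numpy as np  # needs a NumPy release containing gh-24254

a = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([1, 3, 1, 1])  # frequency-style weights

# for "inverted_cdf", weighting matches repeating the observations:
q1 = np.quantile(a, 0.5, weights=w, method="inverted_cdf")
q2 = np.quantile(np.repeat(a, w), 0.5, method="inverted_cdf")
assert q1 == q2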

Due to these complexities (and the lack of many statistics specialists
looking at it) there is a point to be made that we just shouldn't add
this to NumPy, but if nobody else has an opinion, I will go with the
sklearn devs who want it :).
(Also with weights we have to rely on full sorting for now, which can
be slow, which I can live with personally.)

- Sebastian


[1] There are different styles of weights, and for some methods that
clearly matters.  Thus, if we ever expand the definition, it may be
that `weights` has to be mapped to one of these, or that the generic
`weights` kwarg would raise an error for those where you need to pick
a specific one like `pweights=` or `fweights=`.

[2] I am not quite sure about "analytic weights" here, but to me these
do not really make sense in the context of a discrete interpolation
method.



[Numpy-discussion] Windows default integer now 64bit in main

2023-11-02 Thread Sebastian Berg
Hi all,

just a heads up, the PR to change the default integer is merged on
main.  This may cause issues, especially with Cython code, because
`np.int_t` cannot be reasonably defined anymore.

Other code may also want to vet usage of "long" in any variation.  Much
code (like SciPy) simply supports any integer input, although even
there integer output may be relevant.  New NumPy defines
`NPY_DEFAULT_INT` to be able to branch at runtime; for backward
compatibility you could use:

#ifndef NPY_DEFAULT_INT
#define NPY_DEFAULT_INT NPY_LONG
#endif
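
On the Python side the change shows up in the default dtype of newly
created arrays; a quick check (assuming a 64-bit Windows build of
NumPy main):

import numpy as np

# previously int32 on 64-bit Windows, now the 64-bit default integer:
print(np.arange(3).dtype)
print(np.array([1, 2]).dtype)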

Unfortunately, I expect this to be a bit painful, please let us know if
it is too painful for some reason.

But OTOH it has been a recurring surprise and is a common reason for
Linux-written software to not run on Windows.

- Sebastian



[Numpy-discussion] Re: Windows default integer now 64bit in main

2023-11-03 Thread Sebastian Berg
On Thu, 2023-11-02 at 19:37 +0100, Michael Siebert wrote:
> Hi Sebastian,
> 
> great news! Does that mean that Windows Numpy 64 bit default integers
> are coming before Numpy 2.0, like in Numpy 1.27? Will there be
> another release before 2.0?


NumPy 2 of course.  Way too big a change.  There is no 1.27 planned as
of now; if it happens it would be a (big) backport release, though.
(Due to files having been moved around, backports seem to be getting
harder, though.)

- Sebastian


> 
> Best, Michael
> 
> > On 2. Nov 2023, at 16:25, Sebastian Berg <
> > sebast...@sipsolutions.net> wrote:
> > Hi all,
> > 
> > just a heads up, the PR to change the default integer is merged on
> > main.  This may cause issues, especially with Cython code because
> > `np.int_t` cannot be reasonably defined anymore.
> > 
> > Other code may also want to vet usage of "long" in any variation. 
> > Much
> > code (like SciPy) simply supports any integer input, although even
> > there integer output may be relevant.  New NumPy defines
> > `NPY_DEFAULT_INT` to be able to branch at runtime; for backward
> > compatibility you could use:
> > 
> > #ifndef NPY_DEFAULT_INT
> > #define NPY_DEFAULT_INT NPY_LONG
> > #endif
> > 
> > Unfortunately, I expect this to be a bit painful, please let us
> > know if
> > it is too painful for some reason.
> > 
> > But OTOH it has been a recurring surprise and is a common reason
> > for Linux-written software to not run on Windows.
> > 
> > - Sebastian
> > 




[Numpy-discussion] Re: Switching default order to column-major

2023-11-13 Thread Sebastian Berg
Few things in the Python API care about order, but there are also quite
a few places that will return C-order (and are faster for C-order
inputs) whether you change those defaults or not.

The main issue is that e.g. some Cython wrappers will probably assume
that a newly created array is C-order.  And those will just not work.

For example, I would imagine many libraries that have C/Cython wrappers
have code that doesn't specify `order="C"` explicitly (why would they?)
but then passes arrays into typed memory-views (in Cython) like
`double[:, ::1]`, which enforce a C-contiguous memory layout for speed.
Such code should normally fail gracefully, but fail it will.
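
A small sketch of that failure mode in plain Python (the explicit check
stands in for what a Cython `double[:, ::1]` memoryview does when it
receives a buffer):

import numpy as np

def wrapper(a):
    # stand-in for a C/Cython wrapper requiring C-contiguous input
    if not a.flags.c_contiguous:
        raise ValueError("ndarray is not C-contiguous")
    return a

wrapper(np.zeros((3, 3)))             # fine: default order is "C"
wrapper(np.zeros((3, 3), order="F"))  # raises with an F-order default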

Also, as Aaron said, a lot of these places might not enforce it but
still be speed impacted.

So yes, it would be expected to break a lot of C-interfacing code that
has Python wrappers around it to normalize input.

- Sebastian



On Fri, 2023-11-10 at 22:37 +, Valerio De Benedetto wrote:
> Hi, I found that the documented default row-major order is enforced
> throughout the library with a series of `order='C'` default
> parameters, so given this I supposed there's no way to change the
> default (or am I wrong?)
> If, supposedly, I'd change that by patching the library (substituting
> 'C's for 'F's), do you think there would be any problem with other
> downstream libraries using numpy in my project? Do you think they
> assume a default-constructed array is always row-major and access the
> underlying data?




[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?

2023-12-23 Thread Sebastian Berg
On Fri, 2023-12-22 at 18:01 -0500, Marten van Kerkwijk wrote:
> Hi Martin,
> 
> I agree it is a long-standing issue, and I was reminded of it by your
> comment.  I have a draft PR at 
> https://github.com/numpy/numpy/pull/25476
> that does not change the old behaviour, but allows you to pass in a
> start-stop array which behaves more sensibly (exact API TBD).
> 
> Please have a look!


That looks nice.  I don't have a clear feeling on the order of items;
if we think of it in terms of `(start, stop)`, there was also the idea
voiced to simply add another name, in which case you would allow start
and stop to be separate arrays.
Of course, if we go with your `slice(start, stop)` idea that also
works, although passing them as separate parameters seems nice too.

Adding another name (if we can think of one at least) seems pretty good
to me, since I suspect we would add docs to suggest not using
`reduceat`.
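
For context, the long-standing quirk being designed around (my
illustration, not from the thread):

import numpy as np

a = np.arange(8)
# indices are segment starts; each segment ends at the next index, but
# a non-increasing pair yields the single element a[i] instead of an
# empty reduction:
print(np.add.reduceat(a, [0, 4, 4]))  # [ 6  4 22]; the middle 4 is a[4]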


One small thing about the PR: I would like to distinguish `default` and
`initial`.  I.e. the default value is used only for empty reductions,
while the initial value should always be used (unless you pass both,
which we don't for normal reductions, though).
I suppose the machinery isn't quite set up to do both side-by-side.

- Sebastian



> 
> Marten
> 
> Martin Ling  writes:
> 
> > Hi folks,
> > 
> > I don't follow numpy development in much detail these days but I
> > see
> > that there is a 2.0 release planned soon.
> > 
> > Would this be an opportunity to change the behaviour of 'reduceat'?
> > 
> > This issue has been open in some form since 2006!
> > https://github.com/numpy/numpy/issues/834
> > 
> > The current behaviour was originally inherited from Numeric, and
> > makes
> > reduceat often unusable in practice, even where it should be the
> > perfect, concise, efficient solution. But it has been impossible to
> > change it without breaking compatibility with existing code.
> > 
> > As a result, horrible hacks are needed instead, e.g. my answer
> > here:
> > https://stackoverflow.com/questions/57694003
> > 
> > Is this something that could finally be fixed in 2.0?
> > 
> > 
> > Martin




[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?

2023-12-23 Thread Sebastian Berg
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
> 
> > That looks nice, I don't have a clear feeling on the order of
> > items, if
> > we think of it in terms of `(start, stop)` there was also the idea
> > voiced to simply add another name in which case you would allow
> > start
> > and stop to be separate arrays.
> 
> Yes, one could add another method.  Or perhaps even add a new
> argument
> to `.reduce` instead (say `slices`).  But this seemed the simplest
> route...

Yeah, I don't mind this; it doesn't stop us from a better idea either.
Adding to `.reduce` could be fine, but overall I actually think a new
name or using `reduceat` is nicer than overloading `.reduce` more, even
`reduce_slices()`.



> > 
> > I suppose the machinery isn't quite set up to do both side-by-side.
> 
> I just followed what is done for reduce, where a default could also
> have
> made sense given that `where` can exclude all inputs along a given
> row.
> I'm not convinced it would be necessary to have both, though it would
> not be hard to add.

Sorry, I misread the code: you do use initial the same way as in
reductions; I thought it wasn't used when there were multiple elements.
I.e. it is used for non-empty slices also.

There is still a little annoyance when `initial=` isn't passed, since
default/initial can be different (this is the case for object add, for
example: the default is `0`, but it is not used as initial for
non-empty reductions).
Anyway, it's a small detail to some degree, even if it may be finicky
to get right.  At the moment it seems passing `dtype=object` somehow
changes the result also.
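
A quick illustration of that default-vs-initial distinction for object
add (my example, runnable today):

import numpy as np

empty = np.array([], dtype=object)
vals = np.array([1, 2], dtype=object)

print(np.add.reduce(empty))             # 0: the default, empty case only
print(np.add.reduce(vals))              # 3: no initial value inserted
print(np.add.reduce(vals, initial=10))  # 13: initial always participates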

- Sebastian


> 
> All the best,
> 
> Marten




[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?

2024-01-07 Thread Sebastian Berg
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
> 
> > That looks nice, I don't have a clear feeling on the order of
> > items, if
> > we think of it in terms of `(start, stop)` there was also the idea
> > voiced to simply add another name in which case you would allow
> > start
> > and stop to be separate arrays.
> 
> Yes, one could add another method.  Or perhaps even add a new
> argument
> to `.reduce` instead (say `slices`).  But this seemed the simplest
> route...
> 
> > Of course if go with your `slice(start, stop)` idea that also
> > works,
> > although passing as separate parameters seems nice too.
> > 
> > Adding another name (if we can think of one at least) seems pretty
> > good
> > to me, since I suspect we would add docs to suggest not using
> > `reduceat`.
> 
> If we'd want to, even with the present PR it would be possible to
> (very
> slowly) deprecate the use of a list of single integers.  But I'm
> trying
> to go with just making the existing method more useful.
> 
> > One small thing about the PR: I would like to distinct `default`
> > and
> > `initial`.  I.e. the default value is used only for empty
> > reductions,
> > while the initial value should be always used (unless you would
> > pass
> > both, which we don't for normal reductions though).
> > I suppose the machinery isn't quite set up to do both side-by-side.
> 
> I just followed what is done for reduce, where a default could also
> have
> made sense given that `where` can exclude all inputs along a given
> row.
> I'm not convinced it would be necessary to have both, though it would
> not be hard to add.


Was looking at the PR, which still seems worthwhile, although not
urgent right now.

But, this makes me think (loudly ;)) that the `get_reduction_initial`
should maybe distinguish this more fully...

Because there are 3 cases, even if we only use the first two currently:

1. True identity: default and initial are the same.
2. Default but no initial: object sum has no initial, but does use `0`
   as default.
3. Initial is not a valid default: this would be useful to simplify
   min/max reductions: `-inf` or `MIN_INT` are valid initial values
   but are not valid default values.
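
Case 3 in action today (my example): `MIN_INT` works as an explicit
initial for a maximum reduction, but cannot serve as a default for the
empty case:

import numpy as np

a = np.array([3, 1, 2])
lo = np.iinfo(a.dtype).min               # MIN_INT
print(np.maximum.reduce(a, initial=lo))  # 3: MIN_INT is a valid initial

# an empty maximum has no identity, so there is no default to fall
# back on; this raises a ValueError:
np.maximum.reduce(np.array([], dtype=a.dtype))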

- Sebastian

> 
> All the best,
> 
> Marten




[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy

2024-01-24 Thread Sebastian Berg
On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote:
> Hi all,
> 
> I propose we accept NEP 55 and merge PR #25347 implementing the NEP
> in time
> for the NumPy 2.0 RC:


I really like this work and I think it is a big improvement!  At this
point we probably have to expect some things to be still buggy, but
that is also a reason to get it in (testing is hard if it isn't shipped
first-class unfortunately).

Nathan summarized the things I might have brought up very well.  The
support for missing values is the one thing that, to me, may end up a
bit more in flux.
But I am happy to hope that any changes will happen in a way that does
not affect pandas and, honestly, without deep integration testing we
won't make progress in figuring out whether some change is needed or
not.
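
For reference, a minimal sketch of the accepted API as it ships in
NumPy 2.0 (numpy.dtypes.StringDType with the user-supplied missing
value sentinel the NEP describes):

import numpy as np
from numpy.dtypes import StringDType

dt = StringDType(na_object=None)  # the sentinel is chosen by the user
a = np.array(["variable", "width", None], dtype=dt)
print(a)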

Thanks for the great work!

- Sebastian


> 
> https://numpy.org/neps/nep-0055-string_dtype.html
> https://github.com/numpy/numpy/pull/25347
> 
> The most controversial aspect of the NEP was support for missing
> strings
> via a user-supplied sentinel object. In the previous discussion on
> the
> mailing list, Warren Weckesser argued for shipping a missing data
> sentinel
> with NumPy for use with the DType, while in code review and the PR
> for the
> NEP, Sebastian expressed concern about the additional complexity of
> including missing data support at all.
> 
> I found that supporting missing data is key to efficiently supporting
> the
> new DType in Pandas. I think that argues that we need some level of
> missing
> data support to fully replace object string arrays. I believe the
> compromise proposal in the NEP is sufficient for downstream libraries
> while
> limiting additional complexity elsewhere in NumPy.
> 
> Concerns raised in previous discussions about concretely specifying
> the C
> API to be made public, preventing use-after-free errors in a
> multithreaded
> context, and uncertainty around the arena allocator implementation
> have
> been resolved in the latest version of the NEP and the open PR.
> Additionally, due to some excellent and timely work by Lysandros
> Nikolaou,
> we now have a number of string ufuncs in NumPy and a straightforward
> plan
> to add more. Loops have been implemented for all the ufuncs added in
> the
> NumPy 2.0 dev cycle so far.
> 
> I would like to see us ship the DType in NumPy 2.0. This will allow
> us to
> advertise a major new feature, will spur efforts to support new
> DTypes in
> downstream libraries, and will allow us to get feedback from the
> community
> that would be difficult to obtain without releasing the code into the
> wild.
> Additionally, I am funded via a NASA ROSES grant for work related to
> this
> effort until the end of 2024, so including the DType in NumPy 2.0
> will more
> efficiently use my funded time to fix issues.
> 
> If there are no substantive objections to this email, then the NEP
> will be
> considered accepted; see NEP 0 for more details:
> https://numpy.org/neps/nep-.html




[Numpy-discussion] Re: Automatic Clipping of array to upper / lower bounds of dtype

2024-03-25 Thread Sebastian Berg
On Mon, 2024-03-25 at 13:49 +, percynichols...@gmail.com wrote:
> Many thanks!
> 
> Just one more inquiry along those lines, if I may. The code asserts
> that clip should outpace np.maximum(np.minimum(arr, max), min).
> Despite this:
> %timeit a = np.arange(100); a.clip(4, 20)   # 8.48 µs
> %timeit np.maximum(np.minimum(a, 20), 4)    # 2.09 ns
> Will this be the norm?


There are some slow paths necessary due to NaN handling and a
deprecation in `np.clip`.  You should try with an up-to-date NumPy
version.

That was a known issue, but there was not much to do about it.  You
shouldn't really see much of a difference on up-to-date NumPy versions.

- Sebastian





[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0

2024-05-06 Thread Sebastian Berg
On Mon, 2024-05-06 at 09:17 +1000, Matti Picus wrote:
> On 05/05/2024 11:32, Mark Harfouche wrote:
> 

> > 
> > Thank you for considering this last minute request. I know it adds 
> > work at this stage.
> > 
> > Mark
> 
> 
> I think NumPy should not be the leader in dropping versions, rather 
> should be one of the more conservative packages since other packages 
> depend on it. We have indeed dropped 3.9 on HEAD, and will not be 
> supporting it for 2.1, but to me it makes sense to support it for the
> large 2.0 release.


I think it is late anyway, and NumPy always had a slightly longer
support period; that seemed fine, especially since NumPy is low in the
stack.

The SPEC was written to set that precedent for the community and show
that many agree with you (and NumPy endorses it).
Maybe the "endorsed by" list should rather be grown to strengthen the
argument instead?

(Of course there are true exceptions, IIRC scikit-learn chooses to have
much longer support windows.)

- Sebastian


> 
> 
> Matti
> 




[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0

2024-05-07 Thread Sebastian Berg
On Tue, 2024-05-07 at 15:46 +1000, Juan Nunez-Iglesias wrote:
> On Tue, 7 May 2024, at 7:04 AM, Ralf Gommers wrote:
> > This problem could have been avoided by proper use of upper bounds.
> > Scikit-image 0.22 not including a `numpy<2.0` upper bound is a bug
> > in scikit-image (definitely for ABI reasons, and arguably also for
> > API reasons). It would really be useful if downstream packages
> > started to take adding upper bounds correctly more seriously. E.g.,
> > SciPy has always done it right, so the failure mode that this
> > thread is about doesn't exist for SciPy. That said, this ship has
> > sailed for 2.0 - most packages don't have upper bounds in some or
> > all of their recent releases.
> 
> I don't think this is a downstream problem, I think this is a "PyPI
> has immutable metadata" problem. I'm a big fan of Henry Schreiner's
> "Should You Use Upper Bound Version Constraints" <
> https://iscinumpy.dev/post/bound-version-constraints/>, where he
> argues convincingly that the answer is almost always no. This
> highlighted bit contains the gist:



Yes, that is all because of `pip` limitations, but those limitations
are a given.  And I think it is unfortunate/odd that it effectively
argues that the lower in the stack you are, the fewer versions you
should support.

But, with that clarification, we know there may be a lot of packages
that never support both Python 3.9 and NumPy 2.
That means not publishing for 3.9 may end up helping quite a lot of
users who would otherwise have to downgrade NumPy explicitly.

If that seems the case, that is an unfortunate, but good, argument for
dropping 3.9.

I don't have an idea of how many users we'll effectively help, or
whether we do the opposite because an application (rather than a
library) wants to just use NumPy 2 always but still support Python 3.9.
But it seems to me that is what the decision comes down to, and I can
believe that it'll be a lot of hassle saved for `pip`-installing users.
(Note that skimage users will hit Cython, so they should get a
relatively clear printout that includes a "please downgrade NumPy"
suggestion.)

- Sebastian



> 
> > A library that requires a manual version intervention is not
> > broken, it’s just irritating. A library that can’t be installed due
> > to a version conflict is broken. If that version conflict is fake,
> > then you’ve created an unsolvable problem where one didn’t exist.
> 
> Dropping Py 3.9 will fix the issue for a subset of users, but
> certainly not all users. Someone who pip installs scikit-image==0.22
> on Py 3.10 will have a broken install. But importantly, they will be
> able to fix it in user space.
> 
> At any rate, it's not like NumPy (or SciPy, or scikit-image) don't
> change APIs over several minor versions. Quoting Henry again:
> 
> > Quite ironically, the better a package follows SemVer, the smaller
> > the change will trigger a major version, and therefore the less
> > likely a major version will break a particular downstream code.
> 
> In short, and independent of the Py3.9 issue, I don't think we should
> advocate for upper caps in general, because in general it is
> impossible to know whether an update is going to break your library,
> regardless of their SemVer practices, and a fake upper pin is worse
> than no upper pin.
> 
> Juan.




[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0

2024-05-07 Thread Sebastian Berg
On Tue, 2024-05-07 at 11:41 +0200, Gael Varoquaux wrote:
> On Tue, May 07, 2024 at 11:31:02AM +0200, Ralf Gommers wrote:
> > make `pip install scikit-image==0.22` work if that version of
> > scikit-image depends on an unconstrained numpy version.
> 
> Would an option be for the scikit-image maintainers to release a
> version of scikit-image 0.22 (like 0.22.1) with a constraint numpy
> version?


I don't think it helps; pip will just skip that version and pick the
previous one.  IIUC, the one thing you could do is release a new
version without a constraint that raises a detailed/informative error
message at runtime.
I.e. "work around" pip by telling users exactly what they should do.

- Sebastian


> 
> Gaël




[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0

2024-05-08 Thread Sebastian Berg
On Mon, 2024-05-06 at 22:39 +, Henry Schreiner wrote:
> This will be messier for projects building wheels and wanting to
> support non-EoL Python versions. To build a wheel with anything other
> than pybind11, you now need the oldest supported NumPy for Python <
> 3.9, the latest NumPy 1 for Python 3.9, and NumPy 2 for Python 3.10+.
> I don't know if that's important in the decision, but thought I'd
> point it out. Also, according to NEP 29, 3.10+ only became the
> requirement a couple of weeks ago, while it has been months since
> SPEC 0 dropped it. I don't think either document really details what
> to do when there's a really long development cycle that spans a
> cutoff date.


FWIW, I have heard similar opinions now.  Supporting a stack of
libraries for all non-EoL Python versions is harder if NumPy must be
different.
The biggest problem would be if you end up with full support for only
NumPy 1 or only NumPy 2 (i.e. similar to what numba has to do due to
promotion).  I hope that is rare enough that it doesn't matter, but I
can't say I am sure.
(And yeah, if that happens, we might see downstream asked to support
3.9 and NumPy 2 in a release.  And trying to avoid that was part of
why the discussion started, I think.)

- Sebastian


> 
> If you drop 3.9 from the metadata, I don't think there's any need to
> secretly keep support. It's too hard to actually use it, and it's not
> guaranteed to work; it would be better to just tell anyone needing
> 3.9 to use a beta version when it was still supported.
> 
> (Rant below)
> 
> To be fair, I've never understood NEP 29's need to limit Python
> versions to 42 months after the 2.7 issue was resolved with official
> Python EoL. Now there's a standard (60 months, exactly 5 versions),
> and almost all the rest of the ecosystem supports it. This just
> wedges a divide in the ecosystem between "scientific" and "everyone
> else". It makes me have to think "is this a scientific Python
> project? Or a general Python project?" when I really shouldn't have
> to on every project.
> 
> I really didn't understand SPEC 0's _tightening_ it to 36 months (and
> I was at the developer summit where this was decided, and stated I
> was interested in being involved in this, but was never included in
> any discussion on it, so not sure how this was even decided).
> Dropping Python doesn't hurt projects that are mostly stable, but
> ones that are not are really hurt by it. Python 3.8 is still heavily
> used; people don't mind that NumPy dropped 3.8 support because an
> older version works fine. But if there's a major change, then it
> makes smaller or new projects have to do extra work.
> 
> Current numbers (as of May 4th) for downloads of manylinux wheels:
> * 2.7: 2%
> * 3.5: 0.3%
> * 3.6: 7.4%
> * 3.7: 20.4%
> * 3.8: 23.0%
> * 3.9: 15.3%
> * 3.10: 20.8%
> * 3.11: 8.4%
> * 3.12: 2.3%
> 
> So only 30% of users have Python 3.10+ or newer. Most smaller or
> newer projects can more than double their user base by supporting 
> 3.8+. I could even argue that 3.7+ is still helpful for a new
> project. Once a library is large and stable, then it can go higher,
> even 3.13+ and not hurt anyone unless there's a major development.
> 
> Little rant finished. :)




[Numpy-discussion] Re: Unexpected return values for np.mod with x2=np.inf and similar

2024-06-10 Thread Sebastian Berg
On Mon, 2024-06-10 at 10:49 +0300, Matti Picus wrote:
> What operating system?
> 
> 
> If I recall correctly, NumPy tries to be compatible with CPython for 
> these edge cases.
> 

Right, and there seems nothing odd here to me.  Try using `divmod()` on
a few numbers (not infinities) to realize that this is how Python
defines things.
Python modulo is not identical to IEEE modulo, as described in the
docs.
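
A quick illustration with plain Python floats (np.mod follows the same
floored-modulo convention, where the result takes the divisor's sign):

import math

print(5.0 % math.inf)             # 5.0
print(-5.0 % math.inf)            # inf: sign follows the divisor
print(math.fmod(-5.0, math.inf))  # -5.0: fmod keeps the dividend's sign
print(divmod(-5.0, math.inf))     # (-1.0, inf)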

- Sebastian


> 
> The actual implementation is a bit scattered. I think it would be
> nice 
> if we could have an "explain" decorator to ufuncs that would return
> the 
> name of the inner loop used in practice to aid in debugging and 
> teaching. Until then your best bet is to build NumPy locally with
> debug 
> information and use a debugger, but even that can be challenging at
> times.
> 
> Matti
> 
> 
> On 07/06/2024 21:10, jesse.live...@gmail.com wrote:
> > Hi all,
> > 
> > I ran into an odd edge-case with np.mod and was wondering if this
> > is the expected behavior, and if so why. This is on a fresh install
> > of python 3.10.14 with numpy 1.26.4 from conda-forge.
> > ...
> > Any ideas why these are the return values? I had a hard time
> > tracking down where in the numpy source np.mod was coming from.
> > Jesse
> > https://github.com/numpy/numpy/blob/main/numpy/_core/src/umath/loops_modulo.dispatch.c.src#L557




[Numpy-discussion] Re: Mysterious issue to build pyFFTW with Numpy 2.0 on Windows

2024-07-03 Thread Sebastian Berg
The most probable cause seems to me that NumPy now includes
`complex.h`.  But I am not sure that is the right direction or why it
would lead to cryptic errors.

- Sebastian



On Wed, 2024-07-03 at 10:30 +0200, PIERRE AUGIER wrote:
> Hi,
> 
> We have a strange issue with building pyFFTW with Numpy 2.0 on
> Windows. I observed it before when a build in the CI tried to use
> Numpy 2.0. The solution was to pin the Numpy version used for the
> build to <2.0.
> 
> However, now I'm trying in this PR
> (https://github.com/pyFFTW/pyFFTW/pull/383) to make pyFFTW compatible
> with Numpy 2.0. With few simple changes, it works well on Linux and
> Macosx but not on Windows.
> 
> The meaningful part of the log seems to be:
> 
> INFO:root:"C:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.e
> xe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DPYFFTW_HAVE_DOUBLE=1 -
> DPYFFTW_HAVE_DOUBLE_OMP=0 -DPYFFTW_HAVE_DOUBLE_THREADS=1 -
> DPYFFTW_HAVE_DOUBLE_MULTITHREADING=1 -DPYFFTW_HAVE_DOUBLE_MPI=0 -
> DPYFFTW_HAVE_SINGLE=1 -DPYFFTW_HAVE_SINGLE_OMP=0 -
> DPYFFTW_HAVE_SINGLE_THREADS=1 -DPYFFTW_HAVE_SINGLE_MULTITHREADING=1 -
> DPYFFTW_HAVE_SINGLE_MPI=0 -DPYFFTW_HAVE_LONG=1 -
> DPYFFTW_HAVE_LONG_OMP=0 -DPYFFTW_HAVE_LONG_THREADS=1 -
> DPYFFTW_HAVE_LONG_MULTITHREADING=1 -DPYFFTW_HAVE_LONG_MPI=0 -
> DPYFFTW_HAVE_MPI=0 -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -
> ID:\a\pyFFTW\pyFFTW\include -ID:\a\pyFFTW\pyFFTW\pyfftw -
> IC:\Users\runneradmin\AppData\Local\Temp\pip-build-env-
> zhyzy1bf\overlay\Lib\site-packages\numpy\_core\include -
> IC:\Users\runneradmin\AppData\Local\Temp\cibw-run-g1feworz\cp310-
> win_amd64\build\venv\include -ID:\a\pyFFTW\pyFFTW\include\win -
> IC:\Users\runneradmin\AppData\Local\Temp\cibw-run-g1few
>  orz\cp310-win_amd64\build\venv\include -
> IC:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-
> cpython\python.3.10.11\tools\include -
> IC:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-
> cpython\python.3.10.11\tools\Include -
> IC:\Users\runneradmin\AppData\Local\Temp\cibw-run-g1feworz\cp310-
> win_amd64\build\venv\include -
> IC:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-
> cpython\python.3.10.11\tools\include -
> IC:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-
> cpython\python.3.10.11\tools\Include "-IC:\Program Files\Microsoft
> Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\include" "-
> IC:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\ATLMFC\include" "-
> IC:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Auxiliary\VS\include" "-IC:\Program Files
> (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files
> (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files
> (x86)\W
>  indows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program Files
> (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IC:\Program
> Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" "-
> IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-
> IC:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\include" "-
> IC:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\ATLMFC\include" "-
> IC:\Program Files\Microsoft Visual
> Studio\2022\Enterprise\VC\Auxiliary\VS\include" "-IC:\Program Files
> (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files
> (x86)\Windows Kits\10\\include\10.0.22621.0\\um" "-IC:\Program Files
> (x86)\Windows Kits\10\\include\10.0.22621.0\\shared" "-IC:\Program
> Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt" "-
> IC:\Program Files (x86)\Windows
> Kits\10\\include\10.0.22621.0\\cppwinrt" "-IC:\Program Files
> (x86)\Windows Kits\NETFXSDK\4.8\include\um" /Tcpyfftw\pyfftw.c
> /Fobuild\temp.win
>  -amd64-cpython-310\Release\pyfftw\pyfftw.obj
>     pyfftw.c
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2061: syntax
> error: identifier 'fftw_complex'
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2059: syntax
> error: ';'
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax
> error: missing ')' before '*'
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2081:
> 'fftw_complex': name in formal parameter list illegal
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax
> error: missing '{' before '*'
>     D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax
> error:

[Numpy-discussion] Re: Enhancement for generalized ufuncs

2024-07-12 Thread Sebastian Berg
On Thu, 2024-07-11 at 19:31 -0400, Warren Weckesser wrote:
> I have implemented quite a few generalized ufuncs over in ufunclab
> (https://github.com/WarrenWeckesser/ufunclab), and in the process I
> have accumulated a gufunc "wish list". Two items on that list are:
> 
> (1) the ability to impose constraints on the core dimensions that are
> checked when the gufunc is called. By far the most common use-case I
> have is requiring that a dimension have length at least 1. To do this
> currently, I check the shapes in the ufunc loop function, and if they
> are not valid, raise an exception and hope that the gufunc machinery
> processes it as expected when the loop function returns. (Sorry, I'm
> using lingo--"loop function", "core dimension", etc--that will be
> familiar to those who already know the ufunc C API, but not so
> familiar to general users of NumPy.)
> 
> (2) the ability to have the output dimension be a function of the
> input dimensions, instead of being limited to one of the input
> dimensions or an independent dimension. Then one could create, for
> example, a 1-d convolution gufunc with shape signature that is
> effectively `(m),(n)->(m + n - 1)` (corresponding to `mode='full'` in
> `np.convolve`) and the gufunc code would automatically allocate the
> output with the correct shape and dtype.
> 

Nice, thanks!  I have to look at the implementation in detail, but this
seems like a good idea.

Have to look at the PR for bike-shedding, but I think we should just
add this.
(You won't be able to know these relations from reading the signature,
but I doubt it's worth worrying about that.)

This seems like it should cover all or at least almost all of the
things that have come up about ufunc core dimension flexibility (might
be nice to check briefly, but even if not I suspect the hook here is
the right choice).

- Sebastian


> I have proposed a change in https://github.com/numpy/numpy/pull/26908
> that makes both these features possible. A field is added to the
> PyUFuncObject that is an optional user-defined C function that the
> gufunc author implements. When a gufunc is called, this function is
> called with an array of the values of the core dimensions of the
> input
> and output arrays. Some or all of the output core dimensions might be
> -1, meaning the arrays are to be allocated by the gufunc/iterator
> machinery.  The new "hook" allows the user to check the given core
> dimensions and raise an exception if some constraint is not
> satisfied.
> The user-defined function can also replace those -1 values with sizes
> that it computes based on the given input core dimensions.
> 
> To define the 1-d convolution gufunc, the actual shape signature that
> is passed to `PyUFunc_FromFuncAndDataAndSignature` is `(m),(n)->(p)`.
> When a user passes arrays with shapes, say, (20,) and (30,) as the
> input and with no output array specified, the user-defined function
> will get the array [20, 30, -1]. It would replace -1 with m + n - 1 =
> 49 and return. If the caller *does* include an output array in the
> call, the core dimension of that array will be the third element of
> the array passed to the user-defined function. In that case, the
> function verifies that the value equals m + n - 1, and raises an
> exception if it doesn't.
> 
> Here's that 1-d convolution, called `conv1d_full` here, in action:
> 
> ```
> In [14]: import numpy as np
> 
> In [15]: from experiment import conv1d_full
> 
> In [16]: type(conv1d_full)
> Out[16]: numpy.ufunc
> ```
> 
> `m = 4`, `n = 6`, so the output shape is `p = m + n - 1 = 9`:
> 
> ```
> In [17]: conv1d_full([1, 2, 3, 4], [-1, 1, 2, 1.5, -2, 1])
> Out[17]: array([-1. , -1. , 1. , 4.5, 11. , 9.5, 2. , -5. , 4. ])
> ```
> 
> Standard broadcasting:
> 
> ```
> In [18]: conv1d_full([[1, 2, 3, 4], [0.5, 0, -1, 1]], [-1, 1, 2, 1.5,
> -2, 1])
> Out[18]:
> array([[-1. , -1. , 1. , 4.5 , 11. , 9.5 , 2. , -5. , 4. ],
>     [-0.5 , 0.5 , 2. , -1.25, -2. , 1. , 3.5 , -3. , 1. ]])
> ```
> 
> Comments here or over in the pull request are welcome. The essential
> changes to the source code are just 7 lines in `ufunc_object.c` and 7
> lines in `ufuncobject.h`. The rest of the changes in the PR create a
> couple gufuncs that use the new feature, with corresponding unit
> tests.
> 
> Warren




[Numpy-discussion] Re: Enhancement for generalized ufuncs

2024-07-12 Thread Sebastian Berg
On Fri, 2024-07-12 at 09:56 -0400, Warren Weckesser wrote:
> On Fri, Jul 12, 2024 at 7:47 AM Sebastian Berg
>  wrote:
> > 
> 
> > (You won't be able to know these relations from reading the
> > signature,
> > but I doubt it's worth worrying about that.)
> 
> After creating the gufunc with `PyUFunc_FromFuncAndDataAndSignature`,
> the gufunc author could set the `core_signature` field at the same
> time that `process_core_dims_func` is set.  That requires freeing the
> old signature and allocating memory for the new one.  For the 1-d
> convolution example, the signature would be set to `"(m),(n)->(m + n
> -
> 1)"`:
> 
> ```
> In [1]: from experiment import conv1d_full
> 
> In [2]: conv1d_full.signature
> Out[2]: '(m),(n)->(m + n - 1)'
> ```


I have to look at the PR, but the ufunc parses the signature only once?
That solution seems very hacky, but allowing one to just replace the
signature may make sense.
(The downside is if someone else wants to parse the original signature,
but I guess that is unlikely.)

In either case, the only other thing to hook into would be the
signature parsing itself with the full shapes available.  But then you
may need to deal with `axes=`, etc. as well, so I think your solution
that only adjusts shapes seems better.
It's much simpler and should cover most or even all relevant things.

- Sebastian



> 
> Warren




[Numpy-discussion] Welcome Joren Hammudoglu to the NumPy Maintainers Team

2024-08-19 Thread Sebastian Berg
Hi all,

please join me in welcoming Joren (https://github.com/jorenham) to the
NumPy maintainers team.

Joren has done a lot of work recently contributing, reviewing, and
maintaining typing-related improvements to NumPy.
We are looking forward to new momentum in improving NumPy typing.

Thanks for all the contributions!

Cheers,

Sebastian



[Numpy-discussion] Re: ENH: Uniform interface for accessing minimum or maximum value of a dtype

2024-08-26 Thread Sebastian Berg
On Mon, 2024-08-26 at 11:26 -0400, Marten van Kerkwijk wrote:
> I think a NEP is a good idea.  It would also seem to make sense to
> consider how the dtype itself can hold/calculate this type of
> information, since that will be the only way a generic ``info()``
> function can get information for a user-defined dtype.  Indeed,
> taking
> that further, might a method or property on the dtype itself be
> the cleaner interface?  I.e., one would do ``dtype.info().min`` or
> ``dtype.info.min``.
> 

I agree; I think it should be properties/attributes (I don't think it
needs to be a function, since it should be cheap?).
It might also be that `np.finfo()` could keep working via
`dtype.finfo`, or a dunder if we want to hide it.

In general, I would lean towards some form of attributes, even if I am
not sure whether they should be `.info`, `.finfo`, or even directly on
the dtype.
(`.info.min` seems tricky, because I am not sure it is clear whether
-inf or the minimum finite value is "min".)

A (potentially very short) NEP would probably help to get momentum on
making a decision.  I certainly would like to see this being worked on!

- Sebastian
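
As a sketch of the status quo this thread wants to unify (the helper
name `dtype_minmax` is hypothetical; only the `np.finfo`/`np.iinfo`
split exists today):

```
import numpy as np

def dtype_minmax(dt):
    # The dispatch that a uniform interface would fold into a single
    # attribute.  Note: finfo(...).min is the most negative *finite*
    # value, not -inf, which is exactly the ambiguity mentioned above.
    dt = np.dtype(dt)
    if np.issubdtype(dt, np.floating):
        info = np.finfo(dt)
    elif np.issubdtype(dt, np.integer):
        info = np.iinfo(dt)
    else:
        raise TypeError("no min/max defined for %s" % dt)
    return info.min, info.max

dtype_minmax(np.int16)    # (-32768, 32767)
dtype_minmax(np.float32)  # (-3.4028235e+38, 3.4028235e+38)
```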


> -- Marten
> 
> Nathan  writes:
> 
> > That seems reasonable to me on its face. There are some corner
> > cases to work out though.
> > 
> > Swayam is tinkering with a quad precision dtype written using the
> > new DType API and just ran into the
> > fact that finfo doesn’t support user dtypes: 
> > 
> > https://github.com/numpy/numpy/issues/27231
> > 
> > IMO any new feature along these lines should have some thought in
> > the design about how to handle
> > user-defined data types.
> > 
> > Another thing to consider is that data types can be non-numeric
> > (things like categories) or number-like
> > but not really just a number, like a quantity with a physical unit.
> > That means you should also think
> > about what to do where fields like min and max don’t make any sense
> > or need to be a generic python
> > object rather than a numeric type.
> > 
> > I think if someone proposed a NEP that fully worked this out it
> > would be welcome. My understanding
> > is that the array API consortium prefers to standardize on APIs
> > that gain traction in libraries rather
> > than inventing APIs and telling libraries to adopt them, so I think
> > a NEP is the right first step, rather
> > than trying to standardize something in the array API.
> > 
> > On Mon, Aug 26, 2024 at 8:06 AM Lucas Colley <
> > lucas.coll...@gmail.com> wrote:
> > 
> >  Or how about `np.dtype_info(dt)`, which could return an object
> > with attributes like `min` and `max
> >  `. Would that be possible?




Re: [Numpy-discussion] Optimize evaluation of function on matrix

2017-03-25 Thread Sebastian Berg
On Sat, 2017-03-25 at 18:46 +0100, Florian Lindner wrote:
> Hello,
> 
> I have this function:
> 
> def eval_BF(self, meshA, meshB):
>     """ Evaluates single BF or list of BFs on the meshes. """
>     if type(self.basisfunction) is list:
>         A = np.empty((len(meshA), len(meshB)))
>         for i, row in enumerate(meshA):
>             for j, col in enumerate(meshB):
>                 A[i, j] = self.basisfunction[j](row - col)
>     else:
>         mgrid = np.meshgrid(meshB, meshA)
>         A = self.basisfunction( np.abs(mgrid[0] - mgrid[1]) )
>     return A
> 
> 
> meshA and meshB are 1-dimensional numpy arrays. self.basisfunction is
> e.g.
> 
> def Gaussian(radius, shape):
>     """ Gaussian Basis Function """
>     return np.exp( -np.power(shape*abs(radius), 2))
> 
> 
> or a list of partial instantations of such functions (from
> functools.partial).
> 
> How can I optimize eval_BF? Esp. in the case of basisfunction being a
> list.
> 

Are you sure you need to optimize it? If they have a couple of hundred
elements or so for each row, the math is probably the problem and most
of that might be the `exp`.
You can get rid of the `row` loop, though, in case an individual
row is a pretty small array.

To be honest, I am a bit surprised that it's a problem, since "basis
function" sounds a bit like you have to do this once and then use the
result many times.

- Sebastian


> Thanks!
> Florian


[Numpy-discussion] heads up: gufuncs on empty arrays and NpyIter removal of empty axis

2017-03-26 Thread Sebastian Berg
Hi all,

just a small heads up for gufunc hackers and low level iterator users.
We will probably very soon put in a commit into master that will allow
the removal of empty axis from NpyIter/nditer, effectively removing the
error:

"ValueError: cannot remove a zero-sized axis from an iterator"

and allowing:

```
arr = np.zeros((100, 0))
it = np.nditer((arr,),
   flags=["zerosize_ok", "multi_index"])
it.remove_axis(1)
```

As a follow-up step, we also allow that gufuncs may be called with
empty inner loop sizes. In some cases that may mean that your gufuncs
may need special handling for, let's say:

```
arr = np.zeros((100, 0))  # note the 0 dimension.
my_gufunc(arr)
```

If this creates problems for you, please tell us, so that we can slow
down or undo the change. As an example, we have a matrix_multiply
gufunc for testing purposes, which did not zero out the output for the
case of `matrix_multiply(np.ones((10, 0)), np.ones((0, 10)))`. So this
could turn code that errored out for weird reasons into wrong results
in rare cases.

- Sebastian
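
A quick way to see the intended behaviour with a gufunc that handles
this correctly (using np.matmul here as a stand-in for the
testing-only matrix_multiply mentioned above):

```
import numpy as np

a = np.ones((10, 0))
b = np.ones((0, 10))

# The inner (contraction) dimension is empty, so the result must be
# all zeros; a gufunc that skips zeroing its output would instead
# leave uninitialized memory here.
c = np.matmul(a, b)
print(c.shape)  # (10, 10)
print(c.sum())  # 0.0
```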



Re: [Numpy-discussion] Optimize evaluation of function on matrix

2017-03-27 Thread Sebastian Berg
On Mon, 2017-03-27 at 13:06 +0200, Florian Lindner wrote:
> Hey,
> 
> I've timed the two versions, one basisfunction being a function:
> 
> 1 loop, best of 3: 17.3 s per loop
> 
> the other one, basisfunction being a list of functions:
> 
> 1 loop, best of 3: 33.5 s per loop
> 
> > To be honest, I am a bit surprised that its a problem, since "basis
> > function" sounds a bit like you have to do this once and then use
> > the
> > result many times.
> 
> It's part of a radial basis function interpolation algorithm. Yes, in
> practice the matrix is filled only once and reused
> a couple of times, but in my case, which is exploration of parameters
> for the algorithm, I call eval_BF many times.
> 
> > You can get rid of the `row` loop though in case row if an
> > individual
> > row is a pretty small array.
> 
> Would you elaborate on that? Do you mean that the inner col loop
> produces an array which is then assigned to the row?
> But I think it still needs the row loop there.

Well, I prefer not to just hand you the result, but if you exchange
the loops:

A = np.empty((len(meshA), len(meshB)))
for j, col in enumerate(meshB):
    for i, row in enumerate(meshA):
        A[i, j] = self.basisfunction[j](row - col)

then you can see that there is broadcasting magic (do not want
to use too many brain cells now) similar to:

A = np.empty((len(meshA), len(meshB)))
for j, col in enumerate(meshB):
    # possibly insert np.newaxis/None or a reshape in [??]
    A[:, j] = self.basisfunction[j](meshA[??] - col)

- Sebastian
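
Spelled out as runnable code (my sketch; since the meshes are 1-d, the
`[??]` placeholder above can simply be dropped, with no reshape or
np.newaxis needed):

```
import numpy as np
from functools import partial

def gaussian(radius, shape):
    return np.exp(-np.power(shape * np.abs(radius), 2))

meshA = np.linspace(0.0, 1.0, 6)
meshB = np.linspace(0.0, 1.0, 4)
basisfunction = [partial(gaussian, shape=s) for s in (1.0, 2.0, 3.0, 4.0)]

# One vectorized call per basis function instead of one per element.
A = np.empty((len(meshA), len(meshB)))
for j, col in enumerate(meshB):
    A[:, j] = basisfunction[j](meshA - col)

# Check against the original element-wise double loop.
B = np.empty_like(A)
for i, row in enumerate(meshA):
    for j, col in enumerate(meshB):
        B[i, j] = basisfunction[j](row - col)
assert np.allclose(A, B)
```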

> 
> Best,
> Florian
> 
> Am 25.03.2017 um 22:31 schrieb Sebastian Berg:
> > On Sat, 2017-03-25 at 18:46 +0100, Florian Lindner wrote:
> > > Hello,
> > > 
> > > I have this function:
> > > 
> > > def eval_BF(self, meshA, meshB):
> > >     """ Evaluates single BF or list of BFs on the meshes. """
> > >     if type(self.basisfunction) is list:
> > >         A = np.empty((len(meshA), len(meshB)))
> > >         for i, row in enumerate(meshA):
> > >             for j, col in enumerate(meshB):
> > >                 A[i, j] = self.basisfunction[j](row - col)
> > >     else:
> > >         mgrid = np.meshgrid(meshB, meshA)
> > >         A = self.basisfunction( np.abs(mgrid[0] - mgrid[1]) )
> > >     return A
> > > 
> > > 
> > > meshA and meshB are 1-dimensional numpy arrays.
> > > self.basisfunction is
> > > e.g.
> > > 
> > > def Gaussian(radius, shape):
> > >     """ Gaussian Basis Function """
> > >     return np.exp( -np.power(shape*abs(radius), 2))
> > > 
> > > 
> > > or a list of partial instantations of such functions (from
> > > functools.partial).
> > > 
> > > How can I optimize eval_BF? Esp. in the case of basisfunction
> > > being a
> > > list.
> > > 
> > 
> > Are you sure you need to optimize it? If they have a couple of
> > hundred
> > elements or so for each row, the math is probably the problem and
> > most
> > of that might be the `exp`.
> > You can get rid of the `row` loop though in case row if an
> > individual
> > row is a pretty small array.
> > 
> > To be honest, I am a bit surprised that its a problem, since "basis
> > function" sounds a bit like you have to do this once and then use
> > the
> > result many times.
> > 
> > - Sebastian
> > 
> > 
> > > Thanks!
> > > Florian


Re: [Numpy-discussion] Fwd: SciPy2017 Sprints FinAid for sprint leaders/core devs

2017-03-30 Thread Sebastian Berg
On Thu, 2017-03-30 at 22:46 +1300, Ralf Gommers wrote:
> 

> 
> Agreed, and I would call that productive. Getting even one new
> maintainer involved is worth organizing multiple sprints for.
> 
> That said, also +1 to a developer meeting this year. It'd be good if
> we could combine it with the NumFOCUS summit or a relevant conference
> in the second half of the year.

Would be good, even if there is nothing big going on.

Can we gather possible dates and possible (personal) preferences? Here
is a start:

* SciPy (Austin, TX): July 10-16
* EuroScipy (Germany): August 23-27
* NumFocus Summit?
* PyData Events??

Personally, I probably can't make longer trips until some time in July.
We probably won't find a perfect time anyway, so
personal preferences or not, whoever is willing to organize a bit can
decide on the time and place as far as I am concerned :).

- Sebastian


> 
> Ralf
> 
> 


Re: [Numpy-discussion] proposal: smaller representation of string arrays

2017-04-26 Thread Sebastian Berg
> > saving
> > > some memory in some ascii heavy cases, e.g. astronomy.
> > > It is not that significant anymore as porting to python3 has
> > > mostly
> > > already happened via the ugly byte workaround and memory saving
> > > is
> > > probably not as significant in the context of numpy which is
> > > already
> > > heavy on memory usage.
> > > 
> > > My initial approach was to not add a new dtype but to make
> > > unicode
> > > parametrizable which would have meant almost no cluttering of
> > > numpys
> > > internals and keeping the api more or less consistent which would
> > > make
> > > this a relatively simple addition of minor functionality for
> > > people that
> > > want it.
> > > But adding a completely new partially redundant dtype for this
> > > usecase
> > > may be a too large change to the api. Having two partially
> > > redundant
> > > string types may confuse users more than our current status quo
> > > of our
> > > single string type (U).
> > > 
> > > Discussing whether we want to support truncated utf8 has some
> > > merit as
> > > it is a decision whether to give the users an even larger gun to
> > > shoot
> > > themselves in the foot with.
> > > But I'd like to focus first on the 1 byte type to add a symmetric
> > > API
> > > for python2 and python3.
> > > utf8 can always be added later should we deem it a good idea.
> > 
> > What is your current proposal? A string dtype parameterized with
> > the
> > encoding (initially supporting the latin-1 that you desire and
> > maybe
> > adding utf-8 later)? Or a latin-1-specific dtype such that we will
> > have
> > to add a second utf-8 dtype at a later date?
> 
> My proposal is a single new parameterizable dtype. Adding multiple
> dtypes for each encoding seems unnecessary to me given that numpy
> already supports parameterizable types.
> For example datetime is very similar, it is basically encoded
> integers.
> There are multiple encodings = units supported.
> 
> > 
> > If you're not going to support arbitrary encodings right off the
> > bat,
> > I'd actually suggest implementing UTF-8 and ASCII-surrogateescape
> > first
> > as they seem to knock off more use cases straight away.
> > 
> 
> 
> Please list the use cases in the context of numpy usage. hdf5 is the
> most obvious, but how exactly would hdf5 use an utf8 array in the
> actual
> implementation?
> 
> What you save by having utf8 in the numpy array is replacing a
> decoding
> and encoding step with a null-padding stripping step.
> That doesn't seem very worthwhile compared to all their other
> overheads
> involved.

I remember talking with a colleague about something like that. And
basically an annoying thing there was that if you strip the zero bytes
in a zero-padded string, some encodings (UTF16) may need one of the
zero bytes to work right. (I think she got around it by weird
trickery, inverting the endianness or so and thus putting the zero
bytes first.)
Maybe I will ask her if this discussion is interesting to her. Though I
think it might have been something like "make everything in
hdf5/something similar work" without any actual use case, I don't know.

I have not read the whole thread, but I think a fixed-byte type with a
settable encoding would make sense. I personally wonder whether storing
the length might make sense, even if that removes direct memory
mapping; but, as you said, you can still memmap the bytes and then
probably just cast back and forth.

Sorry if there is zero actual input here :)

- Sebastian
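
The zero-stripping problem is easy to demonstrate (a sketch of the
failure mode, not her actual code):

```
raw = "ab".encode("utf-16-le")   # b'a\x00b\x00'
stripped = raw.rstrip(b"\x00")   # b'a\x00b' -- a needed zero byte is gone
try:
    stripped.decode("utf-16-le")
except UnicodeDecodeError as err:
    print(err)                   # ... truncated data
```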





Re: [Numpy-discussion] [SciPy-User] NumPy v1.13.0rc1 released.

2017-05-12 Thread Sebastian Berg
On Fri, 2017-05-12 at 16:28 +0200, Jens Jørgen Mortensen wrote:
> Den 11-05-2017 kl. 03:48 skrev Charles R Harris:
> > Hi All,
> > 
> > I'm please to announce the NumPy 1.13.0rc1 release. This release
> > supports Python 2.7 and 3.4-3.6 and contains many new features. It
> > is one of the most ambitious releases in the last several years.
> > Some of the highlights and new functions are
>  
> I found this strange behavior:
> 
> (np113) [jensj@ASUS np113]$ python3
> Python 3.5.3 (default, Jan 19 2017, 14:11:04) 
> [GCC 6.3.0 20170118] on linux
> Type "help", "copyright", "credits" or "license" for more
> information.
> >>> import numpy as np
> >>> np.__version__
> '1.13.0rc1'
> >>> s = (27, 27, 27)
> >>> x = np.ones(s, complex)
> >>> y = np.zeros(s)
> >>> y += abs(x * 2.0)**2
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: Cannot cast ufunc add output from dtype('complex128') to
> dtype('float64') with casting rule 'same_kind'
> 
> Works OK with s=(3,3,3).
> 

I have opened an issue: https://github.com/numpy/numpy/issues/9109

Since it is so "odd", I expect it is due to the temporary elision
kicking in when it should not in this case.

- Sebastian


> Jens Jørgen
> 
> > Highlights
> > Operations like ``a + b + c`` will reuse temporaries on some
> > platforms, resulting in less memory use and faster execution.
> > Inplace operations check if inputs overlap outputs and create
> > temporaries to avoid problems.
> > New __array_ufunc__ attribute provides improved ability for classes
> > to override default ufunc behavior.
> >  New np.block function for creating blocked arrays.
> > 
> > New functions
> > New ``np.positive`` ufunc.
> > New ``np.divmod`` ufunc provides more efficient divmod.
> > New ``np.isnat`` ufunc tests for NaT special values.
> > New ``np.heaviside`` ufunc computes the Heaviside function.
> > New ``np.isin`` function, improves on ``in1d``.
> > New ``np.block`` function for creating blocked arrays.
> > New ``PyArray_MapIterArrayCopyIfOverlap`` added to NumPy C-API.
> > Wheels for the pre-release are available on PyPI. Source tarballs,
> > zipfiles, release notes, and the Changelog are available on github.
> > 
> > A total of 100 people contributed to this release.  People with a
> > "+" by their
> > names contributed a patch for the first time.
> > A. Jesse Jiryu Davis +
> > Alessandro Pietro Bardelli +
> > Alex Rothberg +
> > Alexander Shadchin
> > Allan Haldane
> > Andres Guzman-Ballen +
> > Antoine Pitrou
> > Antony Lee
> > B R S Recht +
> > Baurzhan Muftakhidinov +
> > Ben Rowland
> > Benda Xu +
> > Blake Griffith
> > Bradley Wogsland +
> > Brandon Carter +
> > CJ Carey
> > Charles Harris
> > Danny Hermes +
> > Duke Vijitbenjaronk +
> > Egor Klenin +
> > Elliott Forney +
> > Elliott M Forney +
> > Endolith
> > Eric Wieser
> > Erik M. Bray
> > Eugene +
> > Evan Limanto +
> > Felix Berkenkamp +
> > François Bissey +
> > Frederic Bastien
> > Greg Young
> > Gregory R. Lee
> > Importance of Being Ernest +
> > Jaime Fernandez
> > Jakub Wilk +
> > James Cowgill +
> > James Sanders
> > Jean Utke +
> > Jesse Thoren +
> > Jim Crist +
> > Joerg Behrmann +
> > John Kirkham
> > Jonathan Helmus
> > Jonathan L Long
> > Jonathan Tammo Siebert +
> > Joseph Fox-Rabinovitz
> > Joshua Loyal +
> > Juan Nunez-Iglesias +
> > Julian Taylor
> > Kirill Balunov +
> > Likhith Chitneni +
> > Loïc Estève
> > Mads Ohm Larsen
> > Marein Könings +
> > Marten van Kerkwijk
> > Martin Thoma
> > Martino Sorbaro +
> > Marvin Schmidt +
> > Matthew Brett
> > Matthias Bussonnier +
> > Matthias C. M. Troffaes +
> > Matti Picus
> > Michael Seifert
> > Mikhail Pak +
> > Mortada Mehyar
> > Nathaniel J. Smith
> > Nick Papior
> > Oscar Villellas +
> > Pauli Virtanen
> > Pavel Potocek
> > Pete Peeradej Tanruangporn +
> > Philipp A +
> > Ralf Gommers
> > Robert Kern
> > Roland Kaufmann +
> > Ronan Lamy
> > Sami Salonen +
> > Sanchez Gonzalez Alvaro
> > Sebastian Berg
> > Shota Kawabuchi
> > Simon Gibbons
> > Stefan Otte
> > Stefan Peterson +
> > Stephan Hoyer
> > Søren Fuglede Jørgensen +
> > Takuya Akiba
> > Tom Boyd +
> > Ville Skyttä +
> > Warren Weckesser
> > Wendell Smith
> > Yu Feng
> > Zixu Zhao +
> > Zè Vinícius +
> > aha66 +
> > davidjn +
> > drabach +
> > drlvk +
> > jsh9 +
> > solarjoe +
> > zengi +
> > Cheers,
> > 
> > Chuck
> > 
> > 


Re: [Numpy-discussion] failed to add routine to the core module

2017-05-18 Thread Sebastian Berg
On Thu, 2017-05-18 at 15:04 +0200, marc wrote:
> Dear Numpy developers,
> I'm trying to add a routine to calculate the sum of a product of two
> arrays (a dot product). But that would not increase the memory (from
> what I saw np.dot is increasing the memory while it should not be
> necessary). The idea is to avoid the use of the temporary array in
> the calculation of the variance ( numpy/numpy/core/_methods.py line
> 112).

np.dot should only increase memory in some cases (such as non-
contiguous arrays) and be much faster in most cases (unless e.g. you do
not have a BLAS-compatible type). You might also want to check out
np.einsum, which is pretty slick and can handle these kinds of
operations as well. Note that `np.dot` calls into BLAS, so it is in
general much faster than np.einsum.

- Sebastian
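
As a sketch of the suggestion: both dot and einsum on the centered
data compute sum((a - mean)**2) without materializing the squared
temporary:

```
import numpy as np

arr = np.random.rand(10)
d = arr - arr.mean()

var_dot = np.dot(d, d)                # BLAS path
var_einsum = np.einsum('i,i->', d, d)

assert np.allclose(var_dot, var_einsum)
assert np.allclose(var_dot / arr.size, arr.var())
```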

> The routine that I want to implement look like this in python,
> arr = np.random.rand(10)
> mean = arr.mean()
> var = 0.0
> for ai in arr: var += (ai-mean)**2
> I would like to implement it in the umath module. As a first step, I
> tried to reproduce the divmod function of umath, but I did not manage
> to do it, you can find my fork here (the branch with the changes is
> call looking_around). During compilation I get the following error,
> gcc: numpy/core/src/multiarray/number.c 
> In file included from numpy/core/src/multiarray/number.c:17:0:
> numpy/core/src/multiarray/number.c: In function
> ‘array_sum_multiply’: 
> numpy/core/src/private/binop_override.h:176:39: error:
> ‘PyNumberMethods {aka struct }’ has no member named
> ‘nb_sum_multiply’ (void*)(Py_TYPE(m2)->tp_as_number->SLOT_NAME) !=
> (void*)(test_func)) 
>                             ^ 
> numpy/core/src/private/binop_override.h:180:13: note: in expansion of
> macro ‘BINOP_IS_FORWARD’ if (BINOP_IS_FORWARD(m1, m2, slot_expr,
> test_func) && \
>         ^ 
> numpy/core/src/multiarray/number.c:363:5: note: in expansion of macro
> ‘BINOP_GIVE_UP_IF_NEEDED’ BINOP_GIVE_UP_IF_NEEDED(m1, m2,
> nb_sum_multiply, array_sum_multiply);
> Sorry if my question seems basic, but I'm new in Numpy development.
> Any help?
> Thank you in advance,
> Marc Barbry
> 
> PS: I opened an issues as well on the github repository
> https://github.com/numpy/numpy/issues/9130 


Re: [Numpy-discussion] UC Berkeley hiring developers to work on NumPy

2017-05-22 Thread Sebastian Berg
On Mon, 2017-05-22 at 17:35 +0100, Matthew Brett wrote:
> Hi,
> 
> On Mon, May 22, 2017 at 4:52 PM, Marten van Kerkwijk
>  wrote:
> > Hi Matthew,
> > 
> > > it seems to me that we could get 80% of the way to a reassuring
> > > blueprint with a relatively small amount of effort.
> > 
> > My sentence "adapt the typical academic rule for conflicts of
> > interests to PRs, that non-trivial ones cannot be merged by someone
> > who has a conflict of interest with the author, i.e., it cannot be
> > a
> > superviser, someone from the same institute, etc." was meant as a
> > suggestion for part of this blueprint!
> > 
> > I'll readily admit, though, that since I'm not overly worried, I
> > haven't even looked at the policies that are in place, nor do I
> > intend
> > to contribute much beyond this e-mail. Indeed, it may be that the
> > old
> > adage "every initiative is punishable" holds here...
> 
> I understand what you're saying, but I think a more helpful way of
> thinking of it, is putting the groundwork in place for the most
> fruitful possible collaboration.
> 
> > would you, or one
> > of the others who feels it is important to have a blueprint, be
> > willing to provide a concrete text for discussion?
> 
> It doesn't make sense for me to do that, I'm #13 for commits in the
> last year.  I'm just one of the many people who completely depend on
> numpy.  Also, taking a little time to think these things through
> seems
> like a small investment with the potential for significant gain, in
> terms of improving communication and mitigating risk.
> 
> So, I think my suggestion is that it would be a good idea for
> Nathaniel and the current steering committee to talk through how this
> is going to play out, how the work will be selected and directed, and
> so on.
> 

Frankly, I would suggest waiting for now and asking whoever gets the
job to work out how they think it should be handled. And then we
complain if we expected more/better ;).
For now I would only say that I expect more community-type work
than we currently manage to do, and things such as meticulously
sticking to writing NEPs.
So the only thing I can see that might be good is putting "community
work" or something like it specifically into the job description,
and that's up to Nathaniel probably.

Some things, like not merging large changes by two people sitting in
the same office, should be obvious (and even if it happens, we can
revert). But there is nothing much new there, I think.

- Sebastian


> Cheers,
> 
> Matthew


Re: [Numpy-discussion] Future of ufuncs

2017-05-29 Thread Sebastian Berg
On Sun, 2017-05-28 at 14:53 -0600, Charles R Harris wrote:
> Hi All,
> This post is to open a discussion of the future of ufuncs. There are
> two contradictory ideas that have floated about regarding ufuncs
> evolution. One is to generalize ufuncs to operate on buffers,
> essentially separating them from their current entanglement with
> ndarrays. The other is to accept that they are fundamentally part of
> the ndarray universe and move them into the multiarray module, thus
> avoiding the odd overloading of functions in the multiarray module.
> The first has been a long time proposal that I once thought sounded
> good, but I've come to prefer the second. That change of mind was
> driven by the resulting code simplification and the removal of a
> dependence on a Python feature, buffers, that we cannot easily modify
> to adapt to changing needs and new dtypes. Because I'd like to move
> the ufuncs, if we decide to move them, sometime after NumPy 1.14 is
> released, now seems a good time to decide the issue.
> Thoughts?


I did not think about it much, but I agree that the dtypes are probably
the biggest issue; also, I am not sure anymore that there is much of a
gain in having ufuncs work on buffers in any case.

The dtype thing goes back a bit to ideas like datashape and
trying to make the dtypes somewhat separate from numpy? Though I doubt
I would want to make that an explicit goal.

I wonder how much of the C-loops and type resolving we could/should
expose? What I mean is that ufuncs are:

 * type resolving (somewhat ufunc specific)
 * outer loops (normal, reduce, etc.) using nditer (buffering)
 * inner 1d loops

It is a bit more complicated, but I am just wondering whether it might
make sense to try and expose the individual ufunc parts (type resolving
and the 1d loop) but not all the outer-loop nditer setup, which is
ndarray-specific in any case (honestly, I am not sure it is entirely
possible, or whether it is already exposed).

- Sebastian


> Chuck


Re: [Numpy-discussion] np.diff on boolean arrays now raises

2017-06-15 Thread Sebastian Berg
On Thu, 2017-06-15 at 22:35 +1200, Ralf Gommers wrote:
> 
> 
> On Thu, Jun 15, 2017 at 7:08 PM, Jaime Fernández del Río  gmail.com> wrote:
> > There is an ongoing discussion on github:
> > 
> > https://github.com/numpy/numpy/issues/9251
> > 
> > In 1.13 np.diff has started raising on boolean arrays, since
> > subtraction of  boolean arrays is now deprecated.
> > 
> > A decision has to be made whether:
> > raising an error is the correct thing to do, and only the docstring
> > needs updating, or
> > backwards compatibility is more important and diff should still
> > work on boolean arrays.
> > 
> 
> The issue is bigger than np.diff. For example, there's a problem with
> the scipy.ndimage morphology functions (https://github.com/scipy/scip
> y/issues/7493) that looks pretty serious. All ndimage.binary_*
> functions (7 of them) for example return boolean arrays, and chaining
> those is now broken. Unfortunately it seems that that wasn't covered
> by the ndimage test suite.
> 
> Possibly reverting the breaking change in 1.13.1 is the best way to
> fix this.
> 

Sure, I would say there is nothing wrong with reverting for now (and it
simply is the safe and easy way).
Though it would be good to address the issue of what should happen with
diff in the future (and possibly with the subtract deprecation itself).
If we stick with the deprecation but need more time, we could delay it
and make it a VisibleDeprecationWarning.

- Sebastian


> Ralf
> 
> 


Re: [Numpy-discussion] [SciPy-Dev] PyRSB: Python interface to librsb sparse matrices library

2017-06-24 Thread Sebastian Berg
On Sat, 2017-06-24 at 15:47 -0400, josef.p...@gmail.com wrote:
> 
> 
> On Sat, Jun 24, 2017 at 3:16 PM, Nathaniel Smith 
> wrote:
> > On Jun 24, 2017 7:29 AM, "Sylvain Corlay"  > > wrote:
> > 
> > Also, one quick question: is the LGPL license a deliberate choice
> > or is it not important to you? Most projects in the Python
> > scientific stack are BSD licensed. So the LGPL choice makes it
> > unlikely that a higher-level project adopts it as a dependency. If
> > you are the only copyright holder, you would still have the
> > possibility to license it under a more permissive license such as
> > BSD or MIT...
> > 
> > Why would LGPL be a problem in a dependency? That doesn't stop you
> > making your code BSD, and it's less restrictive license-wise than
> > depending on MKL or the windows C runtime...
> > 
> 
> Is scipy still including any LGPL code, I thought not.
> There might still be some optional dependencies that not many users
> are using by default. ?
> Julia packages are mostly MIT, AFAIK. (except for the GPL parts
> because of cholmod, which we (?) avoid)
> 


Well, I don't think scipy has many dependencies (but I would not be
surprised if some of those are LGPL). I am not a specialist, but as a
dependency it should be fine (that is the point of the L in LGPL, after
all; as far as I understand, it is much less viral).
If you package it with your own stuff, you of course have to point
out that parts are LGPL (just like there is a reason you get
the GPL printed out with some devices) and, if you modify it, provide
these modifications, etc.

Of course you cannot include it in the scipy codebase itself, but
there is probably no aim of doing so here, so without a specific reason
I would think that LGPL is a great license.

- Sebastian


> Josef
>  
> > -n
> > 


Re: [Numpy-discussion] [SciPy-Dev] PyRSB: Python interface to librsb sparse matrices library

2017-06-24 Thread Sebastian Berg
On Sat, 2017-06-24 at 22:58 +0200, Carl Kleffner wrote:
> Does this still apply: https://scipy.github.io/old-wiki/pages/License
> _Compatibility.html
> 

Of course, but it talks about putting it into the code base of scipy,
not about being able to use the package as a dependency
(i.e. `import package`).

- Sebastian


> Carl
> 
> 2017-06-24 22:07 GMT+02:00 Sebastian Berg  >:
> > On Sat, 2017-06-24 at 15:47 -0400, josef.p...@gmail.com wrote:
> > >
> > >
> > > On Sat, Jun 24, 2017 at 3:16 PM, Nathaniel Smith 
> > > wrote:
> > > > On Jun 24, 2017 7:29 AM, "Sylvain Corlay"  > .com
> > > > > wrote:
> > > >
> > > > Also, one quick question: is the LGPL license a deliberate
> > choice
> > > > or is it not important to you? Most projects in the Python
> > > > scientific stack are BSD licensed. So the LGPL choice makes it
> > > > unlikely that a higher-level project adopts it as a dependency.
> > If
> > > > you are the only copyright holder, you would still have the
> > > > possibility to license it under a more permissive license such
> > as
> > > > BSD or MIT...
> > > >
> > > > Why would LGPL be a problem in a dependency? That doesn't stop
> > you
> > > > making your code BSD, and it's less restrictive license-wise
> > than
> > > > depending on MKL or the windows C runtime...
> > > >
> > >
> > > Is scipy still including any LGPL code, I thought not.
> > > There might still be some optional dependencies that not many
> > users
> > > are using by default. ?
> > > Julia packages are mostly MIT, AFAIK. (except for the GPL parts
> > > because of cholmod, which we (?) avoid)
> > >
> > 
> > 
> > Well, I don't think scipy has many dependencies (but I would not be
> > surprised if those are LGPL). Not a specialist, but as a dependency
> > it
> > should be fine (that is the point of the L in LGPL after all as far
> > as
> > I understand, it is much less viral).
> > If you package it with your own stuff, you have to make sure to
> > point
> > out that parts are LGPL of course (just like there is a reason you
> > get
> > the GPL printed out with some devices) and if you modify it provide
> > these modifications, etc.
> > 
> > Of course you cannot include it into the scipy codebase itself, but
> > there is probably no aim of doing so here, so without a specific
> > reason
> >  I would think that LGPL is a great license.
> > 
> > - Sebastian
> > 
> > 
> > > Josef
> > >  
> > > > -n
> > > >


Re: [Numpy-discussion] Boolean binary '-' operator

2017-06-26 Thread Sebastian Berg
On Sun, 2017-06-25 at 18:59 +0200, Julian Taylor wrote:
> On 25.06.2017 18:45, Stefan van der Walt wrote:
> > Hi Chuck
> > 
> > On Sun, Jun 25, 2017, at 09:32, Charles R Harris wrote:
> > > The boolean binary '-' operator was deprecated back in NumPy 1.9
> > > and
> > > changed to an error in 1.13. This caused a number of failures in
> > > downstream projects. The choices now are to continue the
> > > deprecation
> > > for another couple of releases, or simply give up on the change.
> > > For
> > > booleans,  `a - b` was implemented as `a xor b`, which leads to
> > > the
> > > somewhat unexpected identity `a - b == b - a`, but it is a handy
> > > operator that allows simplification of some functions,
> > > `numpy.diff`
> > > among therm. At this point I'm inclined to give up on the
> > > deprecation
> > > and retain the old behavior. It is a bit impure but perhaps we
> > > can
> > > consider it a feature rather than a bug.
> > 
> > What was the original motivation behind the deprecation?  `xor`
> > seems
> > like exactly what one would expect when subtracting boolean arrays.
> > 
> > But, in principle, I'm not against the deprecation (we've had to
> > fix a
> > few problems that arose in skimage, but nothing big).
> > 
> > Stéfan
> > 
> > 
> 
> I am against this deprecation for apparently cosmetic reasons.
> Is there any practical drawback in that it makes subtraction
> commutative
> for booleans?
> 
> numpy should not be imposing change of style when the existing sub
> par
> historical style does not cause actual bugs.
> 
> While I don't like it I can accept a deprecation warning that will
> never
> be acted upon.

Dunno, that is also weird; then a UserWarning might even be better ;),
since it is more visible.

For the unary minus, there are good reasons. For subtract, I don't
really remember, but I don't think there was any huge argument for it.
Probably it was mostly that many feel that `False - True == -1`, as is
the case in Python, while we have `np.False_ - np.True_ == np.True_`.
Going through a deprecation would open up that possibility (though
maybe you could go there directly). Not that I am convinced of that
option.

So, I don't mind much either way, but unless there is a concrete plan
with quite a bit of support, we should maybe just go with the
conservative option.

- Sebastian
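
For code hit by the new error, the usual replacements look like this
(a sketch):

```
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# The old boolean `a - b` computed xor; its explicit spelling:
print(np.logical_xor(a, b))        # [False  True  True False]

# Where a signed difference is wanted (e.g. np.diff-style code):
print(np.diff(a.astype(np.int8)))  # [ 0 -1  0]
```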




Re: [Numpy-discussion] why a[0][0].__mul__(a[0][0]) where a is np.array, gives 'missing 1 required positional argument'?

2017-06-27 Thread Sebastian Berg
> ...
> --}--cut here--
> 
> 
> When run by make, gives this result:
> 
> 
> --{--cut here--
> make -k
> python3 shortestPathABC.py
>   d0= <0>  d1= <1>  d2=  3.0  d3=  6.0
>   type(d0)= ShortestNull
> d4=  3.0
> d5=  9.0
> d6= <0>
> d7=  3.0
> d8= <0>
> d9=  3.0
> a=
> [[ 12.0]
>   [12.0 <0>]]
> a[0]=
> [ 12.0]
> a[0][0]=
> 
> Traceback (most recent call last):
>    File "shortestPathABC.py", line 123, in 
>  a00mul=a[0][0].__mul__(a[0][0])
> TypeError: __mul__() missing 1 required positional argument: 'other'
> Makefile:7: recipe for target 'all' failed
> make: *** [all] Error 1
> --}--cut here--
> 
> I don't understand why.  Apparently, a[0][0] is not a ShortestNull 
> because otherwise, the .__mul__ would have the positional argument,
> 'other' equal to a[0][0].
> 


I don't think debugging support really suits the list, but how
about you check why, in your result:

[[ 12.0]
  [12.0 <0>]]

a[0, 0] and a[1, 1] do not show up as the same thing?


- Sebastian
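
The symptom can be reproduced with a tiny object array in which one
cell holds a class itself rather than an instance (the `Num` class
below is hypothetical, just mirroring the traceback):

```
import numpy as np

class Num:
    def __init__(self, v):
        self.v = v
    def __mul__(self, other):
        return Num(self.v * other.v)
    def __repr__(self):
        return repr(self.v)

a = np.array([[Num, Num(12.0)], [Num(12.0), Num]], dtype=object)

# a[0][0] is the class Num, not an instance, so __mul__ is an unbound
# function and the single argument fills `self`, leaving `other` empty:
try:
    a[0][0].__mul__(a[0][0])
except TypeError as err:
    print(err)  # __mul__() missing 1 required positional argument: 'other'
```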


> What am I missing?
> 
> TIA.
> 
> -regards,
> Larry


Re: [Numpy-discussion] Array blitting (pasting one array into another)

2017-06-29 Thread Sebastian Berg
On Fri, 2017-06-30 at 02:16 +0200, Mikhail V wrote:
> Hello all
> 
> I often need to copy one array into another array, given an offset.
> This is how the "blit" function can be understood, i.e. in
> every graphical lib there is such a function.
> The common definition is like:
> blit ( dest, src, offset ):
> where dest is destination array, src is source array and offset is
> coordinates in destination where the src should pe blitted.
> The main feature of such a function is that it never gives an error:
> if the source does not fit into the destination array, it is
> simply trimmed,
> and if there is no intersection area at all, nothing
> happens.
> 
> Hope this is clear.
> So to make it work with Numpy arrays one needs to calculate the
> slices before copying the data.
> I cannot find any Numpy or Python method to help with that, so
> probably
> it does not exist yet.
> If so, my proposal is to add a Numpy method which helps with that.
> Namely the proposal will be to add a method which returns
> the slices for the intersection areas of two arbitrary arrays, given
> an offset,
> so then one can "blit" the array into another with simple assignment
> =.
> 
> Here is a Python function I use for 2d arrays now:
> 
> def interslice ( dest, src, offset ):
>     y,x = offset
>     H,W = dest.shape
>     h,w = src.shape
> 
>     dest_starty = max (y,0)
>     dest_endy = min (y+h,H)
>     dest_startx = max (x,0)
>     dest_endx = min (x+w,W)
> 
>     src_starty = 0
>     src_endy = h
>     if y<0: src_starty = -y
>     by = y+h - H # Y bleed
>     if by>0: src_endy = h - by
> 
>     src_startx = 0
>     src_endx = w
>     if x<0:  src_startx = -x
>     bx = x+w - W # X bleed
>     if bx>0:  src_endx = w - bx
> 
>     dest_sliceY =  slice(dest_starty,dest_endy)
>     dest_sliceX =  slice(dest_startx,dest_endx)
>     src_sliceY = slice(src_starty, src_endy)
>     src_sliceX = slice(src_startx, src_endx)
>     if dest_endy <= dest_starty:
>         print "No Y intersection !"
>         dest_sliceY = ( slice(0, 0) )
>         src_sliceY = ( slice(0, 0) )
>     if dest_endx <= dest_startx:
>         print "No X intersection !"
>         dest_sliceX = ( slice(0, 0) )
>         src_sliceX = ( slice(0, 0) )
>     dest_slice = ( dest_sliceY, dest_sliceX )
>     src_slice = ( src_sliceY, src_sliceX )
>     return ( dest_slice, src_slice )
> 
> 
> --
> 
> I have intentionally written it out, without contractions,
> so that it is easier to understand.
> It returns the intersection area of two arrays given an offset.
> The first element of the returned tuple is the slice for the DEST
> array and the second is the slice for the SRC array.
> If there is no intersection along one of the axes at all,
> it returns the corresponding slice as slice(0, 0).
> 
> With this helper function one can blit arrays easily e.g. example
> code:
> 
> W = 8; H = 8
> DEST = numpy.ones([H,W], dtype = "uint8")
> w = 4; h = 1
> SRC = numpy.zeros([h,w], dtype = "uint8")
> SRC[:]=8
> offset = (0,9)
> ds, ss = interslice (DEST, SRC, offset )
> 
> # blit SRC into DEST
> DEST[ds] = SRC[ss]
> 
> So changing the offset one can observe how the
> SRC array is trimmed when it crosses the DEST boundaries.
> I think it is a very useful function in general, and it has
> well-defined behaviour. It is useful not only for graphics,
> but for any copying and pasting of data between arrays.
> 
> So I am looking forward to comments on this proposal.
> 

First, the slice object provides some help:
```
In [8]: s = slice(1, 40, 2)

In [9]: s.indices(20)  # length of dimension
Out[9]: (1, 20, 2)  # and the 40 becomes 20

```

Second, there is almost no overhead to creating a view, so just create
the views first (it may well be faster). Then use the result to see how
large the views actually are, and index them (a second time) instead of
creating new slice objects.

- Sebastian
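
Following that advice, the whole helper shrinks to a view-based
version (my sketch; the max() calls keep a negative slice stop from
being counted from the end of the axis):

```
import numpy as np

def blit(dest, src, offset):
    y, x = offset
    h, w = src.shape
    # Clipped destination window: slicing itself trims overshoot on
    # the high side, max() handles the negative-offset side.
    dview = dest[max(y, 0):max(y + h, 0), max(x, 0):max(x + w, 0)]
    sy, sx = max(-y, 0), max(-x, 0)
    dview[...] = src[sy:sy + dview.shape[0], sx:sx + dview.shape[1]]

DEST = np.ones((8, 8), dtype=np.uint8)
SRC = np.full((1, 4), 8, dtype=np.uint8)
blit(DEST, SRC, (0, 6))   # only 2 of the 4 columns land
blit(DEST, SRC, (3, -2))  # trimmed on the left instead
```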


> 
> Mikhail


Re: [Numpy-discussion] proposed changes to array printing in 1.14

2017-06-30 Thread Sebastian Berg
On Fri, 2017-06-30 at 17:55 +1000, Juan Nunez-Iglesias wrote:
> To reiterate my point on a previous thread, I don't think this should
> happen until NumPy 2.0. This *will* break a massive number of
> doctests, and what's worse, it will do so in a way that makes it
> difficult to support doctesting for both 1.13 and 1.14. I don't see a
> big enough benefit to these changes to justify breaking everyone's
> tests before an API-breaking version bump.
> 

Just so we are on the same page, nobody is planning a NumPy 2.0, so
insisting on not changing anything until a possible NumPy 2.0 is almost
like saying it should never happen. Of course we could amass
deprecations and at some point do many at once and call it 2.0, but I
am not sure that helps anyone, compared to saying that we do
deprecations for 1-2 years at least, and longer if someone complains.

The question is: do you really see a big advantage in fixing a
gazillion tests at once over doing a small part of the fixes one after
another? The "big step" approach did not work too well for Python 3.

- Sebastian
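
For the doctest problem specifically, the 1.14 release ended up
shipping an escape hatch; with NumPy >= 1.14 one can opt back into the
old formatting:

```
import numpy as np

print(repr(np.arange(4.)))  # array([0., 1., 2., 3.])  -- new style

# Restore the 1.13 whitespace so existing doctests keep passing:
np.set_printoptions(legacy='1.13')
print(repr(np.arange(4.)))  # array([ 0.,  1.,  2.,  3.])
```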


> On 30 Jun 2017, 6:42 AM +1000, Marten van Kerkwijk
> , wrote:
> > To add to Allan's message: point (2), the printing of 0-d arrays,
> > is
> > the one that is the most important in the sense that it rectifies a
> > really strange situation, where the printing cannot be logically
> > controlled by the same mechanism that controls >=1-d arrays (see
> > PR).
> > 
> > While point 3 can also be considered a bug fix, 1 & 4 are at some
> > level matters of taste; my own reason for supporting their
> > implementation now is that the 0-d arrays already forces me (or,
> > specifically, astropy) to rewrite quite a few doctests, and I'd
> > rather
> > have everything in one go -- in this respect, it is a pity that
> > this
> > is separate from the earlier change in printing for structured
> > arrays
> > (which was also much for the better, but broke a lot of doctests).
> > 
> > -- Marten
> > 
> > 
> > 
> > On Thu, Jun 29, 2017 at 3:38 PM, Allan Haldane  > com> wrote:
> > > Hello all,
> > > 
> > > There are various updates to array printing in preparation for
> > > numpy
> > > 1.14. See https://github.com/numpy/numpy/pull/9139/
> > > 
> > > Some are quite likely to break other projects' doc-tests which
> > > expect a
> > > particular str or repr of arrays, so I'd like to warn the list in
> > > case
> > > anyone has opinions.
> > > 
> > > The current proposed changes, from most to least painful by my
> > > reckoning, are:
> > > 
> > > 1.
> > > For float arrays, an extra space previously used for the sign
> > > position
> > > will now be omitted in many cases. Eg, `repr(arange(4.))` will
> > > now
> > > return 'array([0., 1., 2., 3.])' instead of 'array([ 0., 1., 2.,
> > > 3.])'.
> > > 
> > > 2.
> > > The printing of 0d arrays is overhauled. This is a bit finicky to
> > > describe, please see the release note in the PR. As an example of
> > > the
> > > effect of this, the `repr(np.array(0.))` now prints as
> > > 'array(0.)`
> > > instead of 'array(0.0)'. Also the repr of 0d datetime arrays is
> > > now like
> > > "array('2005-04-04', dtype='datetime64[D]')" instead of
> > > "array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".
> > > 
> > > 3.
> > > User-defined dtypes which did not properly implement their `repr`
> > > (and
> > > `str`) should do so now. Otherwise it now falls back to
> > > `object.__repr__`, which will return something ugly like
> > > ``. (Previously you could depend
> > > on
> > > only implementing the `item` method and the repr of that would be
> > > printed. But no longer, because this risks infinite recursions.).
> > > 
> > > 4.
> > > Bool arrays of size 1 with a 'True' value will now omit a space,
> > > so that
> > > `repr(array([True]))` is now 'array([True])' instead of
> > > 'array([ True])'.
> > > 
> > > Allan


Re: [Numpy-discussion] Scipy 2017 NumPy sprint

2017-07-02 Thread Sebastian Berg
On Sun, 2017-07-02 at 10:49 -0400, Allan Haldane wrote:
> On 07/02/2017 10:03 AM, Charles R Harris wrote:
> > Updated list below.
> > 
> > On Sat, Jul 1, 2017 at 7:08 PM, Benjamin Root  wrote:
> > 
> > Just a heads-up. There is now a sphinx-gallery plugin.
> > Matplotlib and a few other projects have migrated their docs
> > over to use it.
> > 
> > https://sphinx-gallery.readthedocs.io/en/latest/
> > 
> > Cheers!
> > Ben Root
> > 
> > 
> > On Sat, Jul 1, 2017 at 7:12 AM, Ralf Gommers  wrote:
> > 
> > 
> > 
> > On Fri, Jun 30, 2017 at 6:50 AM, Pauli Virtanen  wrote:
> > 
> > Charles R Harris kirjoitti 29.06.2017 klo 20:45:
> > > Here's a random idea: how about building a NumPy gallery?
> > > scikit-{image,learn} has it, and while those projects may have
> > > more visual datasets, I can imagine something along the lines
> > > of Nicolas Rougier's beautiful book:
> > >
> > > http://www.labri.fr/perso/nrougier/from-python-to-numpy/
> > >
> > > So that would be added in the numpy/numpy.org repo?
> > 
> > Or https://scipy-cookbook.readthedocs.io/ ?
> > (maybe minus bitrot and images added :)
> > 
> > 
> > I'd like the numpy.org one. numpy.org is now incredibly sparse
> > and ugly, a gallery would make it look a lot better.
> > 
> > Another idea, from the "deprecate np.matrix" discussion: add
> > numpy documentation describing the preferred way to handle
> > matrices, extolling the virtues of @, and move np.matrix
> > documentation to a deprecated section.
> > 
> > 
> >   Putting things together with a few new ideas,
> > 
> >  1. add gallery to numpy.org,
> >  2. add extended documentation of '@' operator,
> >  3. make Numpy tests Pytest compatible,
> >  4. add matrix multiplication ufunc.
> > 
> >   Any more ideas?
> 
> The new doctest runner suggested in the printing thread? This is to 
> ignore whitespace and precision in ndarray output.
> 
> I can see an argument for distributing it in numpy if it is designed
> to 
> be specially aware of ndarrays or numpy scalars (eg to test equality 
> between 'wants' and 'got')
> 

I don't really feel it is very numpy-specific or should be under the
numpy umbrella (I mean, if there is no other spot, I guess it could
live on the numpy github page). It is about as numpy-specific as the
gallery sphinx extension is matplotlib-specific.

That doesn't mean that it might not be a good sprint, though :).

The question to me is a bit what those who actually go there want from
it, or whether a few people who know numpy/scipy already plan to come.
Two years ago, we did not have much of a plan, so it was mostly giving
three people or so a bit of a tutorial on how numpy works internally,
leading to some bug fixes.

One quick idea that might be nice and dives a bit into the C-layer
(in case there is no big topic a few people want to work on):

* Find places that should have the new memory overlap
  detection and implement it there.

If someone who works on subclasses/array-likes (e.g. Stefan
Hoyer ;)) is interested, and we also do some
teleconferencing/chatting (and I have time), I might be interested
in discussing and possibly trying to develop the new indexer ideas,
which I feel are pretty far along, but where I got stuck on how to get
subclasses right.

- Sebastian


> Allan
> 
> 
> > Chuck
> > 
> > 
> > 


Re: [Numpy-discussion] reshape 2D array into 3D

2017-07-10 Thread Sebastian Berg
On Mon, 2017-07-10 at 16:16 +0300, eat wrote:
> Hi,
> 
> On Mon, Jul 10, 2017 at 3:20 PM,  wrote:
> > Dear All
> > I'm looking for a way to reshape a 2D matrix into a 3D one; in my
> > example I want to move the columns from the 4th to the 8th into the
> > 2nd plane (3rd dimension, I guess)
> > a =  np.random.rand(5,8); print(a)
> > I tried
> > a = np.reshape(a, (2,5,4)) but it is not what I'm expecting
> > 
> > Nota : it looks like the following task (while I want to split it
> > in 2 levels and not in 4), but I've not understood at all
> > https://stackoverflow.com/questions/31686989/numpy-reshape-and-part
> > ition-2d-array-to-3d
> > 
> 
> Is this what you are looking for: 
> import numpy as np
> 
> a= np.arange(40).reshape(5, 8)
> 
> a
> Out[]: 
> array([[ 0,  1,  2,  3,  4,  5,  6,  7],
>        [ 8,  9, 10, 11, 12, 13, 14, 15],
>        [16, 17, 18, 19, 20, 21, 22, 23],
>        [24, 25, 26, 27, 28, 29, 30, 31],
>        [32, 33, 34, 35, 36, 37, 38, 39]])
> 
> np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4))
> Out[]: 
> array([[[ 0,  1,  2,  3],
>         [ 8,  9, 10, 11],
>         [16, 17, 18, 19],
>         [24, 25, 26, 27],
>         [32, 33, 34, 35]],
> 
>        [[ 4,  5,  6,  7],
>         [12, 13, 14, 15],
>         [20, 21, 22, 23],
>         [28, 29, 30, 31],
>         [36, 37, 38, 39]]])
> 

While that may be what he wants, I would avoid stride tricks if you can
achieve the same thing with a reshape + transpose. That is far safer
than hardcoding the strides, much shorter if you don't hardcode them,
and usually easier to read.

One thing some people might get confused about with reshape is the
order: numpy reshape defaults to C-order, while other packages may use
Fortran order for reshaping. You can actually change the order you want
to use (though it is in general a good idea to prefer C-order in numpy,
probably).

- Sebastian
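
Concretely (a sketch; int32 is pinned so the byte strides from the
example above line up on any platform):

```
import numpy as np

a = np.arange(40, dtype=np.int32).reshape(5, 8)  # itemsize 4

# Same result as as_strided(a, (2, 5, 4), (16, 32, 4)), with no
# hardcoded byte strides: split the columns into 2 blocks of 4,
# then move the block axis to the front.
b = a.reshape(5, 2, 4).transpose(1, 0, 2)

expected = np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4))
assert (b == expected).all()
```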


> Regards,
> -eat
> > Thanks for your support
> > 
> > Paul
> > 


Re: [Numpy-discussion] pytest and degrees of separation.

2017-07-11 Thread Sebastian Berg
On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote:
> Hi All,
> 
> Just looking for opinions and feedback on the need to keep NumPy from
> having a hard nose/pytest dependency. The options as I see them are:
> 
> 1. pytest is never imported until the tests are run -- current
>    practice with nose
> 2. pytest is never imported unless the testfiles are imported -- what
>    I would like
> 3. pytest is imported whenever numpy is -- what we need to avoid.
> 
> Currently the approach has been 1), but I think 2) makes more sense
> and allows more flexibility.


I am not quite sure about everything here. My guess is we can do
whatever we want when it comes to our own tests, and I don't mind just
switching everything to pytest (I for one am happy as long as I can run
`runtests.py` ;)).
When it comes to the utilities we provide, those should keep working
without nose/pytest if they worked without them before, I think.

My guess is that all your options achieve that, so we should take the
one that gives the nicest maintainable code :). Though I can't say I
have looked into it enough to make a well-educated decision; that
probably means your option 2.

- Sebastian



> Thoughts?
> Chuck

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?

2017-07-17 Thread Sebastian Berg
On Mon, 2017-07-17 at 09:13 +, martin.gfel...@swisscom.com wrote:
> Dear all
> 
> I have object array of arrays, which I compare element-wise to None
> in various places:
> 
> >>> a = numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),None],dtype=numpy.object)
> >>> a
> 
> array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]),
> None], dtype=object)
> >>> numpy.equal(a,None)
> 
> FutureWarning: comparison to `None` will result in an elementwise
> object comparison in the future.
> 
> 
> So far, I always ignored the warning, for lack of an idea how to
> resolve it. 
> 
> Now, with Numpy 1.13, I have to resolve the issue, because it fails
> with: 
> 
> ValueError: The truth value of an array with more than one element is
> ambiguous. Use a.any() or a.all() 
> 
> It seems that numpy.equal is applied to each inner array, returning a
> Boolean array for each element, which cannot be coerced to a single
> Boolean.
> 
> The expression 
> 
> >>> numpy.vectorize(operator.is_)(a,None)
> 
> gives the desired result, but feels a bit clumsy. 
> 

Yes, I guess one's bug is someone else's feature :(. If it is very bad,
we could probably delay the deprecation. As for a solution, maybe we
could add a ufunc for elementwise `is` on object arrays (dunno about
the name, maybe `object_identity`).
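For reference, a sketch of a workaround with existing machinery
(`elementwise_is` is just an illustrative name):

import operator
import numpy as np

a = np.array([np.arange(5), None, np.nan, np.arange(6), None],
             dtype=object)
# frompyfunc wraps a Python function into an object-array ufunc
elementwise_is = np.frompyfunc(operator.is_, 2, 1)
mask = elementwise_is(a, None).astype(bool)
# mask -> array([False,  True, False, False,  True])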

Just some quick thoughts.

- Sebastian


> Is there a cleaner, efficient way to do an element-wise (but shallow)
> comparison? 
> 
> Thank you and best regards,
> Martin Gfeller, Swisscom
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to compare an array of arrays elementwise to None in

2017-07-19 Thread Sebastian Berg
On Wed, 2017-07-19 at 08:31 +, martin.gfel...@swisscom.com wrote:
> Thank you for your help!
> 
> Sebastian, I couldn't agree more with someone's bug being someone
> else's feature! A fast identity ufunc would be useful, though. 
> 

An `object_identity` ufunc should be very easy to implement; the bigger
work is actually deciding on it and on the name. We should also
probably check back with the PyPy folks to make sure it would work on
PyPy as well.

- Sebastian


> Actually, numpy.frompyfunc(operator.is_,2,1) is much faster than the
> numpy.vectorize approach - only about 35% slower on quick
> measurement 
> than the direct ==, as opposed to 62% slower with vectorize (with
> otypes hint). 
> 
> Robert, yes, that's what I already did provisionally. 
> 
> Eric, that is a nice puzzle - but I agree with Robert about
> understanding by code maintainers. 
> 
> Thanks again, and best regards,
> Martin
> 
> 
> 
> 
> On Mon, 17 Jul 2017 11:41 Sebastian Berg 
> write
> 
> > Yes, I guess ones bug is someone elses feature :(, if it is very
> > bad, we could delay the deprecation probably. For a solutions,
> > maybe 
> > we could add a ufunc  for elementwise `is` on object arrays (dunno
> > about the name, maybe `object_identity`.
> > Just some quick thoughts.
> > - Sebastian
> 
> On Mon, 17 Jul 2017 at 17:45 Robert Kern 
> wrote:
> 
> > Wrap the clumsiness up in a documented, tested utility function
> > with a descriptive name and use that function everywhere instead.
> > Robert Kern
> 
> On Mon, Jul 17, 2017 at 10:52 AM, Eric Wieser wrote:
> 
> > Here's a hack that lets you keep using ==:
> > 
> > class IsCompare:
> >     __array_priority__ = 99  # needed to work on either side of `==`
> >     def __init__(self, val): self._val = val
> >     def __eq__(self, other): return other is self._val
> >     def __ne__(self, other): return other is not self._val
> > 
> > a == IsCompare(None)  # a is None
> > a == np.array(IsCompare(None))  # broadcasted a is None
> > 
> > 
> 
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy steering councils members

2017-07-21 Thread Sebastian Berg
On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> On 21.07.2017 08:52, Ralf Gommers wrote:
> > Hi all,
> > 
> > It has been well over a year since we put together the governance
> > structure and steering council
> > (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#go
> > vernance-people).
> > We haven't reviewed the people on the steering council in that
> > time.
> > Based on the criteria for membership I would like to make the
> > following
> > suggestion (note, not discussed with everyone in private
> > beforehand):
> > 
> > Adding the following people to the steering council:
> > - Eric Wieser
> > - Marten van Kerkwijk
> > - Stephan Hoyer
> > - Allan Haldane
> > 
> 
> 
> Eric and Marten have only been members with commit rights for 6
> months. While they have been contributing and very valuable to the
> project for significantly longer, I do think this is a bit too short
> a time to be considered for the steering council.
> I certainly approve of them becoming members at some point, but I do
> want to avoid the steering council growing too large too quickly as
> long as it does not need more members to do its job.
> What I do want to avoid is that the steering council becomes like our
> committers list, a group that only grows and never shrinks as long as
> the occasional heartbeat is heard.
> 
> That said, if we think the current steering council is not able to
> fulfil its purpose, I do offer my seat for a replacement, as I have
> not really been contributing much currently.

I doubt that ;). IIRC the rules were "at least one year", so you are
probably right that we should delay the official status until then, but
I don't care much personally.

I think all of us are in the position where we don't mind giving up
this "official" position in favor of more active people (just to note,
IIRC in two years now, it was _somewhat_ used a single time, when we
donated a bit of numpy money to the mingwpy effort).

I am not sure if we had it, but we could put in (up to changes of
course) a rough number of people we aim to have on it, just so we don't
forget to discuss that there should be a bit of flux. And I am all for
some flux, because I would think it silly if those who actually make
the decisions don't end up on it while someone who only occasionally
throws in a comment stays. And yes, that person may well be me :).

- Sebastian



> cheers,
> Julian
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy steering councils members

2017-07-21 Thread Sebastian Berg
On Fri, 2017-07-21 at 12:59 -0700, Nathaniel Smith wrote:
> On Jul 21, 2017 9:36 AM, "Sebastian Berg" wrote:
> On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> > On 21.07.2017 08:52, Ralf Gommers wrote:

> Also FWIW, the jupyter steering council is currently 15 people, or 16
> including Fernando:
>   https://github.com/jupyter/governance/blob/master/people.md
> 
> By comparison, Numpy's currently has 8, so Ralf's proposal would
> bring it to 11:
>   https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#gov
> ernance-people
> 
> Looking at the NumPy council, then with the exception of Alex who I
> haven't heard from in a while, it looks like a list of people who
> regularly speak up and have sensible things to say, so I don't
> personally see any problem with keeping everyone around. It's not
> like the council is an active working group; it's mainly for
> occasional oversight and boring logistics.
> 

For what it's worth, I fully agree. Frankly, I thought the list might
be longer ;). And yes, while I can understand that there might be a
problem at some point, I am sure we are far from it for a while.

Anyway, I think all of those four people Ralf mentioned would be a
great addition (and if anyone wants to suggest someone else please
speak up).

- Sebastian


> -n

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Slice nested arrays, How to

2017-07-24 Thread Sebastian Berg
On Mon, 2017-07-24 at 16:37 +0200, Bob wrote:
> Hello,
> 
> I created the following array by converting it from a nested list:
> 
> a = np.array([np.array([17.56578416, 16.82712825, 16.57992292,
>                         15.83534836]),
>               np.array([17.9002445, 17.35024876, 16.69733472,
>                         15.78809856]),
>               np.array([17.90086839, 17.64315136, 17.40653009,
>                         17.26346787, 16.99901931, 16.87787178,
>                         16.68278558, 16.56006419, 16.43672445]),
>               np.array([17.91147242, 17.2770623, 17.0320501,
>                         16.73729491, 16.4910479])], dtype=object)
> 
> I wish to slice the first element of each sub-array so I can perform
> basic statistics (mean, sd, etc...0).
> 
> How can I do that for large data without resorting to loops? Here's
> the
> result I want with a loop:
> 


Arrays of arrays are not very nice in these regards. You could use
np.frompyfunc/np.vectorize together with `operator.getitem` to avoid
the loop, though it probably will not be much faster.
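A sketch of that approach for the array above (`a` as defined in the
quoted message):

import operator
import numpy as np

get_first = np.frompyfunc(operator.getitem, 2, 1)
s = get_first(a, 0).astype(float)
# s -> array([17.56578416, 17.9002445, 17.90086839, 17.91147242])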

- Sebastian


> s = np.zeros(4)
> for i in np.arange(4):
>     s[i] = a[i][0]
> 
> array([ 17.56578416,  17.9002445 ,  17.90086839,  17.91147242])
> 
> Thank you

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy steering councils members

2017-07-25 Thread Sebastian Berg
Hi all,

so I guess this means: unless anyone protests publicly or privately
(soon, though probably at least a week from now), we will invite four
new members to the steering council and, if they accept, they will be
added soon [1]. These are:

- Eric Wieser
- Marten van Kerkwijk
- Stephan Hoyer
- Allan Haldane

all of whom have done considerable work for NumPy for a long time. I
would like to also note again that I am happy about any additional
suggestions.

Alex Griffin will be informed that depending on his wishes, he may have
to leave soon or within about a year (IIRC that was about what the
governance docs say).

Regards,

Sebastian


[1] Two of whom may be appointed with some delay due to the one year
rule. We may have to hash out details here.


On Fri, 2017-07-21 at 22:18 +0200, Sebastian Berg wrote:
> On Fri, 2017-07-21 at 12:59 -0700, Nathaniel Smith wrote:
> > On Jul 21, 2017 9:36 AM, "Sebastian Berg" wrote:
> > 
> > On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> > > On 21.07.2017 08:52, Ralf Gommers wrote:
> 
> 
> > Also FWIW, the jupyter steering council is currently 15 people, or
> > 16
> > including Fernando:
> >   https://github.com/jupyter/governance/blob/master/people.md
> > 
> > By comparison, Numpy's currently has 8, so Ralf's proposal would
> > bring it to 11:
> >   https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#g
> > ov
> > ernance-people
> > 
> > Looking at the NumPy council, then with the exception of Alex who I
> > haven't heard from in a while, it looks like a list of people who
> > regularly speak up and have sensible things to say, so I don't
> > personally see any problem with keeping everyone around. It's not
> > like the council is an active working group; it's mainly for
> > occasional oversight and boring logistics.
> > 
> 
> For what its worth, I fully agree. Frankly, I thought the lits might
> be
> longer ;). And yes, while I can understand that there might be a
> problem at some point, I am sure we are far from it for a while.
> 
> Anyway, I think all of those four people Ralf mentioned would be a
> great addition (and if anyone wants to suggest someone else please
> speak up).
> 
> - Sebastian
> 
> 
> > -n

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.array, copy=False and memmap

2017-08-10 Thread Sebastian Berg
On Thu, 2017-08-10 at 12:27 -0400, Allan Haldane wrote:
> On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> > Dear all,
> > I have a question about the behaviour of
> > 
> > y = np.array(x, copy=False, dtype='float32')
> > 
> > when x is a memmap. If we check the memmap attribute of mmap
> > 
> > print "mmap attribute", y._mmap
> > 
> > numpy tells us that y is not a memmap.
> > But the following code snippet crashes the python interpreter
> > 
> > # opens the memmap
> > with open(filename,'r+b') as f:
> >   mm = mmap.mmap(f.fileno(),0)
> >   x = np.frombuffer(mm, dtype='float32')
> > 
> > # builds an array from the memmap, with the option copy=False
> > y = np.array(x, copy=False, dtype='float32')
> > print "before", y
> > 
> > # closes the file
> > mm.close()
> > print "after", y
> > 
> > In my code I use memmaps to share read-only objects when doing
> > parallel
> > processing
> > and the behaviour of np.array, even if not consistent, it's
> > desirable.
> > I share scipy sparse matrices over many processes and if np.array
> > would
> > make a copy
> > when dealing with memmaps this would force me to rewrite part of
> > the sparse
> > matrices
> > code.
> > Would it be possible in the future releases of numpy to have
> > np.array
> > check,
> > if copy is false, if y is a memmap and in that case return a full
> > memmap
> > object
> > instead of slicing it?
> 
> This does appear to be a bug in numpy or mmap.
> 

Frankly, on first sight I do not think it is a bug in either of them.
Numpy uses a view (memmap really is just a name for a memory-map-backed
numpy array). The numpy array will hold a reference to the memory map
object in its `.base` attribute (or the base of the base, etc.).

If you close a mmap object and then keep using it, you can get
segfaults of course; I am not sure what you can do about it. Maybe
python could try to warn you when you exit the context/close a file
pointer, but I suppose: Python does memory management for you and makes
IO management easy, but you still need to manage the IO correctly. That
this segfaults rather than just raising an error may be annoying, but
on first sight it seems to be the nature of things.
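As a minimal sketch of that reference chain (the filename "data.bin" is
just for illustration; it assumes a non-empty file whose size is a
multiple of 4 bytes):

import mmap
import numpy as np

with open("data.bin", "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)
x = np.frombuffer(mm, dtype='float32')
y = np.array(x, copy=False, dtype='float32')
# y is a plain ndarray, but it keeps the mmap alive through its .base
# chain (y.base, or y.base.base, eventually reaches mm)
mm.close()  # closing while such views exist led to the reported
            # segfault (on newer Python/NumPy this may instead raise
            # BufferError)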

- Sebastian



> Probably the solution isn't to make mmaps a special case, rather we
> should fix a bug somewhere in the use of the PEP3118 interface.
> 
> I've opened an issue on github for your issue:
> https://github.com/numpy/numpy/issues/9537
> 
> It seems to me that the "correct" behavior may be for it to me
> impossible to close the memmap while pointers to it exist; this is
> the
> behavior for `memoryview`s of mmaps. That is, your line `mm.close()`
> shoud raise an error `BufferError: cannot close exported pointers
> exist`.
> 
> 
> > Best wishes
> > Isaia
> > 
> > P.S. A longer account of the issue may be found on my university
> > blog
> > http://www.im.ufrj.br/nisoli/blog/?p=131
> > 
> > 
> > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL)

2017-08-17 Thread Sebastian Berg
On Thu, 2017-08-17 at 00:33 +0200, Paul Springer wrote:
> Am 8/16/17 um 6:08 PM schrieb Anne Archibald:

>  
> > If you wanted to integrate HPTT into numpy, I think the best
> > approach might be to wire it into the assignment machinery, so that
> > when users do things like a[::2,:] = b[:,::3].T HPTT springs into
> > action behind the scenes and makes this assignment as efficient as
> > possible (how well does it handle arrays with spaces between
> > elements?). Then ascontiguousarray and asfortranarray and the like
> > could simply use assignment to an appropriately-ordered destination
> > when they actually needed to do anything.
>  HPTT offers support for subtensors (via the outerSize parameter,
> which is similar to the leading dimension in BLAS); thus, HPTT can
> also deal with arbitrarily strided transpositions.
> However, a non-unit stride for the fastest-varying index is
> devastating for performance, since it prohibits the use of
> vectorization and the exploitation of spatial locality.
> 
> What would the integration of HPTT into NumPy look like?
> Which steps would need to be taken?
> Would it be required that HPTT be distributed in source code alongside
> NumPy (at that point I might have to change the license for HPTT), or
> would it be fine to add a git dependency? That way users who build
> NumPy from source could fetch HPTT and set a flag during the build
> process of NumPy, indicating that HPTT is available.
> What would the process look like if NumPy is distributed as a
> precompiled binary?
> 

Well, numpy is BSD, and the official binaries will be BSD; someone else
could do less free binaries of course. I doubt we can have a hard
dependency unless it is part of the numpy source (some trick like this
existed at one point for fftw, but...). I doubt including the source
itself is going to happen quickly, since we would first have to decide
to actually use a modern C++ compiler (I have no idea whether that is
problematic or not).

Having a blocked/fancier (I assume) iterator jump in, at least for
simple operations such as transpose+copy as Anne suggested, sounds very
cool though. It could be nice for simple ufuncs at least as well. I
have no idea how difficult that may be, though, or how much complexity
it would add to maintenance. My guess is it might require quite a lot
of work to integrate such optimizations into the iterator itself (even
though it would be awesome), compared to just trying to plug it into
some selected fast paths as Anne suggested.

One thing that might be very simple and also pretty nice is just trying
to keep the documentation (or a wiki page linked from the
documentation) up to date with suggestions for people interested in
speed improvements, listing things such as (not sure if we have that):
 
* Use pyfftw for speeding up ffts
* numexpr can be nice and gives a way to quickly use multiple cores
* numba can automagically compile some python functions to be fast
* Use TCL if you need faster einsum(like) operations
* ...

Just a few thoughts; I did not really think about details. But yes, it
sounds reasonable to me to re-add support for optional dependencies
such as fftw or your TCL. But packagers have to make use of that, or I
fear it is actually less available than a standalone python module.

- Sebastian



> The same questions apply with respect to TCL.
> > > TCL uses the Transpose-Transpose-GEMM-Transpose approach where
> > > all tensors are flattened into matrices (via HPTT) and then
> > > contracted via GEMM; the final result is eventually folded (via
> > > HPTT) into the desired output tensor.
> > > 
> > 
> > This is a pretty direct replacement of einsum, but I think einsum
> > may well already do pretty much this, apart from not using HPTT to
> > do the transposes. So the way to get this functionality would be to
> > make the matrix-rearrangement primitives use HPTT, as above.
>  That would certainly be one approach, however, TCL also explores
> several different strategies/candidates and picks the one that
> minimizes the data movements required by the transpositions.
> > > Would it be possible to expose HPTT and TCL as optional packages
> > > within NumPY? This way I don't have to redo the work that I've
> > > already put into those libraries.
> > > 
> > 
> > I think numpy should be regarded as a basically-complete package
> > for manipulating strided in-memory data, to which we are reluctant
> > to add new user-visible functionality. Tools that can act under the
> > hood to make existing code faster, or to reduce the work users must
> > to to make their code run fast enough, are valuable.
>  It seems to me that TCL is such a candidate, since it can replace a
> significant portion of th

[Numpy-discussion] Github overview change

2017-10-18 Thread Sebastian Berg
Hi all,

probably silly, but is anyone else annoyed at not seeing comments
anymore on the github overview/start page? I stopped getting everything
as mail and had a (bad) habit of glancing at the feed, which would spot
at least the bigger discussions going on, but now it only shows actual
commits, which honestly are less interesting to me.

Probably just me, was just wondering if anyone knew a setting or so?

- Sebastian

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Github overview change

2017-10-18 Thread Sebastian Berg
On Wed, 2017-10-18 at 13:25 -0500, Nathan Goldbaum wrote:
> This is a change in the UI that github introduced a couple weeks ago
> during their annual conference.
> 
> See https://github.com/blog/2447-a-more-connected-universe
> 

This announces the "Discover repositories" thing, but my normal news
feed changed significantly, maybe at the same time, not showing
comments at all.

Is there a simple setup where:

1. I can get a rough overview what is being discussed without
   necessarily reading everything.
2. Still get anything with @mention, etc. so that I can't really
   miss it? (right now I have those in mail -- which I like -- and
   on the website, which I don't care too much about).

Probably I can set it up to get everything as mail, and set the website
to still only give notifications for 2., which would be OK. Maybe I am
just change resistant ;).

- Sebastian


> On Wed, Oct 18, 2017 at 11:49 AM Charles R Harris wrote:
> > On Wed, Oct 18, 2017 at 7:23 AM, Sebastian Berg wrote:
> > > Hi all,
> > > 
> > > probably silly, but is anyone else annoyed at not seeing comments
> > > anymore in the github overview/start page? I stopped getting
> > > everything
> > > as mails and had a (bad) habit of glancing at them which would
> > > spot at
> > > least bigger discussions going on, but now it only shows actual
> > > commits, which honestly are less interesting to me.
> > > 
> > > Probably just me, was just wondering if anyone knew a setting or
> > > so?
> > 
> > Don't know any settings. It's almost as annoying as not forwarding
> > my own comments ...
> > 
> > Chuck 
> > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal of timeline for dropping Python 2.7 support

2017-11-08 Thread Sebastian Berg
On Wed, 2017-11-08 at 18:15 +0100, Ilhan Polat wrote:
> I was about to send the same thing. I think this matter became a
> vim/emacs issue and Py2 supporters won't take any arguments anymore.
> But if Instagram can do it, it means that the legacy-code argument is
> a matter of will, not a technicality:
> https://thenewstack.io/instagram-makes-smooth-move-python-3/
> 
> Also, people are really going out of their way, with projects such as
> Tauthon (https://github.com/naftaliharris/tauthon), to stay with
> Python 2. To be honest, I'm convinced that this is a sentimental
> debate after seeing this fork.
> 
> 

In my opinion it is fine for us to drop support for python 2 in master
relatively soon (as proposed here).
But I guess we will need an "LTS" release, which means some extra
maintenance burden until 2020.
I would hope those who really need it jump in to carry some of that
(and by 2020, my guess is that if anyone still wants to support it
longer, we won't stop you, but I doubt the current core devs, at least
not me, would be very interested in it).

So in my opinion, the current NumPy is excellent and very stable;
anyone who needs fancy new stuff likely also wants other fancy new
stuff, so will soon have to use python 3 anyway. Which means: if we
think the extra burden of an "LTS" is lower than the current hassle,
let's do it :).
Also, downstream seems only half a reason to me, since downstream
normally supports quite outdated versions anyway?

- Sebastian


> 
> 
> 
> 
> 
> On Wed, Nov 8, 2017 at 5:50 PM, Peter Cock  > wrote:
> > On Tue, Nov 7, 2017 at 11:40 PM, Nathaniel Smith 
> > wrote:
> > >
> > > 
> > >
> > > Right now, the decision in front of us is what to tell people who
> > ask about
> > > numpy's py2 support plans, so that they can make their own plans.
> > Given what
> > > we know right now, I don't think we should promise to keep
> > support past
> > > 2018. If we get there and the situation's changed, and there's
> > both desire
> > > and means to extend support we can revisit that. But's better to
> > > under-promise and possibly over-deliver, instead of promising to
> > support py2
> > > until after it becomes a millstone around our necks and then
> > realizing we
> > > haven't warned anyone and are stuck supporting it another year
> > beyond
> > > that...
> > >
> > > -n
> > 
> > NumPy (and to a lesser extent SciPy) is in a tough position being
> > at the
> > bottom of many scientific Python programming stacks. Whenever you
> > drop Python 2 support is going to upset someone.
> > 
> > Is it too ambitious to pledge to drop support for Python 2.7 no
> > later than 2020, coinciding with the Python development team's
> > timeline for dropping support for Python 2.7?
> > 
> > If that looks doable, NumPy could sign up to
> > http://www.python3statement.org/
> > 
> > Regards,
> > 
> > Peter

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Is there a way that indexing a matrix of data with a matrix of indices?

2017-11-29 Thread Sebastian Berg
On Wed, 2017-11-29 at 14:56 +, ZHUO QL (KDr2) wrote:
> Hi, all
> 
> suppose:
> 
> - D, is the data matrix, its shape is  M x N
> - I, is the indices matrix, its shape is M x K,  K<=N
> 
> Is there a efficient way to get a Matrix R with the same shape of I
> so that R[x,y] = D[x, I[x,y]] ?
> 
> A nested for-loop or list-comprehension is too slow for me.  
> 

Advanced indexing can do any odd thing you might want to do. In case
you are using the matrix class, though: I would suggest not using it,
and always using the array class instead.

This should do the trick; I will refer to the documentation for how it
works, except to note that it is basically:

R[x,y] = D[I1[x, y], I2[x, y]]

R = D[np.arange(I.shape[0])[:, np.newaxis], I]
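A small self-contained demo of that line (shapes chosen arbitrarily):

import numpy as np

M, N, K = 3, 5, 2
D = np.arange(M * N).reshape(M, N)
I = np.array([[0, 4], [1, 2], [3, 3]])    # shape (M, K), entries < N
R = D[np.arange(M)[:, np.newaxis], I]     # R[x, y] == D[x, I[x, y]]
# R -> array([[ 0,  4], [ 6,  7], [13, 13]])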

- Sebastian



> Thanks.
> 
> 
> ZHUO QL (KDr2) http://kdr2.com

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Which rule makes x[np.newaxis, :] and x[np.newaxis] equivalent?

2017-12-12 Thread Sebastian Berg
On Tue, 2017-12-12 at 14:19 +0100, Joe wrote:
> Ah, ok, now that I knew what to look for I guess I found it:
> 
> "If the number of objects in the selection tuple is less than N ,
> then : 
> is assumed for any subsequent dimensions."
> 
> https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html
> 
> This is the one, right?
> 

Yeah. Plus, if it is not a tuple, it actually behaves the same as a
tuple, e.g. `arr[obj]` is identical to `arr[obj,]` (or `arr[(obj,)]`,
which is the same). There are some weird exceptions when obj is a
sequence such as a list, but not an array.

Note also that while everything has an implicit `, ...` at the end of
indexing, if you have exactly as many integers to index as dimensions
you get a scalar, while if you add an explicit Ellipsis you get a
(0-d) array back.

Anyway, too many weird details for day-to-day stuff :). And all of that
should be covered in the docs?
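For example:

import numpy as np

x = np.arange(6).reshape(2, 3)
assert x[np.newaxis].shape == x[np.newaxis, :].shape == (1, 2, 3)
assert np.isscalar(x[0, 0])        # exactly ndim integers -> scalar
assert x[0, 0, ...].shape == ()    # explicit Ellipsis -> 0-d array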

- Sebastian

> 
> Am 12.12.2017 09:09 schrieb Nathaniel Smith:
> > On Tue, Dec 12, 2017 at 12:02 AM, Joe  wrote:
> > > Hi,
> > > 
> > > question says it all. I looked through the basic and advanced 
> > > indexing,
> > > but I could not find the rule that is applied to make
> > > x[np.newaxis,:] and x[np.newaxis] the same.
> > 
> > I think it's the general rule that all indexing expressions have an
> > invisible "..." on the right edge. For example, x[i][j][k] is an
> > inefficient and IMO somewhat confusing way to write x[i, j, k],
> > because x[i][j][k] is interpreted as:
> > 
> > -> x[i, ...][j, ...][k, ...]
> > -> x[i, :, :][j, :][k]
> > 
> > That this also applies to newaxis is a little surprising, but I
> > guess
> > consistent.
> > 
> > -n
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Does x[True] trigger basic or advanced indexing?

2017-12-14 Thread Sebastian Berg
On Thu, 2017-12-14 at 16:24 +, Eric Wieser wrote:
> It sounds like you're using an old version of numpy, where boolean
> scalars were interpreted as integers.
> What version are you using?
> Eric
> 

Indeed, you are maybe using a pre-1.9 version (post 1.9 should at least
give a DeprecationWarning or some such, though you might not notice it
IIRC).
For newer versions you should get boolean indexing; the result of it
may be a bit confusing. It is advanced indexing, basically with False
giving you an empty array (with an extra dimension of size 0) and True
being much like an `np.newaxis`.
It all makes perfect sense if you think of it as a 0-d boolean array
doing the picking.

The same thing is true for example for lists of booleans.

- Sebastian



> On Thu, Dec 14, 2017, 04:27 Joe  wrote:
> > Hello,
> > thanks for you feedback.
> > 
> > Sorry, if thie question is stupid and the case below does not make
> > sense.
> > I am just trying to understand the logic.
> > For
> > 
> > x = np.random.rand(2,3)
> > 
> > x[True]
> > x[(True,)]
> > 
> > or
> > 
> > x[False]
> > x[(False,)]
> > 
> > where True and False are not arrays,
> > it will pick the first or second row.
> > 
> > Is this basic indexing then with one the rules
> > - obj is an integer
> > - obj is a tuple of slice objects and integers.
> > ?
> > 
> > 
> > Am 13.12.2017 21:49 schrieb Eric Wieser:
> > > Increasingly, NumPy does not considers booleans to be integer
> > types,
> > > and indexing is one of these cases.
> > >
> > > So no, it will not be treated as a tuple of integers, but as a 0d
> > mask
> > >
> > > Eric
> > >
> > > On Wed, 13 Dec 2017 at 12:44 Joe  wrote:
> > >
> > >> Hi,
> > >>
> > >> yet another question.
> > >>
> > >> I looked through the indexing rules in the
> > >> documentation but I count not find which one
> > >> applies to x[True] and x[False]
> > >>
> > >> that might e.g result from
> > >>
> > >> import numpy as np
> > >> x = np.array(3)
> > >> x[x>5]
> > >> x[x<1]
> > >> x[True]
> > >> x[False]
> > >>
> > >> x = np.random.rand(2,3)
> > >> x[x>5]
> > >> x[x<1]
> > >> x[True]
> > >> x[False]
> > >>
> > >> I understood that they are equivalent to
> > >>
> > >> x[(False,)]
> > >>
> > >> I tested it and it looks like advanced indexing,
> > >> but I try to unterstand the logic behind this,
> > >> if there is any :)
> > >>
> > >> In x[x<1] the x<1 is a mask and thus I guess it is a
> > >> "tuple with at least one sequence object or ndarray (of data
> > type
> > >> integer or bool)", right?
> > >>
> > >> Or will x[True] trigger basic indexing as it is "a tuple of
> > >> integers"
> > >> because True will be converted to Int?
> > >>
> > >> Cheers,
> > >> Joe

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Applying logical operations along an axis of a boolean array?

2017-12-18 Thread Sebastian Berg
On Mon, 2017-12-18 at 12:02 +0100, hanno_li...@gmx.net wrote:
> Hi,
>  
> is it possible, to apply a logical operation, such as AND or OR along
> a particular axis of a numpy array?
> 

As mentioned, `np.any` and `np.all` work. However, what may be
more/also interesting to you is that

`np.logical_or.reduce`

works. All binary ufuncs (most elementwise functions such as addition,
subtraction, multiplication, etc.) support this `reduce` method (and
some others -- please find out yourself ;)).
So things like `any`, `sum`, or `cumsum` are actually just thin
wrappers around those.
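For example:

import numpy as np

mask = np.array([[True, False, True],
                 [True, True,  False]])
np.logical_and.reduce(mask, axis=0)   # array([ True, False, False])
np.all(mask, axis=0)                  # same result, via the thin wrapper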

- Sebastian


>  
> Let's say I have an (n,m) array and I want to AND along the first
> axis, such that I get a (1,m) (or just (m,) dimensional array back. I
> looked at the documentation for np.logical_and and friends but
> couldn't find an axis keyword on the logical_xyz operations and
> nothing else seemed to fit.
>  
> Thanks, and best regards,
> Hanno
>  
>  

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] GSoC'18 participation

2017-12-27 Thread Sebastian Berg
On Tue, 2017-12-26 at 08:19 -0700, Charles R Harris wrote:
> 
> 
> On Mon, Dec 25, 2017 at 7:12 PM, Ralf Gommers  > wrote:
> > Hi all,
> > 
> > It's the time of the year again where projects start preparing for
> > GSoC. So I wanted to bring it up here. Last year I wrote: "in
> > practice working on NumPy is just far too hard for most GSoC
> > students. Previous years we've registered and generated ideas, but
> > not gotten any students. We're also short on maintainer capacity.
> > So I propose to not participate this year."
> > 
> > I think that's still the case, so I won't be mentoring or
> > organizing. In case anyone is interested to do one of those things,
> > please speak up!
> > 
> > 
> 
> Sounds realistic. I thought some of the ideas last year were doable,
> but no bites.
> 

A bit unfortunate, but yeah, realistic. I do not have time to help out
in any case.

- Sebastian



> Chuck 
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] array - dimension size of 1-D and 2-D examples

2018-01-09 Thread Sebastian Berg
On Tue, 2018-01-09 at 12:27 +, martin.gfel...@swisscom.com wrote:
> Hi Derek
> 
> I have a related question:
> 
> Given:
> 
>   a = numpy.array([[0,1,2],[3,4]])
>   assert a.ndim == 1
>   b = numpy.array([[0,1,2],[3,4,5]])
>   assert b.ndim == 2
> 
> Is there an elegant way to force b to remain a 1-dim object array?
> 

You will have to create an empty object array and assign the lists to
it.

```
b = np.empty(len(l), dtype=object)
b[...] = l
```
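Note that if the sublists happen to be of exactly equal length, the
plain assignment may again try to broadcast; assigning element by
element avoids that (a sketch, not the only way to do it):

```
l = [[0, 1, 2], [3, 4, 5]]          # equal-length sublists
b = np.empty(len(l), dtype=object)
for i, item in enumerate(l):
    b[i] = item
assert b.ndim == 1
```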

> I have a use case where normally the sublists are of different
> lengths, but I get a completely different structure when they are
> (coincidentally in my case) of the same length.
> 
> Thanks and best regards, Martin
> 
> 
> Martin Gfeller, Swisscom / Enterprise / Banking / Products / Quantax
> 
> Message: 1
> Date: Sun, 31 Dec 2017 00:11:48 +0100
> From: Derek Homeier 
> To: Discussion of Numerical Python 
> Subject: Re: [Numpy-discussion] array - dimension size of 1-D and 2-D
>   examples
> Content-Type: text/plain; charset=utf-8
> 
> On 30 Dec 2017, at 5:38 pm, Vinodhini Balusamy 
> wrote:
> > 
> > Just one more question from the details you have provided, which
> > from my understanding strongly seems to be by design: [DEREK] You
> > cannot create a regular 2-dimensional integer array from one row of
> > length 3
> > > and a second one of length 0. Thus np.array chooses the next most
> > > basic type of array it can fit your input data in
> > > basic type of array it can fit your input data in
> 
> Indeed, the general philosophy is to preserve the structure and type
> of your input data as far as possible, i.e. a list is turned into a
> 1d-array, a list of lists (or tuples etc.) into a 2d-array, _if_ the
> sequences are of equal length (even if length 1).
> As long as there is an unambiguous way to convert the data into an
> array (see below).
> 
>    Which is the case only if a second one of length 0 is given.
>    What about case 1:
> > >>> x12 = np.array([[1,2,3]])
> > >>> x12
> > array([[1, 2, 3]])
> > >>> print(x12)
> > [[1 2 3]]
> > >>> x12.ndim
> > 2
> > 
> > This seems to take 2 dimensions.
> 
> Yes, structurally this is equivalent to your second example
> 
> > also,
> > >>> x12 = np.array([[1,2,3],[0,0,0]])
> > >>> print(x12)
> > [[1 2 3]
> >  [0 0 0]]
> > >>> x12.ndim
> > 2
> 
> > I presumed the above case and the case where length 0 is provided
> > would be treated the same (I mean, the same behaviour).
> > Correct me if I am wrong.
> > 
> 
> In this case there is no unambiguous way to construct the array - you
> would need a shape (2, 3) array to store the two lists, given the 3
> elements in the first list. Obviously x12[0] would be
> np.array([1,2,3]), but what should be the value of x12[1] if the
> second list is empty - it could be zeros, or a repeat of x12[0], or
> simply undefined. np.array([[1, 2, 3], [4]]) would be even less
> clearly defined.
> These cases where there is no obvious 'right' way to create the array
> have usually been discussed at some length, but I don't know if this
> is fully documented in some place. For the essentials, see
> 
> https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
> 
> note also the upcasting rules if you have e.g. a mix of integers and
> reals or complex numbers, and also how to control shape or data type
> explicitly with the respective keywords.
> 
>   Derek
> 
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy 1.14.0 release

2018-01-14 Thread Sebastian Berg
On Sun, 2018-01-14 at 11:35 +, Matthew Brett wrote:
> Hi,
> 
> On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser
>  wrote:
> > Did recarrays change? I didn’t see anything in the release notes.
> > 
> > Not directly, but structured arrays did, for which recarrays are
> > really just
> > a thin and somewhat buggy wrapper.
> 
> Oh dear oh dear - for some reason I had completely missed these
> changes, and the justification for them.
> 
> They do exactly the kind of thing that Konrad Hinsen was complaining
> about before, with justification, which is to change the behavior of
> previous code, without an intervening (long) period of raising an
> error.  In this case, the benefits of these changes seem small,
> compared to the inevitable breakage and silently changed results they
> will cause.
> 
> Is there any chance of reversing them?
> 

Without knowing the change: there is always a chance of (temporary)
reversal, and for unexpected complications that is probably the safest
default if there is no agreement anyway.

- Sebastian


> Cheers,
> 
> Matthew

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy 1.14.1 released

2018-02-21 Thread Sebastian Berg
Great news, as always, thanks for your relentless effort Chuck!

- Sebastian


On Tue, 2018-02-20 at 18:21 -0700, Charles R Harris wrote:
> Hi All,
> 
> On behalf of the NumPy team, I am pleased to announce NumPy
> 1.14.1. This is a bugfix release for some problems reported following
> the 1.14.0 release. The major problems fixed are the following.
> Problems with the new array printing, particularly the printing of
> complex values. Please report any additional problems that may turn
> up.
> 
> Problems with ``np.einsum`` due to the new ``optimize=True``
> default. Some fixes for optimization have been applied and
> ``optimize=False`` is now the default.
> 
> The sort order in ``np.unique`` when ``axis=`` will now
> always be lexicographic in the subarray elements. In previous NumPy
> versions there was an optimization that could result in sorting the
> subarrays as unsigned byte strings.
> 
> The change in 1.14.0 that multi-field indexing of structured arrays
> returns a view instead of a copy has been reverted but remains on
> track for NumPy 1.15. Affected users should read the 1.14.1 Numpy
> User Guide section "basics/structured arrays/accessing multiple
> fields" for advice on how to manage this transition.
> This release supports Python 2.7 and 3.4 - 3.6. Wheels for the
> release are available on PyPI. Source tarballs, zipfiles, release
> notes, and the changelog are available on github.
> 
> Contributors
> 
> A total of 14 people contributed to this release.  People with a "+"
> by their names contributed a patch for the first time.
> 
> * Allan Haldane
> * Charles Harris
> * Daniel Smith
> * Dennis Weyland +
> * Eric Larson
> * Eric Wieser
> * Jarrod Millman
> * Kenichi Maehashi +
> * Marten van Kerkwijk
> * Mathieu Lamarre
> * Sebastian Berg
> * Simon Conseil
> * Simon Gibbons
> * xoviat
> 
> Cheers,
> 
> Charles Harris

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] improving arange()? introducing fma()?

2018-02-22 Thread Sebastian Berg
On Thu, 2018-02-22 at 14:33 -0500, Benjamin Root wrote:
> Sorry, I have been distracted with xarray improvements the past
> couple of weeks.
> 
> Some thoughts on what has been discussed:
> 
> First, you are right...Decimal is not the right module for this. I
> think instead I should use the 'fractions' module for loading grid
> spec information from strings (command-line, configs, etc). The
> tricky part is getting the yaml reader to use it instead of
> converting to a float under the hood.
> 
> Second, what has been pointed out about the implementation of arange
> actually helps to explain some oddities I have encountered. In some
> situations, I have found that it was better for me to produce the
> reversed sequence, and then reverse that array back and use it.
> 
> Third, it would be nice to do what we can to improve arange()'s
> results. Would we be ok with a PR that uses fma() if it is available,
> but then falls back on a regular multiply and add if it isn't
> available, or are we going to need to implement it ourselves for
> consistency?
> 

I am not sure I like the idea:
  1. It sounds like it might break code.
  2. It sounds *not* like a fix, but rather a
     "make it slightly less bad, but it is still awful".

Using fma inside linspace might possibly make linspace a bit more
exact, and that would be a good thing; though I am not sure we have a
policy yet for something that is only used sometimes, nor am I sure it
actually helps.

It also would be nice to add stable summation to numpy in general (in
whatever way), which maybe is half related but on nobody's specific
todo list.

> 
> Lastly, there definitely needs to be a better tool for grid making.
> The problem appears easy at first, but it is fraught with many
> pitfalls and subtle issues. It is easy to say, "always use
> linspace()", but if the user doesn't have the number of pixels, they
> will need to calculate that using --- gasp! -- floating point
> numbers, which could result in the wrong answer. Or maybe their
> first/last positions were determined by some other calculation, and
> so the resulting grid does not have the expected spacing. Another
> problem that I run into is starting from two different sized grids
> and padding them both to be the same spec -- and getting that to
> match what would come about if I had generated the grid from scratch.
> 

Maybe you are right, but right now I have no clue what that tool would
do :). Whether we should add it to numpy likely depends on what exactly
it does and how complex it is.
I once wanted to add a "step" argument to linspace, but didn't in the
end, largely because it basically enforced, in a very convoluted way,
that the step fit exactly to the number of steps (up to floating point
precision), and nobody was quite sure it was a good idea, since it
would just be a little convenience for when you do not want to
calculate the steps.

Best,

Sebastian


> 
> Getting these things right is hard. I am not even certain that my
> existing code for doing this even right. But, what I do know is that
> until we build such a tool, users will continue to incorrectly use
> arange() and linspace(), and waste time trying to re-invent the wheel
> badly, assuming they even notice their mistakes in the first place!
> So, should such a tool go into numpy, given how fundamental it is to
> generate a sequence of floating point numbers, or should we try to
> put it into a package like rasterio or xarray?
> 
> Cheers!
> Ben Root
> 
> 
> 
> On Thu, Feb 22, 2018 at 2:02 PM, Chris Barker wrote:
> > @Ben: Have you found a solution to your problem? Are there thinks
> > we could do in numpy to make it better?
> > 
> > -CHB
> > 
> > 
> > On Mon, Feb 12, 2018 at 9:33 AM, Chris Barker wrote:
> > > I think it's all been said, but a few comments:
> > > 
> > > On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker wrote:
> > > > Generating equidistantly spaced grids is simply not always
> > > > possible.
> > > > 
> > > 
> > > exactly -- and linspace gives pretty much the best possible
> > > result, guaranteeing that the start and end points are exact, and
> > > the spacing is within an ULP or two (maybe we could make that
> > > within 1 ULP always, but not sure that's worth it).
> > >  
> > > > The reason is that the absolute spacing of the possible
> > > > floating point numbers depends on their magnitude [1].
> > > > 
> > > 
> > > Also that the exact spacing may not be exactly representable in
> > > FP -- so you have to have at least one space

Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-09 Thread Sebastian Berg
On Thu, 2018-03-08 at 18:56 +, Stephan Hoyer wrote:
> Hi Nathaniel,
> 
> Thanks for starting the discussion!
> 
> Like Marten says, I think it would be useful to more clearly define
> what it means to be an abstract array. ndarray has lots of
> methods/properties that expose internal implementation (e.g., view,
> strides) that presumably we don't want to require as part of this
> interfaces. On the other hand, dtype and shape are almost assuredly
> part of this interface.
> 
> To help guide the discussion, it would be good to identify concrete
> examples of types that should and should not satisfy this interface,
> e.g.,
> Marten's case 1: works exactly like ndarray, but stores data
> differently: parallel arrays (e.g., dask.array), sparse arrays (e.g.,
> https://github.com/pydata/sparse), hypothetical non-strided arrays
> (e.g., always C ordered).
> Marten's case 2: same methods as ndarray, but gives different
> results: np.ma.MaskedArray, arrays with units (quantities), maybe
> labeled arrays like xarray.DataArray
> 
> I don't think we have a hope of making a single base class for case 2
> work with everything in NumPy, but we can define interfaces with
> different levels of functionality.


True, but I guess the aim is not to care at all about how things are
implemented (so only case 2)? I agree that we can aim to be as close as
possible, but we should not expect to reach it.
My personal opinion:

1. To do this, we should start it "experimentally"

2. We need something like a reference implementation. First, because it
allows testing whether a function, e.g. in numpy, is actually
abstract-safe, and second because it will be the only way to find out
what our minimal abstract interface actually is (assuming we have
started 3).

3. Go ahead with putting it into numpy functions and see how much you
need to make them work. In the end, my guess is, everything that works
for MaskedArrays and xarray is a pretty safe bet.

I disagree with the statement that we do not need to define the minimal
reference. In practice we do as soon as we use it for numpy functions.

- Sebastian


> 
> Because there is such a gradation of "duck array" types, I agree with
> Marten that we should not deprecate NDArrayOperatorsMixin. It's
> useful for types like xarray.Dataset that define __array_ufunc__ but
> cannot satisfy the full abstract array interface.
> 
> Finally for the name, what about `asduckarray`? Thought perhaps that
> could be a source of confusion, and given the gradation of duck array
> like types.
> 
> Cheers,
> Stephan
> 
> > On Thu, Mar 8, 2018 at 7:07 AM Marten van Kerkwijk wrote:
> > Hi Nathaniel,
> > 
> > Overall, hugely in favour!  For detailed comments, it would be good
> > to
> > have a link to a PR; could you put that up?
> > 
> > A larger comment: you state that you think `np.asanyarray` is a
> > mistake since `np.matrix` and `np.ma.MaskedArray` would pass
> > through
> > and that those do not strictly mimic `NDArray`. Here, I agree with
> > `matrix` (but since we're deprecating it, let's remove that from
> > the
> > discussion), but I do not see how your proposed interface would not
> > let `MaskedArray` pass through, nor really that one would
> > necessarily
> > want that.
> > 
> > I think it may be good to distinguish two separate cases:
> > 1. Everything has exactly the same meaning as for `ndarray` but the
> > data is stored differently (i.e., only `view` does not work). One
> > can
> > thus expect that for `output = function(inputs)`, at the end all
> > `duck_output == ndarray_output`.
> > 2. Everything is implemented but operations may give different
> > output
> > (depending on masks for masked arrays, units for quantities, etc.),
> > so
> > generally `duck_output != ndarray_output`.
> > 
> > Which one of these are you aiming at? By including
> > `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not?
> > Is
> > there a case for both separately?
> > 
> > Smaller general comment: at least in the NEP I would not worry
> > about
> > deprecating `NDArrayOperatorsMixin` - this may well be handy in
> > itself
> > (for things that implement `__array_ufunc__` but do not have shape,
> > etc. (I have been doing some work on creating ufunc chains that
> > would
> > use this -- but they definitely are not array-like). Similarly, I
> > think there is room for an `NDArrayShapeMixin` which might help
> > with
> > `concatenate` and friends.
> > 
> > Finally, on the name: `asarray` and `asanyarray` are just shims
> > over
>

Re: [Numpy-discussion] 3D array slicing bug?

2018-03-22 Thread Sebastian Berg
This NEP draft has some more hints/explanations if you are interested:

https://github.com/seberg/numpy/blob/5becd12914d0402967205579d6f59a98151e0d98/doc/neps/indexing.rst#examples

Plus, it tries to avoid the word "subspace" hehehe.
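A quick sketch of the behavior being discussed:

import numpy as np

a = np.zeros((3, 4, 5))
a[0, :, [0, 1]].shape    # (2, 4): advanced-index subspace comes first
a[0:1, :, [0, 1]].shape  # (1, 4, 2): with a slice instead of the
                         # integer, the advanced part stays in place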

- Sebastian


On Thu, 2018-03-22 at 10:41 +0100, Pauli Virtanen wrote:
> On Wed, 2018-03-21 at 20:40 +, Michael Himes wrote:
> > I have discovered what I believe is a bug with array slicing
> > involving 3D (and higher) dimension arrays. When slicing a 3D array
> > by a single value for axis 0, all values for axis 1, and a list to
> > slice axis 2, the dimensionality of the resulting 2D array is
> > flipped. However, slicing more than a single index for axis 0 or
> > performing the slicing in two steps results in the correct
> > dimensionality. Below is a quick example to demonstrate this
> > behavior.
> > 
> 
> https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing
> 
> The key part seems to be: "There are two parts to the indexing
> operation, the subspace defined by the basic indexing 
> (**excluding integers**) and the subspace from the advanced indexing
> part."
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
Initializer or this sounds fine to me. As another data point which I
think has been mentioned before, `sum` uses `start` and min/max use
`default`. `start` does not work, unless we also change the code to
always use the identity if given (currently that is not the case), in
which case it might be nice. However, "start" seems a bit like solving
a different issue in any case.
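
For reference, the Python-side precedents mentioned above (a quick
sketch):

import functools

functools.reduce(lambda a, b: a + b, [], 5)  # 5; the third argument
                                             # is the "initializer"
sum([1, 2], 10)     # 13; the builtin sum calls it "start"
min([], default=0)  # 0; min/max use "default", only for empty input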

Anyway, mostly noise. I really like adding this, the only thing worth
discussing a bit is the name :).

- Sebastian


On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote:
> It calls it `initializer` - See
> https://docs.python.org/3.5/library/functools.html#functools.reduce
> 
> Sent from Astro for Mac
> 
> > On Mar 26, 2018 at 09:54, Eric Wieser wrote:
> > 
> > It turns out I misspoke - functools.reduce calls the argument
> > `initial`
> > 
> > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer 
> > wrote:
> > > This looks like a very logical addition to the reduce interface.
> > > It has my support!
> > > 
> > > I would have preferred the more descriptive name "initial_value",
> > > but consistency with functools.reduce makes a compelling case for
> > > "initializer".
> > > 
> > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser wrote:
> > > > To reiterate my comments in the issue - I'm in favor of this.
> > > > 
> > > > It seems especially valuable for identity-less functions
> > > > (`min`, `max`, `lcm`), and the argument name is consistent with
> > > > `functools.reduce`, too.
> > > > 
> > > > The only argument I can see against merging this would be
> > > > `kwarg`-creep of `reduce`, and I think this has enough use
> > > > cases to justify that.
> > > > 
> > > > I'd like to merge in a few days, if no one else has any
> > > > opinions.
> > > > 
> > > > Eric
> > > > 
> > > > On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi wrote:
> > > > > Hello, everyone. I’ve submitted a PR to add an initializer
> > > > > kwarg to ufunc.reduce. This is useful in a few cases, e.g.,
> > > > > it allows one to supply a “default” value for identity-less
> > > > > ufunc reductions, and to specify an initial value for
> > > > > reductions such as sum (other than zero).
> > > > > 
> > > > > Please feel free to review or leave feedback, (although I
> > > > > think Eric and Marten have picked it apart pretty well).
> > > > > 
> > > > > https://github.com/numpy/numpy/pull/10635
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > Hameer
> > > > > Sent from Astro for Mac
> > > > > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
OK, the new documentation is actually clear:

initializer : scalar, optional
The value with which to start the reduction.
Defaults to the `~numpy.ufunc.identity` of the ufunc.
If ``None`` is given, the first element of the reduction is used,
and an error is thrown if the reduction is empty. If ``a.dtype`` is
``object``, then the initializer is _only_ used if reduction is empty.

I would actually like to say that I do not like the object special case
much (and it is probably the reason why I was confused), nor am I quite
sure this is what helps a lot. Logically, I would argue there are two
things:

 1. initializer/start (always used)
 2. default (only used for empty reductions)

For example, I might like to give `np.nan` as the default for some
empty reductions; this will not work. I understand that this is a
minimally invasive PR and I am not sure I find the solution bad enough
to really dislike it, but what do others think? My first expectation
was the default behaviour (in all cases, not just the object case) for
some reason.
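
To make the two concrete (a sketch, written with the `initial`
spelling the kwarg later got; `default` remains hypothetical and is
not an actual kwarg):

import numpy as np

# 1. initializer/start is always folded into the reduction:
np.add.reduce([10], initial=5)   # 15, i.e. 5 + 10

# 2. a hypothetical `default` would only be used for empty reductions:
# np.minimum.reduce([], default=np.nan)    -> nan
# np.minimum.reduce([5.0], default=np.nan) -> 5.0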

To be honest, for now I just wonder a bit: How hard would it be to do
both, or is that too annoying? It would at least get rid of that
annoying thing with object ufuncs (which currently have a default, but
not really an identity/initializer).

Best,

Sebastian


On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote:
> Actually, the behavior right now isn’t that of `default` but that of
> `initializer` or `start`.
> 
> This was discussed further down in the PR but to reiterate:
> `np.sum([10], initializer=5)` becomes `15`.
> 
> Also, `np.min([5], initializer=0)` becomes `0`, so it isn’t really
> the default value, it’s the initial value among which the reduction
> is performed.
> 
> This was the reason to call it initializer in the first place. I like
> `initial` and `initial_value` as well, and `start` also makes sense
> but isn’t descriptive enough.
> 
> Hameer
> Sent from Astro for Mac
> 
> > On Mar 26, 2018 at 12:06, Sebastian Berg wrote:
> > 
> > Initializer or this sounds fine to me. As another data point which I
> > think has been mentioned before, `sum` uses start and min/max use
> > default. `start` does not work, unless we also change the code to
> > always use the identity if given (currently that is not the case),
> > in
> > which case it might be nice. However, "start" seems a bit like
> > solving
> > a different issue in any case.
> > 
> > Anyway, mostly noise. I really like adding this, the only thing
> > worth
> > discussing a bit is the name :).
> > 
> > - Sebastian
> > 
> > 
> > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote:
> > > It calls it `initializer` - See
> > > https://docs.python.org/3.5/library/functools.html#functools.reduce
> > > 
> > > Sent from Astro for Mac
> > > 
> > > > On Mar 26, 2018 at 09:54, Eric Wieser wrote:
> > > > 
> > > > It turns out I misspoke - functools.reduce calls the argument
> > > > `initial`
> > > > 
> > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer 
> > > > wrote:
> > > > > This looks like a very logical addition to the reduce
> > > > > interface.
> > > > > It has my support!
> > > > > 
> > > > > I would have preferred the more descriptive name
> > > > > "initial_value",
> > > > > but consistency with functools.reduce makes a compelling case
> > > > > for
> > > > > "initializer".
> > > > > 
> > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser wrote:
> > > > > > To reiterate my comments in the issue - I'm in favor of
> > > > > > this.
> > > > > > 
> > > > > > It seems especially valuable for identity-less functions
> > > > > > (`min`, `max`, `lcm`), and the argument name is consistent
> > > > > > with `functools.reduce`, too.
> > > > > > 
> > > > > > The only argument I can see against merging this would be
> > > > > > `kwarg`-creep of `reduce`, and I think this has enough use
> > > > > > cases to justify that.
> > > > > > 
> > > > > > I'd like to merge in a few days, if no one else has any
> > > > > > opinions.
> > > > 

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote:
> That is the idea, but NaN functions are in a separate branch for
> another PR to be discussed later. You can see it on my fork, if
> you're
> interested.

Except that as far as I understand I am not sure it will help much with
it, since it is not a default, but an initializer. Initializing to NaN
would just make all results NaN.

- Sebastian


> On 26/03/2018 at 17:35, Benjamin wrote:
> Hmm, this is neat. I imagine it would finally give some people a
> choice on what np.nansum([np.nan]) should return? It caused a huge
> hullabaloo a few years ago when we changed it from returning NaN to
> returning zero.
> 
> Ben Root

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> It'll need to be thought out for object arrays and subclasses. But
> for regular numeric stuff, Numpy uses fmin and this would have the
> desired effect.

I do not want to block this, but I would like a clearer opinion about
this issue: `np.nansum`, as Benjamin noted, would require something like

np.nansum([np.nan], default=np.nan)

because

np.sum([1], initializer=np.nan)
np.nansum([1], initializer=np.nan)

would both give NaN if the logic is the same as the current `np.sum`.
And yes, I guess for fmin/fmax NaN happens to work. And then there are
many nonsense reduces which could make sense with `initializer`.

Now nansum is not implemented in a way that could make use of the new
kwarg anyway, so maybe it does not matter in some sense. We can in
principle use `default` in nansum and at some point possibly add
`default` to the normal ufuncs. If we argue like that, the only
annoying thing is the `object` dtype which confuses the two use cases
currently.

This confusion IMO is not harmless, because I might want to use it
(e.g. sum with initializer=5), and I would expect things like dropping
in `decimal.Decimal` to work most of the time, while here it would give
silently bad results.

- Sebastian






Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote:
> On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> > It'll need to be thought out for object arrays and subclasses. But
> > for
> > Regular numeric stuff, Numpy uses fmin and this would have the
> > desired
> > effect.
> 
> I do not want to block this, but I would like a clearer opinion about
> this issue, `np.nansum` as Benjamin noted would require something
> like:
> 
> np.nansum([np.nan], default=np.nan)
> 
> because
> 
> np.sum([1], initializer=np.nan)
> np.nansum([1], initializer=np.nan)
> 
> would both give NaN if the logic is the same as the current `np.sum`.
> And yes, I guess for fmin/fmax NaN happens to work. And then there
> are
> many nonsense reduces which could make sense with `initializer`.
> 
> Now nansum is not implemented in a way that could make use of the new
> kwarg anyway, so maybe it does not matter in some sense. We can in
> principle use `default` in nansum and at some point possibly add
> `default` to the normal ufuncs. If we argue like that, the only
> annoying thing is the `object` dtype which confuses the two use cases
> currently.
> 
> This confusion IMO is not harmless, because I might want to use it
> (e.g. sum with initializer=5), and I would expect things like
> dropping
> in `decimal.Decimal` to work most of the time, while here it would
> give
> silently bad results.
> 

In other words: I am very very much in favor of getting rid of that
object dtype special case. I frankly do not see why not (except that it
needs a bit more code change).
If given explicitly, we might as well force the use and not do the
funny stuff which is designed to be more type agnostic! If it happens
to fail due to not being type agnostic, it will at least fail loudly.

If you leave that object special case I am *very* hesitant about it.

That I would like a `default` argument as well is another issue, and it
can wait for another day.

- Sebastian


> - Sebastian

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> That may be complicated. Currently, the identity isn't used in object
> dtype reductions. We may need to change that, which could cause a
> whole lot of other backwards incompatible changes. For example, sum
> actually including zero in object reductions. Or we could pass in a
> flag saying an initializer was passed in to change that behaviour. If
> this is agreed upon and someone is kind enough to point me to the
> code, I'd be willing to make this change.

I realize the implication; I am not suggesting to change the default
behaviour (when no initial=... is passed). I would think about
deprecating it, but probably only if we also have the `default`
argument, since otherwise you cannot replicate the old behaviour.

What I think I would like to see is to change how it works if (and only
if) the initializer is passed in. Yes, this will require holding on to
some extra information since you will have to know/remember whether the
"identity" was passed in or defined otherwise.

I did not check the code, but I would hope that it is not awfully
tricky to do that.

- Sebastian


PS: A side note, but I see your emails as a single block of text with
no/broken new-lines.



Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Sebastian Berg
On Mon, 2018-03-26 at 17:40 +, Eric Wieser wrote:
> The difficulty in supporting object arrays is that func.reduce(arr,
> initial=func.identity) and func.reduce(arr) have different meanings -
> whereas with the current patch, they are equivalent.
> 

True, but the current meaning is:

func.reduce(arr, initial=, default=func.identity)

in the case for object dtype. Luckily for normal dtypes, func.identity
is both the correct default "default" and a no-op for initial. Thus the
name "identity" kinda works there. I am also not really sure that both
kwargs would make real sense (plus initial probably disallows
default...), but I got some feeling that the "default" meaning may be
even more useful to simplify special casing the empty case.

Anyway, still just pointing out that it gives me some headaches to
see such a special case for objects :(.

- Sebastian
 

> 
> On Mon, 26 Mar 2018 at 10:10 Sebastian Berg wrote:
> > On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> > > That may be complicated. Currently, the identity isn't used in
> > object
> > > dtype reductions. We may need to change that, which could cause a
> > > whole lot of other backwards incompatible changes. For example,
> > sum
> > > actually including zero in object reductions. Or we could pass in
> > a
> > > flag saying an initializer was passed in to change that
> > behaviour. If
> > > this is agreed upon and someone is kind enough to point me to the
> > > code, I'd be willing to make this change.
> > 
> > I realize the implication, I am not suggesting to change the
> > default
> > behaviour (when no initial=... is passed), I would think about
> > deprecating it, but probably only if we also have the `default`
> > argument, since otherwise you cannot replicate the old behaviour.
> > 
> > What I think I would like to see is to change how it works if (and
> > only
> > if) the initializer is passed in. Yes, this will require holding on
> > to
> > some extra information since you will have to know/remember whether
> > the
> > "identity" was passed in or defined otherwise.
> > 
> > I did not check the code, but I would hope that it is not awfully
> > tricky to do that.
> > 
> > - Sebastian
> > 
> > 
> > PS: A side note, but I see your emails as a single block of text
> > with
> > no/broken new-lines.
> > 
> > 

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-04-09 Thread Sebastian Berg
On Mon, 2018-04-09 at 13:37 +0200, Hameer Abbasi wrote:
> I've renamed the kwarg to `initial`. I'm willing to make the object
> dtype changes as well, if someone pointed me to relevant bits of
> code.
> 
> Unfortunately, currently, the identity is only used for object dtypes
> if the reduction is empty. I think this is to prevent things like `0`
> being passed in the sum of objects (and similar cases), which makes
> sense.
> 
> However, with the kwarg, it makes sense to include it in the
> reduction. I think the change will be somewhere along the lines of:
> Detect if `initial` was passed, if so, include for object, otherwise
> exclude.
> 
> I personally feel `initial` renders `default` redundant. It can be
> used for both purposes. I can't think of a reasonable use case where
> you would want the default to be different from the initial value.
> However, I do agree that fixing the object case is important, we
> don't want users to get used to this behaviour and then rely on it
> later.

The reason would be the case of NaN, which is not a possible initial
value for the reduction.
I personally find the object case important; if someone seriously
argues the opposite I might possibly be swayed.
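
A sketch of the difference (with the `initial` kwarg from the PR;
`default` is still hypothetical):

import numpy as np

# An initial value participates in every reduction, so NaN poisons it:
np.minimum.reduce([5.0], initial=np.nan)  # nan, not 5.0

# Only for the empty case does it act like the wanted default:
np.minimum.reduce([], initial=np.nan)     # nan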

- Sebastian


> 
> Hameer
> 
> On Mon, Mar 26, 2018 at 8:09 PM, Sebastian Berg wrote:
> > On Mon, 2018-03-26 at 17:40 +, Eric Wieser wrote:
> > > The difficulty in supporting object arrays is that
> > func.reduce(arr,
> > > initial=func.identity) and func.reduce(arr) have different
> > meanings -
> > > whereas with the current patch, they are equivalent.
> > >
> > 
> > True, but the current meaning is:
> > 
> > func.reduce(arr, initial=, default=func.identity)
> > 
> > in the case for object dtype. Luckily for normal dtypes,
> > func.identity
> > is both the correct default "default" and a no-op for initial. Thus
> > the
> > name "identity" kinda works there. I am also not really sure that
> > both
> > kwargs would make real sense (plus initial probably disallows
> > default...), but I got some feeling that the "default" meaning may
> > be
> > even more useful to simplify special casing the empty case.
> > 
> > Anyway, still just pointing out that it gives me some headaches to
> > see such a special case for objects :(.
> > 
> > - Sebastian
> > 
> > 
> > >
> > > On Mon, 26 Mar 2018 at 10:10 Sebastian Berg wrote:
> > > > On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> > > > > That may be complicated. Currently, the identity isn't used
> > in
> > > > object
> > > > > dtype reductions. We may need to change that, which could
> > cause a
> > > > > whole lot of other backwards incompatible changes. For
> > example,
> > > > sum
> > > > > actually including zero in object reductions. Or we could
> > pass in
> > > > a
> > > > > flag saying an initializer was passed in to change that
> > > > behaviour. If
> > > > > this is agreed upon and someone is kind enough to point me to
> > the
> > > > > code, I'd be willing to make this change.
> > > >
> > > > I realize the implication, I am not suggesting to change the
> > > > default
> > > > behaviour (when no initial=... is passed), I would think about
> > > > deprecating it, but probably only if we also have the `default`
> > > > argument, since otherwise you cannot replicate the old
> > behaviour.
> > > >
> > > > What I think I would like to see is to change how it works if
> > (and
> > > > only
> > > > if) the initializer is passed in. Yes, this will require
> > holding on
> > > > to
> > > > some extra information since you will have to know/remember
> > whether
> > > > the
> > > > "identity" was passed in or defined otherwise.
> > > >
> > > > I did not check the code, but I would hope that it is not
> > awfully
> > > > tricky to do that.
> > > >
> > > > - Sebastian
> > > >
> > > >
> > > > PS: A side note, but I see your emails as a single block of
> > text
> > > > with
> > > > no/broken new-lines.
> > > >
> > > >

Re: [Numpy-discussion] Introduction: NumPy developers at BIDS

2018-04-10 Thread Sebastian Berg
On Tue, 2018-04-10 at 12:29 +0300, Matti Picus wrote:
> On 08/04/18 21:02, Eric Firing wrote:
> > On 2018/04/07 9:19 PM, Stefan van der Walt wrote:
> > > We would love community input on identifying the best areas &
> > > issues to
> > > pay attention to,
> > 
> > Stefan,
> > 
> > What is the best way to provide this, and how will the decisions be
> > made?
> > 
> > Eric
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> 
> Hi. I feel very lucky to be able to dedicate the next phase of my
> career to working on NumPy. Even though BIDS has hired me, I view
> myself as working for the community, in an open and transparent way.
> In thinking about how to help make NumPy contributors more
> productive, we laid out these tasks:
> 

Welcome also from me :), I am looking forward to seeing how things
develop!

- Sebastian


> - triage open issues and pull requests, picking up some of the long-
> standing issues and trying to resolve them
> 
> - help with code review
>   
> - review and suggest improvements to the NumPy documentation
> 
> - if needed, help with releases and infrastructure maintenance tasks
> 
> Down the road, the next level of things would be
> 
> - setting up a benchmark site like speed.python.org
> 
> - add more downstream package testing to the NumPy CI so we can
> verify that new releases work with packages such as scipy, scikit-
> learn, astropy
> 
> To document my work, I have set up a wiki
> (https://github.com/mattip/numpy/wiki) that lists some longer-term
> tasks and ideas. I look forward to meeting and working with Tyler, as
> well as to SciPy2018, where there will be both a BOF meeting to
> discuss NumPy and a two-day sprint.
> 
> BIDS is ultimately responsible to the funders to make sure my work
> achieves the goals Stefan laid out, but I am going to try to be as
> responsive as possible to any input from the wider community, either
> directly (mattip on github and #numpy on IRC), via email, or this
> mailing list.
> 
> Matti
> 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding a return value to np.random.shuffle

2018-04-12 Thread Sebastian Berg
On Thu, 2018-04-12 at 13:36 -0400, Joseph Fox-Rabinovitz wrote:
> Would it break backwards compatibility to add the input as a return
> value to np.random.shuffle? I doubt anyone out there is relying on
> the None return value.
> 

Well, python discourages this IIRC, and opts not to do these things
for in-place functions (see the random module specifically). Numpy
breaks this in a few places, but that is mostly because we have the
out argument as an optional input argument.

As is, it is a nice way of making people not write:

new = np.random.shuffle(old)

and think old won't change. So I think we should probably just stick
with the python/Guido van Rossum ideals, or did those change?
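
For completeness, the one-liner is already available without mutating
anything via np.random.permutation, which returns a shuffled copy:

import numpy as np

old = np.arange(5)
assert np.random.shuffle(old) is None  # in-place, returns None by design

new = np.random.permutation(np.arange(5))  # shuffled copy in one line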

- Sebastian



> The change is trivial, and allows shuffling a new array in one line
> instead of two:
> 
> x = np.random.shuffle(np.array(some_junk))
> 
> I've implemented the change in PR#10893.
> 
> Regards,
> 
> - Joe
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adding fweights and aweights to numpy.corrcoef

2018-04-26 Thread Sebastian Berg
I seem to recall that there was a discussion on this and it was a lot
trickier than expected.
I think statsmodels might have options in this direction.
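
For reference, this is the np.cov side that the request wants mirrored
(a sketch; the corrcoef call is the proposed, not existing, API):

import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 4.0]])
f = np.array([1, 2, 1])        # fweights: integer repeat counts
a = np.array([0.5, 1.0, 2.0])  # aweights: importance weights

np.cov(x, fweights=f, aweights=a)         # supported today
# np.corrcoef(x, fweights=f, aweights=a)  # the proposed addition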

- Sebastian


On Thu, 2018-04-26 at 15:44 +, Corin Hoad wrote:
> Hello,
> 
> Would it be possible to add the fweights and aweights keyword
> arguments from np.cov to np.corrcoef? They would retain their meaning
> from np.cov as frequency- or importance-based weightings
> respectively.
> 
> Yours,
> Corin Hoad
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?

2018-04-26 Thread Sebastian Berg
On Thu, 2018-04-26 at 09:51 -0700, Hameer Abbasi wrote:
> Hi Nathan,
> 
> np.any and np.all call np.or.reduce and np.and.reduce respectively,
> and unfortunately the underlying function (ufunc.reduce) has no way
> of detecting that the value isn’t going to change anymore. It’s also
> used for (for example) np.sum (np.add.reduce), np.prod
> (np.multiply.reduce), np.min(np.minimum.reduce),
> np.max(np.maximum.reduce).


I would like to point out that this is almost, but not quite, true.
The boolean versions will short-circuit on the innermost level, which
is good enough for all practical purposes, probably.
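
A rough way to see this on bool input (a sketch; timings are left to
the reader and will vary):

import numpy as np

data = np.zeros(int(1e6), dtype=bool)
data[0] = True
np.any(data)   # the bool inner loop can stop at the first True
data[0] = False
np.any(data)   # here the full array has to be scanned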

One way to get around it would be to use chunked iteration with
np.nditer in pure python. I admit it is a bit tricky to get started
with, but it is basically what numexpr uses also (at least in the
simplest mode), and if your arrays are relatively large, there is
likely no real performance hit compared to a non-pure-python version.

- Sebastian



> 
> You can find more information about this on the ufunc doc page. I
> don’t think it’s worth it to break this machinery for any and all, as
> it has numerous other advantages (such as being able to override in
> duck arrays, etc)
> 
> Best regards,
> Hameer Abbasi
> Sent from Astro for Mac
> 
> > On Apr 26, 2018 at 18:45, Nathan Goldbaum 
> > wrote:
> > 
> > Hi all,
> > 
> > I was surprised recently to discover that both np.any and np.all()
> > do not have a way to exit early:
> > 
> > In [1]: import numpy as np
> > 
> > In [2]: data = np.arange(1e6)
> > 
> > In [3]: print(data[:10])
> > [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
> > 
> > In [4]: %timeit np.any(data)
> > 724 us +- 42.4 us per loop (mean +- std. dev. of 7 runs, 1000 loops
> > each)
> > 
> > In [5]: data = np.zeros(int(1e6))
> > 
> > In [6]: %timeit np.any(data)
> > 732 us +- 52.9 us per loop (mean +- std. dev. of 7 runs, 1000 loops
> > each)
> > 
> > I don't see any discussions about this on the NumPy issue tracker
> > but perhaps I'm missing something.
> > 
> > I'm curious if there's a way to get a fast early-terminating search
> > in NumPy? Perhaps there's another package I can depend on that does
> > this? I guess I could also write a bit of cython code that does
> > this but so far this project is pure python and I don't want to
> > deal with the packaging headache of getting wheels built and conda-
> > forge packages set up on all platforms.
> > 
> > Thanks for your help!
> > 
> > -Nathan
> > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?

2018-04-26 Thread Sebastian Berg
On Thu, 2018-04-26 at 19:26 +0200, Sebastian Berg wrote:
> On Thu, 2018-04-26 at 09:51 -0700, Hameer Abbasi wrote:
> > Hi Nathan,
> > 
> > np.any and np.all call np.or.reduce and np.and.reduce respectively,
> > and unfortunately the underlying function (ufunc.reduce) has no way
> > of detecting that the value isn’t going to change anymore. It’s
> > also
> > used for (for example) np.sum (np.add.reduce), np.prod
> > (np.multiply.reduce), np.min(np.minimum.reduce),
> > np.max(np.maximum.reduce).
> 
> 
> I would like to point out that this is almost, but not quite, true.
> The boolean versions will short-circuit on the innermost level, which
> is good enough for all practical purposes, probably.
> 
> One way to get around it would be to use chunked iteration with
> np.nditer in pure python. I admit it is a bit tricky to get started
> with, but it is basically what numexpr uses also (at least in the
> simplest mode), and if your arrays are relatively large, there is
> likely no real performance hit compared to a non-pure-python version.
> 

I mean something like this:

def check_any(arr, func=lambda x: x, buffersize=0):
    """
    Check if the function is true for any value in arr and stop once
    the first match was found.

    Parameters
    ----------
    arr : ndarray
        Array to test.
    func : function
        Function taking a 1D array as argument and returning an array
        (on which ``np.any`` will be called).
    buffersize : int
        Size of the chunk/buffer in the iteration; zero will use the
        default numpy value.

    Notes
    -----
    The stopping does not occur immediately but in buffersize chunks.
    """
    iterflags = ['buffered', 'external_loop', 'refs_ok', 'zerosize_ok']
    for chunk in np.nditer((arr,), flags=iterflags, buffersize=buffersize):
        if np.any(func(chunk)):
            return True

    return False


Not sure how it performs actually, but you can give it a try, especially
if you know you have large arrays, or if "func" is pretty expensive.
If the input is already bool, it will be quite a bit slower, though, I
am sure.
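
A quick usage sketch of the helper above:

data = np.zeros(int(1e6))
data[10] = 1.0

check_any(data)                        # True, found in the first chunk
check_any(data, func=lambda x: x > 2)  # False, has to scan everything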

- Sebastian



> - Sebastian
> 
> 
> 
> > 
> > You can find more information about this on the ufunc doc page. I
> > don’t think it’s worth it to break this machinery for any and all,
> > as
> > it has numerous other advantages (such as being able to override in
> > duck arrays, etc)
> > 
> > Best regards,
> > Hameer Abbasi
> > Sent from Astro for Mac
> > 
> > > On Apr 26, 2018 at 18:45, Nathan Goldbaum 
> > > wrote:
> > > 
> > > Hi all,
> > > 
> > > I was surprised recently to discover that both np.any and
> > > np.all()
> > > do not have a way to exit early:
> > > 
> > > In [1]: import numpy as np
> > > 
> > > In [2]: data = np.arange(1e6)
> > > 
> > > In [3]: print(data[:10])
> > > [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
> > > 
> > > In [4]: %timeit np.any(data)
> > > 724 us +- 42.4 us per loop (mean +- std. dev. of 7 runs, 1000
> > > loops
> > > each)
> > > 
> > > In [5]: data = np.zeros(int(1e6))
> > > 
> > > In [6]: %timeit np.any(data)
> > > 732 us +- 52.9 us per loop (mean +- std. dev. of 7 runs, 1000
> > > loops
> > > each)
> > > 
> > > I don't see any discussions about this on the NumPy issue tracker
> > > but perhaps I'm missing something.
> > > 
> > > I'm curious if there's a way to get a fast early-terminating
> > > search
> > > in NumPy? Perhaps there's another package I can depend on that
> > > does
> > > this? I guess I could also write a bit of cython code that does
> > > this but so far this project is pure python and I don't want to
> > > deal with the packaging headache of getting wheels built and
> > > conda-
> > > forge packages set up on all platforms.
> > > 
> > > Thanks for your help!
> > > 
> > > -Nathan
> > > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Splitting MaskedArray into a separate package

2018-05-23 Thread Sebastian Berg
On Wed, 2018-05-23 at 17:33 -0400, Allan Haldane wrote:
> On 05/23/2018 04:02 PM, Eric Firing wrote:
> > Bad or missing values (and situations where one wants to use a mask
> > to
> > operate on a subset of an array) are found in many domains of real
> > life;
> > do you really want python users in those domains to have to fall
> > back on
> > Matlab-style reliance on nans and/or manual mask manipulations, as
> > the
> > new maskedarray package is sidelined?
> 
> I also think that missing value support is important to include
> inside
> numpy, just as it is included in other numerical packages like R and
> Julia.
> 
> The time is ripe to write a new and better MaskedArray, because
> __array_ufunc__ exists now. With some other numpy devs a few months
> ago
> we also played with rewriting MA using __array_ufunc__ and fixing up
> all
> the bugs and inconsistencies we have discovered over time (eg,
> getting
> rid of the Masked constant). Both Eric and I started working on some
> code changes, but never submitted PRs. See a little bit of discussion
> here (there was some more elsewhere I can't find now):
> 
> https://github.com/numpy/numpy/pull/9792#issuecomment-46420
> 
> As I say there, numpy's current MA support is pretty poor compared to
> R
> - Wes McKinney partly justified his desire to move pandas away from
> numpy because of it. We have a lot to gain by implementing it nicely.
> 
> We already have an NEP discussing possible ways forward:
> https://docs.scipy.org/doc/numpy-1.14.0/neps/missing-data.html
> 
> I was pretty excited by discussion above, and still am. I want to get
> back to it after I finish more immediate priorities - finishing
> printing/loading/saving fixes and structured array fixes.
> 
> But Masked-Array-2 is on my list of desired long-term enhancements
> for
> numpy.

Well, if we plan to replace it within numpy, I think we should wait
until then for any move on deprecation (after which it seems like the
obviously right choice)?

If we do not plan to replace it within numpy, we need to discuss a bit
how it might affect infrastructure (multiple implementations).

There is the other discussion about how to replace it. By opening
up/creating new masked dtypes or similar (cool but unclear how
complex/long term) or `__array_ufunc__` based (relatively simple, will
get rid of the nastier hacks that are currently needed).

Or even both, just on different time scales?

My first gut feeling about the proposal is: I love the idea of getting
rid of it... but let's not do it; it does feel like it makes too much
infrastructure unclear.

- Sebastian


> 
> Allan
> 
> 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
> 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Splitting MaskedArray into a separate package

2018-05-24 Thread Sebastian Berg
On Wed, 2018-05-23 at 23:48 +0200, Sebastian Berg wrote:
> On Wed, 2018-05-23 at 17:33 -0400, Allan Haldane wrote:



> 
> If we do not plan to replace it within numpy, we need to discuss a
> bit
> how it might affect infrastructure (multiple implementations).
> 
> There is the other discussion about how to replace it. By opening
> up/creating new masked dtypes or similar (cool but unclear how
> complex/long term) or `__array_ufunc__` based (relatively simple,
> will
> get rid of the nastier hacks that are currently needed).
> 
> Or even both, just on different time scales?
> 

I also somewhat like the idea of taking it out (once we have a first
replacement), in the case that we have a plan to do a better/lower-level
replacement at a later point within numpy.
Removal generally has its merits, but if a (mid-term) replacement will
come in any case, it would be nice to get it started first if possible.
Otherwise downstream might end up having to fix things up twice.

- Sebastian


> My first gut feeling about the proposal is: I love the idea of
> getting rid of it... but let's not do it; it does feel like it makes
> too much infrastructure unclear.
> 
> - Sebastian
> 
> 
> > 
> > Allan
> > 
> > 

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs

2018-05-31 Thread Sebastian Berg



> > 
> > I'm currently -0.5 on both fixed dimensions and this broadcasting
> > dimension idea. My reasoning is:
> > 
> > - The use cases seem fairly esoteric. For fixed dimensions, I guess
> > the motivating example is cross-product (are there any others?).
> > But
> > would it be so bad for a cross-product gufunc to raise an error if
> > it
> > receives the wrong number of dimensions? For this broadcasting
> > case...
> > well, obviously we've survived this long without all_equal :-). And
> > there's something funny about all_equal, since it's really smushing
> > together two conceptually separate gufuncs for efficiency. Should
> > we
> > also have all_less_than, sum_square, ...? If this is a big problem,
> > then wouldn't it be better to solve it in a general way, like dask
> > or
> > Numba or numexpr do? To be clear, I'm not saying these features are
> > necessarily *bad* ideas, in isolation -- just that the benefits
> > aren't
> > very convincing, and there are trade-offs, like:
> 
> I have often wished numpy had these short-circuiting gufuncs, for a
> very long time. I specifically remember my fruitless searches for how
> to do it going back to 2007.
> 
> While "on average" short-circuiting only gives a speedup of 2x, in
> many situations you can arrange your algorithm so short-circuiting
> will happen early, e.g. usually in the first 10 elements of a 10^6
> element array, giving enormous speedups.

> Also, I do not imagine these as free-floating ufuncs, I think we can 
> arrange them in a logical way in a gufunc ecosystem. There would be
> some 
> "core ufuncs", with "associated gufuncs" accessible as attributes.
> For 
> instance, any_less_than will be accessible as less.any
> 

So then, why is it a gufunc and not an attribute using a ufunc with
binary output? I have asked this before, and even got arguments as to
why it fits gufuncs better, but frankly I still do not really
understand.

If it is an associated gufunc, why gufunc at all? We need any() and
all() here, so that is not that many methods, right? And when it comes
to buffering you have much more flexibility.

Say I have the operation:

(float_arr > int_arr).all(axis=(1, 2))

With int_arr being shaped (2, 1000, 1000) (i.e. large along the
interesting axes). A normal gufunc IIRC will get the whole inner
dimension as a float buffer. In other words, you gain practically
nothing, because the whole int_arr will be cast to float anyway.

If, however, you actually implement np.greater.all(float_arr,
int_arr, axis=(1, 2)) as a separate ufunc method, you would have the
freedom to work in the typical cache-friendly buffersize chunk size for
each of the outer dimensions one at a time. A gufunc would require to
say: please do not buffer for me, or implement all possible type
combinations to do this.
(of course there are memory layout subtleties, since you would have to
optimize always for the "fast exit" case, potentially making the worst
case scenario much worse -- unless you do seriously fancy stuff
anyway).

A more general question is actually whether we should rather focus on
solving the same problem more generally.
For example if `numexpr` would implement all/any reductions, it may be
able to pretty simply get the identical tradeoffs with even more
flexibility! (I have to admit, it may get tricky with multiple
reduction dimensions, etc.)

- Sebastian 


> binary "comparison" ufuncs would have attributes
> 
> less.any
> less.all
> less.first  # returns first matching index
> less.count  # counts matches without intermediate bool array
> 
> This adds on to the existing attributes, for instance
> ufuncs already have:
> 
> add.reduce
> add.accumulate
> add.reduceat
> add.outer
> add.at
> 
> It is unfortunate that all ufuncs currently have these attributes
> even 
> if they are unimplemented/inappropriate (eg, np.sin.reduce), I would 
> like to  remove the inappropriate ones, so each core ufunc will only 
> have the appropriate attribute "associated gufuncs".
> 
> Incidentally, once we make reduce/accumuate/... into "associated 
> gufuncs", I propose completely removing the "method" argument of 
> __array_ufunc__, since it is no longer needed and adds a lot
> of complexity which implementors of an __array_ufunc__ are forced to
> account for.
> 
> Cheers,
> Allan
> 
> 
> 
> 
> 
> 
> > 
> > - When it comes to the core ufunc machinery, we have a limited
> > complexity budget. I'm nervous that if we add too many bells and
> > whistles, we'll end up writing ourselves into a corner where we
> > h

Re: [Numpy-discussion] Forcing new dimensions to appear at front in advanced indexing

2018-06-20 Thread Sebastian Berg
On Tue, 2018-06-19 at 19:37 -0400, Michael Lamparski wrote:
> Hi all,
> 
> So, in advanced indexing, numpy decides where to put new axes based
> on whether the "advanced indices" are all next to each other.
> 
> >>> np.random.random((3,4,5,6,7,8))[:, [[0,0],[0,0]], 1, :].shape
> (3, 2, 2, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[:, [[0,0],[0,0]], :, 1].shape
> (2, 2, 3, 5, 7, 8)
> 
> In creating a wrapper type around arrays, I'm finding myself needing
> to suppress this behavior, so that the new axes consistently appear
> in the front.  I thought of a dumb hat trick:
> 
> def index(x, indices):
> return x[(True, None) + indices]
> 
> Which certainly gets the new dimensions where I want them, but it
> introduces a ghost dimension of 1 (and sometimes two such
> dimensions!) in a place where I'm not sure I can easily find it.
> 
> >>> np.random.random((3,4,5,6,7,8))[True, None, 1].shape
> (1, 1, 4, 5, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[True, None, :, [[0,0],[0,0]], 1,
> :].shape
> (2, 2, 1, 3, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[True, None, :, [[0,0],[0,0]], :,
> 1].shape
> (2, 2, 1, 3, 5, 7, 8)
> 
> any better ideas?
> 

We have proposed `arr.vindex[...]` to do this, and there is a pure
python implementation of it out there, I think it may be linked here
somewhere:

There is a way that will generally work using triple indexing:

arr[..., None, None][orig_indx + (slice(None), np.array(0))][..., 0]

The first and last indexing operations are just view creations, so they
are basically no-ops. Doing this gives me the shivers, but it will
always work. If you want no-copy behaviour in case your original index
is not an advanced indexing operation, you should replace the
np.array(0) with just 0.
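
To make it concrete (note the trick assumes the index tuple is padded
out with explicit slices to the array's full dimensionality):

import numpy as np

arr = np.random.random((3, 4, 5, 6, 7, 8))

# Pad the original index with explicit slices so that the two entries
# appended below land on the two dummy axes.
orig_indx = (slice(None), np.array([[0, 0], [0, 0]]), 1,
             slice(None), slice(None), slice(None))

res = arr[..., None, None][orig_indx + (slice(None), np.array(0))][..., 0]
print(res.shape)             # (2, 2, 3, 6, 7, 8): new axes at the front
print(arr[orig_indx].shape)  # (3, 2, 2, 6, 7, 8): plain indexing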

- Sebastian





> ---
> 
> Michael



Re: [Numpy-discussion] Forcing new dimensions to appear at front in advanced indexing

2018-06-20 Thread Sebastian Berg
On Wed, 2018-06-20 at 09:15 -0400, Michael Lamparski wrote:
> > There is a way that will generally work using triple indexing:
> >
> > arr[..., None, None][orig_indx + (slice(None), np.array(0))][...,
> 0]
> 
> Impressive! (note: I fixed the * typo in the quote)
> 
> > The first and last indexing operation is just a view creation, so
> it is
> > basically a no-op. Now doing this gives me the shiver, but it will
> work
> > always. If you want to have a no-copy behaviour in case your
> original
> > index is not an advanced indexing operation, you should replace the
> > np.array(0) with just 0.
> 
> I agree about the shivers, but any workaround is good to have
> nonetheless.
> 
> If the index is not an advanced indexing operation, does it not
> suffice to simply apply the index tuple as-is?

Yes; with the `np.array(0)`, however, the result will be forced to be
a copy and not a view into the original array. When first writing the
line I was thinking "force advanced indexing", though there is likely
no reason for that.
If you replace it with 0, the result will be an identical view when the
index is not advanced (with only a tiny bit of call overhead).

So it might be nice to just use 0 instead, since if your index is
advanced indexing there is no difference between the two, and then you
do not have to check whether advanced indexing is going on at all.

Btw., if you want to use it for an object, I might suggest actually
using:

object.vindex[...]

notation for this logic (it requires a slightly annoying helper
class). The NEP is basically still in draft/proposal status, but
xarray is already using that indexing method/property IIRC, so the
name is relatively certain by now.
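
A minimal sketch of such a helper class (names hypothetical; it
ignores Ellipsis, boolean indices, etc.):

import numpy as np

class _VindexHelper:
    # The "slightly annoying helper class": captures the parent object
    # so that obj.vindex[...] can route to custom indexing logic.
    def __init__(self, obj):
        self.obj = obj

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        # pad with slices to the full dimensionality, then apply the
        # triple-indexing trick from above
        index += (slice(None),) * (self.obj.arr.ndim - len(index))
        arr = self.obj.arr[..., None, None]
        return arr[index + (slice(None), np.array(0))][..., 0]

class MyWrapper:
    def __init__(self, arr):
        self.arr = arr

    @property
    def vindex(self):
        return _VindexHelper(self)

w = MyWrapper(np.random.random((3, 4, 5, 6, 7, 8)))
print(w.vindex[:, np.array([[0, 0], [0, 0]]), 1].shape)  # (2, 2, 3, 6, 7, 8)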

I frankly am not sure right now whether the vindex proposal was with a
forced copy or not; probably it was.

- Sebastian


> 
> Michael


Re: [Numpy-discussion] Remove sctypeNA and typeNA from numpy core

2018-06-21 Thread Sebastian Berg
On Thu, 2018-06-21 at 09:25 -0700, Matti Picus wrote:
> numpy.core has many ways to catalogue dtype names: sctypeDict,
> typeDict 
> (which is precisely sctypeDict), typecodes, and typename. We also 
> generate sctypeNA and typeNA but, as issue 11241 shows, it is
> sometimes 
> wrong. They are also not documented and never used inside numpy.
> Instead 
> of fixing it, I propose to remove sctypeNA and typeNA.
> 

Sounds like a good idea; we have too much stuff in there, and this one
is not even useful (I bet the NA is for the missing value support that
never happened).

Might be good to do a quick deprecation anyway though, mostly out of
principle.

- Sebastian

> Any thoughts or objections?
> Matti


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 17:30 +1000, Andrew Nelson wrote:
> On Tue, 26 Jun 2018 at 17:12, Eric Wieser wrote:
> > > I don't think it should be relegated to the "officially
> > discouraged" ghetto of `.legacy_index`
> > 
> > The way I read it, the new spelling of that would be the explicit
> > but not discouraged `image.vindex[rr, cc]`.
> > 
> 
> If I'm understanding correctly what can be achieved now by `arr[rr,
> cc]` would have to be modified to use `arr.vindex[rr, cc]`, which is
> a very large change in behaviour. I suspect that there a lot of
> situations out there which use `arr[idxs]` where `idxs` can mean one
> of a range of things depending on the code path followed. If any of
> those change, or a mix of nomenclatures are required to access the
> different cases, then havoc will probably ensue.

Yes, that is true, but I doubt you will find a lot of code paths that
need the current indexing as opposed to vindex here, and the idea was
to have a method to get the old behaviour indefinitely. You will need
to add the `.vindex`, but that should be the only code change needed,
and it would be easy to find where with errors/warnings.
I see a possible problem with code that has to work on different numpy
versions, but that only means we need to delay deprecations.

The only thing I could imagine where this might happen is if you
forward someone else's indexing objects and different users are used to
different results.
Otherwise, there is mostly one case which would get annoying, and that
is `arr[:, rr, cc]`, since `arr.vindex[:, rr, cc]` would not be exactly
the same. Because, yes, in some cases the current logic is convenient,
just incredibly surprising as well.
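
To illustrate the `arr[:, rr, cc]` case:

import numpy as np

arr = np.arange(3 * 4 * 5).reshape(3, 4, 5)
rr, cc = np.array([0, 1]), np.array([2, 3])

# Plain indexing: the advanced indices are adjacent, so the broadcast
# dimension stays in place after the leading slice.
print(arr[:, rr, cc].shape)   # (3, 2)

# Under the proposed vindex, the broadcast dimension would always come
# first, i.e. shape (2, 3): close, but not exactly the same.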

- Sebastian

> 


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 01:21 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 12:58 AM Sebastian Berg wrote:



> > 
> > Yes, that is true, but I doubt you will find a lot of code path
> > that
> > need the current indexing as opposed to vindex here,
> 
> That's probably true! But I think it's besides the point. I'd wager
> that most code paths that will use .vindex would work perfectly well
> with current indexing, too. Most of the time, people aren't getting
> into the hairy corners of advanced indexing.
> 

Right, the proposal was to have DeprecationWarnings when they differ;
I also thought DeprecationWarnings on two advanced indexes in general
would be good, because it helps new users.
I have to agree with your argument that most of the confused users
should be running into broadcast errors (if they expect oindex rather
than fancy indexing). So I see this as a point that we likely should,
at least for now, limit ourselves to the cases with, for example,
sudden transposing going on.

However, I would like to point out that the reason for the broader
warnings is that they could allow changing normal indexing at some
point. Also, it decreases traps with array-likes that behave
differently.


> Adding to the toolbox is great, but I don't see a good reason to take
> out the ones that are commonly used quite safely.
>  
> > and the idea was
> > to have a method to get the old behaviour indefinitely. You will
> > need
> > to add the `.vindex`, but that should be the only code change
> > needed,
> > and it would be easy to find where with errors/warnings.
> 
> It's not necessarily hard; it's just churn for no benefit to the
> downstream code. They didn't get a new feature; they just have to run
> faster to stay in the same place.
> 

So, yes, it is annoying for quite a few projects that correctly use
fancy indexing, but if we choose not to annoy you a little, we will
have far fewer long-term options, which also includes such projects'
compatibility with new/current array-likes.
So basically one point is: if we annoy scikit-image now, their code
will hopefully work better for dask arrays in the future.

- Sebastian




Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 04:23 -0400, Hameer Abbasi wrote:
> > Boolean indices are not supported. All indices must be integers,
> integer arrays or slices.
> 
> I would hope that there’s at least some way to do boolean indexing. I
> often find myself needing it. I realise that
> `arr.vindex[np.nonzero(boolean_idx)]` works, but it is slightly too
> verbose for my liking. Maybe we can have `arr.bindex[boolean_index]`
> as an alias to exactly that?
> 

That part is limited to `vindex` only. A single boolean index will
always work in plain indexing, and you can mix it all up inside of
`oindex`. But with fancy indexing, mixing boolean + integer currently
seems pretty much useless (and thus the same is true for `vindex`; in
`oindex` things make sense).
Now you could invent some new logic for such a mixing case in `vindex`,
but it seems easier to just ignore it for the moment.

- Sebastian


> Or is boolean indexing preserved as-is n the newest proposal? If so,
> great!
> 
> Another thing I’d say is `arr.?index` should be replaced with
> `arr.?idx`. I personally prefer `arr.?x` for my fingers but I realise
> that for someone not super into NumPy indexing, this is kind of
> opaque to read, so I propose this less verbose but hopefully equally
> clear version, for my (and others’) brains.
> 
> Best Regards,
> Hameer Abbasi
> Sent from Astro for Mac
> 


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 02:27 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 1:36 AM Sebastian Berg wrote:
> > On Tue, 2018-06-26 at 01:21 -0700, Robert Kern wrote:
> > > On Tue, Jun 26, 2018 at 12:58 AM Sebastian Berg wrote:
> > 
> > 
> > 
> > > > 
> > > > Yes, that is true, but I doubt you will find a lot of code path
> > > > that
> > > > need the current indexing as opposed to vindex here,
> > > 
> > > That's probably true! But I think it's besides the point. I'd
> > wager
> > > that most code paths that will use .vindex would work perfectly
> > well
> > > with current indexing, too. Most of the time, people aren't
> > getting
> > > into the hairy corners of advanced indexing.
> > > 
> > 
> > Right, the proposal was to have DeprecationWarnings when they
> > differ,
> > now I also thought DeprecationWarnings on two advanced indexes in
> > general is good, because it is good for new users.
> > I have to agree with your argument that most of the confused should
> > be
> > running into broadcast errors (if they expect oindex vs. fancy). So
> > I
> > see this as a point that we likely should just limit ourselves at
> > least
> > for now to the cases for example with sudden transposing going on.
> > 
> > However, I would like to point out that the reason for the broader
> > warnings is that they could allow changing normal indexing at some
> > point.
> > 
> 
> I don't really understand this. You would discourage the "normal"
> syntax in favor of these more specific named syntaxes, so you can
> introduce different behavior for the "normal" syntax and encourage
> everyone to use it again? Just add more named syntaxes if you want
> new behavior! That's the beauty of the design underlying this NEP.
>  
> > Also it decreases traps with array-likes that behave differently.
> 
> If we were to take this seriously, then no one should use a bare []
> ever.
> 
> I'll go on record as saying that array-likes should respond to `a[rr,
> cc]`, as in Juan's example, with the current behavior. And if they
> don't, they don't deserve to be operated on by skimage functions.
> 
> If I'm reading the NEP correctly, the main thrust of the issue with
> array-likes is that it is difficult for some of them to implement the
> full spectrum of indexing possibilities. This NEP does not actually
> make it *easier* for those array-likes to implement every
> possibility. It just offers some APIs that more naturally express
> common use cases which can sometimes be implemented more naturally
> than if expressed in the current indexing. For instance, you can
> achieve the same effect as orthogonal indexing with the current
> implementation, but you have to manipulate the indices before you
> pass them over to __getitem__(), losing information along the way
> that could be used to make a more efficient lookup in some array-
> likes.
> 
> The NEP design is essentially more of a way to give these array-likes 
> standard places to raise NotImplementedError than it is to help them
> get rid of all of their NotImplementedErrors. More specifically, if
> these array-likes can't implement `a[rr, cc]`, they're not going to
> implement `a.vindex[rr, cc]`, either.
> 
> I think most of the problems that caused these libraries to make
> different choices in their __getitem__() implementation are due to
> the fact that these expressive APIs didn't exist, so they had to
> shoehorn them into __getitem__(); orthogonal indexing was too useful
> and efficient not to implement! I think that once we have .oindex and
> .vindex out there, they will be able to clean up their __getitem__()s
> to consistently support whatever of the current behavior that they
> can and raise NotImplementedError where they can't.
> 

Right, it helps mostly to be clear about what an object can and cannot
do. So h5py or whatever could error out for plain indexing and only
support `.oindex`, and we have all options cleanly available.

And yes, I agree that in itself is a big step forward.

The thing is, there are also very strong opinions that the fancy
indexing behaviour is so confusing that it would ideally not be the
default, since it breaks the analogy with slice-based indexing.

So, personally, I would argue that if we were to start over from
scratch, fancy indexing (multiple indexes) would not be the default
plain indexing behaviour.
Now, maybe the pain of a few warnings is too high, but if we wish to
move, no matter how slowly, in that regard, we will have to swallow 

Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 04:01 -0400, Hameer Abbasi wrote:
> I second this design. If we were to consider the general case of a
> tuple `idx`, then we’d not be moving forward at all. Design changes
> would be impossible. I’d argue that this newer model would be easier
> for library maintainers overall (who are the kind of people using
> this), reducing maintenance cost in the long run because it’d lead to
> simpler code.
> 
> I would also say that the “internal” classes expressing outer and
> vectorised indexing etc. should be exposed, for maintainers of duck
> arrays to use. God knows how many utility functions I’ve had to write
> to avoid relying on undocumented NumPy internals for pydata/sparse,
> fearing that I’d have to rewrite/modify them when behaviour changes
> or I find other corner cases.

Could you list some examples of what you would need? We can expose some
of the internals, or maybe even provide funcs to map e.g. oindex to
vindex or vindex to plain indexing, etc., but it would be helpful to
know what downstream actually might need. For all I know, the things
that you are thinking of may not even exist...

- Sebastian



> 
> Best Regards,
> Hameer Abbasi
> Sent from Astro for Mac
> 
> > On 26. Jun 2018 at 09:46, Robert Kern wrote:
> > 
> > On Tue, Jun 26, 2018 at 12:13 AM Eric Wieser wrote:
> > > > I don't think it should be relegated to the "officially
> > > discouraged" ghetto of `.legacy_index`
> > > 
> > > The way I read it, the new spelling of that would be the
> > > explicit but not discouraged `image.vindex[rr, cc]`.
> > > 
> > 
> > Okay, I missed that the first time through. I think having more
> > self-contained descriptions of the semantics of each of these would
> > be a good idea. The current description of `.vindex` spends more
> > time talking about what it doesn't do, compared to the other
> > methods, than what it does.
> > 
> > Some more typical, less-exotic examples would be a good idea.
> > 
> > > > I would reserve warnings for the cases where the current
> > > behavior is something no one really wants, like mixing slices and
> > > integer arrays. 
> > > 
> > > These are the cases that would only be available under
> > > `legacy_index`.
> > > 
> > 
> > I'm still leaning towards not warning on current, unproblematic
> > common uses. It's unnecessary churn for currently working,
> > understandable code. I would still reserve warnings and deprecation
> > for the cases where the current behavior gives us something that no
> > one wants. Those are the real traps that people need to be warned
> > away from.
> > 
> > If someone is mixing slices and integer indices, that's a really
> > good sign that they thought indexing behaved in a different way
> > (e.g. orthogonal indexing).
> > 
> > If someone is just using multiple index arrays that would currently
> > not give an error, that's actually a really good sign that they are
> > using it correctly and are getting the semantics that they desired.
> > If they wanted orthogonal indexing, it is *really* likely that
> > their index arrays would *not* broadcast together. And even if they
> > did, the wrong shape of the result is one of the more easily
> > noticed things. These are not silent errors that would motivate
> > adding a new warning.
> > 
> > -- 
> > Robert Kern
> > 


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Sebastian Berg
On Tue, 2018-06-26 at 22:26 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 10:21 PM Juan Nunez-Iglesias wrote:
> > Let me start by thanking Robert for articulating my viewpoints far
> > better than I could have done myself. I want to explicitly flag the
> > following statements for endorsement:
> > 
> > > I would still reserve warnings and deprecation for the cases
> > > where the current behavior gives us something that no one wants.
> > > Those are the real traps that people need to be warned away from.
> > > In the post-NEP .oindex/.vindex order, everyone can get the
> > > behavior that they want. Your argument for deprecation is now
> > > just about what the default is, the semantics that get pride of
> > > place with the shortest spelling. I am sympathetic to the feeling
> > > like you wish you had a time machine to go fix a design with your
> > > new insight. But it seems to me that just changing which
> > > semantics are the default has relatively attenuated value while
> > > breaking compatibility for a fundamental feature of numpy has
> > > significant costs. Just introducing .oindex is the bulk of the
> > > value of this NEP. Everything else is window dressing.
> > > If someone is mixing slices and integer indices, that's a really
> > > good sign that they thought indexing behaved in a different way
> > > (e.g. orthogonal indexing).
> > 
> > I would offer the exception of trailing slices to this statement,
> > though:
> > 


OK, sounds fine to me; I see that we just can't start planning for a
possible long-term future yet. I personally do not really care what the
warnings themselves say for now (Deprecation or not); larger packages
will have to avoid them in any case though.
But I guess we have a consensus on a certain amount of warnings (we
will probably have to see how often they actually appear) and can then
revisit in a longer while.

- Sebastian


> > In [1]: from skimage import data
> > In [2]: astro = data.astronaut()
> > In [3]: astro.shape
> > Out[3]: (512, 512, 3)
> > 
> > In [4]: rr, cc = np.array([1, 3, 3, 3]), np.array([1, 8, 9, 10])
> > In [5]: astro[rr, cc].shape
> > Out[5]: (4, 3)
> > 
> > In [6]: astro[rr, cc, :].shape
> > Out[6]: (4, 3)
> > 
> > This does exactly what I would expect.
> > 
> 
> Yup, sorry, I didn't mean those. I meant when there is an explicit
> slice in between index arrays. (And maybe when index arrays follow
> slices; I'll need to think more on that.)
>  
> > Going back to the motivation for the NEP, I think this bit,
> > emphasis mine, is crucial:
> > 
> > > > the existing rules for advanced indexing with multiple array
> > > > indices are typically confusing to both new, **and in many
> > > > cases even old,** users of NumPy
> > 
> > I think it is ok for advanced indexing to be accessible to advanced
> > users. I remember that it took me quite a while to grok NumPy
> > advanced indexing, but once I did I just loved it.
> > 
> > I also like that this syntax translates perfectly from integer
> > indices to float coordinates in `ndimage.map_coordinates`. 
> > 
> > > I'll go on record as saying that array-likes should respond to
> > > `a[rr, cc]`, as in Juan's example, with the current behavior. And
> > > if they don't, they don't deserve to be operated on by skimage
> > > functions.
> > 
> > (I don't think of us highly enough to use the word "deserve", but I
> > would say that we would hesitate to support arrays that don't use
> > this convention.)
> > 
> 
> Ahem, yes, I was being provocative in a moment of weakness. May the
> array-like authors forgive me.
> 


Re: [Numpy-discussion] update to numpy-1.15.0 gives new warnings from scipy

2018-07-25 Thread Sebastian Berg
On Wed, 2018-07-25 at 07:44 -0400, Neal Becker wrote:
> After update to numpy-1.15.0, I'm getting warnings from scipy.
> These probably come from my code using convolve.  Does scipy need
> updating?
> 

Probably yes; I am a bit surprised we did not notice it before if it is
in scipy (or maybe scipy is already fixed?). This may be one of the
more controversial new warnings, so let's see if it comes up more.
Right now it seems not to affect much, I guess.
If the correct thing to do is to use the list as an array, then the
easiest solution may be to do:

z[index,] = x  # note the additional `,`
# or alternatively of course: z[np.asarray(index)] = x

Otherwise, you will have to use `tuple(index)` to make sure numpy
interprets it as a multi-dimensional index.

The problem this solves is that with `z[some_list]`, numpy currently
basically guesses whether you want a multi-dimensional index or not.
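
To make the two readings concrete:

import numpy as np

z = np.zeros((3, 4))
index = [0, 2]

# z[index] used to be interpreted like z[0, 2] (a multi-dimensional
# index); in the future the list will be interpreted as an array index,
# i.e. z[np.array([0, 2])], selecting rows 0 and 2.
print(z[tuple(index)])           # scalar z[0, 2]: the old behaviour, explicit
print(z[np.array(index)].shape)  # (2, 4): the future behaviour, explicit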

- Sebastian


> /home/nbecker/.local/lib/python3.6/site-
> packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple 
> sequence for multidimensional indexing is deprecated; use
> `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be
> interpreted as an array index, `arr[np.array(seq)]`, which will
> result either in an error or a different result.
>   z[index] = x
> /home/nbecker/.local/lib/python3.6/site-
> packages/scipy/signal/signaltools.py:491: FutureWarning: Using a non-
> tuple sequence for multidimensional indexing is deprecated; use
> `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be
> interpreted as an array index, `arr[np.array(seq)]`, which will
> result either in an error or a different result.
>   return x[reverse].conj()
> /home/nbecker/.local/lib/python3.6/site-
> packages/scipy/signal/signaltools.py:251: FutureWarning: Using a non-
> tuple sequence for multidimensional indexing is deprecated; use
> `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be
> interpreted as an array index, `arr[np.array(seq)]`, which will
> result either in an error or a different result.
>   in1zpadded[sc] = in1.copy()


Re: [Numpy-discussion] Roadmap proposal, v3

2018-08-02 Thread Sebastian Berg
On Thu, 2018-08-02 at 05:47 -0700, Ralf Gommers wrote:
> 
> 
> > On Tue, Jul 24, 2018 at 12:04 PM, Stefan van der Walt wrote:
> > Hi everyone,
> > 
> > Please take a look at the latest roadmap proposal:
> > 
> > https://github.com/numpy/numpy/pull/11611
> > 
> > This is a living document, so can easily be modified in the future,
> > but
> > we'd like to get in place a document that corresponds fairly
> > closely
> > with current community priorities.
> 
> The green button was pressed, the roadmap is now live on
> http://www.numpy.org/neps/. Thanks all!
> 

Great, I hope we can check off some of them soon! :)

- Sebastian


> Cheers,
> Ralf
> 


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-02 Thread Sebastian Berg
On Thu, 2018-08-02 at 12:04 +0200, Sylvain Corlay wrote:
> The "political belief" was recently removed from the Jupyter CoC.
> 
> One reason for this decision is that Racism and Sexism are
> increasingly considered as mainstream "political beliefs", and we
> wanted to make it clear that people can still be sanctioned for e.g.
> sexist or racist behavior when engaging with the project (at events,
> on the mailing list or GitHub...) even if their racism and sexism
> corresponds to a "political belief".
> 
> It is still not OK for people to be excluded or discriminated against
> because of their political affiliation. The CoC statement reads "This
> includes, but is not limited to...". Also we don't wish to prioritize
> or elevate any members of a particular political belief to the same
> level as any members of the examples remaining in the
> document. Ultimately, the CoC committee uses their own judgement to
> assess reports and the appropriate response.
> 

TL;DR: I don't think it matters; as for the CoC as such, it seems fine
to me, let's just put it in and be done with it.
I do not think we should have a long discussion (about that list), and
in case it might go there, I would suggest we try to find a way to
refuse to have it. Maybe by letting the committee that is in the CoC
decide.

Actually: I am good with the people currently listed for SciPy if they
will do it; or does anyone else want to jump in?



I won't really follow the discussion much more (except for reading),
and I do not feel like I really know enough about CoCs, but my point is
that I do not care much. The CoC as suggested seems pretty
uncontroversial to me (it does not draw any hard lines worth fighting
over). And that is probably the only current belief I have: that it
should not really draw those lines.

Political opinion being included or not? I am not sure I care, because
as I read it, and as you point out, it does not really matter whether
or not it is included; including it would just raise awareness for a
specific issue.

This is not about the freedom to express political beliefs (on numpy
channels). I suppose there may be a point where even a boycott can be
discriminatory and things may be tricky to assess [1], but again those
cases need careful weighing (by the committee mostly). A CoC might bias
this a little, but not much, and if we decide which way to bias it we
might end up fighting, so let's refuse to do it outside specific cases?

Freedom of expression is always limited by the protection of other
individuals' rights (note that I believe in the US this freedom tends
to be held very high when weighing the two). But since there is
normally no reason for voicing political opinions on numpy, it seems
obvious to me that it will tend to lose when weighed against the other
person's rights being protected [2].
Weighing different "rights" is always tricky, but cannot be avoided or
really formalized too much IMO [3,4].

Which brings me to the point that I think the list is one to raise
awareness for, and to be welcoming to, specific people (either very
general or minority) who have in the past (or currently) not felt
welcome. And such a list will always be set in the current
time/mentality.
We are maybe in an odd spot where political discussion/judicial
progress feels like it is lagging behind social development (and some
fronts are hardening :(), which makes things a bit trickier.

Overall, all it would do is maybe suggest that "political opinion" is
currently not something that needs specially raised awareness. It does
not mean this defines a "bias", nor that the list cannot change at some
point.

Either way, I do not read the list as giving any additional protection
for *voicing* your opinion. In fact, I would argue the opposite may be
the case. If you voice it, you make the opposite (political) opinion
feel less welcome, and since there is no reason for voicing a political
opinion *on a numpy channel*, when weighing those against each other it
seems like a hard case [5].

At some point lines may have to be drawn (and drawing them once does
not set them in stone for the next time!). I do not think we draw or
should draw them (much) with this statement itself; the statement says
that they will be drawn if and when necessary, and that it will then be
done carefully. Plus, it generally raises awareness and gives a bit of
guidance.
It seems to me that this may be the actual issue behind many of those
other discussions: not so much the wording, but how exactly lines were
drawn in practice.
Sure, we probably set a bit of bias with the list, but I doubt it is
enough to fight over. And hopefully we can avoid a huge discussion :)
(for now it looks like it).

Best,

Sebastian


PS: I do not mind synchronizing numpy and scipy (or numpy and Jupyter
or all three) as much as possible. I guess you could sum it up to,
maybe I am even 

Re: [Numpy-discussion] Taking back control of the #numpy irc channel

2018-08-07 Thread Sebastian Berg
On Mon, 2018-08-06 at 21:52 -0700, Ralf Gommers wrote:
> 
> 
> On Mon, Aug 6, 2018 at 7:15 PM, Nathan Goldbaum wrote:
> > Hi,
> > 
> > I idle in #scipy and have op in there. I’m happy start idling in
> > #numpy and be op if the community is willing to let me.
> > 
> 
> Thanks Nathan. Sounds useful.
> 

Sounds good. I haven't really hung out there for a long time (frankly,
I never hung out in #numpy, I thought people just use #scipy).

Can we just give a few names (such as Matti, Nathan, maybe me, anyone
else right now?) and add others later ourselves?
I can get in contact with freenode (unless someone already did).

> There's also a Gitter numpy channel. AFAIK few/none core devs are
> regularly active on either IRC or Gitter. I would suggest that we
> document both these channels as community-run at
> https://scipy.org/scipylib/mailing-lists.html, and give Nathan and
> others who are interested the permissions they need.
> 

Yeah, the gitter seems pretty inactive as well. But I guess it doesn't
hurt to mention them.

- Sebastian


> I think our official recommendation for usage questions is
> StackOverflow.
> 
> Cheers,
> Ralf
> 
> 
> > I’m also in the process of getting ops for #matplotlib for similar
> > spam-related reasons. I’d say all the scientific python IRC
> > channels I’m in get a decent amount of traffic (perhaps 10% of the
> > number of questions that get asked on StackOverflow) and it’s a
> > good venue for asking quick questions. Let’s hope that forcing
> > people to register doesn’t kill that, although there’s not much we
> > can do given the spam attack.
> > 
> > Nathan
> > 
> > On Mon, Aug 6, 2018 at 9:03 PM Matti Picus 
> > wrote:
> > > Over the past few days spambots have been hitting freenode's IRC 
> > > channels[0, 1]. It turns out the #numpy channel has no operator,
> > > so we 
> > cannot make the channel mode "+q $~a"[2] - i.e. only registered 
> > > freenode users can talk but anyone can listen.
> > > 
> > > I was in touch with the freenode staff, they requested that
> > > someone from 
> > the steering council reach out to them at proje...@freenode.net,
> > here 
> > > is the quote from the discussion:
> > > 
> > > "
> > > it's pretty much a matter of them sending an email telling us who
> > > they'd 
> > > like to represent them on freenode, which channels and cloak
> > > namespaces 
> > > they want, and any info we might need on the project
> > > "
> > > 
> > > In the mean time they set the channel mode appropriately, so this
> > > is 
> > > also a notice that if you want to chat on the #numpy IRC channel
> > > you 
> > > need to register.
> > > 
> > > Hope someone from the council picks this up and reaches out to
> > > them, and 
> > will decide who is able to become channel operators (the
> > > recommended 
> > > practice is to use it like sudo, only assume the role when needed
> > > then 
> > > turn it back off).
> > > 
> > > Matti
> > > 
> > > [0] https://freenode.net/news/spambot-attack
> > > [1] https://freenode.net/news/spam-shake
> > > [2] https://nedbatchelder.com/blog/201808/fighting_spam_on_freenode.html


Re: [Numpy-discussion] Taking back control of the #numpy irc channel

2018-08-08 Thread Sebastian Berg
On Tue, 2018-08-07 at 22:07 -0700, Ralf Gommers wrote:
> 
> 
On Tue, Aug 7, 2018 at 4:34 AM, Sebastian Berg wrote:
> > On Mon, 2018-08-06 at 21:52 -0700, Ralf Gommers wrote:
> > > 
> > > 
> > > On Mon, Aug 6, 2018 at 7:15 PM, Nathan Goldbaum wrote:
> > > > Hi,
> > > > 
> > > > I idle in #scipy and have op in there. I’m happy start idling
> > in
> > > > #numpy and be op if the community is willing to let me.
> > > > 
> > > 
> > > Thanks Nathan. Sounds useful.
> > > 
> > 
> > Sounds good. I haven't really hung out there for a long time
> > (frankly,
> > I never hung out in #numpy, I thought people just use #scipy).
> > 
> > Can we just give a few names (such as Matti, Nathan, maybe me,
> > anyone
> > else right now?) and add others later ourselves?
> > I can get in contact with freenode (unless someone already did).
> 
> Thanks Sebastian. Go ahead I'd say.


Will do; I just realized looking at it: the Steering Council page has a
list of names, but not email addresses (or PGP keys). I do not
remember, was that intentional or not? Also, I am not sure if the
steering council email address is published anywhere; IIRC it was
possible for anyone to send an email to it (OTOH, it would be nice to
not catch spam there, so maybe it is fine to have to ask for the
address first).

- Sebastian

> 
> Ralf
> 
> > > There's also a Gitter numpy channel. AFAIK few/none core devs are
> > > regularly active on either IRC or Gitter. I would suggest that we
> > > document both these channels as community-run at
> > > https://scipy.org/scipylib/mailing-lists.html, and give Nathan
> > and
> > > others who are interested the permissions they need.
> > > 
> > 
> > Yeah, the gitter seems pretty inactive as well. But I guess it
> > doesn't
> > hurt to mention them.
> > 
> > - Sebastian
> > 
> > 
> > > I think our official recommendation for usage questions is
> > > StackOverflow.
> > > 
> > > Cheers,
> > > Ralf
> > > 
> > > 
> > > > I’m also in the process of getting ops for #matplotlib for
> > similar
> > > > spam-related reasons. I’d say all the scientific python IRC
> > > > channels I’m in get a decent amount of traffic (perhaps 10% of
> > the
> > > > number of questions that get asked on StackOverflow) and it’s a
> > > > good venue for asking quick questions. Let’s hope that forcing
> > > > people to register doesn’t kill that, although there’s not much
> > we
> > > > can do given the spam attack.
> > > > 
> > > > Nathan
> > > > 
> > > > On Mon, Aug 6, 2018 at 9:03 PM Matti Picus  > om>
> > > > wrote:
> > > > > Over the past few days spambots have been hitting freenode's
> > IRC 
> > > > > channels[0, 1]. It turns out the #numpy channel has no
> > operator,
> > > > > so we 
> > > > > cannot make the channel mode "+q $~a"[2] - i.e. only
> > registered 
> > > > > freenode users can talk but anyone can listen.
> > > > > 
> > > > > I was in touch with the freenode staff, they requested that
> > > > > someone from 
> > > > > the steering council reach out to them at
> > > > > projects@freenode.net, here 
> > > > > is the quote from the discussion:
> > > > > 
> > > > > "
> > > > > it's pretty much a matter of them sending an email telling us
> > who
> > > > > they'd 
> > > > > like to represent them on freenode, which channels and cloak
> > > > > namespaces 
> > > > > they want, and any info we might need on the project
> > > > > "
> > > > > 
> > > > > In the mean time they set the channel mode appropriately, so
> > this
> > > > > is 
> > > > > also a notice that if you want to chat on the #numpy IRC
> > channel
> > > > > you 
> > > > > need to register.
> > > > > 
> > > > > Hope someone from the council picks this up and reaches out
> > to
> > > > > them, and 
> > > > > will decide who is to able to become channel operators (the
> > > > > recommended 
> > > > > practice is to use it like 

Re: [Numpy-discussion] Taking back control of the #numpy irc channel

2018-08-08 Thread Sebastian Berg
On Wed, 2018-08-08 at 08:55 -0700, Ralf Gommers wrote:
> 
> 
On Wed, Aug 8, 2018 at 1:23 AM, Sebastian Berg wrote:
> > On Tue, 2018-08-07 at 22:07 -0700, Ralf Gommers wrote:
> > > 
> > > 
> > On Tue, Aug 7, 2018 at 4:34 AM, Sebastian Berg wrote:
> > > > On Mon, 2018-08-06 at 21:52 -0700, Ralf Gommers wrote:
> > > > > 
> > > > > 
> > > > > On Mon, Aug 6, 2018 at 7:15 PM, Nathan Goldbaum wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I idle in #scipy and have op in there. I’m happy start
> > idling
> > > > in
> > > > > > #numpy and be op if the community is willing to let me.
> > > > > > 
> > > > > 
> > > > > Thanks Nathan. Sounds useful.
> > > > > 
> > > > 
> > > > Sounds good. I haven't really hung out there for a long time
> > > > (frankly,
> > > > I never hung out in #numpy, I thought people just use #scipy).
> > > > 
> > > > Can we just give a few names (such as Matti, Nathan, maybe me,
> > > > anyone
> > > > else right now?) and add others later ourselves?
> > > > I can get in contact with freenode (unless someone already
> > did).
> > > 
> > > Thanks Sebastian. Go ahead I'd say.
> > 
> > 
> > Will do, just realized looking at it. The Steering Council list,
> > etc.
> > has a list of names, but not email addresses (or PGP keys). I do
> > not
> > remember, was that intentional or not? 
> 
> I have a vague memory of that being intentional, but not sure. I
> don't mind making email addresses public; they can be found from git
> commit logs and mailing lists anyway, so why make life difficult for
> whomever wants to reach us.
>  


Yeah, well, I find PGP keys a good idea, even if they might go stale
once in a while. It means that if someone wants to check, you can
easily sign an email and they can be pretty sure you have some sway in
NumPy (right now freenode checked by seeing that I have rights on
github, but that is not really ideal).

On a general note about IRC, we have claimed #numpy now:

If anyone wants anything #numpy related on IRC now (new channels, cloak
namespaces, ...), please contact me or Matti (I assume you are happy
with that role!). If someone is unhappy with us two being the main
contacts/people who have those rights on freenode, also contact us so
we can get it changed.

- Sebastian


> > Also I am not sure if the
> > steering council email address is published anywhere, IIRC it was
> > possible for anyone to send an email to it (OTOH, it would be nice
> > to
> > not catch spam there, so maybe it is fine to ask for the address
> > first).
> 
> Google's spam filters are pretty good. For the record, it is
> numpy-steering-coun...@googlegroups.com
> 
> Cheers,
> Ralf
> 
>  
> > - Sebastian
> > 
> > > 
> > > Ralf
> > > 
> > > > > There's also a Gitter numpy channel. AFAIK few/none core devs
> > are
> > > > > regularly active on either IRC or Gitter. I would suggest
> > that we
> > > > > document both these channels as community-run at
> > > > > https://scipy.org/scipylib/mailing-lists.html, and give
> > Nathan
> > > > and
> > > > > others who are interested the permissions they need.
> > > > > 
> > > > 
> > > > Yeah, the gitter seems pretty inactive as well. But I guess it
> > > > doesn't
> > > > hurt to mention them.
> > > > 
> > > > - Sebastian
> > > > 
> > > > 
> > > > > I think our official recommendation for usage questions is
> > > > > StackOverflow.
> > > > > 
> > > > > Cheers,
> > > > > Ralf
> > > > > 
> > > > > 
> > > > > > I’m also in the process of getting ops for #matplotlib for
> > > > similar
> > > > > > spam-related reasons. I’d say all the scientific python IRC
> > > > > > channels I’m in get a decent amount of traffic (perhaps 10%
> > of
> > > > the
> > > > > > number of questions that get asked on StackOverflow) and
> > it’s a
> > > > > > good venue for asking quick questions. Let’s hope that
> > forcing
> > > > > > people t

Re: [Numpy-discussion] Stacklevel for warnings.

2018-08-11 Thread Sebastian Berg
On Fri, 2018-08-10 at 16:05 -0600, Charles R Harris wrote:
> Hi All,
> 
> Do we have a policy for the stacklevel that should be used in NumPy?
> How far back should the stack be displayed? I note that the optimum
> stacklevel may vary between users and developers.
> 

I thought the idea was that it should point to the correct user line
(or tend to point there): stacklevel=2 for exposed functions and higher
for private (python) functions, IIRC.
As for developers, I would hope they are OK with (and know how to)
turning the warning into an error.
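
For concreteness, a small sketch of that convention (function names
hypothetical):

import warnings

def _private_helper():
    # Two frames up: _private_helper -> public_api -> user code.
    warnings.warn("deprecated behaviour", DeprecationWarning, stacklevel=3)

def public_api():
    # One frame up: public_api -> user code.
    warnings.warn("deprecated behaviour", DeprecationWarning, stacklevel=2)
    _private_helper()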

I am not sure we discussed it much; I seem to have a vague memory of
asking whether we are sure this is what we want, and at least Ralf
agreeing. Also, I don't know how consistent it is overall.

- Sebastian


> Chuck


Re: [Numpy-discussion] Stacklevel for warnings.

2018-08-11 Thread Sebastian Berg
On Sat, 2018-08-11 at 11:11 -0700, Ralf Gommers wrote:
> 
> 
On Sat, Aug 11, 2018 at 1:22 AM, Sebastian Berg wrote:
> > On Fri, 2018-08-10 at 16:05 -0600, Charles R Harris wrote:
> > > Hi All,
> > > 
> > > Do we have a policy for the stacklevel that should be used in
> > NumPy?
> > > How far back should the stack be displayed? I note that the
> > optimum
> > > stacklevel may vary between users and developers.
> > > 
> > 
> > I thought it was so that it will point to the correct user line (or
> > tend to point there). So stacklevel=2 for exposed and higher for
> > private (python) functions IIRC.
> > As for developers, I would hope they are OK with (and know how to)
> > turning the warning into an error.
> > 
> > Not sure we discussed it much, I seem to have a vague memory of
> > asking
> > if we are sure this is what we want and at least Ralf agreeing.
> > Also I
> > don't know how consistent it is overall.
> 
> That sounds right to me. I think when it was introduced it was quite
> consistent, because Sebastian replace warning filters everywhere with
> suppress_warnings. Would be good to document this in the devguide.

Yeah, probably reasonably consistent, but I only added a test to check
that the stacklevel argument is never missing entirely; it is up to the
author to figure out the right level (or the best easily possible,
since sometimes it would be pretty ugly to always get it exactly
right).

The warning testing (suppress_warnings, etc.) never actually checks the
stacklevel as far as I am aware (or maybe I forgot :)); that could be
something to think about though. I guess we did it around the same time
as the general warning-testing cleanup.

- Sebastian



> Ralf
> 


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-15 Thread Sebastian Berg
On Tue, 2018-08-14 at 21:30 -0700, Ralf Gommers wrote:
> 
> 
On Fri, Aug 3, 2018 at 1:02 PM, Charles R Harris wrote:
> > 
> > On Fri, Aug 3, 2018 at 1:45 PM, Peter Creasey wrote:
> > > +1 for keeping the same CoC as Scipy, making a new thing just
> > > seems a
> > > bigger surface area to maintain. Personally I already assumed
> > > Scipy's
> > > "honour[ing] diversity in..." did not imply any protection of
> > > behaviours that violate the CoC *itself*, but if you wanted to be
> > > really explicit you could add "to the extent that these do not
> > > conflict with this code of conduct." to that line.
> > 
> > I prefer that to the proposed modification, short and sweet.
> > 
> 
> This edit to the SciPy CoC has now been merged.
> 
> It looks to me like we're good to go here and take over the SciPy
> CoC.


Sounds good, so +1.

I am happy with the committee as well, and I guess most/all are, but we
might want to discuss it separately?

- Sebastian


> 
> Cheers,
> Ralf
> 


Re: [Numpy-discussion] count_nonzero axis argument?

2018-09-17 Thread Sebastian Berg
On Mon, 2018-09-17 at 12:37 +0100, Matthew Brett wrote:
> Hi,
> 
> Is there any reason that np.count_nonzero should not take an axis
> argument?  As in:
> 

No, sounds like an obvious improvement, but as with all of those,
someone has to volunteer to do it...
Coding it will probably mean using NpyIter and possibly adding fast
paths (I am not sure about the current state of count_nonzero), but it
should not be very difficult.
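
In the meantime, a simple stopgap (it builds a temporary boolean array,
unlike a dedicated NpyIter implementation would):

import numpy as np

def count_nonzero_axis(a, axis=None):
    # boolean "not equal to zero" summed along the requested axis
    return (np.asarray(a) != 0).sum(axis=axis)

print(count_nonzero_axis([[10, 11], [0, 3]], axis=1))  # [2 1]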

- Sebastian



> > > > np.better_count_nonzero([[10, 11], [0, 3]], axis=1)
> 
> array([2, 1])
> 
> It would be much more useful if it did...
> 
> Cheers,
> 
> Matthew


Re: [Numpy-discussion] Exact semantics of ufunc.reduce

2018-10-12 Thread Sebastian Berg
On Fri, 2018-10-12 at 17:34 +0200, Hameer Abbasi wrote:
> Hello!
> 
> I’m trying to investigate the exact way ufunc.reduce works when given
> a custom dtype. Does it cast before or after the operation, or
> somewhere in between? How does this differ from ufunc.reduceat, for
> example?
> 

I am not 100% sure, but I think giving the dtype definitely casts the
output type. And since most ufunc loops are defined as "ff->f", etc.,
that effectively casts the input as well. It might be that it casts the
input specifically, but I doubt it.

The cast will occur within the buffering machinery, so the cast is only
done in small chunks. But the operation itself should be performed
using the given dtype.
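
A way to observe this (a sketch; the exact values depend on the
platform and rounding):

import numpy as np

a = np.full(10000, 0.1, dtype=np.float16)

# Default: the float16 "ee->e" loop runs, so precision is limited.
print(np.add.reduce(a))
# dtype=np.float64: the input is cast (chunk-wise, via the buffering
# machinery) and the "dd->d" loop accumulates in double precision,
# giving a noticeably more accurate sum.
print(np.add.reduce(a, dtype=np.float64))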

- Sebastian


> We ran into this issue in pydata/sparse#191 when trying to match the
> two where the only thing differing is the number of zeros for sum,
> which shouldn’t change the result.
> 
> Best Regards,
> Hameer Abbasi
> 
> 


Re: [Numpy-discussion] Removing priority labels from github

2018-10-19 Thread Sebastian Berg
On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
> We currently have highest, high, normal, low, and lowest priority
> labels 
> for github issues/PRs. At the recent status meeting, we proposed 
> consolidating these to a single "high" priority label. Anything
> "low" 
> priority should be merged or closed since it will be quickly
> forgotten, 
> and no "normal" tag is needed.
> 
> 
> With that, we (the BIDS team) would like to encourage reviewers to
> use 
> the "high" priority tag to indicate things we should be working on.
> 
> Any objections or thoughts?
> 

Sounds like a plan; having practically meaningless tags right now is no
help. Most of them are historical, and personally I have only been
using the milestones to tag things as high priority (very
occasionally).

- Sebastian


> 
> Matti (in the names of Tyler and Stefan)
> 
> 

