Re: [Numpy-discussion] NEP 31 — Context-local and global overrides of the NumPy API
On 2019-09-07 15:33, Ralf Gommers wrote: On Sat, Sep 7, 2019 at 1:07 PM Sebastian Berg wrote: On Fri, 2019-09-06 at 14:45 -0700, Ralf Gommers wrote:

That's part of it. The concrete problems it's solving are threefold: array creation functions can be overridden; array coercion is now covered; and "default implementations" will allow you to re-write your NumPy-like array more easily, when such efficient implementations exist in terms of other NumPy functions. That will also help achieve similar semantics, but as I said, they're just "defaults"...

There may be another very concrete one (that's not yet in the NEP): allowing other libraries that consume ndarrays to use overrides. An example is numpy.fft: currently both mkl_fft and pyfftw monkeypatch NumPy, something we don't like all that much (in particular for mkl_fft, because it's the default in Anaconda). `__array_function__` isn't able to help here, because it will always choose NumPy's own implementation for ndarray input. With unumpy you can support multiple libraries that consume ndarrays. Another example is einsum: if you want to use opt_einsum for all inputs (including ndarrays), then you cannot use np.einsum. And yet another is using bottleneck (https://kwgoodman.github.io/bottleneck-doc/reference.html) for nan-functions and partition. There are likely more of these. The point is: sometimes the array protocols are preferred (e.g. Dask/Xarray-style meta-arrays), sometimes unumpy-style dispatch works better. It's also not necessarily an either/or; they can be complementary.

Let me try to move the discussion from the GitHub issue here (this may not be the best place): https://github.com/numpy/numpy/issues/14441, which asked for easier creation functions together with `__array_function__`.

I think an important point mentioned there is how users interact with unumpy vs. `__array_function__`. The former is an explicit opt-in, while the latter is an implicit choice based on an `array-like` abstract base class and functional, type-based dispatching. To quote NEP 18 on this: "The downsides are that this would require an explicit opt-in from all existing code, e.g., import numpy.api as np, and in the long term would result in the maintenance of two separate NumPy APIs. Also, many functions from numpy itself are already overloaded (but inadequately), so confusion about high vs. low level APIs in NumPy would still persist." (I do think this is a point we should not just ignore; `uarray` is a thin layer, but it has a big surface area.)

Now there are cases where explicit opt-in is the obvious choice, and the FFT example is one of those: there is no way to implicitly choose another backend (except by just replacing it, i.e. monkeypatching) [1]. And right now I think these are _very_ different.

For end-users, choosing one array-like over another seems nicer as an implicit mechanism (why should I not mix sparse, dask and numpy arrays!?). This is the promise `__array_function__` tries to make. Unless convinced otherwise, my guess is that most library authors would strive for implicit support (i.e. sklearn, skimage, scipy).

Circling back to creation and coercion: in a purely object-oriented type system these would be classmethods, I guess, but in NumPy and the libraries above we are lost.

Solution 1: Create explicit opt-in, e.g. through uarray. (NEP 31)
* Requires end-user opt-in.
* Seems cleaner in many ways.
* Requires a full copy of the API.

Bullets 1 and 3 are not required.
If we decide to make it default, then there's no separate namespace. It does require explicit opt-in to have any benefits to the user.

Solution 2: Add some coercion "protocol" (NEP 30) and expose a way to create new arrays more conveniently. This would practically mean adding an `array_type=np.ndarray` argument.
* _Not_ used by end-users! End users should use dask.linspace!
* Adds a "strange" API somewhere in NumPy, and possibly a new "protocol" (in addition to coercion). [2]

I still feel these solve different issues. The second one is intended to make array-likes work implicitly in libraries (without end users having to do anything), while the first seems to force the end user to opt in, sometimes unnecessarily:

def my_library_func(array_like):
    exp = np.exp(array_like)
    idx = np.arange(len(exp))
    return idx, exp

would have all the information for implicit opt-in/array-like support, but cannot do it right now.

Can you explain this a bit more? `len(exp)` is a number, so `np.arange(number)` doesn't really have any information here.

Right, but as a library author I want a way to make it use the same type as `array_like` in this particular function; that is the point! The end-user already signaled they prefer, say, dask, due to the array that was actually passed in. (But this is just repeating what is below, I think.) This is what I have
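To make the distinction concrete, here is a rough sketch of the two styles in user code. uarray does provide a `set_backend` context manager; the backend module names below are assumptions for illustration, not necessarily the final NEP 31 spelling:

```python
import uarray as ua
import unumpy as np                         # NumPy-mirroring namespace of NEP 31
import unumpy.dask_backend as dask_backend  # assumed name of a dask backend

# Explicit opt-in (NEP 31 / unumpy style): the user selects the backend,
# and even creation functions such as arange dispatch to it.
with ua.set_backend(dask_backend):
    x = np.arange(10)   # would be a dask array
    y = np.exp(x)

# Implicit dispatch (NEP 18 / __array_function__ style): the type of the
# argument decides, with no opt-in, but numpy.arange(10) can never return
# a dask array this way.
import numpy
import dask.array as da

m = numpy.mean(da.ones(10))  # dispatches to dask via __array_function__
```

The creation-function gap in the second half of the sketch is exactly what Solution 2 above tries to address.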
[Numpy-discussion] Re: Dealing with static local variables in Numpy
On Tue, 2023-08-29 at 08:01 +0000, Nicolas Holzschuch wrote:
> Hello,
>
> This is my first post to this group; I'd like to start by expressing
> my appreciation for the amazing work in developing and maintaining
> Numpy.
>
> I have a question. Numpy has quite a lot of static local variables
> (variables defined as static inside a function), like this
> (core/src/multiarraymodule.c, line 4483):
>
>     if (raise_exceptions) {
>         static PyObject *too_hard_cls = NULL;
>         /* ... */
>     }
>
> I understand that these variables provide local caching and are
> important for efficiency. They do however cause some issues when
> dealing with multiple subinterpreters, where the static local
> variable might have been initialized by one of the subinterpreters,
> and is not reset when accessed by another subinterpreter.
> More globally, they cannot be reset when the Numpy module is
> released, and thus will likely cause an issue if it is reloaded after
> being released.

Right, but in the end these caches are there for a reason (or almost all of them are), and just removing them does not seem acceptable to me.

However, there are better ways to solve this. You can move them into module state. In the vast majority of cases that should not be hard: the patterns are known. In a few cases it may be harder, but I believe CPython offers decent solutions now (I am not sure what they look like). I had for a long time hoped that the HPy drive would solve this, but there is no reason to wait for it.

In any case, contributions to this effect are very much welcome. I have been hoping they would come for a long time, but I am not excited about just removing the "static".

- Sebastian

> I have seen the issue mentioned in at least one pull request:
> https://github.com/numpy/numpy/pull/15169 and in several issues. If I
> understand correctly, the issue is not considered as important
> because subinterpreters are not yet prominent in CPython, and static
> local variables provide an important service in caching data locally
> (instead of exposing these variables globally). So the benefits
> outweigh the costs and risks (it would be a huge change to the code
> base).
>
> I happen to maintain, compile and run a version of Python on iOS (
> https://github.com/holzschu/a-shell/ or
> https://apps.apple.com/us/app/a-shell/id1473805438), where I have to
> remove all these static local variables, because of the specificity
> of the platform (in order to run Python multiple times, I have to
> release and reset all modules). Right now, I'm maintaining the
> changes to the code base in a separate branch (
> https://github.com/holzschu/numpy/) and not necessarily in a very
> clean way.
>
> With the recent renewed interest in subinterpreters, I was wondering
> if there was a way I could contribute these changes back to the main
> numpy branch. I would have to clean up the code, obviously, and
> probably get guidance on how to do it cleanly, but the first question
> is: would there be an interest, or is that something I should keep in
> my separate branch?
>
> From a technical point of view, about 80% of these static local
> variables are just before a call to npy_cache_import(), and the
> most efficient way to do it (in terms of lines of code) is just to
> remove the part where npy_cache_import uses the static local
> variable. You pay a price in performance, but gain in usability.
> Best regards,
> Nicolas Holzschuch
[Numpy-discussion] Re: Curious performance different with np.unique on arrays of characters
On Fri, 2023-09-29 at 11:39 +0200, Klaus Zimmermann wrote:
> Hi,
>
> one thing that's been on my mind about this discussion:
>
> Isn't sorting strings simply a much harder job? Particularly Unicode
> strings?

Yes, but in theory, if they are length 1, it is just sorting integers (8- or 64-bit) for the current quirky NumPy fixed-length string dtypes. Modulo complicated stuff that Python doesn't worry about either [1].

But of course, that is in theory. In practice we have a single implementation that deals with arbitrary string lengths, so the code does a lot of extra work (it is harder to use fancy tricks, and our implementation for a lot of these things is very basic). Also, while we do have the flexibility to create such a specialization now, we don't actually have an obvious place to add it (of course you can always insert an `if ...` clause somewhere, but that isn't a nice design).

- Sebastian

[1] In principle you are right: sorting Unicode is complicated! In practice, that is your problem as a user, though. If you want to deal with weirder things, you have to normalize the unicode first, etc.

> Cheers
> Klaus
>
> On 27/09/2023 13:12, Lyla Watts wrote:
> > Could you share the processor you're currently running this on? I
> > ask because np.sort leverages AVX-512 acceleration for sorting
> > np.int32, and I'm curious if that could be contributing to the
> > observed difference in performance.
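The "length-1 strings are just integers" point can be checked by hand today with a view; a small sketch for the fixed-width bytestring dtype `S1` (for `U1` the underlying code points would be 32-bit instead):

```python
import numpy as np

# Length-1 bytestrings are single bytes under the hood, so a uint8 view
# of the same memory sorts with the fast integer paths while producing
# exactly the same ordering as the string sort:
a = np.array(list("banana") * 1000, dtype="S1")

order = np.argsort(a.view(np.uint8))
assert (a[order] == np.sort(a)).all()
```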
[Numpy-discussion] Merging very limited weights support for quantiles/percentiles
Hi all,

there is a PR to merge very limited support for weights in quantiles which, given no further input, I will probably merge based on sklearn devs saying that they will use it. This means adding a `weights` kwarg [1]. See: https://github.com/numpy/numpy/pull/24254

Limited here means that it would only work for the "inverted_cdf" method (which is not the default one).

Why is it very limited? Because this limited version is the only form we/I am pretty confident about getting right. There are various problems with making it more broad:

1. Weights are not clearly defined and can have many meanings, e.g.:
   * frequency weights (repeated observations)
   * probability weights (removing sample biases)
   * "analytic"/"precision" weights (encoding observation precision/variance)
2. There is very little to no literature on how to deal with the subtleties (in the context of the various types of weights) of:
   * interpolation (relevant to all interpolating methods)
   * unbiasing (the main difference between the methods)

The PR adds the most minimal thing, where the weight types are largely equivalent (no unbiasing issues, no interpolation). [2]

Due to these complexities (and the lack of many statistics specialists looking at it) there is a point to be made that we just shouldn't add this in NumPy, but if nobody else has an opinion, I will go with the sklearn devs who want it :).

(Also, with weights we have to rely on full sorting for now, which can be slow, but which I can live with personally.)

- Sebastian

[1] There are different styles of weights, and for some methods that clearly matters. Thus, if we ever expand the definition, it may be that `weights` has to be mapped to one of these, or that the generic `weights` kwarg would raise an error for those, requiring you to pick a specific one like `pweights=` or `fweights=`.

[2] I am not quite sure about "analytic weights" here, but to me these do not really make sense in the context of a discrete interpolation method.
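For reference, a sketch of what "largely equivalent" means for frequency weights, assuming the `weights` kwarg lands as proposed in the PR (it is only accepted together with `method="inverted_cdf"`): integer weights should behave exactly like repeating the observations.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)
w = rng.integers(1, 5, size=20)  # frequency weights (repeat counts)

# With integer frequency weights, the weighted quantile should agree
# with simply repeating each observation w[i] times:
expected = np.quantile(np.repeat(x, w), 0.25, method="inverted_cdf")
result = np.quantile(x, 0.25, weights=w, method="inverted_cdf")
assert result == expected
```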
[Numpy-discussion] Windows default integer now 64bit in main
Hi all,

just a heads up: the PR to change the default integer is merged on main. This may cause issues, especially with Cython code, because `np.int_t` cannot be reasonably defined anymore.

Other code may also want to vet usage of "long" in any variation. Much code (like SciPy) simply supports any integer input, although even there integer output may be relevant. New NumPy defines `NPY_DEFAULT_INT` to be able to branch at runtime; for backward compatibility you could use:

#ifndef NPY_DEFAULT_INT
#define NPY_DEFAULT_INT NPY_LONG
#endif

Unfortunately, I expect this to be a bit painful; please let us know if it is too painful for some reason.

But OTOH, the old default has been a recurring surprise and is a common reason for Linux-written software to not run on Windows.

- Sebastian
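From Python, the change is visible in the dtype that plain integer lists produce; a quick check that works on either side of the change:

```python
import numpy as np

# The dtype NumPy picks for plain Python ints is the "default integer":
a = np.array([1, 2, 3])
print(a.dtype)  # int64 on 64-bit Windows after this change; previously
                # int32, because the default was the 32-bit C "long"
```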
[Numpy-discussion] Re: Windows default integer now 64bit in main
On Thu, 2023-11-02 at 19:37 +0100, Michael Siebert wrote:
> Hi Sebastian,
>
> great news! Does that mean that Windows Numpy 64 bit default integers
> are coming before Numpy 2.0, like in Numpy 1.27? Will there be
> another release before 2.0?

NumPy 2 of course; this is way too big a change. There is no 1.27 planned as of now; if it happens, it would be a (big) backport release, though. (Due to files having been moved around, backports seem to be getting harder, though.)

- Sebastian

> Best, Michael
>
> > On 2. Nov 2023, at 16:25, Sebastian Berg <sebast...@sipsolutions.net> wrote:
> > Hi all,
> >
> > just a heads up, the PR to change the default integer is merged on
> > main. [full announcement quoted above]
[Numpy-discussion] Re: Switching default order to column-major
Few things in the Python API care about order, but there are also quite a few places that will return C order (and are faster for C-order inputs) whether you change those defaults or not.

The main issue is that e.g. some Cython wrappers will probably assume that a newly created array is C order, and those will just not work. For example, I would imagine many libraries with C/Cython wrappers have code that doesn't specify `order="C"` explicitly (why would they?) but then passes the array into a typed memoryview (if Cython) like `double[:, ::1]`, enforcing a C-contiguous memory layout for speed. Such code should normally fail gracefully, but fail it will. Also, as Aaron said, a lot of these places might not enforce C order but still be speed impacted.

So yes, it would be expected to break a lot of C-interfacing code that has Python wrappers around it to normalize input.

- Sebastian

On Fri, 2023-11-10 at 22:37 +0000, Valerio De Benedetto wrote:
> Hi, I found that the documented default row-major order is enforced
> throughout the library with a series of `order='C'` default
> parameters, so given this I supposed there's no way to change the
> default (or am I wrong?)
> If, supposedly, I'd change that by patching the library (substituting
> 'C's for 'F's), do you think there would be any problem with other
> downstream libraries using numpy in my project? Do you think they
> assume a default-constructed array is always row-major and access the
> underlying data?
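A minimal pure-Python illustration of the kind of assumption that breaks; the flags check stands in for what a Cython `double[:, ::1]` memoryview would enforce:

```python
import numpy as np

a = np.ones((3, 4), order="F")   # column-major array
print(a.flags["C_CONTIGUOUS"])   # False
print(a.flags["F_CONTIGUOUS"])   # True

# Wrapper code like a Cython `double[:, ::1]` memoryview demands C order
# and rejects such input; np.ascontiguousarray is the usual fix, at the
# cost of a copy:
b = np.ascontiguousarray(a)
print(b.flags["C_CONTIGUOUS"])   # True, but this made a copy
```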
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Fri, 2023-12-22 at 18:01 -0500, Marten van Kerkwijk wrote:
> Hi Martin,
>
> I agree it is a long-standing issue, and I was reminded of it by your
> comment. I have a draft PR at
> https://github.com/numpy/numpy/pull/25476
> that does not change the old behaviour, but allows you to pass in a
> start-stop array which behaves more sensibly (exact API TBD).
>
> Please have a look!

That looks nice. I don't have a clear feeling on the order of items; if we think of it in terms of `(start, stop)`, there was also the idea voiced to simply add another name, in which case you could allow start and stop to be separate arrays. Of course, if we go with your `slice(start, stop)` idea that also works, although passing them as separate parameters seems nice too.

Adding another name (if we can think of one at least) seems pretty good to me, since I suspect we would add docs to suggest not using `reduceat`.

One small thing about the PR: I would like to distinguish `default` and `initial`. I.e. the default value is used only for empty reductions, while the initial value should always be used (unless you pass both, which we don't do for normal reductions though). I suppose the machinery isn't quite set up to do both side-by-side.

- Sebastian

> Marten
>
> Martin Ling writes:
>
> > Hi folks,
> >
> > I don't follow numpy development in much detail these days but I
> > see that there is a 2.0 release planned soon.
> >
> > Would this be an opportunity to change the behaviour of 'reduceat'?
> >
> > This issue has been open in some form since 2006!
> > https://github.com/numpy/numpy/issues/834
> >
> > The current behaviour was originally inherited from Numeric, and
> > makes reduceat often unusable in practice, even where it should be
> > the perfect, concise, efficient solution. But it has been impossible
> > to change it without breaking compatibility with existing code.
> >
> > As a result, horrible hacks are needed instead, e.g. my answer here:
> > https://stackoverflow.com/questions/57694003
> >
> > Is this something that could finally be fixed in 2.0?
> >
> > Martin
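For context, the long-standing surprise with the current definition (this is the documented behavior): whenever `indices[i] >= indices[i+1]`, the result is simply `a[indices[i]]` instead of an empty or backwards reduction.

```python
import numpy as np

a = np.arange(10)

# Normal case: sums a[0:5] and a[5:]:
print(np.add.reduceat(a, [0, 5]))   # [10 35]

# Quirk: since 5 >= 3, the first result is just a[5], not an empty sum;
# the second result is sum(a[3:]):
print(np.add.reduceat(a, [5, 3]))   # [ 5 42]
```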
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> > That looks nice. I don't have a clear feeling on the order of
> > items; if we think of it in terms of `(start, stop)`, there was
> > also the idea voiced to simply add another name, in which case you
> > could allow start and stop to be separate arrays.
>
> Yes, one could add another method. Or perhaps even add a new argument
> to `.reduce` instead (say `slices`). But this seemed the simplest
> route...

Yeah, I don't mind this; it doesn't stop us from finding a better idea either. Adding to `.reduce` could be fine, but overall I actually think a new name, or reusing `reduceat`, is nicer than overloading `.reduce` more; even `reduce_slices()` would do.

> > I suppose the machinery isn't quite set up to do both side-by-side.
>
> I just followed what is done for reduce, where a default could also
> have made sense given that `where` can exclude all inputs along a
> given row. I'm not convinced it would be necessary to have both,
> though it would not be hard to add.

Sorry, I misread the code: you do use `initial` the same way as in reductions; I thought it wasn't used when there were multiple elements. I.e. it is used for non-empty slices also.

There is still a little annoyance when `initial=` isn't passed, since default/initial can be different (this is the case for object add, for example: the default is `0`, but it is not used as initial for non-empty reductions). Anyway, it's a small detail, even if it may be finicky to get right. At the moment it seems passing `dtype=object` somehow changes the result also.

- Sebastian

> All the best,
>
> Marten
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> > That looks nice. I don't have a clear feeling on the order of
> > items; if we think of it in terms of `(start, stop)`, there was
> > also the idea voiced to simply add another name, in which case you
> > could allow start and stop to be separate arrays.
>
> Yes, one could add another method. Or perhaps even add a new argument
> to `.reduce` instead (say `slices`). But this seemed the simplest
> route...
>
> > Of course, if we go with your `slice(start, stop)` idea that also
> > works, although passing them as separate parameters seems nice too.
> >
> > Adding another name (if we can think of one at least) seems pretty
> > good to me, since I suspect we would add docs to suggest not using
> > `reduceat`.
>
> If we'd want to, even with the present PR it would be possible to
> (very slowly) deprecate the use of a list of single integers. But I'm
> trying to go with just making the existing method more useful.
>
> > One small thing about the PR: I would like to distinguish `default`
> > and `initial`. I.e. the default value is used only for empty
> > reductions, while the initial value should always be used (unless
> > you pass both, which we don't do for normal reductions though).
> > I suppose the machinery isn't quite set up to do both side-by-side.
>
> I just followed what is done for reduce, where a default could also
> have made sense given that `where` can exclude all inputs along a
> given row. I'm not convinced it would be necessary to have both,
> though it would not be hard to add.

I was looking at the PR, which still seems worthwhile, although not urgent right now. But this makes me think (loudly ;)) that `get_reduction_initial` should maybe distinguish this more fully... Because there are 3 cases, even if we only use the first two currently:

1. True identity: default and initial are the same.
2. Default but no initial: object sum has no initial, but does use `0` as default.
3. Initial that is not a valid default: this would be useful to simplify min/max reductions; `-inf` and `MIN_INT` are valid initial values but are not valid default values.

- Sebastian

> All the best,
>
> Marten
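Case 2 can be illustrated with today's behavior for object dtype:

```python
import numpy as np

# Empty object reduction: the *default* 0 is returned...
print(np.add.reduce(np.array([], dtype=object)))           # 0

# ...but 0 is not used as an *initial* value for non-empty reductions,
# otherwise this string concatenation would fail on 0 + "a":
print(np.add.reduce(np.array(["a", "b"], dtype=object)))   # 'ab'
```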
[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy
On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote:
> Hi all,
>
> I propose we accept NEP 55 and merge PR #25347 implementing the NEP
> in time for the NumPy 2.0 RC:

I really like this work and I think it is a big improvement! At this point we probably have to expect some things to still be buggy, but that is also a reason to get it in (testing is hard if it isn't shipped first-class, unfortunately).

Nathan summarized the things I might have brought up very well. The support for missing values is the one thing that, to me, may end up a bit more in flux. But I am happy to hope that it is done in a way that will not affect pandas and, honestly, without deep integration testing we won't make progress in figuring out whether some change is needed or not.

Thanks for the great work!

- Sebastian

> https://numpy.org/neps/nep-0055-string_dtype.html
> https://github.com/numpy/numpy/pull/25347
>
> The most controversial aspect of the NEP was support for missing
> strings via a user-supplied sentinel object. In the previous
> discussion on the mailing list, Warren Weckesser argued for shipping
> a missing data sentinel with NumPy for use with the DType, while in
> code review and the PR for the NEP, Sebastian expressed concern about
> the additional complexity of including missing data support at all.
>
> I found that supporting missing data is key to efficiently supporting
> the new DType in Pandas. I think that argues that we need some level
> of missing data support to fully replace object string arrays. I
> believe the compromise proposal in the NEP is sufficient for
> downstream libraries while limiting additional complexity elsewhere
> in NumPy.
>
> Concerns raised in previous discussions about concretely specifying
> the C API to be made public, preventing use-after-free errors in a
> multithreaded context, and uncertainty around the arena allocator
> implementation have been resolved in the latest version of the NEP
> and the open PR. Additionally, due to some excellent and timely work
> by Lysandros Nikolaou, we now have a number of string ufuncs in NumPy
> and a straightforward plan to add more. Loops have been implemented
> for all the ufuncs added in the NumPy 2.0 dev cycle so far.
>
> I would like to see us ship the DType in NumPy 2.0. This will allow
> us to advertise a major new feature, will spur efforts to support new
> DTypes in downstream libraries, and will allow us to get feedback
> from the community that would be difficult to obtain without
> releasing the code into the wild. Additionally, I am funded via a
> NASA ROSES grant for work related to this effort until the end of
> 2024, so including the DType in NumPy 2.0 will more efficiently use
> my funded time to fix issues.
>
> If there are no substantive objections to this email, then the NEP
> will be considered accepted; see NEP 0 for more details:
> https://numpy.org/neps/nep-0000.html
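A small usage sketch of the new DType as it ships in the NumPy 2.0 development branch, including the user-supplied missing-value sentinel discussed above (treat the details as subject to the "flux" mentioned):

```python
import numpy as np
from numpy.dtypes import StringDType

# Variable-width UTF-8 strings, with no fixed itemsize like "U16":
arr = np.array(["short", "a considerably longer string"], dtype=StringDType())
print(np.strings.upper(arr))

# Optional missing-data support via a user-supplied sentinel object:
dt = StringDType(na_object=None)
arr = np.array(["spam", None], dtype=dt)
print(arr)
```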
[Numpy-discussion] Re: Automatic Clipping of array to upper / lower bounds of dtype
On Mon, 2024-03-25 at 13:49 +0000, percynichols...@gmail.com wrote:
> Many thanks!
>
> Just one more inquiry along those lines, if I may. The code asserts
> that clip should outpace np.maximum(np.minimum(arr, max), min).
> Despite this:
>
> a = np.arange(100)
> %timeit a.clip(4, 20)                     # 8.48 µs
> %timeit np.maximum(np.minimum(a, 20), 4)  # 2.09 ns
>
> Will this be the norm?

There are some slow paths necessary due to NaN handling and a deprecation in `np.clip`. That was a known issue, but there was not much to do about it. You should try with an up-to-date NumPy version; you shouldn't really see much of a difference on recent NumPy versions.

- Sebastian
[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0
On Mon, 2024-05-06 at 09:17 +1000, Matti Picus wrote:
> On 05/05/2024 11:32, Mark Harfouche wrote:
> >
> > Thank you for considering this last minute request. I know it adds
> > work at this stage.
> >
> > Mark
>
> I think NumPy should not be the leader in dropping versions; rather,
> it should be one of the more conservative packages, since other
> packages depend on it. We have indeed dropped 3.9 on HEAD, and will
> not be supporting it for 2.1, but to me it makes sense to support it
> for the large 2.0 release.

I think it is late for this anyway, and NumPy always had a slightly longer support period; that seemed fine, especially since NumPy is low in the stack. The SPEC was written to give the community that precedent and to show that many agree with you (and NumPy endorses it). Maybe the "endorsed by" list should rather be grown to strengthen the argument instead?

(Of course there are true exceptions; IIRC scikit-learn chooses to have much longer support windows.)

- Sebastian

> Matti
[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0
On Tue, 2024-05-07 at 15:46 +1000, Juan Nunez-Iglesias wrote:
> On Tue, 7 May 2024, at 7:04 AM, Ralf Gommers wrote:
> > This problem could have been avoided by proper use of upper bounds.
> > Scikit-image 0.22 not including a `numpy<2.0` upper bound is a bug
> > in scikit-image (definitely for ABI reasons, and arguably also for
> > API reasons). It would really be useful if downstream packages
> > started to take adding upper bounds correctly more seriously. E.g.,
> > SciPy has always done it right, so the failure mode that this
> > thread is about doesn't exist for SciPy. That said, this ship has
> > sailed for 2.0 - most packages don't have upper bounds in some or
> > all of their recent releases.
>
> I don't think this is a downstream problem, I think this is a "PyPI
> has immutable metadata" problem. I'm a big fan of Henry Schreiner's
> "Should You Use Upper Bound Version Constraints"
> <https://iscinumpy.dev/post/bound-version-constraints/>, where he
> argues convincingly that the answer is almost always no. This
> highlighted bit contains the gist:

Yes, that is all because of `pip` limitations, but those limitations are a given. And I think it is unfortunate/odd that this effectively argues that the lower in the stack you are, the fewer versions you should support.

But with that clarification, there may be a lot of packages that never support both Python 3.9 and NumPy 2. That means not publishing for 3.9 may end up helping quite a lot of users who would otherwise have to downgrade NumPy explicitly. If that seems to be the case, it is an unfortunate, but good, argument for dropping 3.9.

I don't have an idea of how many users we would effectively help, or whether we would achieve the opposite, because an application (more than a library) may want to just use NumPy 2 always but still support Python 3.9. But it seems to me that this is what the decision comes down to, and I can believe that it will save a lot of hassle for `pip`-installing users.

(Note that skimage users will hit Cython, so they should get a relatively clear printout that includes a "please downgrade NumPy" suggestion.)

- Sebastian

> > A library that requires a manual version intervention is not
> > broken, it's just irritating. A library that can't be installed due
> > to a version conflict is broken. If that version conflict is fake,
> > then you've created an unsolvable problem where one didn't exist.
>
> Dropping Py 3.9 will fix the issue for a subset of users, but
> certainly not all users. Someone who pip installs scikit-image==0.22
> on Py 3.10 will have a broken install. But importantly, they will be
> able to fix it in user space.
>
> At any rate, it's not like NumPy (or SciPy, or scikit-image) don't
> change APIs over several minor versions. Quoting Henry again:
>
> > Quite ironically, the better a package follows SemVer, the smaller
> > the change that will trigger a major version, and therefore the
> > less likely a major version will break a particular downstream code.
>
> In short, and independent of the Py3.9 issue, I don't think we should
> advocate for upper caps in general, because in general it is
> impossible to know whether an update is going to break your library,
> regardless of their SemVer practices, and a fake upper pin is worse
> than no upper pin.
>
> Juan.
[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0
On Tue, 2024-05-07 at 11:41 +0200, Gael Varoquaux wrote:
> On Tue, May 07, 2024 at 11:31:02AM +0200, Ralf Gommers wrote:
> > make `pip install scikit-image==0.22` work if that version of
> > scikit-image depends on an unconstrained numpy version.
>
> Would an option be for the scikit-image maintainers to release a
> version of scikit-image 0.22 (like 0.22.1) with a constrained numpy
> version?

I don't think it helps; pip will just skip that version and pick the previous one. IIUC, the one thing you could do is release a new version without a constraint that raises a detailed/informative error message at runtime. I.e. "work around" pip by telling users exactly what they should do.

- Sebastian

> Gaël
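Such a runtime guard could look roughly like this; a sketch, not scikit-image's actual code:

```python
import numpy as np
from numpy.lib import NumpyVersion

# NumpyVersion handles dev/rc suffixes, unlike plain string comparison.
if NumpyVersion(np.__version__) >= "2.0.0":
    raise ImportError(
        "this scikit-image release was built against NumPy 1.x and is "
        "binary-incompatible with NumPy 2; please run "
        'pip install "numpy<2" or upgrade scikit-image'
    )
```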
[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0
On Mon, 2024-05-06 at 22:39 +0000, Henry Schreiner wrote:
> This will be messier for projects building wheels and wanting to
> support non-EoL Python versions. To build a wheel with anything other
> than pybind11, you now need the oldest supported NumPy for Python <
> 3.9, the latest NumPy 1 for Python 3.9, and NumPy 2 for Python 3.10+.
> I don't know if that's important in the decision, but thought I'd
> point it out. Also, according to NEP 29, 3.10+ only became the
> requirement a couple of weeks ago, while it has been months since
> SPEC 0 dropped it. I don't think either document really details what
> to do when there's a really long development cycle that spans a
> cutoff date.

FWIW, I have heard similar opinions now: supporting a stack of libraries for all non-EoL Python versions is harder if NumPy must be different. The biggest problem would be ending up with full support for only NumPy 1 or only NumPy 2 (i.e. similar to what numba has to do due to promotion). I hope that is rare enough that it doesn't matter, but I can't say I am sure.

(And yeah, if that happens, we might see downstream asked to support 3.9 and NumPy 2 in a release. Trying to avoid that was part of why the discussion started, I think.)

- Sebastian

> If you drop 3.9 from the metadata, I don't think there's any need to
> secretly keep support. It's too hard to actually use it, and it's not
> guaranteed to work; it would be better to just tell anyone needing
> 3.9 to use a beta version from when it was still supported.
>
> (Rant below)
>
> To be fair, I've never understood NEP 29's need to limit Python
> versions to 42 months after the 2.7 issue was resolved with official
> Python EoLs. Now there's a standard (60 months, exactly 5 versions),
> and almost all the rest of the ecosystem supports it. This just
> wedges a divide in the ecosystem between "scientific" and "everyone
> else". It makes me have to think "is this a scientific Python
> project? Or a general Python project?" when I really shouldn't have
> to on every project.
>
> I really didn't understand SPEC 0 _tightening_ it to 36 months (and I
> was at the developer summit where this was decided, and stated I was
> interested in being involved in this, but was never included in any
> discussion on it, so I am not sure how this was even decided).
> Dropping a Python version doesn't hurt projects that are mostly
> stable, but ones that are not are really hurt by it. Python 3.8 is
> still heavily used; people don't mind that NumPy dropped 3.8 support
> because an older version works fine. But if there's a major change,
> then it makes smaller or new projects have to do extra work.
>
> Current numbers (as of May 4th) for downloads of manylinux wheels:
> * 2.7: 2%
> * 3.5: 0.3%
> * 3.6: 7.4%
> * 3.7: 20.4%
> * 3.8: 23.0%
> * 3.9: 15.3%
> * 3.10: 20.8%
> * 3.11: 8.4%
> * 3.12: 2.3%
>
> So only ~30% of users have Python 3.10 or newer. Most smaller or
> newer projects can more than double their user base by supporting
> 3.8+. I could even argue that 3.7+ is still helpful for a new
> project. Once a library is large and stable, then it can go higher,
> even 3.13+, and not hurt anyone unless there's a major development.
>
> Little rant finished.
:)
[Numpy-discussion] Re: Unexpected return values for np.mod with x2=np.inf and similar
On Mon, 2024-06-10 at 10:49 +0300, Matti Picus wrote:
> What operating system?
>
> If I recall correctly, NumPy tries to be compatible with CPython for
> these edge cases.

Right, and nothing here seems odd to me. Try using `divmod()` on a few (non-infinite) numbers to see that this is how Python defines things. Python modulo is not identical to IEEE modulo, as described in the docs.

- Sebastian

> The actual implementation is a bit scattered. I think it would be
> nice if we could have an "explain" decorator for ufuncs that would
> return the name of the inner loop used in practice, to aid in
> debugging and teaching. Until then your best bet is to build NumPy
> locally with debug information and use a debugger, but even that can
> be challenging at times.
>
> Matti
>
> On 07/06/2024 21:10, jesse.live...@gmail.com wrote:
> > Hi all,
> >
> > I ran into an odd edge-case with np.mod and was wondering if this
> > is the expected behavior, and if so why. This is on a fresh install
> > of python 3.10.14 with numpy 1.26.4 from conda-forge.
> > ...
> > Any ideas why these are the return values? I had a hard time
> > tracking down where in the numpy source np.mod was coming from.
> > Jesse
>
> https://github.com/numpy/numpy/blob/main/numpy/_core/src/umath/loops_modulo.dispatch.c.src#L557
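A quick illustration of the two conventions: Python's `%` (which `np.mod` follows) gives the result the sign of the divisor, while C-style `fmod` (`np.fmod`) keeps the sign of the dividend.

```python
import math
import numpy as np

# Python-style modulo: the result takes the sign of the divisor, so a
# negative value modulo +inf "wraps around" to inf:
print(-5.0 % float("inf"))            # inf
print(np.mod(-5.0, np.inf))           # inf  (matches Python)

# IEEE-style fmod keeps the sign of the dividend instead:
print(math.fmod(-5.0, float("inf")))  # -5.0
print(np.fmod(-5.0, np.inf))          # -5.0
```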
[Numpy-discussion] Re: Mysterious issue to build pyFFTW with Numpy 2.0 on Windows
The most probable culprit seems to me to be that NumPy now includes `complex.h`. But I am not sure that is the right direction, or why it would lead to cryptic errors.

- Sebastian

On Wed, 2024-07-03 at 10:30 +0200, PIERRE AUGIER wrote:
> Hi,
>
> We have a strange issue with building pyFFTW with Numpy 2.0 on
> Windows. I observed it before when a build in the CI tried to use
> Numpy 2.0. The solution was to pin the Numpy version used for the
> build to <2.0.
>
> However, now I'm trying in this PR
> (https://github.com/pyFFTW/pyFFTW/pull/383) to make pyFFTW compatible
> with Numpy 2.0. With a few simple changes, it works well on Linux and
> macOS but not on Windows.
>
> The meaningful part of the log seems to be:
>
> INFO:root:"C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DPYFFTW_HAVE_DOUBLE=1 -DPYFFTW_HAVE_DOUBLE_OMP=0 -DPYFFTW_HAVE_DOUBLE_THREADS=1 -DPYFFTW_HAVE_DOUBLE_MULTITHREADING=1 -DPYFFTW_HAVE_DOUBLE_MPI=0 -DPYFFTW_HAVE_SINGLE=1 -DPYFFTW_HAVE_SINGLE_OMP=0 -DPYFFTW_HAVE_SINGLE_THREADS=1 -DPYFFTW_HAVE_SINGLE_MULTITHREADING=1 -DPYFFTW_HAVE_SINGLE_MPI=0 -DPYFFTW_HAVE_LONG=1 -DPYFFTW_HAVE_LONG_OMP=0 -DPYFFTW_HAVE_LONG_THREADS=1 -DPYFFTW_HAVE_LONG_MULTITHREADING=1 -DPYFFTW_HAVE_LONG_MPI=0 -DPYFFTW_HAVE_MPI=0 -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION [long list of /I include paths elided] /Tcpyfftw\pyfftw.c /Fobuild\temp.win-amd64-cpython-310\Release\pyfftw\pyfftw.obj
> pyfftw.c
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2061: syntax error: identifier 'fftw_complex'
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2059: syntax error: ';'
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax error: missing ')' before '*'
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2081: 'fftw_complex': name in formal parameter list illegal
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax error: missing '{' before '*'
> D:\a\pyFFTW\pyFFTW\pyfftw\fftw3.h(358): error C2143: syntax error:
[Numpy-discussion] Re: Enhancement for generalized ufuncs
On Thu, 2024-07-11 at 19:31 -0400, Warren Weckesser wrote: > I have implemented quite a few generalized ufuncs over in ufunclab > (https://github.com/WarrenWeckesser/ufunclab), and in the process I > have accumulated a gufunc "wish list". Two items on that list are: > > (1) the ability to impose constraints on the core dimensions that are > checked when the gufunc is called. By far the most common use-case I > have is requiring that a dimension have length at least 1. To do this > currently, I check the shapes in the ufunc loop function, and if they > are not valid, raise an exception and hope that the gufunc machinery > processes it as expected when the loop function returns. (Sorry, I'm > using lingo--"loop function", "core dimension", etc--that will be > familiar to those who already know the ufunc C API, but not so > familiar to general users of NumPy.) > > (2) the ability to have the output dimension be a function of the > input dimensions, instead of being limited to one of the input > dimensions or an independent dimension. Then one could create, for > example, a 1-d convolution gufunc with shape signature that is > effectively `(m),(n)->(m + n - 1)` (corresponding to `mode='full'` in > `np.convolve`) and the gufunc code would automatically allocate the > output with the correct shape and dtype. > Nice, thanks! I have to look at the implementation in detail, but this seems like a good idea. Have to look at the PR for bike-shedding, but I think we should just add this. (You won't be able to know these relations from reading the signature, but I doubt it's worth worrying about that.) This seems like it should cover all or at least almost all of the things that have come up about ufunc core dimension flexibility (might be nice to check briefly, but even if not I suspect the hook here is the right choice). - Sebastian > I have proposed a change in https://github.com/numpy/numpy/pull/26908 > that makes both these features possible. A field is added to the > PyUFuncObject that is an optional user-defined C function that the > gufunc author implements. When a gufunc is called, this function is > called with an array of the values of the core dimensions of the > input > and output arrays. Some or all of the output core dimensions might be > -1, meaning the arrays are to be allocated by the gufunc/iterator > machinery. The new "hook" allows the user to check the given core > dimensions and raise an exception if some constraint is not > satisfied. > The user-defined function can also replace those -1 values with sizes > that it computes based on the given input core dimensions. > > To define the 1-d convolution gufunc, the actual shape signature that > is passed to `PyUFunc_FromFuncAndDataAndSignature` is `(m),(n)->(p)`. > When a user passes arrays with shapes, say, (20,) and (30,) as the > input and with no output array specified, the user-defined function > will get the array [20, 30, -1]. It would replace -1 with m + n - 1 = > 49 and return. If the caller *does* include an output array in the > call, the core dimension of that array will be the third element of > the array passed to the user-defined function. In that case, the > function verifies that the value equals m + n - 1, and raises an > exception if it doesn't. 
> Here's that 1-d convolution, called `conv1d_full` here, in action:
>
> ```
> In [14]: import numpy as np
>
> In [15]: from experiment import conv1d_full
>
> In [16]: type(conv1d_full)
> Out[16]: numpy.ufunc
> ```
>
> `m = 4`, `n = 6`, so the output shape is `p = m + n - 1 = 9`:
>
> ```
> In [17]: conv1d_full([1, 2, 3, 4], [-1, 1, 2, 1.5, -2, 1])
> Out[17]: array([-1. , -1. ,  1. ,  4.5, 11. ,  9.5,  2. , -5. ,  4. ])
> ```
>
> Standard broadcasting:
>
> ```
> In [18]: conv1d_full([[1, 2, 3, 4], [0.5, 0, -1, 1]], [-1, 1, 2, 1.5, -2, 1])
> Out[18]:
> array([[-1.  , -1.  ,  1.  ,  4.5 , 11.  ,  9.5 ,  2.  , -5.  ,  4.  ],
>        [-0.5 ,  0.5 ,  2.  , -1.25, -2.  ,  1.  ,  3.5 , -3.  ,  1.  ]])
> ```
>
> Comments here or over in the pull request are welcome. The essential
> changes to the source code are just 7 lines in `ufunc_object.c` and 7
> lines in `ufuncobject.h`. The rest of the changes in the PR create a
> couple of gufuncs that use the new feature, with corresponding unit
> tests.
>
> Warren
[Numpy-discussion] Re: Enhancement for generalized ufuncs
On Fri, 2024-07-12 at 09:56 -0400, Warren Weckesser wrote:
> On Fri, Jul 12, 2024 at 7:47 AM Sebastian Berg wrote:
> >
> > (You won't be able to know these relations from reading the
> > signature, but I doubt it's worth worrying about that.)
>
> After creating the gufunc with `PyUFunc_FromFuncAndDataAndSignature`,
> the gufunc author could set the `core_signature` field at the same
> time that `process_core_dims_func` is set. That requires freeing the
> old signature and allocating memory for the new one. For the 1-d
> convolution example, the signature would be set to
> `"(m),(n)->(m + n - 1)"`:
>
> ```
> In [1]: from experiment import conv1d_full
>
> In [2]: conv1d_full.signature
> Out[2]: '(m),(n)->(m + n - 1)'
> ```

I have to look at the PR, but the ufunc parses the signature only once? That solution seems very hacky, but allowing one to just replace the signature may make sense. (The downside is if someone else wants to parse the original signature, but I guess that is unlikely.)

In either case, the only other thing to hook into would be the signature parsing itself, with the full shapes available. But then you may need to deal with `axes=`, etc. as well, so I think your solution that only adjusts shapes seems better. It's much simpler and should cover most or even all relevant things.

- Sebastian

> Warren
[Numpy-discussion] Welcome Joren Hammudoglu to the NumPy Maintainers Team
Hi all,

please join me in welcoming Joren (https://github.com/jorenham) to the NumPy maintainers team.

Joren has done a lot of work recently contributing, reviewing, and maintaining typing-related improvements to NumPy. We are looking forward to seeing new momentum to improve NumPy typing.

Thanks for all the contributions!

Cheers,

Sebastian
[Numpy-discussion] Re: ENH: Uniform interface for accessing minimum or maximum value of a dtype
On Mon, 2024-08-26 at 11:26 -0400, Marten van Kerkwijk wrote:
> I think a NEP is a good idea. It would also seem to make sense to
> consider how the dtype itself can hold/calculate this type of
> information, since that will be the only way a generic ``info()``
> function can get information for a user-defined dtype. Indeed, taking
> that further, might a method or property on the dtype itself be
> the cleaner interface? I.e., one would do ``dtype.info().min`` or
> ``dtype.info.min``.

I agree, and I think it should be properties/attributes (I don't think it needs to be a function; it should be cheap?). It might also be that `np.finfo()` could keep working via `dtype.finfo` or a dunder, if we want to hide it.

In general, I would lean towards some form of attributes, even if I am not sure whether they should be `.info`, `.finfo`, or even directly on the dtype. (`.info.min` seems tricky, because I am not sure it is clear whether -inf or the minimum finite value is "min".)

A (potentially very short) NEP would probably help to get momentum on making a decision. I certainly would like to see this being worked on!

- Sebastian

> -- Marten
>
> Nathan writes:
>
> > That seems reasonable to me on its face. There are some corner
> > cases to work out though.
> >
> > Swayam is tinkering with a quad precision dtype written using the
> > new DType API and just ran into the fact that finfo doesn't support
> > user dtypes:
> >
> > https://github.com/numpy/numpy/issues/27231
> >
> > IMO any new feature along these lines should have some thought in
> > the design about how to handle user-defined data types.
> >
> > Another thing to consider is that data types can be non-numeric
> > (things like categories) or number-like but not really just a
> > number, like a quantity with a physical unit. That means you should
> > also think about what to do where fields like min and max don't
> > make any sense or need to be a generic python object rather than a
> > numeric type.
> >
> > I think if someone proposed a NEP that fully worked this out it
> > would be welcome. My understanding is that the array API consortium
> > prefers to standardize on APIs that gain traction in libraries
> > rather than inventing APIs and telling libraries to adopt them, so
> > I think a NEP is the right first step, rather than trying to
> > standardize something in the array API.
> >
> > On Mon, Aug 26, 2024 at 8:06 AM Lucas Colley
> > <lucas.coll...@gmail.com> wrote:
> >
> >     Or how about `np.dtype_info(dt)`, which could return an object
> >     with attributes like `min` and `max`. Would that be possible?
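For reference, today this information lives in `np.finfo`/`np.iinfo`, split by kind and unavailable for user-defined dtypes; the attribute spelling at the end is the proposal under discussion, not an existing API:

```python
import numpy as np

# Today: two separate interfaces, chosen by hand per dtype kind:
print(np.iinfo(np.int32).min, np.iinfo(np.int32).max)
print(np.finfo(np.float64).min, np.finfo(np.float64).max)

# np.finfo raises for integer dtypes (and vice versa), and neither works
# for user-defined dtypes, hence the wish for a uniform spelling like:
#   np.dtype(np.float64).info.min   # hypothetical, per this thread
```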
Re: [Numpy-discussion] Optimize evaluation of function on matrix
On Sat, 2017-03-25 at 18:46 +0100, Florian Lindner wrote:
> Hello,
>
> I have this function:
>
> def eval_BF(self, meshA, meshB):
>     """ Evaluates single BF or list of BFs on the meshes. """
>     if type(self.basisfunction) is list:
>         A = np.empty((len(meshA), len(meshB)))
>         for i, row in enumerate(meshA):
>             for j, col in enumerate(meshB):
>                 A[i, j] = self.basisfunction[j](row - col)
>     else:
>         mgrid = np.meshgrid(meshB, meshA)
>         A = self.basisfunction( np.abs(mgrid[0] - mgrid[1]) )
>     return A
>
> meshA and meshB are 1-dimensional numpy arrays. self.basisfunction is e.g.
>
> def Gaussian(radius, shape):
>     """ Gaussian Basis Function """
>     return np.exp( -np.power(shape*abs(radius), 2))
>
> or a list of partial instantiations of such functions (from functools.partial).
>
> How can I optimize eval_BF? Esp. in the case of basisfunction being a list.

Are you sure you need to optimize it? If they have a couple of hundred elements or so for each row, the math is probably the problem and most of that might be the `exp`. You can get rid of the `row` loop though, in case an individual row is a pretty small array.

To be honest, I am a bit surprised that it's a problem, since "basis function" sounds a bit like you have to do this once and then use the result many times.

- Sebastian

> Thanks!
> Florian
[Numpy-discussion] heads up: gufuncs on empty arrays and NpyIter removal of empty axis
Hi all, just a small heads up for gufunc hackers and low level iterator users. We will probably very soon put in a commit into master that will allow the removal of empty axes from NpyIter/nditer, effectively removing the error "ValueError: cannot remove a zero-sized axis from an iterator" and allowing:

```
arr = np.zeros((100, 0))
it = np.nditer((arr,), flags=["zerosize_ok", "multi_index"])
it.remove_axis(1)
```

As a follow up step, we also allow that gufuncs may be called with empty inner loop sizes. In some cases that may mean that your gufuncs may need special handling for, let's say:

```
arr = np.zeros((100, 0))  # note the 0 dimension.
my_gufunc(arr)
```

If this creates problems for you, please tell us, so that we can slow down or undo the change. As an example, we have a matrix_multiply gufunc for testing purposes, which did not zero out the output for the case of `matrix_multiply(np.ones((10, 0)), np.ones((0, 10)))`. So this could turn code that errored out for weird reasons into wrong results in rare cases.

- Sebastian
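For context on the matrix_multiply example: a matrix product over a zero-length inner dimension is a sum over zero terms, so the mathematically correct result is an all-zeros output, not whatever happened to be in the uninitialized buffer. A quick illustration with `np.matmul`, which handles this case correctly:

```python
import numpy as np

a = np.ones((10, 0))   # 10x0 matrix
b = np.ones((0, 10))   # 0x10 matrix

# Each output element is a dot product over zero terms, i.e. an empty
# sum, which is 0 -- so the output buffer must be explicitly zeroed,
# not left uninitialized.
c = np.matmul(a, b)
print(c.shape)         # (10, 10)
print(np.all(c == 0))  # True
```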
Re: [Numpy-discussion] Optimize evaluation of function on matrix
On Mon, 2017-03-27 at 13:06 +0200, Florian Lindner wrote:
> Hey,
>
> I've timed the two versions, one basisfunction being a function:
>
> 1 loop, best of 3: 17.3 s per loop
>
> the other one, basisfunction being a list of functions:
>
> 1 loop, best of 3: 33.5 s per loop
>
> > To be honest, I am a bit surprised that it's a problem, since "basis function" sounds a bit like you have to do this once and then use the result many times.
>
> It's part of a radial basis function interpolation algorithm. Yes, in practice the matrix is filled only once and reused a couple of times, but in my case, which is exploration of parameters for the algorithm, I call eval_BF many times.
>
> > You can get rid of the `row` loop though, in case an individual row is a pretty small array.
>
> Would you elaborate on that? Do you mean that the inner col loop produces an array which is then assigned to the row? But I think it still needs the row loop there.

Well, I'd rather not serve the full result, but if you exchange the loops:

A = np.empty((len(meshA), len(meshB)))
for j, col in enumerate(meshB):
    for i, row in enumerate(meshA):
        A[i, j] = self.basisfunction[j](row - col)

Then you can see that there is broadcasting magic similar (do not want to use too many brain cells now) to:

A = np.empty((len(meshA), len(meshB)))
for j, col in enumerate(meshB):
    # possibly insert np.newaxis/None or a reshape in [??]
    A[:, j] = self.basisfunction[j](meshA[??] - col)

- Sebastian

> Best,
> Florian
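Filling in the `[??]` for this particular case: since meshA is 1-dimensional and `col` is a scalar, `meshA - col` broadcasts directly and no np.newaxis/reshape is needed. A runnable sketch of the column-wise version (function and variable names follow the thread; the sample meshes are made up):

```python
import numpy as np
from functools import partial

def gaussian(radius, shape):
    """Gaussian basis function, as in the original post."""
    return np.exp(-np.power(shape * np.abs(radius), 2))

def eval_bf_columnwise(meshA, meshB, basisfunctions):
    """One basis function per column of A; only the column loop remains."""
    A = np.empty((len(meshA), len(meshB)))
    for j, col in enumerate(meshB):
        # meshA is 1-d and col is a scalar, so meshA - col broadcasts
        # directly into a whole column at once.
        A[:, j] = basisfunctions[j](meshA - col)
    return A

meshA = np.linspace(0.0, 1.0, 5)
meshB = np.linspace(0.0, 1.0, 4)
bfs = [partial(gaussian, shape=s) for s in (1.0, 2.0, 3.0, 4.0)]
print(eval_bf_columnwise(meshA, meshB, bfs))
```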
Re: [Numpy-discussion] Fwd: SciPy2017 Sprints FinAid for sprint leaders/core devs
On Thu, 2017-03-30 at 22:46 +1300, Ralf Gommers wrote:
> Agreed, and I would call that productive. Getting even one new maintainer involved is worth organizing multiple sprints for.
>
> That said, also +1 to a developer meeting this year. It'd be good if we could combine it with the NumFOCUS summit or a relevant conference in the second half of the year.

Would be good, even if there is nothing big going on. Can we gather possible dates and possible (personal) preferences? Here is a start:

* SciPy (Austin, TX): July 10-16
* EuroScipy (Germany): August 23-27
* NumFocus Summit?
* PyData Events??

Personally, I probably can't make longer trips until some time in July (or around then). We won't find a perfect time anyway probably, so personal preferences or not, whoever is willing to organize a bit can decide on the time and place as far as I am concerned :).

- Sebastian

> Ralf
Re: [Numpy-discussion] proposal: smaller representation of string arrays
> > > saving some memory in some ascii heavy cases, e.g. astronomy. It is not that significant anymore, as porting to python3 has mostly already happened via the ugly byte workaround, and memory saving is probably not as significant in the context of numpy, which is already heavy on memory usage.
> > >
> > > My initial approach was to not add a new dtype but to make unicode parametrizable, which would have meant almost no cluttering of numpy's internals and keeping the api more or less consistent, which would make this a relatively simple addition of minor functionality for people that want it. But adding a completely new, partially redundant dtype for this use case may be too large a change to the api. Having two partially redundant string types may confuse users more than our current status quo of our single string type (U).
> > >
> > > Discussing whether we want to support truncated utf8 has some merit, as it is a decision whether to give the users an even larger gun to shoot themselves in the foot with. But I'd like to focus first on the 1 byte type to add a symmetric API for python2 and python3. utf8 can always be added later should we deem it a good idea.
> >
> > What is your current proposal? A string dtype parameterized with the encoding (initially supporting the latin-1 that you desire and maybe adding utf-8 later)? Or a latin-1-specific dtype such that we will have to add a second utf-8 dtype at a later date?
>
> My proposal is a single new parameterizable dtype. Adding multiple dtypes for each encoding seems unnecessary to me given that numpy already supports parameterizable types. For example datetime is very similar, it is basically encoded integers. There are multiple encodings = units supported.
>
> > If you're not going to support arbitrary encodings right off the bat, I'd actually suggest implementing UTF-8 and ASCII-surrogateescape first as they seem to knock off more use cases straight away.
>
> Please list the use cases in the context of numpy usage. hdf5 is the most obvious, but how exactly would hdf5 use a utf8 array in the actual implementation?
>
> What you save by having utf8 in the numpy array is replacing a decoding and encoding step with a null-padding-stripping step. That doesn't seem very worthwhile compared to all the other overheads involved.

I remember talking with a colleague about something like that. And basically an annoying thing there was that if you strip the zero bytes in a zero padded string, some encodings (UTF16) may need one of the zero bytes to work right. (I think she got around it by weird trickery, inverting the endianness or so and thus putting the zero bytes first.) Maybe I will ask her whether this discussion is interesting to her. Though I think it might have been something like "make everything in hdf5/something similar work" without any actual use case, I don't know.

I have not read the whole thread, but I think a fixed-byte-size but settable-encoding type would make sense. I personally wonder whether storing the length might make sense, even if that removes direct memory mapping; but as you said, you can still memmap the bytes, and then probably just cast back and forth.
Sorry if there is zero actual input here :)

- Sebastian
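The trailing-zero-byte hazard mentioned above is easy to demonstrate in plain Python, without NumPy: in UTF-16-LE, ASCII characters each encode with a trailing zero byte, so naively stripping trailing nulls from a zero-padded fixed-width field eats a byte that belongs to the last character:

```python
raw = "abc".encode("utf-16-le")       # b'a\x00b\x00c\x00'

# Pad to a fixed field width of 10 bytes, as a fixed-width dtype would:
field = raw.ljust(10, b"\x00")

# Stripping *all* trailing zero bytes also removes the zero byte that
# belongs to 'c', leaving an odd-length, undecodable buffer:
stripped = field.rstrip(b"\x00")      # b'a\x00b\x00c' -- 5 bytes!
try:
    stripped.decode("utf-16-le")
except UnicodeDecodeError as e:
    print("decoding fails:", e)
```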
Re: [Numpy-discussion] [SciPy-User] NumPy v1.13.0rc1 released.
On Fri, 2017-05-12 at 16:28 +0200, Jens Jørgen Mortensen wrote: > Den 11-05-2017 kl. 03:48 skrev Charles R Harris: > > Hi All, > > > > I'm please to announce the NumPy 1.13.0rc1 release. This release > > supports Python 2.7 and 3.4-3.6 and contains many new features. It > > is one of the most ambitious releases in the last several years. > > Some of the highlights and new functions are > > I found this strange behavior: > > (np113) [jensj@ASUS np113]$ python3 > Python 3.5.3 (default, Jan 19 2017, 14:11:04) > [GCC 6.3.0 20170118] on linux > Type "help", "copyright", "credits" or "license" for more > information. > >>> import numpy as np > >>> np.__version__ > '1.13.0rc1' > >>> s = (27, 27, 27) > >>> x = np.ones(s, complex) > >>> y = np.zeros(s) > >>> y += abs(x * 2.0)**2 > Traceback (most recent call last): > File "", line 1, in > TypeError: Cannot cast ufunc add output from dtype('complex128') to > dtype('float64') with casting rule 'same_kind' > > Works OK with s=(3,3,3). > I have opened an issue: https://github.com/numpy/numpy/issues/9109 since it is so "odd", I expect it is due to the temporary elision kicking in when it should not in this case. - Sebastian > Jens Jørgen > > > Highlights > > Operations like ``a + b + c`` will reuse temporaries on some > > platforms, resulting in less memory use and faster execution. > > Inplace operations check if inputs overlap outputs and create > > temporaries to avoid problems. > > New __array_ufunc__ attribute provides improved ability for classes > > to override default ufunc behavior. > > New np.block function for creating blocked arrays. > > > > New functions > > New ``np.positive`` ufunc. > > New ``np.divmod`` ufunc provides more efficient divmod. > > New ``np.isnat`` ufunc tests for NaT special values. > > New ``np.heaviside`` ufunc computes the Heaviside function. > > New ``np.isin`` function, improves on ``in1d``. > > New ``np.block`` function for creating blocked arrays. > > New ``PyArray_MapIterArrayCopyIfOverlap`` added to NumPy C-API. > > Wheels for the pre-release are available on PyPI. Source tarballs, > > zipfiles, release notes, and the Changelog are available on github. > > > > A total of 100 people contributed to this release. People with a > > "+" by their > > names contributed a patch for the first time. > > A. Jesse Jiryu Davis + > > Alessandro Pietro Bardelli + > > Alex Rothberg + > > Alexander Shadchin > > Allan Haldane > > Andres Guzman-Ballen + > > Antoine Pitrou > > Antony Lee > > B R S Recht + > > Baurzhan Muftakhidinov + > > Ben Rowland > > Benda Xu + > > Blake Griffith > > Bradley Wogsland + > > Brandon Carter + > > CJ Carey > > Charles Harris > > Danny Hermes + > > Duke Vijitbenjaronk + > > Egor Klenin + > > Elliott Forney + > > Elliott M Forney + > > Endolith > > Eric Wieser > > Erik M. Bray > > Eugene + > > Evan Limanto + > > Felix Berkenkamp + > > François Bissey + > > Frederic Bastien > > Greg Young > > Gregory R. Lee > > Importance of Being Ernest + > > Jaime Fernandez > > Jakub Wilk + > > James Cowgill + > > James Sanders > > Jean Utke + > > Jesse Thoren + > > Jim Crist + > > Joerg Behrmann + > > John Kirkham > > Jonathan Helmus > > Jonathan L Long > > Jonathan Tammo Siebert + > > Joseph Fox-Rabinovitz > > Joshua Loyal + > > Juan Nunez-Iglesias + > > Julian Taylor > > Kirill Balunov + > > Likhith Chitneni + > > Loïc Estève > > Mads Ohm Larsen > > Marein Könings + > > Marten van Kerkwijk > > Martin Thoma > > Martino Sorbaro + > > Marvin Schmidt + > > Matthew Brett > > Matthias Bussonnier + > > Matthias C. M. 
Troffaes + > > Matti Picus > > Michael Seifert > > Mikhail Pak + > > Mortada Mehyar > > Nathaniel J. Smith > > Nick Papior > > Oscar Villellas + > > Pauli Virtanen > > Pavel Potocek > > Pete Peeradej Tanruangporn + > > Philipp A + > > Ralf Gommers > > Robert Kern > > Roland Kaufmann + > > Ronan Lamy > > Sami Salonen + > > Sanchez Gonzalez Alvaro > > Sebastian Berg > > Shota Kawabuchi > > Simon Gibbons > > Stefan Otte > > Stefan Peterson + > > Stephan Hoyer > > Søren Fuglede Jørgensen + > > Takuya Akiba > > Tom Boyd + > > Ville Skyttä + > > Warren Weckesser > > Wendell Smith > > Yu Feng > > Zixu Zhao + > > Zè Vinícius + > > aha66 + > > davidjn + > > drabach + > > drlvk + > > jsh9 + > > solarjoe + > > zengi + > > Cheers, > > > > Chuck > > > > > > ___ > > SciPy-User mailing list > > scipy-u...@python.org > > https://mail.python.org/mailman/listinfo/scipy-user > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion signature.asc Description: This is a digitally signed message part ___ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] failed to add routine to the core module
On Thu, 2017-05-18 at 15:04 +0200, marc wrote:
> Dear Numpy developers,
> I'm trying to add a routine to calculate the sum of a product of two arrays (a dot product), but one that would not increase the memory (from what I saw, np.dot is increasing the memory while it should not be necessary). The idea is to avoid the use of the temporary array in the calculation of the variance (numpy/numpy/core/_methods.py line 112).

np.dot should only increase memory in some cases (such as non-contiguous arrays) and be much faster in most cases (unless e.g. you do not have a BLAS compatible type). You might also want to check out np.einsum, which is pretty slick and can handle these kinds of operations as well. Note that `np.dot` calls into BLAS, so it is in general much faster than np.einsum.

- Sebastian

> The routine that I want to implement looks like this in python,
>
> arr = np.random.rand(10)
> mean = arr.mean()
> var = 0.0
> for ai in arr: var += (ai-mean)**2
>
> I would like to implement it in the umath module. As a first step, I tried to reproduce the divmod function of umath, but I did not manage to do it; you can find my fork here (the branch with the changes is called looking_around). During compilation I get the following error,
>
> gcc: numpy/core/src/multiarray/number.c
> In file included from numpy/core/src/multiarray/number.c:17:0:
> numpy/core/src/multiarray/number.c: In function ‘array_sum_multiply’:
> numpy/core/src/private/binop_override.h:176:39: error: ‘PyNumberMethods {aka struct }’ has no member named ‘nb_sum_multiply’ (void*)(Py_TYPE(m2)->tp_as_number->SLOT_NAME) != (void*)(test_func)) ^
> numpy/core/src/private/binop_override.h:180:13: note: in expansion of macro ‘BINOP_IS_FORWARD’ if (BINOP_IS_FORWARD(m1, m2, slot_expr, test_func) && \ ^
> numpy/core/src/multiarray/number.c:363:5: note: in expansion of macro ‘BINOP_GIVE_UP_IF_NEEDED’ BINOP_GIVE_UP_IF_NEEDED(m1, m2, nb_sum_multiply, array_sum_multiply);
>
> Sorry if my question seems basic, but I'm new to NumPy development. Any help?
> Thank you in advance,
> Marc Barbry
>
> PS: I opened an issue as well on the github repository https://github.com/numpy/numpy/issues/9130
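To make the np.dot/np.einsum suggestion concrete, here is a sketch of computing the sum of squared deviations without the `(a - mean)**2` temporary array (the `a - mean` difference itself is still formed once):

```python
import numpy as np

arr = np.random.rand(10)
d = arr - arr.mean()          # one temporary for the deviations

# Loop version from the question:
var_loop = sum((ai - arr.mean())**2 for ai in arr)

# BLAS-backed: sum of products of d with itself, no d**2 temporary.
var_dot = np.dot(d, d)

# The same reduction expressed with einsum:
var_einsum = np.einsum('i,i->', d, d)

print(np.allclose(var_loop, var_dot), np.allclose(var_dot, var_einsum))
```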
Re: [Numpy-discussion] UC Berkeley hiring developers to work on NumPy
On Mon, 2017-05-22 at 17:35 +0100, Matthew Brett wrote:
> Hi,
>
> On Mon, May 22, 2017 at 4:52 PM, Marten van Kerkwijk wrote:
> > Hi Matthew,
> >
> > > it seems to me that we could get 80% of the way to a reassuring blueprint with a relatively small amount of effort.
> >
> > My sentence "adapt the typical academic rule for conflicts of interest to PRs, that non-trivial ones cannot be merged by someone who has a conflict of interest with the author, i.e., it cannot be a supervisor, someone from the same institute, etc." was meant as a suggestion for part of this blueprint!
> >
> > I'll readily admit, though, that since I'm not overly worried, I haven't even looked at the policies that are in place, nor do I intend to contribute much beyond this e-mail. Indeed, it may be that the old adage "every initiative is punishable" holds here...
>
> I understand what you're saying, but I think a more helpful way of thinking of it is putting the groundwork in place for the most fruitful possible collaboration.
>
> > would you, or one of the others who feels it is important to have a blueprint, be willing to provide a concrete text for discussion?
>
> It doesn't make sense for me to do that, I'm #13 for commits in the last year. I'm just one of the many people who completely depend on numpy. Also, taking a little time to think these things through seems like a small investment with the potential for significant gain, in terms of improving communication and mitigating risk.
>
> So, I think my suggestion is that it would be a good idea for Nathaniel and the current steering committee to talk through how this is going to play out, how the work will be selected and directed, and so on.

Frankly, I would suggest we wait for now and ask whoever is going to get the job to work out how they think it should be handled. And then we can complain if we expect more/better ;). For now I would only say that I will expect more community-type work than we now often manage to do, and things such as meticulously sticking to writing NEPs. So the only thing I can see that might be good is putting "community work" or something like it specifically into the job description, and that's up to Nathaniel probably.

Some things, like not merging large changes by two people sitting in the same office, should be obvious (and even if it happens, we can revert). But there is nothing much new there, I think.

- Sebastian

> Cheers,
>
> Matthew
Re: [Numpy-discussion] Future of ufuncs
On Sun, 2017-05-28 at 14:53 -0600, Charles R Harris wrote:
> Hi All,
> This post is to open a discussion of the future of ufuncs. There are two contradictory ideas that have floated about regarding ufunc evolution. One is to generalize ufuncs to operate on buffers, essentially separating them from their current entanglement with ndarrays. The other is to accept that they are fundamentally part of the ndarray universe and move them into the multiarray module, thus avoiding the odd overloading of functions in the multiarray module. The first has been a long time proposal that I once thought sounded good, but I've come to prefer the second. That change of mind was driven by the resulting code simplification and the removal of a dependence on a Python feature, buffers, that we cannot easily modify to adapt to changing needs and new dtypes. Because I'd like to move the ufuncs, if we decide to move them, sometime after NumPy 1.14 is released, now seems a good time to decide the issue.
> Thoughts?

I did not think about it much. But I agree that the dtypes are probably the biggest issue; also, I am not sure anymore whether there is much of a gain in having ufuncs work on buffers in any case. The dtype thing goes a bit back to ideas like the datashape things and trying to make the dtypes somewhat separate from numpy? Though I doubt I would want to make that an explicit goal.

I wonder how much of the C-loops and type resolving we could/should expose. What I mean is that ufuncs are:

* type resolving (somewhat ufunc specific)
* outer loops (normal, reduce, etc.) using nditer (buffering)
* inner 1d loops

It is a bit more complicated, but I am just wondering whether it might make sense to try and expose the individual ufunc things (type resolving and 1d loops) but not all the outer loop nditer setup, which is ndarray specific in any case (honestly, I am not sure it is entirely possible; parts of it may already be exposed).

- Sebastian

> Chuck
Re: [Numpy-discussion] np.diff on boolean arrays now raises
On Thu, 2017-06-15 at 22:35 +1200, Ralf Gommers wrote:
> On Thu, Jun 15, 2017 at 7:08 PM, Jaime Fernández del Río wrote:
> > There is an ongoing discussion on github:
> >
> > https://github.com/numpy/numpy/issues/9251
> >
> > In 1.13 np.diff has started raising on boolean arrays, since subtraction of boolean arrays is now deprecated.
> >
> > A decision has to be made whether:
> > raising an error is the correct thing to do, and only the docstring needs updating, or
> > backwards compatibility is more important and diff should still work on boolean arrays.
>
> The issue is bigger than np.diff. For example, there's a problem with the scipy.ndimage morphology functions (https://github.com/scipy/scipy/issues/7493) that looks pretty serious. All ndimage.binary_* functions (7 of them) for example return boolean arrays, and chaining those is now broken. Unfortunately it seems that that wasn't covered by the ndimage test suite.
>
> Possibly reverting the breaking change in 1.13.1 is the best way to fix this.

Sure, I would say there is nothing wrong with reverting for now (and it simply is the safe and easy way). Though it would be good to address the issue of what should happen with diff in the future (and possibly with the subtract deprecation itself). If we stick to the deprecation and it proves necessary, we could delay it and make it a VisibleDeprecationWarning.

- Sebastian

> Ralf
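For code hit by this change, the usual workarounds are to cast to a signed integer type before diffing, or to express the boolean "difference" directly as XOR of neighbouring elements; a small sketch:

```python
import numpy as np

a = np.array([True, True, False, False, True])

# Cast first, then diff; gives -1/0/1 transitions:
d_int = np.diff(a.astype(np.int8))
print(d_int)          # [ 0 -1  0  1]

# Or express the boolean "change" directly as XOR of neighbours,
# which is what boolean '-' used to compute:
d_bool = a[1:] ^ a[:-1]
print(d_bool)         # [False  True False  True]
```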
Re: [Numpy-discussion] [SciPy-Dev] PyRSB: Python interface to librsb sparse matrices library
On Sat, 2017-06-24 at 15:47 -0400, josef.p...@gmail.com wrote:
> On Sat, Jun 24, 2017 at 3:16 PM, Nathaniel Smith wrote:
> > On Jun 24, 2017 7:29 AM, "Sylvain Corlay" wrote:
> >
> > Also, one quick question: is the LGPL license a deliberate choice or is it not important to you? Most projects in the Python scientific stack are BSD licensed. So the LGPL choice makes it unlikely that a higher-level project adopts it as a dependency. If you are the only copyright holder, you would still have the possibility to license it under a more permissive license such as BSD or MIT...
> >
> > Why would LGPL be a problem in a dependency? That doesn't stop you making your code BSD, and it's less restrictive license-wise than depending on MKL or the windows C runtime...
>
> Is scipy still including any LGPL code? I thought not. There might still be some optional dependencies that not many users are using by default?
> Julia packages are mostly MIT, AFAIK (except for the GPL parts because of cholmod, which we (?) avoid).

Well, I don't think scipy has many dependencies (but I would not be surprised if some of those are LGPL). I am not a specialist, but as a dependency it should be fine; that is the point of the L in LGPL after all, and as far as I understand it is much less viral. If you package it with your own stuff, you have to make sure to point out that parts are LGPL of course (just like there is a reason you get the GPL printed out with some devices), and if you modify it, provide these modifications, etc.

Of course you cannot include it into the scipy codebase itself, but there is probably no aim of doing so here, so without a specific reason I would think that LGPL is a great license.

- Sebastian

> Josef
>
> > -n
Re: [Numpy-discussion] [SciPy-Dev] PyRSB: Python interface to librsb sparse matrices library
On Sat, 2017-06-24 at 22:58 +0200, Carl Kleffner wrote:
> Does this still apply: https://scipy.github.io/old-wiki/pages/License_Compatibility.html

Of course, but it talks about putting it into the code base of scipy, not about being able to use the package in any way in a dependency (i.e. `import package`).

- Sebastian

> Carl
Re: [Numpy-discussion] Boolean binary '-' operator
On Sun, 2017-06-25 at 18:59 +0200, Julian Taylor wrote:
> On 25.06.2017 18:45, Stefan van der Walt wrote:
> > Hi Chuck
> >
> > On Sun, Jun 25, 2017, at 09:32, Charles R Harris wrote:
> > > The boolean binary '-' operator was deprecated back in NumPy 1.9 and changed to an error in 1.13. This caused a number of failures in downstream projects. The choices now are to continue the deprecation for another couple of releases, or simply give up on the change. For booleans, `a - b` was implemented as `a xor b`, which leads to the somewhat unexpected identity `a - b == b - a`, but it is a handy operator that allows simplification of some functions, `numpy.diff` among them. At this point I'm inclined to give up on the deprecation and retain the old behavior. It is a bit impure but perhaps we can consider it a feature rather than a bug.
> >
> > What was the original motivation behind the deprecation? `xor` seems like exactly what one would expect when subtracting boolean arrays.
> >
> > But, in principle, I'm not against the deprecation (we've had to fix a few problems that arose in skimage, but nothing big).
> >
> > Stéfan
>
> I am against this deprecation, for apparently cosmetic reasons. Is there any practical drawback in that it makes subtraction commutative for booleans?
>
> numpy should not be imposing a change of style when the existing sub-par historical style does not cause actual bugs.
>
> While I don't like it, I can accept a deprecation warning that will never be acted upon.

Dunno, that is also weird; then a UserWarning might even be better ;), since it is at least more visible.

For the unary minus, there are good reasons. For subtract, I don't remember really, but I don't think there was any huge argument for it. Probably it was mostly that many feel that `False - True == -1`, as is the case in python, while we have `np.False_ - np.True_ == np.True_`. And going through a deprecation would open up that possibility (though maybe you could go there directly). Not that I am convinced of that option.

So, I don't mind much either way, but unless there is a concrete plan with quite a bit of support, we should maybe just go with the conservative option.

- Sebastian
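To see the identity being discussed: with '-' implemented as xor on booleans, subtraction is symmetric, unlike integer subtraction. A quick demonstration using `np.logical_xor`, which is what the old boolean '-' computed:

```python
import numpy as np

a = np.array([True, False, True, False])
b = np.array([True, True, False, False])

# The old boolean a - b was logical xor, which is commutative:
print(np.logical_xor(a, b))                                        # [False  True  True False]
print(np.array_equal(np.logical_xor(a, b), np.logical_xor(b, a)))  # True

# Compare with Python scalars, where bools subtract as integers:
print(False - True)                                                # -1
```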
Re: [Numpy-discussion] why a[0][0].__mul__(a[0][0]) where a is np.array, gives 'missing 1 required positional argument'?
',mp)
> --}--cut here--
>
> When run by make, gives this result:
>
> --{--cut here--
> make -k
> python3 shortestPathABC.py
> d0= <0> d1= <1> d2= 3.0 d3= 6.0
> type(d0)= ShortestNull
> d4= 3.0
> d5= 9.0
> d6= <0>
> d7= 3.0
> d8= <0>
> d9= 3.0
> a=
> [[ 12.0]
>  [12.0 <0>]]
> a[0]=
> [ 12.0]
> a[0][0]=
>
> Traceback (most recent call last):
>   File "shortestPathABC.py", line 123, in <module>
>     a00mul=a[0][0].__mul__(a[0][0])
> TypeError: __mul__() missing 1 required positional argument: 'other'
> Makefile:7: recipe for target 'all' failed
> make: *** [all] Error 1
> --}--cut here--
>
> I don't understand why. Apparently, a[0][0] is not a ShortestNull, because otherwise the .__mul__ would have the positional argument 'other' equal to a[0][0].

I don't think debugging support really suits the list, but how about you see why, in your result:

[[ 12.0]
 [12.0 <0>]]

a[0, 0] and a[1, 1] do not show up as the same thing?

- Sebastian

> What am I missing?
>
> TIA.
>
> -regards,
> Larry
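For what it's worth, this error message pattern is exactly what you get when `__mul__` is looked up on a class object rather than an instance, so that the single argument fills the `self` slot and `other` is missing. That would happen if an element of the object array is the class itself (e.g. `ShortestNull` instead of `ShortestNull()`). A minimal reproduction; the class body here is made up:

```python
class ShortestNull:
    """Stand-in for the poster's class; name and behavior are made up."""
    def __mul__(self, other):
        return other

x = ShortestNull()
x.__mul__(x)              # fine: bound method, `other` is x

# If an array element were the *class* instead of an instance, the
# call below is what effectively happens:
try:
    ShortestNull.__mul__(x)   # unbound: x fills `self`, `other` missing
except TypeError as e:
    print(e)  # __mul__() missing 1 required positional argument: 'other'
```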
Re: [Numpy-discussion] Array blitting (pasting one array into another)
On Fri, 2017-06-30 at 02:16 +0200, Mikhail V wrote: > Hello all > > I often need to copy one array into another array, given an offset. > This is how the "blit" function can be understood, i.e. in > every graphical lib there is such a function. > The common definition is like: > blit ( dest, src, offset ): > where dest is destination array, src is source array and offset is > coordinates in destination where the src should pe blitted. > Main feature of such function is that it never gives an error, > so if the source does not fit into the destination array, it is > simply trimmed. > And respectively if there is no intersection area then nothing > happens. > > Hope this is clear. > So to make it work with Numpy arrays one need to calculate the > slices before copying the data. > I cannot find any Numpy or Python method to help with that so > probably > it does not exist yet. > If so, my proposal is to add a Numpy method which helps with that. > Namely the proposal will be to add a method which returns > the slices for the intersection areas of two arbitrary arrays, given > an offset, > so then one can "blit" the array into another with simple assignment > =. > > Here is a Python function I use for 2d arrays now: > > def interslice ( dest, src, offset ): > y,x = offset > H,W = dest.shape > h,w = src.shape > > dest_starty = max (y,0) > dest_endy = min (y+h,H) > dest_startx = max (x,0) > dest_endx = min (x+w,W) > > src_starty = 0 > src_endy = h > if y<0: src_starty = -y > by = y+h - H # Y bleed > if by>0: src_endy = h - by > > src_startx = 0 > src_endx = w > if x<0: src_startx = -x > bx = x+w - W # X bleed > if bx>0: src_endx = w - bx > > dest_sliceY = slice(dest_starty,dest_endy) > dest_sliceX = slice(dest_startx,dest_endx) > src_sliceY = slice(src_starty, src_endy) > src_sliceX = slice(src_startx, src_endx) > if dest_endy <= dest_starty: > print "No Y intersection !" > dest_sliceY = ( slice(0, 0) ) > src_sliceY = ( slice(0, 0) ) > if dest_endx <= dest_startx: > print "No X intersection !" > dest_sliceX = ( slice(0, 0) ) > src_sliceX = ( slice(0, 0) ) > dest_slice = ( dest_sliceY, dest_sliceX ) > src_slice = ( src_sliceY, src_sliceX ) > return ( dest_slice, src_slice ) > > > -- > > I have intentionally made it expanded and without contractions > so that it is better understandable. > It returns the intersection area of two arrays given an offset. > First returned tuple element is the slice for DEST array and the > second element is the slice for SRC array. > If there is no intersection along one of the axis at all > it returns the corresponding slice as (0,0) > > With this helper function one can blit arrays easily e.g. example > code: > > W = 8; H = 8 > DEST = numpy.ones([H,W], dtype = "uint8") > w = 4; h = 1 > SRC = numpy.zeros([h,w], dtype = "uint8") > SRC[:]=8 > offset = (0,9) > ds, ss = interslice (DEST, SRC, offset ) > > # blit SRC into DEST > DEST[ds] = SRC[ss] > > So changing the offset one can observe how the > SRC array is trimmed if it is crosses the DEST boundaries. > I think it is very useful function in general and it has > well defined behaviour. It has usage not only for graphics, > but actually any data copying-pasting between arrays. > > So I am looking forward to comments on this proposal. > First, the slice object provides some help: ``` In [8]: s = slice(1, 40, 2) In [9]: s.indices(20) # length of dimension Out[9]: (1, 20, 2) # and the 40 becomes 20 ``` Second, there is almost no overhead of creating a view, so just create the views first (it may well be faster). 
Then use the result to see how large they actually are and index those (a second time) instead of creating new slice objects.

- Sebastian

> Mikhail
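Putting the two hints together, an n-dimensional version of the helper could be quite compact. Note that `slice.indices` treats negative starts as counting from the end, so it does not directly apply to negative offsets; plain max/min clipping does the job. This is a sketch, not an existing NumPy API, and the function name is made up:

```python
import numpy as np

def blit(dest, src, offset):
    """Copy src into dest at offset, silently trimming out-of-bounds parts.

    A sketch, not an existing NumPy API; works for any matching number
    of dimensions.
    """
    dest_ix, src_ix = [], []
    for o, d, s in zip(offset, dest.shape, src.shape):
        start = max(o, 0)
        stop = max(min(o + s, d), start)   # empty slice if no overlap
        dest_ix.append(slice(start, stop))
        src_ix.append(slice(start - o, stop - o))
    dest[tuple(dest_ix)] = src[tuple(src_ix)]

DEST = np.ones((8, 8), dtype="uint8")
SRC = np.full((1, 4), 8, dtype="uint8")
blit(DEST, SRC, (0, 9))    # fully outside: a no-op instead of an error
blit(DEST, SRC, (2, -2))   # partially outside: only columns 0-1 written
print(DEST)
```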
Re: [Numpy-discussion] proposed changes to array printing in 1.14
On Fri, 2017-06-30 at 17:55 +1000, Juan Nunez-Iglesias wrote:
> To reiterate my point on a previous thread, I don't think this should happen until NumPy 2.0. This *will* break a massive number of doctests, and what's worse, it will do so in a way that makes it difficult to support doctesting for both 1.13 and 1.14. I don't see a big enough benefit to these changes to justify breaking everyone's tests before an API-breaking version bump.

Just so we are on the same page, nobody is planning a NumPy 2.0, so insisting on not changing anything until a possible NumPy 2.0 is almost like saying it should never happen. Of course we could amass deprecations and at some point do many at once and call it 2.0, but I am not sure that helps anyone, when compared to saying that we do deprecations for 1-2 years at least, and longer if someone complains.

The question is, do you really see a big advantage in fixing a gazillion tests at once over doing a small part of the fixes one after another? The "big step" thing did not work too well for Python 3.

- Sebastian

> On 30 Jun 2017, 6:42 AM +1000, Marten van Kerkwijk wrote:
> > To add to Allan's message: point (2), the printing of 0-d arrays, is the one that is the most important in the sense that it rectifies a really strange situation, where the printing cannot be logically controlled by the same mechanism that controls >=1-d arrays (see PR).
> >
> > While point 3 can also be considered a bug fix, 1 & 4 are at some level matters of taste; my own reason for supporting their implementation now is that the 0-d arrays already force me (or, specifically, astropy) to rewrite quite a few doctests, and I'd rather have everything in one go -- in this respect, it is a pity that this is separate from the earlier change in printing for structured arrays (which was also much for the better, but broke a lot of doctests).
> >
> > -- Marten
> >
> > On Thu, Jun 29, 2017 at 3:38 PM, Allan Haldane wrote:
> > > Hello all,
> > >
> > > There are various updates to array printing in preparation for numpy 1.14. See https://github.com/numpy/numpy/pull/9139/
> > >
> > > Some are quite likely to break other projects' doc-tests which expect a particular str or repr of arrays, so I'd like to warn the list in case anyone has opinions.
> > >
> > > The current proposed changes, from most to least painful by my reckoning, are:
> > >
> > > 1. For float arrays, an extra space previously used for the sign position will now be omitted in many cases. Eg, `repr(arange(4.))` will now return 'array([0., 1., 2., 3.])' instead of 'array([ 0., 1., 2., 3.])'.
> > >
> > > 2. The printing of 0d arrays is overhauled. This is a bit finicky to describe, please see the release note in the PR. As an example of the effect of this, `repr(np.array(0.))` now prints as 'array(0.)' instead of 'array(0.0)'. Also the repr of 0d datetime arrays is now like "array('2005-04-04', dtype='datetime64[D]')" instead of "array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".
> > >
> > > 3. User-defined dtypes which did not properly implement their `repr` (and `str`) should do so now. Otherwise it now falls back to `object.__repr__`, which will return something ugly like ``. (Previously you could depend on only implementing the `item` method and the repr of that would be printed. But no longer, because this risks infinite recursions.)
> > >
> > > 4. Bool arrays of size 1 with a 'True' value will now omit a space, so that `repr(array([True]))` is now 'array([True])' instead of 'array([ True])'.
> > >
> > > Allan
Re: [Numpy-discussion] Scipy 2017 NumPy sprint
On Sun, 2017-07-02 at 10:49 -0400, Allan Haldane wrote:
> On 07/02/2017 10:03 AM, Charles R Harris wrote:
> > Updated list below.
> >
> > On Sat, Jul 1, 2017 at 7:08 PM, Benjamin Root <ben.v.r...@gmail.com> wrote:
> > > Just a heads-up. There is now a sphinx-gallery plugin. Matplotlib and a few other projects have migrated their docs over to use it.
> > > https://sphinx-gallery.readthedocs.io/en/latest/
> > > Cheers! Ben Root
> >
> > On Sat, Jul 1, 2017 at 7:12 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote:
> > > On Fri, Jun 30, 2017 at 6:50 AM, Pauli Virtanen <p...@iki.fi> wrote:
> > > > Charles R Harris wrote on 29.06.2017 at 20:45:
> > > > > Here's a random idea: how about building a NumPy gallery? scikit-{image,learn} has it, and while those projects may have more visual datasets, I can imagine something along the lines of Nicolas Rougier's beautiful book:
> > > > > http://www.labri.fr/perso/nrougier/from-python-to-numpy/
> > > > >
> > > > > So that would be added in the numpy/numpy.org repo?
> > > >
> > > > Or https://scipy-cookbook.readthedocs.io/ ? (maybe minus bitrot and images added :)
> > >
> > > I'd like the numpy.org one. numpy.org is now incredibly sparse and ugly, a gallery would make it look a lot better.
> > >
> > > Another idea, from the "deprecate np.matrix" discussion: add numpy documentation describing the preferred way to handle matrices, extolling the virtues of @, and move np.matrix documentation to a deprecated section.
> >
> > Putting things together with a few new ideas,
> >
> > 1. add gallery to numpy.org,
> > 2. add extended documentation of '@' operator,
> > 3. make Numpy tests Pytest compatible,
> > 4. add matrix multiplication ufunc.
> >
> > Any more ideas?
>
> The new doctest runner suggested in the printing thread? This is to ignore whitespace and precision in ndarray output.
>
> I can see an argument for distributing it in numpy if it is designed to be specially aware of ndarrays or numpy scalars (eg to test equality between 'wants' and 'got')

I don't really feel it is very numpy specific or should be under the numpy umbrella (I mean, if there is no other spot, I guess it could live on the numpy github page). It is about as numpy specific as the gallery sphinx extension is matplotlib specific.

That doesn't mean that it might not be a good sprint, though :). The question to me is a bit what those who actually go there want from it, or whether a few people who already know numpy/scipy plan to come. Two years ago, we did not have much of a plan, so it was mostly giving three people or so a bit of a tutorial of how numpy works internally, leading to some bug fixes.

One quick idea that might be nice and dives a bit into the C-layer (might be nice if there is no big topic with a few people working on it):

* Find places that should have the new memory overlap detection and implement it there.
If someone who does subclasses/array-likes or so (e.g. Stephan Hoyer ;)) is interested, and we do some teleconferencing/chatting (and I have time), I might be interested in discussing and possibly trying to develop the new indexer ideas. I feel those are pretty far along, but I got stuck on how to get subclasses right.

- Sebastian

> Allan
>
> > Chuck
Re: [Numpy-discussion] reshape 2D array into 3D
On Mon, 2017-07-10 at 16:16 +0300, eat wrote:
> Hi,
>
> On Mon, Jul 10, 2017 at 3:20 PM, wrote:
> > Dear All
> > I'm looking for a way to reshape a 2D matrix into a 3D one; in my example I want to move the columns from the 4th to the 8th into the 2nd plane (the 3rd dimension, I guess).
> > a = np.random.rand(5,8); print(a)
> > I tried
> > a = np.reshape(a, (2,5,4)) but it is not what I'm expecting
> >
> > Nota: it looks like the following task (while I want to split it in 2 levels and not in 4), but I've not understood it at all:
> > https://stackoverflow.com/questions/31686989/numpy-reshape-and-partition-2d-array-to-3d
>
> Is this what you are looking for:
>
> import numpy as np
>
> a = np.arange(40).reshape(5, 8)
>
> a
> Out[]:
> array([[ 0,  1,  2,  3,  4,  5,  6,  7],
>        [ 8,  9, 10, 11, 12, 13, 14, 15],
>        [16, 17, 18, 19, 20, 21, 22, 23],
>        [24, 25, 26, 27, 28, 29, 30, 31],
>        [32, 33, 34, 35, 36, 37, 38, 39]])
>
> np.lib.stride_tricks.as_strided(a, (2, 5, 4), (16, 32, 4))
> Out[]:
> array([[[ 0,  1,  2,  3],
>         [ 8,  9, 10, 11],
>         [16, 17, 18, 19],
>         [24, 25, 26, 27],
>         [32, 33, 34, 35]],
>
>        [[ 4,  5,  6,  7],
>         [12, 13, 14, 15],
>         [20, 21, 22, 23],
>         [28, 29, 30, 31],
>         [36, 37, 38, 39]]])

While this may be what he wants, I would avoid stride tricks if you can achieve the same thing with a reshape + transpose. That is far safer than hardcoding the strides, much shorter if you don't hardcode them, and usually easier to read. See the sketch after this message.

One thing some people might get confused about with reshape is the order: numpy's reshape defaults to C order, while other packages may use Fortran order for reshaping. You can change the order you want to use (though it is in general probably a good idea to prefer C order in numpy).

- Sebastian

> Regards,
> -eat
>
> > Thanks for your support
> > Paul
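For reference, the reshape + transpose spelling of the same split. Unlike the hardcoded strides above (which assume a 4-byte itemsize), this works for any dtype:

```python
import numpy as np

a = np.arange(40).reshape(5, 8)

# Split each row of 8 into 2 blocks of 4, then move the block axis to
# the front: (5, 8) -> (5, 2, 4) -> (2, 5, 4).
b = a.reshape(5, 2, 4).transpose(1, 0, 2)

print(b[0])   # columns 0-3 of a
print(b[1])   # columns 4-7 of a
print(np.array_equal(b[0], a[:, :4]) and np.array_equal(b[1], a[:, 4:]))  # True
```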
Re: [Numpy-discussion] pytest and degrees of separation.
On Tue, 2017-07-11 at 14:49 -0600, Charles R Harris wrote:
> Hi All,
>
> Just looking for opinions and feedback on the need to keep NumPy from having a hard nose/pytest dependency. The options as I see them are:
>
> 1. pytest is never imported until the tests are run -- current practice with nose
> 2. pytest is never imported unless the testfiles are imported -- what I would like
> 3. pytest is imported together with numpy -- what we need to avoid.
>
> Currently the approach has been 1), but I think 2) makes more sense and allows more flexibility.

I am not quite sure about everything here. My guess is we can do whatever we want when it comes to our own tests, and I don't mind just switching everything to pytest (I for one am happy as long as I can run `runtests.py` ;)).

When it comes to the utils we provide, those should keep working without nose/pytest if they worked without it before, I think. My guess is that all your options do that, so we should take the one that gives the nicest maintainable code :). Though I can't say I have looked into it enough to really make a well-educated decision; that probably means your option 2.

- Sebastian

> Thoughts?
> Chuck
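Schematically, the three options differ only in where the `import pytest` statement lives; a sketch, with the module layout purely illustrative:

```python
# illustrative test-support module, not NumPy's real layout

# Option 3 (avoid): a top-level `import pytest` here would make plain
# `import numpy` fail on machines without pytest installed.

def run_tests(argv=None):
    """Option 1: pytest only becomes a requirement when tests are run."""
    import pytest  # deferred import, never triggered by `import numpy`
    return pytest.main(argv or [])

# Option 2: the testfiles themselves do `import pytest` at module
# level, so pytest is needed to import/collect the tests, but still
# never for `import numpy` itself.
```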
Re: [Numpy-discussion] How to compare an array of arrays elementwise to None in Numpy 1.13 (was easy before)?
On Mon, 2017-07-17 at 09:13 +0000, martin.gfel...@swisscom.com wrote:
> Dear all
>
> I have an object array of arrays, which I compare element-wise to None in various places:
>
> >>> a = numpy.array([numpy.arange(5),None,numpy.nan,numpy.arange(6),None],dtype=numpy.object)
> >>> a
> array([array([0, 1, 2, 3, 4]), None, nan, array([0, 1, 2, 3, 4, 5]), None], dtype=object)
> >>> numpy.equal(a,None)
> FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
>
> So far, I always ignored the warning, for lack of an idea how to resolve it.
>
> Now, with Numpy 1.13, I have to resolve the issue, because it fails with:
>
> ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>
> It seems that numpy.equal is applied to each inner array, returning a Boolean array for each element, which cannot be coerced to a single Boolean.
>
> The expression
>
> >>> numpy.vectorize(operator.is_)(a,None)
>
> gives the desired result, but feels a bit clumsy.

Yes, I guess one's bug is someone else's feature :(. If it is very bad, we could probably delay the deprecation. As for a solution, maybe we could add a ufunc for elementwise `is` on object arrays (dunno about the name, maybe `object_identity`). Just some quick thoughts.

- Sebastian

> Is there a cleaner, efficient way to do an element-wise (but shallow) comparison?
>
> Thank you and best regards,
> Martin Gfeller, Swisscom
Re: [Numpy-discussion] How to compare an array of arrays elementwise to None in
On Wed, 2017-07-19 at 08:31 +0000, martin.gfel...@swisscom.com wrote:
> Thank you for your help!
>
> Sebastian, I couldn't agree more with someone's bug being someone else's feature! A fast identity ufunc would be useful, though.

An `object_identity` ufunc should be very easy to implement; the bigger work is likely actually deciding on it and the name. We should also probably check back with the PyPy guys to make sure it would work on PyPy as well.

- Sebastian

> Actually, numpy.frompyfunc(operator.is_,2,1) is much faster than the numpy.vectorize approach -- only about 35% slower on a quick measurement than the direct ==, as opposed to 62% slower with vectorize (with the otypes hint).
>
> Robert, yes, that's what I already did provisionally.
>
> Eric, that is a nice puzzle -- but I agree with Robert about understanding by code maintainers.
>
> Thanks again, and best regards,
> Martin
>
> > On Mon, 17 Jul 2017 11:41, Sebastian Berg wrote:
> > > Yes, I guess one's bug is someone else's feature :(. If it is very bad, we could probably delay the deprecation. As for a solution, maybe we could add a ufunc for elementwise `is` on object arrays (dunno about the name, maybe `object_identity`). Just some quick thoughts.
> > > - Sebastian
>
> > On Mon, 17 Jul 2017 at 17:45, Robert Kern wrote:
> > > Wrap the clumsiness up in a documented, tested utility function with a descriptive name and use that function everywhere instead.
> > > Robert Kern
>
> > On Mon, Jul 17, 2017 at 10:52 AM, Eric Wieser wrote:
> > > Here's a hack that lets you keep using ==:
> > >
> > > class IsCompare:
> > >     __array_priority__ = 99  # needed to make it work on either side of `==`
> > >     def __init__(self, val): self._val = val
> > >     def __eq__(self, other): return other is self._val
> > >     def __ne__(self, other): return other is not self._val
> > >
> > > a == IsCompare(None)  # a is None
> > > a == np.array(IsCompare(None))  # broadcasted a is None
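Combining Robert's advice with Martin's frompyfunc timing result, the wrapped utility might look like this; a sketch, and note the `astype` at the end, since frompyfunc-produced ufuncs return object arrays:

```python
import operator
import numpy as np

_elementwise_is = np.frompyfunc(operator.is_, 2, 1)

def is_none(arr):
    """Elementwise `x is None` over an object array, as a bool array."""
    # frompyfunc ufuncs return object arrays, so cast back to bool.
    return _elementwise_is(arr, None).astype(bool)

a = np.array([np.arange(5), None, np.nan, np.arange(6), None], dtype=object)
print(is_none(a))   # [False  True False False  True]
```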
Re: [Numpy-discussion] NumPy steering councils members
On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> On 21.07.2017 08:52, Ralf Gommers wrote:
> > Hi all,
> >
> > It has been well over a year since we put together the governance structure and steering council (https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people). We haven't reviewed the people on the steering council in that time. Based on the criteria for membership, I would like to make the following suggestion (note, not discussed with everyone in private beforehand):
> >
> > Adding the following people to the steering council:
> > - Eric Wieser
> > - Marten van Kerkwijk
> > - Stephan Hoyer
> > - Allan Haldane
>
> Eric and Marten have only been members with commit rights for 6 months. While they have been contributing to, and been very valuable for, the project for significantly longer, I do think this is a bit too short a time to be considered for the steering council. I certainly approve of them becoming members at some point, but I do want to avoid the steering council growing too large too quickly, as long as it does not need more members to do its job. What I do want to avoid is that the steering council becomes like our committers list: a group that only grows and never shrinks as long as the occasional heartbeat is heard.
>
> That said, if we think the current steering council is not able to fulfil its purpose, I do offer my seat for a replacement, as I currently have not really been contributing much.

I doubt that ;). IIRC the rules were "at least one year", so you are probably right that we should delay the official status until then, but I don't care much personally. I think all of us are in the position where we don't mind giving up this "official" position in favor of more active people (just to note, IIRC in two years now, it was _somewhat_ used a single time, when we donated a bit of numpy money to the mingwpy effort).

I am not sure if we had it, but we could put in (up to changes, of course) a rough number of people we aim to have on it. Just so we don't forget to discuss that there should be a bit of flux. And I am all for some flux, because I would think it silly if those who actually make decisions don't end up on it because someone who only occasionally throws in a comment does. And yes, that person may well be me :).

- Sebastian

> cheers,
> Julian
Re: [Numpy-discussion] NumPy steering councils members
On Fri, 2017-07-21 at 12:59 -0700, Nathaniel Smith wrote:
> On Jul 21, 2017 9:36 AM, "Sebastian Berg" wrote:
> > On Fri, 2017-07-21 at 16:58 +0200, Julian Taylor wrote:
> > > On 21.07.2017 08:52, Ralf Gommers wrote:
>
> Also FWIW, the jupyter steering council is currently 15 people, or 16 including Fernando: https://github.com/jupyter/governance/blob/master/people.md
>
> By comparison, Numpy's currently has 8, so Ralf's proposal would bring it to 11: https://docs.scipy.org/doc/numpy-dev/dev/governance/people.html#governance-people
>
> Looking at the NumPy council, then with the exception of Alex, who I haven't heard from in a while, it looks like a list of people who regularly speak up and have sensible things to say, so I don't personally see any problem with keeping everyone around. It's not like the council is an active working group; it's mainly for occasional oversight and boring logistics.

For what it's worth, I fully agree. Frankly, I thought the list might be longer ;). And yes, while I can understand that there might be a problem at some point, I am sure we are far from it for a while.

Anyway, I think all of those four people Ralf mentioned would be a great addition (and if anyone wants to suggest someone else, please speak up).

- Sebastian

> -n
Re: [Numpy-discussion] Slice nested arrays, How to
On Mon, 2017-07-24 at 16:37 +0200, Bob wrote:
> Hello,
>
> I created the following array by converting it from a nested list:
>
> a = np.array([np.array([17.56578416, 16.82712825, 16.57992292, 15.83534836]),
>               np.array([17.9002445, 17.35024876, 16.69733472, 15.78809856]),
>               np.array([17.90086839, 17.64315136, 17.40653009, 17.26346787, 16.99901931, 16.87787178, 16.68278558, 16.56006419, 16.43672445]),
>               np.array([17.91147242, 17.2770623, 17.0320501, 16.73729491, 16.4910479])], dtype=object)
>
> I wish to slice the first element of each sub-array so I can perform basic statistics (mean, sd, etc.). How can I do that for large data without resorting to loops? Here's the result I want with a loop:

Arrays of arrays are not very nice in these regards. You could use np.frompyfunc/np.vectorize together with `operator.getitem` to avoid the loop. It probably will not be much faster, though.

- Sebastian

> s = np.zeros(4)
> for i in np.arange(4):
>     s[i] = a[i][0]
>
> array([ 17.56578416, 17.9002445, 17.90086839, 17.91147242])
>
> Thank you
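A short sketch of the `frompyfunc` + `operator.getitem` suggestion (abbreviated data; illustrative only):

```
import operator
import numpy as np

# Object array of float arrays with differing lengths (abbreviated).
a = np.array([np.array([17.57, 16.83]),
              np.array([17.90, 17.35, 16.70]),
              np.array([17.91, 17.28])], dtype=object)

# Pick the first element of each sub-array without an explicit loop.
first = np.frompyfunc(operator.getitem, 2, 1)(a, 0).astype(float)
print(first)         # [17.57 17.9  17.91]
print(first.mean())  # basic statistics now work on a float array
```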
Re: [Numpy-discussion] NumPy steering councils members
Hi all,

so I guess this means: unless anyone protests (publicly or privately; soon, though probably within at least a week from now), we will invite four new members to the steering council and, if they accept, they will be added soon [1]. These are:

- Eric Wieser
- Marten van Kerkwijk
- Stephan Hoyer
- Allan Haldane

all of whom have done considerable work for NumPy for a long time. I would like to also note again that I am happy about any additional suggestions. Alex Griffin will be informed that, depending on his wishes, he may have to leave soon or within about a year (IIRC that is about what the governance docs say).

Regards,

Sebastian

[1] Two of whom may be appointed with some delay due to the one year rule. We may have to hash out details here.

On Fri, 2017-07-21 at 22:18 +0200, Sebastian Berg wrote:
> For what it's worth, I fully agree. Frankly, I thought the list might be longer ;). And yes, while I can understand that there might be a problem at some point, I am sure we are far from it for a while.
>
> Anyway, I think all of those four people Ralf mentioned would be a great addition (and if anyone wants to suggest someone else, please speak up).
>
> - Sebastian
Re: [Numpy-discussion] np.array, copy=False and memmap
On Thu, 2017-08-10 at 12:27 -0400, Allan Haldane wrote:
> On 08/07/2017 05:01 PM, Nisoli Isaia wrote:
> > Dear all,
> > I have a question about the behaviour of
> >
> > y = np.array(x, copy=False, dtype='float32')
> >
> > when x is a memmap. If we check the memmap attribute of mmap
> >
> > print "mmap attribute", y._mmap
> >
> > numpy tells us that y is not a memmap. But the following code snippet crashes the python interpreter:
> >
> > # opens the memmap
> > with open(filename, 'r+b') as f:
> >     mm = mmap.mmap(f.fileno(), 0)
> >     x = np.frombuffer(mm, dtype='float32')
> >
> > # builds an array from the memmap, with the option copy=False
> > y = np.array(x, copy=False, dtype='float32')
> > print "before", y
> >
> > # closes the file
> > mm.close()
> > print "after", y
> >
> > In my code I use memmaps to share read-only objects when doing parallel processing, and the behaviour of np.array, even if not consistent, is desirable. I share scipy sparse matrices over many processes, and if np.array made a copy when dealing with memmaps, this would force me to rewrite part of the sparse matrices code. Would it be possible in future releases of numpy to have np.array check, if copy is false, whether y is a memmap, and in that case return a full memmap object instead of slicing it?
>
> This does appear to be a bug in numpy or mmap.

Frankly, on first sight I do not think it is a bug in either of them. NumPy uses a view (memmap really is just a name for a memory-map-backed numpy array). The numpy array will hold a reference to the memory map object in its `.base` attribute (or the base of the base, etc.). If you close a mmap object and then keep using it, you can get segfaults of course; I am not sure what you can do about it. Maybe python could try to warn you when you exit the context/close a file pointer, but I suppose: Python does memory management for you and makes doing IO easy, but you still need to manage the IO correctly. That this segfaults rather than just raising an error may be annoying, but seems the nature of things on first sight.

- Sebastian

> Probably the solution isn't to make mmaps a special case; rather, we should fix a bug somewhere in the use of the PEP 3118 interface.
>
> I've opened an issue on github for your issue: https://github.com/numpy/numpy/issues/9537
>
> It seems to me that the "correct" behavior may be for it to be impossible to close the memmap while pointers to it exist; this is the behavior for `memoryview`s of mmaps. That is, your line `mm.close()` should raise an error `BufferError: cannot close exported pointers exist`.
>
> > Best wishes
> > Isaia
> >
> > P.S. A longer account of the issue may be found on my university blog: http://www.im.ufrj.br/nisoli/blog/?p=131
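A sketch of the reference chain Sebastian describes; `data.bin` is a hypothetical file, and the exact object at the end of the chain may differ between NumPy versions:

```
import mmap
import numpy as np

with open("data.bin", "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)

x = np.frombuffer(mm, dtype="float32")
y = np.array(x, copy=False, dtype="float32")  # a view, not a memmap

# Walk the .base chain: this is what keeps the mapped memory alive.
base = y
while getattr(base, "base", None) is not None:
    base = base.base
print(type(base))  # the underlying buffer (the mmap, or a view onto it)

# Closing `mm` while `y` is still in use invalidates y's memory, hence
# the crash in the original report.
```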
Re: [Numpy-discussion] Tensor Contraction (HPTT) and Tensor Transposition (TCL)
On Thu, 2017-08-17 at 00:33 +0200, Paul Springer wrote:
> On 8/16/17 at 6:08 PM, Anne Archibald wrote:
> > If you wanted to integrate HPTT into numpy, I think the best approach might be to wire it into the assignment machinery, so that when users do things like a[::2,:] = b[:,::3].T, HPTT springs into action behind the scenes and makes this assignment as efficient as possible (how well does it handle arrays with spaces between elements?). Then ascontiguousarray and asfortranarray and the like could simply use assignment to an appropriately-ordered destination when they actually needed to do anything.
>
> HPTT offers support for subtensors (via the outerSize parameter, which is similar to the leading dimension in BLAS); thus, HPTT can also deal with arbitrarily strided transpositions. However, a non-unit stride for the fastest-varying index is devastating for performance, since this prohibits the use of vectorization and the exploitation of spatial locality.
>
> What would the integration of HPTT into NumPy look like? Which steps would need to be taken? Would it be required that HPTT be distributed in source code alongside NumPy (at that point I might have to change the license for HPTT), or would it be fine to add a git dependency? That way, users who build NumPy from source could fetch HPTT and set a flag during the build process of NumPy, indicating that HPTT is available. What would the process look like if NumPy is distributed as a precompiled binary?

Well, numpy is BSD, and the official binaries will be BSD; someone else could do less free binaries, of course. I doubt we can have a hard dependency unless it is part of the numpy source (some trick like this existed at one point for fftw, but...). I doubt including the source itself is going to happen quickly, since we would first have to decide to actually use a modern C++ compiler (I have no idea whether that is problematic or not).

Having a blocked/fancier (I assume) iterator jump in, at least for simple operations such as transpose+copy as Anne suggested, sounds very cool though. It could be nice for simple ufuncs at least as well. I have no idea how difficult that may be, or how much complexity it would add to maintenance. My guess is it might require quite a lot of work to integrate such optimizations into the iterator itself (even though it would be awesome), compared to just trying to plug it into some selected fast paths as Anne suggested.

One thing that might be very simple and also pretty nice is just trying to keep the documentation (or a wiki page linked from the documentation) up to date with suggestions for people interested in speed improvements, listing things such as (not sure if we have that):

* Use pyfftw for speeding up ffts
* numexpr can be nice and gives a way to quickly use multiple cores
* numba can automagically compile some python functions to be fast
* Use TCL if you need faster einsum(like) operations
* ...

Just a few thoughts, I did not think about details really. But yes, it sounds reasonable to me to re-add support for optional dependencies such as fftw or your TCL. But packagers have to make use of that, or I fear it is actually less available than a standalone python module.

- Sebastian

> The same questions apply with respect to TCL.
>
> > > TCL uses the Transpose-Transpose-GEMM-Transpose approach, where all tensors are flattened into matrices (via HPTT) and then contracted via GEMM; the final result is eventually folded (via HPTT) into the desired output tensor.
> >
> > This is a pretty direct replacement of einsum, but I think einsum may well already do pretty much this, apart from not using HPTT to do the transposes. So the way to get this functionality would be to make the matrix-rearrangement primitives use HPTT, as above.
>
> That would certainly be one approach; however, TCL also explores several different strategies/candidates and picks the one that minimizes the data movement required by the transpositions.
>
> > > Would it be possible to expose HPTT and TCL as optional packages within NumPy? This way I don't have to redo the work that I've already put into those libraries.
> >
> > I think numpy should be regarded as a basically-complete package for manipulating strided in-memory data, to which we are reluctant to add new user-visible functionality. Tools that can act under the hood to make existing code faster, or to reduce the work users must do to make their code run fast enough, are valuable.
>
> It seems to me that TCL is such a candidate, since it can replace a significant portion of th...
[Numpy-discussion] Github overview change
Hi all,

probably silly, but is anyone else annoyed at not seeing comments anymore on the github overview/start page? I stopped getting everything as mails and had a (bad) habit of glancing at them, which would catch at least the bigger discussions going on, but now it only shows actual commits, which honestly are less interesting to me.

Probably just me; I was just wondering if anyone knew a setting or so?

- Sebastian
Re: [Numpy-discussion] Github overview change
On Wed, 2017-10-18 at 13:25 -0500, Nathan Goldbaum wrote:
> This is a change in the UI that github introduced a couple weeks ago during their annual conference.
>
> See https://github.com/blog/2447-a-more-connected-universe

This announces the "Discover repositories" thing, but my normal news feed changed significantly, maybe at the same time, not showing comments at all.

Is there a simple setup where:
1. I can get a rough overview of what is being discussed, without necessarily reading everything.
2. I still get anything with @mention, etc., so that I can't really miss it? (Right now I have those in mail -- which I like -- and on the website, which I don't care too much about.)

Probably I can set it up to get everything as mail, and set the website to still only give notifications for 2., which would be OK. Maybe I am just change resistant ;).

- Sebastian

> On Wed, Oct 18, 2017 at 11:49 AM, Charles R Harris wrote:
> > On Wed, Oct 18, 2017 at 7:23 AM, Sebastian Berg wrote:
> > > probably silly, but is anyone else annoyed at not seeing comments anymore on the github overview/start page? [...]
> >
> > Don't know any settings. It's almost as annoying as not forwarding my own comments ...
> >
> > Chuck
Re: [Numpy-discussion] Proposal of timeline for dropping Python 2.7 support
On Wed, 2017-11-08 at 18:15 +0100, Ilhan Polat wrote:
> I was about to send the same thing. I think this matter has become a vim/emacs issue and Py2 supporters won't take any arguments anymore. But if Instagram can do it, it means that the legacy-code argument is a matter of will, not a technicality: https://thenewstack.io/instagram-makes-smooth-move-python-3/
>
> Also, people are really going out of their way, such as Tauthon (https://github.com/naftaliharris/tauthon), to stay with Python 2. To be honest, I'm convinced that this is a sentimental debate after seeing this fork.

In my opinion it is fine for us to drop support for python 2 in master relatively soon (as proposed here). But I guess we will need an "LTS" release, which means some extra maintenance burden until 2020. I would hope those who really need it jump in to carry some of that (and by 2020 my guess is: if anyone still wants to support it longer, we won't stop you, but I doubt the current core devs, at least not me, would be very interested in it).

So in my opinion, the current NumPy is excellent and very stable; anyone who needs fancy new stuff likely also wants other fancy new stuff, so will soon have to use python 3 anyway. Which means: if we think the extra burden of an "LTS" is lower than the current hassle, let's do it :). Also, downstream seems only half a reason to me, since downstream normally supports much outdated versions anyway?

- Sebastian

> On Wed, Nov 8, 2017 at 5:50 PM, Peter Cock wrote:
> > On Tue, Nov 7, 2017 at 11:40 PM, Nathaniel Smith wrote:
> > > Right now, the decision in front of us is what to tell people who ask about numpy's py2 support plans, so that they can make their own plans. Given what we know right now, I don't think we should promise to keep support past 2018. If we get there and the situation's changed, and there's both desire and means to extend support, we can revisit that. But it's better to under-promise and possibly over-deliver, instead of promising to support py2 until after it becomes a millstone around our necks and then realizing we haven't warned anyone and are stuck supporting it another year beyond that...
> > >
> > > -n
> >
> > NumPy (and to a lesser extent SciPy) is in a tough position, being at the bottom of many scientific Python programming stacks. Whenever you drop Python 2 support is going to upset someone.
> >
> > Is it too ambitious to pledge to drop support for Python 2.7 no later than 2020, coinciding with the Python development team's timeline for dropping support for Python 2.7?
> >
> > If that looks doable, NumPy could sign up to http://www.python3statement.org/
> >
> > Regards,
> >
> > Peter
Re: [Numpy-discussion] Is there a way that indexing a matrix of data with a matrix of indices?
On Wed, 2017-11-29 at 14:56, ZHUO QL (KDr2) wrote:
> Hi, all
>
> suppose:
>
> - D is the data matrix; its shape is M x N
> - I is the indices matrix; its shape is M x K, K <= N
>
> Is there an efficient way to get a matrix R with the same shape as I, so that R[x,y] = D[x, I[x,y]]?
>
> A nested for-loop or list-comprehension is too slow for me.

Advanced indexing can do any odd thing you might want to do. (I would suggest not using the matrix class, though; always use the array class if you are doing that.) This should do the trick; I will refer to the documentation for how it works, except to note that it is basically: R[x,y] = D[I1[x,y], I2[x,y]]

R = D[np.arange(I.shape[0])[:, np.newaxis], I]

- Sebastian

> Thanks.
>
> ZHUO QL (KDr2) http://kdr2.com
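A self-contained sketch of the trick, with small assumed shapes so the result can be checked against the loop definition:

```
import numpy as np

M, N, K = 3, 5, 2
D = np.random.rand(M, N)                  # data matrix, M x N
I = np.random.randint(0, N, size=(M, K))  # index matrix, M x K

# Broadcast a column of row indices against I: R[x, y] = D[x, I[x, y]]
R = D[np.arange(M)[:, np.newaxis], I]

# Verify against the loop definition.
for x in range(M):
    for y in range(K):
        assert R[x, y] == D[x, I[x, y]]
print(R.shape)  # (3, 2)
```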
Re: [Numpy-discussion] Which rule makes x[np.newaxis, :] and x[np.newaxis] equivalent?
On Tue, 2017-12-12 at 14:19 +0100, Joe wrote:
> Ah, ok, now that I knew what to look for I guess I found it:
>
> "If the number of objects in the selection tuple is less than N, then : is assumed for any subsequent dimensions."
>
> https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html
>
> This is the one, right?

Yeah. Plus, if it is not a tuple, it actually behaves the same as a tuple, e.g. `arr[obj]` is identical to `arr[obj,]` (or `arr[(obj,)]`, which is the same). There are some weird exceptions when obj is a sequence such as a list, but not an array.

Note also that while everything has an implicit `, ...` at the end of indexing, if you have exactly as many integers to index as dimensions you get a scalar; if you were to add the Ellipsis explicitly, you would get a (0-d) array back. Anyway, too many weird details for day-to-day stuff :). And all of that should be covered in the docs?

- Sebastian

> On 12.12.2017 09:09, Nathaniel Smith wrote:
> > On Tue, Dec 12, 2017 at 12:02 AM, Joe wrote:
> > > Hi,
> > >
> > > question says it all. I looked through the basic and advanced indexing, but I could not find the rule that is applied to make x[np.newaxis,:] and x[np.newaxis] the same.
> >
> > I think it's the general rule that all indexing expressions have an invisible "..." on the right edge. For example, x[i][j][k] is an inefficient and IMO somewhat confusing way to write x[i, j, k], because x[i][j][k] is interpreted as:
> >
> > -> x[i, ...][j, ...][k, ...]
> > -> x[i, :, :][j, :][k]
> >
> > That this also applies to newaxis is a little surprising, but I guess consistent.
> >
> > -n
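A quick illustration of these rules (a sketch; the behavior follows the documentation passage quoted above):

```
import numpy as np

x = np.arange(6).reshape(2, 3)

# Missing trailing dimensions are filled in with `:`.
assert np.array_equal(x[np.newaxis], x[np.newaxis, :])
assert np.array_equal(x[0], x[0, :])

# A non-tuple index behaves like a one-element tuple.
assert np.array_equal(x[0], x[(0,)])

# Exactly ndim integers give a scalar; adding an explicit Ellipsis
# gives a 0-d array instead.
assert np.isscalar(x[0, 0])
assert x[0, 0, ...].ndim == 0
```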
Re: [Numpy-discussion] Does x[True] trigger basic or advanced indexing?
On Thu, 2017-12-14 at 16:24, Eric Wieser wrote:
> It sounds like you're using an old version of numpy, where boolean scalars were interpreted as integers.
> What version are you using?
> Eric

Indeed, you are maybe using a pre-1.9 version (post 1.9 should at least give a DeprecationWarning or some such, though you might not notice it, IIRC). On newer versions you should get boolean indexing; the result of it may be a bit confusing. It is advanced indexing: basically, False gives you an empty array (with an extra dimension of size 0) and True acts much like an `np.newaxis`. It all makes perfect sense if you think of it as a 0-d boolean array picking out either nothing or the whole array. The same thing is true, for example, for lists of booleans.

- Sebastian

> On Thu, Dec 14, 2017, 04:27 Joe wrote:
> > Hello,
> > thanks for your feedback.
> >
> > Sorry if this question is stupid and the case below does not make sense. I am just trying to understand the logic. For
> >
> > x = np.random.rand(2,3)
> >
> > x[True]
> > x[(True,)]
> >
> > or
> >
> > x[False]
> > x[(False,)]
> >
> > where True and False are not arrays, it will pick the first or second row.
> >
> > Is this basic indexing then, with one of the rules
> > - obj is an integer
> > - obj is a tuple of slice objects and integers
> > ?
> >
> > On 13.12.2017 21:49, Eric Wieser wrote:
> > > Increasingly, NumPy does not consider booleans to be integer types, and indexing is one of these cases.
> > >
> > > So no, it will not be treated as a tuple of integers, but as a 0d mask.
> > >
> > > Eric
> > >
> > > On Wed, 13 Dec 2017 at 12:44, Joe wrote:
> > >> Hi,
> > >>
> > >> yet another question. I looked through the indexing rules in the documentation but I could not find which one applies to x[True] and x[False], which might e.g. result from
> > >>
> > >> import numpy as np
> > >> x = np.array(3)
> > >> x[x>5]
> > >> x[x<1]
> > >> x[True]
> > >> x[False]
> > >>
> > >> x = np.random.rand(2,3)
> > >> x[x>5]
> > >> x[x<1]
> > >> x[True]
> > >> x[False]
> > >>
> > >> I understood that they are equivalent to
> > >>
> > >> x[(False,)]
> > >>
> > >> I tested it and it looks like advanced indexing, but I try to understand the logic behind this, if there is any :)
> > >>
> > >> In x[x<1] the x<1 is a mask, and thus I guess it is a "tuple with at least one sequence object or ndarray (of data type integer or bool)", right?
> > >> Or will x[True] trigger basic indexing, as it is "a tuple of integers" because True will be converted to Int?
> > >>
> > >> Cheers,
> > >> Joe
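A short sketch of this behavior, on a NumPy recent enough to treat boolean scalars as masks:

```
import numpy as np

x = np.random.rand(2, 3)

# True acts much like np.newaxis: a new axis of length 1 is prepended.
print(x[True].shape)   # (1, 2, 3)

# False selects nothing: the prepended axis has length 0.
print(x[False].shape)  # (0, 2, 3)
```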
Re: [Numpy-discussion] Applying logical operations along an axis of a boolean array?
On Mon, 2017-12-18 at 12:02 +0100, hanno_li...@gmx.net wrote:
> Hi,
>
> is it possible to apply a logical operation, such as AND or OR, along a particular axis of a numpy array?

As mentioned, `np.any` and `np.all` work. However, what may be more interesting to you is that `np.logical_or.reduce` works as well. All binary ufuncs (most elementwise functions such as addition, subtraction, multiplication, etc.) support this `reduce` method (and some other methods, please find out yourself ;)). Things like `any`, `sum`, or `cumsum` are actually just thin wrappers around those.

- Sebastian

> Let's say I have an (n,m) array and I want to AND along the first axis, such that I get a (1,m) (or just (m,)) dimensional array back. I looked at the documentation for np.logical_and and friends but couldn't find an axis keyword on the logical_xyz operations, and nothing else seemed to fit.
>
> Thanks, and best regards,
> Hanno
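A small sketch of the ufunc `reduce` method in action:

```
import numpy as np

a = np.array([[True, True, False],
              [True, False, False]])

# AND along the first axis -> shape (3,)
print(np.logical_and.reduce(a, axis=0))  # [ True False False]
print(np.all(a, axis=0))                 # same result, thin wrapper

# OR works the same way:
print(np.logical_or.reduce(a, axis=0))   # [ True  True False]
print(np.any(a, axis=0))                 # [ True  True False]
```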
Re: [Numpy-discussion] GSoC'18 participation
On Tue, 2017-12-26 at 08:19 -0700, Charles R Harris wrote:
> On Mon, Dec 25, 2017 at 7:12 PM, Ralf Gommers wrote:
> > Hi all,
> >
> > It's the time of the year again where projects start preparing for GSoC, so I wanted to bring it up here. Last year I wrote: "in practice working on NumPy is just far too hard for most GSoC students. Previous years we've registered and generated ideas, but not gotten any students. We're also short on maintainer capacity. So I propose to not participate this year."
> >
> > I think that's still the case, so I won't be mentoring or organizing. In case anyone is interested to do one of those things, please speak up!
>
> Sounds realistic. I thought some of the ideas last year were doable, but no bites.

A bit unfortunate, but yeah, realistic. I do not have time to help out in any case.

- Sebastian

> Chuck
Re: [Numpy-discussion] array - dimension size of 1-D and 2-D examples
On Tue, 2018-01-09 at 12:27, martin.gfel...@swisscom.com wrote:
> Hi Derek
>
> I have a related question:
>
> Given:
>
> a = numpy.array([[0,1,2],[3,4]])
> assert a.ndim == 1
> b = numpy.array([[0,1,2],[3,4,5]])
> assert b.ndim == 2
>
> Is there an elegant way to force b to remain a 1-dim object array?

You will have to create an empty object array and assign the lists to it (here `l` is your list of lists):

```
b = np.empty(len(l), dtype=object)
b[...] = l
```

> I have a use case where normally the sublists are of different lengths, but I get a completely different structure when they are (coincidentally, in my case) of the same length.
>
> Thanks and best regards, Martin
>
> Martin Gfeller, Swisscom / Enterprise / Banking / Products / Quantax
>
> On Sun, 31 Dec 2017 00:11:48 +0100, Derek Homeier wrote:
> > On 30 Dec 2017, at 5:38 pm, Vinodhini Balusamy wrote:
> > > Just one more question from the details you have provided, which from my understanding strongly seems to be design [DEREK: You cannot create a regular 2-dimensional integer array from one row of length 3 and a second one of length 0. Thus np.array chooses the next most basic type of array it can fit your input data in.]
> >
> > Indeed, the general philosophy is to preserve the structure and type of your input data as far as possible, i.e. a list is turned into a 1d-array, a list of lists (or tuples etc.) into a 2d-array, _if_ the sequences are of equal length (even if length 1) -- as long as there is an unambiguous way to convert the data into an array (see below).
> >
> > > Which is the case only if a second one of length 0 is given.
> > > What about case 1:
> > > >>> x12 = np.array([[1,2,3]])
> > > >>> x12
> > > array([[1, 2, 3]])
> > > >>> x12.ndim
> > > 2
> > > This seems to take 2 dimensions.
> >
> > Yes, structurally this is equivalent to your second example:
> >
> > > >>> x12 = np.array([[1,2,3],[0,0,0]])
> > > >>> print(x12)
> > > [[1 2 3]
> > >  [0 0 0]]
> > > >>> x12.ndim
> > > 2
> >
> > > I presumed the above case and the case where length 0 is provided would be treated the same (I mean the same behaviour). Correct me if I am wrong.
> >
> > In this case there is no unambiguous way to construct the array - you would need a shape (2, 3) array to store the two lists with 3 elements in the first list. Obviously x12[0] would be np.array([1,2,3]), but what should be the value of x12[1] if the second list is empty - it could be zeros, or repeating x12[0], or simply undefined. np.array([[1, 2, 3], [4]]) would be even less clearly defined. These cases where there is no obvious "right" way to create the array have usually been discussed at some length, but I don't know if this is fully documented in some place. For the essentials, see
> >
> > https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html
> >
> > Note also the upcasting rules if you have e.g. a mix of integers and reals or complex numbers, and also how to control shape or data type explicitly with the respective keywords.
> >
> > Derek
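A brief sketch of this workaround; note that the caveat about equal-length sublists and the explicit loop fallback are an addition here, not part of the original exchange:

```
import numpy as np

l = [[0, 1, 2], [3, 4]]             # ragged sublists
b = np.empty(len(l), dtype=object)
b[...] = l                          # each sublist is stored as an object
print(b.ndim)                       # 1

# When the sublists happen to have equal lengths, the right-hand side
# may first be converted to a 2-d array and fail to broadcast, so an
# explicit element-by-element assignment is the safe route:
l2 = [[0, 1, 2], [3, 4, 5]]
b2 = np.empty(len(l2), dtype=object)
for i, item in enumerate(l2):
    b2[i] = item
print(b2.ndim)                      # 1
```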
Re: [Numpy-discussion] NumPy 1.14.0 release
On Sun, 2018-01-14 at 11:35, Matthew Brett wrote:
> Hi,
>
> On Sun, Jan 14, 2018 at 3:35 AM, Eric Wieser wrote:
> > > Did recarrays change? I didn't see anything in the release notes.
> >
> > Not directly, but structured arrays did, for which recarrays are really just a thin and somewhat buggy wrapper.
>
> Oh dear oh dear - for some reason I had completely missed these changes, and the justification for them.
>
> They do exactly the kind of thing that Konrad Hinsen was complaining about before, with justification, which is to change the behavior of previous code without an intervening (long) period of raising an error. In this case, the benefits of these changes seem small compared to the inevitable breakage and silently changed results they will cause.
>
> Is there any chance of reversing them?

Without knowing the change: there is always a chance of (temporary) reversal, and for unexpected complications it's probably the safest default if there is no agreement anyway.

- Sebastian

> Cheers,
>
> Matthew
Re: [Numpy-discussion] NumPy 1.14.1 released
Great news! As always, thanks for your relentless effort, Chuck!

- Sebastian

On Tue, 2018-02-20 at 18:21 -0700, Charles R Harris wrote:
> Hi All,
>
> On behalf of the NumPy team, I am pleased to announce NumPy 1.14.1. This is a bugfix release for some problems reported following the 1.14.0 release. The major problems fixed are the following:
>
> * Problems with the new array printing, particularly the printing of complex values. Please report any additional problems that may turn up.
> * Problems with ``np.einsum`` due to the new ``optimize=True`` default. Some fixes for optimization have been applied and ``optimize=False`` is now the default.
> * The sort order in ``np.unique`` when ``axis=`` is given will now always be lexicographic in the subarray elements. In previous NumPy versions there was an optimization that could result in sorting the subarrays as unsigned byte strings.
> * The change in 1.14.0 that multi-field indexing of structured arrays returns a view instead of a copy has been reverted, but remains on track for NumPy 1.15. Affected users should read the 1.14.1 NumPy User Guide section "basics/structured arrays/accessing multiple fields" for advice on how to manage this transition.
>
> This release supports Python 2.7 and 3.4 - 3.6. Wheels for the release are available on PyPI. Source tarballs, zipfiles, release notes, and the changelog are available on github.
>
> Contributors
>
> A total of 14 people contributed to this release. People with a "+" by their names contributed a patch for the first time.
>
> * Allan Haldane
> * Charles Harris
> * Daniel Smith
> * Dennis Weyland +
> * Eric Larson
> * Eric Wieser
> * Jarrod Millman
> * Kenichi Maehashi +
> * Marten van Kerkwijk
> * Mathieu Lamarre
> * Sebastian Berg
> * Simon Conseil
> * Simon Gibbons
> * xoviat
>
> Cheers,
>
> Charles Harris
Re: [Numpy-discussion] improving arange()? introducing fma()?
On Thu, 2018-02-22 at 14:33 -0500, Benjamin Root wrote:
> Sorry, I have been distracted with xarray improvements the past couple of weeks.
>
> Some thoughts on what has been discussed:
>
> First, you are right... Decimal is not the right module for this. I think instead I should use the 'fractions' module for loading grid spec information from strings (command-line, configs, etc). The tricky part is getting the yaml reader to use it instead of converting to a float under the hood.
>
> Second, what has been pointed out about the implementation of arange actually helps to explain some oddities I have encountered. In some situations, I have found that it was better for me to produce the reversed sequence, and then reverse that array back and use it.
>
> Third, it would be nice to do what we can to improve arange()'s results. Would we be ok with a PR that uses fma() if it is available, but then falls back on a regular multiply and add if it isn't available, or are we going to need to implement it ourselves for consistency?

I am not sure I like the idea:

1. It sounds like it might break code.
2. It sounds *not* like a fix, but rather like a "make it slightly less bad, but it is still awful".

Using fma inside linspace might possibly make linspace a bit more exact, and that would be a good thing, though I am not sure we have a policy yet for something that is only used sometimes, nor am I sure it actually helps. It also would be nice to add stable summation to numpy in general (in whatever way), which is maybe half related but on nobody's specific todo list.

> Lastly, there definitely needs to be a better tool for grid making. The problem appears easy at first, but it is fraught with many pitfalls and subtle issues. It is easy to say, "always use linspace()", but if the user doesn't have the number of pixels, they will need to calculate that using --- gasp! -- floating point numbers, which could result in the wrong answer. Or maybe their first/last positions were determined by some other calculation, and so the resulting grid does not have the expected spacing. Another problem that I run into is starting from two different sized grids and padding them both to be the same spec -- and getting that to match what would come about if I had generated the grid from scratch.

Maybe you are right, but right now I have no clue what that tool would do :). Whether we should add it to numpy likely depends on what exactly it does and how complex it is. I once wanted to add a "step" argument to linspace, but didn't in the end, largely because it basically enforced, in a very convoluted way, that the step fit exactly to a number of steps (up to floating point precision), and nobody was quite sure it was a good idea, since it would just be a little convenience for when you do not want to calculate the steps.

Best,

Sebastian

> Getting these things right is hard. I am not even certain that my existing code for doing this is even right. But what I do know is that until we build such a tool, users will continue to incorrectly use arange() and linspace(), and waste time trying to re-invent the wheel badly, assuming they even notice their mistakes in the first place! So, should such a tool go into numpy, given how fundamental it is to generate a sequence of floating point numbers, or should we try to put it into a package like rasterio or xarray?
>
> Cheers!
> Ben Root
>
> On Thu, Feb 22, 2018 at 2:02 PM, Chris Barker wrote:
> > @Ben: Have you found a solution to your problem? Are there things we could do in numpy to make it better?
> >
> > -CHB
> >
> > On Mon, Feb 12, 2018 at 9:33 AM, Chris Barker wrote:
> > > I think it's all been said, but a few comments:
> > >
> > > On Sun, Feb 11, 2018 at 2:19 PM, Nils Becker wrote:
> > > > Generating equidistantly spaced grids is simply not always possible.
> > >
> > > exactly -- and linspace gives pretty much the best possible result, guaranteeing that the start and end points are exact, and the spacing is within an ULP or two (maybe we could make that within 1 ULP always, but not sure that's worth it).
> > >
> > > > The reason is that the absolute spacing of the possible floating point numbers depends on their magnitude [1].
> > >
> > > Also, the exact spacing may not be exactly representable in FP -- so you have to have at least one space...
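For context, a small demonstration of the arange pitfall this thread keeps circling back to (results can vary with platform rounding, hence the hedging in the comments):

```
import numpy as np

# The computed length ceil((stop - start) / step) is rounding-sensitive,
# so the nominally exclusive stop value can sneak into the result:
print(np.arange(1.0, 1.3, 0.1))   # may print [1.  1.1 1.2 1.3]

# linspace pins both endpoints exactly and divides the interval:
print(np.linspace(1.0, 1.3, 4))   # [1.  1.1 1.2 1.3], endpoints exact
```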
Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray
On Thu, 2018-03-08 at 18:56, Stephan Hoyer wrote:
> Hi Nathaniel,
>
> Thanks for starting the discussion!
>
> Like Marten says, I think it would be useful to more clearly define what it means to be an abstract array. ndarray has lots of methods/properties that expose internal implementation (e.g., view, strides) that presumably we don't want to require as part of this interface. On the other hand, dtype and shape are almost assuredly part of this interface.
>
> To help guide the discussion, it would be good to identify concrete examples of types that should and should not satisfy this interface, e.g.,
>
> Marten's case 1: works exactly like ndarray, but stores data differently: parallel arrays (e.g., dask.array), sparse arrays (e.g., https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g., always C ordered).
>
> Marten's case 2: same methods as ndarray, but gives different results: np.ma.MaskedArray, arrays with units (quantities), maybe labeled arrays like xarray.DataArray.
>
> I don't think we have a hope of making a single base class for case 2 work with everything in NumPy, but we can define interfaces with different levels of functionality.

True, but I guess the aim is not to care at all about how things are implemented (so only case 2)? I agree that we can aim to be as close as possible, but we should not expect to reach it. My personal opinion:

1. To do this, we should start it "experimentally".
2. We need something like a reference implementation. First, because it allows testing whether a function, e.g. in numpy, is actually abstract-safe, and second, because it will be the only way to find out what our minimal abstract interface actually is (assuming we have started with 3).
3. Go ahead with putting it into numpy functions and see how much you need to make them work.

In the end, my guess is that everything that works for MaskedArrays and xarray is a pretty safe bet. I disagree with the statement that we do not need to define the minimal reference; in practice we do, as soon as we use it for numpy functions.

- Sebastian

> Because there is such a gradation of "duck array" types, I agree with Marten that we should not deprecate NDArrayOperatorsMixin. It's useful for types like xarray.Dataset that define __array_ufunc__ but cannot satisfy the full abstract array interface.
>
> Finally, for the name, what about `asduckarray`? Though perhaps that could be a source of confusion, given the gradation of duck-array-like types.
>
> Cheers,
> Stephan
>
> On Thu, Mar 8, 2018 at 7:07 AM, Marten van Kerkwijk wrote:
> > Hi Nathaniel,
> >
> > Overall, hugely in favour! For detailed comments, it would be good to have a link to a PR; could you put that up?
> >
> > A larger comment: you state that you think `np.asanyarray` is a mistake, since `np.matrix` and `np.ma.MaskedArray` would pass through and those do not strictly mimic `NDArray`. Here, I agree with `matrix` (but since we're deprecating it, let's remove that from the discussion), but I do not see how your proposed interface would not let `MaskedArray` pass through, nor really that one would necessarily want that.
> >
> > I think it may be good to distinguish two separate cases:
> > 1. Everything has exactly the same meaning as for `ndarray` but the data is stored differently (i.e., only `view` does not work). One can thus expect that for `output = function(inputs)`, at the end all `duck_output == ndarray_output`.
> > 2. Everything is implemented but operations may give different output (depending on masks for masked arrays, units for quantities, etc.), so generally `duck_output != ndarray_output`.
> >
> > Which one of these are you aiming at? By including `NDArrayOperatorsMixin`, it would seem option (2), but perhaps not? Is there a case for both separately?
> >
> > Smaller general comment: at least in the NEP I would not worry about deprecating `NDArrayOperatorsMixin` - this may well be handy in itself (for things that implement `__array_ufunc__` but do not have shape, etc.; I have been doing some work on creating ufunc chains that would use this -- but they definitely are not array-like). Similarly, I think there is room for an `NDArrayShapeMixin` which might help with `concatenate` and friends.
> >
> > Finally, on the name: `asarray` and `asanyarray` are just shims over...
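For concreteness, a minimal sketch of a duck-array wrapper built on the `NDArrayOperatorsMixin` mentioned in this thread (illustrative only; a real implementation would need to handle `out=`, reductions, and much more):

```
import numpy as np

class Duck(np.lib.mixins.NDArrayOperatorsMixin):
    """Tiny duck array: the mixin supplies +, *, comparisons, etc.,
    all routed through __array_ufunc__."""

    def __init__(self, data):
        self.data = np.asarray(data)

    @property
    def shape(self):
        return self.data.shape

    @property
    def dtype(self):
        return self.data.dtype

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap Duck inputs, defer to the ufunc, and rewrap the result.
        arrays = [i.data if isinstance(i, Duck) else i for i in inputs]
        return Duck(getattr(ufunc, method)(*arrays, **kwargs))

    def __repr__(self):
        return "Duck(%r)" % (self.data,)

print(Duck([1.0, 2.0]) + 1)  # Duck(array([2., 3.]))
```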
Re: [Numpy-discussion] 3D array slicing bug?
This NEP draft has some more hints/explanations if you are interested: https://github.com/seberg/numpy/blob/5becd12914d0402967205579d6f59a98151e0d98/doc/neps/indexing.rst#examples

Plus, it tries to avoid the word "subspace", hehehe.

- Sebastian

On Thu, 2018-03-22 at 10:41 +0100, Pauli Virtanen wrote:
> On Wed, 2018-03-21 at 20:40, Michael Himes wrote:
> > I have discovered what I believe is a bug with array slicing involving 3D (and higher) dimension arrays. When slicing a 3D array by a single value for axis 0, all values for axis 1, and a list to slice axis 2, the dimensionality of the resulting 2D array is flipped. However, slicing more than a single index for axis 0, or performing the slicing in two steps, results in the correct dimensionality. Below is a quick example to demonstrate this behavior.
>
> https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing
>
> The key part seems to be: "There are two parts to the indexing operation, the subspace defined by the basic indexing (**excluding integers**) and the subspace from the advanced indexing part."
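A minimal reproduction of the reported behavior (a sketch; shapes are assumed for illustration):

```
import numpy as np

a = np.zeros((4, 5, 6))

# Integer + slice + list: the integer counts as advanced indexing, and
# the two advanced indices are separated by a slice, so the broadcast
# advanced-index dimensions move to the front:
print(a[0, :, [0, 1]].shape)    # (2, 5) -- the reported "flip"

# A length-1 slice instead of the integer keeps everything in place:
print(a[0:1, :, [0, 1]].shape)  # (1, 5, 2)

# Two-step indexing also keeps the expected order:
print(a[0][:, [0, 1]].shape)    # (5, 2)
```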
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
Initializer or this sounds fine to me. As another data point, which I think has been mentioned before: `sum` uses `start` and min/max use `default`. `start` does not work, unless we also change the code to always use the identity if given (currently that is not the case), in which case it might be nice. However, "start" seems a bit like solving a different issue in any case.

Anyway, mostly noise. I really like adding this; the only thing worth discussing a bit is the name :).

- Sebastian

On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote:
> It calls it `initializer` - see https://docs.python.org/3.5/library/functools.html#functools.reduce
>
> > On Mar 26, 2018 at 09:54, Eric Wieser wrote:
> >
> > It turns out I misspoke - functools.reduce calls the argument `initial`.
> >
> > On Mon, 26 Mar 2018 at 00:17, Stephan Hoyer wrote:
> > > This looks like a very logical addition to the reduce interface. It has my support!
> > >
> > > I would have preferred the more descriptive name "initial_value", but consistency with functools.reduce makes a compelling case for "initializer".
> > >
> > > On Sun, Mar 25, 2018 at 1:15 PM, Eric Wieser wrote:
> > > > To reiterate my comments in the issue - I'm in favor of this.
> > > >
> > > > It seems especially valuable for identity-less functions (`min`, `max`, `lcm`), and the argument name is consistent with `functools.reduce` too.
> > > >
> > > > The only argument I can see against merging this would be `kwarg`-creep of `reduce`, and I think this has enough use cases to justify that.
> > > >
> > > > I'd like to merge in a few days, if no one else has any opinions.
> > > >
> > > > Eric
> > > >
> > > > On Fri, 16 Mar 2018 at 10:13, Hameer Abbasi wrote:
> > > > > Hello, everyone. I've submitted a PR to add an initializer kwarg to ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply a "default" value for identity-less ufunc reductions, and to specify an initial value for reductions such as sum (other than zero).
> > > > >
> > > > > Please feel free to review or leave feedback (although I think Eric and Marten have picked it apart pretty well).
> > > > >
> > > > > https://github.com/numpy/numpy/pull/10635
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Hameer
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
OK, the new documentation is actually clear:

    initializer : scalar, optional
        The value with which to start the reduction. Defaults to the
        `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, the
        first element of the reduction is used, and an error is thrown
        if the reduction is empty. If ``a.dtype`` is ``object``, then
        the initializer is _only_ used if the reduction is empty.

I would actually like to say that I do not like the object special case much (and it is probably the reason why I was confused), nor am I quite sure this helps a lot. Logically, I would argue there are two things:

1. initializer/start (always used)
2. default (only used for empty reductions)

For example, I might like to give `np.nan` as the default for some empty reductions; this will not work. I understand that this is a minimally invasive PR and I am not sure I find the solution bad enough to really dislike it, but what do others think? My first expectation was the default behaviour (in all cases, not just the object case) for some reason.

To be honest, for now I just wonder a bit: how hard would it be to do both, or is that too annoying? It would at least get rid of that annoying thing with object ufuncs (which currently have a default, but not really an identity/initializer).

Best, Sebastian

On Mon, 2018-03-26 at 08:20 -0400, Hameer Abbasi wrote:
> Actually, the behavior right now isn’t that of `default` but that of `initializer` or `start`.
>
> This was discussed further down in the PR but to reiterate: `np.sum([10], initializer=5)` becomes `15`.
>
> Also, `np.min([5], initializer=0)` becomes `0`, so it isn’t really the default value, it’s the initial value among which the reduction is performed.
>
> This was the reason to call it initializer in the first place. I like `initial` and `initial_value` as well, and `start` also makes sense but isn’t descriptive enough.
>
> Hameer
> Sent from Astro for Mac
>
> > On Mar 26, 2018 at 12:06, Sebastian Berg wrote:
> >
> > Initializer or this sounds fine to me. As another data point which I think has been mentioned before, `sum` uses start and min/max use default. `start` does not work, unless we also change the code to always use the identity if given (currently that is not the case), in which case it might be nice. However, "start" seems a bit like solving a different issue in any case.
> >
> > Anyway, mostly noise. I really like adding this, the only thing worth discussing a bit is the name :).
> >
> > - Sebastian
> >
> > > On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote:
> > > It calls it `initializer` - See https://docs.python.org/3.5/library/functools.html#functools.reduce
> > >
> > > Sent from Astro for Mac
> > >
> > > > On Mar 26, 2018 at 09:54, Eric Wieser wrote:
> > > >
> > > > It turns out I misspoke - functools.reduce calls the argument `initial`
> > > >
> > > > On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer wrote:
> > > > > This looks like a very logical addition to the reduce interface. It has my support!
> > > > >
> > > > > I would have preferred the more descriptive name "initial_value", but consistency with functools.reduce makes a compelling case for "initializer".
> > > > >
> > > > > On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser wrote:
> > > > > > To reiterate my comments in the issue - I'm in favor of this.
> > > > > >
> > > > > > It seems especially valuable for identity-less functions (`min`, `max`, `lcm`), and the argument name is consistent with `functools.reduce`, too.
> > > > > >
> > > > > > The only argument I can see against merging this would be `kwarg`-creep of `reduce`, and I think this has enough use cases to justify that.
> > > > > >
> > > > > > I'd like to merge in a few days, if no one else has any opinions.
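To pin down the two meanings being distinguished in this thread, here is a minimal pure-Python sketch; the helper names are made up for illustration and are not code from the PR:

import functools

def reduce_with_initial(func, arr, initial):
    # `initializer`/`initial`/`start`: always folded into the reduction
    return functools.reduce(func, arr, initial)

def reduce_with_default(func, arr, default):
    # `default`: only consulted when the reduction is empty
    if len(arr) == 0:
        return default
    return functools.reduce(func, arr)

print(reduce_with_initial(min, [5], 0))   # -> 0, matching np.min([5], initializer=0)
print(reduce_with_default(min, [5], 0))   # -> 5, a true "default" is ignored here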
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote:
> That is the idea, but NaN functions are in a separate branch for another PR to be discussed later. You can see it on my fork, if you're interested.

Except that, as far as I understand, I am not sure it will help much with that, since it is not a default but an initializer. Initializing to NaN would just make all results NaN.

- Sebastian

> On 26/03/2018 at 17:35, Benjamin wrote: Hmm, this is neat. I imagine it would finally give some people a choice on what np.nansum([np.nan]) should return? It caused a huge hullabaloo a few years ago when we changed it from returning NaN to returning zero. Ben Root
>
> On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote: OK, the new documentation is actually clear: [...]
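The kwarg was later renamed to `initial` (see below in this thread), and the poisoning effect Sebastian describes is easy to reproduce; a small sketch, assuming a NumPy recent enough to have `initial` (the PR landed for 1.15):

import numpy as np

np.sum(np.array([1.0]), initial=np.nan)   # -> nan: folded into the sum
np.nansum(np.array([np.nan]))             # -> 0.0: what a `default` would override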
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> It'll need to be thought out for object arrays and subclasses. But for regular numeric stuff, Numpy uses fmin and this would have the desired effect.

I do not want to block this, but I would like a clearer opinion about this issue. `np.nansum`, as Benjamin noted, would require something like:

np.nansum([np.nan], default=np.nan)

because

np.sum([1], initializer=np.nan)
np.nansum([1], initializer=np.nan)

would both give NaN if the logic is the same as the current `np.sum`. And yes, I guess for fmin/fmax NaN happens to work. And then there are many nonsense reduces which could make sense with `initializer`.

Now nansum is not implemented in a way that could make use of the new kwarg anyway, so maybe it does not matter in some sense. We can in principle use `default` in nansum and at some point possibly add `default` to the normal ufuncs. If we argue like that, the only annoying thing is the `object` dtype which confuses the two use cases currently.

This confusion IMO is not harmless, because I might want to use it (e.g. sum with initializer=5), and I would expect things like dropping in `decimal.Decimal` to work most of the time, while here it would give silently bad results.

- Sebastian

> On 26/03/2018 at 17:45, Sebastian wrote: [...]
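Again assuming a NumPy with the `initial` kwarg as released, the difference between the two meanings is concrete:

import numpy as np

# NaN as `initial` poisons non-empty reductions:
np.min(np.array([5.0]), initial=np.nan)   # -> nan, not 5.0
# For the empty case it acts exactly like the `default` one would want:
np.min(np.array([]), initial=np.nan)      # -> nan, the only case intended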
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote:
> On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> > It'll need to be thought out for object arrays and subclasses. But for regular numeric stuff, Numpy uses fmin and this would have the desired effect.
>
> [...]
>
> This confusion IMO is not harmless, because I might want to use it (e.g. sum with initializer=5), and I would expect things like dropping in `decimal.Decimal` to work most of the time, while here it would give silently bad results.

In other words: I am very, very much in favor if you get rid of that object dtype special case. I frankly do not see why not (except that it needs a bit more code change). If given explicitly, we might as well force the use and not do the funny stuff which is designed to be more type agnostic! If it happens to fail due to not being type agnostic, it will at least fail loudly.

If you leave that object special case in, I am *very* hesitant about it.

That I think I would like a `default` argument as well is another issue, and it can wait for another day.

- Sebastian
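For illustration, this is the explicit-use behaviour Sebastian asks for; a sketch assuming a NumPy in which an explicitly passed `initial` is honoured for object dtype (as was eventually agreed later in this thread):

from decimal import Decimal
import numpy as np

arr = np.array([Decimal('1.5'), Decimal('2.5')], dtype=object)
# the explicitly given value is folded in rather than silently dropped
np.add.reduce(arr, initial=Decimal('1'))   # -> Decimal('5.0')
# and an incompatible initial value fails loudly instead of quietly:
# np.add.reduce(arr, initial=object())     # -> TypeError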
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> That may be complicated. Currently, the identity isn't used in object dtype reductions. We may need to change that, which could cause a whole lot of other backwards incompatible changes. For example, sum actually including zero in object reductions. Or we could pass in a flag saying an initializer was passed in to change that behaviour. If this is agreed upon and someone is kind enough to point me to the code, I'd be willing to make this change.

I realize the implication; I am not suggesting to change the default behaviour (when no initial=... is passed). I would think about deprecating it, but probably only if we also have the `default` argument, since otherwise you cannot replicate the old behaviour.

What I think I would like to see is to change how it works if (and only if) the initializer is passed in. Yes, this will require holding on to some extra information, since you will have to know/remember whether the "identity" was passed in or defined otherwise. I did not check the code, but I would hope that it is not awfully tricky to do that.

- Sebastian

PS: A side note, but I see your emails as a single block of text with no/broken new-lines.

> On 26/03/2018 at 18:54, Sebastian wrote: [...]
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-03-26 at 17:40 +0000, Eric Wieser wrote:
> The difficulty in supporting object arrays is that func.reduce(arr, initial=func.identity) and func.reduce(arr) have different meanings - whereas with the current patch, they are equivalent.

True, but the current meaning is:

func.reduce(arr, initial=<nothing>, default=func.identity)

in the case of object dtype. Luckily for normal dtypes, func.identity is both the correct default "default" and a no-op for initial. Thus the name "identity" kind of works there. I am also not really sure that both kwargs would make real sense (plus initial probably disallows default...), but I got the feeling that the "default" meaning may be even more useful to simplify special casing the empty case.

Anyway, still just pointing out that it gives me some headaches to see such a special case for objects :(.

- Sebastian

> On Mon, 26 Mar 2018 at 10:10 Sebastian Berg wrote: [...]
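Eric's equivalence for normal dtypes can be checked directly; the identity is simultaneously a no-op `initial` and the right empty-reduction "default" (assuming `initial` as it eventually shipped):

import numpy as np

np.add.reduce(np.array([1, 2]))              # -> 3
np.add.reduce(np.array([1, 2]), initial=0)   # -> 3, identity is a no-op
np.add.reduce(np.array([], dtype=int))       # -> 0, identity as "default"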
Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)
On Mon, 2018-04-09 at 13:37 +0200, Hameer Abbasi wrote:
> I've renamed the kwarg to `initial`. I'm willing to make the object dtype changes as well, if someone pointed me to relevant bits of code.
>
> Unfortunately, currently, the identity is only used for object dtypes if the reduction is empty. I think this is to prevent things like `0` being passed in the sum of objects (and similar cases), which makes sense.
>
> However, with the kwarg, it makes sense to include it in the reduction. I think the change will be somewhere along the lines of: detect if `initial` was passed, and if so, include it for object, otherwise exclude it.
>
> I personally feel `initial` renders `default` redundant. It can be used for both purposes. I can't think of a reasonable use case where you would want the default to be different from the initial value. However, I do agree that fixing the object case is important, we don't want users to get used to this behaviour and then rely on it later.

The reason would be the case of NaN, which is not a possible initial value for the reduction. I personally find the object case important; if someone seriously argues the opposite, I might possibly be swayed.

- Sebastian

> On Mon, Mar 26, 2018 at 8:09 PM, Sebastian Berg wrote: [...]
Re: [Numpy-discussion] Introduction: NumPy developers at BIDS
On Tue, 2018-04-10 at 12:29 +0300, Matti Picus wrote:
> On 08/04/18 21:02, Eric Firing wrote:
> > On 2018/04/07 9:19 PM, Stefan van der Walt wrote:
> > > We would love community input on identifying the best areas & issues to pay attention to,
> >
> > Stefan,
> >
> > What is the best way to provide this, and how will the decisions be made?
> >
> > Eric
>
> Hi. I feel very lucky to be able to dedicate the next phase of my career to working on NumPy. Even though BIDS has hired me, I view myself as working for the community, in an open and transparent way. In thinking about how to help make NumPy contributors more productive, we laid out these tasks:

Welcome also from me :), I am looking forward to seeing how things develop!

- Sebastian

> - triage open issues and pull requests, picking up some of the long-standing issues and trying to resolve them
> - help with code review
> - review and suggest improvements to the NumPy documentation
> - if needed, help with releases and infrastructure maintenance tasks
>
> Down the road, the next level of things would be
>
> - setting up a benchmark site like speed.python.org
> - add more downstream package testing to the NumPy CI so we can verify that new releases work with packages such as scipy, scikit-learn, astropy
>
> To document my work, I have set up a wiki (https://github.com/mattip/numpy/wiki) that lists some longer-term tasks and ideas. I look forward to meeting and working with Tyler, as well as to SciPy2018, where there will be both a BOF meeting to discuss NumPy and a two-day sprint.
>
> BIDS is ultimately responsible to the funders to make sure my work achieves the goals Stefan laid out, but I am going to try to be as responsive as possible to any input from the wider community, either directly (mattip on github and #numpy on IRC), via email, or this mailing list.
>
> Matti
Re: [Numpy-discussion] Adding a return value to np.random.shuffle
On Thu, 2018-04-12 at 13:36 -0400, Joseph Fox-Rabinovitz wrote:
> Would it break backwards compatibility to add the input as a return value to np.random.shuffle? I doubt anyone out there is relying on the None return value.

Well, Python discourages this IIRC, and opts not to do these things for in-place functions (see the random package specifically). Numpy breaks this in a few places, but that is mostly because we have the out argument as an optional input argument. As is, it is a nice way of making people not write:

new = np.random.shuffle(old)

and think old won't change. So I think we should probably just stick with the Python/Guido van Rossum ideals, or did those change?

- Sebastian

> The change is trivial, and allows shuffling a new array in one line instead of two:
>
> x = np.random.shuffle(np.array(some_junk))
>
> I've implemented the change in PR#10893.
>
> Regards,
>
> - Joe
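The one-liner use case already has an idiomatic spelling that returns a new shuffled array and leaves the input alone, which sidesteps the concern entirely:

import numpy as np

old = np.arange(10)
new = np.random.permutation(old)   # shuffled copy; `old` is unchanged
np.random.shuffle(old)             # in place; returns None by design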
Re: [Numpy-discussion] Adding fweights and aweights to numpy.corrcoef
I seem to recall that there was a discussion on this and it was a lot trickier than expected. I think statsmodels might have options in this direction.

- Sebastian

On Thu, 2018-04-26 at 15:44 +0000, Corin Hoad wrote:
> Hello,
>
> Would it be possible to add the fweights and aweights keyword arguments from np.cov to np.corrcoef? They would retain their meaning from np.cov as frequency- or importance-based weightings respectively.
>
> Yours,
> Corin Hoad
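Until something lands in numpy, a weighted corrcoef can be built on np.cov directly, since corrcoef is just the covariance normalised by the outer product of the standard deviations; a sketch (the helper name is made up):

import numpy as np

def weighted_corrcoef(x, fweights=None, aweights=None):
    # normalise the weighted covariance matrix to correlations
    c = np.cov(x, fweights=fweights, aweights=aweights)
    d = np.sqrt(np.diag(c))
    return c / np.outer(d, d)

x = np.random.random((3, 100))             # 3 variables, 100 observations
w = np.random.randint(1, 5, size=100)      # frequency weights
weighted_corrcoef(x, fweights=w)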
Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?
On Thu, 2018-04-26 at 09:51 -0700, Hameer Abbasi wrote:
> Hi Nathan,
>
> np.any and np.all call np.or.reduce and np.and.reduce respectively, and unfortunately the underlying function (ufunc.reduce) has no way of detecting that the value isn’t going to change anymore. It’s also used for (for example) np.sum (np.add.reduce), np.prod (np.multiply.reduce), np.min (np.minimum.reduce), np.max (np.maximum.reduce).

I would like to point out that this is almost, but not quite, true. The boolean versions will short-circuit on the innermost level, which is probably good enough for all practical purposes.

One way to get around it would be to use a chunked iteration using np.nditer in pure Python. I admit it is a bit tricky to get started on, but it is basically what numexpr uses also (at least in the simplest mode), and if your arrays are relatively large, there is likely no real performance hit compared to a non-pure-Python version.

- Sebastian

> You can find more information about this on the ufunc doc page. I don’t think it’s worth it to break this machinery for any and all, as it has numerous other advantages (such as being able to override in duck arrays, etc.)
>
> Best regards,
> Hameer Abbasi
> Sent from Astro for Mac
>
> > On Apr 26, 2018 at 18:45, Nathan Goldbaum wrote:
> >
> > Hi all,
> >
> > I was surprised recently to discover that both np.any and np.all() do not have a way to exit early:
> >
> > In [1]: import numpy as np
> >
> > In [2]: data = np.arange(1e6)
> >
> > In [3]: print(data[:10])
> > [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
> >
> > In [4]: %timeit np.any(data)
> > 724 µs ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> >
> > In [5]: data = np.zeros(int(1e6))
> >
> > In [6]: %timeit np.any(data)
> > 732 µs ± 52.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> >
> > I don't see any discussions about this on the NumPy issue tracker but perhaps I'm missing something.
> >
> > I'm curious if there's a way to get a fast early-terminating search in NumPy? Perhaps there's another package I can depend on that does this? I guess I could also write a bit of cython code that does this but so far this project is pure python and I don't want to deal with the packaging headache of getting wheels built and conda-forge packages set up on all platforms.
> >
> > Thanks for your help!
> >
> > -Nathan
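The reductions Hameer refers to are spelled np.logical_or.reduce and np.logical_and.reduce; a quick check of the equivalence:

import numpy as np

data = np.array([0.0, 1.0, 0.0])
bool(np.any(data)) == bool(np.logical_or.reduce(data))    # True
bool(np.all(data)) == bool(np.logical_and.reduce(data))   # True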
Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?
On Thu, 2018-04-26 at 19:26 +0200, Sebastian Berg wrote:
> I would like to point out that this is almost, but not quite, true. The boolean versions will short-circuit on the innermost level, which is probably good enough for all practical purposes.
>
> One way to get around it would be to use a chunked iteration using np.nditer in pure Python. [...]

I mean something like this:

def check_any(arr, func=lambda x: x, buffersize=0):
    """
    Check if the function is true for any value in arr
    and stop once the first one was found.

    Parameters
    ----------
    arr : ndarray
        Array to test.
    func : function
        Function taking a 1D array as argument and returning an
        array (on which ``np.any`` will be called).
    buffersize : int
        Size of the chunk/buffer in the iteration; zero will use
        the default numpy value.

    Notes
    -----
    The stopping does not occur immediately but in buffersize chunks.
    """
    iterflags = ['buffered', 'external_loop', 'refs_ok', 'zerosize_ok']
    for chunk in np.nditer((arr,), flags=iterflags,
                           buffersize=buffersize):
        if np.any(func(chunk)):
            return True
    return False

Not sure how it performs actually, but you can give it a try, especially if you know you have large arrays, or if "func" is pretty expensive. If the input is already bool, it will be quite a bit slower though, I am sure.

- Sebastian
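A quick usage sketch of the helper above; the example values are mine:

import numpy as np

data = np.zeros(int(1e6))
data[10] = 1.0
check_any(data, lambda x: x != 0)   # True, found within the first buffer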
Re: [Numpy-discussion] Splitting MaskedArray into a separate package
On Wed, 2018-05-23 at 17:33 -0400, Allan Haldane wrote:
> On 05/23/2018 04:02 PM, Eric Firing wrote:
> > Bad or missing values (and situations where one wants to use a mask to operate on a subset of an array) are found in many domains of real life; do you really want python users in those domains to have to fall back on Matlab-style reliance on nans and/or manual mask manipulations, as the new maskedarray package is sidelined?
>
> I also think that missing value support is important to include inside numpy, just as it is included in other numerical packages like R and Julia.
>
> The time is ripe to write a new and better MaskedArray, because __array_ufunc__ exists now. With some other numpy devs a few months ago we also played with rewriting MA using __array_ufunc__ and fixing up all the bugs and inconsistencies we have discovered over time (eg, getting rid of the Masked constant). Both Eric and I started working on some code changes, but never submitted PRs. See a little bit of discussion here (there was some more elsewhere I can't find now):
>
> https://github.com/numpy/numpy/pull/9792#issuecomment-46420
>
> As I say there, numpy's current MA support is pretty poor compared to R - Wes McKinney partly justified his desire to move pandas away from numpy because of it. We have a lot to gain by implementing it nicely.
>
> We already have an NEP discussing possible ways forward:
> https://docs.scipy.org/doc/numpy-1.14.0/neps/missing-data.html
>
> I was pretty excited by the discussion above, and still am. I want to get back to it after I finish more immediate priorities - finishing printing/loading/saving fixes and structured array fixes.
>
> But Masked-Array-2 is on my list of desired long-term enhancements for numpy.

Well, if we plan to replace it within numpy, I think we should wait until then for any move on deprecation (after which it seems like the obviously right choice)? If we do not plan to replace it within numpy, we need to discuss a bit how it might affect infrastructure (multiple implementations).

There is the other discussion about how to replace it: by opening up/creating new masked dtypes or similar (cool, but unclear how complex/long term) or `__array_ufunc__` based (relatively simple, will get rid of the nastier hacks that are currently needed). Or even both, just on different time scales?

My first gut feeling about the proposal is: I love the idea of getting rid of it... but let's not do it; it does feel like it makes too much infrastructure unclear.

- Sebastian

> Allan
Re: [Numpy-discussion] Splitting MaskedArray into a separate package
On Wed, 2018-05-23 at 23:48 +0200, Sebastian Berg wrote:
> If we do not plan to replace it within numpy, we need to discuss a bit how it might affect infrastructure (multiple implementations).
>
> There is the other discussion about how to replace it: by opening up/creating new masked dtypes or similar (cool, but unclear how complex/long term) or `__array_ufunc__` based (relatively simple, will get rid of the nastier hacks that are currently needed).
>
> Or even both, just on different time scales?

I also somewhat like the idea of taking it out (once we have a first replacement) in case we have a plan to do a better/lower-level replacement at a later point within numpy. Removal generally has its merits, but if a (mid-term) replacement will come in any case, it would be nice to get that started first if possible. Otherwise downstream might end up having to fix up things twice.

- Sebastian

> My first gut feeling about the proposal is: I love the idea of getting rid of it... but let's not do it; it does feel like it makes too much infrastructure unclear.
>
> - Sebastian
Re: [Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs
> > I'm currently -0.5 on both fixed dimensions and this broadcasting dimension idea. My reasoning is:
> >
> > - The use cases seem fairly esoteric. For fixed dimensions, I guess the motivating example is cross-product (are there any others?). But would it be so bad for a cross-product gufunc to raise an error if it receives the wrong number of dimensions? For this broadcasting case... well, obviously we've survived this long without all_equal :-). And there's something funny about all_equal, since it's really smushing together two conceptually separate gufuncs for efficiency. Should we also have all_less_than, sum_square, ...? If this is a big problem, then wouldn't it be better to solve it in a general way, like dask or Numba or numexpr do? To be clear, I'm not saying these features are necessarily *bad* ideas, in isolation -- just that the benefits aren't very convincing, and there are trade-offs, like:
>
> I have often wished numpy had these short-circuiting gufuncs, for a very long time. I specifically remember my fruitless searches for how to do it back to 2007.
>
> While "on average" short-circuiting only gives a speedup of 2x, in many situations you can arrange your algorithm so short-circuiting will happen early, eg usually in the first 10 elements of a 10^6 element array, giving enormous speedups.
>
> Also, I do not imagine these as free-floating ufuncs, I think we can arrange them in a logical way in a gufunc ecosystem. There would be some "core ufuncs", with "associated gufuncs" accessible as attributes. For instance, any_less_than will be accessible as less.any

So then, why is it a gufunc and not an attribute using a ufunc with binary output? I have asked this before, and even got arguments as to why it fits gufuncs better, but frankly I still do not really understand. If it is an associated gufunc, why a gufunc at all? We need any() and all() here, so that is not that many methods, right? And when it comes to buffering you have much more flexibility.

Say I have the operation:

(float_arr > int_arr).all(axis=(1, 2))

with int_arr being shaped (2, 1000, 1000) (i.e. large along the interesting axes). A normal gufunc IIRC will get the whole inner dimension as a float buffer. In other words, you gain practically nothing, because the whole int_arr will be cast to float anyway. If, however, you actually implement

np.greater.all(float_arr, int_arr, axis=(1, 2))

as a separate ufunc method, you would have the freedom to work in the typical cache-friendly buffersize chunks for each of the outer dimensions, one at a time. A gufunc would require saying: please do not buffer for me, or implementing all possible type combinations to do this. (Of course there are memory layout subtleties, since you would always have to optimize for the "fast exit" case, potentially making the worst-case scenario much worse -- unless you do seriously fancy stuff anyway.)

A more general question is actually whether we should rather focus on solving the same problem more generally. For example, if `numexpr` implemented all/any reductions, it might be able to get the identical tradeoffs with even more flexibility, pretty simply! (I have to admit, it may get tricky with multiple reduction dimensions, etc.)

- Sebastian

> Binary "comparison" ufuncs would have attributes
>
> less.any
> less.all
> less.first # returns first matching index
> less.count # counts matches without intermediate bool array
>
> This adds on to the existing attributes; for instance, ufuncs already have:
>
> add.reduce
> add.accumulate
> add.reduceat
> add.outer
> add.at
>
> It is unfortunate that all ufuncs currently have these attributes even if they are unimplemented/inappropriate (eg, np.sin.reduce). I would like to remove the inappropriate ones, so each core ufunc will only have the appropriate attribute "associated gufuncs".
>
> Incidentally, once we make reduce/accumulate/... into "associated gufuncs", I propose completely removing the "method" argument of __array_ufunc__, since it is no longer needed and adds a lot of complexity which implementors of an __array_ufunc__ are forced to account for.
>
> Cheers,
> Allan
>
> > - When it comes to the core ufunc machinery, we have a limited complexity budget. I'm nervous that if we add too many bells and whistles, we'll end up writing ourselves into a corner where we [...]
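A rough pure-Python sketch of the chunked, early-exit reduction Sebastian describes; this is illustration only (no such ufunc method exists, and the function name is made up). Only one small chunk of the integer operand is cast to float at a time, and each outer element bails out at the first False:

import numpy as np

def greater_all(a, b, chunk=8192):
    """all(a > b) over all but the first axis, with per-chunk early exit."""
    a2 = a.reshape(a.shape[0], -1)
    b2 = b.reshape(b.shape[0], -1)
    out = np.ones(a2.shape[0], dtype=bool)
    for i in range(a2.shape[0]):
        for start in range(0, a2.shape[1], chunk):
            # only this chunk of b2 gets cast/buffered, not the whole row
            if not np.all(a2[i, start:start + chunk] >
                          b2[i, start:start + chunk]):
                out[i] = False
                break
    return out

float_arr = np.random.random((2, 1000, 1000)) + 1.0
int_arr = np.zeros((2, 1000, 1000), dtype=np.intp)
greater_all(float_arr, int_arr)   # -> array([ True,  True])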
Re: [Numpy-discussion] Forcing new dimensions to appear at front in advanced indexing
On Tue, 2018-06-19 at 19:37 -0400, Michael Lamparski wrote:
> Hi all,
>
> So, in advanced indexing, numpy decides where to put new axes based on whether the "advanced indices" are all next to each other.
>
> >>> np.random.random((3,4,5,6,7,8))[:, [[0,0],[0,0]], 1, :].shape
> (3, 2, 2, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[:, [[0,0],[0,0]], :, 1].shape
> (2, 2, 3, 5, 7, 8)
>
> In creating a wrapper type around arrays, I'm finding myself needing to suppress this behavior, so that the new axes consistently appear in the front. I thought of a dumb hat trick:
>
> def index(x, indices):
>     return x[(True, None) + indices]
>
> Which certainly gets the new dimensions where I want them, but it introduces a ghost dimension of 1 (and sometimes two such dimensions!) in a place where I'm not sure I can easily find it.
>
> >>> np.random.random((3,4,5,6,7,8))[True, None, 1].shape
> (1, 1, 4, 5, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[True, None, :, [[0,0],[0,0]], 1, :].shape
> (2, 2, 1, 3, 6, 7, 8)
> >>> np.random.random((3,4,5,6,7,8))[True, None, :, [[0,0],[0,0]], :, 1].shape
> (2, 2, 1, 3, 5, 7, 8)
>
> any better ideas?

We have proposed `arr.vindex[...]` to do this, and there is a pure Python implementation of it out there; I think it may be linked here somewhere: https://github.com/numpy/numpy/pull/6256

There is a way that will generally work using triple indexing:

arr[..., None, None][orig_indx * (slice(None), np.array(0))][..., 0]

The first and last indexing operations are just view creations, so they are basically no-ops. Now doing this gives me the shivers, but it will always work. If you want no-copy behaviour in case your original index is not an advanced indexing operation, you should replace the np.array(0) with just 0.

- Sebastian
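Spelled out on the example from the top of the thread (with `+` in place of the `*`, as corrected in the next message). One caveat of mine, not stated above: `orig_indx` has to be padded with full slices up to the array's original ndim, so that the trailing `np.array(0)` lands on the dummy axis:

import numpy as np

x = np.random.random((3, 4, 5, 6, 7, 8))
# pad the index to x.ndim before appending the dummy-axis index
orig_indx = (slice(None), [[0, 0], [0, 0]], 1) + (slice(None),) * 3
out = x[..., None, None][orig_indx + (slice(None), np.array(0))][..., 0]
out.shape                              # (2, 2, 3, 6, 7, 8): result dims in front
x[:, [[0, 0], [0, 0]], 1, :].shape     # (3, 2, 2, 6, 7, 8) for comparison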
Re: [Numpy-discussion] Forcing new dimensions to appear at front in advanced indexing
On Wed, 2018-06-20 at 09:15 -0400, Michael Lamparski wrote:
> > There is a way that will generally work using triple indexing:
> >
> > arr[..., None, None][orig_indx + (slice(None), np.array(0))][..., 0]
>
> Impressive! (note: I fixed the * typo in the quote)
>
> > The first and last indexing operations are just view creations, so they are basically no-ops. Now doing this gives me the shivers, but it will always work. If you want no-copy behaviour in case your original index is not an advanced indexing operation, you should replace the np.array(0) with just 0.
>
> I agree about the shivers, but any workaround is good to have nonetheless.
>
> If the index is not an advanced indexing operation, does it not suffice to simply apply the index tuple as-is?

Yes; with the `np.array(0)`, however, the result will be forced to be a copy and not a view into the original array. When writing the line first, I thought of "force advanced indexing", though there is likely no reason for that. If you replace it with 0, the result will be an identical view when the index is not advanced (with only a tiny bit of call overhead). So it might be nice to just use 0, since if your index is advanced indexing, there is no difference between the two. And then you do not have to check whether advanced indexing is going on at all.

Btw. if you want to use it for an object, I might suggest to actually use:

object.vindex[...]

notation for this logic (requires a slightly annoying helper class). The NEP is basically just a draft/proposal status, but xarray is already using that indexing method/property IIRC, so that name is relatively certain by now. I frankly am not sure right now if the vindex proposal was with a forced copy or not; probably it was.

- Sebastian

> Michael
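A hedged sketch of such a helper class, with hypothetical names, no Ellipsis/newaxis handling, and the semantics of the triple-indexing workaround above:

import numpy as np

class _VIndex:
    def __init__(self, arr):
        self.arr = arr

    def __getitem__(self, indx):
        if not isinstance(indx, tuple):
            indx = (indx,)
        # pad with full slices, then use the dummy-axes trick so the
        # broadcast dimensions always end up at the front
        indx = indx + (slice(None),) * (self.arr.ndim - len(indx))
        arr = self.arr[..., None, None]
        return arr[indx + (slice(None), np.array(0))][..., 0]

class VArray:
    """Minimal wrapper exposing arr.vindex[...]."""
    def __init__(self, arr):
        self.arr = np.asarray(arr)

    @property
    def vindex(self):
        return _VIndex(self.arr)

x = VArray(np.random.random((3, 4, 5, 6, 7, 8)))
x.vindex[:, [[0, 0], [0, 0]], 1, :].shape   # (2, 2, 3, 6, 7, 8)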
Re: [Numpy-discussion] Remove sctypeNA and typeNA from numpy core
On Thu, 2018-06-21 at 09:25 -0700, Matti Picus wrote:
> numpy.core has many ways to catalogue dtype names: sctypeDict, typeDict (which is precisely sctypeDict), typecodes, and typename. We also generate sctypeNA and typeNA but, as issue 11241 shows, it is sometimes wrong. They are also not documented and never used inside numpy. Instead of fixing it, I propose to remove sctypeNA and typeNA.

Sounds like a good idea; we have too much stuff in there, and this one is not even useful (I bet the NA is for the missing value support that never happened). Might be good to do a quick deprecation anyway though, mostly out of principle.

- Sebastian

> Any thoughts or objections?
> Matti
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 17:30 +1000, Andrew Nelson wrote:
> On Tue, 26 Jun 2018 at 17:12, Eric Wieser wrote:
> > > I don't think it should be relegated to the "officially discouraged" ghetto of `.legacy_index`
> >
> > The way I read it, the new spelling of that would be the explicit but not discouraged `image.vindex[rr, cc]`.
>
> If I'm understanding correctly, what can be achieved now by `arr[rr, cc]` would have to be modified to use `arr.vindex[rr, cc]`, which is a very large change in behaviour. I suspect that there are a lot of situations out there which use `arr[idxs]` where `idxs` can mean one of a range of things depending on the code path followed. If any of those change, or a mix of nomenclatures is required to access the different cases, then havoc will probably ensue.

Yes, that is true, but I doubt you will find a lot of code paths that need the current indexing as opposed to vindex here, and the idea was to have a method to get the old behaviour indefinitely. You will need to add the `.vindex`, but that should be the only code change needed, and it would be easy to find where with errors/warnings. I see a possible problem with code that has to work on different numpy versions, but that only means we need to delay deprecations.

The only thing I could imagine where this might happen is if you forward someone else's indexing objects and different users are used to different results. Otherwise, there is mostly one case which would get annoying, and that is `arr[:, rr, cc]`, since `arr.vindex[:, rr, cc]` would not be exactly the same. Because, yes, in some cases the current logic is convenient, just incredibly surprising as well.

- Sebastian
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 01:21 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 12:58 AM Sebastian Berg wrote:
> >
> > Yes, that is true, but I doubt you will find a lot of code paths that need the current indexing as opposed to vindex here,
>
> That's probably true! But I think it's besides the point. I'd wager that most code paths that will use .vindex would work perfectly well with current indexing, too. Most of the time, people aren't getting into the hairy corners of advanced indexing.

Right, the proposal was to have DeprecationWarnings when they differ; now I also thought DeprecationWarnings on two advanced indexes in general are good, because that is good for new users. I have to agree with your argument that most of the confused users should be running into broadcast errors (if they expect oindex vs. fancy). So I see this as a point that we likely should just limit ourselves, at least for now, to the cases with, for example, sudden transposing going on.

However, I would like to point out that the reason for the broader warnings is that they could allow changing normal indexing at some point. It also decreases traps with array-likes that behave differently.

> Adding to the toolbox is great, but I don't see a good reason to take out the ones that are commonly used quite safely.
>
> > and the idea was to have a method to get the old behaviour indefinitely. You will need to add the `.vindex`, but that should be the only code change needed, and it would be easy to find where with errors/warnings.
>
> It's not necessarily hard; it's just churn for no benefit to the downstream code. They didn't get a new feature; they just have to run faster to stay in the same place.

So, yes, it is annoying for quite a few projects that correctly use fancy indexing, but if we choose to not annoy you a little, we will have much fewer long-term options, which also includes such projects' compatibility with new/current array-likes. So basically one point is: if we annoy scikit-image now, their code will hopefully work better for dask arrays in the future.

- Sebastian
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 04:23 -0400, Hameer Abbasi wrote:
> > Boolean indices are not supported. All indices must be integers, integer arrays or slices.
>
> I would hope that there's at least some way to do boolean indexing. I often find myself needing it. I realise that `arr.vindex[np.nonzero(boolean_idx)]` works, but it is slightly too verbose for my liking. Maybe we can have `arr.bindex[boolean_index]` as an alias for exactly that?

That part is limited to `vindex` only. A single boolean index would always work in plain indexing, and you can mix it all up inside of `oindex`. But with fancy indexing, mixing boolean + integer currently seems pretty much useless (and thus the same is true for `vindex`; in `oindex` things make sense). Now you could invent some new logic for such a mixing case in `vindex`, but it seems easier to just ignore it for the moment.

- Sebastian

> Or is boolean indexing preserved as-is in the newest proposal? If so, great!
>
> Another thing I'd say is `arr.?index` should be replaced with `arr.?idx`. I personally prefer `arr.?x` for my fingers, but I realise that for someone not super into NumPy indexing, this is kind of opaque to read, so I propose this less verbose but hopefully equally clear version, for my (and others') brains.
>
> Best Regards,
> Hameer Abbasi
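PS: For reference, the `np.nonzero` spelling mentioned above works with plain indexing today; a `bindex`-style alias would just be sugar for it:

    import numpy as np

    arr = np.arange(12).reshape(3, 4)
    mask = arr > 5

    arr[mask]              # boolean indexing: array([ 6,  7,  8,  9, 10, 11])
    arr[np.nonzero(mask)]  # equivalent integer-index form, same result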
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 02:27 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 1:36 AM Sebastian Berg s.net> wrote:
> > [...]
> > However, I would like to point out that the reason for the broader warnings is that they could allow changing normal indexing at some point.
>
> I don't really understand this. You would discourage the "normal" syntax in favor of these more specific named syntaxes, so you can introduce different behavior for the "normal" syntax and encourage everyone to use it again? Just add more named syntaxes if you want new behavior! That's the beauty of the design underlying this NEP.
>
> > Also, they decrease traps with array-likes that behave differently.
>
> If we were to take this seriously, then no one should use a bare [] ever.
>
> I'll go on record as saying that array-likes should respond to `a[rr, cc]`, as in Juan's example, with the current behavior. And if they don't, they don't deserve to be operated on by skimage functions.
>
> If I'm reading the NEP correctly, the main thrust of the issue with array-likes is that it is difficult for some of them to implement the full spectrum of indexing possibilities. This NEP does not actually make it *easier* for those array-likes to implement every possibility. It just offers some APIs that more naturally express common use cases, which can sometimes be implemented more naturally than if expressed in the current indexing. For instance, you can achieve the same effect as orthogonal indexing with the current implementation, but you have to manipulate the indices before you pass them over to __getitem__(), losing information along the way that could be used to make a more efficient lookup in some array-likes.
>
> The NEP design is essentially more of a way to give these array-likes standard places to raise NotImplementedError than it is to help them get rid of all of their NotImplementedErrors. More specifically, if these array-likes can't implement `a[rr, cc]`, they're not going to implement `a.vindex[rr, cc]`, either.
>
> I think most of the problems that caused these libraries to make different choices in their __getitem__() implementation are due to the fact that these expressive APIs didn't exist, so they had to shoehorn them into __getitem__(); orthogonal indexing was too useful and efficient not to implement!
> I think that once we have .oindex and .vindex out there, they will be able to clean up their __getitem__()s to consistently support whatever of the current behavior they can and raise NotImplementedError where they can't.

Right, it helps mostly to be clear about what an object can and cannot do. So h5py or whatever could error out for plain indexing and only support `.oindex`, and we would have all options cleanly available. And yes, I agree that this in itself is a big step forward.

The thing is, there are also very strong opinions that the fancy indexing behaviour is so confusing that it would ideally not be the default, since it breaks the analogy with slice objects. So, personally, I would argue that if we were to start over from scratch, fancy indexing (with multiple indexes) would not be the default plain indexing behaviour. Now, maybe the pain of a few warnings is too high, but if we wish to move, no matter how slowly, in that direction, we will have to swallow some of that pain.
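As a small illustration of the index manipulation Robert mentions (this is today's API; `np.ix_` precomputes the open-mesh index tuple that `.oindex` would spell directly):

    import numpy as np

    a = np.arange(12).reshape(3, 4)
    rows = np.array([0, 2])
    cols = np.array([1, 3])

    a[np.ix_(rows, cols)]  # outer/orthogonal result: [[ 1,  3], [ 9, 11]]
    a[rows, cols]          # plain fancy indexing pairs them: [ 1, 11]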
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 04:01 -0400, Hameer Abbasi wrote:
> I second this design. If we were to consider the general case of a tuple `idx`, then we'd not be moving forward at all. Design changes would be impossible. I'd argue that this newer model would be easier for library maintainers overall (who are the kind of people using this), reducing maintenance cost in the long run because it'd lead to simpler code.
>
> I would also say that the "internal" classes expressing outer and vectorised indexing etc. should be exposed, for maintainers of duck arrays to use. God knows how many utility functions I've had to write to avoid relying on undocumented NumPy internals for pydata/sparse, fearing that I'd have to rewrite/modify them when behaviour changes or I find other corner cases.

Could you list some examples of what you would need? We can expose some of the internals, or maybe even provide funcs to map e.g. oindex to vindex, or vindex to plain indexing, etc., but it would be helpful to know what downstream actually might need. For all I know, the things that you are thinking of may not even exist...

- Sebastian

> Best Regards,
> Hameer Abbasi
>
> > On 26. Jun 2018 at 09:46, Robert Kern wrote:
> >
> > On Tue, Jun 26, 2018 at 12:13 AM Eric Wieser il.com> wrote:
> > > I don't think it should be relegated to the "officially discouraged" ghetto of `.legacy_index`
> > >
> > > The way I read it, the new spelling of that would be the explicit but not discouraged `image.vindex[rr, cc]`.
> >
> > Okay, I missed that the first time through. I think having more self-contained descriptions of the semantics of each of these would be a good idea. The current description of `.vindex` spends more time talking about what it doesn't do, compared to the other methods, than what it does.
> >
> > Some more typical, less-exotic examples would be a good idea.
> >
> > > I would reserve warnings for the cases where the current behavior is something no one really wants, like mixing slices and integer arrays.
> > >
> > > These are the cases that would only be available under `legacy_index`.
> >
> > I'm still leaning towards not warning on current, unproblematic common uses. It's unnecessary churn for currently working, understandable code. I would still reserve warnings and deprecation for the cases where the current behavior gives us something that no one wants. Those are the real traps that people need to be warned away from.
> >
> > If someone is mixing slices and integer indices, that's a really good sign that they thought indexing behaved in a different way (e.g. orthogonal indexing).
> >
> > If someone is just using multiple index arrays that would currently not give an error, that's actually a really good sign that they are using it correctly and are getting the semantics that they desired. If they wanted orthogonal indexing, it is *really* likely that their index arrays would *not* broadcast together. And even if they did, the wrong shape of the result is one of the more easily noticed things. These are not silent errors that would motivate adding a new warning.
> > --
> > Robert Kern
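As one concrete sketch of such a helper (the function names are made up, but `np.ix_` and `np.broadcast` are existing API), a duck-array author could map an outer index to the equivalent fancy one, or check whether fancy indices broadcast at all, which is the hint Robert mentions above:

    import numpy as np

    def outer_to_fancy(*indices):
        # Hypothetical utility: outer (orthogonal) integer indices to
        # the equivalent plain/fancy index tuple.
        return np.ix_(*indices)

    def broadcasts_together(*indices):
        # Hypothetical utility: would these fancy indices broadcast?
        try:
            np.broadcast(*indices)
            return True
        except ValueError:
            return False

    broadcasts_together(np.zeros(3, int), np.zeros(4, int))       # False
    broadcasts_together(np.zeros(3, int), np.zeros((4, 1), int))  # True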
Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing
On Tue, 2018-06-26 at 22:26 -0700, Robert Kern wrote:
> On Tue, Jun 26, 2018 at 10:21 PM Juan Nunez-Iglesias com> wrote:
> > Let me start by thanking Robert for articulating my viewpoints far better than I could have done myself. I want to explicitly flag the following statements for endorsement:
> >
> > > I would still reserve warnings and deprecation for the cases where the current behavior gives us something that no one wants. Those are the real traps that people need to be warned away from.
> > > In the post-NEP .oindex/.vindex order, everyone can get the behavior that they want. Your argument for deprecation is now just about what the default is, the semantics that get pride of place with the shortest spelling. I am sympathetic to the feeling like you wish you had a time machine to go fix a design with your new insight. But it seems to me that just changing which semantics are the default has relatively attenuated value, while breaking compatibility for a fundamental feature of numpy has significant costs. Just introducing .oindex is the bulk of the value of this NEP. Everything else is window dressing.
> > > If someone is mixing slices and integer indices, that's a really good sign that they thought indexing behaved in a different way (e.g. orthogonal indexing).
> >
> > I would offer the exception of trailing slices to this statement, though:

OK, sounds fine to me. I see that we just can't start planning for a possible long-term future yet. I personally do not care much what the warnings themselves say for now (Deprecation or not); larger packages will have to avoid them in any case. But I guess we have a consensus on a certain amount of warnings (we will probably have to see how often they actually appear) and can then revisit in a while.

- Sebastian

> > In [1]: from skimage import data
> > In [2]: astro = data.astronaut()
> > In [3]: astro.shape
> > Out[3]: (512, 512, 3)
> >
> > In [4]: rr, cc = np.array([1, 3, 3, 3]), np.array([1, 8, 9, 10])
> > In [5]: astro[rr, cc].shape
> > Out[5]: (4, 3)
> >
> > In [6]: astro[rr, cc, :].shape
> > Out[6]: (4, 3)
> >
> > This does exactly what I would expect.
>
> Yup, sorry, I didn't mean those. I meant when there is an explicit slice in between index arrays. (And maybe when index arrays follow slices; I'll need to think more on that.)
>
> > Going back to the motivation for the NEP, I think this bit, emphasis mine, is crucial:
> >
> > > > the existing rules for advanced indexing with multiple array indices are typically confusing to both new, **and in many cases even old,** users of NumPy
> >
> > I think it is ok for advanced indexing to be accessible to advanced users. I remember that it took me quite a while to grok NumPy advanced indexing, but once I did I just loved it.
> >
> > I also like that this syntax translates perfectly from integer indices to float coordinates in `ndimage.map_coordinates`.
> >
> > > I'll go on record as saying that array-likes should respond to `a[rr, cc]`, as in Juan's example, with the current behavior. And if they don't, they don't deserve to be operated on by skimage functions.
> >
> > (I don't think of us highly enough to use the word "deserve", but I would say that we would hesitate to support arrays that don't use this convention.)
>
> Ahem, yes, I was being provocative in a moment of weakness. May the array-like authors forgive me.
Re: [Numpy-discussion] update to numpy-1.15.0 gives new warnings from scipy
On Wed, 2018-07-25 at 07:44 -0400, Neal Becker wrote:
> After update to numpy-1.15.0, I'm getting warnings from scipy. These probably come from my code using convolve. Does scipy need updating?

Probably yes; I am a bit surprised we did not notice it before, if it is in scipy (or maybe scipy is already fixed?). This may be one of the more controversial new warnings, so let's see if it comes up more. Right now it seems not to affect much, I guess.

If the correct thing to do is to use the list as an array, then the easiest solution may be:

    z[index,] = x  # note the additional `,`
    # or alternatively of course:
    z[np.asarray(index)] = x

Otherwise, you will have to use `tuple(index)` to make sure numpy interprets it as a multi-dimensional index. The problem this solves is that with `z[some_list]`, numpy currently has to guess whether you want a multi-dimensional index or not.

- Sebastian

> /home/nbecker/.local/lib/python3.6/site-packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
>   z[index] = x
> /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:491: FutureWarning: [same message]
>   return x[reverse].conj()
> /home/nbecker/.local/lib/python3.6/site-packages/scipy/signal/signaltools.py:251: FutureWarning: [same message]
>   in1zpadded[sc] = in1.copy()
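For reference, a minimal sketch of the ambiguity this warning is about:

    import numpy as np

    z = np.zeros((4, 4))
    index = [2, 3]            # a list: the element z[2, 3], or rows 2 and 3?

    z[tuple(index)] = 1       # unambiguous: the single element z[2, 3]
    z[np.asarray(index)] = 2  # unambiguous: fancy-index rows 2 and 3
    # z[index] = 3            # numpy currently guesses (tuple-like) and
    #                         # emits the FutureWarning quoted above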
Re: [Numpy-discussion] Roadmap proposal, v3
On Thu, 2018-08-02 at 05:47 -0700, Ralf Gommers wrote:
> On Tue, Jul 24, 2018 at 12:04 PM, Stefan van der Walt ey.edu> wrote:
> > Hi everyone,
> >
> > Please take a look at the latest roadmap proposal:
> >
> > https://github.com/numpy/numpy/pull/11611
> >
> > This is a living document, so it can easily be modified in the future, but we'd like to get in place a document that corresponds fairly closely with current community priorities.
>
> The green button was pressed; the roadmap is now live on http://www.numpy.org/neps/. Thanks all!

Great, I hope we can check off some of them soon! :)

- Sebastian

> Cheers,
> Ralf
Re: [Numpy-discussion] Adoption of a Code of Conduct
On Thu, 2018-08-02 at 12:04 +0200, Sylvain Corlay wrote:
> The "political belief" clause was recently removed from the Jupyter CoC.
>
> One reason for this decision is that racism and sexism are increasingly presented as mainstream "political beliefs", and we wanted to make it clear that people can still be sanctioned for e.g. sexist or racist behavior when engaging with the project (at events, on the mailing list or GitHub...) even if their racism or sexism corresponds to a "political belief".
>
> It is still not OK for people to be excluded or discriminated against because of their political affiliation. The CoC statement reads "This includes, but is not limited to...". Also, we don't wish to prioritize or elevate members of a particular political belief to the same level as members of the examples remaining in the document. Ultimately, the CoC committee uses their own judgement to assess reports and the appropriate response.

TL;DR: I don't think it matters. As for the CoC as such, it seems fine to me; let's just put it in and be done with it. I do not think we should have a long discussion (about that list), and in case it might go there, I would suggest we try to find a way to refuse to have it, maybe by letting the committee that is named in the CoC decide. Actually: I am good with the people currently listed for SciPy, if they will do it; or does anyone else want to jump in?

I won't really follow the discussion much more (except for reading) and do not feel like I really know enough about CoCs, but my point is that I do not care much. The CoC as suggested seems pretty uncontroversial to me (it does not draw any hard lines worth fighting over). And that is probably the only current belief I have: that I think it should not really draw those lines.

Political opinion being included or not? I am not sure I care, because as I read it, and as you point out, it does not really matter whether or not it is included; including it would just raise awareness for a specific issue. This is not about the freedom to express political beliefs (on numpy channels). I suppose there may be a point where even a boycott can be discriminatory and things may be tricky to assess [1], but again, those cases need careful weighing (by the committee mostly). A CoC might bias this a little, but not much, and if we decide which way to bias it we might end up fighting, so let's refuse to do it outside specific cases?

Freedom of expression is always limited by the protection of other individuals' rights (note that I believe in the US this freedom tends to be held very high when weighing the two). But since there is normally no reason for voicing political opinions on numpy, it seems obvious to me that it will tend to lose when weighed against the other person's rights being protected [2]. Weighing different "rights" is always tricky, but it cannot be avoided or really formalized too much IMO [3,4].

Which comes to the point that I think the list is there to raise awareness for, and be welcoming to, specific people (either very general or minority groups) who have in the past (or currently) not felt welcome. And such a list will always be set in the current time/mentality. We are maybe in an odd spot where political discussion/judicial progress feels like it is lagging behind social development (and some fronts are hardening :(), which makes things a bit trickier. Overall, all it would do is maybe suggest that "political opinion" is currently not something that needs specially raised awareness.
It does not mean this defines a "bias", nor that the list cannot change at some point. Either way, I do not read the list as giving any additional protection for *voicing* your opinion. In fact, I would argue the opposite may be the case: if you voice it, you make those of the opposite (political) opinion feel less welcome, and since there is no reason for voicing a political opinion *on a numpy channel*, when weighing those against each other it seems like a hard case [5].

At some point lines may have to be drawn (and drawing them once does not set them in stone for the next time!). I do not think we draw, or should draw, them (much) with this statement itself; the statement says that they will be drawn if and when necessary, and then it will be done carefully. Plus, it generally raises awareness and gives a bit of guidance. It seems to me that this may be the actual point of contention in many of those other discussions: not so much the wording, but how exactly lines were drawn in practice.

Sure, we probably set a bit of bias with the list, but I doubt it is enough to fight over. And hopefully we can avoid a huge discussion :) (for now it looks like it).

Best,

Sebastian

PS: I do not mind synchronizing numpy and scipy (or numpy and Jupyter, or all three) as much as possible. I guess you could sum it up to, maybe I am even
Re: [Numpy-discussion] Taking back control of the #numpy irc channel
On Mon, 2018-08-06 at 21:52 -0700, Ralf Gommers wrote:
> On Mon, Aug 6, 2018 at 7:15 PM, Nathan Goldbaum m> wrote:
> > Hi,
> >
> > I idle in #scipy and have op in there. I'm happy to start idling in #numpy and be op if the community is willing to let me.
>
> Thanks Nathan. Sounds useful.

Sounds good. I haven't really hung out there for a long time (frankly, I never hung out in #numpy, I thought people just use #scipy).

Can we just give a few names (such as Matti, Nathan, maybe me, anyone else right now?) and add others later ourselves? I can get in contact with freenode (unless someone already did).

> There's also a Gitter numpy channel. AFAIK few/none core devs are regularly active on either IRC or Gitter. I would suggest that we document both these channels as community-run at https://scipy.org/scipylib/mailing-lists.html, and give Nathan and others who are interested the permissions they need.

Yeah, the gitter seems pretty inactive as well. But I guess it doesn't hurt to mention them.

- Sebastian

> I think our official recommendation for usage questions is StackOverflow.
>
> Cheers,
> Ralf
>
> > I'm also in the process of getting ops for #matplotlib for similar spam-related reasons. I'd say all the scientific python IRC channels I'm in get a decent amount of traffic (perhaps 10% of the number of questions that get asked on StackOverflow) and it's a good venue for asking quick questions. Let's hope that forcing people to register doesn't kill that, although there's not much we can do given the spam attack.
> >
> > Nathan
> >
> > On Mon, Aug 6, 2018 at 9:03 PM Matti Picus wrote:
> > > Over the past few days spambots have been hitting freenode's IRC channels [0, 1]. It turns out the #numpy channel has no operator, so we cannot make the channel mode "+q $~a" [2], i.e. only registered freenode users can talk but anyone can listen.
> > >
> > > I was in touch with the freenode staff, they requested that someone from the steering council reach out to them at proje...@freenode.net, here is the quote from the discussion:
> > >
> > > "it's pretty much a matter of them sending an email telling us who they'd like to represent them on freenode, which channels and cloak namespaces they want, and any info we might need on the project"
> > >
> > > In the mean time they set the channel mode appropriately, so this is also a notice that if you want to chat on the #numpy IRC channel you need to register.
> > >
> > > Hope someone from the council picks this up and reaches out to them, and will decide who is able to become channel operators (the recommended practice is to use it like sudo, only assume the role when needed, then turn it back off).
> > > Matti
> > >
> > > [0] https://freenode.net/news/spambot-attack
> > > [1] https://freenode.net/news/spam-shake
> > > [2] https://nedbatchelder.com/blog/201808/fighting_spam_on_freenode.html
Re: [Numpy-discussion] Taking back control of the #numpy irc channel
On Tue, 2018-08-07 at 22:07 -0700, Ralf Gommers wrote:
> On Tue, Aug 7, 2018 at 4:34 AM, Sebastian Berg s.net> wrote:
> > [...]
> > Can we just give a few names (such as Matti, Nathan, maybe me, anyone else right now?) and add others later ourselves? I can get in contact with freenode (unless someone already did).
>
> Thanks Sebastian. Go ahead I'd say.

Will do. Just realized while looking at it: the Steering Council list, etc., gives names, but not email addresses (or PGP keys). I do not remember, was that intentional or not? Also, I am not sure if the steering council email address is published anywhere. IIRC it was possible for anyone to send an email to it (OTOH, it would be nice to not catch spam there, so maybe it is fine to ask for the address first).

- Sebastian

> Ralf
> [...]
Re: [Numpy-discussion] Taking back control of the #numpy irc channel
On Wed, 2018-08-08 at 08:55 -0700, Ralf Gommers wrote:
> On Wed, Aug 8, 2018 at 1:23 AM, Sebastian Berg s.net> wrote:
> > [...]
> > Will do. Just realized while looking at it: the Steering Council list, etc., gives names, but not email addresses (or PGP keys). I do not remember, was that intentional or not?
>
> I have a vague memory of that being intentional, but not sure. I don't mind making email addresses public; they can be found from git commit logs and mailing lists anyway, so why make life difficult for whomever wants to reach us.

Yeah, well, I find PGP keys a good idea, even if they might go out of date once in a while. That means that if someone wants to check, you can easily sign an email, and they can be reasonably sure that you have some sway in NumPy (right now freenode checked by seeing that I have rights on github, which is not really ideal).

On a general note about IRC: we have claimed #numpy now. If anyone wants anything #numpy-related on IRC (new channels, cloak namespaces, ...), please contact me or Matti (I assume you are happy with that role!). If someone is unhappy with us two being the main contacts/people who have those rights on freenode, also contact us so we can get it changed.

- Sebastian

> > Also, I am not sure if the steering council email address is published anywhere. IIRC it was possible for anyone to send an email to it (OTOH, it would be nice to not catch spam there, so maybe it is fine to ask for the address first).
>
> Google's spam filters are pretty good. For the record, it is numpy-steering-coun...@googlegroups.com
>
> Cheers,
> Ralf
> [...]
Re: [Numpy-discussion] Stacklevel for warnings.
On Fri, 2018-08-10 at 16:05 -0600, Charles R Harris wrote:
> Hi All,
>
> Do we have a policy for the stacklevel that should be used in NumPy? How far back should the stack be displayed? I note that the optimum stacklevel may vary between users and developers.

I thought the idea was that it should point to the correct user line (or tend to point there). So stacklevel=2 for exposed functions, and higher when private (python) helpers are in between, IIRC. As for developers, I would hope they are OK with turning the warning into an error (and know how to).

Not sure we discussed it much; I seem to have a vague memory of asking if we are sure this is what we want, and at least Ralf agreeing. Also, I don't know how consistent it is overall.

- Sebastian

> Chuck
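For illustration, a minimal sketch of that convention (plain Python `warnings`, nothing NumPy-specific):

    import warnings

    def public_func():
        # Called directly by the user: stacklevel=2 points at their line.
        warnings.warn("public_func is deprecated", DeprecationWarning,
                      stacklevel=2)

    def _private_helper():
        # One frame deeper, so point one level further up the stack.
        warnings.warn("old behaviour is deprecated", DeprecationWarning,
                      stacklevel=3)

    def public_wrapper():
        _private_helper()

    public_func()     # warning is attributed to this line
    public_wrapper()  # and this one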
Re: [Numpy-discussion] Stacklevel for warnings.
On Sat, 2018-08-11 at 11:11 -0700, Ralf Gommers wrote:
> On Sat, Aug 11, 2018 at 1:22 AM, Sebastian Berg ns.net> wrote:
> > [...]
>
> That sounds right to me. I think when it was introduced it was quite consistent, because Sebastian replaced warning filters everywhere with suppress_warnings. Would be good to document this in the devguide.

Yeah, probably reasonably consistent, but I only added a test to check that the stacklevel argument is never missing entirely; it is up to the author to figure out the right level (or the best easily achievable one, since sometimes it would be pretty ugly to make it always right). The warning testing (suppress_warnings, etc.) never actually checks the stacklevel as far as I am aware (or maybe I forgot :)); that could be something to think about though. I guess we did it around the same time as the general warning-testing cleanup, probably.

- Sebastian

> Ralf
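For reference, a rough sketch of what a test using `suppress_warnings` looks like (`old_func` is a made-up stand-in for a deprecated function):

    import warnings
    from numpy.testing import suppress_warnings

    def old_func(x):
        warnings.warn("old_func is deprecated", DeprecationWarning,
                      stacklevel=2)
        return 2 * x

    with suppress_warnings() as sup:
        log = sup.record(DeprecationWarning, "old_func is deprecated")
        assert old_func(2) == 4
        assert len(log) == 1  # the warning fired but did not escape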
Re: [Numpy-discussion] Adoption of a Code of Conduct
On Tue, 2018-08-14 at 21:30 -0700, Ralf Gommers wrote:
> On Fri, Aug 3, 2018 at 1:02 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
> > On Fri, Aug 3, 2018 at 1:45 PM, Peter Creasey <p.e.creasey...@googlemail.com> wrote:
> > > +1 for keeping the same CoC as Scipy, making a new thing just seems a bigger surface area to maintain. Personally I already assumed Scipy's "honour[ing] diversity in..." did not imply any protection of behaviours that violate the CoC *itself*, but if you wanted to be really explicit you could add "to the extent that these do not conflict with this code of conduct." to that line.
> >
> > I prefer that to the proposed modification, short and sweet.
>
> This edit to the SciPy CoC has now been merged.
>
> It looks to me like we're good to go here and take over the SciPy CoC.

Sounds good, so +1. I am happy with the committee as well, and I guess most/all are, but we might want to discuss it separately?

- Sebastian

> Cheers,
> Ralf
Re: [Numpy-discussion] count_nonzero axis argument?
On Mon, 2018-09-17 at 12:37 +0100, Matthew Brett wrote:
> Hi,
>
> Is there any reason that np.count_nonzero should not take an axis argument? As in:
>
> >>> np.better_count_nonzero([[10, 11], [0, 3]], axis=1)
> array([2, 1])
>
> It would be much more useful if it did...
>
> Cheers,
> Matthew

No, sounds like an obvious improvement, but as always with these things, someone has to volunteer to do it... Coding it will probably mean using NpyIter and possibly adding fast paths (I am not sure about the current state of count_nonzero), but it should not be very difficult.

- Sebastian
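For reference, the requested behaviour and an equivalent reduction (if I recall correctly, `np.count_nonzero` did grow an `axis` argument in NumPy 1.12, so the direct spelling works on sufficiently new releases):

    import numpy as np

    a = np.array([[10, 11], [0, 3]])

    (a != 0).sum(axis=1)          # array([2, 1]), works on any version
    np.count_nonzero(a, axis=1)   # array([2, 1]) on NumPy >= 1.12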
Re: [Numpy-discussion] Exact semantics of ufunc.reduce
On Fri, 2018-10-12 at 17:34 +0200, Hameer Abbasi wrote:
> Hello!
>
> I'm trying to investigate the exact way ufunc.reduce works when given a custom dtype. Does it cast before or after the operation, or somewhere in between? How does this differ from ufunc.reduceat, for example?

I am not 100% sure, but I think giving the dtype casts the output type, and since most ufunc loops are defined as "ff->f", etc., that effectively casts the input as well. It might be that it casts the input specifically, but I doubt it. The cast will occur within the buffering machinery, so it is only done in small chunks. But the operation itself should be performed using the given dtype.

- Sebastian

> We ran into this issue in pydata/sparse#191 when trying to match the two, where the only thing differing is the number of zeros for sum, which shouldn't change the result.
>
> Best Regards,
> Hameer Abbasi
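A small sketch illustrating the difference (the exact low-order bits are platform dependent, but the two spellings genuinely differ):

    import numpy as np

    a = np.full(10, 0.1, dtype=np.float32)

    # dtype= selects the "dd->d" loop, so the accumulation itself runs
    # in float64; the float32 inputs are cast chunk-wise while buffering.
    np.add.reduce(a, dtype=np.float64)

    # versus accumulating in float32 and casting only the final result:
    np.add.reduce(a).astype(np.float64)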
Re: [Numpy-discussion] Removing priority labels from github
On Fri, 2018-10-19 at 11:02 +0300, Matti Picus wrote:
> We currently have highest, high, normal, low, and lowest priority labels for github issues/PRs. At the recent status meeting, we proposed consolidating these to a single "high" priority label. Anything "low" priority should be merged or closed since it will be quickly forgotten, and no "normal" tag is needed.
>
> With that, we (the BIDS team) would like to encourage reviewers to use the "high" priority tag to indicate things we should be working on.
>
> Any objections or thoughts?

Sounds like a plan; having practically meaningless tags, as we do right now, is no help. Most of them are historical, and personally I have only (very occasionally) been using the milestones to tag things as high priority.

- Sebastian

> Matti (in the names of Tyler and Stefan)