On Sat, Jun 1, 2019 at 6:12 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote:
> Hi Ralf,
>
> Despite sharing Nathaniel's doubts about the ease of defining the numpy
> API and the likelihood of people actually sticking to a limited subset of
> what numpy exposes, I quite like the actual things you propose to do!
>
> But my liking it is for reasons that are different from your stated ones:
> I think the proposed actions are likely to greatly benefit both users
> (like Bill above) and current and prospective developers. To me, it seems
> almost a side benefit (if a very nice one) that it might help other
> projects to share an API; a larger benefit may come from tapping into the
> experience of other projects in thinking about what are the true basic
> functions/methods that one should have.
>

Agreed, there is some reverse learning there as well. Projects like Dask
and Xtensor have already gone through making these choices, which can teach
us as NumPy developers some lessons.

> More concretely, to address Nathaniel's (very reasonable) worry about
> ending up wasting a lot of time, I think it may be good to identify smaller
> parts, each of which is useful on its own.
>
> In this respect, I think an excellent place to start might be something
> you are planning already anyway: update the user documentation. Doing this
> will necessarily require thinking about, e.g., what `ndarray` methods and
> properties are actually fundamental, as you only want to focus on a few.
> With that in place, one could then, as you suggest, reorganize the
> reference documentation to put those most important properties up front,
> and ones that we really think are mistakes at the bottom, with explanations
> of why we think so and what the alternative is. Also for the reference
> documentation, it would help to group functions more logically.
>

That's perhaps another rationale for doing this. The docs are likely to get
a fairly major overhaul this year. If we don't write down a coherent plan,
we'll end up making much the same decisions as we would when writing up a
"standard", only ad hoc and with much less review.

> The above could lead to three next steps, all of which I think would be
> useful. First, for (prospective) developers as well as for future
> maintenance, I think it would be quite a large benefit if we (slowly but
> surely) rewrote code that implements the less basic functionality in terms
> of more basic functions (e.g., replace use of `array.fill(...)` or
> `np.copyto(array, ...)` with `array[...] =`).
>

That could indeed be nice. I think Travis referred to this as defining an
"RNumPy" (similar to RPython as a subset of Python).

> Second, we could update Nathaniel's NEP about distinct properties duck
> arrays might want to mimic/implement.
>

I wasn't thinking about that, but agreed that it could be helpful.

> Third, we could actually implement the logical groupings identified in
> the code base (and describe them!). Currently, it is a mess: for the C
> files, I typically have to grep to even find where things are done, and
> while for the functions defined in python files that is not necessary, many
> have historical rather than logical groupings (looking at you,
> `fromnumeric`!), and even more descriptive ones like `shape_base` are
> split over `lib` and `core`. I think it would help everybody if we went to
> a python-like layout, with a true core and libraries such as polynomial,
> fft, ma, etc.
>

I'd really like this. It would also give us a sane namespace in numpy, and
a basis for deciding whether something goes in numpy.lib, the main
namespace, or some other namespace (there are a couple of semi-public ones).

> Anyway, re-reading your message, I realize the above is not really what
> you wrote about, so perhaps this is irrelevant...
>

Not irrelevant, I think you're making some good points.

Cheers,
Ralf
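P.S. To make the first of the three steps concrete, here is a minimal sketch (not taken from the NumPy code base; the array names are made up for illustration) of how `array.fill(...)` and `np.copyto(array, ...)` can, in the simple cases, be expressed with plain `array[...] =` assignment:

```python
import numpy as np

# Filling with a scalar: three spellings of the same operation.
a = np.empty((2, 3))
a.fill(7.0)              # less basic: dedicated method

b = np.empty((2, 3))
np.copyto(b, 7.0)        # less basic: dedicated function

c = np.empty((2, 3))
c[...] = 7.0             # basic: assignment through indexing

# Copying from another array, with broadcasting of the source.
src = np.arange(3.0)     # shape (3,), broadcast against (2, 3)

d = np.empty((2, 3))
np.copyto(d, src)

e = np.empty((2, 3))
e[...] = src
```

The rewrite is probably not purely mechanical in every case: `np.copyto` also takes `casting` and `where` arguments, so a call that relies on those would need more than a one-line replacement.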
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion