On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:
> On Wed, Jun 4, 2014 at 7:29 PM, Travis Oliphant <tra...@continuum.io> wrote:
>>
>> Believe me, I'm all for incremental changes if it is actually possible and doesn't actually cost more. It's also why I've been silent until now about anything we are doing being a candidate for a NumPy 2.0. I understand the challenges of getting people to change. But, features and solid improvements *will* get people to change --- especially if their new library can be used along with the old library and the transition can be done gradually. Python 3's struggle is the lack of features.
>>
>> At some point there *will* be a NumPy 2.0. What features go into NumPy 2.0, how much backward compatibility is provided, and how much porting is needed to move your code from NumPy 1.X to NumPy 2.X is the real user question --- not whether it is characterized as "incremental" change or "re-write". What I call a re-write and what you call an "incremental change" are two points on a spectrum and likely overlap significantly if we really compared what we are thinking about.
>>
>> One huge benefit that came out of the numeric / numarray / numpy transition that we mustn't forget about was the extended buffer protocol and memoryview objects. This really does allow multiple array objects to co-exist, and libraries to use the object that they prefer, in a way that did not exist when numarray / numeric / numpy came out. So, we shouldn't be afraid of that world. The existence of easy package managers to update environments to try out new features, and to have applications on a single system that use multiple versions of the same library, is also something that didn't exist before and will make any transition easier for users.
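[Editor's note: the buffer-protocol interoperability described above can be sketched with stdlib types alone. This is an illustrative example, not NumPy code: an `array.array` exposes its memory through the buffer protocol, and any consumer --- here a `memoryview`, but equally a third-party array object --- can read and write that memory with zero copying.]

```python
import array

# array.array exposes its data through the buffer protocol, so any
# consumer (memoryview here, or a third-party array library) can read
# and even write the same memory without copying it.
buf = array.array('d', [1.0, 2.0, 3.0])
view = memoryview(buf)

view[1] = 20.0          # write through the view...
print(buf.tolist())     # ...and the original object sees the change
# [1.0, 20.0, 3.0]
```

The same mechanism is what lets `numpy.asarray` wrap foreign buffer-exporting objects without a copy.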
>> One thing I regret about my working on NumPy originally is that I didn't have the foresight, skill, and understanding to work more on a better-designed multiple-dispatch system so that multiple array objects could participate together in an expression flow. The __numpy_ufunc__ mechanism gives enough capability in that direction that it may be better now.
>>
>> Ultimately, I don't disagree that NumPy can continue to exist in "incremental" change mode (though if you are swapping out whole swaths of C code for Cython code --- it sounds a lot like a "re-write") as long as there are people willing to put the effort into changing it. I think this is actually benefited by the existence of other array objects that are pushing the feature envelope without the constraints --- in much the same way that the Python standard library benefits from many versions of different capabilities being tried out before moving into the standard library.
>>
>> I remain optimistic that things will continue to improve in multiple ways --- if a little "messier" than any of us would conceive individually. It *is* great to see all the PRs coming from multiple people on NumPy and all the new energy around improving things, whether great or small.
>
> @nathaniel IIRC, one of the objections to the missing values work was that it changed the underlying array object by adding a couple of variables to the structure. I'm willing to do that sort of thing, but it would be good to have general agreement that that is acceptable.

I think changing the ABI for some versions of numpy (2.0, whatever) is acceptable. There is little doubt that the ABI will need to change to accommodate a better and more flexible architecture.
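[Editor's note: the dispatch idea behind the __numpy_ufunc__ mechanism Travis mentions can be sketched in a few lines of plain Python. Every name here (`__toy_ufunc__`, `apply_ufunc`, `MyArray`) is invented for illustration and is not NumPy's actual API: the point is only that, before applying an operation, the ufunc machinery asks each operand whether it wants to handle the call itself, which is what lets foreign array types participate in an expression flow.]

```python
# Toy sketch of ufunc override dispatch; all names are illustrative.

class MyArray:
    def __init__(self, data):
        self.data = list(data)

    # Hook consulted by the dispatcher below.  Returning a result claims
    # the operation; returning NotImplemented declines it.
    def __toy_ufunc__(self, ufunc_name, *inputs):
        if ufunc_name == "add":
            other = inputs[1]
            return MyArray(a + b for a, b in zip(self.data, other.data))
        return NotImplemented

def apply_ufunc(ufunc_name, *inputs):
    # Give each input a chance to override the default implementation.
    for obj in inputs:
        hook = getattr(obj, "__toy_ufunc__", None)
        if hook is not None:
            result = hook(ufunc_name, *inputs)
            if result is not NotImplemented:
                return result
    raise TypeError("no input could handle %r" % ufunc_name)

x = MyArray([1, 2, 3])
y = MyArray([10, 20, 30])
print(apply_ufunc("add", x, y).data)  # [11, 22, 33]
```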
Changing the C API is trickier: I am not up to date on the changes from the last 2-3 years, but at that time most things could have been changed internally without breaking much, though I did not go far enough to estimate what the performance impact would be (if any).

> As to blaze/dynd, I'd like to steal bits here and there, and maybe in the long term base numpy on top of it with a compatibility layer. There is a lot of thought and effort that has gone into those projects and we should use what we can. As is, I think numpy is good for another five to ten years and will probably hang on for fifteen, but it will be outdated by the end of that period. Like great whites, we need to keep swimming just to have oxygen. Software projects tend to be obligate ram ventilators.
>
> The Python 3 experience is definitely something we want to avoid. And while blaze does big data and offers some nice features, I don't know that it offers compelling reasons to upgrade for the more ordinary user at this time, so I'd like to sort of slip it into numpy if possible.
>
> If we do start moving numpy forward in more radical steps, we should try to have some agreement beforehand as to what sort of changes are acceptable. For instance, to maintain backward compatibility, is it sufficient that a recompile will do the job, or do we require forward compatibility for extensions compiled against earlier releases? Do we stay with C, or should we support C++ code with its advantages of smart pointers, exception handling, and templates? We will need a certain amount of flexibility going forward and we should decide, or at least discuss, such issues up front.

Last time the C++ discussion was brought up, no consensus could be reached. I think quite a few radical changes can be made without that consensus already, though others may disagree there.
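[Editor's note: the ABI concern raised above --- that adding "a couple of variables to the structure" breaks extensions compiled against the old layout even though the C API is unchanged --- can be illustrated with `ctypes`. The struct names and fields below are invented for illustration, loosely mimicking an array object gaining a mask field; they are not NumPy's real layout.]

```python
import ctypes

# Two hypothetical versions of a C struct.  An extension compiled
# against V1 bakes in the byte offset of 'data'; under V2 a new field
# is inserted before it, so that offset moves.  Old binaries then read
# the wrong memory (an ABI break), while recompiling against V2's
# headers fixes everything (the C API -- the field names -- is intact).

class ArrayV1(ctypes.Structure):
    _fields_ = [("refcount", ctypes.c_ssize_t),
                ("data", ctypes.c_void_p)]

class ArrayV2(ctypes.Structure):
    _fields_ = [("refcount", ctypes.c_ssize_t),
                ("mask", ctypes.c_void_p),       # new field, e.g. for missing values
                ("data", ctypes.c_void_p)]

print(ArrayV1.data.offset, ArrayV2.data.offset)  # 'data' has moved
```

Appending fields at the end of the struct avoids moving existing offsets, but still changes the struct size, which matters for any extension that allocates or subclasses the object.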
IMO, what is needed most is refactoring the internals to separate the low-level Python C API from the rest of the code, as I think that's the main bottleneck to getting more contributors (or getting new core features in more quickly).

David
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion