On Mon, Sep 2, 2019 at 2:09 PM Nathaniel Smith <[email protected]> wrote:
> On Mon, Sep 2, 2019 at 2:15 AM Hameer Abbasi <[email protected]>
> wrote:
> > Me, Ralf Gommers and Peter Bell (both cc’d) have come up with a
> > proposal on how to solve the array creation and duck array problems.
> > The solution is outlined in NEP-31, currently in the form of a PR, [1]
>
> Thanks for putting this together! It'd be great to have more
> engagement between uarray and numpy.
>
> > ============================================================
> >
> > NEP 31 — Context-local and global overrides of the NumPy API
> >
> > ============================================================
>
> Now that I've read this over, my main feedback is that right now it
> seems too vague and high-level to give it a fair evaluation? The idea
> of a NEP is to lay out a problem and proposed solution in enough
> detail that it can be evaluated and critiqued, but this felt to me
> more like it was pointing at some other documents for all the details
> and then promising that uarray has solutions for all our problems.

This is fair enough I think. We'll need to put some more thought into
where to refer to other NEPs, and where to be more concrete.

> > This NEP takes a more holistic approach: It assumes that there are
> > parts of the API that need to be overridable, and that these will
> > grow over time. It provides a general framework and a mechanism to
> > avoid a design of a new protocol each time this is required.
>
> The idea of a holistic approach makes me nervous, because I'm not sure
> we have holistic problems. Sometimes a holistic approach is the right
> thing; other times it means sweeping the actual problems under the
> rug, so things *look* simple and clean but in fact nothing has been
> solved, and they just end up biting us later. And from the NEP as
> currently written, I can't tell whether this is the good kind of
> holistic or the bad kind of holistic.
>
> Now I'm writing vague handwavey things, so let me follow my own advice
> and make it more concrete with an example :-).
>
> When Stephan and I were writing NEP 22, the single thing we spent the
> most time discussing was the problem of duck-array coercion, and in
> particular what to do about existing code that does
> np.asarray(duck_array_obj).
>
> The reason this is challenging is that there's a lot of code written
> in Cython/C/C++ that calls np.asarray,

Cython code only perhaps? It would surprise me if there's a lot of C/C++
code that explicitly calls into our Python rather than C API.

> and then blindly casts the
> return value to a PyArray struct and starts accessing the raw memory
> fields. If np.asarray starts returning anything besides a real-actual
> np.ndarray object, then this code will start corrupting random memory,
> leading to a segfault at best.
>
> Stephan felt strongly that this meant that existing np.asarray calls
> *must not* ever return anything besides an np.ndarray object, and
> therefore we needed to add a new function np.asduckarray(), or maybe
> an explicit opt-in flag like np.asarray(..., allow_duck_array=True).
>
> I agreed that this was a problem, but thought we might be able to get
> away with an "opt-out" system, where we add an allow_duck_array= flag,
> but make it *default* to True, and document that the Cython/C/C++
> users who want to work with a raw np.ndarray object should modify
> their code to explicitly call np.asarray(obj, allow_duck_array=False).
> This would mean that for a while people who tried to pass duck-arrays
> into legacy library would get segfaults, but there would be a clear
> path for fixing these issues as they were discovered.
>
> Either way, there are also some other details to figure out: how does
> this affect the C version of asarray? What about np.asfortranarray –
> probably that should default to allow_duck_array=False, even if we did
> make np.asarray default to allow_duck_array=True, right?
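For readers following along, the opt-in versus opt-out coercion policies described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions: `asarray_compat`, `DuckArray` and the `__duckarray__` marker method are hypothetical names, and `allow_duck_array` is the flag floated in the quoted discussion, not an existing parameter of np.asarray.

```python
import numpy as np


class DuckArray:
    """Hypothetical duck array used only for this illustration."""

    def __init__(self, data):
        self.data = list(data)

    def __duckarray__(self):
        # Hypothetical marker method: "I can be used as-is by duck-aware code".
        return self

    def __array__(self, dtype=None, copy=None):
        # Standard NumPy coercion hook: produce a real np.ndarray.
        return np.array(self.data, dtype=dtype)


def asarray_compat(obj, allow_duck_array=False):
    """Sketch of a coercion helper with an explicit duck-array flag.

    The default of False models the "opt-in" policy; flipping the
    default to True would model the "opt-out" policy instead.
    """
    if allow_duck_array and hasattr(obj, "__duckarray__"):
        return obj.__duckarray__()
    # Always a real np.ndarray, safe for code that touches raw memory.
    return np.asarray(obj)


duck = DuckArray([1, 2, 3])
strict = asarray_compat(duck)                         # real np.ndarray
loose = asarray_compat(duck, allow_duck_array=True)   # the duck array itself
```

Under the opt-in default, legacy callers that cast the result to a PyArray struct keep receiving a real np.ndarray, and only duck-aware callers pass the flag.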
>
> Now if I understand right, your proposal would be to make it so any
> code in any package could arbitrarily change the behavior of
> np.asarray for all inputs, e.g. I could just decide that
> np.asarray([1, 2, 3]) should return some arbitrary non-np.ndarray
> object.

No, definitely not! It's all opt-in, by explicitly importing from
`numpy.overridable` or `unumpy`. No behavior of anything in the existing
numpy namespaces should be affected in any way.

I agree with the concerns below, hence it should stay opt-in.

Cheers,
Ralf

> It seems like this has a much greater potential for breaking
> existing Cython/C/C++ code, and the NEP doesn't currently describe why
> this extra power is useful, and it doesn't currently describe how it
> plans to mitigate the downsides. (For example, if a caller needs a
> real np.ndarray, then is there some way to explicitly request one? The
> NEP doesn't say.) Maybe this is all fine and there are solutions to
> these issues, but any proposal to address duck array coercion needs to
> at least talk about these issues!
>
> And that's just one example... array coercion is a particularly
> central and tricky problem, but the numpy API is big, and there are
> probably other problems like this. For another example, I don't
> understand what the NEP is proposing to do about dtypes at all.
>
> That's why I think the NEP needs to be fleshed out a lot more before
> it will be possible to evaluate fairly.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
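The opt-in point made in the reply above (overrides apply only to a separate overridable namespace, never to the existing functions) can be illustrated with a small standard-library sketch. All names here (`plain_asarray`, `overridable_asarray`, `set_backend`) are hypothetical stand-ins, and this is not the uarray or NEP 31 implementation; it only shows the general shape of a context-local override.

```python
import contextlib
import contextvars

# Hypothetical context-local slot holding the active backend, if any.
_backend = contextvars.ContextVar("backend", default=None)


def plain_asarray(obj):
    """Stand-in for an existing function whose behavior never changes."""
    return list(obj)


@contextlib.contextmanager
def set_backend(backend):
    """Activate a backend for the duration of a `with` block only."""
    token = _backend.set(backend)
    try:
        yield
    finally:
        # Restore whatever was active before entering the block.
        _backend.reset(token)


def overridable_asarray(obj):
    """Opt-in variant: dispatches to the active backend when one is set."""
    backend = _backend.get()
    if backend is None:
        return plain_asarray(obj)
    return backend(obj)


with set_backend(tuple):
    overridden = overridable_asarray([1, 2])   # handled by the backend
    untouched = plain_asarray((1, 2))          # existing function is unaffected
```

Only callers that deliberately use the overridable entry point see backend behavior; code calling the plain function gets the same result inside and outside the `with` block.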
