On Tue, Aug 21, 2018 at 6:12 PM, Stephan Hoyer <sho...@gmail.com> wrote:
> On Tue, Aug 21, 2018 at 12:21 AM Nathaniel Smith <n...@pobox.com> wrote:
>>
>> >> My suggestion: at numpy import time, check for an envvar, like say
>> >> NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1. If it's not set, then all the
>> >> __array_function__ dispatches turn into no-ops. This lets interested
>> >> downstream libraries and users try this out, but makes sure that we
>> >> won't have a hundred thousand end users depending on it without
>> >> realizing.
>> >>
>> >> - makes it easy for end-users to check how much overhead this adds (by
>> >>   running their code with it enabled vs disabled)
>> >> - if/when we decide to commit to supporting it for real, we just
>> >>   remove the envvar.
>> >
>> > I'm slightly concerned that the cost of reading an environment variable
>> > with os.environ could exaggerate the performance cost of
>> > __array_function__. It takes about 1 microsecond to read an environment
>> > variable on my laptop, which is comparable to the full overhead of
>> > __array_function__.
>>
>> That's why I said "at numpy import time" :-). I was imagining we'd
>> check it once at import, and then from then on it'd be stashed in some
>> C global, so after that the overhead would just be a single
>> predictable branch 'if (array_function_is_enabled) { ... }'.
>
> Indeed, I missed the "at numpy import time" bit :).
>
> In that case, I'm concerned that it isn't always possible to set environment
> variables once before importing NumPy. The environment variable solution
> works great if users have full control of their own Python binaries, but
> that isn't always the case today in this era of server-less infrastructure
> and online notebooks.
>
> One example offhand is Google's Colaboratory
> (https://research.google.com/colaboratory), a web based Jupyter notebook.
> NumPy is always loaded when a notebook is opened, as you can check from
> inspecting sys.modules. Now, I work with the developers of Colaboratory, so
> we could probably figure out a work-around together, but I'm pretty sure
> this would also come up in the context of other tools.

I mean, the idea of the envvar is to be a temporary measure to enable devs
to experiment with a provisional feature, while being awkward enough
that people don't build lots of stuff assuming it's there. It doesn't
have to be 100% supported in every environment.

> Another problem is unit testing. Does pytest use a separate Python process
> for running the tests in each file? I don't know and that feels like an
> implementation detail that I shouldn't have to know :). Yes, in principle I
> could use a subprocess in my __array_function__ for unit tests, but that
> would be really awkward.

Set the envvar before invoking pytest?

For numpy itself we'll need to write a few awkward tests involving
subprocesses to make sure the envvar parsing is working properly, but
I don't think this is a big deal. As long as we only have 1-2 places
that __array_function__ dispatch funnels through, we just need to make
sure that they work properly with/without the envvar; no need to test
every API separately. Or if it is an issue we can have some private
API that's only available to the numpy test suite...

>> > So we may
>> > want to switch to an explicit Python API instead, e.g.,
>> > np.enable_experimental_array_function().
>>
>> If we do this, then libraries that want to use __array_function__ will
>> just call it themselves at import time. The point of the env-var is
>> that our policy is not to break end-users, so if we want an API to be
>> provisional and experimental then it's end-users who need to be aware
>> of that before using it. (This is also an advantage of checking the
>> envvar only at import time: it means libraries can't easily just
>> setenv() to enable the functionality behind users' backs.)
>
> I'm in complete agreement that only authors of end-user applications should
> invoke this option, but just because something is technically possible
> doesn't mean that people will actually do it or that we need to support that
> use case :).

I didn't say "authors of end-user applications", I said "end-users" :-).

That said, I dunno. My intuition is that if we have a function call
like this then libraries that define __array_function__ will merrily
call it in their package __init__ and it accomplishes nothing, but
maybe I'm being too cynical and untrusting.

-n

--
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion