On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith <n...@pobox.com> wrote:
> On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke <da...@dalkescientific.com> wrote:
> > In this email I propose a few changes which I think are minor
> > and which don't really affect the external NumPy API but which
> > I think could improve the "import numpy" performance by at
> > least 40%. This affects me because I and my clients use a
> > chemistry toolkit which uses only NumPy arrays, and where
> > we run short programs often on the command-line.
> >
> >
> > In July of 2008 I started a thread about how "import numpy"
> > was noticeably slow for one of my customers. They had
> > chemical analysis software, often even run on a single
> > molecular structure using command-line tools, and the
> > several invocations with 0.1 seconds overhead was one of
> > the dominant costs even when numpy wasn't needed.
> >
> > I fixed most of their problems by deferring numpy imports
> > until needed. I remember well the Steve Jobs anecdote at
> >   http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt
> > and spent another day of my time in 2008 to identify the
> > parts of the numpy import sequence which seemed excessive.
> > I managed to get the import time down from 0.21 seconds to
> > 0.08 seconds.
> >
> > Very little of that made it into NumPy.
> >
> >
> > The three biggest changes I would like are:
> >
> > 1) remove "add_newdocs" and put the docstrings in the C code
> > 'add_newdocs' still needs to be there,
> >
> > The code says:
> >
> > # This is only meant to add docs to objects defined in C-extension modules.
> > # The purpose is to allow easier editing of the docstrings without
> > # requiring a re-compile.
> >
> > However, the change log shows that there are relatively few commits
> > to this module
> >
> >   Year    Number of commits
> >   ====    =================
> >   2012       8
> >   2011      62
> >   2010       9
> >   2009      18
> >   2008      17
> >
> > so I propose moving the docstrings to the C code, and perhaps
> > leaving 'add_newdocs' there, but only used when testing new
> > docstrings.
>
> I don't have any opinion on how acceptable this would be, but I also
> don't see a benchmark showing how much this would help?
>
> > 2) Don't optimistically assume that all submodules are
> > needed. For example, some current code uses
> >
> > >>> import numpy
> > >>> numpy.fft.ifft
> > <function ifft at 0x10199f578>
> >
> > (See a real-world example at
> >   http://stackoverflow.com/questions/10222812/python-numpy-fft-and-inverse-fft
> > )
> >
> > IMO, this optimizes the needs of the interactive shell
> > NumPy author over the needs of the many-fold more people
> > who don't spend their time in the REPL and/or don't need
> > those extra features added to every NumPy startup. Please
> > bear in mind that NumPy users of the first category will
> > be active on the mailing list, go to SciPy conferences,
> > etc. while members of the second category are less visible.
> >
> > I recognize that this is backwards incompatible, and will
> > not change. However, I understand that "NumPy 2.0" is a
> > glimmer in the future, which might be a natural place for
> > a transition to the more standard Python style of
> >
> >   from numpy import fft
> >
> > Personally, I think the documentation now (if it doesn't
> > already) should transition to use this form.
>
> I think this ship has sailed, but it'd be worth looking into lazy
> importing, where 'numpy.fft' isn't actually imported until someone
> starts using it.
> There are a bunch of libraries that do this, and one
> would have to fiddle to get compatibility with all the different
> python versions and make sure you're not killing performance (might
> have to be in C) but something along the lines of
>
> class _FFTModule(object):
>   def __getattribute__(self, name):
>     mod = importlib.import_module("numpy.fft")
>     _FFTModule.__getattribute__ = mod.__getattribute__
>     return getattr(mod, name)
> fft = _FFTModule()
>

Not sure how this would impact projects like IPython that provide
tab-completion support, but I know that it would drive me nuts in the basic
tab-completion setup I have for my regular python terminal. Of course, in
the grand scheme of things, that really isn't all that important, I don't
think.

Ben Root