I've got a proof of concept that takes the time to "import numpy" on my machine from 0.21 seconds down to 0.08 seconds. Doing that required some somewhat awkward things, like deferring all 'import re' statements. I don't think that's stable in the long run, because people will blithely import re in the future and not care that it takes 0.02 seconds to import. I don't blame them for complaining; I was curious about how fast I could get things.
Note that when I started complaining about this a month ago the import time on my machine was about 0.3 seconds. I'll work on patches within the next couple of days. Here's an outline of what I did, along with some questions about what's feasible.

1) Don't import 'numpy.testing'. Savings = 0.012s. Doing so required patches like

-from numpy.testing import Tester
-test = Tester().test
-bench = Tester().bench
+def test(label='fast', verbose=1, extra_argv=None, doctests=False,
+         coverage=False, **kwargs):
+    from testing import Tester
+    import numpy
+    Tester(numpy).test(label, verbose, extra_argv, doctests,
+                       coverage, **kwargs)
+def bench(label='fast', verbose=1, extra_argv=None):
+    from testing import Tester
+    import numpy
+    Tester(numpy).bench(label, verbose, extra_argv)

   QUESTION: since numpy is moving to nose, and the documentation only describes doing 'import numpy; numpy.test()', can I remove all other definitions of "test" and "bench"?
2) Removing 'import ctypeslib' in top-level -> 0.023 seconds.
   QUESTION: is this considered part of the API that must be preserved? The primary use case is supposed to be to help interactive users. I don't think interactive users spend much time using ctypes, and those that do are also those that aren't confused about needing an extra import statement.
3) Removing 'import string' in numerictypes.py -> 0.008 seconds. This requires some ugly but simple changes to the code.
4) Remove the 'import re' in _internal, numpy/lib/, function_base, and other places. This reduced my overall startup cost by 0.013 seconds.
5) Defer bzip and gzip imports in _datasource: 0.009 s. This will require non-trivial code changes.
6) Defer 'format' from io.py: 0.007 s.
7) _datasource imports shutil in order to use shutil.rmtree in a __del__. I don't think this can be deferred, because I don't want to do an import during system shutdown, which is when the __del__ might be called. It would save 0.004s.
8) If I can remove 'import doc' from the top-level numpy (is that part of the required API?) then I can save 0.004s.
9) Defer urlparse in _datasource: about 0.003s.
10) If I get rid of the top-level cPickle import in numeric.py then I can save 0.006 seconds.
11) Not importing add_newdocs saves 0.005 s. This might be possible by moving all of the docstrings to the actual functions. I haven't looked into this much and it might not be possible.

Those millisecond improvements add up! When I do an interactive 'import numpy' on my system I don't notice the import time like I did before.

Andrew
[EMAIL PROTECTED]
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion