At 04:23 PM 4/23/2010 +0900, David Cournapeau wrote:
> Importing pkg_resources
> causes many more syscalls than relatively big packages (~ 1000 for
> python -c "", 3000 for importing one of numpy/wx/gtk, 6000 for
> pkg_resources). Assuming those are unavoidable (and the current
> namespace implementation in setuptools requires it, right?), I don't
> see a way to reduce that cost significantly,

If you don't mind trying a simple test for me, would you patch your pkg_resources to comment out this loop:

            for pkg in self._get_metadata('namespace_packages.txt'):
                if pkg in sys.modules: declare_namespace(pkg)

It's in the 'activate()' method of Distribution, and it's targeted for removal in setuptools 0.7 anyway... I suspect you will see a huge reduction in stat calls, and the startup time should drop to being proportional to the number of non-.egg entries on sys.path, rather than being proportional to the total number of packages installed in such directories. (For .egg files and directories on sys.path, there should be no system calls at all with the above removed.)
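To compare startup cost before and after commenting out the loop, something like the following could be used. It is only a rough sketch: the helper name time_fresh_import is made up for illustration, and it measures wall-clock time of a fresh interpreter rather than counting syscalls (a tracer such as strace would be needed for that).

```python
import subprocess
import sys
import time


def time_fresh_import(module, repeats=5):
    """Return the best wall-clock time (seconds) for importing `module`
    in a brand-new interpreter, so cached imports don't hide the cost.

    Hypothetical helper for illustration; not part of setuptools.
    """
    cmd = [sys.executable, "-c", "import " + module]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the import fails
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Run once with the stock pkg_resources and once with the loop
    # commented out, then compare the two numbers.
    print("pkg_resources: %.3fs" % time_fresh_import("pkg_resources"))
```

Taking the minimum over a few runs reduces noise from disk caching; the first run after a reboot will typically be much slower than later ones.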

This change is not backward compatible with some older packages (from years ago) that were not declaring their namespace packages correctly, but it has been announced for some time (with warnings) that such packages will not work with setuptools 0.7.

(By the way, in case you're thinking this change would only affect namespace packages, and you don't have any, what's happening is that the _get_metadata() call forces a check for the *existence* of namespace_packages.txt in every .egg-info or .egg/EGG-INFO on your path, whether the file actually exists or not. In the case of zipped eggs, this check is just looking in a dictionary; for actual files/directories, this is a stat call.)
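To get a feel for how many of these existence checks happen on a given machine, here is a minimal sketch. It does not use pkg_resources at all; it just walks the directories on sys.path and imitates the check described above, so the function names and the directory-scanning rules are my own simplification, not the actual setuptools logic.

```python
import os
import sys


def metadata_dirs(path_entries):
    """Yield unzipped metadata directories reachable from path_entries:
    <entry>/EGG-INFO (entry is itself an egg directory),
    <entry>/*.egg-info, and <entry>/*.egg/EGG-INFO.
    Zipped eggs and missing entries are skipped (no stat cost there).
    """
    for entry in path_entries:
        if not os.path.isdir(entry):
            continue
        own = os.path.join(entry, "EGG-INFO")
        if os.path.isdir(own):
            yield own
        for name in sorted(os.listdir(entry)):
            full = os.path.join(entry, name)
            if name.endswith(".egg-info") and os.path.isdir(full):
                yield full
            elif name.endswith(".egg"):
                egg_info = os.path.join(full, "EGG-INFO")
                if os.path.isdir(egg_info):
                    yield egg_info


def count_namespace_checks(path_entries):
    """Return (checks, found): how many metadata directories would be
    probed for namespace_packages.txt, and how many actually have it."""
    checks = 0
    found = 0
    for meta in metadata_dirs(path_entries):
        checks += 1  # one existence check (a stat) per metadata directory
        if os.path.exists(os.path.join(meta, "namespace_packages.txt")):
            found += 1
    return checks, found


if __name__ == "__main__":
    checks, found = count_namespace_checks(sys.path)
    print("existence checks:", checks, "files found:", found)
```

If the first number is large and the second is zero or close to it, nearly all of those stat calls are spent confirming that a file is absent.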
