"Martin v. Löwis" <[EMAIL PROTECTED]> writes on Sun, 22 May 2005 21:24:41 +0200:
> ...
> What do you mean, "unable to"? It just doesn't.
The original question was: "why does Python put non-existing entries on
'sys.path'?". Your answer seems to be: "it just does not do it -- but it
might be changed if someone does the work". This is fine with me.

> ...
> In the past, there was a silent guarantee that you could add
> items to sys.path, and only later create the directories behind
> these items. I don't know whether people rely on this guarantee.

I do not argue that Python should prevent adding non-existing items to
"path". That would not work, as Python may not know what "existing" means
(due to "path_hooks"). I only argue that Python should not *itself*
(automatically) put items on the path when it knows the responsible
importers and knows (or can easily determine) that the items do not
exist for them.

> ...
> > The application was Zope importing about 2,500 modules
> > from 2 zip files "zope.zip" and "python24.zip".
> > This resulted in about 12,500 opens -- about 4 times more
> > than would be expected -- about 10,000 of them failing opens.
>
> I see. Out of curiosity: how much startup time was saved
> when sys.path was explicitly stripped to only contain these
> two zip files?

I cannot tell you precisely, because analysing cold-start timing
behaviour is very time consuming (it requires a reboot for each
measurement). We essentially have only the following numbers:

                      warm start          cold start
                      (filled OS caches)  (empty OS caches)
  from file system    5s                  13s
  from ZIP archives   4s                   8s
  frozen              3s                   5s

The ZIP archive time was measured after a patch to "import.c" that
prevents Python from viewing a ZIP archive member as a directory when it
cannot find the currently looked-for module (of course, this lookup also
fails when the archive member is viewed as a directory). Furthermore,
all C extensions were loaded via a "meta_path" hook (and not via
"sys.path"), and "sys.path" contained just the two ZIP archives. These
optimizations reduced the number of opens to about 3,000 (down from
originally 12,500).
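For illustration, here is a minimal sketch of the "meta_path" approach
mentioned above. It is written against today's importlib API (the actual
2005 code would have used the PEP 302 find_module/load_module interface,
and would have mapped C-extension names to shared libraries); the
RegistryFinder class, the _REGISTRY table, and the "fake_ext" module are
invented for the example:

```python
import sys
import importlib.abc
import importlib.util

# Hypothetical registry: module name -> source for that module.  In the
# scenario described above this would map C-extension names to their
# shared libraries; here we fake it with plain Python code.
_REGISTRY = {
    "fake_ext": "VALUE = 42",
}

class RegistryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Meta-path finder that answers imports from a fixed registry, so
    the import machinery never probes sys.path for these modules (and
    thus issues no failed opens for them)."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname in _REGISTRY:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the normal sys.path machinery handle the rest

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Execute the registered source in the fresh module's namespace.
        exec(_REGISTRY[module.__name__], module.__dict__)

# Meta-path finders are consulted before sys.path is scanned at all.
sys.meta_path.insert(0, RegistryFinder())

import fake_ext
print(fake_ext.VALUE)  # -> 42
```

Because sys.meta_path is consulted before any path-based lookup, every
module resolved this way skips the per-directory (or per-archive) probe
sequence entirely.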
> I would expect that importing 2,500 modules takes *way*
> more time than doing 10,000 failed opens.

You may be wrong: searching for non-existing files may cause disk I/O,
which is several orders of magnitude slower than CPU activity. The
comparison between warm start (little disk I/O) and cold start (much
disk I/O) tells you that the import process is highly I/O dominated
(for cold starts). I know that this does not prove that the failing
opens contribute significantly. However, a colleague reported that the
"import.c" patch (essential for the reduction in the number of opens)
resulted in significant (but not specified) improvements.

Dieter
--
http://mail.python.org/mailman/listinfo/python-list