Re: __file__ access extremely slow
I think I have figured this out; thanks for your input.

The time comes from lazy modules related to e-mail, which perform their imports on attribute access. That is acceptable, and it also explains why ImportError was sometimes raised.

I originally thought that accessing __file__ was triggering some mechanism that caused an attempt at importing other modules, but the lazy import explanation makes much more sense.

--
Zachary Burns
(407)590-4814
Aim - Zac256FL
Production Engineer (Digital Overlord)
Zindagi Games

On Fri, Jun 5, 2009 at 2:15 AM, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
> On Fri, 05 Jun 2009 00:12:25 -0300, John Machin sjmac...@lexicon.net wrote:
>
>>> (2) This will stop processing on the first object in sys.modules that
>>> doesn't have a __file__ attribute. Since these objects aren't
>>> *guaranteed* to be modules,
>>
>> Definitely not guaranteed to be modules. Python itself drops non-modules
>> in there! Python 2.3 introduced four keys mapped to None -- one of these
>> was dropped in 2.4, but the other three are still there in 2.5 and 2.6:
>
> In case someone wonders what all those None are: they're a flag telling
> the import machinery that those modules don't exist (to avoid doing a
> directory scan over and over, because Python 2.x first attempts a relative
> import, and only if that fails attempts an absolute one).
>
>> C:\junk\python23\python -c "import sys; print [k for (k, v) in sys.modules.items() if v is None]"
>> ['encodings.encodings', 'encodings.codecs', 'encodings.exceptions', 'encodings.types']
>
> In this case, somewhere inside the encodings package, there are statements
> like "import types" or "from types import ...", and Python could not find
> types.py in the package directory.
>
> --
> Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
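To illustrate the lazy-module explanation above, here is a minimal sketch of how a module object can defer its real import until an attribute such as __file__ is read. LazyModule and the module names passed to it are hypothetical, written for Python 3; this is not the actual code of the e-mail package, only an assumption about the general shape of such a mechanism.

```python
import importlib
import sys
import types


class LazyModule(types.ModuleType):
    """Hypothetical placeholder that imports its target on the first attribute miss."""

    def __init__(self, name, target):
        super().__init__(name)
        # Write straight into __dict__ so these names never reach __getattr__.
        self.__dict__["_target"] = target
        self.__dict__["_loaded"] = None

    def __getattr__(self, attr):
        # Only called when normal lookup fails, which includes __file__ here.
        if self.__dict__["_loaded"] is None:
            # The hidden cost: a real import, which may raise ImportError.
            self.__dict__["_loaded"] = importlib.import_module(self.__dict__["_target"])
        return getattr(self.__dict__["_loaded"], attr)


lazy_ok = LazyModule("lazy_json", "json")
lazy_bad = LazyModule("lazy_missing", "no_such_module_hypothetical")

print(lazy_ok.__file__)  # triggers the real import of json

try:
    lazy_bad.__file__
except ImportError as exc:
    print("ImportError on attribute access:", exc)
```

With placeholders like these in sys.modules, a loop that reads __file__ pays for every deferred import at once, and a later pass is fast because the loaded module is cached.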
Re: __file__ access extremely slow
On Fri, 05 Jun 2009 00:12:25 -0300, John Machin sjmac...@lexicon.net wrote:

>> (2) This will stop processing on the first object in sys.modules that
>> doesn't have a __file__ attribute. Since these objects aren't
>> *guaranteed* to be modules,
>
> Definitely not guaranteed to be modules. Python itself drops non-modules
> in there! Python 2.3 introduced four keys mapped to None -- one of these
> was dropped in 2.4, but the other three are still there in 2.5 and 2.6:

In case someone wonders what all those None are: they're a flag telling the import machinery that those modules don't exist (to avoid doing a directory scan over and over, because Python 2.x first attempts a relative import, and only if that fails attempts an absolute one).

> C:\junk\python23\python -c "import sys; print [k for (k, v) in sys.modules.items() if v is None]"
> ['encodings.encodings', 'encodings.codecs', 'encodings.exceptions', 'encodings.types']

In this case, somewhere inside the encodings package, there are statements like "import types" or "from types import ...", and Python could not find types.py in the package directory.

--
Gabriel Genellina
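The "None means this module does not exist" flag can be observed directly. The sketch below is written for Python 3, where the import machinery no longer adds such entries for failed relative imports, but still treats a None value in sys.modules as "stop, this import fails". The choice of colorsys is arbitrary; any importable standard library module should behave the same way.

```python
import sys

name = "colorsys"  # arbitrary stdlib module, chosen only for illustration

# Remember any real entry so it can be restored afterwards.
saved = sys.modules.pop(name, None)

# Flag the name as "known missing", as the None entries discussed above do.
sys.modules[name] = None
try:
    __import__(name)
    outcome = "imported"
except ImportError:
    outcome = "ImportError"
finally:
    # Remove the flag and put back whatever was there before.
    del sys.modules[name]
    if saved is not None:
        sys.modules[name] = saved

print(outcome)

# Names currently flagged this way in the running interpreter.
print([k for k, v in sys.modules.items() if v is None])
```

Any loop over sys.modules therefore has to expect None values as well as real module objects.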
Re: __file__ access extremely slow
Sorry, there is a typo. The code should read as below to repro the problem:

    for module in sys.modules.itervalues():
        try:
            path = module.__file__
        except (AttributeError, ImportError):
            return

--
Zachary Burns
(407)590-4814
Aim - Zac256FL
Production Engineer (Digital Overlord)
Zindagi Games

On Thu, Jun 4, 2009 at 6:24 PM, Zac Burns zac...@gmail.com wrote:
> The section of code below, which simply gets the __file__ attribute of
> the imported modules, takes more than 1/3 of the total startup time.
> Given that many modules are complicated and even have dynamic population
> this figure seems very high to me. It would seem very high if one just
> considered the time it would take to load the pyc files off the disk
> vs... whatever happens when module.__file__ happens.
>
> The calculation appears to be cached though, so a subsequent check does
> not take very long.
>
> From once python starts and loads the main module to after all the
> imports occur and this section executes takes 1.3sec. This section takes
> 0.5sec. Total module count is ~800.
>
> Python version is 2.5.1
>
> Code:
>
>     for module in sys.modules:
>         try:
>             path = module.__file__
>         except (AttributeError, ImportError):
>             return
>
> --
> Zachary Burns
> (407)590-4814
> Aim - Zac256FL
> Production Engineer (Digital Overlord)
> Zindagi Games
Re: __file__ access extremely slow
On Thu, 04 Jun 2009 22:24:48 -0300, Zac Burns zac...@gmail.com wrote:

> The section of code below, which simply gets the __file__ attribute of
> the imported modules, takes more than 1/3 of the total startup time.
> Given that many modules are complicated and even have dynamic population
> this figure seems very high to me. It would seem very high if one just
> considered the time it would take to load the pyc files off the disk
> vs... whatever happens when module.__file__ happens.
>
> Code: [fixed]
>
>     for module in sys.modules.itervalues():
>         try:
>             path = module.__file__
>         except (AttributeError, ImportError):
>             return

__file__ is just an instance attribute of module objects. Although a custom importer *might* define a special module type which *could* use a special computed attribute, I doubt it. module.__file__ just returns a string, when it exists. Built-in modules have no __file__ attribute set, and some entries in sys.modules may be set to None. These should be the only exceptions.

> The calculation appears to be cached though, so a subsequent check does
> not take very long.
>
> From once python starts and loads the main module to after all the
> imports occur and this section executes takes 1.3sec. This section takes
> 0.5sec. Total module count is ~800.

Are you sure you posted the actual code? That return statement would stop the iteration as soon as it hits a built-in module or a None flag. I'd say the time is spent somewhere else, or you're misinterpreting your results.

BTW, what's the point of all this?

--
Gabriel Genellina
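One way to check whether the time really goes into the attribute access is to time each access separately and look at the slowest entries. This is a rough diagnostic sketch for Python 3; the function name time_file_access is made up for illustration, and the measured numbers depend entirely on the modules loaded in the process.

```python
import sys
import time


def time_file_access(modules=None):
    """Return (seconds, name) pairs for reading __file__, slowest first."""
    if modules is None:
        # Copy first: a lazy module may add entries to sys.modules mid-loop.
        modules = list(sys.modules.items())
    timings = []
    for name, module in modules:
        start = time.perf_counter()
        try:
            getattr(module, "__file__")
        except (AttributeError, ImportError):
            pass  # built-in module, None placeholder, or failed lazy import
        timings.append((time.perf_counter() - start, name))
    timings.sort(reverse=True)
    return timings


# Show the five slowest entries in the current process.
for seconds, name in time_file_access()[:5]:
    print("%-30s %.6f s" % (name, seconds))
```

If a handful of names dominate the list, those are the entries worth inspecting for computed or lazily resolved attributes.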
Re: __file__ access extremely slow
On Fri, 05 Jun 2009 02:21:07 +, Steven D'Aprano wrote:

> You corrected this to:
>
>     for module in sys.modules.itervalues():
>         try:
>             path = module.__file__
>         except (AttributeError, ImportError):
>             return
>
> (1) You're not importing anything inside the try block. Why do you think
> ImportError could be raised?
>
> (2) This will stop processing on the first object in sys.modules that
> doesn't have a __file__ attribute. Since these objects aren't
> *guaranteed* to be modules, this is a subtle bug waiting to bite you.

In fact, not all modules have a __file__ attribute.

    >>> import errno
    >>> errno.__file__
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute '__file__'

--
Steven
Re: __file__ access extremely slow
Zac Burns wrote:

> The section of code below, which simply gets the __file__ attribute of
> the imported modules, takes more than 1/3 of the total startup time.
> Given that many modules are complicated and even have dynamic population
> this figure seems very high to me. It would seem very high if one just
> considered the time it would take to load the pyc files off the disk
> vs... whatever happens when module.__file__ happens.
>
> The calculation appears to be cached though, so a subsequent check does
> not take very long.
>
> From once python starts and loads the main module to after all the
> imports occur and this section executes takes 1.3sec. This section takes
> 0.5sec. Total module count is ~800.

Perhaps some of the modules use a delayed import mechanism.

> Python version is 2.5.1
>
> Code:
>
>     for module in sys.modules:
>         try:
>             path = module.__file__
>         except (AttributeError, ImportError):
>             return

If any modules lack the attribute, you will not scan them all. Perhaps you meant 'continue'?

> --
> Zachary Burns
> (407)590-4814
> Aim - Zac256FL
> Production Engineer (Digital Overlord)
> Zindagi Games
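For reference, a version of the loop that follows the 'continue' suggestion might look like the sketch below. It is written for Python 3 (sys.modules.items() rather than itervalues()), and the function name module_paths is an assumption made for this example, not something from the original code.

```python
import sys


def module_paths():
    """Map module name to __file__ for every sys.modules entry that has one."""
    paths = {}
    # Snapshot the items: reading an attribute on a lazy module may import more.
    for name, module in list(sys.modules.items()):
        try:
            path = module.__file__
        except (AttributeError, ImportError):
            # Built-in module, None placeholder, or failed lazy import:
            # skip this entry and keep scanning the rest.
            continue
        if path is not None:
            paths[name] = path
    return paths


paths = module_paths()
print(len(paths), "of", len(sys.modules), "entries have a __file__")
```

Unlike the version with 'return', this visits every entry, so its running time reflects the whole of sys.modules rather than only the entries before the first built-in module.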