Re: __file__ access extremely slow

2009-06-06 Thread Zac Burns
I think I have figured this out, thanks for your input.

The time comes from lazy modules related to e-mail, which perform their
imports on attribute access; that is acceptable. It also explains
why ImportError was sometimes raised.

I originally was thinking that accessing __file__ was triggering some
mechanism that caused an attempt at importing other modules, but the
lazy import explanation makes much more sense.
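
For readers who have not met the pattern, here is a minimal sketch of a lazy module, written for current Python 3. It is not the e-mail package's actual code, which this thread does not show; LazyModule and its arguments are hypothetical names. The placeholder has no __file__ of its own, so the first lookup runs the real import (and can raise ImportError), while later lookups hit the copied namespace and are fast.

```python
import importlib
import types


class LazyModule(types.ModuleType):
    """Hypothetical placeholder that imports the real module on first use."""

    def __init__(self, alias, real_name):
        super().__init__(alias)
        self._real_name = real_name

    def __getattr__(self, attr):
        # Reached only when normal lookup fails, e.g. __file__ before loading.
        real = importlib.import_module(self._real_name)  # may raise ImportError
        # Copy the real module's namespace so later lookups are plain and fast.
        self.__dict__.update(real.__dict__)
        return getattr(real, attr)
```

With such an object in sys.modules, a loop that only reads __file__ pays the full import cost once per lazy module, which would match both the slow first pass and the "cached" second pass.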

--
Zachary Burns
(407)590-4814
Aim - Zac256FL
Production Engineer (Digital Overlord)
Zindagi Games



On Fri, Jun 5, 2009 at 2:15 AM, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:
 On Fri, 05 Jun 2009 00:12:25 -0300, John Machin sjmac...@lexicon.net
 wrote:

  (2) This will stop processing on the first object in sys.modules that
  doesn't have a __file__ attribute. Since these objects aren't
  *guaranteed* to be modules,

 Definitely not guaranteed to be modules. Python itself drops non-modules in
 there! Python 2.3 introduced four keys mapped to None -- one of these was
 dropped in 2.4, but the other three are still there in 2.5 and 2.6:

 In case someone wonders what all those None entries are: they're a flag
 telling the import machinery that those modules don't exist (to avoid doing
 a directory scan over and over, because Python 2 attempts first to do a
 relative import, and only if unsuccessful attempts an absolute one)

 C:\junk\python23\python -c "import sys; print [k for (k, v) in
 sys.modules.items() if v is None]"
 ['encodings.encodings', 'encodings.codecs', 'encodings.exceptions',
 'encodings.types']

 In this case, somewhere inside the encodings package, there are statements
 like import types or from types import ..., and Python could not find
 types.py in the package directory.

 --
 Gabriel Genellina

 --
 http://mail.python.org/mailman/listinfo/python-list



Re: __file__ access extremely slow

2009-06-05 Thread Gabriel Genellina
On Fri, 05 Jun 2009 00:12:25 -0300, John Machin sjmac...@lexicon.net
wrote:



 (2) This will stop processing on the first object in sys.modules that
 doesn't have a __file__ attribute. Since these objects aren't
 *guaranteed* to be modules,


Definitely not guaranteed to be modules. Python itself drops non-modules in
there! Python 2.3 introduced four keys mapped to None -- one of these was
dropped in 2.4, but the other three are still there in 2.5 and 2.6:

In case someone wonders what all those None entries are: they're a flag
telling the import machinery that those modules don't exist (to avoid doing
a directory scan over and over, because Python 2 attempts first to do a
relative import, and only if unsuccessful attempts an absolute one)

C:\junk\python23\python -c "import sys; print [k for (k, v) in
sys.modules.items() if v is None]"
['encodings.encodings', 'encodings.codecs', 'encodings.exceptions',
'encodings.types']


In this case, somewhere inside the encodings package, there are statements  
like import types or from types import ..., and Python could not find  
types.py in the package directory.
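
Current Python 3 no longer performs the implicit relative import described above, but a None value in sys.modules is still honoured: the import system raises ModuleNotFoundError (a subclass of ImportError) for such a name. The following is only a sketch of that behaviour on Python 3, not of the Python 2 machinery discussed in this thread; import_is_blocked is a hypothetical helper name.

```python
import importlib
import sys


def import_is_blocked(name):
    """Return True if a None entry in sys.modules stops `name` from importing."""
    missing = object()
    previous = sys.modules.get(name, missing)
    sys.modules[name] = None  # the same kind of placeholder discussed above
    try:
        importlib.import_module(name)
    except ImportError:
        return True
    finally:
        # Put sys.modules back the way we found it.
        if previous is missing:
            del sys.modules[name]
        else:
            sys.modules[name] = previous
    return False
```

Calling import_is_blocked("colorsys") should report True even though colorsys exists in the standard library, and a normal import of colorsys works again afterwards.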


--
Gabriel Genellina



__file__ access extremely slow

2009-06-04 Thread Zac Burns
The section of code below, which simply gets the __file__ attribute of
the imported modules, takes more than 1/3 of the total startup time.
Given that many modules are complicated and even have dynamic
population, this figure seems very high to me. It would seem very high
even if one just compared the time it takes to load the pyc files
off the disk with... whatever happens when module.__file__ is accessed.

The calculation appears to be cached though, so a subsequent check
does not take very long.

The span from Python starting and loading the main module until all the
imports have occurred and this section has executed is 1.3 sec. This
section alone takes 0.5 sec. Total module count is ~800.

Python version is 2.5.1

Code:

for module in sys.modules:
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return






Re: __file__ access extremely slow

2009-06-04 Thread Zac Burns
Sorry, there is a typo. The code should read as below to repro the problem:


for module in sys.modules.itervalues():
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return







On Thu, Jun 4, 2009 at 6:24 PM, Zac Burns zac...@gmail.com wrote:
 The section of code below, which simply gets the __file__ attribute of
 the imported modules, takes more than 1/3 of the total startup time.
 Given that many modules are complicated and even have dynamic
 population, this figure seems very high to me. It would seem very high
 even if one just compared the time it takes to load the pyc files
 off the disk with... whatever happens when module.__file__ is accessed.

 The calculation appears to be cached though, so a subsequent check
 does not take very long.

 The span from Python starting and loading the main module until all the
 imports have occurred and this section has executed is 1.3 sec. This
 section alone takes 0.5 sec. Total module count is ~800.

 Python version is 2.5.1

 Code:
 
 for module in sys.modules:
        try:
                path = module.__file__
        except (AttributeError, ImportError):
                return
 





Re: __file__ access extremely slow

2009-06-04 Thread Gabriel Genellina

On Thu, 04 Jun 2009 22:24:48 -0300, Zac Burns zac...@gmail.com wrote:


The section of code below, which simply gets the __file__ attribute of
the imported modules, takes more than 1/3 of the total startup time.
Given that many modules are complicated and even have dynamic
population, this figure seems very high to me. It would seem very high
even if one just compared the time it takes to load the pyc files
off the disk with... whatever happens when module.__file__ is accessed.



Code: [fixed]

for module in sys.modules.itervalues():
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return



__file__ is just an instance attribute of module objects. Although a  
custom importer *might* define a special module type which *could* use a  
special computed attribute, I doubt so...
module.__file__ just returns a string, when it exists. Built-in modules  
have no __file__ attribute set, and some entries in sys.modules may be set  
to None. These should be the only exceptions.
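
The three cases just listed (a real path, no __file__ at all, and a None entry) can be counted with a short sketch like the one below. It is written for current Python 3 rather than the 2.5.1 used in this thread, and classify_modules is a hypothetical helper name. Note that getattr with a default only swallows AttributeError, so a custom module type that runs an import on attribute access could still raise ImportError here.

```python
import sys
from collections import Counter


def classify_modules(modules):
    """Count sys.modules-style entries by how they answer a __file__ lookup."""
    counts = Counter()
    for name, module in list(modules.items()):
        if module is None:
            # Placeholder entry, not a module object at all.
            counts["none_placeholder"] += 1
            continue
        file_attr = getattr(module, "__file__", None)
        if file_attr is None:
            # Typically a built-in module compiled into the interpreter.
            counts["no_file"] += 1
        else:
            counts["has_file"] += 1
    return counts


if __name__ == "__main__":
    print(classify_modules(sys.modules))
```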



The calculation appears to be cached though, so a subsequent check
does not take very long.

The span from Python starting and loading the main module until all the
imports have occurred and this section has executed is 1.3 sec. This
section alone takes 0.5 sec. Total module count is ~800.


Are you sure you posted the actual code?
That return statement would stop the iteration as soon as it hits a  
builtin module, or a None flag.


I'd say the time is spent somewhere else, or you're misinterpreting your  
results.
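
One way to check where the time really goes is to time each lookup separately and look at the slowest entries. This is only a sketch for current Python 3 (time.perf_counter does not exist in 2.5), and time_file_access is a hypothetical helper name.

```python
import sys
import time


def time_file_access(modules, top=5):
    """Return the `top` slowest (seconds, name) pairs for a __file__ lookup."""
    timings = []
    for name, module in list(modules.items()):
        start = time.perf_counter()
        try:
            getattr(module, "__file__")
        except (AttributeError, ImportError):
            pass  # None placeholders, built-ins, or a lazy import that failed
        timings.append((time.perf_counter() - start, name))
    timings.sort(reverse=True)
    return timings[:top]


if __name__ == "__main__":
    for seconds, name in time_file_access(sys.modules):
        print("%-40s %.6f s" % (name, seconds))
```

If a handful of names dominate the list, those entries are doing real work on attribute access rather than returning a stored string.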

BTW, what's the point of all this?

--
Gabriel Genellina



Re: __file__ access extremely slow

2009-06-04 Thread Steven D'Aprano
On Fri, 05 Jun 2009 02:21:07 +, Steven D'Aprano wrote:

 You corrected this to:
 
 for module in sys.modules.itervalues():
     try:
         path = module.__file__
     except (AttributeError, ImportError):
         return
 
 (1) You're not importing anything inside the try block. Why do you think
 ImportError could be raised?
 
 (2) This will stop processing on the first object in sys.modules that
 doesn't have a __file__ attribute. Since these objects aren't
 *guaranteed* to be modules, this is a subtle bug waiting to bite you.


In fact, not all modules have a __file__ attribute.


>>> import errno
>>> errno.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__file__'



-- 
Steven


Re: __file__ access extremely slow

2009-06-04 Thread Terry Reedy

Zac Burns wrote:

The section of code below, which simply gets the __file__ attribute of
the imported modules, takes more than 1/3 of the total startup time.
Given that many modules are complicated and even have dynamic
population, this figure seems very high to me. It would seem very high
even if one just compared the time it takes to load the pyc files
off the disk with... whatever happens when module.__file__ is accessed.

The calculation appears to be cached though, so a subsequent check
does not take very long.


The span from Python starting and loading the main module until all the
imports have occurred and this section has executed is 1.3 sec. This
section alone takes 0.5 sec. Total module count is ~800.


Perhaps some of the modules use a delayed import mechanism.


Python version is 2.5.1

Code:

for module in sys.modules:
    try:
        path = module.__file__
    except (AttributeError, ImportError):
        return


If any modules lack the attribute, you will not scan them all.  Perhaps 
you meant 'continue'?
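
A sketch of the loop with that change, wrapped in a function so that it has something to return. It is written for current Python 3 (values() instead of itervalues()), and module_paths is a hypothetical name.

```python
import sys


def module_paths(modules=None):
    """Collect __file__ for every entry that has one, skipping the rest."""
    if modules is None:
        modules = sys.modules
    paths = []
    # Copy the values first: a lazy import may add entries while we iterate.
    for module in list(modules.values()):
        try:
            path = module.__file__
        except (AttributeError, ImportError):
            continue  # keep scanning instead of returning early
        if path is not None:  # some Python 3 modules carry __file__ = None
            paths.append(path)
    return paths
```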






