[issue22167] iglob() has misleading documentation (does indeed store names internally)

R. David Murray Fri, 08 Aug 2014 06:23:27 -0700

R. David Murray added the comment:

IMO the documentation isn't *wrong*, just misleading :)


What it is saying is that *your program* doesn't have to store the full list 
of matches before being able to use them (i.e., iglob doesn't return a 
list).  It says nothing about what resources are used internally, other than an 
implied contract that there is *some* efficiency gain over calling glob, which, 
as explained above, there is.  The fact that the implementation uses lots of 
memory if any single directory is large is then a performance bug, which can 
theoretically be fixed in 3.5 using scandir.
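
To illustrate the scandir idea, here is a minimal sketch, not the glob 
module's actual code.  The helper name iter_matches is hypothetical, the 
"with" form of os.scandir() assumes Python 3.6 or later, and the sketch 
ignores glob's special handling of names that start with a dot.

```python
import fnmatch
import os


def iter_matches(dirname, pattern):
    """Lazily yield paths in dirname whose names match pattern.

    Hypothetical helper for illustration only. os.scandir() returns an
    iterator, so names are examined one at a time instead of being
    collected into a list first, as os.listdir() would do.
    """
    with os.scandir(dirname) as entries:
        for entry in entries:
            if fnmatch.fnmatch(entry.name, pattern):
                yield os.path.join(dirname, entry.name)
```

A caller would loop over iter_matches(".", "*.txt") and could stop early 
without the full directory listing ever being built as a Python list.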

The reason iglob was introduced, if you check the revision history, is that 
glob used to call itself recursively for each sub-directory, which meant it 
held *all* of the file names in *all* of the scanned tree in memory at one 
time.  It is literally true that the difference between glob and iglob is that 
with iglob your program doesn't have to store the full list of matches from 
all subdirectories.  Talking about "your program" is not something we 
typically do in the Python docs, though; it is implied.
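
The caller-side difference can be seen in a small sketch.  The helper names 
and the demo files below are made up for illustration; only glob.glob() and 
glob.iglob() come from the standard library.

```python
import glob
import os
import tempfile


def make_demo_tree(root):
    """Create a few empty files under root (demo data only)."""
    for name in ("a.txt", "b.txt", "c.py"):
        with open(os.path.join(root, name), "w"):
            pass


def count_matches(pattern):
    """Count matches while the caller holds only one path at a time."""
    return sum(1 for _ in glob.iglob(pattern))


with tempfile.TemporaryDirectory() as root:
    make_demo_tree(root)
    pattern = os.path.join(root, "*.txt")

    as_list = glob.glob(pattern)      # the full list exists before we see it
    n_lazy = count_matches(pattern)   # paths are consumed one by one

    print(type(as_list).__name__, len(as_list), n_lazy)
```

This shows only what the caller has to store; it says nothing about how much 
memory the glob module uses internally while producing the matches.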

Perhaps in 2.7/3.4 we can mention in the module docs that at most one 
directory's worth of data will be held in memory during the globbing process, 
but it feels a little weird to document an implementation detail like that.  
Still, if someone can come up with improved wording for the docs, we can add it.

----------
nosy: +r.david.murray

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22167>
_______________________________________