Guido van Rossum added the comment:

Thanks for persevering!!!
The dangers of switching between fileno(fp) and fp are actually well documented in the C and/or POSIX standards. The problem arises in PyFile_FromFileEx(), which creates a Python file object from the file descriptor. The fix only works because we no longer use the FILE struct once PyTokenizer_FindEncoding() has been called. I think it would be better to move the lseek() into call_find_module() so that PyTokenizer_FindEncoding() does not break the FILE abstraction.

I think there's still a bug or two lurking in this area: first, each time you call imp.find_module() you leak a FILE object; second, the encoding allocated in PyTokenizer_FindEncoding() is leaked.

You're right that a lot of this could be avoided if we used file descriptors consistently. It seems find_module() itself doesn't read the file; it just needs to know that the file can be opened. Rewriting every caller of PyFile_FromFile[Ex] to use file descriptors doesn't seem too hard; there are only a few places.

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1267>
__________________________________
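
For illustration only (this is not the patch under discussion): a minimal sketch of the same hazard at the Python level, assuming that a buffered Python file object reading ahead of its descriptor is a fair stand-in for a C FILE struct reading ahead of fileno(fp). The function name demo_mixed_access and the sample file contents are made up for the example. It reads one line through the buffered object, shows that the descriptor's offset has already moved past it, and then rewinds with lseek() before using the descriptor directly.

```python
import os
import tempfile


def demo_mixed_access():
    """Return (buffered_pos, fd_pos, stale, fresh) for a small temp file.

    Hypothetical helper: mixes a buffered file object with raw reads on
    its descriptor to show why the two views of the position disagree.
    """
    first = b"# -*- coding: latin-1 -*-\n"
    rest = b"print('hello')\n"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(first + rest)
        path = tmp.name
    try:
        with open(path, "rb") as fp:
            fd = fp.fileno()
            fp.readline()  # the buffered layer reads ahead of this line
            buffered_pos = fp.tell()  # logical position: end of first line
            fd_pos = os.lseek(fd, 0, os.SEEK_CUR)  # descriptor's real offset
            stale = os.read(fd, 100)  # descriptor is already at end of file
            os.lseek(fd, 0, os.SEEK_SET)  # rewind before using the fd alone
            fresh = os.read(fd, 100)
            # fp is not used again after the raw reads; it is only closed.
        return buffered_pos, fd_pos, stale, fresh
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo_mixed_access())
```

The rewind is only safe here because the buffered object is never read again afterwards, which is the same condition the fix above depends on.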