On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum <gu...@python.org> wrote:

> This reminds me of the following bug, which can happen when two
> processes are both writing the .pyc file and a third is reading it.
> First some background.
>
> When writing a .pyc file, we use the following strategy:
> - open the file for writing
> - write a dummy header (four null bytes)
> - write the .py file's mtime
> - write the marshalled code object
> - replace the dummy header with the correct magic word
>

Just so people know, this is how we used to do it. In importlib we write
the entire file to a temp file and then do an atomic rename.

> Even py_compile.py (used by compileall.py) uses this strategy.

As of Python 3.4, py_compile just uses importlib directly, so it matches
its semantics.

-Brett

>
> When reading a .pyc file, we ignore it when the magic word isn't there
> (or when the mtime doesn't match that of the .py file exactly), and
> then we will write it back as described above.
>
> Now consider the following scenario. It involves *three* processes.
>
> - Two unrelated processes both start and want to import the same module.
> - They both see the .pyc file is missing/corrupt and decide to write it.
> - The first process finishes writing the file, writing the correct header.
> - Now a third process wants to import the module, sees the valid
>   header, and starts reading the file.
> - However, while this is going on, the second process gets ready to
>   write the file.
> - The second process truncates the file, writes the dummy header, and
>   then stalls.
> - At this point the third process (which thought it was reading a
>   valid file) sees an unexpected EOF because the file has been
>   truncated.
>
> Now, this would explain the EOFError, but not necessarily the
> ValueError with "unknown type code". However, it looks like marshal
> doesn't always check for EOF immediately (sometimes it calls getc()
> without checking the result, and sometimes it doesn't check the error
> state after calling r_string()), so I think all the errors are
> actually explainable from this scenario.
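
For illustration, the temp-file-plus-rename approach mentioned above could
look roughly like the sketch below. This is a minimal sketch under stated
assumptions, not importlib's actual code: the name write_atomic and the
temp-file naming are hypothetical, and it relies on os.replace() (available
since Python 3.3) to swap the finished file into place in one step.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to `path` so that readers see either the old file or the
    complete new one, never a truncated or half-written file.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on
    # one filesystem (a cross-device rename would fail instead).
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything went wrong.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

With this scheme, the second writer in the scenario above never truncates
the file the third process is reading. It builds its own temp file and
renames it over the old one, so on POSIX systems a reader that already has
the old file open keeps reading the old, complete contents.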
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/brett%40python.org