On Feb 11, 2:00 pm, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> > Can someone describe the details of how Python loads modules into
> > memory?  I assume once the .py file is compiled to .pyc that it is
> > mmap'ed in.  But that assumption is very naive.  Maybe it uses an
> > anonymous mapping?  Maybe it does other special magic?  This is all
> > very alien to me, so if someone could explain it in terms that a
> > person who never usually worries about memory could understand, that
> > would be much appreciated.
>
> There is no magic whatsoever. Python opens a sequential file descriptor
> for the .pyc file, and then reads it in small chunks, "unmarshalling"
> it (indeed, the marshal module is used to restore Python objects).
>
> The marshal format is an object serialization in a type-value encoding
> (sometimes type-length-value), with type codes for:
> - None, True, False
> - 32-bit ints, 64-bit ints (unmarshalled into int/long)
> - floats, complex
> - arbitrary-sized longs
> - strings, unicode
> - tuples (length + marshal data of values)
> - lists
> - dicts
> - code objects
> - a few others
>
> Result of unmarshalling is typically a code object.
>
> > Follow up: is this process different if the modules are loaded from a
> > zipfile?
>
> No; it uncompresses into memory, and then unmarshals from there (
> compressed block for compressed block)
>
> > If there is a link that covers this info, that'd be great too.
>
> See the description of the marshal module.
>
> HTH,
> Martin

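To make the "unmarshalling" step concrete, here is a minimal sketch of the round trip using only the marshal module. It is not what import.c does internally, just the same serialization seen from Python. The source string and the greet function are made up for illustration, and the snippet is written for a current Python 3 rather than the 2.5 discussed in this thread.

```python
import marshal

# Hypothetical module source, used only for this illustration.
source = "def greet(name):\n    return 'hello ' + name\n"

# compile() produces a code object, the kind of object a .pyc file stores.
code = compile(source, "<example>", "exec")

# marshal.dumps() serializes the code object to bytes. A real .pyc file
# also carries a small header in front of this data, omitted here.
data = marshal.dumps(code)

# marshal.loads() is the unmarshalling step: bytes back to a code object.
restored = marshal.loads(data)

# Executing the restored code object populates a module-like namespace.
namespace = {}
exec(restored, namespace)
result = namespace["greet"]("world")
print(result)
```

The point is that the result of unmarshalling is an ordinary code object built on the heap, so nothing stays mapped to the file afterwards.
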
Thanks for the answers.  For my own edification, and in case anyone is
interested, I confirmed this by looking at import.c and marshal.c in the
Python 2.5.4 source.  It looks like the actual reading of the file is done
in the marshal.c function PyMarshal_ReadLastObjectFromFile.  The file is
read sequentially using a small buffer on the heap.

-sjbrown
-- 
http://mail.python.org/mailman/listinfo/python-list
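For anyone who wants to poke at this from Python rather than C, the following is a rough sketch of the two cases discussed above: unmarshalling from a plain file, and unmarshalling from a compressed zip entry after it has been decompressed into memory. It only mirrors the idea, not the actual C code path. The payload, the temporary file, and the entry name example.bin are assumptions made for the example, and the payload has no .pyc header. It is written for a current Python 3.

```python
import io
import marshal
import os
import tempfile
import zipfile

# Hypothetical payload: a marshalled code object without a .pyc header.
code = compile("answer = 6 * 7\n", "<example>", "exec")
payload = marshal.dumps(code)

# Case 1: a plain file on disk, read through an ordinary file object.
fd, path = tempfile.mkstemp(suffix=".bin")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        # marshal.load() reads one object from an open binary file.
        from_file = marshal.load(f)
finally:
    os.remove(path)

# Case 2: a compressed zip entry, decompressed into memory first and
# then unmarshalled from the resulting bytes.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("example.bin", payload)
with zipfile.ZipFile(archive) as zf:
    from_zip = marshal.loads(zf.read("example.bin"))

# Both paths end with an equivalent code object.
ns_file, ns_zip = {}, {}
exec(from_file, ns_file)
exec(from_zip, ns_zip)
print(ns_file["answer"], ns_zip["answer"])
```
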