On 01/13/2012 09:19 PM, Dag Sverre Seljebotn wrote:
> On 01/13/2012 02:13 AM, Asher Langton wrote:
>> Hi all,
>>
>> (I originally posted this to the BayPIGgies list, where Fernando Perez
>> suggested I send it to the NumPy list as well. My apologies if you're
>> receiving this email twice.)
>>
>> I work on a Python/C++ scientific code that runs as a number of
>> independent Python processes communicating via MPI. Unfortunately, as
>> some of you may have experienced, module importing does not scale well
>> in Python/MPI applications. For 32k processes on BlueGene/P, importing
>> 100 trivial C-extension modules takes 5.5 hours, compared to 35
>> minutes for all other interpreter loading and initialization. We
>> developed a simple pure-Python module (based on knee.py, a
>> hierarchical import example) that cuts the import time from 5.5 hours
>> to 6 minutes.
>>
>> The code is available here:
>>
>> https://github.com/langton/MPI_Import
>>
>> Usage, implementation details, and limitations are described in a
>> docstring at the beginning of the file (just after the mandatory
>> legalese).
>>
>> I've talked with a few people who've faced the same problem and heard
>> about a variety of approaches, which range from putting all necessary
>> files in one directory to hacking the interpreter itself so it
>> distributes the module-loading over MPI. Last summer, I had a student
>> intern try a few of these approaches. It turned out that the problem
>> wasn't so much the simultaneous module loads, but rather the huge
>> number of failed open() calls (ENOENT) as the interpreter tries to
>> find the module files. In the MPI_Import module, we have rank 0
>> perform the module lookups and then broadcast the locations to the
>> rest of the processes. For our real-world scientific applications
>> written in Python and C++, this has meant that we can start a problem
>> and actually make computational progress before the batch allocation
>> ends.
>
> This is great news!
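The rank-0 lookup-and-broadcast idea Asher describes could be sketched roughly as below. This is not the actual MPI_Import implementation, just a minimal illustration; `locate_on_root` is a hypothetical name, and the `bcast` callable stands in for a real MPI broadcast such as mpi4py's `comm.bcast`, so the sketch stays runnable without an MPI launcher.

```python
# Sketch (not MPI_Import itself): only rank 0 searches sys.path for a
# module; the resulting file location is broadcast to all other ranks,
# which therefore never generate the failed open()/ENOENT storm on the
# shared filesystem.
import importlib.util

def locate_on_root(name, rank, bcast):
    """Resolve a module's file location on rank 0 and broadcast it.

    `bcast(obj, root=0)` is any broadcast-shaped callable; with mpi4py
    it would be MPI.COMM_WORLD.bcast. Non-root ranks pass origin=None
    and receive rank 0's answer.
    """
    origin = None
    if rank == 0:
        spec = importlib.util.find_spec(name)
        origin = spec.origin if spec is not None else None
    # In a real run, every rank returns rank 0's value here.
    return bcast(origin, root=0)
```

In a single-process demo, a trivial `bcast` such as `lambda v, root: v` shows the rank-0 path; under MPI the same call shape lets every other rank skip the filesystem search entirely.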
I've forwarded this to the mpi4py mailing list, which despairs over this
regularly.

Another idea: given your diagnostics, wouldn't dumping the output of
"find" for every path in sys.path to a single text file work well? Each
node would then download that file once and consult it when looking up
modules, instead of querying network-filesystem metadata.

(In fact, I think "texhash" does the same for LaTeX.)

The disadvantage is that one would need to run "update-python-paths"
every time a package is installed, to refresh the text file. But I'm not
sure that disadvantage is larger than having to remember to avoid
diverging import paths between nodes; hopefully one could put a reminder
to run update-python-paths in the ImportError string.
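Dag's "walk the paths once" suggestion might look something like the sketch below: scan every directory on sys.path a single time, record where each top-level module lives, and consult that in-memory (or serialized) cache instead of stat()ing the filesystem per import attempt. The function name is made up for illustration, the proposed "update-python-paths" tool does not exist, and real code would also need to handle zip entries and namespace packages.

```python
# Sketch of a one-pass module-location cache built from sys.path.
# Hypothetical helper; the "update-python-paths" tool it imagines is
# Dag's proposal, not an existing utility.
import os
import sys

def build_module_cache():
    """Map top-level module/package names to their file locations.

    First match on sys.path wins, mirroring normal import precedence.
    Plain files (.py/.so/.pyd) and packages with __init__.py are
    recorded; zipimport entries are ignored in this sketch.
    """
    cache = {}
    for entry in sys.path:
        if not os.path.isdir(entry):
            continue  # skip '', zip files, and dead entries
        for item in os.listdir(entry):
            base, ext = os.path.splitext(item)
            init = os.path.join(entry, item, "__init__.py")
            if ext in (".py", ".so", ".pyd"):
                cache.setdefault(base, os.path.join(entry, item))
            elif os.path.isfile(init):
                cache.setdefault(item, init)
    return cache
```

The resulting dict could be written to a text file once (the "update-python-paths" step) and read by every node at startup, trading per-import metadata traffic for one staleness-prone file.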
I meant "diverging code paths during imports between nodes".

Dag

>> If you try out the code, I'd appreciate any feedback you have:
>> performance results, bugfixes/feature-additions, or alternate
>> approaches to solving this problem. Thanks!

> I didn't try it myself, but forwarding this from the mpi4py mailing list:
>
> """
> I'm testing it now and actually running into some funny errors with
> unittest on Python 2.7 causing infinite recursion. If anyone is able to
> get this going, and could report successes back to the group, that
> would be very helpful.
> """
>
> Dag Sverre

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion