Tim, I left out some details that I believe probably rule out the
"swapped out" theory.  The machine in question has 64GB RAM, but only
16GB swap.  I'd prefer more swap, but in any case only about 400MB
of the swap was actually in use during my program's entire run.
Furthermore, during my program's exit, it was using 100% CPU, and I'm
95% sure there was no significant "system" or "wait" CPU time for the
system.  (All observations via 'top'.)  So, I think that the problem
is entirely a computational one within this process.
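
For anyone who wants numbers rather than readings from 'top', here is a
minimal sketch of one way to check the user/system split and the page-fault
count from inside the process. It is Unix-only (it uses the standard
`resource` module), and the helper names `cpu_and_fault_snapshot` and
`delta` are made up for illustration. It also assumes that deleting the big
dict explicitly costs about the same as the teardown at exit, which has not
been verified here.

```python
import resource


def cpu_and_fault_snapshot():
    """Return user/system CPU seconds and major page faults for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user_s": usage.ru_utime,         # CPU time spent in user mode
        "system_s": usage.ru_stime,       # CPU time spent in the kernel
        "major_faults": usage.ru_majflt,  # page faults that required disk I/O
    }


def delta(before, after):
    """Subtract two snapshots field by field."""
    return {key: after[key] - before[key] for key in before}
```

Taking one snapshot before `del big_dict` and one after, then calling
`delta(before, after)`, would show whether the time is mostly user CPU
(computation) or whether major faults dominate (paging from swap).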

The system does have 8 CPUs.  I'm not sure about its memory
architecture, but if it's some kind of NUMA box, I guess access to
memory could be slower than what we'd normally expect.  I'm skeptical
about that being a significant factor here, though.

Just to clarify, I didn't gc.disable() to address this problem, but
rather because gc destroys performance during the creation of the huge
dict.  I don't have a specific number, but I think disabling gc
reduced construction from something like 70 minutes to 5 (or maybe
10).  Quite dramatic.
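
A minimal sketch of the pattern, not the actual program: the function name
`build_dict_without_gc` and the tuple payload are hypothetical stand-ins for
the real data. Restoring the collector afterwards is a choice made for this
example, not something the real program is known to do.

```python
import gc


def build_dict_without_gc(n):
    """Build a dict of n entries with the cyclic collector paused."""
    was_enabled = gc.isenabled()
    gc.disable()  # stop generational collections from firing during the build
    try:
        # Hypothetical payload: small tuples stand in for the real values.
        return {i: (i, str(i)) for i in range(n)}
    finally:
        # Restore the previous collector state even if the build fails.
        if was_enabled:
            gc.enable()
```

Reference counting still frees objects while the collector is off, so only
the detection of reference cycles is paused.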

Mike


From Tim Peters:
BTW, the original poster should try this:  use whatever tools the OS
supplies to look at CPU and disk usage during the long exit.  What I
/expect/ is that almost no CPU time is being used, while the disk is
grinding itself to dust.  That's what happens when a large number of
objects have been swapped out to disk, and exit processing has to page
them all back into memory again (in order to decrement their
refcounts).
