On 2008-12-22 22:45, Steven D'Aprano wrote:
> On Mon, 22 Dec 2008 11:20:59 pm M.-A. Lemburg wrote:
>> On 2008-12-20 23:16, Martin v. Löwis wrote:
>>>>> I will try next week to see if I can come up with a smaller,
>>>>> submittable example. Thanks.
>>>> These long exit times are usually caused by the garbage collection
>>>> of objects. This can be a very time consuming task.
>>> I doubt that. The long exit times are usually caused by a bad
>>> malloc implementation.
>>
>> With "garbage collection" I meant the process of Py_DECREF'ing the
>> objects in large containers or deeply nested structures, not the GC
>> mechanism for breaking circular references in Python.
>>
>> This will usually also involve free() calls, so the malloc
>> implementation affects this as well. However, I've seen such long
>> exit times on Linux and Windows, which both have rather good
>> malloc implementations.
>>
>> I don't think there's anything much we can do about it at the
>> interpreter level. Deleting millions of objects takes time and that's
>> not really surprising at all. It takes even longer if you have
>> instances with .__del__() methods written in Python.
>
> This behaviour appears to be specific to deleting dicts, not deleting
> random objects. I haven't yet confirmed that the problem still exists
> in trunk (I hope to have time tonight or tomorrow), but in my previous
> tests deleting millions of items stored in a list of tuples completed
> in a minute or two, while deleting the same items stored as key:item
> pairs in a dict took 30+ minutes. I say plus because I never had the
> patience to let it run to completion; it could have been hours for all
> I know.
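A rough way to reproduce the comparison described above (an untested
sketch only; the item count and the (str, int) payload are invented here,
and a 2.x interpreter of that era is assumed):

    import time

    N = 5 * 1000 * 1000

    def build_and_free(make):
        data = make()
        start = time.time()
        del data                    # time only the deallocation, not the build
        return time.time() - start

    list_time = build_and_free(lambda: [(str(i), i) for i in xrange(N)])
    dict_time = build_and_free(lambda: dict((str(i), i) for i in xrange(N)))

    print "list of tuples: %.1fs   dict: %.1fs" % (list_time, dict_time)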
That's interesting. The dictionary dealloc routine doesn't give any hint
as to why this should take longer than deallocating a list of tuples.

However, due to the way dictionary tables are allocated, it is possible
to end up with a table that is nearly twice the size actually needed for
the number of items in the dictionary. At dictionary sizes like these,
that can mean a lot of extra memory being allocated, certainly more than
the corresponding list of tuples would use.

>> Applications can choose other mechanisms for speeding up the
>> exit process in various (less clean) ways, if they have a need for
>> this.
>>
>> BTW: Rather than using a huge in-memory dict, I'd suggest to either
>> use an on-disk dictionary such as the ones found in mxBeeBase or
>> a database.
>
> The original poster's application uses 45GB of data. In my earlier
> tests, I've experienced the problem with ~ 300 *megabytes* of data:
> hardly what I would call "huge".

Times have changed, that's true :-)

-- 
Marc-Andre Lemburg
eGenix.com
http://www.egenix.com/
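Regarding the on-disk dictionary suggestion quoted above: a minimal
sketch of that approach, using the standard library's shelve module as a
stand-in for mxBeeBase (whose API isn't shown in this thread); the
filename and item count are made up for illustration:

    import shelve

    db = shelve.open("bigdata.db")
    try:
        for i in xrange(1000000):
            db[str(i)] = ("some", "payload", i)   # shelve keys must be strings
        print db["123456"]                        # random access, dict-style
    finally:
        db.close()   # closing the shelf avoids tearing down millions of
                     # in-memory objects at interpreter exit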