Guido van Rossum wrote: > I was just reminded that in Python 3, list.sort() and sorted() no > longer support the cmp (comparator) function argument. The reason is > that the key function argument is always better. But now I have a > nagging doubt about this: > > I recently advised a Googler who was sorting a large dataset and > running out of memory. My analysis of the situation was that he was > sorting a huge list of short lines of the form "shortstring,integer" > with a key function that returned a tuple of the form ("shortstring", > integer). Using the key function argument, in addition to N short > string objects, this creates N tuples of length 2, N more slightly > shorter string objects, and N integer objects. (Not to count a > parallel array of N more pointers.) Given the object overhead, this > dramatically increased the memory usage. It so happens that in this > particular Googler's situation, memory is constrained but CPU time is > not, and it would be better to parse the strings over and over again > in a comparator function. > > But in Python 3 this solution is no longer available. How bad is that? > I'm not sure. But I'd like to at least get the issue out in the open.
While there are other arguments to reintroduce cmp (or less_than instead?) the memory problem could also be addressed with a dont_cache_keys flag or max_cache_keys limit. Peter _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com