[issue6370] Bad performance of collections.Counter at initialisation from an iterable

SilentGhost Mon, 29 Jun 2009 07:56:36 -0700

New submission from SilentGhost <michael.mischurow+...@gmail.com>:

I'm comparing initialisation of Counter from an iterable with the
following function:


def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""

    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict


iterable = list(range(43)) + list(range(43, 0, -1))

The timeit results show that Counter takes four (4) times longer to
finish. As is obvious from comparing my function with lines 429-430 of
collections.py, the only difference is that my function preallocates the
final dictionary. When line 430 of collections.py is replaced with:

self[elem] = self.get(elem, 0) + 1

I got about a 25% speed-up (I assume because __missing__ is bypassed).
I hope that the implementation can be improved even further.
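
A minimal sketch of how the comparison might be reproduced. MissingCounter
and GetCounter are hypothetical stand-ins written for illustration, not the
actual collections.py code: the first counts with "self[elem] += 1" and
relies on __missing__ for absent keys, the second uses the get-based line
quoted above. The repetition count is an arbitrary assumption, and timings
will vary by machine and Python version, so none are shown.

```python
import timeit
from collections import Counter


class MissingCounter(dict):
    """Hypothetical stand-in: absent keys are resolved through __missing__."""

    def __missing__(self, key):
        return 0

    def update_counts(self, iterable):
        for elem in iterable:
            self[elem] += 1  # first sighting of elem calls __missing__


class GetCounter(dict):
    """Hypothetical stand-in: uses dict.get, so __missing__ is never called."""

    def update_counts(self, iterable):
        for elem in iterable:
            self[elem] = self.get(elem, 0) + 1


def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""
    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict


def count_with(cls, iterable):
    """Build an empty cls instance and count the elements of iterable."""
    counts = cls()
    counts.update_counts(iterable)
    return counts


iterable = list(range(43)) + list(range(43, 0, -1))


def run_benchmark(number=10000):
    """Time each approach on the same iterable; returns seconds per approach."""
    candidates = {
        "Counter": lambda: Counter(iterable),
        "unique": lambda: unique(iterable),
        "MissingCounter": lambda: count_with(MissingCounter, iterable),
        "GetCounter": lambda: count_with(GetCounter, iterable),
    }
    return {name: timeit.timeit(func, number=number)
            for name, func in candidates.items()}


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print("%-15s %.4f s" % (name, seconds))
```

All four approaches produce the same counts, so any difference in the
printed times comes from how the counting is done. Note that later Python
versions may implement Counter differently, so the ratios need not match
the ones reported here.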

----------
components: Library (Lib)
messages: 89846
nosy: SilentGhost
severity: normal
status: open
title: Bad performance of collections.Counter at initialisation from an iterable
type: performance
versions: Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6370>
_______________________________________