New submission from SilentGhost <[email protected]>:
I'm comparing initialisation of Counter from an iterable with the
following function:
def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""
    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict
iterable = list(range(43)) + list(range(43, 0, -1))
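
For anyone wanting to reproduce the comparison, a minimal timing sketch
might look like the following. The repetition count (number=1000) and the
use of lambdas are my assumptions, not the exact setup used for the figures
in this report, and later CPython releases changed Counter's
implementation, so the ratio may not match what Python 3.1 shows.

```python
import timeit
from collections import Counter


def unique(seq):
    """Dict of unique values (keys) & their counts in original sequence"""
    out_dict = dict.fromkeys(set(seq), 0)
    for i in seq:
        out_dict[i] += 1
    return out_dict


iterable = list(range(43)) + list(range(43, 0, -1))

# Sanity check: both approaches should yield the same counts.
assert unique(iterable) == dict(Counter(iterable))

# Assumed repetition count; adjust to taste.
number = 1000
t_counter = timeit.timeit(lambda: Counter(iterable), number=number)
t_unique = timeit.timeit(lambda: unique(iterable), number=number)

print("Counter: %.4f s" % t_counter)
print("unique:  %.4f s" % t_unique)
```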
Timings obtained with timeit show that Counter takes four (4) times
longer to finish. As is obvious from comparing my function with lines
429-430 of collections.py, the only difference is that my function
preallocates the final dictionary. When line 430 of collections.py is
replaced with:
    self[elem] = self.get(elem, 0) + 1
I get about a 25% improvement in running time (I assume because
__missing__ is bypassed). I hope it's possible to improve the
implementation even further.
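
To illustrate the two counting loops in isolation, here is a rough sketch
using two hypothetical dict subclasses (MissingCounter and GetCounter are
names made up for this example, not part of collections.py). The first
relies on __missing__ to supply 0 for unseen keys; the second uses
dict.get as in the replacement line above. It is only meant to show the
difference in approach, not the actual Counter code.

```python
import timeit


class MissingCounter(dict):
    """Hypothetical counter: new keys go through __missing__."""

    def __init__(self, iterable=()):
        super().__init__()
        for elem in iterable:
            # For an unseen key, dict.__getitem__ falls back to __missing__.
            self[elem] += 1

    def __missing__(self, key):
        return 0


class GetCounter(dict):
    """Hypothetical counter: new keys are handled by dict.get."""

    def __init__(self, iterable=()):
        super().__init__()
        for elem in iterable:
            # dict.get returns the default directly, without __missing__.
            self[elem] = self.get(elem, 0) + 1


iterable = list(range(43)) + list(range(43, 0, -1))

# Both variants should agree on the resulting counts.
assert MissingCounter(iterable) == GetCounter(iterable)

# Assumed repetition count; timings vary by machine and Python version.
number = 1000
t_missing = timeit.timeit(lambda: MissingCounter(iterable), number=number)
t_get = timeit.timeit(lambda: GetCounter(iterable), number=number)

print("__missing__ variant: %.4f s" % t_missing)
print("get variant:         %.4f s" % t_get)
```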
----------
components: Library (Lib)
messages: 89846
nosy: SilentGhost
severity: normal
status: open
title: Bad performance of collections.Counter at initialisation from an
iterable
type: performance
versions: Python 3.1
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue6370>
_______________________________________