[issue18594] C accelerator for collections.Counter is slow
Roundup Robot added the comment: New changeset e4cec1116e5c by Raymond Hettinger in branch '3.3': Issue #18594: Make the C code more closely match the pure python code. http://hg.python.org/cpython/rev/e4cec1116e5c -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Roundup Robot added the comment: New changeset 6aef095fdb30 by Raymond Hettinger in branch '3.3': Issue #18594: Fix the fast path for collections.Counter(). http://hg.python.org/cpython/rev/6aef095fdb30 -- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Raymond Hettinger added the comment: Attaching a patch for the slow path. Makes the code exactly match the pure python version. This kicks in whether someone has subclassed Counter and overridden either __getitem__ or __setitem__. -- Added file: http://bugs.python.org/file31932/fix_counter2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Stefan Behnel added the comment: Can you update the benchmark numbers to show what the difference is compared to pure Python (and to the fastpath) now? One more thing: the fastpath depends on .__getitem__() and friends, whereas the fallback path depends on .get(). What if someone overrides .get() but not .__getitem__()? (Might be a hypothetical case...) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Roundup Robot added the comment: New changeset 1ee6f8a96fb9 by Raymond Hettinger in branch '3.3': Issue #18594: Fix the fallback path in collections.Counter(). http://hg.python.org/cpython/rev/1ee6f8a96fb9 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- resolution: - fixed status: open - closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- stage: needs patch - patch review Added file: http://bugs.python.org/file31917/fix_counter.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Raymond Hettinger raymond.hettin...@gmail.com: Removed file: http://bugs.python.org/file31917/fix_counter.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Raymond Hettinger raymond.hettin...@gmail.com: Added file: http://bugs.python.org/file31918/fix_counter.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Raymond Hettinger added the comment: Repaired version $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(1000) for i in range(10)]; from collections import Counter' 'Counter(data)' 100 loops, best of 3: 14.3 msec per loop $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(50) for i in range(10)]; from collections import Counter' 'Counter(data)' 10 loops, best of 3: 40.8 msec per loop Current with accelerator $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(1000) for i in range(10)]; from collections import Counter' 'Counter(data)' 10 loops, best of 3: 61.7 msec per loop $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(50) for i in range(10)]; from collections import Counter' 'Counter(data)' 10 loops, best of 3: 118 msec per loop Current without accelerator --- $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(1000) for i in range(10)]; from collections import Counter' 'Counter(data)' 10 loops, best of 3: 54.9 msec per loop $ py -m timeit -s 'from random import seed, randrange; seed(8675309); data=[randrange(50) for i in range(10)]; from collections import Counter' 'Counter(data)' 10 loops, best of 3: 80.8 msec per loop -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Stefan Behnel added the comment: Patch LGTM and seems to work well, according to your numbers. Only minor nitpick would be that the method references could be decref-ed earlier, but that would complicate the code a bit. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Serhiy Storchaka added the comment: Benchmarking results look great. But isn't _PyObject_LookupSpecial() more appropriate function for special methods lookup than PyObject_GetAttrString()? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: - rhettinger priority: high - normal ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Raymond Hettinger added the comment: A C-accelerator should ALWAYS be able to beat a pure python version if it does the same steps but without the overhead of the eval-loop. And in special cases such as type(self)==Counter, it can do much better. To resolve this report, the C accelerator needs to be fixed. As Stefan pointed-out, the fast-path isn't being triggered because of PyDict_CheckExact test. And, the fallback path can be sped-up as well (by doing the same steps in C as are being done with the pure python code). -- stage: - needs patch versions: +Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Eli Bendersky eli...@gmail.com: -- nosy: -eli.bendersky ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Anoop Thomas Mathew added the comment: 40% faster collections.Counter() . Removed C accelerator. Patch attached. Passes all tests. Results comparison follows. -- keywords: +patch nosy: +Anoop.Thomas.Mathew Added file: http://bugs.python.org/file31774/collections_Counter_without_C_accelerator_with_comprehension.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Anoop Thomas Mathew added the comment: Performance comparison with and without patch applied. -- Added file: http://bugs.python.org/file31775/performance_comparision.csv ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Phil Connell pconn...@gmail.com: -- nosy: +pconnell ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
New submission from Stefan Behnel: The C accelerator for the collections.Counter class (_count_elements() in _collections.c) is slower than the pure Python versions for data that has many unique entries. This is because the fast path for dicts is not taken (Counter is a subtype of dict) and the slower fallback path raises exceptions for each value that wasn't previously seen. This can apparently make it slower than calling get() on Python side. My suggestion is to drop the fallback path from the accelerator completely and to only call the C function when it's safe to use it, e.g. when type(self) is Counter and not a subclass. -- components: Library (Lib) messages: 193914 nosy: scoder, serhiy.storchaka priority: normal severity: normal status: open title: C accelerator for collections.Counter is slow type: performance versions: Python 3.4 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +rhettinger priority: normal - high ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Serhiy Storchaka added the comment: Are there any cases where the counter class with the C accelerator is faster than the pure Python version? Here is a benchmarking script (modification of Roy Smith's script [1]) and looks as the pure Python version is faster even for data that has not many unique entries. [1] http://permalink.gmane.org/gmane.comp.python.general/738820 -- Added file: http://bugs.python.org/file31085/counterbench.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue18594] C accelerator for collections.Counter is slow
Eli Bendersky added the comment: That sounds like a good idea, Stefan. -- nosy: +eli.bendersky ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue18594 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com