New submission from dw <[email protected]>: (Confirmed in 22 Apr nightly)
http://github.com/dw/mua/mua/mboxrd.py is a nasty reimplementation of the mailbox module's mbox parser, since that one is slow and doesn't handle Content-length. The module keeps a cache of mmaps and relies on mmap.find() and two regexes to discover message boundaries. My test feeds mboxrd.Mboxrd() a 1.8gb mailbox. A mailbox of the same size, but with about 10x the messages can be downloaded at http://k2.botanicus.net/qemu-all.mbox.bz2 (353mb compressed). On CPython the module can parse out 57k messages in about 9 seconds, whereas on PyPy one of the regexes, CONTENT_LENGTH_PAT or SEPARATOR_PAT, appears to be killing performance. Test script was simply: import mua.mboxrd mbox = mua.mboxrd.Mboxrd('source-mbox') print sum(1 for _ in mbox.iter()) Running PyPy nightly under 'perf record -g' produced: # Overhead Command Shared Object Symbol # ........ ....... .................. .......................................... 35.17% pypy libc-2.18.so [.] __memcpy_sse2_unaligned 11.07% pypy [kernel.kallsyms] [k] clear_page_c 10.64% pypy libc-2.18.so [.] memset 8.25% pypy [kernel.kallsyms] [k] page_fault 4.48% pypy [kernel.kallsyms] [k] _raw_spin_lock 3.60% pypy [kernel.kallsyms] [k] get_page_from_freelist 3.22% pypy [kernel.kallsyms] [k] free_pcppages_bulk 3.15% pypy [kernel.kallsyms] [k] __rmqueue 2.33% pypy [kernel.kallsyms] [k] unmap_single_vma 2.19% pypy [kernel.kallsyms] [k] release_pages 1.66% pypy [kernel.kallsyms] [k] handle_mm_fault 1.33% pypy [kernel.kallsyms] [k] page_add_new_anon_rmap 1.22% pypy [kernel.kallsyms] [k] __pagevec_lru_add_fn 1.06% pypy [kernel.kallsyms] [k] get_pageblock_flags_group 1.00% pypy [kernel.kallsyms] [k] free_hot_cold_page Copies, memsets and faults clearly consume >50% runtime. cProfile output: 13653 function calls (13459 primitive calls) in 60.199 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 60.199 60.199 p.py:2(<module>) 34 0.000 0.000 60.157 1.769 p.py:5(<genexpr>) 34 0.001 0.000 60.157 1.769 mboxrd.py:124(iter) 34 0.002 0.000 60.155 1.769 mboxrd.py:104(_find_content_length_end) 68 60.153 0.885 60.153 0.885 {method 'search' of 'SRE_Pattern' objects} 1 0.000 0.000 0.041 0.041 mboxrd.py:2(<module>) 1 0.002 0.002 0.040 0.040 rfc822.py:2(<module>) 1 0.000 0.000 0.031 0.031 header.py:5(<module>) 1 0.001 0.001 0.027 0.027 quoprimime.py:5(<module>) 1 0.002 0.002 0.021 0.021 utils.py:5(<module>) 22 0.000 0.000 0.020 0.001 re.py:188(compile) 22 0.000 0.000 0.020 0.001 re.py:226(_compile) 21 0.000 0.000 0.020 0.001 sre_compile.py:493(compile) 21 0.000 0.000 0.011 0.001 sre_compile.py:478(_code) 36 0.000 0.000 0.009 0.000 sre_compile.py:178(_compile_charset) 36 0.005 0.000 0.009 0.000 sre_compile.py:207(_optimize_charset) Notice SRE_Pattern.search only called 68 times in a minute. Attempting to replace the pattern with one that does not rely on re.I provided no speed gain. While chatting with Armin on #pypy I erroneously suggested the copies may be due to mem_file(), which relies on a CPython-specific trick to produce a byte aligned zero copy file-like object over a subset of an mmap, however that code is never invoked. It seems the issue is purely with the regex. ---------- messages: 6736 nosy: dw, pypy-issue priority: performance bug status: unread title: SRE_Pattern.search() of simple pattern causes crazy slowdown ________________________________________ PyPy bug tracker <[email protected]> <https://bugs.pypy.org/issue1741> ________________________________________ _______________________________________________ pypy-issue mailing list [email protected] https://mail.python.org/mailman/listinfo/pypy-issue
