New submission from Steve Newcomb:

Our most regular-expression-intensive Python 2.7 code takes 2.5x as long to 
run in 2.7.12 as it did in 2.7.6.  I discovered this after upgrading from 
Ubuntu 14.04 to Ubuntu 16.04.  The code runs thousands of compiled regular 
expressions against thousands of texts, and it makes heavy use of both the 
multiprocessing module and the re module.

See attached profiler outputs, which look quite different in several respects.  
I used the profiling module to profile the same Python code, processing the 
same data, using the same hardware, under both Ubuntu 14.04 (Python 2.7.6) and 
Ubuntu 16.04 (Python 2.7.12).  
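
For anyone who wants to produce comparable profiles on both interpreters, 
here is a minimal sketch of the kind of run described above.  It assumes 
cProfile and pstats from the standard library; the patterns, the texts, and 
the function names are hypothetical stand-ins, not the production code or the 
exact procedure used for the attached profiles.

```python
import cProfile
import pstats
import re
import sys

# Hypothetical stand-ins for the real patterns and texts.
PATTERNS = [re.compile(p) for p in (r"\bfoo\w*", r"\d{3}-\d{4}", r"(?i)newcomb")]
TEXTS = ["foo bar 555-1234 Newcomb"] * 1000


def run_workload():
    """Search every text with every compiled pattern and count the hits."""
    hits = 0
    for text in TEXTS:
        for pattern in PATTERNS:
            if pattern.search(text):
                hits += 1
    return hits


def profile_workload():
    """Run the workload under cProfile and return (hits, pstats.Stats)."""
    profiler = cProfile.Profile()
    hits = profiler.runcall(run_workload)
    return hits, pstats.Stats(profiler)


if __name__ == "__main__":
    hits, stats = profile_workload()
    # Record the interpreter version next to the numbers being compared.
    print(sys.version)
    print("hits: %d" % hits)
    stats.sort_stats("cumulative").print_stats(10)
```

Running the same script under each interpreter and comparing the top 
cumulative entries should show whether the extra time lands in re calls or 
somewhere else.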

It is striking, for example, that cPickle.load appears so prominently in the 
2.7.12 profile -- a fact which appears to implicate the multiprocessing module 
somehow.  But I suspect that the re module is more likely the main source of 
the problem, because the execution times of other production steps -- steps 
that do not call the multiprocessing module -- also appear to be extended to a 
degree that is roughly proportional to the amount of regular expression 
processing done in those other steps.
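
One way to test that suspicion is to time the regular expression work in a 
single process, with multiprocessing and pickling taken out of the picture.  
The sketch below assumes timeit from the standard library; the patterns, the 
texts, and the helper names are made up for illustration and are far simpler 
than the real ones.

```python
import re
import sys
import timeit

# Hypothetical patterns and texts; substitute a representative sample.
PATTERNS = [
    re.compile(p)
    for p in (r"[a-z]+\d+", r"\b\w+ing\b", r"(\w+)@(\w+)\.com")
]
TEXTS = ["processing item42 for user@example.com"] * 200


def match_all():
    """Apply every compiled pattern to every text in this process only."""
    count = 0
    for text in TEXTS:
        for pattern in PATTERNS:
            if pattern.search(text) is not None:
                count += 1
    return count


def best_time(repeat=3, number=5):
    """Return the best observed time, in seconds, for one match_all() call."""
    timings = timeit.repeat(match_all, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    print(sys.version)
    print("matches: %d" % match_all())
    print("best seconds per pass: %.6f" % best_time())
```

If the per-pass time differs between 2.7.6 and 2.7.12 by a similar factor, 
that would point at the re module rather than multiprocessing; if it does 
not, the cPickle.load entries deserve a closer look.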

I will happily provide any further information I can.  Any insights about this 
surprisingly severe performance degradation would be welcome.

----------
files: profiles_2.7.6_vs_2.7.12
messages: 273932
nosy: steve.newcomb
priority: normal
severity: normal
status: open
title: regexp performance degradation between 2.7.6 and 2.7.12
Added file: http://bugs.python.org/file44277/profiles_2.7.6_vs_2.7.12

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27898>
_______________________________________