Regarding the performance difference between "re" and "regex", and the related 
packaging options: we ran some micro-benchmarks from the Python Benchmark Suite 
(https://github.com/python/performance) under Python 3.6.0 to compare the two:

Results in ms, lower is better (running on Ubuntu 15.10). The "regex" column 
was measured via "pip install regex" and a replacement of "import re" with 
"import regex as re".

                            re        regex
bm_regex_compile.py         229       298
bm_regex_dna.py             171       267
bm_regex_effbot.py          2.77      3.04
bm_regex_v8.py              24.8      14.1

This data shows "re" performing better than "regex" in 3 of the 4 
micro-benchmarks above; only bm_regex_v8.py favors "regex".
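
For anyone who wants a quick sanity check without setting up the full suite, 
here is a minimal sketch of the same kind of comparison using timeit. It is 
not one of the benchmark-suite scripts: the pattern, the input text and the 
helper name time_findall are made up for illustration, and "regex" is only 
timed if it happens to be installed.

```python
import re
import timeit

try:
    import regex  # third-party module: pip install regex
except ImportError:
    regex = None  # fall back to timing "re" only

# Hypothetical workload, chosen only for illustration.
PATTERN = r"\b[a-z]+@[a-z]+\.(?:com|org)\b"
TEXT = "contact: alice@example.com, bob@example.org " * 200


def time_findall(module, repeat=3, number=20):
    """Return the best time per findall() call, in milliseconds."""
    compiled = module.compile(PATTERN)
    best = min(timeit.repeat(lambda: compiled.findall(TEXT),
                             repeat=repeat, number=number))
    return best / number * 1000.0


results = {"re": time_findall(re)}
if regex is not None:
    results["regex"] = time_findall(regex)

for name, ms in results.items():
    print("%-6s %8.3f ms per call" % (name, ms))
```

A single pattern like this says little about overall performance, so treat 
the output as a smoke test rather than a replacement for the numbers above.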

Anyone searching for "regular expression python" will get the Python 
documentation on "re" as the first hit.  Naturally, a new developer is likely 
to start with "re" from day 1 and never bother to look for alternatives later.

We ran a query for "import re" against OpenStack, the big cloud computing 
software project (3.7 million lines of source code, the majority of it written 
in Python), and got ~1000 hits.
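
A rough way to reproduce that kind of count on any source tree is sketched 
below. This is an assumption about how such a query could be done, not the 
exact query we ran; the function names are hypothetical, and it only catches 
"import re" and "from re import ..." at the start of a line (so it misses 
forms such as "import os, re").

```python
import os
import re

# Matches "import re" or "from re import ..." at the start of a line.
# The \b keeps "import regex" from being counted.
IMPORT_RE = re.compile(r"^[ \t]*(?:import[ \t]+re\b|from[ \t]+re[ \t]+import\b)",
                       re.MULTILINE)


def count_imports(source_text):
    """Count lines in one source string that import the "re" module."""
    return len(IMPORT_RE.findall(source_text))


def count_in_tree(root):
    """Sum count_imports() over every .py file below the directory root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".py"):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total += count_imports(handle.read())
    return total
```

Calling count_in_tree() on a checkout directory gives a total comparable in 
spirit to the hit count above, though the exact number will depend on the 
matching rules.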

That being said, IMHO it would be nice to capture ("borrow") the performance 
benefits of "regex" and merge them into "re", so that users get them without 
knowing or worrying about packaging/installing anything.

Cheers,

Peter

 

-----Original Message-----
From: Python-Dev 
[mailto:python-dev-bounces+peter.xihong.wang=intel....@python.org] On Behalf Of 
Nick Coghlan
Sent: Tuesday, January 31, 2017 1:54 AM
To: Barry Warsaw <ba...@python.org>
Cc: python-dev@python.org
Subject: Re: [Python-Dev] re performance

On 30 January 2017 at 15:26, Barry Warsaw <ba...@python.org> wrote:
> On Jan 30, 2017, at 12:38 PM, Nick Coghlan wrote:
>
>>I think there are 3 main candidates that could fit that bill:
>>
>>- requests
>>- setuptools
>>- regex
>
> Actually, I think pkg_resources would make an excellent candidate.  
> The setuptools crew is working on a branch that would allow for 
> setuptools and pkg_resources to be split, which would be great for 
> other reasons.  Splitting them may mean that pkg_resources could 
> eventually be added to the stdlib, but as an intermediate step, it 
> could also test out this idea.  It probably has a lot less of the baggage 
> that you outline.

Yep, if/when pkg_resources is successfully split out from the rest of 
setuptools, I agree it would also be a good candidate for stdlib bundling - 
version independent runtime access to the database of installed packages is a 
key capability for many use cases, and not currently something we support 
especially well.

It's also far more analogous to the existing pip bundling, since 
setuptools/pkg_resources are also maintained under the PyPA structure.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/peter.xihong.wang%40intel.com