Re: [Python-Dev] New regex module for 3.2?

2010-07-28 Thread Gregory P. Smith
On Tue, Jul 27, 2010 at 6:43 PM, R. David Murray rdmur...@bitdance.com wrote: On Tue, 27 Jul 2010 08:27:35 +0200, Stefan Behnel stefan...@behnel.de wrote: Gregory P. Smith, 27.07.2010 07:40: A max cache size of 100 was too small. I just increased it to 500 in the py3k branch along with

Re: [Python-Dev] New regex module for 3.2?

2010-07-28 Thread Nick Coghlan
On Wed, Jul 28, 2010 at 4:50 PM, Gregory P. Smith g...@krypto.org wrote: On Tue, Jul 27, 2010 at 6:43 PM, R. David Murray rdmur...@bitdance.com wrote: On Tue, 27 Jul 2010 08:27:35 +0200, Stefan Behnel stefan...@behnel.de wrote: Gregory P. Smith, 27.07.2010 07:40: Random replacement

Re: [Python-Dev] New regex module for 3.2?

2010-07-27 Thread Stefan Behnel
Gregory P. Smith, 27.07.2010 07:40: A max cache size of 100 was too small. I just increased it to 500 in the py3k branch along with implementing a random replacement cache overflow policy. It now randomly drops 20% of the compiled regular expression cache instead of simply dropping the entire

Re: [Python-Dev] New regex module for 3.2?

2010-07-27 Thread R. David Murray
On Tue, 27 Jul 2010 08:27:35 +0200, Stefan Behnel stefan...@behnel.de wrote: Gregory P. Smith, 27.07.2010 07:40: A max cache size of 100 was too small. I just increased it to 500 in the py3k branch along with implementing a random replacement cache overflow policy. It now randomly drops

Re: [Python-Dev] New regex module for 3.2?

2010-07-26 Thread Georg Brandl
On 22.07.2010 12:53, Guido van Rossum wrote: On Fri, Jul 16, 2010 at 6:08 PM, Georg Brandl g.bra...@gmx.net wrote: Nevertheless, the authoritative reference for our regex engine is its docs, i.e. http://docs.python.org/library/re.html -- and that states clearly that inline flags apply to the

Re: [Python-Dev] New regex module for 3.2?

2010-07-26 Thread Gregory P. Smith
On Thu, Jul 22, 2010 at 3:26 PM, Nick Coghlan ncogh...@gmail.com wrote: On Fri, Jul 23, 2010 at 12:42 AM, Georg Brandl g.bra...@gmx.net wrote: Sure -- I don't think this is a showstopper for regex. However if we don't include regex in a future version, we might think about increasing

Re: [Python-Dev] New regex module for 3.2?

2010-07-23 Thread Hrvoje Niksic
On 07/22/2010 01:34 PM, Georg Brandl wrote: Timings (seconds to run the test suite): re 26.689 26.015 26.008 regex 26.066 25.797 25.865 So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of regexes and matching most of them only a few

Re: [Python-Dev] New regex module for 3.2?

2010-07-23 Thread Nick Coghlan
On Fri, Jul 23, 2010 at 8:16 PM, Hrvoje Niksic hrvoje.nik...@avl.com wrote: The performance trade-off should make regex slower with sufficiently small compiled regex cache, when a lot of time is wasted on compilation.  But as the cache gets larger (and, for fairness, of the same size in both
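The effect of the compiled-pattern cache on a compile-heavy workload can be measured with a small harness like the one below. It is only a sketch using the standard-library re and timeit modules: the pattern list and the two helper functions are hypothetical, and it does not reproduce the test suite timed earlier in the thread.

```python
import re
import timeit

# Hypothetical workload: many distinct patterns, each matched only once per pass.
PATTERNS = [r"\d+-%d" % i for i in range(50)]
TEXT = "abc 123-7 def"


def compile_heavy():
    """Empty the cache first, so every pattern is recompiled on each pass."""
    re.purge()
    for pattern in PATTERNS:
        re.compile(pattern).search(TEXT)


def cache_friendly():
    """Leave the cache alone, so repeated passes reuse compiled patterns."""
    for pattern in PATTERNS:
        re.compile(pattern).search(TEXT)


cold = timeit.timeit(compile_heavy, number=20)
warm = timeit.timeit(cache_friendly, number=20)
print("recompiling each pass: %.4f s" % cold)
print("using the cache:       %.4f s" % warm)
```

For a fair comparison between two modules, both would need to run with the same cache size, as the message points out.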

Re: [Python-Dev] New regex module for 3.2?

2010-07-23 Thread Georg Brandl
On 23.07.2010 11:16, Hrvoje Niksic wrote: On 07/22/2010 01:34 PM, Georg Brandl wrote: Timings (seconds to run the test suite): re 26.689 26.015 26.008 regex 26.066 25.797 25.865 So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Georg Brandl
On 13.07.2010 15:35, Antoine Pitrou wrote: On Tue, 13 Jul 2010 15:20:23 +0100 Michael Foord fuzzy...@voidspace.org.uk wrote: On 13/07/2010 15:17, Reid Kleckner wrote: On Mon, Jul 12, 2010 at 2:07 PM, Nick Coghlan ncogh...@gmail.com wrote: MRAB's module offers a superset of re's

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Guido van Rossum
On Fri, Jul 16, 2010 at 6:08 PM, Georg Brandl g.bra...@gmx.net wrote: Nevertheless, the authoritative reference for our regex engine is its docs, i.e. http://docs.python.org/library/re.html -- and that states clearly that inline flags apply to the whole regex. I think with a new regex

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Nick Coghlan
On Thu, Jul 22, 2010 at 9:34 PM, Georg Brandl g.bra...@gmx.net wrote: So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of regexes and matching most of them only a few times in comparison).  However, I found that looking at the regex caching is

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Georg Brandl
On 22.07.2010 14:12, Nick Coghlan wrote: On Thu, Jul 22, 2010 at 9:34 PM, Georg Brandl g.bra...@gmx.net wrote: So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of regexes and matching most of them only a few times in comparison). However, I

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Reid Kleckner
On Thu, Jul 22, 2010 at 7:42 AM, Georg Brandl g.bra...@gmx.net wrote: On 22.07.2010 14:12, Nick Coghlan wrote: On Thu, Jul 22, 2010 at 9:34 PM, Georg Brandl g.bra...@gmx.net wrote: So, I thought there wasn't a difference in performance for this use case (which is compiling a lot of regexes

Re: [Python-Dev] New regex module for 3.2?

2010-07-22 Thread Nick Coghlan
On Fri, Jul 23, 2010 at 12:42 AM, Georg Brandl g.bra...@gmx.net wrote: Sure -- I don't think this is a showstopper for regex.  However if we don't include regex in a future version, we might think about increasing MAXCACHE a bit, and maybe not clear the cache when it reaches its max length, but

Re: [Python-Dev] New regex module for 3.2?

2010-07-16 Thread Vlastimil Brom
2010/7/9 Georg Brandl g.bra...@gmx.net: On 09.07.2010 02:35, MRAB wrote: 1. Some of the inline flags are scoped; for example, putting (?i) at the end of a regex will now have no effect because it's no longer a global, all-or-nothing, flag. That is problematic. I've often seen people put

Re: [Python-Dev] New regex module for 3.2?

2010-07-16 Thread Georg Brandl
On 16.07.2010 17:08, Vlastimil Brom wrote: 2010/7/9 Georg Brandl g.bra...@gmx.net: On 09.07.2010 02:35, MRAB wrote: 1. Some of the inline flags are scoped; for example, putting (?i) at the end of a regex will now have no effect because it's no longer a global, all-or-nothing, flag.
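The difference between a whole-pattern flag and a scoped flag can be illustrated with the standard-library re module. This is a sketch only: it assumes a Python version whose re supports the (?i:...) group syntax (3.6 or later), and it does not show how the regex module discussed here treated a trailing (?i) in 2010. The sample patterns are made up for the example.

```python
import re

# Flag at the start of the pattern: applies to the whole expression.
whole = re.compile(r"(?i)python-dev")

# Scoped flag: only the group "python" is matched case-insensitively.
scoped = re.compile(r"(?i:python)-dev")

print(bool(whole.fullmatch("PYTHON-DEV")))   # whole pattern ignores case
print(bool(scoped.fullmatch("PYTHON-dev")))  # "-dev" still matches exactly
print(bool(scoped.fullmatch("PYTHON-DEV")))  # "-DEV" is outside the flag's scope
```

With scoping, the position of the flag decides which part of the pattern it affects, which is why patterns written with a trailing flag are the compatibility concern raised in this message.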

Re: [Python-Dev] New regex module for 3.2?

2010-07-13 Thread Gregory P. Smith
On Thu, Jul 8, 2010 at 12:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at: http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be

Re: [Python-Dev] New regex module for 3.2?

2010-07-13 Thread Vlastimil Brom
2010/7/8 MRAB pyt...@mrabarnett.plus.com: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:    http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be interested in any comments or feedback.

Re: [Python-Dev] New regex module for 3.2?

2010-07-13 Thread Reid Kleckner
On Mon, Jul 12, 2010 at 2:07 PM, Nick Coghlan ncogh...@gmail.com wrote: MRAB's module offers a superset of re's features rather than a subset though, so once it has had more of a chance to bake on PyPI it may be worth another look. I feel like the new module is designed to replace the current

Re: [Python-Dev] New regex module for 3.2?

2010-07-13 Thread Michael Foord
On 13/07/2010 15:17, Reid Kleckner wrote: On Mon, Jul 12, 2010 at 2:07 PM, Nick Coghlan ncogh...@gmail.com wrote: MRAB's module offers a superset of re's features rather than a subset though, so once it has had more of a chance to bake on PyPI it may be worth another look. I feel

Re: [Python-Dev] New regex module for 3.2?

2010-07-13 Thread Antoine Pitrou
On Tue, 13 Jul 2010 15:20:23 +0100 Michael Foord fuzzy...@voidspace.org.uk wrote: On 13/07/2010 15:17, Reid Kleckner wrote: On Mon, Jul 12, 2010 at 2:07 PM, Nick Coghlan ncogh...@gmail.com wrote: MRAB's module offers a superset of re's features rather than a subset though, so once it

Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Nick Coghlan
On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote: On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is interesting from the point of if it should be included in stdlib. Is it re2 or regex? I don't see having 2 regular expression engines in the

Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Michael Foord
On 12/07/2010 15:07, Nick Coghlan wrote: On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote: On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is interesting from the point of if it should be included in stdlib. Is it re2 or regex? I

Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Antoine Pitrou
On Mon, 12 Jul 2010 16:18:38 +0100 Michael Foord fuzzy...@voidspace.org.uk wrote: On 12/07/2010 15:07, Nick Coghlan wrote: On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote: On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is

Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Tim Wintle
On Mon, 2010-07-12 at 16:18 +0100, Michael Foord wrote: On 12/07/2010 15:07, Nick Coghlan wrote: On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote: re2 deliberately omits some features for efficiency reasons, hence is not even on the table as a possible

Re: [Python-Dev] New regex module for 3.2?

2010-07-12 Thread Collin Winter
On Mon, Jul 12, 2010 at 8:18 AM, Michael Foord fuzzy...@voidspace.org.uk wrote: On 12/07/2010 15:07, Nick Coghlan wrote: On Mon, Jul 12, 2010 at 9:42 AM, Steven D'Aprano st...@pearwood.info wrote: On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is interesting from the

Re: [Python-Dev] New regex module for 3.2?

2010-07-11 Thread anatoly techtonik
On Fri, Jul 9, 2010 at 6:59 PM, Jeffrey Yasskin jyass...@gmail.com wrote: While the re2 comparison might be interesting from an abstract standpoint, it intentionally supports a different regex language from Python so that it can run faster and use less memory. Since re2 can never replace

Re: [Python-Dev] New regex module for 3.2?

2010-07-11 Thread Eric Smith
On 7/11/2010 5:19 AM, anatoly techtonik wrote: On Fri, Jul 9, 2010 at 6:59 PM, Jeffrey Yasskin jyass...@gmail.com wrote: While the re2 comparison might be interesting from an abstract standpoint, it intentionally supports a different regex language from Python so that it can run faster and use

Re: [Python-Dev] New regex module for 3.2?

2010-07-11 Thread Nick Coghlan
On Sun, Jul 11, 2010 at 7:19 PM, anatoly techtonik techto...@gmail.com wrote: On Fri, Jul 9, 2010 at 6:59 PM, Jeffrey Yasskin jyass...@gmail.com wrote: While the re2 comparison might be interesting from an abstract standpoint, it intentionally supports a different regex language from Python

Re: [Python-Dev] New regex module for 3.2?

2010-07-11 Thread Steven D'Aprano
On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is interesting from the point of if it should be included in stdlib. Is it re2 or regex? I don't see having 2 regular expression engines in the stdlib. There's precedent though... the old regex engine and the new re engine

Re: [Python-Dev] New regex module for 3.2?

2010-07-11 Thread geremy condra
On Sun, Jul 11, 2010 at 7:42 PM, Steven D'Aprano st...@pearwood.info wrote: On Sun, 11 Jul 2010 09:37:22 pm Eric Smith wrote: re2 comparison is interesting from the point of if it should be included in stdlib. Is it re2 or regex? I don't see having 2 regular expression engines in the

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread Georg Brandl
On 09.07.2010 02:35, MRAB wrote: That's not what I'm asking. I'm asking what happens if you take an existing Python installation's re module, move it aside, and drop regex in its place as re.py. Doing that and then running Python's own test suite as well as the test suites of major

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread anatoly techtonik
On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:    http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be interested

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread Jeffrey Yasskin
On Fri, Jul 9, 2010 at 7:06 AM, anatoly techtonik techto...@gmail.com wrote: On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:    http://pypi.python.org/pypi/regex

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread MRAB
anatoly techtonik wrote: On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at: http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread Collin Winter
On Fri, Jul 9, 2010 at 10:28 AM, MRAB pyt...@mrabarnett.plus.com wrote: anatoly techtonik wrote: On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:  

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread MRAB
Collin Winter wrote: On Fri, Jul 9, 2010 at 10:28 AM, MRAB pyt...@mrabarnett.plus.com wrote: anatoly techtonik wrote: On Thu, Jul 8, 2010 at 10:52 PM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread Fred Drake
On Fri, Jul 9, 2010 at 3:35 PM, MRAB pyt...@mrabarnett.plus.com wrote: I concentrated my efforts on the matching speed because regexes tend to be compiled only once, and are cached anyway, so I don't think it's as important. I think most here will agree with that, but it might be good to keep

Re: [Python-Dev] New regex module for 3.2?

2010-07-09 Thread William Wahl
H as long as we aren't the ones writing the check :) BJ -- Original Message -- From: Fred Drake fdr...@acm.org Sent: Fri, July 09, 2010 1:16 PM To: MRAB pyt...@mrabarnett.plus.com Cc: Python-Dev python-dev@python.org Subject: Re: [Python-Dev] New regex module for 3.2? On Fri, Jul

[Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB
Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at: http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be interested in any comments or feedback. How does it compare with re in terms of speed

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 5:52 AM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:    http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be interested

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Benjamin Peterson
2010/7/8 MRAB pyt...@mrabarnett.plus.com: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at:    http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re. I'd be interested in any comments or feedback.

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Antoine Pitrou
On Thu, 08 Jul 2010 20:52:44 +0100 MRAB pyt...@mrabarnett.plus.com wrote: I'd be interested in any comments or feedback. How does it compare with re in terms of speed on real-world data? The benchmarks suggest it should be faster, or at worst comparable. Can you publish these benchmarks

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB
Nick Coghlan wrote: On Fri, Jul 9, 2010 at 5:52 AM, MRAB pyt...@mrabarnett.plus.com wrote: Hi all, I re-implemented the re module, adding new features and speed improvements. It's available at: http://pypi.python.org/pypi/regex under the name regex so that it can be tried alongside re.

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 7:54 AM, MRAB pyt...@mrabarnett.plus.com wrote: You should be able to replace: import re with: import regex as re and still have everything work the same, i.e. it's backwards compatible with re. That's not what I'm asking. I'm asking what happens if you take

Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB
Nick Coghlan wrote: On Fri, Jul 9, 2010 at 7:54 AM, MRAB pyt...@mrabarnett.plus.com wrote: You should be able to replace: import re with: import regex as re and still have everything work the same, i.e. it's backwards compatible with re. That's not what I'm asking. I'm asking what
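The import swap mentioned above can be tried without breaking code on machines where the PyPI package is missing. The fallback below is a common idiom rather than anything proposed in the thread, and the sample pattern and input string are made up for illustration.

```python
try:
    import regex as re  # third-party module from PyPI, if installed
except ImportError:
    import re           # fall back to the standard-library module

# Code written against the re API should run unchanged with either module.
match = re.match(r"(\w+)-(\d+)", "py-32")
print(match.groups())
```

This only exercises API compatibility for ordinary patterns; it says nothing about the behavioural differences (such as scoped inline flags) discussed elsewhere in the thread.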