[Python-ideas] Regex timeouts

2022-02-15 Thread Ben Rudiak-Gould
On Tue, Feb 15, 2022 at 6:13 PM Chris Angelico wrote: > Once upon a time, a "regular expression" was a regular grammar. That is no > longer the case. > I use "regex" for the weird backtracking minilanguages and deliberately never call them "regular expressions". (I was under the impression that

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread David Mertz, Ph.D.
I know this is probably too much self promotion, but I really enjoyed writing this less than a year ago: https://gnosis.cx/regex/ (The Puzzling Quirks of Regular Expressions). It's like other puzzle books, but for programmers. You should certainly still get Friedl's book if you don't have it. You

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Tim Peters
[Chris Angelico] > Is there any sort of standardization of regexp syntax and semantics, Sure. "The nice thing about standards is that you have so many to choose from" ;-) For example, POSIX defines a regexp flavor so it can specify what things like grep do. The ECMAScript standard defines its

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread MRAB
On 2022-02-16 02:11, Chris Angelico wrote: On Wed, 16 Feb 2022 at 12:56, Tim Peters wrote: Regexps keep "evolving"... Once upon a time, a "regular expression" was a regular grammar. That is no longer the case. Once upon a time, a regular expression could be broadly compatible with multiple

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Chris Angelico
On Wed, 16 Feb 2022 at 12:56, Tim Peters wrote: > Regexps keep "evolving"... Once upon a time, a "regular expression" was a regular grammar. That is no longer the case. Once upon a time, a regular expression could be broadly compatible with multiple different parser engines. That is being

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Tim Peters
[Steven D'Aprano] > After this thread, I no longer trust that "easy" regexes will do what > they "obviously" look like they should do :-( > > I'm not trying to be funny or snarky. I *thought* I had a reasonable > understanding of regexes, and now I have learned that I don't, and that > the

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Chris Angelico
On Wed, 16 Feb 2022 at 10:15, Steven D'Aprano wrote: > > On Tue, Feb 15, 2022 at 11:51:41PM +0900, Stephen J. Turnbull wrote: > > > scanf just isn't powerful enough. For example, consider parsing user > > input dates: scanf("%d/%d/%d", , , ). This is nice and > > simple, but handling

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Chris Angelico
On Wed, 16 Feb 2022 at 09:28, Steven D'Aprano wrote: > > On Wed, Feb 16, 2022 at 01:02:44AM +1100, Chris Angelico wrote: > > > Yeah, regexes always look terrible when they're used for simple > > examples :) But try matching a line that has (somewhere in it) the > > word "spam", then whitespace,

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Steven D'Aprano
On Tue, Feb 15, 2022 at 11:51:41PM +0900, Stephen J. Turnbull wrote: > scanf just isn't powerful enough. For example, consider parsing user > input dates: scanf("%d/%d/%d", , , ). This is nice and > simple, but handling "2022-02-15" as well requires a bit of thinking > and several extra

[Python-ideas] New convenience attribute pathlib.Path.stems

2022-02-15 Thread Clay Gerrard
>>> from pathlib import Path
>>> p = Path('/etc/swift/object.ring.gz')
>>> p.suffix
'.gz'
>>> p.suffixes
['.ring', '.gz']
>>> p.stem
'object.ring'
>>> p.stems
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'PosixPath' object has no attribute 'stems'

I think it would
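The snippet above is cut off before it says what `stems` should return, so the following is only a guess at one related need: getting the file name with every suffix removed, using attributes pathlib already has. The helper name `bare_stem` is hypothetical and is not part of pathlib or of the proposal.

```python
from pathlib import PurePosixPath


def bare_stem(path):
    """Return the final path component with all suffixes stripped.

    Hypothetical helper: pathlib has no such attribute. It relies only on
    the existing `name` and `suffixes` attributes.
    """
    name = path.name
    # Total length of all suffixes, e.g. '.ring' + '.gz' is 8 characters.
    suffix_len = sum(len(s) for s in path.suffixes)
    return name[: len(name) - suffix_len]


p = PurePosixPath('/etc/swift/object.ring.gz')
print(bare_stem(p))
```

PurePosixPath is used here so the sketch behaves the same on any platform; a concrete Path should work the same way.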

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Steven D'Aprano
On Wed, Feb 16, 2022 at 01:02:44AM +1100, Chris Angelico wrote: > Yeah, regexes always look terrible when they're used for simple > examples :) But try matching a line that has (somewhere in it) the > word "spam", then whitespace, then a number (or if you prefer: then a > sequence of ASCII
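As a rough illustration of the task described in the quoted text (a line containing the word "spam", then whitespace, then a number), here is a minimal sketch with Python's `re` module. The pattern and the function name `find_spam_number` are my own assumptions, not taken from the thread, and "number" is read here as a run of ASCII digits.

```python
import re

# "spam" as a whole word start, then whitespace, then ASCII digits.
SPAM_NUMBER = re.compile(r"\bspam\s+([0-9]+)")


def find_spam_number(line):
    """Return the number following 'spam' as an int, or None if absent."""
    m = SPAM_NUMBER.search(line)
    return int(m.group(1)) if m else None


print(find_spam_number("I ordered spam   42 times"))
```

Using `search` rather than `match` lets "spam" appear anywhere in the line.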

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread J.B. Langston
How embarrassing... I apologize for all the signature garbage at the end of my message.

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Chris Angelico
On Wed, 16 Feb 2022 at 01:54, Stephen J. Turnbull wrote: > The Zawinski quote is motivated by the perception that people seem to > think that simplicity lies in minimizing the number of tools you need > to learn. REXX and SNOBOL pattern matching are quite a bit more > specialized to particular tools

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread J.B. Langston
Tim Peters wrote: > """ > Some people, when confronted with a problem, think “I know, I'll use > regular expressions.” Now they have two problems. > - Jamie Zawinski > """ Maybe so, but I'm committed now :). I have dozens of regexes to parse specific log messages I'm interested in. I made a

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread J.B. Langston
> > A regex that's vulnerable to pathological behavior is a DoS attack waiting >> to happen. Especially when used for parsing log data (which might contain >> untrusted data). If possible, we should make it harder for people to shoot >> themselves in the feet. >> > And this is exactly what
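To make the concern concrete: the standard `re` module takes no timeout argument, so one workaround is to run the search in a child process and give up after a deadline. The sketch below is only that, a workaround under stated assumptions, and not what the thread proposes. The function name `search_with_timeout` and the exit-code convention are hypothetical, and passing the text on the command line limits it to modest input sizes.

```python
import subprocess
import sys

# Code run in the child interpreter: exit 0 on a match, 3 on no match.
# Any other exit code (for example an invalid pattern) is treated as an error.
_CHILD = (
    "import re, sys\n"
    "sys.exit(0 if re.search(sys.argv[1], sys.argv[2]) else 3)\n"
)


def search_with_timeout(pattern, text, timeout):
    """Return True or False for match/no match, or None if time ran out."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout,  # subprocess.run kills the child on expiry
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 0:
        return True
    if proc.returncode == 3:
        return False
    raise RuntimeError(f"child exited with code {proc.returncode}")


# Nested quantifiers backtrack exponentially on a near-miss input.
print(search_with_timeout(r"(a+)+$", "a" * 40 + "b", timeout=1))
```

Starting an interpreter per search is slow, so this suits occasional checks on untrusted input rather than a hot loop over log lines.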

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread MRAB
On 2022-02-15 06:05, Tim Peters wrote: [Steven D'Aprano] I've been interested in the existence of SNOBOL string scanning for a long time, but I know very little about it. How does it differ from regexes, and why have programming languages pretty much standardised on regexes rather than other

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Stephen J. Turnbull
Tim Peters writes: > Chris didn't say this, but I will: I'm amazed that things much > _simpler_ than regexps, like his scanf and REXX PARSE > examples, haven't spread more. scanf just isn't powerful enough. For example, consider parsing user input dates: scanf("%d/%d/%d", , , ). This is
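For comparison, here is a minimal Python sketch of accepting both the slash form and the "2022-02-15" form with one regular expression. It is my own illustration, not code from the message. The name `parse_date_fields` is hypothetical, and the sketch deliberately returns the three integers in the order written without deciding which is the year, month, or day.

```python
import re

# Three integer fields joined by the same separator, either "/" or "-".
# The backreference \2 rejects mixed separators such as "2022-02/15".
DATE_FIELDS = re.compile(r"\s*(\d+)([/-])(\d+)\2(\d+)\s*", re.ASCII)


def parse_date_fields(text):
    """Return the three numeric fields as ints, or None if the text differs."""
    m = DATE_FIELDS.fullmatch(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(3)), int(m.group(4))


print(parse_date_fields("2022-02-15"))
```

Range checks (month 1 to 12 and so on) are left out; they would sit more naturally in ordinary code after the match than in the pattern.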

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Chris Angelico
On Wed, 16 Feb 2022 at 00:55, Steven D'Aprano wrote: > > On Tue, Feb 15, 2022 at 05:39:33AM -0600, Tim Peters wrote: > > > ([^s]|s(?!pam))*spam > > > > Bingo. That pattern is easy enough to understand > > You and I have very different definitions of the word "easy" :-) > > > (if not to invent the

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Steven D'Aprano
On Tue, Feb 15, 2022 at 05:39:33AM -0600, Tim Peters wrote: > ([^s]|s(?!pam))*spam > > Bingo. That pattern is easy enough to understand You and I have very different definitions of the word "easy" :-) > (if not to invent the > first time): we can chew up a character if it's not an "s", or if
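The quoted pattern can be tried directly in Python. The sketch below wraps it in a hypothetical helper, `up_to_first_spam`, to show that it consumes text only up to and including the first "spam"; the helper and the sample strings are assumptions added for illustration.

```python
import re

# Pattern quoted in the message: consume a character if it is not "s",
# or if it is an "s" that does not start "spam"; then require "spam".
NEXT_SPAM = re.compile(r"([^s]|s(?!pam))*spam")


def up_to_first_spam(text):
    """Return the prefix of text ending at the first 'spam', or None."""
    m = NEXT_SPAM.match(text)
    return m.group(0) if m else None


print(up_to_first_spam("eggs spam and spam"))
```

The negative lookahead `(?!pam)` is what stops the loop from running past the first "spam" and then backtracking to a later one.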

[Python-ideas] Re: Regex timeouts

2022-02-15 Thread Tim Peters
[Tim, on trying to match only the next instance of "spam"] > Assertions aren't needed, but it is nightmarish to get right. Followed by a nightmare that got it wrong. My apologies - that's what I get for trying to show off ;-) It's actually far easier if assertions are used, and I'm too old to

[Python-ideas] Re: Please consider mentioning property without setter when an attribute can't be set

2022-02-15 Thread André Roberge
On Fri, Feb 11, 2022 at 5:39 AM Steven D'Aprano wrote: > On Thu, Feb 10, 2022 at 02:27:42PM -0800, Neil Girdhar wrote: > > > AttributeError: can't set attribute 'f' > > > > This can be a pain to debug when the property is buried in a base class. > > > Would it make sense to mention the reason