Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Dmitry Olshansky


19-Dec-2013 01:05, Andrej Mitrovic wrote:

> On 12/18/13, Dmitry Olshansky wrote:
>> The precedent is Perl. A heavy influencer on the (former) std.regex design.
>> http://perldoc.perl.org/perlre.html#Capture-groups
>> (grep for $')
>
> Ah, classic Perl. Write once - don't bother to read ever again. :p



Or rather - if it's so fast to (re)write, why bother reading at all? :)

--
Dmitry Olshansky

Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Andrej Mitrovic

On 12/18/13, Dmitry Olshansky  wrote:
> The precedent is Perl. A heavy influencer on the (former) std.regex design.
> http://perldoc.perl.org/perlre.html#Capture-groups
> (grep for $')

Ah, classic Perl. Write once - don't bother to read ever again. :p

Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Dmitry Olshansky


18-Dec-2013 23:54, Andrej Mitrovic wrote:

> On 12/18/13, Dmitry Olshansky wrote:
>> P.S. This reminds me to put a roadmap of sorts on where std.regex is
>> going and what to expect.
>
> Btw one thing I'm not fond of is the format specifiers, in particular:
>
> $`  part of input preceding the match.
> $'  part of input following the match.
>
> ` and ' are very hard to tell apart. But I guess this was based on an
> existing standard? Personally I'd prefer $< and $>.


The precedent is Perl. A heavy influencer on the (former) std.regex design.
http://perldoc.perl.org/perlre.html#Capture-groups
(grep for $')

Personally I'd prefer both simply gone :) The reasoning is that you can't 
support these while pattern matching on the fly (say, on a network 
stream). Since we can't drop them, anything better that is popular enough 
is acceptable.


--
Dmitry Olshansky
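
For reference, a small sketch of what the two specifiers expand to, next to the pre and post properties of Captures, which carry the same text under names that are easier to tell apart. The input string and the pattern are made up for illustration:

```d
import std.regex : matchFirst, regex, replaceFirst;
import std.stdio : writeln;

void main()
{
    auto re = regex(`\d+`);
    auto m = matchFirst("abc123def", re);

    // The pieces around the match, available directly on the captures.
    assert(m.pre == "abc");
    assert(m.hit == "123");
    assert(m.post == "def");

    // In a replacement format, $` is the text before the match
    // and $' is the text after it.
    auto replaced = replaceFirst("abc123def", re, "[$`|$']");
    assert(replaced == "abc[abc|def]def");
    writeln(replaced);
}
```

Both forms need the whole input to be available, which is the streaming limitation mentioned above.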

Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Andrej Mitrovic

On 12/18/13, Dmitry Olshansky  wrote:
> P.S. This reminds me to put a roadmap of sorts on where std.regex is
> going and what to expect.

Btw one thing I'm not fond of is the format specifiers, in particular:

$`  part of input preceding the match.
$'  part of input following the match.

` and ' are very hard to tell apart. But I guess this was based on an
existing standard? Personally I'd prefer $< and $>.

Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Andrej Mitrovic

On 12/18/13, Dmitry Olshansky  wrote:
> At the end of the day any feature is interesting as long as we carefully
> weigh:
>
> - how useful a feature is
> - how widespread the syntax/how many precedents in other libraries
>
> against
>
> - how difficult to implement
> - does it affect backwards compatibility
> - any other hidden costs
>
> I'd be glad to implement well motivated enhancement requests.

Excellent, that's what I'm hoping for from any library dev. Weigh the
odds before adding random features. :)

Re: std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Dmitry Olshansky


18-Dec-2013 22:33, Andrej Mitrovic wrote:

> I'm reading through http://www.regular-expressions.info, and there's a
> feature that's missing from std.regex, quoted:
>
> -
> All the characters between the \Q and the \E are interpreted as
> literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The
> \E may be omitted at the end of the regex, so \Q*\d+* is the same as
> \Q*\d+*\E.

[snip]

> Should this feature be added? I guess there's probably more regex
> features missing (I just began reading the page), I'm not sure how
> Dmitry feels about adding X number of features though.


All in all I wanted to be principled about what set of features to 
support. The initial design was:

1. Choose a syntax flavor (ECMAScript).
2. Add some powerful stuff (e.g. unlimited lookbehind, full Unicode support).
3. Add some convenient stuff that is popular enough/easy to implement 
(named captures).
4. Avoid extensions that complicate the engine and preclude optimizations, 
or heavily depend on implementation. (So no recursion and similar madness.)


In that light 'missing' might be on purpose. For instance, std.regex 
doesn't provide 'atomic' (possessive) groups simply because they are a kludge 
invented for the poor performance of backtracking engines.


At the end of the day any feature is interesting as long as we carefully weigh:

- how useful a feature is
- how widespread the syntax/how many precedents in other libraries

against

- how difficult to implement
- does it affect backwards compatibility
- any other hidden costs

I'd be glad to implement well motivated enhancement requests.

P.S. This reminds me to put a roadmap of sorts on where std.regex is 
going and what to expect.


--
Dmitry Olshansky

std.regex literal syntax (the \Q…\E escape sequence)

2013-12-18 Thread Andrej Mitrovic

I'm reading through http://www.regular-expressions.info, and there's a
feature that's missing from std.regex, quoted:

-
All the characters between the \Q and the \E are interpreted as
literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The
\E may be omitted at the end of the regex, so \Q*\d+* is the same as
\Q*\d+*\E.
-

This would translate to the following needing to work (which fails at
runtime with an exception):

writeln(r"*\d+*".match(r"\Q*\d+*\E"));

Should this feature be added? I guess there's probably more regex
features missing (I just began reading the page), I'm not sure how
Dmitry feels about adding X number of features though.
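
Without \Q…\E the same match can still be had by backslash-escaping the metacharacters before compiling the pattern. A minimal sketch: quoteLiteral is a hypothetical helper (not a Phobos function), and the set of characters it escapes is an assumption about what needs escaping outside a character class:

```d
import std.array : appender;
import std.regex : matchFirst, regex;
import std.stdio : writeln;

// Hypothetical helper: put a backslash in front of each regex metacharacter
// so the text is matched literally. All metacharacters are ASCII, so
// walking the string by code unit is safe for UTF-8 input.
string quoteLiteral(string text)
{
    auto result = appender!string();
    foreach (char c; text)
    {
        switch (c)
        {
            case '\\', '.', '[', ']', '{', '}', '(', ')',
                 '*', '+', '?', '|', '^', '$':
                result.put('\\');
                break;
            default:
                break;
        }
        result.put(c);
    }
    return result.data;
}

void main()
{
    auto pattern = quoteLiteral(r"*\d+*");
    assert(pattern == r"\*\\d\+\*");

    // The escaped pattern matches the literal text *\d+*.
    auto m = matchFirst(r"*\d+*", regex(pattern));
    assert(!m.empty && m.hit == r"*\d+*");
    writeln(m.hit);
}
```

This covers the whole-pattern case only; mixing quoted and unquoted parts means concatenating quoteLiteral output with ordinary pattern text by hand.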
