Re: apo5

Ruud H.G. van Tol Mon, 21 Nov 2005 10:58:19 -0800

Larry Wall:
> Ruud H.G. van Tol:


>     dev.perl.org one day latency but html-ified
>     svn.perl.org up to the minute but only in pod

Thanks, much better. Can't say that I haven't been there before.

There is a "[[:alpha:][:digit:]" (missing the closing "]") and a
"[[:alpha:][:digit]]" (missing a ":" after "digit") on the A5 page.


>> The '^' could be used for both the ultimate start- and end-of-string.
>> This frees the '$'.
>
> I think this is one of those aspects of regex culture that is too
> entrenched to remove.

Yes, I have experienced that with some of my procmail recipes that use
'^' to match embedded newlines.
In procmail, '^^' anchors the begin- or end-of-string, and both '^' and
'$' can be used to match a real or putative newline. Some people
replaced my '^'s with '$'s.

OK, everybody can stop reading here, no serious attempts below.

"Within C++, there is a much smaller and cleaner language struggling to
get out," which "would ... have been an unimportant cult language."
(Bjarne Stroustrup, The Design and Evolution of C++).


> Besides, you have to be able to distinguish
> s/^/foo/ from s/$/foo/.

's/$/foo/' becomes 's/<after .*>/foo/'
<g>
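
To make the s/^/foo/ versus s/$/foo/ distinction concrete, here is a
minimal sketch in Python's re module rather than in Perl 6 rules. The
function names are made up for illustration. Python's look-behind must be
fixed-width, so '<after .*>' cannot be written directly; the third
function swaps in a negative look-ahead ("no character follows") to find
end-of-string without a '$'.

```python
import re

def insert_at_start(text):
    # Analogue of s/^/foo/: substitute at the very start of the string.
    return re.sub(r"\A", "foo", text)

def insert_at_end(text):
    # Analogue of s/$/foo/: substitute at the very end of the string.
    return re.sub(r"\Z", "foo", text)

def insert_at_end_without_dollar(text):
    # End-of-string expressed without '$' or '\Z': a position that is
    # not followed by any character at all (newlines included).
    return re.sub(r"(?![\s\S])", "foo", text)

print(insert_at_start("bar"))
print(insert_at_end("bar"))
print(insert_at_end_without_dollar("bar"))
```

The two substitutions have to stay distinguishable whichever character
ends up spelling the anchors.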


>> There is still the '$$' that matches before embedded newlines, and
>> since '^^' matches after those newlines, the '^^' and '$$' can only
>> be unified to '^^' if it is one-width inside a string, so is like
>> '[$$\n^^]' (or just '\n') there.
>
> But then if you use it within a capture, you get an extra newline you
> probably don't want.

Place the ^^ outside the ().

I wasn't sure about the default greediness of '^^' at begin- or
end-of-string; I guess non-greediness can be arranged with a trailing
'?'.


>> At start- and end-of-string the '^^' can still be a zero-width match.
>> I am not sure about greedy (meaning to try one-width first) or
>> non-greedy.
>>
>> Example: '^[(\N*)^^]*^' to capture all lines, clean of newlines.
>> Not a lot clearer than '^[(\N*)\n*]*$', but freeing the '$' and '$$'
>> might be worth it.
>
> I don't think it's any clearer.

Pardon my Dutch, I didn't find it clearer either ("but, might be worth
it").


> In fact, I find all the ^'s there
> are a little too visually confusing and contextual.

    /^          # BoS
       [        # start of non-capturing group
         (\N*)  # capture a substring of non-newlines
         ^^     # newline or EoS
       ]*       # end of non-capturing group, repeat
     ^/x        # EoS

As I just said, I am used to '^^' as start- and end-of-buffer, and '^'
as matching a real or putative newline, because of procmail.
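
For comparison, a rough Python analogue of the commented pattern above,
offered only as a sketch: Python's re has no unified '^^', so "newline or
EoS" is spelled '(?:\n|\Z)', and because Python keeps only the last value
of a repeated capture, the repetition is done with findall instead of a
'*' around the group. The names LINE and lines_without_newlines are
invented for this example.

```python
import re

# One line: a run of non-newlines (captured), then a newline or the end
# of the string (kept outside the capture). The leading (?!\Z) stops a
# spurious empty match once the end of the string has been reached.
LINE = re.compile(r"(?!\Z)([^\n]*)(?:\n|\Z)")

def lines_without_newlines(text):
    # findall returns the contents of the single capture group for each
    # successive match, i.e. each line without its newline.
    return LINE.findall(text)

print(lines_without_newlines("one\ntwo\n\nfour"))
```

Keeping the separator outside the capturing group is the same move as
placing the '^^' outside the ().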

-- 
Grtz, Ruud
