Re: Is there another way to define a regex?
On 2016-01-17 Tom Browder wrote:
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
>
>     my $str = '/home/usr/.cpan';
>
> into this:
>
>     my regex dirs {
>         \/home\/usr\/\.cpan
>     }
>
> automatically?

Yes! And it's also simpler than in Perl 5. In Perl 5, you would have to do something like:

    my $dirs = qr{\Q$str};

but in Perl 6 you just do:

    my $dirs = regex { $str };

because the other behaviour, to interpret the contents of $str as another regex to match, has more explicit syntax:

    regex { <$another_regex> }

For your initial use-case there is also another shortcut: an array is interpreted as an alternation, so you could write:

    my @dirs = <
        /home/user/.cpan
        /home/tbrowde/.emacs
    >;
    my $regex = regex { @dirs };

and it would do what your Perl 5 example does. If you want to be more restrictive, you can anchor the alternation:

    my $regex = regex { ^ @dirs };

Here is a complete program:

    use v6;

    my @dirs = < foo bar baz >;
    my $matcher = regex { ^ @dirs $ };

    for $*IN.lines -> $line {
        if $line ~~ $matcher {
            say "<$line> matched";
        } else {
            say "<$line> didn't match";
        }
    }

--
Dakkar - GPG public key fingerprint = A071 E618 DD2C 5901 9574 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88

To spot the expert, pick the one who predicts the job will take the longest and cost the most.
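A minimal sketch of the two interpolation modes described above, for anyone who wants to try them side by side. The names `$literal`, `$pattern` and `$as_regex` and the sample paths are made up for illustration; only the `$str` versus `<$str>` distinction comes from the message itself.

```raku
# A path stored as plain text; the dot is meant literally.
my $str = '/home/usr/.cpan';

# A bare $str inside a regex matches its contents literally.
my $literal = regex { ^ $str $ };
say so '/home/usr/.cpan' ~~ $literal;   # True
say so '/home/usr/xcpan' ~~ $literal;   # False: '.' is not a wildcard here

# <$pattern> compiles the string as regex source instead.
# (Hypothetical pattern: any user name between /home/ and /.cpan.)
my $pattern  = q{'/home/' \w+ '/.cpan'};
my $as_regex = regex { ^ <$pattern> $ };
say so '/home/tbrowde/.cpan' ~~ $as_regex;               # True: \w+ matches the user name
say so '/home/tbrowde/.cpan' ~~ regex { ^ $pattern $ };  # False: the pattern text is taken literally
```

The literal form is the safe default when the strings come from data, since no character in them can change what the regex means.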
Re: Is there another way to define a regex?
Curious but are the non capturing groups necessary?

On 17 Jan 2016 11:35 p.m., "Tom Browder" wrote:
> On Sun, Jan 17, 2016 at 3:03 PM, Moritz Lenz wrote:
> > On 01/17/2016 06:07 PM, Tom Browder wrote:
> ...
> >> # regex of dirs to ignore
> >> my regex dirs {
> >>     \/home\/user\/\.cpan |
> >>     \/home\/tbrowde\/\.emacs
> >> }
> >
> > Better written as
> >
> > my regex dirs {
> >     | '/home/user/.cpan'
> >     | '/home/tbowde/.emacs'
> > }
> >
> > Yes, quoting does now work in regexes too. Cool, right? :-)
>
> Yes, very cool! I have decided to use that format and it does ease
> adding or modifying entries.
>
> My final regex looks like this (shortened to only a couple of entries):
>
> my regex dirs {
>     ^
>     \s*
>     [   # <= non-capture grouping
>         | '/home/user/.cpan'
>         | '/home/tbowde/.emacs'
>         # more entries...
>     ]
> }
>
> used like this:
>
> next LINE if $line ~~ /<dirs>/;
>
> Thanks, Moritz!
>
> Best,
>
> -Tom
Re: Is there another way to define a regex?
On Sunday, January 17, 2016, Nelo Onyiah wrote:
> Curious but are the non capturing groups necessary?

I don't know the answer for the Perl 6 construct but I think the grouping is required in the Perl 5 equivalent. But I certainly may be wrong.

Now that you mention it, I think you are probably correct. Thanks for pointing that out.

-Tom
Re: Is there another way to define a regex?
On Sun, Jan 17, 2016 at 3:03 PM, Moritz Lenz wrote:
> On 01/17/2016 06:07 PM, Tom Browder wrote:
...
>> # regex of dirs to ignore
>> my regex dirs {
>>     \/home\/user\/\.cpan |
>>     \/home\/tbrowde\/\.emacs
>> }
>
> Better written as
>
> my regex dirs {
>     | '/home/user/.cpan'
>     | '/home/tbowde/.emacs'
> }
>
> Yes, quoting does now work in regexes too. Cool, right? :-)

Yes, very cool! I have decided to use that format and it does ease adding or modifying entries.

My final regex looks like this (shortened to only a couple of entries):

    my regex dirs {
        ^
        \s*
        [   # <= non-capture grouping
            | '/home/user/.cpan'
            | '/home/tbowde/.emacs'
            # more entries...
        ]
    }

used like this:

    next LINE if $line ~~ /<dirs>/;

Thanks, Moritz!

Best,

-Tom
Re: Is there another way to define a regex?
On Sun, Jan 17, 2016 at 2:10 PM, Tom Browder wrote:
> On Sun, Jan 17, 2016 at 1:55 PM, Bruce Gray wrote:
>> On Jan 17, 2016, at 11:07 AM, Tom Browder wrote:
>>
>>> I'm trying to write all new Perl code in Perl 6. One thing I need is
>>> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> ...
>
> I'll try all that, Bruce. Thanks!

Hm, I'm not getting good results (> 20 sec vs 10 sec). I realize I may have oversimplified what I'm trying to do.

My incoming data lines are complete file names. I want to ignore file names whose path meets certain partial paths. So my working Perl 6 example, fully-escaped, should be:

    my regex dirs_to_ignore {
        ^
        \s*
        [   # non-capture grouping
            \/home\/user\/\.cpan |
            \/home\/user\/some\-dir
        ]
    }

And it runs very fast and accurately.

When I remove the backslashes and use single quotes around the partial paths it also DOES work as shown here!!

    my regex dirs_to_ignore {
        ^
        \s*
        [   # non-capture grouping
            '/home/user/.cpan' |
            '/home/user/some-dir'
        ]
    }

I probably made some subtle error previously. Sorry for the wasted bandwidth.

I also tried again auto-building the regex but with no final success. I could get the string exactly as in the second example but I couldn't get it to compile or to be used as a regex as far as I could tell by timing and output comparison.

Best,

-Tom
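One way the auto-built version might look, sketched under the assumption that the partial paths live in an array rather than in a hand-assembled string. The names `@dirs_to_ignore` and `$ignore` and the sample file names are hypothetical. An array used inside a regex is treated as an alternation of literal strings, so nothing needs to be escaped or compiled from text.

```raku
# Partial paths kept as plain strings; no escaping or quoting inside a regex.
my @dirs_to_ignore = '/home/user/.cpan', '/home/user/some-dir';

# The array interpolates as an alternation of literals,
# anchored at the start and allowing leading whitespace.
my $ignore = regex { ^ \s* @dirs_to_ignore };

for '/home/user/.cpan/build/x',
    '   /home/user/some-dir/notes.txt',
    '/home/user/src/main.c' -> $line {
    next if $line ~~ $ignore;
    say "keep: $line";          # only /home/user/src/main.c is printed
}
```

There is no end anchor here on purpose: the incoming lines are complete file names, so only the leading part of each line is compared.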
Re: Is there another way to define a regex?
On 01/17/2016 06:07 PM, Tom Browder wrote:
> I'm trying to write all new Perl code in Perl 6. One thing I need is
> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> come to use something like this (I'm trying to ignore certain
> directories):
>
> # regex of dirs to ignore
> my regex dirs {
>     \/home\/user\/\.cpan |
>     \/home\/tbrowde\/\.emacs
> }

Better written as

    my regex dirs {
        | '/home/user/.cpan'
        | '/home/tbowde/.emacs'
    }

Yes, quoting does now work in regexes too. Cool, right? :-)

(The leading | is ignored, it just allows you to format the alternation more consistently)

> for "dirlist.txt".IO.lines -> $line {
>     # ignore certain dirs
>     if $line ~~ m{<dirs>} {
>         next;
>     }
> }
>
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
>
>     my $str = '/home/usr/.cpan';
>
> into this:
>
>     my regex dirs {
>         \/home\/usr\/\.cpan
>     }
>
> automatically?

Even better: No need to escape anymore. If you use $str in a regex, and it actually contains a Str, it is always taken to match literally, so

    my $dir1 = '/home/user/.cpan';
    my $dir2 = '/home/tbowde/.emacs';
    my regex ignore_dirs { $dir1 | $dir2 }

does what you want.

If you want the interpolated string to be treated as a regex, you have to use it as

    my regex dirs { <$dir1> }

Cheers,
Moritz
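For reference, a small sketch of the quoted form together with the `<dirs>` call from the loop in the original question. The sample lines are invented, and the `^` anchor is an addition for the demonstration rather than part of the advice above.

```raku
# Quoted literals need no backslashes; the leading | is only formatting.
my regex dirs {
    | '/home/user/.cpan'
    | '/home/tbrowde/.emacs'
}

for '/home/user/.cpan/config',
    '/home/user/xcpan',
    '/home/tbrowde/.emacs' -> $line {
    # <dirs> calls the named regex declared above.
    if $line ~~ / ^ <dirs> / {
        say "ignore: $line";
    } else {
        say "keep:   $line";    # only /home/user/xcpan ends up here
    }
}
```

Inside the quotes the dot is an ordinary character, which is why `/home/user/xcpan` is not ignored.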
Re: Is there another way to define a regex?
On Sun, Jan 17, 2016 at 1:55 PM, Bruce Gray wrote:
> On Jan 17, 2016, at 11:07 AM, Tom Browder wrote:
>
>> I'm trying to write all new Perl code in Perl 6. One thing I need is
>> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
...

I'll try all that, Bruce. Thanks!

-Tom
Re: Is there another way to define a regex?
On Jan 17, 2016, at 11:07 AM, Tom Browder wrote:
> I'm trying to write all new Perl code in Perl 6. One thing I need is
> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> come to use something like this (I'm trying to ignore certain
> directories):
>
> # regex of dirs to ignore
> my regex dirs {
>     \/home\/user\/\.cpan |
>     \/home\/tbrowde\/\.emacs
> }
>
> for "dirlist.txt".IO.lines -> $line {
>     # ignore certain dirs
>     if $line ~~ m{<dirs>} {
>         next;
>     }
> }
>
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
>
>     my $str = '/home/usr/.cpan';
>
> into this:
>
>     my regex dirs {
>         \/home\/usr\/\.cpan
>     }
>
> automatically?

I think that your regex needs anchors to do what you want; your current regex will also exclude `/home/user/.cpanabcedfg`, for example.

This is how I do it in Perl 5 (when using regexes instead of a hash):

    my $dirs = join '|', map { quotemeta } qw(
        /home/user/.cpan
        /home/tbrowde/.emacs
    );
    my $dirs_re = qr/^(?:$dirs)$/;

So, `quotemeta` is what you are looking for. Except that https://design.perl6.org/S29.html#Obsolete_Functions says:

    quotemeta
        Because regex escaping metacharacters can easily be solved by
        quotes ("Simplified lexical parsing of patterns" in S05), and
        variables are not automatically interpolated ("Variable
        (non-)interpolation" in S05), quotemeta is no longer needed.

http://design.perl6.org/S05.html#Simplified_lexical_parsing_of_patterns
http://design.perl6.org/S05.html#Variable_%28non-%29interpolation

Summary of `Variable_%28non-%29interpolation`:
In Perl 5, /$var/ deposits the contents of `$var` into the regex, just like if you had typed them in the source code.
In Perl 6, /$var/ runs quotemeta on the contents of `$var`. If you want the Perl 5 behavior, then you write it as /<$var>/.

Also, if you embed an array in a regex, it automatically treats it as `|` alternatives. So cool!

    # Plain text, *not* regex!
    my @dirs_to_skip = <
        /home/user/.cpan
        /home/tbrowde/.emacs
    >;
    ...
    next if $line ~~ / ^ @dirs_to_skip $ /;

Or, precompiled:

    my @dirs_to_skip = <
        /home/user/.cpan
        /home/tbrowde/.emacs
    >;
    my $dir_re = / ^ @dirs_to_skip $ /;
    ...
    next if $line ~~ $dir_re;

If you know that your exclusion list is always literal, you can leave the regex engine out of the process, and just do a hash lookup (which is even easier in Perl 6 with `set`, which sets the values all to `True`):

    my %dirs_to_skip = set <
        /home/user/.cpan
        /home/tbrowde/.emacs
    >;
    ...
    next if %dirs_to_skip{$line};

--
Hope this helps,
Bruce Gray (Util of PerlMonks)
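A runnable variant of the set lookup above, as a rough sketch. It keeps the `Set` in a scalar and tests membership with the `(elem)` operator, which is a swapped-in alternative to the hash subscript shown in the message. The sample lines are invented.

```raku
# Literal exclusion list held as a Set; no regex engine involved.
my $dirs_to_skip = set < /home/user/.cpan /home/tbrowde/.emacs >;

for '/home/user/.cpan',
    '/home/user/.cpan/build',
    '/home/user/src' -> $line {
    # (elem) is true only when the whole line equals a member of the set.
    next if $line (elem) $dirs_to_skip;
    say "keep: $line";
}
```

Like the fully anchored `^ @dirs_to_skip $` regex, this skips exact matches only, so `/home/user/.cpan/build` is still printed. It suits lists of complete names, not path prefixes.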
Is there another way to define a regex?
I'm trying to write all new Perl code in Perl 6. One thing I need is the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've come to use something like this (I'm trying to ignore certain directories):

    # regex of dirs to ignore
    my regex dirs {
        \/home\/user\/\.cpan |
        \/home\/tbrowde\/\.emacs
    }

    for "dirlist.txt".IO.lines -> $line {
        # ignore certain dirs
        if $line ~~ m{<dirs>} {
            next;
        }
    }

My question: Is there a way to have Perl 6 do the required escaping for the regex programmatically, i.e., turn this:

    my $str = '/home/usr/.cpan';

into this:

    my regex dirs {
        \/home\/usr\/\.cpan
    }

automatically?

Thanks.

Best regards,

-Tom