Re: Is there another way to define a regex?

2016-01-18 Thread Gianni Ceccarelli
On 2016-01-17 Tom Browder  wrote:
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
> 
> my $str = '/home/usr/.cpan';
> 
> into this:
> 
> my regex dirs {
>   \/home\/usr\/\.cpan
> }
> 
> automatically?  

Yes! And it's also simpler than in Perl 5. In Perl 5, you would have to
do something like:

  my $dirs = qr{\Q$str};

but in Perl 6 you just do:

  my $dirs = regex { $str };

because the other behaviour, to interpret the contents of $str as
another regex to match, has more explicit syntax:

  regex { <$another_regex> }
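
A minimal sketch of the two forms side by side. The pattern string `$pat`
and the names `$literal` and `$as-regex` are made up for illustration; the
dot is chosen so that the difference is visible:

```raku
# Hypothetical pattern string; the dot is a metacharacter only when
# the string is interpreted as regex source.
my $pat = 'a.c';

# Plain $pat is matched literally, so only the text "a.c" matches.
my $literal = regex { ^ $pat $ };

# <$pat> compiles the string as a regex, so the dot matches any character.
my $as-regex = regex { ^ <$pat> $ };

say so 'a.c' ~~ $literal;   # True
say so 'abc' ~~ $literal;   # False
say so 'abc' ~~ $as-regex;  # True
```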

For your initial use-case there is also another shortcut: an array is
interpreted as an alternation, so you could write:

  my @dirs = < /home/user/.cpan /home/tbrowde/.emacs >;
  my $regex = regex { @dirs };

and it would do what your Perl 5 example does.

If you want to be more restrictive, you can anchor the alternation:
  
  my $regex = regex { ^ @dirs };
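
A small sketch of what the anchor changes, assuming the same two example
paths; the variable names and the test line are hypothetical:

```raku
my @dirs = < /home/user/.cpan /home/tbrowde/.emacs >;

my $anywhere = regex { @dirs };     # may match in the middle of a line
my $at-start = regex { ^ @dirs };   # must match at the start of a line

# Hypothetical input where an excluded path appears mid-line.
my $line = '/tmp/home/user/.cpan';

say so $line ~~ $anywhere;   # True
say so $line ~~ $at-start;   # False
```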

Here is a complete program:

  use v6;

  my @dirs = < foo bar baz >;

  my $matcher = regex { ^ @dirs $ };

  for $*IN.lines -> $line {
    if $line ~~ $matcher {
      say "<$line> matched";
    }
    else {
      say "<$line> didn't match";
    }
  }

-- 
Dakkar - 
GPG public key fingerprint = A071 E618 DD2C 5901 9574
 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88

To spot the expert, pick the one who predicts the job will take the
longest and cost the most.


Re: Is there another way to define a regex?

2016-01-18 Thread Nelo Onyiah
Curious but are the non capturing groups necessary?
On 17 Jan 2016 11:35 p.m., "Tom Browder"  wrote:

> On Sun, Jan 17, 2016 at 3:03 PM, Moritz Lenz  wrote:
> > On 01/17/2016 06:07 PM, Tom Browder wrote:
> ...
> >> # regex of dirs to ignore
> >> my regex dirs {
> >>   \/home\/user\/\.cpan |
> >>   \/home\/tbrowde\/\.emacs
> >> }
> >
> > Better written as
> >
> > my regex dirs {
> >    | '/home/user/.cpan'
> >    | '/home/tbowde/.emacs'
> > }
> >
> > Yes, quoting does now work in regexes too. Cool, right? :-)
>
> Yes, very cool!  I have decided to use that format and it does ease
> adding or modifying entries.
>
> My final regex looks like this (shortened to only a couple of entries):
>
> my regex dirs {
>   ^
>   \s*
>   [  # <= non-capture grouping
>    | '/home/user/.cpan'
>    | '/home/tbowde/.emacs'
>    # more entries...
>   ]
> }
>
> used like this:
>
>   next LINE if $line ~~ /<dirs>/;
>
> Thanks, Moritz!
>
> Best,
>
> -Tom
>


Re: Is there another way to define a regex?

2016-01-18 Thread Tom Browder
On Sunday, January 17, 2016, Nelo Onyiah  wrote:

> Curious but are the non capturing groups necessary?
>
I don't know the answer for the Perl 6 construct but I think the grouping
is required in the Perl 5 equivalent.  But I certainly may be wrong.  Now
that you mention it, I think you are probably correct.

Thanks for pointing that out.

-Tom
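
On the grouping question: for a bare alternation the brackets add nothing,
but once `^ \s*` is put in front of it, they do seem to matter, because `|`
binds more loosely than concatenation. A minimal sketch, assuming two of
the example paths from the thread and a made-up input line:

```raku
# Without brackets, | splits the whole pattern: the anchor and \s*
# belong to the first branch only.
my $ungrouped = regex { ^ \s* '/home/user/.cpan' | '/home/user/some-dir' };

# With brackets, the anchor and \s* apply to every alternative.
my $grouped = regex { ^ \s* [ '/home/user/.cpan' | '/home/user/some-dir' ] };

# Hypothetical input line with an excluded path in the middle.
my $line = '/backup/home/user/some-dir/file';

say so $line ~~ $ungrouped;  # True: the second branch matches mid-line
say so $line ~~ $grouped;    # False: nothing matches at the start
```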


Re: Is there another way to define a regex?

2016-01-17 Thread Tom Browder
On Sun, Jan 17, 2016 at 3:03 PM, Moritz Lenz  wrote:
> On 01/17/2016 06:07 PM, Tom Browder wrote:
...
>> # regex of dirs to ignore
>> my regex dirs {
>>   \/home\/user\/\.cpan |
>>   \/home\/tbrowde\/\.emacs
>> }
>
> Better written as
>
> my regex dirs {
>    | '/home/user/.cpan'
>    | '/home/tbowde/.emacs'
> }
>
> Yes, quoting does now work in regexes too. Cool, right? :-)

Yes, very cool!  I have decided to use that format and it does ease
adding or modifying entries.

My final regex looks like this (shortened to only a couple of entries):

my regex dirs {
  ^
  \s*
  [# <= non-capture grouping
   | '/home/user/.cpan'
   | '/home/tbowde/.emacs'
   # more entries...
  ]
}

used like this:

  next LINE if $line ~~ /<dirs>/;

Thanks, Moritz!

Best,

-Tom


Re: Is there another way to define a regex?

2016-01-17 Thread Tom Browder
On Sun, Jan 17, 2016 at 2:10 PM, Tom Browder  wrote:
> On Sun, Jan 17, 2016 at 1:55 PM, Bruce Gray  wrote:
>> On Jan 17, 2016, at 11:07 AM, Tom Browder  wrote:
>>
>>> I'm trying to write all new Perl code in Perl 6.  One thing I need is
>>> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> ...
>
> I'll try all that, Bruce.  Thanks!

Hm, I'm not getting good results (> 20 sec vs 10 sec).  I realize I
may have oversimplified what I'm trying to do.  My incoming data lines
are complete file names.  I want to ignore file names whose path meets
certain partial paths.  So my working Perl 6 example, fully-escaped,
should be:

my regex dirs_to_ignore {
  ^ \s* [  # non-capture grouping
   \/home\/user\/\.cpan |
   \/home\/user\/some\-dir
 ]
}

And it runs very fast and accurately.  When I remove the backslashes
and use single quotes around the partial paths it also DOES work as
shown here!!

my regex dirs_to_ignore {
  ^ \s* [  # non-capture grouping
   '/home/user/.cpan'  |
   '/home/user/some-dir'
 ]
}

I probably made some subtle error previously.  Sorry for the wasted bandwidth.

I also tried again auto-building the regex but with no final success.
I could get the string exactly as in the second example, but I
couldn't get it to compile or to be used as a regex, as far as I could
tell by timing and output comparison.
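
One way the auto-building might be done without assembling regex source as
a string is to keep the partial paths in an array and interpolate the
array, as suggested elsewhere in this thread. A sketch under that
assumption; the array name, the sample lines and the use of `grep` are
illustrative only:

```raku
# Hypothetical partial paths to ignore, held as plain strings.
my @ignore = < /home/user/.cpan /home/user/some-dir >;

# The array is interpolated as a literal alternation, so the entries
# need neither escaping nor quoting.
my $dirs-to-ignore = regex { ^ \s* @ignore };

# Hypothetical input lines (complete file names).
my @lines =
    '/home/user/.cpan/build/foo.tar.gz',
    '  /home/user/some-dir/notes.txt',
    '/home/user/keep/me.txt';

# Keep only the lines that do not start with an ignored path.
my @kept = @lines.grep({ $_ !~~ $dirs-to-ignore });
say @kept;   # [/home/user/keep/me.txt]
```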

Best,

-Tom


Re: Is there another way to define a regex?

2016-01-17 Thread Moritz Lenz


On 01/17/2016 06:07 PM, Tom Browder wrote:
> I'm trying to write all new Perl code in Perl 6.  One thing I need is
> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> come to use something like this (I'm trying to ignore certain
> directories):
> 
> # regex of dirs to ignore
> my regex dirs {
>   \/home\/user\/\.cpan |
>   \/home\/tbrowde\/\.emacs
> }

Better written as

my regex dirs {
   | '/home/user/.cpan'
   | '/home/tbowde/.emacs'
}

Yes, quoting does now work in regexes too. Cool, right? :-)

(The leading | is ignored, it just allows you to format the alternation
more consistently)

> for "dirlist.txt".IO.lines -> $line {
>   # ignore certain dirs
>   if $line ~~ m{<dirs>} {
>     next;
>   }
> }
> 
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
> 
> my $str = '/home/usr/.cpan';
> 
> into this:
> 
> my regex dirs {
>   \/home\/usr\/\.cpan
> }
> 
> automatically?

Even better: No need to escape anymore. If you use $str in a regex, and
it actually contains a Str, it is always taken to match literally, so


my $dir1 = '/home/user/.cpan';
my $dir2 = '/home/tbowde/.emacs';
my regex ignore_dirs { $dir1 | $dir2 }

does what you want.

If you want the interpolated string to be treated as a regex, you have
to use it as   my regex dirs { <$dir1> }.

Cheers,
Moritz


Re: Is there another way to define a regex?

2016-01-17 Thread Tom Browder
On Sun, Jan 17, 2016 at 1:55 PM, Bruce Gray  wrote:
> On Jan 17, 2016, at 11:07 AM, Tom Browder  wrote:
>
>> I'm trying to write all new Perl code in Perl 6.  One thing I need is
>> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
...

I'll try all that, Bruce.  Thanks!

-Tom


Re: Is there another way to define a regex?

2016-01-17 Thread Bruce Gray

On Jan 17, 2016, at 11:07 AM, Tom Browder  wrote:

> I'm trying to write all new Perl code in Perl 6.  One thing I need is
> the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
> come to use something like this (I'm trying to ignore certain
> directories):
> 
> # regex of dirs to ignore
> my regex dirs {
> \/home\/user\/\.cpan |
> \/home\/tbrowde\/\.emacs
> }
> 
> for "dirlist.txt".IO.lines -> $line {
> # ignore certain dirs
> if $line ~~ m{<dirs>} {
>    next;
> }
> }
> 
> My question: Is there a way to have Perl 6 do the required escaping
> for the regex programmatically, i.e., turn this:
> 
> my $str = '/home/usr/.cpan';
> 
> into this:
> 
> my regex dirs {
> \/home\/usr\/\.cpan
> }
> 
> automatically?

I think that your regex needs anchors to do what you want; 
your current regex will also exclude `/home/user/.cpanabcedfg`, for example.

This is how I do it in Perl 5 (when using regexes instead of a hash):
my $dirs = join '|', map { quotemeta } qw(
   /home/user/.cpan
   /home/tbrowde/.emacs
);
my $dirs_re = qr/^(?:$dirs)$/;

So, `quotemeta` is what you are looking for.
Except that https://design.perl6.org/S29.html#Obsolete_Functions says:
   quotemeta
   Because regex escaping metacharacters can easily be solved by quotes 
   ("Simplified lexical parsing of patterns" in S05), 
   and variables are not automatically interpolated 
   ("Variable (non-)interpolation" in S05), 
   quotemeta is no longer needed.
   http://design.perl6.org/S05.html#Simplified_lexical_parsing_of_patterns
   http://design.perl6.org/S05.html#Variable_%28non-%29interpolation
Summary of `Variable_%28non-%29interpolation`:
In Perl 5, /$var/ deposits the contents of `$var` into the regex,
just as if you had typed them in the source code.
In Perl 6, /$var/ runs quotemeta on the contents of `$var`.
If you want the Perl 5 behavior, then you write it as /<$var>/.
Also, if you embed an array in a regex, it automatically treats it as `|` 
alternatives.
So cool!

# Plain text, *not* regex!
my @dirs_to_skip = <
   /home/user/.cpan
   /home/tbrowde/.emacs
> ;
...
   next if $line ~~ / ^ @dirs_to_skip $ /;

Or, precompiled:
my @dirs_to_skip = <
   /home/user/.cpan
   /home/tbrowde/.emacs
> ;
my $dir_re = / ^ @dirs_to_skip $ /;
...
   next if $line ~~ $dir_re;

If you know that your exclusion list is always literal, 
you can leave the regex engine out of the process, 
and just do a hash lookup
(which is even easier in Perl 6 with `set`, which sets the values all to 
`True`):
my %dirs_to_skip = set <
   /home/user/.cpan
   /home/tbrowde/.emacs
> ;
...
   next if %dirs_to_skip{$line};
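
A short sketch of the set lookup, with the caveat that it only skips lines
that equal an entry exactly; a longer path underneath an excluded directory
is kept. The sample lines and the names `$skip` and `@kept` are
hypothetical:

```raku
# Exclusion list of complete paths, stored as a Set.
my $skip = set <
    /home/user/.cpan
    /home/tbrowde/.emacs
>;

# Hypothetical input lines.
my @lines = < /home/user/.cpan /home/user/.cpan/build /home/user/notes.txt >;

# A Set subscript returns True for members and False otherwise.
my @kept = @lines.grep({ !$skip{$_} });
say @kept;   # [/home/user/.cpan/build /home/user/notes.txt]
```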

-- 
Hope this helps,
Bruce Gray (Util of PerlMonks)



Is there another way to define a regex?

2016-01-17 Thread Tom Browder
I'm trying to write all new Perl code in Perl 6.  One thing I need is
the equivalent of the Perl 5 qr// and, following Perl 6 docs, I've
come to use something like this (I'm trying to ignore certain
directories):

# regex of dirs to ignore
my regex dirs {
  \/home\/user\/\.cpan |
  \/home\/tbrowde\/\.emacs
}

for "dirlist.txt".IO.lines -> $line {
  # ignore certain dirs
  if $line ~~ m{<dirs>} {
    next;
  }
}

My question: Is there a way to have Perl 6 do the required escaping
for the regex programmatically, i.e., turn this:

my $str = '/home/usr/.cpan';

into this:

my regex dirs {
  \/home\/usr\/\.cpan
}

automatically?

Thanks.

Best regards,

-Tom