In your original example:

print "match1='$1' '$2'\n" if ($T=~/^((mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
print "match2='$1' '$2'\n" if ($T=~/^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi);

the interior parentheses in example one terminate the alternation, so the last 
alternative is 'sir'.

In example two, the alternation is not terminated until the first ')', so the 
last alternative is 'sir .{5,}?', and the group as a whole is followed in the 
regular expression by the "\n" character. Since in $T 'miss' is not followed 
by a \n, the match fails. Vlado has explained how to group and terminate the 
alternation without capturing the match result.
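
To see the difference side by side, here is a minimal sketch. The contents of 
$T are an assumption (the original string is not shown in this thread), and 
the variable names $grouped and $ungrouped are mine, for illustration only:

```perl
use strict;
use warnings;

# Assumed sample input; the original $T is not shown in the thread.
my $T = "Miss Jayne Doe\n12 Some Street\n";

# Alternation confined to the inner (?:...) group:
# a title, a space, then at least five more characters before the newline.
my $grouped = $T =~ /^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi ? $1 : undef;

# Alternation spans the whole capture group:
# the last alternative is 'sir .{5,}?', so 'miss' must be followed by "\n".
my $ungrouped = $T =~ /^(mr|mrs|miss|dr|prof|sir .{5,}?)\n/smi ? $1 : undef;

print "grouped:   ", defined $grouped   ? "'$grouped'"   : "no match", "\n";
print "ungrouped: ", defined $ungrouped ? "'$ungrouped'" : "no match", "\n";
```

With that sample string, the first pattern captures 'Miss Jayne Doe' and the 
second reports no match.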


> On Dec 2, 2020, at 6:08 AM, Gary Stainburn <gary.stainb...@ringways.co.uk> 
> wrote:
> 
> On 02/12/2020 13:56, Vlado Keselj wrote:
>> Well, it seems that the first one is what you want, but you just need to
>> use $1 and ignore $2.
>> 
>> You do need parentheses in '(mr|mrs|miss|dr|prof|sir)' but if you do not
>> want for them to be captured in $2, you can use:
>> '(?:mr|mrs|miss|dr|prof|sir)'.  For example:
>> 
>> print "match3='$1' '$2'\n" if
>> ($T=~/^((?:mr|mrs|miss|dr|prof|sir) .{5,}?)\n/smi);
>> 
>> would give output:
>> 
>> match3='Miss Jayne Doe' ''
> Perfect, thank you.
> 
> I can't ignore $2 as it's in a loop with other regex that genuinely returns 
> multiple matches.  The amendment to the REGEX worked perfectly.

It is always best to copy the captures from a match into other variables 
straight away. The capturing variables $1, $2, etc. are not reassigned if a 
match fails, so if you use them after a failed match, they will be the values 
left over from a previous successful match. So, once a match has succeeded, 
do this:

my $salutation = $1;
my $name = $2;

If you don’t want a possible undefined value, do this instead:

my $name = $2 || '';
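
Another way to avoid stale captures in a loop is to assign the match in list 
context, so the variables are only set when the match succeeds. This is just 
a sketch: the input lines and the two-group pattern below are hypothetical, 
not taken from Gary's code:

```perl
use strict;
use warnings;

# Hypothetical inputs for illustration.
my @lines = ("Dr Alice Smith", "no title here");

my @seen;
for my $line (@lines) {
    # In list context a successful match returns the captured groups and
    # a failed match returns an empty list, so leftover $1/$2 values from
    # an earlier iteration are never read.
    if (my ($salutation, $name) = $line =~ /^(mr|mrs|miss|dr|prof|sir) (.{5,})$/i) {
        push @seen, "$salutation|$name";
        print "salutation='$salutation' name='$name'\n";
    }
    else {
        print "no match for '$line'\n";
    }
}
```

Note that $2 || '' also replaces a captured '0' with the empty string. On 
Perl 5.10 or later, $2 // '' only substitutes when the value is undefined.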


Jim Gibson
j...@gibson.org


