Re: mech->content match regex howto

raphael() Mon, 01 Mar 2010 23:30:12 -0800

On Mon, Mar 1, 2010 at 11:27 PM, John W. Krahn <jwkr...@shaw.ca> wrote:


> raphael() wrote:
>
>> Hi,
>>
>
> Hello,
>
>
>> I am trying to understand WWW::Mechanize
>>
>
> Did you also look at these pages:
>
>
> http://search.cpan.org/~petdance/WWW-Mechanize-1.60/lib/WWW/Mechanize/Examples.pod
>
> http://search.cpan.org/~petdance/WWW-Mechanize-1.60/lib/WWW/Mechanize/FAQ.pod
>
> http://search.cpan.org/~petdance/WWW-Mechanize-1.60/lib/WWW/Mechanize/Cookbook.pod
>
>
>
>> I understand that the downloaded content is stored in content().
>> Why am I not able to use a regex on it in scalar form?
>>
>> ------code------
>>
>> use strict;
>> use warnings;
>> use WWW::Mechanize;
>>
>> my $mech = WWW::Mechanize->new();
>> $mech->get("http://checkip.dyndns.org");
>> my $last_page = $mech->content(); # last page fetched
>>
>> # this works if I store content in an array @last_page
>> # for ( @last_page ) {
>> #    if ( m/([\d+.]+)/ ) {
>> #    print "$1\n";
>> #    }
>> # }
>>
>
> $mech->content() returns a scalar value so that is the same as saying:
>
> if ( $last_page[ 0 ] =~ m/([\d+.]+)/ ) {
>   print "$1\n";
> }
>
>
>> # ( my $ip ) = grep/(\d+\.)/, $last_page;
>>
>
> grep() returns the list items that match the expression /(\d+\.)/.  The
> regular expression is only used to determine which items to return; it has
> no effect on the content of those items.  If you want to change the contents
> of the list then you have to use map() instead.
>
>
>
>> ( my $ip = $last_page ) =~ m/([\d+\.]+)/;
>> print "$ip\n";
>>
>> ------end------
>>
>> my $ip gets the whole source page as its value.
>>
>> --
>> Got it while writing out this post :)
>> --
>>
>> Now the question becomes what is the difference between these two?
>>
>> ( my $ip = $last_page ) =~ m/([\d+\.]+)/;
>>
>
> That is the same as:
>
> my $ip = $last_page;
> $ip =~ m/([\d+\.]+)/;
>
> You are not doing anything with the string stored in $1.
>
> And BTW, '+' is not a valid IP address character.
>
>
>
>> ( my $ip ) = ( $last_page ) =~ m/([\d+\.]+)/;
>>
>
> That is equivalent to:
>
> my $ip;
> if ( $last_page =~ m/([\d+\.]+)/ ) {
>    $ip = $1;
> }
>
>
>> I think the above one is "wrong syntax" for using list context?
>>
>
> No, you *have* to use list context or $ip will be assigned the result of
> the match operator (true or false) and not the contents of the capturing
> parentheses.
>
>
>
>> Also, how can I make grep work?
>>
>> ( my $ip ) = grep/(\d+\.)/, $last_page;
>>
>
> You can't, grep() doesn't work that way.  What you are looking for is
> map():
>
> ( my $ip ) = map /([\d.]+)/, $last_page;
>
> Or, since you are not actually using a list, use the /g global option to
> the match operator:
>
> ( my $ip ) = $last_page =~ /[\d.]+/g;
>
> Note that this will return a list of [\d.]+ strings but only the first one
> will be stored in $ip and the rest will be discarded.
>
>
>
>
> John
> --
> The programmer is fighting against the two most
> destructive forces in the universe: entropy and
> human stupidity.               -- Damian Conway
>
Cool! I have to admit that is a "detailed" answer.
Also, thanks for clearing up the difference between these two:

( my $ip = $last_page ) =~ m/([\d+\.]+)/;
( my $ip ) = ( $last_page ) =~ m/([\d+\.]+)/;

Just to clear up any misunderstanding, by "above one"
I meant ( my $ip = $last_page ) =~ m/([\d+\.]+)/;
So am I getting this right that this is the *wrong syntax* to get list
context?
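
For anyone following along, here is a minimal sketch of how the two statements behave. The $last_page string is a made-up stand-in for whatever checkip.dyndns.org returns (an assumption, not real output), and the character class leaves out the '+' that John pointed out.

```perl
use strict;
use warnings;

# Hypothetical page body standing in for $mech->content()
my $last_page = '<html><body>Current IP Address: 192.0.2.17</body></html>';

# Form 1: the parentheses wrap the assignment, so $copy receives the
# whole page and the match then runs against $copy. The capture ends
# up in $1 and is never stored.
( my $copy = $last_page ) =~ m/([\d.]+)/;

# Form 2: the parentheses wrap only the variable, so the match is in
# list context and returns its captures; $ip receives the first one.
( my $ip ) = $last_page =~ m/([\d.]+)/;

# Without the parentheses the match is in scalar context and only
# reports success or failure.
my $matched = $last_page =~ m/([\d.]+)/;

print "copy:    $copy\n";     # the full page source
print "ip:      $ip\n";       # just the captured address
print "matched: $matched\n";  # a true value, not the address
```

So the first form is valid Perl, it just does something else: it copies the page and throws the capture away.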

Your post was very helpful since I didn't know how parentheses (or the
lack of them) affect what a match returns.

I always used parentheses to capture values, like
( my $ip ) = $last_page =~ m/([\d+\.]+)/g;

Now I *know* this works
( my $ip ) = $last_page =~ m/[\d+\.]+/g;
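
A quick sketch of why both spellings end up with the same $ip: under /g in list context, a pattern with capturing parentheses returns the captured groups, while a pattern with none returns each whole match. The sample string is invented for illustration, and the '+' is left out of the character class since it is not needed to match an address.

```perl
use strict;
use warnings;

# Hypothetical text containing two dotted-number runs
my $text = 'Current IP Address: 192.0.2.17 (gateway 192.0.2.1)';

# With capturing parentheses: a list of the captured groups
my @with_parens = $text =~ m/([\d.]+)/g;

# Without parentheses: a list of the whole matches
my @without_parens = $text =~ m/[\d.]+/g;

# Assigning to a one-element list keeps only the first match;
# the rest are discarded.
( my $ip ) = $text =~ m/[\d.]+/g;

print "with parens:    @with_parens\n";
print "without parens: @without_parens\n";
print "first only:     $ip\n";
```

Here the whole pattern is the capture, so the two lists are identical. They would differ as soon as the parentheses covered only part of the pattern.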

Thanks again John.
