I tried switching to a variable... although
I am not certain I have the syntax correct.

I am getting the same results. Without
the substitutions, it works perfectly. With
the substitutions in place, I get zero
results in the array.


$parsed_html =~ s/\s+/ /gs;
$parsed_html =~ s/>/">/gs;
$parsed_html =~ s/=http/="http/gis;
$parsed_html =~ s/"+/"/gs;
$parsed_html =~ s/'"/'/gs;
$a = $parsed_html;

# @urlmatch = (@urlmatch,$2,$4) while m{
#     < \s*
#     A \s+ HREF \s* = \s* (["'])  (.*?)  (["'])
#     \s* > \s* (.*?) \s* <\/a \s* >
#}gsix;

while ($a =~ m/< \s* A \s+ HREF \s* = \s* (["']) (.*?) (["']) \s* > \s* (.*?) \s* <\/a \s* >/gsix)
{
@urlmatch = (@urlmatch,$2,$4);
}

print "0=$urlmatch[0]<BR>1=$urlmatch[1]<BR>2=$urlmatch[2]<BR>";
print "3=$urlmatch[3]<BR>4=$urlmatch[4]<BR>5=$urlmatch[5]<BR>";
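
A minimal sketch for narrowing this down, assuming a single hand-written
sample link (the $before string and the extract_links helper are
hypothetical, not taken from the real page). It runs the same pattern
before and after the two quote-related substitutions and prints what the
string looks like in between. On this sample, s/>/">/g also rewrites the
closing tag, so the </a \s* > part of the pattern no longer matches.

```perl
use strict;
use warnings;

# Same pattern as in the message above, compiled once.
my $link_re = qr{< \s* A \s+ HREF \s* = \s* (["']) (.*?) (["']) \s* > \s* (.*?) \s* </a \s* >}six;

# Hypothetical helper: collect (URL, anchor text) pairs from a string.
sub extract_links {
    my ($text) = @_;
    my @found;
    while ($text =~ /$link_re/g) {
        push @found, $2, $4;
    }
    return @found;
}

# Hypothetical sample input, already quoted correctly.
my $before = q{<a href="http://example.com/">Example</a>};

# Apply the two substitutions that touch quotes and '>'.
my $after = $before;
$after =~ s/>/">/g;    # adds a quote before EVERY '>', including the one in </a>
$after =~ s/"+/"/g;    # collapses doubled quotes, but </a"> stays as it is

my @links_before = extract_links($before);
my @links_after  = extract_links($after);

print "before: $before\n";
print "after:  $after\n";
print "links found before: ", scalar(@links_before) / 2, "\n";
print "links found after:  ", scalar(@links_after) / 2, "\n";
```

Printing the intermediate string this way after each substitution may
show which one changes the text the pattern depends on.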



Bill


[EMAIL PROTECTED] wrote:
I'd copy $_ into a variable at the very beginning of a subroutine and
only use $_ itself within a loop; otherwise, consider a named variable.
I don't see why you couldn't shift to a variable instead.

----- Original Message -----
From: Bill Platt <[EMAIL PROTECTED]>
Date: Monday, October 31, 2005 11:43 am
Subject: Changing $_ Output - Unknown Reason

  
Hello,

I have included a section of code below
that is driving me nuts.

If I don't run the Substitution operations,
then I can successfully extract the URL
and the embedded anchor text from
$parsed_html.

Once I include the Substitution operations,
then I cannot extract the same results.

The output text looks correct, so I cannot see
why any combination of the substitution
operations breaks my code.

Can you offer any suggestions to me?



if($parsed_html =~ m/href/)
{

$parsed_html =~ s/\s+/ /gs;
$parsed_html =~ s/>/">/gs;
$parsed_html =~ s/=http/="http/gis;
$parsed_html =~ s/"+/"/gs;
$parsed_html =~ s/'"/'/gs;
$_ = "$parsed_html";

@urlmatch = (@urlmatch,$2,$4) while m{
    < \s*
    A \s+ HREF \s* = \s* (["'])  (.*?)  (["'])
    \s* > \s* (.*?) \s* <\/a \s* >
}gsix;

print "0=$urlmatch[0]<BR>1=$urlmatch[1]<BR>2=$urlmatch[2]<BR>";
print "3=$urlmatch[3]<BR>4=$urlmatch[4]<BR>5=$urlmatch[5]<BR>";

print "s0=$0<BR>s1=$1<BR>s2=$2<BR>s3=$3<BR>s4=$4<BR>s5=$5<BR>";
print "$_<BR><HR>$parsed_html<BR><HR>";

}


Thank you,

Bill Platt

    



  
_______________________________________________
Perl-Unix-Users mailing list
Perl-Unix-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs