RE: extracting links from HTML data (my own problem using grep)
Friends,

I am Sanil. I am a computer science student. I want to study Expect programming and Python Qt. If any of you know these, or know of sites with tutorials on them, please mail me.

From SANIL

On Tue, 15 Jan 2002, Gary Hawkins wrote:

> > However the script continues
> > print @list3;
> > my $var1='META';
> > @lista= grep{$var1} @list3;## not picked up at all
> > print @lista
> >
> > anyone any clues
>
> Suppose I'm a little confused but perhaps you meant:
>
> print @list3;
> @lista= grep(/META/, @list3);
> print @lista;
>
> /g

--
# SANILKUMAR.M.M
# S4 CSE
# REGIONAL ENGINEERING COLLEGE
# CALICUT - KERALA
--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: extracting links from HTML data (my own problem using grep)
On Wed, 2002-01-16 at 07:15, Gary Hawkins wrote:

> > However the script continues
> > print @list3;
> > my $var1='META';
> > @lista= grep{$var1} @list3;## not picked up at all
> > print @lista
> >
> > anyone any clues
>
> Suppose I'm a little confused but perhaps you meant:
>
> print @list3;
> @lista= grep(/META/, @list3);
> print @lista;
>
> /g

Thanks a lot. A couple of points:

This is basically a learning / getting-rid-of-rustiness exercise.

Why is this syntax not mentioned in Programming Perl? Any ideas?
RE: extracting links from HTML data (my own problem using grep)
> However the script continues
> print @list3;
> my $var1='META';
> @lista= grep{$var1} @list3;## not picked up at all
> print @lista
>
> anyone any clues

Suppose I'm a little confused but perhaps you meant:

print @list3;
@lista= grep(/META/, @list3);
print @lista;

/g
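A note on why the original line kept everything: grep takes either a BLOCK or an EXPR followed by a comma, and in both cases it keeps the elements for which that code is true with $_ set to the element. A block containing only $var1 tests the string 'META' itself, which is always true, so nothing is filtered out. The following is a minimal sketch of both forms with the pattern held in a variable; the sample contents of @list3 are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for @list3 from the thread.
my @list3 = ("<META name=\"a\">\n", "index.html\n", "about.html\n");
my $var1  = 'META';

# The block only tests whether the string 'META' is true,
# which it always is, so every element is kept.
my @all = grep { $var1 } @list3;

# BLOCK form with an actual pattern match against $_.
# \Q...\E quotes any regex metacharacters held in $var1.
my @block = grep { /\Q$var1\E/ } @list3;

# EXPR form, as in the reply above; note the comma after the expression.
my @expr = grep(/\Q$var1\E/, @list3);

print scalar(@all), " ", scalar(@block), " ", scalar(@expr), "\n";   # prints "3 1 1"
```

Both forms are listed under grep in perlfunc as "grep BLOCK LIST" and "grep EXPR,LIST".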
Re: extracting links from HTML data (my own problem using grep)
On Tue, 2002-01-15 at 01:44, Lorne Easton wrote:

> I need to write some code that extracts that extracts hyperlinks from a
> scalar ($data) and puts them into an array.
>
> I imagine that grep can do this, but my mastery of it and
> reqular expressions are not brilliant.
>
> Can you please provide some example code, or at least point me in the right
> direction?
>
> Cheers,
> Lorne

This bit of my script does what you want:

#!/usr/bin/perl -w
use locale;
my @list1=`grep 'HREF.*.html' $ARGV[0]`;
foreach $list1 (@list1){
    $list1=~s/HREF="/\n/gi ;
    $list1=~s/(\#.*\n)//gi ;# strip name refs
    $list1=~s/"//gi ;
}
%seen =();
@list3 = grep{ ! $seen{$_} ++ } @list1;# dedupe

However, the script continues:

print @list3;
my $var1='META';
@lista= grep{$var1} @list3;## not picked up at all
print @lista

Anyone any clues?
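For comparison, here is a sketch of the same idea applied directly to a scalar, as the original question asked, without shelling out to the external grep. It assumes the HREF values are double-quoted, and the sample $data is invented for illustration. A regex like this is fragile on real-world HTML, so treat it as a quick approximation rather than a replacement for a proper parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input; in the original question this is already in $data.
my $data = <<'_HTML_';
<a href="http://foo/index.html#top">bar</a>
<A HREF="http://baz/page.html">quux</A>
<a href="http://foo/index.html">again</a>
_HTML_

# A match with /g in list context returns every captured HREF value.
my @raw = $data =~ /href\s*=\s*"([^"]*)"/gi;

# Strip fragment ("name") references such as #top.
s/#.*//s for @raw;

# Dedupe while keeping first-seen order.
my %seen;
my @links = grep { !$seen{$_}++ } @raw;

print "$_\n" for @links;
```

With the sample input this leaves two entries in @links, because the first and third anchors point at the same page once the fragment is removed.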
Re: extracting links from HTML data
In article <[EMAIL PROTECTED]>, Lorne Easton wrote:

> I need to write some code that extracts that extracts hyperlinks from a
> scalar ($data) and puts them into an array.
>
> I imagine that grep can do this, but my mastery of it and
> reqular expressions are not brilliant.
>
> Can you please provide some example code, or at least point me in the right
> direction?

If you only need the URLs of the hyperlinks, then HTML::LinkExtor is just what you need, and it is provided with HTML::Parser. HTML::SimpleLinkExtor might be worth a try too.

http://search.cpan.org/search?dist=HTML-SimpleLinkExtor
http://search.cpan.org/search?dist=HTML-Parser

Otherwise, if you want the URLs and the text inside, something like the following might work:

#!/usr/bin/perl -w
use strict;
use HTML::Parser 3;

my $data = <<'_HTML_';
<a href="http://foo">bar</a> foo text baz <a href="http://baz">quux</a>
_HTML_

my @links = parse_links($data);

# We now print the links we found
my $count;
foreach (@links){
    print ++$count . ". Description: $_->[1]\n   URL: $_->[0]\n\n"
}

sub parse_links {
    my $data = shift;
    my ( @links, $inside );
    my $count = 0;

    # Preparing the parser
    my $linkparser = HTML::Parser->new(
        report_tags   => ['a'],  # Only dealing with <a> tags
        unbroken_text => 1,      # Avoid text split over several lines

        # Called each time a <a> is found
        start_h => [ sub {
            # Storing the HREF attribute
            $links[$count] = shift->{href};
            # We should recall we're inside a <a> element
            $inside = 1;
        }, 'attr' ],

        # Called when </a> is found
        end_h => [ sub {
            $count++;
            $inside = 0;
        }, '' ],

        # Called when text is found
        text_h => [ sub {
            # We're only interested in text inside <a>...</a>
            return unless $inside;
            # Store the text with the previous stored HREF
            # attribute
            $links[$count] = [ $links[$count], shift ];
        }, 'dtext' ],
    );

    # Launch the parser
    $linkparser->parse($data)->eof();

    return wantarray ? @links : \@links;
}
__END__

--
briac
A flying swallow. A fox stalks under a she-oak. A nesting dove.
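Since the reply above recommends HTML::LinkExtor for the URL-only case but shows no code for it, here is a minimal sketch of how that might look. It assumes the HTML-Parser distribution is installed, and the sample $data is invented for illustration. With no callback passed to new, the links are collected inside the object and read back with the links method.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;    # ships with the HTML-Parser distribution

# Hypothetical input; in the original question this is already in $data.
my $data = <<'_HTML_';
<a href="http://foo">bar</a> foo text
<a href="http://baz">quux</a>
_HTML_

my $extor = HTML::LinkExtor->new;    # no callback: links are stored internally
$extor->parse($data);
$extor->eof;

# Each entry is [ $tag, $attr => $url, ... ]; keep only href values of <a> tags.
my @links;
for my $link ($extor->links) {
    my ($tag, %attr) = @$link;
    push @links, $attr{href} if $tag eq 'a' && defined $attr{href};
}

print "$_\n" for @links;
```

HTML::LinkExtor also reports link attributes of other tags, such as the src of img, which is why the loop filters on the tag name.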
Re: extracting links from HTML data
Lorne Easton wrote:

> I need to write some code that extracts that extracts hyperlinks from a
> scalar ($data) and puts them into an array.
>
> I imagine that grep can do this, but my mastery of it and
> reqular expressions are not brilliant.
>
> Can you please provide some example code, or at least point me in the right
> direction?

http://search.cpan.org/search?dist=HTML-Parser

John
--
use Perl;
program
fulfillment
extracting links from HTML data
I need to write some code that extracts hyperlinks from a scalar ($data) and puts them into an array.

I imagine that grep can do this, but my mastery of it and of regular expressions is not brilliant.

Can you please provide some example code, or at least point me in the right direction?

Cheers,
Lorne