The following module was proposed for inclusion in the Module List:
modid: HTML::LinkExtractor
DSLIP: MdpOp
description: Extract links from an HTML document
userid: PODMASTER (D. H.)
chapterid: 15 (World_Wide_Web_HTML_HTTP_CGI)
communities:
http://perlmonks.com/index.pl?node_id=191458
similar:
HTML::LinkExtor
rationale:
HTML::LinkExtractor is used for extracting links from HTML. It is
very similar to HTML::LinkExtor, except that besides getting the
URL, you also get the link-text.
It also has a more complete idea of what constitutes a link. Two
examples of cases that HTML::LinkExtor doesn't handle are
<!DOCTYPE HTML SYSTEM "http://www.w3.org/DTD/HTML4-strict.dtd">
and
<meta HTTP-EQUIV="Refresh" CONTENT="5; URL=http://www.foo.com/foo.html">
It's built upon HTML::TokeParser::Simple, an easier-to-grok
interface to HTML::TokeParser.
I marked it "M" for mature above because HTML::TokeParser is
mature, and HTML::TokeParser::Simple and my module are pretty
straightforward. Plenty of testing has also been done.
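
To make the three cases in the rationale concrete (anchor links with
their link text, the DOCTYPE system identifier, and the meta refresh
target), here is a minimal, language-neutral sketch. It is written in
Python with only the standard library and is not the API or the
implementation of HTML::LinkExtractor. The class name LinkCollector,
the dictionary keys, and the regular expressions are assumptions made
for illustration only.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical collector: gathers URLs plus link text where available."""

    def __init__(self):
        super().__init__()
        self.links = []    # one dict per link found, in document order
        self._open = None  # the <a> element currently being read, if any

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. DOCTYPE HTML SYSTEM "..."
        m = re.match(
            r'DOCTYPE\s+\S+\s+(?:SYSTEM|PUBLIC\s+"[^"]*")\s+"([^"]+)"',
            decl,
            re.IGNORECASE,
        )
        if m:
            self.links.append({"tag": "!doctype", "url": m.group(1), "text": ""})

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute names arrive lowercased
        if tag == "a" and attrs.get("href"):
            self._open = {"tag": "a", "url": attrs["href"], "text": ""}
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # CONTENT looks like "5; URL=http://..." (delay, then target)
            m = re.search(r"url\s*=\s*(\S+)", attrs.get("content") or "", re.IGNORECASE)
            if m:
                url = m.group(1).strip("'\"")
                self.links.append({"tag": "meta", "url": url, "text": ""})

    def handle_data(self, data):
        # Text inside an open <a> becomes that link's text
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            # Collapse runs of whitespace in the collected link text
            self._open["text"] = " ".join(self._open["text"].split())
            self.links.append(self._open)
            self._open = None


sample = (
    '<!DOCTYPE HTML SYSTEM "http://www.w3.org/DTD/HTML4-strict.dtd">'
    '<meta HTTP-EQUIV="Refresh" CONTENT="5; URL=http://www.foo.com/foo.html">'
    '<a href="http://perl.com/">  Perl   home </a>'
)

collector = LinkCollector()
collector.feed(sample)
collector.close()

for link in collector.links:
    print(link["tag"], link["url"], repr(link["text"]))
```

The sketch keeps the link text next to the URL in a single record,
which is the difference from a URL-only extractor that the rationale
describes.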
enteredby: PODMASTER (D. H.)
enteredon: Mon Aug 26 11:35:12 2002 GMT
The resulting entry would be:
HTML::
::LinkExtractor MdpOp Extract links from an HTML document PODMASTER
Thanks for registering,
The Pause Team
PS: The following links are only valid for module list maintainers:
Registration form with editing capabilities:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=d0200000_43f0799ef7de901a&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=d0200000_43f0799ef7de901a&SUBMIT_pause99_add_mod_insertit=1