Module submission HTML::LinkExtractor

Perl Authors Upload Server Mon, 26 Aug 2002 04:14:19 -0700


The following module was proposed for inclusion in the Module List:


  modid:       HTML::LinkExtractor
  DSLIP:       MdpOp
  description: Extract links from an HTML document
  userid:      PODMASTER (D. H.)
  chapterid:   15 (World_Wide_Web_HTML_HTTP_CGI)
  communities:
    http://perlmonks.com/index.pl?node_id=191458

  similar:
    HTML::LinkExtor

  rationale:

    HTML::LinkExtractor is used for extracting links from HTML. It is
    very similar to HTML::LinkExtor, except that besides getting the
    URL, you also get the link-text.

    It also has a more complete idea of what constitutes a link. Two
    examples of cases which HTML::LinkExtor doesn't handle are

    <!DOCTYPE HTML SYSTEM "http://www.w3.org/DTD/HTML4-strict.dtd">

    and

    <meta HTTP-EQUIV="Refresh" CONTENT="5; URL=http://www.foo.com/foo.html">

    It's built upon HTML::TokeParser::Simple, an easier to grok
    interface to HTML::TokeParser.

    I marked it "M" for mature above, because HTML::TokeParser is
    mature, and HTML::TokeParser::Simple and my module are pretty
    straightforward. Plenty of testing has also been done.
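
The idea in the rationale (return each link's text along with its URL, and also treat a DOCTYPE system identifier and a meta refresh target as links) can be sketched roughly as follows. This is only an illustration in Python's standard html.parser, not HTML::LinkExtractor's own code or Perl API; the names LinkTextCollector and extract_links and the record keys "tag", "url" and "text" are made up for this example.

```python
import re
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Hypothetical helper: collect links together with their link text."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished records: {"tag", "url", "text"}
        self._open = None   # the <a> record currently collecting text

    def handle_decl(self, decl):
        # <!DOCTYPE HTML SYSTEM "url">: record the system identifier.
        # Only the SYSTEM form from the rationale is handled here.
        m = re.search(r'SYSTEM\s+"([^"]+)"', decl, re.I)
        if m:
            self.links.append({"tag": "!doctype", "url": m.group(1), "text": ""})

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "a" and attrs.get("href"):
            self._open = {"tag": "a", "url": attrs["href"], "text": ""}
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # CONTENT looks like "5; URL=http://..."; pull out the URL part.
            m = re.search(r"url\s*=\s*(\S+)", attrs.get("content") or "", re.I)
            if m:
                self.links.append({"tag": "meta", "url": m.group(1), "text": ""})

    def handle_data(self, data):
        # Text between <a ...> and </a> becomes the link text,
        # including text inside nested inline tags such as <b>.
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._open is not None:
            # Collapse runs of whitespace before storing the record.
            self._open["text"] = " ".join(self._open["text"].split())
            self.links.append(self._open)
            self._open = None


def extract_links(html):
    """Return the list of link records found in an HTML string."""
    parser = LinkTextCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling extract_links() on a page yields one record per link in document order. A URL-only extractor would stop at the "url" field; the "text" field is the extra piece the rationale describes.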

  enteredby:   PODMASTER (D. H.)
  enteredon:   Mon Aug 26 11:35:12 2002 GMT

The resulting entry would be:

HTML::
::LinkExtractor   MdpOp Extract links from an HTML document          PODMASTER


Thanks for registering,
The Pause Team

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=d0200000_43f0799ef7de901a&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=d0200000_43f0799ef7de901a&SUBMIT_pause99_add_mod_insertit=1
