Greetings, I think I have something to offer the world. Here's my info in the order you asked for it at http://www.cpan.org/modules/04pause.html#registering. Please consider establishing a developer account for me. === Info === Name : Scott Thomason Email : [EMAIL PROTECTED] Website : industrial-linux.org UserID : SCOTTHOM Description : XML::RSSLite bdpf Extract broken XML from RSS/RDF/SN/WL format Background : After recently reading about XML::RSS, I decided to give it a try. Several hours later, I was still only capable of extracting about 40% of the open content cataloged at xmltree.com. I realized that the majority of open content syndicators are not very good at writing valid RSS XML. My goal is to write a module that extracts all the open content that is available, and be much less concerned about XML compliance. This module currently parses all but a handful of the XML URLs cataloged at xmltree.com. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure, and "aliases" certain tags so that when done, you can count on having the minimal data necessary for re-constructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items. Anything else present in the hash is a bonus :) This module extracts more usable links by parsing "scriptingNews" and "weblog" formats in addition to RDF & RSS. It also "sanitizes" the for best results. The munging includes: - Removing html tags to leave plain text - Eliminating objectionable characters - Using <url> tags when <link> is empty - Using misplaced urls in <title> when <link> is empty - Ripping links from <a href=...> if required - Limiting content to http/ftp links - Joining relative urls to the site base In the near future I plan to strengthen the weblog parsing, and add a routine to generate valid RSS output from the result hash via XML::RSS, plus contact