Re: [Nutch-general] RSS link extractor

Berlin Brown Wed, 18 Jul 2007 20:49:36 -0700

I think I asked about this too.  I wish someone would come up with a tool for it.

On 7/18/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
> Has anyone written a tool or used a plugin that lets you pull out the
> RSS URL from a crawled HTML page (the <link rel="alternate"> element)?
>
> I know about the parse-rss plugin, but that seems to work on an
> already-discovered RSS link. I want to look at my already-crawled
> crawldb and pull out all the RSS URLs.
>
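
For what it's worth, the extraction step asked about above (finding the link rel=alternate element that advertises a feed) can be sketched outside Nutch in a few lines. This is only a rough illustration, not a Nutch plugin: it assumes you already have the raw HTML and the page URL as strings, and getting those out of the crawled data is not covered here. The names FeedLinkParser, extract_feed_urls and FEED_TYPES are made up for the example, and accepting Atom alongside RSS is my own assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: MIME types treated as "feed" links; adjust as needed.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Hypothetical helper: collects hrefs of <link rel="alternate"> feed elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        href = attr.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the URL of the page itself.
            self.feeds.append(urljoin(self.base_url, href))


def extract_feed_urls(html, base_url):
    """Return the absolute feed URLs advertised by one HTML page."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.feeds


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/feed.xml">'
        '<link rel="stylesheet" href="/style.css">'
        '</head><body></body></html>'
    )
    print(extract_feed_urls(page, "http://example.com/blog/"))
```

Running something like this over every stored page and collecting the results would give the list of RSS URLs, but how to iterate the stored pages is left open here.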



-- 
Berlin Brown
http://www.newspiritcompany.com - newspirit technologies

