When I use org.apache.nutch.parse.rss.RSSParser, it works fine, and I am now
getting the URLs. Next I want to get the content. How do I do this? Do I need
to send all the URLs to the crawldb and then run the crawl command, or is
there another way?

Hi,
I want to parse a feed URL using Nutch. I tried to use the
org.apache.nutch.parse.feed.FeedParser class. Its input is XML. As the XML, I
gave it the link below:
http://timesofindia.indiatimes.com/rssfeedsdefault.cms
This URL lists all the RSS feeds for the newspaper. When I tried it with the
Rome feed parser, it gave me the permalink, title, date, etc. for every entry,
but the Nutch parser does not return anything.
How can I get the permalink, title, and date for every entry at this URL?
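
To make it clear which fields I mean, here is a minimal sketch that uses only the JDK's built-in XML parser (not Nutch's FeedParser and not Rome). The class name RssItemSketch and its methods are hypothetical, made up for this example. It assumes the document is plain RSS 2.0, where each entry is an <item> element with <title>, <link> and <pubDate> inside it; if the URL returns something else (for example an HTML page), it finds no entries.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Nutch or Rome.
class RssItemSketch {

    // One feed entry with the three fields asked about above.
    static final class Item {
        final String title;
        final String link;
        final String pubDate;

        Item(String title, String link, String pubDate) {
            this.title = title;
            this.link = link;
            this.pubDate = pubDate;
        }
    }

    // Trimmed text of the first descendant element named `tag`, or "" if there is none.
    static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Parses an RSS 2.0 document held in a string and returns one Item per <item> element.
    static List<Item> parse(String rssXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations, since feeds come from untrusted servers.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

            List<Item> items = new ArrayList<>();
            NodeList itemNodes = doc.getElementsByTagName("item");
            for (int i = 0; i < itemNodes.getLength(); i++) {
                Element e = (Element) itemNodes.item(i);
                items.add(new Item(firstText(e, "title"),
                                   firstText(e, "link"),
                                   firstText(e, "pubDate")));
            }
            return items;
        } catch (Exception ex) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse input as XML", ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed; a real run would pass in the fetched feed body instead.
        String sample =
            "<rss version=\"2.0\"><channel><title>Example feed</title>"
            + "<item><title>First story</title><link>http://example.com/1</link>"
            + "<pubDate>Mon, 06 Jul 2009 10:00:00 GMT</pubDate></item>"
            + "</channel></rss>";
        for (Item item : parse(sample)) {
            System.out.println(item.title + " | " + item.link + " | " + item.pubDate);
        }
    }
}
```

This only reads the feed XML itself. Fetching the page behind each link is a separate step.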



-- 
View this message in context: 
http://www.nabble.com/How-to-Parse-Rss-Feed-URL-tp24386051p24404029.html
Sent from the Nutch - User mailing list archive at Nabble.com.
