Hi,

I am crawling a feed URL, http://blog.taragana.com/n/c/india/feed/, with depth = 2. I am using FeedParser.java to parse it.

At depth 1, the parse data in the segments folder shows the following parse metadata for the URL http://blog.taragana.com/n/30-child-labourers-rescued-in-agra-and-firozabad-111417/:

Parse Metadata: author=Ani CharEncodingForConversion=utf-8 tag=Agra tag=Firozabad tag=Uttar Pradesh tag=India OriginalCharEncoding=utf-8 feed=http://blog.taragana.com/n published=1247778368000

As we can see, it contains the author.

At depth 2, however, the parse metadata for the same URL looks like this:

Parse Metadata: CharEncodingForConversion=utf-8 OriginalCharEncoding=utf-8

When I search, I do not get the author. I have the following question about this:

(1) Does Nutch overwrite the parse metadata from depth 1 with that from depth 2 for this URL, or does it merge the two? If it overwrites, how can I stop it from doing so? I need the author and the other information obtained by parsing the RSS feed.
