Koch Martina wrote:
Hi all,

I'd like to add a new field to the CrawlDatum to capture the date when a URL 
was first found. The field should be called FoundFirst. Can anyone tell me 
which classes I need to modify in order to achieve this? In my opinion, it 
should be sufficient to change the CrawlDatum and CrawlDbReader classes, but I 
think I've missed something because the CrawlDbMerger crashes now. I know 
that I lose compatibility with Nutch, but still...

The easiest (and compatible) way to do this is to use CrawlDatum.getMetaData(), which returns a MapWritable that can store arbitrary key/value pairs. This avoids changing the CrawlDatum class itself.
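
To illustrate the idea, here is a rough sketch of the "store it once" logic. It is not Nutch code: a plain java.util.Map stands in for the map returned by CrawlDatum.getMetaData(), and the class name, method names, and the "FoundFirst" key are all made up for this example. In Nutch itself the keys and values would have to be Writable types rather than String and Long.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: a plain Map plays the role of the metadata map returned by
// CrawlDatum.getMetaData(). This class is hypothetical and not part of Nutch.
class FoundFirstSketch {
    // Hypothetical metadata key; the name is an assumption, not a Nutch constant.
    static final String FOUND_FIRST_KEY = "FoundFirst";

    // Records the discovery time only if none is stored yet, so later
    // updates of the same URL keep the original value.
    static void markFoundFirst(Map<String, Long> metaData, long nowMillis) {
        metaData.putIfAbsent(FOUND_FIRST_KEY, nowMillis);
    }

    // Returns the stored discovery time, or -1 if the URL has none yet.
    static long getFoundFirst(Map<String, Long> metaData) {
        Long value = metaData.get(FOUND_FIRST_KEY);
        return value == null ? -1L : value;
    }
}
```

The point of writing the value only when the key is absent is that the entry then survives later updates of the same URL, which is what a "found first" date needs.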




--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
