Thanks for the quick response.
Well, I wrote a very simple plugin that tries to write the same doc twice and,
if there is an error,
puts the error message in a custom field of the original doc:
    public NutchDocument filter(NutchDocument doc, Parse parse, Text url,
            CrawlDatum datum, Inlinks inlinks) throws IndexingException {
        // Log the URL being indexed. Text.getBytes() returns the whole backing
        // array, which can be longer than the URL, so use toString() instead.
        LOGGER.debug("Found Url: " + url.toString());
        NutchIndexWriter[] writers =
                NutchIndexWriterFactory.getNutchIndexWriters(getConf());
        //doc.add("js", String.valueOf(writers.length));
        try {
            // Try to write the doc a second time from inside the filter.
            writers[0].write(doc);
        } catch (Exception e) {
            // Record the failure in the custom "js" field so it shows up in the index.
            LOGGER.debug("Error adding Doc " + e.getMessage());
            doc.add("js", e.getMessage());
        }
        // Marker showing that the filter ran to the end.
        doc.add("js", "AfterTest");
        return doc;
    }
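To make the control flow easier to test outside Nutch, here is a minimal standalone sketch of the same "try an extra write, record the error in the doc" idea. Doc and DocWriter are hypothetical stand-ins I made up for NutchDocument and NutchIndexWriter, not the real Nutch classes, and the "NoWriters" marker is also my own assumption. It records e.toString() rather than e.getMessage(), because getMessage() can be null for some exceptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WriteProbeSketch {

    /** Hypothetical stand-in for NutchDocument: field name -> list of values. */
    static class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Hypothetical stand-in for an index writer whose write() may fail. */
    interface DocWriter {
        void write(Doc doc) throws Exception;
    }

    /** Tries one extra write and records the outcome in the "js" field. */
    static Doc probe(Doc doc, DocWriter[] writers) {
        // Assumption: an empty writer array is reported with its own marker.
        if (writers == null || writers.length == 0) {
            doc.add("js", "NoWriters");
            return doc;
        }
        try {
            writers[0].write(doc);
        } catch (Exception e) {
            // getMessage() may be null; toString() always names the exception class.
            doc.add("js", e.toString());
        }
        // Marker showing that the probe ran to the end.
        doc.add("js", "AfterTest");
        return doc;
    }

    public static void main(String[] args) {
        // A writer that always fails, with no exception message.
        DocWriter failing = d -> { throw new IllegalStateException(); };
        Doc doc = probe(new Doc(), new DocWriter[] { failing });
        System.out.println(doc.fields);
        // prints {js=[java.lang.IllegalStateException, AfterTest]}
    }
}
```

If the "js" field in Luke only ever shows AfterTest, that would suggest the extra write did not throw; if it shows NoWriters (in this sketch), no writer was configured at that point.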
After the Nutch run I just look at the index with lukeall-1.0.0.
I have attached the compiled plugin jar in case you can try to debug it, or
if you can tell me how to debug it, that would be great (I have Nutch working
from Eclipse).
http://old.nabble.com/file/p27598879/myplugins.rar myplugins.rar
--
View this message in context:
http://old.nabble.com/Trying-to-Add-an-new-NutchDoc-from-plugin-tp27598076p27598879.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.