Hi,
I know this kind of question has been discussed before, for example in
msg04098. I have found two possible solutions. Could you help me confirm
whether they would work?
1. Write an IndexingFilter that returns null for the doc.
   - Could this cause exceptions such as NullPointerException later in the
     process shown below?
2. In Indexer.java, just skip the call to "writer.addDocument(doc, analyzer)":
public void write(WritableComparable key, Writable value)
    throws IOException {                  // unwrap & index doc
  Document doc = (Document)((ObjectWritable)value).get();
  //NutchAnalyzer analyzer = factory.get(doc.get("lang"));
  NutchAnalyzer analyzer = factory.get("zh");
  if (LOG.isInfoEnabled()) {
    LOG.info(" Indexing [" + doc.getField("url").stringValue() + "]" +
             " with analyzer " + analyzer +
             " (" + doc.get("lang") + ")");
  }
  writer.addDocument(doc, analyzer);
}
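To make the two ideas concrete, here is a rough, self-contained sketch of what I have in mind. Doc, Filter and Writer are stand-ins I made up for illustration; they are not the real Nutch or Lucene classes, and I am only assuming that the real filter chain could be guarded in a similar way. The point is that the null check happens before anything touches the document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SkipNullDocSketch {

    /** Hypothetical stand-in for a Lucene Document: just a map of field values. */
    static class Doc {
        final Map<String, String> fields = new HashMap<>();
        Doc add(String name, String value) { fields.put(name, value); return this; }
        String get(String name) { return fields.get(name); }
    }

    /** Hypothetical stand-in for an indexing filter; returning null means "drop". */
    interface Filter {
        Doc filter(Doc doc);
    }

    /** Hypothetical stand-in for the index writer: it only collects documents. */
    static class Writer {
        final List<Doc> added = new ArrayList<>();
        void addDocument(Doc doc) { added.add(doc); }
    }

    /** Option 1: run the filters in order and stop as soon as one returns null. */
    static Doc applyFilters(List<Filter> filters, Doc doc) {
        for (Filter f : filters) {
            doc = f.filter(doc);
            if (doc == null) {
                return null; // later filters never see a null document
            }
        }
        return doc;
    }

    /** Option 2: skip addDocument when there is nothing to index. */
    static boolean write(Writer writer, Doc doc) {
        if (doc == null) {
            return false; // check before any doc.get(...) call
        }
        writer.addDocument(doc);
        return true;
    }

    public static void main(String[] args) {
        List<Filter> filters = new ArrayList<>();
        // Example rule (an assumption): drop every URL under /private/.
        filters.add(d -> d.get("url").contains("/private/") ? null : d);

        Writer writer = new Writer();
        String[] urls = {"http://example.com/a", "http://example.com/private/b"};
        for (String url : urls) {
            Doc doc = applyFilters(filters, new Doc().add("url", url));
            boolean indexed = write(writer, doc);
            System.out.println(url + " indexed=" + indexed);
        }
    }
}
```

In this sketch the logging and the addDocument call would both sit after the null check, which is where I would expect a NullPointerException to come from otherwise.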
Are these two solutions OK? Or is there a better solution for this
problem?