bin/nutch generate
Usage: Generator <crawldb> <segments_dir> [-force] [-topN N] [-numFetchers numFetchers] [-adddays numDays] [-noFilter] [-noNorm] [-maxNumSegments num]
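A minimal sketch of a generate call that passes both skip flags from the usage line above. The crawl/crawldb and crawl/segments paths are assumed for illustration and are not from this thread; substitute your own. The script prints the command and runs it only if bin/nutch is present in the current directory.

```shell
#!/bin/sh
# Hypothetical locations; replace with your own CrawlDB and segments directory.
CRAWLDB="crawl/crawldb"
SEGMENTS="crawl/segments"

# -noFilter and -noNorm are taken from the Generator usage line:
# they skip URL filtering and URL normalization during generate.
GENERATE_CMD="bin/nutch generate $CRAWLDB $SEGMENTS -noFilter -noNorm"

# Show the command that would be run.
echo "$GENERATE_CMD"

# Run it only when the nutch launcher script is actually available here.
if [ -x bin/nutch ]; then
  $GENERATE_CMD
fi
```

Dropping the two flags from GENERATE_CMD restores normalization and filtering for a run where the CrawlDB may contain URLs that have not been cleaned yet.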
Use the -noNorm option, and likely the -noFilter option as well. But again, only do this if you are sure the state of the CrawlDB is already normalized and properly filtered.

On Thursday 22 March 2012 12:18:24 James Ford wrote:
> Thanks for answer Markus,
>
> But I don't think I follow you. I am new to nutch. How could I make nutch
> use the normalizer only when I have to? I tried removing the order of the
> normalizers in the config, but nothing happened.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Generator-taking-time-tp3848106p3848158.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Markus Jelsma - CTO - Openindex