Hi,

Nutch builds Lucene indexes, but it is much more than that. It is web search software that crawls the web, inverts links, and builds indexes. Each step runs as one or more Map/Reduce jobs. You can find more information at http://lucene.apache.org/nutch/
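To make the "inverts links" step a bit more concrete, here is a minimal sketch of link inversion written as a map step and a reduce step. It is plain Python and not Nutch code: the function names, the in-memory "shuffle", and the example URLs are all made up for illustration, and Nutch's actual job works on its own data structures.

```python
from collections import defaultdict


def map_outlinks(page_url, outlinks):
    """Map step (hypothetical): for each outlink, emit (target, source)."""
    for target in outlinks:
        yield target, page_url


def reduce_inlinks(target, sources):
    """Reduce step (hypothetical): collect every page linking to one target."""
    return target, sorted(set(sources))


def invert_links(pages):
    """Run map, group by key, then reduce, all in memory."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for page_url, outlinks in pages.items():
        for target, source in map_outlinks(page_url, outlinks):
            grouped[target].append(source)
    return dict(reduce_inlinks(t, s) for t, s in grouped.items())


# Made-up crawl result: page -> pages it links to.
pages = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}
inlinks = invert_links(pages)
```

The result maps each page to the pages that link to it, which is the kind of inlink information a search application can use when indexing.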
The Map/Reduce job that builds Lucene indexes in Nutch is customized to the data schema and structures used in Nutch. The index contrib package in Hadoop provides a general, configurable process for building Lucene indexes in parallel using a Map/Reduce job. That's the main difference. Another difference is that the index build job in Nutch builds indexes in reduce tasks, while the index contrib package builds indexes in both map and reduce tasks, and there are advantages in doing that...

Regards,
Ning

On 4/1/08, Naama Kraus <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I'd like to know if Nutch is running on top of Lucene, or is it non related
> to Lucene. I.e. indexing, parsing, crawling, internal data structures ... -
> all written from scratch using MapReduce (my impression) ?
>
> What is the relation between Nutch and the distributed Lucene patch that was
> inserted lately into Hadoop ?
>
> Thanks for any enlightening,
> Naama
>
> --
> oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo
> 00 oo 00 oo
> "If you want your children to be intelligent, read them fairy tales. If you
> want them to be more intelligent, read them more fairy tales." (Albert
> Einstein)
>
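P.S. On the point above about building indexes in both map and reduce tasks: the rough idea can be sketched with a toy inverted index, where each map-side task indexes its own input split and the reduce side only merges the partial results. This is an assumption-laden illustration in plain Python. It uses neither Lucene nor Hadoop APIs, and the names `build_partial_index` and `merge_indexes` are hypothetical, not taken from the contrib package.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_partial_index(docs):
    """Map-side work (hypothetical): index one input split in memory."""
    index = defaultdict(set)
    for doc_id, text in docs:
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def merge_indexes(partials):
    """Reduce-side work (hypothetical): merge partial indexes into one shard."""
    merged = defaultdict(set)
    for partial in partials:
        for term, doc_ids in partial.items():
            merged[term] |= doc_ids
    return {term: sorted(ids) for term, ids in merged.items()}


# Two made-up input splits of (doc_id, text) pairs.
splits = [
    [(1, "Nutch crawls the web"), (2, "Lucene builds indexes")],
    [(3, "Hadoop runs Nutch jobs")],
]
partials = [build_partial_index(split) for split in splits]  # one per map task
shard = merge_indexes(partials)                              # done in reduce
```

In this sketch the per-document indexing cost is spread across the map-side tasks, and the reduce side handles already-indexed partial results instead of raw documents.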