Hi Ning,

Thanks a lot!
Naama

On Tue, Apr 1, 2008 at 7:06 PM, Ning Li <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Nutch builds Lucene indexes. But Nutch is much more than that. It is a
> web search application software that crawls the web, inverts links and
> builds indexes. Each step is one or more Map/Reduce jobs. You can find
> more information at http://lucene.apache.org/nutch/
>
> The Map/Reduce job to build Lucene indexes in Nutch is customized to
> the data schema/structures used in Nutch. The index contrib package in
> Hadoop provides a general/configurable process to build Lucene indexes
> in parallel using a Map/Reduce job. That's the main difference. There
> is also the difference that the index build job in Nutch builds
> indexes in reduce tasks, while the index contrib package builds
> indexes in both map and reduce tasks and there are advantages in doing
> that...
>
> Regards,
> Ning
>
>
> On 4/1/08, Naama Kraus <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I'd like to know if Nutch is running on top of Lucene, or is it non related
> > to Lucene. I.e. indexing, parsing, crawling, internal data structures ... -
> > all written from scratch using MapReduce (my impression) ?
> >
> > What is the relation between Nutch and the distributed Lucene patch that was
> > inserted lately into Hadoop ?
> >
> > Thanks for any enlightening,
> > Naama

--
oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo
"If you want your children to be intelligent, read them fairy tales. If you
want them to be more intelligent, read them more fairy tales." (Albert Einstein)