thanks all for your replies. I am checking Katta... -Tarandeep
On Tue, Jun 2, 2009 at 8:05 AM, Stefan Groschupf <s...@101tec.com> wrote: > Hi, > you might want to checkout: > http://katta.sourceforge.net/ > > Stefan > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > Hadoop training and consulting > http://www.scaleunlimited.com > http://www.101tec.com > > > > > On Jun 1, 2009, at 9:54 AM, Tarandeep Singh wrote: > > Hi All, >> >> I am trying to build a distributed system to build and serve lucene >> indexes. >> I came across the Distributed Lucene project- >> http://wiki.apache.org/hadoop/DistributedLucene >> https://issues.apache.org/jira/browse/HADOOP-3394 >> >> and have a couple of questions. It will be really helpful if someone can >> provide some insights. >> >> 1) Is this code production ready? >> 2) Does someone has performance data for this project? >> 3) It allows searches and updates/deletes to be performed at the same >> time. >> How well the system will perform if there are frequent updates to the >> system. Will it handle the search and update load easily or will it be >> better to rebuild or update the indexes on different machines and then >> deploy the indexes back to the machines that are serving the indexes? >> >> Basically I am trying to choose between the 2 approaches- >> >> 1) Use Hadoop to build and/or update Lucene indexes and then deploy them >> on >> separate cluster that will take care or load balancing, fault tolerance >> etc. >> There is a package in Hadoop contrib that does this, so I can use that >> code. >> >> 2) Use and/or modify the Distributed Lucene code. >> >> I am expecting daily updates to our index so I am not sure if Distribtued >> Lucene code (which allows searches and updates on the same indexes) will >> be >> able to handle search and update load efficiently. >> >> Any suggestions ? >> >> Thanks, >> Tarandeep >> > >