One more alternative, though I am not sure whether anyone is using it: Compass has added a plug-in that stores Lucene index files inside a database. This should work in a clustered environment, since all nodes will share the same database instance.
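To make the idea concrete, here is a rough sketch of what "index files in a shared database" means for a cluster. It is not Compass's actual API: the class names (SharedIndexTable, ClusterNode, SharedDbIndexDemo) are made up for illustration, and an in-memory map stands in for a database table with (name, content) columns. A real setup would go through JDBC and the plug-in's own Lucene Directory implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database table with columns (name, content).
// Assumption: a real deployment would run SQL over JDBC against the shared database.
final class SharedIndexTable {
    private final Map<String, byte[]> rows = new ConcurrentHashMap<>();

    // Insert or overwrite one index file, stored as a blob.
    void put(String name, byte[] content) {
        rows.put(name, content.clone());
    }

    // Return a copy of the blob, or null if no such file exists.
    byte[] get(String name) {
        byte[] content = rows.get(name);
        return content == null ? null : content.clone();
    }

    // Sorted list of the file names currently stored.
    Set<String> names() {
        return new TreeSet<>(rows.keySet());
    }
}

// Hypothetical per-node view of the index. Every node wraps the same table,
// so a file written by one node is visible to all the others.
final class ClusterNode {
    private final SharedIndexTable table;

    ClusterNode(SharedIndexTable table) {
        this.table = table;
    }

    void writeIndexFile(String name, byte[] content) {
        table.put(name, content);
    }

    byte[] readIndexFile(String name) {
        return table.get(name);
    }

    Set<String> listIndexFiles() {
        return table.names();
    }
}

final class SharedDbIndexDemo {
    public static void main(String[] args) {
        SharedIndexTable table = new SharedIndexTable();
        ClusterNode nodeA = new ClusterNode(table);
        ClusterNode nodeB = new ClusterNode(table);

        // Node A stores an index file; node B reads it back without any copying
        // between nodes, because both talk to the same table.
        nodeA.writeIndexFile("segments_1", "placeholder bytes".getBytes(StandardCharsets.UTF_8));
        byte[] seen = nodeB.readIndexFile("segments_1");
        System.out.println(new String(seen, StandardCharsets.UTF_8));
    }
}
```

The sketch deliberately leaves out locking and transactions, which is where I would expect most of the real-world difficulty (and the performance cost) to be.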
I am not sure what impact it will have on performance. Is anyone using a DB for index storage? Are there any drawbacks to this approach?

Regards,
Rajesh

--- Zach Bailey <[EMAIL PROTECTED]> wrote:

> Thanks for your response --
>
> Based on my understanding, hadoop and nutch are essentially the same
> thing, with nutch being derived from hadoop, and are primarily
> intended to be standalone applications.
>
> We are not looking for a standalone application, rather we must use a
> framework to implement search inside our current content management
> application. Currently the application search functionality is
> designed and built around Lucene, so migrating frameworks at this
> point is not feasible.
>
> We are currently re-working our back-end to support clustering (in
> tomcat) and we are looking for information on the migration of Lucene
> from a single node filesystem index (which is what we use now and hope
> to continue to use for clients with a single-node deployment) to a
> shared filesystem index on a mounted network share.
>
> We prefer to use this strategy because it means we do not have to have
> two disparate methods of managing indexes for clients who run in a
> single-node, non-clustered environment versus clients who run in a
> multiple-node, clustered environment.
>
> So, hopefully here are some easy questions someone could shed some
> light on:
>
> Is this not a recommended method of managing indexes across multiple
> nodes?
>
> At this point would people recommend storing an individual index on
> each node and propagating index updates via a JMS framework rather
> than attempting to handle it transparently with a single shared index?
>
> Is the Lucene index code so intimately tied to filesystem semantics
> that using a shared/networked file system is infeasible at this point
> in time?
>
> What would be the quickest time-to-implementation of these strategies
> (JMS vs. shared FS)? The most robust/least error-prone?
>
> I really appreciate any insight or response anyone can provide, even
> if it is a short answer to any of the related topics, "i.e. we
> implemented clustered search using per-node indexing with JMS update
> propagation and it works great", or even something as simple as
> "don't use a shared filesystem at this point".
>
> Cheers,
> -Zach
>
> testn wrote:
> > Why don't you check out Hadoop and Nutch? It should provide what
> > you are looking for.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]