I guess I was more concerned with doing frequent commits and how that would affect the caches. Say I have 2M docs in my main index but I want to add docs every 2 seconds, all while doing queries. If I do commits every 2 seconds I basically lose any caching advantage, and my faceting performance goes down the tube. If, however, I were to add things to a smaller index and then roll it into the larger one every ~30 minutes, then I only take the hit of recomputing the larger index's filter caches at that interval. Further, if my smaller index were based on a RAMDirectory instead of an FSDirectory, I assume computing the filter sets for the smaller index should be fast enough even every 2 seconds.
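For what it's worth, a minimal sketch of the staging-index idea I have in mind, written against the Lucene API of the time (IndexWriter, RAMDirectory, FSDirectory, addIndexes). This is not real Solr code; the class name, paths, and the 2 s / 30 min scheduling are made up for illustration:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

// Hypothetical sketch: buffer adds in a small in-memory staging index,
// then roll it into the large on-disk main index at a long interval.
public class StagingIndexSketch {
    private final Directory mainDir;
    private RAMDirectory staging = new RAMDirectory();
    private IndexWriter stagingWriter;

    public StagingIndexSketch(String mainIndexPath) throws Exception {
        mainDir = FSDirectory.getDirectory(mainIndexPath);
        stagingWriter = new IndexWriter(staging, new StandardAnalyzer(), true);
    }

    // Called every ~2 seconds: cheap, because the staging index is small
    // and in RAM, so recomputing its filter sets stays fast.
    public void add(Document doc) throws Exception {
        stagingWriter.addDocument(doc);
    }

    // Called every ~30 minutes: merge the staging index into the main
    // index, paying the big filter-cache rebuild only at this interval.
    public void rollIntoMain() throws Exception {
        stagingWriter.close();
        IndexWriter mainWriter =
            new IndexWriter(mainDir, new StandardAnalyzer(), false);
        mainWriter.addIndexes(new Directory[] { staging });
        mainWriter.close();
        staging = new RAMDirectory();
        stagingWriter = new IndexWriter(staging, new StandardAnalyzer(), true);
    }
}
```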
- will

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Thursday, May 10, 2007 9:49 AM
To: solr-user@lucene.apache.org
Subject: Re: fast update handlers

On 5/10/07, Will Johnson <[EMAIL PROTECTED]> wrote:
> I'm trying to set up a system with very low index latency (1-2
> seconds) and one of the javadocs intrigued me:
>
> "DirectUpdateHandler2 implements an UpdateHandler where documents are
> added directly to the main Lucene index as opposed to adding to a
> separate smaller index"
>
> The plain DirectUpdateHandler also has the same in its docs. Does this
> imply that there used to be another handler that could send docs to a
> small/faster index and then merge them in with a larger one, or that
> someone could in the future?

That was the original design, before I thought of the current method in
DUH2. DirectUpdateHandler was just meant to get things working to
establish the external interface (it's only for testing... very slow at
overwriting docs).

Adding documents to a separate index and then merging would have no real
indexing speed advantage (it's essentially what Lucene does anyway when
adding to a large index). There would be some advantage for index
distribution, but it would complicate things greatly.

High latency is caused by segment merges... this would happen anyway
when you periodically had to merge the smaller index into the larger
one.

You could do some other tricks for more predictable index times: set a
large mergeFactor and then call optimize after you have added your batch
of documents.

Stay tuned though... there has been some work on a Lucene patch to do
merging in a background thread.

-Yonik
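(For reference, the mergeFactor trick Yonik mentions maps onto the index settings in solrconfig.xml; the snippet below is illustrative only, and the values are examples rather than recommendations:)

```xml
<!-- Illustrative solrconfig.xml fragment: a large mergeFactor defers
     segment merges, making individual adds more predictable; you then
     pay the merge cost explicitly by issuing an <optimize/> request
     after the batch is done. -->
<mainIndex>
  <mergeFactor>100</mergeFactor>
  <maxBufferedDocs>1000</maxBufferedDocs>
</mainIndex>
```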