I don't know if this helps, but...
Do *all* your queries need to include the fast updates? I have a setup
where there are some cases that need the newest stuff but most cases can
wait 5 minutes or so.
In that case, I have two Solr instances pointing to the same index
files. One is used for updates and queries that need everything. The
other is a read-only index that serves the majority of queries.
What is nice about this is that you can set different cache sizes and
auto-warming for the different cases.
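
A minimal sketch of how a client might route queries between the two instances described above. The host names, ports, and the `needs_fresh` flag are assumptions made up for illustration; only the `/select` handler and the `q` parameter are standard Solr.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; both instances are assumed to read the same index files.
FRESH_SOLR = "http://localhost:8983/solr"     # takes updates, sees the newest docs
READONLY_SOLR = "http://localhost:8984/solr"  # read-only, may lag by a few minutes


def select_url(query, needs_fresh=False):
    """Build a /select URL on the instance that matches the freshness need."""
    base = FRESH_SOLR if needs_fresh else READONLY_SOLR
    return base + "/select?" + urlencode({"q": query})
```

Most callers would leave `needs_fresh` at its default and hit the read-only instance, so its larger caches stay warm.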
ryan
Will Johnson wrote:
The problem is I want the newly added documents to be made searchable
every 1-2 seconds so I need the commits. I was hoping that the caches
could be stored/tied to the IndexSearcher then a MultiSearcher could
take advantage of the multiple sub indexes and their respective caches.
I think the best approach now will be to write a top level federator
that can merge the large ~static index and the smaller more dynamic
index.
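
A rough sketch of what the merge step of such a federator could look like, assuming each index returns hits as (doc_id, score) pairs. The function name and the hit format are hypothetical. Note that scores from two separate indexes are not strictly comparable, because each index computes its own term statistics, so this is only a first approximation.

```python
def federated_search(static_hits, dynamic_hits, limit=10):
    """Merge two lists of (doc_id, score) hits into one ranked list.

    If a doc_id appears in both lists, the hit from the dynamic index
    wins, on the assumption that it is the newer version of the document.
    """
    merged = dict(static_hits)
    merged.update(dict(dynamic_hits))  # dynamic entries override static ones
    ranked = sorted(merged.items(), key=lambda hit: hit[1], reverse=True)
    return ranked[:limit]
```

For example, merging `[("a", 0.9), ("b", 0.5)]` from the static index with `[("b", 0.7), ("c", 0.8)]` from the dynamic one ranks `a`, then `c`, then the newer copy of `b`.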
- will
-----Original Message-----
From: Charlie Jackson [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 10, 2007 10:53 AM
To: solr-user@lucene.apache.org
Subject: RE: fast update handlers
What about issuing separate commits to the index on a regularly
scheduled basis? For example, you add documents to the index every 2
seconds, or however often, but these operations don't commit. Instead,
you have a cron'd script or something that just issues a commit every 5
or 10 minutes or whatever interval you'd like.
I had to do something similar when I was running a re-index of my entire
dataset. My program wasn't issuing commits, so I just cron'd a commit
for every half hour so it didn't overload the server.
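
A sketch of the kind of script cron could run for this, assuming the update handler lives at a URL like http://localhost:8983/solr/update (the host, port, and function names here are assumptions). It posts the XML `<commit/>` message to the update handler.

```python
import urllib.request


def build_commit_request(update_url):
    """Build (but do not send) a POST carrying Solr's XML commit message."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def commit(update_url, timeout=60):
    """Send the commit and return the HTTP status code."""
    request = build_commit_request(update_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The indexing program keeps adding documents without committing, and a cron entry calls `commit(...)` at whatever interval suits the server.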
Thanks,
Charlie
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik Seeley
Sent: Thursday, May 10, 2007 9:07 AM
To: solr-user@lucene.apache.org
Subject: Re: fast update handlers
On 5/10/07, Will Johnson <[EMAIL PROTECTED]> wrote:
I guess I was more concerned with doing the frequent commits and how
that would affect the caches. Say I have 2M docs in my main index but I
want to add docs every 2 seconds, all while doing queries. If I do
commits every 2 seconds I basically lose any caching advantage and my
faceting performance goes down the tubes. If, however, I were to add
things to a smaller index and then roll it into the larger one every
~30 minutes, then I only take the hit on computing the larger filter
caches on that interval. Further, if my smaller index were based on a
RAMDirectory instead of an FSDirectory, I assume computing the filter
sets for the smaller index should be fast enough even every 2 seconds.
There isn't currently any support for incrementally updating filters.
-Yonik