Ok. I was talking about what tools are available now; much better things are in the NRT work. I don't know how merges work now with regard to multitasking and thread contention. Most of the Solr sites I know of have indexes much larger than RAM and expect everything to work smoothly.

Lance

On Sun, Jan 9, 2011 at 9:18 AM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
>> The older MergePolicies followed a strategy which is quite disruptive in an
>> NRT environment.
>
> Can you elaborate as to why (maybe we need to place this in a wiki)?
> If large merges are running in their own thread, they should not
> disrupt queries, eg, there won't be CPU contention. The IO contention
> can be disruptive, depending on the size and type of hardware, however
> in the ideal case of the index 'fitting' into RAM/IO cache, then a
> large merge should not affect queries (or indexing).
>
> I think what's useful that is being developed for not disrupting NRT
> with merges is DirectIOLinuxDirectory:
> https://issues.apache.org/jira/browse/LUCENE-2500 It's also useful
> for the non-NRT use case because anytime IO cache pages are evicted,
> queries will slow down (unless the index is too large to fit in RAM
> anyways).
>
> On Sat, Jan 8, 2011 at 7:55 PM, Lance Norskog <goks...@gmail.com> wrote:
>> There are always slowdowns when merging new segments during indexing.
>> A MergePolicy decides when to merge segments. The older MergePolicies
>> followed a strategy which is quite disruptive in an NRT environment.
>>
>> There is a new feature in 3.x & the trunk called
>> 'BalancedSegmentMergePolicy'. This new MergePolicy is designed for the
>> near-real-time use case. It was contributed by LinkedIn. You may find
>> it works well enough for your case.
>>
>> Lance
>>
>> On Thu, Jan 6, 2011 at 10:21 AM, Stephen Boesch <java...@gmail.com> wrote:
>>> Thanks Yonik,
>>> Using a stable release of Solr what would you suggest to do - given
>>> MultiSearch's demise and the other work is still ongoing?
>>>
>>> 2011/1/6 Yonik Seeley <yo...@lucidimagination.com>
>>>
>>>> On Thu, Jan 6, 2011 at 12:37 PM, Stephen Boesch <java...@gmail.com> wrote:
>>>> > Solr/lucene newbie here ..
>>>> >
>>>> > We would like searches against a solr/lucene index to immediately be able to
>>>> > view data that was added. I stress "small" amount of new data given that
>>>> > any significant amount would require excessive latency.
>>>>
>>>> There has been significant ongoing work in lucene-core for NRT (near real time).
>>>> We need to overhaul Solr's DirectUpdateHandler2 to take advantage of
>>>> all this work.
>>>> Mark Miller took a first crack at it (sharing a single IndexWriter,
>>>> letting lucene handle the concurrency issues, etc)
>>>> but if there's a JIRA issue, I'm having trouble finding it.
>>>>
>>>> > Looking around, i'm wondering if the direction would be a MultiSearcher
>>>> > living on top of our standard directory-based IndexReader as well as a
>>>> > custom Searchable that handles the newest documents - and then combines the
>>>> > two results?
>>>>
>>>> If you look at trunk, MultiSearcher has already gone away.
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>>
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>>
>

--
Lance Norskog
goks...@gmail.com