The dimensions sound good. It's unclear if you're going to post a chart again, numbers, or code? There's a LUCENE-1577 Jira issue for code.
On Fri, Oct 9, 2009 at 12:37 PM, Jake Mannix <jake.man...@gmail.com> wrote: > Jason, > > We've been running some perf/load/stress tests lately, but on a suggestion > > from Ted Dunning, I've been trying to come up with a more "realistic" set of > stress > tests and indexing rates to see where NRT performs well and where it does > not, > instead of just indexing at maximum rate, looping over all docs in the test > set > and then doing them again and again. > > Once we've got a good test set, which hits on the variety of dimensions: > indexing > rate, document size, query rate while indexing, and delay-to-visibility of > indexed docs, > we'll certainly post that, as John did for the zoie tests on the zoie wiki. > > -jake > > On Fri, Oct 9, 2009 at 12:29 PM, Jason Rutherglen < > jason.rutherg...@gmail.com> wrote: > >> Jake and John, >> >> It would be interesting and enlightening to see NRT performance >> numbers in a variety of configurations. The best way to go about >> this is to post benchmarks that others may run in their >> environment which can then be tweaked for their unique edge >> cases. I wish I had more time to work on it. >> >> -J >> >> On Thu, Oct 8, 2009 at 8:18 PM, Jake Mannix <jake.man...@gmail.com> wrote: >> > Jason, >> > >> > On Thu, Oct 8, 2009 at 7:56 PM, Jason Rutherglen < >> jason.rutherg...@gmail.com >> >> wrote: >> > >> >> Today near realtime search (with or without SSDs) comes at a >> >> price, that is reduced indexing speed due to continued in RAM >> >> merging. People typically hack something together where indexes >> >> are held in a RAMDir until being flushed to disk. The problem >> >> with this is, merging in the background becomes really tricky >> >> unless it's performed inside of IndexWriter (see LUCENE-1313 and >> >> IW.getReader). There is the Zoie system which uses the RAMDir >> >> solution, however it's implemented using a customized deleted >> >> doc set based on a bloomfilter backed by an inefficient RB tree >> >> which slows down queries. There's always a trade off when trying >> >> to build an NRT system, currently. >> >> >> > >> > I'm not sure what numbers you are using to justify saying that zoie >> > "slows down queries" - latency at LinkedIn using zoie has a typical >> > median response time of 4-8ms at the searcher node level (slower >> > at the broker due to a lot of custom stuff that happens before >> > queries are actually sent to the nodex), while dealing with sustained >> > rapid indexing throughput, all with basically zero time between indexing >> > event to index visibility (ie. true real-time, not "near real time", >> unless >> > indexing events are coming in *very* fast). >> > >> > You say there's a tradeoff, but as you should remember from your >> > time at LinkedIn, we do distributed realtime faceted search while >> > maintaining extremely low latency and still indexing sometimes more >> > than a thousand new docs a minute per node (I should dredge up >> > some new numbers to verify what that is exactly these days). >> > >> > >> > Deletes can pile up in segments so the >> >> BalancedSegmentMergePolicy could be used to remove those faster >> >> than LogMergePolicy, however I haven't tested it, and it may be >> >> trying to not do large segment merges altogether which IMO >> >> is less than ideal because query performance soon degrades >> >> (similar to an unoptimized index). >> >> >> > >> > Not optimizing all the way has shown in our case to actually be >> > *better* than the "optimal" case of a 1-segment index, at least in >> > the case of realtime indexing at rapid update pace. >> > >> > >> > -jake >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org