+1 Agreed. The initial version should use RAMDirectory, both to keep things simple and to give us something to benchmark against other MemoryIndex-like index representations.
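
For what it's worth, here is a rough toy sketch of the control structure discussed in the quoted message: new docs go into tiny in-RAM segments, a reopen pulls them in, and deletes are recorded against the segment that holds the old doc. It is plain Java with no Lucene dependency. Writer, Segment, and Snapshot are hypothetical names made up for illustration, not Lucene classes, and reopen() here only mimics the idea behind IndexReader.reopen; it says nothing about how the real implementation should look.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RealTimeSketch {

    /** Hypothetical stand-in for a tiny, write-once segment held in RAM. */
    static final class Segment {
        final Map<String, String> docs;               // key -> body, fixed at flush time
        final Set<String> deleted = new HashSet<>();  // pending deletes, kept in RAM

        Segment(Map<String, String> docs) {
            this.docs = new HashMap<>(docs);
        }
    }

    /** Hypothetical point-in-time view; later writes never change it. */
    static final class Snapshot {
        private final Map<String, String> live = new HashMap<>();

        Snapshot(List<Segment> segments) {
            // Copy the live docs now, so this view stays stable after later deletes.
            for (Segment s : segments) {
                for (Map.Entry<String, String> e : s.docs.entrySet()) {
                    if (!s.deleted.contains(e.getKey())) {
                        live.put(e.getKey(), e.getValue());
                    }
                }
            }
        }

        String get(String key) { return live.get(key); }
        int numDocs() { return live.size(); }
    }

    /** Hypothetical writer: buffers adds, flushes a tiny segment on reopen. */
    static final class Writer {
        private final List<Segment> segments = new ArrayList<>();
        private final Map<String, String> buffer = new HashMap<>();

        /** Add or replace: record a delete for any older copy, then buffer the new doc. */
        void add(String key, String body) {
            delete(key);
            buffer.put(key, body);
        }

        /** Record the delete against whichever segment holds the doc. */
        void delete(String key) {
            buffer.remove(key);
            for (Segment s : segments) {
                if (s.docs.containsKey(key)) {
                    s.deleted.add(key);
                }
            }
        }

        /** Flush buffered docs as one tiny segment and return a fresh view. */
        Snapshot reopen() {
            if (!buffer.isEmpty()) {
                segments.add(new Segment(buffer));
                buffer.clear();
            }
            return new Snapshot(segments);
        }
    }

    public static void main(String[] args) {
        Writer w = new Writer();
        w.add("1", "hello");
        Snapshot before = w.reopen();
        w.add("1", "hello again");   // replace: delete lands on the first segment
        w.add("2", "world");
        Snapshot after = w.reopen();
        // prints "hello / hello again / 2"
        System.out.println(before.get("1") + " / " + after.get("1") + " / " + after.numDocs());
    }
}
```

The sketch copies live docs into each Snapshot purely to keep the toy short; a real reader would share segment data and carry only the pending deletes forward.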
On Fri, Dec 26, 2008 at 10:20 AM, Doug Cutting <[email protected]> wrote:

> Michael McCandless wrote:
>
>> So then I think we should start with approach #2 (build real-time on
>> top of the Lucene core) and iterate from there. Newly added docs go
>> into tiny segments, which IndexReader.reopen pulls in. Replaced or
>> deleted docs record the delete against the right SegmentReader (and
>> LUCENE-1314 lets reopen carry those pending deletes forward, in RAM).
>>
>> I would take the simple approach first: use ordinary SegmentReader on
>> a RAMDirectory for the tiny segments. If that proves too slow, swap
>> in Memory/InstantiatedIndex for the tiny segments. If that proves too
>> slow, build a reader impl that reads from DocumentsWriter RAM buffer.
>
> +1 This sounds like a good approach to me. I don't see any fundamental
> reasons why we need different representations, and fewer implementations of
> IndexWriter and IndexReader is generally better, unless they get way too
> hairy. Mostly it seems that real-time can be done with our existing toolbox
> of datastructures, but with some slightly different control structures.
> Once we have the control structure in place then we should look at
> optimizing data structures as needed.
>
> Doug
