Hi Michael, I've actually been working on factoring DocumentsWriter, as a first step towards flexible indexing.
I agree we would have an abstract base Posting class that just tracks the term text. Then, DocumentsWriter manages inverting each field, maintaining the per-field hash of term text -> abstract Posting instances, exposing the methods to write bytes into multiple streams for a Posting in the RAM "byte slices", and then reading them back when flushing, etc.

The code that writes the current index format would then plug into this, and should be fairly small and easy to understand. For example, frq/prx postings writing and term vectors writing would be two plugins to the "inverted terms" API; the only difference is that term vectors flush after every document, while frq/prx flush when RAM is full. There would also be plugins that just tap into the entire document (they don't need inversion), like FieldsWriter. There are still a lot of details to work out...
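Very roughly, and only as a sketch of the shape described above: none of the names below (InvertedTermsConsumer, FieldInverter, FreqPosting, FreqConsumer, flushAfterEachDocument) exist in Lucene; they are made up for illustration. The RAM byte slices are left out entirely, and tokenization is a plain whitespace split. The point is just that the inverter owns the per-field hash of term text -> Posting, and a plugin decides what a Posting holds and when it flushes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InvertedTermsSketch {

    /** Base posting: tracks only the term text; plugins subclass it. */
    static abstract class Posting {
        final String termText;
        Posting(String termText) { this.termText = termText; }
    }

    /** Hypothetical plugin to the "inverted terms" API (not a real Lucene interface). */
    interface InvertedTermsConsumer {
        Posting newPosting(String termText);      // plugin picks the concrete Posting type
        void addOccurrence(Posting p, int docID, int position);
        boolean flushAfterEachDocument();         // e.g. true for term vectors, false for frq/prx
        void flush(Map<String, Posting> postings);
    }

    /** Stand-in for the inverting part of DocumentsWriter, for a single field. */
    static class FieldInverter {
        private final InvertedTermsConsumer consumer;
        private final Map<String, Posting> postings = new HashMap<>();

        FieldInverter(InvertedTermsConsumer consumer) { this.consumer = consumer; }

        void addDocument(int docID, String text) {
            String[] tokens = text.trim().split("\\s+");
            for (int pos = 0; pos < tokens.length; pos++) {
                if (tokens[pos].isEmpty()) continue;  // empty field text yields one empty token
                Posting p = postings.computeIfAbsent(tokens[pos], consumer::newPosting);
                consumer.addOccurrence(p, docID, pos);
            }
            if (consumer.flushAfterEachDocument()) flush();
        }

        /** Hands the buffered postings to the plugin, then starts over. */
        void flush() {
            consumer.flush(postings);
            postings.clear();
        }
    }

    /** Toy frq-style posting: just an occurrence count. */
    static class FreqPosting extends Posting {
        int freq;
        FreqPosting(String termText) { super(termText); }
    }

    /** Toy frq-style plugin: records "term:freq" strings when flushed. */
    static class FreqConsumer implements InvertedTermsConsumer {
        final List<String> flushed = new ArrayList<>();

        public Posting newPosting(String termText) { return new FreqPosting(termText); }
        public void addOccurrence(Posting p, int docID, int position) { ((FreqPosting) p).freq++; }
        public boolean flushAfterEachDocument() { return false; }
        public void flush(Map<String, Posting> postings) {
            for (Posting p : postings.values()) {
                flushed.add(p.termText + ":" + ((FreqPosting) p).freq);
            }
        }
    }

    public static void main(String[] args) {
        FreqConsumer consumer = new FreqConsumer();
        FieldInverter inverter = new FieldInverter(consumer);
        inverter.addDocument(0, "a b a");
        inverter.addDocument(1, "b c");
        inverter.flush();  // stands in for "RAM is full"
        System.out.println(consumer.flushed);
    }
}
```

A term-vectors-style plugin would look the same except that flushAfterEachDocument() returns true, so its postings are written out and cleared at the end of every addDocument call.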
The DocumentsWriter does pooling of the Posting instances and I'm wondering how much this improves performance.
We should retest this. I think it made a decent difference in performance, but I don't remember how much. I think the pooling can also be made generic (handled by DocumentsWriter). E.g., the plugin could expose a "newPosting()" method.

Mike