An equivalent Parallelizer for IndexWriter would be a useful addition to keep the two indexes in sync.
Hiding the details of which Lucene index the document data is retrieved from gives us some added flexibility in storage options, but I've been thinking of a more general-purpose layer of abstraction which would let me use other storage options, e.g. relational databases, just as transparently.

A typical configuration might augment a Lucene index with an RDBMS storage plug-in. All text content is indexed (not stored) in the Lucene index, along with a stored Field holding the RDBMS primary key. The RDBMS stores the original text plus any other fields. Retrieving documents would involve querying the Lucene index, reading the RDBMS key from each hit and using that key to fetch the other required fields from the database.

As well as allowing an RDBMS-backed storage option for document fields, this would let us use the RDBMS to provide filters at query time, e.g. books with price < $10.

As a rough outline this would require:

1) A new HybridDocument which can contain Lucene and non-Lucene fields for reading and writing.
2) A new reader/writer abstraction which routes fields to the appropriate repository (Lucene or plug-in storage).
3) A plug-in interface for attaching external storage/filter modules.
4) A new search facility that can pass Lucene queries to Lucene and filter requests to a filter module.
5) A search facility that allows partial retrieval of documents (e.g. the equivalent of select summary, title, price...).
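
To make the outline a bit more concrete, here is a minimal sketch of how the five pieces might fit together. It is plain Java with no Lucene or JDBC dependency: ToyTextIndex is only a stand-in for the Lucene index and InMemoryStorage is a stand-in for the RDBMS. Every name in it (StoragePlugin, HybridRepository, the method signatures, the sample books and prices) is hypothetical and made up for illustration, not an existing Lucene API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Point 3: hypothetical plug-in interface for external storage/filter modules.
interface StoragePlugin {
    void store(String key, Map<String, String> fields);
    // Point 5: partial retrieval, like "select title, price ...".
    Map<String, String> load(String key, List<String> wanted);
    // Point 4: filter requests are answered by the plug-in, not by the text index.
    boolean accept(String key, Predicate<Map<String, String>> filter);
}

// Stand-in for an RDBMS table keyed by primary key; a real plug-in might use JDBC.
class InMemoryStorage implements StoragePlugin {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    public void store(String key, Map<String, String> fields) {
        rows.put(key, new HashMap<>(fields));
    }

    public Map<String, String> load(String key, List<String> wanted) {
        Map<String, String> out = new LinkedHashMap<>();
        Map<String, String> row = rows.get(key);
        if (row == null) return out;
        for (String name : wanted) {
            if (row.containsKey(name)) out.put(name, row.get(name));
        }
        return out;
    }

    public boolean accept(String key, Predicate<Map<String, String>> filter) {
        Map<String, String> row = rows.get(key);
        return row != null && filter.test(row);
    }
}

// Stand-in for the Lucene index: text is searchable, only the key comes back.
class ToyTextIndex {
    private final Map<String, String> textByKey = new LinkedHashMap<>();

    void add(String key, String text) {
        textByKey.put(key, text.toLowerCase());
    }

    List<String> search(String term) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : textByKey.entrySet()) {
            if (e.getValue().contains(term.toLowerCase())) keys.add(e.getKey());
        }
        return keys;
    }
}

// Point 1: a document carrying both indexed text and externally stored fields.
class HybridDocument {
    final String key;
    final String text;
    final Map<String, String> fields;

    HybridDocument(String key, String text, Map<String, String> fields) {
        this.key = key;
        this.text = text;
        this.fields = fields;
    }
}

// Point 2: routes each part of a document to the appropriate repository.
class HybridRepository {
    private final ToyTextIndex index = new ToyTextIndex();
    private final StoragePlugin storage;

    HybridRepository(StoragePlugin storage) {
        this.storage = storage;
    }

    void add(HybridDocument doc) {
        index.add(doc.key, doc.text);            // indexed, not stored
        Map<String, String> row = new HashMap<>(doc.fields);
        row.put("text", doc.text);               // original text lives in storage
        storage.store(doc.key, row);
    }

    List<Map<String, String>> search(String term,
                                     Predicate<Map<String, String>> filter,
                                     List<String> wanted) {
        List<Map<String, String>> results = new ArrayList<>();
        for (String key : index.search(term)) {
            if (storage.accept(key, filter)) results.add(storage.load(key, wanted));
        }
        return results;
    }
}

class HybridStorageSketch {
    // Builds a repository holding three made-up sample documents.
    static HybridRepository demoRepository() {
        HybridRepository repo = new HybridRepository(new InMemoryStorage());
        String[][] books = {
            {"1", "A long book about Lucene internals", "Big Search Book", "35.00"},
            {"2", "A short guide to Lucene queries", "Small Search Guide", "8.50"},
            {"3", "Recipes for busy people", "Cheap Cookbook", "5.00"},
        };
        for (String[] b : books) {
            Map<String, String> fields = new HashMap<>();
            fields.put("title", b[2]);
            fields.put("price", b[3]);
            repo.add(new HybridDocument(b[0], b[1], fields));
        }
        return repo;
    }
}
```

A query for "lucene" with a filter such as row -> Double.parseDouble(row.get("price")) < 10.0 and a wanted-field list of title and price goes to the text index first, then asks the plug-in to filter and load only the requested fields for each matching key. In a real implementation the filter would presumably be pushed down to the database as a WHERE clause rather than evaluated row by row.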