Thanks all for the suggestions - there was also another thread "Lucene index on relational data" which had crossover here.

That's an interesting idea about using ParallelReader for the changeable index. I had thought to just index a triplet 'owner:mailId:label' in each Document and have multiple Documents for the same mailId, e.g. if each recipient adds labels for the same mail, or if one recipient adds multiple labels. I would then have to join against the core index on mailId. However, if I use ParallelReader, I could have a single Document with multiple fields and, using stored fields, 'modify' that Document. But what happens to the DocId when the delete+add occurs, and how do I ensure it stays the same?
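
For concreteness, the triplet approach amounts to a two-step lookup, which can be sketched in plain Java as below. This is only a model under stated assumptions, not Lucene code: LabelEntry, mailIdsFor and join are hypothetical names, and the lists simply stand in for the label index and for the hits returned by the core index.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model of the two-index join; no Lucene classes are used. */
class LabelJoinSketch {

    /** One entry in the hypothetical label index: an owner:mailId:label triplet. */
    static final class LabelEntry {
        final String owner;
        final String mailId;
        final String label;

        LabelEntry(String owner, String mailId, String label) {
            this.owner = owner;
            this.mailId = mailId;
            this.label = label;
        }
    }

    /** Step 1: collect the mailIds that one owner has tagged with one label. */
    static Set<String> mailIdsFor(List<LabelEntry> labelIndex, String owner, String label) {
        Set<String> ids = new LinkedHashSet<String>();
        for (LabelEntry e : labelIndex) {
            if (e.owner.equals(owner) && e.label.equals(label)) {
                ids.add(e.mailId);
            }
        }
        return ids;
    }

    /** Step 2: keep only the core-index hits whose mailId came back from step 1. */
    static List<String> join(List<String> coreHits, Set<String> labelledIds) {
        List<String> out = new ArrayList<String>();
        for (String mailId : coreHits) {
            if (labelledIds.contains(mailId)) {
                out.add(mailId);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<LabelEntry> labelIndex = new ArrayList<LabelEntry>();
        labelIndex.add(new LabelEntry("antony", "m1", "todo"));
        labelIndex.add(new LabelEntry("antony", "m1", "urgent")); // second label, same mail
        labelIndex.add(new LabelEntry("hoss", "m1", "todo"));     // another recipient, same mail
        labelIndex.add(new LabelEntry("antony", "m3", "todo"));

        List<String> coreHits = new ArrayList<String>();
        coreHits.add("m1");
        coreHits.add("m2");
        coreHits.add("m3");

        System.out.println(join(coreHits, mailIdsFor(labelIndex, "antony", "todo")));
    }
}
```

The cost of this design is that every labelled search touches both indexes, and the set of mailIds from step 1 can get large for a heavily used label.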

I'm on 2.3.1. I seem to recall a discussion on this in another thread, but cannot find it.

Antony



Chris Hostetter wrote:
: The archive is read only apart from bulk deletes, but one of the requirements
: is for users to be able to label their own mail.  Given that a Lucene Document
: cannot be updated, I have thought about having a separate Lucene index that
: has just the 3 terms (or some combination of) userId + mailId + label.
: That of course would mean joining searches from the main mail data index and
: the label index.

tangential to the existing followups about ways to use Filters efficiently to get some of the behavior, take a look at ParallelReader ... your use case sounds like it might be perfect for it: one really large main dataset that changes fairly infrequently, and what changes do occur are mainly about adding new records; plus a small "parallel" set of fields about each record in the main set which do change fairly frequently.

you build up an index for the main data, and then you periodically build up a second index with the docs in the exact same order as the main index.

additions to the main index don't need to block on rebuilding the secondary index. deletes do (since you need to delete from both indexes in parallel to keep the ids in sync) ... but that's ok since you said you only need occasional bulk deletes (you could process them as an initial step of your recurring rebuild of the smaller index).
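
The ordering constraint described above can be illustrated with a small plain-Java model. It is a sketch under stated assumptions rather than Lucene code: each index is modelled as a list of mailIds whose position stands in for the doc id, removal compacts the list immediately (Lucene itself only marks documents as deleted and renumbers when segments are merged), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of two position-aligned indexes; no Lucene classes are used. */
class ParallelAlignmentSketch {

    /** True when both lists hold the same mailIds in the same positions. */
    static boolean aligned(List<String> mainIds, List<String> sideIds) {
        return mainIds.equals(sideIds);
    }

    /** Models delete + add on one index: the old entry goes away, the new one is appended. */
    static void deleteAndAdd(List<String> index, String mailId) {
        index.remove(mailId); // remove(Object): drops the first matching entry
        index.add(mailId);    // the re-added document lands at the end, i.e. a new position
    }

    /** Models a bulk delete applied to both indexes so positions stay in step. */
    static void deleteFromBoth(List<String> mainIds, List<String> sideIds, String mailId) {
        mainIds.remove(mailId);
        sideIds.remove(mailId);
    }

    /** Models the periodic rebuild: write the side index in main-index order. */
    static List<String> rebuildSide(List<String> mainIds) {
        return new ArrayList<String>(mainIds);
    }

    public static void main(String[] args) {
        List<String> mainIds = new ArrayList<String>(Arrays.asList("m1", "m2", "m3"));
        List<String> sideIds = rebuildSide(mainIds);
        System.out.println(aligned(mainIds, sideIds)); // fresh rebuild

        deleteAndAdd(sideIds, "m2");                   // in-place "update" of the side index only
        System.out.println(aligned(mainIds, sideIds));

        sideIds = rebuildSide(mainIds);                // recurring rebuild restores the order
        deleteFromBoth(mainIds, sideIds, "m1");        // bulk delete applied to both
        System.out.println(aligned(mainIds, sideIds));
    }
}
```

In this model a delete+add on the side index alone moves the document to the end and breaks the positional match, which is why the small index is rebuilt in main-index order rather than updated document by document.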



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



