Thanks all for the suggestions - there was also another thread "Lucene index on
relational data" which had crossover here.
That's an interesting idea about using ParallelReader for the changeable index.
I had thought to just have a triplet indexed 'owner:mailId:label' in each Doc
and have multiple Documents for the same mailId, e.g. if each recipient adds
labels for the same mail, or if multiple labels are added by one recipient. I
would then have to make a join using mailId against the core. However, if I
want to use PR (ParallelReader), I could have a single Document with multiple
fields and, by using stored fields, 'modify' that Document. But what happens to
the DocId when the delete+add occurs, and how do I ensure it stays the same?
I'm on 2.3.1. I seem to recall a discussion on this in another thread, but
cannot find it.
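To make the triplet join described above concrete, here is a minimal sketch in
plain Java. It is only a toy in-memory model, not Lucene code: the class
LabelJoinSketch and its methods are hypothetical names, the "label index" is
just a set of 'owner:mailId:label' strings, and it assumes owner and mailId
never contain a ':' character.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy in-memory model of the two-index join; no Lucene classes are used.
class LabelJoinSketch {
    // Stand-in for the label index: one "owner:mailId:label" triplet per entry.
    private final Set<String> triplets = new HashSet<String>();

    void addLabel(String owner, String mailId, String label) {
        triplets.add(owner + ":" + mailId + ":" + label);
    }

    // Step 1: collect the mailIds that 'owner' has tagged with 'label'.
    // Assumes owner and mailId contain no ':'; the label itself may.
    Set<String> mailIdsFor(String owner, String label) {
        Set<String> ids = new HashSet<String>();
        for (String t : triplets) {
            String[] parts = t.split(":", 3);
            if (parts[0].equals(owner) && parts[2].equals(label)) {
                ids.add(parts[1]);
            }
        }
        return ids;
    }

    // Step 2: keep only the main-index hits whose mailId is in that set,
    // preserving the order of the main hits.
    static List<String> join(List<String> mainHits, Set<String> labelled) {
        List<String> out = new ArrayList<String>();
        for (String mailId : mainHits) {
            if (labelled.contains(mailId)) {
                out.add(mailId);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LabelJoinSketch labels = new LabelJoinSketch();
        labels.addLabel("alice", "m1", "todo");
        labels.addLabel("alice", "m3", "todo");
        labels.addLabel("bob", "m2", "todo");
        List<String> mainHits = Arrays.asList("m1", "m2", "m3");
        System.out.println(join(mainHits, labels.mailIdsFor("alice", "todo")));
    }
}
```

In a real setup, step 1 would be a search against the label index and step 2
would most likely be a Filter applied to the search on the main index.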
Antony
Chris Hostetter wrote:
: The archive is read only apart from bulk deletes, but one of the requirements
: is for users to be able to label their own mail. Given that a Lucene Document
: cannot be updated, I have thought about having a separate Lucene index that
: has just the 3 terms (or some combination of) userId + mailId + label.
:
: That of course would mean joining searches from the main mail data index and
: the label index.
tangential to the existing followups about ways to use Filters efficiently
to get some of the behavior, take a look at ParallelReader ... your use
case sounds like it might be perfect for it: one really large main dataset
that changes fairly infrequently, and what changes do occur are mainly
about adding new records; plus a small "parallel" set of fields about
each record in the main set which do change fairly frequently.
you build up an index for the main data, and then you periodically build up
a second index with the docs in the exact same order as the main index.
additions to the main index don't need to block on rebuilding the secondary
index. deletes do (since you need to delete from both indexes in parallel
to keep the ids in sync) ... but that's ok since you said you only need
occasional bulk deletes (you could process them as an initial step of your
recurring rebuild of the smaller index).
-Hoss
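As a rough illustration of the rebuild scheme quoted above, the sketch below
models the two indexes as plain Java lists that are aligned by position. It is
not Lucene code and does not use ParallelReader itself; the class
ParallelRebuildSketch, its methods, and the map of labels keyed by mailId are
all hypothetical stand-ins. The point it tries to show is that the small index
is regenerated by walking the main index in doc-id order, so that position i in
both refers to the same mail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of two position-aligned indexes; plain collections only.
class ParallelRebuildSketch {
    // Rebuild the small index by walking the main index in doc-id order, so
    // row i of the result describes the same mail as row i of the main index.
    static List<Set<String>> rebuild(List<String> mainMailIds,
                                     Map<String, Set<String>> labelsByMailId) {
        List<Set<String>> secondary = new ArrayList<Set<String>>();
        for (String mailId : mainMailIds) {
            Set<String> labels = labelsByMailId.get(mailId);
            // Mails without labels still get an empty row to hold their position.
            secondary.add(labels == null
                    ? new TreeSet<String>()
                    : new TreeSet<String>(labels));
        }
        return secondary;
    }

    // Combined view of one position, as a parallel read over both lists gives.
    static String view(List<String> main, List<Set<String>> secondary, int docId) {
        return main.get(docId) + " " + secondary.get(docId);
    }

    public static void main(String[] args) {
        List<String> main = Arrays.asList("m1", "m2", "m3");
        Map<String, Set<String>> labels = new HashMap<String, Set<String>>();
        labels.put("m2", new TreeSet<String>(Arrays.asList("work", "todo")));
        List<Set<String>> secondary = rebuild(main, labels);
        for (int docId = 0; docId < main.size(); docId++) {
            System.out.println(docId + ": " + view(main, secondary, docId));
        }
    }
}
```

Label changes made between rebuilds only become visible after the next
rebuild, which is the trade-off this scheme accepts.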
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]