Re: Unique doc ids

Michael Busch Wed, 23 Jan 2008 01:11:22 -0800

Paul Elschot wrote:
> Michael,
> 
> How would IndexWriter.addIndexes() work with unique doc ids?


Hi Paul,

That would probably be a limitation of this design. The only way I can
think of right now to ensure that during an addIndexes() the UIDs don't
change is an API in IndexWriter like setMinUID(long). When you create an
index and you know that you'll add it to another one via addIndexes(),
then you could use this method to set the min UID value in that index to
the max number of add/update operations you'd expect in the other index.
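
A minimal sketch of that idea, as a toy model rather than Lucene code:
setMinUID(long) is the hypothetical method proposed above, and ToyIndex,
addDocument(), addIndexes() and uids() here are made-up stand-ins that
only track which UIDs have been handed out. It assumes one UID per add
operation, assigned from a simple counter.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Lucene's IndexWriter. setMinUID(long) is the
// hypothetical method proposed in the mail; all other names are invented.
class UidRangeSketch {

    static final class ToyIndex {
        private long nextUid = 0;
        private final List<Long> uids = new ArrayList<>();

        /** Hypothetical: start UID assignment at minUid (empty index only). */
        void setMinUID(long minUid) {
            if (!uids.isEmpty()) {
                throw new IllegalStateException("setMinUID must be called before adding docs");
            }
            nextUid = minUid;
        }

        /** Assigns the next free UID to a new document and returns it. */
        long addDocument() {
            long uid = nextUid++;
            uids.add(uid);
            return uid;
        }

        /** Merges another index, keeping its UIDs unchanged; fails on a collision. */
        void addIndexes(ToyIndex other) {
            for (Long uid : other.uids) {
                if (uids.contains(uid)) {
                    throw new IllegalStateException("UID collision: " + uid);
                }
            }
            uids.addAll(other.uids);
        }

        /** Returns a copy of the UIDs currently in this index, in insertion order. */
        List<Long> uids() {
            return new ArrayList<>(uids);
        }
    }

    public static void main(String[] args) {
        ToyIndex target = new ToyIndex();
        target.addDocument();
        target.addDocument();

        // We expect at most 1000 add/update operations in the target index,
        // so the index to be merged starts its UIDs at 1000.
        ToyIndex side = new ToyIndex();
        side.setMinUID(1000L);
        side.addDocument();

        target.addIndexes(side);
        System.out.println(target.uids());
    }
}
```

The sketch also shows the limitation: if the target index ever performs
more operations than the estimate passed to setMinUID(), its counter
runs into the merged range and the UIDs are no longer unique.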

Please note that the UIDs that I'm thinking about here would actually
not affect the index order. All postings would still be stored in
(dynamic) doc id order.
This means, with this design the search results would not be returned in
UID order, so the UIDs couldn't be used efficiently, e.g. for a join
operation with an external data structure (e.g. a database). I think in
this regard my proposed UID design differs from what was discussed here
some time ago.

The main use case here is to get rid of readers that do write operations.
I think that this would be very desirable when we implement updatable
column-fields. Then you could use the UIDs that an IndexReader returned
to delete or update docs or the column fields/norms, and you wouldn't
have to worry about IndexReaders being "in sync" with the IndexWriters.
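
To illustrate why a stable UID removes the "in sync" problem, here is a
small toy model (again not Lucene code; ToyWriter, addDocument(),
deleteByUid() and docIdOf() are invented names). It assumes the dynamic
doc id is simply the document's current position, so doc ids shift when
an earlier document is removed, while the UID stays valid.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: all names are hypothetical and nothing here is Lucene API.
class UidDeleteSketch {

    static final class ToyWriter {
        // Position in this list plays the role of the (dynamic) doc id.
        private final List<Long> liveUids = new ArrayList<>();
        private long nextUid = 0;

        /** Adds a document and returns its permanent UID. */
        long addDocument() {
            long uid = nextUid++;
            liveUids.add(uid);
            return uid;
        }

        /** Deletes by UID; returns false if that document is already gone. */
        boolean deleteByUid(long uid) {
            // Long.valueOf forces remove(Object) instead of remove(int index).
            return liveUids.remove(Long.valueOf(uid));
        }

        /** Current doc id for a UID, or -1 if the document was deleted. */
        int docIdOf(long uid) {
            return liveUids.indexOf(uid);
        }
    }

    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        long a = writer.addDocument();
        long b = writer.addDocument();
        long c = writer.addDocument();

        // A caller that remembered UID c can still delete that document
        // after the doc ids have shifted because a was removed.
        writer.deleteByUid(a);
        System.out.println("doc id of c is now " + writer.docIdOf(c));
        System.out.println("delete c by UID: " + writer.deleteByUid(c));
        System.out.println("b still present: " + (writer.docIdOf(b) >= 0));
    }
}
```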

Maybe this UID design that I'm thinking out loud about here is total
overkill for the mentioned use cases. I'm open to and interested in
alternative ideas!

-Michael


> 
> Regards,
> Paul Elschot
> 
> 

