Michael,

How would IndexWriter.addIndexes() work with unique doc ids?

Regards,
Paul Elschot


On Tuesday 22 January 2008 12:07:16, Michael Busch wrote:
> Hi Team,
> 
> the question of how to delete with IndexWriter using doc ids is
> currently being discussed on java-user
> (http://www.gossamer-threads.com/lists/lucene/java-user/57228), so I
> thought this is a good time to mention an idea that I recently had. I'm
> planning to work on column-stored fields soon (I used to call them
> per-document payloads). Then we'll have the ability to store metadata
> for each document very efficiently in the index.
> 
> This new data structure could be used to store a unique ID for each doc
> in the index. The IndexReader would then get an API that provides a
> mapping from the dynamic doc ids to the new unique ones. We would also
> have to store a reverse mapping (UID -> ID) in the index - we could use
> a VInt list + skip list for that.
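
To make the two mappings concrete, here is a minimal in-memory sketch. It is not a Lucene API: the class name UidMapSketch and its methods are hypothetical, and it assumes UIDs are handed out in strictly increasing order as documents are added, so that the doc-id-ordered array is already sorted and the reverse lookup can be a binary search. The on-disk VInt list + skip list proposed above would serve the same purpose; this only illustrates the lookups.

```java
import java.util.Arrays;

// Hypothetical sketch of the ID -> UID and UID -> ID mappings; not a Lucene API.
final class UidMapSketch {
    // uids[docId] holds the unique ID of the document at that position.
    // Assumption: UIDs are assigned in increasing order, so this array is sorted.
    private final long[] uids;

    UidMapSketch(long[] uidsInDocOrder) {
        for (int i = 1; i < uidsInDocOrder.length; i++) {
            if (uidsInDocOrder[i] <= uidsInDocOrder[i - 1]) {
                throw new IllegalArgumentException("UIDs must be strictly increasing");
            }
        }
        this.uids = uidsInDocOrder.clone();
    }

    // Forward mapping: dynamic doc id -> UID.
    long uidOf(int docId) {
        return uids[docId];
    }

    // Reverse mapping: UID -> dynamic doc id, or -1 if the UID is not present.
    int docIdOf(long uid) {
        int pos = Arrays.binarySearch(uids, uid);
        return pos >= 0 ? pos : -1;
    }
}
```

When segments are merged and deleted docs are dropped, the dynamic doc ids shift while the UIDs stay the same, so a structure like this would have to be rebuilt per reader.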
> 
> Then we should be able to make IndexReaders "read-only" (LUCENE-1030)
> and provide a new API in IndexWriter, "delete by UID". This would also
> allow "delete by query". The disadvantage is that the index would
> become bigger, but that should still be ok: 8 bytes per doc for the
> ID->UID map (assuming we took long for the UID, which I'd suggest). The
> UID->ID map might even be a bit smaller initially (using VInts and
> VLongs), but might become bigger when the index has lots of deleted
> docs, because then the delta encoding wouldn't be as efficient anymore
> for the UIDs.
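
The size argument can be checked with a small calculation. The sketch below assumes the usual variable-length encoding with 7 payload bits per byte and stores each UID as the delta to its predecessor; the class and method names are hypothetical and nothing here reads or writes a real index. It only counts bytes.

```java
// Hypothetical sketch: byte cost of delta-encoding sorted UIDs as VLongs.
final class DeltaVLongSketch {
    // Bytes needed for a non-negative value at 7 payload bits per byte.
    static int vLongSize(long value) {
        int bytes = 1;
        while ((value & ~0x7FL) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    // Total bytes for the list when each UID is stored as a delta to the previous one.
    static long deltaEncodedSize(long[] sortedUids) {
        long total = 0;
        long previous = 0;
        for (long uid : sortedUids) {
            total += vLongSize(uid - previous);
            previous = uid;
        }
        return total;
    }
}
```

With consecutive UIDs every delta is 1 and costs a single byte, well under the fixed 8 bytes per doc of the ID->UID map. Once many docs are deleted, the surviving UIDs are further apart, the deltas grow, and each entry can take two or more bytes.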
> 
> If RAM permits, the maps could also be cached in memory (optional,
> configurable). The FieldCache overhaul (LUCENE-831) with column fields
> as source can help here.
> 
> After all this is implemented (column fields, UIDs, "read-only"
> IndexReaders, FieldCache overhaul) I'd like to make the column fields
> (and norms) updateable via IndexWriter.
> 
> OK, lots of food for thought.
> 
> -Michael
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 


