Very nice! That is exactly what I needed. Thank you very much!


On 06/02/2014 09:26 AM, Michael McCandless wrote:
The index sorting APIs (in lucene/misc) can do this.  E.g. you could
make a SortingAtomicReader, with your sort criteria, then use
addIndexes(IR[]) to add it to a new index.  That resulting index would
have 1 segment and the docIDs would be in your order.

Mike McCandless

http://blog.mikemccandless.com


On Mon, May 12, 2014 at 12:01 PM, Olivier Binda
<olivier.bi...@wanadoo.fr> wrote:
In a 1-segment (parallel) read-only index that is built offline once (and
then frozen),
is it possible to remap the docIds as the last step (i.e. to have the
exact same index, except that the docIds are all equal to the order in which
the docs were added to the index)?

Say I have the read only index

docId   : document
1 : bookB
2 : sentenceB
3 : linkA
4 : linkC
5 : sentenceC
6 : sentenceA
7 : bookA
...
300000 : linkD

I would like to have instead the read-only index

docId   : document
1 : bookA
2 : bookB
...

M : linkA
M+1 : linkB
...
N+1 : sentenceA
N+2 : sentenceB
...
300000 : sentenceZZZ
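
Conceptually, the reordering above is just a permutation of docIds. Here is a minimal sketch in plain Java (not the Lucene API; the class and method names are hypothetical) that computes such a permutation by sorting on a per-document key. It uses 0-based positions, whereas the listings above count from 1.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper, for illustration only: computes the docId permutation
// that orders documents by a sort key.
class DocIdRemap {

    // newToOld[newId] = oldId. The sort is stable, so documents with equal
    // keys keep their original relative order.
    static int[] newToOld(String[] keys) {
        Integer[] ids = new Integer[keys.length];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        Arrays.sort(ids, Comparator.comparing((Integer i) -> keys[i]));
        int[] result = new int[ids.length];
        for (int i = 0; i < ids.length; i++) {
            result[i] = ids[i];
        }
        return result;
    }

    // Inverse mapping: oldToNew[oldId] = newId.
    static int[] oldToNew(int[] newToOld) {
        int[] result = new int[newToOld.length];
        for (int newId = 0; newId < newToOld.length; newId++) {
            result[newToOld[newId]] = newId;
        }
        return result;
    }
}
```

With the seven documents from the first listing as keys, bookA (old position 6) moves to the front, followed by bookB, the links, and then the sentences.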

This would allow me to reduce the amount of RAM needed to cache the type of
each document

-> without remapping, I need at least ceil(log2(types)) * documents bits
here 2 * 300000 bits

-> with remapping, I need only to remember ints M and N
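
As a rough sketch of that lookup in plain Java (the class and enum names are hypothetical, and it assumes the 1-based layout above: books in 1..M-1, links in M..N, sentences from N+1 on):

```java
// Hypothetical illustration: derive a document's type from its docId using
// only the two boundaries M and N, instead of a per-document type table.
class DocTypeLookup {

    enum DocType { BOOK, LINK, SENTENCE }

    private final int m; // first link docId
    private final int n; // last link docId

    DocTypeLookup(int m, int n) {
        if (m < 1 || n < m - 1) {
            throw new IllegalArgumentException("expected 1 <= M and N >= M - 1");
        }
        this.m = m;
        this.n = n;
    }

    DocType typeOf(int docId) {
        if (docId < 1) {
            throw new IllegalArgumentException("docId must be >= 1 in this layout");
        }
        if (docId < m) {
            return DocType.BOOK;
        }
        if (docId <= n) {
            return DocType.LINK;
        }
        return DocType.SENTENCE;
    }
}
```

The whole "cache" is then two ints, whatever the number of documents.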

Also, if I need to cache 1 byte of metadata for each book

-> without remapping, I would need 1 byte * documents
here 300000 bytes

-> with remapping, I would only need 1 byte * books
here M - 1 bytes
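
A minimal sketch of such a per-book cache in plain Java (hypothetical names; it assumes books occupy docIds 1..M-1 as in the layout above, so the docId itself can index the array):

```java
// Hypothetical illustration: one byte of metadata per book, stored in an
// array of M - 1 bytes rather than one byte per document.
class BookMetadataCache {

    private final byte[] metadata;

    // m is the first non-book docId, so there are m - 1 books.
    BookMetadataCache(int m) {
        if (m < 1) {
            throw new IllegalArgumentException("M must be >= 1");
        }
        this.metadata = new byte[m - 1];
    }

    boolean isBook(int docId) {
        return docId >= 1 && docId <= metadata.length;
    }

    void set(int docId, byte value) {
        checkBook(docId);
        metadata[docId - 1] = value; // docIds are 1-based in this layout
    }

    byte get(int docId) {
        checkBook(docId);
        return metadata[docId - 1];
    }

    int sizeInBytes() {
        return metadata.length;
    }

    private void checkBook(int docId) {
        if (!isBook(docId)) {
            throw new IllegalArgumentException("docId " + docId + " is not a book");
        }
    }
}
```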


I tried building such an index with LogMergePolicy/NoMergePolicy/extending
the RAM buffer but (maybe I did something wrong)
the docIds were always reshuffled (maybe because my index was big and I was
over a threshold)



Best regards,
Olivier

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
