[ 
https://issues.apache.org/jira/browse/LUCENE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491456#comment-13491456
 ] 

Adrien Grand commented on LUCENE-4536:
--------------------------------------

bq. This patch only changes the on-disk format right? The specialized in-memory 
readers are still backed by native arrays (short[]/int[]/long[], etc.)?

Exactly.

bq. Ie, in general, I think the version constants should be created once and 
then not changed (write once), and VERSION_CURRENT changes to point to 
whichever is most recent.

Ok, I'll change it.
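
A minimal sketch of the write-once scheme described in the quote above. The class name, the constant names and the helper method are hypothetical and chosen for illustration only; they are not taken from the attached patch.

```java
// Hypothetical holder class; names are illustrative, not taken from the patch.
final class PackedFormatVersions {
  private PackedFormatVersions() {}

  // Each version constant is assigned once and never renumbered afterwards.
  static final int VERSION_START = 0;         // assumed: original long-aligned format
  static final int VERSION_BYTE_ALIGNED = 1;  // assumed: byte-aligned format

  // Only this alias moves when a new format version is introduced.
  static final int VERSION_CURRENT = VERSION_BYTE_ALIGNED;

  // Readers branch on the version stored in the header, never on VERSION_CURRENT.
  static boolean isLongAligned(int version) {
    if (version < VERSION_START || version > VERSION_CURRENT) {
      throw new IllegalArgumentException("Unsupported version: " + version);
    }
    return version < VERSION_BYTE_ALIGNED;
  }
}
```

With this layout, adding a later format means adding one new constant and repointing VERSION_CURRENT, while data written with older versions stays readable.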

bq. That careful anonymous subclass in PackedInts to handle seeking to the end 
when the last value is read is sort of sneaky ... this should only kick in when 
reading the old (long-aligned) format right?

This only happens when reading the old format AND the number of bytes used to 
serialize the array is not a multiple of 8. I'll add an assert to make sure 
that this condition can only be true with the old format.
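
To illustrate the condition, here is a rough sketch that uses plain java.io streams rather than Lucene's own input classes. All class and method names are hypothetical. It assumes the old format pads each array to a multiple of 8 bytes, so trailing padding exists exactly when the payload byte count is not a multiple of 8.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; not the PackedInts implementation.
final class TrailingPaddingSketch {
  // Bytes that actually carry packed values (rounded up to a whole byte).
  static long payloadBytes(long valueCount, int bitsPerValue) {
    return (valueCount * bitsPerValue + 7) / 8;
  }

  // Bytes written by a long-aligned format (rounded up to a whole long).
  static long longAlignedBytes(long valueCount, int bitsPerValue) {
    return 8 * ((valueCount * bitsPerValue + 63) / 64);
  }

  // Non-zero exactly when payloadBytes is not a multiple of 8.
  static long trailingPaddingBytes(long valueCount, int bitsPerValue) {
    return longAlignedBytes(valueCount, bitsPerValue) - payloadBytes(valueCount, bitsPerValue);
  }

  // Reads a fake long-aligned array byte by byte, skips the padding,
  // and returns how many bytes are left in the stream afterwards.
  static int remainingAfterRead(int valueCount, int bitsPerValue) {
    byte[] fake = new byte[(int) longAlignedBytes(valueCount, bitsPerValue)];
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fake))) {
      in.readFully(new byte[(int) payloadBytes(valueCount, bitsPerValue)]);
      int padding = (int) trailingPaddingBytes(valueCount, bitsPerValue);
      if (in.skipBytes(padding) != padding) {
        throw new EOFException("Truncated padding");
      }
      return in.available();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

For example, 10 values at 12 bits each occupy 15 payload bytes, so a long-aligned stream carries one extra padding byte that must be skipped before the stream counts as fully consumed.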

bq. Or ... maybe... we should not "promise" this (no trailing wasted bytes) in 
the API?
bq. Or maybe we expose a new explicit method to "seek to the end of this packed 
ints" or something (eg maybe "skipTrailingBytes").

These were my first ideas, but the truth is that I was very afraid of breaking 
something (for example, doc values rely on the assumption that after reading the 
last value of a direct array, the whole stream is consumed). Fixing PackedInts 
to make sure those assumptions are still true looked easier to me as I was able 
to create "fake" long-aligned packed ints and make sure that the whole stream 
was consumed after reading the last value.

But your option makes perfect sense to me and I will do it if you think it is 
cleaner.

Thanks for the review!
                
> Make PackedInts byte-aligned?
> -----------------------------
>
>                 Key: LUCENE-4536
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4536
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: LUCENE-4536.patch
>
>
> PackedInts are more and more used to save/restore small arrays, but given 
> that they are long-aligned, up to 63 bits are wasted per array. We should try 
> to make PackedInts storage byte-aligned so that only 7 bits are wasted in the 
> worst case.
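
The worst-case figures in the description can be checked with a short calculation. The class and method below are hypothetical helpers written for this illustration only.

```java
// Hypothetical helper; shows the arithmetic behind "63 bits" versus "7 bits".
final class AlignmentWaste {
  // Bits left unused when valueCount values of bitsPerValue bits each are
  // stored in a block rounded up to a multiple of alignmentBits.
  static long wastedBits(long valueCount, int bitsPerValue, int alignmentBits) {
    long used = valueCount * bitsPerValue;
    long stored = ((used + alignmentBits - 1) / alignmentBits) * alignmentBits;
    return stored - used;
  }

  public static void main(String[] args) {
    // 65 one-bit values: 65 bits used.
    System.out.println(wastedBits(65, 1, 64)); // long-aligned: 128 - 65
    System.out.println(wastedBits(65, 1, 8));  // byte-aligned: 72 - 65
  }
}
```

The waste is largest when the used bit count is one more than a multiple of the alignment, which gives 63 bits for long alignment and 7 bits for byte alignment.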
