IndexableBinaryStringTools: convert arbitrary byte sequences into Strings that 
can be used as index terms, and vice versa
-------------------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-1434
                 URL: https://issues.apache.org/jira/browse/LUCENE-1434
             Project: Lucene - Java
          Issue Type: New Feature
          Components: Other
    Affects Versions: 2.4
            Reporter: Steven Rowe
            Priority: Minor
             Fix For: 2.9


Provides support for converting byte sequences to Strings that can be used as 
index terms, and back again. The resulting Strings preserve the original byte 
sequences' sort order (assuming the bytes are interpreted as unsigned).

The Strings are constructed using a Base 8000h (32768) encoding of the original 
binary data: each char of an encoded String represents a 15-bit chunk of the 
byte sequence.  Base 8000h was chosen because it allows all of the lower 15 
bits of a char to be used without restriction. Using char's high bit as well 
would require complicated handling to avoid the surrogate range 
[U+D800-U+DFFF], whose values are not valid chars on their own.
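
To make the 15-bit chunking concrete, here is a minimal sketch of the idea. It 
is not the code attached to this issue: the class name Base8000hSketch, the 
method names, and the zero-padding of the final chunk are assumptions made for 
illustration. In particular, the sketch does not store the original length, so 
decode() must be told how many bytes to recover, and inputs that differ only in 
trailing zero bytes can encode to the same String. A real implementation needs 
to record that information.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the class proposed in LUCENE-1434.
class Base8000hSketch {

    // Packs the input bits, most significant bit first, into chars that each
    // carry 15 bits. The last char is padded with zero bits (an assumption).
    static String encode(byte[] input) {
        int totalBits = input.length * 8;
        int numChars = (totalBits + 14) / 15;
        char[] out = new char[numChars];
        for (int i = 0; i < numChars; i++) {
            int value = 0;
            for (int bit = i * 15; bit < (i + 1) * 15; bit++) {
                value <<= 1;
                if (bit < totalBits) {
                    int b = input[bit / 8] & 0xFF; // treat the byte as unsigned
                    value |= (b >> (7 - bit % 8)) & 1;
                }
            }
            out[i] = (char) value; // always in [0x0000, 0x7FFF]
        }
        return new String(out);
    }

    // Reverses encode(). The caller supplies the original byte count because
    // this sketch does not store it in the encoded String.
    static byte[] decode(String encoded, int byteLength) {
        byte[] out = new byte[byteLength];
        for (int bit = 0; bit < byteLength * 8; bit++) {
            int c = encoded.charAt(bit / 15);
            int bitValue = (c >> (14 - bit % 15)) & 1;
            out[bit / 8] |= (byte) (bitValue << (7 - bit % 8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] a = {(byte) 0x01, (byte) 0x7F};
        byte[] b = {(byte) 0x01, (byte) 0x80}; // larger than a when unsigned
        String ea = encode(a);
        String eb = encode(b);
        System.out.println(ea.compareTo(eb) < 0);
        System.out.println(Arrays.equals(decode(eb, b.length), b));
    }
}
```

Because every encoded char stays below U+8000, no char can fall in the 
surrogate range, and String.compareTo() sees the chunks in the same order as an 
unsigned comparison of the underlying bits.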

This class is intended as a mechanism for using CollationKeys as index terms.
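
For context, the byte sequences in question can be obtained from the standard 
java.text API as sketched below. The class name CollationKeyBytesSketch and the 
helper methods are hypothetical and exist only for this example; 
Collator.getCollationKey() and CollationKey.toByteArray() are the real JDK 
calls. The sketch shows that comparing the key bytes as unsigned values gives 
the same ordering as the Collator itself, which is the property an 
order-preserving String encoding would need to carry into the index.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical illustration only; not part of the LUCENE-1434 patch.
class CollationKeyBytesSketch {

    // Returns the binary form of the collation key for the given text.
    static byte[] keyBytes(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }

    // Lexicographic comparison that treats each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        byte[] apple = keyBytes(collator, "apple");
        byte[] banana = keyBytes(collator, "Banana");
        // Both signs should agree: the key bytes order the same way the
        // Collator does, unlike a plain String.compareTo() on the raw text.
        System.out.println(Integer.signum(compareUnsigned(apple, banana)));
        System.out.println(Integer.signum(collator.compare("apple", "Banana")));
    }
}
```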
