Kevin,

I think that this sort of thing should be built on top of the functionality provided by the binary fields proposal, or at least made to work with it:

<URL:http://issues.apache.org/bugzilla/show_bug.cgi?id=29370>

This would take care of the blob-vs.-text aspect of your proposal.

Also:

Kevin Burton wrote:
Supporting full unicode is important. Full java.lang.String storage is used with String.getBytes() so we should be able to avoid unicode issues.
If Java has a correct java.lang.String representation it's possible easily
add unicode support just by serializing the byte representation. (Note
that the JDK says that the DEFAULT system char encoding is used so if this
is ever changed it might break the index)

It's a bad idea to use the zero-parameter version of String.getBytes(). It encodes with the platform's default charset, so an index shared between two platforms with different default encodings would be written with one encoding and read back with another. Fortunately, there's a better alternative: for the surprisingly low price of "String.getBytes(String charsetName)", platform independence can be yours today.
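
To make that concrete, here is a minimal sketch of what I mean. The class and method names are made up for illustration and are not part of Lucene or of either proposal, and the choice of UTF-8 is my assumption. Any charset works as long as the writer and the reader name the same one explicitly.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper for illustration only; not an existing Lucene class.
class ExplicitCharsetExample {
    // Assumption: UTF-8 is the charset agreed on for the index.
    static final String CHARSET = "UTF-8";

    // Encode with a named charset instead of the platform default.
    static byte[] encode(String s) {
        try {
            return s.getBytes(CHARSET);
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new AssertionError(e);
        }
    }

    // Decode with the same named charset that was used to encode.
    static String decode(byte[] bytes) {
        try {
            return new String(bytes, CHARSET);
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        String original = "na\u00efve \u65e5\u672c\u8a9e";
        byte[] stored = encode(original);
        // The round trip does not depend on the default encoding of this machine.
        System.out.println(decode(stored).equals(original));
    }
}
```

Because both directions name the charset, the bytes written on one platform decode to the same string on any other. On newer JDKs the Charset overloads with java.nio.charset.StandardCharsets.UTF_8 do the same job without the checked exception.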


Steve
