Hi :)
I found the source of the problem. It is indeed the input string. It
comes from a CSV export from a relational database. The InputStream of
this CSV file was read with the wrong charset (ISO-8859-1 instead of
CP1252). So the right single quote was returned as this character
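For illustration, here is a minimal Java sketch of the charset mismatch described above. It assumes the export stores the right single quote as the single byte 0x92, which is its CP1252 encoding. The class and method names are hypothetical, and `windows-1252` is assumed to be available in the JVM (it usually is, but it is not one of the charsets the platform guarantees).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene or of the original application.
class CharsetMismatchDemo {
    // Assumption: the JVM ships the windows-1252 (CP1252) charset.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Decode one raw byte with the given charset and return the resulting code point.
    static int decodeSingleByte(int b, Charset cs) {
        return new String(new byte[] { (byte) b }, cs).codePointAt(0);
    }

    // Repair a string that was decoded as ISO-8859-1 but was really CP1252:
    // ISO-8859-1 maps bytes 1:1 to U+0000..U+00FF, so the original bytes can be recovered.
    static String repair(String wronglyDecoded) {
        byte[] raw = wronglyDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, CP1252);
    }

    public static void main(String[] args) {
        // Byte 0x92 read as ISO-8859-1 becomes the control character U+0092.
        System.out.printf("ISO-8859-1: U+%04X%n",
                decodeSingleByte(0x92, StandardCharsets.ISO_8859_1));
        // The same byte read as CP1252 becomes U+2019, the right single quote.
        System.out.printf("CP1252:     U+%04X%n", decodeSingleByte(0x92, CP1252));
    }
}
```

The cleaner fix is to open the CSV with the correct charset in the first place. The `repair` helper only applies to strings that were already decoded the wrong way.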
Hi G. Long,
Most likely, the problem is in your application. Lucene does not change the
value stored in the index. For stored fields, Lucene does not deal with
entities, it's just binary data to Lucene. From your application perspective,
it is String in - String out. I think maybe you strip
Hi :)
I've got this result directly from tncTitle in the following code:
field = doc.getFieldable(IndexConstants.FIELD_TNC_TITLE);
if (field != null) {
    tncTitle = field.stringValue();
}
PS: in my previous email, copying and pasting the apostrophe's HTML
numeric entity made it appear correctly
What is the hex value for that second character returned that appears to
display as an apostrophe? Hex 92 (decimal 146) is listed as Private Use
2, so who knows what it might display as. All that is important is the
binary/hex value.
Out of curiosity, how did your application come about
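One way to answer the hex-value question is to dump the code points of the string that comes back from the stored field. This is a small sketch in plain Java. The class and method names are hypothetical, and the sample string is only an assumed example of a value containing U+0092.

```java
// Hypothetical helper for inspecting what a String really contains.
class CodePointDump {
    // Render every code point of the string as U+XXXX, separated by spaces.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample value: "it", then the control character U+0092, then "s".
        System.out.println(toHex("it\u0092s")); // prints U+0069 U+0074 U+0092 U+0073
    }
}
```

Calling `toHex` on the value returned by `field.stringValue()` would show whether the character is U+0092 (a control character) or U+2019 (the right single quote), regardless of how a terminal or mail client renders it.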
On Tue, Mar 4, 2014 at 4:44 AM, Jack Krupansky j...@basetechnology.com wrote:
What is the hex value for that second character returned that appears to
display as an apostrophe? Hex 92 (decimal 146) is listed as Private Use
2, so who knows what it might display as.
Well, if they're dealing