What is the hex value of the second character in the returned string, the one that appears to display as an apostrophe? Hex 92 (decimal 146) is listed as "Private Use 2", so who knows what it might display as. All that matters is the binary/hex value.
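
One way to get that value is to dump the code points of the retrieved string. This is only a minimal sketch, assuming the field value is already in a Java String; the class and method names (CodePointDump, dump) are made up for illustration and are not part of Lucene:

```java
public class CodePointDump {

    // Returns every code point of s formatted as U+XXXX, separated by spaces.
    // Hypothetical helper for debugging, not a Lucene API.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            // Advance by one or two chars depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample inputs only: the first uses the C1 control U+0092,
        // the second uses the right single quotation mark U+2019.
        System.out.println(dump("l\u0092application"));
        System.out.println(dump("l\u2019application"));
    }
}
```

Calling dump() on the value returned by the stored field would show exactly which code point sits where the apostrophe should be. If it turns out to be U+0092 rather than U+2019, one possible cause is text that was Windows-1252 (where byte 0x92 is the right single quote) being decoded as ISO-8859-1 somewhere along the way.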

Out of curiosity, how did your application come about picking a PU Unicode character?

-- Jack Krupansky

-----Original Message----- From: G.Long
Sent: Monday, March 3, 2014 12:09 PM
To: java-user@lucene.apache.org
Subject: encoding problem when retrieving document field value

Hi :)

My index (Lucene 3.5) contains a field called title. Its value is
indexed (analyzed and stored) with the WhitespaceAnalyzer and can
contain html entities such as ’ or °

My problem is that when I retrieve values from this field, some of the
html entities are missing.
For example :

Luke tells me that the stored value is : "l’application n°
90-1258" and when I retrieve the field value in my application, I get
"l’application n° 90-1258".

The apostrophe is not in the returned value whereas the ° character is
present.

What could be the problem?

Thanks,

Gary



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
