Hi G. Long,

Most likely the problem is in your application. Lucene does not change the 
value stored in the index. For stored fields, Lucene does not interpret 
entities at all; the value is just binary data to Lucene. From your 
application's perspective, it is String in -> String out. Perhaps the 
entities are being stripped when you output the data to the user?
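
To narrow down where the character gets lost, it may help to dump the code points of the value right after reading it from the document, and again just before it is written out. The following is only a minimal sketch using the plain JDK: the class and method names are made up for illustration, the sample string is an assumption based on your example, and the ISO-8859-1 round trip is just a guess at what an output layer might be doing, not something I know about your setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical debugging helper, not part of Lucene.
class FieldValueDebug {

    // Lists every code point of the string as U+XXXX, separated by spaces.
    static String dumpCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Encodes the string with the given charset and decodes it again,
    // simulating an output step that uses that charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Assumed sample value: U+2019 (right single quote) and U+00B0 (degree sign).
        String value = "l\u2019application n\u00B0 90-1258";

        System.out.println("original: " + dumpCodePoints(value));
        System.out.println("UTF-8:    "
                + dumpCodePoints(roundTrip(value, StandardCharsets.UTF_8)));
        System.out.println("Latin-1:  "
                + dumpCodePoints(roundTrip(value, StandardCharsets.ISO_8859_1)));
    }
}
```

If the Latin-1 line shows U+003F (a question mark) where the original shows U+2019, while U+00B0 survives in both, then a single-byte output encoding would be one possible explanation for losing the apostrophe but keeping the degree sign. Calling dumpCodePoints on the string returned for the title field would show whether the value is already different before your own output code touches it.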

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: G.Long [mailto:jde...@gmail.com]
> Sent: Monday, March 03, 2014 6:09 PM
> To: java-user@lucene.apache.org
> Subject: encoding problem when retrieving document field value
> 
> Hi :)
> 
> My index (Lucene 3.5) contains a field called title. Its value is indexed
> (analyzed and stored) with the WhitespaceAnalyzer and can contain HTML
> entities such as ’ or °.
> 
> My problem is that when I retrieve values from this field, some of the HTML
> entities are missing.
> For example:
> 
> Luke tells me that the stored value is : "l’application n° 90-1258"
> and when I retrieve the field value in my application, I get "l’application n°
> 90-1258".
> 
> The apostrophe is not in the returned value whereas the ° character is
> present.
> 
> What could be the problem?
> 
> Thanks,
> 
> Gary
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
