Hi,

I want to index data with UTF-8 encoding, so when adding a field to a document I use new String(value.getBytes("utf-8")). When searching, I used the same snippet to convert the search value to UTF-8, but it did not work. Eventually I found a suggestion to use new String(valueToSearch.getBytes("cp1252"), "UTF8") instead. That worked, but I still have some problems.

First, some characters come back garbled in the results I get from Lucene; they seem to be in cp1252 encoding. Second, if the Java system property "file.encoding" is not cp1252, the results come back in a completely wrong encoding, so I have to set it with System.setProperty("file.encoding", "cp1252").
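
To make the two conversions concrete, here is a minimal sketch of what I believe they do, assuming the platform default charset is cp1252. The class name EncodingSketch, the method names, and the sample string are made up for illustration, and the default charset is passed explicitly so the result does not depend on the machine. It does not involve Lucene at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {
    // Assumption: stands in for a platform default charset of cp1252.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Mirrors new String(value.getBytes("utf-8")) when the default charset
    // is cp1252: the UTF-8 bytes are decoded as cp1252 characters.
    static String indexSideConversion(String value) {
        return new String(value.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Mirrors new String(valueToSearch.getBytes("cp1252"), "UTF8"):
    // the cp1252 bytes are decoded as UTF-8.
    static String searchSideConversion(String value) {
        return new String(value.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        String garbled = indexSideConversion(original);

        // The single accented character becomes two cp1252 characters.
        System.out.println(garbled.equals("caf\u00c3\u00a9"));

        // For this sample, the second conversion undoes the first.
        System.out.println(searchSideConversion(garbled).equals(original));
    }
}
```

Both lines should print true for this sample, which would suggest the first snippet changes the string itself rather than tagging it as UTF-8. I have only checked this one string, so it may not hold for every character.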

Is Lucene ignoring my UTF-8 encoding and indexing the data as cp1252? How can I correct the garbled characters I receive when searching?

Thank you very much in advance.

--
Regards,
Mohammad