On 02/28/2013 05:17 PM, Stefan Matheis wrote:
That's a Debian machine running wheezy/sid packages; the file itself and the
locale are both already UTF-8:

$ echo $LANG
en_US.UTF-8

$ file input.txt
input.txt: UTF-8 Unicode text



anything else to check?


Hmm, I'm pretty sure there is an encoding mismatch. Do you know which encoding your JVM is using? I would guess that it is not UTF-8. You can probably get around the issue by re-encoding the input
file to the encoding the JVM is using.
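
If it helps, here is a rough sketch of what I mean by re-encoding. It assumes the input really is UTF-8 (as `file` reports) and simply converts it to whatever the JVM's default charset turns out to be. The class name Reencode and the output file name input-reencoded.txt are made up for illustration; only input.txt comes from your mail.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Reencode {

    // Decode the bytes using the source charset, then encode them again
    // using the target charset. Characters the target charset cannot
    // represent are replaced by its default replacement (usually '?').
    static byte[] reencode(byte[] data, Charset from, Charset to) {
        return new String(data, from).getBytes(to);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: input.txt is UTF-8; the output name is hypothetical.
        byte[] original = Files.readAllBytes(Paths.get("input.txt"));
        byte[] converted = reencode(original, StandardCharsets.UTF_8,
                                    Charset.defaultCharset());
        Files.write(Paths.get("input-reencoded.txt"), converted);
    }
}
```

Keep in mind that this is lossy if the JVM charset cannot represent everything in the file, so it is only a workaround.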

Have a look here:
http://stackoverflow.com/questions/1749064/how-to-find-default-charset-encoding-in-java

It would be nice if you could run the println statements from there.
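
Something along these lines should do; it is only a minimal sketch, not copied from the linked page. The class name CharsetCheck and the helper isUtf8 are my own names for illustration; the calls themselves (System.getProperty("file.encoding") and Charset.defaultCharset()) are standard JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {

    // Name of the charset the JVM uses when no encoding is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // True if the given charset name resolves to UTF-8.
    static boolean isUtf8(String charsetName) {
        return Charset.forName(charsetName).equals(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + defaultCharsetName());
        System.out.println("is UTF-8        = " + isUtf8(defaultCharsetName()));
    }
}
```

Please run it with the same JVM and the same environment you use for the failing command, since the default can depend on how the JVM is started. If the output is not UTF-8, starting the JVM with -Dfile.encoding=UTF-8 may also be worth a try.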

Jörn
