FYI, here's how you can create a list of all available text encodings in the JVM you're running in. This can lead to a very long combo box, though :-)
Map<String, Charset> charsetMap = Charset.availableCharsets();

--Thilo

On 5/18/2010 01:40, Jörn Kottmann (JIRA) wrote:
>
> [ https://issues.apache.org/jira/browse/UIMA-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12868448#action_12868448 ]
>
> Jörn Kottmann commented on UIMA-1782:
> -------------------------------------
>
> There is now an option to specify the encoding of the text import files. It
> is always preset to the default platform encoding. The combo box displays the
> Java standard charsets (see here:
> http://java.sun.com/j2se/1.4.2/docs/api/java/nio/charset/Charset.html).
> If the user wants to use a non-standard Java charset (these are usually
> available as well), he has to type in the name of the charset. While the
> name is typed in, it is validated. If the charset is available, he can
> proceed with the import; otherwise the "Apply" button just remains disabled.
>
> It would be nice to add a warning to tell the user that the "Apply" button is
> disabled because of an invalid charset name or unsupported charset.
>
>> Encoding of text files during import should be configurable
>> -----------------------------------------------------------
>>
>>                 Key: UIMA-1782
>>                 URL: https://issues.apache.org/jira/browse/UIMA-1782
>>             Project: UIMA
>>          Issue Type: Improvement
>>          Components: CasEditor
>>    Affects Versions: 2.3
>>            Reporter: Thomas Hampp
>>            Assignee: Jörn Kottmann
>>             Fix For: 2.3.1
>>
>>
>> During import of text files into a corpus it seems to be impossible to
>> control the encoding used. Looks like the default platform encoding is used
>> (Latin 1 on Western Windows systems). The Eclipse default encoding settings
>> for text files don't seem to affect import encoding. That makes it
>> impossible to import documents with international characters in UTF-8.
>> Ideally the encoding should be selectable in a drop-down field in the import
>> wizard.
>
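
PS: To make the two pieces discussed above a bit more concrete, here is a rough sketch of how the list of charset names and the "validate while typing" check could look. This is not the CasEditor code; the class name CharsetChoices and both method names are made up for illustration. The only JDK calls used are Charset.availableCharsets(), Charset.isSupported() and Charset.defaultCharset().

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;

// Hypothetical helper, not part of UIMA or the CasEditor.
class CharsetChoices {

    // Canonical names of all charsets this JVM knows, in sorted order.
    // These could be used as the combo box items.
    static List<String> availableCharsetNames() {
        SortedMap<String, Charset> charsetMap = Charset.availableCharsets();
        return new ArrayList<>(charsetMap.keySet());
    }

    // True if the typed name is a legal charset name that this JVM supports.
    // A dialog could enable its "Apply" button based on this result.
    static boolean isUsableCharsetName(String name) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        try {
            return Charset.isSupported(name.trim());
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed in charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> names = availableCharsetNames();
        System.out.println(names.size() + " charsets available");
        System.out.println("platform default: " + Charset.defaultCharset().name());
        System.out.println("UTF-8 usable: " + isUsableCharsetName("UTF-8"));
        System.out.println("bogus name usable: " + isUsableCharsetName("no such charset"));
    }
}
```

Since isUsableCharsetName() tells apart "illegal name" and "legal but unsupported" only internally, a real dialog would probably want to surface that difference in the warning message Jörn mentions.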