JRuby had a similar issue; perhaps this thread will help you sort things out:
http://jira.codehaus.org/browse/JRUBY-3576

Or this report:
http://openradar.appspot.com/8307054

But it is "not a bug it is feature."
http://bugs.sun.com/view_bug.do?bug_id=4163515

I think you will need to explicitly set your JVM to utf-8, but it seems like the Mac JVM is broken.

Dave

On Sep 9, 2010, at 2:36 PM, Laird Nelson wrote:

> Also, this is making me a little nervous. How does POI figure it out?
> Presumably if it's going to give me a "real Java unicode string", then it
> has to know how to convert from whatever encoding the spreadsheet is in to
> Unicode. So how does it figure out what encoding the spreadsheet is in?
> What if it guesses wrongly?
>
> Best,
> Laird
>
> On Thu, Sep 9, 2010 at 4:58 PM, Laird Nelson <ljnel...@gmail.com> wrote:
>
>> OK, so given that, I'm trying to figure out how when I take a String from
>> cell.getStringCellValue(), and write it to a file whose FileOutputStream has
>> been explicitly wrapped by a FileWriter using the UTF8 encoding--I'm trying
>> to figure out why the contents in the file appear to be in MacRoman encoding
>> (my platform's default).
>>
>> I'm creating my XMLEventWriter that's ultimately doing the writing like
>> this:
>>
>> final FileOutputStream fileOuptutStream = new FileOutputStream(file);
>> final OutputStreamWriter outputStreamWriter = new
>> OutputStreamWriter(fileOuptutStream, "UTF8");
>> final BufferedWriter bufferedWriter = new
>> BufferedWriter(outputStreamWriter);
>>
>> XMLEventWriter writer = outputFactory.createXMLEventWriter(bufferedWriter);
>>
>> ...and then at various points I'm using the String value from POI to stick
>> in there as #PCDATA. Seems like this should not involve ANY character set
>> conversion, is what you're telling me?
>>
>> L
>>
>>
>> On Thu, Sep 9, 2010 at 4:30 PM, Nick Burch <nick.bu...@alfresco.com> wrote:
>>
>>> On Thu, 9 Sep 2010, Laird Nelson wrote:
>>>
>>>> I am using POI to read an Excel spreadsheet. I have no idea what
>>>> character encoding it's in. I can tell you, however, it's not in UTF8. :-)
>>>>
>>>
>>> You don't have to worry about the encoding, POI sorts that out for you.
>>> Every String you get back is a real Java unicode string already
>>>
>>> Nick
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
>>> For additional commands, e-mail: user-h...@poi.apache.org
>>>
>>>
>>
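On the "explicitly set your JVM to utf-8" suggestion at the top of this message: one way to narrow the problem down is to check what the JVM thinks its default charset is and then confirm that an explicitly UTF-8 writer really puts UTF-8 bytes on disk. The following is only a rough sketch, not anything taken from POI or from the thread. The class name Utf8WriteCheck and the sample string are made up for illustration, and it assumes Java 7 or later (for StandardCharsets and try-with-resources). The default can usually be overridden at launch with -Dfile.encoding=UTF-8, although whether a given Mac JVM honors that is exactly what the reports above call into question.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper class, named here only for illustration.
public class Utf8WriteCheck {

    // Writes text through a writer whose charset is given explicitly,
    // so the platform default (e.g. MacRoman) is not consulted for this file.
    public static void writeUtf8(File file, String text) throws IOException {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // What the JVM will use whenever no charset is specified.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // "cafe" with an accented e: 4 chars in Java, 5 bytes in UTF-8
        // (the accented e becomes 0xC3 0xA9), but only 4 bytes in MacRoman.
        String sample = "caf\u00e9";
        File file = File.createTempFile("utf8-check", ".txt");
        file.deleteOnExit();
        writeUtf8(file, sample);

        // Inspect the raw bytes instead of trusting an editor's guess at the encoding.
        byte[] bytes = Files.readAllBytes(file.toPath());
        System.out.println("Bytes written: " + bytes.length);
        System.out.println("Round trip OK: "
                + sample.equals(new String(bytes, StandardCharsets.UTF_8)));
    }
}
```

If the byte count and round trip look right here but the real output still reads as MacRoman, the conversion is probably happening somewhere other than the writer, for example in whatever tool is being used to view the file.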