JRuby had a similar issue; perhaps this thread will help you sort things out:

http://jira.codehaus.org/browse/JRUBY-3576

Or this report: http://openradar.appspot.com/8307054

But it is "not a bug, it is a feature":

http://bugs.sun.com/view_bug.do?bug_id=4163515

I think you will need to explicitly set your JVM's default encoding to UTF-8,
but it seems like the Mac JVM is broken in this respect.
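
As a rough way to check what is going on, here is a minimal sketch (not taken from the reports above) that prints the JVM's default charset and compares the bytes produced with an explicit UTF-8 writer against the bytes produced with the platform default. The class name EncodingCheck and its helper methods are made up for illustration, and it assumes Java 7 or later for StandardCharsets and try-with-resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingCheck {

    // Encodes the text with an explicitly named charset (UTF-8),
    // so the result does not depend on the platform default.
    public static byte[] encodeExplicitly(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Encodes the text with whatever the JVM's default charset is.
    public static byte[] encodeWithDefault(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex pairs.
    public static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String sample = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println("default charset:  " + Charset.defaultCharset());
        // In UTF-8 the accented e is the two bytes c3 a9.
        System.out.println("explicit UTF-8:   " + hex(encodeExplicitly(sample)));
        // Under MacRoman the accented e would be the single byte 8e.
        System.out.println("platform default: " + hex(encodeWithDefault(sample)));
    }
}
```

If the two hex lines differ, something in the output path is falling back to the platform default. Launching with java -Dfile.encoding=UTF-8 is the commonly used way to change that default, though the Sun bug linked above suggests it is not guaranteed to take effect on every JVM.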

Dave

On Sep 9, 2010, at 2:36 PM, Laird Nelson wrote:

> Also, this is making me a little nervous.  How does POI figure it out?
> Presumably if it's going to give me a "real Java unicode string", then it
> has to know how to convert from whatever encoding the spreadsheet is in to
> Unicode.  So how does it figure out what encoding the spreadsheet is in?
> What if it guesses wrongly?
> 
> Best,
> Laird
> 
> On Thu, Sep 9, 2010 at 4:58 PM, Laird Nelson <ljnel...@gmail.com> wrote:
> 
>> OK, so given that, I'm trying to figure out why, when I take a String from
>> cell.getStringCellValue() and write it to a file whose FileOutputStream has
>> been explicitly wrapped by an OutputStreamWriter using the UTF8 encoding,
>> the contents of the file appear to be in MacRoman encoding (my platform's
>> default).
>> 
>> I'm creating my XMLEventWriter that's ultimately doing the writing like
>> this:
>> 
>> final FileOutputStream fileOutputStream = new FileOutputStream(file);
>> final OutputStreamWriter outputStreamWriter =
>>     new OutputStreamWriter(fileOutputStream, "UTF8");
>> final BufferedWriter bufferedWriter =
>>     new BufferedWriter(outputStreamWriter);
>> 
>> XMLEventWriter writer = outputFactory.createXMLEventWriter(bufferedWriter);
>> 
>> ...and then at various points I'm using the String value from POI to stick
>> in there as #PCDATA.  Seems like this should not involve ANY character set
>> conversion, is what you're telling me?
>> 
>> L
>> 
>> 
>> On Thu, Sep 9, 2010 at 4:30 PM, Nick Burch <nick.bu...@alfresco.com> wrote:
>> 
>>> On Thu, 9 Sep 2010, Laird Nelson wrote:
>>> 
>>>> I am using POI to read an Excel spreadsheet.  I have no idea what
>>>> character encoding it's in.  I can tell you, however, it's not in UTF8. :-)
>>>> 
>>> 
>>> You don't have to worry about the encoding; POI sorts that out for you.
>>> Every String you get back is a real Java unicode string already.
>>> 
>>> Nick
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@poi.apache.org
>>> For additional commands, e-mail: user-h...@poi.apache.org
>>> 
>>> 
>> 

