I am pretty sure HSSF writes Japanese strings just fine.

It often messes up when it tries to read a spreadsheet with Japanese (or any
CJK) strings. In fact, the value it reads will be off by two characters
whenever the Japanese string contains something called 'Far East Info'.

I've traced this problem down and put my explanation at
http://issues.apache.org/bugzilla/show_bug.cgi?id=27394 (the last comment).
I have no idea what 'Far East Info' means, but in an SST string it is a
block of bytes stored after the actual string value.

Basically, strings stored in Excel can have an optional 'Far East Info'
block. If this 'Far East Info' is present, HSSF misinterprets the beginning
of the string, because it does not account for a header field that contains
the length of the 'Far East Info'. HSSF assumes that the string value always
begins at byte offset 3, right after the 2-byte length and the 1-byte flag.
In fact, if the string contains 'Far East Info', the string value begins at
byte offset 7, after the extra 4-byte length field.

This is not a problem when HSSF writes out the string, because it never
writes out 'Far East Info', even if it was originally present. When 'Far
East Info' is not present, the string value does indeed begin at byte
offset 3 of the structure.
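
To make the write side concrete, here is a rough Java sketch of serializing a
string without 'Far East Info'. It is not HSSF's actual code. The class and
method names are made up for illustration, little-endian byte order is
assumed, and the flag value 0x01 (0x05 with the Far East bit cleared) is my
assumption rather than something stated above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI.
public class PlainUnicodeStringWriter {
    // Assumption: 0x01 marks a 2-byte-per-character string with no Far East Info.
    static final int FLAG_UNICODE_NO_FAR_EAST = 0x01;

    /** Writes: 2-byte length in characters, 1-byte flag, then the string value. */
    static byte[] write(String value) {
        byte[] chars = value.getBytes(StandardCharsets.UTF_16LE);
        ByteBuffer buf = ByteBuffer.allocate(3 + chars.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) value.length());      // length in characters
        buf.put((byte) FLAG_UNICODE_NO_FAR_EAST);  // no 4-byte Far East length follows
        buf.put(chars);                            // string value starts at offset 3
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] out = write("\u65E5\u672C\u8A9E");
        // Reading back from offset 3 recovers the value unchanged.
        String back = new String(out, 3, out.length - 3, StandardCharsets.UTF_16LE);
        System.out.println(back.equals("\u65E5\u672C\u8A9E"));
    }
}
```

Because no 4-byte length field is ever emitted, a reader that always starts at
offset 3 gets the right answer for anything written this way.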

Here is what a plain (non rich-text) Japanese text string looks like:
2 bytes - length of string in characters (Unicode characters are 2 bytes
each)
1 byte - flag. 0x05 = Unicode String with Far East Info. 0x0
4 bytes - length of Far East Info. HSSF does not recognize this field and
treats these 4 bytes as the first 2 characters of the string value. These 4
bytes are not present if the flag does not indicate that the string contains
'Far East Info'.
2 * n bytes - String value, where 'n' is the length of the string
2 * m bytes - Far East Info, where 'm' is the length of Far East Info
indicated in the 3rd field.
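
The layout above can be sketched in Java as follows. This is only an
illustration under stated assumptions, not HSSF's code and not a proposed
patch: the class and method names are hypothetical, little-endian byte order
is assumed, and I am assuming the flag is a bit field in which 0x01 means
2-byte characters and 0x04 means 'Far East Info' is present (so that 0x05 is
both). The sample bytes are made up, with a placeholder 'Far East Info' block
sized according to the description above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not part of POI.
public class FarEastStringSketch {
    static final int FLAG_UNICODE = 0x01;   // assumption: 2-byte characters
    static final int FLAG_FAR_EAST = 0x04;  // assumption: Far East Info present

    /** Reads the string value, skipping the Far East length field if present. */
    static String readValue(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int charCount = buf.getShort() & 0xFFFF;  // field 1: length in characters
        int flag = buf.get() & 0xFF;              // field 2: flag
        if ((flag & FLAG_UNICODE) == 0) {
            throw new IllegalArgumentException("sketch only handles 2-byte strings");
        }
        if ((flag & FLAG_FAR_EAST) != 0) {
            buf.getInt();                         // field 3: Far East length, skipped
        }
        byte[] value = new byte[2 * charCount];
        buf.get(value);                           // starts at offset 3 or 7
        return new String(value, StandardCharsets.UTF_16LE);
    }

    /** Reproduces the misread: the value is always taken from offset 3. */
    static String readValueNaively(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int charCount = buf.getShort() & 0xFFFF;
        buf.get();                                // flag is read but ignored
        byte[] value = new byte[2 * charCount];
        buf.get(value);
        return new String(value, StandardCharsets.UTF_16LE);
    }

    /** Builds a made-up 3-character sample with flag 0x05. */
    static byte[] sample() {
        byte[] text = "\u65E5\u672C\u8A9E".getBytes(StandardCharsets.UTF_16LE);
        ByteBuffer buf = ByteBuffer.allocate(3 + 4 + text.length + 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 3);     // n = 3 characters
        buf.put((byte) 0x05);        // Unicode string with Far East Info
        buf.putInt(1);               // m = 1
        buf.put(text);               // 2 * n bytes of string value
        buf.put(new byte[] {0, 0});  // 2 * m placeholder bytes of Far East Info
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] data = sample();
        System.out.println(readValue(data).equals("\u65E5\u672C\u8A9E"));
        // The naive read starts with 2 bogus characters taken from the
        // 4-byte length field and loses the last 2 real characters.
        System.out.println(readValueNaively(data).equals("\u65E5\u672C\u8A9E"));
    }
}
```

With this sample the naive reader returns two junk characters followed by
only the first real character, which matches the off-by-two symptom.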



-----Original Message-----
From: Ankur Goel [mailto:[EMAIL PROTECTED]
Sent: Friday, May 21, 2004 5:23 AM
To: 'POI Users List'
Subject: Jaoanese Content In MicroSoft Word, Excel


Hi
I am new to POI. I want to parse Word and Excel documents which have
content in Japanese. Can I do it through POI, or is it for the English
language only?

Regards,
Ankur
-----Original Message-----
From: Alexandru, Ionita [mailto:[EMAIL PROTECTED]
Sent: Friday, May 21, 2004 2:45 PM
To: POI User
Subject: Insert table



How can I insert a table in a Word document, and how can I set cell
properties, for example to make the upper border of a cell (or another
border of a cell) invisible when printing?



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

