Hello,
I have a scenario in which the properties of a Word document contain
Japanese characters. Reading these properties with HPSF does not return
the values correctly. While debugging through the code I came upon a
comment in org.apache.poi.hpsf.TypeReader at around line 105. This
comment is:
* FIXME: Reading an 8-bit string should pay attention
* to the codepage. Currently the byte making out the
* property's value are interpreted according to the
* platform's default character set.
While debugging the code, I have verified that the codepage read was
65001 (UTF-8) and that the type of the property being read is 30
(VT_LPSTR). Is this FIXME comment the reason why I am not getting the
property values back correctly? I am guessing that POI is decoding the
bytes with my platform's default character set, which is not UTF-8.
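
To illustrate what I suspect is happening, here is a minimal sketch that
does not use any POI classes. The class CodepageDemo and the helper
charsetForCodepage are hypothetical names of my own, the codepage table
is deliberately incomplete, and ISO-8859-1 merely stands in for a
non-UTF-8 platform default. It decodes the same UTF-8 bytes twice: once
according to the codepage, and once ignoring it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of POI.
class CodepageDemo {

    // Hypothetical helper: maps a few Windows codepage numbers to Java
    // charsets. A real mapping would need many more entries.
    static Charset charsetForCodepage(int codepage) {
        switch (codepage) {
            case 65001: return StandardCharsets.UTF_8;
            case 1200:  return StandardCharsets.UTF_16LE;
            case 28591: return StandardCharsets.ISO_8859_1;
            case 20127: return StandardCharsets.US_ASCII;
            default:
                throw new IllegalArgumentException("Unmapped codepage: " + codepage);
        }
    }

    // Decodes the raw bytes of an 8-bit string using the given codepage.
    static String decode(byte[] raw, int codepage) {
        return new String(raw, charsetForCodepage(codepage));
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as escapes so the source
        // file's own encoding does not matter.
        String original = "\u65E5\u672C\u8A9E";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // Decoding that honours codepage 65001.
        String honoured = decode(raw, 65001);

        // Decoding that ignores the codepage; ISO-8859-1 is an assumed
        // stand-in for a platform default that is not UTF-8.
        String ignored = new String(raw, StandardCharsets.ISO_8859_1);

        System.out.println("codepage honoured, matches original: "
                + honoured.equals(original));
        System.out.println("codepage ignored, matches original:  "
                + ignored.equals(original));
    }
}
```

If the 8-bit string is decoded this way with the wrong character set,
each multi-byte UTF-8 sequence turns into several unrelated characters,
which looks like what I am getting back.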
Are there any plans to fix this in the future?
Thank you,
Barry Molof
Computer Associates
Programmer
tel: +1 631 342 3234
[EMAIL PROTECTED]