Property Stream in UTF-8

Molof, Barry C Mon, 09 Aug 2004 10:17:12 -0700

Hello,
 
I have a scenario in which the properties of a Word document are
Japanese characters.  Using HPSF to get these properties does not give
me the results that I need.  I have been debugging through the code and
came upon a comment in org.apache.poi.hpsf.TypeReader at around line
105.  This comment is:
 
* FIXME: Reading an 8-bit string should pay attention
* to the codepage. Currently the byte making out the
* property's value are interpreted according to the
* platform's default character set.
 
While debugging the code, I have verified that the codepage read was
65001 (UTF-8) and that the type of the property being read is 30
(VT_LPSTR).  Is the behavior described in this FIXME comment the reason
why I am not getting the property values back correctly?  I am guessing
that POI decodes the bytes using my platform's default character set,
which is not UTF-8.
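
A minimal sketch of the suspected problem, independent of POI: the same
bytes decode correctly or as garbage depending on the charset used. The
class name CodepageDemo and the helpers charsetFor and decode are
hypothetical and are not part of HPSF; the mapping from codepage number
to Java charset name is an assumption that only covers a few cases.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CodepageDemo {
    /** Windows codepage identifier for UTF-8, as read from the property set. */
    static final int CP_UTF8 = 65001;

    /**
     * Hypothetical helper: maps a codepage number to a Java Charset.
     * Only UTF-8 is handled explicitly; the "windows-<n>" fallback is an
     * assumption that works for codepages such as 1252 but not for all.
     */
    static Charset charsetFor(int codepage) {
        if (codepage == CP_UTF8) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName("windows-" + codepage);
    }

    /** Decodes the raw bytes of an 8-bit string property using its codepage. */
    static String decode(byte[] raw, int codepage) {
        return new String(raw, charsetFor(codepage));
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as escapes so the source
        // file's own encoding does not matter.
        String expected = "\u65e5\u672c\u8a9e";
        byte[] raw = expected.getBytes(StandardCharsets.UTF_8);

        // Decoding with the codepage stored alongside the property.
        String good = decode(raw, CP_UTF8);

        // Stand-in for a platform default that is not UTF-8.
        String bad = new String(raw, StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(expected)); // true
        System.out.println(bad.equals(expected));  // false
        System.out.println(good.length() + " vs " + bad.length()); // 3 vs 9
    }
}
```

ISO-8859-1 is used here only as a deterministic stand-in for a non-UTF-8
platform default; the actual default on a given machine will differ.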
 
Are there any plans to fix this in the future?
 
Thank you,
 
 
Barry Molof
Computer Associates
Programmer
tel: +1 631 342 3234
[EMAIL PROTECTED]
