The problem I was having with extracting document properties from Asian
Microsoft documents came down to which character sets were available. I am
grateful to Martin Brown for putting me on the right track. I believe that
the character sets are in charsets.jar. The Java method below returns a
String array of the available locales (which, as far as I can tell, reflect
the available character sets). This particular method is specialized for
Chinese, but you could do the same for Japanese or any other language with
a non-Latin script where the character set might not be present.
The problem was fixed by installing the latest version of Java, which comes
with the multi-language features. After that, the code below returned the
locale I needed and I no longer got the runtime exception when the
SummaryInformation was read.
Ian
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Locale;

/**
 * Get the Chinese locales supported on the local system (if any).
 *
 * @return A sorted list of the locales with the language encodings
 *         (e.g., zh_CN), or null if no Chinese locale is available.
 */
public static String[] getChineseLocaleList()
{
    ArrayList<String> languageArrayList = new ArrayList<String>();
    Locale[] availableLocales = Locale.getAvailableLocales();
    // Display name of Chinese in the default locale (e.g., "Chinese")
    String chineseLanguage = Locale.CHINESE.getDisplayLanguage();
    for (Locale locale : availableLocales) {
        String displayLanguage = locale.getDisplayLanguage();
        if (displayLanguage.equals( chineseLanguage )) {
            // Build a pair such as "Chinese zh_CN"
            String localeStr = locale.toString();
            String languagePair = displayLanguage + " " + localeStr;
            languageArrayList.add( languagePair );
        }
    }
    String[] localeList = null;
    if (languageArrayList.size() > 0) {
        localeList = languageArrayList.toArray( new String[0] );
        Arrays.sort( localeList );
    }
    return localeList;
} // getChineseLocaleList
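
Since the only message in the exception was the character set name ("GBK"),
a more direct check is probably to ask the JVM whether that character set is
supported at all. What follows is only a minimal sketch, not what I actually
ran: the class name CharsetCheck and its two helper methods are made up for
illustration, and Big5 and Shift_JIS are just example names. The one library
call it relies on is the standard java.nio.charset.Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of POI or the JDK.
public class CharsetCheck {

    /** Returns true if this JVM supports the named character set. */
    public static boolean isCharsetAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for a null or syntactically illegal charset name.
            return false;
        }
    }

    /** Returns the given names that this JVM does not support. */
    public static List<String> missingCharsets(String... names) {
        List<String> missing = new ArrayList<String>();
        for (String name : names) {
            if (!isCharsetAvailable(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // "GBK" is the name from the exception; the others are examples only.
        List<String> missing = missingCharsets("GBK", "Big5", "Shift_JIS");
        if (missing.isEmpty()) {
            System.out.println("All checked character sets are available.");
        } else {
            System.out.println("Missing character sets: " + missing);
        }
    }
}
```

If GBK shows up as missing, that would point to the same cause as above: a
Java installation without the extended character sets.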
On Mon, Nov 24, 2008 at 8:19 AM, Martin Brown <[EMAIL PROTECTED]> wrote:
> Hi Ian,
>
> > I have written some Java code very much like the HPSF example on the
> > HPSF HOW-TO <http://poi.apache.org/hpsf/how-to.html> page which reads
> > Microsoft Office document property (or metadata) information. This code
> > works fine for documents that have been saved with English versions of
> > Microsoft Office. However, when I try to use it with a Microsoft document
> > that is saved with a Chinese version, the code fails. I get an exception
> > and the only message is "GBK". I assume that this is referring to the
> > character set.
>
> You do have all the Java locale data installed? The Windows JRE doesn't
> do this by default and this has tripped me up in the past.
>
> HTH
>
> Martin