On Fri, Nov 17, 2000 at 09:24:05PM +0400, [EMAIL PROTECTED] wrote:
> 
>  wv was really terribly broken for word6 format files. Here is a patch that
> fixes this.
> 
>  To CJK guys:
> 
> * Now word6.doc from _Belcon_ gets imported properly too (and the word2k
>   document from Chih-Wei Huang is also OK)
> 
> * word6.doc from Chih-Wei Huang's mail that I forwarded here doesn't import
>   properly (chars are not converted to unicode) because wv thinks it's in
>   word7 format (!): wvQuerySupported(&ps->fib,NULL) returns WORD7, so it
>   seems there is no clean workaround/hack for importing it. (Maybe wordpad
>   is that broken. Is word able to read this file? And which version of
>   windows is that wordpad from: win2k, NT or win9x?)
>   IMO the only hack that can be used is to check whether the arriving
>   character's code is less than (or greater than, or within the range of)
>   some constant for the given charset, and if it doesn't satisfy the
>   constraints on the value, to set its character type to '1' to force
>   conversion to unicode.
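
For illustration, the range check described above could be sketched roughly
as follows. This is only a sketch under assumptions: the function name, the
two CHARTYPE_* constants and the choice of Russian as the example charset are
made up here and are not part of wv. The one thing taken from the mail is
that character type 1 forces conversion to unicode. The bounds rely on the
Unicode Cyrillic block being U+0400..U+04FF, while an unconverted 8-bit
charset byte (e.g. from cp1251) can never exceed 0xFF.

```c
#include <stdint.h>

/* Hypothetical constants, not taken from wv. Only the meaning of 1
 * (force conversion to unicode) comes from the discussion. */
enum { CHARTYPE_AS_IS = 0, CHARTYPE_CONVERT = 1 };

/* Assumed bounds: if the text were already Unicode, Russian letters
 * would fall inside the Cyrillic block. */
#define CYRILLIC_LO 0x0400u
#define CYRILLIC_HI 0x04FFu

/* Guess the character type for one code read from a Russian document. */
int guess_chartype_cyrillic(uint16_t code)
{
    if (code < 0x80u)
        return CHARTYPE_AS_IS;    /* ASCII has the same value either way */
    if (code >= CYRILLIC_LO && code <= CYRILLIC_HI)
        return CHARTYPE_AS_IS;    /* plausible Unicode Cyrillic letter */
    if (code <= 0xFFu)
        return CHARTYPE_CONVERT;  /* looks like a raw 8-bit charset byte */
    return CHARTYPE_AS_IS;        /* anything else is left untouched */
}
```

With these assumed bounds, 0xC0 would be flagged for conversion, while 0x0410
(Cyrillic capital A in Unicode) and plain ASCII would be left alone. Each
charset would need its own constants, which is why this is a hack and not a
clean fix.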

I can only find word6 and word8/Word97 format documents at
http://www.wotsit.org/search.asp?s=text
so it may well be that word7 does not use unicode either. It looks
like word6 has more features in common with Word95 than
word8 has with Word95.

The only place I can find in the word97 document that mentions unicode
together with Word95 states:
=====
XCHAR (eXtended CHARacter set):

A data type which defines a "character". Each XCHAR corresponds to a
character in the document, where "character" is defined as a glyph,
regardless of whether it is a single-byte or double-byte character. With
Word6/FE, Word95/FE, Word97/all and future versions of Word, this is
defined as a 16-bit integer corresponding to the Unicode character code
of the glyph.
=====
where /FE means Far East.
If setting chartype to 1 for the word7 format also gives proper results for
.doc import in other languages (Russian?), we can assume word7
behaves like word6 in this respect. What do you think?
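
To make the suggestion concrete, here is a minimal sketch of what I mean.
The enum and the function name are stand-ins invented for this mail and are
not wv's own identifiers. The idea is just that word7 gets the same forced
character type as word6, and every other version keeps whatever was decided
before.

```c
/* Hypothetical version tags standing in for wv's own constants. */
enum doc_version { DOC_WORD6, DOC_WORD7, DOC_WORD8 };

/* Return the character type to use for a document of the given version.
 * `current` is whatever type the importer had already chosen. */
int force_chartype(enum doc_version version, int current)
{
    switch (version) {
    case DOC_WORD6:
    case DOC_WORD7:
        return 1;        /* force conversion to unicode */
    default:
        return current;  /* leave other versions alone */
    }
}
```

Whether this is safe depends on the test above: if Russian (or other
non-CJK) word7 files still import correctly with it, the assumption holds.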

-- 
Best regards
ha_shao


