On Wed, Oct 26, 2022 at 03:14:26PM +0300, Eli Zaretskii wrote:
> > Date: Wed, 26 Oct 2022 11:03:53 +0200
> > From: [email protected]
> > Cc: [email protected], [email protected]
> > 
> > Let's call LOC your locale.  The setup is a manual encoded in Latin1, and
> > an include file included_latîn1.texi.  On your computer, the î in the
> > include file is stored as 0x05DE, which is the conversion of 0xEE in the
> > LOC codepage.
> 
> The manual has the @documentencoding which declares Latin1 encoding.
> So I believe included_latîn1.texi is converted to UTF-8 correctly.
> 
> The problem happens when accessing this file via 'stat' etc., because
> the locale's encoding cannot encode î.
> 
> For this to work, the non-ASCII character we use should be encodable
> both by Latin1 and by the Windows codepages.  This is a tough
> requirement, but if you look at the tables of these encodings, you
> will see that some codepoints between 0xA1 and 0xAF are identical
> between many Windows codepages and Latin1.  For example, 0xAB is
> identical in many codepages.  So maybe we could try such a character,
> for these tests?

We could do that, but the tests are not so important to me; if they are
skipped on Windows, it is not a big deal.  What matters to me is making
sure that setting DOC_ENCODING_FOR_INPUT_FILE_NAME to 0 on Windows is
the best choice.
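
For reference, the suggestion quoted above (pick a byte that means the
same character in Latin1 and in the Windows codepages) can be checked
with a small script.  This is only a minimal Python sketch, not part of
Texinfo: the list of codepages and the helper names are assumptions
chosen for illustration, and it relies on Python's built-in codecs
rather than on what a given Windows system actually does.

```python
# Hypothetical list of Windows ANSI codepages to compare against Latin1.
WINDOWS_CODEPAGES = [
    "cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
    "cp1255", "cp1256", "cp1257", "cp1258",
]


def decode_or_none(byte_value, encoding):
    """Decode a single byte in the given encoding; None if it is undefined there."""
    try:
        return bytes([byte_value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def shared_with_latin1(byte_value, codepages=WINDOWS_CODEPAGES):
    """Return the codepages in which the byte maps to the same character as in Latin1."""
    latin1_char = bytes([byte_value]).decode("latin-1")
    return [cp for cp in codepages if decode_or_none(byte_value, cp) == latin1_char]


if __name__ == "__main__":
    # The range mentioned in the quoted message: 0xA1 through 0xAF.
    for b in range(0xA1, 0xB0):
        agree = shared_with_latin1(b)
        print(f"0x{b:02X} {chr(b)!r}: {len(agree)}/{len(WINDOWS_CODEPAGES)} codepages agree")
```

A byte for which every listed codepage agrees would be a candidate for
the test file name; 0xEE (the î above) is not one, since for example
cp1255 maps it to U+05DE.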

-- 
Pat
