On Mon, Dec 25, 2023 at 6:07 AM <felix.winkelm...@bevuta.com> wrote:
> I'm not too familiar with the way Windows handles non-ASCII characters
> in operating system calls, but I assume that what gets passed to the C
> library runtime functions like fopen(3), etc. assumes a particular
> encoding.

Basically, there are two modes: one that assumes a particular encoding,
as you say (that's the default), and one that takes wchar_t strings,
which on Windows are always UTF-16LE. Which encoding is used in the
first mode depends on the locale setting.

> From a quick glance at the Windows docs[1] it seems one needs to use
> "_fwopen" with a wchar_t string argument to pass extended characters.

Indeed, except that it's _wfopen, not _fwopen. Note that fopen can
operate in 8-bit, 16-bit, or mixed 8/16-bit mode depending on the
encoding.

> Sorry, if this is not overly helpful. We are currently in the process of
> improving the unicode support for the next major version of CHICKEN.

This makes me realize that posixwin needs to be changed in C6 so that it
always uses the second mode. A simple way to do this is to use a UTF-8
to UTF-16LE converter (and vice versa for things like dirread) right
before calling _wfopen.

> felix
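To make the conversion idea concrete, here is a minimal sketch in C. It is not the actual posixwin code: the helper names utf8_to_utf16 and fopen_utf8 are hypothetical, and the converter stores code units in native byte order, which is UTF-16LE on the little-endian machines Windows runs on. The Windows-only wrapper assumes wchar_t is 16 bits wide, as it is with the Microsoft C runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of CHICKEN): convert a NUL-terminated
   UTF-8 string into a malloc'ed, NUL-terminated array of UTF-16 code
   units. Returns NULL on malformed input or allocation failure; the
   caller frees the result. */
uint16_t *utf8_to_utf16(const char *src, size_t *out_len)
{
    assert(src != NULL);
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0, i = 0, o = 0;
    while (s[n]) n++;

    /* Every UTF-8 byte produces at most one UTF-16 code unit. */
    uint16_t *dst = malloc((n + 1) * sizeof *dst);
    if (!dst) return NULL;

    while (i < n) {
        unsigned char b = s[i++];
        uint32_t cp, min;
        int extra;

        if (b < 0x80)                { cp = b;        extra = 0; min = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; min = 0x10000; }
        else goto fail;              /* stray continuation or invalid lead byte */

        for (; extra > 0; extra--) {
            if (i >= n || (s[i] & 0xC0) != 0x80) goto fail;
            cp = (cp << 6) | (uint32_t)(s[i++] & 0x3F);
        }

        /* Reject overlong forms, surrogate code points, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            goto fail;

        if (cp < 0x10000) {
            dst[o++] = (uint16_t)cp;
        } else {                     /* encode as a surrogate pair */
            cp -= 0x10000;
            dst[o++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    dst[o] = 0;
    if (out_len) *out_len = o;
    return dst;

fail:
    free(dst);
    return NULL;
}

#ifdef _WIN32
#include <stdio.h>
#include <wchar.h>

/* Windows-only sketch: open a file whose name and mode are given in UTF-8
   by converting both right before the _wfopen call. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    uint16_t *wpath = utf8_to_utf16(path, NULL);
    uint16_t *wmode = utf8_to_utf16(mode, NULL);
    FILE *f = NULL;
    if (wpath && wmode)
        f = _wfopen((const wchar_t *)wpath, (const wchar_t *)wmode);
    free(wpath);
    free(wmode);
    return f;
}
#endif
```

On Windows one could also let the system do the conversion with MultiByteToWideChar and CP_UTF8 instead of a hand-written converter; the version above is only meant to show what "convert right before the call" looks like.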