On 2020-08-03 09:36, Michael Shay via Cygwin wrote:
> I'm having a problem with Cygwin 3.1.4, changing the character set on the
> fly. It seems to work with Cygwin applications, but not with Win32
> applications.
>
> I have a Korn shell script:
>
> #!/bin/ksh
> OLD_LANG="$LANG"
> OLD_LC_ALL="$LC_ALL"
> echo "locale on entry"
> locale
> echo ""
> export LANG="en_US.CP1252"
> export LC_ALL=en_US.CP1252
> echo "locale changed to"
> locale
> echo ""
> # Default is to run the Win32 program. Input any argument other than
> # 'WIN32' to run '/bin/echo'.
> case $# in
> 0 ) echo "Running WIN32 pgm"
>     ksh -c 'cygtest.exe ZÇ'
>     ;;
> 1 ) echo "Running Cygwin 'echo'"
>     ksh -c '/bin/echo ZÇ'
>     ;;
> 2 ) echo "Running WIN32 pgm"
>     ksh -c 'cygtest.exe ZÇ'
>     echo ""
>     echo "Running Cygwin 'echo'"
>     ksh -c '/bin/echo ZÇ'
>     ;;
> * ) ;;
> esac
> LC_ALL="$OLD_LC_ALL"
> LANG="$OLD_LANG"
>
> and a Win32 application (attached file cygtest.cpp)
>
> I used gdb to see what was happening in child_info_spawn::worker(), when a
> Win32 program is started using:
>
> rc = CreateProcessW (runpath,        /* image name w/ full path */
>                      cmd.wcs (wcmd), /* what was passed to exec */
>                      sa,             /* process security attrs */
>                      sa,             /* thread security attrs */
>                      TRUE,           /* inherit handles */
>                      c_flags,
>                      envblock,       /* environment */
>                      NULL,
>                      &si,
>                      &pi);
>
> Specifically, 'cmd.wcs(wcmd)' invokes:
>
> wchar_t *wcs (wchar_t *wbuf, size_t n)
> {
>   if (n == 1)
>     wbuf[0] = L'\0';
>   else
>     sys_mbstowcs (wbuf, n, buf);
>   return wbuf;
> }
>
> and sys_mbstowcs():
>
> size_t __reg3
> sys_mbstowcs (wchar_t * dst, size_t dlen, const char *src, size_t nms)
> {
>   mbtowc_p f_mbtowc = __MBTOWC;
>   if (f_mbtowc == __ascii_mbtowc)
>     {
>       f_mbtowc = __utf8_mbtowc; <<<<< this is ALWAYS done, no matter
>                                       what charset is in use.
>     }
>   return sys_cp_mbstowcs (f_mbtowc, dst, dlen, src, nms);
> }
>
> Since the CP1252 is an 8-bit single-byte character set with characters >=
> 0x80, the '0xc7' character is always translated as '0xc7 0xf0', with the
> '0xf0' byte indicating an invalid character in the string.
>
> This doesn't seem to happen when e.g. '/bin/echo' is run, although I
> haven't stepped into the code to see what's happening.
>
> I do not think this is a Cygwin bug, but since the User's Guide says the
> locale and charset can be changed on the fly, I don't know what's going
> awry.
>
> Any suggestions? If you need more information, I'm happy to provide it.
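
As a rough illustration of the conversion problem described above (a sketch only, not Cygwin's code): the byte 0xC7 is Ç in CP1252, but a lone 0xC7 is not valid UTF-8, so a UTF-8 decoder has to treat it as an invalid sequence. The helper names is_valid_utf8 and cp1252_high_to_unicode below are made up for this example and exist neither in Cygwin nor in newlib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Cygwin function): returns true if the
   NUL-terminated string s is well-formed UTF-8. */
bool is_valid_utf8(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        size_t need;       /* continuation bytes still expected */
        uint32_t cp, min;  /* decoded code point, smallest legal value */

        if (*p < 0x80) {   /* plain ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {
            need = 1; cp = *p & 0x1F; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {
            need = 2; cp = *p & 0x0F; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {
            need = 3; cp = *p & 0x07; min = 0x10000;
        } else {
            return false;  /* stray continuation byte or invalid lead byte */
        }
        p++;

        while (need--) {
            if ((*p & 0xC0) != 0x80)
                return false;  /* also catches the terminating NUL */
            cp = (cp << 6) | (*p++ & 0x3F);
        }

        /* Reject overlong forms, surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}

/* Hypothetical helper: CP1252 matches ISO-8859-1 (and so the Unicode
   code point) for ASCII and for bytes 0xA0..0xFF. Bytes 0x80..0x9F
   need a lookup table, which is omitted here, so -1 is returned. */
long cp1252_high_to_unicode(unsigned char b)
{
    if (b < 0x80 || b >= 0xA0)
        return b;
    return -1;
}
```

With these, is_valid_utf8("Z\xC7") is false, is_valid_utf8("Z\xC3\x87") (the UTF-8 encoding of ZÇ) is true, and cp1252_high_to_unicode(0xC7) returns 0xC7, i.e. U+00C7.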

Try:

$ chcp.com
Active code page: 850
$ chcp.com 65001
Active code page: 65001
$ chcp.com
Active code page: 65001

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada