On Jun 29 19:13, Christian Franke wrote:
> Fixes the CESU-8 value, but not the missing encoding if the high surrogate
> is at the very end of the string.

Are you going to provide a patch for that issue?

>
> --
> Regards,
> Christian
>

> From 96f23496f249558949923e60270b9568956912bf Mon Sep 17 00:00:00 2001
> From: Christian Franke <[email protected]>
> Date: Sun, 29 Jun 2025 19:03:36 +0200
> Subject: [PATCH] wcrtomb: fix CESU-8 value of leftover lone high surrogate
>
> Addresses: https://cygwin.com/pipermail/cygwin/2025-June/258378.html
> Fixes: 6ff28fc3b121 ("Allow CESU-8 surrogate value encoding")
> Signed-off-by: Christian Franke <[email protected]>
> ---
>  newlib/libc/stdlib/wctomb_r.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/newlib/libc/stdlib/wctomb_r.c b/newlib/libc/stdlib/wctomb_r.c
> index 5ea1e13e4..ec6adfa49 100644
> --- a/newlib/libc/stdlib/wctomb_r.c
> +++ b/newlib/libc/stdlib/wctomb_r.c
> @@ -62,8 +62,8 @@ __utf8_wctomb (struct _reent *r,
>  	     of the surrogate and proceed to convert the given character.  Note
>  	     to return extra 3 bytes.  */
>  	  wchar_t tmp;
> -	  tmp = (state->__value.__wchb[0] << 16 | state->__value.__wchb[1] << 8)
> -	    - (0x10000 >> 10 | 0xd80d);

What a weird typo.  I wonder how I fat-fingered that 'd' into the code
/*facepalm*/

> +	  tmp = (((state->__value.__wchb[0] << 16 | state->__value.__wchb[1] << 8)
> +	    - 0x10000) >> 10) | 0xd800;
>  	  *s++ = 0xe0 | ((tmp & 0xf000) >> 12);
>  	  *s++ = 0x80 | ((tmp & 0xfc0) >> 6);
>  	  *s++ = 0x80 | (tmp & 0x3f);
> --
> 2.45.1
>

LGTM, please push.

Thanks,
Corinna
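
For reference, a minimal standalone sketch of why the old expression
produced a wrong value.  It is not newlib code.  The function names are
hypothetical, wchar_t is modelled as uint16_t (as on Cygwin), and the way
the pending high surrogate gets parked in the two state bytes is an
assumption made for illustration.  Only the two recovery expressions and
the three-byte encoding are taken from the patch above.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: models how a pending high surrogate might be parked in
   state->__value.__wchb[0..1]; illustrative only, not copied from newlib. */
void park_high_surrogate(uint16_t hs, unsigned char wchb[2])
{
  uint32_t tmp = ((uint32_t)(hs & 0x3ff) << 10) + 0x10000;
  wchb[0] = (tmp >> 16) & 0xff;
  wchb[1] = (tmp >> 8) & 0xff;
}

/* Fixed expression from the patch: subtract 0x10000 first, then shift,
   then OR in the high surrogate base 0xd800. */
uint16_t recover_fixed(const unsigned char wchb[2])
{
  uint32_t v = (uint32_t)wchb[0] << 16 | (uint32_t)wchb[1] << 8;
  return (uint16_t)(((v - 0x10000) >> 10) | 0xd800);
}

/* Old expression: the misplaced parenthesis makes the subtrahend the
   constant (0x10000 >> 10 | 0xd80d) == 0xd84d, and nothing is shifted. */
uint16_t recover_old(const unsigned char wchb[2])
{
  uint32_t v = (uint32_t)wchb[0] << 16 | (uint32_t)wchb[1] << 8;
  return (uint16_t)(v - (0x10000 >> 10 | 0xd80d));
}

/* Three-byte CESU-8 encoding of a single surrogate, as in the patch. */
void cesu8_encode_surrogate(uint16_t w, unsigned char out[3])
{
  out[0] = 0xe0 | ((w & 0xf000) >> 12);
  out[1] = 0x80 | ((w & 0xfc0) >> 6);
  out[2] = 0x80 | (w & 0x3f);
}

/* Round-trip check over the whole high surrogate range. */
void check_roundtrip(void)
{
  for (uint32_t hs = 0xd800; hs <= 0xdbff; hs++)
    {
      unsigned char wchb[2];
      park_high_surrogate((uint16_t)hs, wchb);
      assert(recover_fixed(wchb) == hs);
    }
}
```

Under these assumptions, calling check_roundtrip() from a small main()
confirms that the fixed expression recovers every high surrogate, while
recover_old() does not return the original value for 0xd83d.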
