Jim Meyering wrote:
> Paolo Bonzini wrote:
>
>> On 06/02/2011 11:08 PM, Jim Meyering wrote:
>>>  #if MBS_SUPPORT
>>> -              int b2 = wctob ((unsigned char) b);
>>> -              if (b2 == EOF || b2 == b)
>>> +              /* Below, note how when b2 != b and we have a uni-byte locale
>>> +                 (MB_CUR_MAX == 1), we set b = b2.  I.e., in a uni-byte locale,
>>> +                 we can safely call setbit with a non-EOF value returned by wctob.  */
>>> +              int b2 = wctob (b);
>>> +              if (b2 == EOF || b2 == b || (MB_CUR_MAX == 1 ? (b=b2), 1 : 0))
>>
>> Can you explain again the reason for testing "b2 == EOF"?  It seems
>> wrong, and without it you can just make
>>
>>     if (MB_CUR_MAX == 1 || b2 == b)
>>       setbit ((unsigned char) b, c);
>
> Hi Paolo,
>
> Your test would disable DFA-based matching for some bytes in a locale like
> ru_RU.KOI8-R, because a pattern like [\360] leads to "wint_t b" having
> the value 1055 (0x041F), and that is obviously too large to
> be used as the first argument to setbit.  However, converting
> that "B" back to a single-byte value, B2, gives us back \360,
> which is ok to use there.  Hence the "(b=b2)" part of that
> admittedly ugly expression.
>
> The b2 == EOF part is required for the somewhat similar bug I fixed
> a month ago:
>
>     fix a bug whereby echo c|grep '[c]' would fail for any c in 0x80..0xff
>     8da41c930e03a8635cbd8c89e3e591374c232c89
>
> The corresponding test demonstrates the need:
>
>     tests: exercise bug with 0x80..0xff in [...]
>     d98338ebf842ec9b69631837eee50ebdcd543505
>
> Thanks for the feedback.
> If you see a better way, I'm sure you'll let me know.
>
> BTW, seeing your cast, I now think it'd be prudent to
> guard that setbit use:
>
> #if MBS_SUPPORT
>               /* Below, note how when b2 != b and we have a uni-byte locale
>                  (MB_CUR_MAX == 1), we set b = b2.  I.e., in a uni-byte locale,
>                  we can safely call setbit with a non-EOF value returned by wctob.  */
>               int b2 = wctob (b);
>               if (b2 == EOF || b2 == b || (MB_CUR_MAX == 1 ? (b=b2), 1 : 0))
> #endif
>                 if (b < 256)
>                   setbit (b, c);

Ahem.  s/256/NOTCHAR/
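
For anyone who wants to try the guarded logic outside of dfa.c, here is a rough standalone sketch, not the actual patch. The charclass struct, setbit_sketch, tstbit_sketch and add_to_class are hypothetical stand-ins invented for illustration, and NOTCHAR is assumed to be 1 << CHAR_BIT. The one-line condition from the thread is spelled out as separate ifs, which should be equivalent.

```c
#include <limits.h>
#include <stdio.h>   /* EOF */
#include <stdlib.h>  /* MB_CUR_MAX */
#include <string.h>
#include <wchar.h>   /* wint_t, wctob */

/* Assumption: one past the largest byte value. */
enum { NOTCHAR = 1 << CHAR_BIT };

/* Hypothetical stand-in for a character class: one bit per byte value. */
typedef struct { unsigned char bits[NOTCHAR / CHAR_BIT]; } charclass;

static void
setbit_sketch (unsigned int b, charclass *c)
{
  c->bits[b / CHAR_BIT] |= (unsigned char) (1u << (b % CHAR_BIT));
}

static int
tstbit_sketch (unsigned int b, const charclass *c)
{
  return (c->bits[b / CHAR_BIT] >> (b % CHAR_BIT)) & 1;
}

/* Try to record wide character B in class C.  Return 1 if a bit was
   set, 0 if B cannot be represented as a single byte here.  */
static int
add_to_class (wint_t b, charclass *c)
{
  int b2 = wctob (b);

  if (b2 != EOF && b2 != (int) b)
    {
      /* wctob mapped B to a different byte.  Only trust that in a
         uni-byte locale, and then use the byte value instead.  */
      if (MB_CUR_MAX != 1)
        return 0;
      b = (wint_t) b2;
    }

  /* The guard from the end of the thread: never index past the set.  */
  if ((unsigned long) b >= NOTCHAR)
    return 0;

  setbit_sketch ((unsigned int) b, c);
  return 1;
}
```

In the default "C" locale, add_to_class (L'a', &c) sets the bit for 'a', while add_to_class (1055, &c) sets nothing, because wctob returns EOF and 1055 fails the NOTCHAR guard. After a successful setlocale to ru_RU.KOI8-R (assuming that locale is installed), the same call with 1055 would be expected to set the bit for \360 instead.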