Paolo Bonzini wrote:
> > \
> > +    size_t nbytes = mbrtowc (&_wc, lexptr, lexleft, &mbs); \
> > +    bool valid_char = 1 <= nbytes && nbytes < (size_t) -2; \
>
> I find these conditionals complicated to follow.

Yes, that identifier 'valid_char' was a confusing choice; as you noted,
the character is valid even when nbytes is zero.

> I believe you should have simply
>
>     bool valid_char = nbytes < (size_t) -2;
>
> or better:
>
> > +    if (! valid_char) \
>
>     if (nbytes >= (size_t) -2)

That wouldn't do, because when mbrtowc returns 0 the caller still needs
to advance the pointer by 1 to get past the null byte, just as it needs
to advance by 1 if mbrtowc returns (size_t) -2 or (size_t) -1.
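
To make the rule concrete, here is a minimal sketch, not the code from the patch. The names step_length and count_steps are hypothetical, and the sketch assumes the caller wants to skip exactly one byte after a null byte, an invalid sequence, or an incomplete one:

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes to skip after an mbrtowc call
   that returned NBYTES.  */
static size_t
step_length (size_t nbytes)
{
  /* 0 means a null byte was decoded; (size_t) -1 and (size_t) -2 mean
     an invalid or incomplete sequence.  In all three cases the caller
     advances by one byte.  */
  if (nbytes == 0 || nbytes >= (size_t) -2)
    return 1;
  return nbytes;
}

/* Hypothetical scanner: count how many steps it takes to walk the LEN
   bytes starting at BUF, one mbrtowc call per step.  */
static size_t
count_steps (char const *buf, size_t len)
{
  mbstate_t mbs;
  size_t steps = 0;
  memset (&mbs, 0, sizeof mbs);
  while (len > 0)
    {
      wchar_t wc;
      size_t nbytes = mbrtowc (&wc, buf, len, &mbs);
      /* After an error or an incomplete sequence, start over from the
         initial conversion state before reading the next byte.  */
      if (nbytes >= (size_t) -2)
        memset (&mbs, 0, sizeof mbs);
      size_t step = step_length (nbytes);
      buf += step;
      len -= step;
      steps++;
    }
  return steps;
}
```

With a test of just nbytes >= (size_t) -2, the zero return would fall through to the "advance by nbytes" path and the scanner would never get past the null byte.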

> I see this patch has been committed already.  Can you please submit a followup?

There was a followup patch, in commit 2b9c57c, and the code's changed so
that it no longer has a 'valid_char' local. Perhaps it's clear enough now.