This does not look right:

> +int _mbsbtype(const unsigned char* mbstr, size_t count) {
> +  const unsigned char* str;
> +  const unsigned char* start = mbstr;
> +
> +  str = mbstr + count;
> +
> +  /** from _ismbslead */
> +  if (MSVCRT___mb_cur_max > 1)
> +  {
> +    while (start < str) {
> +      if (!*start) {
> +        return _MBC_ILLEGAL;
> +      }
> +      start += MSVCRT_isleadbyte(*str) ? 2 : 1;

This should probably be "start += MSVCRT_isleadbyte(*start) ? 2 : 1", but...

> +    }
> +
> +  }
> +  if (!*str) { /** TODO: check *str validity */
> +    return _MBC_ILLEGAL;
> +  }
> +  if (start == str && MSVCRT_isleadbyte(*str)) {
> +    return _MBC_LEAD;
> +  }

This is not safe, because values used for a lead byte can also be used for a
trailing byte. Indeed, with your loop (corrected as above), it seems that
"start" can never point to a trailing byte, since the loop skips over
trailing bytes.

> +  if (start == str && MSVCRT_isleadbyte(str[-1])) {
> +    return _MBC_TRAIL;
> +  }
> +
> +  return _MBC_SINGLE;

Try this:

  if (MSVCRT___mb_cur_max > 1)
  {
    while (start <= str) {
      if (!*start) {
        return _MBC_ILLEGAL;
      }
      if (MSVCRT_isleadbyte(*start)) {
        if (str == start)
          return _MBC_LEAD;
        else if (str == start + 1)
          return _MBC_TRAIL;
        start += 2;
      } else {
        if (str == start)
          return _MBC_SINGLE;
        ++start;
      }
    }
  }
  return _MBC_ILLEGAL;

The tests should probably cover this case. Of course, such a test will only
work if it knows which code page is set.
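
To make the lead/trail ambiguity concrete, here is a minimal standalone sketch of the forward scan. It is not the Wine implementation: sk_isleadbyte is a hypothetical stand-in for MSVCRT_isleadbyte, hard-wired to code page 932 (Shift-JIS, lead bytes 0x81-0x9F and 0xE0-0xFC), and the SK_ names are local to the sketch, with values mirroring the MSVC _MBC_ constants. The loop runs while start <= str so that the byte at str itself is examined, and the extra check for a NUL right after a lead byte is an addition of this sketch to keep the scan inside the string.

```c
#include <stddef.h>

/* Values mirror the MSVC _MBC_* constants; the names are local to this sketch. */
enum { SK_MBC_SINGLE = 0, SK_MBC_LEAD = 1, SK_MBC_TRAIL = 2, SK_MBC_ILLEGAL = -1 };

/* Hypothetical stand-in for MSVCRT_isleadbyte, assuming code page 932
 * (Shift-JIS): lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int sk_isleadbyte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Classify the byte at mbstr[count] by walking character boundaries
 * from the start of the string. */
static int sk_mbsbtype(const unsigned char *mbstr, size_t count)
{
    const unsigned char *start = mbstr;
    const unsigned char *str = mbstr + count;

    while (start <= str) {
        if (!*start)
            return SK_MBC_ILLEGAL;      /* hit the terminator at or before str */
        if (sk_isleadbyte(*start)) {
            if (str == start)
                return SK_MBC_LEAD;
            if (!start[1])
                return SK_MBC_ILLEGAL;  /* sketch-only guard: truncated character */
            if (str == start + 1)
                return SK_MBC_TRAIL;
            start += 2;                 /* skip the whole double-byte character */
        } else {
            if (str == start)
                return SK_MBC_SINGLE;
            ++start;
        }
    }
    return SK_MBC_ILLEGAL;              /* not reached; every iteration returns or advances */
}
```

For the byte string { 0x83, 0x83, 0x41, 0x00 }, offsets 0, 1 and 2 classify as lead, trail and single respectively, even though the byte at offset 1 has a value in the lead-byte range. A check based only on the value at str or str[-1] cannot make that distinction, which is the case a test would want to cover once the code page is pinned.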