On Mon, Apr 25, 2005 at 01:07:54AM +0900, GOTO Masanori wrote:
> At Fri, 22 Apr 2005 12:14:53 +0100,
> Ross Paterson wrote:
> > According to the spec, mbrtowc(&wc, buf, 1, &st) should either return 1
> > and set wc, or return 0, (size_t)-1 or (size_t)-2.  In this locale it
> > returns either 0 or 1, but doesn't always set wc in the latter case,
>
> It works OK when I changed this source as follows.

Sorry, the subject line was a bit broad -- I didn't mean to imply any
more than a failure in this specific usage pattern.

> > (In iconvdata/tcvn5712-1.c, this decoding is treated as stateful, but
> > I don't think it should be.)
>
> It has five combined character:
>
> http://www.informatik.uni-leipzig.de/~duc/software/misc/tcvn.txt
>
> TCVN5712:1993 is very weird encodings, because 0xb0..0xb4 are
> postposing combined character.  This means even if we read the first
> character, we cannot decide output character until we read the 2nd
> character.

I know -- I just thought that one could have, e.g.

    mbrtowc(&wc, "a ", 1, &st) return (size_t)-2
    mbrtowc(&wc, "a ", 2, &st) return 1

(though getwc would have to push the extra byte back onto the stream,
I guess)

Just a wishlist item, of course: stateless encodings are easier to work
with, and this is the only stateful one in Debian.
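
For concreteness, the usage pattern in question looks roughly like the
sketch below. It is only an illustration: decode_bytewise is a made-up
helper name, not code from the bug report or from glibc. It feeds
mbrtowc one byte at a time and relies on the rule that wc has been
stored whenever the return value is 0 or 1, while (size_t)-2 means the
byte was absorbed into the state and nothing has been produced yet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not from the thread): decode buf one byte at a
   time.  Returns the number of wide characters stored in out, or
   (size_t)-1 if mbrtowc reports an invalid sequence. */
static size_t decode_bytewise(const char *buf, size_t len,
                              wchar_t *out, size_t outcap)
{
    mbstate_t st;
    size_t n = 0;

    assert(buf != NULL && out != NULL);
    memset(&st, 0, sizeof st);      /* start in the initial state */

    for (size_t i = 0; i < len && n < outcap; i++) {
        wchar_t wc;
        size_t r = mbrtowc(&wc, buf + i, 1, &st);

        if (r == (size_t)-1)
            return (size_t)-1;      /* invalid sequence */
        if (r == (size_t)-2)
            continue;               /* byte absorbed into st, no wc yet */

        /* r is 0 (null character) or 1: wc is expected to be set here */
        out[n++] = wc;
    }
    return n;
}
```

A caller that trusts the return value this way reads an unset wc if the
function returns 1 without storing anything, which is the failure
described at the top of this thread.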