I've identified the cause of bug #4253:
    /* Trim trailing space */
    while (*pbuf && !t_isspace(pbuf))
        pbuf++;
    *pbuf = '\0';
At least on Macs, t_isspace is capable of returning "true" when pointed
at the second byte of a 2-byte UTF-8 character. This explains the report
that the letter "à" triggers the problem when some other accented
letters don't. The fix, of course, is to advance pbuf by pg_mblen(pbuf)
rather than with a plain ++.
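
For illustration only, here is a minimal standalone sketch of the
byte-wise loop next to a character-wise one. It is not the PostgreSQL
source: sketch_mblen() is a hypothetical stand-in for pg_mblen() that
handles UTF-8 only, and bytewise_isspace() is a hypothetical stand-in
for t_isspace() that mimics the reported misbehavior by treating a lone
0xA0 byte as whitespace (an assumption; "à" is 0xC3 0xA0 in UTF-8, and
0xA0 by itself is the no-break space in Latin-1).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8 character
 * starting at s, read from its lead byte.  Never steps past a terminator.
 */
static int
sketch_mblen(const char *s)
{
    unsigned char c = (unsigned char) *s;
    int         len = 1;
    int         i;

    if ((c & 0xE0) == 0xC0)
        len = 2;
    else if ((c & 0xF0) == 0xE0)
        len = 3;
    else if ((c & 0xF8) == 0xF0)
        len = 4;

    for (i = 1; i < len; i++)
        if (s[i] == '\0')
            return i;           /* truncated sequence: stop at the NUL */
    return len;
}

/*
 * Hypothetical stand-in for t_isspace() that looks at a single byte and
 * (by assumption) reports 0xA0 as whitespace, as in the bug report.
 */
static int
bytewise_isspace(const char *s)
{
    unsigned char c = (unsigned char) *s;

    return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == 0xA0;
}

/* The loop as quoted above: advances one byte at a time. */
static size_t
truncate_bytewise(char *buf)
{
    char       *p = buf;

    assert(buf != NULL);
    while (*p && !bytewise_isspace(p))
        p++;
    *p = '\0';
    return (size_t) (p - buf);
}

/* Same loop, but advancing one whole character at a time. */
static size_t
truncate_charwise(char *buf)
{
    char       *p = buf;

    assert(buf != NULL);
    while (*p && !bytewise_isspace(p))
        p += sketch_mblen(p);
    *p = '\0';
    return (size_t) (p - buf);
}
```

Because the character-wise loop only ever tests the first byte of each
character, the whitespace test never sees the 0xA0 continuation byte,
so "à" survives intact. In the real code the step would presumably be
pg_mblen(pbuf) rather than this stand-in.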
I looked around for other occurrences of the same problem and found
a couple. I also found occurrences of the same pattern for skipping
whitespace:
    while (*s && t_isspace(s))
        s++;
This is safe if and only if t_isspace is never true for multibyte
characters ... can anyone think of a counterexample?
regards, tom lane