I've identified the cause of bug #4253:

            /* Trim trailing space */
            while (*pbuf && !t_isspace(pbuf))
                pbuf++;
            *pbuf = '\0';

At least on Macs, t_isspace can return "true" when pointed
at the second byte of a 2-byte UTF8 character.  This explains the report
that the letter "à" has a problem when some other letters don't: "à" is
0xC3 0xA0 in UTF8, and a lone 0xA0 is the Latin-1 no-break space, which
is presumably what gets classified as whitespace.  Of course pbuf needs
to be advanced using pg_mblen, not just ++.
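
To make the failure concrete, here is a minimal standalone sketch, not the
actual patch.  sketch_mblen and sketch_isspace are hypothetical stand-ins
for pg_mblen and t_isspace; the stand-in deliberately treats a bare 0xA0
byte as whitespace to mimic the behavior described above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF8 character
 * starting at s.  It stops early at a NUL or any non-continuation byte so
 * the caller can never step past the end of the string.
 */
static int
sketch_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int         len = (c >= 0xF0) ? 4 : (c >= 0xE0) ? 3 : (c >= 0xC0) ? 2 : 1;
    int         i;

    for (i = 1; i < len; i++)
        if (((unsigned char) s[i] & 0xC0) != 0x80)
            return i;
    return len;
}

/*
 * Hypothetical stand-in for t_isspace() that mimics the reported
 * misbehavior: a lone 0xA0 byte is classified as whitespace.
 */
static int
sketch_isspace(const char *s)
{
    unsigned char c = (unsigned char) s[0];

    return c == ' ' || c == '\t' || c == '\n' || c == 0xA0;
}

/* The loop as quoted above: steps one byte at a time. */
static void
trim_bytewise(char *pbuf)
{
    while (*pbuf && !sketch_isspace(pbuf))
        pbuf++;
    *pbuf = '\0';
}

/* Same loop, but stepping one whole character at a time. */
static void
trim_charwise(char *pbuf)
{
    while (*pbuf && !sketch_isspace(pbuf))
        pbuf += sketch_mblen(pbuf);
    *pbuf = '\0';
}
```

Given the UTF8 bytes of "voilà tout", trim_bytewise cuts the string in the
middle of "à" and leaves a dangling 0xC3 byte, while trim_charwise stops at
the real space and keeps "voilà" intact.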

I looked around for other occurrences of the same problem and found
a couple.  I also found occurrences of the same pattern for skipping
whitespace:

            while (*s && t_isspace(s))
                s++;

This is safe if and only if t_isspace is never true for multibyte
characters ... can anyone think of a counterexample?
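
For illustration only, here is what would go wrong if such a character did
exist.  This sketch assumes a hypothetical whitespace test, demo_isspace,
that accepts the 3-byte UTF8 sequence for U+3000 (ideographic space);
whether t_isspace behaves that way in any real locale is exactly the open
question.  demo_mblen is again a hypothetical stand-in for pg_mblen.

```c
#include <stddef.h>

/* Hypothetical stand-in for pg_mblen(); never steps past a NUL. */
static int
demo_mblen(const char *s)
{
    unsigned char c = (unsigned char) s[0];
    int         len = (c >= 0xF0) ? 4 : (c >= 0xE0) ? 3 : (c >= 0xC0) ? 2 : 1;
    int         i;

    for (i = 1; i < len; i++)
        if (((unsigned char) s[i] & 0xC0) != 0x80)
            return i;
    return len;
}

/*
 * Hypothetical whitespace test: ASCII blanks plus U+3000, whose UTF8
 * encoding is 0xE3 0x80 0x80.  The && chain stops reading at a NUL.
 */
static int
demo_isspace(const char *s)
{
    const unsigned char *u = (const unsigned char *) s;

    if (u[0] == ' ' || u[0] == '\t' || u[0] == '\n')
        return 1;
    return u[0] == 0xE3 && u[1] == 0x80 && u[2] == 0x80;
}

/* The pattern quoted above: skip whitespace one byte at a time. */
static const char *
skip_bytewise(const char *s)
{
    while (*s && demo_isspace(s))
        s++;
    return s;
}

/* Same pattern, skipping one whole character at a time. */
static const char *
skip_charwise(const char *s)
{
    while (*s && demo_isspace(s))
        s += demo_mblen(s);
    return s;
}
```

On input consisting of a blank, U+3000, and then "x", skip_bytewise returns
a pointer into the middle of the U+3000 sequence, whereas skip_charwise
returns a pointer to "x".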

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
