I have a question about the result contract of pg_do_encoding_conversion().
It can receive a string that is not null-terminated, because its arguments
are a char array and a byte length.
But it returns only a string, with no length, so the returned string should
be null-terminated.

However, if no conversion is required, the function returns
the input string itself, even though it might not be null-terminated.

I checked the callers of pg_do_encoding_conversion(), and xml_parse()
could run into trouble. Is this a bug? Does it need to be fixed?


---- [utils/mb/mbutils.c]
unsigned char *
pg_do_encoding_conversion(unsigned char *src, int len,
                          int src_encoding, int dest_encoding)
{
    ...
    if (src_encoding == dest_encoding)
        return src;
----

---- [utils/adt/xml.c]
static xmlDocPtr
xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace,
          xmlChar * encoding)
{
    ...
    len = VARSIZE(data) - VARHDRSZ;     /* will be useful later */
    string = xml_text2xmlChar(data);

    utf8string = pg_do_encoding_conversion(string,
                                           len,
                                           encoding ?
                                           xmlChar_to_encoding(encoding) :
           [It could be UTF8 to UTF8] -->  GetDatabaseEncoding(),
                                           PG_UTF8);
----
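
To make the concern concrete, here is a minimal standalone sketch. It is
not PostgreSQL code: sketch_convert() and sketch_convert_terminated() are
hypothetical names, the encodings are plain ints, the "conversion" is only
a byte copy, and it uses malloc() instead of palloc(). It models just the
shortcut quoted above, plus one possible caller-side workaround.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_do_encoding_conversion(). It models only
 * the shortcut quoted above: when both encodings are equal, the input
 * pointer is returned as-is, so no terminator is ever added.
 */
static unsigned char *
sketch_convert(unsigned char *src, int len, int src_encoding, int dest_encoding)
{
    unsigned char *result;

    assert(src != NULL && len >= 0);

    if (src_encoding == dest_encoding)
        return src;             /* same buffer, possibly unterminated */

    /* Assumption: a real conversion returns a fresh, terminated buffer. */
    result = (unsigned char *) malloc((size_t) len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, src, (size_t) len);
    result[len] = '\0';
    return result;
}

/*
 * Possible caller-side workaround: if the input came back unchanged,
 * make a null-terminated copy so the result is always safe to treat
 * as a C string.
 */
static unsigned char *
sketch_convert_terminated(unsigned char *src, int len,
                          int src_encoding, int dest_encoding)
{
    unsigned char *result;

    result = sketch_convert(src, len, src_encoding, dest_encoding);
    if (result == src)
    {
        result = (unsigned char *) malloc((size_t) len + 1);
        if (result == NULL)
            return NULL;
        memcpy(result, src, (size_t) len);
        result[len] = '\0';
    }
    return result;
}
```

With a 4-byte unterminated buffer and equal encodings, sketch_convert()
hands back the same pointer, while sketch_convert_terminated() returns a
separate 5-byte buffer ending in '\0'. Whether such a copy belongs in the
callers or inside pg_do_encoding_conversion() itself is the open question.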

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
