On Sat, May 08, 2010 at 09:24:45PM -0400, Tom Lane wrote:
> hgonza...@gmail.com writes:
> > http://sources.redhat.com/bugzilla/show_bug.cgi?id=649
> 
> > The last explains why they do not consider it a bug:
> 
> > ISO C99 requires for %.*s to only write complete characters that fit below  
> > the
> > precision number of bytes. If you are using say UTF-8 locale, but ISO-8859-1
> > characters as shown in the input file you provided, some of the strings are
> > not valid UTF-8 strings, therefore sprintf fails with -1 because of the
> > encoding error. That's not a bug in glibc.
> 
> Yeah, that was about the position I thought they'd take.

GNU libc eventually revisited that conclusion and fixed the bug through commit
715a900c9085907fa749589bf738b192b1a2bda5.  RHEL 7.1 is fixed, but RHEL 6.6 and
RHEL 5.11 are still affected; the bug will be relevant for another 8+ years.

> So the bottom line here is that we're best off to avoid %.*s because
> it may fail if the string contains data that isn't validly encoded
> according to libc's idea of the prevailing encoding.

Yep.  Immediate precisions like %.10s trigger the bug as effectively as %.*s,
so tarCreateHeader() [_tarWriteHeader() in 9.2 and earlier] is also affected.
Switching to strlcpy(), as attached, fixes the bug while simplifying the code.
The bug symptom is error 'pg_basebackup: unrecognized link indicator "0"' when
the name of a file in the data directory is not a valid multibyte string.


Commit 6dd9584 introduced a new use of .*s, to pg_upgrade.  It works reliably
for now, because it always runs in the C locale.  pg_upgrade never calls
set_pglocale_pgservice() or otherwise sets its permanent locale.  It would be
natural for us to fix that someday, at which point non-ASCII database names
would perturb this status output.

It would be good to purge the code of precisions on "s" conversion specifiers,
then Assert(!pointflag) in fmtstr() to catch new introductions.  I won't plan
to do it myself, but it would be a nice little defensive change.
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl 
b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index 0e4bd12..f46c5fc 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -17,6 +17,12 @@ command_fails(
        [ 'pg_basebackup', '-D', "$tempdir/backup" ],
        'pg_basebackup fails because of hba');
 
+# Some Windows ANSI code pages may reject this filename, in which case we
+# quietly proceed without this bit of test coverage.
+open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR";
+print BADCHARS "test backup of file with non-UTF8 name\n";
+close BADCHARS;
+
 open HBA, ">>$tempdir/pgdata/pg_hba.conf";
 print HBA "local replication all trust\n";
 print HBA "host replication all 127.0.0.1/32 trust\n";
diff --git a/src/port/tar.c b/src/port/tar.c
index 4721df3..72fd4e1 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -68,7 +68,7 @@ tarCreateHeader(char *h, const char *filename, const char 
*linktarget,
        memset(h, 0, 512);                      /* assume tar header size */
 
        /* Name 100 */
-       sprintf(&h[0], "%.99s", filename);
+       strlcpy(&h[0], filename, 100);
        if (linktarget != NULL || S_ISDIR(mode))
        {
                /*
@@ -110,7 +110,7 @@ tarCreateHeader(char *h, const char *filename, const char 
*linktarget,
                /* Type - Symbolic link */
                sprintf(&h[156], "2");
                /* Link Name 100 */
-               sprintf(&h[157], "%.99s", linktarget);
+               strlcpy(&h[157], linktarget, 100);
        }
        else if (S_ISDIR(mode))
                /* Type - directory */
@@ -127,11 +127,11 @@ tarCreateHeader(char *h, const char *filename, const char 
*linktarget,
 
        /* User 32 */
        /* XXX: Do we need to care about setting correct username? */
-       sprintf(&h[265], "%.31s", "postgres");
+       strlcpy(&h[265], "postgres", 32);
 
        /* Group 32 */
        /* XXX: Do we need to care about setting correct group name? */
-       sprintf(&h[297], "%.31s", "postgres");
+       strlcpy(&h[297], "postgres", 32);
 
        /* Major Dev 8 */
        sprintf(&h[329], "%07o ", 0);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to