Robert Haas <robertmh...@gmail.com> writes:
> On Thu, Sep 12, 2019 at 12:19 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>> After burrowing down further, it's visibly the case that
>> text_cmp and varstr_cmp don't leak in the sense of actually
>> reporting any part of their input strings.  What they do do,
>> in some code paths, is things like
>>         ereport(ERROR,
>>                 (errmsg("could not convert string to UTF-16: error code %lu",
>>                         GetLastError())));

> Is this possible? I mean, I'm sure it could happen if the data's
> corrupted, but we ought to have validated it on the way into the
> database. But maybe this code path also gets used for non-Unicode
> encodings?

Nope, the above is inside 

#ifdef WIN32
        /* Win32 does not have UTF-8, so we need to map to UTF-16 */
        if (GetDatabaseEncoding() == PG_UTF8
            && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))

I agree with your point that this is a shouldn't-happen corner case.
The question boils down to, if it *does* happen, does that constitute
a meaningful information leak?  Up to now we've taken quite a hard
line about what leakproofness means, so deciding that varstr_cmp
is leakproof would constitute moving the goalposts a bit.  They'd
still be in the same stadium, though, IMO.

Another approach would be to try to remove these failure cases,
but I don't really see how we'd do that.

                        regards, tom lane


Reply via email to