Re: [HACKERS] Reasons not to like asprintf

2013-10-24 Thread Peter Eisentraut
On 10/22/13, 3:40 PM, Tom Lane wrote:
 In order to avoid having to clutter stuff like that with #ifdef FRONTENDs,
 I'm now thinking we should use exactly the same names for the frontend and
 backend versions, ie psprintf() and pvsprintf().  The main reason for
 considering a pg_ prefix for the frontend versions was to avoid cluttering
 application namespace; but it's already the case that we don't expect
 libpgcommon to be namespace clean.

While this is attractive, the same logic would suggest that we rename
pg_malloc() to palloc(), and that sounds wrong.  The frontend and
backend functions do have different freeing semantics.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-24 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 10/22/13, 3:40 PM, Tom Lane wrote:
 In order to avoid having to clutter stuff like that with #ifdef FRONTENDs,
 I'm now thinking we should use exactly the same names for the frontend and
 backend versions, ie psprintf() and pvsprintf().  The main reason for
 considering a pg_ prefix for the frontend versions was to avoid cluttering
 application namespace; but it's already the case that we don't expect
 libpgcommon to be namespace clean.

 While this is attractive, the same logic would suggest that we rename
 pg_malloc() to palloc(), and that sounds wrong.  The frontend and
 backend functions do have different freeing semantics.

We already crossed that bridge, though, by defining palloc in frontend
environments to mean pg_malloc.  I'm doubtful that insisting on different
names is going to result in anything except #ifdef clutter.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-24 Thread Robert Haas
On Thu, Oct 24, 2013 at 2:18 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 10/22/13, 3:40 PM, Tom Lane wrote:
 In order to avoid having to clutter stuff like that with #ifdef FRONTENDs,
 I'm now thinking we should use exactly the same names for the frontend and
 backend versions, ie psprintf() and pvsprintf().  The main reason for
 considering a pg_ prefix for the frontend versions was to avoid cluttering
 application namespace; but it's already the case that we don't expect
 libpgcommon to be namespace clean.

 While this is attractive, the same logic would suggest that we rename
 pg_malloc() to palloc(), and that sounds wrong.  The frontend and
 backend functions do have different freeing semantics.

I'd almost be inclined to go the other way and suggest that we try to
use the pg_ prefix more, at least for things to be shared between
front and back end code.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Oct 24, 2013 at 2:18 PM, Peter Eisentraut pete...@gmx.net wrote:
 While this is attractive, the same logic would suggest that we rename
 pg_malloc() to palloc(), and that sounds wrong.  The frontend and
 backend functions do have different freeing semantics.

 I'd almost be inclined to go the other way and suggest that we try to
 use the pg_ prefix more, at least for things to be shared between
 front and back end code.

Meh.  I think that mainly promotes carpal tunnel syndrome.  The place
for a pg_ prefix is in functions we intend to expose to the outside
world, such as functions exposed by libpq.  But these are not that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-23 Thread Florian Weimer

On 10/22/2013 11:06 PM, Tom Lane wrote:


Attached is a draft, which compiles though I've not yet actually tested it.


nprinted = vsnprintf(buf, len, fmt, args);

Assert(buf[len - 1] == '\0');

The assert may fire if len  INT_MAX and the system returns with errno 
== EOVERFLOW, as required by POSIX.  It's probably better to move it 
after the error logging.


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 In short, I want to change pg_asprintf to return the malloc'd buffer
 as its function result.  This probably means we should change the
 function name too, since the allusion to asprintf is completely useless
 if we do that.  I'm thinking pg_psprintf for the frontend version of
 psprintf, but maybe somebody has a better suggestion?

Sounds reasonable, and I haven't got a better name, but I'm trying to
figure out why psprintf hasn't got the same issues which you mention in
your other thread (eg: depending on vsnprintf to return the would-be
result size).  I'm also a bit nervous that we only check vsnprintf()
success in Assert-enabled builds in psprintf, though I suppose that's
all we're doing in appendStringInfo too.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Sounds reasonable, and I haven't got a better name, but I'm trying to
 figure out why psprintf hasn't got the same issues which you mention in
 your other thread (eg: depending on vsnprintf to return the would-be
 result size).

It does, though not directly in psprintf but rather in the underlying
vasprintf implementation.  (Although psprintf is exposed to the problem
that it can't tell out-of-memory from format errors.)

What I'm on about in this thread is the API used by all the call sites,
not the underlying implementation issues.  As long as we're okay with
exit or elog on any error, I think we should just have the widely-used
functions returning a buffer pointer.  Underneath that, we need some work
so that we can print more-specific error messages, but that will not be
of concern to the callers.

It's possibly debatable whether exit-on-error is okay or not.  I think
it is though, because none of our frontend code is prepared to do anything
except go belly-up on out-of-memory, while in the backend case it's only
elog(ERROR) anyway.  The other possible class of failures is format string
or encoding errors, but those seem to reduce to the same cases: I can't
envision that frontend code would have any useful recovery strategies.
I'd like the printed error message to distinguish these cases, but I don't
think the callers need to care.

 I'm also a bit nervous that we only check vsnprintf()
 success in Assert-enabled builds in psprintf, though I suppose that's
 all we're doing in appendStringInfo too.

Actually, appendStringInfo treats result  0 as a buffer overrun report;
if the failure is persistent because it's really a format/encoding
problem, it'll loop till it runs out of memory, and then report the error
as that.  The trouble here is that in pre-C99 implementations of
vsnprintf, return code  0 might mean either buffer overrun or a
format/encoding problem.  We can't do too much about this unless we are
prepared to move our goalposts about portability assumptions.  (Hmm ...
current POSIX requires *printf to set errno on error, so maybe we could
look at errno?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  Sounds reasonable, and I haven't got a better name, but I'm trying to
  figure out why psprintf hasn't got the same issues which you mention in
  your other thread (eg: depending on vsnprintf to return the would-be
  result size).
 
 It does, though not directly in psprintf but rather in the underlying
 vasprintf implementation.  (Although psprintf is exposed to the problem
 that it can't tell out-of-memory from format errors.)

Right, pvsprintf().  Sorry for any confusion.

 What I'm on about in this thread is the API used by all the call sites,
 not the underlying implementation issues.  As long as we're okay with
 exit or elog on any error, I think we should just have the widely-used
 functions returning a buffer pointer.  Underneath that, we need some work
 so that we can print more-specific error messages, but that will not be
 of concern to the callers.

I agree that exit/Assert-or-elog is the right API here.  I wonder if
it'd be reasonable or worthwhile to try and actually make that happen-
that is to say, we really do only have one implementation for both
front-end and back-end here, but on the front-end we exit, while on the
back-end we elog(ERROR).  I'm guessing that's a pipe-dream, but figured
I'd mention it anyway, in case people have crafty ideas about how that
might be made to work (and if others think it's even a good idea).

 The other possible class of failures is format string
 or encoding errors, but those seem to reduce to the same cases: I can't
 envision that frontend code would have any useful recovery strategies.
 I'd like the printed error message to distinguish these cases, but I don't
 think the callers need to care.

Agreed.

  I'm also a bit nervous that we only check vsnprintf()
  success in Assert-enabled builds in psprintf, though I suppose that's
  all we're doing in appendStringInfo too.
 
 Actually, appendStringInfo treats result  0 as a buffer overrun report;

I had been looking at the Assert() that the last character is always
'\0' in allocStringInfoVA.  Probably a stretch to say that has to be
upgraded to an elog(), but I'd certainly be much happier with a runtime
did this error? check of some kind than not; hence my support for your
errno notion below..

 if the failure is persistent because it's really a format/encoding
 problem, it'll loop till it runs out of memory, and then report the error
 as that.  

Ah, yes, I see that in enlargeStringInfo and as long as we keep
MaxAllocSize reasonable that isn't terrible.

 (Hmm ...
 current POSIX requires *printf to set errno on error, so maybe we could
 look at errno?)

Sounds like a good idea to me..  Being explicit about checking for error
results, rather than hoping-against-hope that the only reason we'd ever
get a value less than the total length is when we need to provide more
memory, would be a lot better.  Perhaps it'd be worthwhile to run a test
against the build farm and see what kind of errno's are being set in
those cases where we're now getting the 'out of memory' failures you
mentioned upthread?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 I agree that exit/Assert-or-elog is the right API here.  I wonder if
 it'd be reasonable or worthwhile to try and actually make that happen-
 that is to say, we really do only have one implementation for both
 front-end and back-end here, but on the front-end we exit, while on the
 back-end we elog(ERROR).  I'm guessing that's a pipe-dream,

It's not really --- we could make it happen with a single source-file in
libpgcommon, with a few #ifdef FRONTENDs, similarly to exec.c for
instance.  I'm not entirely sure that it's worth the trouble, since once
you take away the error and memory management there's not much left of
these functions; so the #ifdefs might end up being most of the code.
I'm going to go code it shortly, and we'll find out.

 I had been looking at the Assert() that the last character is always
 '\0' in allocStringInfoVA.  Probably a stretch to say that has to be
 upgraded to an elog(), but I'd certainly be much happier with a runtime
 did this error? check of some kind than not;

IIRC, that Assert is meant to catch an ancient buggy version of vsnprintf,
so it can't really be replaced by an errno check --- the whole point is
that that vsnprintf failed to recognize it was doing something wrong.
We could upgrade it to an elog; but we had that debate years ago when
the code went in, and I don't think the passage of time has made it more
important rather than less so to have an always-on check for that case.

 (Hmm ...
 current POSIX requires *printf to set errno on error, so maybe we could
 look at errno?)

 Sounds like a good idea to me..  Being explicit about checking for error
 results, rather than hoping-against-hope that the only reason we'd ever
 get a value less than the total length is when we need to provide more
 memory, would be a lot better.  Perhaps it'd be worthwhile to run a test
 against the build farm and see what kind of errno's are being set in
 those cases where we're now getting the 'out of memory' failures you
 mentioned upthread?

Yeah, I'm hoping that reporting errno will give us some more insight,
if anole's problem is still there after we replace the code.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Tom Lane
... BTW, another reason to choose identical APIs for frontend and backend
versions of these functions is that it greatly eases use of them in shared
frontend/backend code.  As I notice somebody has *already done* in
common/relpath.c.   I'm not exactly sure how those psprintf calls are
working at all in frontend builds.  Maybe they aren't quite, and that has
something to do with the failures on anole?

In order to avoid having to clutter stuff like that with #ifdef FRONTENDs,
I'm now thinking we should use exactly the same names for the frontend and
backend versions, ie psprintf() and pvsprintf().  The main reason for
considering a pg_ prefix for the frontend versions was to avoid cluttering
application namespace; but it's already the case that we don't expect
libpgcommon to be namespace clean.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 ... BTW, another reason to choose identical APIs for frontend and backend
 versions of these functions is that it greatly eases use of them in shared
 frontend/backend code.  As I notice somebody has *already done* in
 common/relpath.c.   I'm not exactly sure how those psprintf calls are
 working at all in frontend builds.  Maybe they aren't quite, and that has
 something to do with the failures on anole?

Seems plausible.

 In order to avoid having to clutter stuff like that with #ifdef FRONTENDs,
 I'm now thinking we should use exactly the same names for the frontend and
 backend versions, ie psprintf() and pvsprintf().  The main reason for
 considering a pg_ prefix for the frontend versions was to avoid cluttering
 application namespace; but it's already the case that we don't expect
 libpgcommon to be namespace clean.

To be honest, I had been assuming we'd use the same names.  As for which
direction to go, I'd personally prefer psprintf but that's just my lazy
developer fingers talking.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Alvaro Herrera
Tom Lane wrote:
 ... BTW, another reason to choose identical APIs for frontend and backend
 versions of these functions is that it greatly eases use of them in shared
 frontend/backend code.  As I notice somebody has *already done* in
 common/relpath.c.   I'm not exactly sure how those psprintf calls are
 working at all in frontend builds.

There's psprintf in src/common/fe_memutils.c, too, using the backend's
API, which is why this works.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 ... BTW, another reason to choose identical APIs for frontend and backend
 versions of these functions is that it greatly eases use of them in shared
 frontend/backend code.  As I notice somebody has *already done* in
 common/relpath.c.   I'm not exactly sure how those psprintf calls are
 working at all in frontend builds.

 There's psprintf in src/common/fe_memutils.c, too, using the backend's
 API, which is why this works.

Ah --- that's why it links, anyway.  As for works, I suspect that the
answer is that anole has a vsnprintf that returns -1 on buffer overrun.
vasprintf and then psprintf will interpret that as out of memory,
resulting in the observed behavior.  It'd be interesting to see the stack
trace from that error, though, just to confirm this theory.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reasons not to like asprintf

2013-10-22 Thread Tom Lane
I wrote:
 Stephen Frost sfr...@snowman.net writes:
 I agree that exit/Assert-or-elog is the right API here.  I wonder if
 it'd be reasonable or worthwhile to try and actually make that happen-
 that is to say, we really do only have one implementation for both
 front-end and back-end here, but on the front-end we exit, while on the
 back-end we elog(ERROR).  I'm guessing that's a pipe-dream,

 It's not really --- we could make it happen with a single source-file in
 libpgcommon, with a few #ifdef FRONTENDs, similarly to exec.c for
 instance.  I'm not entirely sure that it's worth the trouble, since once
 you take away the error and memory management there's not much left of
 these functions; so the #ifdefs might end up being most of the code.
 I'm going to go code it shortly, and we'll find out.

Attached is a draft, which compiles though I've not yet actually tested it.
It seems pretty reasonable as far as readability goes.  A couple of notes:

1. I omitted pvsprintf(), which we don't actually use anywhere anyway.
I don't see any way to implement that API without va_copy, which is one
of the main things I'm trying to get rid of.  Since we've never needed
a va_args variant in stringinfo.c in all the years we've had that, I
think we can get away with omitting it here.

2. The file includes utils/memutils.h, which I'm not 100% sure is safe
to include in the FRONTEND case.  This is so that it can refer to
MaxAllocSize.  If it turns out that that causes build problems on some
platforms, we could use some more ad-hoc way to define the limit (maybe
just INT_MAX/4?), or we could move the definition of MaxAllocSize into
palloc.h.

Barring objections, I'll push forward with replacing the existing code
with this.

regards, tom lane

/*-
 *
 * psprintf.c
 *  sprintf into an allocated-on-demand buffer
 *
 *
 * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 *
 * IDENTIFICATION
 *src/common/psprintf.c
 *
 *-
 */

#ifndef FRONTEND
#include postgres.h
#else
#include postgres_fe.h
#endif

#include utils/memutils.h


static size_t pvsnprintf(char *buf, size_t len, const char *fmt, va_list args);


/*
 * psprintf
 *
 * Format text data under the control of fmt (an sprintf-style format string)
 * and return it in an allocated-on-demand buffer.  The buffer is allocated
 * with palloc in the backend, or malloc in frontend builds.  Caller is
 * responsible to free the buffer when no longer needed, if appropriate.
 *
 * Errors are not returned to the caller, but are reported via elog(ERROR)
 * in the backend, or printf-to-stderr-and-exit() in frontend builds.
 * One should therefore think twice about using this in libpq.
 */
char *
psprintf(const char *fmt,...)
{
size_t  len = 128;  /* initial assumption about 
buffer size */

for (;;)
{
char   *result;
va_list args;

/*
 * Allocate result buffer.  Note that in frontend this maps 
to malloc
 * with exit-on-error.
 */
result = (char *) palloc(len);

/* Try to format the data. */
va_start(args, fmt);
len = pvsnprintf(result, len, fmt, args);
va_end(args);

if (len == 0)
return result;  /* success */

/* Release buffer and loop around to try again with larger len. 
*/
pfree(result);
}
}

/*
 * pvsnprintf
 *
 * Attempt to format text data under the control of fmt (an sprintf-style
 * format string) and insert it into buf (which has length len).
 *
 * If successful, return zero.  If there's not enough space in buf, return
 * an estimate of the buffer size needed to succeed (this *must* be more
 * than len, else psprintf might loop infinitely).
 * Other error cases do not return.
 *
 * XXX This API is ugly, but there seems no alternative given the C spec's
 * restrictions on what can portably be done with va_list arguments: you have
 * to redo va_start before you can rescan the argument list, and we can't do
 * that from here.
 */
static size_t
pvsnprintf(char *buf, size_t len, const char *fmt, va_list args)
{
int nprinted;

Assert(len  0);

errno = 0;

/*
 * Assert check here is to catch buggy vsnprintf that overruns the
 * specified buffer length.  Solaris 7 in 64-bit mode is an example of a
 * platform with such a bug.
 */
#ifdef USE_ASSERT_CHECKING
buf[len - 1] = '\0';
#endif

nprinted = vsnprintf(buf, len, fmt, args);

Assert(buf[len - 1] == '\0');

/*
 *