Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 1:04 AM, Tom Lane wrote:
> David Johnston writes:
>> On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane wrote:
>>> TBH I've also been wondering whether any of these proposed cures are
>>> better than the disease. The changes that can be argued to make the
>>> behavior more sane are also ones that introduce backwards compatibility
>>> issues of one magnitude or another. And I do not have a lot of sympathy
>>> for "let's not change anything except to throw an error in a case that
>>> seems ambiguous". That's mostly being pedantic, not helpful, especially
>>> seeing that the number of field complaints about it is indistinguishable
>>> from zero.
>> Then what does it matter that we'd choose to error-out?
> The number of complaints about the *existing* behavior is indistinguishable
> from zero (AFAIR anyway). It does not follow that deciding to throw an
> error where we did not before will draw no complaints.

Any change has the potential to draw complaints. For you it seems that "hey, I upgraded to 9.5 and my logs are being rotated out every minute now. I thought I had that turned off" is the desired complaint. Greg wants: "hey, my 1 hour log rotation is now happening every minute". If the error message is written correctly, most people upon seeing the error will simply fix their configuration and move on - regardless of whether they were proactive in doing so having read the release notes.

And, so what if we get some complaints? We already disallow the specified values (instead, turning them into zeros), so it isn't like we are further restricting user capabilities. If I were to commit a patch that didn't throw an error, I'd ask the author to provide an outline for each of the affected parameters and what it would mean (if possible) for a setting currently rounded to zero to instead be rounded to 1.

It's the same language in the release notes to get someone to avoid the error as it is to get them to not be impacted by the rounding change, so the difference is those people who would not read the release notes. The "error" outcome is simple, direct, and fool-proof - and conveys the fact that what they are asking for is invalid. It may be a little scary, but at least we can be sure nothing bad is happening in the system. If the argument is that there are too few possible instances out there to expend the effort to catalog every parameter, then there isn't enough surface area to care about throwing an error. And I still maintain that anyone setting values for a clean installation would rather be told their input is not valid instead of silently making it so.

Changing the unit for log_rotate_age when there is basically no one asking for that seems like the worse solution at face value; my +1 notwithstanding. I gave that mostly because if you are going to overhaul the system then making everything "ms" seems like the right thing to do. I think that such an effort would be a waste of resources.

I don't have clients, so maybe my acceptance of errors is overly optimistic - but in this case I truly don't see enough people even realizing that there was a change, and those few that do should be able to read the error and make the same change that they would need to make anyway if the rounding option is chosen.

My only concern here, and it probably applies to any solution, is that the change is implemented properly so that all possible user input areas throw the error correctly and as appropriate to the provided interface. That is, interactive use immediately errors out without changing the default values, while explicit/manual file changes cause the system to error at startup.

I haven't looked into the code parts of this, but I have to imagine this is already covered since, even with units and such, invalid input is still possible and needs to be addressed; this only adds one more check to that existing routine.

David J.
Re: [HACKERS] add modulo (%) operator to pgbench
On 9/24/14, 10:10 PM, Robert Haas wrote:
> I think you're making a mountain out of a molehill. I implemented this
> today in about three hours.

I think you're greatly understating your own efficiency at shift/reducing parser mountains down to molehills. Fabien even guessed the LOC size of the resulting patch with less than a 9% error. That's some top notch software metrics and development work there, boys; kudos all around.

Let's get this operator support whipped into shape; then we can add the 2 to 3 versions of the modulo operator needed to make the major use cases work. (There was never going to be just one hacked in with a quick patch that satisfied the multiple ways you can do this.) Then onto the last missing pieces of Fabien's abnormally distributed test workload vision.

He seems kind of stressed about the process lately; not sure what to say about it. Yes, the PostgreSQL community is hostile to short, targeted feature improvements unless they come already fit into a large, standards-compliant strategy for improving the database. That doesn't work well when approached with scratch-an-itch style development. Nothing we can really do about it.

-- Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: rounding up time value less than its unit.
David Johnston writes:
> On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane wrote:
>> TBH I've also been wondering whether any of these proposed cures are
>> better than the disease. The changes that can be argued to make the
>> behavior more sane are also ones that introduce backwards compatibility
>> issues of one magnitude or another. And I do not have a lot of sympathy
>> for "let's not change anything except to throw an error in a case that
>> seems ambiguous". That's mostly being pedantic, not helpful, especially
>> seeing that the number of field complaints about it is indistinguishable
>> from zero.
> "Then what does it matter that we'd choose to error-out?"

The number of complaints about the *existing* behavior is indistinguishable from zero (AFAIR anyway). It does not follow that deciding to throw an error where we did not before will draw no complaints.

regards, tom lane
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 12:46 AM, Tom Lane wrote:
> Gregory Smith writes:
>> I don't see any agreement on the real root of a problem here yet. That
>> makes gauging whether any smaller change leads that way or not fuzzy. I
>> personally would be fine doing nothing right now, instead waiting until
>> that's charted out--especially if the alternative is applying any of the
>> rounding or error throwing ideas suggested so far. I'd say that I hate
>> to be that guy who tells everyone else they're wrong, but we all know I
>> enjoy it.
>
> TBH I've also been wondering whether any of these proposed cures are
> better than the disease. The changes that can be argued to make the
> behavior more sane are also ones that introduce backwards compatibility
> issues of one magnitude or another. And I do not have a lot of sympathy
> for "let's not change anything except to throw an error in a case that
> seems ambiguous". That's mostly being pedantic, not helpful, especially
> seeing that the number of field complaints about it is indistinguishable
> from zero.

Then what does it matter that we'd choose to error-out?

> I am personally not as scared of backwards-compatibility problems as some
> other commenters: I do not think that there's ever been a commitment that
> postgresql.conf contents will carry forward blindly across major releases.
> So I'd be willing to break strict compatibility in the name of making the
> behavior less surprising. But the solutions that have been proposed that
> hold to strict backwards compatibility requirements are not improvements
> IMHO.

Or, put differently, the pre-existing behavior is fine, so don't fix what isn't broken. This patch simply fixes an oversight in the original implementation - that someone might try to specify an invalid value (i.e., between 0 and 1). If 0 and -1 are flags, then the minimum allowable value is 1. The logic should have been: range [1, something]; 0 (optionally); -1 (optionally).

Values abs(x) between 0 and 1 (exclusive) should be disallowed and, like an attempt to specify 0.5 (without units), should throw an error.

David J.
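To make the proposal concrete: the check being argued for is simply that a nonzero setting must not truncate to zero when converted into the parameter's base unit. A minimal sketch in C, with illustrative names only (this is not PostgreSQL's actual guc.c API):

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of the proposed check: convert a user-supplied time
 * value (here already normalized to milliseconds) into the parameter's
 * base unit, and reject -- rather than silently round to zero -- any
 * nonzero value smaller than one base unit.  An exact 0 is preserved,
 * since 0 is often a "disabled" flag value.
 */
static bool
convert_time_value(double value_ms, int base_unit_ms, int *result)
{
    *result = (int) (value_ms / base_unit_ms);  /* truncating division */

    if (*result == 0 && value_ms != 0.0)
        return false;   /* caller reports: "value is less than 1 in the
                         * parameter's base unit" */
    return true;
}
```

A caller would turn the false return into the error being proposed here, instead of storing 0 and silently changing the parameter's meaning.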
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
At 2014-09-24 11:09:24 +0200, and...@2ndquadrant.com wrote:
> Why not add it to the stattuple extension, but as an independent
> function (and file)?

Thanks, that's a good idea. I'll do that.

> I think it's completely unacceptable to copy a visibility routine.

OK. Which visibility routine should I use, and should I try to create a variant that doesn't set hint bits? I don't have any reasoning for why it's safe to not hold a content lock. If there is one, I need help to find it. If not, I'll acquire a content lock. (If anyone can explain why it isn't safe, I would appreciate it.)

-- Abhijit
Re: [HACKERS] proposal: rounding up time value less than its unit.
On Thu, Sep 25, 2014 at 12:11 AM, Gregory Smith wrote:
> On 9/24/14, 6:45 PM, Peter Eisentraut wrote:
>> But then this proposal is just one of several others that break backward
>> compatibility, and does so in a more or less silent way. Then we might
>> as well pick another approach that gets closer to the root of the problem.
>
> I was responding to some desire to get a quick fix that cut off the worst
> of the problem, and the trade-offs of the other ideas bothered me even
> more. Obvious breakage is annoying, but small and really subtle version to
> version incompatibilities drive me really crazy. A 60X scale change to
> log_rotation_age will be big enough that impacted users should definitely
> notice, while not being so big it's scary. Rounding differences are small
> enough to miss. And I do see whacking the sole GUC that's set in minutes,
> which I was surprised to find is a thing that existed at all, as a useful
> step forward.

Why? Do you agree that a log_rotation_age value defined in seconds is sane? If your reasoning is that everything else is defined in "s" and "ms", then that is a developer, not a user, perspective. I'll buy into the "everything is defined in the smallest possible unit" approach - in which case the whole rounding thing becomes a non-issue - but if you leave some of these in seconds then we should still add an error if someone defines an insane millisecond value for those parameters.

> I can't agree with Stephen's optimism that some wording cleanup is all
> that's needed here. I predict it will be an annoying, multiple commit job
> just to get the docs right, describing both the subtle rounding change and
> how it will impact users. (Cry an extra tear for the translators)

Maybe I'm naive, but I'm seriously doubting this. From what I can tell the rounding isn't currently documented and really doesn't need much, if any, to be added. An error instead of rounding down to zero would be sufficient and self-contained.

"The value specified is less than 1 in the parameter's base unit"

> Meanwhile, I'd wager the entire work of my log_rotation_scale unit change
> idea, from code to docs to release notes, will take less time than just
> getting the docs done on any rounding change. Probably cause less field
> breakage and confusion in the end too.

You've already admitted there will be breakage. Your argument is that it will be obvious enough to notice. Why not just make it impossible?

> The bad case I threw out is a theorized one that shows why we can't go
> completely crazy with jerking units around, why 1000:1 adjustments are
> hard. I'm not actually afraid of that example being in the wild in any
> significant quantity. The resulting regression from a 60X scale change
> should not be so bad that people will suffer greatly from it either.
> Probably be like the big 10:1 change in default_statistics_target. Seemed
> scary that some people might be nailed by it; didn't turn out to be such a
> big deal.
>
> I don't see any agreement on the real root of a problem here yet. That
> makes gauging whether any smaller change leads that way or not fuzzy. I
> personally would be fine doing nothing right now, instead waiting until
> that's charted out--especially if the alternative is applying any of the
> rounding or error throwing ideas suggested so far. I'd say that I hate to
> be that guy who tells everyone else they're wrong, but we all know I enjoy
> it.

Maybe not: I contend the root problem is that while we provide sane unit specifications we've assumed that people will always be providing values greater than 1 - but if they don't, we silently use zero (a special value) instead of telling them they messed up (made an error). If the presence of config.d and such makes this untenable, then I'd say those features have a problem - not our current choice to define what sane is.

David J.
Re: [HACKERS] proposal: rounding up time value less than its unit.
Gregory Smith writes:
> I don't see any agreement on the real root of a problem here yet. That
> makes gauging whether any smaller change leads that way or not fuzzy. I
> personally would be fine doing nothing right now, instead waiting until
> that's charted out--especially if the alternative is applying any of the
> rounding or error throwing ideas suggested so far. I'd say that I hate
> to be that guy who tells everyone else they're wrong, but we all know I
> enjoy it.

TBH I've also been wondering whether any of these proposed cures are better than the disease. The changes that can be argued to make the behavior more sane are also ones that introduce backwards compatibility issues of one magnitude or another. And I do not have a lot of sympathy for "let's not change anything except to throw an error in a case that seems ambiguous". That's mostly being pedantic, not helpful, especially seeing that the number of field complaints about it is indistinguishable from zero.

I am personally not as scared of backwards-compatibility problems as some other commenters: I do not think that there's ever been a commitment that postgresql.conf contents will carry forward blindly across major releases. So I'd be willing to break strict compatibility in the name of making the behavior less surprising. But the solutions that have been proposed that hold to strict backwards compatibility requirements are not improvements IMHO.

regards, tom lane
Re: [HACKERS] proposal: rounding up time value less than its unit.
On 9/24/14, 6:45 PM, Peter Eisentraut wrote:
> But then this proposal is just one of several others that break backward
> compatibility, and does so in a more or less silent way. Then we might
> as well pick another approach that gets closer to the root of the problem.

I was responding to some desire to get a quick fix that cut off the worst of the problem, and the trade-offs of the other ideas bothered me even more. Obvious breakage is annoying, but small and really subtle version to version incompatibilities drive me really crazy. A 60X scale change to log_rotation_age will be big enough that impacted users should definitely notice, while not being so big it's scary. Rounding differences are small enough to miss. And I do see whacking the sole GUC that's set in minutes, which I was surprised to find is a thing that existed at all, as a useful step forward.

I can't agree with Stephen's optimism that some wording cleanup is all that's needed here. I predict it will be an annoying, multiple-commit job just to get the docs right, describing both the subtle rounding change and how it will impact users. (Cry an extra tear for the translators.)

Meanwhile, I'd wager the entire work of my log_rotation_scale unit change idea, from code to docs to release notes, will take less time than just getting the docs done on any rounding change. Probably cause less field breakage and confusion in the end too.

The bad case I threw out is a theorized one that shows why we can't go completely crazy with jerking units around, why 1000:1 adjustments are hard. I'm not actually afraid of that example being in the wild in any significant quantity. The resulting regression from a 60X scale change should not be so bad that people will suffer greatly from it either. Probably be like the big 10:1 change in default_statistics_target. Seemed scary that some people might be nailed by it; didn't turn out to be such a big deal.

I don't see any agreement on the real root of a problem here yet. That makes gauging whether any smaller change leads that way or not fuzzy. I personally would be fine doing nothing right now, instead waiting until that's charted out--especially if the alternative is applying any of the rounding or error throwing ideas suggested so far. I'd say that I hate to be that guy who tells everyone else they're wrong, but we all know I enjoy it.

-- Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/
Re: [HACKERS] interval typmodout is broken
Alvaro Herrera writes:
> Tom Lane wrote:
>> You sure about that? The grammar for INTERVAL is weird.
> Well, I tested what is taken on input, and yes I agree the grammar is
> weird (but not more weird than timestamp/timestamptz, mind). The input
> function only accepts the precision just after the INTERVAL keyword, not
> after the fieldstr:
> alvherre=# create table str (a interval(2) hour to minute);
> CREATE TABLE
> alvherre=# create table str2 (a interval hour to minute(2));
> ERROR: syntax error at or near "("
> LÍNEA 1: create table str2 (a interval hour to minute(2));
> ^

No, that's not about where it is, it's about what the field is: only "second" can have a precision. Our grammar is actually allowing stuff here that it shouldn't. According to the SQL spec, you could write

	interval hour(2) to minute

but this involves a "leading field precision", which we do not support and should definitely not be conflating with trailing-field precision. Or you could write

	interval hour to second(2)

which is valid and we support it. You can *not* write

	interval hour to minute(2)

either per spec or per our implementation; and

	interval(2) hour to minute

is 100% invalid per spec, even though our grammar goes out of its way to accept it.

In short, the typmodout function is doing what it ought to. It's the grammar that's broken. It looks to me like Tom Lockhart coded the grammar to accept a bunch of cases that he never got round to actually implementing reasonably. In particular, per SQL spec these are completely different animals:

	interval hour(2) to second
	interval hour to second(2)

but our grammar transforms them into the same thing. We ought to fix that...

regards, tom lane
Re: [HACKERS] identify_locking_dependencies is broken for schema-only dumps
On Wed, Sep 24, 2014 at 4:36 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Sep 24, 2014 at 12:18 PM, Tom Lane wrote:
>>> Sorry for not paying attention sooner. After studying it for awhile,
>>> I think the change is probably all right but your proposed comment is
>>> entirely inadequate.
>> If you don't like that version, can you suggest something you would like
>> better?
> Perhaps like this:
>
> * We assume the entry requires exclusive lock on each TABLE or TABLE DATA
> * item listed among its dependencies. Originally all of these would have
> * been TABLE items, but repoint_table_dependencies would have repointed
> * them to the TABLE DATA items if those are present (which they might not
> * be, eg in a schema-only dump). Note that all of the entries we are
> * processing here are POST_DATA; otherwise there might be a significant
> * difference between a dependency on a table and a dependency on its
> * data, so that closer analysis would be needed here.

Works for me. I'll push with that text unless you'd like to take care of it.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] add modulo (%) operator to pgbench
On Wed, Sep 24, 2014 at 2:34 PM, Fabien COELHO wrote: > Sigh. > > How to transform a trivial 10 lines patch into a probably 500+ lines project > involving flex & bison & some non trivial data structures, and which may get > rejected on any ground... > > Maybe I'll set that as a student project. I think you're making a mountain out of a molehill. I implemented this today in about three hours. The patch is attached. It needs more testing, documentation, and possibly some makefile adjustments, but it seems to basically work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/contrib/pgbench/.gitignore b/contrib/pgbench/.gitignore index 489a2d6..aae819e 100644 --- a/contrib/pgbench/.gitignore +++ b/contrib/pgbench/.gitignore @@ -1 +1,3 @@ +/exprparse.c +/exprscan.c /pgbench diff --git a/contrib/pgbench/Makefile b/contrib/pgbench/Makefile index b8e2fc8..67e2bf6 100644 --- a/contrib/pgbench/Makefile +++ b/contrib/pgbench/Makefile @@ -4,7 +4,7 @@ PGFILEDESC = "pgbench - a simple program for running benchmark tests" PGAPPICON = win32 PROGRAM = pgbench -OBJS = pgbench.o $(WIN32RES) +OBJS = pgbench.o exprparse.o $(WIN32RES) PG_CPPFLAGS = -I$(libpq_srcdir) PG_LIBS = $(libpq_pgport) $(PTHREAD_LIBS) @@ -18,8 +18,22 @@ subdir = contrib/pgbench top_builddir = ../.. include $(top_builddir)/src/Makefile.global include $(top_srcdir)/contrib/contrib-global.mk + +distprep: exprparse.c exprscan.c endif ifneq ($(PORTNAME), win32) override CFLAGS += $(PTHREAD_CFLAGS) endif + +# There is no correct way to write a rule that generates two files. +# Rules with two targets don't have that meaning, they are merely +# shorthand for two otherwise separate rules. To be safe for parallel +# make, we must chain the dependencies like this. The semicolon is +# important, otherwise make will choose the built-in rule for +# gram.y=>gram.c. 
+ +exprparse.h: exprparse.c ; + +# exprscan is compiled as part of exprparse +exprparse.o: exprscan.c diff --git a/contrib/pgbench/exprparse.y b/contrib/pgbench/exprparse.y new file mode 100644 index 000..37eb538 --- /dev/null +++ b/contrib/pgbench/exprparse.y @@ -0,0 +1,96 @@ +%{ +/*- + * + * exprparse.y + * bison grammar for a simple expression syntax + * + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + *- + */ + +#include "postgres_fe.h" + +#include "pgbench.h" + +PgBenchExpr *expr_parse_result; + +static PgBenchExpr *make_integer_constant(int64 ival); +static PgBenchExpr *make_variable(char *varname); +static PgBenchExpr *make_op(char operator, PgBenchExpr *lexpr, + PgBenchExpr *rexpr); + +%} + +%expect 0 +%name-prefix="expr_yy" + +%union +{ + int64 ival; + char *str; + PgBenchExpr *expr; +} + +%type expr +%type INTEGER +%type VARIABLE +%token INTEGER VARIABLE + +%left '+' '-' +%left '*' '/' '%' +%right UMINUS + +%% + +result: expr{ expr_parse_result = $1; } + +expr: '(' expr ')' { $$ = $2; } + | '+' expr %prec UMINUS { $$ = $2; } + | '-' expr %prec UMINUS + { $$ = make_op('-', make_integer_constant(0), $2); } + | expr '+' expr { $$ = make_op('+', $1, $3); } + | expr '-' expr { $$ = make_op('-', $1, $3); } + | expr '*' expr { $$ = make_op('*', $1, $3); } + | expr '/' expr { $$ = make_op('/', $1, $3); } + | expr '%' expr { $$ = make_op('%', $1, $3); } + | INTEGER{ $$ = make_integer_constant($1); } + | VARIABLE { $$ = make_variable($1); } + ; + +%% + +static PgBenchExpr * +make_integer_constant(int64 ival) +{ + PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr)); + + expr->etype = ENODE_INTEGER_CONSTANT; + expr->u.integer_constant.ival = ival; + return expr; +} + +static PgBenchExpr * +make_variable(char *varname) +{ + PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr)); + + expr->etype = ENODE_VARIABLE; + expr->u.variable.varname = varname; + return expr; +} + 
+static PgBenchExpr * +make_op(char operator, PgBenchExpr *lexpr, PgBenchExpr *rexpr) +{ + PgBenchExpr *expr = pg_malloc(sizeof(PgBenchExpr)); + + expr->etype = ENODE_OPERATOR; + expr->u.operator.operator = operator; + expr->u.operator.lexpr = lexpr; + expr->u.operator.rexpr = rexpr; + return expr; +} + +#include "exprscan.c" diff --git a/contrib/pgbench/exprscan.l b/contrib/pgbench/exprscan.l new file mode 100644 index 000..4409f08 --- /dev/null +++ b/contrib/pgbench/exprscan.l @@ -0,0 +1,100 @@ +%{ +/*- + * + * exprscan.l + * a lexical scanner for a simple expression syntax + * + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + *- + */ + +s
Re: [HACKERS] delta relations in AFTER triggers
On 09/15/2014 10:25 PM, Kevin Grittner wrote:
> I broke out the changes from the previous patch in multiple commits
> in my repository on github:

*Thank you*. That gives me the incentive to pull it and test it. A nice patch series published in a git repo is so much easier to work with than a giant squashed patch as an attachment.

-- Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Immediate standby promotion
On Thu, Sep 25, 2014 at 5:36 AM, Simon Riggs wrote:
> We go to a lot of trouble to ensure data is successfully on disk and
> in WAL. I won't give that up, nor do I want to make it easier to lose
> data than it already is.

+1.

-- Michael
Re: [HACKERS] delta relations in AFTER triggers
On 08/28/2014 05:03 AM, Kevin Grittner wrote:
> I don't have to squint that hard -- I've always been comfortable
> with the definition of a table as a relation variable, and it's not
> too big a stretch to expand that to a tuplestore. ;-) In fact, I
> will be surprised if someone doesn't latch onto this to create a
> new "declared temporary table" that only exists within the scope of
> a compound statement (i.e., a BEGIN/END block). You would DECLARE
> them just like you would a scalar variable in a PL, and they would
> have the same scope.
>
> I'll take a look at doing this in the next couple days, and see
> whether doing it that way is as easy as it seems on the face of it.

Oracle's TABLE variables in PL/SQL are quite similar; it might be worth observing how they work for comparison.

-- Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On Fri, Sep 19, 2014 at 2:54 PM, Peter Geoghegan wrote:
> Probably not - it appears to make very little difference to
> unoptimized pass-by-reference types whether or not datum1 can be used
> (see my simulation of Kevin's worst case, for example [1]). Streaming
> through a not inconsiderable proportion of memtuples again is probably
> a lot worse. The datum1 optimization (which is not all that old) made
> a lot of sense when initially introduced, because it avoided chasing
> through a pointer for pass-by-value types. I think that's its sole
> justification, though.

Just to be clear -- I am blocked on this. Do you really prefer to restart copying heap tuples from scratch in the event of aborting, just to make sure that the datum1 representation is consistently either a pointer to text, or an abbreviated key? I don't think that the costs involved make that worth it, as I've said, but I'm not sure how to resolve that controversy.

I suggest that we focus on making sure the abort logic itself is sound. There probably hasn't been enough discussion of that. Once that is resolved, we can revisit the question of whether or not copying should restart to keep the datum1 representation consistent. I suspect that leaving that until later will be easier all around.

-- Peter Geoghegan
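For readers outside the thread, the technique under discussion can be sketched in a few lines of C. This is an illustrative toy, not PostgreSQL's sortsupport code: each tuple caches a fixed-size abbreviation of its key, comparisons are tried on the abbreviation first, and only ties fall back to the full value. It also shows why the representation must be consistent across all tuples -- an abbreviated key can only be compared against another abbreviated key, which is the crux of the "restart copying after abort" question.

```c
#include <string.h>

/* Toy sort tuple: an abbreviated key plus the authoritative value.
 * Names and layout are illustrative only. */
typedef struct
{
    unsigned int abbrev;    /* first 4 bytes of the string, big-endian */
    const char  *full;      /* full value, used to resolve ties */
} SortTuple;

/* Pack up to the first four bytes of a string into an unsigned int so
 * that unsigned comparison matches strcmp() ordering on those bytes. */
static unsigned int
abbreviate(const char *s)
{
    unsigned int v = 0;
    int          i;

    for (i = 0; i < 4 && s[i] != '\0'; i++)
        v |= (unsigned int) (unsigned char) s[i] << (8 * (3 - i));
    return v;
}

/* Compare abbreviations first; only equal abbreviations require
 * chasing the pointer to the full value. */
static int
cmp_tuples(const SortTuple *a, const SortTuple *b)
{
    if (a->abbrev != b->abbrev)
        return (a->abbrev < b->abbrev) ? -1 : 1;
    return strcmp(a->full, b->full);
}
```

If abbreviation were aborted partway through a copy, some tuples would hold abbreviations and others raw pointers in the same field, and a comparator like this would produce garbage -- hence the debate over restarting the copy versus tolerating the cost of keeping both paths.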
Re: [HACKERS] proposal: rounding up time value less than its unit.
On 9/23/14 11:55 PM, Gregory Smith wrote:
> Right now there are people out there who have configurations that look
> like this:
>
> log_rotation_age=60
>
> In order to get hourly rotation. These users will suffer some trauma
> should they upgrade to a version where this parameter now means a new
> log file pops every 60 seconds instead.

But then this proposal is just one of several others that break backward compatibility, and does so in a more or less silent way. Then we might as well pick another approach that gets closer to the root of the problem.
Re: [HACKERS] missing isinf declaration on solaris
On 9/24/14 4:26 PM, Tom Lane wrote:
> Peter Eisentraut writes:
>> On 9/24/14 9:21 AM, Tom Lane wrote:
>>> Agreed, but what about non-GCC compilers?
>> Stick AC_PROG_CC_C99 into configure.in.
>
> I think that's a bad idea, unless you mean to do it only on Solaris.
> If we do that unconditionally, we will pretty much stop getting any
> warnings about C99-isms on modern platforms. I am not aware that
> there has been any agreement to move our portability goalposts up
> to C99.

I don't disagree with that concern. But let's look at it this way. isinf() is a C99 function. If we want to have it, the canonical way is to put the compiler into C99 mode. Anything else is just pure luck (a.k.a. GNU extension). It's conceivable that on other platforms we fail to detect a system isinf() function because of that. The only way we learned about this is because the current configure check is inconsistent on Solaris.

The first thing to fix is the configure check. It shouldn't detect a function if it creates a warning when used. What will happen then is probably that isinf() isn't detected on Solaris, and we use the fallback implementation. Are we happy with that? Maybe. If not, well, then we need to put the compiler into C99 mode. But then it's not a Solaris-specific problem anymore, because Solaris will then behave like any other platform in this regard.
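For reference, the fallback in question can be tiny. Here is a sketch of a C89-compatible isinf() replacement; it is illustrative only and not necessarily identical to what PostgreSQL ships in src/port:

```c
#include <float.h>
#include <math.h>

/*
 * Sketch of a fallback isinf() for platforms whose libc does not expose
 * the C99 classification macro outside C99 mode.  On IEEE-754 systems,
 * an infinity compares strictly greater (or less) than every finite
 * double, including DBL_MAX; a NaN fails both comparisons, so it is
 * correctly reported as not infinite.
 */
static int
fallback_isinf(double x)
{
    return x > DBL_MAX || x < -DBL_MAX;
}
```

The point of Peter's argument is that whether this fallback or the system function gets used should fall out of an honest configure check, not out of whether a particular compiler happens to tolerate the declaration outside C99 mode.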
[HACKERS] Sloppy thinking about leakproof properties of opclass co-members
It strikes me that there's a significant gap in the whole "leakproof function" business, namely that no consideration has been given to planner-driven transformations of queries.

As an example, if we have "a = b" and "b = c", the planner may generate and apply "a = c" instead of one or both of those clauses. If a, b, c are not all the same type, "a = c" might involve an operator that's not either of the original ones. And it's possible that that operator is not leakproof where the original ones are. This could easily result in introducing non-leakproof operations into a secure subquery after pushdown of a clause that was marked secure.

Another example is that in attempting to make implication or refutation proofs involving operator clauses, the planner feels free to apply other members of the operator's btree opclass (if it's in one). I've not bothered to try to create a working exploit, but I'm pretty sure that this could result in a non-leakproof function being applied during planning of a supposedly secure subquery. It might be that that couldn't leak anything worse than constant values within the query tree, but perhaps it could leak data values from a protected table's pg_statistic entries.

ISTM that the most appropriate solution here is to insist that all or none of the members of an operator class be marked leakproof. (Possibly we could restrict that to btree opclasses, but I'm not sure any exception is needed for other index types.) I looked for existing violations of this precept, and unfortunately found a *lot*. For example, texteq is marked leakproof but its fellow text comparison operators aren't. Is that really sane?
Here's a query to find problematic cases:

select p1.proname, o1.oprname, p2.proname, o2.oprname, a1.amopfamily
from pg_proc p1, pg_operator o1, pg_amop a1,
     pg_proc p2, pg_operator o2, pg_amop a2
where p1.oid = o1.oprcode and p2.oid = o2.oprcode
  and o1.oid = a1.amopopr and o2.oid = a2.amopopr
  and a1.amopfamily = a2.amopfamily
  and p1.proleakproof and not p2.proleakproof;

Oh ... and just to add insult to injury, the underlying opclass support functions (such as textcmp and hashtext) are generally not marked leakproof even when the operators they may be executed instead of are. This is even more silly. regards, tom lane
Re: [HACKERS] interval typmodout is broken
Tom Lane wrote: > Alvaro Herrera writes: > > I just noticed when working on DDL deparsing that the typmodout routine > > for intervals is broken. The code uses > > > if (precision != INTERVAL_FULL_PRECISION) > snprintf(res, 64, "%s(%d)", fieldstr, precision); > else > snprintf(res, 64, "%s", fieldstr); > > > which puts the parenthesized number after the textual name; but the > > grammar only takes it the other way around. > > You sure about that? The grammar for INTERVAL is weird. I believe > the output we're trying to produce here is something like > > INTERVAL HOUR TO SECOND(2) > > where "fieldstr" would be " HOUR TO SECOND" and "precision" would > give the fractional-second precision. Well, I tested what is taken on input, and yes I agree the grammar is weird (but not more weird than timestamp/timestamptz, mind). The input function only accepts the precision just after the INTERVAL keyword, not after the fieldstr:

alvherre=# create table str (a interval(2) hour to minute);
CREATE TABLE
alvherre=# create table str2 (a interval hour to minute(2));
ERROR:  syntax error at or near "("
LINE 1: create table str2 (a interval hour to minute(2));
                                                     ^

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review of GetUserId() Usage
Alvaro, * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > I think the case for pgstat_get_backend_current_activity() and > pg_stat_get_activity and the other pgstatfuncs.c callers is easy to make > and seems acceptable to me; but I would leave pg_signal_backend out of > that discussion, because it has a potentially harmful side effect. By > requiring SET ROLE you add an extra layer of protection against > mistakes. (Hopefully, pg_signal_backend() is not a routine thing for > well-run systems, which means human intervention, and therefore the room > for error isn't insignificant.) While I certainly understand where you're coming from, I don't really buy into it. Yes, cancelling a query (the only thing normal users can do anyway - they can't terminate backends) could mean the loss of any in-progress work, but it's not like 'rm' and I don't see that it needs to require extra hoops for individuals to go through. Thanks! Stephen
Re: [HACKERS] interval typmodout is broken
Alvaro Herrera writes: > I just noticed when working on DDL deparsing that the typmodout routine > for intervals is broken. The code uses > if (precision != INTERVAL_FULL_PRECISION) > snprintf(res, 64, "%s(%d)", fieldstr, precision); > else > snprintf(res, 64, "%s", fieldstr); > which puts the parenthesized number after the textual name; but the > grammar only takes it the other way around. You sure about that? The grammar for INTERVAL is weird. I believe the output we're trying to produce here is something like INTERVAL HOUR TO SECOND(2) where "fieldstr" would be " HOUR TO SECOND" and "precision" would give the fractional-second precision. regards, tom lane
Re: [HACKERS] missing isinf declaration on solaris
Tom Lane wrote: > Peter Eisentraut writes: > > On 9/24/14 9:21 AM, Tom Lane wrote: > >> Agreed, but what about non-GCC compilers? > > > Stick AC_PROG_CC_C99 into configure.in. > > I think that's a bad idea, unless you mean to do it only on Solaris. > If we do that unconditionally, we will pretty much stop getting any > warnings about C99-isms on modern platforms. I am not aware that > there has been any agreement to move our portability goalposts up > to C99. AFAIK we cannot move all the way to C99, because MSVC doesn't support it. Presumably we're okay on the isinf() front because MSVC inherited it from somewhere else (it's on BSD 4.3 according to my Linux manpage), but other things are known not to work. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Immediate standby promotion
On 18 September 2014 01:22, Robert Haas wrote: >> "fast" promotion was actually a supported option in r8 of Postgres but >> this option was removed when we implemented streaming replication in >> r9.0 >> >> The *rough* requirement is sane, but that's not the same thing as >> saying this exact patch makes sense. > > Granted. Fair point. > >> If you are paused and you can see that WAL up ahead is damaged, then >> YES, you do want to avoid applying it. That is possible by setting a >> PITR target so that recovery stops at a precise location specified by >> you. As an existing option is it better than the blunt force trauma >> suggested here. > > You can pause at a recovery target, but then what if you want to go > read/write at that point? Or what if you've got a time-delayed > standby and you want to break replication so that it doesn't replay > the DROP TABLE students that somebody ran on the master? It doesn't > have to be that WAL is unreadable or corrupt; it's enough for it to > contain changes you wish to avoid replaying. > >> If you really don't care, just shutdown server, resetxlog and start >> her up - again, no need for new option. > > To me, being able to say "pg_ctl promote_right_now -m yes_i_mean_it" > seems like a friendlier interface than making somebody shut down the > server, run pg_resetxlog, and start it up again. It makes sense to go from paused --> promoted. It doesn't make sense to go from normal running --> promoted, since that is just random data loss. I very much understand the case where somebody is shouting "get the web site up, we are losing business". Implementing a feature that allows people to do exactly what they asked (go live now), but loses business transactions that we thought had been safely recorded is not good. It implements only the exact request, not its actual intention. Any feature that lumps both cases together is wrongly designed and will cause data loss. 
We go to a lot of trouble to ensure data is successfully on disk and in WAL. I won't give that up, nor do I want to make it easier to lose data than it already is. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] identify_locking_dependencies is broken for schema-only dumps
Robert Haas writes: > On Wed, Sep 24, 2014 at 12:18 PM, Tom Lane wrote: >> Sorry for not paying attention sooner. After studying it for awhile, >> I think the change is probably all right but your proposed comment is >> entirely inadequate. > If you don't like that version, can you suggest something you would like > better? Perhaps like this:

 * We assume the entry requires exclusive lock on each TABLE or TABLE DATA
 * item listed among its dependencies. Originally all of these would have
 * been TABLE items, but repoint_table_dependencies would have repointed
 * them to the TABLE DATA items if those are present (which they might not
 * be, eg in a schema-only dump). Note that all of the entries we are
 * processing here are POST_DATA; otherwise there might be a significant
 * difference between a dependency on a table and a dependency on its
 * data, so that closer analysis would be needed here.

regards, tom lane
Re: [HACKERS] RLS feature has been committed
Heikki, * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: > Some random comments after a quick read-through of the patch: Many thanks for this, again. I've pushed updates along with the fix for relcache which was identified by the buildfarm. Thanks, Stephen
[HACKERS] interval typmodout is broken
I just noticed when working on DDL deparsing that the typmodout routine for intervals is broken. The code uses

if (precision != INTERVAL_FULL_PRECISION)
    snprintf(res, 64, "%s(%d)", fieldstr, precision);
else
    snprintf(res, 64, "%s", fieldstr);

which puts the parenthesized number after the textual name; but the grammar only takes it the other way around. This has been wrong since commit 5725b9d9afc8 dated Dec 30 2006, which introduced the whole notion of type-specific typmod output functions. I don't understand how come nobody has noticed this in eight years. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] missing isinf declaration on solaris
Peter Eisentraut writes: > On 9/24/14 9:21 AM, Tom Lane wrote: >> Agreed, but what about non-GCC compilers? > Stick AC_PROG_CC_C99 into configure.in. I think that's a bad idea, unless you mean to do it only on Solaris. If we do that unconditionally, we will pretty much stop getting any warnings about C99-isms on modern platforms. I am not aware that there has been any agreement to move our portability goalposts up to C99. regards, tom lane
Re: [HACKERS] missing isinf declaration on solaris
On 9/24/14 9:21 AM, Tom Lane wrote: > Oskari Saarenmaa writes: >> ... so to enable XPG6 we'd need to use C99 mode anyway. > > OK. > >> Could we just use >> -std=gnu99 (with -fgnu89-inline if required) with GCC on Solaris? ISTM >> it would be cleaner to just properly enable c99 mode rather than define >> an undocumented macro to use a couple of c99 declarations. > > Agreed, but what about non-GCC compilers? Stick AC_PROG_CC_C99 into configure.in.
Re: [HACKERS] identify_locking_dependencies is broken for schema-only dumps
On Wed, Sep 24, 2014 at 12:18 PM, Tom Lane wrote: > Robert Haas writes: >> If there are no comments on this soon-ish, I'm going to push and >> back-patch the patch I attached. > > Sorry for not paying attention sooner. After studying it for awhile, > I think the change is probably all right but your proposed comment is > entirely inadequate. There are extremely specific reasons why this > works, and you removed an existing comment about that and replaced it > with nothing but a wishy-washy "maybe". Well, I could write something like this:

 * We assume the item requires exclusive lock on each TABLE or TABLE DATA
 * item listed among its dependencies. (This was originally a dependency on
 * the TABLE, but fix_dependencies may have repointed it to the data item.
 * In a schema-only dump, however, this will not have been done.)

If you don't like that version, can you suggest something you would like better? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Spinlocks and compiler/memory barriers
On 2014-09-09 17:54:03 -0400, Robert Haas wrote: > So, that's committed, then. I think we should pick something that uses > spinlocks and is likely to fail spectacularly if we haven't got this > totally right yet, and de-volatilize it. And then watch to see what > turns red in the buildfarm and/or which users start screaming. I'm > inclined to propose lwlock.c as a candidate, since that's very widely > used and a place where we know there's significant contention. Did you consider removing the volatiles from bufmgr.c? There's lots of volatiles in there and most of them don't seem to have been added in a principled way. I'm looking at my old patch for lockless pin/unpin of buffers and it'd look a lot cleaner without. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] add modulo (%) operator to pgbench
On 09/24/2014 09:34 PM, Fabien COELHO wrote: The idea of a modulo operator was not rejected, we'd just like to have the infrastructure in place first. Sigh. How to transform a trivial 10 lines patch into a probably 500+ lines project involving flex & bison & some non trivial data structures, and which may get rejected on any ground... That's how we get good features. It's very common for someone to need X, and to post a patch that does X. Other people pop up that need Y and Z which are similar, and you end up implementing those too. It's more work for you in the short term, but it benefits everyone in the long run. - Heikki
[HACKERS] json_object_agg return on no rows
Probably due to an oversight on my part, json_object_agg currently returns a json object with no fields rather than NULL, which is what the docs say it returns and which would be consistent with all other aggregates except count(). I don't think we ever discussed this, so it's probably just a slip up. I'm going to change it to return NULL in this case unless there is some objection. cheers andrew
Re: [HACKERS] better atomics - v0.6
On 2014-09-24 14:28:18 -0400, Tom Lane wrote: > Note that the spinlock code separates s_lock.h (hardware implementations) > from spin.h (a hardware-independent abstraction layer). Perhaps there's > room for a similar separation here. Luckily that separation exists ;). The hardware-dependent bits are in different files than the platform-independent functionality on top of them (atomics/generic.h). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] add modulo (%) operator to pgbench
The idea of a modulo operator was not rejected, we'd just like to have the infrastructure in place first. Sigh. How to transform a trivial 10 lines patch into a probably 500+ lines project involving flex & bison & some non trivial data structures, and which may get rejected on any ground... Maybe I'll set that as a student project. -- Fabien.
Re: [HACKERS] better atomics - v0.6
Heikki Linnakangas writes: > On 09/24/2014 07:57 PM, Andres Freund wrote: >> On 2014-09-24 12:44:09 -0400, Tom Lane wrote: >>> I think the question is more like "what in the world happened to confining >>> ourselves to a small set of atomics". >> I fail to see why the existence of a wrapper around compare-exchange >> (which is one of the primitives we'd agreed upon) runs counter to >> the agreement that we'll only rely on a limited number of atomics on the >> hardware level? > It might be a useful function, but if there's no hardware implementation > for it, it doesn't belong in atomics.h. We don't want to turn it into a > general library of useful little functions. Note that the spinlock code separates s_lock.h (hardware implementations) from spin.h (a hardware-independent abstraction layer). Perhaps there's room for a similar separation here. I tend to agree with Heikki that wrappers around compare-exchange ought not be conflated with compare-exchange itself, even if there might theoretically be architectures where the wrapper function could be implemented directly. regards, tom lane
Re: [HACKERS] better atomics - v0.6
On 2014-09-24 21:19:06 +0300, Heikki Linnakangas wrote: > On 09/24/2014 07:57 PM, Andres Freund wrote: > >On 2014-09-24 12:44:09 -0400, Tom Lane wrote: > >>Andres Freund writes: > >>>On 2014-09-24 18:55:51 +0300, Heikki Linnakangas wrote: > There doesn't seem to be any hardware implementations of that in the > patch. > Is there any architecture that has an instruction or compiler intrinsic > for > that? > >> > >>>You can implement it rather efficiently on ll/sc architectures. But I > >>>don't really think it matters. I prefer add_until (I've seen it named > >>>saturated add before as well) to live in the atomics code, rather than > >>>reimplement it in atomics employing code. I guess you see that > >>>differently? > >> > >>I think the question is more like "what in the world happened to confining > >>ourselves to a small set of atomics". > > > >I fail to see why the existence of a wrapper around compare-exchange > >(which is one of the primitives we'd agreed upon) runs counter to > >the agreement that we'll only rely on a limited number of atomics on the > >hardware level? > > It might be a useful function, but if there's no hardware implementation for > it, it doesn't belong in atomics.h. We don't want to turn it into a general > library of useful little functions. Uh. It belongs there because it *atomically* does a saturated add? That's precisely what atomics.h is about, so I don't see why it'd belong anywhere else. > >>I doubt either that this exists > >>natively anywhere, or that it's so useful that we should expect platforms > >>to have efficient implementations. > > Googling around, ARM seems to have a QADD instruction that does that. But > AFAICS it's not an atomic instruction that you could use for synchronization > purposes, just a regular instruction. On ARM - and many other LL/SC architectures - you can implement it atomically using LL/SC.
In pseudocode:

while (true)
{
    load_linked(mem, reg);
    if (reg + add < limit)
        reg = reg + add;
    else
        reg = limit;
    if (store_conditional(mem, reg))
        break;
}

That's how pretty much *all* atomic instructions work on such architectures. > >It's useful for my work to get rid of most LockBufHdr() calls (to > >manipulate usagecount locklessly). That's why I added it. We can delay > >it till that patch is ready, but I don't really see the benefit. > > Yeah, please leave it out for now, we can argue about it later. Even if we > want it in the future, it would be just dead, untested code now. Ok, will remove it for now. I won't repost a version with it removed, as removing a function as the only change doesn't seem to warrant reposting it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] add modulo (%) operator to pgbench
No, it depends totally on the application. For financial and physical inventory purposes where I have had occasion to use it, the properties which were important were: [...] Hmmm. Probably I'm biased towards my compiler with an integer linear flavor field, where C-like "%" is always a pain, however you look at it. I'm not sure of physical inventories with negative numbers though. In accounting, I thought that a negative number was a positive number with "debit" written above. In finance, no problem to get big deficits :-) Now about the use case, I'm not sure that you would like to write your financial and physical inventory stuff within a pgbench test script, whereas in such a script I do expect when doing a modulo with the size of a table to have a positive result so that I can expect to find a tuple there, hence the desired "positive remainder" property for negative dividends. -- Fabien.
Re: [HACKERS] better atomics - v0.6
On 09/24/2014 07:57 PM, Andres Freund wrote:
> On 2014-09-24 12:44:09 -0400, Tom Lane wrote:
>> Andres Freund writes:
>>> On 2014-09-24 18:55:51 +0300, Heikki Linnakangas wrote:
>>>> There doesn't seem to be any hardware implementations of that in the patch. Is there any architecture that has an instruction or compiler intrinsic for that?
>>> You can implement it rather efficiently on ll/sc architectures. But I don't really think it matters. I prefer add_until (I've seen it named saturated add before as well) to live in the atomics code, rather than reimplement it in atomics employing code. I guess you see that differently?
>> I think the question is more like "what in the world happened to confining ourselves to a small set of atomics".
> I fail to see why the existence of a wrapper around compare-exchange (which is one of the primitives we'd agreed upon) runs counter to the agreement that we'll only rely on a limited number of atomics on the hardware level?

It might be a useful function, but if there's no hardware implementation for it, it doesn't belong in atomics.h. We don't want to turn it into a general library of useful little functions.

>> I doubt either that this exists natively anywhere, or that it's so useful that we should expect platforms to have efficient implementations.

Googling around, ARM seems to have a QADD instruction that does that. But AFAICS it's not an atomic instruction that you could use for synchronization purposes, just a regular instruction.

> It's useful for my work to get rid of most LockBufHdr() calls (to manipulate usagecount locklessly). That's why I added it. We can delay it till that patch is ready, but I don't really see the benefit.

Yeah, please leave it out for now, we can argue about it later. Even if we want it in the future, it would be just dead, untested code now. - Heikki
Re: [HACKERS] add modulo (%) operator to pgbench
Kevin Grittner writes: > Assuming that all values are integers, for: > x = a / b; > y = a % b; > If b is zero either statement must generate an error. > If a and b have the same sign, x must be positive; else x must be negative. > It must hold that abs(x) is equal to abs(a) / abs(b). > It must hold that ((x * b) + y) is equal to a. Not sure about the third of those statements, but the last one is definitely a requirement. I think the only defensible choice, really, is that % should be defined so that a = ((a / b) * b) + (a % b). It is perfectly reasonable to provide other division/modulus semantics as functions, preferably in matching pairs that also satisfy this axiom. But the two operators need to agree, or you'll have surprised users. regards, tom lane
Re: [HACKERS] add modulo (%) operator to pgbench
Fabien COELHO wrote: >>> So my opinion is that this small modulo operator patch is both useful and >>> harmless, so it should be committed. >> >> You've really failed to make that case --- in particular, AFAICS there is >> not even consensus on the exact semantics that the operator should have. > > There is. Basically whatever with a positive remainder when the divisor is > positive is fine. The only one to avoid is the dividend signed version, > which happen to be the one of C and SQL, a very unfortunate choice in both > case as soon as you have negative numbers. No, it depends totally on the application. For financial and physical inventory purposes where I have had occasion to use it, the properties which were important were:

Assuming that all values are integers, for:

    x = a / b;
    y = a % b;

If b is zero either statement must generate an error.
If a and b have the same sign, x must be positive; else x must be negative.
It must hold that abs(x) is equal to abs(a) / abs(b).
It must hold that ((x * b) + y) is equal to a.

This is exactly what the languages I was using did, and I was glad. I find it convenient that C and SQL behave this way. You are proposing that these not all hold, which in a lot of situations could cause big problems. You seem familiar enough with your own use case that I believe you when you say you don't want these semantics, but that just points out the need to support both. >> Then we could add a modulo operator with whatever semantics seem >> most popular, and some function(s) for the other semantics, and >> there would not be so much riding on choosing the "right" semantics. > > The semantics is clear. I disagree. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] better atomics - v0.6
On 2014-09-24 12:44:09 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2014-09-24 18:55:51 +0300, Heikki Linnakangas wrote: > >> There doesn't seem to be any hardware implementations of that in the patch. > >> Is there any architecture that has an instruction or compiler intrinsic for > >> that? > > > You can implement it rather efficiently on ll/sc architectures. But I > > don't really think it matters. I prefer add_until (I've seen it named > > saturated add before as well) to live in the atomics code, rather than > > reimplement it in atomics employing code. I guess you see that > > differently? > > I think the question is more like "what in the world happened to confining > ourselves to a small set of atomics". I fail to see why the existence of a wrapper around compare-exchange (which is one of the primitives we'd agreed upon) runs counter to the agreement that we'll only rely on a limited number of atomics on the hardware level? > I doubt either that this exists > natively anywhere, or that it's so useful that we should expect platforms > to have efficient implementations. It's useful for my work to get rid of most LockBufHdr() calls (to manipulate usagecount locklessly). That's why I added it. We can delay it till that patch is ready, but I don't really see the benefit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Final Patch for GROUPING SETS
> "Heikki" == Heikki Linnakangas writes: Heikki> There's been a lot of discussion and I haven't followed it in Heikki> detail. Andrew, there were some open questions, but have you Heikki> gotten enough feedback so that you know what to do next? I was holding off on posting a recut patch with the latest EXPLAIN formatting changes (which are basically cosmetic) until it became clear whether RLS was likely to be reverted or kept (we have a tiny but irritating conflict with it, in the regression test schedule file where we both add to the same list of tests). Other than that there is nothing for Atri and me to do next but wait on a proper review. The feedback and discussion has been almost all about cosmetic details; the only actual issues found have been a trivial omission from pg_stat_statements, and a slightly suboptimal planning of sort steps, both long since fixed. What we have not had:

- anything more than a superficial review
- any feedback over the acceptability of our chained-sorts approach for doing aggregations with differing sort orders
- any decision about the question of reserved words and/or possibly renaming contrib/cube (and what new name to use if so)

-- Andrew (irc:RhodiumToad)
Re: [HACKERS] better atomics - v0.6
Andres Freund writes: > On 2014-09-24 18:55:51 +0300, Heikki Linnakangas wrote: >> There doesn't seem to be any hardware implementations of that in the patch. >> Is there any architecture that has an instruction or compiler intrinsic for >> that? > You can implement it rather efficiently on ll/sc architectures. But I > don't really think it matters. I prefer add_until (I've seen it named > saturated add before as well) to live in the atomics code, rather than > reimplement it in atomics employing code. I guess you see that > differently? I think the question is more like "what in the world happened to confining ourselves to a small set of atomics". I doubt either that this exists natively anywhere, or that it's so useful that we should expect platforms to have efficient implementations. regards, tom lane
Re: [HACKERS] identify_locking_dependencies is broken for schema-only dumps
Robert Haas writes: > If there are no comments on this soon-ish, I'm going to push and > back-patch the patch I attached. Sorry for not paying attention sooner. After studying it for awhile, I think the change is probably all right but your proposed comment is entirely inadequate. There are extremely specific reasons why this works, and you removed an existing comment about that and replaced it with nothing but a wishy-washy "maybe". regards, tom lane
Re: [HACKERS] better atomics - v0.6
On 2014-09-24 18:55:51 +0300, Heikki Linnakangas wrote:
> On 09/24/2014 03:37 PM, Andres Freund wrote:
>>> +/*
>>> + * pg_fetch_add_until_u32 - saturated addition to variable
>>> + *
>>> + * Returns the value of ptr after the arithmetic operation.
>>> + *
>>> + * Full barrier semantics.
>>> + */
>>> +STATIC_IF_INLINE uint32
>>> +pg_atomic_fetch_add_until_u32(volatile pg_atomic_uint32 *ptr, int32 add_,
>>> +                              uint32 until)
>>> +{
>>> +    CHECK_POINTER_ALIGNMENT(ptr, 4);
>>> +    return pg_atomic_fetch_add_until_u32_impl(ptr, add_, until);
>>> +}
>>>
>>> This was a surprise to me, I don't recall discussion of a "fetch-add-until"
>>> operation, and hadn't actually ever heard of it before.
>>
>> It was included from the first version on, and I'd mentioned it a couple times.
>
> There doesn't seem to be any hardware implementations of that in the patch.
> Is there any architecture that has an instruction or compiler intrinsic for
> that?

You can implement it rather efficiently on ll/sc architectures. But I don't really think it matters. I prefer add_until (I've seen it named saturated add before as well) to live in the atomics code, rather than reimplement it in atomics employing code. I guess you see that differently? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] better atomics - v0.6
On 09/24/2014 03:37 PM, Andres Freund wrote: > >+/* > >+ * pg_fetch_add_until_u32 - saturated addition to variable > >+ * > >+ * Returns the value of ptr after the arithmetic operation. > >+ * > >+ * Full barrier semantics. > >+ */ > >+STATIC_IF_INLINE uint32 > >+pg_atomic_fetch_add_until_u32(volatile pg_atomic_uint32 *ptr, int32 add_, > >+uint32 until) > >+{ > >+ CHECK_POINTER_ALIGNMENT(ptr, 4); > >+ return pg_atomic_fetch_add_until_u32_impl(ptr, add_, until); > >+} > >+ > >This was a surprise to me, I don't recall discussion of a "fetch-add-until" >operation, and hadn't actually ever heard of it before. It was included from the first version on, and I'd mentioned it a couple times. There don't seem to be any hardware implementations of that in the patch. Is there any architecture that has an instruction or compiler intrinsic for that? - Heikki
Re: [HACKERS] [v9.5] Custom Plan API
On Wed, Sep 17, 2014 at 7:40 PM, Kouhei Kaigai wrote: > At this moment, I revised the above portion of the patches. > create_custom_plan() was modified to call the "PlanCustomPath" callback > next to the initialization of tlist and clauses. > > It's probably the same as what you suggested. create_custom_plan() is mis-named. It's actually only applicable to the custom-scan case, because it's triggered by create_plan_recurse() getting a path node with a T_CustomScan pathtype. Now, we could change that; although in general create_plan_recurse() dispatches on pathtype, we could make CustomPath an exception; the top of that function could say if (IsA(best_path, CustomPath)) { /* do custom stuff */ }. But the problem with that idea is that, when the custom path is specifically a custom scan, rather than a join or some other thing, you want to do all of the same processing that's in create_scan_plan(). So I think what should happen is that create_plan_recurse() should handle T_CustomScan the same way it handles T_SeqScan, T_IndexScan, et al: by calling create_scan_plan(). The switch inside that function can then call a function create_customscan_plan() if it sees T_CustomScan. And that function will be simpler than the create_custom_plan() that you have now, and it will be named correctly, too. In ExplainNode(), I think sname should be set to "Custom Scan", not "Custom". And further down, the custom_name should be printed as "Custom Plan Provider" not just "Custom". setrefs.c has remaining handling for the scanrelid = 0 case; please remove that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] src/backend/replication/Makefile misses -I.
Hi, Because of the atomics patch I was building postgres with sun studio. Turns out vpath builds don't work in that scenario when building from git. The problem is that the replication Makefile override CPPFLAGS := -I$(srcdir) $(CPPFLAGS) includes the source directory, but not the current directory. The other Makefiles dealing with similar things do: override CPPFLAGS := -I. -I$(srcdir) $(CPPFLAGS) which looks right to me. The current override line is from 9cc2c182fc20d5 in reaction to #6073. Unless somebody protests I'm going to backpatch the -I. addition back to 9.1. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Tue, Sep 23, 2014 at 8:15 PM, Florian Weimer wrote: > * Ants Aasma: > >> CRC has exactly one hardware implementation in general purpose CPUs > > I'm pretty sure that's not true. Many general purpose CPUs have CRC > circuitry, and there must be some which also expose them as > instructions. I must eat my words here; indeed AMD processors starting from Bulldozer do implement the CRC32 instruction. However, according to Agner Fog, AMD's implementation has a 6 cycle latency and, more importantly, a throughput of one instruction per 6 cycles, while Intel's implementation on all CPUs except the new Atom has 3 cycle latency and 1 instruction/cycle throughput. This means that there still is a significant handicap for AMD platforms, not to mention Power or Sparc with no hardware support. Some ARMs implement CRC32, but I haven't researched what their performance is. Regards, Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
Re: [HACKERS] make pg_controldata accept "-D dirname"
Updated patches attached. -- Abhijit >From b3bd465357f96ebf1953b3a98f25fb51bac5eb26 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:26:00 +0530 Subject: Make pg_controldata ignore a -D before DataDir --- doc/src/sgml/ref/pg_controldata.sgml| 5 +++-- src/bin/pg_controldata/pg_controldata.c | 13 +++-- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/ref/pg_controldata.sgml b/doc/src/sgml/ref/pg_controldata.sgml index fbf40fc..b4be660 100644 --- a/doc/src/sgml/ref/pg_controldata.sgml +++ b/doc/src/sgml/ref/pg_controldata.sgml @@ -40,8 +40,9 @@ PostgreSQL documentation This utility can only be run by the user who initialized the cluster because it requires read access to the data directory. - You can specify the data directory on the command line, or use - the environment variable PGDATA. This utility supports the options + You can specify the data directory on the command line, with or without + -D or use the environment variable PGDATA. + This utility supports the options -V and --version, which print the pg_controldata version and exit. It also supports options -? 
and --help, which output the diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f815024..cfa6911 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -31,14 +31,19 @@ static void usage(const char *progname) { + const char *pgdata; + + pgdata = getenv("PGDATA"); + if (!pgdata) + pgdata = ""; + printf(_("%s displays control information of a PostgreSQL database cluster.\n\n"), progname); printf(_("Usage:\n")); printf(_(" %s [OPTION] [DATADIR]\n"), progname); printf(_("\nOptions:\n")); + printf(_(" -D DATADIR data directory (default: \"%s\")\n"), pgdata); printf(_(" -V, --version output version information, then exit\n")); printf(_(" -?, --help show this help, then exit\n")); - printf(_("\nIf no data directory (DATADIR) is specified, " - "the environment variable PGDATA\nis used.\n\n")); printf(_("Report bugs to .\n")); } @@ -120,7 +125,11 @@ main(int argc, char *argv[]) } if (argc > 1) + { DataDir = argv[1]; + if (strcmp(DataDir, "-D") == 0 && argc > 2) + DataDir = argv[2]; + } else DataDir = getenv("PGDATA"); if (DataDir == NULL) -- 1.9.1 >From 51c651a24db4f3b977fa38c08177acc062a5fb56 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:48:36 +0530 Subject: Make pg_resetxlog also accept -D datadir --- doc/src/sgml/ref/pg_resetxlog.sgml | 6 -- src/bin/pg_resetxlog/pg_resetxlog.c | 16 +++- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml index 0b53bd6..7b2386a 100644 --- a/doc/src/sgml/ref/pg_resetxlog.sgml +++ b/doc/src/sgml/ref/pg_resetxlog.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation pg_resetxlog + -D datadir -f -n -o oid @@ -30,7 +31,7 @@ PostgreSQL documentation -m mxid,mxid -O mxoff -l xlogfile - datadir + datadir @@ -55,7 +56,8 @@ PostgreSQL documentation This utility can only be run by the user who installed the server, because it requires read/write 
access to the data directory. - For safety reasons, you must specify the data directory on the command line. + For safety reasons, you must specify the data directory on the command line + (with or without -D). pg_resetxlog does not use the environment variable PGDATA. diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c index 302d005..9364e49 100644 --- a/src/bin/pg_resetxlog/pg_resetxlog.c +++ b/src/bin/pg_resetxlog/pg_resetxlog.c @@ -89,7 +89,7 @@ main(int argc, char *argv[]) MultiXactId set_oldestmxid = 0; char *endptr; char *endptr2; - char *DataDir; + char *DataDir = NULL; int fd; set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetxlog")); @@ -111,10 +111,14 @@ main(int argc, char *argv[]) } - while ((c = getopt(argc, argv, "fl:m:no:O:x:e:")) != -1) + while ((c = getopt(argc, argv, "D:fl:m:no:O:x:e:")) != -1) { switch (c) { + case 'D': +DataDir = optarg; +break; + case 'f': force = true; break; @@ -233,7 +237,7 @@ main(int argc, char *argv[]) } } - if (optind == argc) + if (DataDir == NULL && optind == argc) { fprintf(stderr, _("%s: no data directory specified\n"), progname); fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); @@ -257,7 +261,8 @@ main(int argc, char *argv[]) } #endif - DataDir = argv[optind]; + if (DataDir == NULL) + DataDir = argv[optind]; if (chdir(DataDir) < 0) { @@ -1077,8 +1082,9 @@ static void usage(void) { printf(_("%s resets the PostgreSQL transaction log.\n\n"), progname); - printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname); + pri
Re: [HACKERS] “Core” function in Postgres
On 09/24/2014 07:29 AM, Mingzhe Li wrote: > PS: I have the same post on stackoverflow. Since no one answered there, > I just report here. Thanks for mentioning it. In future, please include a link. You might want to wait more than a couple of hours too ;-) For reference, the Stack Overflow post is http://stackoverflow.com/q/26005359/398670 , where I answered in some detail. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Final Patch for GROUPING SETS
There's been a lot of discussion and I haven't followed it in detail. Andrew, there were some open questions, but have you gotten enough feedback so that you know what to do next? I'm trying to get this commitfest to an end, and this is still in "Needs Review" state... - Heikki
Re: [HACKERS] identify_locking_dependencies is broken for schema-only dumps
On Fri, Sep 19, 2014 at 11:23 AM, Robert Haas wrote: > This can lead to deadlocks during parallel restore. Test case: > > create table bar (a int primary key, b int); > create table baz (z int, a int references bar); > create view foo as select a, b, sum(1) from bar group by a union all > select z, a, 0 from baz; > > I dumped this with: pg_dump -Fc -s -f test.dmp > Then: while (dropdb rhaas; createdb; pg_restore -O -d rhaas -j3 > test.dmp); do true; done > > This quickly fails for me with: > > pg_restore: [archiver (db)] Error while PROCESSING TOC: > pg_restore: [archiver (db)] Error from TOC entry 2155; 2606 47822 FK > CONSTRAINT baz_a_fkey rhaas > pg_restore: [archiver (db)] could not execute query: ERROR: deadlock detected > DETAIL: Process 81791 waits for AccessExclusiveLock on relation 47862 > of database 47861; blocked by process 81789. > Process 81789 waits for AccessShareLock on relation 47865 of database > 47861; blocked by process 81791. > HINT: See server log for query details. > Command was: ALTER TABLE ONLY baz > ADD CONSTRAINT baz_a_fkey FOREIGN KEY (a) REFERENCES bar(a); > WARNING: errors ignored on restore: 2 > > The attached patch seems to fix it for me. > > Comments? If there are no comments on this soon-ish, I'm going to push and back-patch the patch I attached. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] "Core" function in Postgres
On Tue, Sep 23, 2014 at 7:02 PM, Michael Paquier wrote: > On Wed, Sep 24, 2014 at 8:29 AM, Mingzhe Li wrote: >> Hi experts, >> >> I want to know what's the "core" function used in Postgres server? I am >> looking for something corresponding to main() in a simple C program. I want >> to know the file path and the function name. I am using Postgres 9.3.5, >> however I assume the "core" function will be unchanged between different >> revisions. > > In your question, it seems that you are looking for the main() call > for the binary postgres, which is located in src/backend/main/main.c. > At the bottom of main() you'll see as well a set of functions like > PostmasterMain or PostgresMain that really define the startup path > used. PostmasterMain for example starts the postmaster, that is then > able to itself start backend processes through PostgresMain()... As noted, main() is in main.c. Most of the interesting stuff that happens in the main execution loop for the backend is in tcop/postgres.c (tcop being shorthand for 'traffic cop' -- probably not a great name but it hails from the very early days of the project). That kinda answers your other question: these function names rarely change except as needed to meet new requirements. merlin
Re: [HACKERS] make pg_controldata accept "-D dirname"
Abhijit Menon-Sen wrote: > P.S. Trivia: I can't find any other options in Postgres where the > argument is mandatory but the option is optional. psql behaves similarly with its -d and -U switches; note --help: Usage: psql [OPTION]... [DBNAME [USERNAME]] ... -d, --dbname=DBNAME database name to connect to (default: "alvherre") ... -U, --username=USERNAME database user name (default: "alvherre") Both are optional and have defaults. Datadir for pg_controldata is also optional and --help contains this blurb: If no data directory (DATADIR) is specified, the environment variable PGDATA is used. I think you could replace those lines with -D DATADIR data directory (default: "value of $PGDATA") But psql --help doesn't use $PGUSER to expand the default value for -U; I'm not clear if this means we shouldn't expand PGDATA in the help line for pg_controldata, or need to fix psql to expand PGUSER in -U. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] missing isinf declaration on solaris
On 24.09.2014 16:21, Tom Lane wrote: Oskari Saarenmaa writes: ... so to enable XPG6 we'd need to use C99 mode anyway. OK. Could we just use -std=gnu99 (with -fgnu89-inline if required) with GCC on Solaris? ISTM it would be cleaner to just properly enable c99 mode rather than define an undocumented macro to use a couple of c99 declarations. Agreed, but what about non-GCC compilers? Solaris Studio defaults to "-xc99=all,no_lib" which, according to the man page, enables c99 language features but doesn't use c99 standard library semantics. isinf is defined to be a macro by c99 and doesn't require changing the c99 mode, so I'd just keep using the defaults with Solaris Studio for now. For GCC --- a/src/template/solaris +++ b/src/template/solaris @@ -0,0 +1,4 @@ +if test "$GCC" = yes ; then + CPPFLAGS="$CPPFLAGS -std=gnu99" +fi + gets rid of the warnings and passes tests. / Oskari
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 09/24/2014 04:49 PM, Abhijit Menon-Sen wrote: At 2014-09-24 09:25:12 -0400, t...@sss.pgh.pa.us wrote: What's so hard about [ -D ] before the datadir argument? I'm sorry to have given you the impression that I objected to it because it was hard. Since I proposed the same thing a few lines after what you quoted, I guess I have to agree it's not hard. I think it's pointless, because if you're going to look at the usage message to begin with, why not do what it already says? But I don't care enough to argue about it any further. There's also the reverse situation: you see that a script contains a line like "pg_controldata -D foo". You're accustomed to doing just "pg_controldata foo", and you wonder what the -D option does. So you look it up in the docs or "pg_controldata --help". - Heikki
Re: [HACKERS] make pg_controldata accept "-D dirname"
At 2014-09-24 09:25:12 -0400, t...@sss.pgh.pa.us wrote: > > What's so hard about [ -D ] before the datadir argument? I'm sorry to have given you the impression that I objected to it because it was hard. Since I proposed the same thing a few lines after what you quoted, I guess I have to agree it's not hard. I think it's pointless, because if you're going to look at the usage message to begin with, why not do what it already says? But I don't care enough to argue about it any further. Patches attached. -- Abhijit P.S. Trivia: I can't find any other options in Postgres where the argument is mandatory but the option is optional. >From 69d7386ab59ca9df74af5abe5a5c3cf5a93e64bb Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:26:00 +0530 Subject: Make pg_controldata ignore a -D before DataDir --- src/bin/pg_controldata/pg_controldata.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f815024..4386283 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -33,7 +33,7 @@ usage(const char *progname) { printf(_("%s displays control information of a PostgreSQL database cluster.\n\n"), progname); printf(_("Usage:\n")); - printf(_(" %s [OPTION] [DATADIR]\n"), progname); + printf(_(" %s [OPTION] [[-D] DATADIR]\n"), progname); printf(_("\nOptions:\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" -?, --help show this help, then exit\n")); @@ -120,7 +120,11 @@ main(int argc, char *argv[]) } if (argc > 1) + { DataDir = argv[1]; + if (strcmp(DataDir, "-D") == 0 && argc > 2) + DataDir = argv[2]; + } else DataDir = getenv("PGDATA"); if (DataDir == NULL) -- 1.9.1 >From 388b5009281184051398657449e649ec9585a242 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:48:36 +0530 Subject: Make pg_resetxlog also accept -D datadir --- 
src/bin/pg_resetxlog/pg_resetxlog.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c index 302d005..8ff5df0 100644 --- a/src/bin/pg_resetxlog/pg_resetxlog.c +++ b/src/bin/pg_resetxlog/pg_resetxlog.c @@ -89,7 +89,7 @@ main(int argc, char *argv[]) MultiXactId set_oldestmxid = 0; char *endptr; char *endptr2; - char *DataDir; + char *DataDir = NULL; int fd; set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetxlog")); @@ -111,10 +111,14 @@ main(int argc, char *argv[]) } - while ((c = getopt(argc, argv, "fl:m:no:O:x:e:")) != -1) + while ((c = getopt(argc, argv, "D:fl:m:no:O:x:e:")) != -1) { switch (c) { + case 'D': +DataDir = optarg; +break; + case 'f': force = true; break; @@ -233,7 +237,7 @@ main(int argc, char *argv[]) } } - if (optind == argc) + if (DataDir == NULL && optind == argc) { fprintf(stderr, _("%s: no data directory specified\n"), progname); fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); @@ -257,7 +261,8 @@ main(int argc, char *argv[]) } #endif - DataDir = argv[optind]; + if (DataDir == NULL) + DataDir = argv[optind]; if (chdir(DataDir) < 0) { @@ -1077,7 +1082,7 @@ static void usage(void) { printf(_("%s resets the PostgreSQL transaction log.\n\n"), progname); - printf(_("Usage:\n %s [OPTION]... DATADIR\n\n"), progname); + printf(_("Usage:\n %s [OPTION]... [-D] DATADIR\n\n"), progname); printf(_("Options:\n")); printf(_(" -e XIDEPOCH set next transaction ID epoch\n")); printf(_(" -f force update to be done\n")); -- 1.9.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] delta relations in AFTER triggers
On 09/24/2014 12:22 AM, Kevin Grittner wrote: Heikki Linnakangas wrote: instead of passing parameters to the SPI calls individually, you invented SPI_register_tuplestore which affects all subsequent SPI calls. All subsequent SPI calls on that particular SPI connection until it is closed, except for any tuplestores that are later unregistered. Nested SPI connections do not automatically inherit the named tuplestores; whatever code opens an SPI connection would need to register them for the new context, if desired. This seemed to me to provide minimal disruption to the existing SPI callers who might want to use this. Yeah, I got that. And note that I'm not saying that's necessarily a bad design per se - it's just that it's different from the way parameters work, and I don't like it for that reason. You could imagine doing the same for parameters; have a SPI_register_param() function that you could use to register parameter types, and the parameters could then be referenced in any SPI calls that follow (within the same connection). But as the code stands, SPI is stateless wrt. parameters, and tuplestores or relation parameters should follow the lead. The next question is how to pass the new hooks and tuplestores. However, there doesn't seem to be any way to do one-shot queries with a ParserSetupHook and ParamListInfo. That seems like an oversight, and it would be nice to address that anyway. There are dozens of SPI_prepare* and SPI_exec* calls scattered all over the backend, pl, and contrib code which appear to get along fine without that. Yeah. None of the current callers have apparently needed that functionality. But it's not hard to imagine cases where it would be needed. For example, imagine a variant of EXECUTE '...' where all the PL/pgSQL variables can be used in the query, like they can in static queries: declare myvar int4; tablename text; begin ...
EXECUTE 'insert into ' || tablename ||' values (myvar)'; end; Currently you have to use $1 in place of the variable name, and pass the variable's current value with USING. If you wanted to make the above work, you would need a variant of SPI_execute that can run a one-shot query, with a parser-hook. Whether you want to use a parser-hook or not is orthogonal to whether or not you want to run a one-shot query or prepare it and keep the plan. Partly it may be because it involves something of a modularity violation; the comment for the function used for this call (where it *is* used) says: /* * plpgsql_parser_setup set up parser hooks for dynamic parameters * * Note: this routine, and the hook functions it prepares for, are logically * part of plpgsql parsing. But they actually run during function execution, * when we are ready to evaluate a SQL query or expression that has not * previously been parsed and planned. */ No, that's something completely different. The comment points out that even though plpgsql_parser_setup is in pl_comp.c, which contains code related to compiling a PL/pgSQL function, it's actually called at execution time, not compilation time. Can you clarify what benefit you see to modifying the SPI API the way you suggest, and what impact it might have on existing calling code? Well, we'll have to keep the existing functions anyway, to avoid breaking 3rd party code that uses them, so there would be no impact on existing code. The benefit would be that you could use the parser hooks and the ParamListInfo struct even when doing a one-shot query. Or perhaps you could just use SPI_prepare_params + SPI_execute_plan_with_paramlist even for one-shot queries. There is some overhead when a SPIPlan has to be allocated, but maybe it's not big enough to worry about. That would be worth measuring before adding new functions to the SPI. PS. the copy/read/write functions for TuplestoreRelation in the patch are broken; TuplestoreRelation is not a subclass of Plan.
I'm not sure what you mean by "broken" -- could you elaborate? Sure: + /* + * _copyTuplestoreRelation + */ + static TuplestoreRelation * + _copyTuplestoreRelation(const TuplestoreRelation *from) + { + TuplestoreRelation *newnode = makeNode(TuplestoreRelation); + + /* +* copy node superclass fields +*/ + CopyPlanFields((const Plan *) from, (Plan *) newnode); + + /* +* copy remainder of node +*/ + COPY_STRING_FIELD(refname); + + return newnode; + } You cast the TuplestoreRelation to Plan, and pass it to CopyPlanFields. That will crash, because TuplestoreRelation is nothing like a Plan: + /* + * TuplestoreRelation - + * synthetic node for tuplestore passed in to the query by name + * + * This is initially added to support trigger transition tables, but may find + * other uses, so we try to keep it generic. + */ + typedef struct TuplestoreRelation + { + NodeTag type; + char *refname; + } TuplestoreRelat
Re: [HACKERS] make pg_controldata accept "-D dirname"
Abhijit Menon-Sen writes: > At 2014-09-24 16:02:29 +0300, hlinnakan...@vmware.com wrote: >> Thanks. Please update the docs and usage(), too. > I'm sorry, but I don't think it would be an improvement to make the docs > explain that it's valid to use either "-D datadir" or specify it without > an option. What's so hard about [ -D ] before the datadir argument? regards, tom lane
Re: [HACKERS] missing isinf declaration on solaris
Oskari Saarenmaa writes: > ... so to enable XPG6 we'd need to use C99 mode anyway. OK. > Could we just use > -std=gnu99 (with -fgnu89-inline if required) with GCC on Solaris? ISTM > it would be cleaner to just properly enable c99 mode rather than define > an undocumented macro to use a couple of c99 declarations. Agreed, but what about non-GCC compilers? regards, tom lane
Re: [HACKERS] make pg_controldata accept "-D dirname"
To put it another way, I doubt any of the people who were surprised by it looked at either the usage message or the docs. ;-) -- Abhijit
Re: [HACKERS] missing isinf declaration on solaris
On 2014-09-24 08:25:34 -0400, Tom Lane wrote: > I'm worried that __C99FEATURES__ might do other, not-so-C89-compatible > things in later Solaris releases. Possibly that risk could be addressed > by having src/template/solaris make an OS version check before adding the > switch, but it'd be a bit painful probably. As we want to actually be able to compile with a C99 compiler I don't see much harm in that? Many compilers these days have C99 stuff enabled by default anyway... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] make pg_controldata accept "-D dirname"
At 2014-09-24 16:02:29 +0300, hlinnakan...@vmware.com wrote: > > Thanks. Please update the docs and usage(), too. I'm sorry, but I don't think it would be an improvement to make the docs explain that it's valid to use either "-D datadir" or specify it without an option. If both commands were changed to *require* "-D datadir", that would be worth documenting. But of course I didn't want to change any behaviour for people who have a better memory than I do. I think it's all right as a quick hack to remove an inconsistency that has clearly annoyed many people, but not much more. Would you be OK with my just sticking a "[-D]" before DATADIR in the usage() message for both programs, with no further explanation? (But to be honest, I don't think even that is necessary or desirable.) -- Abhijit
Re: [HACKERS] missing isinf declaration on solaris
On 24.09.2014 15:25, Tom Lane wrote: Oskari Saarenmaa writes: GCC 4.9 build on Solaris 10 shows these warnings about isinf: float.c: In function 'is_infinite': float.c:178:2: warning: implicit declaration of function 'isinf' Ugh. isinf declaration is in which is included by , but it's surrounded by #if defined(_STDC_C99) || _XOPEN_SOURCE - 0 >= 600 || defined(__C99FEATURES__). A couple of quick Google searches suggest that some other projects have worked around this by always defining __C99FEATURES__ even if compiling in C89 mode. __C99FEATURES__ is only used by math.h and fenv.h in /usr/include. Should we just add -D__C99FEATURES__ to CPPFLAGS in src/template/solaris, add our own declaration of isinf(), or do something else about the warning? I'm worried that __C99FEATURES__ might do other, not-so-C89-compatible things in later Solaris releases. Possibly that risk could be addressed by having src/template/solaris make an OS version check before adding the switch, but it'd be a bit painful probably. Based on the #if you show, I'd be more inclined to think about defining _XOPEN_SOURCE to get the result. There is precedent for that in src/template/hpux which does CPPFLAGS="$CPPFLAGS -D_XOPEN_SOURCE_EXTENDED" I forget what-all _XOPEN_SOURCE_EXTENDED enables exactly, but if memory serves there were a nontrivial number of now-considered-standard features turned on by that in ancient HPUX releases. If you want to pursue this route, you'd need to do a bit of digging to see what else _XOPEN_SOURCE controls in Solaris and if there is some front-end feature macro (like _XOPEN_SOURCE_EXTENDED) that you're supposed to set instead of touching it directly. Looking at standards(5) and /usr/include/sys/feature_tests.h it looks like _XOPEN_SOURCE_EXTENDED enables the XPG4v2 environment. _XOPEN_SOURCE=600 enables XPG6, but feature_tests.h also has this bit: /* * It is invalid to compile an XPG3, XPG4, XPG4v2, or XPG5 application * using c99.
The same is true for POSIX.1-1990, POSIX.2-1992, POSIX.1b, * and POSIX.1c applications. Likewise, it is invalid to compile an XPG6 * or a POSIX.1-2001 application with anything other than a c99 or later * compiler. Therefore, we force an error in both cases. */ so to enable XPG6 we'd need to use C99 mode anyway. Could we just use -std=gnu99 (with -fgnu89-inline if required) with GCC on Solaris? ISTM it would be cleaner to just properly enable c99 mode rather than define an undocumented macro to use a couple of c99 declarations. / Oskari
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 09/24/2014 02:21 PM, Abhijit Menon-Sen wrote: At 2014-09-24 14:03:41 +0300, hlinnakan...@vmware.com wrote: Ah, I frequently run into that too; but with pg_resetxlog. Well, that's no fun. Patch attached. Thanks. Please update the docs and usage(), too. - Heikki
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 24 September 2014 12:04, Magnus Hagander wrote: > On Wed, Sep 24, 2014 at 12:59 PM, Abhijit Menon-Sen > wrote: > > Hi. > > > > I can never remember that pg_controldata takes only a dirname rather > > than "-D dirname", and I gather I'm not the only one. Here's a tiny > > patch to make it work either way. As far as I can see, it doesn't > > break anything, not even if you have a data directory named "-D". > > I haven't looked at the code, but definitely +1 for the feature. > That's really quite annoying. > And here I was thinking it was just me. +1 Thom
Re: [HACKERS] pg_dump bug in 9.4beta2 and HEAD
Heikki Linnakangas wrote: > On 09/24/2014 01:50 PM, Thom Brown wrote: > >>I am sending two patches > >> > >>first is fast fix > >> > >>second fix is implementation of Heikki' idea. > > > >I'm guessing this issue is still unresolved? It would be nice to get this > >off the open items list. > > Yeah, I had completely forgotten about this. Alvaro, could you > finish this off? Ah, I had forgotten too. Sure, will look into it shortly. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] make pg_controldata accept "-D dirname"
Magnus Hagander wrote: > On Wed, Sep 24, 2014 at 12:59 PM, Abhijit Menon-Sen > wrote: > > Hi. > > > > I can never remember that pg_controldata takes only a dirname rather > > than "-D dirname", and I gather I'm not the only one. Here's a tiny > > patch to make it work either way. As far as I can see, it doesn't > > break anything, not even if you have a data directory named "-D". > > I haven't looked at the code, but definitely +1 for the feature. > That's really quite annoying. +1 -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] better atomics - v0.6
On 2014-09-24 14:59:19 +0300, Heikki Linnakangas wrote: > On 09/23/2014 12:01 AM, Andres Freund wrote: > >Hi, > > > >I've finally managed to incorporate (all the?) feedback I got for > >0.5. Imo the current version looks pretty good. > > Thanks! I agree it looks good. Some random comments after a quick > read-through: > > There are some spurious whitespace changes in spin.c. Huh. Unsure how they crept in. Will fix. > >+ * These files can provide the full set of atomics or can do pretty much > >+ * nothing if all the compilers commonly used on these platforms provide > >+ * useable generics. It will often make sense to define memory barrier > >+ * semantics in those, since e.g. the compiler intrinsics for x86 memory > >+ * barriers can't know we don't need read/write barriers anything more than > >a > >+ * compiler barrier. > > I didn't understand the latter sentence. Hm. The background here is that postgres doesn't need memory barriers to affect sse et al registers - which is why read/write barriers currently can be defined to be empty on TSO architectures like x86. But e.g. gcc's barrier intrinsics don't assume that, so they do a mfence or lock xaddl for read/write barriers. Which isn't cheap. Will try to find a way to rephrase. > >+ * Don't add an inline assembly of the actual atomic operations if all the > >+ * common implementations of your platform provide intrinsics. Intrinsics > >are > >+ * much easier to understand and potentially support more architectures. > > What about uncommon implementations, then? I think we need to provide inline > assembly if any supported implementation on the platform does not provide > intrinsics, otherwise we fall back to emulation. Or is that exactly what > you're envisioning we do? Yes, that's what I'm envisioning. E.g. old versions of solaris studio don't provide compiler intrinsics and also don't provide reliable ways to get barrier semantics. It doesn't seem worthwhile to add an inline assembly version for that special case.
> I believe such a situation hasn't come up on the currently supported > platforms, so we don't necessary have to have a solution for that, but the > comment doesn't feel quite right either. It's the reason I added a inline assembly version of atomics for x86 gcc... > >+#define CHECK_POINTER_ALIGNMENT(ptr, bndr) \ > >+Assert(TYPEALIGN((uintptr_t)(ptr), bndr)) > > Would be better to call this AssertAlignment, to make it clear that this is > an assertion, not something like CHECK_FOR_INTERRUPTS(). And perhaps move > this to c.h where other assertions are - this seems generally useful. Sounds good. > >+ * Spinloop delay - > > Spurious dash. Probably should rather add a bit more explanation... > >+/* > >+ * pg_fetch_add_until_u32 - saturated addition to variable > >+ * > >+ * Returns the the value of ptr after the arithmetic operation. > >+ * > >+ * Full barrier semantics. > >+ */ > >+STATIC_IF_INLINE uint32 > >+pg_atomic_fetch_add_until_u32(volatile pg_atomic_uint32 *ptr, int32 add_, > >+ uint32 until) > >+{ > >+CHECK_POINTER_ALIGNMENT(ptr, 4); > >+return pg_atomic_fetch_add_until_u32_impl(ptr, add_, until); > >+} > >+ > > This was a surprise to me, I don't recall discussion of an "fetch-add-until" > operation, and hadn't actually ever heard of it before. It was included from the first version on, and I'd mentioned it a couple times. > None of the subsequent patches seem to use it either. Can we just > leave this out? I have a further patch (lockless buffer header manipulation) that uses it to manipulate the usagecount. I can add it back later if you feel strongly about it. > s/the the/the/ > > >+#ifndef PG_HAS_ATOMIC_WRITE_U64 > >+#define PG_HAS_ATOMIC_WRITE_U64 > >+static inline void > >+pg_atomic_write_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val) > >+{ > >+/* > >+ * 64 bit writes aren't safe on all platforms. In the generic > >+ * implementation implement them as an atomic exchange. That might > >store a > >+ * 0, but only if the prev. value also was a 0. 
> >+ */ > >+pg_atomic_exchange_u64_impl(ptr, val); > >+} > >+#endif > > Why is 0 special? That bit should actually be in the read_u64 implementation, not write... If you do a compare and exchange you have to provide an 'expected' value. So, if you do a compare and exchange and the previous value is 0 the value will be written superfluously - but the actual value won't change. > >+/* > >+ * Can't run the test under the semaphore emulation, it doesn't handle > >+ * checking edge cases well. > >+ */ > >+#ifndef PG_HAVE_ATOMIC_FLAG_SIMULATION > >+test_atomic_flag(); > >+#endif > > Huh? Is the semaphore emulation broken, then? No, it's not. The semaphore emulation doesn't implement pg_atomic_unlocked_test_flag() (as there's no sane way to do that with semaphores that I know of). Instead it always returns true. Which disturbs the test. I guess I'll expand that
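The "fetch-add-until" (saturated addition) operation discussed above can be sketched as a compare-and-exchange loop. The following is a stand-alone approximation using C11 atomics: the name and semantics mirror the pg_atomic_fetch_add_until_u32 described in the patch, but this is an illustrative version, not the backend's code.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <assert.h>

/*
 * Saturated atomic addition: add 'add_' to *ptr, but never past 'until'.
 * Returns the value of *ptr after the operation, matching the comment in
 * the quoted patch.  Sketch of the technique only.
 */
uint32_t
fetch_add_until_u32(_Atomic uint32_t *ptr, int32_t add_, uint32_t until)
{
	uint32_t	old = atomic_load(ptr);

	for (;;)
	{
		uint32_t	newval;

		if (old >= until)
			return old;			/* already saturated, nothing to do */

		newval = old + (uint32_t) add_;
		if (newval > until)
			newval = until;		/* clamp at the limit */

		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ptr, &old, newval))
			return newval;
	}
}
```

This is the shape of operation one would use to bump a buffer usage count that is capped at a maximum (e.g. 5), which is the lockless-buffer-header use case Andres mentions.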
Re: [HACKERS] missing isinf declaration on solaris
Oskari Saarenmaa writes: > GCC 4.9 build on Solaris 10 shows these warnings about isinf: > float.c: In function 'is_infinite': > float.c:178:2: warning: implicit declaration of function 'isinf' Ugh. > isinf declaration is in which is included by , > but it's surrounded by #if defined(_STDC_C99) || _XOPEN_SOURCE - 0 >= > 600 || defined(__C99FEATURES__). A couple of quick Google searches > suggests that some other projects have worked around this by always > defining __C99FEATURES__ even if compiling in C89 mode. __C99FEATURES__ > is only used by math.h and fenv.h in /usr/include. > Should we just add -D__C99FEATURES__ to CPPFLAGS in > src/template/solaris, add our own declaration of isinf() or do something > else about the warning? I'm worried that __C99FEATURES__ might do other, not-so-C89-compatible things in later Solaris releases. Possibly that risk could be addressed by having src/template/solaris make an OS version check before adding the switch, but it'd be a bit painful probably. Based on the #if you show, I'd be more inclined to think about defining _XOPEN_SOURCE to get the result. There is precedent for that in src/template/hpux which does CPPFLAGS="$CPPFLAGS -D_XOPEN_SOURCE_EXTENDED" I forget what-all _XOPEN_SOURCE_EXTENDED enables exactly, but if memory serves there were a nontrivial number of now-considered-standard features turned on by that in ancient HPUX releases. If you want to pursue this route, you'd need to do a bit of digging to see what else _XOPEN_SOURCE controls in Solaris and if there is some front-end feature macro (like _XOPEN_SOURCE_EXTENDED) that you're supposed to set instead of touching it directly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] better atomics - v0.6
On 09/23/2014 12:01 AM, Andres Freund wrote: Hi, I've finally managed to incorporate (all the?) feedback I got for 0.5. Imo the current version looks pretty good. Thanks! I agree it looks good. Some random comments after a quick read-through: There are some spurious whitespace changes in spin.c. + * These files can provide the full set of atomics or can do pretty much + * nothing if all the compilers commonly used on these platforms provide + * useable generics. It will often make sense to define memory barrier + * semantics in those, since e.g. the compiler intrinsics for x86 memory + * barriers can't know we don't need read/write barriers anything more than a + * compiler barrier. I didn't understand the latter sentence. + * Don't add an inline assembly of the actual atomic operations if all the + * common implementations of your platform provide intrinsics. Intrinsics are + * much easier to understand and potentially support more architectures. What about uncommon implementations, then? I think we need to provide inline assembly if any supported implementation on the platform does not provide intrinsics, otherwise we fall back to emulation. Or is that exactly what you're envisioning we do? I believe such a situation hasn't come up on the currently supported platforms, so we don't necessarily have to have a solution for that, but the comment doesn't feel quite right either. +#define CHECK_POINTER_ALIGNMENT(ptr, bndr) \ + Assert(TYPEALIGN((uintptr_t)(ptr), bndr)) Would be better to call this AssertAlignment, to make it clear that this is an assertion, not something like CHECK_FOR_INTERRUPTS(). And perhaps move this to c.h where other assertions are - this seems generally useful. + * Spinloop delay - Spurious dash. +/* + * pg_fetch_add_until_u32 - saturated addition to variable + * + * Returns the the value of ptr after the arithmetic operation. + * + * Full barrier semantics.
+ */ +STATIC_IF_INLINE uint32 +pg_atomic_fetch_add_until_u32(volatile pg_atomic_uint32 *ptr, int32 add_, + uint32 until) +{ + CHECK_POINTER_ALIGNMENT(ptr, 4); + return pg_atomic_fetch_add_until_u32_impl(ptr, add_, until); +} + This was a surprise to me, I don't recall discussion of an "fetch-add-until" operation, and hadn't actually ever heard of it before. None of the subsequent patches seem to use it either. Can we just leave this out? s/the the/the/ +#ifndef PG_HAS_ATOMIC_WRITE_U64 +#define PG_HAS_ATOMIC_WRITE_U64 +static inline void +pg_atomic_write_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val) +{ + /* +* 64 bit writes aren't safe on all platforms. In the generic +* implementation implement them as an atomic exchange. That might store a +* 0, but only if the prev. value also was a 0. +*/ + pg_atomic_exchange_u64_impl(ptr, val); +} +#endif Why is 0 special? +* fail or suceed, but always return the old value s/suceed/succeed/. Add a full stop to end. + /* +* Can't run the test under the semaphore emulation, it doesn't handle +* checking edge cases well. +*/ +#ifndef PG_HAVE_ATOMIC_FLAG_SIMULATION + test_atomic_flag(); +#endif Huh? Is the semaphore emulation broken, then? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 24 September 2014 17:15, Michael Paquier wrote: > On Wed, Sep 24, 2014 at 7:59 PM, Abhijit Menon-Sen > wrote: > > I can never remember that pg_controldata takes only a dirname rather > > than "-D dirname", and I gather I'm not the only one. Here's a tiny > > patch to make it work either way. As far as I can see, it doesn't > > break anything, not even if you have a data directory named "-D". > +1. I forget it all the time as well... +1, I always get confused when pg_controldata does not work with -D.
Re: [HACKERS] jsonb format is pessimal for toast compression
Heikki Linnakangas writes: > On 09/24/2014 08:16 AM, Tom Lane wrote: >> Heikki's patch would eat up the high-order JEntry bits, but the other >> points remain. > If we don't need to be backwards-compatible with the 9.4beta on-disk > format, we don't necessarily need to eat the high-order JEntry bit. You > can just assume that every nth element is stored as an offset, and > the rest as lengths. Although it would be nice to have the flag for it > explicitly. If we go with this approach, I think that we *should* eat the high bit for it. The main reason I want to do that is that it avoids having to engrave the value of N on stone tablets. I think that we should use a pretty large value of N --- maybe 32 or so --- and having the freedom to change it later based on experience seems like a good thing. regards, tom lane
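The every-nth-element scheme under discussion can be made concrete with a small sketch. Assume (hypothetically — the names below are not the actual jsonb code) a flat array where entries[i] holds an end offset whenever i is a multiple of the stride N, and a length otherwise. Random access to any element then costs at most N-1 additions, while an array of mostly lengths stays friendly to TOAST compression:

```c
#include <stdint.h>
#include <assert.h>

#define STRIDE 32				/* the "pretty large value of N" */

/*
 * entries[i] is the end offset of element i when i % STRIDE == 0,
 * and the length of element i otherwise.  To find the end offset of
 * element k, start from the nearest stored offset at or before k and
 * add the intervening lengths -- at most STRIDE-1 of them.
 * Hypothetical sketch of the scheme, not the jsonb implementation.
 */
uint32_t
jentry_end_offset(const uint32_t *entries, int k)
{
	int			anchor = k - (k % STRIDE);	/* entry stored as an offset */
	uint32_t	off = entries[anchor];
	int			i;

	for (i = anchor + 1; i <= k; i++)
		off += entries[i];					/* entries stored as lengths */
	return off;
}
```

This also shows why N need not be engraved in stone if a flag bit marks which entries are offsets: the reader walks back to the nearest flagged entry rather than relying on a fixed stride.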
Re: [HACKERS] make pg_controldata accept "-D dirname"
On Wed, Sep 24, 2014 at 7:59 PM, Abhijit Menon-Sen wrote: > I can never remember that pg_controldata takes only a dirname rather > than "-D dirname", and I gather I'm not the only one. Here's a tiny > patch to make it work either way. As far as I can see, it doesn't > break anything, not even if you have a data directory named "-D". +1. I forget it all the time as well... -- Michael
Re: [HACKERS] END_OF_RECOVERY shutdowns and ResetUnloggedRelations()
Hi Andres, Robert. I've attached four patches here. 1. Move the call to ResetUnloggedRelations(UNLOGGED_RELATION_INIT) to earlier in StartupXLOG. 2. Inside that function, issue fsync()s for the main forks we create by copying the _init fork. 3. A small fixup to add a const to "typedef char *FileName", because the earlier patch gave me warnings about discarding const-ness. This is consistent with many other functions in fd.c that take const char *. 4. Issue an fsync() on the data directory at startup if we need to perform crash recovery. I created some unlogged relations, performed an immediate shutdown, and then straced postgres as it performed crash recovery. The changes in (2) do indeed fsync the files we create by copying *_init, and don't fsync anything else (at least not after I fixed a bug ;-). I did not do anything about the END_OF_RECOVERY checkpoint setting the ControlFile->state to DB_SHUTDOWNED, because it wasn't clear to me if there was any agreement on what to do. I would be happy to submit a followup patch if we reach some decision about it. Is this what you had in mind? -- Abhijit >From d8726c06cdf11674661eac1d091cf7edd05c2a0c Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 14:43:18 +0530 Subject: Call ResetUnloggedRelations(UNLOGGED_RELATION_INIT) earlier We need to call this after recovery, but not after the END_OF_RECOVERY checkpoint is written. If we crash after that checkpoint, for example because of ENOSPC in PreallocXlogFiles, the checkpoint has already set the ControlFile->state to DB_SHUTDOWNED, so we don't enter recovery again at startup. Because we did call ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP) in the first cleanup, this leaves the database with a bunch of _init forks for unlogged relations, but no main forks, leading to scary errors. See thread from 20140912112246.ga4...@alap3.anarazel.de for details. 
--- src/backend/access/transam/xlog.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 46eef5f..218f7fb 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6863,6 +6863,14 @@ StartupXLOG(void) ShutdownWalRcv(); /* + * Reset initial contents of unlogged relations. This has to be done + * AFTER recovery is complete so that any unlogged relations created + * during recovery also get picked up. + */ + if (InRecovery) + ResetUnloggedRelations(UNLOGGED_RELATION_INIT); + + /* * We don't need the latch anymore. It's not strictly necessary to disown * it, but let's do it for the sake of tidiness. */ @@ -7130,14 +7138,6 @@ StartupXLOG(void) PreallocXlogFiles(EndOfLog); /* - * Reset initial contents of unlogged relations. This has to be done - * AFTER recovery is complete so that any unlogged relations created - * during recovery also get picked up. - */ - if (InRecovery) - ResetUnloggedRelations(UNLOGGED_RELATION_INIT); - - /* * Okay, we're officially UP. */ InRecovery = false; -- 1.9.1 >From 7eba57b5ed9e1eda1fa1b14b32a82828617d823e Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:01:37 +0530 Subject: =?UTF-8?q?Make=20ResetUnloggedRelations(=E2=80=A6=5FINIT)=20fsync?= =?UTF-8?q?=20newly-created=20main=20forks?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit During UNLOGGED_RELATION_INIT, after we have copied ${x}_init to $x, we issue fsync()s for the newly-created files. We depend on their existence and a checkpoint isn't going to fsync them for us during recovery. See thread from 20140912112246.ga4...@alap3.anarazel.de for details. 
--- src/backend/storage/file/reinit.c | 36 1 file changed, 36 insertions(+) diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c index 3229f41..4febf6f 100644 --- a/src/backend/storage/file/reinit.c +++ b/src/backend/storage/file/reinit.c @@ -339,6 +339,42 @@ ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname, int op) } FreeDir(dbspace_dir); + + /* + * copy_file() above has already called pg_flush_data() on the + * files it created. Now we need to fsync those files, because + * a checkpoint won't do it for us while we're in recovery. + */ + + dbspace_dir = AllocateDir(dbspacedirname); + while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL) + { + ForkNumber forkNum; + int oidchars; + char oidbuf[OIDCHARS + 1]; + char mainpath[MAXPGPATH]; + + /* Skip anything that doesn't look like a relation data file. */ + if (!parse_filename_for_nontemp_relation(de->d_name, &oidchars, + &forkNum)) +continue; + + /* Also skip it unless this is the init fork. */ + if (forkNum != INIT_FORKNUM) +continue; + + /* Construct main fork pathname. */ + memcpy(oidbuf, de->d_name, oidchars); + oidbuf[oidchars] = '\
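The core idea of the second patch — fsync the main forks just created by copying the _init forks, because no checkpoint will do it for us during recovery — boils down to the following POSIX pattern. This is a simplified stand-in for the backend's file-handling helpers, not the patch itself:

```c
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>

/*
 * Open a file and force its contents to stable storage.  After copying
 * an _init fork to the main fork, calling this on the new file ensures
 * a crash before the next checkpoint cannot lose it.  Simplified: the
 * backend version also retries on EINTR and handles directories.
 */
int
fsync_fname(const char *path)
{
	int			fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fsync(fd) < 0)
	{
		close(fd);
		return -1;
	}
	return close(fd);
}
```

Note that fsyncing the file alone is not always enough: making a newly created file durable can also require fsyncing its parent directory, which is why the fourth patch fsyncs the data directory at the start of crash recovery.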
Re: [HACKERS] make pg_controldata accept "-D dirname"
At 2014-09-24 14:03:41 +0300, hlinnakan...@vmware.com wrote: > > Ah, I frequently run into that too; but with pg_resetxlog. Well, that's no fun. Patch attached. -- Abhijit >From 23fc4d90d0353e1c6d65ca5715fc0338199f01cf Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:48:36 +0530 Subject: Make pg_resetxlog also accept -D datadir --- src/bin/pg_resetxlog/pg_resetxlog.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c index 302d005..3b43aff 100644 --- a/src/bin/pg_resetxlog/pg_resetxlog.c +++ b/src/bin/pg_resetxlog/pg_resetxlog.c @@ -89,7 +89,7 @@ main(int argc, char *argv[]) MultiXactId set_oldestmxid = 0; char *endptr; char *endptr2; - char *DataDir; + char *DataDir = NULL; int fd; set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetxlog")); @@ -111,10 +111,14 @@ main(int argc, char *argv[]) } - while ((c = getopt(argc, argv, "fl:m:no:O:x:e:")) != -1) + while ((c = getopt(argc, argv, "D:fl:m:no:O:x:e:")) != -1) { switch (c) { + case 'D': +DataDir = optarg; +break; + case 'f': force = true; break; @@ -233,7 +237,7 @@ main(int argc, char *argv[]) } } - if (optind == argc) + if (DataDir == NULL && optind == argc) { fprintf(stderr, _("%s: no data directory specified\n"), progname); fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); @@ -257,7 +261,8 @@ main(int argc, char *argv[]) } #endif - DataDir = argv[optind]; + if (DataDir == NULL) + DataDir = argv[optind]; if (chdir(DataDir) < 0) { -- 1.9.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make pg_controldata accept "-D dirname"
On 09/24/2014 01:59 PM, Abhijit Menon-Sen wrote: Hi. I can never remember that pg_controldata takes only a dirname rather than "-D dirname", and I gather I'm not the only one. Here's a tiny patch to make it work either way. As far as I can see, it doesn't break anything, not even if you have a data directory named "-D". Ah, I frequently run into that too; but with pg_resetxlog. - Heikki
Re: [HACKERS] make pg_controldata accept "-D dirname"
On Wed, Sep 24, 2014 at 12:59 PM, Abhijit Menon-Sen wrote: > Hi. > > I can never remember that pg_controldata takes only a dirname rather > than "-D dirname", and I gather I'm not the only one. Here's a tiny > patch to make it work either way. As far as I can see, it doesn't > break anything, not even if you have a data directory named "-D". I haven't looked at the code, but definitely +1 for the feature. That's really quite annoying. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] pg_dump bug in 9.4beta2 and HEAD
On 09/24/2014 01:50 PM, Thom Brown wrote: On 15 August 2014 16:31, Pavel Stehule wrote: 2014-08-14 9:03 GMT+02:00 Heikki Linnakangas : On 08/14/2014 06:53 AM, Joachim Wieland wrote: I'm seeing an assertion failure with "pg_dump -c --if-exists" which is not ready to handle BLOBs it seems: pg_dump: pg_backup_archiver.c:472: RestoreArchive: Assertion `mark != ((void *)0)' failed. To reproduce: $ createdb test $ pg_dump -c --if-exists test (works, dumps empty database) $ psql test -c "select lo_create(1);" $ pg_dump -c --if-exists test (fails, with the above mentioned assertion) The code tries to inject an "IF EXISTS" into the already-construct DROP command, but it doesn't work for large objects, because the deletion command looks like "SELECT pg_catalog.lo_unlink(xxx)". There is no DROP there. I believe we could use "SELECT pg_catalog.lo_unlink(loid) FROM pg_catalog.pg_largeobject_metadata WHERE loid = xxx". pg_largeobject_metadata table didn't exist before version 9.0, but we don't guarantee pg_dump's output to be compatible in that direction anyway, so I think that's fine. The quick fix would be to add an exception for blobs, close to where Assert is. There are a few exceptions there already. A cleaner solution would be to add a new argument to ArchiveEntry and make the callers responsible for providing an "DROP IF EXISTS" query, but that's not too appetizing because for most callers it would be boring boilerplate code. Perhaps add an argument, but if it's NULL, ArchiveEntry would form the if-exists query automatically from the DROP query. I am sending two patches first is fast fix second fix is implementation of Heikki' idea. I'm guessing this issue is still unresolved? It would be nice to get this off the open items list. Yeah, I had completely forgotten about this. Alvaro, could you finish this off? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] make pg_controldata accept "-D dirname"
Hi. I can never remember that pg_controldata takes only a dirname rather than "-D dirname", and I gather I'm not the only one. Here's a tiny patch to make it work either way. As far as I can see, it doesn't break anything, not even if you have a data directory named "-D". -- Abhijit >From 15d43a3a47b3042d3365cf86d4bf585ed00fa744 Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen Date: Wed, 24 Sep 2014 16:26:00 +0530 Subject: Make pg_controldata ignore a -D before DataDir --- src/bin/pg_controldata/pg_controldata.c | 4 1 file changed, 4 insertions(+) diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index f815024..7e8d0c2 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -120,7 +120,11 @@ main(int argc, char *argv[]) } if (argc > 1) + { DataDir = argv[1]; + if (strcmp(DataDir, "-D") == 0 && argc > 2) + DataDir = argv[2]; + } else DataDir = getenv("PGDATA"); if (DataDir == NULL) -- 1.9.1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump bug in 9.4beta2 and HEAD
On 15 August 2014 16:31, Pavel Stehule wrote: > Hi > > > 2014-08-14 9:03 GMT+02:00 Heikki Linnakangas : > >> On 08/14/2014 06:53 AM, Joachim Wieland wrote: >> >>> I'm seeing an assertion failure with "pg_dump -c --if-exists" which is >>> not ready to handle BLOBs it seems: >>> >>> pg_dump: pg_backup_archiver.c:472: RestoreArchive: Assertion `mark != >>> ((void *)0)' failed. >>> >>> To reproduce: >>> >>> $ createdb test >>> $ pg_dump -c --if-exists test (works, dumps empty database) >>> $ psql test -c "select lo_create(1);" >>> $ pg_dump -c --if-exists test (fails, with the above mentioned >>> assertion) >>> >> >> The code tries to inject an "IF EXISTS" into the already-construct DROP >> command, but it doesn't work for large objects, because the deletion >> command looks like "SELECT pg_catalog.lo_unlink(xxx)". There is no DROP >> there. >> >> I believe we could use "SELECT pg_catalog.lo_unlink(loid) FROM >> pg_catalog.pg_largeobject_metadata WHERE loid = xxx". >> pg_largeobject_metadata table didn't exist before version 9.0, but we don't >> guarantee pg_dump's output to be compatible in that direction anyway, so I >> think that's fine. >> >> The quick fix would be to add an exception for blobs, close to where >> Assert is. There are a few exceptions there already. A cleaner solution >> would be to add a new argument to ArchiveEntry and make the callers >> responsible for providing an "DROP IF EXISTS" query, but that's not too >> appetizing because for most callers it would be boring boilerplate code. >> Perhaps add an argument, but if it's NULL, ArchiveEntry would form the >> if-exists query automatically from the DROP query. >> > > I am sending two patches > > first is fast fix > > second fix is implementation of Heikki' idea. > I'm guessing this issue is still unresolved? It would be nice to get this off the open items list. Thom
Re: [HACKERS] jsonb format is pessimal for toast compression
On 09/24/2014 08:16 AM, Tom Lane wrote: Jan Wieck writes: On 09/15/2014 09:46 PM, Craig Ringer wrote: Anyway - this is looking like the change will go in, and with it a catversion bump. Introduction of a jsonb version/flags byte might be worthwhile at the same time. It seems likely that there'll be more room for improvement in jsonb, possibly even down to using different formats for different data. Is it worth paying a byte per value to save on possible upgrade pain? If there indeed has to be a catversion bump in the process of this, then I agree with Craig. FWIW, I don't really. To begin with, it wouldn't be a byte per value, it'd be four bytes, because we need word-alignment of the jsonb contents so there's noplace to squeeze in an ID byte for free. Secondly, as I wrote in <15378.1408548...@sss.pgh.pa.us>: : There remains the : question of whether to take this opportunity to add a version ID to the : binary format. I'm not as excited about that idea as I originally was; : having now studied the code more carefully, I think that any expansion : would likely happen by adding more type codes and/or commandeering the : currently-unused high-order bit of JEntrys. We don't need a version ID : in the header for that. Moreover, if we did have such an ID, it would be : notationally painful to get it to most of the places that might need it. Heikki's patch would eat up the high-order JEntry bits, but the other points remain. If we don't need to be backwards-compatible with the 9.4beta on-disk format, we don't necessarily need to eat the high-order JEntry bit. You can just assume that every nth element is stored as an offset, and the rest as lengths. Although it would be nice to have the flag for it explicitly. There are also a few free bits in the JsonbContainer header that can be used as a version ID in the future. So I don't think we need to change the format to add an explicit version ID field.
- Heikki
Re: [HACKERS] add modulo (%) operator to pgbench
On 09/24/2014 10:45 AM, Fabien COELHO wrote: Currently these distributions are achieved by mapping a continuous function onto integers, so that neighboring integers get neighboring numbers of draws, say with size=7:

  #draws     10  6  3  1  0  0  0   // some exponential distribution
  int drawn   0  1  2  3  4  5  6

Although having an exponential distribution of accesses on tuples is quite reasonable, the likelihood that there would be so much correlation between neighboring values is not realistic at all. You need some additional shuffling to get there. I don't understand what that pseudo-random stage you're talking about is. Can you elaborate? The pseudo-random stage is just a way to scatter the values. A basic approach to achieve this is "i' = (i * large-prime) % size", if you have a modulo. For instance with prime=5 you may get something like:

  #draws     10  6  3  1  0  0  0
  int drawn   0  1  2  3  4  5  6   (i)
  scattered   0  5  3  1  6  4  2   (i' = 5 i % 7)

So the distribution becomes:

  #draws     10  1  0  3  0  6  0
  scattered   0  1  2  3  4  5  6

Which is more interesting from a testing perspective because it removes the neighboring-value correlation. Depends on what you're testing. Yeah, shuffling like that makes sense for a primary key. Or not: very often, recently inserted rows are also queried more often, so that there is indeed a strong correlation between the integer key and the access frequency. Or imagine that you have a table that stores the height of people in centimeters. To populate that, you would want to use a gaussian distributed variable, without shuffling. For shuffling, perhaps we should provide a pgbench function or operator that does that directly, instead of having to implement it using * and %. Something like hash(x, min, max), where x is the input variable (gaussian distributed, or whatever you want), and min and max are the range to map it to.
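The scattering step described above is easy to verify: i' = (i * prime) % size is a bijection on 0..size-1 whenever prime and size are coprime, so it only reorders the draw counts without changing their distribution. A minimal sketch (the helper name is made up for illustration):

```c
#include <assert.h>
#include <string.h>

/*
 * Multiplicative scatter: maps i in [0, size) to (i * prime) % size.
 * When gcd(prime, size) == 1 this is a permutation, which is exactly
 * what makes it usable for breaking neighboring-key correlation.
 */
int
scatter(int i, int prime, int size)
{
	return (int) (((long long) i * prime) % size);
}
```

With prime=5 and size=7 this reproduces the table in the email: 0, 5, 3, 1, 6, 4, 2.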
I must say that I'm appalled by a decision process which leads to such results, with significant patches passed, while the tiny complement that makes them really useful (I mean not on paper or on the feature list, but in real life) is rejected...

The idea of a modulo operator was not rejected; we'd just like to have the infrastructure in place first.

- Heikki
Re: [HACKERS] Extending COPY TO
On 09/24/2014 09:23 AM, Andrea Riciputi wrote:

Imagine you access PG from an application written in language X using a driver library, and your application and your PG instance run on two different hosts.

In that scenario, you'll be using the PQgetCopyData function to get the data. PQgetCopyData returns one row at a time; the application can trivially change the line ending to whatever it wants when writing the output to a file or wherever it goes.

As I wrote before, despite being a heavy PG user, it's my first time on the hackers ML and I don't want to seem disrespectful of the community.

No worries; thanks for the effort, even if this idea doesn't pan out.

- Heikki
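Heikki's point above, that the client rather than COPY TO can pick the line ending, can be sketched like this. Here `rows` stands in for the newline-terminated data rows a PQgetCopyData loop would hand back; the helper names are illustrative, not libpq API.

```python
import io

def write_rows(rows, out, eol=b"\r\n"):
    """Write COPY data rows with a caller-chosen line ending.

    Each element of 'rows' is one row as the driver returns it,
    terminated by the server's '\n'; we strip that and append
    whatever ending the application wants while writing.
    """
    for row in rows:
        out.write(row.rstrip(b"\n") + eol)

buf = io.BytesIO()
write_rows([b"1\tfoo\n", b"2\tbar\n"], buf)
print(buf.getvalue())  # b'1\tfoo\r\n2\tbar\r\n'
```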
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 24 August 2014 11:33, Amit Kapila wrote:

Thanks for your comments. I have worked on both review comment lists, sent on 19 August and 24 August. The latest patch is attached to this mail.

On 19 August:

>You can compare against SQLSTATE by using below API.
>val = PQresultErrorField(res, PG_DIAG_SQLSTATE);
>You need to handle *42P01* SQLSTATE, also please refer below
>usage where we are checking SQLSTATE.
>fe-connect.c
>PQresultErrorField(conn->result, PG_DIAG_SQLSTATE),
> ERRCODE_INVALID_PASSWORD) == 0)

DONE

>Few more comments:
>
>1.
>* If user has not given the vacuum of complete db, then if
>
>I think here you have said reverse of what code is doing.
>You don't need *not* in above sentence.

DONE

>2.
>+ appendPQExpBuffer(&sql, "\"%s\".\"%s\"", nspace, relName);
>I think here you need to use function fmtQualifiedId() or fmtId()
>or something similar to handle quotes appropriately.

DONE

>3.
>+ */
>+ if (!r && !completedb)
>Here the usage of completedb variable is reversed which means
>that it goes into error path when actually whole database is
>getting vacuumed and the reason is that you are setting it
>to false in below code:
>+ /* Vaccuming full database*/
>+ vacuum_tables = false;

FIXED

>4.
>Functions prepare_command() and vacuum_one_database() contain
>duplicate code, is there any problem in using prepare_command()
>function in vacuum_one_database(). Another point in this context
>is that I think it is better to name function prepare_command()
>as append_vacuum_options() or something on that lines, also it will
>be better if you can write function header for this function as well.

DONE

>5.
>+ if (error)
>+ {
>+ for (i = 0; i < max_slot; i++)
>+ {
>+ DisconnectDatabase(&connSlot[i]);
>+ }
>
>Here why do we need DisconnectDatabase() type of function?
>Why can't we simply call PQfinish() as in base code?
Because the base code just needs to handle the main connection, and when PQfinish is sent it means either the work completed or there was an error; in both cases control is with the client. But in the multi-connection case, if one connection fails then we need to send a cancel on the other connections, and that is what DisconnectDatabase does.

>
>6.
>+ /*
>+ * if table list is not provided then we need to do vaccum for whole DB
>+ * get the list of all tables and prpare the list
>+ */
>spelling of prepare is wrong. I have noticed spell mistake
>in comments at some other place as well, please check all
>comments once

FIXED

>
>7. I think in new mechanism cancel handler will not work.
>In single connection vacuum it was always set/reset
>in function executeMaintenanceCommand(). You might need
>to set/reset it in function run_parallel_vacuum().

Good catch. Now I have called SetCancelConn(pSlot[0].connection) on the first connection. This enables the cancel handler to cancel the query on the first connection, so that the select loop will come out.

On 24 August:

>
>1. I could see one shortcoming in the way the patch has currently
> parallelized the work for --analyze-in-stages. Basically patch is
> performing the work for each stage for multiple tables in concurrent
> connections that seems okay for the cases when number of parallel
> connections is less than equal to number of tables, but for the case
> when user has asked for more number of connections than number of
> tables, then I think this strategy will not be able to use the extra
> connections.

I think --analyze-in-stages should maintain the order.

>2. Similarly for the case of multiple databases, currently it will not be able
> to use connections more than number of tables in each database because the
> parallelizing strategy is to just use the concurrent connections for
> tables inside single database.

I think for multiple databases, to do the parallel execution we need to maintain multiple connections with multiple databases.
And we need to maintain a table list for all the databases together to run them concurrently. I think this may hurt the startup cost, as we need to create and disconnect multiple connections just to prepare the list, and I think it will also increase the complexity.

>I am not completely sure whether current strategy is good enough or
>we should try to address the above problems. What do you think?

>3.
>+ do
>+ {
>+ i = select_loop(maxFd, &slotset);
>+ Assert(i != 0);
>
>Could you explain the reason of using this loop, I think you
>want to wait for data on socket descriptor, but why for maxFd?
>Also it is better if you explain this logic in comments.

We are sending vacuum job to
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
Hi,

On 2014-09-24 14:26:37 +0530, Abhijit Menon-Sen wrote:
> Thanks for your comments, and I'm sorry it's taken me so long to
> respond.
>
> At 2014-08-03 11:18:57 +0530, amit.kapil...@gmail.com wrote:
> >
> > After looking at code, I also felt that it is better to add this as a
> > version of pg_stattuple.
>
> I started off trying to do that, but now I'm afraid I strongly disagree.
> First, pg_stattuple works on relations and indexes, whereas fastbloat
> only works on relations (because of the VM/FSM use). Second, the code
> began to look hideous when mushed together, with too many ifs to avoid
> locking I didn't need and so on. The logic is just too different.

Why not add it to the stattuple extension, but as an independent function (and file)? I don't really see a need for a completely new extension; a separate extension seems wasteful.

> > 7.
> > HeapTupleSatisfiesVacuumNoHint()
> > a. Why can't we use HeapTupleSatisfiesVisibility() with dirty snapshot
> > as we use for pgstattuple?

I think it's completely unacceptable to copy a visibility routine. And I don't believe for a second that avoiding hint bit setting makes it legal to not acquire a content lock for this. What's your reasoning for that being safe? If you argue that all possible corruption has limited impact you need to do that *much* more explicitly and verbosely.

Greetings,

Andres Freund
Re: [HACKERS] a fast bloat measurement tool (was Re: Measuring relation free space)
Hi Amit.

Thanks for your comments, and I'm sorry it's taken me so long to respond.

At 2014-08-03 11:18:57 +0530, amit.kapil...@gmail.com wrote:
>
> After looking at code, I also felt that it is better to add this as a
> version of pg_stattuple.

I started off trying to do that, but now I'm afraid I strongly disagree. First, pg_stattuple works on relations and indexes, whereas fastbloat only works on relations (because of the VM/FSM use). Second, the code began to look hideous when mushed together, with too many ifs to avoid locking I didn't need and so on. The logic is just too different.

> Is this assumption based on the reason that if the visibility map bit
> of page is set, then there is high chance that vacuum would have
> pruned the dead tuples and updated FSM with freespace?

Right. I'm not convinced an explanation of the VM/FSM belongs in the fastbloat documentation, but I'll see if I can make it clearer.

> 1. compilation errors […]
> I think this is not a proper multi-line comment. […]
> It is better to have CHECK_FOR_INTERRUPTS() in above loop. […]
> Incase of new page don't you need to account for freespace. […]
> 5. Don't we need the handling for empty page (PageIsEmpty()) case?

Thanks, have fixed, will push updates soon.

> What is the reason for not counting ItemIdIsDead() case for
> accounting of dead tuples.

Will reconsider and fix.

> 7.
> HeapTupleSatisfiesVacuumNoHint()
> a. Why can't we use HeapTupleSatisfiesVisibility() with dirty snapshot
> as we use for pgstattuple?

Heavier locking. I tried to make do with the existing HTS* functions, but fastbloat was noticeably faster in tests with the current setup.

> b. If we want to use HeapTupleSatisfiesVacuum(), why can't we add one
> parameter to function which can avoid setting of hintbit.

This is certainly a possibility, and as you say it would be better in terms of maintenance. I didn't do it that way because I wanted to keep the extension self-contained rather than make it depend on core changes.
If there's consensus on adding a nohint parameter, sure, I can do that. (But then fastbloat won't work on current versions, which would suck.) In the meantime, I'll merge the later updates from HTSVacuum into my code. Thanks for the heads-up.

> function header of vac_estimate_reltuples() suggests that it is
> used by VACUUM and ANALYZE, I think it will be better to update
> the comment w.r.t new usage as well.

OK.

> 10. I think it should generate resource file as is done for other
> modules if you want to keep it as a separate module.

Thanks, will do.

> Do you really need to specify examples in README.

I don't *need* to, but I like it.

-- Abhijit
Re: [HACKERS] pgcrypto: PGP signatures
At 2014-09-15 13:37:48 +0200, ma...@joh.to wrote:
>
> I'm not sure we're talking about the same thing.

No, we weren't. I was under the impression that the signatures could be validated. Sorry for the noise.

-- Abhijit
Re: [HACKERS] add modulo (%) operator to pgbench
Hello Heikki,

If you reject it, you can also remove the gaussian and exponential random distributions, which are near useless without a means to add a minimal pseudo-random stage, for which "(x * something) % size" is a reasonable approximation; hence the modulo submission.

I'm confused. The gaussian and exponential patch was already committed.

Yes. They are significant patches that really involved significant work, and which are only really useful with a complementary stupid 10-line patch which is being rejected without understanding why it is needed.

Are you saying that it's not actually useful, and should be reverted? That doesn't make sense to me; gaussian- and exponential-distributed values seem quite useful in test scripts.

Yes and no. Currently these distributions are achieved by mapping a continuous function onto integers, so that neighboring integers get neighboring numbers of draws, say with size=7:

#draws    10 6 3 1 0 0 0   // some exponential distribution
int drawn  0 1 2 3 4 5 6

Although having an exponential distribution of accesses on tuples is quite reasonable, the likelihood that there would be so much correlation between neighboring values is not realistic at all. You need some additional shuffling to get there.

I don't understand what that pseudo-random stage you're talking about is. Can you elaborate?

The pseudo-random stage is just a way to scatter the values. A basic approach to achieve this is "i' = (i * large-prime) % size", if you have a modulo. For instance with prime=5 you may get something like:

#draws    10 6 3 1 0 0 0
int drawn  0 1 2 3 4 5 6   (i)
scattered  0 5 3 1 6 4 2   (i' = 5 i % 7)

So the distribution becomes:

#draws    10 1 0 3 0 6 0
scattered  0 1 2 3 4 5 6

Which is more interesting from a testing perspective because it removes the neighboring-value correlation.

A better approach is to use a hash function: "i' = hash(i) % size". It skews the distribution some more, but the quality of the shuffling is much better than with the mult-modulo version above. Note that you need a modulo as well...

I must say that I'm appalled by a decision process which leads to such results, with significant patches passed, while the tiny complement that makes them really useful (I mean not on paper or on the feature list, but in real life) is rejected...

-- Fabien.
Re: [HACKERS] BRIN indexes - TRAP: BadArgument
On 09/23/2014 10:04 PM, Alvaro Herrera wrote:

+ The BRIN implementation in PostgreSQL
+ is primarily maintained by Álvaro Herrera.

We don't usually have such verbiage in the docs. The GIN and GiST pages do, but I think those are historical exceptions, not something we want to do going forward.

+ BrinOpcInfo *opcInfo(void)
+
+ Returns internal information about the indexed columns' summary data.

I think you should explain what that internal information is. The minmax-19a.patch adds the type OID argument to this; remember to update the docs.

In SP-GiST, the similar function is called "config". It might be good to use the same name here, for consistency across indexams (although I actually like the "opcInfo" name better than "config").

The docs for the other support functions need to be updated, now that you changed the arguments from DeformedBrTuple to BrinValues.

+ To implement these methods in a generic ways, normally the opclass
+ defines its own internal support functions. For instance, minmax
+ opclasses add the support functions for the four inequality operators
+ for the datatype.
+ Additionally, the operator class must supply appropriate
+ operator entries,
+ to enable the optimizer to use the index when those operators are
+ used in queries.

The above needs improvement ;-)

+BRIN indexes (a shorthand for Block Range indexes)
+store summaries about the values stored in consecutive table physical block ranges.

"consecutive table physical block ranges" is quite a mouthful.

+For datatypes that have a linear sort order, the indexed data
+corresponds to the minimum and maximum values of the
+values in the column for each block range,
+which support indexed queries using these operators:
+
+ <
+ <=
+ =
+ >=
+ >

That's the built-in minmax indexing strategy, yes, but you could have others, even for datatypes with a linear sort order.
+ To find out the index tuple for a particular page range, we have an internal

s/find out/find/

+ new heap tuple contains null values but the index tuple indicate there are no

s/indicate/indicates/

+ Open questions
+ --
+
+ * Same-size page ranges?
+ Current related literature seems to consider that each "index entry" in a
+ BRIN index must cover the same number of pages. There doesn't seem to be a

What is the related literature? Is there an academic paper or something that should be cited as a reference for BRIN?

+ * TODO
+ ** ScalarArrayOpExpr (amsearcharray -> SK_SEARCHARRAY)
+ ** add support for unlogged indexes
+ ** ditto expressional indexes

We don't have unlogged indexes in general, so no need to list that here. What would be needed to implement ScalarArrayOpExprs? I didn't realize that expression indexes are still not supported. And I see that partial indexes are not supported either. Why not? I wouldn't expect BRIN to need to care about those things in particular; the expressions for an expressional or partial index are handled in the executor, no?

+ /*
+ * A tuple in the heap is being inserted. To keep a brin index up to date,
+ * we need to obtain the relevant index tuple, compare its stored values with
+ * those of the new tuple; if the tuple values are consistent with the summary
+ * tuple, there's nothing to do; otherwise we need to update the index.

s/compare/and compare/. Perhaps replace one of the semicolons with a full stop.

+ * If the range is not currently summarized (i.e. the revmap returns InvalidTid
+ * for it), there's nothing to do either.
+ */
+ Datum
+ brininsert(PG_FUNCTION_ARGS)

There is no InvalidTid, as a constant or a #define. Perhaps replace with "invalid item pointer".

+ /*
+ * XXX We need to know the size of the table so that we know how long to
+ * iterate on the revmap. There's room for improvement here, in that we
+ * could have the revmap tell us when to stop iterating.
+ */

The revmap doesn't know how large the table is.
Remember that you have to return all blocks that are not in the revmap, so you can't just stop when you reach the end of the revmap. I think the current design is fine.

I have to stop now to do some other stuff. Overall, this is in pretty good shape. In addition to a little cleanup of the things I listed above, and similar stuff elsewhere that I didn't read through right now, there are a few medium-sized items I'd still like to see addressed before you commit this:

* expressional/partial index support
* the difficulty of testing the union support function that we discussed earlier
* clarifying the memory-context handling of support functions, which we also discussed earlier

- Heikki
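The brininsert logic quoted in the review above — obtain the range's summary, compare it with the new tuple's values, and update the index only when they are inconsistent — can be sketched for the minmax case. This is a conceptual sketch in Python, not the patch's C code; `None` stands in for a range the revmap reports as unsummarized.

```python
def minmax_insert(summary, value):
    """Return (new_summary, index_update_needed) for one inserted value.

    'summary' is a (min, max) pair for the value's block range, or
    None when the revmap says the range is not summarized.
    """
    if summary is None:
        return None, False          # unsummarized range: nothing to do
    lo, hi = summary
    if lo <= value <= hi:
        return summary, False       # consistent with summary: no update
    return (min(lo, value), max(hi, value)), True

print(minmax_insert((3, 9), 5))     # ((3, 9), False)
print(minmax_insert((3, 9), 12))    # ((3, 12), True)
```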