Re: [HACKERS] PostgreSQL in Windows console and Ctrl-C
* From: Amit Kapila

> On Sun, Apr 13, 2014 at 5:59 PM, Christian Ullrich wrote:
> > There are some possible solutions:
> >
> > - pg_ctl could set an environment variable (unless it has to be
> >   compatible with postmasters from different versions, and it does
> >   not, does it?).
>
> Do you mean to say use some existing environment variable?
> Introducing an environment variable to solve this issue, or in fact
> using some existing environment variable, doesn't seem to be the best
> way to pass such information.

I meant creating a new one, yes. If, say, PGSQL_BACKGROUND_JOB was set,
the postmaster etc. would ignore the events.

> > pg_ctl creates a named job object, and the postmaster has all the
> > information it needs to reconstruct the name, so it can check if it
> > is running inside that pg_ctl-created job.
>
> We are already creating one job object, so maybe that itself can be
> used, but I am not sure if job objects are supported on all platforms
> on which Postgres works.

I mentioned that in my earlier message; we cannot rely on having that
job object around. If the OS does not support them (not much of a
problem nowadays, I think; anything anyone is going to run the version
this will first be released in on is going to), or, more likely, if
something uses PostgreSQL as an "embedded" database and starts it in a
job of its own, pg_ctl will not create its job object.

There may also be software out there that a) runs PostgreSQL in this
way, b) uses the console events to stop it again, and c) does that with
or without a separate job object. Hence, deciding based on the existence
of a job object would break backward compatibility in this case,
assuming such software exists in reality.

> If you have to pass such information, then why not pass some special
> flag in the command given to CreateProcess()?

Because I wanted to avoid introducing a command line option to tell the
postmaster who started it.
If we cannot look at job objects, and an environment variable is not an
option either, then this looks like the best remaining solution. I will
create a patch based on this approach.

> There is always a chance that we ignore the CTRL+C for some app which
> we don't intend if we solve the problem by passing some info, as some
> other app could also do so, but I think it should not be a big problem.

No. In fact, if whatever starts the postmaster does pass that flag, we
can safely assume that it did not do so just for fun.

--
Christian

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
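The environment-variable approach discussed above -- pg_ctl setting a new variable such as PGSQL_BACKGROUND_JOB before launching the postmaster, which then ignores console events when the variable is present -- boils down to a getenv() check. A minimal illustration (console_event_should_be_ignored is a hypothetical helper for this sketch, not actual PostgreSQL code; on Windows it would be consulted from the handler registered via SetConsoleCtrlHandler()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: decide whether a console control event
 * (Ctrl-C / Ctrl-Break) should be ignored because pg_ctl started us
 * as a background job.  pg_ctl would set PGSQL_BACKGROUND_JOB in the
 * environment before calling CreateProcess(); a postmaster started
 * directly from a console would not have it and would keep reacting
 * to the events as before.
 */
bool
console_event_should_be_ignored(void)
{
	return getenv("PGSQL_BACKGROUND_JOB") != NULL;
}
```

Because the variable is inherited by child processes, all backends started by that postmaster would make the same decision without any extra plumbing.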
Re: [HACKERS] Including replication slot data in base backups
On Thu, Apr 10, 2014 at 12:36 AM, Robert Haas wrote:
> On Tue, Apr 8, 2014 at 3:08 AM, Michael Paquier wrote:
>> On Tue, Apr 8, 2014 at 1:18 AM, Robert Haas wrote:
>>> Not sure if this is exactly the right way to do it, but I agree that
>>> something along those lines is a good idea. I also think, maybe even
>>> more importantly, that we should probably document that people using
>>> file-copy based hot backups should strongly consider removing the
>>> replication slots by hand before using the backup.
>>
>> Good point. Something here would be adapted in this case:
>> http://www.postgresql.org/docs/devel/static/backup-file.html
>> I am attaching an updated patch as well.
>
> What you've got here doesn't look like the right section to update --
> the section you've updated is on taking a cold backup. The section I
> think you want to update is "Making a Base Backup Using the Low Level
> API", and specifically this part:
>
>     You can, however, omit from the backup dump the files within the
>     cluster's pg_xlog/ subdirectory. This slight adjustment is
>     worthwhile because it reduces the risk of mistakes when restoring.
>     This is easy to arrange if pg_xlog/ is a symbolic link pointing to
>     someplace outside the cluster directory, which is a common setup
>     anyway for performance reasons. You might also want to exclude
>     postmaster.pid and postmaster.opts, which record information about
>     the running postmaster, not about the postmaster which will
>     eventually use this backup. (These files can confuse pg_ctl.)
>
> What I'd propose is adding a second paragraph like this:
>
>     It is often a good idea to also omit from the backup dump the
>     files within the cluster's pg_replslot/ directory, so that
>     replication slots that exist on the master do not become part of
>     the backup.
>     Otherwise, the subsequent use of the backup to create a standby
>     may result in indefinite retention of WAL files on the standby,
>     and possibly bloat on the master if hot standby feedback is
>     enabled, because the clients that are using those replication
>     slots will still be connecting to and updating the slots on the
>     master, not the standby. Even if the backup is only intended for
>     use in creating a new master, copying the replication slots isn't
>     expected to be particularly useful, since the contents of those
>     slots will likely be badly out of date by the time the new master
>     comes on line.

Yes, that's far better than what I sent. Thanks!

--
Michael
Re: [HACKERS] Signaling of waiting for a cleanup lock?
On Sun, Apr 13, 2014 at 9:14 PM, Andres Freund wrote:
> On 2014-04-12 17:40:34 -0400, Robert Haas wrote:
>> On Fri, Apr 11, 2014 at 10:28 AM, Andres Freund wrote:
>> > VACUUM sometimes waits synchronously for a cleanup lock on a heap
>> > page. Sometimes for a long time. Without reporting it externally.
>> > Rather confusing ;).
>> >
>> > Since we only take cleanup locks around vacuum, how about we report
>> > at least in pgstat that we're waiting? At the moment, there's
>> > really no way to know if that's what's happening.
>>
>> That seems like a pretty good idea to me.
>
> What I am not sure about is how... It's trivial to set
> pg_stat_activity.waiting = true, but without a corresponding
> description of what the backend is waiting for, it's not exactly
> obvious what's happening. I think that's better than nothing, but
> maybe somebody has a glorious better idea.
>
> Overwriting parts of the query/activity sounds like it'd be somewhat
> expensive and ugly.
>
>> I think we've avoided doing this for LWLocks for fear that there
>> might be too much overhead, but it's hard for me to imagine a
>> workload where you're waiting for cleanup locks often enough for the
>> overhead to matter.
>
> Hm. I am not sure I see the cost as a very compelling thing here.
> Sure, we can't list the acquired lwlocks and such, but it should be
> cheap enough to export what lwlock we're waiting for if we're going to
> sleep. I think it'd be worthwhile making that visible somehow.
> But that's a separate issue...

How about having a view like pg_lwlocks, similar to pg_locks, which can
be used to capture and provide the useful information?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] PostgreSQL in Windows console and Ctrl-C
On Sun, Apr 13, 2014 at 5:59 PM, Christian Ullrich wrote:
> There are some possible solutions:
>
> - pg_ctl could set an environment variable (unless it has to be
>   compatible with postmasters from different versions, and it does
>   not, does it?).

Do you mean to say use some existing environment variable? Introducing
an environment variable to solve this issue, or in fact using some
existing environment variable, doesn't seem to be the best way to pass
such information.

> pg_ctl creates a named job object, and the postmaster has all the
> information it needs to reconstruct the name, so it can check if it is
> running inside that pg_ctl-created job.

We are already creating one job object, so maybe that itself can be
used, but I am not sure if job objects are supported on all platforms on
which Postgres works.

If you have to pass such information, then why not pass some special
flag in the command given to CreateProcess()?

There is always a chance that we ignore the CTRL+C for some app which we
don't intend if we solve the problem by passing some info, as some other
app could also do so, but I think it should not be a big problem.

> I would appreciate some advice on this.
>
>> One another way could be to use CREATE_NEW_CONSOLE instead of
>> CREATE_NEW_PROCESS_GROUP in your previous fix, but I am not sure if
>> it's acceptable to users to have a new console window for the server.
>
> It might. That would also isolate stderr logging from whatever else
> the user is doing in the window, and it would work correctly in the
> pg_ctl (and by extension the direct-postmaster) case. It would not do
> anything for events generated by keystrokes in the new console window,
> though.
>
> There would also have to be a command line option to pg_ctl to either
> enable or disable it, not sure which.

The problem can be solved this way, but the only question here is
whether it is acceptable for users to have a new console window for the
server.
Can others also please share their opinion on whether this fix (start
the server in a new console) seems acceptable, or whether we should try
passing some information from pg_ctl and then ignoring CTRL+C and
CTRL+BREAK based on it?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [COMMITTERS] pgsql: Add ANALYZE into regression tests
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Sure, but I think Greg's point is that this could be tested by a
> black-box functional test ("does it print something it shouldn't")
> rather than a white-box test that necessarily depends on a whole lot
> of *other* planner choices that don't have much to do with the point
> in question. You already got bit by variances in the choice of join
> type, which is not what the test is about.

Yes, but as I said, I'm *also* doing the black-box functional test.
Perhaps that means this extra EXPLAIN test is overkill, but, after
trying to run down whatever buildfarm animal 'hamerkop' has done to
break the tablespace regression test, I'm definitely keen to err on the
side of 'more information'.

> I think the test is okay as-is as long as we don't see more failures
> from it; but if we do see any more I'd suggest rewriting as per Greg's
> suggestion rather than trying to constrain the plan choice even more
> narrowly.

The 'rewriting', in this case, would simply be removing the 'explain'
part of the test and keeping the rest. If you or Greg see an issue with
the test that actually does the 'raise notice' for every value seen by
the 'snoop' function, or if it doesn't match your 'black-box'
expectation, please let me know and I can try to refine it.

Thanks,

Stephen
Re: [HACKERS] [COMMITTERS] pgsql: Add ANALYZE into regression tests
Stephen Frost writes:
> * Greg Stark (st...@mit.edu) wrote:
>> But the original goal seems like it would be easier and better done
>> with an immutable function which lies and calls elog to leak
>> information. That's the actual attack this is supposed to protect
>> against anyways.
>
> Sure, but there's a whole slew of tests that would have to change if
> we changed the explain output, not just this one.

Sure, but I think Greg's point is that this could be tested by a
black-box functional test ("does it print something it shouldn't")
rather than a white-box test that necessarily depends on a whole lot of
*other* planner choices that don't have much to do with the point in
question. You already got bit by variances in the choice of join type,
which is not what the test is about.

I think the test is okay as-is as long as we don't see more failures
from it; but if we do see any more I'd suggest rewriting as per Greg's
suggestion rather than trying to constrain the plan choice even more
narrowly.

			regards, tom lane
Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
On Sun, Apr 13, 2014 at 10:49 AM, Tom Lane wrote:
> Simon Riggs writes:
>> ISTM that this is the way ANALYZE should work when run on a table
>> that has never been analysed before. Let's just do this logic within
>> ANALYZE and be done.
>
> Can't. Not unless you intend to make ANALYZE do internal commits
> so that its output rows become visible to other transactions before
> it's done. (Which would be, shall we say, a damn bad idea.)
>
> Even without that implementation problem, I absolutely don't agree
> that this is such a great thing that it should become not only the
> default but the only obtainable behavior. It would slow down
> ANALYZE, and would only be helpful if there is concurrent activity
> that would benefit from the stats. There are plenty of scenarios
> where that would be giving up something to get nothing.

Agreed. I suspect that's also true of the pg_upgrade behavior. Sure,
there may be people for whom the database will be immediately usable
with a stats target of 1 or 10, but I bet there are also quite a few for
whom it isn't, or who just wait for the whole thing to be done anyway
before they fire the system up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Problem with txid_snapshot_in/out() functionality
On 04/13/14 14:22, Jan Wieck wrote:
> On 04/13/14 08:27, Marko Kreen wrote:
>> I think you need to do SET_VARSIZE also here. Alternative is to move
>> SET_VARSIZE after sort_snapshot().
>>
>> And it seems the drop-double-txid logic should be added also to
>> txid_snapshot_recv(). It seems weird to have it behave differently
>> from txid_snapshot_in().
>
> Thanks, yes on both issues. Will create another patch.

New patch attached. New github commit is
https://github.com/wieck/postgres/commit/b8fd0d2eb78791e5171e34aecd233fd06218890d

Thanks again,
Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info

diff --git a/src/backend/utils/adt/txid.c b/src/backend/utils/adt/txid.c
index a005e67..1023566 100644
*** a/src/backend/utils/adt/txid.c
--- b/src/backend/utils/adt/txid.c
*** cmp_txid(const void *aa, const void *bb)
*** 131,146 ****
  }

  /*
!  * sort a snapshot's txids, so we can use bsearch() later.
   *
   * For consistency of on-disk representation, we always sort even if bsearch
!  * will not be used.
   */
  static void
  sort_snapshot(TxidSnapshot *snap)
  {
  	if (snap->nxip > 1)
  		qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
  }

  /*
--- 131,161 ----
  }

  /*
!  * unique sort a snapshot's txids, so we can use bsearch() later.
   *
   * For consistency of on-disk representation, we always sort even if bsearch
!  * will not be used.
   */
  static void
  sort_snapshot(TxidSnapshot *snap)
  {
+ 	txid	last = 0;
+ 	int		nxip, idx1, idx2;
+
  	if (snap->nxip > 1)
+ 	{
  		qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
+ 		nxip = snap->nxip;
+ 		idx1 = idx2 = 0;
+ 		while (idx1 < nxip)
+ 		{
+ 			if (snap->xip[idx1] != last)
+ 				last = snap->xip[idx2++] = snap->xip[idx1];
+ 			else
+ 				snap->nxip--;
+ 			idx1++;
+ 		}
+ 	}
  }

  /*
*** parse_snapshot(const char *str)
*** 294,304 ****
  		val = str2txid(str, &endp);
  		str = endp;

! 		/* require the input to be in order */
! 		if (val < xmin || val >= xmax || val <= last_val)
  			goto bad_format;

! 		buf_add_txid(buf, val);
  		last_val = val;

  		if (*str == ',')
--- 309,320 ----
  		val = str2txid(str, &endp);
  		str = endp;

! 		/* require the input to be in order and skip duplicates */
! 		if (val < xmin || val >= xmax || val < last_val)
  			goto bad_format;

! 		if (val != last_val)
! 			buf_add_txid(buf, val);
  		last_val = val;

  		if (*str == ',')
*** txid_current_snapshot(PG_FUNCTION_ARGS)
*** 361,368 ****
  {
  	TxidSnapshot *snap;
  	uint32		nxip,
! 				i,
! 				size;
  	TxidEpoch	state;
  	Snapshot	cur;
--- 377,383 ----
  {
  	TxidSnapshot *snap;
  	uint32		nxip,
! 				i;
  	TxidEpoch	state;
  	Snapshot	cur;
*** txid_current_snapshot(PG_FUNCTION_ARGS)
*** 381,389 ****
  	/* allocate */
  	nxip = cur->xcnt;
! 	size = TXID_SNAPSHOT_SIZE(nxip);
! 	snap = palloc(size);
! 	SET_VARSIZE(snap, size);

  	/* fill */
  	snap->xmin = convert_xid(cur->xmin, &state);
--- 396,402 ----
  	/* allocate */
  	nxip = cur->xcnt;
! 	snap = palloc(TXID_SNAPSHOT_SIZE(nxip));

  	/* fill */
  	snap->xmin = convert_xid(cur->xmin, &state);
*** txid_current_snapshot(PG_FUNCTION_ARGS)
*** 395,400 ****
--- 408,416 ----
  	/* we want them guaranteed to be in ascending order */
  	sort_snapshot(snap);

+ 	/* we set the size here because the sort may have removed duplicate xips */
+ 	SET_VARSIZE(snap, TXID_SNAPSHOT_SIZE(snap->nxip));
+
  	PG_RETURN_POINTER(snap);
  }
*** txid_snapshot_recv(PG_FUNCTION_ARGS)
*** 472,489 ****
  	snap = palloc(TXID_SNAPSHOT_SIZE(nxip));
  	snap->xmin = xmin;
  	snap->xmax = xmax;
- 	snap->nxip = nxip;
- 	SET_VARSIZE(snap, TXID_SNAPSHOT_SIZE(nxip));

  	for (i = 0; i < nxip; i++)
  	{
  		txid		cur = pq_getmsgint64(buf);

! 		if (cur <= last || cur < xmin || cur >= xmax)
  			goto bad_format;
  		snap->xip[i] = cur;
  		last = cur;
  	}

  	PG_RETURN_POINTER(snap);

  bad_format:
--- 488,514 ----
  	snap = palloc(TXID_SNAPSHOT_SIZE(nxip));
  	snap->xmin = xmin;
  	snap->xmax = xmax;

  	for (i = 0; i < nxip; i++)
  	{
  		txid		cur = pq
Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: [HACKERS] Problem with txid_snapshot_in/out() functionality)
On 04/12/2014 05:03 PM, Andres Freund wrote:
>> If we don't, aren't we letting other backends see non-self-consistent
>> state in regards to who holds which locks, for example?
>
> I think that actually works out ok, because the locks aren't owned by
> xids/xacts, but procs. Otherwise we'd be in deep trouble in
> CommitTransaction() as well, where ProcArrayEndTransaction() clears
> that state. After the whole xid transfer, there's PostPrepare_Locks()
> transferring the locks.

Right. However, I just noticed that there's a race condition between
PREPARE TRANSACTION and COMMIT/ROLLBACK PREPARED. PostPrepare_Locks runs
after the prepared transaction is already marked as fully prepared. That
means that by the time we get to PostPrepare_Locks, another backend
might already have finished and removed the prepared transaction. That
leads to a PANIC (put a breakpoint just before PostPrepare_Locks):

postgres=# commit prepared 'foo';
PANIC:  failed to re-find shared proclock object
PANIC:  failed to re-find shared proclock object
The connection to the server was lost. Attempting reset: Failed.

FinishPreparedTransaction reads the list of locks from the two-phase
state file, but PANICs when it doesn't find the corresponding locks in
the lock manager (because PostPrepare_Locks hasn't transferred them to
the dummy PGPROC yet).

I think we'll need to transfer the locks earlier, before the transaction
is marked as fully prepared. I'll take a closer look at this tomorrow.

- Heikki
Re: [HACKERS] Fwd: Debug strategy for musl Postgres?
Hi,

On 2014-04-13 16:08:00 -0400, John Mudd wrote:
> I built Postgres 9.3.4 from source on top of the musl C library,
> http://www.musl-libc.org/
> I also built zlib, bzip2, ncurses, openssl, readline and Python using
> musl as a foundation for Postgres.
>
> I'm using musl to increase the portability of the Postgres binary. I
> build on Ubuntu 13.10, but it will run on older Linux boxes.
>
> So far I get better results with the musl Postgres built on modern
> Ubuntu and running on an old kernel than building Postgres directly on
> the old Linux using the standard C library. But the musl Postgres is
> still not working fully. I'm not getting responses from the server.

I tend to think that this is more a matter for the musl devs than
Postgres. Postgres works on a fair number of libcs, and musl is pretty
new and rough around the edges.

> clock_gettime(0, 0xbfffa5a8)            = -1 ENOSYS (Function not implemented)

This looks suspicious.

> gettimeofday(NULL, {300, 0})            = 0
> poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, 3000) = 1
> sendto(3, "\0\0\0?\0\3\0\0user\0jmudd\0database\0jmud"..., 63, 0x4000, NULL, 0) = 63
> clock_gettime(0, 0xbfffa5a8)            = -1 ENOSYS (Function not implemented)
> gettimeofday(NULL, {300, 0})            = 0
> poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0

Here a poll didn't return anything. You'll likely have to look at the
server side.

Greetings,

Andres Freund

--
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] Fwd: Debug strategy for musl Postgres?
Reposting from pgsql-bugs since this is not a bug.

I built Postgres 9.3.4 from source on top of the musl C library,
http://www.musl-libc.org/
I also built zlib, bzip2, ncurses, openssl, readline and Python using
musl as a foundation for Postgres.

I'm using musl to increase the portability of the Postgres binary. I
build on Ubuntu 13.10, but it will run on older Linux boxes.

So far I get better results with the musl Postgres built on modern
Ubuntu and running on an old kernel than building Postgres directly on
the old Linux using the standard C library. But the musl Postgres is
still not working fully. I'm not getting responses from the server.

Here's the tail end of "strace pg_isready" output for musl Postgres
built and running on Ubuntu 13.10:

clock_gettime(CLOCK_REALTIME, {1397359337, 426941692}) = 0
poll([{fd=4, events=POLLOUT|POLLERR}], 1, 3000) = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "\0\0\0=\0\3\0\0user\0mudd\0database\0mudd\0"..., 61, MSG_NOSIGNAL, NULL, 0) = 61
clock_gettime(CLOCK_REALTIME, {1397359337, 427070343}) = 0
poll([{fd=4, events=POLLIN|POLLERR}], 1, 3000) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "R\0\0\0\10\0\0\0\0E\0\0\0RSFATAL\0C3D000\0Mdat"..., 16384, 0, NULL, NULL) = 92
close(4)                                = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
writev(1, [{"/tmp:5432 - accepting connection"..., 33}, {"\n", 1}], 2) = 34
exit_group(0)                           = ?
Here's the tail end of "strace pg_isready" output for musl Postgres
built on Ubuntu 13.10 but running on old Linux:

clock_gettime(0, 0xbfffa5a8)            = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0})            = 0
poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, 3000) = 1
sendto(3, "\0\0\0?\0\3\0\0user\0jmudd\0database\0jmud"..., 63, 0x4000, NULL, 0) = 63
clock_gettime(0, 0xbfffa5a8)            = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0})            = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0
close(3)                                = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
writev(1, [{"/tmp:5432 - no response", 23}, {"\n", 1}], 2) = 24
exit_group(2)                           = ?

For my next step I'll try building musl Postgres with the
--enable-cassert option. What else can I do to debug this?

John
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
> -----Original Message-----
> From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Sent: 14 April 2014 06:23
> To: Dean Rasheed
> Cc: Florian Pflug; David Rowley; Kevin Grittner; Josh Berkus; Greg
> Stark; PostgreSQL-development
> Subject: Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions
> (WIP)
>
> Dean Rasheed writes:
> > OK, I'm marking this ready for committer attention, on the
> > understanding that that doesn't include the invtrans_minmax patch.
>
> I've finished reviewing/revising/committing this submission.

That's great news!

Tom, thanks for taking the time to make all the modifications and commit
this.

Dean, thank you for all your hard work reviewing the patch. The detail
of the review was quite impressive. I'm really happy to see this make it
in for 9.4.

Regards

David Rowley
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
On Apr 13, 2014, at 20:23, Tom Lane wrote:
> Dean Rasheed writes:
>> OK, I'm marking this ready for committer attention, on the
>> understanding that that doesn't include the invtrans_minmax patch.
>
> I've finished reviewing/revising/committing this submission.

Cool! Thanks!

> Some notes:
>
> * I left out the EXPLAIN ANALYZE statistics, as I still feel that
> they're of little if any use to regular users, and neither very well
> defined nor well documented. We can revisit that later of course.
>
> * I also left out the table documenting which aggregates have this
> optimization. That's not the kind of thing we ordinarily document,
> and it seems inevitable to me that such a table would be noteworthy
> mostly for wrong/incomplete/obsolete information in the future.

I can live with leaving each of these out individually, but leaving
them both out means there's now no way of determining how a particular
sliding-window aggregation will behave. That leaves our users a bit out
in the cold, IMHO.

For example, with this patch you'd usually want to write
SUM(numeric_value::numeric(n,m)) instead of
SUM(numeric_value)::numeric(n,m) in a sliding-window aggregation, and
there should be *some* way for users to figure this out.

We have exhaustive tables of all the functions, operators and
aggregates we support, and maintenance of those seems to work well for
the most part. Why would a table of invertible aggregates be any
different? If it's the non-locality that bothers you -- I guess the
same information could be added to the descriptions of the individual
aggregates, instead of having one big table.

> * I rejected the invtrans_collecting sub-patch altogether. I didn't
> like anything about the idea of adding a poorly-documented field to
> ArrayBuildState and then teaching some random subset of the functions
> using that struct what to do with it. I think it would've been
> simpler, more reliable, and not that much less efficient to just do
> memmove's in shiftArrayResult.
> The string-aggregate changes were at least more localized, but they
> still seemed awfully messy. And there's a bigger issue: these
> aggregates have to do O(N) work per row for a frame of length N
> anyway, so it's not clear to me that there's any big-O gain to be had
> from making them into moving aggregates.

Yeah, those probably aren't super-useful. I initially added array_agg
because, with the nonstrict-forward/strict-reverse rule still in place,
there wouldn't otherwise have been an in-core aggregate which exercises
the non-strict case, which seemed like a bad idea. For the string case
-- I didn't expect that to turn out to be *quite* this messy when I
started implementing it.

> Anyway, this is nice forward progress for 9.4, even if we get no
> further.

Yup! Thanks to everybody who made this happen!

best regards,
Florian Pflug
Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)
Dean Rasheed writes:
> OK, I'm marking this ready for committer attention, on the
> understanding that that doesn't include the invtrans_minmax patch.

I've finished reviewing/revising/committing this submission. Some
notes:

* I left out the EXPLAIN ANALYZE statistics, as I still feel that
they're of little if any use to regular users, and neither very well
defined nor well documented. We can revisit that later of course.

* I also left out the table documenting which aggregates have this
optimization. That's not the kind of thing we ordinarily document, and
it seems inevitable to me that such a table would be noteworthy mostly
for wrong/incomplete/obsolete information in the future.

* I rejected the invtrans_collecting sub-patch altogether. I didn't
like anything about the idea of adding a poorly-documented field to
ArrayBuildState and then teaching some random subset of the functions
using that struct what to do with it. I think it would've been simpler,
more reliable, and not that much less efficient to just do memmove's in
shiftArrayResult. The string-aggregate changes were at least more
localized, but they still seemed awfully messy. And there's a bigger
issue: these aggregates have to do O(N) work per row for a frame of
length N anyway, so it's not clear to me that there's any big-O gain to
be had from making them into moving aggregates. I doubt people are
going to be using these aggregates with very wide frames, just because
they're going to be horribly expensive no matter what we do.

Anyway, this is nice forward progress for 9.4, even if we get no
further.

			regards, tom lane
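The O(N)-per-row aggregates Tom describes contrast with the ones this feature does speed up: aggregates with an inverse transition function, where sliding the window frame forward is a constant-time add-and-remove instead of a full recomputation. A toy model of that idea in plain C (illustrative only; the real machinery also handles NULLs, restarts, and arbitrary transition-state types):

```c
#include <assert.h>

/*
 * Toy model of a moving-aggregate SUM over a sliding window frame:
 * the forward transition adds a value entering the frame, the inverse
 * transition removes a value leaving it, so each slide costs O(1).
 */
typedef struct
{
	long	sum;
	int		count;
} sum_state;

void
sum_forward(sum_state *state, long value)
{
	state->sum += value;
	state->count++;
}

void
sum_inverse(sum_state *state, long value)
{
	state->sum -= value;
	state->count--;
}

/*
 * Sliding-window sums over values[0..n-1] with frame width w; writes
 * the n - w + 1 frame sums into out[].
 */
void
moving_sum(const long *values, int n, int w, long *out)
{
	sum_state	state = {0, 0};
	int			i;

	for (i = 0; i < n; i++)
	{
		sum_forward(&state, values[i]);
		if (state.count > w)
			sum_inverse(&state, values[i - w]);	/* value left the frame */
		if (i >= w - 1)
			out[i - (w - 1)] = state.sum;
	}
}
```

For an aggregate like string_agg or array_agg, the inverse step is itself O(frame length), which is exactly why Tom sees no big-O gain there.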
Re: [HACKERS] Problem with txid_snapshot_in/out() functionality
On 04/13/14 08:27, Marko Kreen wrote:
> On Sat, Apr 12, 2014 at 02:10:13PM -0400, Jan Wieck wrote:
>> Since it doesn't seem to produce any side effects, I'd think that
>> making the snapshot unique within txid_current_snapshot() and
>> filtering duplicates on input should be sufficient and eligible for
>> backpatching.
>
> Agreed.
>
>> The attached patch adds a unique loop to the internal sort_snapshot()
>> function and skips duplicates on input. The git commit is here:
>> https://github.com/wieck/postgres/commit/a88a2b2c25b856478d7e2b012fc718106338fe00
>>
>>  static void
>>  sort_snapshot(TxidSnapshot *snap)
>>  {
>> +	txid	last = 0;
>> +	int		nxip, idx1, idx2;
>> +
>>  	if (snap->nxip > 1)
>> +	{
>>  		qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
>> +		nxip = snap->nxip;
>> +		idx1 = idx2 = 0;
>> +		while (idx1 < nxip)
>> +		{
>> +			if (snap->xip[idx1] != last)
>> +				last = snap->xip[idx2++] = snap->xip[idx1];
>> +			else
>> +				snap->nxip--;
>> +			idx1++;
>> +		}
>> +	}
>>  }
>
> I think you need to do SET_VARSIZE also here. Alternative is to move
> SET_VARSIZE after sort_snapshot().
>
> And it seems the drop-double-txid logic should be added also to
> txid_snapshot_recv(). It seems weird to have it behave differently
> from txid_snapshot_in().

Thanks, yes on both issues. Will create another patch.

Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info
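The unique loop added to sort_snapshot() above is the standard sort-then-compact idiom: one index scans, one index writes, and the count shrinks by one per duplicate. A standalone version with the same structure (uint64_t standing in for the txid typedef, and assuming, like the patch's `last = 0` sentinel, that valid values are nonzero):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int
cmp_u64(const void *aa, const void *bb)
{
	uint64_t	a = *(const uint64_t *) aa;
	uint64_t	b = *(const uint64_t *) bb;

	return (a < b) ? -1 : (a > b) ? 1 : 0;
}

/*
 * Sort xip[] and drop duplicates in place, mirroring the patched
 * sort_snapshot(): idx1 scans, idx2 writes back only values that
 * differ from the previous one.  Returns the new element count.
 */
int
sort_unique(uint64_t *xip, int nxip)
{
	uint64_t	last = 0;	/* sentinel: assumes valid values are nonzero */
	int			idx1,
				idx2;

	if (nxip > 1)
	{
		qsort(xip, nxip, sizeof(uint64_t), cmp_u64);
		idx2 = 0;
		for (idx1 = 0; idx1 < nxip; idx1++)
		{
			if (xip[idx1] != last)
				last = xip[idx2++] = xip[idx1];
		}
		nxip = idx2;
	}
	return nxip;
}
```

Because the array can shrink, anything that recorded the old length beforehand -- like SET_VARSIZE in the patch -- has to be redone afterwards, which is exactly Marko's point above.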
Re: [HACKERS] test failure on latest source
Andres Freund writes:
> On 2014-04-12 16:35:48 -0400, Tom Lane wrote:
>> In principle, that commit shouldn't have affected behavior for pg_hba
>> entries with numeric address fields ...
>
> Hm. getaddrinfo.c has this bit:
>	/* Unsupported flags. */
>	if (flags & NI_NAMEREQD)
>		return EAI_AGAIN;

Yeah, and that flag is only ever specified when attempting to do reverse
lookup on a client address to see if it matches a non-numeric pg_hba
entry.

			regards, tom lane
Re: [HACKERS] test failure on latest source
On 2014-04-12 16:35:48 -0400, Tom Lane wrote:
> Andres Freund writes:
>> On 2014-04-12 19:45:59 +0200, Marco Atzeri wrote:
>>> so it is only failing on recent trunk
>>
>> Does it work on a commit before
>> fc752505a99a4e2c781a070d3d42a25289c22e3c?
>
> In principle, that commit shouldn't have affected behavior for pg_hba
> entries with numeric address fields ...

Hm. getaddrinfo.c has this bit:

	/* Unsupported flags. */
	if (flags & NI_NAMEREQD)
		return EAI_AGAIN;

Not sure if that explains anything, but it at the very least seems
problematic for other cases, given that flag is now always used...

Greetings,

Andres Freund

--
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Signaling of waiting for a cleanup lock?
On 2014-04-12 17:40:34 -0400, Robert Haas wrote:
> On Fri, Apr 11, 2014 at 10:28 AM, Andres Freund wrote:
>> VACUUM sometimes waits synchronously for a cleanup lock on a heap
>> page. Sometimes for a long time. Without reporting it externally.
>> Rather confusing ;).
>>
>> Since we only take cleanup locks around vacuum, how about we report at
>> least in pgstat that we're waiting? At the moment, there's really no
>> way to know if that's what's happening.
>
> That seems like a pretty good idea to me.

What I am not sure about is how... It's trivial to set
pg_stat_activity.waiting = true, but without a corresponding description
of what the backend is waiting for, it's not exactly obvious what's
happening. I think that's better than nothing, but maybe somebody has a
better idea. Overwriting parts of the query/activity text sounds like it
would be somewhat expensive and ugly.

> I think we've avoided doing this for LWLocks for fear that there might
> be too much overhead, but it's hard for me to imagine a workload where
> you're waiting for cleanup locks often enough for the overhead to
> matter.

Hm. I am not sure I see the cost as a very compelling thing here. Sure,
we can't list all the acquired lwlocks and such, but it should be cheap
enough to export which lwlock we're waiting for if we're going to sleep.
I think it'd be worthwhile making that visible somehow. But that's a
separate issue...

Greetings,

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
David Rowley writes:
> On this thread
> http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.edu
> there was some discussion around allowing push downs of quals that
> happen to be in every window clause of the sub query. I've quickly put
> together a patch which does this (see attached)

I think you should have check_output_expressions deal with this,
instead. Compare the existing test there for non-DISTINCT output
columns.

> Oh and I know that my function
> var_exists_in_all_query_partition_by_clauses has no business in
> allpaths.c, I'll move it out as soon as I find a better home for it.

I might be wrong, but I think you could just embed that search loop in
check_output_expressions, and it wouldn't be too ugly.

            regards, tom lane
Re: [HACKERS] Patch to fix a couple of compiler warnings from 80a5cf64
David Rowley writes:
> The attached patch fixes a couple of compiler warnings seen by the MSVC
> build.

Committed, thanks.

> I'm not quite sure why I get 2 warnings rather than 8, but the attached
> seems to make them go away, for what it's worth.

I think it's complaining about the 2 cases where rounding off to float4
would actually change the value compared to float8. The other 6 numbers
are exactly represented anyway.

            regards, tom lane
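Tom's representability argument is easy to check mechanically: a constant is only "lossy" if narrowing it to float4 changes its value. A small illustration (not the MSVC build itself, just the underlying IEEE-754 fact):

```python
import struct

def roundtrips_as_float4(x):
    # Pack as IEEE-754 single precision and unpack again; the round trip is
    # exact iff the value is representable in float4.
    return struct.unpack("f", struct.pack("f", x))[0] == x

print(roundtrips_as_float4(0.5))   # True  -- a power of two, exactly representable
print(roundtrips_as_float4(0.1))   # False -- rounds when narrowed to float4
```

Constants that survive the round trip unchanged (like the "other 6 numbers") would produce no visible value change, which is consistent with the compiler only warning about the remaining 2.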
Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
Simon Riggs writes:
> ISTM that this is the way ANALYZE should work when run on a table that
> has never been analysed before. Let's just do this logic within
> ANALYZE and be done.

Can't. Not unless you intend to make ANALYZE do internal commits so that
its output rows become visible to other transactions before it's done.
(Which would be, shall we say, a damn bad idea.)

Even without that implementation problem, I absolutely don't agree that
this is such a great thing that it should become not only the default
but the only obtainable behavior. It would slow down ANALYZE, and would
only be helpful if there is concurrent activity that would benefit from
the stats. There are plenty of scenarios where that would be giving up
something to get nothing.

            regards, tom lane
[HACKERS] Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
Hello David,

thanks for your work. The results look promising. What I'm missing is a
test case with multiple fields in the PARTITION BY clauses:

-- should push down, because partid is part of all PARTITION BY clauses
explain analyze select partid,n,m from (
    select partid,
           count(*) over (partition by partid) n,
           count(*) over (partition by partid, partid+0) m
    from winagg) winagg
where partid=1;

Current production 9.3.4 is returning:

                                   QUERY PLAN
--------------------------------------------------------------------------
 Subquery Scan on winagg  (cost=350955.11..420955.11 rows=20 width=20) (actual time=2564.360..3802.413 rows=20 loops=1)
   Filter: (winagg.partid = 1)
   Rows Removed by Filter: 180
   ->  WindowAgg  (cost=350955.11..395955.11 rows=200 width=4) (actual time=2564.332..3657.051 rows=200 loops=1)
         ->  Sort  (cost=350955.11..355955.11 rows=200 width=4) (actual time=2564.320..2802.444 rows=200 loops=1)
               Sort Key: winagg_1.partid, ((winagg_1.partid + 0))
               Sort Method: external sort  Disk: 50840kB
               ->  WindowAgg  (cost=0.43..86948.43 rows=200 width=4) (actual time=0.084..1335.081 rows=200 loops=1)
                     ->  Index Only Scan using winagg_partid_idx on winagg winagg_1  (cost=0.43..51948.43 rows=200 width=4) (actual time=0.051..378.232 rows=200 loops=1)
                           Heap Fetches: 0

The "Index Only Scan" currently returns all rows (no pushdown) on
production 9.3.4. What happens with the patch you provided?
-- already part of your tests:
-- should NOT push down, because partid is NOT part of all PARTITION BY clauses
explain analyze select partid,n,m from (
    select partid,
           count(*) over (partition by partid) n,
           count(*) over (partition by partid+0) m
    from winagg) winagg
where partid=1;

Reordering the fields should also be tested:

-- should push down, because partid is part of all PARTITION BY clauses
-- here: partid at the end
explain analyze select partid,n,m from (
    select partid,
           count(*) over (partition by partid) n,
           count(*) over (partition by partid+0, partid) m
    from winagg) winagg
where partid=1;

-- should push down, because partid is part of all PARTITION BY clauses
-- here: partid in the middle
explain analyze select partid,n,m from (
    select partid,
           count(*) over (partition by partid) n,
           count(*) over (partition by partid+0, partid, partid+1) m
    from winagg) winagg
where partid=1;

Best regards

Thomas

Am 13.04.2014 13:32, schrieb David Rowley:
> On this thread
> http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.edu
> there was some discussion around allowing push downs of quals that
> happen to be in every window clause of the sub query. I've quickly put
> together a patch which does this (see attached)
>
> I'm posting this just mainly to let Thomas know that I'm working on it,
> per his request on the other thread. The patch seems to work with all
> my test cases, and I've not quite gotten around to thinking of any more
> good cases to throw at it.
>
> Oh and I know that my function
> var_exists_in_all_query_partition_by_clauses has no business in
> allpaths.c, I'll move it out as soon as I find a better home for it.
>
> Here's my test case:
>
> drop table if exists winagg;
> create table winagg (
>     id serial not null primary key,
>     partid int not null
> );
> insert into winagg (partid) select x.x % 10 from generate_series(1,200) x(x);
> create index winagg_partid_idx on winagg(partid);
>
> -- Should push: this should push WHERE partid=1 to the inner query, as
> -- partid is in the only partition by clause in the query.
> explain analyze select partid,n from (select partid,count(*) over (partition by partid) n from winagg) winagg where partid=1;
>
>  WindowAgg  (cost=4.58..82.23 rows=20 width=4) (actual time=0.196..0.207 rows=20 loops=1)
>    ->  Bitmap Heap Scan on winagg  (cost=4.58..81.98 rows=20 width=4) (actual time=0.102..0.170 rows=20 loops=1)
>          Recheck Cond: (partid = 1)
>          Heap Blocks: exact=20
>          ->  Bitmap Index Scan on winagg_partid_idx  (cost=0.00..4.58 rows=20 width=0) (actual time=0.084..0.084 rows=20 loops=1)
>                Index Cond: (partid = 1)
>  Planning time: 0.208 ms
>  Total runtime: 0.276 ms
> (8 rows)
>
> -- Should not push: Added a +0 to partition by clause.
Re: [HACKERS] PostgreSQL in Windows console and Ctrl-C
* From: Amit Kapila
> On Sat, Apr 12, 2014 at 12:36 PM, Christian Ullrich wrote:
>> * From: Amit Kapila
>>> Another thing to decide about this fix is whether it is okay to
>>> fix it for CTRL+C and leave the problem open for CTRL+BREAK?
>>> (The current option used (CREATE_NEW_PROCESS_GROUP) will handle only
>>> CTRL+C).
>>
>> Below is a new (right now very much proof-of-concept) patch to replace
>> my earlier one. It has the same effect on Ctrl-C the change to pg_ctl
>> had, and additionally ignores Ctrl-Break as well.
>
> This fix won't allow CTRL+C for other cases, like when the server is
> started directly with the postgres binary:
>     ./postgres -D <datadir>
> I think this doesn't come under your original complaint, and as such I
> don't see any problem in allowing CTRL+C for the above case.

Good point. I had not thought of that case. Just how do you tell whether
your postmaster was started by pg_ctl or directly on the command line?

- pg_ctl starts the postmaster through an intervening shell, so even if
  it did not exit right after that, we could not check the parent
  process name (assuming nobody renamed pg_ctl).

- pg_ctl starts the postmaster with stdin redirected from the null
  device, but so could anyone else. The same is true for access rights,
  token groups, and most everything else pg_ctl does.

- I don't want to add a new command line flag to postgres.exe just to
  tell it who its parent is.

- Job objects may not be supported on the underlying OS, or creation may
  have failed, or the interactive console may have been running in one
  already, so the sheer existence of a job is no proof it's ours.

There are some possible solutions:

- pg_ctl could set an environment variable (unless it has to be
  compatible with postmasters from different versions, and it does not,
  does it?).

- pg_ctl creates a named job object, and the postmaster has all the
  information it needs to reconstruct the name, so it can check if it is
  running inside that pg_ctl-created job.

I would appreciate some advice on this.

> One other way could be to use CREATE_NEW_CONSOLE instead of
> CREATE_NEW_PROCESS_GROUP in your previous fix, but I am not sure if
> it's acceptable to users to have a new console window for the server.

It might. That would also isolate stderr logging from whatever else the
user is doing in the window, and it would work correctly in the pg_ctl
(and, by extension, the direct-postmaster) case. It would not do
anything for events generated by keystrokes in the new console window,
though. There would also have to be a command line option to pg_ctl to
either enable or disable it, not sure which.

-- 
Christian
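For illustration, the environment-variable option from the list above reduces to one small decision inside the console event handler: "was I started as a background job by pg_ctl?" The sketch below models that decision in Python; the marker variable name PGSQL_BACKGROUND_JOB and the helper function are purely hypothetical, and the real patch would live in the Win32 signal code behind SetConsoleCtrlHandler():

```python
import os

# Windows console event codes (values as defined by the Win32 API).
CTRL_C_EVENT = 0
CTRL_BREAK_EVENT = 1

def should_ignore_console_event(event, env=os.environ):
    # Hypothetical logic: pg_ctl sets PGSQL_BACKGROUND_JOB in the
    # postmaster's environment; if it is present, the postmaster swallows
    # Ctrl-C / Ctrl-Break aimed at the interactive console. A postmaster
    # started directly on the command line keeps the normal behavior.
    started_by_pg_ctl = env.get("PGSQL_BACKGROUND_JOB") is not None
    if event in (CTRL_C_EVENT, CTRL_BREAK_EVENT):
        return started_by_pg_ctl
    return False  # close/logoff/shutdown events are handled as before

print(should_ignore_console_event(CTRL_C_EVENT, {"PGSQL_BACKGROUND_JOB": "1"}))  # True
print(should_ignore_console_event(CTRL_C_EVENT, {}))                             # False
```

This captures why the approach sidesteps the detection problems listed above: the information travels explicitly from pg_ctl rather than being inferred from parent processes, stdin redirection, or job objects.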
Re: [HACKERS] Problem with txid_snapshot_in/out() functionality
On Sat, Apr 12, 2014 at 02:10:13PM -0400, Jan Wieck wrote:
> Since it doesn't seem to produce any side effects, I'd think that
> making the snapshot unique within txid_current_snapshot() and
> filtering duplicates on input should be sufficient and eligible for
> backpatching.

Agreed.

> The attached patch adds a unique loop to the internal
> sort_snapshot() function and skips duplicates on input. The git
> commit is here:
>
> https://github.com/wieck/postgres/commit/a88a2b2c25b856478d7e2b012fc718106338fe00
>
>  static void
>  sort_snapshot(TxidSnapshot *snap)
>  {
> +    txid    last = 0;
> +    int     nxip, idx1, idx2;
> +
>      if (snap->nxip > 1)
> +    {
>          qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
> +        nxip = snap->nxip;
> +        idx1 = idx2 = 0;
> +        while (idx1 < nxip)
> +        {
> +            if (snap->xip[idx1] != last)
> +                last = snap->xip[idx2++] = snap->xip[idx1];
> +            else
> +                snap->nxip--;
> +            idx1++;
> +        }
> +    }
>  }

I think you need to do SET_VARSIZE also here. An alternative is to move
SET_VARSIZE after sort_snapshot().

And it seems the drop-double-txid logic should be added also to
txid_snapshot_recv(). It seems weird to have it behave differently from
txid_snapshot_in().

-- 
marko
Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb
On 4 April 2014 16:01, Andres Freund wrote:
>> +       const char *stage_commands[] = {
>> +               "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
>> +               "SET default_statistics_target=10; RESET vacuum_cost_delay;",
>> +               "RESET default_statistics_target;"

> This whole thing won't work for relations with per-column statistics
> targets btw...

Yes, agreed. Plus I would note that this makes no difference at all for
very small tables, since the sample will be big enough even with
stats_target=1.

ISTM that this is the way ANALYZE should work when run on a table that
has never been analysed before. Let's just do this logic within ANALYZE
and be done.

Suggested logic:

    if not ANALYZEd before && table is not small && stats_target is default
        then AnalyzeInStages()
    otherwise
        just do one ANALYZE pass

-- 
Simon Riggs                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
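The staged approach in the quoted vacuumdb hunk boils down to re-running ANALYZE under successively larger statistics targets, so crude statistics become available quickly and are then refined. A rough client-side model in Python; run_sql is a stand-in for whatever executes SQL against the server, not a real API:

```python
# The three stages from the quoted patch: minimal stats as fast as possible
# (no cost delay), then a coarse pass, then a final pass at the default
# statistics target.
STAGE_COMMANDS = [
    "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
    "SET default_statistics_target=10; RESET vacuum_cost_delay;",
    "RESET default_statistics_target;",
]

def analyze_in_stages(table, run_sql):
    # run_sql(sql) is assumed to execute one SQL string against the cluster.
    for stage in STAGE_COMMANDS:
        run_sql(stage)
        run_sql(f"ANALYZE {table};")

# Record the statements instead of executing them, to show the sequence.
executed = []
analyze_in_stages("t", executed.append)
print(len(executed))  # 6
```

Simon's objection still applies to this model: for tables with per-column statistics targets, or for small tables whose sample is complete even at stats_target=1, the early stages add work without adding information.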
[HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
On this thread
http://www.postgresql.org/message-id/52c6f712.6040...@student.kit.edu
there was some discussion around allowing push downs of quals that
happen to be in every window clause of the sub query. I've quickly put
together a patch which does this (see attached).

I'm posting this just mainly to let Thomas know that I'm working on it,
per his request on the other thread. The patch seems to work with all my
test cases, and I've not quite gotten around to thinking of any more
good cases to throw at it.

Oh and I know that my function
var_exists_in_all_query_partition_by_clauses has no business in
allpaths.c, I'll move it out as soon as I find a better home for it.

Here's my test case:

drop table if exists winagg;
create table winagg (
    id serial not null primary key,
    partid int not null
);
insert into winagg (partid) select x.x % 10 from generate_series(1,200) x(x);
create index winagg_partid_idx on winagg(partid);

-- Should push: this should push WHERE partid=1 to the inner query, as
-- partid is in the only partition by clause in the query.
explain analyze select partid,n from (select partid,count(*) over (partition by partid) n from winagg) winagg where partid=1;

                                   QUERY PLAN
--------------------------------------------------------------------------
 WindowAgg  (cost=4.58..82.23 rows=20 width=4) (actual time=0.196..0.207 rows=20 loops=1)
   ->  Bitmap Heap Scan on winagg  (cost=4.58..81.98 rows=20 width=4) (actual time=0.102..0.170 rows=20 loops=1)
         Recheck Cond: (partid = 1)
         Heap Blocks: exact=20
         ->  Bitmap Index Scan on winagg_partid_idx  (cost=0.00..4.58 rows=20 width=0) (actual time=0.084..0.084 rows=20 loops=1)
               Index Cond: (partid = 1)
 Planning time: 0.208 ms
 Total runtime: 0.276 ms
(8 rows)

-- Should not push: added a +0 to the partition by clause.
explain analyze select partid,n from (select partid,count(*) over (partition by partid + 0) n from winagg) winagg where partid=1;

                                   QUERY PLAN
--------------------------------------------------------------------------
 Subquery Scan on winagg  (cost=265511.19..330511.19 rows=20 width=12) (actual time=2146.642..4257.267 rows=20 loops=1)
   Filter: (winagg.partid = 1)
   Rows Removed by Filter: 180
   ->  WindowAgg  (cost=265511.19..305511.19 rows=200 width=4) (actual time=2146.614..4099.169 rows=200 loops=1)
         ->  Sort  (cost=265511.19..270511.19 rows=200 width=4) (actual time=2146.587..2994.993 rows=200 loops=1)
               Sort Key: ((winagg_1.partid + 0))
               Sort Method: external merge  Disk: 35136kB
               ->  Seq Scan on winagg winagg_1  (cost=0.00..28850.00 rows=200 width=4) (actual time=0.025..418.306 rows=200 loops=1)
 Planning time: 0.249 ms
 Total runtime: 4263.933 ms
(10 rows)

-- Should not push: adds a window clause (which is not used) with a
-- partition by clause that does not contain partid
explain analyze select partid,n from (select partid,count(*) over (partition by partid) n from winagg window stopPushDown as (partition by id)) winagg where partid=1;

-- Should not push: one window clause does not have partid
explain analyze select partid,n from (select partid,count(*) over (partition by partid) n from winagg window stopPushDown as (order by id)) winagg where partid=1;

-- Should not push: one window clause does not have partid
explain analyze select partid,n from (select partid,count(*) over (partition by partid) n from winagg window stopPushDown as ()) winagg where partid=1;

As of now the patch is a couple of hours old; I've not even bothered to
run the regression tests yet, let alone add any new ones.

Comments are welcome...

Regards

David Rowley

wfunc_pushdown_paritionby_v0.1.patch
Description: Binary data
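The invariant behind these test cases can be stated compactly: a qual on column c may be pushed below the window aggregation only if c appears as a plain partitioning column in the PARTITION BY list of every window clause, because then the filter can only discard whole partitions and cannot change any window aggregate's result. A hypothetical stand-alone model of that check (not the planner code, and not the actual signature of var_exists_in_all_query_partition_by_clauses):

```python
# Model of the pushdown-safety check: window_clauses is one set of plain
# PARTITION BY columns per window clause in the subquery. Expressions such
# as "partid+0" are distinct entries and do NOT count as the plain column.
def can_push_down(col, window_clauses):
    # No window clauses at all would mean nothing blocks the pushdown, but
    # this model only covers queries that do have them, so require at least
    # one and then demand membership in every clause.
    return bool(window_clauses) and all(col in pb for pb in window_clauses)

print(can_push_down("partid", [{"partid"}]))                          # True
print(can_push_down("partid", [{"partid"}, {"partid", "partid+0"}]))  # True
print(can_push_down("partid", [{"partid"}, {"partid+0"}]))            # False
print(can_push_down("partid", [{"partid"}, set()]))                   # False
```

The four calls mirror the test cases in the thread: the single-clause query, Thomas's multi-field clause containing partid, the "+0 only" clause, and the unused window clause with an empty (or partid-free) PARTITION BY list.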
Re: [HACKERS] proposal: interprocess EXPLAIN PID
Hello

2014-04-13 7:16 GMT+02:00 Amit Kapila:
> On Fri, Apr 11, 2014 at 12:13 PM, Pavel Stehule wrote:
>> Hello
>>
>> I propose enhancing the EXPLAIN statement with the possibility to get
>> the plan of another PostgreSQL process.
>
> Does this mean that you want to track the plan (and other info EXPLAIN
> provides) of statements running in a particular backend?
> Could you please explain this in a bit more detail w.r.t. how this
> will work?

For 9.5 I plan to implement plain EXPLAIN only. Any other improvements
can be added incrementally.

Regards

Pavel
Re: [HACKERS] WIP patch (v2) for updatable security barrier views
On 13 April 2014 05:23, Stephen Frost wrote:
> Alright, I've committed this with an updated note regarding the
> locking, and a few additional regression tests

That's awesome. Thank you very much.

Regards,
Dean