Re: [HACKERS] Storing hot members of PGPROC out of the band
On Fri, Nov 4, 2011 at 4:13 AM, Simon Riggs si...@2ndquadrant.com wrote: If you look at your PGPROC_MINIMAL, its mostly transaction related stuff, so I would rename it PGXACT or similar. Yeah, that looks good too. Though I am not sure if all fields are related to transaction state and whether we would need to add more fields to the structure in future. Having a general name might help in that case. Not sure why you talk about pointer math, seems easy enough just to have two arrays protected by one lock, and have each proc use the same offset in both arrays. Right now we store PGPROC pointers in the ProcArray and the pointer math gets us the index to look into the other array. But we can actually just store indexes in the ProcArray to avoid that. A positive index may mean offset into the normal PGPROC array and a negative index can be used to get dummy PGPROC from the prepared transactions. Thanks, Pavan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] git trunk doesn't build
http://pastebin.com/RjiRjGZc this is on fedora 12, gcc 4.6.1 -- GJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Collecting statistics on CSV file data
(2011/10/20 18:56), Etsuro Fujita wrote: I revised the patch according to Hanada-san's comments. Attached is the updated version of the patch. Changes: * pull up of logging analyzing foo.bar * new vac_update_relstats always called * tab-completion in psql * add foreign tables are not analyzed automatically... to 23.1.3 Updating Planner Statistics * some other modifications Submission review = - Patch can be applied, and all regression tests passed. :) Random comments === - Some headers are not necessary for file_fdw.c #include access/htup.h #include commands/dbcommands.h #include optimizer/plancat.h #include utils/elog.h #include utils/guc.h #include utils/lsyscache.h #include utils/rel.h - It might be better to mention in document that users need to explicitly specify foreign table name to ANALYZE command. - I think backend should be aware the case which a handler is NULL. For the case of ANALYZE of foreign table, it would be worth telling user that request was not accomplished. - file_fdw_do_analyze_rel is almost copy of do_analyze_rel. IIUC, difference against do_analyze_rel are: * don't logging analyze target * don't switch userid to the owner of target table * don't measure elapsed time for autoanalyze deamon * don't handle index * some comments are removed. * sample rows are acquired by file_fdw's routine I don't see any problem here, but would you confirm that all of them are intentional? Besides, keeping format (mainly indent and folding) of this function similar to do_analyze_rel would help to follow changes in do_analyze_rel. - IMHO exporting CopyState should be avoided. One possible idea is adding new COPY API which allows to extract records from the file with skipping specified number or rate. - In your design, each FDW have to copy most of do_analyze_rel to their own source. It means that FDW authors must know much details of ANALYZE to implement ANALYZE handler. Actually, your patch exports some static functions from analyze.c. Have you considered hooking acquire_sample_rows()? Such handler should be more simple, and FDW-specific. As you say, such design requires FDWs to skip some records, but it would be difficult for some FDW (e.g. twitter_fdw) which can't pick sample data up easily. IMHO such problem *must* be solved by FDW itself. -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Storing hot members of PGPROC out of the band
On 04.11.2011 00:43, Simon Riggs wrote: On Thu, Nov 3, 2011 at 6:12 PM, Pavan Deolaseepavan.deola...@gmail.com wrote: When PGPROC array is allocated, we also allocate another array of PGPROC_MINIMAL structures of the same size. While accessing the ProcArray, a simple pointer mathematic can get us the corresponding PGPROC_MINIMAL structure. The only exception being the dummy PGPROC for prepared transaction. A first cut version of the patch is attached. It looks big, but most of the changes are cosmetic because of added indirection. The patch also contains another change to keep the ProcArray sorted by (PGPROC *) to preserve locality of references while traversing the array. This is very good. If you look at your PGPROC_MINIMAL, its mostly transaction related stuff, so I would rename it PGXACT or similar. Not sure why you talk about pointer math, seems easy enough just to have two arrays protected by one lock, and have each proc use the same offset in both arrays. Agreed, that seems more clean. The PGPROCs for prepared transactions are currently allocated separately, so they're not currently in the same array as all other PGPROCs, but that could be changed. Here's a WIP patch for that. I kept the PGPROC_MINIMAL nomenclature for now, but I agree with Simon's suggestion to rename it. On 03.11.2011 20:12, Pavan Deolasee wrote: Its quite possible that the effect of the patch is more evident on the particular hardware that I am testing. But the approach nevertheless seems reasonable. It will very useful if someone else having access to a large box can test the effect of the patch. I tested this on an 8-core x64 box, but couldn't see any measurable difference in pgbench performance. I tried with and without -N and -S, and --unlogged-tables, but no difference. While looking at GetSnapshotData(), it occurred to me that when a PGPROC entry does not have its xid set, ie. xid == InvalidTransactionId, it's pointless to check the subxid array for that proc. If a transaction has no xid, none of its subtransactions can have an xid either. That's a trivial optimization, see attached. I tested this with pgbench -S -M prepared -c 500 on the 8-core box, and it seemed to make a 1-2% difference (without the other patch). So, almost in the noise, but it also doesn't cost us anything in terms of readability or otherwise. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 477982d..b72915b 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -113,7 +113,8 @@ int max_prepared_xacts = 0; typedef struct GlobalTransactionData { - PGPROC proc; /* dummy proc */ + GlobalTransaction next; + int pgprocno; /* dummy proc */ BackendId dummyBackendId; /* similar to backend id for backends */ TimestampTz prepared_at; /* time of preparation */ XLogRecPtr prepare_lsn; /* XLOG offset of prepare record */ @@ -207,7 +208,8 @@ TwoPhaseShmemInit(void) sizeof(GlobalTransaction) * max_prepared_xacts)); for (i = 0; i max_prepared_xacts; i++) { - gxacts[i].proc.links.next = (SHM_QUEUE *) TwoPhaseState-freeGXacts; + gxacts[i].pgprocno = PreparedXactProcs[i].pgprocno; + gxacts[i].next = TwoPhaseState-freeGXacts; TwoPhaseState-freeGXacts = gxacts[i]; /* @@ -243,6 +245,8 @@ MarkAsPreparing(TransactionId xid, const char *gid, TimestampTz prepared_at, Oid owner, Oid databaseid) { GlobalTransaction gxact; + PGPROC *proc; + PGPROC_MINIMAL *proc_minimal; int i; if (strlen(gid) = GIDSIZE) @@ -274,7 +278,7 @@ MarkAsPreparing(TransactionId xid, const char *gid, TwoPhaseState-numPrepXacts--; TwoPhaseState-prepXacts[i] = TwoPhaseState-prepXacts[TwoPhaseState-numPrepXacts]; /* and put it back in the freelist */ - gxact-proc.links.next = (SHM_QUEUE *) TwoPhaseState-freeGXacts; + gxact-next = TwoPhaseState-freeGXacts; TwoPhaseState-freeGXacts = gxact; /* Back up index count too, so we don't miss scanning one */ i--; @@ -302,32 +306,36 @@ MarkAsPreparing(TransactionId xid, const char *gid, errhint(Increase max_prepared_transactions (currently %d)., max_prepared_xacts))); gxact = TwoPhaseState-freeGXacts; - TwoPhaseState-freeGXacts = (GlobalTransaction) gxact-proc.links.next; + TwoPhaseState-freeGXacts = (GlobalTransaction) gxact-next; - /* Initialize it */ - MemSet(gxact-proc, 0, sizeof(PGPROC)); - SHMQueueElemInit((gxact-proc.links)); - gxact-proc.waitStatus = STATUS_OK; + proc = ProcGlobal-allProcs[gxact-pgprocno]; + proc_minimal = ProcGlobal-allProcs_Minimal[gxact-pgprocno]; + + /* Initialize the PGPROC entry */ + MemSet(proc, 0, sizeof(PGPROC)); + proc-pgprocno = gxact-pgprocno; + SHMQueueElemInit((proc-links)); + proc-waitStatus = STATUS_OK; /* We set up the gxact's VXID as InvalidBackendId/XID */ - gxact-proc.lxid = (LocalTransactionId) xid; - gxact-proc.xid =
Re: [HACKERS] git trunk doesn't build
On 07.11.2011 13:14, Gregg Jaskiewicz wrote: http://pastebin.com/RjiRjGZc this is on fedora 12, gcc 4.6.1 rangetypes.c: In function ‘tsrange_subdiff’: rangetypes.c:1316:11: error: ‘timestamp’ undeclared (first use in this function) rangetypes.c:1316:11: note: each undeclared identifier is reported only once for each function it appears in rangetypes.c:1310:12: warning: unused variable ‘v2’ [-Wunused-variable] rangetypes.c:1309:12: warning: unused variable ‘v1’ [-Wunused-variable] rangetypes.c: In function ‘tstzrange_subdiff’: rangetypes.c:1332:11: error: ‘timestamp’ undeclared (first use in this function) rangetypes.c:1326:12: warning: unused variable ‘v2’ [-Wunused-variable] rangetypes.c:1325:12: warning: unused variable ‘v1’ [-Wunused-variable] Looks like the range types patch was broken for float timestamps. I'll go fix that. Thanks for the report! -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 7 November 2011 11:57, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Looks like the range types patch was broken for float timestamps. I'll go fix that. Thanks for the report! yup, no probs. -- GJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 7 November 2011 11:57, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 07.11.2011 13:14, Gregg Jaskiewicz wrote: http://pastebin.com/RjiRjGZc this is on fedora 12, gcc 4.6.1 rangetypes.c: In function ‘tsrange_subdiff’: rangetypes.c:1316:11: error: ‘timestamp’ undeclared (first use in this function) rangetypes.c:1316:11: note: each undeclared identifier is reported only once for each function it appears in rangetypes.c:1310:12: warning: unused variable ‘v2’ [-Wunused-variable] rangetypes.c:1309:12: warning: unused variable ‘v1’ [-Wunused-variable] rangetypes.c: In function ‘tstzrange_subdiff’: rangetypes.c:1332:11: error: ‘timestamp’ undeclared (first use in this function) rangetypes.c:1326:12: warning: unused variable ‘v2’ [-Wunused-variable] rangetypes.c:1325:12: warning: unused variable ‘v1’ [-Wunused-variable] Looks like the range types patch was broken for float timestamps. I'll go fix that. Thanks for the report! I notice that there are no machines in the build farm which build HEAD and use float timestamps, so this could silently happen again unless the Grzegorz Jaskiewiczs out there continue to report build issues. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Storing hot members of PGPROC out of the band
On Mon, Nov 7, 2011 at 6:45 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: While looking at GetSnapshotData(), it occurred to me that when a PGPROC entry does not have its xid set, ie. xid == InvalidTransactionId, it's pointless to check the subxid array for that proc. If a transaction has no xid, none of its subtransactions can have an xid either. That's a trivial optimization, see attached. I tested this with pgbench -S -M prepared -c 500 on the 8-core box, and it seemed to make a 1-2% difference (without the other patch). So, almost in the noise, but it also doesn't cost us anything in terms of readability or otherwise. Oh, that's a good idea. +1 for doing that now, and then we can keep working on the rest of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] -Wcast-qual cleanup, part 1
On Mon, Nov 7, 2011 at 12:13 AM, Peter Eisentraut pete...@gmx.net wrote: Anyway, attached is the first patch for your amusement. I can't help but wonder if the cure isn't worse than the disease. I mean, I very much like the fact that our code compiles without warnings, and I'm glad you're willing to put in the time to make that happen ... but aren't these warnings incredibly pedantic? const is like kudzu. Once you start using it, you find that you need it everywhere ... but your life is no better than it was before, except that now you have const. I'm suspicious of this hunk, for example: typedef struct ErrorContextCallback { struct ErrorContextCallback *previous; - void(*callback) (void *arg); - void *arg; + void(*callback) (const void *arg); + const void *arg; } ErrorContextCallback; Why should the callback be forced to treat its private argument as const? #define XLogRecGetData(record) ((char*) (record) + SizeOfXLogRecord) +#define XLogRecGetConstData(record)((const char*) (record) + SizeOfXLogRecord) IMHO, this is an example of everything that's wrong with const. The result will, I suppose, be const if and only if record is const. But there's no way to express that cleanly, so we have to duplicate the macro definition. And anyone who is not using the right compiler version will have to stare at the code and scratch their head to figure out which one to use. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] xslt_process() need
Can I vote to the xslt_process need in the SQL/XML core of the PostgreSQL 9.X ?
Re: [HACKERS] xslt_process() need
On Mon, Nov 7, 2011 at 8:20 AM, Peter Padua Krauss ppkra...@gmail.com wrote: Can I vote to the xslt_process need in the SQL/XML core of the PostgreSQL 9.X ? Not sure I understand this. Are you saying that you'd like to see contrib/xml2's xslt_process moved into core? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 7 November 2011 12:14, Thom Brown t...@linux.com wrote: I notice that there are no machines in the build farm which build HEAD and use float timestamps, so this could silently happen again unless the Grzegorz Jaskiewiczs out there continue to report build issues. I am rebuilding head once a week-ish for my own tests, but I don't have any build farm for that matter. It shouldn't be much of a hassle for me to create one that would rebuild it with float timestamps. Historically, that's what I have and I need to continue to use it that way. Otherwise pg_upgrade won't work for me, and it's not an option on big databases really. Are the build scripts open/available ? -- GJ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] optional cleaning queries stored in pg_stat_statements
On Sun, Nov 6, 2011 at 2:56 PM, Greg Smith g...@2ndquadrant.com wrote: Peter's patch adds a planner hook and hashes at that level. Since this cat is rapidly escaping its bag and impacting other contributors, we might as well share the work in progress early if anyone has a burning desire to look at the code. A diff against the version I've been internally reviewing in prep for community submission is at https://github.com/greg2ndQuadrant/postgres/compare/master...pg_stat_statements_norm Easier to browse than ask questions probing about it I think. Apologies to Peter for sharing his work before he was completely ready; there is a list of known problems with it. I also regret the thread hijack here. I think it's an established principle that the design for features like this should, for best results, be discussed on -hackers before writing a lot of code. That not only avoids duplication of effort (which is nice), but also allows design decisions like what exactly should we hash and where should the hooks be? to be ironed out early, before they can force a major rewrite. Of course, there's no hard requirement, but the archives are littered with patches that crashed and burned because the author wrote them in isolation and then, upon receiving feedback from the community that necessitated a full rewrite, gave up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] synchronous commit vs. hint bits
I've long considered synchronous_commit=off to be one of our best performance features. Certainly, it's not applicable in every situation, but there are many applications where losing a second or so worth of transactions is an acceptable price to pay for not needing to wait for the disk to spin around for every commit. However, recent experimentation has convinced me that it's got a serious downside: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. I fixed this for temporary and unlogged tables in commit 53f1ca59b5875f1d3e95ee709ecaddcbdfdbd175, but the same issue exists (without any clear fix) for permanent tables. Here are some benchmark results on Nate Boley's 32-core AMD system. These are pgbench -T 300 -c 32 -j 32 runs with scale factor 100, shared_buffers = 8GB, maintenance_work_mem = 1GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9: tps = 8360.657049 (including connections establishing) tps = 7818.766335 (including connections establishing) tps = 8344.653290 (including connections establishing) And here are the same results after lobotomizing SetHintBits() to always sent the hint bits immediately (#if 0 around the TransactionIdIsValid(xid) test): tps = 9548.943930 (including connections establishing) tps = 9579.485767 (including connections establishing) tps = 9590.350954 (including connections establishing) That's pretty significant - about a 15% improvement. That's quite remarkable when you think about the fact that we're talking about refraining from setting hint bits for just a fraction of a second. The failure to sent those hint bits even for that very brief period of time has to cause enough additional work (or lock contention) to degrade performance quite noticeably. So, what could we do about this? Ideas: 1. Set the hint bits right away, and avoid letting the page be flushed to disk until the commit record is durably on disk (by bumping the page LSN?). 2. Improve CLOG concurrency or performance in some way so that consulting it repeatedly doesn't slow us down so much. 3. Do more backend-private XID status caching - in particular, for commits, since this isn't a problem for aborts. 4. (Crazy idea) Have something that's like a hint bit, but stored in the buffer header rather than the data block itself. We allocate an array large enough to hold 2 bits per tuple (for the maximum number of tuples that can exist on a page), with one bit indicating that xmin is async-committed and the other indicating that xmax is async-committed. There are probably other options as well. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
Gregg Jaskiewicz gryz...@gmail.com wrote: On 7 November 2011 12:14, Thom Brown t...@linux.com wrote: I notice that there are no machines in the build farm which build HEAD and use float timestamps, so this could silently happen again unless the Grzegorz Jaskiewiczs out there continue to report build issues. I am rebuilding head once a week-ish for my own tests, but I don't have any build farm for that matter. It shouldn't be much of a hassle for me to create one that would rebuild it with float timestamps. Historically, that's what I have and I need to continue to use it that way. Otherwise pg_upgrade won't work for me, and it's not an option on big databases really. Are the build scripts open/available ? http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 9:31 AM, Robert Haas robertmh...@gmail.com wrote: So, what could we do about this? Ideas: 1. Set the hint bits right away, and avoid letting the page be flushed to disk until the commit record is durably on disk (by bumping the page LSN?). 2. Improve CLOG concurrency or performance in some way so that consulting it repeatedly doesn't slow us down so much. 3. Do more backend-private XID status caching - in particular, for commits, since this isn't a problem for aborts. 4. (Crazy idea) Have something that's like a hint bit, but stored in the buffer header rather than the data block itself. We allocate an array large enough to hold 2 bits per tuple (for the maximum number of tuples that can exist on a page), with one bit indicating that xmin is async-committed and the other indicating that xmax is async-committed. There are probably other options as well. 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] -Wcast-qual cleanup, part 1
Robert Haas robertmh...@gmail.com writes: On Mon, Nov 7, 2011 at 12:13 AM, Peter Eisentraut pete...@gmx.net wrote: typedef struct ErrorContextCallback { struct ErrorContextCallback *previous; -void(*callback) (void *arg); -void *arg; +void(*callback) (const void *arg); +const void *arg; } ErrorContextCallback; Why should the callback be forced to treat its private argument as const? Yes, the above seems like a seriously bad idea. It's probably true that most existing callbacks could afford to declare their callback args as const, but it does not follow that we should legislate that in the API. All that that will accomplish is to move the need to cast away const from some places to some other places. #define XLogRecGetData(record) ((char*) (record) + SizeOfXLogRecord) +#define XLogRecGetConstData(record) ((const char*) (record) + SizeOfXLogRecord) IMHO, this is an example of everything that's wrong with const. Yes, this seems pretty horrid as well. Not so much just the one instance as the implications of this sweeping requirement: 2. Macros accessing structures should come in two variants: a get version, and a set/anything else version, so that the get version can preserve the const qualifier. I'm not prepared to buy into that as a general coding rule. Maybe it would be better to just add -Wno-cast-qual to CFLAGS. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
Robert Haas robertmh...@gmail.com writes: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. It seems to me that you've jumped to proposing solutions before you know where the problem actually is --- or at least, if you do know where the problem is, you didn't explain it. Is the cost in repeating clog lookups, or in testing to determine whether it's safe to set the bit yet, or is it contention associated with one or the other of those? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree gist known problems
Jeff Davis pg...@j-davis.com writes: I'm looking for known problem areas in btree_gist. I see: http://archives.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us With Range Types, I'm anticipating that btree_gist will become more important, so I'd like to know what bugs are holding it back. So, anything else come to mind? Or does btree_gist just need a good review? Or have the problems been fixed? The thing I was complaining about in the referenced message is specific to the inet opclass; it's unfair to make it sound like it's a generic problem with btree_gist. And no, it's not been fixed AFAIK. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 11/07/2011 09:40 AM, Kevin Grittner wrote: Gregg Jaskiewiczgryz...@gmail.com wrote: On 7 November 2011 12:14, Thom Brownt...@linux.com wrote: I notice that there are no machines in the build farm which build HEAD and use float timestamps, so this could silently happen again unless the Grzegorz Jaskiewiczs out there continue to report build issues. I am rebuilding head once a week-ish for my own tests, but I don't have any build farm for that matter. It shouldn't be much of a hassle for me to create one that would rebuild it with float timestamps. Historically, that's what I have and I need to continue to use it that way. Otherwise pg_upgrade won't work for me, and it's not an option on big databases really. Are the build scripts open/available ? http://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto Yeah. I just updated this and the buildfarm web site to point to GitHub where the code and releases now live. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 8:31 AM, Robert Haas robertmh...@gmail.com wrote: I've long considered synchronous_commit=off to be one of our best performance features. Certainly, it's not applicable in every situation, but there are many applications where losing a second or so worth of transactions is an acceptable price to pay for not needing to wait for the disk to spin around for every commit. However, recent experimentation has convinced me that it's got a serious downside: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. I fixed this for temporary and unlogged tables in commit 53f1ca59b5875f1d3e95ee709ecaddcbdfdbd175, but the same issue exists (without any clear fix) for permanent tables. What's the source of the regression? Is it coming from losing the hint bit and being forced out to clog? How likely is it really going to happen in non synthetic real world cases? Thinking about the hint bit cache I was playing with a while back, I guess you could have put the hint bit in the cache but refrained from marking it in the page in the TransactionIdIsValid(xid)=false case -- in the first implementation I had only put the bit in the cache when it was valid -- since TransactionIdIsValid(xid) is not necessarily cheap though, maybe it's worth reserving an extra bit for the transaction being valid in the cache if you went down that road. Another way to attack this problem is to re-check and set the hint bit if you can do it in the bgwriter -- there's a good chance you will catch it in oltp environments like pgbench although it not clear if the cost to the general case would be too high. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] optional cleaning queries stored in pg_stat_statements
On 11/07/2011 09:03 AM, Robert Haas wrote: I think it's an established principle that the design for features like this should, for best results, be discussed on -hackers before writing a lot of code. You can see from the commit history this idea is less than a month old. Do we need to get community approval before writing a proof of concept now? Everyone would still be arguing about how to get started had that path been taken. If feedback says this needs a full rewrite, fine. We are familiar with how submitting patches works here. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 9:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. It seems to me that you've jumped to proposing solutions before you know where the problem actually is --- or at least, if you do know where the problem is, you didn't explain it. Is the cost in repeating clog lookups, or in testing to determine whether it's safe to set the bit yet, or is it contention associated with one or the other of those? In the current code, if you get to the IsValid check and fail to set the bit, you've essentially done all the work for no reason. I tested this pretty well a few months back, and (recalling from memory), the IsValid check is maybe 25% of the entire cost when you fail through the hint bit -- this is why I organized the cache to only store the bit if the xid was known good -- then you get to skip the check in the known good case and immediately set the bit (w/o marking dirty) and move on. As noted above, the cache I was playing with was a win from performance point of view, but would require another bit to address Robert's proposed case, and should really be prepared against alternative solutions (like marking the bit in the bgwriter) before being seriously considered. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 07.11.2011 14:01, Gregg Jaskiewicz wrote: On 7 November 2011 11:57, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Looks like the range types patch was broken for float timestamps. I'll go fix that. Thanks for the report! yup, no probs. Fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] optional cleaning queries stored in pg_stat_statements
On 11/06/2011 06:00 PM, Tom Lane wrote: Peter Geogheganpe...@2ndquadrant.com writes: A major consideration was backwards compatibility; This is not a consideration that the community is likely to weigh heavily, or indeed at all. We aren't going to back-port this feature into prior release branches, and we aren't going to want to contort its definition to make that easier. Being able to ship a better pg_stat_statements that can run against earlier versions as an extension has significant value to the community. Needing to parse log files to do statement profiling is a major advocacy issue for PostgreSQL. If we can get something useful that's possible to test as an extension earlier than the 9.2 release--and make it available to more people running older versions too--that has some real value to users, and for early production testing of what will go into 9.2. The point where this crosses over to requiring server-side code to operate at all is obviously a deal breaker on that idea. So far that line hasn't been crossed, and we're trying to stage testing against older versions on real-world queries. As long as it's possible to keep that goal without making the code more difficult to deal with, I wouldn't dismiss that as a useless distinction. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] optional cleaning queries stored in pg_stat_statements
On Mon, Nov 7, 2011 at 10:27 AM, Greg Smith g...@2ndquadrant.com wrote: On 11/07/2011 09:03 AM, Robert Haas wrote: I think it's an established principle that the design for features like this should, for best results, be discussed on -hackers before writing a lot of code. You can see from the commit history this idea is less than a month old. Do we need to get community approval before writing a proof of concept now? Everyone would still be arguing about how to get started had that path been taken. If feedback says this needs a full rewrite, fine. We are familiar with how submitting patches works here. Eh, obviously that didn't come off the way I meant it, since I already got one other off-list reply suggesting that my tone there may not have been the best. Sorry. I suppose in a way I am taking myself to task as much as anyone, since I have been spending a lot of time thinking about things lately, and I haven't been as good about converting those ideas to a form suitable for posting on -hackers, and actually doing it, as I historically have been, and I'm feeling like I need to get back to doing that more. But I honestly don't care one way or the other how you or Peter develop your patches, beyond the fact that they contain a lot of good ideas which I would like to see work their way into the tree; and of course I do know that you are already familiar with the process. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
J Smith dark.panda+li...@gmail.com writes: Alright, I wrote up another patch that uses strchr to parse out the lines of the unaccent.rules file, foregoing sscanf completely. Hopefully this looks a bit better than using swscanf. I looked at this a bit and realized that sscanf is actually doing a couple of critical things for us, which are lost in translation when doing it like this: 1. It ignores whitespace other than the dividing tab. If we don't continue to do that, we'll likely break existing config files. 2. It ensures that src and trg each consist of at least one (nonblank) character. placeChar() is critically dependent on the assumption that src is not empty. However, after looking around a bit at the other tsearch config-file- reading functions, I noted that they all use t_isspace() to identify whitespace ... and that function in fact should be okay on OS X, because it uses iswspace in multibyte encodings. So it's fairly simple to improve this code to reject whitespace that way. I don't like the existing code anyway because of its potential vulnerability to buffer overrun. I'll fix it up and commit. As for the other problems with isspace and such on OSX, it might be worth looking at the python portability fixes. If OS X's UTF8 locales weren't so thoroughly broken (eg sorting does not work), I might be tempted to try to do it that way, but I still fail to see the point. After reviewing the code I feel that unaccent needs to be fixed because it's not consistent with the other tsearch config file parsers, and not so much because it works or doesn't work on any specific platform. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 10:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. It seems to me that you've jumped to proposing solutions before you know where the problem actually is --- or at least, if you do know where the problem is, you didn't explain it. Is the cost in repeating clog lookups, or in testing to determine whether it's safe to set the bit yet, or is it contention associated with one or the other of those? Good question. One possibly informative fact is that, on unlogged tables, the same change doesn't seem to make any difference. Here are the benchmark results with unlogged tables, configuration otherwise identical to the OP: [unpatched] tps = 10624.851704 (including connections establishing) tps = 10507.024822 (including connections establishing) tps = 10714.411389 (including connections establishing) [test whacked out] tps = 10779.704540 (including connections establishing) tps = 10523.863100 (including connections establishing) tps = 10654.102699 (including connections establishing) The difference might be noise, or it may be a very small real effect, but it's clearly tiny compared to the change for permanent tables (but note that this was not true prior to commit 53f1ca59b5875f1d3e95ee709ecaddcbdfdbd175). This seems to me to be fairly compelling evidence that the problem is in the clog lookups themselves, rather than the test that determines whether or not it's safe to set the bit. However, I don't know whether the problem is the cost of the test itself or some kind of associated contention. I don't see much difference in CPU utilization between the patched and unpatched code, but that's not really accurate enough to be certain. I just reran both tests with LWLOCK_STATS defined. Again, five minute test run: lwlock 11: shacq 87323748 exacq 3708555 blk 1932719 [unpatched] lwlock 11: shacq 11682513 exacq 4769472 blk 677534 [patched] 11 = CLogControlLock, so you can see that the unpatched code is acquiring CLogControlLock in shared mode more 7x as often and blocks on the lock about 3x as often, despite processing fewer transactions. The patched code has more exclusive-acquires, but that's at least partly just because it's processing more transactions. Unfortunately, I don't have oprofile access on this box and can't see exactly where the time is being spent. However, I am not sure how much it matters. With that much of an increase in CLOG traffic, something's gotta give. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] btree gist known problems
On Mon, 2011-11-07 at 10:18 -0500, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: I'm looking for known problem areas in btree_gist. I see: http://archives.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us With Range Types, I'm anticipating that btree_gist will become more important, so I'd like to know what bugs are holding it back. So, anything else come to mind? Or does btree_gist just need a good review? Or have the problems been fixed? The thing I was complaining about in the referenced message is specific to the inet opclass; it's unfair to make it sound like it's a generic problem with btree_gist. And no, it's not been fixed AFAIK. Sorry, I thought I remembered a few other comments but couldn't find them in the archives. I must be mistaken. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
On Nov7, 2011, at 17:46 , J Smith wrote: On Mon, Nov 7, 2011 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: If OS X's UTF8 locales weren't so thoroughly broken (eg sorting does not work), I might be tempted to try to do it that way, but I still fail to see the point. After reviewing the code I feel that unaccent needs to be fixed because it's not consistent with the other tsearch config file parsers, and not so much because it works or doesn't work on any specific platform. Yeah, I never knew there was such a problem with OSX and UTF8 before running into it here but it's good to know. Various issues with OSX and UTF-8 locales seems to come up quite often, yet we're not really in a position to do anything about them. Thus, I think we should warn about these issues and save people the trouble of finding out about this the hard way. The only question is, what would be a good place for such a warning? best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
On Mon, Nov 7, 2011 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: I looked at this a bit and realized that sscanf is actually doing a couple of critical things for us, which are lost in translation when doing it like this: 1. It ignores whitespace other than the dividing tab. If we don't continue to do that, we'll likely break existing config files. 2. It ensures that src and trg each consist of at least one (nonblank) character. placeChar() is critically dependent on the assumption that src is not empty. However, after looking around a bit at the other tsearch config-file- reading functions, I noted that they all use t_isspace() to identify whitespace ... and that function in fact should be okay on OS X, because it uses iswspace in multibyte encodings. So it's fairly simple to improve this code to reject whitespace that way. I don't like the existing code anyway because of its potential vulnerability to buffer overrun. I'll fix it up and commit. As for the other problems with isspace and such on OSX, it might be worth looking at the python portability fixes. If OS X's UTF8 locales weren't so thoroughly broken (eg sorting does not work), I might be tempted to try to do it that way, but I still fail to see the point. After reviewing the code I feel that unaccent needs to be fixed because it's not consistent with the other tsearch config file parsers, and not so much because it works or doesn't work on any specific platform. Yeah, I never knew there was such a problem with OSX and UTF8 before running into it here but it's good to know. When I noticed the unnaccent extension in more recent PostgreSQL versions, I figured it would perform better than our current plperl-based accent stripping function (which it surely does) and just noticed the results on my machine were a little off, but our linux-based servers were fine and dandy and yadda yadda yadda. Anyways, lemme know if there's anything else I could help with or could test and whatnot. Cheers. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
J Smith dark.panda+li...@gmail.com writes: Anyways, lemme know if there's anything else I could help with or could test and whatnot. Cheers. If you have time to check that the patch I just committed fixes your problem, it'd be worth doing. I did not test it on OS X ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] voting to the xslt_process() need
I vote to see the contrib/xml2's xslt_process() build-in function (only xslt_process) moved into PostgreSQL core. Database servers can offer some load balancing with another CPUs, like a PHP server... I have in mind some main examples: * If database server not process XSLT, the *balance* is lost (for PHP processing DOM and XSLT). * If database server not process XSLT, a lot of *traffic* and not-elegant * code* is necessary. * If a XML framework is developed for SQL-side, like Oracle-APEX, a lot of *extra-workaround* is necessary to build the framework (with external XML processing). * ... Another big problem is the *lack of xQuery*, them the *internal processing of XSLT is a important workaround*. Several serious XML projects are being lost to Oracle, only because of this lack of xQuey and XSLT support. PS: the main XPath libraries implements also XSLT; PostgreSQL core have XPath, to add XSLT is only a little more.
Re: [HACKERS] type privileges and default privileges
On Sat, Nov 5, 2011 at 10:35 AM, Peter Eisentraut pete...@gmx.net wrote: During the closing days of the 9.1 release, we had discussed that we should add privileges on types (and domains), so that owners can prevent others from using their types because that would prevent the owners from changing them in certain ways. (Collations have similar issues and work quite similar to types, so we could include them in this consideration.) As I'm plotting to write code for this, I wonder about how to handle default privileges. For compatibility and convenience, we would still want to have types with public privileges by default. Should we continue to hardcode this, as we have done in the past with functions, for example, or should we use the new default privileges facility to register the public default privileges in the template database? I think it would make sense to follow the model of default privileges, but I'm a bit confused by the rest of this, because pg_default_acl is normally empty - you only make an entry there when a schema has different defaults than the, uh, default defaults. So you shouldn't need to register anything, I wouldn't think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] voting to the xslt_process() need
On 11/07/2011 12:10 PM, Peter Padua Krauss wrote: I vote to see the contrib/xml2's xslt_process() build-in function (only xslt_process) moved into PostgreSQL core. Database servers can offer some load balancing with another CPUs, like a PHP server... I have in mind some main examples: * If database server not process XSLT, the *balance* is lost (for PHP processing DOM and XSLT). * If database server not process XSLT, a lot of *traffic* and not-elegant *code* is necessary. * If a XML framework is developed for SQL-side, like Oracle-APEX, a lot of *extra-workaround* is necessary to build the framework (with external XML processing). * ... Another big problem is the *lack of xQuery*, them the *internal processing of XSLT is a important workaround*. Several serious XML projects are being lost to Oracle, only because of this lack of xQuey and XSLT support. PS: the main XPath libraries implements also XSLT; PostgreSQL core have XPath, to add XSLT is only a little more. We don't really have a voting process like you seem to think. If you want something to happen then your best bet is to devote resources to making it happen. This could be either effort in the way of patches, or sponsorship money for someone who is capable of making it happen, for example. Please also see earlier discussions of these items in the mailing list. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 2:45 PM, Robert Haas robertmh...@gmail.com wrote: 2. Improve CLOG concurrency or performance in some way so that consulting it repeatedly doesn't slow us down so much. We should also ask what makes the clog slow. I think it shows physical contention as well as logical contention on the lwlock. Since we have 2 bits per transaction that means we will see at least 256 transactions fitting in each cacheline in the clog. Consecutive transactions are currently stored next to each other in the clog, so that the current cacheline needs to be passed around between 256 transactions, one at a time. That is a problem if they all finish near enough the same time. My proposal is to stripe the clog, so that consecutive xids are not adjacent in the clog, such that xids are always at least 64 bytes apart on a 8192 byte clog page. That allows 128 commits with consecutive xids to complete concurrently, with respect to physical access to memory. That's just a one line change in the defines at top of clog.c, so easy enough to play with. #define CACHELINE_SZ 64 #define CACHELINES_PER_BLOCK (BLCKSZ / CACHELINE_SZ) #define CLOG_XACTS_PER_CACHELINE (CLOG_XACTS_PER_BYTE * CACHELINE_SZ) #define TransactionIdToByte(xid)\ (CACHELINES_PER_BLOCK * \ (TransactionIdToPgIndex(xid) /CLOG_XACTS_PER_CACHELINE)) \ + (TransactionIdToPgIndex(xid) % CLOG_XACTS_PER_CACHELINE) plus few extra lines to fix the other defines. 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. I'm working on this already as part of the update for power reduction/group commit/replication performance. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
On Mon, Nov 7, 2011 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: If you have time to check that the patch I just committed fixes your problem, it'd be worth doing. I did not test it on OS X ... Looks good to me, thanks. Would it even really be worth it to look into any of the other locale issues on OSX, given that PostgreSQL is now included in their default installs starting with 10.7, or would this really be more of a case of hoping Apple some day fixes the issue upstream? It doesn't seem like that's going to happen any time soon, mind you, as apparently this is a rather long-standing issue in their libc, but who knows. I know OSX isn't exactly the most popular database server OS out there, but it seems to be becoming more popular for developers these days at the very least. Anyways, cheers, and thanks again. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
J Smith dark.panda+li...@gmail.com writes: Would it even really be worth it to look into any of the other locale issues on OSX, given that PostgreSQL is now included in their default installs starting with 10.7, or would this really be more of a case of hoping Apple some day fixes the issue upstream? To my mind, the killer issue is the incorrect sorting --- there's basically no way we can work around that. If Apple were to fix that, we might be able to nibble at the margins of the other stuff. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unaccent extension missing some accents
On Mon, Nov 7, 2011 at 11:53 AM, Florian Pflug f...@phlo.org wrote: Various issues with OSX and UTF-8 locales seems to come up quite often, yet we're not really in a position to do anything about them. Thus, I think we should warn about these issues and save people the trouble of finding out about this the hard way. The only question is, what would be a good place for such a warning? Hmm, I suppose one place could be on initdb if initializing with a UTF-8 locale, although that only really helps with fixed settings like LC_COLLATE and LC_CTYPE and the defaults for the user-configurable settings, right? For the configurable settings I guess logging as warnings on server start up or when they're changed via SET. But then there's all of the other places, like when using COLLATE and using locale-aware functions and so on and so forth. It would be a lot to cover, I'll bet. I guess maybe initdb, start up and SET warnings and a note in the docs would hopefully suffice? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] voting to the xslt_process() need
Andrew Dunstan and...@dunslane.net writes: Please also see earlier discussions of these items in the mailing list. Yes. It's very unlikely that we'll accept a patch that just moves xslt_process into core without doing anything about its definitional and implementation shortcomings. See for instance http://archives.postgresql.org/pgsql-hackers/2011-02/msg01878.php regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 12:26 PM, Simon Riggs si...@2ndquadrant.com wrote: 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. I'm working on this already as part of the update for power reduction/group commit/replication performance. OK. Here's an interesting result on that front that I so far can't explain. I lowered wal_writer_delay to 20 ms and repeated my test: tps = 10175.265689 (including connections establishing) [unpatched] tps = 10159.597727 (including connections establishing) [patched] Now, that's odd. I expect that to improve performance on the unpatched code, by reducing the amount of time we have to wait for the commit record to hit the disk. I did *not* expect it to improve the performance of the patched code, since one would think that setting the hint bit the first time through would be about as good as we could possibly do. And yet, those are the numbers. Apparently, there's some other effect whereby a more responsive walwriter improves performance on this setup (beats me what it is, though). Here it is with wal_writer_delay=50 ms: tps = 9964.225358 (including connections establishing) [unpatched] tps = 10048.396729 (including connections establishing) [patched] And back to wal_writer_delay=200ms: tps = 8119.121633 (including connections establishing) [unpatched] tps = 9602.645495 (including connections establishing) [patched] So it seems like there is quite a bit of win available here, though at the moment I don't know why. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 9:25 AM, Merlin Moncure mmonc...@gmail.com wrote: On Mon, Nov 7, 2011 at 8:31 AM, Robert Haas robertmh...@gmail.com wrote: I've long considered synchronous_commit=off to be one of our best performance features. Certainly, it's not applicable in every situation, but there are many applications where losing a second or so worth of transactions is an acceptable price to pay for not needing to wait for the disk to spin around for every commit. However, recent experimentation has convinced me that it's got a serious downside: SetHintBits() can't set HEAP_XMIN_COMMITTED or HEAP_XMAX_COMMITTED hints until the commit record has been durably flushed to disk. It turns out that can cause a major performance regression on systems with many CPU cores. I fixed this for temporary and unlogged tables in commit 53f1ca59b5875f1d3e95ee709ecaddcbdfdbd175, but the same issue exists (without any clear fix) for permanent tables. What's the source of the regression? Is it coming from losing the hint bit and being forced out to clog? How likely is it really going to happen in non synthetic real world cases? Thinking about the hint bit cache I was playing with a while back, I guess you could have put the hint bit in the cache but refrained from marking it in the page in the TransactionIdIsValid(xid)=false case -- in the first implementation I had only put the bit in the cache when it was valid -- since TransactionIdIsValid(xid) is not necessarily cheap though, maybe it's worth reserving an extra bit for the transaction being valid in the cache if you went down that road. Another way to attack this problem is to re-check and set the hint bit if you can do it in the bgwriter -- there's a good chance you will catch it in oltp environments like pgbench although it not clear if the cost to the general case would be too high. Thinking about this more, the backend local cache approach is probably going to be useless in terms of addressing this problem -- mostly due to the fact that the cache is, well, local. Even if backend A takes the time to mark the bit in its own cache, backends B-Z haven't yet and presumably by the time they do the page has been rolled out anyways so you get no benefit. The cache helps when a backend sees the same transaction spread out over a number of tuples/pages -- that's simply not the case in OLTP. Doing the work in the bgwriter might do the trick assuming the bgwriter consistently loses the race against both transaction resolution and the wal, and the extra clog lookup (when you win the race) penalty doesn't sting too muh...possibly do this in conjuction with clog striping Simon is thinking about. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 1:08 PM, Merlin Moncure mmonc...@gmail.com wrote: Thinking about this more, the backend local cache approach is probably going to be useless in terms of addressing this problem -- mostly due to the fact that the cache is, well, local. Even if backend A takes the time to mark the bit in its own cache, backends B-Z haven't yet and presumably by the time they do the page has been rolled out anyways so you get no benefit. The cache helps when a backend sees the same transaction spread out over a number of tuples/pages -- that's simply not the case in OLTP. Ah, right. Good point. Doing the work in the bgwriter might do the trick assuming the bgwriter consistently loses the race against both transaction resolution and the wal, and the extra clog lookup (when you win the race) penalty doesn't sting too muh... But I can't see how this can work. The background writer is only designed to do one thing: ensuring a supply of clean buffers for backends that need to allocate new ones. I'm not sure the background writer is going to do anything at all on this test, since the data set fits inside shared_buffers and therefore there's no buffer eviction happening. But even if it does, it's certainly not going to scan all 1 million shared buffers anywhere near quick enough to matter; it's going to be limited to at most 100 buffers every 200 ms, which means that even if it ran at top speed for the entire test, it would only get through about 15% of the buffer pool even *once* before the test ended. That's not even slightly close to what would be needed to move the needle here; you would need to visit any given buffer within a few hundred milliseconds of the relevant transaction commit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)
During work on gist for range types I've faced with following problem: test=# select 'empty'::int4range !?; ERROR: operator does not exist: int4range !? LINE 1: select 'empty'::int4range !?; ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. test=# select 'empty'::int4range ?; ERROR: operator does not exist: int4range ? LINE 1: select 'empty'::int4range ?; ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. So, !? and ? operators are mentioned in documentation, but don't present in catalog. Are them just missed in the catalog or there is some more serious problem? -- With best regards, Alexander Korotkov.
Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)
Alexander Korotkov aekorot...@gmail.com writes: So, !? and ? operators are mentioned in documentation, but don't present in catalog. Are them just missed in the catalog or there is some more serious problem? IIRC, Heikki removed them from the final commit. Sounds like he missed some documentation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 12:19 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 7, 2011 at 1:08 PM, Merlin Moncure mmonc...@gmail.com wrote: Thinking about this more, the backend local cache approach is probably going to be useless in terms of addressing this problem -- mostly due to the fact that the cache is, well, local. Even if backend A takes the time to mark the bit in its own cache, backends B-Z haven't yet and presumably by the time they do the page has been rolled out anyways so you get no benefit. The cache helps when a backend sees the same transaction spread out over a number of tuples/pages -- that's simply not the case in OLTP. Ah, right. Good point. Doing the work in the bgwriter might do the trick assuming the bgwriter consistently loses the race against both transaction resolution and the wal, and the extra clog lookup (when you win the race) penalty doesn't sting too muh... But I can't see how this can work. The background writer is only designed to do one thing: ensuring a supply of clean buffers for backends that need to allocate new ones. I'm not sure the background writer is going to do anything at all on this test, since the data set fits inside shared_buffers and therefore there's no buffer eviction happening. But even if it does, it's certainly not going to scan all 1 million shared buffers anywhere near quick enough to matter; it's going to be limited to at most 100 buffers every 200 ms, which means that even if it ran at top speed for the entire test, it would only get through about 15% of the buffer pool even *once* before the test ended. That's not even slightly close to what would be needed to move the needle here; you would need to visit any given buffer within a few hundred milliseconds of the relevant transaction commit. Well, I'd argue that in most real world, high write intensity databases there is constant pressure on pages to be flushed out to make room for new ones being written to and the database size is much, much larger than shared buffers -- pgbench is 100% update and pretty novel in that respect. I guess I said 'bgwriter' when I really meant 'generally upon eviction, either by bgwriter or an evicting backend'. But even given that, probably the lag is too long to be of useful benefit to your problem. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Agreed. This thread has already expended too much of our valuable time, in my opinion. As I said elsewhere, I think that a modified version of Simon's proposal gets us all what we want except code cleanliness. I'd like to hear from Tom on that issue. The proposal is: 1. No changes are expected to PITR mode, unless required by the other changes below. 2. standby_mode becomes an ENUM: off,standby,pitr. It can be reset on server reload or through pg_ctl promote 3. we add a recovery_file filepath parameter to postgresql.conf. This points to the location of recovery.conf, if any, but does not error fatally if the file doesn't exist, allowing recovery.conf/.done if we want to support it. 3.a. we continue to support recovery.conf/.done trigger file behavior if recovery_file is set. In this case, the recovery.conf file would contain the standby_mode GUC. 3.b. should pitr mode require recovery_file? Doesn't seem like it. 3.c. we begin arguments on whether or not recovery_file should be set by default 4. Haas implements SET PERSISTENT for postgresql.conf 5. pg_ctl promote uses SET PERSISTENT to change standby_mode so that former standbys can be permanently promoted. 6. (optional) we add include_if_exists specifier for postgresql.conf, allowing scripters to handle having a separate recovery.conf file in other ways. 7. (optional) we add a pg_ctl standby command which starts up postgres and sets standby_mode to standby. I'm not sure I see the value in this though. The above allows DBAs to operate in backwards compatibility mode without forcing authors of new tools and scripts to abide by it. Question: if both standby_mode and recovery_file are set, what should happen if the recovery_file is not present? For backwards compatibility, we would use SET PERSISTENT to set standby_mode after the server comes up. Arguments? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] git trunk doesn't build
On 7 Nov 2011, at 15:44, Heikki Linnakangas wrote: On 07.11.2011 14:01, Gregg Jaskiewicz wrote: On 7 November 2011 11:57, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Looks like the range types patch was broken for float timestamps. I'll go fix that. Thanks for the report! yup, no probs. Fixed. Thank You. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)
First version of GiST for range types patch is here. Comments refactoring testing are coming soon. -- With best regards, Alexander Korotkov. rangetypegist-0.1.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] unite recovery.conf and postgresql.conf
Josh Berkus j...@agliodbs.com writes: As I said elsewhere, I think that a modified version of Simon's proposal gets us all what we want except code cleanliness. I'd like to hear from Tom on that issue. Well, code complexity is hard to gauge without coding a draft implementation, but I think this largely fails on UI complexity. It's still paying too high a price for backwards compatibility IMO. The more I read about this, the more doubtful I am that unifying PITR recovery with standby mode is a good idea from the UI perspective. They may share a lot of common infrastructure but they need to be triggered in fundamentally different ways. In particular, one of the reasons that recovery.conf/.done was set up the way that it was was to have a simple way of letting the system get out of PITR mode; without that, a crash during PITR recovery is going to lead to restarting from the PITR start point (because when we restart, we find configuration settings telling us to do that). We could possibly move the necessary state into pg_control, but keeping it as a GUC is going to be a mess. On the whole I still think a trigger file is a sane design for that. It may make sense to move the configuration data somewhere else, but we really need to be able to tell the difference between starting PITR, continuing PITR after a mid-recovery crash, and finished PITR, up and running normally. A GUC is not a good way to do that. The angst around this issue seems to me to largely stem from trying to use a configuration setup designed for one-shot PITR scenarios for hot standby scenarios, which are really pretty different. We have to divorce those two cases before we're going to have something that's sane and usable ... and AFAICS that means giving up backwards compatibility to some degree. We got this wrong in 9.0, which everyone understood at the time was an unpolished prototype implementation of replication. I don't think it's sensible to now move the goalposts and decree that we've got to be fully backward compatible with our first-cut mistakes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote: 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. I'm working on this already as part of the update for power reduction/group commit/replication performance. I extracted this from my current patch for you to test. Rather useful actually 'cos its allowed me a sensible phasing of the development. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services walwriter_latch.v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Mon, Nov 7, 2011 at 6:33 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, Nov 7, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote: 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. I'm working on this already as part of the update for power reduction/group commit/replication performance. I extracted this from my current patch for you to test. Thank you! Rather useful actually 'cos its allowed me a sensible phasing of the development. +1. reads patch Hmm, this is different than what I was expecting, although that's not necessarily bad. What this does is retain wal_writer_delay, but allow the WAL writer to be woken up more frequently if there's enough WAL to justify it. What I was expecting you to do is eliminate wal_writer_delay altogether and drive the wakeups entirely off of the latch. I think you could get away with that, because SetLatch is ridiculously cheap if the latch is already set. Anyway, I'll give this a spin as you have it and see what falls out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] heap vacuum cleanup locks
On Fri, Nov 4, 2011 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote: Doing that actually makes the patch simpler, so I went ahead and did that in the attached version, along with the renaming mentioned above. Hearing no objections, I went ahead and committed this version. Thanks for coding this up; I think this is a very useful improvement. It would still be nice to fix the case where we need to freeze a tuple that is on a page someone else has pinned, but I don't have any good ideas for how to do that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] removing =(text, text) in 9.2
On Mon, Nov 7, 2011 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Thanks for the review. New version attached. This looks sane to me too (though I only read the patch and didn't actually test upgrading). OK, committed. Thanks for the reviews, all. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql concise mode
On Sun, Nov 6, 2011 at 3:29 PM, Josh Kupershmidt schmi...@gmail.com wrote: On Sun, Nov 6, 2011 at 1:16 PM, Dickson S. Guedes lis...@guedesoft.net wrote: test=# \d+ foo Table public.foo Column | Type | Storage +-+- a | integer | plain b | integer | plain Has OIDs: no Using your example, what if column 'b' has a comment and 'a' not? How the above output will be displayed? Then the comments would be displayed as they previously were, like so: Table public.foo Column | Type | Storage | Description +-+-+- a | integer | plain | b | integer | plain | some comment Has OIDs: no I don't strongly object to this, but I wonder how useful it will really be in practice. It strikes me as the sort of advanced psql hackery that only a few people will use, and only some of those will gain any benefit. Empty columns don't really take up that much screen width, and even one value in any given column will require its inclusion anyway. I can also see myself turning it on and then going - oh, wait, is that column not there, or did it just disappear because I'm in concise mode? Not saying we shouldn't do it, just some food for thought. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql concise mode
On Mon, Nov 7, 2011 at 10:04 PM, Robert Haas robertmh...@gmail.com wrote: I don't strongly object to this, but I wonder how useful it will really be in practice. It strikes me as the sort of advanced psql hackery that only a few people will use, and only some of those will gain any benefit. I'm probably just not going to bother, based on the (lack of) response so far. I'd like to have it for myself, but not enough to make the patch and then fight for its inclusion :-) Empty columns don't really take up that much screen width, and even one value in any given column will require its inclusion anyway. What really prompted the proposal was my somewhat antiquated use of 80-column terminal windows (so that 2 or 3 fit side-by-side comfortably on my screen). A lot of the backslash commands are creeping well over that 80-column limit, even with the most basic of outputs, and I find the default line-wrapping hard to follow. And I'd bet that the use of column comments and column statistics targets, for example, are quite rare -- and that's almost 30 columns of horizontal space lost for the common case of \d+ tablename. Maybe I just have to get comfortable with the expanded mode. I can also see myself turning it on and then going - oh, wait, is that column not there, or did it just disappear because I'm in concise mode? Yeah, that would be a bit of a nuisance in some cases. Not saying we shouldn't do it, just some food for thought. Thanks for the feedback, anyway. Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql concise mode
On Mon, Nov 7, 2011 at 11:01 PM, Josh Kupershmidt schmi...@gmail.com wrote: What really prompted the proposal was my somewhat antiquated use of 80-column terminal windows (so that 2 or 3 fit side-by-side comfortably on my screen). A lot of the backslash commands are creeping well over that 80-column limit, even with the most basic of outputs, and I find the default line-wrapping hard to follow. And I'd bet that the use of column comments and column statistics targets, for example, are quite rare -- and that's almost 30 columns of horizontal space lost for the common case of \d+ tablename. Yeah, I've noticed that, too. OTOH, the regular old \d output, as opposed to \d+, still fits fairly well. Mostly. So I'm managing. But I can't help feeling that as we continue to add more features, we've eventually going to end up with our backs to the wall. Not sure what to do about that, but... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql expanded auto
On lör, 2011-11-05 at 12:26 -0400, Noah Misch wrote: On Sat, Nov 05, 2011 at 04:53:56PM +0200, Peter Eisentraut wrote: On fre, 2011-11-04 at 07:34 -0400, Noah Misch wrote: For \pset format wrapped, we only use $COLUMNS when the output is a tty. I'm thinking it's best, although not terribly important, to apply the same rule to this feature. I think it does work that way. There is only one place where the environment variable is queries, and it's used for both wrapped format and expanded auto format. You're correct; given output to a non-tty and no use of \pset columns, output_columns always becomes zero. This makes wrapped format never wrap, but it makes expanded auto mode always expand. Would it be more consistent to never expand when output_columns == 0? That is, make these give the same output: psql -X -P expanded=auto -c select 'a' as a psql -X -P expanded=auto -c select 'a' as a | cat I just noticed: the help text for \x in slashUsage() will also need an update. Here is an updated patch that addresses all the issues you pointed out. diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index d6941e0..01f57c4 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -1860,7 +1860,8 @@ lo_import 152801 para Sets the target width for the literalwrapped/ format, and also the width limit for determining whether output is wide enough to - require the pager. + require the pager or switch to the vertical display in expanded auto + mode. Zero (the default) causes the target width to be controlled by the environment variable envarCOLUMNS/, or the detected screen width if envarCOLUMNS/ is not set. @@ -1876,15 +1877,19 @@ lo_import 152801 termliteralexpanded/literal (or literalx/literal)/term listitem para - If replaceable class=parametervalue/replaceable is specified - it must be either literalon/literal or literaloff/literal - which will enable or disable expanded mode. If replaceable - class=parametervalue/replaceable is omitted the command toggles - between regular and expanded mode. - When expanded mode is enabled, query results - are displayed in two columns, with the column name on the left and - the data on the right. This mode is useful if the data wouldn't fit - on the screen in the normal quotehorizontal/quote mode. + If replaceable class=parametervalue/replaceable is specified it + must be either literalon/literal or literaloff/literal, which + will enable or disable expanded mode, or literalauto/literal. + If replaceable class=parametervalue/replaceable is omitted the + command toggles between the on and off settings. When expanded mode + is enabled, query results are displayed in two columns, with the + column name on the left and the data on the right. This mode is + useful if the data wouldn't fit on the screen in the + normal quotehorizontal/quote mode. In the auto setting, the + expanded mode is used whenever the query output is wider than the + screen, otherwise the regular mode is used. The auto setting is only + effective in the aligned and wrapped formats. In other formats, it + always behaves as if the expanded mode is off. /para /listitem /varlistentry @@ -2326,10 +2331,10 @@ lo_import 152801 varlistentry -termliteral\x/literal/term +termliteral\x [ replaceable class=parameteron/replaceable | replaceable class=parameteroff/replaceable | replaceable class=parameterauto/replaceable ]/literal/term listitem para -Toggles expanded table formatting mode. As such it is equivalent to +Sets or toggles expanded table formatting mode. As such it is equivalent to literal\pset expanded/literal. /para /listitem @@ -3197,7 +3202,8 @@ $endif para If literal\pset columns/ is zero, controls the width for the literalwrapped/ format and width for determining - if wide output requires the pager. + if wide output requires the pager or should be switched to the + vertical format in expanded auto mode. /para /listitem /varlistentry diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 2c38902..daaed66 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -1343,7 +1343,7 @@ exec_command(const char *cmd, free(fname); } - /* \x -- toggle expanded table representation */ + /* \x -- set or toggle expanded table representation */ else if (strcmp(cmd, x) == 0) { char *opt = psql_scan_slash_option(scan_state, @@ -2177,14 +2177,21 @@ do_pset(const char *param, const char *value, printQueryOpt *popt,
[HACKERS] ProcArrayLock contention
I've been playing with the attached patch, which adds an additional light-weight lock mode, LW_SHARED2. LW_SHARED2 conflicts with LW_SHARED and LW_EXCLUSIVE, but not with itself. The patch changes ProcArrayEndTransaction() to use this new mode. IOW, multiple processes can commit at the same time, and multiple processes can take snapshots at the same time, but nobody can take a snapshot while someone else is committing. Needless to say, I don't we'd really want to apply this, because adding a LW_SHARED2 mode that's probably only useful for ProcArrayLock would be a pretty ugly wart. But the results are interesting. pgbench, scale factor 100, unlogged tables, Nate Boley's 32-core AMD box, shared_buffers = 8GB, maintenance_work_mem = 1GB, synchronous_commit = off, checkpoint_segments = 300, checkpoint_timeout = 15min, checkpoint_completion_target = 0.9, wal_writer_delay = 20ms, results are median of three five-minute runs: #clients tps(master) tps(lwshared2) 1 657.984859 683.251582 8 4748.906750 4946.069238 32 10695.160555 17530.390578 80 7727.563437 16099.549506 That's a pretty impressive speedup, but there's trouble in paradise. With 80 clients (but not 32 or fewer), I occasionally get the following error: ERROR: t_xmin is uncommitted in tuple to be updated So it seems that there's some way in which this locking is actually incorrect, though I'm not seeing what it is at the moment. Either that, or there's some bug in the existing code that happens to be exposed by this change. The patch also produces a (much smaller) speedup with regular tables, but it's hard to know how seriously to take that until the locking issue is debugged. Any ideas? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company lwshared2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Nov 7, 2011, at 9:35 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Nov 7, 2011 at 6:33 PM, Simon Riggs si...@2ndquadrant.com wrote: On Mon, Nov 7, 2011 at 5:26 PM, Simon Riggs si...@2ndquadrant.com wrote: 5. Make the WAL writer more responsive, maybe using latches, so that it doesn't take as long for the commit record to make it out to disk. I'm working on this already as part of the update for power reduction/group commit/replication performance. I extracted this from my current patch for you to test. Thank you! Rather useful actually 'cos its allowed me a sensible phasing of the development. +1. reads patch Hmm, this is different than what I was expecting, although that's not necessarily bad. What this does is retain wal_writer_delay, but allow the WAL writer to be woken up more frequently if there's enough WAL to justify it. What I was expecting you to do is eliminate wal_writer_delay altogether and drive the wakeups entirely off of the latch. Oh, I think I see why you didn't do that... Anyway, I'll try to post test results in the morning. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] synchronous commit vs. hint bits
On Tue, Nov 8, 2011 at 2:35 AM, Robert Haas robertmh...@gmail.com wrote: What I was expecting you to do is eliminate wal_writer_delay altogether and drive the wakeups entirely off of the latch. Please continue to expect that, I just haven't finished it yet... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ProcArrayLock contention
On Tue, Nov 8, 2011 at 4:52 AM, Robert Haas robertmh...@gmail.com wrote: With 80 clients (but not 32 or fewer), I occasionally get the following error: ERROR: t_xmin is uncommitted in tuple to be updated So it seems that there's some way in which this locking is actually incorrect, though I'm not seeing what it is at the moment. Either that, or there's some bug in the existing code that happens to be exposed by this change. The semantics of shared locks is that they jump the existing queue, so this patch allows locks to be held in sequences not previously seen when using exclusive locks. For me, the second kind of lock should queue up normally, but then be released en masse when possible. So queue like an exclusive, but wake like a shared. Vaguely remember shared_queued.v1.patch That can then produce flip-flop lock parties. A slight problem there is that when shared locks queue they don't all queue together, a problem which the other patch addresses, written long ago. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services shared_lwlock.v1.patch Description: Binary data shared_queued.v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] heap vacuum cleanup locks
On Tue, Nov 8, 2011 at 2:54 AM, Robert Haas robertmh...@gmail.com wrote: It would still be nice to fix the case where we need to freeze a tuple that is on a page someone else has pinned, but I don't have any good ideas for how to do that. I think we need to avoid long pin hold times generally. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)
On 07.11.2011 20:36, Tom Lane wrote: Alexander Korotkovaekorot...@gmail.com writes: So, !? and ? operators are mentioned in documentation, but don't present in catalog. Are them just missed in the catalog or there is some more serious problem? IIRC, Heikki removed them from the final commit. Sounds like he missed some documentation. Yep. Fixed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers