Re: [HACKERS] GIN improvements part 1: additional information
On Sat, Dec 21, 2013 at 4:36 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Yet another version. The encoding/decoding code is now quite isolated in ginpostinglist.c, so it's easy to experiment with different encodings. This patch uses varbyte encoding again.

I got a bit carried away and experimented with a bunch of different encodings. I tried Rice encoding, Rice encoding with block and offset number deltas stored separately, the Simple9 variant, and varbyte encoding. The compressed size obviously depends a lot on the distribution of the items, but in the test set I used, the differences between the encodings were quite small.

One fatal problem with many encodings is VACUUM. If a page is completely full and you remove one item, the result must still fit. In other words, removing an item must never enlarge the space needed. Otherwise we would have to be able to split on vacuum, which adds a lot of code, and also makes it possible for VACUUM to fail if there is no disk space left. That's unpleasant if you're trying to run VACUUM to release disk space. (GIN fast updates already has that problem, BTW, but let's not make it worse.)

I believe that eliminates all encodings in the Simple family, as well as PForDelta, and surprisingly also Rice encoding. For example, if you have three items at consecutive offsets, the differences between them are encoded as 11 in Rice encoding. If you remove the middle item, the encoding for the next item becomes 010, which takes more space than the original. AFAICS varbyte encoding is safe from that (a formal proof would be nice, though).

So, I'm happy to go with varbyte encoding now; indeed I don't think we have much choice, unless someone can come up with an alternative that's VACUUM-safe. I have to put this patch aside for a while now; I spent a lot more time on these encoding experiments than I intended.
If you could take a look at this latest version, spend some time reviewing it and cleaning up any obsolete comments, and re-run the performance tests you did earlier, that would be great. One thing I'm slightly worried about is the overhead of merging the compressed and uncompressed posting lists in a scan. This patch will be in good shape for the final commitfest, or even before that.

I just tried out the patch gin-packed-postinglists-varbyte2.patch (which looks like the latest one in this thread) as follows:

1) Applied the patch to HEAD (on commit 94b899b829657332bda856ac3f06153d09077bd1)

2) Created a test table and index:

create table test (a text);
copy test from '/usr/share/dict/words';
create index test_trgm_idx on test using gin (a gin_trgm_ops);

3) Got the following error on a wildcard query:

postgres=# explain (buffers, analyze) select count(*) from test where a like '%tio%';
ERROR:  lock 9447 is not held
STATEMENT:  explain (buffers, analyze) select count(*) from test where a like '%tio%';
ERROR:  lock 9447 is not held

Following is the backtrace:

#0  LWLockRelease (lockid=9447) at lwlock.c:747
#1  0x0074a356 in LockBuffer (buffer=4638, mode=0) at bufmgr.c:2760
#2  0x00749d6e in UnlockReleaseBuffer (buffer=4638) at bufmgr.c:2551
#3  0x00478bcc in entryGetNextItem (ginstate=0x2380100, entry=0x23832a8) at ginget.c:555
#4  0x00479346 in entryGetItem (ginstate=0x2380100, entry=0x23832a8) at ginget.c:695
#5  0x0047987e in scanGetItem (scan=0x237fee8, advancePast=0x7fffa1a46b80, item=0x7fffa1a46b80, recheck=0x7fffa1a46b7f "\001") at ginget.c:925
#6  0x0047ae3f in gingetbitmap (fcinfo=0x7fffa1a46be0) at ginget.c:1540
#7  0x008a9486 in FunctionCall2Coll (flinfo=0x236f558, collation=0, arg1=37224168, arg2=37236160) at fmgr.c:1323
#8  0x004bd091 in index_getbitmap (scan=0x237fee8, bitmap=0x2382dc0) at indexam.c:649
#9  0x0064827c in MultiExecBitmapIndexScan (node=0x237fce0) at nodeBitmapIndexscan.c:89
#10 0x006313b6 in MultiExecProcNode (node=0x237fce0) at execProcnode.c:550
#11 0x0064713a in BitmapHeapNext (node=0x237e998) at nodeBitmapHeapscan.c:104
#12 0x0063c348 in ExecScanFetch (node=0x237e998, accessMtd=0x6470ac <BitmapHeapNext>, recheckMtd=0x647cbc <BitmapHeapRecheck>) at execScan.c:82
#13 0x0063c3bc in ExecScan (node=0x237e998, accessMtd=0x6470ac <BitmapHeapNext>, recheckMtd=0x647cbc <BitmapHeapRecheck>) at execScan.c:132
#14 0x00647d3a in ExecBitmapHeapScan (node=0x237e998) at nodeBitmapHeapscan.c:436
#15 0x00631121 in ExecProcNode (node=0x237e998) at execProcnode.c:414
#16 0x00644992 in agg_retrieve_direct (aggstate=0x237e200) at nodeAgg.c:1073
#17 0x006448cc in ExecAgg (node=0x237e200) at nodeAgg.c:1026
#18 0x00631247 in ExecProcNode (node=0x237e200) at execProcnode.c:476
#19 0x0062ef2a in ExecutePlan (estate=0x237e0e8, planstate=0x237e200, operation=CMD_SELECT, sendTuples=1 '\001', numberTuples=0, direction=ForwardScanDirection, dest=0xd0c360) at execMain.c:1474
#20 0x0062cfeb
Re: [HACKERS] dynamic shared memory and locks
On 01/05/2014 07:56 PM, Robert Haas wrote:

Right now, storing spinlocks in dynamic shared memory *almost* works, but there are problems with --disable-spinlocks. In that configuration, we use semaphores to simulate spinlocks. Every time someone calls SpinLockInit(), it's going to allocate a new semaphore which will never be returned to the operating system, so you're pretty quickly going to run out. There are a couple of things we could do about this:

5. Allocate a fixed number of semaphores, and multiplex them to emulate any number of spinlocks. For example, allocate 100 semaphores, and use the address of the spinlock modulo 100 to calculate which semaphore to use to emulate that spinlock. That assumes that you never hold more than one spinlock at a time, otherwise you can get deadlocks. I think that assumption holds currently, because acquiring two spinlocks at a time would be bad on performance grounds anyway.

- Heikki

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Support for pg_stat_archiver view
Hi Fabrizio,

On 05/01/14 20:46, Fabrizio Mello wrote:

I haven't seen your code yet, but I would like to know if it is possible to implement this view as an extension. I wanted to do it as an extension, so that I could backport it to previous versions of Postgres.

I do not think that is possible, given that the client code that is aware of the events lies in pgarch.c.

Ciao, Gabriele

-- Gabriele Bartolini - 2ndQuadrant Italia PostgreSQL Training, Services and Support gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] Compiling extensions on Windows
On Mon, Jan 6, 2014 at 3:32 AM, Craig Ringer cr...@2ndquadrant.com wrote:

Hi all

Out of personal interest (in pain and suffering) I was recently looking into how to compile extensions out-of-tree on Windows using Visual Studio (i.e. no PGXS). It looks like the conventional answer to this is "Do a source build of PG, compile your ext in-tree in contrib/, and hope the result is binary compatible with release PostgreSQL builds for Windows." Certainly that's how I've been doing it to date. How about everyone else here? Does anyone actually build and distribute extensions out of tree at all?

I'm interested in making the Windows installer distributions a bit more extension-dev friendly. In particular, I'd really like to see EDB's Windows installers include the libintl.h for the included libintl, since its omission, combined with Pg being built with ENABLE_NLS, tends to break things horribly. Users can always undefine ENABLE_NLS, but it's an unnecessary roadblock.

Sandeep, can you work on fixing this please? Thanks.

Are there any objections from -hackers to including 3rd-party headers for libs we expose in our public headers in the binary distribution? Other than bundling 3rd-party headers, any ideas/suggestions for how we might make ext building saner on Windows?

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] cleanup in code
On 01/04/2014 07:20 AM, Amit Kapila wrote:

1. Compiling with msvc shows a warning in relcache.c:

e:\workspace\postgresql\master\postgresql\src\backend\utils\cache\relcache.c(3959): warning C4715: 'RelationGetIndexAttrBitmap' : not all control paths return a value

Attached patch remove_msvc_warning.patch removes the above warning.

Hmm, I thought we gave enough hints in the elog macro to tell the compiler that elog(ERROR) does not return, since commit b853eb97182079dcd30b4f52576bd5d6c275ee71. Have we not enabled that for MSVC?

2. It seems option K is not used in pg_dump:

while ((c = getopt_long(argc, argv, "abcCd:E:f:F:h:ij:K:n:N:oOp:RsS:t:T:U:vwWxZ:", long_options, &optindex)) != -1)

I have checked both docs and code but didn't find any use of this option. Am I missing something here? Attached patch remove_redundant_option_K_pgdump.patch removes this option from the code.

Huh. That was added by accident by my commit that added the --dbname option. Removed, thanks.

- Heikki
Re: [HACKERS] cleanup in code
On Mon, Jan 6, 2014 at 11:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

Hmm, I thought we gave enough hints in the elog macro to tell the compiler that elog(ERROR) does not return, since commit b853eb97182079dcd30b4f52576bd5d6c275ee71. Have we not enabled that for MSVC?

I looked at this a while back here: http://www.postgresql.org/message-id/caaphdvqosb4nc3og0xoboj2fma-6akihuwsad43rlekqk6s...@mail.gmail.com

I found that because elevel was being assigned to a variable, the Microsoft compiler could not determine that the "if (elevel_ >= ERROR)" test was constant, and therefore couldn't assume that __assume(0) would be reached.

Regards, David Rowley
Re: [HACKERS] Compiling extensions on Windows
Sure. I'll make the changes so that the next available Windows installers include libintl.h in $Installdir/include. How about the changes with respect to NLS?

On Mon, Jan 6, 2014 at 2:44 PM, Dave Page dp...@pgadmin.org wrote:

On Mon, Jan 6, 2014 at 3:32 AM, Craig Ringer cr...@2ndquadrant.com wrote: [...] In particular, I'd really like to see EDB's Windows installers include the libintl.h for the included libintl, since its omission, combined with Pg being built with ENABLE_NLS, tends to break things horribly. [...]

Sandeep, can you work on fixing this please? Thanks. [...]

-- Sandeep Thakkar
Re: [HACKERS] cleanup in code
On 2014-01-06 23:51:52 +1300, David Rowley wrote:

On Mon, Jan 6, 2014 at 11:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

Hmm, I thought we gave enough hints in the elog macro to tell the compiler that elog(ERROR) does not return, since commit b853eb97182079dcd30b4f52576bd5d6c275ee71. Have we not enabled that for MSVC?

I looked at this a while back here: http://www.postgresql.org/message-id/caaphdvqosb4nc3og0xoboj2fma-6akihuwsad43rlekqk6s...@mail.gmail.com And found that because elevel was being assigned to a variable, the compiler could not determine that the "if (elevel_ >= ERROR)" test was constant, and therefore couldn't assume that __assume(0) would be reached with the Microsoft compiler.

But afair the declaration for elog() works in several other places, so that doesn't sufficiently explain this. I'd very much expect that variable to be completely elided by any halfway competent compiler - it's just there to prevent multiple evaluation should elevel not be a constant. Do you see the warning both with asserts enabled and in non-assert builds?

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Compiling extensions on Windows
On Mon, Jan 6, 2014 at 10:57 AM, Sandeep Thakkar sandeep.thak...@enterprisedb.com wrote:

Sure. I'll make the changes so that the next available Windows installers include libintl.h in $Installdir/include. How about the changes with respect to NLS?

No, there's nothing to change there. Craig was suggesting that users could disable NLS in their extension to avoid issues with the missing header, but that's not a good solution.

[...]

-- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Compiling extensions on Windows
If libintl.h and any headers it in turn includes are bundled, there is no longer an issue with NLS. That was just a workaround for building exts when Pg's headers tried to refer to nonexistent headers when NLS was enabled.

On 6 Jan 2014 18:57, Sandeep Thakkar sandeep.thak...@enterprisedb.com wrote:

Sure. I'll make the changes so that the next available Windows installers include libintl.h in $Installdir/include. How about the changes with respect to NLS? [...]
Re: [HACKERS] dynamic shared memory and locks
On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote:

On 01/05/2014 07:56 PM, Robert Haas wrote: [...] 5. Allocate a fixed number of semaphores, and multiplex them to emulate any number of spinlocks. For example, allocate 100 semaphores, and use the address of the spinlock modulo 100 to calculate which semaphore to use to emulate that spinlock. That assumes that you never hold more than one spinlock at a time, otherwise you can get deadlocks.

I think that assumption holds currently, because acquiring two spinlocks at a time would be bad on performance grounds anyway.

I think that's a pretty dangerous assumption - and one whose breakage just about nobody would notice, because nobody tests --disable-spinlocks builds regularly. We could improve on that by adding cassert checks against nested spinlocks, but that'd be a far too high price for little benefit imo. I am not convinced by the performance argument - there are lots of spinlocks that are rarely if ever contended. If you lock two such locks in a consistent nested fashion there won't be a performance problem.

Greetings, Andres Freund

-- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] PostgreSQL Service on Windows does not start if data directory given is relative path
I have found a case where PostgreSQL as a win32 service does not start if the data directory is given as a relative path. The error observed in this case is: "The PostgreSQL on Local Computer started and then stopped."

This happens because the relative path will be resolved relative to the directory from which the service was registered, but the directory from which the win32 service execution gets invoked may be different, so the service cannot find the data directory.

I have fixed this by internally converting the relative path to an absolute path, as is already done for the executable file. Attached is the patch with the fix. Please provide your opinion. I will add this to the Jan 2014 CommitFest.

Thanks and Regards, Kumar Rajeev Rastogi

pgctl_win32service_rel_dbpath.patch Description: pgctl_win32service_rel_dbpath.patch
Re: [HACKERS] Compiling extensions on Windows
Okay. BTW, I just checked that the Windows 32-bit installer ships libintl.h. Did you try whether it works for you, or does it require more headers? Somehow, the Windows 64-bit installer missed it. I'll fix the build script.

On Mon, Jan 6, 2014 at 4:55 PM, Craig Ringer cr...@2ndquadrant.com wrote:

If libintl.h and any headers it in turn includes are bundled, there is no longer an issue with NLS. That was just a workaround for building exts when Pg's headers tried to refer to nonexistent headers when NLS was enabled. [...]

-- Sandeep Thakkar
[HACKERS] Convert Datum* to char*
Hi, I want to read an attribute value from a TupleTableSlot. When I try to convert an attribute of SQL type varchar from Datum to char* with the help of the macro TextDatumGetCString(...), sometimes there is a segmentation fault. The segmentation fault comes from TextDatumGetCString(...), which is defined in utils/builtins.h. Unfortunately, the fault is not always reproducible. I debugged the code and figured out that the value of result->tts_values[i] sometimes is random. It may be uninitialized memory. In other cases, the variable value is NULL. Then, I can just skip the conversion from Datum to char*, so that there is no segmentation fault. I attached a patch with the code. The relevant line is:

char *value = TextDatumGetCString(result->tts_values[i]);

The SQL query is a simple SELECT * from ... on the TPC-H table customer. About every third execution leads to a segmentation fault. Why is the memory of the variable uninitialized? I am not very familiar with Postgres. Is there another method to get a varchar attribute out of a TupleTableSlot as a string?

Best regards
Maria

diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 76dd62f..be05ad0 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -77,6 +77,7 @@
  */
 #include "postgres.h"
+#include "catalog/pg_attribute.h"
 #include "executor/executor.h"
 #include "executor/nodeAgg.h"
 #include "executor/nodeAppend.h"
@@ -112,6 +113,7 @@
 #include "executor/nodeWindowAgg.h"
 #include "executor/nodeWorktablescan.h"
 #include "miscadmin.h"
+#include "utils/builtins.h"
@@ -509,6 +511,17 @@ ExecProcNode(PlanState *node)
 	if (node->instrument)
 		InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);
+	int i;
+	for (i = 0; i < result->tts_tupleDescriptor->natts; i++) {
+		if (1042 == result->tts_tupleDescriptor->attrs[i]->atttypid) {
+			if (0 == result->tts_values[i]) {
+				printf("null found\n");
+			} else {
+				char *value = TextDatumGetCString(result->tts_values[i]);
+			}
+		}
+	}
+
 	return result;
 }

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Convert Datum* to char*
On 01/06/2014 03:09 PM, Masterprojekt Naumann1 wrote:

I want to read an attribute value from a TupleTableSlot. When I try to convert an attribute of SQL type varchar from Datum to char* with the help of TextDatumGetCString(...), sometimes there is a segmentation fault. [...] The relevant line is: char *value = TextDatumGetCString(result->tts_values[i]); The SQL query is a simple SELECT * from ... on the TPC-H table customer. About every third execution leads to a segmentation fault.

Maybe the field is NULL? By convention, we normally set the Datum to 0 on an SQL NULL, but you're supposed to check tts_isnull first, and ignore tts_values[x] when tts_isnull[x] is true.

- Heikki
Re: [HACKERS] Convert Datum* to char*
Yes, in some cases the Datum is 0, which I test before conversion. Additionally, I looked at tts_isnull, but it does not prevent the segmentation fault in some cases. The problem is that sometimes the value is random, but I don't know why, or how I can detect that case.

2014/1/6 Heikki Linnakangas hlinnakan...@vmware.com:

[...] Maybe the field is NULL? By convention, we normally set the Datum to 0 on an SQL NULL, but you're supposed to check tts_isnull first, and ignore tts_values[x] when tts_isnull[x] is true. - Heikki
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 01/05/2014 09:13 PM, Michael Paquier wrote:

On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote:

I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch.

Yeah, this would be nice to look at; core code definitely needs some more infrastructure for such a test suite. I haven't had time to go back to it since I began this thread though :)

Currently I am cannibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is all possible on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev, is bind mounted and visible to each cluster.

Having the vanilla regression tests run in a cluster with multiple nodes and checking the results on a standby is the tip of the iceberg, though. What I had in mind when I began this thread was to have more than a copy/paste of pg_regress: an infrastructure that people could use to create and customize tests with an additional control layer on the cluster itself.
For example, testing replication is not only a matter of creating and setting up the nodes; you might want to be able to initialize, add, and remove nodes during the tests. A node addition would be either a fresh new master (this would be damn useful for a test suite for logical replication, I think), or a slave node with custom recovery parameters to test replication, as well as PITR, archiving, etc. Then you need to be able to run SQL commands on top of that to check if the results are consistent with what you want.

I'd encourage anyone looking at implementing a testing suite for replication to look at the stuff we did for Slony, at least to get some ideas. We wrote a test driver framework (clustertest - https://github.com/clustertest/clustertest-framework), then some Javascript base classes for common types of operations. An individual test is then written in Javascript and invokes methods either in the framework or the base classes to do most of the interesting work. See http://git.postgresql.org/gitweb/?p=slony1-engine.git;a=blob;f=clustertest/disorder/tests/EmptySet.js;h=7b4850c1d24036067f5a659b990c7f05415ed967;hb=HEAD as an example.

A possible input for a test that users could provide would be something like this:

# Node information for tests
nodes {
    {node1, postgresql.conf params, recovery.conf params}
    {node2, postgresql.conf params, recovery.conf params, slave of node1}
}

# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

The main problem is actually how to do that. Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of the code used to run the tests. On the contrary, having a C program would make the maintenance of the code used to run the tests more difficult (?), in exchange for more readable test suite input like the one I wrote above.
This might also make the test input more readable to the human eye, in the shape of what is already available in src/test/isolation. Another possibility could be to integrate a recovery/backup manager directly in PG core and have some tests for it, or even to include those tests directly with pg_basebackup or an upper layer of it. There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such. The main drawbacks that I don't care for are: 1) 'make check' becomes 'sudo make check' because it needs permission to run chroot. -1 for that; developers should not need to use root to run the regression suite. 2) I have no win32 version of the logic. For a first shot I am not sure that it matters much. The main advantages that I like about this design are: 1) Only one system is required. The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment. An assumption of the test
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 2014-01-06 01:25:57 +, Greg Stark wrote: -- greg On 5 Jan 2014 14:54, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. This sounds pretty cool. The real trick will be in testing concurrent behaviour -- i.e. queries on the slave when it's replaying logs at a certain point. But right now we have nothing, so anything would be an improvement. Abhijit Menon-Sen (CCed) has prototyped an isolationtester version that can connect to multiple nodes. Once we've got automated setup of multiple nodes, pursuing that makes sense again. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. This isn't necessary. You can use the same binaries and run initdb with a different location just fine. Then start up the database with -D to specify the directory. Very emphatically seconded. It should absolutely not be necessary to use different chroots. Pretty much the only case that will require that is tablespaces, unless you do some pretty ugly hackery... In almost all scenarios you'll have to change either unix_socket_directory or port (recommended) in addition to the datadir - but that's not a problem. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Convert Datum* to char*
On 01/06/2014 09:09 PM, Masterprojekt Naumann1 wrote: Why is the memory of the variable uninitialized? Are there any other patches you've made to the running PostgreSQL instance? I'd want to run under Valgrind and see what turned up. This might be a bit tricky with an intermittent fault during something like a TPC-H run, though. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-02 16:05:09 -0500, Robert Haas wrote: On Thu, Jan 2, 2014 at 3:19 PM, Andres Freund and...@2ndquadrant.com wrote: I was wondering if we could somehow arrange to not release the subtransaction's AccessShareLock on the table, as long as it was protecting toasted references someplace. Sounds fairly ugly... I think the only principled fixes are to either retain the lock or forcibly detoast before releasing it. I don't think that's sufficient. Unless I miss something the problem isn't restricted to TRUNCATE and such at all. I think a plain VACUUM should be sufficient? I haven't tested it, but INSERT RETURNING toasted_col a row, storing the result in a record, and then aborting the subtransaction will allow the inserted row to be VACUUMed by a concurrent transaction. Hmm, that's actually nastier than the case Rushabh originally reported. A somewhat plausible response to "my holdable cursor didn't work after I truncated the table it read from" is "well, don't do that then." But this case could actually happen to someone who wasn't trying to do anything screwy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 2014-01-06 09:10:48 -0500, Robert Haas wrote: On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund and...@2ndquadrant.com wrote: I think the only principled fixes are to either retain the lock or forcibly detoast before releasing it. I don't think that's sufficient. Unless I miss something the problem isn't restricted to TRUNCATE and such at all. I think a plain VACUUM should be sufficient? I haven't tested it, but INSERT RETURNING toasted_col a row, storing the result in a record, and then aborting the subtransaction will allow the inserted row to be VACUUMed by a concurrent transaction. Hmm, that's actually nastier than the case Rushabh originally reported. A bit, yes. Somebody should probably verify that it can actually happen :P A somewhat plausible response to "my holdable cursor didn't work after I truncated the table it read from" is "well, don't do that then." But this case could actually happen to someone who wasn't trying to do anything screwy. Personally I think everything that involves using data computed in an aborted subtransaction, other than the error code, is screwy. I think plpgsql has been far too lenient in allowing that in an unconstrained fashion. I actually vote for not allowing it at all, by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effect on variables created in them. And we just don't have much support for doing anything in that direction safely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Sun, Dec 22, 2013 at 10:02 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao masao.fu...@gmail.com wrote: I found a bug in the ALTER SYSTEM SET patch. The procedure to reproduce it is here. $ psql =# ALTER SYSTEM SET shared_buffers = '512MB'; ALTER SYSTEM =# \q $ pg_ctl -D data reload server signaled 2013-12-22 18:24:13 JST LOG: received SIGHUP, reloading configuration files 2013-12-22 18:24:13 JST LOG: parameter shared_buffers cannot be changed without restarting the server 2013-12-22 18:24:13 JST LOG: configuration file X?? contains errors; unaffected changes were applied The configuration file name in the last log message is broken. This problem was introduced by the ALTER SYSTEM SET patch. FreeConfigVariables(head); snip else if (apply) ereport(elevel, (errcode(ERRCODE_CONFIG_FILE_ERROR), errmsg("configuration file \"%s\" contains errors; unaffected changes were applied", ErrorConfFile))); The cause of the problem is that, in ProcessConfigFile(), the log message including 'ErrorConfFile' is emitted after 'head' is freed, even though 'ErrorConfFile' points to one entry of 'head'. Your analysis is absolutely right. The reason for this issue is that we want to display the filename to which the erroneous parameter belongs, and in this case we are freeing the parameter info before the actual error. To fix, we need to save the filename value before freeing the parameter list. Please find the attached patch to fix this issue. Note - In my m/c, I could not reproduce the issue. I think this is not surprising because we are not setting freed memory to NULL, so it can display anything (sometimes the right value as well). Couldn't we also handle this by postponing FreeConfigVariables until after the if (error) block? Also, what's the point of this: + snprintf(ConfigAutoFileName, sizeof(ConfigAutoFileName), "%s", PG_AUTOCONF_FILENAME); Why copy PG_AUTOCONF_FILENAME into another buffer?
Why not just pass PG_AUTOCONF_FILENAME directly to ParseConfigFile? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Convert Datum* to char*
2014/1/6 Craig Ringer cr...@2ndquadrant.com On 01/06/2014 09:09 PM, Masterprojekt Naumann1 wrote: Why is the memory of the variable uninitialized? Are there any other patches you've made to the running PostgreSQL instance? I'd want to run under Valgrind and see what turned up. This might be a bit tricky with an intermittent fault during something like a TPC-H run, though. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services I am on the latest commit of the master branch of the GitHub repository (commit 10a82cda67731941c18256e009edad4a784a2994) and I only applied the attached patch. I hope that you can reproduce the fault. Thanks, Maria
Re: [HACKERS] Convert Datum* to char*
On Mon, Jan 6, 2014 at 8:09 AM, Masterprojekt Naumann1 mpws201...@gmail.com wrote: Why is the memory of the variable uninitialized? Are you checking that i < slot->tts_nvalid before accessing the tts_values and tts_isnull arrays?
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 2014-01-06 09:43:45 -0500, Robert Haas wrote: I actually vote for not allowing it at all, by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effect on variables created in them. And we just don't have much support for doing anything in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right. Well, no. If they have to use that function explicitly *before* the subxact aborted, we can copy/detoast the value out of that context safely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] dynamic shared memory and locks
* Robert Haas (robertmh...@gmail.com) wrote: Another idea is to include some identifying information in the lwlock. That was my immediate reaction to this issue... For example, each lwlock could have a char *name in it, and we could print the name. In theory, this could be a big step forward in terms of usability, because it'd spare us all needing to remember that 4 == ProcArrayLock. But it's awkward for buffer locks, of which there might be a great many, and we surely don't want to allocate a dynamically-generated string in shared memory for each one. You could do a bit better by making the identifying information a string plus an integer, because then all the buffer locks could set the string to a static constant like "buffer content lock" and the integer to the buffer number, and similarly for lock manager partition locks and so on. This is appealing, but would increase the size of LWLockPadded from 16 bytes to 32 on 64-bit platforms where slock_t is four bytes or less, which I'm not that excited about even though it would reduce cache line contention in some cases. I'm not thrilled with this either but at the same time, it may well be worth it. Yet a third idea is to try to reverse-engineer a name for a given lwlock from the pointer address. If it's an offset into the main array, this is easy enough to do, and even if we ended up with several arrays (like one for bufmgr locks) it wouldn't be too hard to write code to figure out which array contains it and emit the appropriate string. The only real problem that I see with this is that it might cause a performance hit. A performance hit when running with trace_lwlocks or LWLOCK_STATS is not really a problem, but people won't like it if --enable-dtrace slows things down. This seems like an interesting idea - but this and my other thought (having a defined array of strings) seem like they might fall over in the face of DSMs. Preferences, other ideas?
My preference would generally be to use more memory rather than CPU time; so I'd vote for adding identifying info rather than having the code hunt through arrays to try and find it. None of these ideas are a complete solution for LWLOCK_STATS. In the other three cases noted above, we only need an identifier for the lock instantaneously, so that we can pass it off to the logger or dtrace or whatever. But LWLOCK_STATS wants to hold on to data about the locks that were visited until the end of the session, and it does that using an array that is *indexed* by lwlockid. I guess we could replace that with a hash table. Ugh. Any suggestions? Yeah, that's not fun. No good suggestions here offhand. Thanks, Stephen
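The "string plus an integer" idea from Robert's mail can be sketched as follows; the struct layout and names are hypothetical, not the actual lwlock.h definitions:

```c
/* Sketch of the "string plus an integer" identification scheme discussed
 * above.  The struct layout and names are hypothetical, not lwlock.h. */
#include <stdio.h>
#include <string.h>

typedef struct LWLockIdent
{
    const char *name;   /* static constant, e.g. "buffer content lock" */
    int         num;    /* instance number (buffer number etc.), -1 if none */
} LWLockIdent;

typedef struct LWLock
{
    int         state;  /* stand-in for the real lock word */
    LWLockIdent ident;
} LWLock;

/* Format an identifying string for trace_lwlocks-style output, so a log
 * line can say "buffer content lock 4638" instead of a bare numeric id. */
void
lwlock_describe(const LWLock *lock, char *buf, size_t buflen)
{
    if (lock->ident.num >= 0)
        snprintf(buf, buflen, "%s %d", lock->ident.name, lock->ident.num);
    else
        snprintf(buf, buflen, "%s", lock->ident.name);
}
```

The point of the static string is that the many buffer locks all share one constant, so no dynamically allocated strings live in shared memory; the cost is exactly the pointer-plus-int growth of the padded struct that Robert mentions.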
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On Mon, Jan 6, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 09:10:48 -0500, Robert Haas wrote: On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund and...@2ndquadrant.com wrote: I think the only principled fixes are to either retain the lock or forcibly detoast before releasing it. I don't think that's sufficient. Unless I miss something the problem isn't restricted to TRUNCATE and such at all. I think a plain VACUUM should be sufficient? I haven't tested it, but INSERT RETURNING toasted_col a row, storing the result in a record, and then aborting the subtransaction will allow the inserted row to be VACUUMed by a concurrent transaction. Hmm, that's actually nastier than the case Rushabh originally reported. A bit, yes. Somebody should probably verify that it can actually happen :P A somewhat plausible response to "my holdable cursor didn't work after I truncated the table it read from" is "well, don't do that then." But this case could actually happen to someone who wasn't trying to do anything screwy. Personally I think everything that involves using data computed in an aborted subtransaction, other than the error code, is screwy. I think plpgsql has been far too lenient in allowing that in an unconstrained fashion. I actually vote for not allowing it at all, by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effect on variables created in them. And we just don't have much support for doing anything in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Mon, Jan 6, 2014 at 7:58 PM, Robert Haas robertmh...@gmail.com wrote: On Sun, Dec 22, 2013 at 10:02 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao masao.fu...@gmail.com wrote: I found a bug in the ALTER SYSTEM SET patch. The procedure to reproduce it is here. FreeConfigVariables(head); snip else if (apply) ereport(elevel, (errcode(ERRCODE_CONFIG_FILE_ERROR), errmsg("configuration file \"%s\" contains errors; unaffected changes were applied", ErrorConfFile))); The cause of the problem is that, in ProcessConfigFile(), the log message including 'ErrorConfFile' is emitted after 'head' is freed, even though 'ErrorConfFile' points to one entry of 'head'. Your analysis is absolutely right. The reason for this issue is that we want to display the filename to which the erroneous parameter belongs, and in this case we are freeing the parameter info before the actual error. To fix, we need to save the filename value before freeing the parameter list. Please find the attached patch to fix this issue. Couldn't we also handle this by postponing FreeConfigVariables until after the if (error) block? Wouldn't doing it that way lead to a bigger memory leak if the error level is ERROR? The current fix can also leak memory, but only for ErrorConfFile_save. I think a similar case can currently happen for 'pre_value' in the code as well; that's why I have fixed it in a similar way in the patch. Also, what's the point of this: + snprintf(ConfigAutoFileName, sizeof(ConfigAutoFileName), "%s", PG_AUTOCONF_FILENAME); Why copy PG_AUTOCONF_FILENAME into another buffer? Why not just pass PG_AUTOCONF_FILENAME directly to ParseConfigFile? Initially, I think we were doing a concat of the folder and file name for the auto file; that's why the code was written that way, but now it can be changed to the way you are suggesting. With Regards, Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] cleanup in code
Andres Freund and...@2ndquadrant.com writes: On Mon, Jan 6, 2014 at 11:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm, I thought we gave enough hints in the elog macro to tell the compiler that elog(ERROR) does not return, since commit b853eb97182079dcd30b4f52576bd5d6c275ee71. But afair the declaration for elog() works in several other places, so that doesn't sufficiently explain this. I'd very much expect that that variable is completely elided by any halfway competent compiler - it's just there to prevent multiple evaluation should elevel not be a constant. At -O0 (or local equivalent), it would not surprise me at all that compilers wouldn't recognize elog(ERROR) as not returning. I don't think there's ever been any expectation that that marking would be bulletproof. We added it to permit better code generation, not to silence reachability warnings. regards, tom lane
[HACKERS] generic pseudotype IO functions?
Hi, Does anybody have an opinion about introducing generic pseudotype IO functions? Pseudotype.c/pg_proc.h are slowly growing a number of pretty useless/redundant copy-pasted functions... Most for cases that are pretty damn unlikely to be hit by users not knowing what they do. What about adding a pseudotype_in/out that just error out with a generic message? We could start trying to guess the type they are operating on using get_fn_expr_rettype/get_fn_expr_argtype, but that'd require modifying several callers not providing that info to work reliably? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] SQL assertions prototype
Andres Freund and...@2ndquadrant.com wrote: Peter Eisentraut pete...@gmx.net schrieb: On 12/18/13, 2:22 PM, Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert; that doesn't seem too bad. Anything covered by an assert shouldn't be modified frequently, otherwise you'll run into major performance problems. I think that makes sense. If you want to use assertions, you need to run in serializable mode, otherwise you get an error if you modify anything covered by an assertion. In the future, someone could enhance this for other isolation levels, but as Josh has pointed out, that would likely just be reimplementing SSI with big locks. SSI only actually works correctly if all transactions use SSI... I am not sure if we can guarantee that the subset we'd require would be safe without the read side using SSI. You could definitely see a state which would not be consistent with getting to some later state under procedural business rules; however, I don't think any connection could ever see a state which violated the constraint as of the moment it was viewed. For examples of essentially enforcing multi-table constraints using triggers and SSI see this section: http://wiki.postgresql.org/wiki/SSI#Enforcing_Business_Rules_in_Triggers For an example of how things can look OK in terms of enforced constraints as of different moments in time, yet those moments in time could be inconsistent, see this section: http://wiki.postgresql.org/wiki/SSI#Enforcing_Business_Rules_in_Triggers SSI gives you a guarantee that with any set of concurrently running transactions, the effect is the same as some serial (one-at-a-time) execution of those transactions; but it says little about the mix of serializable and non-serializable transactions.
Non-serializable transactions will, after the last of those serializable transactions has committed or rolled back, see a state which is consistent with some serial execution of those serializable transactions which committed, but it will not necessarily be consistent with them having run in any *particular* order. NOTE: the state might be consistent with some order other than commit order. This means that a non-serializable transaction running in the midst of those serializable transaction commits might see the work of some transaction which will appear to all serializable transactions as having been run *later* while not yet seeing the work of a transaction which will appear to all serializable transactions to have run *earlier*. I'm pretty sure that this means that an invariant, if it is an expression which must always hold for any view of the database, can be enforced by requiring modifying transactions to be serializable. What it doesn't guarantee is that business rules about *transitions* can be enforced without requiring all *transactions* to be serializable. In the Deposit Report example, note that a non-serializable transaction would never be able to see a receipt with a deposit number that was not open; but it *would* be able to see a closed batch header with a set of receipts which was not yet complete. So I think the answer is that the suggested approach is sufficient for enforcing assertions about static database state. If you want to make sure that nobody sees a state for which a given expression is false, it is sufficient. Just don't overestimate what that means. You can't ensure that a non-serializable transaction won't see a state which is inconsistent with a later database state according to *procedural* business rules. 
-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] cleanup in code
On 2014-01-06 09:54:15 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On Mon, Jan 6, 2014 at 11:38 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm, I thought we gave enough hints in the elog macro to tell the compiler that elog(ERROR) does not return, since commit b853eb97182079dcd30b4f52576bd5d6c275ee71. But afair the declaration for elog() works in several other places, so that doesn't sufficiently explain this. I'd very much expect that that variable is completely elided by any halfway competent compiler - it's just there to prevent multiple evaluation should elevel not be a constant. At -O0 (or local equivalent), it would not surprise me at all that compilers wouldn't recognize elog(ERROR) as not returning. You have a point, but I don't think that any of the compilers we try to keep clean have such behaviour in the common case - at least most versions of gcc certainly recognize such on -O0, and I am pretty sure that 52906f175a05a4e7e5e4a0e6021c32b1bfb221cf fixed some warnings for mvcc, at least in some versions and some configurations. So I am wondering if there's a special reason it doesn't recognize this individual case, as it does seem to work in others - defining pg_unreachable() to be empty generates about a dozen warnings here. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
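For readers following along, the kind of hinting being discussed works roughly like this. MY_NORETURN and my_unreachable() are illustrative stand-ins, not PostgreSQL's actual elog()/pg_unreachable() macros:

```c
/* Generic illustration of noreturn/unreachable hinting.  MY_NORETURN and
 * my_unreachable() are stand-ins, not PostgreSQL's elog()/pg_unreachable(). */
#include <stdio.h>
#include <stdlib.h>

#if defined(__GNUC__)
#define MY_NORETURN __attribute__((noreturn))
#define my_unreachable() __builtin_unreachable()
#else
#define MY_NORETURN
#define my_unreachable() abort()
#endif

/* Marked noreturn: the compiler may assume control never comes back. */
MY_NORETURN void
report_error(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    exit(1);
}

/* Without the noreturn marking (or the trailing unreachable hint), a
 * compiler that doesn't propagate the attribute - e.g. at -O0 - may warn
 * that control can fall off the end of this non-void function. */
int
checked_div(int a, int b)
{
    if (b != 0)
        return a / b;
    report_error("division by zero");
    my_unreachable();           /* never reached; silences the warning */
}
```

This is also why defining the unreachable hint to be empty, as Andres tried, brings the warnings back: the compiler can no longer prove the error path doesn't fall through.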
Re: [HACKERS] [PATCH] SQL assertions prototype
Kevin Grittner kgri...@ymail.com wrote: For an example of how things can look OK in terms of enforced constraints as of different moments in time, yet those moments in time could be inconsistent, see this section: http://wiki.postgresql.org/wiki/SSI#Enforcing_Business_Rules_in_Triggers Copy/paste error. I meant this link: http://wiki.postgresql.org/wiki/SSI#Read_Only_Transactions -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] dynamic shared memory and locks
Andres Freund and...@2ndquadrant.com writes: On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote: That assumes that you never hold more than one spinlock at a time, otherwise you can get deadlocks. I think that assumption holds currently, because acquiring two spinlocks at a time would be bad on performance grounds anyway. I think that's a pretty dangerous assumption. I think it's a fine assumption. Code that could possibly do that should never get within hailing distance of being committed, because you are only supposed to have short straight-line bits of code under a spinlock. Taking another spinlock breaks both the straight-line code and the no-loops aspects of that coding rule. regards, tom lane
Re: [HACKERS] generic pseudotype IO functions?
On 2014-01-06 10:29:06 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Does anybody have an opinion about introducing generic pseudotype IO functions? Yes: -1. Ok, fine with me. Pseudotype.c/pg_proc.h are slowly growing a number of pretty useless/redundant copy-pasted functions... Most for cases that are pretty damn unlikely to be hit by users not knowing what they do. That's hardly the largest cost associated with inventing a new pseudotype. Nor are there lots of new pseudotypes coming down the pike, anyway. Robert suggested modelling the lookup of changeset extraction output callbacks after the FDW's FdwRoutine; that's why I am wondering about it. I noticed while reviewing that I had so far borrowed the FDW's C routines, which didn't seem like such a nice thing to do... What about adding a pseudotype_in/out that just error out with a generic message? This will break some of the function sanity checks in opr_sanity, I believe. Yeah, we could lobotomize that, but I don't see any benefit. Yes. But there's precedent in refcursor using text's routines... (it's in type_sanity, but whatever) Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Convert Datum* to char*
2014/1/6 Thomas Fanghaenel tfanghae...@salesforce.com On Mon, Jan 6, 2014 at 8:09 AM, Masterprojekt Naumann1 mpws201...@gmail.com wrote: Why is the memory of the variable uninitialized? Are you checking that i < slot->tts_nvalid before accessing the tts_values and tts_isnull arrays? Thanks for your ideas! I could fix the segmentation fault. I have to check both slot->tts_isnull[i] and 0 == slot->tts_values[i]. If both are false, I can convert the value with the method TextDatumGetCString(...). Nevertheless, slot->tts_nvalid is always 0. I hope that there is no other problem.
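A minimal sketch of the guard Thomas suggests, using a simplified hypothetical stand-in for TupleTableSlot (the real executor structs differ):

```c
/* Simplified stand-in for TupleTableSlot: only the first tts_nvalid
 * entries of the value/null arrays are populated; anything beyond that
 * is uninitialized memory.  Names are illustrative, not the real structs. */
#include <stdbool.h>
#include <stdint.h>

#define MAX_ATTRS 8

typedef struct Slot
{
    int      nvalid;            /* number of leading populated entries */
    bool     isnull[MAX_ATTRS];
    intptr_t values[MAX_ATTRS]; /* stand-in for Datum */
} Slot;

/* Fetch attribute i only if it is populated and non-null; reading past
 * nvalid is exactly the uninitialized-memory bug discussed above. */
bool
slot_get_attr(const Slot *slot, int i, intptr_t *out)
{
    if (i < 0 || i >= slot->nvalid)
        return false;           /* not deformed this far: don't touch it */
    if (slot->isnull[i])
        return false;
    *out = slot->values[i];
    return true;
}
```

Note that a tts_nvalid of 0, as Maria reports, means no attributes have been deformed into the arrays at all, so every access should be rejected until the slot's attributes are fetched.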
Re: [HACKERS] generic pseudotype IO functions?
Andres Freund and...@2ndquadrant.com writes: Does anybody have an opinion about introducing generic pseudotype IO functions? Yes: -1. Pseudotype.c/pg_proc.h are slowly growing a number of pretty useless/redundant copy-pasted functions... Most for cases that are pretty damn unlikely to be hit by users not knowing what they do. That's hardly the largest cost associated with inventing a new pseudotype. Nor are there lots of new pseudotypes coming down the pike, anyway. What about adding a pseudotype_in/out that just error out with a generic message? This will break some of the function sanity checks in opr_sanity, I believe. Yeah, we could lobotomize that, but I don't see any benefit. regards, tom lane
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 09:43:45 -0500, Robert Haas wrote: I actually vote for not allowing it at all, by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effect on variables created in them. And we just don't have much support for doing anything in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right. Well, no. If they have to use that function explicitly *before* the subxact aborted, we can copy/detoast the value out of that context safely. Oh, I see. I think that's pretty icky. Users won't expect (and will complain about) such restrictions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] generic pseudotype IO functions?
On 1/6/14, 10:29 AM, Tom Lane wrote: Pseudotype.c/pg_proc.h are slowly growing a number of pretty useless/redundant copy-pasted functions... Most are for cases that are pretty damn unlikely to be hit by users not knowing what they do. That's hardly the largest cost associated with inventing a new pseudotype. Nor are there lots of new pseudotypes coming down the pike, anyway. If someone wants to do the work, what's the harm in reducing some code redundancy? What about adding a pseudotype_in/out that just errors out with a generic message? This will break some of the function sanity checks in opr_sanity, Then the tests can be changed.
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: I guess the question boils down to: why are we keeping --disable-spinlocks around? If we're expecting that people might really use it for serious work, then it needs to remain and it needs to work with dynamic shared memory. If we're expecting that people might use it while porting PostgreSQL to a new platform but only as an aid to get things up and running, then it needs to remain, but it's probably OK for dynamic shared memory not to work when this option is in use. If nobody uses it at all, then we can just forget the whole thing. I think we can eliminate the first of those. Semaphores for spinlocks were a performance disaster fifteen years ago, and the situation has surely only gotten worse since then. I do, however, believe that --disable-spinlocks has some use while porting to a new platform. (And I don't believe the argument advanced elsewhere in this thread that everybody uses gcc; much less that gcc's atomic intrinsics are of uniformly high quality even on oddball architectures.) Heikki's idea has some attraction for porting work whether or not you believe that DSM needs to work during initial porting. By default, PG will try to create upwards of ten thousand spinlocks just for buffer headers. It's likely that that will fail unless you whack around the kernel's SysV parameters. Being able to constrain the number of semaphores to something sane would probably be useful. Having said all that, I'm not personally going to take the time to implement it, and I don't think the DSM patch should be required to either. Somebody who actually needs it can go make it work. But I think Heikki's idea points the way forward, so I'm now against having the DSM code reject --disable-spinlocks. I think benign neglect is the way for now. regards, tom lane
Re: [HACKERS] generic pseudotype IO functions?
Peter Eisentraut pete...@gmx.net writes: On 1/6/14, 10:29 AM, Tom Lane wrote: This will break some of the function sanity checks in opr_sanity, Then the tests can be changed. That will weaken their ability to detect actual mistakes, no? If there were a large benefit to merging the pseudotype I/O functions, I'd think this would be acceptable; but merging them seems of mighty marginal value. regards, tom lane
Re: [HACKERS] generic pseudotype IO functions?
On 2014-01-06 11:28:29 -0500, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: On 1/6/14, 10:29 AM, Tom Lane wrote: This will break some of the function sanity checks in opr_sanity, Then the tests can be changed. That will weaken their ability to detect actual mistakes, no? FWIW, I am perfectly fine with duplicating the functions for now - I just thought that that might not be the best way but I didn't (and still don't) have a strong opinion. That's why I didn't supply a patch ;) If there were a large benefit to merging the pseudotype I/O functions, I'd think this would be acceptable; but merging them seems of mighty marginal value. I think I am less concerned about pseudotypes.c than about bloating pg_proc.h even further and about the annoyance of editing it - but I guess that should rather be fixed by storing it in a more sensible format at some point... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Compiling extensions on Windows
On 01/05/2014 07:32 PM, Craig Ringer wrote: Out of personal interest (in pain and suffering) I was recently looking into how to compile extensions out-of-tree on Windows using Visual Studio (i.e. no PGXS). It looks like the conventional answer to this is "Do a source build of PG, compile your ext in-tree in contrib/, and hope the result is binary compatible with release PostgreSQL builds for Windows." Certainly that's how I've been doing it to date. Yes, this pretty much exactly describes how I build PL/R for Windows. I had to match my build system SDK with the one EDB uses to get a compatible binary. It would be nice if we had something equivalent to PGXS on Windows, or maybe even a community build system where authors could get Windows binaries built. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 2014-01-06 11:08:41 -0500, Robert Haas wrote: On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 09:43:45 -0500, Robert Haas wrote: I actually vote for not allowing doing so at all by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effects on variables created in them. And we just don't have much support for doing anything in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right. Well, no. If they have to use that function explicitly *before* the subxact aborted, we can copy/detoast the value out of that context safely. Oh, I see. I think that's pretty icky. Users won't expect (and will complain about) such restrictions. Yea. But at least it would fail reliably instead of just under concurrency and other strange circumstances - and there'd be a safe way out. Currently there seem to be all sorts of odd behaviour possible. I simply don't have a better idea :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] generic pseudotype IO functions?
Andres Freund and...@2ndquadrant.com writes: I think I am less concerned about pseudotypes.c than about bloating pg_proc.h even further and about the annoyance of editing it - but I guess that should rather be fixed by storing it in a more sensible format at some point... Yeah, getting rid of a dozen pseudotype I/O functions is hardly going to reduce the PITA factor of editing pg_proc.h. It's interesting to think about moving all those DATA() macros into some more-maintainable format --- I'm not sure what it should be exactly, but I think something that can insert plausible defaults for omitted columns would be a big help for pg_proc and maybe some of the other catalogs too. regards, tom lane
Re: [HACKERS] dynamic shared memory and locks
On 2014-01-06 09:59:49 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote: That assumes that you never hold more than one spinlock at a time, otherwise you can get deadlocks. I think that assumption holds currently, because acquiring two spinlocks at a time would be bad on performance grounds anyway. I think that's a pretty dangerous assumption. I think it's a fine assumption. Code that could possibly do that should never get within hailing distance of being committed, because you are only supposed to have short straight-line bits of code under a spinlock. Taking another spinlock breaks both the straight-line code and the no-loops aspects of that coding rule. I think it's fair to say that no such code should be accepted - but I personally don't trust the review process to prevent such cases, and not all spinlock usages are obvious (e.g. LockBufferHdr). And having rules that only cause breakage in configurations that nobody ever uses doesn't inspire much trust in that configuration. If we go that way, we should add an Assert preventing recursive spinlock acquisition... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Convert Datum* to char*
Masterprojekt Naumann1 mpws201...@gmail.com writes: Nevertheless, slot->tts_nvalid is always 0. I hope that there is no other problem. You should not be touching the tts_values/tts_isnull arrays without having first called slot_getsomeattrs or slot_getallattrs. See comments in src/include/executor/tuptable.h for some documentation. regards, tom lane
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage. I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other. So (where '-' means 'is bind mounted to'):

/master/bin - /bin
/master/dev - /dev
/master/etc - /etc
/master/lib - /lib
/master/usr - /usr
/master/data
/master/tablespace
/hotstandby/bin - /bin
/hotstandby/dev - /dev
/hotstandby/etc - /etc
/hotstandby/lib - /lib
/hotstandby/usr - /usr
/hotstandby/data
/hotstandby/tablespace

So from inside the master chroot, you see the system's /bin as /bin, the system's /dev as /dev, etc., but what you see as /data and /tablespace are your own private ones. Likewise from the hotstandby chroot. But since the binaries are in something like /home/myuser/postgresql/src/test/regress/tmp_check/install/usr/local/pgsql/bin, each cluster uses the same binaries, referred to by the same path. On Sunday, January 5, 2014 5:25 PM, Greg Stark st...@mit.edu wrote: -- greg On 5 Jan 2014 14:54, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. This sounds pretty cool. The real trick will be in testing concurrent behaviour -- i.e. queries on the slave when it's replaying logs at a certain point. But right now we have nothing, so anything would be an improvement.
This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. This isn't necessary. You can use the same binaries and run initdb with a different location just fine. Then start up the database with -D to specify the directory.
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 9:59 AM, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-01-06 10:35:59 +0200, Heikki Linnakangas wrote: That assumes that you never hold more than one spinlock at a time, otherwise you can get deadlocks. I think that assumption holds currently, because acquiring two spinlocks at a time would be bad on performance grounds anyway. I think that's a pretty dangerous assumption. I think it's a fine assumption. Code that could possibly do that should never get within hailing distance of being committed, because you are only supposed to have short straight-line bits of code under a spinlock. Taking another spinlock breaks both the straight-line code and the no-loops aspects of that coding rule. OK, so we could do this. But should we? If we're going to get rid of --disable-spinlocks for other reasons, then there's not much point in spending the time to make it work with dynamic shared memory segments that contain locks. In fact, even if we *aren't* going to get rid of it, it's not 100% clear to me that there's much point in supporting that intersection: a big point of dsm is to enable parallelism, and the point of parallelism is to be fast, and if you want things to be fast, you should probably have working spinlocks. Of course, there are some other applications for dynamic shared memory as well, and in a quick test on my laptop, the performance of --disable-spinlocks wasn't horrible, so this argument isn't watertight. I guess the question boils down to: why are we keeping --disable-spinlocks around? If we're expecting that people might really use it for serious work, then it needs to remain and it needs to work with dynamic shared memory.
If we're expecting that people might use it while porting PostgreSQL to a new platform but only as an aid to get things up and running, then it needs to remain, but it's probably OK for dynamic shared memory not to work when this option is in use. If nobody uses it at all, then we can just forget the whole thing. I guess my bottom line here is that if --disable-spinlocks is important to somebody, then I'm willing to expend some effort on keeping it working. But if it's just inertia, then I'd rather not spend a lot of time on it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On Mon, Jan 6, 2014 at 11:47 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 11:08:41 -0500, Robert Haas wrote: On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 09:43:45 -0500, Robert Haas wrote: I actually vote for not allowing doing so at all by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effects on variables created in them. And we just don't have much support for doing anything in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right. Well, no. If they have to use that function explicitly *before* the subxact aborted, we can copy/detoast the value out of that context safely. Oh, I see. I think that's pretty icky. Users won't expect (and will complain about) such restrictions. Yea. But at least it would fail reliably instead of just under concurrency and other strange circumstances - and there'd be a safe way out. Currently there seem to be all sorts of odd behaviour possible. I simply don't have a better idea :( Is "forcibly detoast everything" a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add CREATE support to event triggers
On Fri, Jan 3, 2014 at 9:05 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: The other thing that bothers me here is that, while a normalized command string sounds great in theory, as soon as you want to allow (for example) mapping schema A on node 1 to schema B on node 2, the wheels come off: you'll have to deparse that normalized command string so you can change out the schema name and then reassemble it back into a command string again. So we're going to parse the user input, then deparse it, hand over the results to the application code, which will then parse it, modify that, and deparse it again. I have considered several ideas on this front, but most of them turn out to be useless or too cumbersome to use. What seems most adequate is to build a command string containing certain patterns, and an array of replacement values for such patterns; each pattern corresponds to one element that somebody might want modified in the command. As a trivial example, a command such as CREATE TABLE foo (bar INTEGER); would return a string like CREATE TABLE ${table_schema}.${table_name} (bar INTEGER); and the replacement array would be {table_schema = public, table_name = foo} If we additionally provide a function to expand the replacements in the string, we would have the base funcionality of a normalized command string. If somebody wants to move the table to some other schema, they can simply modify the array to suit their taste, and again expand using the provided function; this doesn't require parsing SQL. It's likely that there are lots of fine details that need exploring before this is a fully workable idea -- I have just started work on it, so please bear with me. I think this is basically what you call a JSON blob. I think this direction has some potential. I'm not sure it's right in detail. 
The exact scheme you propose above won't work if you want to leave out the schema name altogether, and more generally it's not going to help very much with anything other than substituting in identifiers. What if you want to add a column called satellite_id to every table that gets created, for example? What if you want to make the tables UNLOGGED? I don't see how that kind of thing is going to work at all cleanly. What I can imagine that might work is that you get a JSON blob for a create table statement and then you have a function make_a_create_table_statement(json) that will turn it back into SQL. You can pass a modified blob (adding, removing, or changing the schema_name or any other element) instead and it will do its thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
I agree that merely setting up masters and slaves is the tip of the iceberg. It seems to be what needs to be tackled first, though, because until we have a common framework, we cannot all contribute tests to it. I imagine setting up a whole hierarchy of master, hotstandbys, warmstandbys, etc., and having, over the course of the test, base backups made, new clusters spun up from those backups, masters stopped and standbys promoted to master, etc. But I also imagine there needs to be SQL run on the master that changes the data, so that replication of those changes can be confirmed. There are lots of ways to change data, such as through the large object interface. The current 'make check' test suite exercises all those code paths. If we incorporate them into our replication testing suite, then we get the advantage of knowing that all those paths are being tested in our suite as well. And if some new interface, call it huge object, ever gets made, then there should be a hugeobject.sql in src/test/regress/sql, and we automatically get that in our replication tests. mark On Sunday, January 5, 2014 6:13 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. Yeah, this would be nice to look at, core code definitely needs to have some more infrastructure for such a test suite. I didn't get the time to go back to it since I began this thread though :) Currently I am cannibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever.
The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev, are all bind mounted and visible to each cluster. Having vanilla regressions run in a cluster with multiple nodes and checking the results on a standby is the tip of the iceberg though. What I had in mind when I began this thread was to have more than a copy/paste of pg_regress, but an infrastructure that people could use to create and customize tests by having an additional control layer on the cluster itself. For example, testing replication is not only a matter of creating and setting up the nodes, but you might want to be able to initialize, add, remove nodes during the tests. Node addition would be either a new fresh master (this would be damn useful for a test suite for logical replication I think), or a slave node with custom recovery parameters to test replication, as well as PITR, archiving, etc. Then you need to be able to run SQL commands on top of that to check if the results are consistent with what you want. A possible input for a test that users could provide would be something like this:

# Node information for tests
nodes {
  {node1, postgresql.conf params, recovery.conf params}
  {node2, postgresql.conf params, recovery.conf params, slave of node1}
}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

The main problem is actually how to do that.
Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of code used to run the tests. On the contrary, having a C program would make the maintenance of code to run the tests more difficult (?) for a trade with more readable test suite input like the one I wrote above. This might also make the test input more readable for a human eye, in the shape of what is already available in src/test/isolation. Another possibility could be also to integrate directly a recovery/backup manager in PG core, and have some tests for it, or even include those tests directly with pg_basebackup or an upper layer of it. There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such. The main drawbacks that I don't care for are: 1) 'make check' becomes 'sudo make check' because it needs permission to run chroot. -1 for that; developers should not need to use root to run the regression suite. 2) I have no win32 version of
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On 2014-01-06 09:12:03 -0800, Mark Dilger wrote: The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage. I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other. I don't think the benefit of being able to test tablespaces without restarts comes even close to offsetting the cost of requiring sudo permissions and introducing OS dependencies. E.g. there's pretty much no hope of making this work sensibly on windows. So I'd just leave out that part. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
Robert Haas robertmh...@gmail.com writes: Is "forcibly detoast everything" a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is. It's certainly possible to think of scenarios under which it'd be painful, eg, you fetch all columns into a record but you never actually use the toasted one(s). OTOH, I can think of cases where forced detoasting might save cycles too, if it prevents multiple detoastings on later accesses. Probably what we ought to do is put together a trial patch and try to do some benchmarking. I agree that this is the simplest route to a fix if we can stand the overhead. regards, tom lane
Re: [HACKERS] Add CREATE support to event triggers
Robert Haas escribió: I think this direction has some potential. I'm not sure it's right in detail. The exact scheme you propose above won't work if you want to leave out the schema name altogether, and more generally it's not going to help very much with anything other than substituting in identifiers. What if you want to add a column called satellite_id to every table that gets created, for example? What if you want to make the tables UNLOGGED? I don't see how that kind of thing is going to work at all cleanly. Thanks for the discussion. I am building some basic infrastructure to make this possible, and will explore ideas to cover these oversights (not posting anything concrete yet because I expect several iterations to crash and burn before I have something sensible to post). What I can imagine that might work is that you get a JSON blob for a create table statement and then you have a function make_a_create_table_statement(json) that will turn it back into SQL. You can pass a modified blob (adding, removing, or changing the schema_name or any other element) instead and it will do its thing. I agree, except that I would prefer not to have one function for each DDL statement type; instead we would have a single function that knows how to expand arbitrary strings using arbitrary JSON parameter objects, and let ddl_rewrite.c (or whatever we call that file) deal with all the ugliness. One function per statement would increase the maintenance cost more, I think (three extra pg_proc entries every time you add a new object type? Ugh.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
Andres Freund-3 wrote: On 2014-01-06 09:12:03 -0800, Mark Dilger wrote: The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage. I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other. I don't think the benefit of being able to test tablespaces without restarts comes even close to offsetting the cost of requiring sudo permissions and introducing OS dependencies. E.g. there's pretty much no hope of making this work sensibly on windows. So I'd just leave out that part. Only skimming this thread, but even if only a handful of buildfarm animals can run this extended test bundle because of the restrictive requirements, it is likely better than discarding them altogether. The main thing in this case is to segregate out this routine so that it has to be invoked explicitly and ideally in an 'ignore if pre-reqs are missing' manner. Increasing the likelihood and frequency of test runs on what is a fairly popular platform, and that covers non-OS specific code as well, has benefits. As long as it doesn't poison anything else I don't see that much harm coming of it. David J.
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 11:22 AM, Tom Lane t...@sss.pgh.pa.us wrote: I think we can eliminate the first of those. Semaphores for spinlocks were a performance disaster fifteen years ago, and the situation has surely only gotten worse since then. I do, however, believe that --disable-spinlocks has some use while porting to a new platform. (And I don't believe the argument advanced elsewhere in this thread that everybody uses gcc; much less that gcc's atomic intrinsics are of uniformly high quality even on oddball architectures.) Heikki's idea has some attraction for porting work whether or not you believe that DSM needs to work during initial porting. By default, PG will try to create upwards of ten thousand spinlocks just for buffer headers. It's likely that that will fail unless you whack around the kernel's SysV parameters. Being able to constrain the number of semaphores to something sane would probably be useful. Having said all that, I'm not personally going to take the time to implement it, and I don't think the DSM patch should be required to either. Somebody who actually needs it can go make it work. Well, I took a look at this and it turns out not to be very hard, so here's a patch. Currently, we allocate 3 semaphores per shared buffer and a bunch of others, but the 3 per shared buffer dominate, so you end up with ~49k spinlocks for the default of 128MB shared_buffers. I chose to peg the number of semaphores at 1024, which is quite small compared to the current allocation, but the number of spinlock acquisitions that can be in progress at any given time is limited by the number of running backends. Even allowing for the birthday paradox, that should be enough to run at least a few dozen backends without suffering serious problems due to the multiplexing - especially because in real workloads, contention is usually concentrated around a small number of spinlocks that are unlikely to all be mapped to the same underlying semaphore.
I'm happy enough with this way forward. Objections? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 048a189..7eae2a1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -471,6 +471,9 @@ typedef struct
 	slock_t    *ShmemLock;
 	VariableCache ShmemVariableCache;
 	Backend    *ShmemBackendArray;
+#ifndef HAVE_SPINLOCKS
+	PGSemaphore SpinlockSemaArray;
+#endif
 	LWLock	   *LWLockArray;
 	slock_t    *ProcStructLock;
 	PROC_HDR   *ProcGlobal;
@@ -5626,6 +5629,9 @@ save_backend_variables(BackendParameters *param, Port *port,
 	param->ShmemVariableCache = ShmemVariableCache;
 	param->ShmemBackendArray = ShmemBackendArray;
+#ifndef HAVE_SPINLOCKS
+	param->SpinlockSemaArray = SpinlockSemaArray;
+#endif
 	param->LWLockArray = LWLockArray;
 	param->ProcStructLock = ProcStructLock;
 	param->ProcGlobal = ProcGlobal;
@@ -5854,6 +5860,9 @@ restore_backend_variables(BackendParameters *param, Port *port)
 	ShmemVariableCache = param->ShmemVariableCache;
 	ShmemBackendArray = param->ShmemBackendArray;
+#ifndef HAVE_SPINLOCKS
+	SpinlockSemaArray = param->SpinlockSemaArray;
+#endif
 	LWLockArray = param->LWLockArray;
 	ProcStructLock = param->ProcStructLock;
 	ProcGlobal = param->ProcGlobal;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 040c7aa..ad663ec 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -105,6 +105,7 @@ CreateSharedMemoryAndSemaphores(bool makePrivate, int port)
 	 * need to be so careful during the actual allocation phase.
 	 */
 	size = 100000;
+	size = add_size(size, SpinlockSemaSize());
 	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
 											 sizeof(ShmemIndexEnt)));
 	size = add_size(size, BufferShmemSize());
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 18ba426..4efd0fb 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -116,9 +116,24 @@ InitShmemAllocation(void)
 	Assert(shmhdr != NULL);
 
 	/*
-	 * Initialize the spinlock used by ShmemAlloc.  We have to do the space
-	 * allocation the hard way, since obviously ShmemAlloc can't be called
-	 * yet.
+	 * If spinlocks are disabled, initialize emulation layer.  We have to do
+	 * the space allocation the hard way, since obviously ShmemAlloc can't be
+	 * called yet.
+	 */
+#ifndef HAVE_SPINLOCKS
+	{
+		PGSemaphore spinsemas;
+
+		spinsemas = (PGSemaphore) (((char *) shmhdr) + shmhdr->freeoffset);
+		shmhdr->freeoffset += MAXALIGN(SpinlockSemaSize());
+		SpinlockSemaInit(spinsemas);
+		Assert(shmhdr->freeoffset <= shmhdr->totalsize);
+	}
+#endif
+
+	/*
+	 * Initialize the spinlock used by ShmemAlloc; we have to do this the hard
+	 * way, too, for the same reasons as above.
 	 */
 	ShmemLock = (slock_t *) (((char *) shmhdr) + shmhdr->freeoffset);
 	shmhdr->freeoffset +=
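The multiplexing scheme Robert describes, and the birthday-paradox argument for why a pool of 1024 semaphores suffices, can be sketched in isolation. This is an illustrative model only: the constant matches the thread, but the address-hashing helper and the collision estimate below are this sketch's own inventions, not code from the patch (which may well assign semaphores to spinlocks differently, e.g. at initialization time).

```c
#include <assert.h>
#include <stdint.h>

/* Fixed pool size, matching the constant discussed in this thread. */
#define NUM_SPINLOCK_SEMAPHORES 1024

/*
 * Map a spinlock's shared-memory address to a slot in the semaphore pool.
 * Many spinlocks share each semaphore; that is harmless as long as few of
 * them are held concurrently.
 */
static int
spinlock_sema_slot(const volatile void *lock)
{
	/* Shift off the low-order bits, which are mostly alignment. */
	return (int) ((((uintptr_t) lock) >> 4) % NUM_SPINLOCK_SEMAPHORES);
}

/*
 * Birthday-paradox estimate: probability that at least two of n
 * concurrently held spinlocks land on the same semaphore slot.
 */
static double
collision_probability(int n)
{
	double		all_distinct = 1.0;
	int			i;

	for (i = 0; i < n; i++)
		all_distinct *= (double) (NUM_SPINLOCK_SEMAPHORES - i) /
			NUM_SPINLOCK_SEMAPHORES;
	return 1.0 - all_distinct;
}
```

With a few dozen backends a collision among concurrently held locks is likely in absolute terms (for 50 simultaneous holders the estimate is around 0.7), but, as the message notes, a collision only costs anything when the colliding locks are actually contended at the same moment.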
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 2014-01-06 12:40:25 -0500, Robert Haas wrote: On Mon, Jan 6, 2014 at 11:47 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-01-06 11:08:41 -0500, Robert Haas wrote: Yea. But at least it would fail reliably instead of just under concurrency and other strange circumstances - and there'd be a safe way out. Currently there seem to be all sorts of odd behaviour possible. I simply don't have a better idea :( Is "forcibly detoast everything" a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is. The scenario I am primarily worried about is turning a record assignment which previously took up to BLOCK_SIZE + slop amount of memory into something taking up to a gigabyte. That's a pretty damn hefty change. And there's no good way of preventing it short of using a variable for each actually desired column, which imnsho isn't really a solution. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] dynamic shared memory and locks
Andres Freund and...@2ndquadrant.com writes: On 2014-01-05 14:06:52 -0500, Tom Lane wrote: I seem to recall that there was some good reason for keeping all the LWLocks in an array, back when the facility was first designed. I'm too lazy to research the point right now, but you might want to go back and look at the archives around when lwlock.c was written. Your proposal is at http://www.postgresql.org/message-id/1054.1001520...@sss.pgh.pa.us - afaics there hasn't been much discussion about implementation details at that level. Yeah, there wasn't :-(. On reflection I believe that my reason for building it like this was partially to reduce the number of pointer variables needed, and partially to avoid exposing the contents of the LWLock struct type to the rest of the backend. Robert's idea of continuing to keep the fixed LWLocks in an array would solve the first problem, but I don't think there's any way to avoid exposing the struct type if we want to embed the things into other structs, or even just rely on array indexing. OTOH, the LWLock mechanism has been stable for long enough now that we can probably suppose this struct is no more subject to churn than any other widely-known one, so maybe that consideration is no longer significant. Another nice benefit of switching to separate allocations is that we could get rid of the horrid kluge that lwlock.h is the place that defines NUM_BUFFER_PARTITIONS and some other constants that don't belong there. So on the whole I'm in favor of this, as long as we can avoid any notational churn at most call sites. Should we get rid of RequestAddinLWLocks() and LWLockAssign(), in favor of telling extensions to just allocate some space and do LWLockInit() on it? BTW, NumLWLocks() also becomes a big problem if lwlock creation gets decentralized. This is only used by SpinlockSemas() ... so maybe we need to use Heikki's modulo idea to get rid of the need for that to know the number of LWLocks, independently of DSM.
regards, tom lane
Re: [HACKERS] Add CREATE support to event triggers
On Mon, Jan 6, 2014 at 1:17 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I agree, except that I would prefer not to have one function for each DDL statement type; instead we would have a single function that knows to expand arbitrary strings using arbitrary JSON parameter objects, and let ddl_rewrite.c (or whatever we call that file) deal with all the ugliness. Yeah, that might work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] truncating pg_multixact/members
On Sat, Jan 4, 2014 at 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: As far as back-patching the GUCs, my thought would be to back-patch them but mark them GUC_NOT_IN_SAMPLE in 9.3, so we don't have to touch the default postgresql.conf. That seems bizarre and pointless. Keep in mind that 9.3 is still wet behind the ears and many many people haven't adopted it yet. If we do what you're suggesting then we're creating a completely useless inconsistency that will nonetheless affect all those future adopters ... while accomplishing nothing much for those who have already installed 9.3. The latter are not going to have these GUCs in their existing postgresql.conf, true, but there's nothing we can do about that. (Hint: GUC_NOT_IN_SAMPLE doesn't actually *do* anything, other than prevent the variable from being shown by SHOW ALL, which is not exactly helpful here.) Well, I guess what I'm really wondering is whether we should refrain from patching postgresql.conf.sample in 9.3, even if we add the GUC, just because people may have existing configuration files that they've already modified, and it could perhaps create confusion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
I was already starting to consider making the chroot logic optional, based on the resistance expressed by folks on this thread. How about the following: During the configure phase, it checks for chroot and setuid and friends that it will need. The regression suite has config parameters to specify where the chroot directories are to live, defaulting to something sensible. We have two almost identical make targets, called something like 'replicationcheck' and 'sudofullreplicationcheck', and only do the chroot stuff if uid=0, the directories exist, the bind mounts exist, and the make target was 'sudofullreplicationcheck'. The tablespace tests would have to be optional, only running in the full test and not the non-full test, and that makes some complications with having two different expectations (in the sense of the results/ vs. expected/ directories). I'm inclined to change the name of the tests from 'replicationtest' or 'replicationcheck' to something broader like 'clustercheck', owing to the expectation that more than replication could be tested in this framework. The sudofull prefix is just a placeholder -- I don't like that naming convention. Not sure about the name to use. mark On Monday, January 6, 2014 10:17 AM, David Johnston pol...@yahoo.com wrote: Andres Freund-3 wrote On 2014-01-06 09:12:03 -0800, Mark Dilger wrote: The reason I was going to all the trouble of creating chrooted environments was to be able to replicate clusters that have tablespaces. Not doing so makes the test code simpler at the expense of reducing test coverage. I am using the same binaries. The chroot directories are not chroot jails. I'm intentionally bind mounting out to all the other directories on the system, except the other clusters' data directories and tablespace directories. The purpose of the chroot is to make the paths the same on all clusters without the clusters clobbering each other.
I don't think the benefit of being able to test tablespaces without restarts comes even close to offsetting the cost of requiring sudo permissions and introducing OS dependencies. E.g. there's pretty much no hope of making this work sensibly on Windows. So I'd just leave out that part. Only skimming this thread, but even if only a handful of buildfarm animals can run this extended test bundle because of the restrictive requirements, that is likely better than discarding the tests altogether. The main thing in this case is to segregate out this routine so that it has to be invoked explicitly, and ideally in an "ignore if pre-reqs are missing" manner. Increasing the likelihood and frequency of test runs on what is a fairly popular platform, covering non-OS-specific code as well, has benefits. As long as it doesn't poison anything else, I don't see much harm coming of it. David J.
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Mon, Jan 6, 2014 at 9:48 AM, Amit Kapila amit.kapil...@gmail.com wrote: Couldn't we also handle this by postponing FreeConfigVariables until after the if (error) block? Wouldn't doing it that way lead to a bigger memory leak, if the error level is ERROR? Though the current fix can also leak memory, it will be just for ErrorConfFile_save. I think a similar case can happen for 'pre_value' in the current code as well; that's why I have fixed it in a similar way in the patch. I was assuming that error recovery would reset the containing memory context, but I'm not sure what memory context we're executing in at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
On Sun, Jan 5, 2014 at 10:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Fabrízio de Royes Mello fabriziome...@gmail.com writes: You are correct. pg_dump exports reloptions using the WITH clause of the CREATE TABLE statement. I.e.: CREATE TABLE foo ( ) WITH (autovacuum_enabled=false, bdr.do_replicate=false); So if this statement checks whether the 'bdr' extension is loaded, then a partial restore can fail. I see absolutely *nothing* wrong with failing that command if bdr is not installed. For an analogy, if this table includes a column of type bar defined by some extension baz, we are certainly going to fail the CREATE TABLE if baz isn't installed. Now, if bdr is installed but the validation doesn't happen unless bdr is loaded in some sense, then that is an implementation deficiency that I think we can insist be rectified before this feature is accepted. We could add a catalog pg_custom_reloption with a reloption namespace, a reloption name, and a pg_proc OID for a checker-function. This is a lot more overhead than just having a hook the way we do for GUCs, and I'm not sure how you'd handle invalidation, but in theory it solves the problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] How to reproduce serialization failure for a read only transaction.
If two transactions both read and write, I can easily reproduce the following: could not serialize access due to read/write dependencies among transactions. However, the 9.3 documentation says that "When relying on Serializable transactions to prevent anomalies, it is important that any data read from a permanent user table not be considered valid until the transaction which read it has successfully committed. This is true even for read-only transactions." I cannot get a read-only transaction to fail because of serialization anomalies, though. Can someone show me a working example please?
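[For reference, the pattern the documentation has in mind is the three-session "deposit report" example from the SSI literature. The sketch below is adapted from that example (table and column names are illustrative), and the interleaving across sessions is essential; one of the three transactions, possibly the read-only one, is rolled back with a serialization failure, and exactly which depends on timing.]

-- One-time setup:
CREATE TABLE control (deposit_no int NOT NULL);
INSERT INTO control VALUES (1);
CREATE TABLE receipt (receipt_no serial, deposit_no int, amount numeric);

-- Session 1 (read-write): file a receipt against the current batch,
-- but do not commit yet.
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO receipt (deposit_no, amount)
    SELECT deposit_no, 100.00 FROM control;

-- Session 2 (read-write): close the batch, and commit.
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE control SET deposit_no = deposit_no + 1;
COMMIT;

-- Session 3 (read-only): report on the batch that just closed.  It sees
-- session 2's update but would miss session 1's still-pending receipt.
BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT deposit_no FROM control;
SELECT * FROM receipt WHERE deposit_no = 1;
COMMIT;

-- Session 1: COMMIT;  -- by now one of the transactions has been
-- cancelled with a "could not serialize access" error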
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: Well, I took a look at this and it turns out not to be very hard, so here's a patch. Currently, we allocate 3 semaphores per shared buffer and a bunch of others, but the 3 per shared buffer dominate, so you end up with ~49k spinlocks for the default of 128MB shared_buffers. I chose to peg the number of semaphores at 1024, which is quite small compared to the current allocation, but the number of spinlock acquisitions that can be in progress at any given time is limited by the number of running backends. Even allowing for the birthday paradox, that should be enough to run at least a few dozen backends without suffering serious problems due to the multiplexing - especially because in real workloads, contention is usually concentrated around a small number of spinlocks that are unlikely to all be mapped to the same underlying semaphore. I'm happy enough with this way forward. Objections? -1 for the any_spinlock_held business (useless overhead IMO, as it doesn't have anything whatsoever to do with enforcing the actual coding rule). And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h, and maybe dropping SpinlockSemas() altogether in favor of just referencing the constant. Otherwise this seems reasonable. regards, tom lane
Re: [HACKERS] truncating pg_multixact/members
Robert Haas robertmh...@gmail.com writes: On Sat, Jan 4, 2014 at 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Keep in mind that 9.3 is still wet behind the ears and many many people haven't adopted it yet. If we do what you're suggesting then we're creating a completely useless inconsistency that will nonetheless affect all those future adopters ... while accomplishing nothing much for those who have already installed 9.3. The latter are not going to have these GUCs in their existing postgresql.conf, true, but there's nothing we can do about that. (Hint: GUC_NOT_IN_SAMPLE doesn't actually *do* anything, other than prevent the variable from being shown by SHOW ALL, which is not exactly helpful here.) Well, I guess what I'm really wondering is whether we should refrain from patching postgresql.conf.sample in 9.3, even if we add the GUC, just because people may have existing configuration files that they've already modified, and it could perhaps create confusion. If we don't update postgresql.conf.sample then we'll just be creating different confusion. My argument above is that many more people are likely to be affected in the future by an omission in postgresql.conf.sample than would be affected now by an inconsistency between postgresql.conf.sample and their actual conf file. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, I took a look at this and it turns out not to be very hard, so here's a patch. Currently, we allocate 3 semaphores per shared buffer and a bunch of others, but the 3 per shared buffer dominate, so you end up with ~49k spinlocks for the default of 128MB shared_buffers. I chose to peg the number of semaphores at 1024, which is quite small compared to the current allocation, but the number of spinlock acquisitions that can be in progress at any given time is limited by the number of running backends. Even allowing for the birthday paradox, that should be enough to run at least a few dozen backends without suffering serious problems due to the multiplexing - especially because in real workloads, contention is usually concentrated around a small number of spinlocks that are unlikely to all be mapped to the same underlying semaphore. I'm happy enough with this way forward. Objections? -1 for the any_spinlock_held business (useless overhead IMO, as it doesn't have anything whatsoever to do with enforcing the actual coding rule). Hmm. I thought that was a pretty well-aimed bullet myself; why do you think that it isn't? I don't particularly mind ripping it out, but it seemed like a good automated test to me. And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h, and maybe dropping SpinlockSemas() altogether in favor of just referencing the constant. Otherwise this seems reasonable. As far as pg_config_manual.h is concerned, is this the sort of thing you have in mind?

#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [PATCH] Store Extension Options
Robert Haas robertmh...@gmail.com writes: On Sun, Jan 5, 2014 at 10:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Now, if bdr is installed but the validation doesn't happen unless bdr is loaded in some sense, then that is an implementation deficiency that I think we can insist be rectified before this feature is accepted. We could add a catalog pg_custom_reloption with a reloption namespace, a reloption name, and a pg_proc OID for a checker-function. This is a lot more overhead than just having a hook the way we do for GUCs, and I'm not sure how you'd handle invalidation, but in theory it solves the problem. If we're willing to tie the reloption names to extension names, which seems reasonable to me, then we don't need a new catalog --- just add a checker-function column to pg_extension. I don't follow your point about invalidation. Once an extension has accepted a reloption value, it doesn't get to change its mind later; it has to deal with that value somehow forevermore. Using a hook, or failing to validate the value at all, certainly isn't going to make that requirement go away. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 1:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: OTOH, the LWLock mechanism has been stable for long enough now that we can probably suppose this struct is no more subject to churn than any other widely-known one, so maybe that consideration is no longer significant. On the whole, I'd say it's been more stable than most. But even if we do decide to change it, I'm not sure that really matters very much. We whack widely-used data structures around fairly regularly, and I haven't observed that to cause many problems. Just as one concrete example, PGPROC changes pretty regularly, and I'm not aware of much code that's been broken by that. Should we get rid of RequestAddinLWLocks() and LWLockAssign(), in favor of telling extensions to just allocate some space and do LWLockInit() on it? I don't really see any reason to deprecate that mechanism - it'll just make it harder for people who want to write code that works on multiple PostgreSQL releases to do so easily, and it's my belief that there's a decent amount of third-party code that uses those APIs. EnterpriseDB has some, for example. Broadly, I don't see any problem with presuming that most code that uses lwlocks will be happy to keep those lwlocks in the main array while allowing them to be stored elsewhere in those cases where that's not convenient. Perhaps in the long run that won't be true, but then again perhaps it will. Either way, I don't see a real reason to rush into a change in policy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
Andres Freund and...@2ndquadrant.com writes: On 2014-01-06 12:40:25 -0500, Robert Haas wrote: Is forcibly detoast everything a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is. The scenario I am primarily worried about is turning a record assignment which previously took up to BLOCK_SIZE + slop amount of memory into something taking up to a gigabyte. That's a pretty damn hefty change. And there's no good way of preventing it short of using a variable for each actually desired column which imnsho isn't really a solution. Dunno ... if you have a table that contains a gigabyte-width column, should you be all that surprised if SELECT * INTO r FROM table results in r occupying about a gigabyte? And I can't count the number of times I've heard people deprecate using SELECT * at all in production code, so I don't agree with the claim that listing the columns you want is an unacceptable solution. I don't doubt that there are some folks for whom this would be a noticeable space-consumption hit compared to current behavior, but I have a hard time working up a lot of sympathy for them. I'm more concerned about the possible performance hit from detoasting more-reasonably-sized columns (say in the tens-of-KB range) when they might not get used. But we really need to benchmark that rather than just guess about whether it's a problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: -1 for the any_spinlock_held business (useless overhead IMO, as it doesn't have anything whatsoever to do with enforcing the actual coding rule). Hmm. I thought that was a pretty well-aimed bullet myself; why do you think that it isn't? I don't particularly mind ripping it out, but it seemed like a good automated test to me. The coding rule is only short straight-line code while holding a spinlock. That could be violated in any number of nasty ways without trying to take another spinlock. And since the whole point of the rule is that spinlock-holding code segments should be quick, adding more overhead to them doesn't seem very nice, even if it is only done in Assert builds. I agree it'd be nicer if we had some better way than mere manual inspection to enforce proper use of spinlocks; but this change doesn't seem to me to move the ball downfield by any meaningful distance. And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h, and maybe dropping SpinlockSemas() altogether in favor of just referencing the constant. Otherwise this seems reasonable. As far as pg_config_manual.h is concerned, is this the sort of thing you have in mind?

#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif

I think we can just define it unconditionally, don't you? It shouldn't get referenced in HAVE_SPINLOCKS builds. Or is the point that you want to enforce that? regards, tom lane
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 1:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: OTOH, the LWLock mechanism has been stable for long enough now that we can probably suppose this struct is no more subject to churn than any other widely-known one, so maybe that consideration is no longer significant. On the whole, I'd say it's been more stable than most. But even if we do decide to change it, I'm not sure that really matters very much. Actually, the real value of a module-local struct definition is that you can be pretty darn sure that nothing except the code in that file is manipulating the struct contents. I would've preferred that we expose only an abstract struct definition, but don't quite see how to do that if we're going to embed the things in buffer headers. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 3:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: -1 for the any_spinlock_held business (useless overhead IMO, as it doesn't have anything whatsoever to do with enforcing the actual coding rule). Hmm. I thought that was a pretty well-aimed bullet myself; why do you think that it isn't? I don't particularly mind ripping it out, but it seemed like a good automated test to me. The coding rule is only short straight-line code while holding a spinlock. That could be violated in any number of nasty ways without trying to take another spinlock. And since the whole point of the rule is that spinlock-holding code segments should be quick, adding more overhead to them doesn't seem very nice, even if it is only done in Assert builds. I agree it'd be nicer if we had some better way than mere manual inspection to enforce proper use of spinlocks; but this change doesn't seem to me to move the ball downfield by any meaningful distance. Well, my thought was mainly that, while it may be a bad idea to take another spinlock while holding a spinlock under any circumstances, somebody might do it and it might appear to work just fine. The most likely sequences seems to me to be something like SpinLockAcquire(...) followed by LWLockConditionalAcquire(), thinking that things are OK because the lock acquisition is conditional - but in fact the conditional acquire still takes the spinlock unconditionally. Even so, without --disable-spinlocks, this will probably work just fine. Most likely there won't even be a performance problem, because a lwlock has to be *very* hot for the spinlock acquisitions to become a problem. But with --disable-spinlocks, it will unpredictably self-deadlock. And since few developers are likely to test with --disable-spinlocks, the problem most likely won't be noticed. 
When someone does then try to use --disable-spinlocks to port to a new platform and starts hitting the deadlocks, they may not know enough to attribute them to the real cause. All in all it doesn't seem like unduly expensive insurance, but I can remove it if the consensus is against it. And I'd suggest defining NUM_SPINLOCK_SEMAPHORES in pg_config_manual.h, and maybe dropping SpinlockSemas() altogether in favor of just referencing the constant. Otherwise this seems reasonable. As far as pg_config_manual.h is concerned, is this the sort of thing you have in mind?

#ifndef HAVE_SPINLOCKS
#define NUM_SPINLOCK_SEMAPHORES 1024
#endif

I don't think it really matters much; seems like a style question to me. That's why I asked. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
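The self-deadlock hazard described above is easy to model concretely. The sketch below is a toy (the pool size, names, and would-block reporting are all invented for illustration; a real SysV semaphore simply sleeps at zero): once a backend holds one emulated spinlock, any acquire of a second lock that happens to share the same semaphore can never succeed, because the count the second acquire waits on is the one the same backend already decremented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POOL_SIZE 4				/* tiny pool so a collision is easy to hit */

/* Stand-in for the semaphore pool; each count starts at 1 (unlocked). */
static int	sema_count[POOL_SIZE] = {1, 1, 1, 1};

static int
slot_for(const void *lock)
{
	return (int) ((((uintptr_t) lock) >> 4) % POOL_SIZE);
}

/*
 * Emulated SpinLockAcquire.  Where a real semaphore would block, we return
 * false instead; in the nested scenario the block would last forever, i.e.
 * it is a self-deadlock.
 */
static bool
emulated_spin_acquire(const void *lock)
{
	int			slot = slot_for(lock);

	if (sema_count[slot] == 0)
		return false;
	sema_count[slot]--;
	return true;
}

static void
emulated_spin_release(const void *lock)
{
	sema_count[slot_for(lock)]++;
}
```

Here the (illustrative) lock addresses 0x10 and 0x50 map to the same slot, so acquiring the second lock while holding the first reports "would block": exactly the address-dependent, unpredictable self-deadlock that only shows up under --disable-spinlocks.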
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 3:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 1:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: OTOH, the LWLock mechanism has been stable for long enough now that we can probably suppose this struct is no more subject to churn than any other widely-known one, so maybe that consideration is no longer significant. On the whole, I'd say it's been more stable than most. But even if we do decide to change it, I'm not sure that really matters very much. Actually, the real value of a module-local struct definition is that you can be pretty darn sure that nothing except the code in that file is manipulating the struct contents. I would've preferred that we expose only an abstract struct definition, but don't quite see how to do that if we're going to embed the things in buffer headers. Agreed. I think it's good to keep struct definitions private as much as possible, and I do. But I don't think this is going to cause a big problem either; lwlocks have been around for a long time and the conventions for using them are pretty well understood, I think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 3:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: I agree it'd be nicer if we had some better way than mere manual inspection to enforce proper use of spinlocks; but this change doesn't seem to me to move the ball downfield by any meaningful distance. Well, my thought was mainly that, while it may be a bad idea to take another spinlock while holding a spinlock under any circumstances, somebody might do it and it might appear to work just fine. The most likely sequences seems to me to be something like SpinLockAcquire(...) followed by LWLockConditionalAcquire(), thinking that things are OK because the lock acquisition is conditional - but in fact the conditional acquire still takes the spinlock unconditionally. The point I'm making is that no such code should get past review, whether it's got an obvious performance problem or not. There's a huge amount of stuff that in principle we could add overhead to Assert-enabled builds for, but we prefer not to --- an example not too far afield from this issue is that there's no mechanism for detecting deadlocks among multiple LWLock holders. If we go too far down the path of adding useless checks to Assert builds, people will stop using such builds for routine development, which would surely be a large net negative reliability-wise. I'd be okay with adding overhead to SpinLockAcquire/Release if it had some meaningful probability of catching real bugs, but I don't see that that claim can be made here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 3:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: I agree it'd be nicer if we had some better way than mere manual inspection to enforce proper use of spinlocks; but this change doesn't seem to me to move the ball downfield by any meaningful distance. Well, my thought was mainly that, while it may be a bad idea to take another spinlock while holding a spinlock under any circumstances, somebody might do it and it might appear to work just fine. The most likely sequences seems to me to be something like SpinLockAcquire(...) followed by LWLockConditionalAcquire(), thinking that things are OK because the lock acquisition is conditional - but in fact the conditional acquire still takes the spinlock unconditionally. The point I'm making is that no such code should get past review, whether it's got an obvious performance problem or not. Sure, I agree, but we all make mistakes. It's just a judgement call as to how likely you think it is that someone might make this particular mistake, a topic upon which opinions may vary. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On 06/01/2014 03:14, Robert Haas wrote: That's up to the application. After calling dsm_create(), you call dsm_segment_handle() to get the 32-bit integer handle for that segment. Then you have to get that to the other process(es) somehow. If you're trying to share a handle with a background worker, you can stuff it in bgw_main_arg. Otherwise, you'll probably need to store it in the main shared memory segment, or a file, or whatever. Well, that works for sysv shm, sure. But I was interested (possibly from Konstantin) how the handle transfer takes place at the moment, particularly if it is possible to create additional segments dynamically. I haven't looked at the extension at all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On 06/01/2014 04:20, Amit Kapila wrote: Duplicate handle should work, but we need to communicate the handle to other process using IPC. Only if the other process needs to use it. The IPC is not to transfer the handle to the other process, just to tell it which slot in its handle table contains the handle. If you just want to ensure that its use-count never goes to zero, the receiver does not need to know what the handle is. However ... The point remains that you need to duplicate it into every process that might want to use it subsequently, so it makes sense to DuplicateHandle into the parent, and then to advertise that handle value publicly so that other child processes can DuplicateHandle it back into their own process. The handle value can change so you also need to refer to the handle in the parent and map it in each child to the local equivalent. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Mon, Jan 6, 2014 at 4:04 PM, james ja...@mansionfamily.plus.com wrote: The point remains that you need to duplicate it into every process that might want to use it subsequently, so it makes sense to DuplicateHandle into the parent, and then to advertise that handle value publicly so that other child processes can DuplicateHandle it back into their own process. Well, right now we just reopen the same object from all of the processes, which seems to work fine and doesn't require any of this complexity. The only problem I don't know how to solve is how to make a segment stick around for the whole postmaster lifetime. If duplicating the handle into the postmaster without its knowledge gets us there, it may be worth considering, but that doesn't seem like a good reason to rework the rest of the existing mechanism. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On Mon, Jan 6, 2014 at 2:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jan 5, 2014 at 10:50 PM, Tom Lane t...@sss.pgh.pa.us wrote: Now, if bdr is installed but the validation doesn't happen unless bdr is loaded in some sense, then that is an implementation deficiency that I think we can insist be rectified before this feature is accepted. We could add a catalog pg_custom_reloption with a reloption namespace, a reloption name, and a pg_proc OID for a checker-function. This is a lot more overhead than just having a hook the way we do for GUCs, and I'm not sure how you'd handle invalidation, but in theory it solves the problem. If we're willing to tie the reloption names to extension names, which seems reasonable to me, then we don't need a new catalog --- just add a checker-function column to pg_extension. That seems sketchy to me. If somebody sets up a home-brew replication solution, I can't see why they should have to package it as an extension to register a reloption. So far, extensions are all about packaging, not functionality, and I'm not excited about changing that. I don't follow your point about invalidation. Once an extension has accepted a reloption value, it doesn't get to change its mind later; it has to deal with that value somehow forevermore. Using a hook, or failing to validate the value at all, certainly isn't going to make that requirement go away. Well... I'm assuming that the contents of the pg_custom_reloption catalog (or whatever catalog we use) would have to be loaded into some backend-local cache on first use, so it would need to be invalidated later as necessary. But come to think of it, that's actually not a problem at all; we can easily register a syscache callback and that should work fine. Not sure what I was worried about. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] truncating pg_multixact/members
On Mon, Jan 6, 2014 at 2:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jan 4, 2014 at 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: Keep in mind that 9.3 is still wet behind the ears and many many people haven't adopted it yet. If we do what you're suggesting then we're creating a completely useless inconsistency that will nonetheless affect all those future adopters ... while accomplishing nothing much for those who have already installed 9.3. The latter are not going to have these GUCs in their existing postgresql.conf, true, but there's nothing we can do about that. (Hint: GUC_NOT_IN_SAMPLE doesn't actually *do* anything, other than prevent the variable from being shown by SHOW ALL, which is not exactly helpful here.) Well, I guess what I'm really wondering is whether we should refrain from patching postgresql.conf.sample in 9.3, even if we add the GUC, just because people may have existing configuration files that they've already modified, and it could perhaps create confusion. If we don't update postgresql.conf.sample then we'll just be creating different confusion. My argument above is that many more people are likely to be affected in the future by an omission in postgresql.conf.sample than would be affected now by an inconsistency between postgresql.conf.sample and their actual conf file. I don't really have a horse in the race, so I'm OK with that if that's the consensus. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to reproduce serialization failure for a read only transaction.
On Jan6, 2014, at 20:41 , AK alk...@gmail.com wrote: If two transactions both read and write, I can easily reproduce the following: could not serialize access due to read/write dependencies among transactions. However, the 9.3 documentation says that When relying on Serializable transactions to prevent anomalies, it is important that any data read from a permanent user table not be considered valid until the transaction which read it has successfully committed. This is true even for read-only transactions. I cannot have a read-only transaction fail because of serialization anomalies. Can someone show me a working example please?

A read-only transaction will abort due to a serialization failure if it observes a state of the database which doesn't exist in any serial transaction schedule. Here's an example (default isolation level is assumed to be serializable, of course):

W1: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W1: UPDATE t SET count=count+1 WHERE id=1; -- (*2)
W1: SELECT data FROM t WHERE id=2; -- (*1)
W2: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W2: UPDATE t SET count=count+1 WHERE id=2; -- (*1, *2)
W2: COMMIT;
R : START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
R : SELECT data FROM t WHERE id IN (1,2); -- (*2)
W1: COMMIT; -- R will now report a serialization error!

Due to (*1), W1 must execute before W2 in any serial schedule, since W1 reads record 2 which is later modified by W2. Due to (*2), R must execute after W2 but before W1, since it reads record 2 previously modified by W2 and record 1 later modified by W1. (Note that W1 hasn't committed at the time R acquires its snapshot.) The dependencies induced by (*1) or (*2) alone are satisfiable by a serial schedule, but both together aren't - if W1 must execute before W2 as required by (*1), then surely every transaction that runs after W2 in such a schedule also runs after W1, thus contradicting (*2). Now since (*1) alone isn't contradictory, committing W1 succeeds.
That leaves only the last line, the COMMIT of R, to fail, which it does. The gist of this example is that whether the state observed by R exists in any serial transaction schedule or not is only certain after all concurrent read-write transactions (W1 and W2) have committed. You can avoid the error above by specifying DEFERRABLE in R's START TRANSACTION command. The session will then acquire a snapshot and wait for all possibly interfering read-write transactions to commit. If the snapshot turns out to be observable in some serial schedule, the session will continue, otherwise the database will acquire a new snapshot and wait again. Thus, once the START TRANSACTION with the DEFERRABLE flag has committed, you can be sure that the transaction won't later be aborted due to a serialization error. BTW, since this is a question about how to use postgres rather than how to extend it, it actually belongs on pgsql-general, not on the hackers list. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
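In the usual dependency-graph notation, the cycle argument above can be written out compactly (a sketch, with $<$ meaning "must precede in any serial schedule"):

```latex
% (*1): W1 reads record 2 before W2 updates it  => rw-antidependency W1 -> W2
% (*2): R sees W2's update of record 2          => wr-dependency     W2 -> R
%       R reads record 1 before W1's update     => rw-antidependency R  -> W1
\begin{align*}
  W_1 \xrightarrow{rw} W_2 &\;\Longrightarrow\; W_1 < W_2 \\
  W_2 \xrightarrow{wr} R   &\;\Longrightarrow\; W_2 < R \\
  R   \xrightarrow{rw} W_1 &\;\Longrightarrow\; R < W_1
\end{align*}
% Chaining these gives W1 < W2 < R < W1 -- a cycle, so no serial
% schedule exists, and the transaction still open (here R) must abort.
```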
Re: [HACKERS] Convert Datum* to char*
On Mon, Jan 6, 2014 at 11:34 AM, Tom Lane t...@sss.pgh.pa.us wrote: Masterprojekt Naumann1 mpws201...@gmail.com writes: Nevertheless, slot-tts_nvalid is always 0. I hope that there is no other problem. You should not be touching the tts_values/tts_isnull arrays without having first called slot_getsomeattrs or slot_getallattrs. See comments in src/include/executor/tuptable.h for some documentation. Another problem is that TextDatumGetCString() is only the right thing to do if the value is, in fact, of type text. If you've got an integer column, for example, TextDatumGetCString() is not your friend. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: How to reproduce serialization failure for a read only transaction.
Hi Florian, can you explain why you state that default isolation level is assumed to be serializable, of course, when you explicitly specify the isolation level for every session - why should the default matter at all? When I am trying to reproduce the scenario which you have posted, I am observing different results. Here is my full scenario:

Session 1. Setting up:
CREATE TABLE cars(
  license_plate VARCHAR NOT NULL,
  reserved_by VARCHAR NULL
);
INSERT INTO cars(license_plate) VALUES ('SUPRUSR'),('MIDLYPH');

Session 2: W1
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE cars SET reserved_by = 'Julia' WHERE license_plate = 'SUPRUSR' AND reserved_by IS NULL;
SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH');

Session 3: W2
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE cars SET reserved_by = 'Ryan' WHERE license_plate = 'MIDLYPH' AND reserved_by IS NULL;
COMMIT;

Session 4: R
BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH');

Session 2: W1
COMMIT;
ERROR: could not serialize access due to read/write dependencies among transactions

What am I doing wrong? Thank you for your help! -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785597.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: [HACKERS] dynamic shared memory and locks
On Mon, Jan 6, 2014 at 9:48 AM, Stephen Frost sfr...@snowman.net wrote: None of these ideas are a complete solution for LWLOCK_STATS. In the other three cases noted above, we only need an identifier for the lock instantaneously, so that we can pass it off to the logger or dtrace or whatever. But LWLOCK_STATS wants to hold on to data about the locks that were visited until the end of the session, and it does that using an array that is *indexed* by lwlockid. I guess we could replace that with a hash table. Ugh. Any suggestions? Yeah, that's not fun. No good suggestions here offhand. Replacing it with a hashtable turns out not to be too bad, either in terms of code complexity or performance, so I think that's the way to go. I did some test runs with pgbench -S, scale factor 300, 32 clients, shared_buffers=8GB, five minute runs and got these results:

resultsr.lwlock-stats.32.300.300:tps = 195493.037962 (including connections establishing)
resultsr.lwlock-stats.32.300.300:tps = 189985.964658 (including connections establishing)
resultsr.lwlock-stats.32.300.300:tps = 197641.293892 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 193286.066063 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 191832.100877 (including connections establishing)
resultsr.lwlock-stats-htab.32.300.300:tps = 191899.834211 (including connections establishing)
resultsr.master.32.300.300:tps = 197939.111998 (including connections establishing)
resultsr.master.32.300.300:tps = 198641.337720 (including connections establishing)
resultsr.master.32.300.300:tps = 198675.404349 (including connections establishing)

master is the master branch, commit 10a82cda67731941c18256e009edad4a784a2994. lwlock-stats is the same, but with LWLOCK_STATS defined. lwlock-stats-htab is the same, with the attached patch and LWLOCK_STATS defined. The runs were interleaved, but the results are shown here grouped by test run.
If we assume that the 189k result is an outlier, then there's probably some regression associated with the lwlock-stats-htab patch, but not a lot. Considering that this code isn't even compiled unless you have LWLOCK_STATS defined, I think that's OK. This is only part of the solution, of course: a complete solution will involve making the hash table key something other than the lock ID. What I'm thinking we can do is make the lock ID consist of two unsigned 32-bit integers. One of these will be stored in the lwlock itself, which if my calculations are correct won't increase the size of LWLockPadded on any common platforms (a 64-bit integer would). Let's call this the tranche ID. The other will be derived from the LWLock address. Let's call this the instance ID. We'll keep a table of tranche IDs, which will be assigned consecutively starting with 0. We'll keep an array of metadata for tranches, indexed by tranche ID, and each entry will have three associated pieces of information: an array base, a stride length, and a printable name. When we need to identify an lwlock in the log or to dtrace, we'll fetch the tranche ID from the lwlock itself and use that to index into the tranche metadata array. We'll then take the address of the lwlock, subtract the array base address for the tranche, and divide by the stride length; the result is the instance ID. When reporting to the user, we can report either the tranche ID directly or the associated name for that tranche; in either case, we'll also report the instance ID. So initially we'll probably just have tranche 0: the main LWLock array. If we move buffer content and I/O locks to the buffer headers, we'll define tranche 1 and tranche 2 with the same base address: the start of the buffer descriptor array, and the same stride length, the size of a buffer descriptor. One will have the associated name buffer content lock and the other buffer I/O lock.
If we want, we can split the main LWLock array into several tranches so that we can more easily identify lock manager locks, predicate lock manager locks, and buffer mapping locks. I like this system because it's very cheap - we only need a small array of metadata and a couple of memory accesses to name a lock - but it still lets us report data in a way that's actually *more* human-readable than what we have now.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 4f88d3f..c1da2a8 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -32,6 +32,10 @@
 #include "storage/proc.h"
 #include "storage/spin.h"
 
+#ifdef LWLOCK_STATS
+#include "utils/hsearch.h"
+#endif
+
 /* We use the ShmemLock spinlock to protect LWLockAssign */
 extern slock_t *ShmemLock;
@@ -91,11 +95,17 @@
 static int	lock_addin_request = 0;
 static bool	lock_addin_request_allowed = true;
 
 #ifdef LWLOCK_STATS
+typedef struct lwlock_stats
+{
+
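For illustration, the address arithmetic described above amounts to something like the following standalone sketch (invented names, not the code from the patch; the real metadata array would live in shared memory and be populated at startup):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * A lock is identified by a (tranche id, instance id) pair.  Each tranche
 * records the base address of the array holding its locks, the stride
 * between consecutive locks, and a printable name; the instance id is
 * recovered from the lock's address alone.
 */
typedef struct
{
    const void *base;    /* start of the array holding this tranche's locks */
    size_t      stride;  /* spacing between consecutive locks */
    const char *name;    /* e.g. "buffer content lock" */
} LWLockTranche;

#define MAX_TRANCHES 8
static LWLockTranche tranches[MAX_TRANCHES];

static uint32_t
lwlock_instance_id(uint32_t tranche_id, const void *lock)
{
    const LWLockTranche *t = &tranches[tranche_id];

    /* caller must have registered the tranche (stride != 0) */
    return (uint32_t) (((const char *) lock - (const char *) t->base)
                       / t->stride);
}
```

Two tranches can share the same base and stride, as in the buffer-header example above: the content lock and the I/O lock of buffer N both map to instance ID N, distinguished only by their tranche name.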
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
On Jan6, 2014, at 23:28 , AK alk...@gmail.com wrote: can you explain why do you state that default isolation level is assumed to be serializable, of course, when you explicitly specify isolation level for every session - why should he default matter at all? Sorry, that was a leftover - I initially wrote just START TRANSACTION without specifying an isolation level. When I am trying to reproduce the scenario which you have posted, I am observing different results. Hm, yeah, I missed two things. First, dependency tracking can produce false positives, i.e. assume that dependencies exist between transactions which are actually independent. In my example, postgres fails to realize that W2 can be executed after W1, unless it uses an index scan for the UPDATE in W2. You can avoid that either by creating an index on the id column, and forcing W2 to use that by setting enable_seqscan to off, or by creating two tables t1 and t2 instead of one table t with two records (you'll have to modify the SELECT to scan both tables too). Second, since R executes its SELECT before W1 commits, postgres is already aware that R poses a problem when W1 commits, and it chooses to cancel W1 instead of R. To avoid that, R needs to do the SELECT after W1 has committed. Yet to still force R to acquire a snapshot *before* that commit (without that, there's no serialization failure since R then simply executes after W1 and W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command.
I think the following should work (or, rather, fail):

CREATE TABLE t (id INT PRIMARY KEY, count INT);
INSERT INTO t (id, count) SELECT i, 0 FROM generate_series(1,2) AS i;

W1: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W1: UPDATE t SET count=count+1 WHERE id=1;
W1: SELECT count FROM t WHERE id=2;
W2: SET enable_seqscan=off;
W2: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W2: UPDATE t SET count=count+1 WHERE id=2;
W2: COMMIT;
R : START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
R : SELECT 1;
W1: COMMIT;
R : SELECT count FROM t WHERE id IN (1,2); -- Should fail

best regards, Florian Pflug
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
On 1/6/14, 5:27 PM, Florian Pflug wrote: On Jan6, 2014, at 23:28 , AK alk...@gmail.com wrote: First, dependency tracking can produce false positives, i.e. assume that dependencies exist between transactions which are actually independent. In my example, postgres fails to realize that W2 can be executed after W1, unless it uses an index scan for the UPDATE in W2. You can avoid that either by creating an index on the id column, and forcing W2 to use that by setting enable_seqscan to off, or by creating two tables t1 and t2 instead of one table t with two records (You'll have to modify the SELECT to scan both tables too). Second, since R executes it's SELECT before W1 commits, postgres is already aware that R poses a problem when W1 commits, and it chooses to cancel W1 instead of R. To avoid that, R needs to do the SELECT after W1 committed. Yet still force R to acquire a snapshot *before* that commit (without that, there's no serialization failure since R than simply executes after W1 and W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command. I think the following should work (or, rather, fail) This email and the previous one are an awesome bit of information, can we add it to the docs somehow? Even if it's just dumping the emails into a wiki page and referencing it? -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] dynamic shared memory and locks
On 1/6/14, 2:59 PM, Robert Haas wrote: On Mon, Jan 6, 2014 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jan 6, 2014 at 3:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: I agree it'd be nicer if we had some better way than mere manual inspection to enforce proper use of spinlocks; but this change doesn't seem to me to move the ball downfield by any meaningful distance. Well, my thought was mainly that, while it may be a bad idea to take another spinlock while holding a spinlock under any circumstances, somebody might do it and it might appear to work just fine. The most likely sequences seems to me to be something like SpinLockAcquire(...) followed by LWLockConditionalAcquire(), thinking that things are OK because the lock acquisition is conditional - but in fact the conditional acquire still takes the spinlock unconditionally. The point I'm making is that no such code should get past review, whether it's got an obvious performance problem or not. Sure, I agree, but we all make mistakes. It's just a judgement call as to how likely you think it is that someone might make this particular mistake, a topic upon which opinions may vary. There have been discussions in the past about the overhead added by testing and having more than one level of test capabilities, so that facilities like the buildfarm can run more expensive testing without burdening developers with that. ISTM this is another case of that (presumably Tom's concern is the performance hit of adding an assert in a critical path). If we had a more aggressive form of assert (assert_anal? :)) we could use that here and let the buildfarm bear the brunt of that cost. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] dynamic shared memory and locks
Jim Nasby escribió: On 1/6/14, 2:59 PM, Robert Haas wrote: On Mon, Jan 6, 2014 at 3:57 PM, Tom Lane t...@sss.pgh.pa.us wrote: The point I'm making is that no such code should get past review, whether it's got an obvious performance problem or not. Sure, I agree, but we all make mistakes. It's just a judgement call as to how likely you think it is that someone might make this particular mistake, a topic upon which opinions may vary. I agree that excessive optimism might cause problems in the future. Maybe it makes sense to have such a check #ifdef'ed out on most builds to avoid extra overhead, but not having any check at all just because we trust the review process too much doesn't strike me as the best of ideas. There's been discussions in the past about the overhead added by testing and having more than one level of test capabilities so that facilities like the buildfarm can run more expensive testing without burdening developers with that. ISTM this is another case of that (presumably Tom's concern is the performance hit of adding an assert in a critical path). If we had a more aggressive form of assert (assert_anal? :)) we could use that here and let the buildfarm bore the brunt of that cost. Sounds good, but let's not enable it blindly on all machines. There are some that are under very high pressure to build and test all the branches, so if we add something too costly on them it might cause trouble. So whatever we do, it should at least have the option to opt-out, or perhaps even make it opt-in. (I'd think opt-out is better, because otherwise very few machines are going to have it enabled at all.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
On 1/4/14, 12:00 PM, Tom Lane wrote: If you have some settings that need to be table-specific, then I agree that the reloptions infrastructure is a nice place to track them. What's actually missing here is some compelling examples of such settings for plausible extensions. I've got a real world example. At work we use limited dumps of production to support development. We do a schema dump and then grab data out of certain seed tables. To mark tables as being seed, we have to store that info in a table, but that gets us into another problem: what if a table gets renamed? In order to solve that problem, we created a tables table:

CREATE TABLE tools.tables(
  id SERIAL PRIMARY KEY,
  table_schema text,
  table_name text
);

That way if we need to rename a table we update one record in one place instead of a bunch of places (we have other code that makes use of tools.tables). (And no, we can't use OIDs because they're not immutable between dumps.) This is obviously ugly and fragile. It'd be much better if we could just define a custom setting on the table itself that says hey, dump the data from here!. (FWIW, the same applies to sequences.) Actually, I just checked and our seed object stuff doesn't use tools.tables, but our inheritance framework and a change notification system we wrote does. Those are other examples of where we need to store additional information about a table and had to create a system of our own to handle it. -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] truncating pg_multixact/members
On 1/4/14, 8:19 AM, Robert Haas wrote: Also, while multixactid_freeze_min_age should be low, perhaps a million as you suggest, multixactid_freeze_table_age should NOT be lowered to 3 million or anything like it. If you do that, people who are actually doing lots of row locking will start getting many more full-table scans. We want to avoid that at all cost. I'd probably make the default the same as for vacuum_freeze_table_age, so that mxids only cause extra full-table scans if they're being used more quickly than xids. Same default as vacuum_freeze_table_age, or default TO vacuum_freeze_table_age? I'm thinking the latter makes more sense... -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 1/2/14, 1:32 PM, Tom Lane wrote:
> Heikki Linnakangas hlinnakan...@vmware.com writes:
>> The simplest fix would be to just detoast everything on assignment, but that was rejected on performance grounds in that previous thread. I don't see any other realistic way to fix this, however, so maybe we should just bite the bullet and do it anyway.
>
> Or just say "don't do that". TRUNCATE on a table that's in use by open transactions has all sorts of issues besides this one. The given example is a pretty narrow corner case anyway --- with a less contorted coding pattern, we'd still have AccessShareLock on the table, blocking the TRUNCATE from removing data. I'd still not want to blow up performance in order to make this example work.

If concurrent TRUNCATE isn't safe outside of this case, then why do we allow it? IE: why doesn't TRUNCATE take an exclusive lock on the relation? I'd much rather have working concurrent truncation than having to lock the relation, but if it's not safe we shouldn't hand people that footgun...
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On 1/6/14, 2:21 PM, Tom Lane wrote:
> Andres Freund and...@2ndquadrant.com writes:
>> On 2014-01-06 12:40:25 -0500, Robert Haas wrote:
>>> Is "forcibly detoast everything" a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is.
>>
>> The scenario I am primarily worried about is turning a record assignment which previously took up to BLOCK_SIZE + slop amount of memory into something taking up to a gigabyte. That's a pretty damn hefty change. And there's no good way of preventing it short of using a variable for each actually desired column, which imnsho isn't really a solution.
>
> Dunno ... if you have a table that contains a gigabyte-width column, should you be all that surprised if SELECT * INTO r FROM table results in r occupying about a gigabyte? And I can't count the number of times I've heard people deprecate using SELECT * at all in production code, so I don't agree with the claim that listing the columns you want is an unacceptable solution.

I see your logic, but the problem is that a good developer would have actually tested that case and said "Oh look, plpgsql isn't blindly copying the entire record." Now we're changing that behavior underneath them. That's a pretty significant change that could affect a LOT of code on the user's side. And if they've got conditional code downstream that sometimes hits the TOASTed value and sometimes doesn't, then they're in for even more fun... The deferred-access pattern of detoasting is a very powerful performance improvement and I'd hate to see us limit it in plpgsql.
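[Editor's sketch of the trade-off under discussion; table and column names are invented. Inside a plpgsql function, the two assignment styles would behave quite differently if detoasting became eager on assignment.]

```sql
-- With today's deferred detoasting, r.big_doc may be just a TOAST pointer
-- here, so the assignment stays cheap even if big_doc is huge:
SELECT * INTO r FROM docs WHERE id = some_id;

-- Listing only the columns actually used sidesteps the question entirely;
-- an eager-detoast rule would cost nothing extra for this form:
SELECT id, title INTO r FROM docs WHERE id = some_id;
```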
[HACKERS] Re: How to reproduce serialization failure for a read only transaction.
This worked for me - thank you so much! The SELECT did fail.

Also, I cannot reproduce a scenario for "applications must not depend on results read during a transaction that later aborted"; in this example the SELECT itself failed. Can you show an example where a SELECT completes, but the COMMIT blows up?
Re: [HACKERS] truncating pg_multixact/members
On Mon, Jan 6, 2014 at 7:50 PM, Jim Nasby j...@nasby.net wrote:
> On 1/4/14, 8:19 AM, Robert Haas wrote:
>> Also, while multixactid_freeze_min_age should be low, perhaps a million as you suggest, multixactid_freeze_table_age should NOT be lowered to 3 million or anything like it. If you do that, people who are actually doing lots of row locking will start getting many more full-table scans. We want to avoid that at all cost. I'd probably make the default the same as for vacuum_freeze_table_age, so that mxids only cause extra full-table scans if they're being used more quickly than xids.
>
> Same default as vacuum_freeze_table_age, or default TO vacuum_freeze_table_age? I'm thinking the latter makes more sense...

Same default. I think it's a mistake to keep leading people to think that the sensible values for one set of parameters are somehow related to a sensible set of values for the other set. They're really quite different things.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
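[Editor's note: "same default" here just means the two GUCs ship with the same literal number, rather than one referencing the other. The fragment below is illustrative only; 150 million matches the long-standing vacuum_freeze_table_age default, and the multixact setting name follows the pattern later adopted, not a recommendation from this thread.]

```
# postgresql.conf sketch -- independent settings that happen to share a value
vacuum_freeze_table_age = 150000000
vacuum_multixact_freeze_table_age = 150000000
```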
Re: [HACKERS] ERROR: missing chunk number 0 for toast value
On Mon, Jan 6, 2014 at 8:02 PM, Jim Nasby j...@nasby.net wrote:
> If concurrent TRUNCATE isn't safe outside of this case then why do we allow it? IE: why doesn't TRUNCATE exclusive lock the relation?

It *does*. The problem is that the *other* transaction that's reading the relation can still retain a TOAST pointer after it no longer holds the lock. That's uncool.
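[Editor's sketch of the hazard as a two-session timeline. This is schematic, not an exact reproduction of the case in the thread, which relies on a more contorted coding pattern to end up without the lock; table and column names are invented.]

```
Session A                                  Session B
---------                                  ---------
fetch a row into record r
  (r.big_col kept as a TOAST pointer,
   not yet detoasted)
...lock on the table is released...
                                           TRUNCATE t;
                                           -- AccessExclusiveLock now
                                           -- acquirable; TOAST data gone
first access to r.big_col detoasts it
  => ERROR: missing chunk number 0 for toast value ...
```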
[HACKERS] Fixing bug #8228 (set-valued function called in context that cannot accept a set)
I kinda forgot about this bug when I went off on vacation:
http://www.postgresql.org/message-id/e1uncv4-0007of...@wrigleys.postgresql.org

The way to fix it seems to be to do static prechecking of whether a function's input arguments can return sets, rather than relying on their behavior during the first execution. We already have the infrastructure needed, namely expression_returns_set(), so a fix can be pretty short, as illustrated in the attached proposed patch.

Now one might object that this will add an unwanted amount of overhead to query startup. expression_returns_set() is fairly cheap, since it only requires a parsetree walk and not any catalog lookups, but it's not zero cost. On the other hand, some of that cost is bought back in normal non-set cases by the fact that we needn't go through ExecMakeFunctionResult() even once.

I tried to measure whether there was a slowdown using this test case:

$ pgbench -c 4 -T 60 -S -n bench

in a non-assert build using a scale-factor-10 pgbench database, on an 8-core machine. The best reported transaction rate over five trials was 35556.680918 in current HEAD, and 35466.438468 with the patch, for an apparent slowdown of 0.3%. This is below what I'd normally consider significant, since the run-to-run variability is considerably more than that, but there may be some actual slowdown there.

We could consider a more invasive fix that would try to push the static checking back into the planner or even parser, but that would not make things any faster when queries are only executed once after planning, as is true in this test scenario. In any case, adding fields to FuncExpr/OpExpr would not be a back-patchable fix.

Thoughts? Unless someone has a better idea, I'm inclined to push this patch and call it good.
regards, tom lane

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 67dca78..6f45973 100644
*** a/src/backend/executor/execQual.c
--- b/src/backend/executor/execQual.c
*************** tupledesc_match(TupleDesc dst_tupdesc, T
*** 1634,1642 ****
   * init_fcache is presumed already run on the FuncExprState.
   *
   * This function handles the most general case, wherein the function or
!  * one of its arguments might (or might not) return a set.  If we find
!  * no sets involved, we will change the FuncExprState's function pointer
!  * to use a simpler method on subsequent calls.
   */
  static Datum
  ExecMakeFunctionResult(FuncExprState *fcache,
--- 1634,1640 ----
   * init_fcache is presumed already run on the FuncExprState.
   *
   * This function handles the most general case, wherein the function or
!  * one of its arguments can return a set.
   */
  static Datum
  ExecMakeFunctionResult(FuncExprState *fcache,
*************** restart:
*** 1906,1918 ****
  	/*
  	 * Non-set case: much easier.
  	 *
! 	 * We change the ExprState function pointer to use the simpler
! 	 * ExecMakeFunctionResultNoSets on subsequent calls.  This amounts to
! 	 * assuming that no argument can return a set if it didn't do so the
! 	 * first time.
  	 */
- 	fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResultNoSets;
-
  	if (isDone)
  		*isDone = ExprSingleResult;
--- 1904,1915 ----
  	/*
  	 * Non-set case: much easier.
  	 *
! 	 * In common cases, this code path is unreachable because we'd have
! 	 * selected ExecMakeFunctionResultNoSets instead.  However, it's
! 	 * possible to get here if an argument sometimes produces set results
! 	 * and sometimes scalar results.  For example, a CASE expression might
! 	 * call a set-returning function in only some of its arms.
  	 */
  	if (isDone)
  		*isDone = ExprSingleResult;
*************** ExecEvalFunc(FuncExprState *fcache,
*** 2371,2380 ****
  		init_fcache(func->funcid, func->inputcollid, fcache,
  					econtext->ecxt_per_query_memory, true);

! 		/* Go directly to ExecMakeFunctionResult on subsequent uses */
! 		fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
!
! 		return ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
  	}

  	/*
--- 2368,2389 ----
  		init_fcache(func->funcid, func->inputcollid, fcache,
  					econtext->ecxt_per_query_memory, true);

! 		/*
! 		 * We need to invoke ExecMakeFunctionResult if either the function itself
! 		 * or any of its input expressions can return a set.  Otherwise, invoke
! 		 * ExecMakeFunctionResultNoSets.  In either case, change the evalfunc
! 		 * pointer to go directly there on subsequent uses.
! 		 */
! 		if (fcache->func.fn_retset || expression_returns_set((Node *) func->args))
! 		{
! 			fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult;
! 			return ExecMakeFunctionResult(fcache, econtext, isNull, isDone);
! 		}
! 		else
! 		{
! 			fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResultNoSets;
! 			return ExecMakeFunctionResultNoSets(fcache, econtext, isNull, isDone);
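[Editor's sketch: the control flow the patch installs in ExecEvalFunc, outside of C. The Python below uses invented node classes, not the real PostgreSQL structures, to show the one-time static decision replacing the old first-execution heuristic.]

```python
# Toy model of the patch's idea: decide *statically*, by walking the
# argument expressions once, whether any of them can return a set,
# instead of inferring it from the first execution.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Expr:
    args: List["Expr"] = field(default_factory=list)

@dataclass
class FuncExpr(Expr):
    returns_set: bool = False  # set-returning function?

@dataclass
class CaseExpr(Expr):
    pass  # e.g. a CASE whose arms are in args

def expression_returns_set(node: Expr) -> bool:
    """Pure tree walk, cf. the real expression_returns_set():
    true if this expression or any sub-expression can produce a set."""
    if isinstance(node, FuncExpr) and node.returns_set:
        return True
    return any(expression_returns_set(a) for a in node.args)

def choose_eval_path(func: FuncExpr) -> str:
    """Mirrors the patched ExecEvalFunc: pick the evaluator once,
    from static analysis rather than first-call behavior."""
    if func.returns_set or any(expression_returns_set(a) for a in func.args):
        return "ExecMakeFunctionResult"        # general, set-capable path
    return "ExecMakeFunctionResultNoSets"      # cheaper scalar-only path
```

Mapped back to C, the returned name corresponds to the evalfunc pointer installed once and reused on subsequent calls; a CASE arm hiding a set-returning function now routes to the general path up front instead of crashing later.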
Re: [HACKERS] Get more from indices.
Tom Lane wrote:
> Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
>> [ pathkey_and_uniqueindx_v7_20131203.patch ]
>
> I started to look at this patch. I don't understand the reason for the foreach loop in index_pathkeys_are_extensible (and the complete lack of comments in the patch isn't helping). Isn't it sufficient to check that the index is unique/immediate/allnotnull and its ordering is a prefix of query_pathkeys? If not, what's the rationale for the specific tests being made on the pathkeys --- this code doesn't make much sense to me.

Thank you for taking the time to look at this patch.

I think it's not sufficient to check those things. Let me explain why the patch has that code. It is there to prevent build_join_pathkeys() from building incorrect join pathkeys; the pathkeys for a join relation constructed by a mergejoin or nestloop join are normally built just from the outer path's pathkeys. Without that code, the patch would produce an incorrect result for such a case. An example is shown below.

A simple approach to avoid this problem would be to apply this idea only when each pathkey in query_pathkeys references the indexed relation, in addition to requiring that the index is unique/immediate/allnotnull and its ordering is a prefix of query_pathkeys. That's the reason.
[Data]
CREATE TABLE t (a int not null, b int not null, c int, d text);
CREATE UNIQUE INDEX i_t_ab ON t (a, b);
INSERT INTO t (SELECT a / 5, 4 - (a % 5), a, 't' FROM generate_series(00, 09) a);
ANALYZE t;
CREATE TABLE t2 (e text, f int);
INSERT INTO t2 VALUES ('t', 2);
INSERT INTO t2 VALUES ('t', 1);
ANALYZE t2;

[Query]
EXPLAIN SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a, t.b, t.c, t.d, t2.f LIMIT 4;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Limit  (cost=0.29..3.96 rows=4 width=20)
   ->  Nested Loop  (cost=0.29..110.17 rows=120 width=20)
         Join Filter: (t.d = t2.e)
         ->  Index Scan using i_t_ab on t  (cost=0.29..107.34 rows=60 width=14)
               Index Cond: (a < 10)
         ->  Materialize  (cost=0.00..1.03 rows=2 width=6)
               ->  Seq Scan on t2  (cost=0.00..1.02 rows=2 width=6)
(7 rows)

SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a, t.b, t.c, t.d, t2.f LIMIT 4;
 a | b | c | d | e | f
---+---+---+---+---+---
 0 | 0 | 4 | t | t | 2
 0 | 0 | 4 | t | t | 1
 0 | 1 | 3 | t | t | 2
 0 | 1 | 3 | t | t | 1
(4 rows)

(Note that column f is sorted in descending order.)

Sorry for the delay.

Best regards,
Etsuro Fujita
[HACKERS] [HITB-Announce] HITB Magazine Issue 10 Out Now
Issue #10 is now available!

Hello readers, and welcome to the somewhat overdue Issue 010 of HITB Magazine. As they say, better late than never!

Since the last issue, we've also changed the HITB Security Conference Call for Papers submission guidelines to require speakers to submit a research 'white paper' to accompany their talk. The first round of papers came to us via #HITB2013KUL in October, and thankfully we now have loads of AWESOME CONTENT! We've got so much good stuff we could probably have put together two issues!

With the new change to the CFP submissions, we've decided to also change our publication schedule for 2014 to a 'per HITB SecConf' release cycle. This means you can expect a new magazine approximately every 6 months, released alongside a HITB Security event.

What else do we have planned for 2014? Next year also marks the 5th anniversary of the HITB Security Conference in Amsterdam, and we're celebrating it in traditional HITB fashion - by adding something special to our lineup: our first ever HITB hacker expo! A 3-day IT security and technology exhibition unlike anything that's been done before. Think RSA or Mobile World Congress meets Maker Faire, with a generous touch of HITBSecConf thrown in for good measure.

What exactly does that mean? Imagine an area dedicated to hackerspaces; makers with 3D printers, laser cutters and other fabrication goodies, coupled with TOOOL's Lock Picking Village, HITB and Mozilla's HackWEEKDAY developer hackathon, our Capture the Flag 'live hacking' competition, and more, all wrapped around a 3-day exhibition with Microsoft and Google as the main anchors.

The cost to attend? ABSOLUTELY NOTHING! Yup, entrance to HITB Haxpo will be F-R-E-E! Head over to http://haxpo.nl for further details and to register (we've got a new registration system too!)
On behalf of The HITB Editorial Team, I hope you enjoy this special end-of-year issue we've put together, and we wish you all a very Happy New Year and a great time ahead!

Download Issue #10 - http://magazine.hackinthebox.org/hitb-magazine.html

Regards,
Hafez Kamal
Hack in The Box (M) Sdn. Bhd
36th Floor, Menara Maxis
Kuala Lumpur City Centre
50088 Kuala Lumpur, Malaysia
Tel: +603-26157299 Fax: +603-26150088