Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 23 August 2016 at 08:27, Craig Ringer wrote:
> On 10 August 2016 at 14:44, Michael Paquier wrote:
>
>> I am looking a bit more seriously at this patch and assigned myself as
>> a reviewer.
>
> Much appreciated.
>
>> macos complains here. You may want to replace %06lds by just %06d.
>
> Yeah, or cast to a type known to be big enough. Will amend.

I used an upcast to (long), because on Linux it's a long. I don't see the
point of messing about adding a configure test for something this trivial.

>> This patch generates a core dump; use for example pg_ctl start -w and
>> you'll bump into the trace above. There is something wrong with the
>> queue handling.
>
> Huh. I didn't see that here (Fedora 23). I'll look more closely.
>
>> Do you have plans for a more generic structure for the command queue list?
>
> No plans, no. This was a weekend experiment that turned into a useful
> patch and I'm having to scrape up time for it amongst much more important
> things like logical failover / sequence decoding and various other
> replication work.
>
> Thanks for the docs review too, will amend.
>
>> + fprintf(stderr, "internal error, COPY in batch mode");
>> + abort();
>> I don't think that's a good idea.
>
> My thinking there was that it's a "shouldn't happen" case. It's a
> problem in library logic. I'd use an Assert() here in the backend.

I could printfPQExpBuffer(...) an error and return failure instead if you
think that's more appropriate. I'm not sure how the app would handle it
correctly, but OTOH it's generally better for libraries not to call
abort(). So I'll do that, but since it's an internal error that's not
meant to happen I'll skip the gettext calls.

>> Error messages should also use libpq_gettext, and perhaps
>> be stored in conn->errorMessage as we do so for OOMs happening on
>> client-side and reporting them back even if they are not expected
>> (those are blocked PQsendQueryStart in your patch).
I didn't get that last part, re PQsendQueryStart.

>> src/test/examples is a good idea to show people what this new API can
>> do, but this is never getting compiled. It could as well be possible
>> to include tests in src/test/modules/, in the same shape as what
>> postgres_fdw is doing by connecting to itself and link it to libpq. As
>> this patch complicates quite a lot fe-exec.c, I think that this would
>> be worth it. Thoughts?
>
> I think it makes sense to use the TAP framework.

Added src/test/modules/test_libpq/ with a test for async mode. Others can
be added/migrated based on that. I thought it made more sense for the
tests to live there than in src/interfaces/libpq/ since they need test
client programs and shouldn't pollute the library directory.

I've made the docs changes too. Thanks.

I fixed the list handling error. I'm amazed it appears to run fine, and
without complaint from valgrind, here, since it was an accidentally
_deleted_ line.

Re lists, I looked at simple_list.c and it's exceedingly primitive. Using
it would mean more malloc()ing, since we'll have a list cell and then a
struct pointed to by it, and we won't recycle members, but... whatever.
It's not going to matter a great deal. The reason I did it with an
embedded list originally was because that's how it's done for PGnotify,
but that's not exactly new code.

The bigger problem is that simple_list also uses pg_malloc, which won't
set conn->errorMessage; it'll just fprintf() and exit(1). I'm not
convinced it's appropriate to use that for libpq.

For now I've left list handling unchanged. If it's to move to a generic
list, it should probably be one that knows how to use
pg_malloc_extended(size, MCXT_ALLOC_NO_OOM) and emit its own
libpq-error-handling-aware error. I'm not sure whether that list should
use cell heads embedded in the structures it manages or pointing to them,
either.

Updated patch attached.
--
Craig Ringer   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From fad0def5570907f5c8e3b6d65d57e5e7678e7383 Mon Sep 17 00:00:00 2001
From: Craig Ringer
Date: Fri, 20 May 2016 12:45:18 +0800
Subject: [PATCH] Pipelining (batch) support for libpq

Allow libpq clients to avoid excessive round trips by pipelining multiple
commands into batches. A sync is only sent at the end of a batch.
Commands in a batch succeed or fail together.

Adds TAP tests for libpq at src/test/modules/test_libpq .

Includes a test program in src/test/modules/test_libpq/testlibpqbatch.c
---
 doc/src/sgml/libpq.sgml             | 478
 src/interfaces/libpq/.gitignore     |   1 +
 src/interfaces/libpq/Makefile       |   5 +
 src/interfaces/libpq/exports.txt    |   7 +
 src/interfaces/libpq/fe-connect.c   |  17 +
 src/interfaces/libpq/fe-exec.c      | 572 -
 src/interfaces/libpq/fe-protocol2.c |   6 +
 src/interfaces/libpq/fe-protocol3.c
Re: [HACKERS] Write Ahead Logging for Hash Indexes
Hi All,

Following are the steps that I have followed to verify the WAL logging of
hash indexes:

1. I used Mithun's patch to improve coverage of hash index code [1] to
verify the WAL logging of hash indexes. First I confirmed whether all the
XLOG records associated with hash indexes are covered by this patch, and
added a testcase for any hash index XLOG record that was not. I found
that one XLOG record, 'XLOG_HASH_MOVE_PAGE_CONTENTS', was not being
covered and added a small testcase for it. The patch for this is
available at [2].

2. I executed the regression test suite and found all the hash indexes
that get created as part of the regression test suite using the query
below:

SELECT t.relname index_name, t.oid
FROM pg_class t
  JOIN pg_am idx ON idx.oid = t.relam
WHERE idx.amname = 'hash';

3. Thirdly, I calculated the number of pages associated with each hash
index and compared every page of the index on the master and standby
servers using the pg_filedump tool. For example, if the number of pages
associated with 'con_hash_index' is 10, here is what I did:

On master:
----------
select pg_relation_filepath('con_hash_index');
 pg_relation_filepath
----------------------
 base/16408/16433
(1 row)

./pg_filedump -if -R 0 9 /home/edb/git-clone-postgresql/postgresql/TMP/postgres/master/base/16408/16433 > /tmp/file1

On Slave:
---------
select pg_relation_filepath('con_hash_index');
 pg_relation_filepath
----------------------
 base/16408/16433
(1 row)

./pg_filedump -if -R 0 9 /home/edb/git-clone-postgresql/postgresql/TMP/postgres/standby/base/16408/16433 > /tmp/file2

I then compared file1 and file2 using a diff tool.

Following is the list of hash indexes that got created inside the
regression database when the regression test suite was executed on the
master server:

hash_i4_index
hash_name_index
hash_txt_index
hash_f8_index
con_hash_index
hash_idx

In short, this is all I did, and I found no issues during testing. Please
let me know if you need any further details.
I would like to thank Amit for his support and guidance during the
testing phase.

[1] - https://www.postgresql.org/message-id/CAA4eK1JOBX%3DYU33631Qh-XivYXtPSALh514%2BjR8XeD7v%2BK3r_Q%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAE9k0PkNjryhSiG53mjnKFhi%2BMipJMjSa%3DYkH-UeW3bfr1HPJQ%40mail.gmail.com

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] "Some tests to cover hash_index"
Hi,

I missed attaching the patch in my previous mail. Here I attach the
patch.

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Tue, Aug 23, 2016 at 11:47 AM, Ashutosh Sharma wrote:
> Hi All,
>
> I have reverified the code coverage for hash index code using the test
> file (commit-hash_coverage_test) attached with this mailing list and
> have found that some of the code in the _hash_squeezebucket() function
> flow is not being covered. For this I have added a small testcase on top
> of the 'commit hash_coverage_test' patch. I have done this mainly to
> test Amit's WAL for hash index patch [1].
>
> I have also removed the warning message that we used to get for hash
> indexes, 'WARNING: hash indexes are not WAL-logged and their use is
> discouraged', as this message is no longer emitted for hash indexes
> after the WAL patch for hash indexes. Please have a look and let me know
> your thoughts.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1JOBX%3DYU33631Qh-XivYXtPSALh514%2BjR8XeD7v%2BK3r_Q%40mail.gmail.com
>
> With Regards,
> Ashutosh Sharma
> EnterpriseDB: http://www.enterprisedb.com
>
> On Sat, Aug 6, 2016 at 9:41 AM, Amit Kapila wrote:
>
>> On Thu, Aug 4, 2016 at 7:24 PM, Mithun Cy wrote:
>>
>>> I am attaching the patch to improve some coverage of hash index code
>>> [1]. I have added some basic tests, which mainly cover overflow
>>> pages. It took 2 sec extra time on my machine in the parallel
>>> schedule.
>>>
>>>                                            Hit    Total   Coverage
>>> old tests    Line Coverage                 780    1478    52.7
>>>              Function Coverage              63      85    74.1
>>> improvement  Line Coverage                1181    1478    79.9 %
>>> after tests  Function Coverage              78      85    91.8 %
>>
>> I think the code coverage improvement for hash index with these tests
>> seems to be quite good; however, the time for the tests seems to be
>> slightly on the higher side. Does anybody have better suggestions for
>> these tests?
>> diff --git a/src/test/regress/sql/concurrent_hash_index.sql
>> b/src/test/regress/sql/concurrent_hash_index.sql
>>
>> I wonder why you have included a new file for these tests; why can't
>> these be added to the existing hash_index.sql?
>>
>> +--
>> +-- Cause some overflow insert and splits.
>> +--
>> +CREATE TABLE con_hash_index_table (keycol INT);
>> +CREATE INDEX con_hash_index on con_hash_index_table USING HASH (keycol);
>>
>> The relation names con_hash_index* chosen in the above tests don't
>> seem to be appropriate; how about hash_split_heap* or something like
>> that?
>>
>> Register your patch in the latest CF (https://commitfest.postgresql.org/10/)
>>
>> --
>> With Regards,
>> Amit Kapila.
>> EnterpriseDB: http://www.enterprisedb.com

diff --git a/src/test/regress/expected/concurrent_hash_index.out b/src/test/regress/expected/concurrent_hash_index.out
index c3b8036..60191c0 100644
--- a/src/test/regress/expected/concurrent_hash_index.out
+++ b/src/test/regress/expected/concurrent_hash_index.out
@@ -3,7 +3,6 @@
 --
 CREATE TABLE con_hash_index_table (keycol INT);
 CREATE INDEX con_hash_index on con_hash_index_table USING HASH (keycol);
-WARNING: hash indexes are not WAL-logged and their use is discouraged
 INSERT INTO con_hash_index_table VALUES (1);
 INSERT INTO con_hash_index_table SELECT * from con_hash_index_table;
 INSERT INTO con_hash_index_table SELECT * from con_hash_index_table;
@@ -75,5 +74,15 @@ DROP TABLE hash_ovfl_temp_heap CASCADE;
 CREATE TABLE hash_ovfl_heap_float4 (x float4, y int);
 INSERT INTO hash_ovfl_heap_float4 VALUES (1.1,1);
 CREATE INDEX hash_idx ON hash_ovfl_heap_float4 USING hash (x);
-WARNING: hash indexes are not WAL-logged and their use is discouraged
 DROP TABLE hash_ovfl_heap_float4 CASCADE;
+--
+-- Test hash index insertion with squeeze bucket (XLOG_HASH_MOVE_PAGE_CONTENTS
+-- WAL record type).
+--
+CREATE TABLE hash_split_buckets (seqno int4, random int4);
+CREATE INDEX hash_idx ON hash_split_buckets USING hash (random int4_ops)
+with (fillfactor = 10);
+INSERT INTO hash_split_buckets (seqno, random) SELECT a, a*5 FROM
+GENERATE_SERIES(1, 10) a;
+REINDEX INDEX hash_idx;
+DROP TABLE hash_split_buckets;
diff --git a/src/test/regress/sql/concurrent_hash_index.sql b/src/test/regress/sql/concurrent_hash_index.sql
index 8f930d5..d3b09d0 100644
--- a/src/test/regress/sql/concurrent_hash_index.sql
+++ b/src/test/regress/sql/concurrent_hash_index.sql
@@ -78,3 +78,15 @@ CREATE TABLE hash_ovfl_heap_float4 (x float4, y int);
 INSERT INTO hash_ovfl_heap_float4 VALUES (1.1,1);
 CREATE INDEX hash_idx ON hash_ovfl_heap_float4 USING hash (x);
 DROP TABLE hash_ovfl_heap_float4 CASCADE;
+
+--
+-- Test hash index insertion with squeeze bucket (XLOG_HASH_MOVE_PAGE_CONTENTS
+-- WAL record type).
+--
+CREATE TABLE hash_split_buckets (seqno int4, random int4);
+CREATE INDEX hash_idx ON hash_split_buckets USING hash (random int4_ops)
+with (fillfactor = 10);
+INSERT INTO hash_split_buckets (seqno, random) SELECT a, a*5 FROM
+GENERATE_SERIES(1, 10) a;
+REINDEX INDEX hash_idx;
+DROP TABLE hash_split_buckets;
Re: [HACKERS] "Some tests to cover hash_index"
Hi All,

I have reverified the code coverage for hash index code using the test
file (commit-hash_coverage_test) attached with this mailing list and have
found that some of the code in the _hash_squeezebucket() function flow is
not being covered. For this I have added a small testcase on top of the
'commit hash_coverage_test' patch. I have done this mainly to test Amit's
WAL for hash index patch [1].

I have also removed the warning message that we used to get for hash
indexes, 'WARNING: hash indexes are not WAL-logged and their use is
discouraged', as this message is no longer emitted for hash indexes after
the WAL patch for hash indexes. Please have a look and let me know your
thoughts.

[1] - https://www.postgresql.org/message-id/CAA4eK1JOBX%3DYU33631Qh-XivYXtPSALh514%2BjR8XeD7v%2BK3r_Q%40mail.gmail.com

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

On Sat, Aug 6, 2016 at 9:41 AM, Amit Kapila wrote:
> On Thu, Aug 4, 2016 at 7:24 PM, Mithun Cy wrote:
>
>> I am attaching the patch to improve some coverage of hash index code
>> [1]. I have added some basic tests, which mainly cover overflow pages.
>> It took 2 sec extra time on my machine in the parallel schedule.
>>
>>                                           Hit    Total   Coverage
>> old tests    Line Coverage                 780    1478    52.7
>>              Function Coverage              63      85    74.1
>> improvement  Line Coverage                1181    1478    79.9 %
>> after tests  Function Coverage              78      85    91.8 %
>
> I think the code coverage improvement for hash index with these tests
> seems to be quite good; however, the time for the tests seems to be
> slightly on the higher side. Does anybody have better suggestions for
> these tests?
>
> diff --git a/src/test/regress/sql/concurrent_hash_index.sql
> b/src/test/regress/sql/concurrent_hash_index.sql
>
> I wonder why you have included a new file for these tests; why can't
> these be added to the existing hash_index.sql?
>
> +--
> +-- Cause some overflow insert and splits.
> +--
> +CREATE TABLE con_hash_index_table (keycol INT);
> +CREATE INDEX con_hash_index on con_hash_index_table USING HASH (keycol);
>
> The relation names con_hash_index* chosen in the above tests don't seem
> to be appropriate; how about hash_split_heap* or something like that?
>
> Register your patch in the latest CF (https://commitfest.postgresql.org/10/)
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)
Thanks for committing it, Tom! Thank you all for your help. I really like
the Postgres project! If there's anything that especially needs work, let
me know; I'd love to help.

Best,
Ryan
Re: [HACKERS] Tracking wait event for latches
On Tue, Aug 23, 2016 at 12:44 AM, Robert Haas wrote:
> On Mon, Aug 22, 2016 at 9:49 AM, Michael Paquier wrote:
>
>> The reason why I chose this way is that there are a lot of them. It is
>> painful to maintain the order of the array elements in perfect mapping
>> with the list of IDs...
>
> You can use stupid macro tricks to help with that problem...

Yeah. Still, after thinking about it, I think I would just go with an
array like the lock types and be done with it. With a comment to mention
that the order should be respected, that would be enough...

--
Michael
Re: [HACKERS] WAL consistency check facility
On Tue, Aug 23, 2016 at 10:57 AM, Michael Paquier wrote:
>
> Also, what's the use case of allowing only a certain set of rmgrs to
> be checked? Wouldn't a simple on/off switch be simpler?

I think there should be a way to test the WAL for one particular resource
manager. For example, if someone develops a new index or some other heap
storage, only that particular module can be verified. Generating WAL for
all the resource managers together can also serve the purpose, but it
will be slightly more difficult to verify.

> As presented, wal_consistency_mask is also going to be quite confusing
> for users. You should not need to apply some maths to set up this
> parameter; a list of rmgr names may be more adapted if this level of
> tuning is needed.

Yeah, that can be better.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Bug in to_timestamp().
Hi Artur Zakirov,

Please see the following review comments for
"0001-to-timestamp-format-checking-v2.patch" and share your thoughts:

#1.

15 + to_timestamp('2000JUN', 'YYYY MON')

Documented as a working case, but unfortunately it does not work:

postgres=# SELECT to_timestamp('2000JUN', 'YYYY MON');
ERROR:  invalid value "---" for "MON"
DETAIL:  The given value did not match any of the allowed values for this field.

#2.

102 +         /* Previous character was a quote */
103 +         else if (in_text)
104 +         {
105 +             if (*str == '"')
106 +             {
107 +                 str++;
108 +                 in_text = false;
109 +             }
110 +             else if (*str == '\\')
111 +             {
112 +                 str++;
113 +                 in_backslash = true;
114 +             }
115 +             else
116 +             {
117 +                 n->type = NODE_TYPE_CHAR;
118 +                 n->character = *str;
119 +                 n->key = NULL;
120 +                 n->suffix = 0;
121 +                 n++;
122 +                 str++;
123 +             }
124 +             continue;
125 +         }
126 +

The NODE_TYPE_CHAR assumption in the else block seems incorrect. What if
we have a space after the closing double quote? It will again produce
incorrect output without any error; see below:

postgres=# SELECT to_timestamp('Year: 1976, Month: May, Day: 16',
postgres(#        '"Year:" YYYY, "Month:" FMMonth, "Day:" DD');
         to_timestamp
------------------------------
 0006-05-16 00:00:00-07:52:58
(1 row)

I guess we might need NODE_TYPE_SEPARATOR and NODE_TYPE_SPACE checks as
well?

#3.

296 -     /* Ignore spaces before fields when not in FX (fixed width) mode */
297 +     /* Ignore spaces before fields when not in FX (fixed * width) mode */
298       if (!fx_mode && n->key->id != DCH_FX)
299       {
300 -         while (*s != '\0' && isspace((unsigned char) *s))
301 +         while (*s != '\0' && (isspace((unsigned char) *s)))
302               s++;

Unnecessary hunk? We should not touch any code unless it is necessary to
implement the proposed feature; otherwise it adds an unnecessary diff to
the patch and, eventually, an extra burden on reviewing the code.
Similarly, the hunk in the patch at lines 313-410 has nothing to do with
the to_timestamp behaviour improvement, IIUC. If you think these changes
need to be in, please submit a separate cleanup patch.
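To make comment #2 concrete, here is a hedged, standalone sketch (editor's illustration, not the patch's actual code) of a format-string scanner that keeps quoted literals as character nodes while classifying unquoted spaces and separators as their own node types — the distinction the patch's else branch misses:

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Node kinds for a to_timestamp-style format string (names invented). */
typedef enum NodeType
{
	NODE_CHAR,					/* literal character from a quoted section */
	NODE_SPACE,					/* whitespace outside quotes */
	NODE_SEPARATOR,				/* punctuation outside quotes */
	NODE_FORMAT					/* anything else: part of a format keyword */
} NodeType;

/* Classify one unquoted character; quoted characters are always NODE_CHAR. */
static NodeType
classify(char c)
{
	if (isspace((unsigned char) c))
		return NODE_SPACE;
	if (strchr(",-./;:", c) != NULL)
		return NODE_SEPARATOR;
	return NODE_FORMAT;
}

/*
 * Scan a format string, tracking whether we are inside a double-quoted
 * section.  The point: a space after a closing quote must come out as
 * NODE_SPACE rather than NODE_CHAR, or matching against the input value
 * silently drifts, as in the 0006-05-16 example above.  Returns the
 * number of nodes written to out[].
 */
static int
scan_format(const char *fmt, NodeType *out, int maxn)
{
	bool		in_text = false;
	int			n = 0;

	for (const char *p = fmt; *p != '\0' && n < maxn; p++)
	{
		if (in_text)
		{
			if (*p == '"')
				in_text = false;
			else
				out[n++] = NODE_CHAR;
		}
		else if (*p == '"')
			in_text = true;
		else
			out[n++] = classify(*p);
	}
	return n;
}
```

With this classification, the parser can require a space or separator in the input wherever the format has one, instead of treating it as just another literal character to consume.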
> I attached second patch "0002-to-timestamp-validation-v2.patch". With it
> PostgreSQL performs additional checks for date and time. But as I wrote,
> there is another patch in the thread "to_date_valid()" which differs
> from this patch.

@community: I am not sure what to do with this patch; should we keep it
as a separate enhancement?

Regards,
Amul Sul
Re: [HACKERS] WAL consistency check facility
On Tue, Aug 23, 2016 at 1:32 PM, Amit Kapila wrote:
> On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas wrote:
>> On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier wrote:
>>> Another pain point is: given a certain page, how do we identify which
>>> type it is? One possibility would be again to extend the AM handler
>>> with some kind of is_self function with a prototype like this:
>>> bool handler->is_self(Page);
>>> If the page is of the type of the handler, this returns true, and
>>> false otherwise. Still, here performance would suck.
>>>
>>> At the end, what we want is a clean interface, and more thought put
>>> into it.
>>
>> I think that it makes sense to filter based on the resource manager
>> ID
>
> +1.

Yes, actually that's better. That's simple enough and removes any need
for looking at pd_special.

> I think the patch currently addresses only a subset of resource
> manager IDs (mainly the heap and index resource manager IDs). Do we
> want to handle the complete resource manager list as defined in
> rmgrlist.h?

Not all of them generate FPWs. I don't think it matters much.

> Another thing that needs some thought is the UI of this patch;
> currently it is using an integer mask, which might not be the best way,
> but again, as it is intended to be mainly used for tests, it might be
> okay.

What we'd want to have is a way to visualize differences between pages
easily. Any ideas other than MASK_MARKER would be welcome, of course.

> Do we want to enable some tests in the regression suite by using this
> option?

We could get out a recovery test that sets up a standby/master pair and
runs the tests of src/test/regress with pg_regress with this parameter
enabled.

+ * bufmask.c
+ *	  Routines for buffer masking, used to ensure that buffers used for
+ *	  comparison across nodes are in a consistent state.
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California

Copyright notices need to be updated.
(It's already been 2 years!!)

Also, what's the use case of allowing only a certain set of rmgrs to be
checked? Wouldn't a simple on/off switch be simpler? As presented,
wal_consistency_mask is also going to be quite confusing for users. You
should not need to apply some maths to set up this parameter; a list of
rmgr names may be more adapted if this level of tuning is needed. Still,
it seems to me that we don't need this much.

--
Michael
Re: [HACKERS] WAL consistency check facility
Yes, I've verified the outputs and log contents after running gmake
installcheck and gmake installcheck-world. The status was marked as pass
for all the testcases.

On Mon, Aug 22, 2016 at 9:26 PM, Simon Riggs wrote:
> On 22 August 2016 at 13:44, Kuntal Ghosh wrote:
>
>> Please let me know your thoughts on this.
>
> Do the regression tests pass with this option enabled?
>
> --
> Simon Riggs   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Thanks & Regards,
Kuntal Ghosh
Re: [HACKERS] WAL consistency check facility
On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas wrote:
> On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier wrote:
>
>> Another pain point is: given a certain page, how do we identify which
>> type it is? One possibility would be again to extend the AM handler
>> with some kind of is_self function with a prototype like this:
>> bool handler->is_self(Page);
>> If the page is of the type of the handler, this returns true, and
>> false otherwise. Still, here performance would suck.
>>
>> At the end, what we want is a clean interface, and more thought put
>> into it.
>
> I think that it makes sense to filter based on the resource manager
> ID

+1. I think the patch currently addresses only a subset of resource
manager IDs (mainly the heap and index resource manager IDs). Do we want
to handle the complete resource manager list as defined in rmgrlist.h?

Another thing that needs some thought is the UI of this patch; currently
it is using an integer mask, which might not be the best way, but again,
as it is intended to be mainly used for tests, it might be okay. Do we
want to enable some tests in the regression suite by using this option?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Tue, Aug 23, 2016 at 8:54 AM, Amit Kapila wrote:
> $SUBJECT will make hash indexes reliable and usable on standbys.
> AFAIU, currently hash indexes are not recommended for use in production
> mainly because they are not crash-safe, and with this patch I hope we
> can address that limitation and recommend them for use in production.
>
> This patch is built on my earlier patch [1] making hash indexes
> concurrent. The main reason for doing so is that the earlier patch
> allows the split operation to be completed and uses light-weight
> locking, due to which operations can be logged at a granular level.
>
> WAL for different operations:
>
> This has been explained in the README as well, but I am writing it here
> again for people's convenience.
>
> ..
>
> One of the challenges in writing this patch was that the current code
> was not written with the mindset that we would need to write WAL for
> different operations. A typical example is _hash_addovflpage(), where
> pages are modified across different function calls and all
> modifications need to be done atomically, so I had to refactor some
> code so that the operations can be logged sensibly.

This patch has not done handling for OldSnapshot. Previously, we haven't
done TestForOldSnapshot() checks in hash indexes as they were not logged,
but now with this patch it makes sense to perform such checks.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
Peter Geoghegan writes:
> On Mon, Aug 22, 2016 at 6:28 PM, Michael Paquier wrote:
>> There is no need to put restrictions on those I think, and they are
>> actually supported.

> Bi-directional text support (i.e., the use of right-to-left control
> characters) is known to have security implications, FWIW. There is an
> interesting discussion of the matter here:
> http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing

The problem with implementing anything like that is that it requires
assumptions about what encoding we're dealing with, which would be
entirely not based in fact. (The DB encoding is not a good guide to what
global names are encoded as, much less what encoding some shell might
think it's using.)

			regards, tom lane
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
On Mon, Aug 22, 2016 at 6:28 PM, Michael Paquier wrote:
> There is no need to put restrictions on those I think, and they are
> actually supported.

Bi-directional text support (i.e., the use of right-to-left control
characters) is known to have security implications, FWIW. There is an
interesting discussion of the matter here:
http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing

--
Peter Geoghegan
Re: [HACKERS] Exporting more function in libpq
Alvaro Herrera writes:
> Craig Ringer wrote:
>> Shouldn't that generally be done by extending libpq to add the
>> required functionality?

> The thought that came to me was that maybe we need a separate library
> that handles the lower level operations (a "fe/be" library, if you
> will) which can be exported for others to use and is used by libpq to
> implement the slightly-higher-level functionality.

If you wanted a library that exposed something close to the wire-level
protocol, I do not think that tearing out some of the oldest and
cruftiest parts of libpq and exposing them verbatim is really the best
way to go about it.

			regards, tom lane
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
On Tue, Aug 23, 2016 at 10:19 AM, Peter Geoghegan wrote:
> I haven't looked at the patch, but offhand I wonder if it's worth
> considering control characters added by Unicode, if you haven't
> already.

There is no need to put restrictions on those I think, and they are
actually supported. Look for example at pg_upgrade/test.sh; we are
already testing them with database names :) Not BEL of course, but that
works.

--
Michael
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
I haven't looked at the patch, but offhand I wonder if it's worth
considering control characters added by Unicode, if you haven't already.

--
Peter Geoghegan
Re: [HACKERS] Forbid use of LF and CR characters in database and role names
On Fri, Aug 12, 2016 at 10:12 AM, Michael Paquier wrote:
> Note that pg_dump[all] and pg_upgrade already have safeguards against
> those things per the same routines putting quotes for execution as
> commands into psql and shell. So attached is a patch to implement this
> restriction in the backend, and I am adding that to the next CF for
> 10.0. Attached is as well a script able to trigger those errors.
> Thoughts?

I am re-sending the patch. For a reason escaping me, it is showing up as
'invalid/octet-stream'... (Thanks Bruce for noting that)

--
Michael

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index c1c0223..5746958 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -77,6 +77,7 @@ typedef struct
 } movedb_failure_params;

 /* non-export function prototypes */
+static void check_db_name(const char *dbname);
 static void createdb_failure_callback(int code, Datum arg);
 static void movedb(const char *dbname, const char *tblspcname);
 static void movedb_failure_callback(int code, Datum arg);
@@ -456,6 +457,9 @@ createdb(const CreatedbStmt *stmt)
 		/* Note there is no additional permission check in this path */
 	}

+	/* do sanity checks on database name */
+	check_db_name(dbname);
+
 	/*
 	 * Check for db name conflict. This is just to give a more friendly error
 	 * message than "unique index violation". There's a race condition but
@@ -745,6 +749,22 @@ check_encoding_locale_matches(int encoding, const char *collate, const char *cty
 				pg_encoding_to_char(collate_encoding)));
 }

+/*
+ * Perform sanity checks on the database name.
+ */
+static void
+check_db_name(const char *dbname)
+{
+	if (strchr(dbname, '\n') != NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("database name cannot use LF character")));
+	if (strchr(dbname, '\r') != NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("database name cannot use CR character")));
+}
+
 /* Error cleanup callback for createdb */
 static void
 createdb_failure_callback(int code, Datum arg)
@@ -949,6 +969,9 @@ RenameDatabase(const char *oldname, const char *newname)
 	int			npreparedxacts;
 	ObjectAddress address;

+	/* check format of new name */
+	check_db_name(newname);
+
 	/*
 	 * Look up the target database's OID, and get exclusive lock on it. We
 	 * need this for the same reasons as DROP DATABASE.
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index b6ea950..8954e16 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -57,6 +57,21 @@ static void DelRoleMems(const char *rolename, Oid roleid, bool admin_opt);

+/* Do sanity checks on role name */
+static void
+check_role_name(const char *rolname)
+{
+	if (strchr(rolname, '\n') != NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("role name cannot use LF character")));
+	if (strchr(rolname, '\r') != NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("role name cannot use CR character")));
+}
+
+
 /* Check if current user has createrole privileges */
 static bool
 have_createrole_privilege(void)
@@ -111,6 +126,9 @@ CreateRole(CreateRoleStmt *stmt)
 	DefElem    *dvalidUntil = NULL;
 	DefElem    *dbypassRLS = NULL;

+	/* sanity check for role name */
+	check_role_name(stmt->role);
+
 	/* The defaults can vary depending on the original statement type */
 	switch (stmt->stmt_type)
 	{
@@ -1137,6 +1155,9 @@ RenameRole(const char *oldname, const char *newname)
 	ObjectAddress address;
 	Form_pg_authid authform;

+	/* sanity check for role name */
+	check_role_name(newname);
+
 	rel = heap_open(AuthIdRelationId, RowExclusiveLock);
 	dsc = RelationGetDescr(rel);
[HACKERS] Better locale-specific-character-class handling for regexps
I got tired of hearing complaints about the issue described in this thread:
https://www.postgresql.org/message-id/flat/24241.1329347196%40sss.pgh.pa.us

Here's a proposed fix. I've not done extensive performance testing, but it
seems to be as fast or faster than the old code in cases where there are not
too many "large" characters in the input. And, more to the point, it gets
the right answer for such large characters.

I'll add this to the upcoming commitfest.

			regards, tom lane

diff --git a/src/backend/regex/README b/src/backend/regex/README
index 6c9f483..b4a7ad7 100644
*** a/src/backend/regex/README
--- b/src/backend/regex/README
*************** and similarly additional source files re
*** 27,39 ****
  regexec. This was done to avoid exposing internal symbols globally;
  all functions not meant to be part of the library API are static.
  
! (Actually the above is a lie in one respect: there is one more global
! symbol, pg_set_regex_collation in regcomp. It is not meant to be part of
! the API, but it has to be global because both regcomp and regexec call it.
! It'd be better to get rid of that, as well as the static variables it
! sets, in favor of keeping the needed locale state in the regex structs.
! We have not done this yet for lack of a design for how to add
! application-specific state to the structs.)
  
  What's where in src/backend/regex/:
--- 27,40 ----
  regexec. This was done to avoid exposing internal symbols globally;
  all functions not meant to be part of the library API are static.
  
! (Actually the above is a lie in one respect: there are two more global
! symbols, pg_set_regex_collation and pg_reg_getcolor in regcomp. These are
! not meant to be part of the API, but they have to be global because both
! regcomp and regexec call them. It'd be better to get rid of
! pg_set_regex_collation, as well as the static variables it sets, in favor of
! keeping the needed locale state in the regex structs. We have not done this
! yet for lack of a design for how to add application-specific state to the
! structs.)
  
  What's where in src/backend/regex/:
*************** colors:
*** 274,301 ****
  an existing color has to be subdivided.
  
  The last two of these are handled with the "struct colordesc" array and
! the "colorchain" links in NFA arc structs. The color map proper (that
! is, the per-character lookup array) is handled as a multi-level tree,
! with each tree level indexed by one byte of a character's value. The
! code arranges to not have more than one copy of bottom-level tree pages
! that are all-the-same-color.
! 
! Unfortunately, this design does not seem terribly efficient for common
! cases such as a tree in which all Unicode letters are colored the same,
! because there aren't that many places where we get a whole page all the
! same color, except at the end of the map. (It also strikes me that given
! PG's current restrictions on the range of Unicode values, we could use a
! 3-level rather than 4-level tree; but there's not provision for that in
! regguts.h at the moment.)
! 
! A bigger problem is that it just doesn't seem very reasonable to have to
! consider each Unicode letter separately at regex parse time for a regex
! such as "\w"; more than likely, a huge percentage of those codes will
! never be seen at runtime. We need to fix things so that locale-based
! character classes are somehow processed "symbolically" without making a
! full expansion of their contents at parse time. This would mean that we'd
! have to be ready to call iswalpha() at runtime, but if that only happens
! for high-code-value characters, it shouldn't be a big performance hit.
  
  Detailed semantics of an NFA
--- 275,330 ----
  an existing color has to be subdivided.
  
  The last two of these are handled with the "struct colordesc" array and
! the "colorchain" links in NFA arc structs.
! 
! Ideally, we'd do the first two operations using a simple linear array
! storing the current color assignment for each character code.
! Unfortunately, that's not terribly workable for large charsets such as
! Unicode. Our solution is to divide the color map into two parts. A simple
! linear array is used for character codes up to MAX_SIMPLE_CHR, which can be
! chosen large enough to include all popular characters (so that the
! significantly-slower code paths about to be described are seldom invoked).
! Characters above that need be considered at compile time only if they
! appear explicitly in the regex pattern. We store each such mentioned
! character or character range as an entry in the "colormaprange" array in
! the colormap. (Overlapping ranges are split into unique subranges, so that
! each range in the finished list needs only a single color that describes
! all its characters.) When mapping a character above MAX_SIMPLE_CHR to a
! color at runtime, we search this list of ranges explicitly.
! 
! That's still not quite enough, though, because of lo
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On 23 August 2016 at 01:03, Robert Haas wrote:
>
> I think you should use underscores to separate all of the words
> instead of only some of them.
>
> ifassigned => if_assigned    ifrecent => if_recent

Updated patch series attached. As before, 0-4 intended for commit, 5 just
because it'll be handy to have around for people doing wraparound-related
testing.

Again, thanks for taking a look.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

From 81cbe525261a15a21415af361b3421038eccc895 Mon Sep 17 00:00:00 2001
From: Craig Ringer
Date: Fri, 19 Aug 2016 14:44:15 +0800
Subject: [PATCH 1/4] Introduce txid_status(bigint) to get status of an xact

If an application is disconnected while a COMMIT request is in flight, or
the backend crashes mid-commit, the application may not be sure whether the
commit completed successfully or was rolled back. While two-phase commit
solves this, it does so at a considerable overhead, so introduce a lighter
alternative.

txid_status(bigint) lets an application determine the status of a commit
based on an xid-with-epoch as returned by txid_current() or similar. Status
may be committed, aborted, in-progress (including prepared xacts) or null
if the xact is too old for its commit status to still be retained because
it has passed the wrap-around epoch boundary.

Applications must call txid_current() in their transactions to make much
use of this, since PostgreSQL does not automatically report an xid to the
client when one is assigned.
---
 doc/src/sgml/func.sgml               | 28 +++-
 src/backend/access/transam/clog.c    | 23 --
 src/backend/catalog/system_views.sql | 20 +
 src/backend/utils/adt/txid.c         | 82 ++++++++++
 src/include/access/clog.h            | 23 ++
 src/include/catalog/pg_proc.h        |  2 +
 src/test/regress/expected/txid.out   | 50 ++++++
 src/test/regress/sql/txid.sql        | 35 +++
 8 files changed, 239 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 169a385..8edf490 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17139,6 +17139,10 @@ SELECT collation for ('foo' COLLATE "de_DE");
     txid_visible_in_snapshot
    
+   
+    txid_status
+   
 
    The functions shown in  provide server transaction information in an
    exportable form. The main
@@ -17157,7 +17161,7 @@ SELECT collation for ('foo' COLLATE "de_DE");
        txid_current()
        bigint
-       get current transaction ID, assigning a new one if the current transaction does not have one
+       get current 64-bit transaction ID with epoch, assigning a new one if the current transaction does not have one
 
        txid_current_snapshot()
@@ -17184,6 +17188,11 @@ SELECT collation for ('foo' COLLATE "de_DE");
        boolean
        is transaction ID visible in snapshot? (do not use with subtransaction ids)
 
+      
+       txid_status(bigint)
+       txid_status
+       report the status of the given xact - committed, aborted, in-progress, or null if the xid is too old
+      
 
@@ -17254,6 +17263,23 @@ SELECT collation for ('foo' COLLATE "de_DE");
 
+   
+    txid_status(bigint) reports the commit status of a recent
+    transaction. Any recent transaction can be identified as one of
+     in-progress
+     committed
+     aborted
+    Prepared transactions are identified as in-progress.
+    The commit status of transactions older than the transaction ID wrap-around
+    threshold is no longer known by the system, so txid_status
+    returns NULL for such transactions. Applications may use
+    txid_status to determine whether a transaction committed
+    or aborted when the application and/or database server crashed or lost
+    connection while a COMMIT command was in progress.
+   
 
    The functions shown in  provide information about transactions that have
    been already committed. These functions mainly provide information about
    when the transactions

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 2634476..1a6e26d 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -41,29 +41,6 @@
 #include "miscadmin.h"
 #include "pg_trace.h"
 
-/*
- * Defines for CLOG page sizes. A page is the same BLCKSZ as is used
- * everywhere else in Postgres.
- *
- * Note: because TransactionIds are 32 bits and wrap around at 0xFFFFFFFF,
- * CLOG page numbering also wraps around at 0xFFFFFFFF/CLOG_XACTS_PER_PAGE,
- * and CLOG segment numbering at
- * 0xFFFFFFFF/CLOG_XACTS_PER_PAGE/SLRU_PAGES_PER_SEGMENT. We need take no
- * explicit notice of that fact in this module, except when comparing segment
- * and page numbers in TruncateCLOG (see CLOGPagePrecedes).
- */
-
-/* We need two bits per xact, so four xacts
[HACKERS] Re: [BUGS] Re: Missing rows with index scan when collation is not "C" (PostgreSQL 9.5)
On Wed, Mar 23, 2016 at 10:46 AM, Tom Lane wrote:
> Robert Haas writes:
>> Are you still in information-gathering mode, or are you going to issue
>> a recommendation on how we should proceed here, or what?
>
> If I had to make a recommendation right now, I would go for your
> option #4, ie shut 'em all down Scotty. We do not know the full extent
> of the problem but it looks pretty bad, and I think our first priority
> has to be to guarantee data integrity. I do not have a lot of faith in
> the proposition that glibc's is the only buggy implementation, either.

For the record, I have been able to determine by using amcheck on the
Heroku platform that en_US.UTF-8 cases are sometimes affected by an
inconsistency between strcoll() and strxfrm() behavior, which was
previously an open question. I saw only two instances of this across many
thousands of servers. For some reason, both cases involved strings with
code points from the Arabic alphabet, even though each case was from a
totally unrelated customer database.

I'll go update the Wiki page for this [1] now.

[1] https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue
-- 
Peter Geoghegan
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On 23 August 2016 at 01:03, Robert Haas wrote: > I think you should use underscores to separate all of the words > instead of only some of them. > Right. Will fix. Thanks for taking a look. > Also, note that the corresponding internal function is > GetTopTransactionIdIfAny(), which might suggest txid_current_if_any() > rather than txid_current_if_assigned(), but you could argue that your > naming is better. Yeah, I do argue that in this case. Not a hugely strong opinion, but we refer to txid_current() in the docs as: "get current transaction ID, assigning a new one if the current transaction does not have one" so I thought it'd be worth being consistent with that. It's not like txid_current() mirrors the name of GetTopTransactionId() after all ;) and I think it's more important to be consistent with what the user sees than with the code. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] PATCH: Batch/pipelining support for libpq
On 10 August 2016 at 14:44, Michael Paquier wrote:
> On Fri, Jun 3, 2016 at 8:51 PM, Dmitry Igrishin wrote:
>> BTW, I've published the HTML-ified SGML docs to
>> http://2ndquadrant.github.io/postgres/libpq-batch-mode.html as a preview.
>
> Typo detected: "Returns 1 if the batch curently being received" --
> "curently".
>
> I am looking a bit more seriously at this patch and assigned myself as
> a reviewer.

Much appreciated.

> testlibpqbatch.c:1239:73: warning: format specifies type 'long' but
> the argument has type '__darwin_suseconds_t' (aka 'int') [-Wformat]
> printf("batch insert elapsed: %ld.%06lds\n",
> elapsed_time.tv_sec, elapsed_time.tv_usec);
>
> macos complains here. You may want to replace %06lds by just %06d.

Yeah, or cast to a type known to be big enough. Will amend.

> This patch generates a core dump, use for example pg_ctl start -w and
> you'll bump into the trace above. There is something wrong with the
> queue handling.

Huh. I didn't see that here (Fedora 23). I'll look more closely.

> Do you have plans for a more generic structure for the command queue list?

No plans, no. This was a weekend experiment that turned into a useful
patch and I'm having to scrape up time for it amongst much more important
things like logical failover / sequence decoding and various other
replication work.

Thanks for the docs review too, will amend.

> + fprintf(stderr, "internal error, COPY in batch mode");
> + abort();
>
> I don't think that's a good idea. defaultNoticeProcessor can be
> overridden to allow applications to have error messages sent
> elsewhere. Error messages should also use libpq_gettext, and perhaps
> be stored in conn->errorMessage as we do so for OOMs happening on
> client-side and reporting them back even if they are not expected
> (those are blocked by PQsendQueryStart in your patch).
>
> src/test/examples is a good idea to show people what this new API can
> do, but this is never getting compiled.
> It could as well be possible
> to include tests in src/test/modules/, in the same shape as what
> postgres_fdw is doing by connecting to itself and link it to libpq. As
> this patch complicates fe-exec.c quite a lot, I think that this would
> be worth it. Thoughts?

I didn't think it added much complexity to fe-exec.c personally. A lot of
the appeal is that it has very minor impact on anything that isn't using
it.

I think it makes sense to (ab)use the recovery module tests for this,
invoking the test program from there.

Ideally I'd like to teach psql and pg_restore how to use async mode, but
that's a whole separate patch.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Why --backup-and-modify-in-place in perltidy config?
On Mon, Aug 15, 2016 at 10:19:12AM -0400, Tom Lane wrote:
> Andrew Dunstan writes:
> > On 08/14/2016 04:38 PM, Tom Lane wrote:
> >> I did a trial run following the current pgindent README procedure, and
> >> noticed that the perltidy step left me with a pile of '.bak' files
> >> littering the entire tree. This seems like a pretty bad idea because
> >> a naive "git add ." would have committed them. It's evidently because
> >> src/tools/pgindent/perltidyrc includes --backup-and-modify-in-place.
>
> BTW, after experimenting with this, I did not find any way to get perltidy
> to overwrite the original files without making a backup file.

Yep, that's why --backup-and-modify-in-place had to be used. I have a
local script to remove files with the specified extensions, but didn't
document that cleanup step.

-- 
 Bruce Momjian  http://momjian.us
 EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote:
> We could test to see how much it slows things down. But it
> may be worth paying the cost even if it ends up being kinda expensive.

Here are some numbers from a Xeon E7-8830 @ 2.13GHz running Linux 3.10
running the attached program. It's fairly noisy and I didn't run super
long tests with many repeats, but the general theme is visible. If you're
actually going to USE the memory, it's only a small extra cost to have
reserved seats. But if there's a strong chance you'll never access most of
the memory, you might call it expensive.

Segment size 1MB:
  base = shm_open + ftruncate + mmap + munmap + close = 5us
  base + fallocate = 38us
  base + memset = 332us
  base + fallocate + memset = 346us

Segment size 1GB:
  base = shm_open + ftruncate + mmap + munmap + close = 10032us
  base + fallocate = 30774us
  base + memset = 602925us
  base + fallocate + memset = 655433us

-- 
Thomas Munro
http://www.enterprisedb.com

#define _GNU_SOURCE				/* for fallocate() */
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SEGMENT_NAME "/my_test_segment"

int
main(int argc, char *argv[])
{
	int			loops,
				i;
	off_t		size;
	bool		fallocatep;
	bool		memsetp;
	bool		hugep;
	void	   *mem;

	if (argc != 6)
	{
		fprintf(stderr, "Usage: %s <loops> <size_mb> <fallocate> <memset> <huge>\n",
				argv[0]);
		return EXIT_FAILURE;
	}
	loops = atoi(argv[1]);
	size = atoi(argv[2]) * 1024 * 1024;
	fallocatep = atoi(argv[3]) != 0;
	memsetp = atoi(argv[4]) != 0;
	hugep = atoi(argv[5]) != 0;

	for (i = 0; i < loops; ++i)
	{
		int			fd;

		fd = shm_open(SEGMENT_NAME, O_CREAT | O_RDWR, S_IWUSR | S_IRUSR);
		if (fd < 0)
		{
			perror("shm_open");
			goto cleanup;
		}
		if (ftruncate(fd, size))
		{
			perror("ftruncate");
			goto cleanup;
		}
		if (fallocatep && fallocate(fd, 0, 0, size))
		{
			perror("fallocate");
			goto cleanup;
		}
		mem = mmap(0, size, PROT_READ | PROT_WRITE,
				   MAP_SHARED | (hugep ? MAP_HUGETLB : 0), fd, 0);
		if (mem == MAP_FAILED)
		{
			fprintf(stderr, "mmap failed");
			goto cleanup;
		}
		if (memsetp)
			memset(mem, 0, size);
		munmap(mem, size);
		close(fd);
	}
	shm_unlink(SEGMENT_NAME);
	return EXIT_SUCCESS;

cleanup:
	shm_unlink(SEGMENT_NAME);
	return EXIT_FAILURE;
}
Re: [HACKERS] Logical decoding of sequence advances, part II
On 23 Aug 2016 05:43, "Kevin Grittner" wrote:
>
> On Mon, Aug 22, 2016 at 3:29 PM, Robert Haas wrote:
>
> > it seems to me that
> > this is just one facet of a much more general problem: given two
> > transactions T1 and T2, the order of replay must match the order of
> > commit unless you can prove that there are no dependencies between
> > them. I don't see why it matters whether the operations are sequence
> > operations or data operations; it's just a question of whether they're
> > modifying the same "stuff".

It matters because sequence operations aren't transactional in pg. Except
when they are - operations on a newly CREATEd sequence or one where we did
a TRUNCATE ... RESTART IDENTITY.

But we don't store the xid of the xact associated with a transactional
sequence update along with the sequence update anywhere. We just rely on
no other xact knowing to look at the sequence relfilenode we're changing.
Doesn't work so well in logical rep.

We also don't store knowledge of whether or not the sequence advance is
transactional. Again important, because for two xacts t1 and t2:

* Sequence last value is 50
* T1 calls nextval. Needs a new chunk because all cached values have been
  used. Writes sequence WAL advancing seq last_value to 100, returns 51.
* T2 calls nextval, gets cached value 52.
* T2 commits
* Master crashes and we fail over to replica.

This is fine for physical rep. We replay the sequence advance and all is
well.

But for logical rep the sequence can't be treated as part of t1. If t1
rolls back or we fail over before replaying it we might return value 52
from nextval even though we replayed and committed t2 that used value 52.
Oops.

However if some xact t3 creates a sequence we can't replay updates to it
until the sequence relation is committed. And it's even more fun with
TRUNCATE ... RESTART IDENTITY where we need rollback behaviour too.

Make sense?
It's hard because sequences are sometimes, but not always, exempt from
transactional behaviour, and pg doesn't record when, since it can rely on
physical WAL redo order and can apply sequence advances before the
sequence relation is committed yet.

> > The commit order is the simplest and safest *unless* there is a
> > read-write anti-dependency a/k/a read-write dependency a/k/a
> > rw-conflict: where a read from one transaction sees the "before"
> > version of data modified by the other transaction. In such a case
> > it is necessary for correct serializable transaction behavior for
> > the transaction that read the "before" image to be replayed before
> > the write it didn't see, regardless of commit order. If you're not
> > trying to avoid serialization anomalies, it is less clear to me
> > what is best.

Could you provide an example of a case where xacts replayed in commit
order will produce incorrect results?

Remember that we aren't doing statement-based replication in pg logical
decoding/replication. We don't care how a row got changed, only that we
make consistent transitions from before state to after state for each
transaction, such that the data committed and visible on the master is
visible on the standby and no uncommitted or not-yet-visible data on the
master is committed/visible on the replica. The replica should have
visible committed data matching the master as it was when it originally
executed the xact we most recently replayed.

No locking is decoded or replayed. It is not expected that a normal
non-replication client executing some other concurrent xact will have the
same effect if run on standby as on master. It's replication, not tightly
coupled clustering.

If/when we have things like parallel decoding and replay of concurrent
xacts then issues like the dependencies you mention will start to become a
concern. We are a long way from there.
For sequences the requirement IMO is that the sequence advances on the
replica to or past the position it was at on the master when the first
xact that saw those sequence values committed. We should never see the
sequence 'behind' such that calling nextval on the replica can produce a
value already seen and stored by some committed xact on the replica.
Being a bit ahead is OK, much like pg discards sequence values on crash.

That's not that hard. The problems arise when the sequence itself isn't
committed yet, per above.
Re: [HACKERS] UTF-8 docs?
> On 8/22/16 9:32 AM, Tatsuo Ishii wrote:
>> I don't know what kind of problem you are seeing with encoding
>> handling, but at least UTF-8 is working for Japanese, French and
>> Russian.
>
> Those translations are using DocBook XML.

But in the meantime I can create UTF-8 HTML files like this:

make html
[snip]
/bin/mkdir -p html
SP_CHARSET_FIXED=1 SP_ENCODING=UTF-8 openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t sgml -i output-html -i include-index postgres.sgml

Best regards,
-- 
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: [HACKERS] distinct estimate of a hard-coded VALUES list
On 08/22/2016 07:42 PM, Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane wrote:
>>> Jeff Janes writes:
>>>> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane wrote:
>>>>> It does know it, what it doesn't know is how many duplicates there
>>>>> are.
>>>>
>>>> Does it know whether the count comes from a parsed query-string
>>>> list/array, rather than being an estimate from something else? If it
>>>> came from a join, I can see why it would be dangerous to assume they
>>>> are mostly distinct. But if someone throws 6000 things into a query
>>>> string and only 200 distinct values among them, they have no one to
>>>> blame but themselves when it makes bad choices off of that.
>>>
>>> I am not exactly sold on this assumption that applications have
>>> de-duplicated the contents of a VALUES or IN list. They haven't been
>>> asked to do that in the past, so why do you think they are doing it?
>>
>> It's hard to know, but my intuition is that most people would
>> deduplicate. I mean, nobody is going to want their query generator to
>> send X IN (1, 1, ) to the server if it could have just sent X IN (1).
>
> Also, if we patch it this way and somebody has a slow query because of a
> lot of duplicate values, it's easy to solve the problem by
> de-duplicating. But with the current code, people that have the opposite
> problem have no way to work around it.

I certainly agree it's better when a smart user can fix his query plan by
deduplicating the values than when we end up generating a poor plan due
to assuming some users are somewhat dumb.

I wonder how expensive it would be to actually count the number of
distinct values - there certainly are complex data types where the
comparisons are fairly expensive, but I would not expect those to be used
in explicit VALUES lists.

Also, maybe there's some sufficiently accurate estimation approach - e.g.
for a small number of values we can compute the number of distinct values
directly (and it's still going to be fairly cheap), while for larger
numbers we could probably sample the values similarly to what ANALYZE
does.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] pg_receivexlog does not report flush position with --synchronous
Hi guys,

while adding synchronous WAL streaming to Barman, I noticed that
pg_receivexlog - unless a replication slot is specified and --synchronous
is passed - does not become a synchronous receiver (even if its
application_name is in the synchronous_standby_names value). I was a bit
surprised by this behaviour.

By reading the pg_receivexlog documentation, I assumed that:

1) if I set application_name properly for pg_receivexlog (let's say
   'barman_receive_wal')
2) then I set synchronous_standby_names so that barman_receive_wal is
   first in the list
3) then I run pg_receivexlog with --synchronous

I would find pg_receivexlog in the 'sync' state in the pg_stat_replication
view on the master. However, I kept receiving the 'async' state.

After looking at the documentation once more, I noticed that
'--synchronous' was mentioned also in the '--slot-name' option but the
explanation - at least to me - was not very clear. I tried again by
creating a replication slot and passing it to pg_receivexlog, and this
time I could see 'sync' as the streaming state.

Looking at the code in more detail I see that, unless replication slots
are enabled, pg_receivexlog does not report the flush position (this is a
precaution that was taken when '--synchronous' was probably not around).

Please find this very short patch which - in case replication slots are
not present but --synchronous is - reports the flush position. I am not
sure if it is a bug or not. In any case I guess we should probably improve
the documentation - it's a bit late here so maybe I can try better
tomorrow with a fresher mind.
:)

Thanks,
Gabriele

-- 
 Gabriele Bartolini - 2ndQuadrant Italia - Director
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

0001-pg_receivexlog-does-not-report-flush-position-with-s.patch
Description: Binary data
Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().
On Tue, Aug 23, 2016 at 8:41 AM, Robert Haas wrote:
> On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro wrote:
>> I still think it's worth thinking about something along these lines on
>> Linux only, where holey Swiss tmpfs files can bite you. Otherwise
>> disabling overcommit on your OS isn't enough to prevent something
>> which is really a kind of deferred overcommit with a surprising
>> failure mode (SIGBUS rather than OOM SIGKILL).
>
> Yeah, I am inclined to agree. I mean, creating a DSM is fairly
> heavyweight already, so one extra system call isn't (I hope) a crazy
> overhead. We could test to see how much it slows things down. But it
> may be worth paying the cost even if it ends up being kinda expensive.
> We don't really have any way of knowing whether the caller's request
> is reasonable relative to the amount of virtual memory available, and
> converting a possible SIGBUS into an ereport(ERROR, ...) is a big win.

Here's a version of the patch that only does something special if the
following planets are aligned:

* Linux only: for now, there doesn't seem to be any reason to assume that
  other operating systems share this file-with-holes implementation quirk,
  or that posix_fallocate would work on such a fd, or which errno values
  to tolerate if it doesn't. From what I can tell, Solaris, FreeBSD etc
  either don't overcommit or do normal non-stealth overcommit with the
  usual out-of-swap failure mode for shm_open memory, with a way to turn
  overcommit off. So I put a preprocessor test in to do this just for
  __linux__, and I used "fallocate" (a non-standard Linux syscall) instead
  of "posix_fallocate".

* Glibc version >= 2.10: ancient versions and other libc implementations
  don't have fallocate, so I put a test into the configure script.

* Kernel version >= 2.6.23: the man page says that ancient kernels don't
  provide the syscall, and that glibc sets errno to ENOSYS in that case,
  so I put a check in to keep calm and carry on.
I don't know if any distros ever shipped with an old enough kernel and new
enough glibc for ENOSYS to happen in the wild; for example RHEL5 had
neither kernel nor glibc support, and RHEL6 had both. I haven't personally
tested that path.

Maybe it would be worth thinking about whether this is a condition that
should cause dsm_create to return NULL rather than ereporting, depending
on a flag along the lines of the existing DSM_CREATE_NULL_IF_MAXSEGMENTS.
But that could be a separate patch if it turns out to be useful.

-- 
Thomas Munro
http://www.enterprisedb.com

fallocate.patch
Description: Binary data
Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)
On 23/08/16 09:40, Andres Freund wrote:

Hi,

as noted in [1] I started hacking on removing the current implementation
of SRFs in the targetlist (tSRFs henceforth). IM discussion brought the
need for a description of the problem, need and approach to light.

There are several reasons for wanting to get rid of tSRFs. The primary
ones in my opinion are that the current behaviour of several SRFs in one
targetlist is confusing, and that the implementation burden currently is
all over the executor. Especially the latter is what is motivating me
working on this, because it blocks my work on making the executor faster
for queries involving significant amounts of tuples. Batching is hard if
random places in the querytree can increase the number of tuples.

The basic idea, hinted at in several threads, is, at plan time, to convert
a query like

SELECT generate_series(1, 10);

into

SELECT generate_series FROM ROWS FROM(generate_series(1, 10));

thereby avoiding the complications in the executor (c.f. execQual.c
handling of isDone/ExprMultipleResult and supporting code in many executor
nodes / node->*.ps.ps_TupFromTlist).

There are several design questions along the way:

1) How to deal with the least-common-multiple behaviour of tSRFs. E.g.

=# SELECT generate_series(1, 3), generate_series(1,2);

returning

┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │               1 │
│               1 │               2 │
│               2 │               1 │
│               3 │               2 │
└─────────────────┴─────────────────┘
(6 rows)

but

=# SELECT generate_series(1, 3), generate_series(5,7);

returning

┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               5 │
│               2 │               6 │
│               3 │               7 │
└─────────────────┴─────────────────┘

discussion in this thread came, according to my reading, to the conclusion
that that behaviour is just confusing and that the ROWS FROM behaviour of

=# SELECT * FROM ROWS FROM(generate_series(1, 3), generate_series(1,2));
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │          (null) │
└─────────────────┴─────────────────┘
(3 rows)

makes more sense.
I had always implicitly assumed that having 2 generated sequences would act as equivalent to:

SELECT sa, sb FROM ROWS FROM(generate_series(1, 3)) AS sa, ROWS FROM(generate_series(5, 7)) AS sb ORDER BY sa, sb;

 sa | sb
----+----
  1 |  5
  1 |  6
  1 |  7
  2 |  5
  2 |  6
  2 |  7
  3 |  5
  3 |  6
  3 |  7

Obviously I was wrong - but to me, my implicit assumption makes more sense! [...] Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changed SRF in targetlist handling
Hi, On 2016-08-22 16:20:58 -0400, Tom Lane wrote: > Andres Freund writes: > > On 2016-08-17 17:41:28 -0700, Andres Freund wrote: > >> Tom, do you think this is roughly going in the right direction? > > I've not had time to look at this patch, I'm afraid. If you still > want me to, I can make time in a day or so. That'd be greatly appreciated. I think polishing the POC up to a committable patch will be a considerable amount of work, and I'd like design feedback before that. > > I'm working on these. Atm ExecMakeTableFunctionResult() resides in > > execQual.c - I'm inlining it into nodeFunctionscan.c now, because > > there's no other callers, and having it separate seems to bring no > > benefit. > > > Please speak up soon if you disagree. > > I think ExecMakeTableFunctionResult was placed in execQual.c because > it seemed to belong there alongside the support for SRFs in tlists. > If that's going away then there's no good reason not to move the logic > to where it's used. Cool, then we agree. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)
On Mon, Aug 1, 2016 at 3:18 PM, Peter Geoghegan wrote: > Attached WIP patch series: This has bitrot, since commit da1c9163 changed the interface for checking parallel safety. I'll have to fix that, and will probably take the opportunity to change how workers have maintenance_work_mem apportioned while I'm at it. To recap, it would probably be better if maintenance_work_mem remained a high watermark for the entire CREATE INDEX, rather than applying as a per-worker allowance. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
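The apportioning change Peter describes can be stated concretely. This is only an illustrative sketch (the actual patch divides memory in C inside tuplesort; the function name here is hypothetical): treating maintenance_work_mem as a high watermark means the per-worker budgets sum to the limit, instead of each worker receiving the full allowance.

```python
def per_worker_budget_kb(maintenance_work_mem_kb, nworkers):
    # High-watermark interpretation: the whole CREATE INDEX, summed
    # across all workers, stays within maintenance_work_mem.
    return maintenance_work_mem_kb // nworkers

# A per-worker-allowance interpretation would instead let total usage
# reach nworkers * maintenance_work_mem_kb.
```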
Re: [HACKERS] Logical decoding of sequence advances, part II
On Mon, Aug 22, 2016 at 3:29 PM, Robert Haas wrote: > it seems to me that > this is just one facet of a much more general problem: given two > transactions T1 and T2, the order of replay must match the order of > commit unless you can prove that there are no dependencies between > them. I don't see why it matters whether the operations are sequence > operations or data operations; it's just a question of whether they're > modifying the same "stuff". The commit order is the simplest and safest *unless* there is a read-write anti-dependency a/k/a read-write dependency a/k/a rw-conflict: where a read from one transaction sees the "before" version of data modified by the other transaction. In such a case it is necessary for correct serializable transaction behavior for the transaction that read the "before" image to be replayed before the write it didn't see, regardless of commit order. If you're not trying to avoid serialization anomalies, it is less clear to me what is best. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
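The constraint Kevin describes can be sketched as a toy model (hypothetical names, nothing from the actual replication code): replay in commit order, except that a transaction which read the "before" image must be replayed ahead of the writer whose change it did not see.

```python
def replay_order(commit_order, rw_conflicts):
    # commit_order: transaction ids in commit order.
    # rw_conflicts: (reader, writer) pairs where the reader saw the
    # "before" image of data the writer modified; the reader must then
    # replay before the writer, overriding commit order.
    # Assumes the conflict graph is acyclic (a cycle would be a
    # serialization anomaly that serializable mode aborts anyway).
    order = list(commit_order)
    changed = True
    while changed:
        changed = False
        for reader, writer in rw_conflicts:
            ri, wi = order.index(reader), order.index(writer)
            if ri > wi:
                order.pop(ri)
                order.insert(wi, reader)
                changed = True
    return order
```

With no rw-conflicts this degenerates to plain commit order, which matches Kevin's point that commit order is the simplest and safest default.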
[HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)
Hi, as noted in [1] I started hacking on removing the current implementation of SRFs in the targetlist (tSRFs henceforth). IM discussion brought the need for a description of the problem, need and approach to light. There are several reasons for wanting to get rid of tSRFs. The primary ones in my opinion are that the current behaviour of several SRFs in one targetlist is confusing, and that the implementation burden currently is all over the executor. Especially the latter is what is motivating me to work on this, because it blocks my work on making the executor faster for queries involving significant amounts of tuples. Batching is hard if random places in the querytree can increase the number of tuples. The basic idea, hinted at in several threads, is, at plan time, to convert a query like

SELECT generate_series(1, 10);

into

SELECT generate_series FROM ROWS FROM(generate_series(1, 10));

thereby avoiding the complications in the executor (c.f. execQual.c handling of isDone/ExprMultipleResult and supporting code in many executor nodes / node->*.ps.ps_TupFromTlist). There are several design questions along the way:

1) How to deal with the least-common-multiple behaviour of tSRFs. E.g.

=# SELECT generate_series(1, 3), generate_series(1,2);

returning

┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │               1 │
│               1 │               2 │
│               2 │               1 │
│               3 │               2 │
└─────────────────┴─────────────────┘
(6 rows)

but

=# SELECT generate_series(1, 3), generate_series(5,7);

returning

┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               5 │
│               2 │               6 │
│               3 │               7 │
└─────────────────┴─────────────────┘

discussion in this thread came, according to my reading, to the conclusion that that behaviour is just confusing and that the ROWS FROM behaviour of

=# SELECT * FROM ROWS FROM(generate_series(1, 3), generate_series(1,2));

┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │          (null) │
└─────────────────┴─────────────────┘
(3 rows)

makes more sense. 
We also discussed erroring out if two SRFs return differing numbers of rows, but that does not seem to be preferred so far, and we can easily add it if we want.

2) A naive conversion to ROWS FROM, like in the example in the introductory paragraph, can change the output when implemented as a join from ROWS FROM to the rest of the query, rather than the other way round. E.g.

=# EXPLAIN SELECT * FROM few, ROWS FROM(generate_series(1,10));
┌──────────────────────────────────────────────────────────────────────────────┐
│                                  QUERY PLAN                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=0.00..36.03 rows=2000 width=8)                            │
│   ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=4) │
│   ->  Materialize  (cost=0.00..1.03 rows=2 width=4)                          │
│         ->  Seq Scan on few  (cost=0.00..1.02 rows=2 width=4)                │
└──────────────────────────────────────────────────────────────────────────────┘
(4 rows)

=# SELECT * FROM few, ROWS FROM(generate_series(1,3));
┌────┬─────────────────┐
│ id │ generate_series │
├────┼─────────────────┤
│  1 │               1 │
│  2 │               1 │
│  1 │               2 │
│  2 │               2 │
│  1 │               3 │
│  2 │               3 │
└────┴─────────────────┘
(6 rows)

surely isn't what was intended. So the join order needs to be enforced.

3) tSRFs are evaluated after GROUP BY and window functions:

=# SELECT generate_series(1, count(*)) FROM (VALUES(1),(2),(10)) f;
┌─────────────────┐
│ generate_series │
├─────────────────┤
│               1 │
│               2 │
│               3 │
└─────────────────┘

which means we have to push the "original" query into a subquery, with the ROWS FROM laterally referencing the subquery:

SELECT generate_series FROM (SELECT count(*) FROM (VALUES(1),(2),(10)) f) s, ROWS FROM (generate_series(1,s.count));

4) The evaluation order of tSRFs in combination with ORDER BY is a bit confusing. Namely, tSRFs are evaluated after ORDER BY has been evaluated, unless the ORDER BY references the SRF. E.g.

=# SELECT few.id, generate_series FROM ROWS FROM(generate_series(1,3)),few ORDER BY few.id DESC;

might return

┌────┬─────────────────┐
│ id │ generate_series │
├────┼─────────────────┤
│ 24 │               3 │
│ 24 │               2 │
│ 24 │               1 │
..

instead of ┌┬
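The two behaviours contrasted in point 1 can be modelled in a few lines of Python (a hypothetical sketch, not PostgreSQL code): historical tSRF evaluation runs to the least common multiple of the SRF output lengths, cycling each column independently, while ROWS FROM runs the functions in lockstep and NULL-pads the shorter ones.

```python
from itertools import zip_longest
from math import lcm

def tsrf_rows(*cols):
    # Historical targetlist-SRF behaviour: produce lcm(len(c1), len(c2), ...)
    # rows, cycling each column independently.
    n = lcm(*(len(c) for c in cols))
    return [tuple(c[i % len(c)] for c in cols) for i in range(n)]

def rows_from(*cols):
    # ROWS FROM(...) behaviour: advance all functions in lockstep and
    # pad exhausted ones with NULL (None here).
    return list(zip_longest(*cols))

# Matching the tables in the mail: equal lengths give 3 rows either
# way, but lengths 3 and 2 give 6 rows under tSRF rules and only 3
# (with a NULL) under ROWS FROM.
```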
Re: [HACKERS] pg_bsd_indent - improvements around offsetof and sizeof
Bruce Momjian wrote: > On Tue, Aug 16, 2016 at 11:47:09AM -0700, Andres Freund wrote: > > On 2016-08-15 18:09:02 +, Piotr Stefaniak wrote: > > > There are more fixes I intend to do, of which the most relevant for > > > Postgres are: > > > 1) fixing "function pointer typedef formatting" > > > > This alone would warrant a bottle of something rather expensive. > > Agreed. I was kind of hoping we could use this for the pgindent run we > just did, but that is being done just before 9.6 final, which seems too > close. I suggest we run it once everything is ready, and run it on all > back-branches so we can backpatch things. The ideal time would probably > be right after we have done minor releases. The problem is that this is > going to break queued-up patches, so maybe it has to be done right > before 10.0 beta, and again, to all back branches too. I think it doesn't really matter -- surely we don't want to do it just before some important release, but other than that I don't think there are any constraints. The amount of pain for large patch maintainers is unrelated to the timing. (I sketched a way to mechanically rebase patches across a pgindent run; I haven't had the chance to try it.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)
On 23/08/16 08:27, Tom Lane wrote: Andres Freund writes: On 2016-08-22 13:54:43 -0400, Robert Haas wrote: On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane wrote: I'm inclined to suggest you forget this approach and propose a single counter for "SQL commands executed", which avoids all of the above definitional problems. People who need more detail than that are probably best advised to look to contrib/pg_stat_statements, anyway. I disagree. I think SQL commands executed, lumping absolutely everything together, really isn't much use. I'm inclined to agree. I think that's a quite useful stat when looking at an installation one previously didn't have a lot of interaction with. Well, let's at least have an "other" category so you can add up the counters and get a meaningful total. regards, tom lane Initially I thought of something I intended to be facetious, but then realized it might actually be both practicable & useful... How about 2 extra categories (if appropriate!!!): 1. Things that might fit into 2 or more of the existing categories, or where there is ambiguity as to which category is appropriate. 2. Things that don't fit into any existing category. This was inspired by a real use case, in a totally unrelated area - but where I attempted to ensure counts were in the most useful categories I was able to provide. The user had listed categories, but I found that the data didn't always fit neatly into them as specified. Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sslmode=require fallback
On Fri, Aug 19, 2016 at 09:22:32AM -0700, Jeff Janes wrote: > On Sat, Jul 30, 2016 at 11:18 AM, Bruce Momjian wrote: > > On Fri, Jul 29, 2016 at 11:27:06AM -0400, Peter Eisentraut wrote: > > On 7/29/16 11:13 AM, Bruce Momjian wrote: > > > Yes, I am thinking of a case where Postgres is down but a malevolent > > > user starts a Postgres server on 5432 to gather passwords. Verifying > > > against an SSL certificate would avoid this problem, so there is some > > > value in using SSL on localhost. (There is no such security available > > > for Unix-domain socket connections.) > > > > Sure, there is the requirepeer connection option for that. > > Oh, nice, I had not seen that. > > > > Hi Bruce, > > There is a typo in the doc patch you just committed > > "On way to prevent spoofing of" > > s/On/One/ Oops, thanks, fixed. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_bsd_indent - improvements around offsetof and sizeof
On Tue, Aug 16, 2016 at 11:47:09AM -0700, Andres Freund wrote: > On 2016-08-15 18:09:02 +, Piotr Stefaniak wrote: > > There are more fixes I intend to do, of which the most relevant for > > Postgres are: > > 1) fixing "function pointer typedef formatting" > > This alone would warrant a bottle of something rather expensive. Agreed. I was kind of hoping we could use this for the pgindent run we just did, but that is being done just before 9.6 final, which seems too close. I suggest we run it once everything is ready, and run it on all back-branches so we can backpatch things. The ideal time would probably be right after we have done minor releases. The problem is that this is going to break queued-up patches, so maybe it has to be done right before 10.0 beta, and again, to all back branches too. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
On 2016-08-22 17:50:11 -0300, Alvaro Herrera wrote: > > 2. You can't write to unlogged tables on standby servers, so this > > doesn't help solve the problem of wanting to use temporary tables on > > standbys. > > Check. We could think about relaxing this restriction, which would > enable the feature to satisfy that use case. (I think the main > complication there is the init fork of btrees on those catalogs; other > relations could just be truncated to empty on restart.) Isn't the main complication that visibility currently requires xids to be assigned? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add -c to rsync commands on SR tutorial wiki page
On Thu, Aug 18, 2016 at 2:27 PM, Jim Nasby wrote: > I don't think it's any great leap for someone to think they can use those > commands incrementally. It's certainly one of the first things you think of > when using rsync. AFAIK there's no downside at all to using -c when it is a > brand new copy, so I'm thinking we should just put it in there, especially > considering what the potential downside is. I think that's right. The suggestion that people might use some backup tool is a good one, but that's not a reason not to add -c here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Robert Haas wrote: > However: > > 1. The number of tables for which we would need to add a duplicate, > unlogged table is formidable. You need pg_attribute, pg_attrdef, > pg_constraint, pg_description, pg_type, pg_trigger, pg_rewrite, etc. > And the backend changes needed so that we used the unlogged copy for > temp tables and the permanent copy for regular tables is probably > really large. Check. This is the most serious issue, IMV. > 2. You can't write to unlogged tables on standby servers, so this > doesn't help solve the problem of wanting to use temporary tables on > standbys. Check. We could think about relaxing this restriction, which would enable the feature to satisfy that use case. (I think the main complication there is the init fork of btrees on those catalogs; other relations could just be truncated to empty on restart.) > 3. While it makes creating temporary tables a lighter-weight > operation, because you no longer need to write WAL for the catalog > entries, there's probably still substantially more overhead than just > stuffing them in backend-local RAM. So the performance benefits are > probably fairly modest. You also save catalog bloat ... These benefits may not be tremendous, but I think they may be good enough for many users. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().
On Tue, Aug 16, 2016 at 7:41 PM, Thomas Munro wrote: > I still think it's worth thinking about something along these lines on > Linux only, where holey Swiss tmpfs files can bite you. Otherwise > disabling overcommit on your OS isn't enough to prevent something > which is really a kind of deferred overcommit with a surprising > failure mode (SIGBUS rather than OOM SIGKILL). Yeah, I am inclined to agree. I mean, creating a DSM is fairly heavyweight already, so one extra system call isn't (I hope) a crazy overhead. We could test to see how much it slows things down. But it may be worth paying the cost even if it ends up being kinda expensive. We don't really have any way of knowing whether the caller's request is reasonable relative to the amount of virtual memory available, and converting a possible SIGBUS into an ereport(ERROR, ...) is a big win. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
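The idea Thomas and Robert discuss — paying one extra system call at segment creation so that a lack of backing storage surfaces as an ordinary error, instead of SIGBUS on first touch of a sparse page — can be sketched like this (Python on Linux purely for illustration; the real DSM code is C, and the function name here is hypothetical):

```python
import os
import tempfile

def create_backed_file(size):
    # Create an anonymous temp file and reserve its blocks up front.
    # posix_fallocate() allocates real storage, so an out-of-space
    # condition is reported here as OSError (ENOSPC) rather than as
    # SIGBUS later, when a mapped-but-holey page is first written.
    fd, path = tempfile.mkstemp()
    os.unlink(path)  # anonymous: name gone, the fd keeps it alive
    try:
        os.posix_fallocate(fd, 0, size)
    except OSError:
        os.close(fd)
        raise
    return fd
```

The same pattern applied to a tmpfs-backed DSM segment is what turns the "deferred overcommit with a surprising failure mode" into an error the caller can handle, at the cost of one system call per segment.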
Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
On Tue, Aug 16, 2016 at 8:03 PM, Jim Nasby wrote: > On 8/16/16 11:59 AM, Robert Haas wrote: > ... >> >> That doesn't really solve the problem, because OTHER backends won't be >> able to see them. So, if I create a fast temporary table in one >> session that depends on a permanent object, some other session can >> drop the permanent object. If there were REAL catalog entries, that >> wouldn't work, because the other session would see the dependency. > > Some discussion about TEMP functions is happening on -general right now, and > there's other things where temp objects are good to have, so it'd be nice to > have a more generic fix for this stuff. Is the idea of "partitioning" the > catalogs to store temp objects separate from permanent fatally flawed? I wouldn't say it's fatally flawed. But you might need a world-renowned team of physicians working round the clock for days in a class 1 trauma center to save it. If you imagine that you have a permanent pg_class which holds permanent tables and a temporary pg_class per-backend which stores temporary tables, then you very quickly end up with the same deadly flaw as in Aleksander's design: other backends cannot see all of the dependency entries and can drop things that they shouldn't be permitted to drop. However, you could have a permanent pg_class which holds the records for permanent tables and an *unlogged* table, say pg_class_unlogged, which holds records for temporary tables. Now everybody can see everybody else's data, yet we don't have to create permanent catalog entries. So we are not dead. All of the temporary catalog tables vanish on a crash, too, and in a very clean way, which is great. However: 1. The number of tables for which we would need to add a duplicate, unlogged table is formidable. You need pg_attribute, pg_attrdef, pg_constraint, pg_description, pg_type, pg_trigger, pg_rewrite, etc. 
And the backend changes needed so that we used the unlogged copy for temp tables and the permanent copy for regular tables is probably really large. 2. You can't write to unlogged tables on standby servers, so this doesn't help solve the problem of wanting to use temporary tables on standbys. 3. While it makes creating temporary tables a lighter-weight operation, because you no longer need to write WAL for the catalog entries, there's probably still substantially more overhead than just stuffing them in backend-local RAM. So the performance benefits are probably fairly modest. Overall I feel like the development effort that it would take to make this work would almost certainly be better-expended elsewhere. But of course I'm not in charge of how people who work for other companies spend their time... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical decoding of sequence advances, part II
On 2016-08-22 16:29:12 -0400, Robert Haas wrote: > So, I wish I could give you some better advice on this topic, but > sadly I am not an expert in this area. However, it seems to me that > this is just one facet of a much more general problem: given two > transactions T1 and T2, the order of replay must match the order of > commit unless you can prove that there are no dependencies between > them. I don't see why it matters whether the operations are sequence > operations or data operations; it's just a question of whether they're > modifying the same "stuff". > > Of course, it's possible I'm missing something important here... Maybe that logical decoding normally outputs changes in commit order? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical decoding of sequence advances, part II
On Sun, Aug 21, 2016 at 11:13 PM, Craig Ringer wrote: > If the sequence is created in the current xact (i.e. uncommitted) we have to > add the sequence updates to that xact to be replayed only if it commits. The > sequence is visible only to the toplevel xact that created the sequence so > advances of it can only come from that xact and its children. The actual > CREATE SEQUENCE is presumed to be handled separately by an event trigger or > similar. > > If the new sequence is committed we must replay sequence advances > immediately and non-transactionally to ensure they're not lost due to xact > rollback or replayed in the wrong order due to xact commit order. So, I wish I could give you some better advice on this topic, but sadly I am not an expert in this area. However, it seems to me that this is just one facet of a much more general problem: given two transactions T1 and T2, the order of replay must match the order of commit unless you can prove that there are no dependencies between them. I don't see why it matters whether the operations are sequence operations or data operations; it's just a question of whether they're modifying the same "stuff". Of course, it's possible I'm missing something important here... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)
Andres Freund writes: > On 2016-08-22 13:54:43 -0400, Robert Haas wrote: >> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane wrote: >>> I'm inclined to suggest you forget this approach and propose a single >>> counter for "SQL commands executed", which avoids all of the above >>> definitional problems. People who need more detail than that are >>> probably best advised to look to contrib/pg_stat_statements, anyway. >> I disagree. I think SQL commands executed, lumping absolutely >> everything together, really isn't much use. > I'm inclined to agree. I think that's a quite useful stat when looking > at an installation one previously didn't have a lot of interaction with. Well, let's at least have an "other" category so you can add up the counters and get a meaningful total. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] distinct estimate of a hard-coded VALUES list
Robert Haas writes: > On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane wrote: >> I am not exactly sold on this assumption that applications have >> de-duplicated the contents of a VALUES or IN list. They haven't been >> asked to do that in the past, so why do you think they are doing it? > It's hard to know, but my intuition is that most people would > deduplicate. I mean, nobody is going to want their query generator > to send X IN (1, 1, ) to the server if it > could have just sent X IN (1). I dunno, these are the very same people who send "WHERE 1=1" so that they can save one line of code to decide whether to append AND or not before the first real condition. Still, maybe we should optimize for smart queries rather than stupid ones. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
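The client-side de-duplication both Tom and Robert are assuming is cheap for a query generator to do; a minimal sketch (hypothetical helper, psycopg-style %s placeholders assumed):

```python
def in_clause(column, values):
    # De-duplicate before building X IN (...): the number of list
    # entries the planner sees then matches the number of distinct
    # values, and the query string stays small.
    distinct = sorted(set(values))
    placeholders = ", ".join(["%s"] * len(distinct))
    return f"{column} IN ({placeholders})", distinct
```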
Re: [HACKERS] Bug in abbreviated keys abort handling (found with amcheck)
On Mon, Aug 22, 2016 at 12:34 PM, Robert Haas wrote: > Ugh, that sucks. Thanks for the report and patch. Committed and > back-patched to 9.5. Thanks. Within Heroku, there is a lot of enthusiasm for the idea of sharing hard data about the prevalence of problems like this. I hope to be able to share figures in the next few weeks, when I finish working through the backlog. Separately, I would like amcheck to play a role in how we direct users to REINDEX, as issues like this come to light. It would be much more helpful if we didn't have to be so conservative. I hesitate to say that amcheck will detect cases where this bug led to corruption with 100% reliability, but I think that any case that one can imagine in which amcheck fails here is unlikely in the extreme. The same applies to the glibc abbreviated keys issue. I actually didn't find any glibc strxfrm() issues yet, even though any instances of corruption of text indexes I've seen originated before the point release in which strxfrm() became distrusted. I guess that not that many Heroku users use the "C" locale, which would still be affected with the latest point release. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changed SRF in targetlist handling
Andres Freund writes: > On 2016-08-17 17:41:28 -0700, Andres Freund wrote: >> Tom, do you think this is roughly going in the right direction? I've not had time to look at this patch, I'm afraid. If you still want me to, I can make time in a day or so. > I'm working on these. Atm ExecMakeTableFunctionResult() resides in > execQual.c - I'm inlining it into nodeFunctionscan.c now, because > there's no other callers, and having it separate seems to bring no > benefit. > Please speak up soon if you disagree. I think ExecMakeTableFunctionResult was placed in execQual.c because it seemed to belong there alongside the support for SRFs in tlists. If that's going away then there's no good reason not to move the logic to where it's used. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bug in abbreviated keys abort handling (found with amcheck)
On Fri, Aug 19, 2016 at 6:07 PM, Peter Geoghegan wrote: > I found another bug as a result of using amcheck on Heroku customer > databases. This time, the bug is in core Postgres. It's one of mine. > > There was a thinko in tuplesort's abbreviation abort logic, causing > certain SortTuples to be spuriously marked NULL (and so, subsequently > sorted as a NULL tuple, despite not actually changing anything about > the representation of caller tuples). The attached patch fixes this > bug. Ugh, that sucks. Thanks for the report and patch. Committed and back-patched to 9.5. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changed SRF in targetlist handling
Hi, On 2016-05-23 09:26:03 +0800, Craig Ringer wrote: > SRFs-in-tlist are a lot faster for lockstep iteration etc. They're also > much simpler to write, though if the resulting rowcount differs > unexpectedly between the functions you get exciting and unexpected > behaviour. > > WITH ORDINALITY provides what I think is the last of the functionality > needed to replace SRFs-in-from, but at a syntactic complexity and > performance cost. The following example demonstrates that, though it > doesn't do anything that needs LATERAL etc. I'm aware the following aren't > semantically identical if the rowcounts differ. I think here you're just missing ROWS FROM (generate_series(..), generate_series(...)) Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Exporting more function in libpq
Craig Ringer wrote: > On 19 August 2016 at 14:17, Tatsuo Ishii wrote: > > > I would like to propose to export these functions in libpq. > > > > pqPutMsgStart > > pqPutMsgEnd > > pqPutc > > pqPuts > > pqPutInt > > pqPutnchar > > pqFlush > > pqHandleSendFailure > > > > I think this would be useful to create a tool/library which needs to > > handle frontend/backend protocol messages in detail. > > Shouldn't that generally be done by extending libpq to add the required > functionality? The thought that came to me was that maybe we need a separate library that handles the lower level operations (a "fe/be" library, if you will) which can be exported for others to use and is used by libpq to implement the slightly-higher-level functionality. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Changed SRF in targetlist handling
On 2016-08-17 17:41:28 -0700, Andres Freund wrote: > Tom, do you think this is roughly going in the right direction? My plan > here is to develop two patches, to come before this: > > a) Allow to avoid using a tuplestore for SRF_PERCALL SRFs in ROWS FROM - >otherwise our performance would regress noticeably in some cases. > b) Allow ROWS FROM() to return SETOF RECORD type SRFs as one column, >instead of expanded. That's important to be able move SETOF RECORD >returning functions in the targetlist into ROWS FROM, which otherwise >requires an explicit column list. I'm working on these. Atm ExecMakeTableFunctionResult() resides in execQual.c - I'm inlining it into nodeFunctionscan.c now, because there's no other callers, and having it separate seems to bring no benefit. Please speak up soon if you disagree. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Most efficient way for libPQ .. PGresult serialization
Oh I see. I just read more on the use cases of PgBouncer, but it seems like it can't be used for my project. The reason is that I need my middleware to have full control over each transaction. That is, it must be able to decide if it's going to commit or abort a single query (the reason why libpq is used in the middleware), and it must be able to decide when to send back the result. Also it does things like load balancing with its algorithm. So, what the middleware does is (simplified, ignoring other details) 1. listens for queries and does load balancing 2. executes the query on behalf of the client against the server with libpq (does not have to be libpq). 3. serializes the result and sends it back And #3 is why I asked for ways to serialize PGresult (of libpq). The client app will deserialize the result and thus be able to interpret PGresult as if it used libpq itself. Thanks! On Thu, Aug 18, 2016 at 9:05 PM, Craig Ringer wrote: > > On 19 August 2016 at 03:08, Joshua Bay wrote: > >> Thanks, >> But I don't think my question was clear enough. >> >> I already managed the connection pooling, and what I need is to serialize >> the result. >> >> If PGresult was a contiguous block, I could have just created a buffer and >> called memcpy for serialization, but the structure of the result seems much more >> complicated. >> >> So, I was asking if there is an easy way to achieve serialization >> > > Its wire format is a serialization. That's kind of the point. > > I don't understand what you're trying to do here, so it's hard to give a > better answer. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] Most efficient way for libPQ .. PGresult serialization
No, it can be anything else. Please correct me if I'm wrong, but to me, PgPool-II looks like a proxy server that forwards all the data without interpretation. Can I intercept in the middle and control the flow between client and server? E.g., I need to control when the result of a transaction is sent back to the client. On Sat, Aug 20, 2016 at 2:39 AM, Craig Ringer wrote: > On 19 August 2016 at 22:16, Joshua Bay wrote: > >> Oh I see. >> I just read more on the use cases for PgBouncer, but it seems it can't be used >> for my project. >> The reason is that I need my middleware to have full control over >> each transaction. >> That is, it must be able to decide if it's going to commit or abort a >> single query (the reason why libpq is used in the middleware), and it must be >> able to decide when to send back the result. Also it does things like load >> balancing with its algorithm. >> >> So, what the middleware does is (simplified, ignoring other details) >> 1. listens for a query and does load balancing >> 2. executes the query on behalf of the client with libpq (does not have >> to be libpq). >> 3. serializes the result and sends it back >> >> And #3 is why I asked for ways to serialize PGresult (of libpq) >> >> The client app will deserialize the result and thus be able to interpret >> PGresult as if it used libpq itself. >> >> > Surely the app should just use libpq, and your middleware should be a > proxy? > > Like, say, PgPool-II? > > Otherwise you'll have to extract all the results handling parts of libpq > into some smaller cut-down library and graft on > serialization/deserialization code. There's nothing built-in for that, since > the natural and logical serialization for query results is the > PostgreSQL wire protocol. > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] distinct estimate of a hard-coded VALUES list
Robert Haas wrote: > On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane wrote: > > Jeff Janes writes: > >> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane wrote: > >>> It does know it, what it doesn't know is how many duplicates there are. > > > >> Does it know whether the count comes from a parsed query-string list/array, > >> rather than being an estimate from something else? If it came from a join, > >> I can see why it would be dangerous to assume they are mostly distinct. > >> But if someone throws 6000 things into a query string and only 200 distinct > >> values among them, they have no one to blame but themselves when it makes > >> bad choices off of that. > > > > I am not exactly sold on this assumption that applications have > > de-duplicated the contents of a VALUES or IN list. They haven't been > > asked to do that in the past, so why do you think they are doing it? > > It's hard to know, but my intuition is that most people would > deduplicate. I mean, nobody is going to want their query generator > to send X IN (1, 1, ) to the server if it > could have just sent X IN (1). Also, if we patch it this way and somebody has a slow query because of a lot of duplicate values, it's easy to solve the problem by de-duplicating. But with the current code, people who have the opposite problem have no way to work around it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)
On 2016-08-22 13:54:43 -0400, Robert Haas wrote: > On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane wrote: > > I'm inclined to suggest you forget this approach and propose a single > > counter for "SQL commands executed", which avoids all of the above > > definitional problems. People who need more detail than that are > > probably best advised to look to contrib/pg_stat_statements, anyway. > > I disagree. I think SQL commands executed, lumping absolutely > everything together, really isn't much use. I'm inclined to agree. I think that's a quite useful stat when looking at an installation one previously didn't have a lot of interaction with. > Haribabu's categorization > scheme seems to need some work, but the idea of categorizing > statements by type and counting executions per type seems very > reasonable. I'd consider instead using something like COALESCE(commandType, nodeTag(Query->utilityStmt)) as the categories. Not sure if I'd even pivot that. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)
On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane wrote: > I'm inclined to suggest you forget this approach and propose a single > counter for "SQL commands executed", which avoids all of the above > definitional problems. People who need more detail than that are > probably best advised to look to contrib/pg_stat_statements, anyway. I disagree. I think SQL commands executed, lumping absolutely everything together, really isn't much use. Haribabu's categorization scheme seems to need some work, but the idea of categorizing statements by type and counting executions per type seems very reasonable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Showing parallel status in \df+
2016-08-22 18:19 GMT+02:00 Robert Haas: > On Mon, Aug 22, 2016 at 4:49 AM, Pavel Stehule > wrote: > > This feature shows the source code for a PL function when the \df statement is > used. > > I am not too sure if this functionality is necessary - but I don't see > any > > argument against. Sometimes it can be useful, mainly when we work with > > overloaded functions. > > Wait, really? I thought Peter was complaining about the fact that it > *removed* that from the display. > > He also complained about the fact that the subject line of this thread > and what the patch actually does have diverged considerably, which I > think is a fair complaint. > If I understand the purpose of this patch, it is a compromise: the PL source is removed from the table, but it is printed in the result. I am sure there is little benefit in displaying the body of a PL function inside the table. But I see some benefit in Tom's design. We cannot simply show the source code of multiple functions; \sf doesn't support it. The source is displayed at the end, so there is little impact on the result. Regards Pavel > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)
On 2016-08-22 13:49:47 -0400, Robert Haas wrote: > On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund wrote: > > I don't think the runtime overhead is likely to be all that high - if > > you look at valgrind.supp the performance-critical parts basically are: > > - pgstat_send - the context switching is going to drown out some zeroing > > - xlog insertions - making the crc computation more predictable would > > actually be nice > > - reorderbuffer serialization - zeroing won't be a material part of the > > cost > > > > The rest is mostly bootstrap or python related. > > > > There might be cases where we *don't* unconditionally do the zeroing - > > e.g. I'm doubtful about the sinval stuff where we currently only > > conditionally clear - but the stuff in valgrind.supp seems fine. > > Naturally you'll be wanting to conclusively demonstrate this with > benchmarks on multiple workloads, platforms, and concurrency levels. > Right? :-) Pah ;) I do think some micro-benchmarks aiming at the individual costs make sense, we're only talking about ~three places in the code - don't think concurrency plays a large role though ;) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)
On Mon, Aug 22, 2016 at 1:46 PM, Andres Freund wrote: > I don't think the runtime overhead is likely to be all that high - if > you look at valgrind.supp the performance-critical parts basically are: > - pgstat_send - the context switching is going to drown out some zeroing > - xlog insertions - making the crc computation more predictable would > actually be nice > - reorderbuffer serialization - zeroing won't be a material part of the > cost > > The rest is mostly bootstrap or python related. > > There might be cases where we *don't* unconditionally do the zeroing - > e.g. I'm doubtful about the sinval stuff where we currently only > conditionally clear - but the stuff in valgrind.supp seems fine. Naturally you'll be wanting to conclusively demonstrate this with benchmarks on multiple workloads, platforms, and concurrency levels. Right? :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
On 2016-08-22 13:41:57 -0400, Robert Haas wrote: > On Mon, Aug 22, 2016 at 1:32 PM, Heikki Linnakangas wrote: > >> But what about the best case? If we create a scenario where there are > >> no open read-write transactions at all and (somehow) lots and lots of > >> ProcArrayLock contention, how much does this help? > > > > I ran some quick pgbench tests on my laptop, but didn't see any meaningful > > benefit. I think the best I could see is about 5% speedup, when running > > "pgbench -S", with 900 idle connections sitting in the background. On the > > positive side, I didn't see much slowdown either. (Sorry, I didn't record > > the details of those tests, as I was testing many different options and I > > didn't see a clear difference either way.) > > That's not very exciting. I think it's neither exciting nor worrying - the benefit of the pgproc batch clearing itself wasn't apparent on small hardware either... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)
On 2016-08-22 13:16:34 -0400, Robert Haas wrote: > On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane wrote: > > So to me, it seems like the core of this complaint boils down to "my > > sanitizer doesn't understand the valgrind exclusion patterns that have > > been created for Postgres". We can address that to some extent by trying > > to reduce the number of valgrind exclusions we need, but it's unlikely to > > be practical to get that to zero, and it's not very clear that adding > > runtime cycles is a good tradeoff for it either. So maybe we need to push > > back on the assumption that people should expect their sanitizers to > > produce zero warnings without having made some effort to adapt the > > valgrind rules. I don't think the runtime overhead is likely to be all that high - if you look at valgrind.supp the performance-critical parts basically are: - pgstat_send - the context switching is going to drown out some zeroing - xlog insertions - making the crc computation more predictable would actually be nice - reorderbuffer serialization - zeroing won't be a material part of the cost The rest is mostly bootstrap or python related. There might be cases where we *don't* unconditionally do the zeroing - e.g. I'm doubtful about the sinval stuff where we currently only conditionally clear - but the stuff in valgrind.supp seems fine. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
On Mon, Aug 22, 2016 at 1:32 PM, Heikki Linnakangas wrote: >> But what about the best case? If we create a scenario where there are >> no open read-write transactions at all and (somehow) lots and lots of >> ProcArrayLock contention, how much does this help? > > I ran some quick pgbench tests on my laptop, but didn't see any meaningful > benefit. I think the best I could see is about 5% speedup, when running > "pgbench -S", with 900 idle connections sitting in the background. On the > positive side, I didn't see much slowdown either. (Sorry, I didn't record > the details of those tests, as I was testing many different options and I > didn't see a clear difference either way.) That's not very exciting. > It seems that Amit's PGPROC batch clearing patch was very effective. I > remember seeing ProcArrayLock contention very visible earlier, but I can't > hit that now. I suspect you'd still see contention on bigger hardware, > though, my laptop has oly 4 cores. I'll have to find a real server for the > next round of testing. It's good that those patches were effective, and I bet that approach could be further refined, too. However, I think Amit may have mentioned in an internal meeting that he was able to generate some pretty serious ProcArrayLock contention with some of his hash index patches applied. I don't remember the details, though. >> Because there's only a purpose to trying to minimize the losses if >> there are some gains to which we can look forward. > > Aside from the potential performance gains, this slashes a lot of > complicated code: > > 70 files changed, 2429 insertions(+), 6066 deletions(-) > > That removed code is quite mature at this point, and I'm sure we'll add some > code back to this patch as it evolves, but still. That's interesting, but it might just mean we're replacing well-tested code with new, buggy code. By the time you fix all the performance regressions, those numbers could be a lot closer together. 
> Also, I'm looking forward for a follow-up patch, to track snapshots in > backends at a finer level, so that vacuum could remove tuples more > aggressively, if you have pg_dump running for days. CSN snapshots isn't a > strict requirement for that, but it makes it simpler, when you can represent > a snapshot with a small fixed-size integer. That would certainly be nice, but I think we need to be careful not to sacrifice too much trying to get there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
Hi, On 2016-08-22 20:32:42 +0300, Heikki Linnakangas wrote: > I ran some quick pgbench tests on my laptop, but didn't see any meaningful > benefit. I think the best I could see is about 5% speedup, when running > "pgbench -S", with 900 idle connections sitting in the background. On the > positive side, I didn't see much slowdown either. (Sorry, I didn't record > the details of those tests, as I was testing many different options and I > didn't see a clear difference either way.) Hm. Does the picture change if those are idle in transaction, after assigning an xid? > It seems that Amit's PGPROC batch clearing patch was very effective. It usually breaks down if you have a mixed read/write workload - might be worthwhile prototyping that. > I > remember seeing ProcArrayLock contention very visible earlier, but I can't > hit that now. I suspect you'd still see contention on bigger hardware, > though, my laptop has only 4 cores. I'll have to find a real server for the > next round of testing. Yea, I think that's true. I can just about see ProcArrayLock contention on my more powerful laptop, to see it really bad you need bigger hardware / higher concurrency. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: PROPOSAL: make PostgreSQL sanitizers-friendly (and prevent information disclosure)
On Sat, Aug 20, 2016 at 3:46 PM, Tom Lane wrote: > Noah Misch writes: >> On Fri, Aug 19, 2016 at 07:22:02PM -0400, Tom Lane wrote: >>> So maybe we ought to work towards taking those out? > >> Maybe. It's a policy question boiling down to our willingness to spend >> cycles >> zeroing memory in order to limit trust in the confidentiality of storage >> backing the data directory. Think of "INSERT INTO t VALUES(my_encrypt('key', >> 'cleartext'))", subsequent to which bits of the key or cleartext leak to disk >> by way of WAL padding bytes. A reasonable person might expect that not to >> happen; GNU Privacy Guard and a few like-minded programs prevent it. I'm on >> the fence regarding whether PostgreSQL should target this level of vigilance. >> An RDBMS is mainly a tool for managing persistent data, and PostgreSQL will >> never be a superior tool for data that _must not_ persist. Having said that, >> the runtime cost of zeroing memory and the development cost of reviewing the >> patches to do so is fairly low. > > [ after thinking some more about this... ] > > FWIW, I put pretty much zero credence in the proposition that junk left in > padding bytes in WAL or data files represents a meaningful security issue. > An attacker who has access to those files will probably find much more > that is of interest in the non-pad data. My only interest here is in > making the code sanitizer-clean, which seems like it is useful for > debugging purposes independently of any security arguments. > > So to me, it seems like the core of this complaint boils down to "my > sanitizer doesn't understand the valgrind exclusion patterns that have > been created for Postgres". We can address that to some extent by trying > to reduce the number of valgrind exclusions we need, but it's unlikely to > be practical to get that to zero, and it's not very clear that adding > runtime cycles is a good tradeoff for it either. 
So maybe we need to push > back on the assumption that people should expect their sanitizers to > produce zero warnings without having made some effort to adapt the > valgrind rules. One idea is to protect additional memory-clearing operations with #ifdef SANITIZER_CLEAN or something of that sort. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
On 08/22/2016 07:49 PM, Robert Haas wrote: Nice to see you working on this again. On Mon, Aug 22, 2016 at 12:35 PM, Heikki Linnakangas wrote: A sequential scan of a table like that with 10 million rows took about 700 ms on my laptop, when the hint bits are set, without this patch. With this patch, if there's a snapshot holding back the xmin horizon, so that we need to check the CSN log for every XID, it took about 3 ms. So we have some optimization work to do :-). I'm not overly worried about that right now, as I think there's a lot of room for improvement in the SLRU code. But that's the next thing I'm going to work on. So the worst case for this patch is obviously bad right now and, as you say, that means that some optimization work is needed. But what about the best case? If we create a scenario where there are no open read-write transactions at all and (somehow) lots and lots of ProcArrayLock contention, how much does this help? I ran some quick pgbench tests on my laptop, but didn't see any meaningful benefit. I think the best I could see is about 5% speedup, when running "pgbench -S", with 900 idle connections sitting in the background. On the positive side, I didn't see much slowdown either. (Sorry, I didn't record the details of those tests, as I was testing many different options and I didn't see a clear difference either way.) It seems that Amit's PGPROC batch clearing patch was very effective. I remember seeing ProcArrayLock contention very visible earlier, but I can't hit that now. I suspect you'd still see contention on bigger hardware, though, my laptop has only 4 cores. I'll have to find a real server for the next round of testing. Because there's only a purpose to trying to minimize the losses if there are some gains to which we can look forward. 
Aside from the potential performance gains, this slashes a lot of complicated code: 70 files changed, 2429 insertions(+), 6066 deletions(-) That removed code is quite mature at this point, and I'm sure we'll add some code back to this patch as it evolves, but still. Also, I'm looking forward for a follow-up patch, to track snapshots in backends at a finer level, so that vacuum could remove tuples more aggressively, if you have pg_dump running for days. CSN snapshots isn't a strict requirement for that, but it makes it simpler, when you can represent a snapshot with a small fixed-size integer. Yes, seeing some direct performance gains would be nice too. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] distinct estimate of a hard-coded VALUES list
On Sat, Aug 20, 2016 at 4:58 PM, Tom Lane wrote: > Jeff Janes writes: >> On Thu, Aug 18, 2016 at 2:25 PM, Tom Lane wrote: >>> It does know it, what it doesn't know is how many duplicates there are. > >> Does it know whether the count comes from a parsed query-string list/array, >> rather than being an estimate from something else? If it came from a join, >> I can see why it would be dangerous to assume they are mostly distinct. >> But if someone throws 6000 things into a query string and only 200 distinct >> values among them, they have no one to blame but themselves when it makes >> bad choices off of that. > > I am not exactly sold on this assumption that applications have > de-duplicated the contents of a VALUES or IN list. They haven't been > asked to do that in the past, so why do you think they are doing it? It's hard to know, but my intuition is that most people would deduplicate. I mean, nobody is going to want their query generator to send X IN (1, 1, ) to the server if it could have just sent X IN (1). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
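The de-duplication Robert expects from query generators is trivial to do on the client side. A hypothetical illustration (table and values made up):

```sql
-- Instead of sending the raw, duplicate-heavy literal list:
--   SELECT * FROM t WHERE x IN (1, 1, 1, 2, 2, 3);
-- a query generator can send the reduced list, which also gives the
-- planner an accurate distinct count for free:
SELECT * FROM t WHERE x IN (1, 2, 3);
```

The planner question in this thread is what to assume when the list has *not* been reduced — whether 6000 supplied values should be estimated as roughly 6000 distinct values.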
Re: [HACKERS] [PATCH] Transaction traceability - txid_status(bigint)
On Sat, Aug 20, 2016 at 9:46 AM, Craig Ringer wrote: > Ahem. Forgot to squash in a fixup commit. Updated patch of > txid_status(bigint) attached. > > A related patch follows, adding a new txid_current_ifassigned(bigint) > function as suggested by Jim Nasby. It's usefully connected to txid_status() > and might as well be added at the same time. > > Since it builds on the same history I've also attached an updated version of > txid_recent(bigint) now called txid_convert_ifrecent(bigint), per the > above-linked thread. > > Finally, and not intended for commit, is a useful test function I wrote to > cause extremely rapid xid wraparound, bundled up into a src/test/regress > test case. txid_incinerate() can jump the server about UINT32/2 xids in ~2 > seconds if fsync is off, making it handy for testing. Posting so others can > use it for their own test needs later and because it's useful for testing > these patches that touch on the xid epoch. I think you should use underscores to separate all of the words instead of only some of them. Also, note that the corresponding internal function is GetTopTransactionIdIfAny(), which might suggest txid_current_if_any() rather than txid_current_if_assigned(), but you could argue that your naming is better... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
Nice to see you working on this again. On Mon, Aug 22, 2016 at 12:35 PM, Heikki Linnakangas wrote: > A sequential scan of a table like that with 10 million rows took about 700 > ms on my laptop, when the hint bits are set, without this patch. With this > patch, if there's a snapshot holding back the xmin horizon, so that we need > to check the CSN log for every XID, it took about 3 ms. So we have some > optimization work to do :-). I'm not overly worried about that right now, as > I think there's a lot of room for improvement in the SLRU code. But that's > the next thing I'm going to work. So the worst case for this patch is obviously bad right now and, as you say, that means that some optimization work is needed. But what about the best case? If we create a scenario where there are no open read-write transactions at all and (somehow) lots and lots of ProcArrayLock contention, how much does this help? Because there's only a purpose to trying to minimize the losses if there are some gains to which we can look forward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
On 08/22/2016 07:35 PM, Heikki Linnakangas wrote: And here's a new patch version... And here's the attachment I forgot, *sigh*.. - Heikki csn-4.patch.gz Description: application/gzip -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal for CSN based snapshots
And here's a new patch version. Still lots of work to do, especially in performance testing, and minimizing the worst-case performance hit. On 08/09/2016 03:16 PM, Heikki Linnakangas wrote: Next steps: * Hot standby feedback is broken, now that CSN != LSN again. Will have to switch this back to using an "oldest XID", rather than a CSN. * I plan to replace pg_subtrans with a special range of CSNs in the csnlog. Something like, start the CSN counter at 2^32 + 1, and use CSNs < 2^32 to mean "this is a subtransaction, parent is XXX". One less SLRU to maintain. * Put per-proc xmin back into procarray. I removed it, because it's not necessary for snapshots or GetOldestSnapshot() (which replaces GetOldestXmin()) anymore. But on second thoughts, we still need it for deciding when it's safe to truncate the csnlog. * In this patch, HeapTupleSatisfiesVacuum() is rewritten to use an "oldest CSN", instead of "oldest xmin", but that's not strictly necessary. To limit the size of the patch, I might revert those changes for now. I did all of the above. This patch is now much smaller, as I didn't change all the places that used to deal with global-xmin's, like I did earlier. The oldest-xmin is now computed pretty like it always has been. * Rewrite the way RecentGlobalXmin is updated. As Alvaro pointed out in his review comments two years ago, that was quite complicated. And I'm worried that the lazy scheme I had might not allow pruning fast enough. I plan to make it more aggressive, so that whenever the currently oldest transaction finishes, it's responsible for advancing the "global xmin" in shared memory. And the way it does that, is by scanning the csnlog, starting from the current "global xmin", until the next still in-progress XID. That could be a lot, if you have a very long-running transaction that ends, but we'll see how it performs. I ripped out all that, and created a GetRecentGlobalXmin() function that computes a global-xmin value when needed, like GetOldestXmin() does. 
Seems most straightforward. Since we no longer get a RecentGlobalXmin value essentially for free in GetSnapshotData(), as we no longer scan the proc array, it's better to compute the value only when needed. * Performance testing. Clearly this should have a performance benefit, at least under some workloads, to be worthwhile. And not regress. I wrote a little C module to create a "worst-case" table. Every row in the table has a different xmin, and the xmin values are shuffled across the table, to defeat any caching. A sequential scan of a table like that with 10 million rows took about 700 ms on my laptop, when the hint bits are set, without this patch. With this patch, if there's a snapshot holding back the xmin horizon, so that we need to check the CSN log for every XID, it took about 3 ms. So we have some optimization work to do :-). I'm not overly worried about that right now, as I think there's a lot of room for improvement in the SLRU code. But that's the next thing I'm going to work on. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Showing parallel status in \df+
On Mon, Aug 22, 2016 at 4:49 AM, Pavel Stehule wrote: > This feature shows source code for PL function when \df statement was used. > I am not too sure, if this functionality is necessary - but I don't see any > argument against. Sometimes it can be useful, mainly when we work with > overloaded functions. Wait, really? I thought Peter was complaining about the fact that it *removed* that from the display. He also complained about the fact that the subject line of this thread and what the patch actually does have diverged considerably, which I think is a fair complaint. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Typo in comment to function LockHasWaitersRelation() [master branch]
On Mon, Aug 22, 2016 at 9:01 AM, Dmitry Ivanov wrote: >> Hi hackers, >> >> I've found a typo in a comment to function LockHasWaitersRelation() [lmgr.c >> : >> 271, master branch]: >> >> This is a functiion to check > > Attached a patch. Thanks. Committed with a bit of additional wordsmithing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL consistency check facility
On 22 August 2016 at 13:44, Kuntal Ghosh wrote: > Please let me know your thoughts on this. Do the regression tests pass with this option enabled? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LSN as a recovery target
On Mon, Aug 22, 2016 at 8:28 AM, Michael Paquier wrote: > On Mon, Aug 22, 2016 at 7:12 PM, Adrien Nayrat > wrote: >> As Julien said, there is nothing to indicate that the error comes from >> recovery.conf. >> My fear would be that a user encounters an error like this. It will be >> difficult to link it to the recovery.conf. > > Thinking a bit wider than that, we may want to know such context for > normal GUC parameters as well, and that's not the case now. Perhaps > there is actually a reason why that's not done for GUCs, but it seems > that it would be useful there as well. That would give another reason > to move all that under the GUC umbrella. Maybe so, but that's been tried multiple times without success. If you think an error context is useful here, and I bet it is, I'd say just add it and be done with it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL consistency check facility
On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier wrote: > Another pin-point is: given a certain page, how do we identify > which type it is? One possibility would be again to extend the AM > handler with some kind of is_self function with a prototype like that: > bool handler->is_self(Page); > If the page is of the type of the handler, this returns true, and > false otherwise. Still here performance would suck. > > At the end, what we want is a clean interface, and more thoughts into it. I think that it makes sense to filter based on the resource manager ID, but I can't see how we could reasonably filter based on the AM name. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Tracking wait event for latches
On Mon, Aug 22, 2016 at 9:49 AM, Michael Paquier wrote: > The reason why I chose this way is that there are a lot of them. It is > painful to maintain the order of the array elements in perfect mapping > with the list of IDs... You can use stupid macro tricks to help with that problem... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] UTF-8 docs?
On 8/22/16 9:32 AM, Tatsuo Ishii wrote: > I don't know what kind of problem you are seeing with encoding > handling, but at least UTF-8 is working for Japanese, French and > Russian. Those translations are using DocBook XML. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] improved DefElem list processing
Peter Eisentraut wrote: > I'm not happy that utils/acl.h has prototypes for aclchk.c, because > acl.h is included all over the place. Perhaps I should make a > src/include/catalog/aclchk.h to clean that up. I've been bothered by that too in the past. +1 for the cleanup. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] improved DefElem list processing
On Mon, Aug 22, 2016 at 10:41 PM, Pavel Stehule wrote: > 1. This patch introduces location in the DefElem node and injects ParserState > into SQL commands where ParserState was not used. It allows showing the > position of an error. This patch is not small, but almost all changes are > trivial. > > 2. There are no problems with patching, compiling, tests - all tests passed. > > 3. There is not any new functionality, so new tests and new documentation are > not necessary. > > I'll mark this patch as ready for committer. Now that I look at those patches, +1 for both. Particularly the redundant-option checks will remove a lot of boring code. -- Michael
Re: [HACKERS] dsm_unpin_segment
On Mon, Aug 22, 2016 at 6:06 PM, Thomas Munro wrote: > On Tue, Aug 23, 2016 at 12:07 AM, Amit Kapila wrote: >> + int control_slot = -1; >> ... >> + if (control_slot == -1) >> + elog(ERROR, "cannot unpin unknown segment handle"); >> >> Isn't it better to use INVALID_CONTROL_SLOT for control_slot and use >> datatype as uint32 (same is used for dsm_segment->control_slot and >> nitems)? > > Yes, it is better. New version attached. > This version of patch looks good to me. I have marked it as Ready For Committer. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Tracking wait event for latches
On Mon, Aug 22, 2016 at 6:09 PM, Alexander Korotkov wrote: > Hi, Michael! > > On Thu, Aug 4, 2016 at 8:26 AM, Michael Paquier > wrote: > I took a look at your patch. Couple of notes from me. Thanks! >> const char * >> GetEventIdentifier(uint16 eventId) >> { >> const char *res; >> switch (eventId) >> { >> case EVENT_ARCHIVER_MAIN: >> res = "ArchiverMain"; >> break; >> ... long long list of events ... >> case EVENT_WAL_SENDER_WRITE_DATA: >> res = "WalSenderWriteData"; >> break; >> default: >> res = "???"; >> } >> return res; >> } > > > Would it be better to use an array here? The reason why I chose this way is that there are a lot of them. It is painful to maintain the order of the array elements in perfect mapping with the list of IDs... >> typedef enum EventIdentifier >> { > > > EventIdentifier seems too general name for me, isn't it? Could we name it > WaitEventIdentifier? Or WaitEventId for shortcut? OK. So WaitEventIdentifier? The reason to include Identifier is for consistency with lwlock structure notation. -- Michael
Re: [HACKERS] improved DefElem list processing
Hi 2016-08-11 17:32 GMT+02:00 Peter Eisentraut < peter.eisentr...@2ndquadrant.com>: > On 8/5/16 11:25 AM, Peter Eisentraut wrote: > > On 8/4/16 2:21 PM, Tom Lane wrote: > >> Forgot to mention: seems like you should have added a location > >> argument to makeDefElem. > > > > I was hesitating to do that lest it break extensions or something, but I > > guess we break bigger things than that all the time. I'll change it. > > In order not to work on two patches that directly conflict with each > other, I have proceeded with the location patch and postponed the > duplicate checking patch. > > Attached is a biggish patch to review. It adds location information to > all places DefElems are created in the parser and then adds errposition > information in a lot of places, but surely not all of them. That can be > improved over time. > > I'm not happy that utils/acl.h has prototypes for aclchk.c, because > acl.h is included all over the place. Perhaps I should make a > src/include/catalog/aclchk.h to clean that up. > > Here are some example commands to try for getting suitable error messages: > > create collation foo (foo = bar, bar = foo); > copy test from stdin (null 'x', null 'x'); > create function foo (a int, b int) returns int as $$ select a+b $$ > language sql language sql; > create function foo (a int, b int) returns int as $$ select a+b $$ > language sql volatile stable; > create function foo (a int, b int) returns int as $$ select a+b $$ > language sql with (foo = bar); > create sequence foo minvalue 1 minvalue 2; > create type foo (foo = bar); > create user foo createdb nocreatedb; > explain (foo, bar) select 1; > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > I am sending a review of this patch: 1. This patch introduces location in the DefElem node and injects ParserState into SQL commands where ParserState was not used. It allows showing the position of an error.
This patch is not small, but almost all changes are trivial. 2. There are no problems with patching, compiling, tests - all tests passed. 3. There is not any new functionality, so new tests and new documentation are not necessary. I'll mark this patch as ready for committer. Regards Pavel
Re: [HACKERS] WAL consistency check facility
On Mon, Aug 22, 2016 at 9:44 PM, Kuntal Ghosh wrote: > Please let me know your thoughts on this. Since custom AMs have been introduced, I have kept that in a corner of my mind and thought about it a bit. And while the goal of this patch is clearly worth it, I don't think that the page masking interface is clear at all. For example, your patch completely ignores contrib/bloom, and we surely want to do something about it. The idea would be to add a page masking routine in IndexAmRoutine and heap to be able to perform page masking operations directly with that. This would also allow one to perform page masking for bloom or any custom access method, and this will allow this sanity check to be more generic as well. Another pin-point is: given a certain page, how do we identify which type it is? One possibility would be again to extend the AM handler with some kind of is_self function with a prototype like that: bool handler->is_self(Page); If the page is of the type of the handler, this returns true, and false otherwise. Still here performance would suck. At the end, what we want is a clean interface, and more thoughts into it. -- Michael
Re: [HACKERS] UTF-8 docs?
> On 8/22/16 1:16 AM, Tatsuo Ishii wrote: >> Just out of curiosity, I wonder why we can't make the encoding of >> SGML docs to be UTF-8, rather than the current ISO-8859-1. > > Encoding handling in DocBook SGML is weird, and making it work robustly > will either fail or might be more work than just completing the > conversion to XML. I don't know what kind of problem you are seeing with encoding handling, but at least UTF-8 is working for Japanese, French and Russian. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)
On Mon, Aug 22, 2016 at 10:09 PM, Amit Kapila wrote: > Won't a similar problem exist for nonExclusiveBackups? Basically, > in a similar situation, the count for nonExclusiveBackups will be > decremented and if it hits pg_start_backup_callback(), the following > Assertion can fail. > pg_start_backup_callback() > { > .. > Assert(XLogCtl->Insert.nonExclusiveBackups > 0); > .. > } This cannot be hit for non-exclusive backups as pg_start_backup() and pg_stop_backup() need to be launched from the same session: their calls will never overlap across sessions, and this overlap is the reason why exclusive backups are exposed to the problem. -- Michael
Re: [HACKERS] Re: [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)
On Mon, Aug 22, 2016 at 7:13 AM, Michael Paquier wrote: > On Sat, Aug 6, 2016 at 6:35 PM, Andreas Seltenreich > wrote: >> Michael Paquier writes: >> >>> Andreas, with the patch attached is the assertion still triggered? >>> [2. text/x-diff; base-backup-crash-v2.patch] >> >> I didn't observe the crashes since applying this patch. There should >> have been about five by the amount of fuzzing done. > > I have reworked the patch, replacing those two booleans with a single > enum. That's definitely clearer. Also, I have added this patch to the > CF to not lose track of it: > https://commitfest.postgresql.org/10/731/ > Horiguchi-san, I have added you as a reviewer as you provided some > input. I hope you don't mind. > Won't a similar problem exist for nonExclusiveBackups? Basically, in a similar situation, the count for nonExclusiveBackups will be decremented and if it hits pg_start_backup_callback(), the following Assertion can fail. pg_start_backup_callback() { .. Assert(XLogCtl->Insert.nonExclusiveBackups > 0); .. } -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] LSN as a recovery target
On 8/22/16 8:28 AM, Michael Paquier wrote: > Thinking a bit wider than that, we may want to know such context for > normal GUC parameters as well, and that's not the case now. Perhaps > there is actually a reason why that's not done for GUCs, but it seems > that it would be useful there as well. GUC parsing generally needs, or used to need, to work under more constrained circumstances, e.g., no full memory management. That's not a reason not to try this, but there might be non-obvious problems. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Typo in comment to function LockHasWaitersRelation() [master branch]
> Hi hackers, > > I've found a typo in a comment to function LockHasWaitersRelation() [lmgr.c > : > 271, master branch]: > >> This is a functiion to check Attached a patch. -- Dmitry Ivanov Postgres Professional: http://www.postgrespro.com Russian Postgres Company diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c index 7b08555..d0fdb94 100644 --- a/src/backend/storage/lmgr/lmgr.c +++ b/src/backend/storage/lmgr/lmgr.c @@ -268,7 +268,7 @@ UnlockRelation(Relation relation, LOCKMODE lockmode) /* * LockHasWaitersRelation * - * This is a functiion to check if someone else is waiting on a + * This is a function to check if someone else is waiting on a * lock, we are currently holding. */ bool
[HACKERS] Typo in comment to function LockHasWaitersRelation() [master branch]
Hi hackers, I've found a typo in a comment to function LockHasWaitersRelation() [lmgr.c : 271, master branch]: >> This is a functiion to check -- Dmitry Ivanov Postgres Professional: http://www.postgrespro.com Russian Postgres Company
Re: [HACKERS] UTF-8 docs?
On 8/22/16 1:16 AM, Tatsuo Ishii wrote: > Just out of curiosity, I wonder why we can't make the encoding of > SGML docs to be UTF-8, rather than the current ISO-8859-1. Encoding handling in DocBook SGML is weird, and making it work robustly will either fail or might be more work than just completing the conversion to XML. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] WAL consistency check facility
Hi, I've attached a patch to check whether the current page is equal to the FPW after applying WAL on it. This is how the patch works: 1. When a WAL record is inserted, a FPW is done for that operation, but a flag is kept to indicate whether that page needs to be restored. 2. During recovery, when a redo operation is done, we compare the FPW contained in the WAL record with the current page in the buffer. For this purpose, I've used Michael's patch, with minor changes, to check whether two pages are actually equal or not. 3. I've also added a GUC variable (wal_consistency_mask) to indicate the operations (HEAP, BTREE, HASH, GIN etc.) for which this feature (always FPW and consistency check) is to be enabled. How to use the patch: 1. Apply the patch. 2. In the postgresql.conf file, set the wal_consistency_mask variable accordingly. For debug messages, set log_min_messages = debug1. Michael's patch: https://www.postgresql.org/message-id/CAB7nPqR4vxdKijP%2BDu82vOcOnGMvutq-gfqiU2dsH4bsM77hYg%40mail.gmail.com Reference thread: https://www.postgresql.org/message-id/flat/CAB7nPqR4vxdKijP%2BDu82vOcOnGMvutq-gfqiU2dsH4bsM77hYg%40mail.gmail.com#cab7npqr4vxdkijp+du82vocongmvutq-gfqiu2dsh4bsm77...@mail.gmail.com Please let me know your thoughts on this.
-- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f13f9c1..9380079 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -25,6 +25,7 @@ #include "access/commit_ts.h" #include "access/multixact.h" #include "access/rewriteheap.h" +#include "access/rmgr.h" #include "access/subtrans.h" #include "access/timeline.h" #include "access/transam.h" @@ -52,7 +53,9 @@ #include "replication/walreceiver.h" #include "replication/walsender.h" #include "storage/barrier.h" +#include "storage/bufmask.h" #include "storage/bufmgr.h" +#include "storage/bufpage.h" #include "storage/fd.h" #include "storage/ipc.h" #include "storage/large_object.h" @@ -94,6 +97,7 @@ bool EnableHotStandby = false; bool fullPageWrites = true; bool wal_log_hints = false; bool wal_compression = false; +int wal_consistency_mask = 0; bool log_checkpoints = false; int sync_method = DEFAULT_SYNC_METHOD; int wal_level = WAL_LEVEL_MINIMAL; @@ -867,6 +871,9 @@ static void WALInsertLockAcquireExclusive(void); static void WALInsertLockRelease(void); static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt); +void checkWALConsistency(XLogReaderState *xlogreader); +void checkWALConsistencyForBlock(XLogReaderState *record, uint8 block_id); + /* * Insert an XLOG record represented by an already-constructed chain of data * chunks. 
This is a low-level routine; to construct the WAL record header @@ -6868,6 +6875,12 @@ StartupXLOG(void) /* Now apply the WAL record itself */ RmgrTable[record->xl_rmid].rm_redo(xlogreader); +/* + * Check whether the page associated with WAL record is consistent + * with the existing page + */ +checkWALConsistency(xlogreader); + /* Pop the error context stack */ error_context_stack = errcallback.previous; @@ -11626,3 +11639,160 @@ XLogRequestWalReceiverReply(void) { doRequestWalReceiverReply = true; } + +/* + * Check whether the page associated with WAL record is consistent with the + * existing page or not. + */ +void checkWALConsistency(XLogReaderState *xlogreader) +{ + RmgrIds rmid = (RmgrIds) XLogRecGetRmid(xlogreader); + int block_id; + int enableWALConsistencyMask = 1; + RmgrIds rmids[] = {RM_HEAP2_ID,RM_HEAP_ID,RM_BTREE_ID,RM_HASH_ID,RM_GIN_ID,RM_GIST_ID,RM_SEQ_ID,RM_SPGIST_ID,RM_BRIN_ID}; + int size = sizeof(rmids)/sizeof(rmid); + int i; + for (i = 0; i < size; i++) + { + if ((rmid == rmids[i]) && (wal_consistency_mask & enableWALConsistencyMask)) + { + for (block_id = 0; block_id <= xlogreader->max_block_id; block_id++) +checkWALConsistencyForBlock(xlogreader,block_id); + break; + } + /* + * Enable checking for the next bit + */ + enableWALConsistencyMask <<= 1; + } +} +void checkWALConsistencyForBlock(XLogReaderState *record, uint8 block_id) +{ + Buffer buf; + char *ptr; + DecodedBkpBlock *bkpb; + char tmp[BLCKSZ]; + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blkno; + Page page; + + if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno)) + { + /* Caller specified a bogus block_id. Don't do anything.
*/ + return; + } + buf = XLogReadBufferExtended(rnode, forknum, blkno, + RBM_WAL_CHECK); + page = BufferGetPage(buf); + + bkpb = &record->blocks[block_id]; + if(bkpb->bkp_image!=NULL) + ptr = bkpb->bkp_image; + else + { + elog(WARNING, + "No page found in WAL for record %X/%X, rel %u/%u/%u, " + "forknum %u, blkno %u", + (uint32) (record->ReadRecPtr>> 32), (uint32) record->ReadRecPtr , + rnode.spcNode, rnode.dbNode, rnode.relNode, + forknum, blkno); + return; + } + + if (bkpb->bimg_info & BKPIMAGE_IS_COMPRESSED) + { + /* If a backup block image is compressed, decompress it */ + if (pglz_decompress(ptr, bkpb->bimg_len, tmp,
Re: [HACKERS] dsm_unpin_segment
On Tue, Aug 23, 2016 at 12:07 AM, Amit Kapila wrote: > + int control_slot = -1; > ... > + if (control_slot == -1) > + elog(ERROR, "cannot unpin unknown segment handle"); > > Isn't it better to use INVALID_CONTROL_SLOT for control_slot and use > datatype as uint32 (same is used for dsm_segment->control_slot and > nitems)? Yes, it is better. New version attached. > Apart from this, I have verified your patch on Windows using attached > dsm_demo module. Basically, by using dsm_demo_create(), I have pinned > the segment and noticed that Handle count of postmaster is incremented > by 1 and then by using dsm_demo_unpin_segment() unpinned the segment > which decrements the Handle count in Postmaster. Thanks! -- Thomas Munro http://www.enterprisedb.com dsm-unpin-segment-v4.patch Description: Binary data
Re: [HACKERS] LSN as a recovery target
On Mon, Aug 22, 2016 at 7:12 PM, Adrien Nayrat wrote: > As Julien said, there is nothing to notice that the error comes from > recovery.conf. > My fear would be that a user encounters an error like this. It will be > difficult to link it to the recovery.conf. Thinking a bit wider than that, we may want to know such context for normal GUC parameters as well, and that's not the case now. Perhaps there is actually a reason why that's not done for GUCs, but it seems that it would be useful there as well. That would give another reason to move all that under the GUC umbrella. > For the other settings (xid, timeline, name) there is an explicit > message noticing that the error is in recovery.conf. > > I see it is not the case for recovery_target_time. Yes, in this case the parameter value is parsed using a datatype's _in function call, and the error message depends on that. > Should we modify only the documentation or should we try to find a > solution to point to the origin of the error? The patch as proposed is complicated enough I think, and it would be good to keep things simple if we can. So having something in the docs looks fine to me, and that's actually the reference to pg_lsn used to parse the parameter value. -- Michael
Re: [HACKERS] Binary I/O for isn extension
Hello Shay, Attached is a new version of the patch, adding an upgrade script and the rest of it. Note that because, as Fabien noted, there doesn't seem to be a way to add send/receive functions with ALTER TYPE, I did that by updating pg_type directly - hope that's OK. This patch does not apply anymore, because there has been an update in between to mark relevant contrib functions as "parallel". Could you update the patch? -- Fabien.
Re: [HACKERS] dsm_unpin_segment
On Mon, Aug 22, 2016 at 5:24 AM, Thomas Munro wrote: > On Sat, Aug 20, 2016 at 11:37 PM, Amit Kapila wrote: >> On Tue, Aug 9, 2016 at 10:07 AM, Thomas Munro >> wrote: >>> On Tue, Aug 9, 2016 at 12:53 PM, Tom Lane wrote: The larger picture here is that Robert is exhibiting a touching but unfounded faith that extensions using this feature will contain zero bugs. IMO there needs to be some positive defense against mistakes in using the pin/unpin API. As things stand, multiple pin requests don't have any fatal consequences (especially not on non-Windows), so I have little confidence that it's not happening in the field. I have even less confidence that there wouldn't be too many unpin requests. >>> >>> Ok, here is a version that defends against invalid sequences of >>> pin/unpin calls. I had to move dsm_impl_pin_segment into the block >>> protected by DynamicSharedMemoryControlLock, so that it could come >>> after the already-pinned check, but before updating any state, since >>> it makes a Windows syscall that can fail. That said, I've only tested >>> on Unix and will need to ask someone to test on Windows. >>> >> >> Few review comments: > > Thanks for the review! > >> 1. >> + /* Note that 1 means no references (0 means unused slot). */ >> + if (--dsm_control->item[i].refcnt == 1) >> + destroy = true; >> + >> + /* >> + * Allow implementation-specific code to run. We have to do this before >> + * releasing the lock, because impl_private_pm_handle may get modified by >> + * dsm_impl_unpin_segment. >> + */ >> + if (control_slot >= 0) >> + dsm_impl_unpin_segment(handle, >> + &dsm_control->item[control_slot].impl_private_pm_handle); >> >> If there is an error in dsm_impl_unpin_segment(), then we don't need >> to decrement the reference count. Isn't it better to do it after the >> dsm_impl_unpin_segment() is successful. Similarly, I think pinned >> should be set to false after dsm_impl_unpin_segment(). > > Hmm. Yeah, OK. 
Things are in pretty bad shape if you fail to unpin > despite having run the earlier checks, but you're right, it's better > that way. New version attached. > + int control_slot = -1; ... + if (control_slot == -1) + elog(ERROR, "cannot unpin unknown segment handle"); Isn't it better to use INVALID_CONTROL_SLOT for control_slot and use datatype as uint32 (same is used for dsm_segment->control_slot and nitems)? Apart from this, I have verified your patch on Windows using attached dsm_demo module. Basically, by using dsm_demo_create(), I have pinned the segment and noticed that Handle count of postmaster is incremented by 1 and then by using dsm_demo_unpin_segment() unpinned the segment which decrements the Handle count in Postmaster. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com dsm_demo_v1.patch Description: Binary data
Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)
On Mon, Aug 22, 2016 at 8:45 PM, Tom Lane wrote: > Likely it would be better to refactor all of this so we get just one > reference to libpq and one reference to libpgport, but that'd require > a more thoroughgoing cleanup than you have here. (Now that I think > about it, adding -L/-l to LDFLAGS is pretty duff coding style to > begin with --- we should be adding those things to LIBS, or at least > putting them just before LIBS in the command lines.) Agreed, this needs more thought. As that's messy, I won't hijack this thread further and will begin a new one with a more fully-formed patch once I get time to look at it. -- Michael
Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)
Michael Paquier writes: > On Sun, Aug 21, 2016 at 3:02 AM, Tom Lane wrote: >> I looked into this and soon found that fe_utils/string_utils.o has >> dependencies on libpq that are much wider than just pqexpbuffer :-(. > pqexpbuffer.c is an independent piece of facility, so we could move it > to src/common and leverage the dependency a bit, and have libpq link > to the source file itself at build phase. The real problem is the call > to PQmblen in psqlscan.l... And this, I am not sure how this could be > refactored cleanly. I see all of these as libpq dependencies in string_utils.o: U PQclientEncoding U PQescapeStringConn U PQmblen U PQserverVersion Maybe we could split that file into two parts (libpq-dependent and not) but it would be pretty arbitrary. > And actually, I had a look at the build failure that you reported in > 3855.1471713...@sss.pgh.pa.us. While that was because of a copy of > libpq.so that I had in my own LD_LIBRARY_PATH, shouldn't all the other > frontend utilities depending on fe_utils also use $(libpq_pgport) > instead of -lpq? All the rest of them already have that, because their link commands look like, eg for psql, LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq psql: $(OBJS) | submake-libpq submake-libpgport submake-libpgfeutils $(CC) $(CFLAGS) $(OBJS) $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X) ^^ The extra reference to -lpq just makes sure that references to libpq from libpgfeutils get resolved properly. (And yeah, there are some platforms that need that :-(.) We don't need an extra copy of the -L flag. This is all pretty messy, not least because of the way libpq_pgport is set up; as Makefile.global notes, # ... This does cause duplicate -lpgport's to appear # on client link lines. Likely it would be better to refactor all of this so we get just one reference to libpq and one reference to libpgport, but that'd require a more thoroughgoing cleanup than you have here. 
(Now that I think about it, adding -L/-l to LDFLAGS is pretty duff coding style to begin with --- we should be adding those things to LIBS, or at least putting them just before LIBS in the command lines.) You're right that I missed the desirability of invoking submake-libpq and submake-libpgfeutils in initdb's build. Will fix that. regards, tom lane