Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
On Thu, Feb 2, 2017 at 6:14 PM, Amit Kapila wrote: > > /* > + * If the index list was invalidated, we better also invalidate the index > + * attribute list (which should automatically invalidate other attributes > + * such as primary key and replica identity) > + */ > > + relation->rd_indexattr = NULL; > + > > I think setting directly to NULL will leak the memory. > Good catch, thanks. Will fix. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
On Fri, Feb 3, 2017 at 12:57 AM, Fabien COELHO wrote: > > Hello Corey, > > Glad you like the barking. I'm happy to let the prompt issue rest for now, >> we can always add it later. >> >> If we DID want it, however, I don't think it'll be hard to add a special >> prompt (Thinking %T or %Y because they both look like branches, heh), >> > > Ah!:-) T may stand for Tree, but Y looks a little bit more like branches. > Maybe Y for Yew. > Well played. The %Y prompt can be a separate patch. New patch. Highlights: - rebased to master as of ten minutes ago - Interactive barking on branching state changes, commands typed while in inactive state - Help text. New block in help text called "Conditionals" - SendQuery calls in mainloop.c are all encapsulated in send_query() to ensure the same if-active and if-interactive logic is used - Exactly one perl TAP test, testing ON_ERROR_STOP. I predict more will be needed, but I'm not sure what coverage is desired - I also predict that my TAP test style is pathetic - regression tests now have comments to explain purpose diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index ae58708..c0ba4c4 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -2035,6 +2035,67 @@ hello 10 + +\if expr +\elif expr +\else +\endif + + +This group of commands implements nestable conditional blocks, like +this: + + +SELECT +EXISTS(SELECT 1 FROM customer WHERE customer_id = 123) as is_customer, +EXISTS(SELECT 1 FROM employee WHERE employee_id = 456) as is_employee +\gset +\if :is_customer +SELECT * FROM customer WHERE customer_id = 123; +\elif :is_employee +\echo 'is not a customer but is an employee' +SELECT * FROM employee WHERE employee_id = 456; +\else +\if yes +\echo 'not a customer or employee' +\else +\echo 'this should never print' +\endif +\endif + + +Conditional blocks must begin with a \if and end +with an \endif, and the pairs must be found in +the same source file. 
If an EOF is reached on the main file or an +\include-ed file before all local +\if-\endif are matched, then +psql will raise an error. + + +A conditional block can have any number of +\elif clauses, which may optionally be followed by a +single \else clause. + + +The \if and \elif commands +read the rest of the line and evaluate it as a boolean expression. +Currently, expressions are limited to a single unquoted string +which are evaluated like other on/off options, so the valid values +are any unambiguous case insensitive matches for one of: +true, false, 1, +0, on, off, +yes, no. So +t, T, and tR +will all match true. + + +Lines within false branches are not evaluated in any way: queries are +not sent to the server, non-conditional commands are not evaluated but +bluntly ignored, nested if-expressions in such branches are also not +evaluated but are tallied to check for proper nesting. + + + \ir or \include_relative filename diff --git a/src/bin/psql/Makefile b/src/bin/psql/Makefile index c53733f..7a418c6 100644 --- a/src/bin/psql/Makefile +++ b/src/bin/psql/Makefile @@ -61,8 +61,16 @@ uninstall: clean distclean: rm -f psql$(X) $(OBJS) lex.backup + rm -rf tmp_check # files removed here are supposed to be in the distribution tarball, # so do not clean them in the clean/distclean rules maintainer-clean: distclean rm -f sql_help.h sql_help.c psqlscanslash.c + + +check: + $(prove_check) + +installcheck: + $(prove_installcheck) diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index f17f610..dcf567e 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -49,6 +49,7 @@ #include "psqlscanslash.h" #include "settings.h" #include "variables.h" +#include "fe_utils/psqlscan_int.h" /* * Editable database object types. 
@@ -132,7 +133,7 @@ HandleSlashCmds(PsqlScanState scan_state, status = PSQL_CMD_ERROR; } - if (status != PSQL_CMD_ERROR) + if (status != PSQL_CMD_ERROR && pset.active_branch) { /* eat any remaining arguments after a valid command */ /* note we suppress evaluation of backticks here */ @@ -194,6 +195,30 @@ read_connect_arg(PsqlScanState scan_state) return result; } +/* + * Read and interpret argument as a boolean expression. + * Return true if a boolean value was successfully parsed. + */ +static bool +read_boolean_expression(PsqlScanState scan_state, char *action, + bool *result) +{ + char*value = psql_scan_slash_option(scan_st
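The boolean-matching rule the patch documents (any unambiguous case-insensitive prefix of true/false/on/off/yes/no/1/0, so "t", "T" and "tR" all mean true) can be sketched like this — an illustrative Python model of the rule, not the patch's C implementation:

```python
# Keyword table for psql-style on/off option parsing (illustrative).
BOOLEAN_WORDS = {"true": True, "false": False, "on": True, "off": False,
                 "yes": True, "no": False, "1": True, "0": False}

def parse_boolean(value):
    """Return (ok, result); ok is False for empty, unknown or ambiguous input."""
    v = value.strip().lower()
    if not v:
        return (False, None)
    # collect every keyword the input is a prefix of
    hits = {word: b for word, b in BOOLEAN_WORDS.items() if word.startswith(v)}
    # accept only when all matching keywords agree on the resulting value;
    # "o" is rejected because it could be either "on" or "off"
    if hits and len(set(hits.values())) == 1:
        return (True, hits.popitem()[1])
    return (False, None)
```

For example, `parse_boolean("tR")` accepts (prefix of "true"), while `parse_boolean("o")` is rejected as ambiguous.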
Re: [HACKERS] Packages: Again
2017-02-03 7:34 GMT+01:00 Craig Ringer : > On 3 February 2017 at 14:27, Pavel Stehule > wrote: > > > > > > 2017-01-20 17:01 GMT+01:00 Joshua D. Drake : > >> > >> On 01/17/2017 09:26 AM, Robert Haas wrote: > >>> > >>> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan > wrote: > > MERGE isn't UPSERT, and isn't even in competition with UPSERT as a > feature. I've written reams of text explaining why this is so in > precise detail, ... > >> > >> > >> > >> Hello, > >> > >> This is the webinar that started this whole thread (well the original > >> thread, not this weird MERGE/UPSERT stuff): > >> > >> https://www.commandprompt.com/blog/postgresql_for_oracle_people/ > >> > >> Thank you to everyone that responded. You will see in this Webinar that > at > >> least from the Oracle people perspective, PostgreSQL is not an option > unless > >> it has packages. > >> > >> The other item that people bring up a few times is Oracle Forms but as > >> that is actually external (although dependent) on Oracle, I don't see > that > >> as our responsibility. > > > > > > DB2 proposes using schemas instead of packages > > > > https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/ > > > > Now I am working with an Oracle application - and I try to understand > Oracle > > developers - a frequent pattern is using an "Oracle schema" as a database - and > then > > packages make sense. But there is no mapping "Oracle schema" = > > "PostgreSQL schema" - and packages are a redundant concept in Postgres > (and > > in all databases where schemas are namespaces - MSSQL, DB2, MySQL). > > It sounds like we could benefit from a documentation section > "packages" that describes how to get package-like behaviour (minus the > pre-compiled updates) from Pg using schemas and, once added, secure > variables. > It should be documented and presented (who reads documentation? :-)). This is not only a PostgreSQL issue; any other database has the same issue.
The Oracle architecture is very specific, and a frequent question is how to map an Oracle database to PostgreSQL. Common questions: how should schemas be used, where should a schema be used, where should a database be used, and what is the practical limit on the size of the PostgreSQL catalog? > > -- > Craig Ringer http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services >
Re: [HACKERS] proposal: session server side variables
2017-02-03 7:25 GMT+01:00 Fabien COELHO : > > Hello Pavel, > > The @1 area is partially solved by psql session variables or by pgAdmin >> scripting functionality. @2 is partially solved by GUC but without >> the possibility to set access rights. >> >> I didn't find any implementation of XA variables [...] >> > > I did: GUCs in PostgreSQL are an implementation of transactional session > variables. > GUCs were not designed for use in stored procedures. > > As I wrote on the thread, given the "security check" use case, the safe > alternative to transactional session variables is to have nested > transactions. This seems like a far away prospect for pg, but is a reality > for Oracle, DB2 and some others that have untransactional session > variables, so at least it is safe in their case, if not elegant. > Everywhere you have the fence between transactional and untransactional, you can see some unwanted artifacts. The word "secure" just means the possibility to set access rights - nothing more, nothing less. > > My "hard" opinion is that providing an unsafe by default feature (i.e. > which works as in some particular cases, but may fail silently if the > transaction fails), especially for a security related use case which > motivates the whole feature addition, is a very bad idea for the product. > If a committer likes it anyway, good for you. > I respect your opinion, and I understand - I have a similarly strong opinion on packages in Postgres. In this case I prefer a common implementation and common expectations. When a feature is a PostgreSQL original, we can design it as we like. But when we implement a feature that already exists elsewhere, we should respect previous, older major implementations. A great example is our implementation of OUT parameters in PL. The idea is great - modern languages such as Go and Rust use it. But in the PL area it is unique, different. This small feature is one of the significant issues in migrations to, and adoption of, Postgres.
People who work with stored procedures don't expect XA behavior, or its overhead, when working with objects named "variables". We can implement XA support for variables, but I don't think the default should be XA. Only a few of the cases where variables can be used are XA-sensitive. Regards Pavel > > Other opinions I expressed on the thread are somehow "softer", i.e. even > if I think that there are better (simpler, easier) alternatives, these are > only alternatives. > > -- > Fabien. >
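The transactional-versus-untransactional behavior being debated here can be illustrated with a toy model (hypothetical Python, not any proposed implementation): with untransactional variables, a value set inside a transaction survives even if the transaction aborts — the "fail silently" risk Fabien describes for security checks.

```python
from contextlib import contextmanager

class Session:
    """Toy session with named variables, transactional or not."""
    def __init__(self, transactional):
        self.transactional = transactional
        self.vars = {}

    @contextmanager
    def transaction(self):
        snapshot = dict(self.vars)
        try:
            yield self
        except Exception:
            if self.transactional:
                self.vars = snapshot   # transactional: roll variables back
            raise                      # untransactional: changes survive

s = Session(transactional=False)
try:
    with s.transaction():
        s.vars["is_admin"] = True      # e.g. a security-check flag
        raise RuntimeError("transaction aborted")
except RuntimeError:
    pass
# with untransactional variables, the flag survives the aborted transaction
```

With `transactional=True` the snapshot is restored on abort, which is what a security-check flag would need to be safe without nested transactions.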
Re: [HACKERS] Packages: Again
On 3 February 2017 at 14:27, Pavel Stehule wrote: > > > 2017-01-20 17:01 GMT+01:00 Joshua D. Drake : >> >> On 01/17/2017 09:26 AM, Robert Haas wrote: >>> >>> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan wrote: MERGE isn't UPSERT, and isn't even in competition with UPSERT as a feature. I've written reams of text explaining why this is so in precise detail, ... >> >> >> >> Hello, >> >> This is the webinar that started this whole thread (well the original >> thread, not this weird MERGE/UPSERT stuff): >> >> https://www.commandprompt.com/blog/postgresql_for_oracle_people/ >> >> Thank you to everyone that responded. You will see in this Webinar that at >> least from the Oracle people perspective, PostgreSQL is not an option unless >> it has packages. >> >> The other item that people bring up a few times is Oracle Forms but as >> that is actually external (although dependent) on Oracle, I don't see that >> as our responsibility. > > > DB2 propose using schemas instead packages > > https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/ > > Now I am working with Oracle application - and I try to understand to Oracle > developers - often pattern is using "Oracle schema" as database - and then > the packages has sense. But there is not a mapping "Oracle schema" = > "PostgreSQL schema" - and packages is a redundant concept in Postgres (and > in all db, where the schema are like namaspace - MSSQL, DB2, MySQL). It sounds like we could benefit from a documentation section "packages" that describes how to get package-like behaviour (minus the pre-compiled updates) from Pg using schemas and, once added, secure variables. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Packages: Again
2017-01-20 17:01 GMT+01:00 Joshua D. Drake : > On 01/17/2017 09:26 AM, Robert Haas wrote: > >> On Fri, Jan 13, 2017 at 7:24 PM, Peter Geoghegan wrote: >> >>> MERGE isn't UPSERT, and isn't even in competition with UPSERT as a >>> feature. I've written reams of text explaining why this is so in >>> precise detail, ... >>> >> > > Hello, > > This is the webinar that started this whole thread (well the original > thread, not this weird MERGE/UPSERT stuff): > > https://www.commandprompt.com/blog/postgresql_for_oracle_people/ > > Thank you to everyone that responded. You will see in this Webinar that at > least from the Oracle people perspective, PostgreSQL is not an option > unless it has packages. > > The other item that people bring up a few times is Oracle Forms but as > that is actually external (although dependent) on Oracle, I don't see that > as our responsibility. DB2 proposes using schemas instead of packages https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/ Now I am working with an Oracle application - and I try to understand Oracle developers - a frequent pattern is using an "Oracle schema" as a database - and then packages make sense. But there is no mapping "Oracle schema" = "PostgreSQL schema" - and packages are a redundant concept in Postgres (and in all databases where schemas are namespaces - MSSQL, DB2, MySQL). Regards Pavel > > Sincerely, > > JD > > > -- > Command Prompt, Inc. http://the.postgres.company/ > +1-503-667-4564 > PostgreSQL Centered full stack support, consulting and development. > Everyone appreciates your honesty, until you are honest with them. > Unless otherwise stated, opinions are my own. >
Re: [HACKERS] proposal: session server side variables
Hello Pavel, The @1 area is partially solved by psql session variables or by pgAdmin scripting functionality. @2 is partially solved by GUC but without possibility to set a access rights. I didn't found any implementation of XA variables [...] I did: GUCs in PostgreSQL are an implementation of transactional session variables. As I wrote on the thread, given the "security check" use case, the safe alternative to transactional session variables is to have nested transactions. This seems like a far away prospect for pg, but is a reality for Oracle, DB2 and some others that have untransactional session variables, so at least it is safe in their case, if not elegant. My "hard" opinion is that providing an unsafe by default feature (i.e. which works as in some particular cases, but may fail silently if the transaction fails), especially for a security related use case which motivates the whole feature addition, is a very bad idea for the product. If a committer likes it anyway, good for you. Other opinions I expressed on the thread are somehow "softer", i.e. even if I think that there are better (simpler, easier) alternatives, these are only alternatives. -- Fabien.
Re: [HACKERS] proposal: session server side variables
On Fri, Feb 3, 2017 at 2:56 PM, Pavel Stehule wrote: > My patch was marked as "returned with feedback". Personally, I had not a > idea what can be next step and what is preferred design, if some preferred > design exists. I don't know what I have to change on my proposal. Perhaps this was not adapted, sorry about that. Now the latest patch is 2-month old and does not apply. If you think that the approach you are taking is worth it, you can of course submit again a new version and make the discussion move on. Finding a consensus is the difficult part though. -- Michael
Re: [HACKERS] proposal: session server side variables
Here is a link comparing Oracle package variables and DB2 global variables: https://www.ibm.com/developerworks/data/library/techarticle/dm-0711zubiri/ Regards Pavel
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Hello Corey, Glad you like the barking. I'm happy to let the prompt issue rest for now, we can always add it later. If we DID want it, however, I don't think it'll be hard to add a special prompt (Thinking %T or %Y because they both look like branches, heh), Ah!:-) T may stand for Tree, but Y looks a little bit more like branches. Maybe Y for Yew. With a prompt1 of '%T> ' Would then resolve to "ttfi" [...] This is just idle musing, I'm perfectly happy to leave it out entirely. I like it. I would prefer to have it available, but my advice is to follow committer' opinions. At the minimum, there must be some trace, either "barking" or "prompting". -- Fabien.
Re: [HACKERS] proposal: session server side variables
2017-02-01 6:42 GMT+01:00 Pavel Stehule : > > > 2017-02-01 6:05 GMT+01:00 Michael Paquier : > >> On Wed, Jan 11, 2017 at 10:42 PM, Craig Ringer >> wrote: >> > There is no code yet. Code review and testing is where things get >> firmer. >> > >> > My personal stance right now is that I'd like to see catalog-declared >> typed >> > variables. I would prefer them to be transactional and would at least >> oppose >> > anything that didn't allow future room for that capability. I'd prefer >> that >> > non-transactional vars be clearly declared as such. >> > >> > In the end though... I'm not the one implementing it. I can have some >> > influence through the code review process. But it's whoever steps up >> with a >> > proposed implementation that has the biggest say. The rest of us can >> say yes >> > or no to some degree... but nobody can make someone else implement >> something >> > they don't want. >> >> The last patch is from the 6th of December and does not apply anymore: >> https://www.postgresql.org/message-id/CAFj8pRA9w_AujBAYdLR0UVfXwwoxhmn%2BFbNHnD3_NL%3DJ9x3y8w%40mail.gmail.com >> I don't have a better idea than marking this patch as "returned with >> feedback" for now, as the thread has died 3 weeks ago as well. >> > > There is no agreement on the form of session variables. > Today I found on the net documentation for the DB2 "CREATE VARIABLE" command. I had no idea that this statement already existed, although it is an old feature - I found a doc from 2007. The DB2 design is very similar to my proposal - secured access, persistent metadata, unshared untransactional data limited to the session. It doesn't use access functions - access is via the notation schemaname.variablename. I proposed this syntax as the next step in the implementation. The DB2 authors don't propose transactional variables - when the user needs XA behavior, global temporary tables should be used. My patch was marked as "returned with feedback".
Personally, I had no idea what the next step could be or what the preferred design is, if a preferred design exists. I don't know what I should change in my proposal. I understand that there are two different concepts: 1. using variables for ad-hoc scripting like T-SQL or MySQL, or 2. using variables as global session objects for stored procedures. The @1 area is partially solved by psql session variables or by pgAdmin scripting functionality. @2 is partially solved by GUCs, but without the possibility to set access rights. I didn't find any implementation of XA variables or persistent variables in the wild. Regards Pavel > > Regards > > Pavel > > >> -- >> Michael >> > >
Re: [HACKERS] Logical Replication and Character encoding
On 3 Feb. 2017 15:47, "Kyotaro HORIGUCHI" wrote: Hello, At Fri, 3 Feb 2017 09:16:47 +0800, Craig Ringer wrote in > On 2 February 2017 at 11:45, Euler Taveira wrote: > > > I don't think storage without conversion is an acceptable approach. We > > should provide options to users such as ignore tuple or NULL for > > column(s) with conversion problem. I wouldn't consider storage data > > without conversion because it silently show incorrect data and we > > historically aren't flexible with conversion routines. It is possible technically. But changing the behavior of a subscription and/or publication requires a change of SQL syntax. It seems a bit too late for proposing such a new feature.. IMHO unintentional silent data loss must not happen, so the default behavior on conversion failure cannot be other than stopping replication. Agree. Which is why we should default to disallowing mismatched upstream and downstream encodings. At least to start with. > pglogical and BDR both require identical encoding; they test for this > during startup and refuse to replicate if the encoding differs. > > For the first pass at core I suggest a variant on that policy: require > source and destination encoding to be the same. This should probably > be the first change, since it definitively prevents the issue from > arising. If the check is performed by BDR, pglogical itself seems to be allowed to convert strings when both ends have different encodings in a non-BDR environment. Is this right? Hm. Maybe it's changed since I last looked. We started off disallowing mismatched encodings anyway. Note that I'm referring to pglogical, the tool, not "in core logical replication for postgres" > If time permits we could also allow destination encoding to be UTF-8 > (including codepage 65001) with any source encoding. This requires > encoding conversion to be performed, of course. Does this mean that BDR might work in a heterogeneous encoding environment? No.
Here the "we" was meant to be PG core for V10 or later. But anyway some encodings (like SJIS) have characters with the same destination in their mappings, so BDR doesn't seem to work with such conversions. So each encoding should perhaps have a property to indicate its usability in a BDR environment, but. PG doesn't allow SJIS as a db encoding. So it doesn't matter here. > The downside is that this will impact users who use a common subset of > two encodings. This is most common for Windows-1252 <-> ISO-8859-15 > (or -1 if you're old-school) but also arises anywhere the common 7 bit > subset is used. Until we can define an encoding exception policy > though, I think we should defer supporting those and make them a > "later" problem. If the conversion is rejected for now, we should check the encoding identity instead. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
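The policy discussed above — refuse to replicate unless the encodings match, with a possible later relaxation allowing a UTF-8 destination with any source — amounts to a startup check along these lines (an illustrative Python sketch; the function name and encoding labels are assumptions, not pglogical's actual API):

```python
def check_replication_encodings(source, destination,
                                allow_utf8_destination=False):
    """Refuse to replicate between mismatched encodings (illustrative)."""
    if source == destination:
        return
    # possible later relaxation: any source into a UTF-8 destination,
    # at the cost of performing conversion (and handling its failures)
    if allow_utf8_destination and destination == "UTF8":
        return
    raise ValueError(
        f"cannot replicate from {source} to {destination}: encodings differ")
```

Performing the check once at subscription startup definitively prevents per-row conversion failures from ever arising, which is why both pglogical and BDR took this route.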
Re: [HACKERS] Radix tree for character conversion
Thanks to Heikki having pushed the first two patches and a part of the third, only one patch remains now. # Sorry for not separating the KOI8 stuff. At Tue, 31 Jan 2017 19:06:09 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20170131.190609.254672218.horiguchi.kyot...@lab.ntt.co.jp> > > Thanks for the new version, I'll look at it once I am done with the > > cleanup of the current CF. For now I have moved it to the CF 2017-03. > > Agreed. Thank you. Attached is the latest version on the current master (555494d). Note: since this patch is created by git diff --irreversible-delete, three files mb/Unicode/*.(txt|xml) to be deleted are left alone. regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From 68d75100b7e8aaab7706ea780a1e23557c676c87 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Tue, 10 Jan 2017 20:02:00 +0900 Subject: [PATCH] Use radix tree for character conversion This patch adds a multibyte character converter using a radix tree, based on Heikki's rework of my previous patch.
---
 src/backend/utils/mb/Makefile                      |     2 +
 src/backend/utils/mb/Unicode/.gitignore            |    11 +
 src/backend/utils/mb/Unicode/Makefile              |    72 +-
 src/backend/utils/mb/Unicode/UCS_to_BIG5.pl        |     9 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_CN.pl      |     9 +-
 .../utils/mb/Unicode/UCS_to_EUC_JIS_2004.pl        |    19 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_JP.pl      |     6 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_KR.pl      |    13 +-
 src/backend/utils/mb/Unicode/UCS_to_EUC_TW.pl      |     9 +-
 src/backend/utils/mb/Unicode/UCS_to_GB18030.pl     |     9 +-
 src/backend/utils/mb/Unicode/UCS_to_JOHAB.pl       |    11 +-
 .../utils/mb/Unicode/UCS_to_SHIFT_JIS_2004.pl      |    14 +-
 src/backend/utils/mb/Unicode/UCS_to_SJIS.pl        |    29 +-
 src/backend/utils/mb/Unicode/UCS_to_UHC.pl         |    11 +-
 src/backend/utils/mb/Unicode/UCS_to_most.pl        |     5 +-
 src/backend/utils/mb/Unicode/convutils.pm          |   679 +-
 src/backend/utils/mb/Unicode/euc-jis-2004-std.txt  | 11549 ---
 src/backend/utils/mb/Unicode/gb-18030-2000.xml     | 30916 ---
 src/backend/utils/mb/Unicode/make_mapchecker.pl    |    78 +
 src/backend/utils/mb/Unicode/map_checker.c         |    94 +
 .../utils/mb/Unicode/sjis-0213-2004-std.txt        | 11549 ---
 src/backend/utils/mb/char_converter.c              |   116 +
 src/backend/utils/mb/conv.c                        |   137 +-
 .../conversion_procs/utf8_and_big5/utf8_and_big5.c |     8 +-
 .../utf8_and_cyrillic/utf8_and_cyrillic.c          |    16 +-
 .../utf8_and_euc2004/utf8_and_euc2004.c            |     8 +-
 .../utf8_and_euc_cn/utf8_and_euc_cn.c              |     8 +-
 .../utf8_and_euc_jp/utf8_and_euc_jp.c              |     8 +-
 .../utf8_and_euc_kr/utf8_and_euc_kr.c              |     8 +-
 .../utf8_and_euc_tw/utf8_and_euc_tw.c              |     8 +-
 .../utf8_and_gb18030/utf8_and_gb18030.c            |     8 +-
 .../conversion_procs/utf8_and_gbk/utf8_and_gbk.c   |     8 +-
 .../utf8_and_iso8859/utf8_and_iso8859.c            |   127 +-
 .../utf8_and_johab/utf8_and_johab.c                |     8 +-
 .../conversion_procs/utf8_and_sjis/utf8_and_sjis.c |     8 +-
 .../utf8_and_sjis2004/utf8_and_sjis2004.c          |     8 +-
 .../conversion_procs/utf8_and_uhc/utf8_and_uhc.c   |     8 +-
 .../conversion_procs/utf8_and_win/utf8_and_win.c   |    98 +-
 src/include/mb/pg_wchar.h                          |    56 +-
 39 files changed, 1355 insertions(+), 54385 deletions(-)
create mode 100644 src/backend/utils/mb/Unicode/.gitignore delete mode 100644 src/backend/utils/mb/Unicode/euc-jis-2004-std.txt delete mode 100644 src/backend/utils/mb/Unicode/gb-18030-2000.xml create mode 100755 src/backend/utils/mb/Unicode/make_mapchecker.pl create mode 100644 src/backend/utils/mb/Unicode/map_checker.c delete mode 100644 src/backend/utils/mb/Unicode/sjis-0213-2004-std.txt create mode 100644 src/backend/utils/mb/char_converter.c diff --git a/src/backend/utils/mb/Makefile b/src/backend/utils/mb/Makefile index 89bec21..d48e729 100644 --- a/src/backend/utils/mb/Makefile +++ b/src/backend/utils/mb/Makefile @@ -14,6 +14,8 @@ include $(top_builddir)/src/Makefile.global OBJS = encnames.o conv.o mbutils.o wchar.o wstrcmp.o wstrncmp.o +conv.o: conv.c char_converter.c + include $(top_srcdir)/src/backend/common.mk clean distclean maintainer-clean: diff --git a/src/backend/utils/mb/Unicode/.gitignore b/src/backend/utils/mb/Unicode/.gitignore new file mode 100644 index 000..3908cc3 --- /dev/null +++ b/src/backend/utils/mb/Unicode/.gitignore @@ -0,0 +1,11 @@ +# ignore backup files of editors +/*[~#] + +# ignore authority files +/*.TXT +/*.txt +/*.xml + +# ignore generated files +/map_checker +/map_checker.h diff --git a/src/backend/utils/mb/Unicode/Makefile b/src/backend/utils/mb/Unicode/Makefile index 10708b3..6706157 100644 --- a/src/backend/utils/mb/Unicode/Mak
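The core idea of the patch — replacing large flat conversion tables with a radix tree walked byte by byte over the multibyte sequence — can be sketched as follows (an illustrative Python model; the actual implementation is C over flattened arrays, and the sample mapping is the standard EUC-JP encoding of U+3042):

```python
class RadixNode:
    __slots__ = ("children", "value")
    def __init__(self):
        self.children = {}   # next byte -> child node
        self.value = None    # mapped code point, if a sequence ends here

def radix_insert(root, seq, codepoint):
    """Register a multibyte sequence -> code point mapping."""
    node = root
    for b in seq:
        node = node.children.setdefault(b, RadixNode())
    node.value = codepoint

def radix_lookup(root, seq):
    """Walk the tree one byte at a time; None means unmapped."""
    node = root
    for b in seq:
        node = node.children.get(b)
        if node is None:
            return None
    return node.value

# e.g. the two-byte EUC-JP sequence for HIRAGANA LETTER A
root = RadixNode()
radix_insert(root, b"\xa4\xa2", 0x3042)
```

Compared with one huge sorted pair table, the tree shares common lead-byte prefixes and makes the lookup cost proportional to the sequence length rather than to the table size.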
Re: [HACKERS] pageinspect: Hash index support
>> I think it's OK to check hash_bitmap_info() or any other functions >> with different page types at least once. >> >> [1]- >> https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com > > Sure, I just don't know if we need to test them 4 or 5 times. But I > think this is good enough for now - it's not like this code is written > in stone once committed. > > So, committed. Wow, I wish every patch had this many reviewers. > Thanks Robert and Thank you to all the experts for showing their interest towards this project.
Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
At Thu, 2 Feb 2017 15:34:33 +0900, Michael Paquier wrote in > On Thu, Feb 2, 2017 at 1:26 AM, Fujii Masao wrote: > > I'm afraid that many WAL segments would start with a continuation record > > when there are the workload of short transactions (e.g., by pgbench), and > > which would make restart_lsn go behind very much. No? > > I don't quite understand this argument. Even if there are many small > transactions, that would cause restart_lsn to just be late by one > segment, all the time. > > > The discussion on this thread just makes me think that restart_lsn should > > indicate the replay location instead of flush location. This seems safer. > > That would penalize WAL retention on the primary with standbys using > recovery_min_apply_delay and a slot for example... > > We can attempt to address this problem two ways. The patch proposed > (ugly btw and there are two typos!) is doing it in the WAL sender by > not making restart_lsn jump to the next segment if a continuation > record is found. Sorry for the ug..:p Anyway, the previous version was not the latest. The attached one is the revised version. (Sorry, I hadn't found the typos myself..) > Or we could have the standby request for the next > segment instead if the record it wants to replay from is at a boundary > and that it locally has the beginning of the record, and it has it > because it already confirmed to the primary that it flushed to the > next segment. Not sure which fix is better though. We could do it as I said, with some refactoring of ReadRecord involving a reader plugin mechanism.. regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From d835bf248e6869f7b843d339c9213a082e332297 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Wed, 1 Feb 2017 16:07:22 +0900 Subject: [PATCH] Fix a bug of physical replication slot. A physical-replication standby can stop just at the boundary of WAL segments. restart_lsn of the slot on the master can be assumed to be the same location.
In that case, the last segment on the master will be removed after some checkpoints. If the first record of the next segment is a continuation record, its continuation is only on the master and its beginning is only on the standby, so the standby cannot restart because the record to read is scattered across two sources. This patch keeps restart_lsn in the last segment when the first page of the next segment is a continuation record. --- src/backend/replication/walsender.c | 104 +--- 1 file changed, 97 insertions(+), 7 deletions(-) diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 76f09fb..0ec7ba9 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -188,6 +188,13 @@ static volatile sig_atomic_t replication_active = false; static LogicalDecodingContext *logical_decoding_ctx = NULL; static XLogRecPtr logical_startptr = InvalidXLogRecPtr; +/* + * This variable corresponds to restart_lsn in pg_replication_slots for a + * physical slot. This has a valid value only when it differs from the current + * flush pointer. 
+ */ +static XLogRecPtr restartLSN = InvalidXLogRecPtr; + /* Signal handlers */ static void WalSndSigHupHandler(SIGNAL_ARGS); static void WalSndXLogSendHandler(SIGNAL_ARGS); @@ -220,7 +227,7 @@ static void WalSndPrepareWrite(LogicalDecodingContext *ctx, XLogRecPtr lsn, Tran static void WalSndWriteData(LogicalDecodingContext *ctx, XLogRecPtr lsn, TransactionId xid, bool last_write); static XLogRecPtr WalSndWaitForWal(XLogRecPtr loc); -static void XLogRead(char *buf, XLogRecPtr startptr, Size count); +static bool XLogRead(char *buf, XLogRecPtr startptr, Size count, bool noutfoundok); /* Initialize walsender process before entering the main command loop */ @@ -504,6 +511,9 @@ StartReplication(StartReplicationCmd *cmd) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), (errmsg("cannot use a logical replication slot for physical replication"; + + /* Restore restartLSN from replication slot */ + restartLSN = MyReplicationSlot->data.restart_lsn; } /* @@ -519,6 +529,10 @@ StartReplication(StartReplicationCmd *cmd) else FlushPtr = GetFlushRecPtr(); + /* Set InvalidXLogRecPtr if catching up */ + if (restartLSN == FlushPtr) + restartLSN = InvalidXLogRecPtr; + if (cmd->timeline != 0) { XLogRecPtr switchpoint; @@ -727,7 +741,7 @@ logical_read_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int req count = flushptr - targetPagePtr; /* now actually read the data, we know it's there */ - XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ); + XLogRead(cur_page, targetPagePtr, XLOG_BLCKSZ, false); return count; } @@ -1486,7 +1500,7 @@ static void ProcessStandbyReplyMessage(void) { XLogRecPtr writePtr, -flushPtr, +flushPtr, oldFlushPtr, applyPtr; bool replyRequested; @@ -1515,6 +1529,7 @@ ProcessStandbyReplyMessage(void) WalSnd *walsnd = MyW
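The retention rule the patch aims at — do not let restart_lsn jump into the next segment while a record crossing the boundary still needs to be re-read from its start — can be modeled roughly like this (a toy Python sketch with hypothetical names; the real logic lives in walsender.c and works on actual WAL positions):

```python
WAL_SEG_SIZE = 16 * 1024 * 1024  # default 16 MB WAL segments

def effective_restart_lsn(flushed_lsn, boundary_record_start=None):
    """Return the LSN whose segment must still be retained on the primary.

    flushed_lsn: position the standby confirmed it has flushed up to.
    boundary_record_start: start LSN of a record that crosses into the
    segment beginning at flushed_lsn, if any.
    """
    at_boundary = flushed_lsn % WAL_SEG_SIZE == 0
    if (at_boundary and boundary_record_start is not None
            and boundary_record_start < flushed_lsn):
        # the standby still needs the tail of the previous segment to
        # re-read the continuation record from its true beginning
        return boundary_record_start
    return flushed_lsn
```

Holding the reported position back to the record's start keeps the previous segment out of the checkpointer's recycling until the continuation record has been fully replicated.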
Re: [HACKERS] Logical Replication and Character encoding
Hello, At Fri, 3 Feb 2017 09:16:47 +0800, Craig Ringer wrote in > On 2 February 2017 at 11:45, Euler Taveira wrote: > > > I don't think storage without conversion is an acceptable approach. We > > should provide options to users such as ignoring the tuple or substituting NULL for > > column(s) with conversion problems. I wouldn't consider storing data > > without conversion because it silently shows incorrect data and we > > historically aren't flexible with conversion routines. It is technically possible. But changing the behavior of a subscription and/or publication requires a change of SQL syntax. It seems a bit too late to propose such a new feature. IMHO unintentional silent data loss must not happen, so the default behavior on conversion failure cannot be anything other than stopping replication. > pglogical and BDR both require identical encoding; they test for this > during startup and refuse to replicate if the encoding differs. > > For the first pass at core I suggest a variant on that policy: require > source and destination encoding to be the same. This should probably > be the first change, since it definitively prevents the issue from > arising. If the check is performed by BDR, pglogical itself seems to be allowed to convert strings when the two ends have different encodings in a non-BDR environment. Is this right? > If time permits we could also allow destination encoding to be UTF-8 > (including codepage 65001) with any source encoding. This requires > encoding conversion to be performed, of course. Does this mean that BDR might work in a heterogeneous encoding environment? But anyway, some encodings (like SJIS) have characters that map to the same destination, so BDR doesn't seem to work with such conversions. So each encoding should perhaps have a property indicating its usability in a BDR environment, but... On the other hand, no problem is seen with encoding conversions in non-BDR environments. 
(except the behavior on failure) > The downside is that this will impact users who use a common subset of > two encodings. This is most common for Windows-1252 <-> ISO-8859-15 > (or -1 if you're old-school) but also arises anywhere the common 7 bit > subset is used. Until we can define an encoding exception policy > though, I think we should defer supporting those and make them a > "later" problem. If the conversion is rejected for now, we should check the encoding identity instead. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
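To make the failure mode concrete, here is a minimal sketch (plain iconv in C, nothing to do with the server's own conversion machinery; the helper name is made up) of a check that fails for characters the target encoding cannot represent -- exactly the case where replication would have to stop, skip the tuple, or substitute NULL:

```c
#include <iconv.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if 'in' (UTF-8) converts cleanly to
 * the target encoding, false on any unconvertible character.
 */
static bool
converts_cleanly(const char *target_enc, const char *in)
{
    iconv_t cd = iconv_open(target_enc, "UTF-8");
    char outbuf[256];
    char *inp = (char *) in;
    char *outp = outbuf;
    size_t inleft = strlen(in);
    size_t outleft = sizeof(outbuf);
    bool ok;

    if (cd == (iconv_t) -1)
        return false;           /* conversion pair not supported */
    /* iconv returns (size_t) -1 when a character cannot be converted */
    ok = iconv(cd, &inp, &inleft, &outp, &outleft) != (size_t) -1;
    iconv_close(cd);
    return ok;
}
```

Plain ASCII converts to LATIN1 without trouble, while a CJK character such as U+6F22 (UTF-8 bytes E6 BC A2) has no LATIN1 representation and fails, forcing the policy decision discussed above.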
Re: [HACKERS] Declarative partitioning - another take
On 2017/01/31 6:42, Peter Eisentraut wrote: > On 1/25/17 12:54 AM, Ashutosh Bapat wrote: >> The documentation available at >> https://www.postgresql.org/docs/devel/static/sql-createtable.html, >> does not make it clear that the lower bound of a range partition is >> always inclusive and the higher one is exclusive. I think a note in >> section " PARTITION OF parent_table FOR VALUES partition_bound_spec" >> would be helpful. > > Hmm. I see the practical use of that, but I think this is going to be a > source of endless confusion. Can we make that a bit clearer in the > syntax, for example by using additional keywords (INCLUSIVE/EXCLUSIVE)? The decision not to make that configurable with INCLUSIVE/EXCLUSIVE syntax was deliberate. To summarize, we can start with a default configuration catering to most practical cases (that is, inclusive lower and exclusive upper bounds) and document it as such (not done yet; I will post a doc patch today). If it turns out that there is some demand for making that configurable, we can later add the code to handle it internally plus the syntax. But *starting* with that syntax means we would needlessly carry code to handle seldom-used cases that could not be made as efficient as the current arrangement, with all lower bounds inclusive and all upper bounds exclusive. Thanks, Amit [1] https://www.postgresql.org/message-id/CA%2BTgmoZou4ApEvC_nfhOxsi5G4SoD_evwNaiYn60ZcJ4XB_-QQ%40mail.gmail.com
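The convention under discussion fits in one line of code (an illustrative helper, not PostgreSQL source): a row belongs to a range partition when lower <= key < upper.

```c
#include <stdbool.h>

/*
 * Illustrative helper, not PostgreSQL code: a range partition
 * [lower, upper) takes its lower bound inclusively and its upper bound
 * exclusively, which is the default being documented.
 */
static bool
in_range_partition(int key, int lower, int upper)
{
    return key >= lower && key < upper;
}
```

With partitions FOR VALUES FROM (1) TO (10) and FROM (10) TO (20), a key of exactly 10 lands in the second partition and in no other, which is why adjacent partitions never overlap under this convention.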
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
> > > On reflection, it seems fairly improbable to me that people would use > \if and friends interactively. They're certainly useful for scripting, > but would you really type commands that you know are going to be ignored? > I'm thinking the one use-case is where a person is debugging a non-interactive script, cuts and pastes it into an interactive session, and then scrolls through command history to fix the bits that didn't work. So, no, you wouldn't *type* them per se, but you'd want the session to behave as if you had. The if-then barking might really be useful in that context. > > Therefore, I don't think we should stress out about fitting branching > activity into the prompts. That's just not the use-case. (Note: we > might well have to reconsider that if we get looping, but for now it's > not a problem.) Moreover, if someone is confused because they don't > realize they're inside a failed \if, it's unlikely that a subtle change in > the prompt would help them. So your more in-the-face approach of printing > messages seems good to me. > Glad you like the barking. I'm happy to let the prompt issue rest for now, we can always add it later. If we DID want it, however, I don't think it'll be hard to add a special prompt (thinking %T or %Y because they both look like branches, heh), and it could print the if-state stack, maybe something like: \if true \if true \if false \if true With a prompt1 of '%T> ', this would then resolve to ttfi> for true, true, false, ignored. This is just idle musing; I'm perfectly happy to leave it out entirely. > This seems more or less reasonable, although: > > > # \endif > > active \endif, executing commands > > looks a bit weird. Maybe instead say "exited \if, executing commands"? > +1 > > BTW, what is your policy about nesting these things in include files? > My immediate inclination is that if we hit EOF with open \if state, > we should drop it and revert to the state in the surrounding file. > Otherwise things will be way too confusing. 
> That's how it works now if you have ON_ERROR_STOP off, plus an error message telling you about the imbalance. If you have ON_ERROR_STOP on, it's fatal. All \if-\endif pairs must be balanced within a file (well, within a MainLoop, but to the user it looks like a file). Every new file opened via \i or \ir starts a new if-stack. Because commands in an inactive branch are never executed, we don't have to worry about the state of the parent stack when we do a \i, because we know it's trues all the way down. We chose this not so much because if-endif needed it (we could have thrown it into the pset struct just as easily), but because of the issues that might come up with a \while loop: needing to remember previous GOTO points in a file (if the input even *is* a file...) is going to be hard enough, remembering them across files would be harder, and further complicated by the possibility that a file \included on one iteration might not be included on the next (or vice versa)...and like you said, way too confusing.
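The per-file stack behavior described above can be sketched in a few lines (hypothetical structures, not psql's actual code): each MainLoop invocation gets its own condition stack, and hitting EOF simply discards whatever is left on it, leaving the surrounding file's stack untouched.

```c
#include <assert.h>
#include <stdbool.h>

#define IFSTACK_MAX 32

/* Hypothetical per-file conditional stack, one per MainLoop invocation. */
typedef struct IfStack
{
    bool active[IFSTACK_MAX];   /* is this \if branch executing? */
    int  depth;
} IfStack;

static void
ifstack_push(IfStack *st, bool branch_active)   /* \if */
{
    assert(st->depth < IFSTACK_MAX);
    st->active[st->depth++] = branch_active;
}

static void
ifstack_pop(IfStack *st)                        /* \endif */
{
    assert(st->depth > 0);
    st->depth--;
}

/*
 * At EOF, unclosed \if state is discarded; the return value is the number
 * of unbalanced levels, which the caller can report (or treat as fatal
 * under ON_ERROR_STOP).
 */
static int
ifstack_end_of_file(IfStack *st)
{
    int leftover = st->depth;
    st->depth = 0;
    return leftover;
}

/* Demo: a file that opens two \if blocks and never closes them. */
static int
demo_unbalanced_include(void)
{
    IfStack st = { {false}, 0 };
    ifstack_push(&st, true);
    ifstack_push(&st, false);
    ifstack_pop(&st);           /* one \endif ... */
    ifstack_push(&st, true);    /* ... then another \if */
    return ifstack_end_of_file(&st);
}
```

Because each included file starts with an empty stack, an imbalance in one file can never corrupt the branching state of the file that included it.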
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
At Fri, 3 Feb 2017 01:02:47 +0900, Fujii Masao wrote in > On Thu, Feb 2, 2017 at 2:36 PM, Michael Paquier > wrote: > > On Thu, Feb 2, 2017 at 2:13 PM, Tom Lane wrote: > >> Kyotaro HORIGUCHI writes: > >>> Then, the reason for the TRY-CATCH clause is that I found that > >>> some functions called from there can throw exceptions. > >> > >> Yes, but all LWLocks should be released by normal error recovery. > >> It should not be necessary for this code to clean that up by hand. > >> If it were necessary, there would be TRY-CATCH around every single > >> LWLockAcquire in the backend, and we'd have an unreadable and > >> unmaintainable system. Please don't add a TRY-CATCH unless it's > >> *necessary* -- and you haven't explained why this one is. Yes. Thank you for the suggestion. I misunderstood that. > > Putting hands into the code and at the problem, I can see that > > dropping a subscription on a node makes it unresponsive in case of a > > stop. And that's just because calls to LWLockRelease are missing as in > > the patch attached. A try/catch problem should not be necessary. > > Thanks for the patch! > > With the patch, LogicalRepLauncherLock is released at the end of > DropSubscription(). But ISTM that the lock should be released just after > logicalrep_worker_stop() and there is no need to protect the removal of > replication slot with the lock. That's true. logicalrep_worker_stop returns after confirming that worker->proc is cleared, so no false relaunch can be caused. After all, logicalrep_worker_stop is surrounded by an LWLockAcquire/LWLockRelease pair, so the acquisition can be moved into the function, making the locked section narrower. > /* > * If we found worker but it does not have proc set it is starting up, > * wait for it to finish and then kill it. > */ > while (worker && !worker->proc) > { > > ISTM that the above loop in logicalrep_worker_stop() is not necessary > because LogicalRepLauncherLock ensures that the above condition is > always false. Thought? 
Am I missing something? The lock exists only to keep the launcher from starting a worker. Creating a subscription and starting a worker for the slot run independently. > If the above condition is true, which means that there is the worker slot > having the "subid" of the worker to kill, but its "proc" has not been set yet. Yes. The situation happens after launcher sets subid and before ApplyWorkerMain attaches the slot. The lock doesn't protect the section. If someone can drop a subscription just after its creation, it happens. > Without LogicalRepLauncherLock, this situation can happen after "subid" > is set by the launcher and before "proc" is set by the worker. But > LogicalRepLauncherLock protects those operations, so logicalrep_worker_stop() > called while holding the lock should always think the above condition is > false. regards, -- Kyotaro Horiguchi NTT Open Source Software Center >From 490add168b7cf63ec584e75f6a7f79efc08ae200 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Fri, 3 Feb 2017 10:31:01 +0900 Subject: [PATCH] Refactor the lock section for subscription worker termination --- src/backend/commands/subscriptioncmds.c| 3 --- src/backend/replication/logical/launcher.c | 5 + 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c index 3b70807..67c587c 100644 --- a/src/backend/commands/subscriptioncmds.c +++ b/src/backend/commands/subscriptioncmds.c @@ -508,9 +508,6 @@ DropSubscription(DropSubscriptionStmt *stmt) /* Clean up dependencies */ deleteSharedDependencyRecordsFor(SubscriptionRelationId, subid, 0); - /* Protect against launcher restarting the worker. */ - LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE); - /* Kill the apply worker so that the slot becomes accessible. 
*/ logicalrep_worker_stop(subid); diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c index d222cff..233be06 100644 --- a/src/backend/replication/logical/launcher.c +++ b/src/backend/replication/logical/launcher.c @@ -368,6 +368,9 @@ logicalrep_worker_stop(Oid subid) break; } + /* Prevent the launcher from restarting this worker */ + LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE); + /* Now terminate the worker ... */ kill(worker->proc->pid, SIGTERM); LWLockRelease(LogicalRepWorkerLock); @@ -398,6 +401,8 @@ logicalrep_worker_stop(Oid subid) ResetLatch(&MyProc->procLatch); } + + LWLockRelease(LogicalRepLauncherLock); } /* -- 2.9.2
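The narrowing proposed here follows a generic pattern (sketched below with pthreads standing in for LWLocks; all names are made up): the stop function takes the lock itself, for just the stop window, instead of the caller holding it across unrelated cleanup such as slot removal.

```c
#include <pthread.h>

/* Stand-ins for the launcher lock and worker state; illustrative only. */
static pthread_mutex_t launcher_lock = PTHREAD_MUTEX_INITIALIZER;
static int worker_running = 1;

/*
 * The lock is acquired and released inside the function, so the critical
 * section covers only the stop itself -- callers no longer need to hold
 * the lock across the rest of their own work.  Returns the final worker
 * state (0 = stopped).
 */
static int
worker_stop(void)
{
    pthread_mutex_lock(&launcher_lock);   /* block launcher restarts */
    worker_running = 0;                   /* stand-in for SIGTERM + wait */
    pthread_mutex_unlock(&launcher_lock); /* narrowest possible section */
    return worker_running;
}
```

Keeping the critical section inside the function also makes it impossible for a caller to forget the release on some exit path, which is the bug the earlier patch in this thread fixed.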
Re: [HACKERS] Constraint exclusion failed to prune partition in case of partition expression involves function call
On 2017/02/02 21:09, amul sul wrote: > Hi, > > In the following case, constraint exclusion is not able to prune partitions (even > if the function is immutable); is this known behaviour? Yes. The WHERE condition in your example query does not mention the partition key expression, so constraint exclusion won't work; it requires the expression in the condition to be spelled exactly the same as the partition key. Here the partitioning code is going to return check constraints of the form abs(a) = 0 for foo_list1, abs(a) = 1 for foo_list2 and so on, for the constraint exclusion logic to work upon. > --Explain plan > postgres=# explain select * from foo_list where a = 2; >QUERY PLAN > - > Append (cost=0.00..103.50 rows=25 width=36) >-> Seq Scan on foo_list (cost=0.00..0.00 rows=1 width=36) > Filter: (a = 2) >-> Seq Scan on foo_list1 (cost=0.00..25.88 rows=6 width=36) > Filter: (a = 2) >-> Seq Scan on foo_list2 (cost=0.00..25.88 rows=6 width=36) > Filter: (a = 2) >-> Seq Scan on foo_list3 (cost=0.00..25.88 rows=6 width=36) > Filter: (a = 2) >-> Seq Scan on foo_list4 (cost=0.00..25.88 rows=6 width=36) > Filter: (a = 2) > (11 rows) If you try with where abs(a) = 2, it works: explain select * from foo_list where abs(a) = 2; QUERY PLAN - Append (cost=0.00..29.05 rows=7 width=36) -> Seq Scan on foo_list (cost=0.00..0.00 rows=1 width=36) Filter: (abs(a) = 2) -> Seq Scan on foo_list3 (cost=0.00..29.05 rows=6 width=36) Filter: (abs(a) = 2) (5 rows) See an old exchange at the link below for a similar example and some explanation of why what one expects to happen doesn't. https://www.postgresql.org/message-id/CA%2BTgmoaE9NZ_RiqZQLp2aJXPO4E78QxkQYL-FR2zCDop96Ahdg%40mail.gmail.com Thanks, Amit
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Corey Huinker writes: >> As it is, I've added interactive mode psql_error notifications about the >> resulting branching state of any branching commands, and any attempt to >> send non-branching commands or queries while in an inactive branch will >> generate a psql_error saying that the command was ignored. Waiting til I >> get what should or shouldn't be done about prompts before issuing the next >> patch revision. On reflection, it seems fairly improbable to me that people would use \if and friends interactively. They're certainly useful for scripting, but would you really type commands that you know are going to be ignored? Therefore, I don't think we should stress out about fitting branching activity into the prompts. That's just not the use-case. (Note: we might well have to reconsider that if we get looping, but for now it's not a problem.) Moreover, if someone is confused because they don't realize they're inside a failed \if, it's unlikely that a subtle change in the prompt would help them. So your more in-the-face approach of printing messages seems good to me. > So far, interactive branching information will look like this (it prints on > every branching command and on every ignored command): This seems more or less reasonable, although: > # \endif > active \endif, executing commands looks a bit weird. Maybe instead say "exited \if, executing commands"? BTW, what is your policy about nesting these things in include files? My immediate inclination is that if we hit EOF with open \if state, we should drop it and revert to the state in the surrounding file. Otherwise things will be way too confusing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logical Replication and Character encoding
On 2 February 2017 at 11:45, Euler Taveira wrote: > I don't think storage without conversion is an acceptable approach. We > should provide options to users such as ignoring the tuple or substituting NULL for > column(s) with conversion problems. I wouldn't consider storing data > without conversion because it silently shows incorrect data and we > historically aren't flexible with conversion routines. pglogical and BDR both require identical encoding; they test for this during startup and refuse to replicate if the encoding differs. For the first pass at core I suggest a variant on that policy: require source and destination encoding to be the same. This should probably be the first change, since it definitively prevents the issue from arising. If time permits we could also allow destination encoding to be UTF-8 (including codepage 65001) with any source encoding. This requires encoding conversion to be performed, of course. The downside is that this will impact users who use a common subset of two encodings. This is most common for Windows-1252 <-> ISO-8859-15 (or -1 if you're old-school) but also arises anywhere the common 7 bit subset is used. Until we can define an encoding exception policy though, I think we should defer supporting those and make them a "later" problem. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pageinspect: Hash index support
On Fri, Feb 3, 2017 at 4:28 AM, Jesper Pedersen wrote: > On 02/02/2017 02:24 PM, Robert Haas wrote: >> >> So, committed. Wow, I wish every patch had this many reviewers. >> > > Thanks Robert ! 9 people in total per the commit message. Yes, that's rare, and thanks for making the effort to list everybody. -- Michael
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote: > On 02/01/2017 01:07 PM, Konstantin Knizhnik wrote: > >Attached please find my patch for XLC/AIX. > >The most critical fix is adding __sync to pg_atomic_fetch_add_u32_impl. > >The comment in this file says that: > > > > * __fetch_and_add() emits a leading "sync" and trailing "isync", > >thereby > > * providing sequential consistency. This is undocumented. > > > >But it is not true any more (I checked generated assembler code in > >debugger). > >This is why I have added __sync() to this function. Now pgbench working > >normally. Konstantin, does "make -C src/bin/pg_bench check" fail >10% of the time in the bad build? > Seems like it was not so much undocumented, but an implementation detail > that was not guaranteed after all.. Seems so. > There was a long thread on these things the last time this was changed: > https://www.postgresql.org/message-id/20160425185204.jrvlghn3jxulsb7i%40alap3.anarazel.de. > I couldn't find an explanation there of why we thought that fetch_and_add > implicitly performs sync and isync. It was in the generated code, for AIX xlc 12.1.0.0. > >Also there is mysterious disappearance of assembler section function > >with sync instruction from pg_atomic_compare_exchange_u32_impl. > >I have fixed it by using __sync() built-in function instead. > > __sync() seems more appropriate there, anyway. We're using intrinsics for > all the other things in generic-xlc.h. But it sure is scary that the "asm" > sections just disappeared. That is a problem, but it's a stretch to conclude that asm sections are generally prone to removal, while intrinsics are generally durable. 
> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile > pg_atomic_uint32 *ptr, > static inline uint32 > pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) > { > + uint32 ret; > + > /* > - * __fetch_and_add() emits a leading "sync" and trailing "isync", > thereby > - * providing sequential consistency. This is undocumented. > + * Use __sync() before and __isync() after, like in compare-exchange > + * above. >*/ > - return __fetch_and_add((volatile int *)&ptr->value, add_); > + __sync(); > + > + ret = __fetch_and_add((volatile int *)&ptr->value, add_); > + > + __isync(); > + > + return ret; > } Since this emits double syncs with older xlc, I recommend instead replacing the whole thing with inline asm. As I opined in the last message of the thread you linked above, the intrinsics provide little value as abstractions if one checks the generated code to deduce how to use them. Now that the generated code is xlc-version-dependent, the port is better off with compiler-independent asm like we have for ppc in s_lock.h. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
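The barrier question is easier to reason about with the compiler-independent __atomic builtins (a sketch for comparison only, using GCC/Clang builtins rather than the xlc intrinsics or inline asm under discussion): with __ATOMIC_SEQ_CST the compiler is responsible for emitting the required leading and trailing barriers on ppc, so nothing is left undocumented.

```c
#include <stdint.h>

/*
 * Sequentially consistent fetch-and-add, sketched with the GCC/Clang
 * __atomic builtins: the explicit memory-order argument makes the barrier
 * contract part of the call instead of an implementation detail of the
 * intrinsic, which is what bit the xlc port here.
 */
static uint32_t
fetch_add_u32_seq_cst(volatile uint32_t *ptr, int32_t add_)
{
    return __atomic_fetch_add(ptr, (uint32_t) add_, __ATOMIC_SEQ_CST);
}

/* Tiny single-threaded check: returns 1 when old value and sum are right. */
static int
fetch_add_demo(void)
{
    uint32_t v = 5;
    uint32_t old = fetch_add_u32_seq_cst(&v, 3);

    return old == 5 && v == 8;
}
```

The single-threaded demo only shows the fetch-and-add semantics, of course; the ordering guarantee matters under concurrency, where __ATOMIC_SEQ_CST obliges the compiler to emit the sync/isync (or lwsync) sequence itself.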
Re: [HACKERS] logical decoding of two-phase transactions
On 3 February 2017 at 03:34, Robert Haas wrote: > On Wed, Feb 1, 2017 at 4:35 PM, Craig Ringer wrote: >> Right. Per my comments upthread I don't see why we need to add anything more >> to WAL here. >> >> Stas was concerned about what happens in logical decoding if we crash >> between PREPARE TRANSACTION and COMMIT PREPARED. But we'll always go back >> and decode the whole txn again anyway so it doesn't matter. >> >> We can just track it on ReorderBufferTxn when we see it at PREPARE >> TRANSACTION time. > > Oh, hmm. I guess if that's how it works then we don't need it in WAL > after all. I'm not sure that re-decoding the already-prepared > transaction is a very good plan, but if that's what we're doing anyway > this patch probably shouldn't change it. We don't have much choice at the moment. Logical decoding must restart from the xl_running_xacts record most recently prior to the xid allocation of the oldest xact for which the client hasn't confirmed receipt of decoded data and commit. That's because reorder buffers are not persistent; if a decoding session crashes we throw away accumulated reorder buffers, both those in memory and those spilled to disk. We have to re-create them by restarting decoding from the beginning of the oldest xact of interest. We could make reorder buffers persistent and shared between decoding sessions, but that would totally change the logical decoding model and create other problems. It's certainly not a topic for this patch. So we can take it as given that after a crash we'll always restart decoding from BEGIN again. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal
On Fri, Feb 3, 2017 at 8:50 AM, Daniel Verite wrote: > What if we look at the change from the pessimistic angle? > An example of confusion that the change would create: > a lot of users currently choose pg_wal for the destination > directory of their archive command. Really? I find it surprising to say "a lot". Perhaps there are some, but first, I would not suspect that many people use this directory name, and even fewer are crazy enough to store archives directly in PGDATA itself. And of course that's not on a different partition :) > Also googling for pg_wal, I'm finding food for thought like this > IBM technote: > http://www-01.ibm.com/support/docview.wss?uid=isg3T1015637 > which recommends to > "Remove all files under /var/lib/pgsql/9.0/data/pg_wal/" > and also calls that directory the "write-ahead log directory" > which is quite confusing because apparently it's the destination of > their archive command. Well, this IBM product is one. > There's also the disruption in existing backup scripts that directly > reference pg_xlog. Obviously these scripts are critical, and there's > something weird in breaking that intentionally. The motivation of the > breakage is likely to be felt as frivolous and unfair to those people who > are adversely impacted by the change, even if the part of not > reading the release notes is their fault. Those are not complicated to fix, because they are hard failures. Sufficient tests need to be done before deploying, so that backup scripts don't show up in red on live systems. The original reason for the rename is that there are also folks who think that removing pg_xlog on a full disk is fine because, you know, it contains just *logs*, so surely it is not critical for the system. -- Michael
Re: [HACKERS] [PATCH] Rename pg_switch_xlog to pg_switch_wal
Peter Eisentraut wrote: > You really have (at least) three options here: > > 1. Rename nothing > 2. Rename directory only > 3. Rename everything I vote for 1), as I believe the renaming will create more confusion than it's worth, not even considering the renaming of functions and views. What if we look at the change from the pessimistic angle? An example of confusion that the change would create: a lot of users currently choose pg_wal for the destination directory of their archive command. Less-informed users who set up archiving and/or log shipping in PG10 based on online advice from previous versions will be fairly confused about the missing pg_xlog, and the fact that the pg_wal directory they're supposed to create already exists. Also, googling for pg_wal, I'm finding food for thought like this IBM technote: http://www-01.ibm.com/support/docview.wss?uid=isg3T1015637 which recommends to "Remove all files under /var/lib/pgsql/9.0/data/pg_wal/" and also calls that directory the "write-ahead log directory", which is quite confusing because apparently it's the destination of their archive command. This raises the question: what about the people who will delete their pg_wal (ex-pg_xlog) when they face a disk-full condition and mix up in their mind the archive directory and the WAL directory? It's hard to guess how many could make that mistake, but I don't see why it would be less probable than confusing pg_xlog and pg_log. At least the contents of the latter directories look totally different, contrary to the WAL directory versus the WAL archive. Also, what about the users who are currently helped by the fact that "pg_xlog" is associated at the top of Google results with good articles that explain what it is, why it fills up and what to do about it? By burying the name "pg_xlog" we're also losing that connection, and in the worst case making worse the problem we wanted to help with. There's also the disruption in existing backup scripts that directly reference pg_xlog. 
Obviously these scripts are critical, and there's something weird in breaking that intentionally. The motivation for the breakage is likely to be felt as frivolous and unfair by those people who are adversely impacted by the change, even if the part about not reading the release notes is their fault. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
> > > All, > As it is, I've added interactive mode psql_error notifications about the > resulting branching state of any branching commands, and any attempt to > send non-branching commands or queries while in an inactive branch will > generate a psql_error saying that the command was ignored. Waiting til I > get what should or shouldn't be done about prompts before issuing the next > patch revision. > So far, interactive branching information will look like this (it prints on every branching command and on every ignored command): # \if true active \if, executing commands # select 1; ?column? -- 1 (1 row) Time: 0.282 ms # \else inactive \else, ignoring commands # select 1; inside inactive branch, query ignored. # select ... # 1; inside inactive branch, query ignored. # \endif active \endif, executing commands # \if false inactive \if, ignoring commands # \i file_name inside inactive branch, command ignored. # \elif false inactive \elif, ignoring commands # \else active \else, executing commands # \endif active \endif, executing commands Comments welcome.
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
On Wed, Feb 1, 2017 at 4:58 PM, Jim Nasby wrote: > I think the issue here is that the original case for this is a user > accidentally getting into an \if and then having no clue what's going on. > That's similar to what happens when you miss a quote or a semicolon. We > handle those cases with %R, and I think %R needs to support if as well. > > Perhaps there's value to providing more info (active branch, etc), but > ISTM trying to do that will just confuse the original (%R) case. > Jim, After spending a few minutes to familiarize myself with %R, I'm in agreement with your second statement (adding if-else to %R will just confuse %R). However, your first statement seems to indicate the opposite. Can you elaborate? All, As it is, I've added interactive mode psql_error notifications about the resulting branching state of any branching commands, and any attempt to send non-branching commands or queries while in an inactive branch will generate a psql_error saying that the command was ignored. Waiting til I get what should or shouldn't be done about prompts before issuing the next patch revision.
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
On 2017-02-02 22:44, Tom Lane wrote: Erik Rijkers writes: Something is broken in HEAD: Fixed, thanks for the report! Indeed, the complicated version of the script runs again as before. Thank you very much, Erik Rijkers
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
Erik Rijkers writes: > Something is broken in HEAD: Fixed, thanks for the report! regards, tom lane
[HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Hello, There may be a very small window for a double-exit() self-deadlock when a forked backend process's ProcessStartupPacket returns and status is not STATUS_OK. The process will go into proc_exit, and then a well-timed SIGQUIT will call startup_die for another proc_exit. If timed correctly, the two exit() calls can deadlock, since exit() is not re-entrant. It seems extremely hard to hit the deadlock, but I believe the opportunity is there. Using gdb, I was able to create the window and get this stacktrace: #0 startup_die (postgres_signal_arg=0) at postmaster.c:5090 #1 #2 proc_exit_prepare (code=0) at ipc.c:158 #3 0x007c4135 in proc_exit (code=0) at ipc.c:102 #4 0x0076b736 in BackendInitialize (port=0x2c13740) at postmaster.c:4207 #5 0x0076b190 in BackendStartup (port=0x2c13740) at postmaster.c:3979 #6 0x00767ad3 in ServerLoop () at postmaster.c:1722 #7 0x007671df in PostmasterMain (argc=3, argv=0x2bebad0) at postmaster.c:1330 #8 0x006b5df6 in main (argc=3, argv=0x2bebad0) at main.c:228 I got the stacktrace by doing the following: 1. gdb attach to postmaster, run set follow-fork child, break postmaster.c:4206 (right after ProcessStartupPacket), and continue 2. In another terminal, open a psql session, which should trigger the gdb follow 3. In the gdb session, set status=1 and step into proc_exit() 4. In another terminal, run kill -s QUIT to send SIGQUIT to the child process. Or run pg_ctl stop -M immediate. 5. In the gdb session, step to process the signal into startup_die and run bt This was discovered while hacking on Greenplum Database (currently based off of Postgres 8.3), where we recently started encountering the self-deadlock intermittently in our testing environment. Here's the pull request discussion: https://github.com/greenplum-db/gpdb/pull/1662 In that pull request, we fix the issue by checking for proc_exit_inprogress. Is there a reason why startup_die should not check for proc_exit_inprogress? 
In the above pull request, Heikki also mentions that a similar scenario can happen during palloc() as well... which is similar to what we saw in Greenplum a couple years back for a deadlock in a malloc() call where we responded by changing exit() to _exit() in quickdie as a fix. That could possibly be applicable to latest Postgres as well. Regards, Jimmy
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On 2 February 2017 at 18:48, Peter Eisentraut wrote: > On 2/2/17 8:32 AM, Simon Riggs wrote: >> I think we should remove the "replication" false database concept in >> pg_hba.conf altogether and allow any valid pg_hba rule to invoke a >> replication connection, if one is requested. Roles would still need >> the REPLICATION capability before this would be allowed. Having both >> of those things doesn't materially improve security control. > > It's weirdly inconsistent now. You need a "replication" line in > pg_hba.conf to connect for logical decoding, but you can't restrict that > to a specific database because the database column in pg_hba.conf is > occupied by the "replication" key word. Agreed. Change needed. > However, you would still want a way to configure a user for logical > decoding for any database but no physical replication, or vice versa. > Just getting rid of the replication key word would prevent that, I think. We currently have REPLICATION to mean "physical replication". I think if we have another capability for logical replication then we would be able to easily allow one or the other, or both, so I don't see a problem here that forces us to keep pg_hba.conf the way it is. >> It would also be useful to be able to prevent users with REPLICATION >> capability from connecting as normal users, if they are marked as >> NOLOGIN. > > That sounds useful. > > (Superusers not having the replication attribute by default is an > additional possible annoyance.) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2017-02-02 14:47:53 -0500, Robert Haas wrote: > I expect that increasing the maximum value of shared_buffers beyond > what can be stored by an integer could have a noticeable distributed > performance cost for the entire system. It might be a pretty small > cost, but then again maybe not; for example, BufferDesc's buf_id > member would have to get wider, and probably the freeNext member, too. > Andres already did unspeakable things to make a BufferDesc fit into > one cache line for performance reasons, so that wouldn't be great > news. Yea, we'd have to get rid of BufferDesc's buf_id - but that's not that hard, just a bit verbose. You can get the buf_id already with a tiny bit of pointer math. I don't think there should be too many other changes, but I might be wrong. Andres
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2017-02-02 11:41:44 -0800, Jim Nasby wrote: > On 2/1/17 4:28 PM, Andres Freund wrote: > > On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: > > > With current limits, the most bgwriter can do (with 8k pages) is 1000 > > > pages > > > * 100 times/sec = 780MB/s. It's not hard to exceed that with modern > > > hardware. Should we increase the limit on bgwriter_lru_maxpages? > > > > FWIW, I think working on replacing bgwriter (e.g. by working on the > > patch I send with a POC replacement) wholesale is a better approach than > > spending time increasing limits. > > Do you have a link to that? I'm not seeing anything in the archives. Not at hand, but I can just give you the patches. These are very likely to conflict, but it shouldn't be too hard to resolve... What it basically does is move as much of the clock-sweep as possible to bgwriter, which keeps a certain number of clean pages available. There's a lock-free ringbuffer that backends can just pick pages off. The approach with the ringbuffer has the advantage that with relatively small changes we can scale to having multiple bgwriters (needs some changes in the ringbuf implementation, but not that many). Andres

From c008aad0445d2d8b1247597940ce09dbd60e Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Fri, 19 Feb 2016 12:07:51 -0800
Subject: [PATCH 1/2] Basic lockless single producer, multiple consumer ringbuffer
---
 src/backend/lib/Makefile  |   2 +-
 src/backend/lib/ringbuf.c | 162 ++
 src/include/lib/ringbuf.h |  60 +
 3 files changed, 223 insertions(+), 1 deletion(-)
 create mode 100644 src/backend/lib/ringbuf.c
 create mode 100644 src/include/lib/ringbuf.h

diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 2d2ba84fe9..03a4795706 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -13,6 +13,6 @@ top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

 OBJS = binaryheap.o bipartite_match.o hyperloglog.o ilist.o pairingheap.o \
-	rbtree.o stringinfo.o
+	rbtree.o ringbuf.o stringinfo.o

 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/ringbuf.c b/src/backend/lib/ringbuf.c
new file mode 100644
index 0000000000..62761cc447
--- /dev/null
+++ b/src/backend/lib/ringbuf.c
@@ -0,0 +1,162 @@
+/*-------------------------------------------------------------------------
+ *
+ * ringbuf.c
+ *	  Single writer, multiple reader, lock-free ringbuffer.
+ *
+ * ...
+ *
+ * Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  src/backend/lib/ringbuf.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "lib/ringbuf.h"
+#include "storage/proc.h"
+
+static inline uint32
+ringbuf_backendid(ringbuf *rb, uint32 pos)
+{
+	return pos & 0xffff0000;
+}
+
+/*
+ * Compute the new offset, slightly complicated by the fact that we only want
+ * to modify the lower 16 bits.
+ */
+static inline uint32
+ringbuf_advance_pos(uint32 pos)
+{
+	return ((pos + 1) & 0x0000ffff) | (pos & 0xffff0000);
+}
+
+uint32
+ringbuf_elements(ringbuf *rb)
+{
+	uint32 read_off = ringbuf_pos(rb, pg_atomic_read_u32(&rb->read_state));
+	uint32 write_off = ringbuf_pos(rb, rb->write_off);
+
+	/* not wrapped around */
+	if (read_off <= write_off)
+	{
+		return write_off - read_off;
+	}
+
+	/* wrapped around */
+	return (rb->size - read_off) + write_off;
+}
+
+size_t
+ringbuf_size(size_t nelems)
+{
+	Assert(nelems <= 0xffff);
+	return sizeof(ringbuf) + sizeof(void *) * nelems;
+}
+
+/*
+ * Memory needs to be externally allocated and be at least
+ * ringbuf_size(nelems) large.
+ */
+ringbuf *
+ringbuf_create(void *target, size_t nelems)
+{
+	ringbuf *rb = (ringbuf *) target;
+
+	Assert(nelems <= 0xffff);
+
+	memset(target, 0, ringbuf_size(nelems));
+
+	rb->size = nelems;
+	pg_atomic_init_u32(&rb->read_state, 0);
+	rb->write_off = 0;
+
+	return rb;
+}
+
+bool
+ringbuf_push(ringbuf *rb, void *data)
+{
+	uint32 read_off = pg_atomic_read_u32(&rb->read_state);
+
+	/*
+	 * Check if full - can be outdated, but that's ok. New readers are just
+	 * going to further consume elements, never cause the buffer to become
+	 * full.
+	 */
+	if (ringbuf_pos(rb, read_off) == ringbuf_pos(rb, rb->write_off + 1))
+	{
+		return false;
+	}
+
+	rb->elements[ringbuf_pos(rb, rb->write_off)] = data;
+
+	/*
+	 * The write adding the data needs to be visible before the corresponding
+	 * increase of write_off is visible.
+	 */
+	pg_write_barrier();
+
+	rb->write_off = ringbuf_advance_pos(rb->write_off);
+
+	return true;
+}
+
+
+bool
+ringbuf_pop(ringbuf *rb, void **data)
+{
+	void *ret;
+	uint32 mybackend = MyProc->backendId;
+
+	Assert((mybackend & 0xffff) == mybackend);
+
+	while (true)
+	{
+		uint32 read_state = pg_atomic_read_u32(&rb->read_state);
+		uint32 read_off = ringbuf_pos(rb, read_state);
+		uint32 old_read_state = read_state;
+
+		/* check if empty - can be outdated,
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On Wed, Feb 1, 2017 at 9:47 PM, Jim Nasby wrote: > Before doing that the first thing to look at would be why the limit is > currently INT_MAX / 2 instead of INT_MAX. Generally the rationale for GUCs with limits of that sort is that there is or might be code someplace that multiplies the value by 2 and expects the result not to overflow. I expect that increasing the maximum value of shared_buffers beyond what can be stored by an integer could have a noticeable distributed performance cost for the entire system. It might be a pretty small cost, but then again maybe not; for example, BufferDesc's buf_id member would have to get wider, and probably the freeNext member, too. Andres already did unspeakable things to make a BufferDesc fit into one cache line for performance reasons, so that wouldn't be great news. Anyway, I committed the patch posted here. Or the important line out of the two, anyway. :-) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Time to up bgwriter_lru_maxpages?
On 2/1/17 4:28 PM, Andres Freund wrote: On 2016-11-28 11:40:53 -0800, Jim Nasby wrote: With current limits, the most bgwriter can do (with 8k pages) is 1000 pages * 100 times/sec = 780MB/s. It's not hard to exceed that with modern hardware. Should we increase the limit on bgwriter_lru_maxpages? FWIW, I think working on replacing bgwriter (e.g. by working on the patch I send with a POC replacement) wholesale is a better approach than spending time increasing limits. Do you have a link to that? I'm not seeing anything in the archives. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] logical decoding of two-phase transactions
On Wed, Feb 1, 2017 at 4:35 PM, Craig Ringer wrote: > Right. Per my comments upthread I don't see why we need to add anything more > to WAL here. > > Stas was concerned about what happens in logical decoding if we crash > between PREPARE TRANSACTION and COMMIT PREPARED. But we'll always go back > and decode the whole txn again anyway so it doesn't matter. > > We can just track it on ReorderBufferTxn when we see it at PREPARE > TRANSACTION time. Oh, hmm. I guess if that's how it works then we don't need it in WAL after all. I'm not sure that re-decoding the already-prepared transaction is a very good plan, but if that's what we're doing anyway this patch probably shouldn't change it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] logical decoding of two-phase transactions
On Wed, Feb 1, 2017 at 2:32 PM, Tom Lane wrote: > Robert Haas writes: >> Also, including the GID in the WAL for each COMMIT/ABORT PREPARED >> doesn't seem inordinately expensive to me. > > I'm confused ... isn't it there already? If not, how do we handle > reconstructing 2PC state from WAL at all? By XID. See xl_xact_twophase, which gets included in xl_xact_commit or xl_xact_abort. The GID has got to be there in the XL_XACT_PREPARE record, but not when actually committing/rolling back. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pageinspect: Hash index support
On 02/02/2017 02:24 PM, Robert Haas wrote: So, committed. Wow, I wish every patch had this many reviewers. Thanks Robert ! Best regards, Jesper
Re: [HACKERS] pageinspect: Hash index support
On Thu, Feb 2, 2017 at 6:29 AM, Ashutosh Sharma wrote: >> >> + ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info); >> + Assert(ptr <= uargs->page + BLCKSZ); >> >> I think this should be promoted to an ereport(); these functions can >> accept an arbitrary bytea. > > I think we had added 'ptr' variable initially just to dump hash code > in hexadecimal format but now since we have removed that logic from > current code, I think it is no more required and I have therefore > removed it from the current patch. Below is the code that used it > initially. It got changed as per your comment - [1] OK. > I think it's OK to check hash_bitmap_info() or any other functions > with different page types at least once. > > [1]- > https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com Sure, I just don't know if we need to test them 4 or 5 times. But I think this is good enough for now - it's not like this code is written in stone once committed. So, committed. Wow, I wish every patch had this many reviewers. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Provide list of subscriptions and publications in psql's completion
On 2/2/17 12:48 AM, Michael Paquier wrote: > +#define Query_for_list_of_subscriptions \ > +" SELECT pg_catalog.quote_ident(subname) "\ > +" FROM pg_catalog.pg_subscription "\ > +" WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'" This query should also be qualified by current database. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Provide list of subscriptions and publications in psql's completion
On 2/2/17 12:48 PM, Fujii Masao wrote: > +#define Query_for_list_of_subscriptions \ > +" SELECT pg_catalog.quote_ident(subname) "\ > +" FROM pg_catalog.pg_subscription "\ > +" WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'" > > Since non-superusers are not allowed to access pg_subscription, > pg_stat_subscription should be accessed here, instead. Thought? Arguably, you could leave it like that, assuming it fails cleanly for nonsuperusers. Nonsuperusers are not going to be able to run any commands on subscriptions anyway, so there is little use for it. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On 2/2/17 8:32 AM, Simon Riggs wrote: > I think we should remove the "replication" false database concept in > pg_hba.conf altogether and allow any valid pg_hba rule to invoke a > replication connection, if one is requested. Roles would still need > the REPLICATION capability before this would be allowed. Having both > of those things doesn't materially improve security control. It's weirdly inconsistent now. You need a "replication" line in pg_hba.conf to connect for logical decoding, but you can't restrict that to a specific database because the database column in pg_hba.conf is occupied by the "replication" key word. However, you would still want a way to configure a user for logical decoding for any database but no physical replication, or vice versa. Just getting rid of the replication key word would prevent that, I think. > It would also be useful to be able to prevent users with REPLICATION > capability from connecting as normal users, if they are marked as > NOLOGIN. That sounds useful. (Superusers not having the replication attribute by default is an additional possible annoyance.) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Thu, Feb 02, 2017 at 12:14:10PM -0500, Tom Lane wrote: > Also, somebody who wants a check like that isn't necessarily going to want > "no WHERE clause" training wheels. So you're going to need to think about > facilities to enable or disable different checks. WHERE-less-ness should be visible to a statement trigger, which could then reject the operation if desirable. Nico
Re: [HACKERS] Provide list of subscriptions and publications in psql's completion
On Thu, Feb 2, 2017 at 2:48 PM, Michael Paquier wrote: > Hi all, > > While testing a bit the logical replication facility, I have bumped on > the fact that psql's completion does not show the list of things > already created. Attached is a patch. +#define Query_for_list_of_subscriptions \ +" SELECT pg_catalog.quote_ident(subname) "\ +" FROM pg_catalog.pg_subscription "\ +" WHERE substring(pg_catalog.quote_ident(subname),1,%d)='%s'" Since non-superusers are not allowed to access pg_subscription, pg_stat_subscription should be accessed here, instead. Thought? Regards, -- Fujii Masao
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Thu, Feb 02, 2017 at 12:14:10PM -0500, Tom Lane wrote: > Alvaro Herrera writes: > > Pavel Stehule wrote: > >> Better to enhance this feature step by step. > > > Agreed -- IMO this is a reasonable first step, except that I would > > rename the proposed extension so that it doesn't focus solely on > > the first step. > > Quite. The patch fails to make up its mind whether it's a trivial > example meant as a coding demonstration, or something that's going > to become actually useful. > > In the category of "actually useful", I would put checks like "are > there unqualified outer references in subqueries". That's something > we see complaints about at least once a month, I think, and it's the > type of error that even seasoned SQL authors can make easily. But > the current patch is not extensible in that direction (checking for > this in post_parse_analyze_hook seems quite impractical). > > Also, somebody who wants a check like that isn't necessarily going > to want "no WHERE clause" training wheels. So you're going to need > to think about facilities to enable or disable different checks. This is just the discussion I'd hoped for. I'll draft up a patch in the next day or two, reflecting what's gone so far. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
Alvaro Herrera writes: > Pavel Stehule wrote: >> Better to enhance this feature step by step. > Agreed -- IMO this is a reasonable first step, except that I would > rename the proposed extension so that it doesn't focus solely on the > first step. Quite. The patch fails to make up its mind whether it's a trivial example meant as a coding demonstration, or something that's going to become actually useful. In the category of "actually useful", I would put checks like "are there unqualified outer references in subqueries". That's something we see complaints about at least once a month, I think, and it's the type of error that even seasoned SQL authors can make easily. But the current patch is not extensible in that direction (checking for this in post_parse_analyze_hook seems quite impractical). Also, somebody who wants a check like that isn't necessarily going to want "no WHERE clause" training wheels. So you're going to need to think about facilities to enable or disable different checks. regards, tom lane
Re: [HACKERS] TRAP: FailedAssertion("!(hassrf)", File: "nodeProjectSet.c", Line: 180)
Andres Freund writes: > On 2017-02-01 23:27:36 -0500, Tom Lane wrote: >> I think the appropriate fix is that, once split_pathtarget_at_srfs() has >> computed a tentative list of SRFs it needs to evaluate, it has to make a >> second pass to see if any of them match expressions that were assigned to >> the next level down. This is pretty annoying, but we'd only have to do it >> if target_contains_srfs and context.nextlevel_contains_srfs are both true, >> which will be a negligibly small fraction of queries in practice. > Hm. Can't really come up with something better, but I'm kinda tired > too... I wrote a patch along that line, and was just about ready to commit it when I realized that really this is all wrong. Fixing it this way handles the case of

regression=# select generate_series(1,3), generate_series(1,3) + 1;
 generate_series | ?column?
-----------------+----------
               1 |        2
               2 |        3
               3 |        4
(3 rows)

which is what you got before v10, because the two SRFs ran in lockstep despite being at different expression nesting levels. However, consider

regression=# select generate_series(1,3), generate_series(2,4) + 1;
 generate_series | ?column?
-----------------+----------
               1 |        3
               2 |        3
               3 |        3
               1 |        4
               2 |        4
               3 |        4
               1 |        5
               2 |        5
               3 |        5
(9 rows)

That's *not* what you got before:

regression=# select generate_series(1,3), generate_series(2,4) + 1;
 generate_series | ?column?
-----------------+----------
               1 |        3
               2 |        4
               3 |        5
(3 rows)

Really the problem here is that split_pathtarget_at_srfs is completely wrong about how to assign SRFs to different levels in a stack of ProjectSet nodes. It's doing that according to each SRF's top-down nesting level, but it needs to do it bottom-up, so that a SRF is evaluated in the k'th step if there are k-1 nested levels of SRFs in its arguments. This is doable, but I think the routine will have to be completely rewritten not just hacked around the edges. Off to do that ... regards, tom lane
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Pavan Deolasee wrote: > I can reproduce this entire scenario using gdb sessions. This also explains > why the patch I sent earlier helps to solve the problem. Ouch. Great detective work there. I think it's quite possible that this bug explains some index errors, such as primary keys (or unique indexes) that when recreated fail with duplicate values: the first value inserted would get in via an UPDATE during the CIC race window, and sometime later the second value is inserted correctly. Because the first one is missing from the index, the second one does not give rise to a dupe error upon insertion, but they are of course both detected when the index is recreated. I'm going to study the bug a bit more, and put in some patch before the upcoming minor tag on Monday. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
Pavel Stehule wrote: > Identification of unjoined tables should be very useful - but it is far to > original proposal - so it can be solved separately. > > This patch is simple - and usually we prefer more simple patches than one > bigger. > > Better to enhance this feature step by step. Agreed -- IMO this is a reasonable first step, except that I would rename the proposed extension so that it doesn't focus solely on the first step. I'd pick a name that suggests that various kinds of checks are applied to queries, so "require_where" would be only one of various options that can be enabled. It will make no sense to enable the require_where module in order to disallow unjoined tables. At the same time, I don't see us providing a dozen of "require_foo" modules. So we'd better start making this one extensible from the get go. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Deadlock in XLogInsert at AIX
Hi Tony, On 02.02.2017 17:10, REIX, Tony wrote:

Hi Konstantin I've discussed the "zombie/exit" issue with our expert here.
- He does not think that AIX has anything special here
- If the process is marked in ps, this is because the flag SEXIT is set, thus the process is blocked somewhere in the kexitx() syscall, waiting for something.
- In order to know what it is waiting for, the best would be to have a look with kdb.

kdb shows the following stack:

pvthread+073000 STACK:
[005E1958]slock+000578 (005E1958, 80001032 [??])
[9558].simple_lock+58 ()
[00651DBC]vm_relalias+00019C (??, ??, ??, ??, ??)
[006544AC]vm_map_entry_delete+00074C (??, ??, ??)
[00659C30]vm_map_delete+000150 (??, ??, ??, ??)
[00659D88]vm_map_deallocate+48 (??, ??)
[0011C588]kexitx+001408 (??)
[000BB08C]kexit+8C ()
___ Recovery (FFF9290) ___
WARNING: Eyecatcher/version mismatch in RWA

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
2017-02-02 16:34 GMT+01:00 Bruce Momjian : > On Thu, Feb 2, 2017 at 07:18:45AM -0800, David Fetter wrote: > > On Thu, Feb 02, 2017 at 03:16:29PM +0000, Bruce Momjian wrote: > > > I just don't see this patch going in. I think it needs a larger > > > approach to the problems it is trying to solve. I think it then > > > will be very useful. > > > > What problems that it's trying to solve are not addressed? > > Unjoined tables. Inconsistent alias references. I think there are a > bunch of things and if we can make a list and get a mode to warn about > all of them, it would be very useful. > Identification of unjoined tables should be very useful - but it is far from the original proposal - so it can be solved separately. This patch is simple - and usually we prefer more simple patches than one bigger. Better to enhance this feature step by step. Regards Pavel > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + As you are, so once was I. As I am, so you will be. + > + Ancient Roman grave inscription +
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Thu, Feb 02, 2017 at 10:34:43AM -0500, Bruce Momjian wrote: > On Thu, Feb 2, 2017 at 07:18:45AM -0800, David Fetter wrote: > > On Thu, Feb 02, 2017 at 03:16:29PM +0000, Bruce Momjian wrote: > > > I just don't see this patch going in. I think it needs a > > > larger approach to the problems it is trying to solve. I think > > > it then will be very useful. > > > > What problems that it's trying to solve are not addressed? > > Unjoined tables. Inconsistent alias references. I think there are > a bunch of things and if we can make a list and get a mode to warn > about all of them, it would be very useful. There is always a delicate balance between putting in all that's required for a minimum viable feature and actually getting something out there. As I see it, this feature's main benefit is that it prevents some very common and at the same time very damaging kinds of harm. It also, for now, serves a pedagogical purpose. It's relatively straight-forward for someone with little or no PostgreSQL experience to look at it and see what it does. I originally sent the feature to cover unsubtle types of blunders, and I'd like to keep "unsubtle blunders" as the guiding principle here. I can see where an unintentional CROSS JOIN would qualify under that standard. I'm not sure I understand what you mean by inconsistent aliasing. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
On Thu, Feb 2, 2017 at 2:36 PM, Michael Paquier wrote: > On Thu, Feb 2, 2017 at 2:13 PM, Tom Lane wrote: >> Kyotaro HORIGUCHI writes: >>> Then, the reason for the TRY-CATCH clause is that I found that >>> some functions called from there can throw exceptions. >> >> Yes, but all LWLocks should be released by normal error recovery. >> It should not be necessary for this code to clean that up by hand. >> If it were necessary, there would be TRY-CATCH around every single >> LWLockAcquire in the backend, and we'd have an unreadable and >> unmaintainable system. Please don't add a TRY-CATCH unless it's >> *necessary* -- and you haven't explained why this one is. Yes. > Putting hands into the code and at the problem, I can see that > dropping a subscription on a node makes it unresponsive in case of a > stop. And that's just because calls to LWLockRelease are missing as in > the patch attached. A try/catch block should not be necessary. Thanks for the patch! With the patch, LogicalRepLauncherLock is released at the end of DropSubscription(). But ISTM that the lock should be released just after logicalrep_worker_stop() and there is no need to protect the removal of replication slot with the lock. /* * If we found worker but it does not have proc set it is starting up, * wait for it to finish and then kill it. */ while (worker && !worker->proc) { ISTM that the above loop in logicalrep_worker_stop() is not necessary because LogicalRepLauncherLock ensures that the above condition is always false. Thought? Am I missing something? If the above condition is true, which means that there is the worker slot having the "subid" of the worker to kill, but its "proc" has not been set yet. Without LogicalRepLauncherLock, this situation can happen after "subid" is set by the launcher and before "proc" is set by the worker. But LogicalRepLauncherLock protects those operations, so logicalrep_worker_stop() called while holding the lock should always think the above condition is false. 
Regards, -- Fujii Masao
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 02.02.2017 18:20, REIX, Tony wrote: Hi Konstantin I have an issue with pgbench. Any idea? pgbench's -s option specifies the scale. Scale 1000 corresponds to 100 million rows and requires about 16GB of disk space.

# mkdir /tmp/PGS
# chown pgstbf.staff /tmp/PGS
# su pgstbf
$ /opt/freeware/bin/initdb -D /tmp/PGS
The files belonging to this database system will be owned by user "pgstbf".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /tmp/PGS ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:

$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start
server starting
$ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status
pg_ctl: server is running (PID: 11599920)
/opt/freeware/bin/postgres_64 "-D" "/tmp/PGS"
$ /usr/bin/createdb pgstbf
$
$ pgbench -i -s 1000
creating tables...
100000 of 100000000 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s)
... 
1 of 1 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s) *ERROR: could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071** ** HINT: Check free disk space.* CONTEXT: COPY pgbench_accounts, line 7995584 PQendcopy failed After cleaning all /tmp/PGS and symlinking it to /home, where I have 6GB free, I've retried and I got nearly the same: 1 of 1 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s) ERROR: could not extend file "base/16384/16397.6": *No space left on device* HINT: Check free disk space. CONTEXT: COPY pgbench_accounts, line 51235802 PQendcopy failed *Do I need more than 6GB ???* *Thanks* *Tony* $ df -k . Filesystem1024-blocks Free %UsedIused %Iused Mounted on /dev/hd1 45088768 6719484 86% 94601639% /home bash-4.3$ pwd /tmp/PGS bash-4.3$ ll /tmp/PGS lrwxrwxrwx1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/ $ df -k Filesystem1024-blocks Free %UsedIused %Iused Mounted on /dev/hd4 524288277284 48%1073314% / /dev/hd2 6684672148896 98%4930348% /usr /dev/hd9var 2097152314696 85%2493418% /var /dev/hd3 3145728 2527532 20% 418 1% /tmp */dev/hd1 45088768 6719484 86% 94601639% /home* /dev/hd11admin 1310721306921%7 1% /admin /proc - -- - - /proc /dev/hd10opt 65273856829500 99% 93833941% /opt /dev/livedump 2621442617761%4 1% /var/adm/ras/livedump /aha- --18 1% /aha $ cat logfile LOG: database system was shut down at 2017-02-02 09:08:31 CST LOG: MultiXact member wraparound protections are now enabled LOG: autovacuum launcher started LOG: database system is ready to accept connections ERROR: could not extend file "base/16384/16397.6": No space left on device HINT: Check free disk space. 
CONTEXT: COPY pgbench_accounts, line 51235802 STATEMENT: copy pgbench_accounts from stdin $ *ulimit -a* core file size (blocks, -c) 1048575 data seg size (kbytes, -d) 131072 *file size (blocks, -f) unlimited* max memory size (kbytes, -m) 32768 open files (-n) 2000 pipe size(512 bytes, -p) 64 stack size (kbytes, -s) 32768 cpu time (seconds, -t) unlimited max user processes (-u) unlimited virtual memory (kbytes, -v) unlimited bash-4.3$ ll /tmp/PGS lrwxrwxrwx1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/ bash-4.3$ ls -l total 120 -rw---1 pgstbf staff 4 Feb 2 09:08 PG_VERSION drwx--6 pgstbf staff 256 Feb 2 09:09 base drwx--2 pgstbf staff 4096 Feb 2 09:09 global -rw---1 pgstbf staff 410 Feb 2 09:13 logfile drwx--2 pgstbf staff 256 Feb 2 09:08 pg_clog drwx--2 pgstbf staff 256 Feb 2 09:08 pg_commit_ts drwx--2 pgstbf staff 256 Feb 2 09:08 pg_dynshmem -rw---1 pgstbf staff 4462 Feb 2 09:08 pg
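Konstantin's ~16GB figure can be sanity-checked with rough arithmetic: pgbench creates 100,000 pgbench_accounts rows per unit of -s, so scale 1000 means 100 million rows. The per-row byte count and overhead factor below are loose assumptions for illustration, not exact pgbench numbers:

```python
# Back-of-the-envelope estimate of the disk space pgbench -i -s 1000 needs.
# BYTES_PER_ROW and INDEX_OVERHEAD are assumed round numbers, not measured values.
ROWS_PER_SCALE_UNIT = 100_000   # pgbench_accounts rows per unit of -s (pgbench's documented ratio)
BYTES_PER_ROW = 100             # assumed on-disk heap footprint per accounts row
INDEX_OVERHEAD = 1.3            # assumed factor for the PK index and the other tables

def estimated_gb(scale: int) -> float:
    rows = scale * ROWS_PER_SCALE_UNIT
    return rows * BYTES_PER_ROW * INDEX_OVERHEAD / 1024**3

print(f"scale 1000 ~ {estimated_gb(1000):.1f} GB")  # roughly 12 GB, far more than the 6 GB free
```

So with only 6GB free on /home the COPY was bound to fail regardless of the symlink trick.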
[HACKERS] Some patch triage from Brussels
The developer meeting in Brussels ran out of topics :), so we have spent some time going through the patches on the upcoming CF. We did manage to assign a couple of patches to people (and we also discussed but didn't actually assign all the patches that Robert has his hands in :P). And we found one patch that was already committed but not flagged as such :) We did not really have quorum to make any hard decisions about patches, so we didn't do that. However, we did come up with a list of patches that this group thinks we should push back to PostgreSQL 11 at this point, in order to make it easier to focus on the remaining ones for PostgreSQL 10 and not end up with too many things that are half-finished. This was based on the general feeling of the people here who had seen it, as well as when the patches were submitted. Our suggestion is we push the following patches back. We haven't done it, because we figured we don't have the mandate to do the push, but we suggest this is done fairly quickly. * Protect syscache from bloating with negative cache entries * pg_stat_statements with plans * "Causal reads" mode for load balancing reads without stale data * Vertical Clustered Index This is not a judgement on whether we think we *want* the features -- we actually want all of them. We just don't think they're going to be ready in time for 10. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Thu, Feb 2, 2017 at 07:18:45AM -0800, David Fetter wrote: > On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote: > > I just don't see this patch going in. I think it needs a larger > > approach to the problems it is trying to solve. I think it then > > will be very useful. > > What problems that it's trying to solve are not addressed? Unjoined tables. Inconsistent alias references. I think there are a bunch of things and if we can make a list and get a mode to warn about all of them, it would be very useful. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
I just don't see this patch going in. I think it needs a larger approach to the problems it is trying to solve. I think it then will be very useful.
Re: [HACKERS] Deadlock in XLogInsert at AIX
Hi Konstantin I have an issue with pgbench. Any idea ? # mkdir /tmp/PGS # chown pgstbf.staff /tmp/PGS # su pgstbf $ /opt/freeware/bin/initdb -D /tmp/PGS The files belonging to this database system will be owned by user "pgstbf". This user must also own the server process. The database cluster will be initialized with locale "C". The default database encoding has accordingly been set to "SQL_ASCII". The default text search configuration will be set to "english". Data page checksums are disabled. fixing permissions on existing directory /tmp/PGS ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile start server starting $ /opt/freeware/bin/pg_ctl -D /tmp/PGS -l /tmp/PGS/logfile status pg_ctl: server is running (PID: 11599920) /opt/freeware/bin/postgres_64 "-D" "/tmp/PGS" $ /usr/bin/createdb pgstbf $ $ pgbench -i -s 1000 creating tables... 10 of 1 tuples (0%) done (elapsed 0.29 s, remaining 288.09 s) ... 1 of 1 tuples (100%) done (elapsed 42.60 s, remaining 0.00 s) ERROR: could not extend file "base/16384/24614": wrote only 7680 of 8192 bytes at block 131071 HINT: Check free disk space. 
CONTEXT: COPY pgbench_accounts, line 7995584 PQendcopy failed After cleaning all /tmp/PGS and symlinking it to /home, where I have 6GB free, I've retried and I got nearly the same: 1 of 1 tuples (100%) done (elapsed 204.65 s, remaining 0.00 s) ERROR: could not extend file "base/16384/16397.6": No space left on device HINT: Check free disk space. CONTEXT: COPY pgbench_accounts, line 51235802 PQendcopy failed Do I need more than 6GB ??? Thanks Tony $ df -k . Filesystem1024-blocks Free %UsedIused %Iused Mounted on /dev/hd1 45088768 6719484 86% 94601639% /home bash-4.3$ pwd /tmp/PGS bash-4.3$ ll /tmp/PGS lrwxrwxrwx1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/ $ df -k Filesystem1024-blocks Free %UsedIused %Iused Mounted on /dev/hd4 524288277284 48%1073314% / /dev/hd2 6684672148896 98%4930348% /usr /dev/hd9var 2097152314696 85%2493418% /var /dev/hd3 3145728 2527532 20% 418 1% /tmp /dev/hd1 45088768 6719484 86% 94601639% /home /dev/hd11admin 1310721306921%7 1% /admin /proc - -- - - /proc /dev/hd10opt 65273856829500 99% 93833941% /opt /dev/livedump 2621442617761%4 1% /var/adm/ras/livedump /aha- --18 1% /aha $ cat logfile LOG: database system was shut down at 2017-02-02 09:08:31 CST LOG: MultiXact member wraparound protections are now enabled LOG: autovacuum launcher started LOG: database system is ready to accept connections ERROR: could not extend file "base/16384/16397.6": No space left on device HINT: Check free disk space. 
CONTEXT: COPY pgbench_accounts, line 51235802 STATEMENT: copy pgbench_accounts from stdin $ ulimit -a core file size (blocks, -c) 1048575 data seg size (kbytes, -d) 131072 file size (blocks, -f) unlimited max memory size (kbytes, -m) 32768 open files (-n) 2000 pipe size(512 bytes, -p) 64 stack size (kbytes, -s) 32768 cpu time (seconds, -t) unlimited max user processes (-u) unlimited virtual memory (kbytes, -v) unlimited bash-4.3$ ll /tmp/PGS lrwxrwxrwx1 root system 10 Feb 2 08:43 /tmp/PGS -> /home/PGS/ bash-4.3$ ls -l total 120 -rw---1 pgstbf staff 4 Feb 2 09:08 PG_VERSION drwx--6 pgstbf staff 256 Feb 2 09:09 base drwx--2 pgstbf staff 4096 Feb 2 09:09 global -rw---1 pgstbf staff 410 Feb 2 09:13 logfile drwx--2 pgstbf staff 256 Feb 2 09:08 pg_clog drwx--2 pgstbf staff 256 Feb 2 09:08 pg_commit_ts drwx--2 pgstbf staff 256 Feb 2 09:08 pg_dynshmem -rw---1 pgstbf staff 4462 Feb 2 09:08 pg_hba.conf -rw---1 pgstbf staff 1636 Feb 2 09:08 pg_ident.conf drwx--4 pgstbf staff 256 Feb 2 09:08 pg_logical drwx--4 pgstbf staff 256 Feb
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Thu, Feb 02, 2017 at 03:16:29PM +, Bruce Momjian wrote: > I just don't see this patch going in. I think it needs a larger > approach to the problems it is trying to solve. I think it then > will be very useful. What problems that it's trying to solve are not addressed? Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
On Mon, Jan 30, 2017 at 7:20 PM, Pavan Deolasee wrote: > > Based on my investigation so far and the evidence that I've collected, > what probably happens is that after a relcache invalidation arrives at the > new transaction, it recomputes the rd_indexattr but based on the old, > cached rd_indexlist. So the rd_indexattr does not include the new columns. > Later rd_indexlist is updated, but the rd_indexattr remains what it was. > > I was kinda puzzled this report did not get any attention, though it's clearly a data corruption issue. Anyways, now I know why this is happening and can reproduce this very easily via debugger and two sessions. The offending code is RelationGetIndexAttrBitmap(). Here is the sequence of events: Taking the same table as in the report: CREATE TABLE cic_tab_accounts ( aid bigint UNIQUE, abalance bigint, aid1 bigint ); 1. Session S1 calls DROP INDEX pgb_a_aid1 and completes 2. Session S2 is in process of rebuilding its rd_indexattr bitmap (because of previous relcache invalidation caused by DROP INDEX). To be precise, assume that the session has finished rebuilding its index list. Since DROP INDEX has just happened, 4793 indexoidlist = RelationGetIndexList(relation); 3. S1 then starts CREATE INDEX CONCURRENTLY pgb_a_aid1 ON cic_tab_accounts(aid1) 4. S1 creates catalog entry for the new index, commits phase 1 transaction and builds MVCC snapshot for the second phase and also finishes the initial index build 5. S2 now proceeds. Now notice that S2 had computed the index list based on the old view i.e. just one index. The following comments in relcache.c are now significant: 4799 /* 4800 * Copy the rd_pkindex and rd_replidindex value computed by 4801 * RelationGetIndexList before proceeding. This is needed because a 4802 * relcache flush could occur inside index_open below, resetting the 4803 * fields managed by RelationGetIndexList. 
(The values we're computing 4804 * will still be valid, assuming that caller has a sufficient lock on 4805 * the relation.) 4806 */ That's precisely what goes wrong. 6. When index_open is called, relcache invalidation generated by the first transaction commit in CIC gets accepted and processed. As part of this invalidation, rd_indexvalid is set to 0, which invalidates rd_indexlist too 7. But S2 is currently using the index list obtained at step 2 above (which has only the old index). It goes ahead and computes rd_indexattr based on just the old index. 8. While relcache invalidation processing at step 6 cleared rd_indexattr (along with rd_indexvalid), it is again set at 4906 relation->rd_indexattr = bms_copy(indexattrs); But what gets set at step 8 is based on the old view. There is no way rd_indexattr will be invalidated until S2 receives another relcache invalidation. This will happen when the CIC proceeds. But until then, every update done by S2 will generate broken HOT chains, leading to data corruption. I can reproduce this entire scenario using gdb sessions. This also explains why the patch I sent earlier helps to solve the problem. Thanks, Pavan -- Pavan Deolasee http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
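The sequence above can be sketched as a toy model — plain Python, not PostgreSQL code, with purely illustrative names: the session copies its cached index list, the CIC invalidation is processed mid-rebuild (inside index_open), and the attribute bitmap is then cached from the stale copy:

```python
# Toy model of the relcache race described above. The real logic lives in
# relcache.c's RelationGetIndexAttrBitmap(); everything here is a simplification.

cache = {"indexlist": ["aid_unique"], "indexattr": None, "valid": True}
catalog_indexes = {"aid_unique": {"aid"}}   # index name -> columns it covers

def process_invalidation(new_indexes):
    """What the CIC phase-1 commit's invalidation does: reset the cached lists."""
    catalog_indexes.update(new_indexes)
    cache["indexlist"] = None
    cache["indexattr"] = None
    cache["valid"] = False

def rebuild_index_attr_bitmap():
    # Step 2: copy the (old) cached index list before opening each index.
    indexoidlist = list(cache["indexlist"])
    # Step 6: index_open() accepts the pending invalidation from CIC.
    process_invalidation({"pgb_a_aid1": {"aid1"}})
    # Steps 7-8: compute and re-cache the bitmap from the stale local copy.
    attrs = set()
    for idx in indexoidlist:
        attrs |= catalog_indexes[idx]
    cache["indexattr"] = attrs   # stale: the new index column aid1 is missing
    return attrs

print(rebuild_index_attr_bitmap())   # {'aid'} - aid1 never makes it in
```

Until another invalidation arrives, every HOT-safety check in this model consults the stale `indexattr`, which is the analogue of the broken HOT chains Pavan observed.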
Re: [HACKERS] Non-deterministic behavior with floating point in parallel mode
Ruben Buchatskiy writes: > We have found that in parallel mode result of queries is non-deterministic > when the types of the attributes in table are double precision > (floating-point). Yeah ... > That is because floating-point addition is not necessarily associative. Right, exactly. > Is this desirable behavior? It's not especially the fault of parallelism. Any change in the order in which the SUM visits the rows could cause a similar change in the results. IOW, you are being overoptimistic about how deterministic this result is any of the time. Use numeric, not float, if you can't tolerate this sort of behavior. regards, tom lane
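Tom's point is easy to reproduce outside the database; a minimal Python sketch:

```python
# Floating-point addition is not associative: the grouping changes the
# rounding, so summing the same values in a different order can differ.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6
print(left == right)  # False

# The same effect at SUM() scale: partial sums combined in a different
# order on each run perturb the low-order bits of the total.
import random
random.seed(42)
vals = [random.uniform(-1e9, 1e9) for _ in range(100_000)]
s1 = sum(vals)
random.shuffle(vals)
s2 = sum(vals)        # same multiset of values, different order
print(abs(s1 - s2))   # tiny, but typically nonzero
```

This is exactly why a parallel SUM over double precision columns varies between runs while numeric does not.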
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On 02/02/17 14:32, Simon Riggs wrote: > On 23 January 2017 at 04:29, Michael Paquier > wrote: >> Hi all, >> >> As now wal_level = replica has become the default for Postgres 10, >> could we consider as well making replication connections enabled by >> default in pg_hba.conf? > > Agreed > >> This requires just uncommenting a couple of >> lines in pg_hba.conf.sample. > > I don't think that is the right way to do this. Changing the default > doesn't reduce the complexity. > > I think we should remove the "replication" false database concept in > pg_hba.conf altogether and allow any valid pg_hba rule to invoke a > replication connection, if one is requested. Roles would still need > the REPLICATION capability before this would be allowed. Having both > of those things doesn't materially improve security control. > +1 > It would also be useful to be able to prevent users with REPLICATION > capability from connecting as normal users, if they are marked as > NOLOGIN. > +1 -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Deadlock in XLogInsert at AIX
Hi Konstantin I've discussed the "zombie/exit" issue with our expert here. - He does not think that AIX has anything special here - If the process is marked in ps, this is because the flag SEXIT is set, thus the process is blocked somewhere in the kexitx() syscall, waiting for something. - In order to know what it is waiting for, the best would be to have a look with kdb. - either it is waiting for an asynchronous I/O to end, or a thread to end if the process is multi-threaded - Using the proctree command for analyzing the issue is not a good idea, since the process will block in kexitx() if there is an operation on /proc being done - If the process is marked , that means that the process has not called waitpid() yet for getting the child's status. Maybe the parent is blocked in non-interruptible code where the signal handler cannot be called. - In short, that may be due to many causes... Using kdb is the best way. - Instead of proctree (which makes use of /proc), use: "ps -faT ". I'll try to reproduce here. Regards Tony Le 01/02/2017 à 21:26, Konstantin Knizhnik a écrit : On 02/01/2017 08:30 PM, REIX, Tony wrote: About the zombie issue, I've discussed with my colleagues. Looks like the process stays a zombie till the parent looks at its status. However, though I did that several times, I do not remember well the details. And that should not be specific to AIX. I'll discuss with another colleague, tomorrow, who should understand this better than me. 1. Process is not in zombie state (according to ps). It is in state... It is something AIX specific, I have not seen processes in this state on Linux. 2. I have implemented a simple test - forkbomb. It creates 1000 children and then waits for them. It is about ten times slower than at Intel/Linux, but still much faster than 100 seconds. 
So there is some difference between a postgres backend and a dummy process doing nothing - just immediately terminating after return from fork() Regards, Tony Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit : Hi Tony, On 01.02.2017 18:42, REIX, Tony wrote: Hi Konstantin XLC. I'm on AIX 7.1 for now. I'm using this version of XLC v13: # xlc -qversion IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07) Version: 13.01.0003.0003 With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate , aggregates . With the following XLC v12 version, I have NO test failure: # /usr/vac/bin/xlc -qversion IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72) Version: 12.01..0016 So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless you are using more options for the configure ? Configure. What are the options that you give to the configure ? export CC="/opt/IBM/xlc/13.1.3/bin/xlc" export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 " export LDFLAGS="-Wl,-bbigtoc,-b64" export AR="/usr/bin/ar -X64" export LD="/usr/bin/ld -b64 " export NM="/usr/bin/nm -X64" ./configure --prefix="/opt/postgresql/xlc-debug/9.6" Hard load & 64 cores ? OK. That clearly explains why I do not see this issue. pgbench ? I wanted to run it. However, I'm still looking where to get it plus a guide for using it for testing. pgbench is part of the Postgres distribution (src/bin/pgbench) I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome ! Performance. - Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ? pgbench is the most widely used tool simulating OLTP workload. Certainly it is quite primitive and its results are rather artificial. TPC-C seems to be a better choice. 
But the best case is to implement your own benchmark simulating actual workload of your real application. - I'm interested in any information for improving the performance & quality of my PostgreSQM RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) sells IBM Power machines under the Escala brand since ages (25 years this year)). How to help ? How could I help for improving the quality and performance of PostgreSQL on AIX ? We still have one open issue at AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html It will be great if you can somehow help to fix this problem. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On 23 January 2017 at 04:29, Michael Paquier wrote: > Hi all, > > As now wal_level = replica has become the default for Postgres 10, > could we consider as well making replication connections enabled by > default in pg_hba.conf? Agreed > This requires just uncommenting a couple of > lines in pg_hba.conf.sample. I don't think that is the right way to do this. Changing the default doesn't reduce the complexity. I think we should remove the "replication" false database concept in pg_hba.conf altogether and allow any valid pg_hba rule to invoke a replication connection, if one is requested. Roles would still need the REPLICATION capability before this would be allowed. Having both of those things doesn't materially improve security control. It would also be useful to be able to prevent users with REPLICATION capability from connecting as normal users, if they are marked as NOLOGIN. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Pavan Deolasee wrote: > Do you think we should apply the patch to remove ItemPointerCopy()? I will > rework the HeapTupleHeaderGetNextTid() after that. Not that it depends on > removing ItemPointerCopy(), but decided to postpone it until we make a call > on that patch. My inclination is not to. We don't really know where we are going with storage layer reworks in the near future, and we might end up changing this in other ways. We might find ourselves needing this kind of abstraction again. I don't think this means we need to follow it completely in new code, since it's already broken in other places, but let's not destroy it completely just yet. > BTW I've run now long stress tests with the patch applied and see no new > issues, even when indexes are dropped and recreated concurrently (includes > my patch to fix CIC bug in the master though). In another 24 hour test, > WARM could do 274M transactions whereas master did 164M transactions. I > did not drop and recreate indexes during this run. Eh, that's a 67% performance improvement. Nice. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
[HACKERS] patch: optimize information_schema.constraint_column_usage
Hello hackers, The view information_schema.constraint_column_usage becomes slow when the number of columns and constraints rises to substantial values. This is because of a join condition that the planner can only enforce as a join filter. The attached patch optimizes it. See the many_constraints.sql file attached for a performance test: create 3000 tables with 10 columns and a PK each and select * from the view. The last statement takes 22 seconds on the master branch and 34 milliseconds optimized, on my laptop. Best Regards, Alexey Bashtanov many-constraints.sql Description: application/sql diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 00550eb..ffb1564 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -801,8 +801,8 @@ CREATE VIEW constraint_column_usage AS WHERE nr.oid = r.relnamespace AND r.oid = a.attrelid AND nc.oid = c.connamespace -AND (CASE WHEN c.contype = 'f' THEN r.oid = c.confrelid AND a.attnum = ANY (c.confkey) - ELSE r.oid = c.conrelid AND a.attnum = ANY (c.conkey) END) +AND r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END +AND a.attnum = ANY (CASE c.contype WHEN 'f' THEN c.confkey ELSE c.conkey END) AND NOT a.attisdropped AND c.contype IN ('p', 'u', 'f') AND r.relkind = 'r'
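One way to see why the rewrite helps, as a toy model rather than anything the planner actually runs: the new form exposes a plain equality on a computed key (r.oid = CASE ... END), which admits a hash join, while the original CASE-of-ANDs predicate can only be evaluated as a join filter over every (relation, constraint) pair:

```python
# Toy illustration (not PostgreSQL's planner): a computed equality key lets
# us build a hash table and do one pass over each input (O(n + m)), while an
# opaque per-pair predicate forces checking every combination (O(n * m)).
N = 500
relations = list(range(N))                                  # stand-ins for r.oid
constraints = [("p", oid, None) for oid in range(N)] \
            + [("f", None, oid) for oid in range(0, N, 2)]  # (contype, conrelid, confrelid)

def join_key(con):
    contype, conrelid, confrelid = con
    # Mirrors: r.oid = CASE c.contype WHEN 'f' THEN c.confrelid ELSE c.conrelid END
    return confrelid if contype == "f" else conrelid

# Hash-join style: probe a set built from one side.
by_oid = set(relations)
hash_matches = [c for c in constraints if join_key(c) in by_oid]

# Join-filter style: evaluate the predicate for every pair (quadratic).
filter_matches = [c for r in relations for c in constraints if r == join_key(c)]

print(len(hash_matches), len(filter_matches))  # same answer, very different cost
```

Both produce identical matches; only the amount of work differs, which is consistent with the 22 s vs 34 ms measurement.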
[HACKERS] Non-deterministic behavior with floating point in parallel mode
Hi hackers, We have found that in parallel mode the results of queries are non-deterministic when the types of the attributes in a table are double precision (floating-point). Our example is based on TPC-H, but some NUMERIC columns were changed to DOUBLE PRECISION. When running without parallelism tpch=# set max_parallel_workers_per_gather to 0; SET tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days'; sum -- 448157055361.319 (1 row) the output is always the same. But in parallel mode tpch=# set max_parallel_workers_per_gather to 1; SET tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days'; sum -- 448157055361.341 (1 row) tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date '1998-12-01' - interval '116 days'; sum - 448157055361.348 (1 row) the result differs between runs. That is because floating-point addition is not necessarily associative. That is, (a + b) + c is not necessarily equal to a + (b + c). In parallel mode, the order in which the attribute values are added (summed) changes between runs, which leads to non-deterministic results. Is this desirable behavior? -- *Best Regards,**Ruben.* ISP RAS.
Re: [HACKERS] Enabling replication connections by default in pg_hba.conf
On Thu, Feb 2, 2017 at 6:08 AM, Peter Eisentraut < peter.eisentr...@2ndquadrant.com> wrote: > On 1/22/17 10:29 PM, Michael Paquier wrote: > > As now wal_level = replica has become the default for Postgres 10, > > could we consider as well making replication connections enabled by > > default in pg_hba.conf? This requires just uncommenting a couple of > > lines in pg_hba.conf.sample. > > Yes, I think this makes sense, as one of the reason for these changes is > to simplify the use of pg_basebackup. > +1. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: \if, \elseif, \else, \endif (was Re: [HACKERS] PSQL commands: \quit_if, \quit_unless)
Hello, What do I need to do to get TAP tests running? I misunderstood. You need to configure with "--enable-tap-tests". There is a spurious empty line added at the very end of "mainloop.h": + #endif /* MAINLOOP_H */ Not in my diff, but that's been coming and going in your diff reviews. Strange. Maybe this is linked to the warning displayed with "git apply" when I apply the diff. I would suggest adding a short one-line comment before each test to explain what is being tested, like "-- test \elif execution", "-- test \else execution"... Where are you suggesting this? In "regress/sql/psql.sql", in front of each group which starts a test. Debatable suggestion about "psql_branch_empty": The name isn't great. Maybe psql_branch_stack_empty()? Yep, maybe, or "empty_stack" or "stack_is_empty" or IDK... "psql_branch_end_state": it is a pop, it could be named "psql_branch_pop" Yeah, we either need to go fully with telling the programmer that it's a stack (push/pop/empty) or (begin_branch/end_branch/not_branching). I'm inclined to go full-stack, as it were. Anything consistent is ok. I'm fine with calling a stack a stack:-) -- Fabien.
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
On Mon, Jan 30, 2017 at 7:20 PM, Pavan Deolasee wrote: > Hello All, > > While stress testing WARM, I encountered a data consistency problem. It > happens when index is built concurrently. What I thought to be a WARM > induced bug and it took me significant amount of investigation to finally > conclude that it's a bug in the master. In fact, we tested all the way up to > 9.2 release and the bug exists in all releases. > > In my test set up, I keep a UNIQUE index on an "aid" column and then have > many other "aid[1-4]" columns, on which indexes are built and dropped. The > script then updates a randomly selected row using any of the aid[1-4] > columns. Even these columns are updated during the tests to force WARM > updates, but within a narrow range of non-overlapping values. So we can > always find the unique row using some kind of range condition. The > reproduction case is much simpler and I'll explain that below. > > In the reproduction scenario (attached), it happens fairly quickly, may be > after 3-10 rounds of DROP/CREATE. You may need to adjust the -j value if it > does not happen for you even after a few rounds. Just for the record, I > haven't checked past reports to correlate if the bug explains any of the > index corruption issues we might have received. > > To give a short summary of how CIC works with HOT, it consists of three > phases: > 1. Create index entry in the catalog, mark it not ready for inserts, commit > the transaction, start new transaction and then wait for all potential lock > holders on the table to end > 2. Take a MVCC snapshot, build the index, mark index ready for inserts, > commit and then once again wait for potential lock holders on the table to > end > 3. Do a validation phase where we look at MVCC-visible tuples and add > missing entries to the index. We only look at the root line pointer of each > tuple while deciding whether a particular tuple already exists in the index > or not. 
IOW we look at HOT chains and not tuples themselves. > > The interesting case is phase 1 and 2. The phase 2 works on the premise that > every transaction started after we finished waiting for all existing > transactions will definitely see the new index. All these new transactions > then look at the columns used by the new index and consider them while > deciding HOT-safety of updates. Now for reasons which I don't fully > understand, there exists a path where a transaction starts after CIC waited > and took its snapshot at the start of the second phase, but still fails to > see the new index. Or may be it sees the index, but fails to update its > rd_indexattr list to take into account the columns of the new index. That > leads to wrong decisions and we end up with a broken HOT chain, something > the CIC algorithm is not prepared to handle. This in turn leads the new > index missing entries for the update tuples i.e. the index may have aid1 = > 10, but the value in the heap could be aid1 = 11. > > Based on my investigation so far and the evidence that I've collected, what > probably happens is that after a relcache invalidation arrives at the new > transaction, it recomputes the rd_indexattr but based on the old, cached > rd_indexlist. So the rd_indexattr does not include the new columns. Later > rd_indexlist is updated, but the rd_indexattr remains what it was. > > There is definitely an evidence of this sequence of events (collected after > sprinkling code with elog messages), but it's not yet clear to me why it > affects only a small percentage of transactions. One possibility is that it > probably depends on at what stage an invalidation is received and processed > by the backend. I find it a bit confusing that even though rd_indexattr is > tightly coupled to the rd_indexlist, we don't invalidate or recompute them > together. Shouldn't we do that? > During invalidation processing, we do destroy them together in function RelationDestroyRelation(). 
So ideally, after invalidation processing, both should be built again, but not sure why that is not happening in this case. > For example, in the attached patch we > invalidate rd_indexattr whenever we recompute rd_indexlist (i.e. if we see > that rd_indexvalid is 0). /* + * If the index list was invalidated, we better also invalidate the index + * attribute list (which should automatically invalidate other attributes + * such as primary key and replica identity) + */ + relation->rd_indexattr = NULL; + I think setting directly to NULL will leak the memory. > This patch fixes the problem for me (I've only > tried master), but I am not sure if this is a correct fix. > I think it is difficult to say that fix is correct unless there is a clear explanation of what led to this problem. Another angle to investigate is to find out when relation->rd_indexvalid is set to 0 for the particular session where you are seeing the problem. I could see few places where we are setting it to 0 and clearing rd_indexlist, but not rd_indexattr. I am not s
Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?
I forgot to mention this..

At Thu, 02 Feb 2017 21:11:16 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20170202.26.49572782.horiguchi.kyot...@lab.ntt.co.jp>

> Hello, I dug up this topic covered by a spiderweb..
>
> # PostGIS requires so many libraries installed :(
>
> FYI, for the given test case, the following query hits the bug.
>
> | SELECT edge.geom AS geom
> |   FROM (SELECT * FROM knn_recheck_geom) AS edge
> |   ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;
>
> At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas wrote in
> > On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe wrote:
> > > It's still not quite clear to me even looking at that git commit, why those need to error instead of going thru recheck aside from efficiency.
> >
> > The code that reorders the returned tuples assumes that (1) the actual distance is always greater than or equal to the estimated distance and (2) the index returns the tuples in order of increasing estimated distance. Imagine that the estimated distances are 0, 1, 2, 3... and the real distances are 2, 3, 4, 5... When it sees the estimated-distance-0 tuple it computes that the actual distance is 2, but it doesn't know whether there's going to be a tuple later with an actual distance between 0 and 2, so it buffers the tuple. When it sees the estimated-distance-1 tuple it computes that the actual distance is 2, and now it knows there won't be any more estimated or actual distances between 0 and 1, but there could still be a tuple with an estimated distance between 1 and 2 whose actual distance is also between 1 and 2, so it buffers the second tuple as well. When it sees the third tuple, with estimated distance 2, it now knows that there won't be any further tuples whose estimated or actual distance is less than 2. So now it can emit the first tuple that it saw, because that had an actual distance of 2 and from this point forward the index will never return anything with a smaller estimated or actual distance. The estimated-distance-1 tuple still has to stay in the buffer, though, until we see a tuple whose estimated distance is greater than that tuple's actual distance (3).
>
> The estimation is calculated using box2df_distance, and the recheck evaluation is using distance (of libgeom).
>
> For the problematic case, the distance's result is "6.992439" and box2df_distance's is "6.99330517578125" by "%20.18lf". The point should be just on the edge of its bounding box.

The binary representations of the two are the following.

distance:   "40 1b f8 d4 fd f3 b6 45" = (0: plus) (exponent 1025, i.e. 2^(1025-1023) = *4) 1."bf8d4fdf3b645"
estimation: "40 1b f8 d5 00 00 00 00" = (0: plus) (exponent 1025, i.e. 2^(1025-1023) = *4) 1."bf8d5"

I mean, the estimation (box2df_distance) seems to be suffering a precision reduction in its mantissa. (I cannot say more than this, though)

> gserialized_gist_distance_2d() of PostGIS 2.3.0:
>
> > /* In all cases, since we only have keys (boxes) we'll return */
> > /* the minimum possible distance, which is the box2df_distance */
> > /* and let the recheck sort things out in the case of leaves */
> > distance = (double)box2df_distance(entry_box, &query_box);
>
> This theoretically shouldn't give a larger value than the real distance, but it is failing. I'm not sure how to fix this, but at least it is not a problem of GiST or PostgreSQL.
>
> If we take the distance() of libgeom as the baseline, box2df_distance should perhaps return a very slightly smaller value (I don't know how much it should be) so that it can conceal the error of its operation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?
Hello, I dug up this topic covered by a spiderweb..

# PostGIS requires so many libraries installed :(

FYI, for the given test case, the following query hits the bug.

| SELECT edge.geom AS geom
|   FROM (SELECT * FROM knn_recheck_geom) AS edge
|   ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;

At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas wrote in
> On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe wrote:
> > It's still not quite clear to me even looking at that git commit, why those need to error instead of going thru recheck aside from efficiency.
>
> The code that reorders the returned tuples assumes that (1) the actual distance is always greater than or equal to the estimated distance and (2) the index returns the tuples in order of increasing estimated distance. Imagine that the estimated distances are 0, 1, 2, 3... and the real distances are 2, 3, 4, 5... When it sees the estimated-distance-0 tuple it computes that the actual distance is 2, but it doesn't know whether there's going to be a tuple later with an actual distance between 0 and 2, so it buffers the tuple. When it sees the estimated-distance-1 tuple it computes that the actual distance is 2, and now it knows there won't be any more estimated or actual distances between 0 and 1, but there could still be a tuple with an estimated distance between 1 and 2 whose actual distance is also between 1 and 2, so it buffers the second tuple as well. When it sees the third tuple, with estimated distance 2, it now knows that there won't be any further tuples whose estimated or actual distance is less than 2. So now it can emit the first tuple that it saw, because that had an actual distance of 2 and from this point forward the index will never return anything with a smaller estimated or actual distance. The estimated-distance-1 tuple still has to stay in the buffer, though, until we see a tuple whose estimated distance is greater than that tuple's actual distance (3).

The estimation is calculated using box2df_distance, and the recheck evaluation is using distance (of libgeom).

For the problematic case, the distance's result is "6.992439" and box2df_distance's is "6.99330517578125" by "%20.18lf". The point should be just on the edge of its bounding box.

gserialized_gist_distance_2d() of PostGIS 2.3.0:

> /* In all cases, since we only have keys (boxes) we'll return */
> /* the minimum possible distance, which is the box2df_distance */
> /* and let the recheck sort things out in the case of leaves */
> distance = (double)box2df_distance(entry_box, &query_box);

This theoretically shouldn't give a larger value than the real distance, but it is failing. I'm not sure how to fix this, but at least it is not a problem of GiST or PostgreSQL.

If we take the distance() of libgeom as the baseline, box2df_distance should perhaps return a very slightly smaller value (I don't know how much it should be) so that it can conceal the error of its operation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Constraint exclusion failed to prune partition in case of partition expression involves function call
Hi,

In the following case, constraint exclusion is not able to prune partitions (even though the function is immutable); is this known behaviour?

--CASE 1 : create table & insert data
create table foo_list (a integer, b text) partition by list (abs(a));
create table foo_list1 partition of foo_list for values in (0);
create table foo_list2 partition of foo_list for values in (1);
create table foo_list3 partition of foo_list for values in (2);
create table foo_list4 partition of foo_list for values in (3);

insert into foo_list values(0),(1),(-1),(2),(-2),(3),(-3);

--Explain plan
postgres=# explain select * from foo_list where a = 2;
                           QUERY PLAN
-----------------------------------------------------------------
 Append  (cost=0.00..103.50 rows=25 width=36)
   ->  Seq Scan on foo_list  (cost=0.00..0.00 rows=1 width=36)
         Filter: (a = 2)
   ->  Seq Scan on foo_list1  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on foo_list2  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on foo_list3  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
   ->  Seq Scan on foo_list4  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
(11 rows)

AFAIU, constraint exclusion should prune all of the above tables other than foo_list3, as happens in the following case:

-- CASE 2: create table & insert data
create table bar_list (a integer, b text) partition by list (a);
create table bar_list1 partition of bar_list for values in (0);
create table bar_list2 partition of bar_list for values in (1);
create table bar_list3 partition of bar_list for values in (2);
create table bar_list4 partition of bar_list for values in (3);

insert into bar_list values(0),(1),(2),(3);

--- Explain plan
postgres=# explain select * from bar_list where a = 2;
                           QUERY PLAN
-----------------------------------------------------------------
 Append  (cost=0.00..25.88 rows=7 width=36)
   ->  Seq Scan on bar_list  (cost=0.00..0.00 rows=1 width=36)
         Filter: (a = 2)
   ->  Seq Scan on bar_list3  (cost=0.00..25.88 rows=6 width=36)
         Filter: (a = 2)
(5 rows)

Thanks & Regards,
Amul

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
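One way to see why the two cases differ: constraint exclusion compares the query's quals against each partition's implied constraint, and foo_list's constraints are expressed in terms of the partition expression abs(a), not the raw column. The prover does not reason from "a = 2" to "abs(a) = 2". A sketch against the tables defined above (the plan shape is our expectation, not captured from a real run):

```sql
-- The partition constraint on foo_list3 is roughly "abs(a) = 2", so a
-- qual written against that same expression can be matched by the
-- constraint-exclusion prover, while "a = 2" cannot:
explain select * from foo_list where abs(a) = 2;
-- expected: Append over the parent foo_list and foo_list3 only,
--           each with Filter: (abs(a) = 2)
```

Note that this prunes only rows matching abs(a) = 2 across both signs, which is a different query from "a = 2"; for the original predicate the planner would have to prove the implication through abs(), which constraint exclusion does not attempt.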
Re: [HACKERS] pageinspect: Hash index support
> > + ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
> > + Assert(ptr <= uargs->page + BLCKSZ);
>
> I think this should be promoted to an ereport(); these functions can accept an arbitrary bytea.

I think we had added the 'ptr' variable initially just to dump the hash code in hexadecimal format, but since we have now removed that logic from the current code, it is no longer required, and I have therefore removed it from the current patch. Above is the code that used it initially; it got changed as per your comment [1].

> > + if (opaque->hasho_flag & LH_BUCKET_PAGE)
> > + stat->hasho_prevblkno = InvalidBlockNumber;
> > + else
> > + stat->hasho_prevblkno = opaque->hasho_prevblkno;
>
> I think we should return the raw value here. Mithun's patch to cache the metapage hasn't gone in yet, but even if it does, let's assume anyone using contrib/pageinspect wants to see the data that's physically present, not our gloss on it.

Okay, agreed. I have corrected this in the attached v10 patch.

> Other than that, I don't think I have any other comments on this. The tests that got added look a little verbose to me (do we really need to check pages 1-4 separately in each case when they're all hash pages? if hash_bitmap_info rejects one, surely it will reject the others) but I'm not going to fight too hard if Peter wants it that way.

I think it's OK to check hash_bitmap_info() or any other function with different page types at least once.

[1] https://www.postgresql.org/message-id/CA%2BTgmoZUjrVy52TUU3b_Q5L6jcrt7w5R4qFwMXdeCuKQBmYWqQ%40mail.gmail.com

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

0001-Add-support-for-hash-index-in-pageinspect-contrib-mo_v10.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add tab completion for DEALLOCATE
Tom Lane writes:
> ilm...@ilmari.org (Dagfinn Ilmari Mannsåker) writes:
>> While playing with prepared statements in psql, I noticed that EXECUTE tab-completes the list of active prepared statements, but DEALLOCATE does not. Attached is a patch to fix this.
>
> Good idea, but I think it would be better to give DEALLOCATE its own entry in the list, so it could be placed in alphabetical order.

I was following the example of FETCH/MOVE further down to avoid duplicating the body of the stanza, but I guess in this case it's simple enough that it doesn't matter. Here's an updated patch which adds it as a separate stanza.

-- 
- Twitter seems more influential [than blogs] in the 'gets reported in the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down to a mainstream media article. - Calle Dybedahl

>From 8b5e4ed63e558475b193b1d668ed14baae0c497c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 2 Feb 2017 10:09:26 +
Subject: [PATCH] Add tab completion for DEALLOCATE

EXECUTE already tab-completes the list of prepared statements, but DEALLOCATE was missing.
---
 src/bin/psql/tab-complete.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index d6fffcf42f..539ab1afe2 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2453,6 +2453,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches5("CREATE", "EVENT", "TRIGGER", MatchAny, "ON"))
 		COMPLETE_WITH_LIST3("ddl_command_start", "ddl_command_end", "sql_drop");
 
+	/* DEALLOCATE */
+	else if (Matches1("DEALLOCATE"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_prepared_statements);
+
 	/* DECLARE */
 	else if (Matches2("DECLARE", MatchAny))
 		COMPLETE_WITH_LIST5("BINARY", "INSENSITIVE", "SCROLL", "NO SCROLL",
-- 
2.11.0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Superowners
On 30 January 2017 at 16:34, Peter Eisentraut wrote:
> On 1/30/17 9:04 AM, Simon Riggs wrote:
>> all I want in this release is super-ownership.
>
> What exactly is super-ownership, and what problems does it solve?

The problem is that there is no easy way for a DBA to have privs on multiple sets of objects, so there is a request for superuser in many cases. Superuser is too strong for most situations, so we are stuck. We need some middle ground where a single user can manage many "normal application objects" (tables, views, sequences, matviews, functions, indexes, triggers) without problem, while not compromising other areas that require higher security.

Probably more than 50% of PostgreSQL installs now use services that block superuser accounts, so the majority of PostgreSQL users are affected by these problems. The permissions desirable for logical replication are a good example of this, but not in any sense the only issue.

My hope is that we release v10 with a permissions model that allows logical replication to be realistically usable when superuser is not available. This is not a new requirement, but the privilege aspect of logical replication has been pushed back. While thinking about other problems of access control I've rethought this, so I now see the wider problem and would like to solve that rather than just focus on the needs of logical replication.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Superowners
On 30 January 2017 at 16:43, Tom Lane wrote:
> Simon Riggs writes:
>> Agreed. Let me reiterate: all I want in this release is super-ownership.
>
> While I'm not entirely convinced whether super-ownership is a good idea or not, I am pretty sure that rushing to get it into v10 is a bad idea. This is a rather fundamental change in our permissions model and it might turn out to have undesirable consequences.

Agreed. My view is that the current mechanism almost forces people to use superusers for many things, and that is definitely undesirable.

> Or even more directly: any patch for this would necessarily be landing in the last v10 commitfest. We have a project policy against major changes showing up for the first time in the last fest of a cycle, for good reasons.

I understand.

> Let's take our time and get it right.

So we are able to see what is proposed, I attach a patch.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

db_owner_has_obj_privs.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multivariate statistics (v19)
On 02/01/2017 11:52 PM, Alvaro Herrera wrote:
> Still looking at 0002.
>
> pg_ndistinct_in disallows input, claiming that pg_node_tree does the same thing. But pg_node_tree does it for security reasons: you could crash the backend if you supplied a malicious value. I don't think that applies to pg_ndistinct_in. Perhaps it will be useful to inject fake stats at some point, so why not allow it? It shouldn't be complicated (though it does require writing some additional code, so perhaps that's one reason we don't want to allow input of these values).

Yes, I haven't written the code, and I'm not sure it's a very practical way to inject custom statistics. But if we decide to allow that in the future, we can probably add the code.

There's a subtle difference between pg_node_tree and the data types for statistics - pg_node_tree stores the value as a string (matching the nodeToString output), so the _in function is fairly simple. Of course, stringToNode() assumes safe input, which is why the input is disabled. OTOH the statistics are stored in an optimized binary format, allowing the value to be used directly (without having to do expensive parsing etc).

I was thinking that the easiest way to add support for _in would be to add a bunch of Nodes for the statistics, along with in/out functions, but keeping the internal binary representation. But that'll be tricky to do in a safe way - even if those nodes are coded in a very defensive way, I'd bet there'll be ways to inject unsafe nodes. So I'm OK with not having the _in for now. If needed, it's possible to construct the statistics as a bytea using a bit of C code. That's at least obviously unsafe, as is anything written in C, touching the memory.

> The comment on top of pg_ndistinct_out is missing the "_out"; also it talks about histograms, which is not what this is about.

OK, will fix.
> In the same function, a trivial point: you don't need to pstrdup() the .data out of a stringinfo; it's already palloc'ed in the right context -- just PG_RETURN_CSTRING(str.data) and forget about "ret". Saves you one line.

Will fix too.

> Nearby, some auxiliary functions such as n_choose_k and num_combinations are not documented. What is it that they do? I'd move these to the end of the file, keeping the important entry points at the top of the file.

I'd say n-choose-k is a pretty widely known term from combinatorics. The comment would essentially say just 'this is n-choose-k', which seems rather pointless. So as much as I dislike self-documenting code, this actually seems like a good case for it.

> I see this patch has an estimate_ndistinct() which claims to be a re-implementation of code already in analyze.c, but it is actually a lot simpler than what analyze.c does. I've been wondering if it'd be a good idea to use some of this code so that some routines are moved out of analyze.c; good implementations of statistics-related functions would live in src/backend/statistics/ where they can be used both by analyze.c and your new mvstats stuff. (More generally I am beginning to wonder if the new directory should be just src/backend/statistics.)

I'll look into that. I have to check whether I ignored some assumptions or corner cases that analyze.c deals with.

> common.h does not belong in src/backend/utils/mvstats; IMO it should be called src/include/utils/mvstat.h. Also, it must not include postgres.h, and it probably doesn't need most of the #includes it has; those are better put into whatever includes it. It definitely needs a guarding #ifdef MVSTATS_H around its whole content too. An include file is not just a way to avoid #includes in other files; it is supposed to be a minimally invasive way of exporting the structs and functions implemented in some file into other files. So it must be kept minimal.

Will do.
> psql/tab-complete.c compares the wrong version number (9.6 instead of 10).
>
> Is it important to have a cast from pg_ndistinct to bytea? I think it's odd that outputting it as bytea yields something completely different than as text. (The bytea is not human readable and cannot be used for future input, so what is the point?)

Because it internally is a bytea, and it seems useful to have the ability to inspect the bytea value directly (e.g. to see the length of the bytea and not the string output).

> In another subthread you seem to have surrendered to the opinion that the new catalog should be called pg_statistics_ext, just in case in the future we come up with additional things to put in it. However, given its schema, with a "starelid / stakeys", is it sensible to think that we're going to get anything other than something that involves multiple variables? Maybe it should just be "pg_statistics_multivar" and if something else comes along we create another catalog with an appropriate schema. Heck, how does this catalog serve the purpose of cross-table statistics in the first place, given that it has room to record a single re
Re: pg_authid.rolpassword format (was Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol)
On 02/02/2017 05:50 AM, David Rowley wrote:
> On 2 February 2017 at 00:13, Heikki Linnakangas wrote:
>> Ok, I'll drop the second patch for now. I committed the first patch after fixing the things you and Michael pointed out. Thanks for the review!
>
> dbd69118 caused a small compiler warning for me. The attached fixed it.

Fixed, thanks!

- Heikki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvements in psql hooks for variables
On 02/01/2017 11:26 AM, Tom Lane wrote:
> "Daniel Verite" writes:
>> That works for me. I tested and read it and did not find anything unexpected or worth objecting. "\unset var" acting as "\set var off" makes sense considering that its opposite "\set var" is now an accepted synonym of "\set var on" for the boolean built-ins.
>
> Thanks for reviewing! I've pushed this now --- Andrew, you should be able to revert the RedisFDW test script to the way it was.

Done.

cheers

andrew

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Deadlock in XLogInsert at AIX
Last update on the issue with the deadlock in XLogInsert.

After almost one day of running, pgbench is once again not working normally :( There are no deadlocks, there are no core files and no error messages in the log. But TPS is almost zero:

progress: 57446.0 s, 1.1 tps, lat 3840.265 ms stddev NaNQ
progress: 57447.3 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57448.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57449.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57450.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57451.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57452.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57453.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57454.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57455.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57456.5 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57457.1 s, 164.6 tps, lat 11504.085 ms stddev 5902.148
progress: 57458.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57459.0 s, 234.0 tps, lat 1597.573 ms stddev 3665.814
progress: 57460.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57461.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57462.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57463.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57464.0 s, 602.8 tps, lat 906.765 ms stddev 1940.256
progress: 57465.0 s, 7.2 tps, lat 38.052 ms stddev 12.302
progress: 57466.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57467.1 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57468.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57469.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57470.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57471.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57472.1 s, 147.8 tps, lat 4379.790 ms stddev 3431.477
progress: 57473.0 s, 1314.1 tps, lat 156.884 ms stddev 535.761
progress: 57474.0 s, 1272.2 tps, lat 31.548 ms stddev 59.538
progress: 57475.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57476.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57477.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57478.0 s, 1688.6 tps, lat 268.379 ms stddev 956.537
progress: 57479.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57480.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57481.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57482.1 s, 29.0 tps, lat 3500.432 ms stddev 54.177
progress: 57483.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57484.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57485.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57486.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57487.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57488.0 s, 66.0 tps, lat 9813.646 ms stddev 19.807
progress: 57489.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57490.0 s, 31.0 tps, lat 8368.125 ms stddev 933.997
progress: 57491.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57492.0 s, 1601.0 tps, lat 226.865 ms stddev 844.952
progress: 57493.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ
progress: 57494.0 s, 0.0 tps, lat NaNQ ms stddev -NaNQ

ps auwx shows the following picture:

[10:44:12]postgres@postgres:~/postgresql $ ps auwx | fgrep postgres
postgres 61802470 0.4 0.0 4064 4180 pts/6 A18:54:58 976:56 pgbench -c 100 -j 20 -P 1 -T 10 -p 5436
postgres 15271518 0.0 0.0 138276 15024 - A10:43:34 0:06 postgres: autovacuum worker process postgres
postgres 13305354 0.0 0.0 22944 21356 - A20:49:04 27:51 postgres: autovacuum worker process postgres
postgres 5245902 0.0 0.0 14072 14020 - A18:54:59 10:24 postgres: postgres postgres [local] COMMIT
postgres 44303278 0.0 0.0 15176 14036 - A18:54:59 10:18 postgres: postgres postgres [local] COMMIT
postgres 38601340 0.0 0.0 11564 14008 - A18:54:59 10:16 postgres: postgres postgres [local] COMMIT
postgres 53674890 0.0 0.0 12712 14004 - A18:54:59 8:54 postgres: postgres postgres [local] COMMIT
postgres 27591640 0.0 0.0 15040 14028 - A18:54:59 8:38 postgres: postgres postgres [local] COMMIT
postgres 40960422 0.0 0.0 12128 13996 - A18:54:59 8:36 postgres: postgres postgres [local] COMMIT
postgres 41288514 0.0 0.0 10544 14012 - A18:54:59 8:30 postgres: postgres postgres [local] idle
postgres 55771564 0.0 0.0 12844 14008 - A18:54:59 8:24 postgres: postgres postgres [local] COMMIT
postgres 21760842 0.0 0.0 13164 14008 - A18:54:59 8:17 postgres: postgres postgres [local] COMMIT
postgres 18810974 0.0 0.0 10416 14012 - A18:54:59 8:13 postgres: postgres postgres [local] idle in transaction
postgres 17566474 0.0 0.0 10224 14012 - A18:54:59 8:02 postgres: postgres postgres [local] COMMIT
postgres 63963402 0.0 0.0 11300 14000 - A18:54:59 7:48 postgres: postgres postgres [local] COMMIT
postgres 9963962 0.0 0.0 15548 14024 - A18:54:59 7:37 postgres: postgres postgres [local] idle
postgres 10094942 0.0 0.0 12192 13996 - A18:54:59 7:33 po