Re: Tid scan improvements
On Thu, 11 Jul 2019 at 10:22, David Rowley wrote:
> So it seems that the plan is to insist that TIDs are tuple identifiers
> for all table AMs, for now. If that changes in the future, then so be
> it, but I don't think that's cause for delaying any work on TID Range
> Scans. Also from scanning around tableam.h, I see that there's no
> shortage of usages of BlockNumber, so it seems reasonable to assume
> table AMs must use blocks... It's hard to imagine moving away from
> that given that we have shared buffers.
>
> We do appear to have some table AM methods that are optional, although
> I'm not sure where the documentation is about that. For example, in
> get_relation_info() I see:
>
> info->amhasgetbitmap = amroutine->amgetbitmap != NULL &&
> relation->rd_tableam->scan_bitmap_next_block != NULL;
>
> so it appears that at least scan_bitmap_next_block is optional.
>
> I think what I'd do would be to add a table_setscanlimits API method
> for table AM and perhaps have the planner only add TID range scan
> paths if the relation has a non-NULL function pointer for that API
> function. It would be good to stick a comment at least in tableam.h
> that mentions that the callback is optional. That might help a bit
> when it comes to writing documentation on each API function and what
> they do.

Hi. Here's a new patch. Summary of changes compared to last time:

- I've added the additional "scan_setlimits" table AM method. To check
  whether it's implemented in the planner, I have added an additional
  "has_scan_setlimits" flag to RelOptInfo. It seems to work.

- I've also changed nodeTidrangescan to not require anything from heapam now.

- To simplify the patch and avoid changing heapam, I've removed the backward
  scan support (which was needed for FETCH LAST/PRIOR) and made
  ExecSupportsBackwardScan return false for this plan type.

- I've removed the vestigial passing of "direction" through nodeTidrangescan.
  If my understanding is correct, the direction passed to TidRangeNext will
  always be forward. But I did leave FETCH LAST/PRIOR in the regression tests
  (after adding SCROLL to the cursor).

The patch now only targets the simple use case of restricting the range of a
table to scan. I think it would be nice to eventually support the other major
use cases of ORDER BY ctid ASC/DESC and MIN/MAX(ctid), but that can be
another feature...

Edmund

v8-0001-Add-a-new-plan-type-Tid-Range-Scan-to-support-range-.patch
Description: Binary data
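For readers following the thread, a minimal sketch of the kind of query this
plan type is meant to serve (the table name and ctid bounds are made up for
illustration):

  -- Restrict a scan to a physical range of the table; today this needs a
  -- sequential scan with a filter on ctid, whereas a TID Range Scan could
  -- read only the blocks in the requested range.
  SELECT *
  FROM mytable
  WHERE ctid >= '(1000,0)' AND ctid < '(2000,0)';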
Re: A little report on informal commit tag usage
On Mon, Jul 15, 2019 at 5:12 PM Tom Lane wrote:
> Thomas Munro writes:
> >     42 Doc
> [...] I see a lot more than 42 such commit messages in the past
> year, so not sure what you were counting?

I would have tried to exclude the first line messages if I'd thought of that.
But anyway, the reason for the low Doc number is case sensitivity. I ran that
on a Mac and its lame collation support failed me in the "sort" step (also -i
didn't do what I wanted, but that wasn't the issue). Trying again on a
FreeBSD box and explicitly setting LANG for the benefit of anyone else
wanting to run this (see end), and then removing a few obvious false matches,
I now get similar numbers in most fields but a higher "doc" number:

    767 Author
      9 Authors
    144 Backpatch-through
     55 Backpatch
     14 Bug
     14 Co-authored-by
     27 Diagnosed-by
   1599 Discussion
    119 doc
     36 docs
    284 Reported-by
      5 Review
      8 Reviewed by
    460 Reviewed-by
      7 Security
      9 Tested-by

git log --since 2018-07-14 | \
  grep -E '^ +[a-zA-Z].*: ' | \
  LANG=en_US.UTF-8 sort | \
  sed 's/:.*//' | \
  LANG=en_US.UTF-8 uniq -ic | \
  grep -v -E '^ *[12] '

--
Thomas Munro
https://enterprisedb.com
Creating partitions automatically at least on HASH?
Hello pgdevs,

Sorry if this has already been discussed, but G did not yield anything
convincing about that.

While looking at HASH partitioning and creating a few of them, it occurred to
me that while RANGE and LIST partitions cannot be guessed easily, it would be
easy to derive the subtables of a HASH-partitioned table for a fixed MODULUS,
e.g. with

  CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10);
  -- or some other syntax

Postgres could derive statically the 10 subtables, eg named foo_$0$ to
foo_$1$. That would not be a replacement for the feature where one may do
something funny and doubtful like (MODULUS 2 REMAINDER 0, MODULUS 4
REMAINDER 1, MODULUS 4 REMAINDER 3).

The same declarative approach could eventually be considered for RANGE with a
fixed partition duration and starting and ending points.

This would be a relief on the longer path of dynamically creating partitions,
but with lower costs than a dynamic approach. The ALTER thing would be a
little pain.

Thoughts?

--
Fabien.
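To make the proposal concrete, here is a sketch of what such a shorthand
might expand to if written out by hand with today's syntax (the foo_0 ..
foo_9 naming is only an assumed scheme, not part of the proposal):

  -- today: every hash partition must be declared explicitly
  CREATE TABLE foo (id int, payload text) PARTITION BY HASH (id);
  CREATE TABLE foo_0 PARTITION OF foo FOR VALUES WITH (MODULUS 10, REMAINDER 0);
  CREATE TABLE foo_1 PARTITION OF foo FOR VALUES WITH (MODULUS 10, REMAINDER 1);
  -- ... and so on through ...
  CREATE TABLE foo_9 PARTITION OF foo FOR VALUES WITH (MODULUS 10, REMAINDER 9);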
Re: XLogRecGetFullXid()
On Fri, Jul 12, 2019 at 1:25 PM Thomas Munro wrote: > Here is a small patch extracted from the undo log patch set that I'd > like to discuss separately and commit soon. [...] Pushed. -- Thomas Munro https://enterprisedb.com
Re: A little report on informal commit tag usage
Thomas Munro writes: > Here are the tags that people have used in the past year, in commit messages: > 763 Author >9 Authors > 144 Backpatch-through > 55 Backpatch > 14 Bug > 14 Co-authored-by > 27 Diagnosed-By > 1593 Discussion > 42 Doc > 284 Reported-By >5 Review >8 Reviewed by > 456 Reviewed-By >7 Security >9 Tested-By One small comment on that --- I'm not sure what you meant to count in respect to the "Doc" item, but I believe there's a fairly widespread convention to write "doc:" or some variant in the initial summary line of commits that touch only documentation. The point here is to let release-note writers quickly ignore such commits, since we never list them as release note items. Bruce and I, being the usual suspects for release-note writing, are pretty religious about this but other people do it too. I see a lot more than 42 such commit messages in the past year, so not sure what you were counting? Anyway, that's not a "tag" in the sense I understand you to be using (otherwise the entries would look something like "Doc: yes" and be at the end, which is unhelpful for the purpose). But it's a related sort of commit-message convention. regards, tom lane
Re: refactoring - share str2*int64 functions
Hello Thomas, +extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result); +extern uint64 pg_strtouint64(const char *str, char **endptr, int base); One of these things is not like the other. Indeed. I agree that it is unfortunate, and it was bothering me a little as well. Let's see... the int64 version is used only by pgbench and is being promoted to common where it can be used by more code. No and yes. The pgbench code was a copy of server-side internal "scanint8", so it is used both by pgbench and the server-side handling of "int8", it is used significantly, taking advantage of its versatile error reporting feature on both sides. With a name like that, wouldn't it make sense to bring it into line with the uint64 interface, and then move pgbench's error reporting stuff back into pgbench? That would need moving the server-side error handling as well, which I would not really be happy with. The uint64 one derives its shape from the family of standard functions like strtol() so I think it wins. Yep, it cannot be changed either. I do not think that changing the error handling capability is appropriate, it is really a feature of the function. The function could try to use an internal pg_strtoint64 which would look like the other unsigned version, but then it would not differentiate the various error conditions (out of range vs syntax error). The compromise I can offer is to change the name of the first one, say to "pg_scanint8" to reflect its former backend name. Attached a v4 which does a renaming so as to avoid the name similarity but signature difference. I also made both error messages identical. -- Fabien.diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index ba57628f6f..5c3536b117 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -79,6 +79,7 @@ #include "utils/builtins.h" #include "utils/hashutils.h" #include "utils/memutils.h" +#include "common/string.h" PG_MODULE_MAGIC; diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c index 8eedb613a1..e8744f054c 100644 --- a/src/backend/executor/spi.c +++ b/src/backend/executor/spi.c @@ -21,6 +21,7 @@ #include "catalog/heap.h" #include "catalog/pg_type.h" #include "commands/trigger.h" +#include "common/string.h" #include "executor/executor.h" #include "executor/spi_priv.h" #include "miscadmin.h" diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 6c2626ee62..3735268e3a 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -34,6 +34,7 @@ #include "fmgr.h" #include "miscadmin.h" +#include "common/string.h" #include "nodes/extensible.h" #include "nodes/parsenodes.h" #include "nodes/plannodes.h" diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c index 1baf7ef31f..9a66ad41bd 100644 --- a/src/backend/parser/parse_node.c +++ b/src/backend/parser/parse_node.c @@ -25,10 +25,10 @@ #include "parser/parse_expr.h" #include "parser/parse_relation.h" #include "utils/builtins.h" -#include "utils/int8.h" #include "utils/lsyscache.h" #include "utils/syscache.h" #include "utils/varbit.h" +#include "common/string.h" static void pcb_error_callback(void *arg); @@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location) case T_Float: /* could be an oversize integer as well as a float ... */ - if (scanint8(strVal(value), true, )) + if (pg_scanint8(strVal(value), true, )) { /* * It might actually fit in int32. 
Probably only INT_MIN can diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c index d317fd7006..673f25a014 100644 --- a/src/backend/replication/pgoutput/pgoutput.c +++ b/src/backend/replication/pgoutput/pgoutput.c @@ -12,6 +12,7 @@ */ #include "postgres.h" +#include "common/string.h" #include "catalog/pg_publication.h" #include "replication/logical.h" @@ -20,7 +21,6 @@ #include "replication/pgoutput.h" #include "utils/inval.h" -#include "utils/int8.h" #include "utils/memutils.h" #include "utils/syscache.h" #include "utils/varlena.h" @@ -111,7 +111,7 @@ parse_output_parameters(List *options, uint32 *protocol_version, errmsg("conflicting or redundant options"))); protocol_version_given = true; - if (!scanint8(strVal(defel->arg), true, )) + if (!pg_scanint8(strVal(defel->arg), true, )) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("invalid proto_version"))); diff --git a/src/backend/utils/adt/cash.c b/src/backend/utils/adt/cash.c index c92e9d5046..618660f372 100644 --- a/src/backend/utils/adt/cash.c +++ b/src/backend/utils/adt/cash.c @@ -26,7 +26,6 @@ #include "libpq/pqformat.h" #include "utils/builtins.h" #include "utils/cash.h" -#include "utils/int8.h" #include "utils/numeric.h" #include
A little report on informal commit tag usage
On Fri, Jul 12, 2019 at 1:25 PM Michael Paquier wrote:
> On Thu, Jul 11, 2019 at 09:44:07AM -0400, Tom Lane wrote:
> > I thought we *did* have an agreement, to wit using
> >
> > Discussion: https://postgr.es/m/
> >
> > to link to relevant mail thread(s). Some people use more tags
> > but that seems inessential to me.
>
> Hehe. I actually was thinking about advocating for having more of
> them in the commit logs. I'll just start a new thread about what I
> had in mind. Perhaps that will lead us nowhere, but let's see.

[Moving to -hackers]

Here are the tags that people have used in the past year, in commit messages:

    763 Author
      9 Authors
    144 Backpatch-through
     55 Backpatch
     14 Bug
     14 Co-authored-by
     27 Diagnosed-By
   1593 Discussion
     42 Doc
    284 Reported-By
      5 Review
      8 Reviewed by
    456 Reviewed-By
      7 Security
      9 Tested-By

Other things I've noticed:

* a few people list authors and reviewers in prose in a fairly mechanical
  paragraph
* some people put back-patch and bug number information in prose
* a few people list authors and reviewers with full email addresses
* some people repeat tags for multiple values, others make comma separated
  lists
* some people break long lines of meta-data with newlines
* authors "X and Y" may be an alternative to "X, Y", or imply greater
  collaboration

The counts above were produced by case-insensitively sorting and counting
unique stuff that precedes a colon, and then throwing out those used fewer
than three times (these are false matches and typos), and then throwing out a
couple of obvious false matches by hand. Starting from here:

git log --since 2018-07-14 | \
  grep -E '^ *[A-Z].*: ' | \
  sort -i | \
  sed 's/:.*//' | \
  uniq -ic | \
  grep -v -E '^ *[12] '

--
Thomas Munro
https://enterprisedb.com
Re: proposal - patch: psql - sort_by_size
Hi pá 12. 7. 2019 v 15:10 odesílatel Fabien COELHO napsal: > > Hello Pavel, > > > rebased patch attached > > I prefer patches with a number rather than a date, if possible. For one > thing, there may be several updates in one day. > > About this version (20180708, probably v3): applies cleanly, compiles, > make check ok, doc build ok. No tests. > attached version 4 > It works for me on a few manual tests against a 11.4 server. > > Documentation: if you say "\d*+", then it already applies to \db+ and > \dP+, so why listing them? Otherwise, state all commands or make it work > on all commands that have a size? > > About the text: >- remove , before "sorts" >- ... outputs by decreasing size, when size is displayed. >- add: When size is not displayed, the output is sorted by names. > fixed > I still think that the object name should be kept as a secondary sort > criterion, in case of size equality, so that the output is deterministic. > Having plenty of objects of the same size out of alphabetical order looks > very strange. > fixed Regards Pavel > > I still do not like much the boolean approach. I understand that the name > approach has been rejected, and I can understand why. > > I've been thinking about another more generic interface, that I'm putting > here for discussion, I do not claim that it is a good idea. Probably could > fall under "over engineering", but it might not be much harder to > implement, and it solves a few potential problems. > > The idea is to add an option to \d commands, such as "\echo -n": > >\dt+ [-o 1d,2a] ... > > meaning do the \dt+, order by column 1 descending, column 2 ascending. > With this there would be no need for a special variable nor other > extensions to specify some ordering, whatever the user wishes. > > Maybe it could be "\dt+ [-o '1 DESC, 2 ASC'] ..." so that the string > is roughly used as an ORDER BY specification by the query, but it would be > longer to specify. > > It also solves the issue that if someone wants another sorting order we > would end with competing boolean variables such as SORT_BY_SIZE, > SORT_BY_TYPE, SORT_BY_SCHEMA, which would be pretty unpractical. The > boolean approach works for *one* sorting extension and breaks at the next > extension. > > Also, the boolean does not say that it is a descending order. I could be > interested in looking at the small tables. > > Another benefit for me is that I do not like much variables with side > effects, whereas with an explicit syntax there would be no such thing, the > user has what was asked for. Ok, psql is full of them, but I cannot say I > like it for that. > > The approach could be extended to specify a limit, eg \dt -l 10 would > add a LIMIT 10 on the query. > > Also, the implementation could be high enough so that the description > handlers would not have to deal with it individually, it could return > the query which would then be completed with SORT/LIMIT clauses before > being executed, possibly with a default order if none is specified. > > -- > Fabien. > > > diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index 7789fc6177..40aeb8cef0 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -3973,6 +3973,17 @@ bar + +SORT_BY_SIZE + + +Sorts \d[bimtv]+ and \l+ +outputs by decreasing size (when size is displayed). When size +is not displayed, then output is sorted by name. 
+ + + + SQLSTATE diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 8b4cd53631..9a7f02607c 100644 --- a/src/bin/psql/describe.c +++ b/src/bin/psql/describe.c @@ -223,6 +223,7 @@ describeTablespaces(const char *pattern, bool verbose) PQExpBufferData buf; PGresult *res; printQueryOpt myopt = pset.popt; + const char *sizefunc = NULL; if (pset.sversion < 8) { @@ -265,9 +266,12 @@ describeTablespaces(const char *pattern, bool verbose) gettext_noop("Options")); if (verbose && pset.sversion >= 90200) + { appendPQExpBuffer(, ",\n pg_catalog.pg_size_pretty(pg_catalog.pg_tablespace_size(oid)) AS \"%s\"", gettext_noop("Size")); + sizefunc = "pg_catalog.pg_tablespace_size(oid)"; + } if (verbose && pset.sversion >= 80200) appendPQExpBuffer(, @@ -281,7 +285,10 @@ describeTablespaces(const char *pattern, bool verbose) NULL, "spcname", NULL, NULL); - appendPQExpBufferStr(, "ORDER BY 1;"); + if (pset.sort_by_size && sizefunc) + appendPQExpBuffer(, "ORDER BY %s DESC, 1;", sizefunc); + else + appendPQExpBufferStr(, "ORDER BY 1;"); res = PSQLexec(buf.data); termPQExpBuffer(); @@ -863,6 +870,7 @@ listAllDbs(const char *pattern, bool verbose) PGresult *res; PQExpBufferData buf; printQueryOpt myopt = pset.popt; + const char *sizefunc = NULL; initPQExpBuffer(); @@ -882,12 +890,15 @@ listAllDbs(const char *pattern, bool
Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Thanks, Thomas.

On Mon, Jul 8, 2019 at 6:47 AM Thomas Munro wrote:
> On Mon, Feb 18, 2019 at 7:39 PM Hubert Zhang wrote:
> > Based on the assumption we use smgr as hook position, hook API option1
> > or option2 which is better?
> > Or we could find some balanced API between option1 and option2?
> >
> > Again comments on other better hook positions are also appreciated!
>
> Hi Hubert,
>
> The July Commitfest is now running, and this entry is in "needs
> review" state. Could you please post a rebased patch?
>
> I have questions about how disk quotas should work and I think we'll
> probably eventually want more hooks than these, but simply adding
> these hooks so extensions can do whatever they want doesn't seem very
> risky for core. I think it's highly likely that the hook signatures
> will have to change in future releases too, but that seems OK for such
> detailed internal hooks. As for your question, my first reaction was
> that I preferred your option 1, because SMgrRelation seems quite
> private and there are no existing examples of that object being
> exposed to extensions. But on reflection, other callbacks don't take
> such a mollycoddling approach. So my vote is for option 2 "just pass
> all the arguments to the callback", which I understand to be the
> approach of patch you have posted.
>
> +if (smgrcreate_hook)
> +{
> +(*smgrcreate_hook)(reln, forknum, isRedo);
> +}
>
> Usually we don't use curlies for single line if branches.

I have rebased the patch to v4 and removed the unnecessary braces. As per
your comments, option 2 is still used in patch v4.

I agree that the diskquota extension may use more hooks in future. Currently
the behavior of the diskquota extension is that we use the smgr hooks to
detect active tables and record them in shared memory. Bgworkers of the
diskquota extension will read these active tables from shared memory,
calculate the latest table sizes, and sum them into the size of each schema
or role. If the size of a schema or role exceeds its quota limit, it will be
put into a black list in shared memory. When a new query comes,
ExecutorCheckPerms_hook will be used to check the black list and cancel the
query if needed.

--
Thanks
Hubert Zhang

disk_quota_hooks_v4.patch
Description: Binary data
doc: mention pg_reload_conf() in pg_hba.conf documentation
Hi I noticed the documentation for pg_hba.conf: https://www.postgresql.org/docs/current/auth-pg-hba-conf.html says: you will need to signal the postmaster (using pg_ctl reload or kill -HUP) to make it re-read the file. It would be useful to mention pg_reload_conf() as another option here, as done elsewhere in the docs. Patch with suggested change attached. Regards Ian Barwick -- Ian Barwick https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml new file mode 100644 index 3ed74d8..c90ca0b *** a/doc/src/sgml/client-auth.sgml --- b/doc/src/sgml/client-auth.sgml *** hostnogssenc databaseSIGHUPSIGHUP signal. If you edit the file on an active system, you will need to signal the postmaster !(using pg_ctl reload or kill -HUP) to make it !re-read the file. --- 650,658 SIGHUPSIGHUP signal. If you edit the file on an active system, you will need to signal the postmaster !(using pg_ctl reload, kill -HUP !or by calling the SQL function pg_reload_conf()) !to make it re-read the file.
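For completeness, a quick illustration of the workflow the suggested wording
documents (nothing here beyond what the existing functions already do):

  -- after editing pg_hba.conf, from a superuser session:
  SELECT pg_reload_conf();
  -- equivalent to "pg_ctl reload" or sending SIGHUP to the postmaster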
Re: Commitfest 2019-07, the first of five* for PostgreSQL 13
Hello hackers,

Here's a quick update at the end of the second week of CF1.

 status                 | w1  | w2
------------------------+-----+-----
 Committed              |  32 |  41
 Moved to next CF       |   5 |   6
 Needs review           | 146 | 128
 Ready for Committer    |   7 |   9
 Rejected               |   2 |   2
 Returned with feedback |   2 |   2
 Waiting on Author      |  29 |  35
 Withdrawn              |   8 |   8

It looks like we continued our commit rate of around 10/week, punted one to
the next CF, and returned/rejected nothing.

Last week I highlighted 20 'Needs review' patches whose threads hadn't seen
traffic for the longest time as places that could use some attention if our
goal is to move all of these patches closer to their destiny. A few of them
made some progress and one was committed. Here are another 20 like that --
these are threads that have been silent for 24 to 90 days. That means they
mostly apply and pass basic testing (or I'd probably have reported the
failure on the thread and they wouldn't be on this list). Which means you can
test them!

 2080 | Minimizing pg_stat_statements performanc | {"Raymond Martin"}
 2103 | Fix failure of identity columns if there | {"Laurenz Albe"}
 1472 | SQL/JSON: functions                      | {"Fedor Sigaev","Alexander Korotkov","Nikita Glukhov","Oleg Bartunov"}
 2124 | Introduce spgist quadtree @<(point,circl | {"Matwey V. Kornilov"}
 1306 | pgbench - another attempt at tap test fo | {"Fabien Coelho"}
 2126 | Rearrange postmaster startup order to im | {"Tom Lane"}
 2128 | Fix issues with "x SIMILAR TO y ESCAPE N | {"Tom Lane"}
 2102 | Improve Append/MergeAppend EXPLAIN outpu | {"David Rowley"}
 1774 | Block level parallel vacuum              | {"Masahiko Sawada"}
 2086 | pgbench - extend initialization phase co | {"Fabien Coelho"}
 1348 | BRIN bloom and multi-minmax indexes      | {"Tomas Vondra"}
 2183 | Opclass parameters                       | {"Nikita Glukhov"}
 1854 | libpq trace log                          | {"Aya Iwata"}
 2147 | Parallel grouping sets                   | {"Richard Guo"}
 2148 | vacuumlo: report the number of large obj | {"Timur Birsh"}
 1984 | Fix performance issue in foreign-key-awa | {"David Rowley"}
 1911 | anycompatible and anycompatiblearray pol | {"Pavel Stehule"}
 2048 | WIP: Temporal primary and foreign keys   | {"Paul Jungwirth"}
 2160 | Multi insert in CTAS/MatView             | {"Paul Guo","Taylor Vesely"}
 2154 | Race conditions with TAP test for syncre | {"Michael Paquier"}

--
Thomas Munro
https://enterprisedb.com
Re: [PATCH] Implement uuid_version()
On 7/14/19 9:40 PM, Peter Eisentraut wrote: On 2019-07-13 17:13, Fabien COELHO wrote: What about avoiding a redirection with something like: Datum (* const pg_random_uuid)(PG_FUNCTION_ARGS) = gen_random_uuid; That seems very confusing. Dunno. Possibly. The user does not have to look at the implementation, and probably such code would deserve a comment. The point is to avoid one call so as to perform the same (otherwise the pg_random_uuid would be slightly slower), and to ensure that it behaves the same, as it would be the very same function by construction. I've switched the patch to ready anyway. committed Small doc tweak suggestion - the pgcrypto docs [1] now say about gen_random_uuid(): Returns a version 4 (random) UUID. (Obsolete, this function is now also included in core PostgreSQL.) which gives the impression the code contains two versions of this function, the core one and an obsolete one in pgcrypto. Per the commit message the situation is actually: The pgcrypto implementation now internally redirects to the built-in one. Suggested wording improvement in the attached patch. [1] https://www.postgresql.org/docs/devel/pgcrypto.html#id-1.11.7.34.9 Regards Ian Barwick -- Ian Barwick https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services diff --git a/doc/src/sgml/pgcrypto.sgml b/doc/src/sgml/pgcrypto.sgml new file mode 100644 index 0acd11e..f636703 *** a/doc/src/sgml/pgcrypto.sgml --- b/doc/src/sgml/pgcrypto.sgml *** gen_random_bytes(count integer) returns *** 1132,1139 gen_random_uuid() returns uuid !Returns a version 4 (random) UUID. (Obsolete, this function is now also !included in core PostgreSQL.) --- 1132,1141 gen_random_uuid() returns uuid !Returns a version 4 (random) UUID. (This function now redirects internally !to the core PostgreSQL function providing !the same functionality and is included in the extension for backwards compatibility; !see for details.)
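A quick check of the behaviour the suggested wording documents (the explicit
schema qualification of the pgcrypto copy is only for illustration and
assumes the extension lives in public):

  SELECT gen_random_uuid();            -- built-in core function (devel branch)
  CREATE EXTENSION IF NOT EXISTS pgcrypto;
  SELECT public.gen_random_uuid();     -- pgcrypto wrapper, redirects to the built-in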
Re: Re: SQL/JSON: functions
On Tue, May 14, 2019 at 12:54 PM Andrew Alsup wrote: > On 3/5/19 5:35 PM, Nikita Glukhov wrote: > > Attached 36th version of the patches rebased onto jsonpath v36. > While testing this patch a found a few issues: > > [1] I was not able to apply the patch to the current HEAD. However, it > applies cleanly to commit: e988878f85 (NOTE: I did not investigate which > commit between e988878f85 and HEAD caused problems). Thanks for that note, which should help other reviewers/testers looking a this patch set in CF1. I hope we can eventually get a rebased patch set, though. -- Thomas Munro https://enterprisedb.com
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
On Sun, Jul 14, 2019 at 02:38:40PM -0400, James Coleman wrote: On Tue, Jul 9, 2019 at 10:54 AM Tomas Vondra wrote: On Tue, Jul 09, 2019 at 09:28:42AM -0400, James Coleman wrote: >On Mon, Jul 8, 2019 at 9:37 PM Tomas Vondra > wrote: >> >> On Mon, Jul 08, 2019 at 12:07:06PM -0400, James Coleman wrote: >> > ... >> > >> >I guess I'm still not following. If (2) is responsible (currently) for >> >adding an explicit sort, why wouldn't adding an incremental sort be an >> >example of that responsibility? The subpath that either a Sort or >> >IncrementalSort is being added on top of (to then feed into the >> >GatherMerge) is the same in both cases right? >> > >> >Unless you're saying that the difference is that since we have a >> >partial ordering already for incremental sort then incremental sort >> >falls into the category of "maintaining" existing ordering of the >> >subpath? >> > >> >> Oh, I think I understand what you're saying. Essentially, we should not >> be making generate_gather_paths responsible for adding the incremental >> sort. Instead, we should be looking at places than are adding explicit >> sort (using create_sort_path) and also consider adding incremental sort. >> >> I definitely agree with the second half - we should look at all places >> that create explicit sorts and make them also consider incremental >> sorts. That makes sense. > >Yep, exactly. > If I remember correctly, one of the previous patch versions (in the early 2018 commitfests) actually modified many of those places, but it did that in a somewhat "naive" way. It simply used incremental sort whenever the path was partially sorted, or something like that. So it did not use costing properly. There was an attempt to fix that in the last commitfest but the costing model was deemed to be a bit too rough and unreliable (especially the ndistinct estimates can be quite weak), so the agreement was to try salvaging the patch for PG11 by only considering incremental sort in "safest" places with greatest gains. We've significantly improved the costing model since then, and the implementation likely handles the corner cases much better. But that does not mean we have to introduce the incremental sort to all those places at once - it might be wise to split that into separate patches. Yes, although we haven't added the MCV checking yet; that's on my mental checklist, but another new area of the codebase for me to understand, so I've been prioritizing other parts of the patch. Sure, no problem. It's not just about picking the right plan - we've kinda what impact these extra paths might have on planner performance, so maybe we should look at that too. And the impact might be different for each of those places. I'll leave that up to you, but I certainly won't insist on doing it all in one huge patch. I'm not opposed to handling some of them separately, but I would like to at least hit the places where it's most likely (for example, with LIMIT) to improve things. I supposed I'll have to look at all of the usages of create_sort_path() and try to rank them in terms of perceived likely value. Yep, makes sense. >> But I'm not sure it'll address all cases - the problem is that those >> places add the explicit sort because they need sorted input. Gather >> Merge does not do that, it only preserves existing ordering of paths. >> >> So it's possible the path does not have an explicit sort on to, and >> gather merge will not know to add it. And once we have the gather merge >> in place, we can't push place "under" it. 
> >That's the explanation I was missing; and it makes sense (to restate: >sometimes sorting is useful even when not required for correctness of >the user returned data). > Yes, although even when the sorting is required for correctness (because the user specified ORDER BY) you can do it at different points in the plan. We'd still produce correct results, but the sort might be done at the very end without these changes. For example we might end up with plans Incremental Sort (presorted: a, path keys: a,b) -> Gather Merge (path keys: a) -> Index Scan (path keys: a) but with those changes we might push the incremental sort down into the parallel part: Gather Merge (path keys: a,b) -> Incremental Sort (presorted: a, path keys: a,b) -> Index Scan (path keys: a) which is likely better. Perhaps that's what you meant, though. I was thinking of ordering being useful for grouping/aggregation or merge joins; I didn't realize the above plan wasn't possible yet, so that explanation is helpful. >> And I think I know why is that - while gather_grouping_paths() tries to >> add explicit sort below the gather merge, there are other places that >> call generate_gather_paths() that don't do that. In this case it's >> probably apply_scanjoin_target_to_paths() which simply builds >> >>parallel (seq|index) scan + gather merge >> >> and that's it. The problem is likely the same - the code does not know >>
Re: Change ereport level for QueuePartitionConstraintValidation
On Mon, 15 Jul 2019 at 11:46, Thomas Munro wrote:
>
> On Tue, Jul 2, 2019 at 12:17 AM Sergei Kornilov wrote:
> > This change is discussed as open item for pg12. Seems we have nor
> > objections nor agreement. I attached updated version due merge conflict.
> >
> > > Per discussion started here:
> > > https://www.postgresql.org/message-id/CA%2BTgmoZWSLUjVcc9KBSVvbn%3DU5QRgW1O-MgUX0y5CnLZOA2qyQ%40mail.gmail.com
>
> I took the liberty of setting this to "Ready for Committer" to see if
> we can get a decision one way or another and clear both a Commitfest
> item and a PG12 Open Item. No committer is signed up, but it looks
> like Amit L wrote the messages in question, Robert committed them, and
> David made arguments for AND against on the referenced thread, so I'm
> CCing them, and retreating to a safe distance.

I think the only argument against it was around lack of ability to test if
the constraint was used to verify no row breaks the partition bound during
the ATTACH PARTITION.

Does anyone feel strongly that we need the test to confirm that the
constraint was used for this? If nobody feels so strongly about that then I
say we can just push this. It seems something that's unlikely to get broken,
but then you could probably say that for most things our tests test for.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
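For context, a minimal sketch of the case whose message level is being
discussed (table and constraint names are made up, and the exact message text
may differ):

  CREATE TABLE parent (x int) PARTITION BY RANGE (x);
  CREATE TABLE part1 (x int NOT NULL,
                      CONSTRAINT x_bounds CHECK (x >= 1 AND x < 100));
  -- The existing constraints already imply the partition bound, so the
  -- validation scan can be skipped; that is what the message whose level is
  -- under discussion reports.
  ALTER TABLE parent ATTACH PARTITION part1 FOR VALUES FROM (1) TO (100);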
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On Sun, Jul 14, 2019 at 12:13:45PM -0400, Joe Conway wrote: On 7/13/19 5:58 PM, Tomas Vondra wrote: On Sat, Jul 13, 2019 at 02:41:34PM -0400, Joe Conway wrote: [2] also says provides additional support for AES 256. It also mentions CBC versus XTS -- I came across this elsewhere and it bears discussion: "Currently, the following pairs of encryption modes are supported: AES-256-XTS for contents and AES-256-CTS-CBC for filenames AES-128-CBC for contents and AES-128-CTS-CBC for filenames Adiantum for both contents and filenames If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair. AES-128-CBC was added only for low-powered embedded devices with crypto accelerators such as CAAM or CESA that do not support XTS." --- [2] also states this, which again makes me think in terms of table being the moral equivalent to a file: "Unlike dm-crypt, fscrypt operates at the filesystem level rather than at the block device level. This allows it to encrypt different files with different keys and to have unencrypted files on the same filesystem. This is useful for multi-user systems where each user’s data-at-rest needs to be cryptographically isolated from the others. However, except for filenames, fscrypt does not encrypt filesystem metadata." [5] has this to say which seems independent of mode: "When encrypting data with a symmetric block cipher, which uses blocks of n bits, some security concerns begin to appear when the amount of data encrypted with a single key comes close to 2n/2 blocks, i.e. n*2n/2 bits. With AES, n = 128 (AES-128, AES-192 and AES-256 all use 128-bit blocks). This means a limit of more than 250 millions of terabytes, which is sufficiently large not to be a problem. That's precisely why AES was defined with 128-bit blocks, instead of the more common (at that time) 64-bit blocks: so that data size is practically unlimited." FWIW I was a bit confused at first, because the copy paste mangled the formulas a bit - it should have been 2^(n/2) and n*2^(n/2). Yeah, sorry about that. But goes on to say: "I wouldn't use n*2^(n/2) bits in any sort of recommendation. Once you reach that number of bits the probability of a collision will grow quickly and you will be way over 50% probability of a collision by the time you reach 2*n*2^(n/2) bits. In order to keep the probability of a collision negligible I recommend encrypting no more than n*2^(n/4) bits with the same key. In the case of AES that works out to 64GB" It is hard to say if that recommendation is per key or per key+IV. Hmm, yeah. The question is what collisions they have in mind? Presumably it's AES(block1,key) = AES(block2,key) in which case it'd be with fixed IV, so per key+IV. Seems likely. But I did find that files in an encrypted file system are encrypted with derived keys from a master key, and I view this as analogous to what we are doing. My understanding always was that we'd do something like that, i.e. we'd have a master key (or perhaps multiple of them, for various users), but the data would be encrypted with secondary (generated) keys, and those secondary keys would be encrypted by the master key. At least that's what was proposed at the beginning of this thread by Insung Moon. In my email I linked the wrong page for [2]. The correct one is here: [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html Following that, I think we could end up with three tiers: 1. A master key encryption key (KEK): this is the ley supplied by the database admin using something akin to ssl_passphrase_command 2. 
A master data encryption key (MDEK): this is a generated key using a cryptographically secure pseudo-random number generator. It is encrypted using the KEK, probably with Key Wrap (KW): or maybe better Key Wrap with Padding (KWP): 3a. Per table data encryption keys (TDEK): use MDEK and HKDF to generate table specific keys. 3b. WAL data encryption keys (WDEK): Similarly use MDEK and a HKDF to generate new keys when needed for WAL (based on the other info we need to change WAL keys every 68 GB unless I read that wrong). I believe that would allows us to have multiple keys but they are derived securely from the one DEK using available info similar to the way we intend to use LSN to derive the IVs -- perhaps table.oid for tables and something else for WAL. We also need to figure out how/when to generate new WDEK. Maybe every checkpoint, also meaning we would have to force a checkpoint every 68GB? I think that very much depends on what exactly the 68GB refers to - key or key+IV? If key+IV, then I suppose we can use LSN as IV and we would not need to change checkpoints. But it's not clear to me why we would need to force checkpoints at all? Surely we can just write a WAL message about switching to the new key, or something like that? [HKDF]: https://tools.ietf.org/html/rfc5869 [KW]: https://tools.ietf.org/html/rfc3394 [KWP]: https://tools.ietf.org/html/rfc5649 But
Re: Using unique btree indexes for pathkeys with one extra column
On Mon, 15 Jul 2019 at 12:59, Thomas Munro wrote: > In the category "doing more tricks with our existing btrees", which > includes all that difficult stuff like skip scans and incremental > sort, here's an easier planner-only one: if you have a unique index > on (a) possibly "including" (b) and you have a pathkey (a, b), you can > use an index [only] scan. That is, if the index is unique, and you > want exactly one extra column in index order, then you don't need any > extra sorting to get (a, b) in order. (If the index is not unique, or > there is more than one extra trailing column in the pathkey, you need > the incremental sort patch[1] to use this index). This was brought to > my attention by a guru from a different RDBMS complaining about stupid > stuff that PostgreSQL does and I was triggered to write this message > as a kind of TODO note... > > [1] https://commitfest.postgresql.org/24/1124/ This is one of the problems I've wanted to solve in the various times I've mentioned the word "UniqueKeys" on this mailing list. Probably my most detailed explanation is in https://www.postgresql.org/message-id/CAKJS1f86FgODuUnHiQ25RKeuES4qTqeNxm1QbqJWrBoZxVGLiQ%40mail.gmail.com Without detecting the UniqueKeys through joins then the optimisation you mention is limited to just single rel queries, since a join may duplicate the "a" column and make it so the sort on "b" is no longer redundant. In my view, limiting this to just single relation queries is just too restrictive to bother writing any code for, so I think do to as you mention we need the full-blown thing I mention in the link above. i.e tagging a list of UniqueKeys onto RelOptInfo and checking which ones are still applicable after joins and tagging those onto join RelOptInfos too. PathKey redundancy could then take into account that list of UniqueKeys the RelOptInfo level. At the top-level plan, you can do smarts for ORDER BY / GROUP BY / DISTINCT. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
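As an illustration of the join complication mentioned above (names are made
up, and this is only a sketch of one way the single-relation reasoning breaks
down): even though t1.a is unique, after a join the index ordering on (a)
says nothing about the order of columns coming from the other relation, so a
sort is still needed:

  CREATE UNIQUE INDEX ON t1 (a);
  SELECT t1.a, t2.y
  FROM t1 JOIN t2 ON t2.x = t1.a   -- t2 may have many rows per value of a
  ORDER BY t1.a, t2.y;             -- rows within each a still need sorting on y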
Re: refactoring - share str2*int64 functions
On Sun, Jul 14, 2019 at 3:22 AM Fabien COELHO wrote: > Here is the updated patch on which I checked "make check-world". Thanks! So, we're moving pg_strtouint64() to a place where frontend code can use it, and getting rid of some duplication. I like it. I wanted this once before myself[1]. +extern bool pg_strtoint64(const char *str, bool errorOK, int64 *result); +extern uint64 pg_strtouint64(const char *str, char **endptr, int base); One of these things is not like the other. Let's see... the int64 version is used only by pgbench and is being promoted to common where it can be used by more code. With a name like that, wouldn't it make sense to bring it into line with the uint64 interface, and then move pgbench's error reporting stuff back into pgbench? The uint64 one derives its shape from the family of standard functions like strtol() so I think it wins. [1] https://www.postgresql.org/message-id/CAEepm=2kec8xdbewgdtdobxgqphfw4kcd7bzxr6nmfihjjn...@mail.gmail.com -- Thomas Munro https://enterprisedb.com
Re: [sqlsmith] Crash in mcv_get_match_bitmap
OK, attached is a sequence of WIP fixes for the issues discussed here. 1) using column-specific collations (instead of type/default ones) The collations patch is pretty simple, but I'm not sure it actually does the right thing, particularly during estimation where it uses collation from the Var node (varcollid). But looking at 5e0928005, this should use the same collation as when building the extended statistics (which we get from the per-column stats, as stored in pg_statistic.stacoll#). But we don't actually store collations for extended statistics, so we can either modify pg_statistic_ext_data and store it there, or lookup the per-column statistic info during estimation, and use that. I kinda think the first option is the right one, but that'd mean yet another catversion bump. OTOH 5e0928005 actually did modify the extended statistics (mvdistinct and dependencies) to use type->typcollation during building, so maybe we want to use the default type collation for some reason? 2) proper extraction of Var/Const from opclauses This is the primary issue discussed in this thread - I've renamed the function to examine_opclause_expression() so that it kinda resembles examine_variable() and I've moved it to the "internal" header file. We still need it from two places so it can't be static, but hopefully this naming is acceptable. 3) handling of NULL values (Const and MCV items) Aside from the issue that Const may represent NULL, I've realized the code might do the wrong thing for NULL in the MCV item itself. It did treat it as mismatch and update the bitmap, but it might have invoke the operator procedure anyway (depending on whether it's AND/OR clause, what's the current value in the bitmap, etc.). This patch should fix both issues by treating them as mismatch, and skipping the proc call. 4) refactoring of the bitmap updates This is not a bug per se, but while working on (3) I've realized the code updating the bitmap is quite repetitive and it does stuff like if (is_or) bitmap[i] = Max(bitmap[i], match) else bitmap[i] = Min(bitmap[i], match) over and over on various places. This moves this into a separate static function, which I think makes it way more readable. Also, it replaces the Min/Max with a plain boolean operators (the patch originally used three states, not true/false, hence the Min/Max). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >From 1586963befe8fe7de473a28515bd7676fa2d0acd Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Sun, 14 Jul 2019 14:19:01 +0200 Subject: [PATCH 1/5] Use proper collations in extended statistics The extended statistics code was a bit confused about which collation to use when building the statistics and when computing the estimates. For building it used a default collation for each data type, while for estimation it used DEFAULT_COLLATION_OID. That's clearly inconsistent. Commit 5e0928005 changed how this works for per-column statistics, in which case we now use collation specified for each column - both for building the statistics and selectivity estimation. This commit adopts the same approach for extended statistics. Note: One issue is that for per-column statistics we store collation in pg_statistic catalog, but we don't store this in pg_statistic_ext. So we'd have to either add another column into the catalog (which is probably the right thing to do) or rely on info from pg_statistic. 
But we probably need to add this into pg_statistic_ext, to allow stats on expressions, or extended statistics with different collations. --- src/backend/statistics/mcv.c | 16 +--- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c index 913a72ff67..2e375edcb4 100644 --- a/src/backend/statistics/mcv.c +++ b/src/backend/statistics/mcv.c @@ -348,7 +348,7 @@ build_mss(VacAttrStats **stats, int numattrs) elog(ERROR, "cache lookup failed for ordering operator for type %u", colstat->attrtypid); - multi_sort_add_dimension(mss, i, type->lt_opr, type->typcollation); + multi_sort_add_dimension(mss, i, type->lt_opr, colstat->attrcollid); } return mss; @@ -668,7 +668,7 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats) /* sort and deduplicate the data */ ssup[dim].ssup_cxt = CurrentMemoryContext; - ssup[dim].ssup_collation = DEFAULT_COLLATION_OID; + ssup[dim].ssup_collation = stats[dim]->attrcollid; ssup[dim].ssup_nulls_first = false; PrepareSortSupportFromOrderingOp(typentry->lt_opr, [dim]); @@ -1577,8 +1577,6 @@ mcv_get_match_bitmap(PlannerInfo *root, List *clauses, if (ok) { - TypeCacheEntry
Using unique btree indexes for pathkeys with one extra column
Hello, In the category "doing more tricks with our existing btrees", which includes all that difficult stuff like skip scans and incremental sort, here's an easier planner-only one: if you have a unique index on (a) possibly "including" (b) and you have a pathkey (a, b), you can use an index [only] scan. That is, if the index is unique, and you want exactly one extra column in index order, then you don't need any extra sorting to get (a, b) in order. (If the index is not unique, or there is more than one extra trailing column in the pathkey, you need the incremental sort patch[1] to use this index). This was brought to my attention by a guru from a different RDBMS complaining about stupid stuff that PostgreSQL does and I was triggered to write this message as a kind of TODO note... [1] https://commitfest.postgresql.org/24/1124/ -- Thomas Munro https://enterprisedb.com
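To make the idea concrete, a minimal sketch with made-up names (the index
form assumes the PostgreSQL 11+ INCLUDE syntax):

  CREATE TABLE t (a int, b text, c text);
  CREATE UNIQUE INDEX t_a_idx ON t (a) INCLUDE (b);

  -- Because a is unique, no two rows share an a value, so index order on (a)
  -- is already order on (a, b); in principle this could be an index-only
  -- scan with no Sort node:
  SELECT a, b FROM t ORDER BY a, b;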
Re: Change ereport level for QueuePartitionConstraintValidation
On Tue, Jul 2, 2019 at 12:17 AM Sergei Kornilov wrote: > This change is discussed as open item for pg12. Seems we have nor objections > nor agreement. I attached updated version due merge conflict. > > > Per discussion started here: > > https://www.postgresql.org/message-id/CA%2BTgmoZWSLUjVcc9KBSVvbn%3DU5QRgW1O-MgUX0y5CnLZOA2qyQ%40mail.gmail.com I took the liberty of setting this to "Ready for Committer" to see if we can get a decision one way or another and clear both a Commitfest item and a PG12 Open Item. No committer is signed up, but it looks like Amit L wrote the messages in question, Robert committed them, and David made arguments for AND against on the referenced thread, so I'm CCing them, and retreating to a safe distance. -- Thomas Munro https://enterprisedb.com
Re: Built-in connection pooler
On Mon, Jul 15, 2019 at 7:56 AM Konstantin Knizhnik wrote: > Can you please explain me more precisely how to reproduce the problem > (how to configure postgres and how to connect to it)? Maybe it's just that postmaster.c's ConnCreate() always allocates a struct for port->gss if the feature is enabled, but the equivalent code in proxy.c's proxy_loop() doesn't? ./configure --prefix=$HOME/install/postgres \ --enable-cassert \ --enable-debug \ --enable-depend \ --with-gssapi \ --with-icu \ --with-pam \ --with-ldap \ --with-openssl \ --enable-tap-tests \ --with-includes="/opt/local/include" \ --with-libraries="/opt/local/lib" \ CC="ccache cc" CFLAGS="-O0" ~/install/postgres/bin/psql postgres -h localhost -p 6543 2019-07-15 08:37:25.348 NZST [97972] LOG: server process (PID 97974) was terminated by signal 11: Segmentation fault: 11 (lldb) bt * thread #1, stop reason = signal SIGSTOP * frame #0: 0x000104163e7e postgres`secure_read(port=0x7fda9ef001d0, ptr=0x0001047ab690, len=8192) at be-secure.c:164:6 frame #1: 0x00010417760d postgres`pq_recvbuf at pqcomm.c:969:7 frame #2: 0x0001041779ed postgres`pq_getbytes(s="??\x04~?, len=1) at pqcomm.c:1110:8 frame #3: 0x000104284147 postgres`ProcessStartupPacket(port=0x7fda9ef001d0, secure_done=true) at postmaster.c:2064:6 ... (lldb) f 0 frame #0: 0x000104163e7e postgres`secure_read(port=0x7fda9ef001d0, ptr=0x0001047ab690, len=8192) at be-secure.c:164:6 161 else 162 #endif 163 #ifdef ENABLE_GSS -> 164 if (port->gss->enc) 165 { 166 n = be_gssapi_read(port, ptr, len); 167 waitfor = WL_SOCKET_READABLE; (lldb) print port->gss (pg_gssinfo *) $0 = 0x > > Obviously your concept of tainted backends (= backends that can't be > > reused by other connections because they contain non-default session > > state) is quite simplistic and would help only the very simplest use > > cases. Obviously the problems that need to be solved first to do > > better than that are quite large. Personally I think we should move > > all GUCs into the Session struct, put the Session struct into shared > > memory, and then figure out how to put things like prepared plans into > > something like Ideriha-san's experimental shared memory context so > > that they also can be accessed by any process, and then we'll mostly > > be tackling problems that we'll have to tackle for threads too. But I > > think you made the right choice to experiment with just reusing the > > backends that have no state like that. > > That was not actually my idea: it was proposed by Dimitri Fontaine. > In PgPRO-EE we have another version of builtin connection pooler > which maintains session context and allows to use GUCs, prepared statements > and temporary tables in pooled sessions. Interesting. Do you serialise/deserialise plans and GUCs using machinery similar to parallel query, and did you changed temporary tables to use shared buffers? People have suggested that we do that to allow temporary tables in parallel queries too. FWIW I have also wondered about per (database, user) pools for reusable parallel workers. > But the main idea of this patch was to make connection pooler less > invasive and > minimize changes in Postgres core. This is why I have implemented proxy. How do you do it without a proxy? One idea I've wondered about that doesn't involve feeding all network IO through an extra process and extra layer of syscalls like this is to figure out how to give back your PGPROC slot when idle. If you don't have one and can't acquire one at the beginning of a transaction, you wait until one becomes free. 
When idle and not in a transaction you give it back to the pool, perhaps after a slight delay. That may be impossible for some reason or other, I don't know. If it could be done, it'd deal with the size-of-procarray problem (since we like to scan it) and provide a kind of 'admission control' (limiting number of transactions that can run), but wouldn't deal with the amount-of-backend-memory-wasted-on-per-process-caches problem (maybe solvable by shared caching). Ok, time for a little bit more testing. I was curious about the user experience when there are no free backends. 1. I tested with connection_proxies=1, max_connections=4, and I began 3 transactions. Then I tried to make a 4th connection, and it blocked while trying to connect and the log shows a 'sorry, too many clients already' message. Then I committed one of my transactions and the 4th connection was allowed to proceed. 2. I tried again, this time with 4 already existing connections and no transactions. I began 3 concurrent transactions, and then when I tried to begin a 4th transaction the BEGIN command blocked until one of the other transactions committed. Some thoughts: Why should case 1 block? Shouldn't I be allowed to connect, even though I can't begin a
Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT
On Mon, Jul 1, 2019 at 6:21 AM Thomas Munro wrote: > > Hi Matheus, > > As the commitfest is starting, could you please send a rebased patch? > > Hi all, Glad to start working on that again... Follows the rebased version (at 5925e55498). Thank you all. Best regards, -- Matheus de Oliveira diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 90bf19564c..198c640b98 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -55,7 +55,9 @@ ALTER TABLE [ IF EXISTS ] name ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN } ADD table_constraint [ NOT VALID ] ADD table_constraint_using_index -ALTER CONSTRAINT constraint_name [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] +ALTER CONSTRAINT constraint_name + [ ON DELETE action ] [ ON UPDATE action ] + [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ] VALIDATE CONSTRAINT constraint_name DROP CONSTRAINT [ IF EXISTS ] constraint_name [ RESTRICT | CASCADE ] DISABLE TRIGGER [ trigger_name | ALL | USER ] diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 0f1a9f0e54..b54c3d67c5 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -8979,8 +8979,43 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd, errmsg("constraint \"%s\" of relation \"%s\" is not a foreign key constraint", cmdcon->conname, RelationGetRelationName(rel; + /* + * Verify for FKCONSTR_ACTION_UNKNOWN, if found, replace by current + * action. We could handle FKCONSTR_ACTION_UNKNOWN bellow, but since + * we already have to handle the case of changing to the same action, + * seems simpler to simple replace FKCONSTR_ACTION_UNKNOWN by the + * current action here. + */ + if (cmdcon->fk_del_action == FKCONSTR_ACTION_UNKNOWN) + cmdcon->fk_del_action = currcon->confdeltype; + + if (cmdcon->fk_upd_action == FKCONSTR_ACTION_UNKNOWN) + cmdcon->fk_upd_action = currcon->confupdtype; + + /* + * Do the same for deferrable attributes. But consider that if changed + * only initdeferred attribute and to true, force deferrable to be also + * true. On the other hand, if changed only deferrable attribute and to + * false, force initdeferred to be also false. + */ + if (!cmdcon->was_deferrable_set) + cmdcon->deferrable = cmdcon->initdeferred ? true : currcon->condeferrable; + + if (!cmdcon->was_initdeferred_set) + cmdcon->initdeferred = !cmdcon->deferrable ? false : currcon->condeferred; + + /* + * This is a safe check only, should never happen here. 
+ */ + if (cmdcon->initdeferred && !cmdcon->deferrable) + ereport(ERROR, +(errcode(ERRCODE_SYNTAX_ERROR), + errmsg("constraint declared INITIALLY DEFERRED must be DEFERRABLE"))); + if (currcon->condeferrable != cmdcon->deferrable || - currcon->condeferred != cmdcon->initdeferred) + currcon->condeferred != cmdcon->initdeferred || + currcon->confdeltype != cmdcon->fk_del_action || + currcon->confupdtype != cmdcon->fk_upd_action) { HeapTuple copyTuple; HeapTuple tgtuple; @@ -8998,6 +9033,8 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd, copy_con = (Form_pg_constraint) GETSTRUCT(copyTuple); copy_con->condeferrable = cmdcon->deferrable; copy_con->condeferred = cmdcon->initdeferred; + copy_con->confdeltype = cmdcon->fk_del_action; + copy_con->confupdtype = cmdcon->fk_upd_action; CatalogTupleUpdate(conrel, >t_self, copyTuple); InvokeObjectPostAlterHook(ConstraintRelationId, @@ -9034,23 +9071,106 @@ ATExecAlterConstraint(Relation rel, AlterTableCmd *cmd, otherrelids = list_append_unique_oid(otherrelids, tgform->tgrelid); - /* - * Update deferrability of RI_FKey_noaction_del, - * RI_FKey_noaction_upd, RI_FKey_check_ins and RI_FKey_check_upd - * triggers, but not others; see createForeignKeyActionTriggers - * and CreateFKCheckTrigger. - */ - if (tgform->tgfoid != F_RI_FKEY_NOACTION_DEL && -tgform->tgfoid != F_RI_FKEY_NOACTION_UPD && -tgform->tgfoid != F_RI_FKEY_CHECK_INS && -tgform->tgfoid != F_RI_FKEY_CHECK_UPD) -continue; - copyTuple = heap_copytuple(tgtuple); copy_tg = (Form_pg_trigger) GETSTRUCT(copyTuple); + /* + * Set deferrability here, but note that it may be overridden bellow + * if the pg_trigger entry is on the referencing table and depending + * on the action used for ON UPDATE/DELETE. But for check triggers + * (in the referenced table) it is kept as is (since ON + * UPDATE/DELETE actions makes no difference for the check + * triggers). + */ copy_tg->tgdeferrable = cmdcon->deferrable; copy_tg->tginitdeferred = cmdcon->initdeferred; + + /* + * Set ON DELETE action + */ + if (tgform->tgfoid == F_RI_FKEY_NOACTION_DEL || +tgform->tgfoid == F_RI_FKEY_RESTRICT_DEL || +tgform->tgfoid == F_RI_FKEY_CASCADE_DEL
Re: Built-in connection pooler
On 14.07.2019 8:03, Thomas Munro wrote: On Tue, Jul 9, 2019 at 8:30 AM Konstantin Knizhnik wrote: Rebased version of the patch is attached. Thanks for including nice documentation in the patch, which gives a good overview of what's going on. I haven't read any code yet, but I took it for a quick drive to understand the user experience. These are just some first impressions. I started my server with -c connection_proxies=1 and tried to connect to port 6543 and the proxy segfaulted on null ptr accessing port->gss->enc. I rebuilt without --with-gssapi to get past that. First of all a lot of thanks for your review. Sorry, I failed to reproduce the problem with GSSAPI. I have configured postgres with --with-openssl --with-gssapi and then try to connect to the serverwith psql using the following connection string: psql "sslmode=require dbname=postgres port=6543 krbsrvname=POSTGRES" There are no SIGFAULS and port->gss structure is normally initialized. Can you please explain me more precisely how to reproduce the problem (how to configure postgres and how to connect to it)? Using SELECT pg_backend_pid() from many different connections, I could see that they were often being served by the same process (although sometimes it created an extra one when there didn't seem to be a good reason for it to do that). I could see the proxy managing these connections with SELECT * FROM pg_pooler_state() (I suppose this would be wrapped in a view with a name like pg_stat_proxies). I could see that once I did something like SET foo.bar = 42, a backend became dedicated to my connection and no other connection could use it. As described. Neat. Obviously your concept of tainted backends (= backends that can't be reused by other connections because they contain non-default session state) is quite simplistic and would help only the very simplest use cases. Obviously the problems that need to be solved first to do better than that are quite large. Personally I think we should move all GUCs into the Session struct, put the Session struct into shared memory, and then figure out how to put things like prepared plans into something like Ideriha-san's experimental shared memory context so that they also can be accessed by any process, and then we'll mostly be tackling problems that we'll have to tackle for threads too. But I think you made the right choice to experiment with just reusing the backends that have no state like that. That was not actually my idea: it was proposed by Dimitri Fontaine. In PgPRO-EE we have another version of builtin connection pooler which maintains session context and allows to use GUCs, prepared statements and temporary tables in pooled sessions. But the main idea of this patch was to make connection pooler less invasive and minimize changes in Postgres core. This is why I have implemented proxy. On my FreeBSD box (which doesn't have epoll(), so it's latch.c's old school poll() for now), I see the connection proxy process eating a lot of CPU and the temperature rising. I see with truss that it's doing this as fast as it can: poll({ 13/POLLIN 17/POLLIN|POLLOUT },2,1000) = 1 (0x1) Ouch. I admit that I had the idea to test on FreeBSD because I noticed the patch introduces EPOLLET and I figured this might have been tested only on Linux. FWIW the same happens on a Mac. Yehh. This is really the problem. In addition to FreeBSD and MacOS, it also takes place at Win32. I have to think more how to solve it. We should somehow emulate "edge-triggered" more for this system. 
The source of the problem is that the proxy is reading data from one socket and writing it to another socket. If the write socket is busy, we have to wait until it is available. But at the same time there can be data available for input, so poll will return immediately unless we remove the read event for this socket. Not sure how to do this better without changing the WaitEventSet API.

C:\projects\postgresql\src\include\../interfaces/libpq/libpq-int.h(33): fatal error C1083: Cannot open include file: 'pthread-win32.h': No such file or directory (src/backend/postmaster/proxy.c) [C:\projects\postgresql\postgres.vcxproj]

I will investigate the problem with the Win32 build once I get a Win32 image for VirtualBox.

These relative includes in proxy.c are part of the problem:

#include "../interfaces/libpq/libpq-fe.h"
#include "../interfaces/libpq/libpq-int.h"

I didn't dig into this much but some first reactions:

I have added

override CPPFLAGS := $(CPPFLAGS) -I$(top_builddir)/src/port -I$(top_srcdir)/src/port

in the Makefile for postmaster in order to fix this problem (as in interface/libpq/Makefile). But it looks like that is not enough. As I wrote above, I will try to solve this problem once I get access to a Win32 environment.

1. I see that proxy.c uses libpq, and correctly loads it as a dynamic library just like postgres_fdw. Unfortunately it's part of core, so it can't use the same technique as postgres_fdw
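Returning to the poll() spinning issue: a minimal sketch of the "remove the read event while the write side is blocked" idea, expressed against the existing WaitEventSet API, could look like the following. Only ModifyWaitEvent() and the WL_SOCKET_* flags are real latch.c interface; the ProxyChannel struct, the stored event positions and flush_pending() are invented names for illustration, not the patch's code.

/*
 * Hypothetical sketch of level-triggered backpressure for the proxy:
 * while the destination socket cannot accept more data, stop polling the
 * source socket for input, so WaitEventSetWait()/poll() stops returning
 * immediately with a readable-but-unconsumable event.
 */
typedef struct ProxyChannel
{
	int		src_pos;	/* position of source socket in the WaitEventSet */
	int		dst_pos;	/* position of destination socket */
	int		pending;	/* bytes buffered but not yet written out */
} ProxyChannel;

static void flush_pending(ProxyChannel *ch);	/* hypothetical, not shown */

/* The destination reported EWOULDBLOCK: stop reading until it drains. */
static void
channel_write_blocked(WaitEventSet *wes, ProxyChannel *ch)
{
	/* Assumes an empty event mask is accepted by all implementations. */
	ModifyWaitEvent(wes, ch->src_pos, 0, NULL);
	ModifyWaitEvent(wes, ch->dst_pos, WL_SOCKET_WRITEABLE, NULL);
}

/* The destination became writable again: flush, then resume reading. */
static void
channel_writable(WaitEventSet *wes, ProxyChannel *ch)
{
	flush_pending(ch);
	if (ch->pending == 0)
	{
		ModifyWaitEvent(wes, ch->src_pos, WL_SOCKET_READABLE, NULL);
		ModifyWaitEvent(wes, ch->dst_pos, 0, NULL);
	}
}

Whether ModifyWaitEvent() tolerates an empty event mask on every implementation (epoll, poll, Win32) would need checking; if not, the proxy might have to rebuild the set or keep a cheap substitute event registered instead.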
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
On Tue, Jul 9, 2019 at 10:54 AM Tomas Vondra wrote:
>
> On Tue, Jul 09, 2019 at 09:28:42AM -0400, James Coleman wrote:
> >On Mon, Jul 8, 2019 at 9:37 PM Tomas Vondra
> > wrote:
> >>
> >> On Mon, Jul 08, 2019 at 12:07:06PM -0400, James Coleman wrote:
> >> > ...
> >> >
> >> >I guess I'm still not following. If (2) is responsible (currently) for
> >> >adding an explicit sort, why wouldn't adding an incremental sort be an
> >> >example of that responsibility? The subpath that either a Sort or
> >> >IncrementalSort is being added on top of (to then feed into the
> >> >GatherMerge) is the same in both cases right?
> >> >
> >> >Unless you're saying that the difference is that since we have a
> >> >partial ordering already for incremental sort then incremental sort
> >> >falls into the category of "maintaining" existing ordering of the
> >> >subpath?
> >> >
> >>
> >> Oh, I think I understand what you're saying. Essentially, we should not
> >> be making generate_gather_paths responsible for adding the incremental
> >> sort. Instead, we should be looking at places that are adding explicit
> >> sort (using create_sort_path) and also consider adding incremental sort.
> >>
> >> I definitely agree with the second half - we should look at all places
> >> that create explicit sorts and make them also consider incremental
> >> sorts. That makes sense.
> >
> >Yep, exactly.
> >
>
> If I remember correctly, one of the previous patch versions (in the early
> 2018 commitfests) actually modified many of those places, but it did that
> in a somewhat "naive" way. It simply used incremental sort whenever the
> path was partially sorted, or something like that. So it did not use
> costing properly. There was an attempt to fix that in the last commitfest
> but the costing model was deemed to be a bit too rough and unreliable
> (especially the ndistinct estimates can be quite weak), so the agreement
> was to try salvaging the patch for PG11 by only considering incremental
> sort in "safest" places with greatest gains.
>
> We've significantly improved the costing model since then, and the
> implementation likely handles the corner cases much better. But that does
> not mean we have to introduce the incremental sort to all those places at
> once - it might be wise to split that into separate patches.

Yes, although we haven't added the MCV checking yet; that's on my mental checklist, but another new area of the codebase for me to understand, so I've been prioritizing other parts of the patch.

> It's not just about picking the right plan - we've kinda ignored what impact these
> extra paths might have on planner performance, so maybe we should look at
> that too. And the impact might be different for each of those places.
>
> I'll leave that up to you, but I certainly won't insist on doing it all in
> one huge patch.

I'm not opposed to handling some of them separately, but I would like to at least hit the places where it's most likely (for example, with LIMIT) to improve things. I suppose I'll have to look at all of the usages of create_sort_path() and try to rank them in terms of perceived likely value.

> >> But I'm not sure it'll address all cases - the problem is that those
> >> places add the explicit sort because they need sorted input. Gather
> >> Merge does not do that, it only preserves existing ordering of paths.
> >>
> >> So it's possible the path does not have an explicit sort on top, and
> >> gather merge will not know to add it. And once we have the gather merge
> >> in place, we can't push a sort "under" it. 
> > > >That's the explanation I was missing; and it makes sense (to restate: > >sometimes sorting is useful even when not required for correctness of > >the user returned data). > > > > Yes, although even when the sorting is required for correctness (because > the user specified ORDER BY) you can do it at different points in the > plan. We'd still produce correct results, but the sort might be done at > the very end without these changes. > > For example we might end up with plans > >Incremental Sort (presorted: a, path keys: a,b) > -> Gather Merge (path keys: a) > -> Index Scan (path keys: a) > > but with those changes we might push the incremental sort down into the > parallel part: > >Gather Merge (path keys: a,b) > -> Incremental Sort (presorted: a, path keys: a,b) > -> Index Scan (path keys: a) > > which is likely better. Perhaps that's what you meant, though. I was thinking of ordering being useful for grouping/aggregation or merge joins; I didn't realize the above plan wasn't possible yet, so that explanation is helpful. > >> And I think I know why is that - while gather_grouping_paths() tries to > >> add explicit sort below the gather merge, there are other places that > >> call generate_gather_paths() that don't do that. In this case it's > >> probably apply_scanjoin_target_to_paths() which simply builds > >> > >>parallel (seq|index) scan + gather merge > >> > >> and that's it.
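To make the shape of the change being discussed concrete, a minimal sketch of the pattern — every call site that currently adds an explicit sort also considering an incremental sort when the input is already ordered on a useful prefix — might look like this. create_incremental_sort_path(), pathkeys_common() and enable_incrementalsort are assumed here to be what the patch provides; the surrounding names (rel, subpath, limit_tuples) are illustrative only, not actual planner code.

/*
 * Illustrative sketch, not the patch's actual code: where the planner
 * currently calls create_sort_path(), also consider an incremental sort
 * when a prefix of the required pathkeys is already satisfied.
 */
if (!pathkeys_contained_in(root->sort_pathkeys, subpath->pathkeys))
{
	int			n_presorted;

	/* How many leading pathkeys does the input already satisfy? */
	n_presorted = pathkeys_common(root->sort_pathkeys, subpath->pathkeys);

	/* A full sort always remains a valid option. */
	add_path(rel, (Path *)
			 create_sort_path(root, rel, subpath,
							  root->sort_pathkeys, limit_tuples));

	/* With a sorted prefix, an incremental sort may be much cheaper. */
	if (enable_incrementalsort && n_presorted > 0)
		add_path(rel, (Path *)
				 create_incremental_sort_path(root, rel, subpath,
											  root->sort_pathkeys,
											  n_presorted, limit_tuples));
}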
Re: Conflict handling for COPY FROM
I think making ERROR a reserved word is a terrible idea, and we don't need it for this feature anyway. Use a new option in the parenthesized options list instead. error_limit being an integer, please don't use it as a boolean: if (cstate->error_limit) ... Add an explicit comparison to zero instead, for code readability. Also, since each error decrements the same variable, it becomes hard to reason about the state: at the end, are we ending with the exact number of errors, or did we start with the feature disabled? I suggest that it'd make sense to have a boolean indicating whether this feature has been requested, and the integer is just the remaining allowed problems. Line 3255 or thereabouts contains an excess " char The "warn about it" comment is obsolete, isn't it? There's no warning there. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
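A sketch of the suggested state-keeping — a separate flag recording whether the option was given at all, plus a counter of remaining allowed conflicts, compared explicitly against zero — could look like this. The field names, option semantics and error message are hypothetical, not necessarily the patch's:

/* Hypothetical fields in CopyStateData; names are illustrative only. */
bool		ignore_conflicts;	/* was the option specified at all? */
int64		remaining_errors;	/* conflicts we may still skip */

/* ... in the per-row path of CopyFrom(), on a PK/exclusion conflict ... */
if (cstate->ignore_conflicts)
{
	if (cstate->remaining_errors > 0)
	{
		cstate->remaining_errors--;	/* account for the skipped row */
		/* return the conflicting row to the client or discard it, then */
		continue;					/* assumes the surrounding row loop */
	}
	ereport(ERROR,
			(errcode(ERRCODE_UNIQUE_VIOLATION),
			 errmsg("exceeded the number of allowed conflicts")));
}

With this split, "feature disabled" and "limit exhausted" are distinct states, so the end-of-copy accounting stays unambiguous.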
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On 7/13/19 5:58 PM, Tomas Vondra wrote: > On Sat, Jul 13, 2019 at 02:41:34PM -0400, Joe Conway wrote: >>[2] also says provides additional support for AES 256. It also mentions >>CBC versus XTS -- I came across this elsewhere and it bears discussion: >> >>"Currently, the following pairs of encryption modes are supported: >> >>AES-256-XTS for contents and AES-256-CTS-CBC for filenames >>AES-128-CBC for contents and AES-128-CTS-CBC for filenames >>Adiantum for both contents and filenames >> >>If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair. >> >>AES-128-CBC was added only for low-powered embedded devices with crypto >>accelerators such as CAAM or CESA that do not support XTS." >>--- >>[2] also states this, which again makes me think in terms of table being >>the moral equivalent to a file: >> >>"Unlike dm-crypt, fscrypt operates at the filesystem level rather than >>at the block device level. This allows it to encrypt different files >>with different keys and to have unencrypted files on the same >>filesystem. This is useful for multi-user systems where each user’s >>data-at-rest needs to be cryptographically isolated from the others. >>However, except for filenames, fscrypt does not encrypt filesystem >>metadata." >>[5] has this to say which seems independent of mode: >> >>"When encrypting data with a symmetric block cipher, which uses blocks >>of n bits, some security concerns begin to appear when the amount of >>data encrypted with a single key comes close to 2n/2 blocks, i.e. n*2n/2 >>bits. With AES, n = 128 (AES-128, AES-192 and AES-256 all use 128-bit >>blocks). This means a limit of more than 250 millions of terabytes, >>which is sufficiently large not to be a problem. That's precisely why >>AES was defined with 128-bit blocks, instead of the more common (at that >>time) 64-bit blocks: so that data size is practically unlimited." >> > > FWIW I was a bit confused at first, because the copy paste mangled the > formulas a bit - it should have been 2^(n/2) and n*2^(n/2). Yeah, sorry about that. >>But goes on to say: >>"I wouldn't use n*2^(n/2) bits in any sort of recommendation. Once you >>reach that number of bits the probability of a collision will grow >>quickly and you will be way over 50% probability of a collision by the >>time you reach 2*n*2^(n/2) bits. In order to keep the probability of a >>collision negligible I recommend encrypting no more than n*2^(n/4) bits >>with the same key. In the case of AES that works out to 64GB" >> >>It is hard to say if that recommendation is per key or per key+IV. > > Hmm, yeah. The question is what collisions they have in mind? Presumably > it's AES(block1,key) = AES(block2,key) in which case it'd be with fixed > IV, so per key+IV. Seems likely. >>But I did find that files in an encrypted file system are encrypted with >>derived keys from a master key, and I view this as analogous to what we >>are doing. >> > > My understanding always was that we'd do something like that, i.e. we'd > have a master key (or perhaps multiple of them, for various users), but > the data would be encrypted with secondary (generated) keys, and those > secondary keys would be encrypted by the master key. At least that's > what was proposed at the beginning of this thread by Insung Moon. In my email I linked the wrong page for [2]. The correct one is here: [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html Following that, I think we could end up with three tiers: 1. 
A master key encryption key (KEK): this is the key supplied by the database admin using something akin to ssl_passphrase_command

2. A master data encryption key (MDEK): this is a generated key using a cryptographically secure pseudo-random number generator. It is encrypted using the KEK, probably with Key Wrap (KW), or maybe better Key Wrap with Padding (KWP).

3a. Per-table data encryption keys (TDEK): use MDEK and HKDF to generate table-specific keys.

3b. WAL data encryption keys (WDEK): similarly use MDEK and HKDF to generate new keys when needed for WAL (based on the other info we need to change WAL keys every 68 GB unless I read that wrong).

I believe that would allow us to have multiple keys, but they are derived securely from the one DEK using available info, similar to the way we intend to use LSN to derive the IVs -- perhaps table.oid for tables and something else for WAL. We also need to figure out how/when to generate a new WDEK. Maybe every checkpoint, also meaning we would have to force a checkpoint every 68GB?

[HKDF]: https://tools.ietf.org/html/rfc5869
[KW]: https://tools.ietf.org/html/rfc3394
[KWP]: https://tools.ietf.org/html/rfc5649

> But AFAICS the 2-tier key scheme is primarily motivated by operational
> reasons, i.e. effort to rotate the master key etc. So I would not expect
> to find recommendations to use multiple keys in sources primarily
> dealing with cryptography.

It does in [2]

> One extra thing we should
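As a sanity check on the numbers above: the quoted recommendation of encrypting no more than n*2^(n/4) bits with one key works out, for AES's 128-bit block, to n*2^(n/4) = 128*2^32 bits = 2^39 bits = 2^36 bytes = 64 GiB, which is about 68.7*10^9 bytes. So the "64GB" in the quoted advice and the "68 GB" mentioned for WAL key rotation are the same bound, just expressed in binary versus decimal units.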
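A minimal sketch of step 3a — deriving a TDEK from the MDEK with HKDF, using the table OID in the HKDF "info" input — might look like the following, assuming OpenSSL 1.1.x's EVP HKDF interface. None of this is settled design; the function name, the label format and the choice of SHA-256 are assumptions made only for illustration.

#include <stdio.h>
#include <string.h>
#include <openssl/evp.h>
#include <openssl/kdf.h>

/*
 * Illustrative sketch only: derive a per-table key (TDEK) from the MDEK
 * with HKDF (RFC 5869), binding the table OID into the "info" input.
 */
static int
derive_tdek(const unsigned char *mdek, int mdek_len,
			unsigned int table_oid,
			unsigned char *tdek, size_t tdek_len)
{
	EVP_PKEY_CTX *pctx;
	char		info[64];
	int			ok;

	snprintf(info, sizeof(info), "TDEK:table:%u", table_oid);

	pctx = EVP_PKEY_CTX_new_id(EVP_PKEY_HKDF, NULL);
	if (pctx == NULL)
		return 0;

	ok = EVP_PKEY_derive_init(pctx) > 0 &&
		EVP_PKEY_CTX_set_hkdf_md(pctx, EVP_sha256()) > 0 &&
		EVP_PKEY_CTX_set1_hkdf_key(pctx, mdek, mdek_len) > 0 &&
		EVP_PKEY_CTX_add1_hkdf_info(pctx, (unsigned char *) info,
									(int) strlen(info)) > 0 &&
		EVP_PKEY_derive(pctx, tdek, &tdek_len) > 0;

	EVP_PKEY_CTX_free(pctx);
	return ok;
}

The WDEK in 3b could be derived the same way, with the info string carrying a generation counter that is bumped whenever a new WAL key is needed (e.g. at a forced checkpoint after the data-volume limit is reached).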
Re: [PATCH] Implement uuid_version()
On 2019-07-13 17:13, Fabien COELHO wrote: >>> What about avoiding a redirection with something like: >>> >>> Datum (* const pg_random_uuid)(PG_FUNCTION_ARGS) = gen_random_uuid; >> >> That seems very confusing. > > Dunno. Possibly. The user does not have to look at the implementation, and > probably such code would deserve a comment. > > The point is to avoid one call so as to perform the same (otherwise the > pg_random_uuid would be slightly slower), and to ensure that it behaves > the same, as it would be the very same function by construction. > > I've switched the patch to ready anyway. committed -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
On 7/13/19 2:41 PM, Joe Conway wrote: > [2] > https://www.postgresql.org/message-id/20190708194733.cztnwhqge4acepzw%40development BTW I managed to mess up this link. This is what I intended to link there (from Tomas): [2] https://www.kernel.org/doc/html/latest/filesystems/fscrypt.html I am sure I have confused the heck out of everyone reading what I wrote by that error :-/ Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: pg_stat_statements vs. SELECT FOR UPDATE
Hi > Sergei> PS: my note about comments in tests from my previous review is > Sergei> actual too. > > I changed the comment when committing it. Great, thank you! regards, Sergei
Re: pg_stat_statements vs. SELECT FOR UPDATE
> "Sergei" == Sergei Kornilov writes: Sergei> PS: my note about comments in tests from my previous review is Sergei> actual too. I changed the comment when committing it. -- Andrew (irc:RhodiumToad)
Re: Patch to document base64 encoding
Hello Karl,

It really works in research papers: "Theorem X can be proven by applying Proposition Y. See Figure 2 for details. Algorithm Z describes whatever, which is listed in Table W..."

I've not thought about it before, but I suppose the difference is between declarative and descriptive, the latter being more inviting and allowing better flow between sentences. Otherwise you're writing in bullet points. So it is a question of balance between specification and narration.

In regular prose you're always going to see the "the" unless the sentence starts with the name. The trouble is that we can't start sentences with function names because of capitalization confusion.

Sure. For me "Function" would work as a title before its name, as in "Sir Samuel", "Doctor Frankenstein", "Mister Bean", "Professor Layton"... "Function sqrt" also solves the casing issue, since the function name is better left uncapitalized.

-- Fabien.
Re: pgbench - implement strict TPC-B benchmark
Hello Thomas,

Thanks for the feedback.

+ the account branch has a 15% probability to be in the same branch as the teller (unless

I would say "... has a 15% probability of being in the same ...". The same wording appears further down in the comment.

Fixed.

I see that the parameters you propose match the TPCB 2.0 description[1], [...]

Nearly :-( While re-re-re-re-reading the spec, it was 85%, i.e. people mostly go to their local teller; I managed to get it wrong. Sigh. Fixed. Hopefully.

I've updated the script a little so that it is closer to spec. I've also changed the script definition so that it still works as expected if someone changes the "nbranches" definition for some reason, even if this is explicitly discouraged by comments.

I wonder if "strict" is the right name here though. "tpcb-like-2" at least leaves room for someone to propose yet another variant, and still includes the "-like" disclaimer, which I interpret as a way of making it clear that this benchmark and results produced by it have no official TPC audited status.

Hmmm. The -like suffix is really about the conformance of the script, not the rest, but that should indeed be clearer. I've expanded the comment and doc about this with disclaimers, so that there is no ambiguity about what is expected to conform, which is only the transaction script. I have added a comment about the non-conformance of the "int" type used for balances in the initialization phase.

Also, on second thought, I've changed the name to "standard-tpcb", but I'm unsure whether it is better than "script-tpcb". There is an incentive to have a different prefix so that "-b t" would not complain of ambiguity with "tpcb-like*", which would be a regression. This is why I did not choose the simple "tpcb". There may be a "standard-tpcb-2" anyway.

I have added a small test run in the TAP test.

On my TODO list is adding an initialization option to change the balance type for conformance, by using NUMERIC or integer8.

While thinking about that, an unrelated thought occurred to me: adding a partitioned initialization variant would be nice to test the performance impact of partitioning easily. I should have thought of that as soon as partitioning was added. Added to my TODO list as well.

-- Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 09af9f1a7d..6ff557b2b3 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -347,7 +348,8 @@ pgbench options d An optional integer weight after @ allows to adjust the probability of drawing the script. If not specified, it is set to 1. Available built-in scripts are: tpcb-like, -simple-update and select-only. +simple-update, select-only and +standard-tpcb. Unambiguous prefixes of built-in names are accepted. With special name list, show the list of built-in scripts and exit immediately. @@ -869,6 +870,25 @@ pgbench options d If you select the select-only built-in (also -S), only the SELECT is issued.
+Other parts of the benchmark, such as type capabilities, initialization, +performance data collection, checks on final values, database configuration, +test duration... may or may not conform to the standard depending on the actual +run. + + diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 8b84658ccd..4888beb129 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -193,8 +193,12 @@ int64 random_seed = -1; * end of configurable parameters */ -#define nbranches 1 /* Makes little sense to change this. Change - * -s instead */ +/* + * It makes little sense to change "nbranches", use -s instead. + * + * Nevertheless, "ntellers" and "naccounts" must be divisible by "nbranches". + */ +#define nbranches 1 #define ntellers 10 #define naccounts 10 @@ -566,6 +570,61 @@ static const BuiltinScript builtin_script[] = "", "\\set aid random(1, " CppAsString2(naccounts) " * :scale)\n" "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n" + }, + { + /*--- + * Standard conforming TPC-B benchmark script as defined in + * "TPC BENCHMARK (tm) B Standard Specification Revision 2.0" + * from 7 June 1994, see http://www.tpc.org/tpcb/spec/tpcb_current.pdf + * +
Re: Conflict handling for COPY FROM
Hi, sorry for answering a bit later than I hoped. Here is my review so far:

Contents
========

This patch starts to address, in my opinion, one of COPY's shortcomings, which is error handling. PK and exclusion errors are taken care of, but not (yet?) other types of errors. Documentation is updated, "\h copy" also, and some regression tests are added.

Initial Run
===========

Patch applies (I've tested v6) cleanly.
make: OK
make install: OK
make check: OK
make installcheck: OK

Performance
===========

I've tested the patch on a 1.1G file with 10 000 000 lines. Each test was done 15 times on a small local VM. Table is without constraints.

head:         38,93s
head + patch: 38,76s

Another test was on a 0.1GB file with 1 000 000 lines. Each test done 10 times on a small local VM and the table has a pk.

COPY                                      4,550s
COPY CONFLICT                             4,595s
COPY CONFLICT with only one pk error     10,529s
COPY CONFLICT pk error every 100 lines   10,859s
COPY CONFLICT pk error every 1000 lines  10,879s

I did not test exclusions so far.

Thoughts
========

I find the feature useful in itself. One big question for me is whether it can be improved later on to handle other types of errors (like check constraints for example)?

A "-1" for the error limit would be very useful in my opinion.

I am also afraid that the name "error_limit" might mislead users into thinking that all error types are handled. But I do not have a better suggestion without making this clause much longer...

I've had a short look at the code, but this will need review by someone else. Anyway, thanks a lot for taking the time to work on it.

Anthony

On Sun, Jul 14, 2019 at 3:48 AM Thomas Munro wrote:
> On Fri, Jul 12, 2019 at 1:42 AM Surafel Temesgen wrote:
> > Here is the patch that contains all the comments given, except adding a way
> > to specify ignoring all errors, because specifying a very high number can do
> > the work, and maybe trying to store such badly structured data is a bad idea
>
> Hi Surafel,
>
> FYI GCC warns:
>
> copy.c: In function ‘CopyFrom’:
> copy.c:3383:8: error: ‘dest’ may be used uninitialized in this
> function [-Werror=maybe-uninitialized]
> (void) dest->receiveSlot(myslot, dest);
> ^
> copy.c:2702:16: note: ‘dest’ was declared here
> DestReceiver *dest;
> ^
>
> --
> Thomas Munro
> https://enterprisedb.com