Re: pgbench - allow to create partitioned tables
On Sat, Sep 21, 2019 at 1:18 PM Fabien COELHO wrote:
> > Yes, this code is correct. I am not sure if you understood the point,
> > so let me try again. I am bothered about below code in the patch:
> >
> > + /* only print partitioning information if some partitioning was detected */
> > + if (partition_method != PART_NONE)
> >
> > This is the only place now where we check 'whether there are any
> > partitions' differently. I am suggesting to make this similar to
> > other checks (if (partitions > 0)).
>
> As I said somewhere up thread, you can have a partitioned table with zero
> partitions, and it works fine (yep! the update just does not do anything…)
> so partitions > 0 is not a good way to know whether there is a partitioned
> table when running a bench. It is a good way for initialization, though,
> because we are creating them.
>
> sh> pgbench -i --partitions=1
> sh> psql -c 'DROP TABLE pgbench_accounts_1'
> sh> pgbench -T 10
> ...
> transaction type:
> scaling factor: 1
> partition method: hash
> partitions: 0

I am not sure how many users would be able to make out that it is a run
where the actual partitions are not present unless they know of such a
condition beforehand and detect it in their scripts. What is the use of
such a run, which completes without doing any actual updates? I think it
is better to give an error for such a case rather than allowing it to
execute and then printing some information which doesn't make much sense.

> I incorporated most of them, although I made them terser, and fixed them
> when inaccurate.
>
> I did not buy moving the condition inside the fillfactor function.

I also don't agree with your position. My main concern here is that we
can't implicitly assume that the fillfactor needs to be appended. At the
very least we should have a comment saying why we always append the
fillfactor for partitions, something like I had in my patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: Avoiding possible future conformance headaches in JSON work
Peter Eisentraut writes: > On 2019-09-20 01:14, Tom Lane wrote: >> Sure. But we also modeled those features on the same language that the >> committee is looking at (or at least I sure hope we did). So it's >> reasonable to assume that they would come out at the same spot without >> any prompting. And we can prompt them ;-). > Also, I understand these are features proposed for PG13, not in PG12. > So while this is an important discussion, it's not relevant to the PG12 > release, right? > (If so, I'm content to close these open items.) I took a quick look to compare our jsonpath documentation with ISO/IEC TR_19075-6_2017 (I did *not* try to see if the code agrees with the docs ;-)). As far as I can see, everything described in our docs appears in the TR, with the exception of two things that are already documented as Postgres extensions: 1. Recursive (multilevel) wildcards, ie .** and .**{level [to level]} accessors, per table 8.25. 2. We allow a path expression to be a Boolean predicate, although the TR allows predicates only in filters, per example in 9.16.1: '$.track.segments[*].HR < 70' (It's not exactly clear to me why this syntax is necessary; what's it do that you can't do more verbosely with a filter?) I have no opinion on whether we're opening ourselves to significant spec-compliance risks through these two features. I am, however, unexcited about adding some kind of "PG only" marker to the language, for a couple of reasons. First, I really doubt that a single boolean flag would get us far in terms of dealing with future compliance issues. As soon as we have two extension features (i.e., already) we have the question of what happens if one gets standardized and the other doesn't; and that risk gets bigger if we're going to add half a dozen more things. Second, we've procrastinated too long and thereby effectively made a decision already. At this point I don't see how we could push in any such change without delaying the release. 
So my vote at this point is "ship it as-is". regards, tom lane
Re: Nondeterministic collations vs. text_pattern_ops
On 2019-09-21 22:30, Tom Lane wrote: > I wrote: >> Peter Eisentraut writes: Here is a draft patch. > >> Where are we on pushing that? I'm starting to get antsy about the >> amount of time remaining before rc1. It's a low-risk fix, but still, >> it'd be best to have a complete buildfarm cycle on it before Monday's >> wrap. > > Since time is now really running short, I went ahead and pushed > this, after doing a closer review and finding one nitpicky bug. Great! I think that covers all the open issues then. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: some PostgreSQL 12 release notes comments
I've pushed some release note adjustments responding to your points about the GSSAPI and name-collation entries. I see the LDAP text is fixed already. regards, tom lane
Re: dropdb --force
On Fri, Sep 20, 2019 at 5:10 AM Thomas Munro wrote:
> On Thu, Sep 19, 2019 at 6:44 AM Pavel Stehule wrote:
> > I am sending updated version
>
> +appendPQExpBuffer(&sql, "DROP DATABASE %s%s%s;",
> + (force ? " (FORCE) " : ""),
>
> An extra space before (FORCE) caused the test to fail:
>
> # 2019-09-19 19:07:22.203 UTC [1516] 050_dropdb.pl LOG: statement:
> DROP DATABASE  (FORCE) foobar2;
> # doesn't match '(?^:statement: DROP DATABASE (FORCE) foobar2)'

It should be fixed now. Thank you for the feedback.

Pavel

> --
> Thomas Munro
> https://enterprisedb.com
Re: dropdb --force
On Fri, Sep 20, 2019 at 7:58 AM Dilip Kumar wrote:
> On Thu, Sep 19, 2019 at 12:14 AM Pavel Stehule wrote:
> >
> > Hi
> >
> > I am sending an updated version - there are two changes against the last
> > patch. I use pg_terminate_backend for terminating other sessions, as Tom
> > proposed. I am not sure if it is 100% practical - on the other hand, the
> > right needed to terminate other sessions is almost always available - and
> > consistency with pg_terminate_backend makes sense. Moreover, a native
> > implementation is significantly better than a solution implemented
> > externally. I fixed the docs a little bit - the timeout for logout (as a
> > reaction to SIGTERM) is only 5 seconds.
>
> Few comments on the patch.
>
> @@ -598,10 +598,20 @@ standard_ProcessUtility(PlannedStmt *pstmt,
>  case T_DropdbStmt:
>  {
>  DropdbStmt *stmt = (DropdbStmt *) parsetree;
> + bool force = false;
> + ListCell *lc;
> +
> + foreach(lc, stmt->options)
> + {
> + DefElem *item = (DefElem *) lfirst(lc);
> +
> + if (strcmp(item->defname, "force") == 0)
> + force = true;
> + }
>
>  /* no event triggers for global objects */
>  PreventInTransactionBlock(isTopLevel, "DROP DATABASE");
> - dropdb(stmt->dbname, stmt->missing_ok);
> + dropdb(stmt->dbname, stmt->missing_ok, force);
>
> If you see createdb, the options are processed inside the createdb
> function, but here we are processing them outside the function.
> Wouldn't it be good to keep them similar? It would also be expandable
> in case we add more options in the future.

I thought about it, but I think the current state is good enough. There are
only a few parameters, and I don't expect a significantly large number of
new options. When the number of parameters of a function is not too high, I
think it is better to have a function with a classic list of parameters
rather than a function taking one parameter of some struct type. If somebody
tries to use the dropdb function from outside (an extension), they don't
need to create a new structure or a new list, and can simply call a simple
C API.
I prefer an API of simple types over structures and nodes there. I just
think it's more practical if somebody tries to use this function from
places other than the parser.

> > initPQExpBuffer(&sql);
> >
> > - appendPQExpBuffer(&sql, "DROP DATABASE %s%s;",
> > - (if_exists ? "IF EXISTS " : ""), fmtId(dbname));
> > -
> >  /* Avoid trying to drop postgres db while we are connected to it. */
> >  if (maintenance_db == NULL && strcmp(dbname, "postgres") == 0)
> >  maintenance_db = "template1";
> > @@ -134,6 +136,10 @@ main(int argc, char *argv[])
> >  host, port, username, prompt_password,
> >  progname, echo);
> > + appendPQExpBuffer(&sql, "DROP DATABASE %s%s%s;",
> > + (force ? " (FORCE) " : ""),
> > + (if_exists ? "IF EXISTS " : ""), fmtId(dbname));
> > +
>
> I did not understand why you have moved this code after the
> connectMaintenanceDatabase function? I don't see any problem, but in the
> earlier code initPQExpBuffer and appendPQExpBuffer were together; now you
> have moved appendPQExpBuffer after connectMaintenanceDatabase, so if there
> is no reason to do that then better to keep it how it was earlier.

I am not the author of this change, but it is not necessary, and I returned
the original order.

> -extern bool CountOtherDBBackends(Oid databaseId,
> -  int *nbackends, int *nprepared);
> +extern bool CountOtherDBBackends(Oid databaseId, int *nbackends, int
> *nprepared);
>
> Unrelated change

fixed

Thank you for the check. I am sending an updated patch.

Regards

Pavel

> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com

diff --git a/doc/src/sgml/ref/drop_database.sgml b/doc/src/sgml/ref/drop_database.sgml
index 3ac06c984a..11a31899d2 100644
--- a/doc/src/sgml/ref/drop_database.sgml
+++ b/doc/src/sgml/ref/drop_database.sgml
@@ -21,7 +21,11 @@ PostgreSQL documentation
-DROP DATABASE [ IF EXISTS ] name
+DROP DATABASE [ ( option [, ...] ) ] [ IF EXISTS ] name
+
+where option can be one of:
+
+FORCE
@@ -32,9 +36,11 @@ DROP DATABASE [ IF EXISTS ] name
   DROP DATABASE drops a database.
It removes the catalog entries for the database and deletes the directory containing the data. It can only be executed by the database owner. - Also, it cannot be executed while you or anyone else are connected - to the target database. (Connect to postgres or any - other database to issue this command.) + Also, it cannot be executed while you are connected to the target database. + (Connect to postgres or any other database to issue this + command.) + If anyone else is connected to the target database, this command will fail + - unless you use the FORCE option described below. @@ -64,6 +70,24 @@ DROP DATABASE [ IF EXISTS ] name + + +FORCE + + + Attempt to terminate all existing connections to the target database. + + + This will fail, if current user has no permissions to terminate other + connections. Required permissions are the same as with +
PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
The steps to reproduce this issue:

1. Create a table:
create table gist_point_tbl(id int4, p point);
create index gist_pointidx on gist_point_tbl using gist(p);

2. Insert data:
insert into gist_point_tbl (id, p) select g, point(g*10, g*10) from generate_series(1, 100) g;

3. Delete data:
delete from gist_point_tbl;

4. Vacuum the table:
vacuum gist_point_tbl;
-- Send SIGINT to the vacuum process after the WAL record of the truncation is flushed but before the truncation has finished.
-- We will receive the error message "ERROR: canceling statement due to user request".

5. Vacuum the table again:
vacuum gist_point_tbl;
-- The standby node crashes, and the PANIC log is "PANIC: WAL contains references to invalid pages".

The standby node succeeded in replaying the truncation record, but the
master node did not truncate the file, so the standby will crash once the
master writes to blocks that were already truncated on the standby. I try
to fix this issue by preventing query-cancel interrupts during the
truncation.

diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 5df4382b7e..04b696ae01 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -26,6 +26,7 @@
 #include "access/xlogutils.h"
 #include "catalog/storage.h"
 #include "catalog/storage_xlog.h"
+#include "miscadmin.h"
 #include "storage/freespace.h"
 #include "storage/smgr.h"
 #include "utils/memutils.h"
@@ -248,6 +249,14 @@ RelationTruncate(Relation rel, BlockNumber nblocks)
 	if (vm)
 		visibilitymap_truncate(rel, nblocks);

+	/*
+	 * When the master node has flushed the WAL record of the truncation and
+	 * then receives a SIGINT that cancels the transaction before the
+	 * truncation is done, a standby that has replayed the record will crash
+	 * once the master writes to the blocks that were truncated only on the
+	 * standby. So we prevent query-cancel interrupts here.
+	 */
+	HOLD_CANCEL_INTERRUPTS();
+
 	/*
 	 * We WAL-log the truncation before actually truncating, which means
 	 * trouble if the truncation fails.
If we then crash, the WAL replay
@@ -288,6 +297,8 @@ RelationTruncate(Relation rel, BlockNumber nblocks)

 	/* Do the real work */
 	smgrtruncate(rel->rd_smgr, MAIN_FORKNUM, nblocks);
+
+	RESUME_CANCEL_INTERRUPTS();
 }
Re: Fwd: Extending range type operators to cope with elements
On Tue, Sep 17, 2019 at 5:18 AM David Fetter wrote: > It's not done by pull request at this time. Instead, it is done by sending > patches to this mailing list. Dear all You will find enclosed the patch that extends the range type operators so they cope with elements. Any comments most welcome. Esteban diff -urdN postgresql-11.5-orig/doc/src/sgml/func.sgml postgresql-11.5-ranges/doc/src/sgml/func.sgml --- postgresql-11.5-orig/doc/src/sgml/func.sgml 2019-09-21 11:28:11.836309263 +0200 +++ postgresql-11.5-ranges/doc/src/sgml/func.sgml 2019-09-21 10:32:53.320004000 +0200 @@ -13228,6 +13228,20 @@ + +strictly left of element +int8range(1,10) 100 +t + + + + +element strictly left of +10 int8range(100,110) +t + + + strictly right of int8range(50,60) int8range(20,30) @@ -13235,6 +13249,20 @@ + +strictly right of element +int8range(50,60) 20 +t + + + + +element strictly right of +50 int8range(20,30) +t + + + does not extend to the right of int8range(1,20) int8range(18,20) @@ -13242,6 +13270,20 @@ + +does not extend to the right of element +int8range(1,20) 20 +t + + + + +element does not extend to the right of +19 int8range(18,20) +t + + + does not extend to the left of int8range(7,20) int8range(5,10) @@ -13249,12 +13291,40 @@ + +does not extend to the left of element +int8range(7,20) 5 +t + + + + +element does not extend to the left of +7 int8range(5,10) +t + + + -|- is adjacent to numrange(1.1,2.2) -|- numrange(2.2,3.3) t + + -|- +is adjacent to element +numrange(1.1,2.2) -|- 2.2 +t + + + + -|- +element is adjacent to +2.2 -|- numrange(2.2,3.3, '()') +t + + + union diff -urdN postgresql-11.5-orig/src/backend/utils/adt/rangetypes.c postgresql-11.5-ranges/src/backend/utils/adt/rangetypes.c --- postgresql-11.5-orig/src/backend/utils/adt/rangetypes.c 2019-09-21 11:28:11.628205263 +0200 +++ postgresql-11.5-ranges/src/backend/utils/adt/rangetypes.c 2019-09-21 10:32:53.320004000 +0200 @@ -548,6 +548,322 @@ PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val)); } +/* 
strictly left of element? (internal version) */
+bool
+range_before_elem_internal(TypeCacheEntry *typcache, RangeType *r, Datum val)
+{
+	RangeBound	lower,
+				upper;
+	bool		empty;
+	int32		cmp;
+
+	range_deserialize(typcache, r, &lower, &upper, &empty);
+
+	/* An empty range is neither left nor right any other range */
+	if (empty)
+		return false;
+
+	if (!upper.infinite)
+	{
+		cmp = DatumGetInt32(FunctionCall2Coll(&typcache->rng_cmp_proc_finfo,
+											  typcache->rng_collation,
+											  upper.val, val));
+		if (cmp < 0 ||
+			(cmp == 0 && !upper.inclusive))
+			return true;
+	}
+
+	return false;
+}
+
+/* strictly left of element? */
+Datum
+range_before_elem(PG_FUNCTION_ARGS)
+{
+	RangeType  *r = PG_GETARG_RANGE_P(0);
+	Datum		val = PG_GETARG_DATUM(1);
+	TypeCacheEntry *typcache;
+
+	typcache = range_get_typcache(fcinfo, RangeTypeGetOid(r));
+
+	PG_RETURN_BOOL(range_before_elem_internal(typcache, r, val));
+}
+
+/* does not extend to right of element? (internal version) */
+bool
+range_overleft_elem_internal(TypeCacheEntry *typcache, RangeType *r, Datum val)
+{
+	RangeBound	lower,
+				upper;
+	bool		empty;
+
+	range_deserialize(typcache, r, &lower, &upper, &empty);
+
+	/* An empty range is neither left nor right any element */
+	if (empty)
+		return false;
+
+	if (!upper.infinite)
+	{
+		if (DatumGetInt32(FunctionCall2Coll(&typcache->rng_cmp_proc_finfo,
+											typcache->rng_collation,
+											upper.val, val)) <= 0)
+			return true;
+	}
+
+	return false;
+}
+
+/* does not extend to right of element? */
+Datum
+range_overleft_elem(PG_FUNCTION_ARGS)
+{
+	RangeType  *r = PG_GETARG_RANGE_P(0);
+	Datum		val = PG_GETARG_DATUM(1);
+	TypeCacheEntry *typcache;
+
+	typcache = range_get_typcache(fcinfo, RangeTypeGetOid(r));
+
+	PG_RETURN_BOOL(range_overleft_elem_internal(typcache, r, val));
+}
+
+/* strictly right of element?
(internal version) */
+bool
+range_after_elem_internal(TypeCacheEntry *typcache, RangeType *r, Datum val)
+{
+	RangeBound	lower,
+				upper;
+	bool		empty;
+	int32		cmp;
+
+	range_deserialize(typcache, r, &lower, &upper, &empty);
+
+	/* An empty range is neither left nor right any other range */
+	if (empty)
+		return false;
+
+	if (!lower.infinite)
+	{
+		cmp = DatumGetInt32(FunctionCall2Coll(&typcache->rng_cmp_proc_finfo,
+											  typcache->rng_collation,
+
Re: WIP: Generic functions for Node types using generated metadata
On Fri, Sep 20, 2019 at 03:43:54PM -0700, Andres Freund wrote: > Hi, > > On 2019-09-19 22:18:57 -0700, Andres Freund wrote: > > While working on this I evolved the node string format a bit: > > > > 1) Node types start with the their "normal" name, rather than > >uppercase. There seems little point in having such a divergence. > > > > 2) The node type is followed by the node-type id. That allows to more > >quickly locate the corresponding node metadata (array and one name > >recheck, rather than a binary search). I.e. the node starts with > >"{Scan 18 " rather than "{SCAN " as before. > > > > 3) Nodes that contain other nodes as sub-types "inline", still emit {} > >for the subtype. There's no functional need for this, but I found the > >output otherwise much harder to read. E.g. for mergejoin we'd have > >something like > > > >{MergeJoin 37 :join {Join 35 :plan {Plan ...} :jointype JOIN_INNER ...} > > :skip_mark_restore true ...} > > > > 4) As seen in the above example, enums are decoded to their string > >values. I found that makes the output easier to read. Again, not > >functionally required. > > > > 5) Value nodes aren't emitted without a {Value ...} anymore. I changed > >this when I expanded the WRITE/READ tests, and encountered failures > >because the old encoding is not entirely rountrip safe > >(e.g. -INT32_MIN will be parsed as a float at raw parse time, but > >after write/read, it'll be parsed as an integer). While that could be > >fixed in other ways (e.g. by emitting a trailing . for all floats), I > >also found it to be clearer this way - Value nodes are otherwise > >undistinguishable from raw strings, raw numbers etc, which is not > >great. > > > > It'd also be easier to now just change the node format to something else. > > E.g. to just use json. +many JSON's been around long enough to give some assurance that it's not going away, and it's pretty simple. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: pgbench - allow to create partitioned tables
Hello Amit,

> Yes, this code is correct. I am not sure if you understood the point,
> so let me try again. I am bothered about below code in the patch:
>
> + /* only print partitioning information if some partitioning was detected */
> + if (partition_method != PART_NONE)
>
> This is the only place now where we check 'whether there are any
> partitions' differently. I am suggesting to make this similar to
> other checks (if (partitions > 0)).

As I said somewhere up thread, you can have a partitioned table with zero
partitions, and it works fine (yep! the update just does not do anything…)
so partitions > 0 is not a good way to know whether there is a partitioned
table when running a bench. It is a good way for initialization, though,
because we are creating them.

sh> pgbench -i --partitions=1
sh> psql -c 'DROP TABLE pgbench_accounts_1'
sh> pgbench -T 10
...
transaction type:
scaling factor: 1
partition method: hash
partitions: 0
query mode: simple
number of clients: 1
number of threads: 1
duration: 10 s
number of transactions actually processed: 2314
latency average = 4.323 ms
tps = 231.297122 (including connections establishing)
tps = 231.549125 (excluding connections establishing)

As postgres does not break, there is no good reason to forbid it.

[...]

> Sure, even in that case your older version of pgbench will be able to
> detect by below code [...] "unexpected partition method: " [...].

Yes, that is what I was saying.

> Hmm, you have just written what each part of the query is doing which I
> think one can identify if we write some general comment as I have in the
> patch to explain the overall intent. Even if we write what each part of
> the statement is doing, the comment explaining overall intent is required.

There were some comments. I personally don't like writing a comment for
each sub-part of the query, as that makes reading the query difficult.

> See the patch sent by me in my previous email.

I did not notice there was an attachment.
> I have done that in some of the cases in the patch attached by me in my
> last email. Have you looked at those changes?

Nope, as I was not expecting one.

> Try to make those changes in the next version unless you see something
> wrong is written in comments.

I incorporated most of them, although I made them terser, and fixed them
when inaccurate.

I did not buy moving the condition inside the fillfactor function.

See attached v14.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index c857aa3cba..e3a0abb4c7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -306,6 +306,31 @@ pgbench options d
+
+ --partitions=NUM
+
+ Create a partitioned pgbench_accounts table with
+ NUM partitions of nearly equal size for
+ the scaled number of accounts.
+ Default is 0, meaning no partitioning.
+
+
+ --partition-method=NAME
+
+ Create a partitioned pgbench_accounts table with
+ NAME method.
+ Expected values are range or hash.
+ This option requires that --partitions is set to non-zero.
+ If unspecified, default is range.
+
+
  --tablespace=tablespace
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ed7652bfbf..97b73d5a8a 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -186,6 +186,25 @@ int64 latency_limit = 0;
 char *tablespace = NULL;
 char *index_tablespace = NULL;

+/*
+ * Number of "pgbench_accounts" partitions, found or to create.
+ * When creating, 0 is the default and means no partitioning.
+ * When running, this is the actual number of partitions.
+ */
+static int partitions = 0;
+
+/* partitioning strategy for "pgbench_accounts" */
+typedef enum
+{
+	PART_NONE,					/* no partitioning */
+	PART_RANGE,					/* range partitioning */
+	PART_HASH					/* hash partitioning */
+}
+			partition_method_t;
+
+static partition_method_t partition_method = PART_NONE;
+static const char *PARTITION_METHOD[] = {"none", "range", "hash"};
+
 /* random seed used to initialize base_random_sequence */
 int64 random_seed = -1;
@@ -617,6 +636,9 @@ usage(void)
 	" --foreign-keys create foreign key constraints between tables\n"
 	" --index-tablespace=TABLESPACE\n"
 	" create indexes in the specified tablespace\n"
+	" --partitions=NUM partition pgbench_accounts in NUM parts (default: 0)\n"
+	" --partition-method=(range|hash)\n"
+	" partition pgbench_accounts with this method (default: range)\n"
 	" --tablespace=TABLESPACE create tables in the specified tablespace\n"
 	" --unlogged-tables create tables as unlogged tables\n"
 	"\nOptions to select what to run:\n"
@@ -3601,6 +3623,82 @@ initDropTables(PGconn *con)
Re: Efficient output for integer types
> "David" == David Fetter writes:

David> +static inline uint32
David> +decimalLength64(const uint64_t v)

Should be uint64, not uint64_t. Also return an int, not a uint32.

For int vs. int32, my own inclination is to use "int" where the value is
just a (smallish) number, especially one that will be used as an index or
loop count, and use "int32" when it actually matters that it's 32 bits
rather than some other size. Other opinions may differ.

David> +{
David> + uint32 t;
David> + static uint64_t PowersOfTen[] = {

uint64 not uint64_t here too.

David> +int32
David> +pg_ltoa_n(uint32 value, char *a)

If this is going to handle only unsigned values, it should probably be
named pg_ultoa_n.

David> + uint32 i = 0, adjust = 0;

"adjust" is not assigned anywhere else. Presumably that's from previous
handling of negative numbers?

David> + memcpy(a, "0", 1);

*a = '0'; would suffice.

David> + i += adjust;

Superfluous?

David> + uint32_t uvalue = (uint32_t) value;

uint32 not uint32_t.

David> + int32 len;

See above re. int vs. int32.

David> + uvalue = (uint32_t) 0 - (uint32_t) value;

Should be uint32 not uint32_t again.

For anyone wondering, I suggested this to David in place of the ugly
special casing of INT32_MIN. This method avoids the UB of doing (-value)
where value == INT32_MIN, and is nevertheless required to produce the
correct result:

1. If value < 0, then ((uint32) value) is (value + UINT32_MAX + 1)

2. (uint32) 0 - (uint32) value becomes
   (UINT32_MAX + 1) - (value + UINT32_MAX + 1), which is (-value) as required

David> +int32
David> +pg_lltoa_n(uint64_t value, char *a)

Again, if this is doing unsigned, then it should be named pg_ulltoa_n.

David> + if (value == PG_INT32_MIN)

This being inconsistent with the others is not nice.

--
Andrew (irc:RhodiumToad)
Re: A problem about partitionwise join
On Fri, Sep 20, 2019 at 2:33 PM Richard Guo wrote: > > Hi Dilip, > > Thank you for reviewing this patch. > > On Fri, Sep 20, 2019 at 12:48 PM Dilip Kumar wrote: >> >> On Thu, Aug 29, 2019 at 3:15 PM Richard Guo wrote: >> > >> > >> > Attached is a patch as an attempt to address this issue. The idea is >> > quite straightforward. When building partition info for joinrel, we >> > generate any possible EC-derived joinclauses of form 'outer_em = >> > inner_em', which will be used together with the original restrictlist to >> > check if there exists an equi-join condition for each pair of partition >> > keys. >> > >> > Any comments are welcome! >> /* >> + * generate_join_implied_equalities_for_all >> + * Create any EC-derived joinclauses of form 'outer_em = inner_em'. >> + * >> + * This is used when building partition info for joinrel. >> + */ >> +List * >> +generate_join_implied_equalities_for_all(PlannerInfo *root, >> + Relids join_relids, >> + Relids outer_relids, >> + Relids inner_relids) >> >> I think we need to have more detailed comments about why we need this >> separate function, we can also explain that >> generate_join_implied_equalities function will avoid >> the join clause if EC has the constant but for partition-wise join, we >> need that clause too. > > > Thank you for the suggestion. > >> >> >> + while ((i = bms_next_member(matching_ecs, i)) >= 0) >> + { >> + EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i); >> + List*outer_members = NIL; >> + List*inner_members = NIL; >> + ListCell *lc1; >> + >> + /* Do not consider this EC if it's ec_broken */ >> + if (ec->ec_broken) >> + continue; >> + >> + /* Single-member ECs won't generate any deductions */ >> + if (list_length(ec->ec_members) <= 1) >> + continue; >> + >> >> I am wondering isn't it possible to just process the missing join >> clause? 
I mean 'generate_join_implied_equalities' has only skipped >> the ECs which has const so >> can't we create join clause only for those ECs and append it the >> "Restrictlist" we already have? I might be missing something? > > > For ECs without const, 'generate_join_implied_equalities' also has > skipped some join clauses since it only selects the joinclause with > 'best_score' between outer members and inner members. And the missing > join clauses are needed to generate partitionwise join. That's why the > query below cannot be planned as partitionwise join, as we have missed > joinclause 'foo.k = bar.k'. oh right. I missed that part. > select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k; > > And yes 'generate_join_implied_equalities_for_all' will create join > clauses that have existed in restrictlist. I think it's OK since the > same RestrictInfo deduced from EC will share the same pointer and > list_concat_unique_ptr will make sure there are no duplicates in the > restrictlist used by have_partkey_equi_join. > ok -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Efficient output for integer types
On Sat, Sep 21, 2019 at 03:36:21AM +0100, Andrew Gierth wrote:
> > "David" == David Fetter writes:
>
> David> + /* Compute the result string. */
> David> + if (value >= 100000000)
> David> + {
> David> + const uint32 value2 = value % 100000000;
> David> +
> David> + const uint32 c = value2 % 10000;
> David> + const uint32 d = value2 / 10000;
> David> + const uint32 c0 = (c % 100) << 1;
> David> + const uint32 c1 = (c / 100) << 1;
> David> + const uint32 d0 = (d % 100) << 1;
> David> + const uint32 d1 = (d / 100) << 1;
> David> +
> David> + char *pos = a + olength - i;
> David> +
> David> + value /= 100000000;
> David> +
> David> + memcpy(pos - 2, DIGIT_TABLE + c0, 2);
> David> + memcpy(pos - 4, DIGIT_TABLE + c1, 2);
> David> + memcpy(pos - 6, DIGIT_TABLE + d0, 2);
> David> + memcpy(pos - 8, DIGIT_TABLE + d1, 2);
> David> + i += 8;
> David> + }
>
> For the 32-bit case, there's no point in doing an 8-digit divide
> specially, it doesn't save any time. It's sufficient to just change
>
> David> + if (value >= 100000000)
>
> to while (value >= 10000) in order to process 4 digits at a time.

Done.

> David> + for(int i = 0; i < minwidth - len; i++)
> David> + {
> David> + memcpy(str + i, DIGIT_TABLE, 1);
> David> + }
>
> Should be:
> memset(str, '0', minwidth - len);

Done.

Best,
David.
--
David Fetter http://fetter.org/ Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

>From 4407b0771cd53890acf41ffee8908d1a701c52c0 Mon Sep 17 00:00:00 2001
From: David Fetter
Date: Sun, 15 Sep 2019 00:06:29 -0700
Subject: [PATCH v10] Make int4 and int8 operations more efficent
To: hackers
- Output routines now do more digits per iteration, and
- Code determines the number of decimal digits in int4/int8 efficiently
- Split off pg_ltoa_n from pg_ltoa
- Use same to make other functions shorter

diff --git a/src/backend/access/common/printsimple.c b/src/backend/access/common/printsimple.c
index 651ade14dd..5c5b6d33b2 100644
--- a/src/backend/access/common/printsimple.c
+++ b/src/backend/access/common/printsimple.c
@@ -112,7 +112,7 @@ printsimple(TupleTableSlot *slot, DestReceiver *self)
 		case INT8OID:
 			{
 				int64 num = DatumGetInt64(value);
-				char str[23]; /* sign, 21 digits and '\0' */
+				char str[MAXINT8LEN + 1];

 				pg_lltoa(num, str);
 				pq_sendcountedtext(&buf, str, strlen(str), false);
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 0ff9394a2f..6230807906 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -27,8 +27,6 @@
 #include "utils/builtins.h"

-#define MAXINT8LEN 25
-
 typedef struct
 {
 	int64 current;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 70138feb29..babc6f6166 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -20,6 +20,68 @@
 #include "common/int.h"
 #include "utils/builtins.h"
+#include "port/pg_bitutils.h"
+
+/*
+ * A table of all two-digit numbers. This is used to speed up decimal digit
+ * generation by copying pairs of digits into the final output.
+ */
+static const char DIGIT_TABLE[200] =
+"00" "01" "02" "03" "04" "05" "06" "07" "08" "09"
+"10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
+"20" "21" "22" "23" "24" "25" "26" "27" "28" "29"
+"30" "31" "32" "33" "34" "35" "36" "37" "38" "39"
+"40" "41" "42" "43" "44" "45" "46" "47" "48" "49"
+"50" "51" "52" "53" "54" "55" "56" "57" "58" "59"
+"60" "61" "62" "63" "64" "65" "66" "67" "68" "69"
+"70" "71" "72" "73" "74" "75" "76" "77" "78" "79"
+"80" "81" "82" "83" "84" "85" "86" "87" "88" "89"
+"90" "91" "92" "93" "94" "95" "96" "97" "98" "99";
+
+/*
+ * Adapted from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
+ */
+static inline uint32
+decimalLength32(const uint32 v)
+{
+	uint32 t;
+	static uint32 PowersOfTen[] =
+	{1, 10, 100,
+	 1000, 10000, 100000,
+	 1000000, 10000000, 100000000,
+	 1000000000};
+	/*
+	 * Compute base-10 logarithm by dividing the base-2 logarithm
+	 * by a good-enough approximation of the base-2 logarithm of 10
+	 */
+	t = (pg_leftmost_one_pos32(v) + 1)*1233/4096;
+	return t + (v >= PowersOfTen[t]);
+}
+
+static inline uint32
+decimalLength64(const uint64_t v)
+{
+	uint32 t;
+	static uint64_t PowersOfTen[] = {
+		UINT64CONST(1), UINT64CONST(10),
+