Re: An inefficient query caused by unnecessary PlaceHolderVar
On Wed, May 31, 2023 at 1:27 AM James Coleman wrote:
> This looks good to me.

Thanks for the review!

> A few small tweaks suggested to comment wording:
>
> +-- lateral reference for simple Var can escape PlaceHolderVar if the
> +-- referenced rel is under the same lowest nulling outer join
> +explain (verbose, costs off)
>
> I think this is clearer: "lateral references to simple Vars do not
> need a PlaceHolderVar when the referenced rel is part of the same
> lowest nulling outer join"?

Thanks for the suggestion! How about we go with "lateral references to
simple Vars do not need a PlaceHolderVar when the referenced rel is
under the same lowest nulling outer join"? This seems a little more
consistent with the comment in prepjointree.c.

> 	 *	  lateral references to something outside the subquery being
> -	 *	  pulled up. (Even then, we could omit the PlaceHolderVar if
> -	 *	  the referenced rel is under the same lowest outer join, but
> -	 *	  it doesn't seem worth the trouble to check that.)
> +	 *	  pulled up. Even then, we could omit the PlaceHolderVar if
> +	 *	  the referenced rel is under the same lowest nulling outer
> +	 *	  join.
>
> I think this is clearer: "references something outside the subquery
> being pulled up and is not under the same lowest outer join."

Agreed. Will use this one.

> One other thing: it would be helpful to have the test query output be
> stable between HEAD and this patch; perhaps add:
>
> order by 1, 2, 3, 4, 5, 6, 7
>
> to ensure stability?

Thanks for the suggestion! I wondered about that too, but I'm a bit
confused about whether we should add ORDER BY to the test case. I
checked 'sql/join.sql' and found that some queries use ORDER BY but
some do not. Not sure what the criteria are.

Thanks
Richard
RE: Support logical replication of DDLs
On Wed, May 31, 2023 5:41 PM shveta malik wrote:
>
> PFA the set of patches consisting above changes. All the changes are
> made in 0008 patch.
>
> Apart from above changes, many partition attach/detach related tests
> are uncommented in alter_table.sql in patch 0008.
>

Thanks for updating the patch, here are some comments.

1. I think some code can be removed because it is not for supporting table
commands.

1.1 0001 patch, in deparse_RenameStmt(). OBJECT_ATTRIBUTE is type's attribute.

+	tmp_obj = new_objtree("CASCADE");
+	if (node->renameType != OBJECT_ATTRIBUTE ||
+		node->behavior != DROP_CASCADE)
+		append_not_present(tmp_obj);
+	append_object_object(ret, "cascade", tmp_obj);

1.2 0001 patch, in deparse_AlterRelation().

+			case AT_AddColumnToView:
+				/* CREATE OR REPLACE VIEW -- nothing to do here */
+				break;

1.3 0001 patch. ("alter table ... owner to ..." is deparsed in
deparse_AlterRelation(), instead of this function.)

+static ObjTree *
+deparse_AlterOwnerStmt(ObjectAddress address, Node *parsetree)
+{
+	AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree;
+
+	return new_objtree_VA("ALTER %{objtype}s %{identity}s OWNER TO %{newowner}I", 3,
+						  "objtype", ObjTypeString,
+						  stringify_objtype(node->objectType),
+						  "identity", ObjTypeString,
+						  getObjectIdentity(&address, false),
+						  "newowner", ObjTypeString,
+						  get_rolespec_name(node->newowner));
+}

1.4 0001 patch, in deparse_AlterSeqStmt(). I think only the "owned_by" option
is needed; the other options can't be generated by an "alter table" command.

+	foreach(cell, ((AlterSeqStmt *) parsetree)->options)
+	{
+		DefElem    *elem = (DefElem *) lfirst(cell);
+		ObjElem    *newelm;
+
+		if (strcmp(elem->defname, "cache") == 0)
+			newelm = deparse_Seq_Cache(seqform, false);
+		else if (strcmp(elem->defname, "cycle") == 0)
+			newelm = deparse_Seq_Cycle(seqform, false);
+		else if (strcmp(elem->defname, "increment") == 0)
+			newelm = deparse_Seq_IncrementBy(seqform, false);
+		else if (strcmp(elem->defname, "minvalue") == 0)
+			newelm = deparse_Seq_Minvalue(seqform, false);
+		else if (strcmp(elem->defname, "maxvalue") == 0)
+			newelm = deparse_Seq_Maxvalue(seqform, false);
+		else if (strcmp(elem->defname, "start") == 0)
+			newelm = deparse_Seq_Startwith(seqform, false);
+		else if (strcmp(elem->defname, "restart") == 0)
+			newelm = deparse_Seq_Restart(seqvalues->last_value);
+		else if (strcmp(elem->defname, "owned_by") == 0)
+			newelm = deparse_Seq_OwnedBy(objectId, false);
+		else if (strcmp(elem->defname, "as") == 0)
+			newelm = deparse_Seq_As(seqform);
+		else
+			elog(ERROR, "invalid sequence option %s", elem->defname);
+
+		elems = lappend(elems, newelm);
+	}

2. 0001 patch, in deparse_AlterTableStmt(),

+			case AT_CookedColumnDefault:
+				{
+					Relation	attrrel;
+					HeapTuple	atttup;
+					Form_pg_attribute attStruct;
...

I think this case can be removed because it is for "create table like", and
in such a case the function has returned before reaching here, see below.

+	/*
+	 * ALTER TABLE subcommands generated for TableLikeClause is processed in
+	 * the top level CREATE TABLE command; return empty here.
+	 */
+	if (stmt->table_like)
+		return NULL;

3. 0001 patch, in deparse_AlterRelation(). Is there any case where an "ALTER
TABLE" command adds an index which is not a constraint? If not, maybe we can
remove the check or replace it with an assert.

+			case AT_AddIndex:
+				{
...
+
+					if (!istmt->isconstraint)
+						break;

4. To run this test when building with meson, it seems we should add
"test_ddl_deparse_regress" to src/test/modules/meson.build.

5.
create table t1 (a int);
create table t2 (a int);
SET allow_in_place_tablespaces = true;
CREATE TABLESPACE ddl_tblspace LOCATION '';
RESET allow_in_place_tablespaces;
ALTER TABLE ALL IN
Re: ALTER TABLE SET ACCESS METHOD on partitioned tables
On Thu, May 25, 2023 at 03:49:12PM +0900, Michael Paquier wrote:
> looking at the patch. Here are a few comments.
...
> 	 *	No need to add an explicit dependency for the toast table, as the
> 	 *	main table depends on it.
> 	 */
> -	if (RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE)
> +	if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) ||
> +		relkind == RELKIND_PARTITIONED_TABLE)
>
> The comment at the top of this code block needs an update.

What do you think the comment ought to say?  It already says:

src/backend/catalog/heap.c-	 * Make a dependency link to force the relation to be deleted if its
src/backend/catalog/heap.c-	 * access method is.

> Speaking of which, ATExecSetAccessMethodNoStorage() does a catalog
> update even if ALTER TABLE is defined to use the same table AM as what
> is currently set.  There is no need to update the relation's pg_class
> entry in this case.  Be careful that InvokeObjectPostAlterHook() still
> needs to be checked in this case.  Perhaps some tests should be added
> in test_oat_hooks.sql?  It would be tempted to add a new SQL file for
> that.

Are you suggesting to put this in a conditional: if oldrelam != newAccessMethod?

+	((Form_pg_class) GETSTRUCT(tuple))->relam = newAccessMethod;
+	CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+
+	/* Update dependency on new AM */
+	changeDependencyFor(RelationRelationId, relid, AccessMethodRelationId,
+						oldrelam, newAccessMethod);

Why is that desirable?  I'd prefer to see it written with fewer
conditionals, not more; then, it hits the same path every time.

This ought to address your other comments.

--
Justin

>From a726bd7ca8844f6eee639d672afa7edace329caf Mon Sep 17 00:00:00 2001
From: Justin Pryzby
Date: Sun, 7 Mar 2021 00:11:38 -0600
Subject: [PATCH] Allow specifying access method of partitioned tables..
..to be inherited by partitions

See also:
ca4103025dfe26eaaf6a500dec9170fbb176eebc
8586bf7ed8889f39a59dd99b292014b73be85342
ebfe2dbd6b624e2a428e14b7ee9322cc096f63f7 - prevent DROP AM

Authors: Justin Pryzby, Soumyadeep Chakraborty
---
 doc/src/sgml/ref/alter_table.sgml       |  4 ++
 doc/src/sgml/ref/create_table.sgml      |  9 ++-
 src/backend/catalog/heap.c              |  3 +-
 src/backend/commands/tablecmds.c        | 89 +++--
 src/backend/utils/cache/lsyscache.c     | 22 ++
 src/backend/utils/cache/relcache.c      |  5 ++
 src/bin/pg_dump/pg_dump.c               |  3 +-
 src/bin/pg_dump/t/002_pg_dump.pl        | 34 ++
 src/include/catalog/pg_class.h          |  4 +-
 src/include/utils/lsyscache.h           |  1 +
 src/test/regress/expected/create_am.out | 62 -
 src/test/regress/sql/create_am.sql      | 25 +--
 12 files changed, 212 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index d4d93eeb7c6..d32d4c44b10 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -719,6 +719,10 @@ WITH ( MODULUS numeric_literal, REM
      This form changes the access method of the table by rewriting it. See
      for more information.
+     When applied to a partitioned table, there is no data to rewrite, but any
+     partitions created afterwards with
+     CREATE TABLE PARTITION OF will use that access method,
+     unless overridden by an USING clause.

diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 10ef699fab9..b20d272b151 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1295,9 +1295,12 @@ WITH ( MODULUS numeric_literal, REM
      This optional clause specifies the table access method to use to store
      the contents for the new table; the method needs be an access method of
      type TABLE. See for more
-     information. If this option is not specified, the default table access
-     method is chosen for the new table. See for more information.
+     information. If this option is not specified, a default table access
+     method is chosen for the new table.
+     When creating a partition, the default table access method is the
+     access method of its parent.
+     For other tables, the default is determined by
+     .

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 2a0d82aedd7..bbf8e08618b 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1452,7 +1452,8 @@ heap_create_with_catalog(const char *relname,
 	 * No need to add an explicit dependency for the toast table, as the
 	 * main table depends on it.
 	 */
-	if (RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE)
+	if ((RELKIND_HAS_TABLE_AM(relkind) && relkind != RELKIND_TOASTVALUE) ||
+		relkind ==
[BUG] pg_dump does not properly deal with BEGIN ATOMIC function
Hi,

What appears to be a pg_dump/pg_restore bug was observed with the new BEGIN
ATOMIC function body syntax introduced in Postgres 14.

Dependencies inside a BEGIN ATOMIC function cannot be resolved if those
dependencies are dumped after the function body. The repro case is when a
primary key constraint is used in an ON CONFLICT ON CONSTRAINT clause within
the function.

With the attached repro, pg_restore fails with

pg_restore: error: could not execute query: ERROR:  constraint "a_pkey" for table "a" does not exist
Command was: CREATE FUNCTION public.a_f(c1_in text, c2 integer DEFAULT 60) RETURNS void

I am not sure if the answer is to dump functions later on in the process.
Would appreciate some feedback on this issue.

Regards,

Sami Imseih
Amazon Web Services (AWS)

repro.sh
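For readers without the attachment, a minimal sketch of the failure pattern follows; the table and function definitions here are illustrative guesses reconstructed from the error message, not the exact attached repro.

```sql
-- A table whose primary key constraint the function references by name.
CREATE TABLE a (c1 text PRIMARY KEY, c2 integer);

-- A BEGIN ATOMIC body is parsed at CREATE time, so this function carries a
-- hard dependency on the constraint "a_pkey".
CREATE FUNCTION a_f(c1_in text, c2 integer DEFAULT 60) RETURNS void
LANGUAGE sql
BEGIN ATOMIC
    INSERT INTO a VALUES (c1_in, c2)
    ON CONFLICT ON CONSTRAINT a_pkey DO UPDATE SET c2 = excluded.c2;
END;
```

If pg_dump emits the function before the ALTER TABLE ... ADD CONSTRAINT that recreates a_pkey, pg_restore fails to create the function.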
Re: generate syscache info automatically
On 31.05.23 13:02, Dagfinn Ilmari Mannsåker wrote:
> For other patterns without the optional bits in the keyword, it becomes
> even simpler, e.g.
>
> 	if (/^DECLARE_TOAST\(\s*
> 			(?<parent_table>\w+),\s*
> 			(?<toast_oid>\d+),\s*
> 			(?<toast_index_oid>\d+)\s*
> 		\)/x
> 	  )
> 	{
> 		push @{ $catalog{toasting} }, {%+};
> 	}
>
> I'd be happy to submit a patch to do this for all the ParseHeader()
> regexes (in a separate thread) if others agree this is an improvement.

I would welcome such a patch.
Re: Docs: Encourage strong server verification with SCRAM
On Wed, May 31, 2023 at 10:08:39AM -0400, Jacob Champion wrote:
> LGTM!

Okay.  Does anybody have any comments and/or objections?
--
Michael
Re: [PATCH] Support % wildcard in extension upgrade filenames
On Thu, Apr 27, 2023 at 12:49:57PM +0200, Sandro Santilli wrote:
> On Mon, Apr 24, 2023 at 01:06:24PM -0400, Mat Arye wrote:
> > Hi All,
> >
> > I've done upgrade maintenance for multiple extensions now (TimescaleDB
> > core, Promscale) and I think the original suggestion (wildcard filenames
> > with control-file switch to enable) here is a good one.
>
> Thanks for your comment, Mat.
>
> I'm happy to bring back the control-file switch if there's an
> agreement about that.

I'm attaching an up-to-date version of the patch with the control-file
switch back in, so there's an explicit choice by extension developers.

--strk;

>From 466338401ce8305b7ac9aa59386816c3a6884f02 Mon Sep 17 00:00:00 2001
From: Sandro Santilli
Date: Wed, 14 Sep 2022 11:10:10 +0200
Subject: [PATCH v3] Allow wildcard (%) in extension upgrade paths

A wildcard character "%" will be accepted in the "source" side of the
upgrade script and be considered usable to upgrade any version to the
"target" side.

Using wildcards needs to be explicitly requested by extensions via a
"wildcard_upgrades" setting in their control file.

Includes regression test and documentation.
---
 doc/src/sgml/extend.sgml                      | 14 +
 src/backend/commands/extension.c              | 58 +--
 src/test/modules/test_extensions/Makefile     |  7 ++-
 .../expected/test_extensions.out              | 15 +
 .../test_extensions/sql/test_extensions.sql   |  9 +++
 .../test_ext_wildcard1--%--2.0.sql            |  6 ++
 .../test_ext_wildcard1--1.0.sql               |  6 ++
 .../test_ext_wildcard1.control                |  4 ++
 8 files changed, 110 insertions(+), 9 deletions(-)
 create mode 100644 src/test/modules/test_extensions/test_ext_wildcard1--%--2.0.sql
 create mode 100644 src/test/modules/test_extensions/test_ext_wildcard1--1.0.sql
 create mode 100644 src/test/modules/test_extensions/test_ext_wildcard1.control

diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml
index 218940ee5c..3d4003eaef 100644
--- a/doc/src/sgml/extend.sgml
+++ b/doc/src/sgml/extend.sgml
@@ -822,6 +822,20 @@ RETURNS anycompatible AS ...

+     wildcard_upgrades (boolean)
+
+       This parameter, if set to true (which is not the
+       default), allows ALTER EXTENSION to consider
+       a wildcard character % as matching any version of
+       the extension. Such wildcard match will only be used when no
+       perfect match is found for a version.

diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 0eabe18335..207b4649f2 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -88,6 +88,7 @@ typedef struct ExtensionControlFile
 	bool		relocatable;	/* is ALTER EXTENSION SET SCHEMA supported? */
 	bool		superuser;		/* must be superuser to install? */
 	bool		trusted;		/* allow becoming superuser on the fly? */
+	bool		wildcard_upgrades;	/* allow using wildcards in upgrade scripts */
 	int			encoding;		/* encoding of the script file, or -1 */
 	List	   *requires;		/* names of prerequisite extensions */
 	List	   *no_relocate;	/* names of prerequisite extensions that

@@ -132,6 +133,7 @@ static void ApplyExtensionUpdates(Oid extensionOid,
 								  bool cascade,
 								  bool is_create);
 static char *read_whole_file(const char *filename, int *length);
+static bool file_exists(const char *name);

 /*

@@ -584,6 +586,14 @@ parse_extension_control_file(ExtensionControlFile *control,
 						 errmsg("parameter \"%s\" requires a Boolean value",
 								item->name)));
 		}
+		else if (strcmp(item->name, "wildcard_upgrades") == 0)
+		{
+			if (!parse_bool(item->value, &control->wildcard_upgrades))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("parameter \"%s\" requires a Boolean value",
+								item->name)));
+		}
 		else if (strcmp(item->name, "encoding") == 0)
 		{
 			control->encoding = pg_valid_server_encoding(item->value);

@@ -656,6 +666,7 @@ read_extension_control_file(const char *extname)
 	control->relocatable = false;
 	control->superuser = true;
 	control->trusted = false;
+	control->wildcard_upgrades = false;
 	control->encoding = -1;

 	/*

@@ -913,7 +924,15 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control,
 	if (from_version == NULL)
 		elog(DEBUG1, "executing extension script for \"%s\" version '%s'", control->name, version);
 	else
+	{
+		if (control->wildcard_upgrades && !file_exists(filename))
+		{
+			elog(DEBUG1, "extension upgrade script \"%s\" does not exist, will try wildcard", filename);
+			/* if filename does not exist, try wildcard */
+			filename = get_extension_script_filename(control, "%", version);
+		}
 		elog(DEBUG1, "executing extension script for \"%s\" update from version '%s' to '%s'", control->name, from_version, version);
+	}

 	/*
 	 * If installing a trusted extension on behalf of a
Re: Support edit order of the fields in table
On Thu, 2023-06-01 at 00:31 +0800, Chang Wei 昌維 wrote:
> Hi Postgres community: I think support for editing the order of the fields
> in a table is a useful feature. I have known that the order of fields will
> affect the data structure of rows data, but I think we could add an extra
> piece of information to identify the display order of fields but not affect
> the rows data, and the order identification is only used to display in
> order while executing `SELECT * FROM [table_name]` and displaying the table
> structure on GUI tools like pgAdmin.
>
> Now, we must create a new view and define the order of fields if we need
> to display the fields of a table in an order of our demand; it is not a
> good way.

But PostgreSQL tables are not spreadsheets.  When, except in the display
of the result of interactive queries, would the order matter?

Yours,
Laurenz Albe
Re: Incremental View Maintenance, take 2
On Thu, 1 Jun 2023 23:59:09 +0900 Yugo NAGATA wrote:
> Hello hackers,
>
> Here's a rebased version of the patch-set adding Incremental View
> Maintenance support for PostgreSQL. That was discussed in [1].
> [1] https://www.postgresql.org/message-id/flat/20181227215726.4d166b4874f8983a641123f5%40sraoss.co.jp

---
* Overview

Incremental View Maintenance (IVM) is a way to make materialized views
up-to-date by computing only incremental changes and applying them on
views. IVM is more efficient than REFRESH MATERIALIZED VIEW when only
small parts of the view are changed.

** Feature

The attached patchset provides a feature that allows materialized views
to be updated automatically and incrementally just after an underlying
table is modified. You can create an incrementally maintainable
materialized view (IMMV) by using the CREATE INCREMENTAL MATERIALIZED
VIEW command.

The following are supported in view definition queries:

- SELECT ... FROM ... WHERE ..., joins (inner joins, self-joins)
- some built-in aggregate functions (count, sum, avg, min, max)
- GROUP BY clause
- DISTINCT clause

Views can contain multiple tuples with the same content (duplicate
tuples).

** Restriction

The following are not supported in a view definition:

- Outer joins
- Aggregates other than the above, window functions, HAVING
- Sub-queries, CTEs
- Set operations (UNION, INTERSECT, EXCEPT)
- DISTINCT ON, ORDER BY, LIMIT, OFFSET

Also, a view definition query cannot contain other views, materialized
views, foreign tables, partitioned tables, partitions, VALUES,
non-immutable functions, system columns, or expressions that contain
aggregates.

---
* Design

An IMMV is maintained using statement-level AFTER triggers. When an IMMV
is created, triggers are automatically created on all base tables
contained in the view definition query.

When a table is modified, changes that occurred in the table are
extracted as transition tables in the AFTER triggers.
Then, changes that will occur in the view are calculated by a rewritten
view query in which the modified table is replaced with the transition
table.

For example, if the view is defined as "SELECT * FROM R, S", and tuples
inserted into R are stored in a transition table dR, the tuples that
will be inserted into the view are calculated as the result of
"SELECT * FROM dR, S".

** Multiple Tables Modification

Multiple tables can be modified in a statement when using triggers,
foreign key constraints, or modifying CTEs. When multiple tables are
modified, we need the state of the tables before the modification.

For example, when some tuples, dR and dS, are inserted into R and S
respectively, the tuples that will be inserted into the view are
calculated by the following two queries:

"SELECT * FROM dR, S_pre"
"SELECT * FROM R, dS"

where S_pre is the table before the modification, and R is the current
state of the table, that is, after the modification. This pre-update
state of a table is calculated by filtering inserted tuples and
appending deleted tuples. The subquery that represents the pre-update
state is generated in get_prestate_rte(). Specifically, the inserted
tuples are filtered by calling IVM_visible_in_prestate() in the WHERE
clause. This function checks the visibility of tuples by using the
snapshot taken before the table modification. The deleted tuples are
contained in the old transition table, and this table is appended using
UNION ALL.

Transition tables for each modification are collected in each AFTER
trigger function call. Then, the view maintenance is performed in the
last call of the trigger.

In the original PostgreSQL, tuplestores of transition tables are freed
at the end of each nested query. However, their lifespan needs to be
prolonged to the end of the outermost query in order to maintain the
view in the last AFTER trigger. For this purpose,
SetTransitionTablePreserved is added in trigger.c.
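As a conceptual sketch (not the patch's exact generated SQL), the two delta queries for the dR/dS example could look like the following; dS_old stands for the old transition table, and the IVM_visible_in_prestate() call is simplified here, since the real function takes internal arguments:

```sql
-- Delta for the insertion into R, joined against S as it was *before*
-- the statement: current rows of S restricted to pre-existing tuples,
-- with rows deleted by this statement appended back from dS_old.
SELECT *
FROM dR,
     (SELECT * FROM S
      WHERE IVM_visible_in_prestate(...)   -- visible under the pre-statement snapshot
      UNION ALL
      SELECT * FROM dS_old) AS S_pre;

-- Delta for the insertion into S; R already reflects the modification.
SELECT * FROM R, dS;
```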
** Duplicate Tuples

When calculating the changes that will occur in the view (= delta
tables), the multiplicity of tuples is calculated by using count(*).

When deleting tuples from the view, the tuples to be deleted are
identified by joining the delta table with the view, and as many tuples
as the specified multiplicity are deleted, after numbering them with the
row_number() function. This is implemented in apply_old_delta().

When inserting tuples into the view, each tuple is duplicated to the
specified multiplicity using the generate_series() function. This is
implemented in apply_new_delta().

** DISTINCT clause

When DISTINCT is used, the view has a hidden column __ivm_count__ that
stores the multiplicity of tuples. When tuples are deleted from or
inserted into the view, the value of the __ivm_count__ column is
decreased or increased by the specified multiplicity. Eventually, when
the value becomes zero, the corresponding tuple is deleted from the
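A conceptual sketch of the multiplicity handling (illustrative SQL with made-up names immv, delta_old, delta_new, and a cnt column, not the queries actually generated by apply_old_delta()/apply_new_delta()):

```sql
-- Delete: number the view's duplicate tuples per value and drop the first
-- cnt of them, where cnt is the multiplicity recorded in the old delta.
WITH numbered AS (
    SELECT v.ctid, row_number() OVER (PARTITION BY v.x) AS rn, d.cnt
    FROM immv v JOIN delta_old d USING (x)
)
DELETE FROM immv
WHERE ctid IN (SELECT ctid FROM numbered WHERE rn <= cnt);

-- Insert: duplicate each new delta tuple to its multiplicity.
INSERT INTO immv
SELECT d.x FROM delta_new d, generate_series(1, d.cnt);
```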
Re: Why does pg_bsd_indent need to be installed?
On Wed, May 31, 2023 at 01:21:05PM -0400, Tom Lane wrote:
> Peter Eisentraut writes:
> > On 25.05.23 13:05, Tom Lane wrote:
> >> Well, if you know where the build directory is, sure.  But any way you
> >> slice it there is an extra bit of knowledge required.  Since pg_bsd_indent
> >> changes so seldom, keeping it in your PATH is at least as easy as any
> >> other solution, IMO.
>
> > The reason I bumped into this is that 15 and 16 use different versions
> > of pg_bsd_indent, so you can't just keep one copy in, like, ~/bin/.
>
> Well, personally, I never bother to adjust patches to the indentation
> rules of old versions, so using the latest pg_bsd_indent suffices.

I guess we could try looking for pg_bsd_indent-$MAJOR_VERSION first,
then pg_bsd_indent.

--
  Bruce Momjian  https://momjian.us
  EDB            https://enterprisedb.com

  Only you can decide what is important to you.
Re: Why does pg_bsd_indent need to be installed?
Peter Eisentraut writes:
> On 25.05.23 13:05, Tom Lane wrote:
>> Well, if you know where the build directory is, sure.  But any way you
>> slice it there is an extra bit of knowledge required.  Since pg_bsd_indent
>> changes so seldom, keeping it in your PATH is at least as easy as any
>> other solution, IMO.

> The reason I bumped into this is that 15 and 16 use different versions
> of pg_bsd_indent, so you can't just keep one copy in, like, ~/bin/.

Well, personally, I never bother to adjust patches to the indentation
rules of old versions, so using the latest pg_bsd_indent suffices.

			regards, tom lane
Re: [PATCH] Support % wildcard in extension upgrade filenames
On Thu, May 18, 2023 at 11:14:52PM +0200, Przemysław Sztoch wrote:
> For me, it would be a big help if you could specify a simple from/to range
> as the source version:
> myext--1.0.0-1.9.9--2.0.0.sql
> myext--2.0.0-2.9.9--3.0.0.sql
> myext--3.0.0-3.2.3--3.2.4.sql
>
> The idea of % wildcard in my opinion is fully contained in the from/to
> range.

Yes, it will be once partial matching is implemented (in its current
state the patch only allows the wildcard to cover the whole version,
not just a portion, but the idea was to move to partial matches too).

> The name "ANY" should also be allowed as the source version.

It is. The only character with a special meaning, in my patch, would be
the "%" character, and it would ONLY be special if the extension has set
the wildcard_upgrades directive to "true" in the .control file.

> Some extensions are a collection of CREATE OR REPLACE FUNCTION. All upgrades
> are one and the same SQL file.

Yes, this is the case for PostGIS, and that's why we're looking forward
to adding support for this wildcard.

--strk;
Re: generate syscache info automatically
Peter Eisentraut writes:
> The idea was mentioned in [0].  genbki.pl already knows everything about
> system catalog indexes.  If we add a "please also make a syscache for
> this one" flag to the catalog metadata, we can have genbki.pl produce
> the tables in syscache.c and syscache.h automatically.

+1 on this worthwhile reduction of manual work.

Tangentially, it reminded me of one of my least favourite parts of
Catalog.pm, the regexes in ParseHeader():

> diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm
> index 84aaeb002a..a727d692b7 100644
> --- a/src/backend/catalog/Catalog.pm
> +++ b/src/backend/catalog/Catalog.pm
> @@ -110,7 +110,7 @@ sub ParseHeader
> 			  };
> 		}
> 		elsif (
> -			/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(.+)\)/
> +			/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*(\w+),\s*(\d+),\s*(\w+),\s*(\w+),\s*(.+)\)/
> 		  )
> 		{
> 			push @{ $catalog{indexing} },
> @@ -120,7 +120,8 @@ sub ParseHeader
> 				index_name => $3,
> 				index_oid => $4,
> 				index_oid_macro => $5,
> -				index_decl => $6
> +				table_name => $6,
> +				index_decl => $7
> 			  };
> 		}
> 		elsif (/^DECLARE_OID_DEFINING_MACRO\(\s*(\w+),\s*(\d+)\)/)

Now that we require Perl 5.14, we could replace this parenthesis-
counting nightmare with named captures (introduced in Perl 5.10), which
would make the above change look like this instead (context expanded to
show the whole elsif block):

	elsif (
		/^DECLARE_(UNIQUE_)?INDEX(_PKEY)?\(\s*
			(?<index_name>\w+),\s*
			(?<index_oid>\d+),\s*
			(?<index_oid_macro>\w+),\s*
	+		(?<table_name>\w+),\s*
			(?<index_decl>.+)
		\)/x
	  )
	{
		push @{ $catalog{indexing} },
		  {
			is_unique => $1 ? 1 : 0,
			is_pkey => $2 ? 1 : 0,
			%+,
		  };
	}

For other patterns without the optional bits in the keyword, it becomes
even simpler, e.g.

	if (/^DECLARE_TOAST\(\s*
			(?<parent_table>\w+),\s*
			(?<toast_oid>\d+),\s*
			(?<toast_index_oid>\d+)\s*
		\)/x
	  )
	{
		push @{ $catalog{toasting} }, {%+};
	}

I'd be happy to submit a patch to do this for all the ParseHeader()
regexes (in a separate thread) if others agree this is an improvement.

- ilmari
Re: Do we want a hashset type?
On 5/31/23 17:40, Joel Jacobson wrote:
> On Wed, May 31, 2023, at 16:53, Tomas Vondra wrote:
>> I think this needs a better explanation - what exactly is a hashset in
>> this context? Something like an array with a hash for faster lookup of
>> unique elements, or what?
>
> In this context, by "hashset" I am indeed referring to a data structure
> similar to an array, where each element would be unique, and lookups
> would be faster than arrays for larger numbers of elements due to
> hash-based lookups.

Why would you want hash-based lookups? It should be fairly trivial to
implement as a user-defined data type, but in what cases are you going
to ask "does the hashset contain X"?

> This data structure would store identifiers (IDs) of the nodes, not the
> complete nodes themselves.

How does storing just the IDs solve anything? Isn't the main challenge
the random I/O when fetching the adjacent nodes? This does not really
improve that, no?

>> Presumably it'd store whole adjacent nodes, not just some sort of node
>> ID. So what if a node is adjacent to many other nodes? What if a node
>> is added/deleted/modified?
>
> That would require updating the hashset, which should be close to O(1)
> in practical applications.

But you need to modify hashsets for all the adjacent nodes. Also, O(1)
doesn't say it's cheap. I wonder how expensive it'd be in practice.

>> AFAICS the main problem is the lookups of adjacent nodes, generating
>> lot of random I/O etc. Presumably it's not that hard to keep the
>> "relational" schema with table for vertices/edges, and then an
>> auxiliary table with adjacent nodes grouped by node, possibly
>> maintained by a couple triggers. A bit like an "aggregated index"
>> except the queries would have to use it explicitly.
>
> Yes, auxiliary table would be good, since we don't want to duplicate
> all node-related data, and only store the IDs in the adjacent nodes
> hashset.

I may be missing something, but as mentioned, I don't quite see how this
would help. What exactly would this save us? If you create an index on
the edge node IDs, you should get the adjacent nodes pretty cheap from
an IOS.

Granted, a pre-built hashset is going to be faster, but if the next step
is fetching the node info, that's going to do a lot of random I/O,
dwarfing all of this.

It's entirely possible I'm missing some important aspect. It'd be very
useful to have a couple example queries illustrating the issue, that we
could use to actually test different approaches.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Support edit order of the fields in table
Hi Postgres community:

I think support for editing the order of the fields in a table is a useful
feature. I have known that the order of fields will affect the data
structure of rows data, but I think we could add an extra piece of
information to identify the display order of fields but not affect the
rows data, and the order identification is only used to display in order
while executing `SELECT * FROM [table_name]` and displaying the table
structure on GUI tools like pgAdmin.

Now, we must create a new view and define the order of fields if we need
to display the fields of a table in an order of our demand; it is not a
good way.

Many Thanks
Chang Wei
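The view workaround described above can be sketched like this (table and column names are illustrative):

```sql
-- Hypothetical table whose columns ended up in an unwanted order.
CREATE TABLE t (id int, created_at timestamptz, name text);

-- Today, a view is the way to present a different display order
-- without changing the on-disk row layout.
CREATE VIEW t_display AS
    SELECT id, name, created_at
    FROM t;
```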
Re: Do we want a hashset type?
On Wed, May 31, 2023, at 16:53, Tomas Vondra wrote:
> I think this needs a better explanation - what exactly is a hashset in
> this context? Something like an array with a hash for faster lookup of
> unique elements, or what?

In this context, by "hashset" I am indeed referring to a data structure
similar to an array, where each element would be unique, and lookups
would be faster than arrays for larger numbers of elements due to
hash-based lookups.

This data structure would store identifiers (IDs) of the nodes, not the
complete nodes themselves.

> Presumably it'd store whole adjacent nodes, not just some sort of node
> ID. So what if a node is adjacent to many other nodes? What if a node
> is added/deleted/modified?

That would require updating the hashset, which should be close to O(1)
in practical applications.

> AFAICS the main problem is the lookups of adjacent nodes, generating
> lot of random I/O etc. Presumably it's not that hard to keep the
> "relational" schema with table for vertices/edges, and then an
> auxiliary table with adjacent nodes grouped by node, possibly
> maintained by a couple triggers. A bit like an "aggregated index"
> except the queries would have to use it explicitly.

Yes, an auxiliary table would be good, since we don't want to duplicate
all node-related data, and only store the IDs in the adjacent nodes
hashset.

/Joel
Re: Do we want a hashset type?
On 5/31/23 16:09, Joel Jacobson wrote: > Hi, > > I've been working with a social network start-up that uses PostgreSQL as > their > only database. Recently, they became interested in graph databases, largely > because of an article [1] suggesting that a SQL database "just chokes" > when it > encounters a depth-five friends-of-friends query (for a million users having > 50 friends each). > > The article didn't provide the complete SQL queries, so I had to buy the > referenced book to get the full picture. It turns out, the query was a > simple > self-join, which, of course, isn't very efficient. When we rewrote the query > using a modern RECURSIVE CTE, it worked but still took quite some time. > > Of course, there will always be a need for specific databases, and some > queries > will run faster on them. But, if PostgreSQL could handle graph queries > with a > Big-O runtime similar to graph databases, scalability wouldn't be such a big > worry. > > Just like the addition of the JSON type to PostgreSQL helped reduce the hype > around NoSQL, maybe there's something simple that's missing in > PostgreSQL that > could help us achieve the same Big-O class performance as graph > databases for > some of these type of graph queries? > > Looking into the key differences between PostgreSQL and graph databases, > it seems that one is how they store adjacent nodes. In SQL, a graph can be > represented as one table for the Nodes and another table for the Edges. > For a friends-of-friends query, we would need to query Edges to find > adjacent > nodes, recursively. > > Graph databases, on the other hand, keep adjacent nodes immediately > accessible > by storing them with the node itself. This looks like a major difference in > terms of how the data is stored. > > Could a hashset type help bridge this gap? > > The idea would be to store adjacent nodes as a hashset column in a Nodes > table. > I think this needs a better explanation - what exactly is a hashset in this context? 
Something like an array with a hash for faster lookup of unique elements, or what? Presumably it'd store whole adjacent nodes, not just some sort of node ID. So what if a node is adjacent to many other nodes? What if a node is added/deleted/modified? AFAICS the main problem is the lookups of adjacent nodes, generating a lot of random I/O etc. Presumably it's not that hard to keep the "relational" schema with a table for vertices/edges, and then an auxiliary table with adjacent nodes grouped by node, possibly maintained by a couple of triggers. A bit like an "aggregated index" except the queries would have to use it explicitly. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
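The auxiliary "aggregated index" idea can be sketched language-neutrally. Below is a toy Python model (not an actual PostgreSQL design; the function names stand in for the triggers that would fire on the edges table) showing an edges relation kept in sync with a per-node adjacency map:

```python
# Toy model of the "aggregated index" idea: an edges table plus an
# auxiliary adjacency map, kept in sync by insert/delete "triggers".
edges = set()
adjacency = {}  # node id -> set of adjacent node ids

def on_edge_insert(a, b):
    # Fired when a friendship (a, b) is added; maintains both directions.
    edges.add((a, b))
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def on_edge_delete(a, b):
    # Fired when a friendship is removed; discard is a no-op if absent.
    edges.discard((a, b))
    adjacency.get(a, set()).discard(b)
    adjacency.get(b, set()).discard(a)

on_edge_insert(1, 2)
on_edge_insert(1, 3)
on_edge_delete(1, 2)
print(adjacency[1])  # -> {3}
```

As Tomas notes, queries would have to consult the auxiliary structure explicitly rather than getting it transparently from the planner.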
Re: Refactor ssl tests to avoid using internal PostgreSQL::Test::Cluster methods
> On 31 May 2023, at 15:46, Melih Mutlu wrote: > I was comparing this new restart function to start and stop functions. > I see that restart() does not return a value if it's successful while > others return 1. > Its return value is not checked anywhere, so it may not be useful at > the moment. But I feel like it would be nice to make it look like > start()/stop(). What do you think? It should absolutely return 1, that was an oversight. Fixed as well as documentation updated. >> command_fails( >>[ 'pg_ctl', '-D', $node->data_dir, '-l', $node->logfile, 'restart' ], >>'restart fails with incorrect SSL protocol bounds'); > > There are two other places where ssl tests restart the node like > above. We can call $node->restart in those lines too. Fixed in the attached v2. -- Daniel Gustafsson
Attachment: v2-0001-Avoid-using-internal-test-methods-in-SSL-tests.patch
Do we want a hashset type?
Hi,

I've been working with a social network start-up that uses PostgreSQL as their only database. Recently, they became interested in graph databases, largely because of an article [1] suggesting that a SQL database "just chokes" when it encounters a depth-five friends-of-friends query (for a million users having 50 friends each).

The article didn't provide the complete SQL queries, so I had to buy the referenced book to get the full picture. It turns out, the query was a simple self-join, which, of course, isn't very efficient. When we rewrote the query using a modern RECURSIVE CTE, it worked but still took quite some time.

Of course, there will always be a need for specific databases, and some queries will run faster on them. But, if PostgreSQL could handle graph queries with a Big-O runtime similar to graph databases, scalability wouldn't be such a big worry.

Just like the addition of the JSON type to PostgreSQL helped reduce the hype around NoSQL, maybe there's something simple that's missing in PostgreSQL that could help us achieve the same Big-O class performance as graph databases for some of these types of graph queries?

Looking into the key differences between PostgreSQL and graph databases, it seems that one is how they store adjacent nodes. In SQL, a graph can be represented as one table for the Nodes and another table for the Edges. For a friends-of-friends query, we would need to query Edges to find adjacent nodes, recursively.

Graph databases, on the other hand, keep adjacent nodes immediately accessible by storing them with the node itself. This looks like a major difference in terms of how the data is stored.

Could a hashset type help bridge this gap? The idea would be to store adjacent nodes as a hashset column in a Nodes table.

Apache AGE is an option for users who really need a new graph query language.
But I believe there are more users who occasionally run into a graph problem and would be glad if there was an efficient way to solve it in SQL without having to bring in a new database. /Joel [1] https://neo4j.com/news/how-much-faster-is-a-graph-database-really/
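The thread doesn't include the actual CTE, but the adjacency-set access pattern being proposed can be illustrated with a minimal, hypothetical sketch (plain Python, not SQL or a proposed API): a depth-limited friends-of-friends expansion where each neighbour lookup is an O(1)-average set operation rather than a join.

```python
from collections import defaultdict

def friends_of_friends(edges, start, depth):
    """Breadth-first expansion to the given depth using per-node
    adjacency sets, so each neighbour lookup avoids a relational join."""
    adjacency = defaultdict(set)
    for a, b in edges:          # undirected friendships
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = {start}
    frontier = {start}
    for _ in range(depth):
        nxt = set()
        for node in frontier:
            nxt |= adjacency[node]      # one set union per node
        frontier = nxt - seen
        seen |= frontier
    return seen - {start}

edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]
print(sorted(friends_of_friends(edges, 1, 2)))  # -> [2, 3, 4, 5]
```

This is essentially what a graph database does internally; the question in the thread is whether a hashset column could give SQL the same per-node lookup behaviour.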
Re: Docs: Encourage strong server verification with SCRAM
On Sun, May 28, 2023 at 2:22 PM Jonathan S. Katz wrote: > The above assumes that the reader reviewed the previous paragraph and > followed the guidelines there. However, we can make it explicit. Please > see attached. LGTM! Thanks, --Jacob
Re: Refactor ssl tests to avoid using internal PostgreSQL::Test::Cluster methods
Hi Daniel, Thanks for the patch.

On Wed, 31 May 2023 at 15:48, Daniel Gustafsson wrote:
> To avoid this, the attached adds fail_ok functionality to restart() which makes
> it easier to use it in tests, and aligns it with how stop() and start() works.
> The resulting SSL tests are also more readable IMO.

I agree that it's more readable this way. I only have a few comments:

> + BAIL_OUT("pg_ctl restart failed") unless $params{fail_ok};
> + return 0;
> + }
>
> $self->_update_pid(1);
> return;

I was comparing this new restart function to the start and stop functions. I see that restart() does not return a value if it's successful while the others return 1. Its return value is not checked anywhere, so it may not be useful at the moment. But I feel like it would be nice to make it look like start()/stop(). What do you think?

> command_fails(
> [ 'pg_ctl', '-D', $node->data_dir, '-l', $node->logfile, 'restart' ],
> 'restart fails with incorrect SSL protocol bounds');

There are two other places where the ssl tests restart the node like above. We can call $node->restart in those lines too. Best, -- Melih Mutlu Microsoft
Refactor ssl tests to avoid using internal PostgreSQL::Test::Cluster methods
The SSL tests for pg_ctl restarts with an incorrect key passphrase run pg_ctl manually and use the internal method _update_pid to set the server PID file accordingly. This is needed since $node->restart will BAIL in case the restart fails, which clearly isn't useful to anyone wanting to test restarts. This is the only use of _update_pid outside of Cluster.pm. To avoid this, the attached adds fail_ok functionality to restart() which makes it easier to use it in tests, and aligns it with how stop() and start() works. The resulting SSL tests are also more readable IMO. -- Daniel Gustafsson
Attachment: v1-0001-Avoid-using-internal-test-methods-in-SSL-tests.patch
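The shape of the proposed API change can be sketched outside of Perl. The following is a minimal Python stand-in (illustrative only, not the actual PostgreSQL::Test::Cluster code): restart() gains a fail_ok flag so a failing restart becomes observable to the test instead of aborting the whole run.

```python
class Cluster:
    """Toy model of the TAP cluster API change: restart() with fail_ok."""
    def __init__(self, can_restart=True):
        # can_restart=False models e.g. an incorrect key passphrase.
        self.can_restart = can_restart

    def restart(self, fail_ok=False):
        if not self.can_restart:
            if not fail_ok:
                # Corresponds to BAIL_OUT in the TAP harness: the whole
                # test run aborts, which is useless for testing failures.
                raise RuntimeError("pg_ctl restart failed")
            return 0        # observable failure, test keeps running
        return 1            # success, matching start()/stop()

broken = Cluster(can_restart=False)
print(broken.restart(fail_ok=True))  # -> 0
```

With this shape, a test asserting "restart fails with incorrect SSL protocol bounds" can simply check the return value instead of running pg_ctl by hand and poking _update_pid.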
Re: benchmark results comparing versions 15.2 and 16
Hi Mark, On Tue, May 30, 2023 at 1:03 PM MARK CALLAGHAN wrote: > Do you want me to try PG 16 without ICU or PG 15 with ICU? > I can do that, but it will take a few days before the server is available. Sorry for the late reply. Most of the Postgres developers (myself included) are attending pgCon right now. It would be nice to ascertain just how much of a boost we're getting from our use of ICU as far as sysbench goes. I'd appreciate having that information. We discussed the choice of ICU as default collation provider at yesterday's developer meeting: https://wiki.postgresql.org/wiki/PgCon_2023_Developer_Meeting#High_level_thoughts_and_feedback_on_moving_toward_ICU_as_the_preferred_collation_provider https://wiki.postgresql.org/wiki/StateOfICU Just confirming my theory about abbreviated keys (without rerunning the benchmark) should be simple - perhaps that would be a useful place to start. You could just rerun the two individual queries of interest from an interactive psql session. There are even low-level debug messages available through the trace_sort GUC. From a psql session you'd run something along the lines of "set client_min_messages=log; set trace_sort=on; $QUERY". You'll see lots of LOG messages with specific information about the use of abbreviated keys and the progress of each sort. Thanks -- Peter Geoghegan
Re: WAL Insertion Lock Improvements
On Mon, May 22, 2023 at 09:26:25AM +0900, Michael Paquier wrote:
> Simpler and consistent, nice. I don't have much more to add, so I
> have switched the patch as RfC.

While at PGcon, Andres asked me how many sockets are in the environment I used for the tests, and lscpu tells me the following, which is more than 1:

CPU(s):              64
On-line CPU(s) list: 0-63
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        2

@Andres: Were there any extra tests you wanted to be run for more input? -- Michael
Re: PG 16 draft release notes ready
On Wed, May 31, 2023 at 06:03:01PM +1200, David Rowley wrote: > I don't think this should go under "E.1.3.11. Source Code". The patch > was entirely aimed to increase performance, not just of allocations > themselves, but of any operations which uses palloc'd memory. This is > due to the patch increasing the density of memory allocation on blocks > malloc'd by our memory context code so that fewer CPU cache lines need > to be touched in the entire backend process for *all* memory that's > allocated with palloc. The performance increase here can be fairly > significant for small-sized palloc requests when CPU cache pressure is > high. Since CPU caches aren't that big, it does not take much of a > query to put the cache pressure up. Hashing or sorting a few million > rows is going to do that. > > The patch here was born out of the regression report I made in [1], > which I mention in [2] about the prototype patch Andres wrote to fix > the performance regression. > > I think "E.1.3.1.2. General Performance" might be a better location. > Having it under "Source Code" makes it sound like it was some > refactoring work. That's certainly not the case. Okay, moved. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Re: Why does pg_bsd_indent need to be installed?
On 25.05.23 13:05, Tom Lane wrote: Well, if you know where the build directory is, sure. But any way you slice it there is an extra bit of knowledge required. Since pg_bsd_indent changes so seldom, keeping it in your PATH is at least as easy as any other solution, IMO. The reason I bumped into this is that 15 and 16 use different versions of pg_bsd_indent, so you can't just keep one copy in, like, ~/bin/.
Re: testing dist tarballs
On 27.05.23 14:47, Andres Freund wrote: Separately, it's somewhat confusing that we include errcodes.h etc in src/backend/utils, rather than its final location, in src/include/utils. It works, even without perl, because copying the file doesn't require perl, it's just generating it... The "copying" is actually a symlink, right? I don't think we want to ship symlinks in the tarball? Fair point - still seems we should just create the files in the right directory instead of doing it in the wrong place and then creating symlinks to make them accessible... Right. I think the reason this was set up this way is that with make it is generally dubious to create target files outside of the current directory.
Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Hi,

On 5/30/23 12:34 PM, Drouvot, Bertrand wrote:

Hi hackers, Please find attached a patch proposal to $SUBJECT. Indeed, we have seen occurrences in [1] where some slots were not invalidated (while we expected vacuum to remove dead rows leading to slots invalidation on the standby). Though we don't have strong evidence that this was due to transactions holding back the global xmin (as vacuum did not run in verbose mode), suspicion is high enough (as Tom pointed out that the test is broken on its face (see [1])).

The proposed patch:
- set autovacuum = off on the primary (as autovacuum is the usual suspect for holding global xmin).
- Ensure that vacuum is able to remove dead rows before launching the slots invalidation tests.
- If after 10 attempts the vacuum is still not able to remove the dead rows, then the slots invalidation tests are skipped: that should be pretty rare, as currently the majority of the tests are green (with only one attempt).

While at it, the patch also addresses the nitpicks mentioned by Robert in [2].

[1]: https://www.postgresql.org/message-id/flat/OSZPR01MB6310CFFD7D0DCD60A05DB1C3FD4A9%40OSZPR01MB6310.jpnprd01.prod.outlook.com#71898e088d2a57564a1bd9c41f3e6f36
[2]: https://www.postgresql.org/message-id/CA%2BTgmobHGpU2ZkChgKifGDLaf_%2BmFA7njEpeTjfyNf_msCZYew%40mail.gmail.com

Please find attached V2 that, instead of the above proposal, waits for a new snapshot that has a newer horizon before doing the vacuum (as proposed by Andres in [1]). So, V2:
- set autovacuum = off on the primary (as autovacuum is the usual suspect for holding global xmin).
- waits for a new snapshot that has a newer horizon before doing the vacuum(s).
- addresses the nitpicks mentioned by Robert in [2].

V2 also keeps the verbose mode for the vacuum(s) (as done in V1), as it may help for further analysis if needed.
[1]: https://www.postgresql.org/message-id/20230530152426.ensapay7pozh7ozn%40alap3.anarazel.de [2]: https://www.postgresql.org/message-id/CA%2BTgmobHGpU2ZkChgKifGDLaf_%2BmFA7njEpeTjfyNf_msCZYew%40mail.gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comFrom 383bfcf39257d4542b35ffe2ab56b43182ca2dea Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date: Tue, 30 May 2023 07:54:02 + Subject: [PATCH v2] Ensure vacuum is able to remove dead rows in 035_standby_logical_decoding We want to ensure that vacuum was able to remove dead rows (aka no other transactions holding back global xmin) before testing for slots invalidation on the standby. --- .../t/035_standby_logical_decoding.pl | 76 +-- 1 file changed, 37 insertions(+), 39 deletions(-) 100.0% src/test/recovery/t/ diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl index 64beec4bd3..bd426f3068 100644 --- a/src/test/recovery/t/035_standby_logical_decoding.pl +++ b/src/test/recovery/t/035_standby_logical_decoding.pl @@ -185,7 +185,7 @@ sub change_hot_standby_feedback_and_wait_for_xmins # Check conflicting status in pg_replication_slots. 
sub check_slots_conflicting_status { - my ($conflicting) = @_; + my ($conflicting, $details) = @_; if ($conflicting) { @@ -193,7 +193,7 @@ sub check_slots_conflicting_status 'postgres', qq( select bool_and(conflicting) from pg_replication_slots;)); - is($res, 't', "Logical slots are reported as conflicting"); + is($res, 't', "logical slots are reported as conflicting $details"); } else { @@ -201,7 +201,7 @@ sub check_slots_conflicting_status 'postgres', qq( select bool_or(conflicting) from pg_replication_slots;)); - is($res, 'f', "Logical slots are reported as non conflicting"); + is($res, 'f', "logical slots are reported as non conflicting $details"); } } @@ -256,6 +256,22 @@ sub check_for_invalidation ) or die "Timed out waiting confl_active_logicalslot to be updated"; } +# Launch $sql and wait for a new snapshot that has a newer horizon before +# doing the vacuum with $vac_option on $to_vac. +sub wait_until_vacuum_can_remove +{ + my ($vac_option, $sql, $to_vac) = @_; + + my $xid = $node_primary->safe_psql('testdb', qq[$sql + select txid_current();]); + + $node_primary->poll_query_until('testdb', + "SELECT (select txid_snapshot_xmin(txid_current_snapshot()) - $xid) > 0") + or die "new snapshot does not have a newer horizon"; + + $node_primary->safe_psql('testdb', qq[VACUUM $vac_option verbose $to_vac; + INSERT INTO flush_wal DEFAULT VALUES; --
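The wait_until_vacuum_can_remove helper above leans on poll_query_until to wait for a snapshot whose horizon has moved past the test's xid. Stripped of the SQL, the retry pattern looks roughly like this (a Python stand-in with a hypothetical probe, not the actual Perl test code):

```python
import time

def poll_until(probe, timeout=30.0, interval=0.1):
    """Retry probe() until it returns true or the timeout expires,
    mirroring the behaviour of poll_query_until in the TAP framework."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

attempts = {"n": 0}

def horizon_advanced():
    # Stand-in for the real probe, which checks
    # txid_snapshot_xmin(txid_current_snapshot()) - $xid > 0
    attempts["n"] += 1
    return attempts["n"] >= 3

print(poll_until(horizon_advanced, timeout=5.0))  # -> True
```

Only once the probe succeeds does the test run the VACUUM, guaranteeing no older transaction is still holding back the global xmin.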
RE: Support logical replication of DDLs
On Tues, May 30, 2023 at 19:19 PM vignesh C wrote: > The attached patch has the changes for the above. Thanks for updating the new patch set. Here are some comments: === For patch 0001 1. In the function deparse_Seq_As. ``` + if (OidIsValid(seqdata->seqtypid)) + append_object_object(ret, "seqtype", + new_objtree_for_type(seqdata->seqtypid, -1)); + else + append_not_present(ret); ``` I think seqdata->seqtypid is always valid because we get this value from pg_sequence. I think it's fine to not check this value here. ~~~ 2. Deparsed results of the partition table. When I run the following SQLs: ``` create table parent (a int primary key) partition by range (a); create table child partition of parent default; ``` I got the following two deparsed results: ``` CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a) CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT ``` When I run these two deparsed results on another instance, I got the following error: ``` postgres=# CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a); CREATE TABLE postgres=# CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT; ERROR: multiple primary keys for table "child" are not allowed ``` I think that we could skip deparsing the primary key related constraint for partition (child) table in the function obtainConstraints for this case. === For patch 0008 3. Typos in the comments atop the function append_object_to_format_string ``` - * Return the object name which is extracted from the input "*%{name[:.]}*" - * style string. And append the input format string to the ObjTree. + * Append new jsonb key:value pair to the output parse state -- varargs version. + * + * The "fmt" argument is used to append as a "fmt" element in current object. 
+ * The "skipObject" argument indicates if we want to skip object creation + * considering it will be taken care by the caller. + * The "numobjs" argument indicates the number of extra elements to append; + * for each one, a name (string), type (from the jbvType enum) and value must + * be supplied. The value must match the type given; for instance, jbvBool + * requires an * bool, jbvString requires a char * and so no, + * Each element type must match the conversion specifier given in the format + * string, as described in ddl_deparse_expand_command. + * + * Note we don't have the luxury of sprintf-like compiler warnings for + * malformed argument lists. */ -static char * -append_object_to_format_string(ObjTree *tree, char *sub_fmt) +static JsonbValue * +new_jsonb_VA(JsonbParseState *state, char *fmt, bool skipObject, int numobjs,...) ``` s/and so no/and so on s/requires an * bool/requires an bool s/type must/type must ~~~ 4. Typos of the function new_jsonb_for_type ``` /* - * Allocate a new object tree to store parameter values. + * A helper routine to insert jsonb for coltyp to the output parse state. */ -static ObjTree * -new_objtree(char *fmt) +static void +new_jsonb_for_type(JsonbParseState *state, Oid typeId, int32 typmod) ... + format_type_detailed(typeId, typmod, +, _name, , _array); ``` s/coltyp/typId s/typeId/typId ~~~ 5. In the function deparse_ColumnDef_toJsonb + /* +* create coltype object having 4 elements: schemaname, typename, typemod, +* typearray +*/ + { + /* Push the key first */ + insert_jsonb_key(state, "coltype"); + + /* Push the value */ + new_jsonb_for_type(state, typid, typmod); + } The '{ }' here seems a little strange. Do we need them? Many places have written the same as here in this patch. Regards, Wang wei
Re: ERROR: wrong varnullingrels (b 3) (expected (b)) for Var 2/1
> On 31.05.2023, at 08:36, Richard Guo wrote: > > Attached is a patch for that. Does this make sense? > > Thanks > Richard > All I can say is that it fixes the error for me — also for the non-simplified original query that I have. -markus
Re: ERROR: wrong varnullingrels (b 3) (expected (b)) for Var 2/1
On Wed, May 31, 2023 at 10:47 AM Richard Guo wrote: > On Tue, May 30, 2023 at 10:48 PM Markus Winand > wrote: > >> I found an error similar to others before ([1]) that is still persists as >> of head right now (0bcb3ca3b9). >> >> CREATE TABLE t ( >> n INTEGER >> ); >> >> SELECT * >> FROM (VALUES (1)) t(c) >> LEFT JOIN t ljl1 ON true >> LEFT JOIN LATERAL (WITH cte AS (SELECT * FROM t WHERE t.n = ljl1.n) >> SELECT * FROM cte) ljl2 ON ljl1.n = 1; >> >> ERROR: wrong varnullingrels (b 3) (expected (b)) for Var 2/1 >> >> Note that the error does **not** occur if the CTE is unwrapped like this: >> >> SELECT * >> FROM (VALUES (1)) t(c) >> LEFT JOIN t ljl1 ON true >> LEFT JOIN LATERAL (SELECT * FROM t WHERE t.n = ljl1.n) ljl2 ON ljl1.n = >> 1; > > > Thanks for the report! Reproduced here. Also it can be reproduced with > subquery, as long as the subquery is not pulled up. > > SELECT * > FROM (VALUES (1)) t(c) > LEFT JOIN t ljl1 ON true > LEFT JOIN LATERAL (SELECT * FROM t WHERE t.n = ljl1.n offset 0) ljl2 ON > ljl1.n = 1; > ERROR: wrong varnullingrels (b 3) (expected (b)) for Var 2/1 > > When we transform the first form of identity 3 to the second form, we've > converted Pb*c to Pbc in deconstruct_distribute_oj_quals. But we > neglect to consider that rel C might be a RTE_SUBQUERY and contains > quals that have lateral references to B. So the B vars in such quals > have wrong nulling bitmaps and we'd finally notice that when we do > fix_upper_expr for the NestLoopParam expressions. > We can identify in which form of identity 3 the plan is built up by checking the relids of the B/C join's outer rel. If it's in the first form, the outer rel's relids must contain the A/B join. Otherwise it should only contain B's relid. So I'm considering that maybe we can adjust the nulling bitmap for nestloop parameters according to that. Attached is a patch for that. Does this make sense? Thanks Richard
Attachment: v1-0001-Fix-nulling-bitmap-for-nestloop-parameters.patch
Re: [16Beta1][doc] pgstat: Track time of the last scan of a relation
On Wed, 31 May 2023 at 15:57, Shinoda, Noriyoshi (PN Japan FSIP) wrote: > According to the documentation [2], the data type of the columns added to > these views is 'timestamptz'. > However, columns of the same data type in pg_stat_all_tables.last_vacuum, > last_analyze and other tables are unified to 'timestamp with time zone'. The > attached patch changes the data type of the added column from timestamptz to > timestamp with time zone. I agree that it would be good to make those consistently use timestamp with time zone for all columns of that type in the docs for pg_stat_all_tables. More generally, it might be good if we did it for the entire docs: doc $ git grep "timestamptz" | wc -l 17 doc $ git grep "timestamp with time zone" | wc -l 74 Clearly "timestamp with time zone" is much more commonly used. The bar is probably set a bit higher for changing the longer-established ones, however. David
Re: PG 16 draft release notes ready
On Wed, 31 May 2023 at 11:32, Bruce Momjian wrote:
> On Thu, May 25, 2023 at 05:57:25PM +1200, David Rowley wrote:
> > On 64-bit builds, it was 16 bytes for AllocSet contexts, 24 bytes for
> > generation contexts and 16 bytes for slab contexts.
>
> Okay, item added to Source Code:

I don't think this should go under "E.1.3.11. Source Code". The patch was entirely aimed to increase performance, not just of allocations themselves, but of any operations which uses palloc'd memory. This is due to the patch increasing the density of memory allocation on blocks malloc'd by our memory context code so that fewer CPU cache lines need to be touched in the entire backend process for *all* memory that's allocated with palloc. The performance increase here can be fairly significant for small-sized palloc requests when CPU cache pressure is high. Since CPU caches aren't that big, it does not take much of a query to put the cache pressure up. Hashing or sorting a few million rows is going to do that.

The patch here was born out of the regression report I made in [1], which I mention in [2] about the prototype patch Andres wrote to fix the performance regression.

I think "E.1.3.1.2. General Performance" might be a better location. Having it under "Source Code" makes it sound like it was some refactoring work. That's certainly not the case.

A bit more detail: Here's a small histogram of the number of allocations in various size buckets from running make check with some debug output in AllocSetAlloc and GenerationAlloc to record the size of the allocation:

     bucket      | number_of_allocations | percent_of_total_allocations
-----------------+-----------------------+------------------------------
 up to 16 bytes  |             8,881,106 |                        31.39
 up to 32 bytes  |             4,579,608 |                        16.18
 up to 64 bytes  |             6,574,107 |                        23.23
 above 64 bytes  |             8,260,714 |                        29.19

So quite a large portion of our allocations (at least in our test suite) are small. Halving the 16-byte chunk header down to 8 bytes on a 16-byte allocation means a 25% memory saving.
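The 25% figure follows directly from the header sizes quoted in the thread. A quick worked calculation (which deliberately ignores power-of-2 chunk rounding and other real allocator details):

```python
# Per-allocation footprint = requested bytes + chunk header bytes.
def allocated_bytes(request, header):
    return request + header

old = allocated_bytes(16, 16)  # 16-byte palloc + old 16-byte header = 32
new = allocated_bytes(16, 8)   # 16-byte palloc + new 8-byte header  = 24
saving = 1 - new / old
print(old, new, f"{saving:.0%}")  # -> 32 24 25%
```

For the "up to 16 bytes" bucket, which is the largest in the histogram, that per-chunk saving applies to roughly a third of all allocations.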
David [1] https://www.postgresql.org/message-id/CAApHDvqXpLzav6dUeR5vO_RBh_feHrHMLhigVQXw9jHCyKP9PA%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAApHDvowHNSVLhMc0cnovg8PfnYQZxit-gP_bn3xkT4rZX3G0w%40mail.gmail.com