Help regarding figuring out routes in pgAdmin4
Dear PostgreSQL Hackers,

I hope this email finds you well. As I navigate through the codebase, I've encountered some challenges understanding how routes are implemented within pgAdmin4. While I've made efforts to study the documentation and examine the source code, I'm still struggling to grasp certain aspects, particularly in files like the roles module's __init__.py. Coming from a different stack, I'm accustomed to patterns and conventions which seem to differ in Flask. Specifically, I cannot locate the familiar @blueprint.route decorator in these files, which has left me somewhat perplexed.

Thank you very much for your time and assistance. I look forward to hearing from you at your earliest convenience.
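[A note for readers hitting the same confusion: Flask routes don't have to be declared with @blueprint.route; they can also be registered programmatically via Blueprint.add_url_rule() against class-based views, which is the style pgAdmin's module machinery uses. The sketch below is a framework-free toy of that pattern; Blueprint, MethodView, and RoleView here are illustrative stand-ins, not pgAdmin's actual classes.]

```python
# Minimal, framework-free sketch of decorator-less route registration,
# the pattern behind Flask's Blueprint.add_url_rule + MethodView.as_view.
# All names are illustrative; this is not pgAdmin code.

class Blueprint:
    def __init__(self, name):
        self.name = name
        self.routes = {}          # url rule -> view callable

    def add_url_rule(self, rule, view_func):
        self.routes[rule] = view_func

class MethodView:
    """Dispatches to get()/delete()/... by HTTP verb, like flask.views.MethodView."""
    @classmethod
    def as_view(cls):
        def view(method, *args, **kwargs):
            handler = getattr(cls(), method.lower())
            return handler(*args, **kwargs)
        return view

class RoleView(MethodView):
    def get(self, role_id):
        return f"role {role_id}"

    def delete(self, role_id):
        return f"dropped role {role_id}"

# Routes appear here, in one registration call, not above each handler,
# which is why grepping for @blueprint.route finds nothing:
bp = Blueprint("role")
bp.add_url_rule("/browser/role/obj/<int:role_id>", RoleView.as_view())

print(bp.routes["/browser/role/obj/<int:role_id>"]("GET", 42))   # role 42
```

So when reading roles/__init__.py, look for where the view class is bound to URL rules during module registration rather than for per-function decorators.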
Re: Removing unneeded self joins
On 3/5/2024 20:55, Robert Haas wrote:
> One of my most embarrassing gaffes in this area personally was
> a448e49bcbe40fb72e1ed85af910dd216d45bad8. I don't know how I managed to
> commit the original patch without realizing it was going to cause an
> increase in the WAL size, but I can tell you that when I realized it, my
> heart sank through the floor.

I discovered this feature and agree that it looks like a severe problem. Unfortunately, in the case of the SJE patch, the committer and reviewers didn't provide negative feedback. The only 'negative feelings' (I'm not sure I'm using the proper English phrase) came from people who haven't reviewed or analysed it at all (at least, they didn't mention it). Considering the situation, I suggest setting the default value of enable_self_join_removal to false in PG17 for added safety, and then changing it to true early in PG18.

Having no objective negative feedback, we have no reason to change anything in the design or any part of the code. It looks regrettable and unusual. After designing the feature, fixing its bugs, and reviewing related patches in the commitfest, I think the question more likely lies in the planner's design. For example, I wonder if anyone here knows why exactly the optimiser makes a copy of the whole query subtree in some places. Another example is PlannerInfo: can we really control all the consequences of introducing, let's say, a new JoinDomain entity?

You also mentioned 2024.pgconf.dev. Considering the current migration policy in some countries, it would be better to treat an online presence as equivalent to attending in person. Without an online part of the conference, the only way to communicate and discuss is through this mailing list.

-- 
regards, Andrei Lepikhov
Re: On disable_cost
On Sun, 5 May 2024 at 04:57, Tom Lane wrote:
> David Rowley writes:
> > That doesn't get you the benefits of fewer CPU cycles, but where did
> > that come from as a motive to change this? There's no shortage of
> > other ways to make the planner faster if that's an issue.
>
> The concern was to not *add* CPU cycles in order to make this area
> better. But I do tend to agree that we've exhausted all the other
> options.

It really looks to me like Robert was talking about not generating paths for disabled path types; he did write "just to save CPU cycles" in the paragraph I quoted.

I think we should concern ourselves with adding overhead to add_path() *only* when we actually see a patch which slows it down in a way that we can measure. I find it hard to imagine that adding a single comparison for every Path is measurable. Each of these paths has been palloc'd and costed, both of which are significantly more expensive than adding another comparison to compare_path_costs_fuzzily(). Only benchmarks of an actual patch will convince me otherwise, nothing else. add_path() has become a rat's nest of conditions over the years, and those seem to have made it in without concerns about performance.

> BTW, I looked through costsize.c just now to see exactly what we are
> using disable_cost for, and it seemed like a majority of the cases are
> just wrong. Where possible, we should implement a plan-type-disable
> flag by not generating the associated Path in the first place, not by
> applying disable_cost to it. But it looks like a lot of people have
> erroneously copied the wrong logic. I would say that only these plan
> types should use the disable_cost method:
>
> seqscan
> nestloop join
> sort

I think this oversimplifies the situation. I only spent 30 seconds looking and I saw cases where this would cause issues. If enable_hashagg is false, we could fail to produce some plans where the type is hashable but not sortable. There's also the issue of nested loops being the only join method able to perform a FULL OUTER JOIN... no, wait, that's merge join.

However, I do agree that there are some in there adding disable_cost that should instead just not create the Path. enable_gathermerge is one. enable_bitmapscan is probably another.

I understand you only talked about the cases adding disable_cost in costsize.c, but just as a reminder, there are other things we need to be careful not to break. For example, enable_indexonlyscan=false should still fall back on making a plain index scan. Nobody who disables enable_indexonlyscan without also disabling enable_indexscan wants queries that are eligible for an index-only scan to use a seq scan instead. They'd still want Index Scan to be considered; otherwise they'd have disabled enable_indexscan too.

David
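[Editor's illustration of the "count of disabled nodes" idea discussed in this thread. This is a toy model in Python, not PostgreSQL's add_path(); the Path fields and the tie-breaking rule are assumptions made for the sketch.]

```python
# Toy model of the proposed alternative to disable_cost: instead of adding
# a huge cost penalty, each Path carries an integer count of disabled node
# types in its tree, and path comparison prefers lower disabledness before
# ever looking at cost.  Costs then stay meaningful, yet a plan can still
# be produced when every alternative is disabled.

from dataclasses import dataclass

@dataclass
class Path:
    name: str
    cost: float
    disabledness: int   # count of disabled node types in the path tree

def better(a: Path, b: Path) -> Path:
    """Return the preferable path: fewer disabled nodes wins outright;
    only among equally-disabled paths does cost decide."""
    if a.disabledness != b.disabledness:
        return a if a.disabledness < b.disabledness else b
    return a if a.cost <= b.cost else b

# A disabled-but-cheap seqscan loses to an enabled, pricier index scan:
seq = Path("SeqScan", cost=10.0, disabledness=1)    # enable_seqscan=off
idx = Path("IndexScan", cost=25.0, disabledness=0)
print(better(seq, idx).name)    # IndexScan

# ...but if every alternative is disabled, the cheapest one still wins,
# so the planner never ends up with no plan at all:
sort = Path("Sort", cost=40.0, disabledness=1)
print(better(seq, sort).name)   # SeqScan
```

The single extra integer comparison per pair is the overhead being debated above; the point of the sketch is only that disabledness and cost become independent axes rather than one polluted number.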
Fix for recursive plpython triggers
This fixes bug #18456 [1].  Since we're in back-branch release freeze,
I'll just park it for the moment.  But I think we should shove it in
once the freeze lifts so it's in 17beta1.

			regards, tom lane

[1] https://www.postgresql.org/message-id/18456-82d3d70134aefd28%40postgresql.org

diff --git a/src/pl/plpython/expected/plpython_trigger.out b/src/pl/plpython/expected/plpython_trigger.out
index dd1ca32fa4..4cb90cb520 100644
--- a/src/pl/plpython/expected/plpython_trigger.out
+++ b/src/pl/plpython/expected/plpython_trigger.out
@@ -618,3 +618,30 @@ SELECT * FROM trigger_test_generated;
 ---+---
 (0 rows)
 
+-- recursive call of a trigger mustn't corrupt TD (bug #18456)
+CREATE TABLE recursive_trigger_test (a int, b int);
+CREATE FUNCTION recursive_trigger_func() RETURNS trigger
+LANGUAGE plpython3u
+AS $$
+if TD["event"] == "UPDATE":
+    plpy.execute("INSERT INTO recursive_trigger_test VALUES (1, 2)")
+    plpy.notice("TD[event] => " + str(TD["event"]) + ", expecting UPDATE");
+else:
+    plpy.notice("TD[event] => " + str(TD["event"]) + ", expecting INSERT");
+return None
+$$;
+CREATE TRIGGER recursive_trigger_trig
+  AFTER INSERT OR UPDATE ON recursive_trigger_test
+  FOR EACH ROW EXECUTE PROCEDURE recursive_trigger_func();
+INSERT INTO recursive_trigger_test VALUES (0, 0);
+NOTICE:  TD[event] => INSERT, expecting INSERT
+UPDATE recursive_trigger_test SET a = 11 WHERE b = 0;
+NOTICE:  TD[event] => INSERT, expecting INSERT
+NOTICE:  TD[event] => UPDATE, expecting UPDATE
+SELECT * FROM recursive_trigger_test;
+ a  | b 
+----+---
+ 11 | 0
+  1 | 2
+(2 rows)
+
diff --git a/src/pl/plpython/plpy_exec.c b/src/pl/plpython/plpy_exec.c
index 3145c69699..6f7b5e121d 100644
--- a/src/pl/plpython/plpy_exec.c
+++ b/src/pl/plpython/plpy_exec.c
@@ -335,6 +335,13 @@ PLy_exec_trigger(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	PLy_output_setup_tuple(&proc->result, rel_descr, proc);
 	PLy_input_setup_tuple(&proc->result_in, rel_descr, proc);
 
+	/*
+	 * If the trigger is called recursively, we must push outer-level
+	 * arguments into the stack.  This must be immediately before the PG_TRY
+	 * to ensure that the corresponding pop happens.
+	 */
+	PLy_global_args_push(proc);
+
 	PG_TRY();
 	{
 		int			rc PG_USED_FOR_ASSERTS_ONLY;
@@ -397,6 +404,7 @@ PLy_exec_trigger(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	}
 	PG_FINALLY();
 	{
+		PLy_global_args_pop(proc);
 		Py_XDECREF(plargs);
 		Py_XDECREF(plrv);
 	}
@@ -501,6 +509,13 @@ PLy_function_save_args(PLyProcedure *proc)
 	result->args = PyDict_GetItemString(proc->globals, "args");
 	Py_XINCREF(result->args);
 
+	/* If it's a trigger, also save "TD" */
+	if (proc->is_trigger)
+	{
+		result->td = PyDict_GetItemString(proc->globals, "TD");
+		Py_XINCREF(result->td);
+	}
+
 	/* Fetch all the named arguments */
 	if (proc->argnames)
 	{
@@ -550,6 +565,13 @@ PLy_function_restore_args(PLyProcedure *proc, PLySavedArgs *savedargs)
 		Py_DECREF(savedargs->args);
 	}
 
+	/* Restore the "TD" object, too */
+	if (savedargs->td)
+	{
+		PyDict_SetItemString(proc->globals, "TD", savedargs->td);
+		Py_DECREF(savedargs->td);
+	}
+
 	/* And free the PLySavedArgs struct */
 	pfree(savedargs);
 }
@@ -568,8 +590,9 @@ PLy_function_drop_args(PLySavedArgs *savedargs)
 		Py_XDECREF(savedargs->namedargs[i]);
 	}
 
-	/* Drop ref to the "args" object, too */
+	/* Drop refs to the "args" and "TD" objects, too */
 	Py_XDECREF(savedargs->args);
+	Py_XDECREF(savedargs->td);
 
 	/* And free the PLySavedArgs struct */
 	pfree(savedargs);
@@ -578,9 +601,9 @@
 /*
  * Save away any existing arguments for the given procedure, so that we can
  * install new values for a recursive call.  This should be invoked before
- * doing PLy_function_build_args().
+ * doing PLy_function_build_args() or PLy_trigger_build_args().
  *
- * NB: caller must ensure that PLy_global_args_pop gets invoked once, and
+ * NB: callers must ensure that PLy_global_args_pop gets invoked once, and
  * only once, per successful completion of PLy_global_args_push.  Otherwise
  * we'll end up out-of-sync between the actual call stack and the contents
  * of proc->argstack.
diff --git a/src/pl/plpython/plpy_procedure.c b/src/pl/plpython/plpy_procedure.c
index 9cbbe7e3c8..ba7786d31c 100644
--- a/src/pl/plpython/plpy_procedure.c
+++ b/src/pl/plpython/plpy_procedure.c
@@ -183,6 +183,7 @@ PLy_procedure_create(HeapTuple procTup, Oid fn_oid, bool is_trigger)
 	proc->fn_readonly = (procStruct->provolatile != PROVOLATILE_VOLATILE);
 	proc->is_setof = procStruct->proretset;
 	proc->is_procedure = (procStruct->prokind == PROKIND_PROCEDURE);
+	proc->is_trigger = is_trigger;
 	proc->src = NULL;
 	proc->argnames = NULL;
 	proc->args = NULL;
diff --git a/src/pl/plpython/plpy_procedure.h b/src/pl/plpython/plpy_procedure.h
index 8968b5c92e..5db854fc8b 100644
--- a/src/pl/plpython/plpy_procedure.h
+++ b/src/pl/plpython/plpy_procedure.h
@@ -16,6 +16,7 @@
 typedef struct PLySavedArgs
 {
 	struct
Re: partitioning and identity column
> On Tue, Apr 30, 2024 at 04:29:11PM +0530, Ashutosh Bapat wrote:
> On Sun, Apr 28, 2024 at 12:29 PM Alexander Lakhin wrote:
> >
> > 27.04.2024 18:00, Alexander Lakhin wrote:
> > >
> > > Please look also at another script, which produces the same error:
> >
> > I've discovered yet another problematic case:
> > CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
> >   PARTITION BY LIST (a);
> > CREATE TABLE tbl2 (b text, a int NOT NULL);
> > ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;
> >
> > INSERT INTO tbl2 DEFAULT VALUES;
> > ERROR:  no owned sequence found
> >
> > Though it works with tbl2(a int NOT NULL, b text)...
> > Take a look at this too, please.
>
> Thanks Alexander for the report.
>
> PFA patch which fixes all the three problems.
>
> I had not fixed getIdentitySequence() to fetch the identity sequence
> associated with the partition because I thought it would be better to fail
> with an error when it's not used correctly. But these bugs show 1. the
> error is misleading and unwanted 2. there are more places where adding
> that logic to getIdentitySequence() makes sense. Fixed the function in
> these patches. Now callers like transformAlterTableStmt have to be
> careful not to call the function on a partition.
>
> I have examined all the callers of getIdentitySequence() and they seem to
> be fine. The code related to SetIdentity, DropIdentity is not called for
> partitions, errors for which are thrown elsewhere earlier.

Thanks for the fix. I had a quick look; it covers the issues mentioned above in the thread. A few nitpicks/questions:

* I think it makes sense to verify that the ptup is valid. This approach would fail if the target column of the root partition is marked as attisdropped.

     Oid
    -getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
    +getIdentitySequence(Relation rel, AttrNumber attnum, bool missing_ok)
     {
     [...]
    +	relid = llast_oid(ancestors);
    +	ptup = SearchSysCacheAttName(relid, attname);
    +	attnum = ((Form_pg_attribute) GETSTRUCT(ptup))->attnum;

* getIdentitySequence() is used in build_column_default, which in turn often appears in loops over table attributes. AFAICT this means the same root-partition search will be repeated multiple times in such situations if there is more than one identity column. I assume the performance impact of this repetition is negligible?

* Maybe a silly question, but since I'm not aware of all the details here, I'm curious: how robust is the approach of mapping attributes of a partition to the root partition's attributes? I guess there is no way the root partition column could turn out to be not what is expected, e.g. due to some sort of concurrency?
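[Editor's illustration of why the fix has to re-resolve the column by name at the root: the reported failure occurs precisely when the partition declares its columns in a different order than the parent, so attribute numbers diverge. The toy below models that lookup with plain dictionaries standing in for syscache probes; none of it is PostgreSQL code.]

```python
# Toy model of the fix's approach: the identity sequence is owned by the
# root partitioned table, and a partition's column may sit at a different
# attribute number (tbl2 declared "b" before "a"), so the lookup must walk
# to the root and map by column *name*, not by attnum.

tables = {
    "tbl1": {"columns": {"a": 1, "b": 2},         # name -> attnum
             "parent": None,
             "identity_seq": {1: "tbl1_a_seq"}},  # attnum -> owned sequence
    "tbl2": {"columns": {"b": 1, "a": 2},         # note: "a" is attnum 2 here
             "parent": "tbl1",
             "identity_seq": {}},                 # partitions own nothing
}

def get_identity_sequence(rel: str, attnum: int):
    # find the column name in this relation
    name = next(n for n, num in tables[rel]["columns"].items() if num == attnum)
    # walk up to the root partitioned table
    while tables[rel]["parent"] is not None:
        rel = tables[rel]["parent"]
    # re-resolve the attribute number by name at the root
    root_attnum = tables[rel]["columns"][name]
    return tables[rel]["identity_seq"].get(root_attnum)

print(get_identity_sequence("tbl2", 2))   # tbl1_a_seq
```

Looking up `("tbl2", 2)` directly in the partition's own `identity_seq` (or using attnum 2 at the root) would find nothing, which is exactly the "no owned sequence found" failure mode discussed above.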
Re: drop column name conflict
On Sat, May 4, 2024 at 11:29 AM Tom Lane wrote:
> I think we intentionally did not bother with preventing this,
> on the grounds that if you were silly enough to name a column
> that way then you deserve any ensuing problems.

Fair enough.

> If we were going to expend any code on the scenario, I'd prefer
> to make it be checks in column addition/renaming that disallow
> naming a column this way.

Is there any interest in making this change? The attached patch could use some cleanup, but seems to accomplish what's described. It's definitely more involved than the previous one and may not be worth the effort. If you feel that it's worth it I can clean it up, otherwise I'll drop it.

Thanks,
Joe Koshakow

From 936a9e3509867574633882f5c1ec714d2f2604ec Mon Sep 17 00:00:00 2001
From: Joseph Koshakow
Date: Sat, 4 May 2024 10:12:37 -0400
Subject: [PATCH] Prevent name conflicts when dropping a column

Previously, dropped columns were always renamed to "pg.dropped.". In the rare scenario that a column with that name already exists, the column drop would fail with an error about violating the unique constraint on "pg_attribute_relid_attnam_index". This commit fixes that issue by preventing users from creating columns with a name that matches "pg.dropped.\d+". This is backwards incompatible.
---
 src/backend/catalog/heap.c                 | 57 --
 src/backend/commands/tablecmds.c           |  2 +
 src/include/catalog/heap.h                 |  3 ++
 src/test/regress/expected/alter_table.out  |  7 +++
 src/test/regress/expected/create_table.out |  3 ++
 src/test/regress/sql/alter_table.sql       |  6 +++
 src/test/regress/sql/create_table.sql      |  3 ++
 7 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 922ba79ac2..0a0afe833d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -231,6 +231,9 @@ static const FormData_pg_attribute a6 = {
 static const FormData_pg_attribute *const SysAtt[] = {&a1, &a2, &a3, &a4, &a5, &a6};
 
+static const char *dropped_col_pre = "pg.dropped.";
+static const char *dropped_col_post = "";
+
 /*
  * This function returns a Form_pg_attribute pointer for a system attribute.
  * Note that we elog if the presented attno is invalid, which would only
@@ -468,10 +471,10 @@ CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
 						 MaxHeapAttributeNumber)));
 
 	/*
-	 * first check for collision with system attribute names
+	 * first check for collision with system attribute and reserved names
 	 *
 	 * Skip this for a view or type relation, since those don't have system
-	 * attributes.
+	 * attributes and cannot drop columns.
 	 */
 	if (relkind != RELKIND_VIEW && relkind != RELKIND_COMPOSITE_TYPE)
 	{
@@ -484,6 +487,11 @@ CheckAttributeNamesTypes(TupleDesc tupdesc, char relkind,
 					(errcode(ERRCODE_DUPLICATE_COLUMN),
 					 errmsg("column name \"%s\" conflicts with a system column name",
 							NameStr(attr->attname))));
+
+			if ((CHKATYPE_RESERVED_NAME & flags) == 0)
+			{
+				CheckAttributeReservedName(NameStr(attr->attname));
+			}
 		}
 	}
 
@@ -679,6 +687,47 @@ CheckAttributeType(const char *attname,
 	}
 }
 
+/*
+ * TODO: Add function description.
+ */
+void
+CheckAttributeReservedName(const char *attname)
+{
+	size_t		name_len,
+				pre_len,
+				post_len;
+	int			i;
+
+	name_len = strlen(attname);
+	pre_len = strlen(dropped_col_pre);
+	post_len = strlen(dropped_col_post);
+
+	if (name_len < pre_len + post_len + 1)
+	{
+		return;
+	}
+	if (memcmp(attname, dropped_col_pre, pre_len) != 0)
+	{
+		return;
+	}
+	for (i = pre_len; i < name_len - post_len; i++)
+	{
+		if (!isdigit(attname[i]))
+		{
+			return;
+		}
+	}
+	if (memcmp(attname + (name_len - post_len), dropped_col_post, post_len) != 0)
+	{
+		return;
+	}
+
+	ereport(ERROR,
+			(errcode(ERRCODE_RESERVED_NAME),
+			 errmsg("column name \"%s\" conflicts with a reserved column name",
+					attname)));
+}
+
 /*
  * InsertPgAttributeTuples
  *		Construct and insert a set of tuples in pg_attribute.
@@ -1148,7 +1197,7 @@ heap_create_with_catalog(const char *relname,
 	 * hack to allow creating pg_statistic and cloning it during VACUUM FULL.
 	 */
 	CheckAttributeNamesTypes(tupdesc, relkind,
-							 allow_system_table_mods ? CHKATYPE_ANYARRAY : 0);
+							 (allow_system_table_mods ? CHKATYPE_ANYARRAY : 0) | (is_internal ? CHKATYPE_RESERVED_NAME : 0));
 
 	/*
 	 * This would fail later on anyway, if the relation already exists. But
@@ -1705,7 +1754,7 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 	 * Change the column name to something that isn't likely to conflict
 	 */
 	snprintf(newattname, sizeof(newattname),
-			 "pg.dropped.%d", attnum);
+			 "%s%d%s", dropped_col_pre, attnum, dropped_col_post);
 	namestrcpy(&(attStruct->attname), newattname);
 
 	/* Clear the missing value */
diff --git a/src/backend/commands/tablecmds.c
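[Editor's illustration: the patch's CheckAttributeReservedName() does manual prefix/digit/suffix scanning in C; the same check is compact to express as a regular expression. The sketch below is an equivalent toy in Python, not the patch's code, and the error wording is copied from the patch for flavor only.]

```python
# Compact model of the reserved-name check added by the patch: reject any
# column whose name has exactly the "pg.dropped.<digits>" shape that
# dropped columns are renamed to.

import re

RESERVED = re.compile(r"^pg\.dropped\.\d+$")

def check_attribute_reserved_name(attname: str) -> None:
    """Raise if attname collides with the dropped-column name pattern."""
    if RESERVED.fullmatch(attname):
        raise ValueError(
            f'column name "{attname}" conflicts with a reserved column name')

check_attribute_reserved_name("a")              # fine: ordinary name
check_attribute_reserved_name("pg.dropped.x")   # fine: suffix isn't all digits
try:
    check_attribute_reserved_name("pg.dropped.1")
except ValueError as e:
    print(e)   # column name "pg.dropped.1" conflicts with a reserved column name
```

Note how the check must demand digits only in the suffix; merely testing the "pg.dropped." prefix would reject harmless names like "pg.dropped.notes" that users are entitled to keep.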
Re: On disable_cost
David Rowley writes:
> I don't think you'd need to wait longer than where we do set_cheapest
> and find no paths to find out that there's going to be a problem.

At a base relation, yes, but that doesn't work for joins: it may be that a particular join cannot be formed, yet other join sequences will work. We have that all the time from outer-join ordering restrictions, never mind enable_xxxjoin flags. So I'm not sure that we can usefully declare early failure for joins.

> I think the int Path.disabledness idea is worth coding up to try it.
> I imagine that a Path will incur the maximum of its subpath's
> disabledness's then add_path() just needs to prefer lower-valued
> disabledness Paths.

I would think sum not maximum, but that's a detail.

> That doesn't get you the benefits of fewer CPU cycles, but where did
> that come from as a motive to change this? There's no shortage of
> other ways to make the planner faster if that's an issue.

The concern was to not *add* CPU cycles in order to make this area better. But I do tend to agree that we've exhausted all the other options.

BTW, I looked through costsize.c just now to see exactly what we are using disable_cost for, and it seemed like a majority of the cases are just wrong. Where possible, we should implement a plan-type-disable flag by not generating the associated Path in the first place, not by applying disable_cost to it. But it looks like a lot of people have erroneously copied the wrong logic. I would say that only these plan types should use the disable_cost method:

	seqscan
	nestloop join
	sort

as those are the only ones where we risk not being able to make a plan at all for lack of other alternatives. There is also some weirdness around needing to force use of tidscan if we have WHERE CURRENT OF. But perhaps a different hack could be used for that.

We also have this for hashjoin:

 * If the bucket holding the inner MCV would exceed hash_mem, we don't
 * want to hash unless there is really no other alternative, so apply
 * disable_cost.

I'm content to leave that be, if we can't remove disable_cost entirely.

What I'm wondering at this point is whether we need to trouble with implementing the separate-disabledness-count method, if we trim back the number of places using disable_cost to the absolute minimum.

			regards, tom lane
Re: AIX support
Hi Team,

There are a couple of updates. Firstly, we got an AIX node in the OSU lab. Please feel free to reach out to me so that we can provide access to the node. We have started working on setting up the buildfarm on that node.

Secondly, as part of the buildfarm setup on our local nodes, we are hitting an issue with the plperl extension. In the logs we noticed that when the plperl extension is being created, it fails to load the perl library.

- CREATE EXTENSION plperlu;
+ server closed the connection unexpectedly
+ This probably means the server terminated abnormally
+ before or while processing the request.
+ connection to server was lost

In the logfile we could see this:

2024-05-04 05:05:17.537 CDT [3997786:17] pg_regress/plperl_setup LOG: statement: CREATE EXTENSION plperl;
Util.c: loadable library and perl binaries are mismatched (got first handshake key 9b80080, needed 9a80080)

We tried some of the suggestions mentioned here, but the problem persists:
https://www.postgresql.org/message-id/CALDaNm15qaRFb3WiPFAdFqoB9pj1E5SCPPUGB%2BnJ4iF4gXO6Rw%40mail.gmail.com

Any inputs here would be greatly appreciated.

Regards,
Sriram.
Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Hello, Matthias!

> We can just release the current snapshot, and get a new one, right? I
> mean, we don't actually use the transaction for much else than
> visibility during the first scan, and I don't think there is a need
> for an actual transaction ID until we're ready to mark the index entry
> with indisready.
> I suppose we could be resetting the snapshot every so often? Or use
> multiple successive TID range scans with a new snapshot each?

It seems it is not so easy in this case. Because we still need to hold the catalog snapshot's xmin, releasing the snapshot used for the scan does not affect the xmin propagated to the horizon. That's why d9d076222f5b94a85e0e318339cfc44b8f26022d [1] affects only the data horizon, but not the catalog one. So, in such a situation, we may:

1) start the scan from scratch with some TID range, multiple times. But such an approach feels too complex and error-prone to me.

2) split the horizons propagated by `MyProc` into a data-related xmin and a catalog-related xmin, like `xmin` and `catalogXmin`. We may just mark snapshots as affecting one of the horizons, or both. Such a change feels easy to make but touches pretty core logic, so we probably need someone's approval for such a proposal.

3) provide some less invasive (but more kludgy) way: add a process flag like `PROC_IN_SAFE_IC_XMIN` and a function like `AdvanceIndexSafeXmin` which changes the way the backend affects horizon calculation. With `PROC_IN_SAFE_IC_XMIN` set, `ComputeXidHorizons` uses the value from `proc->safeIcXmin`, which is updated by `AdvanceIndexSafeXmin` while switching scan snapshots.

So, with option 2 or 3, we may avoid holding back the data horizon during the first-phase scan by resetting the scan snapshot every so often (and, optionally, calling `AdvanceIndexSafeXmin`, in the case of the 3rd approach). The same will be possible for the second phase (validate).
We may apply the same "reset the snapshot every so often" technique, but there is still the issue of how we distinguish tuples which were missed by the first-phase scan from tuples which were inserted into the table after the visibility snapshot was taken. I see two options here:

1) the approach with an additional index with some custom AM, proposed by you. It looks correct and reliable, but feels complex to implement and maintain. Also, it negatively affects the performance of table access (because of the additional index) and of the validation scan (because we need to merge the additional index's content with the visibility snapshot).

2) one more tricky approach. We may add a boolean flag to `Relation` carrying the information that an index build is in progress (`indexisbuilding`). It may be easily calculated as `(index->indisready && !index->indisvalid)`. For a more reliable solution, we also need to somehow check whether the backend/transaction building the index is still in progress. Also, it is better to check whether the index is being built concurrently in the "safe_index" way. I think there is a way to do so that is not too complex or expensive, probably by adding some flag to the index's catalog record. Once we have such a flag, we may "legally" prohibit `heap_page_prune_opt` from affecting the relation, by updating `GlobalVisHorizonKindForRel` like this:

    if (rel != NULL && rel->rd_indexvalid && rel->rd_indexisbuilding)
        return VISHORIZON_CATALOG;

So, overall it works this way:

* the backend building the index affects the catalog horizon as usual, but the data horizon is regularly propagated forward during the scan. So, other relations are processed by vacuum and `heap_page_prune_opt` without any restrictions;

* but our relation (with the CIC in progress) is accessed by `heap_page_prune_opt` (or any other vacuum-like mechanics) using the catalog horizon, to honor the CIC's work. Therefore, the validating scan may be sure that none of the HOT chains will be truncated. Even a regular vacuum can't affect it (not that it could anyway, because of relation locking).
As a result, we may easily distinguish tuples missed by the first-phase scan just by testing them against the reference snapshot (the one used to take the visibility snapshot). So, to me, this approach feels non-kludgy enough, and safe and effective at the same time. I have a prototype of this approach, and it looks like it works (I have a good test catching issues with index content for CIC).

What do you think about all this?

[1]: https://github.com/postgres/postgres/commit/d9d076222f5b94a85e0e318339cfc44b8f26022d#diff-8879f0173be303070ab7931db7c757c96796d84402640b9e386a4150ed97b179R1779-R1793
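[Editor's illustration of the "test against the reference snapshot" step described above, reduced to integers: a tuple whose xmin is visible to the reference snapshot was already seen by the first-phase scan; anything younger must be handled during validation. Transaction IDs and the visibility rule are simplified stand-ins, not PostgreSQL's snapshot machinery.]

```python
# Toy model: the first-phase scan indexed everything visible to the
# reference snapshot.  During validation, a tuple with xmin at or beyond
# that snapshot's xmin could not have been seen by the scan, so it must
# be inserted into the index now.

REFERENCE_SNAPSHOT_XMIN = 100   # xmin of the snapshot used for phase one

def missed_by_first_scan(tuple_xmin: int) -> bool:
    # committed before the reference snapshot => the scan already saw it
    return tuple_xmin >= REFERENCE_SNAPSHOT_XMIN

heap = [("t1", 40), ("t2", 99), ("t3", 150), ("t4", 205)]
to_insert = [name for name, xmin in heap if missed_by_first_scan(xmin)]
print(to_insert)   # ['t3', 't4']
```

The guarantee argued for above (no HOT-chain truncation while the build is in progress) is what makes this simple xmin test trustworthy: a pruned chain could otherwise hide a tuple the scan never saw.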
Re: drop column name conflict
Joseph Koshakow writes:
> There's a rare edge case in `alter table` that can prevent users from
> dropping a column as shown below
> # create table atacc1(a int, "pg.dropped.1" int);
> CREATE TABLE
> # alter table atacc1 drop column a;
> ERROR:  duplicate key value violates unique constraint
> "pg_attribute_relid_attnam_index"
> DETAIL:  Key (attrelid, attname)=(16407, pg.dropped.1)
> already exists.

I think we intentionally did not bother with preventing this, on the grounds that if you were silly enough to name a column that way then you deserve any ensuing problems.

If we were going to expend any code on the scenario, I'd prefer to make it be checks in column addition/renaming that disallow naming a column this way. What you propose here doesn't remove the fundamental tension about whether this is valid user namespace or not, it just makes things less predictable.

			regards, tom lane
drop column name conflict
Hi all,

There's a rare edge case in `alter table` that can prevent users from dropping a column, as shown below:

# create table atacc1(a int, "pg.dropped.1" int);
CREATE TABLE
# alter table atacc1 drop column a;
ERROR:  duplicate key value violates unique constraint "pg_attribute_relid_attnam_index"
DETAIL:  Key (attrelid, attname)=(16407, pg.dropped.1) already exists.

It seems a bit silly and unlikely that anyone would ever find themselves in this scenario, but it also seems easy enough to fix, as shown by the attached patch. Does anyone think this is worth fixing? If so I can submit it to the current commitfest.

Thanks,
Joe Koshakow

From 50f6e73d9bc1e00ad3988faa80a84af70aef Mon Sep 17 00:00:00 2001
From: Joseph Koshakow
Date: Sat, 4 May 2024 10:12:37 -0400
Subject: [PATCH] Prevent name conflicts when dropping a column

Previously, dropped columns were always renamed to "pg.dropped.". In the rare scenario that a column with that name already exists, the column drop would fail with an error about violating the unique constraint on "pg_attribute_relid_attnam_index". This commit fixes that issue by appending an int to the dropped column name until we find a unique name. Since tables have a maximum of 1,600 columns and the max int is larger than 1,600, we are guaranteed to find a unique name.
---
 src/backend/catalog/heap.c                | 16 +++-
 src/test/regress/expected/alter_table.out |  4
 src/test/regress/sql/alter_table.sql      |  5 +
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 922ba79ac2..852ed442e1 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1658,11 +1658,13 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 	Relation	rel;
 	Relation	attr_rel;
 	HeapTuple	tuple;
+	HeapTuple	drop_tuple_check;
 	Form_pg_attribute attStruct;
 	char		newattname[NAMEDATALEN];
 	Datum		valuesAtt[Natts_pg_attribute] = {0};
 	bool		nullsAtt[Natts_pg_attribute] = {0};
 	bool		replacesAtt[Natts_pg_attribute] = {0};
+	int			i;
 
 	/*
 	 * Grab an exclusive lock on the target table, which we will NOT release
@@ -1702,10 +1704,22 @@ RemoveAttributeById(Oid relid, AttrNumber attnum)
 	attStruct->attgenerated = '\0';
 
 	/*
-	 * Change the column name to something that isn't likely to conflict
+	 * Change the column name to something that doesn't conflict
 	 */
 	snprintf(newattname, sizeof(newattname),
 			 "pg.dropped.%d", attnum);
+	Assert(PG_INT32_MAX > MaxHeapAttributeNumber);
+	drop_tuple_check = SearchSysCacheCopy2(ATTNAME,
+										   ObjectIdGetDatum(relid),
+										   PointerGetDatum(newattname));
+	for (i = 0; HeapTupleIsValid(drop_tuple_check); i++)
+	{
+		snprintf(newattname, sizeof(newattname),
+				 "pg.dropped.%d.%d", attnum, i);
+		drop_tuple_check = SearchSysCacheCopy2(ATTNAME,
+											   ObjectIdGetDatum(relid),
+											   PointerGetDatum(newattname));
+	}
 	namestrcpy(&(attStruct->attname), newattname);
 
 	/* Clear the missing value */
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 7666c76238..844ae55467 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -1554,6 +1554,10 @@ insert into atacc1(id, value) values (null, 0);
 ERROR:  null value in column "id" of relation "atacc1" violates not-null constraint
 DETAIL:  Failing row contains (null, 0).
 drop table atacc1;
+-- test dropping a column doesn't cause name conflicts
+create table atacc1(a int, "pg.dropped.1" int);
+alter table atacc1 drop column a;
+drop table atacc1;
 -- test inheritance
 create table parent (a int, b int, c int);
 insert into parent values (1, 2, 3);
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 9df5a63bdf..d5d912a2e2 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -1097,6 +1097,11 @@ insert into atacc1(value) values (100);
 insert into atacc1(id, value) values (null, 0);
 drop table atacc1;
+-- test dropping a column doesn't cause name conflicts
+create table atacc1(a int, "pg.dropped.1" int);
+alter table atacc1 drop column a;
+drop table atacc1;
+
 -- test inheritance
 create table parent (a int, b int, c int);
 insert into parent values (1, 2, 3);
-- 
2.34.1
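[Editor's illustration of the retry strategy in the patch above: if "pg.dropped.&lt;attnum&gt;" is taken, append a counter until a free name is found. The toy below models the pg_attribute syscache probe with a plain set; it is not the patch's C code.]

```python
# Toy model of the patch's rename loop: probe for an existing column with
# the candidate dropped-column name, and keep appending ".<i>" until the
# name is unused.  Termination is guaranteed for the same reason as in the
# patch: a table can hold at most 1,600 columns, far fewer than the number
# of distinct counter values.

def dropped_column_name(attnum: int, existing_names: set) -> str:
    name = f"pg.dropped.{attnum}"
    i = 0
    while name in existing_names:     # stands in for the syscache lookup
        name = f"pg.dropped.{attnum}.{i}"
        i += 1
    return name

print(dropped_column_name(1, set()))                    # pg.dropped.1
print(dropped_column_name(1, {"pg.dropped.1"}))         # pg.dropped.1.0
print(dropped_column_name(1, {"pg.dropped.1",
                              "pg.dropped.1.0"}))       # pg.dropped.1.1
```

This contrasts with the second patch in the thread, which instead reserves the name pattern outright so the collision can never be created in the first place.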
Re: On disable_cost
On Sat, 4 May 2024 at 08:34, Robert Haas wrote:
> Another idea is to remove the ERROR mentioned above from
> set_cheapest() and just allow planning to continue even if some
> relations end up with no paths. (This would necessitate finding and
> fixing any code that could be confused by a pathless relation.) Then,
> if you get to the top of the plan tree and you have no paths there,
> redo the join search discarding the constraints (or maybe just some of
> the constraints, e.g. allow sequential scans and nested loops, or
> something).

I don't think you'd need to wait longer than where we do set_cheapest and find no paths to find out that there's going to be a problem.

I don't think redoing planning is going to be easy or even useful. I mean, what do you change when you replan? You can't just re-enable enable_seqscan and enable_nestloop: if there's no index to provide sorted input and the plan requires a sort, then you still can't produce a plan. Adding enable_sort to the list does not give me much confidence we'll never fail to produce a plan either. It just seems impossible to know which of the disabled ones caused the RelOptInfo to have no paths. Also, you might end up enabling one that caused the planner to do something different than it would do today. For example, a Path that today incurs 2x disable_cost vs a Path that only receives 1x disable_cost might do something different if you just went and enabled a bunch of enable* GUCs before replanning.

> Now, I suppose it might be that even if we can't remove disable_cost,
> something along these lines is still worth doing, just to save CPU
> cycles. You could for example try planning with only non-disabled
> stuff and then do it over again with everything if that doesn't work
> out, still keeping disable_cost around so that you avoid disabled
> nodes where you can. But I'm kind of hoping that I'm missing something
> and there's some approach that could both kill disable_cost and save
> some cycles at the same time.
If (any of) you have an idea, I'd love > to hear it! I think the int Path.disabledness idea is worth coding up to try it. I imagine that a Path will incur the maximum of its subpath's disabledness's then add_path() just needs to prefer lower-valued disabledness Paths. That doesn't get you the benefits of fewer CPU cycles, but where did that come from as a motive to change this? There's no shortage of other ways to make the planner faster if that's an issue. David
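To make the Path.disabledness idea concrete, here is a minimal Python sketch of the comparison logic being proposed. The `Path` class, its fields, and the `better()` helper are this sketch's own inventions standing in for the planner's C structures; the only points taken from the thread are that a node's disabledness is the maximum of its subpaths' (bumped if the node itself is disabled) and that add_path() prefers lower disabledness before comparing cost.

```python
from dataclasses import dataclass, field

@dataclass
class Path:
    # Hypothetical stand-ins for the planner's Path fields.
    cost: float
    disabled: bool = False   # is this node type disabled by an enable_* GUC?
    subpaths: list = field(default_factory=list)

    @property
    def disabledness(self) -> int:
        # Maximum disabledness of the subpaths, plus one if this
        # node itself is disabled (per the suggestion in the thread).
        base = max((p.disabledness for p in self.subpaths), default=0)
        return base + (1 if self.disabled else 0)

def better(a: Path, b: Path) -> Path:
    """add_path()-style preference: lower disabledness wins outright;
    cost only breaks ties within the same disabledness level."""
    ka = (a.disabledness, a.cost)
    kb = (b.disabledness, b.cost)
    return a if ka <= kb else b
```

Under this scheme a disabled sequential scan loses to a more expensive but non-disabled index scan, and the planner never needs a disable_cost fudge factor in the cost itself.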
Re: Weird test mixup
On Thu, May 02, 2024 at 01:52:20PM +0500, Andrey M. Borodin wrote:
> As far as I understand this will effectively forbid calling
> injection_points_detach() for local injection point of other
> backend. Do I get it right?

Yes, that would be the intention. Noah has other use cases in mind with this interface that I did not think about supporting, hence he's objecting to this restriction.

--
Michael
Re: pg_sequence_last_value() for unlogged sequences on standbys
On Fri, May 03, 2024 at 03:49:08PM -0500, Nathan Bossart wrote:
> On Wed, May 01, 2024 at 12:39:53PM +0900, Michael Paquier wrote:
>> By the way, shouldn't we also change the function to return NULL for a
>> failed permission check? It would be possible to remove the
>> has_sequence_privilege() as well, thanks to that, and a duplication
>> between the code and the function view. I've been looking around a
>> bit, noticing one use of this function in check_pgactivity (nagios
>> agent), and its query also has a has_sequence_privilege() so returning
>> NULL would simplify its definition in the long-run. I'd suspect other
>> monitoring queries to do something similar to bypass permission
>> errors.
>
> I'm okay with that, but it would be v18 material that I'd track separately
> from the back-patchable fix proposed in this thread.

Of course. I meant only the permission check simplification for HEAD. My apologies if my words were unclear.

--
Michael
Re: pg_sequence_last_value() for unlogged sequences on standbys
On Fri, May 03, 2024 at 05:22:06PM -0400, Tom Lane wrote:
> Nathan Bossart writes:
>> IIUC this would cause other sessions' temporary sequences to appear in the
>> view. Is that desirable?
>
> I assume Michael meant to move the test into the C code, not drop
> it entirely --- I agree we don't want that.

Yup. I meant to remove it from the script and keep only something in the C code to avoid the duplication, but you're right that the temp sequences would create more noise than now.

> Moving it has some attraction, but pg_is_other_temp_schema() is also
> used in a lot of information_schema views, so we couldn't get rid of
> it without a lot of further hacking. Not sure we want to relocate
> that filter responsibility in just one view.

Okay.

--
Michael
Re: pg17 issues with not-null constraints
On 2024-May-03, Justin Pryzby wrote:
> But if it's created with LIKE:
> postgres=# CREATE TABLE t1 (LIKE t);
> postgres=# ALTER TABLE t ATTACH PARTITION t1 DEFAULT ;
>
> ..one also sees:
>
> Not-null constraints:
>     "t1_i_not_null" NOT NULL "i"

Hmm, I think the problem here is not ATTACH; the not-null constraint is there immediately after CREATE. I think this is all right, actually, because we derive a not-null constraint from the primary key, and this is definitely intentional. But I also think that if you do CREATE TABLE t1 (LIKE t INCLUDING CONSTRAINTS) then you should get only the primary key and no separate not-null constraint. That would make the table more similar to the one being copied. Does that make sense to you?

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"I agree with you that absolute truth does not exist... The problem is
that lies do exist, and you are lying." (G. Lama)
Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?
> On 3 May 2024, at 21:21, Tom Lane wrote:
>
> Peter Eisentraut writes:
>> On 03.05.24 10:39, Daniel Gustafsson wrote:
>>> They are no-ops when linking against v18, but writing an extension which
>>> targets all supported versions of postgres along with their respective
>>> supported OpenSSL versions make them still required, or am I missing
>>> something?
>
>> I don't think extensions come into play here, since this is libpq, so
>> only the shared library interface compatibility matters.
>
> Yeah, server-side extensions don't really seem to be at hazard,
> but doesn't the argument apply to client-side applications and
> libraries that want to work across different PG/OpenSSL versions?

Right, I was using "extension" a bit carelessly; what I meant was client-side applications using libpq.

--
Daniel Gustafsson
Re: UUID v7
> On 3 May 2024, at 11:18, Andrey M. Borodin wrote:
> Here's the documentation from ClickHouse [0] for their implementation.

It's identical to the patch provided in this thread, with a few notable exceptions:

1. The counter is 42 bits, not 18. The counter has no guard bits; every bit is initialized with a random value on each time tick.
2. By default, the counter is shared between threads. An alternative function, generateUUIDv7ThreadMonotonic(), provides a thread-local counter.

Thanks!

Best regards, Andrey Borodin.

[0] https://clickhouse.com/docs/en/sql-reference/functions/uuid-functions#generateUUIDv7
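For readers unfamiliar with the scheme being compared, here is a rough Python sketch of a ClickHouse-style UUIDv7 generator with a 42-bit counter: the counter is re-seeded from randomness on every millisecond tick (no guard bits) and incremented within a tick, shared across threads behind a lock. The split of the counter across the UUID fields (12 bits in rand_a, the top 30 bits of rand_b) is this sketch's own assumption, not taken from the ClickHouse code or from the patch in this thread.

```python
import os
import threading
import time

_lock = threading.Lock()
_last_ms = -1
_counter = 0

def uuid7_counter() -> int:
    """Return a 128-bit UUIDv7-shaped integer with a 42-bit counter."""
    global _last_ms, _counter
    with _lock:
        ms = time.time_ns() // 1_000_000
        if ms != _last_ms:
            _last_ms = ms
            # Every counter bit is re-initialized from randomness on each
            # time tick (no guard bits), as described in the thread.
            _counter = int.from_bytes(os.urandom(6), "big") >> 6  # 42 bits
        else:
            _counter = (_counter + 1) & ((1 << 42) - 1)
        rand_a = _counter >> 30               # top 12 counter bits -> rand_a
        ctr_lo = _counter & ((1 << 30) - 1)   # low 30 counter bits -> top of rand_b
        tail = int.from_bytes(os.urandom(4), "big")  # remaining 32 random bits
        return ((ms & ((1 << 48) - 1)) << 80  # 48-bit Unix ms timestamp
                | 0x7 << 76                   # version 7
                | rand_a << 64
                | 0b10 << 62                  # RFC variant bits
                | ctr_lo << 32
                | tail)
```

With 42 counter bits, values generated within the same millisecond are strictly increasing, which is the monotonicity property the counter exists to provide.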