Re: Allow parallel plan for referential integrity checks?
On Fri, 18 Aug 2023 at 16:29, Juan José Santamaría Flecha wrote:
>
> On Thu, Aug 17, 2023 at 3:51 PM Frédéric Yhuel wrote:
>>
>> On 8/17/23 14:00, Frédéric Yhuel wrote:
>> > On 8/17/23 09:32, Frédéric Yhuel wrote:
>> >> On 8/10/23 17:06, Juan José Santamaría Flecha wrote:
>> >>> Recently I restored a database from a directory format backup and
>> >>> having this feature would have been quite useful
>> >>
>> >> Thanks for resuming work on this patch. I forgot to mention this in my
>> >> original email, but the motivation was also to speed up the restore
>> >> process. Parallelizing the FK checks could make a huge difference in
>> >> certain cases. We should probably provide such a test case (with perf
>> >> numbers), and maybe this is what Robert asked for.
>> >
>> > I have attached two scripts which demonstrate the following problems:
>
> Thanks for the scripts, but I think Robert's concerns come from the safety,
> and not the performance, of the parallel operation.
>
> Proving its vulnerability could be easy with a counter example, but assuring
> its safety is trickier. What test would suffice to do that?

I'm seeing that there has been no activity in this thread for more than
5 months. I'm planning to close this in the current commitfest unless
someone is planning to take it forward. It can be opened again when
there is more interest.

Regards,
Vignesh
Re: ReadRecentBuffer() doesn't scale well
On Fri, 30 Jun 2023 at 07:43, Thomas Munro wrote:
>
> On Fri, Jun 30, 2023 at 3:39 AM Andres Freund wrote:
> > I am wondering if we don't want something more generic than stashing this in
> > rd_amcache. Don't want to end up duplicating relevant code across the uses of
> > rd_amcache in every AM.
>
> I suppose we could try to track hot pages automatically. Ants Aasma
> mentioned that he was working on a tiny SIMD-based LRU that could be
> useful for something like that, without being too slow. Just for
> fun/experimentation, here's a simple attempt to use a very stupid
> stand-in LRU to cache the most recent 8 lookups for each fork of each
> relation. Obviously that will miss every time for many workloads, so
> it'd have to be pretty close to free, and this code probably isn't good
> enough, but perhaps Ants knows how to sprinkle the right magic fairy
> dust on it. It should automatically discover things like root pages,
> the heap target block during repeat inserts, etc., and visibility map
> pages would stick around, and perhaps a few more pages of btrees that
> have a very hot key range (but not pgbench).
>
> > I do wonder if we should have an unlocked pre-check for a) the buffer being
> > valid and b) BufferTagsEqual() matching. With such a pre-check the race for
> > increasing the usage count of the wrong buffer is quite small; without the
> > pre-check it doesn't seem that small anymore.
>
> Yeah, that makes sense. Done in this version.

I have changed the status of the commitfest entry to "Waiting on Author"
as Andres's comments were not discussed further. Feel free to handle the
comments and change the status accordingly for the commitfest entry.

Regards,
Vignesh
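For readers following along in the archive: the "very stupid stand-in LRU"
described above could look roughly like the sketch below. This is an
illustration only, not the posted patch; every struct and function name
here is invented, and a real version would have to keep the miss path
nearly free.

#include "postgres.h"
#include "storage/block.h"
#include "storage/buf.h"

#define RECENT_BUFFER_SLOTS 8

typedef struct RecentBufferEntry
{
	BlockNumber blocknum;		/* block number we remembered */
	Buffer		buffer;			/* buffer it was last seen in; only a hint,
								 * the caller must still pin and re-check */
} RecentBufferEntry;

typedef struct RecentBufferCache
{
	int			nused;
	RecentBufferEntry entries[RECENT_BUFFER_SLOTS];
} RecentBufferCache;

/* On a hit, move the entry to the front so hot pages stay resident. */
static inline Buffer
recent_buffer_lookup(RecentBufferCache *cache, BlockNumber blocknum)
{
	for (int i = 0; i < cache->nused; i++)
	{
		if (cache->entries[i].blocknum == blocknum)
		{
			RecentBufferEntry hit = cache->entries[i];

			/* shift entries 0..i-1 down one slot, put the hit first */
			memmove(&cache->entries[1], &cache->entries[0],
					i * sizeof(RecentBufferEntry));
			cache->entries[0] = hit;
			return hit.buffer;
		}
	}
	return InvalidBuffer;
}

/*
 * Remember a lookup (assumed to be called only after a lookup miss),
 * evicting the least recently used slot if the cache is full.
 */
static inline void
recent_buffer_remember(RecentBufferCache *cache, BlockNumber blocknum,
					   Buffer buffer)
{
	int			nshift = Min(cache->nused, RECENT_BUFFER_SLOTS - 1);

	memmove(&cache->entries[1], &cache->entries[0],
			nshift * sizeof(RecentBufferEntry));
	cache->entries[0].blocknum = blocknum;
	cache->entries[0].buffer = buffer;
	cache->nused = Min(cache->nused + 1, RECENT_BUFFER_SLOTS);
}

The move-to-front ordering makes the array itself the LRU order, so
eviction is just letting the last slot fall off the end.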
Re: XLog size reductions: smaller XLRec block header for PG17
On Tue, 26 Sept 2023 at 02:09, Matthias van de Meent wrote:
>
> On Tue, 19 Sept 2023 at 01:03, Andres Freund wrote:
> >
> > Hi,
> >
> > On 2023-05-18 19:22:26 +0300, Heikki Linnakangas wrote:
> > > On 18/05/2023 17:59, Matthias van de Meent wrote:
> > > > It changes the block IDs used to fit in 6 bits, using the upper 2 bits
> > > > of the block_id field to store how much data is contained in the
> > > > record (0, <=UINT8_MAX, or <=UINT16_MAX bytes).
> > >
> > > Perhaps we should introduce a few generic inline functions to do varint
> > > encoding. That could be useful in many places, while this scheme is very
> > > tailored for XLogRecordBlockHeader.
>
> This scheme is reused later for the XLogRecord xl_tot_len field over
> at [0], and FWIW is thus being reused. Sure, it's tailored to this WAL
> use case, but IMO we're getting good value from it. We don't use
> protobuf or JSON for WAL, we use our own serialization format. Having
> some specialized encoding/decoding in that format for certain fields
> is IMO quite acceptable.
>
> > Yes - I proposed that and wrote an implementation of reasonably efficient
> > varint encoding. Here's my prototype:
> > https://postgr.es/m/20221004234952.anrguppx5owewb6n%40awork3.anarazel.de
>
> As I mentioned on that thread, that prototype has a significant
> probability of doing nothing to improve WAL size, or even increasing
> the WAL size for installations which consume a lot of OIDs.
>
> > I think it's a bad tradeoff to write lots of custom varint encodings, just to
> > eke out a bit more space savings.
>
> This is only a single "custom" varint encoding though, if you can even
> call it that. It makes a field's size depend on flags set in another
> byte, which is not that much different from the existing use of
> XLR_BLOCK_ID_DATA_[LONG, SHORT].
>
> > The increase in code complexity IMO makes it a bad tradeoff.
>
> Pardon me for asking, but what would you consider to be a good
> tradeoff then? I think the code relating to the WAL storage format is
> about as simple as you can get it within the feature set it provides
> and the size of the resulting records. While I think there is still
> much to gain w.r.t. WAL record size, I don't think we can get much of
> those improvements without adding at least some amount of complexity,
> something I think to be true for most components in PostgreSQL.
>
> So, except for redesigning significant parts of the public WAL APIs,
> are we just going to ignore any potential improvements because they
> "increase code complexity"?

I'm seeing that there has been no activity in this thread for nearly
4 months. I'm planning to close this in the current commitfest unless
someone is planning to take it forward.

Regards,
Vignesh
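To make the scheme concrete for readers of the archive: the low 6 bits of
the block_id byte carry the ID, and the top 2 bits say how many length
bytes follow. The sketch below is a hedged illustration of that idea, not
the patch itself; all macro and function names are invented, and the
little-endian byte order for the two-byte case is an arbitrary choice
here.

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKID_ID_MASK		0x3F		/* low 6 bits: block ID proper */
#define BLKID_LEN_NONE		(0u << 6)	/* no data => no length byte */
#define BLKID_LEN_UINT8		(1u << 6)	/* length fits in one byte */
#define BLKID_LEN_UINT16	(2u << 6)	/* length fits in two bytes */

/* Returns the number of header bytes written: 1, 2, or 3. */
static size_t
encode_block_header(uint8_t *out, uint8_t block_id, uint16_t data_len)
{
	size_t		n = 0;

	assert((block_id & ~BLKID_ID_MASK) == 0);	/* must fit in 6 bits */

	if (data_len == 0)
		out[n++] = block_id | BLKID_LEN_NONE;
	else if (data_len <= UINT8_MAX)
	{
		out[n++] = block_id | BLKID_LEN_UINT8;
		out[n++] = (uint8_t) data_len;
	}
	else
	{
		out[n++] = block_id | BLKID_LEN_UINT16;
		out[n++] = (uint8_t) (data_len & 0xFF);
		out[n++] = (uint8_t) (data_len >> 8);
	}
	return n;
}

The point of the tradeoff discussion above is visible in the return
value: records whose block data is small pay one or two fewer header
bytes than a fixed-width length field would cost.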
Re: Cross-database SERIALIZABLE safe snapshots
On Mon, 10 Jul 2023 at 14:48, Heikki Linnakangas wrote:
>
> On 09/03/2023 07:34, Thomas Munro wrote:
> > Here is a feature idea that emerged from a pgsql-bugs thread[1] that I
> > am kicking into the next commitfest. Example:
> >
> > s1: \c db1
> > s1: CREATE TABLE t (i int);
> > s1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> > s1: INSERT INTO t VALUES (42);
> >
> > s2: \c db2
> > s2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
> > s2: SELECT * FROM x;
> >
> > I don't know of any reason why s2 should have to wait, or
> > alternatively, without DEFERRABLE, why it shouldn't immediately drop
> > from SSI to SI (that is, opt out of predicate locking and go faster).
> > This change relies on the fact that PostgreSQL doesn't allow any kind
> > of cross-database access, except for shared catalogs, and all catalogs
> > are already exempt from SSI. I have updated this new version of the
> > patch to explain that very clearly at the place where that exemption
> > happens, so that future hackers would see that we rely on that fact
> > elsewhere if reconsidering that.
>
> Makes sense.
>
> > @@ -1814,7 +1823,17 @@ GetSerializableTransactionSnapshotInt(Snapshot snapshot,
> >  	{
> >  		othersxact = dlist_container(SERIALIZABLEXACT, xactLink, iter.cur);
> >
> > -		if (!SxactIsCommitted(othersxact)
> > +		/*
> > +		 * We can't possibly have an unsafe conflict with a transaction in
> > +		 * another database. The only possible overlap is on shared
> > +		 * catalogs, but we don't support SSI for shared catalogs. The
> > +		 * invalid database case covers 2PC, because we don't yet record
> > +		 * database OIDs in the 2PC information. We also filter out doomed
> > +		 * transactions as they can't possibly commit.
> > +		 */
> > +		if ((othersxact->database == InvalidOid ||
> > +			 othersxact->database == MyDatabaseId)
> > +			&& !SxactIsCommitted(othersxact)
> >  			&& !SxactIsDoomed(othersxact)
> >  			&& !SxactIsReadOnly(othersxact))
> >  		{
>
> Why don't we set the database OID in 2PC transactions? We actually do
> set it correctly - or rather we never clear it - when a transaction is
> prepared. But you set it to invalid when recovering a prepared
> transaction on system startup. So the comment is a bit misleading: the
> optimization doesn't apply to 2PC transactions recovered after restart;
> other 2PC transactions are fine.
>
> I'm sure it's not a big deal in practice, but it's also not hard to fix.
> We do store the database OID in the twophase state. The caller of
> predicatelock_twophase_recover() has it, we just need a little plumbing
> to pass it down.
>
> Attached patches:
>
> 1. Rebased version of your patch, just trivial pgindent conflict fixes
> 2. Some comment typo fixes and improvements
> 3. Set the database ID on recovered 2PC transactions
>
> A test for this would be nice. isolationtester doesn't support
> connecting to different databases or restarting the server to test the
> 2PC recovery, but a TAP test could do it.

@Thomas Munro As this patch is already marked as "Ready for Committer",
do you want to take this patch forward based on Heikki's suggestions
and get it committed?

Regards,
Vignesh
Re: PATCH: Using BRIN indexes for sorted output
On Wed, 2 Aug 2023 at 21:34, Tomas Vondra wrote:
>
> On 8/2/23 17:25, Sergey Dudoladov wrote:
> > Hello,
> >
> >> Parallel version is not supported, but I think it should be possible.
> >
> > @Tomas are you working on this? If not, I would like to give it a try.
>
> Feel free to try. Just keep it in a separate part/patch, to make it
> easier to combine the work later.
>
> >> static void
> >> AssertCheckRanges(BrinSortState *node)
> >> {
> >> #ifdef USE_ASSERT_CHECKING
> >>
> >> #endif
> >> }
> >
> > I guess it should not be empty at the ongoing development stage.
> >
> > Attached a small modification of the patch with a draft of the docs.
>
> Thanks. FWIW it's generally better to always post the whole patch
> series, otherwise the cfbot gets confused as it's unable to combine
> stuff from different messages.

Are we planning to take this patch forward? It has been nearly 5 months
since the last discussion on this. If the interest has gone down and
there are no plans to handle this, I'm thinking of returning this
commitfest entry in this commitfest; it can be opened again when there
is more interest.

Regards,
Vignesh
Re: Change GUC hashtable to use simplehash?
On Sat, 2024-01-20 at 13:48 +0700, John Naylor wrote:
> The above identity is not true for this haszero64 macro.

I see.

> I hope this makes it more clear. Maybe the comment could use some
> work.

Yes, thank you. I don't think we need to change the algorithm.

After having stepped away from this work for a couple weeks and
returning to it, I think the comments and/or naming could be more
clear. We first use the result of haszero64() as a boolean to break out
of the loop, but then later use it in a more interesting way to count
the number of remaining bytes.

Perhaps you can take the comment out of the loop and just describe the
algorithm we're using, and make a note that we have to byteswap first.
"Indeterminate" could be explained briefly as well.

These are minor comments.

Regards,
Jeff Davis
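For context, the zero-byte test under discussion is conventionally
written as the bit trick below; the patch's actual macro and surrounding
loop may differ in detail, and __builtin_ctzll is just one (GCC/Clang)
way to turn the result into a byte count.

#include <stdint.h>

/*
 * Classic zero-byte test (a generic formulation; the patch's macro may
 * differ): the high bit of each byte of the result is set if that byte
 * of v is zero. Bits above the first zero byte (in little-endian byte
 * order) can be false positives because the subtraction's borrows
 * propagate upward, which is why the identity John mentions doesn't
 * hold in general; the lowest set bit is always reliable, though.
 */
#define haszero64(v) \
	(((v) - UINT64_C(0x0101010101010101)) & ~(v) & \
	 UINT64_C(0x8080808080808080))

/*
 * Usage as in the discussion: first as a boolean to stop the loop, then
 * to count the bytes preceding the NUL. On a big-endian machine the
 * chunk must be byteswapped first so memory order matches bit order.
 */
static inline int
bytes_before_zero(uint64_t chunk)
{
	uint64_t	z = haszero64(chunk);

	return z ? __builtin_ctzll(z) / 8 : 8;	/* GCC/Clang builtin */
}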
Re: PG12 change to DO UPDATE SET column references
On Sat, Jan 20, 2024 at 5:57 PM Tom Lane wrote:
>
> James Coleman writes:
> > On Sat, Jan 20, 2024 at 12:59 PM Tom Lane wrote:
> >> A HINT if the bogus column name (1) matches the relation name and
> >> (2) is field-qualified seems plausible to me. Then it's pretty
> >> likely to be a user misunderstanding about whether to write the
> >> relation name.
>
> > Attached is a patch to do just that. We could also add tests for
> > regular UPDATEs if you think that's useful.
>
> Pushed with minor alterations:
>
> 1. I think our usual style for conditional hints is to use a ternary
> expression within the ereport, rather than duplicating code. In this
> case that way allows not touching any of the existing lines, making
> review easier.

Ah, I'd wondered if we had a pattern for that, but I didn't know what
I was looking for.

> 2. I thought we should test the UPDATE case as well as the ON CONFLICT
> case, but I didn't think we needed quite as many tests as you had
> here. I split up the responsibility so that one test covers the
> alias case and the other the no-alias case.

That all makes sense. I figured it was better to have tests show all
the possible combinations for review and then you could whittle them
down as you saw fit.

Thanks for reviewing and committing!

Regards,
James Coleman
Re: Make documentation builds reproducible
Peter Eisentraut writes:
> On 20.01.24 17:03, Tom Lane wrote:
>> * I gather that the point here is to change some generated anchor
>> tags. Would any of these tags be things people would be likely
>> to have bookmarked?

> No, because the problem is that the anchor names are randomly generated
> in each build.

D'oh. No objection then.

			regards, tom lane
Re: PG12 change to DO UPDATE SET column references
James Coleman writes:
> On Sat, Jan 20, 2024 at 12:59 PM Tom Lane wrote:
>> A HINT if the bogus column name (1) matches the relation name and
>> (2) is field-qualified seems plausible to me. Then it's pretty
>> likely to be a user misunderstanding about whether to write the
>> relation name.

> Attached is a patch to do just that. We could also add tests for
> regular UPDATEs if you think that's useful.

Pushed with minor alterations:

1. I think our usual style for conditional hints is to use a ternary
expression within the ereport, rather than duplicating code. In this
case that way allows not touching any of the existing lines, making
review easier.

2. I thought we should test the UPDATE case as well as the ON CONFLICT
case, but I didn't think we needed quite as many tests as you had
here. I split up the responsibility so that one test covers the
alias case and the other the no-alias case.

			regards, tom lane
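For anyone reading along, the ternary-in-ereport pattern Tom describes
has the general shape below. This is illustrative only; the error code,
message, hint text, and variable names are made up for the example, not
necessarily the committed wording.

/*
 * Conditional hint as a ternary inside the ereport: when the condition
 * is false, the 0 contributes nothing, so no code is duplicated.
 */
ereport(ERROR,
		(errcode(ERRCODE_UNDEFINED_COLUMN),
		 errmsg("column \"%s\" of relation \"%s\" does not exist",
				colname, relname),
		 strcmp(colname, relname) == 0 ?
		 errhint("SET target columns cannot be qualified with the relation name.") : 0));

Because the hint is just one more expression inside the existing ereport
call, none of the surrounding lines need to change, which is what keeps
the committed diff small and easy to review.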
Re: Make documentation builds reproducible
On 20.01.24 17:03, Tom Lane wrote:
> Peter Eisentraut writes:
>> I think there was general agreement with what this patch is doing, but I
>> guess it's too boring to actually review the patch in detail. Let's
>> say, if there are no objections, I'll go ahead and commit it.
>
> I re-read the thread and have two thoughts:
>
> * We worried about whether this change would be compatible with a
> (presently unreleased) version of docbook that contains the upstreamed
> fix. It seems unlikely that there's a problem, but maybe worth checking?

The code in the patch is the same code as upstream, so it would behave
the same as a new release.

> * I gather that the point here is to change some generated anchor
> tags. Would any of these tags be things people would be likely
> to have bookmarked?

No, because the problem is that the anchor names are randomly generated
in each build.
Re: PG12 change to DO UPDATE SET column references
On Sat, Jan 20, 2024 at 12:59 PM Tom Lane wrote:
>
> James Coleman writes:
> > I do wonder if it's plausible (and sufficiently easy) to improve the
> > error message here. "column 'foo' of relation 'foo'" makes one think
> > that you've written foo.foo (in my real-world case the error message
> > also cut off the sql past "foo.", and so I couldn't even tell if the
> > sql was just malformed). At the very least it'd be nice to have a HINT
> > here (perhaps just when the relation and column name match).
>
> > Before I look at where it is, is such an improvement something we'd be
> > interested in?
>
> A HINT if the bogus column name (1) matches the relation name and
> (2) is field-qualified seems plausible to me. Then it's pretty
> likely to be a user misunderstanding about whether to write the
> relation name.

Attached is a patch to do just that. We could also add tests for
regular UPDATEs if you think that's useful.

Regards,
James Coleman

v1-0001-Helpful-hint-for-qualified-target-column-in-UPDAT.patch
Description: Binary data
Re: [PATCH] Add support function for containment operators
jian he writes:
> Now I see your point. If the transformed plan is right, the whole
> added code should be fine.
> But keeping the textrange_supp related test should be a good idea,
> since we don't have SUBTYPE_OPCLASS related sql tests.

Yeah, it's a little harder to make a table-less test for that case.
I thought about using current_user or the like as a stable comparison
value, but that introduces some doubt about what the collation would
be. That test seems cheap enough as-is, since it's handling only a
tiny amount of data. Committed.

			regards, tom lane
Re: PG12 change to DO UPDATE SET column references
James Coleman writes:
> I do wonder if it's plausible (and sufficiently easy) to improve the
> error message here. "column 'foo' of relation 'foo'" makes one think
> that you've written foo.foo (in my real-world case the error message
> also cut off the sql past "foo.", and so I couldn't even tell if the
> sql was just malformed). At the very least it'd be nice to have a HINT
> here (perhaps just when the relation and column name match).

> Before I look at where it is, is such an improvement something we'd be
> interested in?

A HINT if the bogus column name (1) matches the relation name and
(2) is field-qualified seems plausible to me. Then it's pretty
likely to be a user misunderstanding about whether to write the
relation name.

			regards, tom lane
Re: PG12 change to DO UPDATE SET column references
On Sat, Jan 20, 2024 at 11:12 AM Tom Lane wrote:
>
> James Coleman writes:
> > Suppose I have this table:
> > create table foo (id int primary key);
>
> > On PG11 this works:
> > postgres=# insert into foo (id) values (1) on conflict (id) do update
> > set foo.id = 1;
> > INSERT 0 1
>
> Hmm, are you sure about that? I get
>
> ERROR:  column "foo" of relation "foo" does not exist
> LINE 2: on conflict (id) do update set foo.id = 1;
>                                        ^
>
> in every branch back to 9.5 where ON CONFLICT was introduced.
>
> I'm checking branch tip in each case, so conceivably this is
> something that was changed post-11.0, but I kinda doubt we
> would have back-patched it.

Hmm, I just tested it on the official 11.15 docker image and couldn't
reproduce it. That leads me to believe that the difference isn't in
PG11 vs. 12, but rather in 2ndQuadrant Postgres (which we are running
for PG11, but are not using for > 11). Egg on my face twice in this
thread.

I do wonder if it's plausible (and sufficiently easy) to improve the
error message here. "column 'foo' of relation 'foo'" makes one think
that you've written foo.foo (in my real-world case the error message
also cut off the sql past "foo.", and so I couldn't even tell if the
sql was just malformed). At the very least it'd be nice to have a HINT
here (perhaps just when the relation and column name match).

Before I look at where it is, is such an improvement something we'd be
interested in?

Regards,
James Coleman
Re: Patch: Improve Boolean Predicate JSON Path Docs
So, overall reaction to this patch: I like the approach of defining
"predicate check expressions" as being a different thing from standard
jsonpath expressions. However, I'm not so thrilled with just saying
"don't use" one type or the other with different jsonpath functions.
According to my tests, some of these functions seem to give sensible
results anyway with the path type you say not to use, while some give
less-sensible results, and others give errors. We ought to try to
document that, and maybe even clean up the less sane behaviors.
(That is, I don't feel that a docs-only patch is necessarily the
thing to do here.)

As an example, @? seems to behave sanely with a standard jsonpath:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ < 5)';
 ?column?
----------
 t
(1 row)

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 5)';
 ?column?
----------
 f
(1 row)

It will take a predicate, but seems to always return true:

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] < 5';
 ?column?
----------
 t
(1 row)

regression=# select '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] > 5';
 ?column?
----------
 t
(1 row)

Surely we're not helping anybody by leaving that behavior in place.
Making it do something useful, throwing an error, or returning NULL
all seem superior to this. I observe that @@ returns NULL for the
path type it doesn't like, so maybe that's what to do here.
(Unsurprisingly, jsonb_path_exists acts similarly.)

BTW, jsonb_path_query_array and jsonb_path_query_first seem to take
both types of path, like jsonb_path_query, so ISTM they need docs
changes too.

			regards, tom lane
Re: Patch: Improve Boolean Predicate JSON Path Docs
"David E. Wheeler" writes: > While you’re in there, Tom, would it make sense to fold in something like > [this patch][1] I posted last month to clarify which JSONPath comparison > operators can take advantage of a index? > --- a/doc/src/sgml/json.sgml > +++ b/doc/src/sgml/json.sgml > @@ -513,7 +513,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ > '$.tags[*] == "qui"'; > > For these operators, a GIN index extracts clauses of the form > accessors_chain > -= constant out of > +== constant out of > the jsonpath pattern, and does the index search based on > the keys and values mentioned in these clauses. The accessors chain > may include .key, Right, clearly a typo. > @@ -522,6 +522,9 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ > '$.tags[*] == "qui"'; > The jsonb_ops operator class also > supports .* and .** accessors, > but the jsonb_path_ops operator class does not. > +Only the == and != +linkend="functions-sqljson-path-operators">SQL/JSON Path Operators > +can use the index. > You sure about that? It would surprise me if we could effectively use a not-equal condition with an index. If it is only == that works, then the preceding statement seems sufficient. regards, tom lane
Re: PG12 change to DO UPDATE SET column references
James Coleman writes:
> Suppose I have this table:
> create table foo (id int primary key);

> On PG11 this works:
> postgres=# insert into foo (id) values (1) on conflict (id) do update
> set foo.id = 1;
> INSERT 0 1

Hmm, are you sure about that? I get

ERROR:  column "foo" of relation "foo" does not exist
LINE 2: on conflict (id) do update set foo.id = 1;
                                       ^

in every branch back to 9.5 where ON CONFLICT was introduced.

I'm checking branch tip in each case, so conceivably this is
something that was changed post-11.0, but I kinda doubt we
would have back-patched it.

			regards, tom lane
Re: Make documentation builds reproducible
Peter Eisentraut writes:
> I think there was general agreement with what this patch is doing, but I
> guess it's too boring to actually review the patch in detail. Let's
> say, if there are no objections, I'll go ahead and commit it.

I re-read the thread and have two thoughts:

* We worried about whether this change would be compatible with a
(presently unreleased) version of docbook that contains the upstreamed
fix. It seems unlikely that there's a problem, but maybe worth checking?

* I gather that the point here is to change some generated anchor
tags. Would any of these tags be things people would be likely
to have bookmarked?

			regards, tom lane
Re: Patch: Improve Boolean Predicate JSON Path Docs
On Jan 19, 2024, at 21:46, Erik Wienhold wrote:

> Interesting... copy-pasting the entire \set command works for me with
> psql 16.1 in gnome-terminal and tmux. Typing it out manually gives me
> the "unterminated quoted string" error. Maybe has to do with my stty
> settings.

Yes, same on macOS Terminal.app and 16.1 compiled with readline. I
didn’t realize that \set didn’t support newlines, because it works fine
when you paste something with newlines. Curious.

>> I experimented with
>>
>> SELECT '
>> ... multiline json value ...
>> ' AS json
>> \gexec
>>
>> but that didn't seem to work either. Anybody have a better idea?
>
> Fine with me (the \gset variant).

Much cleaner TBH.

david=# select '{
  "track": {
    "segments": [
      {
        "location": [ 47.763, 13.4034 ],
        "start time": "2018-10-14 10:05:14",
        "HR": 73
      },
      {
        "location": [ 47.706, 13.2635 ],
        "start time": "2018-10-14 10:39:21",
        "HR": 135
      }
    ]
  }
}'::jsonb as json;
                                                    json
------------------------------------------------------------------------------------------------------------
 {"track": {"segments": [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]}}
(1 row)

david=# \gset
david=# select :'json'::jsonb;
                                                   jsonb
------------------------------------------------------------------------------------------------------------
 {"track": {"segments": [{"HR": 73, "location": [47.763, 13.4034], "start time": "2018-10-14 10:05:14"}, {"HR": 135, "location": [47.706, 13.2635], "start time": "2018-10-14 10:39:21"}]}}
(1 row)

So great!

While you’re in there, Tom, would it make sense to fold in something
like [this patch][1] I posted last month to clarify which JSONPath
comparison operators can take advantage of an index?

--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -513,7 +513,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
     For these operators, a GIN index extracts clauses of the form
     accessors_chain
-    = constant out of
+    == constant out of
     the jsonpath pattern, and does the index search based on
     the keys and values mentioned in these clauses. The accessors chain
     may include .key,
@@ -522,6 +522,9 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @@ '$.tags[*] == "qui"';
     The jsonb_ops operator class also
     supports .* and .** accessors,
     but the jsonb_path_ops operator class does not.
+    Only the == and != <link
+    linkend="functions-sqljson-path-operators">SQL/JSON Path Operators</link>
+    can use the index.

Best,

David

[1]: https://www.postgresql.org/message-id/0ece6b9c-cdde-4b65-be5a-49d737204...@justatheory.com
Re: PG12 change to DO UPDATE SET column references
On Saturday, January 20, 2024, James Coleman wrote:
>
> Well, egg on my face for definitely missing that in the docs.
>
> Unfortunately that doesn't explain why it works on PG11 and not on PG12.
>

It was a bug that got fixed. I'm sure a search of the mailing list
archives or Git will turn up the relevant discussion when it happened;
though I suppose it may just have been a refactoring to leverage the
consistency with UPDATE and no one realized they were fixing a bug at
the time.

David J.
Re: PG12 change to DO UPDATE SET column references
On Fri, Jan 19, 2024 at 1:53 PM David G. Johnston wrote:
>
> On Fri, Jan 19, 2024 at 10:01 AM James Coleman wrote:
>>
>> Making this more confusing is the fact that if I want to do something
>> like "SET bar = foo.bar + 1" the table qualification cannot be present
>> on the setting column but is required on the reading column.
>>
>> There isn't anything in the docs that I see about this, and I don't
>> see anything scanning the release notes for PG12 either (though I
>> could have missed a keyword to search for).
>
> https://www.postgresql.org/docs/12/sql-insert.html
>
> "When referencing a column with ON CONFLICT DO UPDATE, do not include the
> table's name in the specification of a target column. For example, INSERT
> INTO table_name ... ON CONFLICT DO UPDATE SET table_name.col = 1 is invalid
> (this follows the general behavior for UPDATE)."
>
> The same text exists for v11.

Well, egg on my face for definitely missing that in the docs.

Unfortunately that doesn't explain why it works on PG11 and not on PG12.

Regards,
James Coleman
Re: Synchronizing slots from primary to standby
On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar wrote:
>
> On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila wrote:
> >
> > On Wed, Jan 17, 2024 at 4:00 PM shveta malik wrote:
> > >
> >
> > I had some off-list discussions with Sawada-San, Hou-San, and Shveta
> > on the topic of extending replication commands instead of using the
> > current model where we fetch the required slot information via SQL
> > using a database connection. I would like to summarize the discussion
> > and would like to know the thoughts of others on this topic.
> >
> > In the current patch, we launch the slotsync worker on the physical
> > standby which connects to the specified database (currently we let
> > users specify the required dbname in primary_conninfo) on the primary.
> > It then fetches the required information for failover marked slots
> > from the primary and also does some primitive checks on the upstream
> > node via SQL (the additional checks are like whether the upstream node
> > has a specified physical slot or whether the upstream node is a
> > primary node or a standby node). To fetch the required information it
> > uses a libpqwalreceiver API which is mostly apt for this purpose as it
> > supports SQL execution, but for this patch, we don't need a replication
> > connection, so we extend the libpqwalreceiver connect API.
>
> What sort of extension have we done to 'libpqwalreceiver'? Is it
> something like this: by default it supports replication connections, so
> we have extended the API so that we can provide an option for whether
> to create a replication connection or a normal connection?

Yeah, and in the future there could be more as well. The other function
added, walrcv_get_dbname_from_conninfo, doesn't appear to be a problem
either for now.

> > Now, the concerns related to this could be that users would probably
> > need to change existing mechanisms/tools to update primary_conninfo,
> > and one of the alternatives proposed is to have an additional GUC like
> > slot_sync_dbname. Users won't be able to drop the database this worker
> > is connected to, aka whatever is specified in slot_sync_dbname, but as
> > the user herself sets up the configuration it shouldn't be a big deal.
>
> Yeah, for this purpose users may use template1 or so, which they
> generally don't plan to drop.

Using template1 has other problems, like users won't be able to create
a new database. See [2] (point number 2.2).

> So in case the user wants to drop that
> database, the user needs to turn off the slot syncing option and then
> it can be done?

Right.

> > Then we also discussed whether extending libpqwalreceiver's connect
> > API is a good idea and whether we need to further extend it in the
> > future. As far as I can see, the slotsync worker's primary requirement
> > is to execute SQL queries, for which the current API is sufficient,
> > and I don't see something that needs any drastic change in this API.
> > Note that the tablesync worker that executes SQL also uses these APIs,
> > so we may need something in the future for either of those. Then
> > finally we need a slotsync worker to also connect to a database to use
> > SQL and fetch results.
>
> While looking into the patch v64-0002 I could not exactly point out
> what sort of extensions are there in libpqwalreceiver.c, I just saw
> one extra API for fetching the dbname from connection info?

Right, the worry was that we may need it in the future.

> > Now, let us consider if we extend the replication commands like
> > READ_REPLICATION_SLOT and/or introduce a new set of replication
> > commands to fetch the required information; then we don't need a DB
> > connection with the primary or a connection in the slotsync worker.
> > As per my current understanding, it is quite doable, but I think we
> > will slowly go in the direction of making replication commands
> > something like SQL, because today we need to extend them to fetch all
> > slots info that have failover marked as true, the existence of a
> > particular replication, etc. Then tomorrow, if we want to extend this
> > work to have multiple slotsync workers, say workers per db, then we
> > have to extend the replication command to fetch per-database failover
> > marked slots. To me, it sounds more like we are slowly adding SQL-like
> > features to replication commands.
> >
> > Apart from this, when we are reading per-db replication slots without
> > connecting to a database, we probably need some additional protection
> > mechanism so that the database won't get dropped.
>
> Something like locking the database only while fetching the slots?

Possible, but can we lock the database from an auxiliary process?

> > Considering all this, it seems that for now probably extending
> > replication commands can simplify a few things like mentioned above,
> > but using SQL with a db-connection is more extendable.
>
> Even I have similar thoughts.

Thanks.

[1] - https://www.postgresql.org/message-id/CAJpy0uBhPx1MDHh903XpFAhpBH23KzVXyg_4VjH2zXk81oGi1w%40mail.gmai
Re: Make documentation builds reproducible
On 20.01.24 03:33, vignesh C wrote:
> On Fri, 25 Aug 2023 at 01:23, Tristan Partin wrote:
>> On Thu Aug 24, 2023 at 2:30 PM CDT, Tom Lane wrote:
>>> "Tristan Partin" writes:
>>>> On Wed Aug 23, 2023 at 2:24 PM CDT, Peter Eisentraut wrote:
>>>>> Somewhere at PGCon, I forgot exactly where, maybe in the same meeting
>>>>> where we talked about getting rid of distprep, we talked about the
>>>>> fact that the documentation builds are not reproducible (in the sense
>>>>> of https://reproducible-builds.org/). This is easily fixable,
>>>>
>>>> Is there anything I am missing? Is Postgres relying on releases older
>>>> than snapshot-2018-12-07-01? If so, is it possible to up the minimum
>>>> version?
>>>
>>> AFAICT the "latest stable release" of docbook-xsl is still 1.79.2,
>>> which seems to have been released in 2017, so it's unsurprising that
>>> it's missing this fix. It's kind of hard to argue that developers
>>> (much less distro packagers) should install unsupported snapshot
>>> releases in order to build our docs.
>>>
>>> Having said that, maybe we should check whether this patch is
>>> compatible with those snapshot releases, just in case somebody is
>>> using one.
>>
>> I agree with you. Thanks for the pointer.
>
> I'm seeing that there has been no activity in this thread for nearly
> 5 months. I'm planning to close this in the current commitfest unless
> someone is planning to take it forward.

I think there was general agreement with what this patch is doing, but
I guess it's too boring to actually review the patch in detail. Let's
say, if there are no objections, I'll go ahead and commit it.
Re: Create shorthand for including all extra tests
On 15.01.24 09:54, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Wed, 10 Jan 2024 at 23:48, Peter Eisentraut wrote:
>> On 05.09.23 19:26, Nazir Bilal Yavuz wrote:
>>> Thanks for the feedback! I updated the patch; the 'needs-private-lo'
>>> option enables the kerberos, ldap, load_balance, and ssl extra tests
>>> now.
>>
>> As was discussed, I don't think "needs private lo" is the only
>> condition for these tests. At least kerberos and ldap also need extra
>> software installed, and load_balance might need editing the system's
>> hosts file. So someone would still need to familiarize themselves with
>> these tests individually before setting a global option like this.
>>
>> Also, if we were to create test groupings like this, I think the
>> implementation should be different. The way you have it, there is a
>> sort of central registry of all affected tests in
>> src/test/perl/PostgreSQL/Test/Utils.pm and a mapping of groups to
>> tests. I would prefer a more decentralized approach where each test
>> decides on its own whether to run, with pseudo-code conditionals like
>>
>>     if (!(PG_TEST_EXTRA contains "ldap"
>>           or PG_TEST_EXTRA contains "needs-private-lo"))
>>         skip_all
>>
>> Anyway, at the moment, I don't see a sensible way to group these
>> things beyond what we have now (effectively, "ldap" is already a
>> group, because it affects more than one test suite). Right now, we
>> have six possible values, which is probably just about doable to keep
>> track of manually. If we get a lot more, then we need to look into
>> this again, but maybe then we'll also have more patterns to group
>> things around.
>
> I see your point. It looks like the best option is to reevaluate this
> if there are more PG_TEST_EXTRA options.

Ok, I'm closing this commitfest entry.