Re: pg_receivewal - couple of improvements
On Sun, Feb 6, 2022 at 12:16 PM Michael Paquier wrote:
> On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote:
> > I don't get it. If you're missing WAL it means that you will have to do
> > that tedious manual work to retrieve them no matter what. So on top of
> > that tedious work, you also have to make sure that you don't provide a
> > bogus start position.
>
> I may be wrong in saying that, but the primary use case I have seen
> for pg_receivewal is a service integration for archiving.

Not only for archiving, but it can also be used as a synchronous/asynchronous standby.

> > Maybe that's a good idea but I'm still having a hard time imagining a
> > scenario where it would actually be a good idea.
>
> With the defaults that we have now in place (current LSN location,
> current slot's location or the archive location), I am not really
> convinced that we need more control in this area with the proposed
> option.

Having the start position as an option for pg_receivewal can be useful in scenarios where the LSN/WAL file calculated from pg_receivewal's target directory has been removed by the primary (for whatever reason). In such scenarios, we have to manually remove the WAL files (a risky thing IMO) in pg_receivewal's target directory so that pg_receivewal will calculate the next start position from the slot info, and then from the server position via the IDENTIFY_SYSTEM command. With the start position as an option, users can just provide the LSN from which they want to stream WAL; in the above case, it can be the primary's latest checkpoint LSN. I still feel that the start position as an option would be a good addition here, so that users can choose the start position.

If others don't agree with the start position as an option, at least having the current start position calculation (first, from pg_receivewal's target directory; if not, from the slot's restart_lsn; if not, from the server's IDENTIFY_SYSTEM command) as a user-specified option will be of great help. Users can say, "calculate the start position from the target directory, the slot's restart_lsn, or the server's IDENTIFY_SYSTEM command".

Regards, Bharath Rupireddy.
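The fallback chain described above, plus the proposed user-specified start position, can be modeled in a few lines. This is a toy sketch of the decision order only — the function and argument names are invented for illustration and this is not pg_receivewal code:

```python
# Toy model of pg_receivewal's start-position fallback chain as described
# in the thread: target directory -> slot restart_lsn -> IDENTIFY_SYSTEM.
# 'user_lsn' models the start-position option proposed above.

def choose_start_position(dir_lsn, slot_restart_lsn, identify_system_lsn,
                          user_lsn=None):
    """Return (source, lsn) describing where streaming would begin."""
    if user_lsn is not None:
        return ("user option", user_lsn)          # proposed override
    if dir_lsn is not None:
        return ("target directory", dir_lsn)      # WAL already on disk
    if slot_restart_lsn is not None:
        return ("slot restart_lsn", slot_restart_lsn)
    return ("IDENTIFY_SYSTEM", identify_system_lsn)

# The failure mode discussed above: the directory-derived LSN points at WAL
# the primary has already removed, so today one must empty the directory to
# fall through to the next source.
print(choose_start_position("0/5000000", "0/7000000", "0/9000000"))
print(choose_start_position(None, "0/7000000", "0/9000000"))
print(choose_start_position("0/5000000", None, "0/9000000",
                            user_lsn="0/8000000"))
```

With the proposed option, the last call shows the override taking precedence over everything the automatic calculation would pick.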
Re: pg_upgrade should truncate/remove its logs before running
Michael Paquier writes:
> On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
>> As already mentioned, there's been no notice to buildfarm owners ...
>> so has Andrew actually made a release?
> There has been one as of 8 days ago:
> https://github.com/PGBuildFarm/client-code/releases

[ scrapes buildfarm logs ... ] Not even Andrew's own buildfarm critters are using it, so permit me leave to doubt that he thinks it's fully baked. Andrew?

regards, tom lane
Re: pg_upgrade should truncate/remove its logs before running
On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
> As already mentioned, there's been no notice to buildfarm owners ...
> so has Andrew actually made a release?

There has been one as of 8 days ago:
https://github.com/PGBuildFarm/client-code/releases

And I have just looked at that as a point of reference.
--
Michael
Re: pg_upgrade should truncate/remove its logs before running
On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
> Michael Paquier writes:
> > So, it took me some time to get back to this thread, and looked at it
> > for the last couple of days... The buildfarm client v14 has been
> > released on the 29th of January, which means that we are good to go.
>
> As already mentioned, there's been no notice to buildfarm owners ...
> so has Andrew actually made a release?

There's a v14 release on the github project ([1]) from 8 days ago, so it seems so.

[1] https://github.com/PGBuildFarm/client-code/releases/tag/REL_14
Re: Ensure that STDERR is empty during connect_ok
On Wed, Feb 02, 2022 at 03:40:39PM +0100, Daniel Gustafsson wrote:
> As part of the NSS patchset, quite a few bugs (and NSS quirks) were found by
> inspecting STDERR in connect_ok and requiring it to be empty. This is not
> really NSS specific, and could help find issues in other libraries as well,
> so I propose to apply it regardless of the fate of the NSS patchset.

Sounds sensible from here. Thanks! All the code paths seem to be covered, at a quick glance.
--
Michael
Re: pg_upgrade should truncate/remove its logs before running
Michael Paquier writes:
> So, it took me some time to get back to this thread, and looked at it
> for the last couple of days... The buildfarm client v14 has been
> released on the 29th of January, which means that we are good to go.

As already mentioned, there's been no notice to buildfarm owners ...
so has Andrew actually made a release?

regards, tom lane
Re: libpq async duplicate error results
Hello Tom,

> I concur with Fabien's analysis: we report the FATAL message from the
> server during the first PQgetResult, and then the second call discovers
> that the connection is gone and reports "server closed the connection
> unexpectedly". Those are two independent events (in libpq's view anyway)
> so reporting them separately is correct, or at least necessary unless you
> want to engage in seriously major redesign and behavioral changes.
>
> It is annoying that some of the text is duplicated in the second report,
> but in the end that's cosmetic, and I'm not sure we can improve it without
> breaking other things. In particular, we can't just think about what comes
> back in the PQgetResult calls, we also have to consider what
> PQerrorMessage(conn) will report afterwards. So I don't think that
> resetting conn->errorMessage during a PQgetResult series would be a good
> fix.
>
> An idea that I don't have time to pursue right now is to track how much of
> conn->errorMessage has been read out by PQgetResult, and only report
> newly-added text in later PQgetResult calls of a series, while
> PQerrorMessage would continue to report the whole thing. Buffer resets
> would occur where they do now. It'd probably be necessary to make a
> user-visible PQgetResult that becomes a wrapper around PQgetResultInternal,
> because internal callers such as PQexecFinish will need the old behavior,
> or at least some of them will.

I agree that the message reset is not convincing, but it was the only way to get the expected behavior without changing the existing functions.

I see two paths: (1) keep the duplicate message in this corner case; (2) do the hocus pocus you suggest around PQerrorMessage and PQgetResult to work around the duplication, just for this corner case.

I'd tend to choose (1) to keep it simple, even if (2) is feasible.

--
Fabien.
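The read-offset idea sketched above can be modeled compactly. This is a toy Python model of the proposal, not libpq code — the class and method names are invented, with get_result_error() playing the role of what a PQgetResult call would report and full_message() playing PQerrorMessage():

```python
class ErrorBuffer:
    """Toy model of conn->errorMessage with a tracked read offset."""

    def __init__(self):
        self._buf = ""
        self._read = 0   # how much has already been reported

    def append(self, text):
        # server-generated or libpq-generated error text accumulates here
        self._buf += text

    def get_result_error(self):
        # report only text added since the previous call in this series
        new = self._buf[self._read:]
        self._read = len(self._buf)
        return new

    def full_message(self):
        # PQerrorMessage keeps returning everything since the last reset
        return self._buf

    def reset(self):
        self._buf = ""
        self._read = 0

buf = ErrorBuffer()
buf.append("FATAL: terminating connection\n")
print(buf.get_result_error())   # first PQgetResult-style read: FATAL line
buf.append("server closed the connection unexpectedly\n")
print(buf.get_result_error())   # second read: only the newly-added text
print(buf.full_message())       # the full accumulated message
```

Under this model the second read no longer duplicates the FATAL text, while the full message stays intact for PQerrorMessage-style callers.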
Re: 2022-01 Commitfest
Hi,

On Sun, Feb 06, 2022 at 03:49:50PM +0900, Michael Paquier wrote:
> On Wed, Feb 02, 2022 at 01:00:18PM -0500, Tom Lane wrote:
> > Agreed, we're not here to cause make-work for submitters. RWF is
> > appropriate if the patch has been in Waiting On Author for awhile
> > and doesn't seem to be going anywhere, but otherwise we should
> > just punt it to the next CF.
>
> FWIW, I just apply a two-week rule here, as of half the commit fest
> period, to give people time to react:
> - If a patch has been waiting on author since the 15th of January,
>   mark it as RwF.
> - If it has been left as waiting on author after the 15th of January,
>   move it to the next CF.

Thanks. Note that I was planning to do that on Monday, as it didn't seem rushed enough to spend time on it during the weekend.
Re: Ensure that STDERR is empty during connect_ok
On Wed, Feb 02, 2022 at 04:42:09PM +0100, Daniel Gustafsson wrote:
> Disclaimer: longer term I would prefer to remove test plan counting like
> the above, per Tom's comment downthread.

That would be really nice.
--
Michael
Re: 2022-01 Commitfest
On Wed, Feb 02, 2022 at 01:00:18PM -0500, Tom Lane wrote:
> Agreed, we're not here to cause make-work for submitters. RWF is
> appropriate if the patch has been in Waiting On Author for awhile
> and doesn't seem to be going anywhere, but otherwise we should
> just punt it to the next CF.

FWIW, I just apply a two-week rule here, as of half the commit fest period, to give people time to react:
- If a patch has been waiting on author since the 15th of January, mark it as RwF.
- If it has been left as waiting on author after the 15th of January, move it to the next CF.

> Anyway, thanks to Julien for doing this mostly-thankless task
> this time!

+1.
--
Michael
Re: pg_receivewal - couple of improvements
On Thu, Feb 03, 2022 at 10:01:42PM +0800, Julien Rouhaud wrote:
> I don't get it. If you're missing WAL it means that you will have to do
> that tedious manual work to retrieve them no matter what. So on top of
> that tedious work, you also have to make sure that you don't provide a
> bogus start position.

I may be wrong in saying that, but the primary use case I have seen for pg_receivewal is a service integration for archiving.

> Maybe that's a good idea but I'm still having a hard time imagining a
> scenario where it would actually be a good idea.

With the defaults that we have now in place (current LSN location, current slot's location or the archive location), I am not really convinced that we need more control in this area with the proposed option.
--
Michael
Re: Windows now has fdatasync()
On Sun, Dec 12, 2021 at 03:48:10PM +1300, Thomas Munro wrote:
> I tried out a quick POC patch and it runs a bit faster than fsync(), as
> expected.

Good news, as too high a difference would be suspect :) How much difference does it make in %, and are the numbers reasonably reproducible? Just wondering..
--
Michael
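For anyone wanting to get a feel for the fsync()-vs-fdatasync() gap being asked about, here is a rough POSIX-side sketch (the thread itself concerns the new Windows emulation, which this does not exercise). fdatasync skips flushing file metadata such as mtime, so it can be cheaper; absolute numbers and even the relative gap vary widely across filesystems and devices:

```python
# Rough timing comparison of repeated flushed appends with fsync vs
# fdatasync. Illustrative only; not a benchmark methodology.
import os
import tempfile
import time

def time_flushes(sync_fn, iterations=50):
    """Append 8 kB and flush with sync_fn, 'iterations' times; return seconds."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, b"x" * 8192)
            sync_fn(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

# os.fdatasync is not available on every platform; fall back to fsync there.
fdatasync = getattr(os, "fdatasync", os.fsync)
print(f"fsync:     {time_flushes(os.fsync):.4f}s")
print(f"fdatasync: {time_flushes(fdatasync):.4f}s")
```

Running it a few times gives a sense of the reproducibility Michael asks about; on a metadata-heavy filesystem the fdatasync numbers tend to come out lower.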
Re: pg_upgrade should truncate/remove its logs before running
On Sun, Feb 06, 2022 at 02:03:44PM +0800, Julien Rouhaud wrote:
> I didn't follow that thread closely, but if having the latest buildfarm
> client version installed is a hard requirement this will likely be a
> problem. First, there was no email to warn buildfarm owners that a new
> version is available, and even if there was I doubt that every owner would
> have updated it since. Especially since this is the lunar new year period,
> so at least 2 buildfarm owners (me included) are on holidays since last
> week.

The buildfarm will still be able to work as it did, so that's not a hard requirement per se. The only thing changing is that we would not find the logs in the event of a failure in the tests of pg_upgrade, and the buildfarm client is coded to never fail if it does not see logs in some of the paths it looks at; it just holds the full history of the paths we have used across the ages.
--
Michael
Re: Unclear problem reports
On Sun, Feb 6, 2022 at 5:28 AM Noah Misch wrote:
> > > - Canned responses are noise to every list reader other than the OP.
> > > - Canned responses make the list feel like a sales funnel rather than
> > >   a normal free software mailing list.
> >
> > All bug reports posted through the bug form get moderated before
> > they're passed through. Moderators also have access to a set of
> > pre-defined canned responses (such as the example of where they should
> > go to report pgadmin bugs). It will also catch posts made directly to
> > -bugs by people who are not subscribed, but people who are subscribed
> > will not have their posts moderated by default.
> >
> > If we're talking about people submitting bug reports, perhaps a lot
> > can be done with (1) a couple more such responses and (2) clearer
> > instructions for the moderators on when they are supposed to use them.
> >
> > Erring on the side of letting them through is probably always the best
> > choice, but if it's clear that it can be better handled by a canned
> > response, we do have an actual system for that.
>
> I was referring to David G. Johnston's proposal to send canned email
> responses when a thread (presumably containing a
> not-obviously-unreasonable PostgreSQL question or report) gets no replies
> in N days. The proposed email directed the user toward paid professional
> services. Sending such a message at moderation time would solve the noise
> problem, but it would make the "sales funnel" problem worse. (Also, I
> figure moderators won't know at moderation time which messages will get
> zero replies.) For which part of $SUBJECT did you envision using new
> moderator responses?

IMO, having a PostgreSQL wiki page describing the steps to debug or perhaps fix the most common problems/bugs is a good idea. Of course, adding the steps to the wiki and maintaining it takes some effort from the experts here.

But this is better than having canned email responses for bug reports, which require more effort than a wiki page. Today, there are a lot of outside resources listing the steps to fix/debug a Postgres-related issue. Users can always refer to the wiki page and try things out on their own; if the steps don't fix the issue, they can always reach out via the pgsql-bugs list. I also agree with the point about improving some of the most common error messages with hints on what to do next to fix the issues.

Regards, Bharath Rupireddy.
Re: pg_upgrade should truncate/remove its logs before running
Hi,

On Sun, Feb 06, 2022 at 01:36:07PM +0900, Michael Paquier wrote:
> The buildfarm client v14 has been
> released on the 29th of January, which means that we are good to go.

I didn't follow that thread closely, but if having the latest buildfarm client version installed is a hard requirement this will likely be a problem. First, there was no email to warn buildfarm owners that a new version is available, and even if there was I doubt that every owner would have updated it since. Especially since this is the lunar new year period, at least 2 buildfarm owners (me included) have been on holiday since last week.
Re: GUC flags
On Mon, Jan 31, 2022 at 04:56:45PM -0600, Justin Pryzby wrote:
> I'm not clear on what things are required/prohibited to allow/expect
> "installcheck" to pass. It's possible that postgresql.conf doesn't even
> exist in the data dir, right?

There are no written instructions AFAIK, but I have a personal rule of not breaking the tests in configurations where they worked previously.

> It's okay with me if the config_file-reading stuff isn't re-implemented.

Actually, I am thinking that we should implement it before completely retiring check_guc, but not in the fashion you are suggesting. I would be tempted to add something in the TAP tests as of src/test/misc/, where we initialize an instance to get the information about all the GUCs from SQL, and map that to the sample file located at pg_config --sharedir. I actually have in my patch set for pg_upgrade's TAP tests a perl routine that could be used for this purpose, as of the following in Cluster.pm:

+=item $node->config_data($option)
+
+Grab some data from pg_config, with $option being the command switch
+used.
+
+=cut
+
+sub config_data
+{
+	my ($self, $option) = @_;
+	local %ENV = $self->_get_env();
+
+	my ($stdout, $stderr);
+	my $result =
+	  IPC::Run::run [ $self->installed_command('pg_config'), $option ],
+	  '>', \$stdout, '2>', \$stderr
+	  or die "could not execute pg_config";
+	chomp($stdout);
+	$stdout =~ s/\r$//;
+
+	return $stdout;
+}

What do you think? (I was thinking about applying that separately anyway, to lower the load of the pg_upgrade patch a bit.)
--
Michael
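The cross-check Michael proposes — map the GUCs the server reports via SQL against the parameters mentioned in postgresql.conf.sample — can be sketched outside the TAP framework. A toy Python model, with embedded stand-in data in place of a real pg_settings query and a real sample file (the real test would locate the file via pg_config --sharedir):

```python
# Toy cross-check: which server GUCs are missing from the sample file?
import re

# Stand-in for the postgresql.conf.sample contents.
SAMPLE = """\
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#shared_buffers = 128MB
#work_mem = 4MB			# min 64kB
#max_connections = 100
"""

def sample_gucs(text):
    """Extract parameter names from conf.sample-style lines like '#name = value'."""
    names = set()
    for line in text.splitlines():
        m = re.match(r"#?([a-z_]+)\s*=", line.strip())
        if m:
            names.add(m.group(1))
    return names

# Stand-in for "SELECT name FROM pg_settings" on a running instance.
server_gucs = {"shared_buffers", "work_mem", "max_connections", "port"}

missing_from_sample = server_gucs - sample_gucs(SAMPLE)
print(missing_from_sample)   # GUCs the sample file fails to document
```

With this toy data the check flags "port", the one server GUC absent from the embedded sample; the real test would additionally need a skip-list for GUCs intentionally left out of the sample file.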
Re: pg_upgrade should truncate/remove its logs before running
On Sat, Jan 29, 2022 at 09:53:25AM +0900, Michael Paquier wrote:
> On Fri, Jan 28, 2022 at 06:27:29PM -0500, Andrew Dunstan wrote:
>> I have committed this. But it will take time to get every buildfarm owner
>> to upgrade.
>
> Thanks for that.

So, it took me some time to get back to this thread, and looked at it for the last couple of days... The buildfarm client v14 has been released on the 29th of January, which means that we are good to go.

I have found one issue while reviewing things: the creation of the new subdirectory and its contents should satisfy group permissions for the new cluster's data folder, but we were not doing that properly, as we called GetDataDirectoryCreatePerm() after make_outputdirs(), so we missed the proper values for create_mode and umask(). The rest looked fine, and I got a green CI run on my own repo. Hence, applied. I'll keep an eye on the buildfarm, in case.
--
Michael
Re: Adding CI to our tree
Hi,

On 2022-02-03 23:04:04 -0600, Justin Pryzby wrote:
> > I assume this doesn't yet work to a meaningful degree? Last time I checked
> > there were quite a few tests that needed to be invoked in a specific
> > directory.
>
> It works - tap_check() does chdir().

Ah, I thought you'd implemented a target that does it all in one prove invocation...

> It currently fails in 027_stream_regress.pl, although I keep hoping that it
> had been fixed...

That's likely because you're not setting REGRESS_OUTPUTDIR like src/test/recovery/Makefile and recoverycheck() are doing.

Greetings, Andres Freund
Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Hi,

On Sat, Feb 5, 2022 at 9:32 PM Tomas Vondra wrote:
> I'm also not claiming this is 100% worth it - queries with a suitable
> combination of clauses (conditions on the join keys) seem rather
> uncommon.

Thanks for showing interest in this. I want to add some other use cases which seem not very uncommon. a) When we join on the key of a foreign table, pushing a qual down to the foreign table would be pretty good for reducing the data transferred over the network. b) If people join many partitioned tables on the partition key, but they want to query more than one partition (which means the qual on the partition key is not a simple "partitionKey = Const"), then we have to do run-time partition pruning (losing the chance for initial partition pruning). We see a big difference in performance here as well.

I guess some of the people who think we may need this feature are not very clear about what harm it would do if we added it (including me, of course). I summarized the discussion before and hacked up a solution at [1]; the current state looks reasonable to me. I'm not sure if I missed any point.

> Of course, this breaks the estimates in the faster query, because we now
> apply the condition twice - once for the index scan, once as the join
> clause. So instead of ~100k rows the join is estimated as ~1000 rows.

I think my patch has addressed this. Here is the example:

postgres=# set geqo to off;
-- disable this feature; we get an estimation error.
-- using the geqo GUC in the patch is just for easy testing.
SET
postgres=# explain analyze SELECT t1.a, t2.a FROM t1 JOIN t2 USING (a) WHERE (t1.a > 99000) and t2.a > 99000;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.73..2408.37 rows=990 width=8) (actual time=0.032..21.350 rows=99900 loops=1)
   Merge Cond: (t1.a = t2.a)
   ->  Index Only Scan using t1_a_idx on t1  (cost=0.29..29.64 rows=991 width=4) (actual time=0.014..0.121 rows=1000 loops=1)
         Index Cond: (a > 99000)
         Heap Fetches: 0
   ->  Index Only Scan using t2_a_idx on t2  (cost=0.43..2113.20 rows=101301 width=4) (actual time=0.013..9.854 rows=99900 loops=1)
         Index Cond: (a > 99000)
         Heap Fetches: 0
 Planning Time: 0.282 ms
 Execution Time: 24.823 ms
(10 rows)

postgres=# set geqo to on;
-- enable this feature and let the planner derive the qual by itself;
-- the estimation is good.
SET
postgres=# explain analyze SELECT t1.a, t2.a FROM t1 JOIN t2 USING (a) WHERE (t1.a > 99000) ;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.73..2408.37 rows=97680 width=8) (actual time=0.031..21.296 rows=99900 loops=1)
   Merge Cond: (t1.a = t2.a)
   ->  Index Only Scan using t1_a_idx on t1  (cost=0.29..29.64 rows=991 width=4) (actual time=0.014..0.116 rows=1000 loops=1)
         Index Cond: (a > 99000)
         Heap Fetches: 0
   ->  Index Only Scan using t2_a_idx on t2  (cost=0.43..2113.20 rows=101301 width=4) (actual time=0.012..9.751 rows=99900 loops=1)
         Index Cond: (a > 99000)
         Heap Fetches: 0
 Planning Time: 0.269 ms
 Execution Time: 24.749 ms
(10 rows)

So I think knowing what harm this feature might do is the key point of the discussion now.

[1] https://www.postgresql.org/message-id/CAKU4AWpo9z0hMHDWUKuce4Z-NpcybV0J2UVu5%2BDVwyP-CrHCQg%40mail.gmail.com
--
Best Regards
Andy Fan
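For illustration, the kind of derivation under discussion — propagating an inequality across an equality join condition — can be sketched as a toy model. This mirrors the planner idea only in spirit; none of the names below are PostgreSQL code:

```python
# Toy qual derivation: columns joined with '=' form an equivalence class,
# and an inequality on any member can be propagated to the other members.

def derive_quals(equiv_classes, quals):
    """equiv_classes: list of sets of column names.
    quals: list of (column, op, const) restriction clauses.
    Returns the extra clauses implied by the equivalence classes."""
    derived = []
    for col, op, const in quals:
        for ec in equiv_classes:
            if col in ec:
                for other in sorted(ec):
                    if other != col and (other, op, const) not in quals:
                        derived.append((other, op, const))
    return derived

ecs = [{"t1.a", "t2.a"}]            # from: t1 JOIN t2 USING (a)
quals = [("t1.a", ">", 99000)]      # from: WHERE t1.a > 99000
print(derive_quals(ecs, quals))     # the qual the planner could add on t2
```

The interesting planner questions raised in the thread start after this step: whether to charge the derived qual's selectivity once or twice, and whether deriving it ever makes a plan worse.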
Re: Unclear problem reports
On Sun, Feb 06, 2022 at 12:41:52AM +0100, Magnus Hagander wrote:
> On Sun, Feb 6, 2022 at 12:02 AM Noah Misch wrote:
> > On Fri, Feb 04, 2022 at 01:36:46PM -0500, Bruce Momjian wrote:
> > > On Wed, Feb 2, 2022 at 07:21:19PM -0700, David G. Johnston wrote:
> > > > Have pg_lister queue up a check for, say, two or three days after
> > > > the bug reporting form is filled out. If the report hasn't been
> > > > responded to by someone other than the OP send out a reply that
> > > > basically says:
> > > >
> > > > We're sorry your message hasn't yet attracted a response. If your
> > > > report falls into the category of "tech support for a malfunctioning
> > > > server", and this includes error messages that are difficult or
> > > > impossible to replicate in a development environment (maybe give
> > > > some examples), you may wish to consider eliciting paid professional
> > > > help. Please see this page on our website for a directory of
> > > > companies that provide such services. The PostgreSQL Core Project
> > > > itself refrains from making recommendations since many, if not all,
> > > > of these companies contribute back to the project in order to keep
> > > > it both free and open source.
> > >
> > > Yes, that is an idea. I have canned email responses for common issues
> > > like PGAdmin questions on the bugs list, but for these cases, I don't
> > > know if someone might actually know the answer, and I don't know how
> > > long to wait for an answer. Should we be going the other way and state
> > > on the bugs submission page,
> > > https://www.postgresql.org/account/submitbug/:
> > >
> > >     If you are having a serious problem with the software and do not
> > >     receive a reply, consider additional support channels, including
> > >     professional support (https://www.postgresql.org/support/).
> >
> > https://www.postgresql.org/account/submitbug/ does say to "ensure you
> > have read" https://www.postgresql.org/docs/current/bug-reporting.html,
> > which says, "If you need help immediately, consider obtaining a
> > commercial support contract." Hence, this expands on an existing message
> > and makes it more prominent. I think that's reasonable, though I'm not
> > sure it's a net win. One could also edit the page that appears after
> > submitting a bug. Editing a web page is superior to using canned email
> > responses, for two reasons:
> >
> > - Canned responses are noise to every list reader other than the OP.
> > - Canned responses make the list feel like a sales funnel rather than a
> >   normal free software mailing list.
>
> All bug reports posted through the bug form get moderated before
> they're passed through. Moderators also have access to a set of
> pre-defined canned responses (such as the example of where they should
> go to report pgadmin bugs). It will also catch posts made directly to
> -bugs by people who are not subscribed, but people who are subscribed
> will not have their posts moderated by default.
>
> If we're talking about people submitting bug reports, perhaps a lot
> can be done with (1) a couple more such responses and (2) clearer
> instructions for the moderators on when they are supposed to use them.
>
> Erring on the side of letting them through is probably always the best
> choice, but if it's clear that it can be better handled by a canned
> response, we do have an actual system for that.

I was referring to David G. Johnston's proposal to send canned email responses when a thread (presumably containing a not-obviously-unreasonable PostgreSQL question or report) gets no replies in N days. The proposed email directed the user toward paid professional services. Sending such a message at moderation time would solve the noise problem, but it would make the "sales funnel" problem worse. (Also, I figure moderators won't know at moderation time which messages will get zero replies.) For which part of $SUBJECT did you envision using new moderator responses?
Re: Unclear problem reports
On Sun, Feb 6, 2022 at 12:02 AM Noah Misch wrote:
> On Fri, Feb 04, 2022 at 01:36:46PM -0500, Bruce Momjian wrote:
> > On Wed, Feb 2, 2022 at 07:21:19PM -0700, David G. Johnston wrote:
> > > On Wed, Feb 2, 2022 at 5:35 PM Bruce Momjian wrote:
> > > Have pg_lister queue up a check for, say, two or three days after the
> > > bug reporting form is filled out. If the report hasn't been responded
> > > to by someone other than the OP send out a reply that basically says:
> > >
> > > We're sorry your message hasn't yet attracted a response. If your
> > > report falls into the category of "tech support for a malfunctioning
> > > server", and this includes error messages that are difficult or
> > > impossible to replicate in a development environment (maybe give some
> > > examples), you may wish to consider eliciting paid professional help.
> > > Please see this page on our website for a directory of companies that
> > > provide such services. The PostgreSQL Core Project itself refrains
> > > from making recommendations since many, if not all, of these companies
> > > contribute back to the project in order to keep it both free and open
> > > source.
> >
> > Yes, that is an idea. I have canned email responses for common issues
> > like PGAdmin questions on the bugs list, but for these cases, I don't
> > know if someone might actually know the answer, and I don't know how
> > long to wait for an answer. Should we be going the other way and state
> > on the bugs submission page,
> > https://www.postgresql.org/account/submitbug/:
> >
> >     If you are having a serious problem with the software and do not
> >     receive a reply, consider additional support channels, including
> >     professional support (https://www.postgresql.org/support/).
>
> https://www.postgresql.org/account/submitbug/ does say to "ensure you have
> read" https://www.postgresql.org/docs/current/bug-reporting.html, which
> says, "If you need help immediately, consider obtaining a commercial
> support contract." Hence, this expands on an existing message and makes it
> more prominent. I think that's reasonable, though I'm not sure it's a net
> win. One could also edit the page that appears after submitting a bug.
> Editing a web page is superior to using canned email responses, for two
> reasons:
>
> - Canned responses are noise to every list reader other than the OP.
> - Canned responses make the list feel like a sales funnel rather than a
>   normal free software mailing list.

All bug reports posted through the bug form get moderated before they're passed through. Moderators also have access to a set of pre-defined canned responses (such as the example of where they should go to report pgadmin bugs). It will also catch posts made directly to -bugs by people who are not subscribed, but people who are subscribed will not have their posts moderated by default.

If we're talking about people submitting bug reports, perhaps a lot can be done with (1) a couple more such responses and (2) clearer instructions for the moderators on when they are supposed to use them.

Erring on the side of letting them through is probably always the best choice, but if it's clear that it can be better handled by a canned response, we do have an actual system for that.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Re: [BUG]Update Toast data failure in logical replication
Hi,

On 2022-02-04 17:45:36 +0530, Amit Kapila wrote:
> diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out
> index cd03e9d..a757e7d 100644
> --- a/contrib/test_decoding/expected/toast.out
> +++ b/contrib/test_decoding/expected/toast.out
> @@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot',
>  table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
>  COMMIT
>  BEGIN
> - table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109
> + table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678

Hm. This looks weird. What happened to the change to toasted_col2 that was in the "removed" line? This corresponds to the following statement, I think:

-- test update of a toasted key without changing it
UPDATE toasted_key SET toasted_col2 = toasted_col1;

which previously was inserted as:

INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES(repeat('1234567890', 200), repeat('9876543210', 200));

so toasted_col2 should have changed from NULL to repeat('9876543210', 200).

Am I misreading something?

Greetings, Andres Freund
Re: Unclear problem reports
Hi,

On 2022-02-02 19:35:36 -0500, Bruce Momjian wrote:
> The Postgres community is great at diagnosing problems and giving users
> feedback. In most cases, we can either diagnose a problem and give a
> fix, or at least give users a hint at finding the cause.
>
> However, there is a class of problems that are very hard to help with,
> and I have perhaps seen an increasing number of them recently, e.g.:
>
> https://www.postgresql.org/message-id/17384-f50f2eedf541e512%40postgresql.org
> https://www.postgresql.org/message-id/CALTnk7uOrMztNfzjNOZe3TdquAXDPD3vZKjWFWj%3D-Fv-gmROUQ%40mail.gmail.com
>
> I consider these as problems that need digging to find the cause, and
> users are usually unable to do sufficient digging, and we don't have
> time to give them instructions, so they never get a reply.
>
> Is there something we can do to improve this situation?

I think some of this can be improved measurably, with moderate effort, by improving our error messages / diagnostics. Taking e.g. the second report:

[7804] LOG:  starting PostgreSQL 13.1, compiled by Visual C++ build 1914, 64-bit
[8812] LOG:  invalid primary checkpoint record
[8812] PANIC:  could not locate a valid checkpoint record
[7804] LOG:  startup process (PID 8812) was terminated by exception 0xC409
[7804] HINT:  See C include file "ntstatus.h" for a description of the hexadecimal value.
[7804] LOG:  aborting startup due to startup process failure
[7804] LOG:  database system is shut down

We don't provide any details in this log report that allow one to diagnose the problem: "invalid primary checkpoint record" doesn't say *why* it's invalid, nor does it say the location at which the checkpoint record was, nor whether the cluster was shut down gracefully before. It could be that the permissions on the WAL files were wrong, it could be missing files, actually invalid WAL, ... The following bit about "startup process ... was terminated" and its HINT is useless information, because we actually "knowingly" caused that PANIC. Even disregarding that, it certainly doesn't help to list numerical exception codes and references to ntstatus.h.

The first report is similar, although it's a bit harder to improve:

ERROR:  missing chunk number 0 for toast value 152073604 in pg_toast_2619

Easy to fix: we don't mention the table that pg_toast_2619 corresponds to - something we can determine, rather than forcing the user to figure out how to do so. Harder to fix: we don't provide information about where the reference to the toast value comes from. In general we don't necessarily know that, because Datums are handed around fairly widely. But in a lot of cases we actually could know.

> Should we just tell them they need to hire a Postgres expert? I assume
> these are users who do not already have access to such experts.

I think these are more our fault than our users' :(

Greetings, Andres Freund
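The "easy to fix" half of that diagnostic — resolving the owning table of a TOAST relation before emitting the error, instead of leaving the catalog lookup to the user — can be illustrated with a toy model. The dict below stands in for the pg_class lookup (in a real catalog, pg_toast_2619 is the TOAST table of pg_statistic, whose OID is 2619); nothing here is server code:

```python
# Toy model of the improved message: include the owning table's name
# alongside the TOAST relation in the error text.

TOAST_OWNER = {"pg_toast_2619": "pg_statistic"}   # stand-in catalog

def missing_chunk_error(chunk, value, toast_rel):
    """Format the error as today, plus owning-table context when known."""
    owner = TOAST_OWNER.get(toast_rel)
    msg = (f"missing chunk number {chunk} for toast value {value} "
           f"in {toast_rel}")
    if owner is not None:
        msg += f" (TOAST table of relation \"{owner}\")"
    return msg

print(missing_chunk_error(0, 152073604, "pg_toast_2619"))
```

The "harder to fix" half (where the dangling toast reference came from) has no analogue here, since, as noted above, that context is often not available at the point the error is raised.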
Re: Unclear problem reports
On Fri, Feb 04, 2022 at 01:36:46PM -0500, Bruce Momjian wrote: > On Wed, Feb 2, 2022 at 07:21:19PM -0700, David G. Johnston wrote: > > On Wed, Feb 2, 2022 at 5:35 PM Bruce Momjian wrote: > > Have pg_lister queue up a check for, say, two or three days after the bug > > reporting form is filled out. If the report hasn't been responded to by > > someone other than the OP send out a reply that basically says: > > > > We're sorry your message hasn't yet attracted a response. If your report > > falls > > into the category of "tech support for a malfunctioning server", and this > > includes error messages that are difficult or impossible to replicate in a > > development environment (maybe give some examples), you may wish to consider > > eliciting paid professional help. Please see this page on our website for a > > directory of companies that provide such services. The PostgreSQL Core > > Project > > itself refrains from making recommendations since many, if not all, of these > > companies contribute back to the project in order to keep it both free and > > open > > source. > > Yes, that is an idea. I have canned email responses for common issues > like PGAdmin questions on the bugs list, but for these cases, I don't > know if someone might actually know the answer, and I don't know how > long to wait for an answer. Should we be going the other way and state > on the bugs submission page, https://www.postgresql.org/account/submitbug/: > > If you are having a serious problem with the software and do not > receive a reply, consider additional support channels, including > professional support (https://www.postgresql.org/support/). https://www.postgresql.org/account/submitbug/ does say to "ensure you have read" https://www.postgresql.org/docs/current/bug-reporting.html, which says, "If you need help immediately, consider obtaining a commercial support contract." Hence, this expands on an existing message and makes it more prominent. 
I think that's reasonable, though I'm not sure it's a net win. One could also edit the page that appears after submitting a bug. Editing a web page is superior to using canned email responses, for two reasons:

- Canned responses are noise to every list reader other than the OP.
- Canned responses make the list feel like a sales funnel rather than a normal free software mailing list.

It's safe to say $SUBJECT won't have a general solution. There will always be some vanguard of tough user experiences, each one implying a separate project. For example, Julien replied to the pgsql-bugs thread about "could not locate a valid checkpoint record" with some followup questions, but one could improve the log messages to answer those questions. Where we don't know how to improve the product enough, wiki pages (Julien suggested this) seem good.
Re: Release notes for February minor releases
Hi, On 2022-02-04 14:58:59 -0500, Tom Lane wrote: > I've pushed the first draft for $SUBJECT at > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab22eea83169c8d0eb15050ce61cbe3d7dae4de6

+Author: Andres Freund
+Branch: master [18b87b201] 2022-01-13 18:13:41 -0800
+Branch: REL_14_STABLE [dad1539ae] 2022-01-14 10:56:12 -0800
+-->
+
+ Fix corruption of HOT chains when a RECENTLY_DEAD tuple changes
+ state to fully DEAD during page pruning (Andres Freund)
+

Even if that happens, it's still pretty unlikely to cause corruption - so maybe s/corruption/chance of corruption/?

+
+ This happens when the last transaction that could see
+ the tuple ends while the page is being pruned.

The transaction doesn't need to have ended while the page is vacuumed - the horizon needs to have been "refined/updated" while the page is pruned, so that a tuple version that was first considered RECENTLY_DEAD is now considered DEAD. That can only happen if RecentXmin changed after vacuum_set_xid_limits(), which in turn can only happen if catalog snapshot invalidations and other invalidations are processed in vac_open_indexes(). Then a page involving tuples in a specific "arrangement" needs to be encountered. That's obviously too complicated for the release notes. Trying to make it more understandable, I came up with the following, which still does not seem great:

This can only happen if transactions, some having performed DDL, commit within a narrow window at the start of VACUUM. If VACUUM then prunes a page containing several tuple versions that started to be removable within the aforementioned time window, the bug may cause corruption on that page (but no further pages). A tuple that is pointed to by a redirect item elsewhere on the page can get removed. [...]

Greetings, Andres Freund
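[Editor's note] The state change Andres describes can be sketched abstractly: whether a deleted tuple version counts as RECENTLY_DEAD or fully DEAD depends on the pruning horizon, so if the horizon advances between two checks during the same prune, the classification of the same tuple flips. A toy Python model of that flip (plain integers stand in for transaction IDs and the horizon; this illustrates the state transition only, not PostgreSQL's actual C code):

```python
def classify(xmax: int, horizon: int) -> str:
    # A deleted tuple version is fully DEAD once every transaction that
    # could still see it has ended, i.e. the deleting XID precedes the
    # horizon; otherwise it is only RECENTLY_DEAD.
    return "DEAD" if xmax < horizon else "RECENTLY_DEAD"

xmax = 100           # XID of the transaction that deleted this tuple version
horizon_early = 95   # horizon computed early in VACUUM
horizon_late = 105   # horizon after RecentXmin advanced mid-prune

state_first_check = classify(xmax, horizon_early)   # "RECENTLY_DEAD"
state_second_check = classify(xmax, horizon_late)   # "DEAD": same tuple, new answer
```

The bug's precondition is exactly this disagreement: code that saw the tuple as RECENTLY_DEAD earlier in the prune can be surprised when a later check treats it as DEAD.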
Re: Release notes for February minor releases
Hi, On 2022-02-04 22:27:54 +0100, Michael Banck wrote: > On Fri, Feb 04, 2022 at 02:58:59PM -0500, Tom Lane wrote: > > I've pushed the first draft for $SUBJECT at > > > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab22eea83169c8d0eb15050ce61cbe3d7dae4de6 > > > > Please send comments/corrections by Sunday. > > > + > > + Fix corruption of HOT chains when a RECENTLY_DEAD tuple changes > > + state to fully DEAD during page pruning (Andres Freund) > > + > > + > > + > > + This happens when the last transaction that could see > > + the tuple ends while the page is being pruned. It was then possible > > + to remove a tuple that is pointed to by a redirect item elsewhere on > > + the page. While that causes no immediate problem, when the item slot > > + is re-used by some new tuple, that tuple would be thought to be part > > + of the pre-existing HOT chain, creating a form of index corruption. > > Well, ouchy. I don't think the above description is quite accurate / it makes it sound much easier to hit than it is. The time window in which the stars need to align badly is not a per-page window, but per-vacuum. And the window is very narrow. Even if that prerequisite was fulfilled, one additionally needs to encounter a pretty rare combination of tids of very specific xid "ages". > > + If this seems to have affected a table, REINDEX > > + should repair the damage. > > I don't think this is very helpful to the reader, are their indexes > corrupt or not? If we can't tell them a specific command to run to > check, can we at least mention that running amcheck would detect that > (if it actually does)? It does not do so reliably. Unfortunately heap amcheck does not verify HOT chains to any meaningful degree. Nor does btree amcheck check whether index tuples point to matching heap tuples :( > Otherwise, I guess the only way to be sure is to > just reindex every index? Or is this at least specific to b-trees? 
It's an issue on the heap side, so unfortunately it is not btree specific. Greetings, Andres Freund
Re: Synchronizing slots from primary to standby
Hi, On 2022-01-03 14:46:52 +0100, Peter Eisentraut wrote: > From ec00dc6ab8bafefc00e9b1c78ac9348b643b8a87 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Mon, 3 Jan 2022 14:43:36 +0100 > Subject: [PATCH v3] Synchronize logical replication slots from primary to > standby I've just skimmed the patch and the related threads. As far as I can tell this cannot be safely used without the conflict handling in [1], is that correct? Greetings, Andres Freund [1] https://postgr.es/m/CA%2BTgmoZd-JqNL1-R3RJ0jQRD%2B-dc94X0nPJgh%2BdwdDF0rFuE3g%40mail.gmail.com
Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
On Tue, Feb 1, 2022 at 3:10 AM Greg Stark wrote: > > So I looked at this patch and I have the same basic question as Bruce. Thanks a lot for the comments. > Do we really want to expose every binary tool associated with Postgres > as an extension? Obviously this is tempting for cloud provider users > which is not an unreasonable argument. But it does have consequences. Perhaps not every tool needs to be exposed, but given the advantages that pg_walinspect can provide, it's a good candidate for a core extension. Some of the advantages are: debugging, WAL analysis, feeding WAL stats and info to dashboards to show customers and answer their queries, RCA etc.; and, for educational purposes, one can understand the WAL structure, stats, different record types etc. Another nice thing is getting raw WAL data out of the running server (of course not all users can get it, only the allowed ones; currently users with the pg_monitor role, and if required we can change it to some other predefined role). For instance, the raw WAL data can be fed to external page repair tools to apply on a raw page (one can get this from the pageinspect extension). > 1) Some things like pg_waldump are running code that is not normally > under user control. This could have security issues or reliability > issues. I understand this, and I think the same concern applies to the pageinspect tool, which exposes getting raw page data. The pg_walinspect functions are currently accessible by the users with the pg_monitor role; if required we can change this to some other predefined role. > On that front I'm especially concerned that pg_verify_raw_wal_record() > for example would let an attacker feed arbitrary hand crafted xlog > records into the parser which is not normally something a user can do. > If they feed it something it's not expecting it might be easy to cause > a crash and server restart. This function does nothing (no writes) to the server but just checks the CRC of the WAL record. 
If one can make the server crash with some input at all, then that is a problem with the server code which needs to be fixed. But IMO this function doesn't have a concern as such. > There's also a bit of concern about data retention. Generally in > Postgres when rows are deleted there's very weak guarantees about the > data really being wiped. We definitely don't wipe it from disk of > course. And things like pageinspect could expose it long after it's > been deleted. But one might imagine after pageinspect shows it's gone > and/or after a vacuum full the data is actually purged. But then > something like pg_walinspect would make even that insufficient. The idea of pg_walinspect is to get the WAL info, data and stats out of a running postgres server; if the WAL isn't available, the functions will say so. > 2) There's no documentation. I'm guessing you hesitated to write > documentation until the interface is settled but actually sometimes > writing documentation helps expose things in the interface that look > strange when you try to explain them. I will send out the new patch set with documentation soon. > 3) And the interface does look a bit strange. Like what's the deal > with pg_get_wal_record_info_2() ? I gather it's just a SRF version of > pg_get_wal_record_info() but that's a strange name. And then what's > the point of pg_get_wal_record_info() at all? Why wouldn't the SRF be > sufficient even for the specific case of a single record? I agree, pg_get_wal_record_info_2 is a poor name. pg_get_wal_record_info_2 takes a range of LSNs (start and end) to give the WAL info, whereas pg_get_wal_record_info just takes one LSN. Maybe I will change pg_get_wal_record_info_2 to pg_get_wal_record_info_range or pg_get_wal_records_info, or some other naming if that is better? If the suggestion is to overload pg_get_wal_record_info, one with a single LSN and another with start and end LSNs, I'm okay with that too. 
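[Editor's note] On the CRC point above: verification is a pure read, i.e. recompute the checksum over the record payload and compare it with the stored value; a hand-crafted or corrupted record simply fails the comparison. A minimal Python sketch, using zlib.crc32 as a stand-in for PostgreSQL's CRC-32C (the record layout here is invented for illustration):

```python
import zlib

def verify_record(payload: bytes, stored_crc: int) -> bool:
    # Recompute the checksum over the payload and compare it with the
    # stored value -- a pure read, nothing is written anywhere.
    return zlib.crc32(payload) == stored_crc

payload = b"\x00\x01 pretend WAL record body"
stored = zlib.crc32(payload)

ok = verify_record(payload, stored)                       # True
tampered = verify_record(payload[:-1] + b"\xff", stored)  # False: CRC mismatch
```

The design point is that a bad input yields a clean False (or an error report) rather than any state change, which is why the check itself need not be a crash risk; the remaining exposure is in the record *parsing* code that runs before the CRC can be checked.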
Otherwise, I can have pg_get_wal_record_info with a start and end LSN (end LSN defaulting to NULL), returning SETOF record. > And the stats functions seem a bit out of place to me. If the SRF > returned the data in the right format the user should be able to do > aggregate queries to generate these stats easily enough. If anything a > simple SQL function to do the aggregations could be provided. > > Now this is starting to get into the realm of bikeshedding but... Some > of the code is taken straight from pg_waldump and does things like: > > + appendStringInfo(&rec_blk_ref, "blkref #%u: rel %u/%u/%u fork %s blk %u", > > But that's kind of out of place for an SQL interface. It makes it hard > to write queries since things like the relid, block number etc are in > the string. If I wanted to use these functions I would expect to be > doing something like "select all the decoded records pertaining to > block n". I will think more about this and change it in the next version of the patch set; perhaps I will add more columns to the functions. > All that said, I don't
Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
Hi, there's been an interesting case [1] of a slow query on pgsql-general, related to the topic discussed in this thread. It causes the query to run slower by multiple orders of magnitude, and I think it's interesting, so let me present a simple example demonstrating it.

create table t1 (a int);
create table t2 (a int);

insert into t1 select i from generate_series(1,100000) s(i);
insert into t2 select mod(i,100000) from generate_series(1,10000000) s(i);

create index on t1(a);
create index on t2(a);

vacuum analyze t1, t2;

-- we need to force mergejoin
set enable_nestloop = off;

Now, let's run a simple query:

SELECT t1.a, t2.a FROM t1 JOIN t2 USING (a) WHERE (t1.a > 99000) ORDER BY t1.a LIMIT 100;

                                  QUERY PLAN
 Limit  (cost=4.63..224.57 rows=100 width=8) (actual time=8999.487..8999.707 rows=100 loops=1)
   ->  Merge Join  (cost=4.63..209447.97 rows=95226 width=8) (actual time=8999.485..8999.620 rows=100 loops=1)
         Merge Cond: (t1.a = t2.a)
         ->  Index Only Scan using t1_a_idx on t1  (cost=0.29..29.25 rows=969 width=4) (actual time=0.010..0.011 rows=1 loops=1)
               Index Cond: (a > 99000)
               Heap Fetches: 0
         ->  Index Only Scan using t2_a_idx on t2  (cost=0.43..183464.09 rows=977 width=4) (actual time=0.026..4594.757 rows=9900200 loops=1)
               Heap Fetches: 0
 Planning Time: 0.338 ms
 Execution Time: 8999.768 ms
(10 rows)

Now, let's do a simple trick and add a condition on t2.a, "implied" by the join condition (t1.a = t2.a) and the inequality (t1.a > 99000). 
SELECT t1.a, t2.a FROM t1 JOIN t2 USING (a) WHERE (t1.a > 99000) AND (t2.a > 99000) ORDER BY t1.a LIMIT 100;

                                  QUERY PLAN
 Limit  (cost=0.77..250.39 rows=100 width=8) (actual time=0.040..0.294 rows=100 loops=1)
   ->  Merge Join  (cost=0.77..2297.23 rows=920 width=8) (actual time=0.039..0.172 rows=100 loops=1)
         Merge Cond: (t1.a = t2.a)
         ->  Index Only Scan using t1_a_idx on t1  (cost=0.29..29.25 rows=969 width=4) (actual time=0.031..0.031 rows=1 loops=1)
               Index Cond: (a > 99000)
               Heap Fetches: 0
         ->  Index Only Scan using t2_a_idx on t2  (cost=0.43..2014.87 rows=96596 width=4) (actual time=0.005..0.052 rows=100 loops=1)
               Index Cond: (a > 99000)
               Heap Fetches: 0
 Planning Time: 0.222 ms
 Execution Time: 0.414 ms
(11 rows)

Well, that's quite a difference! From 9000ms to 1ms, pretty good. What is happening in the first plan is the merge join needs t2 sorted by t2.a, and the index-only-scan looks like a great way to do that, as it has low startup cost (because LIMIT likes that). But this completely misses that (t1.a > 99000) implies (t2.a > 99000) through the equality in the join condition. So we start scanning t2_a_idx, only to throw the first 99% of tuples away. In the original report this is particularly egregious, because the index only scan looks like this:

 ->  Index Only Scan using data_class_pkey on data_class ta  (cost=0.57..4935483.78 rows=216964862 width=8) (actual time=0.018..35022.908 rows=151321889 loops=1)
       Heap Fetches: 151321889

So yeah, 151M heap fetches, that's bound to be expensive :-/ Adding the condition on t2.a allows just skipping the first chunk of the index, eliminating the expensive part. Of course, this breaks the estimates in the faster query, because we now apply the condition twice - once for the index scan, once as the join clause. So instead of ~100k rows the join is estimated as ~1000 rows. I'm also not claiming this is 100% worth it - queries with a suitable combination of clauses (conditions on the join keys) seem rather uncommon. 
But it seems like an interesting example, because it may be seen either as a missed execution optimization (failing to skip the initial chunk of rows), or as a costing issue due to not accounting for having to process the rows (which would likely result in picking a different plan). regards [1] https://www.postgresql.org/message-id/CA%2B1Wm9U_sP9237f7OH7O%3D-UTab71DWOO4Qc-vnC78DfsJQBCwQ%40mail.gmail.com -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
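[Editor's note] The missed inference in this message (deducing t2.a > 99000 from t1.a = t2.a plus t1.a > 99000) is a transitive derivation over equivalence classes. A minimal Python sketch of that derivation step; the column names and predicate representation are invented for illustration, and the planner's actual mechanism is its EquivalenceClass machinery, which currently derives only equalities, not inequalities:

```python
def derive_implied(equiv_classes, predicates):
    # For each inequality on a column, emit the same inequality on every
    # other member of that column's equivalence class.
    derived = []
    for col, op, val in predicates:
        for ec in equiv_classes:
            if col in ec:
                derived.extend((other, op, val) for other in ec if other != col)
    return derived

equiv = [{"t1.a", "t2.a"}]              # from the join clause t1.a = t2.a
preds = [("t1.a", ">", 99000)]          # the original WHERE clause
implied = derive_implied(equiv, preds)  # [("t2.a", ">", 99000)]
```

Note this derivation is only valid for operators that respect the equivalence (here, btree comparison on the same opfamily); applying it blindly to arbitrary operators would be incorrect, which is part of why the planner is conservative here.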
Re: [BUG]Update Toast data failure in logical replication
On 2022-Feb-05, Amit Kapila wrote: > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera wrote: > > > > I don't have a reason not to commit this patch. > > > > It is not very clear to me from this so just checking again, are you > fine with back-patching this as well? Hmm, of course; I never thought it'd be a good idea to leave the bug unfixed in back branches. Thanks! -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El sudor es la mejor cura para un pensamiento enfermo" (Bardia)
Re: row filtering for logical replication
On Fri, Feb 4, 2022 at 2:58 PM houzj.f...@fujitsu.com wrote: > > On Thursday, February 3, 2022 11:11 PM houzj.f...@fujitsu.com > > > Since the v76--clean-up-pgoutput-cache-invalidation.patch has been > committed, attach a new version patch set to make the cfbot happy. Also > addressed the above comments related to tab-complete in 0002 patch.

I don't like some of the error message changes in this new version. For example:

v75:
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: invalid publication WHERE expression for relation "testpub_rf_tbl3"
+DETAIL: User-defined operators are not allowed.

v77:
+CREATE FUNCTION testpub_rf_func1(integer, integer) RETURNS boolean AS $$ SELECT hashint4($1) > $2 $$ LANGUAGE SQL;
+CREATE OPERATOR =#> (PROCEDURE = testpub_rf_func1, LEFTARG = integer, RIGHTARG = integer);
+CREATE PUBLICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+ERROR: invalid publication WHERE expression
+LINE 1: ...ICATION testpub6 FOR TABLE testpub_rf_tbl3 WHERE (e =#> 27);
+                                                               ^
+DETAIL: User-defined or mutable functions are not allowed

I think the detailed message from v75, "DETAIL: User-defined operators are not allowed.", will be easier for users to understand. I have made some code changes and refactoring to make this behavior like the previous version, without removing the additional checks you have added in v77. I have made a few changes to comments and error messages. Attached is a top-up patch on your v77 patch series. I suggest we can combine the 0001 and 0002 patches as well. -- With Regards, Amit Kapila. v77_diff_amit.1.patch Description: Binary data
Re: Latest LLVM breaks our code again
On Fri, Feb 4, 2022 at 8:12 AM Andres Freund wrote: > On 2022-02-03 10:44:11 +0100, Fabien COELHO wrote: > > For these reasons, I'm inclined to leave seawasp as it is. It might be easier to use the nightly packages at https://apt.llvm.org/. You could update daily and still save so much CPU that ... > I find seawasp tracking the development trunk compilers useful. Just not for > --with-llvm. The latter imo *reduces* seawasp's usefulness, because once there's an API > change we can't see whether there's e.g. a compiler codegen issue leading to > crashes or whatnot. > > What I was proposing was to remove --with-llvm from seawasp, and have a > separate animal tracking the newest llvm release branch (I can run/host that > if needed). ... you could do a couple of variations like that ^ on the same budget :-) FWIW 14 just branched. Vive LLVM 15.
Re: Release notes for February minor releases
On Fri, Feb 04, 2022 at 04:35:07PM -0500, Tom Lane wrote: > Michael Banck writes: > > On Fri, Feb 04, 2022 at 02:58:59PM -0500, Tom Lane wrote: > >> + If this seems to have affected a table, REINDEX > >> + should repair the damage. > > > I don't think this is very helpful to the reader, are their indexes > > corrupt or not? If we can't tell them a specific command to run to > > check, can we at least mention that running amcheck would detect that > > (if it actually does)? Otherwise, I guess the only way to be sure is to > > just reindex every index? Or is this at least specific to b-trees? > > Yeah, I wasn't too happy with that advice either, but it seems like the > best we can do [1]. I don't think we should advise blindly reindexing > your whole installation, because it's a very low-probability bug. Right, ok. I wonder whether it makes sense to at least hint at the low probability then; I guess if you know Postgres well you can deduce from the "when the last transaction that could see the tuple ends while the page is being pruned" that it is a low-probability corner-case, but I fear lots of users will be unable to gauge the chances they got hit by this bug and just blindly assume they are affected (and/or ask around). I just woke up, so I don't have any good wording suggestions yet. Michael -- Michael Banck Teamleiter PostgreSQL-Team Projektleiter Tel.: +49 2166 9901-171 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley Our handling of personal data is subject to the following provisions: https://www.credativ.de/datenschutz