Re: pg15b2: large objects lost on upgrade

2022-08-08 Thread Tom Lane
Bruce Momjian writes: > I thought the test was setting up a configuration that would never be > used by normal servers. Is that false? *If* we made it disable autovac before starting pg_upgrade, then that would be a process not used by normal users. I don't care whether pg_upgrade disables autov

Re: pg15b2: large objects lost on upgrade

2022-08-08 Thread Bruce Momjian
On Mon, Aug 8, 2022 at 09:51:46PM -0400, Tom Lane wrote: > Bruce Momjian writes: > >> Hmmm ... now that you mention it, I see nothing in 002_pg_upgrade.pl > >> that attempts to turn off autovacuum on either the source server or > >> the destination. So one plausible theory is that autovac moved

Re: pg15b2: large objects lost on upgrade

2022-08-08 Thread Tom Lane
Bruce Momjian writes: >> Hmmm ... now that you mention it, I see nothing in 002_pg_upgrade.pl >> that attempts to turn off autovacuum on either the source server or >> the destination. So one plausible theory is that autovac moved the >> numbers since we checked. > Uh, pg_upgrade assumes autovac

Re: pg15b2: large objects lost on upgrade

2022-08-08 Thread Bruce Momjian
On Tue, Aug 2, 2022 at 03:32:05PM -0400, Tom Lane wrote: > "Jonathan S. Katz" writes: > > On 8/2/22 1:12 PM, Tom Lane wrote: > >> Sadly, we're still not out of the woods. I see three buildfarm > >> failures in this test since Robert resolved the "-X" problem [1][2][3]: > > > Looking at the test

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Peter Geoghegan
On Thu, Aug 4, 2022 at 12:31 PM Robert Haas wrote: > > What about autoanalyze? > > What about it? It has a tendency to consume an XID, here or there, quite unpredictably. I've noticed that this often involves an analyze of pg_statistic. Have you accounted for that? You said upthread that you don

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 3:10 PM Tom Lane wrote: > Robert Haas writes: > > I think the most practical alternative is to put this file back to the > > way it was before I started tinkering with it, and revisit this issue > > after the release. > > Yeah, that seems like the right thing. We are runni

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 3:23 PM Peter Geoghegan wrote: > On Thu, Aug 4, 2022 at 12:15 PM Robert Haas wrote: > > Given that the old cluster is suffering no new write > > transactions, there's probably exactly two values that are legal: one > > being the value from the old cluster, which we know, an

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Peter Geoghegan
On Thu, Aug 4, 2022 at 12:15 PM Robert Haas wrote: > Given that the old cluster is suffering no new write > transactions, there's probably exactly two values that are legal: one > being the value from the old cluster, which we know, and the other > being whatever a vacuum of that table would produ

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 12:59 PM Andres Freund wrote: > Why you think it's better to not have the test than to have a very limited > amount of fuzziness (by using the next xid as an upper limit). What's the bug > that will reliably pass the nextxid fuzzy comparison, but not an exact > comparison?

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Tom Lane
Robert Haas writes: > I think the most practical alternative is to put this file back to the > way it was before I started tinkering with it, and revisit this issue > after the release. Yeah, that seems like the right thing. We are running too low on time to have any confidence that a modified v

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 1:49 PM Tom Lane wrote: > Note that the patch you proposed at [1] will not fix anything. > It turns off autovac in the new node, but the buildfarm failures > we've seen appear to be due to autovac running on the old node. > (I believe that autovac in the new node is *also* a

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Peter Geoghegan
On Thu, Aug 4, 2022 at 11:07 AM Tom Lane wrote: > How much will that add to the test's runtime? I could get behind this > idea if it's not exorbitantly expensive. I'm not sure offhand, but I suspect it wouldn't be too bad. -- Peter Geoghegan

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Tom Lane
Peter Geoghegan writes: > Perhaps amcheck's verify_heapam() function can be used here. What > could be better than exhaustively verifying that the relfrozenxid (and > relminmxid) invariants hold for every single tuple in the table? How much will that add to the test's runtime? I could get behind

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Peter Geoghegan
On Thu, Aug 4, 2022 at 9:44 AM Robert Haas wrote: > But if you don't want to do that, and you also don't want to have > random failures, the only alternatives are weakening the check and > removing the test. It's kind of hard to say which is better, but I'm > inclined to think that if we just weak

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Tom Lane
Robert Haas writes: > IMHO it's 100% clear how to make it robust. If you want to check that > two values are the same, you can't let one of them be overwritten by > an unrelated event in the middle of the check. There are many specific > things we could do here, a few of which I proposed in my pre

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Andres Freund
Hi, On 2022-08-04 12:43:49 -0400, Robert Haas wrote: > On Thu, Aug 4, 2022 at 10:26 AM Tom Lane wrote: > > Robert Haas writes: > > > 100 << 2^32, so it's not terrible, but I'm honestly coming around to > > > the view that we ought to just nuke this test case. > > > > I'd hesitated to suggest tha

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 10:26 AM Tom Lane wrote: > Robert Haas writes: > > 100 << 2^32, so it's not terrible, but I'm honestly coming around to > > the view that we ought to just nuke this test case. > > I'd hesitated to suggest that, but I think that's a fine solution. > Especially since we can a

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Tom Lane
Robert Haas writes: > 100 << 2^32, so it's not terrible, but I'm honestly coming around to > the view that we ought to just nuke this test case. I'd hesitated to suggest that, but I think that's a fine solution. Especially since we can always put it back in later if we think of a more robust way.

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Tom Lane
"Jonathan S. Katz" writes: > On 8/3/22 4:19 PM, Tom Lane wrote: >> I like the idea of txid_current(), but we have no comparable >> function for mxid do we? While you could get both numbers from >> pg_control_checkpoint(), I doubt that's sufficiently up-to-date. > ...unless we force a checkpoint

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Thu, Aug 4, 2022 at 10:02 AM Jonathan S. Katz wrote: > Attached is the "band-aid / sloppy" version of the patch. Given from the > test examples I kept seeing deltas over 100 for relfrozenxid, I chose > 1000. The mxid delta was less, but I kept it at 1000 for consistency > (and because I hope th

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Robert Haas
On Wed, Aug 3, 2022 at 7:19 PM Tom Lane wrote: > "Jonathan S. Katz" writes: > > I did rule out wanting to do the "xid + $X" check after reviewing some > > of the output. I think that both $X could end up varying, and it really > > feels like a bandaid. > > It is that. I wouldn't feel comfortable

Re: pg15b2: large objects lost on upgrade

2022-08-04 Thread Jonathan S. Katz
On 8/3/22 4:19 PM, Tom Lane wrote: "Jonathan S. Katz" writes: I did rule out wanting to do the "xid + $X" check after reviewing some of the output. I think that both $X could end up varying, and it really feels like a bandaid. It is that. I wouldn't feel comfortable with $X less than 100 or

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Tom Lane
"Jonathan S. Katz" writes: > I did rule out wanting to do the "xid + $X" check after reviewing some > of the output. I think that both $X could end up varying, and it really > feels like a bandaid. It is that. I wouldn't feel comfortable with $X less than 100 or so, which is probably sloppy en

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Jonathan S. Katz
On 8/3/22 2:08 PM, Peter Geoghegan wrote: On Wed, Aug 3, 2022 at 1:47 PM Tom Lane wrote: Again, this seems to me to be breaking the test's real-world applicability for a (false?) sense of stability. I agree. A lot of the VACUUM test flappiness issues we've had to deal with in the past now se

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Peter Geoghegan
On Wed, Aug 3, 2022 at 1:47 PM Tom Lane wrote: > Again, this seems to me to be breaking the test's real-world applicability > for a (false?) sense of stability. I agree. A lot of the VACUUM test flappiness issues we've had to deal with in the past now seem like problems with VACUUM itself, the t

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Andres Freund
On 2022-08-03 16:46:57 -0400, Tom Lane wrote: > Robert Haas writes: > > Or we could disable autovacuum on the new cluster, which I think is a > > better solution. I like it when things match exactly; it makes me feel > > that the universe is well-ordered. > > Again, this seems to me to be breakin

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Tom Lane
Robert Haas writes: > Or we could disable autovacuum on the new cluster, which I think is a > better solution. I like it when things match exactly; it makes me feel > that the universe is well-ordered. Again, this seems to me to be breaking the test's real-world applicability for a (false?) sense

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Peter Geoghegan
On Wed, Aug 3, 2022 at 1:34 PM Tom Lane wrote: > That doesn't seem like it'd be all that thorough: we expect VACUUM > to skip pages whenever possible. I'm also a bit concerned about > the expense, though admittedly this test is ridiculously expensive > already. I bet the SKIP_PAGES_THRESHOLD stu

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Robert Haas
On Wed, Aug 3, 2022 at 4:20 PM Andres Freund wrote: > > I don't really like this approach. Imagine that the code got broken in > > such a way that relfrozenxid and relminmxid were set to a value chosen > > at random - say, the contents of 4 bytes of unallocated memory that > > contained random gar

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Tom Lane
Peter Geoghegan writes: > It couldn't hurt to do that as well, in passing (at the same time as > testing that newrelfrozenxid >= oldrelfrozenxid directly). But > deliberately running VACUUM afterwards seems like a good idea. We > really ought to expect VACUUM to catch cases where > relfrozenxid/re

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Peter Geoghegan
On Wed, Aug 3, 2022 at 1:20 PM Andres Freund wrote: > > I don't really like this approach. Imagine that the code got broken in > > such a way that relfrozenxid and relminmxid were set to a value chosen > > at random - say, the contents of 4 bytes of unallocated memory that > > contained random gar

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Andres Freund
Hi, On 2022-08-03 09:59:40 -0400, Robert Haas wrote: > On Tue, Aug 2, 2022 at 3:51 PM Tom Lane wrote: > > > The test does look helpful and it would catch regressions. Loosely > > > quoting Robert on a different point upthread, we don't want to turn off > > > the alarm just because it's spuriously

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Peter Geoghegan
On Wed, Aug 3, 2022 at 6:59 AM Robert Haas wrote: > I don't really like this approach. Imagine that the code got broken in > such a way that relfrozenxid and relminmxid were set to a value chosen > at random - say, the contents of 4 bytes of unallocated memory that > contained random garbage. Well

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Peter Geoghegan
On Tue, Aug 2, 2022 at 12:32 PM Tom Lane wrote: > Hmmm ... now that you mention it, I see nothing in 002_pg_upgrade.pl > that attempts to turn off autovacuum on either the source server or > the destination. So one plausible theory is that autovac moved the > numbers since we checked. It's very

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Robert Haas
On Wed, Aug 3, 2022 at 10:13 AM Tom Lane wrote: > If you have a different solution that you can implement by, say, > tomorrow, then go for it. But I want to see some fix in there > within about 24 hours, because 15beta3 wraps on Monday and we > will need at least a few days to see if the buildfar

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Jonathan S. Katz
> On Aug 3, 2022, at 10:14 AM, Tom Lane wrote: > > Robert Haas writes: >>> On Tue, Aug 2, 2022 at 3:51 PM Tom Lane wrote: >>> I also think that ">=" is a sufficient requirement. > >> I don't really like this approach. Imagine that the code got broken in >> such a way that relfrozenxid and r

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Tom Lane
Robert Haas writes: > On Tue, Aug 2, 2022 at 3:51 PM Tom Lane wrote: >> I also think that ">=" is a sufficient requirement. > I don't really like this approach. Imagine that the code got broken in > such a way that relfrozenxid and relminmxid were set to a value chosen > at random - say, the con

Re: pg15b2: large objects lost on upgrade

2022-08-03 Thread Robert Haas
On Tue, Aug 2, 2022 at 3:51 PM Tom Lane wrote: > > The test does look helpful and it would catch regressions. Loosely > > quoting Robert on a different point upthread, we don't want to turn off > > the alarm just because it's spuriously going off. > > I think the weakened check is OK (and possibly

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Jonathan S. Katz
On 8/2/22 4:20 PM, Jonathan S. Katz wrote: On 8/2/22 3:51 PM, Tom Lane wrote: "Jonathan S. Katz" writes: On 8/2/22 3:39 PM, Tom Lane wrote: I am not in favor of disabling autovacuum in the test: ordinary users are not going to do that while pg_upgrade'ing, so it'd make the test less represent

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Jonathan S. Katz
On 8/2/22 3:51 PM, Tom Lane wrote: "Jonathan S. Katz" writes: On 8/2/22 3:39 PM, Tom Lane wrote: I am not in favor of disabling autovacuum in the test: ordinary users are not going to do that while pg_upgrade'ing, so it'd make the test less representative of real-world usage, which seems like

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Tom Lane
"Jonathan S. Katz" writes: > On 8/2/22 3:39 PM, Tom Lane wrote: >>> I am not in favor of disabling autovacuum in the test: ordinary >>> users are not going to do that while pg_upgrade'ing, so it'd make >>> the test less representative of real-world usage, which seems like >>> a bad idea. We could

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Jonathan S. Katz
On 8/2/22 3:39 PM, Tom Lane wrote: "Jonathan S. Katz" writes: On 8/2/22 3:23 PM, Robert Haas wrote: I'm not quite sure how to rule that theory in or out, though. Without overcomplicating this, are we able to check to see if autovacuum ran during the course of the test? Looks like we're al

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Tom Lane
"Jonathan S. Katz" writes: > On 8/2/22 3:23 PM, Robert Haas wrote: >> I'm not quite sure how to rule that theory in or out, though. > Without overcomplicating this, are we able to check to see if autovacuum > ran during the course of the test? Looks like we're all thinking along the same lines.

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Tom Lane
"Jonathan S. Katz" writes: > On 8/2/22 1:12 PM, Tom Lane wrote: >> Sadly, we're still not out of the woods. I see three buildfarm >> failures in this test since Robert resolved the "-X" problem [1][2][3]: > Looking at the test code, is there anything that could have changed the > relfrozenxid o

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Jonathan S. Katz
On 8/2/22 3:23 PM, Robert Haas wrote: On Tue, Aug 2, 2022 at 1:12 PM Tom Lane wrote: Not sure what to make of this, except that maybe the test is telling us about an actual bug of exactly the kind it's designed to expose. That could be, but what would the bug be exactly? It's hard to think of

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Robert Haas
On Tue, Aug 2, 2022 at 1:12 PM Tom Lane wrote: > Not sure what to make of this, except that maybe the test is telling > us about an actual bug of exactly the kind it's designed to expose. That could be, but what would the bug be exactly? It's hard to think of a more direct way of setting relminmx

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Jonathan S. Katz
On 8/2/22 1:12 PM, Tom Lane wrote: "Jonathan S. Katz" writes: Given this appears to be resolved, I have removed this from "Open Items". Thanks! Sadly, we're still not out of the woods. I see three buildfarm failures in this test since Robert resolved the "-X" problem [1][2][3]: Not sure wha

Re: pg15b2: large objects lost on upgrade

2022-08-02 Thread Tom Lane
"Jonathan S. Katz" writes: > Given this appears to be resolved, I have removed this from "Open > Items". Thanks! Sadly, we're still not out of the woods. I see three buildfarm failures in this test since Robert resolved the "-X" problem [1][2][3]: grassquit reported [19:34:15.249](0.001s) not

Re: pg15b2: large objects lost on upgrade

2022-08-01 Thread Jonathan S. Katz
On 7/30/22 10:40 AM, Tom Lane wrote: Noah Misch writes: The pg_backend_pid is from "SELECT pg_catalog.pg_backend_pid();" in ~/.psqlrc, so the lack of -X caused that. The latest commit fixes things on a normal GNU/Linux box, so I bet it will fix wrasse. Yup, looks like we're all good now. Th

Re: pg15b2: large objects lost on upgrade

2022-07-30 Thread Tom Lane
Noah Misch writes: > The pg_backend_pid is from "SELECT pg_catalog.pg_backend_pid();" in ~/.psqlrc, > so the lack of -X caused that. The latest commit fixes things on a normal > GNU/Linux box, so I bet it will fix wrasse. Yup, looks like we're all good now. Thanks! rega

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Noah Misch
On Fri, Jul 29, 2022 at 07:16:34PM -0400, Tom Lane wrote: > Robert Haas writes: > > wrasse just failed the new test: > > > [00:09:44.167](0.001s) not ok 16 - old and new horizons match after > > pg_upgrade > > [00:09:44.167](0.001s) > > [00:09:44.167](0.000s) # Failed test 'old and new horizon

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Tom Lane
Robert Haas writes: > Here's a patch that uses a variant of that approach: it just runs > safe_psql straight up and gets the output, then writes it out to temp > files if the output doesn't match and we need to run diff. Let me know > what you think of this. That looks good to me, although obviou

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 8:02 PM Tom Lane wrote: > Personally I'd try to replace the two horizon-collection steps with > $newnode->psql calls, using extra_params to inject the '-o' and target > filename command line words. But if you want to try adding -X as > a quicker answer, maybe that will be

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Tom Lane
Robert Haas writes: > On Fri, Jul 29, 2022 at 7:16 PM Tom Lane wrote: >> I am suspicious that the problem stems from the nonstandard >> way you've invoked psql to collect the horizon data. > Well, I just copied the pg_dump block which occurs directly beforehand > and modified it. I think that mu

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 7:16 PM Tom Lane wrote: > That's not the only thing weird about this printout: there should be > three columns not two in that query's output, and what happened to > the trailing newline? I don't think we're looking at desired > output at all. Well, that's an awfully good

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Tom Lane
Robert Haas writes: > wrasse just failed the new test: > [00:09:44.167](0.001s) not ok 16 - old and new horizons match after pg_upgrade > [00:09:44.167](0.001s) > [00:09:44.167](0.000s) # Failed test 'old and new horizons match > after pg_upgrade' > # at t/002_pg_upgrade.pl line 345. > [00:09

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 5:13 PM Robert Haas wrote: > On Fri, Jul 29, 2022 at 4:00 PM Tom Lane wrote: > > Looks plausible. > > Committed. wrasse just failed the new test: [00:09:44.167](0.001s) not ok 16 - old and new horizons match after pg_upgrade [00:09:44.167](0.001s) [00:09:44.167](0.000s)

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Andrew Dunstan
On 2022-07-29 Fr 14:35, Robert Haas wrote: > On Fri, Jul 29, 2022 at 1:49 PM Tom Lane wrote: >> crake has been failing its cross-version-upgrade tests [1] since >> this went in: >> >> log files for step xversion-upgrade-REL9_4_STABLE-HEAD: >> ==~_~===-=-===~_~== >> /home/andrew/bf/root/upgrade.

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 4:00 PM Tom Lane wrote: > Looks plausible. Committed. -- Robert Haas EDB: http://www.enterprisedb.com

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Tom Lane
Robert Haas writes: > On Fri, Jul 29, 2022 at 3:10 PM Robert Haas wrote: >> However, the catalogs show the relfilenode being correct, and the >> relfrozenxid set to a larger value. I suspect the problem here is that >> this needs to be done in the other order, with the TRUNCATE first and >> the u

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 3:10 PM Robert Haas wrote: > However, the catalogs show the relfilenode being correct, and the > relfrozenxid set to a larger value. I suspect the problem here is that > this needs to be done in the other order, with the TRUNCATE first and > the update to the pg_class colum

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 2:35 PM Robert Haas wrote: > But what exactly is this test case testing? I've previously complained > about buildfarm outputs not being as labelled as well as they need to > be in order to be easily understood by, well, me anyway. It'd sure > help if the commands that led u

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Robert Haas
On Fri, Jul 29, 2022 at 1:49 PM Tom Lane wrote: > crake has been failing its cross-version-upgrade tests [1] since > this went in: > > log files for step xversion-upgrade-REL9_4_STABLE-HEAD: > ==~_~===-=-===~_~== > /home/andrew/bf/root/upgrade.crake/HEAD/REL9_4_STABLE-amcheck-1.log > ==~_~===-=-

Re: pg15b2: large objects lost on upgrade

2022-07-29 Thread Tom Lane
Robert Haas writes: > That's also my preference, so committed and back-patched to v15. crake has been failing its cross-version-upgrade tests [1] since this went in: log files for step xversion-upgrade-REL9_4_STABLE-HEAD: ==~_~===-=-===~_~== /home/andrew/bf/root/upgrade.crake/HEAD/REL9_4_STABLE

Re: pg15b2: large objects lost on upgrade

2022-07-28 Thread Robert Haas
On Tue, Jul 26, 2022 at 9:09 PM Bruce Momjian wrote: > This behavior is new in PG 15, and I would be worried to have one new > behavior in PG 15 and another one in PG 16. Therefore, I would like to > see it in PG 15 and master. That's also my preference, so committed and back-patched to v15. --

Re: pg15b2: large objects lost on upgrade

2022-07-26 Thread Bruce Momjian
On Tue, Jul 26, 2022 at 03:45:11PM -0400, Robert Haas wrote: > On Mon, Jul 18, 2022 at 2:57 PM Robert Haas wrote: > > Well, it took a while to figure out how to make that work, but I > > believe I've got it now. Attached please find a couple of patches that > > should get the job done. They might

Re: pg15b2: large objects lost on upgrade

2022-07-26 Thread Robert Haas
On Mon, Jul 18, 2022 at 2:57 PM Robert Haas wrote: > Well, it took a while to figure out how to make that work, but I > believe I've got it now. Attached please find a couple of patches that > should get the job done. They might need a bit of polish, but I think > the basic concepts are sound. So

Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Robert Haas
On Mon, Jul 18, 2022 at 4:06 PM Andres Freund wrote: > How about adding a new binary_upgrade_* helper function for this purpose > instead, instead of tying it into truncate? I considered that briefly, but it would need to do a lot of things that TRUNCATE already knows how to do, so it does not se

Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Bruce Momjian
On Mon, Jul 18, 2022 at 02:57:40PM -0400, Robert Haas wrote: > So I tried implementing this but I didn't get it quite right the first > time. It's not enough to call smgrdounlinkall() instead of > RelationDropStorage(), because just as RelationDropStorage() does not > actually drop the storage but

Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Andres Freund
Hi, On 2022-07-18 14:57:40 -0400, Robert Haas wrote: > As to whether this is a good fix, I think someone could certainly > argue otherwise. This is all a bit grotty. However, I don't find it > all that bad. As long as we're moving files from between one PG > cluster and another using an external t

Re: pg15b2: large objects lost on upgrade

2022-07-18 Thread Robert Haas
On Tue, Jul 12, 2022 at 4:51 PM Robert Haas wrote: > I have a few more ideas to try here. It occurs to me that we could fix > this more cleanly if we could get the dump itself to set the > relfilenode for pg_largeobject to the desired value. Right now, it's > just overwriting the relfilenode store

Re: pg15b2: large objects lost on upgrade

2022-07-12 Thread Michael Paquier
On Tue, Jul 12, 2022 at 04:51:44PM -0400, Robert Haas wrote: > I spent a bunch of time looking at this today and I have more sympathy > for Justin's previous proposal now. I found it somewhat hacky that he > was relying on the hard-coded value of LargeObjectRelationId and > LargeObjectLOidPNIndexId

Re: pg15b2: large objects lost on upgrade

2022-07-12 Thread Robert Haas
On Mon, Jul 11, 2022 at 9:16 AM Robert Haas wrote: > I am not saying we shouldn't try to fix this up more thoroughly, just > that I think you are overestimating the consequences. I spent a bunch of time looking at this today and I have more sympathy for Justin's previous proposal now. I found it

Re: pg15b2: large objects lost on upgrade

2022-07-11 Thread Robert Haas
On Sun, Jul 10, 2022 at 9:31 PM Michael Paquier wrote: > Hmm. That would mean that the more LOs a cluster has, the more bloat > there will be in the new cluster once the upgrade is done. That could > be quite a few gigs worth of data laying around depending on the data > inserted in the source cl

Re: pg15b2: large objects lost on upgrade

2022-07-10 Thread Michael Paquier
On Fri, Jul 08, 2022 at 10:44:07AM -0400, Robert Haas wrote: > Thanks for checking over the reasoning, and the kind words in general. Thanks for fixing the main issue. > I just committed Justin's fix for the bug, without fixing the fact > that the new cluster's original pg_largeobject files will

Re: pg15b2: large objects lost on upgrade

2022-07-08 Thread Robert Haas
On Fri, Jul 8, 2022 at 11:53 AM Justin Pryzby wrote: > pg_upgrade drops template1 and postgres before upgrading: Hmm, but I bet you could fiddle with template0. Indeed what's the difference between a user fiddling with template0 and me committing a patch that bumps catversion? If the latter doesn

Re: pg15b2: large objects lost on upgrade

2022-07-08 Thread Justin Pryzby
On Thu, Jul 07, 2022 at 03:11:38PM -0400, Robert Haas wrote: > point: we assume that nothing significant has happened between when > the cluster was created and when pg_upgrade is run, but we don't check > it. Either we shouldn't assume it, or we should check it. > > So, is such activity ever legi

Re: pg15b2: large objects lost on upgrade

2022-07-08 Thread Robert Haas
On Thu, Jul 7, 2022 at 4:16 PM Bruce Momjian wrote: > You are right to be concerned since you are spanning number spaces, but > I think you are fine because the relfilenode in the user-space cannot > have been used since it already was being used in each database. It is > true we never had a per-

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Bruce Momjian
On Tue, Jul 5, 2022 at 12:43:54PM -0400, Robert Haas wrote: > On Sat, Jul 2, 2022 at 11:49 AM Justin Pryzby wrote: > > I suppose it's like Bruce said, here. > > > > https://www.postgresql.org/message-id/20210601140949.GC22012%40momjian.us > > Well, I feel dumb. I remember reading that email back

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Robert Haas
+ /* Keep track of whether a filenode matches the OID */ + if (maps[mapnum].relfilenumber == LargeObjectRelationId) + *has_lotable = true; + if (maps[mapnum].relfilenumber == LargeObjectLOidPNIndexId) + *has_loind

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Tom Lane
Justin Pryzby writes: > On Thu, Jul 07, 2022 at 02:38:44PM -0400, Robert Haas wrote: >> On Thu, Jul 7, 2022 at 2:24 PM Bruce Momjian wrote: >>> Uh, that initdb-created pg_largeobject file should not have any data in >>> it ever, as far as I know at that point in pg_upgrade. How would values >>>

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Justin Pryzby
On Thu, Jul 07, 2022 at 02:38:44PM -0400, Robert Haas wrote: > On Thu, Jul 7, 2022 at 2:24 PM Bruce Momjian wrote: > > On Thu, Jul 7, 2022 at 01:38:44PM -0400, Robert Haas wrote: > > > On Thu, Jul 7, 2022 at 1:10 PM Justin Pryzby wrote: > > > > Maybe it's a good idea to check that the file is em

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Robert Haas
On Thu, Jul 7, 2022 at 2:24 PM Bruce Momjian wrote: > On Thu, Jul 7, 2022 at 01:38:44PM -0400, Robert Haas wrote: > > On Thu, Jul 7, 2022 at 1:10 PM Justin Pryzby wrote: > > > Maybe it's a good idea to check that the file is empty before unlinking... > > > > If we want to verify that there are n

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Bruce Momjian
On Thu, Jul 7, 2022 at 01:38:44PM -0400, Robert Haas wrote: > On Thu, Jul 7, 2022 at 1:10 PM Justin Pryzby wrote: > > Maybe it's a good idea to check that the file is empty before unlinking... > > If we want to verify that there are no large objects in the cluster, > we could do that in check_ne

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Robert Haas
On Thu, Jul 7, 2022 at 1:10 PM Justin Pryzby wrote: > Maybe it's a good idea to check that the file is empty before unlinking... If we want to verify that there are no large objects in the cluster, we could do that in check_new_cluster_is_empty(). However, even if there aren't, the length of the

Re: pg15b2: large objects lost on upgrade

2022-07-07 Thread Justin Pryzby
On Wed, Jul 06, 2022 at 08:25:04AM -0400, Robert Haas wrote: > On Wed, Jul 6, 2022 at 7:56 AM Justin Pryzby wrote: > > I'm looking into it, but it'd help to hear suggestions about where to put > > it. > > My current ideas aren't very good. > > In main() there is a comment that begins "Most failu

Re: pg15b2: large objects lost on upgrade

2022-07-06 Thread Robert Haas
On Wed, Jul 6, 2022 at 7:56 AM Justin Pryzby wrote: > I'm looking into it, but it'd help to hear suggestions about where to put it. > My current ideas aren't very good. In main() there is a comment that begins "Most failures happen in create_new_objects(), which has just completed at this point."

Re: pg15b2: large objects lost on upgrade

2022-07-06 Thread Justin Pryzby
On Tue, Jul 05, 2022 at 02:40:21PM -0400, Robert Haas wrote: > On Tue, Jul 5, 2022 at 12:56 PM Justin Pryzby wrote: > > My patch also leaves a 0 byte file around from initdb, which is harmless, > > but > > dirty. > > > > I've seen before where a bunch of 0 byte files are abandoned in an > > other

Re: pg15b2: large objects lost on upgrade

2022-07-05 Thread Robert Haas
On Tue, Jul 5, 2022 at 12:56 PM Justin Pryzby wrote: > My patch also leaves a 0 byte file around from initdb, which is harmless, but > dirty. > > I've seen before where a bunch of 0 byte files are abandoned in an > otherwise-empty tablespace, with no associated relation, and I have to "rm" > them

Re: pg15b2: large objects lost on upgrade

2022-07-05 Thread Justin Pryzby
On Tue, Jul 05, 2022 at 12:43:54PM -0400, Robert Haas wrote: > On Sat, Jul 2, 2022 at 11:49 AM Justin Pryzby wrote: > > I suppose it's like Bruce said, here. > > > > https://www.postgresql.org/message-id/20210601140949.GC22012%40momjian.us > > Well, I feel dumb. I remember reading that email back

Re: pg15b2: large objects lost on upgrade

2022-07-05 Thread Robert Haas
On Sat, Jul 2, 2022 at 11:49 AM Justin Pryzby wrote: > I suppose it's like Bruce said, here. > > https://www.postgresql.org/message-id/20210601140949.GC22012%40momjian.us Well, I feel dumb. I remember reading that email back when Bruce sent it, but it seems that it slipped out of my head between

Re: pg15b2: large objects lost on upgrade

2022-07-05 Thread Shruthi Gowda
I was able to reproduce the issue. Also, the issue does not occur with code before to preserve relfilenode commit. I tested your patch and it fixes the problem. I am currently analyzing a few things related to the issue. I will come back once my analysis is completed. On Sat, Jul 2, 2022 at 9:19 P

Re: pg15b2: large objects lost on upgrade

2022-07-02 Thread Julien Rouhaud
On Sat, Jul 02, 2022 at 08:34:04AM -0400, Robert Haas wrote: > On Fri, Jul 1, 2022 at 7:14 PM Justin Pryzby wrote: > > I noticed this during beta1, but dismissed the issue when it wasn't easily > > reproducible. Now, I saw the same problem while upgrading from beta1 to > > beta2, > > so couldn't

Re: pg15b2: large objects lost on upgrade

2022-07-02 Thread Justin Pryzby
On Sat, Jul 02, 2022 at 08:34:04AM -0400, Robert Haas wrote: > On Fri, Jul 1, 2022 at 7:14 PM Justin Pryzby wrote: > > I noticed this during beta1, but dismissed the issue when it wasn't easily > > reproducible. Now, I saw the same problem while upgrading from beta1 to > > beta2, > > so couldn't

Re: pg15b2: large objects lost on upgrade

2022-07-02 Thread Robert Haas
On Fri, Jul 1, 2022 at 7:14 PM Justin Pryzby wrote: > I noticed this during beta1, but dismissed the issue when it wasn't easily > reproducible. Now, I saw the same problem while upgrading from beta1 to > beta2, > so couldn't dismiss it. It turns out that LOs are lost if VACUUM FULL was > run.

Re: pg15b2: large objects lost on upgrade

2022-07-01 Thread Michael Paquier
On Fri, Jul 01, 2022 at 06:14:13PM -0500, Justin Pryzby wrote: > I reproduced the problem at 9a974cbcba but not its parent commit. > > commit 9a974cbcba005256a19991203583a94b4f9a21a9 > Author: Robert Haas > Date: Mon Jan 17 13:32:44 2022 -0500 > > pg_upgrade: Preserve relfilenodes and tabl

pg15b2: large objects lost on upgrade

2022-07-01 Thread Justin Pryzby
I noticed this during beta1, but dismissed the issue when it wasn't easily reproducible. Now, I saw the same problem while upgrading from beta1 to beta2, so couldn't dismiss it. It turns out that LOs are lost if VACUUM FULL was run. | /usr/pgsql-15b1/bin/initdb --no-sync -D pg15b1.dat -k | /usr/