On Thu, Apr 29, 2021 at 8:50 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapil...@gmail.com> writes:
> > This is the first test and inserts just one small record, so how it
> > can lead to spill of data. Do you mean to say that may be some
> > background process has written some transaction which leads to a spill
> > of data?
>
> autovacuum, say?
>
> > Yeah, something like this could happen. Another possibility here could
> > be that before the stats collector has processed drop and create
> > messages, we have enquired about the stats which lead to it giving us
> > the old stats. Note, that we don't wait for 'drop' or 'create' message
> > to be delivered. So, there is a possibility of the same. What do you
> > think?
>
> You should take a close look at the stats test in the main regression
> tests. We had to jump through *high* hoops to get that to be stable,
> and yet it still fails semi-regularly. This looks like pretty much the
> same thing, and so I'm pessimistically inclined to guess that it will
> never be entirely stable.
>

True, it may not be possible to make it entirely stable, but I would
like to try some more before giving up on this. Otherwise, I guess the
other possibility is to remove some of the latest tests added, or
probably change them to be more forgiving. For example, we can change
the currently failing test to not check the 'spill*' counts and rely on
just the 'total*' counts, which will work even in the scenarios we
discussed for this failure, but it will reduce the
efficiency/completeness of the test case.

> (At least not before the fabled stats collector rewrite, which may well
> introduce some entirely new set of failure modes.)
>
> Do we really need this test in this form? Perhaps it could be converted
> to a TAP test that's a bit more forgiving.
>

We have a TAP test for slot stats, but there we are checking some
scenarios across a restart. We can surely move these tests there as
well, but it is not apparent to me how that would make a difference.

--
With Regards,
Amit Kapila.