On Mon, Jan 28, 2019 at 4:40 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Mon, Jan 28, 2019 at 10:03 AM John Naylor
> <john.nay...@2ndquadrant.com> wrote:
> >
> > On Mon, Jan 28, 2019 at 4:53 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > > There are a few buildfarm failures due to this commit, see my email on
> > > pgsql-committers. If you have time, you can also once look into
> > > those.
> >
> > I didn't see anything in common with the configs of the failed
> > members. None have a non-default BLCKSZ that I can see.
> >
>
> I have done an analysis of the different failures on buildfarm.
>
>
> 2.
> @@ -15,13 +15,9 @@
>  SELECT octet_length(get_raw_page('test_rel_forks', 'main', 100)) AS main_100;
>  ERROR: block number 100 is out of range for relation "test_rel_forks"
>  SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 0)) AS fsm_0;
> - fsm_0
> --------
> - 8192
> -(1 row)
> -
> +ERROR: could not open file "base/50769/50798_fsm": No such file or directory
>  SELECT octet_length(get_raw_page('test_rel_forks', 'fsm', 10)) AS fsm_10;
> -ERROR: block number 10 is out of range for relation "test_rel_forks"
> +ERROR: could not open file "base/50769/50798_fsm": No such file or directory
>
> This indicates that even though the Vacuum is executed, but the FSM
> doesn't get created. This could be due to different BLCKSZ, but the
> failed machines don't seem to have a non-default value of it. I am
> not sure why this could happen, maybe we need to check once in the
> failed regression database to see the size of relation?
>

This symptom shows up on the following buildfarm critters:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2019-01-28%2005%3A05%3A22
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2019-01-28%2003%3A20%3A02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=locust&dt=2019-01-28%2003%3A13%3A47
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2019-01-28%2003%3A07%3A39

All of these seem to run with fsync=off. Is it possible that vacuum
has updated the FSM, but the update has not been synced to disk, so
that when we try to read it we don't get the required page? This is
just a guess.

I have checked all the buildfarm failures and I see only 4 symptoms,
for which I have sent some initial analysis. It would be good if you
could cross-verify it as well.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com