On 11/19/22 04:10, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
>> On 2022-11-18 15:55:34 -0500, Tom Lane wrote:
>>> We realized today [1] that it's been some time since the buildfarm
>>> had any debug_discard_caches (nee CLOBBER_CACHE_ALWAYS) coverage.
> 
>> Do we know when it was covered last? I assume it's before the addition of
>> test_oat_hooks in 90efa2f5565?
> 
> As far as that goes: some digging in the buildfarm DB says that avocet
> last did a CCA run on 2021-10-22 and trilobite on 2021-10-24.  They
> were then offline completely until 2022-02-10, and when they restarted
> the runtimes were way too short to be CCA tests.
> 

Yeah. I'll try setting up better monitoring / alerting to notice
issues like this more promptly ... it's a bit tough, because IIRC the
gap 2021-10-22 - 2022-02-10 was due to the tests running, but getting
stuck for some reason. So it's not like the machine was off.

I wonder if it'd make sense to have some simple & optional alerting
based on how long ago the machine reported the last result. Sending an
e-mail if there was no report for a month or so would be enough.
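
A minimal sketch of such a check, assuming the last-report timestamps are
already available as a mapping from animal name to datetime. How they would
be fetched from the buildfarm is not shown, and the function name, the
sample data and the 30-day threshold are all illustrative, not an existing
buildfarm feature:

```python
from datetime import datetime, timedelta


def stale_animals(last_report, now, max_age=timedelta(days=30)):
    """Return the sorted names of animals whose last report is older than max_age."""
    return sorted(
        name for name, seen in last_report.items() if now - seen > max_age
    )


# Illustrative data only; a real check would read these from wherever
# the last-report times are recorded.
last_report = {
    "avocet": datetime(2021, 10, 22),
    "trilobite": datetime(2021, 10, 24),
    "example_animal": datetime(2021, 12, 20),  # hypothetical, recently seen
}

now = datetime(2022, 1, 1)
stale = stale_animals(last_report, now)
for name in stale:
    # A real setup would send an e-mail here instead of printing.
    print(f"ALERT: no report from {name} in over 30 days")
```

Because the check only looks at the time of the last report, it would also
catch the "running but stuck" case, not just a machine that is powered off.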

> Seems like maybe we need a little more redundancy in this bunch of
> buildfarm animals.
> 

It's actually a bit worse than that, because both animals are on the
same machine. So if avocet gets "stuck", trilobite is stuck too.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

