Tom Lane wrote:
BTW, before I forget, this little project turned up a couple of
small improvements for the current buildfarm infrastructure:

1.  There are half a dozen entries with obviously bogus timestamps:

bfarm=# select sysname,snapshot,branch from mfailures where snapshot < '2004-01-01';
  sysname   |      snapshot       | branch
------------+---------------------+--------
 corgi      | 1997-10-14 14:20:10 | HEAD
 kookaburra | 1970-01-01 01:23:00 | HEAD
 corgi      | 1997-09-30 11:47:08 | HEAD
 corgi      | 1997-10-17 14:20:11 | HEAD
 corgi      | 1997-12-21 15:20:11 | HEAD
 corgi      | 1997-10-15 14:20:10 | HEAD
 corgi      | 1997-09-28 11:47:09 | HEAD
 corgi      | 1997-09-28 11:47:08 | HEAD
(8 rows)

indicating wrong system clock settings on these buildfarm machines.
(Indeed, IIRC these failures were actually caused by the ridiculous
clock settings --- we have at least one regression test that checks
century >= 21 ...)  Perhaps the buildfarm server should bounce
reports with timestamps more than a day in the past or a few minutes in
the future.  I think though that a more useful answer would be to
include "time of receipt of report" in the permanent record, and then
subsequent analysis could make its own decisions about whether to
believe the snapshot timestamp --- plus we could track elapsed times for
builds, which could be interesting in itself.


We actually do timestamp the reports - I just didn't include that in the extract. I will alter the view it's based on. We started doing this in Nov 2005, so I'm going to restrict the view to cases where the report_time is not null - I doubt we're interested in ancient history.

A revised extract is available at http://www.pgbuildfarm.org/mfailures2.dump

We already reject snapshot times that are in the future.

Use of NTP is highly recommended for buildfarm members, but I'm reluctant to make it mandatory, as they might not have it available. I think we can do this: alter the client script to report its idea of the current time when it makes the web transaction. If that is off from the server time by more than some small value (say 60 seconds), adjust the snapshot time accordingly. If the client doesn't report it, then we can reject insane dates (more than 24 hours ago seems about right).
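To make the proposal concrete, here is a rough sketch of the server-side check in Python. It is only an illustration under assumptions: the real buildfarm scripts are not written this way, and the names normalize_snapshot, MAX_SKEW and MAX_AGE are made up for this example. The 60 second and 24 hour thresholds are the ones suggested above.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, taken from the figures suggested in this message.
MAX_SKEW = timedelta(seconds=60)  # tolerated client/server clock difference
MAX_AGE = timedelta(hours=24)     # oldest snapshot accepted without a client clock


def normalize_snapshot(snapshot, server_now, client_now=None):
    """Return the snapshot time to store, or None if the report should be rejected.

    snapshot   -- snapshot timestamp as reported by the client
    server_now -- server clock at time of receipt
    client_now -- client's idea of the current time, if it reported one
    """
    if client_now is not None:
        skew = server_now - client_now
        if abs(skew) > MAX_SKEW:
            # Client clock is off: shift the snapshot by the same amount.
            snapshot = snapshot + skew
    if snapshot > server_now:
        # Snapshot times in the future are rejected.
        return None
    if client_now is None and server_now - snapshot > MAX_AGE:
        # No client clock to correct against, and the date looks insane.
        return None
    return snapshot
```

With this shape, a member whose clock is stuck in 1997 but which reports its current time still gets a usable snapshot time, while an old client that reports no time is simply bounced.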

So I agree with both your suggestions ;-)



2. I was annoyed repeatedly that some buildfarm members weren't
reporting log_archive_filenames entries, which forced going the long
way round in the process I was using.  Seems like we need some more
proactive means for getting buildfarm owners to keep their script
versions up-to-date.  Not sure what that should look like exactly,
as long as it's not "you can run an ancient version as long as you
please".

                        

Modern clients report the versions of the two scripts involved (see script_version and web_script_version in the reported config), so we could easily enforce a minimum version on these.
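A minimum-version check could look roughly like the Python sketch below. Only the config keys script_version and web_script_version come from the reported config; the minimum values, the function names, and the assumption that versions are dotted numbers are all hypothetical.

```python
# Hypothetical minimums; the real values would be chosen by the server admin.
MIN_VERSIONS = {
    "script_version": "1.50",
    "web_script_version": "1.10",
}


def parse_version(text):
    """Turn a dotted numeric version such as '1.71' into a comparable tuple.

    Assumes purely numeric, dot-separated components.
    """
    return tuple(int(part) for part in text.split("."))


def outdated_scripts(config, minimums=MIN_VERSIONS):
    """Return the config keys that are missing or below the required minimum."""
    stale = []
    for key, minimum in minimums.items():
        reported = config.get(key)
        if reported is None or parse_version(reported) < parse_version(minimum):
            stale.append(key)
    return stale
```

A report for which outdated_scripts returns a non-empty list could be rejected with a message asking the owner to upgrade. Comparing parsed tuples rather than strings matters here, since "1.9" would otherwise sort after "1.10".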

cheers

andrew


