On 16 June 2015 at 19:51, Antony Antony <ant...@phenome.org> wrote:
> On Tue, Jun 16, 2015 at 04:38:59PM -0400, Andrew Cagney wrote:
>> I suspect that the algorithm is something like:
>
> may be small difference. directory is set outside the loop.
>
>   runoutput = archive-dir/ + "%Y-%m-%d" + nodename + "make showversion"
>   for try in 1..5:
>     for test in tests:
>       if no runoutput or not archive passed
>         delete OUTPUT
>         run test
>         copy OUTPUT to runoutput/
>
> If this works as expected a date change should not cause re-run when running
> from a simple "make check"
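For reference, the quoted pseudocode can be sketched in Python roughly as follows. This is only a sketch under stated assumptions, not swantest's actual code: the names archive_passed, run_with_retries and run_test are hypothetical, and it assumes that a passing test leaves a RESULT file whose JSON "result" field is "passed" (the thread only shows an "abort" example). The point it illustrates is that runoutput is computed once by the caller, outside the loop.

```python
import json
import shutil
from pathlib import Path


def archive_passed(archived_output: Path) -> bool:
    """True if an archived RESULT file exists and records a pass.

    Assumption: RESULT holds one JSON object whose "result" field is
    "passed" on success.
    """
    result_file = archived_output / "RESULT"
    if not result_file.is_file():
        return False
    try:
        return json.loads(result_file.read_text()).get("result") == "passed"
    except (ValueError, AttributeError):
        return False


def run_with_retries(tests, runoutput: Path, run_test, tries: int = 5):
    """Re-run every test that has no passing archived result.

    runoutput is fixed before the loop starts, so a date change partway
    through a run cannot switch archive directories.
    Returns how many times each test was run.
    """
    runs = {Path(t).name: 0 for t in tests}
    for _ in range(tries):
        for test in map(Path, tests):
            archived = runoutput / test.name / "OUTPUT"
            if archive_passed(archived):
                continue
            output = test / "OUTPUT"
            shutil.rmtree(output, ignore_errors=True)    # delete OUTPUT
            run_test(test)                               # run test
            if output.is_dir():
                shutil.rmtree(archived, ignore_errors=True)
                shutil.copytree(output, archived)        # copy OUTPUT to runoutput/
            runs[test.name] += 1
    return runs
```

With this shape, a test that passes on its first attempt is run once, while a test that never passes is run on every one of the five tries.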
Here's the first bit of snipped code:

    tried = 0
    output_dir = ''
    # if there is TESTLIST run in batch mode.
    (output_dir, testlist, ran_tests) = do_test_list(
        args, start, tried, output_dir)
    if testlist:
        while (tried <= args.retry):
            (output_dir, testlist, ran_tests) = do_test_list(
                args, start, tried, output_dir)
            tried = 1 + tried

Notice how "tried" is only updated after the second call to
do_test_list.  The latter function has:

    odir = output_dir
    if not tried:
        odir = setup_result_dir(args, True)
    ...
    return ..., odir, ...

so I suspect that the first two calls to do_test_list both compute
output_dir.  To be honest, though, I've found swantest too hairy to
contemplate changing/fixing.

> the loop is to help errors seen on Hugh's server, specifically KVM errors.

Do you still see those?  It might be worth getting some stats on it.
I don't see any; but then I'm not running stock swantest.  I find that
when a test fails, it tends to keep failing.  Running it 5 times
doesn't change that.  For what looks like an intermittent failure, I
just re-run it later with the machine rebooted.

>> So failing tests are always attempted multiple times.  It is just
>> that, when the time changes, the archive directory also changes
>> causing previously successful tests to also be re-attempted.
>>
>> Passing:
>>   --retry 0
>> to swantest might help.
>
> in the past, Fedora 20 and KVM guests on Hugh's server showed a lot of reboot
> issues and the retry helped then and retry didn't harm me:) I don't know if
> it is still needed. It seems things have improved.

>> I'm beginning to think that the best default behaviour might be to
>> only attempt tests with no OUTPUT directory. It means:
>
> why not read the json ? JSON files have more info; it will note if a failure
> is due to a KVM err. If it is a KVM and P9 error, retrying proved to be useful.
>
> for e.g I see 18 instances of KVM errors over a year.
> find /home/build/results/ -iname "RESULT" | xargs grep KVMERROR |wc -l
> 18
> The last time I saw this KVMERROR was in Sept 2014. May be Fedora + swantest
> improvements fixed these issues.
>
> 2014-09-05-swantest.libreswan.fi-v3.10-52-g0a4ca86-hugh-2014aug/basic-pluto-04/OUTPUT/RESULT:{"node":"swantest.libreswan.fi","error":"KVMERROR not able to shutdown 1 guests: [north] abort","testname":"basic-pluto-04","epoch":1409929425.67,"result":"abort","time":"2014-09-05 18:03","runtime":65.25}
>
>> - tests are only attempted once (so intermittent failures are not hidden)
>> - re-running a test is easy - delete OUTPUT
>
> I feel on a dedicated machine, where it continuously runs "make check",
> retries won't harm.

Unfortunately on a local desktop, where we're more worried about
turnaround, it makes things slower than they need be :-(

> One mystery is why Hugh's machines take so long, while the 3 servers I run
> seem to finish "make check" in 5-7 hours.

While my tests take 6 hours on an i5, I:

- run tests once
- skip WIP tests

This constrains things to the point that they'll run overnight.

> This page shows the data from Dec 2014. Jun - Dec my recollection is 5-6
> hours.
> http://hal.phenome.nl:8081/results/
>
> -antony
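As a footnote to the "why not read the json" suggestion above, here is a minimal Python sketch of retrying only when RESULT records a KVM error, plus a rough equivalent of the find/grep count. The function names should_retry and count_kvm_errors are hypothetical and not part of swantest. It assumes each RESULT file holds a single JSON object shaped like the example quoted in this thread, and it only recognises the KVMERROR marker, since the thread does not show what a P9 error looks like.

```python
import json
from pathlib import Path


def should_retry(result_file: Path) -> bool:
    """True only when RESULT records a KVM error worth retrying.

    Assumption: RESULT is one JSON object with an "error" string that
    starts with or contains "KVMERROR", as in the quoted example.
    """
    try:
        result = json.loads(result_file.read_text())
    except (OSError, ValueError):
        return False  # missing or unparsable RESULT: do not guess
    return isinstance(result, dict) and "KVMERROR" in str(result.get("error", ""))


def count_kvm_errors(results_root: Path) -> int:
    """Count RESULT files under results_root that record a KVMERROR.

    Roughly what the find | xargs grep | wc -l pipeline reports, except
    that it counts files rather than matching lines and matches the
    file name case-sensitively.
    """
    return sum(1 for f in results_root.rglob("RESULT") if should_retry(f))
```

An ordinary failure (no KVMERROR in the "error" field) returns False, so under this scheme it would be attempted once and an intermittent failure would stay visible.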