On Mon, May 19, 2014 at 07:47:09PM +0100, Milosz Wasilewski wrote:
> Hi,
>
> I'm trying to submit job for TC2 now and I'm in the long queue. There
> seem to be a few multinode Android jobs that run on dummy-ssh and
> vexpress-tc2 (workload automation). We only have one dummy-ssh device
> so there is no way that more than one TC2 is going to be used with
> dummy-ssh at the same time. On top of that we have
> vexpress-tc2-benchmark which also can run multinode jobs with
> dummy-ssh. For some reason if there are couple of multinode jobs
> requested for dummy-ssh + vexpress-tc2, the TC2 boards get reserved
> and there is no way to submit any other jobs there. While I understand
> that 1 board might be in reserved state, there is no point to reserve
> all 3 (there is only one dummy-ssh). IMHO this is a bug in multinode.

This is a known issue. The only way we found to keep multinode jobs
from starving forever while waiting for devices is to reserve their
devices as they become available, instead of waiting for a moment when
all of their requested devices happen to be available simultaneously.
We have not figured out a way to keep multinode jobs from deadlocking
that wouldn't involve a far more complicated mechanism.

> Current status is:
>
> dummy-ssh: 7 jobs in the queue
> vexpress-tc2: 3 reserved + 3 jobs in the queue
>
> I know that proper solution should be moving with WA to dynamically
> allocated VMs, but unfortunately licensing is in the way.

Actually, I am working right now on a patch to allow multiple dummy-ssh
devices on the same host, which might solve this specific problem
(assuming WA licensing allows multiple simultaneous uses within the
same host).

-- 
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org
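
To make the trade-off described above concrete, here is a minimal
sketch of the two reservation policies. It is not the actual LAVA
scheduler code: the function names, the job names and the data layout
are all hypothetical, and the device counts (one dummy-ssh, three
vexpress-tc2) are taken from the situation reported in this thread.

```python
# Toy model only; not LAVA's scheduler. All names here are hypothetical.

def reserve_as_available(jobs, free):
    """Greedy policy: in queue order, each job grabs whatever it still needs."""
    free = dict(free)
    reserved = {}
    for name, needs in jobs:
        held = {}
        for dev_type, count in needs.items():
            take = min(count, free.get(dev_type, 0))
            free[dev_type] = free.get(dev_type, 0) - take
            held[dev_type] = take
        reserved[name] = held
    return reserved, free


def reserve_all_or_nothing(jobs, free):
    """Strict policy: a job reserves only if every device it needs is free now."""
    free = dict(free)
    reserved = {}
    for name, needs in jobs:
        if all(free.get(t, 0) >= n for t, n in needs.items()):
            for t, n in needs.items():
                free[t] -= n
            reserved[name] = dict(needs)
        else:
            reserved[name] = {}
    return reserved, free


# Three queued multinode jobs, each needing one dummy-ssh and one TC2.
jobs = [(f"multinode-{i}", {"dummy-ssh": 1, "vexpress-tc2": 1}) for i in range(3)]
pool = {"dummy-ssh": 1, "vexpress-tc2": 3}

greedy, greedy_free = reserve_as_available(jobs, pool)
strict, strict_free = reserve_all_or_nothing(jobs, pool)

print(greedy_free)  # {'dummy-ssh': 0, 'vexpress-tc2': 0}
print(strict_free)  # {'dummy-ssh': 0, 'vexpress-tc2': 2}
```

With the greedy policy all three TC2 boards end up reserved although
only one job holds the single dummy-ssh, which matches the status
reported above. The strict policy leaves two boards free, but a
multinode job then only starts if all of its devices happen to be free
at the same moment, so it can starve behind single-node jobs.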
_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
