On Mon, May 19, 2014 at 07:47:09PM +0100, Milosz Wasilewski wrote:
> Hi,
> 
> I'm trying to submit job for TC2 now and I'm in the long queue. There
> seem to be a few multinode Android jobs that run on dummy-ssh and
> vexpress-tc2 (workload automation). We only have one dummy-ssh device
> so there is no way that more than one TC2 is going to be used with
> dummy-ssh at the same time. On top of that we have
> vexpress-tc2-benchmark which also can run multinode jobs with
> dummy-ssh. For some reason if there are couple of multinode jobs
> requested for dummy-ssh + vexpress-tc2, the TC2 boards get reserved
> and there is no way to submit any other jobs there. While I understand
> that 1 board might be in reserved state, there is no point to reserve
> all 3 (there is only one dummy-ssh). IMHO this is a bug in multinode.

This is a known issue. The only way we found to keep multinode jobs
from starving while they wait for devices is to reserve each device as
it becomes available, instead of waiting for a moment when all of the
requested devices are available simultaneously.

We have not figured out a way to keep multinode jobs from deadlocking
that wouldn't involve a far more complicated mechanism.
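
To illustrate the trade-off, here is a toy model in Python. It is only a
sketch and not the actual LAVA scheduler code: the schedule() function,
the job names and the device counts are all hypothetical, and the single
dummy-ssh device is assumed to be busy with another job already.

```python
def schedule(free, jobs, incremental):
    """Toy scheduler pass over a queue of jobs (hypothetical model).

    free:        dict mapping device type -> number of idle devices
    jobs:        list of (name, needs), needs maps device type -> count
    incremental: if True, a job that cannot start still reserves
                 whatever idle devices it needs; if False, it takes
                 nothing until its whole set is idle at once.
    """
    free = dict(free)          # work on a copy
    reserved = {}              # devices held by jobs that cannot start
    running, waiting = [], []
    for name, needs in jobs:
        can_start = all(free.get(t, 0) >= n for t, n in needs.items())
        if can_start:
            for t, n in needs.items():
                free[t] -= n
            running.append(name)
            continue
        waiting.append(name)
        if incremental:
            # Grab what is idle right now and hold it.
            for t, n in needs.items():
                take = min(n, free.get(t, 0))
                free[t] = free.get(t, 0) - take
                reserved[t] = reserved.get(t, 0) + take
    return running, waiting, reserved


# Assumed state: the only dummy-ssh is busy, three TC2 boards are idle.
FREE = {"dummy-ssh": 0, "vexpress-tc2": 3}
JOBS = [
    ("multinode-1", {"dummy-ssh": 1, "vexpress-tc2": 1}),
    ("multinode-2", {"dummy-ssh": 1, "vexpress-tc2": 1}),
    ("multinode-3", {"dummy-ssh": 1, "vexpress-tc2": 1}),
    ("single-tc2", {"vexpress-tc2": 1}),
]

print(schedule(FREE, JOBS, incremental=True))
print(schedule(FREE, JOBS, incremental=False))
```

With incremental reservation, the three multinode jobs hold all three
TC2 boards while they wait for dummy-ssh, so the single-node TC2 job
cannot start. With the all-or-nothing policy, the single-node job runs,
but nothing guarantees that a multinode job ever sees its whole device
set idle at the same moment, which is the starvation case described
above.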

> Current status is:
> 
> dummy-ssh: 7 jobs in the queue
> vexpress-tc2: 3 reserved + 3 jobs in the queue
> 
> I know that proper solution should be moving with WA to dynamically
> allocated VMs, but unfortunately licensing is in the way.

Actually, I am working right now on a patch to allow multiple dummy-ssh
devices on the same host, which might solve this specific problem
(assuming WA licensing allows multiple simultaneous uses within the same
host).

-- 
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

_______________________________________________
linaro-validation mailing list
[email protected]
http://lists.linaro.org/mailman/listinfo/linaro-validation
