Re: What should happen when a distributed agent dies?

Brett Porter Wed, 30 Sep 2009 00:36:09 -0700

So I went through the cluster of issues you opened and put them in 1.3.5, then realised that we had already kind of settled on the list of things for 1.3.5 :)


Should we:
1) keep them where they are?
2) push them to 1.3.6?
3) push them to 1.4.0?
4) or are they already addressed by Marica's other fix?


Cheers,
Brett

On 29/09/2009, at 3:03 AM, Wendy Smoak wrote:

I've been working with Distributed Builds lately, and I've found that
it works if everything is perfect, but if something goes wrong it has
a hard time coping with the problem, and it doesn't recover.

For example, it's a given that at some point, an agent is going to die
without being properly removed first.

Currently if this happens, the Queues page breaks (error/stack trace)
and you can't edit or delete the offending agent to disable or get rid
of it.

The agent is also still shown as 'enabled' on the Distributed Agents
page even though it's not responding.

What should happen in this case?

I'm all for having the system automatically disable any agent that is
not behaving properly.  At first, the admin may have to manually
re-enable it.  In the future we might come up with a way for it to
auto-recover.
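
To make the proposal concrete, here is a rough sketch of the
auto-disable behaviour in Java. It is only an illustration: the names
(AgentRegistry, recordPing, reEnable, MAX_FAILURES) and the
three-failures threshold are assumptions, not taken from the existing
code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names come from the real code base.
final class AgentRegistry {
    // Assumed policy: disable an agent after this many consecutive failed pings.
    private static final int MAX_FAILURES = 3;

    private final Map<String, Integer> failures = new ConcurrentHashMap<>();
    private final Map<String, Boolean> enabled = new ConcurrentHashMap<>();

    // Add a new agent; it starts out enabled with no recorded failures.
    void register(String agentUrl) {
        enabled.put(agentUrl, true);
        failures.put(agentUrl, 0);
    }

    // Record the outcome of one ping attempt against a known agent.
    void recordPing(String agentUrl, boolean responded) {
        if (!enabled.containsKey(agentUrl)) {
            return; // unknown agent, nothing to track
        }
        if (responded) {
            // A good ping clears the failure count but does NOT re-enable.
            failures.put(agentUrl, 0);
            return;
        }
        int count = failures.merge(agentUrl, 1, Integer::sum);
        if (count >= MAX_FAILURES) {
            // Auto-disable. The agent stays registered, so it can still
            // be listed, edited, or deleted by the admin.
            enabled.put(agentUrl, false);
        }
    }

    // Manual re-enable by the admin; also resets the failure count.
    void reEnable(String agentUrl) {
        if (enabled.containsKey(agentUrl)) {
            enabled.put(agentUrl, true);
            failures.put(agentUrl, 0);
        }
    }

    boolean isEnabled(String agentUrl) {
        return enabled.getOrDefault(agentUrl, false);
    }
}
```

In this sketch a disabled agent that starts answering again stays
disabled until someone calls reEnable, which matches the "admin
re-enables it manually at first" idea. Auto-recovery could later be
added by flipping the flag back on a successful ping.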

Thoughts?

--
Wendy
