i haven't actually had to do this yet, but if i understand the systemd
socket-activation concept correctly, it may be a useful building block
for putting something like this together (systemd holds the listening
socket and starts the service when another service first connects to it
over the network).
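
for what it's worth, here is a rough, untested-in-production sketch of the
service side in python. systemd hands over the listening socket as an
inherited file descriptor (numbering starts at 3) and sets LISTEN_PID and
LISTEN_FDS in the environment. the function names and the fallback port
9000 are made up for illustration, and you would still need a matching
.socket unit (with ListenStream=) next to the .service unit.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_fds(environ=os.environ, pid=None):
    """Return the descriptor numbers systemd passed in, or [] if none."""
    pid = os.getpid() if pid is None else pid
    # LISTEN_PID must match our own pid; otherwise the variables were
    # meant for some other process and we ignore them.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def get_listener(port=9000):
    """Use the systemd-provided socket if present, else bind our own.

    The port is a hypothetical default for running outside systemd.
    """
    fds = inherited_fds()
    if fds:
        # Wrap the already-listening descriptor; family and type are
        # detected from the descriptor itself.
        return socket.socket(fileno=fds[0])
    return socket.create_server(("127.0.0.1", port))
```

the fallback branch just makes it possible to run the same program by hand
while testing, without systemd in the picture.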

thankfully we don't have any super order-dependent services like that (they
just enter a retry loop until their dependencies are available), but i'm
tempted to try it out now :)
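
the retry loop is roughly this shape (a minimal sketch, not our actual
code; the function name, the defaults, and the 5 second connect timeout
are all assumptions picked for illustration):

```python
import socket
import time


def wait_for_dependency(host, port, attempts=10, delay=1.0, max_delay=30.0,
                        connect=socket.create_connection, sleep=time.sleep):
    """Retry a TCP connection with capped exponential backoff.

    Returns the connected socket, or re-raises the last OSError once
    `attempts` tries have failed. `connect` and `sleep` are injectable
    so the loop can be exercised without a real network.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect((host, port), timeout=5)
        except OSError:
            if attempt == attempts:
                raise  # give up and let the caller (or init system) decide
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to the cap
```

giving up after a bounded number of attempts lets whatever supervises the
process see the failure instead of having it spin forever.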



On Thu, Jul 9, 2015 at 11:57 PM Martin A. Brown <mar...@linux-ip.net> wrote:

>
> Hello there,
>
> > I have an app that is distributed across a dozen servers.
> >
> > There are several processes involved, some with dependencies on
> > processes running on other servers.
>
> > What app would you recommend for starting the whole thing up in an
> > orderly manner?
>
> Is it possible to adjust the pieces of software so that there is no
> required 'orderly' startup?
>
> I ask because--if the application requires synchronized startup of
> services across multiple machines, then what happens when one of the
> services (or nodes) early in that dependency chain fails during
> operation?
>
> For example, let's imagine services A through I, each of which must
> be launched before the subsequent can launch:
>
>    A -> B -> C -> D -> E -> F -> G -> H -> I
>
> Assuming normal, orderly, coordinated startup, great.  Now,
> everything is running.
>
> Suppose that service C fails.
>    What happens?
>    Will the application still run?
>    Do D through I need to be restarted (or just D)?
>
> If it is possible to adjust the individual services so that each of
> them can run and retry, fail gracefully, or even fail hard (as fast
> as possible, please) to contend with dependency issues, I would
> recommend that.
>
> Perhaps you have already addressed that question or are in the
> (unenviable) position of contending with feature-complete software
> that is ready for deployment.
>
> Since you are in the 10+ node realm, I think I'd also agree with
> using some sort of configuration management (somebody suggested
> Ansible).  With this many nodes, it's an operational truism that one
> of them will kick the bucket during your dog's midnight birthday
> party [0] and you'll want to be able to move the service quickly to
> another node.
>
> Hurrah for the well-worn configuration management tools.
>
> This is the modern take on startup script dependencies, just now
> with more network in-between!  Everybody needs more network
> in-between!  Not an easy problem.
>
> Anyway, good luck with this conundrum!
>
> -Martin
>
>   [0] Silicon devices sense these moments and cherish destroying our
>       equanimity.
>
> --
> Martin A. Brown
> http://linux-ip.net/
> _______________________________________________
> PLUG mailing list
> PLUG@lists.pdxlinux.org
> http://lists.pdxlinux.org/mailman/listinfo/plug
>
