Re: EEP proposal - Delayed restarts of supervisor children

Loïc Hoguin Thu, 17 Jun 2021 06:26:05 -0700

I don't think having optional delays in the supervisor means that it's a good match for all use cases.

I do not think supervisor delays are a good match for what we may call "network services", be it an interface to a database, an HTTP client, or any other process that provides an interface to something external that sits across the network.


I do not think jitter or backoff would be a good addition to the supervisor.

But there are cases where delays help in restarts. For example, you have run out of some kind of resource and restarting immediately is unlikely to improve the situation.

Or the process relies on the distribution being up to function and it restarts much faster than the distribution restores itself.

It could even be useful in some kinds of "network services". For example, you might send events via HTTP and it's not a big deal if the events make it to the other endpoint or not, just fire and forget. The process is up? Great, send it. Otherwise ignore.


Cheers,

On 17/06/2021 15:11, José Valim wrote:

Thanks Maria and Jan for another EEP!

I have to say, however, that I agree with Fred on this topic, especially with the consideration that a restart_delay as an integer value is not enough. I would even say jitter is more important than backoff in many cases, and supporting both exponential backoff and jitter will require more configuration and more complexity to be added to supervisors, while I also believe it belongs in the worker, as you gain a lot more flexibility. As one additional example to what Fred said, what if you want to accumulate requests while you wait for the connection to be established, and then issue the commands once it is ready? There are many other considerations that are only fully realizable in the worker.
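
As a rough illustration of the point above, here is a minimal sketch of backoff with jitter and request accumulation living in the worker rather than in the supervisor. It is written in Python for brevity, not as OTP code, and every name in it (backoff_delays, Worker, base_ms, cap_ms, the connect and send callables) is hypothetical and not part of the EEP or of any supervisor API. The "full jitter" variant used here is only one of several possible choices.

```python
import random
from collections import deque


def backoff_delays(base_ms=100, cap_ms=10_000, attempts=6, rng=None):
    """Yield one delay in milliseconds per reconnect attempt.

    Exponential backoff with "full jitter": the ceiling doubles on each
    attempt (up to cap_ms) and the delay is drawn uniformly below it.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap_ms, base_ms * 2 ** attempt)
        yield rng.uniform(0, ceiling)


class Worker:
    """Hypothetical worker that queues requests until it is connected."""

    def __init__(self, connect, send):
        self._connect = connect  # callable returning True on success
        self._send = send        # callable taking one request
        self.connected = False
        self.pending = deque()

    def request(self, msg):
        # Send right away when connected, otherwise accumulate.
        if self.connected:
            self._send(msg)
        else:
            self.pending.append(msg)

    def try_connect(self):
        # On success, flush everything accumulated while disconnected,
        # in the order it was received.
        self.connected = self._connect()
        if self.connected:
            while self.pending:
                self._send(self.pending.popleft())
        return self.connected
```

A caller would sleep for each value from backoff_delays() between calls to try_connect(). A single integer restart delay in the supervisor cannot express either the jitter or the queueing.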

On Thu, Jun 17, 2021 at 3:01 PM Maria Scott <[email protected]> wrote:


    Hi Viktor :)

     > I support this EEP. :-)

    Glad to hear :)

     > It has been argued before that supervision trees are for
     > fault-tolerance of bugs, not network/external errors. But why not
     > enable the use of supervision subtrees for external faults too?

    Yes, I understand both sides of the argument, but yeah, why not? :)
    The real problem we had was to figure out how to delay it right.
    Dragging out the time between crashes and restarts opens up some new
    scenarios and corner cases, especially in the sibling-terminating
    strategies.

     > If we add delays, then how about exponential backoff? e.g.
     > doubling the delay for each failed restart attempt. Is it worth
     > considering too? It has been suggested before and it's common
     > for network re-attempts.

    We considered it but decided against it, for now at least. Simple as it
    sounds on the surface, there is actually quite some complexity
    involved. We think that providing delays alone is already a big step
    forward, and paves the way to future improvements like incremental
    delays.

     > Just forbid the existence of the key restart_delay when restart
     > type is temporary.

    We considered this also, but it feels a bit wrong =^^= I mean, it is
    always allowed to have any meaningless key in the map, they are just
    ignored. Other keys (like significant) are allowed to appear as long
    as their values don't clash with other options. Forbidding some keys
    to appear based on the values of other keys, that would be new and
    unique.

    Regards,
    Maria
    _______________________________________________
    eeps mailing list
    [email protected]
    http://erlang.org/mailman/listinfo/eeps




--
Loïc Hoguin
https://ninenines.eu