[jira] [Updated] (MESOS-7564) Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
[ https://issues.apache.org/jira/browse/MESOS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun-Hung Hsiao updated MESOS-7564:
---
    Labels: mesosphere  (was: )

> Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
> -
>
>                 Key: MESOS-7564
>                 URL: https://issues.apache.org/jira/browse/MESOS-7564
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Anand Mazumdar
>            Priority: Major
>              Labels: mesosphere
>
> Currently, we do not have heartbeats for executor <-> agent communication.
> This is especially problematic in scenarios when IPFilters are enabled since
> the default conntrack keep alive timeout is 5 days. When that timeout
> elapses, the executor doesn't get notified via a socket disconnection when
> the agent process restarts. The executor would then get killed if it doesn't
> re-register when the agent recovery process is completed.
> Enabling application level heartbeats or TCP KeepAlive's can be a possible
> way for fixing this issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (MESOS-7564) Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
[ https://issues.apache.org/jira/browse/MESOS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anand Mazumdar updated MESOS-7564:
--
    Target Version/s:   (was: 1.4.0)

> Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
> -
>
>                 Key: MESOS-7564
>                 URL: https://issues.apache.org/jira/browse/MESOS-7564
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Anand Mazumdar
>
> Currently, we do not have heartbeats for executor <-> agent communication.
> This is especially problematic in scenarios when IPFilters are enabled since
> the default conntrack keep alive timeout is 5 days. When that timeout
> elapses, the executor doesn't get notified via a socket disconnection when
> the agent process restarts. The executor would then get killed if it doesn't
> re-register when the agent recovery process is completed.
> Enabling application level heartbeats or TCP KeepAlive's can be a possible
> way for fixing this issue.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (MESOS-7564) Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
[ https://issues.apache.org/jira/browse/MESOS-7564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-7564:
---
    Summary: Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.  (was: Introduce a heartbeat mechanism for executor <-> agent communication.)

> Introduce a heartbeat mechanism for v1 HTTP executor <-> agent communication.
> -
>
>                 Key: MESOS-7564
>                 URL: https://issues.apache.org/jira/browse/MESOS-7564
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Anand Mazumdar
>
> Currently, we do not have heartbeats for executor <-> agent communication.
> This is especially problematic in scenarios when IPFilters are enabled since
> the default conntrack keep alive timeout is 5 days. When that timeout
> elapses, the executor doesn't get notified via a socket disconnection when
> the agent process restarts. The executor would then get killed if it doesn't
> re-register when the agent recovery process is completed.
> Enabling application level heartbeats or TCP KeepAlive's can be a possible
> way for fixing this issue.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
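
For illustration only, the two remedies named in the issue description (TCP keepalive and application-level heartbeats) might be sketched roughly as below. This is not Mesos code and not the fix that was adopted for MESOS-7564. The names `enable_tcp_keepalive` and `HeartbeatMonitor`, and all timing values, are hypothetical and chosen only for this sketch. The `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are platform specific, so the sketch applies each one only when the running Python build exposes it.

```python
import socket
import time


def enable_tcp_keepalive(sock, idle_secs=60, interval_secs=10, probes=5):
    """Turn on TCP keepalive probes for an existing TCP socket.

    Hypothetical helper; the default timings are illustrative only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are not available on every platform, so each one
    # is set only if the socket module defines it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_secs),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_secs),  # delay between probes
        ("TCP_KEEPCNT", probes),           # failed probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


class HeartbeatMonitor:
    """Application-level check: has the peer been heard from recently?

    Hypothetical class; the clock is injectable so it can be tested
    without sleeping.
    """

    def __init__(self, timeout_secs, clock=time.monotonic):
        self._timeout = timeout_secs
        self._clock = clock
        self._last_seen = clock()

    def record_heartbeat(self):
        # Call this whenever a heartbeat message arrives from the peer.
        self._last_seen = self._clock()

    def peer_is_stale(self):
        # True once more than timeout_secs have passed without a heartbeat.
        return self._clock() - self._last_seen > self._timeout
```

In a sketch like this, an executor that sees `peer_is_stale()` return true could treat the connection as lost and try to reconnect, rather than waiting for a socket disconnection that, per the description above, may never be delivered.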