[ https://issues.apache.org/jira/browse/UIMA-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Eckart de Castilho resolved UIMA-5567.
----------------------------------------------
Resolution: Abandoned
DUCC has been retired.
> UIMA-DUCC: Agent should recover its state after restart
> -------------------------------------------------------
>
> Key: UIMA-5567
> URL: https://issues.apache.org/jira/browse/UIMA-5567
> Project: UIMA
> Issue Type: Improvement
> Components: DUCC
> Reporter: Jaroslaw Cwiklik
> Assignee: Jaroslaw Cwiklik
> Priority: Major
>
> Currently, bouncing an agent is not possible. After launching a child process,
> an agent adds an entry to its Process Inventory and uses a Process handle to
> call waitFor() to detect child termination. When an agent restarts, it loses
> all its children and has no means to recover its inventory.
> The proposal is to change this behavior to allow agents to bounce and
> subsequently recover their child processes. A bounce may be required, for
> example, to update agent code.
> An agent has two options to recover its child processes based on cgroup
> availability.
> If cgroups are enabled, an agent on startup will read all PIDs from the
> cgroup.procs file. These PIDs reflect running child processes on a node. An
> agent will create a skeleton inventory entry for each PID and fill in the
> details when the OR state is received. The agent will use a PID to find a
> matching process in the OR state. After the new inventory is recovered, the
> timer-based inventory update will fetch PIDs from the cgroup.procs file again
> and reconcile them with its inventory. To detect child process termination, an
> agent will compare PIDs in its inventory against PIDs from cgroup.procs. If a
> PID is in the inventory but not present in cgroup.procs, the agent will mark
> that process as Stopped if its deallocate flag is true, or as Failed if the
> flag is false. Any AP process that is no longer running will be marked as
> Stopped.
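The reconciliation rule described above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: InventoryReconciler, ProcessEntry, and State are hypothetical names, not DUCC's actual classes, and the set of live PIDs is assumed to have been read from the cgroup already.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the DUCC code base.
class InventoryReconciler {
    enum State { RUNNING, STOPPED, FAILED }

    /** Minimal stand-in for one Process Inventory entry. */
    static final class ProcessEntry {
        final boolean deallocate; // deallocate flag taken from the OR state
        final boolean isAP;       // true if this entry is an AP process
        State state = State.RUNNING;

        ProcessEntry(boolean deallocate, boolean isAP) {
            this.deallocate = deallocate;
            this.isAP = isAP;
        }
    }

    /**
     * Marks every running inventory entry whose PID is missing from livePids:
     * AP processes and deallocated processes become STOPPED, all others FAILED.
     */
    static void reconcile(Map<Long, ProcessEntry> inventory, Set<Long> livePids) {
        for (Map.Entry<Long, ProcessEntry> e : inventory.entrySet()) {
            ProcessEntry entry = e.getValue();
            if (entry.state != State.RUNNING || livePids.contains(e.getKey())) {
                continue; // already terminal, or still alive
            }
            entry.state = (entry.isAP || entry.deallocate) ? State.STOPPED : State.FAILED;
        }
    }
}
```

The same method would serve the non-cgroup mode, since only the source of the live PID set differs.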
> If cgroups are not enabled, an agent will recover its inventory from the OR
> state. While in this mode, an agent will disable its Rogue Process Detector
> and not attempt to detect alien processes. The timer-based inventory update
> will fetch PIDs from the OS (using the ps command) and reconcile them with its
> inventory. To detect child process termination, an agent will compare PIDs in
> its inventory against PIDs obtained from the OS. If a PID is in the inventory
> but not present in the OS, the agent will mark that process as Stopped if its
> deallocate flag is true, or as Failed if the flag is false. Any AP process
> that is no longer running will be marked as Stopped.
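For the non-cgroup mode, fetching PIDs from the OS might look something like the sketch below. It assumes a POSIX ps where "-e" selects all processes and "-o pid=" prints only the PID column with no header. OsPidSource and its methods are hypothetical names chosen for illustration, not part of DUCC.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the class and method names are assumptions.
class OsPidSource {

    /** Parses one PID per line, ignoring blank or non-numeric lines. */
    static Set<Long> parsePids(List<String> lines) {
        Set<Long> pids = new HashSet<>();
        for (String line : lines) {
            String trimmed = line.trim(); // ps pads the PID column with spaces
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                pids.add(Long.parseLong(trimmed));
            } catch (NumberFormatException ignored) {
                // skip anything that is not a plain PID
            }
        }
        return pids;
    }

    /** Runs "ps -e -o pid=" and returns the PIDs it reports. */
    static Set<Long> listPids() throws IOException, InterruptedException {
        Process ps = new ProcessBuilder("ps", "-e", "-o", "pid=")
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(ps.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        ps.waitFor(); // ps is short-lived, so waiting here is safe
        return parsePids(lines);
    }
}
```

An inventory PID that is absent from the returned set would then be treated as terminated.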
> - An agent will no longer call waitFor() on a Process object returned from a
> ProcessBuilder when a child process is launched.
> - An agent will continue to drain stdout and stderr of a child process to
> prevent the child (duccling) from hanging and to receive OS errors which may
> occur when exec'ing a process (bad command line, etc.). After duccling calls
> execve(), child process stdout and stderr are redirected to /dev/null and
> nothing is expected from these streams by the agent.
> - A child process will communicate state changes and initialization status to
> an agent via a provided port. The open question is how the port is provided
> to a child. Currently an agent uses -D (or env) to communicate its listener
> port to a child. The port is determined when an agent starts up and can
> potentially be different after the agent is bounced. So we either use a
> Registry to store the agent's port for a child to look up, or insist that an
> agent has a fixed port. If an agent is bounced and that port is not
> available, what should happen?
> - An agent should support a new flag "-Dclean=[true|false]" which on startup
> will force an agent to clean up (terminate) all child processes found in
> cgroups. The code for doing this is already in place and it is the default
> agent procedure on startup. It is still an open question whether this should
> be the default behavior. The same flag should also control what happens on
> agent shutdown. If clean=true, the agent will terminate its children;
> otherwise child processes will remain running.
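As a rough illustration of the second item in the list above (draining child output without calling waitFor()), a daemon thread per stream is one possible approach. StreamDrainer is a hypothetical helper, not existing DUCC code, and the sink is whatever the agent would use to record OS errors.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical sketch; the name and shape of this helper are assumptions.
class StreamDrainer {

    /**
     * Starts a daemon thread that reads the stream line by line until EOF,
     * passing each line to the sink. The thread ends when the stream closes.
     */
    static Thread drain(InputStream in, Consumer<String> sink) {
        Thread t = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    sink.accept(line);
                }
            } catch (IOException e) {
                // the stream was closed underneath us; nothing more to drain
            }
        }, "child-stream-drainer");
        t.setDaemon(true); // never keeps the agent JVM alive on its own
        t.start();
        return t;
    }
}
```

An agent could call drain(process.getInputStream(), ...) and drain(process.getErrorStream(), ...) right after ProcessBuilder.start() and then return without blocking on the child.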
--
This message was sent by Atlassian Jira
(v8.20.10#820010)