[ 
https://issues.apache.org/jira/browse/MESOS-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002673#comment-15002673
 ] 

Joseph Wu commented on MESOS-3863:
----------------------------------

Inter-dependency between {{process_manager}} and {{socket_manager}} will 
complicate things:

* {{process_manager}} holds the {{gc}} and various {{HttpProxy}} processes.
* {{socket_manager}} spawns {{HttpProxy}} processes and relies on {{gc}} to 
clean them up.
* {{gc}} relies on {{socket_manager}} links to clean up processes.

{{process::finalize}} should:
# Clean up all processes other than {{gc}}.  This will clear all links and 
delete all {{HttpProxy}} s while {{socket_manager}} still exists.
# Close all sockets via {{SocketManager::close}}.  All of {{socket_manager}} 's 
state is cleaned up via {{SocketManager::close}}, including termination of 
{{HttpProxy}} (termination is idempotent, so it is safe that the {{HttpProxy}} s 
were already killed via {{process_manager}} in the previous step).
# At this point, {{socket_manager}} should be empty and only the {{gc}} process 
should be running.  (Since we're finalizing, assume there are no threads trying 
to spawn processes.)  {{socket_manager}} can be deleted.
# {{gc}} can be deleted.  This is currently a leaked pointer, so we'll also 
need to track and delete that.
# {{process_manager}} should be devoid of processes, so we can proceed with 
cleanup (join threads, stop the {{EventLoop}}, etc).
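
The ordering above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{Runtime}}, {{initialize}}, {{finalize}}, {{terminateAllExcept}}, {{open}} and {{closeAll}} are hypothetical stand-ins made up for this sketch, not the real libprocess classes or signatures, and processes are modelled as plain names in a set.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in; NOT the real libprocess ProcessManager.
struct ProcessManager {
  std::set<std::string> processes;  // names of live processes

  void spawn(const std::string& name) { processes.insert(name); }

  // Idempotent: terminating an unknown or already-dead process is a no-op.
  void terminate(const std::string& name) { processes.erase(name); }

  // Step 1 helper: terminate everything except one survivor (the gc).
  void terminateAllExcept(const std::string& keep) {
    for (auto it = processes.begin(); it != processes.end();) {
      if (*it == keep) {
        ++it;
      } else {
        it = processes.erase(it);
      }
    }
  }
};

// Hypothetical stand-in; NOT the real libprocess SocketManager.
struct SocketManager {
  explicit SocketManager(ProcessManager* pm) : pm(pm) {}

  // Opening a socket spawns an HttpProxy process for it.
  void open(int fd) {
    std::string proxy = "HttpProxy(" + std::to_string(fd) + ")";
    pm->spawn(proxy);
    proxies[fd] = proxy;
  }

  // Closing a socket terminates its proxy; safe if it is already gone.
  void close(int fd) {
    auto it = proxies.find(fd);
    if (it == proxies.end()) return;
    pm->terminate(it->second);
    proxies.erase(it);
  }

  void closeAll() {
    while (!proxies.empty()) close(proxies.begin()->first);
  }

  bool empty() const { return proxies.empty(); }

  ProcessManager* pm;
  std::map<int, std::string> proxies;  // socket -> proxy process name
};

// Hypothetical container for the two globals, plus a log of teardown steps.
struct Runtime {
  std::unique_ptr<ProcessManager> process_manager;
  std::unique_ptr<SocketManager> socket_manager;
  std::vector<std::string> log;
};

Runtime initialize() {
  Runtime rt;
  rt.process_manager.reset(new ProcessManager());
  rt.process_manager->spawn("gc");
  rt.socket_manager.reset(new SocketManager(rt.process_manager.get()));
  return rt;
}

void finalize(Runtime& rt) {
  // 1. Terminate every process other than gc while socket_manager exists.
  rt.process_manager->terminateAllExcept("gc");
  rt.log.push_back("terminate non-gc processes");

  // 2. Close all sockets; proxy termination here is a harmless repeat.
  rt.socket_manager->closeAll();
  rt.log.push_back("close sockets");

  // 3. socket_manager is now empty and can be deleted.
  assert(rt.socket_manager->empty());
  rt.socket_manager.reset();
  rt.log.push_back("delete socket_manager");

  // 4. Only gc is left; remove it explicitly instead of leaking it.
  rt.process_manager->terminate("gc");
  rt.log.push_back("delete gc");

  // 5. process_manager holds no processes and can be cleaned up.
  assert(rt.process_manager->processes.empty());
  rt.process_manager.reset();
  rt.log.push_back("delete process_manager");
}
```

Calling {{initialize()}}, opening a few sockets, and then calling {{finalize(rt)}} walks through the five steps in order.  The sketch deliberately ignores threads, links, and the {{EventLoop}}, which the real cleanup would also have to handle.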

> Investigate the requirements of programmatically re-initializing libprocess
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-3863
>                 URL: https://issues.apache.org/jira/browse/MESOS-3863
>             Project: Mesos
>          Issue Type: Task
>          Components: libprocess, test
>            Reporter: Joseph Wu
>            Assignee: Joseph Wu
>              Labels: mesosphere
>
> This issue is for investigating what needs to be added/changed in 
> {{process::finalize}} such that {{process::initialize}} will start on a clean 
> slate.  Additional issues will be created once done.  Also see [the parent 
> issue|MESOS-3820].
> {{process::finalize}} should cover the following components:
> * {{__s__}} (the server socket)
> ** {{delete}} should be sufficient.  This closes the socket and thereby 
> prevents any further interaction with it.
> * {{process_manager}}
> ** Related prior work: [MESOS-3158]
> ** Cleans up the garbage collector, help, logging, profiler, statistics, 
> route processes (including [this 
> one|https://github.com/apache/mesos/blob/3bda55da1d0b580a1b7de43babfdc0d30fbc87ea/3rdparty/libprocess/src/process.cpp#L963],
>  which currently leaks a pointer).
> ** Cleans up any other {{spawn}} 'd process.
> ** Manages the {{EventLoop}}.
> * {{Clock}}
> ** The goal here is to clear any timers so that nothing can dereference 
> {{process_manager}} while we're finalizing/finalized.  It's probably not 
> important to execute any remaining timers, since we're "shutting down" 
> libprocess.  This means:
> *** The clock should be {{paused}} and {{settled}} before the clean up of 
> {{process_manager}}.
> *** Processes, which might interact with the {{Clock}}, should be cleaned up 
> next.
> *** A new {{Clock::finalize}} method would then clear timers, 
> process-specific clocks, and {{tick}} s; and then {{resume}} the clock.
> * {{__address__}} (the advertised IP and port)
> ** Needs to be cleared after {{process_manager}} has been cleaned up.  
> Processes use this to communicate events.  If cleared prematurely, 
> {{TerminateEvents}} will not be sent correctly, leading to infinite waits.
> * {{socket_manager}}
> ** The idea here is to close all sockets and deallocate any existing 
> {{HttpProxy}} or {{Encoder}} objects.
> ** All sockets are created via {{__s__}}, so cleaning up the server socket 
> first will prevent any new activity.
> * {{mime}}
> ** This is effectively a static map.
> ** It should be possible to statically initialize it.
> * Synchronization atomics {{initialized}} & {{initializing}}.
> ** Once cleanup is done, these should be reset.
> *Summary*:
> * Implement {{Clock::finalize}}.  [MESOS-3882]
> * Implement {{~SocketManager}}.
> * Clean up {{mime}}.
> * Wrap everything up in {{process::finalize}}.
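
As a rough illustration of the last component (resetting {{initialized}} and {{initializing}} once cleanup is done), the following sketch shows why the reset matters for re-initialization.  {{InitState}} and its {{generation}} counter are hypothetical names used only here, and the flag logic is a simplified assumption, not the actual guard in {{process::initialize}}.

```cpp
#include <atomic>

// Hypothetical guard; only the two flag names come from the issue.
struct InitState {
  std::atomic<bool> initialized{false};
  std::atomic<bool> initializing{false};
  int generation = 0;  // counts how many times setup actually ran

  // Returns true only for the caller that performs the initialization.
  bool initialize() {
    bool expected = false;
    // Exactly one caller flips `initializing` from false to true.
    if (!initializing.compare_exchange_strong(expected, true)) {
      return false;  // real code would wait for the winner to finish
    }
    ++generation;  // ... set up managers, clock, sockets ...
    initialized.store(true);
    return true;
  }

  // Returns true if there was something to tear down.
  bool finalize() {
    if (!initialized.load()) return false;
    // ... tear down managers, clock, sockets ...
    // Reset both flags so that a later initialize() starts from scratch.
    initialized.store(false);
    initializing.store(false);
    return true;
  }
};
```

Without the two stores at the end of {{finalize}}, the second {{initialize}} call would see {{initializing}} still set and return early, leaving the library torn down.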



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
