[ https://issues.apache.org/jira/browse/STORM-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118650#comment-15118650 ]
Erik Weathers commented on STORM-1342:
--------------------------------------

STORM-1494 is adding support for the supervisor logs to be linked from the Nimbus UI. So this will likely be another area to adjust when (if!?) this is fixed.

> support multiple logviewers per host for container-isolated worker logs
> -----------------------------------------------------------------------
>
>                 Key: STORM-1342
>                 URL: https://issues.apache.org/jira/browse/STORM-1342
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Erik Weathers
>            Priority: Minor
>
> h3. Storm-on-Mesos Worker Logs are in varying directories
> When using [storm-on-mesos|https://github.com/mesos/storm] with cgroups, each
> topology's workers are isolated into separate containers. By default the
> worker logs will be saved into container-specific sandbox directories. These
> directories are also topology-specific by definition, because, as just
> stated, the containers are specific to each topology.
> h3. Problem: Storm supports 1-and-only-1 Logviewer per Worker Host
> A challenge with this different way of running Storm is that the
> [Storm logviewer|https://github.com/apache/storm/blob/768a85926373355c15cc139fd86268916abc6850/docs/_posts/2013-12-08-storm090-released.md#log-viewer-ui]
> runs as a single instance on each worker host. This doesn't play well with
> having the topology worker logs in separate per-topology containers. The one
> logviewer doesn't know about the various sandbox directories that the Storm
> Workers are writing to. And if we just spawned new logviewers for each
> container, the problem is that the Storm UI only knows about 1 global port
> for the logviewer, so it cannot direct you to the logviewer for a particular
> container.
> These problems are documented in (or linked from)
> [Issue #6 in the storm-on-mesos project|https://github.com/mesos/storm/issues/6].
> h3. Possible Solutions I can envision
> # configure the Storm workers to write to log directories that exist on the
> raw host outside of the container sandbox, and run a single logviewer on a
> host, which serves up the contents of that directory.
> #* violates one of the basic reasons for using containers: isolation.
> #* also prevents a standard use case for Mesos: running more than 1
> instance of a Mesos Framework (e.g., "Storm Cluster") at once on the same
> Mesos Cluster, e.g., for Blue-Green deployments.
> #* a variation on this proposal is to somehow expose the sandbox dirs of all
> storm containers to this singleton logviewer process (still has the above
> problems)
> # launch a separate logviewer in each container, and somehow register those
> logviewers with Storm such that Storm knows for a given host which logviewer
> port is assigned to a given topology.
> #* this is the proposed solution
> h3. Storm Changes for the Proposed Solution
> Nimbus or ZooKeeper could serve as a registrar, recording the association
> between a slot (host + worker port) and the logviewer port that is serving
> the worker's logs. And the Storm-on-Mesos framework could update this registry
> when launching a new worker. (This proposal definitely calls for thorough
> vetting and thinking.)
> h3. Storm-on-Mesos Framework Changes for the Proposed Solution
> Along with the interaction with the "registrar" proposed above, the
> storm-on-mesos framework can be enhanced to launch multiple logviewers on a
> given worker host, where each logviewer is dedicated to serving the worker
> logs from a specific topology's container/sandbox directory. This would be
> done by launching a logviewer process within the topology's container, and
> assigning it an arbitrary listening port that has been determined dynamically
> through mesos (which treats ports as one of the schedulable resource
> primitives of a worker host).
> [Code implementing this logviewer-port-allocation logic already exists|https://github.com/mesos/storm/commit/af8c49beac04b530c33c1401c829caaa8e368a35],
> but [that specific portion of the code was reverted|https://github.com/mesos/storm/commit/dc3eee0f0e9c06f6da7b2fe697a8e4fc05b5227e]
> because of the issues that inspired this ticket.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
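
The registrar idea in the issue description (an association between a slot and the logviewer port serving that slot's logs) could be sketched roughly as below. This is only an illustration, not code from Storm or storm-on-mesos: `Slot`, `LogviewerRegistry`, and `log_url` are hypothetical names, the in-memory dict stands in for whatever Nimbus or ZooKeeper would actually store, and the URL shape and the fallback port of 8000 are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Slot:
    """A worker slot, identified as in the proposal by host + worker port."""
    host: str
    worker_port: int


class LogviewerRegistry:
    """Hypothetical in-memory stand-in for the Nimbus/ZooKeeper registrar."""

    def __init__(self) -> None:
        self._ports: Dict[Slot, int] = {}

    def register(self, slot: Slot, logviewer_port: int) -> None:
        # The framework would call this when launching a worker in a container.
        self._ports[slot] = logviewer_port

    def unregister(self, slot: Slot) -> None:
        # The framework would call this when the worker's container goes away.
        self._ports.pop(slot, None)

    def logviewer_port(self, slot: Slot, default: Optional[int] = None) -> Optional[int]:
        # Return the per-slot port, or the given default if none is registered.
        return self._ports.get(slot, default)


def log_url(registry: LogviewerRegistry, slot: Slot, log_file: str,
            global_port: int = 8000) -> str:
    """Build a log link the way a UI might (URL shape is an assumption)."""
    port = registry.logviewer_port(slot, default=global_port)
    return f"http://{slot.host}:{port}/log?file={log_file}"
```

With something like this, the UI would look up the port per slot instead of using one global logviewer port, and would fall back to the global port for hosts that run a single logviewer.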