[
https://issues.apache.org/jira/browse/S4-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291798#comment-13291798
]
Matthieu Morel commented on S4-27:
----------------------------------
Developments in S4-22 added the following concepts:
* Multiple subclusters in an S4 namespace, e.g.:
** /s4/clusters/cluster1
** /s4/clusters/cluster2
* Publication of stream producers and consumers, described here:
https://issues.apache.org/jira/secure/attachment/12531387/Inter%20cluster%20communication%20in%20S4%20piper.pdf
> extensions to cluster configuration through Zookeeper
> -----------------------------------------------------
>
> Key: S4-27
> URL: https://issues.apache.org/jira/browse/S4-27
> Project: Apache S4
> Issue Type: Improvement
> Affects Versions: 0.5
> Reporter: Matthieu Morel
> Fix For: 0.5
>
>
> Applications running on S4 clusters are configured through Zookeeper.
> We need to extend the current configuration properties in order to configure
> more features used/required by S4 (streams, SLAs, states, etc.)
> Current configuration
> ----------------------------
> It is currently limited to:
> - assigning *tasks* to logical partitions (S4 nodes)
> - publishing *applications*, retrievable from remote repositories
> _Available tasks_, _assigned tasks_ and _applications_ are defined as
> _znodes_, and contain metadata (data associated with the node), as JSON data
> (see ZNRecord class)
> The resulting structure in Zookeeper is currently:
> 1. tasks
> * /<cluster-name>/tasks for available tasks
> - /<cluster-name>/tasks/Task-0 for instance represents 1 logical
> task, and metadata contains the task id and the partition id
> * /<cluster-name>/process for tasks assigned to S4 nodes
> - /<cluster-name>/process/Task-0 is an ephemeral node created by an
> S4 node that took the Task-0 task. Metadata contains the hostname of that S4
> node
> 2. apps
> * /<cluster-name>/apps for applications
> - /<cluster-name>/apps/app1 for instance is the application "app1"
> running on the (logical) cluster and metadata contains just the URI for
> fetching the S4R archive with the application code
> What we need to add
> ----------------------------
> (just some starting points that can be seen as subtasks):
> 1. *nodes state*: it would be really useful to have a general view on the
> available S4 nodes for a given logical cluster. In particular: what nodes are
> available, what is their state (initializing, ready, stopped, processing a
> task, in standby, etc.)?
> --> we could use a new directory /<cluster-name>/nodes and metadata could
> contain information about the node, and notably its state
> --> the corresponding ephemeral znode would be maintained by the Server
> instance or a related entity
> 2. *streams*: if we want to implement inter-app communication through
> streams, then streams should be configurable through Zookeeper.
> --> streams could appear in /<cluster-name>/streams
> - Metadata for streams could include partitioning scheme (as suggested by
> Kishore in S4-10).
> - Metadata could also include a key finder string
> - child nodes could list the applications using the stream
> --> the corresponding persistent znode would be created at application startup.
> If the stream znode already exists, it would be reused.
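
The layout described above (current znodes plus the proposed nodes and streams directories) can be sketched with a small in-memory model. This is only an illustration, not S4 or Zookeeper code: the InMemoryZnodeTree class, the metadata field names (taskId, partition, host, s4r_uri, state, partitioning, keyFinder), the host name and the repository URI are all hypothetical, and the toy tree does not enforce real Zookeeper rules such as parent existence or per-session ownership beyond a simple owner label.

```python
import json


class InMemoryZnodeTree:
    """Toy stand-in for a Zookeeper namespace (hypothetical, for illustration).

    Maps a path to (JSON metadata, owner); owner is None for persistent znodes
    and a session label for ephemeral ones.
    """

    def __init__(self):
        self.znodes = {}

    def create(self, path, metadata, owner=None):
        # Like Zookeeper, refuse to create a znode that already exists.
        if path in self.znodes:
            raise KeyError(f"znode already exists: {path}")
        self.znodes[path] = (json.dumps(metadata, sort_keys=True), owner)

    def create_or_reuse(self, path, metadata):
        # Persistent znode: create it once, then reuse the existing metadata.
        if path not in self.znodes:
            self.create(path, metadata)
        return json.loads(self.znodes[path][0])

    def children(self, path):
        # Direct children only: names with no further "/" after the prefix.
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.znodes if p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)

    def end_session(self, owner):
        # Ephemeral znodes disappear when the session that created them ends.
        self.znodes = {p: v for p, v in self.znodes.items() if v[1] != owner}


def build_example(cluster="cluster1"):
    zk = InMemoryZnodeTree()
    root = f"/{cluster}"
    # Current layout: available tasks, assigned tasks (ephemeral), applications.
    zk.create(f"{root}/tasks/Task-0", {"taskId": "Task-0", "partition": 0})
    zk.create(f"{root}/process/Task-0", {"host": "node-a.example.org"}, owner="node-a")
    zk.create(f"{root}/apps/app1", {"s4r_uri": "http://repo.example.org/app1.s4r"})
    # Proposed additions: node state (ephemeral) and streams (persistent).
    zk.create(f"{root}/nodes/node-a", {"state": "ready"}, owner="node-a")
    zk.create_or_reuse(f"{root}/streams/stream1",
                       {"partitioning": "hash", "keyFinder": "userId"})
    zk.create(f"{root}/streams/stream1/app1", {})  # app1 uses stream1
    return zk
```

In this sketch, a second application calling create_or_reuse on /cluster1/streams/stream1 gets the metadata written by the first one, and end_session("node-a") removes both the process/Task-0 and nodes/node-a entries while the tasks, apps and streams znodes remain.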