Rich, if you wanted this comment to get onto JIRA, you need to post it to the ticket. I don't think the email gateway works with the ASF JIRA setup.
Cos

On Tue, Mar 10, 2015 at 08:44AM, Richard Pelavin wrote:
> An approach to look at is having a simple topology DSL, say in YAML, that the
> end user interacts with to specify the logical set of nodes, which daemons run
> on each node, and how they are interconnected (for more complex things like HA,
> or to indicate how monitors, user authentication, etc. could plug in). It would
> then be easy to write some simple code to compile this into, for example,
> hiera.yaml files, or an ENC, or even both.
>
> I think this begs the question of what type of end user is expected to use
> this: if it is a Puppet-savvy end user, then having them specify things in
> hiera would be clear, but if we are going after end users who are not well
> versed in Puppet, a "Bigtop topology DSL" may resonate better and can be
> simpler.
>
> Now, I have done much work in this area, so I would be happy to propose a
> starting point for a topology DSL if this approach makes sense. I can also
> flesh out a number of issues that could be addressed, to see what priority
> people would give them.
>
> One issue, for example, is whether the topology only identifies nodes and
> groups of nodes logically (e.g., the set of slaves) and does not require IP or
> DNS addresses to be assigned to them; this allows more sharable designs without
> locking users into what would be one team's deployment-specific settings. It
> also facilitates the process where, given a topology, we spin up a set of nodes
> and, in a late-binding way, attach the host addresses to the nodes and to the
> attributes used for connecting between hosts.
>
> For the specific example of init-hdfs.sh, if I am guessing the issue correctly,
> it is that ideally you want to create directories only for the services that
> are actually being used, rather than creating all of them; equivalently, for
> the data-driven way in which this is generated, you want to construct the
> description of which directories to include as a function of which daemons are
> on the topology nodes. This is something I have tackled and can include in a
> write-up if there is interest.
>
> - Rich
>
>
> ----- Original Message -----
> From: [email protected]
> To: <[email protected]>
> Cc:
> Sent: Tue, 10 Mar 2015 09:19:38 +0000 (UTC)
> Subject: [jira] [Commented] (BIGTOP-1746) Introduce the concept of roles in
> bigtop cluster deployment
>
> [ https://issues.apache.org/jira/browse/BIGTOP-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354586#comment-14354586 ]
>
> Michael Weiser commented on BIGTOP-1746:
> ----------------------------------------
>
> At least in the Puppet manifests, there are only two places where the current
> role concept is implemented:
> - manifests/cluster.pp decides which daemons to put on which box
> - hieradata/bigtop/cluster.yaml governs what to put into their config files
>
> cluster.yaml can easily be adjusted and overridden so that it no longer forces
> the concept of a head node and a frontend into the config files. So the main
> point of attack, from my point of view, is cluster.pp. Unfortunately it also
> implements some dependencies between modules and some add-on logic, such as
> running init-hdfs.sh. Basically, I would suggest moving these dependencies into
> their respective modules and then just throwing away cluster.pp. After that,
> the classes could be included directly using hiera_include, with the hiera
> lookup hierarchy, or possibly an ENC or facts, governing which roles a machine
> has.
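A quick aside on the ENC option mentioned above: an external node classifier is just an executable that, for a given node name, prints YAML listing the classes (and optionally parameters and an environment) to apply. A minimal sketch of such output for one role-based node could look like the following; the class names other than hadoop::namenode are illustrative assumptions, not classes taken from the Bigtop manifests.

{noformat}
---
# Hypothetical ENC output for a single node. The class names are only
# assumptions used to illustrate per-node role assignment.
classes:
  - hadoop::namenode
  - hadoop::resourcemanager
environment: production
{noformat}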
> I have a setup where I've basically done that. I have changes, which I was
> already planning to propose for merging, that move the dependencies mostly
> into the hadoop module. That would render cluster.pp quite empty already. I
> also have a concept for assigning roles to nodes via hiera. This, however, is
> a bit convoluted and would need streamlining for inclusion in mainline
> BigTop. In the most basic case, classes such as hadoop::namenode can just be
> assigned to nodes directly, like this:
>
> manifests/site.pp:
> {noformat}
> hiera_include("classes")
> {noformat}
>
> hiera.yaml:
> {noformat}
> ---
> :yaml:
>   :datadir: /etc/puppet/hieradata
> :hierarchy:
>   - "node/%{::fqdn}"
>   - site
>   - bigtop/cluster
> {noformat}
>
> hieradata/node/node1.do.main.yaml:
> {noformat}
> ---
> classes:
>   - hadoop::namenode
>   - hadoop-zookeeper::server
> {noformat}
>
>
> > Introduce the concept of roles in bigtop cluster deployment
> > -----------------------------------------------------------
> >
> >                 Key: BIGTOP-1746
> >                 URL: https://issues.apache.org/jira/browse/BIGTOP-1746
> >             Project: Bigtop
> >          Issue Type: New Feature
> >          Components: deployment
> >            Reporter: vishnu gajendran
> >              Labels: features
> >             Fix For: 0.9.0
> >
> >
> > Currently, during cluster deployment, Puppet categorizes nodes as
> > head_node, worker_nodes, gateway_nodes and standby_node based on
> > user-specified information. This functionality gives the user control over
> > picking a particular node as the head_node, standby_node or gateway_node,
> > with the rest acting as worker_nodes. But I would like to have more
> > fine-grained control over which daemons run on which node. For example, I
> > do not want to run the namenode and a datanode on the same node. This
> > functionality can be introduced with the concept of roles. Each node can be
> > assigned a set of roles. For example, Node A can be assigned the
> > ["namenode", "resourcemanager"] roles, Node B can be assigned ["datanode",
> > "nodemanager"] and Node C can be assigned ["nodemanager", "hadoop-client"].
> > Now each node will only run the specified daemons. A prerequisite for this
> > kind of deployment is that each node is given the necessary configuration
> > it needs: for example, each datanode should know which node is the
> > namenode, etc. This functionality will allow users to customize the cluster
> > deployment according to their needs.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
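To tie the two ideas together, below is a minimal sketch of what a YAML topology description along the lines Rich proposes could look like, using the role assignments from the issue description and purely logical node names (no IPs or DNS entries). The schema and key names are illustrative assumptions, not a format anyone in the thread has settled on; a small compiler could translate something like this into per-node hiera "classes" lists or ENC output.

{noformat}
---
# Hypothetical topology sketch: logical node names only, so the same design
# can be shared and bound to real hosts at deployment time.
nodes:
  node_a:
    roles: [namenode, resourcemanager]
  node_b:
    roles: [datanode, nodemanager]
  node_c:
    roles: [nodemanager, hadoop-client]
{noformat}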
