On 10 April 2014 16:28, Andrew Purtell <apurt...@apache.org> wrote:

> Hi Steve,
>
> Does Slider target the deployment and management of components/projects in
> the Hadoop project itself? Not just the ecosystem examples mentioned in the
> proposal? I don't see this mentioned in the proposal.
>

No. That said, some of the stuff I'm prototyping on a service registry should be usable for existing code: there's no reason why a couple of ZooKeeper arguments shouldn't be enough to look up the bindings for HDFS, YARN, etc. I've not done much there (I'm currently seeing how well Curator service discovery works), so assistance would be welcome.

>
> The reason I ask is I'm wondering how Slider differentiates from projects
> like Apache Twill or Apache Bigtop that are already existing vehicles for
> achieving the aims discussed in the Slider proposal.

Twill: handles all the AM logic for running new code packaged as a JAR with an executor method.
Bigtop: stack testing.

> Tackling
> cross-component resource management issues could certainly be that, but
> only if core Hadoop services are also brought into the deployment and
> management model, because IO pathways extend over multiple layers and
> components. You mention HBase and Accumulo as examples. Both are HDFS
> clients. Would it be insufficient to reserve or restrict resources for e.g.
> the HBase RegionServer without also considering the HDFS DataNode?

IO quotas are a tricky one: you can't cgroup-throttle a container for HDFS IO, as the IO takes place in local and remote DN processes. Without doing some priority queuing in the DNs, we can hope for some labelling of nodes in the YARN cluster so you can at least isolate the high-SLA apps from IO-intensive but lower-priority code.

> Do the
> HDFS DataNode and HBase RegionServer have exactly the same kind of
> deployment, recovery/restart, and dynamic scaling concerns?

DNs react to loss of the NN by spinning on the cached IP address or, in HA, to the defined failover address. Now, if we did support ZK lookup of NN IPC and web ports, we could consider an alternate failure mode where the DNs intermittently poll the ZK bindings during the spin cycle.

HBase and Accumulo do have their own ZK binding mechanism, so don't really need their own registry.
But to work with their data you do need the relevant client apps. I would like to have some standard for at least publishing the core binding information in a way that could be parsed by any client app (CLI, web UI, other in-cluster apps).

> Or are these
> sort of considerations outside the Slider proposal scope?
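
To make the binding-publication idea a bit more concrete, here is a rough sketch of what a client-parseable record and a lookup with fallback might look like. Everything in it is an assumption for illustration only: the JSON layout, the field names, the `hdfs-namenode` service name, the example hostnames, and the `read_registry` callable (a stand-in for whatever actually reads the ZooKeeper node) are hypothetical, and none of it is Slider's or Curator's real format. The fallback to the cached address mirrors the "poll the bindings during the spin cycle" failure mode described above.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Binding:
    """One published endpoint. Field names are hypothetical."""
    service: str   # e.g. "hdfs-namenode" (made-up name)
    protocol: str  # e.g. "ipc" or "http"
    host: str
    port: int


def publish(bindings: List[Binding]) -> bytes:
    """Serialize bindings to the bytes a registry node could hold."""
    doc = {"version": 1, "bindings": [asdict(b) for b in bindings]}
    return json.dumps(doc, sort_keys=True).encode("utf-8")


def parse(payload: bytes) -> List[Binding]:
    """Parse a published record back into Binding objects."""
    doc = json.loads(payload.decode("utf-8"))
    return [Binding(**entry) for entry in doc["bindings"]]


def resolve(
    service: str,
    protocol: str,
    cached: Tuple[str, int],
    read_registry: Callable[[], Optional[bytes]],
) -> Tuple[str, int]:
    """Return the registry's address for (service, protocol) if one is
    published; otherwise keep using the cached address."""
    try:
        payload = read_registry()
    except OSError:
        # Registry unreachable: behave as today and stay on the cached address.
        return cached
    if payload is None:
        return cached
    for binding in parse(payload):
        if binding.service == service and binding.protocol == protocol:
            return (binding.host, binding.port)
    return cached
```

A caller retrying a lost connection could call `resolve(...)` between attempts, so that a re-published binding is picked up without restarting the client, while an unreachable or empty registry leaves the existing behaviour unchanged.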