Answers inline.

-- Hitesh
On Mar 12, 2013, at 12:26 PM, Ioan Zeng wrote:

> Another evaluation criterion was the community support of the framework
> which I rate now as very good :)
>
> I would like to ask other questions:
>
> I have seen YARN or MR used only in the context of HDFS. Would it be
> possible to keep all YARN features without using it in relation with
> HDFS (with no HDFS installed)?

It uses the generic filesystem APIs from Hadoop to a very large extent, so it
should work with any filesystem solution. There are a couple of features which
do depend on HDFS though - log aggregation, for example (collecting the logs of
all containers into a central place), would need to be disabled. There may be
some cases which I may be unaware of. If you do see anything which depends on
HDFS, please do file JIRAs so that we can address the issue.

>
> You mentioned the CapacityScheduler. Does this require MapReduce? or
> is it included in YARN? I understood that MRv2 is just an application
> built over the YARN framework. For our use case we don't need MR.
>

Yes - you are right - there would be no dependency on MapReduce. The
CapacityScheduler is the scheduling module used inside the ResourceManager
(which is YARN only).

> For a better understanding of my questions regarding the Distributed
> Shell. We intend to use YARN for a distributed automated test
> environment which will execute set of test suites for specific builds
> in parallel. Do you know about similar usages of YARN or MR, maybe
> case studies?
>

There are a few others who are using YARN in various scenarios - none who use
it for their test infrastructure as far as I know. The closest I can think of
would be LinkedIn's use case, where they launch and monitor a bunch of services
on a YARN cluster
(http://riccomini.name/posts/hadoop/2012-10-12-hortonworks-yarn-meetup/ might
be of help).

> Thanks,
> Ioan
>
> On Tue, Mar 12, 2013 at 8:47 PM, Hitesh Shah <hit...@hortonworks.com> wrote:
>> Answers regarding DistributedShell.
>>
>> https://issues.apache.org/jira/secure/attachment/12486023/MapReduce_NextGen_Architecture.pdf
>> has some details on YARN's architecture.
>>
>> -- Hitesh
>>
>> On Mar 12, 2013, at 7:31 AM, Ioan Zeng wrote:
>>
>>>
>>> Another point I would like to evaluate is the Distributed Shell example
>>> usage.
>>> Our use case is to start different scripts on a grid. Once a node has
>>> finished a script a new script has to be started on it. A report about
>>> the scripts execution has to be provided. In case a node has failed to
>>> execute a script it should be re-executed on a different node. Some
>>> scripts are Windows specific, others are Unix specific and have to be
>>> executed on a node with a specific OS.
>>>
>>
>> The current implementation of distributed shell is effectively a piece of
>> example code to help folks write more complex applications. It simply
>> supports launching a script on a given number of containers (without
>> accounting for where the containers are assigned), does not handle retries
>> on failures, and simply reports a success/failure based on the number of
>> failures in running the script.
>>
>> Based on your use case, it should be easy enough to build on the example
>> code to handle the features that you require.
>>
>> The OS-specific resource ask is something which will need to be addressed
>> in YARN. Could you file a JIRA for this feature request with some details
>> about your use case?
>>
>>
>>> The question is:
>>> Would it be feasible to adapt the example "Distributed Shell"
>>> application to have the above features?
>>> If yes how could I run some specific scripts only on a specific OS? Is
>>> this the ResourceManager responsibility? What happens if there is no
>>> Windows node for example in the grid but in the queue there is a
>>> Windows script?
>>> How to re-execute failed scripts? Does it have to be implemented by
>>> custom code, or is it a built in feature of YARN?
>>>
>>>
>>
>> The way YARN works is slightly different from what you describe above.
>>
>> What you would do is write some form of a controller, which in YARN
>> terminology is referred to as an ApplicationMaster.
>> It would request containers from the RM (for example, 5 containers on
>> WinOS, 5 on Linux with 1 GB each of RAM). Once a container is
>> assigned, the controller would be responsible for launching the correct
>> script based on the container allocated. The RM would be responsible
>> for ensuring the correct set of containers are allocated to the application
>> based on resource usage limits, priorities, etc. [Again, to clarify, OS type
>> scheduling is currently not supported.] If a script fails, the container's
>> exit code and completion status would be fed back to the controller, which
>> would then have to handle retries (may require asking the RM for a new
>> container).
>>
>>
>>
>>> Thank you in advance for your support,
>>> Ioan Zeng
>>
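
To make the controller idea from the quoted reply a bit more concrete, here is a rough sketch of the retry loop such an ApplicationMaster would need. It is a plain in-memory simulation and does not use any YARN API: `ScriptTask`, `run_controller`, `run_script`, `max_attempts`, the node names and the suite names are all made up for illustration, and picking a node from a list only stands in for asking the RM for a new container.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ScriptTask:
    """One script to run, plus the bookkeeping needed for retries."""
    name: str
    attempts: int = 0
    failed_nodes: set = field(default_factory=set)


def run_controller(scripts, nodes, run_script, max_attempts=3):
    """Run every script once, retrying failures on a different node.

    run_script(script_name, node) must return an exit code (0 = success).
    Returns {script_name: (status, node_or_None, attempts)}.
    """
    pending = deque(ScriptTask(name) for name in scripts)
    report = {}
    while pending:
        task = pending.popleft()
        # Hypothetical stand-in for a container request to the RM:
        # take the first node this script has not already failed on.
        node = next((n for n in nodes if n not in task.failed_nodes), None)
        if node is None or task.attempts >= max_attempts:
            report[task.name] = ("FAILED", None, task.attempts)
            continue
        task.attempts += 1
        exit_code = run_script(task.name, node)
        if exit_code == 0:
            report[task.name] = ("SUCCEEDED", node, task.attempts)
        else:
            # The failure is fed back to the controller, which re-queues
            # the script so that it lands on another node next time.
            task.failed_nodes.add(node)
            pending.append(task)
    return report


def fake_run(script, node):
    # Assumed behaviour for the demo: "suite-2" always fails on "node-a".
    return 1 if (script, node) == ("suite-2", "node-a") else 0


report = run_controller(["suite-1", "suite-2"], ["node-a", "node-b"], fake_run)
for name, (status, node, attempts) in report.items():
    print(name, status, node, attempts)
```

In a real ApplicationMaster the node choice would be replaced by a container request to the RM, and the exit code would come from the completed container's status rather than from a direct function call.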