I would be interested in seeing the objectives, constraints,
inputs, etc. guiding this effort.  Depending on these, metascheduling
can be dead simple or impossible.
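
To make the "dead simple" end concrete: with a single objective (shortest
estimated wait) and a single constraint (enough free cores), choosing a cluster
is just a filter followed by a minimum. This is only a sketch. The `Cluster`
fields, the `pick_cluster` name, and the idea that wait estimates are available
as inputs are all assumptions for illustration, not anything taken from the
project.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    # All fields are hypothetical inputs a metascheduler might be given.
    name: str
    free_cores: int
    est_wait_s: float  # estimated queue wait in seconds


def pick_cluster(clusters: Sequence[Cluster], cores_needed: int) -> Optional[Cluster]:
    """Return the eligible cluster with the shortest estimated wait, or None."""
    # Constraint: the cluster must have enough free cores for the job.
    eligible = [c for c in clusters if c.free_cores >= cores_needed]
    # Objective: minimize estimated wait; None when nothing satisfies the constraint.
    return min(eligible, key=lambda c: c.est_wait_s, default=None)
```

Once the objective depends on data movement, allocations, or wait estimates
that are not trustworthy, this simple form no longer applies.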


On Fri, Sep 23, 2016 at 11:00:57PM +0000, Shenoy, Gourav Ganesh wrote:
> My understanding is that we will provide Airavata the capability to run and 
> manage jobs across different clusters. Suresh can confirm that.
> Thanks and Regards,
> Gourav Shenoy 
> On 9/23/16, 4:47 PM, "K Yoshimoto" <kenn...@sdsc.edu> wrote:
>     What do you mean by "meta-scheduler" here?  Are you trying to
>     coordinate running of jobs across or amongst a number of different
>     clusters?
>     On Fri, Sep 23, 2016 at 08:43:19PM +0000, Shenoy, Gourav Ganesh wrote:
>     > Hi Dev,
>     > 
>     > I am working on a project to build a Mesos-based meta-scheduler
>     > for Airavata, along with Shameera & Mangirish. Here is the JIRA link:
>     > https://issues.apache.org/jira/browse/AIRAVATA-2082.
>     > 
>     > 
>     > ·         We have identified the tasks needed to achieve this. At a
>     > high level, they are:
>     > 
>     > 1.       Resource provisioning – We need to provision resources on
>     > cloud and HPC infrastructures such as EC2, Jetstream, and Comet.
>     > 
>     > 2.       Building a cluster – Deploying a Mesos cluster on the set of
>     > nodes obtained from (1) above for task management.
>     > 
>     > 3.       Selecting a scheduler – We need to investigate which scheduler
>     > to use with the Mesos cluster. Options include Marathon and Aurora, but
>     > we need one that can run both serial and parallel (MPI) jobs.
>     > 
>     > 4.       Installing & running applications on this cluster – Once the
>     > cluster has been deployed and a scheduler chosen, we need to be able to
>     > install and run applications on this cluster using Airavata.
>     > 
>     > 
>     > ·         So far, we have looked into the following:
>     > 
>     > o    Resource provisioning:
>     > 
>     > §  We explored several options for provisioning resources: using cloud
>     > libraries as well as Ansible scripts.
>     > 
>     > §  We built an OpenStack4J Java module that provisions instances
>     > on OpenStack-based clouds (e.g., Jetstream).
>     > 
>     > §  We also built a CloudBridge Python module for provisioning EC2
>     > instances on Amazon. CloudBridge can also be used to provision instances
>     > on OpenStack.
>     > 
>     > §  We wrote Ansible scripts for bringing up instances on both AWS and
>     > OpenStack-based clouds.
>     > 
>     > 
>     > §  Key points: CloudBridge and OpenStack4J are powerful libraries for
>     > resource provisioning, but currently they provision one instance at a
>     > time and do not support templated boot options such as CloudFormation
>     > (for AWS) and Heat (for OpenStack).
>     > 
>     > 
>     > o    Building a cluster:
>     > 
>     > §  We wrote an Ansible script for deploying a Mesos-Marathon cluster on a
>     > set of nodes. This script installs the necessary dependencies, such as
>     > ZooKeeper.
>     > 
>     > §  We tested this on OpenStack-based clouds and on EC2.
>     > 
>     > §  OpenStack Magnum provides excellent support for resource
>     > provisioning and deploying a Mesos cluster, but we are running into some
>     > problems while trying it.
>     > 
>     > 
>     > o    Installing a scheduler:
>     > 
>     > §  Our Ansible script currently installs Marathon as the scheduler
>     > on Mesos. We haven’t yet submitted jobs using Marathon.
>     > 
>     > 
>     > ·         Although this is not finalized, we are inclined towards the
>     > Ansible approach for the above, as Ansible also provides Python APIs,
>     > which will allow us to integrate it with Airavata via Thrift. We would
>     > then be able to invoke the Ansible scripts from code without needing to
>     > use the command-line interface.
>     > 
>     > 
>     > ·         We are also working on the following items:
>     > 
>     > o    Exploring options to provision and deploy a Mesos-Marathon cluster
>     > on HPC systems such as Comet. The challenge would be to use Ansible to
>     > provision resources and deploy the cluster. Once we have a cluster, we can
>     > try running applications.
>     > 
>     > o    Exploring different scheduler options for running serial and
>     > parallel (MPI) jobs on such heterogeneous clusters.
>     > 
>     > o    Exploring orchestration options such as OpenStack Heat, AWS
>     > CloudFormation, and OpenStack Magnum.
>     > 
>     > Any suggestions and comments are highly appreciated.
>     > 
>     > Thanks and Regards,
>     > Gourav Shenoy
>     > 
>     > 
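
On the point above that no jobs have been submitted through Marathon yet: a
first test submission could be sketched as below, using Marathon's REST
endpoint `POST /v2/apps` with a JSON app definition. The host name, the app id,
and the helper function names are hypothetical. Note that Marathon is built for
long-running services and relaunches a command that exits, so this is only a
smoke test and not a model for batch or MPI jobs.

```python
import json
import urllib.request


def build_marathon_app(app_id, cmd, cpus=0.1, mem_mb=32, instances=1):
    """Return a minimal Marathon app definition as a plain dict."""
    return {
        "id": app_id,          # hypothetical app id, e.g. "/airavata-echo"
        "cmd": cmd,            # shell command Marathon runs on an agent
        "cpus": cpus,          # CPU shares requested from Mesos
        "mem": mem_mb,         # memory in MB requested from Mesos
        "instances": instances,
    }


def build_submit_request(marathon_url, app):
    """Build (but do not send) the HTTP request that creates the app."""
    return urllib.request.Request(
        url=marathon_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(marathon_url, app, timeout=10):
    """Send the request to a running Marathon and return its JSON reply."""
    req = build_submit_request(marathon_url, app)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `submit("http://marathon.example.org:8080", build_marathon_app("/airavata-echo", "echo hello && sleep 60"))`
would need a reachable Marathon instance; the URL here is a placeholder.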
