Hello Devs,

Thanks Gourav and Shameera for all the work w.r.t. setting up the
Mesos-Marathon cluster on Jetstream.

I am currently evaluating MPICH (http://www.mpich.org/about/overview/) for
launching MPI jobs on top of Mesos. MPICH version 1.2 supports
Mesos-based MPI scheduling. I have also been trying to submit jobs to the
cluster through Marathon. However, in both cases I am currently facing
issues, which I am working to resolve.

I am compiling my notes in the following Google Doc. Please review it and
let me know your comments and suggestions.

https://docs.google.com/document/d/1p_Y4Zd4I4lgt264IHspXJli3la25y6bcPcmrTD6nR8g/edit?usp=sharing

Thanks and Regards,
Mangirish Wagle



On Wed, Sep 21, 2016 at 3:20 PM, Shenoy, Gourav Ganesh
<goshe...@indiana.edu> wrote:

> Hi Mangirish,
>
>
>
> I have set up a Mesos-Marathon cluster for you on Jetstream. I will share
> the cluster details with you in a separate email. Kindly note that
> there are 3 masters & 2 slaves in this cluster.
>
>
>
> I am also working on automating this process for Jetstream (similar to
> Shameera’s Ansible script for EC2). When that is ready, we can create
> clusters or add/remove slave machines from the cluster.
>
>
>
> Thanks and Regards,
>
> Gourav Shenoy
>
>
>
> *From: *Mangirish Wagle <vaglomangir...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Date: *Wednesday, September 21, 2016 at 2:36 PM
> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Subject: *Running MPI jobs on Mesos based clusters
>
>
>
> Hello All,
>
>
>
> I would like to make everyone aware of the study that I am undertaking
> this fall: an evaluation of the frameworks that would facilitate MPI
> jobs on Mesos-based clusters for Apache Airavata.
>
>
>
> Some of the options that I am looking at are:
>
>    1. MPI support framework bundled with Mesos
>    2. Apache Aurora
>    3. Marathon
>    4. Chronos
>
> Some of the evaluation criteria on which I am planning to base my
> investigation are:
>
>    - Ease of setup
>    - Documentation
>    - Reliability features like HA
>    - Scaling and Fault recovery
>    - Performance
>    - Community Support
>
> Gourav and Shameera are working on Ansible-based automation to spin up a
> Mesos-based cluster, and I am planning to use it to set up a cluster for
> experimentation.
>
>
>
> Any suggestions or information about prior work on this would be highly
> appreciated.
>
>
>
> Thank you.
>
>
>
> Best Regards,
>
> Mangirish Wagle
>
>
