Re: Metascheduler work

2018-12-17 Thread Christie, Marcus Aaron
Hi Dimuthu,

> On Dec 17, 2018, at 9:55 AM, DImuthu Upeksha  
> wrote:
> 
> Metascheduling Usecase 3: Users within a gateway need to fairly use a 
> community account. Computational resources like XSEDE enforce fair-share 
> across users, but since gateway job submissions are funneled through a single 
> community account, different users within a gateway are impacted. Airavata 
> will need to implement fair-share scheduling among these users to ensure fair 
> use of allocations, respect allowable queue limits, and work within resource 
> policies.
> 
> Comments
> 1. Is there any existing work regarding this use case? I remember that two 
> students created a UI to enforce these limits and save them in a database. 
> 

I'm guessing that fair share in the metascheduling sense is a little different 
from the allocation manager work done by IU students in the spring. Here, fair 
share would mean scheduling jobs so that no single user gets a disproportionate 
share of the running jobs. For example, suppose a queue only allows 50 jobs per 
user and all jobs from a gateway are submitted under a single community account 
user. If gateway user A submits 50 jobs and gateway users B and C submit 1 job 
each, you would want to avoid the situation where all 50 of user A's jobs are 
submitted while B's and C's jobs wait for them to complete before reaching the 
queue. Instead, you would rather submit user B's and user C's jobs along with 
48 of user A's jobs.
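
A minimal sketch of what that selection could look like: round-robin across 
gateway users, capped at the queue's per-user job limit. All class and method 
names here are hypothetical, not existing Airavata code:

// Hypothetical sketch: pick the next batch of jobs to submit under the
// community account, round-robin across gateway users, capped at the
// cluster queue's per-user job limit (e.g. 50).
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class FairShareSelector {

    /** jobsByUser: pending job ids per gateway user, in submission order. */
    List<String> selectBatch(Map<String, Queue<String>> jobsByUser, int queueLimit) {
        List<String> batch = new ArrayList<>(queueLimit);
        boolean progress = true;
        // Take one job from each user per round until the limit is reached
        // or no pending jobs remain.
        while (batch.size() < queueLimit && progress) {
            progress = false;
            for (Queue<String> pending : jobsByUser.values()) {
                if (batch.size() == queueLimit) break;
                String job = pending.poll();
                if (job != null) {
                    batch.add(job);
                    progress = true;
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        Map<String, Queue<String>> jobs = new LinkedHashMap<>();
        Queue<String> a = new ArrayDeque<>();
        for (int i = 1; i <= 50; i++) a.add("A-" + i);
        jobs.put("userA", a);
        jobs.put("userB", new ArrayDeque<>(List.of("B-1")));
        jobs.put("userC", new ArrayDeque<>(List.of("C-1")));
        // With a 50-job queue limit this selects B-1, C-1 and 48 of A's jobs.
        System.out.println(new FairShareSelector().selectBatch(jobs, 50));
    }
}

With the numbers from the example, this selects user B's and user C's jobs plus 
48 of user A's jobs, which is the behavior we would want from the metascheduler.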



Re: Metascheduler work

2018-12-17 Thread Pankaj Saha
The solution I provided to run MPI tasks with Mesos was with the help of
Docker Swarm. It required me to develop a Mesos Framework and an Executor
[here]. Mesos containers could not communicate with each other, and that was
the reason to bring Docker Swarm into the picture. However, Mesos containers
can now communicate with the help of Calico [here]. Calico provides an overlay
network, and each Mesos container gets a unique IP address.
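
As a rough illustration (assuming the unified Mesos containerizer with Calico 
set up as a CNI network on the agents, and assuming the CNI network is named 
"calico"; adjust to whatever the actual CNI config uses), a framework would 
attach its tasks to that network roughly like this:

import org.apache.mesos.Protos;

class CalicoNetworkExample {

    // Build a ContainerInfo that asks the unified containerizer to attach the
    // task to the CNI network named "calico" (the name is an assumption; it
    // must match the CNI config installed on the Mesos agents).
    static Protos.ContainerInfo calicoContainerInfo() {
        return Protos.ContainerInfo.newBuilder()
                .setType(Protos.ContainerInfo.Type.MESOS)
                .addNetworkInfos(Protos.NetworkInfo.newBuilder()
                        .setName("calico"))
                .build();
    }
}

Each task launched with such a ContainerInfo gets its own Calico-assigned IP 
address, which is what allows the containers to talk to each other directly.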

Thanks
Pankaj





On Mon, Dec 17, 2018 at 12:13 PM Pamidighantam, Sudhakar V <
spami...@illinois.edu> wrote:

> I thought Pankaj presented a way to implement MPI executions through mesos
> more recently.
>
> He may want to comment on this..
>
>
>
> Thanks,
>
> Sudhakar.
>
>
>
> *From: *Marlon Pierce 
> *Reply-To: *
> *Date: *Monday, December 17, 2018 at 8:52 AM
> *To: *"dev@airavata.apache.org" 
> *Subject: *Re: Metascheduler work
>
>
>
> Hi Dimuthu,
>
>
>
> This is something we should re-evaluate. Mangirish Wangle looked at Mesos
> integration with Airavata back in 2016, but he ultimately ran into many
> difficulties, including getting MPI jobs to work, if I recall correctly.
>
>
>
> Marlon
>
>
>
>
>
> *From: *"dimuthu.upeks...@gmail.com" 
> *Reply-To: *dev 
> *Date: *Sunday, December 16, 2018 at 7:30 AM
> *To: *dev 
> *Subject: *Metascheduler work
>
>
>
> Hi Folks,
>
>
>
> I found this [1] mail thread and the JIRA ticket [2], which discuss coming up
> with an Airavata-specific job scheduler. At the end of the discussion, it
> seems like an approach based on Mesos was chosen to try out. Are there any
> other discussions or documents regarding this topic? Has anyone worked on
> this, and if so, where are the code / design documents?
>
>
>
> [1]
> https://markmail.org/message/tdae5y3togyq4duv#query:+page:1+mid:tdae5y3togyq4duv+state:results
>
> [2] https://issues.apache.org/jira/browse/AIRAVATA-1436
>
>
>
> Thanks
>
> Dimuthu
>


Re: Metascheduler work

2018-12-17 Thread Pamidighantam, Sudhakar V
I thought Pankaj presented a way to implement MPI executions through Mesos more 
recently.
He may want to comment on this.

Thanks,
Sudhakar.

From: Marlon Pierce 
Reply-To: 
Date: Monday, December 17, 2018 at 8:52 AM
To: "dev@airavata.apache.org" 
Subject: Re: Metascheduler work

Hi Dimuthu,

This is something we should re-evaluate. Mangirish Wangle looked at Mesos 
integration with Airavata back in 2016, but he ultimately ran into many 
difficulties, including getting MPI jobs to work, if I recall correctly.

Marlon


From: "dimuthu.upeks...@gmail.com" 
Reply-To: dev 
Date: Sunday, December 16, 2018 at 7:30 AM
To: dev 
Subject: Metascheduler work

Hi Folks,

I found this [1] mail thread and the JIRA ticket [2], which discuss coming up 
with an Airavata-specific job scheduler. At the end of the discussion, it seems 
like an approach based on Mesos was chosen to try out. Are there any other 
discussions or documents regarding this topic? Has anyone worked on this, and 
if so, where are the code / design documents?

[1] 
https://markmail.org/message/tdae5y3togyq4duv#query:+page:1+mid:tdae5y3togyq4duv+state:results
[2] https://issues.apache.org/jira/browse/AIRAVATA-1436

Thanks
Dimuthu


Re: Metascheduler work

2018-12-17 Thread DImuthu Upeksha
Hi Marlon,

Thanks for the clarification. Yes, I agree that we need to re-evaluate both
the use cases and the approach we are going to use to implement this.

Hi All,

Considering the current requirements of gateways, we need a better job
throttling/scheduling proxy in front of the actual schedulers. I will simply
copy and paste the existing use cases mentioned in the wiki; please share your
views on the validity of these use cases for current gateways. Once we have
come up with a concrete set of requirements, we can think about the technical
side of the solution. I have added my point of view on each use case in the
comments section.

*Metascheduling Usecase 1: Users/Gateways submit a series of jobs to a
resource, where the resource enforces a per-user job limit within a queue to
ensure fair use of the clusters (example: Stampede allows 50 jobs per user in
the normal queue). Airavata will need to implement queues and throttle jobs,
respecting the max-jobs-per-queue limits of an underlying resource queue.*

Comments:
1. What will happen to the pending jobs in the Airavata queue? Will they
stay in the queue forever until the queues in the clusters become available
or is there an expiry?
2. Are we assuming that Airavata is the only entry point for the queues in
the clusters or can there be some other gateways/ independent users
submitting jobs to the same queue?
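
Regarding Usecase 1, here is a rough sketch of the kind of per-queue throttle I
am imagining. All names are hypothetical and nothing here is existing Airavata
code; it simply holds jobs on our side and releases them only while the
cluster-side limit is not exceeded:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

class QueueThrottle {

    // Number of jobs we are allowed to have in the cluster queue at once,
    // e.g. 50 for Stampede's normal queue.
    private final Semaphore slots;
    private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();

    QueueThrottle(int maxJobsPerQueue) {
        this.slots = new Semaphore(maxJobsPerQueue);
    }

    // Enqueue a job submission; it is released only when a cluster slot is free.
    void submit(Runnable submitToCluster) {
        pending.add(submitToCluster);
    }

    // Called by the job monitor when a previously submitted job finishes.
    void jobCompleted() {
        slots.release();
    }

    // Dispatcher loop: pushes pending jobs to the resource as slots free up.
    void runDispatcher() throws InterruptedException {
        while (true) {
            Runnable next = pending.take();   // wait for a pending job
            slots.acquire();                  // wait for room in the cluster queue
            next.run();                       // actual submission to the resource
        }
    }
}

This also makes comment 1 above concrete: jobs sit in the Airavata-side queue
indefinitely unless we add an expiry on top of this.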

*Metascheduling Usecase 2: Users/Gateways delegate job scheduling across
available computational resources to Airavata. Airavata will need to
implement schedulers that are aware of the existing loads on the clusters and
spread jobs efficiently. The scheduler should also have access to heuristics
from previous executions and to current requirements, which include job size
(number of nodes/cores), memory requirements, wall time estimates and so
forth.*

Comments:
1. Even though the first part makes sense to me, the second part looks more
like a machine learning problem and a nice-to-have feature. Is this something
like: a user comes into the portal and launches an application with a set of
inputs, and Airavata decides on which machine / cluster the job should run?
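
For the first part of Usecase 2, the simplest form I can think of is picking
the resource with the lowest current utilization, something like the sketch
below (hypothetical names; the load numbers would have to come from whatever
monitoring data we can gather):

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class LoadAwareScheduler {

    // Snapshot of how busy a compute resource currently is.
    static class ClusterLoad {
        final String resourceId;
        final int runningJobs;
        final int maxJobs;

        ClusterLoad(String resourceId, int runningJobs, int maxJobs) {
            this.resourceId = resourceId;
            this.runningJobs = runningJobs;
            this.maxJobs = maxJobs;
        }

        double utilization() {
            return (double) runningJobs / maxJobs;
        }
    }

    // Choose the cluster with the lowest utilization that still has free slots.
    Optional<String> pickResource(List<ClusterLoad> clusters) {
        return clusters.stream()
                .filter(c -> c.runningJobs < c.maxJobs)
                .min(Comparator.comparingDouble(ClusterLoad::utilization))
                .map(c -> c.resourceId);
    }
}

The heuristics-based part (wall time estimates, memory requirements and so on)
could later feed into the same selection as additional filters or weights.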

*Metascheduling Usecase 3: Users within a gateway need to fairly use a
community account. Computational resources like XSEDE enforce fair-share
across users, but since gateway job submissions are funneled through a
single community account, different users within a gateway are impacted.
Airavata will need to implement fair-share scheduling among these users to
ensure fair use of allocations, respect allowable queue limits, and work
within resource policies.*

Comments
1. Is there any existing work regarding this use case? I remember that two
students created a UI to enforce these limits and save them in a database.

Thanks
Dimuthu

On Mon, Dec 17, 2018 at 7:22 PM Pierce, Marlon  wrote:

> Hi Dimuthu,
>
>
>
> This is something we should re-evaluate. Mangirish Wangle looked at Mesos
> integration with Airavata back in 2016, but he ultimately ran into many
> difficulties, including getting MPI jobs to work, if I recall correctly.
>
>
>
> Marlon
>
>
>
>
>
> *From: *"dimuthu.upeks...@gmail.com" 
> *Reply-To: *dev 
> *Date: *Sunday, December 16, 2018 at 7:30 AM
> *To: *dev 
> *Subject: *Metascheduler work
>
>
>
> Hi Folks,
>
>
>
> I found this [1] mail thread and the JIRA ticket [2], which discuss coming up
> with an Airavata-specific job scheduler. At the end of the discussion, it
> seems like an approach based on Mesos was chosen to try out. Are there any
> other discussions or documents regarding this topic? Has anyone worked on
> this, and if so, where are the code / design documents?
>
>
>
> [1]
> https://markmail.org/message/tdae5y3togyq4duv#query:+page:1+mid:tdae5y3togyq4duv+state:results
>
> [2] https://issues.apache.org/jira/browse/AIRAVATA-1436
>
>
>
> Thanks
>
> Dimuthu
>


Re: Metascheduler work

2018-12-17 Thread Pierce, Marlon
Hi Dimuthu,

 

This is something we should re-evaluate. Mangirish Wangle looked at Mesos 
integration with Airavata back in 2016, but he ultimately ran into many 
difficulties, including getting MPI jobs to work, if I recall correctly.

 

Marlon

 

 

From: "dimuthu.upeks...@gmail.com" 
Reply-To: dev 
Date: Sunday, December 16, 2018 at 7:30 AM
To: dev 
Subject: Metascheduler work

 

Hi Folks, 

 

I found this [1] mail thread and the JIRA ticket [2], which discuss coming up 
with an Airavata-specific job scheduler. At the end of the discussion, it seems 
like an approach based on Mesos was chosen to try out. Are there any other 
discussions or documents regarding this topic? Has anyone worked on this, and 
if so, where are the code / design documents?

 

[1] 
https://markmail.org/message/tdae5y3togyq4duv#query:+page:1+mid:tdae5y3togyq4duv+state:results

[2] https://issues.apache.org/jira/browse/AIRAVATA-1436

 

Thanks

Dimuthu





Re: Unused modules

2018-12-17 Thread Christie, Marcus Aaron
Hi Dimuthu,

Yes, it's loaded at runtime. See the db_event_manager property in 
airavata-server.properties.  See also [1].

Thanks,

Marcus

[1] 
https://github.com/apache/airavata/blob/33a601fe84d297b11171a1157a2561a451ad9d84/modules/server/src/main/java/org/apache/airavata/server/ServerMain.java#L126-L126
 



> On Dec 16, 2018, at 3:57 AM, DImuthu Upeksha  
> wrote:
> 
> Hi Marcus,
> 
> Thanks for raising this issue. Is that a runtime dependency? For me, 
> everything compiled without db-event-manager [1]. Can you point me to the 
> place where we are using that?
> 
> [1] https://travis-ci.org/apache/airavata/builds/465830628 
> 
> 
> Thanks,
> Dimuthu
> 
> On Wed, Dec 12, 2018 at 6:22 AM Christie, Marcus Aaron wrote:
> Hi Dimuthu,
> 
> Thanks for cleaning things up!  However, I'm pretty sure we're still using 
> db-event-manager to manage our event-based synchronization between Airavata 
> services.
> 
> The rest looks good to be removed though.
> 
> Thanks,
> 
> Marcus
> 
>> On Dec 10, 2018, at 1:08 AM, DImuthu Upeksha wrote:
>> 
>> Hi Folks,
>> 
>> I removed the following modules on the staging branch [1] and created a separate 
>> branch named archive to keep the old changes. 
>> 
>> allocation-manager
>> cloud
>> db-event-manager
>> gfac
>> integration-tests
>> monitoring
>> test-suite
>> workflow
>> workflow-model
>> xbaya
>> xbaya-gui
>> 
>> Following modules were kept as there are some dependencies to other modules
>> 
>> compute-account-provisioning
>> configuration
>> security
>> 
>> The updated repository was tested in the Testing environment and everything 
>> seems to be running smoothly. I will redeploy the Staging setup in a few days 
>> and merge the changes to the develop branch as well.
>> 
>> Thanks 
>> Dimuthu
>> 
>> [1] https://github.com/apache/airavata/tree/staging/modules 
>> 
>> 
>> On Fri, Nov 30, 2018 at 8:32 PM Suresh Marru wrote:
>> Hi Sudhakar,
>> 
>> The allocation manager from last year's contributions is here: 
>> https://github.com/apache/airavata-sandbox/tree/master/allocation-manager 
>> The one Dimuthu is suggesting to clean up is a stale one. 
>> 
>> I think we should turn the allocation manager into a larger goal of 
>> enforcing quotas, mainly for user storage, and probably take it on as soon 
>> as possible. 
>> 
>> Cheers,
>> Suresh
>> 
>>> On Nov 30, 2018, at 8:37 AM, Pamidighantam, Sudhakar V <spami...@illinois.edu> wrote:
>>> 
>>> What is the estimated timeline for enforceable allocation management to be 
>>> available in Airavata, 2019, 2020?
>>>  
>>> Thanks,
>>> Sudhakar.
>>>  
>>> From: DImuthu Upeksha
>>> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>>> Date: Friday, November 30, 2018 at 8:30 AM
>>> To: "dev@airavata.apache.org" <dev@airavata.apache.org>
>>> Subject: Re: Unused modules
>>>  
>>> Hi Suresh,
>>>  
>>> +1 for removing gfac modules as well 
>>>  
>>> Dimuthu
>>>  
>>> On Fri, Nov 30, 2018 at 6:32 PM Apache Airavata wrote:
>>> +1 to remove all of them. While you are at it, should we also remove gfac 
>>> modules from develop and staging branches?
>>>  
>>> Suresh
>>>  
>>> 
>>> On Nov 30, 2018, at 6:44 AM, DImuthu Upeksha wrote:
>>> 
>>> Hi Folks, 
>>>  
>>> I can see that some modules [1] are no longer being used or actively 
>>> developed. 
>>>  
>>> allocation-manager
>>> cloud
>>> compute-account-provisioning
>>> configuration
>>> db-event-manager
>>> integration-tests
>>> monitoring
>>> security
>>> workflow
>>> workflow-model
>>> xbaya
>>> xbaya-gui
>>>  
>>> I'm suggesting that we remove these unused modules, as they affect the build 
>>> time and the clarity of the code. Any objections / suggestions?
>>>  
>>> [1] https://github.com/apache/airavata/tree/staging/modules 
>>> 
>>>  
>>> Thanks
>>> Dimuthu
>> 
> 


