Jenkins build is back to stable : airavata-dev » Airavata Registry Core #748

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : airavata-dev #748

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev #747

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev » Airavata Registry Core #747

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: airavata-dev #746

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev » Airavata Registry Core #745

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: airavata-dev #745

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev #744

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev #742

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev » Airavata Registry Core #742

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build became unstable: airavata-dev » Airavata Registry Core #741

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build became unstable: airavata-dev #741

2016-10-13 Thread Apache Jenkins Server
See 



Pull request for Registry Refactoring changes

2016-10-13 Thread Abhijit Karanjkar
Hi Suresh/Supun,
I have created a pull request for the changes regarding Workflow Catalog as
well as App catalog repository.

Thanks & regards,
Abhijit Karanjkar


Jenkins build is back to stable : airavata-dev #740

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is back to stable : airavata-dev » Airavata Registry Core #740

2016-10-13 Thread Apache Jenkins Server
See 




Re: Questions

2016-10-13 Thread Eroma Abeysinghe
Hi Zhong,

Please check my answers in line.

On Wed, Oct 12, 2016 at 4:20 PM, Zhong Wang  wrote:

> After I use the Airavata platform, I do have 4 questions.
>
>
>
> 1)  Can I lock the setting values for some properties after we
> customize these values for a specified compute resource, e.g. Queue*,
> Node Count, Total Core and Wall Time limit. We don’t want the users to
> change these settings for the specified compute resource.
>
You can set these values globally (meaning across all compute resources,
in the pga_config.php file), but users can change them at the time they create
their experiments.

For your requirement, could you please create a JIRA at
https://issues.apache.org/jira/login.jsp?os_destination=
%2Fbrowse%2FAIRAVATA-2158%3Fjql%3Dproject%2520%253D%2520AIRAVATA
You might need to create a username if you don't already have one.

>
>
>
>
> 2)  Does the Airavata have a function to delete the experiments? I
> made many test cases, but I don’t want to see them anymore.
>
In our web-based gateway portal we don't have a delete-experiment feature.
The requirement we initially worked on was that researchers don't want to lose
their work/experiments.


>
>
> 3)  Can the Airavata show output log from the SLURM or PBS jobs on
> the web page directly?
>
No. Since the PGA is generic and many applications from different
streams are run, such a requirement has not come up.
>
>
>
> 4)  What is the meaning of “Data is Staged” in App Input Fields in
> the UI of Edit Application Interface?
>
It's a sort of futuristic feature. At the moment, irrespective of the
setting in the Application interface, data is staged.

>
>
>
>
> Thanks,
>
>
>
> Zhong Wang
>



-- 
Thank You,
Best Regards,
Eroma


Re: Running MPI jobs on Mesos based clusters

2016-10-13 Thread Mangirish Wagle
Hi Marlon,
Thanks for confirming and sharing the legal link.

-Mangirish

On Thu, Oct 13, 2016 at 12:13 PM, Pierce, Marlon  wrote:

> BSD is ok: https://www.apache.org/legal/resolved.
>
>
>
> *From: *Mangirish Wagle 
> *Reply-To: *"dev@airavata.apache.org" 
> *Date: *Thursday, October 13, 2016 at 12:03 PM
> *To: *"dev@airavata.apache.org" 
> *Subject: *Re: Running MPI jobs on Mesos based clusters
>
>
>
> Hello Devs,
>
> I needed some advice on the license of the MPI libraries. The MPICH
> library that I have been trying claims to have a "BSD Like" license (
> http://git.mpich.org/mpich.git/blob/HEAD:/COPYRIGHT).
>
> I am aware that OpenMPI which uses BSD license is currently used in our
> application. I had chosen to start investigating MPICH because it claims to
> be a highly portable and high quality implementation of latest MPI
> standard, suitable to cloud based clusters.
>
> If anyone could please advise on whether the MPICH library's "BSD Like"
> license is acceptable for ASF, that would help.
>
> Thank you.
>
> Best Regards,
>
> Mangirish Wagle
>
>
>
> On Thu, Oct 6, 2016 at 1:48 AM, Mangirish Wagle 
> wrote:
>
> Hello Devs,
>
>
>
> The network issue mentioned above now stands resolved. The problem was
> that iptables had some conflicting rules which blocked the traffic. It
> was resolved by a simple iptables flush.
>
>
>
> Here is the test MPI program running on multiple machines:-
>
>
>
> [centos@mesos-slave-1 ~]$ mpiexec -f machinefile -n 2 ./mpitest
>
> Hello world!  I am process number: 0 on host mesos-slave-1
>
> Hello world!  I am process number: 1 on host mesos-slave-2
>
>
>
> The next step is to try invoking this through a framework like Marathon.
> However, the job submission still does not run through Marathon. It seems
> to get stuck in the 'waiting' state forever (for example
> http://149.165.170.245:8080/ui/#/apps/%2Fmaw-try). Further, I notice that
> Marathon is listed under 'inactive frameworks' in the mesos dashboard (
> http://149.165.171.33:5050/#/frameworks).
>
>
>
> I am trying to get this working, though any help/ clues with this would be
> really helpful.
>
>
>
> Thanks and Regards,
>
> Mangirish Wagle
>
>
>
>
> On Fri, Sep 30, 2016 at 9:21 PM, Mangirish Wagle 
> wrote:
>
> Hello Devs,
>
>
>
> I am currently running a sample MPI C program using 'mpiexec' provided by
> MPICH. I followed their installation guide to install the libraries on the
> master and slave nodes of the mesos cluster.
>
>
>
> The approach that I am trying out here is that I am equipping the
> underlying nodes with MPI handling tools and then use the Mesos framework
> like Marathon/ Aurora to submit jobs to run MPI programs by invoking these
> tools.
>
>
>
> You can potentially run an MPI program using mpiexec in the following
> manner:-
>
>
>
> # *mpiexec -f machinefile -n 2 ./mpitest*
>
>    - *machinefile* -> File which contains an inventory of machines to run
>      the program on and the number of processes on each machine.
>    - *mpitest* -> MPI program compiled in C using the mpicc compiler. The
>      program returns the process number and the hostname of the machine
>      running the process.
>    - *-n* option indicates the number of processes that it needs to spawn
>
> Example of machinefile contents:-
>
>
>
> # Entries in the format :
>
> mesos-slave-1:1
>
> mesos-slave-2:1
>
>
>
> The reason for choosing slaves is that Mesos runs the jobs on slaves,
> managed by 'agents' pertaining to the slaves.
>
>
>
> Output of the program with '-n 1':-
>
>
>
> # mpiexec -f machinefile -n 1 ./mpitest
>
> Hello world!  I am process number: 0 on host mesos-slave-1
>
>
>
> But when I try for '-n 2', I am hitting the following error:-
>
>
>
> # mpiexec -f machinefile -n 2 ./mpitest
>
> [proxy:0:1@mesos-slave-2] HYDU_sock_connect (/home/centos/mpich-3.2/src/
> pm/hydra/utils/sock/sock.c:172): unable to connect from "mesos-slave-2"
> to "mesos-slave-1" (No route to host)
>
> [proxy:0:1@mesos-slave-2] main (/home/centos/mpich-3.2/src/
> pm/hydra/pm/pmiserv/pmip.c:189): *unable to connect to server
> mesos-slave-1 at port 44788* (check for firewalls!)
>
>
>
> It seems to not allow the program execution due to network traffic being
> blocked. I checked security groups in scigap openstack for mesos-slave-1,
> mesos-slave-2 nodes and it is set to 'wideopen' policy. Furthermore, I
> tried adding explicit rules to the policies to allow all TCP and UDP
> (Currently I am not sure what protocol is used underneath), even then it
> continues throwing this error.
>
>
>
> Any clues, suggestions, comments about the error or approach as a whole
> would be helpful.
>
>
>
> Thanks and Regards,
>
> Mangirish Wagle
>
>
>
>
>
>
> On Tue, Sep 27, 2016 at 11:23 AM, Mangirish Wagle <
> vaglomangir...@gmail.com> 

Jenkins build is still unstable: airavata-dev » Airavata Registry Core #739

2016-10-13 Thread Apache Jenkins Server
See 




Re: Running MPI jobs on Mesos based clusters

2016-10-13 Thread Pierce, Marlon
BSD is ok: https://www.apache.org/legal/resolved. 

 

From: Mangirish Wagle 
Reply-To: "dev@airavata.apache.org" 
Date: Thursday, October 13, 2016 at 12:03 PM
To: "dev@airavata.apache.org" 
Subject: Re: Running MPI jobs on Mesos based clusters

 

Hello Devs,

I needed some advice on the license of the MPI libraries. The MPICH library 
that I have been trying claims to have a "BSD Like" license 
(http://git.mpich.org/mpich.git/blob/HEAD:/COPYRIGHT).

I am aware that OpenMPI which uses BSD license is currently used in our 
application. I had chosen to start investigating MPICH because it claims to be 
a highly portable and high quality implementation of latest MPI standard, 
suitable to cloud based clusters.

If anyone could please advise on whether the MPICH library's "BSD Like"
license is acceptable for ASF, that would help.

Thank you.

Best Regards,

Mangirish Wagle

 

On Thu, Oct 6, 2016 at 1:48 AM, Mangirish Wagle  
wrote:

Hello Devs, 

 

The network issue mentioned above now stands resolved. The problem was that
iptables had some conflicting rules which blocked the traffic. It was resolved
by a simple iptables flush.

 

Here is the test MPI program running on multiple machines:-

 

[centos@mesos-slave-1 ~]$ mpiexec -f machinefile -n 2 ./mpitest

Hello world!  I am process number: 0 on host mesos-slave-1

Hello world!  I am process number: 1 on host mesos-slave-2

 

The next step is to try invoking this through a framework like Marathon. However,
the job submission still does not run through Marathon. It seems to get stuck
in the 'waiting' state forever (for example
http://149.165.170.245:8080/ui/#/apps/%2Fmaw-try). Further, I notice that
Marathon is listed under 'inactive frameworks' in the mesos dashboard
(http://149.165.171.33:5050/#/frameworks).
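
For readers following the Marathon step, here is a minimal sketch of what an app definition for this job might look like. It only builds and prints the JSON and does not contact any server. The field names (id, cmd, cpus, mem, instances) are the basic Marathon app-definition fields. The resource values and the helper name marathon_app are assumptions made for illustration, and the app id is taken from the URL above. An app that stays in 'waiting' usually has not received a matching resource offer, which would be consistent with Marathon showing up as an inactive framework, though the thread does not confirm the cause.

```python
import json


def marathon_app(app_id, cmd, cpus=1.0, mem=128, instances=1):
    """Build a Marathon app definition as a plain dict (hypothetical helper).

    cpus/mem/instances defaults are placeholder values, not tuned settings.
    """
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": cpus,
        "mem": mem,
        "instances": instances,
    }


# The command is the one used in the thread; it assumes 'machinefile' and
# './mpitest' exist in the working directory of the agent that runs the task.
app = marathon_app("/maw-try", "mpiexec -f machinefile -n 2 ./mpitest")
print(json.dumps(app, indent=2))
```

The printed JSON is the kind of document that would be submitted to Marathon's apps endpoint; the cpus and mem requested here must fit within what a single agent offers, otherwise the app cannot be scheduled.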

 

I am trying to get this working, though any help/ clues with this would be 
really helpful.

 

Thanks and Regards,

Mangirish Wagle



 

On Fri, Sep 30, 2016 at 9:21 PM, Mangirish Wagle  
wrote:

Hello Devs, 

 

I am currently running a sample MPI C program using 'mpiexec' provided by 
MPICH. I followed their installation guide to install the libraries on the 
master and slave nodes of the mesos cluster.

 

The approach that I am trying out here is that I am equipping the underlying 
nodes with MPI handling tools and then use the Mesos framework like Marathon/ 
Aurora to submit jobs to run MPI programs by invoking these tools.

 

You can potentially run an MPI program using mpiexec in the following manner:-

 

# mpiexec -f machinefile -n 2 ./mpitest

- machinefile -> File which contains an inventory of machines to run the
  program on and the number of processes on each machine.
- mpitest -> MPI program compiled in C using the mpicc compiler. The program
  returns the process number and the hostname of the machine running the process.
- -n option indicates the number of processes that it needs to spawn

Example of machinefile contents:-

 

# Entries in the format :

mesos-slave-1:1

mesos-slave-2:1
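
As an illustration of the machinefile layout shown above, the following sketch reads such a file and adds up the process slots it declares. It assumes each entry is a hostname followed by a colon and a process count, as in the two sample lines, that lines starting with '#' are comments, and that a bare hostname counts as one process. The function names are hypothetical and are not part of MPICH.

```python
def parse_machinefile(text):
    """Parse 'host:count' lines into a list of (host, count) tuples."""
    entries = []
    for raw in text.splitlines():
        # Drop anything after '#' and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        host, sep, count = line.partition(":")
        # Assumption: a hostname without ':count' means one process.
        entries.append((host.strip(), int(count) if sep else 1))
    return entries


def total_slots(entries):
    """Sum the process counts over all hosts."""
    return sum(count for _, count in entries)


sample = """# Entries in the format :
mesos-slave-1:1
mesos-slave-2:1
"""
entries = parse_machinefile(sample)
print(entries)
print(total_slots(entries))
```

With the sample above there are two slots in total, which matches the '-n 2' run spreading one process onto each slave.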

 

The reason for choosing slaves is that Mesos runs the jobs on slaves, managed 
by 'agents' pertaining to the slaves.

 

Output of the program with '-n 1':-

 

# mpiexec -f machinefile -n 1 ./mpitest

Hello world!  I am process number: 0 on host mesos-slave-1

 

But when I try for '-n 2', I am hitting the following error:-

 

# mpiexec -f machinefile -n 2 ./mpitest

[proxy:0:1@mesos-slave-2] HYDU_sock_connect 
(/home/centos/mpich-3.2/src/pm/hydra/utils/sock/sock.c:172): unable to connect 
from "mesos-slave-2" to "mesos-slave-1" (No route to host)

[proxy:0:1@mesos-slave-2] main 
(/home/centos/mpich-3.2/src/pm/hydra/pm/pmiserv/pmip.c:189): unable to connect 
to server mesos-slave-1 at port 44788 (check for firewalls!)

 

It seems to not allow the program execution due to network traffic being 
blocked. I checked security groups in scigap openstack for mesos-slave-1, 
mesos-slave-2 nodes and it is set to 'wideopen' policy. Furthermore, I tried 
adding explicit rules to the policies to allow all TCP and UDP (Currently I am 
not sure what protocol is used underneath), even then it continues throwing 
this error.
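
One way to narrow down this kind of error is to test plain TCP reachability between the two nodes, independently of MPI. The sketch below is a hedged illustration rather than part of the original setup: can_connect is a hypothetical helper, and the demo checks a listener it opens itself on localhost so that it runs anywhere. Note that the port in the error message (44788) is chosen by mpiexec for that run, so probing it afterwards would not be meaningful; a port with a known listener on the target host is a better probe.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return (True, None) if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts, DNS failures and
        # "No route to host" style errors.
        return False, str(exc)


# Self-contained demo: listen on an ephemeral localhost port and probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(can_connect("127.0.0.1", demo_port))
listener.close()
```

Run from mesos-slave-2 against mesos-slave-1 and a port that is known to be listening there, a failure with "No route to host" while the security groups are open would point at host-level filtering on the node itself, which is what the iptables flush reported later in the thread addressed.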

 

Any clues, suggestions, comments about the error or approach as a whole would 
be helpful.

 

Thanks and Regards,

Mangirish Wagle

 


 

On Tue, Sep 27, 2016 at 11:23 AM, Mangirish Wagle  
wrote:

Hello Devs, 

 

Thanks Gourav and Shameera for all the work w.r.t. setting up the 
Mesos-Marathon cluster on Jetstream.

 

I am currently evaluating MPICH (http://www.mpich.org/about/overview/) to be
used for launching MPI jobs on top of mesos. MPICH version 1.2 supports Mesos
based MPI scheduling. I have also been trying to submit jobs to the cluster
through Marathon. However, in either case I am currently facing issues which I
am working to get resolved.

 

I am 

Re: Running MPI jobs on Mesos based clusters

2016-10-13 Thread Mangirish Wagle
Hello Devs,

I needed some advice on the license of the MPI libraries. The MPICH library
that I have been trying claims to have a "BSD Like" license (
http://git.mpich.org/mpich.git/blob/HEAD:/COPYRIGHT).

I am aware that OpenMPI which uses BSD license is currently used in our
application. I had chosen to start investigating MPICH because it claims to
be a highly portable and high quality implementation of latest MPI
standard, suitable to cloud based clusters.

If anyone could please advise on whether the MPICH library's "BSD Like"
license is acceptable for ASF, that would help.

Thank you.

Best Regards,
Mangirish Wagle

On Thu, Oct 6, 2016 at 1:48 AM, Mangirish Wagle 
wrote:

> Hello Devs,
>
> The network issue mentioned above now stands resolved. The problem was
> that iptables had some conflicting rules which blocked the traffic. It
> was resolved by a simple iptables flush.
>
> Here is the test MPI program running on multiple machines:-
>
> [centos@mesos-slave-1 ~]$ mpiexec -f machinefile -n 2 ./mpitest
> Hello world!  I am process number: 0 on host mesos-slave-1
> Hello world!  I am process number: 1 on host mesos-slave-2
>
> The next step is to try invoking this through a framework like Marathon.
> However, the job submission still does not run through Marathon. It seems
> to get stuck in the 'waiting' state forever (for example
> http://149.165.170.245:8080/ui/#/apps/%2Fmaw-try). Further, I notice that
> Marathon is listed under 'inactive frameworks' in the mesos dashboard (
> http://149.165.171.33:5050/#/frameworks).
>
> I am trying to get this working, though any help/ clues with this would be
> really helpful.
>
> Thanks and Regards,
> Mangirish Wagle
>
>
>
>
> On Fri, Sep 30, 2016 at 9:21 PM, Mangirish Wagle wrote:
>
>> Hello Devs,
>>
>> I am currently running a sample MPI C program using 'mpiexec' provided by
>> MPICH. I followed their installation guide to install the libraries on the
>> master and slave nodes of the mesos cluster.
>>
>> The approach that I am trying out here is that I am equipping the
>> underlying nodes with MPI handling tools and then use the Mesos framework
>> like Marathon/ Aurora to submit jobs to run MPI programs by invoking these
>> tools.
>>
>> You can potentially run an MPI program using mpiexec in the following
>> manner:-
>>
>> # *mpiexec -f machinefile -n 2 ./mpitest*
>>
>>    - *machinefile* -> File which contains an inventory of machines to
>>      run the program on and the number of processes on each machine.
>>    - *mpitest* -> MPI program compiled in C using the mpicc compiler. The
>>      program returns the process number and the hostname of the machine
>>      running the process.
>>    - *-n* option indicates the number of processes that it needs to spawn
>>
>> Example of machinefile contents:-
>>
>> # Entries in the format :
>> mesos-slave-1:1
>> mesos-slave-2:1
>>
>> The reason for choosing slaves is that Mesos runs the jobs on slaves,
>> managed by 'agents' pertaining to the slaves.
>>
>> Output of the program with '-n 1':-
>>
>> # mpiexec -f machinefile -n 1 ./mpitest
>> Hello world!  I am process number: 0 on host mesos-slave-1
>>
>> But when I try for '-n 2', I am hitting the following error:-
>>
>> # mpiexec -f machinefile -n 2 ./mpitest
>> [proxy:0:1@mesos-slave-2] HYDU_sock_connect
>> (/home/centos/mpich-3.2/src/pm/hydra/utils/sock/sock.c:172): unable to
>> connect from "mesos-slave-2" to "mesos-slave-1" (No route to host)
>> [proxy:0:1@mesos-slave-2] main 
>> (/home/centos/mpich-3.2/src/pm/hydra/pm/pmiserv/pmip.c:189):
>> *unable to connect to server mesos-slave-1 at port 44788* (check for
>> firewalls!)
>>
>> It seems to not allow the program execution due to network traffic being
>> blocked. I checked security groups in scigap openstack for mesos-slave-1,
>> mesos-slave-2 nodes and it is set to 'wideopen' policy. Furthermore, I
>> tried adding explicit rules to the policies to allow all TCP and UDP
>> (Currently I am not sure what protocol is used underneath), even then it
>> continues throwing this error.
>>
>> Any clues, suggestions, comments about the error or approach as a whole
>> would be helpful.
>>
>> Thanks and Regards,
>> Mangirish Wagle
>>
>>
>> On Tue, Sep 27, 2016 at 11:23 AM, Mangirish Wagle <
>> vaglomangir...@gmail.com> wrote:
>>
>>> Hello Devs,
>>>
>>> Thanks Gourav and Shameera for all the work w.r.t. setting up the
>>> Mesos-Marathon cluster on Jetstream.
>>>
>>> I am currently evaluating MPICH (http://www.mpich.org/about/overview/)
>>> to be used for launching MPI jobs on top of mesos. MPICH version 1.2
>>> supports Mesos based MPI scheduling. I have also been trying to submit jobs
>>> to the cluster through Marathon. However, in either case I am currently
>>> facing issues which I am working to get resolved.
>>>
>>> I am compiling my notes into the following google doc. You may please
>>> review and let me know your 

Jenkins build is still unstable: airavata-dev #738

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev » Airavata Registry Core #738

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: airavata-dev #737

2016-10-13 Thread Apache Jenkins Server
See 



Jenkins build is still unstable: airavata-dev » Airavata Registry Core #737

2016-10-13 Thread Apache Jenkins Server
See 




Jenkins build became unstable: airavata-dev #736

2016-10-13 Thread Apache Jenkins Server
See