Adios

2017-06-16 Thread Ajinkya Dhamnaskar
Hello All,

I am moving out of Bloomington, and I must take this opportunity to thank
the Airavata team.
It has been a great experience which has definitely changed the way I
think about computer science problems.

Special thanks to Marlon and Suresh for giving me this opportunity; I
tried my level best to live up to your expectations.

I hope to keep contributing whenever possible.

Let me be your host when you are in the Bay Area :)

Again, thanks for everything!

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: wireless/openwhisk

2017-05-31 Thread Ajinkya Dhamnaskar
Apoorv,

I appreciate your efforts; the document gives a good introduction to OpenWhisk.

Dev,

While this is being studied for the distributed task execution problem,
one thing strikes me: can OpenWhisk be considered a potential candidate
for the Airavata API and the other microservices (gfac, orchestrator)
around it? Now that we are moving towards first-class services like 'User
Profile', I think OpenWhisk might have something to offer.
If you have expertise with OpenWhisk, could you please critique this idea?


On Wed, May 31, 2017 at 11:41 AM, Apoorv Palkar <apoorv_pal...@aol.com>
wrote:

> https://docs.google.com/document/d/1xF_p4FUEK0rXJxC2i_fiBtTydAL5rcKD74PuSV9oyTo/edit?usp=sharing
>
>
> From my analysis, I think Storm and Flink are better than Spark/OpenWhisk,
> so I created an analysis of OpenWhisk above. The documentation is not as
> good because the technology is new, which could be a problem for actual
> prototyping/development.
>
> thanks.
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Spark context diagrams

2017-05-26 Thread Ajinkya Dhamnaskar
Hey Apoorv,

You can use draw.io.
Also, could you please add a brief description explaining the diagrams?

On Fri, May 26, 2017 at 11:38 AM, Apoorv Palkar <apoorv_pal...@aol.com>
wrote:

> ok will do.
>
>
>
> -Original Message-
> From: Pamidighantam, Sudhakar V <spami...@illinois.edu>
> To: dev <dev@airavata.apache.org>
> Sent: Fri, May 26, 2017 11:36 am
> Subject: Re: Spark context diagrams
>
> Apoorv:
>
> Can you create these diagrams with Creately or some other software and
> annotate them better?
>
> It is a bit difficult for old eyes to read them.
>
> Thanks,
> Sudhakar.
>
> On May 26, 2017, at 11:25 AM, Apoorv Palkar <apoorv_pal...@aol.com> wrote:
>
> Hey, I've been working on the Spark details and posted 2 diagrams on the
> Google doc linked below. Hopefully I can get into the groove and have it
> working with/as the potential orchestrator.
>
>
>
> https://docs.google.com/document/d/1kjIBC0ianDVJlSuPs8FanCTO8ili1VETA5xKeFqo1gY/edit?usp=sharing
>
>
>
>


-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Welcome Ajinkya Dhamnaskar as Airavata Committer

2017-04-24 Thread Ajinkya Dhamnaskar
Thank you Mangarish, Marcus, Eroma and Shameera.

On Sun, Apr 23, 2017 at 6:44 PM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Congratulations Ajinkya !!!
>
> On Mon, Apr 10, 2017 at 11:11 AM Eroma Abeysinghe <
> eroma.abeysin...@gmail.com> wrote:
>
>> Congratulations Ajinkya 
>>
>> On Mon, Apr 10, 2017 at 9:41 AM, Christie, Marcus Aaron <machr...@iu.edu>
>> wrote:
>>
>>> Congratulations and welcome, Ajinkya!
>>>
>>> > On Apr 9, 2017, at 10:57 PM, Suresh Marru <sma...@apache.org> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > The Project Management Committee (PMC) for Apache Airavata has asked
>>> Ajinkya Dhamnaskar to become a committer based on his contributions to the
>>> project. We are pleased to announce that he has accepted.
>>> >
>>> > Being a committer enables easier contribution to the project since
>>> there is no need to go via the patch submission process. This should enable
>>> better productivity.
>>> >
>>> > Please join me in welcoming Ajinkya to Airavata.
>>> >
>>> > Suresh
>>> > (On Behalf of Apache Airavata PMC)
>>>
>>>
>>
>>
>> --
>> Thank You,
>> Best Regards,
>> Eroma
>>
> --
> Shameera Rathnayaka
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Docker and AWS

2017-04-05 Thread Ajinkya Dhamnaskar
Hello Apoorv,

Airavata's deployment is backed by Ansible and Docker
<https://www.docker.com/>. As you walk through the overall architecture
you will see that there are a lot of components which work together in a
distributed fashion and require heavy manual effort to deploy. Currently,
Ansible is being used to automate the deployment process. We are yet to
find a good fit for Docker, but we have Docker images ready for all the
components.
As far as AWS is concerned, to my knowledge we use it just for
provisioning instances. Others may be able to provide more details.

On Wed, Apr 5, 2017 at 11:45 PM, Apoorv Palkar <apoorv_pal...@aol.com>
wrote:

> What aspects of the Airavata project are using Docker and AWS? I'm
> interested in these technologies.




-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Gateway Usage Reporting Fix

2017-02-07 Thread Ajinkya Dhamnaskar
Hello,
I have created a pull request, https://github.com/apache/airavata/pull/95,
which fixes the gateway usage reporting issue.
Please see https://issues.apache.org/jira/browse/AIRAVATA-2317

Supun,
Could you please review it?

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: [#Spring17-Airavata-Courses] : Distributed Workload Management for Airavata

2017-02-01 Thread Ajinkya Dhamnaskar
Hello all,

Just a heads up: here the name 'distributed workload management' does not
necessarily mean having different instances of a microservice and then
distributing work among those instances.

Essentially, the problem is how to make each microservice work
independently on top of a concrete distributed communication
infrastructure. So think of it as a workflow where each microservice does
its part of the work and communicates its output (how exactly is yet to be
decided). The next microservice in line identifies and picks up that
output and takes it further towards the final outcome. The crux here is
that none of the microservices needs to worry about the other
microservices in the pipeline.
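
To make the handoff concrete, here is a minimal in-process sketch in plain
Java; the in-memory queue stands in for whatever broker we eventually
pick, and the stage names are only illustrative:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Conceptual sketch only: two pipeline stages hand work off through a
// queue without knowing anything about each other. In a real deployment
// the queue would be a message broker, not an in-process object.
public class PipelineSketch {
    public static void main(String[] args) {
        final BlockingQueue<String> handoff = new LinkedBlockingQueue<>();

        // Stage 2 (e.g. gfac): picks up whatever appears on the queue.
        Thread consumer = new Thread(() -> {
            try {
                String input = handoff.take();
                System.out.println("picked up: " + input);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Stage 1 (e.g. orchestrator): does its part and publishes the output.
        handoff.add("experiment-123:staged");
    }
}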

Vidya Sagar,
I completely second your opinion on having stateless microservices; in
fact, that is the key. With stateless microservices it is difficult to
guarantee consistency in a system, but it solves the availability problem
to some extent. I would be interested to understand what you mean by "an
intelligent job scheduling algorithm, which receives real-time updates from
the microservices with their current state information".

On Wed, Feb 1, 2017 at 11:48 PM, Vidya Sagar Kalvakunta <
vkalv...@umail.iu.edu> wrote:

>
> On Wed, Feb 1, 2017 at 2:37 PM, Amila Jayasekara <thejaka.am...@gmail.com>
> wrote:
>
>> Hi Gourav,
>>
>> Sorry, I did not understand your question. Specifically, I am having
>> trouble relating "workload management" to the options you suggest (RPC,
>> message based, etc.).
>> So what exactly do you mean by "workload management"?
>> What is "work" in this context?
>>
>> Also, I did not understand what you meant by "the most efficient way".
>> Efficient in terms of what? Are you looking at speed?
>>
>> As per your suggestions, it seems you are trying to find a way to
>> communicate between microservices. RPC might be troublesome if you need
>> to communicate with processes separated by a firewall.
>>
>> Thanks
>> -Thejaka
>>
>>
>> On Wed, Feb 1, 2017 at 12:52 PM, Shenoy, Gourav Ganesh <
>> goshe...@indiana.edu> wrote:
>>
>>> Hello dev, arch,
>>>
>>>
>>>
>>> As part of this Spring’17 Advanced Science Gateway Architecture course,
>>> we are working on trying to debate and find possible solutions to the issue
>>> of managing distributed workloads in Apache Airavata. This leads to the
>>> discussion of finding the most efficient way that different Airavata
>>> micro-services should communicate and distribute work, in such a way that:
>>>
>>> 1.   We maintain the ability to scale these micro-services whenever
>>> needed (autoscale perhaps?).
>>>
>>> 2.   Achieve fault tolerance.
>>>
>>> 3.   We can deploy these micro-services independently, or better in
>>> a containerized manner – keeping in mind the ability to use devops for
>>> deployment.
>>>
>>>
>>>
>>> As of now the options we are exploring are:
>>>
>>> 1.   RPC based communication
>>>
>>> 2.   Message based – either master-worker, or work-queue, etc
>>>
>>> 3.   A combination of both these approaches
>>>
>>>
>>>
>>> I am more inclined towards exploring the message based approach, but
>>> again there arises the possibility of handling limitations/corner cases of
>>> message broker such as downtimes (may be more). In my opinion, having
>>> asynchronous communication will help us achieve most of the above-mentioned
>>> points. Another debatable issue is making the micro-services implementation
>>> stateless, such that we do not have to pass the state information between
>>> micro-services.
>>>
>>>
>>>
>>> I would love to hear any thoughts/suggestions/comments on this topic and
>>> open up a discussion via this mail thread. If there is anything that I have
>>> missed which is relevant to this issue, please let me know.
>>>
>>>
>>>
>>> Thanks and Regards,
>>>
>>> Gourav Shenoy
>>>
>>
>>
> Hi Gourav,
>
> Correct me if I'm wrong, but I think this is a case of the job shop
> scheduling problem, as we may have 'n' jobs of varying processing times
> and memory requirements, and we have 'm' microservices with possibly
> different computing and memory capacities, and we are trying to minimize
> the makespan <https://en.wikipedia.org/wiki/Makespan>.
>
> For this use-case, I'm in favor of a highly available and consistent message
> broker with an intelligent job scheduling algorithm, which receives
> real-time updates from the microservices with their current state
> information.
>
> As for the state vs stateless implementation, I think that question
> depends on the functionality of a particular microservice. In a broad
> sense, the stateless implementation should be preferred as it will scale
> better horizontally.
>
>
> Regards,
> Vidya Sagar
>
>
> --
> Vidya Sagar Kalvakunta | Graduate MS CS Student | IU School of
> Informatics and Computing | Indiana University Bloomington | (812)
> 691-5002 <8126915002> | vkalv...@iu.edu
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Airavata Output Processing Doubt

2017-01-18 Thread Ajinkya Dhamnaskar
Marcus,

Not really, STDOUT is not the actual experiment output. Yes, we can
configure the job file to do so; for example, consider a simple
application which adds two integers: we can generate a job file which adds
the two integers and just echoes the result, so that the output is there
in STDOUT. But in most of the cases we see, an experiment takes time to
complete and does not generate its output right away.

As of now, even though we support a lot of output types, we deal with only
a few of them. In the case of a non-file output there is no way to get the
output after job completion. As discussed, we can configure job writers to
write experiment outputs (non-file outputs such as INTEGER, STRING, etc.)
to some file, and eventually we can stage this file and read the output
values so that we don't lose track of them. I hope this answers your
question.

Shameera,

Thanks, I will discuss this with Suresh.

On Tue, Jan 17, 2017 at 7:22 PM, Christie, Marcus Aaron <machr...@iu.edu>
wrote:

> Ajinkya,
>
> I think an example of what you are trying to accomplish would be helpful
> for me. I’m not quite understanding what problem you are trying to solve
> nor the proposed solution.  Are you trying to turn the STDOUT of an
> application into an INTEGER value somehow?
>
> Thanks,
>
> Marcus
>
> On Jan 16, 2017, at 12:58 PM, Ajinkya Dhamnaskar <adham...@umail.iu.edu>
> wrote:
>
> Shameera,
>
> Now, I got your concern. This is an inherent problem. I was actually
> referring to STDOUT and STDERR which job generates after execution.
> I was wondering if we can write this output as key value pair in some text
> file under specific directory for particular JOB.
>
> So, basically every experiment will have output staging task, but in case
> of output types (INTEGER, STRING etc) the file which will be staged would
> have output in key-value pairs.
> For example, lets consider the case that you mentioned with two string
> outputs, we can probably generate file on remote server as str1-value2 \n
> str2-value2 and stage the same.
> But, I understand this needs to be implemented in job scripts and requires
> much more than this.
>
> How would you think through this idea? Do you think this has some
> potential?
>
>
> On Mon, Jan 16, 2017 at 6:55 AM, Shameera Rathnayaka <
> shameerai...@gmail.com> wrote:
>
>> Hi Ajinkya,
>>
>>>
>>> As and when job gets completed we save output for the same. I was
>>> wondering, if we can get that information knowing process_id and task_id.
>>> Job table has process_id and task_id, possibly we can fetch output
>>> stored in Job table.
>>>
>> If you check the JobModel we don't associate Job output with it. So no
>> job outputs in JobTable. We have to fetch output and save it in registry(I
>> think that is what you refering to log output in our previous replies)
>> problem is how to fetch and save if the job is completed and output is not
>> a file. If we can find a solution to this we can extend this to support
>> multiple outputs as well.
>>
>>
>>>
>>> Also, in case of multiple outputs, each task would know its output name
>>> and possibly we can use that name alongside process_id for fetching correct
>>> value. Workflow would know which output to use as an input for other
>>> application. I hope, I understood your concern correctly.
>>>
>>
>>> On Sun, Jan 15, 2017 at 11:59 PM, Shameera Rathnayaka <
>>> shameerai...@gmail.com> wrote:
>>>
>>> This approach sounds promising to me, if we have more than one non-file
>>> outputs then we will have more than OutputLogginTasks. But how exactly this
>>> OutputLogginTask(please think of a new name for this task) read the data,
>>> because by the time OutputLogginTask is invoked, the actual job in target
>>> computer resource is completed and output is where? if we have more than
>>> one OutputLogginTasks how it reads the value associated with it. eg: if my
>>> job output two Strings "str1", "str2" and I am using "str2" for downstream
>>> application in my workflow how we can guarantee downstream application
>>> always get the correct value?
>>>
>>> On Sun, Jan 15, 2017 at 1:02 PM Ajinkya Dhamnaskar <
>>> adham...@umail.iu.edu> wrote:
>>>
>>> Hi Shameera,
>>>
>>> If you check the
>>> org.apache.airavata.orchestrator.cpi.impl.SimpleOrchestratorImpl#createAndSaveOutputDataStagingTasks()
>>> method, we entertain an output staging task only when the output data
>>> type is STDOUT, STDERR, or URI. I am suggesting, in the default case, we
>>> create a different task which points to OutputLogginTask.
>

Re: Airavata Output Processing Doubt

2017-01-16 Thread Ajinkya Dhamnaskar
Shameera,

Now I get your concern; this is an inherent problem. I was actually
referring to the STDOUT and STDERR which a job generates after execution.
I was wondering if we could write this output as key-value pairs in a text
file under a specific directory for the particular job.

So basically every experiment will have an output staging task, but for
output types such as INTEGER, STRING, etc., the file which gets staged
would contain the outputs as key-value pairs.
For example, in the case you mentioned with two string outputs, we could
generate a file on the remote server containing str1-value1 \n str2-value2
and stage that file.
But I understand this needs to be implemented in the job scripts and
requires much more than this.

What do you think of this idea? Do you think it has potential?
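
As a rough illustration of the idea (the file name and the '=' separator
below are my own choices for readability, not a decided format):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the job script writes one key=value pair per line
// into an outputs file; after staging that file back, we parse it so each
// named output (INTEGER, STRING, ...) can be stored against the process.
public class OutputFileSketch {

    static Map<String, String> readOutputs(Path outputFile) throws IOException {
        Map<String, String> outputs = new HashMap<>();
        for (String line : Files.readAllLines(outputFile)) {
            int idx = line.indexOf('=');
            if (idx > 0) {
                outputs.put(line.substring(0, idx).trim(),
                            line.substring(idx + 1).trim());
            }
        }
        return outputs;
    }

    public static void main(String[] args) throws IOException {
        // Simulate the staged file, then read it back.
        Path file = Paths.get("job-outputs.txt");
        Files.write(file, Arrays.asList("str1=value1", "str2=value2"));
        System.out.println(readOutputs(file)); // str1 -> value1, str2 -> value2
    }
}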


On Mon, Jan 16, 2017 at 6:55 AM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Hi Ajinkya,
>
>>
>> As and when job gets completed we save output for the same. I was
>> wondering, if we can get that information knowing process_id and task_id.
>> Job table has process_id and task_id, possibly we can fetch output stored
>> in Job table.
>>
> If you check the JobModel we don't associate Job output with it. So no job
> outputs in JobTable. We have to fetch output and save it in registry(I
> think that is what you refering to log output in our previous replies)
> problem is how to fetch and save if the job is completed and output is not
> a file. If we can find a solution to this we can extend this to support
> multiple outputs as well.
>
>
>>
>> Also, in case of multiple outputs, each task would know its output name
>> and possibly we can use that name alongside process_id for fetching correct
>> value. Workflow would know which output to use as an input for other
>> application. I hope, I understood your concern correctly.
>>
>
>> On Sun, Jan 15, 2017 at 11:59 PM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> This approach sounds promising to me, if we have more than one non-file
>> outputs then we will have more than OutputLogginTasks. But how exactly this
>> OutputLogginTask(please think of a new name for this task) read the data,
>> because by the time OutputLogginTask is invoked, the actual job in target
>> computer resource is completed and output is where? if we have more than
>> one OutputLogginTasks how it reads the value associated with it. eg: if my
>> job output two Strings "str1", "str2" and I am using "str2" for downstream
>> application in my workflow how we can guarantee downstream application
>> always get the correct value?
>>
>> On Sun, Jan 15, 2017 at 1:02 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
>> wrote:
>>
>> Hi Shameera,
>>
>> If you check org.apache.airavata.orchestrator.cpi.impl.
>> SimpleOrchestratorImpl#createAndSaveOutputDataStagingTasks() method, we
>> entertain output staging task only when output data type is STDOUT, STDERR
>> and URI. I am suggesting, in default case we will create different task
>> which points to OutputLogginTask.
>>
>> OutputLogginTask is nothing but yet another implementation of task,
>> similar to SCPDataStageTask where we stage files and log output as well.
>> But, in OutputLogginTask we need not to stage any data, we would just
>> log data as it come.
>>
>> I am assuming TaskId and ProcessId is sufficient to fetch output. (please
>> correct me if you don't think so)
>>
>> Thanks Shameera, this discussion is helping me a lot.
>>
>> On Sun, Jan 15, 2017 at 9:38 AM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> It is not clear to me how this "OutputLogginTask" knows about job output
>> data. It would be helpful if you can explain your suggested approach bit
>> more.
>>
>> Best,
>> Shameera.
>>
>> On Sat, Jan 14, 2017 at 1:07 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
>> wrote:
>>
>> Hi Shameera,
>>
>> One possible solution would be to introduce OutputLoggingTask. We can
>> create output task irrespective of output data type and if there isn't any
>> file to stage we can call OutputLoggingTask.
>> Its sole purpose is to log data, that way we can justify each output type.
>>
>> Please suggest, in case you think of any better solution.
>>
>> Thanks in anticipation.
>>
>>
>> On Sat, Jan 14, 2017 at 9:46 PM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> Yes, that is the case. How would you plan to solve it?

Re: Airavata Output Processing Doubt

2017-01-15 Thread Ajinkya Dhamnaskar
Shameera,

As and when a job completes, we save its output. I was wondering if we
can get that information knowing the process_id and task_id.
The Job table has process_id and task_id, so possibly we can fetch the
output stored in the Job table.

Also, in the case of multiple outputs, each task would know its output
name, and possibly we can use that name alongside the process_id to fetch
the correct value. The workflow would know which output to use as an input
for the other application. I hope I understood your concern correctly.

On Sun, Jan 15, 2017 at 11:59 PM, Shameera Rathnayaka <
shameerai...@gmail.com> wrote:

> This approach sounds promising to me, if we have more than one non-file
> outputs then we will have more than OutputLogginTasks. But how exactly this
> OutputLogginTask(please think of a new name for this task) read the data,
> because by the time OutputLogginTask is invoked, the actual job in target
> computer resource is completed and output is where? if we have more than
> one OutputLogginTasks how it reads the value associated with it. eg: if my
> job output two Strings "str1", "str2" and I am using "str2" for downstream
> application in my workflow how we can guarantee downstream application
> always get the correct value?
>
> On Sun, Jan 15, 2017 at 1:02 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
> wrote:
>
>> Hi Shameera,
>>
>> If you check org.apache.airavata.orchestrator.cpi.impl.
>> SimpleOrchestratorImpl#createAndSaveOutputDataStagingTasks() method, we
>> entertain output staging task only when output data type is STDOUT, STDERR
>> and URI. I am suggesting, in default case we will create different task
>> which points to OutputLogginTask.
>>
>> OutputLogginTask is nothing but yet another implementation of task,
>> similar to SCPDataStageTask where we stage files and log output as well.
>> But, in OutputLogginTask we need not to stage any data, we would just
>> log data as it come.
>>
>> I am assuming TaskId and ProcessId is sufficient to fetch output. (please
>> correct me if you don't think so)
>>
>> Thanks Shameera, this discussion is helping me a lot.
>>
>> On Sun, Jan 15, 2017 at 9:38 AM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> It is not clear to me how this "OutputLogginTask" knows about job output
>> data. It would be helpful if you can explain your suggested approach bit
>> more.
>>
>> Best,
>> Shameera.
>>
>> On Sat, Jan 14, 2017 at 1:07 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
>> wrote:
>>
>> Hi Shameera,
>>
>> One possible solution would be to introduce OutputLoggingTask. We can
>> create output task irrespective of output data type and if there isn't any
>> file to stage we can call OutputLoggingTask.
>> Its sole purpose is to log data, that way we can justify each output type.
>>
>> Please suggest, in case you think of any better solution.
>>
>> Thanks in anticipation.
>>
>>
>> On Sat, Jan 14, 2017 at 9:46 PM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> Yes, that is the case. how would you plan to solve it?
>>
>> Regards,
>> Shameera.
>>
>> On Fri, Jan 13, 2017 at 6:37 AM, Ajinkya Dhamnaskar <
>> adham...@umail.iu.edu> wrote:
>>
>> Amila,
>>
>> Thanks Amila for explaining. It really explains how things are mapped. I
>> could see output against JOB but could not figure out from where exactly we
>> are logging output for a process.
>>
>> Shameera,
>>
>> Yeah, that's true. So basically, if application does not have output
>> staging task, it would not log output for respective process.
>> Which means if output data type is not URI, we are not logging output
>> against process.(Please correct me if I am wrong).
>>
>> Probably, here we have an opportunity to improve.
>>
>> Thanks in anticipation
>>
>> On Fri, Jan 13, 2017 at 8:58 AM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> If you check here 
>> org.apache.airavata.gfac.impl.task.SCPDataStageTask#outputDataStaging
>> you will see that we are saving process outputs to database(through
>> registry). You probably testing with local job submission
>> with org.apache.airavata.gfac.impl.task.DataStageTask as data staging
>> task implementation. There we don't save process outputs. First thing is
>> to fix this and save the process outputs to the database.
>>
>> If you know the JobId then you can retrieve processId from Job model.

Re: Airavata Output Processing Doubt

2017-01-15 Thread Ajinkya Dhamnaskar
Hi Shameera,

If you check the
org.apache.airavata.orchestrator.cpi.impl.SimpleOrchestratorImpl#createAndSaveOutputDataStagingTasks()
method, we entertain an output staging task only when the output data type
is STDOUT, STDERR, or URI. I am suggesting that in the default case we
create a different task which points to OutputLogginTask.

OutputLogginTask is nothing but yet another task implementation, similar
to SCPDataStageTask, where we stage files and also log the output. But in
OutputLogginTask we need not stage any data; we would just log the data as
it comes. A rough sketch of the idea is below.

I am assuming TaskId and ProcessId are sufficient to fetch the output
(please correct me if you don't think so).
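
To make the proposal concrete, here is a rough sketch of what such a task
could look like; the Task/TaskContext shapes below are simplified and
hypothetical, not the actual gfac Task API:

// Hypothetical, simplified sketch of the proposed task: unlike
// SCPDataStageTask it stages nothing, it only records the output value.
interface Task {
    void execute(TaskContext context) throws Exception;
}

class TaskContext {
    String processId;
    String taskId;
    String outputName;
    String outputValue; // e.g. an INTEGER or STRING output read back from the job
}

class OutputLoggingTask implements Task {
    @Override
    public void execute(TaskContext ctx) {
        // Instead of staging a file, just persist/log the value so it is
        // queryable later by (processId, taskId, outputName).
        System.out.printf("process=%s task=%s output %s=%s%n",
                ctx.processId, ctx.taskId, ctx.outputName, ctx.outputValue);
    }
}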

Thanks Shameera, this discussion is helping me a lot.

On Sun, Jan 15, 2017 at 9:38 AM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Hi Ajinkya,
>
> It is not clear to me how this "OutputLogginTask" knows about job output
> data. It would be helpful if you can explain your suggested approach bit
> more.
>
> Best,
> Shameera.
>
> On Sat, Jan 14, 2017 at 1:07 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
> wrote:
>
>> Hi Shameera,
>>
>> One possible solution would be to introduce OutputLoggingTask. We can
>> create output task irrespective of output data type and if there isn't any
>> file to stage we can call OutputLoggingTask.
>> Its sole purpose is to log data, that way we can justify each output type.
>>
>> Please suggest, in case you think of any better solution.
>>
>> Thanks in anticipation.
>>
>>
>> On Sat, Jan 14, 2017 at 9:46 PM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> Yes, that is the case. how would you plan to solve it?
>>
>> Regards,
>> Shameera.
>>
>> On Fri, Jan 13, 2017 at 6:37 AM, Ajinkya Dhamnaskar <
>> adham...@umail.iu.edu> wrote:
>>
>> Amila,
>>
>> Thanks Amila for explaining. It really explains how things are mapped. I
>> could see output against JOB but could not figure out from where exactly we
>> are logging output for a process.
>>
>> Shameera,
>>
>> Yeah, that's true. So basically, if application does not have output
>> staging task, it would not log output for respective process.
>> Which means if output data type is not URI, we are not logging output
>> against process.(Please correct me if I am wrong).
>>
>> Probably, here we have an opportunity to improve.
>>
>> Thanks in anticipation
>>
>> On Fri, Jan 13, 2017 at 8:58 AM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>> Hi Ajinkya,
>>
>> If you check here 
>> org.apache.airavata.gfac.impl.task.SCPDataStageTask#outputDataStaging
>> you will see that we are saving process outputs to database(through
>> registry). You probably testing with local job submission
>> with org.apache.airavata.gfac.impl.task.DataStageTask as data staging
>> task implementation. There we don't save process outputs. First thing is
>> to fix this and save the process outputs to the database.
>>
>> If you know the JobId then you can retrieve processId from Job model.
>> Using processId you can get all process outputs. see PROCES_OUTPUT case in
>> org.apache.airavata.registry.core.experiment.catalog.impl.
>> ExperimentCatalogImpl#get(..,..) method.
>>
>> Hope this will help you to move forward.
>>
>> Best,
>> Shameera.
>>
>> On Thu, Jan 12, 2017 at 3:48 PM Amila Jayasekara <thejaka.am...@gmail.com>
>> wrote:
>>
>> Hi Ajinkya,
>>
>> I am not familiar with the context of your question but let me try to
>> answer.
>>
>> If you are referring to an application deployed in a supercomputer, then
>> the application should have a job id. In the supercomputer, each
>> application runs as a separate batch job and each job is distinguished
>> using the job id (similar to process id in a PC). Usually, the job
>> scheduler returns this job id and Airavata should be aware about that job
>> id. Then, you should be able to use this job id to identify the output,
>> provided job script specify instructions to generate output.
>>
>> I did not understand what you referred as "process model" and "job
>> model". I assume these are database tables.
>>
>> Thanks
>> -Amila
>>
>>
>>
>> On Wed, Jan 11, 2017 at 1:17 PM, Ajinkya Dhamnaskar <
>> adham...@umail.iu.edu> wrote:
>>
>> Hello Dev,
>>
>> I am trying to fetch application output (type:INTEGER) after experiment
>> completion. As per my understanding each application runs as a process and
>> that process should have final output.

Re: Airavata Output Processing Doubt

2017-01-14 Thread Ajinkya Dhamnaskar
Hi Shameera,

One possible solution would be to introduce an OutputLoggingTask. We can
create an output task irrespective of the output data type, and if there
isn't any file to stage we can invoke the OutputLoggingTask.
Its sole purpose is to log data; that way we can account for every output
type.

Please suggest any better solution you can think of.

Thanks in anticipation.


On Sat, Jan 14, 2017 at 9:46 PM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Hi Ajinkya,
>
> Yes, that is the case. how would you plan to solve it?
>
> Regards,
> Shameera.
>
> On Fri, Jan 13, 2017 at 6:37 AM, Ajinkya Dhamnaskar <adham...@umail.iu.edu
> > wrote:
>
>> Amila,
>>
>> Thanks Amila for explaining. It really explains how things are mapped. I
>> could see output against JOB but could not figure out from where exactly we
>> are logging output for a process.
>>
>> Shameera,
>>
>> Yeah, that's true. So basically, if application does not have output
>> staging task, it would not log output for respective process.
>> Which means if output data type is not URI, we are not logging output
>> against process.(Please correct me if I am wrong).
>>
>> Probably, here we have an opportunity to improve.
>>
>> Thanks in anticipation
>>
>> On Fri, Jan 13, 2017 at 8:58 AM, Shameera Rathnayaka <
>> shameerai...@gmail.com> wrote:
>>
>>> Hi Ajinkya,
>>>
>>> If you check here
>>> org.apache.airavata.gfac.impl.task.SCPDataStageTask#outputDataStaging
>>> you will see that we are
>>> saving process outputs to database(through registry). You probably testing
>>> with local job submission with 
>>> org.apache.airavata.gfac.impl.task.DataStageTask
>>> as data staging task implementation. There we don't save process outputs.
>>> First thing is to fix this and save the process outputs to the database.
>>>
>>> If you know the JobId then you can retrieve processId from Job model.
>>> Using processId you can get all process outputs. see PROCES_OUTPUT case in
>>> org.apache.airavata.registry.core.experiment.catalog.impl.ExperimentCatalogImpl#get(..,..)
>>> method.
>>>
>>> Hope this will help you to move forward.
>>>
>>> Best,
>>> Shameera.
>>>
>>> On Thu, Jan 12, 2017 at 3:48 PM Amila Jayasekara <
>>> thejaka.am...@gmail.com> wrote:
>>>
>>>> Hi Ajinkya,
>>>>
>>>> I am not familiar with the context of your question but let me try to
>>>> answer.
>>>>
>>>> If you are referring to an application deployed in a supercomputer,
>>>> then the application should have a job id. In the supercomputer, each
>>>> application runs as a separate batch job and each job is distinguished
>>>> using the job id (similar to process id in a PC). Usually, the job
>>>> scheduler returns this job id and Airavata should be aware about that job
>>>> id. Then, you should be able to use this job id to identify the output,
>>>> provided job script specify instructions to generate output.
>>>>
>>>> I did not understand what you referred as "process model" and "job
>>>> model". I assume these are database tables.
>>>>
>>>> Thanks
>>>> -Amila
>>>>
>>>>
>>>>
>>>> On Wed, Jan 11, 2017 at 1:17 PM, Ajinkya Dhamnaskar <
>>>> adham...@umail.iu.edu> wrote:
>>>>
>>>> Hello Dev,
>>>>
>>>> I am trying to fetch application output (type:INTEGER) after experiment
>>>> completion. As per my understanding each application runs as a process and
>>>> that process should have final output.
>>>>
>>>> So, ideally we should be able to get final output from process id
>>>> itself (correct me if I am wrong).
>>>> In my case, I am not seeing final output in database. Basically, we are
>>>> not updating process model after job completion, we update job model
>>>> though.
>>>>
>>>> Am I missing anything here?
>>>>
>>>> Any help is appreciated.
>>>>
>>>> --
>>>> Thanks and regards,
>>>>
>>>> Ajinkya Dhamnaskar
>>>> Student ID : 0003469679
>>>> Masters (CS)
>>>> +1 (812) 369-5416
>>>>
>>>>
>>>> --
>>> Shameera Rathnayaka
>>>
>>
>>
>>
>> --
>> Thanks and regards,
>>
>> Ajinkya Dhamnaskar
>> Student ID : 0003469679
>> Masters (CS)
>> +1 (812) 369-5416
>>
>
>
>
> --
> Best Regards,
> Shameera Rathnayaka.
>
> email: shameera AT apache.org , shameerainfo AT gmail.com
> Blogs : https://shameerarathnayaka.wordpress.com , http://
> shameerarathnayaka.blogspot.com/
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Airavata Output Processing Doubt

2017-01-13 Thread Ajinkya Dhamnaskar
Amila,

Thanks for explaining; it really clarifies how things are mapped. I could
see output against the JOB, but I could not figure out from where exactly
we log output for a process.

Shameera,

Yeah, that's true. So basically, if an application does not have an output
staging task, it will not log output for the respective process.
Which means that if the output data type is not URI, we are not logging
output against the process (please correct me if I am wrong).

We probably have an opportunity to improve here.

Thanks in anticipation

On Fri, Jan 13, 2017 at 8:58 AM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Hi Ajinkya,
>
> If you check here 
> org.apache.airavata.gfac.impl.task.SCPDataStageTask#outputDataStaging
> you will see that we are saving process outputs to database(through
> registry). You probably testing with local job submission
> with org.apache.airavata.gfac.impl.task.DataStageTask as data staging
> task implementation. There we don't save process outputs. First thing is
> to fix this and save the process outputs to the database.
>
> If you know the JobId then you can retrieve processId from Job model.
> Using processId you can get all process outputs. see PROCES_OUTPUT case in
> org.apache.airavata.registry.core.experiment.catalog.impl.
> ExperimentCatalogImpl#get(..,..) method.
>
> Hope this will help you to move forward.
>
> Best,
> Shameera.
>
> On Thu, Jan 12, 2017 at 3:48 PM Amila Jayasekara <thejaka.am...@gmail.com>
> wrote:
>
>> Hi Ajinkya,
>>
>> I am not familiar with the context of your question but let me try to
>> answer.
>>
>> If you are referring to an application deployed in a supercomputer, then
>> the application should have a job id. In the supercomputer, each
>> application runs as a separate batch job and each job is distinguished
>> using the job id (similar to process id in a PC). Usually, the job
>> scheduler returns this job id and Airavata should be aware about that job
>> id. Then, you should be able to use this job id to identify the output,
>> provided job script specify instructions to generate output.
>>
>> I did not understand what you referred as "process model" and "job
>> model". I assume these are database tables.
>>
>> Thanks
>> -Amila
>>
>>
>>
>> On Wed, Jan 11, 2017 at 1:17 PM, Ajinkya Dhamnaskar <
>> adham...@umail.iu.edu> wrote:
>>
>> Hello Dev,
>>
>> I am trying to fetch application output (type:INTEGER) after experiment
>> completion. As per my understanding each application runs as a process and
>> that process should have final output.
>>
>> So, ideally we should be able to get final output from process id itself
>> (correct me if I am wrong).
>> In my case, I am not seeing final output in database. Basically, we are
>> not updating process model after job completion, we update job model
>> though.
>>
>> Am I missing anything here?
>>
>> Any help is appreciated.
>>
>> --
>> Thanks and regards,
>>
>> Ajinkya Dhamnaskar
>> Student ID : 0003469679
>> Masters (CS)
>> +1 (812) 369-5416
>>
>>
>> --
> Shameera Rathnayaka
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Airavata Output Processing Doubt

2017-01-11 Thread Ajinkya Dhamnaskar
Hello Dev,

I am trying to fetch an application output (type: INTEGER) after
experiment completion. As per my understanding, each application runs as a
process, and that process should have a final output.

So ideally we should be able to get the final output from the process id
itself (correct me if I am wrong).
In my case, I am not seeing the final output in the database. Basically,
we are not updating the process model after job completion, though we do
update the job model.

Am I missing anything here?

Any help is appreciated.

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Pull Request: Airavata Integration Test Suite

2016-12-18 Thread Ajinkya Dhamnaskar
Hello,

Jira: AIRAVATA-2283
https://github.com/apache/airavata/pull/91

An integration test suite to test the Airavata experiment lifecycle.
It creates resources locally and uses the local provider to run a simple
echo experiment.

PREREQUISITES
==
1. Embedded Derby server
2. RabbitMQ on 149.165.228.91

ASSUMPTIONS
==
1. We have already built Airavata and the tarball is present in the
distribution
2. No other processes are blocking Airavata's ports


How to build
==
1. Run mvn clean install in the test suite module
2. It will pick up AiravataIT and run the stated test cases


How to run
==
1. Make sure you have a proper airavata-server.properties inside the test
suite module
2. The embedded Derby server needs to be enabled in the main distribution
(set start.derby.server.mode=false)


What it does
==
1. Extracts the Airavata build from the distribution and deploys it with
the embedded Derby server
2. Automatically creates and tests a gateway
3. Automatically creates and tests a compute resource
4. Automatically creates and tests a storage resource
5. Automatically creates and tests an application
6. Automatically launches the application and tests the result
7. Cleans up

You may find some redundant code; its sole purpose is to accommodate
future enhancements as they come.

Please respond if you have any concerns.

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Kafka Logging Conflicts

2016-11-07 Thread Ajinkya Dhamnaskar
Hello,

We have introduced Kafka logging in Airavata, and for it to work properly
we need to get slf4j-log4j12-1.7.10.jar and log4j-1.2.17.jar out of the
classpath.

These jars conflict with logback.jar, which is used for the Kafka logging.

Now, the concern is which one we should keep.

I am opening this thread to gather your concerns and suggestions on the
matter.
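
For anyone checking a local build, a small diagnostic sketch using the
standard SLF4J API prints which binding actually won on the classpath:

import org.slf4j.LoggerFactory;

// Diagnostic sketch: with logback bound this prints
// "ch.qos.logback.classic.LoggerContext"; with slf4j-log4j12 on the
// classpath it prints "org.slf4j.impl.Log4jLoggerFactory".
public class BindingCheck {
    public static void main(String[] args) {
        System.out.println(LoggerFactory.getILoggerFactory().getClass().getName());
    }
}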

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: Pull Request for Airavata Docker images.

2016-10-28 Thread Ajinkya Dhamnaskar
Hi Shameera,

As discussed, I have created a new pull request
(https://github.com/apache/airavata/pull/67).

These are self-contained Docker images; the basic idea is to ease the
deployment process.
The images can be deployed across different systems, provided the proper
entries are present in the airavata-server.properties file.
Please find build and run instructions in the README.

On Thu, Oct 27, 2016 at 12:33 PM, Shameera Rathnayaka <
shameerai...@gmail.com> wrote:

> Hi Ajinkya,
>
> Could you clean this pull request the way it shows only your changes and
> up to date with current develop branch? This pull request has 198 commits
> and 2500+ files changes.
>
> Best,
> Shameera.
>
> On Wed, Oct 19, 2016 at 11:08 AM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
> wrote:
>
>> Hello,
>>
>> I have created a pull request for self contained docker images.(
>> https://github.com/apache/airavata/pull/57)
>> One can use this branch to deploy airavata from scratch.
>>
>> I have cleaned the branch. Please ignore all deletes.
>> These are self contained docker images, basic idea is to ease the
>> deployment process.
>> Images can be deployed across different systems, proper entries need to
>> be there in airavata-server.properties file.
>> Please find build and run instructions in readme.
>>
>> Please revert if you have any concerns.
>>
>> --
>> Thanks and regards,
>>
>> Ajinkya Dhamnaskar
>> Student ID : 0003469679
>> Masters (CS)
>> +1 (812) 369-5416
>>
> --
> Shameera Rathnayaka
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Pull Request for Airavata Docker images.

2016-10-19 Thread Ajinkya Dhamnaskar
Hello,

I have created a pull request for self-contained Docker images
(https://github.com/apache/airavata/pull/57).
One can use this branch to deploy Airavata from scratch.

I have cleaned the branch, so please ignore all the deletes.
These are self-contained Docker images; the basic idea is to ease the
deployment process.
The images can be deployed across different systems, provided the proper
entries are present in the airavata-server.properties file.
Please find build and run instructions in the README.

Please respond if you have any concerns.

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Improve File Transfer Component

2016-10-07 Thread Ajinkya Dhamnaskar
Hello All,

Currently, Airavata supports only SCP file transfer. As we move ahead, we
are planning to support additional file transfer protocols.
Instead of a separate component, we are planning to integrate this into
gfac itself; one possible shape for the abstraction is sketched below.

Let's use this thread to discuss improvements in this regard, and please
do share your views and suggestions.
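
Purely as an illustration and not an agreed design, a small
protocol-agnostic interface that gfac could call, with one implementation
per protocol, might look like this (all names here are hypothetical, not
existing Airavata interfaces):

import java.net.URI;

// Illustrative only: a protocol-agnostic transfer abstraction.
interface FileTransfer {
    void upload(URI localSource, URI remoteDestination) throws Exception;
    void download(URI remoteSource, URI localDestination) throws Exception;
    String protocol(); // e.g. "scp", "sftp", "gridftp"
}

class ScpFileTransfer implements FileTransfer {
    @Override public void upload(URI src, URI dst) { /* scp put would go here */ }
    @Override public void download(URI src, URI dst) { /* scp get would go here */ }
    @Override public String protocol() { return "scp"; }
}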

-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Re: [Airavata] : Local Provider

2016-10-07 Thread Ajinkya Dhamnaskar
Hello,

This is regarding the database changes we had to make to fix the local
provider.

I have added a SECURITY_PROTOCOL column to the LOCAL_SUBMISSION table. If
you do not want to rebuild the database from scratch, use the commands
below to apply the change without disturbing existing data.

ALTER TABLE LOCAL_SUBMISSION ADD COLUMN SECURITY_PROTOCOL VARCHAR (255);

UPDATE LOCAL_SUBMISSION SET SECURITY_PROTOCOL = 'LOCAL';

ALTER TABLE LOCAL_SUBMISSION MODIFY COLUMN SECURITY_PROTOCOL VARCHAR (255)
NOT NULL;

Please respond if you have any questions.

On Thu, Sep 29, 2016 at 4:18 PM, Shameera Rathnayaka <shameerai...@gmail.com
> wrote:

> Hi Ajinkay,
>
> In Airavata we use the Java executor services that come with the Java
> concurrency package for thread management, so we have delegated thread
> creation and management to Java instead of doing it all by ourselves. To
> be specific, we use a fixed thread pool executor inside Airavata. There
> are both good and bad sides to threads vs. processes. If we use processes
> to run these local jobs then you have to manage process creation and
> deletion all by yourself. Anyway, it is not good practice to write
> long-running synchronous jobs, so we can assume these synchronous jobs
> are small in execution time.
>
> Thanks,
> Shameera.
>
> On Thu, Sep 29, 2016 at 2:36 PM Ajinkya Dhamnaskar <adham...@umail.iu.edu>
> wrote:
>
>> Hello All,
>>
>> For local provider I was exploring different ways to submit and get the
>> job done. As, it is a local provider we need not to have monitoring system
>> in place, simply forking a job on different thread should serve the
>> purpose.
>>
>> But, to do so we need to keep investigation that thread. In my opinion,
>> ProcessBuilder in java would be handy in this case or if you have any other
>> suggestions please do let me know.
>>
>> Just want to know and use if there is any better way to this.
>>
>> Thanks in anticipation
>> --
>> Thanks and regards,
>>
>> Ajinkya Dhamnaskar
>> Student ID : 0003469679
>> Masters (CS)
>> +1 (812) 369- 5416
>>
> --
> Shameera Rathnayaka
>



-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


[Airavata] : Local Provider

2016-09-29 Thread Ajinkya Dhamnaskar
Hello All,

For the local provider I was exploring different ways to submit a job and
get it done. Since it is a local provider we do not need a monitoring
system in place; simply forking the job on a different thread should serve
the purpose.

But to do so we need to keep track of that thread. In my opinion,
ProcessBuilder in Java would be handy in this case; if you have any other
suggestions please do let me know.

I just want to know if there is a better way to do this; a rough sketch of
what I have in mind is below.
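
For concreteness, a minimal sketch of ProcessBuilder-based local
submission on a pooled thread (class and method names are illustrative,
not Airavata APIs):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Rough sketch (not Airavata code): fork the local job via ProcessBuilder
// on a pooled thread and keep the Future around to check on it later.
public class LocalJobSketch {
    private static final ExecutorService pool = Executors.newFixedThreadPool(4);

    static Future<Integer> submit(String... command) {
        return pool.submit(() -> {
            Process process = new ProcessBuilder(command)
                    .inheritIO()      // forward the job's stdout/stderr
                    .start();
            return process.waitFor(); // the local job's exit code
        });
    }

    public static void main(String[] args) throws Exception {
        Future<Integer> job = submit("echo", "hello"); // any local command
        System.out.println("exit code: " + job.get());
        pool.shutdown();
    }
}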

Thanks in anticipation
-- 
Thanks and regards,

Ajinkya Dhamnaskar
Student ID : 0003469679
Masters (CS)
+1 (812) 369- 5416


Deploying airavata on aws

2016-09-24 Thread Ajinkya Dhamnaskar
Hi Devs,

I have installed Airavata on an AWS instance (52.207.252.115). Please let
me know if anyone wants access.

I am also trying to install the PGA on the same instance and will update
you all once I get through.

-- 
Thanks and regards,

Ajinkya Dhamnaskar
+1 (812) 369- 5416