Re: Apache Airavata-OODT Integration

2013-07-10 Thread Sanjaya Medonsa
Hi Chris,
 I have started looking at changing the current implementation to use
the file name instead of the product ID. The current PGETask wrapper
implementation takes one of two inputs: a product ID or a file path at the
remote location. If a file path is used, force staging should be set, but I
am not quite sure what force staging means. If I use the current
provisions in PGETaskWrapper, the remote file path (not the file name) has
to be given as input, and I am not sure whether it is ideal to use the file
path instead of the file name. If the file name is to be used as input,
FilesStager needs to be customized to retrieve product references from the
file name. The File Manager client doesn't have a mechanism to retrieve a
product by file name, but it does have one to retrieve a product by product
name. I guess the two are typically the same. One drawback of this approach
is that it doesn't support a list of product names. The method
getProductReferences, which returns the references for a product, has a
back-end implementation that is based on product ID, even though its actual
input is a Product (a Product with only the product name set cannot be used
as input). Please let me know your thoughts.

Best Regards,
Sanjaya





Re: Apache Airavata-OODT Integration

2013-07-08 Thread Mattmann, Chris A (398J)
Hi Sanjaya,

-Original Message-

From: Sanjaya Medonsa sanjaya...@gmail.com
Reply-To: d...@airavata.apache.org d...@airavata.apache.org
Date: Monday, July 8, 2013 12:09 AM
To: Airavata Dev d...@airavata.apache.org
Cc: dev@oodt.apache.org dev@oodt.apache.org
Subject: Re: Apache Airavata-OODT Integration

Hi Chris,
 I have started looking at changing the current implementation to use
the file name instead of the product ID. The current PGETask wrapper
implementation takes one of two inputs: a product ID or a file path at the
remote location. If a file path is used, force staging should be set, but I
am not quite sure what force staging means.

Force staging I believe controls whether or not the staged files are
overwritten.

 If I use the current provisions in PGETaskWrapper, the remote file path
(not the file name) has to be given as input. I am not sure whether it is
ideal to use the file path instead of the file name.

You can easily generate the file path. It does not have to be remote; in
fact, if you think about it, it could easily be local. In Apache OODT we
typically ensure it's local by using distributed filesystems like HDFS, NFS,
or Gluster to make remote files appear local, pushing that portion down
into the distributed filesystem, which we think does a better job of data
movement :). To generate the file path you can use the CAS-PGE SQLQuery
facility, which lets you look up, e.g., $FileLocation/$Filename based on
met fields, which you can then feed into the path.
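
A minimal sketch of the last step described above: joining the FileLocation
and Filename met fields into one path. It deliberately does not use the
CAS-PGE SQLQuery facility itself; the class name, the plain Map standing in
for product metadata, and the sample values are all assumptions made for
illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical helper; not part of OODT or Airavata. */
final class ProductPathSketch {

    /** Joins the FileLocation and Filename met fields into a single path. */
    public static Path resolve(Map<String, String> met) {
        String location = met.get("FileLocation");
        String filename = met.get("Filename");
        if (location == null || filename == null) {
            throw new IllegalArgumentException(
                "FileLocation and Filename are both required");
        }
        return Paths.get(location).resolve(filename);
    }

    public static void main(String[] args) {
        // Sample metadata values are made up for this sketch.
        Map<String, String> met = Map.of(
            "FileLocation", "/data/archive/GenericFile",
            "Filename", "blah.txt");
        System.out.println(resolve(met));
    }
}
```

Because the path is derived only from metadata, the same helper applies
whether the location is truly local or a mounted distributed filesystem.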


If the file name is to be used as input, FilesStager needs to be
customized to retrieve product references from the file name.

See above for an alternative.

The File Manager client doesn't have a mechanism to retrieve a product by
file name, but it does have one to retrieve a product by product name. I
guess the two are typically the same.

Yeah, or the other easy mechanism is simply to issue a query, e.g., build
yourself a Filename query and then query the FM Catalog.
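
To illustrate what a Filename query returns, here is a small sketch that
filters an in-memory stand-in for the catalog by the Filename met field. It
is not the File Manager client API: the class, the list-of-maps catalog, and
the ProductId key are hypothetical and only show the shape of the lookup,
including the fact that more than one product can share a file name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a catalog; not the OODT API. */
final class FilenameQuerySketch {

    /** Returns every entry whose "Filename" met field equals the given name. */
    public static List<Map<String, String>> findByFilename(
            List<Map<String, String>> catalog, String filename) {
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> met : catalog) {
            if (filename.equals(met.get("Filename"))) {
                matches.add(met);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up catalog entries; two of them share the name a.txt.
        List<Map<String, String>> catalog = List.of(
            Map.of("ProductId", "id-1", "Filename", "a.txt"),
            Map.of("ProductId", "id-2", "Filename", "b.txt"),
            Map.of("ProductId", "id-3", "Filename", "a.txt"));
        System.out.println(findByFilename(catalog, "a.txt").size());
    }
}
```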

One drawback of this approach is that it doesn't support a list of
product names. The method getProductReferences, which returns the
references for a product, has a back-end implementation that is based on
product ID, even though its actual input is a Product (a Product with only
the product name set cannot be used as input). Please let me know your
thoughts.

See above.

Cheers,
Chris

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++








Re: Apache Airavata-OODT Integration

2013-06-17 Thread Sanjaya Medonsa
Thanks Chris. I'll update the implementation to use file name instead of
OODT product id.

Cheers,
Sanjaya



Re: Apache Airavata-OODT Integration

2013-06-15 Thread Mattmann, Chris A (398J)
Hey Sanjaya, sure +1 use the Filename. It's not guaranteed to be unique,
but you can easily just pop the first one off the top (latest) and take
that (since it's sorted by product received time). You may check out the
pcs-core module and some of its internal classes like FileManagerUtils
to see some cool helper functions that could aid in this regard.
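
A rough sketch of the "take the latest" idea above, for the case where
several products share one file name. It does not use FileManagerUtils or any
other OODT class; the Candidate type and its fields are hypothetical. Rather
than relying on the result order, the sketch picks the maximum received time
explicitly.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper; not part of OODT or Airavata. */
final class LatestProductSketch {

    /** Hypothetical value type: a product ID plus the time it was received. */
    public static final class Candidate {
        public final String productId;
        public final Instant receivedTime;

        public Candidate(String productId, Instant receivedTime) {
            this.productId = productId;
            this.receivedTime = receivedTime;
        }
    }

    /** Returns the most recently received candidate, or empty if none. */
    public static Optional<Candidate> latest(List<Candidate> sameName) {
        return sameName.stream()
            .max(Comparator.comparing((Candidate c) -> c.receivedTime));
    }

    public static void main(String[] args) {
        // Made-up candidates that share one file name.
        List<Candidate> sameName = List.of(
            new Candidate("old", Instant.parse("2013-06-14T00:00:00Z")),
            new Candidate("new", Instant.parse("2013-06-15T00:00:00Z")));
        latest(sameName).ifPresent(c -> System.out.println(c.productId));
    }
}
```

Returning an Optional keeps the "no product with that file name" case
explicit for the caller.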

Cheers,
Chris







-Original Message-
From: Sanjaya Medonsa sanjaya...@gmail.com
Reply-To: d...@airavata.apache.org d...@airavata.apache.org
Date: Saturday, June 15, 2013 4:04 AM
To: Airavata Dev d...@airavata.apache.org
Subject: Re: Apache Airavata-OODT Integration

Thanks Chris for your help! The working directory is available in
JobExecutionContext in Airavata and can easily be retrieved. The issue in
my case is that, from the XBaya GUI, I take the product ID as input, not
the file name. Internally, the file stager queries the file manager using
the product ID to retrieve the product reference and the corresponding file
name, and stages the file into the input dir. Since this mapping from
product ID to file name happens internally during file staging, my
implementation doesn't have access to the file name unless I query the file
manager to retrieve the corresponding file name using the product ID.

One of the major issues in my implementation seems to be that I use the
OODT product ID as input, not the file name. Should I change my
implementation to use the file name instead of the product ID?

Best Regards,
Sanjaya


On Fri, Jun 14, 2013 at 8:51 PM, Mattmann, Chris A (398J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Hey Sanjaya,

 Easy, see the attached PGEConfig.xml here:

 http://paste.apache.org/6OGW

 In that file:

 1. We compute the staged file path by computing JobDir
 2. We create in the exe block a staged input dir
 3. We stage the files just using cps in the exeBlock (could have
 just as easily used fileStager)
 4. We know that the file is [JobInputDir]/[Filename]

 HTH.

 Cheers,
 Chris







 -Original Message-
 From: Sanjaya Medonsa sanjaya...@gmail.com
 Reply-To: d...@airavata.apache.org d...@airavata.apache.org
 Date: Friday, June 14, 2013 5:02 AM
 To: Airavata Dev d...@airavata.apache.org
 Subject: Re: Apache Airavata-OODT Integration

 Thanks Chris for your input. I actually use the PGETaskInstance for file
 staging with minimal additional code. But my issue is not with the file
 staging. In my current implementation, the application takes a product ID
 as input. Then, using the capabilities of the PGETaskInstance class, it
 does the file staging. My issue is that during file staging the product is
 mapped to a file in the specified working directory. I don't have a way to
 retrieve the staged file name, as it is not recorded in Metadata (for this
 purpose, I query the FileManager again to get the corresponding reference
 name for a given product ID). I need the staged file path, since I replace
 the input product ID with the staged file path prior to the actual
 workflow invocation. Basically, I am looking for an implementation where I
 can easily retrieve the staged file path for a given product ID.
 
 Cheers,
 Sanjaya
 
 

Re: Apache Airavata-OODT Integration

2013-06-12 Thread Mattmann, Chris A (398J)
Hi Sanjaya,

-Original Message-

From: Sanjaya Medonsa sanjaya...@gmail.com
Reply-To: d...@airavata.apache.org d...@airavata.apache.org
Date: Monday, June 10, 2013 5:20 PM
To: d...@airavata.apache.org d...@airavata.apache.org
Cc: dev@oodt.apache.org dev@oodt.apache.org
Subject: Re: Apache Airavata-OODT Integration

Hi Chris,
   On configuration, I have gotten rid of all the configuration files,
including pge-config.xml. All the required configurations are set
programmatically. Settings such as the FileManagerServer URL are
configured in the airavata-server.properties file. I'll update the review
request with the modified details.

Great work!


   I am still not quite clear on how to retrieve the staged file path
properly. Currently I am using the getStagedFilePath method
in ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file path.
While going through the OODT code, I saw a method in DataTransferer that
notifies the FileManagerServer once a transfer is completed, but I couldn't
find the same for product retrieval.

Example:
http://svn.apache.org/repos/asf/oodt/trunk/pge/src/test/resources/pge-config.xml


Review Board tickets:
https://reviews.apache.org/r/4746/

https://reviews.apache.org/r/5382/


JIRA issue source (in OODT since 0.4):
  https://issues.apache.org/jira/browse/OODT-443


   As you suggested, I'll improve my workflow using Apache Tika. I'd
like to continue this as a parallel task. While modifying the staging
implementation based on community feedback, I am currently looking at
ingesting output back into OODT.

See above for info on file staging. I would strongly encourage you not
to reimplement CAS-PGE in Airavata -- it's pretty functional and expressive
anyways and I would work to figure out how to make Airavata leverage
CAS-PGE.

Cheers,
Chris







Re: Apache Airavata-OODT Integration

2013-06-12 Thread Mattmann, Chris A (398J)
+5000 great idea, as usual my friend.

Cheers,
Chris







-Original Message-
From: Suresh Marru sma...@apache.org
Reply-To: d...@airavata.apache.org d...@airavata.apache.org
Date: Wednesday, June 12, 2013 9:51 AM
To: d...@airavata.apache.org d...@airavata.apache.org
Cc: dev@oodt.apache.org dev@oodt.apache.org
Subject: Re: Apache Airavata-OODT Integration

On Jun 12, 2013, at 12:34 PM, Mattmann, Chris A (398J)
chris.a.mattm...@jpl.nasa.gov wrote:

 See above for info on file staging. I would strongly encourage you not
 to reimplement CAS-PGE in Airavata -- it's pretty functional and
 expressive anyways and I would work to figure out how to make Airavata
 leverage CAS-PGE.

+ 1. 

Sanjaya, Airavata and OODT communities,

Any volunteers to write a paper on "A tale of two Apache workflow
systems: Airavata and OODT"?

Given the page limit, and to keep it in scope, I suggest leaving out the
use cases of the systems and focusing on the software architectures: a
detailed technical paper comparing and contrasting the features and
identifying potential collaborative components.

If you want a deadline, how about August 15th to WORKS workshop -
http://works.cs.cardiff.ac.uk/

Suresh



Re: Apache Airavata-OODT Integration

2013-06-12 Thread Sanjaya Medonsa
Hi Chris,
   On configuration, I have gotten rid of all the configuration files,
including pge-config.xml. All the required configurations are set
programmatically. Settings such as the FileManagerServer URL are
configured in the airavata-server.properties file. I'll update the review
request with the modified details.
   I am still not quite clear on how to retrieve the staged file path
properly. Currently I am using the getStagedFilePath method
in ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file path.
While going through the OODT code, I saw a method in DataTransferer that
notifies the FileManagerServer once a transfer is completed, but I couldn't
find the same for product retrieval.
   As you suggested, I'll improve my workflow using Apache Tika. I'd
like to continue this as a parallel task. While modifying the staging
implementation based on community feedback, I am currently looking at
ingesting output back into OODT.

Best Regards,
Sanjaya



On Wed, Jun 5, 2013 at 12:11 AM, Mattmann, Chris A (398J) 
chris.a.mattm...@jpl.nasa.gov wrote:

 Hi Sanjaya,

 I think starting out with /bin/ls would be good, maybe like a /bin/ls
 workflow, and then for each file returned, maybe run Apache Tika and
 extract its metadata and then pipe that to a file?

 How about that?

 Cheers,
 Chris







 -Original Message-
 From: Sanjaya Medonsa sanjaya...@gmail.com
 Reply-To: d...@airavata.apache.org d...@airavata.apache.org
 Date: Tuesday, June 4, 2013 5:31 AM
 To: d...@airavata.apache.org d...@airavata.apache.org
 Cc: dev@oodt.apache.org dev@oodt.apache.org
 Subject: Re: Apache Airavata-OODT Integration

 Hi Chris,
  Please see my comments below on the two items.
 
 Configuration: It should be possible to set them programmatically.
 Actually, I have partly implemented this for the file staging information.
 I'll work to get rid of the other configuration files.
 
 Staged File Path: I'll work on the suggested approach, though I do not
 fully understand it at the moment. I guess I need to go through CAS-PGE a
 bit more and come back to you on the proposed approach.
 
 Currently I am testing this by wrapping the /bin/ls command as a GFac
 service. I may need to test this with a real workflow. Could you please
 give me some guidance on a better scenario to test this?
 
 Cheers,
 Sanjaya
 
 
 
 
 On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) 
 chris.a.mattm...@jpl.nasa.gov wrote:
 
  Hi Sanjaya,
 
  -Original Message-
 
  From: Sanjaya Medonsa sanjaya...@gmail.com
  Reply-To: d...@airavata.apache.org d...@airavata.apache.org
  Date: Thursday, May 30, 2013 5:12 AM
  To: dev@oodt.apache.org dev@oodt.apache.org,
 d...@airavata.apache.org
  d...@airavata.apache.org
  Subject: Apache Airavata-OODT Integration
 
  Hi,
   I have worked on the Apache Airavata integration with Apache OODT. As
  a first step, I have implemented integration with the Apache OODT file
  manager component.
 
  Great work!!
 
  Comments below:
 
  1. Introduced a new GFac Schema type called OODTProduct, which takes
     Apache OODT product IDs as input.
  2. Implemented a new pre GFac Handler by extending Apache OODT
     PgeTaskInstance to stage the corresponding file into the working
     directory.
  3. Once the file is staged, the input parameter with the OODT product ID
     is replaced with the path of the staged file for downstream
     processing.
 
  I have tested the implementation with a GFac application that wraps the
  /bin/ls command. The application takes a product ID as input, stages the
  corresponding file into the working directory, and /bin/ls is executed
  against the staged file. I hope this is a valid testing scenario.
  
  Concerns
  - Configurations: I have added a new configuration file named
  oodt-integration.properties in addition to the dynamic_metadata.met and
  pge-config.xml files used by OODT. At the moment, though, nothing is
  configured in oodt-integration.properties.
 
  You probably only need the pge-config.xml file. Dynamic metadata and the
  task configuration properties can be specified programmatically, right?
 
  - Staged File Name: With the current implementation of PgeTaskInstance it
  is not possible to retrieve the path of the staged file. Due to this
  limitation, I query the FileManagerServer with the product ID, retrieve
  the file name, and compute the file path from the working directory.
 
  I'm not sure I understand this? If you store

Re: Apache Airavata-OODT Integration

2013-06-12 Thread Sanjaya Medonsa
Hi Chris,
  I have just realized that the Airavata GFac Handler was recently updated
to include GFac Handler-specific configuration. I think I should move the
configurations in airavata-server.properties into gfac-config.xml as
properties of the GFac handler that performs the OODT file staging.

Best Regards,
Sanjaya

