[ 
https://issues.apache.org/jira/browse/AIRAVATA-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749289#comment-13749289
 ] 

Mawathage Sanjaya Priyadarshana Medonsa commented on AIRAVATA-52:
-----------------------------------------------------------------

Proposed Solution
=================

1. Extend GFacParameterTypes.xsd to introduce a new type that handles OODT 
products.
2. Introduce a new module, oodt-integration, to handle OODT product retrieval 
and ingestion.
     - The GFacHandler extension in Airavata is used as the integration point. 
A new input handler and a new output handler are introduced to handle file 
retrieval and ingestion respectively.
     - Both handler implementations are based on PGETaskInstance (the task 
wrapper) in OODT. Since Airavata has no support for task wrappers, the 
pre- and post-task execution responsibilities are given to the IN/OUT handlers.
     
3. Workflow Execution
     - The workflow should be set up with OODTProduct as the input.
     - OODTProductStager is registered with Airavata as an input handler. It 
processes any input of type OODTProduct and invokes 
ApacheAiravataWorkFlowInstance to stage the file into the Airavata staging 
directory configured for the application. ApacheAiravataWorkFlowInstance is an 
extension of OODT's PGETaskInstance.
     - ApacheAiravataWorkFlowInstance initializes the PGETaskInstance metadata 
and configuration from the application configuration, the Airavata 
configuration and the properties configured on the OODTProductStager input 
handler. OODT-specific configuration is supplied through the OODTProductStager 
properties in gfac-config.xml. All configuration required by PGETaskInstance 
is made dynamic where possible.
     - ApacheAiravataWorkFlowInstance interprets the OODTProduct input as one 
of the following:
         1. Local file - If the input contains a file separator, it is treated 
as a local file. NFS or HDFS could be used to make the OODT file manager 
repository local to the workflow execution.
         2. Product name - This approach requires a remote transfer. 
ApacheAiravataWorkFlowInstance first queries the OODTFileManager to retrieve 
the corresponding product ID from the file manager repository and then uses it 
for the remote file transfer.
     - The scientific application takes the staged file as its input. Once the 
file has been staged successfully into the application's staging directory, 
OODTProductStager replaces the input value with the staged file path for the 
rest of the processing.
     - Once the workflow has executed successfully, the generated output is 
staged into the OODT product repository and metadata catalog by the Airavata 
OUT handler. OODTOutputIngester is an extension of OODT's PGETaskInstance that 
uses the post-execution capabilities of PGETaskInstance. OODTOutputIngester 
ingests the output together with a rich set of metadata, which is useful for 
provenance-aware workflow processing.
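
The input interpretation rule described above (a value containing a file 
separator is a local file, anything else is a product name) could be sketched 
roughly as follows. This is only an illustration of the rule, not the actual 
ApacheAiravataWorkFlowInstance code: the class OodtInputClassifier, the enum 
InputKind and the method classify are hypothetical names, and no Airavata or 
OODT API is used.

```java
import java.io.File;

/** Hypothetical helper; not part of Airavata or OODT. */
public class OodtInputClassifier {

    /** The two interpretations of an OODTProduct input described in the proposal. */
    public enum InputKind { LOCAL_FILE, PRODUCT_NAME }

    /**
     * Classifies an input value using an explicit separator.
     * A value containing the separator is treated as a local file path;
     * anything else is treated as a product name to be looked up remotely.
     */
    public static InputKind classify(String input, String separator) {
        if (input == null || input.isEmpty()) {
            throw new IllegalArgumentException("OODTProduct input must not be empty");
        }
        return input.contains(separator) ? InputKind.LOCAL_FILE : InputKind.PRODUCT_NAME;
    }

    /** Classifies an input value using the platform file separator. */
    public static InputKind classify(String input) {
        return classify(input, File.separator);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        System.out.println(classify("/data/archive/granule.dat", "/"));
        System.out.println(classify("granule.dat", "/"));
    }
}
```

In such a sketch the LOCAL_FILE branch would copy the file straight into the 
staging directory, while the PRODUCT_NAME branch would first resolve the 
product ID through the file manager before transferring the file.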

Testing
=======
     - Initial testing of the integration is planned around the Linux /bin/ls 
utility, which lists basic information about a file, such as its access 
privileges.
     - As a subtask of this project, a new sample service is implemented to 
extract metadata from the ingested output file using Apache Tika.

                
> [GSoC] Create OODT File Manager extension for Airavata
> ------------------------------------------------------
>
>                 Key: AIRAVATA-52
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-52
>             Project: Airavata
>          Issue Type: New Feature
>          Components: GFac, XBaya
>            Reporter: Chris A. Mattmann
>            Priority: Minor
>              Labels: gsoc2012, gsoc2013, mentor
>             Fix For: WISHLIST
>
>
> Create an OODT File Manager extension point for Airavata for data and 
> workflow cataloging. Also investigate the use of cas-catalog from OODT. 
> Airavata GFac has a data cataloging interface which by default publishes to 
> the Airavata registry. An implementation that uses the rich data cataloging 
> features of OODT will enable users to store and retrieve richer metadata 
> from workflow execution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
