Hi Chris,

I have just realized that the Airavata GFac Handler was recently updated to support handler-specific configuration. I think I should move the configuration from airavata-server.properties into gfac-config.xml, as properties of the GFac handler that performs the OODT file staging.
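As a rough illustration of the idea, here is a minimal sketch of handler-scoped properties and how a handler might look them up. It is written in Python for brevity rather than in Airavata's Java, and the XML element names, the handler class name, and the property names are all assumptions made for the example, not the actual gfac-config.xml schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: element, attribute, class, and property names are
# assumptions for illustration, not the real gfac-config.xml schema.
SAMPLE_CONFIG = """
<GFac>
  <Handler class="example.OODTFileStagingHandler">
    <property name="fileManagerUrl" value="http://localhost:9000"/>
    <property name="workingDirectory" value="/tmp/airavata"/>
  </Handler>
</GFac>
"""


def handler_properties(config_xml, handler_class):
    """Return the name/value properties declared for one handler class."""
    root = ET.fromstring(config_xml.strip())
    for handler in root.iter("Handler"):
        if handler.get("class") == handler_class:
            return {p.get("name"): p.get("value")
                    for p in handler.findall("property")}
    # No matching handler entry: the caller gets an empty mapping.
    return {}


props = handler_properties(SAMPLE_CONFIG, "example.OODTFileStagingHandler")
print(props["fileManagerUrl"])
```

Keeping the settings next to the handler entry means only the staging handler sees them, instead of every component that reads airavata-server.properties.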
Best Regards,
Sanjaya

On Tue, Jun 11, 2013 at 5:50 AM, Sanjaya Medonsa <sanjaya...@gmail.com> wrote:

> Hi Chris,
> On configuration, I have gotten rid of all the configuration files,
> including pge-config.xml. All the required configuration is now set
> programmatically. Settings such as the FileManagerServer URL are
> configured in the airavata-server.properties file. I'll update the review
> request with the modified details.
> I am still not quite clear on how to retrieve the staged file path
> properly. Currently I am using the getStagedFilePath method
> in ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file path.
> While going through the OODT code, I saw a method in
> DataTransferer that notifies the FileManagerServer once a transfer is
> completed, but I couldn't find the equivalent for product retrieval.
> As you suggested, I'll improve my workflow using Apache Tika. I'd
> like to continue this as a parallel task. While modifying the staging
> implementation based on community feedback, I am currently looking at
> ingesting output back into OODT.
>
> Best Regards,
> Sanjaya
>
> On Wed, Jun 5, 2013 at 12:11 AM, Mattmann, Chris A (398J) <
> chris.a.mattm...@jpl.nasa.gov> wrote:
>
>> Hi Sanjaya,
>>
>> I think starting out with /bin/ls would be good, maybe like a /bin/ls
>> workflow, and then for each file returned, maybe run Apache Tika and
>> extract its metadata and then pipe that to a file?
>>
>> How about that?
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: chris.a.mattm...@nasa.gov
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>> -----Original Message-----
>> From: Sanjaya Medonsa <sanjaya...@gmail.com>
>> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
>> Date: Tuesday, June 4, 2013 5:31 AM
>> To: "d...@airavata.apache.org" <d...@airavata.apache.org>
>> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
>> Subject: Re: Apache Airavata-OODT Integration
>>
>> >Hi Chris,
>> > Please see my comments below on the two items.
>> >
>> >Configuration: It should be possible to set them programmatically.
>> >Actually, I have partly implemented this for the file staging
>> >information. I'll work on getting rid of the other configuration files.
>> >
>> >Staged File Path: I'll work on the suggested approach, though I do not
>> >fully understand it at the moment. I guess I need to read a bit more
>> >about CAS-PGE and come back to you on the proposed approach.
>> >
>> >Currently I am testing this by wrapping the /bin/ls command as a GFac
>> >service. I may need to test this with a real workflow. Could you please
>> >give me some guidance on a better scenario to test this?
>> >
>> >Cheers,
>> >Sanjaya
>> >
>> >On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) <
>> >chris.a.mattm...@jpl.nasa.gov> wrote:
>> >
>> >> Hi Sanjaya,
>> >>
>> >> -----Original Message-----
>> >>
>> >> From: Sanjaya Medonsa <sanjaya...@gmail.com>
>> >> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
>> >> Date: Thursday, May 30, 2013 5:12 AM
>> >> To: "dev@oodt.apache.org" <dev@oodt.apache.org>,
>> >> "d...@airavata.apache.org" <d...@airavata.apache.org>
>> >> Subject: Apache Airavata-OODT Integration
>> >>
>> >> >Hi,
>> >> > I have worked on the Apache Airavata integration with Apache
>> >> >OODT. As a first step, I have implemented integration with the
>> >> >Apache OODT File Manager component.
>> >>
>> >> Great work!!
>> >>
>> >> Comments below:
>> >>
>> >> > 1. Introduced a new GFac schema type called OODTProduct, which
>> >> >takes Apache OODT product IDs as input.
>> >> > 2. Implemented a new pre-GFac handler by extending the Apache OODT
>> >> >PgeTaskInstance to stage the corresponding file into the working
>> >> >directory.
>> >> > 3. Once the file is staged, the input parameter holding the OODT
>> >> >product ID is replaced with the path of the staged file for
>> >> >downstream processing.
>> >> >
>> >> >I have tested the implementation with a GFac application that wraps
>> >> >the /bin/ls command. The application takes a product ID as input,
>> >> >stages the corresponding file into the working directory, and
>> >> >executes /bin/ls against the staged file.
>> >> >Hope this is a valid testing scenario.
>> >> >
>> >> >Concerns
>> >> >- Configurations: I have added a new configuration file named
>> >> >oodt-integration.properties in addition to the dynamic_metadata.met
>> >> >and pge-config.xml files used by OODT. But at the moment nothing is
>> >> >configured in oodt-integration.properties.
>> >>
>> >> You probably only need the pge-config.xml file.
>> >> Dynamic metadata and the task configuration properties can be
>> >> specified programmatically, right?
>> >>
>> >> >- Staged File Name: With the current implementation of
>> >> >PgeTaskInstance it is not possible to retrieve the path of the
>> >> >staged file. Due to this limitation, I query the FileManagerServer
>> >> >with the product ID, retrieve the file name, and compute the file
>> >> >path from the working directory.
>> >>
>> >> I'm not sure I understand this? If you store and record the Filename
>> >> and FileLocation metadata files, then you can easily retrieve the
>> >> staged file path via a SQL query via CAS-PGE by simply setting
>> >> FORMAT=('$FileLocation/$Filename') in the response.
>> >> Can you comment on this?
>> >>
>> >> >- Currently it is not possible to execute the workflow using XBaya
>> >> >due to a validation failure caused by the new schema type. I have
>> >> >commented out the relevant validation code for testing purposes.
>> >>
>> >> OK, will probably need to work on this.
>> >>
>> >> >
>> >> >Currently I am having an issue with the Review Board client tool
>> >> >and need to resolve it to upload the code for review.
>> >>
>> >> I see later that you got this working, so will head over and review
>> >> that now.
>> >>
>> >> Thanks!
>> >>
>> >> Cheers,
>> >> Chris
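
The two ways of obtaining a file path discussed in this thread can be sketched side by side: expanding a FORMAT-style template such as '$FileLocation/$Filename' against a product's metadata, and joining the handler's working directory with the product's file name. This is only an illustration in Python, not OODT or Airavata code. The metadata values, the working directory, and both function names are hypothetical, and whether FileLocation points at the staged copy or at the archived original depends on how the metadata is recorded, which the thread leaves open.

```python
import posixpath
from string import Template

# Hypothetical metadata record. The key names follow the FORMAT string quoted
# in the thread; the values are made up for illustration.
product_metadata = {
    "FileLocation": "/data/archive/products",
    "Filename": "sample.txt",
}


def path_from_format(metadata, fmt="$FileLocation/$Filename"):
    """Expand a FORMAT-style template against a product's metadata."""
    return Template(fmt).substitute(metadata)


def path_from_working_dir(working_dir, metadata):
    """Join the working directory with the product's file name."""
    return posixpath.join(working_dir, metadata["Filename"])


recorded = path_from_format(product_metadata)
staged = path_from_working_dir("/tmp/airavata/work", product_metadata)
print(recorded)
print(staged)
```

The first function needs only the recorded metadata, while the second also needs to know where the handler staged the file, which is the extra bookkeeping described in the "Staged File Name" concern.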