Thanks Chris. I'll update the implementation to use the file name instead of the OODT product id.
Cheers,
Sanjaya

On Sun, Jun 16, 2013 at 12:51 AM, Mattmann, Chris A (398J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hey Sanjaya, sure +1 use the Filename. It's not guaranteed to be unique,
> but you can easily just pop the first one off the top (latest) and take
> that (since it's sorted by product received time). You may check out the
> pcs-core module and some of its internal classes like FileManagerUtils
> to see some cool helper functions that could aid in this regard.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Sanjaya Medonsa <sanjaya...@gmail.com>
> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> Date: Saturday, June 15, 2013 4:04 AM
> To: Airavata Dev <d...@airavata.apache.org>
> Subject: Re: Apache Airavata-OODT Integration
>
> >Thanks Chris for your help! The working directory is available in
> >JobExecutionContext in Airavata and the directory can easily be retrieved.
> >The issue in my case is that, from the XBaya GUI, I take the product id as
> >input, not the file name. Internally the file stager queries the file
> >manager using the product id to retrieve the product reference and the
> >corresponding file name to stage the file into the input dir. Since this
> >product id to file name mapping happens internally during the file staging,
> >my implementation doesn't have access to the filename unless I query the
> >file manager to retrieve the corresponding file name using the product id.
> >
> >One of the major issues in my implementation seems to be that I use the
> >OODT product id as input, not the file name. Should I change my
> >implementation to use the file name instead of the product id?
> >
> >Best Regards,
> >Sanjaya
> >
> >On Fri, Jun 14, 2013 at 8:51 PM, Mattmann, Chris A (398J) <
> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> >> Hey Sanjaya,
> >>
> >> Easy, see the attached PGEConfig.xml here:
> >>
> >> http://paste.apache.org/6OGW
> >>
> >> In that file:
> >>
> >> 1. We compute the staged file path by computing JobDir
> >> 2. We create in the exe block a staged input dir
> >> 3. We stage the files just using cps in the exeBlock (could have
> >> just as easily used fileStager)
> >> 4. We know that the file is [JobInputDir]/[Filename]
> >>
> >> HTH.
> >>
> >> Cheers,
> >> Chris
> >>
> >> -----Original Message-----
> >> From: Sanjaya Medonsa <sanjaya...@gmail.com>
> >> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> Date: Friday, June 14, 2013 5:02 AM
> >> To: Airavata Dev <d...@airavata.apache.org>
> >> Subject: Re: Apache Airavata-OODT Integration
> >>
> >> >Thanks Chris for your input. I actually use the PGETaskInstance for file
> >> >staging with minimal additional code. But my issue is not with the file
> >> >staging. As per my current implementation, the application takes a
> >> >product id as input.
> >> >Then, using the capabilities in the PGETaskInstance class, it does the
> >> >file staging. But my issue is that during the file staging the product
> >> >is mapped to a file in the specified working directory. I don't have a
> >> >way to retrieve the staged file name, as it is not recorded in Metadata
> >> >(for this purpose, I query the FileManager again to get the
> >> >corresponding reference name for a given product id). I need the staged
> >> >file path, since I replace the input product id with the staged file
> >> >path prior to the actual workflow invocation. Basically I am looking for
> >> >some implementation where I can easily retrieve the staged file path for
> >> >a given product id.
> >> >
> >> >Cheers,
> >> >Sanjaya
> >> >
> >> >On Wed, Jun 12, 2013 at 10:04 PM, Mattmann, Chris A (398J) <
> >> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >> >
> >> >> Hi Sanjaya,
> >> >>
> >> >> -----Original Message-----
> >> >> From: Sanjaya Medonsa <sanjaya...@gmail.com>
> >> >> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> Date: Monday, June 10, 2013 5:20 PM
> >> >> To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> >> Subject: Re: Apache Airavata-OODT Integration
> >> >>
> >> >> >Hi Chris,
> >> >> >    On configuration, I have gotten rid of all the configuration
> >> >> >files, including pge-config.xml. All the required configurations are
> >> >> >programmatically set. Configurations such as the FileManagerServer URL
> >> >> >are configured in the airavata-server.properties file. I'll update the
> >> >> >review request with modified details.
> >> >>
> >> >> Great work!
> >> >>
> >> >> >    Still I am not quite clear on how to retrieve the staged file path
> >> >> >properly. Currently I am using the getStagedFilePath method
> >> >> >in ApacheAiravataWorkFlowInstanceImpl to regenerate the staged file
> >> >> >path.
> >> >> >While going through the OODT code I have seen a method in
> >> >> >DataTransferer to notify the FileManagerServer once a transfer is
> >> >> >completed. But I couldn't see the same for product retrieval.
> >> >>
> >> >> Example:
> >> >>
> >> >> http://svn.apache.org/repos/asf/oodt/trunk/pge/src/test/resources/pge-config.xml
> >> >>
> >> >> Review Board tickets:
> >> >> https://reviews.apache.org/r/4746/
> >> >> https://reviews.apache.org/r/5382/
> >> >>
> >> >> JIRA issue source (in OODT since 0.4):
> >> >> https://issues.apache.org/jira/browse/OODT-443
> >> >>
> >> >> >    As you suggested I'll improve my workflow using Apache Tika. I'd
> >> >> >like to continue this as a parallel task. While modifying the staging
> >> >> >implementation based on community feedback, currently I am looking at
> >> >> >ingesting output back to OODT.
> >> >>
> >> >> See above for info on file staging. I would strongly encourage you not
> >> >> to reimplement CAS-PGE in Airavata -- it's pretty functional and
> >> >> expressive anyways and I would work to figure out how to make Airavata
> >> >> leverage CAS-PGE.
> >> >>
> >> >> Cheers,
> >> >> Chris
> >> >>
> >> >> >On Wed, Jun 5, 2013 at 12:11 AM, Mattmann, Chris A (398J) <
> >> >> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >> >> >
> >> >> >> Hi Sanjaya,
> >> >> >>
> >> >> >> I think starting out with /bin/ls would be good, maybe like a
> >> >> >> /bin/ls workflow, and then for each file returned, maybe run Apache
> >> >> >> Tika and extract its metadata and then pipe that to a file?
> >> >> >>
> >> >> >> How about that?
> >> >> >>
> >> >> >> Cheers,
> >> >> >> Chris
> >> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Sanjaya Medonsa <sanjaya...@gmail.com>
> >> >> >> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> >> Date: Tuesday, June 4, 2013 5:31 AM
> >> >> >> To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> >> Cc: "dev@oodt.apache.org" <dev@oodt.apache.org>
> >> >> >> Subject: Re: Apache Airavata-OODT Integration
> >> >> >>
> >> >> >> >Hi Chris,
> >> >> >> >    Please see my comments below on the two items.
> >> >> >> >
> >> >> >> >Configuration : It should be possible to set them programmatically.
> >> >> >> >Actually I have partly implemented it for file staging information.
> >> >> >> >I'll work to get rid of the other configuration files.
> >> >> >> >
> >> >> >> >Staged File Path : I'll work on the suggested approach, though I do
> >> >> >> >not fully understand it at the moment. I guess I need to go through
> >> >> >> >a bit more on CAS-PGE and come back to you on the proposed approach.
> >> >> >> >
> >> >> >> >Currently I am testing this by wrapping the /bin/ls command as a
> >> >> >> >GFac service. I may need to test this with a real workflow. Could
> >> >> >> >you please provide me some guidance on a better scenario to test
> >> >> >> >this?
> >> >> >> >
> >> >> >> >Cheers,
> >> >> >> >Sanjaya
> >> >> >> >
> >> >> >> >On Mon, Jun 3, 2013 at 8:17 PM, Mattmann, Chris A (398J) <
> >> >> >> >chris.a.mattm...@jpl.nasa.gov> wrote:
> >> >> >> >
> >> >> >> >> Hi Sanjaya,
> >> >> >> >>
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Sanjaya Medonsa <sanjaya...@gmail.com>
> >> >> >> >> Reply-To: "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> >> >> Date: Thursday, May 30, 2013 5:12 AM
> >> >> >> >> To: "dev@oodt.apache.org" <dev@oodt.apache.org>,
> >> >> >> >> "d...@airavata.apache.org" <d...@airavata.apache.org>
> >> >> >> >> Subject: Apache Airavata-OODT Integration
> >> >> >> >>
> >> >> >> >> >Hi,
> >> >> >> >> >    I have worked on the Apache Airavata integration with Apache
> >> >> >> >> >OODT. As a first step, I have implemented integration with the
> >> >> >> >> >Apache OODT file manager component.
> >> >> >> >>
> >> >> >> >> Great work!!
> >> >> >> >>
> >> >> >> >> Comments below:
> >> >> >> >>
> >> >> >> >> >    1. Introduced a new GFac Schema type called OODTProduct, which
> >> >> >> >> >takes Apache OODT product IDs as input.
> >> >> >> >> >    2. Implemented a new pre GFac Handler by extending Apache OODT
> >> >> >> >> >PgeTaskInstance to stage the corresponding file into the working
> >> >> >> >> >directory.
> >> >> >> >> >    3. Once the file is staged, the input parameter with the OODT
> >> >> >> >> >product id is replaced with the path of the staged file for
> >> >> >> >> >downstream processing.
> >> >> >> >> >
> >> >> >> >> >I have tested the implementation with a GFac application which
> >> >> >> >> >wraps the /bin/ls command. The application takes a product id as
> >> >> >> >> >input, stages the corresponding file into the working directory,
> >> >> >> >> >and /bin/ls is executed against the staged file.
> >> >> >> >> >Hope this is a valid testing scenario.
> >> >> >> >> >
> >> >> >> >> >Concerns
> >> >> >> >> >- Configurations : I have added a new configuration file named
> >> >> >> >> >oodt-integration.properties in addition to the
> >> >> >> >> >dynamic_metadata.met and pge-config.xml files used by OODT. But
> >> >> >> >> >at the moment there is no item configured with the
> >> >> >> >> >oodt-integration.properties.
> >> >> >> >>
> >> >> >> >> You probably only need the pge-config.xml file. Dynamic metadata,
> >> >> >> >> and the task configuration properties can be specified
> >> >> >> >> programmatically, right?
> >> >> >> >>
> >> >> >> >> >- Staged File Name - With the current implementation of
> >> >> >> >> >PgeTaskInstance it is not possible to retrieve the path of the
> >> >> >> >> >staged file. Due to this limitation, I query the
> >> >> >> >> >FileManagerServer with the product id, retrieve the file name,
> >> >> >> >> >and compute the file path using the working directory
> >> >> >> >> >information.
> >> >> >> >>
> >> >> >> >> I'm not sure I understand this? If you store and record the
> >> >> >> >> Filename, and FileLocation metadata files, then you can easily
> >> >> >> >> retrieve the staged file path via a SQL query via CAS-PGE by
> >> >> >> >> simply setting the FORMAT=('$FileLocation/$Filename') in the
> >> >> >> >> response.
> >> >> >> >> Can you comment on this?
> >> >> >> >>
> >> >> >> >> >- Currently it is not possible to execute the workflow using
> >> >> >> >> >XBaya due to a validation failure caused by the new schema type.
> >> >> >> >> >I have commented out the relevant validation code for testing
> >> >> >> >> >purposes.
> >> >> >> >>
> >> >> >> >> OK, will probably need to work on this.
> >> >> >> >>
> >> >> >> >> >Currently I am having an issue with the review board client tool
> >> >> >> >> >and need to resolve it to upload the code for review.
> >> >> >> >>
> >> >> >> >> I see later that you got this working, so will head over and
> >> >> >> >> review that now.
> >> >> >> >>
> >> >> >> >> Thanks!
> >> >> >> >>
> >> >> >> >> Cheers,
> >> >> >> >> Chris
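
For readers following the thread, the two ideas discussed above (look a product up by Filename and take the most recently received match, then build the staged path as [JobInputDir]/[Filename] or via a '$FileLocation/$Filename' format string) can be sketched roughly as below. This is only an illustration in plain Python, not OODT or Airavata code: ProductRecord, latest_by_filename, staged_file_path and format_path are hypothetical names, and the sample directories and ids are made up.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePosixPath
from string import Template
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class ProductRecord:
    """Hypothetical stand-in for a catalog entry; not an OODT class."""
    product_id: str
    filename: str
    received_time: datetime


def latest_by_filename(records: Iterable[ProductRecord],
                       filename: str) -> Optional[ProductRecord]:
    """Filenames are not guaranteed unique, so pick the latest received match."""
    matches = [r for r in records if r.filename == filename]
    if not matches:
        return None
    return max(matches, key=lambda r: r.received_time)


def staged_file_path(job_input_dir: str, filename: str) -> str:
    """Compute [JobInputDir]/[Filename] for a file staged into the job dir."""
    return str(PurePosixPath(job_input_dir) / filename)


def format_path(metadata: Mapping[str, str],
                fmt: str = "$FileLocation/$Filename") -> str:
    """Fill a '$Key' style format string from metadata (an imitation of the
    FORMAT idea mentioned in the thread, not OODT's own implementation)."""
    return Template(fmt).substitute(metadata)


# Made-up sample data: two products share the same file name.
records = [
    ProductRecord("p1", "data.txt", datetime(2013, 6, 1, 12, 0)),
    ProductRecord("p2", "data.txt", datetime(2013, 6, 10, 9, 30)),
    ProductRecord("p3", "other.txt", datetime(2013, 6, 12, 8, 0)),
]

latest = latest_by_filename(records, "data.txt")
if latest is not None:
    print(latest.product_id, staged_file_path("/tmp/job-1/input", latest.filename))
print(format_path({"FileLocation": "/data/archive", "Filename": "data.txt"}))
```

Taking the maximum by received time mirrors the "pop the first one off the top (latest)" suggestion without relying on how any particular query happens to sort its results.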