Hi Ricky,

On Jul 20, 2012, at 9:35 AM, Nguyen, Ricky wrote:

> Hi folks,
>
> I was thinking of using Resource Manager to execute my PGE tasks on various
> worker nodes (batch stubs) in hopes of improving the run time of my workflows.

Awesome, OK!

> So my first question is, how will a batch stub obtain files from the File
> Manager?

The batch stub doesn't by default, but CAS-PGE will. It's part of the task
wrapper functionality to take care of that and act as the "embryo" and
"umbilical cord" to the rest of the substrate services, per:

http://sunset.usc.edu/~mattmann/pubs/SMCIT09.pdf

> I'm only familiar with the local use case: PGE (without resmgr) will just use
> "$FileLocation/$Filename" returned by filemgr to access the file locally.

Yep through TaskJob and TaskJobInput inside of the Workflow Manager structs
package, it works the exact same in remote mode (with Resource Manager).
Pretty cool, huh?

> My guess is that all the nodes should mount the filemgr archive dir at the
> same NFS location (as in
> https://cwiki.apache.org/OODT/getting-products-from-a-remote-filemanager.html).

That works pre 0.4 -- in 0.4 Brian Foster has added file staging to CAS-PGE
so it will stage too.

> Or does the PushPull component have a role here?

It can via the Data Transfer extension point in File Manager, but doesn't
have to b/c of NFS (or better yet HDFS) mount, and also b/c of 0.4 CAS-PGE.

HTH!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
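
For illustration only, the two access paths discussed in this thread (a shared NFS/HDFS mount versus staging the file onto the worker node) can be sketched in a few lines of Python. This is not OODT code and uses no OODT API: the function name `resolve_product`, the `fetch` callable, and the dictionary shape of the metadata are all hypothetical, chosen only to mirror the "$FileLocation/$Filename" convention mentioned above.

```python
import os
import shutil


def resolve_product(metadata, staging_dir, fetch=shutil.copy):
    """Return a path on this node where the product can be read.

    metadata    -- dict with "FileLocation" and "Filename" keys
                   (hypothetical shape, mirroring $FileLocation/$Filename)
    staging_dir -- local directory to stage into when no shared mount exists
    fetch       -- hypothetical transfer hook taking (source, destination);
                   defaults to a plain local copy
    """
    shared_path = os.path.join(metadata["FileLocation"], metadata["Filename"])

    # Shared-mount case: every node sees the archive at the same location,
    # so the path recorded in the catalog can be used directly.
    if os.path.exists(shared_path):
        return shared_path

    # Staging case: the archive path is not visible here, so transfer the
    # file into a local directory and use the staged copy instead.
    os.makedirs(staging_dir, exist_ok=True)
    staged_path = os.path.join(staging_dir, metadata["Filename"])
    fetch(shared_path, staged_path)
    return staged_path
```

With a shared mount the function returns the catalog path unchanged; otherwise the caller supplies whatever transfer mechanism applies through `fetch` and receives the staged path back.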
