Hi Ricky,

On Jul 20, 2012, at 9:35 AM, Nguyen, Ricky wrote:

> Hi folks,
> 
> I was thinking of using Resource Manager to execute my PGE tasks on various 
> worker nodes (batch stubs) in hopes of improving the run time of my workflows.

Awesome, OK!

> So my first question is, how will a batch stub obtain files from the File 
> Manager?

The batch stub doesn't by default, but CAS-PGE will. Taking care of that is 
part of the task wrapper functionality: it acts as the "embryo" and 
"umbilical cord" to the rest of the substrate services, per:

http://sunset.usc.edu/~mattmann/pubs/SMCIT09.pdf

> I'm only familiar with the local use case: PGE (without resmgr) will just use 
> "$FileLocation/$Filename" returned by filemgr to access the file locally.

Yep. Through TaskJob and TaskJobInput in the Workflow Manager structs 
package, it works exactly the same in remote mode (with Resource Manager). 
Pretty cool, huh?

> My guess is that all the nodes should mount the filemgr archive dir at the 
> same NFS location (as in 
> https://cwiki.apache.org/OODT/getting-products-from-a-remote-filemanager.html).

That works pre-0.4. In 0.4, Brian Foster added file staging to CAS-PGE, so 
it will stage the files too.

> Or does the PushPull component have a role here?

It can, via the Data Transfer extension point in File Manager, but it doesn't 
have to: an NFS (or better yet HDFS) mount covers it, and so does CAS-PGE 
staging in 0.4.
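
To make the two options concrete, here is a rough Python sketch of the 
decision a worker node faces. It is only an illustration, not CAS-PGE or 
File Manager code: the function name locate_or_stage, the metadata dict, and 
the fetch callback are all hypothetical stand-ins. It assumes the product 
metadata carries FileLocation and Filename, as in the local use case above.

```python
import os


def locate_or_stage(metadata, work_dir, fetch=None):
    """Return a path this node can read for the described product.

    metadata: dict with "FileLocation" and "Filename" (hypothetical shape).
    fetch:    hypothetical callable fetch(remote_path, local_path) that
              transfers the file when the archive is not mounted here.
    """
    # Same convention as the local case: $FileLocation/$Filename.
    archived = os.path.join(metadata["FileLocation"], metadata["Filename"])

    # Shared mount (e.g. NFS): the archive path is visible, use it in place.
    if os.path.exists(archived):
        return archived

    # No shared mount: stage a copy into the job's working directory.
    if fetch is None:
        raise FileNotFoundError(archived)
    os.makedirs(work_dir, exist_ok=True)
    staged = os.path.join(work_dir, metadata["Filename"])
    fetch(archived, staged)
    return staged
```

With every node mounting the archive at the same location, the first branch 
always wins and nothing is copied. Without the mount, something has to play 
the role of fetch, which is the part 0.4 CAS-PGE staging handles for you.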

HTH!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
