Hi Milinda,

Thanks for the detailed write-up. I have not read through it in detail yet, but 
did you intend to include a visual diagram? If so, since you cannot attach files 
to the list, can you please create a JIRA for this improvement and attach the 
diagram to it?

Thanks,
Suresh
On Aug 3, 2012, at 4:31 PM, Milinda Pathirage wrote:

> Hi Devs,
> 
> I found several issues in the internal architecture of the current GFac
> implementation while implementing cloud bursting support for GFac. To address
> these issues, and to make it easier to extend GFac to support the different
> scenarios involved in cloud bursting, I made several improvements to the
> internal architecture. The changes are described below. Please feel free to
> comment on them.
> 
> *Handler Based Architecture*
> Execution of a GFac job includes steps like input file staging, initializing
> the execution environment (cluster setup, etc.), and output file staging. In
> the current trunk implementation, some of these steps (input data movement)
> happen inside provider initialization, which limits the flexibility of having
> different execution sequences. For example, in the Hadoop cloud bursting
> scenario we may have to set up the HDFS cluster first, then move the data to
> HDFS, and then execute the Hadoop job. So in this case we need the flexibility
> to set up the HDFS cluster before doing any data movement. If we followed the
> current implementation pattern, we would have to implement support for these
> different scenarios in the providers' initialization methods, which would make
> the provider logic complex.
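> 
> As a rough sketch of the handler idea (class and method names below are only
> illustrative, not the final API), the Hadoop bursting chain would simply put
> the cluster setup handler before the input staging handler:
> 
>     import java.util.Arrays;
>     import java.util.HashMap;
>     import java.util.List;
>     import java.util.Map;
> 
>     // Each job step is a handler acting on a shared per-job context.
>     interface GFacHandler {
>         void invoke(Map<String, Object> jobContext) throws Exception;
>     }
> 
>     class HdfsClusterSetupHandler implements GFacHandler {
>         public void invoke(Map<String, Object> ctx) {
>             // ... provision the HDFS cluster, then publish its address ...
>             ctx.put("hdfs.namenode.uri", "hdfs://namenode.example.org:9000");
>         }
>     }
> 
>     class HdfsInputStagingHandler implements GFacHandler {
>         public void invoke(Map<String, Object> ctx) {
>             // Runs after cluster setup, so the NameNode address is available.
>             String nameNode = (String) ctx.get("hdfs.namenode.uri");
>             System.out.println("Staging inputs into " + nameNode);
>         }
>     }
> 
>     class HandlerChainDemo {
>         public static void main(String[] args) throws Exception {
>             // Cloud bursting: cluster setup must run before data movement.
>             List<GFacHandler> chain = Arrays.<GFacHandler>asList(
>                     new HdfsClusterSetupHandler(),
>                     new HdfsInputStagingHandler());
>             Map<String, Object> ctx = new HashMap<String, Object>();
>             for (GFacHandler handler : chain) {
>                 handler.invoke(ctx);
>             }
>         }
>     }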
> 
> A handler based architecture also allows us to reuse already available
> handlers in different execution flows. For example, we can have a GridFTP to
> Amazon S3 data movement handler which can be used with different
> providers (Hadoop, etc.).
> 
> Another improvement we can build on top of the handler architecture is
> dynamically changing the handler execution sequence based on the application
> description. The initial implementation will come with a static handler
> sequence, but I am planning to implement a scheduler which decides the handler
> sequence based on the application description.
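> 
> As a sketch of what that scheduler could look like (the handler names and the
> "application type" key are made up here purely for illustration):
> 
>     import java.util.ArrayList;
>     import java.util.List;
> 
>     // Pick the handler sequence from the application description instead of
>     // hard-coding it; a static default stays in place for everything else.
>     class HandlerScheduler {
>         List<String> schedule(String applicationType) {
>             List<String> sequence = new ArrayList<String>();
>             if ("hadoop".equals(applicationType)) {
>                 // Cloud bursting: the cluster must exist before data movement.
>                 sequence.add("HdfsClusterSetupHandler");
>                 sequence.add("HdfsInputStagingHandler");
>                 sequence.add("HadoopJobSubmissionHandler");
>             } else {
>                 // Default grid flow: stage in, submit, stage out.
>                 sequence.add("GridFtpInputStagingHandler");
>                 sequence.add("GramJobSubmissionHandler");
>                 sequence.add("GridFtpOutputStagingHandler");
>             }
>             return sequence;
>         }
>     }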
> 
> Information flows between handlers via parameters. A handler implementor
> should document the parameters required by the handler and the parameters the
> handler makes available to other handlers.
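> 
> One way to make that contract explicit (again only a sketch, the method names
> are not part of the real API) is to have each handler declare what it needs
> and what it provides:
> 
>     import java.util.Arrays;
>     import java.util.List;
>     import java.util.Map;
> 
>     // Each handler declares the parameters it needs and the ones it adds,
>     // so we can check that the handlers in a chain line up with each other.
>     abstract class AbstractGFacHandler {
>         abstract List<String> requiredParameters();
>         abstract List<String> providedParameters();
>         abstract void invoke(Map<String, Object> jobContext) throws Exception;
>     }
> 
>     class GridFtpToS3Handler extends AbstractGFacHandler {
>         List<String> requiredParameters() {
>             return Arrays.asList("gridftp.source.url", "s3.bucket");
>         }
>         List<String> providedParameters() {
>             return Arrays.asList("s3.input.keys");
>         }
>         void invoke(Map<String, Object> ctx) {
>             // ... move data from GridFTP to S3, record uploaded keys ...
>             ctx.put("s3.input.keys", Arrays.asList("input/part-00000"));
>         }
>     }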
> 
> Following is a visual representation of the new handler architecture.
> 
> 
> *Changes to Context Hierarchy*
> JobExecutionContext carries the data and configuration information related to
> a single job execution. JobExecutionContext will contain input and output
> MessageContext instances which carry the data related to input and output.
> There will be a SecurityContext which can be used to set security related
> properties on each message context. ApplicationContext contains information
> related to the application, and there will be an ApplicationContext associated
> with each JobExecutionContext.
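> 
> In code the hierarchy would look roughly like this (the field names are only
> indicative of the kind of information each context holds):
> 
>     import java.util.HashMap;
>     import java.util.Map;
> 
>     // Rough shape of the context hierarchy described above.
>     class MessageContext {
>         // Named input or output parameters of the job.
>         Map<String, Object> parameters = new HashMap<String, Object>();
>     }
> 
>     class SecurityContext {
>         // Security related properties (credentials, tokens, ...).
>         Map<String, String> properties = new HashMap<String, String>();
>     }
> 
>     class ApplicationContext {
>         // Application description: executable, host, deployment details.
>         String applicationName;
>         String hostName;
>     }
> 
>     class JobExecutionContext {
>         // Everything a single job execution needs, shared by all handlers.
>         MessageContext inMessageContext = new MessageContext();
>         MessageContext outMessageContext = new MessageContext();
>         SecurityContext securityContext = new SecurityContext();
>         ApplicationContext applicationContext = new ApplicationContext();
>     }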
> 
> 
> The new GFac code can be found at [1]. I am done with the initial
> implementation, and we need to do the following things before the first
> release of the new GFac core:
> 
>   - Migrate the existing code to the new implementation
>   - Complete the workflow notifier based on the new EventBus API
>   - Migrate the Hadoop provider and Whirr based cloud deployment code to the
>   new architecture
> 
> 
> Thanks
> Milinda
> 
> 
> [1]
> https://svn.codespot.com/a/apache-extras.org/airavata-gsoc-sandbox/trunk/milinda/gfac-core
> 
> --
> Milinda Pathirage
> PhD Student Indiana University, Bloomington;
> E-mail: [email protected]
> Web: http://mpathirage.com
> Blog: http://blog.mpathirage.com
