Hi Milinda,

On Fri, Aug 3, 2012 at 4:31 PM, Milinda Pathirage
<[email protected]> wrote:
> Hi Devs,
>
> I found several issues in the internal architecture of the current GFac
> implementation while implementing cloud bursting support for GFac. To
> address these issues, and to make it easy to extend GFac to support the
> different scenarios involved in cloud bursting, I made several improvements
> to the internal architecture. The changes are described below. Please feel
> free to comment on them.
>
> *Handler Based Architecture*
> Execution of a GFac job includes steps like input file staging,
> initializing the execution environment (cluster setup, etc.), and output
> file staging. In the current trunk implementation some of these steps
> (input data movement) happen inside provider initialization, which limits
> the flexibility of having different execution sequences. For example, in
> the Hadoop cloud bursting scenario we may have to set up the HDFS cluster
> first, then move the data to HDFS, and then execute the Hadoop job. So in
> this case we need the flexibility to set up the HDFS cluster before doing
> any data movement. If we followed the current implementation pattern, we
> would have to implement support for these different scenarios in the
> providers' initialization methods, and this would make the provider logic
> complex.
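>
> The handler abstraction I have in mind is roughly the following (the
> interface name, exception type, and chain class are illustrative, not the
> final API; JobExecutionContext is sketched further down):
>
>     import java.util.List;
>
>     public interface GFacHandler {
>         // Each execution step (input staging, cluster setup, job
>         // submission, output staging, ...) becomes a handler that
>         // operates on the shared job execution context.
>         void invoke(JobExecutionContext context) throws GFacHandlerException;
>     }
>
>     class GFacHandlerException extends Exception {}
>
>     // A provider's execution flow is then just an ordered handler list.
>     // For Hadoop cloud bursting the sequence could be: HDFS cluster
>     // setup -> data movement to HDFS -> Hadoop job execution.
>     class HandlerChain {
>         private final List<GFacHandler> handlers;
>
>         HandlerChain(List<GFacHandler> handlers) {
>             this.handlers = handlers;
>         }
>
>         void execute(JobExecutionContext context) throws GFacHandlerException {
>             for (GFacHandler handler : handlers) {
>                 handler.invoke(context);
>             }
>         }
>     }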
>
> The handler based architecture also allows us to reuse already available
> handlers in different execution flows. For example, we can have a GridFTP
> to Amazon S3 data movement handler which can be used with different
> providers (Hadoop, etc.).
>
> Another improvement we can do on top of the handler architecture is
> dynamically changing the handler execution sequence based on the
> application description. The initial implementation will come with a
> static handler sequence, but I am planning to implement a scheduler which
> decides the handler sequence based on the application description.
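>
> As a rough sketch of that scheduler idea (a hypothetical interface, just
> to illustrate):
>
>     import java.util.List;
>
>     // Instead of a fixed, static handler list, pick the handler
>     // sequence by inspecting the application description.
>     public interface HandlerScheduler {
>         List<GFacHandler> schedule(ApplicationContext applicationContext);
>     }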
Please make sure that when you add a new parameter, you also add it to
WorkflowExecutionContext.
>
> Flow of information between handlers can be done via parameters, and a
> handler implementor should document both the parameters required by the
> handler and the parameters the handler will make available to other
> handlers.
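>
> For example, a GridFTP to S3 data movement handler could document and
> exchange its parameters like this (the property names and the
> getProperty/setProperty accessors are illustrative assumptions):
>
>     public class GridFTPToS3Handler implements GFacHandler {
>         // Requires: "gridftp.source.url" (set by an upstream handler)
>         // Provides: "s3.staged.uri" (available to downstream handlers)
>         public void invoke(JobExecutionContext context) throws GFacHandlerException {
>             String sourceUrl = (String) context.getProperty("gridftp.source.url");
>             String stagedUri = copyToS3(sourceUrl);
>             context.setProperty("s3.staged.uri", stagedUri);
>         }
>
>         private String copyToS3(String sourceUrl) {
>             // actual GridFTP -> S3 transfer logic would go here
>             return "s3://example-bucket/staged/" + sourceUrl.hashCode();
>         }
>     }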
>
> Following is a visual representation of the new handler architecture.
>
> [architecture diagram; not included in the plain-text archive]
>
>
> *Changes to context hierarchy*
> JobExecutionContext carries the data and configuration information related
> to a single job execution. JobExecutionContext will contain input and
> output MessageContext instances, which carry the data related to input and
> output. There will be a SecurityContext which can be used to set security
> related properties on each message context. ApplicationContext contains
> information related to the application, and there will be an
> ApplicationContext associated with each JobExecutionContext.
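>
> In outline (fields only, accessors omitted; the shared property map is how
> I plan to pass parameters between handlers):
>
>     import java.util.Map;
>
>     public class JobExecutionContext {
>         private MessageContext inMessageContext;       // input data
>         private MessageContext outMessageContext;      // output data
>         private SecurityContext securityContext;       // security properties
>         private ApplicationContext applicationContext; // application information
>         private Map<String, Object> properties;        // shared handler parameters
>     }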
>
>
> New GFac code can be found at [1]. I am done with the initial
> implementation, and we need to do the following things before the first
> release of the new GFac core:
>
>    - Migrate existing code to new implementation
>    - Complete workflow notifier based on new EventBus API
>    - Migrate Hadoop provider and Whirr based cloud deployment code to new
>    architecture
>
I am willing to help with these todos!

Lahiru
>
> Thanks
> Milinda
>
>
> [1]
> https://svn.codespot.com/a/apache-extras.org/airavata-gsoc-sandbox/trunk/milinda/gfac-core
>
> --
> Milinda Pathirage
> PhD Student Indiana University, Bloomington;
> E-mail: [email protected]
> Web: http://mpathirage.com
> Blog: http://blog.mpathirage.com



-- 
System Analyst Programmer
PTI Lab
Indiana University
