Hi Milinda,

Thanks for the detailed feedback. I haven't read through it in detail yet, but did you intend to include a visual diagram? If so, since you cannot attach files here, could you please create a JIRA for this improvement and attach the diagram to it?
Thanks,
Suresh

On Aug 3, 2012, at 4:31 PM, Milinda Pathirage wrote:

> Hi Devs,
>
> I found several issues in the internal architecture of the current GFac
> implementation while implementing cloud bursting support for GFac. As a
> solution to these issues, and to make it easy to extend GFac to support the
> different scenarios involved in cloud bursting, I made several improvements
> to the internal architecture. Below are the changes I made. Please feel
> free to comment on the new improvements.
>
> *Handler Based Architecture*
> Execution of a GFac job includes steps like input file staging,
> initializing the execution environment (cluster setup, etc.), and output
> file staging. In the current trunk implementation some of these steps
> (input data movement) happen inside provider initialization, which limits
> the flexibility of having different execution sequences. For example, in
> the Hadoop cloud bursting scenario we may have to set up the HDFS cluster
> first, then move the data to HDFS, and then execute the Hadoop job. So in
> this case we need the flexibility to set up the HDFS cluster before doing
> any data movement. If we followed the current implementation pattern, we
> would have to implement support for these different scenarios in each
> provider's initialization method, and this would make provider logic
> complex.
>
> A handler based architecture also allows us to reuse already available
> handlers in different execution flows. For example, we can have a GridFTP
> to Amazon S3 data movement handler which can be used with different
> providers (Hadoop, etc.).
>
> Another improvement we can make on top of the handler architecture is
> dynamically changing the handler execution sequence based on the
> application description. The initial implementation will come with a static
> handler sequence, but I am planning to add a scheduler which decides the
> handler sequence based on the application description.
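[Editor's note: the handler chain Milinda describes above could be sketched roughly as below. All names (GFacHandler, HandlerChain, the two example handlers, the "hdfs.url" parameter) are illustrative assumptions based on the email, not the actual GFac API.]

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler contract: each handler reads the parameters it
// requires from the shared context and publishes new parameters for
// handlers that run later in the sequence.
interface GFacHandler {
    void invoke(Map<String, Object> context);
}

// Illustrative handler: provisions the cluster before any data movement,
// as in the Hadoop cloud bursting scenario described in the email.
class SetupHdfsClusterHandler implements GFacHandler {
    public void invoke(Map<String, Object> context) {
        // ... provision the HDFS cluster here ...
        context.put("hdfs.url", "hdfs://master:9000"); // made available to later handlers
    }
}

// Illustrative handler: depends on a parameter published by the setup handler.
class HdfsInputStagingHandler implements GFacHandler {
    public void invoke(Map<String, Object> context) {
        String hdfsUrl = (String) context.get("hdfs.url"); // required parameter
        // ... move input data into HDFS at hdfsUrl ...
    }
}

// A static handler sequence, per the initial implementation; a scheduler
// could later choose this list from the application description.
class HandlerChain {
    private final List<GFacHandler> handlers;

    HandlerChain(List<GFacHandler> handlers) {
        this.handlers = handlers;
    }

    void execute(Map<String, Object> context) {
        for (GFacHandler handler : handlers) {
            handler.invoke(context); // cluster setup runs before input staging
        }
    }
}
```

The point of the sketch is the ordering flexibility: reordering the list changes the execution sequence without touching provider code.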
>
> Flow of information between handlers is done via parameters; a handler
> implementer should document the parameters required by the handler and the
> parameters that the handler makes available to other handlers.
>
> Following is a visual representation of the new handler architecture.
>
> [inline diagram not included; it did not come through the mailing list]
>
> *Changes to Context Hierarchy*
> JobExecutionContext carries the data and configuration related to a single
> job execution. JobExecutionContext will contain input and output
> MessageContext instances which carry the input and output data. There will
> be a SecurityContext which can be used to set security-related properties
> on each message context. ApplicationContext contains information related to
> the application, and there will be an ApplicationContext associated with
> each JobExecutionContext.
>
> The new GFac code can be found at [1]. I am done with the initial
> implementation, and we need to do the following things before the first
> release of the new GFac core:
>
> - Migrate existing code to the new implementation
> - Complete the workflow notifier based on the new EventBus API
> - Migrate the Hadoop provider and the Whirr based cloud deployment code to
>   the new architecture
>
> Thanks
> Milinda
>
> [1]
> https://svn.codespot.com/a/apache-extras.org/airavata-gsoc-sandbox/trunk/milinda/gfac-core
>
> --
> Milinda Pathirage
> PhD Student, Indiana University, Bloomington
> E-mail: [email protected]
> Web: http://mpathirage.com
> Blog: http://blog.mpathirage.com
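[Editor's note: the context hierarchy described above could be sketched roughly as below. The class names follow the email, but the fields and methods are assumptions for illustration, not the real GFac code.]

```java
import java.util.HashMap;
import java.util.Map;

// Carries input or output data for a job as named parameters.
class MessageContext {
    private final Map<String, Object> parameters = new HashMap<>();
    void addParameter(String name, Object value) { parameters.put(name, value); }
    Object getParameter(String name) { return parameters.get(name); }
}

// Holds security-related properties (credentials, tokens, ...) that can be
// applied to each message context.
class SecurityContext {
    private final Map<String, String> properties = new HashMap<>();
    void setProperty(String key, String value) { properties.put(key, value); }
    String getProperty(String key) { return properties.get(key); }
}

// Static information about the application being executed.
class ApplicationContext {
    String applicationName;
    String deploymentDescription;
}

// Per-job data and configuration: one input and one output MessageContext,
// a SecurityContext, and an associated ApplicationContext, as described
// in the email.
class JobExecutionContext {
    final MessageContext inMessageContext = new MessageContext();
    final MessageContext outMessageContext = new MessageContext();
    final SecurityContext securityContext = new SecurityContext();
    final ApplicationContext applicationContext = new ApplicationContext();
}
```

A handler would typically read from inMessageContext and write results into outMessageContext, keeping job-scoped state out of the providers themselves.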
