Re: Orchestration Component implementation review

Sachith Withana Mon, 20 Jan 2014 07:55:10 -0800

Hi All,

I will go ahead and create the Wiki on the Orchestrator. Will send you all
a draft as soon as I can.


One question, though: do we have to explicitly show both the SPIs and the APIs?


On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <marpi...@iu.edu> wrote:

> +1 for real use cases first. We have at least 3.  But I'm sure we will
> want to make it as easy as possible for developers to pass back the
> correct, created experimentID when invoking launchExperiment.
>
>
> Marlon
>
> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> > Marlon, I think until we put this to real use we won't get much feedback
> > on which aspects we should focus on more and which features we should
> > expand or prioritize. So how about having a test plan for the
> > Orchestrator? Expose it to real use cases and see how it survives. WDYT?
> >
> > It might be a little confusing to return a "JobRequest" object from the
> > Orchestrator (since it's a response). Or perhaps it should be renamed?
> >
> > Sachith, I think we should have a Google Hangout or a separate mail
> > thread (or both) to discuss multi-threaded support. Could you organize
> > this, please?
> >
> >
> > On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> > <thejaka.am...@gmail.com> wrote:
> >
> >>
> >>
> >> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne
> >> <samin...@gmail.com> wrote:
> >>
> >>> Following are a few thoughts I had during my review of the component.
> >>>
> >>> *Multi-threaded vs single-threaded*
> >>> If we are going to have multi-threaded job submission, the
> >>> implementation should handle race conditions. Essentially, a
> >>> JobSubmitter should be able to "lock" an experiment request before
> >>> processing it further, so that other JobSubmitters accessing the
> >>> experiment requests at the same time would skip it.
> >>>
> >> +1. These are implementation details.
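
To make the "lock" idea above concrete, here is a minimal sketch. It
assumes an in-memory claim table; ExperimentClaims, tryClaim and release
are hypothetical names, not existing Orchestrator code. With multiple
service deployments the claim would have to live in the registry/database
rather than in one JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory claim table shared by all JobSubmitter threads. */
class ExperimentClaims {
    // Thread-safe set backed by ConcurrentHashMap; add() is atomic.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first caller that claims this experiment ID. */
    boolean tryClaim(String experimentId) {
        return claimed.add(experimentId);
    }

    /** Releases a claim, e.g. after a failed submission, so it can be retried. */
    void release(String experimentId) {
        claimed.remove(experimentId);
    }
}
```

A JobSubmitter would call tryClaim(id) before touching a request and
simply move on to the next request when it returns false.
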
> >>
> >>
> >>> *Orchestrator service*
> >>> We might want to think about the possibility that in future we will
> >>> have multiple deployments of an Airavata service. This could be
> >>> particularly true for SciGaP. We may have to think about how some of the
> >>> internal data structures/SPIs should be updated to accommodate such
> >>> requirements in future.
> >>>
> >> +1.
> >>
> >>
> >>> *Orchestrator Component configurations*
> >>> I see a lot of places where the orchestrator can have configurations. I
> >>> think it's too early to finalize them, but I think we can start
> >>> refactoring them out, perhaps to airavata-server.properties. I'm also
> >>> seeing that the orchestrator is now hardcoded to use the default/admin
> >>> gateway and username. I think these should come from the request itself.
> >>>
> >> +1. But overall we may need to change the way we handle configurations
> >> within Airavata. Currently we have multiple configuration files and
> >> multiple places where we read configurations. IMO we should have a
> >> separate module to handle configurations. Only this module should be
> >> aware of how to interpret the configurations in the files, and it should
> >> provide a component interface to access those configuration values.
> >>
> > +1. We tried this once with "ServerSettings" and "ApplicationSettings",
> > but apparently more configuration files seem to have spawned again. So
> > far, however, they seem to be localized to their components.
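
A rough sketch of what such a single configuration access point could
look like. This is an assumption for discussion only: OrchestratorConfig
and the property key used in the usage note are hypothetical, and it is
not the existing ServerSettings/ApplicationSettings code.

```java
import java.util.Properties;

/** Hypothetical single access point for configuration values. */
final class OrchestratorConfig {
    private final Properties props;

    OrchestratorConfig(Properties props) {
        this.props = props;
    }

    /** Returns the configured value, or the default when the key is absent. */
    String get(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    /** Returns the value parsed as an int, or the default when the key is absent. */
    int getInt(String key, int defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }
}
```

Components would then ask for, say, getInt("orchestrator.submitter.threads", 1)
(a made-up key) instead of opening property files themselves.
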
> >
> >>
> >>> *Visibility of API functions*
> >>> I think the initialize(), shutdown() and startJobSubmitter() functions
> >>> should not be part of the API, because I don't see a scenario where the
> >>> gateway developer would be responsible for using them. They serve a more
> >>> internal purpose of managing the orchestrator component, IMO. As Amila
> >>> pointed out so long ago (wink), functions that do not concern outside
> >>> parties should not be part of the API.
> >>>
> >> +1
> >>
> >>
> >>> *Return values of Orchestrator API*
> >>> IMO, unless it is specifically required, the functions do not
> >>> necessarily need to return anything; they can throw exceptions when
> >>> needed. For example, launchExperiment can simply return void if all is
> >>> successful and throw an exception if something fails. Handling issues
> >>> with a try/catch is not only simpler, but the explanations are also
> >>> readily available to the user.
> >>>
> >> +1. Also try to have different exceptions for different scenarios. For
> >> example, if persistence (hypothetical) fails, throw
> >> DatabasePersistenceException; if validation fails,
> >> ValidationFailedException; and so on. Then the developer who uses the API
> >> can catch these different exceptions and act on them appropriately.
> >>
> > +1. What needs to be understood here is that the exception should be a
> > gateway-friendly exception, i.e. it should not expose internal details of
> > Airavata at the top-level exception, and the exception message should be
> > self-explanatory enough that the gateway developer is not left
> > scratching his/her head after reading it. Some feedback from Sudhakar a
> > while back was to provide suggestions in the exception message on how to
> > resolve the issue.
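
The exception scheme discussed above could be sketched roughly as
follows. The two subclass names come from Amila's example; the
OrchestratorException base type, the OrchestratorSketch class and the
message text are assumptions for illustration, not the current API.

```java
/** Hypothetical base type for everything the Orchestrator API throws to gateway code. */
class OrchestratorException extends Exception {
    OrchestratorException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Thrown when the request itself is invalid. */
class ValidationFailedException extends OrchestratorException {
    ValidationFailedException(String message) {
        super(message, null);
    }
}

/** Thrown when persisting to the registry fails; internal details stay in the cause. */
class DatabasePersistenceException extends OrchestratorException {
    DatabasePersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

class OrchestratorSketch {
    /** Returns normally on success; failures surface as typed exceptions. */
    void launchExperiment(String experimentId) throws OrchestratorException {
        if (experimentId == null || experimentId.isEmpty()) {
            // The message says what went wrong and suggests how to resolve it.
            throw new ValidationFailedException(
                "No experiment ID was supplied. Create the experiment first "
                + "and pass the returned ID to launchExperiment.");
        }
        // Submission logic would go here.
    }
}
```

A gateway developer can then catch ValidationFailedException separately
from DatabasePersistenceException, or catch the base type to handle
everything in one place.
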
> >
> >>
> >>> *Data persisted in registry*
> >>> ExperimentRequest.getUsername(): I think we should clarify what this
> >>> username denotes. In the current API, we consider two types of users in
> >>> experiment submission: the submission user (the user who submits the
> >>> experiment to the Airavata Server; this is inferred from the request
> >>> itself) and the execution user (the user who correlates to the
> >>> application executions of the gateway; this user can therefore differ
> >>> from gateway to gateway, e.g. community user, gateway user).
> >>> I think we should persist the date/time of the experiment request as
> >>> well.
> >>>
> >> +1
> >>
> >>> Also, when retrying API functions after a failure in a previous
> >>> attempt, there should be a way to avoid repeating already performed
> >>> steps, or to gracefully roll back and redo the required steps as
> >>> necessary. While such actions could be transparent to the user,
> >>> sometimes it might make sense to notify the user of the success/failure
> >>> of a retry. However, this might mean keeping additional records at the
> >>> registry level.
> >>>
> >> In addition, we should also have a way of cleaning up unsubmitted
> >> experiment IDs (but I'm not sure whether you want to address this right
> >> now). The way I see this is to have a periodic thread which goes through
> >> the table and clears out experiments which have not been submitted within
> >> a defined time.
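
The periodic cleanup could be sketched along these lines. This assumes an
in-memory map standing in for the registry table; UnsubmittedExperiments
and its methods are hypothetical names, and a real version would query
and delete from the registry instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the table of created-but-unsubmitted experiment IDs. */
class UnsubmittedExperiments {
    private final Map<String, Instant> createdAt = new ConcurrentHashMap<>();

    void record(String experimentId, Instant created) {
        createdAt.put(experimentId, created);
    }

    int size() {
        return createdAt.size();
    }

    /** Removes entries created before (now - maxAge); returns how many were removed. */
    int purgeOlderThan(Duration maxAge, Instant now) {
        Instant cutoff = now.minus(maxAge);
        int removed = 0;
        for (Map.Entry<String, Instant> e : createdAt.entrySet()) {
            // remove(key, value) only succeeds if the entry is unchanged.
            if (e.getValue().isBefore(cutoff)
                    && createdAt.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }

    /** Starts a single background thread that purges at a fixed period. */
    static ScheduledExecutorService schedule(
            UnsubmittedExperiments table, Duration maxAge, Duration period) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(
            () -> table.purgeOlderThan(maxAge, Instant.now()),
            period.toMillis(), period.toMillis(), TimeUnit.MILLISECONDS);
        return ses;
    }
}
```

The caller keeps the returned executor and calls shutdown() on it when
the orchestrator stops. The maximum age and the period would be
configuration values.
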
> >>
> > +1. Something else we may have to think about later is data archiving
> > capabilities. We keep running into performance issues when the database
> > grows with experiment results. Unless we become experts in distributed
> > database management, we should have a much better way to manage our DB
> > performance issues.
> >
> >
> >> BTW, nice review notes, Saminda.
> >>
> >> Thanks
> >> Amila
> >>
> >>
> >>
>
>


-- 
Thanks,
Sachith Withana
