Re: Orchestration Component implementation review

Saminda Wijeratne Sun, 19 Jan 2014 09:49:49 -0800

On Sun, Jan 19, 2014 at 8:36 AM, Suresh Marru <sma...@apache.org> wrote:


> Great thoughts Saminda and Amila. Agreed that real-world use cases and
> integration will help prioritize. I will embed my feedback below:
>
> On Jan 17, 2014, at 2:57 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
>
> > Marlon, I think until we put this to real use we won't get much feedback
> on which aspects to focus on and which features to expand or prioritize.
> So how about having a test plan for the Orchestrator? Expose it to real
> use cases and see how it holds up. WDYT?
> >
> > It might be a little confusing to return a "JobRequest" object from the
> Orchestrator (since it's a response). Or perhaps it should be renamed?
> >
> > Sachith, I think we should have a Google Hangout or a separate mail
> thread (or both) to discuss multi-threaded support. Could you organize this
> please?
> >
> > On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara <
> thejaka.am...@gmail.com> wrote:
> >
> > On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <samin...@gmail.com>
> wrote:
> > Following are a few thoughts I had during my review of the component:
> >
> > Multi-threaded vs single threaded
> > If we are going to have multi-threaded job submission, the implementation
> should handle race conditions. Essentially, a JobSubmitter should be
> able to "lock" an experiment request before continuing to process that
> request, so that other JobSubmitters accessing the experiment requests at
> the same time would skip it.
> >
> > +1. These are implementation details.
>
> Agreed. For the implementation, I see this as a solved problem in the
> operating systems and distributed systems worlds. Hopefully, we do not have
> to reinvent it and can instead leverage existing libraries.
>
+1
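
A minimal sketch of the "lock before processing" idea, assuming all
JobSubmitter threads run in one JVM and share an in-memory set of claimed
experiment IDs. ExperimentClaims and its methods are hypothetical names,
not existing Orchestrator code. With multiple Airavata deployments the
claim would have to be made in the registry instead.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: lets exactly one JobSubmitter claim an experiment request.
final class ExperimentClaims {
    // Thread-safe set; add() is atomic and returns false if the ID is already present.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first caller that claims this experiment ID. */
    boolean tryClaim(String experimentId) {
        return claimed.add(experimentId);
    }

    /** Releases a claim, e.g. after a failed submission that should be retried. */
    void release(String experimentId) {
        claimed.remove(experimentId);
    }
}
```

A JobSubmitter would call tryClaim() first and skip the request when it
returns false, which is the "skip it" behaviour described above.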


>
> > Orchestrator service
> > We might want to think about the possibility that in future we will
> have multiple deployments of an Airavata service. This could particularly
> be true for SciGaP. We may have to think about how some of the internal data
> structures/SPIs should be updated to accommodate such requirements in future.
> >
> > +1.
> >
> + 1.
> >
> > Orchestrator Component configurations
> > I see a lot of places where the orchestrator can have configurations. I
> think it's too early to finalize them, but I think we can start refactoring
> them out, perhaps to airavata-server.properties. I'm also seeing that the
> orchestrator is now hardcoded to use the default/admin gateway and username. I
> think these should come from the request itself.
> >
> > +1. But overall we may need to change the way we handle
> configurations within Airavata. Currently we have multiple configuration
> files and multiple places where we read configurations. IMO we should have
> a separate module to handle configurations. Only this module should be
> aware of how to interpret configurations in the file, and it should provide
> a component interface to access those configuration values.
> > +1 we tried this once with "ServerSettings" and "ApplicationSettings",
> but apparently more configuration files seem to have spawned since. So far,
> however, they seem to be localized to their own components.
>
> Fully agreed. I think we need to go back to a single configuration for
> all Airavata server needs and a single one for the client SDKs.
>
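
A rough sketch of such a single configuration module, assuming the values
live in one properties file such as airavata-server.properties. ServerConfig
and the property keys used with it are hypothetical and only illustrate the
shape of the interface. This is not the existing ServerSettings class.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical single entry point for configuration values.
// Components ask this class for values and never parse the file themselves.
final class ServerConfig {
    private final Properties props;

    private ServerConfig(Properties props) {
        this.props = props;
    }

    /** Loads properties-format text from the given reader. */
    static ServerConfig load(Reader source) {
        Properties p = new Properties();
        try {
            p.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read server configuration", e);
        }
        return new ServerConfig(p);
    }

    /** Returns the configured value, or defaultValue if the key is absent. */
    String get(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    /** Returns the value parsed as an int, or defaultValue if the key is absent. */
    int getInt(String key, int defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }
}
```
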
> > Visibility of API functions
> > I think initialize(), shutdown() and startJobSubmitter() functions
> should not be part of the API because I don't see a scenario where the
> gateway developer would be responsible for using them. They serve a more
> internal purpose of managing the orchestrator component IMO. As Amila
> pointed out so long ago (wink) functions that do not concern outside
> parties should not be used as part of the API.
> >
> > +1
>
> + 1. These should be within the Orchestrator SPI but not exposed through the
> API, as clients should not be able to control this server behavior.
>
> > Return values of Orchestrator API
> > IMO, unless it is specifically required, the functions do not
> need to return anything; they can simply throw exceptions
> when needed. For example, launchExperiment can simply return void if all
> is successful and throw an exception if something fails. Handling issues
> with a try/catch is not only simpler, but the explanations are also readily
> available to the user.
> >
> > +1. Also try to have different exceptions for different scenarios. For
> example, if persistence (hypothetical) fails, DatabasePersistenceException;
> if validation fails, ValidationFailedException; etc. Then the developer
> who uses the API can catch these different exceptions and act on them
> appropriately.
> > +1. What needs to be understood here is that the exception should be a
> gateway-friendly exception, i.e. it should not expose internal details of
> Airavata at the top-level exception, and the exception message should be
> self-explanatory enough that the gateway developer is not left scratching
> his/her head after reading it. Feedback from Sudhakar some time
> back was to provide suggestions in the exception message on how to resolve
> the issue.
>
> I have drafted some of these in the thrift files, will update the JIRA to
> brainstorm more.
>
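
A small sketch of how the points above could fit together, assuming a common
base exception that carries a resolution hint. The exception names come from
Amila's hypothetical examples; OrchestratorException, getSuggestion() and
OrchestratorSketch are made-up names for illustration and are not taken from
the thrift drafts.

```java
// Hypothetical base type: the message targets gateway developers,
// the suggestion says how to resolve the issue.
class OrchestratorException extends Exception {
    private final String suggestion;

    OrchestratorException(String message, String suggestion) {
        super(message);
        this.suggestion = suggestion;
    }

    String getSuggestion() {
        return suggestion;
    }
}

// One exception type per failure scenario, so callers can react differently.
class ValidationFailedException extends OrchestratorException {
    ValidationFailedException(String message, String suggestion) {
        super(message, suggestion);
    }
}

class DatabasePersistenceException extends OrchestratorException {
    DatabasePersistenceException(String message, String suggestion) {
        super(message, suggestion);
    }
}

final class OrchestratorSketch {
    /** Returns normally on success; failures are reported only via exceptions. */
    void launchExperiment(String experimentId) throws OrchestratorException {
        if (experimentId == null || experimentId.trim().isEmpty()) {
            throw new ValidationFailedException(
                    "The experiment ID is missing.",
                    "Create the experiment first and pass the returned ID.");
        }
        // Submission would happen here; a registry failure would surface
        // as DatabasePersistenceException.
    }
}
```

A gateway developer can then catch ValidationFailedException separately from
other failures and show getSuggestion() to the user.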
> > Data persisted in registry
> > ExperimentRequest.getUsername() : I think we should clarify what this
> username denotes. In the current API, in experiment submission we consider two
> types of users: the submission user (the user who submits the experiment to the
> Airavata Server - this is inferred from the request itself) and the execution
> user (the user who correlates to the application executions of the gateway -
> thus this user can be a different user for different gateways, e.g. community
> user, gateway user).
> > I think we should persist the date/time of the experiment request as
> well.
> > +1
>
> The user naming is getting more confusing. I will start a separate
> discussion on this.
>
> > Also, when retrying API functions after a failure in a
> previous attempt, there should be a way not to repeat already performed
> steps, or to gracefully roll back and redo the required steps as necessary.
> While such actions could be transparent to the user sometimes it might make
> sense to allow user to be notified of success/failure of a retry. However
> this might mean keeping additional records at the registry level.
> >
> > In addition we should also have a way of cleaning up unsubmitted
> experiment ids. (But not sure whether you want to address this right now).
> The way I see this is to have a periodic thread which goes through the
> table and clears out experiments which have not been submitted for a defined time.
> > +1. Something else we may have to think of later is data archiving
> capabilities. We keep running into performance issues when the database
> grows with experiment results. Unless we become experts in distributed
> database management, we should have a much better way to manage our db
> performance issues.
> >
>
> -1 on this. I may want to go back a year later and submit a previously
> created experiment. I think it's wrong to put a temporal bound on these;
> moreover, these provide a good source of analytics to improve
> usability. As for database performance, in 2014 there should be many
> solutions to handle zillions of experiments (at least that's what the social
> networking world claims).
>
I didn't mean that the experiments should be removed from users' grasp by
archiving them. It's more like the idea of a memory hierarchy. The data which
is most likely to be used should be available for quick querying. Of course
such data distributions should be transparent to the users.
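
To illustrate what I mean, here is a toy sketch assuming two in-memory maps
stand in for a fast "hot" store and a slower archive store.
TieredExperimentStore and its methods are hypothetical names; a real version
would sit on top of the registry database.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical two-tier store: nothing is ever deleted, entries only move tiers.
final class TieredExperimentStore {
    // experimentId -> last-used timestamp in milliseconds
    private final Map<String, Long> hot = new HashMap<>();
    private final Map<String, Long> archive = new HashMap<>();

    void put(String experimentId, long lastUsedMillis) {
        hot.put(experimentId, lastUsedMillis);
    }

    /** Moves entries last used before cutoffMillis to the archive tier; returns how many moved. */
    int archiveOlderThan(long cutoffMillis) {
        int moved = 0;
        Iterator<Map.Entry<String, Long>> it = hot.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() < cutoffMillis) {
                archive.put(entry.getKey(), entry.getValue());
                it.remove();
                moved++;
            }
        }
        return moved;
    }

    /** Lookup is transparent: callers do not need to know which tier holds the entry. */
    boolean contains(String experimentId) {
        return hot.containsKey(experimentId) || archive.containsKey(experimentId);
    }

    /** True only while the entry is still in the fast tier. */
    boolean isHot(String experimentId) {
        return hot.containsKey(experimentId);
    }
}
```

An old experiment stays findable through contains() after archiving, so a
user could still come back a year later and submit it.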

>
> >
> > BTW, nice review notes, Saminda.
>
> + 1. And also + 1 to Amila’s attention to detail.
>
> Suresh
>
> >
> > Thanks
> > Amila
> >
> >
> >
>
>
