Hi All, I will go ahead and create the Wiki on the Orchestrator. Will send you all a draft as soon as I can.
One question though: do we have to explicitly show both the SPIs and the APIs?

On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <marpi...@iu.edu> wrote:

> +1 for real use cases first. We have at least 3. But I'm sure we will
> want to make it as easy as possible for developers to pass back the
> correct, created experimentID when invoking launchExperiment.
>
> Marlon
>
> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> > Marlon, I think until we put this to real use we won't get much
> > feedback on which aspects we should focus on more and which features
> > we should expand or prioritize. So how about having a test plan for
> > the Orchestrator: expose it to real use cases and see how it
> > survives. WDYT?
> >
> > It might be a little confusing to return a "JobRequest" object from
> > the Orchestrator (since it's a response). Or perhaps it should be
> > renamed?
> >
> > Sachith, I think we should have a Google Hangout or a separate mail
> > thread (or both) to discuss multi-threaded support. Could you
> > organize this please?
> >
> > On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> > <thejaka.am...@gmail.com> wrote:
> >
> >> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne
> >> <samin...@gmail.com> wrote:
> >>
> >>> Following are a few thoughts I had during my review of the
> >>> component,
> >>>
> >>> *Multi-threaded vs single threaded*
> >>> If we are going to have multi-threaded job submission, the
> >>> implementation should handle race conditions. Essentially, a
> >>> JobSubmitter should be able to "lock" an experiment request before
> >>> continuing to process it, so that other JobSubmitters accessing the
> >>> experiment requests at the same time skip it.
> >>>
> >> +1. These are implementation details.
> >>
> >>> *Orchestrator service*
> >>> We might want to think of the possibility in future where we will
> >>> be having multiple deployments of an Airavata service. This could
> >>> particularly be true for SciGaP.
> >>> We may have to think about how some of the internal data
> >>> structures/SPIs should be updated to accommodate such requirements
> >>> in future.
> >>>
> >> +1.
> >>
> >>> *Orchestrator Component configurations*
> >>> I see a lot of places where the orchestrator can have
> >>> configurations. I think it's too early to finalize them, but I
> >>> think we can start refactoring them out, perhaps to
> >>> airavata-server.properties. I'm also seeing that the orchestrator
> >>> is now hardcoded to use the default/admin gateway and username. I
> >>> think these should come from the request itself.
> >>>
> >> +1. But overall we may need to change the way we handle
> >> configurations within Airavata. Currently we have multiple
> >> configuration files and multiple places where we read
> >> configurations. IMO we should have a separate module to handle
> >> configurations. Only this module should be aware of how to interpret
> >> configurations in the file, and it should provide a component
> >> interface to access those configuration values.
> >>
> > +1 we tried this once with "ServerSettings" and
> > "ApplicationSettings", but apparently more configuration files seem
> > to have spawned again. So far, however, they seem to be localized to
> > their components.
> >
> >>> *Visibility of API functions*
> >>> I think the initialize(), shutdown() and startJobSubmitter()
> >>> functions should not be part of the API because I don't see a
> >>> scenario where the gateway developer would be responsible for using
> >>> them. They serve a more internal purpose of managing the
> >>> orchestrator component IMO. As Amila pointed out so long ago
> >>> (wink), functions that do not concern outside parties should not be
> >>> part of the API.
> >>>
> >> +1
> >>
> >>> *Return values of Orchestrator API*
> >>> IMO, unless it is specifically required, the functions do not
> >>> necessarily need to return anything; they can simply throw
> >>> exceptions when needed.
> >>> For example, launchExperiment can simply return void if all is
> >>> successful and throw an exception if something fails. Handling
> >>> issues with a try/catch is not only simpler, but the explanations
> >>> are also readily available to the user.
> >>>
> >> +1. Also try to have different exceptions for different scenarios.
> >> For example, if persistence (hypothetical) fails,
> >> DatabasePersistenceException; if validation fails,
> >> ValidationFailedException; etc. Then the developer who uses the API
> >> can catch these different exceptions and act on them appropriately.
> >>
> > +1. What needs to be understood here is that the exception should be
> > a gateway-friendly exception, i.e. it should not expose internal
> > details of Airavata at the top-level exception, and the exception
> > message should be self-explanatory enough that the gateway developer
> > is not left scratching his/her head after reading it. Feedback from
> > Sudhakar some time back was to provide suggestions in the exception
> > message on how to resolve the issue.
> >
> >>> *Data persisted in registry*
> >>> ExperimentRequest.getUsername(): I think we should clarify what
> >>> this username denotes. In the current API, in experiment submission
> >>> we consider two types of users: the submission user (the user who
> >>> submits the experiment to the Airavata Server; this is inferred
> >>> from the request itself) and the execution user (the user who
> >>> correlates to the application executions of the gateway; this can
> >>> be a different user for different gateways, e.g. community user,
> >>> gateway user).
> >>> I think we should persist the date/time of the experiment request
> >>> as well.
> >>>
> >> +1
> >>
> >>> Also, when retrying API functions after a failure in a previous
> >>> attempt, there should be a way to avoid repeating already performed
> >>> steps, or to gracefully roll back and redo the required steps as
> >>> necessary.
> >>> While such actions could be transparent to the user, sometimes it
> >>> might make sense to notify the user of the success/failure of a
> >>> retry. However, this might mean keeping additional records at the
> >>> registry level.
> >>>
> >> In addition we should also have a way of cleaning up unsubmitted
> >> experiment ids. (But not sure whether you want to address this right
> >> now.) The way I see this is to have a periodic thread which goes
> >> through the table and clears out experiments which have not been
> >> submitted for a defined time.
> >>
> > +1. Something else we may have to think of later is data archiving
> > capabilities. We keep running into performance issues when the
> > database grows with experiment results. Unless we become experts in
> > distributed database management, we should have a much better way to
> > manage our db performance issues.
> >
> >> BTW, nice review notes, Saminda.
> >>
> >> Thanks
> >> Amila
> >>

--
Thanks,
Sachith Withana
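
P.S. For the wiki draft, the "return void, throw typed exceptions" idea from the thread could be sketched roughly as below. This is only an illustration, not the actual Orchestrator code. The names launchExperiment, ValidationFailedException and DatabasePersistenceException come from the discussion; the OrchestratorException base class, the OrchestratorSketch class, its in-memory map standing in for the registry, isLaunched(), and the message wording are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical common base type so gateway code can catch everything at once.
class OrchestratorException extends Exception {
    OrchestratorException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Thrown when the request itself is wrong; the message says how to fix it.
class ValidationFailedException extends OrchestratorException {
    ValidationFailedException(String message) {
        super(message, null);
    }
}

// Thrown when saving fails; internal details stay in the cause, not the message.
class DatabasePersistenceException extends OrchestratorException {
    DatabasePersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

class OrchestratorSketch {
    // Assumption: an in-memory map stands in for the real registry.
    private final Map<String, String> registry = new ConcurrentHashMap<>();

    // Returns void on success and throws a typed exception on failure.
    void launchExperiment(String experimentId, String executionUser)
            throws OrchestratorException {
        if (experimentId == null || experimentId.trim().isEmpty()) {
            throw new ValidationFailedException(
                "Experiment ID is missing. Pass the ID that was returned "
                + "when the experiment was created.");
        }
        try {
            persist(experimentId, executionUser);
        } catch (RuntimeException e) {
            throw new DatabasePersistenceException(
                "Could not save experiment '" + experimentId + "'. Retry the "
                + "launch; if it keeps failing, contact the administrator.", e);
        }
    }

    // Simulated persistence step: a null user plays the role of a storage failure.
    private void persist(String experimentId, String executionUser) {
        if (executionUser == null) {
            throw new IllegalStateException("registry rejected a null user");
        }
        registry.put(experimentId, executionUser);
    }

    boolean isLaunched(String experimentId) {
        return registry.containsKey(experimentId);
    }

    public static void main(String[] args) {
        OrchestratorSketch orchestrator = new OrchestratorSketch();
        try {
            orchestrator.launchExperiment("exp-1", "community-user");
            orchestrator.launchExperiment("", "community-user");
        } catch (ValidationFailedException e) {
            System.out.println("Fix the request: " + e.getMessage());
        } catch (DatabasePersistenceException e) {
            System.out.println("Retry later: " + e.getMessage());
        } catch (OrchestratorException e) {
            System.out.println("Launch failed: " + e.getMessage());
        }
    }
}
```

A gateway developer can then catch the specific exception types and react differently to each, while the top-level message stays free of Airavata internals.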