Re: Orchestration Component implementation review

Saminda Wijeratne Mon, 20 Jan 2014 09:33:42 -0800

On Mon, Jan 20, 2014 at 8:38 AM, Sachith Withana <swsach...@gmail.com> wrote:


> Okay. Will do. Will send you a draft asap.
>
>
> On Mon, Jan 20, 2014 at 11:17 AM, Marlon Pierce <marpi...@iu.edu> wrote:
>
>> I think we want to make everything explicit.  The Airavata API is
>> intended for external clients, not communications between Airavata
>> components (the SPI).  Your figure at
>>
>> https://cwiki.apache.org/confluence/display/AIRAVATA/Simple+Gateway+Developer+Guide
>> summarizes this nicely.  You'll need to explain both the API (what a
>> gateway does) and the SPI (how Airavata components work together) to do
>> this.
>>
+1. As you did for the gateway developer guide, an initial draft, at least
as a bullet-point wiki page, would get you some quick feedback. Try to use
some flow diagrams to explain activities.

>
>>
>> Marlon
>>
>> On 1/20/14 11:09 AM, Sachith Withana wrote:
>> > We are using internal SPIs which are not reflected in the Airavata API.
>> > should it be explained or just make it a higher level diagram which
>> won't
>> > show the SPIs?
>> >
>> >
>> > On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <marpi...@iu.edu>
>> wrote:
>> >
>> >> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
>> >> little more?
>> >>
>> >>
>> >> Marlon
>> >>
>> >> On 1/20/14 10:53 AM, Sachith Withana wrote:
>> >>> Hi All,
>> >>>
>> >>> I will go ahead and create the Wiki on the Orchestrator. Will send you
>> >> all
>> >>> a draft as soon as I can.
>> >>>
>> >>> One question though, Do we have to explicitly show the SPIs and APIs
>> >> both?
>> >>>
>> >>> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <marpi...@iu.edu>
>> wrote:
>> >>>
>> >>>> +1 for real use cases first. We have at least 3.  But I'm sure we
>> will
>> >>>> want to make it as easy as possible for developers to pass back the
>> >>>> correct, created experimentID when invoking launchExperiment.
>> >>>>
>> >>>>
>> >>>> Marlon
>> >>>>
>> >>>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>> >>>>> Marlon, I think until we put this to real use we won't get much
>> feedback
>> >>>> on
>> >>>>> what aspects we should focus on more and in what features we should
>> >>>> expand
>> >>>>> or prioritize on. So how about having a test plan for the
>> Orchestrator.
>> >>>>> Expose it to real use cases and see how it survives. WDYT?
>> >>>>>
>> >>>>> It might be a little confusing to return a "JobRequest" object from
>> the
>> >>>>> Orchestrator (since it's a response). Or perhaps it should be
>> renamed?
>> >>>>>
>> >>>>> Sachith, I think we should have a google hangout or a separate mail
>> >>>> thread
>> >>>>> (or both) to discuss multi-threaded support. Could you organize this
>> >>>> please?
>> >>>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>> >>>>> <thejaka.am...@gmail.com> wrote:
>> >>>>>
>> >>>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
>> >> samin...@gmail.com
>> >>>>> wrote:
>> >>>>>>> Following are a few thoughts I had during my review of the
>> component,
>> >>>>>>>
>> >>>>>>> *Multi-threaded vs single threaded*
>> >>>>>>> If we are going to have multi-threaded job submission the
>> >>>> implementation
>> >>>>>>> should work on handling race conditions. Essentially JobSubmitter
>> >>>> should be
>> >>>>>>> able to "lock" an experiment request before continuing processing
>> >> that
>> >>>>>>> request so that other JobSubmitters accessing the experiment
>> requests
>> >>>> at the
>> >>>>>>> same time would skip it.
>> >>>>>>>
>> >>>>>> +1. These are implementation details.
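
A minimal sketch of the "lock an experiment request" idea above, written in Python for brevity. The names ExperimentQueue, try_claim and run_submitter are hypothetical and not part of Airavata; an in-memory dictionary stands in for whatever store actually holds the experiment requests. Each submitter claims a request atomically and skips any request that another submitter already holds.

```python
import threading


class ExperimentQueue:
    """Hypothetical in-memory stand-in for the store of experiment requests."""

    def __init__(self, experiment_ids):
        self._guard = threading.Lock()
        self._pending = list(experiment_ids)
        self._claimed = {}  # experiment id -> name of the submitter holding it

    def pending(self):
        with self._guard:
            return list(self._pending)

    def try_claim(self, experiment_id, submitter):
        """Atomically take a request; return False if someone else holds it."""
        with self._guard:
            if experiment_id in self._claimed:
                return False
            self._claimed[experiment_id] = submitter
            return True

    def claimed(self):
        with self._guard:
            return dict(self._claimed)


def run_submitter(name, queue):
    """Walk the pending requests, processing only those this submitter claims."""
    for experiment_id in queue.pending():
        if not queue.try_claim(experiment_id, name):
            continue  # another submitter locked it first, so skip it
        # The actual job submission would happen here.


def demo(num_experiments=20, num_submitters=4):
    queue = ExperimentQueue(f"exp-{i}" for i in range(num_experiments))
    threads = [
        threading.Thread(target=run_submitter, args=(f"submitter-{n}", queue))
        for n in range(num_submitters)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return queue.claimed()
```

With several Airavata deployments sharing one registry, the same check-and-mark step would presumably have to be a single atomic operation in the registry itself rather than an in-process lock.
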
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Orchestrator service*
>> >>>>>>> We might want to think of the possibility in future where we will
>> be
>> >>>>>>> having multiple deployments of an Airavata service. This could
>> >>>> particularly
>> >>>>>>> be true for SciGaP. We may have to think how some of the internal
>> >> data
>> >>>>>>> structures/SPIs should be updated to accommodate such requirements
>> in
>> >>>> future.
>> >>>>>> +1.
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Orchestrator Component configurations*
>> >>>>>>> I see a lot of places where the orchestrator can have
>> configurations.
>> >> I
>> >>>>>>> think it's too early to finalize them, but I think we can start
>> >> refactoring
>> >>>>>>> them out perhaps to the airavata-server.properties. I'm also
>> seeing
>> >> the
>> >>>>>>> orchestrator is now hardcoded to use default/admin gateway and
>> >>>> username. I
>> >>>>>>> think it should come from the request itself.
>> >>>>>>>
>> >>>>>> +1. But overall we may need to change the way we handle
>> >>>> configurations
>> >>>>>> within Airavata. Currently we have multiple configuration files and
>> >>>>>> multiple places where we read configurations. IMO we should have a
>> >>>> separate
>> >>>>>> module to handle configurations. Only this module should be aware
>> how
>> >> to
>> >>>>>> interpret configurations in the file and provide a component
>> interface
>> >> to
>> >>>>>> access those configuration values.
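
A rough sketch of the single configuration module described above, in Python for brevity. The Settings class and the property keys used here are hypothetical; only this class knows how to parse the properties text, and components read values through its accessors.

```python
class Settings:
    """Hypothetical single entry point for reading configuration values."""

    def __init__(self, properties_text):
        self._values = {}
        for line in properties_text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, separator, value = line.partition("=")
            if separator:
                self._values[key.strip()] = value.strip()

    def get(self, key, default=None):
        return self._values.get(key, default)

    def get_int(self, key, default):
        return int(self._values.get(key, default))


# Illustrative content only; these keys are assumptions, not real
# airavata-server.properties entries.
EXAMPLE_PROPERTIES = """
# orchestrator settings
orchestrator.submitter.threads = 5
orchestrator.default.gateway = default
"""

settings = Settings(EXAMPLE_PROPERTIES)
threads = settings.get_int("orchestrator.submitter.threads", 1)
```
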
>> >>>>>>
>> >>>>> +1 we tried this once with "ServerSettings" and
>> "ApplicationSettings",
>> >>>> but
>> >>>>> apparently more configuration files seem to have spawned again. So
>> far
>> >>>>> however they seem to be localized to their own components.
>> >>>>>
>> >>>>>>> *Visibility of API functions*
>> >>>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>> >>>> should
>> >>>>>>> not be part of the API because I don't see a scenario where the
>> >> gateway
>> >>>>>>> developer would be responsible for using them. They serve a more
>> >>>> internal
>> >>>>>>> purpose of managing the orchestrator component IMO. As Amila
>> pointed
>> >>>> out so
>> >>>>>>> long ago (wink) functions that do not concern outside parties
>> should
>> >>>> not be
>> >>>>>>> used as part of the API.
>> >>>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Return values of Orchestrator API*
>> >>>>>>> IMO unless it is specifically required to do so I think the
>> functions
>> >>>>>>> do not necessarily need to return anything; they can simply throw
>> >>>> exceptions
>> >>>>>>> when needed. For example the launchExperiment can simply return
>> void
>> >>>> if all
>> >>>>>>> is successful and throw an exception if something fails. Handling
>> >>>> issues
>> >>>>>>> with a try catch is not only simpler but also the explanations are
>> >>>> readily
>> >>>>>>> available for the user.
>> >>>>>>>
>> >>>>>> +1. Also try to have different exceptions for different scenarios.
>> For
>> >>>>>> example if persistence (hypothetical) fails,
>> >>>> DatabasePersistenceException,
>> >>>>>> if validation fails, ValidationFailedException etc ... Then the
>> >>>> developer
>> >>>>>> who uses the API can catch these different exceptions and act on
>> them
>> >>>>>> appropriately.
>> >>>>>>
>> >>>>> +1. What needs to be understood here is that the Exception should
>> be a
>> >>>>> Gateway friendly exception. i.e. it should not expose internal
>> details
>> >> of
>> >>>>> Airavata at the top-level exception and exception message should be
>> >> self
>> >>>>> explanatory enough for the gateway developer not to remain
>> scratching
>> >>>>> his/her head after reading the exception. A feedback from Sudhakar
>> >>>> sometime
>> >>>>> back was to provide suggestions in the exception message on how to
>> >>>> resolve
>> >>>>> the issue.
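
The exception handling discussed above could look roughly like the following sketch, in Python for brevity. ValidationFailedException and DatabasePersistenceException are the names suggested in the thread; the OrchestratorError base class, the suggestion argument, the dictionary-like registry and the snake_case launch_experiment function are assumptions made for illustration. The function returns nothing on success, raises a distinct exception per failure scenario, keeps internal details out of the top-level message, and includes a hint on how to resolve the problem.

```python
class OrchestratorError(Exception):
    """Hypothetical base class for gateway-friendly errors."""

    def __init__(self, message, suggestion):
        super().__init__(f"{message} Suggestion: {suggestion}")
        self.suggestion = suggestion


class ValidationFailedException(OrchestratorError):
    pass


class DatabasePersistenceException(OrchestratorError):
    pass


def launch_experiment(experiment_id, registry):
    """Return None on success; raise a specific exception on failure."""
    if not experiment_id:
        raise ValidationFailedException(
            "The experiment ID is empty.",
            "Create the experiment first and pass the ID that was returned.",
        )
    try:
        registry[experiment_id] = "LAUNCHED"
    except Exception as cause:
        # The internal cause stays chained for the logs, but it is not
        # part of the message shown to the gateway developer.
        raise DatabasePersistenceException(
            "The experiment could not be saved.",
            "Retry later, or contact the Airavata administrator.",
        ) from cause


def describe_failure(experiment_id, registry):
    """Show how a gateway developer might act on each exception type."""
    try:
        launch_experiment(experiment_id, registry)
        return "launched"
    except ValidationFailedException as error:
        return f"fix the request: {error}"
    except DatabasePersistenceException as error:
        return f"try again: {error}"
```
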
>> >>>>>
>> >>>>>>> *Data persisted in registry*
>> >>>>>>> ExperimentRequest.getUsername() : I think we should clarify what
>> this
>> >>>>>>> username denotes. In current API, in experiment submission we
>> >> consider
>> >>>> two
>> >>>>>>> types of users. Submission user (the user who submits the
>> experiment
>> >>>> to the
>> >>>>>>> Airavata Server - this is inferred by the request itself) and the
>> >>>> execution
>> >>>>>>> user (the user who correlates to the application executions of the
>> >>>> gateway -
>> >>>>>>> thus this user can be a different user for different gateways, eg:
>> >>>> community
>> >>>>>>> user, gateway user).
>> >>>>>>> I think we should persist the date/time of the experiment request
>> as
>> >>>>>>> well.
>> >>>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>>>  Also when retrying API functions after a failure in
>> a
>> >>>>>>> previous attempt there should be a way not to repeat already
>> >>>> performed
>> >>>>>>> steps or to gracefully roll back and redo those required steps as
>> >>>> necessary.
>> >>>>>>> While such actions could be transparent to the user sometimes it
>> >> might
>> >>>> make
>> >>>>>>> sense to allow user to be notified of success/failure of a retry.
>> >>>> However
>> >>>>>>> this might mean keeping additional records at the registry level.
>> >>>>>>>
>> >>>>>> In addition we should also have a way of cleaning up unsubmitted
>> >>>>>> experiment ids. (But not sure whether you want to address this
>> right
>> >>>> now).
>> >>>>>> The way I see this is to have a periodic thread which goes through
>> the
>> >>>>>> table and clears up experiments which have not been submitted for a
>> defined
>> >>>> time.
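
A minimal sketch of the periodic cleanup described above, in Python for brevity. The record layout (a dictionary with "created" and "submitted" fields) and the function names are hypothetical; a plain dictionary stands in for the registry table.

```python
import threading
from datetime import datetime, timedelta


def purge_unsubmitted(experiments, now, max_age):
    """Delete records created more than max_age ago that were never submitted."""
    stale = [
        experiment_id
        for experiment_id, record in experiments.items()
        if not record["submitted"] and now - record["created"] > max_age
    ]
    for experiment_id in stale:
        del experiments[experiment_id]
    return stale


def start_periodic_purge(experiments, lock, interval_seconds, max_age):
    """Run purge_unsubmitted on a background thread until the event is set."""
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, so the body runs once per interval.
        while not stop.wait(interval_seconds):
            with lock:
                purge_unsubmitted(experiments, datetime.now(), max_age)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Calling set() on the returned event stops the background loop; the lock is expected to be the same one that guards every other access to the table.
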
>> >>>>> +1. Something else we may have to think of later is the data
>> archiving
>> >>>>> capabilities. We keep running into performance issues when the
>> >> database
>> >>>>> grows with experiment results. Unless we become experts of
>> distributed
>> >>>>> database management we should have a much better way to manage our db
>> >>>>> performance issues.
>> >>>>>
>> >>>>>
>> >>>>>> BTW, nice review notes, Saminda.
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Amila
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>
>> >
>>
>>
>
>
> --
> Thanks,
> Sachith Withana
>
>
