Re: Orchestration Component implementation review

Marlon Pierce Mon, 20 Jan 2014 08:10:48 -0800

Thanks, Sachith.  Can you explain your question about APIs and SPIs a
little more?



Marlon

On 1/20/14 10:53 AM, Sachith Withana wrote:
> Hi All,
>
> I will go ahead and create the Wiki on the Orchestrator. Will send you all
> a draft as soon as I can.
>
> One question though: do we have to explicitly show both the SPIs and the APIs?
>
>
> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <marpi...@iu.edu> wrote:
>
>> +1 for real use cases first. We have at least 3.  But I'm sure we will
>> want to make it as easy as possible for developers to pass back the
>> correct, created experimentID when invoking launchExperiment.
>>
>>
>> Marlon
>>
>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>>> Marlon, I think until we put this to real use we won't get much feedback
>>> on what aspects we should focus on more and which features we should
>>> expand or prioritize. So how about having a test plan for the
>>> Orchestrator? Expose it to real use cases and see how it survives. WDYT?
>>>
>>> It might be a little confusing to return a "JobRequest" object from the
>>> Orchestrator (since it's a response). Or perhaps it should be renamed?
>>>
>>> Sachith, I think we should have a Google Hangout or a separate mail
>>> thread (or both) to discuss multi-threaded support. Could you organize
>>> this, please?
>>>
>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>>> <thejaka.am...@gmail.com> wrote:
>>>
>>>>
>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <samin...@gmail.com>
>>>> wrote:
>>>>> Following are a few thoughts I had during my review of the component,
>>>>>
>>>>> *Multi-threaded vs single threaded*
>>>>> If we are going to have multi-threaded job submission, the
>>>>> implementation should handle race conditions. Essentially, a
>>>>> JobSubmitter should be able to "lock" an experiment request before it
>>>>> continues processing that request, so that other JobSubmitters accessing
>>>>> the experiment requests at the same time would skip it.
>>>>>
>>>> +1. These are implementation details.
>>>>
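
To make the "lock and skip" idea above concrete, here is a minimal sketch,
assuming all JobSubmitter threads run in one JVM. The class and method names
(ExperimentLockTable, tryLock, unlock) are hypothetical and do not come from
the Orchestrator code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: none of these names exist in the Airavata code base.
final class ExperimentLockTable {
    // Maps experiment ID -> name of the submitter currently holding it.
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /** Returns true only for the first submitter that claims the experiment. */
    boolean tryLock(String experimentId, String submitterName) {
        // putIfAbsent is atomic, so two threads can never both see null here.
        return owners.putIfAbsent(experimentId, submitterName) == null;
    }

    /** Releases the claim, but only if this submitter actually holds it. */
    void unlock(String experimentId, String submitterName) {
        owners.remove(experimentId, submitterName);
    }
}
```

A submitter that gets false from tryLock simply moves on to the next request.
An in-memory table like this would not cover multiple deployments of the
service; in that case the claim would probably have to live in the registry.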
>>>>
>>>>> *Orchestrator service*
>>>>> We might want to think about the possibility that in future we will
>>>>> have multiple deployments of an Airavata service. This could
>>>>> particularly be true for SciGaP. We may have to think about how some of
>>>>> the internal data structures/SPIs should be updated to accommodate such
>>>>> requirements in future.
>>>> +1.
>>>>
>>>>
>>>>> *Orchestrator Component configurations*
>>>>> I see a lot of places where the orchestrator can have configurations. I
>>>>> think it's too early to finalize them, but I think we can start
>>>>> refactoring them out, perhaps to airavata-server.properties. I'm also
>>>>> seeing that the orchestrator is now hardcoded to use the default/admin
>>>>> gateway and username. I think these should come from the request itself.
>>>>>
>>>> +1. But overall we may need to change the way we handle configurations
>>>> within Airavata. Currently we have multiple configuration files and
>>>> multiple places where we read configurations. IMO we should have a
>>>> separate module to handle configurations. Only this module should be
>>>> aware of how to interpret configurations in the file, and it should
>>>> provide a component interface to access those configuration values.
>>>>
>>> +1 we tried this once with "ServerSettings" and "ApplicationSettings",
>>> but apparently more configuration files seem to have spawned again. So
>>> far, however, they seem to be localized to their own components.
>>>
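
One possible shape for the single configuration module discussed above, as a
rough sketch only. OrchestratorConfig and its methods are hypothetical names;
this is not the existing ServerSettings or ApplicationSettings code. It
assumes the values have already been loaded into a java.util.Properties
object, for example from airavata-server.properties.

```java
import java.util.Properties;

// Hypothetical facade: class, method, and key names are illustrative only.
final class OrchestratorConfig {
    private final Properties props;

    OrchestratorConfig(Properties props) {
        this.props = props;
    }

    /** Returns the configured value, or the default if the key is absent. */
    String getString(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    /** Parses an integer value; fails with a message that names the key. */
    int getInt(String key, int defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Property '" + key + "' must be an integer, but was: " + raw, e);
        }
    }
}
```

Components would ask this one class for values instead of parsing files
themselves, so the knowledge of how to interpret the file stays in one place.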
>>>>> *Visibility of API functions*
>>>>> I think the initialize(), shutdown() and startJobSubmitter() functions
>>>>> should not be part of the API because I don't see a scenario where the
>>>>> gateway developer would be responsible for using them. They serve a more
>>>>> internal purpose of managing the orchestrator component IMO. As Amila
>>>>> pointed out so long ago (wink), functions that do not concern outside
>>>>> parties should not be exposed as part of the API.
>>>>>
>>>> +1
>>>>
>>>>
>>>>> *Return values of Orchestrator API*
>>>>> IMO, unless it is specifically required, the functions do not
>>>>> necessarily need to return anything; they can throw exceptions when
>>>>> needed. For example, launchExperiment can simply return void if all
>>>>> is successful and throw an exception if something fails. Handling
>>>>> issues with a try/catch is not only simpler, but the explanations are
>>>>> also readily available to the user.
>>>>>
>>>> +1. Also try to have different exceptions for different scenarios. For
>>>> example, if persistence (hypothetical) fails,
>>>> DatabasePersistenceException; if validation fails,
>>>> ValidationFailedException; etc. Then the developer who uses the API can
>>>> catch these different exceptions and act on them appropriately.
>>>>
>>> +1. What needs to be understood here is that the exception should be a
>>> gateway-friendly exception, i.e. it should not expose internal details of
>>> Airavata at the top-level exception, and the exception message should be
>>> self-explanatory enough that the gateway developer is not left scratching
>>> his/her head after reading it. Feedback from Sudhakar some time back was
>>> to provide suggestions in the exception message on how to resolve the
>>> issue.
>>>
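
A small sketch of the exception approach discussed above: a void
launchExperiment that signals failures through scenario-specific exceptions
with messages that suggest a fix. DatabasePersistenceException and
ValidationFailedException are the names floated in this thread;
OrchestratorException, SketchOrchestrator, and the String parameter are
assumptions made for illustration, not the actual Orchestrator API.

```java
// Hypothetical common base type so callers can also catch everything at once.
class OrchestratorException extends Exception {
    OrchestratorException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Thrown when the request itself is unusable.
class ValidationFailedException extends OrchestratorException {
    ValidationFailedException(String message) {
        super(message, null);
    }
}

// Thrown when saving to the registry fails; the low-level cause stays nested.
class DatabasePersistenceException extends OrchestratorException {
    DatabasePersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Illustrative stand-in for the real component.
final class SketchOrchestrator {
    /** Returns normally on success; every failure is reported as an exception. */
    void launchExperiment(String experimentId) throws OrchestratorException {
        if (experimentId == null || experimentId.trim().isEmpty()) {
            // The message says what went wrong and how to resolve it.
            throw new ValidationFailedException(
                "No experiment ID was supplied. Create the experiment first "
                + "and pass the ID returned by that call.");
        }
        // Real submission and persistence logic would go here.
    }
}
```

A gateway developer can then catch ValidationFailedException to fix the
request and DatabasePersistenceException to retry later, without seeing
internal stack details in the top-level message.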
>>>>> *Data persisted in registry*
>>>>> ExperimentRequest.getUsername(): I think we should clarify what this
>>>>> username denotes. In the current API, in experiment submission we
>>>>> consider two types of users: the submission user (the user who submits
>>>>> the experiment to the Airavata Server; this is inferred from the request
>>>>> itself) and the execution user (the user who correlates to the
>>>>> application executions of the gateway; thus this user can be a different
>>>>> user for different gateways, e.g. community user, gateway user).
>>>>> I think we should persist the date/time of the experiment request as
>>>>> well.
>>>>>
>>>> +1
>>>>
>>>>> Also, when retrying API functions after a failure in a previous
>>>>> attempt, there should be a way to avoid repeating already performed
>>>>> steps, or to gracefully roll back and redo the required steps as
>>>>> necessary. While such actions could be transparent to the user,
>>>>> sometimes it might make sense to notify the user of the
>>>>> success/failure of a retry. However, this might mean keeping additional
>>>>> records at the registry level.
>>>>>
>>>> In addition we should also have a way of cleaning up unsubmitted
>>>> experiment IDs. (But not sure whether you want to address this right
>>>> now.) The way I see this is to have a periodic thread which goes through
>>>> the table and clears out experiments which have not been submitted for a
>>>> defined time.
>>> +1. Something else we may have to think of later is the data archiving
>>> capabilities. We keep running into performance issues when the database
>>> grows with experiment results. Unless we become experts in distributed
>>> database management, we should have a much better way to manage our db
>>> performance issues.
>>>
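
The periodic cleanup of unsubmitted experiment IDs suggested above could look
roughly like this. It is a sketch under assumptions: an in-memory map stands
in for the registry table, and UnsubmittedExperimentCleaner and its methods
are hypothetical names.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical cleaner; the map is a stand-in for the real registry table.
final class UnsubmittedExperimentCleaner {
    // Experiment ID -> creation time (epoch millis) of not-yet-submitted experiments.
    private final Map<String, Long> unsubmitted = new ConcurrentHashMap<>();
    private final long maxAgeMillis;

    UnsubmittedExperimentCleaner(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    void register(String experimentId, long createdAtMillis) {
        unsubmitted.put(experimentId, createdAtMillis);
    }

    void markSubmitted(String experimentId) {
        unsubmitted.remove(experimentId);
    }

    boolean isPending(String experimentId) {
        return unsubmitted.containsKey(experimentId);
    }

    /** Removes entries older than maxAgeMillis; returns how many were removed. */
    int purgeOlderThan(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = unsubmitted.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > maxAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    /** Starts the periodic thread; the caller shuts the executor down on exit. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(
            () -> purgeOlderThan(System.currentTimeMillis()),
            periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return executor;
    }
}
```

Keeping the purge logic in a method that takes the current time as a
parameter makes it easy to test without waiting for the scheduler.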
>>>
>>>> BTW, nice review notes, Saminda.
>>>>
>>>> Thanks
>>>> Amila
>>>>
>>>>
>>>>
>>
>
