These look good to me. Can you please explain the usage of the getGFacJobsForDescriptors method? How is this different from getting the descriptors from the registry? And who should register this data?
Also a typo in updateGFacJobMetadta method name.

Thanks
Raminder

On May 21, 2013, at 11:28 PM, Saminda Wijeratne wrote:

> The following API functions are added to the ProvenanceManager [2]:
>
> boolean isGFacJobExists(String gfacJobId)
> void addGFacJob(GFacJob job)
> void updateGFacJob(GFacJob job)
> void updateGFacJobStatus(String gfacJobId, GFacJobStatus status)
> void updateGFacJobData(String gfacJobId, String jobdata)
> void updateGFacJobSubmittedTime(String gfacJobId, Date submitted)
> void updateGFacJobCompletedTime(String gfacJobId, Date completed)
> void updateGFacJobMetadta(String gfacJobId, String metadata)
> GFacJob getGFacJob(String gfacJobId)
> List<GFacJob> getGFacJobsForDescriptors(String serviceDescriptionId, String hostDescriptionId, String applicationDescriptionId)
> List<GFacJob> getGFacJobs(String experimentId, String workflowExecutionId, String nodeId)
>
> Thoughts are welcome!!!
>
> 2. https://svn.apache.org/repos/asf/airavata/trunk/modules/airavata-client/src/main/java/org/apache/airavata/client/api/ProvenanceManager.java
>
> On Tue, May 21, 2013 at 5:04 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
>
>> But I thought the providers are part of the GFac (not a separate service). If not, then the providers should report to GFac. Or else there is no way the GFac knows which status to update, which data to update, etc. Does the current GFac implementation support this?
>>
>> On Tue, May 21, 2013 at 4:47 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
>>
>>> I think that should be handled at a higher layer like the Workflow Interpreter or GFac. From an FT perspective it is better if providers are stateless. One reason is that we don't have control over some providers, and there will be many places writing to disk if we implement the persistence logic at provider level.
>>>
>>> Thanks
>>> Amila
>>>
>>> On Tue, May 21, 2013 at 4:39 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
>>>
>>>> On Tue, May 21, 2013 at 4:36 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
>>>>
>>>>> On Tue, May 21, 2013 at 3:51 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for the feedback Amila. A few comments inline.
>>>>>>
>>>>>> On Tue, May 21, 2013 at 12:29 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Saminda,
>>>>>>>
>>>>>>> Great suggestion. Also +1 for Danushka's proposal to have serialized/deserialized data.
>>>>>>> A few suggestions:
>>>>>>> 1. In addition to successful/error statuses we need other statuses for nodes and workflows.
>>>>>>> E.g.:
>>>>>>> node - started, submitted, in-progress, failed, successful, etc.
>>>>>>>
>>>>>> Sorry if I was too vague. Yes, we have more fine-grained statuses for workflow and node [1]. We will have a much finer level of granularity for a GFacJob status.
>>>>>> public static enum GFacJobStatus{
>>>>>>     SUBMITTED, // job is submitted, possibly waiting to start executing
>>>>>>     EXECUTING, // submitted job is being executed
>>>>>>     CANCELLED, // job was cancelled
>>>>>>     PAUSED, // job was paused
>>>>>>     WAITING_FOR_DATA, // job is waiting for data to continue executing
>>>>>>     FAILED, // error occurred while job was executing and the job stopped
>>>>>>     FINISHED, // job completed successfully
>>>>>>     UNKNOWN // unknown status. lookup the metadata for more details.
>>>>>> }
>>>>>>
>>>>>>> 2. This data will be useful in implementing FT and Load Balancing in each component. Some time back we had discussions to make GFac stateless. So who is going to populate this data structure and persist it?
>>>>>>>
>>>>>> That is a very good question... :).
>>>>>> This summer is going to be a long one... ;)
>>>>>>
>>>>> What I meant is which component is doing the persistence (GFac or WF Interpreter)? Not the actual person who is going to implement it :).
>>>>>
>>>> hih hih....
>>>> Well, it's going to be whichever provider is responsible for managing the job lifecycle. For example, GRAMProvider should be responsible for recording all the data relating to the GRAM jobs it's working with.
>>>>
>>>>>> 1. https://svn.apache.org/repos/asf/airavata/trunk/modules/workflow-model/workflow-model-core/src/main/java/org/apache/airavata/workflow/model/graph/Node.java
>>>>>>
>>>>>>> Thanks
>>>>>>> Amila
>>>>>>>
>>>>>>> On Tue, May 21, 2013 at 11:39 AM, Saminda Wijeratne <samin...@gmail.com> wrote:
>>>>>>>
>>>>>>>> That is an excellent idea. We can have the job data field be the designated GFac job serialized data. Whatever the GFacProvider, it should adhere to it.
>>>>>>>>
>>>>>>>> I'm still inclined to keep the rest of the fields for ease of querying the required data. For example, if we wanted all attempts at executing a particular node of a workflow, or if we wanted to know which application descriptions are faster in execution or more reliable, etc., we can let the query language deal with it. wdyt?
>>>>>>>>
>>>>>>>> On Tue, May 21, 2013 at 11:24 AM, Danushka Menikkumbura <danushka.menikkumb...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Saminda,
>>>>>>>>>
>>>>>>>>> I think the data container does not need to have a generic format. We can have a base class that facilitates object serialization/deserialization and let specific metadata structures implement them as required.
>>>>>>>>> We get the Registry API to serialize objects and save them in a metadata table (with just two columns?) and to deserialize them as they are loaded off the registry.
>>>>>>>>>
>>>>>>>>> Danushka
>>>>>>>>>
>>>>>>>>> On Tue, May 21, 2013 at 8:34 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It has become more and more apparent that saving the data related to executing jobs from the GFac can be useful for many reasons, such as:
>>>>>>>>>>
>>>>>>>>>> debugging
>>>>>>>>>> retrying
>>>>>>>>>> making smart decisions on reliability/cost etc.
>>>>>>>>>> statistical analysis
>>>>>>>>>>
>>>>>>>>>> Thus we thought of saving the data related to GFac jobs in the registry in order to facilitate features such as the above in the future.
>>>>>>>>>>
>>>>>>>>>> However, a GFac job is potentially any sort of computing resource access (GRAM/UNICORE/EC2 etc.). Therefore we need to come up with a generalized data structure that can hold the data of any type of resource. Following are the suggested data to save for a single GFac job execution:
>>>>>>>>>>
>>>>>>>>>> *experiment id, workflow instance id, node id* - pinpoint the node execution
>>>>>>>>>> *service, host, application description ids* - pinpoint the descriptors responsible
>>>>>>>>>> *local job id* - the unique job id retrieved/generated per execution [PRIMARY KEY]
>>>>>>>>>> *job data* - data related to executing the job (e.g. the RSL in GRAM)
>>>>>>>>>> *submitted, completed time*
>>>>>>>>>> *completed status* - whether the job was successful or ran into errors etc.
>>>>>>>>>> *metadata* - custom field to add anything the user wants
>>>>>>>>>>
>>>>>>>>>> Your feedback is most welcome.
>>>>>>>>>> The API-related changes will also be discussed once we have a proper data structure. We are hoping to implement this within the next few days.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Saminda
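
For illustration only, here is a minimal Java sketch of the record proposed at the start of this thread and of the two list queries in the API above. It is not the Airavata implementation: the GFacJob field names and the InMemoryGFacJobStore class are hypothetical, the in-memory map merely stands in for the registry, and only the method names and the GFacJobStatus values are taken from the messages quoted here. Under this reading, getGFacJobsForDescriptors returns recorded job executions that used a given descriptor triple, rather than the descriptors themselves.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Status values as listed in the GFacJobStatus enum quoted in the thread.
enum GFacJobStatus {
    SUBMITTED, EXECUTING, CANCELLED, PAUSED, WAITING_FOR_DATA, FAILED, FINISHED, UNKNOWN
}

// Hypothetical holder for the proposed fields (field names are assumptions).
final class GFacJob {
    final String localJobId; // proposed primary key
    final String experimentId, workflowExecutionId, nodeId;
    final String serviceDescriptionId, hostDescriptionId, applicationDescriptionId;
    String jobData;  // e.g. the RSL for a GRAM job
    String metadata; // free-form custom field
    Date submittedTime, completedTime;
    GFacJobStatus status = GFacJobStatus.UNKNOWN;

    GFacJob(String localJobId, String experimentId, String workflowExecutionId, String nodeId,
            String serviceDescriptionId, String hostDescriptionId, String applicationDescriptionId) {
        this.localJobId = localJobId;
        this.experimentId = experimentId;
        this.workflowExecutionId = workflowExecutionId;
        this.nodeId = nodeId;
        this.serviceDescriptionId = serviceDescriptionId;
        this.hostDescriptionId = hostDescriptionId;
        this.applicationDescriptionId = applicationDescriptionId;
    }
}

// Hypothetical in-memory stand-in for the registry, keyed by local job id.
final class InMemoryGFacJobStore {
    private final Map<String, GFacJob> jobs = new LinkedHashMap<>();

    boolean isGFacJobExists(String gfacJobId) {
        return jobs.containsKey(gfacJobId);
    }

    void addGFacJob(GFacJob job) {
        jobs.put(job.localJobId, job);
    }

    void updateGFacJobStatus(String gfacJobId, GFacJobStatus status) {
        GFacJob job = jobs.get(gfacJobId);
        if (job == null) {
            throw new IllegalArgumentException("Unknown GFac job: " + gfacJobId);
        }
        job.status = status;
    }

    GFacJob getGFacJob(String gfacJobId) {
        return jobs.get(gfacJobId);
    }

    // Job records whose three descriptor ids all match exactly.
    List<GFacJob> getGFacJobsForDescriptors(String serviceId, String hostId, String appId) {
        List<GFacJob> result = new ArrayList<>();
        for (GFacJob job : jobs.values()) {
            if (job.serviceDescriptionId.equals(serviceId)
                    && job.hostDescriptionId.equals(hostId)
                    && job.applicationDescriptionId.equals(appId)) {
                result.add(job);
            }
        }
        return result;
    }

    // Job records (attempts) for one node of one workflow execution.
    List<GFacJob> getGFacJobs(String experimentId, String workflowExecutionId, String nodeId) {
        List<GFacJob> result = new ArrayList<>();
        for (GFacJob job : jobs.values()) {
            if (job.experimentId.equals(experimentId)
                    && job.workflowExecutionId.equals(workflowExecutionId)
                    && job.nodeId.equals(nodeId)) {
                result.add(job);
            }
        }
        return result;
    }
}

// Small driver with made-up ids, showing the two queries side by side.
class GFacJobSketch {
    public static void main(String[] args) {
        InMemoryGFacJobStore store = new InMemoryGFacJobStore();
        store.addGFacJob(new GFacJob("job-1", "exp-1", "wf-1", "node-1", "svc-A", "host-A", "app-A"));
        store.addGFacJob(new GFacJob("job-2", "exp-1", "wf-1", "node-1", "svc-A", "host-B", "app-A"));
        store.updateGFacJobStatus("job-1", GFacJobStatus.FINISHED);
        System.out.println(store.getGFacJobsForDescriptors("svc-A", "host-A", "app-A").size());
        System.out.println(store.getGFacJobs("exp-1", "wf-1", "node-1").size());
    }
}
```

Keeping the descriptor and node ids as separate fields, as suggested in the thread, is what lets both lookups be plain filters; a real registry would presumably push them down to its query language instead of scanning a map.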