Application. But in our case we may have to use both, e.g. addApplicationJob(...) or addApplicationSubmission(...). I think the name addApplication(...) is misleading. wdyt?
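To make the naming question concrete, here is a rough sketch of how a few of the ProvenanceManager functions might read with "ApplicationJob" in place of "GFacJob". Only the method shapes are taken from the list quoted below. The names ApplicationJob, ApplicationJobStore and InMemoryApplicationJobStore are made up for illustration, the status is a plain String to keep the sketch short, and the in-memory map is just a stand-in for the registry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical job record; the real GFacJob class would carry more fields.
class ApplicationJob {
    final String jobId;
    String status = "SUBMITTED";
    String metadata;

    ApplicationJob(String jobId) {
        this.jobId = jobId;
    }
}

// Same method shapes as the GFacJob functions, with "Application" swapped in.
interface ApplicationJobStore {
    boolean isApplicationJobExists(String jobId);
    void addApplicationJob(ApplicationJob job);
    void updateApplicationJobStatus(String jobId, String status);
    void updateApplicationJobMetadata(String jobId, String metadata);
    ApplicationJob getApplicationJob(String jobId);
}

// In-memory stand-in for the registry, for illustration only.
class InMemoryApplicationJobStore implements ApplicationJobStore {
    private final Map<String, ApplicationJob> jobs = new HashMap<>();

    public boolean isApplicationJobExists(String jobId) {
        return jobs.containsKey(jobId);
    }

    public void addApplicationJob(ApplicationJob job) {
        jobs.put(job.jobId, job);
    }

    public void updateApplicationJobStatus(String jobId, String status) {
        require(jobId).status = status;
    }

    public void updateApplicationJobMetadata(String jobId, String metadata) {
        require(jobId).metadata = metadata;
    }

    public ApplicationJob getApplicationJob(String jobId) {
        return jobs.get(jobId);
    }

    // Fails loudly when an update targets a job that was never added.
    private ApplicationJob require(String jobId) {
        ApplicationJob job = jobs.get(jobId);
        if (job == null) {
            throw new IllegalArgumentException("Unknown job id: " + jobId);
        }
        return job;
    }
}
```

Read this way, addApplicationJob(...) clearly adds a job record, whereas a bare addApplication(...) could be taken to mean registering an application descriptor.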
On Wed, May 22, 2013 at 1:43 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:

> What is more familiar? "Application" or "Job"?
>
> Thanks
> Amila
>
> On Wed, May 22, 2013 at 11:28 AM, Saminda Wijeratne <samin...@gmail.com> wrote:
>
> > On Wed, May 22, 2013 at 11:22 AM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
> >
> > > I am a bit concerned about the names. Are we assuming that API users have
> > > knowledge about GFac?
> > > Or else we can just remove the "GFac" substring and have method names like
> > > "void updateJobMetadta(..)"
> >
> > You have a point there Amila. Perhaps we can name them as "Application"
> > rather than GFac since we already have the notion of an application
> > descriptor in the API. wdyt?
> >
> > > Thanks
> > > Amila
> > >
> > > On Tue, May 21, 2013 at 11:28 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > >
> > > > The following API functions are added for the ProvenanceManager[2]:
> > > >
> > > > boolean isGFacJobExists(String gfacJobId)
> > > > void addGFacJob(GFacJob job)
> > > > void updateGFacJob(GFacJob job)
> > > > void updateGFacJobStatus(String gfacJobId, GFacJobStatus status)
> > > > void updateGFacJobData(String gfacJobId, String jobdata)
> > > > void updateGFacJobSubmittedTime(String gfacJobId, Date submitted)
> > > > void updateGFacJobCompletedTime(String gfacJobId, Date completed)
> > > > void updateGFacJobMetadta(String gfacJobId, String metadata)
> > > > GFacJob getGFacJob(String gfacJobId)
> > > > List<GFacJob> getGFacJobsForDescriptors(String serviceDescriptionId, String hostDescriptionId, String applicationDescriptionId)
> > > > List<GFacJob> getGFacJobs(String experimentId, String workflowExecutionId, String nodeId)
> > > >
> > > > Thoughts are welcome!!!
> > > >
> > > > 2. https://svn.apache.org/repos/asf/airavata/trunk/modules/airavata-client/src/main/java/org/apache/airavata/client/api/ProvenanceManager.java
> > > >
> > > > On Tue, May 21, 2013 at 5:04 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > > >
> > > > > But I thought the providers are part of the GFac (not a separate
> > > > > service). If not, then the providers should report to GFac. Or else
> > > > > there is no way the GFac knows which status to update, which data to
> > > > > update, etc. Does the current GFac implementation support this?
> > > > >
> > > > > On Tue, May 21, 2013 at 4:47 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
> > > > >
> > > > > > I think that should be handled at a higher layer like the Workflow
> > > > > > Interpreter or GFac. From an FT perspective it is better if providers
> > > > > > are stateless. One reason is we don't have control over some providers,
> > > > > > and there will be many places writing to disk if we implement the
> > > > > > persistence logic at provider level.
> > > > > >
> > > > > > Thanks
> > > > > > Amila
> > > > > >
> > > > > > On Tue, May 21, 2013 at 4:39 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > > > > >
> > > > > > > On Tue, May 21, 2013 at 4:36 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
> > > > > > >
> > > > > > > > On Tue, May 21, 2013 at 3:51 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Thanks for the feedback Amila. A few comments inline.
> > > > > > > > >
> > > > > > > > > On Tue, May 21, 2013 at 12:29 PM, Amila Jayasekara <thejaka.am...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Saminda,
> > > > > > > > > >
> > > > > > > > > > Great suggestion. Also +1 for Danushka's proposal to have
> > > > > > > > > > serialized/de-serialized data.
> > > > > > > > > > A few suggestions:
> > > > > > > > > > 1. In addition to successful/error statuses we need other
> > > > > > > > > > statuses for nodes & workflows.
> > > > > > > > > > E.g.:
> > > > > > > > > > node - started, submitted, in-progress, failed, successful etc ...
> > > > > > > > >
> > > > > > > > > Sorry if I was too vague. Yes, we have more fine-grained statuses
> > > > > > > > > for workflow and node[1]. We will have a much more fine-grained
> > > > > > > > > level of granularity for a GFacJob status.
> > > > > > > > > public static enum GFacJobStatus{
> > > > > > > > >     SUBMITTED, //job is submitted, possibly waiting to start executing
> > > > > > > > >     EXECUTING, //submitted job is being executed
> > > > > > > > >     CANCELLED, //job was cancelled
> > > > > > > > >     PAUSED, //job was paused
> > > > > > > > >     WAITING_FOR_DATA, // job is waiting for data to continue executing
> > > > > > > > >     FAILED, // error occurred while job was executing and the job stopped
> > > > > > > > >     FINISHED, // job completed successfully
> > > > > > > > >     UNKNOWN // unknown status. lookup the metadata for more details.
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > > 2. This data will be useful in implementing FT and Load
> > > > > > > > > > Balancing in each component. Some time back we had discussions
> > > > > > > > > > to make GFac stateless. So who is going to populate this data
> > > > > > > > > > structure and persist it?
> > > > > > > > >
> > > > > > > > > That is a very good question... :). This summer is going to be a
> > > > > > > > > long one... ;)
> > > > > > > >
> > > > > > > > What I meant is which component is doing persistence? (GFac or WF
> > > > > > > > Interpreter). Not the actual person who is going to implement it :).
> > > > > > >
> > > > > > > hih hih....
> > > > > > > Well, it's going to be whichever provider is responsible for managing
> > > > > > > the job lifecycle. For example GRAMProvider should be responsible for
> > > > > > > recording all the data relating to the GRAM jobs it's working with.
> > > > > > >
> > > > > > > > > 1. https://svn.apache.org/repos/asf/airavata/trunk/modules/workflow-model/workflow-model-core/src/main/java/org/apache/airavata/workflow/model/graph/Node.java
> > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > Amila
> > > > > > > > > >
> > > > > > > > > > On Tue, May 21, 2013 at 11:39 AM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > That is an excellent idea. We can have the job data field be
> > > > > > > > > > > the designated GFac job serialized data. Whatever the
> > > > > > > > > > > GFacProvider, it should adhere to it.
> > > > > > > > > > >
> > > > > > > > > > > I'm still inclined to have the rest of the fields to ease
> > > > > > > > > > > querying for the required data. For example, if we wanted all
> > > > > > > > > > > attempts at executing a particular node of a workflow, or if
> > > > > > > > > > > we wanted to know which application descriptions are faster
> > > > > > > > > > > in execution or more reliable etc., we can let the query
> > > > > > > > > > > language deal with it. wdyt?
> > > > > > > > > > >
> > > > > > > > > > > On Tue, May 21, 2013 at 11:24 AM, Danushka Menikkumbura <danushka.menikkumb...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Saminda,
> > > > > > > > > > > >
> > > > > > > > > > > > I think the data container does not need to have a generic
> > > > > > > > > > > > format. We can have a base class that facilitates object
> > > > > > > > > > > > serialization/deserialization and let specific meta data
> > > > > > > > > > > > structures implement it as required. We get the Registry
> > > > > > > > > > > > API to serialize objects and save them in a meta data
> > > > > > > > > > > > table (with just two columns?) and to deserialize them as
> > > > > > > > > > > > they are loaded off the registry.
> > > > > > > > > > > >
> > > > > > > > > > > > Danushka
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, May 21, 2013 at 8:34 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > It has become more and more apparent that saving the
> > > > > > > > > > > > > data related to executing jobs from the GFac can be
> > > > > > > > > > > > > useful for many reasons, such as:
> > > > > > > > > > > > >
> > > > > > > > > > > > > debugging
> > > > > > > > > > > > > retrying
> > > > > > > > > > > > > to make smart decisions on reliability/cost etc.
> > > > > > > > > > > > > statistical analysis
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thus we thought of saving the data related to GFac jobs
> > > > > > > > > > > > > in the registry in order to facilitate features such as
> > > > > > > > > > > > > the above in the future.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, a GFac job is potentially any sort of computing
> > > > > > > > > > > > > resource access (GRAM/UNICORE/EC2 etc.). Therefore we
> > > > > > > > > > > > > need to come up with a generalized data structure that
> > > > > > > > > > > > > can hold the data of any type of resource. Following are
> > > > > > > > > > > > > the suggested data to save for a single GFac job
> > > > > > > > > > > > > execution:
> > > > > > > > > > > > >
> > > > > > > > > > > > > *experiment id, workflow instance id, node id* - pinpoint the node execution
> > > > > > > > > > > > > *service, host, application description ids* - pinpoint the descriptors responsible
> > > > > > > > > > > > > *local job id* - the unique job id retrieved/generated per execution [PRIMARY KEY]
> > > > > > > > > > > > > *job data* - data related to executing the job (e.g. the rsl in GRAM)
> > > > > > > > > > > > > *submitted, completed time*
> > > > > > > > > > > > > *completed status* - whether the job was successful or ran into errors etc.
> > > > > > > > > > > > > *metadata* - custom field to add anything the user wants
> > > > > > > > > > > > >
> > > > > > > > > > > > > Your feedback is most welcome. The API related changes
> > > > > > > > > > > > > will also be discussed once we have a proper data
> > > > > > > > > > > > > structure. We are hoping to implement this within the
> > > > > > > > > > > > > next few days.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Saminda
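
For reference, the fields proposed in the original message at the bottom of this thread could be captured roughly as follows. This is a minimal sketch under stated assumptions, not the actual registry schema: the class names JobRecord and JobRecordQueries, the field names and the attemptsForNode helper are all made up for illustration. The status values are the ones from the GFacJobStatus enum quoted in the thread. The helper shows the "all attempts at executing a particular node" lookup that keeping the ids as separate fields is meant to make easy.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Status values copied from the GFacJobStatus enum quoted in this thread.
enum GFacJobStatus {
    SUBMITTED, EXECUTING, CANCELLED, PAUSED, WAITING_FOR_DATA, FAILED, FINISHED, UNKNOWN
}

// Hypothetical holder for the fields proposed for one GFac job execution.
class JobRecord {
    final String localJobId;          // unique per execution (primary key)
    final String experimentId;        // these three ids pinpoint the node execution
    final String workflowInstanceId;
    final String nodeId;
    String serviceDescriptionId;      // these three ids pinpoint the descriptors
    String hostDescriptionId;
    String applicationDescriptionId;
    String jobData;                   // provider-specific data, e.g. the RSL in GRAM
    Date submittedTime;
    Date completedTime;
    GFacJobStatus status = GFacJobStatus.UNKNOWN;
    String metadata;                  // free-form custom field

    JobRecord(String localJobId, String experimentId,
              String workflowInstanceId, String nodeId) {
        this.localJobId = localJobId;
        this.experimentId = experimentId;
        this.workflowInstanceId = workflowInstanceId;
        this.nodeId = nodeId;
    }
}

class JobRecordQueries {
    // Returns every recorded attempt at executing one node of one workflow instance.
    static List<JobRecord> attemptsForNode(List<JobRecord> all, String experimentId,
                                           String workflowInstanceId, String nodeId) {
        List<JobRecord> matches = new ArrayList<>();
        for (JobRecord r : all) {
            if (r.experimentId.equals(experimentId)
                    && r.workflowInstanceId.equals(workflowInstanceId)
                    && r.nodeId.equals(nodeId)) {
                matches.add(r);
            }
        }
        return matches;
    }
}
```

In a real registry the same lookup would be a query on the three id columns, while the provider-specific job data stays an opaque serialized value, as suggested by Danushka.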