Re: Persisting GFac job data

Saminda Wijeratne Tue, 21 May 2013 13:40:28 -0700

On Tue, May 21, 2013 at 4:36 PM, Amila Jayasekara
<thejaka.am...@gmail.com> wrote:


> On Tue, May 21, 2013 at 3:51 PM, Saminda Wijeratne <samin...@gmail.com>
> wrote:
>
> > Thanks for the feedback, Amila. A few comments inline.
> >
> >
> > On Tue, May 21, 2013 at 12:29 PM, Amila Jayasekara
> > <thejaka.am...@gmail.com> wrote:
> >
> > > Hi Saminda,
> > >
> > > Great suggestion. Also +1 for Danushka's proposal to have
> > > serialized/deserialized data.
> > > A few suggestions:
> > > 1. In addition to successful/error statuses we need other statuses for
> > > nodes and workflows.
> > > E.g.:
> > >    node - started, submitted, in-progress, failed, successful etc ...
> > >
> > Sorry if I was too vague. Yes, we have more fine-grained statuses for
> > workflow and node [1]. We will have a much finer level of granularity for
> > a GFacJob status.
> >     public static enum GFacJobStatus {
> >         SUBMITTED, // job is submitted, possibly waiting to start executing
> >         EXECUTING, // submitted job is being executed
> >         CANCELLED, // job was cancelled
> >         PAUSED, // job was paused
> >         WAITING_FOR_DATA, // job is waiting for data to continue executing
> >         FAILED, // error occurred while job was executing and the job stopped
> >         FINISHED, // job completed successfully
> >         UNKNOWN // unknown status; look up the metadata for more details
> >     }
> >
> >
> > > 2. This data will be useful in implementing FT and Load Balancing in each
> > > component. Some time back we had discussions to make GFac stateless. So
> > > who is going to populate this data structure and persist it?
> > >
> > That is a very good question... :). This summer is going to be a long
> > one... ;)
> >
>
> What I meant is which component is doing the persistence (GFac or WF
> Interpreter)? Not the actual person who is going to implement it :).
>
hih hih....
Well, it's going to be whichever provider is responsible for managing the job
lifecycle. For example, GRAMProvider should be responsible for recording all
the data relating to the GRAM jobs it's working with.
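
To make the idea concrete, here is a minimal sketch of a provider recording each
lifecycle transition of a job it manages. Only the GFacJobStatus values come from
this thread; JobDataStore, InMemoryJobDataStore and SketchProvider are hypothetical
names made up for illustration and are not the Airavata Registry or GFac API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: every type here except the GFacJobStatus values is hypothetical.
class ProviderPersistenceSketch {

    // Status values as listed earlier in this thread.
    enum GFacJobStatus {
        SUBMITTED, EXECUTING, CANCELLED, PAUSED, WAITING_FOR_DATA, FAILED, FINISHED, UNKNOWN
    }

    // Hypothetical persistence interface; stands in for whatever the registry exposes.
    interface JobDataStore {
        void recordStatus(String localJobId, GFacJobStatus status);
        List<GFacJobStatus> statusHistory(String localJobId);
    }

    // In-memory stand-in so the sketch runs without a database.
    static class InMemoryJobDataStore implements JobDataStore {
        private final Map<String, List<GFacJobStatus>> history = new LinkedHashMap<>();

        @Override
        public void recordStatus(String localJobId, GFacJobStatus status) {
            history.computeIfAbsent(localJobId, id -> new ArrayList<>()).add(status);
        }

        @Override
        public List<GFacJobStatus> statusHistory(String localJobId) {
            // Return a copy so callers cannot modify the stored history.
            return new ArrayList<>(history.getOrDefault(localJobId, new ArrayList<>()));
        }
    }

    // Hypothetical provider: it owns the job lifecycle, so it records each transition.
    static class SketchProvider {
        private final JobDataStore store;

        SketchProvider(JobDataStore store) {
            this.store = store;
        }

        // Simulates a job run; a real provider would react to events from the resource.
        void run(String localJobId, boolean succeeds) {
            store.recordStatus(localJobId, GFacJobStatus.SUBMITTED);
            store.recordStatus(localJobId, GFacJobStatus.EXECUTING);
            store.recordStatus(localJobId,
                    succeeds ? GFacJobStatus.FINISHED : GFacJobStatus.FAILED);
        }
    }

    public static void main(String[] args) {
        JobDataStore store = new InMemoryJobDataStore();
        new SketchProvider(store).run("job-1", true);
        System.out.println(store.statusHistory("job-1"));
    }
}
```

With this shape GFac itself can stay stateless: the provider only talks to the
store, and anything that wants to retry or debug reads the history back from it.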

>
>
> >
> > [1] https://svn.apache.org/repos/asf/airavata/trunk/modules/workflow-model/workflow-model-core/src/main/java/org/apache/airavata/workflow/model/graph/Node.java
> >
> > >
> > > Thanks
> > > Amila
> > >
> > >
> > > On Tue, May 21, 2013 at 11:39 AM, Saminda Wijeratne
> > > <samin...@gmail.com> wrote:
> > >
> > > > That is an excellent idea. We can have the job data field be the
> > > > designated GFac job serialized data. Whatever the GFacProvider is, it
> > > > should adhere to it.
> > > >
> > > > I'm still inclined to keep the rest of the fields for ease of querying
> > > > the required data. For example, if we wanted all attempts at executing
> > > > a particular node of a workflow, or if we wanted to know which
> > > > application descriptions are faster in execution or more reliable
> > > > etc., we can let the query language deal with it. wdyt?
> > > >
> > > >
> > > > On Tue, May 21, 2013 at 11:24 AM, Danushka Menikkumbura <
> > > > danushka.menikkumb...@gmail.com> wrote:
> > > >
> > > > > Saminda,
> > > > >
> > > > > I think the data container does not need to have a generic format.
> > > > > We can have a base class that facilitates object
> > > > > serialization/deserialization and let specific metadata structures
> > > > > implement it as required. We get the Registry API to serialize
> > > > > objects and save them in a metadata table (with just two columns?)
> > > > > and to deserialize them as they are loaded off the registry.
> > > > >
> > > > > Danushka
> > > > >
> > > > >
> > > > > On Tue, May 21, 2013 at 8:34 PM, Saminda Wijeratne
> > > > > <samin...@gmail.com> wrote:
> > > > >
> > > > > > It has become more and more apparent that saving the data related
> > > > > > to executing jobs from GFac can be useful for many reasons, such
> > > > > > as:
> > > > > >
> > > > > > debugging
> > > > > > retrying
> > > > > > to make smart decisions on reliability/cost etc.
> > > > > > statistical analysis
> > > > > >
> > > > > > Thus we thought of saving the data related to GFac jobs in the
> > > > > > registry in order to facilitate features such as the above in the
> > > > > > future.
> > > > > >
> > > > > > However, a GFac job is potentially any sort of computing resource
> > > > > > access (GRAM/UNICORE/EC2 etc.). Therefore we need to come up with a
> > > > > > generalized data structure that can hold the data of any type of
> > > > > > resource. Following are the suggested data to save for a single
> > > > > > GFac job execution:
> > > > > >
> > > > > > *experiment id, workflow instance id, node id* - pinpoint the node
> > > > > > execution
> > > > > > *service, host, application description ids* - pinpoint the
> > > > > > descriptors responsible
> > > > > > *local job id* - the unique job id retrieved/generated per
> > > > > > execution [PRIMARY KEY]
> > > > > > *job data* - data related to executing the job (e.g. the RSL in GRAM)
> > > > > > *submitted, completed time*
> > > > > > *completed status* - whether the job was successful or ran into
> > > > > > errors etc.
> > > > > > *metadata* - custom field to add anything the user wants
> > > > > >
> > > > > > Your feedback is most welcome. The API-related changes will also
> > > > > > be discussed once we have a proper data structure. We are hoping to
> > > > > > implement this within the next few days.
> > > > > >
> > > > > > Thanks,
> > > > > > Saminda
> > > > > >
> > > > >
> > > >
> > >
> >
>
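
For reference, the field list from the original proposal and the kind of queries
mentioned in the thread could be sketched roughly as below. This is only an
illustration under assumptions: the class, field and method names are hypothetical,
times are assumed to be epoch milliseconds, and a plain in-memory list stands in
for the registry and its query language.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: names are hypothetical, derived from the field list in this thread.
class GFacJobRecordSketch {

    static class GFacJobRecord {
        final String localJobId;          // unique per execution (primary key)
        final String experimentId;        // these three pinpoint the node execution
        final String workflowInstanceId;
        final String nodeId;
        final String applicationDescId;   // one of the descriptor ids
        String serviceDescId;
        String hostDescId;
        String jobData;                   // provider-specific serialized data, e.g. the RSL in GRAM
        long submittedTime;               // assumed epoch millis
        long completedTime;               // assumed epoch millis
        String completedStatus;
        String metadata;                  // free-form custom field

        GFacJobRecord(String localJobId, String experimentId, String workflowInstanceId,
                      String nodeId, String applicationDescId) {
            this.localJobId = localJobId;
            this.experimentId = experimentId;
            this.workflowInstanceId = workflowInstanceId;
            this.nodeId = nodeId;
            this.applicationDescId = applicationDescId;
        }
    }

    // All attempts at executing one particular node of a workflow.
    static List<GFacJobRecord> attemptsForNode(List<GFacJobRecord> all, String experimentId,
                                               String workflowInstanceId, String nodeId) {
        return all.stream()
                .filter(r -> r.experimentId.equals(experimentId)
                        && r.workflowInstanceId.equals(workflowInstanceId)
                        && r.nodeId.equals(nodeId))
                .collect(Collectors.toList());
    }

    // Average run time for one application description; NaN if it has no records.
    static double averageDurationMillis(List<GFacJobRecord> all, String applicationDescId) {
        return all.stream()
                .filter(r -> r.applicationDescId.equals(applicationDescId))
                .mapToLong(r -> r.completedTime - r.submittedTime)
                .average()
                .orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<GFacJobRecord> all = new ArrayList<>();
        GFacJobRecord first = new GFacJobRecord("job-1", "exp-1", "wf-1", "node-A", "app-X");
        first.submittedTime = 1000L;
        first.completedTime = 4000L;
        all.add(first);
        System.out.println(attemptsForNode(all, "exp-1", "wf-1", "node-A").size());
        System.out.println(averageDurationMillis(all, "app-X"));
    }
}
```

Keeping the identifying fields as separate columns is what makes such queries
possible, while the provider-specific part stays opaque in the job data field.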
