On Tue, Jul 16, 2013 at 1:11 PM, Lahiru Gunathilake <[email protected]> wrote:

> Hi Amila,
>
> I think at this level we can live without having interpreter level job
> canceling, because if we cancel a job in some other thread interpreter can
> pick it up and mark that node as cancelled and with current interpreter
> logic, after the first job failure workflow is failing. So logically
> before we think of interpreter level job canceling we need to do more work
> in our interpreter logic to make use of that feature.
>

I don't quite understand what you are saying above. The GFac job cancellation is needed for 3 reasons in my opinion.

1. If a workflow is cancelled
2. If the WF interpreter decides it should cancel execution of a node based on a feedback loop (I know we don't have this right now)
3. If a job needs to be cancelled when using GFac for job submission only

1 and 2 should originate from the interpreter for sure. For consistency I think 3 should also come from the interpreter.

>
> For job canceling logic, we can pick the provider using the same logic we
> use in normal GFAC node execution and program against provider interface,
> so that right cancel method will get called.
>

Please look at the trunk code. This is already in place, though the implementation is restricted to Gram at the moment.

>
> Raman, what do you mean by user setting JobExecutionContext or security
> Context ? User doesn't have to set anything, we create it and set it in to
> JobExecutionContext, the same way as we do in GFacAPI, user just have to
> specify the nodeId, experimentId.
>
> Thanks
> Lahiru
>
>
> On Tue, Jul 16, 2013 at 10:23 PM, Amila Jayasekara <
> [email protected]> wrote:
>
>>
>> On Tue, Jul 16, 2013 at 12:30 PM, Saminda Wijeratne
>> <[email protected]> wrote:
>>
>>>
>>> On Tue, Jul 16, 2013 at 12:18 PM, Raminder Singh <
>>> [email protected]> wrote:
>>>
>>>> Thanks Amila for providing the details. Job cancel will be user action
>>>> called from API or Xbaya. I don't think it's necessarily always a workflow
>>>> interpreter operation. It will be useful if we provide an option in API
>>>> to cancel jobs. I have a few other questions
>>>>
>>>> 1. We don't need to pass experimentid, workflowid, nodeid all the way
>>>> to gfac level. GFAC only needs jobid to create cancel request for the job.
>>>> According to me getting of job id needs to be done in API and only job id
>>>> needs to be passed to this level.
>>>>
>>> +1
>>> At the workflow interpreter level it should be "cancel node execution"
>>> "cancel workflow execution" "cancel experiment". The interpreter can
>>> translate the node id to the gfac job id and call cancel job in the gfac
>>> interface.
>>>
>>
>> Ok. Let's have a single method with job id to cancel jobs.
>>
>>>
>>>> 2. I looked into GramProvider code and did not like the dependency of
>>>> JobExecutionContext in these methods. I observed you are using it to get
>>>> security context. Isn't it more lightweight for the client to just set security
>>>> context?
>>>>
>>>
>> I prefer to keep JobExecutionContext as it is the medium communicating
>> with the GFac interface. Further if we need to pass any additional parameters
>> we can use job execution context.
>>
>> I assume Raman or Saminda will help implementing job cancellation at
>> interpreter level and also at API level.
>>
>> Thanks
>> Amila
>>
>>>
>>>> Please let me know if you have any questions.
>>>>
>>>> Thanks
>>>> Raminder
>>>>
>>>> On Jul 16, 2013, at 11:11 AM, Amila Jayasekara <[email protected]>
>>>> wrote:
>>>>
>>>> > Hi All,
>>>> >
>>>> > I have added following methods to GFacProvider interface to do job
>>>> > cancellation. But we need to figure out from where these methods should be
>>>> > called. As I feel these methods should get triggered from Workflow
>>>> > Interpreter.
>>>> >
>>>> > I would like to use this mail thread to discuss how we can invoke
>>>> > cancellation methods and how we can expose job cancellation at API.
>>>> >
>>>> > Please give feedback.
>>>> >
>>>> > Thanks
>>>> > Amila
>>>> >
>>>> >
>>>> > /**
>>>> >  * Cancels all jobs relevant to an experiment.
>>>> >  * @param experimentId The experiment id.
>>>> >  * @param jobExecutionContext The job execution context, contains runtime information.
>>>> >  * @throws GFacException If an error occurred while cancelling the job.
>>>> >  */
>>>> > void cancelJob(String experimentId, JobExecutionContext jobExecutionContext) throws GFacException;
>>>> >
>>>> > /**
>>>> >  * Cancels all jobs relevant to a workflow in an experiment.
>>>> >  * @param experimentId The experiment id.
>>>> >  * @param workflowId The workflow id.
>>>> >  * @param jobExecutionContext The job execution context, contains runtime information.
>>>> >  * @throws GFacException If an error occurred while cancelling the job.
>>>> >  */
>>>> > void cancelJob(String experimentId, String workflowId,
>>>> >                JobExecutionContext jobExecutionContext) throws GFacException;
>>>> >
>>>> > /**
>>>> >  * Cancels the job for a given workflow id and node id in an experiment.
>>>> >  * @param experimentId The experiment id.
>>>> >  * @param workflowId The workflow id.
>>>> >  * @param nodeId The node id.
>>>> >  * @param jobExecutionContext The job execution context relevant to cancel job operation.
>>>> >  * @throws GFacException If an error occurred while cancelling the job.
>>>> >  */
>>>> > void cancelJob(String experimentId, String workflowId, String nodeId,
>>>> >                JobExecutionContext jobExecutionContext) throws GFacException;
>>>>
>>>
>>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>
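
For reference, the layering the thread converges on (the interpreter exposes "cancel node execution" and "cancel workflow execution", translates a node id to the GFac job id, and calls a single job-id based cancel method that still receives a JobExecutionContext) could be sketched roughly as below. This is only an illustration under assumptions: GFacException and JobExecutionContext are empty stand-ins for the real classes, and CancellableProvider, InterpreterCancelSketch, recordSubmission and the node-to-job map are hypothetical names that do not come from the trunk code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the real exception type named in the thread (body is an assumption).
class GFacException extends Exception {
    GFacException(String message) { super(message); }
}

// Stand-in for the real context; in the thread it carries the security context
// and other runtime information.
class JobExecutionContext { }

// Hypothetical reduced provider view: the single job-id based cancel method
// proposed in the thread, keeping JobExecutionContext as a parameter.
interface CancellableProvider {
    void cancelJob(String jobId, JobExecutionContext jobExecutionContext) throws GFacException;
}

// Hypothetical interpreter-side helper that maps node ids to GFac job ids
// and delegates cancellation to the provider.
class InterpreterCancelSketch {
    private final Map<String, String> jobIdByNodeId = new LinkedHashMap<>();
    private final CancellableProvider provider;

    InterpreterCancelSketch(CancellableProvider provider) {
        this.provider = provider;
    }

    // Remember which GFac job was submitted for a workflow node.
    void recordSubmission(String nodeId, String jobId) {
        jobIdByNodeId.put(nodeId, jobId);
    }

    // "cancel node execution": translate the node id to a job id, then cancel that job.
    void cancelNodeExecution(String nodeId, JobExecutionContext context) throws GFacException {
        String jobId = jobIdByNodeId.get(nodeId);
        if (jobId == null) {
            throw new GFacException("No job recorded for node " + nodeId);
        }
        provider.cancelJob(jobId, context);
    }

    // "cancel workflow execution": cancel every recorded job, in submission order.
    void cancelWorkflowExecution(JobExecutionContext context) throws GFacException {
        List<String> jobIds = new ArrayList<>(jobIdByNodeId.values());
        for (String jobId : jobIds) {
            provider.cancelJob(jobId, context);
        }
    }
}
```

With this shape GFac itself never sees experiment, workflow or node ids; whether the id translation belongs in the interpreter or in the API layer is the point still open in the discussion above.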
