The idea is not to do job cancellation from monitoring but to keep a single state. I have dealt with job cancel as an asynchronous call and with the difficulty of tracking job status: resources can return a job as complete even when the user canceled it. I recommend modifying the monitor thread to stop listening for a canceled job and to update the job status without relying on the resource status.

Thanks
Raminder
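A minimal sketch of the recommendation above, assuming a monitor that keeps its own status table. `JobMonitor`, `JobStatus`, and all method names are hypothetical and are not Airavata APIs; the point is only that a cancel is recorded locally and that later resource reports for that job are ignored.

```python
from enum import Enum


class JobStatus(Enum):
    SUBMITTED = "SUBMITTED"
    ACTIVE = "ACTIVE"
    COMPLETE = "COMPLETE"
    CANCELED = "CANCELED"


class JobMonitor:
    """Hypothetical monitor that keeps one authoritative status per job."""

    def __init__(self):
        self.statuses = {}    # job_id -> JobStatus recorded by the monitor
        self.watched = set()  # job_ids the monitor still listens for

    def watch(self, job_id):
        self.statuses[job_id] = JobStatus.SUBMITTED
        self.watched.add(job_id)

    def cancel(self, job_id):
        # Record the cancellation ourselves and stop listening, instead of
        # waiting for the resource to confirm it.
        self.statuses[job_id] = JobStatus.CANCELED
        self.watched.discard(job_id)

    def on_resource_status(self, job_id, reported):
        # Ignore reports for jobs we no longer watch, e.g. a resource that
        # reports COMPLETE for a job the user canceled.
        if job_id not in self.watched:
            return self.statuses.get(job_id)
        self.statuses[job_id] = reported
        if reported is JobStatus.COMPLETE:
            self.watched.discard(job_id)
        return reported
```

With this shape, a late COMPLETE from the resource cannot overwrite CANCELED, because the job has already left the watched set.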
On Apr 21, 2014, at 3:23 PM, Saminda Wijeratne <samin...@gmail.com> wrote:

> On Mon, Apr 21, 2014 at 11:51 AM, Eroma Abeysinghe
> <eroma.abeysin...@gmail.com> wrote:
> Hi,
>
> If you have bottom up we will not be able to cancel unless there is a job
> available for that experiment, right?
>
> Hmm... that's true... when a Job is not available the Task would be the
> bottom, doing the preprocessing for the job submission or the post-processing
> after job completion.
>
> Also a few questions:
> 1. What do we really mean by canceling? Is it just a status update?
>
> Effectively a status update. But it will signal to interested parties what
> is happening, e.g. the gateway will know that it's in the process of
> canceling and GFacProvider knows it needs to perform the actual cancellation
> if not already done.
>
> OR
> 2. Are we going to stop all file transfers and delete any data file/file path
> existing in Airavata for that experiment/tasks/job?
>
> I think that's what Raminder meant by the cleanup operations.
>
> 3. And also, are we considering both single submissions and workflows, or is
> it just single submissions?
>
> Personally, right now I'm focusing only on single submission, at least until
> I get my bearings on how this new design will play out with what we want.
> Workflows will come next.
>
> If we are going to consider canceling workflows then we need to extend
> cancelling to the multiple tasks and jobs an experiment would have.
>
> Yep
>
> 4. Also we need to define what experiments we can cancel. IMO we don't need
> to bother with COMPLETED, CANCELED, UNKNOWN, FAILED experiments and similar
> statuses in tasks and jobs.
>
> Yes, I'll add that validation.
>
> IMHO I also don't think the Job monitor should do any cancellations.
>
> Thank You,
> Best Regards,
> Eroma
>
> On Mon, Apr 21, 2014 at 2:37 PM, Saminda Wijeratne <samin...@gmail.com> wrote:
> May I finish setting up the framework for catching cancel requests?
> I'll finish implementing the cancel once we decide upon who will do what
> when cancelling a job.
>
> I just remembered the canceled notification would be handled by the status
> update mechanism we introduced last week. But this mechanism works only
> bottom up, i.e. Job status updates will trigger Task status updates and that
> will trigger Experiment status updates. Does it make sense to have the
> "canceling" status also progress likewise instead of top down (which I
> suggested in the first mail)?
>
> On Mon, Apr 21, 2014 at 11:11 AM, Lahiru Gunathilake <glah...@gmail.com>
> wrote:
> Hi Raman,
>
> On Mon, Apr 21, 2014 at 10:35 AM, Raminder Singh <raminderjsi...@gmail.com>
> wrote:
> Thanks for investigating the problem and working through a solution. This is
> really required for production gateways like Ultrascan.
>
> In the current architecture, where job submission (provider) and monitoring
> are separate, a job cancel request need not go to the GFac provider. The
> provider submits the jobs and hands over the job id to the orchestrator. The
> Orchestrator works with the job monitoring to maintain the job state. Now the
> cancel needs to be handled by the Orchestrator and Monitoring. That will
> change the course of action for the API to cancel a job.
>
> I don't think so. The Orchestrator can invoke GFac Provider level job
> cancellation and it should simply be reflected in the monitor when it tries
> to get the status of that job (once the monitor gets to know of it, it should
> stop monitoring that job), and without modifying the monitor everything
> should work. There is no need to touch the monitor.
>
> I think job cancellation should be a functionality of the GFac Provider and
> it should be similar to job submission, where you can do pre-processing and
> post-processing after the job cancellation operation.
>
> One important requirement to take care of is the cleanup task after the job
> is canceled, like updating the job status table and updating the status.
>
> Thanks
> Raminder
>
> On Apr 21, 2014, at 9:06 AM, Saminda Wijeratne <samin...@gmail.com> wrote:
>
>> Hi All,
>>
>> After looking at the current design and doing some trial and error I thought
>> of implementing the cancellation as follows.
>>
>> Cancellation of an experiment requested by a gateway requires the
>> cancellation request to go through several layers (Orchestrator > GFac >
>> GFac Provider).
>> Each layer is responsible for handling the cancellation relevant to that
>> layer (Orchestrator cancels the Experiment, GFac cancels the Task, GFac
>> Provider cancels the Job).
>> What I thought is, each layer will listen to cancellation requests made to
>> the layer above and perform its cancellation actions accordingly. (GFac will
>> see that the experiment has the status "canceling" for an experiment id and
>> it will perform cancellation of the tasks relevant to that experiment.)
>> Effectively the Orchestrator will be
>>   - updating the status of the experiment in the registry to "canceling"
>>   - publishing a message which will be caught by the GFac instance which
>>     handles its Tasks.
>> GFac will perform the same and the correct GFac Provider instance will catch
>> the message and perform the actual job cancellation.
>> Once the job cancellation is done the statuses at each layer will be updated
>> (to "canceled") in similar fashion.
>> We allow the API call of cancellation to be asynchronous.
>> I'm hoping to use the MonitorPublisher implemented by Lahiru to publish the
>> messages.
>> wdyt?
>>
>> Thanks,
>>
>> Saminda
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>
> --
> Thank You,
> Best Regards,
> Eroma
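The layered flow proposed in the first mail of the thread could look roughly like the sketch below. Everything in it is an assumption made for illustration: `Bus` is a tiny in-process stand-in and not the MonitorPublisher mentioned in the thread, the topic names and the `ids` dictionary are invented, and the calls run synchronously for brevity although the thread wants the API call to be asynchronous. It covers single submission only (one task and at most one job per experiment) and includes the validation that skips experiments already in a final state.

```python
from collections import defaultdict

# Statuses the thread says need no cancellation.
TERMINAL = {"COMPLETED", "CANCELED", "FAILED", "UNKNOWN"}


class Bus:
    """Hypothetical in-process publisher; handlers run synchronously."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, ids):
        for handler in list(self._handlers[topic]):
            handler(ids)


def wire(bus, status):
    """Register one handler per layer; `status` maps an id to its status."""

    def gfac_cancels_task(ids):
        # GFac reacts to the experiment entering "canceling".
        status[ids["task"]] = "CANCELING"
        bus.publish("task.canceling", ids)

    def provider_cancels_job(ids):
        # GFac Provider performs the actual job cancellation, if a job exists.
        job = ids.get("job")
        if job is not None:
            status[job] = "CANCELED"  # real code would call the resource here
        bus.publish("job.canceled", ids)

    def task_canceled(ids):
        # Bottom-up: the job update triggers the task update.
        status[ids["task"]] = "CANCELED"
        bus.publish("task.canceled", ids)

    def experiment_canceled(ids):
        # Bottom-up: the task update triggers the experiment update.
        status[ids["experiment"]] = "CANCELED"

    bus.subscribe("experiment.canceling", gfac_cancels_task)
    bus.subscribe("task.canceling", provider_cancels_job)
    bus.subscribe("job.canceled", task_canceled)
    bus.subscribe("task.canceled", experiment_canceled)


def request_cancel(bus, status, ids):
    """Orchestrator entry point; returns False if nothing needs canceling."""
    if status[ids["experiment"]] in TERMINAL:
        return False
    status[ids["experiment"]] = "CANCELING"
    bus.publish("experiment.canceling", ids)
    return True
```

Here "canceling" travels top down as a request, while "canceled" travels bottom up through the existing status-update direction. An experiment whose job has not been created yet still reaches "canceled", because the provider handler skips the job step and publishes anyway.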