Alena Prokharchyk created CLOUDSTACK-2680:
---------------------------------------------

             Summary: Async job expunge thread expunges not only inactive jobs, 
but also the jobs that are currently being processed
                 Key: CLOUDSTACK-2680
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2680
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.1.0
            Reporter: Alena Prokharchyk
            Assignee: Alena Prokharchyk
             Fix For: 4.2.0


The Async Job Expunge thread, which expunges jobs that have been in the 
async_job table for more than "job.expire.minutes", expunges not only inactive 
(waiting) jobs but also jobs that are currently being processed. This affects 
all CloudStack jobs. It wasn't caught before because the default expire 
interval is 1 day, and a job would time out on the backend well before that 
(30 mins is the default timeout). 

Here is what happens in the snapshot case: 

1) Set "concurrent.snapshots.threshold.perhost"=1 and "job.expire.minutes"=15 
2) The first createSnapshot API call was executed at time "X". Async job1 was 
created. As there were no other snapshot jobs, the command was sent to the 
backend for execution. 
3) The second createSnapshot API call was executed at time "X + 30 seconds". 
Async job2 was created. Job2 sat in the queue, waiting for job1 to finish. 
4) Job1 didn't return within 15 mins, so the AsyncJobManager considered it 
expired and removed it from the queue (although it was still being 
processed). 
5) The background process that checks the sync status for job2 (it runs every 
10 seconds) found that nothing was blocking job2 any more, and sent it to the 
backend while job1 was still running there. 
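
The sequence above can be replayed with a small simulation. This is only a 
sketch under stated assumptions: the Job class, the status labels 
("waiting", "in_progress") and both helper functions are hypothetical and do 
not correspond to actual CloudStack classes or to the async_job schema. Times 
are in minutes after "X".

```python
from dataclasses import dataclass

EXPIRE_MINUTES = 15       # "job.expire.minutes" from step 1
THRESHOLD_PER_HOST = 1    # "concurrent.snapshots.threshold.perhost" from step 1


@dataclass
class Job:
    name: str
    created_min: float    # creation time, in minutes after "X"
    status: str           # hypothetical labels: "waiting" or "in_progress"


def expunge_by_age(queue, now_min):
    """Buggy behaviour: drop every job older than the expiry, whatever its status."""
    return [j for j in queue if now_min - j.created_min <= EXPIRE_MINUTES]


def dispatch_waiting(queue):
    """Send waiting jobs to the backend while the queue shows a free slot."""
    dispatched = []
    for job in queue:
        running = sum(1 for j in queue if j.status == "in_progress")
        if job.status == "waiting" and running < THRESHOLD_PER_HOST:
            job.status = "in_progress"
            dispatched.append(job.name)
    return dispatched


# Steps 2 and 3: job1 is running, job2 is queued 30 seconds later.
queue = [Job("job1", 0.0, "in_progress"), Job("job2", 0.5, "waiting")]
first_check = dispatch_waiting(queue)         # job1 still blocks job2

# Step 4: at X + 15.25 min job1 is older than the expiry and is removed,
# even though it is still in progress on the backend.
queue = expunge_by_age(queue, now_min=15.25)

# Step 5: the next sync check sees nothing blocking job2.
second_check = dispatch_waiting(queue)
print(first_check, second_check)              # [] ['job2']
```

In this model the queue no longer knows about job1 after step 4, so the 
threshold check passes for job2 although the backend is still busy with job1.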


The recommended fix would be: expire/expunge only inactive and already 
completed jobs. Don't touch the jobs that are currently being processed.
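
A minimal sketch of that rule, assuming each job carries a creation time and 
a status. The Job class, the status labels and expunge_expired() are 
hypothetical names used for illustration only; this is not the actual 
AsyncJobManager code.

```python
from dataclasses import dataclass

EXPIRE_MINUTES = 15   # "job.expire.minutes" as set in the scenario above


@dataclass
class Job:
    name: str
    created_min: float   # creation time, in minutes
    status: str          # hypothetical: "waiting", "in_progress", "completed"


def expunge_expired(jobs, now_min, expire_minutes=EXPIRE_MINUTES):
    """Return the jobs that survive an expunge pass.

    A job is removed only if it is older than the expiry AND it is not
    currently being processed. Fresh jobs and in-progress jobs are kept.
    """
    kept = []
    for job in jobs:
        expired = now_min - job.created_min > expire_minutes
        if expired and job.status != "in_progress":
            continue          # inactive or completed, and too old: expunge
        kept.append(job)
    return kept


jobs = [
    Job("running", 0.0, "in_progress"),   # old, but still being processed
    Job("stale", 0.0, "waiting"),         # old and inactive
    Job("done", 0.0, "completed"),        # old and completed
    Job("fresh", 10.0, "waiting"),        # not yet expired
]
survivors = [j.name for j in expunge_expired(jobs, now_min=20.0)]
print(survivors)                          # ['running', 'fresh']
```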
