Change these global parameters to a very small value, say 1 minute:

job.cancel.threshold.minutes    Time (in minutes) after which an async job is
                                forcibly cancelled if it has been in process too long
job.expire.minutes              Time (in minutes) for async jobs to be kept in the system

Then restart your management server and wait for some time so that the
stuck async jobs expire. After that, change these values back to their
original values and restart the management server again.
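
The sequence above can be sketched in Python as follows. This is only an illustration: it builds the parameter sets for CloudStack's updateConfiguration API command and does not sign or send any request. The function names and the "original" values shown (60 and 1440) are placeholders I made up for the example; note down your real values from Global Settings before changing anything.

```python
# Sketch only: builds parameter sets for the updateConfiguration API command.
# Nothing here signs or sends a request; helper names are hypothetical.

SETTINGS = ("job.cancel.threshold.minutes", "job.expire.minutes")


def update_commands(values):
    """Return one updateConfiguration parameter dict per setting."""
    return [
        {"command": "updateConfiguration", "name": name, "value": str(values[name])}
        for name in SETTINGS
    ]


def build_plan(original_values, temporary_minutes=1):
    """Return (lower, restore) command lists.

    Apply `lower`, restart the management server, wait for the stuck jobs
    to expire, then apply `restore` and restart the management server again.
    """
    lowered = {name: temporary_minutes for name in SETTINGS}
    return update_commands(lowered), update_commands(original_values)


# Placeholder originals (assumption): replace with the values you recorded.
lower, restore = build_plan({
    "job.cancel.threshold.minutes": 60,
    "job.expire.minutes": 1440,
})

for params in lower + restore:
    print(params)
```

The same two changes can of course be made by hand in the UI under Global Settings; the point is only that the original values need to be recorded first so they can be restored.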

Hope this will help 

Thanks
Shweta



-----Original Message-----
From: Andrei Mikhailovsky [mailto:and...@arhont.com] 
Sent: Monday, June 09, 2014 6:53 PM
To: users@cloudstack.apache.org
Subject: deleting or cancelling broken ACS jobs

Hello guys, 

I was wondering if anyone has come across an issue where ACS gets stuck on 
several jobs and keeps trying to do them over and over again? 

I came across an issue a few days ago. For some reason I have about 5 or 6 
XenServer cluster jobs which have gone crazy. These jobs are of different 
kinds, such as template creation, VM start and enable host maintenance. 
They keep repeating in the logs about 20-30 times a second, causing the 
logs to overfill. I get about 20GB of management server logs each day and it 
seems that these stuck jobs are causing the overflow. I am also not able to 
perform any activity on the XenServer cluster which has those stuck jobs. I am 
unable to start or stop jobs or pretty much do anything with it. 

I've tried restarting both the management server and the xenserver hosts, but 
that didn't help. After a short while following a restart the same thing starts 
to happen. 

Is there a way for ACS to cancel / remove these jobs? I've looked at the 
async_job and async_job_view db tables and I can see 28 entries there amongst 
which are these stuck jobs gone crazy. Is it safe for me to simply remove them 
from the database and restart the management server? Are there any other db 
tables that I should look at? 

Many thanks 

Andrei 



