[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224522#comment-14224522
 ] 

Rohit Yadav commented on CLOUDSTACK-7749:
-----------------------------------------

[~minchen07] could you help backport this to 4.3, or advise? Thanks.

> AsyncJob GC thread cannot purge queue items that have been blocking for too 
> long if exception is thrown in expunging some unfinished or completed old 
> jobs, this will make some future jobs stuck.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-7749
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-7749
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Management Server
>    Affects Versions: 4.3.0
>            Reporter: Min Chen
>            Assignee: Min Chen
>            Priority: Critical
>             Fix For: 4.5.0
>
>
> AsyncJobManager has a GC thread that cleans up unfinished jobs and completed 
> jobs that are too old. The same thread also forcefully cancels blocking queue 
> items that have been waiting for too long. Currently, if an exception is 
> thrown while expunging one job, for example the one below:
> 2014-10-14 17:57:26,347 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Expunging unfinished job AsyncJobVO
> {id:18443, userId: 60, accountId: 69, instanceType: null, instanceId: null, 
> cmd: com.cloud.vm.VmWorkStart, cmdInfo: 
> rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIACkoABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljb
>  
> HVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL
>  
> 01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAABFAAAAAAAAADwAAAAAAAACdXQAGVZpcnR1YWxNYWNoaW5lTWFuY
>  
> WdlckltcGwAAAAAAAAAAHBwcHBwcHBzcgARamF2YS51dGlsLkhhc2hNYXAFB9rBwxZg0QMAAkYACmxvYWRGYWN0b3JJAAl0aHJlc2hvbGR4cD9AAAAAAAAMdwgAAAAQAAAAAXQAClZtUGFzc3dvcmR0ABxyTzBBQlhRQURuTmhkbVZrWDNCaGMzTjNiM0preHA,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, 
> result: null, initMsid: 244536014864905, completeMsid: null, lastUpdated: 
> null, lastPolled: null, created: Wed Aug 20 18:14:13 EDT 2014}
> 2014-10-14 17:57:26,350 DEBUG [c.c.u.d.T.Transaction] 
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Rolling back the transaction: Time = 2 
> Name = AsyncJobMgr-Heartbeat-1; called by 
> -TransactionLegacy.rollback:896-TransactionLegacy.removeUpTo:839-TransactionLegacy.close:663-TransactionContextInterceptor.invoke:35-ReflectiveMethodInvocation.proceed:161-ExposeInvocationInterceptor.invoke:91-ReflectiveMethodInvocation.proceed:172-JdkDynamicAopProxy.invoke:204-$Proxy151.expunge:-1-AsyncJobManagerImpl$8.doInTransactionWithoutResult:802-TransactionCallbackNoReturn.doInTransaction:25-Transaction$2.doInTransaction:49
> 2014-10-14 17:57:26,368 ERROR [o.a.c.f.j.i.AsyncJobManagerImpl] 
> (AsyncJobMgr-Heartbeat-1:ctx-67ae4177) Unexpected exception when trying to 
> execute queue item,
> com.cloud.utils.exception.CloudRuntimeException: DB Exception on: 
> com.mysql.jdbc.JDBC4PreparedStatement@39fdfb52: DELETE FROM async_job WHERE 
> async_job.id= 18443
> at com.cloud.utils.db.GenericDaoBase.expunge(GenericDaoBase.java:1144)
> at sun.reflect.GeneratedMethodAccessor247.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:622)
> at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
> at com.cloud.utils.db.TransactionContextInterceptor.invoke(TransactionContextInterceptor.java:33)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
> at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
> at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
> at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
> at com.sun.proxy.$Proxy151.expunge(Unknown Source)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$8.doInTransactionWithoutResult(AsyncJobManagerImpl.java:802)
> at com.cloud.utils.db.TransactionCallbackNoReturn.doInTransaction(TransactionCallbackNoReturn.java:25)
> at com.cloud.utils.db.Transaction$2.doInTransaction(Transaction.java:49)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:37)
> at com.cloud.utils.db.Transaction.execute(Transaction.java:46)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl.expungeAsyncJob(AsyncJobManagerImpl.java:799)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.reallyRun(AsyncJobManagerImpl.java:762)
> at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$7.runInContext(AsyncJobManagerImpl.java:738)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:50)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
> at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
> at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:47)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:701)
> Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: 
> Cannot delete or update a parent row: a foreign key constraint fails 
> (`cloud`.`async_job_join_map`, CONSTRAINT `fk_async_job_join_map__join_job_id` 
> FOREIGN KEY (`join_job_id`) REFERENCES `async_job` (`id`))
> then none of the subsequent purge-queue-item actions get a chance to run. 
> This can leave later jobs stuck, because VM operations are serialized per VM.
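
For anyone looking at the backport, here is a minimal sketch of the failure mode described above and one way to contain it. It is not the CloudStack code and not necessarily how the 4.5.0 fix works: the class GcPassSketch, the methods expungeAll and gcPass, and job ids 18442 and 18444 are hypothetical names made up for illustration (18443 is the id from the log). The idea is simply that each expunge is guarded on its own, and the queue purge runs in a finally block so a failed DELETE cannot skip it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a GC pass; not the actual AsyncJobManagerImpl. */
public class GcPassSketch {

    /** Expunges each job independently and returns the ids whose expunge failed. */
    public static List<Long> expungeAll(List<Long> jobIds, Consumer<Long> expunge) {
        List<Long> failed = new ArrayList<>();
        for (Long id : jobIds) {
            try {
                expunge.accept(id);
            } catch (RuntimeException e) {
                // Record the failure and continue: one bad row must not abort the pass.
                failed.add(id);
            }
        }
        return failed;
    }

    /** One GC pass: expunge old jobs, then always purge stale queue items. */
    public static List<Long> gcPass(List<Long> jobIds,
                                    Consumer<Long> expunge,
                                    Runnable purgeStaleQueueItems) {
        try {
            return expungeAll(jobIds, expunge);
        } finally {
            // Runs even if something unexpected escapes the expunge step.
            purgeStaleQueueItems.run();
        }
    }

    public static void main(String[] args) {
        boolean[] purged = {false};
        List<Long> failed = gcPass(
            Arrays.asList(18442L, 18443L, 18444L),
            id -> {
                // Simulates the foreign key failure seen for job 18443.
                if (id == 18443L) {
                    throw new IllegalStateException("foreign key constraint fails");
                }
            },
            () -> purged[0] = true);
        System.out.println("failed=" + failed + " purged=" + purged[0]);
    }
}
```

In this sketch the simulated failure for job 18443 is reported in the returned list, the other jobs are still expunged, and the purge step still runs. Whether failed expunges should be retried, logged, or have their async_job_join_map rows cleaned up first is a separate decision that the sketch does not address.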



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)