[jira] Commented: (OFBIZ-3867) JobManager.poll() enters an endless loop when it can't get a connection

Bob Morley (JIRA) Tue, 20 Jul 2010 10:43:49 -0700

    [ 
https://issues.apache.org/jira/browse/OFBIZ-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890339#action_12890339
 ]


Bob Morley commented on OFBIZ-3867:
-----------------------------------

Naturally, we are willing to share our code if you would like.  Our trunk build 
currently runs on a single application server with about 200 tenants.  We 
have a few enhancements in the hopper on this; for example, as we bring 
application services online we need to distribute the jobs properly across the 
managers.  We have had a number of issues with the job manager.  One that we 
want to tackle: because the "job template" is a job_sandbox record like any 
other, it eventually gets purged.  For example, say you are running a job with 
an "HOURLY" temporal expression and you then decide you want to change that to 
run "NIGHTLY".  There is no clean way to do this other than updating every 
job_sandbox record that is either RUNNING or PENDING.

Another thought is that we moved the rescheduling of the next instance of a 
recurring job to the end of the job's run (rather than when the job is init'd), 
and we made sure to schedule it in the future to avoid overloading.  For 
example, if a database is offline for, say, a week and then comes back online, 
you really do not want "sendEmail" to be scheduled for every 5-minute slot 
that was missed over that week.
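
To illustrate the "schedule it in the future" idea, here is a minimal sketch. 
It is not our actual code and not the OFBiz API: the class and method names are 
hypothetical, and it assumes a fixed interval rather than a real temporal 
expression.  Given the last scheduled time and the current time, it skips every 
missed slot and returns the first one that is still ahead of now.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, for illustration only; not part of OFBiz.
public class NextRunSketch {

    /**
     * Returns the first occurrence after 'now' on the grid
     * lastScheduled + k * interval (k >= 1), skipping any missed slots.
     * Assumes a fixed interval of at least one millisecond.
     */
    static Instant nextFutureRun(Instant lastScheduled, Duration interval, Instant now) {
        long intervalMillis = interval.toMillis();
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be at least 1 ms");
        }
        long elapsedMillis = Duration.between(lastScheduled, now).toMillis();
        // Number of whole intervals that have already passed (never negative).
        long missed = Math.max(0L, elapsedMillis / intervalMillis);
        return lastScheduled.plus(interval.multipliedBy(missed + 1));
    }

    public static void main(String[] args) {
        // A 5-minute job whose database was offline for about a week.
        Instant lastScheduled = Instant.parse("2010-07-13T10:00:00Z");
        Instant now = Instant.parse("2010-07-20T10:43:49Z");
        // Prints 2010-07-20T10:45:00Z: one future run, not ~2000 catch-up runs.
        System.out.println(nextFutureRun(lastScheduled, Duration.ofMinutes(5), now));
    }
}
```

Calling this at the end of a job's run, rather than at init, means a long 
outage produces a single next run instead of a backlog.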

> JobManager.poll() enters an endless loop when it can't get a connection
> -----------------------------------------------------------------------
>
>                 Key: OFBIZ-3867
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-3867
>             Project: OFBiz
>          Issue Type: Bug
>            Reporter: Adam Heath
>            Assignee: Adam Heath
>
> JobManager.poll(), line 157, where it calls storeByCondition, can fail when 
> there is no connection available from the database (due to a connection leak, 
> or just load, or whatever).  An exception then gets thrown by 
> storeByCondition (deep inside ofbiz/commons-dbcp/postgres).  The 
> catch(Throwable) then logs the error, and the loop tries again.  Since 
> pollDone never gets set to true, this loop is *very* tight, and the log file 
> starts to fill up *very* fast, as each thread of JobPoller tries the same 
> thing over and over.
> I'm filing this bug mainly to see if anyone else works on it, but if not, 
> it's a reminder for me.
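
One common way to keep a retry loop like the one described above from spinning 
is to pause between failed attempts with a capped exponential backoff.  The 
sketch below is only an illustration under that assumption; it is not the fix 
applied in OFBiz, and the class, method, and parameter names are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, for illustration only; not part of OFBiz.
public class PollBackoffSketch {

    /**
     * Runs 'attempt' until it succeeds or maxAttempts is reached.
     * After each failure it sleeps, doubling the delay up to maxDelayMillis,
     * so a missing database connection cannot cause a tight loop.
     */
    static <T> T pollWithBackoff(Callable<T> attempt, long initialDelayMillis,
                                 long maxDelayMillis, int maxAttempts) throws Exception {
        if (maxAttempts < 1 || initialDelayMillis < 1 || maxDelayMillis < initialDelayMillis) {
            throw new IllegalArgumentException("invalid backoff settings");
        }
        long delay = initialDelayMillis;
        for (int i = 1; ; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                if (i >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                System.err.println("attempt " + i + " failed, retrying in " + delay + " ms: " + e);
            }
            Thread.sleep(delay);
            delay = Math.min(maxDelayMillis, delay * 2);
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulates a connection that is unavailable for the first two attempts.
        String result = pollWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("no connection available");
            }
            return "polled";
        }, 10, 1000, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With this shape, each failure is logged once per delay period rather than 
continuously, and the delay cap bounds how long recovery takes once the 
database is reachable again.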

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
