We have made a number of changes to the JobScheduler so that it works properly with multi-tenancy. At this spot we keep a list of the databases that are down, and when polling for jobs we exclude the jobs that belong to those databases. A separate polling period (default 5 minutes) then checks whether the offline databases have come back online.
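
To make the offline list a bit more concrete, here is a minimal sketch in Java. It is not our actual code: the class name OfflineTenantTracker, the isReachable probe and the method names are all made up for illustration, and the caller is assumed to pass in the current time.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical helper (not an OFBiz class): remembers which tenant
// databases are down so the job poller can skip them.
class OfflineTenantTracker {
    private final Set<String> offline = ConcurrentHashMap.newKeySet();
    private final Predicate<String> isReachable; // assumed connectivity probe
    private final long recheckIntervalMillis;    // e.g. 5 minutes
    private long lastRecheckMillis;

    OfflineTenantTracker(Predicate<String> isReachable,
                         long recheckIntervalMillis, long nowMillis) {
        this.isReachable = isReachable;
        this.recheckIntervalMillis = recheckIntervalMillis;
        this.lastRecheckMillis = nowMillis;
    }

    // Called when a connection to the tenant's database fails.
    void markOffline(String tenant) {
        offline.add(tenant);
    }

    // The poller asks this before looking for the tenant's jobs.
    boolean shouldPoll(String tenant) {
        return !offline.contains(tenant);
    }

    // Called on every poll cycle; probes offline tenants only once the
    // separate recheck interval has elapsed.
    synchronized void recheckIfDue(long nowMillis) {
        if (nowMillis - lastRecheckMillis < recheckIntervalMillis) {
            return;
        }
        lastRecheckMillis = nowMillis;
        offline.removeIf(isReachable); // reachable tenants go back online
    }
}
```

The point of keeping the recheck on its own interval is that the normal poll never touches a database that is known to be down, so a dead tenant costs one probe per interval instead of one failure per poll.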

This might not match exactly what you are trying to do, because we store all persisted jobs in our "main" database, which has a "delegatorName" column (which represents the tenant). Jobs that are destined to run for all tenants are "exploded" into one job per tenant, each targeted at that tenant. This allows a "sendEmail" job (for example) to execute on all tenant databases that are online, and to safely skip offline tenants until they come back online. It also gives us a singleton jobManager, so you do not have one running for each tenant ...
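
Roughly, the "explode" step could look like the sketch below. Again this is only an illustration under assumptions: TenantJobExploder and its Job class are invented names, and the real job rows carry more than a job name and a delegatorName.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch (not OFBiz code) of exploding one "all tenants"
// job into one job per tenant, keyed by delegatorName.
class TenantJobExploder {

    // Stand-in for a persisted job row in the "main" database.
    static final class Job {
        final String jobName;
        final String delegatorName; // the tenant this job targets

        Job(String jobName, String delegatorName) {
            this.jobName = jobName;
            this.delegatorName = delegatorName;
        }
    }

    // One job per tenant, each targeted at that tenant.
    static List<Job> explode(String jobName, List<String> allTenants) {
        List<Job> jobs = new ArrayList<>();
        for (String tenant : allTenants) {
            jobs.add(new Job(jobName, tenant));
        }
        return jobs;
    }

    // Jobs whose tenant is offline are skipped for now; they stay
    // persisted and are picked up once the tenant is back online.
    static List<Job> runnable(List<Job> jobs, Set<String> offlineTenants) {
        List<Job> result = new ArrayList<>();
        for (Job job : jobs) {
            if (!offlineTenants.contains(job.delegatorName)) {
                result.add(job);
            }
        }
        return result;
    }
}
```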

Anyway those are my thoughts on it :)

On Jul 20, 2010, at 1:06 PM, Adrian Crum (JIRA) wrote:


[ https://issues.apache.org/jira/browse/OFBIZ-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890329#action_12890329 ]

Adrian Crum commented on OFBIZ-3867:
------------------------------------

One idea off the top of my head and without looking at the code would be to give the JobManager a state. It could enter a suspended state and then try switching to an active state from time to time. State change log entries would be informational, not warnings.

Having a hook where outside events could monitor/trigger state changes could be useful. A process monitoring the request load could suspend the JobManager during peak traffic times.
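
A minimal sketch of that idea in Java, assuming nothing about the existing JobManager code: the class JobManagerState, its two states and the listener hook are hypothetical names used only to illustrate the suggestion.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Hypothetical state holder (not the OFBiz JobManager): the poller
// checks mayPoll() and outside code can suspend or resume it.
class JobManagerState {
    enum State { ACTIVE, SUSPENDED }

    private volatile State state = State.ACTIVE;
    private final List<BiConsumer<State, State>> listeners =
            new CopyOnWriteArrayList<>();

    // Hook for outside monitors; receives (previous, next) on each change.
    void onChange(BiConsumer<State, State> listener) {
        listeners.add(listener);
    }

    boolean mayPoll() {
        return state == State.ACTIVE;
    }

    void suspend() {
        transition(State.SUSPENDED);
    }

    void resume() {
        transition(State.ACTIVE);
    }

    private synchronized void transition(State next) {
        State previous = state;
        if (previous == next) {
            return; // no change, so nothing to report
        }
        state = next;
        for (BiConsumer<State, State> listener : listeners) {
            listener.accept(previous, next);
        }
    }
}
```

A listener registered through onChange could write the informational log entry, and a load monitor could call suspend() and resume() around peak traffic.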


JobManager.poll() enters an endless loop when it can't get a connection
-----------------------------------------------------------------------

               Key: OFBIZ-3867
               URL: https://issues.apache.org/jira/browse/OFBIZ-3867
           Project: OFBiz
        Issue Type: Bug
          Reporter: Adam Heath
          Assignee: Adam Heath

JobManager.poll(), line 157, where it calls storeByCondition, can fail when no connection is available from the database (due to a connection leak, or just load, or whatever). An exception then gets thrown by storeByCondition (deep inside ofbiz/commons-dbcp/postgres). The catch(Throwable) then logs the error, and the loop tries again. Since pollDone never gets set to true, this loop is *very* tight, and the log file starts to fill up *very* fast, as each thread of JobPoller tries the same thing over and over. I'm filing this bug mainly to see if anyone else works on it, but if not, it's a reminder for me.
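
One possible way to keep a failing loop like this from spinning is to wait between retries, with the wait growing after each consecutive failure. The sketch below is not a patch for JobManager.poll(); PollBackoff is a hypothetical helper, and the caller is assumed to sleep for the returned delay before retrying.

```java
// Hypothetical helper (not OFBiz code): exponential backoff with a cap,
// intended to be consulted from the catch block of a retry loop.
class PollBackoff {
    private final long baseMillis;
    private final long maxMillis;
    private int consecutiveFailures;

    PollBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    // Returns how long to wait before the next attempt; the delay
    // doubles with each consecutive failure, up to maxMillis.
    long onFailure() {
        consecutiveFailures++;
        int shift = Math.min(consecutiveFailures - 1, 20); // keep the shift small
        return Math.min(maxMillis, baseMillis << shift);
    }

    // A successful poll resets the delay to its base value.
    void onSuccess() {
        consecutiveFailures = 0;
    }
}
```

With a base of one second and a cap of one minute, a database outage would produce a handful of log entries per minute per thread rather than a continuous stream.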

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

