We have made a number of changes to the JobScheduler so that it works
properly with multi-tenancy. At this spot in the code we created a list
of the databases that were down, and when polling for jobs we excluded
the jobs belonging to those databases. We then had a separate polling
period (default 5 minutes) that checked the offline databases to see
whether they had come back online.
This might not match exactly what you are trying to do, because we
store all persisted jobs in our "main" database, which has a
"delegatorName" column (which represents the tenant). Jobs that are
destined to run for all tenants are "exploded" into one job per
tenant, each targeted at that tenant. This allows a "sendEmail" job
(for example) to execute on all tenant databases that are online, and
safely skip offline tenants until they come back online. It also
gives us a singleton jobManager, so you do not have one running for
each tenant ...
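To make the approach above a bit more concrete, here is a rough sketch of the idea, not our actual code. The class and method names (OfflineTenantTracker, TenantJob, explode, runnable, recheckIfDue) are made up for illustration and are not OFBiz API; the reachability check is passed in as a predicate because how you probe a tenant database depends on your setup.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical job record: one row in the "main" database, tagged with its tenant.
final class TenantJob {
    final String jobName;
    final String delegatorName;

    TenantJob(String jobName, String delegatorName) {
        this.jobName = jobName;
        this.delegatorName = delegatorName;
    }

    // "Explode" an all-tenant job into one job per tenant.
    static List<TenantJob> explode(String jobName, Collection<String> tenants) {
        List<TenantJob> jobs = new ArrayList<>();
        for (String tenant : tenants) {
            jobs.add(new TenantJob(jobName, tenant));
        }
        return jobs;
    }
}

// Hypothetical helper: remembers which tenant databases are down.
final class OfflineTenantTracker {
    private final Set<String> offline = ConcurrentHashMap.newKeySet();
    private final long recheckMillis;
    private long lastRecheck;

    OfflineTenantTracker(long recheckMillis, long nowMillis) {
        this.recheckMillis = recheckMillis;
        this.lastRecheck = nowMillis;
    }

    void markOffline(String delegatorName) {
        offline.add(delegatorName);
    }

    boolean isOffline(String delegatorName) {
        return offline.contains(delegatorName);
    }

    // Keep only the jobs whose tenant is not currently marked offline.
    List<TenantJob> runnable(List<TenantJob> jobs) {
        List<TenantJob> result = new ArrayList<>();
        for (TenantJob job : jobs) {
            if (!offline.contains(job.delegatorName)) {
                result.add(job);
            }
        }
        return result;
    }

    // Probe offline tenants at most once per recheck period and
    // drop the ones that are reachable again.
    synchronized void recheckIfDue(long nowMillis, Predicate<String> isReachable) {
        if (nowMillis - lastRecheck < recheckMillis) {
            return;
        }
        lastRecheck = nowMillis;
        offline.removeIf(isReachable);
    }
}
```

The poller would call recheckIfDue on every pass with the current time (for a 5 minute period, recheckMillis would be 300000) and then run only the jobs returned by runnable, so skipped jobs simply stay persisted until their tenant is back.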
Anyway those are my thoughts on it :)
On Jul 20, 2010, at 1:06 PM, Adrian Crum (JIRA) wrote:
[ https://issues.apache.org/jira/browse/OFBIZ-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890329#action_12890329 ]
Adrian Crum commented on OFBIZ-3867:
------------------------------------
One idea off the top of my head and without looking at the code
would be to give the JobManager a state. It could enter a suspended
state and then try switching to an active state from time to time.
State change log entries would be informational, not warnings.
Having a hook where outside events could monitor/trigger state
changes could be useful. A process monitoring the request load could
suspend the JobManager during peak traffic times.
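A minimal sketch of what such a state holder might look like, under the assumption of just two states. JobManagerState, its State enum and the listener hook are hypothetical names for illustration; nothing here is taken from the existing JobManager code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiConsumer;

// Hypothetical state holder for a job manager (not existing OFBiz code).
final class JobManagerState {
    enum State { ACTIVE, SUSPENDED }

    private final AtomicReference<State> state = new AtomicReference<>(State.ACTIVE);
    // Hook for outside observers, e.g. an informational logger or a load monitor.
    private final List<BiConsumer<State, State>> listeners = new CopyOnWriteArrayList<>();

    void addListener(BiConsumer<State, State> listener) {
        listeners.add(listener);
    }

    boolean isActive() {
        return state.get() == State.ACTIVE;
    }

    // Returns true only if the state actually changed.
    boolean transitionTo(State next) {
        State previous = state.getAndSet(next);
        if (previous == next) {
            return false;
        }
        for (BiConsumer<State, State> listener : listeners) {
            listener.accept(previous, next);
        }
        return true;
    }

    boolean suspend() {
        return transitionTo(State.SUSPENDED);
    }

    boolean resume() {
        return transitionTo(State.ACTIVE);
    }
}
```

The poll loop would check isActive() before touching the database, call suspend() when it cannot get a connection, and try resume() from time to time. A listener that logs each transition at info level would give the informational state change entries, and a request load monitor could call suspend() and resume() through the same object.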
JobManager.poll() enters an endless loop when it can't get a
connection
-----------------------------------------------------------------------
Key: OFBIZ-3867
URL: https://issues.apache.org/jira/browse/OFBIZ-3867
Project: OFBiz
Issue Type: Bug
Reporter: Adam Heath
Assignee: Adam Heath
JobManager.poll(), line 157, where it calls storeByCondition, can
fail when there is no connection available from the database (due to
a connection leak, or just load, or whatever). An exception then
gets thrown by storeByCondition (deep inside ofbiz/commons-dbcp/
postgres). The catch(Throwable) then logs the error, and the loop
tries again. Since pollDone never gets set to true, this loop is
*very* tight, and the log file starts to fill up *very* fast, as
each thread of JobPoller tries the same thing over and over.
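One possible way to keep such a retry loop from spinning is to sleep between failed attempts with a growing delay and give up after a bounded number of tries. The sketch below only illustrates that general technique; it is not the fix that was applied to JobManager, and BackoffPoller, Sleeper and pollWithBackoff are hypothetical names. The attempt parameter stands in for the storeByCondition call.

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating bounded retry with exponential backoff.
final class BackoffPoller {
    // Abstraction over Thread.sleep so the delay can be observed or replaced.
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    // Runs `attempt` until it succeeds or maxAttempts is reached, doubling
    // the delay after each failure (capped at maxDelayMillis).
    static <T> T pollWithBackoff(Callable<T> attempt, int maxAttempts,
                                 long initialDelayMillis, long maxDelayMillis,
                                 Sleeper sleeper) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e;
                if (i == maxAttempts) {
                    break; // out of attempts: surface the last failure
                }
                sleeper.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMillis);
            }
        }
        throw last;
    }
}
```

In real use the sleeper argument could simply be Thread::sleep. Logging the failure once per attempt, rather than once per spin of the loop, also keeps the log file from filling up.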
I'm filing this bug mainly to see if anyone else works on it, but
if not, it's a reminder for me.