Raphael Ouazana created JAMES-3478:
--------------------------------------
Summary: [Distributed Task Manager] Run tasks in parallel
Key: JAMES-3478
URL: https://issues.apache.org/jira/browse/JAMES-3478
Project: James Server
Issue Type: Improvement
Components: task
Reporter: Raphael Ouazana
Architecture Decision Record
Run tasks in parallel
Status
draft
Context
Historically the task manager has been sequential, with only one task running
at a time.
We would like to take advantage of running a distributed task manager to run
some tasks in parallel on different nodes.
However, we must be able to prevent some tasks from running in parallel with
certain others.
Execution should stay sequential on a single node.
Decision
Use a non-exclusive queue with prefetchCount=1 in RabbitMQ, so that each node
prefetches only one message.
Each task will declare a Set of resources that it needs to acquire.
It will not be allowed to run in parallel with another task requiring one of
those resources.
Those resources could be hierarchical, for example:
cassandra/mailboxes
cassandra/mailboxes/foo
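
To make the hierarchy concrete, here is a minimal sketch of one possible
conflict rule. It assumes that a resource conflicts with itself, with its
ancestors and with its descendants; the ADR does not spell out this rule. The
class name ResourcePaths and its methods are hypothetical and are not part of
James.

```java
import java.util.Set;

/** Hypothetical helper illustrating hierarchical resource conflicts. */
final class ResourcePaths {
    private ResourcePaths() {}

    /**
     * Assumed rule: two resources overlap when they are equal or when one is
     * an ancestor of the other, compared on whole path segments.
     */
    static boolean overlaps(String a, String b) {
        return a.equals(b) || a.startsWith(b + "/") || b.startsWith(a + "/");
    }

    /** True when any needed resource overlaps any currently locked resource. */
    static boolean conflicts(Set<String> needed, Set<String> locked) {
        return needed.stream()
            .anyMatch(n -> locked.stream().anyMatch(l -> overlaps(n, l)));
    }

    public static void main(String[] args) {
        Set<String> locked = Set.of("cassandra/mailboxes");
        // A child of a locked resource is treated as conflicting.
        System.out.println(conflicts(Set.of("cassandra/mailboxes/foo"), locked));
        // "cassandra/other" is a made-up sibling resource used for illustration.
        System.out.println(conflicts(Set.of("cassandra/other"), locked));
    }
}
```

Comparing on whole segments means that a resource which merely shares a string
prefix, such as a hypothetical cassandra/mailboxesfoo, is not treated as a
child of cassandra/mailboxes.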
A new Dispatched event is created after a Create command if the resources
needed by the task are free, and the task is then sent to the workqueue.
When a task terminates, we check for tasks that are Created but not Delivered,
and deliver and send them to the workqueue if they no longer depend on locked
resources.
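
The create and termination rules above could be sketched in memory as follows.
This is only an illustration under stated assumptions: the class
ResourceAwareDispatcher, its method names and the list standing in for the
RabbitMQ workqueue are hypothetical. Resources are compared by exact name to
keep the sketch short, and nothing here is event-sourced or distributed as the
real task manager would be.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of resource-aware dispatching. */
final class ResourceAwareDispatcher {
    // Tasks that were created but could not be dispatched yet, in creation order.
    private final Map<String, Set<String>> pending = new LinkedHashMap<>();
    // Tasks already dispatched, with the resources they hold.
    private final Map<String, Set<String>> running = new LinkedHashMap<>();
    // Stand-in for the RabbitMQ workqueue.
    private final List<String> workQueue = new ArrayList<>();

    /** Create command: dispatch at once if the resources are free, else park the task. */
    synchronized boolean create(String taskId, Set<String> resources) {
        Set<String> needed = Set.copyOf(resources);
        if (isFree(needed)) {
            dispatch(taskId, needed);
            return true;
        }
        pending.put(taskId, needed);
        return false;
    }

    /** Termination: release the task's resources, then dispatch parked tasks that are now free. */
    synchronized List<String> terminate(String taskId) {
        running.remove(taskId);
        List<String> dispatched = new ArrayList<>();
        Iterator<Map.Entry<String, Set<String>>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            String id = entry.getKey();
            Set<String> needed = entry.getValue();
            if (isFree(needed)) {
                it.remove();
                dispatch(id, needed);
                dispatched.add(id);
            }
        }
        return dispatched;
    }

    /** Snapshot of the task ids sent to the workqueue so far. */
    synchronized List<String> workQueue() {
        return List.copyOf(workQueue);
    }

    // A resource set is free when no running task holds any of its resources.
    private boolean isFree(Set<String> needed) {
        for (Set<String> held : running.values()) {
            for (String resource : held) {
                if (needed.contains(resource)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Corresponds to emitting the Dispatched event and publishing to the workqueue.
    private void dispatch(String taskId, Set<String> resources) {
        running.put(taskId, resources);
        workQueue.add(taskId);
    }
}
```

For example, creating t1 and then t2 with the same resource dispatches only
t1; terminating t1 releases the resource and dispatches t2.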
Consequences
We will have to modify the behavior of the start command to accept a task only
if no incompatible tasks are already running on the cluster.
We will have to take care to detect stuck tasks (for example after a node
restart), as a stuck task would prevent new tasks requiring the same resources
from starting.
Definition of done
Have a test ensuring that, given two task managers, when two tasks which could
run concurrently are submitted, then one is executed on one instance and the
other on the other instance.
Have a test ensuring that, given two task managers, when two tasks which could
NOT run concurrently are submitted, then one is executed on an instance and the
other is executed only once the first one has terminated.