There is a serious bottleneck in the existing TaskManager implementation when, under load, the typical task has at least one runAfter dependency on an older task at the time it is added.

I plan to remove this bottleneck, but I think understanding it may help in some decisions that need to be made about the new implementation, so I'm going to describe it.

The constructor comment block says: "A new thread is created if the total number of runnable tasks (both active and pending) exceeds the number of threads times the loadFactor, and the maximum number of threads has not been reached."

Achieving that requires a check whenever the number of runnable tasks increases. Similarly, a thread that is waiting for a task to run needs to be notified when a runnable task becomes available.

There are two ways a Task x can become runnable:

1. At the time x is added, it does not need to run after any older task.

2. Another task y has just completed, x did need to run after y, and x does not need to run after any other task that still exists.

Condition 1 is being handled correctly, with both a check for increasing the number of tasks and a notify on add of an immediately runnable task.

Condition 2 is completely ignored.

At least one thread will always be created, because the first task to be added must be immediately runnable. Even if a second thread gets created because a task happens to be immediately runnable, there is a danger of a thread timing out waiting for work, because it will not get notified until another immediately runnable task is added. For a workload with dependencies, the single threading will tend to increase the queue length, making added tasks less likely to be immediately runnable because they have more tasks to conflict with.

I believe the problem is a consequence of the data structure design, which makes asking "Did this task completion make any other task or tasks runnable?" very expensive. The design I'm working on will make answering this question a very fast side effect of work that needs to be done anyway.

Empirically, I have tested a workload with dependencies that should alternate between 5 parallel tasks and a single task, mean three tasks at a time, but runs no faster permitted up to 100 threads than when limited to one thread. I measured on a computer with two Xeon processors, each dual core and dual threaded.

Patricia

Reply via email to