What about using pull queues? Your external system could poll the
queues for work, and scale based on the number of items in the queue.
That would give you the most control. If you didn't want to poll, you
could set up a push notification to your external system, telling it
that there is new work.

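The poll-and-scale idea above could be sketched like this — the queue/autoscaling calls are placeholders for whatever API the external system exposes, and all the numbers are illustrative:

```python
import math
import time

def desired_workers(queue_depth, drain_seconds=60, tasks_per_worker_per_sec=1.0):
    """Worker count needed to drain the current backlog in roughly
    `drain_seconds` (both tuning values here are made up)."""
    if queue_depth <= 0:
        return 0
    return math.ceil(queue_depth / (drain_seconds * tasks_per_worker_per_sec))

def poll_and_scale(get_queue_depth, set_worker_count, interval_seconds=30):
    """Poll loop: read the queue depth, resize the worker pool.
    `get_queue_depth` and `set_worker_count` stand in for the real
    queue-inspection and scaling APIs."""
    while True:
        set_worker_count(desired_workers(get_queue_depth()))
        time.sleep(interval_seconds)
```

The push-notification variant would call `desired_workers` from a notification handler instead of a loop.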
Sorry, I realize I didn't explain everything: since the external
system can handle an arbitrary load (as long as this load doesn't grow
too fast) and requests get processed in 1 to 2 seconds, we have already
set the processing rate to the maximum (200), as well as the maximum
number of concurrent requests (200).
The problem is

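(As a quick sanity check on those numbers: by Little's law, 200 requests/s at 1 to 2 s each means 200 to 400 requests in flight, so a 200-slot concurrency cap only sustains the full rate at the 1 s end:)

```python
def in_flight(rate_per_sec, latency_sec):
    """Little's law: average number of requests in flight."""
    return rate_per_sec * latency_sec

def max_throughput(concurrency_limit, latency_sec):
    """Throughput ceiling imposed by a concurrency cap."""
    return concurrency_limit / latency_sec

# At 200 req/s and 2 s latency, 400 requests are in flight,
# but 200 concurrent slots cap throughput at 100 req/s.
```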
Interesting problem. While the sudden load may be undesirable, it seems
that the real problem is that the task queue's backoff is too aggressive - if
it kept trying, it would eventually spin up enough hardware at the external
system.
You can configure the retry schedule explicitly - maybe try

Good point: forcing the tasks to retry every few seconds instead of at
an increasing backoff might do it. Keep pounding the system to force
it to scale, but it's not very elegant :)
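A fixed retry interval like that can be expressed in queue.yaml via `retry_parameters` - a sketch, with the queue name and exact values purely illustrative:

```yaml
queue:
- name: external-work        # hypothetical queue name
  rate: 200/s                # the maximum rate mentioned above
  max_concurrent_requests: 200
  retry_parameters:
    min_backoff_seconds: 5   # retry every ~5 seconds instead of
    max_backoff_seconds: 5   # an ever-growing backoff
    max_doublings: 0         # never double the interval
```

Pinning `min_backoff_seconds` and `max_backoff_seconds` to the same value (with `max_doublings: 0`) keeps the retries coming at a constant cadence.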
My original idea was to play with the bucket size parameter for the
queue, as the docs seem to imply that it