We are currently using Michael's suggestion of running a job periodically
which looks for labels with 0 nodes and assigns each such label to a node in
the "fallback" pool. We are continuing to manage locality by manually
juggling label assignments. So now we at least have guaranteed incremental
builds.
Same issue here. Our incremental builds suffer from very poor locality
because the pipeline *always* starts on a completely different set of nodes
than last time, and we have quite a few nodes. I came up with the same
tricks you mention, but they did not help much. Did you find a better
solution already?
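For context, the node-pinning trick amounts to something like this scripted Pipeline sketch. `BUILD_NODE` and `build-pool` are hypothetical names, and it assumes the previous build completed and its node is still online; if the node is gone, `node()` just blocks, which is part of why the trick helps so little:

```groovy
// Scripted Pipeline sketch: try to reuse the node the previous build ran on.
def previous = currentBuild.previousBuild
// Fall back to the pool label when there is no previous build or no record.
def target = previous?.buildVariables?.get('BUILD_NODE') ?: 'build-pool'

node(target) {
    // Record where we ran so the next build can ask for this node by name.
    env.BUILD_NODE = env.NODE_NAME
    sh './incremental-build.sh'
}
```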
On Friday, October 28, 2016 at 9:36:49 PM UTC-7, John Calsbeek wrote:
>
>
> Shared storage is a potential option, yes, but the tasks in question are
> currently not very fault-tolerant when it comes to network hitches.
>
Well, it would pay to make them more fault-tolerant :-) But even if you do
On Friday, October 28, 2016 at 9:14:53 PM UTC-7, Michael Lasevich wrote:
>
> Is there a way to reduce the need for tasks to run on the same slave? I
> suspect the issue is having data from the last run - if that is the case,
> is there any shared storage solution that may reduce the time difference?
Is there a way to reduce the need for tasks to run on the same slave? I
suspect the issue is having data from the last run - if that is the case,
is there any shared storage solution that may reduce the time difference?
If you can reduce the need for binding tasks to specific nodes, you bypass
your problem entirely.
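For what it's worth, the shared-storage idea can be sketched like this; `/mnt/shared` is a hypothetical NFS mount visible from every node, and rsync stands in for whatever transfer mechanism you trust to survive network hitches:

```groovy
// Pipeline sketch: any node in the pool will do, no pinning required,
// because the incremental state lives on (assumed) shared storage.
node('build-pool') {
    // Restore the previous run's cache; tolerate a missing/empty cache.
    sh 'rsync -a /mnt/shared/build-cache/ ./cache/ 2>/dev/null || true'
    sh './incremental-build.sh'
    // Publish the updated cache for whichever node runs next.
    sh 'rsync -a ./cache/ /mnt/shared/build-cache/'
}
```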
We are having trouble getting more control over how the node() step decides
which node to allocate an executor on. Specifically, we have a situation
where we have a pool of nodes with a specific label, all of which are
capable of executing a given task, but with a strong preference to run the
task on the node where it ran last time.