Hi Scott,
Drillbit symmetry is built deep into Drill's distribution model: the planner
assumes Drillbits are equal. Changing this assumption is possible (you cited
MapReduce as a system that handles this case), but would require complex code
changes:
* Distribute scan blocks based on locality, or include machine capability when
attempting to balance reads (weaker machines get fewer reads, say)?
* When determining the number of minor fragments (execution tasks), base this
on the total available slots? (With each machine having a number of slots
determined by its configuration, say.) This is easier for simple operators
(filter, project), but gets trickier for things like sorts and joins.
* Prefer more powerful machines for some operators such as sort? (Sort on
machines with the most memory, or a combination of memory and CPU?)
* Exclude weak nodes from being Foreman? (Or, dedicate such nodes to ONLY being
Foreman?)
As you can see, the scheduling algorithm for an asymmetric cluster would be
very complex and very hard to get right. I suspect that is why Drill went with
the much simpler assumption: symmetric nodes.
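To make the first bullet concrete, here is a minimal Python sketch (not Drill
code) of what "weaker machines get fewer reads" could look like if locality is
ignored entirely. The function name `split_by_weight` and the per-node
capability weights are hypothetical; Drill has no such setting today. It
splits a number of scan blocks in proportion to the weights using the
largest-remainder method.

```python
def split_by_weight(total_units, weights):
    """Split total_units across nodes in proportion to their weights.

    weights: {node_name: capability_weight}; these weights are an assumed
    input for illustration, not something Drill exposes.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Ideal (fractional) share for each node.
    ideal = {node: total_units * w / total_weight for node, w in weights.items()}
    # Start every node at the floor of its ideal share.
    shares = {node: int(value) for node, value in ideal.items()}
    leftover = total_units - sum(shares.values())
    # Give the leftover units to the nodes with the largest fractional
    # remainders, breaking ties in favor of the heavier node.
    by_remainder = sorted(
        weights,
        key=lambda node: (ideal[node] - shares[node], weights[node]),
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


print(split_by_weight(10, {"big": 4, "mid": 2, "small": 1}))
```

With 10 blocks and weights 4:2:1, this hands out 6, 3 and 1 blocks. Even this
toy version shows the difficulty: it says nothing about locality, and a real
planner would have to trade the two off against each other.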
In fact, to support asymmetry well, Drill would likely need a different
parallelizer design: one that moves from treating the assignment of minor
fragments to nodes as a simple slice & dice activity to treating it more like
YARN (or Kubernetes) does, as a process of assigning tasks to slots using some
kind of best-fit or bin-packing algorithm. Obviously not a trivial change!
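As a rough illustration of the bin-packing idea, here is a small Python sketch
of best-fit-decreasing placement. It is not how Drill, YARN or Kubernetes
actually schedule; the function name `best_fit_assign`, the task names and the
memory numbers are all made up for the example, and it considers memory only.

```python
def best_fit_assign(tasks, capacity):
    """Place each task on the node that leaves the least memory unused.

    tasks:    {task_name: memory_needed}     (hypothetical units)
    capacity: {node_name: memory_available}  (hypothetical units)
    Returns {task_name: node_name}.
    """
    free = dict(capacity)  # work on a copy so the caller's dict is untouched
    placement = {}
    # Best-fit decreasing: place the largest tasks first.
    for task, need in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        candidates = [node for node in free if free[node] >= need]
        if not candidates:
            raise RuntimeError(f"no node can fit {task} ({need} units)")
        # Choose the tightest fit among the nodes that still have room.
        node = min(candidates, key=lambda n: free[n] - need)
        placement[task] = node
        free[node] -= need
    return placement


tasks = {"sort": 8, "join": 6, "filter": 2, "project": 1}
print(best_fit_assign(tasks, {"big": 16, "small": 4}))
```

Here the sort, join and filter land on the big node and only the project goes
to the small one. A real scheduler would also have to account for CPU, data
locality and the exchanges between fragments, which is where the complexity
comes from.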
For now, the best advice would be to configure all Drillbits to use the same
amount of memory and CPU. Use YARN to assign additional non-Drill tasks to
larger nodes, while leaving Drill as the only task on weaker nodes.
Thanks,
- Paul
On Tuesday, August 21, 2018, 1:48:19 PM PDT, scott <[email protected]>
wrote:
Hi community,
I am trying to find a way to tune Drill so that weaker drillbits get less
data to work on so that the weak link doesn't drag my performance down. I
have drillbits running on a variety of hardware and sometimes these shared
resources get really slow. It seems that the query plan always evenly
divides the data fragments so that each drillbit gets the same data to chew
on. How do I make it give weaker drillbits less data?
Alternatively, is there a way to limit and queue fragments of the query and
leave them unassigned, then assign to drillbits as their resources become
free, similar to MapReduce?
Thanks for your time,
Scott