To scale query execution horizontally, Drill spreads the work across
all (well, 70% by default) of the CPUs available to the cluster. It does
this by slicing the physical plan into "major fragments" and then slicing
each of those into "minor fragments". You can roughly think of a minor
fragment as a single thread of execution at runtime. I've included some
further reading below.
1. https://drill.apache.org/architecture/
2. Learning Apache Drill (O'Reilly)
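
To put rough numbers on the 70% default mentioned above, here is a small
back-of-the-envelope sketch in Python. It is not Drill's actual planner
code: the function names, the round-down behaviour and the "at least one"
floor are my own assumptions for illustration. Drill exposes options such
as planner.width.max_per_node to tune the real limit, so check the docs
for the exact rules.

```python
import math


def max_minor_fragments_per_node(cores: int, cpu_fraction: float = 0.70) -> int:
    """Rough upper bound on parallel minor fragments for one node.

    Hypothetical helper: assumes the bound is the core count times the
    CPU fraction, rounded down, and never less than 1.
    """
    return max(1, math.floor(cores * cpu_fraction))


def cluster_width(nodes: int, cores_per_node: int, cpu_fraction: float = 0.70) -> int:
    """Rough upper bound on minor fragments that one major fragment could
    be split into across a cluster of identical nodes (same assumptions)."""
    return nodes * max_minor_fragments_per_node(cores_per_node, cpu_fraction)


if __name__ == "__main__":
    # Example: 4 Drillbits with 16 cores each.
    print(max_minor_fragments_per_node(16))
    print(cluster_width(4, 16))
```

So on a 4-node cluster with 16 cores per node, a single major fragment
could be split into a few dozen minor fragments under these assumptions,
each running more or less as its own thread.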
On 2022/12/06 00:05, marc nicole wrote:
Hi,
Could somebody explain the notion of fragments in Drill and why would a
query be executed on each of the data fragments?
Thanks.