A query plan can show whether a query is parallelized: look for
exchanges, which are used to merge work from multiple execution fragments
or to redistribute data for an operation. Execution fragments can run on
different threads or different machines. The best place to find out how
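As a rough illustration of the "look for exchanges" advice, here is a minimal sketch that scans a text plan for operators whose names end in "Exchange" and collects the leading fragment numbers. It assumes the plan is in the usual Drill text form where each line starts with an id such as `01-02` followed by the operator name. The helper `summarize_plan` and the sample plan are hypothetical and made up for illustration; they are not part of Drill.

```python
import re

# Assumed line shape: "MM-NN    OperatorName(...)", where MM is the
# fragment number and NN is the operator number within that fragment.
PLAN_LINE = re.compile(r"^\s*(\d+)-(\d+)\s+(\w+)")


def summarize_plan(plan_text):
    """Return (sorted fragment numbers, exchange operator names) for a text plan."""
    fragments = set()
    exchanges = []
    for line in plan_text.splitlines():
        match = PLAN_LINE.match(line)
        if not match:
            continue  # skip anything that does not look like a plan line
        fragments.add(int(match.group(1)))
        name = match.group(3)
        if name.endswith("Exchange"):
            exchanges.append(name)
    return sorted(fragments), exchanges


# Hypothetical, hand-written plan used only to exercise the helper.
SAMPLE_PLAN = """\
00-00    Screen
00-01      Project(...)
00-02        UnionExchange
01-01          Scan(...)
"""

if __name__ == "__main__":
    fragment_ids, exchange_ops = summarize_plan(SAMPLE_PLAN)
    print("fragments:", fragment_ids)
    print("exchanges:", exchange_ops)
```

Finding an exchange only says that the work is split into more than one fragment. The plan text on its own does not say how many threads or nodes actually ran each fragment.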

I am running a CTAS from CSV files on a 4-node HDFS cluster into a Parquet
file, and I note that the physical plan in the Drill UI references scans of
all the CSV sources on a single node.
collectl shows read and write I/O on all 4 nodes. Does this imply that
the full cluster is used for scanning the