By enabling the planner.enable_unionall_distribute option it is distributing UNION ALL operations to all worker nodes. But by default this is disabled. Is it ok to enable it? Just trying to understand the side effects of this option for other types of queries. On Sunday, May 5, 2024 at 07:13:43 PM PDT, Ranjit Reddy <ranjitre...@yahoo.com> wrote: Hi all, I'm working on a customer use case. They have one of the queries as shown below. The query is a modified version to hide the names of original tables and columns. What we are observing is, it is picking the right partition in hdfs by the noFiles scanned. However, all the scanning is done by just one node. This is displayed in the profile of the query. As a result the query is not truly distributed and takes forever to complete. This is a 20 node Drill cluster. Are there any settings to distribute the query? We are using following Drill version: apache drill> select * from sys.version; +--------------------+------------------------------------------+------------------------------------+---------------------------+-----------------+---------------------------+ | version | commit_id | commit_message | commit_time | +--------------------+------------------------------------------+------------------------------------+---------------------------+-----------------+---------------------------+ | 1.20.3.203-eep-921 | c215ce49343cac8f5a1a072e92f005bd03903af4 | Update Drill version to 1.20.3.203 | +--------------------+------------------------------------------+------------------------------------+---------------------------+-----------------+---------------------------+
planner.width.max_per_query = 1000 planner.width.max_per_node = 45 SELECT msi, startTime, endTime FROM ( SELECT column1, column2 FROM `dfs`.dse UNION ALL SELECT FROM `dfs`.`tmp`.`p UNION ALL SELECT column2, column3 FROM `dfs`.`tmp`.mdr ) where ( dir0 > 'event_date=2023-10-31' AND dir0 < 'event_date=2023-12-31' ) AND ( ntwk_date_time >= to_timestamp('2023-10-31 14:00:00','YYYY-MM-dd HH:mm:ss') AND date_time <= to_timestamp('2023-12-31 13:59:59','YYYY-MM-dd HH:mm:ss') ) AND ( MSI = '0146993') AND trim(party_num) <> '' order by startTime desc Thanks, -Ranjit