Csaba Ringhofer created IMPALA-10973:
----------------------------------------

             Summary: Empty scan nodes are scheduled to the (exclusive) coordinator
                 Key: IMPALA-10973
                 URL: https://issues.apache.org/jira/browse/IMPALA-10973
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Csaba Ringhofer


Currently fragments with scan nodes that have no scan ranges are scheduled to 
the coordinator, even if it is an exclusive coordinator:
https://github.com/apache/impala/blob/master/be/src/scheduling/scheduler.cc#L805

As "parent" fragments are often scheduled to be collocated with their children, 
the condition of "being scheduled to the coordinator" can spread through the 
plan tree.
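
The spreading effect described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the logic in scheduler.cc: the Fragment class, the place() function, the host names, and the rule that a parent runs on the union of its children's hosts are all hypothetical simplifications made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

COORDINATOR = "coord"  # hypothetical host name for the exclusive coordinator


@dataclass
class Fragment:
    name: str
    is_scan: bool = False
    num_scan_ranges: int = 0  # 0 models a scan of an empty table
    children: List["Fragment"] = field(default_factory=list)


def place(root: Fragment, executors: List[str]) -> Dict[str, Set[str]]:
    """Map each fragment name to the set of hosts it runs on (toy rules)."""
    placement: Dict[str, Set[str]] = {}

    def visit(frag: Fragment) -> Set[str]:
        child_hosts = [visit(child) for child in frag.children]
        if frag.is_scan:
            # Behaviour from the report: no scan ranges -> coordinator.
            hosts = {COORDINATOR} if frag.num_scan_ranges == 0 else set(executors)
        elif child_hosts:
            # Assumed rule: a parent is collocated with all of its children.
            hosts = set().union(*child_hosts)
        else:
            hosts = set(executors)
        placement[frag.name] = hosts
        return hosts

    visit(root)
    return placement


executors = ["exec1", "exec2"]
empty_scan = Fragment("scan emptytable", is_scan=True, num_scan_ranges=0)
full_scan = Fragment("scan alltypes", is_scan=True, num_scan_ranges=8)
union = Fragment("union", children=[empty_scan, full_scan])
top = Fragment("parent of union", children=[union])

placement = place(top, executors)
for name, hosts in placement.items():
    print(name, sorted(hosts))
```

In this model the empty scan pulls the coordinator into the host set of the union and of every ancestor above it, while a plan without the empty scan never touches the coordinator.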

This can be disastrous for scalability in clusters with many executors but few 
coordinators. It is also very counter-intuitive, as scanning an empty table 
shouldn't have a major effect on the query.
 
To reproduce locally:
bin/start-impala-cluster.py --use_exclusive_coordinators -c 1
Then, in Impala shell:
select id from functional.alltypes;
profile; -- scan nodes will be scheduled to 2 hosts

select f2 from functional.emptytable union all select id from functional.alltypes;
profile; -- scan nodes will be scheduled to 3 hosts



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
