The way we intend multithreading to work is that a parameter MT_DOP will control how many fragment instances per node are going to be run. Today, MT_DOP works for simple queries that do not involve joins - but it does not yet work for joins.
On Thu, Oct 26, 2017 at 11:33 PM, 俊杰陈 <cjjnj...@gmail.com> wrote: > Thanks, Alex! > > My question maybe can impala start multiple fragment instances for a > particular plan fragment on a single node, for example, I have 5 fragment > instances for a plan fragment say F01 on a 5 nodes cluster, is that > possible to have 10 F01 instances on 5 nodes, 2 F01 instances per node? > > 2017-10-27 13:41 GMT+08:00 Alexander Behm <alex.b...@cloudera.com>: > > > The multithreading effort is still ongoing. Joins, in particular, are not > > executed with multiple threads yet. > > > > Not sure if I completely followed your last two questions, please correct > > me if I misunderstood. > > The general idea of the multithreading effort is to start multiple > fragment > > instances per host. A fragment instance may contain an exchange node. > > > > > > On Wed, Oct 25, 2017 at 7:22 PM, 俊杰陈 <cjjnj...@gmail.com> wrote: > > > > > Thanks for the reply. > > > > > > I saw IMPALA-3902 <https://issues.apache.org/jira/browse/IMPALA-3902> > > > seems > > > to add support for multithread execution. It describes the goal is to > > > support running multiple fragment instances on a single node, is that > > means > > > coordinator generate multiple instances for a plan fragment on a single > > > node so that starts multiple exchange nodes to receive data and > process? > > Or > > > it starts instances for different plan fragments for preparing > > > the streaming? > > > > > > 2017-10-25 22:08 GMT+08:00 Jeszy <jes...@gmail.com>: > > > > > > > Hello JJ, > > > > > > > > No, currently Impala uses one thread to execute the join (without > > > > regard for the amount of partitions that fit into memory). > > > > > > > > HTH > > > > > > > > On 25 October 2017 at 05:44, 俊杰陈 <cjjnj...@gmail.com> wrote: > > > > > Hi > > > > > > > > > > When Impala does a partitioned join on a node, it split the build > > input > > > > > into partitions until a partition can fit into memory and consume > the > > > > probe > > > > > input then do the join and output rows. > > > > > > > > > > My question is will impala schedule multiple tasks to do join if > > > multiple > > > > > partitions fit into memory, or iterate over partitions? And for one > > > > > partition does it use multiple threads to do join? Thanks in > > advanced. > > > > > > > > > > > > > > > JJ > > > > > > > > > > > > > > > > -- > > > Thanks & Best Regards > > > > > > > > > -- > Thanks & Best Regards >