Re: How many threads impala start for handling partitioned join?

2017-10-27 Thread Alexander Behm
The way we intend multithreading to work is that a parameter MT_DOP will control how many fragment instances per node are going to be run. Today, MT_DOP works for simple queries that do not involve joins - but it does not yet work for joins. On Thu, Oct 26, 2017 at 11:33 PM, 俊杰陈 wrote: > Thanks,

Re: How many threads impala start for handling partitioned join?

2017-10-26 Thread 俊杰陈
Thanks, Alex! My question maybe can impala start multiple fragment instances for a particular plan fragment on a single node, for example, I have 5 fragment instances for a plan fragment say F01 on a 5 nodes cluster, is that possible to have 10 F01 instances on 5 nodes, 2 F01 instances per node?

Re: How many threads impala start for handling partitioned join?

2017-10-26 Thread Alexander Behm
The multithreading effort is still ongoing. Joins, in particular, are not executed with multiple threads yet. Not sure if I completely followed your last two questions, please correct me if I misunderstood. The general idea of the multithreading effort is to start multiple fragment instances per h

Re: How many threads impala start for handling partitioned join?

2017-10-25 Thread 俊杰陈
Thanks for the reply. I saw IMPALA-3902 seems to add support for multithread execution. It describes the goal is to support running multiple fragment instances on a single node, is that means coordinator generate multiple instances for a plan fr

Re: How many threads impala start for handling partitioned join?

2017-10-25 Thread Jeszy
Hello JJ, No, currently Impala uses one thread to execute the join (without regard for the amount of partitions that fit into memory). HTH On 25 October 2017 at 05:44, 俊杰陈 wrote: > Hi > > When Impala does a partitioned join on a node, it split the build input > into partitions until a partition

How many threads impala start for handling partitioned join?

2017-10-24 Thread 俊杰陈
Hi When Impala does a partitioned join on a node, it split the build input into partitions until a partition can fit into memory and consume the probe input then do the join and output rows. My question is will impala schedule multiple tasks to do join if multiple partitions fit into memory, or i