To clarify one bit: the acceptor thread is the thread that calls accept() on
the listening TCP socket. Once a connection is accepted, the RPC system uses
libev (event-based IO) to react to new packets on a "reactor thread". When a
full RPC request has been received, it is handed off to the "service threads".
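
As a rough illustration of that hand-off, here is a toy Python sketch. It is
not Kudu's actual C++ code: the ToyRpcServer class, the newline-delimited
"requests", and the BUSY reply on a full queue are all assumptions made up for
the example, and Python's selectors module stands in for libev. The two
constants only mirror the roles of the corresponding flags.

```python
import queue
import selectors
import socket
import threading
import time

NUM_SERVICE_THREADS = 4    # plays the role of --rpc_num_service_threads
SERVICE_QUEUE_LENGTH = 50  # plays the role of --rpc_service_queue_length


class ToyRpcServer:
    """Toy model: one acceptor thread, one reactor thread, N service threads."""

    def __init__(self, handler):
        self.handler = handler  # bytes -> bytes, runs on a service thread
        self.listener = socket.create_server(("127.0.0.1", 0))
        self.port = self.listener.getsockname()[1]
        self.new_conns = queue.Queue()
        self.service_queue = queue.Queue(maxsize=SERVICE_QUEUE_LENGTH)
        threads = [threading.Thread(target=self._acceptor, daemon=True),
                   threading.Thread(target=self._reactor, daemon=True)]
        threads += [threading.Thread(target=self._service, daemon=True)
                    for _ in range(NUM_SERVICE_THREADS)]
        for t in threads:
            t.start()

    def _acceptor(self):
        # Only job: accept() new connections and hand them to the reactor.
        while True:
            conn, _ = self.listener.accept()
            conn.setblocking(False)
            self.new_conns.put(conn)

    def _reactor(self):
        # Event loop: waits for readable sockets and assembles full requests.
        sel = selectors.DefaultSelector()
        buffers = {}  # per-connection bytes that do not yet form a request
        while True:
            try:
                while True:  # pick up connections handed over by the acceptor
                    conn = self.new_conns.get_nowait()
                    buffers[conn] = b""
                    sel.register(conn, selectors.EVENT_READ)
            except queue.Empty:
                pass
            if not buffers:
                time.sleep(0.01)
                continue
            for key, _ in sel.select(timeout=0.05):
                conn = key.fileobj
                try:
                    data = conn.recv(4096)
                except BlockingIOError:
                    continue  # spurious wakeup, nothing to read yet
                except OSError:
                    data = b""
                if not data:  # peer closed the connection
                    sel.unregister(conn)
                    conn.close()
                    del buffers[conn]
                    continue
                buffers[conn] += data
                # A "full request" in this toy is one newline-terminated line.
                while b"\n" in buffers[conn]:
                    request, buffers[conn] = buffers[conn].split(b"\n", 1)
                    try:
                        self.service_queue.put_nowait((conn, request))
                    except queue.Full:
                        conn.sendall(b"BUSY\n")  # queue overflow: reject

    def _service(self):
        # Service threads do the actual work for each complete request.
        while True:
            conn, request = self.service_queue.get()
            try:
                conn.sendall(self.handler(request) + b"\n")
            except OSError:
                pass  # client went away before the reply was sent
```

In this model the acceptor does almost nothing per connection and the reactor
only moves bytes, so the handler running on the service threads is where the
time goes. Adding acceptors would not change that.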

I'd also suggest running 'top -H -p $(pgrep kudu-tserver)' to see the
thread activity during the workload. You can see if one of the reactor
threads is hitting 100% CPU, for example, though I've never seen that to be
a bottleneck. David's pointers are probably good places to start
investigating.
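
If you would rather capture that than watch it interactively, something along
these lines could work. It is only a sketch: busy_threads and
sample_tserver_threads are names I made up, and the parser assumes top's
default column layout (PID first, %CPU ninth, COMMAND last) and a locale that
prints a decimal point.

```python
import subprocess


def busy_threads(top_output, threshold=90.0):
    """Return (tid, cpu_percent, thread_name) rows at or above threshold.

    Expects the text printed by `top -H -b -n 1 -p <pid>` with the default
    columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND.
    """
    rows = []
    for line in top_output.splitlines():
        # Split into at most 12 fields so names containing spaces stay whole.
        fields = line.split(None, 11)
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # summary lines, header row, blank lines
        cpu = float(fields[8])
        if cpu >= threshold:
            rows.append((int(fields[0]), cpu, fields[11].strip()))
    return sorted(rows, key=lambda row: row[1], reverse=True)


def sample_tserver_threads(threshold=90.0):
    """Take one batch-mode sample of the kudu-tserver threads (Linux only)."""
    pid = subprocess.check_output(["pgrep", "kudu-tserver"], text=True).split()[0]
    out = subprocess.check_output(
        ["top", "-H", "-b", "-n", "1", "-p", pid], text=True)
    return busy_threads(out, threshold)
```

A single batch sample can be unrepresentative, so it is worth calling
sample_tserver_threads() several times while the slow query is running.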

-Todd

On Fri, Apr 28, 2017 at 1:41 PM, David Alves <davidral...@gmail.com> wrote:

> Hi
>
>   The acceptor thread only distributes work, so it's very unlikely to be a
> bottleneck. The same goes for the number of workers, since the number of
> threads pulling data is defined by Impala.
>   What is "extremely" slow in this case?
>
>   Some things to check:
>   It seems like this is scanning only 5 tablets? Are those all the tablets
> on each tablet server? Are the tablets roughly the same size?
>   Are you using encoding/compression?
>   How much data per tablet?
>   Have you run "compute stats" in Impala?
>
> Best
> David
>
>
>
> On Fri, Apr 28, 2017 at 9:07 AM, 기준 <0ctopus13pr...@gmail.com> wrote:
>
>> Hi!
>>
>> I'm using Kudu 1.3 and Impala 2.7.
>>
>> I'm investigating extremely slow scan reads that show up in Impala's
>> profiling.
>>
>> So I dug into the Impala and Kudu source code.
>>
>> I concluded that this is a connection throughput problem.
>>
>> As far as I can tell, Impala uses the steps below to send scan requests
>> to Kudu.
>>
>> 1. RunScannerThread -> Create new scan threads
>> 2. ProcessScanToken -> Open
>> 3. KuduScanner:GetNext
>> 4. Send Scan RPC -> Send scan rpc continuously
>>
>> So I checked Kudu's RPC configuration.
>>
>> --rpc_num_acceptors_per_address=1
>> --rpc_num_service_threads=20
>> --rpc_service_queue_length=50
>>
>>
>> Here are my questions.
>>
>> 1. Does the acceptor accept all RPC requests and pass them to the proper
>> service? That is, Scan RPC -> Acceptor -> RpcService?
>>
>> 2. If I want to increase input throughput, should I increase
>> '--rpc_num_service_threads'?
>>
>> 3. Why does '--rpc_num_acceptors_per_address' have such a small value
>> compared to '--rpc_num_service_threads'? I'm planning to increase that
>> value too. Do you think this is a bad idea? If so, could you please
>> explain why?
>>
>> Thanks for replying!
>>
>> Have a nice day~ :)
>>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
