[ https://issues.apache.org/jira/browse/IMPALA-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Smith updated IMPALA-11400:
-----------------------------------
    Fix Version/s:     (was: Impala 4.3.0)

> Kudu scan bottleneck due to sharing a single Kudu client for multiple tablet
> scans
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-11400
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11400
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.1.0
>            Reporter: Sameera Wijerathne
>            Priority: Major
>              Labels: performance
>         Attachments: 0.JPG, 1.JPG, 2-1.jpeg, 2.JPG, 2.jpeg, 3.JPG, 4.JPG,
> 5.JPG, Impala_1.png, Impala_2.png, Kudu_1.png, Kudu_2.png, WhatsApp Image
> 2022-06-07 at 10.39.27 PM.jpeg
>
>
> This issue was observed when Impala queries large datasets that reside in
> Kudu. Even when a single ImpalaD scans multiple Kudu tablets, it is slow to
> retrieve data, even though ImpalaD issues the scans in parallel. The reason is
> that ImpalaD uses only a single Kudu client for multiple scans, and
> KuduScanner::NextBatch runs on a single thread. As a result, the client's RPC
> reactor thread uses up to a single core and bottlenecks all parallel scans.
> Because of this behaviour, Impala clusters that scan Kudu cannot be scaled
> vertically to the maximum performance/cores of a node.
> Please refer to the screenshots from the Kudu Slack channel for more
> information.
>
> !2-1.jpeg|width=717,height=961!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org