Hi all,

I’m working on applying our orc-support patch to the latest code base (
IMPALA-5717 <https://issues.apache.org/jira/browse/IMPALA-5717>). Since our
patch is based on cdh-5.7.3-release, which was released a year ago,
there is a lot of work involved in merging it.


One of the biggest changes since cdh-5.7.3-release that I have noticed is
the new scan node & scanner model introduced in IMPALA-3902
<https://issues.apache.org/jira/browse/IMPALA-3902>. I think it was inspired
by the investigation task in IMPALA-2849
<https://issues.apache.org/jira/browse/IMPALA-2849>, but I cannot find any
performance report in that issue. Could you share any reports on this
multi-threading refactoring?


I’m wondering how much this can improve performance, since the old model
(a single-threaded scan node with multi-threaded scanners) already provided
concurrent IO for reading, and most OLAP queries are IO bound.


Thanks,

Quanlong
