[ https://issues.apache.org/jira/browse/IMPALA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-9302:
----------------------------------
    Affects Version/s: Impala 3.3.0

> Multithreaded scanners don't check for filter effectiveness
> -----------------------------------------------------------
>
>                 Key: IMPALA-9302
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9302
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: multithreading, performance
>
> This can be reproduced for TPC-H Q9. I saw this on scale factor 30 locally,
> where the mt_dop=4 version of the query uses a lot more CPU in the scan than
> the mt_dop=0 version.
> This turns out to be because none of the runtime filters are getting
> disabled, not even the ineffective ones.
> {noformat}
> Filter 2 (16.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 30.97M (30970695)
>    - Rows rejected: 0 (0)
>    - Rows total: 31.01M (31009074)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 4 (8.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 30.97M (30970695)
>    - Rows rejected: 0 (0)
>    - Rows total: 31.01M (31009074)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 5 (8.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 30.97M (30970695)
>    - Rows rejected: 0 (0)
>    - Rows total: 31.01M (31009074)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 8 (1.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 31.01M (31009074)
>    - Rows rejected: 0 (0)
>    - Rows total: 31.01M (31009074)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 10 (1.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 31.01M (31009074)
>    - Rows rejected: 29.32M (29317263)
>    - Rows total: 31.01M (31009074)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> {noformat}
> In contrast here are the filters for mt_dop=0, where not all the rows are
> processed.
> {noformat}
> Filter 2 (16.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 8.18M (8180257)
>    - Rows rejected: 0 (0)
>    - Rows total: 180.00M (179998372)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 4 (8.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 8.18M (8180257)
>    - Rows rejected: 0 (0)
>    - Rows total: 180.00M (179998372)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 5 (8.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 8.18M (8180257)
>    - Rows rejected: 0 (0)
>    - Rows total: 180.00M (179998372)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 8 (1.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 8.41M (8406914)
>    - Rows rejected: 0 (0)
>    - Rows total: 180.00M (179998372)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> Filter 10 (1.00 MB):
>    - Files processed: 0 (0)
>    - Files rejected: 0 (0)
>    - Files total: 0 (0)
>    - RowGroups processed: 0 (0)
>    - RowGroups rejected: 0 (0)
>    - RowGroups total: 0 (0)
>    - Rows processed: 180.00M (179998372)
>    - Rows rejected: 170.18M (170177099)
>    - Rows total: 180.00M (179998372)
>    - Splits processed: 0 (0)
>    - Splits rejected: 0 (0)
>    - Splits total: 0 (0)
> {noformat}
> Perf top showed 28% of CPU time in impala::BloomFilter::BucketFindAVX2, which
> corroborates this.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
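The row counters quoted in the issue can be turned into per-filter reject ratios to show which filters an effectiveness check would be expected to disable. The following is only a sketch: `FilterStats`, `ineffective_filters`, and the 10% `MIN_REJECT_RATIO` threshold are hypothetical names and values chosen for illustration, not taken from the issue or from Impala's code. Only the counter values come from the profiles above.

```python
from dataclasses import dataclass


@dataclass
class FilterStats:
    """Row-level counters for one runtime filter (hypothetical container)."""
    filter_id: int
    rows_processed: int
    rows_rejected: int
    rows_total: int

    @property
    def reject_ratio(self) -> float:
        # Fraction of the rows checked against the filter that it rejected.
        if self.rows_processed == 0:
            return 0.0
        return self.rows_rejected / self.rows_processed

    @property
    def processed_fraction(self) -> float:
        # Fraction of all rows that were actually checked against the filter.
        if self.rows_total == 0:
            return 0.0
        return self.rows_processed / self.rows_total


# Assumed threshold for illustration only; not a value stated in the issue.
MIN_REJECT_RATIO = 0.1


def ineffective_filters(stats, min_reject_ratio=MIN_REJECT_RATIO):
    """Return ids of filters that were evaluated but rejected too few rows."""
    return [
        s.filter_id
        for s in stats
        if s.rows_processed > 0 and s.reject_ratio < min_reject_ratio
    ]


# Counters copied from the mt_dop=4 profile in the issue.
MT_DOP_4 = [
    FilterStats(2, 30970695, 0, 31009074),
    FilterStats(4, 30970695, 0, 31009074),
    FilterStats(5, 30970695, 0, 31009074),
    FilterStats(8, 31009074, 0, 31009074),
    FilterStats(10, 31009074, 29317263, 31009074),
]

# Counters copied from the mt_dop=0 profile in the issue.
MT_DOP_0 = [
    FilterStats(2, 8180257, 0, 179998372),
    FilterStats(4, 8180257, 0, 179998372),
    FilterStats(5, 8180257, 0, 179998372),
    FilterStats(8, 8406914, 0, 179998372),
    FilterStats(10, 179998372, 170177099, 179998372),
]

if __name__ == "__main__":
    for label, stats in (("mt_dop=4", MT_DOP_4), ("mt_dop=0", MT_DOP_0)):
        print(label, "ineffective filter ids:", ineffective_filters(stats))
        for s in stats:
            print(
                f"  filter {s.filter_id}: "
                f"reject ratio {s.reject_ratio:.3f}, "
                f"processed {s.processed_fraction:.1%} of rows"
            )
```

Under these assumptions, filters 2, 4, 5 and 8 reject nothing in either run and only filter 10 is selective. The difference between the runs is the processed fraction: with mt_dop=0 the non-selective filters were checked against under 5% of the rows, while with mt_dop=4 they were checked against nearly all of them, which is consistent with the extra Bloom filter CPU time reported.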