[ https://issues.apache.org/jira/browse/DRILL-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462895#comment-16462895 ]
Vitalii Diravka commented on DRILL-6374:
----------------------------------------

[~kkhatua] I don't think we need to revert this commit (at least for now). I believe it is possible to solve this. The logical plan after applying the TRANSITIVE_CLOSURE rules is better than the plan before applying them:
{code}
2018-05-03 20:00:16,187 [2514bfe0-2670-e7db-484b-72cb30123640:foreman] ERROR o.a.d.e.p.s.h.DefaultSqlHandler - Before transitive closure:
LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[DESC])
  LogicalAggregate(group=[{0, 1}], sum_profit=[SUM($2)])
    LogicalProject(nation=[$22], o_year=[EXTRACT(FLAG(YEAR), $19)], amount=[-(*($10, -(1, $11)), *($16, $12))])
      LogicalFilter(condition=[AND(=($4, $7), =($14, $7), =($15, $8), =($1, $8), =($18, $9), =($5, $21), LIKE($2, '%yellow%'))])
        LogicalJoin(condition=[true], joinType=[inner])
          LogicalJoin(condition=[true], joinType=[inner])
            LogicalJoin(condition=[true], joinType=[inner])
              LogicalJoin(condition=[true], joinType=[inner])
                LogicalJoin(condition=[true], joinType=[inner])
                  EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/part]])
                  EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/supplier]])
                EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/lineitem]])
              EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/partsupp]])
            EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/orders]])
          EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/nation]])

2018-05-03 20:00:16,256 [2514bfe0-2670-e7db-484b-72cb30123640:foreman] ERROR o.a.d.e.p.s.h.DefaultSqlHandler - After transitive closure:
LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[DESC])
  LogicalAggregate(group=[{0, 1}], sum_profit=[SUM($2)])
    LogicalProject(nation=[$22], o_year=[EXTRACT(FLAG(YEAR), $19)], amount=[-(*($10, -(1, $11)), *($16, $12))])
      LogicalJoin(condition=[=($5, $21)], joinType=[inner])
        LogicalJoin(condition=[=($18, $9)], joinType=[inner])
          LogicalJoin(condition=[AND(=($14, $7), =($15, $8))], joinType=[inner])
            LogicalJoin(condition=[AND(=($4, $7), =($1, $8))], joinType=[inner])
              LogicalJoin(condition=[true], joinType=[inner])
                LogicalFilter(condition=[LIKE($2, '%yellow%')])
                  EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/part]])
                EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/supplier]])
              EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/lineitem]])
            EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/partsupp]])
          EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/orders]])
        EnumerableTableScan(table=[[dfs, /tpch_sf100/tpch_sf100_parquet/nation]])
{code}
But for some reason _DrillPushProjectIntoScanRule_ is not applied, so the final plan contains a lot of Project operators and Drill scans more files than it should. I will investigate this in detail and try to solve it.

Also, a question: is it possible to add test cases to our test framework to catch these kinds of performance regressions automatically?

> TPCH Queries regressed and OOM when run concurrency test
> --------------------------------------------------------
>
>                 Key: DRILL-6374
>                 URL: https://issues.apache.org/jira/browse/DRILL-6374
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.14.0
>         Environment: RHEL 7
>            Reporter: Dechang Gu
>            Assignee: Vitalii Diravka
>            Priority: Critical
>             Fix For: 1.14.0
>
>         Attachments: TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json,
> TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json
>
>
> Ran the TPCH regression test on Apache Drill 1.14.0 master commit
> 6fcaf4268eddcb09010b5d9c5dfb3b3be5c3f903 (DRILL-6173); most of the queries
> regressed.
> In particular, TPC-H Query 9 takes about 4x as long (36 sec vs 8.6 sec)
> compared with the run against the parent commit
> (9173308710c3decf8ff745493ad3e85ccdaf7c37).
> Further, in the concurrency test for this commit, with 48 clients each running
> 16 TPCH queries (768 queries executed in total) and
> planner.width.max_per_node=5, some queries hit OOM and caused 273 queries
> to fail, while for the parent commit all 768 queries completed
> successfully.
>
> Profiles for TPCH_09 in the regression tests are uploaded:
> * The failing commit file name:
> [^TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json],
> * The parent commit file name:
> [^TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)