[ 
https://issues.apache.org/jira/browse/DRILL-7183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820569#comment-16820569
 ] 

ASF GitHub Bot commented on DRILL-7183:
---------------------------------------

HanumathRao commented on pull request #1755: DRILL-7183: TPCDS query 10, 35, 69 
take longer with sf 1000 when Statistics are disabled.
URL: https://github.com/apache/drill/pull/1755
 
 
   This PR reverts the changes made for DRILL-6997. Those changes fix a join 
planning issue observed when a semi join is present in the query, but they 
also cause other queries that are candidates for a semi join not to pick it. 
This behavior is causing regressions in TPCDS queries 10, 35, and 69. 
   
   @amansinha100  Please review this PR.
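
   For readers unfamiliar with the term, the toy sketch below illustrates 
what a semi join computes for an `EXISTS` subquery, compared with a plain 
inner join. It is only an illustration of the semantics, not Drill's planner 
or operator code; the helper names and the sample rows are hypothetical.

```python
# Toy illustration of semi-join semantics (hypothetical helpers, not Drill code).

def inner_join(left, right, left_key, right_key):
    """Emit one output row per matching (left, right) pair."""
    return [l for l in left for r in right if l[left_key] == r[right_key]]


def semi_join(left, right, left_key, right_key):
    """Emit each left row at most once if any right row matches (EXISTS)."""
    right_keys = {r[right_key] for r in right}
    return [l for l in left if l[left_key] in right_keys]


# Made-up sample rows; column names follow the TPCDS query in the issue.
customers = [{"c_customer_sk": 1}, {"c_customer_sk": 2}, {"c_customer_sk": 3}]
store_sales = [
    {"ss_customer_sk": 1},
    {"ss_customer_sk": 1},  # customer 1 has two sales
    {"ss_customer_sk": 3},
]

inner_rows = inner_join(customers, store_sales, "c_customer_sk", "ss_customer_sk")
semi_rows = semi_join(customers, store_sales, "c_customer_sk", "ss_customer_sk")
```

   In this sketch the inner join returns customer 1 twice (once per matching 
sale), while the semi join returns each qualifying customer once, which is 
the result an `EXISTS` predicate asks for.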
 


> TPCDS query 10, 35, 69 take longer with sf 1000 when Statistics are disabled
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-7183
>                 URL: https://issues.apache.org/jira/browse/DRILL-7183
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Robert Hou
>            Assignee: Hanumath Rao Maduri
>            Priority: Blocker
>             Fix For: 1.16.0
>
>
> Query 69 runs 150% slower when Statistics are disabled.  Here is the query:
> {noformat}
> SELECT
>   cd_gender,
>   cd_marital_status,
>   cd_education_status,
>   count(*) cnt1,
>   cd_purchase_estimate,
>   count(*) cnt2,
>   cd_credit_rating,
>   count(*) cnt3
> FROM
>   customer c, customer_address ca, customer_demographics
> WHERE
>   c.c_current_addr_sk = ca.ca_address_sk AND
>     ca_state IN ('KY', 'GA', 'NM') AND
>     cd_demo_sk = c.c_current_cdemo_sk AND
>     exists(SELECT *
>            FROM store_sales, date_dim
>            WHERE c.c_customer_sk = ss_customer_sk AND
>              ss_sold_date_sk = d_date_sk AND
>              d_year = 2001 AND
>              d_moy BETWEEN 4 AND 4 + 2) AND
>     (NOT exists(SELECT *
>                 FROM web_sales, date_dim
>                 WHERE c.c_customer_sk = ws_bill_customer_sk AND
>                   ws_sold_date_sk = d_date_sk AND
>                   d_year = 2001 AND
>                   d_moy BETWEEN 4 AND 4 + 2) AND
>       NOT exists(SELECT *
>                  FROM catalog_sales, date_dim
>                  WHERE c.c_customer_sk = cs_ship_customer_sk AND
>                    cs_sold_date_sk = d_date_sk AND
>                    d_year = 2001 AND
>                    d_moy BETWEEN 4 AND 4 + 2))
> GROUP BY cd_gender, cd_marital_status, cd_education_status,
>   cd_purchase_estimate, cd_credit_rating
> ORDER BY cd_gender, cd_marital_status, cd_education_status,
>   cd_purchase_estimate, cd_credit_rating
> LIMIT 100;
> {noformat}
> This regression is caused by commit 982e98061e029a39f1c593f695c0d93ec7079f0d. 
>  This commit should be reverted for now.



