comphead opened a new issue, #18487: URL: https://github.com/apache/datafusion/issues/18487
### Is your feature request related to a problem or challenge?

While investigating TPC-DS Q69 correctness issues with Comet (https://github.com/apache/datafusion-comet/issues/2667), I found that Q69, which contains multiple anti joins, performs much worse with sort-merge join (SMJ) than with hash join (HJ).

Modified query:

```
select
  cd_gender,
  cd_marital_status,
  cd_education_status,
  count(*) cnt1,
  cd_purchase_estimate,
  count(*) cnt2,
  cd_credit_rating,
  count(*) cnt3
from
  customer c, customer_address ca, customer_demographics
where
  c.c_current_addr_sk = ca.ca_address_sk
  and ca_state in ('IN', 'VA', 'MS')
  and cd_demo_sk = c.c_current_cdemo_sk
  and exists (select *
              from store_sales, date_dim
              where c.c_customer_sk = ss_customer_sk
                and ss_sold_date_sk = d_date_sk
                and d_year = 2002
                and d_moy between 2 and 2+2)
  and (not exists (select *
                   from web_sales, date_dim
                   where c.c_customer_sk = ws_bill_customer_sk
                     and ws_sold_date_sk = d_date_sk
                     and d_year = 2002
                     and d_moy between 2 and 2+2))
group by
  cd_gender,
  cd_marital_status,
  cd_education_status,
  cd_purchase_estimate,
  cd_credit_rating
order by 1, 2, 3, 4, 5, 6, 7, 8;
```

Timings:

- HJ: Elapsed 9.879 seconds.
- SMJ: Elapsed 59.865 seconds.

This also looks like a likely reason for https://github.com/apache/datafusion-comet/issues/901.

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
