[ 
https://issues.apache.org/jira/browse/HIVE-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956030#comment-17956030
 ] 

Paramvir Singh edited comment on HIVE-28735 at 6/4/25 7:23 AM:
---------------------------------------------------------------

I've found the issue. It seems to be caused by 
https://issues.apache.org/jira/browse/HIVE-25149: errors are not propagated 
back properly from the threads used for loading fast hash tables. 
I have raised a PR to fix this: https://github.com/apache/hive/pull/5845
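
For readers unfamiliar with this class of bug, here is a minimal, generic Java sketch of how an error raised in a loader thread can be lost, and one way to surface it. This is not the Hive code and not the change in the PR. The class name, method names, and the "partition" notion are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Hive.
class LoaderErrorPropagationSketch {

    // Stand-in for loading one slice of a hash table; fails for one slice.
    static void loadPartition(int partition, int failingPartition) {
        if (partition == failingPartition) {
            throw new IllegalStateException("load failed for partition " + partition);
        }
    }

    // Anti-pattern: tasks are submitted but their Futures are never inspected,
    // so an exception thrown inside a worker is captured and silently dropped.
    static boolean loadIgnoringErrors(int partitions, int failingPartition)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        for (int p = 0; p < partitions; p++) {
            final int partition = p;
            pool.submit(() -> loadPartition(partition, failingPartition));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return true; // reports success although one slice was never loaded
    }

    // One possible remedy: keep every Future and call get(), which rethrows
    // the worker's exception wrapped in an ExecutionException.
    static void loadPropagatingErrors(int partitions, int failingPartition)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int p = 0; p < partitions; p++) {
                final int partition = p;
                futures.add(pool.submit(() -> loadPartition(partition, failingPartition)));
            }
            for (Future<?> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException e) {
                    throw new RuntimeException("hash table load failed", e.getCause());
                }
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("ignoring errors -> " + loadIgnoringErrors(4, 2));
        try {
            loadPropagatingErrors(4, 2);
            System.out.println("no error seen");
        } catch (RuntimeException e) {
            System.out.println("propagated: " + e.getCause().getMessage());
        }
    }
}
```

In the first variant the caller continues with a partially loaded table, which would be consistent with wrong results that differ from run to run. Whether this is exactly what happens in the Hive code path is an assumption here.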


was (Author: JIRAUSER308479):
I've found the issue. It seems to be caused by 
https://issues.apache.org/jira/browse/HIVE-25149: errors are not propagated 
back properly from the threads used for loading fast hash tables. 
I have raised a PR to fix this: https://github.com/apache/hive/pull/5406

> TPCDS queries q15, q19 are failing when 
> hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled  is set to 
> true
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28735
>                 URL: https://issues.apache.org/jira/browse/HIVE-28735
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Vectorization
>    Affects Versions: 4.0.0
>            Reporter: Paramvir Singh
>            Priority: Major
>              Labels: hive-4.1.0-must, pull-request-available
>
> TPCDS queries q15 and q19 fail when 
> hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled is set to 
> true. 
> The setup must include a cluster with at least 2 nodes. The queries pass when 
> the cluster has only 1 node. 
> The wrong results are also non-deterministic (each run returns different 
> wrong values).
> Small repro query on the TPCDS dataset:
> {code:java}
> select ca_zip, count(*)
> from catalog_sales_small, customer_small, customer_address_small
> where cs_bill_customer_sk = c_customer_sk
> and c_current_addr_sk = ca_address_sk
> group by ca_zip
> order by ca_zip
> limit 100;
> {code}
> If we set either of the following properties, we get correct results:
> {code:java}
> set hive.vectorized.execution.enabled=false; -- Correct results
> {code}
> OR
> {code:java}
> set hive.auto.convert.join=false; -- Correct results
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
