Mergen created HIVE-27997: ----------------------------- Summary: Incorrect result for Hive join query with NVL and Map Join Key: HIVE-27997 URL: https://issues.apache.org/jira/browse/HIVE-27997 Project: Hive Issue Type: Bug Components: Operators Affects Versions: 3.1.3 Reporter: Mergen
Hive returns incorrect result if there is NVL() in an ON clause with Map Join enabled. STEPS TO REPRODUCE: {code:java} Step 1: Create a table test_nvl create table test_nvl(a string); Step 2: Insert null and non-null data into table test_nvl insert into test_nvl values ('x'), ('y'), (null); select * from test_nvl; +-------------+ | test_nvl.a | +-------------+ | x | | y | | NULL | +-------------+ Step 3 : Execute the following query select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, '');{code} EXPECTED RESULT: {code:java} +-------+-------+ | x.a | y.a | +-------+-------+ | x | x | | y | y | | NULL | NULL | +-------+-------+ {code} ACTUAL RESULT: {code:java} +-------+------+ | x.a | y.a | +-------+------+ | x | x | | y | x | | NULL | x | +-------+------+{code} (Obviously 'y' != 'x' and NULL != 'x' so they should not be in the same line) The query works fine with Map Join disabled: {code:java} -- Using Merge Join instead. set hive.auto.convert.join=false; select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, ''); +-------+-------+ | x.a | y.a | +-------+-------+ | NULL | NULL | | x | x | | y | y | +-------+-------+ {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)