[ https://issues.apache.org/jira/browse/HIVE-27997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mergen updated HIVE-27997: -------------------------- Description: Hive returns incorrect result if there is NVL() in an ON clause with Map Join enabled. STEPS TO REPRODUCE: {code:java} Step 1: Create a table test_nvl create table test_nvl(a string); Step 2: Insert null and non-null data into table test_nvl insert into test_nvl values ('x'), ('y'), (null); select * from test_nvl; +-------------+ | test_nvl.a | +-------------+ | x | | y | | NULL | +-------------+ Step 3 : Execute the following query select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, '');{code} EXPECTED RESULT: {code:java} +-------+-------+ | x.a | y.a | +-------+-------+ | x | x | | y | y | | NULL | NULL | +-------+-------+ {code} ACTUAL RESULT: {code:java} +-------+------+ | x.a | y.a | +-------+------+ | x | x | | y | x | | NULL | x | +-------+------+{code} (Obviously 'y' != 'x' and NULL != 'x' so they should not be in the same line) The query works fine with Map Join disabled: {code:java} -- Using Merge Join instead. set hive.auto.convert.join=false; select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, ''); +-------+-------+ | x.a | y.a | +-------+-------+ | NULL | NULL | | x | x | | y | y | +-------+-------+ {code} was: Hive returns incorrect result if there is NVL() in an ON clause with Map Join enabled. STEPS TO REPRODUCE: {code:java} Step 1: Create a table test_nvl create table test_nvl(a string); Step 2: Insert null and non-null data into table test_nvl insert into test_nvl values ('x'), ('y'), (null); select * from test_nvl; +-------------+ | test_nvl.a | +-------------+ | x | | y | | NULL | +-------------+ Step 3 : Execute the following query select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, '');{code} EXPECTED RESULT: {code:java} +-------+-------+ | x.a | y.a | +-------+-------+ | x | x | | y | y | | NULL | NULL | +-------+-------+ {code} ACTUAL RESULT: {code:java} +-------+------+ | x.a | y.a | +-------+------+ | x | x | | y | x | | NULL | x | +-------+------+{code} (Obviously 'y' != 'x' and NULL != 'x' so they should not be in the same line) The query works fine with Map Join disabled: {code:java} -- Using Merge Join instead. set hive.auto.convert.join=false; select x.a, y.a from test_nvl x left join test_nvl y on nvl(x.a, '') = nvl(y.a, ''); +-------+-------+ | x.a | y.a | +-------+-------+ | NULL | NULL | | x | x | | y | y | +-------+-------+ {code} > Incorrect result for Hive join query with NVL and Map Join > ---------------------------------------------------------- > > Key: HIVE-27997 > URL: https://issues.apache.org/jira/browse/HIVE-27997 > Project: Hive > Issue Type: Bug > Components: Operators > Affects Versions: 3.1.3 > Reporter: Mergen > Priority: Major > > Hive returns incorrect result if there is NVL() in an ON clause with Map Join > enabled. > > STEPS TO REPRODUCE: > {code:java} > Step 1: Create a table test_nvl > create table test_nvl(a string); > Step 2: Insert null and non-null data into table test_nvl > insert into test_nvl values ('x'), ('y'), (null); > select * from test_nvl; > +-------------+ > | test_nvl.a | > +-------------+ > | x | > | y | > | NULL | > +-------------+ > Step 3 : Execute the following query > select x.a, y.a > from test_nvl x > left join test_nvl y > on nvl(x.a, '') = nvl(y.a, '');{code} > > EXPECTED RESULT: > {code:java} > +-------+-------+ > | x.a | y.a | > +-------+-------+ > | x | x | > | y | y | > | NULL | NULL | > +-------+-------+ {code} > > ACTUAL RESULT: > {code:java} > +-------+------+ > | x.a | y.a | > +-------+------+ > | x | x | > | y | x | > | NULL | x | > +-------+------+{code} > (Obviously 'y' != 'x' and NULL != 'x' so they should not be in the same line) > > The query works fine with Map Join disabled: > {code:java} > -- Using Merge Join instead. > set hive.auto.convert.join=false; > select x.a, y.a > from test_nvl x > left join test_nvl y > on nvl(x.a, '') = nvl(y.a, ''); > +-------+-------+ > | x.a | y.a | > +-------+-------+ > | NULL | NULL | > | x | x | > | y | y | > +-------+-------+ {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)