[jira] [Updated] (HIVE-15272) "LEFT OUTER JOIN" Is not populating different records with Hive On Spark
[ https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikash Pareek updated HIVE-15272:
--------------------------------
    Description:
I ran the following Hive query multiple times, with Hive on Spark and with Hive on MapReduce as the execution engine.
{code}
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2 ON (t1.id = t2.id
                                       AND t1.name = t2.name)
{code}
With Hive on Spark: the result (count) was different on every execution.
With Hive on MapReduce: the result (count) was the same on every execution.
It seems that Hive on Spark behaves differently on each execution and does not return the correct result.

  was:
I ran the following Hive query multiple times, with Hive on Spark and with Hive on MapReduce as the execution engine.
{code}
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2 ON (t1.id = t2.id
                                       AND t1.name = t2.name)
{code}
With Hive on Spark: the result (count) was different on every execution.
With Hive on MapReduce: the result (count) was the same on every execution.

> "LEFT OUTER JOIN" Is not populating different records with Hive On Spark
> ------------------------------------------------------------------------
>
>                 Key: HIVE-15272
>                 URL: https://issues.apache.org/jira/browse/HIVE-15272
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 1.1.0
>         Environment: Hive 1.1.0, CentOS, Cloudera 5.7.4
>            Reporter: Vikash Pareek

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15272) "LEFT OUTER JOIN" Is not populating different records with Hive On Spark
[ https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikash Pareek updated HIVE-15272:
--------------------------------
    Description:
I ran the following Hive query multiple times, with Hive on Spark and with Hive on MapReduce as the execution engine.
{code}
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2 ON (t1.id = t2.id
                                       AND t1.name = t2.name)
{code}
With Hive on Spark: the result (count) was different on every execution.
With Hive on MapReduce: the result (count) was the same on every execution.

  was:
The following query returns a different result every time I run it with Hive on Spark:
{code}
SELECT COUNT(*)
FROM
  (SELECT DISTINCT mt1.name,
                   mt1.id
   FROM
     (SELECT mt1.*,
             mt2.region,
             mt2.,
             regexp_replace(mt2.tr_dat,"\\.","") AS TRANSACTION_DATE
      FROM my_database.my_table1 mt1
      LEFT OUTER JOIN my_database.my_table2 mt2 ON (mt1.id=mt2.id
                                                    AND mt1.name = mt2.name))t6)A;
{code}
But the same query returns the same result with Hive on MapReduce every time.
[jira] [Updated] (HIVE-15272) "LEFT OUTER JOIN" Is not populating different records with Hive On Spark
[ https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-15272:
------------------------------
    Description:
The following query returns a different result every time I run it with Hive on Spark:
{code}
SELECT COUNT(*)
FROM
  (SELECT DISTINCT mt1.name,
                   mt1.id
   FROM
     (SELECT mt1.*,
             mt2.region,
             mt2.,
             regexp_replace(mt2.tr_dat,"\\.","") AS TRANSACTION_DATE
      FROM my_database.my_table1 mt1
      LEFT OUTER JOIN my_database.my_table2 mt2 ON (mt1.id=mt2.id
                                                    AND mt1.name = mt2.name))t6)A;
{code}
But the same query returns the same result with Hive on MapReduce every time.

  was:
The following query returns a different result every time I run it with Hive on Spark:
"
SELECT COUNT(*)
FROM
  (SELECT DISTINCT mt1.name,
                   mt1.id
   FROM
     (SELECT mt1.*,
             mt2.region,
             mt2.,
             regexp_replace(mt2.tr_dat,"\\.","") AS TRANSACTION_DATE
      FROM my_database.my_table1 mt1
      LEFT OUTER JOIN my_database.my_table2 mt2 ON (mt1.id=mt2.id
                                                    AND mt1.name = mt2.name))t6)A;
"
But the same query returns the same result with Hive on MapReduce every time.
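
For anyone trying to reproduce this, the comparison described in the report could be run in a single session roughly as sketched below. This is only a sketch under assumptions: the table names are the placeholders from the report, and the final baseline query is not part of the original report. The baseline relies on the fact that a LEFT OUTER JOIN keeps every row of the left table and that DISTINCT discards any duplicates the join introduces, so a count over columns of t1 alone would be expected to match a count taken directly from my_table1. Whichever engine disagrees with that baseline, or with itself across runs, is the one worth investigating.

```sql
-- Run the reported query on Spark several times and note each count.
SET hive.execution.engine=spark;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2 ON (t1.id = t2.id
                                       AND t1.name = t2.name);

-- Run the same query on MapReduce for comparison.
SET hive.execution.engine=mr;
SELECT COUNT(DISTINCT t1.region, t1.amount)
FROM my_db.my_table1 t1
LEFT OUTER JOIN my_db.my_table2 t2 ON (t1.id = t2.id
                                       AND t1.name = t2.name);

-- Assumed baseline (not in the original report): the join should not change
-- the set of distinct (region, amount) pairs taken from the left table.
SELECT COUNT(DISTINCT region, amount)
FROM my_db.my_table1;
```

If the MapReduce count matches the baseline and the Spark counts do not, that would support the reporter's observation; it would not by itself identify the cause.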