Jet Guo created HIVE-9052:
-----------------------------
Summary: Missing grouping rows when multi-insert
Key: HIVE-9052
URL: https://issues.apache.org/jira/browse/HIVE-9052
Project: Hive
Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Jet Guo
Given a table and data as below:
create table score (class string, student string, score int) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',' ;
------------------Data---------------
class1,Jack,7
class1,Mike,8
class2,Tom,7
The following HQL
from score INSERT OVERWRITE DIRECTORY '/tmp/dpp/hql1' select
class, student, count(score) group by class, student grouping sets ((class),
(class, student))
produces this result:
----------hql1--------------
class1\N2
class1Jack1
class1Mike1
class2\N1
class2Tom1
And the following HQL
from score INSERT OVERWRITE DIRECTORY '/tmp/dpp/hql2' select
class, student, sum(score) group by class, student grouping sets ((class),
(class, student))
produces this result:
----------hql2--------------
class1\N15
class1Jack7
class1Mike8
class2\N7
class2Tom7
However, if you run both inserts in a single multi-insert statement:
from score
INSERT OVERWRITE DIRECTORY '/tmp/dpp/hql1' select class, student,
count(score) group by class, student grouping sets ((class), (class, student))
INSERT OVERWRITE DIRECTORY '/tmp/dpp/hql2' select class, student,
sum(score) group by class, student grouping sets ((class), (class, student))
the results are missing the (class) grouping rows, as shown below:
----------hql1--------------
class1Jack1
class1Mike1
class2Tom1
----------hql2--------------
class1Jack7
class1Mike8
class2Tom7
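For reference, the rows each insert is expected to produce can be recomputed outside Hive. The sketch below is only an illustration of grouping-sets semantics on the sample data, not Hive's implementation; the function name grouping_sets and the use of len as a stand-in for count(score) (valid here because no score is NULL) are assumptions made for this example. None plays the role of \N.

```python
from collections import defaultdict

# Sample rows from the report: (class, student, score)
ROWS = [
    ("class1", "Jack", 7),
    ("class1", "Mike", 8),
    ("class2", "Tom", 7),
]

COLUMNS = ("class", "student")
# Equivalent of: grouping sets ((class), (class, student))
GROUPING_SETS = (("class",), ("class", "student"))


def grouping_sets(rows, agg):
    """Aggregate scores once per grouping set (hypothetical helper).

    Columns that are not part of the current grouping set are
    replaced by None, which Hive writes out as \\N.
    """
    groups = defaultdict(list)
    for cls, student, score in rows:
        record = {"class": cls, "student": student}
        for gset in GROUPING_SETS:
            key = tuple(record[c] if c in gset else None for c in COLUMNS)
            groups[key].append(score)
    result = [(k[0], k[1], agg(v)) for k, v in groups.items()]
    # Sort so that the (class) subtotal row precedes its student rows.
    return sorted(result, key=lambda r: (r[0], r[1] or ""))


if __name__ == "__main__":
    for name, agg in (("hql1", len), ("hql2", sum)):
        print("----------" + name + "--------------")
        for row in grouping_sets(ROWS, agg):
            print(row)
```

Both aggregates yield five rows, including the per-class rows where student is None. Those are the rows absent from the multi-insert output above.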