Eugene Koifman created HIVE-7817:
------------------------------------
Summary: distinct/group by don't work on partition columns
Key: HIVE-7817
URL: https://issues.apache.org/jira/browse/HIVE-7817
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.14.0
Reporter: Eugene Koifman
suppose you have a table like this:
{code:sql}
CREATE TABLE page_view(
viewTime INT,
userid BIGINT,
page_url STRING,
referrer_url STRING,
ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) INTO 4 BUCKETS
{code:sql}
Then
{code:sql}
select distinct dt from page_view;
select distinct dt, country from page_view;
select dt, country from page_view group by dt, country;
{code:sql}
all fail with
{noformat}
Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-08-20 17:26:13,018 Stage-1 map = 0%, reduce = 0%
Ended Job = job_local165359429_0013 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
{noformat}
but
{code:sql}
select dt, country, count(*) from page_view group by dt, country;
{code:sql}
works fine.
--
This message was sent by Atlassian JIRA
(v6.2#6252)