[jira] [Created] (HIVE-7817) distinct/group by don't work on partition columns

Eugene Koifman (JIRA) Wed, 20 Aug 2014 17:30:12 -0700

Eugene Koifman created HIVE-7817:
------------------------------------

             Summary: distinct/group by don't work on partition columns
                 Key: HIVE-7817
                 URL: https://issues.apache.org/jira/browse/HIVE-7817
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.14.0
            Reporter: Eugene Koifman



Suppose you have a table like this:
{code:sql}
CREATE TABLE page_view(
        viewTime INT,
        userid BIGINT,
        page_url STRING,
        referrer_url STRING,
        ip STRING COMMENT 'IP Address of the User')
COMMENT 'This is the page view table'
PARTITIONED BY(dt STRING, country STRING)
CLUSTERED BY(userid) INTO 4 BUCKETS;
{code}

Then 
{code:sql}
select distinct dt from page_view;
select distinct dt, country from page_view;
select dt, country from page_view group by dt, country;
{code}

all fail with

{noformat}
Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Job running in-process (local Hadoop)
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2014-08-20 17:26:13,018 Stage-1 map = 0%,  reduce = 0%
Ended Job = job_local165359429_0013 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: 
Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
{noformat}

but 
{code:sql}
select dt, country, count(*) from page_view group by dt, country;
{code}

works fine.
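
As a possible stopgap while the queries above fail, the partition values can be read from the metastore instead of through a MapReduce job. This is only a sketch and has not been checked against the affected build; it assumes that listing registered partitions is an acceptable substitute for {{select distinct}}, and the date literal below is a made-up example value.

{code:sql}
-- List every registered partition of page_view, one per line,
-- in the form dt=<value>/country=<value>.
SHOW PARTITIONS page_view;

-- Restrict the listing to a single dt value (the literal is hypothetical).
SHOW PARTITIONS page_view PARTITION(dt='2014-08-20');
{code}

Note that this lists partitions known to the metastore, including empty ones, so the result can differ from what {{select distinct}} would return over the data.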



--
This message was sent by Atlassian JIRA
(v6.2#6252)
