[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919023#action_12919023 ]
Namit Jain commented on HIVE-474:
---------------------------------

Once HIVE-537 is committed, the general idea follows the example listed in HIVE-537.

Say the query is:

select a, count(distinct b), count(distinct c) from T group by a

and the data is:

a1 b1 c1
a1 b1 c2
a1 b2 c2
a1 b2 c1
a2 ...

The mapper will emit a union type, one record per distinct column for each input row:

a1 0:b1
a1 1:c1
a1 0:b1
a1 1:c2
a1 0:b2
a1 1:c2
a1 0:b2
a1 1:c1

Since the sort key is (a, union_tag, (b|c)), the data will arrive at the reducer in the following order:

a1 0:b1
a1 0:b1
a1 0:b2
a1 0:b2
a1 1:c1
a1 1:c1
a1 1:c2
a1 1:c2

Equal values are now adjacent within each tag, so the reducer can stream the distincts.

> Support for distinct selection on two or more columns
> -----------------------------------------------------
>
> Key: HIVE-474
> URL: https://issues.apache.org/jira/browse/HIVE-474
> Project: Hadoop Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Alexis Rondeau
> Assignee: Amareshwari Sriramadasu
> Attachments: hive-474.0.4.2rc.patch
>
>
> The ability to select distinct several, individual columns as by example:
> select count(distinct user), count(distinct session) from actions;
> Currently returns the following failure:
> FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user
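The streaming scheme described in the comment above can be illustrated with a small Python sketch. This is not Hive's implementation: the function names map_rows and reduce_sorted are hypothetical, a plain (a, tag, value) tuple stands in for the union type, sorted() stands in for the shuffle/sort phase, and NULL handling is ignored. Only the four a1 rows from the example are used.

```python
from itertools import groupby


def map_rows(rows):
    """Emit one (a, tag, value) record per distinct column: tag 0 = b, tag 1 = c."""
    for a, b, c in rows:
        yield (a, 0, b)
        yield (a, 1, c)


def reduce_sorted(records, num_tags=2):
    """Stream over records sorted by (a, tag, value) and count distinct values per tag.

    Because equal values are adjacent after sorting, only the previous
    (tag, value) pair has to be remembered; no per-group hash set is needed.
    """
    for a, group in groupby(records, key=lambda r: r[0]):
        counts = [0] * num_tags
        last = None  # previous (tag, value) pair seen in this group
        for _, tag, value in group:
            if (tag, value) != last:
                counts[tag] += 1
                last = (tag, value)
        yield (a, *counts)


# The four a1 rows from the example in the comment.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a1", "b2", "c2"),
    ("a1", "b2", "c1"),
]

# sorted() plays the role of the shuffle: sort key is (a, union_tag, value).
shuffled = sorted(map_rows(rows))
result = list(reduce_sorted(shuffled))
print(result)
```

For group a1 the sketch counts two distinct values of b and two of c, and the reducer keeps only constant state per group.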