[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068095#comment-14068095 ]
Lefty Leverenz commented on HIVE-4590:
--------------------------------------

[~eugene.koifman], it's past time to fix this but first I have a couple of questions:

# Why does the equivalent SELECT statement say "col1" while the description says "an integer in the second column"? Does this assume column numbers start with zero?
#* "select col1, count\(*\) from $table group by col1;"
I tried to figure it out from the MR program, but strained my brain.
# Is there a typo in the output for your sample dataset (1,1,1,3,3,3,5)? I see three 3s, not 2.
#* 1, 3
3, 2,
5, 1
... and presumably the comma after the 2 (or 3) can be removed.

The doc has a new location, by the way:

* [HCat Input and Output -- Read Example | https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-ReadExample]

> HCatalog documentation example is wrong
> ---------------------------------------
>
>                 Key: HIVE-4590
>                 URL: https://issues.apache.org/jira/browse/HIVE-4590
>             Project: Hive
>          Issue Type: Bug
>          Components: Documentation, HCatalog
>    Affects Versions: 0.10.0
>            Reporter: Eugene Koifman
>            Assignee: Lefty Leverenz
>            Priority: Minor
>
> http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example
> reads
> The following very simple MapReduce program reads data from one table which
> it assumes to have an integer in the second column, and counts how many
> different values it sees. That is, it does the equivalent of "select col1,
> count(*) from $table group by col1;".
> The description of the query is wrong. It actually counts how many instances
> of each distinct value it finds. For example, if values of col1 are
> {1,1,1,3,3,3,5} it will produce
> 1, 3
> 3, 2,
> 5, 1

--
This message was sent by Atlassian JIRA
(v6.2#6252)
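
For reference, here is a minimal sketch of what "select col1, count(*) from $table group by col1;" would be expected to return for the sample values (1,1,1,3,3,3,5). It is plain Java over an in-memory array, not the HCatalog MapReduce program from the doc, and the class and method names (GroupCountSketch, countByValue) are made up for illustration. It assumes the program really is equivalent to that query.

```java
import java.util.Map;
import java.util.TreeMap;

public class GroupCountSketch {

    // Hypothetical helper: counts how many times each distinct value occurs,
    // i.e. the in-memory analogue of "group by col1" with count(*).
    static Map<Integer, Integer> countByValue(int[] values) {
        // TreeMap keeps keys sorted so the output order is deterministic.
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample values of col1 taken from the issue description.
        int[] col1 = {1, 1, 1, 3, 3, 3, 5};
        for (Map.Entry<Integer, Integer> e : countByValue(col1).entrySet()) {
            System.out.println(e.getKey() + ", " + e.getValue());
        }
    }
}
```

With three 3s in the input, the row for value 3 comes out with a count of 3, which is consistent with the "3, 2," line in the issue description being a typo.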