[
https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068095#comment-14068095
]
Lefty Leverenz commented on HIVE-4590:
--------------------------------------
[~eugene.koifman], it's past time to fix this but first I have a couple of
questions:
# Why does the equivalent SELECT statement say "col1" while the description
says "an integer in the second column"? Does this assume column numbers start
with zero?
#* "select col1, count\(*\) from $table group by col1;"
I tried to figure it out from the MR program, but strained my brain.
# Is there a typo in the output for your sample dataset (1,1,1,3,3,3,5)? I
see three 3s, not 2.
#* 1, 3
3, 2,
5, 1
... and presumably the comma after the 2 (or 3) can be removed.
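For reference, here is a minimal Python sketch of what I would expect the grouped count to return for that sample dataset. It only mimics the GROUP BY semantics of the SELECT statement and is not the MR program from the doc; the function name count_per_value is made up for illustration.

```python
from collections import Counter

def count_per_value(values):
    """Mimic `select col1, count(*) from $table group by col1` on a list of ints.

    Hypothetical helper: returns (value, count) pairs sorted by value.
    """
    counts = Counter(values)
    # Sort by value so the output order is deterministic.
    return sorted(counts.items())

# Sample dataset from the issue description.
sample = [1, 1, 1, 3, 3, 3, 5]
for value, count in count_per_value(sample):
    print(f"{value}, {count}")
# prints:
# 1, 3
# 3, 3
# 5, 1
```

If that reading is right, the middle line of the documented output should be "3, 3".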
The doc has a new location, by the way:
* [HCat Input and Output -- Read Example |
https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-ReadExample]
> HCatalog documentation example is wrong
> ---------------------------------------
>
> Key: HIVE-4590
> URL: https://issues.apache.org/jira/browse/HIVE-4590
> Project: Hive
> Issue Type: Bug
> Components: Documentation, HCatalog
> Affects Versions: 0.10.0
> Reporter: Eugene Koifman
> Assignee: Lefty Leverenz
> Priority: Minor
>
> http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example
> reads
> The following very simple MapReduce program reads data from one table which
> it assumes to have an integer in the second column, and counts how many
> different values it sees. That is, it does the equivalent of "select col1,
> count(*) from $table group by col1;".
> The description of the query is wrong. It actually counts how many instances
> of each distinct value it finds. For example, if values of col1 are
> {1,1,1,3,3,3,5} it will produce
> 1, 3
> 3, 2,
> 5, 1
>