Yong,
Thanks for your response..
I am getting an error :
FAILED: SemanticException [Error 10002]: Line 11:28 Invalid column
reference 'timestamp_dt'
Sorry I didn't mention earlier, the table is partitioned on "timestamp_dt "
On 2/25/14, 5:57 PM, java8964 wrote:
one query won't work, as totalcount is not in "group by".
You have 2 options:
1) use the sub query
select a.timestamp_dt, a.totalcount/b.total_sum
from daily_count_per_kg_domain a join
(
select timestamp_dt, sum(totalcount) as total_sum
from
daily_count_per_kg_domain
group by timestamp_dt
) b on (a.timestamp_dt = b.timestamp_dt)
2) If you are using hive 11 or above, using windows functions.
Yong
------------------------------------------------------------------------
Date: Tue, 25 Feb 2014 18:27:34 -0600
Subject: Re: hive query to calculate percentage
From: kkrishna...@gmail.com
To: user@hive.apache.org
Modfiy the query to :
select totalcount / sum(totalcount) from daily_count_per_kg_domain
where timestamp_dt = '20140219' group by timestamp_dt;
if you dont specify the where clause, you will get result for all
partitions.
On Tue, Feb 25, 2014 at 3:14 PM, Manish <maa...@gmail.com
<mailto:maa...@gmail.com>> wrote:
I have a partitioned table on timestamp_dt:
> desc daily_count_per_kg_domain;
OK
ddnamesyskg string
totalcount int
timestamp_dt string
hive> select * from daily_count_per_kg_domain;
OK
sys_kg_band 224 20140219
sys_kg_event 3435 20140219
sys_kg_movies 44987 20140219
sys_kg_oly 4172 20140219
sys_kg_sp_countr 5499 20140219
sys_kg_sports 3954 20140219
sys_kg_tv 21387 20140219
sys_kg_venue 152 20140219
sys_kgathlete 9000 20140219
sys_kgpeople 300064 20140219
Looking to compute percentages for each row, (per day):
100* totalcount / sum(totalcount)
Intuitively i tried :
> select totalcount / sum(totalcount) from
daily_count_per_kg_domain where timestamp_dt = '20140219' ;
FAILED: SemanticException [Error 10025]: Line 1:7 Expression not
in GROUP BY key 'totalcount'
I am not sure what group by on totalcount means..
Any ides ?