If it is not so much data, you can use Excel's PivotTable to solve this specific requirement:

1. Select the date and plz columns (w/o header) and create pivotTable on a new WorkSheet
2. Drag the column name: 'Date' from the 'Pivot Table' Builder  to the 'Column Label' section.
3. Drag the column name: 'PLZ' from the 'Pivot Table' Builder  to the  'Row Labels'
4. Drag the column name: 'Date' from the 'Pivot Table' Builder  to the  'Values' and ensure the selected summary function is 'Count'

Hope, this helps,

Thanks,
Selva-


On Oct 10, 2013, at 7:53 AM, Panshul Whisper <ouchwhis...@gmail.com> wrote:

Hello,

I have a data manipulation query.

I have my data in the following format:

Date   PLZ   Count
date1   plz1   count1
date1   plz1   count2
date1   plz1   count3
date1   plz2   count4
date1   plz2   count5
date1   plz3   count6
date1   plz3   count7

date2   plz1   count8
date2   plz1   count9
date2   plz3   count10
date2   plz3   count11

date3   plz1   count12
date3   plz1   count13
date3   plz2   count14
date3   plz2   count15
date3   plz2   count16

date4   plz1   count17
date4   plz2   count18
date4   plz3   count19

With Hive queries I have managed to get the data into the following grouped and aggregated format:

select plz, TO_DATE(time), sum(totalcount) from power_pad_part where
yearfolder='2013' and monthfolder in ('01')
and eco=0 and TO_DATE(time)>='Date1' and TO_DATE(time)<'Date4'
and plz in ('plz1','plz2','plz3')
group by plz, time

PLZ   Date   TotalCount
plz1   date1   TC1
plz1   date2   TC2
plz1   date3   TC3
plz1   date4   TC4

plz2   date1   TC5
plz2   date3   TC6
plz2   date4   TC7

plz3   date1   TC8
plz3   date2   TC9
plz3   date4   TC10

The above data is grouped by plz and date. There is no entry of a plz if it does not exist for a date.
I used the following query to generate the above data:



But I require the data to be in the following format:

PLZ   Date1   Date2   Date3   Date4 ..... DateN .....
plz1   TC1       TC2      TC3       TC4   .....  TC N
plz2   TC5       ------      TC6       TC7   .....  TC N
plz3   TC8       TC9      ------       TC10  ....  TC N


This needs to be generated as 1 column for every date of the year.

Any help to generate this format is welcome. Writing my own mapper and reduce and calling in Hive as function  is to be my last option.

Thanking You,


--
Regards,
Ouch Whisper
010101010101

Reply via email to