RE: Converting rows into dynamic colums in Hive

2012-08-16 Thread richin.jain
You could do it using Pivot table in MS Excel. It's under the Insert tab, first 
option on the left.

Richin

-Original Message-
From: Jain Richin (Nokia-LC/Boston) 
Sent: Thursday, August 09, 2012 4:16 PM
To: user@hive.apache.org
Subject: RE: Converting rows into dynamic colums in Hive

Thanks John. 
Is there a way to do this is excel since I am outputting the table data in csv 
format (using macros or something else)?

Richin

-Original Message-
From: ext John Meagher [mailto:john.meag...@gmail.com]
Sent: Thursday, August 09, 2012 10:11 AM
To: user@hive.apache.org
Subject: Re: Converting rows into dynamic colums in Hive

R is another programming language geared to statistical analysis.  It has some 
features that make it easy to write this sort of data manipulation.  
http://www.r-project.org/

Here's a sample R function that does something similar to what you need (note: 
I'm not a real R programmer, so this might be ugly, but it
works):

abCountsToMatrix - function(csvFilename, outputmatrixfile){

   data - read.csv(csvFilename, as.is=T, check.names=F, header=F)
   cat(Loaded, nrow(data), rows.\n)
   idrows - sort(unique(c(data[ , 1])))
   idcols - sort(unique(c(data[ , 2])))

   cat(Found, length(idrows),  X , length(idcols), .)
   output - array(0, c(length(idrows), length(idcols)))
   rownames(output) - idrows
   colnames(output) - idcols

   for(i in 1:nrow(data)){

 if (i %% 100 == 0) {
   cat(On , i,  of , nrow(data), \n)
 }

 rowidx - which(idrows == data[i, 1])
 colidx - which(idcols == data[i, 2])

 count - data[i,3]

 output[rowidx, colidx] - count
   }


   write.csv(output, file=outputmatrixfile, quote=F)

}



On Wed, Aug 8, 2012 at 8:58 PM,  richin.j...@nokia.com wrote:
 John,

 What is R?

 -Original Message-
 From: ext John Meagher [mailto:john.meag...@gmail.com]
 Sent: Wednesday, August 08, 2012 4:34 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive

 I don't think having dynamic columns is possible in Hive.  I've always output 
 from Hive a structure like your query output and used R to convert it into a 
 dynamic column structure.


 On Wed, Aug 8, 2012 at 3:56 PM,  richin.j...@nokia.com wrote:
 Thanks Ashish, that gives an idea.

 But I am not sure about the outer select loop, I have to know all the 
 values in Beta column beforehand to do a max on each value.

 Is there a better way?



 Richin



 From: ext Ashish Thusoo [mailto:athu...@qubole.com]
 Sent: Tuesday, August 07, 2012 5:05 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive



 you should be able to do this in hive using a group by on alpha and 
 then using a combination of the max and if statement... something on 
 the following lines

 select alpha, max(abc), max(pqr), ...
 (
   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 
 'pqr', Gamma, NUL) as pqr, 
   from table
 )
 group by alpha

 something on those lines...

 Ashish

 On Tue, Aug 7, 2012 at 1:57 PM, richin.j...@nokia.com wrote:

 Hi All,



 One of my Query output looks like-



 AlphaBeta   Gamma

 123 xyz 1.0

 123 abc 0.5

 123 pqr 1.3

 123 def 2.1

 456 xyz 0.1

 456 abc 0.6

 456 pqr 1.9

 456 def 3.2

 456 lmn1.1

 456 sdf  1.2



 I want the output for the data visualization purpose to look like 
 (basically taking rows from one table and making it column in another
 table)



 Alphaxyz abc pqr def lmn
 sdf

 123 1.0  0.5  1.3  2.1

 456 0.1  0.6  1.9  3.2
 1.1  1.2



 Question - Can it be done in Hive? If not, any suggestions.



 Thanks,

 Richin








Re: Converting rows into dynamic colums in Hive

2012-08-09 Thread John Meagher
R is another programming language geared to statistical analysis.  It
has some features that make it easy to write this sort of data
manipulation.  http://www.r-project.org/

Here's a sample R function that does something similar to what you
need (note: I'm not a real R programmer, so this might be ugly, but it
works):

abCountsToMatrix - function(csvFilename, outputmatrixfile){

   data - read.csv(csvFilename, as.is=T, check.names=F, header=F)
   cat(Loaded, nrow(data), rows.\n)
   idrows - sort(unique(c(data[ , 1])))
   idcols - sort(unique(c(data[ , 2])))

   cat(Found, length(idrows),  X , length(idcols), .)
   output - array(0, c(length(idrows), length(idcols)))
   rownames(output) - idrows
   colnames(output) - idcols

   for(i in 1:nrow(data)){

 if (i %% 100 == 0) {
   cat(On , i,  of , nrow(data), \n)
 }

 rowidx - which(idrows == data[i, 1])
 colidx - which(idcols == data[i, 2])

 count - data[i,3]

 output[rowidx, colidx] - count
   }


   write.csv(output, file=outputmatrixfile, quote=F)

}



On Wed, Aug 8, 2012 at 8:58 PM,  richin.j...@nokia.com wrote:
 John,

 What is R?

 -Original Message-
 From: ext John Meagher [mailto:john.meag...@gmail.com]
 Sent: Wednesday, August 08, 2012 4:34 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive

 I don't think having dynamic columns is possible in Hive.  I've always output 
 from Hive a structure like your query output and used R to convert it into a 
 dynamic column structure.


 On Wed, Aug 8, 2012 at 3:56 PM,  richin.j...@nokia.com wrote:
 Thanks Ashish, that gives an idea.

 But I am not sure about the outer select loop, I have to know all the
 values in Beta column beforehand to do a max on each value.

 Is there a better way?



 Richin



 From: ext Ashish Thusoo [mailto:athu...@qubole.com]
 Sent: Tuesday, August 07, 2012 5:05 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive



 you should be able to do this in hive using a group by on alpha and
 then using a combination of the max and if statement... something on
 the following lines

 select alpha, max(abc), max(pqr), ...
 (
   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta ==
 'pqr', Gamma, NUL) as pqr, 
   from table
 )
 group by alpha

 something on those lines...

 Ashish

 On Tue, Aug 7, 2012 at 1:57 PM, richin.j...@nokia.com wrote:

 Hi All,



 One of my Query output looks like-



 AlphaBeta   Gamma

 123 xyz 1.0

 123 abc 0.5

 123 pqr 1.3

 123 def 2.1

 456 xyz 0.1

 456 abc 0.6

 456 pqr 1.9

 456 def 3.2

 456 lmn1.1

 456 sdf  1.2



 I want the output for the data visualization purpose to look like
 (basically taking rows from one table and making it column in another
 table)



 Alphaxyz abc pqr def lmn
 sdf

 123 1.0  0.5  1.3  2.1

 456 0.1  0.6  1.9  3.2
 1.1  1.2



 Question - Can it be done in Hive? If not, any suggestions.



 Thanks,

 Richin








RE: Converting rows into dynamic colums in Hive

2012-08-09 Thread richin.jain
Thanks John. 
Is there a way to do this is excel since I am outputting the table data in csv 
format (using macros or something else)?

Richin

-Original Message-
From: ext John Meagher [mailto:john.meag...@gmail.com] 
Sent: Thursday, August 09, 2012 10:11 AM
To: user@hive.apache.org
Subject: Re: Converting rows into dynamic colums in Hive

R is another programming language geared to statistical analysis.  It has some 
features that make it easy to write this sort of data manipulation.  
http://www.r-project.org/

Here's a sample R function that does something similar to what you need (note: 
I'm not a real R programmer, so this might be ugly, but it
works):

abCountsToMatrix - function(csvFilename, outputmatrixfile){

   data - read.csv(csvFilename, as.is=T, check.names=F, header=F)
   cat(Loaded, nrow(data), rows.\n)
   idrows - sort(unique(c(data[ , 1])))
   idcols - sort(unique(c(data[ , 2])))

   cat(Found, length(idrows),  X , length(idcols), .)
   output - array(0, c(length(idrows), length(idcols)))
   rownames(output) - idrows
   colnames(output) - idcols

   for(i in 1:nrow(data)){

 if (i %% 100 == 0) {
   cat(On , i,  of , nrow(data), \n)
 }

 rowidx - which(idrows == data[i, 1])
 colidx - which(idcols == data[i, 2])

 count - data[i,3]

 output[rowidx, colidx] - count
   }


   write.csv(output, file=outputmatrixfile, quote=F)

}



On Wed, Aug 8, 2012 at 8:58 PM,  richin.j...@nokia.com wrote:
 John,

 What is R?

 -Original Message-
 From: ext John Meagher [mailto:john.meag...@gmail.com]
 Sent: Wednesday, August 08, 2012 4:34 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive

 I don't think having dynamic columns is possible in Hive.  I've always output 
 from Hive a structure like your query output and used R to convert it into a 
 dynamic column structure.


 On Wed, Aug 8, 2012 at 3:56 PM,  richin.j...@nokia.com wrote:
 Thanks Ashish, that gives an idea.

 But I am not sure about the outer select loop, I have to know all the 
 values in Beta column beforehand to do a max on each value.

 Is there a better way?



 Richin



 From: ext Ashish Thusoo [mailto:athu...@qubole.com]
 Sent: Tuesday, August 07, 2012 5:05 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive



 you should be able to do this in hive using a group by on alpha and 
 then using a combination of the max and if statement... something on 
 the following lines

 select alpha, max(abc), max(pqr), ...
 (
   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 
 'pqr', Gamma, NUL) as pqr, 
   from table
 )
 group by alpha

 something on those lines...

 Ashish

 On Tue, Aug 7, 2012 at 1:57 PM, richin.j...@nokia.com wrote:

 Hi All,



 One of my Query output looks like-



 AlphaBeta   Gamma

 123 xyz 1.0

 123 abc 0.5

 123 pqr 1.3

 123 def 2.1

 456 xyz 0.1

 456 abc 0.6

 456 pqr 1.9

 456 def 3.2

 456 lmn1.1

 456 sdf  1.2



 I want the output for the data visualization purpose to look like 
 (basically taking rows from one table and making it column in another
 table)



 Alphaxyz abc pqr def lmn
 sdf

 123 1.0  0.5  1.3  2.1

 456 0.1  0.6  1.9  3.2
 1.1  1.2



 Question - Can it be done in Hive? If not, any suggestions.



 Thanks,

 Richin








RE: Converting rows into dynamic colums in Hive

2012-08-08 Thread richin.jain
Thanks Ashish, that gives an idea.
But I am not sure about the outer select loop, I have to know all the values in 
Beta column beforehand to do a max on each value.
Is there a better way?

Richin

From: ext Ashish Thusoo [mailto:athu...@qubole.com]
Sent: Tuesday, August 07, 2012 5:05 PM
To: user@hive.apache.org
Subject: Re: Converting rows into dynamic colums in Hive

you should be able to do this in hive using a group by on alpha and then using 
a combination of the max and if statement... something on the following lines

select alpha, max(abc), max(pqr), ...
(
  select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 'pqr', 
Gamma, NUL) as pqr, 
  from table
)
group by alpha

something on those lines...

Ashish
On Tue, Aug 7, 2012 at 1:57 PM, 
richin.j...@nokia.commailto:richin.j...@nokia.com wrote:
Hi All,

One of my Query output looks like-

AlphaBeta   Gamma
123 xyz 1.0
123 abc 0.5
123 pqr 1.3
123 def 2.1
456 xyz 0.1
456 abc 0.6
456 pqr 1.9
456 def 3.2
456 lmn1.1
456 sdf  1.2

I want the output for the data visualization purpose to look like (basically 
taking rows from one table and making it column in another table)

Alphaxyz abc pqr def lmn
sdf
123 1.0  0.5  1.3  2.1
456 0.1  0.6  1.9  3.2  
1.1  1.2

Question - Can it be done in Hive? If not, any suggestions.

Thanks,
Richin





Re: Converting rows into dynamic colums in Hive

2012-08-08 Thread John Meagher
I don't think having dynamic columns is possible in Hive.  I've always
output from Hive a structure like your query output and used R to
convert it into a dynamic column structure.


On Wed, Aug 8, 2012 at 3:56 PM,  richin.j...@nokia.com wrote:
 Thanks Ashish, that gives an idea.

 But I am not sure about the outer select loop, I have to know all the values
 in Beta column beforehand to do a max on each value.

 Is there a better way?



 Richin



 From: ext Ashish Thusoo [mailto:athu...@qubole.com]
 Sent: Tuesday, August 07, 2012 5:05 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive



 you should be able to do this in hive using a group by on alpha and then
 using a combination of the max and if statement... something on the
 following lines

 select alpha, max(abc), max(pqr), ...
 (
   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 'pqr',
 Gamma, NUL) as pqr, 
   from table
 )
 group by alpha

 something on those lines...

 Ashish

 On Tue, Aug 7, 2012 at 1:57 PM, richin.j...@nokia.com wrote:

 Hi All,



 One of my Query output looks like-



 AlphaBeta   Gamma

 123 xyz 1.0

 123 abc 0.5

 123 pqr 1.3

 123 def 2.1

 456 xyz 0.1

 456 abc 0.6

 456 pqr 1.9

 456 def 3.2

 456 lmn1.1

 456 sdf  1.2



 I want the output for the data visualization purpose to look like (basically
 taking rows from one table and making it column in another table)



 Alphaxyz abc pqr def lmn
 sdf

 123 1.0  0.5  1.3  2.1

 456 0.1  0.6  1.9  3.2
 1.1  1.2



 Question – Can it be done in Hive? If not, any suggestions.



 Thanks,

 Richin








RE: Converting rows into dynamic colums in Hive

2012-08-08 Thread richin.jain
John,

What is R?

-Original Message-
From: ext John Meagher [mailto:john.meag...@gmail.com] 
Sent: Wednesday, August 08, 2012 4:34 PM
To: user@hive.apache.org
Subject: Re: Converting rows into dynamic colums in Hive

I don't think having dynamic columns is possible in Hive.  I've always output 
from Hive a structure like your query output and used R to convert it into a 
dynamic column structure.


On Wed, Aug 8, 2012 at 3:56 PM,  richin.j...@nokia.com wrote:
 Thanks Ashish, that gives an idea.

 But I am not sure about the outer select loop, I have to know all the 
 values in Beta column beforehand to do a max on each value.

 Is there a better way?



 Richin



 From: ext Ashish Thusoo [mailto:athu...@qubole.com]
 Sent: Tuesday, August 07, 2012 5:05 PM
 To: user@hive.apache.org
 Subject: Re: Converting rows into dynamic colums in Hive



 you should be able to do this in hive using a group by on alpha and 
 then using a combination of the max and if statement... something on 
 the following lines

 select alpha, max(abc), max(pqr), ...
 (
   select alpha, if (beta == 'abc', Gamma, NULL) as abc, if (beta == 
 'pqr', Gamma, NUL) as pqr, 
   from table
 )
 group by alpha

 something on those lines...

 Ashish

 On Tue, Aug 7, 2012 at 1:57 PM, richin.j...@nokia.com wrote:

 Hi All,



 One of my Query output looks like-



 AlphaBeta   Gamma

 123 xyz 1.0

 123 abc 0.5

 123 pqr 1.3

 123 def 2.1

 456 xyz 0.1

 456 abc 0.6

 456 pqr 1.9

 456 def 3.2

 456 lmn1.1

 456 sdf  1.2



 I want the output for the data visualization purpose to look like 
 (basically taking rows from one table and making it column in another 
 table)



 Alphaxyz abc pqr def lmn
 sdf

 123 1.0  0.5  1.3  2.1

 456 0.1  0.6  1.9  3.2
 1.1  1.2



 Question - Can it be done in Hive? If not, any suggestions.



 Thanks,

 Richin








RE: Converting rows into dynamic colums in Hive

2012-08-07 Thread Raghunath, Ranjith
You can do this using case statements and summing the values. The only item to 
remember here is that the values in the grid need to be numeric for the sum to 
work.

Thanks,
Ranjith

From: richin.j...@nokia.com [mailto:richin.j...@nokia.com]
Sent: Tuesday, August 07, 2012 3:58 PM
To: user@hive.apache.org
Subject: Converting rows into dynamic colums in Hive

Hi All,

One of my Query output looks like-

AlphaBeta   Gamma
123 xyz 1.0
123 abc 0.5
123 pqr 1.3
123 def 2.1
456 xyz 0.1
456 abc 0.6
456 pqr 1.9
456 def 3.2
456 lmn1.1
456 sdf  1.2

I want the output for the data visualization purpose to look like (basically 
taking rows from one table and making it column in another table)

Alphaxyz abc pqr def lmn
sdf
123 1.0  0.5  1.3  2.1
456 0.1  0.6  1.9  3.2  
1.1  1.2

Question - Can it be done in Hive? If not, any suggestions.

Thanks,
Richin