[R] table over a matrix dimension...

2014-07-10 Thread Jonathan Greenberg
R-helpers:

I'm trying to determine the frequency of characters for a matrix
applied to a single dimension, and generate a matrix as an output.
I've come up with a solution, but it appears inelegant -- I was
wondering if there is an easier way to accomplish this task:

# Create a matrix of factors (characters):
random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10)

# Applying with the table() function doesn't work properly, because not all rows
# have ALL of the factors, so I get a list output:
apply(random_characters,1,table)

# Hacked solution:
unique_values = letters[1:4]

countsmatrix <- t(apply(random_characters,1,function(x,unique_values)
{
    counts=vector(length=length(unique_values))
    for(i in seq(unique_values))
    {
        counts[i] = sum(x==unique_values[i])
    }
    return(counts)
},
unique_values=unique_values
))

# Gets me the output I want but requires two nested loops (apply and
# for() ), so not efficient for very large datasets.

###

Is there a more elegant solution to this?

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
259 Computing Applications Building, MC-150
605 East Springfield Avenue
Champaign, IL  61820-6371
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] table over a matrix dimension...

2014-07-10 Thread Marc Schwartz

On Jul 10, 2014, at 12:03 PM, Jonathan Greenberg j...@illinois.edu wrote:

 R-helpers:
 
 I'm trying to determine the frequency of characters for a matrix
 applied to a single dimension, and generate a matrix as an output.
 I've come up with a solution, but it appears inelegant -- I was
 wondering if there is an easier way to accomplish this task:
 
 # Create a matrix of factors (characters):
 random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10)
 
 # Applying with the table() function doesn't work properly, because not all
 # rows have ALL of the factors, so I get a list output:
 apply(random_characters,1,table)
 
 # Hacked solution:
 unique_values = letters[1:4]
 
 countsmatrix <- t(apply(random_characters,1,function(x,unique_values)
 {
 counts=vector(length=length(unique_values))
 for(i in seq(unique_values))
 {
 counts[i] = sum(x==unique_values[i])
 }
 return(counts)
 },
 unique_values=unique_values
 ))
 
 # Gets me the output I want but requires two nested loops (apply and
 # for() ), so not efficient for very large datasets.
 
 ###
 
 Is there a more elegant solution to this?
 
 --j
 


If I am correctly understanding your issue, you simply need to coerce the input 
to table() to a factor with a common set of levels, since the matrix will be 
'character' by default:


set.seed(1)
random_characters <- matrix(sample(factor(letters[1:4]), 1000, replace = TRUE),
                            100, 10)

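> random_characters
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,] "b"  "c"  "b"  "c"  "c"  "c"  "d"  "d"  "d"  "d"
  [2,] "b"  "b"  "a"  "a"  "a"  "c"  "d"  "d"  "a"  "d"
  [3,] "c"  "b"  "c"  "b"  "d"  "c"  "a"  "d"  "d"  "b"
  [4,] "d"  "d"  "b"  "b"  "d"  "c"  "c"  "c"  "c"  "a"
  [5,] "a"  "c"  "a"  "b"  "d"  "b"  "d"  "c"  "b"  "a"
  [6,] "d"  "a"  "c"  "d"  "c"  "d"  "d"  "a"  "c"  "a"
  [7,] "d"  "a"  "c"  "a"  "b"  "b"  "b"  "b"  "b"  "a"
  [8,] "c"  "b"  "a"  "d"  "d"  "d"  "b"  "c"  "d"  "a"
  [9,] "c"  "d"  "b"  "a"  "a"  "d"  "d"  "d"  "b"  "a"
 [10,] "a"  "c"  "c"  "b"  "d"  "c"  "a"  "c"  "a"  "a"
 [11,] "a"  "d"  "d"  "a"  "d"  "d"  "d"  "c"  "b"  "c"
 [12,] "a"  "c"  "a"  "a"  "b"  "b"  "b"  "b"  "b"  "d"
 [13,] "c"  "b"  "d"  "d"  "c"  "a"  "c"  "a"  "b"  "c"
 [14,] "b"  "b"  "d"  "c"  "d"  "c"  "c"  "d"  "d"  "a"
 [15,] "d"  "a"  "d"  "b"  "c"  "c"  "c"  "b"  "b"  "a"
 [16,] "b"  "a"  "b"  "b"  "b"  "a"  "b"  "b"  "c"  "b"
 [17,] "c"  "c"  "c"  "a"  "b"  "c"  "a"  "a"  "d"  "a"
 [18,] "d"  "a"  "d"  "b"  "b"  "c"  "b"  "a"  "d"  "c"
 ...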


RES <- t(apply(random_characters, 1,
               function(x) table(factor(x, levels = letters[1:4]))))

> RES
       a b c d
  [1,] 0 2 4 4
  [2,] 4 2 1 3
  [3,] 1 3 3 3
  [4,] 1 2 4 3
  [5,] 3 3 2 2
  [6,] 3 0 3 4
  [7,] 3 5 1 1
  [8,] 2 2 2 4
  [9,] 3 2 1 4
 [10,] 4 1 4 1
 [11,] 2 1 2 5
 [12,] 3 5 1 1
 [13,] 2 2 4 2
 [14,] 1 2 3 4
 [15,] 2 3 3 2
 [16,] 2 7 1 0
 [17,] 4 1 4 1
 [18,] 2 3 2 3
 ...
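
The inner loop can also be avoided altogether. The following is a minimal sketch, not part of the original reply: it uses a small hypothetical matrix `m` and level vector `lv` (both invented for illustration) to show why the bare `apply(..., table)` call returns a list, and then compares the fixed-levels approach with two alternatives that rely only on base R: a single `table()` call that cross-tabulates row index against value, and one `rowSums()` per level.

```r
# Small deterministic character matrix (hypothetical data, not from the thread).
m <- matrix(c("a", "b", "b",
              "c", "c", "c",
              "a", "d", "a"),
            nrow = 3, byrow = TRUE)
lv <- letters[1:4]   # the full set of values we want a column for

# Without common levels each row yields a table of a different length,
# so apply() cannot simplify and falls back to a list.
is_list_result <- is.list(apply(m, 1, table))

# Reference result: per-row table() on a factor with fixed levels.
by_apply <- t(apply(m, 1, function(x) table(factor(x, levels = lv))))

# Alternative 1: one table() call, cross-tabulating row index against value.
# row(m) and factor(m, ...) are both read in column-major order, so they align.
by_table <- table(row = as.vector(row(m)), value = factor(m, levels = lv))

# Alternative 2: one vectorised rowSums() per level.
by_rowsums <- sapply(lv, function(v) rowSums(m == v))

print(by_apply)
```

All three objects hold the same counts, with rows of `m` in rows and levels in columns. Which one is fastest on a large matrix likely depends on the number of rows and levels, so it is worth timing them on real data.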



Regards,

Marc Schwartz



Re: [R] table over a matrix dimension...

2014-07-10 Thread William Dunlap
You can make a factor with a common set of levels out of each slice of
the matrix so all the tables are the same size:

f <- function (charMatrix, levels = unique(sort(as.vector(charMatrix))))
{
    apply(charMatrix, 1, function(x) table(factor(x, levels = levels)))
}
used as
> m <- cbind(c("A","A","A"), c("B", "A", "A"))
> f(m)
    [,1] [,2] [,3]
  A    1    2    2
  B    1    0    0
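
Note that f() returns one column per row of the input, with the levels in rows, which is the transpose of the layout asked for in the original question. As a hedged sketch (the `t()` wrapper and the extra level "C" are illustrative assumptions, not part of the posted code), the result can be transposed, and levels can be passed explicitly to keep a column for a value that never occurs:

```r
# The helper as posted in this reply, with the stripped "<-" and quotes restored.
f <- function(charMatrix, levels = unique(sort(as.vector(charMatrix)))) {
  apply(charMatrix, 1, function(x) table(factor(x, levels = levels)))
}

m <- cbind(c("A", "A", "A"), c("B", "A", "A"))

# Transpose so that rows of m are in rows and levels are in columns.
counts <- t(f(m))
print(counts)

# Passing levels explicitly keeps a column for a value that never occurs
# ("C" is a hypothetical extra level used only for illustration).
counts_abc <- t(f(m, levels = c("A", "B", "C")))
print(counts_abc)
```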



Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Jul 10, 2014 at 10:03 AM, Jonathan Greenberg j...@illinois.edu
wrote:

 R-helpers:

 I'm trying to determine the frequency of characters for a matrix
 applied to a single dimension, and generate a matrix as an output.
 I've come up with a solution, but it appears inelegant -- I was
 wondering if there is an easier way to accomplish this task:

 # Create a matrix of factors (characters):
 random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10)

 # Applying with the table() function doesn't work properly, because not
 # all rows have ALL of the factors, so I get a list output:
 apply(random_characters,1,table)

 # Hacked solution:
 unique_values = letters[1:4]

 countsmatrix <- t(apply(random_characters,1,function(x,unique_values)
 {
 counts=vector(length=length(unique_values))
 for(i in seq(unique_values))
 {
 counts[i] = sum(x==unique_values[i])
 }
 return(counts)
 },
 unique_values=unique_values
 ))

 # Gets me the output I want but requires two nested loops (apply and
 # for() ), so not efficient for very large datasets.

 ###

 Is there a more elegant solution to this?

 --j
