On Jul 10, 2014, at 12:03 PM, Jonathan Greenberg <j...@illinois.edu> wrote:
> R-helpers:
>
> I'm trying to determine the frequency of characters for a matrix
> applied to a single dimension, and generate a matrix as an output.
> I've come up with a solution, but it appears inelegant -- I was
> wondering if there is an easier way to accomplish this task:
>
> # Create a matrix of "factors" (characters):
> random_characters=matrix(sample(letters[1:4],1000,replace=TRUE),100,10)
>
> # Applying with the table() function doesn't work properly, because not all rows
> # have ALL of the factors, so I get a list output:
> apply(random_characters,1,table)
>
> # Hacked solution:
> unique_values = letters[1:4]
>
> countsmatrix <- t(apply(random_characters,1,function(x,unique_values)
>     {
>         counts=vector(length=length(unique_values))
>         for(i in seq(unique_values))
>         {
>             counts[i] = sum(x==unique_values[i])
>         }
>         return(counts)
>     },
>     unique_values=unique_values
> ))
>
> # Gets me the output I want but requires two nested loops (apply and for() ), so
> # not efficient for very large datasets.
>
> ###
>
> Is there a more elegant solution to this?
>
> --j

If I am correctly understanding your issue, you simply need to coerce the input to table() to a factor with a common set of levels, since the matrix will be 'character' by default:

set.seed(1)
random_characters <- matrix(sample(factor(letters[1:4]), 1000, replace = TRUE), 100, 10)

> random_characters
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,] "b"  "c"  "b"  "c"  "c"  "c"  "d"  "d"  "d"  "d"
  [2,] "b"  "b"  "a"  "a"  "a"  "c"  "d"  "d"  "a"  "d"
  [3,] "c"  "b"  "c"  "b"  "d"  "c"  "a"  "d"  "d"  "b"
  [4,] "d"  "d"  "b"  "b"  "d"  "c"  "c"  "c"  "c"  "a"
  [5,] "a"  "c"  "a"  "b"  "d"  "b"  "d"  "c"  "b"  "a"
  [6,] "d"  "a"  "c"  "d"  "c"  "d"  "d"  "a"  "c"  "a"
  [7,] "d"  "a"  "c"  "a"  "b"  "b"  "b"  "b"  "b"  "a"
  [8,] "c"  "b"  "a"  "d"  "d"  "d"  "b"  "c"  "d"  "a"
  [9,] "c"  "d"  "b"  "a"  "a"  "d"  "d"  "d"  "b"  "a"
 [10,] "a"  "c"  "c"  "b"  "d"  "c"  "a"  "c"  "a"  "a"
 [11,] "a"  "d"  "d"  "a"  "d"  "d"  "d"  "c"  "b"  "c"
 [12,] "a"  "c"  "a"  "a"  "b"  "b"  "b"  "b"  "b"  "d"
 [13,] "c"  "b"  "d"  "d"  "c"  "a"  "c"  "a"  "b"  "c"
 [14,] "b"  "b"  "d"  "c"  "d"  "c"  "c"  "d"  "d"  "a"
 [15,] "d"  "a"  "d"  "b"  "c"  "c"  "c"  "b"  "b"  "a"
 [16,] "b"  "a"  "b"  "b"  "b"  "a"  "b"  "b"  "c"  "b"
 [17,] "c"  "c"  "c"  "a"  "b"  "c"  "a"  "a"  "d"  "a"
 [18,] "d"  "a"  "d"  "b"  "b"  "c"  "b"  "a"  "d"  "c"
...

RES <- t(apply(random_characters, 1,
               function(x) table(factor(x, levels = letters[1:4]))))

> RES
       a b c d
  [1,] 0 2 4 4
  [2,] 4 2 1 3
  [3,] 1 3 3 3
  [4,] 1 2 4 3
  [5,] 3 3 2 2
  [6,] 3 0 3 4
  [7,] 3 5 1 1
  [8,] 2 2 2 4
  [9,] 3 2 1 4
 [10,] 4 1 4 1
 [11,] 2 1 2 5
 [12,] 3 5 1 1
 [13,] 2 2 4 2
 [14,] 1 2 3 4
 [15,] 2 3 3 2
 [16,] 2 7 1 0
 [17,] 4 1 4 1
 [18,] 2 3 2 3
...

Regards,

Marc Schwartz
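
For comparison, here is a sketch of two variants that avoid the per-row apply() call. It is not part of the reply above. The first cross-tabulates the row index against the factor in a single table() call. The second sums logical comparisons with rowSums(), once per level. The names m, lv, by_apply, by_table and by_rowsums are illustrative. Exact counts may differ from the output printed above, because R 3.6.0 changed the default sample() algorithm.

```r
set.seed(1)
lv <- letters[1:4]                      # the common set of levels
m  <- matrix(sample(lv, 1000, replace = TRUE), 100, 10)

# Reference: the apply() + table(factor()) approach from the reply
by_apply <- t(apply(m, 1, function(x) table(factor(x, levels = lv))))

# Variant 1: one table() call, row index crossed with the factor.
# row(m) gives each cell's row number; factor(m, ...) flattens the
# matrix column-wise, so the two vectors line up cell by cell.
by_table <- table(row(m), factor(m, levels = lv))

# Variant 2: one vectorised comparison per level, summed across rows.
# sapply() over a character vector names the result columns by level.
by_rowsums <- sapply(lv, function(l) rowSums(m == l))

# All three should agree cell for cell
stopifnot(all(as.vector(by_table) == as.vector(by_apply)),
          all(by_rowsums == by_apply))
```

Variant 2 still loops, but over the handful of levels rather than over every row, which is the cheaper direction when the matrix has many rows and few distinct values.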