Dear R-experts,

I'd like to avoid very slow 'for' loops, but I don't know how. My data look as follows (the original data have 1600 rows and 30 columns):

# data example
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1,c2,c3)

I need to compare every column with every other column and want to know the percentage of matching values for each column pair. To calculate this percentage I used the function 'agree' from the irr package. I solved the problem with a nested loop, which is very slow.

library(irr)     # required for the function 'agree'

# empty data frame for the results
a <- as.data.frame(matrix(data=NA, nrow=ncol(x), ncol=ncol(x)))
colnames(a) <- colnames(x)
rownames(a) <- colnames(x)

# the loop to write the data: one 'agree' call per column pair
for (j in seq_len(ncol(x))) {
  for (i in seq_len(ncol(x))) {
    a[i, j] <- agree(cbind(x[, j], x[, i]))$value
  }
}


I would be very pleased to receive your suggestions on how to avoid the loop. Furthermore, the resulting data frame could be displayed as a triangular matrix, without duplicates of each pairwise comparison, but I don't know how to do this.
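
To make clear what I am after, here is a rough sketch of a possible direction. It rests on an assumption I have not checked against my real data: that for two columns and the default tolerance of 0, 'agree' reduces to the share of rows with exactly equal values. Missing values are not handled. The names m, pairs, pct and tri are only illustrative.

```r
# data example (same as above)
c1 <- c(1,1,1,0.25,0,1,1,1,0,1)
c2 <- c(0,0,1,1,0,1,0,1,0.5,1)
c3 <- c(0,1,1,1,0,0.75,1,1,0.5,0)
x <- data.frame(c1,c2,c3)

m <- as.matrix(x)

# every unordered column pair exactly once: (1,2), (1,3), (2,3), ...
pairs <- combn(ncol(m), 2)

# percentage of rows with identical values, one number per pair
# (assumed equivalent to agree(...)$value with tolerance 0 and no NAs)
pct <- apply(pairs, 2, function(p) 100 * mean(m[, p[1]] == m[, p[2]]))

# lower-triangular result; combn() and lower.tri() enumerate pairs in the same order
tri <- matrix(NA_real_, ncol(m), ncol(m),
              dimnames = list(colnames(m), colnames(m)))
tri[lower.tri(tri)] <- pct
tri
```

This calls the comparison only once per pair instead of once per ordered pair including the diagonal, so for 30 columns it would be 435 comparisons rather than 900.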

Kind regards

Thomas
