Dear all,

Given vectors "x" and "y", I would like to compute the proportion of entries that are equal, that is, mean(x == y).
Now, suppose I have the following matrix:

    n <- 1e2
    m <- 1e4
    X <- matrix(sample(0:2, m*n, replace = TRUE), ncol = m)

I am interested in calculating the above proportion for every pair of rows. I came up with the following:

    myd <- function(X, p = NROW(X)){
        D <- matrix(NA, p, p)
        for(i in 1:p)
            for(j in 1:p)
                if(i > j) D[i, j] <- mean(X[i, ] == X[j, ])
        D
    }
    system.time(d <- myd(X))

However, in my application n and m are much larger than in this example, and the computation time might be an issue. I would very much appreciate any suggestions on how to speed up the "myd" function.

Note: I have done some experiments with the dist() function and, despite it being much, much, much faster than "myd", none of the default distances fits my needs. I would also appreciate any suggestions on how to include "my own" distance function in dist().

Thank you very much for your time.

Best regards,
Jorge Velez

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
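
One possible way to speed this up, offered as a sketch rather than a definitive answer: when the entries take only a few distinct values (0, 1, 2 here), the number of matching positions between rows i and j can be written as a sum of matrix cross-products of 0/1 indicator matrices, one per value. The function name myd_fast and its "values" argument are hypothetical, and the code assumes X contains no NA entries.

```r
# Sketch: pairwise proportion of equal entries between rows of X.
# myd_fast is a hypothetical name; assumes X has no NA values.
myd_fast <- function(X, values = sort(unique(as.vector(X)))) {
  p <- nrow(X)
  counts <- matrix(0, p, p)
  for (v in values) {
    A <- (X == v) + 0                  # 0/1 indicator matrix for value v
    counts <- counts + tcrossprod(A)   # [i, j]: columns where rows i and j both equal v
  }
  D <- counts / ncol(X)                # convert match counts to proportions
  D[upper.tri(D, diag = TRUE)] <- NA   # keep only i > j, as in myd()
  D
}

set.seed(1)
n <- 1e2
m <- 1e4
X <- matrix(sample(0:2, m * n, replace = TRUE), ncol = m)
d_fast <- myd_fast(X)

# Spot check against the direct definition for one pair of rows
all.equal(d_fast[2, 1], mean(X[2, ] == X[1, ]))
```

If an object of class "dist" is needed afterwards, as.dist(1 - d_fast) should convert the lower triangle into a dissimilarity (proportion of mismatches), since as.dist() reads only the entries below the diagonal.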