Hello, I'm attempting to create a data frame with correlations between every pair of variables in a data frame, so that I can then sort by the value of the correlation coefficient and see which pairs of variables are most strongly correlated.
The sm2vec function in the corpcor library works very nicely, as shown here:

library(Hmisc)
library(corpcor)

# Create example data
x1 = runif(50)
x2 = runif(50)
x3 = runif(50)
d = data.frame(x1=x1,x2=x2,x3=x3)
label(d$x1) = "Variable x1"
label(d$x2) = "Variable x2"
label(d$x3) = "Variable x3"

# Get correlations
cormat = cor(d)

# Get vector form of lower triangular elements
cors = sm2vec(cormat,diag=F)
inds = sm.index(cormat,diag=F)

# Create a data frame
var1 = dimnames(cormat)[[1]][inds[,1]]
var2 = dimnames(cormat)[[2]][inds[,2]]
lbl1 = label(d[,var1])
lbl2 = label(d[,var2])
cor_df = data.frame(Var1=lbl1,Var2=lbl2,Cor=cors)

The issue I run into is when trying to get the labels in lbl1 and lbl2. I get the warning:

In mapply(FUN = label, x = x, default = default, MoreArgs = list(self = TRUE), :
  longer argument not a multiple of length of shorter

My usage of label seems ambiguous, since the data frame itself could also have a label attached to it, aside from the labels attached to the variables within the data frame. However, the code above does work, despite the warning.

Aside from using a loop to get the label of one variable at a time, is there any other way of getting the labels for all variables in the data frame?

Also, if there is a better way to achieve my goal of getting the correlations between all variable pairs, I'd love to know.

Thanks in advance for any responses!

--Krishna