Hello,

I want to compare every column of one data frame against every column of 
another to see whether any of the columns are identical. The first column in 
each data frame holds the sample IDs and does not need to be compared. Below is 
an example of the loop I am using; for each pair of columns it counts the 
number of matching values. A count of 3 would therefore mean that all three 
observations in the two columns being compared are equal. The loop works, but I 
do not understand why the result includes a first column, apparently for the 
sample IDs, containing "NA" as the number of matches when the loop is told to 
test only columns 2-3.

Thank you!


#create dataframe A 
A = matrix(c("a",3,4,"b",5,7,"c",3,7),nrow=3, ncol=3,byrow = TRUE)    
A <- as.data.frame(A)
A$V2 <- as.numeric(A$V2)
A$V3 <- as.numeric(A$V3)
str(A)

#create dataframe B
B = matrix(c("a",1,1,"b",6,2,"c",2,2),nrow=3, ncol=3,byrow = TRUE)    
B <- as.data.frame(B)
B$V2 <- as.numeric(B$V2)
B$V3 <- as.numeric(B$V3)
str(B)

results.2 <- numeric()
results.3  <- numeric()


#compare columns to identify those that are identical in the two dataframes 
for(i in 2:3){
  results.2[i] <- sum(A[,2]==B[,i])
  results.3[i] <- sum(A[,3]==B[,i])
  results.pc.all <- rbind(results.2,results.3)
}
results.pc.all
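
One possible reading of the NA, offered as a sketch rather than a confirmed
diagnosis: the loop never touches column 1 of the data frames, but it stores
each count at position i of a vector that starts empty. Assigning to position 2
of an empty vector makes R pad position 1 with NA. The names x, cols and
matches below are illustrative only, and the data frames are rebuilt directly
with the same values and default column names as above.

```r
# Assigning to index 2 of an empty vector pads index 1 with NA
x <- numeric()
x[2] <- 5
x
# [1] NA  5

# Same values as data frames A and B above, built directly for brevity
A <- data.frame(V1 = c("a", "b", "c"), V2 = c(3, 5, 3), V3 = c(4, 7, 7))
B <- data.frame(V1 = c("a", "b", "c"), V2 = c(1, 6, 2), V3 = c(1, 2, 2))

# Columns to compare (column 1 holds the sample IDs)
cols <- 2:3

# Count matching values for every pair of columns:
# rows index the columns of A, columns index the columns of B
matches <- sapply(cols, function(j) {
  sapply(cols, function(i) sum(A[[i]] == B[[j]]))
})
dimnames(matches) <- list(paste0("A.", names(A)[cols]),
                          paste0("B.", names(B)[cols]))
matches
```

Because the results are collected by sapply() instead of being written to
position i, there is no unused first slot and so no NA column. An entry equal
to nrow(A) would mark a pair of columns that agree on every observation.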
