joseph <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:
> I have 2 data frames df1 and df2. I would like to create a
> new data frame new_df which will contain only the common rows based
> on the first 2 columns (chrN and start). The column score in the new
> data frame should be replaced with a column containing the average
> score (average_score) from df1 and df2.
>
> df1= data.frame(chrN= c("chr1", "chr1", "chr1", "chr1", "chr2",
> "chr2", "chr2"),
> start= c(23, 82, 95, 108, 95, 108, 121),
> end= c(33, 92, 105, 118, 105, 118, 131),
> score= c(3, 6, 2, 4, 9, 2, 7))
>
> df2= data.frame(chrN= c("chr1", "chr2", "chr2", "chr2" , "chr2"),
> start= c(23, 50, 95, 20, 121),
> end= c(33, 60, 105, 30, 131),
> score= c(9, 3, 7, 7, 3))

Clunky to be sure, but this worked for me:

df3 <- merge(df1, df2, by=c("chrN","start"))
   # non-matching variables get auto-relabeled with .x and .y suffixes
df3$avg.scr <- with(df3, (score.x+score.y)/2)
   # or rowMeans(cbind(score.x, score.y))
df3 <- df3[,c("chrN","start","avg.scr")]
   # drops the variables not of interest
df3
  chrN start avg.scr
1 chr1    23       6
2 chr2   121       5
3 chr2    95       8

-- 
David Winsemius

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
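
For anyone who wants to paste and run the whole thing, here is a self-contained sketch of the same merge-then-average approach. It rebuilds the two data frames from the question and uses the names new_df and average_score that the question asked for. The explicit suffixes (".df1", ".df2") and the use of rowMeans() are my own choices for illustration, not something from the reply above; the default .x/.y suffixes would work just as well.

```r
# Data frames copied from the original question.
df1 <- data.frame(chrN  = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
                  start = c(23, 82, 95, 108, 95, 108, 121),
                  end   = c(33, 92, 105, 118, 105, 118, 131),
                  score = c(3, 6, 2, 4, 9, 2, 7))
df2 <- data.frame(chrN  = c("chr1", "chr2", "chr2", "chr2", "chr2"),
                  start = c(23, 50, 95, 20, 121),
                  end   = c(33, 60, 105, 30, 131),
                  score = c(9, 3, 7, 7, 3))

# merge() keeps only rows whose (chrN, start) pair occurs in both frames.
# The suffixes (an illustrative choice) label which frame each score came from.
m <- merge(df1, df2, by = c("chrN", "start"), suffixes = c(".df1", ".df2"))

# Row-wise mean of the two score columns.
m$average_score <- rowMeans(m[, c("score.df1", "score.df2")])

# Keep only the key columns and the averaged score.
new_df <- m[, c("chrN", "start", "average_score")]
new_df
```

rowMeans() is handy here because it extends to three or more score columns without rewriting the arithmetic. If a numeric row order is needed, sorting afterwards with order(new_df$chrN, new_df$start) is safer than relying on the order merge() returns.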