Good morning! I've searched the docs etc... Am I doing something wrong or is this a bug?
I'm doing a merge of two dataframes and getting extra rows in the resulting dataframe - the dataframes being merged might have NAs...

count <- 10
nacount <- 3
a1 <- as.data.frame(as.Date("2005-06-01")+0:(count-1))
names(a1) <- "mdate"
a1$value <- runif(count)
a1[floor(runif(nacount)*count),]$value <- NA

a2 <- as.data.frame(as.Date("2005-06-01")+0:(count-1))
names(a2) <- "mdate"
a2$value2 <- runif(count)
#a2[floor(runif(nacount)*count),]$value2 <- NA

> a1
        mdate     value
1  2005-06-09        NA
2  2005-06-02 0.5287683
3  2005-06-03 0.7563833
4  2005-06-09        NA
5  2005-06-05 0.1027646
6  2005-06-06 0.7775884
7  2005-06-07 0.2993592
8  2005-06-09        NA
9  2005-06-09 0.7434682
10 2005-06-10 0.2096477
> a2
        mdate    value2
1  2005-06-01 0.5347852
2  2005-06-02 0.9322765
3  2005-06-03 0.9106499
4  2005-06-04 0.6810564
5  2005-06-05 0.5871867
6  2005-06-06 0.8123808
7  2005-06-07 0.9675379
8  2005-06-08 0.9470369
9  2005-06-09 0.7493767
10 2005-06-10 0.8864103
> atot <- merge(a1,a2,all=T)

However, I find the following results to be quite un-intuitive - are they correct? May I draw your attention to lines 9:12... Should lines 9:11 be there?

> atot
        mdate     value    value2
1  2005-06-01        NA 0.5347852
2  2005-06-02 0.5287683 0.9322765
3  2005-06-03 0.7563833 0.9106499
4  2005-06-04        NA 0.6810564
5  2005-06-05 0.1027646 0.5871867
6  2005-06-06 0.7775884 0.8123808
7  2005-06-07 0.2993592 0.9675379
8  2005-06-08        NA 0.9470369
9  2005-06-09        NA 0.7493767
10 2005-06-09        NA 0.7493767
11 2005-06-09        NA 0.7493767
12 2005-06-09 0.7434682 0.7493767
13 2005-06-10 0.2096477 0.8864103

Note with no NAs, it works perfectly and as expected...
> a1 <- as.data.frame(as.Date("2005-06-01")+0:(count-1))
> names(a1) <- "mdate"
> a1$value <- runif(count)
> #a1[floor(runif(nacount)*count),]$value <- NA
>
> atot <- merge(a1,a2,all=T)
>
> atot
        mdate      value    value2
1  2005-06-01 0.35002519 0.5347852
2  2005-06-02 0.76318940 0.9322765
3  2005-06-03 0.32759570 0.9106499
4  2005-06-04 0.47218729 0.6810564
5  2005-06-05 0.74435374 0.5871867
6  2005-06-06 0.81415290 0.8123808
7  2005-06-07 0.04774783 0.9675379
8  2005-06-08 0.21799101 0.9470369
9  2005-06-09 0.99472758 0.7493767
10 2005-06-10 0.41974293 0.8864103

R started in each case with --vanilla

               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status         Patched
major          2
minor          3.0
year           2006
month          05
day            11
svn rev        38037
language       R
version.string Version 2.3.0 Patched (2006-05-11 r38037)

win-xp-pro sp2 - binary installs from CRAN

it works in a similar way if I say
atot <- merge(a1,a2,by.x="mdate",by.y="mdate",all=T)
or even
atot <- merge(a1,a2,by="mdate",all=T)

also tested on versions 2.2.1, 2.3.0

cheers,
Sean O'Riordain

(ps. ctrl-v paste wouldn't work on 2.4.0-dev downloaded this morning - didn't try very hard though)

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
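
Follow-up note on the merge result above: one check that might help narrow this down is to count how often each mdate occurs in a1 before merging. The sketch below is not part of the original session. It retypes a1 and a2 from the printed output (so the values are rounded as shown), and the names key_counts and dup_keys are made up for illustration. It only shows that the printed a1 contains the same date more than once and that merge() returns one row per matching pair of keys; it does not establish why a1 ended up that way.

```r
# Hypothetical reconstruction of a1 and a2 from the printed output in the post.
a1 <- data.frame(
  mdate = as.Date(c("2005-06-09", "2005-06-02", "2005-06-03", "2005-06-09",
                    "2005-06-05", "2005-06-06", "2005-06-07", "2005-06-09",
                    "2005-06-09", "2005-06-10")),
  value = c(NA, 0.5287683, 0.7563833, NA, 0.1027646,
            0.7775884, 0.2993592, NA, 0.7434682, 0.2096477)
)
a2 <- data.frame(
  mdate  = as.Date("2005-06-01") + 0:9,
  value2 = c(0.5347852, 0.9322765, 0.9106499, 0.6810564, 0.5871867,
             0.8123808, 0.9675379, 0.9470369, 0.7493767, 0.8864103)
)

# Count how many times each key value appears in a1.
key_counts <- table(a1$mdate)
dup_keys <- names(key_counts)[key_counts > 1]
print(dup_keys)   # keys that occur more than once in a1

# merge() pairs every a1 row with every a2 row that has the same key,
# so a key occurring four times in a1 and once in a2 gives four rows.
atot <- merge(a1, a2, all = TRUE)
print(nrow(atot))
print(sum(atot$mdate == as.Date("2005-06-09")))
```

If dup_keys comes back non-empty, the repeated rows in atot would be consistent with the documented behaviour of merge() rather than with a bug, and the step that builds a1 would be the place to look next.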