And I should also add that if I merge only on one column it works fine but the result is not what I want.
merge(data_lane6_snps, data_lane6_snps_rsid, by = c("SNP")) : works as expected.

Is the "chr" column being a factor creating problems here?

-A

On Tue, Apr 6, 2010 at 4:03 PM, Abhishek Pratap <abhishek....@gmail.com> wrote:

> Hi David
>
> Here it is. You can ignore the bio jargon if it sounds confusing. The
> data types of the columns (SNP, chr) on which I am applying merge are the
> same in both data frames.
>
> merge(data_lane6_snps, data_lane6_snps_rsid, by = c("SNP", "chr"))
>
> > str(data_lane6_snps)
> 'data.frame': 7724462 obs. of 10 variables:
>  $ chr           : Factor w/ 25 levels "chr1","chr10",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ SNP           : int 100 101 103 108 179 180 191 197 218 222 ...
>  $ reference     : Factor w/ 5 levels "A","C","G","N",..: 2 2 5 2 2 5 2 2 1 5 ...
>  $ genotype      : Factor w/ 10 levels "A","C","G","K",..: 1 1 1 8 2 2 3 8 2 2 ...
>  $ consensus_qual: int 0 0 0 4 33 33 19 19 19 19 ...
>  $ snp_qual      : int 0 0 0 4 0 33 19 19 19 19 ...
>  $ rms_qual      : int 0 0 0 0 21 21 21 21 21 21 ...
>  $ depth         : int 1 1 1 1 2 2 2 2 2 2 ...
>  $ bases         : Factor w/ 453774 levels "^!,","^!,^!,",..: 5 5 5 410998 49793 155731 284998 416878 133393 133393 ...
>  $ base_quality  : Factor w/ 555104 levels "`","``","```",..: 359 359 359 54813 92856 92856 92856 92856 92539 55424 ...
>
> > str(data_lane6_snps_rsid)
> 'data.frame': 797807 obs. of 4 variables:
>  $ chr : Factor w/ 24 levels "1","10","11",..: 3 3 3 3 3 3 3 3 3 3 ...
>  $ SNP : int 68143872 11071026 69423434 12394791 1302846 95330693 3921381 57122299 41899656 76990037 ...
>  $ end : int 68143872 11071026 69423434 12394791 1302846 95330693 3921381 57122299 41899656 76990037 ...
>  $ rsid: Factor w/ 797807 levels "rs10","rs10000010",..: 100229 685690 505395 470219 780326 29342 29263 327909 434159 723152 ...
>
> On Tue, Apr 6, 2010 at 3:59 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>> On Apr 6, 2010, at 3:54 PM, Abhishek Pratap wrote:
>>
>>> Hi Guys
>>>
>>> I have two data frames which I would like to merge on two conditions.
>>>
>>> I am doing the following (abstract form)
>>>
>>> new.data.frame <- merge(df1,df2, by=c("Col1","Col2"))
>>>
>>
>> What does
>>
>> str(df1) ; str(df2)
>>
>> ... show?
>>
>>> It is giving me a null result.
>>>
>>> Basically I need to apply two conditions.
>>>
>>> I also tried sqldf but it is running forever. Will indexing help ?
>>>
>>> temp <- sqldf("select a.chr,a.SNP,a.snp_qual,a.rms_qual,a.depth,b.rsid
>>> FROM
>>> + data_lane6_snps a,
>>> + data_lane6_snps_rsid b
>>> + WHERE
>>> + a.SNP = b.SNP
>>> + AND
>>> + a.chr = b.chr
>>> + ")
>>>
>>> Thanks!
>>> -Abhi
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> West Hartford, CT
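
A note on the str() output quoted in this thread: the two chr columns do not appear to share any values. One has levels such as "chr1" and "chr10", the other "1", "10", "11". merge() matches keys by value, so if that holds for the full data, an empty result from the two-column merge would be expected, and the factor type by itself is probably not the cause. The thread does not confirm this diagnosis, so the following is only a minimal sketch under that assumption. The toy data frames (snps, rsid), their values, and the paste0("chr", ...) normalization are illustrative and not taken from the original data.

```r
# Toy stand-ins for the two data frames in the thread (values are made up).
snps <- data.frame(
  chr   = factor(c("chr1", "chr1", "chr2")),
  SNP   = c(100L, 101L, 100L),
  depth = c(1L, 2L, 2L)
)
rsid <- data.frame(
  chr  = factor(c("1", "2", "2")),
  SNP  = c(100L, 100L, 300L),
  rsid = c("rs10", "rs20", "rs30")
)

# No chr value is shared ("chr1" vs "1"), so the two-key merge has no rows.
empty <- merge(snps, rsid, by = c("SNP", "chr"))
nrow(empty)

# Assumed fix: bring both chr columns to one naming scheme, as character.
snps$chr <- as.character(snps$chr)
rsid$chr <- paste0("chr", as.character(rsid$chr))

# Both keys now have values in common, so matching rows are returned.
fixed <- merge(snps, rsid, by = c("SNP", "chr"))
fixed
```

Comparing unique(as.character(...)) of the chr column in each real data frame before merging would show whether the same mismatch applies there.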