Dear list,

From a population genotyped for 10 SNPs, I wish to extract a subsample of size n that has similar allele frequencies.

Essentially, I have a data frame (df) of 200 rows like this:

Name,Condition,rs1385699_X,rs6625163_X,rs962458_X,Rs4658627_1,
sample01,Case,1,1,1,-1
sample02,Control,1,1,1,1
sample06,Control,1,-1,1,0
sample10,Case,1,1,1,0
sample11,Control,1,1,1,1
sample24,Control,-1,-1,1,0
sample29,Control,1,-1,1,0
sample42,Case,-1,-1,1,0
sample64,Case,-1,1,1,0
....

I would like the subsample to maintain, in each column, the same frequency of the value 1 as observed in the full dataset.

I approached the problem with the sample() function:
mysample <- df[sample(1:nrow(df), 100, replace = FALSE), ]

Then I tested, by means of prop.test, that the frequency of each allele in mysample is not statistically different from that in the initial dataset.

This seems to work, but do you know of a package that can do the same thing while allowing, for example, stricter control?

Thank you very much,
Guido
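To make "stricter control" concrete, here is a minimal base R sketch of one possible approach, not a packaged solution: draw many candidate subsamples with sample() and keep the one whose largest per-SNP deviation in the frequency of the value 1 is smallest. The toy data are simulated to stand in for the real df, and the names snp_cols, freq_one, best_subsample and n_try are hypothetical, introduced only for illustration.

```r
# Toy data standing in for the real 200-row data frame (values are simulated).
set.seed(1)
snp_cols <- c("rs1385699_X", "rs6625163_X", "rs962458_X", "Rs4658627_1")
df <- data.frame(
  Name = sprintf("sample%03d", 1:200),
  Condition = sample(c("Case", "Control"), 200, replace = TRUE),
  stringsAsFactors = FALSE
)
for (col in snp_cols) {
  df[[col]] <- sample(c(-1, 0, 1), 200, replace = TRUE, prob = c(0.2, 0.2, 0.6))
}

# Proportion of the value 1 in each SNP column (hypothetical helper).
freq_one <- function(d, cols) {
  colMeans(d[, cols, drop = FALSE] == 1, na.rm = TRUE)
}

# Draw n_try random subsamples of size n and keep the one whose largest
# per-SNP deviation from the full-data frequency is smallest.
best_subsample <- function(d, cols, n, n_try = 1000) {
  target <- freq_one(d, cols)
  best_idx <- NULL
  best_dev <- Inf
  for (i in seq_len(n_try)) {
    idx <- sample(nrow(d), n, replace = FALSE)
    dev <- max(abs(freq_one(d[idx, ], cols) - target))
    if (dev < best_dev) {
      best_dev <- dev
      best_idx <- idx
    }
  }
  list(rows = sort(best_idx), max_dev = best_dev)
}

res <- best_subsample(df, snp_cols, n = 100)
mysample <- df[res$rows, ]

# Compare the frequencies of the value 1 in the full data and the subsample.
rbind(full = freq_one(df, snp_cols), subsample = freq_one(mysample, snp_cols))
res$max_dev
```

Increasing n_try tightens the match at the cost of run time. Because the subsample is selected for agreement with the full data, a prop.test run on it afterwards is no longer an independent check. If the Case/Control balance also matters, the same loop could be run within each Condition group, which is an assumption beyond what the question states.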