Dear list, I wish to extract, from a population genotyped for 10 SNPs, a
subsample of the same population of size n with similar allele frequencies.
Essentially I have a matrix of 200 rows (df) like this:
Name,Condition,rs1385699_X,rs6625163_X,rs962458_X,Rs4658627_1,
sample01,Case,1,1,1,-1
sample02,Control,1,1,1,1
sample06,Control,1,-1,1,0
sample10,Case,1,1,1,0
sample11,Control,1,1,1,1
sample24,Control,-1,-1,1,0
sample29,Control,1,-1,1,0
sample42,Case,-1,-1,1,0
sample64,Case,-1,1,1,0
....
I'm interested in maintaining in my subsample the same frequencies as those
observed for the value 1 in each column.
I approached the problem with the sample() function:

mysample <- df[sample(seq_len(nrow(df)), 100, replace = FALSE), ]
Then I tested that the frequencies of each allele in mysample are not
statistically different from those in the initial dataset by means of prop.test.
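
For reference, a check of this kind could look like the sketch below. It is
only an illustration under stated assumptions: the genotypes are simulated
(only the column names come from the table above), the helper names freq1 and
check_freqs are made up for the example, and the test is a one-sample
prop.test of the subsample count of 1s against the frequency in the full
dataset, which may differ from the exact call used originally.

```r
# Toy stand-in for the real data: 200 samples, 4 SNP columns coded -1/0/1.
# Column names are from the post; the genotype values are simulated.
set.seed(1)
snps <- c("rs1385699_X", "rs6625163_X", "rs962458_X", "Rs4658627_1")
df <- data.frame(
  Name = sprintf("sample%03d", 1:200),
  Condition = sample(c("Case", "Control"), 200, replace = TRUE),
  stringsAsFactors = FALSE
)
for (s in snps) {
  df[[s]] <- sample(c(-1, 0, 1), 200, replace = TRUE, prob = c(0.2, 0.2, 0.6))
}

# Simple random subsample of 100 rows, as in the post.
mysample <- df[sample(seq_len(nrow(df)), 100, replace = FALSE), ]

# Frequency of the value 1 in a genotype vector (hypothetical helper).
freq1 <- function(x) mean(x == 1, na.rm = TRUE)

# Per SNP: test the subsample count of 1s against the full-data frequency.
check_freqs <- function(sub, full, cols) {
  pvals <- sapply(cols, function(s) {
    p0 <- freq1(full[[s]])
    if (p0 <= 0 || p0 >= 1) return(NA_real_)  # prop.test needs 0 < p < 1
    x <- sum(sub[[s]] == 1, na.rm = TRUE)
    n <- sum(!is.na(sub[[s]]))
    prop.test(x, n, p = p0)$p.value
  })
  data.frame(
    snp = cols,
    full = sapply(cols, function(s) freq1(full[[s]])),
    sub = sapply(cols, function(s) freq1(sub[[s]])),
    p.value = pvals,
    row.names = NULL
  )
}

res <- check_freqs(mysample, df, snps)
print(res)
```

A column in which every sample has the value 1 gives a full-data frequency of
exactly 1, which prop.test does not accept as a null value, so the sketch
returns NA for such columns instead of testing them.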
This seems to work, but do you know if there is a package that can do the
same thing while allowing, for example, stricter control?
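
To make "stricter control" concrete, one possibility that needs no extra
package might be to draw many candidate subsamples and keep the one whose
frequencies of the value 1 are closest to those of the full dataset. The
sketch below is an assumption about what such control could mean, not an
established method: the data are simulated, and matched_subsample, freq1 and
the tries parameter are hypothetical names introduced for the example.

```r
# Toy stand-in for the real data: 200 samples, 4 SNP columns coded -1/0/1.
# Column names are from the post; the genotype values are simulated.
set.seed(42)
snps <- c("rs1385699_X", "rs6625163_X", "rs962458_X", "Rs4658627_1")
df <- data.frame(
  Name = sprintf("sample%03d", 1:200),
  Condition = sample(c("Case", "Control"), 200, replace = TRUE),
  stringsAsFactors = FALSE
)
for (s in snps) {
  df[[s]] <- sample(c(-1, 0, 1), 200, replace = TRUE, prob = c(0.2, 0.2, 0.6))
}

# Frequency of the value 1 in each SNP column (hypothetical helper).
freq1 <- function(d, cols) colMeans(d[, cols, drop = FALSE] == 1, na.rm = TRUE)

# Draw `tries` random subsamples of size n and keep the one with the smallest
# maximum absolute deviation from the full-data frequencies.
matched_subsample <- function(full, cols, n, tries = 2000) {
  target <- freq1(full, cols)
  best_idx <- NULL
  best_dev <- Inf
  for (i in seq_len(tries)) {
    idx <- sample(seq_len(nrow(full)), n, replace = FALSE)
    dev <- max(abs(freq1(full[idx, , drop = FALSE], cols) - target))
    if (dev < best_dev) {
      best_dev <- dev
      best_idx <- idx
    }
  }
  list(sample = full[sort(best_idx), , drop = FALSE], max_dev = best_dev)
}

out <- matched_subsample(df, snps, n = 100)
mysample <- out$sample
out$max_dev  # largest remaining frequency difference across the SNP columns
```

Note that a subsample selected this way is no longer a simple random sample,
so a prop.test applied to it afterwards would not have its usual
interpretation.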
Thank you very much
Guido
