Hi, I need to do some computation on random subsets of the reads in a fastq 
file. For example, the file I have from one lane has 30 million reads, and 
I'd like to sample 3 subsets of 20, 10, and 5 million reads from it. The 
subsets should be nested, i.e., all 5 million reads are within the 10 and 20 
million sets, and the 10 M set is within the 20 M set.

 It appears to me that I need to generate a series of random numbers and 
re-order the original read file according to those random numbers to ensure 
the nesting. Could somebody tell me a way to re-order the reads based on a set 
of random numbers? Thanks.

-Kunbin
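
A minimal Python sketch of the random re-ordering idea described above, in 
case it helps frame the question. It gives every read a rank in a random 
permutation and puts read i into the subset of size n whenever its rank is 
below n, so the smaller sets are automatically contained in the larger ones. 
The function name nested_subsample, its parameters, and the output file naming 
are hypothetical; the sketch also assumes an uncompressed fastq file with 
exactly 4 lines per record and a read count that is known in advance.

```python
import random
from itertools import islice


def nested_subsample(fastq_path, total_reads, sizes, out_prefix, seed=0):
    """Write nested random subsets of a 4-line-per-record fastq file.

    Hypothetical helper: returns a dict mapping each subset size to the
    path of the file written for it.
    """
    rng = random.Random(seed)

    # rank[i] is the position of read i in a random ordering of all reads.
    rank = list(range(total_reads))
    rng.shuffle(rank)

    sizes = sorted(sizes)
    paths = {n: f"{out_prefix}_{n}.fastq" for n in sizes}
    outs = {n: open(paths[n], "w") for n in sizes}
    try:
        with open(fastq_path) as fh:
            for i in range(total_reads):
                record = list(islice(fh, 4))  # assumes 4 lines per read
                if len(record) < 4:
                    raise ValueError("fewer reads in file than total_reads")
                text = "".join(record)
                # A read with rank < n also has rank < every larger size,
                # which is what makes the subsets nested.
                for n in sizes:
                    if rank[i] < n:
                        outs[n].write(text)
    finally:
        for f in outs.values():
            f.close()
    return paths
```

For the case above this would be called as something like 
nested_subsample("lane1.fastq", 30_000_000, [20_000_000, 10_000_000, 
5_000_000], "lane1_sub"), where lane1.fastq is a placeholder file name. A 
plain Python list of 30 million ranks may take on the order of a gigabyte of 
memory, so a more compact array type might be preferable in practice.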




_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
