Hi, I need to do some computation based on subsets sampled from an original FASTQ file. For example, the file I have from one lane has 30 million reads, and I'd like to sample 3 subsets of 20, 10, and 5 million reads from it. The subsets should be nested, i.e., all 5 million reads are within the 10 and 20 million sets, and the 10 M set is within the 20 M set.
It appears to me that I need to generate a series of random numbers and re-order the original read file according to those random numbers to ensure this nesting. Could somebody tell me a way to re-order the reads based on a set of random numbers? Thanks.
-Kunbin
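To make the idea concrete, here is a minimal Python sketch of the random re-ordering described above. It is only an illustration under stated assumptions: the function names (`random_ranks`, `fastq_records`, `nested_subsample`) are hypothetical, every read is assumed to occupy exactly four lines, and the total read count is assumed to be known up front. Instead of physically re-ordering the file, each read gets a unique random rank, and a read belongs to the subset of size k whenever its rank is below k, so the smaller subsets are automatically contained in the larger ones.

```python
import random
from itertools import islice


def random_ranks(n_reads, seed=0):
    """Give each read a unique random rank in 0..n_reads-1 (a random permutation)."""
    ranks = list(range(n_reads))
    random.Random(seed).shuffle(ranks)
    return ranks


def fastq_records(lines):
    """Group an iterable of lines into 4-line FASTQ records (assumed layout)."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        yield record


def nested_subsample(lines, n_reads, sizes, seed=0):
    """Return {size: list of records}; a read is in subset k if its rank < k.

    Because membership depends only on the rank, every smaller subset is
    contained in every larger one. Assumes n_reads matches the input.
    """
    ranks = random_ranks(n_reads, seed)
    subsets = {k: [] for k in sizes}
    for i, record in enumerate(fastq_records(lines)):
        for k in sizes:
            if ranks[i] < k:
                subsets[k].append(record)
    return subsets


# Toy stand-in for the 30 M read file: 30 reads, subsets of 20, 10 and 5.
toy_lines = []
for i in range(30):
    toy_lines += [f"@read{i}", "ACGT", "+", "IIII"]

subsets = nested_subsample(toy_lines, n_reads=30, sizes=(20, 10, 5))
```

For a real lane one would presumably pass an open file handle as `lines` and write matching records to one output file per subset rather than collecting them in lists; the records in each subset stay in their original file order.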