Hi,

Are you looking for the number of reads that have 0, 1, ..., X 'N's in them?

If so, you can stop here:

On Tue, Sep 6, 2011 at 4:22 PM, wang peter <wng.pe...@gmail.com> wrote:
> i used a stupid way to do statistics on the reads distribution varied with N
> number
>
> library(ShortRead)
> reads <- readFastq(fastqfile);
> ids<- id(reads);
> seqs <- sread(reads);
> # do you know how to get such information by a bioconductor function
> nCount<-alphabetFrequency(seqs)[,"N"]

And do:

R> n.distro <- table(nCount)

or some such, I think.

But it seems like you should also have the same answer in nCountHist,
as you've done it below, no?

> nCountHist<-hist(nCount,breaks=max(nCount))
> nCountHist["breaks"]
> nCountHist["counts"]

If that's not what you need, then maybe you can be a bit more specific
about what you are after?

-steve

> $breaks
>  [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
> [26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
> [51] 50 51 52 53 54 55
>> nCountHist["counts"]
> $counts
>  [1] 16988332     3975     4365     3099     2760     2473     2918     3045
>  [9]     3320     3028     3290     3560     4695     4546     3939     4255
> [17]     3899     4025     6764     3554     4056     2716     1812     1456
> [25]     1618     2133     2253     1809     1638      924      951      889
> [33]      931     1089     1868     3344      348       36       20       25
> [41]       12       16       10       24        9        4        4        3
> [49]        0        0        3        1        1        0        1
>
> what i need is just the count of reads varied with "N" number, like such
> above
>
> thx
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
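
For reference, here is a minimal sketch of the table() approach suggested above, using base R only so it runs without a FASTQ file. The nCount vector is a made-up stand-in for alphabetFrequency(seqs)[, "N"], and the names n.df and h are hypothetical, introduced just for illustration. It also shows one way the hist() result can differ from table(): with integer breaks and the default right-closed bins, reads with 0 and 1 Ns appear to land in the same first bin.

```r
# Toy stand-in for alphabetFrequency(seqs)[, "N"]: number of Ns per read.
# (Hypothetical values, not taken from the real run.)
nCount <- c(0, 0, 0, 1, 1, 3, 0, 5, 3, 0)

# One entry per possible N count, including counts that never occur.
n.distro <- table(factor(nCount, levels = 0:max(nCount)))
print(n.distro)

# Same numbers as a plain data frame, handy for plotting or writing out.
n.df <- data.frame(n     = as.integer(names(n.distro)),
                   reads = as.vector(n.distro))

# hist() with breaks 0, 1, ..., max uses bins [0,1], (1,2], (2,3], ...
# by default, so the first bin holds reads with 0 Ns AND reads with 1 N.
h <- hist(nCount, breaks = 0:max(nCount), plot = FALSE)
print(h$counts)
```

If the two sets of counts need to agree, table() on the raw counts is probably the safer choice; hist() would need breaks shifted by half a unit (e.g. seq(-0.5, max(nCount) + 0.5, by = 1)) to give one bin per integer.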