I used a stupid way to compute how the reads are distributed by the number of N bases they contain:
library(ShortRead)
reads <- readFastq(fastqfile)
ids <- id(reads)
seqs <- sread(reads)
# Do you know how to get this information with a Bioconductor function?
nCount <- alphabetFrequency(seqs)[, "N"]
nCountHist <- hist(nCount, breaks = max(nCount))
nCountHist["breaks"]
nCountHist["counts"]

$breaks
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
[51] 50 51 52 53 54 55

> nCountHist["counts"]
$counts
 [1] 16988332     3975     4365     3099     2760     2473     2918     3045
 [9]     3320     3028     3290     3560     4695     4546     3939     4255
[17]     3899     4025     6764     3554     4056     2716     1812     1456
[25]     1618     2133     2253     1809     1638      924      951      889
[33]      931     1089     1868     3344      348       36       20       25
[41]       12       16       10       24        9        4        4        3
[49]        0        0        3        1        1        0        1

What I need is just the count of reads for each number of "N" bases, like the output above.

Thanks

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
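
One possible simplification, offered as a sketch rather than a definitive answer: once nCount holds one integer per read, base R's table() or tabulate() gives the per-value counts directly, with no histogram needed. This also sidesteps a quirk of hist(): with its default right-closed bins, and with breaks 0 to 55 as in the output above, reads with 0 and 1 Ns fall into the same first bin. The example below uses a small made-up vector of reads (toyReads is hypothetical, not from the post) and counts Ns with base R so that it runs without a FASTQ file. With real data, nCount would come from alphabetFrequency(seqs)[, "N"] as in the post.

```r
# Hypothetical toy reads standing in for sread(readFastq(fastqfile)).
toyReads <- c("ACGT", "ANNT", "NNNN", "ACGN", "ACGT", "NCGT")

# Number of "N" bases per read, computed with base R for the toy data.
# With real data this would be: nCount <- alphabetFrequency(seqs)[, "N"]
nCount <- vapply(strsplit(toyReads, "", fixed = TRUE),
                 function(bases) sum(bases == "N"), integer(1))

# table() reports only the N counts that actually occur in the data.
nTable <- table(nCount)

# tabulate() also keeps empty bins; position i holds the reads with i - 1 Ns.
nPerBin <- tabulate(nCount + 1L, nbins = max(nCount) + 1L)
names(nPerBin) <- 0:max(nCount)

print(nTable)
print(nPerBin)
```

table() is the shorter call. tabulate() may be preferable when zero-count bins should appear explicitly, as they do in the histogram output.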