I used a clumsy way to compute how my reads are distributed by the number
of N bases they contain:

library(ShortRead)
# fastqfile holds the path to the FASTQ file
reads <- readFastq(fastqfile)
ids <- id(reads)
seqs <- sread(reads)
# Is there a Bioconductor function that gives this information directly?
# Count the N bases in each read
nCount <- alphabetFrequency(seqs)[, "N"]
nCountHist <- hist(nCount, breaks = max(nCount))
nCountHist["breaks"]
nCountHist["counts"]


$breaks
 [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[26] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
[51] 50 51 52 53 54 55
> nCountHist["counts"]
$counts
 [1] 16988332     3975     4365     3099     2760     2473     2918     3045
 [9]     3320     3028     3290     3560     4695     4546     3939     4255
[17]     3899     4025     6764     3554     4056     2716     1812     1456
[25]     1618     2133     2253     1809     1638      924      951      889
[33]      931     1089     1868     3344      348       36       20       25
[41]       12       16       10       24        9        4        4        3
[49]        0        0        3        1        1        0        1
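
One caveat about the output above: there are 56 breaks (0 to 55) but only 55 counts, because hist() by default uses right-closed bins and includes the lowest value in the first bin, so reads with 0 Ns and reads with 1 N fall into the same first count. A minimal sketch of a per-value tally in base R follows; the nCount vector here is made-up toy data standing in for alphabetFrequency(seqs)[, "N"], and readsPerN and histCounts are names chosen only for illustration.

```r
# Toy stand-in for alphabetFrequency(seqs)[, "N"]; the values are made up
nCount <- c(0L, 0L, 1L, 3L, 0L, 1L)

# tabulate() counts positive integers only, so shift by one to keep N = 0;
# unlike table(), it also reports zero for N values that never occur
readsPerN <- tabulate(nCount + 1L, nbins = max(nCount) + 1L)
names(readsPerN) <- 0:max(nCount)
readsPerN

# If a histogram object is still wanted, centre one bin on each integer
# so that 0 and 1 are no longer merged
histCounts <- hist(nCount,
                   breaks = seq(-0.5, max(nCount) + 0.5, by = 1),
                   plot = FALSE)$counts
```

With the real data, table(nCount) would give a similar tally but would omit N values that occur in no read.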

All I need is the number of reads for each count of "N" bases, as in the
output above.

Thanks.

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
