Hello all,

I have a couple of questions about how CollectRnaSeqMetrics works.

1. When I use "refFlat"-style RefSeq annotation for mm10, the program
estimates the percentage of reads that land in exons, introns, intergenic
regions, and UTRs. The first three are pretty clear, but how does the
program determine the UTR regions? They are not in the annotation, and I
can only assume that it is inferring them somehow.

The reason I'm asking is that a fairly high percentage of my reads land
in UTRs. Is there any way to use UTR annotation from, say, Gencode?

2. When the program calculates the histogram of read coverage over the
gene body, does it take the direction (strand) of the feature into account?
I've run three evaluations with the strand flag set to "NONE", "FIRST...",
and "SECOND...", and all the stats except for the number and percentage of
reads landing on the correct strand are the same. So I figured I'd ask.

Thank you in advance!

-- Alex
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help