Hi Alex,

UCSC's refFlat format has columns for transcript start and end as well as CDS start and end. Using that information, the code can determine which bases are coding (in any transcript for a gene) and which are untranslated across all transcripts for the gene, and then counts accordingly. As it stands there isn't a way to take in GTF format instead, though you could probably reformat that data to look like refFlat data. Alternatively, we'd very much welcome a pull request to accept either refFlat or GTF format.
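To make the coding/UTR logic concrete, here is a rough Python sketch of the idea. It is not the actual Picard code, just an illustration under some assumptions: input lines follow the standard 11-column refFlat layout (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with 0-based, half-open coordinates, and the function names (parse_refflat_line, classify_gene, classify_refflat) are made up for this example.

```python
from collections import defaultdict


def parse_refflat_line(line):
    """Parse one tab-separated refFlat record (hypothetical helper)."""
    f = line.rstrip("\n").split("\t")
    # exonStarts/exonEnds are comma-separated lists, usually with a trailing comma.
    exon_starts = [int(x) for x in f[9].split(",") if x]
    exon_ends = [int(x) for x in f[10].split(",") if x]
    return {
        "gene": f[0],
        "name": f[1],
        "chrom": f[2],
        "strand": f[3],
        "tx_start": int(f[4]),
        "tx_end": int(f[5]),
        "cds_start": int(f[6]),
        "cds_end": int(f[7]),
        "exons": list(zip(exon_starts, exon_ends)),
    }


def classify_gene(transcripts):
    """Return (coding, utr) position sets for all transcripts of one gene.

    A base is coding if it lies inside an exon and inside the CDS span of
    ANY transcript; it is UTR if it is exonic but coding in NO transcript.
    """
    coding, exonic = set(), set()
    for t in transcripts:
        for start, end in t["exons"]:
            exonic.update(range(start, end))
            # Clip the exon to this transcript's CDS span.
            c_start = max(start, t["cds_start"])
            c_end = min(end, t["cds_end"])
            if c_start < c_end:
                coding.update(range(c_start, c_end))
    return coding, exonic - coding


def classify_refflat(lines):
    """Group refFlat records by gene name and classify each gene."""
    by_gene = defaultdict(list)
    for line in lines:
        if line.strip():
            record = parse_refflat_line(line)
            by_gene[record["gene"]].append(record)
    return {gene: classify_gene(ts) for gene, ts in by_gene.items()}
```

The per-base sets keep the sketch short and easy to follow; for whole-genome annotation you would want interval arithmetic instead. Anything outside the exons but inside the transcript span is intronic, and anything outside every transcript is intergenic.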
On #2, no: the only effect of that option is to calculate the fraction of reads that come from the expected strand. It doesn't affect the histogram or any other outputs.

-t

On Jun 29, 2014, at 2:19 PM, Alexander Predeus <[email protected]> wrote:

> Hello all,
>
> I've had a couple of questions about how CollectRNASeqMetrics works.
>
> 1. When I use "refflat"-style RefSeq annotation for mm10, the program
> estimates the percentage of reads that lands on exons, introns, intergenic
> regions, and UTR. The first three are pretty clear; but how does the program
> evaluate the UTR region? It is not in the annotation, and I can only assume
> that it is guessing it somehow.
>
> The reason I'm asking this is I'm getting a fairly high percentage of my
> reads land in UTR. Is it in any way possible to use UTR annotation from, say,
> Gencode?
>
> 2. When the program calculates the distribution histogram of reads over gene
> body, does it take the direction (strand) of the feature into account? I've
> ran three evaluations with strand flags set to "NONE", "FIRST...", and
> "SECOND...", and all the stats except for number and percent of reads landing
> on the correct strand are the same. So I figured I'd ask.
>
> Thank you in advance!
>
> -- Alex

_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help
