Re: [Bioc-sig-seq] count the coverage base by base

Kunbin Qu Fri, 22 Jul 2011 16:16:27 -0700

Michael, I am terribly sorry.

Steve Lianoglou on this group has been helping me with my previous questions. I 
messed up and thought it was him responding to my emails. My sincere apologies, 
Michael.

-Kunbin

From: Michael Lawrence [mailto:lawrence.mich...@gene.com]
Sent: Friday, July 22, 2011 4:13 PM
To: Kunbin Qu
Cc: Michael Lawrence; bioc-sig-sequencing@r-project.org
Subject: Re: [Bioc-sig-seq] count the coverage base by base

Just call coverage() on the GRanges. And just for the record, my name is not 
Steve :)

Michael
On Fri, Jul 22, 2011 at 4:03 PM, Kunbin Qu <k...@genomichealth.com> wrote:
Steve, the reads are GRanges, like the following. If the reads go across two 
exons, it would be two GRanges. I was using CASAVA1.7 (ie, ELAND2.0) to 
generate the alignment. In the export file from CASAVA, it has this weird 
representation of the spliced reads:

SEQUENCER02     110     1       1101    4693    1950    TTAGGC  1       
TTGCTGCAAGCATTTGAGAACAACCTTTTTCGTGCT      `acccbcccggggggggggggggggggggggggggg  
  splice_sites-auto.fa    TAF1_35_35_chrX.fa_70604899_70607111      27      F   
    36      118                                             Y

So I wrote a script to convert the above single entry (one read) into two 
entries, one for each portion falling into the two exons that the read spans. 
Then I read them into R as GRanges. Could you show me a little more about how 
to use the coverage functions to get the counts? Thanks.

-Kunbin

> gr
GRanges with 27421835 ranges and 0 elementMetadata values
           seqnames                 ranges strand   |
              <Rle>              <IRanges>  <Rle>   |
       [1]    chr11 [ 48034060,  48034095]      +   |
       [2]    chr13 [103319962, 103319997]      +   |
       [3]     chr2 [198350561, 198350596]      -   |
       [4]    chr12 [ 41850809,  41850844]      +   |
       [5]    chr16 [ 89974865,  89974900]      -   |
       [6]     chr1 [172113839, 172113874]      -   |
       [7]    chr12 [111080272, 111080307]      -   |
       [8]     chr2 [179445437, 179445472]      -   |
       [9]    chr10 [119817069, 119817104]      +   |
       ...      ...                    ...    ... ...
[27421827]    chr17 [ 43334904,  43334939]      -   |
[27421828]     chr6 [163903657, 163903692]      +   |
[27421829]    chr18 [ 74737099,  74737134]      -   |
[27421830]    chr13 [ 78311617,  78311652]      -   |
[27421831]     chr4 [170541832, 170541867]      +   |
[27421832]    chr13 [ 32601946,  32601981]      +   |
[27421833]     chr3 [ 38420019,  38420054]      +   |
[27421834]     chr8 [ 74561716,  74561751]      -   |
[27421835]     chr3 [ 39319812,  39319847]      -   |

seqlengths
  chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX  chrY
    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA
>

From: Michael Lawrence [mailto:lawrence.mich...@gene.com]
Sent: Friday, July 22, 2011 3:32 PM
To: Kunbin Qu
Cc: Michael Lawrence; bioc-sig-sequencing@r-project.org
Subject: Re: [Bioc-sig-seq] count the coverage base by base

You could call coverage on all sorts of things. How are you representing your 
reads? GappedAlignments would work, for example.

> showMethods(coverage)
Function: coverage (package IRanges)
x="AlignedXStringSet0"
x="GRangesList"
x="GappedAlignments"
x="GenomicRanges"
x="IRanges"
x="MIndex"
x="MaskCollection"
x="MaskedXString"
x="PairwiseAlignedFixedSubject"
x="PairwiseAlignedFixedSubjectSummary"
x="RangedData"
x="RangesList"
x="Views"
x="numeric"
On Fri, Jul 22, 2011 at 3:24 PM, Kunbin Qu <k...@genomichealth.com> wrote:
Steve, thanks for the advice. But I did not get it: coverage() is applied to a 
set of IRanges. What should I use as the input to coverage() then? Could you 
show me a little pseudocode or real code? Thanks.

-Kunbin

From: Michael Lawrence [mailto:lawrence.mich...@gene.com]
Sent: Friday, July 22, 2011 3:15 PM
To: Kunbin Qu
Cc: bioc-sig-sequencing@r-project.org
Subject: Re: [Bioc-sig-seq] count the coverage base by base

Call coverage(), form Views() on the coverage for your genes, and then call 
viewSums().
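
The three steps named above might look roughly like the sketch below. It is only an illustration under stated assumptions: the read coordinates, the gene intervals and the names geneA/geneB are made up, the genes are treated as single intervals on chr1, and it assumes a GenomicRanges/IRanges installation where coverage() on a GRanges returns one Rle per chromosome.

```r
# Minimal sketch; all coordinates and gene names are hypothetical toy values.
library(GenomicRanges)  # also attaches IRanges (Views, viewSums)

# Three reads on chr1. The third is spliced and stored as two ranges,
# one per exon portion, as described in the thread.
reads <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(1, 5, 20, 31), end = c(10, 14, 24, 35))
)

# Per-base depth: a list with one Rle per chromosome.
cov <- coverage(reads)

# Hypothetical gene intervals on chr1 (geneA, geneB).
gene_start <- c(1, 18)
gene_end <- c(12, 35)

# Views restrict the chr1 coverage vector to each gene interval;
# viewSums adds up the per-base depth inside each view.
gene_views <- Views(cov[["chr1"]], start = gene_start, end = gene_end)
per_gene <- viewSums(gene_views)
names(per_gene) <- c("geneA", "geneB")
per_gene
```

With these toy reads, geneA sums to 18 covered bases (10 from the first read plus 8 from the second) and geneB to 10 (5 from each portion of the spliced read), so bases in the intron between the two portions add nothing. Dividing each sum by the gene width would give a mean depth instead of a total.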
On Fri, Jul 22, 2011 at 2:36 PM, Kunbin Qu <k...@genomichealth.com> wrote:
Hi,

If I want to count the coverage base by base, not read by read, can I still use 
countOverlaps? I have sequenced a human transcriptome and would like to count 
the coverage for each gene based on the mapping. As some reads map across the 
junctions (i.e., one read is split into two portions) or map partially into 
the introns, it would be better to count the coverage by base pair instead of 
by read number. Could anybody suggest a way of doing that? Thanks.

-Kunbin

______________________________________________________________________
The contents of this electronic message, including any attachments, are 
intended only for the use of the individual or entity to which they are 
addressed and may contain confidential information. If you are not the intended 
recipient, you are hereby notified that any use, dissemination, distribution, 
or copying of this message or any attachment is strictly prohibited. If you 
have received this transmission in error, please send an e-mail to 
postmas...@genomichealth.com and delete 
this message, along with any attachments, from your computer.

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

