Re: [Bioc-sig-seq] count the coverage base by base

Michael Lawrence Sat, 23 Jul 2011 05:56:41 -0700

This is where the Views come in.

cov_by_gene <- Views(cov, genes)
viewSums(cov_by_gene)


On Fri, Jul 22, 2011 at 10:13 PM, Kunbin Qu <k...@genomichealth.com> wrote:

>  Michael, I do not think a simple coverage() call will do it. coverage()
> would only give me the base pair count for the reads  stack from the
> transcriptome. The problem I have is to count the base coverage within each
> refseq genes. Forgive my slowness, where does the refseq gene play in
> coverage()?****
>
> ** **
>
> -Kunbin****
>
> ** **
>
> ** **
>
> *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com]
> *Sent:* Friday, July 22, 2011 4:13 PM
> *To:* Kunbin Qu
> *Cc:* Michael Lawrence; bioc-sig-sequencing@r-project.org
> *Subject:* Re: [Bioc-sig-seq] count the coverage base by base****
>
> ** **
>
> Just call coverage() on the GRanges. And just for the record, my name is
> not Steve :)
>
> Michael****
>
> On Fri, Jul 22, 2011 at 4:03 PM, Kunbin Qu <k...@genomichealth.com> wrote:*
> ***
>
> Steve, the reads are GRanges, like the following. If the reads go across
> two exons, it would be two GRanges. I was using CASAVA1.7 (ie, ELAND2.0) to
> generate the alignment. In the export file from CASAVA, it has this weird
> representation of the spliced reads:****
>
>  ****
>
> SEQUENCER02     110     1       1101    4693    1950    TTAGGC  1
> TTGCTGCAAGCATTTGAGAACAACCTTTTTCGTGCT
> `acccbcccggggggggggggggggggggggggggg    splice_sites-auto.fa
> TAF1_35_35_chrX.fa_70604899_70607111      27      F       36      118
>                                       Y****
>
>  ****
>
> So I wrote a script to convert the above single entry (one read) into two
> entries, one for each portion falling into the two exons where the read goes
> across. Then read them into R as GRanges. Could you show me a little more
> how to coverage functions to get the counts? Thanks.****
>
>  ****
>
> -Kunbin****
>
>  ****
>
>  ****
>
> > gr****
>
> GRanges with 27421835 ranges and 0 elementMetadata values****
>
>            seqnames                 ranges strand   |****
>
>               <Rle>              <IRanges>  <Rle>   |****
>
>        [1]    chr11 [ 48034060,  48034095]      +   |****
>
>        [2]    chr13 [103319962, 103319997]      +   |****
>
>        [3]     chr2 [198350561, 198350596]      -   |****
>
>        [4]    chr12 [ 41850809,  41850844]      +   |****
>
>        [5]    chr16 [ 89974865,  89974900]      -   |****
>
>        [6]     chr1 [172113839, 172113874]      -   |****
>
>        [7]    chr12 [111080272, 111080307]      -   |****
>
>        [8]     chr2 [179445437, 179445472]      -   |****
>
>        [9]    chr10 [119817069, 119817104]      +   |****
>
>        ...      ...                    ...    ... ...****
>
> [27421827]    chr17 [ 43334904,  43334939]      -   |****
>
> [27421828]     chr6 [163903657, 163903692]      +   |****
>
> [27421829]    chr18 [ 74737099,  74737134]      -   |****
>
> [27421830]    chr13 [ 78311617,  78311652]      -   |****
>
> [27421831]     chr4 [170541832, 170541867]      +   |****
>
> [27421832]    chr13 [ 32601946,  32601981]      +   |****
>
> [27421833]     chr3 [ 38420019,  38420054]      +   |****
>
> [27421834]     chr8 [ 74561716,  74561751]      -   |****
>
> [27421835]     chr3 [ 39319812,  39319847]      -   |****
>
>  ****
>
> seqlengths****
>
>   chr1 chr10 chr11 chr12 chr13 chr14 ...  chr6  chr7  chr8  chr9  chrX
> chrY****
>
>     NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA
> NA****
>
> > ****
>
>  ****
>
> *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com]
> *Sent:* Friday, July 22, 2011 3:32 PM
> *To:* Kunbin Qu
> *Cc:* Michael Lawrence; bioc-sig-sequencing@r-project.org****
>
>
> *Subject:* Re: [Bioc-sig-seq] count the coverage base by base****
>
>  ****
>
> You could call coverage on all sorts of things. How are you representing
> your reads? GappedAlignments would work, for example.
>
> > showMethods(coverage)
> Function: coverage (package IRanges)
> x="AlignedXStringSet0"
> x="GRangesList"
> x="GappedAlignments"
> x="GenomicRanges"
> x="IRanges"
> x="MIndex"
> x="MaskCollection"
> x="MaskedXString"
> x="PairwiseAlignedFixedSubject"
> x="PairwiseAlignedFixedSubjectSummary"
> x="RangedData"
> x="RangesList"
> x="Views"
> x="numeric"****
>
> On Fri, Jul 22, 2011 at 3:24 PM, Kunbin Qu <k...@genomichealth.com> wrote:*
> ***
>
> Steve, thanks for the advice. But I did not get it: coverage() is applied
> to a set of IRanges. What should I use for the input of the coverage() then?
> Is it possible for you to show me a little pseudo or real code? Thanks.***
> *
>
>  ****
>
> -Kunbin****
>
>  ****
>
>  ****
>
> *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com]
> *Sent:* Friday, July 22, 2011 3:15 PM
> *To:* Kunbin Qu
> *Cc:* bioc-sig-sequencing@r-project.org
> *Subject:* Re: [Bioc-sig-seq] count the coverage base by base****
>
>  ****
>
> coverage(), form Views() on the coverage for your genes and then
> viewSums().****
>
> On Fri, Jul 22, 2011 at 2:36 PM, Kunbin Qu <k...@genomichealth.com> wrote:*
> ***
>
> Hi,
>
> If I want to count the coverage base by base, not read by read, can I still
> use countOverlaps? I have a human transcriptome done, and would like to
> count the coverage for each gene based on the mapping. As there are some
> reads mapped across the junctions (ie, one read is splitted into two
> portions), or partially mapped into the introns, it would be good to count
> the coverage by base pair, instead of by read number? Could anybody suggest
> a way doing that? Thanks.
>
> -Kunbin
>
>
>
> ______________________________________________________________________
> The contents of this electronic message, including any attachments, are
> intended only for the use of the individual or entity to which they are
> addressed and may contain confidential information. If you are not the
> intended recipient, you are hereby notified that any use, dissemination,
> distribution, or copying of this message or any attachment is strictly
> prohibited. If you have received this transmission in error, please send an
> e-mail to postmas...@genomichealth.com and delete this message, along with
> any attachments, from your computer.
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing****
>
>  ****
>
>
> ______________________________________________________________________
> The contents of this electronic message, including any attachments, are
> intended only for the use of the individual or entity to which they are
> addressed and may contain confidential information. If you are not the
> intended recipient, you are hereby notified that any use, dissemination,
> distribution, or copying of this message or any attachment is strictly
> prohibited. If you have received this transmission in error, please send an
> e-mail to postmas...@genomichealth.com and delete this message, along with
> any attachments, from your computer.****
>
>  ****
>
>
> ______________________________________________________________________
> The contents of this electronic message, including any attachments, are
> intended only for the use of the individual or entity to which they are
> addressed and may contain confidential information. If you are not the
> intended recipient, you are hereby notified that any use, dissemination,
> distribution, or copying of this message or any attachment is strictly
> prohibited. If you have received this transmission in error, please send an
> e-mail to postmas...@genomichealth.com and delete this message, along with
> any attachments, from your computer.****
>
> ** **
>
> ______________________________________________________________________
> The contents of this electronic message, including any attachments, are
> intended only for the use of the individual or entity to which they are
> addressed and may contain confidential information. If you are not the
> intended recipient, you are hereby notified that any use, dissemination,
> distribution, or copying of this message or any attachment is strictly
> prohibited. If you have received this transmission in error, please send an
> e-mail to postmas...@genomichealth.com and delete this message, along with
> any attachments, from your computer.
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Re: [Bioc-sig-seq] count the coverage base by base

Reply via email to