This is where the Views come in. cov_by_gene <- Views(cov, genes) viewSums(cov_by_gene)
On Fri, Jul 22, 2011 at 10:13 PM, Kunbin Qu <k...@genomichealth.com> wrote: > Michael, I do not think a simple coverage() call will do it. coverage() > would only give me the base pair count for the reads stack from the > transcriptome. The problem I have is to count the base coverage within each > refseq genes. Forgive my slowness, where does the refseq gene play in > coverage()?**** > > ** ** > > -Kunbin**** > > ** ** > > ** ** > > *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com] > *Sent:* Friday, July 22, 2011 4:13 PM > *To:* Kunbin Qu > *Cc:* Michael Lawrence; bioc-sig-sequencing@r-project.org > *Subject:* Re: [Bioc-sig-seq] count the coverage base by base**** > > ** ** > > Just call coverage() on the GRanges. And just for the record, my name is > not Steve :) > > Michael**** > > On Fri, Jul 22, 2011 at 4:03 PM, Kunbin Qu <k...@genomichealth.com> wrote:* > *** > > Steve, the reads are GRanges, like the following. If the reads go across > two exons, it would be two GRanges. I was using CASAVA1.7 (ie, ELAND2.0) to > generate the alignment. In the export file from CASAVA, it has this weird > representation of the spliced reads:**** > > **** > > SEQUENCER02 110 1 1101 4693 1950 TTAGGC 1 > TTGCTGCAAGCATTTGAGAACAACCTTTTTCGTGCT > `acccbcccggggggggggggggggggggggggggg splice_sites-auto.fa > TAF1_35_35_chrX.fa_70604899_70607111 27 F 36 118 > Y**** > > **** > > So I wrote a script to convert the above single entry (one read) into two > entries, one for each portion falling into the two exons where the read goes > across. Then read them into R as GRanges. Could you show me a little more > how to coverage functions to get the counts? Thanks.**** > > **** > > -Kunbin**** > > **** > > **** > > > gr**** > > GRanges with 27421835 ranges and 0 elementMetadata values**** > > seqnames ranges strand |**** > > <Rle> <IRanges> <Rle> |**** > > [1] chr11 [ 48034060, 48034095] + |**** > > [2] chr13 [103319962, 103319997] + |**** > > [3] chr2 [198350561, 198350596] - |**** > > [4] chr12 [ 41850809, 41850844] + |**** > > [5] chr16 [ 89974865, 89974900] - |**** > > [6] chr1 [172113839, 172113874] - |**** > > [7] chr12 [111080272, 111080307] - |**** > > [8] chr2 [179445437, 179445472] - |**** > > [9] chr10 [119817069, 119817104] + |**** > > ... ... ... ... ...**** > > [27421827] chr17 [ 43334904, 43334939] - |**** > > [27421828] chr6 [163903657, 163903692] + |**** > > [27421829] chr18 [ 74737099, 74737134] - |**** > > [27421830] chr13 [ 78311617, 78311652] - |**** > > [27421831] chr4 [170541832, 170541867] + |**** > > [27421832] chr13 [ 32601946, 32601981] + |**** > > [27421833] chr3 [ 38420019, 38420054] + |**** > > [27421834] chr8 [ 74561716, 74561751] - |**** > > [27421835] chr3 [ 39319812, 39319847] - |**** > > **** > > seqlengths**** > > chr1 chr10 chr11 chr12 chr13 chr14 ... chr6 chr7 chr8 chr9 chrX > chrY**** > > NA NA NA NA NA NA ... NA NA NA NA NA > NA**** > > > **** > > **** > > *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com] > *Sent:* Friday, July 22, 2011 3:32 PM > *To:* Kunbin Qu > *Cc:* Michael Lawrence; bioc-sig-sequencing@r-project.org**** > > > *Subject:* Re: [Bioc-sig-seq] count the coverage base by base**** > > **** > > You could call coverage on all sorts of things. How are you representing > your reads? GappedAlignments would work, for example. > > > showMethods(coverage) > Function: coverage (package IRanges) > x="AlignedXStringSet0" > x="GRangesList" > x="GappedAlignments" > x="GenomicRanges" > x="IRanges" > x="MIndex" > x="MaskCollection" > x="MaskedXString" > x="PairwiseAlignedFixedSubject" > x="PairwiseAlignedFixedSubjectSummary" > x="RangedData" > x="RangesList" > x="Views" > x="numeric"**** > > On Fri, Jul 22, 2011 at 3:24 PM, Kunbin Qu <k...@genomichealth.com> wrote:* > *** > > Steve, thanks for the advice. But I did not get it: coverage() is applied > to a set of IRanges. What should I use for the input of the coverage() then? > Is it possible for you to show me a little pseudo or real code? Thanks.*** > * > > **** > > -Kunbin**** > > **** > > **** > > *From:* Michael Lawrence [mailto:lawrence.mich...@gene.com] > *Sent:* Friday, July 22, 2011 3:15 PM > *To:* Kunbin Qu > *Cc:* bioc-sig-sequencing@r-project.org > *Subject:* Re: [Bioc-sig-seq] count the coverage base by base**** > > **** > > coverage(), form Views() on the coverage for your genes and then > viewSums().**** > > On Fri, Jul 22, 2011 at 2:36 PM, Kunbin Qu <k...@genomichealth.com> wrote:* > *** > > Hi, > > If I want to count the coverage base by base, not read by read, can I still > use countOverlaps? I have a human transcriptome done, and would like to > count the coverage for each gene based on the mapping. As there are some > reads mapped across the junctions (ie, one read is splitted into two > portions), or partially mapped into the introns, it would be good to count > the coverage by base pair, instead of by read number? Could anybody suggest > a way doing that? Thanks. > > -Kunbin > > > > ______________________________________________________________________ > The contents of this electronic message, including any attachments, are > intended only for the use of the individual or entity to which they are > addressed and may contain confidential information. If you are not the > intended recipient, you are hereby notified that any use, dissemination, > distribution, or copying of this message or any attachment is strictly > prohibited. If you have received this transmission in error, please send an > e-mail to postmas...@genomichealth.com and delete this message, along with > any attachments, from your computer. > [[alternative HTML version deleted]] > > _______________________________________________ > Bioc-sig-sequencing mailing list > Bioc-sig-sequencing@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing**** > > **** > > > ______________________________________________________________________ > The contents of this electronic message, including any attachments, are > intended only for the use of the individual or entity to which they are > addressed and may contain confidential information. If you are not the > intended recipient, you are hereby notified that any use, dissemination, > distribution, or copying of this message or any attachment is strictly > prohibited. If you have received this transmission in error, please send an > e-mail to postmas...@genomichealth.com and delete this message, along with > any attachments, from your computer.**** > > **** > > > ______________________________________________________________________ > The contents of this electronic message, including any attachments, are > intended only for the use of the individual or entity to which they are > addressed and may contain confidential information. If you are not the > intended recipient, you are hereby notified that any use, dissemination, > distribution, or copying of this message or any attachment is strictly > prohibited. If you have received this transmission in error, please send an > e-mail to postmas...@genomichealth.com and delete this message, along with > any attachments, from your computer.**** > > ** ** > > ______________________________________________________________________ > The contents of this electronic message, including any attachments, are > intended only for the use of the individual or entity to which they are > addressed and may contain confidential information. If you are not the > intended recipient, you are hereby notified that any use, dissemination, > distribution, or copying of this message or any attachment is strictly > prohibited. If you have received this transmission in error, please send an > e-mail to postmas...@genomichealth.com and delete this message, along with > any attachments, from your computer. > [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list Bioc-sig-sequencing@r-project.org https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing