Hi Lax,

On 04/10/11 08:13, Lakshmanan Iyer wrote:
Hello,
I created a GRangesList of 3'UTRs and GappedAlignments of a BAM file using
the following code:
_____________________
library(GenomicRanges);
  library(Rsamtools);
  library(leeBamViews)
library ("GenomicFeatures")
txdb<- makeTranscriptDbFromUCSC(genome="mm9", tablename="knownGene")
#
# It remembers the transcript ids. so can be used later
p3UTRs<- threeUTRsByTranscript(txdb, use.names=TRUE)
# read the bam alignment
aligns<- readGappedAlignments ("Dend_accepted_hits_sorted.bam")

--------------------------------------
However, when I try to get the countOverlaps, I get the following error:
c<- countOverlaps (p3UTRs, aligns)
Error in queryHits(findOverlaps(query, subject, maxgap = maxgap, type =
type)) :
   error in evaluating the argument 'x' in selecting a method for function
'queryHits'

I suspect that this happens as the sequence names, "chrUn_random chrX_random
chrY_random" etc which are present in the GRangeslist, p3UTRs are not
present in the GappedAlignement objects.

My questions is:
1. Can I make countOverlaps to choose only those intervals in which there
are entries in both the objects?
No. The general approach is to clean the subject and query before calling countOverlaps.
  2. How do I subjset GRangeList (p3UTRs) to include entries from specific
list of chromosomes of interest only?
To retain specific chromosomes, you can subset with seqnames then reset the seqlevels.

In R-2.13/BioC 2.8, using the grl object from the GRangesList man page :

> grl
GRangesList of length 3
$gr1
GRanges with 1 range and 2 elementMetadata values
    seqnames    ranges strand |     score        GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1]     chr2    [3, 6]      + |         5      0.45

$gr2
GRanges with 2 ranges and 2 elementMetadata values
    seqnames    ranges strand |     score        GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1]     chr1  [ 7,  9]      + |         3       0.3
[2]     chr1  [13, 15]      - |         4       0.5

$gr3
GRanges with 2 ranges and 2 elementMetadata values
    seqnames    ranges strand |     score        GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1]     chr1    [1, 3]      - |         6       0.4
[2]     chr2    [4, 9]      - |         2       0.1


seqlengths
 chr2 chr1
   NA   NA

    grlsub <- grl[seqnames(grl) == "chr1", ]
    seqlevels(grlsub) <- seqlevels(grlsub)[seqlevels(grlsub) == "chr1"]

You can remove the empty list elements with,

    grlsub <- grlsub[elementLengths(grlsub) >0]


> grlsub
GRangesList of length 2
$gr2
GRanges with 2 ranges and 2 elementMetadata values
    seqnames    ranges strand |     score        GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1]     chr1  [ 7,  9]      + |         3       0.3
[2]     chr1  [13, 15]      - |         4       0.5

$gr3
GRanges with 1 range and 2 elementMetadata values
    seqnames    ranges strand |     score        GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1]     chr1    [1, 3]      - |         6       0.4


seqlengths
 chr1
   NA


Valerie


> sessionInfo()
R version 2.13.0 beta (2011-03-30 r55199)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicRanges_1.3.37 IRanges_1.9.29

loaded via a namespace (and not attached):
[1] tools_2.13.0




-best
-Lax
sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GenomicFeatures_1.2.1 leeBamViews_0.99.11   BSgenome_1.18.2
[4] Biobase_2.10.0        Rsamtools_1.2.0       Biostrings_2.18.0
[7] GenomicRanges_1.2.3   IRanges_1.8.2

loaded via a namespace (and not attached):
[1] biomaRt_2.6.0      DBI_0.2-5          RCurl_1.4-3
RSQLite_0.9-2
[5] rtracklayer_1.10.5 tools_2.12.0       XML_3.2-0

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing


_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to