Hi Lax,
Please reply to the list when answering/asking questions. The
conversation might be helpful to someone else in the future.
The seqlevels function is not available in R-2.12/BioC 2.7 but is with
R-2.13/BioC 2.8. Version R-2.13/BioC 2.8 are being released this week
so you might want to update.
Take care,
Valerie
Hi Valerie
Thanks. Subsetting works fine. The only thing I found is that there is
NO function named "seqlevels",
I could get to the levels using the command at the individual list entries:
levels (seqnames (p3UTRsSub)[[1]])
This solves my problems, now I can iterative over the chromosomes and
find overlapping counts.
-thanks
-Lax
On 04/11/2011 06:53 AM, Valerie Obenchain wrote:
Hi Lax,
On 04/10/11 08:13, Lakshmanan Iyer wrote:
Hello,
I created a GRangesList of 3'UTRs and GappedAlignments of a BAM file
using
the following code:
_____________________
library(GenomicRanges);
library(Rsamtools);
library(leeBamViews)
library ("GenomicFeatures")
txdb<- makeTranscriptDbFromUCSC(genome="mm9", tablename="knownGene")
#
# It remembers the transcript ids. so can be used later
p3UTRs<- threeUTRsByTranscript(txdb, use.names=TRUE)
# read the bam alignment
aligns<- readGappedAlignments ("Dend_accepted_hits_sorted.bam")
--------------------------------------
However, when I try to get the countOverlaps, I get the following error:
c<- countOverlaps (p3UTRs, aligns)
Error in queryHits(findOverlaps(query, subject, maxgap = maxgap, type =
type)) :
error in evaluating the argument 'x' in selecting a method for
function
'queryHits'
I suspect that this happens as the sequence names, "chrUn_random
chrX_random
chrY_random" etc which are present in the GRangeslist, p3UTRs are not
present in the GappedAlignement objects.
My questions is:
1. Can I make countOverlaps to choose only those intervals in which
there
are entries in both the objects?
No. The general approach is to clean the subject and query before
calling countOverlaps.
2. How do I subjset GRangeList (p3UTRs) to include entries from
specific
list of chromosomes of interest only?
To retain specific chromosomes, you can subset with seqnames then
reset the seqlevels.
In R-2.13/BioC 2.8, using the grl object from the GRangesList man page :
> grl
GRangesList of length 3
$gr1
GRanges with 1 range and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1] chr2 [3, 6] + | 5 0.45
$gr2
GRanges with 2 ranges and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1] chr1 [ 7, 9] + | 3 0.3
[2] chr1 [13, 15] - | 4 0.5
$gr3
GRanges with 2 ranges and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1] chr1 [1, 3] - | 6 0.4
[2] chr2 [4, 9] - | 2 0.1
seqlengths
chr2 chr1
NA NA
grlsub <- grl[seqnames(grl) == "chr1", ]
seqlevels(grlsub) <- seqlevels(grlsub)[seqlevels(grlsub) == "chr1"]
You can remove the empty list elements with,
grlsub <- grlsub[elementLengths(grlsub) >0]
> grlsub
GRangesList of length 2
$gr2
GRanges with 2 ranges and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1] chr1 [ 7, 9] + | 3 0.3
[2] chr1 [13, 15] - | 4 0.5
$gr3
GRanges with 1 range and 2 elementMetadata values
seqnames ranges strand | score GC
<Rle> <IRanges> <Rle> | <integer> <numeric>
[1] chr1 [1, 3] - | 6 0.4
seqlengths
chr1
NA
Valerie
> sessionInfo()
R version 2.13.0 beta (2011-03-30 r55199)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.3.37 IRanges_1.9.29
loaded via a namespace (and not attached):
[1] tools_2.13.0
-best
-Lax
sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicFeatures_1.2.1 leeBamViews_0.99.11 BSgenome_1.18.2
[4] Biobase_2.10.0 Rsamtools_1.2.0 Biostrings_2.18.0
[7] GenomicRanges_1.2.3 IRanges_1.8.2
loaded via a namespace (and not attached):
[1] biomaRt_2.6.0 DBI_0.2-5 RCurl_1.4-3
RSQLite_0.9-2
[5] rtracklayer_1.10.5 tools_2.12.0 XML_3.2-0
[[alternative HTML version deleted]]
_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing