Re: [Bioc-sig-seq] Extracting DNA sequences from BSgenome.Mmusculus.UCSC.mm9_1.3.11

Ivan Gregoretti Fri, 29 May 2009 09:06:10 -0700

Hi Hervé,


> With BSgenome 1.12.1 (release) and 1.13.5 (devel) you can now do:
>
>  myseqs <- data.frame(
>    chr=c("chrY", "chr1", "chr2", "chr3", "chrY", "chr3", "chr1", "chr1"),
>    start=c(NA, -40, 8510201, 4920301, 30001, 9220500, -2804, -30),
>    end=c(50, NA, 8510220, 4920330, 30011, 9220555, -2801, -11)
>  )
>
>  library(BSgenome.Mmusculus.UCSC.mm9)
>
>  > getSeq(Mmusculus, myseqs$chr, myseqs$start, myseqs$end)
>  [1] "GATCCAAAACACATTCTCCCTGGTAGCATGGACAAGCAACATTTTGGGAG"
>  [2] "TTCTGTAAAGAATTTGGTATTAAACTTAAAACTGGAATTC"
>  [3] "ACGACTATAAAAACCTTTAG"
>  [4] "CATACAATAATTGTGGGGGAACTTCAAAAC"
>  [5] "ATCTTAATCAC"
>  [6] "CAGTAGTGGCGTACACCTTTAATCCCAGCACGTGGTAGGCAGAGGCAGATGGATTT"
>  [7] "ATGA"
>  [8] "AATTTGGTATTAAACTTAAA"
>
> to extract multiple subsequences from multiple chromosomes at once.
> (Note support for NAs and negative start or end.)
>

So, getSeq is vectorised now. Great. That addresses a very common use of getSeq.
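
For anyone reading along, here is how I understand the NA and negative coordinates, judging only from the output quoted above: an NA start seems to default to 1, an NA end to the chromosome length, and a negative value seems to count back from the end (so -1 is the last nucleotide). The sketch below is base R on a made-up toy string, not BSgenome code; resolve_coord and toy_get_seq are hypothetical helper names of mine, and the real getSeq may differ in details.

```r
# Toy stand-in for a chromosome (20 nt); this is not real mm9 sequence.
chrom <- "ACGTACGTAACCGGTTACGT"

# Hypothetical helper: turn NA / negative coordinates into plain
# 1-based positions on a sequence of length n.
resolve_coord <- function(x, n, default) {
  x <- ifelse(is.na(x), default, x)  # NA falls back to the given boundary
  ifelse(x < 0, n + x + 1, x)        # negative counts back from the end
}

# Hypothetical vectorised extractor over a single sequence string.
toy_get_seq <- function(seq, start, end) {
  n <- nchar(seq)
  s <- resolve_coord(start, n, default = 1)  # assumed: NA start means 1
  e <- resolve_coord(end, n, default = n)    # assumed: NA end means n
  substring(seq, s, e)
}

toy_get_seq(chrom, start = c(NA, -4, 3, -6), end = c(5, NA, 6, -3))
# [1] "ACGTA" "ACGT"  "GTAC"  "TTAC"
```

This is consistent with rows 2 and 8 of the quoted result: row 8 (-30 to -11) is a 20-letter stretch found inside row 2 (the last 40 letters of chr1).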


>
> Hopefully this time you won't get hit by the infamous bug you reported
> earlier (BTW anything new on that front? Were you able to reproduce it?
> Thanks).
>

Bug? The last time I was in real trouble, I solved my problem with
Michael's suggestions on the use of RangedData. But that was a feature
rather than a bug. Bottom line: I stick to RangedData now because it
is relatively easy to manipulate.

Thank you,

Ivan


Ivan Gregoretti, PhD
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
5 Memorial Dr, Building 5, Room 205.
Bethesda, MD 20892. USA.
Phone: 1-301-496-1592
Fax: 1-301-496-9878

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
