Hi, Mark:

On Nov 17, 11:46 pm, Mark Robinson <mrobin...@wehi.edu.au> wrote:
> Hi Sabrina.
>
> Comments below.
>
> > Hi, Mark:
> > Thanks for the information. My worry about using coordinates is that
> > they change with the genome assembly, while sequences do not.
> > I don't know how many PSRs fall outside exon boundaries; I will look
> > into it. But here is an example:
> > group Number: 5092691:
> > CTTATCGAGATAAAAAGTGCTTCTGTGGGTCAATCTAGATATTGATAGATTTGGACTGGAGAAG
>
> > When I ran BLAT at Ensembl, it showed that the sequence falls outside
> > the exon boundary.
>
> I guess I mean: do you have an example of a (core) probe that maps to  
> a location not within a bona fide exon?  What's above is certainly
> not an Affymetrix 25-mer probe, so I'm a bit confused.


This is the sequence of the PSR, or of the unitGroup. The file I used
was downloaded from the Affy website. According to the website:
Probe set sequences consist of the contiguous genomic sequence
starting at the beginning of the first probe and ending at the end of
the last probe in the set as they are aligned to the genome. They are
provided in the orientation they exist in the mRNA in 5'->3'
direction.

So my hypothesis is that if all of the probes in a PSR are in the exon
region, then the PSR sequence should be in it as well, right?
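
To make the hypothesis concrete, here is a small sketch with hypothetical
coordinates, assuming the IRanges package. The PSR is taken as the span from
the first probe's start to the last probe's end. If every probe sits in the
same exon, the span is inside that exon too; the check only fails when the
probes of one set sit in different exons, which is an invented case here and
not something observed on the array.

```r
library(IRanges)

# Hypothetical exon coordinates on one chromosome
exons <- IRanges(start = c(100, 300), end = c(200, 400))

# Case 1: three 25-mer probes, all inside the first exon
probesSame <- IRanges(start = c(110, 140, 170), width = 25)
# Case 2 (invented): two probes, each inside a different exon
probesTwo <- IRanges(start = c(170, 310), width = 25)

# TRUE if every range in 'x' lies entirely within some exon
allWithin <- function(x) all(countOverlaps(x, exons, type = "within") > 0)

# PSR span = first probe start to last probe end
psrSame <- range(probesSame)
psrTwo  <- range(probesTwo)

probesInExonSame <- allWithin(probesSame)  # probes all within an exon
psrInExonSame    <- allWithin(psrSame)     # span within the same exon
probesInExonTwo  <- allWithin(probesTwo)   # probes all within some exon
psrInExonTwo     <- allWithin(psrTwo)      # span crosses the gap between exons
```

So the hypothesis holds as long as "the exon region" means a single exon.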


>
> > Forgot to ask another question. Back to my original question, is there
> > an (easy) way to map from a partial sequence (i.e. the probeset
> > sequence) to exon sequence in a batch mode or in R? Thanks!
>
> This is not really an aroma.affymetrix question and there are various  
> answers to this on the Bioconductor mailing list.  Personally, I  
> usually use the findOverlaps() function in the IRanges package.  It's super
> quick.  Here is a rough sketch of an example (please don't take this
> and assume it will work for you ... this is just to direct you toward
> a way to do it):


I will look into it! Thanks a lot :)


Sabrina
>
> library(IRanges)
>
> # assume you have 3 corresponding vectors for the exons:
> # 'exonStart', 'exonEnd', 'exonChr'
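> # SIMPLIFY = FALSE keeps one IRanges object per chromosome in a plain list
> exonIranges <- mapply(IRanges, start = split(exonStart, exonChr),
>                       end = split(exonEnd, exonChr), SIMPLIFY = FALSE)
> # IRangesList() in current IRanges releases (RangesList() in older ones)
> exonL <- do.call(IRangesList, exonIranges)
>
> # similarly, corresponding vectors for the 25-mer Affy probes
> # 'sp' is position, 'ch' is chromosome
> probeRanges <- mapply(IRanges, start = split(sp, ch),
>                       end = split(sp + 24, ch), SIMPLIFY = FALSE)
> probeL <- do.call(IRangesList, probeRanges)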
>
> # you may wish to make sure that these lists cover the same chromosomes
> fo <- findOverlaps(probeL, exonL)
>
> I'll leave it as an exercise to the reader to unwrap the contents of  
> the 'fo' object [Hint: see the as.table() method].  In your case,  
> you'd be interested to know how many probes do not "overlap" within  
> exon boundaries and I'd guess that you'll be careful what set of  
> probes you use (e.g. core only?).
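
A minimal sketch of that exercise, assuming a current Bioconductor IRanges
and made-up toy coordinates (the values below are hypothetical; only the
vector names come from the sketch above). Instead of unwrapping 'fo', it
works chromosome by chromosome and flags the probes that lie entirely within
some exon, using countOverlaps() with type = "within".

```r
library(IRanges)

# Hypothetical toy data, reusing the vector names from the sketch above
exonStart <- c(100, 300, 50)
exonEnd   <- c(200, 400, 150)
exonChr   <- c("chr1", "chr1", "chr2")
sp <- c(110, 190, 250, 60)                 # probe start positions
ch <- c("chr1", "chr1", "chr1", "chr2")    # probe chromosomes

# Probes on chromosomes without any exon stay FALSE
inExon <- logical(length(sp))
for (chr in intersect(unique(ch), unique(exonChr))) {
  p <- ch == chr
  e <- exonChr == chr
  probes <- IRanges(start = sp[p], width = 25)   # 25-mer probes
  exons  <- IRanges(start = exonStart[e], end = exonEnd[e])
  # type = "within": the whole probe must lie inside a single exon
  inExon[p] <- countOverlaps(probes, exons, type = "within") > 0
}

nOutside <- sum(!inExon)   # number of probes not fully inside any exon
```

With real data, nOutside is the count asked for, and which(!inExon) gives
the probes to inspect.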
>
> Hope that helps.
>
> Cheers,
> Mark
>
> > Of course, if I made a mistake when I compiled the CDF, then that is
> > a different story. I hope not.
>
> > Any suggestions? Thanks!
>
> > On Nov 16, 8:10 pm, Mark Robinson <mrobin...@wehi.edu.au> wrote:
> >> Hi folks.
>
> >> Note that you can download these directly with the R/Bioconductor
> >> package 'rtracklayer'.  For example:
>
> >> library(rtracklayer)
> >> session <- browserSession("UCSC")
> >> q1 <- ucscTableQuery(session, "refGene", GenomicRanges(genome =  
> >> "hg18"))
> >> refGene <- getTable(q1)
>
> >> Sabrina:  I'm actually surprised that many probes lie outside exon
> >> boundaries.  They were specifically designed to be inside.  Of course,
> >> the array was designed on annotation from a few years ago, but still I
> >> would expect this to be minimal.  Can you give some numbers on this?
> >> Or, some examples.
>
> >> Cheers,
> >> Mark
>
> >> On 17-Nov-09, at 3:41 AM, camelbbs wrote:
>
> >>> http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/
> >>> refFlat.txt.gz and refGene.txt.gz are similar.
>
> >>> On Nov 15, 5:02 pm, sabrina <sabrina.s...@gmail.com> wrote:
> >>>> Hi, Jiang:
> >>>> Can you give me more detail about the UCSC RefGene file? Is there a
> >>>> link to download it? Thanks!
>
> >>>> Sabrina
>
> >>>> On Nov 15, 11:38 am, camelbbs <camel...@gmail.com> wrote:
>
> >>>>> hi,
> >>>>> I think you can get the probeset coordinates from affy exon array
> >>>>> annotation files, and you can get exon coordinates from the UCSC
> >>>>> RefGene files. Then you can compare them.
> >>>>> Best,
> >>>>> Jiang
>
> >>>>> On Nov 15, 6:56 am, sabrina <sabrina.s...@gmail.com> wrote:
>
> >>>>>> Hello, all:
> >>>>>> I used FIRMA to find potentially spliced genes. But as we know, the
> >>>>>> probesets from the Affy exon array could fall outside exon
> >>>>>> boundaries or cover only part of an exon. If I have the sequence of
> >>>>>> a probeset (which I get from the Affy website), how do I check in
> >>>>>> batch whether it is in an exon (ENSEMBL) region and, if it is, how
> >>>>>> do I get the entire exon sequence and coordinates? Thanks a lot!
>
> >>>>>> Sabrina
>
> >>> --
> >>> When reporting problems on aroma.affymetrix, make sure 1) to run the
> >>> latest version of the package, 2) to report the output of
> >>> sessionInfo() and traceback(), and 3) to post a complete code  
> >>> example.
>
> >>> You received this message because you are subscribed to the Google
> >>> Groups "aroma.affymetrix" group.
> >>> To post to this group, send email to aroma-affymetrix@googlegroups.com
> >>> To unsubscribe from this group, send email to 
> >>> aroma-affymetrix-unsubscr...@googlegroups.com
> >>> For more options, visit this group at
> >>> http://groups.google.com/group/aroma-affymetrix?hl=en
>
> >> ------------------------------
> >> Mark Robinson, PhD (Melb)
> >> Epigenetics Laboratory, Garvan
> >> Bioinformatics Division, WEHI
> >> e: m.robin...@garvan.org.au
> >> e: mrobin...@wehi.edu.au
> >> p: +61 (0)3 9345 2628
> >> f: +61 (0)3 9347 0852
> >> ------------------------------
>
