Thank you, Galt, for your detailed information.

I understand that the optimal configuration depends on the needs. So: my query 
sequences are cDNAs of 100-5000 bp. One of the goals is to detect variations 
such as intron retention between related mammals, e.g. primates vs. rodents 
(which is why I need genomes as targets).
The basic configuration finds most but not all HSPs per hit (as a result, 
small exons or larger intronic regions are sometimes missed). Optimization 
is problematic, though, because I often see that even -stepSize=5 is less 
sensitive than the default stepSize. As far as I understand, this can happen 
because repetitive sequences are ignored once they occur too many times, 
which becomes more likely as sensitivity rises. Should I increase -repMatch 
to prevent this? And what is the program's default repMatch for 
[-stepSize=5,-tileSize=10] and for [-stepSize=5,-tileSize=default]?
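
To make the suspicion above concrete, here is a toy Python sketch. It is not 
BLAT's actual code: the function name overused_tiles, the toy target and the 
thresholds are all made up for illustration. It only assumes that tiles are 
sampled every step_size bases and that a tile seen more than rep_match times 
is dropped as overused.

```python
from collections import Counter

def overused_tiles(target, tile_size, step_size, rep_match):
    """Toy model: tiles sampled every step_size bases whose count exceeds rep_match."""
    counts = Counter(
        target[i:i + tile_size]
        for i in range(0, len(target) - tile_size + 1, step_size)
    )
    return {tile for tile, n in counts.items() if n > rep_match}

# Toy target: a 200-base homopolymer run followed by a short unique tail.
target = "A" * 200 + "ACGTTGCA"

# Sampling every 4 bases keeps the repeat under the threshold ...
coarse = overused_tiles(target, tile_size=4, step_size=4, rep_match=60)
# ... sampling every 2 bases roughly doubles its count and pushes it over ...
fine = overused_tiles(target, tile_size=4, step_size=2, rep_match=60)
# ... and doubling the threshold as well brings the tile back.
fine_relaxed = overused_tiles(target, tile_size=4, step_size=2, rep_match=120)
```

In this toy model a smaller step with an unchanged threshold masks the 
repeat tile, which is the effect I suspect; whether the real program behaves 
this way is exactly my question.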

thanks,
Avi



--- On Mon, 3/8/10, Galt Barber <[email protected]> wrote:

> From: Galt Barber <[email protected]>
> Subject: Re: [Genome] gfServer/gfClient and -tileSize
> To: [email protected]
> Date: Monday, March 8, 2010, 7:35 PM
> 
> Higher tileSize increases memory,
> increases speed, decreases sensitivity slightly.
> 
> The default tileSize 11 is very good.
> On rare occasions you see 10 or 12 used.
> Smaller tileSizes tend to lead to
> dramatically longer runtime.
> 
> It's a little complex to state easily
> in a formula because there are multiple
> phases internally that each have different
> characteristics.
> 
> The default stepSize is just tileSize.
> This means that you are sampling a
> position of the genome every stepSize bases.
> 
> For PCR primer searching, we leave tileSize at 11
> and lower stepSize to 5 for increased
> sensitivity.  Of course this will also
> cause the runtime to grow.
> 
> Increasing sensitivity means increasing
> the number of hits, and each hit that
> has to be explored can take a lot of
> processing.
> 
> And of course, whatever generalizations
> one would make, the real power, speed,
> and memory required will depend
> on the characteristics of the genome
> and the queries, not to mention several command-line
> switches that are available.
> 
> But luckily the defaults have good
> performance and sensitivity
> for a wide range of applications.
> 
> If you are doing short-reads then
> perhaps one of the many good freely
> available short-read aligners like
> would be useful.
> 
> BLAT is free for non-commercial use.
> 
> -Galt
> 
> On 3/8/2010 7:03 AM, Fungazid wrote:
> > Hello people,
> >
> >
> > About gfServer/gfClient :
> >
> > I see that a higher -tileSize leads to a higher memory
> > requirement. Is a higher -tileSize expected to decrease
> > detection power?
> > In addition, should a higher -tileSize enhance the speed
> > of gfServer/gfClient?
> >
> > And what is -stepSize, and how does it affect the
> > detection power, speed and memory requirement?
> >
> >
> > Thanks,
> > Avi
> >
> >
> >
> >
> >
> > _______________________________________________
> > Genome maillist  -  [email protected]
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
> 
> 


      

