Re: [Genome] (no subject)

Jennifer Jackson Tue, 02 Jun 2009 11:31:47 -0700

Hello Avi,

Some options to try are to:

1) Lower the minMatch threshold. This will extend the alignments around 
higher scoring regions. You can then compare the result against the high 
scoring location (from a lift with a higher minMatch) to identify the 
flanking sequence.

2) Set Allow multiple = Y. This will also help to include/extend result 
data. Some regions may then hit without specificity (this becomes more 
likely with a lower minMatch). This may not matter to you, since you 
will be able to find the regions you are interested in based on the 
higher scoring region locations.

3) Keep the parameters strict, locate the regions of high identity, then 
use your own tools to build coordinates for surrounding regions on the 
target genome, and extract sequence from the genomic files in Downloads 
or use the "get DNA" tool in the Assembly browser. This will only 
identify regions that are contiguous with the original high scoring lift 
region, but it sounds as if that is what you are trying to do.

4) These processes can be cyclic. Run strict, filter, run permissive, 
filter, etc. The failure reasons will alert you to why the regions do 
not map and can point you to the specific parameters to make less 
stringent to achieve a lift. It is also possible that some regions will 
need some manual judgment - try using the chain/net tracks in the 
browser to visualize the data. The liftOver files are based on this same 
source data.
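
As a rough illustration of option 3, here is a minimal Python sketch of 
"building coordinates for surrounding regions" from a strict lift. The 
Anchor record and both function names are hypothetical (they are not part 
of liftOver), coordinates are assumed to be 0-based and half-open as in 
BED, and the estimate assumes no insertions or deletions between the 
anchor and the position of interest, so treat the result as a starting 
point for inspection rather than a true lift.

```python
from typing import NamedTuple, Sequence, Tuple


class Anchor(NamedTuple):
    """Hypothetical record: a region that lifted under strict parameters."""
    src_start: int   # source assembly, 0-based half-open
    src_end: int
    dst_chrom: str
    dst_start: int   # target assembly, 0-based half-open
    dst_end: int
    dst_strand: str  # "+" or "-"


def signed_gap(pos: int, a: Anchor) -> int:
    """Distance from pos to the nearest edge of the anchor on the source.

    Negative if pos lies before the anchor, positive if after, 0 if inside.
    """
    if pos < a.src_start:
        return pos - a.src_start
    if pos >= a.src_end:
        return pos - (a.src_end - 1)
    return 0


def estimate_target(pos: int, anchors: Sequence[Anchor]) -> Tuple[str, int, int]:
    """Project pos onto the target by offsetting from the nearest anchor.

    Returns (target chrom, estimated target position, distance to anchor).
    Assumption: the flank has the same length in both genomes.
    """
    a = min(anchors, key=lambda x: abs(signed_gap(pos, x)))
    gap = signed_gap(pos, a)
    if gap == 0:
        raise ValueError("position is inside an anchor; lift it directly")
    if a.dst_strand == "+":
        edge = a.dst_start if gap < 0 else a.dst_end - 1
        est = edge + gap
    else:
        # On the minus strand the source start corresponds to the target end.
        edge = a.dst_end - 1 if gap < 0 else a.dst_start
        est = edge - gap
    return a.dst_chrom, max(est, 0), abs(gap)
```

The returned distance makes it easy to discard estimates that sit further 
from a conserved island than you are willing to trust.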

Permissive parameters can produce a lot of output. Consider adding in 
other filters if they seem appropriate - minChainT/Q and minSizeT/Q can 
help reduce the noise from very small fragments mapping.
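
To make the strict/permissive cycle concrete, here is a small sketch that 
only assembles the command lines for the standalone liftOver program. The 
helper name, the file names, and the threshold values are made up for 
illustration; the option names are the ones printed in the liftOver usage 
message, but please check that message for your own build before relying 
on them.

```python
from typing import List, Optional


def liftover_cmd(old_bed: str, chain: str, new_bed: str, unmapped: str,
                 min_match: float = 0.95, multiple: bool = False,
                 min_chain_t: Optional[int] = None,
                 min_chain_q: Optional[int] = None,
                 min_size_q: Optional[int] = None) -> List[str]:
    """Build (but do not run) a liftOver command as an argument list."""
    cmd = ["liftOver", f"-minMatch={min_match}"]
    if multiple:
        cmd.append("-multiple")
        # These filters are only added together with -multiple.
        if min_chain_t is not None:
            cmd.append(f"-minChainT={min_chain_t}")
        if min_chain_q is not None:
            cmd.append(f"-minChainQ={min_chain_q}")
        if min_size_q is not None:
            cmd.append(f"-minSizeQ={min_size_q}")
    cmd += [old_bed, chain, new_bed, unmapped]
    return cmd


# Hypothetical file names and thresholds, for illustration only.
strict = liftover_cmd("in.bed", "map.chain", "strict.bed", "strict.unmapped")
permissive = liftover_cmd("in.bed", "map.chain", "loose.bed", "loose.unmapped",
                          min_match=0.5, multiple=True,
                          min_chain_t=500, min_size_q=100)
print(" ".join(strict))
print(" ".join(permissive))
```

Each list can be handed to subprocess.run(cmd, check=True) once the 
liftOver binary is on your PATH.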

LiftOver runs pretty quickly, so the best way to settle on a set of 
parameters is usually to test, analyze, and adjust. Each experiment can 
be different and some regions of the genome can differ from other 
regions depending on the presence of repetitive elements or the type of 
gene(s) or the finished state of the assembly. Repeating subunits in a 
single gene or several genes in close genomic proximity that were all 
derived from a common ancestor gene can cause complications that 
sometimes only a person can resolve.
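
For the analyze step, one simple approach is to tally the failure reasons 
from the unmapped output of each run and see how the counts shift as the 
parameters change. The sketch below assumes the usual layout of that file, 
where each failed record is preceded by a line starting with "#" that 
gives the reason; the function name and the sample lines are illustrative 
only.

```python
from collections import Counter
from typing import Iterable


def tally_failures(lines: Iterable[str]) -> Counter:
    """Count the '#'-prefixed reason lines in a liftOver unmapped file."""
    return Counter(line[1:].strip() for line in lines if line.startswith("#"))


# Illustrative sample; with a real run, pass an open file handle instead,
# e.g. tally_failures(open("strict.unmapped")).
sample = [
    "#Deleted in new\n",
    "chr1\t100\t200\n",
    "#Partially deleted in new\n",
    "chr1\t300\t400\n",
    "#Deleted in new\n",
    "chr2\t500\t600\n",
]
for reason, count in tally_failures(sample).most_common():
    print(count, reason)
```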

Good luck and we hope that some of these suggestions are helpful,
Jennifer Jackson
UCSC Genome Bioinformatics Group

Fungazid wrote:
> Hello UCSC team,
>
> Thanks for all the help,
> The standalone tools I needed are now giving me reasonable outputs, but I'm 
> not sure about the exact parameters to use for my specific needs in 
> the liftOver tool:
>
> In many introns the level of conservation between distantly related species 
> is very low, with only some islands of conserved regions (not only exons). 
> Sometimes I wish to map coordinates between species in regions that are 
> 200-500bp from the nearest conserved regions. Accordingly, in such cases I do 
> not need to find an exact coordinate match, but I want the nearest match. 
> Maybe you can suggest how to use liftOver in such a case.
>
> Avi
>
>
>
>       
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>   