Good Morning Avi:

liftOver is not an alignment program.  It merely uses the information
that came out of the alignment program 'lastz' and our chain/net processing
pipeline:

http://www.bx.psu.edu/miller_lab/

Various parameters used during the lastz, chain and net procedures
eliminate poor mappings of one genome to another, so not everything
maps.  liftOver is not recommended for investigating genome-to-genome
similarities.
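
To see why particular items did not map, it can help to tally the
reasons recorded in the unMapped file.  A minimal sketch in Python,
assuming the unMapped file holds a '#'-prefixed reason line ahead of
each original BED line; the function name and the example records are
made up for illustration:

```python
from collections import Counter
from io import StringIO


def tally_unmapped_reasons(lines):
    """Count the '#'-prefixed reason lines in liftOver's unMapped output.

    Assumption: each failed item appears as a '#reason' line followed by
    the original BED line. Non-comment lines are ignored here.
    """
    reasons = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#"):
            reasons[line.lstrip("#").strip()] += 1
    return reasons


# Hypothetical excerpt of an unMapped file (coordinates are invented).
example = StringIO(
    "#Deleted in new\n"
    "chr1\t1000\t1200\tregionA\n"
    "#Partially deleted in new\n"
    "chr2\t5000\t5400\tregionB\n"
    "#Deleted in new\n"
    "chr3\t700\t900\tregionC\n"
)

counts = tally_unmapped_reasons(example)
for reason, n in counts.most_common():
    print(f"{n}\t{reason}")
```

For a real run, pass an open file handle for the unMapped file in
place of the StringIO example.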

If you want to make different mappings, you can run lastz
with looser parameters to allow items to match that do not
survive our filter parameters.  For example, these are the parameters
currently in use for alignments to the new human assembly hg19:
http://genomewiki.ucsc.edu/index.php/Hg19_conservation_lastz_parameters
and the chain and net filter parameters:
http://genomewiki.ucsc.edu/index.php/Hg19_Genome_size_statistics
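
Short of realigning, a position that falls between aligned blocks can
be approached through the nearest conserved region that does lift, as
suggested earlier in this thread.  A minimal sketch in Python, assuming
you already have a list of intervals (on one chromosome, in source
assembly coordinates, half-open as in BED) that liftOver mapped
successfully; nearest_anchor is a hypothetical helper, not part of
liftOver:

```python
def nearest_anchor(position, anchors):
    """Return (anchor, distance) for the anchor interval closest to position.

    anchors: non-empty list of (start, end) half-open intervals on one
    chromosome. The distance is 0 when position lies inside the interval,
    otherwise the number of bases to the nearest base of the interval.
    """
    if not anchors:
        raise ValueError("no anchor intervals supplied")

    def distance(interval):
        start, end = interval
        if position < start:
            return start - position
        if position >= end:
            return position - end + 1  # end is exclusive
        return 0

    best = min(anchors, key=distance)
    return best, distance(best)


# Invented example: two regions that lifted, and one query position
# in the unaligned stretch between them.
lifted = [(1000, 1200), (2000, 2300)]
anchor, dist = nearest_anchor(1500, lifted)
print(anchor, dist)
```

The anchor found this way can then be lifted on its own, and the
distance tells you how far the position of interest sits from it.
Whether that offset carries over to the other genome is a judgment
call, since the unaligned stretch may differ in length there.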

--Hiram

Fungazid wrote:
> Thanks Jennifer,
> 
> I tried things like,
> 
> liftOver ./test.bed ./mm9ToHg18.over.chain ./test_out.bed ./unMapped 
> -minMatch=0.05 -multiple -minChainT=50 -minChainQ=50 -minSizeQ=50 
> 
> I understand that by having liftOver consider only long alignment
> blocks (increasing minChain) it is possible to filter spurious multiple hits.
> I see that minMatch is somehow related to the minimum conservation of the
> alignment block that liftOver considers. But can liftOver "map" regions with
> no coverage (by looking for the nearest match), or gaps?
> 
> Best regards,
> Avi
> 
> 
> --- On Tue, 6/2/09, Jennifer Jackson <[email protected]> wrote:
> 
>> From: Jennifer Jackson <[email protected]>
>> Subject: Re: [Genome] (no subject)
>> To: "Fungazid" <[email protected]>
>> Cc: [email protected]
>> Date: Tuesday, June 2, 2009, 9:30 PM
>> Hello Avi,
>>
>> Some options to try are to:
>>
>> 1) Lower the minMatch threshold. This will extend the
>> alignments around higher scoring regions. Then you can
>> compare the high scoring location (from a lift with a higher
>> minMatch) to identify the flanking sequence.
>>
>> 2) Allow multiple = Y. This will also help to
>> include/extend result data. Perhaps some regions are hitting
>> without specificity (this likelihood will increase with a
>> lower minMatch). This may not matter to you, since you will
>> be able to find the regions you are interested in based on
>> the higher scoring region locations.
>>
>> 3) Keep the parameters strict, locate the regions of high
>> identity, then use your own tools to build coordinates for
>> surrounding regions on the target genome, and extract
>> sequence from the genomic files in Downloads or use the "get
>> DNA" tool in the Assembly browser. This will only identify
>> regions that are contiguous with the original high scoring
>> lift region, but it sounds as if that is what you are trying
>> to do.
>>
>> 4) These processes can be cyclic. Run strict, filter, run
>> permissive, filter, etc. The failure reasons will alert you
>> to why the regions do not map and can point you to the
>> specific parameters to make less stringent to achieve a
>> lift. It is also possible that some regions will need some
>> manual judgment - try using the chain/net tracks in the
>> browser to visualize the data. The liftOver files are based
>> on this same source data.
>>
>> Permissive parameters can produce a lot of output. Consider
>> adding in other filters if they seem appropriate -
>> minChainT/Q and minSizeT/Q can help reduce the noise from
>> very small fragments mapping.
>>
>> LiftOver runs pretty quickly, so the best way to find a good
>> set of parameters is usually to test, analyze, and repeat.
>> Each experiment can be different and some
>> regions of the genome can differ from other regions
>> depending on the presence of repetitive elements or the type
>> of gene(s) or the finished state of the assembly. Repeating
>> subunits in a single gene or several genes in close genomic
>> proximity that were all derived from a common ancestor gene
>> can cause complications that sometimes only a person can
>> resolve.
>>
>> Good luck and we hope that some of these suggestions are
>> helpful,
>> Jennifer Jackson
>> UCSC Genome Bioinformatics Group
>>
>> Fungazid wrote:
>>> Hello UCSC team,
>>>
>>> Thanks for all the help,
>>> The standalone tools I needed are now giving me
>>> reasonable outputs, but I'm not sure about the exact
>>> parameters to use for my specific needs in the liftOver
>>> tool:
>>> In many introns the levels of conservation between
>>> distantly related species are very low, with only some
>>> islands of conserved regions (not only exons). Sometimes I
>>> wish to map coordinates between species in regions that
>>> are 200-500bp from the nearest conserved regions.
>>> Accordingly, in such cases I do not need to find the exact
>>> coordinate match, but I want the nearest match. Maybe you
>>> can suggest how to use liftOver in such a case.
>>> Avi
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
