Hello Marco,
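With the command-line version of liftOver, try the -multiple option. Without
it, an item that maps to more than one region in the new assembly is written
to the unmapped file under "#Split in new"; with it, the split pieces go to
the output file instead.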
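As a rough sketch, the command from your message could be rerun with the
option added. The file names are the ones you gave. The check that liftOver
and the chain file are present is only my addition, so the snippet fails
gracefully elsewhere:

```shell
# Recreate the one-line BED file from the original message (tab-separated).
printf 'chr2R\t12750470\t12755218\taaa\t1000\t-\n' > dmel_coords.bed

# Same command as before, with -multiple added so that split regions are
# allowed in the output instead of being reported as unmapped.
chain=chainFiles/dm3.droAna3.all.chain
if command -v liftOver >/dev/null 2>&1 && [ -f "$chain" ]; then
    liftOver -multiple dmel_coords.bed "$chain" dana.lift dana.unMap
else
    echo "liftOver or $chain not found; nothing was lifted" >&2
fi
```

I have not rerun this on your data, so please check that dana.lift now
contains the scaffold_13266 region that the web version reported.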
To see the complete usage statement for liftOver, simply type it at the command
line with no arguments:
> liftOver
liftOver - Move annotations from one assembly to another
usage:
liftOver oldFile map.chain newFile unMapped
oldFile and newFile are in bed format by default, but can be in GFF and
maybe eventually others with the appropriate flags below.
The map.chain file has the old genome as the target and the new genome
as the query.
***********************************************************************
WARNING: liftOver was only designed to work between different
assemblies of the same organism, it may not do what you want
if you are lifting between different organisms.
***********************************************************************
options:
   -minMatch=0.N  Minimum ratio of bases that must remap. Default 0.95
   -gff  File is in gff/gtf format. Note that the gff lines are converted
         separately. It would be good to have a separate check after this
         that the lines that make up a gene model still make a plausible gene
         after liftOver
   -genePred - File is in genePred format
   -sample - File is in sample format
   -bedPlus=N - File is bed N+ format
   -positions - File is in browser "position" format
   -hasBin - File has bin value (used only with -bedPlus)
   -tab - Separate by tabs rather than space (used only with -bedPlus)
   -pslT - File is in psl format, map target side only
   -minBlocks=0.N  Minimum ratio of alignment blocks or exons that must map
                   (default 1.00)
   -fudgeThick  (bed 12 or 12+ only) If thickStart/thickEnd is not mapped,
                use the closest mapped base. Recommended if using
                -minBlocks.
   -multiple  Allow multiple output regions
   -minChainT, -minChainQ  Minimum chain size in target/query, when mapping
                           to multiple output regions (default 0, 0)
   -minSizeT  deprecated synonym for -minChainT (ENCODE compat.)
   -minSizeQ  Min matching region size in query with -multiple.
   -chainTable  Used with -multiple, format is db.tablename,
                to extend chains from net (preserves dups)
   -errorHelp  Explain error messages
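
If many records end up in the unmapped file, it can help to count the reasons
given there. This is a minimal sketch. It assumes each reason is a line
starting with "#", as in the "#Split in new" line from your message. The names
tally_unmapped_reasons and example.unMap are made up for illustration and are
not part of liftOver:

```shell
# Count how often each "#reason" comment line occurs in an unmapped file.
# tally_unmapped_reasons is a hypothetical helper, not a liftOver feature.
tally_unmapped_reasons() {
    grep '^#' "$1" | sort | uniq -c | sort -rn
}

# Example input: the unmapped output quoted in this thread.
printf '#Split in new\nchr2R\t12750470\t12755218\taaa\t1000\t-\n' > example.unMap

# Prints one line per distinct reason, preceded by its count.
tally_unmapped_reasons example.unMap
```

Running liftOver with -errorHelp then explains what each message means.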
Regards,
----------
Ann Zweig
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu
Blanchette, Marco wrote:
> I am trying to use the command line version of liftOver with the available
> chain files to map locations from the Drosophila melanogaster genome to the
> other Drosophila genomes. My problem is that I am getting different results
> between the web and the command line versions. For instance, lifting the
> following coordinates between D. melanogaster and D. ananassae (dm3 to
> droAna3) on the web gives me the following result:
> Dmel:
> chr2R 12750470 12755218 aaa 1000 -
> Dana:
> scaffold_13266 13843857 13848583 aaa 1 +
>
>
> However, if I run the following command line (with dmel_coords.bed
> containing the BED coordinates above):
> $ liftOver dmel_coords.bed chainFiles/dm3.droAna3.all.chain dana.lift
> dana.unMap
>
> dana.lift is an empty file, while dana.unMap has the following lines:
> #Split in new
> chr2R 12750470 12755218 aaa 1000 -
>
> Why the discrepancies? I also get similar results with other genomes, while
> some give me the right answers (for instance, D. simulans with the
> dm3.droSim1.all.chain file gives me the right result). I downloaded the most
> recent chain files yesterday. Is there something I am not doing right? I can't
> figure it out...
>
> Thanks
> --
> Marco Blanchette, Ph.D.
> Assistant Investigator
> Stowers Institute for Medical Research
> 1000 East 50th St.
>
> Kansas City, MO 64110
>
> Tel: 816-926-4071
> Cell: 816-726-8419
> Fax: 816-926-2018
>
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome