Re: [Genome] chain file/liftover for Arabidopsis thaliana

Hiram Clawson Wed, 01 Apr 2009 12:06:59 -0700

Good Morning Raj:

The steps are pretty straightforward.


A. target genome = the genome you are lifting coordinates from - targetSequence.2bit
B. query genome = the genome you are lifting coordinates to - querySequence.2bit

1. Break up your query genome into 10,000-base chunks.
        Keep the lift file from this breakup; you need it to lift the blat results.
2. Leave the target genome pieces/chroms whole, no splitting
3. blat each of your query genome chunks to each target genome piece
        blat targetPiece queryChunk tmpUnlifted.psl \
                -tileSize=11 -ooc=target.11.ooc -minScore=100 -minIdentity=98 \
                -fastMap -noHead
4. Lift the blat results:
        liftUp -pslQ -nohead resultOut.psl query.lft warn tmpUnlifted.psl
5. Chain the results:
        cat allResultsForTargetPiece*.psl \
            | axtChain -linearGap=medium -psl stdin \
                targetSequence.2bit querySequence.2bit \
                resultChainForTargetPiece.chain
        chainMergeSort resultChainFor*.chain > mergedChain.chain
        chainSplit targetChains mergedChain.chain
6. Net the chains:
        foreach C (targetChains/*.chain)
            chainNet ${C} targetSizes querySizes netSplit/${C:t:r}.net /dev/null
            netChainSubset netSplit/${C:t:r}.net ${C} stdout \
                | chainStitchId stdin overSplit/${C:t:r}.chain
        end
7. The resulting liftOver chain file is:
        cat overSplit/*.chain > target.To.query.over.chain
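
As a rough illustration of the bookkeeping behind steps 1 and 4, the sketch
below prints the lines a lift file would hold for one query sequence cut into
fixed-size chunks. It assumes the five-column layout that liftUp reads (offset,
chunk name, chunk size, parent name, parent size). The function name make_lift
and the NAME_OFFSET chunk naming are made up for this example, and it is
written in POSIX sh rather than csh. It only does the arithmetic; it does not
split any sequence.

```shell
# make_lift NAME SIZE CHUNK
# Print one tab-separated lift line per chunk of a sequence NAME that is
# SIZE bases long, cut into pieces of at most CHUNK bases.
# Columns: offset, chunkName, chunkSize, parentName, parentSize.
# Chunk names (NAME_OFFSET) are a hypothetical convention for this sketch.
make_lift() {
    ml_name=$1
    ml_size=$2
    ml_chunk=$3
    ml_offset=0
    while [ "$ml_offset" -lt "$ml_size" ]; do
        # The last chunk may be shorter than CHUNK.
        ml_len=$(( ml_size - ml_offset ))
        if [ "$ml_len" -gt "$ml_chunk" ]; then
            ml_len=$ml_chunk
        fi
        printf '%s\t%s_%s\t%s\t%s\t%s\n' \
            "$ml_offset" "$ml_name" "$ml_offset" "$ml_len" "$ml_name" "$ml_size"
        ml_offset=$(( ml_offset + ml_chunk ))
    done
}

# Example: a 25,000-base sequence in 10,000-base chunks gives three lines.
make_lift chr1 25000 10000
```

Each blat hit against a chunk is reported in chunk coordinates; adding the
offset from the matching lift line puts it back on the parent sequence, which
is what the liftUp call in step 4 does with query.lft.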

It is mostly a matter of bookkeeping to keep all the pieces in the correct
place at the correct time with the correct name.  See also the script
that performs this in our environment, in the source tree:
        src/hg/utils/automation/doSameSpeciesLiftOver.pl
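
To give a feel for that bookkeeping outside our environment, here is a small
sketch that writes out one blat command per (target piece, query chunk) pair,
using the options from step 3. The function name print_blat_jobs, the file
names, and the psl/TARGET/CHUNK.psl output layout are assumptions made for
this example, not what doSameSpeciesLiftOver.pl does. It is POSIX sh and only
prints the commands; it does not run blat.

```shell
# print_blat_jobs "TARGET FILES" "QUERY CHUNK FILES"
# Print one blat command line for every target/chunk combination.
# Both arguments are space-separated lists, split on whitespace on purpose.
# The psl/<target>/<chunk>.psl output layout is hypothetical.
print_blat_jobs() {
    for pj_t in $1; do
        pj_tb=$(basename "$pj_t" .2bit)
        for pj_q in $2; do
            pj_qb=$(basename "$pj_q" .fa)
            printf 'blat %s %s psl/%s/%s.psl -tileSize=11 -ooc=target.11.ooc -minScore=100 -minIdentity=98 -fastMap -noHead\n' \
                "$pj_t" "$pj_q" "$pj_tb" "$pj_qb"
        done
    done
}

# Example: two target pieces and three query chunks give six jobs.
print_blat_jobs "chr1.2bit chr2.2bit" "chr1_0.fa chr1_10000.fa chr1_20000.fa"
```

Keeping one output directory per target piece makes the later
"cat allResultsForTargetPiece*.psl" step a simple glob.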

--Hiram

Rajkumar Sasidharan wrote:
> Hi,
> 
> I am one of the curators at The Arabidopsis Information Resource (TAIR), 
> hosted at the Carnegie Institution on the Stanford University campus. We are 
> looking into the possibility of providing liftOver files for converting 
> coordinates across A. thaliana genome builds and several closely related 
> species.
> 
> Is this something we could do routinely on our side without bothering 
> you often? Could you give me details or pointers on generating these files?
> 
> Thanks,
> Raj
> _______________________________________________
> Genome maillist  -  [email protected]
> http://www.soe.ucsc.edu/mailman/listinfo/genome
> 
