Good Morning Andreas:

Watch out for the - strand coordinates: in MAF they are relative to the
reverse complement of the scaffold, not to the + strand.
The + strand start and end coordinates of this sequence are found via:

157194 = 225932-(68705+33)

157227 = 225932-68705
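
The same arithmetic can be sketched in Python. This is only an illustration, not part of the kent utilities: the function names maf_minus_to_plus and reverse_complement are made up here, and it assumes the usual MAF convention of a zero-based start counted from the end of the reverse-complemented source sequence.

```python
# Hypothetical helpers for illustration; not part of any UCSC tool.

def maf_minus_to_plus(start, size, src_size):
    """Convert a MAF '-' strand (start, size) to a 0-based, half-open
    [plus_start, plus_end) range on the '+' strand."""
    plus_start = src_size - (start + size)
    plus_end = src_size - start
    return plus_start, plus_end


# Complement table that preserves upper/lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, keeping case."""
    return seq.translate(_COMPLEMENT)[::-1]


# Values from the MAF line: start=68705, size=33, srcSize=225932
plus_start, plus_end = maf_minus_to_plus(68705, 33, 225932)
print(plus_start, plus_end)  # 157194 157227
```

The resulting pair can be passed as -start and -end to twoBitToFa, which also uses a zero-based start and an exclusive end.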

Then, reverse complement your result like so:

$ twoBitToFa -start=157194 -end=157227 -seq=scaffold_2538 dasNov2.2bit stdout \
    | faRc -keepCase stdin stdout
 >RC_scaffold_2538:157194-157227
ATCTACTCTTTTACCCATCCCTCCAAAAAGCCT

--Hiram

Andreas Gruber wrote:
> Hi
> 
> I am having problems matching a sequence listed in a MAF alignment (hg18 
> 44way multiz) back to the genome of the species. I have this line from 
> the Armadillo in the alignment:
> s dasNov2.scaffold_2538               68705 33 -    225932 
> ATCTACTC----------T-TTTACCCATCCCTCCAAAAAGCCT
> 
> I have obtained the dasNov2 genome assembly from 
> ftp://hgdownload.cse.ucsc.edu/gbdb/dasNov2/dasNov2.2bit and converted it 
> back to fasta using the utilities from the BLAT suite.
> 
> When I try to extract the particular sequence now using fastacmd, I end 
> up with a sequence containing just Ns.
> fastacmd -d dasNov2.fa -s scaffold_2538 -L 68705,68738 -S2
>  >lcl|scaffold_2538:68705-68738 No definition line found
> NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
> 
> Any ideas what went wrong?
> 
> Cheers,
> Andreas
> 

_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
