Dear UCSC,

I'm trying to use Multiz to merge pairwise alignments (all with the human genome 
as the target) into multiple-species alignments. Before that, I need to 
convert the axt files to maf format, so I need the tSizes and qSizes 
files for all the species I am interested in.
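
For illustration, a sizes file of the kind fetchChromSizes writes is plain text with one sequence per line: the sequence name, whitespace, then its length in bases. Below is a minimal Python sketch, assuming that layout, for loading such a file and listing the sequence names that still lack a size. The function names read_sizes and missing_sizes and the sample values are hypothetical and not part of any UCSC tool.

```python
def read_sizes(lines):
    """Parse 'name<whitespace>size' lines into a dict of name -> length."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        sizes[fields[0]] = int(fields[1])
    return sizes


def missing_sizes(names, sizes):
    """Return the sequence names that have no entry in the sizes table."""
    return sorted(set(names) - set(sizes))


# Made-up sample input; a real table would be read from a sizes file,
# e.g. read_sizes(open("some.chrom.sizes")) with a path of your own.
sample = ["chrA\t1000", "scaffold_7\t250"]
sizes = read_sizes(sample)
print(missing_sizes(["chrA", "scaffold_9"], sizes))
```

Running a check like this over the sequence names found in the axt files shows which entries are still missing before axtToMaf is run.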

In her latest reply, Dr. Angie Hinrichs told us to use fetchChromSizes to get 
the sizes for different species, and it works well for most species. However, 
fetchChromSizes still cannot get the sizes for some species, such as 
alpaca_vicPac1, baboon_papHam1 and so on.

Since we can download the pairwise alignments (human genome as the target) for 
these species from UCSC, these genomes must have been sequenced, and sizes 
files for them should exist. So I wonder where and how I can get the sizes 
files for the species that fetchChromSizes cannot retrieve.

We are looking forward to your reply. Thank you! 

PS: Here is the list of all the species that lack sizes files.
 
alpaca_vicPac1
armadillo_dasNov2
baboon_papHam1
dolphin_turTru1
megabat_pteVam1
microbat_myoLuc1
mouseLemur_micMur1
pika_ochPri2
rockHyrax_proCap1
shrew_sorAra1
sloth_choHof1
tarsier_tarSyr1
tenrec_echTel1
wallaby_macEug1


Guangyi Dai

Laboratory of Evolutionary Genomics
CAS-MPG Partner Institute for Computational Biology
Chinese Academy of Sciences
Yue Yang Road 320
Shanghai, 200031
China

Tel: +(86)-21-54920487
Fax: +(86)-21-54920451
E-mail: [email protected]




-----Original Message-----

On 2011-11-24, at 4:05 AM, Angie Hinrichs wrote:


Hello Chen Ming,

I would like to add a bit about these parts of your question:


So could you please tell me where I can find the correct information
about the supercontigs, especially their length information?


We have a shell script fetchChromSizes that retrieves the sizes.  You
can download the script here:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/fetchChromSizes

Run "fetchChromSizes" with no arguments to see usage instructions.



Could you please explain the function of tSizes and qSizes in
the axtToMaf program? Is it necessary for them to be accurate?


MAF's "s" and "e" blocks have a srcSize field (see
http://genome.ucsc.edu/FAQ/FAQformat.html#format5).  AXT does not
include that info, so axtToMaf takes it from the tSizes and qSizes input
files.  MAF's inclusion of chromosome sizes makes it possible to
calculate forward-strand coordinates from the reverse-strand coordinates
when the strand field is "-".
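
A minimal sketch of that calculation in Python, offered as an illustration and not as code from axtToMaf: on a "-" strand line, the MAF start is counted from the start of the reverse-complemented source sequence, so the forward-strand start is srcSize - (start + size). The function name to_forward_strand and the example numbers are hypothetical.

```python
def to_forward_strand(start, size, src_size):
    """Convert a MAF minus-strand interval to forward-strand coordinates.

    start and size are the fields from a MAF "s" line whose strand is "-";
    src_size is the srcSize field. Returns a zero-based, half-open
    (start, end) pair on the forward strand.
    """
    if start < 0 or size < 0 or start + size > src_size:
        raise ValueError("interval does not fit inside the source sequence")
    fwd_start = src_size - (start + size)
    return fwd_start, fwd_start + size


# Made-up example: a 20-base block starting at 10 on the minus strand
# of a 100-base sequence.
print(to_forward_strand(10, 20, 100))  # (70, 90)
```

Without srcSize this conversion is not possible, which is why a wrong size in the tSizes or qSizes file would give wrong forward-strand positions for minus-strand blocks.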

Hope that helps,
Angie


----- Original Message -----

From: "MING Chen,evolgen" <[email protected]>
To: [email protected]
Sent: Tuesday, November 22, 2011 10:07:27 AM
Subject: [Genome] the supercontigs information of gorilla and the usage of axtToMaf

Dear UCSC,

I'm trying to convert the pairwise alignment files between human and
gorilla (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/vsGorGor1/axtNet/)
from axt format to maf using the axtToMaf program. But axtToMaf
needs tSizes and qSizes. The human genome sizes are easy to get, but
the gorilla genome (gorGor1) sizes are difficult to find, especially for
supercontigs. I have searched NCBI WGS
(http://www.ncbi.nlm.nih.gov/Traces/wgs/?val=CABD01) and Ensembl
(ftp://ftp.ensembl.org/pub/release-57/fasta/gorilla_gorilla/dna/),
but there is no supercontig information matching your pairwise
alignment files, for example:

0 chr1 10974 20818 Supercontig_0000035 107816 117704 + 890977

So could you please tell me where I can find the correct information
about the supercontigs, especially their length information?

Could you please explain the function of tSizes and qSizes in
the axtToMaf program? Is it necessary for them to be accurate?

Thanks very much



Plus: the assemblies used in the pairwise alignment:

target/reference: Human (hg19, Feb. 2009, GRCh37 Genome Reference
Consortium Human Reference 37 (GCA_000001405.1))

query: Gorilla (gorGor1, Oct. 2008, Sanger Institute Oct 2008 (NCBI
project 31265, CABD01000000))

Chen Ming
Evolutionary Genomics (Evolgen)
CAS-MPG Partner Institute for Computational Biology (PICB)
Shanghai Institutes for Biological Sciences (SIBS)
Chinese Academy of Sciences (CAS)

320 Yue Yang Rd.
Shanghai, P.R.China 200031

TEL: +86-21-5492-0467
http://www.picb.ac.cn/evolgen/



_______________________________________________

Genome maillist - [email protected]

https://lists.soe.ucsc.edu/mailman/listinfo/genome