Hello Bryan,
The link you found is a good one. The track description page is also
a great place to find information (open the browser and click on the
track name). Here are some more details about this data field in
particular:
The ali data field is defined in our schema as "Bases in gap-free
alignments". There are two types of "gaps": gaps within a chain and gaps
between chains. The chain(s) that a net is based on can themselves be
internally gapped, and the net<db> table contains rows both for netted
chain alignments and for gapped regions between netted chain alignments.
The long way to interpret this is "Number of bases in alignment blocks
for a data row representing netted chain(s) having
type=top/syn/inv/nonSyn; set to '0' for a data row representing a gapped
region between nets having type=gap".
Simply, "ali" is the number of bases "matched up" between the two
genomes. This does not mean that the bases matched are identical. The
span of genome sequence globally encompassed by the net will be equal to
or longer than the number of "ali" bases - potentially for both the
query and the target. And this global span should be expected to be
different between the query and target unless the region is very highly
conserved.
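As a rough illustration of the difference between "matched up", identical,
and spanned bases, here is a small Python sketch. The two sequences are made
up for the example, '-' marks a gap, and the function name is only
illustrative; this is not how the net tables are actually built.

```python
# Toy pairwise alignment; '-' marks a gap. These sequences are invented.
target = "ACGTAC---GTTAGC"
query = "ACCTACTTAGT--GC"


def alignment_stats(t, q):
    """Return (aligned, identical) counts for two gapped strings of equal length."""
    assert len(t) == len(q)
    aligned = 0
    identical = 0
    for a, b in zip(t, q):
        if a == "-" or b == "-":
            continue  # gapped column: not part of a gap-free block
        aligned += 1  # counted in "ali" whether or not the bases agree
        if a == b:
            identical += 1
    return aligned, identical


aligned, identical = alignment_stats(target, query)

# Span on each genome = every base that sequence contributes, aligned or not.
t_span = sum(1 for c in target if c != "-")
q_span = sum(1 for c in query if c != "-")

print("ali-like count:", aligned)
print("identical bases:", identical)
print("target span:", t_span, "query span:", q_span)
```

In this toy case the aligned count is larger than the identical count and
smaller than either span, and the two spans differ from each other.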
In general terms, to determine the lengths associated with any entry in
a net<db> table that is not type=gap, use the start/stop coordinates
like this:
tName:tStart-tEnd = global span start/stop for target
qName:qStart-qEnd = global span start/stop for query
1) # bases spanned by net with respect to target genome = (tEnd-tStart)
2) # bases spanned by net with respect to query genome = (qEnd-qStart)
3) # bases included in net = ali
Differences between 1 & 3 or 2 & 3 represent bases that are gapped in
the netted chain(s) for that level ("type"). A chain is composed of
alignment blocks and (potentially) gaps. Alignment blocks within a chain
are defined in the chain<db>Link table.
Entries in the net<db> table with type=gap use the same coordinate
fields to define the regions between nets.
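The arithmetic above can be sketched in a few lines of Python. The field
names follow the ones used in this message; the NetRow class, the summarize
function, and the two example rows are hypothetical and the numbers are
invented, so treat this as a sketch rather than a description of real data.

```python
from dataclasses import dataclass


@dataclass
class NetRow:
    # Subset of net<db> fields discussed above; the values used below are invented.
    tName: str
    tStart: int
    tEnd: int
    qName: str
    qStart: int
    qEnd: int
    ali: int
    type: str


def summarize(row):
    """Compute spans and gapped-base counts for one net<db> row."""
    t_span = row.tEnd - row.tStart  # 1) bases spanned on the target
    q_span = row.qEnd - row.qStart  # 2) bases spanned on the query
    if row.type == "gap":
        # Gap rows describe the region between nets; ali is 0 for them.
        return {"t_span": t_span, "q_span": q_span, "ali": row.ali,
                "t_gapped": None, "q_gapped": None}
    return {"t_span": t_span, "q_span": q_span, "ali": row.ali,  # 3) ali
            "t_gapped": t_span - row.ali,  # difference between 1 and 3
            "q_gapped": q_span - row.ali}  # difference between 2 and 3


net = NetRow("chr1", 1000, 2200, "chr1", 5000, 6150, 1000, "top")
gap = NetRow("chr1", 2200, 2500, "chr1", 6150, 6400, 0, "gap")

print(summarize(net))
print(summarize(gap))
```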
Comparing the table data values with the browser graphics for a few
regions will definitely help to clarify the data structures (and make
understanding them less tedious). Use the Table Browser to join the three
tables chain<db>, chain<db>Link, and net<db> on their shared fields and
extract all the data for a particular region. Next, set the browser to
the same region, set the track/subtracks to "full", and compare.
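If it helps to see the shape of such a join outside the Table Browser, here
is a minimal Python sketch using an in-memory SQLite database. The table
names netDb, chainDb, and chainDbLink are stand-ins for net<db>, chain<db>,
and chain<db>Link; the join columns (chainId and id) and the reduced column
lists are assumptions made for the example, and all rows are invented. Please
check the real schema on the table description pages before relying on it.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Toy stand-ins for net<db>, chain<db>, and chain<db>Link (assumed columns).
con.executescript("""
CREATE TABLE netDb (tName TEXT, tStart INT, tEnd INT, chainId INT, ali INT, type TEXT);
CREATE TABLE chainDb (id INT, tName TEXT, tStart INT, tEnd INT, score INT);
CREATE TABLE chainDbLink (chainId INT, tName TEXT, tStart INT, tEnd INT, qStart INT);
INSERT INTO netDb VALUES ('chr1', 1000, 2200, 7, 1000, 'top');
INSERT INTO chainDb VALUES (7, 'chr1', 1000, 2200, 50000);
INSERT INTO chainDbLink VALUES (7, 'chr1', 1000, 1600, 5000);
INSERT INTO chainDbLink VALUES (7, 'chr1', 1800, 2200, 5550);
""")

# One output row per alignment block that overlaps chr1:1000-2200.
rows = con.execute("""
SELECT n.type, n.ali, c.id, l.tStart, l.tEnd
FROM netDb AS n
JOIN chainDb AS c ON c.id = n.chainId
JOIN chainDbLink AS l ON l.chainId = c.id
WHERE n.tName = 'chr1' AND l.tStart < 2200 AND l.tEnd > 1000
ORDER BY l.tStart
""").fetchall()

# Total bases in the alignment blocks, to compare against the net's ali value.
block_bases = sum(t_end - t_start for _, _, _, t_start, t_end in rows)

for row in rows:
    print(row)
print("bases in blocks:", block_bases)
```

The toy rows were chosen so that the block lengths add up to the ali value;
real rows may not line up this neatly.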
Hopefully this helps,
Jennifer
---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/
On 3/18/10 12:45 AM, Bryan White wrote:
> Ah, nevermind, found this which I believe should answer all my questions.
> http://genomewiki.ucsc.edu/index.php/Chains_Nets
>
> Thanks!
> Bryan
>
> On Wed, Mar 17, 2010 at 11:34 PM, Bryan White<[email protected]> wrote:
>
>> Hi Vanessa,
>>
>> Thanks for the response. I have spent some time familiarizing myself with
>> the database. Now I'm wondering about the "ali" field in the net tables (ie.
>> netPanTro2).
>>
>> Is it the number of bases aligned in an alignment? So, say an "ali" is 1000
>> bases long, does that mean that alignment is 1000 bases long, or does it
>> mean there are 1000 identical bases in that alignment, but the whole
>> alignment may actually be 1200 bases long.
>>
>> Thanks,
>> Bryan
>>
>>
>> On Tue, Mar 16, 2010 at 4:21 PM, Vanessa Kirkup Swing<
>> [email protected]> wrote:
>>
>>> Hi Bryan,
>>>
>>> The best way to view the vertebrate nets/chains is to view them in the
>>> human genome browser: http://genome.ucsc.edu/cgi-bin/hgGateway
>>>
>>> Here is a link to the document to help you get started:
>>> http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#GetStarted
>>>
>>> Once you have selected the human browser and clicked submit from the
>>> gateway page it will take you to that specific browser. There is a
>>> "Vertebrate Chain/Net" track listed under the "Comparative Genomics" section
>>> below the browser image. Click on the "Vertebrate Chain/Net" link to find
>>> out more information on this track and to set up which subtracks you would
>>> like to view.
>>>
>>> Please don't hesitate to contact us if you have further questions.
>>>
>>> Vanessa Kirkup Swing
>>> UCSC Genome Bioinformatics Group
>>>
>>>
>>> ----- Original Message -----
>>> From: "Bryan White"<[email protected]>
>>> To: [email protected]
>>> Sent: Tuesday, March 16, 2010 2:08:15 PM GMT -08:00 US/Canada Pacific
>>> Subject: [Genome] .net and .chain files
>>>
>>> Hello,
>>>
>>> I am currently trying to view the Human to other Vertebrate genome
>>> comparisons listed here
>>> http://hgdownload.cse.ucsc.edu/downloads.html#human
>>> and I am wondering what is the best way to view these .net and/or .chain
>>> files.
>>>
>>> Thanks,
>>> Bryan
>>> _______________________________________________
>>> Genome maillist - [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
>>
>>