Steve,
You forgot about the extra 10 bases that Rhileigh added to the query
length; that is, qSize should be 41, not 31.
0         1         2         3         4 tens position in query
01234567890123456789012345678901234567890 ones position in query
            ++++          +++++           plus strand alignment on query
    --------    ----------                minus strand alignment on query
So:
qStart = qSize - revQEnd = 41 - 26 = 15
qEnd = qSize - revQStart = 41 - 4 = 37
Another issue with this equation in the FAQ is that "revQEnd" and
"revQStart" are never actually defined.
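
For what it's worth, here is a minimal Python sketch of the arithmetic above. The function names flip_range and flip_blocks are made up for illustration and are not part of any UCSC tool. It assumes half-open [start, end) coordinates and that the two minus-strand blocks sit at plus-strand positions 4 (8 bases) and 16 (10 bases), which is what the posted qStarts values imply. Because the mapping is its own inverse, the same function converts in either direction.

```python
def flip_range(q_size, start, end):
    """Map a half-open [start, end) range onto the opposite strand."""
    return q_size - end, q_size - start


def flip_blocks(q_size, block_sizes, q_starts):
    """Map a block list onto the opposite strand; block order is reversed."""
    flipped = [(q_size - (s + n), n) for s, n in zip(q_starts, block_sizes)]
    flipped.reverse()
    sizes = [n for _, n in flipped]
    starts = [s for s, _ in flipped]
    return sizes, starts


# Minus-strand alignment from the example, given in plus-strand coordinates
# (assumed block positions: 8 bases at 4, 10 bases at 16).
for q_size in (31, 41):
    print(q_size, flip_range(q_size, 4, 26), flip_blocks(q_size, [8, 10], [4, 16]))
```

With qSize 31 this gives blockSizes 10,8 and qStarts 5,19; with qSize 41 it gives qStarts 15,29, which matches Rhileigh's extended example.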
~Lucas
Steve Heitner wrote:
> Hello, Rhileigh.
>
>
>
> Your extension is incorrect. If we show the negative strand coordinates of
> the query, we would get:
>
>
>
> 0         1         2         3 tens position in query
> 0123456789012345678901234567890 ones position in query
>             ++++          +++++ plus strand alignment on query
>     --------    ----------      minus strand alignment on query
> 0987654321098765432109876543210 ones position in query NEG STRAND coordinates
> 3         2         1         0 tens position in query NEG STRAND coordinates
>
>
>
> Plus strand:
> qStart=12
> qEnd=31
> blockSizes=4,5
> qStarts=12,26
>
> Minus strand:
> qStart=4
> qEnd=26
> blockSizes=10,8
> qStarts=5,19
>
>
>
> The negative strand qStart and qEnd are reported in positive-strand
> coordinates because that makes searches for overlapping items much faster.
>
>
>
> If we do as the FAQ says, then in negative strand coordinates:
>
> qStart = qSize - revQEnd = 31 - 26 = 5
>
> qEnd = qSize - revQStart = 31 - 4 = 27
>
>
>
> Note that the blockSizes and qStarts are in negative-strand coordinates and
> the order of blocks in the list is reversed compared to the positive strand.
>
>
>
> Please contact us again at [email protected] if you have any further
> questions.
>
>
>
> ---
>
> Steve Heitner
>
> UCSC Genome Bioinformatics Group
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On
> Behalf Of Rhileigh Almgren
> Sent: Monday, April 30, 2012 8:46 AM
> To: [email protected]
> Subject: [Genome] PSL format example
>
>
>
> Hi --
>
>
>
> I would like to suggest an addition to the PSL format example given here:
>
> http://genome.ucsc.edu/FAQ/FAQformat#format2
>
>
>
> The current example is
>
>
>
> 0         1         2         3 tens position in query
> 0123456789012345678901234567890 ones position in query
>             ++++          +++++ plus strand alignment on query
>     --------    ----------      minus strand alignment on query
>
>
>
> Plus strand:
> qStart=12
> qEnd=31
> blockSizes=4,5
> qStarts=12,26
>
> Minus strand:
> qStart=4
> qEnd=26
> blockSizes=10,8
> qStarts=5,19
>
>
>
> To an ignoramus (me) trying to puzzle this out, the Minus strand qStart and
> qEnd values seem ambiguous. The strand is 31 bases long, so the coordinates 4
> and 26 are not informative about the correct directional relationship. By
> adding 10 bases to the query length, the ambiguity is resolved:
>
>
>
> 0         1         2         3         4 tens position in query
> 01234567890123456789012345678901234567890 ones position in query
>             ++++          +++++           plus strand alignment on query
>     --------    ----------                minus strand alignment on query
>
>
>
> Plus strand:
> qStart=12
> qEnd=31
> blockSizes=4,5
> qStarts=12,26
>
> Minus strand:
> qStart=4
> qEnd=26
> blockSizes=10,8
> qStarts=15,29
>
>
>
> Is my extension of the example correct?
>
>
>
> Thanks
>
> _______________________________________________
>
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>
>
>