Dear UCSC,

I've searched the archives and I don't think my question has been
answered yet.

I'm looking at microsatellites in the snp130.txt file.  I am trying to
make sense of the coordinates.  In many cases the coordinates of a
microsatellite refer to a single base (chromEnd = chromStart + 1).
Such are cases A and B below.  But where is the microsatellite?
According to the alignments (by clicking on the rs... name), in case A
the indicated microsatellite (the black bar in the browser with snp130
set to "full") is at the *end* of the CA repeat (the actual
microsatellite).  In case B, the indicated microsatellite is at the
*beginning* of the CA repeat.  Both of these are top-strand SNPs.

Case A.

627     chr1    5576651 5576652 rs3223599       0       +       C
C       (CA)19/20/21/22/23/24   genomic microsatellite  by-frequency
0.752086        0.089764        unknown exact   1

The genome browser shows the entire microsatellite
repeat (all 24 copies of CA, so 48 bases) as the reference
sequence. The position 5576652 marks the *end* of the CA repeat.  The
browser just shows the microsatellite as a single base.
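
To make the coordinate convention concrete, here is a minimal Python
sketch of how a row like case A can be read.  It assumes the 18-column
order of the UCSC snp130 table and the usual UCSC convention of a
0-based start and a 1-based end; the function names are hypothetical
and only for illustration.

```python
# Column names assumed from the UCSC snp130 table schema (18 tab-separated fields).
SNP130_FIELDS = [
    "bin", "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "refNCBI", "refUCSC", "observed", "molType", "class", "valid",
    "avHet", "avHetSE", "func", "locType", "weight",
]


def parse_snp130_line(line):
    """Split one tab-separated snp130.txt row into a dict keyed by column name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(SNP130_FIELDS):
        raise ValueError(f"expected {len(SNP130_FIELDS)} fields, got {len(values)}")
    row = dict(zip(SNP130_FIELDS, values))
    row["chromStart"] = int(row["chromStart"])
    row["chromEnd"] = int(row["chromEnd"])
    return row


def one_based_span(row):
    """Convert 0-based half-open (chromStart, chromEnd) to 1-based inclusive."""
    return row["chromStart"] + 1, row["chromEnd"]


# Case A, rejoined onto a single tab-separated line.
case_a = "\t".join([
    "627", "chr1", "5576651", "5576652", "rs3223599", "0", "+", "C", "C",
    "(CA)19/20/21/22/23/24", "genomic", "microsatellite", "by-frequency",
    "0.752086", "0.089764", "unknown", "exact", "1",
])

row = parse_snp130_line(case_a)
first, last = one_based_span(row)
span_length = row["chromEnd"] - row["chromStart"]
print(row["name"], first, last, span_length)
```

Read this way, the row covers exactly one base, 1-based position
5576652, which is the single base the browser draws.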

Case B:

  658     chr1    9585594 9585595 rs3220726       0       +       C
  C       lengthTooLong   genomic microsatellite  by-frequency
  0.8126  0.129764        unknown exact   1

In this case the genome browser shows that the base at 1-based
position 9585595 is at the *left* (beginning) of the CA repeat.  This
repeat is not particularly long: 58 bases.  I don't see any way to get
this information from the line above.

Question 1)

So how would anyone know, by looking in snp130.txt, where the actual
microsatellite is?  Is there some other table that I could download
that would give this information?
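
The only workaround I can think of is to scan the reference sequence
around the single-base coordinate for the repeat itself.  A rough
Python sketch of that idea follows.  It is a heuristic, not how UCSC
or dbSNP place these records, and it assumes the flanking reference
sequence has already been fetched as a string; the function name, the
min_copies threshold, and the toy sequence are all made up for
illustration.

```python
import re


def find_repeat_run(sequence, index, unit="CA", min_copies=3):
    """Return (start, end) of the tandem run of `unit` closest to `index`.

    Coordinates are 0-based half-open offsets into `sequence`.
    Returns None if no run with at least `min_copies` copies exists.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    best = None
    best_distance = None
    for match in pattern.finditer(sequence):
        start, end = match.span()
        if start <= index < end:
            distance = 0                    # index lies inside this run
        elif index < start:
            distance = start - index        # run lies to the right
        else:
            distance = index - (end - 1)    # run lies to the left
        if best_distance is None or distance < best_distance:
            best, best_distance = (start, end), distance
    return best


# Made-up flanking sequence: 5 bases, then 6 copies of CA, then 4 bases.
toy = "GGTTC" + "CA" * 6 + "TGGA"
print(find_repeat_run(toy, 4))
```

Here the query index sits on the base just left of the repeat, as in
case B, and the sketch reports the 12-base run that starts right after
it.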

In case C, the coordinates given are the actual coordinates of the
microsatellite.

Case C:

753     chr1    22129926        22129973        rs3222966       0
+       CACACACACACACACACACACACACACACACACACACACACACACAC
CACACACACACACACACACACACACACACACACACACACACACACAC
(CA)17/18/19/20/21/22/23/24     genomic microsatellite  by-frequency
0.7524  0.158867        unknown range   1

In this case, the record gives the full coordinates of the 47-base
microsatellite, which covers all but half a copy (one base) of the
24-copy CA repeat.

Question 2)

If the observed is listed as lengthTooLong, is there any way to
determine what the bases of the microsatellite are?  (Without
that, they aren't much use.)


Case D:

852     chr1    35119589        35119590        rs3219614       0
+       T       T       (CA)20/21/22/23/A/T     genomic microsatellite
by-frequency    0.284918        0.283047        unknown exact   1

Question 3)

In case D, what does the /A/T mean at the end of (CA)20/21/22/23/A/T ?
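
For reference, this is how the observed strings can be split apart
mechanically.  The Python sketch below only tokenizes the string; it
does not settle what the trailing /A/T means, and the function name
and return shape are hypothetical.

```python
import re


def parse_observed(observed):
    """Split an observed string like '(CA)20/21/22/23/A/T' into its parts.

    Returns (unit, copy_counts, other_tokens), or None when the string does
    not start with a parenthesised repeat unit (for example 'lengthTooLong').
    """
    match = re.match(r"^\(([ACGT]+)\)(.*)$", observed)
    if match is None:
        return None
    unit, rest = match.group(1), match.group(2)
    copy_counts, other_tokens = [], []
    for token in rest.split("/"):
        if token.isdigit():
            copy_counts.append(int(token))   # numeric token: a repeat count
        elif token:
            other_tokens.append(token)       # anything else, e.g. 'A' or 'T'
    return unit, copy_counts, other_tokens


print(parse_observed("(CA)20/21/22/23/A/T"))
print(parse_observed("lengthTooLong"))
```

For case D this separates the repeat counts 20 to 23 from the two
leftover tokens A and T, which are the part I cannot interpret.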

Question 4)

In case D, the CA repeat starts at position 35119591 (chr1) and ends
at 35119632, giving 42 bases or 21 copies of the repeat.  So why does
the allele indicate that there are 23 copies?
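
The copy count above is just interval arithmetic, sketched here in
Python.  It assumes both positions are 1-based and inclusive, as read
off the browser, and a repeat unit of 2 bases; the function name is
hypothetical.

```python
def repeat_copies(first_base, last_base, unit_length=2):
    """Return (span in bases, copies of the unit) for a 1-based inclusive interval."""
    span = last_base - first_base + 1
    return span, span / unit_length


# Case D: the CA repeat as seen in the reference sequence.
span, copies = repeat_copies(35119591, 35119632)
print(span, copies)
```

This gives 42 bases and 21 copies for case D.  The same arithmetic on
case C (1-based 22129927 to 22129973) gives 47 bases, or 23.5 copies.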

Thank you very much!

David Gordon

