Hi John,

These previously-answered mailing list questions should be helpful:

https://lists.soe.ucsc.edu/pipermail/genome/2009-March/018553.html
https://lists.soe.ucsc.edu/pipermail/genome/2008-August/017007.html

I also suggest looking at the "description" column in the table schema 
(hit "view table schema" from the table browser or the SNP details 
page): 
http://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=varRep&hgta_track=snp135&hgta_table=snp135&hgta_doSchema=describe+table+schema

The refNCBI and refUCSC columns always contain the allele from the 
positive strand.  The strand column denotes the strand of the alleles 
in the observed column.

It might be helpful to look at what we display when you click on the SNP 
in the Genome Browser.  We reverse-complement the reference allele 
when the strand is negative, but not the observed alleles:

dbSNP build 135 rs1000073
Strand: -
Observed: C/T
Reference allele: T

So, for rs1000073, NCBI and UCSC are in agreement that T is present on 
the negative strand in the GRCh37 reference genome at position 
chr1:157255396.  The two alleles observed on the negative strand in that 
position are C and T.  Also, if you click on "Re-alignment of the SNP's 
flanking sequences to the genomic sequence," you will see that the 
flanking sequences for rs1000073 from dbSNP align to the negative 
genomic strand.
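
If you want to put the observed alleles on the positive strand yourself, 
the logic above can be sketched in a few lines of Python.  This is only 
an illustration, not code we use: the function names are made up, and it 
assumes the observed column holds simple "/"-separated alleles (records 
with unusual formats would need extra handling).

```python
# Hypothetical helpers, not part of any UCSC tool.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement; characters other than ACGT are left as-is."""
    return seq.translate(_COMPLEMENT)[::-1]


def observed_on_plus_strand(observed, strand):
    """Split an observed string such as "C/T" and express it on the + strand."""
    alleles = observed.split("/")
    if strand == "-":
        alleles = [reverse_complement(a) for a in alleles]
    return alleles


def is_strand_ambiguous(alleles):
    """True for two-allele SNPs such as A/T or C/G, which look the same on both strands."""
    return len(alleles) == 2 and reverse_complement(alleles[0]) == alleles[1]


# rs1000073: strand "-", observed "C/T"
plus_alleles = observed_on_plus_strand("C/T", "-")
print(plus_alleles)                       # ['G', 'A']
print(is_strand_ambiguous(plus_alleles))  # False
```

For rs1000073 the positive-strand alleles come out as G and A, and A is 
the reverse complement of the T shown as the reference allele on the 
details page.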

I note that most SNPs (~51 million out of ~54 million) in the snp135 
table have a strand of "+".  You should also be aware of the exceptions 
column of the snp135 table, which flags potential problems with a 
record, such as these (from the SNP track details page):

ObservedContainsIupac - At least one observed allele from dbSNP contains 
an IUPAC ambiguous base (e.g., R, Y, N).

ObservedMismatch - UCSC reference allele does not match any observed 
allele from dbSNP. This is tested only for SNPs whose class is single, 
in-del, insertion, deletion, mnp or mixed.

ObservedTooLong - Observed allele not given (length too long).

ObservedWrongFormat - Observed allele(s) from dbSNP have unexpected 
format for the given class.
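
If you want to set aside records carrying any of these four flags before 
imputation, a check along these lines might do.  It is a sketch only: 
the function name is made up, and it assumes the exceptions field of 
each exported row is a comma-separated string of exception names.

```python
# The four exception names listed above.
OBSERVED_FLAGS = {
    "ObservedContainsIupac",
    "ObservedMismatch",
    "ObservedTooLong",
    "ObservedWrongFormat",
}


def observed_flags(exceptions_field):
    """Return the Observed* exceptions present in one row's exceptions field.

    Assumes a comma-separated string; empty entries (for example from a
    trailing comma) are ignored because they match none of the names.
    """
    names = {name.strip() for name in exceptions_field.split(",")}
    return sorted(names & OBSERVED_FLAGS)


# "OtherException" is a placeholder for any exception not listed above.
print(observed_flags("ObservedMismatch,OtherException"))  # ['ObservedMismatch']
print(observed_flags(""))                                 # []
```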

I hope this helps explain the snp table.  If you have further questions, 
please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 12/21/11 4:07 AM, John Curtin wrote:
> Hi
> I am trying to determine strand orientation for SNPs using the
> output  from "http://genome.ucsc.edu/cgi-bin/hgTables" (all SNPs(135)).
>
> This is in order to impute using 1000 genomes data which requires
> you  to know the strand orientation of the data. I want to be sure that I
> fully understand this table and I am getting the information I need.
>
> In the attached file (exported from tables/SNPs) I have highlighted
> 4  columns. Does the "Strand" column refer to the "observed" column. i.e.
> the observed column for rs1000073 it is on the "-" strand because it is
> ("C/T"). For rs1000073 the "+" strand equivalent is "A" (from refUCSC).
> Obviously you cannot use this approach for AT or CG, but I have illumina
> data and 99% of SNPs are unambiguous.
> Is this correct?
> Regards
> John
>
>
> ************************************************************************************
> John A Curtin
> Lecturer in Functional Genomics
> Deputy Director, MRes in Translational Medicine
> University of Manchester
> CIGMR
> 2nd Floor, Stopford Building
> Oxford Road
> Manchester, M13 9PT
> [email protected]  | Tel:  
> 0161 275-5203 (CIGMR) | 0161 291-5867 (UHSM)
> http://www.medicine.manchester.ac.uk/staff/JohnCurtin
> Master of Research Translational Medicine:
> http://www.medicine.manchester.ac.uk/postgraduate/mres/TMInterMolecMRes/
> http://www.medicine.manchester.ac.uk/postgraduate/mres/TMPharmCancerMRes/
>
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome