Hi Craig,

One of our engineers suggests the following:
There isn't a unique key because when a SNP is implicated by more than 
one study there are multiple rows for it, containing the metadata from 
each study.

If you need a distinct list of rs IDs, here is how to get that using mysql:

mysql [[insert_public_mysql_stuff_here]] hg19 -NBe 'select distinct(name) from gwasCatalog' > gwasRsIds.txt

or using a downloaded file:

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/gwasCatalog.txt.gz
zcat gwasCatalog.txt.gz | cut -f 5 | sort -u > gwasRsIds.txt

If you want to keep the study info, a key could be constructed from the 
name and pubMedID columns. That key is unique except for multiply-mapped 
SNPs, and those are excluded from snp132Common. How to proceed depends on 
what tool(s) you are using (mysql, command line, Table Browser/Galaxy, 
etc.) and how much info you want to keep from gwasCatalog.
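
For the command-line route, a name plus pubMedID key might be sketched as 
follows. This is only an illustration, not a tested recipe: it assumes the 
downloaded file is tab-separated with name in column 5 (as in the cut 
command above) and pubMedID in column 6, and the function name 
add_study_key and the output file gwasKeyed.txt are made up for the example.

```shell
# Sketch: prefix each gwasCatalog row with a "name|pubMedID" key.
# Assumption: tab-separated input, name in column 5, pubMedID in column 6.
# add_study_key is a hypothetical helper name, not part of any UCSC tool.
add_study_key() {
    awk -F'\t' 'BEGIN { OFS = "\t" } { print $5 "|" $6, $0 }'
}

# Usage, assuming gwasCatalog.txt.gz has already been downloaded:
#   zcat gwasCatalog.txt.gz | add_study_key > gwasKeyed.txt
#
# To list any keys that still occur more than once:
#   zcat gwasCatalog.txt.gz | add_study_key | cut -f 1 | sort | uniq -d
```

The key becomes the first column of every row, so the remaining study 
columns are kept unchanged for whatever join you do next.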

Please contact us again at [email protected] if you have any further 
questions.

---
Luvina Guruvadoo
UCSC Genome Bioinformatics Group


On 11/1/2011 6:16 AM, Benson, Craig C wrote:
> Hi,
>
> I was wondering, for the table "gwasCatalog" in the hg19 database, is there a 
> unique key field(s) for each entry.  There are 7,096 rows, but some SNPs have 
> more than one entry in the database, based on the disease association. I'm 
> trying to join the tables "snp132Common" and "gwasCatalog" for only a subset 
> of SNPs.
>
> Thanks
> _______________________________________________
> Genome maillist  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome


