Hello, Laura. As you mentioned, the refGene table does not actually list the GenBank version number. The hg19.gbStatus.version field does list the version number, but the problem is that this is the current version number and not necessarily the version that was current as of June 23, 2011. There is also a field called hg19.gbStatus.modDate that lists the last modified date, but there are two problems with this. First, our modDate does not necessarily coincide precisely with the official GenBank version date (e.g., our modDate for NM_021219.2 is March 21, 2012 while GenBank lists it as April 21, 2012). Second, the gbStatus table does not keep a history of previous versions and modDates, so if the particular transcript you are looking at is now at version 3 (e.g., NR_001458.3), there is no way to know whether it was NR_001458.1 or NR_001458.2 on June 23, 2011.
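For a small number of transcripts, once the GenBank release date of each version has been collected, picking the version that was current on a given date is a simple comparison. The following is only a minimal sketch in Python under that assumption: the function name `version_on` and the hand-filled `RELEASES` table are hypothetical (they are not UCSC or NCBI tables), and the two dates are the GenBank release dates for NM_021219.1 and NM_021219.2 (April 24, 2002 and April 21, 2012).

```python
from datetime import date

# Release date of each version, copied by hand from the GenBank record pages.
# Hypothetical illustrative table; it is not part of any UCSC or NCBI resource.
RELEASES = {
    "NM_021219": [(1, date(2002, 4, 24)), (2, date(2012, 4, 21))],
}


def version_on(accession, when, releases=RELEASES):
    """Return 'accession.version' for the latest version released on or before `when`.

    Returns None if the accession is not in the table or no version existed yet.
    """
    candidates = [
        version
        for version, released in releases.get(accession, [])
        if released <= when
    ]
    if not candidates:
        return None
    return f"{accession}.{max(candidates)}"


print(version_on("NM_021219", date(2011, 6, 23)))  # prints NM_021219.1
```

The sketch treats a version as current from its release date until the next version is released, so it is only as reliable as the dates entered in the table.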
We do not keep histories of the refGene table, so there is no June 23, 2011 version of refGene that we can direct you to, and there is no easy way to get a snapshot of the data as it existed on that date. It is possible to look directly at GenBank to find the dates corresponding to the various transcript versions (e.g., http://www.ncbi.nlm.nih.gov/nuccore/NM_021219.1 shows that NM_021219.1 was released on April 24, 2002 and http://www.ncbi.nlm.nih.gov/nuccore/NM_021219.2 shows that NM_021219.2 was released on April 21, 2012), but if you have a large number of IDs, this would be very tedious without some kind of custom script.

Please contact us again at [email protected] if you have any further questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Laura Smith
Sent: Tuesday, May 29, 2012 1:56 PM
To: [email protected]
Subject: [Genome] Downloading old refseq and ensemble transcripts with the "version numbers" in the accession IDs.

Hello,

I have been using the RefSeq and Ensembl transcripts that I downloaded from the UCSC Genome Browser table on June 23, 2011. The transcript IDs in these datasets do not include the version numbers (such as NM_134564.2, where ".2" is the version number after the period). Recently, however, it turned out that I need the version number of each transcript. I tried to look for them and download them using the information provided here, but there is no way for me to choose the RefSeq transcripts for the date June 23, 2011:

https://lists.soe.ucsc.edu/pipermail/genome/2011-September/027099.html

Would it be possible for you to send me the RefSeq and Ensembl transcripts for June 23, 2011 from your archives, including the version numbers for each transcript? Or, if there is a way that I could access this data myself, I would very much appreciate it if you could let me know.
Thank you,
Laura
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
