Hi Eden,

We don't have a completely straightforward way to go from a large list of positions (more than 1,000) to a list of gene names. (Smaller lists can be obtained as described here: https://lists.soe.ucsc.edu/pipermail/genome/2010-October/023960.html.)
Galaxy (http://main.g2.bx.psu.edu/), which is run by Penn State and works in conjunction with the Genome Browser, might have some better solutions.

You can, at least, get rows from our gene tables (which include gene accessions, but not gene names) that correspond to a list of chromosomal locations using a custom track and the Table Browser. A video tutorial about both of these tools is here:
http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28

You can create a custom track of positions by uploading them in BED format (http://genome.ucsc.edu/FAQ/FAQformat.html#format1). Note that the start position in BED format is one base less than the start position that is displayed in the Genome Browser. For additional help, the custom track user's guide is here:
http://genome.ucsc.edu/goldenPath/help/customTrack.html#MANAGE_CT

At this point, there are a few different ways to proceed. First, you will need to select a gene track to use to get gene names (click on the blue track names under "Genes and Gene Prediction Tracks" in the browser to read descriptions of each). The UCSC Genes track has the advantage of having a table called "knownCanonical", which lists only a single gene per cluster of splice variants. Second, you will need to decide whether you need to preserve the original positions you entered in your custom track, or whether you can use the positions listed for each gene in the table you are using.

If you need to preserve your original positions, you will need a tool that can join two tables together; the Table Browser doesn't do this. Galaxy has a tool that will: look under "Operate on Genomic Intervals" for the "Join" tool.

If you are okay with retrieving the intervals in the table instead of seeing your own regions in the output, select the gene track you want to use, create an intersection with your custom track, and choose BED as the output format. This will result in a list of the gene positions (as listed in the table), each with a gene accession.
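To illustrate the BED start-position note above, here is a minimal Python sketch (not a UCSC tool) that converts positions written in the browser's chrN:start-end form into BED lines for a custom track. The function names position_to_bed and positions_to_bed are hypothetical, and the coordinates in the usage note are made up for illustration.

```python
import re

# Matches a browser-style position such as "chr21:1,001-2,000" or "chrX:5-10".
POSITION_RE = re.compile(r"^(\w+):([\d,]+)-([\d,]+)$")


def position_to_bed(position, name=None):
    """Convert one 1-based browser position into a tab-separated BED line.

    BED starts are 0-based, so the start is reduced by one; the end is unchanged.
    """
    match = POSITION_RE.match(position.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {position!r}")
    chrom, start_text, end_text = match.groups()
    start = int(start_text.replace(",", ""))
    end = int(end_text.replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinate range: {position!r}")
    fields = [chrom, str(start - 1), str(end)]
    if name is not None:
        # Optional fourth BED column, e.g. a row identifier from a spreadsheet.
        fields.append(name)
    return "\t".join(fields)


def positions_to_bed(positions):
    """Convert an iterable of browser positions into BED text, one line each."""
    return "".join(position_to_bed(p) + "\n" for p in positions)
```

For example, position_to_bed("chr21:1001-2000") gives the BED fields chr21, 1000, 2000 separated by tabs. Passing a name (such as a row number) as the fourth column is one way to keep track of which input row each region came from.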
The results will include duplicate accessions for each gene if there is more than one in the region. You will need to do some additional steps to get from an accession to a gene symbol (for instance, to get from "uc002ypa.2" to "SOD1"), and to get a unique gene name for each region. Again, Galaxy may have an easier tool for this.

If you plan to use Galaxy, please send any questions on that tool to their helpdesk at [email protected]. They may have some additional tools or suggestions for you.

If you have further questions on the Genome Browser, please send them to this list ([email protected]).

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 11/15/10 07:57, Kleiman, Eden wrote:
> Hi my name is Eden Kleiman. I am a post-doc at the University of Miami. Recently, our lab performed transcriptome analysis on B cell subsets. I have been using your software for various analysis. The output format for the transcriptome data gives me a chromosomal coordinate range for every comparison between two B cell subsets (no absolute value just relative expression levels of each gene between two B cell subset). In the attached example, B cell subsets S1 (column D) and S2 (column E) are being compared for their expression of a certain gene. The gene coordinates are given in column C. I have no problem plugging this range into genome browser and finding out which gene is listed.
>
> But my problem is that I have over 100,000 rows that need gene names.
>
> I would like to know if there is a way to do this on a larger scale where I could submit many chromosomal coordinates and receive the gene names instantly? My other question is that if this is possible, what will Genome Browser do about getting multiple gene names for each chromosomal range?
>
> Thank you,
> Eden

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
