Hi Kathleen,

Thank you for your patience while we worked on your question!

The table hgFixed.gnfHumanAtlas2AllExps lists the tissues in the original order, with replicates side by side (A,A,B,B,C,C, etc.). This is the table that connects the expScores in hgFixed.gnfHumanAtlas2All to tissue types, and it contains the data you want.

The table hgFixed.gnfHumanAtlas2MedianExps was made to connect with tables that hold only the median of the two replicates (like gnfAtlas2). When this was done, the tissues were also reordered to group similar tissue types. When the output format "microarray names" is chosen for gnfAtlas2, the tissue names come from this table (hgFixed.gnfHumanAtlas2MedianExps).

Unfortunately, if you currently select the track GNF Atlas 2 in the Table Browser, it will not let you select hgFixed.gnfHumanAtlas2AllExps. We are working on this and hope to have a fix out soon. To get the data from the hgFixed.gnfHumanAtlas2AllExps table, you will need to select the following in the Table Browser:

group: All Tables
database: hgFixed
table: hgFixed.gnfHumanAtlas2AllExps

I hope this information is helpful. Please feel free to contact the mail list again if you require further assistance.

Best,
Mary

------------------
Mary Goldman
UCSC Bioinformatics Group

On 5/17/11 9:54 AM, kathleen askland wrote:
> Hello Jen,
>
> I wrote you about a year ago with a question about gnf2 expression
> data that I downloaded using the UCSC genome table browser. I've come
> back to this data for a different project and was reviewing our
> correspondence (see previous emails at bottom of page) and checking it
> against some downloaded data. There seems to be a significant
> discrepancy that I hope you can clarify.
>
> Essentially, I want to be certain that I know which tissue and
> replicate each of the expression values in the output file corresponds
> to.
> So, I downloaded the GNF Atlas 2 absolute expression values for both
> original samples/replicates by opening the Table Browser and
> proceeding as follows:
> 1) Selected Clade: Mammal, Genome: Human, Assembly: Feb 2009, Group:
> Expression, Track: GNF Atlas 2, Table: hgFixed.gnfHumanAtlas2All
> 2) Next, I selected output format: 'all fields from selected table'
> 3) then I clicked 'Get output,' which opens an html window with the
> requested data, the first two lines of which is as follows:
>
> #name expCount expScores
> 1007_s_at 158
> 3621,3212,1078,1130,475,408,375,528,668,482,543,392,745,996,696,649,1124,1259,291,451,707,745,1022,1296,2956,2359,1462,2318,1157,1437,1662,841,1288,1575,3465,2565,1281,1504,1203,1415,1919,1330,292,112,1039,1498,1868,1679,1855,2219,2701,3162,3561,2943,3455,4784,4332,4136,3441,3333,3043,2922,3291,4413,2727,5157,3332,3064,6515,6949,4237,5045,1896,1810,2531,2425,2542,2070,8931,9319,4300,4765,2586,2623,3334,5043,1872,2320,1515,2165,2561,2859,5122,5007,1572,1717,5614,5501,4380,4137,2087,2416,4298,4484,1867,2184,2081,1932,5530,6309,1077,1149,3709,1832,2859,8037,1718,1876,1303,1537,1441,925,864,978,1571,1110,2494,1825,4551,2741,1588,1161,726,1428,1434,1005,1687,1509,775,996,930,1187,768,800,1110,1114,1436,1281,1211,1171,1225,1455,2559,2741,3083,4111,2179,2653,
>
> 1053_at 158
> 1041,522,265,351,222,244,519,248,272,247,297,538,191,60,195,102,390,635,526,384,510,700,549,657,1436,1441,316,253,301,228,530,905,757,530,247,296,228,301,182,229,175,99,453,329,239,130,30,32,29,79,147,75,42,104,74,112,142,121,50,76,98,28,119,124,24,129,24,109,30,194,110,48,122,19,17,172,27,158,221,60,38,231,17,60,378,242,170,318,54,212,17,74,42,170,30,126,224,199,136,123,153,135,155,25,293,396,303,214,270,145,159,31,62,95,118,111,153,122,57,171,174,214,73,30,29,106,16,225,67,24,131,48,76,28,172,46,70,35,34,117,29,75,22,25,59,97,21,72,38,127,130,74,156,31,31,17,55,33,
>
> Since this particular table does not have the expression IDs or
> descriptive names, I do not know which tissues/replicates each of the
> 158 values for each probe corresponds to. So, my first question is:
> Are the expression values in order of the tissue ID with each pair of
> replicates adjacent to one another (i.e., 0,0,1,1,2,2,3,3,etc...), or
> ordered by tissue ID for first replicate then by tissue ID for second
> replicate (i.e., 0,1,2,3,....; 0,1,2,3,...), or in some other order?
>
> Finally, I want to be sure that the tissue IDs listed in the table
> 'hgFixed.gnfHumanAtlas2MedianExps' (pasted below) are the same tissue
> IDs that I should be using to reference the absolute expression data
> provided in the 'hgFixed.gnfHumanAtlas2All' table. I ask this, in
> particular, because your correspondence of March 30,2010 indicated: "
>> For example, gnfHumanAtlas2AllExps.id =0 or =1, the first two fields are:
>>
>> id name
>>
>> 0 ColorectalAdenocarcinoma
>>
>> 1 ColorectalAdenocarcinoma 2 "
> which is different than the tissue ID-tissue description matches
> listed when I select and output the 'hgFixed.gnfHumanAtlas2All'
> table, for which I get the following list:
>
> #id description
> 0 fetal brain
> 1 whole brain
> 2 temporal lobe
> 3 parietal lobe
> 4 occipital lobe
> 5 prefrontal cortex
> 6 cingulate cortex
> 7 cerebellum
> 8 cerebellum peduncles
> 9 amygdala
> 10 hypothalamus
> 11 thalamus
> 12 subthalamic nucleus
> 13 caudate nucleus
> 14 globus pallidus
> 15 olfactory bulb
> 16 pons
> 17 medulla oblongata
> 18 spinal cord
> 19 ciliary ganglion
> 20 trigeminal ganglion
> 21 superior cervical ganglion
> 22 dorsal root ganglion
> 23 thymus
> 24 tonsil
> 25 lymph node
> 26 bone marrow
> 27 BM-CD71+ early erythroid
> 28 BM-CD33+ myeloid
> 29 BM-CD105+ endothelial
> 30 BM-CD34+
> 31 whole blood
> 32 PB-BDCA4+ dentritic cells
> 33 PB-CD14+ monocytes
> 34 PB-CD56+ NKCells
> 35 PB-CD4+ Tcells
> 36 PB-CD8+ Tcells
> 37 PB-CD19+ Bcells
> 38 leukemia lymphoblastic(molt4)
> 39 721 B lymphoblasts
> 40 lymphoma Burkitts Raji
> 41 leukemia promyelocytic(hl60)
> 42 lymphoma Burkitts Daudi
> 43 leukemia chronic myelogenous(k562)
> 44 colorectal adenocarcinoma
> 45 appendix
> 46 skin
> 47 adipocyte
> 48 fetal thyroid
> 49 thyroid
> 50 pituitary gland
> 51 adrenal gland
> 52 adrenal cortex
> 53 prostate
> 54 salivary gland
> 55 pancreas
> 56 pancreatic islets
> 57 atrioventricular node
> 58 heart
> 59 cardiac myocytes
> 60 skeletal muscle
> 61 tongue
> 62 smooth muscle
> 63 uterus
> 64 uterus corpus
> 65 trachea
> 66 bronchial epithelial cells
> 67 fetal lung
> 68 lung
> 69 kidney
> 70 fetal liver
> 71 liver
> 72 placenta
> 73 testis
> 74 testis Leydig cell
> 75 testis germ cell
> 76 testis interstitial
> 77 testis seminiferous tubule
> 78 ovary
>
> Thank you for any assistance you may provide.
>
> Kathleen
>
>
> On Tue, Mar 30, 2010 at 3:51 PM, Jennifer Jackson<[email protected]> wrote:
>> Hello Kathleen,
>>
>> There are 76 distinct tissues with two replicates per experiment, which
>> brings the number of values = 158 scores. The order of the tissues is in the
>> gnfHumanAtlas2AllExps.id field, the tissue names are in the
>> gnfHumanAtlas2AllExps.name field.
>>
>> For example, gnfHumanAtlas2AllExps.id =0 or =1, the first two fields are:
>>
>> id name
>>
>> 0 ColorectalAdenocarcinoma
>>
>> 1 ColorectalAdenocarcinoma 2
>>
>> This replication per-tissue is explained in the track's description page
>> (open Assembly browser and click on track name - or - open the Table browser
>> to the track, leave the primary table as-is, click on "describe table
>> schema", then scroll to the bottom on the page.
>>
>> Hopefully this addresses your questions, but please let us know if you need
>> more information,
>> Jen
>>
>> ---------------------------------
>> Jennifer Jackson
>> UCSC Genome Informatics Group
>> http://genome.ucsc.edu/
>>
>> On 3/30/10 5:49 AM, kathleen askland wrote:
>>> I have recently downloaded human expression data via UCSC genome Table
>>> Browser using the following query parameters: Mammal, human, Assembly:
>>> Feb 2009(GRCh37/hg19), Group: Expression, Track: GNFAtlas2, Table:
>>> hgFixed.gnfHumanAtlas2All, as I wanted all available replicates
>>> available for each probe.
>>>
>>> However, the file output is very difficult to understand. There were
>>> 44775 probes (as expected) for which data are available. Each probe
>>> has a corresponding 'hgFixed.gnfHumanAtlas2All.expCount' value= 158,
>>> suggesting there should be 158 expression values per probe and, in
>>> fact, the column headed 'hgFixed.gnfHumanAtlas2All.expScores' does in
>>> fact contain 158 comma-separated absolute expression values.
>>>
>>> However, I am not able to obtain the EXP ids (i.e., tissue name)
>>> associated with each of the 158 expression values in the sequence so
>>> how is one supposed to figure out which tissue each of the 158
>>> expression scores corresponds to?
>>>
>>> I have attempted to obtain those expression IDs in several ways, by
>>> selecting different associated tables to join and seemingly relevant
>>> variables to no avail. Moreover, even more confusingly, when I select
>>> from associated table gnfHumanAtlas2MedianExps the variables
>>> 'hgFixed.gnfHumanAtlas2AllExps.id' and
>>> 'hgFixed.gnfHumanAtlas2AllExps.name' which would seem like the desired
>>> information, I get a series of comma-separated EXP ids and the
>>> corresponding EXP id tissue names (e.g., 112 and Pancreas,
>>> respectively), but there are generally not 158 entries in each of
>>> these cells and many probes have 'n/a' in both columns.
>>>
>>> So, for example, probe '1007_s_at' has the following associated data:
>>> hgFixed.gnfHumanAtlas2All.expCount='158',
>>> hgFixed.gnfHumanAtlas2All.expScores=
>>> '3621,3212,1078,1130,475,408,375,528,...' (158 distinct values
>>> comma-separated)
>>> hgFixed.gnfHumanAtlas2AllExps.id= '112'
>>> hgFixed.gnfHumanAtlas2AllExps.name='Pancreas'
>>>
>>> While probe '117_at' gives:
>>> hgFixed.gnfHumanAtlas2All.expCount='158',
>>> hgFixed.gnfHumanAtlas2All.expScores=
>>> '338,277,2383,2456,617,423,...'(158 comma-separated values)
>>> hgFixed.gnfHumanAtlas2AllExps.id=
>>> '52,74,75,85,94,96,98,112,121,127,129,137,'
>>>
>>> hgFixed.gnfHumanAtlas2AllExps.name='cerebellum,CingulateCortex,CingulateCortex
>>> 2,Lung 2,Uterus,Thyroid,fetalThyroid,Pancreas,TestisGermCell
>>> 2,salivarygland 2,trachea 2,skin 2,'
>>>
>>> Since the number of expression values listed under
>>> 'hgFixed.gnfHumanAtlas2All.expScores' does not correspond to the
>>> number of Expression IDs/names listed under
>>> 'hgFixed.gnfHumanAtlas2AllExps.id' and
>>> 'hgFixed.gnfHumanAtlas2AllExps.name', respectively, how is one
>>> supposed to figure out which tissue each of the 158 expression scores
>>> corresponds to?
>>>
>
>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
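
The lookup described in this thread (expScores in hgFixed.gnfHumanAtlas2All matched to rows of hgFixed.gnfHumanAtlas2AllExps) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: that the AllExps table has been saved from the Table Browser as tab-separated text with the id in the first column and the name in the second, and that the score at position i in expScores belongs to the row with id i. The function names are hypothetical helpers, not part of any UCSC tool, and the toy rows reuse the two example ids and the first two scores quoted above purely for demonstration.

```python
def parse_exp_scores(exp_scores):
    """Split a comma-separated expScores string, ignoring the trailing comma."""
    return [float(s) for s in exp_scores.split(",") if s.strip()]


def parse_exps_table(text):
    """Build {id: name} from tab-separated rows; '#' header lines are skipped.

    Assumes column 0 is the experiment id and column 1 is the name.
    """
    names_by_id = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        names_by_id[int(fields[0])] = fields[1]
    return names_by_id


def label_scores(exp_scores, names_by_id):
    """Pair each score with an experiment name, assuming position i <-> id i."""
    scores = parse_exp_scores(exp_scores)
    if len(scores) != len(names_by_id):
        raise ValueError(
            "got %d scores but %d experiment rows" % (len(scores), len(names_by_id))
        )
    return [(names_by_id[i], score) for i, score in enumerate(scores)]


# Toy input for illustration only: two experiment rows and two scores.
toy_exps = "#id\tname\n0\tColorectalAdenocarcinoma\n1\tColorectalAdenocarcinoma 2\n"
toy_scores = "3621,3212,"

labeled = label_scores(toy_scores, parse_exps_table(toy_exps))
for name, score in labeled:
    print(name, score)
```

With the real tables, the same call would take the full 158-value expScores string for a probe and the full contents of the AllExps download; the length check is there to catch a mismatch, such as using the MedianExps table by mistake.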
