Hello,

I'm Carly's mentor. I'd just like to follow up by thanking you for all of your 
assistance with browsing the data hosted on your website. It's an invaluable 
resource. Your tips have allowed us to make a lot of progress so far.

We are curious about a couple of specific things…

 1.  Is there a resource where we can read more details about the "Score" for 
histone methylation signal? Would this be in the Sabo et al. 2004 paper?
 2.  Once we generate a list of "high methylation score" regions, can the 
filters and/or table intersect tools be used to retrieve the methylated regions 
that are located within –500 and +500 base pairs of every annotated 
transcription start site?
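
For reference, the window described in question 2 can be computed from a 
gene's strand and transcript coordinates. This is only a minimal Python sketch, 
assuming UCSC-style 0-based, half-open coordinates and the knownGene fields 
chrom, strand, txStart, and txEnd. The function name promoter_window and the 
example coordinates are hypothetical.

```python
def promoter_window(chrom, strand, tx_start, tx_end, flank=500):
    """Return (chrom, start, end) spanning `flank` bp on each side of the TSS.

    Assumes 0-based, half-open coordinates: the TSS is taken as txStart for
    "+" strand genes and as txEnd for "-" strand genes.
    """
    if strand == "+":
        tss = tx_start
    elif strand == "-":
        tss = tx_end
    else:
        raise ValueError("strand must be '+' or '-'")
    start = max(0, tss - flank)  # do not run past the start of the chromosome
    return chrom, start, tss + flank


# Hypothetical genes, one on each strand
print(promoter_window("chr1", "+", 1000, 5000))
print(promoter_window("chr1", "-", 1000, 5000))
```

The strand check matters because, for genes on the minus strand, transcription 
starts at the txEnd coordinate rather than at txStart.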

Once we get these questions squared away, we will continue working on our 
project with the help of bioinformatics colleagues here at ASU. Thanks again 
for helping us to become more familiar with the ENCODE data hosted by the UCSC 
Genome Browser site.

Best,
---Karmella

--
Karmella A. Haynes, Ph.D.
Assistant Professor
School of Biological and Health Systems Engineering
Arizona State University
501 E. Tyler Mall, ECG 346
Tempe, AZ 85287
E-mail: [email protected]
Website: haynes.lab.asu.edu


From: Carly Hom <[email protected]>
Date: Sat, 4 Feb 2012 02:41:59 -0700
To: [email protected]
Cc: Karmella Haynes <[email protected]>
Subject: Question: Table Browser Filtering

Hello, I received a response from you with instructions on how to filter the 
Table Browser. This was my original question:

Hi my name is Carly Hom and I am an undergraduate student researcher at Arizona 
State University working with Dr. Karmella Haynes. In my current lab I am using 
Synthetic Biology and Bioinformatics to investigate reliable and predictable 
reactivation of dormant genes that can help treat cancer and enable tissue 
re-growth. By determining which silenced genes will switch to an active state 
in osteosarcoma cells, with the presence of the synthetic transcription factor 
Pc-TF, my work will establish a comprehensive method for predicting the effect 
of rationally designed protein-based drugs. Pc-TF, a synthetic transcription 
factor developed by Dr. Haynes, regulates cell states by binding the repressive 
trimethyl-histone H3 lysine 27 signal (H3K27me3) and switching silenced genes 
to an active state in osteosarcoma cells. Since a comprehensive ChIP map is not 
available for osteosarcoma, I will be identifying genes associated with 
H3K27me3 in liver (HepG2) and fibroblast (BJ) cell lines. Overall, I will need 
to collect about 1000 genes from the ENCODE database that show a significant 
enough H3K27me3 signal at the promoter of the gene. I have already figured out 
how to display only the H3K27me3 data for the HepG2 and BJ cell lines, but 
finding genes by clicking through each cell line would take far too long and 
could cause me to miss important genes. At the request 
of Dr. Haynes I am asking if ENCODE has some sort of filter program that will 
provide a list of genes where the promoter site shows a high level of the 
H3K27me3 histone methylation. I will need it to be able to find the beginning 
of the gene's promoter on the UCSC Genome Browser and then show about 500 bp to 
the left and 500 bp to the right of the promoter. Ultimately, I want to be 
able to navigate through the genes in this cell line that show a significant 
enough H3K27me3 signal at the promoter (everything else with a low H3K27me3 
signal I do not care about). If you could get back to me on whether this is 
even possible to do within the Genome Browser and, if so, how I would do it, 
that would be great. Thank you!

After a couple of emails back and forth this was the best response we received:

Hello, Karmella.
To expand upon Luvina's instructions, to create the filter, perform the
following steps in the Table Browser:
1. Select the following options:
Clade: Mammal
Genome: Human
Assembly: Feb. 2009 (GRCh37/hg19)
Group: Genes and Gene Prediction Tracks
Track: UCSC Genes
Table: knownCanonical
2. Next to "filter", click the "create" button
3. In the "Linked Tables" section, scroll down and check the hg19.knownGene
checkbox
4. Scroll to the bottom of the page and click the "Allow Filtering Using
Fields in Checked Tables" button
5. In the "hg19.knownGene based filters" section, the third line should read
"strand does match +"
6. Click the "submit" button
Also note that these tables:
wgEncodeUwHistoneHepg2H3k27me3StdPkRep1
wgEncodeUwHistoneHepg2H3k27me3StdPkRep2
wgEncodeUwHistoneBjH3k27me3StdPkRep1
wgEncodeUwHistoneBjH3k27me3StdPkRep2
contain the methylation scores for H3K27me3 in the HepG2 and BJ cell lines as
calculated according to the process outlined in the UW Histone track
description here:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=wgEncodeUwHistone
and in the reference contained therein. You may also be interested in these
additional tables:
wgEncodeUwHistoneHepg2H3k27me3StdHotspotsRep1
wgEncodeUwHistoneHepg2H3k27me3StdHotspotsRep2
wgEncodeUwHistoneBjH3k27me3StdHotspotsRep1
wgEncodeUwHistoneBjH3k27me3StdHotspotsRep2
which also contain methylation hotspot data. You can read more about both
sets of tables in the aforementioned description.
Please be aware that the ENCODE tables contain all the methylation scores,
not just the high scores. If you're only interested in the high methylation
scores, you'll need to filter the ENCODE tables as in my example above:
1. Select the following options:
Clade: Mammal
Genome: Human
Assembly: Feb. 2009 (GRCh37/hg19)
Group: Regulation
Track: UW Histone
Table: select the appropriate tables
2. Next to "filter", click the "create" button
3. Edit the "score" line so that it contains the values you are interested
in such as "score is >= 500"
4. Click the "submit" button
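
The same score cutoff could also be applied to a downloaded copy of a peak
table. This is only a minimal Python sketch, assuming tab-separated rows whose
first five columns are in BED order (chrom, chromStart, chromEnd, name, score).
The function name filter_by_score and the sample rows are hypothetical.

```python
def filter_by_score(lines, min_score=500):
    """Yield (chrom, start, end, name, score) for rows with score >= min_score."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/track lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        score = int(float(fields[4]))  # assumed: column 5 holds the score
        if score >= min_score:
            yield fields[0], int(fields[1]), int(fields[2]), fields[3], score


# Hypothetical rows standing in for Table Browser output
sample = [
    "#chrom\tchromStart\tchromEnd\tname\tscore",
    "chr1\t100\t250\tpeakA\t620",
    "chr1\t900\t1050\tpeakB\t310",
]
high = list(filter_by_score(sample))
print(high)
```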

These instructions helped me out a lot, but I still need help with a couple 
more things. If any of these are not possible, please let me know. I am aware 
that I can check the chrom, chromStart, and chromEnd boxes and then copy and 
paste the coordinates into the Genome Browser. Is it possible for the tables to 
provide the location of the promoter (+/- 500 bp to the left and right of that 
region) instead of my being taken to an arbitrary area of the gene? Also, how 
exactly is the histone methylation score measured? I want to be able to select 
a specific score range (e.g., >= 500), but I am unsure what qualifies as a 
significant enough signal, since I do not know what the score is measured 
relative to or what the highest possible score is. Lastly, is there a way to 
add the gene name to the table output? That would be much easier than having 
to work out the gene names when some genes are very close to each other and 
overlap. I know that the gene name checkbox is available for the knownCanonical 
table, but I can't seem to find it when I am in the Regulation group with the 
'selected fields from primary and related tables' output format. If you could 
get back to me on these items, that would be great. Thank you!
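
To illustrate the kind of output I am after, here is a minimal Python sketch 
of pairing gene names with peaks that overlap a promoter window. It assumes 
0-based, half-open coordinates; the function name genes_with_peaks, the tuple 
layouts, and the sample data are all hypothetical and are not Table Browser 
output.

```python
def genes_with_peaks(windows, peaks):
    """Map each gene name to the peaks that overlap its promoter window.

    windows: iterable of (chrom, start, end, gene)
    peaks:   iterable of (chrom, start, end, score)
    """
    hits = {}
    for w_chrom, w_start, w_end, gene in windows:
        for p_chrom, p_start, p_end, score in peaks:
            # Half-open intervals overlap when each starts before the other ends
            if p_chrom == w_chrom and p_start < w_end and w_start < p_end:
                hits.setdefault(gene, []).append((p_start, p_end, score))
    return hits


# Hypothetical promoter windows and peaks
windows = [("chr1", 500, 1500, "geneA"), ("chr1", 4500, 5500, "geneB")]
peaks = [("chr1", 1400, 1600, 620), ("chr2", 500, 1500, 900)]
result = genes_with_peaks(windows, peaks)
print(result)
```

The nested loop is slow for genome-wide lists, so this is only meant to show 
the pairing of gene names with peak coordinates and scores.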

- Carly

--
Caroline Hom
Tempe, AZ
Ph: 602-315-5728
Arizona State University
Ira A. Fulton Schools of Engineering
Biomedical Engineering
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
