Hi Kaida,

Please find my answers to your questions below. Feel free to contact us again at [email protected] if you have any further questions.
---
Luvina Guruvadoo
UCSC Genome Bioinformatics Group

On 6/8/2012 7:54 AM, Kaida Ning wrote:
> Dear Sir/Madam,
>
> I want to download all transcription factors and their binding sites from
> ENCODE.
>
> Is the file "wgEncodeRegTfbsClusteredV2.bed.gz" in the following link the
> correct one to download?
> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeRegTfbsClustered/

You can find more information about the wgEncodeRegTfbsClusteredV2 table by taking a look at the track description page: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeRegTfbsClusteredV2. Note this part of the description, "This track combines data from many different cell lines and transcription-factor targeting antibodies from ENCODE into a relatively dense display." For original TFBS data submitted by ENCODE labs, please see the tracks in the ENCODE TF Binding Super-track (http://genome.ucsc.edu/cgi-bin/hgTrackUi?g=wgEncodeTfBindingSuper) and their associated download pages.

>
> If so, what are the columns in the .bed file?

All columns and their descriptions can be found in the schema for this table: http://genome.ucsc.edu/cgi-bin/hgTables?db=hg19&hgta_group=regulation&hgta_track=wgEncodeRegTfbsClusteredV2&hgta_table=wgEncodeRegTfbsClusteredV2&hgta_doSchema=describe+table+schema

>
> How were the binding site peaks called? Were they all called by same
> algorithm or by the original papers published?

From the track description, "The peaks were computed using a uniform pipeline developed by Anshul Kundaje that uses the variation between the two replicates to develop sensible peak thresholds."

>
> What are the broadpeak and narrowpeak files in the link for HAIB TFBS and
> SYDH TFBS etc. on page http://genome.ucsc.edu/ENCODE/downloads.html?

For more information, please see track descriptions for HAIB TFBS (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeHaibTfbs) and SYDH TFBS (http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeSydhTfbs).
You may also find information on the differences between broadpeak and narrowpeak file formats on our Data Formats FAQ page: broadpeak (http://genome.ucsc.edu/FAQ/FAQformat.html#format13) and narrowpeak (http://genome.ucsc.edu/FAQ/FAQformat.html#format12)

>
> Thank you!
>
> Best,
> Kaida
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
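
For readers who want to work with the broadpeak and narrowpeak files discussed above, here is a minimal sketch of a parser. It assumes the column layouts given in the Data Formats FAQ: narrowPeak is BED6+4 (chrom, chromStart, chromEnd, name, score, strand, signalValue, pValue, qValue, peak) and broadPeak is BED6+3 (the same without the final peak column). The names `Peak`, `parse_peak_line` and `read_peaks` are illustrative only and are not part of any UCSC tool.

```python
import gzip
from typing import Iterator, NamedTuple, Optional


class Peak(NamedTuple):
    chrom: str
    start: int            # chromStart, 0-based
    end: int              # chromEnd, exclusive
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float        # -1 when not assigned
    q_value: float        # -1 when not assigned
    summit_offset: Optional[int]  # narrowPeak only; None for broadPeak


def parse_peak_line(line: str) -> Peak:
    """Parse one narrowPeak (10 fields) or broadPeak (9 fields) record."""
    fields = line.split()
    if len(fields) not in (9, 10):
        raise ValueError(f"expected 9 or 10 fields, got {len(fields)}")
    summit = int(fields[9]) if len(fields) == 10 else None
    return Peak(
        fields[0], int(fields[1]), int(fields[2]), fields[3],
        int(fields[4]), fields[5],
        float(fields[6]), float(fields[7]), float(fields[8]),
        summit,
    )


def read_peaks(path: str) -> Iterator[Peak]:
    """Yield records from a plain or gzip-compressed peak file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip blank lines and optional header/comment lines.
            if not line.strip() or line.startswith(("track", "browser", "#")):
                continue
            yield parse_peak_line(line)
```

Distinguishing the two formats by field count keeps the sketch short; a stricter reader would take the expected format as an argument and reject files that do not match it.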
