Re: [Genome] valouev et al data 2008

Jennifer Jackson Mon, 22 Jun 2009 17:41:45 -0700

Hello Hagen,

Attachments are screened by the monitor program and I didn't request a 
copy in time today. However, I have deduced that the target species is 
/C. elegans/ (May 2008, ce6 assembly) and that the track is either 
"Nucleosome" or one of its derivatives (with variable stringencies 
applied). The answers below address this set of tracks, and in fact 
apply to most tracks in general, but please write back if I don't 
answer your questions completely.

1) Yes. With BED format output, one of the options gives one record per 
line for each read. Depending on which table you are retrieving, the 
Table Browser (TB) may be able to export the data. The limit for TB 
output is 100k lines. For tables with more data, either export the data 
per chromosome using the TB, use the public MySQL server, or FTP the 
file from the Downloads server (species -> assembly -> Annotation 
database -> table name = file name.txt.gz). The data for this track is 
stored in BED format in the database, and therefore the files in 
Downloads are also in BED format.
FTP instructions: http://genome.ucsc.edu/FAQ/FAQdownloads#download1
Public MySQL server: http://genome.ucsc.edu/FAQ/FAQdownloads#download29
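
A minimal sketch of reading such a downloaded table file in Python, one 
record per line. The function name read_table_dump and the has_bin_column 
flag are hypothetical. The flag covers the assumption that a dump may 
carry a leading "bin" index column ahead of the BED fields; check the 
table's schema before relying on either setting.

```python
import gzip
from typing import Iterator, List


def read_table_dump(path: str, has_bin_column: bool = False) -> Iterator[List[str]]:
    """Yield the fields of each record in a gzip-compressed, tab-separated dump.

    has_bin_column is an assumption to verify against the table schema:
    when True, the first column is treated as an index and dropped so that
    the remaining fields line up with BED (chrom, start, end, name, score, ...).
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            fields = line.split("\t")
            if has_bin_column:
                fields = fields[1:]
            yield fields
```

For example, iterating over read_table_dump("nucleosomeControl.txt.gz") 
would give one list of fields per read record without loading the whole 
file into memory.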

2) Again, yes. Column 5 is the score, which for this track is the 
collapsed depth of reads, grouped by starting position, as documented on 
the track description page in the section "Display Conventions and 
Configuration".
BED format is described here: http://genome.ucsc.edu/FAQ/FAQformat#format1

Some stats and an example SQL query for summarizing score values:
unix$ hgsql ce6
mysql> select count(*), score from nucleosomeControl group by score;
+----------+-------+
| count(*) | score |
+----------+-------+
| 18614626 |     1 |
|  5371783 |     2 |
|  2243522 |     3 |
|  1107780 |     4 |
|   603639 |     5 |
|   353237 |     6 |
more ....
|        1 |   345 |
+----------+-------+
193 rows in set (0.00 sec)
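
The same summary can be sketched in Python for a BED file that has 
already been exported, without database access. The function names 
tally_scores and total_depth are hypothetical. The total is meaningful 
only on the assumption that the score is the collapsed read depth, as 
described in 2) above.

```python
from collections import Counter
from typing import Dict, Iterable


def tally_scores(bed_lines: Iterable[str]) -> Dict[int, int]:
    """Count records per score value, like GROUP BY score in the query above."""
    counts: Counter = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        counts[int(fields[4])] += 1  # column 5 (index 4) is the score
    return dict(sorted(counts.items()))


def total_depth(score_counts: Dict[int, int]) -> int:
    """Sum score * record count; equals the number of reads only if the
    score really is the collapsed depth at each start position."""
    return sum(score * n for score, n in score_counts.items())
```

With a file on disk, tally_scores(open("reads.bed")) would return a 
mapping from score to record count (the file name here is made up).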

Thank you for your patience today while we worked out an answer and 
please write back if you have questions or would like more help,
Jennifer Jackson
UCSC Genome Bioinformatics Group

Hagen Tilgner wrote:
> Dear UCSC developers and users,
>
> I have two questions that hopefully require only a short "yes".
>
> 1) I downloaded nucleosome reads from the Table Browser (see 
> README_download.png for a screenshot of exactly what I did) from the 
> Valouev et al. publication. Upon hitting "get output" I get to a page 
> (see makeSurecorrect.png for a screenshot) that asks me whether I want 
> 1 record per gene, exon, etc. All I want is 1 bed-line per read of the 
> experiment genome wide, independently of any annotation. Is this what 
> I get when hitting "get BED" as in the attached png ?
>
> 2) I assume that column 5 in the resulting bed-file gives me the 
> number of times this read was observed. Am I right ?
>
> I would really appreciate your help
> Hagen
> ------------------------------------------------------------------------
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>   
