Hello, Steve, 

 
What about the case of representing insertions in BED files?

Could you please tell me how you would represent an insertion in a BED file?
Wouldn't it be as below?


Let's say that we want to represent the following insertion, which is given
on the COSMIC website, in a BED file:

http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=mut_summary&id=681

On the COSMIC website, the start and end positions for an insertion of
CTGTGGGCT at chr17:37881006..37881007 (GRCh37) are represented as:

start: 37881006
end: 37881007 


Both the start and end positions are 1-based in COSMIC.



Next, when we convert the variant above to a BED file, the start position
becomes 0-based, which shifts the position to the left.

So in the BED file, this would be the representation:

start: 37881006
end: 37881006 


So the bed file for this insertion would look like:

CHR      START       END         UNIQUID  TYPE
chr17    37881006    37881006    id2      REF=;OBS=CTGTGGGCT
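
The conversion above can be sketched in Python. This is only a minimal illustration, not COSMIC or UCSC tooling: the function name `insertion_to_bed` is hypothetical, and it assumes the insertion is reported as its two adjacent 1-based flanking bases, as in the COSMIC example.

```python
def insertion_to_bed(chrom, left_flank, right_flank, name, inserted):
    """Build a BED-style line for an insertion given its 1-based flanking bases.

    Assumption: the inserted sequence lies between left_flank and right_flank,
    so the two flanking positions must be adjacent.
    """
    if right_flank != left_flank + 1:
        raise ValueError("flanking bases of an insertion must be adjacent")
    # In 0-based, half-open coordinates, the gap after 1-based base N is the
    # zero-length interval [N, N), so start and end are both left_flank.
    start = end = left_flank
    return "\t".join([chrom, str(start), str(end), name, f"REF=;OBS={inserted}"])


print(insertion_to_bed("chr17", 37881006, 37881007, "id2", "CTGTGGGCT"))
```

The printed line has start equal to end, which is the zero-length case asked about here.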



According to the example I have given, would you agree that it is possible for
the "start" position to be equal to the "end" position in a BED file?

You are right that the start position can NOT be numerically larger than the
end position. However, the two can sometimes be equal, as in the insertion
example I have given above.

Thank you,
Laura


________________________________
 From: Steve Heitner <[email protected]>
To: 'Laura Smith' <[email protected]>; [email protected] 
Sent: Tuesday, March 27, 2012 11:46 AM
Subject: RE: [Genome] Can "end position" can ever be larger than "start 
position" in a BED format file?
 
Hello, Laura.

The simple answer to all three of your questions is yes. Whether they are on
the + or - strand, both chromStart and thickStart must be numerically
smaller than chromEnd and thickEnd, respectively. If chromStart or
thickStart are not smaller than chromEnd or thickEnd, you will receive an
error such as "Error line 3 of custom track: chromStart after chromEnd (6000
> 2000)" when trying to load your custom track.
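
As a rough illustration of the check behind that error message, here is a minimal Python sketch. It is not the Genome Browser's loader code: the function name `check_bed_coords` is hypothetical, and it rejects only a chromStart greater than chromEnd, which is what the quoted message literally reports. Whether equal values (a zero-length item) are accepted by the real loader is left as an open assumption here.

```python
def check_bed_coords(line_no, chrom_start, chrom_end):
    """Raise ValueError if chromStart is after chromEnd (hypothetical helper)."""
    if chrom_start > chrom_end:
        raise ValueError(
            f"Error line {line_no} of custom track: "
            f"chromStart after chromEnd ({chrom_start} > {chrom_end})"
        )


check_bed_coords(2, 2000, 6000)  # start before end: passes silently
try:
    check_bed_coords(3, 6000, 2000)  # start after end: rejected
except ValueError as err:
    print(err)
```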

Please contact us again at [email protected] if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Laura Smith
Sent: Monday, March 26, 2012 6:10 PM
To: [email protected]
Subject: [Genome] Can "end position" can ever be larger than "start
position" in a BED format file?

Hi, 

I read the information on BED format on your website; however, something is
not very clear to me.

1. In BED format, is the start position always less than or equal to the end
position?

2. In other words, can the "end position" ever be larger than the "start
position" in a BED format file?


The reason I am asking is that I want to know the following:

3. For a region on the negative strand, if someone wants to represent it in a
BED file, is the start position always less than the end position?


Looking at the negative-strand examples provided on your website, it is
easy to assume that the end position is always larger than or equal to the
start position regardless of the strand. Is this correct?

If you could please answer the 3 questions above, I would appreciate it. 

thanks,
Laura 

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601

BED format provides a flexible way to define the data lines that are
displayed in an annotation track. BED lines have three required fields and
nine additional optional fields. The number of fields per line must be
consistent throughout any single set of data in an annotation track.

The first three required BED fields are:

chrom - The name of the chromosome (e.g. chr3, chrY, chr2_random) or contig
(e.g. ctgY1).
chromStart - The starting position of the feature in the chromosome or
contig. The first base in a chromosome is numbered 0.
chromEnd - The ending position of the feature in the chromosome or contig.
The chromEnd base is not included in the display of the feature. For
example, the first 100 bases of a chromosome are defined as chromStart=0,
chromEnd=100, and span the bases numbered 0-99.
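
The half-open convention described for chromStart and chromEnd can be checked with a few lines of Python. This is a minimal sketch; the helper names `feature_length` and `covered_bases` are hypothetical and only restate the arithmetic in the text.

```python
def feature_length(chrom_start, chrom_end):
    """Number of bases in a 0-based, half-open interval [chrom_start, chrom_end)."""
    return chrom_end - chrom_start


def covered_bases(chrom_start, chrom_end):
    """0-based base numbers covered by the feature; chrom_end itself is excluded."""
    return range(chrom_start, chrom_end)


bases = covered_bases(0, 100)
# The first 100 bases: length 100, spanning bases numbered 0 through 99.
print(feature_length(0, 100), bases[0], bases[-1])
```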

The 9 additional optional BED fields are:

name - Defines the name of the BED line. This label is displayed to the left
of the BED line in the Genome Browser window when the track is open to full
display mode or directly to the left of the item in pack mode.
score - A score between 0 and 1000. If the track line useScore attribute is
set to 1 for this annotation data set, the score value will determine the
level of gray in which this feature is displayed (higher numbers = darker
gray).
strand - Defines the strand - either '+' or '-'.
thickStart - The starting position at which the feature is drawn thickly
(for example, the start codon in gene displays).
thickEnd - The ending position at which the feature is drawn thickly (for
example, the stop codon in gene displays).
reserved - This should always be set to zero.
blockCount - The number of blocks (exons) in the BED line.
blockSizes - A comma-separated list of the block sizes. The number of items
in this list should correspond to blockCount.
blockStarts - A comma-separated list of block starts. All of the blockStart
positions should be calculated relative to chromStart. The number of items
in this list should correspond to blockCount.

Example:
Here's an example of an annotation track that uses a complete BED
definition:

track name=pairedReads description="Clone Paired Reads" useScore=1
chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512
chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601
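
To make the blockSizes and blockStarts fields concrete, here is a minimal Python sketch that reads the two data lines above and turns the relative block starts into chromosome coordinates. The function name `parse_bed12` is hypothetical, and the sketch assumes whitespace-separated fields with an optional trailing comma in the list fields, as in this example.

```python
def parse_bed12(line):
    """Parse one 12-field BED line; return (chrom, start, end, blocks).

    blocks is a list of (block_start, block_end) pairs in 0-based, half-open
    chromosome coordinates.
    """
    fields = line.split()
    if len(fields) != 12:
        raise ValueError("expected 12 BED fields")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    block_count = int(fields[9])
    # The list fields may end with a trailing comma, so skip empty items.
    sizes = [int(x) for x in fields[10].split(",") if x]
    starts = [int(x) for x in fields[11].split(",") if x]
    if not (len(sizes) == len(starts) == block_count):
        raise ValueError("blockSizes/blockStarts do not match blockCount")
    # blockStarts are relative to chromStart, so add it back.
    blocks = [(start + rel, start + rel + size) for rel, size in zip(starts, sizes)]
    return chrom, start, end, blocks


for bed_line in (
    "chr22 1000 5000 cloneA 960 + 1000 5000 0 2 567,488, 0,3512",
    "chr22 2000 6000 cloneB 900 - 2000 6000 0 2 433,399, 0,3601",
):
    print(parse_bed12(bed_line))
```

In both lines the last block ends exactly at chromEnd, and the coordinates increase from left to right regardless of the strand field.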
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome