Re: [mart-dev] question about: Flank-coding region (Gene), Flank (Gene), 5' UTR

Bogdan Mon, 11 Jun 2007 00:55:26 -0700

Dear Damian,

Thank you for your reply.

> OK - we are improving the user warning and images for the forthcoming
> release :-) Downstream flank refers to "downstream of the gene". As it
> doesn't really make sense to join the upstream and downstream flanks when
> just selecting flanks, we disabled using them both together - it just
> returns the upstream flank, as you experienced. Apologies for the confusion.


For "flank-coding region", for both "gene" and "transcript", the image
does show both upstream and downstream flanks as those of the
gene/transcript, but for "flank" only the upstream sequence is
highlighted on the image. That was the source of confusion in my
case.

> <Dataset name = "rnorvegicus_gene_ensembl" interface = "default" >
>       <Attribute name = "gene_stable_id" />
>       <Attribute name = "coding_gene_flank" />
>       <Filter name = "upstream_flank" value = "1000"/>
>       <Filter name = "transcript_status" value = "KNOWN"/>
>       <Filter name = "ensembl_gene_id" value = "ENSRNOG00000006899"/>
> </Dataset>

> This should give you your 1000bp upstream of the TSS - is it not doing
> this? Or are you looking for something different? Let me know and I will
> try to help.


It does give 1kbp upstream, but I'm looking for the 1kbp up from TSS
*plus* a stretch of sequence down from TSS to the translation start
site (i.e. 5'UTR). I can do this with the following sample query:

<Dataset name = "rnorvegicus_gene_ensembl" interface = "default" >
      <Attribute name = "gene_stable_id" />
      <Attribute name = "5utr" />
      <Attribute name = "5utr_start" />
      <Attribute name = "5utr_end" />
      <Attribute name = "transcript_chrom_strand" />
      <Filter name = "upstream_flank" value = "1000"/>
      <Filter name = "transcript_status" value = "KNOWN"/>
      <Filter name = "ensembl_gene_id" value = "ENSRNOG00000014029"/>
</Dataset>

but I have a problem interpreting the results for the genes with
multiple 5'UTRs defined (like the ENSRNOG00000014029 in the sample
query above).
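
To make the region I am after concrete (1 kbp upstream of the TSS plus the 5'UTR down to the translation start), here is a minimal Python sketch of the coordinate arithmetic. It is only an illustration under stated assumptions: the function name is hypothetical, coordinates are taken to be 1-based and inclusive with start <= end, the strand is 1 or -1 as in "transcript_chrom_strand", and the 5'UTR is treated as a single unspliced block. The coordinates in the example calls are made up.

```python
def upstream_plus_utr(utr_start, utr_end, strand, flank=1000):
    """Return (start, end) covering the upstream flank plus the 5'UTR.

    Assumptions (not confirmed by the mart documentation):
    1-based inclusive genomic coordinates, utr_start <= utr_end,
    strand is 1 or -1, and the 5'UTR is one contiguous block.
    """
    if utr_start > utr_end:
        raise ValueError("expected utr_start <= utr_end")
    if strand == 1:
        # Forward strand: the TSS is utr_start, upstream is towards lower coordinates.
        return max(1, utr_start - flank), utr_end
    if strand == -1:
        # Reverse strand: the TSS is utr_end, upstream is towards higher coordinates.
        return utr_start, utr_end + flank
    raise ValueError("strand must be 1 or -1")


# Hypothetical coordinates, for illustration only.
print(upstream_plus_utr(5000, 5200, 1))
print(upstream_plus_utr(5000, 5200, -1))
```

If a 5'UTR spans more than one exon, a single interval like this would also include intronic sequence, so the sketch only applies to the unspliced case.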

I do not understand what multiple 5'UTRs should mean for a single
gene. Based on the query results, it appears that UTRs are linked to the
gene, and not to the gene's transcripts. Thus, multiple 5'UTRs shouldn't
mean the UTRs of individual transcripts. Then what sequence do I get with
the following query, issued for the multiple-5'UTR gene?

<Dataset name = "rnorvegicus_gene_ensembl" interface = "default" >
      <Attribute name = "gene_stable_id" />
      <Attribute name = "5utr" />
      <Attribute name = "transcript_chrom_strand" />
      <Filter name = "transcript_status" value = "KNOWN"/>
      <Filter name = "ensembl_gene_id" value = "ENSRNOG00000014029"/>
</Dataset>

I tried aligning the sequence returned by this query to the
"Unspliced (Gene)" sequence from the same gene, and there were 379bp
of identities followed by some 122bp of non-identical sequence (the full
length of the 5'UTR returned is 501bp). Hence the question: _what
exactly_ is returned by the 5'UTR query?
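
The comparison described above can be sketched roughly as follows in Python. This is not the alignment tool I used, just a simplified stand-in: it finds the longest prefix of the returned sequence that occurs contiguously in the unspliced gene sequence, then checks whether the remainder occurs anywhere further downstream. The function name and the toy sequences are hypothetical, and the sketch assumes exact matches starting at the first base of the returned sequence.

```python
def split_by_contiguous_match(query, reference):
    """Split query into (prefix, rest), where prefix is the longest
    leading part of query found as an exact substring of reference."""
    lo, hi = 0, len(query)
    # Binary search is valid here: if a prefix occurs in reference,
    # every shorter prefix occurs there as well.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if query[:mid] in reference:
            lo = mid
        else:
            hi = mid - 1
    return query[:lo], query[lo:]


# Toy data: two "exon" pieces separated by an "intron" in the reference.
reference = "TTT" + "ACGTACGG" + "GTAAGTCCCAG" + "CATGCATT" + "AAA"
query = "ACGTACGG" + "CATGCATT"

prefix, rest = split_by_contiguous_match(query, reference)
prefix_end = reference.find(prefix) + len(prefix)
rest_pos = reference.find(rest, prefix_end)  # -1 if not found downstream

print(len(prefix), len(rest), rest_pos - prefix_end)
```

If the remainder is found further downstream with a gap in between, that would be consistent with a 5'UTR spanning more than one exon, but this is only a guess on my side and not something the query output confirms.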

Thank you for your answer,

--
Sincerely yours,
Bogdan Tokovenko,
PhD student at the Laboratory of Protein Biosynthesis,
Department of Genetic Information Translation Mechanisms,
Institute of Molecular Biology and Genetics, Kyiv, Ukraine
http://bogdan.org.ua/
