Hello again Alain,

One of our engineers suggests that you also consider using the GENCODE 
Comprehensive or GENCODE Basic tracks.  The GENCODE tracks include 
Ensembl's predictions for non-coding RNAs as well as HAVANA (Vega 
database) manually annotated ncRNAs.  Therefore they are more 
comprehensive sets with more types of ncRNA than Ensembl alone.  You can 
read more details by clicking on the GENCODE track name (in the Genes and 
Gene Prediction Tracks section) on the main Genome Browser display page 
(http://genome.ucsc.edu/cgi-bin/hgTracks).

You can use the wgEncodeGencodeAttrsV7.transcriptClass field to filter 
the wgEncodeGencodeBasicV7 or the wgEncodeGencodeCompV7 table.  If there 
is an ID in the havanaGeneId field of the "Attrs" table, then the 
transcript has been manually annotated (and may also be in Ensembl).  If 
there is no havanaGeneId, then the annotation is only in Ensembl.
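
If you download both tables and prefer to filter them yourself, the logic above could look roughly like the Python sketch below. It is only an illustration under several assumptions: that the track table's "name" field holds the transcript ID, that the "Attrs" table has a matching "transcriptId" field, and that "nonCoding" is the transcriptClass value you want. Please check the table schemas in the Table Browser for the actual field names and values. The sample rows are made up.

```python
# Sketch only: join rows from a GENCODE track table to its "Attrs" table
# and keep the non-coding transcripts. Only transcriptClass and
# havanaGeneId come from the email; the "name"/"transcriptId" join key,
# the "nonCoding" value, and the sample rows are assumptions.

def select_noncoding(track_rows, attrs_rows, noncoding_class="nonCoding"):
    """Return (transcript name, annotation source) pairs for matching rows."""
    # Index the Attrs rows by transcript ID (assumed join key).
    attrs_by_id = {row["transcriptId"]: row for row in attrs_rows}
    selected = []
    for row in track_rows:
        attrs = attrs_by_id.get(row["name"])
        # Skip transcripts with no Attrs row or with a different class.
        if attrs is None or attrs["transcriptClass"] != noncoding_class:
            continue
        # A non-empty havanaGeneId means the transcript was manually
        # annotated; an empty one means the annotation is only in Ensembl.
        source = "manual" if attrs["havanaGeneId"] else "ensembl_only"
        selected.append((row["name"], source))
    return selected


# Hypothetical rows, for illustration only.
track_rows = [
    {"name": "tx1", "chrom": "chr1"},
    {"name": "tx2", "chrom": "chr1"},
    {"name": "tx3", "chrom": "chr2"},
]
attrs_rows = [
    {"transcriptId": "tx1", "transcriptClass": "nonCoding", "havanaGeneId": "hav1"},
    {"transcriptId": "tx2", "transcriptClass": "coding", "havanaGeneId": "hav2"},
    {"transcriptId": "tx3", "transcriptClass": "nonCoding", "havanaGeneId": ""},
]

result = select_noncoding(track_rows, attrs_rows)
print(result)
```

The same function works for either the Basic or the Comprehensive table, since only the list of track rows changes.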

I hope this additional information is helpful.

--
Brooke Rhead
UCSC Genome Bioinformatics Group


On 2/17/12 4:50 PM, Brooke Rhead wrote:
> Hi Alain,
>
> You could set up a filter using the related ensemblSource table, which
> shows coding/non-coding status.  With the ensGene table selected in the
> Table Browser, hit the "filter: create" button and choose the
> ensemblSource table from the Linked Tables section. Create the filter
> such that "source doesn't match protein_coding".
>
> To see the full list of possible values for the ensemblSource table,
> view its table schema in the Table Browser, and hit the "values" link in
> the info column for the source field.
>
> You should be aware that there are other ways to find the non-coding
> transcripts: one is to create a filter on ensGene where
> cdsStart=cdsEnd; another is to select only the ensGene entries that
> have no protein listed in the protein field of the ensGtp table.
>
> Using the first method, I get a count of 43,096 genes that are annotated
> as something other than protein-coding.  Using either of the other
> methods, I get a count of 88,332 genes that do not have an annotated
> coding region or associated protein.
>
> If you have further questions, please contact us again at
> [email protected].
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
> On 2/17/12 2:27 PM, Alain Pacis wrote:
>> Hi,
>>
>> Is it possible to get a list of ncRNAs from the ensembl track?
>> I managed to do so with the refseq track by using NR_* in the name filter.
>> I am not quite sure how to employ the same filter for the ensembl track.
>>
>> Thank you.
>>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome