Re: [Genome] Obtaining Selected Predicted Protein Sequences

Mary Goldman Wed, 05 May 2010 11:19:40 -0700

Hi Basel,

If you use the protAcc numbers, then obtaining the protein sequences in 
FASTA format is easy with our Table Browser. To do so, follow these 
directions:
1. Go to the Table Browser (the "Tables" link in the top menu of our home page).
2. Reset cart settings (link below the "get output" button) if UCSC 
Genes is not already the selected track.
3. Create a filter by clicking "create" next to "filter".
4. Under hg19.kgXref paste your identifiers into the protAcc value box, 
each separated by a space, and click "submit".
5. Select the output format "sequence", type in a file name if desired, 
and click "get output".
6. Select the "protein" button and click "submit".
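
If your accession list is long, preparing the text for step 4 by hand 
gets tedious. Here is a minimal Python sketch that turns a list with one 
accession per line into the space-separated form the protAcc box 
expects. The function name and the sample accessions are made up for 
illustration; they are not part of the Table Browser.

```python
def format_accessions(lines):
    """Join accessions into one space-separated string.

    Blank lines are skipped and duplicates are dropped, keeping the
    first occurrence so the original order is preserved.
    """
    seen = set()
    unique = []
    for line in lines:
        acc = line.strip()
        if acc and acc not in seen:
            seen.add(acc)
            unique.append(acc)
    return " ".join(unique)


if __name__ == "__main__":
    # Placeholder accessions for illustration only.
    example = ["NP_000001\n", "\n", "NP_000002\n", "NP_000001\n"]
    print(format_accessions(example))
```

Paste the printed line into the protAcc value box.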

Unfortunately, there is no way to see the protAcc numbers that are 
associated with the UCSC Gene IDs in FASTA format. If you would like to 
see how the UCSC IDs compare with the protAcc numbers in a table and 
then convert to the FASTA format, you will have to instead select the 
output format "selected fields from the primary and related tables" and 
click "get output". Scroll down to the Linked Tables, select 
knownGenePep, scroll down to the bottom and click "Allow selection from 
checked tables". Finally, select the fields desired from all tables and 
then click "get output".
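
Converting that table to FASTA afterwards can be done with a short 
script. The Python sketch below is only an illustration and rests on a 
few assumptions: that the output is tab-separated, that any header line 
starts with "#", and that you selected the UCSC ID, the protAcc, and the 
knownGenePep sequence in that column order. Adjust the column indexes if 
your field selection differs. The function name and the sample row are 
made up.

```python
def table_to_fasta(table_text, id_col=0, acc_col=1, seq_col=2, width=60):
    """Convert tab-separated rows into FASTA records.

    Each record header is ">protAcc ucscId". Lines that are blank or
    start with "#" (assumed to be the header line) are skipped.
    """
    records = []
    for line in table_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        header = ">{} {}".format(fields[acc_col], fields[id_col])
        seq = fields[seq_col].strip()
        # Wrap the sequence into fixed-width lines.
        chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append("\n".join([header] + chunks))
    return "".join(record + "\n" for record in records)


if __name__ == "__main__":
    # Made-up row for illustration; real output comes from the Table Browser.
    sample = (
        "#hg19.kgXref.kgID\thg19.kgXref.protAcc\thg19.knownGenePep.seq\n"
        "uc001aaa.1\tNP_000001\tMKTAYIAKQR\n"
    )
    print(table_to_fasta(sample, width=4), end="")
```

Putting the protAcc first in the header line keeps it as the record 
identifier for most downstream tools, with the UCSC ID alongside it.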

I hope this information is helpful.  Please feel free to contact the 
mail list again if you require further assistance.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group

On 5/4/10 2:22 PM, Baghal, Basel (NIH/NIAAA) [F] wrote:
> Hi,
>
> I am a research trainee at the NIH/NIAAA. For my current research project, I 
> need to find the most efficient way of obtaining a selected list of 
> RefSeq Predicted Protein Sequences (in FASTA format) which corresponds to my 
> list of mrnaAcc numbers (or, alternatively, protAcc numbers). I am 
> specifically looking for the UCSC predicted proteins because there are slight 
> differences when compared to NCBI.
>
> Previously I have used the genome browser (hg18 and Rhesus) to find the 
> protein sequences for a short list of accession numbers one by one; however, 
> now I am looking for an efficient way to obtain and possibly export the aa 
> sequences for a much longer list.  If you could provide me with any pointers 
> or advice I would really appreciate it.
>
> Sincere thanks,
> Basel Baghal
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>    
