Hi Vanessa,

Thank you so much for your reply.

Best,

Hong

On Tue, Dec 6, 2011 at 5:36 PM, Vanessa Kirkup Swing
<[email protected]> wrote:

> Hi Hong,
>
> Our primary focus is on vertebrates, which these alignments are all based
> on. This means that we won't be adding yeast, fly, or nematode to the
> multiple alignment. You might be interested in the BlastTab tables, which
> establish orthology by BLAST. Here is a previously answered mailing list
> question that will give you more information:
>
> https://lists.soe.ucsc.edu/pipermail/genome/2011-July/026544.html
>
>
> With regard to your last question, NM_001024599 aligns to multiple places
> in the genome. One of those places is aligned to calJac1 in the multiple
> alignment; the other is not.
>
> The description page for the conservation track has a lot of information
> that might be of interest to you:
>
> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&c=chr21&g=cons46way
>
> Hope this helps. If you have further questions, please email the mailing
> list: [email protected].
>
> Vanessa Kirkup Swing
> UCSC Genome Bioinformatics Group
>
>
> ---------- Forwarded message ----------
> From: Hong Lu <[email protected]>
> Date: Tue, Dec 6, 2011 at 2:15 PM
> Subject: Re: [Genome] UCSC human multiple alignment (protein)
> To: Brian Raney <[email protected]>
> Cc: [email protected]
>
>
> Hi Brian,
>
> Thank you so much for your quick reply.
>
> I attached two files (diff_seq.txt and miss_seq.txt). The file
> "diff_seq.txt" lists genes whose protein sequences from NCBI differ from
> the protein sequences used to run the multiple alignment at UCSC. The file
> "miss_seq.txt" lists genes that have protein sequences at NCBI but no
> alignment results at UCSC.
>
> Also, are you planning to include yeast, Drosophila, and C. elegans in
> the multiple alignment? Thanks.
>
> For the problem of NM_001024599 and NM_000344, I will take NM_001024599 as
> an example.
>
> If I grep "NM_001024599" and "hg19" from the alignment results, I find
> two records:
> >NM_001024599_hg19_1_1 127 0 0 chr1:149398799-149399179-
>
> MPDPAKSAPAPKKGSKKAVTKVQKKDGEKRKRSRKESYSVYVZEVLKQVHPDTGISSKTMGIMNSFVNDIFERIAGEASRLAHYNKRSTITSREIQTAVRLLLPGELAKHAVSEGTKAVTKYTSSKZ
>
> >NM_001024599_hg19_1_1 127 0 0 chr1:149783498-149783878-
>
> MPDPAKSAPAPKKGSKKAVTKVQKKDGKKRKRSRKESYSVYVYKVLKQVHPDTGISSKAMGIMNSFVNDIFERIAGEASRLAHYNKRSTITSREIQTAVRLLLPGELAKHAVSEGTKAVTKYTSSKZ
>
> The protein sequences of this gene at the two positions are exactly the
> same. But when we check the homology in calJac1:
>
> At the first position, it's
> >NM_001024599_calJac1_1_1 127 0 0 Contig10067:55107-55487-
>
> MPDPAKSAPAPKKGSKKAVTKVQKKDGKKRKRSRKESYSVYVYKVLKQVHPDTGISSKAMGIMNSFVNDIFERIAGEASRLAHYNKRSTITSREIQTAVRLLLPGELAKHAVSEGTKAVTKYTSSKZ
>
> At the second position, it's
> >NM_001024599_calJac1_1_1 127 0 0
>
> -------------------------------------------------------------------------------------------------------------------------------
>
> That means we cannot find a homolog of gene NM_001024599 at the second
> position even though we can find one at the first position. I don't know
> the reason for this. Thanks.
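
The check described above (grep an accession, then compare what each reference locus aligns to) can be sketched in Python. This is only an illustration under stated assumptions: the record names are taken to follow the accession_db_exon_exonCount pattern seen in the quoted headers, each hg19 record is assumed to be immediately followed by the other species' records for the same exon, and the function names and the SAMPLE text (the quoted records truncated to their first ten residues) are made up for this example.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

# Truncated copies of the records quoted above (first ten residues only),
# used purely for illustration.
SAMPLE = """\
>NM_001024599_hg19_1_1 127 0 0 chr1:149398799-149399179-
MPDPAKSAPA
>NM_001024599_calJac1_1_1 127 0 0 Contig10067:55107-55487-
MPDPAKSAPA
>NM_001024599_hg19_1_1 127 0 0 chr1:149783498-149783878-
MPDPAKSAPA
>NM_001024599_calJac1_1_1 127 0 0
----------
"""


def read_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-style lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def species_by_locus(
    lines: Iterable[str], accession: str, ref_db: str = "hg19"
) -> List[Tuple[Optional[str], Dict[str, bool]]]:
    """Group records for one accession by reference locus.

    Returns a list of (reference locus, {species: has_aligned_residues}).
    Assumes the reference record precedes the other species' records.
    """
    blocks: List[Tuple[Optional[str], Dict[str, bool]]] = []
    current: Optional[Tuple[Optional[str], Dict[str, bool]]] = None
    for header, seq in read_records(lines):
        fields = header.split()
        # The accession itself contains "_", so split from the right.
        parts = fields[0].rsplit("_", 3)
        if len(parts) != 4 or parts[0] != accession:
            current = None
            continue
        db = parts[1]
        if db == ref_db:
            locus = fields[4] if len(fields) > 4 else None
            current = (locus, {})
            blocks.append(current)
        elif current is not None:
            # A sequence made only of "-" means nothing aligned here.
            current[1][db] = any(ch != "-" for ch in seq)
    return blocks


if __name__ == "__main__":
    for locus, species in species_by_locus(SAMPLE.splitlines(), "NM_001024599"):
        print(locus, species)
```

On the truncated sample this reports calJac1 as aligned at the first hg19 locus and as all gaps at the second, mirroring the records quoted above. To run it on the real download, one would pass an open text handle on the decompressed refGene.exonAA.fa file instead of SAMPLE.splitlines().
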
>
> Best,
>
> Hong
>
>
> On Tue, Dec 6, 2011 at 1:39 PM, Brian Raney <[email protected]> wrote:
>
> > Hey Hong,
> > I'll go ahead and regenerate the alignments for the refSeq gene
> > models. These should be ready by tomorrow. I'll send you some mail
> > off-list to tell you when they're done. In the near future we plan
> > to regenerate these files more frequently.
> > I don't understand your question about NM_001024599 and NM_000344.
> > Those two genes have significantly different mRNA sequences, and are
> > found in different places in the genome.
> > I hope this answers your questions. Please reply to this list with
> > any follow-up questions.
> > Brian
> > On Tue, Dec 6, 2011 at 11:57 AM, Hong Lu <[email protected]> wrote:
> > > Hello,
> > >
> > > I am interested in the multiple alignment of human proteins from UCSC.
> > >
> > > http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz46way/alignments/refGene.exonAA.fa.gz
> > >
> > > But this file is relatively old (about two years old). Are you planning
> > > to update this file? If so, would you please tell me when it can be
> > > released? NCBI has updated many genes within the last two years.
> > >
> > > Also, when I read that file, I found that sometimes the human protein
> > > sequences are exactly the same, but the multiple alignments are very
> > > different (such as NM_001024599 and NM_000344). Could you tell me the
> > > reason? Thanks.
> > >
> > > Best,
> > >
> > > Hong
> > > _______________________________________________
> > > Genome maillist  -  [email protected]
> > > https://lists.soe.ucsc.edu/mailman/listinfo/genome
> >