Dear Henry,

I really appreciate all your comments.
The IDs for Cases 3 and 4 were indeed the 2nd hits.
Now it all makes sense to me.
I agree with your points and hope the work goes well.

Minyoung.

On Mar 23, 2:01 pm, Henry Lam <heining...@gmail.com> wrote:
> Dear Minyoung,
>
> Thanks for your message. Did you manage to check the actual IDs for
> Cases 3 and 4? I suppose if our X filter is working properly, then
> your IDs should be the second hits, not the top hits any more.
>
> In Case 4, if the top hit is thrown out, then the second hit is the
> only hit. In such a special case, deltaCn is set to 1.
>
> Still, I think the way we are neglecting X-containing peptides needs
> some improvement. For instance, even if we're reporting the 2nd hit as
> the ID in Case 3 (pretending the sequence of the top hit never existed
> in the database), the deltaCn should be (xcorr(2) - xcorr(3)) / xcorr(2).
> The currently reported 0.0417 is (xcorr(1) - xcorr(3)) / xcorr(1),
> which makes no sense if we are throwing out the top hit.
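
The arithmetic Henry describes can be sketched in Python as follows. This is only an illustration, not the TPP code: the function name, the case-insensitive X test, and the xcorr values are assumptions. The xcorr values are chosen (with the top hit scaled to 1.0) so that they reproduce the deltCn column of Case 3, and the first and third peptide strings are placeholders because the real sequences are truncated in the archive.

```python
def delta_cn_after_x_filter(hits):
    """Return deltaCn for the best hit that survives the X filter.

    hits: list of (peptide, xcorr) tuples sorted by descending xcorr.
    Peptides containing X are dropped first, as if never searched.
    """
    # Assumption: the X test is case-insensitive on the peptide string.
    kept = [xcorr for peptide, xcorr in hits if "X" not in peptide.upper()]
    if not kept:
        return None  # nothing left to report
    if len(kept) == 1:
        return 1.0   # the special case from the thread: a lone hit gets 1
    # Normalize by the xcorr of the hit actually reported.
    return (kept[0] - kept[1]) / kept[0]


# Hypothetical xcorr values consistent with the Case 3 deltCn column.
case3 = [
    ("PEPXTIDEK", 1.0000),         # placeholder for the X-containing top hit
    ("LC*ELLYESEFDSQLW", 0.9704),  # 2nd hit, reported as the ID
    ("PEPTIDER", 0.9583),          # placeholder for the 3rd hit
]

# What is currently reported: (xcorr(1) - xcorr(3)) / xcorr(1)
currently_reported = (case3[0][1] - case3[2][1]) / case3[0][1]
# What Henry proposes: (xcorr(2) - xcorr(3)) / xcorr(2)
proposed = delta_cn_after_x_filter(case3)
```

With these made-up numbers the currently reported value comes out at about 0.0417, while the proposed value is noticeably smaller, which is the discrepancy Henry points out.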
>
> I will refer this problem to the developer who worked on the X filter,
> and together we'll come up with a solution somehow. In any case, I
> suppose the X-containing peptides are quite rare, so this should not
> have a big impact on your analysis, hopefully.
>
> Henry
>
> On Mar 23, 10:43 am, Minyoung <minyoung....@gmail.com> wrote:
>
>
>
> > Hi Henry,
> > Sorry for the late reply.
>
> > I used Petunia and made the pepXML with the default settings.
> > I checked the amino acid X problem and confirmed your comment.
> > In addition, when the best hit contains an X, the shtml deltaCn was
> > calculated from the 3rd hit or lower even though the 2nd hit does not
> > contain an X. Cases 3 and 4 are examples.
>
> > case 3
> > #1 e.qgxtdymgads...@ikr.k  deltCn=0.0000
> > #2 L.LC*ELLYESEFDSQLW.I deltCn=0.0296
> > #3 a.ekic*eytytdie...@g.k deltCn=0.0417
> > then, deltaCn in shtml is 0.0417.
>
> > case 4
> > #1 I.LAXXXYEGLKEFZBCB.Z deltCn=0.0000
> > #2 B.ZAQLSLM#QLYLTNKSD.N deltCn=0.3882
> > The out file has only these two hits.
> > then, deltaCn in shtml is 1.
>
> > If the default setting ignores X-containing peptides, why do the 3rd and
> > 4th cases output a deltaCn that skips the 2nd hit?
>
> > On Mar 21, 11:49 pm, Henry Lam <heining...@gmail.com> wrote:
>
> > > Dear Minyoung,
>
> > > I think I found the issue. It does not have anything to do with the
> > > calculation of deltaCn values, but with the fact that in your
> > > examples, some of the lower hits have the "amino acid" X in them. The
> > > default behavior of the .out to .pep.xml converter is that all X-
> > > containing peptides are ignored, as if they are never searched. So
> > > your examples all make sense if you pretend that all the X-containing
> > > peptides disappear.
>
> > > For example, in your Case 2, the 2 homologs at the 2nd and 3rd
> > > positions both contain an X. They are treated as invisible. So your
> > > deltaCn becomes that of 1st - 4th. In your Case 3, your 2nd hit
> > > contains an X, so your deltaCn becomes 1st - 3rd. etc.
>
> > > This behavior is designed to keep those X-containing peptides out of
> > > the final result set, since they are often confusing. Whether or not
> > > we should also apply this to lower hits (and hence alter the
> > > deltaCn behavior) is, I think, not an easy call. Any comments?
>
> > > If you want to get the expected behavior back, you need to run Out2XML
> > > separately, and specify the -all option at the end. Then you can
> > > xinteract normally, starting from the pep.xml files.
>
> > > Henry
>
> > > On Mar 21, 12:53 pm, Henry Lam <heining...@gmail.com> wrote:
>
> > > > Hi Minyoung,
>
> > > > I am having trouble reproducing your results here. Would you help me
> > > > by telling me exactly how you produced the .out files and .pep.xml
> > > > files that show the anomaly, i.e. the exact command you ran? Perhaps
> > > > even copy and paste the out files and the relevant portion of the
> > > > .pep.xml file for me? Thanks a lot in advance.
>
> > > > If you are familiar with how to build TPP from the code base directly,
> > > > I can tell you where to get the reverted change right away and see if
> > > > it fixes your problem. Or you'll have to wait for the official
> > > > release.
>
> > > > Henry
>
> > > > On Mar 21, 8:16 am, Jimmy Eng <j...@systemsbiology.org> wrote:
>
> > > > > I apologize to all who should not care about these esoteric Sequest
> > > > > details ...
>
> > > > > I guess the important point to note is that the previous code did not
> > > > > calculate a deltaCn value for the 2nd hit, nor is there any placeholder
> > > > > to report that value anywhere in the pepXML.  (We can calculate it and
> > > > > add in another attribute if it's important.)  There's only 1 deltaCn
> > > > > value that is between top hit and first non-homologous hit, whether
> > > > > that's x2 or xN.   There's also a deltacnstar attribute with valid
> > > > > values of 0/1 (true/false) to indicate if the deltacn value is between
> > > > > top 2 hits or between top hit and something lower down in the list.
> > > > > Hope that clarifies things.
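
A rough Python sketch of the rule Jimmy describes (one deltaCn, taken at the first hit that is not similar to the top hit, plus a flag saying whether anything was skipped) might look like this. The function names are hypothetical, and the similarity test is a deliberately crude stand-in that only compares residues after stripping modification symbols; it is not the 75% homology test used by the real converter. The data are Case 1 from Minyoung's original post quoted further down.

```python
def top_hit_delta_cn(hits, is_similar):
    """Pick the deltaCn reported for the top hit.

    hits: list of (peptide, deltCn) in .out rank order, where deltCn is the
          value listed in the .out file (already relative to the top hit).
    is_similar: predicate deciding whether two peptides count as homologous.
    Returns (delta_cn, skipped_homologs), or None if every lower hit is
    similar to the top hit (that case is not covered in the thread).
    """
    top_peptide = hits[0][0]
    for rank, (peptide, delt_cn) in enumerate(hits[1:], start=2):
        if not is_similar(top_peptide, peptide):
            # skipped_homologs is True when the value does not come from hit 2.
            return delt_cn, rank > 2
    return None


def same_residues(a, b):
    """Stand-in similarity test: identical residues once symbols are removed."""
    def strip(seq):
        return "".join(ch for ch in seq if ch.isalpha())
    return strip(a) == strip(b)


# Case 1 from the thread (flanking residues omitted).
case1 = [("C*HCCA", 0.0000), ("CHCC*A", 0.0046), ("HC*CCA", 0.0558)]
result = top_hit_delta_cn(case1, same_residues)  # (0.0558, True)
```

The boolean plays the role of the deltacnstar attribute; how that attribute encodes the two states as 0 and 1 is not asserted here.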
>
> > > > > Henry Lam wrote:
> > > > > > Hi Jimmy,
>
> > > > > > Oh no no. I know what deltaCn means for the top hit. It is the
> > > > > > deltaCn of the lower hits I'm changing. The deltaCn of the top hit
> > > > > > is what you described: x1-x2 in most cases, and x1-x(the highest
> > > > > > non-homologous hit) otherwise. The code change I made should not
> > > > > > change that (or at least I thought).
>
> > > > > > But what is the deltaCn of the second hit? (I know we don't use the
> > > > > > second hit at all in our pipeline, but it doesn't mean people won't.)
> > > > > > In the old code, it is x1-x3. In the new code, it is x2-x3. I don't
> > > > > > see why that doesn't make more sense. Similarly, the deltaCn of the
> > > > > > 3rd hit is x3-x4 in the new code, x1-x4 in the old code.
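
To make the old and new definitions for lower hits concrete, here is a small Python sketch. The function name and the xcorr values are made up, homology skipping is ignored, and the normalization (dividing by the reference xcorr) is an assumption based on the formulas quoted elsewhere in this thread rather than on the SequestOut.cpp source.

```python
def lower_hit_delta_cn(xcorrs, rank, relative_to_top):
    """deltaCn for the hit at 1-based `rank`, compared with the next hit down.

    xcorrs: xcorr values sorted in descending order.
    relative_to_top=True  -> old behavior: (x1 - x[rank+1]) / x1
    relative_to_top=False -> new behavior: (x[rank] - x[rank+1]) / x[rank]
    """
    reference = xcorrs[0] if relative_to_top else xcorrs[rank - 1]
    next_hit = xcorrs[rank]  # 0-based index `rank` is the hit ranked rank+1
    return (reference - next_hit) / reference


xcorrs = [4.0, 3.0, 2.0, 1.0]  # made-up scores for hits 1..4

old_second = lower_hit_delta_cn(xcorrs, 2, relative_to_top=True)   # x1-x3
new_second = lower_hit_delta_cn(xcorrs, 2, relative_to_top=False)  # x2-x3
old_third = lower_hit_delta_cn(xcorrs, 3, relative_to_top=True)    # x1-x4
new_third = lower_hit_delta_cn(xcorrs, 3, relative_to_top=False)   # x3-x4
```

For the top hit (rank 1) the two modes coincide, which matches Henry's statement that his change should not affect the top hit's deltaCn.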
>
> > > > > > That said, I was afraid that my code change had some unintended
> > > > > > consequence that maybe I failed to see. Let me spend some time
> > > > > > figuring this out.
>
> > > > > > Henry
>
> > > > > > On Mar 21, 12:21 am, Jimmy Eng <j...@systemsbiology.org> wrote:
> > > > > >> Henry,
>
> > > > > > >> I'm not going to have any time in the next week or so to look into
> > > > > > >> the problem.  But your interpretation of what deltaCn means is wrong,
> > > > > > >> or rather different from what it is meant to represent.
>
> > > > > > >> The premise for the ad-hoc deltaCn value is to generate some number
> > > > > > >> to quantify how different the top hit is from the next best hit.  So
> > > > > > >> deltaCn is always just the normalized xcorr for hit 2 (or hit 3 or
> > > > > > >> hit N).  For the typical case, it is just the difference between the
> > > > > > >> top hit and the 2nd best hit (i.e. xcorr(2)).  When there's homology
> > > > > > >> in the top hits, deltaCn was calculated to be the difference between
> > > > > > >> the top hit and the first dissimilar hit.  If that is the 3rd peptide,
> > > > > > >> then the output value should be the normalized xcorr(3) and not
> > > > > > >> xcorr(3)-xcorr(2).  Hope that makes sense.  If you would like a
> > > > > > >> different interpretation of what number should go in that field, I
> > > > > > >> guess we should discuss it offline, including how it impacts
> > > > > > >> PeptideProphet.  But until then, I think you want to revert the
> > > > > > >> correction you made for the next update release.
>
> > > > > >> - Jimmy
>
> > > > > >> Henry Lam wrote:
> > > > > >>> Hi Jimmy,
> > > > > > >>> I made a change recently to SequestOut.cpp to retain the first
> > > > > > >>> deltaCn (regardless of homology) in the deltacnstar field. I also
> > > > > > >>> corrected the deltaCn of the lower hits (e.g. deltaCn of the 2nd hit
> > > > > > >>> is now xcorr(3) - xcorr(2) rather than xcorr(3)). I looked at it
> > > > > > >>> again today but couldn't see why my changes would cause the behavior
> > > > > > >>> seen by Minyoung. Maybe it's unrelated, but perhaps this will point
> > > > > > >>> you to something:
> > > > > > >>> http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteom...
> > > > > >>> Henry
> > > > > >>> On Mar 19, 5:24 am, Jimmy Eng <j...@systemsbiology.org> wrote:
> > > > > > >>>> The deltaCn is calculated from the first non-similar peptide
> > > > > > >>>> compared to the top hit.  Similarity is based on sequence homology
> > > > > > >>>> and the cutoff is 75%.  The homology determination is definitely
> > > > > > >>>> not optimally calculated, but that doesn't explain your problems.
> > > > > > >>>> Anyway, some of the deltaCn values in your examples below are
> > > > > > >>>> definitely wrong; the only exception is example 1.  Unfortunately,
> > > > > > >>>> I haven't seen that behavior in any of my results.  Someone would
> > > > > > >>>> need to see your files (out and pep.xml) to try to figure out the
> > > > > > >>>> problem.
> > > > > >>>> - Jimmy
> > > > > >>>> Minyoung wrote:
> > > > > >>>>> Hi.
> > > > > > >>>>> I wonder why the deltaCn values from the out file and from the
> > > > > > >>>>> PeptideProphet shtml are different.
> > > > > >>>>> I observed the following:
> > > > > >>>>> 1.
> > > > > > >>>>> when the best hit and the second best hit in an out file are very
> > > > > > >>>>> similar (identical sequence except for PTMs),
> > > > > > >>>>> the shtml deltaCn is calculated with reference to the third best hit.
> > > > > >>>>> example> in some out file
> > > > > >>>>> #1 P.C*HCCA.P deltCn=0.0000
> > > > > >>>>> #2 P.CHCC*A.P deltCn=0.0046
> > > > > >>>>> #3 R.HC*CCA.E deltCn=0.0558
> > > > > >>>>> then, deltaCn in shtml is 0.0558.
> > > > > >>>>> 2.
> > > > > >>>>> when the second best and the third best hit are very similar,
> > > > > >>>>> shtml DeltaCn is calculated with the next best hit.
> > > > > >>>>> example>
> > > > > > >>>>> #1 r.fqspagtealfe...@isvadsan@YSC*VYVDLKPPFGGSAPSER.L deltCn=0.0000
> > > > > >>>>> #2 c.eecgkafnqstnltrhkrihtaekpykceecgkafnh...@.l deltCn=0.0028
> > > > > >>>>> #3 c.eecgkafnqstnltrhkrihtaekpykceecgk...@hpxn.l deltCn=0.0220
> > > > > >>>>> #4 Q.KFPKPLPQEYQYFDELSGIPAEDLPYYGGSVEIADYC*PFS.Q deltCn=0.1644
> > > > > >>>>> then, deltaCn in shtml is 0.1644.
> > > > > >>>>> 3.
> > > > > > >>>>> there is no sequence homology from the best hit down to the
> > > > > > >>>>> reference hit, but the shtml deltaCn is calculated with that
> > > > > > >>>>> reference hit.
> > > > > >>>>> example>
> > > > > >>>>> #1 e.qgxtdymgads...@ikr.k deltCn=0.0000
>
> ...
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"spctools-discuss" group.
To post to this group, send email to spctools-discuss@googlegroups.com
To unsubscribe from this group, send email to 
spctools-discuss+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/spctools-discuss?hl=en
-~----------~----~----~----~------~----~------~--~---
