It's no surprise that generating mgf files with other tools can
make Mascot perform (much) better, since those tools apply things like
peak picking, which MzXML2Search doesn't do.  MzXML2Search pretty much
just takes the input spectral data and writes it back out in the
chosen output format.

Anyway, what you describe below involves two issues:  a lower number of
identifications and the max ions cutoff.  There's no real near-term
solution that directly addresses the first issue.  That fix would require
a developer to spend time implementing a peak picking routine in
the tool that's validated to work well with Mascot.  The second issue
can be mitigated by using the '-N<num>' option in MzXML2Search.  That
command line option specifies the maximum peak count to export for any
given spectrum.  Use a command like the following:

   MzXML2Search -mgf -N100 input.mzXML

This will cause only the 100 most intense peaks in each spectrum
to be written out.  I just use 100 as an example.  Because this
reduces the peak count, it will have some effect on the resulting
Mascot identifications.  I'm sure there is some peak count
value that will give you the optimal number of identifications; whether
that number approaches what you get from the PLGS
data export is unknown, though.
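
To make the effect of a peak count cutoff concrete, here is a minimal
Python sketch of "keep only the N most intense peaks".  It is not
MzXML2Search's actual code; the function name keep_top_peaks, the
(m/z, intensity) tuple layout, and the choice to return peaks sorted by
m/z are all assumptions made for illustration.

```python
import heapq

def keep_top_peaks(peaks, max_peaks):
    """Return the max_peaks most intense (mz, intensity) pairs, sorted by m/z.

    Hypothetical helper: illustrates the idea behind a peak count cutoff,
    not how MzXML2Search implements its -N option.
    """
    if max_peaks <= 0:
        return []
    # Select the most intense peaks (intensity is the second tuple element).
    top = heapq.nlargest(max_peaks, peaks, key=lambda p: p[1])
    # Restore m/z order, which is the usual layout of a peak list.
    return sorted(top, key=lambda p: p[0])

# Toy spectrum: (m/z, intensity) pairs, made up for this example.
spectrum = [(100.1, 5.0), (150.2, 80.0), (200.3, 12.0), (250.4, 300.0), (300.5, 1.0)]
print(keep_top_peaks(spectrum, 3))
# -> [(150.2, 80.0), (200.3, 12.0), (250.4, 300.0)]
```

The two weakest peaks are dropped, which is why a low cutoff can change
what Mascot is able to match.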

If you're motivated to do so, I would suggest generating
SSM411.mgf with various peak counts (50, 100, 150, 200, 400, etc.),
running each one through Mascot to see which gives the most
identifications, and checking whether the results approach the PLGS results.
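
A sweep like this could be scripted.  The sketch below only builds the
MzXML2Search command lines, using the '-mgf' and '-N<num>' options shown
above.  The input name SSM411.mzXML and the helper name
build_sweep_commands are assumptions, and each run will presumably write
the same output file name, so you would need to rename or move the .mgf
between runs.

```python
def build_sweep_commands(mzxml_path, peak_counts, exe="MzXML2Search"):
    """Build one MzXML2Search argument list per peak count.

    Hypothetical helper: it only assembles commands using the -mgf and
    -N<num> options; it does not run anything or rename output files.
    """
    return [[exe, "-mgf", f"-N{n}", mzxml_path] for n in peak_counts]

if __name__ == "__main__":
    # SSM411.mzXML is an assumed input file name.
    for cmd in build_sweep_commands("SSM411.mzXML", [50, 100, 150, 200, 400]):
        print(" ".join(cmd))
```

Each printed line can be run by hand, or passed to subprocess.run(cmd,
check=True) once you have decided how to keep the outputs apart.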

- Jimmy

On Tue, Jun 29, 2010 at 4:12 PM, Jing Wang <fayfay9...@gmail.com> wrote:
> Hi Brian, Jimmy, David,
>
> Thanks for all the suggestions!
>
> I did all you suggested: generated .mgf file, re-searched by Mascot, renamed
> .dat file......., and they all worked for the file I have been trying
> (SSM411, as Jimmy pointed out) so far. But, when I tried the different mzXML
> files (since SSM411 is just one of the fractions), I was stuck on Mascot
> searching. I have tried another 4 different mzXML files, they all gave me
> similar error messages:
>
> Max number of ions is 10000. Ignoring ms-ms set starting at line 139813
> [M00031]
> Your search is continuing...
> Warning:
> .............................(similar warnings with different line numbers)
> .......................
> .......................
> Max number of ions is 10000. Ignoring ms-ms set starting at line 154038
> [M00031]
> Your search is continuing...
> Warning:
>
> Your search is continuing...
>
> Sorry, your search could not be performed due to the following mistake
> entering data.
> Missing ion intensity value on line 3857088 of input file [M00430]
> Please press the back button on your browser, correct the fault and retry
> the search.
>
> Another problem is that although the SSM411.mgf file worked in the Mascot search, the
> results are a bit different from what I got earlier when searching the pkl files
> generated by PLGS (Waters). The result from .mgf (converted by MzXML2Search)
> gives 33 identified proteins, and the one from .pkl (converted by PLGS)
> gives 39 identified proteins. The result from .mgf also gives fewer number
> of "peptide matches above identity threshold" compared to result from .pkl
> file (104 vs. 135).
> I also tried to generate a .pkl file with the MzXML2Search command just out of
> curiosity. It gives very similar results compared to the search from the
> .mgf file. The only difference is "peptide matches above identity threshold"
> with 103 instead of 104. The search from .pkl file (converted by PLGS)
> didn't give any warning message during the searching process, while the .mgf
> and .pkl (converted by MzXML2Search) gave the similar warning message as
> follows:
>
> Max number of ions is 10000. Ignoring ms-ms set starting at line 275395
> [M00031]
> Your search is continuing...
>
> ....................... (similar warnings with different line numbers)
>
> .......................
>
> Your search is continuing...
> Finished uploading search details and file...
> Searching....
> Warning:
> Error 31 has been detected 26 times and only the first 10 messages have been
> output [M00999]
> Your search is continuing...
>
> .20% complete
> ..50% complete
>
> Any suggestions for fixing?
>
> Thanks in advance,
>
> Jing
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To post to this group, send email to spctools-disc...@googlegroups.com.
> To unsubscribe from this group, send email to
> spctools-discuss+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.
>
