It's no surprise that generating mgf files using other tools can cause Mascot to perform (much) better, since those tools apply things like peak picking, which MzXML2Search doesn't do. MzXML2Search pretty much just takes the input spectral data and writes it back out in the chosen output format.
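Because the export is unfiltered, individual spectra can carry more peaks than Mascot accepts. A minimal sketch for checking this, assuming the usual MGF layout (BEGIN IONS / END IONS blocks, KEY=VALUE header lines, and peak lines that start with the m/z value), is below. The function names are made up for illustration, and the 10000 limit is simply the figure quoted in the Mascot warning in this thread.

```python
from typing import Iterable, List, Tuple

# Limit taken from the Mascot warning quoted in this thread (an assumption
# that it applies per MS/MS spectrum on your server).
MASCOT_MAX_IONS = 10000


def count_peaks(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (title, peak_count) for each BEGIN IONS / END IONS block."""
    results = []
    in_block = False
    title = ""
    n_peaks = 0
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            in_block, title, n_peaks = True, "", 0
        elif line == "END IONS":
            if in_block:
                results.append((title, n_peaks))
            in_block = False
        elif in_block and line:
            if line.startswith("TITLE="):
                title = line[len("TITLE="):]
            elif line[0].isdigit():
                # Peak lines begin with the m/z value; headers are KEY=VALUE.
                n_peaks += 1
    return results


def over_limit(counts: List[Tuple[str, int]],
               limit: int = MASCOT_MAX_IONS) -> List[Tuple[str, int]]:
    """Keep only the spectra whose peak count exceeds the limit."""
    return [(title, n) for title, n in counts if n > limit]
```

To try it on a real file, pass an open file handle, for example `over_limit(count_peaks(open("SSM411.mgf")))`, and look at how many spectra come back.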
Anyway, what you describe below involves two issues: a lower number of identifications and the max ions cutoff. There's no real near-term solution that directly addresses the first issue. That fix would require a developer to spend time implementing a peak picking routine in the tool and validating that it works well with Mascot.

The second issue can be mitigated by using the '-N<num>' option in MzXML2Search. That command line option specifies the maximum peak count to export for any given spectrum. Use a command like the following:

    MzXML2Search -mgf -N100 input.mzXML

This will cause only the 100 most intense peaks for each spectrum to be printed out. I just use 100 as an example. Because this does reduce the peak count, it will have some effect on the resulting Mascot identifications. I'm sure there is some peak count value that will give you the optimal number of identifications; whether that number approaches what you get from the PLGS data export is unknown, though.

If you're motivated to do so, I would suggest that you generate SSM411.mgf using various peak counts (50, 100, 150, 200, 400, etc.) and run them through Mascot to see which gives the most identifications and whether the results approach the PLGS results.

- Jimmy

On Tue, Jun 29, 2010 at 4:12 PM, Jing Wang <fayfay9...@gmail.com> wrote:
> Hi Brian, Jimmy, David,
>
> Thanks for all the suggestions!
>
> I did all you suggested: generated .mgf file, re-searched by Mascot, renamed
> .dat file......., and they all worked for the file I have been trying
> (SSM411, as Jimmy pointed out) so far. But, when I tried the different mzXML
> files (since SSM411 is just one of the fractions), I was stuck on Mascot
> searching. I have tried another 4 different mzXML files, they all gave me
> similar error messages:
>
> Max number of ions is 10000. Ignoring ms-ms set starting at line 139813
> [M00031]
> Your search is continuing...
> Warning:
> .............................(similar warnings with different line numbers)
> .......................
> .......................
> Max number of ions is 10000. Ignoring ms-ms set starting at line 154038
> [M00031]
> Your search is continuing...
> Warning:
>
> Your search is continuing...
>
> Sorry, your search could not be performed due to the following mistake
> entering data.
> Missing ion intensity value on line 3857088 of input file [M00430]
> Please press the back button on your browser, correct the fault and retry
> the search.
>
> Another problem is that although the SSM411.mgf file worked in the Mascot
> search, the results are a bit different from what I got earlier when
> searching with pkl files generated by PLGS (Waters). The result from .mgf
> (converted by MzXML2Search) gives 33 identified proteins, and the one from
> .pkl (converted by PLGS) gives 39 identified proteins. The result from .mgf
> also gives fewer "peptide matches above identity threshold" compared to the
> result from the .pkl file (104 vs. 135).
> I also tried generating a .pkl file with MzXML2Search, just out of
> curiosity. It gives a very similar result compared to the search from the
> .mgf file. The only difference is "peptide matches above identity threshold"
> with 103 instead of 104. The search from the .pkl file (converted by PLGS)
> didn't give any warning message during the searching process, while the .mgf
> and .pkl (converted by MzXML2Search) gave similar warning messages, as
> follows:
>
> Max number of ions is 10000. Ignoring ms-ms set starting at line 275395
> [M00031]
> Your search is continuing...
>
> ....................... (similar warnings with different line numbers)
>
> .......................
>
> Your search is continuing...
> Finished uploading search details and file...
> Searching....
> Warning:
> Error 31 has been detected 26 times and only the first 10 messages have been
> output [M00999]
> Your search is continuing...
>
> .20% complete
> ..50% complete
>
> Any suggestions for fixing?
>
> Thanks in advance,
>
> Jing
>
> --
> You received this message because you are subscribed to the Google Groups
> "spctools-discuss" group.
> To post to this group, send email to spctools-disc...@googlegroups.com.
> To unsubscribe from this group, send email to
> spctools-discuss+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/spctools-discuss?hl=en.