I just re-extracted that file and I don't see the issue anymore. Perhaps this was a decompression issue.
Thanks for checking.
-David

On Thu, Oct 22, 2020 at 12:19 PM Emily Kawaler <e.kawa...@gmail.com> wrote:

> Hello,
> Thanks so much for taking a look! I think the selenocysteines ("U") are
> likely not the problem, since I've got those in all of my databases,
> including the ones that run correctly. I'm looking at
> 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML and I don't see
> anything odd in line 171821 ("</modification_info>"), so I think our line
> numberings might not match up - what does your problematic line contain?
>
> When I try to run it on my end, it always sticks somewhere in the
> 10CPTAC_OV files. Right now I'm running a working set of spectra with a
> database that didn't work and vice versa, so hopefully that'll help me pin
> down whether it's a problem with my spectra or my database - will let you
> know how that turns out!
>
> Emily
>
> On Thursday, October 22, 2020 at 3:09:29 PM UTC-4 David Shteynberg wrote:
>
>> Hi Emily,
>>
>> I analyzed the search results that you sent and I am seeing some strange
>> things in at least one of the files you gave me. This may be causing some
>> of the problems you saw.
>> In file 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML on line
>> 171821 there are some strange characters (possibly binary) that are
>> tripping up the TPP. I think these might be caused by a bug in an analysis
>> tool upstream of the TPP. Not sure if there are other mistakes of this
>> sort. Also I found some 'U' amino acids in the database which the TPP
>> complains about having a mass of 0.
>>
>> I hope this helps you somewhat. Let me know what you find on your end.
>>
>> Cheers,
>> -David
>>
>> On Tue, Oct 20, 2020 at 1:42 PM Emily Kawaler <e.ka...@gmail.com> wrote:
>>
>>> Sure!
>>> The spectra are from the CPTAC2 ovarian prospective dataset, though
>>> I removed all scans that matched to a standard reference database (I don't
>>> think the scan removal is the issue, since I'm also having this problem on
>>> a different dataset without removing any scans; I also checked with xmllint
>>> and it looks like the mzML pepXML files are valid). I've been running it
>>> with the philosopher pipeline, so the pepXML files were generated with
>>> MSFragger as part of that pipeline. The database is a customized variant
>>> database with contaminants and decoys added by philosopher's database tool.
>>> Are there any other specifics you'd like? I can upload my full
>>> philosopher.yml file if that would be helpful.
>>>
>>> On Tuesday, October 20, 2020 at 1:30:44 AM UTC-4 David Shteynberg wrote:
>>>
>>>> Hi Emily,
>>>>
>>>> I got the data and now I am trying to understand how you are running
>>>> the analysis. Can you please describe those steps?
>>>>
>>>> Thank you,
>>>> -David
>>>>
>>>> On Sat, Oct 17, 2020 at 12:54 PM Emily Kawaler <e.ka...@gmail.com> wrote:
>>>>
>>>>> I've uploaded the pepXML files, the parameters I used, and the
>>>>> database here.
>>>>> <https://drive.google.com/drive/folders/1gJoi9fqsmIYg_0tl_2Ur-n04MJyuotyc?usp=sharing>
>>>>> Please let me know if I should be uploading anything else! Thank you!
>>>>>
>>>>> On Saturday, October 17, 2020 at 12:04:21 AM UTC-4 Emily Kawaler wrote:
>>>>>
>>>>>> Thank you! I'm working on getting it transferred to Drive, so it
>>>>>> might take a little while, but I'll be in touch!
>>>>>>
>>>>>> On Tuesday, October 13, 2020 at 3:08:44 PM UTC-4 David Shteynberg wrote:
>>>>>>
>>>>>>> Hello Emily,
>>>>>>>
>>>>>>> If you are able to share the dataset including the pepXML file and
>>>>>>> the database I can try to replicate the issue here and try to
>>>>>>> troubleshoot the sticking point.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -David
>>>>>>>
>>>>>>> On Tue, Oct 13, 2020 at 11:15 AM Emily Kawaler <e.ka...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello, and thank you for your response! It doesn't look like the
>>>>>>>> process is using too much memory (I've allocated 300 GB and it's
>>>>>>>> maxing out around 10), and I've kicked up the minprob parameter -
>>>>>>>> it's still getting stuck, unfortunately.
>>>>>>>> Emily
>>>>>>>>
>>>>>>>> On Friday, October 9, 2020 at 2:24:37 PM UTC-4 Luis wrote:
>>>>>>>>
>>>>>>>>> Hello Emily,
>>>>>>>>>
>>>>>>>>> This is not a problem that we have seen much of. Do you know
>>>>>>>>> which version of ProteinProphet / TPP you are using?
>>>>>>>>>
>>>>>>>>> One potential issue is the large number of proteins (and peptides)
>>>>>>>>> that it is trying to process -- can you either monitor the memory
>>>>>>>>> usage of the machine when you run this dataset, and/or try on one
>>>>>>>>> with more memory?
>>>>>>>>>
>>>>>>>>> Hope this helps,
>>>>>>>>> --Luis
>>>>>>>>>
>>>>>>>>> On Tue, Oct 6, 2020 at 6:32 PM Emily Kawaler <e.ka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello! I've been running ProteinProphet as part of the
>>>>>>>>>> Philosopher pipeline for a while now with no problems. However,
>>>>>>>>>> one of my datasets seems to be getting stuck in the middle of
>>>>>>>>>> this function. It doesn't throw an error or anything - just
>>>>>>>>>> stops advancing (the last line of the output is
>>>>>>>>>> "Computing degenerate peptides for 69919 proteins: 0%...10%...20%...30%...40%...50%").
>>>>>>>>>> Has anyone run into this problem before?
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>> Google Groups "spctools-discuss" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>>> send an email to spctools-discu...@googlegroups.com.
>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>> https://groups.google.com/d/msgid/spctools-discuss/be33a8fb-a6ec-41b6-a988-981161f194fcn%40googlegroups.com
>>>>>>>>>> <https://groups.google.com/d/msgid/spctools-discuss/be33a8fb-a6ec-41b6-a988-981161f194fcn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>> .
--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spctools-discuss+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spctools-discuss/CAGJJY%3D94DO2OQtaq_yue0BVTjjRyxhf4TOM8S55ZOaAVdo%3Dh0A%40mail.gmail.com.
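
For anyone who lands on this thread with the same symptom: the "strange characters (possibly binary)" David described can be checked for directly, without relying on matching line numbers. The following is only a rough sketch, not part of the TPP or Philosopher. The function name `find_illegal_xml_bytes` and the `max_hits` parameter are made up for this example, and it assumes the only thing being checked is control bytes that XML 1.0 forbids (everything below 0x20 except tab, line feed, and carriage return). It does not validate UTF-8 or the pepXML schema, so a clean result does not prove the file is fine.

```python
# XML 1.0 forbids control bytes below 0x20, except tab (0x09),
# line feed (0x0A) and carriage return (0x0D).
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}


def find_illegal_xml_bytes(path, max_hits=20):
    """Return up to max_hits tuples of (line_number, column, byte_value)
    for control bytes that XML 1.0 does not allow.

    Line numbers and columns are 1-based; columns count bytes, not
    characters. The file is read in binary mode, one line at a time,
    so large pepXML files are not loaded into memory at once.
    """
    hits = []
    with open(path, "rb") as handle:
        for line_number, line in enumerate(handle, start=1):
            # Iterating over a bytes object yields integers (0-255).
            for column, byte in enumerate(line, start=1):
                if byte < 0x20 and byte not in ALLOWED_CONTROL:
                    hits.append((line_number, column, byte))
                    if len(hits) >= max_hits:
                        return hits
    return hits
```

Calling `find_illegal_xml_bytes("03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML")` on both the original and the re-extracted copy would show whether the stray bytes are present in only one of them, which is what a decompression problem would look like.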