While those tests are still running, I pulled out all 185 proteins that appear in the 10OV pepXMLs but not in 01-09OV, figuring that one of them might be causing the error. I've uploaded them to the same folder as everything else (the file is called 10OV_uniq.fasta); nothing jumps out at me immediately. (No character is unique to either the headers or the sequences in 10OV, so I don't think a single stray character is messing things up.)
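For anyone who wants to repeat the character check, here is a minimal Python sketch of the idea: collect the set of characters used in the headers and in the sequences of each FASTA, then take the difference. The function names and the two toy records are made up for illustration; none of this is part of the TPP or Philosopher, and the real files would be read from disk rather than from inline lists.

```python
from typing import Iterable, Set, Tuple


def fasta_charsets(lines: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (header_chars, sequence_chars) for FASTA-formatted lines."""
    header_chars: Set[str] = set()
    seq_chars: Set[str] = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header_chars.update(line[1:])  # drop the leading '>'
        else:
            seq_chars.update(line)
    return header_chars, seq_chars


def unique_chars(
    suspect: Iterable[str], reference: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Characters seen in `suspect` headers/sequences but never in `reference`."""
    s_head, s_seq = fasta_charsets(suspect)
    r_head, r_seq = fasta_charsets(reference)
    return s_head - r_head, s_seq - r_seq


# Toy records (hypothetical); with real data, pass open file handles instead.
suspect = [">sp|P1|TEST_HUMAN demo", "MKUV"]
reference = [">sp|P2|REF_HUMAN demo", "MKVA"]
extra_header_chars, extra_seq_chars = unique_chars(suspect, reference)
print(sorted(extra_header_chars), sorted(extra_seq_chars))
```

Two empty sets would mean that no single character is specific to the suspect database, which is the result I got for 10OV versus 01-09OV.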
On Thursday, October 22, 2020 at 3:49:18 PM UTC-4 David Shteynberg wrote:

> I just re-extracted that file and I don't see the issue anymore. Perhaps
> this was a decompression issue.
>
> Thanks for checking.
>
> -David
>
> On Thu, Oct 22, 2020 at 12:19 PM Emily Kawaler <e.ka...@gmail.com> wrote:
>
>> Hello,
>> Thanks so much for taking a look! I think the selenocysteines ("U") are
>> likely not the problem, since I've got those in all of my databases,
>> including the ones that run correctly. I'm looking at
>> 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML and I don't see
>> anything odd in line 171821 ("</modification_info>"), so I think our line
>> numberings might not match up - what does your problematic line contain?
>>
>> When I try to run it on my end, it always sticks somewhere in the
>> 10CPTAC_OV files. Right now I'm running a working set of spectra with a
>> database that didn't work and vice versa, so hopefully that'll help me pin
>> down whether it's a problem with my spectra or my database - will let you
>> know how that turns out!
>>
>> Emily
>>
>> On Thursday, October 22, 2020 at 3:09:29 PM UTC-4 David Shteynberg wrote:
>>
>>> Hi Emily,
>>>
>>> I analyzed the search results that you sent and I am seeing some strange
>>> things in at least one of the files you gave me. This may be causing some
>>> of the problems you saw.
>>> In file 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML on line
>>> 171821 there are some strange characters (possibly binary) that are
>>> tripping up the TPP. I think these might be caused by a bug in an analysis
>>> tool upstream of the TPP. Not sure if there are other mistakes of this
>>> sort. Also I found some 'U' amino acids in the database which the TPP
>>> complains about having a mass of 0.
>>>
>>> I hope this helps you somewhat. Let me know what you find on your end.
>>>
>>> Cheers,
>>> -David
>>>
>>> On Tue, Oct 20, 2020 at 1:42 PM Emily Kawaler <e.ka...@gmail.com> wrote:
>>>
>>>> Sure! The spectra are from the CPTAC2 ovarian prospective dataset,
>>>> though I removed all scans that matched to a standard reference database
>>>> (I don't think the scan removal is the issue, since I'm also having this
>>>> problem on a different dataset without removing any scans; I also checked
>>>> with xmllint and it looks like the mzML pepXML files are valid). I've been
>>>> running it with the philosopher pipeline, so the pepXML files were
>>>> generated with MSFragger as part of that pipeline. The database is a
>>>> customized variant database with contaminants and decoys added by
>>>> philosopher's database tool. Are there any other specifics you'd like? I
>>>> can upload my full philosopher.yml file if that would be helpful.
>>>>
>>>> On Tuesday, October 20, 2020 at 1:30:44 AM UTC-4 David Shteynberg wrote:
>>>>
>>>>> Hi Emily,
>>>>>
>>>>> I got the data and now I am trying to understand how you are running
>>>>> the analysis. Can you please describe those steps?
>>>>>
>>>>> Thank you,
>>>>> -David
>>>>>
>>>>> On Sat, Oct 17, 2020 at 12:54 PM Emily Kawaler <e.ka...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I've uploaded the pepXML files, the parameters I used, and the
>>>>>> database here.
>>>>>> <https://drive.google.com/drive/folders/1gJoi9fqsmIYg_0tl_2Ur-n04MJyuotyc?usp=sharing>
>>>>>> Please let me know if I should be uploading anything else! Thank you!
>>>>>>
>>>>>> On Saturday, October 17, 2020 at 12:04:21 AM UTC-4 Emily Kawaler
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you! I'm working on getting it transferred to Drive, so it
>>>>>>> might take a little while, but I'll be in touch!
>>>>>>>
>>>>>>> On Tuesday, October 13, 2020 at 3:08:44 PM UTC-4 David Shteynberg
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Emily,
>>>>>>>>
>>>>>>>> If you are able to share the dataset including the pepXML file and
>>>>>>>> the database I can try to replicate the issue here and try to
>>>>>>>> troubleshoot the sticking point.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -David
>>>>>>>>
>>>>>>>> On Tue, Oct 13, 2020 at 11:15 AM Emily Kawaler <e.ka...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello, and thank you for your response! It doesn't look like the
>>>>>>>>> process is using too much memory (I've allocated 300 GB and it's
>>>>>>>>> maxing out around 10), and I've kicked up the minprob parameter -
>>>>>>>>> it's still getting stuck, unfortunately.
>>>>>>>>> Emily
>>>>>>>>>
>>>>>>>>> On Friday, October 9, 2020 at 2:24:37 PM UTC-4 Luis wrote:
>>>>>>>>>
>>>>>>>>>> Hello Emily,
>>>>>>>>>>
>>>>>>>>>> This is not a problem that we have seen much of. Do you know
>>>>>>>>>> which version of ProteinProphet / TPP you are using?
>>>>>>>>>>
>>>>>>>>>> One potential issue is the large number of proteins (and
>>>>>>>>>> peptides) that it is trying to process -- can you either monitor
>>>>>>>>>> the memory usage of the machine when you run this dataset, and/or
>>>>>>>>>> try on one with more memory?
>>>>>>>>>>
>>>>>>>>>> Hope this helps,
>>>>>>>>>> --Luis
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 6, 2020 at 6:32 PM Emily Kawaler <e.ka...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello! I've been running ProteinProphet as part of the
>>>>>>>>>>> Philosopher pipeline for a while now with no problems. However,
>>>>>>>>>>> one of my datasets seems to be getting stuck in the middle of
>>>>>>>>>>> this function. It doesn't throw an error or anything - just stops
>>>>>>>>>>> advancing (the last line of the output is "Computing degenerate
>>>>>>>>>>> peptides for 69919 proteins:
>>>>>>>>>>> 0%...10%...20%...30%...40%...50%"). Has anyone run into this
>>>>>>>>>>> problem before?
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "spctools-discuss" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to spctools-discu...@googlegroups.com.
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/spctools-discuss/be33a8fb-a6ec-41b6-a988-981161f194fcn%40googlegroups.com