Re: [spctools-discuss] ProteinProphet sticking in findDegenGroups3
Not a problem! In the end I decided just to remove all of the titins from my database - it shouldn't have a huge effect on my results - and I was indeed able to run all of my datasets to completion. Thanks for all of your help!

Emily

On Friday, November 6, 2020 at 8:25:56 PM UTC-5 David Shteynberg wrote:

> Hello again Emily,
>
> Apologies for the delay but I needed a bit more time to look into this. You are absolutely right about the titins causing this issue. The problem is the significant overlap in peptides in this very large titin group. Your database contains 343 variations of titin with different SAAPs, which share large subsets of the same peptides. Calculating this enormous protein group is certainly stressing the ProteinProphet algorithm, forcing it into a higher-order polynomial time complexity problem. I was looking into the code to see if there was a simple way to speed it up, but unfortunately this doesn't seem to be the case. Is there any way you can reduce the number of titin entries in your database? Have you considered using PEFF?
>
> Thanks,
> -David
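For anyone who wants to try the same workaround, removing a family of entries from a FASTA database can be sketched as below. This is a minimal Python sketch and not part of the TPP: the function names `read_fasta` and `drop_entries` are made up for illustration, and it assumes every titin entry (including its reverse decoy) carries the word "titin" in its header line, which may not hold for your database. Sequences are written back unwrapped, one line per entry.

```python
from typing import Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_entries(text: str, keyword: str = "titin") -> str:
    """Return FASTA text without entries whose header contains `keyword`.

    The match is case-insensitive. Assumption: the keyword appears in the
    header of every entry that should go, decoys included.
    """
    kept = [
        f">{header}\n{seq}\n"
        for header, seq in read_fasta(text)
        if keyword.lower() not in header.lower()
    ]
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical three-entry database: two titin entries and one other.
    example = (
        ">sp|P1|Titin variant 1\nMKTAYR\n"
        ">sp|P2|Serum albumin\nMKWVTF\n"
        ">rev_sp|P1|Titin variant 1\nRYATKM\n"
    )
    print(drop_entries(example), end="")
```

It is worth checking afterwards how many entries were removed, since a header keyword can also match unrelated proteins.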
Re: [spctools-discuss] ProteinProphet sticking in findDegenGroups3
Hello again Emily,

Apologies for the delay but I needed a bit more time to look into this. You are absolutely right about the titins causing this issue. The problem is the significant overlap in peptides in this very large titin group. Your database contains 343 variations of titin with different SAAPs, which share large subsets of the same peptides. Calculating this enormous protein group is certainly stressing the ProteinProphet algorithm, forcing it into a higher-order polynomial time complexity problem. I was looking into the code to see if there was a simple way to speed it up, but unfortunately this doesn't seem to be the case. Is there any way you can reduce the number of titin entries in your database? Have you considered using PEFF?

Thanks,
-David

On Sat, Oct 24, 2020 at 10:48 PM Emily Kawaler wrote:

> Another update: I've pinpointed a much smaller database that reproduces the error when run with just 10OV - uploaded to the same folder as above, named "titins_revs.fasta" (it contains a bunch of titins and some reverse decoy sequences). Something in the titins is causing this error, I think (I've run this set of titins with a few different sets of reverse decoys so I don't think it's caused by the decoys). I also think there are a couple of other sequences in the database that may have the same effect, but if we can figure out what's doing it in this set, it should be easier to know what to look for. Any thoughts?
>
> On Friday, October 23, 2020 at 3:45:08 PM UTC-4 Emily Kawaler wrote:
>
>> Okay - When I ran the working set of spectra with the database that failed, it seems to have failed; when I ran the set of spectra that failed with a database that worked, it ran to completion. I think we can probably narrow the problem down to something in the database.
>>
>> On Friday, October 23, 2020 at 1:56:18 AM UTC-4 Emily Kawaler wrote:
>>
>>> While those tests are still running, I pulled out all 185 of the proteins that are in the 10OV pepXMLs but not in 01-09OV, figuring that maybe one of those is causing the error. I've uploaded that to the same folder everything else is in (it's called 10OV_uniq.fasta) - I don't see anything that jumps out immediately. (There are no individual characters unique to either the headers or the sequences in 10OV, so I don't think there's an individual character messing things up.)
>>>
>>> On Thursday, October 22, 2020 at 3:49:18 PM UTC-4 David Shteynberg wrote:
>>>
>>>> I just re-extracted that file and I don't see the issue anymore. Perhaps this was a decompression issue.
>>>>
>>>> Thanks for checking.
>>>>
>>>> -David
>>>>
>>>> On Thu, Oct 22, 2020 at 12:19 PM Emily Kawaler wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Thanks so much for taking a look! I think the selenocysteines ("U") are likely not the problem, since I've got those in all of my databases, including the ones that run correctly. I'm looking at 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML and I don't see anything odd in line 171821 (""), so I think our line numberings might not match up - what does your problematic line contain?
>>>>>
>>>>> When I try to run it on my end, it always sticks somewhere in the 10CPTAC_OV files. Right now I'm running a working set of spectra with a database that didn't work and vice versa, so hopefully that'll help me pin down whether it's a problem with my spectra or my database - will let you know how that turns out!
>>>>>
>>>>> Emily
>>>>>
>>>>> On Thursday, October 22, 2020 at 3:09:29 PM UTC-4 David Shteynberg wrote:
>>>>>
>>>>>> Hi Emily,
>>>>>>
>>>>>> I analyzed the search results that you sent and I am seeing some strange things in at least one of the files you gave me. This may be causing some of the problems you saw. In file 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML on line 171821 there are some strange characters (possibly binary) that are tripping up the TPP. I think these might be caused by a bug in an analysis tool upstream of the TPP. Not sure if there are other mistakes of this sort. Also I found some 'U' amino acids in the database which the TPP complains about having a mass of 0.
>>>>>>
>>>>>> I hope this helps you somewhat. Let me know what you find on your end.
>>>>>>
>>>>>> Cheers,
>>>>>> -David
>>>>>>
>>>>>> On Tue, Oct 20, 2020 at 1:42 PM Emily Kawaler wrote:
>>>>>>
>>>>>>> Sure! The spectra are from the CPTAC2 ovarian prospective dataset, though I removed all scans that matched to a standard reference database (I don't think the scan removal is the issue, since I'm also having this problem on a different dataset without removing any scans; I also checked with xmllint and it looks like the mzML pepXML files are valid). I've been running it with the philosopher pipelin
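To get a feel for how much peptide sharing a set of near-identical entries produces, one can digest the sequences in silico and count the peptides that map to more than one entry. The sketch below is only an illustration in Python and is not how ProteinProphet computes its groups; the function names, the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and the minimum peptide length are all assumptions. It needs Python 3.7 or later because it splits on a zero-width match.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tryptic_peptides(sequence: str, min_len: int = 7) -> Set[str]:
    """Fully tryptic peptides with no missed cleavages.

    Simplified rule: cut after K or R, except when the next residue is P.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_len}


def peptide_owners(proteins: Dict[str, str], min_len: int = 7) -> Dict[str, Set[str]]:
    """Map each peptide to the set of protein names that contain it."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in proteins.items():
        for pep in tryptic_peptides(seq, min_len):
            owners[pep].add(name)
    return owners


def shared_fraction(proteins: Dict[str, str], min_len: int = 7) -> float:
    """Fraction of distinct peptides that occur in more than one protein."""
    owners = peptide_owners(proteins, min_len)
    if not owners:
        return 0.0
    shared = sum(1 for names in owners.values() if len(names) > 1)
    return shared / len(owners)


if __name__ == "__main__":
    # Two made-up variants that differ by a single residue in the last peptide.
    variants = {
        "variant_1": "AAAKCCCKDDDK",
        "variant_2": "AAAKCCCKDEDK",
    }
    print(shared_fraction(variants, min_len=3))
```

With hundreds of variants of a very long protein, most peptides are shared by most entries, which is the kind of overlap described above.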
Re: [spctools-discuss] Linux Build Error
Hi Nathan,

In case you want to build something that closely resembles 5.2.0, you can follow the steps on this page:

http://tools.proteomecenter.org/wiki/index.php?title=TPP_5.2.0:_Installing_on_Ubuntu_18.04_LTS

making sure to use revision 7909. Alternatively, you can use an older compiler to build 5.2.0:

"*The release version of TPP 5.2.0 was discovered to have a minor bug that prevented a clean compile under the version of GCC in Ubuntu 18.04, so in this example below we actually pull from revision 7909, which is just after the 5.2.0 release. Under older versions of GCC, the stock 5.2.0 will compile fine.*"

HTH,
--Luis

On Fri, Nov 6, 2020 at 1:29 PM 'David Shteynberg' via spctools-discuss <spctools-discuss@googlegroups.com> wrote:

> Hi Nathan,
>
> These are some bugs in the TPP externals that have been fixed in the more recent versions of the code. You can either pull the offending files from a more recent version of the code, use an up-to-date trunk version of the code (and report any bugs you find ;), or wait until we make an official release.
>
> Cheers,
> -David
Re: [spctools-discuss] Linux Build Error
Hi Nathan,

These are some bugs in the TPP externals that have been fixed in the more recent versions of the code. You can either pull the offending files from a more recent version of the code, use an up-to-date trunk version of the code (and report any bugs you find ;), or wait until we make an official release.

Cheers,
-David

On Fri, Nov 6, 2020 at 12:59 PM Nathan Wamsley wrote:

> I have extracted version 5.2.0 on my computer and am running Ubuntu 20.04.1. I have also followed the instructions on the BUILD_LINUX file that comes with the distribution. I believe I have installed the dependencies. When I navigate into the "release_5-2-0" directory and run "make all," I see the following:
>
> MSToolkit/include/MSReader.h:96:80: error: invalid conversion from ‘char’ to ‘char*’ [-fpermissive]
>    96 |   void writeFile(const char* c, MSFileFormat ff, MSObject& m, char* sha1Report='\0');
>
> Does anyone know what could be happening here?
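The error quoted above comes from a `char` literal (`'\0'`) being used as the default value of a `char*` parameter, which older compilers accepted as a null pointer and the GCC shipped with recent Ubuntu releases rejects. If pulling the fixed files from a newer revision is not convenient, a local one-line patch is one possible workaround. The Python sketch below is an assumption and not the actual upstream fix: `patch_default_argument` is a hypothetical helper that rewrites that default to `NULL`, and it assumes the header already includes something that defines `NULL`.

```python
import re

# Matches the default argument reported by the compiler: the char literal
# '\0' assigned to the char* parameter named sha1Report.
BAD_DEFAULT = re.compile(r"(char\s*\*\s*sha1Report\s*=\s*)'\\0'")


def patch_default_argument(source: str) -> str:
    """Replace the char-literal default with NULL, leaving all else untouched."""
    return BAD_DEFAULT.sub(r"\g<1>NULL", source)


if __name__ == "__main__":
    # The declaration as it appears in the compiler output.
    line = (
        "  void writeFile(const char* c, MSFileFormat ff, MSObject& m, "
        "char* sha1Report='\\0');"
    )
    print(patch_default_argument(line))
```

To apply it, one would read `MSToolkit/include/MSReader.h`, pass its contents through the function, and write the result back, keeping a copy of the original. Other externals may have similar issues, so this alone may not be enough for a clean build.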
[spctools-discuss] Linux Build Error
I have extracted version 5.2.0 on my computer and am running Ubuntu 20.04.1. I have also followed the instructions on the BUILD_LINUX file that comes with the distribution. I believe I have installed the dependencies. When I navigate into the "release_5-2-0" directory and run "make all," I see the following:

(base) nathan@NathanLaptop:~/release_5-2-0$ make all
cd /home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014; make
make[1]: Entering directory '/home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014'
g++ -O3 -Wall -Wextra -static -Wno-char-subscripts -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D__LINUX__ -IMSToolkit/include -ICometSearch Comet.cpp -c
In file included from CometSearch/Common.h:40,
                 from Comet.cpp:18:
MSToolkit/include/MSReader.h:96:80: error: invalid conversion from ‘char’ to ‘char*’ [-fpermissive]
   96 |   void writeFile(const char* c, MSFileFormat ff, MSObject& m, char* sha1Report='\0');
      |                                                                                ^~~~
      |                                                                                |
      |                                                                                char
Comet.cpp: In function ‘void LoadParameters(char*, CometInterfaces::ICometSearchManager*)’:
Comet.cpp:235:24: warning: passing argument 1 to restrict-qualified parameter aliases with argument 3 [-Wrestrict]
  235 |                sprintf(szVersion, "%s %s %s", szVersion, szRev1, szRev2);
      |                        ^~~~~~~~~              ~~~~~~~~~
Comet.cpp:1037:9: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
 1037 |    fgets(szParamBuf, SIZE_BUF, fp);
      |    ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
Comet.cpp:1077:12: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
 1077 |       fgets(szParamBuf, SIZE_BUF, fp);
      |       ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
make[1]: *** [Makefile:20: Comet.o] Error 1
make[1]: Leaving directory '/home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014'
make: *** [extern/Makefile:353: /home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014/comet] Error 2
(base) nathan@NathanLaptop:~/release_5-2-0$ sudo make all
cd /home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014; make
make[1]: Entering directory '/home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014'
g++ -O3 -Wall -Wextra -static -Wno-char-subscripts -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D__LINUX__ -IMSToolkit/include -ICometSearch Comet.cpp -c
In file included from CometSearch/Common.h:40,
                 from Comet.cpp:18:
MSToolkit/include/MSReader.h:96:80: error: invalid conversion from ‘char’ to ‘char*’ [-fpermissive]
   96 |   void writeFile(const char* c, MSFileFormat ff, MSObject& m, char* sha1Report='\0');
      |                                                                                ^~~~
      |                                                                                |
      |                                                                                char
Comet.cpp: In function ‘void LoadParameters(char*, CometInterfaces::ICometSearchManager*)’:
Comet.cpp:235:24: warning: passing argument 1 to restrict-qualified parameter aliases with argument 3 [-Wrestrict]
  235 |                sprintf(szVersion, "%s %s %s", szVersion, szRev1, szRev2);
      |                        ^~~~~~~~~              ~~~~~~~~~
Comet.cpp:1037:9: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
 1037 |    fgets(szParamBuf, SIZE_BUF, fp);
      |    ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
Comet.cpp:1077:12: warning: ignoring return value of ‘char* fgets(char*, int, FILE*)’, declared with attribute warn_unused_result [-Wunused-result]
 1077 |       fgets(szParamBuf, SIZE_BUF, fp);
      |       ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
make[1]: *** [Makefile:20: Comet.o] Error 1
make[1]: Leaving directory '/home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014'
make: *** [extern/Makefile:353: /home/nathan/release_5-2-0/build/gnu-x86_64/artifacts/comet_source_2018014/comet] Error 2

Does anyone know what could be happening here?

--
You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spctools-discuss+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spctools-discuss/6f1b57ab-0813-4b60-998d-dc19b221bb1en%40googlegroups.com.