Dear David,

to generate my library I went through the following steps:

1) I modified the file utils.py from Seq2MS to contain my modifications:

    #                C H N O S P
    # mods = {"ox" : [0,0,0,1,0,0], ## changed this line to enable sulfatase modification prediction se=serine; al=formylglycine-aldehyd; do=formylglycine-diol; ds=diol-sulfate; ss=serine-sulfate
    mods = {"ox" : [0,0,0,1,0,0],
            "se" : [0,0,1,0,-1,0],
            "al" : [0,-2,1,0,-1,0],
            "do" : [0,0,2,0,-1,0],
            "ds" : [0,-1,5,0,0,0],
            "ss" : [0,0,4,0,0,0],
            "ph" : [0,1,0,3,0,1],
            "cam" : [2,3,1,1,0,0],
            "ac": [2,2,0,1,0,0],
            "me": [1,2,0,0,0,0],
            "hy": [0,0,0,1,0,0],
            "gly": [4,6,2,2,0,0],
            "bi" : [10,14,2,2,1,0],
            "cr": [4,4,0,1,0,0],
            "di": [2,4,0,0,0,0],
            "ma": [3,2,0,3,0,0],
            "ni": [0,-1,1,2,0,0],
            "bu" : [4,6,0,1,0,0],
            "fo": [1,0,0,1,0,0],
            "glu": [5,6,0,3,0,0],
            "hyb": [4,6,0,2,0,0],
            "pr": [3,4,0,1,0,0],
            "su" : [4,4,0,3,0,0],
            "tr": [3,6,0,0,0,0],
            "ci": [0,-1,-1,1,0,0]}

2) I ran the prediction by executing 'predict.py' with a modified input.tsv and (I think) the pretrained_model. The input file looked like the following but contained more peptides (the Protein column is empty):

    Sequence                         Charge  Mass         Modified sequence                      Modification     Protein
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3332.613578  _DILTPELDNLAQNGSIFTSAYVAHPFCGPSR_      Unmodified
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3332.613576  _DILTPELDNLAQNGSIFTSAYVAHPFCGPSR_      Unmodified
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3332.613576  _DILTPELDNLAQNGSIFTSAYVAHPFCGPSR_      Unmodified
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3316.636422  _DILTPELDNLAQNGSIFTSAYVAHPFC(se)GPSR_  Serine (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3316.636422  _DILTPELDNLAQNGSIFTSAYVAHPFC(se)GPSR_  Serine (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3316.63642   _DILTPELDNLAQNGSIFTSAYVAHPFC(se)GPSR_  Serine (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3314.620772  _DILTPELDNLAQNGSIFTSAYVAHPFC(al)GPSR_  FGAldehyd (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3314.62077   _DILTPELDNLAQNGSIFTSAYVAHPFC(al)GPSR_  FGAldehyd (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3314.620772  _DILTPELDNLAQNGSIFTSAYVAHPFC(al)GPSR_  FGAldehyd (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3332.631336  _DILTPELDNLAQNGSIFTSAYVAHPFC(do)GPSR_  FGDiol (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3332.631336  _DILTPELDNLAQNGSIFTSAYVAHPFC(do)GPSR_  FGDiol (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3332.631336  _DILTPELDNLAQNGSIFTSAYVAHPFC(do)GPSR_  FGDiol (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3411.580326  _DILTPELDNLAQNGSIFTSAYVAHPFC(ds)GPSR_  DiolSulfat (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3411.580326  _DILTPELDNLAQNGSIFTSAYVAHPFC(ds)GPSR_  DiolSulfat (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3411.580324  _DILTPELDNLAQNGSIFTSAYVAHPFC(ds)GPSR_  DiolSulfat (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  2       3396.593236  _DILTPELDNLAQNGSIFTSAYVAHPFC(ss)GPSR_  SerinSulfat (C)
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  3       3396.593235  _DILTPELDNLAQNGSIFTSAYVAHPFC(ss)GPSR_  Unmodified
    DILTPELDNLAQNGSIFTSAYVAHPFCGPSR  4       3396.593236  _DILTPELDNLAQNGSIFTSAYVAHPFC(ss)GPSR_  Unmodified

3) I imported the generated .msp file into SpectraST format using the command line and the following options:

    C:\TPP\bin\spectrast.exe -cNPredicted_Sec2MS.splib -MSec2MS.usermods Predicted.msp

4) I tried to visualise this library with Lib2HTML from within Petunia.

For the files, I will send you a link via the ISB contact form to our university's Nextcloud, where I have uploaded them.

Best regards,
Juergen

Juergen Bartel wrote on Monday, 22 January 2024 at 18:22:49 UTC+1:

> Dear all,
>
> I have recently generated an in-silico spectral library for several
> peptides which may contain different modifications on the same cysteine
> (cysteine converted to a serine, formylglycine aldehyde, ...). For this
> prediction I used Seq2MS (
> https://pubs.acs.org/doi/10.1021/acs.jproteome.3c00180) and obtained a
> library in msp format.
> In this file the header indicates modifications, for example, in the
> following way:
>
> Name: DILTPELDNLAQNGSIFTSAYVAHPFCGPSR/2_1(26,C,FGAldehyd)
> Comment: Charge=2 Parent=1657.310386 Mods=1(26,C,FGAldehyd) Protein=nan
>
> If I correctly specify the modifications in a .usermods file (*) and import
> it into SpectraST, the resulting sptxt files contain headers such as:
>
> Name: DILTPELDNLAQNGSIFTSAYVAHPFC[85]GPSR/2
> LibID: 6
> MW: 3316.6353
> PrecursorMZ: 1658.3177
> Status: Normal
> FullName: X.DILTPELDNLAQNGSIFTSAYVAHPFC[85]GPSR.X/2 (CID)
> Comment: AvePrecursorMz=1659.3531 BinaryFileOffset=19516 Charge=2
> FracUnassigned=0.89,4/5;0.83,16/20;0.56,35/60 Mods=1(26,C,FGAldehyd) NAA=31
> NISTProtein=nan NMC=0 NTT=1 Parent=1657.310386 Prob=1.0000 Protein=1/nan
>
> However, when I check the spectra in this file in HTML format
> using Lib2HTML, the table contains the complete peptide only for the
> unmodified form and shows an identical truncated peptide for all other
> modified forms:
>
> [image: Unbenannt.PNG]
>
> This not only affects the truncated amino acids but also results in
> misalignment of the y-ion series (i.e. due to the modified cysteine, y1 is
> annotated as 122.027 while in this case it should be y5 with a much larger
> m/z). Again, the annotation in the sptxt file seems fine.
>
> Does anyone have an idea how to solve this and/or how I could view the
> spectra in another way?
>
> Best,
> Juergen
>
>
> (*) part of the usermods file:
> C[se]|-15.977156|Serin
> C[al]|-17.992806|FGAldehyd
> ...
>
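
The mass shifts in the usermods excerpt can be cross-checked against elemental compositions of the kind utils.py expects. The following is only a minimal sketch and not part of Seq2MS or SpectraST: the function name delta_mass is hypothetical, the column order C H N O S P is an assumption taken from the comment in utils.py, and the two example compositions are the textbook ones for Cys -> Ser (one O gained, one S lost) and Cys -> formylglycine aldehyde (additionally two H lost), not the vectors quoted above.

```python
# Monoisotopic masses of the most abundant isotopes (standard values, in Da).
MONO = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
    "P": 30.97376200,
}

# ASSUMPTION: column order as given in the utils.py comment "# C H N O S P".
ORDER = ("C", "H", "N", "O", "S", "P")


def delta_mass(composition):
    """Return the monoisotopic mass shift for a 6-element composition vector."""
    if len(composition) != len(ORDER):
        raise ValueError("composition must have one count per element in ORDER")
    return sum(count * MONO[element] for count, element in zip(composition, ORDER))


# Illustrative compositions (not copied from the mods dictionary):
cys_to_ser = [0, 0, 0, 1, -1, 0]    # +O, -S
cys_to_fgly = [0, -2, 0, 1, -1, 0]  # -2H, +O, -S

print(round(delta_mass(cys_to_ser), 6))
print(round(delta_mass(cys_to_fgly), 6))
```

With this column order, nitrogen is counted in the third position and oxygen in the fourth, so any composition vector can be compared against its usermods entry (-15.977156 for Serin, -17.992806 for FGAldehyd) in the same way.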