Hi Ed,

I didn't have time to try your tips, but they should help me out when I try to run the full_analysis.py script again...
I'll let you know if it works well or if I still get long computation times...

Cheers

Séb :)


Edward d'Auvergne wrote:
> Hi,
>
> On 9/17/07, Sebastien Morin <[EMAIL PROTECTED]> wrote:
>> Hi Ed,
>>
>> First, there were some bad assignments in my data set. I used the automatic
>> assignment procedure within NMRPipe (which takes an assigned peak list and
>> propagates it to other peak lists) for the first time, and some peaks were
>> badly assigned.
>
> Although a problem because of the bond vector orientation, the effect
> of this should not be long computation times, just incorrect internal
> motions.
>
>> Second, the PDB file is quite good as it is a representative conformation
>> from a 60 ns MD simulation using CHARMM. That said, the protein moves in the
>> simulation and, hence, the orientations also change. I could take another
>> conformation, which is what I'll do to cross-validate my models, but
>> nevertheless the orientations will change and subtle changes will appear.
>> This shouldn't be an issue since the vectors that move a lot in the
>> simulations should have correlating relaxation properties and that should be
>> seen in the models chosen.
>
> The orientation changes should only affect the Euler angle values of
> the diffusion tensor.  Nothing else should be affected by this.  The
> internal motions of the simulation will affect the results of the
> analysis, but the overall orientation really doesn't matter unless you
> are comparing these Euler angles.
>
>> Third, here are the stats for the ellipsoid optimization:
>>
>> round  t_total_(h)  t_opt_(h)  iter_opt  model_change  tm      a     b      g     chi2                comments
>> =====  ===========  =========  ========  ============  ======  ====  =====  ====  ==================  =======================
>> 1      146          144        207       ---           12.423  18.8  159.7  99.1  9282.2280010132217  ok
>> 2      49           47         62        215           12.463  74.7  152.0  94.3  8793.0777454789404  ok
>> 3      16           14         19        16            12.448  78.0  152.3  96.9  8767.5325004348124  ok
>> 4      12           10         13        1             12.445  80.2  151.9  97.9  8765.5659442063006  ok
>> 5      19           17         23        2             12.445  83.1  151.7  98.3  8761.0001889287214  ok
>> 6      25           23         27        1             12.452  80.9  151.4  96.2  8744.6870170285692  ok
>> 7      16           14         19        1             12.445  83.1  151.7  98.3  8761.0001889287269  almost_5
>> 8      25           23         28        1             12.452  80.9  151.4  96.2  8744.6870170285729  almost_6
>> 9      14           12         17        1             12.445  83.1  151.7  98.3  8761.0001889287269  almost_5_and_exactly_7
>> 10     29           27         33        1             12.452  80.9  151.4  96.2  8744.6870170285656  almost_6_and_8
>> 11     stopped...................................
>
> Are these stats from the results in the 'opt' directories?  Can you
> possibly pin-point where in the calculation the problem is?  One
> option is to increase the verbosity flag 'print_flag' in the
> minimise() user function.  This may help in seeing the problem.
>
>> As you can see, the optimization ends up alternating between two states.
>> In fact, from round 5 on, there is only one residue for which the model is
>> changing, and it's always the same one. It changes from model 5 to 6 and
>> 6 to 5... With model 6 it has a tf of ~17, a ts of ~25000 and an S2 of
>> ~0.73 (chi2 ~40 in the aic file, but there with ts ~1200); with model 5 it
>> has a ts of ~650 and an S2 of ~0.78 (chi2 ~50 in the aic file). How come
>> such a high ts (25000) isn't eliminated..?
>
> In mathematical modelling, model elimination or model validation must
> occur prior to the model selection step.  This is when ts is at ~1.2
> ns, and hence the model is not eliminated.
> The final optimisation is shifting ts up to 25 ns, and this is likely
> to be the thing causing the optimisation to take soooo long!  Is there
> something particular with this residue?
>
> The iteration numbers are low, but these may be the number of
> iterations of the method of multipliers algorithm.  For each iteration
> there could possibly be thousands of steps of the Newton subalgorithm.
> I can't remember how the iteration number is generated, but the
> print_flag option may show if this is the case.
>
>> round  AIC_or_OPT  model  S2    S2f   S2s   tf    ts     chi2
>> =====  ==========  =====  ====  ====  ====  ====  =====  ====
>> 9      AIC         5      0.78  0.96  0.81  None  698    52
>> 10     AIC         6      0.78  0.97  0.80  11.2  1173   39
>> 9      OPT         5      0.78  0.96  0.81  None  630    ---
>> 10     OPT         6      0.73  0.93  0.79  16.8  24904  ---
>>
>> Fourth, the previous runs were made on 4 different computers which give
>> almost exactly the same calculation time, maybe differing by 10-15 %...
>> This shouldn't be what's causing those extremely long times...
>
> This is unlikely to be the problem, but I was just wondering in case
> there was an operating system or platform specific bug possibly in the
> Numeric code.
>
>> Fifth, I used the default algorithm within the full_analysis.py script.
>> How can I change the optimization algorithm so it's a two stage procedure
>> like you proposed? Should I run several times with MIN_ALGOR = 'simplex'
>> and, after a few runs (maybe when the chi2 and number of iterations get to a
>> plateau), switch to MIN_ALGOR = 'newton'?
>
> Simply have two lines, one after the other, in the code where the
> minimise() user function is located.  I.e. in the file
> 'full_analysis.py' of the current 1.2 repository line:
>
>         # Minimise all parameters.
>         minimise('simplex', run=name)
>         minimise(MIN_ALGOR, run=name)
>
>         # Write the results.
>         ...
>
> That should be enough to solve the problem (hopefully).
>
> Cheers,
>
> Edward
>
>> I think that's almost everything I can find now...
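
The ordering Edward describes above (elimination first, then model selection, then the final optimisation) can be illustrated with a small sketch. This is plain Python, not relax code. The numbers come from the round 9 and 10 table; the 1.5 x tm cut-off for internal correlation times, the parameter counts (3 for model 5, 4 for model 6), the use of AIC = chi2 + 2k and all function names are assumptions made only for illustration.

```python
# Sketch only: elimination runs before AIC selection, so it sees ts ~1.2 ns,
# not the ~25 ns that the final optimisation later drifts to.
# Assumptions: times in ps, tm of ~12.45 ns from the ellipsoid rounds,
# and a hypothetical cut-off of 1.5 * tm for tf and ts.

TM_PS = 12.45 * 1000.0

# Values as they appear in the AIC rows of the table (rounds 9 and 10).
candidates = [
    {"model": "m5", "k": 3, "chi2": 52.0, "tf": None, "ts": 698.0},
    {"model": "m6", "k": 4, "chi2": 39.0, "tf": 11.2, "ts": 1173.0},
]


def is_eliminated(candidate, tm_ps, factor=1.5):
    """True if any internal correlation time exceeds factor * tm."""
    limit = factor * tm_ps
    times = (candidate["tf"], candidate["ts"])
    return any(t is not None and t > limit for t in times)


def aic(candidate):
    """Akaike criterion for a chi-squared fit with k parameters."""
    return candidate["chi2"] + 2 * candidate["k"]


def select(cands, tm_ps):
    """Drop eliminated models, then keep the one with the lowest AIC."""
    kept = [c for c in cands if not is_eliminated(c, tm_ps)]
    return min(kept, key=aic)


best = select(candidates, TM_PS)

# The same model after the final optimisation (OPT row of round 10).
post_opt = dict(best, tf=16.8, ts=24904.0)

print(best["model"], aic(best))
print("would fail the cut-off after optimisation:", is_eliminated(post_opt, TM_PS))
```

Under these assumptions model 6 passes the check at selection time and only crosses the cut-off afterwards, which is consistent with it never being eliminated.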
>>
>> Let me know if you know how to catch those problems before they appear...
>>
>> Cheers
>>
>> Séb :)
>>
>>
>> Edward d'Auvergne wrote:
>> Hi,
>>
>> I've been trying to think of what could possibly be causing these
>> really long times, but I'm really not sure what is happening.
>> Unfortunately there just was not enough information in the post to
>> decipher the key to this problem.  Is there something special about
>> those 7 residues?  How accurate do you think their orientations are in
>> the PDB file you are using?  And how accurate is the PDB file itself
>> with respect to all parts of the system?
>>
>> Have you had a chance to investigate further as to what the issue
>> might be?  For example, which part of the calculation is taking the
>> time?  Is it the global optimisation of all parameters?  Are the final
>> results of each round similar or completely different (selected model
>> wise and parameter value wise)?  How do the iteration numbers compare
>> at each stage?  Essentially a fine analysis and comparison of the
>> results files and the printout from relax will be necessary to track
>> down this abnormal computation time.  Oh, are you running these on the
>> same computer as the previous analysis?
>>
>> As for the optimisation algorithm being stuck, if you've used the
>> default algorithm then this shouldn't happen.  Optimisation should
>> terminate.  There are certain very rare situations where the algorithm
>> known as the GMW Hessian modification, which is used by default as a
>> subalgorithm by the Newton algorithm in relax, can take large amounts
>> of time to complete.  You'll see this as an increase in the number of
>> iterations by 4 to 5 orders of magnitude.  One way to test this is to
>> use a lower quality optimisation algorithm first and then complete to
>> high precision with the Newton algorithm.  In this case I would use
>> simplex first followed by the default Newton algorithm and its default
>> subalgorithms.
>> In all cases constraints should be used.  This will
>> only solve the long computation times if the GMW algorithm is at
>> fault.
>>
>> Regards,
>>
>> Edward
>>
>>
>> On 9/4/07, Sebastien Morin <[EMAIL PROTECTED]> wrote:
>>
>> Hi all,
>>
>> I am using the full_analysis.py script with data at three magnetic fields.
>>
>> After a first complete cycle (going through the final optimization), I
>> realized that a few residues had extremely high chi-squared values (>
>> 1000) no matter the diffusion model or model-free model chosen...
>>
>> So I removed those residues (7 out of 222) and started the full_analysis
>> protocol again.
>>
>> However, the optimization times are now extremely long and I should get
>> the final results in weeks...
>>
>> Here are the available times (for local_tm, sphere and ellipsoid):
>>
>> Diffusion_model  Round  Time-before_N=222  X2       Time-now_N=215  X2      Comment
>> ===============  =====  =================  =======  ==============  ======  =========================
>> local_tm         ---    12h30              45949    14h30           5802    OK, X2 much smaller
>>
>> sphere           init   ---                1154338  ---             249255
>>                  1      2h30               65654    36h00           10303   Long, but X2 much smaller
>>                  2      2h30               65654    > 30h00
>>
>> ellipsoid        init   ---                753535   ---             177764
>>                  1      4h00               64592    > 67h00         ??
>>                  2      2h30               64592    not_there_yet
>>
>> Is it possible that the algorithms get stuck somewhere during the
>> optimization..?
>>
>> I thought that removing badly fit residues would, on the contrary, speed
>> up calculations...
>>
>> Thanks for ideas!
>>
>> Sébastien :)

-- 
Sebastien Morin
PhD student in biochemistry
Laboratoire de resonance magnetique nucleaire
Dr Stephane Gagne
CREFSIP (Universite Laval, Quebec, CANADA)
1-418-656-2131 #4530
_______________________________________________
relax (http://nmr-relax.com)

This is the relax-users mailing list
relax-users@gna.org

To unsubscribe from this list, get a password
reminder, or change your subscription options,
visit the list information page at
https://mail.gna.org/listinfo/relax-users
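
On the question of catching this kind of back-and-forth early: one possible approach, sketched here in plain Python and not part of relax, is to compare the chi2 of each finished round with every earlier round, not only the previous one. The chi2 values are copied from the ellipsoid table in this thread; the helper name and the relative tolerance are hypothetical choices.

```python
import math

# chi2 at the end of each ellipsoid round, copied from the table in the thread.
chi2_by_round = {
    1: 9282.2280010132217,
    2: 8793.0777454789404,
    3: 8767.5325004348124,
    4: 8765.5659442063006,
    5: 8761.0001889287214,
    6: 8744.6870170285692,
    7: 8761.0001889287269,
    8: 8744.6870170285729,
    9: 8761.0001889287269,
    10: 8744.6870170285656,
}


def find_repeat(chi2_values, rel_tol=1e-12):
    """Return (round, matching_earlier_round) for the first repeat, or None.

    A match with the immediately preceding round means normal convergence;
    a match with an older round suggests the protocol is cycling.
    """
    seen = []
    for rnd in sorted(chi2_values):
        value = chi2_values[rnd]
        for prev_rnd, prev_value in seen:
            if math.isclose(value, prev_value, rel_tol=rel_tol):
                return rnd, prev_rnd
        seen.append((rnd, value))
    return None


repeat = find_repeat(chi2_by_round)
if repeat is None:
    print("no repeat yet, keep iterating")
else:
    rnd, earlier = repeat
    period = rnd - earlier
    if period == 1:
        print("round", rnd, "matches round", earlier, "(converged)")
    else:
        print("round", rnd, "matches round", earlier, "(cycle of period", period, ")")
```

With these numbers the check fires at round 7, which matches round 5, so the loop could have been stopped three rounds before round 10.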