On Thu, Jun 1, 2017 at 9:39 PM, Elizabeth Ploetz <plo...@ksu.edu> wrote:
> However, if most runs are group scheme, a quick check could show
> whether jumps are present in runs that i) do PP-PME tuning, ii) if
> logs got truncated during continuation, at least whether they use
> separate PME ranks (because otherwise CPU-only runs don't tune).
>
> i) If grepping "timed" from the LOG file does not give any output,
> does that mean there was no PP-PME tuning? (Sorry for the stupid
> question. I'm not sure which piece of information from the LOG file
> is going to answer whether or not there was PP-PME tuning.)

Do you run with -append? If so, the log file gets truncated too, but I
do not recall exactly where, or whether the PP-PME balancing messages
are removed. It's not hard to try, though: just run with separate PME
ranks and too few of them (e.g. 1 out of 12), and that will trigger
load balancing.

On second thought, instead of testing with Verlet, you might want to
just do the above and try to directly observe the anomalies after the
balancer.

> If so, perhaps there is a correlation between having PP-PME tuning
> and having a jump. Please see this link
> <http://i1243.photobucket.com/albums/gg545/ploetz/volumeJumps_zps8hmlghtn.png>.
> *If* the volume for 40-60ns of row 3 is the correct system volume,
> then all the data in this figure is consistent with there being a
> jump when there is PP-PME tuning. (Please note that while the data at
> 1 bar looks okay in this case, and the data at elevated pressures
> does not, this is not always true. We get jumps at 1 bar as well
> sometimes.)
>
> ii) These are all CPU-only runs. The simulations always use separate
> PME ranks.
>
> Please let me know if any particular data from the LOG file would be
> helpful.

It would be easier if you provided logs that we can look through.

> If I understood correctly, it's only group scheme runs where this has
> been observed, so it could be some newer feature/change that
> interacts badly with the group scheme.
>
> You are correct, so far we have not seen any jumps with Verlet.
>
> BTW, do you have any data with 4.5?
>
> I have a few old simulations with version 4.5.3 (none with 4.5,
> sorry). They were all run with inexact continuations (i.e., I did not
> provide checkpoint files when running multiple short runs to create
> one long simulation) or as single trajectories that I had killed at
> various points and then continued using checkpoint files and -append.
> I don't have a huge data set with 4.5.3, but none of them exhibited
> jumps!
>
> I'd suggest that (especially if investigation of current data does
> not reveal the reasons) you pick a setup where you seemed to get the
> anomaly and run it with the same settings using the Verlet scheme, as
> lots of short runs with restarts in a loop.
>
> Thanks, we are doing this test.
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-requ...@gromacs.org.
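
For checking many log files at once, the "grep for timed" test discussed above could be scripted roughly as follows. This is only a minimal sketch: the function names are made up for illustration, and the only thing it relies on is the substring "timed" mentioned in the thread. Keep in mind that an empty result may not prove that no PP-PME tuning took place, since the log can get truncated when continuing with -append.

```python
import sys
from typing import Iterable, List


def find_tuning_lines(log_lines: Iterable[str], marker: str = "timed") -> List[str]:
    """Return the log lines that contain `marker`.

    The default marker "timed" is the string suggested in the thread;
    the helper names here are hypothetical, not part of GROMACS.
    The match is case-sensitive, like a plain grep.
    """
    return [line.rstrip("\n") for line in log_lines if marker in line]


def has_pp_pme_tuning(log_lines: Iterable[str], marker: str = "timed") -> bool:
    """True if at least one line contains the marker."""
    return len(find_tuning_lines(log_lines, marker)) > 0


if __name__ == "__main__":
    # Usage: python check_tuning.py run1.log run2.log ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            hits = find_tuning_lines(handle)
        print(f"{path}: {len(hits)} line(s) containing 'timed'")
```

Running it over the logs of runs with and without volume jumps would give a quick table to compare against the figure linked above.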