Do you want to submit a PR? --Steve
On 9/13/2019 9:27 AM, Yosef Zlochower wrote:
> Thanks. The issue seems to be that with manual topology a region_t
> structure has its map entry incorrectly set.
>
> What happens is that in
>
> bool gh::recompose
>
> there is the check
>
> bool const do_recompose = level_did_change(rl);
>
> In level_did_change, the level is considered to have changed because
> the new region_t is
>
> region_t(extent=([41,0,0]:[80,10,10]:[1,1,1]/[41,0,0]:[80,10,10]/[40,11,11]/4840),outer_boundaries=[[0,1,1],[1,1,1]],map=51,processor=1)
>
> while the old is
>
> region_t(extent=([41,0,0]:[80,10,10]:[1,1,1]/[41,0,0]:[80,10,10]/[40,11,11]/4840),outer_boundaries=[[0,1,1],[1,1,1]],map=0,processor=1)
>
> The only difference is that the new map is 51.
>
> If I add a line in Carpet/src/Recompose.cc:SplitRegions_AsSpecified
> to force the map entry to be zero, then all seems to work.
>
> Without the change, Carpet recomposes the grid but never calls the
> postregrid functions. Hence the NaNs in grid::x.
>
> On 9/12/19 2:37 PM, Steven R. Brandt wrote:
>> I said on the call there was an easy way to trace what function call you
>> are in...
>>
>> Add this to your thornlist...
>>
>> !TARGET = $ARR
>> !TYPE = git
>> !URL = https://github.com/stevenrbrandt/ReadWriteDiagnostics.git
>> !REPO_PATH=$2
>> !CHECKOUT =
>> ReadWriteDiagnostics/FCall
>>
>> Then add FCall to your ActiveThorns and you'll see a message printed
>> before and after each scheduled function.
>>
>> --Steve
>>
>> On 9/10/2019 3:03 PM, Yosef Zlochower wrote:
>>> It seems that there may be multiple issues. The parfile I sent before
>>> tests for NaNs in grid::x. grid::x is not a checkpointed variable. It
>>> seems that with manual topology, grid::x is filled with NaNs during
>>> the recover step (the pointer is actually pointing to a new area of
>>> memory). With standard topology, the array pointer and contents do not
>>> change on recover. I have also seen NaNs in the recovered variables,
>>> but this parfile doesn't show that.
>>>
>>> On 9/9/19 4:24 PM, Yosef Zlochower wrote:
>>>> Hi,
>>>>
>>>> I have been trying to debug why some runs I was performing could not
>>>> recover from a checkpoint file, but would otherwise proceed as normal.
>>>>
>>>> I attached a minimalist parfile showing the problem. A small grid is
>>>> manually distributed over 8 processors and terminates at iteration 2.
>>>> An attempt to recover fails with NaNs in grid::x. If the manual
>>>> topology section is commented out, no problems are seen.
>>>>
>>>> _______________________________________________
>>>> Users mailing list
>>>> [email protected]
>>>> http://lists.einsteintoolkit.org/mailman/listinfo/users
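
For anyone reading the thread later, the comparison described in the 9/13 message can be illustrated with a small standalone sketch. This is not Carpet's code: the Region struct and the functions same_region, level_changed and with_map_zero are hypothetical, simplified stand-ins invented for illustration, and only the extent corners, map and processor values are taken from the thread. It shows that two regions identical in everything but the map index compare as different, so the level is reported as changed, and that forcing the map index to zero removes the difference.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for Carpet's region_t.
// Only the fields needed to illustrate the thread are kept.
struct Region {
  int lo[3];      // lower corner of the extent
  int hi[3];      // upper corner of the extent
  int map;        // map index (0 in the old region, 51 in the new one)
  int processor;  // owning process
};

// Two regions are the same only if every field matches, including map.
bool same_region(const Region& a, const Region& b) {
  for (int d = 0; d < 3; ++d) {
    if (a.lo[d] != b.lo[d] || a.hi[d] != b.hi[d]) return false;
  }
  return a.map == b.map && a.processor == b.processor;
}

// Assumed shape of a "did the level change?" check: any differing
// region, or a different number of regions, counts as a change.
bool level_changed(const std::vector<Region>& old_regions,
                   const std::vector<Region>& new_regions) {
  if (old_regions.size() != new_regions.size()) return true;
  for (std::size_t i = 0; i < old_regions.size(); ++i) {
    if (!same_region(old_regions[i], new_regions[i])) return true;
  }
  return false;
}

// Illustration of the workaround mentioned in the thread:
// return a copy of the region with its map index forced to zero.
Region with_map_zero(Region r) {
  r.map = 0;
  return r;
}
```

With the old region (map=0) and the new region (map=51) from the thread, level_changed reports a change; after passing the new region through with_map_zero it does not. Whether zero is the right value in general, rather than only for this single-map parfile, is not something the thread settles.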
