Xiangdong,
I've attached at patch
Xiangdong.patch
Description: Binary data
also available in the branch barry/support-dmcreatedomaindecompositionscatters-da-nonperiodic-maint if you are using git. It resolves the current issue by only creating scatters for physical ghost points.
BUT I do not know if this fix will get nasm to work properly for a real
problem with DMDAs. There may be other problems that come up.
Barry
> On Jun 29, 2016, at 10:49 AM, Xiangdong <[email protected]> wrote:
>
> To make things easier, I created a simple example here:
>
> https://drive.google.com/file/d/0BxEfb1tasJxhOHJWbVJ0ZGl0LTA/view?usp=sharing
>
> ./exe -snes_newtonls okay.
> ./exe -snes_nasm -da_grid_z 2 okay
> ./exe -snes_nasm failed. It got trapped in the while loop.
>
> Do you have any suggestions to avoid this trap for nz=1?
>
> Thanks.
>
> Xiangdong
>
> On Wed, Jun 29, 2016 at 11:23 AM, Xiangdong <[email protected]> wrote:
> The code got trapped in a while loop when I switch to s=2.
>
> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/impls/da/dadd.c.html
>
> In dadd.c line 85-96
> /* gone out of processor range on y axis */
> 86: while(jj > ne-1 || jj < ns) {
> 87: if (nr == n-1) {
> 88: ns = 0;
> 89: ne = ly[0];
> 90: nr = 0;
> 91: } else {
> 92: nr++;
> 93: ns = ne;
> 94: ne += ly[nr];
> 95: }
> 96: }
>
> For my case nx=5,ny=5,nz=1,s=2, np=2, snes_type nasm, the code is trapped in
> the loop for jj=-1, ns=0, nr=0, n=1. This negative jj comes from the line
> 340-349 of same file.
>
> /* global and subdomain ISes for the local indices of the subdomain */
> 341: /* todo - make this not loop over at nonperiodic boundaries, which
> will be more involved */
> 342: lower.i = subinfo.gxs;
> 343: lower.j = subinfo.gys;
> 344: lower.k = subinfo.gzs;
> 345: upper.i = subinfo.gxs+subinfo.gxm;
> 346: upper.j = subinfo.gys+subinfo.gym;
> 347: upper.k = subinfo.gzs+subinfo.gzm;
>
> 349: DMDACreatePatchIS(dm,&lower,&upper,&gdis);
>
> When trapped, subinfo.gxs=-2. Since there is a todo comment, I am not sure
> this is the reason for the while loop trap.
>
> Thanks.
>
> Xiangdong
>
>
> On Tue, Jun 28, 2016 at 5:16 PM, Barry Smith <[email protected]> wrote:
>
> If you make the stencil width as wide at the overlap does the problem go
> away and you get expected convergence?
>
> > On Jun 28, 2016, at 3:48 PM, Xiangdong <[email protected]> wrote:
> >
> > Let me illustrate my point and question by an example. I have a 1d
> > nonlinear problem with Nx=5 and stencil width 1. When I run it with np=2,
> > processor 1 own cells 1,2,3 (+ ghost 4) and processor 2 own cells 4,5 (+
> > ghost 3). I also have functions FormFunctionLocal and FormJacobianLocal to
> > compute the residual and Jacobian.
> >
> > i) snes_type newtonls. The Jacobian is in mpiaij format distributed on two
> > processors. processor 1 computes the first three rows of Jac (size 3x5) and
> > processor 1 computes the last two rows of Jac (size 2x5). The code works
> > fine.
> >
> > ii) snes_type nasm and da_overlap 0. Each processor has its own Jac
> > (seqaij). Processor 1 computes its 3x3 Jac and Processor 2 computes its 2x2
> > Jac. The code works fine.
> >
> > iii) snes_type nasm and da_overlap 1. Each processor needs to computes its
> > own Jac. Processor 1 computes its 4x4 Jac and Processor 2 computes its 3x3
> > Jac. However, for processor 1 to compute the 4-th row of its Jac, it needs
> > the properties information stored on cell 5, which is not available on
> > processor 1. The zero pivot error is from this wrong Jac for the row
> > corresponding to the overlap cell. The mat_fd_color does not work either
> > for the same reason that information from cell 5 is not available on
> > processor 1.
> >
> > Do you have any suggestion for the overlap case? Thank you.
> >
> > Best,
> > Xiangdong
> >
> > On Mon, Jun 27, 2016 at 4:21 PM, Barry Smith <[email protected]> wrote:
> >
> > > On Jun 27, 2016, at 3:15 PM, Xiangdong <[email protected]> wrote:
> > >
> > > Hello everyone,
> > >
> > > I am trying different number of da_overlap to see its effects on nasm and
> > > aspin preconditioner. The codes works fine with -da_overlap 0. However,
> > > when I change the option -da_overlap 1, it crashed with the error message
> > > like "zero pivot row 12544 value 0 tolerance 2.2e-14". The option
> > > -pc_factor_nonzeros_along_diagonal does not fix this zero pivot issue.
> > >
> > > Do you have any quick suggestions for me to try to fix this issue?
> >
> > My guess is that it is bug in the da_overlap business or in the example;
> > normally I would not expect this to happen. Is this a PETSc example I can
> > run (tell me example and exact options) to reproduce the problem?
> >
> > Thanks
> >
> > Barry
> >
> > >
> > > Thank you.
> > >
> > > Best,
> > > Xiangdong
> >
> >
>
>
>
