Xiangdong,

    I've attached at patch 

Attachment: Xiangdong.patch
Description: Binary data

also available in the branch 
barry/support-dmcreatedomaindecompositionscatters-da-nonperiodic-maint if you 
are using git.
It resolves the current issue by only creating scatters for physical ghost 
points. 

    BUT I do not know if this fix will get nasm to work properly for a real 
problem with DMDAs. There may be other problems that come up.

   Barry


> On Jun 29, 2016, at 10:49 AM, Xiangdong <[email protected]> wrote:
> 
> To make things easier, I created a simple example here:
> 
> https://drive.google.com/file/d/0BxEfb1tasJxhOHJWbVJ0ZGl0LTA/view?usp=sharing
> 
> ./exe -snes_newtonls okay.
> ./exe -snes_nasm -da_grid_z 2 okay
> ./exe -snes_nasm failed. It got trapped in the while loop.
> 
> Do you have any suggestions to avoid this trap for nz=1?
> 
> Thanks.
> 
> Xiangdong
> 
> On Wed, Jun 29, 2016 at 11:23 AM, Xiangdong <[email protected]> wrote:
> The code got trapped in a while loop when I switch to s=2.
> 
> http://www.mcs.anl.gov/petsc/petsc-current/src/dm/impls/da/dadd.c.html
> 
> In dadd.c line 85-96  
> /* gone out of processor range on y axis */
>  86:         while(jj > ne-1 || jj < ns) {
>  87:           if (nr == n-1) {
>  88:             ns = 0;
>  89:             ne = ly[0];
>  90:             nr = 0;
>  91:           } else {
>  92:             nr++;
>  93:             ns = ne;
>  94:             ne += ly[nr];
>  95:           }
>  96:         }
> 
> For my case nx=5,ny=5,nz=1,s=2, np=2, snes_type nasm, the code is trapped in 
> the loop for jj=-1, ns=0, nr=0, n=1. This negative jj comes from the line 
> 340-349 of same file.
> 
> /* global and subdomain ISes for the local indices of the subdomain */
> 341:     /* todo - make this not loop over at nonperiodic boundaries, which 
> will be more involved */
> 342:     lower.i = subinfo.gxs;
> 343:     lower.j = subinfo.gys;
> 344:     lower.k = subinfo.gzs;
> 345:     upper.i = subinfo.gxs+subinfo.gxm;
> 346:     upper.j = subinfo.gys+subinfo.gym;
> 347:     upper.k = subinfo.gzs+subinfo.gzm;
> 
> 349:     DMDACreatePatchIS(dm,&lower,&upper,&gdis);
> 
> When trapped, subinfo.gxs=-2. Since there is a todo comment, I am not sure 
> this is the reason for the while loop trap.
> 
> Thanks.
> 
> Xiangdong
> 
> 
> On Tue, Jun 28, 2016 at 5:16 PM, Barry Smith <[email protected]> wrote:
> 
>   If you make the stencil width as wide at the overlap does the problem go 
> away and you get expected convergence?
> 
> > On Jun 28, 2016, at 3:48 PM, Xiangdong <[email protected]> wrote:
> >
> > Let me illustrate my point and question by an example. I have a 1d 
> > nonlinear problem with Nx=5 and stencil width 1. When I run it with np=2, 
> > processor 1 own cells 1,2,3 (+ ghost 4) and processor 2 own cells 4,5 (+ 
> > ghost 3). I also have functions FormFunctionLocal and FormJacobianLocal to 
> > compute the residual and Jacobian.
> >
> > i) snes_type newtonls. The Jacobian is in mpiaij format distributed on two 
> > processors. processor 1 computes the first three rows of Jac (size 3x5) and 
> > processor 1 computes the last two rows of Jac (size 2x5). The code works 
> > fine.
> >
> > ii) snes_type nasm and da_overlap 0. Each processor has its own Jac 
> > (seqaij). Processor 1 computes its 3x3 Jac and Processor 2 computes its 2x2 
> > Jac. The code works fine.
> >
> > iii) snes_type nasm and da_overlap 1. Each processor needs to computes its 
> > own Jac. Processor 1 computes its 4x4 Jac and Processor 2 computes its 3x3 
> > Jac. However, for processor 1 to compute the 4-th row of its Jac, it needs 
> > the properties information stored on cell 5, which is not available on 
> > processor 1. The zero pivot error is from this wrong Jac for the row 
> > corresponding to the overlap cell. The mat_fd_color does not work either 
> > for the same reason that information from cell 5 is not available on 
> > processor 1.
> >
> > Do you have any suggestion for the overlap case? Thank you.
> >
> > Best,
> > Xiangdong
> >
> > On Mon, Jun 27, 2016 at 4:21 PM, Barry Smith <[email protected]> wrote:
> >
> > > On Jun 27, 2016, at 3:15 PM, Xiangdong <[email protected]> wrote:
> > >
> > > Hello everyone,
> > >
> > > I am trying different number of da_overlap to see its effects on nasm and 
> > > aspin preconditioner. The codes works fine with -da_overlap 0. However, 
> > > when I change the option -da_overlap 1, it crashed with the error message 
> > > like "zero pivot row 12544 value 0 tolerance 2.2e-14". The option 
> > > -pc_factor_nonzeros_along_diagonal does not fix this zero pivot issue.
> > >
> > > Do you have any quick suggestions for me to try to fix this issue?
> >
> >    My guess is that it is bug in the da_overlap business or in the example; 
> > normally I would not expect this to happen. Is this a PETSc example I can 
> > run (tell me example and exact options) to reproduce the problem?
> >
> >    Thanks
> >
> >    Barry
> >
> > >
> > > Thank you.
> > >
> > > Best,
> > > Xiangdong
> >
> >
> 
> 
> 

Reply via email to