On Mon, Jun 25, 2018 at 6:12 PM, Richard Tran Mills <rtmi...@anl.gov> wrote:
> Hi Hong,
>
> On Mon, Jun 25, 2018 at 2:52 PM, Zhang, Hong <hongzh...@anl.gov> wrote:
>
>> On Jun 25, 2018, at 4:09 PM, Mills, Richard Tran <rtmi...@anl.gov> wrote:
>>
>> Hi Hong,
>>
>> Thanks for your reply. Yes, I was just looking at what goes on in
>> MatConvert_Basic() and I see the problem. I note that my AIJSELL code
>> seems to always work correctly when the matrix block size is one -- I
>> guess that this is because MatSeqSELLSetPreallocation() is called in
>> this case.
>>
>> Are you saying that the
>>
>>   if (A->rmap->bs > 1) {
>>     ierr = MatConvert_Basic(A,newtype,reuse,newmat);CHKERRQ(ierr);
>>     PetscFunctionReturn(0);
>>   }
>>
>> lines in sell.c should be removed and then things should work?
>>
>> If your conversion function for AIJSELL calls MatConvert_SeqAIJ_SeqSELL(),
>> then yes. Otherwise you might want to copy some code from the link I
>> attached (sell.c).
>
> I've tried to construct things so that there is as little code duplication
> as possible, so I am just calling MatConvert_SeqAIJ_SeqSELL().
>
>> If so, I'll remove this, test on my machine, and then merge this change
>> into 'next'. Or is more code needed to make SELL work properly with a
>> block size > 1?
>>
>> I don't think we need more code.
>
> Alright, sounds like the presence of the code path that sends things down
> MatConvert_Basic() should be considered a bug, then. I'll start a fix off
> of 'maint', test things locally, and then (if things pass) merge into
> 'next' for testing.
>
> If removing the code in question still doesn't fix things, well, then it's
> still a bug, but will need some more work to fix.

Hong:
Thanks for your help in identifying this bug. Removing the code that called
MatConvert_Basic() fixes the bugs I was seeing with my AIJSELL class. I've
created a branch with this fix off of 'maint'. It passes all tests and I've
merged it into 'next'.

Best regards,
Richard

> Thanks,
> Richard
>
>> Thanks,
>> Hong (Mr.)
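[Archive note: as background for readers, the reason a generic element-by-element copy like MatConvert_Basic() cannot produce a valid SELL matrix is that the sliced-ELLPACK layout must be preallocated from the per-row nonzero counts before any values are copied in. Below is a minimal, self-contained two-pass conversion sketch in plain C. It only models the idea; the struct, function names, and slice height of 8 are illustrative assumptions, not PETSc's actual SEQSELL internals.]

```c
#include <stdlib.h>

#define SLICE_HEIGHT 8  /* assumed slice height for this sketch */

/* A sliced-ELLPACK ("SELL-like") matrix. Rows are grouped into slices of
 * SLICE_HEIGHT; every row in a slice is padded to the length of the
 * longest row in that slice, so per-row nonzero counts MUST be known
 * before any values are copied in. */
typedef struct {
  int     nslices, nrows;
  int    *sliidx;   /* start of each slice in colidx/val (length nslices+1) */
  int    *colidx;   /* column indices, column-major within a slice, padded  */
  double *val;      /* values; padded entries are 0.0 with a safe colidx    */
} SellLike;

static SellLike csr_to_sell(int n, const int *rowptr, const int *col,
                            const double *a) {
  SellLike S;
  S.nrows   = n;
  S.nslices = (n + SLICE_HEIGHT - 1) / SLICE_HEIGHT;
  S.sliidx  = malloc((S.nslices + 1) * sizeof(int));
  S.sliidx[0] = 0;
  /* Pass 1: per-slice width = max nonzeros of any row in the slice */
  for (int s = 0; s < S.nslices; s++) {
    int w = 0;
    for (int i = s * SLICE_HEIGHT; i < (s + 1) * SLICE_HEIGHT && i < n; i++) {
      int nz = rowptr[i + 1] - rowptr[i];
      if (nz > w) w = nz;
    }
    S.sliidx[s + 1] = S.sliidx[s] + w * SLICE_HEIGHT;
  }
  int total = S.sliidx[S.nslices];
  S.colidx = malloc(total * sizeof(int));
  S.val    = calloc(total, sizeof(double));
  /* Padding entries point at column 0 with value 0.0, so x[colidx[k]]
   * is always a legal read during a multiply. */
  for (int k = 0; k < total; k++) S.colidx[k] = 0;
  /* Pass 2: scatter values, column-major inside each slice */
  for (int i = 0; i < n; i++) {
    int s = i / SLICE_HEIGHT, r = i % SLICE_HEIGHT;
    for (int j = rowptr[i]; j < rowptr[i + 1]; j++) {
      int k = S.sliidx[s] + (j - rowptr[i]) * SLICE_HEIGHT + r;
      S.colidx[k] = col[j];
      S.val[k]    = a[j];
    }
  }
  return S;
}
```

Because the first pass fixes each slice's width, the final short slice (when the row count is not divisible by the slice height) is still fully padded, and every padded colidx entry remains a legal index.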
>> --Richard
>>
>> On Mon, Jun 25, 2018 at 2:04 PM, Zhang, Hong <hongzh...@anl.gov> wrote:
>>
>>> Hi Richard,
>>>
>>> MatConvert_Basic() does not work for most cases when converting AIJ to
>>> SELL, because SELL may require padding and must be preallocated based
>>> on the nonzeros per row. See the correct conversion in
>>> https://bitbucket.org/petsc/petsc/src/master/src/mat/impls/sell/seq/sell.c?fileviewer=file-view-default%20#sell.c-251
>>>
>>> You were probably misled by the code
>>>
>>>   if (A->rmap->bs > 1) {
>>>     ierr = MatConvert_Basic(A,newtype,reuse,newmat);CHKERRQ(ierr);
>>>     PetscFunctionReturn(0);
>>>   }
>>>
>>> which should be removed.
>>>
>>> Thanks,
>>> Hong (Mr.)
>>>
>>> On Jun 25, 2018, at 3:06 PM, Richard Tran Mills <rtmi...@anl.gov> wrote:
>>>
>>> Hi Everyone (especially Herr Doktor Hong Zhang),
>>>
>>> It sure took me a while to get around to it, but I coded up (in branch
>>> rmills/feature-add-mataijsell) the skeleton of what I'm currently
>>> calling the 'AIJSELL' matrix type, which is a "meta" matrix type that
>>> inherits from AIJ but maintains a "shadow" MATSEQSELL copy of the matrix
>>> for use when that is appropriate. This works similarly to how AIJMKL
>>> works, insofar as the PetscObjectState is tracked and used to determine
>>> when the SELL "shadow" copy needs to be updated before an operation like
>>> MatMult() is applied. (Note that what I have so far only does MatMult(),
>>> but it should be easy to add other operations once I've verified that my
>>> implementation is working.) I decided on the approach of using a
>>> MATSEQSELL instance inside an AIJ-derived matrix type because I still
>>> want to be able to support standalone SELL matrices when there really
>>> isn't a need for any of the AIJ-only operations.
>>>
>>> My AIJSELL implementation seems to work fine in many cases, but I've
>>> spent several hours chasing some bugs that crop up in certain cases.
>>> I now think that the bugs are not a problem with my AIJSELL
>>> implementation (though I'd welcome another pair of eyes to help me
>>> verify this), but are lurking somewhere in the SELL code and haven't
>>> been hit before because AIJSELL makes it easier to exercise the SELL
>>> class on more problems.
>>>
>>> I've been using the well-loved SNES ex19 example (the one used for
>>> 'make check') for some of my debugging. I've found that everything
>>> works great when I size the problem grid such that the resulting
>>> matrices have a row count that is evenly divisible by 8 (when using
>>> double-precision scalars). So
>>>
>>>   ./ex19 -ksp_type gmres -pc_type none -snes_monitor -ksp_monitor \
>>>     -da_grid_x 4 -da_grid_y 5 -mat_seqaij_type seqaijsell -snes_max_it 1
>>>
>>> which results in an 80x80 Jacobian (since there are 4 degrees of
>>> freedom per grid point), runs fine, but
>>>
>>>   ./ex19 -ksp_type gmres -pc_type none -snes_monitor -ksp_monitor \
>>>     -da_grid_x 3 -da_grid_y 5 -mat_seqaij_type seqaijsell -snes_max_it 1
>>>
>>> results in either a segfault or the GMRES(30) solve terminating at
>>> 10,000 iterations because the residual norm just keeps bouncing up and
>>> down.
>>>
>>> Sticking some MatView() calls in, it appears that the MatMult() results
>>> inside the KSP solve are incorrect because the SELL copy of the matrix
>>> generated by my MatConvert() call is incorrect when the row count isn't
>>> evenly divisible by eight. In the case of SNES ex19, MatConvert_Basic()
>>> is currently used because the block size is greater than unity. So I
>>> think something is going wrong either in MatSetValues() or (my guess)
>>> MatAssemblyBegin()/End().
>>>
>>> Poking around with a debugger, I see that the column index array
>>> ("colidx") that is part of the SEQSELLHEADER ends up with nonsense
>>> values, and the segfaults I see tend to occur when indexing into the x
>>> vector with this.
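[Archive note: the symptom described above — nonsense colidx values and segfaults when indexing into the x vector — is exactly what an unpadded conversion produces, because a SELL-style multiply dereferences x through every stored column index, padding included. A minimal self-contained sketch of such a kernel in plain C, again with an assumed slice height of 8 and made-up names rather than PETSc's real SEQSELL internals:]

```c
#define SLICE_HEIGHT 8  /* assumed slice height for this sketch */

/* y = A*x for a sliced-ELLPACK matrix. Within slice s, entries are stored
 * column-major: colidx[sliidx[s] + c*SLICE_HEIGHT + r] is the c-th stored
 * column index of row s*SLICE_HEIGHT + r. Every entry, including padding,
 * is dereferenced as x[colidx[k]] -- so a single nonsense column index
 * (as left behind by an unpadded conversion) reads out of bounds. */
static void sell_mult(int nrows, const int *sliidx,
                      const int *colidx, const double *val,
                      const double *x, double *y) {
  int nslices = (nrows + SLICE_HEIGHT - 1) / SLICE_HEIGHT;
  for (int s = 0; s < nslices; s++) {
    /* slice width = padded entries per row in this slice */
    int width = (sliidx[s + 1] - sliidx[s]) / SLICE_HEIGHT;
    for (int r = 0; r < SLICE_HEIGHT; r++) {
      int row = s * SLICE_HEIGHT + r;
      if (row >= nrows) break;        /* short final slice: skip ghost rows */
      double sum = 0.0;
      for (int c = 0; c < width; c++) {
        int k = sliidx[s] + c * SLICE_HEIGHT + r;
        sum += val[k] * x[colidx[k]]; /* the indexing that can segfault */
      }
      y[row] = sum;
    }
  }
}
```

With correct padding (zero value, in-bounds column index) the extra work per padded entry is harmless; with garbage padding, x[colidx[k]] reads out of bounds, which matches the crashes reported here.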
>>>
>>> I'm not familiar enough with the internals of SELL to look further into
>>> debugging this without a lot of effort. Hong, can you help me look at
>>> this?
>>>
>>> Best regards,
>>> Richard
>>>
>>> On Tue, Mar 6, 2018 at 3:59 AM, Karl Rupp <r...@iue.tuwien.ac.at> wrote:
>>>
>>>> > Karl, are you thinking of a matrix subclass that has everything that
>>>> > an AIJ matrix does, but also keeps a SELL copy around for operations
>>>> > like MatMult()? Would it make sense to just keep another Mat inside
>>>> > (like how MPIAIJ keeps multiple Mat instances) that *is* of type
>>>> > MATSELL, that gets built/updated as needed? Would this involve
>>>> > carrying too much baggage around, associated with a complete Mat
>>>> > instance?
>>>>
>>>> What I have in mind is to put the SELL data structures into A->spptr,
>>>> very much like you did for AIJMKL.
>>>>
>>>> > I like the idea of having a MATSELL type available that is lean
>>>> > (insofar as not storing an AIJ matrix) for those cases when a user
>>>> > really knows that the AIJ stuff will not be needed. But maybe it
>>>> > makes sense to be able to use that inside another matrix class.
>>>> > Perhaps we could have something called, say, MATAIJMUTABLE that uses
>>>> > AIJ but might also create copies in SELL (or other formats,
>>>> > potentially) when appropriate -- perhaps based on some performance
>>>> > model indicating which format is fastest for MatMult() or whatever.
>>>>
>>>> The actual overhead of also storing a SELL data structure in terms of
>>>> memory footprint is at most 2x. When you keep in mind that extra memory
>>>> during the matrix assembly is also needed in the stash, the impact on
>>>> overall memory consumption is about 50 percent extra.
>>>> Given that SELL is a performance optimization for SpMV (and hence the
>>>> SELL data structure is only populated if you call MatMult() on the
>>>> particular matrix), I'm not too worried about the increased memory
>>>> footprint at this time.
>>>>
>>>> > Having a class that's an AIJ but can also use SELL is more convenient
>>>> > than adding a fallback to AIJ format inside MATSELL. I wonder if the
>>>> > latter option might be preferable in some circumstances, however,
>>>> > because it can avoid the extra memory footprint of also keeping the
>>>> > matrix in AIJ -- maybe AIJ operations are rarely needed and the AIJ
>>>> > conversion can just happen on a lazy basis.
>>>>
>>>> I advocate an 'optimize as needed' approach here. Let's first make SELL
>>>> compatible with the full range of AIJ operations and preconditioners.
>>>>
>>>> Best regards,
>>>> Karli
>>>>
>>>> > --Richard
>>>> >
>>>> > On 3/4/18 2:58 AM, Karl Rupp wrote:
>>>> >> Hi all,
>>>> >>
>>>> >> I'm getting increasingly concerned about SELL not being a subclass
>>>> >> of AIJ. As such, we have to deal with all these fallback operations
>>>> >> now, whereas as a subclass of AIJ we could just selectively make use
>>>> >> of the SELL format where we really benefit from it. "Use AIJ by
>>>> >> default unless we have something optimized for SELL" is just much
>>>> >> more appropriate for the few use cases of SELL than the current
>>>> >> "SELL has to implement everything, and usually this means manually
>>>> >> converting back to AIJ".
>>>> >>
>>>> >> If there are no objections I'd like to clean this up.
>>>> >> (Subclassing AIJ was unfortunately not available at the time Hong
>>>> >> started his great work on SELL.)
>>>> >>
>>>> >> Best regards,
>>>> >> Karli
>>>> >>
>>>> >> On 03/03/2018 07:52 AM, Richard Tran Mills wrote:
>>>> >>> Resurrecting a few-weeks-old thread:
>>>> >>>
>>>> >>> Stefano, did you get around to coding something up to do an
>>>> >>> automatic conversion to SeqAIJ for operations unsupported by the
>>>> >>> SELL format? I did some hacking the other day to try to get PCGAMG
>>>> >>> to use SELL inside the smoothers, and this turns out to be way more
>>>> >>> complicated than I'd like and very bug-prone (I haven't found all
>>>> >>> of mine, anyway). I think it may be preferable to be able to pass a
>>>> >>> SELL matrix to PCGAMG and have an internal conversion happen in the
>>>> >>> SELL matrix to AIJ format for doing the MatPtAP and LU solves.
>>>> >>> Support for this would certainly make it easier for users in a lot
>>>> >>> of other cases as well, and might make the use of SELL much more
>>>> >>> likely. If no one has already done some work on this, I'll take a
>>>> >>> stab at it.
>>>> >>>
>>>> >>> --Richard
>>>> >>>
>>>> >>> On Mon, Feb 12, 2018 at 10:04 AM, Richard Tran Mills
>>>> >>> <rtmi...@anl.gov> wrote:
>>>> >>>
>>>> >>>     On Mon, Feb 12, 2018 at 8:47 AM, Smith, Barry F.
>>>> >>>     <bsm...@mcs.anl.gov> wrote:
>>>> >>>
>>>> >>>     > On Feb 12, 2018, at 10:25 AM, Stefano Zampini
>>>> >>>     > <stefano.zamp...@gmail.com> wrote:
>>>> >>>     >
>>>> >>>     > Barry,
>>>> >>>     >
>>>> >>>     > for sure Amat,Pmat is the right approach; however, with
>>>> >>>     > complicated user codes, we are not always in control of
>>>> >>>     > having a different Jacobian matrix.
>>>> >>>     > Since Mat*SELL does not currently support any
>>>> >>>     > preconditioning except PCSOR and PCJACOBI, we ask the user
>>>> >>>     > to put in code like
>>>> >>>     >
>>>> >>>     >   if (type is SELL)
>>>> >>>     >     create two matrices (and maybe modify the code in many
>>>> >>>     >     other parts)
>>>> >>>     >   else
>>>> >>>     >     ok with the previous code
>>>> >>>
>>>> >>>     I don't disagree with what you are saying and am not opposed
>>>> >>>     to the proposed work.
>>>> >>>
>>>> >>>     Perhaps we need to do a better job with making the mat,pmat
>>>> >>>     approach simpler or better documented so more people use it
>>>> >>>     naturally in their applications.
>>>> >>>
>>>> >>> I wrote some code like that in some of the Jacobian/function
>>>> >>> routines in PFLOTRAN to experiment with MATSELL, and it works, but
>>>> >>> looks and feels pretty hacky. And if I wanted to support it for all
>>>> >>> of the different systems that PFLOTRAN can model, then I'd have to
>>>> >>> reproduce it in many different Jacobian and function evaluation
>>>> >>> routines. I also don't like that it makes it awkward to play with
>>>> >>> the many combinations of matrix types and preconditioners that
>>>> >>> PETSc allows: the above pseudocode should really say "if (type is
>>>> >>> SELL) and (preconditioner is not PCSOR or PCJACOBI)". I do think
>>>> >>> that Amat,Pmat is a good approach in many situations, but it's easy
>>>> >>> to construct scenarios in which it falls short.
>>>> >>>
>>>> >>> In some situations, what I'd like to have happen is what Stefano is
>>>> >>> talking about, with an automatic conversion to AIJ happening if
>>>> >>> SELL doesn't support an operation. But, ideally, I think this sort
>>>> >>> of implicit format conversion shouldn't be something hard-coded
>>>> >>> into the workings of SELL.
>>>> >>> Instead, there should be some general mechanism by which PETSc
>>>> >>> recognizes that a particular operation is unsupported for a given
>>>> >>> matrix format, and then it can (optionally) copy/convert to a
>>>> >>> different matrix type (probably defaulting to AIJ, but it shouldn't
>>>> >>> have to be AIJ) that supports the operation. This sort of implicit
>>>> >>> data-rearrangement game may actually become more important if
>>>> >>> future computer architectures strongly prefer different data
>>>> >>> layouts for different types of operations (though let's not get
>>>> >>> ahead of ourselves).
>>>> >>>
>>>> >>> --Richard
>>>> >>>
>>>> >>>     Barry
>>>> >>>
>>>> >>>     > Just my two cents.
>>>> >>>     >
>>>> >>>     > 2018-02-12 19:10 GMT+03:00 Smith, Barry F.
>>>> >>>     > <bsm...@mcs.anl.gov>:
>>>> >>>     >
>>>> >>>     > > On Feb 12, 2018, at 9:59 AM, Stefano Zampini
>>>> >>>     > > <stefano.zamp...@gmail.com> wrote:
>>>> >>>     > >
>>>> >>>     > > FYI, I just checked and MatSOR_*SELL does not use any
>>>> >>>     > > vectorized instructions.
>>>> >>>     > > Why not just convert to SeqAIJ, factor, and then use the
>>>> >>>     > > AIJ implementation of MatSolve for the moment?
>>>> >>>     >
>>>> >>>     > Why not use the mat,pmat feature of the solvers to pass in
>>>> >>>     > both matrices and have the solvers handle using two formats
>>>> >>>     > simultaneously, instead of burdening the MatSELL code with
>>>> >>>     > tons of special code for automatically converting to AIJ for
>>>> >>>     > solvers etc.?
>>>> >>>     >
>>>> >>>     > > 2018-02-12 18:06 GMT+03:00 Stefano Zampini
>>>> >>>     > > <stefano.zamp...@gmail.com>:
>>>> >>>     > >
>>>> >>>     > > 2018-02-12 17:36 GMT+03:00 Jed Brown <j...@jedbrown.org>:
>>>> >>>     > > Karl Rupp <r...@iue.tuwien.ac.at> writes:
>>>> >>>     > >
>>>> >>>     > > > Hi Stefano,
>>>> >>>     > > >
>>>> >>>     > > >> Is there any plan to write code for native ILU/ICC etc.
>>>> >>>     > > >> for SeqSELL, at least to have BJACOBI in parallel?
>>>> >>>     > > >
>>>> >>>     > > > (imho) ILU/ICC is a pain to do with SeqSELL. Point-Jacobi
>>>> >>>     > > > should be possible, yes. SELL is really just tailored to
>>>> >>>     > > > MatMults and a pain for anything that is not very similar
>>>> >>>     > > > to a MatMult...
>>>> >>>     > >
>>>> >>>     > > There is already MatSOR_*SELL. MatSolve_SeqSELL wouldn't be
>>>> >>>     > > any harder. I think it would be acceptable to convert to
>>>> >>>     > > SeqAIJ, factor, and convert the factors back to SELL.
>>>> >>>     > >
>>>> >>>     > > Yes, this was my idea. Today I have started coding
>>>> >>>     > > something. I'll push the branch whenever I have anything
>>>> >>>     > > working.
>>>> >>>     > >
>>>> >>>     > > --
>>>> >>>     > > Stefano
>>>> >>>     >
>>>> >>>     > --
>>>> >>>     > Stefano
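[Archive note: the PetscObjectState-based "shadow copy" bookkeeping that AIJSELL (like AIJMKL) relies on, described earlier in the thread, can be modeled in a few lines of plain C. The struct and function names below are illustrative assumptions, not the actual PETSc implementation; the point is only the staleness check that lets MatMult()-style operations rebuild the shadow format lazily.]

```c
#include <string.h>

/* Primary matrix with a lazily maintained "shadow" copy in another
 * format. `state` bumps on every modification; the shadow records the
 * state it was built from, so a consumer can rebuild it only when stale. */
typedef struct {
  unsigned long state;         /* like PetscObjectState            */
  double        primary[4];    /* stand-in for the AIJ structure   */
  double        shadow[4];     /* stand-in for the SELL structure  */
  unsigned long shadow_state;  /* state the shadow was built from  */
  int           rebuilds;      /* how many conversions happened    */
} ShadowMat;

static void shadow_set(ShadowMat *m, int i, double v) {
  m->primary[i] = v;
  m->state++;                  /* any modification bumps the state */
}

static const double *shadow_get(ShadowMat *m) {
  if (m->shadow_state != m->state) {                   /* shadow is stale */
    memcpy(m->shadow, m->primary, sizeof m->primary);  /* "convert"       */
    m->shadow_state = m->state;
    m->rebuilds++;
  }
  return m->shadow;            /* up to date: reuse without converting */
}
```

The same counter comparison is what lets repeated MatMult() calls reuse the shadow copy for free, while any intervening MatSetValues()-style modification forces exactly one reconversion.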