> What code? The special sparsity pattern for blocks?     

Yes. See here 
https://bitbucket.org/petsc/petsc/src/master/src/dm/impls/network/network.c#lines-1712
 

    > What if the user has dense blocks or doesn't know or care about the 
sparsity pattern? By default it should allocate something that every user's 
code will fit in the matrix, and then provide user-callable options for making 
parts of the matrices sparse.

That's what it is doing, I believe. If the user does not set a sparse matrix 
block for the point, then it uses a dense block.
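
For reference, here is a rough, untested sketch of supplying a structure-only 
block for a vertex so that the preallocation uses it instead of a dense block. 
It assumes the DMNetworkHasJacobian()/DMNetworkVertexSetMatrix() interface 
implemented in the network.c code linked above; the vertex point v, the 
diagonal-only pattern, and the size of the Jblocks array are purely 
illustrative, and the full layout of the Jblocks array (couplings to supporting 
edges and neighboring vertices) is documented in that source.

  Mat      Jvv;
  Mat      Jblocks[16] = {NULL};   /* NULL entries presumably fall back to dense blocks */
  PetscInt nvar, i;

  ierr = DMNetworkHasJacobian(networkdm, PETSC_FALSE, PETSC_TRUE);CHKERRQ(ierr);
  ierr = DMNetworkGetNumVariables(networkdm, v, &nvar);CHKERRQ(ierr);
  /* Structure-only block: only the structural nonzeros matter, the values are ignored */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, nvar, nvar, 1, NULL, &Jvv);CHKERRQ(ierr);
  for (i = 0; i < nvar; i++) {
    ierr = MatSetValue(Jvv, i, i, 0.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(Jvv, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Jvv, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  Jblocks[0] = Jvv;   /* J[0] is the vertex's own diagonal block */
  ierr = DMNetworkVertexSetMatrix(networkdm, v, Jblocks);CHKERRQ(ierr);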

Shri



On 5/9/19, 12:24 AM, "Smith, Barry F." <bsm...@mcs.anl.gov> wrote:

    
    
    > On May 8, 2019, at 9:53 PM, Abhyankar, Shrirang G via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
    > 
    >  
    >  
    > From: Matthew Knepley <knep...@gmail.com>
    > Date: Wednesday, May 8, 2019 at 9:29 PM
    > To: "Abhyankar, Shrirang G" <shrirang.abhyan...@pnnl.gov>
    > Cc: "Zhang, Hong" <hzh...@mcs.anl.gov>, Justin Chang 
<jychan...@gmail.com>, petsc-users <petsc-users@mcs.anl.gov>
    > Subject: Re: [petsc-users] Extremely slow DMNetwork Jacobian assembly
    >  
    > On Wed, May 8, 2019 at 10:17 PM Abhyankar, Shrirang G via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
    >  
    >  
    > From: petsc-users <petsc-users-boun...@mcs.anl.gov> on behalf of "Zhang, 
Hong via petsc-users" <petsc-users@mcs.anl.gov>
    > Reply-To: "Zhang, Hong" <hzh...@mcs.anl.gov>
    > Date: Wednesday, May 8, 2019 at 8:01 PM
    > To: Justin Chang <jychan...@gmail.com>
    > Cc: petsc-users <petsc-users@mcs.anl.gov>
    > Subject: Re: [petsc-users] Extremely slow DMNetwork Jacobian assembly
    >  
    > Justin:
    > Great, the issue is resolved.
    > Why does MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) not 
raise an error?
    >  
    > I copy-pasted the above line from the power.c example. MatSetOption 
should use PETSC_TRUE to activate the MAT_NEW_NONZERO_ALLOCATION_ERR option.
    >  
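    > For completeness, a minimal sketch of the corrected calls (J and 
networkdm as in the code quoted further down; PETSC_TRUE is the only change):
    > 
    >   ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
    >   /* With PETSC_TRUE, any MatSetValues() entry that falls outside the
    >      preallocated nonzero pattern raises an error instead of triggering
    >      a silent (and very slow) reallocation. */
    >   ierr = MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);
    > 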
    > Matt,
    >  
    > We usually prevent this with a structured SetValues API. For example, 
DMDA uses MatSetValuesStencil() which cannot write
    > outside the stencil you set. DMPlex uses MatSetValuesClosure(), which is 
guaranteed to be allocated. We should write one
    > for DMNetwork. The allocation is just like Plex (I believe) where you 
allocate closure(star(p)), which would mean that setting
    > values for a vertex gets the neighboring edges and their vertices, and 
setting values for an edge gets the covering vertices.
    > Is that right for DMNetwork?
    > Yes, DMNetwork behaves in this fashion. 
    > I cannot find MatSetValuesClosure() in petsc-master. 
    > Can you provide detailed instructions on how to implement 
MatSetValuesClosure() for DMNetwork?
    > Note, dmnetwork is a subclass of DMPlex.
    >  
    > DMNetwork does not do any matrix creation by itself. It calls Plex to 
create the matrix.
    >  
    > Right. However, the only operation I put in was MatSetClosure() since 
that is appropriate for FEM. I think you would need
    > a MatSetStar() for DMNetwork as well.
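    > 
    > Roughly, such a star-based helper for a vertex might look like the 
untested sketch below. The DMNetwork query routines used here 
(DMNetworkGetNumVariables(), DMNetworkGetVariableGlobalOffset(), 
DMNetworkGetSupportingEdges(), DMNetworkGetConnectedVertices()) are named from 
memory, the fixed-size index arrays are for illustration only, and this is not 
the actual Plex closure machinery:
    > 
    >   static PetscErrorCode MatSetValuesVertexStar(DM dm, Mat J, PetscInt v,
    >                                                const PetscScalar vals[], InsertMode mode)
    >   {
    >     PetscErrorCode  ierr;
    >     const PetscInt *edges, *cvtx;
    >     PetscInt        nedges, e, i, w, nvar, goff, nrow = 0, ncol = 0;
    >     PetscInt        rows[64], cols[256];   /* fixed sizes for illustration only */
    > 
    >     PetscFunctionBegin;
    >     /* Rows: the dofs of the vertex itself */
    >     ierr = DMNetworkGetNumVariables(dm, v, &nvar);CHKERRQ(ierr);
    >     ierr = DMNetworkGetVariableGlobalOffset(dm, v, &goff);CHKERRQ(ierr);
    >     for (i = 0; i < nvar; i++) rows[nrow++] = goff + i;
    >     /* Columns: the vertex dofs ... */
    >     for (i = 0; i < nrow; i++) cols[ncol++] = rows[i];
    >     /* ... plus the dofs of every supporting edge and of the vertex at the
    >        other end of each such edge, i.e. closure(star(v)) in Plex terms */
    >     ierr = DMNetworkGetSupportingEdges(dm, v, &nedges, &edges);CHKERRQ(ierr);
    >     for (e = 0; e < nedges; e++) {
    >       ierr = DMNetworkGetNumVariables(dm, edges[e], &nvar);CHKERRQ(ierr);
    >       ierr = DMNetworkGetVariableGlobalOffset(dm, edges[e], &goff);CHKERRQ(ierr);
    >       for (i = 0; i < nvar; i++) cols[ncol++] = goff + i;
    >       ierr = DMNetworkGetConnectedVertices(dm, edges[e], &cvtx);CHKERRQ(ierr);
    >       w    = (cvtx[0] == v) ? cvtx[1] : cvtx[0];
    >       ierr = DMNetworkGetNumVariables(dm, w, &nvar);CHKERRQ(ierr);
    >       ierr = DMNetworkGetVariableGlobalOffset(dm, w, &goff);CHKERRQ(ierr);
    >       for (i = 0; i < nvar; i++) cols[ncol++] = goff + i;
    >     }
    >     /* vals is the caller's dense nrow-by-ncol block, stored row-major */
    >     ierr = MatSetValues(J, nrow, rows, ncol, cols, vals, mode);CHKERRQ(ierr);
    >     PetscFunctionReturn(0);
    >   }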
    >  
    > I see that Hong has implemented some additional code in 
DMCreateMatrix_Network that does not use Plex for creating the matrix. I think 
it covers what you describe above.
    >  
    > Hong: DMNetwork matrix creation code is not used unless the user wants to 
set a special sparsity pattern for the blocks. Shouldn't this code
    
       What code? The special sparsity pattern for blocks? 
    
    > be used by default instead of having Plex create the matrix?
    
       What if the user has dense blocks or doesn't know or care about the 
sparsity pattern? By default it should allocate something that every user's 
code will fit in the matrix, and then provide user-callable options for making 
parts of the matrices sparse.
    
       Barry
    
    >  
    > Shri
    >  
    >    Matt
    >  
    > Hong
    >  
    >  
    > On Wed, May 8, 2019 at 4:00 PM Dave May <dave.mayhe...@gmail.com> wrote:
    >  
    >  
    > On Wed, 8 May 2019 at 20:34, Justin Chang via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
    > So here's the branch/repo to the working example I have: 
    >  
    > https://github.com/jychang48/petsc-dss/tree/single-bus-vertex
    >  
    > Type 'make' to compile the dss; it should work with the latest petsc-dev
    >  
    > To test the performance, I've taken an existing IEEE 13-bus and 
duplicated it N times to create a long radial-like network. I have three sizes 
where N = 100, 500, and 1000. Those test files are listed as:
    >  
    > input/test_100.m
    > input/test_500.m
    > input/test_1000.m
    >  
    > I also created another set of examples where the IEEE 13-bus is fully 
balanced (but the program will crash at the solve step because I used some 
unrealistic parameters for the Y-bus matrices and probably have some zeros 
somewhere). They are listed as:
    >  
    > input/test2_100.m
    > input/test2_500.m
    > input/test2_1000.m
    >  
    > The dof count and matrices for the test2_*.m files are slightly larger 
than those of their respective test_*.m files, but they have bs = 6.
    >  
    > To run these tests, type the following:
    >  
    > ./dpflow -input input/test_100.m 
    >  
    > I have a timer that shows how long it takes to compute the Jacobian. 
Attached are the log outputs I have for each of the six cases.
    >  
    > It turns out that only the first call to SNESComputeJacobian() is slow; 
all the subsequent calls are as fast as I expect. This makes me think it still 
has something to do with matrix allocation.
    >  
    > I think it is a preallocation issue.
    > Looking at some of the output files (test_1000.out, test_100.out), under 
Mat Object I see this in the KSPView:
    >  
    >       total number of mallocs used during MatSetValues calls =10000
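    > 
    > To see this number programmatically after assembly (a small sketch; J is 
the Jacobian from DMCreateMatrix(), and a nonzero mallocs count means the 
preallocated pattern was exceeded):
    > 
    >   MatInfo info;
    >   ierr = MatGetInfo(J, MAT_LOCAL, &info);CHKERRQ(ierr);
    >   /* info.mallocs counts the reallocations done during MatSetValues() */
    >   ierr = PetscPrintf(PETSC_COMM_SELF, "mallocs during MatSetValues(): %g\n",
    >                      (double)info.mallocs);CHKERRQ(ierr);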
    >  
    >  
    >  
    >  
    >  
    > Thanks for the help everyone,
    >  
    > Justin
    >  
    > On Wed, May 8, 2019 at 12:36 PM Matthew Knepley <knep...@gmail.com> wrote:
    > On Wed, May 8, 2019 at 2:30 PM Justin Chang <jychan...@gmail.com> wrote:
    > Hi everyone, 
    >  
    > Yes I have these lines in my code:
    > 
    > ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
    > ierr = 
MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
    >  
    > Okay, it's not allocation. So maybe Hong is right that it's setting great 
big element matrices. We will see with the example.
    >  
    >   Thanks,
    >  
    >     Matt
    >  
    > I tried -info and here's my output:
    >  
    > [0] PetscInitialize(): PETSc successfully started: number of processors = 
1
    > [0] PetscInitialize(): Running on machine: jchang31606s.domain
    > [0] PetscCommDuplicate(): Duplicating a communicator 4436504608 
140550815662944 max tags = 2147483647
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 
140550815662944
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 
140550815662944
    > Base power = 0.166667, numbus = 115000, numgen = 5000, numyl = 75000, 
numdl = 5000, numlbr = 109999, numtbr = 5000
    >  
    > **** Power flow dist case ****
    >  
    > Base power = 0.166667, nbus = 115000, ngen = 5000, nwye = 75000, ndelta = 
5000, nbranch = 114999
    > [0] PetscCommDuplicate(): Duplicating a communicator 4436505120 
140550815683104 max tags = 2147483647
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 620000 X 620000; storage space: 
0 unneeded,10799928 used
    > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
    > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 28
    > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 
0)/(num_localrows 620000) < 0.6. Do not use CompressedRow routines.
    > [0] MatSeqAIJCheckInode(): Found 205000 nodes of 620000. Limit used: 5. 
Using Inode routines
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608 
140550815662944
    > [0] DMGetDMSNES(): Creating new DMSNES
    > [0] DMGetDMKSP(): Creating new DMKSP
    > [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120 
140550815683104
    >   0 SNES Function norm 1155.45 
    >  
    > Nothing else -info related shows up as I'm iterating through the vertex 
loop.
    >  
    > I'll have a MWE for you guys to play with shortly.
    >  
    > Thanks,
    > Justin
    >  
    > On Wed, May 8, 2019 at 12:10 PM Smith, Barry F. <bsm...@mcs.anl.gov> 
wrote:
    > 
    >   Justin,
    > 
    >      Are you providing matrix entries that directly connect one vertex 
to another vertex ACROSS an edge? I don't think that is supported by the 
DMNetwork model. The assumption is that edges are only connected to vertices 
and vertices are only connected to neighboring edges.
    > 
    >   Everyone,
    > 
    >   I second Matt's reply. 
    > 
    >   How is the DMNetwork preallocating for the Jacobian? Does it take into 
account coupling between neighboring vertices/edges? Or does it assume no 
coupling, or full coupling? If it assumes no coupling and the user has a 
good amount of coupling, it will be very slow. 
    > 
    >   There would need to be a way for the user to provide the coupling 
information between neighboring vertices/edges if it assumes no coupling.
    > 
    >     Barry
    > 
    > 
    > > On May 8, 2019, at 7:44 AM, Matthew Knepley via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
    > > 
    > > On Wed, May 8, 2019 at 4:45 AM Justin Chang via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
    > > Hi guys,
    > > 
    > > I have a fully working distribution system solver written using 
DMNetwork. The idea is that each electrical bus can have up to three phase 
nodes, and each phase node has two unknowns: voltage magnitude and angle. In a 
completely balanced system, each bus has three nodes, but in an unbalanced 
system some of the buses can be either single-phase or two-phase.
    > > 
    > > The working DMNetwork code I developed, loosely based on the SNES 
network/power.c example, essentially represents each vertex as a bus. The 
DMNetworkAddNumVariables() function will add either 2, 4, or 6 unknowns to each 
vertex. If every single bus had the same number of variables, the mat block 
size = 2, 4, or 6, and my code is both fast and scalable. However, if the 
number of unknowns per DMNetwork vertex is not the same across vertices, then 
my SNESFormJacobian function becomes extremely, extremely slow. Specifically, 
the MatSetValues() calls are slow when the col/row global indices contain an 
offset value that points to a neighboring bus vertex. 
    > > 
    > > I have never seen MatSetValues() be slow unless it is allocating. Did 
you confirm that you are not allocating, with -info?
    > > 
    > >   Thanks,
    > > 
    > >      Matt
    > >  
    > > Why is that? Is it because I no longer have a uniform block structure 
and lose the speed/optimization benefits of iterating through an AIJ matrix? I 
see three potential workarounds:
    > > 
    > > 1) Treat every vertex as a three-phase bus, "zero out" all the unused 
phase node dofs, and put a 1 on the diagonal (a rough sketch of this appears 
after this list). The problem I see with this is that I will have unnecessary 
degrees of freedom (aka non-zeros in the matrix). From the distribution systems 
I've seen, it's possible that anywhere from 1/3 to 1/2 of the buses will be 
two-phase or less, meaning I may have nearly twice the number of dofs necessary 
if I wanted to preserve the block size = 6 for the AU mat.
    > > 
    > > 2) Treat every phase node as a vertex, i.e., solve a single-phase power 
flow problem. That way I am guaranteed a block size = 2; this is what 
Domenico's former student did in his thesis work. The problem I see with this 
is that I have a larger graph, which can take more time to set up and 
parallelize.
    > > 
    > > 3) Create a "fieldsplit" where I essentially have three "blocks" - one 
for buses with all three phases, another for buses with only two phases, and 
one for single-phase buses. This way each block/fieldsplit will have a 
consistent block size. I am not sure if this will solve the MatSetValues() 
issues, but can anyone give pointers on how to go about achieving this?
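    > > 
    > > For option 1, a rough sketch of the "zero out and put a 1 on the 
diagonal" step: the row-index array "unused" and its length "nunused" are 
hypothetical, collected by the application while looping over the network, and 
the corresponding residual entries would have to be handled consistently.
    > > 
    > >   /* Replace the rows of the unused phase dofs with identity rows */
    > >   ierr = MatZeroRows(J, nunused, unused, 1.0, NULL, NULL);CHKERRQ(ierr);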
    > > 
    > > Thanks,
    > > Justin
    > > 
    > > 
    > > -- 
    > > What most experimenters take for granted before they begin their 
experiments is infinitely more interesting than any results to which their 
experiments lead.
    > > -- Norbert Wiener
    > > 
    > > https://www.cse.buffalo.edu/~knepley/
    > 
    > 
    >  
    > -- 
    > What most experimenters take for granted before they begin their 
experiments is infinitely more interesting than any results to which their 
experiments lead.
    > -- Norbert Wiener
    >  
    > https://www.cse.buffalo.edu/~knepley/
    > 
    >  
    > -- 
    > What most experimenters take for granted before they begin their 
experiments is infinitely more interesting than any results to which their 
experiments lead.
    > -- Norbert Wiener
    >  
    > https://www.cse.buffalo.edu/~knepley/
    > 
    >  
    > -- 
    > What most experimenters take for granted before they begin their 
experiments is infinitely more interesting than any results to which their 
experiments lead.
    > -- Norbert Wiener
    >  
    > https://www.cse.buffalo.edu/~knepley/
    
    
