Send configure.log and make.log to petsc-ma...@mcs.anl.gov

  Barry


> On Feb 5, 2019, at 10:48 PM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> 
> Since MUMPS and ScaLAPACK are already installed on my computer, I only ran 
> ./configure with --download-superlu_dist .
> After everything was done, I received this error: 
> dyld: lazy symbol binding failed: Symbol not found: 
> _MatSolverTypeRegister_SuperLU_DIST
>   Referenced from: 
> /Users/yaxiong/Downloads/petsc-3.9.4/arch-darwin-c-debug/lib/libpetsc.3.9.dylib
>   Expected in: flat namespace
> 
> dyld: Symbol not found: _MatSolverTypeRegister_SuperLU_DIST
>   Referenced from: 
> /Users/yaxiong/Downloads/petsc-3.9.4/arch-darwin-c-debug/lib/libpetsc.3.9.dylib
>   Expected in: flat namespace
> 
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 5 TRAP
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below
> [0]PETSC ERROR: ---------------------  Stack Frames 
> ------------------------------------
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
> [0]PETSC ERROR:       is given.
> [0]PETSC ERROR: [0] MatInitializePackage line 150 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/mat/interface/dlregismat.c
> [0]PETSC ERROR: [0] MatCreate line 83 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/mat/utils/gcreate.c
> [0]PETSC ERROR: --------------------- Error Message 
> --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> 
> 
> From: Smith, Barry F. <bsm...@mcs.anl.gov>
> Sent: Tuesday, February 5, 2019 10:58 PM
> To: Yaxiong Chen
> Cc: petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] PETSC matrix assembling super slow
>  
> 
>   Run ./configure with --download-superlu_dist --download-mumps 
> --download-scalapack and then you can use either -pc_factor_mat_solver_type 
> superlu_dist or -pc_factor_mat_solver_type mumps
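> 
>   For example, a minimal sketch (untested; it assumes the PETSc 3.9 Fortran 
> bindings and the ksp and prec variables already set up as in your code) of 
> selecting the direct solver from the source instead of the command line:
> 
>     call KSPGetPC(ksp,prec,ierr)
>     call PCSetType(prec,PCLU,ierr)                    ! or PCCHOLESKY for a symmetric matrix
>     call PCFactorSetMatSolverType(prec,'mumps',ierr)  ! or 'superlu_dist'
>     call KSPSetFromOptions(ksp,ierr)                  ! command-line options can still override this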
> 
>   Good luck
> 
> 
> > On Feb 5, 2019, at 9:29 PM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> > 
> > > Also, I found the solving time is shorter when I use the direct 
> > > solver (0.432 s vs 4.332 s). Is this due to the small scale of the system? 
> > > When I have a very large (e.g., 100000*100000) system, can I expect the 
> > > iterative solver to be faster?
> > 
> > << It sounds like the default preconditioner is not working well on your 
> > << problem. First run with -ksp_monitor -ksp_converged_reason to get an idea 
> > << of how quickly the iterative solver is converging. If the number of 
> > << iterations is high, you are going to need to change the preconditioner to 
> > << get one that converges well for your problem. What mathematical models does 
> > << your code implement? This will help in determining what type of 
> > << preconditioner to use.
> > 
> > << Barry
> > 
> > 
> > It seems I can only work with the Jacobi preconditioner. When I try LU or 
> > Cholesky, I get the error:
> > [0]PETSC ERROR: See 
> > http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for 
> > possible LU and Cholesky solvers
> > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must 
> > ./configure with --download-<package>
> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> > trouble shooting.
> > [0]PETSC ERROR: Petsc Release Version 3.9.4, Sep, 11, 2018 
> > [0]PETSC ERROR: ./optimal_mechanical_part.hdc on a arch-darwin-c-debug 
> > named hidac.ecn.purdue.edu by chen2018 Tue Feb  5 22:19:08 2019
> > [0]PETSC ERROR: Configure options 
> > [0]PETSC ERROR: #1 MatGetFactor() line 4328 in 
> > /Users/yaxiong/Downloads/petsc-3.9.4/src/mat/interface/matrix.c
> > [0]PETSC ERROR: #2 PCSetUp_Cholesky() line 86 in 
> > /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/pc/impls/factor/cholesky/cholesky.c
> > [0]PETSC ERROR: #3 PCSetUp() line 923 in 
> > /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/pc/interface/precon.c
> > [0]PETSC ERROR: #4 KSPSetUp() line 381 in 
> > /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/ksp/interface/itfunc.c
> > [0]PETSC ERROR: #5 KSPSolve() line 612 in 
> > /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/ksp/interface/itfunc.c
> >  solve time   3.8589999999985025E-003
> >  reason           0
> >  It asks me to download the package. Do I need to run the command in the 
> > PETSc folder? But I am not sure what the name of the package should be. 
> > I tried ./configure --download-<PCLU>, but it does not work.
> > 
> > From: Smith, Barry F. <bsm...@mcs.anl.gov>
> > Sent: Tuesday, February 5, 2019 1:18 PM
> > To: Yaxiong Chen
> > Cc: PETSc users list
> > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> >  
> > 
> > 
> > > On Feb 5, 2019, at 10:32 AM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> > > 
> > > Thanks Barry,
> > >      I will explore how to partition for parallel computation later. But 
> > > for now I still have some confusion about the sequential operation.
> > > I compared PETSc and MUMPS. In both cases, the subroutine for generating 
> > > the elemental matrix is very similar. However, the difference is in the 
> > > following step:
> > >    call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)
> > > In this step, each element costs about 3e-2 to 4e-2 seconds.
> > > 
> > > However, when I use mumps, I use the following code: 
> > >    preA(Aptr+1:Aptr+n*(n+1)/2) = pack(Ae, mask(1:n,1:n))
> > >    Aptr = Aptr +n*(n+1)/2
> > >    if (allocated(auxRHSe)) preRHS(idx) = preRHS(idx)+auxRHSe
> > >  It just costs 10e-6~10e-5 seconds per element. For a 10000*10000 matrix, 
> > > the assembling time using PETSc is 300 s, while it costs 60 s when using 
> > > MUMPS. Is there any way I can make it faster? 
> > 
> >   As we keep saying, the slowness of the matrix assembly is due to incorrect 
> > preallocation. Once the preallocation is correct, the speed should 
> > increase dramatically.
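> > 
> >    For instance, here is a rough sketch of building per-row counts and 
> > passing them to MatMPIAIJSetPreallocation() (untested; getElementalIDX is a 
> > hypothetical routine that returns only the global indices of element e 
> > without forming Ae, Istart/Iend are assumed to already hold this process's 
> > row range, and the counts are deliberate overestimates because shared DOFs 
> > are counted once per element):
> > 
> >       PetscInt, allocatable :: d_nnz(:), o_nnz(:)
> >       PetscInt              :: nloc, row, e, j
> > 
> >       nloc = Iend - Istart
> >       allocate(d_nnz(nloc), o_nnz(nloc))
> >       d_nnz = 0
> >       o_nnz = 0
> >       do e = 1, nelem                          ! every rank scans all elements just to count
> >          call getElementalIDX(e, idx)          ! hypothetical: indices only, no Ae
> >          do j = 1, size(idx)
> >             row = idx(j) - 1                   ! 0-based global row index
> >             if (row >= Istart .and. row < Iend) then
> >                d_nnz(row-Istart+1) = d_nnz(row-Istart+1) + size(idx)
> >                o_nnz(row-Istart+1) = o_nnz(row-Istart+1) + size(idx)
> >             end if
> >          end do
> >       end do
> >       d_nnz = min(d_nnz, nloc)                 ! diagonal block has at most nloc columns
> >       o_nnz = min(o_nnz, info(1) - nloc)       ! info(1) = global size, as in your MatSetSizes() call
> >       call MatMPIAIJSetPreallocation(Amat, 0, d_nnz, 0, o_nnz, ierr)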
> > 
> > > Also, I found the solving time is shorter when I use the direct 
> > > solver (0.432 s vs 4.332 s). Is this due to the small scale of the system? 
> > > When I have a very large (e.g., 100000*100000) system, can I expect the 
> > > iterative solver to be faster?
> > 
> >    It sounds like the default preconditioner is not working well on your 
> > problem. First run with -ksp_monitor -ksp_converged_reason to get an idea 
> > of how quickly the iterative solver is converging. If the number of 
> > iterations is high, you are going to need to change the preconditioner to 
> > get one that converges well for your problem. What mathematical models does 
> > your code implement? This will help in determining what type of 
> > preconditioner to use.
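> > 
> >    You can also check the same information from the code after KSPSolve() 
> > (a small sketch; its is an extra PetscInt and reason is the KSPConvergedReason 
> > variable you already declare):
> > 
> >       call KSPSolve(ksp,bvec,xvec,ierr)
> >       call KSPGetConvergedReason(ksp,reason,ierr)
> >       call KSPGetIterationNumber(ksp,its,ierr)
> >       if (reason < 0) then
> >          print*,'KSP diverged, reason = ',reason
> >       else
> >          print*,'KSP converged in ',its,' iterations'
> >       end if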
> > 
> >   Barry
> > 
> > 
> > > 
> > > Thanks
> > > 
> > > Yaxiong
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Yaxiong Chen, 
> > > Ph.D. Student 
> > > 
> > > School of Mechanical Engineering, 3171 
> > > 585 Purdue Mall
> > > West Lafayette, IN 47907
> > > 
> > > 
> > > 
> > > 
> > > 
> > > From: Smith, Barry F. <bsm...@mcs.anl.gov>
> > > Sent: Monday, February 4, 2019 10:42 PM
> > > To: Yaxiong Chen
> > > Cc: PETSc users list
> > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > >  
> > > 
> > > 
> > > > On Feb 4, 2019, at 4:41 PM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> > > > 
> > > >   Hi Barry,
> > > > 
> > > > !===================================================
> > > >       mystart = rank + 1                                ! rank starts from 0
> > > >       do i = mystart, nelem, nproc                      ! nelem: total number of elements; nproc: number of processes
> > > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)   ! generate elemental matrix Ae and corresponding global idx
> > > >             ne = size(idx)
> > > >             idx = idx - 1                               ! -1 since PETSc indices start from zero
> > > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx,auxRHSe,ADD_VALUES,ierr)   ! add elemental RHS to global RHS
> > > >             call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)                          ! add elemental matrix to global matrix
> > > >       end do
> > > > !===================================================
> > > >   Maybe this is where I am wrong. The way I use MPI is to let each core 
> > > > generate the elemental matrices in turn.
> > > 
> > >    This is a very bad strategy because there is no data locality. 
> > > 
> > > > This means I have one global matrix on each core and they are finally added 
> > > > together. My case is similar to a typical finite element method, but the 
> > > > problem is that the index is not contiguous, so I don't know how I can 
> > > > partition the global matrix. Do you have any suggestions, or do you 
> > > > have any template that shows how a finite element method uses PETSc?
> > > 
> > >   What you need to do is partition the elements across the processors (so 
> > > that each process has a contiguous subdomain of elements). Then each 
> > > process computes the element stiffness for "its" elements. There really 
> > > isn't a single PETSc example that manages this all directly for finite 
> > > elements, because doing it so as to get good performance is a rather 
> > > involved process.
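> > > 
> > >    A minimal sketch of what your loop could look like with a contiguous 
> > > block of elements per process (this only helps if the element numbering 
> > > roughly follows the DOF numbering, otherwise you first need a real mesh 
> > > partitioner; nlocal and myend are extra PetscInt variables):
> > > 
> > >       nlocal  = nelem/nproc
> > >       mystart = rank*nlocal + 1
> > >       myend   = mystart + nlocal - 1
> > >       if (rank == nproc-1) myend = nelem     ! last rank picks up the remainder
> > >       do i = mystart, myend                  ! contiguous block instead of the stride-nproc loop
> > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> > >             ne = size(idx)
> > >             if (allocated(auxRHSe)) call VecSetValues(bvec,ne,idx-1,auxRHSe,ADD_VALUES,ierr)
> > >             call MatSetValues(Amat,ne,idx-1,ne,idx-1,Ae,ADD_VALUES,ierr)
> > >       end do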
> > > 
> > >    Depending on how specialized your problem needs to be, you might 
> > > consider using one of the packages libMesh, MOOSE, or deal.ii to manage 
> > > the elements and element computations (they all use PETSc for the 
> > > algebraic solvers) instead of doing it yourself; it is an involved 
> > > process to do it all yourself.
> > > 
> > >    Barry
> > > 
> > > > 
> > > > Thanks
> > > > 
> > > > Yaxiong
> > > > 
> > > > 
> > > > From: Smith, Barry F. <bsm...@mcs.anl.gov>
> > > > Sent: Monday, February 4, 2019 5:21 PM
> > > > To: Yaxiong Chen
> > > > Cc: Mark Adams; PETSc users list
> > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > >  
> > > > 
> > > > 
> > > > > On Feb 4, 2019, at 3:17 PM, Yaxiong Chen via petsc-users 
> > > > > <petsc-users@mcs.anl.gov> wrote:
> > > > > 
> > > > > Hi Mark,
> > > > >      Will the parameters in MatMPIAIJSetPreallocation influence the 
> > > > > following part?
> > > > >       do i=mystart,nelem,nproc
> > > > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> > > > >             ne=size(idx)
> > > > >             idx=idx-1           ! -1 since PETSc indices start from zero
> > > > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx,auxRHSe,ADD_VALUES,ierr)
> > > > >             call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)
> > > > >       end do
> > > > > I found this part impedes my assembling process. In my case, the 
> > > > > total DOF is 20800. I estimated the upper bound on the number of nonzero 
> > > > > entries in each row as 594, so I set f9 to 20206 and f6 to 
> > > > > 10103 in the following command:
> > > > >  call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > > > 
> > > > > Running the code in sequential mode with -info, I got the following 
> > > > > information, and I can get the result even though the assembling 
> > > > > process is kind of slow (several times slower than using MUMPS).
> > > > > [0] MatAssemblyBegin_MPIBAIJ(): Stash has 0 entries,uses 0 mallocs.
> > > > > [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 
> > > > > mallocs.
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 20800, block size 
> > > > > 1; storage space: 13847 unneeded, 1030038 used
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues 
> > > > > is 62659
> > > > 
> > > > 
> > > >    Something is very wrong. The number of mallocs should be zero if you 
> > > > have the correct preallocation. Are you calling MatZeroEntries() or 
> > > > some other Mat routine before you call MatSetValues()?
> > > > 
> > > > 
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 70
> > > > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 
> > > > > 0)/(num_localrows 20800) < 0.6. Do not use CompressedRow routines.
> > > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 
> > > > > 4462889472 140509341618528
> > > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 
> > > > > 4462889472 140509341618528
> > > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 
> > > > > 4462889472 140509341618528
> > > > > [0] VecScatterCreate_Seq(): Special case: sequential copy
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 0, block size 1; 
> > > > > storage space: 104000 unneeded, 0 used
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues 
> > > > > is 0
> > > > > [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 0
> > > > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 
> > > > > 20800)/(num_localrows 20800) > 0.6. Use CompressedRow routines.
> > > > >  aseembel time   282.54871800000001     
> > > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 
> > > > > 4462888960 140509341442256
> > > > > [0] PCSetUp(): Setting up PC for first time
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > [0] PCSetUp(): Leaving PC with identical preconditioner since 
> > > > > operator is unchanged
> > > > > 
> > > > > However, when I use the parallel mode, I get the following information:
> > > > > [0] MatAssemblyBegin_MPIBAIJ(): Stash has 707965 entries,uses 6 
> > > > > mallocs.
> > > > 
> > > >    You have a lot of off-process MatSetValues(). That is, one process is 
> > > > generating a lot of matrix entries that belong on another process. This 
> > > > is not desirable. Ideally each process will generate the matrix entries 
> > > > that belong to that process, so only a few matrix entries need to be 
> > > > transported to another process. How are you partitioning the mesh, and 
> > > > how are you deciding which process computes which entries in the 
> > > > matrix? All of this may need to be revisited.
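> > > > 
> > > >   A quick way to quantify this from the code (a sketch; nOff is an 
> > > > extra PetscInt counter) is to count, inside the element loop, how many 
> > > > generated rows fall outside the range returned by MatGetOwnershipRange():
> > > > 
> > > >       call MatGetOwnershipRange(Amat, Istart, Iend, ierr)   ! rows Istart..Iend-1 (0-based) live on this process
> > > >       nOff = 0
> > > >       do j = 1, ne
> > > >          if (idx(j)-1 < Istart .or. idx(j)-1 >= Iend) nOff = nOff + 1
> > > >       end do
> > > >       print*,'rank ',rank,' off-process rows for this element: ',nOff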
> > > > 
> > > >   Barry
> > > > 
> > > > 
> > > > 
> > > > > [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 
> > > > > mallocs.
> > > > > 
> > > > > It seems it never gets past the call to 
> > > > > MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr).
> > > > > Is there anything I am doing wrong?
> > > > > Thanks
> > > > > 
> > > > > Yaxiong Chen, 
> > > > > Ph.D. Student 
> > > > > 
> > > > > School of Mechanical Engineering, 3171 
> > > > > 585 Purdue Mall
> > > > > West Lafayette, IN 47907
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > From: Mark Adams <mfad...@lbl.gov>
> > > > > Sent: Tuesday, January 29, 2019 10:02 PM
> > > > > To: Yaxiong Chen; PETSc users list
> > > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > > >  
> > > > > Optimized is a configuration flag, not a version.
> > > > > 
> > > > > You need to figure out the number of non-zeros per row of your global 
> > > > > matrix, or a bound on it, and supply that to 
> > > > > MatMPIAIJSetPreallocation. Otherwise it has to allocate and copy 
> > > > > memory often.
> > > > > 
> > > > > You could increase your f9 on a serial run and see what runs best, and 
> > > > > then move to parallel with a value of f6 of about 1/2 of f9.
> > > > > 
> > > > > On Tue, Jan 29, 2019 at 9:13 PM Yaxiong Chen <chen2...@purdue.edu> 
> > > > > wrote:
> > > > > Thanks Mark, 
> > > > >     I use PETSc 3.9.4; is this the optimized version you mean?
> > > > >     Actually, f9 and f6 are from the PETSc example. I don't know how 
> > > > > to set the values correctly, so I commented them out. The size of my 
> > > > > elemental matrix may vary: for a 2D problem, the elemental 
> > > > > matrix can be 24*24, 32*32, or some other size. And the indices are 
> > > > > not contiguous. In this case, the elemental matrices may interlace with 
> > > > > each other in the global matrix, and I may have thousands of 
> > > > > elemental matrices to be assembled. Is preallocation suitable for 
> > > > > this?
> > > > > 
> > > > > Yaxiong Chen, 
> > > > > Ph.D. Student 
> > > > > 
> > > > > School of Mechanical Engineering, 3171 
> > > > > 585 Purdue Mall
> > > > > West Lafayette, IN 47907
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > From: Mark Adams <mfad...@lbl.gov>
> > > > > Sent: Tuesday, January 29, 2019 8:25 PM
> > > > > To: Yaxiong Chen
> > > > > Cc: Song Gao; petsc-users@mcs.anl.gov
> > > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > > >  
> > > > > Slow assembly is often from not preallocating correctly. I am 
> > > > > guessing that you are using Q1 elements and f9==9, and thus the 
> > > > > preallocation should be OK if this is a scalar problem on a regular 
> > > > > grid, and f6==6 should be OK for the off-processor allocation, if my 
> > > > > assumptions are correct.
> > > > > 
> > > > > You can run with -info, which will tell you how many allocations were 
> > > > > done in assembly. Make sure that it is small (e.g., 0).
> > > > > 
> > > > > I see you use f90 array stuff ('idx-1'). Compilers can sometimes do 
> > > > > crazy things with seemingly simple code. You could just do this 
> > > > > manually if you can't find anything else.
> > > > > 
> > > > > And I trust you are running an optimized version of PETSc.
> > > > > 
> > > > > 
> > > > > On Tue, Jan 29, 2019 at 6:22 PM Yaxiong Chen via petsc-users 
> > > > > <petsc-users@mcs.anl.gov> wrote:
> > > > > Hi Song,
> > > > > I don't quite understand how I can use this command. I don't 
> > > > > partition the global matrix. If I add my elemental matrix to the 
> > > > > global system, it will be like this. In my parallel part, I use 
> > > > > each core to generate the elemental matrices in turn. In this case, I 
> > > > > guess each core will be assigned the space for the global matrix, and 
> > > > > they are finally assembled. But according to the manual, it seems each 
> > > > > core will store a part of the global matrix. Is the local submatrix in 
> > > > > MatMPIAIJSetPreallocation(Mat B, PetscInt d_nz, const PetscInt 
> > > > > d_nnz[], PetscInt o_nz, const PetscInt o_nnz[]) the same as my 
> > > > > elemental matrix? 
> > > > > <pastedImage.png>
> > > > > 
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > Yaxiong Chen, 
> > > > > Ph.D. Student 
> > > > > 
> > > > > School of Mechanical Engineering, 3171 
> > > > > 585 Purdue Mall
> > > > > West Lafayette, IN 47907
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > From: Song Gao <song.g...@mail.mcgill.ca>
> > > > > Sent: Tuesday, January 29, 2019 1:22 PM
> > > > > To: Yaxiong Chen
> > > > > Cc: Matthew Knepley; petsc-users@mcs.anl.gov
> > > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > > >  
> > > > > I think you would prefer to preallocate the matrix
> > > > > 
> > > > > uncomment this line: 
> > > > > ! call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > > > 
> > > > > 
> > > > > 
> > > > > Le mar. 29 janv. 2019, à 12 h 40, Yaxiong Chen via petsc-users 
> > > > > <petsc-users@mcs.anl.gov> a écrit :
> > > > > Hello,
> > > > > 
> > > > > I have a 2D system which is assembled from elemental matrices. Ae 
> > > > > is my elemental matrix, auxRHSe(:) and RHSe(:) are the corresponding 
> > > > > right-hand sides, and idx is the global index. My code is as follows; 
> > > > > however, the assembling rate is super slow (marked red in the code). I 
> > > > > am not sure whether the assembling approach is right or not, since for 
> > > > > each element the idx values are not contiguous. Do you have any idea 
> > > > > what a better way to assemble the matrix would be?
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > block
> > > > >     PetscErrorCode ierr
> > > > >      PetscMPIInt rank,nproc, mystart
> > > > >      PetscInt  nelem
> > > > >      integer,allocatable  ::info(:)
> > > > >      real(wp), allocatable :: Ae(:,:), auxRHSe(:),RHSe(:)
> > > > >      integer, allocatable      :: idx(:)
> > > > >      PetscScalar, pointer :: xx_v(:)
> > > > >   
> > > > >      PC               prec
> > > > >      PetscScalar ::   val
> > > > >      Vec              xvec,bvec,uvec
> > > > >      Mat              Amat
> > > > >      KSP              ksp
> > > > >      PetscViewer viewer
> > > > >      PetscInt  geq,j,k,ne,M,Istart,Iend
> > > > >      PetscBool  flg
> > > > >      KSPConvergedReason reason
> > > > >      Vec         dummyVec, dummyVecs(1)
> > > > >      MatNullSpace nullspace
> > > > > 
> > > > >      call PetscInitialize( PETSC_NULL_CHARACTER, ierr )
> > > > > 
> > > > >      if (ierr .ne. 0) then
> > > > >      print*,'Unable to initialize PETSc'
> > > > >      stop
> > > > >      endif
> > > > >      call MPI_Comm_size(PETSC_COMM_WORLD,nproc,ierr)
> > > > >      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
> > > > >      mystart=rank+1
> > > > > 
> > > > >  !   Parameter set
> > > > >      info=ptsystem%getInitInfo()
> > > > >      nelem = info(Info_EleMatNum)+info(Info_FixedDOFNum)+info(Info_NumConstrain)
> > > > >      print*,'nelem',nelem
> > > > >  !-------------------------------------
> > > > >  !  Create Matrix
> > > > >      call MatCreate(PETSC_COMM_WORLD,Amat,ierr)
> > > > >      call MatSetSizes( Amat,PETSC_DECIDE, PETSC_DECIDE, info(1), info(1), ierr )
> > > > >      call MatSetType( Amat, MATMPIBAIJ, ierr )
> > > > >      ! call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > > >      call MatSetFromOptions( Amat, ierr )
> > > > >      call MatSetUp( Amat, ierr )
> > > > >      call MatGetOwnershipRange( Amat, Istart, Iend, ierr )
> > > > >  
> > > > >       xvec = tVec(0)
> > > > >       call MatCreateVecs( Amat, PETSC_NULL_VEC, xvec, ierr )
> > > > >       call VecSetFromOptions( xvec, ierr )
> > > > >       call VecDuplicate( xvec, bvec, ierr )
> > > > >       call VecDuplicate( xvec, uvec, ierr )
> > > > > 
> > > > >    t1 = MPI_WTime(); 
> > > > > 
> > > > >       do i=mystart,nelem,nproc
> > > > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> > > > >             ne=size(idx)
> > > > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx-1,auxRHSe,ADD_VALUES,ierr)
> > > > >             call MatSetValues(Amat,ne,idx-1,ne,idx-1,Ae,ADD_VALUES,ierr)
> > > > >       end do
> > > > > 
> > > > >       nelem   = info(Info_EleRHSNum)
> > > > >       mystart = rank+1
> > > > > 
> > > > >       do i = mystart, nelem, nproc  
> > > > >         call ptSystem%getElementalRHS(i, RHSe, idx)
> > > > >         print*,'idx',idx
> > > > >         ne=size(idx)
> > > > >         if (allocated(RHSe))  call VecSetValues(bvec,ne,idx-1,RHSe,ADD_VALUES,ierr)
> > > > >       end do 
> > > > >       call MatAssemblyBegin(Amat,MAT_FINAL_ASSEMBLY,ierr)
> > > > >       call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)    ! this part is slow: the loop above finishes, but here it may get stuck
> > > > >       call VecAssemblyBegin(bvec,ierr)                     ! for a 2500 DOF system, assembling only takes over 2 seconds
> > > > >       call VecAssemblyEnd(bvec,ierr)                       ! but for a 10000 DOF system, it gets stuck
> > > > >      t2 = MPI_WTime(); 
> > > > >       print*,'assembeling time',t2-t1
> > > > >       ! Solve
> > > > >       call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
> > > > > !  Set operators. Here the matrix that defines the linear system
> > > > > !  also serves as the preconditioning matrix.
> > > > >       call KSPSetOperators(ksp,Amat,Amat,ierr)
> > > > >       call KSPSetFromOptions(ksp,ierr)
> > > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > > > > !                      Solve the linear system
> > > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > > > >     call KSPSetType(ksp,KSPCG,ierr)
> > > > >     call KSPGetPC(ksp,prec,ierr)
> > > > >    ! call KSPSetPCSide(ksp,PC_SYMMETRIC,ierr)
> > > > >     call PCSetType(prec,PCJACOBI,ierr)
> > > > >      call KSPSolve(ksp,bvec,xvec,ierr)
> > > > >      call PetscFinalize(ierr)
> > > > > 
> > > > > end block
> > > > > 
> > > > > 
> > > > > Yaxiong Chen, 
> > > > > Ph.D. Student 
> > > > > 
> > > > > School of Mechanical Engineering, 3171 
> > > > > 585 Purdue Mall
> > > > > West Lafayette, IN 47907
