Since MUMPS and ScaLAPACK are already installed on my computer, I only ran 
./configure with --download-superlu_dist.

After everything was done, I received the following error:

dyld: lazy symbol binding failed: Symbol not found: 
_MatSolverTypeRegister_SuperLU_DIST

  Referenced from: 
/Users/yaxiong/Downloads/petsc-3.9.4/arch-darwin-c-debug/lib/libpetsc.3.9.dylib

  Expected in: flat namespace


dyld: Symbol not found: _MatSolverTypeRegister_SuperLU_DIST

  Referenced from: 
/Users/yaxiong/Downloads/petsc-3.9.4/arch-darwin-c-debug/lib/libpetsc.3.9.dylib

  Expected in: flat namespace


[0]PETSC ERROR: 
------------------------------------------------------------------------

[0]PETSC ERROR: Caught signal number 5 TRAP

[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger

[0]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind

[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
find memory corruption errors

[0]PETSC ERROR: likely location of problem given in stack below

[0]PETSC ERROR: ---------------------  Stack Frames 
------------------------------------

[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,

[0]PETSC ERROR:       INSTEAD the line number of the start of the function

[0]PETSC ERROR:       is given.

[0]PETSC ERROR: [0] MatInitializePackage line 150 
/Users/yaxiong/Downloads/petsc-3.9.4/src/mat/interface/dlregismat.c

[0]PETSC ERROR: [0] MatCreate line 83 
/Users/yaxiong/Downloads/petsc-3.9.4/src/mat/utils/gcreate.c

[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------

[0]PETSC ERROR: Signal received


________________________________
From: Smith, Barry F. <bsm...@mcs.anl.gov>
Sent: Tuesday, February 5, 2019 10:58 PM
To: Yaxiong Chen
Cc: petsc-users@mcs.anl.gov
Subject: Re: [petsc-users] PETSC matrix assembling super slow


  Run ./configure with --download-superlu_dist --download-mumps 
--download-scalapack and then you can use either -pc_factor_mat_solver_type 
superlu_dist or -pc_factor_mat_solver_type mumps

  Good luck
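
For reference, a minimal sketch of how these pieces fit together (the executable name ./my_app is a placeholder, not from this thread; configure prints the exact make command to run afterwards):

    ./configure --download-superlu_dist --download-mumps --download-scalapack
    make all            # rebuild the PETSc libraries, then relink your application
    mpiexec -n 2 ./my_app -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps

Here -pc_type lu (or cholesky) selects a direct factorization, and -pc_factor_mat_solver_type chooses which package (mumps or superlu_dist) performs it.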


> On Feb 5, 2019, at 9:29 PM, Yaxiong Chen <chen2...@purdue.edu> wrote:
>
> > Also, I found the solving time is also shorter when I use the direct 
> > solver (0.432 s vs 4.332 s). Is this due to the small scale of the system? 
> > When I have a very large (e.g., 100000*100000) system, can I expect the 
> > iterative solver to be faster?
>
> <<   It sounds like the default preconditioner is not working well on your 
> problem. First run with -ksp_monitor -ksp_converged_reason to get an idea of 
> how quickly the iterative solver is converging. If the number of iterations 
> is high you are going to need to change the preconditioner to get one that 
> converges well for your problem. What mathematical models is your code 
> implementing? This will help in determining what type of preconditioner to 
> use.
>
>  << Barry
>
>
> It seems I can only work with the Jacobi preconditioner. When I try LU or 
> Cholesky, I get the error:
> [0]PETSC ERROR: See 
> http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for 
> possible LU and Cholesky solvers
> [0]PETSC ERROR: Could not locate a solver package. Perhaps you must 
> ./configure with --download-<package>
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.9.4, Sep, 11, 2018
> [0]PETSC ERROR: ./optimal_mechanical_part.hdc on a arch-darwin-c-debug named 
> hidac.ecn.purdue.edu by chen2018 Tue Feb  5 22:19:08 2019
> [0]PETSC ERROR: Configure options
> [0]PETSC ERROR: #1 MatGetFactor() line 4328 in 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/mat/interface/matrix.c
> [0]PETSC ERROR: #2 PCSetUp_Cholesky() line 86 in 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/pc/impls/factor/cholesky/cholesky.c
> [0]PETSC ERROR: #3 PCSetUp() line 923 in 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: #4 KSPSetUp() line 381 in 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: #5 KSPSolve() line 612 in 
> /Users/yaxiong/Downloads/petsc-3.9.4/src/ksp/ksp/interface/itfunc.c
>  solve time   3.8589999999985025E-003
>  reason           0
>  It asks me to download the package. Should I run the command in the PETSc 
> folder? I am not sure what the name of the package should be. I tried 
> ./configure --download-<PCLU> but it does not work.
>
> From: Smith, Barry F. <bsm...@mcs.anl.gov>
> Sent: Tuesday, February 5, 2019 1:18 PM
> To: Yaxiong Chen
> Cc: PETSc users list
> Subject: Re: [petsc-users] PETSC matrix assembling super slow
>
>
>
> > On Feb 5, 2019, at 10:32 AM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> >
> > Thanks Barry,
> >      I will explore how to partition for parallel computation later. But 
> > now I still have some confusion about the sequential operation.
> > I compared PETSc and MUMPS. In both cases, the subroutine for generating 
> > the elemental matrix is very similar. However, the difference is in the 
> > following step:
> >    call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)
> > In this step, each element costs about 3~4e-2 seconds.
> >
> > However, when I use MUMPS, I use the following code:
> >    preA(Aptr+1:Aptr+n*(n+1)/2) = pack(Ae, mask(1:n,1:n))
> >    Aptr = Aptr +n*(n+1)/2
> >    if (allocated(auxRHSe)) preRHS(idx) = preRHS(idx)+auxRHSe
> >  It just costs 10e-6~10e-5 seconds. For a 10000*10000 matrix, the assembling 
> > time when using PETSc is 300 s, while it costs 60 s when using MUMPS.
> > Is there any way I can make it faster?
>
>   As we keep saying the slowness of the matrix assembly is due to incorrect 
> preallocation. Once you have the preallocation correct the speed should 
> increase dramatically.
>
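As an illustration of what "correct preallocation" can look like, here is a rough sketch (not the poster's code): it assumes the same Fortran environment as the program quoted further down, and the helper countRowEntries is hypothetical, standing in for whatever connectivity information gives the nonzero count of each row.

      ! Sketch: per-row preallocation for a MATMPIAIJ matrix. Call this after
      ! MatSetSizes/MatSetType and before inserting values.
      PetscInt, allocatable :: d_nnz(:), o_nnz(:)
      PetscInt :: nglobal, mlocal, r0, row, prank, psize

      prank   = rank                      ! copy the MPI ints into PetscInt kind
      psize   = nproc
      nglobal = info(1)                   ! global number of rows, as in the code below
      ! the row split PETSC_DECIDE produces: the first mod(nglobal,psize) ranks get one extra row
      mlocal  = nglobal/psize
      if (prank < mod(nglobal, psize)) mlocal = mlocal + 1
      r0 = prank*(nglobal/psize) + min(prank, mod(nglobal, psize))   ! first locally owned global row
      allocate(d_nnz(mlocal), o_nnz(mlocal))
      do row = 1, mlocal
         ! hypothetical helper: nonzeros of global row r0+row-1 that couple to locally
         ! owned columns (d_nnz) and to columns owned by other processes (o_nnz)
         call countRowEntries(r0 + row - 1, d_nnz(row), o_nnz(row))
      end do
      call MatMPIAIJSetPreallocation(Amat, PETSC_DEFAULT_INTEGER, d_nnz, &
                                     PETSC_DEFAULT_INTEGER, o_nnz, ierr)
      deallocate(d_nnz, o_nnz)

With d_nnz/o_nnz filled in correctly, the "Number of mallocs during MatSetValues" reported by -info should drop to 0.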
> > Also, I found the solving time is also shorter when I use the direct 
> > solver (0.432 s vs 4.332 s). Is this due to the small scale of the system? 
> > When I have a very large (e.g., 100000*100000) system, can I expect the 
> > iterative solver to be faster?
>
>    It sounds like the default preconditioner is not working well on your 
> problem. First run with -ksp_monitor -ksp_converged_reason to get an idea of 
> how quickly the iterative solver is converging. If the number of iterations 
> is high you are going to need to change the preconditioner to get one that 
> converges well for your problem. What mathematical models is your code 
> implementing? This will help in determining what type of preconditioner to 
> use.
>
>   Barry
>
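To see what the iterative solver is doing, the run line might look like this (./my_app is again a placeholder for your executable):

    mpiexec -n 4 ./my_app -ksp_monitor -ksp_converged_reason -ksp_view

-ksp_monitor prints the residual norm at each iteration, -ksp_converged_reason reports why the solve stopped, and -ksp_view shows which KSP/PC combination was actually used.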
>
> >
> > Thanks
> >
> > Yaxiong
> >
> >
> >
> >
> >
> > Yaxiong Chen,
> > Ph.D. Student
> >
> > School of Mechanical Engineering, 3171
> > 585 Purdue Mall
> > West Lafayette, IN 47907
> >
> >
> >
> >
> >
> > From: Smith, Barry F. <bsm...@mcs.anl.gov>
> > Sent: Monday, February 4, 2019 10:42 PM
> > To: Yaxiong Chen
> > Cc: PETSc users list
> > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> >
> >
> >
> > > On Feb 4, 2019, at 4:41 PM, Yaxiong Chen <chen2...@purdue.edu> wrote:
> > >
> > >   Hi Barry,
> > >
> > >  !===================================================
> > >       mystart = rank + 1                          ! rank starts from 0
> > >       do i = mystart, nelem, nproc                ! nelem: total number of elements; nproc: number of processes
> > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)       ! generate elemental matrix Ae and corresponding global idx
> > >             ne = size(idx)
> > >             idx = idx - 1                         ! -1 since PETSc indexing starts from zero
> > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx,auxRHSe,ADD_VALUES,ierr)   ! add elemental RHS to global RHS
> > >             call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)  ! add elemental matrix to global matrix
> > >       end do
> > > !===================================================
> > >   Maybe this is where I am wrong. The way I use MPI is to let each core 
> > > generate the elemental matrix in turn.
> >
> >    This is very bad strategy because there is no data locality.
> >
> > > Which means I have one global matrix on each core and finally add them 
> > > together. My case is similar to a typical finite element method. But the 
> > > problem is that the index is not continuous, so in this case I don't know 
> > > how I can partition the global matrix. Do you have any suggestions, or do 
> > > you have any template which can show me how the finite element method 
> > > uses PETSc?
> >
> >   What you need to do is partition the elements across the processors (so 
> > that each process has a contiguous subdomain of elements). Then each 
> > process computes the element stiffness for "its elements". There really 
> > isn't a single PETSc example that manages this all directly for finite 
> > elements because it is a rather involved process to do it so as to get 
> > good performance.
> >
> >    Depending on how specialized your problem needs to be you might consider 
> > using one of the packages libMesh, MOOSE or deal.ii to manage the elements 
> > and element computations (they all use PETSc for the algebraic solvers) 
> > instead of doing it yourself; it is an involved process to do it all 
> > yourself.
> >
> >    Barry
> >
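A minimal sketch of the contiguous-block partitioning Barry describes, reusing the names from the loop quoted above (this is only a sketch: it simply splits the element range into consecutive chunks, which helps only if consecutive elements also touch mostly consecutive global DOFs; otherwise a real mesh partitioner or renumbering step is still needed):

      ! Sketch: give each rank a contiguous block of elements instead of the
      ! strided  do i = mystart, nelem, nproc  distribution.
      PetscInt :: eStart, eEnd, chunk
      chunk  = (nelem + nproc - 1)/nproc          ! elements per process, rounded up
      eStart = rank*chunk + 1
      eEnd   = min(eStart + chunk - 1, nelem)
      do i = eStart, eEnd
            call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
            ne  = size(idx)
            idx = idx - 1
            if (allocated(auxRHSe)) call VecSetValues(bvec, ne, idx, auxRHSe, ADD_VALUES, ierr)
            call MatSetValues(Amat, ne, idx, ne, idx, Ae, ADD_VALUES, ierr)
      end do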
> > >
> > > Thanks
> > >
> > > Yaxiong
> > >
> > >
> > > From: Smith, Barry F. <bsm...@mcs.anl.gov>
> > > Sent: Monday, February 4, 2019 5:21 PM
> > > To: Yaxiong Chen
> > > Cc: Mark Adams; PETSc users list
> > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > >
> > >
> > >
> > > > On Feb 4, 2019, at 3:17 PM, Yaxiong Chen via petsc-users 
> > > > <petsc-users@mcs.anl.gov> wrote:
> > > >
> > > > Hi Mark,
> > > >      Will the parameters of MatMPIAIJSetPreallocation influence the 
> > > > following part?
> > > >       do i=mystart,nelem,nproc
> > > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> > > >             ne=size(idx)
> > > >             idx=idx-1           !-1 since PETSC index starts from zero
> > > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx,auxRHSe,ADD_VALUES,ierr)
> > > >             call MatSetValues(Amat,ne,idx,ne,idx,Ae,ADD_VALUES,ierr)
> > > >       end do
> > > > I found this part impedes my assembling process. In my case, the 
> > > > total DOF is 20800. I estimated the upper bound of the number of nonzero 
> > > > entries in each row as 594. So I set f9 to be 20206 and f6 to be 10103 
> > > > in the following command:
> > > >  call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > >
> > > > Running the code in sequential mode with -info, I got the following 
> > > > information. And I can get the result, even though the assembling 
> > > > process is kind of slow (several times slower than using MUMPS).
> > > > [0] MatAssemblyBegin_MPIBAIJ(): Stash has 0 entries,uses 0 mallocs.
> > > > [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 
> > > > mallocs.
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 20800, block size 1; 
> > > > storage space: 13847 unneeded, 1030038 used
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 
> > > > 62659
> > >
> > >
> > >    Something is very wrong. The number of mallocs should be zero if you 
> > > have the correct preallocation. Are you calling MatZeroEntries() or some 
> > > other Mat routine before you call MatSetValues() ?
> > >
> > >
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 70
> > > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 
> > > > 0)/(num_localrows 20800) < 0.6. Do not use CompressedRow routines.
> > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472 
> > > > 140509341618528
> > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472 
> > > > 140509341618528
> > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 4462889472 
> > > > 140509341618528
> > > > [0] VecScatterCreate_Seq(): Special case: sequential copy
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Matrix size: 20800 X 0, block size 1; 
> > > > storage space: 104000 unneeded, 0 used
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 0
> > > > [0] MatAssemblyEnd_SeqBAIJ(): Most nonzeros blocks in any row is 0
> > > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows 
> > > > 20800)/(num_localrows 20800) > 0.6. Use CompressedRow routines.
> > > >  aseembel time   282.54871800000001
> > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 4462888960 
> > > > 140509341442256
> > > > [0] PCSetUp(): Setting up PC for first time
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > > [0] PCSetUp(): Leaving PC with identical preconditioner since operator 
> > > > is unchanged
> > > >
> > > > However, when I use the parallel mode, I get the following information:
> > > > [0] MatAssemblyBegin_MPIBAIJ(): Stash has 707965 entries,uses 6 mallocs.
> > >
> > >    You have a lot of off-process MatSetValues(). That is, one process is 
> > > generating a lot of matrix entries that belong on another process. This 
> > > is not desirable. Ideally each process will generate the matrix entries 
> > > that belong to that process so only a few matrix entries need to be 
> > > transported to another process. How are you partitioning the mesh, and how 
> > > are you deciding which process computes which entries in the 
> > > matrix? All of this may need to be revisited.
> > >
> > >   Barry
> > >
> > >
> > >
> > > > [0] MatAssemblyBegin_MPIBAIJ(): Block-Stash has 0 entries, uses 0 
> > > > mallocs.
> > > >
> > > > It seems it never gets to  call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)
> > > > Is there anything I am doing wrong?
> > > > Thanks
> > > >
> > > > Yaxiong Chen,
> > > > Ph.D. Student
> > > >
> > > > School of Mechanical Engineering, 3171
> > > > 585 Purdue Mall
> > > > West Lafayette, IN 47907
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > From: Mark Adams <mfad...@lbl.gov>
> > > > Sent: Tuesday, January 29, 2019 10:02 PM
> > > > To: Yaxiong Chen; PETSc users list
> > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > >
> > > > Optimized is a configuration flag, not a version.
> > > >
> > > > You need to figure out the number of non-zeros per row of your global 
> > > > matrix, or a bound on it, and supply that in MatMPIAIJSetPreallocation. 
> > > > Otherwise it has to allocate and copy memory often.
> > > >
> > > > You could increase your f9 on a serial run and see what runs best and 
> > > > then  move to parallel with a value in f6 of about 1/2 of f9.
> > > >
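A sketch of the simplest variant Mark describes, using scalar per-row bounds (the value 594 is the per-row estimate mentioned elsewhere in this thread; halving it for f6 follows Mark's rule of thumb and is only a starting guess):

      PetscInt :: f9, f6
      f9 = 594          ! upper bound on nonzeros per row in the diagonal (local) block
      f6 = 297          ! upper bound per row for the off-diagonal (off-process) block
      call MatMPIAIJSetPreallocation(Amat, f9, PETSC_NULL_INTEGER, f6, PETSC_NULL_INTEGER, ierr)

Over-estimating the bounds wastes some memory but avoids the repeated reallocation that makes assembly slow.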
> > > > On Tue, Jan 29, 2019 at 9:13 PM Yaxiong Chen <chen2...@purdue.edu> 
> > > > wrote:
> > > > Thanks Mark,
> > > >     I use PETSc 3.9.4; is this the optimized version you mean?
> > > >     Actually f9 and f6 are from the PETSc example. I don't know how to 
> > > > set the values correctly, so I commented them out. The size of my elemental 
> > > > matrix may vary. For a 2D problem, the size of the elemental matrix can be 
> > > > 24*24 or 32*32 or some other sizes. And the index is not continuous. In 
> > > > this case, the elemental matrices may interlace with each other in the 
> > > > global matrix, and I may have thousands of elemental matrices to be 
> > > > assembled. Is preallocation suitable for this?
> > > >
> > > > Yaxiong Chen,
> > > > Ph.D. Student
> > > >
> > > > School of Mechanical Engineering, 3171
> > > > 585 Purdue Mall
> > > > West Lafayette, IN 47907
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > From: Mark Adams <mfad...@lbl.gov>
> > > > Sent: Tuesday, January 29, 2019 8:25 PM
> > > > To: Yaxiong Chen
> > > > Cc: Song Gao; petsc-users@mcs.anl.gov
> > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > >
> > > > Slow assembly is often from not preallocating correctly. I am guessing 
> > > > that you are using Q1 elements and f9==9, and thus the preallocation 
> > > > should be OK if this is a scalar problem on a regular grid, and f6==6 
> > > > should be OK for the off-processor allocation, if my assumptions are 
> > > > correct.
> > > >
> > > > You can run with -info, which will tell you how many allocations were 
> > > > done in assembly. Make sure that it is small (e.g., 0).
> > > >
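For example (./my_app is a placeholder), the relevant counters can be pulled out of the -info output like this:

    ./my_app -info | grep -i malloc

If the "Number of mallocs during MatSetValues" line is not 0, the preallocation is still off.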
> > > > I see you use f90 array stuff 'idx-1'. Compilers can sometimes do crazy 
> > > > things with seemingly simple code. You could just do this manually if you 
> > > > can find anything else.
> > > >
> > > > And I trust you are running an optimized version of PETSc.
> > > >
> > > >
> > > > On Tue, Jan 29, 2019 at 6:22 PM Yaxiong Chen via petsc-users 
> > > > <petsc-users@mcs.anl.gov> wrote:
> > > > Hi Song,
> > > > I don't quite understand how I can use this command. I don't partition 
> > > > the global matrix. If I add my elemental matrix to the global system, it 
> > > > will be like this. And in my parallel part, I use each core to generate 
> > > > the elemental matrix in turn. In this case, I guess each core will be 
> > > > assigned space for the global matrix, which is finally assembled. But 
> > > > according to the manual, it seems each core will store a part of the 
> > > > global matrix. Is the local submatrix in 
> > > > MatMPIAIJSetPreallocation(Mat B, PetscInt d_nz, const PetscInt 
> > > > d_nnz[], PetscInt o_nz, const PetscInt o_nnz[]) the same as my elemental 
> > > > matrix?
> > > > <pastedImage.png>
> > > >
> > > >
> > > > Thanks
> > > >
> > > > Yaxiong Chen,
> > > > Ph.D. Student
> > > >
> > > > School of Mechanical Engineering, 3171
> > > > 585 Purdue Mall
> > > > West Lafayette, IN 47907
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > From: Song Gao <song.g...@mail.mcgill.ca>
> > > > Sent: Tuesday, January 29, 2019 1:22 PM
> > > > To: Yaxiong Chen
> > > > Cc: Matthew Knepley; petsc-users@mcs.anl.gov
> > > > Subject: Re: [petsc-users] PETSC matrix assembling super slow
> > > >
> > > > I think you would prefer to preallocate the matrix
> > > >
> > > > uncomment this line
> > > > ! call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > >
> > > >
> > > >
> > > > On Tue, Jan 29, 2019 at 12:40 PM, Yaxiong Chen via petsc-users 
> > > > <petsc-users@mcs.anl.gov> wrote:
> > > > Hello,
> > > >
> > > > I have a 2D system which is assembled from elemental matrices. Ae is 
> > > > my elemental matrix, auxRHSe(:) and RHSe(:) are the corresponding right 
> > > > hand sides, and idx is the global index. My code is as follows; however, the 
> > > > assembling rate is super slow (marked in red in the code). I am not sure 
> > > > whether the assembling approach is right or not, since for each element 
> > > > the idx values are not continuous numbers. Do you have any idea what would be a 
> > > > better way to assemble the matrix?
> > > >
> > > > Thanks
> > > >
> > > > block
> > > >     PetscErrorCode ierr
> > > >      PetscMPIInt rank,nproc, mystart
> > > >      PetscInt  nelem
> > > >      integer,allocatable  ::info(:)
> > > >      real(wp), allocatable :: Ae(:,:), auxRHSe(:),RHSe(:)
> > > >      integer, allocatable      :: idx(:)
> > > >      PetscScalar, pointer :: xx_v(:)
> > > >
> > > >      PC               prec
> > > >      PetscScalar ::   val
> > > >      Vec              xvec,bvec,uvec
> > > >      Mat              Amat
> > > >      KSP              ksp
> > > >      PetscViewer viewer
> > > >      PetscInt  geq,j,k,ne,M,Istart,Iend
> > > >      PetscBool  flg
> > > >      KSPConvergedReason reason
> > > >      Vec         dummyVec, dummyVecs(1)
> > > >      MatNullSpace nullspace
> > > >
> > > >      call PetscInitialize( PETSC_NULL_CHARACTER, ierr )
> > > >
> > > >      if (ierr .ne. 0) then
> > > >      print*,'Unable to initialize PETSc'
> > > >      stop
> > > >      endif
> > > >      call MPI_Comm_size(PETSC_COMM_WORLD,nproc,ierr)
> > > >      call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr)
> > > >      mystart=rank+1
> > > >
> > > >  !   Parameter set
> > > >      info=ptsystem%getInitInfo()
> > > >      nelem = info(Info_EleMatNum)+info(Info_FixedDOFNum)+info(Info_NumConstrain)
> > > >      print*,'nelem',nelem
> > > >  !-------------------------------------
> > > >  !  Create Matrix
> > > >      call MatCreate(PETSC_COMM_WORLD,Amat,ierr)
> > > >      call MatSetSizes( Amat, PETSC_DECIDE, PETSC_DECIDE, info(1), info(1), ierr )
> > > >      call MatSetType( Amat, MATMPIBAIJ, ierr )
> > > >      ! call MatMPIAIJSetPreallocation(Amat,f9,PETSC_NULL_INTEGER,f6,PETSC_NULL_INTEGER, ierr)
> > > >      call MatSetFromOptions( Amat, ierr )
> > > >      call MatSetUp( Amat, ierr )
> > > >      call MatGetOwnershipRange( Amat, Istart, Iend, ierr )
> > > >
> > > >       xvec = tVec(0)
> > > >       call MatCreateVecs( Amat, PETSC_NULL_VEC, xvec, ierr )
> > > >       call VecSetFromOptions( xvec, ierr )
> > > >       call VecDuplicate( xvec, bvec, ierr )
> > > >       call VecDuplicate( xvec, uvec, ierr )
> > > >
> > > >    t1 = MPI_WTime();
> > > >
> > > >       do i=mystart,nelem,nproc
> > > >             call ptSystem%getElementalMAT(i, Ae, auxRHSe, idx)
> > > >             ne=size(idx)
> > > >             if (allocated(auxRHSe))  call VecSetValues(bvec,ne,idx-1,auxRHSe,ADD_VALUES,ierr)
> > > >             call MatSetValues(Amat,ne,idx-1,ne,idx-1,Ae,ADD_VALUES,ierr)
> > > >       end do
> > > >
> > > >       nelem   = info(Info_EleRHSNum)
> > > >       mystart = rank+1
> > > >
> > > >       do i = mystart, nelem, nproc
> > > >         call ptSystem%getElementalRHS(i, RHSe, idx)
> > > >         print*,'idx',idx
> > > >         ne=size(idx)
> > > >         if (allocated(RHSe))  call VecSetValues(bvec,ne,idx-1,RHSe,ADD_VALUES,ierr)
> > > >       end do
> > > >       call MatAssemblyBegin(Amat,MAT_FINAL_ASSEMBLY,ierr)
> > > >       call MatAssemblyEnd(Amat,MAT_FINAL_ASSEMBLY,ierr)   ! this part is slow: the loop above is done, but here it may get stuck
> > > >       call VecAssemblyBegin(bvec,ierr)                     ! for a 2500 DOF system, assembling only takes over 2 seconds
> > > >       call VecAssemblyEnd(bvec,ierr)                       ! but for a 10000 DOF system, it gets stuck
> > > >      t2 = MPI_WTime();
> > > >       print*,'assembeling time',t2-t1
> > > >       ! Solve
> > > >       call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
> > > > !  Set operators. Here the matrix that defines the linear system
> > > > !  also serves as the preconditioning matrix.
> > > >       call KSPSetOperators(ksp,Amat,Amat,ierr)
> > > >       call KSPSetFromOptions(ksp,ierr)
> > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > > > !                      Solve the linear system
> > > > ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > > >     call KSPSetType(ksp,KSPCG,ierr)
> > > >     call KSPGetPC(ksp,prec,ierr)
> > > >    ! call KSPSetPCSide(ksp,PC_SYMMETRIC,ierr)
> > > >     call PCSetType(prec,PCJACOBI,ierr)
> > > >      call KSPSolve(ksp,bvec,xvec,ierr)
> > > >      call PetscFinalize(ierr)
> > > >
> > > > end block
> > > >
> > > >
> > > > Yaxiong Chen,
> > > > Ph.D. Student
> > > >
> > > > School of Mechanical Engineering, 3171
> > > > 585 Purdue Mall
> > > > West Lafayette, IN 47907
