At first sight I am not convinced that this can be done better than looping over the elements to fill a std::vector< std::set<int> >, then filling a vector with each row size, then preallocating the AIJ matrix, and finally doing another loop that fills the matrix rows with zeros, ones, or garbage. All of this is about 10-20 lines of code using simple C++ STL containers and a few calls to PETSc.
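Roughly, the sketch I have in mind is something like the following (sequential case only, error checking omitted; the "elem_dofs" connectivity container is just a placeholder for whatever mesh data structure you actually have):

#include <petscmat.h>
#include <vector>
#include <set>

/* Sketch only: elem_dofs[e] holds the global dof indices of element e. */
Mat PreallocateAndZeroFEMatrix(PetscInt ndofs,
                               const std::vector< std::vector<PetscInt> >& elem_dofs)
{
  /* 1) loop over elements, collecting the columns coupled to each row */
  std::vector< std::set<PetscInt> > adj(ndofs);
  for (size_t e = 0; e < elem_dofs.size(); e++) {
    const std::vector<PetscInt>& dofs = elem_dofs[e];
    for (size_t a = 0; a < dofs.size(); a++)
      for (size_t b = 0; b < dofs.size(); b++)
        adj[dofs[a]].insert(dofs[b]);
  }

  /* 2) fill a vector with each row size */
  std::vector<PetscInt> nnz(ndofs);
  for (PetscInt i = 0; i < ndofs; i++)
    nnz[i] = (PetscInt) adj[i].size();

  /* 3) preallocate the AIJ matrix with the exact row sizes */
  Mat A;
  MatCreateSeqAIJ(PETSC_COMM_SELF, ndofs, ndofs, 0, &nnz[0], &A);

  /* 4) another loop filling the rows (here with zeros), so the nonzero
        pattern is already in place before the actual FE assembly */
  PetscScalar zero = 0.0;
  for (PetscInt i = 0; i < ndofs; i++) {
    std::set<PetscInt>::const_iterator it;
    for (it = adj[i].begin(); it != adj[i].end(); ++it) {
      PetscInt j = *it;
      MatSetValues(A, 1, &i, 1, &j, &zero, INSERT_VALUES);
    }
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  return A;
}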
If anyone has a better way, and can demonstrate it with actual timing numbers and some self-contained example code, then perhaps I can take the effort of adding something like this to PETSc. But I doubt that this effort is going to pay off ;-).

On 5/29/08, Billy Araújo <billy at dem.uminho.pt> wrote:
> Hi,
>
> I just want to share my experience with FE assembly.
> I think the problem of preallocation in finite element matrices is that you
> don't know how many elements are connected to a given node; there can be 5,
> 20 elements or more. You can build a structure with the number of nodes
> connected to a node and then preallocate the matrix, but this is not very
> efficient.
>
> I know UMFPACK has a method of forming triplets with the matrix information,
> and then it has routines to add duplicate entries and compress the data into a
> compressed matrix format. Although I have never used UMFPACK with PETSc. I
> also don't know if there are similar functions in PETSc optimized for FE
> matrix assembly.
>
> Regards,
>
> Billy.
>
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov on behalf of Barry Smith
> Sent: Wed 28-05-2008 16:03
> To: petsc-users at mcs.anl.gov
> Subject: Re: Slow MatSetValues
>
>    http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manual.pdf#sec_matsparse
>    http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJ.html
>
>    Also, slightly less important: collapse the 4 MatSetValues() calls below
>    into a single call that sets the little two-by-two block.
>
>    Barry
>
> On May 28, 2008, at 9:07 AM, Lars Rindorf wrote:
>
> > Hi everybody
> >
> > I have a problem with MatSetValues, since the building of my matrix
> > takes much longer (35 s) than its solution (0.2 s). When the number
> > of degrees of freedom is increased, the problem worsens. The rate at
> > which the elements of the (sparse) matrix are set also seems to
> > decrease with the number of elements already set. That is, it
> > becomes slower near the end.
> >
> > The structure of my program is something like:
> >
> >   for element in finite elements
> >     for dof in element
> >       for equations in FEM formulation
> >         ierr = MatSetValues(M->M,1,&i,1,&j,&tmp,ADD_VALUES);
> >         ierr = MatSetValues(M->M,1,&k,1,&l,&tmp,ADD_VALUES);
> >         ierr = MatSetValues(M->M,1,&i,1,&l,&tmp,ADD_VALUES);
> >         ierr = MatSetValues(M->M,1,&k,1,&j,&tmp,ADD_VALUES);
> >
> > where i, j, k, l are appropriate integers and tmp is a double value to
> > be added.
> >
> > The code has worked fine with a previous version of PETSc (not
> > compiled by me). The version of PETSc that I use is slightly newer
> > (I think), 2.3.3 vs ~2.3.
> >
> > Is it some kind of dynamic allocation problem? I have tried using
> > MatSetValuesBlocked, but this is only slightly faster. If I monitor
> > the program's CPU and memory consumption, the CPU is 100% used
> > and the memory consumption is only 20-30 MB.
> >
> > My computer is a Red Hat Linux box with a Xeon quad-core processor. I
> > use Intel's MKL BLAS and LAPACK.
> >
> > What should I do to speed up PETSc?
> >
> > Kind regards
> > Lars
> > _____________________________
> >
> > Lars Rindorf
> > M.Sc., Ph.D.
> > http://www.dti.dk
> >
> > Danish Technological Institute
> > Gregersensvej
> > 2630 Taastrup
> > Denmark
> > Phone +45 72 20 20 00
>
>

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
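PS: to illustrate Barry's suggestion above of collapsing the four scalar MatSetValues() calls into one call per two-by-two block, it would look roughly like this (reusing the i, j, k, l, ierr and M->M names from Lars's snippet; v_ij, v_il, v_kj, v_kl are placeholders for the four local contributions, stored row-major as MatSetValues expects by default):

PetscInt    rows[2] = {i, k};
PetscInt    cols[2] = {j, l};
PetscScalar vals[4] = {v_ij, v_il,    /* row i: columns j, l */
                       v_kj, v_kl};   /* row k: columns j, l */
ierr = MatSetValues(M->M, 2, rows, 2, cols, vals, ADD_VALUES);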
