Thank you Dave! Do you have a rough idea of how long a matrix like that should take to assemble? Not hours. Right?
Regards,
Jose

--
José Abell
*PhD Candidate*
Computational Geomechanics Group
Dept. of Civil and Environmental Engineering
UC Davis
www.joseabell.com

On Thu, Dec 17, 2015 at 12:58 AM, Dave May <dave.mayhe...@gmail.com> wrote:

> On 17 December 2015 at 08:06, Jose A. Abell M. <jaab...@ucdavis.edu> wrote:
>
>> Hello dear PETSc users,
>>
>> This is a problem that pops up often, from what I see, in the mailing
>> list. My program takes a long time assembling the matrix.
>>
>> What I know:
>>
>>    - Matrix Size is (MatMPIAIJ) 2670402
>>    - Number of processes running PETSc: 95
>>    - Not going to virtual memory (no swapping, used mem well within
>>      each node's capacity)
>>    - System is partitioned with ParMETIS for load balancing
>>    - I see memory moving around in each node (total used memory changes
>>      a bit, grows and then frees)
>>    - Matrix is filled in blocks of size 81x81 (FEM code, so this ends up
>>      being a sparse matrix)
>>    - I don't do flushes at all. Only MAT_FINAL_ASSEMBLY when all the
>>      MatSetValues are done.
>>
>> Should I do MAT_FLUSH_ASSEMBLY even though I have enough memory to store
>> the buffers? If so, how often? Every 100 blocks?
>>
>> What else could it be?
>>
>> It's taking several hours to assemble this matrix. I re-use the sparsity
>> pattern, so subsequent assemblies are fast. Does this mean that my
>> preallocation is wrong?
>
> The preallocation could be wrong. That is the usual cause of very slow
> matrix assembly. To confirm this hypothesis, run your code with the command
> line option -info. You will get an enormous amount of information in
> stdout. You might consider using -info with a smallish problem size / core
> count.
>
> Inspect the output generated by -info and look for lines like this:
>
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>
> If the number of mallocs during MatSetValues() is not zero, then your
> preallocation is not exactly correct. A small number of mallocs, say less
> than 10, might be acceptable (performance-wise). However, if the number of
> mallocs is > 100, then assembly time will be terribly slow.
>
> Thanks,
> Dave
>
>> Regards,
>>
>> --
>>
>> José Abell
>> *PhD Candidate*
>> Computational Geomechanics Group
>> Dept. of Civil and Environmental Engineering
>> UC Davis
>> www.joseabell.com
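
Since -info prints so much, it can help to filter the output mechanically for the malloc lines Dave describes. Below is a minimal sketch in Python, not part of PETSc. It assumes the output was saved to a text file and that the relevant lines look like the one quoted above. The names mallocs_per_rank and report are made up for illustration, and the thresholds only restate Dave's rule of thumb (0 is exact, more than 100 is terribly slow). A rank may print more than one such line, so the counts are summed per rank.

```python
import re
from collections import defaultdict

# Matches lines shaped like the one quoted in the thread:
# [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
MALLOC_LINE = re.compile(
    r"\[(\d+)\]\s+MatAssemblyEnd_\w+\(\):\s+"
    r"Number of mallocs during MatSetValues\(\) is (\d+)"
)


def mallocs_per_rank(lines):
    """Return {rank: total mallocs reported} from an iterable of -info lines."""
    totals = defaultdict(int)
    for line in lines:
        match = MALLOC_LINE.search(line)
        if match:
            totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)


def report(totals):
    """Format one line per rank, flagged with the rule of thumb from the thread."""
    out = []
    for rank in sorted(totals):
        n = totals[rank]
        if n == 0:
            verdict = "preallocation exact"
        elif n > 100:
            verdict = "expect very slow assembly"
        else:
            verdict = "preallocation not exact"
        out.append(f"rank {rank}: {n} mallocs ({verdict})")
    return out
```

To use it, one could pass the lines of the saved log to mallocs_per_rank (for example, an open file object) and print each string that report returns.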