Re: [petsc-users] MatCreate performance

2019-03-11 Thread Mark Adams via petsc-users
The PETSc logs print the max time and the ratio max/min.

On Mon, Mar 11, 2019 at 8:24 AM Ale Foggia via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Hello all,
>
> Thanks for your answers.
>
> 1) I'm working with a matrix with a linear size of 2**34, but it's a
> sparse matrix, and the number of nonzero elements is 4,320,707,274. I
> know that the distribution of these elements is not balanced across the
> processes; the matrix is more populated in the middle part.
>
> 2) I initialize Slepc. Then I create the basis elements of the system
> (this part does not involve Petsc/Slepc, and every process is just
> computing -and owns- an equal amount of basis elements). Then I call:
> ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
> ierr = MatSetType(A, MATMPIAIJ); CHKERRQ(ierr);
> ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, size, size);
> CHKERRQ(ierr);
> ierr = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); CHKERRQ(ierr);
> ierr = MatZeroEntries(A); CHKERRQ(ierr);
> After this, I compute the elements of the matrix and set the values with
> MatSetValues. Then I call EPSSolve (with KrylovSchur and setting the type as
> EPS_HEP).
>
> 3) There are a few more things that are strange to me. I measure the
> execution time of these parts both with a PetscLogStage and with a
> std::chrono (in nanoseconds) clock. I understand that the time given by the
> Log is an average over the processes, right? In the case of std::chrono,
> I'm only printing the times from process 0 (no average over processes).
> What I see is the following:
>                  1024 procs     2048 procs     4096 procs     8192 procs
>                  Log     std    Log     std    Log     std    Log     std
> MatCreate        68.42   122.7  67.08   121.2  62.29   116    73.36   127.4
> preallocation    140.36  140.3  76.45   76.45  40.31   40.3   21.13   21.12
> MatSetValues     237.79  237.7  116.6   116.6  60.59   60.59  35.32   35.32
> EPSSolve         162.8   160    95.8    94.2   62.17   60.63  41.16   40.24
>
> - So, all the times (including the total execution time that I'm not
> showing here) are the same between PetscLogStage and the std::chrono clock,
> except for the part of MatCreate. Maybe that part is very unbalanced?
> - The time of MatCreate given by the PetscLogStage barely changes with the
> number of processes.
>
> Ale
>
> On Fri, Mar 8, 2019 at 17:00, Jed Brown () wrote:
>
>> This is very unusual.  MatCreate() does no work, merely dup'ing a
>> communicator (or referencing an inner communicator if this is not the
>> first PetscObject on the provided communicator).  What size matrices are
>> you working with?  Can you send some performance data and (if feasible)
>> a reproducer?
>>
>> Ale Foggia via petsc-users  writes:
>>
>> > Hello all,
>> >
>> > I have a problem with the scaling of the MatCreate() function. I wrote a
>> > code to diagonalize sparse matrices and I'm running it in parallel. I've
>> > observed a very bad speedup of the code and it's given by the MatCreate
>> > part of it: for a fixed matrix size, when I increase the number of
>> > processes the time taken by the function also increases. I wanted to
>> know
>> > if you expect this behavior or if maybe there's something wrong with my
>> > code. When I go to (what I consider) very big matrix sizes, and
>> depending
>> > on the number of mpi processes, in some cases, MatCreate takes more time
>> > than the time the solver takes to solve the system for one eigenvalue or
>> > the time it takes to set up the values.
>> >
>> > Ale
>>
>


Re: [petsc-users] MatCreate performance

2019-03-08 Thread Mark Adams via petsc-users
MatCreate is collective, so you want to check that it is not absorbing load
imbalance from earlier code.

And duplicating communicators can be expensive on some systems.

On Fri, Mar 8, 2019 at 10:21 AM Ale Foggia via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Hello all,
>
> I have a problem with the scaling of the MatCreate() function. I wrote a
> code to diagonalize sparse matrices and I'm running it in parallel. I've
> observed a very bad speedup of the code and it's given by the MatCreate
> part of it: for a fixed matrix size, when I increase the number of
> processes the time taken by the function also increases. I wanted to know
> if you expect this behavior or if maybe there's something wrong with my
> code. When I go to (what I consider) very big matrix sizes, and depending
> on the number of mpi processes, in some cases, MatCreate takes more time
> than the time the solver takes to solve the system for one eigenvalue or
> the time it takes to set up the values.
>
> Ale
>


Re: [petsc-users] MatCreate performance

2019-03-08 Thread Smith, Barry F. via petsc-users


https://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly


> On Mar 8, 2019, at 9:19 AM, Ale Foggia via petsc-users 
>  wrote:
> 
> Hello all,
> 
> I have a problem with the scaling of the MatCreate() function. I wrote a code 
> to diagonalize sparse matrices and I'm running it in parallel. I've observed 
> a very bad speedup of the code and it's given by the MatCreate part of it: 
> for a fixed matrix size, when I increase the number of processes the time 
> taken by the function also increases. I wanted to know if you expect this 
> behavior or if maybe there's something wrong with my code. When I go to (what 
> I consider) very big matrix sizes, and depending on the number of mpi 
> processes, in some cases, MatCreate takes more time than the time the solver 
> takes to solve the system for one eigenvalue or the time it takes to set up 
> the values.
> 
> Ale



Re: [petsc-users] MatCreate performance

2019-03-08 Thread Jed Brown via petsc-users
This is very unusual.  MatCreate() does no work, merely dup'ing a
communicator (or referencing an inner communicator if this is not the
first PetscObject on the provided communicator).  What size matrices are
you working with?  Can you send some performance data and (if feasible)
a reproducer?

Ale Foggia via petsc-users  writes:

> Hello all,
>
> I have a problem with the scaling of the MatCreate() function. I wrote a
> code to diagonalize sparse matrices and I'm running it in parallel. I've
> observed a very bad speedup of the code and it's given by the MatCreate
> part of it: for a fixed matrix size, when I increase the number of
> processes the time taken by the function also increases. I wanted to know
> if you expect this behavior or if maybe there's something wrong with my
> code. When I go to (what I consider) very big matrix sizes, and depending
> on the number of mpi processes, in some cases, MatCreate takes more time
> than the time the solver takes to solve the system for one eigenvalue or
> the time it takes to set up the values.
>
> Ale