Re: [petsc-users] Memory optimization

2019-12-10 Thread Jose E. Roman
I guess the conditioning is getting worse. I would try using MUMPS for the sub_pc LU factorization.
Jose
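
For reference, a minimal sketch of the option set this suggestion corresponds to, assuming PETSc was configured with MUMPS support; the composed option name follows the st_/sub_ prefixes used in this thread (PETSc versions before 3.9 spell it ..._package instead of ..._type):

    -st_ksp_type gmres -st_pc_type asm -st_sub_pc_type lu \
    -st_sub_pc_factor_mat_solver_type mumps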


> El 10 dic 2019, a las 18:36, Perceval Desforges 
>  escribió:
> 
> Hello again,
> 
> I have tried following your advice to use preconditioned iterative solvers 
> for my 3D systems, and have been encountering some difficulties. 
> I have been following the recommendations of section 3.4.1 of the SLEPc 
> users' manual, setting the following options:  -st_ksp_type gmres 
> -ksp_gmres_modifiedgramschmidt -st_pc_type asm -st_sub_pc_type lu 
> -st_ksp_rtol 1e-9 -st_ksp_converged_reason -st_ksp_monitor_true_residual. 
> 
> The problem is that the code converges quite rapidly for the first 
> eigenvalues (at around 0.4 in my case, and in about 20 iterations for each). 
> The last eigenvalue obtained is a bit higher than 0.5. However, when I set 
> the shift to 0.5, it does not converge even after a large number of iterations, 
> and the residual norm is still at around 0.01. 
> 
> This only seems to be happening when my matrix is large enough (10^6 by 10^6).
> 
> Is there something obvious I am doing wrong?
> 
> Thanks for your time,
> 
> Regards,
> 
> Perceval,
> 
> 
> 
>> In 3D problems it is recommended to use preconditioned iterative solvers. 
>> Unfortunately the spectrum slicing technique requires the full factorization 
>> (because it uses matrix inertia).
>> 
>> 
>>> El 25 nov 2019, a las 18:44, Perceval Desforges 
>>>  escribió:
>>> 
>>> I am basically trying to solve a finite element problem, which is why in 3D 
>>> I have 7 non-zero diagonals that are quite far apart from one another. In 
>>> 2D I only have 5 non-zero diagonals that are less far apart. So is it 
>>> normal that the set up time is around 400 times greater in the 3D case? Is 
>>> there nothing to be done?
>>> 
>>> I will try setting up only one partition.
>>> 
>>> Thanks,
>>> 
>>> Perceval,
>>> 
 Probably it is not a preallocation issue, as it shows "total number of 
 mallocs used during MatSetValues calls =0".
 
 Adding new diagonals may increase fill-in a lot, if the new diagonals are 
 displaced with respect to the other ones.
 
 The partitions option is intended for running on several nodes. If you are 
 using just one node probably it is better to set one partition only.
 
 Jose
 
 
> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
> 
> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges 
>  wrote:
> Hi,
> 
> So I'm loading two matrices from files, both 100 by 1000. I ran 
> the program with -mat_view::ascii_info and I got:
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times, and then
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=100, allocated nonzeros=100
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well, and then
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well before crashing.
> 
> I realized it might be because I am setting up 20 Krylov-Schur partitions 
> which may be too much. I tried running the code again with only 2 
> partitions and now the code runs but I have speed issues.
> 
> I have one version of the code where my first matrix has 5 non-zero 
> diagonals (so 500 non-zero entries), and the set up time is quite 
> fast (8 seconds)  and solving is also quite fast. The second version is 
> the same but I have two extra non-zero diagonals (700 non-zero 
> entries)  and the set up time is a lot slower (2900 seconds ~ 50 minutes) 
> and solving is also a lot slower. Is it normal that adding two extra 
> diagonals increases solve and set up time so much?
> 
> 
> I can't see the rest of your code, but I am guessing your preallocation 
> statement has "5", so it does no mallocs when you create
> your first matrix, but mallocs for every row when you create your second 
> matrix. When you load them from disk, we do all the
> preallocation correctly.
> 
>   Thanks,
> 
> Matt 
> Thanks again,
> 
> Best regards,
> 
> Perceval,
> 
> 
> 
> 
> 
>> Then I guess it is the factorization that is failing. How many nonzero 
>> entries do you have? Run with
>> -mat_view ::ascii_info
>> 
>> Jose
>> 
>> 
>>> El 22 nov 2019, a las 19:56, Perceval Desforges 
>>>  escribió:
>>

Re: [petsc-users] Memory optimization

2019-12-10 Thread Perceval Desforges
Hello again, 

I have tried following your advice to use preconditioned iterative
solvers for my 3D systems, and have been encountering some difficulties.

I have been following the recommendations of section 3.4.1 of the SLEPc
users' manual, setting the following options:  -st_ksp_type gmres
-ksp_gmres_modifiedgramschmidt -st_pc_type asm -st_sub_pc_type lu
-st_ksp_rtol 1e-9 -st_ksp_converged_reason
-st_ksp_monitor_true_residual.  
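
For reference, the same solver configuration can also be set from code rather than the command line; a minimal sketch, assuming the EPS object already exists (the ASM sub-solver options are easiest to leave on the command line):

    #include <slepceps.h>

    /* sketch: configure the ST's inner Krylov solver as GMRES with
       modified Gram-Schmidt and an ASM preconditioner, matching the
       command-line options listed above */
    static PetscErrorCode ConfigureSTSolver(EPS eps)
    {
      ST             st;
      KSP            ksp;
      PC             pc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
      ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
      ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
      ierr = KSPGMRESSetOrthogonalization(ksp,KSPGMRESModifiedGramSchmidtOrthogonalization);CHKERRQ(ierr);
      ierr = KSPSetTolerances(ksp,1e-9,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
      ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }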

The problem is that the code converges quite rapidly for the first
eigenvalues (at around 0.4 in my case, and in about 20 iterations for
each). The last eigenvalue obtained is a bit higher than 0.5. However,
when I set the shift to 0.5, it does not converge even after a large number
of iterations, and the residual norm is still at around 0.01.

This only seems to be happening when my matrix is large enough (10^6 by
10^6). 

Is there something obvious I am doing wrong? 

Thanks for your time, 

Regards, 

Perceval, 

> In 3D problems it is recommended to use preconditioned iterative solvers. 
> Unfortunately the spectrum slicing technique requires the full factorization 
> (because it uses matrix inertia).
> 
> El 25 nov 2019, a las 18:44, Perceval Desforges 
>  escribió:
> 
> I am basically trying to solve a finite element problem, which is why in 3D I 
> have 7 non-zero diagonals that are quite far apart from one another. In 2D I 
> only have 5 non-zero diagonals that are less far apart. So is it normal that 
> the set up time is around 400 times greater in the 3D case? Is there nothing 
> to be done?
> 
> I will try setting up only one partition.
> 
> Thanks,
> 
> Perceval,
> 
> Probably it is not a preallocation issue, as it shows "total number of 
> mallocs used during MatSetValues calls =0".
> 
> Adding new diagonals may increase fill-in a lot, if the new diagonals are 
> displaced with respect to the other ones.
> 
> The partitions option is intended for running on several nodes. If you are using 
> just one node probably it is better to set one partition only.
> 
> Jose
> 
> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
> 
> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges 
>  wrote:
> Hi,
> 
> So I'm loading two matrices from files, both 100 by 1000. I ran the 
> program with -mat_view::ascii_info and I got:
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=700, allocated nonzeros=700
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times, and then
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=100, allocated nonzeros=100
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well, and then
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=700, allocated nonzeros=700
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well before crashing.
> 
> I realized it might be because I am setting up 20 Krylov-Schur partitions 
> which may be too much. I tried running the code again with only 2 partitions 
> and now the code runs but I have speed issues.
> 
> I have one version of the code where my first matrix has 5 non-zero diagonals 
> (so 500 non-zero entries), and the set up time is quite fast (8 seconds)  
> and solving is also quite fast. The second version is the same but I have two 
> extra non-zero diagonals (700 non-zero entries)  and the set up time is a 
> lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is 
> it normal that adding two extra diagonals increases solve and set up time so 
> much?
> 
> I can't see the rest of your code, but I am guessing your preallocation 
> statement has "5", so it does no mallocs when you create
> your first matrix, but mallocs for every row when you create your second 
> matrix. When you load them from disk, we do all the
> preallocation correctly.
> 
> Thanks,
> 
> Matt 
> Thanks again,
> 
> Best regards,
> 
> Perceval,
> 
> Then I guess it is the factorization that is failing. How many nonzero 
> entries do you have? Run with
> -mat_view ::ascii_info
> 
> Jose
> 
> El 22 nov 2019, a las 19:56, Perceval Desforges 
>  escribió:
> 
> Hi,
> 
> Thanks for your answer. I tried looking at the inertias before solving, but 
> the problem is that the program crashes when I call EPSSetUp with this error:
> 
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
> 107317760), being killed
> 
> I get this error even when there are no eigenvalues in the interval.
> 
> I've started using BVMAT instead of BVVECS by the way.
> 
> Thanks,
> 
> Perceval,
> 
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
> 
> Most likely the problem is that the interval you gave is too large and 
> contains too many eigenvalues (SLEPc needs to allocate at least one vector 
> per

Re: [petsc-users] Memory optimization

2019-11-26 Thread Smith, Barry F.

> I am basically trying to solve a finite element problem, which is why in 3D I 
> have 7 non-zero diagonals that are quite far apart from one another. In 2D I 
> only have 5 non-zero diagonals that are less far apart. So is it normal that 
> the set up time is around 400 times greater in the 3D case? Is there nothing 
> to be done?

   Yes, sparse direct solver behavior on 2D and 3D problems can be 
dramatically different in both space and time. There is a well-developed 
understanding of this dating from the 1970s. 

   For 2D the results are given in 
https://epubs.siam.org/doi/abs/10.1137/0710032?journalCode=sjnaam : with a 
nested dissection ordering the work is O(n^3) and the space is O(n^2 log n).

   In 3D the work is O(n^6); see 
http://amath.colorado.edu/faculty/martinss/2014_CBMS/Lectures/lecture06.pdf

   So 3D is very limited for direct solvers, and one has to try something else.
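
   Spelled out, with n grid points per side (so N = n^2 unknowns in 2D and
N = n^3 in 3D), these nested-dissection bounds read

    2D:  work = O(n^3) = O(N^{3/2}),   fill = O(n^2 log n) = O(N log N)
    3D:  work = O(n^6) = O(N^2),       fill = O(n^4) = O(N^{4/3})

(the 3D fill estimate is the standard companion bound), which is why the 3D
factorization blows up so much faster than the 2D one as the grid is refined.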
 
   Barry







> On Nov 26, 2019, at 9:23 AM, Perceval Desforges 
>  wrote:
> 
> Hello,
> 
> This is the output of -log_view. I selected what I thought were the important 
> parts. I don't know if this is the best format to send the logs. If a text 
> file is better let me know. Thanks again,
> 
> 
> 
> -- PETSc Performance Summary: 
> --
> 
> ./dos.exe on a  named compute-0-11.local with 20 processors, by pcd Tue Nov 
> 26 15:50:50 2019
> Using Petsc Release Version 3.10.5, Mar, 28, 2019 
> 
>  Max   Max/Min Avg   Total 
> Time (sec):   2.214e+03 1.000   2.214e+03
> Objects:  1.370e+02 1.030   1.332e+02
> Flop: 1.967e+14 1.412   1.539e+14  3.077e+15
> Flop/sec: 8.886e+10 1.412   6.950e+10  1.390e+12
> MPI Messages: 1.716e+03 1.350   1.516e+03  3.032e+04
> MPI Message Lengths:  2.559e+08 5.796   4.179e+04  1.267e+09
> MPI Reductions:   3.840e+02 1.000
> 
> Summary of Stages:   - Time --  - Flop --  --- Messages ---  
> -- Message Lengths --  -- Reductions --
> Avg %Total Avg %TotalCount   %Total   
>   Avg %TotalCount   %Total 
>  0:  Main Stage: 1.e+02   4.5%  3.0771e+15 100.0%  3.016e+04  99.5%  
> 4.190e+04   99.7%  3.310e+02  86.2% 
>  1:  Setting Up EPS: 2.1137e+03  95.5%  7.4307e+09   0.0%  1.600e+02   0.5%  
> 2.000e+040.3%  4.600e+01  12.0%
> 
> 
> 
> 
> EventCount  Time (sec) Flop   
>--- Global ---  --- Stage   Total
>Max Ratio  Max Ratio   Max  Ratio  Mess   AvgLen  
> Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> 
> 
> --- Event Stage 0: Main Stage
> 
> PetscBarrier   2 1.0 2.6554e+004632.9 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   3  0  0  0  0 0
> BuildTwoSidedF 3 1.0 1.2021e-01672.3 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> VecDot 8 1.0 1.1364e-02 2.3 8.00e+05 1.0 0.0e+00 0.0e+00 
> 8.0e+00  0  0  0  0  2   0  0  0  0  2  1408
> VecMDot   11 1.0 4.8588e-02 2.2 6.60e+06 1.0 0.0e+00 0.0e+00 
> 1.1e+01  0  0  0  0  3   0  0  0  0  3  2717
> VecNorm   12 1.0 5.2616e-02 4.3 1.20e+06 1.0 0.0e+00 0.0e+00 
> 1.2e+01  0  0  0  0  3   0  0  0  0  4   456
> VecScale  12 1.0 9.8681e-04 2.2 6.00e+05 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 12160
> VecCopy3 1.0 4.1175e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> VecSet   108 1.0 9.3610e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> VecAXPY1 1.0 1.6284e-04 3.2 1.00e+05 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 12282
> VecMAXPY  12 1.0 7.6976e-03 1.9 7.70e+06 1.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 20006
> VecScatterBegin  419 1.0 4.5905e-01 3.7 0.00e+00 0.0 2.9e+04 3.7e+04 
> 9.0e+01  0  0 96 85 23   0  0 97 85 27 0
> VecScatterEnd329 1.0 9.3328e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   1  0  0  0  0 0
> VecSetRandom   1 1.0 4.3299e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
> VecNormalize  12 1.0 5.3697e-02 4.2 1.80e+06 1.0 0.0e+00 0.0e+00 
> 1.2e+01  0  0  0  0  3   0  0  0  0  4   670
> MatMult  240 1.0 1.2112e-01 1.5 1.86e+07 1.0 4.4e+02 8.0e+04 
> 0.0e+00  0  0  1  3  0   0  0  1  3  0  3071
> MatSolve 101 1.0 9.3087e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04 
> 9.1e+01  4100 97 82 24  93100 97 82 27 33055277
> MatCholFctrNum 1 1.0

Re: [petsc-users] Memory optimization

2019-11-26 Thread Perceval Desforges
Hello, 

This is the output of -log_view. I selected what I thought were the
important parts. I don't know if this is the best format to send the
logs. If a text file is better let me know. Thanks again, 
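
For reference, a per-stage breakdown like the "Setting Up EPS" stage below is what one gets by registering a user log stage around the setup call; a minimal sketch (the stage name and placement are assumptions, not taken from the actual code):

    PetscLogStage  stage;
    PetscErrorCode ierr;

    ierr = PetscLogStageRegister("Setting Up EPS",&stage);CHKERRQ(ierr);
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    ierr = EPSSetUp(eps);CHKERRQ(ierr);   /* the expensive factorization happens here */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
    /* run with -log_view to get the per-stage summary shown below */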

-- PETSc Performance
Summary: --

./dos.exe on a  named compute-0-11.local with 20 processors, by pcd Tue
Nov 26 15:50:50 2019
Using Petsc Release Version 3.10.5, Mar, 28, 2019 

 Max   Max/Min Avg   Total 
Time (sec):   2.214e+03 1.000   2.214e+03
Objects:  1.370e+02 1.030   1.332e+02
Flop: 1.967e+14 1.412   1.539e+14  3.077e+15
Flop/sec: 8.886e+10 1.412   6.950e+10  1.390e+12
MPI Messages: 1.716e+03 1.350   1.516e+03  3.032e+04
MPI Message Lengths:  2.559e+08 5.796   4.179e+04  1.267e+09
MPI Reductions:   3.840e+02 1.000 

Summary of Stages:   - Time --  - Flop --  --- Messages
---  -- Message Lengths --  -- Reductions --
Avg %Total Avg %TotalCount  
%Total Avg %TotalCount   %Total 
 0:  Main Stage: 1.e+02   4.5%  3.0771e+15 100.0%  3.016e+04 
99.5%  4.190e+04   99.7%  3.310e+02  86.2% 
 1:  Setting Up EPS: 2.1137e+03  95.5%  7.4307e+09   0.0%  1.600e+02  
0.5%  2.000e+040.3%  4.600e+01  12.0% 


EventCount  Time (sec) Flop 
--- Global ---  --- Stage   Total
   Max Ratio  Max Ratio   Max  Ratio  Mess   AvgLen 
Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s


--- Event Stage 0: Main Stage

PetscBarrier   2 1.0 2.6554e+004632.9 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   3  0  0  0  0 0
BuildTwoSidedF 3 1.0 1.2021e-01672.3 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
VecDot 8 1.0 1.1364e-02 2.3 8.00e+05 1.0 0.0e+00 0.0e+00
8.0e+00  0  0  0  0  2   0  0  0  0  2  1408
VecMDot   11 1.0 4.8588e-02 2.2 6.60e+06 1.0 0.0e+00 0.0e+00
1.1e+01  0  0  0  0  3   0  0  0  0  3  2717
VecNorm   12 1.0 5.2616e-02 4.3 1.20e+06 1.0 0.0e+00 0.0e+00
1.2e+01  0  0  0  0  3   0  0  0  0  4   456
VecScale  12 1.0 9.8681e-04 2.2 6.00e+05 1.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 12160
VecCopy3 1.0 4.1175e-04 2.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
VecSet   108 1.0 9.3610e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
VecAXPY1 1.0 1.6284e-04 3.2 1.00e+05 1.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 12282
VecMAXPY  12 1.0 7.6976e-03 1.9 7.70e+06 1.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 20006
VecScatterBegin  419 1.0 4.5905e-01 3.7 0.00e+00 0.0 2.9e+04 3.7e+04
9.0e+01  0  0 96 85 23   0  0 97 85 27 0
VecScatterEnd329 1.0 9.3328e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   1  0  0  0  0 0
VecSetRandom   1 1.0 4.3299e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
VecNormalize  12 1.0 5.3697e-02 4.2 1.80e+06 1.0 0.0e+00 0.0e+00
1.2e+01  0  0  0  0  3   0  0  0  0  4   670
MatMult  240 1.0 1.2112e-01 1.5 1.86e+07 1.0 4.4e+02 8.0e+04
0.0e+00  0  0  1  3  0   0  0  1  3  0  3071
MatSolve 101 1.0 9.3087e+01 1.0 1.97e+14 1.4 2.9e+04 3.5e+04
9.1e+01  4100 97 82 24  93100 97 82 27 33055277
MatCholFctrNum 1 1.0 1.2752e-02 2.8 5.00e+04 1.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  078
MatICCFactorSym1 1.0 4.0321e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
MatAssemblyBegin   5 1.7 1.2031e-01501.1 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 0
MatAssemblyEnd 5 1.7 6.6613e-02 2.4 0.00e+00 0.0 1.6e+02 2.0e+04
2.4e+01  0  0  1  0  6   0  0  1  0  7 0
MatGetRowIJ1 1.0 7.1526e-06 2.5 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
MatGetOrdering 1 1.0 1.2271e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
MatLoad3 1.0 2.8543e-01 1.0 0.00e+00 0.0 3.3e+02 5.6e+05
5.4e+01  0  0  1 15 14   0  0  1 15 16 0
MatView2 0.0 7.4778e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
KSPSetUp   2 1.0 1.3866e-0236.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 0
KSPSolve  90 1.0 9.3211e+01 1.0 1.97e+14 1.4 3.0e+04 3.6e+04
1.1e+02  4100 98 85 30  93100 99 85

Re: [petsc-users] Memory optimization

2019-11-25 Thread Matthew Knepley
On Mon, Nov 25, 2019 at 11:45 AM Perceval Desforges <
perceval.desfor...@polytechnique.edu> wrote:

> I am basically trying to solve a finite element problem, which is why in
> 3D I have 7 non-zero diagonals that are quite far apart from one another.
> In 2D I only have 5 non-zero diagonals that are less far apart. So is it
> normal that the set up time is around 400 times greater in the 3D case? Is
> there nothing to be done?
>
> No. It is almost certain that preallocation is screwed up. There is no way
it can take 400x longer for a few nonzeros.

In order to debug, please send the output of -log_view and indicate where
the time is taken for assembly. You can usually
track down bad preallocation using -info.

  Thanks,

 Matt

> I will try setting up only one partition.
>
> Thanks,
>
> Perceval,
>
> Probably it is not a preallocation issue, as it shows "total number of
> mallocs used during MatSetValues calls =0".
>
> Adding new diagonals may increase fill-in a lot, if the new diagonals are
> displaced with respect to the other ones.
>
> The partitions option is intended for running on several nodes. If you are
> using just one node probably it is better to set one partition only.
>
> Jose
>
>
> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
>
> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges <
> perceval.desfor...@polytechnique.edu> wrote:
> Hi,
>
> So I'm loading two matrices from files, both 100 by 1000. I ran
> the program with -mat_view::ascii_info and I got:
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times, and then
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=100, allocated nonzeros=100
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well, and then
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well before crashing.
>
> I realized it might be because I am setting up 20 Krylov-Schur partitions
> which may be too much. I tried running the code again with only 2
> partitions and now the code runs but I have speed issues.
>
> I have one version of the code where my first matrix has 5 non-zero
> diagonals (so 500 non-zero entries), and the set up time is quite fast
> (8 seconds)  and solving is also quite fast. The second version is the same
> but I have two extra non-zero diagonals (700 non-zero entries)  and the
> set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also
> a lot slower. Is it normal that adding two extra diagonals increases solve
> and set up time so much?
>
>
> I can't see the rest of your code, but I am guessing your preallocation
> statement has "5", so it does no mallocs when you create
> your first matrix, but mallocs for every row when you create your second
> matrix. When you load them from disk, we do all the
> preallocation correctly.
>
>   Thanks,
>
> Matt
> Thanks again,
>
> Best regards,
>
> Perceval,
>
>
>
>
>
> Then I guess it is the factorization that is failing. How many nonzero
> entries do you have? Run with
> -mat_view ::ascii_info
>
> Jose
>
>
> El 22 nov 2019, a las 19:56, Perceval Desforges <
> perceval.desfor...@polytechnique.edu> escribió:
>
> Hi,
>
> Thanks for your answer. I tried looking at the inertias before solving,
> but the problem is that the program crashes when I call EPSSetUp with this
> error:
>
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 >
> 107317760), being killed
>
> I get this error even when there are no eigenvalues in the interval.
>
> I've started using BVMAT instead of BVVECS by the way.
>
> Thanks,
>
> Perceval,
>
>
>
>
>
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>
> Most likely the problem is that the interval you gave is too large and
> contains too many eigenvalues (SLEPc needs to allocate at least one vector
> per each eigenvalue). You can count the eigenvalues in the interval with
> the inertias, which are available at EPSSetUp (no need to call EPSSolve).
> See this example:
>
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option
> -show_inertias
> For example, the output
>Shift 0.1  Inertia 3
>Shift 0.35  Inertia 11
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is
> slower).
>
> Jose
>
>
> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users <
> petsc-users@mcs.anl.gov> escribió:
>
> Hello all,
>
> I am tryi

Re: [petsc-users] Memory optimization

2019-11-25 Thread Jose E. Roman
In 3D problems it is recommended to use preconditioned iterative solvers. 
Unfortunately the spectrum slicing technique requires the full factorization 
(because it uses matrix inertia).
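
For context, a typical spectrum-slicing setup (section 3.4.5 of the SLEPc users' manual) looks roughly like the sketch below; the interval endpoints are placeholders, and the use of MUMPS for the Cholesky factorization is an assumption:

    #include <slepceps.h>

    /* sketch: compute all eigenvalues in [a,b] with spectrum slicing;
       a full (Cholesky) factorization is required so that inertia is available */
    static PetscErrorCode SetupSlicing(EPS eps,PetscReal a,PetscReal b)
    {
      ST             st;
      KSP            ksp;
      PC             pc;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EPSSetWhichEigenpairs(eps,EPS_ALL);CHKERRQ(ierr);
      ierr = EPSSetInterval(eps,a,b);CHKERRQ(ierr);
      ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
      ierr = STSetType(st,STSINVERT);CHKERRQ(ierr);
      ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
      ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
      ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverType(pc,MATSOLVERMUMPS);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }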


> El 25 nov 2019, a las 18:44, Perceval Desforges 
>  escribió:
> 
> I am basically trying to solve a finite element problem, which is why in 3D I 
> have 7 non-zero diagonals that are quite farm apart from one another. In 2D I 
> only have 5 non-zero diagonals that are less far apart. So is it normal that 
> the set up time is around 400 times greater in the 3D case? Is there nothing 
> to be done?
> 
> I will try setting up only one partition.
> 
> Thanks,
> 
> Perceval,
> 
>> Probably it is not a preallocation issue, as it shows "total number of 
>> mallocs used during MatSetValues calls =0".
>> 
>> Adding new diagonals may increase fill-in a lot, if the new diagonals are 
>> displaced with respect to the other ones.
>> 
>> The partitions option is intended for running on several nodes. If you are 
>> using just one node probably it is better to set one partition only.
>> 
>> Jose
>> 
>> 
>>> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
>>> 
>>> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges 
>>>  wrote:
>>> Hi,
>>> 
>>> So I'm loading two matrices from files, both 100 by 1000. I ran the 
>>> program with -mat_view::ascii_info and I got:
>>> 
>>> Mat Object: 1 MPI processes
>>>   type: seqaij
>>>   rows=100, cols=100
>>>   total: nonzeros=700, allocated nonzeros=700
>>>   total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> 
>>> 20 times, and then
>>> 
>>> Mat Object: 1 MPI processes
>>>   type: seqaij
>>>   rows=100, cols=100
>>>   total: nonzeros=100, allocated nonzeros=100
>>>   total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> 
>>> 20 times as well, and then
>>> 
>>> Mat Object: 1 MPI processes
>>>   type: seqaij
>>>   rows=100, cols=100
>>>   total: nonzeros=700, allocated nonzeros=700
>>>   total number of mallocs used during MatSetValues calls =0
>>> not using I-node routines
>>> 
>>> 20 times as well before crashing.
>>> 
>>> I realized it might be because I am setting up 20 Krylov-Schur partitions 
>>> which may be too much. I tried running the code again with only 2 
>>> partitions and now the code runs but I have speed issues.
>>> 
>>> I have one version of the code where my first matrix has 5 non-zero 
>>> diagonals (so 500 non-zero entries), and the set up time is quite fast 
>>> (8 seconds)  and solving is also quite fast. The second version is the same 
>>> but I have two extra non-zero diagonals (700 non-zero entries)  and the 
>>> set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also 
>>> a lot slower. Is it normal that adding two extra diagonals increases solve 
>>> and set up time so much?
>>> 
>>> 
>>> I can't see the rest of your code, but I am guessing your preallocation 
>>> statement has "5", so it does no mallocs when you create
>>> your first matrix, but mallocs for every row when you create your second 
>>> matrix. When you load them from disk, we do all the
>>> preallocation correctly.
>>> 
>>>   Thanks,
>>> 
>>> Matt 
>>> Thanks again,
>>> 
>>> Best regards,
>>> 
>>> Perceval,
>>> 
>>> 
>>> 
>>> 
>>> 
 Then I guess it is the factorization that is failing. How many nonzero 
 entries do you have? Run with
 -mat_view ::ascii_info
 
 Jose
 
 
> El 22 nov 2019, a las 19:56, Perceval Desforges 
>  escribió:
> 
> Hi,
> 
> Thanks for your answer. I tried looking at the inertias before solving, 
> but the problem is that the program crashes when I call EPSSetUp with 
> this error:
> 
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
> 107317760), being killed
> 
> I get this error even when there are no eigenvalues in the interval.
> 
> I've started using BVMAT instead of BVVECS by the way.
> 
> Thanks,
> 
> Perceval,
> 
> 
> 
> 
> 
>> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>> 
>> Most likely the problem is that the interval you gave is too large and 
>> contains too many eigenvalues (SLEPc needs to allocate at least one 
>> vector per each eigenvalue). You can count the eigenvalues in the 
>> interval with the inertias, which are available at EPSSetUp (no need to 
>> call EPSSolve). See this example:
>> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
>> You can comment out the call to EPSSolve() and run with the option 
>> -show_inertias
>> For example, the output
>>Shift 0.1  Inertia 3 
>>Shift 0.35  Inertia 11 
>> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>> 
>> By the way, I would suggest using BVMAT

Re: [petsc-users] Memory optimization

2019-11-25 Thread Perceval Desforges
I am basically trying to solve a finite element problem, which is why in
3D I have 7 non-zero diagonals that are quite far apart from one
another. In 2D I only have 5 non-zero diagonals that are less far apart.
So is it normal that the set up time is around 400 times greater in the
3D case? Is there nothing to be done? 

I will try setting up only one partition. 

Thanks, 

Perceval,

> Probably it is not a preallocation issue, as it shows "total number of 
> mallocs used during MatSetValues calls =0".
> 
> Adding new diagonals may increase fill-in a lot, if the new diagonals are 
> displaced with respect to the other ones.
> 
> The partitions option is intended for running on several nodes. If you are using 
> just one node probably it is better to set one partition only.
> 
> Jose
> 
> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
> 
> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges 
>  wrote:
> Hi,
> 
> So I'm loading two matrices from files, both 100 by 1000. I ran the 
> program with -mat_view::ascii_info and I got:
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=700, allocated nonzeros=700
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times, and then
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=100, allocated nonzeros=100
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well, and then
> 
> Mat Object: 1 MPI processes
> type: seqaij
> rows=100, cols=100
> total: nonzeros=700, allocated nonzeros=700
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well before crashing.
> 
> I realized it might be because I am setting up 20 Krylov-Schur partitions 
> which may be too much. I tried running the code again with only 2 partitions 
> and now the code runs but I have speed issues.
> 
> I have one version of the code where my first matrix has 5 non-zero diagonals 
> (so 500 non-zero entries), and the set up time is quite fast (8 seconds)  
> and solving is also quite fast. The second version is the same but I have two 
> extra non-zero diagonals (700 non-zero entries)  and the set up time is a 
> lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is 
> it normal that adding two extra diagonals increases solve and set up time so 
> much?
> 
> I can't see the rest of your code, but I am guessing your preallocation 
> statement has "5", so it does no mallocs when you create
> your first matrix, but mallocs for every row when you create your second 
> matrix. When you load them from disk, we do all the
> preallocation correctly.
> 
> Thanks,
> 
> Matt 
> Thanks again,
> 
> Best regards,
> 
> Perceval,
> 
> Then I guess it is the factorization that is failing. How many nonzero 
> entries do you have? Run with
> -mat_view ::ascii_info
> 
> Jose
> 
> El 22 nov 2019, a las 19:56, Perceval Desforges 
>  escribió:
> 
> Hi,
> 
> Thanks for your answer. I tried looking at the inertias before solving, but 
> the problem is that the program crashes when I call EPSSetUp with this error:
> 
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
> 107317760), being killed
> 
> I get this error even when there are no eigenvalues in the interval.
> 
> I've started using BVMAT instead of BVVECS by the way.
> 
> Thanks,
> 
> Perceval,
> 
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
> 
> Most likely the problem is that the interval you gave is too large and 
> contains too many eigenvalues (SLEPc needs to allocate at least one vector 
> per each eigenvalue). You can count the eigenvalues in the interval with the 
> inertias, which are available at EPSSetUp (no need to call EPSSolve). See 
> this example:
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option 
> -show_inertias
> For example, the output
> Shift 0.1  Inertia 3 
> Shift 0.35  Inertia 11 
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
> 
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
> slower).
> 
> Jose
> 
> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>  escribió:
> 
> Hello all,
> 
> I am trying to obtain all the eigenvalues in a certain interval for a fairly 
> large matrix (100 * 100). I therefore use the spectrum slicing method 
> detailed in section 3.4.5 of the manual. The calculations are run on a 
> processor with 20 cores and 96 GB of RAM.
> 
> The options I use are :
> 
> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
> 
> However the program quickly crashes with this error:
> 
> slurmstepd: error: Step 2115.0 exceeded virtual memory lim

Re: [petsc-users] Memory optimization

2019-11-25 Thread Jose E. Roman
Probably it is not a preallocation issue, as it shows "total number of mallocs 
used during MatSetValues calls =0".

Adding new diagonals may increase fill-in a lot, if the new diagonals are 
displaced with respect to the other ones.

The partitions option is intended for running on several nodes. If you are using 
just one node probably it is better to set one partition only.

Jose
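
For reference, the number of spectrum-slicing partitions can be set either on the command line,

    -eps_krylovschur_partitions 1

or, equivalently, from code before EPSSetUp(); a one-line sketch:

    ierr = EPSKrylovSchurSetPartitions(eps,1);CHKERRQ(ierr);  /* one partition: all processes work on the same slice */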


> El 25 nov 2019, a las 18:25, Matthew Knepley  escribió:
> 
> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges 
>  wrote:
> Hi,
> 
> So I'm loading two matrices from files, both 100 by 1000. I ran the 
> program with -mat_view::ascii_info and I got:
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times, and then
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=100, allocated nonzeros=100
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well, and then
> 
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
> 
> 20 times as well before crashing.
> 
> I realized it might be because I am setting up 20 Krylov-Schur partitions 
> which may be too much. I tried running the code again with only 2 partitions 
> and now the code runs but I have speed issues.
> 
> I have one version of the code where my first matrix has 5 non-zero diagonals 
> (so 500 non-zero entries), and the set up time is quite fast (8 seconds)  
> and solving is also quite fast. The second version is the same but I have two 
> extra non-zero diagonals (700 non-zero entries)  and the set up time is a 
> lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is 
> it normal that adding two extra diagonals increases solve and set up time so 
> much?
> 
> 
> I can't see the rest of your code, but I am guessing your preallocation 
> statement has "5", so it does no mallocs when you create
> your first matrix, but mallocs for every row when you create your second 
> matrix. When you load them from disk, we do all the
> preallocation correctly.
> 
>   Thanks,
> 
> Matt 
> Thanks again,
> 
> Best regards,
> 
> Perceval,
> 
> 
> 
> 
> 
>> Then I guess it is the factorization that is failing. How many nonzero 
>> entries do you have? Run with
>> -mat_view ::ascii_info
>> 
>> Jose
>> 
>> 
>>> El 22 nov 2019, a las 19:56, Perceval Desforges 
>>>  escribió:
>>> 
>>> Hi,
>>> 
>>> Thanks for your answer. I tried looking at the inertias before solving, but 
>>> the problem is that the program crashes when I call EPSSetUp with this 
>>> error:
>>> 
>>> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
>>> 107317760), being killed
>>> 
>>> I get this error even when there are no eigenvalues in the interval.
>>> 
>>> I've started using BVMAT instead of BVVECS by the way.
>>> 
>>> Thanks,
>>> 
>>> Perceval,
>>> 
>>> 
>>> 
>>> 
>>> 
 Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
 
 Most likely the problem is that the interval you gave is too large and 
 contains too many eigenvalues (SLEPc needs to allocate at least one vector 
 per each eigenvalue). You can count the eigenvalues in the interval with 
 the inertias, which are available at EPSSetUp (no need to call EPSSolve). 
 See this example:
 http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
 You can comment out the call to EPSSolve() and run with the option 
 -show_inertias
 For example, the output
Shift 0.1  Inertia 3 
Shift 0.35  Inertia 11 
 means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
 
 By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
 slower).
 
 Jose
 
 
> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>  escribió:
> 
> Hello all,
> 
> I am trying to obtain all the eigenvalues in a certain interval for a 
> fairly large matrix (100 * 100). I therefore use the spectrum 
> slicing method detailed in section 3.4.5 of the manual. The calculations 
> are run on a processor with 20 cores and 96 GB of RAM.
> 
> The options I use are :
> 
> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
> 
> 
> 
> However the program quickly crashes with this error:
> 
> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 
> 107317760), being killed
> 
> I've tried reducing the amount of memory used by slepc with t

Re: [petsc-users] Memory optimization

2019-11-25 Thread Matthew Knepley
On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges <
perceval.desfor...@polytechnique.edu> wrote:

> Hi,
>
> So I'm loading two matrices from files, both 100 by 1000. I ran
> the program with -mat_view::ascii_info and I got:
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times, and then
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=100, allocated nonzeros=100
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well, and then
>
> Mat Object: 1 MPI processes
>   type: seqaij
>   rows=100, cols=100
>   total: nonzeros=700, allocated nonzeros=700
>   total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well before crashing.
>
> I realized it might be because I am setting up 20 Krylov-Schur partitions
> which may be too much. I tried running the code again with only 2
> partitions and now the code runs but I have speed issues.
>
> I have one version of the code where my first matrix has 5 non-zero
> diagonals (so 500 non-zero entries), and the set up time is quite fast
> (8 seconds)  and solving is also quite fast. The second version is the same
> but I have two extra non-zero diagonals (700 non-zero entries)  and the
> set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also
> a lot slower. Is it normal that adding two extra diagonals increases solve
> and set up time so much?
>
> I can't see the rest of your code, but I am guessing your preallocation
statement has "5", so it does no mallocs when you create
your first matrix, but mallocs for every row when you create your second
matrix. When you load them from disk, we do all the
preallocation correctly.
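
For reference, a minimal preallocation sketch for the 7-diagonal case discussed in this thread; the matrix size N and the communicator are placeholders, and with 7 nonzeros per row the off-diagonal count below is just a conservative upper bound:

    #include <petscmat.h>

    static PetscErrorCode CreateBandedMatrix(MPI_Comm comm,PetscInt N,Mat *A)
    {
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = MatCreate(comm,A);CHKERRQ(ierr);
      ierr = MatSetSizes(*A,PETSC_DECIDE,PETSC_DECIDE,N,N);CHKERRQ(ierr);
      ierr = MatSetType(*A,MATAIJ);CHKERRQ(ierr);
      /* at most 7 nonzeros per row; preallocate both the sequential and
         parallel variants so the same code works on 1 or many processes */
      ierr = MatSeqAIJSetPreallocation(*A,7,NULL);CHKERRQ(ierr);
      ierr = MatMPIAIJSetPreallocation(*A,7,NULL,7,NULL);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }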

  Thanks,

Matt

> Thanks again,
>
> Best regards,
>
> Perceval,
>
>
>
> Then I guess it is the factorization that is failing. How many nonzero
> entries do you have? Run with
> -mat_view ::ascii_info
>
> Jose
>
>
> El 22 nov 2019, a las 19:56, Perceval Desforges <
> perceval.desfor...@polytechnique.edu> escribió:
>
> Hi,
>
> Thanks for your answer. I tried looking at the inertias before solving,
> but the problem is that the program crashes when I call EPSSetUp with this
> error:
>
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 >
> 107317760), being killed
>
> I get this error even when there are no eigenvalues in the interval.
>
> I've started using BVMAT instead of BVVECS by the way.
>
> Thanks,
>
> Perceval,
>
>
>
>
>
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>
> Most likely the problem is that the interval you gave is too large and
> contains too many eigenvalues (SLEPc needs to allocate at least one vector
> per each eigenvalue). You can count the eigenvalues in the interval with
> the inertias, which are available at EPSSetUp (no need to call EPSSolve).
> See this example:
>
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option
> -show_inertias
> For example, the output
>Shift 0.1  Inertia 3
>Shift 0.35  Inertia 11
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is
> slower).
>
> Jose
>
>
> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users <
> petsc-users@mcs.anl.gov> escribió:
>
> Hello all,
>
> I am trying to obtain all the eigenvalues in a certain interval for a
> fairly large matrix (100 * 100). I therefore use the spectrum
> slicing method detailed in section 3.4.5 of the manual. The calculations
> are run on a processor with 20 cores and 96 GB of RAM.
>
> The options I use are :
>
> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
>
>
>
> However the program quickly crashes with this error:
>
> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 >
> 107317760), being killed
>
> I've tried reducing the amount of memory used by slepc with the
> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get
> this error:
>
> [1]PETSC ERROR: Error in external library
> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase:
> INFOG(1)=-9, INFO(2)=82733614
>
> which is an error due to setting the mumps icntl option so low from what
> I've gathered.
>
> Is there any other way I can reduce memory usage?
>
>
>
> Thanks,
>
> Regards,
>
> Perceval,
>
>
>
> P.S. I sent the same email a few minutes ago but I think I made a mistake
> in the address, I'm sorry if I've sent it twice.
>
>
>
>

-- 
What most experimenters take for granted before they begin their
exp

Re: [petsc-users] Memory optimization

2019-11-25 Thread Perceval Desforges
Hi, 

So I'm loading two matrices from files, both 100 by 1000. I ran
the program with -mat_view::ascii_info and I got: 

Mat Object: 1 MPI processes
  type: seqaij
  rows=100, cols=100
  total: nonzeros=700, allocated nonzeros=700
  total number of mallocs used during MatSetValues calls =0
not using I-node routines 

20 times, and then 

Mat Object: 1 MPI processes
  type: seqaij
  rows=100, cols=100
  total: nonzeros=100, allocated nonzeros=100
  total number of mallocs used during MatSetValues calls =0
not using I-node routines 

20 times as well, and then 

Mat Object: 1 MPI processes
  type: seqaij
  rows=100, cols=100
  total: nonzeros=700, allocated nonzeros=700
  total number of mallocs used during MatSetValues calls =0
not using I-node routines 

20 times as well before crashing. 

I realized it might be because I am setting up 20 Krylov-Schur
partitions which may be too much. I tried running the code again with
only 2 partitions and now the code runs but I have speed issues. 

I have one version of the code where my first matrix has 5 non-zero
diagonals (so 500 non-zero entries), and the set up time is quite
fast (8 seconds)  and solving is also quite fast. The second version is
the same but I have two extra non-zero diagonals (700 non-zero
entries)  and the set up time is a lot slower (2900 seconds ~ 50
minutes) and solving is also a lot slower. Is it normal that adding two
extra diagonals increases solve and set up time so much? 

Thanks again, 

Best regards, 

Perceval, 

> Then I guess it is the factorization that is failing. How many nonzero 
> entries do you have? Run with
> -mat_view ::ascii_info
> 
> Jose
> 
> El 22 nov 2019, a las 19:56, Perceval Desforges 
>  escribió:
> 
> Hi,
> 
> Thanks for your answer. I tried looking at the inertias before solving, but 
> the problem is that the program crashes when I call EPSSetUp with this error:
> 
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
> 107317760), being killed
> 
> I get this error even when there are no eigenvalues in the interval.
> 
> I've started using BVMAT instead of BVVECS by the way.
> 
> Thanks,
> 
> Perceval,
> 
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
> 
> Most likely the problem is that the interval you gave is too large and 
> contains too many eigenvalues (SLEPc needs to allocate at least one vector 
> per each eigenvalue). You can count the eigenvalues in the interval with the 
> inertias, which are available at EPSSetUp (no need to call EPSSolve). See 
> this example:
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option 
> -show_inertias
> For example, the output
> Shift 0.1  Inertia 3 
> Shift 0.35  Inertia 11 
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
> 
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
> slower).
> 
> Jose
> 
> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>  escribió:
> 
> Hello all,
> 
> I am trying to obtain all the eigenvalues in a certain interval for a fairly 
> large matrix (100 * 100). I therefore use the spectrum slicing method 
> detailed in section 3.4.5 of the manual. The calculations are run on a 
> processor with 20 cores and 96 GB of RAM.
> 
> The options I use are :
> 
> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
> 
> However the program quickly crashes with this error:
> 
> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 
> 107317760), being killed
> 
> I've tried reducing the amount of memory used by slepc with the 
> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get 
> this error:
> 
> [1]PETSC ERROR: Error in external library
> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: 
> INFOG(1)=-9, INFO(2)=82733614
> 
> which is an error due to setting the mumps icntl option so low from what I've 
> gathered.
> 
> Is there any other way I can reduce memory usage?
> 
> Thanks,
> 
> Regards,
> 
> Perceval,
> 
> P.S. I sent the same email a few minutes ago but I think I made a mistake in 
> the address, I'm sorry if I've sent it twice.

Re: [petsc-users] Memory optimization

2019-11-25 Thread Jose E. Roman
Then I guess it is the factorization that is failing. How many nonzero entries 
do you have? Run with
-mat_view ::ascii_info
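
For reference, a minimal sketch of loading a matrix from a PETSc binary file so that it can be inspected this way (the file name is a placeholder):

    #include <petscmat.h>

    /* run the resulting program with -mat_view ::ascii_info to print the
       size and nonzero summary of each loaded matrix */
    static PetscErrorCode LoadMatrix(MPI_Comm comm,const char *filename,Mat *A)
    {
      PetscViewer    viewer;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = PetscViewerBinaryOpen(comm,filename,FILE_MODE_READ,&viewer);CHKERRQ(ierr);
      ierr = MatCreate(comm,A);CHKERRQ(ierr);
      ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
      ierr = MatLoad(*A,viewer);CHKERRQ(ierr);
      ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }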

Jose


> El 22 nov 2019, a las 19:56, Perceval Desforges 
>  escribió:
> 
> Hi,
> 
> Thanks for your answer. I tried looking at the inertias before solving, but 
> the problem is that the program crashes when I call EPSSetUp with this error:
> 
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 
> 107317760), being killed
> 
> I get this error even when there are no eigenvalues in the interval.
> 
> I've started using BVMAT instead of BVVECS by the way.
> 
> Thanks,
> 
> Perceval,
> 
> 
> 
> 
> 
>> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>> 
>> Most likely the problem is that the interval you gave is too large and 
>> contains too many eigenvalues (SLEPc needs to allocate at least one vector 
>> per each eigenvalue). You can count the eigenvalues in the interval with the 
>> inertias, which are available at EPSSetUp (no need to call EPSSolve). See 
>> this example:
>> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
>> You can comment out the call to EPSSolve() and run with the option 
>> -show_inertias
>> For example, the output
>>Shift 0.1  Inertia 3 
>>Shift 0.35  Inertia 11 
>> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>> 
>> By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
>> slower).
>> 
>> Jose
>> 
>> 
>>> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>>>  escribió:
>>> 
>>> Hello all,
>>> 
>>> I am trying to obtain all the eigenvalues in a certain interval for a 
>>> fairly large matrix (100 * 100). I therefore use the spectrum 
>>> slicing method detailed in section 3.4.5 of the manual. The calculations 
>>> are run on a processor with 20 cores and 96 GB of RAM.
>>> 
>>> The options I use are :
>>> 
>>> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
>>> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
>>> 
>>> 
>>> 
>>> However the program quickly crashes with this error:
>>> 
>>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 
>>> 107317760), being killed
>>> 
>>> I've tried reducing the amount of memory used by slepc with the 
>>> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get 
>>> this error:
>>> 
>>> [1]PETSC ERROR: Error in external library
>>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: 
>>> INFOG(1)=-9, INFO(2)=82733614
>>> 
>>> which is an error due to setting the mumps icntl option so low from what 
>>> I've gathered.
>>> 
>>> Is there any other way I can reduce memory usage?
>>> 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Regards,
>>> 
>>> Perceval,
>>> 
>>> 
>>> 
>>> P.S. I sent the same email a few minutes ago but I think I made a mistake 
>>> in the address, I'm sorry if I've sent it twice.
> 
> 



Re: [petsc-users] Memory optimization

2019-11-22 Thread Perceval Desforges
Hi, 

Thanks for your answer. I tried looking at the inertias before solving,
but the problem is that the program crashes when I call EPSSetUp with
this error: 

slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508
> 107317760), being killed 

I get this error even when there are no eigenvalues in the interval. 

I've started using BVMAT instead of BVVECS by the way. 

Thanks, 

Perceval, 

> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
> 
> Most likely the problem is that the interval you gave is too large and 
> contains too many eigenvalues (SLEPc needs to allocate at least one vector 
> per each eigenvalue). You can count the eigenvalues in the interval with the 
> inertias, which are available at EPSSetUp (no need to call EPSSolve). See 
> this example:
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option 
> -show_inertias
> For example, the output
> Shift 0.1  Inertia 3 
> Shift 0.35  Inertia 11 
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
> 
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
> slower).
> 
> Jose
> 
>> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>>  escribió:
>> 
>> Hello all,
>> 
>> I am trying to obtain all the eigenvalues in a certain interval for a fairly 
>> large matrix (100 * 100). I therefore use the spectrum slicing 
>> method detailed in section 3.4.5 of the manual. The calculations are run on 
>> a processor with 20 cores and 96 GB of RAM.
>> 
>> The options I use are :
>> 
>> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
>> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
>> 
>> However the program quickly crashes with this error:
>> 
>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 
>> 107317760), being killed
>> 
>> I've tried reducing the amount of memory used by slepc with the 
>> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get 
>> this error:
>> 
>> [1]PETSC ERROR: Error in external library
>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: 
>> INFOG(1)=-9, INFO(2)=82733614
>> 
>> which is an error due to setting the mumps icntl option so low from what 
>> I've gathered.
>> 
>> Is there any other way I can reduce memory usage?
>> 
>> Thanks,
>> 
>> Regards,
>> 
>> Perceval,
>> 
>> P.S. I sent the same email a few minutes ago but I think I made a mistake in 
>> the address, I'm sorry if I've sent it twice.

Re: [petsc-users] Memory optimization

2019-11-21 Thread Jose E. Roman via petsc-users
Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.

Most likely the problem is that the interval you gave is too large and contains 
too many eigenvalues (SLEPc needs to allocate at least one vector per each 
eigenvalue). You can count the eigenvalues in the interval with the inertias, 
which are available at EPSSetUp (no need to call EPSSolve). See this example:
http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
You can comment out the call to EPSSolve() and run with the option 
-show_inertias
For example, the output
   Shift 0.1  Inertia 3 
   Shift 0.35  Inertia 11 
means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
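
The same counts can also be obtained programmatically once EPSSetUp() has run; a minimal sketch using the Krylov-Schur interface, assuming spectrum slicing has already been configured on eps:

    #include <slepceps.h>

    static PetscErrorCode ShowInertias(EPS eps)
    {
      PetscInt       ns,i,*inertias;
      PetscReal      *shifts;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EPSSetUp(eps);CHKERRQ(ierr);   /* no EPSSolve() needed */
      ierr = EPSKrylovSchurGetInertias(eps,&ns,&shifts,&inertias);CHKERRQ(ierr);
      for (i=0;i<ns;i++) {
        ierr = PetscPrintf(PETSC_COMM_WORLD,"Shift %g  Inertia %D\n",(double)shifts[i],inertias[i]);CHKERRQ(ierr);
      }
      /* the difference between consecutive inertias is the number of
         eigenvalues in that subinterval */
      ierr = PetscFree(shifts);CHKERRQ(ierr);
      ierr = PetscFree(inertias);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }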

By the way, I would suggest using BVMAT instead of BVVECS (the latter is 
slower).
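
For reference, the BV type can be selected with -bv_type mat on the command line, or from code; a short sketch:

    BV bv;
    ierr = EPSGetBV(eps,&bv);CHKERRQ(ierr);
    ierr = BVSetType(bv,BVMAT);CHKERRQ(ierr);   /* instead of BVVECS */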

Jose


> El 21 nov 2019, a las 18:13, Perceval Desforges via petsc-users 
>  escribió:
> 
> Hello all,
> 
> I am trying to obtain all the eigenvalues in a certain interval for a fairly 
> large matrix (100 * 100). I therefore use the spectrum slicing method 
> detailed in section 3.4.5 of the manual. The calculations are run on a 
> processor with 20 cores and 96 GB of RAM.
> 
> The options I use are :
> 
> -bv_type vecs  -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
> 
> 
> 
> However the program quickly crashes with this error:
> 
> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 
> 107317760), being killed
> 
> I've tried reducing the amount of memory used by slepc with the 
> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get 
> this error:
> 
> [1]PETSC ERROR: Error in external library
> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: 
> INFOG(1)=-9, INFO(2)=82733614
> 
> which is an error due to setting the mumps icntl option so low from what I've 
> gathered.
> 
> Is there any other way I can reduce memory usage?
> 
> 
> 
> Thanks,
> 
> Regards,
> 
> Perceval,
> 
> 
> 
> P.S. I sent the same email a few minutes ago but I think I made a mistake in 
> the address, I'm sorry if I've sent it twice.
>