I guess conditioning is getting worse. I would try using MUMPS for the sub_pc LU.

Jose
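(For reference, a minimal sketch of the option line that suggestion corresponds to, assuming a PETSc version where the factorization package is selected with -pc_factor_mat_solver_type; older releases used -pc_factor_mat_solver_package for the same purpose:)

  -st_pc_type asm -st_sub_pc_type lu -st_sub_pc_factor_mat_solver_type mumps

Only the package performing the subdomain LU factorization changes; the other solver options quoted in the message below would stay as they are.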
> On 10 Dec 2019, at 18:36, Perceval Desforges <perceval.desfor...@polytechnique.edu> wrote:
>
> Hello again,
>
> I have tried following your advice to use preconditioned iterative solvers for my 3D systems, and have been encountering some difficulties. I have been following the recommendations of section 3.4.1 of the SLEPc user's manual, setting the following options: -st_ksp_type gmres -ksp_gmres_modifiedgramschmidt -st_pc_type asm -st_sub_pc_type lu -st_ksp_rtol 1e-9 -st_ksp_converged_reason -st_ksp_monitor_true_residual.
>
> The problem is that the code converges quite rapidly for the first eigenvalues (at around 0.4 in my case, and in about 20 iterations for each). The last eigenvalue obtained is a bit higher than 0.5. However, when I set the shift to 0.5, it does not converge even after 10000 iterations, and the residual norm is still at around 0.01.
>
> This only seems to be happening when my matrix is large enough (10^6 by 10^6).
>
> Is there something obvious I am doing wrong?
>
> Thanks for your time,
>
> Regards,
>
> Perceval,
>
>
>> In 3D problems it is recommended to use preconditioned iterative solvers. Unfortunately the spectrum slicing technique requires the full factorization (because it uses matrix inertia).
>>
>>
>>> On 25 Nov 2019, at 18:44, Perceval Desforges <perceval.desfor...@polytechnique.edu> wrote:
>>>
>>> I am basically trying to solve a finite element problem, which is why in 3D I have 7 non-zero diagonals that are quite far apart from one another. In 2D I only have 5 non-zero diagonals that are less far apart. So is it normal that the setup time is around 400 times greater in the 3D case? Is there nothing to be done?
>>>
>>> I will try setting up only one partition.
>>>
>>> Thanks,
>>>
>>> Perceval,
>>>
>>>> Probably it is not a preallocation issue, as it shows "total number of mallocs used during MatSetValues calls =0".
>>>>
>>>> Adding new diagonals may increase fill-in a lot, if the new diagonals are displaced with respect to the other ones.
>>>>
>>>> The partitions option is intended for running on several nodes. If you are using just one node, it is probably better to set one partition only.
>>>>
>>>> Jose
>>>>
>>>>
>>>>> On 25 Nov 2019, at 18:25, Matthew Knepley <knep...@gmail.com> wrote:
>>>>>
>>>>> On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges <perceval.desfor...@polytechnique.edu> wrote:
>>>>> Hi,
>>>>>
>>>>> So I'm loading two matrices from files, both 1000000 by 1000000. I ran the program with -mat_view::ascii_info and I got:
>>>>>
>>>>> Mat Object: 1 MPI processes
>>>>>   type: seqaij
>>>>>   rows=1000000, cols=1000000
>>>>>   total: nonzeros=7000000, allocated nonzeros=7000000
>>>>>   total number of mallocs used during MatSetValues calls =0
>>>>>     not using I-node routines
>>>>>
>>>>> 20 times, and then
>>>>>
>>>>> Mat Object: 1 MPI processes
>>>>>   type: seqaij
>>>>>   rows=1000000, cols=1000000
>>>>>   total: nonzeros=1000000, allocated nonzeros=1000000
>>>>>   total number of mallocs used during MatSetValues calls =0
>>>>>     not using I-node routines
>>>>>
>>>>> 20 times as well, and then
>>>>>
>>>>> Mat Object: 1 MPI processes
>>>>>   type: seqaij
>>>>>   rows=1000000, cols=1000000
>>>>>   total: nonzeros=7000000, allocated nonzeros=7000000
>>>>>   total number of mallocs used during MatSetValues calls =0
>>>>>     not using I-node routines
>>>>>
>>>>> 20 times as well before crashing.
>>>>>
>>>>> I realized it might be because I am setting up 20 Krylov-Schur partitions, which may be too many. I tried running the code again with only 2 partitions and now the code runs, but I have speed issues.
>>>>>
>>>>> I have one version of the code where my first matrix has 5 non-zero diagonals (so 5000000 non-zero entries), and the setup time is quite fast (8 seconds) and solving is also quite fast. The second version is the same but I have two extra non-zero diagonals (7000000 non-zero entries), and the setup time is a lot slower (2900 seconds ~ 50 minutes) and solving is also a lot slower. Is it normal that adding two extra diagonals increases solve and setup time so much?
>>>>>
>>>>> I can't see the rest of your code, but I am guessing your preallocation statement has "5", so it does no mallocs when you create your first matrix, but mallocs for every row when you create your second matrix. When you load them from disk, we do all the preallocation correctly.
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>> Thanks again,
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Perceval,
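(For reference, a minimal sketch of the point Matt is making about preallocation; the matrix creation below is hypothetical and not taken from the poster's actual code:)

#include <petscmat.h>

/* Sketch: preallocate for the true number of nonzeros per row (7 for the
   seven-diagonal operator) so that MatSetValues() triggers no mallocs. */
int main(int argc, char **argv)
{
  Mat      A;
  PetscInt n = 1000000;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MatCreate(PETSC_COMM_SELF, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetType(A, MATSEQAIJ);
  MatSeqAIJSetPreallocation(A, 7, NULL);   /* preallocating only 5 would malloc on every row */
  /* ... MatSetValues() loop filling the seven diagonals goes here ... */
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

With the preallocation matching the actual fill, -mat_view ::ascii_info should keep reporting zero mallocs during MatSetValues for both versions of the code.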
>>>>>> Then I guess it is the factorization that is failing. How many nonzero entries do you have? Run with
>>>>>> -mat_view ::ascii_info
>>>>>>
>>>>>> Jose
>>>>>>
>>>>>>
>>>>>>> On 22 Nov 2019, at 19:56, Perceval Desforges <perceval.desfor...@polytechnique.edu> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Thanks for your answer. I tried looking at the inertias before solving, but the problem is that the program crashes when I call EPSSetUp with this error:
>>>>>>>
>>>>>>> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 > 107317760), being killed
>>>>>>>
>>>>>>> I get this error even when there are no eigenvalues in the interval.
>>>>>>>
>>>>>>> I've started using BVMAT instead of BVVECS by the way.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Perceval,
>>>>>>>
>>>>>>>
>>>>>>>> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>>>>>>>>
>>>>>>>> Most likely the problem is that the interval you gave is too large and contains too many eigenvalues (SLEPc needs to allocate at least one vector per eigenvalue). You can count the eigenvalues in the interval with the inertias, which are available at EPSSetUp (no need to call EPSSolve). See this example: http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
>>>>>>>> You can comment out the call to EPSSolve() and run with the option -show_inertias
>>>>>>>> For example, the output
>>>>>>>>   Shift 0.1   Inertia 3
>>>>>>>>   Shift 0.35  Inertia 11
>>>>>>>> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>>>>>>>>
>>>>>>>> By the way, I would suggest using BVMAT instead of BVVECS (the latter is slower).
>>>>>>>>
>>>>>>>> Jose
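(For reference, a rough sketch in the spirit of ex25.c linked above: stop after EPSSetUp() and query the inertias directly. It assumes a generalized symmetric problem with matrices A and B already loaded; the interval endpoints and names are only illustrative, not the poster's settings.)

#include <slepceps.h>

int main(int argc, char **argv)
{
  EPS       eps;
  ST        st;
  KSP       ksp;
  PC        pc;
  Mat       A, B;               /* assumed loaded elsewhere, e.g. with MatLoad() */
  PetscInt  i, ns, *inertias;
  PetscReal *shifts;

  SlepcInitialize(&argc, &argv, NULL, NULL);
  /* ... load A and B here ... */
  EPSCreate(PETSC_COMM_WORLD, &eps);
  EPSSetOperators(eps, A, B);
  EPSSetProblemType(eps, EPS_GHEP);
  EPSSetType(eps, EPSKRYLOVSCHUR);
  EPSSetWhichEigenpairs(eps, EPS_ALL);
  EPSSetInterval(eps, 0.0, 0.5);          /* example interval */

  /* spectrum slicing needs shift-and-invert with a direct factorization */
  EPSGetST(eps, &st);
  STSetType(st, STSINVERT);
  STGetKSP(st, &ksp);
  KSPSetType(ksp, KSPPREONLY);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCCHOLESKY);
  EPSSetFromOptions(eps);

  EPSSetUp(eps);                          /* inertias are available here, no EPSSolve() needed */
  EPSKrylovSchurGetInertias(eps, &ns, &shifts, &inertias);
  for (i = 0; i < ns; i++)
    PetscPrintf(PETSC_COMM_WORLD, "Shift %g  Inertia %D\n", (double)shifts[i], inertias[i]);
  PetscFree(shifts);
  PetscFree(inertias);

  EPSDestroy(&eps);
  SlepcFinalize();
  return 0;
}

The difference between consecutive inertias is the number of eigenvalues between the corresponding shifts, which is what the -show_inertias option of ex25.c prints.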
>>>>>>>>
>>>>>>>>> On 21 Nov 2019, at 18:13, Perceval Desforges via petsc-users <petsc-users@mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>> Hello all,
>>>>>>>>>
>>>>>>>>> I am trying to obtain all the eigenvalues in a certain interval for a fairly large matrix (1000000 * 1000000). I therefore use the spectrum slicing method detailed in section 3.4.5 of the manual. The calculations are run on a processor with 20 cores and 96 GB of RAM.
>>>>>>>>>
>>>>>>>>> The options I use are:
>>>>>>>>>
>>>>>>>>> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
>>>>>>>>>
>>>>>>>>> However the program quickly crashes with this error:
>>>>>>>>>
>>>>>>>>> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 > 107317760), being killed
>>>>>>>>>
>>>>>>>>> I've tried reducing the amount of memory used by SLEPc with the -mat_mumps_icntl_14 option, by setting it to -70 for example, but then I get this error:
>>>>>>>>>
>>>>>>>>> [1]PETSC ERROR: Error in external library
>>>>>>>>> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFOG(1)=-9, INFO(2)=82733614
>>>>>>>>>
>>>>>>>>> which, from what I've gathered, is an error due to setting that MUMPS option so low.
>>>>>>>>>
>>>>>>>>> Is there any other way I can reduce memory usage?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Perceval,
>>>>>>>>>
>>>>>>>>> P.S. I sent the same email a few minutes ago, but I think I made a mistake in the address; I'm sorry if I've sent it twice.
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
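(For reference, a minimal sketch of how the command-line options quoted in this first message could equivalently be supplied from the code itself; this uses a generic PETSc mechanism and is not something specific to this thread.)

#include <slepceps.h>

int main(int argc, char **argv)
{
  SlepcInitialize(&argc, &argv, NULL, NULL);
  /* same options as on the command line above */
  PetscOptionsInsertString(NULL,
    "-bv_type vecs -eps_krylovschur_detect_zeros 1 "
    "-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");
  /* ... create the matrices and the EPS, then call EPSSetFromOptions(),
     EPSSetUp(), and EPSSolve() as usual ... */
  SlepcFinalize();
  return 0;
}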