Hi all, I tried to run my generalized eigenproblem on 120 x 24 = 2880 cores. The matrix sizes are A = 20 GB and B = 5 GB.
It got killed after 7 Hrs of run time. Please see the mumps error log. Why must it fail ?

I gave the command:

aprun -n 240 -N 24 ./ex7 -f1 a110t -f2 b110t -st_type sinvert -eps_nev 1 -log_summary -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_cntl_1 1e-2

Kindly let me know.

cheers,
Venkatesh

On Fri, May 29, 2015 at 10:46 PM, venkatesh g <venkateshg...@gmail.com> wrote:

> Hi Matt, users,
>
> Thanks for the info. Do you also use Petsc and Slepc with MUMPS ? I get
> into the segmentation error if I increase my matrix size.
>
> Can you suggest other software for direct solver for QR in parallel since
> as LU may not be good for a singular B matrix in Ax=lambda Bx ? I am
> attaching the working version mumps log.
>
> My matrix size here is around 47000x47000. If I am not wrong, the memory
> usage per core is 272MB.
>
> Can you tell me if I am wrong ? or really if its light on memory for this
> matrix ?
>
> Thanks
> cheers,
> Venkatesh
>
> On Fri, May 29, 2015 at 4:00 PM, Matt Landreman <matt.landre...@gmail.com>
> wrote:
>
>> Dear Venkatesh,
>>
>> As you can see in the error log, you are now getting a segmentation
>> fault, which is almost certainly a separate issue from the info(1)=-9
>> memory problem you had previously. Here is one idea which may or may not
>> help. I've used mumps on the NERSC Edison system, and I found that I
>> sometimes get segmentation faults when using the default Intel compiler.
>> When I switched to the cray compiler the problem disappeared. So you could
>> perhaps try a different compiler if one is available on your system.
>>
>> Matt
>> On May 29, 2015 4:04 AM, "venkatesh g" <venkateshg...@gmail.com> wrote:
>>
>>> Hi Matt,
>>>
>>> I did what you told and read the manual of that CNTL parameters. I solve
>>> for that with CNTL(1)=1e-4. It is working.
>>>
>>> But it was a test matrix with size 46000x46000. Actual matrix size is
>>> 108900x108900 and will increase in the future.
>>>
>>> I get this error of memory allocation failed. And the binary matrix size
>>> of A is 20GB and B is 5 GB.
>>>
>>> Now I submit this in 240 processors each 4 GB RAM and also in 128
>>> Processors with total 512 GB RAM.
>>>
>>> In both the cases, it fails with the following error like memory is not
>>> enough. But for 90000x90000 size it had run serially in Matlab with <256 GB
>>> RAM.
>>>
>>> Kindly let me know.
>>>
>>> Venkatesh
>>>
>>> On Tue, May 26, 2015 at 8:02 PM, Matt Landreman <
>>> matt.landre...@gmail.com> wrote:
>>>
>>>> Hi Venkatesh,
>>>>
>>>> I've struggled a bit with mumps memory allocation too. I think the
>>>> behavior of mumps is roughly the following. First, in the "analysis step",
>>>> mumps computes a minimum memory required based on the structure of nonzeros
>>>> in the matrix. Then when it actually goes to factorize the matrix, if it
>>>> ever encounters an element smaller than CNTL(1) (default=0.01) in the
>>>> diagonal of a sub-matrix it is trying to factorize, it modifies the
>>>> ordering to avoid the small pivot, which increases the fill-in (hence
>>>> memory needed). ICNTL(14) sets the margin allowed for this unanticipated
>>>> fill-in. Setting ICNTL(14)=200000 as in your email is not the solution,
>>>> since this means mumps asks for a huge amount of memory at the start.
>>>> Better would be to lower CNTL(1) or (I think) use static pivoting
>>>> (CNTL(4)). Read the section in the mumps manual about these CNTL
>>>> parameters. I typically set CNTL(1)=1e-6, which eliminated all the
>>>> INFO(1)=-9 errors for my problem, without having to modify ICNTL(14).
>>>>
>>>> Also, I recommend running with ICNTL(4)=3 to display diagnostics. Look
>>>> for the line in standard output that says "TOTAL space in MBYTES for IC
>>>> factorization". This is the amount of memory that mumps is trying to
>>>> allocate, and for the default ICNTL(14), it should be similar to matlab's
>>>> need.
>>>>
>>>> Hope this helps,
>>>> -Matt Landreman
>>>> University of Maryland
>>>>
>>>> On Tue, May 26, 2015 at 10:03 AM, venkatesh g <venkateshg...@gmail.com>
>>>> wrote:
>>>>
>>>>> I posted a while ago in MUMPS forums but no one seems to reply.
>>>>>
>>>>> I am solving a large generalized Eigenvalue problem.
>>>>>
>>>>> I am getting the following error which is attached, after giving the
>>>>> command:
>>>>>
>>>>> /cluster/share/venkatesh/petsc-3.5.3/linux-gnu/bin/mpiexec -np 64
>>>>> -hosts compute-0-4,compute-0-6,compute-0-7,compute-0-8 ./ex7 -f1 a72t -f2
>>>>> b72t -st_type sinvert -eps_nev 3 -eps_target 0.5 -st_ksp_type preonly
>>>>> -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14
>>>>> 200000
>>>>>
>>>>> IT IS impossible to allocate so much memory per processor.. it is
>>>>> asking like around 70 GB per processor.
>>>>>
>>>>> A serial job in MATLAB for the same matrices takes < 60GB.
>>>>>
>>>>> After trying out superLU_dist, I have attached the error there also
>>>>> (segmentation error).
>>>>>
>>>>> Kindly help me.
>>>>>
>>>>> Venkatesh
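To make Matt's point about ICNTL(14) concrete, here is a minimal sketch of the arithmetic. It assumes ICNTL(14) is applied as a percentage increase on top of the workspace estimated in the analysis step, which is how I read the MUMPS manual. The function name and the 100 MB base estimate are hypothetical and chosen only for illustration; they do not come from any log in this thread.

```python
def relaxed_workspace_mb(estimate_mb: float, icntl14_percent: float) -> float:
    """Workspace requested after relaxing the analysis estimate.

    Assumption: ICNTL(14) is a percentage increase over the estimate,
    so the request is estimate * (1 + ICNTL(14) / 100).
    """
    return estimate_mb * (1.0 + icntl14_percent / 100.0)


# Hypothetical per-process estimate in MB (illustration only).
base_estimate_mb = 100.0

# 35 is the effective relaxation reported in the attached log;
# 200000 is the value passed via -mat_mumps_icntl_14 in the first email.
for percent in (35, 200000):
    requested = relaxed_workspace_mb(base_estimate_mb, percent)
    factor = requested / base_estimate_mb
    print(f"ICNTL(14)={percent}: {requested:.0f} MB ({factor:.2f}x the estimate)")
```

Under this assumption ICNTL(14)=200000 multiplies the estimate by about 2001, which would explain a request of tens of GB per process even when the underlying estimate is modest.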
Generalized eigenproblem stored in file.

Reading COMPLEX matrices from binary files...
Entering ZMUMPS driver with JOB, N, NZ = 1 108900 0

ZMUMPS 4.10.0
L U Solver for unsymmetric matrices
Type of parallelism: Working host

****** ANALYSIS STEP ********

** Max-trans not allowed because matrix is distributed
... Structural symmetry (in percent)= 70
Density: NBdense, Average, Median = 01232012102
Ordering based on METIS
A root of estimated size 41878 has been selected for Scalapack.

Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 5475631310
-- (3) Storage of factors (REAL, estimated) = 89125045501
-- (4) Storage of factors (INT , estimated) = 655485547
-- (5) Maximum frontal size (estimated) = 41878
-- (6) Number of nodes in the tree = 471
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL(6) Maximum transversal option = 0
ICNTL(7) Pivot order option = 7
Percentage of memory relaxation (effective) = 35
Number of level 2 nodes = 439
Number of split nodes = 149
RINFOG(1) Operations during elimination (estim)= 1.584D+14
Distributed matrix entry format (ICNTL(18)) = 3
** Rank of proc needing largest memory in IC facto : 0
** Estimated corresponding MBYTES for IC facto : 33885
** Estimated avg. MBYTES per work. proc at facto (IC) : 8679
** TOTAL space in MBYTES for IC factorization : 24997648
** Rank of proc needing largest memory for OOC facto : 0
** Estimated corresponding MBYTES for OOC facto : 33683
** Estimated avg. MBYTES per work. proc at facto (OOC) : 8398
** TOTAL space in MBYTES for OOC factorization : 24187332
Entering ZMUMPS driver with JOB, N, NZ = 2 108900 1035808400

****** FACTORIZATION STEP ********

GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
NUMBER OF WORKING PROCESSES = 2880
OUT-OF-CORE OPTION (ICNTL(22)) = 0
REAL SPACE FOR FACTORS = 89125045501
INTEGER SPACE FOR FACTORS = 655485547
MAXIMUM FRONTAL SIZE (ESTIMATED) = 41878
NUMBER OF NODES IN THE TREE = 471
Convergence error after scaling for ONE-NORM (option 7/8) = 0.95D+00
Maximum effective relaxed size of S = 1630229452
Average effective relaxed size of S = 225206115
[NID 01214] 2015-05-31 17:36:18 Apid 409924: initiated application termination
[NID 01214] 2015-05-31 17:34:59 Apid 409924: OOM killer terminated this process.
Application 409924 exit signals: Killed
Application 409924 resources: utime ~0s, stime ~225s, Rss ~7716, inblocks ~192480, outblocks ~28560
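
The per-node memory demand implied by the estimates in the log above can be checked with a short calculation. This is only a rough sketch: the constants are copied from the log and from the aprun command (-N 24), while the function names and the 64 GB node size in the usage lines are hypothetical, since the actual memory per node is not stated anywhere in this thread.

```python
# Figures copied from the ANALYSIS STEP of the MUMPS log.
TOTAL_IC_MB = 24_997_648   # TOTAL space in MBYTES for IC factorization
AVG_IC_MB = 8_679          # estimated avg. MBYTES per working process (IC)
MAX_IC_MB = 33_885         # estimated MBYTES for the largest process (rank 0)
N_PROCS = 2880             # NUMBER OF WORKING PROCESSES
PROCS_PER_NODE = 24        # from "aprun ... -N 24"


def average_node_demand_mb(avg_mb: int, procs_per_node: int) -> int:
    """Memory needed on a node that hosts only average-sized processes."""
    return avg_mb * procs_per_node


def worst_node_demand_mb(max_mb: int, avg_mb: int, procs_per_node: int) -> int:
    """Memory needed on the node hosting the largest process plus average ones."""
    return max_mb + (procs_per_node - 1) * avg_mb


def fits_on_node(node_mem_mb: int) -> bool:
    """True if the worst-case node demand fits in the given node memory."""
    return worst_node_demand_mb(MAX_IC_MB, AVG_IC_MB, PROCS_PER_NODE) <= node_mem_mb


# Consistency check: total / processes reproduces the reported average.
print(TOTAL_IC_MB // N_PROCS)                                        # 8679
print(average_node_demand_mb(AVG_IC_MB, PROCS_PER_NODE))             # 208296
print(worst_node_demand_mb(MAX_IC_MB, AVG_IC_MB, PROCS_PER_NODE))    # 233502

# Hypothetical node size of 64 GB, for illustration only.
print(fits_on_node(64 * 1024))                                       # False
```

If these estimates are right, each node would need roughly 200 GB with 24 processes per node, so an OOM kill would be expected unless the nodes are that large. Running fewer processes per node (a smaller -N) reduces the per-node demand in proportion, assuming the per-process estimate stays about the same.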