On Sun, May 24, 2015 at 8:57 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> I am using the MatLoad option as in the ex7.c code provided by SLEPc:
>
>    ierr = MatLoad(A,viewer);CHKERRQ(ierr);
>
> There is no problem here, right? Or is any additional option required for
> very large matrices while running the eigensolver in parallel?

This will load the matrix from the viewer (presumably disk). There are no
options for large matrices.

  Thanks,

     Matt

> cheers,
> Venkatesh

On Sat, May 23, 2015 at 5:43 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Sat, May 23, 2015 at 7:09 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi,
> Thanks.
> Per node it has 24 cores and each core has 4 GB RAM, and the job was
> submitted on 10 nodes.
>
> So, does it mean it requires 10G for one core, or for one node?

The error message from MUMPS said that it tried to allocate 10G. We must
assume each process tried to do the same thing. That means if you scheduled
24 processes on a node, it would try to allocate at least 240G, which is in
excess of what you specify above.

Note that this has nothing to do with PETSc. It is all in the documentation
for that machine and its scheduling policy.

  Thanks,

     Matt

> cheers,
> Venkatesh

On Sat, May 23, 2015 at 5:17 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Sat, May 23, 2015 at 6:44 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi,
> The same eigenproblem runs with 120 GB RAM on a serial machine in Matlab.
>
> In Cray I fired it with 240*4 GB RAM in parallel. So it has to go in,
> right?

I do not know how MUMPS allocates memory, but the message is unambiguous.
Also, this is concerned with the memory available per node. Do you know how
many processes per node were scheduled? The message below indicates that it
was trying to allocate 10G for one process.

> And for small matrices it is having negative scaling, i.e. 24 cores run
> faster.

Yes, for strong scaling you always get slowdown eventually since overheads
dominate work; see Amdahl's Law.

  Thanks,

     Matt

> I have attached the submission script.
>
> Please see it and kindly let me know.
>
> cheers,
> Venkatesh

On Sat, May 23, 2015 at 4:58 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Sat, May 23, 2015 at 2:39 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi again,
>
> I have installed PETSc and SLEPc on the Cray with the Intel compilers and
> MUMPS.
>
> I am getting this error when I solve an eigenvalue problem with large
> matrices:
>
>   [201]PETSC ERROR: Error reported by MUMPS in numerical factorization
>   phase: Cannot allocate required memory 9632 megabytes

It ran out of memory on the node.

> Also it is again not scaling well for small matrices.

MUMPS strong scaling for small matrices is not very good. Weak scaling is
looking at big matrices.

  Thanks,

     Matt

> Kindly let me know what to do.
>
> cheers,
> Venkatesh
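One way to see how much memory MUMPS thinks it needs, rather than waiting for
the allocation to fail, is to query its INFOG statistics from the factored
matrix once a smaller or lower-process-count run gets through the
factorization. A rough, untested sketch of what this could look like after
EPSSolve() in ex7.c, assuming the shift-invert ST uses a KSP/PC with a MUMPS
LU factor as in the command lines later in this thread; the INFOG entry
numbers used here (16/17 for the estimated per-process and total memory in
MB) should be double-checked against the MUMPS manual for your version:

    ST       st;
    KSP      ksp;
    PC       pc;
    Mat      F;
    PetscInt mem_proc,mem_total;

    ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
    ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCFactorGetMatrix(pc,&F);CHKERRQ(ierr);            /* the MUMPS factor */
    ierr = MatMumpsGetInfog(F,16,&mem_proc);CHKERRQ(ierr);    /* est. MB per process */
    ierr = MatMumpsGetInfog(F,17,&mem_total);CHKERRQ(ierr);   /* est. MB total */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"MUMPS estimated memory: %D MB per process, %D MB total\n",mem_proc,mem_total);CHKERRQ(ierr);

Multiplying the per-process estimate by the number of processes placed on a
node gives a quick check against the 24 cores x 4 GB available there.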
On Tue, May 19, 2015 at 3:02 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Tue, May 19, 2015 at 1:04 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi,
>
> I have attached the log of the command which I gave on the master node:
> make streams NPMAX=32
>
> I don't know why it says 'It appears you have only 1 node'. But other
> codes run in parallel with good scaling on 8 nodes.

If you look at the STREAMS numbers, you can see that your system is only able
to support about 2 cores with the available memory bandwidth. Thus for
bandwidth-constrained operations (almost everything in sparse linear algebra
and solvers), your speedup will not be bigger than 2.

Other codes may do well on this machine, but they would be compute
constrained, using things like DGEMM.

  Thanks,

     Matt

> Kindly let me know.
>
> Venkatesh

On Mon, May 18, 2015 at 11:21 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

Run the streams benchmark on this system and send the results:
http://www.mcs.anl.gov/petsc/documentation/faq.html#computers

> On May 18, 2015, at 11:14 AM, venkatesh g <venkateshg...@gmail.com> wrote:
>
> Hi,
> I have emailed the mumps-user list.
> Actually the cluster has 8 nodes with 16 cores, and other codes scale well.
> I wanted to ask: if this job takes much time, then if I submit on more
> cores, I have to increase icntl(14), which would again take a long time.
>
> So is there another way?
>
> cheers,
> Venkatesh

On Mon, May 18, 2015 at 7:16 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Mon, May 18, 2015 at 8:29 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi, I have attached the performance logs for 2 jobs on different numbers of
> processors. I had to increase the workspace icntl(14) when I submit on more
> cores, since it fails with a small value of icntl(14).
>
> 1. performance_log1.txt is run on 8 cores (option given: -mat_mumps_icntl_14 200)
> 2. performance_log2.txt is run on 2 cores (option given: -mat_mumps_icntl_14 85)

1) Your number of iterates increased from 7600 to 9600, but that is a
relatively small effect.

2) MUMPS is just taking a lot longer to do the forward/backward solve. You
might try emailing their list about it. However, I would bet that your system
has enough bandwidth for 2 procs and not much more.

  Thanks,

     Matt

> Venkatesh
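Since several of the answers above hinge on how many MPI ranks actually land
on each node, it can be worth printing that from the program itself rather
than trusting the scheduler defaults. A small, untested sketch (requires
MPI-3; the variable names are arbitrary, and it assumes it is placed after
SlepcInitialize() in ex7.c, where ierr is already declared):

    MPI_Comm    nodecomm;
    PetscMPIInt ranks_per_node;

    /* group the ranks that share a node (shared-memory domain) and count them */
    MPI_Comm_split_type(PETSC_COMM_WORLD,MPI_COMM_TYPE_SHARED,0,MPI_INFO_NULL,&nodecomm);
    MPI_Comm_size(nodecomm,&ranks_per_node);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"ranks on my node: %d\n",ranks_per_node);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD,PETSC_STDOUT);CHKERRQ(ierr);
    MPI_Comm_free(&nodecomm);

With 24 ranks reported per node and MUMPS asking for roughly 10 GB per
process, the 24 x 4 GB = 96 GB available on a node is exhausted; asking the
scheduler for fewer processes per node is the usual workaround.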
On Sun, May 17, 2015 at 6:13 PM, Matthew Knepley <knep...@gmail.com> wrote:

On Sun, May 17, 2015 at 1:38 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi,
> Thanks for the information. I now increased the workspace by adding
> '-mat_mumps_icntl_14 100'.
>
> It works. However, the problem is, if I submit on 1 core I get the answer
> in 200 secs, but with 4 cores and '-mat_mumps_icntl_14 100' it takes
> 3500 secs.

Send the output of -log_summary for all performance queries. Otherwise we are
just guessing.

   Matt

> My command line is:
>
>   mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14 100
>
> Kindly let me know.
>
> Venkatesh

On Sat, May 16, 2015 at 7:10 PM, David Knezevic <david.kneze...@akselos.com> wrote:

On Sat, May 16, 2015 at 8:08 AM, venkatesh g <venkateshg...@gmail.com> wrote:
> Hi,
> I am trying to solve the AX = lambda BX eigenvalue problem.
>
> A and B are of size 3600 x 3600.
>
> I run with this command:
>
>   mpiexec -np 4 ./ex7 -f1 a2 -f2 b2 -eps_nev 1 -st_type sinvert -eps_max_it 5000 -st_ksp_type preonly -st_pc_type lu -st_pc_factor_mat_solver_package mumps
>
> I get this error (I get a result only when I give 1 or 2 processors):
>
>   Reading COMPLEX matrices from binary files...
>   [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>   [0]PETSC ERROR: Error in external library!
>   [0]PETSC ERROR: Error reported by MUMPS in numerical factorization phase: INFO(1)=-9, INFO(2)=2024

The MUMPS error types are described in Chapter 7 of the MUMPS manual. In this
case you have INFO(1)=-9, which is explained in the manual as:

  "-9 Main internal real/complex workarray S too small. If INFO(2) is
  positive, then the number of entries that are missing in S at the moment
  when the error is raised is available in INFO(2). If INFO(2) is negative,
  then its absolute value should be multiplied by 1 million. If an error -9
  occurs, the user should increase the value of ICNTL(14) before calling the
  factorization (JOB=2) again, except if ICNTL(23) is provided, in which case
  ICNTL(23) should be increased."

This says that you should use ICNTL(14) to increase the working space size:

  "ICNTL(14) is accessed by the host both during the analysis and the
  factorization phases. It corresponds to the percentage increase in the
  estimated working space. When significant extra fill-in is caused by
  numerical pivoting, increasing ICNTL(14) may help. Except in special cases,
  the default value is 20 (which corresponds to a 20% increase)."
So, for example, you can avoid this error via the following command-line
argument to PETSc: "-mat_mumps_icntl_14 30", where 30 indicates that we allow
a 30% increase in the workspace instead of the default 20%.

David
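As a footnote to David's suggestion, the same workspace option can also be
set from the application code instead of the command line, via the options
database, before the solver objects trigger the factorization. A rough,
untested sketch; note that the two-argument form of PetscOptionsSetValue()
shown here is the pre-3.7 API, and newer PETSc versions take a leading
PetscOptions argument (NULL for the global database):

    /* Equivalent to passing -mat_mumps_icntl_14 50 on the command line:
       allow a 50% workspace increase for the MUMPS factorization.
       Must be set before the factorization is triggered inside EPSSolve(). */
    ierr = PetscOptionsSetValue("-mat_mumps_icntl_14","50");CHKERRQ(ierr);

    /* ICNTL(23) instead gives an absolute per-process workspace limit in MB */
    /* ierr = PetscOptionsSetValue("-mat_mumps_icntl_23","8000");CHKERRQ(ierr); */

    ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
    ierr = EPSSolve(eps);CHKERRQ(ierr);

Setting it this way saves having to remember the flag in every job script,
but it behaves exactly like the command-line option David describes.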