Re: [petsc-dev] MatMult on Summit

2019-09-20 Thread Smith, Barry F. via petsc-dev
Dang, makes the GPUs less impressive :-). > On Sep 21, 2019, at 12:44 AM, Zhang, Junchao wrote: > Here are CPU version results on one node with 24 cores, 42 cores. Click the links for core layout. 24 MPI ranks, https://jsrunvisualizer.olcf.ornl.gov/?s4f1o01n6c4g1r14d1b21l0=

Re: [petsc-dev] MatMult on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
Here are CPU version results on one node with 24 cores, 42 cores. Click the links for core layout.
24 MPI ranks, https://jsrunvisualizer.olcf.ornl.gov/?s4f1o01n6c4g1r14d1b21l0=
MatMult 100 1.0 3.1431e+00 1.0 2.63e+09 1.2 1.9e+04 5.9e+04 0.0e+00 8 99 97 25 0 100100100100 0 17948

[petsc-dev] Tip while using valgrind

2019-09-20 Thread Smith, Barry F. via petsc-dev
When using valgrind it is important to understand that it does not immediately make a report when it finds uninitialized memory; it only makes a report when the uninitialized memory would cause a change in the program flow (like in an if statement). This is why sometimes it seems to
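
(For illustration only, not part of the original mail: a minimal stand-alone C sketch of the behavior described above. Under valgrind, reading or copying the uninitialized value is silent; the "Conditional jump or move depends on uninitialised value(s)" report appears only at the branch that depends on it.)

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
  int *a = (int *)malloc(10 * sizeof(int)); /* contents never initialized */
  int  b = a[3];                            /* copying uninitialized data: no report yet */

  if (b > 0) {                              /* program flow depends on it: valgrind reports here */
    printf("positive\n");
  }
  free(a);
  return 0;
}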

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Smith, Barry F. via petsc-dev
Then the hang is curious. > On Sep 20, 2019, at 11:28 PM, Mills, Richard Tran wrote: > Everything that Barry says about '--with-batch' is valid, but let me point out one thing about Summit: You don't need "--with-batch" at all, because the Summit login/compile nodes run the same

Re: [petsc-dev] MatMult on Summit

2019-09-20 Thread Smith, Barry F. via petsc-dev
Junchao, Very interesting. For completeness please also run 24 and 42 CPUs without the GPUs. Note that the default layout for CPU cores is not good. You will want 3 cores on each socket, then 12 on each. Thanks, Barry. Since Tim is one of our reviewers next week this is a very

Re: [petsc-dev] MatMult on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
Click the links to visualize it. 6 ranks https://jsrunvisualizer.olcf.ornl.gov/?s4f1o01n6c1g1r11d1b21l0=
jsrun -n 6 -a 1 -c 1 -g 1 -r 6 --latency_priority GPU-GPU --launch_distribution packed --bind packed:1 js_task_info ./ex900 -f HV15R.aij -mat_type aijcusparse -vec_type cuda -n 100 -log_view

Re: [petsc-dev] MatMult on Summit

2019-09-20 Thread Mills, Richard Tran via petsc-dev
Junchao, Can you share your 'jsrun' command so that we can see how you are mapping things to resource sets? --Richard On 9/20/19 11:22 PM, Zhang, Junchao via petsc-dev wrote: I downloaded a sparse matrix (HV15R) from the Florida Sparse Matrix Collection. Its

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Mills, Richard Tran via petsc-dev
Everything that Barry says about '--with-batch' is valid, but let me point out one thing about Summit: You don't need "--with-batch" at all, because the Summit login/compile nodes run the same hardware (minus the GPUs) and software stack as the back-end compute nodes. This makes configuring and

[petsc-dev] MatMult on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
I downloaded a sparse matrix (HV15R) from the Florida Sparse Matrix Collection. Its size is about 2M x 2M. Then I ran the same MatMult 100 times on one node of Summit with -mat_type aijcusparse -vec_type cuda. I found MatMult was almost dominated by VecScatter
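
(Illustrative sketch, not the ex900 program referenced in this thread: a minimal PETSc C program that loads a binary matrix and applies MatMult repeatedly, so runs like the above can be reproduced in spirit with options such as -f HV15R.aij -mat_type aijcusparse -vec_type cuda -n 100 -log_view. The option names and handling here are assumptions.)

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, y;
  PetscViewer    viewer;
  char           file[PETSC_MAX_PATH_LEN];
  PetscInt       i, n = 100;
  PetscBool      flg;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = PetscOptionsGetString(NULL, NULL, "-f", file, sizeof(file), &flg);CHKERRQ(ierr);  /* binary matrix file */
  ierr = PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL);CHKERRQ(ierr);                     /* number of MatMult repetitions */

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, file, FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);          /* honors -mat_type aijcusparse */
  ierr = MatLoad(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr);      /* vector type follows the matrix / -vec_type */
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {ierr = MatMult(A, x, y);CHKERRQ(ierr);}  /* timed via -log_view */

  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}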

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
Richard, I almost copied arch-olcf-summit-opt.py. The hang is random. I hit it a few weeks ago, retried, and it passed. It happened again today when I did a fresh configure. On Summit login nodes, mpiexec is actually in everyone's PATH. I did "ps ux" and found the script was executing "mpiexec

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Jed Brown via petsc-dev
"Smith, Barry F." writes: >> Satish and Barry: Do we need the Error codes or can I revert to previous >> functionality? > > I think it is important to display the error codes. > > How about displaying at the bottom how to run the broken tests? You already > show how to run them with the

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Smith, Barry F. via petsc-dev
> On Sep 20, 2019, at 4:18 PM, Scott Kruger via petsc-dev wrote: > On 9/20/19 2:49 PM, Jed Brown wrote: >> Hapla Vaclav via petsc-dev writes: >>> On 20 Sep 2019, at 19:59, Scott Kruger wrote: >>> On 9/20/19 10:44 AM, Hapla Vaclav

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Smith, Barry F. via petsc-dev
--with-batch is still there and should be used in such circumstances. The difference is that --with-batch does not generate a program that you need to submit to the batch system before continuing the configure. Instead --with-batch guesses at and skips some of the tests (with clear

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Mills, Richard Tran via petsc-dev
Hi Junchao, Glad you've found a workaround, but I don't know why you are hitting this problem. The last time I built PETSc on Summit (just a couple days ago), I didn't have this problem. I'm working from the example template that's in the PETSc repo at config/examples/arch-olcf-summit-opt.py.

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Hapla Vaclav via petsc-dev
> On 20 Sep 2019, at 23:18, Scott Kruger wrote: > On 9/20/19 2:49 PM, Jed Brown wrote: >> Hapla Vaclav via petsc-dev writes: >>> On 20 Sep 2019, at 19:59, Scott Kruger wrote: >>> On 9/20/19 10:44 AM, Hapla Vaclav via petsc-dev wrote:

Re: [petsc-dev] Configure hangs on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
Satish's trick --with-mpiexec=/bin/true solved the problem. Thanks. --Junchao Zhang On Fri, Sep 20, 2019 at 3:50 PM Junchao Zhang wrote: My configure hangs on Summit at TESTING: configureMPIEXEC from

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Scott Kruger via petsc-dev
On 9/20/19 2:49 PM, Jed Brown wrote: Hapla Vaclav via petsc-dev writes: On 20 Sep 2019, at 19:59, Scott Kruger wrote: On 9/20/19 10:44 AM, Hapla Vaclav via petsc-dev wrote: I used to copy the command actually run by the test harness, change to the example's

[petsc-dev] Configure hangs on Summit

2019-09-20 Thread Zhang, Junchao via petsc-dev
My configure hangs on Summit at TESTING: configureMPIEXEC from config.packages.MPI(config/BuildSystem/config/packages/MPI.py:170) On the machine one has to use a script to submit jobs. So why do we need configureMPIEXEC? Do I need to use --with-batch? I remember we removed that. --Junchao

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Jed Brown via petsc-dev
Hapla Vaclav via petsc-dev writes: > On 20 Sep 2019, at 19:59, Scott Kruger wrote: > On 9/20/19 10:44 AM, Hapla Vaclav via petsc-dev wrote: > I used to copy the command actually run by the test harness, change to the example's directory and paste the command

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Hapla Vaclav via petsc-dev
On 20 Sep 2019, at 19:59, Scott Kruger wrote: On 9/20/19 10:44 AM, Hapla Vaclav via petsc-dev wrote: I used to copy the command actually run by the test harness, change to the example's directory and paste the command (just changing one .. to ., e.g. ../ex1 to ./ex1).

Re: [petsc-dev] Should we add something about GPU support to the user manual?

2019-09-20 Thread Bisht, Gautam via petsc-dev
Hi Richard, Information about PETSc’s support for GPUs would be super helpful. Btw, I noticed that at the PETSc User 2019 meeting you gave a talk on "Progress with PETSc on Manycore and GPU-based Systems on the Path to Exascale", but the

Re: [petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Scott Kruger via petsc-dev
On 9/20/19 10:44 AM, Hapla Vaclav via petsc-dev wrote: I used to copy the command actually run by the test harness, change to the example's directory and paste the command (just changing one .. to ., e.g. ../ex1 to ./ex1). Is this output gone? Bad news. I think there should definitely be an

[petsc-dev] test harness: output of actually executed command for V=1 gone?

2019-09-20 Thread Hapla Vaclav via petsc-dev
I used to copy the command actually run by the test harness, change to the example's directory, and paste the command (just changing one .. to ., e.g. ../ex1 to ./ex1). Is this output gone? Bad news. I think there should definitely be an option to quickly reproduce the test run to work on failing

Re: [petsc-dev] How to check that MatMatMult is available

2019-09-20 Thread Pierre Jolivet via petsc-dev
> On 20 Sep 2019, at 7:36 AM, Jed Brown wrote: > Pierre Jolivet via petsc-dev writes: >> Hello, >> Given a Mat A, I’d like to know if there is an implementation available for doing C=A*B >> I was previously using
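
(Not from the thread, and only a hedged sketch: one possible way to probe this from code is MatHasOperation() with MATOP_MAT_MULT, shown below. This only inspects the operation table of A; whether C=A*B actually works also depends on the type of B, so treat the answer as a hint rather than a guarantee.)

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscBool      has;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 10, 10);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* e.g. -mat_type aij, baij, ... */
  ierr = MatSetUp(A);CHKERRQ(ierr);

  /* Ask whether this Mat type registers a MatMatMult operation */
  ierr = MatHasOperation(A, MATOP_MAT_MULT, &has);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "MatMatMult registered for this Mat type: %s\n", has ? "yes" : "no");CHKERRQ(ierr);

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}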