I was able to find the bug: it was the outer loop bound, broken in the same fashion as before. My -log_view output is this:
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ucmsMR on a arch-linux2-c-debug named ocean with 2 processors, by valera Sat Jan 7 16:11:51 2017
Using Petsc Release Version 3.7.4, unknown

                         Max       Max/Min        Avg      Total
Time (sec):           2.074e+01      1.00000   2.074e+01
Objects:              9.300e+01      1.00000   9.300e+01
Flops:                8.662e+09      1.00000   8.662e+09  1.732e+10
Flops/sec:            4.177e+08      1.00000   4.177e+08  8.354e+08
Memory:               1.027e+08      1.03217              2.021e+08
MPI Messages:         5.535e+02      1.00000   5.535e+02  1.107e+03
MPI Message Lengths:  2.533e+07      1.00000   4.576e+04  5.066e+07
MPI Reductions:       1.903e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.0739e+01 100.0%  1.7325e+10 100.0%  1.107e+03 100.0%  4.576e+04      100.0%  1.903e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDotNorm2          545 1.0 4.4925e-01   1.6 2.18e+08 1.0 0.0e+00 0.0e+00 1.1e+03  2  3  0  0  6   2  3  0  0  6   971
VecMDot              525 1.0 1.7089e+00   1.7 1.48e+09 1.0 0.0e+00 0.0e+00 1.0e+03  7 17  0  0  6   7 17  0  0  6  1735
VecNorm              420 1.0 7.6857e-02   1.0 8.40e+07 1.0 0.0e+00 0.0e+00 8.4e+02  0  1  0  0  4   0  1  0  0  4  2186
VecScale            1090 1.0 2.5113e-01   1.1 1.09e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0   868
VecSet               555 1.0 7.3570e-02   1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1090 1.0 2.7621e-01   1.1 2.18e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   1  3  0  0  0  1579
VecAYPX                5 1.0 3.6647e-03   2.1 5.00e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   273
VecMAXPY            1050 1.0 2.3646e+00   1.1 2.97e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 34  0  0  0  11 34  0  0  0  2508
VecAssemblyBegin      12 1.7 2.4388e-03   2.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd        12 1.7 1.0085e-04   2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      560 1.0 2.3770e+00  71.3 0.00e+00 0.0 1.1e+03 2.7e+04 1.0e+01  6  0 99 59  0   6  0 99 59  0     0
VecScatterEnd        550 1.0 3.7769e-02   1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult              550 1.0 3.7412e+00   1.1 1.80e+09 1.0 1.1e+03 2.0e+04 0.0e+00 17 21 99 43  0  17 21 99 43  0   962
MatSolve             545 1.0 3.6138e+00   1.1 1.77e+09 1.0 0.0e+00 0.0e+00 0.0e+00 17 20  0  0  0  17 20  0  0  0   980
MatLUFactorNum         1 1.0 1.2530e-01   1.5 1.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0   203
MatILUFactorSym        1 1.0 2.0162e-02   1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             1 1.0 3.3683e-02   1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 9.5172e-02 359.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 2.6907e-02   1.0 0.00e+00 0.0 4.0e+00 5.0e+03 2.3e+01  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 1.0 1.2398e-05   1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 4.4249e-02   2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLoad                1 1.0 3.8892e-01   1.0 0.00e+00 0.0 7.0e+00 3.0e+06 3.8e+01  2  0  1 41  0   2  0  1 41  0     0
KSPSetUp               2 1.0 2.2634e-02   1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               5 1.0 1.2218e+01   1.0 8.66e+09 1.0 1.1e+03 2.0e+04 1.9e+04 59 100 99 43 99 59 100 99 43 99  1418
PCSetUp                3 1.0 1.7993e+00   1.0 1.27e+07 1.0 0.0e+00 0.0e+00 1.0e+01  8  0  0  0  0   8  0  0  0  0    14
PCSetUpOnBlocks        5 1.0 1.9013e-01   1.7 1.27e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0   134
PCApply              546 1.0 3.8320e+00   1.1 1.77e+09 1.0 0.0e+00 0.0e+00 1.0e+00 18 20  0  0  0  18 20  0  0  0   925
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector    72              6      5609648     0.
      Vector Scatter     3              2         1312     0.
              Matrix     4              0            0     0.
              Viewer     2              0            0     0.
           Index Set     7              4        13104     0.
       Krylov Solver     2              0            0     0.
      Preconditioner     3              1         1384     0.
========================================================================================================================
Average time to get PetscTime(): 7.15256e-08
Average time for MPI_Barrier(): 1.82629e-05
Average time for zero size MPI_Send(): 9.89437e-06
#PETSc Option Table entries:
-log_view
-matload_block_size 1
-pc_hypre_boomeramg_nodal_coarsen 1
-pc_hypre_boomeramg_vec_interp_variant 1
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich --download-hdf5 --download-netcdf --download-hypre --download-metis --download-parmetis --download-trillinos
-----------------------------------------
Libraries compiled on Fri Dec 9 12:45:19 2016 on ocean
Machine characteristics: Linux-3.10.0-327.13.1.el7.x86_64-x86_64-with-centos-7.2.1511-Core
Using PETSc directory: /home/valera/petsc
Using PETSc arch: arch-linux2-c-debug
-----------------------------------------
Using C compiler: /home/valera/petsc/arch-linux2-c-debug/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /home/valera/petsc/arch-linux2-c-debug/bin/mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/valera/petsc/arch-linux2-c-debug/include -I/home/valera/petsc/include -I/home/valera/petsc/include -I/home/valera/petsc/arch-linux2-c-debug/include
-----------------------------------------
Using C linker: /home/valera/petsc/arch-linux2-c-debug/bin/mpicc
Using Fortran linker: /home/valera/petsc/arch-linux2-c-debug/bin/mpif90
Using libraries: -Wl,-rpath,/home/valera/petsc/arch-linux2-c-debug/lib -L/home/valera/petsc/arch-linux2-c-debug/lib -lpetsc -Wl,-rpath,/home/valera/petsc/arch-linux2-c-debug/lib -L/home/valera/petsc/arch-linux2-c-debug/lib -lparmetis -lmetis -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -lmpicxx -lstdc++ -lflapack -lfblas -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lpthread -lm -lmpifort -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpicxx -lstdc++ -Wl,-rpath,/home/valera/petsc/arch-linux2-c-debug/lib -L/home/valera/petsc/arch-linux2-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -ldl -Wl,-rpath,/home/valera/petsc/arch-linux2-c-debug/lib -lmpi -lgcc_s -ldl
-----------------------------------------

WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
Option left: name:-pc_hypre_boomeramg_nodal_coarsen value: 1
Option left: name:-pc_hypre_boomeramg_vec_interp_variant value: 1
[valera@ocean serGCCOM]$

Any suggestions are very much appreciated,

Thanks

On Sat, Jan 7, 2017 at 3:39 PM, Matthew Knepley <knep...@gmail.com> wrote:

> On Sat, Jan 7, 2017 at 5:33 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>
>> Thanks Barry and Matt,
>>
>> I was able to detect a bug that i just solved; as suggested, the loop
>> parameters weren't updated as they should be. Now they are, and the program
>> still freezes, but now at the beginning of the loop... ?
>
> You have called a collective function from only one process. Stepping
> through on both processes in your run will find this easily.
>
>    Thanks,
>
>       Matt
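To make the failure mode concrete, here is a minimal sketch of the pattern Matt means, built from the variable names used later in this thread (ctr, bp0, bp2, nbdp, ind, Rhs); it is illustrative only, not code from the model:

    ! WRONG: VecScatterBegin/End are collective on the communicator of the
    ! parallel vector, so if only rank 0 reaches them, the other rank blocks
    ! at its next collective call (often not until PetscFinalize or -log_view).
    if (rank == 0) then
       call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
       call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
    endif

    ! RIGHT: keep only the strictly local work behind the rank test and let
    ! every rank make the collective calls.
    if (rank == 0) then
       call VecSetValues(bp0,nbdp,ind,Rhs,INSERT_VALUES,ierr)
    endif
    call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
    call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)

Stepping both processes past each collective call in the debugger shows exactly where one rank takes a branch the other does not.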
>> Im attaching screen so you have an idea. Im thinking about it...
>>
>> Thanks
>>
>> On Sat, Jan 7, 2017 at 3:21 PM, Matthew Knepley <knep...@gmail.com> wrote:
>>
>>> On Sat, Jan 7, 2017 at 4:59 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>
>>>> I would have to think and code a MWE for this problem before sending it, since the model is much bigger than the petsc solver. Attached here is a screenshot of the debugger, as Barry taught me; is that the stack trace you need?
>>>>
>>>> The ucmsMain.f90:522 that shows is the call (from all processes) to the routine that updates the rhs vector (from root) and scatters it (from all processes).
>>>
>>> Yes, so one process is here and the other has moved on, so there is a mismatch in calls.
>>>
>>> You could do what Barry suggests, but I think it would be better to just step through your main routine once (it's slow going), and see where the divergence happens.
>>>
>>>    Matt
>>>
>>>> This routine is itself inside a double loop that occurs in all processes, but the only call from all processes to the solver is this one; the rest of the loop, which involves correcting for velocities, pressure and temperature, all happens in the root node.
>>>>
>>>> Sorry for the convoluted program design; this is the first beta version of the model working in parallel and was the best i could come up with. I suppose it makes more sense in serial,
>>>>
>>>> Thanks
>>>>
>>>> On Sat, Jan 7, 2017 at 2:24 PM, Matthew Knepley <knep...@gmail.com> wrote:
>>>>
>>>>> On Sat, Jan 7, 2017 at 4:20 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>
>>>>>> Thank you Matthew,
>>>>>>
>>>>>> On Sat, Jan 7, 2017 at 1:49 PM, Matthew Knepley <knep...@gmail.com> wrote:
>>>>>>
>>>>>>> On Sat, Jan 7, 2017 at 3:32 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>
>>>>>>>> Hi Devs, hope you are having a great weekend,
>>>>>>>>
>>>>>>>> I could finally parallelize my linear solver and implement it into the rest of the code, in a way that only the linear system is solved in parallel: great news for my team. But there is a catch, and it is that i don't see any speedup in the linear system. I don't know if it's the MPI in the cluster we are using, but i'm not sure how to debug it,
>>>>>>>
>>>>>>> We need to see -log_view output for any performance question.
>>>>>>>
>>>>>>>> On the other hand, and because of this issue, i was trying to do -log_summary or -log_view, and i noticed the program in this context hangs when it is time to produce the log. If i debug this with 2 cores, process 0 exits normally but process 1 hangs in the VecScatterBegin() with SCATTER_REVERSE way back in the code,
>>>>>>>
>>>>>>> You are calling a collective routine from only 1 process.
>>>>>>>
>>>>>>>    Matt
>>>>>>
>>>>>> I am pretty confident this is not the case,
>>>>>
>>>>> This is still the simplest explanation. Can you send the stack trace for the 2 process run?
>>>>>
>>>>>> the calls to VecScatterCreateToZero and VecScatterBegin are made in all processes; the program goes through all of the iterations of the linear solver, writes output correctly, and even closes all the petsc objects without complaining. The freeze occurs at the very end, when the log is to be produced.
>>>>>
>>>>> If you can send us a code to run, we can likely find the error.
>>>>>
>>>>>    Thanks,
>>>>>
>>>>>       Matt
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Manuel
>>>>>>
>>>>>>>> and even after destroying all associated objects and calling PetscFinalize(), so i'm really clueless on why this is, as it only happens for the -log_* or -ksp_view options.
>>>>>>>>
>>>>>>>> my -ksp_view shows this:
>>>>>>>>
>>>>>>>> KSP Object: 2 MPI processes
>>>>>>>>   type: gcr
>>>>>>>>     GCR: restart = 30
>>>>>>>>     GCR: restarts performed = 20
>>>>>>>>   maximum iterations=10000, initial guess is zero
>>>>>>>>   tolerances:  relative=1e-14, absolute=1e-50, divergence=10000.
>>>>>>>>   right preconditioning
>>>>>>>>   using UNPRECONDITIONED norm type for convergence test
>>>>>>>> PC Object: 2 MPI processes
>>>>>>>>   type: bjacobi
>>>>>>>>     block Jacobi: number of blocks = 2
>>>>>>>>     Local solve is same for all blocks, in the following KSP and PC objects:
>>>>>>>>   KSP Object: (sub_) 1 MPI processes
>>>>>>>>     type: preonly
>>>>>>>>     maximum iterations=10000, initial guess is zero
>>>>>>>>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>>>>     left preconditioning
>>>>>>>>     using NONE norm type for convergence test
>>>>>>>>   PC Object: (sub_) 1 MPI processes
>>>>>>>>     type: ilu
>>>>>>>>       ILU: out-of-place factorization
>>>>>>>>       0 levels of fill
>>>>>>>>       tolerance for zero pivot 2.22045e-14
>>>>>>>>       matrix ordering: natural
>>>>>>>>       factor fill ratio given 1., needed 1.
>>>>>>>>         Factored matrix follows:
>>>>>>>>           Mat Object: 1 MPI processes
>>>>>>>>             type: seqaij
>>>>>>>>             rows=100000, cols=100000
>>>>>>>>             package used to perform factorization: petsc
>>>>>>>>             total: nonzeros=1675180, allocated nonzeros=1675180
>>>>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>>>>               not using I-node routines
>>>>>>>>     linear system matrix = precond matrix:
>>>>>>>>     Mat Object: 1 MPI processes
>>>>>>>>       type: seqaij
>>>>>>>>       rows=100000, cols=100000
>>>>>>>>       total: nonzeros=1675180, allocated nonzeros=1675180
>>>>>>>>       total number of mallocs used during MatSetValues calls =0
>>>>>>>>         not using I-node routines
>>>>>>>>   linear system matrix = precond matrix:
>>>>>>>>   Mat Object: 2 MPI processes
>>>>>>>>     type: mpiaij
>>>>>>>>     rows=200000, cols=200000
>>>>>>>>     total: nonzeros=3373340, allocated nonzeros=3373340
>>>>>>>>     total number of mallocs used during MatSetValues calls =0
>>>>>>>>       not using I-node (on process 0) routines
>>>>>>>>
>>>>>>>> And i configured my PC object as:
>>>>>>>>
>>>>>>>> call PCSetType(mg,PCHYPRE,ierr)
>>>>>>>> call PCHYPRESetType(mg,'boomeramg',ierr)
>>>>>>>>
>>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OBJECT,'pc_hypre_boomeramg_nodal_coarsen','1',ierr)
>>>>>>>> call PetscOptionsSetValue(PETSC_NULL_OBJECT,'pc_hypre_boomeramg_vec_interp_variant','1',ierr)
>>>>>>>>
>>>>>>>> What are your thoughts ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Manuel
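One observation on the configuration above: the -ksp_view reports bjacobi/ilu rather than hypre, and the -log_view at the top ends with "Option left" warnings for both BoomerAMG options, meaning nothing ever read them from the options database. Whether that relates to the hang is a separate question, but if the BoomerAMG options are meant to take effect, the usual pattern looks roughly like this sketch, assuming ksp is the KSP that owns the PC mg (ksp here is a hypothetical name, not from the thread):

    call PCSetType(mg,PCHYPRE,ierr)
    call PCHYPRESetType(mg,'boomeramg',ierr)

    ! option names as the PetscOptionsSetValue man page writes them,
    ! with the leading dash
    call PetscOptionsSetValue(PETSC_NULL_OBJECT, &
         '-pc_hypre_boomeramg_nodal_coarsen','1',ierr)
    call PetscOptionsSetValue(PETSC_NULL_OBJECT, &
         '-pc_hypre_boomeramg_vec_interp_variant','1',ierr)

    ! entries in the options database are only consumed when the solver is
    ! configured from it, so this must run before KSPSetUp/KSPSolve
    call KSPSetFromOptions(ksp,ierr)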
>>>>>>>> On Fri, Jan 6, 2017 at 1:58 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>
>>>>>>>>> Awesome, that did it, thanks once again.
>>>>>>>>>
>>>>>>>>> On Fri, Jan 6, 2017 at 1:53 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>>>    Take the scatter out of the if () since everyone does it, and get rid of the VecView().
>>>>>>>>>>
>>>>>>>>>>    Does this work? If not, where is it hanging?
>>>>>>>>>>
>>>>>>>>>> > On Jan 6, 2017, at 3:29 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> >
>>>>>>>>>> > Thanks Dave,
>>>>>>>>>> >
>>>>>>>>>> > I think it's interesting that it never gave an error on this; after adding the vecassembly calls it still shows the same behavior, without complaining. i did:
>>>>>>>>>> >
>>>>>>>>>> > if(rankl==0)then
>>>>>>>>>> >
>>>>>>>>>> >    call VecSetValues(bp0,nbdp,ind,Rhs,INSERT_VALUES,ierr)
>>>>>>>>>> >    call VecAssemblyBegin(bp0,ierr) ; call VecAssemblyEnd(bp0,ierr);
>>>>>>>>>> >    CHKERRQ(ierr)
>>>>>>>>>> >
>>>>>>>>>> > endif
>>>>>>>>>> >
>>>>>>>>>> > call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> > call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> > print*,"done! "
>>>>>>>>>> > CHKERRQ(ierr)
>>>>>>>>>> >
>>>>>>>>>> > Thanks.
>>>>>>>>>> >
>>>>>>>>>> > On Fri, Jan 6, 2017 at 12:44 PM, Dave May <dave.mayhe...@gmail.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > On 6 January 2017 at 20:24, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > Great help Barry, i totally had overlooked that option (it is explicit in the VecScatterBegin help page but not in VecScatterCreateToZero, as i read later)
>>>>>>>>>> >
>>>>>>>>>> > So i used that and it works partially: it scatters the values assigned in root but not the rest, and if i call VecScatterBegin from outside root it hangs. The code currently looks like this:
>>>>>>>>>> >
>>>>>>>>>> > call VecScatterCreateToZero(bp2,ctr,bp0,ierr); CHKERRQ(ierr)
>>>>>>>>>> >
>>>>>>>>>> > call PetscObjectSetName(bp0, 'bp0:',ierr)
>>>>>>>>>> >
>>>>>>>>>> > if(rankl==0)then
>>>>>>>>>> >
>>>>>>>>>> >    call VecSetValues(bp0,nbdp,ind,Rhs,INSERT_VALUES,ierr)
>>>>>>>>>> >
>>>>>>>>>> >    call VecView(bp0,PETSC_VIEWER_STDOUT_WORLD,ierr)
>>>>>>>>>> >
>>>>>>>>>> > You need to call
>>>>>>>>>> >
>>>>>>>>>> > VecAssemblyBegin(bp0);
>>>>>>>>>> > VecAssemblyEnd(bp0);
>>>>>>>>>> >
>>>>>>>>>> > after your last call to VecSetValues() before you can do any operations with bp0.
>>>>>>>>>> >
>>>>>>>>>> > With your current code, the call to VecView should produce an error if you used the error checking macro CHKERRQ(ierr) (as should VecScatter{Begin,End})
>>>>>>>>>> >
>>>>>>>>>> > Thanks,
>>>>>>>>>> >   Dave
>>>>>>>>>> >
>>>>>>>>>> >    call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> >    call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> >    print*,"done! "
>>>>>>>>>> >    CHKERRQ(ierr)
>>>>>>>>>> >
>>>>>>>>>> > endif
>>>>>>>>>> >
>>>>>>>>>> > ! call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> > ! call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
>>>>>>>>>> >
>>>>>>>>>> > call VecView(bp2,PETSC_VIEWER_STDOUT_WORLD,ierr)
>>>>>>>>>> >
>>>>>>>>>> > call PetscBarrier(PETSC_NULL_OBJECT,ierr)
>>>>>>>>>> >
>>>>>>>>>> > call exit()
>>>>>>>>>> >
>>>>>>>>>> > And the output is: (with bp the right answer)
>>>>>>>>>> >
>>>>>>>>>> > Vec Object:bp: 2 MPI processes
>>>>>>>>>> >   type: mpi
>>>>>>>>>> > Process [0]
>>>>>>>>>> > 1.
>>>>>>>>>> > 2.
>>>>>>>>>> > Process [1]
>>>>>>>>>> > 4.
>>>>>>>>>> > 3.
>>>>>>>>>> > Vec Object:bp2: 2 MPI processes (before scatter)
>>>>>>>>>> >   type: mpi
>>>>>>>>>> > Process [0]
>>>>>>>>>> > 0.
>>>>>>>>>> > 0.
>>>>>>>>>> > Process [1]
>>>>>>>>>> > 0.
>>>>>>>>>> > 0.
>>>>>>>>>> > Vec Object:bp0: 1 MPI processes
>>>>>>>>>> >   type: seq
>>>>>>>>>> > 1.
>>>>>>>>>> > 2.
>>>>>>>>>> > 4.
>>>>>>>>>> > 3.
>>>>>>>>>> > done!
>>>>>>>>>> > Vec Object:bp2: 2 MPI processes (after scatter)
>>>>>>>>>> >   type: mpi
>>>>>>>>>> > Process [0]
>>>>>>>>>> > 1.
>>>>>>>>>> > 2.
>>>>>>>>>> > Process [1]
>>>>>>>>>> > 0.
>>>>>>>>>> > 0.
>>>>>>>>>> >
>>>>>>>>>> > Thanks immensely for your help,
>>>>>>>>>> >
>>>>>>>>>> > Manuel
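Putting Dave's and Barry's points together, the working distribute-from-root pattern looks roughly like the sketch below (illustrative only, reusing the thread's names bp0, bp2, ctr, nbdp, ind, Rhs; error checking omitted); only VecSetValues stays behind the rank test:

    ! all ranks: parallel target vector and the to-zero scatter; bp0 comes back
    ! as a sequential vector of length nbdp on rank 0 and length 0 elsewhere
    call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nbdp,bp2,ierr)
    call VecScatterCreateToZero(bp2,ctr,bp0,ierr)

    ! root only: fill the sequential vector
    if (rank == 0) then
       call VecSetValues(bp0,nbdp,ind,Rhs,INSERT_VALUES,ierr)
    endif

    ! bp0 is sequential, so its assembly is purely local and safe on every rank
    call VecAssemblyBegin(bp0,ierr)
    call VecAssemblyEnd(bp0,ierr)

    ! all ranks: reverse the to-zero scatter to spread root's values out;
    ! note the vector arguments are swapped relative to the forward gather
    call VecScatterBegin(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)
    call VecScatterEnd(ctr,bp0,bp2,INSERT_VALUES,SCATTER_REVERSE,ierr)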
>>>>>>>>>> > On Thu, Jan 5, 2017 at 4:39 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>>>>>>>>> >
>>>>>>>>>> > > On Jan 5, 2017, at 6:21 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > >
>>>>>>>>>> > > Hello Devs, it's me again,
>>>>>>>>>> > >
>>>>>>>>>> > > I'm trying to distribute a vector to all called processes; the vector would be originally in root as a sequential vector and i would like to scatter it. What would be the best call to do this?
>>>>>>>>>> > >
>>>>>>>>>> > > I already know how to gather a distributed vector to root with VecScatterCreateToZero; this would be the inverse operation,
>>>>>>>>>> >
>>>>>>>>>> >    Use the same VecScatter object but with SCATTER_REVERSE; note that you need to reverse the two vector arguments as well.
>>>>>>>>>> >
>>>>>>>>>> > > i'm currently trying with VecScatterCreate() and as of now i'm doing the following:
>>>>>>>>>> > >
>>>>>>>>>> > > if(rank==0)then
>>>>>>>>>> > >
>>>>>>>>>> > >    call VecCreate(PETSC_COMM_SELF,bp0,ierr); CHKERRQ(ierr)   !if i use WORLD
>>>>>>>>>> > >                                                             !freezes in SetSizes
>>>>>>>>>> > >    call VecSetSizes(bp0,PETSC_DECIDE,nbdp,ierr); CHKERRQ(ierr)
>>>>>>>>>> > >    call VecSetType(bp0,VECSEQ,ierr)
>>>>>>>>>> > >    call VecSetFromOptions(bp0,ierr); CHKERRQ(ierr)
>>>>>>>>>> > >
>>>>>>>>>> > >    call VecSetValues(bp0,nbdp,ind,Rhs,INSERT_VALUES,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > >    !call VecSet(bp0,5.0D0,ierr); CHKERRQ(ierr)
>>>>>>>>>> > >
>>>>>>>>>> > >    call VecView(bp0,PETSC_VIEWER_STDOUT_WORLD,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > >    call VecAssemblyBegin(bp0,ierr) ; call VecAssemblyEnd(bp0,ierr) !rhs
>>>>>>>>>> > >
>>>>>>>>>> > >    do i=0,nbdp-1,1
>>>>>>>>>> > >       ind(i+1) = i
>>>>>>>>>> > >    enddo
>>>>>>>>>> > >
>>>>>>>>>> > >    call ISCreateGeneral(PETSC_COMM_SELF,nbdp,ind,PETSC_COPY_VALUES,locis,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > >    !call VecScatterCreate(bp0,PETSC_NULL_OBJECT,bp2,is,ctr,ierr)   !if i use SELF
>>>>>>>>>> > >                                                                    !freezes here.
>>>>>>>>>> > >
>>>>>>>>>> > >    call VecScatterCreate(bp0,locis,bp2,PETSC_NULL_OBJECT,ctr,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > > endif
>>>>>>>>>> > >
>>>>>>>>>> > > bp2 being the receptor MPI vector to scatter to
>>>>>>>>>> > >
>>>>>>>>>> > > But it freezes in VecScatterCreate when trying to use more than one processor; what would be a better approach?
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks once again,
>>>>>>>>>> > >
>>>>>>>>>> > > Manuel
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 3:30 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > > Thanks, i had no idea how to debug and read those logs; that solved this issue at least (i was sending a message from root to everyone else, but trying to catch from everyone else including root)
>>>>>>>>>> > >
>>>>>>>>>> > > Until next time, many thanks,
>>>>>>>>>> > >
>>>>>>>>>> > > Manuel
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 3:23 PM, Matthew Knepley <knep...@gmail.com> wrote:
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 5:21 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > > I did a PetscBarrier just before calling the vicariate routine and i'm pretty sure i'm calling it from every processor; code looks like this:
>>>>>>>>>> > >
>>>>>>>>>> > > From the gdb trace.
>>>>>>>>>> > >
>>>>>>>>>> > >   Proc 0: Is in some MPI routine you call yourself, line 113
>>>>>>>>>> > >
>>>>>>>>>> > >   Proc 1: Is in VecCreate(), line 130
>>>>>>>>>> > >
>>>>>>>>>> > > You need to fix your communication code.
>>>>>>>>>> > >
>>>>>>>>>> > >    Matt
>>>>>>>>>> > >
>>>>>>>>>> > > call PetscBarrier(PETSC_NULL_OBJECT,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > > print*,'entering POInit from',rank
>>>>>>>>>> > > !call exit()
>>>>>>>>>> > >
>>>>>>>>>> > > call PetscObjsInit()
>>>>>>>>>> > >
>>>>>>>>>> > > And output gives:
>>>>>>>>>> > >
>>>>>>>>>> > > entering POInit from 0
>>>>>>>>>> > > entering POInit from 1
>>>>>>>>>> > > entering POInit from 2
>>>>>>>>>> > > entering POInit from 3
>>>>>>>>>> > >
>>>>>>>>>> > > Still hangs in the same way,
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks,
>>>>>>>>>> > >
>>>>>>>>>> > > Manuel
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 2:55 PM, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > > Thanks for the answers !
>>>>>>>>>> > >
>>>>>>>>>> > > here's the screenshot of what i got from bt in gdb (great hint on how to debug in petsc, didn't know that)
>>>>>>>>>> > >
>>>>>>>>>> > > I don't really know what to look at here,
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks,
>>>>>>>>>> > >
>>>>>>>>>> > > Manuel
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 2:39 PM, Dave May <dave.mayhe...@gmail.com> wrote:
>>>>>>>>>> > > Are you certain ALL ranks in PETSC_COMM_WORLD call these function(s)? These functions cannot be inside if statements like
>>>>>>>>>> > > if (rank == 0){
>>>>>>>>>> > >   VecCreateMPI(...)
>>>>>>>>>> > > }
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, 4 Jan 2017 at 23:34, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > > Thanks Dave for the quick answer, appreciate it,
>>>>>>>>>> > >
>>>>>>>>>> > > I just tried that and it didn't make a difference, any other suggestions?
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks,
>>>>>>>>>> > > Manuel
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, Jan 4, 2017 at 2:29 PM, Dave May <dave.mayhe...@gmail.com> wrote:
>>>>>>>>>> > > You need to swap the order of your function calls.
>>>>>>>>>> > > Call VecSetSizes() before VecSetType()
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks,
>>>>>>>>>> > >   Dave
>>>>>>>>>> > >
>>>>>>>>>> > > On Wed, 4 Jan 2017 at 23:21, Manuel Valera <mval...@mail.sdsu.edu> wrote:
>>>>>>>>>> > > Hello all, happy new year,
>>>>>>>>>> > >
>>>>>>>>>> > > I'm working on parallelizing my code. It worked and provided some results when i just called more than one processor, but it created artifacts, because i didn't need one image of the whole program in each processor, conflicting with each other.
>>>>>>>>>> > >
>>>>>>>>>> > > Since the pressure solver is the main part i need in parallel, i'm choosing MPI to run everything in the root processor until it's time to solve for pressure. At this point i'm trying to create a distributed vector using either
>>>>>>>>>> > >
>>>>>>>>>> > > call VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,nbdp,xp,ierr)
>>>>>>>>>> > >
>>>>>>>>>> > > or
>>>>>>>>>> > >
>>>>>>>>>> > > call VecCreate(PETSC_COMM_WORLD,xp,ierr); CHKERRQ(ierr)
>>>>>>>>>> > > call VecSetType(xp,VECMPI,ierr)
>>>>>>>>>> > > call VecSetSizes(xp,PETSC_DECIDE,nbdp,ierr); CHKERRQ(ierr)
>>>>>>>>>> > >
>>>>>>>>>> > > In both cases the program hangs at this point, something that never happened the naive way i described before. I've made sure the global size, nbdp, is the same in every processor. What can be wrong?
>>>>>>>>>> > >
>>>>>>>>>> > > Thanks for your kind help,
>>>>>>>>>> > >
>>>>>>>>>> > > Manuel.
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
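For reference, the creation sequence Dave recommends for the hanging snippet above, sketched with the thread's xp and nbdp: sizes are set before the type, and, per the later advice in this thread, every call here is collective on PETSC_COMM_WORLD, so all ranks must reach it, just like VecCreateMPI().

    call VecCreate(PETSC_COMM_WORLD,xp,ierr)
    call VecSetSizes(xp,PETSC_DECIDE,nbdp,ierr)   ! sizes first ...
    call VecSetType(xp,VECMPI,ierr)               ! ... then the type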