See http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers and, 
in particular, note the discussion of memory bandwidth there. Once you start 
using multiple cores per CPU you will see very little additional speedup with 
Jacobi preconditioning, because it is heavily memory-bandwidth limited. In fact, 
pretty much all sparse iterative solvers are memory-bandwidth limited.
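
As a rough illustration of why (a hypothetical sketch for illustration only,
not the kernel PETSc actually uses): a compressed-row matrix-vector product
does about two flops per stored nonzero while streaming roughly twelve bytes
of matrix data for it, so once the memory bus of a socket is saturated,
adding more cores on that socket cannot make the loop run faster.

   /* Hypothetical CSR sparse matrix-vector product y = A*x (illustration
      only).  Per nonzero: one multiply and one add, but an 8-byte value and
      a 4-byte column index must be read from memory, plus entries of x, so
      the loop runs at the speed of the memory system, not of the cores. */
   void csr_spmv(int nrows,const int *rowptr,const int *colidx,
                 const double *val,const double *x,double *y)
   {
     int i,j;
     for (i=0; i<nrows; i++) {
       double sum = 0.0;
       for (j=rowptr[i]; j<rowptr[i+1]; j++) sum += val[j]*x[colidx[j]];
       y[i] = sum;
     }
   }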

   Barry


On Dec 20, 2010, at 10:46 AM, Yongjun Chen wrote:

> 
> Hi everyone,
> 
> I use PETSc (version 3.1-p5) to solve a linear problem Ax=b. The matrix A and 
> right-hand-side vector b are read from files. The dimension of A is 
> 1.2 million x 1.2 million. I am pretty sure the matrix A and vector b have 
> been read correctly.
> 
> I compiled the program as an optimized build (--with-debugging=0) and tested 
> the speed-up on two servers, and I found that the performance is very poor.
> 
> Of the two servers, one has 4 CPUs with 4 cores per CPU, i.e., 16 cores in 
> total. The other has 4 CPUs with 12 cores per CPU, for a total of 48 cores.
> 
> On each of them, as the number of cores k increases from 1 to 8 
> (mpiexec -n k ./Solver_MPI -pc_type jacobi -ksp_type gmres), the speed-up 
> increases from 1 to 6; but as k increases further, from 9 up to 16 (on the 
> first server) or 48 (on the second server), the speed-up first decreases and 
> then settles at a constant value of about 5.0 (first server) or 4.5 
> (second server).
> 
> Actually, the program LAMMPS scales excellently on these two servers.
> 
> Any comments are very much appreciated! Thanks!
> 
> --------------------------------------------------------------------------------------------------------------------------
> 
> PS: the relevant code is as follows:
> 
> // first read A and b from files
> ...
> // then assemble, set up the solver, and solve
> 
>   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = VecAssemblyBegin(b); CHKERRQ(ierr);
>   ierr = VecAssemblyEnd(b); CHKERRQ(ierr);
> 
>   ierr = MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE); CHKERRQ(ierr);
>   ierr = MatGetRowUpperTriangular(A); CHKERRQ(ierr);
>   ierr = KSPCreate(PETSC_COMM_WORLD,&ksp); CHKERRQ(ierr);
> 
>   ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);
>   ierr = KSPGetPC(ksp,&pc); CHKERRQ(ierr);
>   ierr = KSPSetTolerances(ksp,1.e-7,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT); CHKERRQ(ierr);
>   // -ksp_type gmres and -pc_type jacobi are picked up from the command line here
>   ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
> 
>   ierr = KSPSolve(ksp,b,x); CHKERRQ(ierr);
> 
>   ierr = KSPView(ksp,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>   ierr = KSPGetSolution(ksp,&x); CHKERRQ(ierr);
> 
>   ierr = VecAssemblyBegin(x); CHKERRQ(ierr);
>   ierr = VecAssemblyEnd(x); CHKERRQ(ierr);
> ...
> 
> 
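
A note on measuring the speed-up above: timing only the KSPSolve call and
reporting the iteration count keeps the file reading and assembly out of the
scaling numbers. A minimal sketch, assuming the ierr, ksp, b and x objects
from the code in the post (the variable names are simply the ones used there):

   double    t0,t1;
   PetscInt  its;
   PetscReal rnorm;

   ierr = MPI_Barrier(PETSC_COMM_WORLD);CHKERRQ(ierr);  /* start all ranks together */
   t0   = MPI_Wtime();
   ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
   t1   = MPI_Wtime();

   ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
   ierr = KSPGetResidualNorm(ksp,&rnorm);CHKERRQ(ierr);
   ierr = PetscPrintf(PETSC_COMM_WORLD,"KSPSolve: %g s, %d iterations, residual norm %g\n",
                      t1-t0,(int)its,(double)rnorm);CHKERRQ(ierr);

Running with -log_summary also breaks the time down per operation (MatMult,
VecMDot, PCApply, ...), which is usually where the memory-bandwidth limit
Barry describes shows up directly.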
