Nice!  Can you verify that it passes the ibm test?  I didn't look closely, and 
to be honest, I'm not sure why the previous improvement broke the IBM test 
because it hypothetically did what you mentioned (stopped at sqrt(freenodes)).

I think patch 1 is a no-brainer.  I'm not sure about #2 because I'm not sure 
how it's different than the previous one, nor did I have time to investigate 
why the previous one broke the IBM test.  #3 seems like a good idea, too; I 
did't check the paper, but I assume it's some kind of proof about the upper 
limit on the number of primes in a given range.

Two questions:

1. Should we cache generated prime numbers?  (if so, it'll have to be done in a 
thread-safe way)

2. Should we just generate prime numbers and hard-code them into a table that 
is compiled into the code?  We would only need primes up to the sqrt of 
2billion (i.e., signed int), right?  I don't know how many that is -- if it's 
small enough, perhaps this is the easiest solution.



On Feb 10, 2014, at 1:30 PM, Christoph Niethammer <nietham...@hlrs.de> wrote:

> Hello,
> 
> I noticed some effort in improving the scalability of
> MPI_Dims_create(int nnodes, int ndims, int dims[])
> Unfortunately there were some issues with the first attempt (r30539 and 
> r30540) which were reverted.
> 
> So I decided to give it a short review based on r30606
> https://svn.open-mpi.org/trac/ompi/browser/trunk/ompi/mpi/c/dims_create.c?rev=30606
> 
> 
> 1.) freeprocs is initialized to be nnodes and the subsequent divisions of 
> freeprocs have all positive integers as divisor.
> So IMHO it would make more sense to check if nnodes > 0 in the 
> MPI_PARAM_CHECK section at the begin instead of the following (see patch 
> 0001):
> 
> 99        if (freeprocs < 1) {
> 100          return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_DIMS,
> 101                                        FUNC_NAME);
> 102       }
> 
> 
> 2.) I rewrote the algorithm stopping at sqrt(n) in getprimes(int num, int 
> *nprimes, int **pprimes)
> which makes mathematically more sens (as the largest prime factor of any 
> number n cannot exceed \sqrt{n}) - and should produce the right result. ;)
> (see patch 0002)
> Here the improvements:
> 
> module load mpi/openmpi/trunk-gnu.4.7.3
> $ ./mpi-dims-old 1000000
> time used for MPI_Dims_create(1000000, 3, {}): 8.104007
> module swap mpi/openmpi/trunk-gnu.4.7.3 mpi/openmpi/trunk-gnu.4.7.3-testing
> $ ./mpi-dims-new 1000000
> time used for MPI_Dims_create(1000000, 3, {}): 0.060400
> 
> 
> 3.) Memory allocation for the list of prime numbers may be reduced up to a 
> factor of ~6 for 1mio nodes using the result from Dusart 1999 [1]:
> \pi(x)  < x/ln(x)(1+1.2762/ln(x))  for x > 1
> Unfortunately this saves us only 1.6 MB per process for 1mio nodes as 
> reported by tcmalloc/pprof on a test program - but it may sum up with fatter 
> nodes. :P
> 
> $ pprof --base=$PWD/primes-old.0001.heap a.out primes-new.0002.heap
> (pprof) top
> Total: -1.6 MB
>     0.3 -18.8% -18.8%      0.3 -18.8% getprimes2
>     0.0  -0.0% -18.8%     -1.6 100.0% __libc_start_main
>     0.0  -0.0% -18.8%     -1.6 100.0% main
>    -1.9 118.8% 100.0%     -1.9 118.8% getprimes
> 
> Find attached patch for it in 0003.
> 
> 
> If there are no issues I would like to commit this to trunk for further 
> testing (+cmr for 1.7.5?) end of this week.
> 
> Best regards
> Christoph
> 
> [1] 
> http://www.ams.org/journals/mcom/1999-68-225/S0025-5718-99-01037-6/home.html
> 
> 
> 
> --
> 
> Christoph Niethammer
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstrasse 19
> 70569 Stuttgart
> 
> Tel: ++49(0)711-685-87203
> email: nietham...@hlrs.de
> http://www.hlrs.de/people/niethammer<0001-Move-parameter-check-into-appropriate-code-section-a.patch><0002-Speeding-up-detection-of-prime-numbers-using-the-fac.patch><0003-Reduce-memory-usage-by-a-better-approximation-for-th.patch>_______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Reply via email to