Hi there,

I implemented a parallel version of the InsertionSortCollider. It is almost ready but not yet pushed to the main trunk, as I have a few things to check first. It would be helpful if some of you could 1/ test that your scripts still work correctly and 2/ benchmark it for N>100k and j>4. If you run benchmarks, please remember to always activate timing and report the output of timing.stats(). It gives much more interesting data than the wall-clock time.
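For anyone comparing two timing reports, the per-engine ratios can be worked out with a few lines of plain Python. The sketch below is not part of Yade: the helper names `speedups` and `amdahl` and the dict layout are hypothetical, and the times are simply copied from the two -j4 runs listed further down in this message. The `amdahl` helper estimates the overall speedup one would expect if only the collider got faster, which is why larger simulations (where the collider dominates) should benefit more.

```python
# Engine times in microseconds, copied from the two -j4 reports in this message.
trunk = {
    "ForceResetter": 700881,
    "InsertionSortCollider": 18816625,
    "InteractionLoop": 6581283,
    "NewtonIntegrator": 3293119,
}
pc = {
    "ForceResetter": 671157,
    "InsertionSortCollider": 5145114,
    "InteractionLoop": 6545848,
    "NewtonIntegrator": 3030989,
}

def speedups(before, after):
    """Return the per-engine speedup (before/after) plus the overall one."""
    result = {name: before[name] / after[name] for name in before}
    result["TOTAL"] = sum(before.values()) / sum(after.values())
    return result

def amdahl(fraction, local_speedup):
    """Overall speedup expected when only `fraction` of the run time is accelerated."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

s = speedups(trunk, pc)
for name, value in s.items():
    print(f"{name:24s} x{value:.2f}")

# Share of the sequential run spent in the collider, then the predicted overall gain.
collider_share = trunk["InsertionSortCollider"] / sum(trunk.values())
predicted = amdahl(collider_share, s["InsertionSortCollider"])
print(f"{'predicted overall':24s} x{predicted:.2f}")
```

The measured overall ratio can come out slightly above the prediction, because the other engines also ran a little faster in the second run.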
Preliminary benchmark results are below (from my laptop...). They show a speedup of about a factor of 2 on the total computation time for j4/200k particles, compared to the sequential collider. The speedup on the collider alone is in fact of the order of x3.66 with 4 threads (18.8 s vs. 5.1 s), i.e. nearly linear, at least for such a small number of threads.

My expectation is that it should change almost nothing for small numbers of particles (say, N<10k), where colliding is an inexpensive step. For 1 million particles, OTOH, there could be a significant speedup, since the collider takes most of the time.

You can get the "pc" branch at my github repo:
git clone -b pc https://github.com/bchareyre/trunk.git

Results of yade -j4 --performance are below (i7 quad-core with hyperthreading enabled, lightly loaded by background tasks; j>4 is not reported, as hyperthreading is probably doing no good).

Happy benchmarking. :)

Bruno

====================
./yade-trunk -j4 --performance (the current trunk)
.......
number of bodies 200813
Elapsed 29.4102840424 sec
Performance 6.80034234664 iter/sec
Extrapolation on 1e5 iters 4.08476167255 hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                         Count          Time   Rel. time
------------------------------------------------------------
ForceResetter                  200      700881us       2.38%
InsertionSortCollider            7    18816625us      64.02%
InteractionLoop                200     6581283us      22.39%
NewtonIntegrator               200     3293119us      11.20%
TOTAL                                 29391910us     100.00%

Common time 597.731503963 s

5037 spheres, velocity= 327.689688709 +- 5.13604387635 %
25103 spheres, velocity= 81.2726909754 +- 1.0105334405 %
50250 spheres, velocity= 45.4114521341 +- 3.02333274436 %
100467 spheres, velocity= 19.0287424005 +- 2.26073439157 %
200813 spheres, velocity= 6.51664351023 +- 4.03351515402 %

SCORE: 13777
Number of threads 4

========================
./yade-parallel -j4 --performance (my "pc" branch)
....
number of bodies 200813
Elapsed 15.4320101738 sec
Performance 12.9600744004 iter/sec
Extrapolation on 1e5 iters 2.14333474636 hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name                         Count          Time   Rel. time
------------------------------------------------------------
ForceResetter                  200      671157us       4.36%
InsertionSortCollider            7     5145114us      33.42%
  boundDispatcher                7       93186us       1.81%
  bound                          7          12us       0.00%
  copy                           7      160891us       3.13%
  erase                          7       66932us       1.30%
  sort&collide                   7     4824071us      93.76%
  TOTAL                         35     5145095us     100.00%
InteractionLoop                200     6545848us      42.52%
NewtonIntegrator               200     3030989us      19.69%
TOTAL                                 15393110us     100.00%

Common time 460.37680912 s

5037 spheres, velocity= 365.599773471 +- 8.02397068512 %
25103 spheres, velocity= 92.0077536966 +- 3.81069496509 %
50250 spheres, velocity= 54.1683980588 +- 0.528288534811 %
100467 spheres, velocity= 25.7134767981 +- 1.0796373464 %
200813 spheres, velocity= 12.6488486429 +- 4.66276699319 %

SCORE: 18800
Number of threads 4

_______________________________________________
Mailing list: https://launchpad.net/~yade-dev
Post to     : yade-dev@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yade-dev
More help   : https://help.launchpad.net/ListHelp