Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Artem Polyakov
P.S. For future reference, we also need to keep the launch scripts that were used, to be able to carefully reproduce the results. Jeff mentioned that on the wiki page, IIRC.

2016-07-29 12:42 GMT+07:00 Artem Polyakov:
> Thank you, Arm!
>
> Good to have vader results (I haven't tried it myself yet). A few
> comments/questions: …
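Beyond the command line itself, such a saved launch script could also capture the exact Open MPI version. A minimal sketch, assuming a hypothetical wrapper name (run_and_log.sh) and benchmark binary (mt_benchmark); ompi_info and mpirun are standard Open MPI commands:

    #!/bin/sh
    # run_and_log.sh -- hypothetical wrapper kept next to the results so a
    # run can be reproduced exactly: records the Open MPI version and the
    # full mpirun argument list before launching.
    ompi_info --version > run_info.txt
    echo "mpirun $@" >> run_info.txt
    mpirun "$@" | tee run_output.txt

which would be invoked, for example, as:

    ./run_and_log.sh -np 2 -mca pml ob1 -mca btl vader,self -bind-to socket ./mt_benchmark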

Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Artem Polyakov
Thank you, Arm!

Good to have vader results (I haven't tried it myself yet). A few comments/questions:
1. I guess we also want to have single-threaded performance as the "baseline" reference.
2. Have you tried to run with openib? As I mentioned on the call, I had some problems with it and I'm curious if …
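On point 2, a run forced onto the openib BTL would look roughly like the sketch below. This is an assumption about the setup, not a command from the thread; the host names and benchmark binary (mt_benchmark) are hypothetical, while the MCA options themselves are standard Open MPI:

    mpirun -np 2 -host node01,node02 -mca pml ob1 -mca btl openib,self \
        -bind-to socket ./mt_benchmark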

Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Arm Patinyasakdikul (apatinya)
I added some results to https://github.com/open-mpi/2016-summer-perf-testing
The results show much better performance from 2.0.0 and master over 1.10.3 for vader. The test ran with Artem's version of the benchmark on OB1, single node, bind to socket. We should have a place to discuss/comment/collaborate …
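For the record, the single-node vader configuration described here corresponds to an invocation along these lines (a sketch only; the benchmark binary name is hypothetical, the options are standard Open MPI):

    mpirun -np 2 -mca pml ob1 -mca btl vader,self -bind-to socket \
        -report-bindings ./mt_benchmark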

Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Jeff Squyres (jsquyres)
On Jul 28, 2016, at 6:28 AM, Artem Polyakov wrote:
>
> Jeff and others,
>
> 1. The benchmark was updated to support the shared memory case.
> 2. The wiki was updated with the benchmark description:
> https://github.com/open-mpi/ompi/wiki/Request-refactoring-test#benchmark-prototype

Sweet -- thanks …

Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Artem Polyakov
Jeff and others,

1. The benchmark was updated to support the shared memory case.
2. The wiki was updated with the benchmark description:
https://github.com/open-mpi/ompi/wiki/Request-refactoring-test#benchmark-prototype

Let me know if we want to put this prototype in some general place. I think it may …

Re: [OMPI devel] Performance analysis proposal

2016-07-28 Thread Jeff Squyres (jsquyres)
On Jul 28, 2016, at 2:52 AM, Sreenidhi Bharathkar Ramesh via devel wrote:
>
>> For Open MPI, it's basically THREAD_MULTIPLE and not-THREAD_MULTIPLE. I.e.,
>> there's no real difference between SINGLE, SERIALIZED, FUNNELED.
>
> We were assuming that there would be a cost due to
> locking/synchronization …
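One way to check that assumption empirically -- a sketch only, assuming a hypothetical benchmark mt_bench that accepts the requested thread level as an argument -- is to run the identical binary once per level and compare:

    for level in single funneled serialized multiple; do
        mpirun -np 2 -mca pml ob1 -mca btl vader,self \
            ./mt_bench --thread-level $level
    done

If the quoted statement holds, the first three levels should perform essentially identically, with only "multiple" paying the locking/synchronization cost.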

[OMPI devel] sm/vader BTL performance in openmpi-2.0.0

2016-07-28 Thread tmishima
Hi Nathan,

You gave me a hint, thanks! I applied your patches and added "-mca btl_openib_flags 311" to the mpirun options, and then it worked for me.

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl_openib_flags 311 -bind-to core -report-bindings osu_bw
[manage.cluster:21733] MCW rank 0 …
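For anyone following along: btl_openib_flags is a bitmask describing which operations the openib BTL advertises (send/put/get and related capability bits). Its current and default values can be inspected with ompi_info -- a standard command, shown here as a convenience:

    ompi_info --param btl openib --level 9 | grep btl_openib_flags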