On Wed, Jun 29, 2016 at 8:18 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
> On Jun 29, 2016, at 10:06 PM, Jeff Hammond <jeff.scie...@gmail.com> wrote:
>
> > On Wednesday, June 29, 2016, Barry Smith <bsm...@mcs.anl.gov> wrote:
> >
> > > Who are these people and why do they have this webpage?
> >
> > Pop up 2-3 directories and you'll see this is a grad student who appears
> > to be trying to learn applied math. Is this really your enemy? Don't you
> > guys have some DOE bigwigs to bash?
> >
> > > Almost for sure they are doing no process binding and no proper
> > > assignment of processes to memory domains.
> >
> > MVAPICH2 sets affinity by default. Details are not given, but "InfiniBand
> > enabled" means it might have been used. I don't know what Open MPI does
> > by default, but affinity alone doesn't explain this.
>
> By affinity you mean that the process just remains on the same core,
> right? You could be right; I think the main effect is a bad assignment of
> processes to cores/memory domains.

Yes, affinity to cores. I checked, and:

- Open MPI does no binding by default
  (https://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.4).
- MVAPICH2 sets affinity by default except when MPI_THREAD_MULTIPLE is used
  (http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.0-userguide.pdf).
- I am not certain what Intel MPI does in every case, but at least on Xeon
  Phi it defaults to compact placement
  (https://software.intel.com/en-us/articles/mpi-and-process-pinning-on-xeon-phi),
  which is almost certainly wrong for bandwidth-limited apps (where scatter
  makes more sense).

> > > In addition they are likely filling up all the cores on the first node
> > > before adding processes to the second node, etc.
> >
> > That's how I would show scaling. Are you suggesting using all the nodes
> > and doing breadth-first placement?
>
> I would fill up one process per memory domain, moving across the nodes;
> then go back and start a second process on each memory domain,
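For concreteness, the two placement orders being contrasted in the thread — compact, which fills every core of a node before moving on, and scatter, which puts one process on each memory domain across all the nodes before wrapping around — can be sketched as rank-to-(node, domain) maps. This is purely illustrative: the function names and the machine shape (`nnodes` nodes, `ndomains` memory domains per node, `cores_per_domain` cores each) are assumptions for the sketch, not the behavior of any particular MPI launcher.

```python
def scatter_placement(nprocs, nnodes, ndomains):
    """Breadth-first ("scatter") order: one process per memory domain
    across all nodes, then wrap around and double up, etc."""
    domains = [(node, dom) for node in range(nnodes)
               for dom in range(ndomains)]
    return [domains[rank % len(domains)] for rank in range(nprocs)]


def compact_placement(nprocs, ndomains, cores_per_domain):
    """Depth-first ("compact") order: fill every core of node 0 before
    touching node 1, so early ranks all share one node's bandwidth."""
    cores_per_node = ndomains * cores_per_domain
    return [(rank // cores_per_node,
             (rank % cores_per_node) // cores_per_domain)
            for rank in range(nprocs)]
```

With 6 ranks on 2 nodes (2 domains per node, 2 cores per domain), compact puts ranks 0-3 on node 0 and only ranks 4-5 on node 1, while scatter gives each of the four domains its own rank before doubling up — which is why a bandwidth-limited strong-scaling curve stays flat from 1 to 4 processes under scatter but falls off under compact.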
> etc. You can also just go across nodes as you suggest and then across
> memory domains.

That's reasonable. I just don't bother showing scaling except in the unit
of charge, which in most cases is nodes (exception: Blue Gene). There is no
way to decompose node resources in a completely reliable way, so one should
always use the full node as effectively as possible for every node count.
The other exception is the cloud, where hypervisors are presumably doing a
halfway decent job of dividing up resources (and adding enough overhead
that performance is irrelevant anyway :-) ) and one can plot scaling in the
number of (virtual) cores.

> If you fill up the entire node of cores and then go to the next node, you
> get this effect where the performance goes way down as you fill up the
> last of the cores (because no more memory bandwidth is available), and
> then performance goes up again as you jump to the next node and suddenly
> have a big chunk of additional bandwidth. You also have a weird
> load-balancing problem, because the first 16 processes go slow since they
> share some bandwidth, while the 17th runs much faster since it can hog
> more bandwidth.

Indeed, 17 on 2 should be distributed as 9 and 8, not 16 and 1, although
using nproc % nnode != 0 is silly. I thought you meant scaling up to 20
with 1 ppn on 20 nodes, then going to 40 with 2 ppn, etc.

Jeff

> > > If the studies had been done properly there should be very little
> > > fall-off in the strong scaling in going from 1 to 2 to 4 processes and
> > > even beyond. Similarly, the huge fall-off in going from 4 to 8 to 16
> > > would not occur for weak scaling.
> > >
> > > Barry
> > >
> > > > On Jun 29, 2016, at 7:47 PM, Matthew Knepley <knep...@gmail.com> wrote:
> > > >
> > > > http://guest.ams.sunysb.edu/~zgao/work/airfoil/scaling.html
> > > >
> > > > Can we rerun this on something at ANL since I think this cannot be true.
> > > > Matt
> > > >
> > > > --
> > > > What most experimenters take for granted before they begin their
> > > > experiments is infinitely more interesting than any results to which
> > > > their experiments lead.
> > > >   -- Norbert Wiener
> >
> > --
> > Jeff Hammond
> > jeff.scie...@gmail.com
> > http://jeffhammond.github.io/

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
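On the 17-on-2 point in the thread: the even split Jeff describes (9 and 8 rather than 16 and 1) is just integer division with the remainder spread one extra rank per node. A minimal sketch, with an assumed helper name:

```python
def balanced_counts(nprocs, nnodes):
    """Spread nprocs ranks over nnodes nodes as evenly as possible:
    the first (nprocs % nnodes) nodes get one extra rank."""
    base, extra = divmod(nprocs, nnodes)
    return [base + 1] * extra + [base] * (nnodes - extra)
```

`balanced_counts(17, 2)` gives `[9, 8]` and `balanced_counts(20, 3)` gives `[7, 7, 6]`; the per-node counts always differ by at most one, which avoids the one-lucky-rank load imbalance described above.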