Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-07-14 Thread Josh Aune

On 6/29/06, Patrick Geoffray <patr...@myri.com> wrote:

Jeff Squyres (jsquyres) wrote:
>> -----Original Message-----
>> From: devel-boun...@open-mpi.org
>> [mailto:devel-boun...@open-mpi.org] On Behalf Of Patrick Geoffray
>> Sent: Wednesday, June 28, 2006 1:23 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] Best bw/lat performance for
>> microbenchmark/debug utility
>>
>> Josh Aune wrote:
>>> I am writing up some interconnect/network debugging software that is
>>> centered around ompi.  What is the best set of functions to

> I was assuming that you would be testing latency/bandwidth, but Patrick
> is correct in stating that there are many more things to test than just
> those two metrics.

There are a lot of metrics, but most of them require deep understanding
of the MPI semantics and implementation details to make sense. The art
of micro-benchmarking is to choose the metrics and explain why they
matter. It's obvious for latency/bandwidth, a bit less for unexpected
and host overhead, definitely hard for overlap and progress. And that's
just for point-to-point.

To avoid reinventing the wheel, I would suggest that Josh develop a
micro-benchmark test suite to compute very detailed LogP-derived
parameters, i.e. for all message sizes:
* send overhead (o.s) and recv overhead (o.r). These overheads will
likely be either constant or linear for various message size ranges; it
would be great to automatically compute the ranges.
Memory registration cost is accounted for here, so it would also be
useful to measure with and without a registration cache.
* Latency (L).
* Send gap (g.s) and recv gap (g.r). For large messages, they will
likely be identical and represent the link bandwidth. For smaller
messages, the send gap is the gap of a fan-out pattern (1->N) and the
recv gap is the gap of a flat gather (N->1). It's important not to have
the send or recv overhead hide the send or recv gap; several processes
could be used to divide the send/recv overhead.
* unexpected overhead (o.u). Overhead added to (o.r) when the message is
not immediately matched.
* overlap availability (a), i.e. the percentage of communication time
that you can overlap with real host computation.

From these parameters, you can derive pretty much all characteristics
of an interconnect without contention.

Patrick


Sorry for the long delay in replying.  Thanks for the info.  What I am
trying to do is create a set of standardized, easy-to-use system-level
debugging utilities (and force myself to learn more MPI :).  Currently
I am shooting for latency/bandwidth but would welcome ideas for
further useful node-level tests.  I am not just testing the
interconnect, but need to verify memory bandwidth, pci bandwidth to
the interconnect card (I love -mca btl ^sm :), processor
functionality, system errors (currently only parity and pci-express
fatal/nonfatal/etc) and whatnot.

I want to have tests that are easy enough to run that all you have to
do is 'mpirun -np $ALL ./footest' and it comes back with any nodes that
look bad for that test as well as some general data about the cluster's
performance.

I want to get the suite out to the community after I have some seed
tests written and hope that there will be enough that others will be
interested in contributing, though I am waiting for release approval
from work at the moment, which may not happen :(

Josh
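
The "comes back with any nodes that look bad" behaviour described above
could be sketched as a post-processing step over per-node results. This
is only an illustration: the function name, the node names, the sample
numbers and the 10% tolerance are all assumptions, not part of any
existing tool, and a real suite would gather the numbers over MPI first.

```python
from statistics import median


def flag_slow_nodes(bandwidth_by_node, tolerance=0.10):
    """Return (suspect_nodes, cluster_median).

    bandwidth_by_node maps a node name to its measured bandwidth (any
    consistent unit). A node is flagged when it falls more than
    `tolerance` (a fraction, hypothetical default 10%) below the
    cluster median. The median is used so that a few bad nodes do not
    drag the reference value down.
    """
    med = median(bandwidth_by_node.values())
    floor = med * (1.0 - tolerance)
    suspects = sorted(
        node for node, bw in bandwidth_by_node.items() if bw < floor
    )
    return suspects, med


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    samples = {
        "node01": 940.0,
        "node02": 935.0,
        "node03": 610.0,
        "node04": 945.0,
    }
    suspects, med = flag_slow_nodes(samples)
    print("cluster median:", med)
    print("suspect nodes:", suspects)
```

The same comparison works for latency by flagging values above the
median instead of below it.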


Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-29 Thread Patrick Geoffray

Jeff Squyres (jsquyres) wrote:

-----Original Message-----
From: devel-boun...@open-mpi.org 
[mailto:devel-boun...@open-mpi.org] On Behalf Of Patrick Geoffray

Sent: Wednesday, June 28, 2006 1:23 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] Best bw/lat performance for 
microbenchmark/debug utility


Josh Aune wrote:

I am writing up some interconnect/network debugging software that is
centered around ompi.  What is the best set of functions to 



I was assuming that you would be testing latency/bandwidth, but Patrick
is correct in stating that there are many more things to test than just
those two metrics.


There are a lot of metrics, but most of them require deep understanding
of the MPI semantics and implementation details to make sense. The art
of micro-benchmarking is to choose the metrics and explain why they
matter. It's obvious for latency/bandwidth, a bit less for unexpected
and host overhead, definitely hard for overlap and progress. And that's
just for point-to-point.


To avoid reinventing the wheel, I would suggest that Josh develop a
micro-benchmark test suite to compute very detailed LogP-derived
parameters, i.e. for all message sizes:
* send overhead (o.s) and recv overhead (o.r). These overheads will
likely be either constant or linear for various message size ranges; it
would be great to automatically compute the ranges.
Memory registration cost is accounted for here, so it would also be
useful to measure with and without a registration cache.

* Latency (L).
* Send gap (g.s) and recv gap (g.r). For large messages, they will
likely be identical and represent the link bandwidth. For smaller
messages, the send gap is the gap of a fan-out pattern (1->N) and the
recv gap is the gap of a flat gather (N->1). It's important not to have
the send or recv overhead hide the send or recv gap; several processes
could be used to divide the send/recv overhead.
* unexpected overhead (o.u). Overhead added to (o.r) when the message is
not immediately matched.
* overlap availability (a), i.e. the percentage of communication time
that you can overlap with real host computation.


From these parameters, you can derive pretty much all characteristics 
of an interconnect without contention.


Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
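
To make "derive pretty much all characteristics" concrete, here is a
minimal sketch of a few quantities that follow from the parameters
listed above, for one message size. The class and function names are
hypothetical. The burst formula is the usual LogP-style simplification
(the slowest of the gaps and overheads sets the pace), and the overlap
formula assumes three separately timed runs (communication alone,
computation alone, both together); neither is claimed to be how any
existing benchmark does it.

```python
from dataclasses import dataclass


@dataclass
class LogPParams:
    """Parameters for ONE message size, all in microseconds."""
    o_s: float  # send overhead
    o_r: float  # recv overhead
    L: float    # latency
    g_s: float  # send gap
    g_r: float  # recv gap


def one_way_time(p):
    """Time for a single message: send overhead + latency + recv overhead."""
    return p.o_s + p.L + p.o_r


def burst_time(p, k):
    """Time for k back-to-back messages between one sender and one receiver.

    Assumption: successive messages are spaced by whichever of the
    gaps or overheads is largest.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    interval = max(p.g_s, p.g_r, p.o_s, p.o_r)
    return p.o_s + (k - 1) * interval + p.L + p.o_r


def bandwidth_mb_per_s(size_bytes, gap_us):
    """For large messages the gap gives the bandwidth.

    Bytes per microsecond is numerically equal to MB/s (10**6 bytes/s).
    """
    return size_bytes / gap_us


def overlap_availability(t_comm, t_comp, t_both):
    """Percentage of communication time hidden behind computation.

    Assumes t_comm (communication alone), t_comp (computation alone)
    and t_both (both issued together) were measured separately.
    """
    hidden = t_comm + t_comp - t_both
    hidden = min(max(hidden, 0.0), t_comm)  # clamp measurement noise
    return 100.0 * hidden / t_comm
```

For example, with t_comm = 10, t_comp = 20 and t_both = 22 (same unit),
8 of the 10 units of communication were hidden, which is 80%.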


Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-29 Thread Jeff Squyres (jsquyres)
> -----Original Message-----
> From: devel-boun...@open-mpi.org 
> [mailto:devel-boun...@open-mpi.org] On Behalf Of Patrick Geoffray
> Sent: Wednesday, June 28, 2006 1:23 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] Best bw/lat performance for 
> microbenchmark/debug utility
> 
> Josh Aune wrote:
> > I am writing up some interconnect/network debugging software that is
> > centered around ompi.  What is the best set of functions to use to get
> > the best bandwidth and latency numbers for openmpi and why?  I've been
>
> You mean MPI functions or internal ompi functions? For MPI functions,
> it depends on what you are looking for. Send/recv is fine but it does
> not show the overlap capability. You would need to do something smarter
> with Isend/Irecv/Wait for that (Sandia has a nice bench that they should
> release soon). You may also want to measure the penalty for unexpected
> messages, the host CPU overhead and the ability to progress.

Patrick's answer is much better than mine.  :-)

I was assuming that you would be testing latency/bandwidth, but Patrick
is correct in stating that there are many more things to test than just
those two metrics.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems



Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-28 Thread Patrick Geoffray

Josh Aune wrote:

I am writing up some interconnect/network debugging software that is
centered around ompi.  What is the best set of functions to use to get
the best bandwidth and latency numbers for openmpi and why?  I've been


You mean MPI functions or internal ompi functions? For MPI functions,
it depends on what you are looking for. Send/recv is fine but it does
not show the overlap capability. You would need to do something smarter
with Isend/Irecv/Wait for that (Sandia has a nice bench that they should
release soon). You may also want to measure the penalty for unexpected
messages, the host CPU overhead and the ability to progress.


All of these metrics are measured by existing benchmarks. Do you want to
write one that covers everything, or something like IMB?


Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
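
Whichever calls collect the timings (blocking send/recv is enough for
plain latency and bandwidth, as noted above), the usual ping-pong
numbers are simple to derive afterwards. A minimal sketch, assuming the
round-trip times for one message size have already been measured in
microseconds; the function name and the choice of the median are
illustrative assumptions.

```python
from statistics import median


def pingpong_stats(size_bytes, round_trip_us):
    """Return (one_way_latency_us, bandwidth_mb_per_s) for one message size.

    round_trip_us is a list of measured round-trip times in
    microseconds. Half of the median round trip is taken as the
    one-way time; the median is an assumption chosen here to damp
    outliers such as the first, cold iteration.
    """
    if not round_trip_us:
        raise ValueError("need at least one timing sample")
    one_way = median(round_trip_us) / 2.0
    # Bytes per microsecond is numerically equal to MB/s (10**6 bytes/s).
    bandwidth = size_bytes / one_way
    return one_way, bandwidth
```

Latency is normally read from the smallest message sizes and bandwidth
from the largest, so in practice this would be called once per size.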


[OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-28 Thread Josh Aune

I am writing up some interconnect/network debugging software that is
centered around ompi.  What is the best set of functions to use to get
the best bandwidth and latency numbers for openmpi and why?  I've been
asking around at work and some people say just send/receive, though
some of the micro-benchmarks I have looked at in the past used
isend/irecv.  Can someone shed some light on this (or propose more
methods)?

Thanks,
josh