On Tue, Jan 10, 2012 at 10:02 AM, Roberto Rey <eros...@gmail.com> wrote:
> I'm running some tests on EC2 cluster instances with 10 Gigabit Ethernet
> hardware and I'm getting strange latency results with Netpipe and OpenMPI.
- There are 3 types of instances that can use 10 GbE. Are you using
  "cc1.4xlarge", "cc2.8xlarge", or "cg1.4xlarge"?
- Did you set up a placement group?
- Also, which AMI are you using?

> I'm using the BTL TCP in OpenMPI, so I can't understand why OpenMPI
> outperforms raw TCP performance for small messages (40us of difference).
>
> Can OpenMPI outperform Netpipe over TCP? Why? Is OpenMPI doing any
> optimization in BTL TCP?

It is indeed interesting! If we run strace with timing (e.g. strace -tt)
and compare the output of NPmpi and NPtcp, we can get a better idea of
what is happening. It is possible that one is doing more busy polling
than the other, and/or triggering Xen to handle things a bit differently.

We should also check the socket options, and measure the system call
latency, to see whether the network is really responsible for the extra
40us.

> The results for OpenMPI aren't so good but we must take into account the
> network virtualization overhead under Xen

If you are running Cluster Compute Instances, then you are using HVM. If
things are set up properly (HVM and a placement group), you can even get
a Top500-class machine on EC2 -- Amazon used a similar setup for their
TOP500 submission:

http://i.top500.org/site/50321

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
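P.S. The socket-option check mentioned above can be sketched like this. It dumps the options a TCP latency benchmark tends to hinge on (TCP_NODELAY in particular, plus the send/receive buffer sizes) from a throwaway loopback connection; on the actual EC2 instances you would inspect the benchmark's own sockets instead, e.g. with strace:

```python
# Sketch: print the socket options that can explain small-message latency
# differences. Uses a throwaway loopback connection as a stand-in for the
# benchmark's real sockets.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))       # ephemeral port on loopback
srv.listen(1)
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
conn, _ = srv.accept()

# Fresh sockets show the kernel defaults; a benchmark or MPI library may
# override any of these with setsockopt().
opts = {
    "TCP_NODELAY": cli.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY),
    "SO_SNDBUF": cli.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    "SO_RCVBUF": cli.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
}
for name, val in opts.items():
    print(name, val)

cli.close(); conn.close(); srv.close()
```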
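P.P.S. One candidate for the 40us gap on small messages: to my knowledge, Open MPI's TCP BTL enables TCP_NODELAY (disabling Nagle's algorithm), and a raw-TCP benchmark that leaves it off can see extra delay on small sends. Here is a minimal loopback ping-pong sketch for comparing the two settings; loopback timings won't reproduce the EC2 numbers, and for a strict request-response pattern Nagle's effect may be small, but it shows the knob to check:

```python
# Sketch: small-message TCP round-trip latency with TCP_NODELAY off vs. on.
import socket
import threading
import time

def pingpong(nodelay, iters=200, size=64):
    """Return an estimated one-way latency (seconds) for `size`-byte messages."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    addr = srv.getsockname()

    def echo():
        conn, _ = srv.accept()
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, nodelay)
        for _ in range(iters):
            buf = b""
            while len(buf) < size:          # read exactly one message
                buf += conn.recv(size - len(buf))
            conn.sendall(buf)               # echo it back
        conn.close()

    t = threading.Thread(target=echo)
    t.start()
    cli = socket.create_connection(addr)
    cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, nodelay)
    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(iters):
        cli.sendall(msg)
        buf = b""
        while len(buf) < size:
            buf += cli.recv(size - len(buf))
    elapsed = time.perf_counter() - start
    cli.close()
    t.join()
    srv.close()
    return elapsed / iters / 2              # half the round trip

lat_off = pingpong(0)
lat_on = pingpong(1)
print(f"nodelay off: {lat_off * 1e6:.1f} us, nodelay on: {lat_on * 1e6:.1f} us")
```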