Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times

2010-04-11 Thread Chris Samuel

On 10/04/10 15:12, Bogdan Costescu wrote:


Have there been any process scheduler changes in the newer kernels?


Are there ever kernels where that doesn't get tweaked? ;-)


I'm not sure that they could explain a four-orders-of-magnitude
difference, though...


One idea that comes to mind would be to run the child processes
under strace -c, which tallies every system call and reports how
much time was spent in each. Running that comparison on both
2.6.23 and 2.6.24 might then point you at which syscall(s) are
taking longer.
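
As a rough sketch (the wrapper name and output path are just
illustrative), something like this gives each rank its own
per-syscall summary:

#!/bin/sh
# strace-wrap.sh: run the real program under strace -c, which counts
# each system call and the time spent in it. -o sends the summary to
# a per-process file; $$ is this wrapper's PID, so ranks don't collide.
exec strace -c -o /tmp/strace-summary.$$ "$@"

$ mpirun -np 2 ./strace-wrap.sh ./skampi -i ski/skampi_pt2pt.ski

Comparing the "% time" and "seconds" columns of the summaries from
the 2.6.23 run against the 2.6.24 run should show where the extra
time goes.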

Alternatively, if you want to get fancy, you could do a git
bisection between 2.6.23 and 2.6.24 to track down the commit
that introduced the regression.
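
In outline (assuming a mainline kernel tree, and that the benchmark
gives you a clear good/bad verdict on each kernel you boot):

$ git bisect start
$ git bisect bad v2.6.24
$ git bisect good v2.6.23

Then, for each revision git checks out, you build it, boot it, run
the benchmark, and reply with "git bisect good" or "git bisect bad"
until git names the first bad commit. Tedious with a reboot in the
loop, but it should converge in roughly a dozen steps.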

To be honest, it'd be interesting to see whether the issue still
manifests on a recent kernel; if so, we might be able to get the
kernel developers interested (though they will likely ask for a
bisection too).

cheers!
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


Re: [OMPI devel] kernel 2.6.23 vs 2.6.24 - communication/wait times

2010-04-11 Thread Chris Samuel

On 10/04/10 06:59, Oliver Geisler wrote:


These are the results of skampi pt2pt: first with shared memory
allowed, second with shared memory excluded.


For what it's worth, I can't replicate those results on an AMD Shanghai
cluster running a 2.6.32 kernel and Open MPI 1.4.1.

Here is what I see, running under Torque with 2 cores selected on the
same node (so there's no need to specify -np; Open MPI picks the
allocation up from the job environment).
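
For reference, the resource request for such a job would be along
these lines (the script name is just a placeholder):

$ qsub -l nodes=1:ppn=2 run_skampi.sh

The two runs inside the job were: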

$ mpirun --mca btl self,sm,tcp  ./skampi -i ski/skampi_pt2pt.ski

# begin result "Pingpong_Send_Recv"
count=    1      4   2.0   0.0   16   2.0   1.8
count=    2      8   2.1   0.0   16   2.1   1.8
count=    3     12   2.1   0.1    8   2.0   2.0
count=    4     16   2.1   0.1    8   2.0   2.0
count=    6     24   2.0   0.0   16   2.0   1.8
count=    8     32   2.9   0.0   16   2.7   2.4
count=   11     44   2.3   0.1   16   2.2   2.0
count=   16     64   2.2   0.1   16   2.1   2.0
count=   23     92   2.7   0.2   16   2.6   2.1
count=   32    128   2.5   0.1   16   2.5   2.1
count=   45    180   3.0   0.0   16   2.8   2.6
count=   64    256   3.1   0.0    8   3.0   2.5
count=   91    364   3.1   0.0    8   3.0   3.0
count=  128    512   3.4   0.2   16   3.3   3.0
count=  181    724   4.1   0.0   16   4.0   4.1
count=  256   1024   5.0   0.0    8   4.5   4.5
count=  362   1448   6.0   0.0   16   5.8   5.7
count=  512   2048   7.7   0.1   16   7.3   7.6
count=  724   2896  10.0   0.0    8  10.0   9.8
count= 1024   4096  12.3   0.1   16  12.1  12.0
count= 1448   5792  13.8   0.2    8  13.5  13.4
count= 2048   8192  18.1   0.0   16  17.9  18.1
count= 2896  11584  25.0   0.0   16  24.9  25.0
count= 4096  16384  34.2   0.1   16  34.0  34.2
# end result "Pingpong_Send_Recv"
# duration = 0.00 sec

$ mpirun --mca btl tcp,self  ./skampi -i ski/skampi_pt2pt.ski

# begin result "Pingpong_Send_Recv"
count=    1      4  21.2   1.0   16  20.1  17.8
count=    2      8  20.8   1.0   16  20.6  16.7
count=    3     12  20.2   0.9   16  19.0  17.1
count=    4     16  19.9   1.0   16  19.0  17.0
count=    6     24  21.1   1.1   16  20.6  17.0
count=    8     32  20.0   1.0   16  18.8  17.1
count=   11     44  20.9   0.8   16  20.0  17.1
count=   16     64  21.7   1.1   16  20.5  17.6
count=   23     92  21.7   1.0   16  20.0  18.5
count=   32    128  21.6   1.0   16  20.5  18.5
count=   45    180  22.0   1.0   16  20.9  19.0
count=   64    256  21.8   0.7   16  20.5  20.2
count=   91    364  20.5   0.3   16  19.8  19.1
count=  128    512  18.5   0.3    8  17.5  18.1
count=  181    724  19.3   0.2    8  19.1  19.0
count=  256   1024  20.3   0.3   16  19.7  20.0
count=  362   1448  22.1   0.3   16  21.2  21.4
count=  512   2048  24.2   0.3   16  23.7  23.2
count=  724   2896  24.8   0.5    8  24.0  24.0
count= 1024   4096  26.8   0.2   16  26.1  26.3
count= 1448   5792  31.6   0.3   16  30.4  31.5
count= 2048   8192  38.0   0.6   16  37.3  37.1
count= 2896  11584  52.1   1.4   16  49.1  50.8
count= 4096  16384  93.8   1.1   16  81.1  91.5
# end result "Pingpong_Send_Recv"
# duration = 0.02 sec

cheers,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC