Hello all!
We've experienced an interesting problem on a Linux cluster comprising
the following systems:
Kernel: 2.1.129 SMP (upgraded from 2.1.115)
CPU: 2 x PII 450 MHz
Mem: 512 MB
Motherboard: Tyan Dual board Dluan 1836
Ethernet: 100 Mb/s eepro100B
[1]
----
The test was done using "netpipe 2.3", between "COMP_0" (transmitter)
and "COMP_1" (receiver).
When running with kernel 2.1.115 we get a maximum of about 180 Mb/s
total bandwidth between 3 nodes in a hypernet configuration (i.e. each
node connected to the others with a separate Ethernet line). This means
we get only ~60 Mb/s from each line (which is configured as full
duplex). It gets worse with 4 nodes: again a total of ~180-190 Mb/s,
i.e. each line degrades to ~45 Mb/s.
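As a quick check of the per-line figures, here is a minimal sketch,
assuming the total is simply split evenly by the number of nodes (which
is how the ~60 and ~45 Mb/s figures appear to be derived). The function
name is made up for illustration:

```python
def per_line_rate(total_mbps: float, nodes: int) -> float:
    """Average rate per line, assuming the total is split evenly by node count."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_mbps / nodes


# Totals from the measurements above, in Mb/s.
print(per_line_rate(180, 3))  # 60.0
print(per_line_rate(180, 4))  # 45.0
```

Either way, the total stays flat at roughly 180 Mb/s as nodes are added,
so the per-line share only shrinks.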
So we decided to try the newer kernel - 2.1.129 - and there we found
the following problems:
The communication rate degrades considerably starting with 4 KB blocks.
It drops from 60 Mbit/s to under 1 Mbit/s.
No errors are reported by "netstat" or "ifconfig".
When running netpipe between "COMP_0" <-> "COMP_1" and "COMP_0" <->
"COMP_2" simultaneously, i.e. COMP_0 and COMP_1 have a separate Ethernet
connection between them and there is another line (crossed Ethernet
cable) between COMP_0 and COMP_2, the kernel hangs.
Replacing the eepro100 with a 3Com 3C905B (Cyclone) on COMP_0 solves the
problem.
We *physically* removed 1 CPU from the system.
After rebooting with the same kernel as before, "netpipe" now returns
reasonable results.
The same problem, communication degradation, occurred on the new
"stable" kernel, 2.0.36.
We speculate that the SMP modifications from 2.0.34 to 2.0.36 and from
2.1.115 to 2.1.129 cause the problem.
[2]
----
We tried booting the dual CPU system with the 2.1.129 kernel, not
compiled for SMP.
In this case, the system locks up during boot, just after loading the
Ethernet drivers.
Any clues on how this problem could be investigated?
Thank you,
Amir
begin: vcard
fn: Amir Gil
n: Gil;Amir
org: Aplicatek High Performance Applications
adr: bld. 8CIndustrial Park;;P.O.B 3020;Omer;;84965;Israel
email;internet: [EMAIL PROTECTED]
title: Applications Manager
tel;work: 972-7-6909109/91/92
tel;fax: 972-7-6909108
x-mozilla-cpt: ;14688
x-mozilla-html: FALSE
version: 2.1
end: vcard