Jeremy,

You might be better off posting to the Linux Kernel Mailing List (LKML) about this issue. There are a few experts here (Don, Joe, Mike, ....) who might know, but LKML is more likely to give you correct guidance quickly.

cheers,
        Bruce

On Mon, 24 Sep 2007, Jeremy Fleming wrote:

I have a quad-Opteron machine where each node is a dual-core CPU with each
core running at 2.0 GHz.  The machine has 64 GB of RAM, two Broadcom gigabit
Ethernet cards, and two other Intel gigabit cards each supplying 2 ports,
which are supported by the e1000 driver.  The machine is running the default
install of Red Hat Enterprise Linux 5.0 (original release, no patches or updates).


Remote machines are supplying ~512 megabit/sec streams over gigabit Ethernet
to this machine.  There are two streams on separate Ethernet lines, and I have
each stream connected to a different port on one of the Intel cards.  The
streams are sent via multicast, and there are 4 sub-streams per Ethernet
line.  Each sub-stream is approximately 131.072 megabits/sec.

On the Opteron machine I have a process that can pull a sub-stream off an
Ethernet port and dump it to a ring buffer in shared memory.  At first, the
process could never keep up with receiving the data via Ethernet and then
doing a memcpy to shared memory.  Then I found out about NUMA and decided
to use sched_setaffinity to bind the process to a CPU; I bound the process
to the same CPU the Ethernet card is bound to via its IRQ.
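Roughly, the capture process looks like the sketch below (the CPU number,
multicast group, and port are placeholders for illustration, and in the real
program the destination of the memcpy is the ring buffer in shared memory):

    #define _GNU_SOURCE
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        /* Bind to the CPU that services the NIC's IRQ (2 is a placeholder). */
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(2, &set);
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* Join the multicast group carrying one sub-stream. */
        int s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5001);               /* placeholder port */
        if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
            perror("bind");
            return 1;
        }

        struct ip_mreq mreq;
        mreq.imr_multiaddr.s_addr = inet_addr("239.1.1.1");  /* placeholder group */
        mreq.imr_interface.s_addr = htonl(INADDR_ANY);
        if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                       &mreq, sizeof(mreq)) != 0) {
            perror("IP_ADD_MEMBERSHIP");
            return 1;
        }

        /* Receive packets and copy them out. */
        char pkt[9000], copy[9000];
        for (;;) {
            ssize_t n = recv(s, pkt, sizeof(pkt), 0);
            if (n <= 0)
                break;
            memcpy(copy, pkt, (size_t)n);   /* stands in for the ring-buffer memcpy */
        }
        close(s);
        return 0;
    }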

I looked in /proc/interrupts, found "eth0" or "eth1", looked up its IRQ,
then went into /proc/irq/<eth0 IRQ>/smp_affinity and checked which CPU the
IRQ was bound to.  I bound the process to that processor and ran it again.
Luckily there was no data loss and it could keep up.  I bound the process
before I allocated memory, so the memory ended up local to that processor
too.  I was even able to run three more processes, bound to the same CPU,
and have all 4 read the sub-streams from the Ethernet device eth0 with no
data loss.  I can even run another process which reads from the ring buffer
and dumps the data to disk, and it causes no slowdowns or data loss.
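The check-and-bind step looks roughly like this (a sketch only; the IRQ
number 48 and the buffer size are placeholders, and the real ring buffer
lives in a shared memory segment rather than anonymous memory -- the point
is just that the binding happens before the first touch of the buffer, so
the default local-allocation policy puts the pages on the same node):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define RING_BYTES (64UL * 1024 * 1024)   /* placeholder ring size */

    int main(void)
    {
        /* 1. Read the hex CPU mask for the NIC's IRQ (48 is a placeholder). */
        FILE *f = fopen("/proc/irq/48/smp_affinity", "r");
        if (!f) { perror("fopen"); return 1; }
        unsigned long mask = 0;
        if (fscanf(f, "%lx", &mask) != 1) { fclose(f); return 1; }
        fclose(f);

        /* 2. Bind to the lowest CPU present in that mask. */
        int cpu = 0;
        while (cpu < 63 && !(mask & (1UL << cpu)))
            cpu++;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(cpu, &set);
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* 3. Allocate and touch the buffer only after binding, so the
           pages are placed on this CPU's node by first touch. */
        void *ring = mmap(NULL, RING_BYTES, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (ring == MAP_FAILED) { perror("mmap"); return 1; }
        memset(ring, 0, RING_BYTES);

        printf("bound to CPU %d, ring buffer ready\n", cpu);
        return 0;
    }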

Now I want to read a sub-stream from the other stream connected to eth1 while
reading from the other 4 sub-streams.  I start that up just by binding the
same application to the processor associated with eth1, by checking
"/proc/irq/<eth1 IRQ>/smp_affinity".  When that process starts, the system
can no longer keep up, just like in the beginning when I had one processor
reading one stream without doing anything else.  I thought I was just trying
to do too much work, so I turned off all streams and ran just two processes
bound to two different processors, each bound to the same processor as its
associated eth device.  I ran them both, and they lose data.  If I run them
separately they work fine, but when I read 1 sub-stream from each of the two
unique streams they fail.

Are the two Ethernet devices dumping their multicast data into kernel
buffers associated with different processors?
How do I know which processor the kernel's Ethernet buffers are associated
with?
Is there a way to set CPU affinity for the Ethernet devices at boot time, so
I know which processor's memory they are dumping data to?  (See the sketch
after these questions.)
Any ideas on why there would be a problem with reading a stream from each
eth device at the same time, but not with reading 4 streams from one eth
device?
Do I need to turn on a NUMA-aware scheduler somehow, or is that on by
default in RHEL 5?
I also noticed that Linux assigns IRQs at bootup that vary with each boot;
is there a way to statically assign IRQs to the Ethernet cards?
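
For what it's worth, the smp_affinity file can also be written, to pin an
IRQ to a chosen CPU at runtime -- a rough sketch (IRQ 48 and CPU 1 are
placeholders, and this must run as root):

    #include <stdio.h>

    /* Pin an IRQ to one CPU by writing a hex mask to its smp_affinity file. */
    int main(void)
    {
        int irq = 48;                     /* placeholder IRQ */
        unsigned long mask = 1UL << 1;    /* CPU 1 */
        char path[64];
        snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
        FILE *f = fopen(path, "w");
        if (!f) { perror("fopen"); return 1; }
        fprintf(f, "%lx\n", mask);
        fclose(f);
        return 0;
    }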

Any help or pointers at all would be great!

Thanks in advance
Jeremy

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
