Hello.  My understanding is that the ARP caching mechanism works regardless
of whether you use static MAC addresses or dynamically generated ones.  The
reason is that ARP bridges the gap between the layer 2 network, i.e. the MAC
addresses, and the layer 3 network, i.e. the IP addresses those MAC addresses
map to.  You can demonstrate this interaction by shutting down the vif
interface to your domu, deleting the domu's entry from the ARP cache for that
vif with arp -d <domu IP address or hostname>, and then trying to ping your
domu from dom0.  After about 20 seconds, you should see the host is down
message.  Then use arp -a to look for your domu's IP address.  What you'll see
in the MAC field is the word "incomplete".  If you then run brconfig on the
bridge containing the domu, you'll see the MAC address you assigned, or which
was assigned dynamically, alive and well.
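
For what it's worth, the demonstration above could be scripted roughly as
follows.  This is only a sketch: the IP address, vif name, and bridge name are
placeholders I made up, and the helper assumes arp -a prints the IP address in
parentheses and the word "incomplete" for an unresolved entry.  It needs root
on the dom0.

```shell
#!/bin/sh
# Placeholders (assumptions, not taken from any real setup): adjust to yours.
DOMU_IP="${DOMU_IP:-192.168.1.10}"
VIF="${VIF:-xvif1.0}"
BRIDGE="${BRIDGE:-bridge0}"

# Succeeds if the ARP listing on stdin has an unresolved entry for the IP in $1.
# Assumes the listing shows the address as "(a.b.c.d)".
arp_is_incomplete() {
    grep -F "($1)" | grep -q "incomplete"
}

# Walk through the demonstration; run as root on the dom0.
arp_demo() {
    ifconfig "$VIF" down            # take the domu's vif down
    arp -d "$DOMU_IP"               # arp -d takes a host name or IP address
    ping -c 3 "$DOMU_IP"            # expected to fail while the vif is down
    if arp -a | arp_is_incomplete "$DOMU_IP"; then
        echo "ARP entry for $DOMU_IP is incomplete"
    fi
    brconfig "$BRIDGE"              # the bridge should still list the domu's MAC
    ifconfig "$VIF" up              # restore the vif afterwards
}
```

To try it, source the file and run arp_demo, overriding DOMU_IP, VIF, and
BRIDGE in the environment as needed.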

        My guess is that you're running into some sort of short-term memory
crunch inside the dom0's network stack.  The long-term ping test should
provide more details about where this memory crunch might be.  The long-time
favorite variable for this issue is the good ole nmbclusters value, tunable in
the kernel config and visible through:
/sbin/sysctl kern.mbuf.nmbclusters
Although it's a blunt instrument, the output from:
netstat -m
might be helpful as well, specifically the value listed as the number of calls
to protocol drain routines.
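
If it helps, here is a rough sketch of how one might log those two values
next to a ping result during a long-running test, run on the dom0.  The
function names and the 60-second interval are my own choices, and the parsing
assumes netstat -m prints a line of the form "N calls to protocol drain
routines".

```shell
#!/bin/sh
# Print the "calls to protocol drain routines" count from netstat -m output
# supplied on stdin.
drain_calls() {
    awk '/calls to protocol drain routines/ { print $1 }'
}

# Once a minute, log whether the target answers a ping, together with the
# drain count and the current nmbclusters limit.  Runs until interrupted.
watch_mbufs() {
    target="$1"
    while :; do
        drains=$(netstat -m | drain_calls)
        limit=$(/sbin/sysctl -n kern.mbuf.nmbclusters)
        if ping -c 1 "$target" >/dev/null 2>&1; then
            state=up
        else
            state=down
        fi
        echo "$(date '+%Y-%m-%d %H:%M:%S') ping=$state drains=$drains nmbclusters=$limit"
        sleep 60
    done
}
```

Running watch_mbufs <domu IP address> and redirecting the output to a file
gives a timeline; a drain count that climbs around the times the ping fails
would point toward mbuf exhaustion.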

        Yet another possibility is that you have a firewall set up, either on
the dom0 or on the domu in question.  If you're running into some rule that
restricts access or bandwidth on the path between the dom0 and the domu, you
might see this kind of behavior.  Unfortunately, in my experience, when one
runs into a firewall issue of this nature, the error messaging around it is
very misleading.  It's important to remember that the IP stacks on the dom0
and the domu don't know that the machine at the other end of the connection is
actually running on the same hardware.  Consequently, if there are firewall
rules set up on the dom0, the domu in question, or both, be sure those rules
provide full access between the dom0 and the domu, just as you would if you
were writing rules for remote machines.

The fact that you're only seeing this problem when communicating between the
dom0 and the domu, and not between the domu and the rest of the world,
suggests to me that the problem is on the dom0, so I would start by looking
there.

Hope these notes help.
-Brian