[email protected] wrote:
Nope, seems dhcpcd is not requesting a renew of the lease at 1/2 time as
it should (IIRC). http://wiki.neuralbs.com/~kyron/DHCP_reqs_tb17 contains
a grep of the DHCP activity from the disappearing nodes in
/var/log/messages.
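
For reference, the "1/2 time" rule comes from RFC 2131: unless the server overrides them with options 58 and 59, a client should start renewing at T1 = 0.5 x lease time and start rebinding at T2 = 0.875 x lease time. A minimal Python sketch of that arithmetic, assuming a 12-hour lease as the example value (the function name is made up for illustration):

```python
def lease_timers(lease_seconds, t1_fraction=0.5, t2_fraction=0.875):
    """Return (T1, T2) in seconds using the RFC 2131 default fractions.

    T1 is when the client should try to renew with its original server,
    T2 is when it should fall back to rebinding via broadcast.
    """
    t1 = int(lease_seconds * t1_fraction)
    t2 = int(lease_seconds * t2_fraction)
    return t1, t2


if __name__ == "__main__":
    # 12 h lease, used here only as an example value
    t1, t2 = lease_timers(12 * 3600)
    print(f"renew (T1) after {t1} s, rebind (T2) after {t2} s")
```

So with a 12h lease a well-behaved client would be expected to send its first renew roughly 6 hours in, assuming the server does not send explicit T1/T2 values.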
Can you get a client-side log or, even better, a packet capture?
DISCOVER and RENEW requests may look different enough to match
firewall rules differently. For example, DISCOVER is sent to the
broadcast address while RENEW is sent unicast.
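
To make the broadcast/unicast difference concrete, here is a rough Python sketch of the addressing RFC 2131 describes for each kind of message. The class and function names are made up for illustration, the 10.0.1.x addresses are only example values, and the one-line "rule" is a stand-in for a real firewall rule, not a model of any particular firewall.

```python
from dataclasses import dataclass

BROADCAST = "255.255.255.255"
UNSPECIFIED = "0.0.0.0"


@dataclass(frozen=True)
class Addressing:
    src_ip: str   # IP source address of the packet
    dst_ip: str   # IP destination address of the packet
    ciaddr: str   # "client IP address" field inside the DHCP message


def expected_addressing(kind, client_ip=None, server_ip=None):
    """Addressing a client is expected to use for each message kind."""
    if kind == "DISCOVER":
        # Client has no address yet: everything goes out as broadcast.
        return Addressing(UNSPECIFIED, BROADCAST, UNSPECIFIED)
    if kind == "RENEW":
        # REQUEST in RENEWING state: unicast to the leasing server.
        if client_ip is None or server_ip is None:
            raise ValueError("RENEW needs client_ip and server_ip")
        return Addressing(client_ip, server_ip, client_ip)
    if kind == "REBIND":
        # REQUEST in REBINDING state: broadcast, but ciaddr is filled in.
        if client_ip is None:
            raise ValueError("REBIND needs client_ip")
        return Addressing(client_ip, BROADCAST, client_ip)
    raise ValueError(f"unknown message kind: {kind!r}")


def broadcast_only_rule_matches(addr):
    """Hypothetical rule that only lets DHCP through when sent to broadcast."""
    return addr.dst_ip == BROADCAST


if __name__ == "__main__":
    for kind in ("DISCOVER", "RENEW", "REBIND"):
        addr = expected_addressing(kind, "10.0.1.136", "10.0.1.129")
        print(kind, addr, "passes rule:", broadcast_only_rule_matches(addr))
```

A rule written like the hypothetical one above would pass the initial DISCOVER and the later REBIND but drop the unicast RENEW, which is one way a host can obtain a lease yet fail to renew it.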
Oct 7 11:50:52 thinkbig16 dhcpcd[29651]: eth0: sending signal 14 to pid 21865
Oct 7 11:50:52 thinkbig16 dhcpcd[21865]: eth0: received SIGALRM, renewing lease
Oct 7 11:50:52 thinkbig16 dhcpcd[21865]: eth0: renewing lease of 10.0.1.136
Oct 7 11:50:52 thinkbig16 dhcpcd[21865]: eth0: NAK: lease not found from 10.0.1.129
Oct 7 11:50:52 thinkbig16 dhcpcd[29652]: eth0: /lib/dhcpcd/dhcpcd-run-hooks: Network is unreachable
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: broadcasting for a lease
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: offered 10.0.1.136 from 10.0.1.136 `boothost'
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: ignoring offer of 10.0.1.136 from 10.0.1.136 `boothost'
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: acknowledged 10.0.1.136 from 10.0.1.136 `boothost'
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: checking 10.0.1.136 is available on attached networks
Oct 7 11:50:59 thinkbig16 dhcpcd[21865]: eth0: leased 10.0.1.136 for 43200 seconds
So the lease time is explicitly received as being 12h (which is correct)
but the " eth0: offered 10.0.1.136 from 10.0.1.136 `boothost' " has got
me frowning... (note that `boothost' is defined in the config pasted below)
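
If it helps anyone go through the full grep for the other nodes, here is a small Python sketch that flags every "offered X from Y" line where the address after "from" equals the offered address, which is the odd pattern quoted above. It only assumes the dhcpcd message wording seen in this log; the function name is made up, and it makes no claim about what dhcpcd actually puts in the "from" field.

```python
import re

# Matches the dhcpcd wording seen in the log: "offered <ip> from <ip>"
OFFER_RE = re.compile(
    r"offered (\d+\.\d+\.\d+\.\d+) from (\d+\.\d+\.\d+\.\d+)"
)


def self_sourced_offers(log_text):
    """Return (offered_ip, from_ip) pairs where both addresses are equal.

    Whitespace is collapsed first so that log lines wrapped by a mail
    client still match.
    """
    flat = " ".join(log_text.split())
    return [(ip, src) for ip, src in OFFER_RE.findall(flat) if ip == src]


SAMPLE_LOG = """\
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: broadcasting for a lease
Oct 7 11:50:53 thinkbig16 dhcpcd[21865]: eth0: offered 10.0.1.136 from
10.0.1.136 `boothost'
Oct 7 11:50:59 thinkbig16 dhcpcd[21865]: eth0: leased 10.0.1.136 for
43200 seconds
"""

if __name__ == "__main__":
    for ip, src in self_sourced_offers(SAMPLE_LOG):
        print(f"offer of {ip} is reported as coming from {src}")
```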
The tcpdump file is available at:
http://wiki.neuralbs.com/~kyron/dhcpout_tb16 sorry for the net chatter
but you get it all there ;)
What does your network topology look like? Are you using DHCP relays
and/or a not-fully-routed IP network? It's possible for hosts to be able
to get a network, but not renew it under some circumstances.
Ok at the risk of being flamed about network topology, I have a hacked
up setup but it shouldn't impact DHCP IMHO (I'm emulating NIC bonding
through IP masking).
Here is the setup:
http://wiki.neuralbs.com/~kyron/HyperTransport/ClusterNetDiagram.png
Here is the config file (sourced by dnsmasq.conf)
eric@headless ~/1_Files/1_ETS/1_Maitrise/Code/pvq $ cat /etc/dnsmasq.AthlonXP_All.conf
# Force the broadcast address to 10.0.1.255
dhcp-option=28,10.0.1.255
#The group name,address range and lease time:
dhcp-range=AthlonXP_1,10.0.1.10,10.0.1.126,255.255.255.0,12h
# The GROUP_NAME's,option 3:default Gateway (if you really need this,
# nodes shouldn't require routed access):
dhcp-option=AthlonXP,3,10.0.1.1
# The GROUP_NAME's, option 42:time server address:
dhcp-option=AthlonXP_1,42,10.0.1.1
# This is required for PXE booting
dhcp-boot=net:AthlonXP_1,/pxelinux.0,boothost,10.0.1.1
# As can be seen in the dnsmasq.conf, this option is not guaranteed to
# work (DN search order)
dhcp-option=AthlonXP_1,119,cluster.local
# Domain DNS name
dhcp-option=15,cluster.local
# NIS domain
dhcp-option=40,cluster.local
# Now for the host listing, format is:
# dhcp-host=MACADDRESS,net:GROUP_NAME,NODE_NAME,IP_ADDRESS
dhcp-host=00:01:03:df:ca:44,net:AthlonXP_1,thinkbig1,10.0.1.11
dhcp-host=00:01:03:DF:C6:38,net:AthlonXP_1,thinkbig2,10.0.1.12
dhcp-host=00:01:03:DF:D3:30,net:AthlonXP_1,thinkbig3,10.0.1.13
dhcp-host=00:01:03:DF:D3:08,net:AthlonXP_1,thinkbig4,10.0.1.14
dhcp-host=00:01:03:DF:D3:01,net:AthlonXP_1,thinkbig5,10.0.1.15
dhcp-host=00:01:03:DF:CA:3B,net:AthlonXP_1,thinkbig6,10.0.1.16
dhcp-host=00:04:75:C2:21:14,net:AthlonXP_1,thinkbig7,10.0.1.17
dhcp-host=00:01:03:DF:CA:46,net:AthlonXP_1,thinkbig8,10.0.1.18
dhcp-host=00:01:03:de:5f:2e,net:AthlonXP_1,thinkbig9,10.0.1.19
#
# The group name,address range and lease time:
dhcp-range=AthlonXP_2,10.0.1.130,10.0.1.254,255.255.255.0,12h
# The GROUP_NAME's,option 3:default Gateway (if you really need this,
# nodes shouldn't require routed access):
dhcp-option=AthlonXP_2,3,10.0.1.129
# The GROUP_NAME's, option 42:time server address:
dhcp-option=AthlonXP_2,42,10.0.1.129
# This is required for PXE booting
dhcp-boot=net:AthlonXP_2,/pxelinux.0,boothost,10.0.1.129
# As can be seen in the dnsmasq.conf, this option is not guaranteed to
# work (DN search order)
#dhcp-option=AthlonXP_2,119,cluster.local
# Now for the host listing, format is:
# dhcp-host=MACADDRESS,net:GROUP_NAME,NODE_NAME,IP_ADDRESS
dhcp-host=00:01:03:DE:B5:C3,net:AthlonXP_1,thinkbig10,10.0.1.130
dhcp-host=00:01:03:DE:B6:AE,net:AthlonXP_1,thinkbig11,10.0.1.131
dhcp-host=00:04:75:AA:36:5B,net:AthlonXP_1,thinkbig12,10.0.1.132
dhcp-host=00:01:03:DF:CA:42,net:AthlonXP_2,thinkbig13,10.0.1.133
dhcp-host=00:01:03:DE:B5:C2,net:AthlonXP_2,thinkbig14,10.0.1.134
dhcp-host=00:01:03:24:E9:3B,net:AthlonXP_2,thinkbig15,10.0.1.135
dhcp-host=00:04:75:EC:33:47,net:AthlonXP_2,thinkbig16,10.0.1.136
dhcp-host=00:04:75:EC:4E:F2,net:AthlonXP_2,thinkbig17,10.0.1.137
#dhcp-host=00:04:75:EC:4E:CF,net:AthlonXP_2,thinkbig18,10.0.1.138
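
A config like this is easy to sanity-check mechanically. Below is a rough Python sketch that reads dhcp-range and dhcp-host lines in exactly the format used above (tag first in dhcp-range, net:TAG second in dhcp-host) and lists hosts whose fixed address lies outside the range declared for their tag. It is a consistency heuristic only: the function names are made up, the sample is a three-host excerpt, and the sketch does not claim that dnsmasq itself requires static addresses to sit inside the dynamic range.

```python
import ipaddress


def parse_config(text):
    """Collect {tag: (first_ip, last_ip)} and [(name, tag, ip)] from the text."""
    ranges, hosts = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not key=value
        key, value = line.split("=", 1)
        fields = value.split(",")
        if key == "dhcp-range" and len(fields) >= 3:
            tag, first, last = fields[0], fields[1], fields[2]
            ranges[tag] = (ipaddress.ip_address(first),
                           ipaddress.ip_address(last))
        elif key == "dhcp-host" and len(fields) == 4:
            _mac, tag, name, ip = fields
            if tag.startswith("net:"):
                tag = tag[len("net:"):]
            hosts.append((name, tag, ipaddress.ip_address(ip)))
    return ranges, hosts


def hosts_outside_range(text):
    """Return (name, tag, ip) for hosts not covered by their tag's range."""
    ranges, hosts = parse_config(text)
    flagged = []
    for name, tag, ip in hosts:
        bounds = ranges.get(tag)
        if bounds is None or not (bounds[0] <= ip <= bounds[1]):
            flagged.append((name, tag, str(ip)))
    return flagged


# Excerpt of the config above, used as sample input
SAMPLE_CONFIG = """\
dhcp-range=AthlonXP_1,10.0.1.10,10.0.1.126,255.255.255.0,12h
dhcp-host=00:01:03:de:5f:2e,net:AthlonXP_1,thinkbig9,10.0.1.19
dhcp-range=AthlonXP_2,10.0.1.130,10.0.1.254,255.255.255.0,12h
dhcp-host=00:04:75:AA:36:5B,net:AthlonXP_1,thinkbig12,10.0.1.132
dhcp-host=00:01:03:DF:CA:42,net:AthlonXP_2,thinkbig13,10.0.1.133
"""

if __name__ == "__main__":
    for name, tag, ip in hosts_outside_range(SAMPLE_CONFIG):
        print(f"{name}: {ip} is outside the range declared for {tag}")
```

On the excerpt this points at thinkbig12, which carries the AthlonXP_1 tag but has an address in the AthlonXP_2 range; thinkbig10 and thinkbig11 follow the same pattern in the full file. Whether that is intentional is for the config's author to say.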