Title: Re: [Oscar-devel] New Problem - DHCP not found on client nodes
Hey David:
 
So let's say you have 1 headnode and 2 compute nodes (node1, and node2).
 
Can headnode ping either node1 or node2?  Can you show the output of ifconfig on all three computers?  Do the compute nodes have more than one network card on board?  I wonder if the systems got confused between eth0 and eth1...
 
Cheers,
 
Bernard

From: David Isaacson [mailto:[EMAIL PROTECTED]
Sent: Fri 08/07/2005 5:55 PM
To: Bernard Li
Subject: Re: [Oscar-devel] New Problem - DHCP not found on client nodes

I used to have SL3 on these machines without any network 
problems...maybe that doesn't mean anything though.

By the way, I did get the system successfully installed on 2 of the
newer nodes, but the nodes and host couldn't ping each other, mount
NFS, ssh, etc.  I got "Destination Host Unreachable".  All 3 are
plugged straight into the same private switch, though.

David

On Jul 8, 2005, at 5:28 PM, Bernard Li wrote:

> Another quick way to test is just boot off the first CD of SL3 and see
> if the network adapter is detected.  lspci and friends should work.
>
> Cheers,
>
> Bernard
>
>
>> -----Original Message-----
>> From: Lombard, David N [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, July 08, 2005 17:10
>> To: David Isaacson
>> Cc: Bernard Li; [email protected]
>> Subject: RE: [Oscar-devel] New Problem - DHCP not found on
>> client nodes
>>
>> From: David Isaacson on Friday, July 08, 2005 4:12 PM
>>
>>>
>>>
>>>> Questions, I have questions:
>>>>
>>>> Have you tried another client node?
>>>>
>>>>
>>>>
>>> I had tried another node with the exact same hardware with the same
>>> results.  Now, however, I decided to try one of our nodes, which have
>>> similar but not exactly the same hardware as the old ones.  It looks
>>> like it's working on this one.  At the very least it got past the
>>> point where it hung before.  Of course I want to use the old nodes
>>> too, so the problem doesn't just go away :(
>>>
>>
>> OK, so that would again appear to implicate the client.  The node that
>> did work should either reboot or beep, depending on the config in Step 4
>> "Build OSCAR Client Image".  If you set it to beep, just reset it, and
>> it should boot from local disk (assuming you have the right boot order).
>> The best BIOSes have a "boot once" mode, so you just do a network boot
>> when you want to reinstall, and a normal boot from disk otherwise
>> (there's also a method where you can use per-node PXE config files to
>> either do a LOCALBOOT or network boot).
>>
>>
>>>> Do you see any messages about the NIC driver loading?
>>>>
>>>>
>>>>
>>> I did see messages about the NIC driver loading....
>>> I think.  Now I can't seem to see any.  But they all go by pretty
>>> fast, so I really can't be sure.
>>>
>>
>> Back to the failing node...
>>
>> After the "Please contribute" lines, you should see
>>
>>   Listening on LPF/eth0/<MAC-address-as-octets>
>>   Sending on   LPF/eth0/<MAC-address-as-octets>
>>   Listening on LPF/lo/<null>
>>   Sending on   LPF/lo/<null>
>>   DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
>>   DHCPDISCOVER on lo to 255.255.255.255 port 67 interval 6
>>
>> If you don't have the three "eth0" lines, you don't have a NIC (well,
>> you don't have a functioning driver).
>>
>> After the above, you should then have
>>
>>   DHCPOFFER from 192.168.x.y
>>   DHCPREQUEST on eth0 to 255.255.255.255 port 67
>>   DHCPACK from 192.168.x.y
>>
>> These represent the *second* OFFER/REQUEST/ACK set that doesn't appear
>> to be present in /var/log/messages on the server.
>>
>>
>>>> What board do you have?
>>>>
>>>>
>>>>
>>> The board is a Tyan S2720.
>>>
>>
>> Hmmm.  I did find this
>>
>> <https://www.redhat.com/archives/redhat-list/2002-July/msg00832.html>
>>
>> Can you see which driver, e100 or e1000, is initializing?
>>
>> You can <Ctrl-S> and <Ctrl-Q> the client to stop/start the display, and
>> <Ctrl-C> to interrupt the process.  Once you do, you could
>>
>>     cat /var/state/dhcp/dhclient.leases
>>
>> to see if anything is present, but I'd guess not, and
>>
>>     ifconfig
>>
>> to see what sort of errors you may be hitting.
>>
>>     cat /proc/bus/pci/devices
>>
>> should dump out device info.  Sadly, lspci isn't on the initrd :-(  At
>> any rate, you should see the NIC if a driver has claimed it.  The first
>> number (4 digit hex) lists the bus, slot, and function number of each
>> device in a packed format; the second number (8 digit hex) will start
>> with "8086" for an Intel device.  Along with the various bridges, etc.,
>> you should see the NIC (with the driver that claimed it)--what is the
>> second set of 4 digits?
>>
>> --
>> dnl
>>
>> My comments represent my opinions, not those of Intel Corporation.
>>
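The dhclient console check described in the quoted message is mechanical enough to sketch in a few lines of Python.  This is only an illustration, assuming the console text has already been captured somehow (say, copied from a serial console).  The names dhcp_progress and diagnose are made up here; they are not part of OSCAR or dhclient.

```python
import re


def dhcp_progress(console_text, iface="eth0"):
    """Report which of the expected dhclient console lines were seen."""
    lines = console_text.splitlines()
    nic = re.escape(iface)

    def seen(pattern):
        return any(re.search(pattern, line) for line in lines)

    return {
        # The three per-interface lines printed before any server reply.
        "listening": seen(r"Listening on\s+LPF/" + nic + "/"),
        "sending": seen(r"Sending on\s+LPF/" + nic + "/"),
        "discover": seen(r"DHCPDISCOVER on " + nic + r"\b"),
        # The exchange that should follow once the server answers.
        "offer": seen(r"DHCPOFFER from "),
        "request": seen(r"DHCPREQUEST on " + nic + r"\b"),
        "ack": seen(r"DHCPACK from "),
    }


def diagnose(progress):
    """Map the flags to the two failure cases discussed in the thread."""
    if not (progress["listening"] and progress["sending"] and progress["discover"]):
        return "no-nic"    # interface lines missing: no functioning driver
    if not (progress["offer"] and progress["request"] and progress["ack"]):
        return "no-lease"  # driver is up, but the DHCP exchange did not finish
    return "ok"


if __name__ == "__main__":
    # Hypothetical capture in which only the loopback lines appear.
    capture = (
        "Listening on LPF/lo/<null>\n"
        "Sending on   LPF/lo/<null>\n"
        "DHCPDISCOVER on lo to 255.255.255.255 port 67 interval 6\n"
    )
    print(diagnose(dhcp_progress(capture)))
```

A "no-nic" result corresponds to the missing "eth0" lines case; "no-lease" would point back at the server or the switch rather than the client driver.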
>
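For the /proc/bus/pci/devices step in the quoted message, here is a rough Python sketch of unpacking the two hex numbers.  It assumes the line layout I know from Linux kernels: tab-separated fields, the first being bus and devfn packed into 4 hex digits, the second vendor and device ID packed into 8 hex digits, and the driver name (if any) as the 18th field.  The function name and the sample line are made up for illustration and do not come from David's machine.

```python
def parse_pci_device_line(line):
    """Decode one line of /proc/bus/pci/devices (layout assumed, see above)."""
    # Strip only the newline so an empty trailing driver field is preserved.
    fields = line.rstrip("\n").split("\t")
    bus_devfn = int(fields[0], 16)      # 4 hex digits: bus (high byte), devfn (low byte)
    vendor_device = int(fields[1], 16)  # 8 hex digits: vendor (high 16 bits), device (low 16 bits)
    devfn = bus_devfn & 0xFF
    return {
        "bus": bus_devfn >> 8,
        "slot": devfn >> 3,        # upper 5 bits of devfn
        "function": devfn & 0x7,   # lower 3 bits of devfn
        "vendor": vendor_device >> 16,
        "device": vendor_device & 0xFFFF,
        # Assumption: the driver name follows 17 numeric fields.
        "driver": fields[17] if len(fields) > 17 else "",
    }


if __name__ == "__main__":
    # Made-up sample line: 17 numeric fields followed by a driver name.
    sample = "\t".join(["0208", "80861229", "b"] + ["0"] * 14 + ["e100"])
    d = parse_pci_device_line(sample)
    print("%02x:%02x.%d vendor=%04x device=%04x driver=%s" % (
        d["bus"], d["slot"], d["function"],
        d["vendor"], d["device"], d["driver"] or "(none)"))
```

The "second set of 4 digits" asked about above is the value this returns as "device"; an empty "driver" would mean no driver has claimed that device.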
