Ok, I found the issue, and it goes back to the problem I was having with my hosts file, where the cluster network line was missing.
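In case it helps anyone who hits the same thing, a check for this condition could look roughly like the sketch below. It is only an illustration, not part of OSCAR: the function names are made up, the 172.16.0.0/24 netmask is an assumption (this message only shows the addresses 172.16.0.1 and 172.16.0.2), and the sample text is a trimmed copy of the hosts entries from this message. It parses hosts-file text and reports whether the headnode name appears on an address inside the cluster network.

```python
import ipaddress


def parse_hosts(text):
    """Return a list of (address, [names]) tuples from hosts-file text."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop '#' comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an address with no names is not a usable entry
        entries.append((fields[0], fields[1:]))
    return entries


def headnode_on_cluster_net(text, headnode, cluster_net):
    """True if `headnode` is listed on an address inside `cluster_net`."""
    net = ipaddress.ip_network(cluster_net)
    for addr, names in parse_hosts(text):
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # skip anything that is not a valid IP address
        if ip in net and headnode in names:
            return True
    return False


# Trimmed sample: headnode only listed on the local network (the broken state).
BROKEN = """\
127.0.0.1 localhost.localdomain localhost
192.168.1.7 titus.androticus titus
192.168.1.13 janus.androticus janus
172.16.0.2 node02.androticus node02
"""

# Trimmed sample: headnode line on the cluster network restored.
FIXED = """\
127.0.0.1 localhost.localdomain localhost
172.16.0.1 janus.androticus janus oscar_server nfs_oscar pbs_oscar
192.168.1.7 titus.androticus titus
172.16.0.2 node02.androticus node02
"""

CLUSTER_NET = "172.16.0.0/24"  # assumed netmask, not stated in this thread

if __name__ == "__main__":
    for label, text in (("broken", BROKEN), ("fixed", FIXED)):
        print(label, headnode_on_cluster_net(text, "janus", CLUSTER_NET))
```

To run it against a real machine you would read the file contents (typically /etc/hosts) and pass them in as `text`; which alias to check for (janus, oscar_server, and so on) depends on your own setup.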
Here is an outline of my hosts (// indicates comments I've added below, not in the file):

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.1.7 titus.androticus titus
//rest of local computers
192.168.1.13 janus.androticus janus // my headnode on the local network
# These entries are managed by SIS, please don't modify them.
172.16.0.2 node02.androticus node02
// rest of nodes

There is a key line missing that had been in the original file at some time: the headnode line on the cluster network, along with additional aliases (some of which I now forget):

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
172.16.0.1 janus.androticus janus oscar_server nfs_oscar pbs_oscar // THIS LINE WAS MISSING!!!!
192.168.1.7 titus.androticus titus
//etc

If that cluster network line goes missing, evil things happen!!!!!!

I think the KDE network tool may screw that up (not 100% sure though.)

Brad Aisa
baisa at brad-aisa dot com

----- Original Message ----
From: Bernard Li <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; oscar devel <[email protected]>
Sent: Wednesday, July 12, 2006 12:41:21 PM
Subject: RE: can't load image anymore after pxe boot

Hi Brad:

What does "rpm -qa | grep tftp" show? And can you check to see if it is working properly? Not sure about tftp logs, they might be in /var/log/messages.

And I do not think pfilter will affect tftp, though you mentioned that you have already disabled it.

You could probably try to use etherboot to test your tftp:

http://rom-o-matic.net/5.4.2/

Just create a boot floppy ROM disk for your network card and see if it goes. If it does, and plain network booting doesn't work, then there's something wrong with your nic...

Cheers,

Bernard

P.S. In regards to swapping nic cards, as long as OSCAR now has the MAC address of your _new_ nic, then it should work fine.

> -----Original Message-----
> From: Brad Aisa [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 12, 2006 11:26
> To: oscar devel
> Cc: Bernard Li
> Subject: RE: can't load image anymore after pxe boot
>
> Yikes! Could it relate to having changed my nic card? The
> previous one was flaky, and I swapped it out for a new one of
> the same model. That actually caused a number of issues (as I
> had mentioned in a previous post) which I thought I'd finally
> worked through. I don't have a third, and am loath to put that
> first one back in, because it seemed to flake out after a while,
> and I also worry about all the same network issues.
>
> Are there any other possible reasons this could be happening?
>
> Couldn't it be related to the tftp server??? It is my
> understanding that it is a different process for the client to
> get the pxe boot image vs. then fetching the actual kernel via
> tftp. Is there a utility I could try from a node to test the
> tftp? Is there a config file or log file for that?
>
> Could this be possibly related to turning off pfilter???
>
> Thanks so much for any help! Only 2 more nodes to go!
>
> --- Bernard Li <[EMAIL PROTECTED]> wrote:
>
> > It is quite possible that this is a hardware issue - do you
> > have another nic you could try?
> >
> > Everything else looks fine from here.
> >
>
> Brad Aisa
> baisa at brad-aisa dot com
>

_______________________________________________
Oscar-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
