On Wed, 8 Jan 2003, Edmund Bertschinger wrote:
> I wrote yesterday that Oscar2.1 Step 7 resulted in a hang with a request
> for password on oscarnode1. This occurs in scripts/post_install where
> an ssh command is issued to each node, in order to find out how many
> processors each node has.
>
> So, I tried ssh oscarnode1.oscardomain in a root window and, after a
> long delay (a minute or so), got a request for the root password, just
> like I reported yesterday.

I think we have seen this long ssh delay before. It may only be a symptom
and not the real cause, but it's still worth fixing. IIRC, it has to do
with faulty /etc/resolv.conf files on the nodes. I see that this somehow
never made it into the FAQ, so we'll have to add it there once we figure
this out again / remember how we fixed it before.

I don't remember the exact issue -- I think it was one of the following:

1) /etc/resolv.conf on the nodes pointed to DNS servers outside of the
   OSCAR cluster, and the head node (pfilter) was not allowing the DNS
   traffic out
2) /etc/resolv.conf on the nodes pointed to non-existent DNS servers
3) /etc/resolv.conf on the nodes was empty
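
To tell cases 1 and 3 apart on a given node, something like the following
might help. It is only a sketch: the function names and the example cluster
network are my own assumptions, not anything OSCAR ships. It cannot detect
case 2 (a listed server that does not exist), since that needs a live query
against the server.

```python
import ipaddress


def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments (resolv.conf uses '#' or ';').
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def diagnose(resolv_conf_text, cluster_network):
    """List suspected problems; an empty list means nothing obvious.

    cluster_network is the cluster's private network in CIDR form,
    e.g. "10.0.0.0/24" (a hypothetical value; use your own).
    """
    servers = parse_nameservers(resolv_conf_text)
    if not servers:
        return ["empty: no nameserver entries"]
    network = ipaddress.ip_network(cluster_network)
    problems = []
    for server in servers:
        try:
            inside = ipaddress.ip_address(server) in network
        except ValueError:
            problems.append("not an IP address: " + server)
            continue
        if not inside:
            problems.append("outside the cluster network: " + server)
    return problems
```

On a node you would feed it the contents of /etc/resolv.conf, e.g.
diagnose(open("/etc/resolv.conf").read(), "10.0.0.0/24").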

The reason you can't ssh to the nodes properly is that the root SSH keys
somehow do not agree. That is, the client node is supposed to accept the
keys from the head node and allow passwordless logins, but those keys did
not propagate out to the nodes properly.

Are there any errors indicated in your oscarinstall.log? (sorry -- you
may have mentioned this before; I'm kinda jumping in the middle of the
conversation)

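One way to check the key mismatch described above is to compare the head
node's root public key against the authorized_keys text on a node. The
sketch below assumes OpenSSH protocol-2 style key lines ("ssh-rsa <base64>
comment", optionally preceded by options); the function names are
hypothetical, and it does not handle options that themselves contain
spaces.

```python
def key_fields(line):
    """Return (key_type, base64_blob) from a public-key line, or None.

    Assumes an OpenSSH protocol-2 style line. Leading options such as
    from="..." are skipped by scanning for the key-type token.
    """
    parts = line.strip().split()
    # The key type must be followed by the blob, so stop one token early.
    for i, token in enumerate(parts[:-1]):
        if token.startswith("ssh-") or token.startswith("ecdsa-"):
            return token, parts[i + 1]
    return None


def is_authorized(public_key_line, authorized_keys_text):
    """True if the key's type and blob appear in the authorized_keys text."""
    wanted = key_fields(public_key_line)
    if wanted is None:
        return False
    for line in authorized_keys_text.splitlines():
        if key_fields(line) == wanted:
            return True
    return False
```

Which files to compare depends on your setup; typically that would be
something like root's id_rsa.pub or id_dsa.pub on the head node versus
root's authorized_keys on the client, but treat those names as assumptions.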
> I tried pinging the client nodes and all pings failed! This despite
> the fact that I had just successfully installed all the client nodes
> over the same network.

This is quite odd. I *believe* that pfilter is only installed on the head
node, and if you're on the head node, you should be able to ping all the
client nodes. Neil?

Regardless, if you can telnet to port 22 on all the nodes (from the head
node), then they're all reachable and all fine, no matter what ping says.

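If you'd rather not telnet to every node by hand, the same port-22 check
can be scripted. This is a minimal sketch; the function names are mine, and
the 3-second timeout is an arbitrary choice.

```python
import socket


def port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-lookup failures.
        return False


def unreachable_nodes(hosts, port=22, timeout=3.0):
    """Return the hosts that did not accept a TCP connection on port."""
    return [h for h in hosts if not port_open(h, port, timeout)]
```

From the head node you would call, for example,
unreachable_nodes(["oscarnode1", "oscarnode2"]); only oscarnode1 comes from
your report, so substitute your real node names. An empty list means sshd
answered on every node. Note that a name-lookup failure also counts as
unreachable here, so a broken resolv.conf on the head node would show up
the same way.
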
--
{+} Jeff Squyres
{+} [EMAIL PROTECTED]
{+} http://www.lam-mpi.org/
_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users