Bugs item #844704, was opened at 2003-11-18 15:29
Message generated for change (Comment added) made by brechin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=844704&group_id=9368
Category: Packages
Group: 2.4
Status: Open
Resolution: None
Priority: 8
Submitted By: Jason Brechin (brechin)
Assigned to: John (muglerj)
Summary: Install package fails
Initial Comment:
=============================================================================
== Running step 2 of the OSCAR wizard: Configure selected OSCAR packages
=============================================================================
--> About to run /opt/oscar/packages/kernel_picker/scripts/pre_configure for kernel_picker
warning: /tftpboot/rpm/redhat-release-9-3.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
[OSCAR::PackageBest :: Line 407] Reading package directory
[OSCAR::PackageBest :: Line 419] Reading cache file.
[OSCAR::PackageBest :: Line 432] Comparing cache to directory.
[OSCAR::PackageBest :: Line 457] Writing new cache file.
62750 blocks
--> About to run /opt/oscar/packages/switcher/scripts/pre_configure for switcher
--> About to run /opt/oscar/packages/switcher/scripts/post_configure for switcher
Setting default for tag mpi ("lam-7.0")
Attribute successfully set; new attribute setting will be effective for future shells
--> Step 2: Completed successfully
executing:/opt/c3-4/cexec --pipe c3cmd-filter hostname
oscar_cluster oscarnode1.ncsa.uiuc.edu: Warning: No xauth data; using fake authentication data for X11 forwarding.
Warning: sanity check failed.
----------------------------------------------------------------------
>Comment By: Jason Brechin (brechin)
Date: 2003-11-20 11:31
Message:
Logged In: YES
user_id=274641
Well... this may be a doc fix after all. It seems that the
xauth errors were due to me trying to do the install remotely,
manually setting DISPLAY (instead of just tunneling X). If I
tunnel X, then it continues as expected.
Note that everything worked in this situation before.
Also note that the sanity check will also fail due to the
switcher errors, like "switcher:mpi: Cannot find modulefile for
lam-7.0 -- skipping".
Neither of these situations really indicates a problem with the
cluster.
As for a good method of checking whether nodes are alive,
why not do something like what I suggested - check whether
`cexec --pipe hostname | grep -c 'oscar_cluster'` is equal to
the number of clients. That not only checks to see whether
commands can work, but whether or not ALL the nodes are
up. This doesn't solve everything since you'd still be using
c3cmd-filter to see if the other commands you run succeed,
but it would make for a good sanity check.
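A minimal sketch of the node-count check suggested above, written against already-captured `cexec --pipe hostname` output rather than a live cluster. The function names, the sample text, and the second node (oscarnode2) are hypothetical; the "cluster node: output" line shape is assumed from the log in the initial comment.

```python
def count_responding_nodes(cexec_output, cluster_name="oscar_cluster"):
    """Count output lines that mention the cluster name (like `grep -c`)."""
    return sum(1 for line in cexec_output.splitlines() if cluster_name in line)


def all_nodes_alive(cexec_output, expected_clients, cluster_name="oscar_cluster"):
    """True only if every expected client produced a line of output."""
    return count_responding_nodes(cexec_output, cluster_name) == expected_clients


# Hypothetical captured output for a two-node cluster.
sample = (
    "oscar_cluster oscarnode1.ncsa.uiuc.edu: oscarnode1.ncsa.uiuc.edu\n"
    "oscar_cluster oscarnode2.ncsa.uiuc.edu: oscarnode2.ncsa.uiuc.edu\n"
)

print(all_nodes_alive(sample, expected_clients=2))
```

Like the `grep -c` version, this only counts lines, so a node that answers with a warning still counts as alive; it does not say whether the command itself succeeded.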
----------------------------------------------------------------------
Comment By: Thomas Naughton (naughtont)
Date: 2003-11-19 16:49
Message:
Logged In: YES
user_id=288102
Yes, indeed, this warning causes problems with the heuristic
used to try to detect an error with remote cexec commands.
The problem in PackageInUn.pm actually comes from the way
eval_c3cmd_filter() determines an "error". Output is
expected to have nothing on the right-hand side (RHS) of the
colon ":" when getting results from the c3cmd-filter.
The cases where "Warning:" is displayed on this RHS are false
errors. I don't know whether it's worth patching with a
check to ignore certain results, e.g.,
next if( $output =~ /^Warning:/ );
The standard cases, however, are caught, and the actual problem
is not in c3cmd-filter; it's in the usage and assumptions
(heuristics) when using it.
I agree that the long-term fix is to have better error
reporting from C3, and that's something we (ORNL) are going
to have to address. But in the meantime, what do folks
suggest? This approach seemed like a reasonable workaround;
maybe it is not?
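The heuristic described above can be sketched roughly as follows. This is not the code in PackageInUn.pm, only an illustration of the stated rule: anything to the right of the first colon counts as an error, with an optional exception for text that starts with "Warning:". The function name, the sample output, and the nodes other than oscarnode1 are hypothetical.

```python
import re

# Mirrors the /^Warning:/ check floated above.
IGNORABLE = re.compile(r"^Warning:")


def find_errors(filter_output, ignore_warnings=True):
    """Return the lines whose right-hand side (after the first colon) is non-empty."""
    errors = []
    for line in filter_output.splitlines():
        _prefix, sep, rhs = line.partition(":")
        if not sep:
            continue  # assumption: lines without a colon are not node results
        rhs = rhs.strip()
        if not rhs:
            continue  # empty RHS: treated as success
        if ignore_warnings and IGNORABLE.match(rhs):
            continue  # false error, such as the xauth warning
        errors.append(line)
    return errors


# Hypothetical filter output: one warning, one clean node, one real failure.
sample_output = (
    "oscar_cluster oscarnode1.ncsa.uiuc.edu: Warning: No xauth data; "
    "using fake authentication data for X11 forwarding.\n"
    "oscar_cluster oscarnode2.ncsa.uiuc.edu:\n"
    "oscar_cluster oscarnode3.ncsa.uiuc.edu: bash: foo: command not found\n"
)

for line in find_errors(sample_output):
    print(line)
```

With `ignore_warnings=False` the xauth line is reported as well, which is the behavior that made the sanity check fail in this report.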
----------------------------------------------------------------------
Comment By: John (muglerj)
Date: 2003-11-19 16:40
Message:
Logged In: YES
user_id=505737
After looking at this for a while, I'm concluding that there
is no bug here. Jason hit a case where his ssh is not
configured properly and spits out warning messages. The sanity
check takes these warning messages and does the right thing:
it fails. Once ssh is properly configured on his test
system, the sanity check should succeed.
I recommend that this bug be either removed or downgraded
unless there are further comments.
NOTES:
1. I looked into possibly coding the sanity check another
way, looking for positive "node alive" messages from a
cexec command. This turns out to be more difficult than it
first seems and may not be doable at all, unless I'm missing
something.
2. It may be possible to catch specific ssh warning messages
and ignore them. This might not be such a good idea either,
and I think our best bet is to fail gracefully as we are doing.
----------------------------------------------------------------------
Comment By: Jason Brechin (brechin)
Date: 2003-11-19 09:42
Message:
Logged In: YES
user_id=274641
Yep... it returns successfully...
[EMAIL PROTECTED] oscar]# ssh oscarnode1 hostname
Warning: No xauth data; using fake authentication data for X11 forwarding.
oscarnode1.ncsa.uiuc.edu
[EMAIL PROTECTED] oscar]# echo $?
0
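The transcript above shows that the exit status and the warning text are independent signals. A small sketch of checking them separately, using a stand-in child process instead of a real `ssh oscarnode1 hostname` call so that it runs anywhere; it assumes the warning arrives on stderr while the command output arrives on stdout.

```python
import subprocess
import sys

# Stand-in for `ssh oscarnode1 hostname`: prints a warning on stderr,
# the hostname on stdout, and exits with status 0.
stand_in = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Warning: No xauth data\\n'); "
    "print('oscarnode1.ncsa.uiuc.edu')",
]

result = subprocess.run(stand_in, capture_output=True, text=True)

print("exit status:", result.returncode)
print("stdout:", result.stdout.strip())
print("stderr:", result.stderr.strip())
```

A check based on the exit status alone would pass here, whereas a check that treats any extra text as failure would not.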
----------------------------------------------------------------------
Comment By: Benoit des Ligneris (bligneri)
Date: 2003-11-19 09:05
Message:
Logged In: YES
user_id=179120
This SSH behavior is caused by the fact that the SSH key for
host oscarnode1.ncsa.uiuc.edu has changed and is not the
same as the one in ~/.ssh/known_hosts.
This can happen because you added/deleted a node?
Anyway, we should remove all the keys when we remove a node
so that this cannot happen (I guess some grepping and
sedding of all the /home/*/.ssh/known_hosts files should do
the trick?).
The same problem occurs if the user already has a known_hosts
file that conflicts with the real SSH key of the host.
Anyway, to reproduce this, simply alter the host key in
.ssh/known_hosts
However, at some point,
You can reproduce this ssh behavior
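A rough sketch of the "grepping and sedding" cleanup suggested above, operating on the text of a known_hosts file. It assumes plain (unhashed) entries whose first field is a comma-separated host list; the function name, the sample IP address, and the truncated sample keys are hypothetical. Current OpenSSH releases also provide `ssh-keygen -R hostname` for the same purpose.

```python
def drop_host_entries(known_hosts_text, hostname):
    """Return known_hosts text without the entries that list `hostname`."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # Comments and blank lines are kept untouched.
        if fields and not line.startswith("#"):
            hosts = fields[0].split(",")
        else:
            hosts = []
        if hostname in hosts:
            continue  # stale entry for the removed node
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical file contents; the keys are placeholders, not real keys.
sample = (
    "oscarnode1.ncsa.uiuc.edu,10.0.0.1 ssh-rsa AAAAB3placeholder1\n"
    "oscarnode2.ncsa.uiuc.edu ssh-rsa AAAAB3placeholder2\n"
)

print(drop_host_entries(sample, "oscarnode1.ncsa.uiuc.edu"), end="")
```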
----------------------------------------------------------------------
Comment By: John (muglerj)
Date: 2003-11-18 23:21
Message:
Logged In: YES
user_id=505737
Well, I do see a warning message:
Warning: No xauth data; using fake authentication data for X11 forwarding.
Does ssh spit back a success return code with this warning?
If it spits back something other than success, the sanity
check is designed to fail.
I cannot seem to reproduce this, although I've seen it
before with ssh. I tried zapping my .Xauthority file but had
no luck with that.
----------------------------------------------------------------------
_______________________________________________
Oscar-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-devel