Bugs item #844704, was opened at 2003-11-18 15:29
Message generated for change (Comment added) made by brechin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=109368&aid=844704&group_id=9368

Category: Packages
Group: 2.4
Status: Open
Resolution: None
Priority: 8
Submitted By: Jason Brechin (brechin)
Assigned to: John (muglerj)
Summary: Install package fails

Initial Comment:
=============================================================================
== Running step 2 of the OSCAR wizard: Configure selected OSCAR packages
=============================================================================

--> About to run /opt/oscar/packages/kernel_picker/scripts/pre_configure for kernel_picker
warning: /tftpboot/rpm/redhat-release-9-3.i386.rpm: V3 DSA signature: NOKEY, key ID db42a60e
 [OSCAR::PackageBest :: Line 407] Reading package directory
 [OSCAR::PackageBest :: Line 419] Reading cache file.
 [OSCAR::PackageBest :: Line 432] Comparing cache to directory.
 [OSCAR::PackageBest :: Line 457] Writing new cache file.
62750 blocks
--> About to run /opt/oscar/packages/switcher/scripts/pre_configure for switcher
--> About to run /opt/oscar/packages/switcher/scripts/post_configure for switcher
Setting default for tag mpi ("lam-7.0")
Attribute successfully set; new attribute setting will be effective for
future shells
--> Step 2: Completed successfully
executing:/opt/c3-4/cexec --pipe c3cmd-filter hostname
oscar_cluster oscarnode1.ncsa.uiuc.edu: Warning: No xauth data; using fake authentication data for X11 forwarding.
Warning: sanity check failed.


----------------------------------------------------------------------

>Comment By: Jason Brechin (brechin)
Date: 2003-11-20 11:31

Message:
Logged In: YES 
user_id=274641

Well... this may be a doc fix after all.  It seems that the 
xauth errors were due to me trying to do the install remotely, 
manually setting DISPLAY (instead of just tunneling X).  If I 
tunnel X, then it continues as expected.

Note that everything worked in this situation before.

Also note that the sanity check will also fail due to the 
switcher errors, like "switcher:mpi: Cannot find modulefile for 
lam-7.0 -- skipping".

Neither of these situations really indicates a problem with the 
cluster.

As for a good method of checking whether nodes are alive, 
why not do something like what I suggested - check whether 
`cexec --pipe hostname | grep -c 'oscar_cluster'` is equal to 
the number of clients.  That not only checks to see whether 
commands can work, but whether or not ALL the nodes are 
up.  This doesn't solve everything since you'd still be using 
c3cmd-filter to see if the other commands you run succeed, 
but it would make for a good sanity check.
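
A minimal sketch of that count-based check, in Python for illustration only. It assumes each responding node produces exactly one output line containing the cluster name, as in the `oscar_cluster oscarnode1.ncsa.uiuc.edu: ...` line from the initial comment. The function name and the second hostname in the sample are hypothetical.

```python
def all_nodes_responded(cexec_output: str, expected_clients: int,
                        cluster_name: str = "oscar_cluster") -> bool:
    """Return True when the number of output lines mentioning the cluster
    name equals the expected number of clients (the same idea as
    `grep -c 'oscar_cluster'`)."""
    matching = sum(1 for line in cexec_output.splitlines()
                   if cluster_name in line)
    return matching == expected_clients


# Hypothetical output for a two-node cluster; oscarnode2 is invented.
sample = (
    "oscar_cluster oscarnode1.ncsa.uiuc.edu: oscarnode1.ncsa.uiuc.edu\n"
    "oscar_cluster oscarnode2.ncsa.uiuc.edu: oscarnode2.ncsa.uiuc.edu\n"
)
print(all_nodes_responded(sample, 2))
print(all_nodes_responded(sample, 3))
```

If a node can emit more than one prefixed line (for example a warning plus the hostname), the count would overshoot, so counting distinct node names might be the safer variant.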

----------------------------------------------------------------------

Comment By: Thomas Naughton (naughtont)
Date: 2003-11-19 16:49

Message:
Logged In: YES 
user_id=288102

Yes, indeed this warning causes problems with the heuristic
used to try to detect an error with remote cexec commands.

The problem in PackageInUn.pm actually comes from the way
eval_c3cmd_filter() determines an "error".  Output is
expected to have nothing on the right hand side (RHS) of the
colon ":" when getting results from the c3cmd-filter.

Cases where a "Warning:" is displayed on this RHS are false
errors.  I don't know whether it's worth patching with a
check to ignore certain results, e.g.,
next if( $output =~ /^Warning:/ );

The standard cases, however, are caught, and the actual problem
is not in c3cmd-filter; it's in the usage and assumptions
(heuristics) when using it.

I agree that the long-term fix is to have better error
reporting from C3 and that's something we (ORNL) are going
to have to address.  But in the meantime, what do folks
suggest?  This approach seemed like a reasonable
workaround; maybe it is not?
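
To make the heuristic concrete, here is a rough sketch in Python. It is not the actual eval_c3cmd_filter() code from PackageInUn.pm; it only illustrates the rule described above (anything on the RHS of the first colon counts as an error) plus the proposed "ignore Warning: lines" patch. The function name, the parsing on the first colon, and the sample lines are all assumptions.

```python
def remote_errors(filter_output, ignore_prefixes=("Warning:",)):
    """Return (node, message) pairs for lines whose right-hand side is
    non-empty and does not start with an ignored prefix."""
    errors = []
    for line in filter_output.splitlines():
        node, sep, message = line.partition(":")
        if not sep:
            # Assumption: lines without a colon carry no node result.
            continue
        message = message.strip()
        if not message:
            continue  # empty RHS means the command succeeded
        if message.startswith(tuple(ignore_prefixes)):
            continue  # treat known warnings as false errors
        errors.append((node.strip(), message))
    return errors


# Hypothetical filter output; node2 and node3 lines are invented.
sample = (
    "oscar_cluster node1: \n"
    "oscar_cluster node2: Warning: No xauth data\n"
    "oscar_cluster node3: command not found\n"
)
print(remote_errors(sample))
print(remote_errors(sample, ignore_prefixes=()))
```

Passing an empty `ignore_prefixes` reproduces the stricter behavior, where the xauth warning is reported as an error.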

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2003-11-19 16:40

Message:
Logged In: YES 
user_id=505737

After looking at this for a while, I'm concluding that there
is no bug here. Jason hit a case where his ssh is not
configured properly and spits out warning messages. The sanity
check takes these warning messages and does the right thing:
it fails. Once ssh is properly configured on his test
system, the sanity check should succeed. 

I recommend that this bug be either removed or downgraded
unless there are further comments. 

NOTES:
1. I looked into possibly coding the sanity check another
way, and looking for positive "node alive" messages from a
cexec command. This turns out to be more difficult than it
first seems and may not be doable at all, unless i'm missing
something. 

2. It may be possible to catch specific ssh warning messages
and ignore them. This might not be such a good idea either,
and i think our best bet is to fail gracefully as we are doing. 

----------------------------------------------------------------------

Comment By: Jason Brechin (brechin)
Date: 2003-11-19 09:42

Message:
Logged In: YES 
user_id=274641

Yep... it returns successfully...

[EMAIL PROTECTED] oscar]# ssh oscarnode1 hostname
Warning: No xauth data; using fake authentication data for 
X11 forwarding.
oscarnode1.ncsa.uiuc.edu
[EMAIL PROTECTED] oscar]# echo $?
0
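
For illustration, a small Python sketch of why the exit status and the warning text disagree: a command can exit 0 and still print a warning. It assumes the ssh warning arrives on stderr, which may not hold once C3 merges the streams. A Python one-liner stands in for `ssh oscarnode1 hostname` so the sketch runs without a cluster; the helper name is hypothetical.

```python
import subprocess
import sys


def run_and_classify(cmd):
    """Run cmd and return (succeeded, stdout_text, stderr_text), keeping
    the exit status separate from any warning text."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout.strip(), result.stderr.strip()


# Stand-in for `ssh oscarnode1 hostname`: warns on stderr, answers on
# stdout, and exits with status 0.
stand_in = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('Warning: No xauth data\\n'); "
    "print('oscarnode1')",
]
ok, out, warnings = run_and_classify(stand_in)
print(ok, out, warnings)
```

A check based on the exit status alone would pass here, while a check that treats any extra text as failure would not.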


----------------------------------------------------------------------

Comment By: Benoit des Ligneris (bligneri)
Date: 2003-11-19 09:05

Message:
Logged In: YES 
user_id=179120

This SSH behavior is caused by the fact that the SSH key for
host  oscarnode1.ncsa.uiuc.edu has changed and is not the
same as the one in (~/.ssh/known_hosts).

This can happen when you add or delete a node?

Anyway, we should remove all the keys when we remove a node
so that this cannot happen (I guess some grepping and
sedding of all the /home/*/.ssh/known_hosts should do the
trick?).
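
A sketch of that cleanup in Python, operating on the text of one known_hosts file. It assumes plain (unhashed) entries whose first field is a comma-separated host list; hashed entries and marker lines are not handled, and `ssh-keygen -R hostname` is the usual tool for those. The function name, the second hostname, and the truncated key material are made up for the example.

```python
def drop_host_keys(known_hosts_text: str, hostname: str) -> str:
    """Return known_hosts content with every plain entry for hostname removed."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        is_entry = bool(fields) and not line.lstrip().startswith("#")
        # The first field of a plain entry is a comma-separated host list.
        hosts = fields[0].split(",") if is_entry else []
        if hostname in hosts:
            continue  # drop the stale key for the removed node
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical file content; the keys are placeholders, not real keys.
sample = (
    "# cluster nodes\n"
    "oscarnode1.ncsa.uiuc.edu,oscarnode1 ssh-rsa AAAAexample1\n"
    "oscarnode2.ncsa.uiuc.edu ssh-rsa AAAAexample2\n"
)
print(drop_host_keys(sample, "oscarnode1"))
```

Applying this to every user's file would still need a loop over /home/*/.ssh/known_hosts and a safe write-back, which the sketch leaves out.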

The same problem occurs if the user already has a known_hosts
file that conflicts with the real SSH key of the host.

Anyway, to reproduce this, simply alter the host key in
.ssh/known_hosts

However, at some point, 

You can reproduce this ssh behavior

----------------------------------------------------------------------

Comment By: John (muglerj)
Date: 2003-11-18 23:21

Message:
Logged In: YES 
user_id=505737

Well, i do see a warning message:

Warning: No
xauth data; using fake authentication data for X11
forwarding.

Does ssh spit back a success return code with this warning?
If it spits back something other than success, the  sanity
check is designed to fail. 

I cannot seem to reproduce this, although i've seen it
before with ssh. I tried zapping my .Xauthority file but no
luck with that. 

----------------------------------------------------------------------



_______________________________________________
Oscar-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-devel
