Drilling down deeper, there seem to be two different situations.

On nodes without X11, the remoteshell script takes less than a second, and the keys are in place:
-rw------- 1 root root 1231 Feb  7  2017 authorized_keys
-rw------- 1 root root 1675 Feb  7  2017 id_rsa
-rw------- 1 root root  410 Feb  7  2017 id_rsa.pub
7cb5ab60ff42ede791c823afd016997d  /root/.ssh/authorized_keys
13f430f0001adff42dc250f818eabbd1  /root/.ssh/id_rsa
3f5101404ac152d4aaea6c62f7eb6e30  /root/.ssh/id_rsa.pub

However, later in the script, when trying to set things up to start GPFS, I get this message:

Install: recovering gpfs sdr
Tue Feb  7 18:20:28 UTC 2017: mmsdrrestore: Processing node gpu002
mmsdrrestore: Run the command from an active terminal or enable global passwordless access.
mmsdrrestore: Unable to retrieve GPFS cluster files from node ut002.oscar.ccv.brown.edu
mmsdrrestore: File /var/mmfs/ssl/stage/genkeyData1 not found.
   Use mmauth genkey to recover the file, or to generate and commit a new key.
mmsdrrestore: Unexpected error from updateMmfsEnvironment.  Return code: 1
mmsdrrestore: Command failed. Examine previous error messages to determine cause.


If I copy/paste the command from the postscript file and run it from an ssh
login, I get:
[root@gpu002 xcat]# /usr/lpp/mmfs/bin/mmsdrrestore -p ut003 -R /usr/bin/scp
Tue Feb  7 18:26:10 UTC 2017: mmsdrrestore: Processing node gpu002
Warning: Permanently added 'ut002.oscar.ccv.brown.edu' (RSA) to the list of known hosts.
mmsdrrestore: Node gpu002 successfully restored.

There is no difference in the /root/.ssh files before or after. Why does it
work by hand, but not from inside the script?
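
One way to narrow this down might be to check whether root can ssh to the GPFS
server with no prompt at all, which is closer to what a postscript sees than an
interactive login. The mmsdrrestore message above asks for either an active
terminal or global passwordless access, and the by-hand run had to add the
server to known_hosts. A minimal sketch, assuming OpenSSH; the function name
check_batch_ssh and the SSH override variable are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of xCAT or GPFS): succeed only if ssh to the
# given host works without any user interaction.
# BatchMode=yes tells OpenSSH to fail rather than prompt, e.g. for a password.
# The SSH variable is an assumption added here so the check can be dry-run.
check_batch_ssh() {
    host=$1
    if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
        echo "ok: $host"
    else
        echo "FAILED: $host"
        return 1
    fi
}

# Example call, e.g. just before mmsdrrestore in the postscript:
#   check_batch_ssh ut002.oscar.ccv.brown.edu
```

If this fails from inside the postscript but works from a login shell, the
difference is more likely in the environment than in the key files. That is
only a guess at the cause, not something confirmed here.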

I found that on nodes with X11, the remoteshell script was taking 12 minutes
to run to “completion”, and the result was a zero-length id_rsa.pub file.

-rw------- 1 root root 821 Feb  7 12:59 authorized_keys
-rw------- 1 root root   0 Feb  7 13:09 id_rsa.pub
-rw-r--r-- 1 root root 183 Feb  7 13:02 known_hosts
4cd344ed6d3721a283f442977862b981  /root/.ssh/authorized_keys
d41d8cd98f00b204e9800998ecf8427e  /root/.ssh/id_rsa.pub
a178f5a553c74d99590b2047d9517363  /root/.ssh/known_hosts

I thought the culprit was NetworkManager, but it turned out to be firewalld.
(chroot . systemctl disable firewalld)
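
A quick check for this failure mode could be run at the end of the postscripts.
The d41d8cd98f00b204e9800998ecf8427e checksum above is the MD5 of empty input,
so testing for missing or zero-length files catches the same condition. A
minimal sketch; the function name check_ssh_keys is made up, and the list of
files is an assumption based on the listings above:

```shell
#!/bin/sh
# Hypothetical helper: report any expected key file that is missing or empty.
# "[ -s file ]" is true only if the file exists and has a size greater than zero.
check_ssh_keys() {
    dir=${1:-/root/.ssh}
    rc=0
    for f in id_rsa id_rsa.pub authorized_keys; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            rc=1
        fi
    done
    return $rc
}

# Example call:
#   check_ssh_keys /root/.ssh || echo "remoteshell did not finish cleanly"
```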

— ddj

> On Feb 7, 2017, at 6:35 AM, David D Johnson <[email protected]> wrote:
> 
> That was already the case (IP of mgt1 and IP of mgt[2] are the forwarders).
> I don't believe it will forward requests within the zones for which it is 
> authoritative.
> I ended up using tabdump to recreate the hosts and nodelist tables. Mostly 
> good.
> 
> Now the problem of the day is fixing the SSH credentials so that all the 
> diskless nodes booting off the
> new frontend can get root access to all the nodes still booted off the old 
> frontend.  Need this
> especially for GPFS.  I've been trying to follow what's going on in the 
> remoteshell postscript,
> and I'm wondering if my "sitespecific" postscript is running before 
> "remoteshell" is completed.
> Is there a way to determine/force the order the postscripts are executed?  
> Sitespecific is after
> remoteshell both in alphabet and in the lsdef output. 
> The basic problem is that mmsdrrestore fails during sitespecific, but works 
> fine when I try it again later by hand.
> 
>  -- ddj
> Dave Johnson
> Brown University
> 
>> On Feb 7, 2017, at 4:32 AM, Er Tao Zhao <[email protected]> wrote:
>> 
>> Hi, David
>>  
>> Will you pls try 'chdef -t site forwarders=<ip_of_mgt1>' and then 'makedns' 
>> to use mgt1 as your remote DNS server.
>> Pls feel free to let me know if there is any more issues.
>>  
>> Thx!
>> Best Regards,
>> -----------------------------------
>> Zhao Er Tao
>> 
>> IBM China System and Technology Laboratory, Beijing
>> Tel:(86-10)82450485
>> Email: [email protected]
>> Address: 1/F, 28 Building,ZhongGuanCun Software Park,
>> No.8 DongBeiWang West Road, Haidian District,
>> Beijing, 100193, P.R.China
>>  
>>  
>> ----- Original message -----
>> From: "David D. Johnson" <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Cc:
>> Subject: [xcat-user] upgrading xCAT onto new servers
>> Date: Sat, Feb 4, 2017 3:04 AM
>>  
>> We’re upgrading cluster mgt node hardware and software at the same time, 
>> going from 2.8.3 to 2.13.1,
>> and from centos6.7 to rhels7.2.   I have the new frontend installed and 
>> somewhat functional.
>> Right now I’m needing to clone the DNS / named from “mgt1” that is still 
>> authoritative for the production cluster.
>> I could just tabdump hosts and nodelist and do makedns on “mgt5”, or I’m 
>> thinking there might be a way to make
>> the new mgt5 a slave to the existing named running on mgt1.   Any pros/cons? 
>>  What would you do?
>> 
>> Thanks,
>> 
>>  — ddj
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org <http://slashdot.org/>! 
>> http://sdm.link/slashdot <http://sdm.link/slashdot>
>> _______________________________________________
>> xCAT-user mailing list
>> [email protected] <mailto:[email protected]>
>> https://lists.sourceforge.net/lists/listinfo/xcat-user 
>> <https://lists.sourceforge.net/lists/listinfo/xcat-user>
>>  
>> 
> 
