Hi Keith,
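Most xCAT client commands are essentially thin clients to the xCAT server's
API, and nodeset is no exception. What you're seeing is the client waiting
for the server's response on the connection's file descriptor (usually 3,
the first one free after std{in,out,err}). Network waits like this are usually
implemented as a select call on an fd set, followed by a read from the
descriptors in that set. In your trace, each select times out after 0.5 s
({0, 500000}) without fd 3 becoming readable, and the read that follows
returns EAGAIN because no data has arrived yet. Together with the tiny
user/sys times, that suggests the delay is on the server side (xcatd) rather
than in the client.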
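
To illustrate the pattern, here is a minimal Python sketch of such a wait loop. It is not xCAT's actual client code; the names wait_for_reply and demo, the socketpair standing in for the server connection, and the 1.2 s server delay are all assumptions made up for the example. Only the 0.5 s select timeout and the 5-byte read are taken from your trace.

```python
import select
import socket
import threading
import time


def wait_for_reply(sock, nbytes=5, timeout=0.5, max_polls=20):
    """Poll sock with select(), then try a non-blocking read, until data arrives.

    Returns (data, timeouts); timeouts counts the select() calls that
    expired with nothing readable. data is None if max_polls was exhausted.
    """
    sock.setblocking(False)
    timeouts = 0
    for _attempt in range(max_polls):
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            timeouts += 1  # strace shows this as "select(...) = 0 (Timeout)"
        try:
            # Mirrors the trace: the read is attempted after every select.
            data = sock.recv(nbytes)
        except BlockingIOError:
            # Nothing to read yet: strace shows "read(...) = -1 EAGAIN".
            continue
        return data, timeouts
    return None, timeouts


def demo(delay=1.2):
    """Hypothetical slow server: replies only after `delay` seconds."""
    client, server = socket.socketpair()

    def slow_server():
        time.sleep(delay)
        server.sendall(b"done\n")

    worker = threading.Thread(target=slow_server)
    worker.start()
    try:
        return wait_for_reply(client)
    finally:
        worker.join()
        client.close()
        server.close()


if __name__ == "__main__":
    data, timeouts = demo()
    print("reply:", data, "select timeouts while waiting:", timeouts)
```

Running this under strace should give the same alternating select/read pattern for as long as the fake server stays silent. The point is that the client is idle during that time, so the interesting question is what the server is doing before it answers.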

Take a look at the xCAT daemon log in a second terminal while the command is
running. More details on xCAT's debugging facilities can be found at
https://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/references/man8/xcatdebug.8.html

Regards,
--
Samveen S. Gulati
--
The best-laid schemes o' mice an' men
                Gang aft agley,
An' lea'e us nought but grief an' pain,
                For promis'd joy!
                          -- Robert Burns
(The best laid plans of mice and men often go awry, and bring nothing but grief 
and pain of the ..) 

    On Thursday, May 2, 2024 at 09:07:36 PM GMT+5:30, keith.han...@lmco.com 
<keith.han...@lmco.com> wrote:  
 
  
I have two xCAT boot servers (we’ll call them boot1 and boot2), both running
xCAT 2.16.5 after being upgraded from 2.10. boot2 was upgraded directly,
while on boot1 xCAT was upgraded, then removed completely, and 2.16.5 was
installed fresh.
 
  
 
I am having an issue with nodeset on boot1. For some reason, nodeset on a
single node takes over a minute on boot1, while nodeset on the same node
takes less than a second on boot2. I can’t seem to find the cause of this. I
tried stracing nodeset and I see long timeouts on a resource being
temporarily unavailable, but I can’t figure out which resource it is.
 
  
 
[root@boot1 ~]# time nodeset -V proc5201_p osimage=rhel7-ia2
proc5201_p: netboot rhels7.4-x86_64-compute

real    0m50.636s
user    0m0.064s
sys     0m0.038s

[root@boot2 ~]# time nodeset -V proc5201_p osimage=rhel7-ia2
proc5201_p: netboot rhels7.4-x86_64-compute

real    0m0.799s
user    0m0.060s
sys     0m0.033s
 
  
 
  
 
on boot1:
 
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5)                   = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
 
  
 
  
 
Has anyone seen this before and can help explain why this is happening?
 
  
 
_______________________
 
- Keith Hannum
 
- keith.han...@lmco.com
 
  
 
  
 _______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user
  