Hi Keith,

Most xCAT client commands are essentially thin clients to the xCAT server's API, and nodeset is no exception. What you're seeing is the client waiting for server responses on the connection's file descriptor (usually 3, the first one after stdin, stdout, and stderr). Waits on network communication are usually implemented as a select call on a set of file descriptors (the fdset), followed by a read from whichever descriptors have changed state. In your trace, each select times out after 0.5 seconds with nothing to read, so the read that follows returns EAGAIN: the client is idle, waiting for the server to reply.
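To make the select-then-read pattern concrete, here is a minimal Python sketch. It is not xCAT's client code; the socket pair, the `poll_once` helper, and the shortened timeout are all assumptions chosen purely for illustration. It shows a non-blocking read failing with EAGAIN while the peer is silent, and succeeding once the peer writes.

```python
import errno
import select
import socket


def poll_once(sock, timeout=0.5, nbytes=5):
    """One select()+read() round, like each pair of lines in the strace output.

    Returns (ready, data, err): whether select() reported the socket readable,
    the bytes read (or None), and the errno of a failed read (or None).
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    try:
        data = sock.recv(nbytes)
    except BlockingIOError as exc:
        # Non-blocking socket with nothing to read: EAGAIN / EWOULDBLOCK.
        return bool(readable), None, exc.errno
    return bool(readable), data, None


# Hypothetical stand-ins for the client and server ends of the connection.
client, server = socket.socketpair()
client.setblocking(False)

# The "server" is silent: select() times out and the read fails with EAGAIN.
first = poll_once(client, timeout=0.05)

# The "server" finally answers: select() reports readable and the read succeeds.
server.sendall(b"hello")
second = poll_once(client, timeout=0.05)

print(first)
print(second)

client.close()
server.close()
```

A long run of timeouts like the first call therefore points at the other end of the connection: the client has nothing to do until the server sends something.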
You should take a look at the xCAT daemon log in a second terminal while the command is running. More details on the debugging facilities of xCAT can be found at https://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/references/man8/xcatdebug.8.html

Regards,
--
Samveen S. Gulati
--
The best-laid schemes o' mice an' men
Gang aft agley,
An' lea'e us nought but grief an' pain,
For promis'd joy!
-- Robert Burns
(The best laid plans of mice and men often go awry, and bring nothing but grief and pain of the ..)

On Thursday, May 2, 2024 at 09:07:36 PM GMT+5:30, keith.han...@lmco.com <keith.han...@lmco.com> wrote:

I have 2 different xcat boot servers (we'll call them boot1 and boot2), both running xcat 2.16.5, which were upgraded from 2.10. boot2 was upgraded directly, while boot1 was upgraded, but then xcat was removed completely and 2.16.5 was installed fresh.

I am having an issue with nodeset on boot1. For some reason, nodeset on a single node on boot1 takes over a minute, while nodeset on the same node on boot2 takes less than a second. I can't seem to find the cause of this. I tried stracing nodeset and I get long timeouts on a resource being temporarily unavailable, but I can't seem to figure out which resource.
[root@boot1 ~]# time nodeset -V proc5201_p osimage=rhel7-ia2
proc5201_p: netboot rhels7.4-x86_64-compute

real 0m50.636s
user 0m0.064s
sys 0m0.038s

[root@boot2 ~]# time nodeset -V proc5201_p osimage=rhel7-ia2
proc5201_p: netboot rhels7.4-x86_64-compute

real 0m0.799s
user 0m0.060s
sys 0m0.033s

on boot1:

select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)
read(3, 0x20de033, 5) = -1 EAGAIN (Resource temporarily unavailable)
select(8, [3], NULL, NULL, {0, 500000}) = 0 (Timeout)

Has anyone seen this before and can help explain why this is happening?

_______________________
- Keith Hannum
- keith.han...@lmco.com

_______________________________________________
xCAT-user mailing list
xCAT-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xcat-user