Hi,

> On 16.05.2017 at 03:44, John_Tai <[email protected]> wrote:
> 
>>> And the opened shell is idling?
> 
> No, it's working normally.
> 
>>> Which method do you use to log in: the default "builtin" method or
>>> something self-defined?
> 
> Default, I didn't define anything. Don't even know how to.
> 
>>> Any global or queue prolog in place, which is supposed to run under the sge 
>>> account?
> 
> There are no prolog/epilog defined.
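
Just to double-check both answers: the configuration can be dumped with `qconf -sconf` (global) and `qconf -sq lc.q` (queue). A minimal sketch, where `show_hooks` is only a helper name made up for this mail and the pattern assumes the usual parameter names (`prolog`, `epilog`, `qlogin_*`, `rlogin_*`, `rsh_*`):

```shell
# Hypothetical helper: read a configuration dump on stdin and keep only the
# prolog/epilog and login-method lines.
show_hooks() {
    grep -E '^(prolog|epilog|qlogin_|rlogin_|rsh_)'
}

# Intended use on a machine with the SGE tools in PATH (left commented out here):
# qconf -sconf | show_hooks
# qconf -sq lc.q | show_hooks
```

With the defaults I would expect `builtin` for the login-related entries and `none`/`NONE` for prolog and epilog.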
> 
>>> From the root account you can `strace -p 22443` and check what is going on 
>>> therein.
> 
> It keeps looping through these messages. Is it normal?

No.

And does this happen for all logins via `qrsh` on all nodes, but not for
conventional `qsub`ed jobs?
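
What stands out in the trace below is that poll() comes back with POLLHUP on fd 6 every time rather than waiting out its 1000 ms timeout, so it would be interesting to know what fd 6 actually is. A small sketch, assuming Linux with /proc and root privileges; `fd_target` is just a helper name I made up:

```shell
# Hypothetical helper: print what file descriptor FD of process PID refers to.
# Usage: fd_target PID FD
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# For the shepherd from the `top` output this would be (as root):
# fd_target 22443 6
```

A result such as `pipe:[...]` or `socket:[...]` would suggest that the other end of that connection has gone away.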

-- Reuti


> alarm(0)                                = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0)                                = 0
> alarm(0)                                = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0)                                = 0
> alarm(0)                                = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0)                                = 0
> alarm(0)                                = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0)                                = 0
> alarm(0)                                = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> 
> 
> 
> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Monday, May 15, 2017 5:46
> To: John_Tai
> Cc: [email protected]
> Subject: Re: [gridengine users] sge_shepherd using 100% CPU
> 
> Hi,
> 
>> On 15.05.2017 at 05:28, John_Tai <[email protected]> wrote:
>> 
>> I recently found a weird problem with qrsh.
>> 
>> If I just use it to log in to an exec host, sge_shepherd uses 100% of the CPU.
>> 
>> # qrsh -q lc.q@ibm105
>> # top
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 22443 sge       25   0 20396 1604 1272 R 99.5  0.0   0:08.80 sge_shepherd
>> 19927 sge       16   0  114m 3096 1836 S  0.0  0.0   0:00.26 sge_execd
> 
> And the opened shell is idling?
> 
> Which method do you use to log in: the default "builtin" method or
> something self-defined?
> 
> In my clusters I can't observe this behavior.
> 
> Even if something were running from one of the shell's profile files, it
> should show up for the opened shell, not for the sge_shepherd, which runs
> under the sge admin account.
> 
> Any global or queue prolog in place, which is supposed to run under the sge 
> account?
> 
> ==
> 
> From the root account you can `strace -p 22443` and check what is going on 
> therein.
> 
> -- Reuti
> 
> 
>> But if I submit an actual command with qrsh this doesn’t happen.
>> 
>> # qrsh -q lc.q@ibm105 xclock
>> # top
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 19927 sge       16   0  114m 3100 1836 S  0.0  0.0   0:00.38 sge_execd
>> 22671 sge       18   0 20392 1584 1256 S  0.0  0.0   0:00.00 sge_shepherd
>> 
>> Not sure why that is. How do I troubleshoot this?
>> 
>> Thanks
>> Johnt
>> This email (including its attachments, if any) may be confidential and 
>> proprietary information of SMIC, and intended only for the use of the named 
>> recipient(s) above. Any unauthorized use or disclosure of this email is 
>> strictly prohibited. If you are not the intended recipient(s), please notify 
>> the sender immediately and delete this email from your computer.
>> 
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
> 


