Hi,
> On 16.05.2017 at 03:44, John_Tai <[email protected]> wrote:
>
>>> And the opened shell is idling?
>
> No, it's working normally.
>
>>> How do you log in by this method – the default "builtin" method or anything
>>> self defined?
>
> Default, I didn't define anything. Don't even know how to.
>
>>> Any global or queue prolog in place, which is supposed to run under the sge
>>> account?
>
> There are no prolog/epilog defined.
>
>>> From the root account you can `strace -p 22443` and check what is going on
>>> therein.
>
> It keeps looping through these messages. Is it normal?
No.
And does this happen for all logins via `qrsh` on all nodes, but not for
conventional jobs submitted with `qsub`?
-- Reuti
> alarm(0) = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0) = 0
> alarm(0) = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0) = 0
> alarm(0) = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0) = 0
> alarm(0) = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
> wait4(-1, 0x7fffa369d95c, WNOHANG, 0x7fffa369e300) = 0
> alarm(0) = 0
> alarm(0) = 0
> poll([{fd=6, events=POLLIN}, {fd=-1}], 2, 1000) = 1 ([{fd=6, revents=POLLHUP}])
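
For what it's worth, the loop in this trace is consistent with a descriptor
whose peer has hung up: poll() reports POLLHUP at once instead of waiting out
its 1000 ms timeout, so a loop around it never sleeps. A minimal Python sketch
of that behavior, assuming Linux. The pipe and the function name
poll_closed_pipe are illustrative stand-ins only, not anything taken from
sge_shepherd:

```python
import os
import select
import time


def poll_closed_pipe(timeout_ms=1000):
    """Poll the read end of a pipe whose write end is already closed."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # the peer hangs up (stand-in for fd 6 in the trace)

    poller = select.poll()
    poller.register(read_fd, select.POLLIN)

    start = time.monotonic()
    events = poller.poll(timeout_ms)  # same 1000 ms timeout as in the trace
    elapsed = time.monotonic() - start

    os.close(read_fd)
    return events, elapsed


if __name__ == "__main__":
    events, elapsed = poll_closed_pipe()
    for fd, revents in events:
        print("POLLHUP set:", bool(revents & select.POLLHUP))
    print("poll() returned after %.3f s" % elapsed)
```

If the shepherd keeps polling fd 6 after the hangup without dropping it, that
alone could plausibly account for the CPU load. From the root account,
`ls -l /proc/22443/fd` shows what fd 6 actually refers to.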
>
>
>
> -----Original Message-----
> From: Reuti [mailto:[email protected]]
> Sent: Monday, May 15, 2017 5:46
> To: John_Tai
> Cc: [email protected]
> Subject: Re: [gridengine users] sge_shepherd using 100% CPU
>
> Hi,
>
>> On 15.05.2017 at 05:28, John_Tai <[email protected]> wrote:
>>
>> I recently found a weird problem with qrsh.
>>
>> If I just use it to log in to an exec host, sge_shepherd uses 100% of a CPU.
>>
>> # qrsh -q lc.q@ibm105
>> # top
>>   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
>> 22443 sge  25  0 20396 1604 1272 R 99.5  0.0 0:08.80 sge_shepherd
>> 19927 sge  16  0  114m 3096 1836 S  0.0  0.0 0:00.26 sge_execd
>
> And the opened shell is idling?
>
> How do you log in by this method – the default "builtin" method or anything
> self defined?
>
> In my clusters I can't observe this behavior.
>
> Even if there would be something running in any of the shell's profile: it
> should show up for the opened shell but not for the sge_shepherd which runs
> under the sge admin account.
>
> Any global or queue prolog in place, which is supposed to run under the sge
> account?
>
> ==
>
> From the root account you can `strace -p 22443` and check what is going on
> therein.
>
> -- Reuti
>
>
>> But if I submit an actual command with qrsh this doesn’t happen.
>>
>> # qrsh -q lc.q@ibm105 xclock
>> # top
>>   PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
>> 19927 sge  16  0  114m 3100 1836 S  0.0  0.0 0:00.38 sge_execd
>> 22671 sge  18  0 20392 1584 1256 S  0.0  0.0 0:00.00 sge_shepherd
>>
>> Not sure why that is. How do I troubleshoot this?
>>
>> Thanks
>> Johnt
>> This email (including its attachments, if any) may be confidential and
>> proprietary information of SMIC, and intended only for the use of the named
>> recipient(s) above. Any unauthorized use or disclosure of this email is
>> strictly prohibited. If you are not the intended recipient(s), please notify
>> the sender immediately and delete this email from your computer.
>>
>> _______________________________________________
>> users mailing list
>> [email protected]
>> https://gridengine.org/mailman/listinfo/users
>