Hi Reuti,

The startup mechanism is as below:

qlogin_daemon    /usr/sbin/sshd -i
qlogin_command   /gridapl1/HWEE_ge6/new/qssh

Regards,
Sudha

-----Original Message-----
From: Reuti [mailto:[email protected]]
Sent: Friday, May 08, 2015 10:50 PM
To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
Cc: [email protected]; [email protected]
Subject: Re: [gridengine users] grid jobs not visible with qstat output

> On 08.05.2015 at 16:57, [email protected] wrote:
>
> Hi Zhang,
>
> Please find the output:
>
> 32682 61457200 27020 karppa 32682 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spo
> 32734 61457200 27020 karppa 32734 \_ /bin/ksh ./run_it_file.vcs
> 33043 61457200 27020 karppa 32734 \_ /bin/ksh ./vcs.start.dh.no_gui
> 33059 61457200 27020 karppa 32734 \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+
> 38048 61457200 27020 karppa 32734 \_ [target.bin] <defunct>
> 5049 61457200 27020 karppa 5049 /applic36/grid/HWEE_ge6/utilbin/lx24-amd64/qrsh_starter /gridapl1/HWEE_ge6/default/spoo
> 5101 61457200 27020 karppa 5101 \_ /bin/ksh ./run_it_file.vcs
> 5408 61457200 27020 karppa 5101 \_ /bin/ksh ./vcs.start.dh.no_gui
> 5424 61457200 27020 karppa 5101 \_ ./vcs/tb_bin/hdl_top_rtldhsim/simv -licqueue -cm line+cond+fsm+branch+tgl+a
> 9089 61457200 27020 karppa 5101 \_ [target.bin] <defunct>

The problem seems to be that the `qrsh_starter` is no longer bound to the
"sge_shepherd". Was this after the job? What does it look like while SGE still
knows about the job? What is the startup mechanism:

$ qconf -sconf
...
qlogin_command   builtin
qlogin_daemon    builtin
rlogin_command   builtin
rlogin_daemon    builtin
rsh_command      builtin
rsh_daemon       builtin

-- Reuti

> Regards,
> Sudha
>
> -----Original Message-----
> From: Feng Zhang [mailto:[email protected]]
> Sent: Friday, May 08, 2015 7:35 PM
> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
> Subject: Re: [gridengine users] grid jobs not visible with qstat output
>
> Sudha,
>
> Can you run "ps -e f -o pid,ppid,command", which can show more details?
>
> On Fri, May 8, 2015 at 4:09 AM, <[email protected]> wrote:
>> Hi Reuti,
>>
>> The processes are not bound to sge_shepherd anymore.
>>
>> Below are the qrsh_starter processes that are still running:
>>
>> 5049 ? 00:00:00 qrsh_starter
>> 5101 ? 00:00:00 run_it_file.vcs
>> 5408 ? 00:00:00 vcs.start.dh.no
>> 5424 ? 8-20:57:02 simv
>> 9089 ? 00:00:00 target.bin <defunct>
>> 16868 ? 00:00:00 sshd
>> 16913 pts/9 00:00:00 bash
>> 17371 pts/9 00:00:00 ps
>> 32682 ? 00:00:00 qrsh_starter
>> 32734 ? 00:00:00 run_it_file.vcs
>> 33043 ? 00:00:00 vcs.start.dh.no
>> 33059 ? 8-21:19:03 simv
>> 38048 ? 00:00:00 target.bin <defunct>
>>
>> Regards,
>> Sudha
>>
>> -----Original Message-----
>> From: Reuti [mailto:[email protected]]
>> Sent: Thursday, May 07, 2015 9:52 PM
>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
>> Cc: [email protected]; [email protected]
>> Subject: Re: [gridengine users] grid jobs not visible with qstat output
>>
>> Are the processes still bound to the sge_shepherd or did they jump out of
>> the process tree? By what method were they started by qrsh_starter:
>> "builtin" or by defining `ssh`?
>>
>> -- Reuti
>>
>>
>>> On 07.05.2015 at 18:00, <[email protected]> <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> No, the slots are not being used anymore.
>>>
>>> According to qstat I seem not to have any jobs on the host. However, my
>>> processes (launched by qrsh_starter) are still running on that specific host
>>> and are altogether consuming 200% of CPU and licenses.
>>> The problem here is that the processes have been running there for over a
>>> week and I haven’t been aware of them. I thought that the processes were
>>> killed when the job was killed with qdel.
>>>
>>> What could be the reason for this?
>>>
>>> Regards,
>>> Sudha
>>>
>>> From: Srirangam Addepalli [mailto:[email protected]]
>>> Sent: Wednesday, May 06, 2015 7:52 PM
>>> To: Sudha Padmini Penmetsa (WT01 - Global Media & Telecom)
>>> Subject: Re: [gridengine users] grid jobs not visible with qstat output
>>>
>>> That would be strange. Do the slots on the host show as being used?
>>>
>>> "qhost -j -h hostname" should list the jobs that Grid Engine is aware of,
>>> unless qrsh somehow spawned a process that is not bound by sge_execd. On
>>> the client/execution host, what info do you have in the active_jobs and jobs
>>> directories? It is more likely that the qrsh session was terminated but
>>> left resident processes.
>>>
>>> Rangam
>>>
>>> On Wed, May 6, 2015 at 9:05 AM, <[email protected]> wrote:
>>> Hi,
>>>
>>> I noticed that I've had two grid jobs running for over a week on a machine
>>> without being aware of them. Both jobs were launched with qrsh but they are
>>> not visible with qstat, so for one reason or another they are no longer
>>> included in grid bookkeeping. This issue means that grid resources are
>>> wasted on ghost jobs; for example, both of my jobs seem to consume 100% CPU
>>> on the host.
>>>
>>> Can anyone please explain this?
>>>
>>> Regards,
>>> Sudha
>>>
>>> The information contained in this electronic message and any attachments to
>>> this message are intended for the exclusive use of the addressee(s) and may
>>> contain proprietary, confidential or privileged information. If you are not
>>> the intended recipient, you should not disseminate, distribute or copy this
>>> e-mail. Please notify the sender immediately and destroy all copies of this
>>> message and any attachments.
>>> WARNING: Computer viruses can be transmitted
>>> via email. The recipient should check this email and any attachments for
>>> the presence of viruses. The company accepts no liability for any damage
>>> caused by any virus transmitted by this email. www.wipro.com
>>>
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users
>
> --
> Best,
>
> Feng
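
Reuti's question in the thread, whether the qrsh_starter processes are still bound to sge_shepherd or have jumped out of the process tree, can be checked with a short script. The following is only a sketch under assumptions, not a Grid Engine tool. The function names are made up for illustration. It assumes a Linux `ps` that accepts `-o pid= -o ppid= -o comm=`, and that the shepherd's command name starts with "sge_shepherd". It walks the parent chain of every qrsh_starter and reports those with no shepherd among their ancestors.

```python
import subprocess


def parse_ps(text):
    """Parse lines of 'PID PPID COMMAND' into {pid: (ppid, command)}."""
    procs = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip blank or malformed lines
        procs[int(parts[0])] = (int(parts[1]), parts[2].strip())
    return procs


def has_ancestor(procs, pid, name):
    """Return True if any ancestor's command starts with `name`."""
    seen = set()
    ppid = procs.get(pid, (0, ""))[0]
    while ppid in procs and ppid not in seen:
        seen.add(ppid)  # guard against loops in inconsistent snapshots
        parent_ppid, comm = procs[ppid]
        if comm.startswith(name):
            return True
        ppid = parent_ppid
    return False


def detached_starters(procs, starter="qrsh_starter", shepherd="sge_shepherd"):
    """PIDs of starter processes that have no shepherd ancestor."""
    return sorted(
        pid
        for pid, (_, comm) in procs.items()
        if comm.startswith(starter) and not has_ancestor(procs, pid, shepherd)
    )


def snapshot():
    """Take a process snapshot with ps (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)
```

On an execution host, `detached_starters(snapshot())` would list the PIDs to inspect. Whether they really belong to a deleted job still has to be confirmed by hand, for example against `qhost -j -h hostname` as suggested in the thread.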
