Thanks Alejandro,
We use Python 2.7, and yes, I did test kill -0 with the contents of the
pid file and got the result that you show. It does work now. I had a line
of Execute() code in the status function where the OS command was malformed,
so the method blew up, but the exception was swallowed, so I didn't know
that was the problem. Once I stripped out all the other code, it started
working.
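
For anyone repeating the kill -0 test from Python instead of the shell,
here is a minimal sketch of the same check. It assumes a POSIX system;
read_pid and pid_is_running are hypothetical helper names for illustration,
not part of Ambari. Unlike a plain catch of OSError, it treats EPERM (the
process exists but is owned by another user) as running, which is a design
choice of this sketch.

```python
import errno
import os


def read_pid(pid_file):
    """Return the integer pid stored in pid_file, or None if unreadable."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        # Missing file, unreadable file, or contents that are not an integer
        return None


def pid_is_running(pid):
    """Probe a pid with signal 0, which checks existence without killing."""
    if pid is None or pid <= 0:
        # pid 0 and negative pids address process groups, not one process
        return False
    try:
        os.kill(pid, 0)
    except OverflowError:
        # The value does not fit the platform's pid type, so it cannot exist
        return False
    except OSError as err:
        # EPERM: the process exists but belongs to another user
        return err.errno == errno.EPERM
    return True
```

With the pid file from the original question,
pid_is_running(read_pid('/home/testuser/appName/logs/pid-8888')) should
return True while the app is up, provided the file holds a single integer.
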
However, I still have the following questions:
- Is there a better way to debug? I tried "ambari-server start --debug"
because the Ambari code actually has extra debug statements that would have
helped me, but I can't turn the debug flag on. Is there a config file that
I can change so that those debug logger statements get printed?

- This setup initially wasn't working because the metainfo file and the
command script were missing from the cache directory. As a last resort, I
manually copied the scripts over after restarting the Ambari server a few
times. I am hoping that I simply missed a config step that caused this;
please confirm, since Ambari wasn't updating the cache directories
properly.
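
On the debugging question, one low-tech option that needs no server flag is
to log the traceback inside the command script itself before the exception
gets swallowed. A minimal sketch using only the standard library follows;
log_exceptions and file_logger are hypothetical helper names, and the log
path in the usage note is an assumption.

```python
import functools
import logging


def log_exceptions(logger):
    """Decorator: log the full traceback of any exception, then re-raise it."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # logger.exception records the message plus the traceback
                logger.exception("%s failed", func.__name__)
                raise  # keep the original behaviour for the caller
        return wrapper
    return decorate


def file_logger(name, path):
    """Build a logger that appends DEBUG and above to the given file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Decorating status() with
@log_exceptions(file_logger('status_debug', '/tmp/status_debug.log')) would
record a malformed Execute() command along with its traceback. Note that it
also logs the exception a status check raises on purpose when the component
is stopped, since the wrapper re-raises everything unchanged.
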

Tim

On Fri, May 22, 2015 at 7:01 PM, Alejandro Fernandez <
[email protected]> wrote:

>  Hi Tim,
>
>  Only make the changes to ambari-server's
> /var/lib/ambari-server/resources/stacks/… folder, then restart
> ambari-server; this will cause the agents to update their cache.
> Restarting ambari-server takes time, so I prefer to push the files to the
> agents with a shell script or pdsh.
>
>  What python version are you using?
> Have you tried to run this manually?
>
>  sleep 60&
> [root@c6401 ~]# ps aux | grep sleep
> root     29010  0.0  0.0 100904   592 pts/1    S    01:59   0:00 sleep 60
> root     29039  0.0  0.0 103236   864 pts/1    S+   01:59   0:00 grep sleep
> [root@c6401 ~]# kill -0 29010
> [root@c6401 ~]# echo $?
> 0
> [root@c6401 ~]# kill -0 290100000000
> -bash: kill: 290100000000: arguments must be process or job Ids
>
>  Thanks,
> Alejandro
>
>   From: Tim To <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, May 22, 2015 at 11:26 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Cannot get check status to work for a customer service
>
>    A quick update:
>  I was able to find the check_process_status method, and all it does is
> read the pid from the pid file and then do a kill -0 on the pid, so this
> should work, but it doesn't for some reason. Perhaps I missed a config step
> and Ambari is not calling our python script somehow. But Ambari is calling
> the script to start our app, so any suggestion is appreciated.
>
> One more question: I set up the custom script in
> /var/lib/ambari-server/resources/stacks/HDP/2.2/... but when Ambari starts
> it complains that it can't find my script at
> /var/lib/ambari-agent/cache/stacks/HDP/2.2/services/... so I manually
> created the directory that Ambari is looking for and copied the script
> there, and Ambari was able to start our app. Does anyone know how to set a
> custom service up so I don't have to manually copy my script to the
> /var/lib/ambari-agent/cache/... directory?
>
>  Many thanks!
>
>  The portion of the check_process_status code I was talking about:
>
>   try:
>       pid = int(sudo.read_file(pid_file))
>   except:
>       Logger.debug("Pid file {0} does not exist".format(pid_file))
>       raise ComponentIsNotRunning()
>
>   try:
>       # Kill will not actually kill the process
>       # From the doc:
>       # If sig is 0, then no signal is sent, but error checking is still
>       # performed; this can be used to check for the existence of a
>       # process ID or process group ID.
>       os.kill(pid, 0)
>   except OSError:
>       Logger.debug("Process with pid {0} is not running. Stale pid file"
>                    " at {1}".format(pid, pid_file))
>       raise ComponentIsNotRunning()
>
>
> On Fri, May 22, 2015 at 10:35 AM, Tim To <[email protected]> wrote:
>
>>      Hi all,
>>  I am using Ambari 2.0 and Hadoop 2.4. We have a custom process and
>> I've been trying to set up monitoring and alerts for it based on what I
>> learned from two popular sites:
>> The official Ambari one:
>> https://cwiki.apache.org/confluence/display/AMBARI/Ambari
>>  and another one that has more detailed explanations:
>>
>> http://mozartanalytics.com/how-to-create-a-software-stack-for-ambari/?preview=true
>>
>>  So far I was able to set it up, and I can deploy our custom service and
>> have Ambari start it for me (so I think stop will work too). However,
>> check status doesn't work. Based on comments from the second site, I'm
>> trying to pass a pid file location to check_process_status(), and magic
>> should happen: Ambari should be able to tell whether this process is
>> running or not.
>>  Here's my python function for status:
>>
>>   def status(self, env):
>>     print 'Status of Phemi Central'
>>     check_process_status('/home/testuser/appName/logs/pid-8888')
>>
>>  I manually checked the file after Ambari started our app, and it does
>> contain the correct pid for the process, but Ambari still thinks the app
>> is "stopped".
>>
>>  - Any pointers as to how check status works and how I am supposed to set
>> it up are appreciated.
>>  - Any more detailed documentation on setting up monitoring and alerts, in
>> addition to the above-mentioned websites, would be greatly helpful (even a
>> confirmation that there is none would save me searching for more :) )
>>  - I also checked out the latest Ambari code from GitHub but have a hard
>> time locating where check status is done, so any help with finding the
>> code would also be helpful.
>>
>>  Thanks in advance everyone!!
>>
>>
>> --
>>   *Tim To*
>> Software Engineer
>>  *PHEMI Health Systems*
>>  180-887 Great Northern Way
>> Vancouver, BC V5T 4T5
>> website <http://www.phemi.com/> twitter
>> <https://twitter.com/PHEMISystems> Linkedin
>> <http://www.linkedin.com/company/3561810?trk=tyah&trkInfo=tarId%3A1403279580554%2Ctas%3Aphemi%20hea%2Cidx%3A1-1-1>
>>
>
>
>
>



