Thanks Alejandro,

We use Python 2.7, and yes, I did test kill -0 with the contents of the pid file and got the result that you show.

It does work now. I had a line of Execute() code in the status function where the OS command was malformed, so the method blew up, but the exception was swallowed and I didn't know that was the problem. Once I stripped out all the other code, it started working.

However, I still have the following questions:

- Is there a better way to debug? I tried "ambari-server start --debug" because the Ambari code actually has extra debug statements that would have helped me, but I can't turn the debug flag on. Is there a config file that I can change so that those debug logger statements get printed?
- This setup initially wasn't working because the metainfo file and the command script were missing from the cache directory. As a last resort, I manually copied the scripts over after restarting the Ambari server a few times. I am hoping that I simply missed a config step somewhere that caused this. Please confirm, since Ambari wasn't updating the cache directories properly.

Tim

On Fri, May 22, 2015 at 7:01 PM, Alejandro Fernandez <[email protected]> wrote:

> Hi Tim,
>
> Only make the changes to ambari-server's
> /var/lib/ambari-server/resources/stacks/… folder, then restart
> ambari-server; this will cause the agents to update their cache.
> Restarting ambari-server takes time, so I prefer a shell script, or use
> pdsh.
>
> What python version are you using?
> Have you tried to run this manually?
>
> sleep 60&
> [root@c6401 ~]# ps aux | grep sleep
> root 29010 0.0 0.0 100904 592 pts/1 S 01:59 0:00 sleep 60
> root 29039 0.0 0.0 103236 864 pts/1 S+ 01:59 0:00 grep sleep
> [root@c6401 ~]# kill -0 29010
> [root@c6401 ~]# echo $?
> 0
> [root@c6401 ~]# kill -0 290100000000
> -bash: kill: 290100000000: arguments must be process or job Ids
>
> Thanks,
> Alejandro
>
> From: Tim To <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Friday, May 22, 2015 at 11:26 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: Cannot get check status to work for a customer service
>
> A quick update:
> I was able to find the check_process_status method and all it does is
> read the pid from the pid file then do a kill -0 on the pid, so this should
> work, but it doesn't for some reason. Perhaps I missed a config step and
> Ambari is not calling our python script somehow. But Ambari is calling
> the script to start our app, so any suggestion is appreciated.
>
> One more question: I set up the customer script in
> /var/lib/ambari-server/resources/stacks/HDP/2.2/... but when ambari starts
> it complains that it can't find my script at
> /var/lib/ambari-agent/cache/stacks/HDP/2.2/services/... so I manually
> created the directory that ambari is looking for and copied the script
> there, and ambari was able to start our app. Does anyone know how to set a
> custom service up so I don't have to manually copy my script to the
> /var/lib/ambari-agent/cache/... directory?
>
> Many thanks!
>
> The portion of the check_process_status code I was talking about:
>
>   try:
>     pid = int(sudo.read_file(pid_file))
>   except:
>     Logger.debug("Pid file {0} does not exist".format(pid_file))
>     raise ComponentIsNotRunning()
>
>   try:
>     # Kill will not actually kill the process
>     # From the doc:
>     # If sig is 0, then no signal is sent, but error checking is still
>     # performed; this can be used to check for the existence of a
>     # process ID or process group ID.
>     os.kill(pid, 0)
>   except OSError:
>     Logger.debug("Process with pid {0} is not running. Stale pid file"
>                  " at {1}".format(pid, pid_file))
>     raise ComponentIsNotRunning()
>
>
> On Fri, May 22, 2015 at 10:35 AM, Tim To <[email protected]> wrote:
>
>> Hi all,
>> I am using ambari 2.0 and hadoop 2.4. We have a customer process and
>> I've been trying to set up monitoring and alerts for it based on what I
>> learned from two popular sites:
>> The official ambari one:
>> https://cwiki.apache.org/confluence/display/AMBARI/Ambari
>> and another one that has more detailed explanations:
>>
>> http://mozartanalytics.com/how-to-create-a-software-stack-for-ambari/?preview=true
>>
>> So far I was able to set it up and I can deploy our custom service and
>> have ambari start it for me (so I think the stop will work too). However,
>> check status doesn't work. Based on comments from the second site, I'm
>> trying to pass a pid file location to check_process_status() and magic
>> should happen and Ambari would be able to tell whether this process is
>> working or not.
>> Here's my python function for status:
>>
>>   def status(self, env):
>>     print 'Status of Phemi Central';
>>     check_process_status('/home/testuser/appName/logs/pid-8888')
>>
>> I manually checked the file after ambari started our app and it does
>> contain the correct pid for the process, but ambari still thinks the app is
>> "stopped".
>>
>> - Any pointer as to how check status works and how I am supposed to set
>> it up is appreciated.
>> - Any more detailed documentation on setting up monitoring and alerts, in
>> addition to the above-mentioned websites, would be greatly helpful (even a
>> confirmation that there is none would save me searching for more :) )
>> - I also checked out the latest ambari code from github but have a hard
>> time locating where check status is done, so any help with looking for
>> the code would also be helpful.
>>
>> Thanks in advance everyone!!
>>
>>
>> --
>> *Tim To*
>> Software Engineer
>> *PHEMI Health Systems*
>> 180-887 Great Northern Way
>> Vancouver, BC V5T 4T5
>> website <http://www.phemi.com/> twitter
>> <https://twitter.com/PHEMISystems> Linkedin
>> <http://www.linkedin.com/company/3561810?trk=tyah&trkInfo=tarId%3A1403279580554%2Ctas%3Aphemi%20hea%2Cidx%3A1-1-1>
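
For debugging the status check outside of Ambari, the pid-file logic quoted earlier in this thread can be approximated in a few lines of plain Python. This is only a sketch under assumptions: is_running and the logger name are made up for illustration and are not part of Ambari. Unlike check_process_status, it returns a boolean instead of raising ComponentIsNotRunning, and it logs at WARNING level so that a failure is visible without any debug flag.

```python
import errno
import logging
import os

# Hypothetical logger name, chosen only for this example.
logger = logging.getLogger("status_check_sketch")


def is_running(pid_file):
    """Return True if pid_file holds the pid of a live process.

    Hypothetical helper that mirrors the quoted logic: read the pid,
    then probe it with signal 0, which performs error checking only.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError) as err:
        # Missing file, unreadable file, or contents that are not a number.
        logger.warning("Cannot read a pid from %s: %s", pid_file, err)
        return False

    if pid <= 0:
        # Zero and negative values address process groups, not one process.
        logger.warning("Refusing to probe pid %d from %s", pid, pid_file)
        return False

    try:
        os.kill(pid, 0)  # signal 0: nothing is delivered to the process
    except OverflowError:
        # The number does not fit into the platform's pid type.
        logger.warning("Pid %d from %s is out of range", pid, pid_file)
        return False
    except OSError as err:
        if err.errno == errno.EPERM:
            # The process exists but belongs to another user.
            return True
        logger.warning("Pid %d from %s is not running: %s", pid, pid_file, err)
        return False
    return True
```

Running is_running('/home/testuser/appName/logs/pid-8888') as the same user the agent runs as, after calling logging.basicConfig(), should show whether the pid check itself is at fault or whether something earlier in status() is failing.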
