For server-side debugging, modify the ambari_server_main.py file to enable DEBUG; you can then connect a remote debugger from your favorite IDE to port 5005.
For increased debugging, change the debug level in the ambari.properties file. For Python logging on the agents, I believe the default logger level is already DEBUG.

Thanks,
Alejandro

________________________________
From: Tim To <[email protected]>
Sent: Friday, May 22, 2015 7:36 PM
To: Alejandro Fernandez
Cc: [email protected]
Subject: Re: Cannot get check status to work for a customer service

Thanks Alejandro,

We use Python 2.7, and yes, I did test kill -0 with the contents of the pid file and got the result that you show.

It does work now. I had a line of Execute() code in the status function where the OS command was malformed, so the method blew up, but the exception was swallowed and I didn't know that was the problem. Once I stripped out all the other code, it started working.

However, I still have the following questions:
- Is there a better way to debug? I tried "ambari-server start --debug" because the Ambari code actually has extra debug statements that would have helped me, but I can't turn the debug flag on.
- Is there a config file that I can change to get those debug logger statements printed?
- This setup initially wasn't working because the metainfo file and the command script were missing from the cache directory. As a last resort, I manually copied the scripts over after restarting the ambari server a few times. I am hoping that I am missing a config step somewhere that caused this; please confirm, since ambari wasn't updating the cache directories properly.

Tim

On Fri, May 22, 2015 at 7:01 PM, Alejandro Fernandez <[email protected]<mailto:[email protected]>> wrote:

Hi Tim,

Only make the changes to ambari-server's /var/lib/ambari-server/resources/stacks/… folder, then restart ambari-server; this will cause the agents to update their cache.
Restarting ambari-server takes time, so I prefer a shell script, or use pdsh.

What Python version are you using?
Have you tried to run this manually?
sleep 60&
[root@c6401 ~]# ps aux | grep sleep
root 29010 0.0 0.0 100904 592 pts/1 S 01:59 0:00 sleep 60
root 29039 0.0 0.0 103236 864 pts/1 S+ 01:59 0:00 grep sleep
[root@c6401 ~]# kill -0 29010
[root@c6401 ~]# echo $?
0
[root@c6401 ~]# kill -0 290100000000
-bash: kill: 290100000000: arguments must be process or job Ids

Thanks,
Alejandro

From: Tim To <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>>
Date: Friday, May 22, 2015 at 11:26 AM
To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>>
Subject: Re: Cannot get check status to work for a customer service

A quick update:

I was able to find the check_process_status method, and all it does is read the pid from the pid file and then do a kill -0 on that pid, so this should work, but for some reason it doesn't. Perhaps I missed a config step and Ambari is somehow not calling our Python script. But Ambari is calling the script to start our app, so any suggestion is appreciated.

One more question: I set up the custom script in /var/lib/ambari-server/resources/stacks/HDP/2.2/... but when ambari starts it complains that it can't find my script at /var/lib/ambari-agent/cache/stacks/HDP/2.2/services/... so I manually created the directory that ambari is looking for and copied the script there, and ambari was able to start our app. Does anyone know how to set up a custom service so that I don't have to manually copy my script to the /var/lib/ambari-agent/cache/... directory?

Many thanks!
The portion of the check_process_status code I was talking about:

  try:
    pid = int(sudo.read_file(pid_file))
  except:
    Logger.debug("Pid file {0} does not exist".format(pid_file))
    raise ComponentIsNotRunning()
  try:
    # Kill will not actually kill the process
    # From the doc:
    # If sig is 0, then no signal is sent, but error checking is still
    # performed; this can be used to check for the existence of a
    # process ID or process group ID.
    os.kill(pid, 0)
  except OSError:
    Logger.debug("Process with pid {0} is not running. Stale pid file"
                 " at {1}".format(pid, pid_file))
    raise ComponentIsNotRunning()

On Fri, May 22, 2015 at 10:35 AM, Tim To <[email protected]<mailto:[email protected]>> wrote:

Hi all,

I am using ambari 2.0 and hadoop 2.4. We have a customer process, and I've been trying to set up monitoring and alerts for it based on what I learned from two popular sites:

The official ambari one: https://cwiki.apache.org/confluence/display/AMBARI/Ambari
and another one that has more detailed explanations: http://mozartanalytics.com/how-to-create-a-software-stack-for-ambari/?preview=true

So far I was able to set it up, and I can deploy our custom service and have ambari start it for me (so I think the stop will work too). However, check status doesn't work. Based on comments from the second site, I'm trying to pass a pid file location to check_process_status(), and then magic should happen and Ambari should be able to tell whether this process is running or not.

Here's my Python function for status:

  def status(self, env):
    print 'Status of Phemi Central';
    check_process_status('/home/testuser/appName/logs/pid-8888')

I manually checked the file after ambari started our app, and it does contain the correct pid for the process, but ambari still thinks the app is "stopped".

- Any pointer as to how check status works and how I am supposed to set it up is appreciated.
- Any more detailed documentation on setting up monitoring and alerts, in addition to the above-mentioned websites, would be greatly helpful (even a confirmation that there is none would save me searching for more :) )
- I also checked out the latest ambari code from github but have a hard time locating where check status is done, so any help with finding the code would also be helpful.

Thanks in advance everyone!!

--
Tim To
Software Engineer
PHEMI Health Systems
180-887 Great Northern Way
Vancouver, BC V5T 4T5
website<http://www.phemi.com/> twitter<https://twitter.com/PHEMISystems> Linkedin<http://www.linkedin.com/company/3561810?trk=tyah&trkInfo=tarId%3A1403279580554%2Ctas%3Aphemi%20hea%2Cidx%3A1-1-1>
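
For anyone who wants to try the pid-file check from this thread outside of Ambari, here is a minimal standalone sketch of the same idea: read a pid from a file, then send signal 0 to see whether that process exists. It is not Ambari's code. The function name check_pid_file is hypothetical, ComponentIsNotRunning is a local stand-in for the exception in the quoted snippet, and a plain open() is assumed in place of sudo.read_file. It assumes a POSIX system, where signal 0 performs only the existence and permission checks.

```python
import os
import tempfile


class ComponentIsNotRunning(Exception):
    """Local stand-in for the exception raised in the quoted snippet."""


def check_pid_file(pid_file):
    """Return the pid stored in pid_file if that process exists.

    Raises ComponentIsNotRunning when the file is missing, unreadable,
    does not contain an integer, or names a pid that cannot be signalled.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        raise ComponentIsNotRunning("cannot read a pid from %s" % pid_file)
    try:
        # Signal 0 delivers nothing; only the error checking is performed.
        os.kill(pid, 0)
    except OSError:
        # Like the quoted snippet, any OSError is treated as "not running",
        # including a permission error for a process owned by another user.
        raise ComponentIsNotRunning("no process with pid %d" % pid)
    return pid


if __name__ == "__main__":
    # Usage: record this interpreter's own pid, then check the file.
    path = os.path.join(tempfile.mkdtemp(), "example.pid")
    with open(path, "w") as handle:
        handle.write(str(os.getpid()))
    print("running with pid %d" % check_pid_file(path))
```

Running a check like this by hand against the real pid file, as the same user the agent runs as, is one way to separate a pid-file problem from an error elsewhere in the status function, which is what turned out to be the cause in this thread.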
