To retain the container logs, open the YARN config panels in Ambari and search for "debug". You're looking for a setting that I believe controls how long a container's files are retained after it terminates and that is currently set to 0 (I can't recall the exact name). Set it to something like 3600 and, once you have let Ambari restart the YARN components, you should be able to view the logs after the container terminates.
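
If memory serves, the property is probably yarn.nodemanager.delete.debug-delay-sec, which is given in seconds and defaults to 0. Treat that name as my best guess and confirm it against what the Ambari search actually shows for your YARN version. If you were editing yarn-site.xml by hand instead of going through Ambari, the entry would look roughly like this:

```xml
<!-- yarn-site.xml fragment; the property name is an assumption, verify it before applying -->
<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <!-- seconds to keep a finished application's local and log directories -->
  <value>3600</value>
</property>
```

The value 3600 is just an example that keeps the directories around for an hour; pick whatever gives you enough time to look at the logs, and set it back to 0 afterwards so the disks don't fill up.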

> On Mar 24, 2017, at 5:03 AM, Gour Saha <gs...@hortonworks.com> wrote:
>
> That is not from the AM log. It is from the agent log. It will be hard to
> say what's causing the failure without some snippet of the AM log. Does
> your script/application emit any log of its own into any log file? Then I
> would look there too.
>
> -Gour
>
>> On 3/23/17, 6:25 PM, "David.Serafini" <david.seraf...@target.com> wrote:
>>
>> Can anyone tell me what this error means and whether it is significant?
>> I have a slider job that seems to randomly fail, and I don't see anything
>> interesting in the AppMaster logs except this. (That doesn't mean there
>> isn't an error elsewhere: yarn is wiping out the job directories as soon
>> as the container terminates: I haven't figured out how to fix that).
>>
>> In case it matters, my job is a shell script specified in metainfo.json
>> in application.components.commands.exec. The script does some setup
>> and then runs tomcat.
>>
>> thanks in advance,
>> david
>>
>> Connecting to the server at
>> https://brdn1088.target.com:42721/ws/v1/slider/agents/...
>> Registered with the server
>> Traceback (most recent call last):
>>   File "./infra/agent/slider-agent/agent/main.py", line 318, in <module>
>>     main()
>>   File "./infra/agent/slider-agent/agent/main.py", line 311, in main
>>     controller.join(timeout=1.0)
>>   File "/usr/lib64/python2.6/threading.py", line 655, in join
>>     self.__block.wait(delay)
>>   File "/usr/lib64/python2.6/threading.py", line 258, in wait
>>     _sleep(delay)
>>   File "./infra/agent/slider-agent/agent/main.py", line 66, in signal_handler
>>     controller.actionQueue.execute_command(controller.stopCommand)
>>   File "/grid/4/hadoop/yarn/local/usercache/Z002JSF/appcache/application_1490038663882_9176/filecache/11/slider-agent.tar.gz/slider-agent/agent/ActionQueue.py", line 164, in execute_command
>>     if ActionQueue.STORE_APPLIED_CONFIG in command['commandParams']:
>> KeyError: 'commandParams'