Containers getting killed by application master

2016-05-17 Thread Ananth Gundabattula
Hello All, I was wondering what would cause a container to be killed by the application master? I see the following in the UI when I click on details: "Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit cod
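
For readers decoding the "Exit code is 143" message above: on Linux, an exit status above 128 conventionally means the process was ended by a signal whose number is the status minus 128, so 143 corresponds to signal 15 (SIGTERM). The following is a minimal sketch of that arithmetic. The class name `ExitCodeDecoder` and the small signal table are illustrative assumptions and are not part of YARN or Apex.

```java
import java.util.Map;

// Hypothetical helper (not part of YARN or Apex) that interprets a process exit status.
final class ExitCodeDecoder {
    // A few common Linux signal numbers; deliberately not an exhaustive table.
    private static final Map<Integer, String> SIGNALS = Map.of(
        1, "SIGHUP",
        2, "SIGINT",
        9, "SIGKILL",
        15, "SIGTERM");

    static String describe(int exitCode) {
        if (exitCode == 0) {
            return "exited normally";
        }
        // Shell convention: status = 128 + signal number when a signal ended the process.
        if (exitCode > 128 && exitCode <= 128 + 64) {
            int signal = exitCode - 128;
            String name = SIGNALS.getOrDefault(signal, "unknown signal");
            return "terminated by signal " + signal + " (" + name + ")";
        }
        return "exited with application error code " + exitCode;
    }

    public static void main(String[] args) {
        System.out.println(describe(143)); // prints "terminated by signal 15 (SIGTERM)"
    }
}
```

A SIGTERM here only says that something asked the container to stop. The reason for the request has to come from the application master and container logs.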

Re: Containers getting killed by application master

2016-05-17 Thread Sandeep Deshmukh
Dear Ananth, Could you please check the STRAM logs for any details about these containers? My first guess would be a container running out of memory. Regards, Sandeep On Wed, May 18, 2016 at 10:05 AM, Ananth Gundabattula <agundabatt...@gmail.com> wrote: > Hello All, > > I was wondering what would b

Re: Containers getting killed by application master

2016-05-17 Thread Yogi Devendra
Do you have a custom definePartition() implementation for any of the operators? If yes, would you please share the snippet inside that function? ~ Yogi On 18 May 2016 at 11:35, Ananth Gundabattula wrote: > Hello Sandeep, > > Thanks for the response. Please find attached the app master log. > > I

Re: Containers getting killed by application master

2016-05-17 Thread Bhupesh Chawda
Hi Ananth, It seems you have an exception in the JSON Parser operator due to some illegal characters: org.codehaus.jackson.JsonParseException: Illegal character ((CTRL-CHAR, code 3)): only regular white space (\r, \n, \t) is allowed between tokens at [Source: java.io.StringReader@50590f2d; line: 1
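
One possible way to avoid the "Illegal character (CTRL-CHAR, code 3)" failure quoted above is to strip control characters from the raw record before handing it to the JSON parser. The sketch below uses only the Java standard library. The class name `ControlCharSanitizer` is a hypothetical helper, not something from Malhar or Jackson, and it assumes that silently dropping such characters is acceptable for the data in question.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing step; not part of Malhar or Jackson.
final class ControlCharSanitizer {
    // Matches ASCII control characters except tab, line feed and carriage return,
    // which are the only control characters the quoted exception allows between tokens.
    private static final Pattern ILLEGAL_CONTROL =
        Pattern.compile("[\\p{Cntrl}&&[^\\t\\n\\r]]");

    static String strip(String raw) {
        return ILLEGAL_CONTROL.matcher(raw).replaceAll("");
    }

    public static void main(String[] args) {
        // A record polluted with character code 3, as in the reported exception.
        String dirty = "{" + (char) 3 + "\"a\": 1}";
        System.out.println(strip(dirty)); // prints {"a": 1}
    }
}
```

Stripping changes the input, so logging or counting the affected records before cleaning them may be worthwhile. Catching the parse exception and skipping the bad record is an alternative that keeps the operator alive without altering data.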

Re: Containers getting killed by application master

2016-05-17 Thread Ashwin Chandra Putta
Ananth, The heartbeat timeout means that the operator is not sending back the window heartbeat information to the app master. It usually happens because of one of two reasons. 1. System failure - container died, network failure etc. 2. Windows not moving forward in the operator. Some business log
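
The heartbeat timeout described above can be pictured with a small sketch: a master records the last time each operator reported in and flags any operator whose last report is older than a timeout. This is only an illustration of the general idea under assumed names (`HeartbeatMonitor`, `record`, `timedOut`, and the operator names in the usage note). It is not how STRAM is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; the real application master logic is more involved.
final class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever an operator reports progress; the caller supplies the time.
    void record(String operatorId, long nowMillis) {
        lastSeen.put(operatorId, nowMillis);
    }

    // Returns the operators whose last heartbeat is older than the timeout.
    List<String> timedOut(long nowMillis) {
        List<String> late = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                late.add(entry.getKey());
            }
        }
        return late;
    }
}
```

For example, with a 30 second timeout, an operator named "kafkaInput" that last reported at time 0 would be flagged at time 31000, while a "parser" operator that reported at time 25000 would not. An operator blocked inside its own processing logic stops reporting in the same way a dead container does, which is why both causes produce the same symptom.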

Re: Containers getting killed by application master

2016-05-17 Thread Yogi Devendra
There are some instances of "Heartbeat for unknown operator" in the log. So it looks like the operators are sending heartbeats, but STRAM is not able to identify the operator. In the past, I observed similar behavior when I was trying to define dynamic partitioning for an operator. ~ Yogi

Re: Containers getting killed by application master

2016-05-18 Thread Ananth Gundabattula
Thanks all for the inputs. @Yogi: I do not have any operators that are dynamically partitioned. I have not implemented any definePartition() in any of my operators. @Bhupesh: I am not using the JSON parser operator from Malhar. I do use jackson parser as an instance inside my operator that does s

Re: Containers getting killed by application master

2016-05-18 Thread Bhupesh Chawda
Hi Ananth, Do the containers that are getting killed belong to any specific operator, or are they getting killed randomly? I'd suggest having a look at the operator / container logs. You can also check this using: yarn logs --applicationId <application ID> ~Bhupesh On Wed, May 18, 2016 at 12:22 AM, Ananth Gu

Re: Containers getting killed by application master

2016-05-18 Thread Ananth Gundabattula
Hello Bhupesh, The Kafka operator seems to be the one crashing. I am using the Kafka 0.9 operator from Malhar on a Kafka broker cluster running CDH Kafka 2.x. Attaching the logs of this particular operator for reference. Please note that there is an exception from the Netty driver and I believe