Venkatesh,

Can you please check the AM log for messages containing "heartbeat timeout"?
That would be a condition under which the container gets killed and where
you won't find any exceptions or messages in the container log.

Thanks,
Thomas

On Wed, Jan 20, 2016 at 4:44 PM, Ashwin Chandra Putta <[email protected]> wrote:

> Venkatesh,
>
> I am thinking the code in the end window might be blocking. Can you try
> putting a log at the first line in end window?
>
> If you see that this log in end window is not printed, then it is possible
> that one of the process tuple calls is blocked although the previous
> tuples have been processed. You might want to put one log at the start of
> process tuple and one log right before completing the process tuple, just
> to ensure that each process tuple call is completed.
>
> If you are implementing IdleTimeHandler, then the handleIdleTime method
> call will behave similarly to processTuple.
>
> Just to give a little more detail on how it works: all the operator method
> calls happen in the same thread. For every window, begin window is called,
> followed by a loop of processTuple calls for all tuples within the window,
> followed by the endWindow call. In your case, it seems like begin window
> is called. You also see a series of process tuple calls. We are not sure
> if all the process tuple calls are completed or not, and if endWindow is
> reached.
>
> Regards,
> Ashwin.
>
> On Wed, Jan 20, 2016 at 4:32 PM, Kottapalli, Venkatesh <
> [email protected]> wrote:
>
> > Yes Ashwin, the window id isn’t moving forward, the current window id for
> > the operator is "-". In the operator, I see the processing part in
> > "processTuple" getting completed for incoming tuples but not calling
> > "endWindow".
> >
> > -----Original Message-----
> > From: Ashwin Chandra Putta [mailto:[email protected]]
> > Sent: Wednesday, January 20, 2016 3:57 PM
> > To: [email protected]
> > Subject: Re: Reg container getting killed without throwing exceptions
> >
> > Venkatesh,
> >
> > If you do not see the window id moving forward, it usually means that the
> > business logic is blocking the operator. Please check if window id is
> > moving forward.
> >
> > Regards,
> > Ashwin.
> >
> > On Wed, Jan 20, 2016 at 3:37 PM, Gaurav Gupta <[email protected]>
> > wrote:
> >
> > > Venkatesh,
> > >
> > > I think the actual issue is that the operator is getting blocked, as
> > > you mentioned that the operator is taking too long to process and it
> > > is not showing any processed and emitted tuples. The AM is not getting
> > > any heartbeat from the operator, so it kills it.
> > >
> > > Thanks
> > > - Gaurav
> > >
> > > > On Jan 20, 2016, at 3:17 PM, Kottapalli, Venkatesh
> > > > <[email protected]> wrote:
> > > >
> > > > Thanks for your inputs Gaurav and Tim.
> > > >
> > > > When it is OOM, I see it in the container logs but in this case I
> > > > don’t find any.
> > > >
> > > > I see the processing part in the operator running and printing logs
> > > > without any issues end to end but not reaching the end window. It
> > > > might be because of the grouping logic that we have added in the end
> > > > window that is causing OOM but the container logs don’t show it.
> > > >
> > > > The operator is taking long to process. Total processed and emitted
> > > > by that operator is always 0.
> > > >
> > > > I shall try to increase memory on the Application Master and the
> > > > container as well and see if it works, else I will try on a smaller
> > > > load and see if it is a scaling issue because of OOM.
> > > >
> > > > Right now, I don’t have access to the AM logs.
> > > >
> > > > Regards,
> > > > Venkatesh.
> > > >
> > > > -----Original Message-----
> > > > From: Timothy Farkas [mailto:[email protected]]
> > > > Sent: Wednesday, January 20, 2016 3:11 PM
> > > > To: [email protected]
> > > > Subject: Re: Reg container getting killed without throwing
> > > > exceptions
> > > >
> > > > Hey Venkatesh,
> > > >
> > > > How much memory is allocated to the App Master? You should allocate
> > > > at least 2GB to the App Master with this property.
> > > >
> > > > <property>
> > > >   <name>dt.attr.MASTER_MEMORY_MB</name>
> > > >   <value>2048</value>
> > > > </property>
> > > >
> > > > Otherwise the App Master may die suddenly without printing anything
> > > > to logs.
> > > >
> > > > Thanks,
> > > > Tim
> > > >
> > > > On Wed, Jan 20, 2016 at 2:47 PM, Gaurav Gupta
> > > > <[email protected]> wrote:
> > > >
> > > >> Venkatesh,
> > > >>
> > > >> Did you see any OOM exception? It would be good to see the AM logs
> > > >> and container logs to find out more.
> > > >>
> > > >> Thanks
> > > >> - Gaurav
> > > >>
> > > >>> On Jan 20, 2016, at 2:42 PM, Kottapalli, Venkatesh <
> > > >>> [email protected]> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> I get the following message when the container is getting
> > > >>> killed. I don't find logs for any exceptions being thrown. How do
> > > >>> we identify the root cause for this issue?
> > > >>> Sorry for being very abstract.
> > > >>>
> > > >>> Container killed by the ApplicationMaster.
> > > >>> Container killed on request. Exit code is 143
> > > >>> Container exited with a non-zero exit code 143
> > > >>>
> > > >>> Regards,
> > > >>> Venkatesh.
> >
> > --
> > Regards,
> > Ashwin.
>
> --
> Regards,
> Ashwin.
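
For anyone following the thread, here is a minimal sketch of the logging Ashwin suggests: one marker on entry and one on exit of each callback, so the last marker printed shows which call never returned. It is a plain-Java simulation, not code against the real Apex API. The names WindowLoopSketch, SketchOperator, TracingOperator and runWindow are made up for illustration, and the grouping step is only a placeholder for whatever the real endWindow does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WindowLoopSketch {

    /** Hypothetical stand-in for an operator; NOT the real Apex Operator interface. */
    public interface SketchOperator {
        void beginWindow(long windowId);
        void processTuple(String tuple);
        void endWindow();
    }

    /** Records an entry and an exit marker for every lifecycle call. */
    public static class TracingOperator implements SketchOperator {
        public final List<String> trace = new ArrayList<>();
        private final List<String> group = new ArrayList<>();

        @Override
        public void beginWindow(long windowId) {
            trace.add("beginWindow " + windowId);
            group.clear();
        }

        @Override
        public void processTuple(String tuple) {
            trace.add("processTuple enter " + tuple);
            group.add(tuple); // placeholder for the real per-tuple work
            trace.add("processTuple exit " + tuple);
        }

        @Override
        public void endWindow() {
            trace.add("endWindow enter");
            // Placeholder: the real grouping logic would run here. If it blocks
            // or is very slow, the exit marker below never shows up in the log.
            trace.add("endWindow exit size=" + group.size());
        }
    }

    /** Drives one window on the calling thread: begin, every tuple, then end. */
    public static void runWindow(SketchOperator op, long windowId, List<String> tuples) {
        op.beginWindow(windowId);
        for (String tuple : tuples) {
            op.processTuple(tuple);
        }
        op.endWindow();
    }

    public static void main(String[] args) {
        TracingOperator op = new TracingOperator();
        runWindow(op, 1L, Arrays.asList("a", "b"));
        for (String line : op.trace) {
            System.out.println(line);
        }
    }
}
```

In a real operator the trace.add calls would be logger statements. Because all callbacks run on one thread, an "enter" marker with no matching "exit" points at the call that is holding up the window.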
