Venkatesh,

Can you please check the AM log for messages containing "heartbeat timeout"?

That would be a condition under which the container gets killed and where
you won't find any exceptions or messages in the container log.
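
Once you have access to the AM log, the check can be sketched in a few lines of Java if a plain text search is not convenient. This is only an illustration: the class name HeartbeatLogScan is hypothetical, it assumes you pass the path of a locally saved copy of the AM log as the first argument, and the match ignores case because the exact wording of the message may vary.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Apex or YARN.
class HeartbeatLogScan {
    // Returns the lines that contain the phrase, ignoring case.
    static List<String> matching(List<String> lines, String phrase) {
        String needle = phrase.toLowerCase();
        return lines.stream()
                .filter(line -> line.toLowerCase().contains(needle))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: HeartbeatLogScan <path-to-saved-AM-log>");
            return;
        }
        // Assumption: the log was saved locally as a text file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        matching(lines, "heartbeat timeout").forEach(System.out::println);
    }
}
```

If this prints nothing, the kill probably has a different cause and the memory settings discussed below are the next thing to look at.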

Thanks,
Thomas


On Wed, Jan 20, 2016 at 4:44 PM, Ashwin Chandra Putta <
[email protected]> wrote:

> Venkatesh,
>
> I think the code in endWindow might be blocking. Can you try putting a log
> statement on the first line of endWindow?
>
> If that log in endWindow is not printed, then it is possible that one of the
> processTuple calls is blocked even though the previous tuples have been
> processed. You might want to put one log at the start of processTuple and
> one right before it returns, just to ensure that each processTuple call
> completes.
>
> If you are implementing IdleTimeHandler, then the handleIdleTime method
> call behaves similarly to processTuple.
>
> Just to give a little more detail on how it works: all the operator method
> calls happen in the same thread. For every window, beginWindow is called,
> followed by a loop of processTuple calls for all tuples within the window,
> followed by an endWindow call. In your case, it seems that beginWindow is
> called and you also see a series of processTuple calls. We are not sure
> whether all the processTuple calls complete, or whether endWindow is
> reached.
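
The callback order described here, together with the entry and exit logging suggested earlier in this message, can be sketched as follows. WindowedOperator, LoggingOperator and WindowLoopSketch are hypothetical stand-ins written for this illustration, not Apex classes, and the driver loop only imitates the order in which the operator is called on a single thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an operator's window callbacks (not the Apex API).
interface WindowedOperator<T> {
    void beginWindow(long windowId);
    void processTuple(T tuple);
    void endWindow();
}

// Records an entry at the start and end of each callback, so a missing
// "exit" entry points at the call that never returned.
class LoggingOperator implements WindowedOperator<String> {
    final List<String> log = new ArrayList<>();

    public void beginWindow(long windowId) {
        log.add("beginWindow " + windowId);
    }

    public void processTuple(String tuple) {
        log.add("processTuple enter " + tuple);
        // Business logic would run here; if it blocks, "exit" is never logged.
        log.add("processTuple exit " + tuple);
    }

    public void endWindow() {
        log.add("endWindow enter");
        // Grouping logic would run here.
        log.add("endWindow exit");
    }
}

class WindowLoopSketch {
    // Imitates one window: every call runs sequentially on the calling thread.
    static <T> void runWindow(WindowedOperator<T> op, long windowId, List<T> tuples) {
        op.beginWindow(windowId);
        for (T tuple : tuples) {
            op.processTuple(tuple);
        }
        op.endWindow();
    }

    public static void main(String[] args) {
        LoggingOperator op = new LoggingOperator();
        runWindow(op, 1L, Arrays.asList("a", "b"));
        op.log.forEach(System.out::println);
    }
}
```

Because everything runs on one thread, the last entry in such a log shows which callback is holding up the window.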
>
> Regards,
> Ashwin.
>
> On Wed, Jan 20, 2016 at 4:32 PM, Kottapalli, Venkatesh <
> [email protected]> wrote:
>
> > Yes Ashwin, the window id isn’t moving forward; the current window id for
> > the operator is "-". In the operator, I see the processing in
> > "processTuple" completing for incoming tuples, but "endWindow" is not
> > being called.
> >
> > -----Original Message-----
> > From: Ashwin Chandra Putta [mailto:[email protected]]
> > Sent: Wednesday, January 20, 2016 3:57 PM
> > To: [email protected]
> > Subject: Re: Reg container getting killed without throwing exceptions
> >
> > Venkatesh,
> >
> > If you do not see the window id moving forward, it usually means that the
> > business logic is blocking the operator. Please check if window id is
> > moving forward.
> >
> > Regards,
> > Ashwin.
> >
> > On Wed, Jan 20, 2016 at 3:37 PM, Gaurav Gupta <[email protected]>
> > wrote:
> >
> > > Venkatesh,
> > >
> > > I think the actual issue is that the operator is getting blocked. As you
> > > mentioned, the operator is taking too long to process and is not
> > > showing any processed or emitted tuples. The AM is not getting any
> > > heartbeat from the operator, so it kills the container.
> > >
> > > Thanks
> > > - Gaurav
> > >
> > > > On Jan 20, 2016, at 3:17 PM, Kottapalli, Venkatesh
> > > <[email protected]> wrote:
> > > >
> > > > Thanks for your inputs Gaurav and Tim.
> > > >
> > > > When it is OOM, I see it in the container logs, but in this case I
> > > > don’t find any.
> > > >
> > > > I see the processing part in the operator running and printing logs
> > > > without any issues end to end, but it is not reaching the end window.
> > > > It might be that the grouping logic we have added in the end window is
> > > > causing an OOM, but the container logs don’t show it.
> > > >
> > > > The operator is taking long to process. Total processed and emitted
> > > > by that operator is always 0.
> > > >
> > > > I shall try to increase memory on the Application Master and the
> > > > container as well and see if it works; otherwise I will try a smaller
> > > > load and see if it is a scaling issue because of OOM.
> > > >
> > > > Right now, I don’t have access to the AM logs.
> > > >
> > > >
> > > > Regards,
> > > > Venkatesh.
> > > >
> > > > -----Original Message-----
> > > > From: Timothy Farkas [mailto:[email protected]]
> > > > Sent: Wednesday, January 20, 2016 3:11 PM
> > > > To: [email protected]
> > > > Subject: Re: Reg container getting killed without throwing
> > > > exceptions
> > > >
> > > > Hey Venkatesh,
> > > >
> > > > How much memory is allocated to the App Master? You should allocate
> > > > at least 2GB to the App Master with this property.
> > > >
> > > >
> > > >  <property>
> > > >    <name>dt.attr.MASTER_MEMORY_MB</name>
> > > >    <value>2048</value>
> > > >  </property>
> > > >
> > > > Otherwise the App Master may die suddenly without printing anything
> > > > to the logs.
> > > >
> > > > Thanks,
> > > > Tim
> > > >
> > > > On Wed, Jan 20, 2016 at 2:47 PM, Gaurav Gupta
> > > > <[email protected]>
> > > > wrote:
> > > >
> > > >> Venkatesh,
> > > >>
> > > >> Did you see any OOM exception? It would be good to see the AM logs
> > > >> and container logs to find out more.
> > > >>
> > > >> Thanks
> > > >> - Gaurav
> > > >>
> > > >>> On Jan 20, 2016, at 2:42 PM, Kottapalli, Venkatesh <
> > > >> [email protected]> wrote:
> > > >>>
> > > >>> Hi,
> > > >>>
> > > > >>> I get the following message when the container is getting killed.
> > > > >>> I don't find logs for any exceptions being thrown. How do we
> > > > >>> identify the root cause for this issue?
> > > >>> Sorry for being very abstract.
> > > >>>
> > > > >>> Container killed by the ApplicationMaster.
> > > > >>> Container killed on request. Exit code is 143
> > > > >>> Container exited with a non-zero exit code 143
> > > >>>
> > > >>> Regards,
> > > >>> Venkatesh.
> > > >>
> > > >>
> > >
> > >
> >
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>
>
>
> --
>
> Regards,
> Ashwin.
>
