Great finding, Isha.

In general, it is advisable to emit tuples only from the main operator
thread. We had some timing issues in dtIngest because we were emitting
tuples in the Reconciler thread. Once we moved all emit statements to the
main thread, no further issues were observed.

Issue: When tuples are emitted in the Reconciler thread, some of them are
emitted after endWindow but before checkpointing is done. These tuples are
not guaranteed to reach the downstream operator in the same window. The
checkpoints of the two operators are then out of sync, which could result
in a few tuples being replayed wrongly from the Reconciler-based operator.
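
A minimal sketch of the pattern, in plain Java with no Apex dependency. All
names here (MainThreadEmitSketch, onWorkerResult, onWorkerFailure, the
emitted list standing in for an output port) are hypothetical and not part
of the Apex API. The assumption is that the worker thread only queues its
results and records failures, while the main operator thread drains the
queue and rethrows from a window callback such as endWindow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for an operator that has a helper (worker) thread.
class MainThreadEmitSketch {
    // Tuples produced by the worker thread, waiting to be emitted.
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    // First failure seen on the worker thread, if any.
    private final AtomicReference<Throwable> workerError = new AtomicReference<>();
    // Stand-in for an output port: records what the main thread emitted.
    final List<String> emitted = new ArrayList<>();

    // Called on the worker thread: queues the tuple, never emits directly.
    void onWorkerResult(String tuple) {
        pending.add(tuple);
    }

    // Called on the worker thread: remembers only the first failure.
    void onWorkerFailure(Throwable t) {
        workerError.compareAndSet(null, t);
    }

    // Called on the main operator thread, inside the window.
    void endWindow() {
        Throwable t = workerError.get();
        if (t != null) {
            // Rethrow on the main thread so the failure is not lost.
            throw new RuntimeException("worker thread failed", t);
        }
        String tuple;
        while ((tuple = pending.poll()) != null) {
            emitted.add(tuple); // a real operator would emit on its port here
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MainThreadEmitSketch op = new MainThreadEmitSketch();
        Thread worker = new Thread(() -> {
            op.onWorkerResult("a");
            op.onWorkerResult("b");
        });
        worker.start();
        worker.join();
        op.endWindow();
        System.out.println(op.emitted);
    }
}
```

Because the rethrow happens in a callback that runs in every window, it does
not depend on handleIdleTime being invoked.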

Regards,
Sandeep

On Wed, Mar 2, 2016 at 8:57 AM, Isha Arkatkar <[email protected]> wrote:

> Hi,
>
>   I checked the application https://github.com/chaithu14/AppThreadLocal
>
>   In this example, the exception from the downstream operator is thrown in
> a different thread in the AbstractReconciler operator, and the rethrow to
> the main operator thread is done in handleIdleTime. This function is not
> guaranteed to be invoked in every window. With THREAD_LOCAL locality, I
> saw that handleIdleTime was not invoked, so the exception was never
> rethrown.
>
>   Exceptions thrown from a thread other than the main operator thread are
> not caught by the Application Master. We can probably add a note to the
> troubleshooting guide recommending a rethrow in the main thread.
>
>   I verified that if the downstream operator throws the exception in the
> main thread, it is caught appropriately by the Application Master, even in
> the thread-local case.
>
> Thanks,
> Isha
>
> On Thu, Feb 25, 2016 at 9:57 PM, Chaitanya Chebolu <
> [email protected]> wrote:
>
> > Hi All,
> >
> >   I created a sample application for the THREAD_LOCAL issue. The
> > application is here <https://github.com/chaithu14/AppThreadLocal>.
> >   The application has the following DAG:
> >
> >                 RandomEventGenerator -> OutputOperator.
> >
> > Both the operators are THREAD_LOCAL.
> >
> >   OutputOperator throws an exception at every committed window, so the
> > AppMaster is supposed to kill the container at every committed window.
> > This is the expected behavior.
> >   But this is not happening with the current Apex.
> >
> >   One more observation: if the upstream operator throws an exception at
> > every committed window, then the AppMaster kills the container
> > continuously. But this does not happen with the downstream operator.
> >
> >   I created a JIRA for this issue: APEXCORE-357
> >
> > Regards,
> > Chaitanya
> >
> > On Thu, Feb 25, 2016 at 12:36 PM, Chaitanya Chebolu <
> > [email protected]> wrote:
> >
> > > Hi ,
> > >
> > >   I am facing issues with THREAD_LOCAL. I have two operators that are
> > > thread local, and the downstream one throws exceptions. But the
> > > AppMaster is not catching those exceptions. I was unable to figure out
> > > why the application is not working.
> > >   If the two operators are deployed in different containers, then the
> > > container is killed continuously by the AppMaster. This is the expected
> > > behavior.
> > >
> > >   For example, let the DAG be op1 -> op2, where op1 and op2 are both
> > > thread local. When the downstream operator op2 throws an exception,
> > > the AppMaster does not catch it. I will create a JIRA for this issue.
> > > Could someone please help with this?
> > >
> > > Regards,
> > > Chaitanya
> > >
> >
>
