I agree there is a problem. Nice catch! Is there a ticket for this?

The fragment executor is responsible for sending the final state, and in
this case it's waiting forever, making the query hang. In any scenario where
a thread other than the fragment executor is failing (or cancelling) a
fragment, that thread should change the state *and then* interrupt the
fragment executor. There are many ways to get to
*FragmentExecutor.fail()*; it looks like [1] is the scenario you
mentioned, right?

Thank you,
Sudheesh

[1]
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java#L89
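For illustration, here is a minimal, self-contained sketch of that fail-then-interrupt pattern (the class, field, and state names below are hypothetical, not Drill's actual code): a thread blocked on its receiver queue can only report its final state if the failing thread interrupts it *after* changing the state.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Toy model: an "executor" thread blocks on a receiver queue waiting for
// data that will never arrive. A second thread fails the fragment by
// setting the state first, then interrupting the blocked executor so it
// can wake up and report its final state.
public class FailThenInterrupt {
    enum State { RUNNING, FAILED }

    static final AtomicReference<State> state =
        new AtomicReference<>(State.RUNNING);
    static final BlockingQueue<Object> receiver = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws InterruptedException {
        Thread executor = new Thread(() -> {
            try {
                receiver.take(); // blocks: no sender will ever arrive
            } catch (InterruptedException e) {
                // woken by the failing thread; now report the final state
                System.out.println("final state: " + state.get());
            }
        });
        executor.start();

        // The failing thread: change state FIRST, then interrupt.
        state.set(State.FAILED);
        executor.interrupt();

        executor.join();
    }
}
```

Without the `executor.interrupt()` call, `take()` never returns and the executor hangs with its state already set to FAILED, which is exactly the symptom described below.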

On Thu, Apr 7, 2016 at 3:42 PM, Abdel Hakim Deneche <adene...@maprtech.com>
wrote:

> Thanks Sudheesh. So even after we fix DRILL-3714, it's still possible for
> the root fragment to fail without being cancelled.
>
> Take a look at BaseRawBatchBuffer.enqueue() and you will see that, once a
> fragment is in a failed state, this method will release the batch and send
> back an OK ack to the sender.
>
> About your second question: when the UserServer calls
> FragmentExecutor.fail(), it will just set its status to FAILED without
> interrupting the fragment thread. If that thread is blocked in its
> receiver, it will never send its status to the Foreman.
>
> On Thu, Apr 7, 2016 at 10:36 PM, Sudheesh Katkam <sudhe...@apache.org>
> wrote:
>
> > I can answer one question myself. See inline.
> >
> > As you mentioned elsewhere, this issue will rarely happen (and will be
> > even harder to reproduce) once DRILL-3714 is committed.
> >
> > On Thu, Apr 7, 2016 at 11:38 AM, Sudheesh Katkam <sudhe...@apache.org>
> > wrote:
> >
> > > Hakim,
> > >
> > > Can you point me to where [3] happens?
> > >
> > > Two questions:
> > >
> > > + Why is the root fragment blocked? If the user channel is closed, the
> > > query is cancelled [1], which should cancel and interrupt all running
> > > fragments. This interruption happens regardless of the fragment failure
> > > you have pointed out when the user channel is closed [2]. Unless there
> > > is a blocking call when the failure is handled through the
> > > channel-closed listener, I don't see why cancellation is not triggered.
> > >
> >
> > It is possible for the fragment failure to be fully processed before the
> > Foreman cancels all running fragments, in which case the root fragment
> > will not be interrupted (because it is not cancelled; see
> > QueryManager#cancelExecutingFragments).
> >
> >
> > > + Why does the Foreman wait forever? AFAIK, failures are reported
> > > immediately to the user. Is the root fragment not reported as FAILED
> > > to the Foreman?
> > >
> > > Thank you,
> > > Sudheesh
> > >
> > > [1]
> > >
> >
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java#L179
> > > [2]
> > >
> >
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java#L92
> > >
> > > On Thu, Apr 7, 2016 at 6:29 AM, John Omernik <j...@omernik.com> wrote:
> > >
> > >> Abdel -
> > >>
> > >> I think I've seen this on a MapR cluster I run, especially on CTAS.
> > >> For me, I have not brought it up because the cluster I am running on
> > >> has some serious issues (it's hardware that's nearly 7 years old, and
> > >> it's a test cluster), and given the "hard to reproduce" nature of the
> > >> problem, I've been reluctant to create noise. Given what you've
> > >> described, it seems very similar to CTAS hangs I've seen but couldn't
> > >> reliably reproduce.
> > >>
> > >> This doesn't add much to your post, but I wanted to give you a +1 for
> > >> outlining this potential problem. Once I move to more robust hardware
> > >> and am in a similar situation, I will post more verbose details from
> > >> my side.
> > >>
> > >> John
> > >>
> > >>
> > >>
> > >> On Thu, Apr 7, 2016 at 2:29 AM, Abdel Hakim Deneche <
> > >> adene...@maprtech.com>
> > >> wrote:
> > >>
> > >> > So, we've been seeing some queries hang. I've come up with a
> > >> > possible explanation, but so far it's really difficult to reproduce.
> > >> > Let me know if you think this explanation doesn't hold up or if you
> > >> > have any ideas on how we can reproduce it. Thanks.
> > >> >
> > >> > - generally it's a CTAS running on a large cluster (lots of writers
> > >> > running in parallel)
> > >> > - logs show that the user channel was closed and the UserServer
> > >> > caused the root fragment to move to a FAILED state [1]
> > >> > - jstack shows that the root fragment is blocked in its receiver,
> > >> > waiting for data [2]
> > >> > - jstack also shows that ALL other fragments are no longer running,
> > >> > and the logs show that all of them succeeded [3]
> > >> > - the Foreman waits *forever* for the root fragment to finish
> > >> >
> > >> > [1] the only case I can think of is when the user channel closed
> > >> > while the fragment was waiting for an ack from the user client
> > >> > [2] if a writer finishes earlier than the others, it will send a
> > >> > data batch to the root fragment that will be forwarded to the user.
> > >> > The root will then immediately block on its receiver, waiting for
> > >> > the remaining writers to finish
> > >> > [3] once the root fragment moves to a failed state, the receiver
> > >> > will immediately release any received batch and return an OK to the
> > >> > sender without putting the batch in its blocking queue.
> > >> >
> > >> > Abdelhakim Deneche
> > >> >
> > >> > Software Engineer
> > >> >
> > >> >   <http://www.mapr.com/>
> > >>
> > >
> > >
> >
>
