Steven Phillips created DRILL-1595:
--------------------------------------
Summary: Initialize intermediate fragments before leaf fragments
to avoid race condition
Key: DRILL-1595
URL: https://issues.apache.org/jira/browse/DRILL-1595
Project: Apache Drill
Issue Type: Bug
Reporter: Steven Phillips
Prior to DRILL-1436, PlanFragments for intermediate fragments were stored in
the Distributed Cache, and then PlanFragments for leaf fragments were sent.
This ensured that the intermediate plan fragments were available once data
started being sent from leaf fragments.
With DRILL-1436, all PlanFragments are sent out at once. It is possible for
data from a leaf fragment to arrive before the PlanFragment for an intermediate
fragment is available. In this case, we currently wait for it to become
available. The problem is that this results in the rpc thread blocking.
It appears that this is causing bit-to-bit connection timeouts. DRILL-1583 is
meant to address this, but the current patch needs to be improved, as it
currently introduces new bugs. As a possible interim fix, I propose the much
simpler solution of first sending the intermediate fragments, and then sending
the leaf fragments only once all of the intermediate fragments have been sent
and acked. This will completely eliminate the race condition and won't require
extra threads or any blocking in the rpc threads.
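
To illustrate the proposed ordering, here is a minimal sketch in plain Java. It
is not Drill code: the Fragment class, the sendAll method, and the send
function (assumed to be an asynchronous RPC whose future completes when the
remote node acks) are all hypothetical names chosen for this example. The point
is only that the leaf batch is chained onto the completion of every
intermediate ack, so no thread has to block while waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class TwoPhaseFragmentSender {

  /** Hypothetical stand-in for a plan fragment; not Drill's PlanFragment. */
  public static final class Fragment {
    public final String id;
    public final boolean leaf;

    public Fragment(String id, boolean leaf) {
      this.id = id;
      this.leaf = leaf;
    }
  }

  /**
   * Sends every intermediate fragment first, and starts sending leaf fragments
   * only after all intermediate sends have been acked.
   *
   * The send function is an assumption: it models an async RPC and returns a
   * future that completes when the receiving node acknowledges the fragment.
   */
  public static CompletableFuture<Void> sendAll(
      List<Fragment> fragments,
      Function<Fragment, CompletableFuture<Void>> send) {
    List<Fragment> intermediate = new ArrayList<>();
    List<Fragment> leaves = new ArrayList<>();
    for (Fragment f : fragments) {
      (f.leaf ? leaves : intermediate).add(f);
    }
    // Phase 1: intermediate fragments. Phase 2 is chained as a callback, so
    // the calling thread never waits on the acks.
    return sendBatch(intermediate, send)
        .thenCompose(ignored -> sendBatch(leaves, send));
  }

  /** Sends one batch and returns a future that completes when all are acked. */
  private static CompletableFuture<Void> sendBatch(
      List<Fragment> batch,
      Function<Fragment, CompletableFuture<Void>> send) {
    CompletableFuture<?>[] acks =
        batch.stream().map(send).toArray(CompletableFuture[]::new);
    return CompletableFuture.allOf(acks);
  }
}
```

In this sketch the leaf sends run on whichever thread delivers the last
intermediate ack. A real implementation would also need to decide what to do
when an intermediate send fails; here the failure simply propagates to the
returned future and no leaf fragment is sent.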
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)