sachouche commented on a change in pull request #1470: DRILL-6746: Query can hang when PartitionSender task thread sees a co…
URL: https://github.com/apache/drill/pull/1470#discussion_r219619807
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/BaseRawBatchBuffer.java
 ##########
 @@ -167,7 +169,25 @@ public RawFragmentBatch getNext() throws IOException {
 
       // if we didn't get a batch, block on waiting for queue.
       if (b == null && (!isTerminated() || !bufferQueue.isEmpty())) {
-        b = bufferQueue.take();
+        // We shouldn't block infinitely here. There can be a condition such that due to a failure FragmentExecutor
+        // state is changed to FAILED and queue is empty. Because of this the minor fragment main thread will block
+        // here waiting for next batch to arrive. Meanwhile when next batch arrived and was enqueued it sees
+        // FragmentExecutor failure state and doesn't enqueue the batch and cleans up the buffer queue. Hence this
+        // thread will stuck forever. So we pool for 5 seconds until we get a batch or FragmentExecutor state is in
+        // error condition.
+        while (b == null) {
+          b = bufferQueue.poll(5, TimeUnit.SECONDS);
+          if (!context.getExecutorState().shouldContinue()) {
+            kill(context);
+            if (b != null) {
+              assertAckSent(b);
 
 Review comment:
   Can this assert fail while the minor fragment is being closed? My point is that assertions should be avoided on the cleanup path.
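
   To illustrate the concern, here is a minimal, self-contained sketch of a bounded poll loop that releases a late batch on the cleanup path instead of asserting on its state. It is not Drill code: `BoundedPollSketch`, `Batch`, `release()` and the `AtomicBoolean` flag are hypothetical stand-ins for `BaseRawBatchBuffer`, `RawFragmentBatch` and `context.getExecutorState().shouldContinue()`, and treating interruption as a stop signal is an assumption made here.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class BoundedPollSketch {

  /** Hypothetical stand-in for a batch that owns buffers; not Drill's RawFragmentBatch. */
  public static final class Batch {
    private boolean released;
    public void release() { released = true; }
    public boolean isReleased() { return released; }
  }

  /**
   * Waits for the next batch with a timed poll and re-checks the continue flag
   * after every poll, so the caller cannot block forever on an empty queue.
   * Returns null once the flag is false or the thread is interrupted.
   */
  public static Batch nextBatch(BlockingQueue<Batch> queue, AtomicBoolean shouldContinue,
                                long timeout, TimeUnit unit) {
    Batch b = null;
    while (b == null) {
      try {
        b = queue.poll(timeout, unit);
      } catch (InterruptedException e) {
        // Assumption: interruption means "stop"; restore the flag for the caller.
        Thread.currentThread().interrupt();
        return null;
      }
      if (!shouldContinue.get()) {
        // Cleanup path: free whatever arrived, but do not assert on its state,
        // because other cleanup code may be running concurrently.
        if (b != null) {
          b.release();
        }
        return null;
      }
    }
    return b;
  }

  public static void main(String[] args) {
    BlockingQueue<Batch> queue = new LinkedBlockingQueue<>();
    AtomicBoolean shouldContinue = new AtomicBoolean(true);

    // Normal path: the queued batch is handed back untouched.
    Batch first = new Batch();
    queue.add(first);
    System.out.println(nextBatch(queue, shouldContinue, 50, TimeUnit.MILLISECONDS) == first);

    // Failure path: the flag is cleared, so the late batch is released and null is returned.
    Batch late = new Batch();
    queue.add(late);
    shouldContinue.set(false);
    System.out.println(nextBatch(queue, shouldContinue, 50, TimeUnit.MILLISECONDS) == null);
    System.out.println(late.isReleased());
  }
}
```

   With this shape, the loop exits within one poll interval after the state changes, and the cleanup branch cannot throw an `AssertionError` while the fragment is shutting down.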

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
