Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1105#discussion_r173542042
  
    --- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 ---
    @@ -227,11 +280,19 @@ public void run() {
             @Override
             public Void run() throws Exception {
               injector.injectChecked(fragmentContext.getExecutionControls(), 
"fragment-execution", IOException.class);
    -          /*
    -           * Run the query until root.next returns false OR we no longer 
need to continue.
    -           */
    -          while (shouldContinue() && root.next()) {
    -            // loop
    +
    +          while (shouldContinue()) {
    +            // Fragment is not cancelled
    +
    +            for (FragmentHandle fragmentHandle; (fragmentHandle = 
receiverFinishedQueue.poll()) != null;) {
    --- End diff --
    
    The original code allowed 
PartitionSenderRootExec#receivingFragmentFinished() to be called before, after 
or concurrently with PartitionSenderRootExec#innerNext(). The idea was that 
PartitionSenderRootExec#receivingFragmentFinished would terminate a partition. 
The terminate method of OutgoingRecordBatch would cause all outgoing records to 
a finished receiver to be dropped. So the finished downstream receivers would 
stop receiving data from the PartitionSenderRootExec, but the downstream 
receivers that did not finish would continue to receive records. There is a 
gray area here because I'm not sure why only some of the downstream receivers 
would send a RECEIVER_FINISHED message and others wouldn't, but the original 
design seems to make an assumption that this is a very common thing and is 
optimized for it. So I assume the original authors know something that we don't 
with respect to that.
    
    Also calling receivingFragmentFinished after we finished processing all the 
data would defeat the purpose, since the intention was to allow our fragment to 
terminate early if the downstream receivers decided they don't need anymore 
data (ex. A select limit query which only asks the first 100 rows of a result 
with 10 million rows). If we called it only after the next() loop was done we 
would always process all upstream records even when we didn't have to and we 
would see significant degradation in the performance of limit queries.
    



---

Reply via email to