zhijiangW commented on a change in pull request #7911: [FLINK-11082][network] Fix the logic of getting backlog in sub partition
URL: https://github.com/apache/flink/pull/7911#discussion_r264508671
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/ResultSubpartition.java
 ##########
 @@ -116,52 +115,58 @@ protected Throwable getFailureCause() {
 
        public abstract boolean isReleased();
 
-       /**
-        * Gets the number of non-event buffers in this subpartition.
-        *
-        * <p><strong>Beware:</strong> This method should only be used in tests in non-concurrent access
-        * scenarios since it does not make any concurrency guarantees.
-        */
-       @VisibleForTesting
-       public int getBuffersInBacklog() {
-               return buffersInBacklog;
-       }
-
        /**
         * Makes a best effort to get the current size of the queue.
         * This method must not acquire locks or interfere with the task and network threads in
         * any way.
         */
        public abstract int unsynchronizedGetNumberOfQueuedBuffers();
 
+       /**
+        * Gets the number of non-event buffers in this subpartition.
+        */
+       public abstract int getBuffersInBacklog();
+
+       /**
+        * @param lastBufferAvailable whether the last buffer in this subpartition is available for consumption
+        * @return the number of non-event buffers in this subpartition
+        */
+       protected int getBuffersInBacklog(boolean lastBufferAvailable) {
 
 Review comment:
   That is a very good question. The variable `buffersInBacklog` counts only
the non-event buffers in the queue, whereas `getNumberOfFinishedBuffers` and
`isAvailableUnsafe` count both event and non-event buffers, so they cannot be
unified directly. I also tried counting only event buffers instead of the
current `buffersInBacklog`, but that does not simplify the logic either. In
particular, for `SpillableSubpartition` we cannot get the total number of
buffers directly from the in-memory queue, so we would still need to count
non-event buffers even if we counted the event buffers.
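
   To illustrate the difference between the two kinds of counters, here is a
minimal sketch. It is not the actual Flink code: `BacklogSketch` and its
methods are hypothetical names, and a plain boolean stands in for a buffer.
It only shows that a total queue size and a non-event counter diverge as soon
as an event is enqueued, which is why one cannot simply be derived from the
other.

```java
import java.util.ArrayDeque;

/** Hypothetical stand-in for a subpartition queue; not Flink's actual class. */
final class BacklogSketch {
    /** Queue entries: true marks an event, false marks a data (non-event) buffer. */
    private final ArrayDeque<Boolean> queue = new ArrayDeque<>();

    /** Counts non-event buffers only, mirroring the role of buffersInBacklog. */
    private int buffersInBacklog;

    /** Enqueues one entry and updates the backlog only for data buffers. */
    void add(boolean isEvent) {
        queue.add(isEvent);
        if (!isEvent) {
            buffersInBacklog++;
        }
    }

    /** Dequeues one entry (null if empty) and decrements the backlog for data buffers. */
    Boolean poll() {
        Boolean isEvent = queue.poll();
        if (isEvent != null && !isEvent) {
            buffersInBacklog--;
        }
        return isEvent;
    }

    /** Total entries, events included. */
    int getTotalQueued() {
        return queue.size();
    }

    /** Non-event entries only. */
    int getBuffersInBacklog() {
        return buffersInBacklog;
    }
}
```

   In this sketch the total is read from the queue itself, which is only
possible while everything is held in memory. That is the part that does not
carry over to `SpillableSubpartition`.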
   
   I also find that the current conditions are somewhat complicated to
maintain across the different code paths, such as `shouldNotifyDataAvailable`,
`isAvailable`, `getBuffersInBacklog`, etc. It would be better if we could
unify the related conditions. Once we have a good idea for this, I would be
glad to make the changes. :)
