paul-rogers commented on a change in pull request #2000: DRILL-7607: support dynamic credit based flow control URL: https://github.com/apache/drill/pull/2000#discussion_r385997586
########## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/batch/UnlimitedRawBatchBuffer.java ########## @@ -90,14 +100,39 @@ public boolean isEmpty() { @Override public void add(RawFragmentBatch batch) { + int recordCount = batch.getHeader().getDef().getRecordCount(); + long bathByteSize = batch.getByteCount(); + if (recordCount != 0) { + //skip first header batch + totalBatchSize += bathByteSize; + sampleTimes++; + } + if (sampleTimes == maxSampleTimes) { + long averageBathSize = totalBatchSize / sampleTimes; + //make a decision + long limit = context.getAllocator().getLimit(); Review comment: Unless this has been adjusted, the limit is generally meaningless; there is no code that sets this to any reasonable value (unless something has changed recently.) It generally defaults to 10 GB, which is far too large. If we use the default, will each receiver try to use the 10 GB of memory? More than 1 or 2 queries will cause OOM on many installations. To provide a bit more background. Unless things changed recently, if the query queue is off, there is no memory budget per query. But, if queuing is on, there is code that allocates memory to the buffering operators (sort, join, etc.) but not (last I looked) to exchanges. (It clearly should do such an allocation.) So, if we use the full 10GB (or whatever value) here, the query may far exceed the budget and the queuing mechanism will still cause the system to exhaust memory before we hit the query limit. In short, this memory needs to be allocated somehow as part of memory planning. I don't see any code in this PR which will do so. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services