junrushao commented on PR #16111:
URL: https://github.com/apache/tvm/pull/16111#issuecomment-1807465481

   Thanks for your response. To summarize, at a high level, we both agree 
that fragmentation is an issue to be resolved, but we differ on which 
approaches we believe are effective and sustainable.
   
   >  I think it is acceptable to make use of the upper-bound strategy, due to 
the fact that we are making heavy use of the pool allocator
   
   I believe the point you want to make here is that relying on the pool 
allocator, as you have pointed out repeatedly in this thread, could alleviate 
or address the issue of over-allocation with unknown lifetime. This, however, 
is not the case, because such allocations differ from temporary intermediate 
buffers.
   
   To better help you understand why, let me walk you through the example I 
gave in the previous response, pasted below:
   
   ```
   1  std::vector<NDArray> outputs;
   2  for (int i = 0; i < 1024; ++i) {
   3    NDArray result = mod["main"](...);  // size = 1k, but storage = 128k
   4    outputs.push_back(result);
   5  }
   6  this->outputs = outputs;
   ```
   
   As you may already tell, the over-allocation that happens on Line 3 is 
carried into the vector `outputs`. Whatever the implementation of the 
underlying allocator, it does not control when the vector `outputs` on Line 6 
is recycled. This means the RAM over-usage is propagated to, and held by, the 
caller rather than contained within RelaxVM.
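   
   To make the accounting concrete, here is a minimal sketch of the same 
pattern. It does not use the TVM runtime: `ToyPool`, `Buffer`, `PoolStats`, 
`DemoResult` and `RunDemo` are hypothetical names introduced only for 
illustration, and the pool merely tracks byte counts instead of allocating 
memory. It assumes an upper-bound strategy that rounds every 1k request up to 
128k of storage, as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bookkeeping shared between the pool and its outstanding handles.
struct PoolStats {
  std::size_t live_storage_bytes = 0;  // storage handed out and not yet released
};

// Hypothetical stand-in for an NDArray: logical size vs. backing storage size.
struct Buffer {
  std::size_t logical_bytes;
  std::size_t storage_bytes;
};

// Toy pool (an assumption, not the RelaxVM allocator): every request is
// charged the full upper bound, and storage is released only when the last
// handle to the buffer is destroyed.
class ToyPool {
 public:
  explicit ToyPool(std::size_t upper_bound)
      : upper_bound_(upper_bound), stats_(std::make_shared<PoolStats>()) {}

  std::shared_ptr<Buffer> Alloc(std::size_t logical_bytes) {
    assert(logical_bytes <= upper_bound_);
    stats_->live_storage_bytes += upper_bound_;
    std::shared_ptr<PoolStats> stats = stats_;
    return std::shared_ptr<Buffer>(
        new Buffer{logical_bytes, upper_bound_},
        [stats](Buffer* b) {
          // The pool gets the storage back only here, when the caller lets go.
          stats->live_storage_bytes -= b->storage_bytes;
          delete b;
        });
  }

  std::size_t LiveStorageBytes() const { return stats_->live_storage_bytes; }

 private:
  std::size_t upper_bound_;
  std::shared_ptr<PoolStats> stats_;
};

struct DemoResult {
  std::size_t logical_bytes;       // what the caller actually needs
  std::size_t live_while_held;     // storage pinned while `outputs` is alive
  std::size_t live_after_release;  // storage pinned after `outputs` is cleared
};

DemoResult RunDemo() {
  ToyPool pool(128 * 1024);
  std::vector<std::shared_ptr<Buffer>> outputs;
  for (int i = 0; i < 1024; ++i) {
    outputs.push_back(pool.Alloc(1024));  // size = 1k, but storage = 128k
  }
  std::size_t logical = 0;
  for (const auto& b : outputs) logical += b->logical_bytes;
  const std::size_t held = pool.LiveStorageBytes();
  outputs.clear();  // only the caller decides when this happens
  return DemoResult{logical, held, pool.LiveStorageBytes()};
}
```

   In this sketch the caller needs 1 MiB in total, yet 128 MiB stays pinned 
for as long as `outputs` is alive. The pool sees the storage again only after 
the caller releases the vector, which is the point made above.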
    
   > The main purpose of this PR is, still, to manage the memory fragmentation, 
which has proven to be an existing issue that can be severe when the memory 
usage gets close to the memory limit.
   
   We both agree that memory fragmentation is a problem in general, and I 
believe we both want to take a stab at solving it without shooting ourselves 
and other developers in the foot in use cases in the very near future. More 
broadly, as static memory planning is a common pass used in every 
`relax.build()` call, we will have to consider whether it is going to impact 
the entire Relax compilation flow and the end users. Meanwhile, I do believe 
that concrete alternatives exist, and I am happy to help you understand how 
they work.

