junrushao commented on PR #16111:
URL: https://github.com/apache/tvm/pull/16111#issuecomment-1807465481
Thanks for your response. To summarize, at a high level we both agree
that fragmentation is an issue to be resolved, but we differ on which
approaches we believe are effective and sustainable.
> I think it is acceptable to make use of the upper-bound strategy, due to
> the fact that we are making heavy use of the pool allocator
I believe the point you wanted to make here is that relying on the pool
allocator, as you have pointed out repeatedly in this thread, could
alleviate or address the issue of over-allocation with unknown lifetime. This,
however, is not the case, because buffers with unknown lifetime behave
differently from temporary intermediate buffers.
To help explain why, let me walk you through the example I gave in my
previous response, pasted below:
```
1 std::vector<NDArray> outputs;
2 for (int i = 0; i < 1024; ++i) {
3   NDArray result = mod["main"](...);  // size = 1k, but storage = 128k
4   outputs.push_back(result);
5 }
6 this->outputs = outputs;
```
As you can probably tell, the over-allocation that happens on Line 3 is carried
into the vector `outputs`. No matter how the underlying allocator is
implemented, it does not control when the vector `outputs` on Line 6 is
released. This means the RAM over-usage propagates to the caller rather than
staying internal to RelaxVM.
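To make this concrete, here is a minimal, self-contained sketch of the same situation. It is a toy model, not the actual RelaxVM allocator or the real `NDArray`: `ToyPool`, `ToyArray`, `DemoStats` and `RunDemo` are hypothetical names made up for illustration, and the pool simply hands out fixed-size blocks that return to its free list only when the last reference is dropped. The default arguments mirror the 1024 x 1k / 128k numbers from the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a pooled allocator (assumption: NOT the TVM implementation).
// Every block has the same upper-bound size, as in the upper-bound strategy.
class ToyPool {
 public:
  using Block = std::vector<char>;

  explicit ToyPool(std::size_t block_bytes) : block_bytes_(block_bytes) {}
  ToyPool(const ToyPool&) = delete;
  ToyPool& operator=(const ToyPool&) = delete;

  // Hand out one block. It goes back to the free list only when the last
  // shared_ptr that refers to it is destroyed; the pool cannot force that.
  std::shared_ptr<Block> Alloc() {
    Block* block = nullptr;
    if (!free_.empty()) {
      block = free_.back().release();
      free_.pop_back();
    } else {
      block = new Block(block_bytes_);
    }
    live_bytes_ += block_bytes_;
    return std::shared_ptr<Block>(block, [this](Block* b) {
      live_bytes_ -= block_bytes_;
      free_.push_back(std::unique_ptr<Block>(b));
    });
  }

  std::size_t live_bytes() const { return live_bytes_; }
  std::size_t free_blocks() const { return free_.size(); }

 private:
  std::size_t block_bytes_;
  std::size_t live_bytes_ = 0;
  std::vector<std::unique_ptr<Block>> free_;
};

// Toy stand-in for an array whose logical size is smaller than its storage.
struct ToyArray {
  std::shared_ptr<ToyPool::Block> storage;
  std::size_t logical_bytes;
};

struct DemoStats {
  std::size_t logical_bytes;           // bytes the caller actually needs
  std::size_t pinned_while_held;       // storage pinned while outputs are alive
  std::size_t free_blocks_while_held;  // blocks the pool could reuse meanwhile
  std::size_t pinned_after_release;    // storage pinned after outputs are dropped
};

DemoStats RunDemo(std::size_t num_outputs = 1024,
                  std::size_t logical_bytes = 1024,
                  std::size_t storage_bytes = 128 * 1024) {
  ToyPool pool(storage_bytes);
  DemoStats stats{};
  {
    // The caller, not the pool, owns this vector and decides its lifetime.
    std::vector<ToyArray> outputs;
    for (std::size_t i = 0; i < num_outputs; ++i) {
      outputs.push_back(ToyArray{pool.Alloc(), logical_bytes});
    }
    stats.logical_bytes = outputs.size() * logical_bytes;
    stats.pinned_while_held = pool.live_bytes();
    stats.free_blocks_while_held = pool.free_blocks();
  }  // only here, when the caller drops `outputs`, do blocks return to the pool
  stats.pinned_after_release = pool.live_bytes();
  return stats;
}
```

In this sketch, while the caller keeps `outputs` alive the pool has no free blocks to hand out, and the pinned storage is `num_outputs * storage_bytes` even though only `num_outputs * logical_bytes` is needed. Pooling helps only after the caller releases the vector, which is exactly the part the allocator cannot control.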
> The main purpose of this PR is, still, to manage the memory fragmentation,
> which has proven to be an existing issue that can be severe when the memory
> usage gets close to the memory limit.
We both agree that memory fragmentation is a problem in general, and I
believe we both want to take a stab at solving it without shooting ourselves
and other developers in the foot in use cases in the very near future. More
broadly, since static memory planning is a common pass used in every
`relax.build()` call, we have to consider whether this change is going to
impact the entire Relax compilation flow and its end users. Meanwhile, I do
believe concrete alternatives exist, and I am happy to help you understand how
they work.