On Wed, 2019-02-20 at 19:23 +0000, Kuehling, Felix wrote: > On 2019-02-20 1:41 a.m., Thomas Hellstrom wrote: > > On Tue, 2019-02-19 at 17:06 +0000, Kuehling, Felix wrote: > > > On 2019-02-18 3:39 p.m., Thomas Hellstrom wrote: > > > > On Mon, 2019-02-18 at 18:07 +0100, Christian König wrote: > > > > > Am 18.02.19 um 10:47 schrieb Thomas Hellstrom: > > > > > > On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote: > > > > > > > Another good question is also why the heck the acc_size > > > > > > > counts > > > > > > > towards > > > > > > > the DMA32 zone? > > > > > > DMA32 TTM pages are accounted in the DMA32 zone. Other > > > > > > pages > > > > > > are > > > > > > not. > > > > > Yeah, I'm perfectly aware of this. But this is for the > > > > > accounting > > > > > size! > > > > > > > > > > We have an accounting for the stuff needed additional to the > > > > > pages > > > > > backing the BO (e.g. the page and DMA addr array). > > > > > > > > > > And from the bug description it sounds like we use the DMA32 > > > > > zone > > > > > for > > > > > this accounting which of course is completely nonsense. > > > > It's actually accounted in all available zones, since it would > > > > be > > > > pretty hard to determine exactly where that memory should be > > > > accounted. > > > > In particular if it's vmalloced. It might be DMA32, it might > > > > not. > > > > Given > > > > the objective of stopping malicious user-space from exhausting > > > > the > > > > DMA32 zone it was, at the time the code was written, a > > > > reasonable > > > > approximation. With ever increasing memory sizes, there might > > > > be > > > > better > > > > solutions? > > > As far as I can see, in TTM, ttm_mem_global_alloc is only used > > > for > > > the > > > acc_size in ttm_bo_init_reserved. Other than that vmwgfx also > > > seems > > > to > > > use it to account for a few things that are allocated with > > > kmalloc. > > > > > > So would a better solution be to change ttm_mem_global_alloc to > > > use > > > only > > > the kernel zone? > > > > > IMO we need to determine what functionality to keep and then the > > best > > solution. The current code does its job, but is obviously too > > restrictive. Both of the solutions you suggest open up for > > potential > > DOS attacks (DMA32 and kernel zones are not mutually exclusive. > > They > > overlap). > On x86 with HIGHMEM there is no dma32 zone. Why do we need one on > x86_64? Can we make x86_64 more like HIGHMEM instead? > > Regards, > Felix >
IIRC with x86, the kernel zone is always smaller than any dma32 zone, so we'd always exhaust the kernel zone before dma32 anyway. Not sure why we have dma32 on x86 without highmem, though. sounds superflous but harmless. /Thomas > > > > > /Thomas > > > > > > > > > > > Regards, > > > Felix > > > > > > > > > > /Thomas > > > > > > > > > Christian. > > > > > > > > > > > For small persistent allocations using > > > > > > ttm_mem_global_alloc(), > > > > > > they > > > > > > are > > > > > > accounted also in the DMA32 zone, which may cause over- > > > > > > accounting > > > > > > of > > > > > > that zone, but that's pretty unlikely to be a big problem.. > > > > > > > > > > > > /Thomas > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In other words why should the internal bookkeeping pages > > > > > > > be > > > > > > > allocated > > > > > > > in > > > > > > > the DMA32 zone? > > > > > > > > > > > > > > That doesn't sounds valid to me in any way, > > > > > > > Christian. > > > > > > > > > > > > > > Am 18.02.19 um 09:02 schrieb Thomas Hellstrom: > > > > > > > > Hmm, > > > > > > > > > > > > > > > > This zone was intended to stop TTM page allocations > > > > > > > > from > > > > > > > > exhausting > > > > > > > > the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by > > > > > > > > default, > > > > > > > > which > > > > > > > > means if we drop this check, other devices may stop > > > > > > > > functioning > > > > > > > > unexpectedly? > > > > > > > > > > > > > > > > However, in the end I'd expect the kernel page > > > > > > > > allocation > > > > > > > > system > > > > > > > > to > > > > > > > > make sure there are some pages left in the DMA32 zone, > > > > > > > > otherwise > > > > > > > > random non-IO page allocations would also potentially > > > > > > > > exhaust > > > > > > > > the > > > > > > > > DMA32 zone without anybody caring, which means removing > > > > > > > > this > > > > > > > > zone > > > > > > > > wouldn't be any worse than whatever other subsystems > > > > > > > > may be > > > > > > > > doing > > > > > > > > already... > > > > > > > > > > > > > > > > /Thomas > > > > > > > > > > > > > > > > > > > > > > > > On 2/16/19 12:02 AM, Kuehling, Felix wrote: > > > > > > > > > This is an RFC. I'm not sure this is the right > > > > > > > > > solution, > > > > > > > > > but > > > > > > > > > it > > > > > > > > > highlights the problem I'm trying to solve. > > > > > > > > > > > > > > > > > > The dma32_zone limits the acc_size of all allocated > > > > > > > > > BOs > > > > > > > > > to > > > > > > > > > 2GB. > > > > > > > > > On a > > > > > > > > > 64-bit system with hundreds of GB of system memory > > > > > > > > > and > > > > > > > > > GPU > > > > > > > > > memory, > > > > > > > > > this can become a bottle neck. We're seeing TTM > > > > > > > > > memory > > > > > > > > > allocation > > > > > > > > > failures not because we're truly out of memory, but > > > > > > > > > because > > > > > > > > > we're > > > > > > > > > out of space in the dma32_zone for the acc_size > > > > > > > > > needed > > > > > > > > > for > > > > > > > > > our BO > > > > > > > > > book-keeping. > > > > > > > > > > > > > > > > > > Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com > > > > > > > > > > > > > > > > > > > CC: thellst...@vmware.com > > > > > > > > > CC: christian.koe...@amd.com > > > > > > > > > --- > > > > > > > > > drivers/gpu/drm/ttm/ttm_memory.c | 4 ++-- > > > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_memory.c > > > > > > > > > b/drivers/gpu/drm/ttm/ttm_memory.c > > > > > > > > > index f1567c3..bb05365 100644 > > > > > > > > > --- a/drivers/gpu/drm/ttm/ttm_memory.c > > > > > > > > > +++ b/drivers/gpu/drm/ttm/ttm_memory.c > > > > > > > > > @@ -363,7 +363,7 @@ static int > > > > > > > > > ttm_mem_init_highmem_zone(struct > > > > > > > > > ttm_mem_global *glob, > > > > > > > > > glob->zones[glob->num_zones++] = zone; > > > > > > > > > return 0; > > > > > > > > > } > > > > > > > > > -#else > > > > > > > > > +#elifndef CONFIG_64BIT > > > > > > > > > static int ttm_mem_init_dma32_zone(struct > > > > > > > > > ttm_mem_global > > > > > > > > > *glob, > > > > > > > > > const struct sysinfo *si) > > > > > > > > > { > > > > > > > > > @@ -441,7 +441,7 @@ int ttm_mem_global_init(struct > > > > > > > > > ttm_mem_global > > > > > > > > > *glob) > > > > > > > > > ret = ttm_mem_init_highmem_zone(glob, &si); > > > > > > > > > if (unlikely(ret != 0)) > > > > > > > > > goto out_no_zone; > > > > > > > > > -#else > > > > > > > > > +#elifndef CONFIG_64BIT > > > > > > > > > ret = ttm_mem_init_dma32_zone(glob, &si); > > > > > > > > > if (unlikely(ret != 0)) > > > > > > > > > goto out_no_zone; > > > > > > _______________________________________________ > > > > > > amd-gfx mailing list > > > > > > amd-gfx@lists.freedesktop.org > > > > > > https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cthellstrom%40vmware.com%7C5521905ed2f94144c64208d69768eda0%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636862874256927858&sdata=2PqTuwA75OoJTDBZ%2BtXnuEfrI3Lham5uYTWq2rT%2BN24%3D&reserved=0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx