On 2019-02-20 1:41 a.m., Thomas Hellstrom wrote: > On Tue, 2019-02-19 at 17:06 +0000, Kuehling, Felix wrote: >> On 2019-02-18 3:39 p.m., Thomas Hellstrom wrote: >>> On Mon, 2019-02-18 at 18:07 +0100, Christian König wrote: >>>> Am 18.02.19 um 10:47 schrieb Thomas Hellstrom: >>>>> On Mon, 2019-02-18 at 09:20 +0000, Koenig, Christian wrote: >>>>>> Another good question is also why the heck the acc_size >>>>>> counts >>>>>> towards >>>>>> the DMA32 zone? >>>>> DMA32 TTM pages are accounted in the DMA32 zone. Other pages >>>>> are >>>>> not. >>>> Yeah, I'm perfectly aware of this. But this is for the accounting >>>> size! >>>> >>>> We have an accounting for the stuff needed additional to the >>>> pages >>>> backing the BO (e.g. the page and DMA addr array). >>>> >>>> And from the bug description it sounds like we use the DMA32 zone >>>> for >>>> this accounting which of course is completely nonsense. >>> It's actually accounted in all available zones, since it would be >>> pretty hard to determine exactly where that memory should be >>> accounted. >>> In particular if it's vmalloced. It might be DMA32, it might not. >>> Given >>> the objective of stopping malicious user-space from exhausting the >>> DMA32 zone it was, at the time the code was written, a reasonable >>> approximation. With ever increasing memory sizes, there might be >>> better >>> solutions? >> As far as I can see, in TTM, ttm_mem_global_alloc is only used for >> the >> acc_size in ttm_bo_init_reserved. Other than that vmwgfx also seems >> to >> use it to account for a few things that are allocated with kmalloc. >> >> So would a better solution be to change ttm_mem_global_alloc to use >> only >> the kernel zone? >> > IMO we need to determine what functionality to keep and then the best > solution. The current code does its job, but is obviously too > restrictive. Both of the solutions you suggest open up for potential > DOS attacks (DMA32 and kernel zones are not mutually exclusive. They > overlap). On x86 with HIGHMEM there is no dma32 zone. Why do we need one on x86_64? Can we make x86_64 more like HIGHMEM instead?
Regards, Felix > > > /Thomas > > > > >> Regards, >> Felix >> >> >>> /Thomas >>> >>>> Christian. >>>> >>>>> For small persistent allocations using ttm_mem_global_alloc(), >>>>> they >>>>> are >>>>> accounted also in the DMA32 zone, which may cause over- >>>>> accounting >>>>> of >>>>> that zone, but that's pretty unlikely to be a big problem.. >>>>> >>>>> /Thomas >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> In other words why should the internal bookkeeping pages be >>>>>> allocated >>>>>> in >>>>>> the DMA32 zone? >>>>>> >>>>>> That doesn't sounds valid to me in any way, >>>>>> Christian. >>>>>> >>>>>> Am 18.02.19 um 09:02 schrieb Thomas Hellstrom: >>>>>>> Hmm, >>>>>>> >>>>>>> This zone was intended to stop TTM page allocations from >>>>>>> exhausting >>>>>>> the DMA32 zone. IIRC dma_alloc_coherent() uses DMA32 by >>>>>>> default, >>>>>>> which >>>>>>> means if we drop this check, other devices may stop >>>>>>> functioning >>>>>>> unexpectedly? >>>>>>> >>>>>>> However, in the end I'd expect the kernel page allocation >>>>>>> system >>>>>>> to >>>>>>> make sure there are some pages left in the DMA32 zone, >>>>>>> otherwise >>>>>>> random non-IO page allocations would also potentially >>>>>>> exhaust >>>>>>> the >>>>>>> DMA32 zone without anybody caring, which means removing >>>>>>> this >>>>>>> zone >>>>>>> wouldn't be any worse than whatever other subsystems may be >>>>>>> doing >>>>>>> already... >>>>>>> >>>>>>> /Thomas >>>>>>> >>>>>>> >>>>>>> On 2/16/19 12:02 AM, Kuehling, Felix wrote: >>>>>>>> This is an RFC. I'm not sure this is the right solution, >>>>>>>> but >>>>>>>> it >>>>>>>> highlights the problem I'm trying to solve. >>>>>>>> >>>>>>>> The dma32_zone limits the acc_size of all allocated BOs >>>>>>>> to >>>>>>>> 2GB. >>>>>>>> On a >>>>>>>> 64-bit system with hundreds of GB of system memory and >>>>>>>> GPU >>>>>>>> memory, >>>>>>>> this can become a bottle neck. We're seeing TTM memory >>>>>>>> allocation >>>>>>>> failures not because we're truly out of memory, but >>>>>>>> because >>>>>>>> we're >>>>>>>> out of space in the dma32_zone for the acc_size needed >>>>>>>> for >>>>>>>> our BO >>>>>>>> book-keeping. >>>>>>>> >>>>>>>> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com> >>>>>>>> CC: thellst...@vmware.com >>>>>>>> CC: christian.koe...@amd.com >>>>>>>> --- >>>>>>>> drivers/gpu/drm/ttm/ttm_memory.c | 4 ++-- >>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_memory.c >>>>>>>> b/drivers/gpu/drm/ttm/ttm_memory.c >>>>>>>> index f1567c3..bb05365 100644 >>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_memory.c >>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_memory.c >>>>>>>> @@ -363,7 +363,7 @@ static int >>>>>>>> ttm_mem_init_highmem_zone(struct >>>>>>>> ttm_mem_global *glob, >>>>>>>> glob->zones[glob->num_zones++] = zone; >>>>>>>> return 0; >>>>>>>> } >>>>>>>> -#else >>>>>>>> +#elifndef CONFIG_64BIT >>>>>>>> static int ttm_mem_init_dma32_zone(struct >>>>>>>> ttm_mem_global >>>>>>>> *glob, >>>>>>>> const struct sysinfo *si) >>>>>>>> { >>>>>>>> @@ -441,7 +441,7 @@ int ttm_mem_global_init(struct >>>>>>>> ttm_mem_global >>>>>>>> *glob) >>>>>>>> ret = ttm_mem_init_highmem_zone(glob, &si); >>>>>>>> if (unlikely(ret != 0)) >>>>>>>> goto out_no_zone; >>>>>>>> -#else >>>>>>>> +#elifndef CONFIG_64BIT >>>>>>>> ret = ttm_mem_init_dma32_zone(glob, &si); >>>>>>>> if (unlikely(ret != 0)) >>>>>>>> goto out_no_zone; >>>>> _______________________________________________ >>>>> amd-gfx mailing list >>>>> amd-gfx@lists.freedesktop.org >>>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7Cthellstrom%40vmware.com%7C1357d06244fb499b31dd08d6968ca04b%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C636861928079462725&sdata=1Bucho93KMzN0z7QbfiNn%2BNWaZs5yi86Ya6vm9Xhbqo%3D&reserved=0 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx