Re: Fixing nouveau for 4k PAGE_SIZE
On Sun, Aug 11, 2013 at 7:35 PM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: On Sun, 2013-08-11 at 11:02 +0200, Maarten Lankhorst wrote: diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c index 5c7433d..c314a5f 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c @@ -190,8 +190,8 @@ nv40_fifo_chan_ctor(struct nouveau_object *parent, if (size sizeof(*args)) return -EINVAL; - ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0xc0, - 0x1000, args-pushbuf, + ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0x80, + 0x1, args-pushbuf, (1ULL NVDEV_ENGINE_DMAOBJ) | (1ULL NVDEV_ENGINE_SW) | (1ULL NVDEV_ENGINE_GR) | Why the size change? This reverts the value to older ones, however that patch might not be needed anymore (I was carrying it from Dave but if we don't map the registers into userspace we shouldn't need to force align them). diff --git a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c index ef3133e..5833851 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c @@ -84,10 +84,11 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, { struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; - int big = vma-node-type != vmm-spg_shift; + u32 shift = vma-node-type; + int big = shift != vmm-spg_shift; u32 offset = vma-node-offset + (delta 12); - u32 bits = vma-node-type - 12; - u32 num = length vma-node-type; + u32 bits = shift - 12; + u32 num = length shift; u32 pde = (offset vmm-pgt_bits) - vm-fpde; u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; u32 max = 1 (vmm-pgt_bits - bits); @@ -98,7 +99,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, for_each_sg(mem-sg-sgl, sg, mem-sg-nents, i) { struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; - sglen = sg_dma_len(sg) PAGE_SHIFT; + sglen = sg_dma_len(sg) shift; end = pte + sglen; if (unlikely(end = max)) Please add a WARN_ON(big); in map_sg and map_sg_table if you do this. So that's debatable :-) The above code is map_sg. Arguably my patch *fixes* using it with card large pages :-) IE, Look at the usual case (PAGE_SHIFT=12). Today, the above means sglen will be in units of small pages. But everything else in that loop operates in units of whatever is represented by the pte, which can either be 4k or larger. So adding sglen to pte was never right for shift != 12 before (regardless of PAGE_SHIFT). With my patch, it's more correct, so as such, adding a WARN_ON here wouldn't be if I do this :-) Now, the big case still cannot really work here with PAGE_SHIFT=12, because that would require the sg segments to be multiple of shift (the card large page) and that is not going to be the case. So funnily enough, we *could* use card large pages of 64k (big) if ... we had PAGE_SHIFT=16 ... which we do on ppc64 :-) But not anywhere else. But yes, the point remains, in the general case, that function cannot work for the big case, so I wonder if we should catch it with a WARN_ON and maybe simplify the code a bunch while at it... @@ -106,7 +107,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, len = end - pte; for (m = 0; m len; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -121,7 +122,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, } if (m sglen) { for (; m sglen; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -136,6 +137,7 @@ finish: vmm-flush(vm); } +#if PAGE_SHIFT == 12 void nouveau_vm_map_sg(struct nouveau_vma *vma, u64 delta, u64 length, struct nouveau_mem *mem) @@ -143,10 +145,11 @@ nouveau_vm_map_sg(struct nouveau_vma *vma, u64 delta, u64 length, struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; dma_addr_t *list = mem-pages; -
Re: Fixing nouveau for 4k PAGE_SIZE
Op 11-08-13 07:36, Benjamin Herrenschmidt schreef: On Sun, 2013-08-11 at 10:41 +1000, Benjamin Herrenschmidt wrote: Now, to do that, I need a better understanding of the various things in there since I'm not familiar with nouveau at all. What I think I've figured out is with a few questions, it would be awesome if you could answer them so I can have a shot at fixing it all :-) Ok, a few more questions :-) - in struct nouveau_mm_node, what unit are offset and length ? Depends on the context. It's re-used a few times. In case of nouveau_vm it's multiples of 112, which is the smallest unit a card can address. They *seem* to be hard wired to be in units of 4k (regardless of either of system page size) which I assume means card small page size, but offset is in bytes ? At least according to my parsing of the code in nouveau_vm_map_at(). Correct. The big question then is whether that represents card address space or system address space... I assume the former right ? So a function like nouveau_vm_map_at() is solely concerned with mapping a piece of card address space in the card VM and thus shouldn't be concerned by the system PAGE_SIZE at all, right ? Former, but the code entangles system PAGE_SIZE and card PAGE_SIZE/SHIFT/MASK in some cases. IE. The only one we should actually care about here is nouveau_vm_map_sg_table() or am I missing an important piece of the puzzle ? nouveau_vm_map_sg too. nouveau_vm_map is special, and also used to map VRAM into BAR1/BAR3 by subdev/bar code. Additionally, nv41.c has only map_sg() callbacks, no map() callbacks, should I assume that means that nouveau_vm_map() and nouveau_vm_map_at() will never be called on these ? Correct. all cards before the nv50 family have no real vm. the BAR used for vram is just an identity mapping, not the entirety of VRAM may be accessible to the system. - In vm/base.c this construct appears regulary: struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; Which makes me believe we have separate page tables for small vs. large pages (in card mmu) (though I assume big is always 0 on nv40 unless missed something, I want to make sure I'm not breaking everything else...). Thus I assume that each pte in a big page table maps a page size of 1 vmm-lpg_shift, is that correct ? Correct, nv50+ are the only ones that support large pages. vmm-pgt_bits is always the same however, thus I assume that PDEs always map the same amount of space, but regions for large pages just have fewer PTEs, which seem to correspond to what the code does here: u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; - My basic idea is to move the card page vs system PAGE breakdown up into vm/base.c, which seems reasonably simple in the case of nouveau_vm_map_sg_table() since we only call vmm-map_sg for one PTE at a time, but things get a bit trickier with nouveau_vm_map_sg which passes the whole page list down. Following my idea would mean essentially also making it operate one PTE at a time, thus potentially reducing the performance of the map operations. One option is to special case the PAGE_SIZE==12 case to keep the existing behaviour and operate in reduced (one PTE per call) mode in the other case but I dislike special casing like that (bitrots, one code path gets untested/unfixed, etc...) Another option would be to make struct nouveau_mem *mem -pages always be an array of 4k (ie, create a breakdown when constructing that list), but I have yet to figure out what the impact would be (where that stuff gets created) and whether this is realistic at all... - struct nouveau_mem is called node in various places (in nouveau_ttm) which is confusing. What is the relationship between nouveau_mem, nouveau_mm and nouveau_mm_node ? nouveau_mem seems to be having things such as offset in multiple of PAGE_SIZE, but have a page_shift member that appears to match the buffer object page_shift, hence seem to indicate that it is a card page shift... Would it be possible, maybe, to add comments next to the fields in those various data structure indicating in which units they are ? - Similar confusion arises with things like struct ttm_mem_reg *mem. For example, in nouveau_ttm.c, I see statements like: ret = nouveau_vm_get(man-priv, mem-num_pages 12, node-page_shift, NV_MEM_ACCESS_RW, node-vma[0]); Which seems to indicate that mem-num_pages is in multiple of 4k always, though I would have though that a ttm object was in multiple of PAGE_SIZE, am I wrong ? Especially since the same object is later populated using: mem-start = node-vma[0].offset PAGE_SHIFT; So the start member is in units of PAGE_SHIFT here, not 12. So I'm still a bit confused :-) The fun has been doubled because TTM expects PAGE units, so some of the PAGE_SHIFT's are genuine. Some may be a result of PAGE_SHIFT == 12, so honestly I don't know the specific ones.
Re: Fixing nouveau for 4k PAGE_SIZE
On Sun, 2013-08-11 at 08:17 +0200, Maarten Lankhorst wrote: So I'm still a bit confused :-) The fun has been doubled because TTM expects PAGE units, so some of the PAGE_SHIFT's are genuine. Some may be a result of PAGE_SHIFT == 12, so honestly I don't know the specific ones. Right, and the other way around too :-) I think I found at least two cases where 12 was used where it should have been PAGE_SHIFT (basically ttm_mem_reg-num_pages). This is only the tip of the iceberg, so this isn't a formal patch submission, but I would appreciate your thought as to whether the below is correct (and thus I'm on the right track) : --- a/drivers/gpu/drm/nouveau/nouveau_sgdma.c +++ b/drivers/gpu/drm/nouveau/nouveau_sgdma.c @@ -31,7 +31,7 @@ nv04_sgdma_bind(struct ttm_tt *ttm, struct ttm_mem_reg *mem) { struct nouveau_sgdma_be *nvbe = (struct nouveau_sgdma_be *)ttm; struct nouveau_mem *node = mem-mm_node; - u64 size = mem-num_pages 12; + u64 size = mem-num_pages PAGE_SHIFT; if (ttm-sg) { node-sg = ttm-sg; diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nou index 19e3757..f0629de 100644 --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c @@ -252,8 +252,8 @@ nv04_gart_manager_new(struct ttm_mem_type_manager *man, node-page_shift = 12; - ret = nouveau_vm_get(man-priv, mem-num_pages 12, node-page_shift, -NV_MEM_ACCESS_RW, node-vma[0]); + ret = nouveau_vm_get(man-priv, mem-num_pages PAGE_SHIFT, +node-page_shift, NV_MEM_ACCESS_RW, node-vma[0]); if (ret) { kfree(node); Thanks ! Cheers, Ben. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Fixing nouveau for 4k PAGE_SIZE
On Sun, 2013-08-11 at 17:06 +1000, Benjamin Herrenschmidt wrote: I think I found at least two cases where 12 was used where it should have been PAGE_SHIFT (basically ttm_mem_reg-num_pages). This is only the tip of the iceberg, so this isn't a formal patch submission, but I would appreciate your thought as to whether the below is correct (and thus I'm on the right track) : This patch (which needs cleanups, and probably be broken down for bisectability) makes it work for me. I've disabled nouveau_dri for now as this has its own problems related to Ajax recent gallium endian changes. Note the horrible duplication of nouveau_vm_map_sg... I think to fix it cleanly we probably need to slightly change the -map_sg API to the vmmr. However, I do have a question whose answer might make things a LOT easier if yes can make things a lot easier: Can we guarantee that that such an sg object (I assume this is always a ttm_bo getting placed in the TT memory, correct ?) has an alignment in *card VM space* that is a multiple of the *system page size* ? Ie, can we make that happen easily ? For example, if my system page size is 64k, can we guarantee that it will always be mapped in the card at a virtual address that is 64k aligned ? If that is the case, then we *know* that a given page in the page list passed to nouveau_vm_map_sg() will never cross a pde boundary (will always be fully contained in the bottom level of the page table). That allows to simplify the code a bit, and maybe to write a unified function that can still pass down page lists to the vmmr On the other hand, if we have to handle misalignment, then we may as well stick to 1 PTE at a time like my patch does to avoid horrible complications. Cheers, Ben. diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c index 5c7433d..c314a5f 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c @@ -190,8 +190,8 @@ nv40_fifo_chan_ctor(struct nouveau_object *parent, if (size sizeof(*args)) return -EINVAL; - ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0xc0, - 0x1000, args-pushbuf, + ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0x80, + 0x1, args-pushbuf, (1ULL NVDEV_ENGINE_DMAOBJ) | (1ULL NVDEV_ENGINE_SW) | (1ULL NVDEV_ENGINE_GR) | diff --git a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c index ef3133e..5833851 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c @@ -84,10 +84,11 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, { struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; - int big = vma-node-type != vmm-spg_shift; + u32 shift = vma-node-type; + int big = shift != vmm-spg_shift; u32 offset = vma-node-offset + (delta 12); - u32 bits = vma-node-type - 12; - u32 num = length vma-node-type; + u32 bits = shift - 12; + u32 num = length shift; u32 pde = (offset vmm-pgt_bits) - vm-fpde; u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; u32 max = 1 (vmm-pgt_bits - bits); @@ -98,7 +99,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, for_each_sg(mem-sg-sgl, sg, mem-sg-nents, i) { struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; - sglen = sg_dma_len(sg) PAGE_SHIFT; + sglen = sg_dma_len(sg) shift; end = pte + sglen; if (unlikely(end = max)) @@ -106,7 +107,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, len = end - pte; for (m = 0; m len; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -121,7 +122,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, } if (m sglen) { for (; m sglen; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -136,6 +137,7 @@ finish: vmm-flush(vm); } +#if PAGE_SHIFT == 12 void nouveau_vm_map_sg(struct nouveau_vma *vma, u64 delta, u64 length,
Re: Fixing nouveau for 4k PAGE_SIZE
Op 11-08-13 10:04, Benjamin Herrenschmidt schreef: On Sun, 2013-08-11 at 17:06 +1000, Benjamin Herrenschmidt wrote: I think I found at least two cases where 12 was used where it should have been PAGE_SHIFT (basically ttm_mem_reg-num_pages). This is only the tip of the iceberg, so this isn't a formal patch submission, but I would appreciate your thought as to whether the below is correct (and thus I'm on the right track) : This patch (which needs cleanups, and probably be broken down for bisectability) makes it work for me. I've disabled nouveau_dri for now as this has its own problems related to Ajax recent gallium endian changes. Note the horrible duplication of nouveau_vm_map_sg... I think to fix it cleanly we probably need to slightly change the -map_sg API to the vmmr. However, I do have a question whose answer might make things a LOT easier if yes can make things a lot easier: Can we guarantee that that such an sg object (I assume this is always a ttm_bo getting placed in the TT memory, correct ?) has an alignment in *card VM space* that is a multiple of the *system page size* ? Ie, can we make that happen easily ? For example, if my system page size is 64k, can we guarantee that it will always be mapped in the card at a virtual address that is 64k aligned ? If that is the case, then we *know* that a given page in the page list passed to nouveau_vm_map_sg() will never cross a pde boundary (will always be fully contained in the bottom level of the page table). That allows to simplify the code a bit, and maybe to write a unified function that can still pass down page lists to the vmmr On the other hand, if we have to handle misalignment, then we may as well stick to 1 PTE at a time like my patch does to avoid horrible complications. Cheers, Ben. diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c index 5c7433d..c314a5f 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c @@ -190,8 +190,8 @@ nv40_fifo_chan_ctor(struct nouveau_object *parent, if (size sizeof(*args)) return -EINVAL; - ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0xc0, - 0x1000, args-pushbuf, + ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0x80, + 0x1, args-pushbuf, (1ULL NVDEV_ENGINE_DMAOBJ) | (1ULL NVDEV_ENGINE_SW) | (1ULL NVDEV_ENGINE_GR) | Why the size change? diff --git a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c index ef3133e..5833851 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c @@ -84,10 +84,11 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, { struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; - int big = vma-node-type != vmm-spg_shift; + u32 shift = vma-node-type; + int big = shift != vmm-spg_shift; u32 offset = vma-node-offset + (delta 12); - u32 bits = vma-node-type - 12; - u32 num = length vma-node-type; + u32 bits = shift - 12; + u32 num = length shift; u32 pde = (offset vmm-pgt_bits) - vm-fpde; u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; u32 max = 1 (vmm-pgt_bits - bits); @@ -98,7 +99,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, for_each_sg(mem-sg-sgl, sg, mem-sg-nents, i) { struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; - sglen = sg_dma_len(sg) PAGE_SHIFT; + sglen = sg_dma_len(sg) shift; end = pte + sglen; if (unlikely(end = max)) Please add a WARN_ON(big); in map_sg and map_sg_table if you do this. @@ -106,7 +107,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, len = end - pte; for (m = 0; m len; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -121,7 +122,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, } if (m sglen) { for (; m sglen; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -136,6
Re: Fixing nouveau for 4k PAGE_SIZE
On Sun, 2013-08-11 at 11:02 +0200, Maarten Lankhorst wrote: diff --git a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c index 5c7433d..c314a5f 100644 --- a/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c +++ b/drivers/gpu/drm/nouveau/core/engine/fifo/nv40.c @@ -190,8 +190,8 @@ nv40_fifo_chan_ctor(struct nouveau_object *parent, if (size sizeof(*args)) return -EINVAL; - ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0xc0, - 0x1000, args-pushbuf, + ret = nouveau_fifo_channel_create(parent, engine, oclass, 0, 0x80, + 0x1, args-pushbuf, (1ULL NVDEV_ENGINE_DMAOBJ) | (1ULL NVDEV_ENGINE_SW) | (1ULL NVDEV_ENGINE_GR) | Why the size change? This reverts the value to older ones, however that patch might not be needed anymore (I was carrying it from Dave but if we don't map the registers into userspace we shouldn't need to force align them). diff --git a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c index ef3133e..5833851 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/vm/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/vm/base.c @@ -84,10 +84,11 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, { struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; - int big = vma-node-type != vmm-spg_shift; + u32 shift = vma-node-type; + int big = shift != vmm-spg_shift; u32 offset = vma-node-offset + (delta 12); - u32 bits = vma-node-type - 12; - u32 num = length vma-node-type; + u32 bits = shift - 12; + u32 num = length shift; u32 pde = (offset vmm-pgt_bits) - vm-fpde; u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; u32 max = 1 (vmm-pgt_bits - bits); @@ -98,7 +99,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, for_each_sg(mem-sg-sgl, sg, mem-sg-nents, i) { struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; - sglen = sg_dma_len(sg) PAGE_SHIFT; + sglen = sg_dma_len(sg) shift; end = pte + sglen; if (unlikely(end = max)) Please add a WARN_ON(big); in map_sg and map_sg_table if you do this. So that's debatable :-) The above code is map_sg. Arguably my patch *fixes* using it with card large pages :-) IE, Look at the usual case (PAGE_SHIFT=12). Today, the above means sglen will be in units of small pages. But everything else in that loop operates in units of whatever is represented by the pte, which can either be 4k or larger. So adding sglen to pte was never right for shift != 12 before (regardless of PAGE_SHIFT). With my patch, it's more correct, so as such, adding a WARN_ON here wouldn't be if I do this :-) Now, the big case still cannot really work here with PAGE_SHIFT=12, because that would require the sg segments to be multiple of shift (the card large page) and that is not going to be the case. So funnily enough, we *could* use card large pages of 64k (big) if ... we had PAGE_SHIFT=16 ... which we do on ppc64 :-) But not anywhere else. But yes, the point remains, in the general case, that function cannot work for the big case, so I wonder if we should catch it with a WARN_ON and maybe simplify the code a bunch while at it... @@ -106,7 +107,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, len = end - pte; for (m = 0; m len; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -121,7 +122,7 @@ nouveau_vm_map_sg_table(struct nouveau_vma *vma, u64 delta, u64 length, } if (m sglen) { for (; m sglen; m++) { - dma_addr_t addr = sg_dma_address(sg) + (m PAGE_SHIFT); + dma_addr_t addr = sg_dma_address(sg) + (m shift); vmm-map_sg(vma, pgt, mem, pte, 1, addr); num--; @@ -136,6 +137,7 @@ finish: vmm-flush(vm); } +#if PAGE_SHIFT == 12 void nouveau_vm_map_sg(struct nouveau_vma *vma, u64 delta, u64 length, struct nouveau_mem *mem) @@ -143,10 +145,11 @@ nouveau_vm_map_sg(struct nouveau_vma *vma, u64 delta, u64 length, struct nouveau_vm *vm = vma-vm; struct nouveau_vmmgr *vmm = vm-vmm; dma_addr_t *list = mem-pages; - int big = vma-node-type != vmm-spg_shift; + u32 shift = vma-node-type; + int big =
Re: Fixing nouveau for 4k PAGE_SIZE
On Sun, 2013-08-11 at 10:41 +1000, Benjamin Herrenschmidt wrote: Now, to do that, I need a better understanding of the various things in there since I'm not familiar with nouveau at all. What I think I've figured out is with a few questions, it would be awesome if you could answer them so I can have a shot at fixing it all :-) Ok, a few more questions :-) - in struct nouveau_mm_node, what unit are offset and length ? They *seem* to be hard wired to be in units of 4k (regardless of either of system page size) which I assume means card small page size, but offset is in bytes ? At least according to my parsing of the code in nouveau_vm_map_at(). The big question then is whether that represents card address space or system address space... I assume the former right ? So a function like nouveau_vm_map_at() is solely concerned with mapping a piece of card address space in the card VM and thus shouldn't be concerned by the system PAGE_SIZE at all, right ? IE. The only one we should actually care about here is nouveau_vm_map_sg_table() or am I missing an important piece of the puzzle ? Additionally, nv41.c has only map_sg() callbacks, no map() callbacks, should I assume that means that nouveau_vm_map() and nouveau_vm_map_at() will never be called on these ? - In vm/base.c this construct appears regulary: struct nouveau_gpuobj *pgt = vm-pgt[pde].obj[big]; Which makes me believe we have separate page tables for small vs. large pages (in card mmu) (though I assume big is always 0 on nv40 unless missed something, I want to make sure I'm not breaking everything else...). Thus I assume that each pte in a big page table maps a page size of 1 vmm-lpg_shift, is that correct ? vmm-pgt_bits is always the same however, thus I assume that PDEs always map the same amount of space, but regions for large pages just have fewer PTEs, which seem to correspond to what the code does here: u32 pte = (offset ((1 vmm-pgt_bits) - 1)) bits; - My basic idea is to move the card page vs system PAGE breakdown up into vm/base.c, which seems reasonably simple in the case of nouveau_vm_map_sg_table() since we only call vmm-map_sg for one PTE at a time, but things get a bit trickier with nouveau_vm_map_sg which passes the whole page list down. Following my idea would mean essentially also making it operate one PTE at a time, thus potentially reducing the performance of the map operations. One option is to special case the PAGE_SIZE==12 case to keep the existing behaviour and operate in reduced (one PTE per call) mode in the other case but I dislike special casing like that (bitrots, one code path gets untested/unfixed, etc...) Another option would be to make struct nouveau_mem *mem -pages always be an array of 4k (ie, create a breakdown when constructing that list), but I have yet to figure out what the impact would be (where that stuff gets created) and whether this is realistic at all... - struct nouveau_mem is called node in various places (in nouveau_ttm) which is confusing. What is the relationship between nouveau_mem, nouveau_mm and nouveau_mm_node ? nouveau_mem seems to be having things such as offset in multiple of PAGE_SIZE, but have a page_shift member that appears to match the buffer object page_shift, hence seem to indicate that it is a card page shift... Would it be possible, maybe, to add comments next to the fields in those various data structure indicating in which units they are ? - Similar confusion arises with things like struct ttm_mem_reg *mem. For example, in nouveau_ttm.c, I see statements like: ret = nouveau_vm_get(man-priv, mem-num_pages 12, node-page_shift, NV_MEM_ACCESS_RW, node-vma[0]); Which seems to indicate that mem-num_pages is in multiple of 4k always, though I would have though that a ttm object was in multiple of PAGE_SIZE, am I wrong ? Especially since the same object is later populated using: mem-start = node-vma[0].offset PAGE_SHIFT; So the start member is in units of PAGE_SHIFT here, not 12. So I'm still a bit confused :-) Cheers, Ben. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel