[PATCH 2/2] Add 2-level GPUVM pagetables support to radeon driver. v2

2012-09-15 Thread Christian König
On 14.09.2012 19:49, Dmitry Cherkasov wrote:
> PDE/PTE update code uses CP ring for memory writes.
> All page table entries are preallocated for now in alloc_pt().
>
> It is made as whole because it's hard to divide it to several patches
> that compile and doesn't break anything being applied separately.
>
> Tested on cayman card.
We need some more tests on SI before this can be pushed upstream. That's not 
much of a problem, because I can do it, and AFAIK Michel also tested an older 
version of the patch on SI.

>
> v2 changes:
> * rebased on top of "refactor set_page chipset interface v2"
> * code cleanups
>
> Signed-off-by: Dmitry Cherkasov 
> ---
>   drivers/gpu/drm/radeon/ni.c          |    4 +-
>   drivers/gpu/drm/radeon/radeon.h      |    4 +-
>   drivers/gpu/drm/radeon/radeon_gart.c |  145 +++---
>   drivers/gpu/drm/radeon/si.c          |    4 +-
>   4 files changed, 140 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
> index 0355c8d..ffa9f7e 100644
> --- a/drivers/gpu/drm/radeon/ni.c
> +++ b/drivers/gpu/drm/radeon/ni.c
> @@ -782,7 +782,7 @@ static int cayman_pcie_gart_enable(struct radeon_device *rdev)
>  (u32)(rdev->dummy_page.addr >> 12));
>   WREG32(VM_CONTEXT1_CNTL2, 0);
>   WREG32(VM_CONTEXT1_CNTL, 0);
> - WREG32(VM_CONTEXT1_CNTL, ENABLE_CONTEXT | PAGE_TABLE_DEPTH(0) |
> + WREG32(VM_CONTEXT1_CNTL, ENABLE_CONTEXT | PAGE_TABLE_DEPTH(1) |
>   RANGE_PROTECTION_FAULT_ENABLE_DEFAULT);
>   
>   cayman_pcie_gart_tlb_flush(rdev);
> @@ -1586,7 +1586,7 @@ void cayman_vm_flush(struct radeon_device *rdev, struct radeon_ib *ib)
>   radeon_ring_write(ring, vm->last_pfn);
>   
>   radeon_ring_write(ring, PACKET0(VM_CONTEXT0_PAGE_TABLE_BASE_ADDR + (vm->id << 2), 0));
> - radeon_ring_write(ring, vm->pt_gpu_addr >> 12);
> + radeon_ring_write(ring, vm->pd_gpu_addr >> 12);
>   
>   /* flush hdp cache */
>   radeon_ring_write(ring, PACKET0(HDP_MEM_COHERENCY_FLUSH_CNTL, 0));
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index f02ea8e..6231823 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -655,8 +655,8 @@ struct radeon_vm {
>   struct list_head va;
>   unsigned id;
>   unsigned last_pfn;
> - u64 pt_gpu_addr;
> - u64 *pt;
> + u64 pd_gpu_addr;
> + u64 __iomem *pd_addr;
As I already said for the last version of this patch, the CPU shouldn't 
need to access that memory directly, except for clearing it initially. 
So please remove this pointer.

>   struct radeon_sa_bo *sa_bo;
>   struct mutex mutex;
>   /* last fence for cs using this vm */
> diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
> index badc835..9c68482 100644
> --- a/drivers/gpu/drm/radeon/radeon_gart.c
> +++ b/drivers/gpu/drm/radeon/radeon_gart.c
> @@ -50,6 +50,59 @@
>* This file handles the common internal GART management.
>*/
>   
> +/* GPUVM defines */
> +
> +/* We consider the case where BLOCK_SIZE is 0 */
> +/* So PDE is 19 bits long, PTE is 9 and OFFSET is 12 */
> +#define RADEON_BLOCK_SIZE   0
Sorry, I missed that in my last comment: that define should indeed be 
public, because the chipset-specific code needs to know it when 
initializing the hardware.

Also, we should name it in a way that makes it obvious that it belongs to 
the VM code, like RADEON_VM_BLOCK_SIZE or something similar.
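
For reference, the address split stated in the quoted comment (19-bit PDE
index, 9-bit PTE index, 12-bit page offset, 40 bits in total) can be sketched
as below. This is only an illustration under that assumption; the EX_ and ex_
names are hypothetical and are not part of the driver.

```c
#include <stdint.h>

/* Field widths taken from the quoted comment (BLOCK_SIZE == 0). */
#define EX_OFFSET_BITS 12
#define EX_PTE_BITS     9
#define EX_PDE_BITS    19

/* Hypothetical helper: byte offset inside a 4 KiB page. */
static uint32_t ex_page_offset(uint64_t va)
{
	return (uint32_t)(va & ((1ull << EX_OFFSET_BITS) - 1));
}

/* Hypothetical helper: index of the PTE inside its page table. */
static uint32_t ex_pte_index(uint64_t va)
{
	return (uint32_t)((va >> EX_OFFSET_BITS) & ((1ull << EX_PTE_BITS) - 1));
}

/* Hypothetical helper: index of the PDE inside the page directory. */
static uint32_t ex_pde_index(uint64_t va)
{
	return (uint32_t)((va >> (EX_OFFSET_BITS + EX_PTE_BITS)) &
			  ((1ull << EX_PDE_BITS) - 1));
}
```

With a larger block size the PTE field would grow and the PDE field would
shrink by the same number of bits.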

> +
> +/* By default there are 512 entries in Page Table */
> +#define RADEON_DEFAULT_PTE_COUNT (1 << 9)
> +
> +/* number of PTEs in Page Table */
> +#define RADEON_PTE_COUNT (RADEON_DEFAULT_PTE_COUNT << RADEON_BLOCK_SIZE)
Please merge those two defines. RADEON_DEFAULT_PTE_COUNT isn't used 
after the second define, and the fact that nine is actually the minimum 
block size is chipset specific (there might be larger minimum 
page table sizes in the future). That said, it might also be 
useful to set RADEON_(VM)_BLOCK_SIZE to 9 instead of 0 and let the 
chipset-specific code calculate the actual value that gets written 
into the register.
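
A minimal sketch of that suggestion, assuming the register field is simply the
block size minus the chipset minimum of 9 (which matches the patch, where
BLOCK_SIZE 0 means 512 entries). EX_CAYMAN_MIN_BLOCK_SIZE and
ex_vm_block_size_field() are hypothetical names, not existing driver symbols.

```c
#include <assert.h>

/* Public define as suggested: log2 of the number of PTEs per page table. */
#define RADEON_VM_BLOCK_SIZE 9

/* Single merged define replacing the two PTE count defines. */
#define RADEON_VM_PTE_COUNT (1u << RADEON_VM_BLOCK_SIZE)

/* Assumed chipset minimum (512 entries); hypothetical name. */
#define EX_CAYMAN_MIN_BLOCK_SIZE 9

/*
 * Hypothetical chipset-specific helper: converts the generic block size
 * into the value that would be written into the block size register field.
 */
static unsigned ex_vm_block_size_field(unsigned vm_block_size)
{
	assert(vm_block_size >= EX_CAYMAN_MIN_BLOCK_SIZE);
	return vm_block_size - EX_CAYMAN_MIN_BLOCK_SIZE;
}
```

This keeps the chipset minimum out of the common code: only the chipset file
needs to know that 9 maps to a register value of 0.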

> +
> +/* Get last PDE number containing nth PTE */
> +#define RADEON_GET_LAST_PDE_FOR_PFN(_n)  ((_n) / RADEON_PTE_COUNT)
> +
> +/* Get PTE number to containing nth pfn */
> +#define RADEON_GET_PTE_FOR_PFN(_n)   ((_n) % RADEON_PTE_COUNT)
> +
> +/* Number of PDE tables to cover n PTEs */
> +#define RADEON_PDE_COUNT_FOR_N_PAGES(_n) \
> + (((_n) + RADEON_PTE_COUNT - 1) / RADEON_PTE_COUNT)
> +
> +/* Number of PDE tables to cover max_pfn (maximum number of PTEs) */
> +#define RADEON_TOTAL_PDE_COUNT(rdev) \
> + 

[PATCH 2/2] Add 2-level GPUVM pagetables support to radeon driver. v2

2012-09-15 Thread Maarten Maathuis
On Fri, Sep 14, 2012 at 7:49 PM, Dmitry Cherkasov wrote:
> +#define RADEON_PT_OFFSET(_rdev) \
> +   (RADEON_GPU_PAGE_ALIGN(RADEON_TOTAL_PDE_COUNT(rdev) * RADEON_PDE_SIZE))

Shouldn't that be _rdev too?

Also a few lines above that you use rdev instead of _rdev.

I didn't check the whole thing; I just noticed that while I was staring
at it for no reason :-)
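
To illustrate why the mismatch matters: a macro whose body names a variable
other than its parameter silently picks up whatever is in scope at the call
site. The sketch below uses hypothetical EX_ names and a stand-in struct, not
the driver's types.

```c
/* Stand-in for the device structure; hypothetical. */
struct ex_dev {
	unsigned long max_pfn;
};

#define EX_PTE_COUNT 512ul

/* Parameter is _d, but the body uses d: the argument is ignored. */
#define EX_PDE_COUNT_BAD(_d) \
	(((d)->max_pfn + EX_PTE_COUNT - 1) / EX_PTE_COUNT)

/* Body uses the parameter, so the argument is honoured. */
#define EX_PDE_COUNT_GOOD(_d) \
	(((_d)->max_pfn + EX_PTE_COUNT - 1) / EX_PTE_COUNT)

/* Both macros are asked about "other"; only the good one looks at it. */
static void ex_pde_counts(struct ex_dev *d, struct ex_dev *other,
			  unsigned long *bad, unsigned long *good)
{
	*bad = EX_PDE_COUNT_BAD(other);   /* expands to d->max_pfn */
	*good = EX_PDE_COUNT_GOOD(other); /* expands to other->max_pfn */
}
```

The bad variant only compiles because a variable called d happens to exist in
the caller, which is the same reason the patch builds today.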

-- 
Far away from the primal instinct, the song seems to fade away, the
river get wider between your thoughts and the things we do and say.


[PATCH 2/2] Add 2-level GPUVM pagetables support to radeon driver. v2

2012-09-14 Thread Dmitry Cherkasov
PDE/PTE update code uses CP ring for memory writes.
All page table entries are preallocated for now in alloc_pt().

It is submitted as a single patch because it is hard to divide it into several
patches that each compile and don't break anything when applied separately.

Tested on cayman card.

v2 changes:
* rebased on top of "refactor set_page chipset interface v2"
* code cleanups

Signed-off-by: Dmitry Cherkasov 
---
 drivers/gpu/drm/radeon/ni.c          |    4 +-
 drivers/gpu/drm/radeon/radeon.h      |    4 +-
 drivers/gpu/drm/radeon/radeon_gart.c |  145 +++---
 drivers/gpu/drm/radeon/si.c          |    4 +-
 4 files changed, 140 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 0355c8d..ffa9f7e 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -782,7 +782,7 @@ static int cayman_pcie_gart_enable(struct radeon_device *rdev)
   (u32)(rdev->dummy_page.addr >> 12));
WREG32(VM_CONTEXT1_CNTL2, 0);
WREG32(VM_CONTEXT1_CNTL, 0);
-   WREG32(VM_CONTEXT1_CNTL, ENABLE_CONTEXT | PAGE_TABLE_DEPTH(0) |
+   WREG32(VM_CONTEXT1_CNTL, ENABLE_CONTEXT | PAGE_TABLE_DEPTH(1) |
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT);

cayman_pcie_gart_tlb_flush(rdev);
@@ -1586,7 +1586,7 @@ void cayman_vm_flush(struct radeon_device *rdev, struct radeon_ib *ib)
radeon_ring_write(ring, vm->last_pfn);

radeon_ring_write(ring, PACKET0(VM_CONTEXT0_PAGE_TABLE_BASE_ADDR + (vm->id << 2), 0));
-   radeon_ring_write(ring, vm->pt_gpu_addr >> 12);
+   radeon_ring_write(ring, vm->pd_gpu_addr >> 12);

/* flush hdp cache */
radeon_ring_write(ring, PACKET0(HDP_MEM_COHERENCY_FLUSH_CNTL, 0));
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index f02ea8e..6231823 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -655,8 +655,8 @@ struct radeon_vm {
struct list_head va;
unsigned id;
unsigned last_pfn;
-   u64 pt_gpu_addr;
-   u64 *pt;
+   u64 pd_gpu_addr;
+   u64 __iomem *pd_addr;
struct radeon_sa_bo *sa_bo;
struct mutex mutex;
/* last fence for cs using this vm */
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index badc835..9c68482 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -50,6 +50,59 @@
  * This file handles the common internal GART management.
  */

+/* GPUVM defines */
+
+/* We consider the case where BLOCK_SIZE is 0 */
+/* So PDE is 19 bits long, PTE is 9 and OFFSET is 12 */
+#define RADEON_BLOCK_SIZE   0
+
+/* By default there are 512 entries in Page Table */
+#define RADEON_DEFAULT_PTE_COUNT (1 << 9)
+
+/* number of PTEs in Page Table */
+#define RADEON_PTE_COUNT (RADEON_DEFAULT_PTE_COUNT << RADEON_BLOCK_SIZE)
+
+/* Get last PDE number containing nth PTE */
+#define RADEON_GET_LAST_PDE_FOR_PFN(_n)((_n) / RADEON_PTE_COUNT)
+
+/* Get PTE number to containing nth pfn */
+#define RADEON_GET_PTE_FOR_PFN(_n) ((_n) % RADEON_PTE_COUNT)
+
+/* Number of PDE tables to cover n PTEs */
+#define RADEON_PDE_COUNT_FOR_N_PAGES(_n) \
+   (((_n) + RADEON_PTE_COUNT - 1) / RADEON_PTE_COUNT)
+
+/* Number of PDE tables to cover max_pfn (maximum number of PTEs) */
+#define RADEON_TOTAL_PDE_COUNT(rdev) \
+   RADEON_PDE_COUNT_FOR_N_PAGES(rdev->vm_manager.max_pfn)
+
+#define RADEON_PTE_SIZE 8
+#define RADEON_PDE_SIZE 8
+
+/* offset for npde-th PDE starting from beginning of PDE table */
+#define RADEON_PDE_OFFSET(_rdev, _npde) ((_npde) * RADEON_PDE_SIZE)
+
+#define RADEON_PT_OFFSET(_rdev) \
+   (RADEON_GPU_PAGE_ALIGN(RADEON_TOTAL_PDE_COUNT(rdev) * RADEON_PDE_SIZE))
+
+/* offset for npte-th PTE of npde-th PDE starting from beginning of PDE table */
+#define RADEON_PTE_OFFSET(_rdev, _npde, _npte) \
+   (RADEON_PT_OFFSET(_rdev) +  \
+ (_npde) * RADEON_PTE_COUNT   * RADEON_PTE_SIZE + \
+ (_npte) * RADEON_PTE_SIZE)
+
+
+#define RADEON_PT_DISTANCE \
+   (RADEON_PTE_COUNT * RADEON_PTE_SIZE)
+
+/* cpu address of gpuvm page table */
+#define RADEON_BASE_CPU_ADDR(_vm)  \
+   radeon_sa_bo_cpu_addr(vm->sa_bo)
+
+/* gpu address of gpuvm page table */
+#define RADEON_BASE_GPU_ADDR(_vm)  \
+   radeon_sa_bo_gpu_addr(vm->sa_bo)
+
 /*
  * Common GART table functions.
  */
@@ -490,7 +543,6 @@ static void radeon_vm_free_pt(struct radeon_device *rdev,

list_del_init(&vm->list);
radeon_sa_bo_free(rdev, &vm->sa_bo, vm->fence);
-   vm->pt = NULL;

list_for_each_entry(bo_va, &vm->va, vm_list) {
bo_va->valid = false;
@@ -547,6 +599,11 @@ int
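
The offset macros in the patch above describe a simple layout: the page
directory comes first, padded to a GPU page, followed by all page tables back
to back. A minimal sketch of that arithmetic follows, assuming a 4096-byte GPU
page for the alignment step and using hypothetical ex_ helpers in place of the
macros.

```c
#include <stdint.h>

#define EX_PTE_COUNT     512u  /* 1 << 9, BLOCK_SIZE == 0 as in the patch */
#define EX_ENTRY_SIZE    8u    /* RADEON_PTE_SIZE and RADEON_PDE_SIZE */
#define EX_GPU_PAGE_SIZE 4096u /* assumed GPU page size used for alignment */

/* Round a byte count up to the next GPU page boundary. */
static uint64_t ex_gpu_page_align(uint64_t bytes)
{
	return (bytes + EX_GPU_PAGE_SIZE - 1) & ~(uint64_t)(EX_GPU_PAGE_SIZE - 1);
}

/* Number of PDEs needed to cover max_pfn pages (rounds up). */
static uint64_t ex_total_pde_count(uint64_t max_pfn)
{
	return (max_pfn + EX_PTE_COUNT - 1) / EX_PTE_COUNT;
}

/* Byte offset of the first page table, right after the padded directory. */
static uint64_t ex_pt_offset(uint64_t max_pfn)
{
	return ex_gpu_page_align(ex_total_pde_count(max_pfn) * EX_ENTRY_SIZE);
}

/* Byte offset of the PTE that maps page frame pfn. */
static uint64_t ex_pte_offset(uint64_t max_pfn, uint64_t pfn)
{
	uint64_t npde = pfn / EX_PTE_COUNT; /* which page table */
	uint64_t npte = pfn % EX_PTE_COUNT; /* which entry inside it */

	return ex_pt_offset(max_pfn) +
	       npde * EX_PTE_COUNT * EX_ENTRY_SIZE +
	       npte * EX_ENTRY_SIZE;
}
```

Because the page tables are contiguous, the PTE offset reduces to the page
table base plus pfn * 8. That only holds while all tables are preallocated in
one block, as the commit message says they are for now.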