Re: [PATCH 01/12] drm/amdgpu: move ring helpers to amdgpu_ring.h
Thank you Christian! The series is Acked-by: Felix Kuehling

Minor nit-pick in patch 6: I spotted 4-space indentation in
amdgpu_ttm_backend_bind.

I'm looking at patches 5 and 9 more closely, because I'll need to make
similar changes to the KFD IPC copy code.

Regards,
  Felix

On 17-06-30 07:22 AM, Christian König wrote:
> From: Christian König
>
> Keep them where they belong.
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h      | 44
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 42 ++
>  2 files changed, 42 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index ab1dad2..810796a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1801,50 +1801,6 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device *adev);
>  #define RBIOS16(i) (RBIOS8(i) | (RBIOS8((i)+1) << 8))
>  #define RBIOS32(i) ((RBIOS16(i)) | (RBIOS16((i)+2) << 16))
>
> -/*
> - * RING helpers.
> - */
> -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> -{
> -	if (ring->count_dw <= 0)
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw--;
> -}
> -
> -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *src, int count_dw)
> -{
> -	unsigned occupied, chunk1, chunk2;
> -	void *dst;
> -
> -	if (unlikely(ring->count_dw < count_dw)) {
> -		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> -		return;
> -	}
> -
> -	occupied = ring->wptr & ring->buf_mask;
> -	dst = (void *)&ring->ring[occupied];
> -	chunk1 = ring->buf_mask + 1 - occupied;
> -	chunk1 = (chunk1 >= count_dw) ? count_dw : chunk1;
> -	chunk2 = count_dw - chunk1;
> -	chunk1 <<= 2;
> -	chunk2 <<= 2;
> -
> -	if (chunk1)
> -		memcpy(dst, src, chunk1);
> -
> -	if (chunk2) {
> -		src += chunk1;
> -		dst = (void *)ring->ring;
> -		memcpy(dst, src, chunk2);
> -	}
> -
> -	ring->wptr += count_dw;
> -	ring->wptr &= ring->ptr_mask;
> -	ring->count_dw -= count_dw;
> -}
> -
>  static inline struct amdgpu_sdma_instance *
>  amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
>  {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> index bc8dec9..04cbc3a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> @@ -212,4 +212,46 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
>
>  }
>
> +static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
> +{
> +	if (ring->count_dw <= 0)
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +	ring->ring[ring->wptr++ & ring->buf_mask] = v;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw--;
> +}
> +
> +static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
> +					      void *src, int count_dw)
> +{
> +	unsigned occupied, chunk1, chunk2;
> +	void *dst;
> +
> +	if (unlikely(ring->count_dw < count_dw)) {
> +		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
> +		return;
> +	}
> +
> +	occupied = ring->wptr & ring->buf_mask;
> +	dst = (void *)&ring->ring[occupied];
> +	chunk1 = ring->buf_mask + 1 - occupied;
> +	chunk1 = (chunk1 >= count_dw) ? count_dw : chunk1;
> +	chunk2 = count_dw - chunk1;
> +	chunk1 <<= 2;
> +	chunk2 <<= 2;
> +
> +	if (chunk1)
> +		memcpy(dst, src, chunk1);
> +
> +	if (chunk2) {
> +		src += chunk1;
> +		dst = (void *)ring->ring;
> +		memcpy(dst, src, chunk2);
> +	}
> +
> +	ring->wptr += count_dw;
> +	ring->wptr &= ring->ptr_mask;
> +	ring->count_dw -= count_dw;
> +}
> +
>  #endif

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/5] drm/amdgpu/atombios: add function for whether we need asic_init
Check the atom scratch registers to see if asic_init is complete or not.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 10 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 8e7a7b9..ce44358 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1756,6 +1756,16 @@ void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
 	WREG32(adev->bios_scratch_reg_offset + 3, tmp);
 }
 
+bool amdgpu_atombios_scratch_need_asic_init(struct amdgpu_device *adev)
+{
+	u32 tmp = RREG32(adev->bios_scratch_reg_offset + 7);
+
+	if (tmp & ATOM_S7_ASIC_INIT_COMPLETE_MASK)
+		return false;
+	else
+		return true;
+}
+
 /* Atom needs data in little endian format
  * so swap as appropriate when copying data to
  * or from atom. Note that atom operates on
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
index 38d0fe3..b0d5d1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
@@ -200,6 +200,7 @@ void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev);
 void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev);
 void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
 					      bool hung);
+bool amdgpu_atombios_scratch_need_asic_init(struct amdgpu_device *adev);
 void amdgpu_atombios_copy_swap(u8 *dst, u8 *src, u8 num_bytes, bool to_le);
 int amdgpu_atombios_get_max_vddc(struct amdgpu_device *adev, u8 voltage_type,
-- 
2.5.5
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 5/5] drm/amdgpu: remove get_memsize asic callback
We don't need this anymore since we use bios scratch registers instead for checking if we need asic_init. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 --- drivers/gpu/drm/amd/amdgpu/cik.c| 6 -- drivers/gpu/drm/amd/amdgpu/si.c | 6 -- drivers/gpu/drm/amd/amdgpu/soc15.c | 9 - drivers/gpu/drm/amd/amdgpu/vi.c | 6 -- 5 files changed, 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index ab1dad2..466c97e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1378,8 +1378,6 @@ struct amdgpu_asic_funcs { /* static power management */ int (*get_pcie_lanes)(struct amdgpu_device *adev); void (*set_pcie_lanes)(struct amdgpu_device *adev, int lanes); - /* get config memsize register */ - u32 (*get_config_memsize)(struct amdgpu_device *adev); }; /* @@ -1875,7 +1873,6 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring) #define amdgpu_asic_read_disabled_bios(adev) (adev)->asic_funcs->read_disabled_bios((adev)) #define amdgpu_asic_read_bios_from_rom(adev, b, l) (adev)->asic_funcs->read_bios_from_rom((adev), (b), (l)) #define amdgpu_asic_read_register(adev, se, sh, offset, v)((adev)->asic_funcs->read_register((adev), (se), (sh), (offset), (v))) -#define amdgpu_asic_get_config_memsize(adev) (adev)->asic_funcs->get_config_memsize((adev)) #define amdgpu_gart_flush_gpu_tlb(adev, vmid) (adev)->gart.gart_funcs->flush_gpu_tlb((adev), (vmid)) #define amdgpu_gart_set_pte_pde(adev, pt, idx, addr, flags) (adev)->gart.gart_funcs->set_pte_pde((adev), (pt), (idx), (addr), (flags)) #define amdgpu_gart_get_vm_pde(adev, addr) (adev)->gart.gart_funcs->get_vm_pde((adev), (addr)) diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 6ce9f80..79a0434 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1212,11 +1212,6 @@ static int cik_asic_reset(struct amdgpu_device *adev) return r; } -static u32 
cik_get_config_memsize(struct amdgpu_device *adev) -{ - return RREG32(mmCONFIG_MEMSIZE); -} - static int cik_set_uvd_clock(struct amdgpu_device *adev, u32 clock, u32 cntl_reg, u32 status_reg) { @@ -1646,7 +1641,6 @@ static const struct amdgpu_asic_funcs cik_asic_funcs = .get_xclk = &cik_get_xclk, .set_uvd_clocks = &cik_set_uvd_clocks, .set_vce_clocks = &cik_set_vce_clocks, - .get_config_memsize = &cik_get_config_memsize, }; static int cik_common_early_init(void *handle) diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c index 3bd6332..02ce8ff 100644 --- a/drivers/gpu/drm/amd/amdgpu/si.c +++ b/drivers/gpu/drm/amd/amdgpu/si.c @@ -1156,11 +1156,6 @@ static int si_asic_reset(struct amdgpu_device *adev) return 0; } -static u32 si_get_config_memsize(struct amdgpu_device *adev) -{ - return RREG32(mmCONFIG_MEMSIZE); -} - static void si_vga_set_state(struct amdgpu_device *adev, bool state) { uint32_t temp; @@ -1212,7 +1207,6 @@ static const struct amdgpu_asic_funcs si_asic_funcs = .get_xclk = &si_get_xclk, .set_uvd_clocks = &si_set_uvd_clocks, .set_vce_clocks = NULL, - .get_config_memsize = &si_get_config_memsize, }; static uint32_t si_get_rev_id(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 8a4b76d..b3cb8f6 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -198,14 +198,6 @@ static void soc15_didt_wreg(struct amdgpu_device *adev, u32 reg, u32 v) spin_unlock_irqrestore(&adev->didt_idx_lock, flags); } -static u32 soc15_get_config_memsize(struct amdgpu_device *adev) -{ - if (adev->flags & AMD_IS_APU) - return nbio_v7_0_get_memsize(adev); - else - return nbio_v6_1_get_memsize(adev); -} - static const u32 vega10_golden_init[] = { }; @@ -553,7 +545,6 @@ static const struct amdgpu_asic_funcs soc15_asic_funcs = .get_xclk = &soc15_get_xclk, .set_uvd_clocks = &soc15_set_uvd_clocks, .set_vce_clocks = &soc15_set_vce_clocks, - .get_config_memsize 
= &soc15_get_config_memsize, }; static int soc15_common_early_init(void *handle) diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index 18bb3cb..3e58c31 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -706,11 +706,6 @@ static int vi_asic_reset(struct amdgpu_device *adev) return r; } -static u32 vi_get_config_memsize(struct amdgpu_device *adev) -{ - return RREG32(mmCONFIG_MEMSIZE); -} - static int vi_set_uvd_clock(struct amdgpu_device *adev, u32 clock, u32 cntl_reg, u32 status_reg) { @@ -862,7 +857,6 @@ static const struct amdgpu_asic_funcs vi_asic_funcs = .get_x
[PATCH 1/5] drm/amdgpu/atombios: use bios_scratch_reg_offset for atombios
Align with the atomfirmware code.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 1e8e112..8e7a7b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1686,7 +1686,7 @@ void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock)
 {
 	uint32_t bios_6_scratch;
 
-	bios_6_scratch = RREG32(mmBIOS_SCRATCH_6);
+	bios_6_scratch = RREG32(adev->bios_scratch_reg_offset + 6);
 
 	if (lock) {
 		bios_6_scratch |= ATOM_S6_CRITICAL_STATE;
@@ -1696,15 +1696,17 @@ void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock)
 		bios_6_scratch |= ATOM_S6_ACC_MODE;
 	}
 
-	WREG32(mmBIOS_SCRATCH_6, bios_6_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
 }
 
 void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
 {
 	uint32_t bios_2_scratch, bios_6_scratch;
 
-	bios_2_scratch = RREG32(mmBIOS_SCRATCH_2);
-	bios_6_scratch = RREG32(mmBIOS_SCRATCH_6);
+	adev->bios_scratch_reg_offset = mmBIOS_SCRATCH_0;
+
+	bios_2_scratch = RREG32(adev->bios_scratch_reg_offset + 2);
+	bios_6_scratch = RREG32(adev->bios_scratch_reg_offset + 6);
 
 	/* let the bios control the backlight */
 	bios_2_scratch &= ~ATOM_S2_VRI_BRIGHT_ENABLE;
@@ -1715,8 +1717,8 @@ void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
 	/* clear the vbios dpms state */
 	bios_2_scratch &= ~ATOM_S2_DEVICE_DPMS_STATE;
 
-	WREG32(mmBIOS_SCRATCH_2, bios_2_scratch);
-	WREG32(mmBIOS_SCRATCH_6, bios_6_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 2, bios_2_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
 }
 
 void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev)
@@ -1724,7 +1726,7 @@ void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		adev->bios_scratch[i] = RREG32(mmBIOS_SCRATCH_0 + i);
+		adev->bios_scratch[i] = RREG32(adev->bios_scratch_reg_offset + i);
 }
 
 void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev)
@@ -1738,20 +1740,20 @@ void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev)
 	adev->bios_scratch[7] &= ~ATOM_S7_ASIC_INIT_COMPLETE_MASK;
 
 	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		WREG32(mmBIOS_SCRATCH_0 + i, adev->bios_scratch[i]);
+		WREG32(adev->bios_scratch_reg_offset + i, adev->bios_scratch[i]);
 }
 
 void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
 					      bool hung)
 {
-	u32 tmp = RREG32(mmBIOS_SCRATCH_3);
+	u32 tmp = RREG32(adev->bios_scratch_reg_offset + 3);
 
 	if (hung)
 		tmp |= ATOM_S3_ASIC_GUI_ENGINE_HUNG;
 	else
 		tmp &= ~ATOM_S3_ASIC_GUI_ENGINE_HUNG;
 
-	WREG32(mmBIOS_SCRATCH_3, tmp);
+	WREG32(adev->bios_scratch_reg_offset + 3, tmp);
 }
 
 /* Atom needs data in little endian format
-- 
2.5.5
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 4/5] drm/amdgpu: check scratch registers to see if we need post
Rather than checking the CONFIG_MEMSIZE register, as that may not be
reliable on some APUs.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 63f4bed..9d08f53 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -716,20 +716,12 @@ void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
  */
 bool amdgpu_need_post(struct amdgpu_device *adev)
 {
-	uint32_t reg;
-
 	if (adev->has_hw_reset) {
 		adev->has_hw_reset = false;
 		return true;
 	}
-
-	/* then check MEM_SIZE, in case the crtcs are off */
-	reg = amdgpu_asic_get_config_memsize(adev);
-
-	if ((reg != 0) && (reg != 0x))
-		return false;
-
-	return true;
+	return amdgpu_atombios_scratch_need_asic_init(adev);
 }
 
 static bool amdgpu_vpost_needed(struct amdgpu_device *adev)
-- 
2.5.5
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/5] drm/amdgpu: unify some atombios/atomfirmware scratch reg functions
Now that we use a pointer to the scratch reg start offset, most of the functions were duplicated. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 35 drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h | 4 --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++--- drivers/gpu/drm/amd/amdgpu/soc15.c | 6 ++-- 4 files changed, 7 insertions(+), 58 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c index 4bdda56..9ddfe34 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c @@ -66,41 +66,6 @@ void amdgpu_atomfirmware_scratch_regs_init(struct amdgpu_device *adev) } } -void amdgpu_atomfirmware_scratch_regs_save(struct amdgpu_device *adev) -{ - int i; - - for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++) - adev->bios_scratch[i] = RREG32(adev->bios_scratch_reg_offset + i); -} - -void amdgpu_atomfirmware_scratch_regs_restore(struct amdgpu_device *adev) -{ - int i; - - /* -* VBIOS will check ASIC_INIT_COMPLETE bit to decide if -* execute ASIC_Init posting via driver -*/ - adev->bios_scratch[7] &= ~ATOM_S7_ASIC_INIT_COMPLETE_MASK; - - for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++) - WREG32(adev->bios_scratch_reg_offset + i, adev->bios_scratch[i]); -} - -void amdgpu_atomfirmware_scratch_regs_engine_hung(struct amdgpu_device *adev, - bool hung) -{ - u32 tmp = RREG32(adev->bios_scratch_reg_offset + 3); - - if (hung) - tmp |= ATOM_S3_ASIC_GUI_ENGINE_HUNG; - else - tmp &= ~ATOM_S3_ASIC_GUI_ENGINE_HUNG; - - WREG32(adev->bios_scratch_reg_offset + 3, tmp); -} - int amdgpu_atomfirmware_allocate_fb_scratch(struct amdgpu_device *adev) { struct atom_context *ctx = adev->mode_info.atom_context; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h index a2c3ebe..907e48f6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h @@ -26,10 +26,6 @@ bool amdgpu_atomfirmware_gpu_supports_virtualization(struct amdgpu_device *adev); void amdgpu_atomfirmware_scratch_regs_init(struct amdgpu_device *adev); -void amdgpu_atomfirmware_scratch_regs_save(struct amdgpu_device *adev); -void amdgpu_atomfirmware_scratch_regs_restore(struct amdgpu_device *adev); -void amdgpu_atomfirmware_scratch_regs_engine_hung(struct amdgpu_device *adev, - bool hung); int amdgpu_atomfirmware_allocate_fb_scratch(struct amdgpu_device *adev); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 5b1220f..63f4bed 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2438,10 +2438,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon) */ amdgpu_bo_evict_vram(adev); - if (adev->is_atom_fw) - amdgpu_atomfirmware_scratch_regs_save(adev); - else - amdgpu_atombios_scratch_regs_save(adev); + amdgpu_atombios_scratch_regs_save(adev); pci_save_state(dev->pdev); if (suspend) { /* Shut down the device */ @@ -2490,10 +2487,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon) if (r) goto unlock; } - if (adev->is_atom_fw) - amdgpu_atomfirmware_scratch_regs_restore(adev); - else - amdgpu_atombios_scratch_regs_restore(adev); + amdgpu_atombios_scratch_regs_restore(adev); /* post card */ if (amdgpu_need_post(adev)) { @@ -2926,15 +2920,9 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev) r = amdgpu_suspend(adev); retry: - if (adev->is_atom_fw) - amdgpu_atomfirmware_scratch_regs_save(adev); - else - amdgpu_atombios_scratch_regs_save(adev); + amdgpu_atombios_scratch_regs_save(adev); r = amdgpu_asic_reset(adev); - if (adev->is_atom_fw) - amdgpu_atomfirmware_scratch_regs_restore(adev); - else - amdgpu_atombios_scratch_regs_restore(adev); + amdgpu_atombios_scratch_regs_restore(adev); /* post card */ 
amdgpu_atom_asic_init(adev->mode_info.atom_context); diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c index 9210126..8a4b76d 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc15.c +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c @@ -25,7 +25,7 @@ #include #inc
[PATCH libdrm 2/2] radeon: use asic id table to get chipset name
Change-Id: I24b6624789d1a9dc0fd3a446b0e6f21ed5183ff2 Signed-off-by: Samuel Li --- radeon/Makefile.am | 6 +++ radeon/Makefile.sources | 6 ++- radeon/radeon_asic_id.c | 106 radeon/radeon_asic_id.h | 37 + 4 files changed, 153 insertions(+), 2 deletions(-) create mode 100644 radeon/radeon_asic_id.c create mode 100644 radeon/radeon_asic_id.h diff --git a/radeon/Makefile.am b/radeon/Makefile.am index e241531..69407bc 100644 --- a/radeon/Makefile.am +++ b/radeon/Makefile.am @@ -30,6 +30,12 @@ AM_CFLAGS = \ $(PTHREADSTUBS_CFLAGS) \ -I$(top_srcdir)/include/drm +libdrmdatadir = @libdrmdatadir@ +ASIC_ID_TABLE_NUM_ENTRIES := $(shell egrep -ci '^[0-9a-f]{4},.*[0-9a-f]+,' \ + $(top_srcdir)/data/amdgpu.ids) +AM_CPPFLAGS = -DRADEON_ASIC_ID_TABLE=\"${libdrmdatadir}/amdgpu.ids\" \ + -DRADEON_ASIC_ID_TABLE_NUM_ENTRIES=$(ASIC_ID_TABLE_NUM_ENTRIES) + libdrm_radeon_la_LTLIBRARIES = libdrm_radeon.la libdrm_radeon_ladir = $(libdir) libdrm_radeon_la_LDFLAGS = -version-number 1:0:1 -no-undefined diff --git a/radeon/Makefile.sources b/radeon/Makefile.sources index 1cf482a..8eaf1c6 100644 --- a/radeon/Makefile.sources +++ b/radeon/Makefile.sources @@ -4,7 +4,8 @@ LIBDRM_RADEON_FILES := \ radeon_cs_space.c \ radeon_bo.c \ radeon_cs.c \ - radeon_surface.c + radeon_surface.c \ + radeon_asic_id.c LIBDRM_RADEON_H_FILES := \ radeon_bo.h \ @@ -14,7 +15,8 @@ LIBDRM_RADEON_H_FILES := \ radeon_cs_gem.h \ radeon_bo_int.h \ radeon_cs_int.h \ - r600_pci_ids.h + r600_pci_ids.h \ + radeon_asic_id.h LIBDRM_RADEON_BOF_FILES := \ bof.c \ diff --git a/radeon/radeon_asic_id.c b/radeon/radeon_asic_id.c new file mode 100644 index 000..b03502b --- /dev/null +++ b/radeon/radeon_asic_id.c @@ -0,0 +1,106 @@ +/* + * Copyright 2017 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ + +/** + * \file radeon_asic_id.c + * + * Implementation of chipset name lookup functions for radeon device + * + */ + + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif + +//#include +//#include +#include +#include +#include +#include +#include + +#include "xf86atomic.h" +#include "util/util_asic_id.h" +#include "radeon_asic_id.h" + + +static pthread_mutex_t asic_id_mutex = PTHREAD_MUTEX_INITIALIZER; +static struct util_asic_id *radeon_asic_ids; +static atomic_t refcount; + +int radeon_asic_id_initialize(void) +{ + int r = 0; + pthread_mutex_lock(&asic_id_mutex); + if (radeon_asic_ids) { + atomic_inc(&refcount); + pthread_mutex_unlock(&asic_id_mutex); + return r; + } + + r = util_parse_asic_ids(&radeon_asic_ids, RADEON_ASIC_ID_TABLE, + RADEON_ASIC_ID_TABLE_NUM_ENTRIES); + if (r) { + fprintf(stderr, "%s: Cannot parse ASIC IDs, 0x%x.", + __func__, r); + } else + atomic_inc(&refcount); + + pthread_mutex_unlock(&asic_id_mutex); + return r; +} + +void radeon_asic_id_deinitialize(void) +{ + const struct util_asic_id *id; + + assert(atomic_read(&refcount) > 0); + pthread_mutex_lock(&asic_id_mutex); + if (atomic_dec_and_test(&refcount)) { + if (radeon_asic_ids) { + for (id = radeon_asic_ids; id->did; id++) + free(id->marketing_name); + free(radeon_asic_ids); + radeon_asic_ids = NULL; + } + } + pthread_mutex_unlock(&asic_id_mutex); +} + +const char *radeon_get_marketing_name(uint32_t device_id, uint32_t pci_rev_id) +{ + const struct util_asic_id *id; + + if (!radeon_asic_ids) + return NULL; + + for (id = radeon_asic_ids; id->did; id++
[PATCH libdrm 1/2] util: move some files to an ASIC neutral directory.
Change-Id: Iac1c4870253e8b8860a61b7cf175e7a25cc95921 Signed-off-by: Samuel Li --- Makefile.sources | 10 +- amdgpu/Makefile.am | 3 +- amdgpu/Makefile.sources | 7 +- amdgpu/amdgpu_asic_id.c | 219 --- amdgpu/amdgpu_device.c | 7 +- amdgpu/amdgpu_internal.h | 11 +- amdgpu/util_hash.c | 387 --- amdgpu/util_hash.h | 107 - amdgpu/util_hash_table.c | 262 amdgpu/util_hash_table.h | 73 - util/util_asic_id.c | 217 ++ util/util_asic_id.h | 39 + util/util_hash.c | 387 +++ util/util_hash.h | 107 + util/util_hash_table.c | 262 util/util_hash_table.h | 73 + 16 files changed, 1102 insertions(+), 1069 deletions(-) delete mode 100644 amdgpu/amdgpu_asic_id.c delete mode 100644 amdgpu/util_hash.c delete mode 100644 amdgpu/util_hash.h delete mode 100644 amdgpu/util_hash_table.c delete mode 100644 amdgpu/util_hash_table.h create mode 100644 util/util_asic_id.c create mode 100644 util/util_asic_id.h create mode 100644 util/util_hash.c create mode 100644 util/util_hash.h create mode 100644 util/util_hash_table.c create mode 100644 util/util_hash_table.h diff --git a/Makefile.sources b/Makefile.sources index 10aa1d0..f2b0ec6 100644 --- a/Makefile.sources +++ b/Makefile.sources @@ -10,12 +10,18 @@ LIBDRM_FILES := \ libdrm_macros.h \ libdrm_lists.h \ util_double_list.h \ - util_math.h + util_math.h \ + util/util_asic_id.c \ + util/util_hash.c \ + util/util_hash_table.c LIBDRM_H_FILES := \ libsync.h \ xf86drm.h \ - xf86drmMode.h + xf86drmMode.h \ + util/util_asic_id.h \ + util/util_hash.h \ + util/util_hash_table.h LIBDRM_INCLUDE_H_FILES := \ include/drm/drm.h \ diff --git a/amdgpu/Makefile.am b/amdgpu/Makefile.am index 66f6f67..c3e83d6 100644 --- a/amdgpu/Makefile.am +++ b/amdgpu/Makefile.am @@ -28,7 +28,8 @@ AM_CFLAGS = \ $(WARN_CFLAGS) \ -I$(top_srcdir) \ $(PTHREADSTUBS_CFLAGS) \ - -I$(top_srcdir)/include/drm + -I$(top_srcdir)/include/drm \ + -I$(top_srcdir)/util libdrmdatadir = @libdrmdatadir@ ASIC_ID_TABLE_NUM_ENTRIES := $(shell egrep -ci '^[0-9a-f]{4},.*[0-9a-f]+,' \ diff --git 
a/amdgpu/Makefile.sources b/amdgpu/Makefile.sources index bc3abaa..23e9e69 100644 --- a/amdgpu/Makefile.sources +++ b/amdgpu/Makefile.sources @@ -1,15 +1,10 @@ LIBDRM_AMDGPU_FILES := \ - amdgpu_asic_id.c \ amdgpu_bo.c \ amdgpu_cs.c \ amdgpu_device.c \ amdgpu_gpu_info.c \ amdgpu_internal.h \ - amdgpu_vamgr.c \ - util_hash.c \ - util_hash.h \ - util_hash_table.c \ - util_hash_table.h + amdgpu_vamgr.c LIBDRM_AMDGPU_H_FILES := \ amdgpu.h diff --git a/amdgpu/amdgpu_asic_id.c b/amdgpu/amdgpu_asic_id.c deleted file mode 100644 index 3a88896..000 --- a/amdgpu/amdgpu_asic_id.c +++ /dev/null @@ -1,219 +0,0 @@ -/* - * Copyright © 2017 Advanced Micro Devices, Inc. - * All Rights Reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR - * OTHER DEALINGS IN THE SOFTWARE. 
- * - */ - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include -#include -#include -#include -#include -#include -#include - -#include "xf86drm.h" -#include "amdgpu_drm.h" -#include "amdgpu_internal.h" - -static int parse_one_line(const char *line, struct amdgpu_asic_id *id) -{ - char *buf, *saveptr; - char *s_did; - char *s_rid; - char *s_name; - char *endptr; - int r = 0; - - buf = strdup(line); - if (!buf) - return -ENOMEM; - - /* ignore empty line and commented line */ - if (strlen(line) == 0 || line[0] == '#') { -
Re: [PATCH 12/12] drm/amdgpu: add gtt_sys_limit
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote:
> From: Christian König
>
> Limit the size of the GART table for the system domain.
>
> This saves us a bunch of visible VRAM, but also limits the maximum
> BO size we can swap out.

The last phrase can be dropped as it's no longer relevant.

Acked-by: Alex Deucher

>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h         | 2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 6 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c     | 4
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c    | 8 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 --
>  5 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 4a2b33d..ef8e6b9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -76,6 +76,7 @@
>  extern int amdgpu_modeset;
>  extern int amdgpu_vram_limit;
>  extern int amdgpu_gart_size;
> +extern unsigned amdgpu_gart_sys_limit;
>  extern int amdgpu_moverate;
>  extern int amdgpu_benchmarking;
>  extern int amdgpu_testing;
> @@ -605,6 +606,7 @@ struct amdgpu_mc {
>  	u64 mc_vram_size;
>  	u64 visible_vram_size;
>  	u64 gtt_size;
> +	u64 gtt_sys_limit;
>  	u64 gtt_start;
>  	u64 gtt_end;
>  	u64 vram_start;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5b1220f..7e3f8cb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1122,6 +1122,12 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
>  		}
>  	}
>
> +	if (amdgpu_gart_sys_limit < 32) {
> +		dev_warn(adev->dev, "gart sys limit (%d) too small\n",
> +			 amdgpu_gart_sys_limit);
> +		amdgpu_gart_sys_limit = 32;
> +	}
> +
>  	amdgpu_check_vm_size(adev);
>
>  	amdgpu_check_block_size(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 5a1d794..907ae5e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -75,6 +75,7 @@
>
>  int amdgpu_vram_limit = 0;
>  int amdgpu_gart_size = -1; /* auto */
> +unsigned amdgpu_gart_sys_limit = 256;
>  int amdgpu_moverate = -1; /* auto */
>  int amdgpu_benchmarking = 0;
>  int amdgpu_testing = 0;
> @@ -124,6 +125,9 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
>  MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc., -1 = auto)");
>  module_param_named(gartsize, amdgpu_gart_size, int, 0600);
>
> +MODULE_PARM_DESC(gartlimit, "GART limit for the system domain in megabytes (default 256)");
> +module_param_named(gartlimit, amdgpu_gart_sys_limit, int, 0600);
> +
>  MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, etc., -1=auto, 0=1=disabled)");
>  module_param_named(moverate, amdgpu_moverate, int, 0600);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index d99b2b2..f82eeaa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -70,6 +70,9 @@ void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
>  				     adev->mc.mc_vram_size);
>  	else
>  		adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
> +
> +	adev->mc.gtt_sys_limit = min((uint64_t)amdgpu_gart_sys_limit << 20,
> +				     adev->mc.gtt_size);
>  }
>
>  /**
> @@ -384,8 +387,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
>  	if (r)
>  		return r;
>  	/* Compute table size */
> -	adev->gart.num_cpu_pages = adev->mc.gtt_size / PAGE_SIZE;
> -	adev->gart.num_gpu_pages = adev->mc.gtt_size / AMDGPU_GPU_PAGE_SIZE;
> +	adev->gart.num_cpu_pages = adev->mc.gtt_sys_limit / PAGE_SIZE;
> +	adev->gart.num_gpu_pages = adev->mc.gtt_sys_limit /
> +		AMDGPU_GPU_PAGE_SIZE;
>  	DRM_INFO("GART: num cpu pages %u, num gpu pages %u\n",
>  		 adev->gart.num_cpu_pages, adev->gart.num_gpu_pages);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index a0976dc..9b516c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> @@ -42,6 +42,7 @@ struct amdgpu_gtt_mgr {
>  static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man,
>  			       unsigned long p_size)
>  {
> +	struct amdgpu_device *adev = amdgpu_ttm
Re: [PATCH 11/12] drm/amdgpu: remove maximum BO size limitation.
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > We can finally remove this now. > > Signed-off-by: Christian König Woot! Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 --- > 1 file changed, 11 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index 96c4493..2382785 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -58,17 +58,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, > unsigned long size, > alignment = PAGE_SIZE; > } > > - if (!(initial_domain & (AMDGPU_GEM_DOMAIN_GDS | AMDGPU_GEM_DOMAIN_GWS > | AMDGPU_GEM_DOMAIN_OA))) { > - /* Maximum bo size is the unpinned gtt size since we use the > gtt to > -* handle vram to system pool migrations. > -*/ > - max_size = adev->mc.gtt_size - adev->gart_pin_size; > - if (size > max_size) { > - DRM_DEBUG("Allocation size %ldMb bigger than %ldMb > limit\n", > - size >> 20, max_size >> 20); > - return -ENOMEM; > - } > - } > retry: > r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain, > flags, NULL, NULL, &robj); > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 10/12] drm/amdgpu: stop mapping BOs to GTT
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > No need to map BOs to GTT on eviction and intermediate transfers any more. > > Signed-off-by: Christian König Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 19 ++- > 1 file changed, 2 insertions(+), 17 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 247ce21..e1ebcba 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -200,7 +200,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object > *bo, > .lpfn = 0, > .flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM > }; > - unsigned i; > > if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) { > placement->placement = &placements; > @@ -218,20 +217,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object > *bo, > amdgpu_ttm_placement_from_domain(abo, > AMDGPU_GEM_DOMAIN_CPU); > } else { > amdgpu_ttm_placement_from_domain(abo, > AMDGPU_GEM_DOMAIN_GTT); > - for (i = 0; i < abo->placement.num_placement; ++i) { > - if (!(abo->placements[i].flags & > - TTM_PL_FLAG_TT)) > - continue; > - > - if (abo->placements[i].lpfn) > - continue; > - > - /* set an upper limit to force directly > -* allocating address space for the BO. 
> -*/ > - abo->placements[i].lpfn = > - adev->mc.gtt_size >> PAGE_SHIFT; > - } > } > break; > case TTM_PL_TT: > @@ -391,7 +376,7 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object > *bo, > placement.num_busy_placement = 1; > placement.busy_placement = &placements; > placements.fpfn = 0; > - placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT; > + placements.lpfn = 0; > placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT; > r = ttm_bo_mem_space(bo, &placement, &tmp_mem, > interruptible, no_wait_gpu); > @@ -438,7 +423,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object > *bo, > placement.num_busy_placement = 1; > placement.busy_placement = &placements; > placements.fpfn = 0; > - placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT; > + placements.lpfn = 0; > placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT; > r = ttm_bo_mem_space(bo, &placement, &tmp_mem, > interruptible, no_wait_gpu); > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 09/12] drm/amdgpu: use the GTT windows for BO moves
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > This way we don't need to map the full BO at a time any more. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 127 > +++- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 3 + > 2 files changed, 111 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index eb0d7d7..247ce21 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -47,10 +47,15 @@ > > #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT) > > +static int amdgpu_map_buffer(struct ttm_buffer_object *bo, > +struct ttm_mem_reg *mem, > +unsigned num_pages, uint64_t offset, > +struct amdgpu_ring *ring, > +uint64_t *addr); > + > static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev); > static void amdgpu_ttm_debugfs_fini(struct amdgpu_device *adev); > > - > /* > * Global memory. 
> */ > @@ -97,6 +102,9 @@ static int amdgpu_ttm_global_init(struct amdgpu_device > *adev) > goto error_bo; > } > > + mutex_init(&adev->mman.gtt_window_lock); > + adev->mman.gtt_index = 0; > + > ring = adev->mman.buffer_funcs_ring; > rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_KERNEL]; > r = amd_sched_entity_init(&ring->sched, &adev->mman.entity, > @@ -123,6 +131,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device > *adev) > if (adev->mman.mem_global_referenced) { > amd_sched_entity_fini(adev->mman.entity.sched, > &adev->mman.entity); > + mutex_destroy(&adev->mman.gtt_window_lock); > drm_global_item_unref(&adev->mman.bo_global_ref.ref); > drm_global_item_unref(&adev->mman.mem_global_ref); > adev->mman.mem_global_referenced = false; > @@ -256,10 +265,12 @@ static uint64_t amdgpu_mm_node_addr(struct > ttm_buffer_object *bo, > struct drm_mm_node *mm_node, > struct ttm_mem_reg *mem) > { > - uint64_t addr; > + uint64_t addr = 0; > > - addr = mm_node->start << PAGE_SHIFT; > - addr += bo->bdev->man[mem->mem_type].gpu_offset; > + if (mm_node->start != AMDGPU_BO_INVALID_OFFSET) { > + addr = mm_node->start << PAGE_SHIFT; > + addr += bo->bdev->man[mem->mem_type].gpu_offset; > + } > return addr; > } > > @@ -284,34 +295,41 @@ static int amdgpu_move_blit(struct ttm_buffer_object > *bo, > return -EINVAL; > } > > - if (old_mem->mem_type == TTM_PL_TT) { > - r = amdgpu_ttm_bind(bo, old_mem); > - if (r) > - return r; > - } > - > old_mm = old_mem->mm_node; > old_size = old_mm->size; > old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem); > > - if (new_mem->mem_type == TTM_PL_TT) { > - r = amdgpu_ttm_bind(bo, new_mem); > - if (r) > - return r; > - } > - > new_mm = new_mem->mm_node; > new_size = new_mm->size; > new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem); > > num_pages = new_mem->num_pages; > + mutex_lock(&adev->mman.gtt_window_lock); > while (num_pages) { > - unsigned long cur_pages = min(old_size, new_size); > + unsigned long cur_pages = min(min(old_size, new_size), 
> + > (u64)AMDGPU_GTT_MAX_TRANSFER_SIZE); > + uint64_t from = old_start, to = new_start; > struct dma_fence *next; > > - r = amdgpu_copy_buffer(ring, old_start, new_start, > + if (old_mem->mem_type == TTM_PL_TT && > + !amdgpu_gtt_mgr_is_alloced(old_mem)) { > + r = amdgpu_map_buffer(bo, old_mem, cur_pages, > + old_start, ring, &from); > + if (r) > + goto error; > + } > + > + if (new_mem->mem_type == TTM_PL_TT && > + !amdgpu_gtt_mgr_is_alloced(new_mem)) { > + r = amdgpu_map_buffer(bo, new_mem, cur_pages, > + new_start, ring, &to); > + if (r) > + goto error; > + } > + > + r = amdgpu_copy_buffer(ring, from, to, >cur_pages * PAGE_SIZE, > - bo->resv, &next, false, false); > + b
Re: [PATCH 08/12] drm/amdgpu: add amdgpu_gart_map function
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > This allows us to write the mapped PTEs into > an IB instead of the table directly. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 64 > > 2 files changed, 52 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index 810796a..4a2b33d 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -572,6 +572,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev); > void amdgpu_gart_fini(struct amdgpu_device *adev); > int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset, > int pages); > +int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset, > + int pages, dma_addr_t *dma_addr, uint64_t flags, > + void *dst); > int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset, > int pages, struct page **pagelist, > dma_addr_t *dma_addr, uint64_t flags); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c > index 8877015..d99b2b2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c > @@ -280,6 +280,43 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, > uint64_t offset, > } > > /** > + * amdgpu_gart_map - map dma_addresses into GART entries > + * > + * @adev: amdgpu_device pointer > + * @offset: offset into the GPU's gart aperture > + * @pages: number of pages to bind > + * @dma_addr: DMA addresses of pages > + * > + * Map the dma_addresses into GART entries (all asics). > + * Returns 0 for success, -EINVAL for failure. 
> + */ > +int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset, > + int pages, dma_addr_t *dma_addr, uint64_t flags, > + void *dst) > +{ > + uint64_t page_base; > + unsigned t, p; > + int i, j; > + > + if (!adev->gart.ready) { > + WARN(1, "trying to bind memory to uninitialized GART !\n"); > + return -EINVAL; > + } > + > + t = offset / AMDGPU_GPU_PAGE_SIZE; > + p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); > + > + for (i = 0; i < pages; i++, p++) { > + page_base = dma_addr[i]; > + for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) > { > + amdgpu_gart_set_pte_pde(adev, dst, t, page_base, > flags); > + page_base += AMDGPU_GPU_PAGE_SIZE; > + } > + } > + return 0; > +} > + > +/** > * amdgpu_gart_bind - bind pages into the gart page table > * > * @adev: amdgpu_device pointer > @@ -296,31 +333,28 @@ int amdgpu_gart_bind(struct amdgpu_device *adev, > uint64_t offset, > int pages, struct page **pagelist, dma_addr_t *dma_addr, > uint64_t flags) > { > - unsigned t; > - unsigned p; > - uint64_t page_base; > - int i, j; > +#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS > + unsigned i; > +#endif > + int r; > > if (!adev->gart.ready) { > WARN(1, "trying to bind memory to uninitialized GART !\n"); > return -EINVAL; > } > > - t = offset / AMDGPU_GPU_PAGE_SIZE; > - p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); > - > - for (i = 0; i < pages; i++, p++) { > #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS > + for (i = 0; i < pages; i++, p++) > adev->gart.pages[p] = pagelist[i]; > #endif > - if (adev->gart.ptr) { > - page_base = dma_addr[i]; > - for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); > j++, t++) { > - amdgpu_gart_set_pte_pde(adev, adev->gart.ptr, > t, page_base, flags); > - page_base += AMDGPU_GPU_PAGE_SIZE; > - } > - } > + > + if (adev->gart.ptr) { > + r = amdgpu_gart_map(adev, offset, pages, dma_addr, flags, > + adev->gart.ptr); > + if (r) > + return r; > } > + > mb(); > amdgpu_gart_flush_gpu_tlb(adev, 0); > return 0; > -- > 2.7.4 > > ___ > amd-gfx mailing list > 
amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 07/12] drm/amdgpu: reserve the first 2x2MB of GART
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > We want to use them as remap address space. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 5 - > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 3 +++ > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > index 6fdf83a..a0976dc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > @@ -43,12 +43,15 @@ static int amdgpu_gtt_mgr_init(struct > ttm_mem_type_manager *man, >unsigned long p_size) > { > struct amdgpu_gtt_mgr *mgr; > + uint64_t start, size; > > mgr = kzalloc(sizeof(*mgr), GFP_KERNEL); > if (!mgr) > return -ENOMEM; > > - drm_mm_init(&mgr->mm, 0, p_size); > + start = AMDGPU_GTT_MAX_TRANSFER_SIZE * > AMDGPU_GTT_NUM_TRANSFER_WINDOWS; > + size = p_size - start; > + drm_mm_init(&mgr->mm, start, size); > spin_lock_init(&mgr->lock); > mgr->available = p_size; > man->priv = mgr; > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > index 2ade5c5..9c4da0a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > @@ -34,6 +34,9 @@ > #define AMDGPU_PL_FLAG_GWS (TTM_PL_FLAG_PRIV << 1) > #define AMDGPU_PL_FLAG_OA (TTM_PL_FLAG_PRIV << 2) > > +#define AMDGPU_GTT_MAX_TRANSFER_SIZE 512 Maybe AMDGPU_GTT_MAX_TRANSFER_SIZE_PAGES? Also you may want to update the patch title to say 2x512 pages rather than 2x2MB. > +#define AMDGPU_GTT_NUM_TRANSFER_WINDOWS2 > + > struct amdgpu_mman { > struct ttm_bo_global_refbo_global_ref; > struct drm_global_reference mem_global_ref; > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 06/12] drm/amdgpu: bind BOs with GTT space allocated directly
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > This avoids binding them later on. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 16 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 49 > ++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 + > 3 files changed, 46 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > index f7d22c4..6fdf83a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c > @@ -81,6 +81,20 @@ static int amdgpu_gtt_mgr_fini(struct ttm_mem_type_manager > *man) > } > > /** > + * amdgpu_gtt_mgr_is_allocated - Check if mem has address space > + * > + * @mem: the mem object to check > + * > + * Check if a mem object has already address space allocated. > + */ > +bool amdgpu_gtt_mgr_is_alloced(struct ttm_mem_reg *mem) mismatch between documentation and function name. I prefer the full amdgpu_gtt_mgr_is_allocated or even better amdgpu_gtt_mgr_addr_is_allocated. 
With that fixed up: Reviewed-by: Alex Deucher > +{ > + struct drm_mm_node *node = mem->mm_node; > + > + return (node->start != AMDGPU_BO_INVALID_OFFSET); > +} > + > +/** > * amdgpu_gtt_mgr_alloc - allocate new ranges > * > * @man: TTM memory type manager > @@ -101,7 +115,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, > unsigned long fpfn, lpfn; > int r; > > - if (node->start != AMDGPU_BO_INVALID_OFFSET) > + if (amdgpu_gtt_mgr_is_alloced(mem)) > return 0; > > if (place) > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index 5bfe7f6..eb0d7d7 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -681,6 +681,31 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt > *ttm) > sg_free_table(ttm->sg); > } > > +static int amdgpu_ttm_do_bind(struct ttm_tt *ttm, struct ttm_mem_reg *mem) > +{ > + struct amdgpu_ttm_tt *gtt = (void *)ttm; > + uint64_t flags; > + int r; > + > + spin_lock(&gtt->adev->gtt_list_lock); > + flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, mem); > + gtt->offset = (u64)mem->start << PAGE_SHIFT; > + r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages, > + ttm->pages, gtt->ttm.dma_address, flags); > + > + if (r) { > + DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", > + ttm->num_pages, gtt->offset); > + goto error_gart_bind; > + } > + > + list_add_tail(&gtt->list, &gtt->adev->gtt_list); > +error_gart_bind: > + spin_unlock(&gtt->adev->gtt_list_lock); > + return r; > + > +} > + > static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm, >struct ttm_mem_reg *bo_mem) > { > @@ -704,7 +729,10 @@ static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm, > bo_mem->mem_type == AMDGPU_PL_OA) > return -EINVAL; > > - return 0; > + if (amdgpu_gtt_mgr_is_alloced(bo_mem)) > + r = amdgpu_ttm_do_bind(ttm, bo_mem); > + > + return r; > } > > bool amdgpu_ttm_is_bound(struct ttm_tt *ttm) > @@ -717,8 +745,6 @@ bool amdgpu_ttm_is_bound(struct ttm_tt *ttm) > 
int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem) > { > struct ttm_tt *ttm = bo->ttm; > - struct amdgpu_ttm_tt *gtt = (void *)bo->ttm; > - uint64_t flags; > int r; > > if (!ttm || amdgpu_ttm_is_bound(ttm)) > @@ -731,22 +757,7 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct > ttm_mem_reg *bo_mem) > return r; > } > > - spin_lock(&gtt->adev->gtt_list_lock); > - flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, bo_mem); > - gtt->offset = (u64)bo_mem->start << PAGE_SHIFT; > - r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages, > - ttm->pages, gtt->ttm.dma_address, flags); > - > - if (r) { > - DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", > - ttm->num_pages, gtt->offset); > - goto error_gart_bind; > - } > - > - list_add_tail(&gtt->list, &gtt->adev->gtt_list); > -error_gart_bind: > - spin_unlock(&gtt->adev->gtt_list_lock); > - return r; > + return amdgpu_ttm_do_bind(ttm, bo_mem); > } > > int amdgpu_ttm_recover_gart(struct amdgpu_device *adev) > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > index cd5bbfa..2ade5c5 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h > @@ -56,6 +56,7 @@ struct amdgpu_mman { > extern const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_f
RE: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic
Hi Christian, The new code actually will not use the fence function; it just needs memory that exposes both a CPU and a GPU address. Do you really want to add wrapper functions that just expose the CPU and GPU addresses in this case? Regards, Shaoyun.liu -----Original Message----- From: Christian König [mailto:deathsim...@vodafone.de] Sent: Friday, June 30, 2017 3:57 AM To: Michel Dänzer; Liu, Shaoyun Cc: amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic Am 30.06.2017 um 03:21 schrieb Michel Dänzer: > On 30/06/17 06:08 AM, Shaoyun Liu wrote: >> 1. Use spin lock instead of mutex in KIQ 2. Directly write to KIQ >> fence address instead of using fence_emit() 3. Disable the interrupt >> for KIQ read/write and use CPU polling > This list indicates that this patch should be split up in at least > three patches. :) Yeah, apart from that it is not a good idea to mess with the fence internals directly in the KIQ code, please add a helper in the fence code for this. Regards, Christian. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 05/12] drm/amdgpu: bind BOs to TTM only once
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > No need to do this on every round. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 70 > ++--- > 1 file changed, 29 insertions(+), 41 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index bbe1639..5bfe7f6 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -252,29 +252,15 @@ static void amdgpu_move_null(struct ttm_buffer_object > *bo, > new_mem->mm_node = NULL; > } > > -static int amdgpu_mm_node_addr(struct ttm_buffer_object *bo, > - struct drm_mm_node *mm_node, > - struct ttm_mem_reg *mem, > - uint64_t *addr) > +static uint64_t amdgpu_mm_node_addr(struct ttm_buffer_object *bo, > + struct drm_mm_node *mm_node, > + struct ttm_mem_reg *mem) > { > - int r; > - > - switch (mem->mem_type) { > - case TTM_PL_TT: > - r = amdgpu_ttm_bind(bo, mem); > - if (r) > - return r; > - > - case TTM_PL_VRAM: > - *addr = mm_node->start << PAGE_SHIFT; > - *addr += bo->bdev->man[mem->mem_type].gpu_offset; > - break; > - default: > - DRM_ERROR("Unknown placement %d\n", mem->mem_type); > - return -EINVAL; > - } > + uint64_t addr; > > - return 0; > + addr = mm_node->start << PAGE_SHIFT; > + addr += bo->bdev->man[mem->mem_type].gpu_offset; > + return addr; > } > > static int amdgpu_move_blit(struct ttm_buffer_object *bo, > @@ -298,18 +284,25 @@ static int amdgpu_move_blit(struct ttm_buffer_object > *bo, > return -EINVAL; > } > > + if (old_mem->mem_type == TTM_PL_TT) { > + r = amdgpu_ttm_bind(bo, old_mem); > + if (r) > + return r; > + } > + > old_mm = old_mem->mm_node; > - r = amdgpu_mm_node_addr(bo, old_mm, old_mem, &old_start); > - if (r) > - return r; > old_size = old_mm->size; > + old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem); > > + if (new_mem->mem_type == TTM_PL_TT) { > + r = amdgpu_ttm_bind(bo, new_mem); > + if 
(r) > + return r; > + } > > new_mm = new_mem->mm_node; > - r = amdgpu_mm_node_addr(bo, new_mm, new_mem, &new_start); > - if (r) > - return r; > new_size = new_mm->size; > + new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem); > > num_pages = new_mem->num_pages; > while (num_pages) { > @@ -331,10 +324,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > > old_size -= cur_pages; > if (!old_size) { > - r = amdgpu_mm_node_addr(bo, ++old_mm, old_mem, > - &old_start); > - if (r) > - goto error; > + old_start = amdgpu_mm_node_addr(bo, ++old_mm, > old_mem); > old_size = old_mm->size; > } else { > old_start += cur_pages * PAGE_SIZE; > @@ -342,11 +332,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > > new_size -= cur_pages; > if (!new_size) { > - r = amdgpu_mm_node_addr(bo, ++new_mm, new_mem, > - &new_start); > - if (r) > - goto error; > - > + new_start = amdgpu_mm_node_addr(bo, ++new_mm, > new_mem); > new_size = new_mm->size; > } else { > new_start += cur_pages * PAGE_SIZE; > @@ -1347,6 +1333,12 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, > return -EINVAL; > } > > + if (bo->tbo.mem.mem_type == TTM_PL_TT) { > + r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem); > + if (r) > + return r; > + } > + > num_pages = bo->tbo.num_pages; > mm_node = bo->tbo.mem.mm_node; > num_loops = 0; > @@ -1382,11 +1374,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, > uint32_t byte_count = mm_node->size << PAGE_SHIFT; > uint64_t dst_addr; > > - r = amdgpu_mm_node_addr(&bo->tbo, mm_node, > - &bo->tbo.mem, &dst_addr); > - if (r) > - return r; > - > + dst_addr = amdgpu_mm_node_addr(&bo->tbo, mm_node, > &bo
Re: [PATCH 04/12] drm/amdgpu: add vm_needs_flush parameter to amdgpu_copy_buffer
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > This allows us to flush the system VM here. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 4 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 4 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 12 ++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 9 - > 5 files changed, 15 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > index 1beae5b..2fb299a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c > @@ -40,7 +40,7 @@ static int amdgpu_benchmark_do_move(struct amdgpu_device > *adev, unsigned size, > for (i = 0; i < n; i++) { > struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring; > r = amdgpu_copy_buffer(ring, saddr, daddr, size, NULL, &fence, > - false); > + false, false); > if (r) > goto exit_do_move; > r = dma_fence_wait(fence, false); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > index 8ee6965..c34cf2c 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c > @@ -535,7 +535,7 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev, > > r = amdgpu_copy_buffer(ring, bo_addr, shadow_addr, >amdgpu_bo_size(bo), resv, fence, > - direct); > + direct, false); > if (!r) > amdgpu_bo_fence(bo, *fence, true); > > @@ -588,7 +588,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device > *adev, > > r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr, >amdgpu_bo_size(bo), resv, fence, > - direct); > + direct, false); > if (!r) > amdgpu_bo_fence(bo, *fence, true); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c > index 15510da..d02e611 100644 > --- 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c > @@ -111,7 +111,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device > *adev) > amdgpu_bo_kunmap(gtt_obj[i]); > > r = amdgpu_copy_buffer(ring, gtt_addr, vram_addr, > - size, NULL, &fence, false); > + size, NULL, &fence, false, false); > > if (r) { > DRM_ERROR("Failed GTT->VRAM copy %d\n", i); > @@ -156,7 +156,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device > *adev) > amdgpu_bo_kunmap(vram_obj); > > r = amdgpu_copy_buffer(ring, vram_addr, gtt_addr, > - size, NULL, &fence, false); > + size, NULL, &fence, false, false); > > if (r) { > DRM_ERROR("Failed VRAM->GTT copy %d\n", i); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > index e4860ac..bbe1639 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c > @@ -318,7 +318,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, > > r = amdgpu_copy_buffer(ring, old_start, new_start, >cur_pages * PAGE_SIZE, > - bo->resv, &next, false); > + bo->resv, &next, false, false); > if (r) > goto error; > > @@ -1256,12 +1256,11 @@ int amdgpu_mmap(struct file *filp, struct > vm_area_struct *vma) > return ttm_bo_mmap(filp, vma, &adev->mman.bdev); > } > > -int amdgpu_copy_buffer(struct amdgpu_ring *ring, > - uint64_t src_offset, > - uint64_t dst_offset, > - uint32_t byte_count, > +int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset, > + uint64_t dst_offset, uint32_t byte_count, >struct reservation_object *resv, > - struct dma_fence **fence, bool direct_submit) > + struct dma_fence **fence, bool direct_submit, > + bool vm_needs_flush) > { > struct amdgpu_device *adev = ring->adev; > struct amdgpu_job *job; > @@ -1283,6 +1282,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring,
Re: [PATCH 03/12] drm/amdgpu: allow flushing VMID0 before IB execution as well
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > This allows us to queue IBs which needs an up to date system domain as well. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++ > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > index f774b3f..1b30d2a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c > @@ -172,7 +172,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned > num_ibs, > if (ring->funcs->insert_start) > ring->funcs->insert_start(ring); > > - if (vm) { > + if (job) { > r = amdgpu_vm_flush(ring, job); > if (r) { > amdgpu_ring_undo(ring); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > index 3d641e1..4510627 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > @@ -81,6 +81,8 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, > unsigned size, > r = amdgpu_ib_get(adev, NULL, size, &(*job)->ibs[0]); > if (r) > kfree(*job); > + else > + (*job)->vm_pd_addr = adev->gart.table_addr; > > return r; > } > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 01/12] drm/amdgpu: move ring helpers to amdgpu_ring.h
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > Keep them where they belong. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 44 > > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 42 ++ > 2 files changed, 42 insertions(+), 44 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index ab1dad2..810796a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -1801,50 +1801,6 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device > *adev); > #define RBIOS16(i) (RBIOS8(i) | (RBIOS8((i)+1) << 8)) > #define RBIOS32(i) ((RBIOS16(i)) | (RBIOS16((i)+2) << 16)) > > -/* > - * RING helpers. > - */ > -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v) > -{ > - if (ring->count_dw <= 0) > - DRM_ERROR("amdgpu: writing more dwords to the ring than > expected!\n"); > - ring->ring[ring->wptr++ & ring->buf_mask] = v; > - ring->wptr &= ring->ptr_mask; > - ring->count_dw--; > -} > - > -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void > *src, int count_dw) > -{ > - unsigned occupied, chunk1, chunk2; > - void *dst; > - > - if (unlikely(ring->count_dw < count_dw)) { > - DRM_ERROR("amdgpu: writing more dwords to the ring than > expected!\n"); > - return; > - } > - > - occupied = ring->wptr & ring->buf_mask; > - dst = (void *)&ring->ring[occupied]; > - chunk1 = ring->buf_mask + 1 - occupied; > - chunk1 = (chunk1 >= count_dw) ? 
count_dw: chunk1; > - chunk2 = count_dw - chunk1; > - chunk1 <<= 2; > - chunk2 <<= 2; > - > - if (chunk1) > - memcpy(dst, src, chunk1); > - > - if (chunk2) { > - src += chunk1; > - dst = (void *)ring->ring; > - memcpy(dst, src, chunk2); > - } > - > - ring->wptr += count_dw; > - ring->wptr &= ring->ptr_mask; > - ring->count_dw -= count_dw; > -} > - > static inline struct amdgpu_sdma_instance * > amdgpu_get_sdma_instance(struct amdgpu_ring *ring) > { > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > index bc8dec9..04cbc3a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > @@ -212,4 +212,46 @@ static inline void amdgpu_ring_clear_ring(struct > amdgpu_ring *ring) > > } > > +static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v) > +{ > + if (ring->count_dw <= 0) > + DRM_ERROR("amdgpu: writing more dwords to the ring than > expected!\n"); > + ring->ring[ring->wptr++ & ring->buf_mask] = v; > + ring->wptr &= ring->ptr_mask; > + ring->count_dw--; > +} > + > +static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, > + void *src, int count_dw) > +{ > + unsigned occupied, chunk1, chunk2; > + void *dst; > + > + if (unlikely(ring->count_dw < count_dw)) { > + DRM_ERROR("amdgpu: writing more dwords to the ring than > expected!\n"); > + return; > + } > + > + occupied = ring->wptr & ring->buf_mask; > + dst = (void *)&ring->ring[occupied]; > + chunk1 = ring->buf_mask + 1 - occupied; > + chunk1 = (chunk1 >= count_dw) ? 
count_dw: chunk1; > + chunk2 = count_dw - chunk1; > + chunk1 <<= 2; > + chunk2 <<= 2; > + > + if (chunk1) > + memcpy(dst, src, chunk1); > + > + if (chunk2) { > + src += chunk1; > + dst = (void *)ring->ring; > + memcpy(dst, src, chunk2); > + } > + > + ring->wptr += count_dw; > + ring->wptr &= ring->ptr_mask; > + ring->count_dw -= count_dw; > +} > + > #endif > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 02/12] drm/amdgpu: fix amdgpu_ring_write_multiple
On Fri, Jun 30, 2017 at 7:22 AM, Christian König wrote: > From: Christian König > > Overwriting still used ring content has a low probability to cause > problems, not writing at all has 100% probability to cause problems. > > Signed-off-by: Christian König Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > index 04cbc3a..322d2529 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h > @@ -227,10 +227,8 @@ static inline void amdgpu_ring_write_multiple(struct > amdgpu_ring *ring, > unsigned occupied, chunk1, chunk2; > void *dst; > > - if (unlikely(ring->count_dw < count_dw)) { > + if (unlikely(ring->count_dw < count_dw)) > DRM_ERROR("amdgpu: writing more dwords to the ring than > expected!\n"); > - return; > - } > > occupied = ring->wptr & ring->buf_mask; > dst = (void *)&ring->ring[occupied]; > -- > 2.7.4 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 3/3] drm/amdgpu: Use polling for KIQ read/write register
Change-Id: I87762bfc9903401ac06892bed10efa1767c15025 Signed-off-by: Shaoyun Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 47 +++- 1 file changed, 34 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index a65e76c..06ef893 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -116,24 +116,33 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg) { signed long r; uint32_t val; - struct dma_fence *f; struct amdgpu_kiq *kiq = &adev->gfx.kiq; struct amdgpu_ring *ring = &kiq->ring; + unsigned long end_jiffies; + uint32_t seq; + volatile uint32_t *f; BUG_ON(!ring->funcs->emit_rreg); spin_lock(&kiq->ring_lock); amdgpu_ring_alloc(ring, 32); amdgpu_ring_emit_rreg(ring, reg); - amdgpu_fence_emit(ring, &f); + f = amdgpu_fence_drv_cpu_addr(ring); + *f = 0; + seq = ++ring->fence_drv.sync_seq; + amdgpu_ring_emit_fence(ring, amdgpu_fence_drv_gpu_addr(ring), seq, 0); amdgpu_ring_commit(ring); spin_unlock(&kiq->ring_lock); - r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT)); - dma_fence_put(f); - if (r < 1) { - DRM_ERROR("wait for kiq fence error: %ld.\n", r); - return ~0; + end_jiffies = (MAX_KIQ_REG_WAIT * HZ / 1000) + jiffies; + while (true) { + if (*f >= seq) + break; + if (time_after(jiffies, end_jiffies)) { + DRM_ERROR("wait for kiq fence error: %ld.\n", r); + return ~0; + } + cpu_relax(); } val = adev->wb.wb[adev->virt.reg_val_offs]; @@ -144,23 +153,35 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg) void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v) { signed long r; - struct dma_fence *f; struct amdgpu_kiq *kiq = &adev->gfx.kiq; struct amdgpu_ring *ring = &kiq->ring; + unsigned long end_jiffies; + uint32_t seq; + volatile uint32_t *f; BUG_ON(!ring->funcs->emit_wreg); spin_lock(&kiq->ring_lock); amdgpu_ring_alloc(ring, 32); 
amdgpu_ring_emit_wreg(ring, reg, v); - amdgpu_fence_emit(ring, &f); + f = amdgpu_fence_drv_cpu_addr(ring); + *f = 0; + seq = ++ring->fence_drv.sync_seq; + amdgpu_ring_emit_fence(ring, amdgpu_fence_drv_gpu_addr(ring), seq, 0); amdgpu_ring_commit(ring); spin_unlock(&kiq->ring_lock); - r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT)); - if (r < 1) - DRM_ERROR("wait for kiq fence error: %ld.\n", r); - dma_fence_put(f); + end_jiffies = (MAX_KIQ_REG_WAIT * HZ / 1000) + jiffies; + while (true) { + if (*f >= seq) + break; + if (time_after(jiffies, end_jiffies)) { + DRM_ERROR("wait for kiq fence error: %ld.\n", r); + return; + } + cpu_relax(); + } + } /** -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/3] drm/amdgpu: Use spin_lock instead of mutex for KIQ
KIQ read/write register will be called in atomic context so mutex can not be used Change-Id: Ifa14293b3cdfcf74cd7930a4058154d0a7d7f97c Signed-off-by: Shaoyun Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 8 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index ab1dad2..a155206 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -955,7 +955,7 @@ struct amdgpu_mec { struct amdgpu_kiq { u64 eop_gpu_addr; struct amdgpu_bo*eop_obj; - struct mutexring_mutex; + spinlock_t ring_lock; struct amdgpu_ring ring; struct amdgpu_irq_src irq; }; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index e26108a..e5e5541 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -184,7 +184,7 @@ int amdgpu_gfx_kiq_init_ring(struct amdgpu_device *adev, struct amdgpu_kiq *kiq = &adev->gfx.kiq; int r = 0; - mutex_init(&kiq->ring_mutex); + spin_lock_init(&kiq->ring_lock); r = amdgpu_wb_get(adev, &adev->virt.reg_val_offs); if (r) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index 8a081e1..a65e76c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -122,12 +122,12 @@ uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg) BUG_ON(!ring->funcs->emit_rreg); - mutex_lock(&kiq->ring_mutex); + spin_lock(&kiq->ring_lock); amdgpu_ring_alloc(ring, 32); amdgpu_ring_emit_rreg(ring, reg); amdgpu_fence_emit(ring, &f); amdgpu_ring_commit(ring); - mutex_unlock(&kiq->ring_mutex); + spin_unlock(&kiq->ring_lock); r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT)); dma_fence_put(f); @@ -150,12 +150,12 @@ void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, 
uint32_t reg, uint32_t v) BUG_ON(!ring->funcs->emit_wreg); - mutex_lock(&kiq->ring_mutex); + spin_lock(&kiq->ring_lock); amdgpu_ring_alloc(ring, 32); amdgpu_ring_emit_wreg(ring, reg, v); amdgpu_fence_emit(ring, &f); amdgpu_ring_commit(ring); - mutex_unlock(&kiq->ring_mutex); + spin_unlock(&kiq->ring_lock); r = dma_fence_wait_timeout(f, false, msecs_to_jiffies(MAX_KIQ_REG_WAIT)); if (r < 1) -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu/acp: properly handle powergating in hw_fini
Stoney does not have powergating, so make the powergating teardown dependent on whether we have a genpd structure. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c index 06879d1..091b5e1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c @@ -398,20 +398,22 @@ static int acp_hw_fini(void *handle) struct amdgpu_device *adev = (struct amdgpu_device *)handle; /* return early if no ACP */ - if (!adev->acp.acp_genpd) + if (!adev->acp.acp_cell) return 0; - for (i = 0; i < ACP_DEVS ; i++) { - dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i); - ret = pm_genpd_remove_device(&adev->acp.acp_genpd->gpd, dev); - /* If removal fails, dont giveup and try rest */ - if (ret) - dev_err(dev, "remove dev from genpd failed\n"); + if (adev->acp.acp_genpd) { + for (i = 0; i < ACP_DEVS ; i++) { + dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i); + ret = pm_genpd_remove_device(&adev->acp.acp_genpd->gpd, dev); + /* If removal fails, dont giveup and try rest */ + if (ret) + dev_err(dev, "remove dev from genpd failed\n"); + } + kfree(adev->acp.acp_genpd); } mfd_remove_devices(adev->acp.parent); kfree(adev->acp.acp_res); - kfree(adev->acp.acp_genpd); kfree(adev->acp.acp_cell); return 0; -- 2.5.5 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 2/3] drm/amdgpu: Add wrap function for fence driver to expose cpu and gpu address
Change-Id: I5c6267253bfe5507a8821a482cf378852946 Signed-off-by: Shaoyun Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index bc8dec9..3f0bbc4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -211,5 +211,14 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring) ring->ring[i++] = ring->funcs->nop; } +static inline uint64_t amdgpu_fence_drv_gpu_addr(struct amdgpu_ring *ring) +{ + return ring->fence_drv.gpu_addr; +} + +static inline volatile void * amdgpu_fence_drv_cpu_addr(struct amdgpu_ring *ring) +{ + return ring->fence_drv.cpu_addr; +} #endif -- 1.9.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH RFC v2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it to potentially later move to invisible VRAM if the CPU does not access it again. Setting the CPU_ACCESS flag in amdgpu_bo_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us. Signed-off-by: John Brooks --- Whoops, I forgot to actually remove that loop. Also, in the changelog: amdgpu_fault_reserve_notify -> amdgpu_bo_fault_reserve_notify drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 15 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 8 2 files changed, 13 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index fa8aeca..7164f8c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -946,13 +946,16 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) { struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev); struct amdgpu_bo *abo; - unsigned long offset, size, lpfn; - int i, r; + unsigned long offset, size; + int r; if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) return 0; abo = container_of(bo, struct amdgpu_bo, tbo); + + abo->flags |= AMDGPU_BO_FLAG_CPU_ACCESS; + if (bo->mem.mem_type != TTM_PL_VRAM) return 0; @@ -969,14 +972,6 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) /* hurrah the memory is not visible ! 
*/ atomic64_inc(&adev->num_vram_cpu_page_faults); amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM); - lpfn = adev->mc.visible_vram_size >> PAGE_SHIFT; - for (i = 0; i < abo->placement.num_placement; i++) { - /* Force into visible VRAM */ - if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) && - (!abo->placements[i].lpfn || -abo->placements[i].lpfn > lpfn)) - abo->placements[i].lpfn = lpfn; - } r = ttm_bo_validate(bo, &abo->placement, false, false); if (unlikely(r == -ENOMEM)) { amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c9b131b..cc65cdd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -417,6 +417,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, struct ttm_mem_reg *new_mem) { struct amdgpu_device *adev; + struct amdgpu_bo *abo; struct ttm_mem_reg *old_mem = &bo->mem; struct ttm_mem_reg tmp_mem; struct ttm_placement placement; @@ -424,6 +425,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, int r; adev = amdgpu_ttm_adev(bo->bdev); + abo = container_of(bo, struct amdgpu_bo, tbo); tmp_mem = *new_mem; tmp_mem.mm_node = NULL; placement.num_placement = 1; @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, if (unlikely(r)) { goto out_cleanup; } + + /* The page fault handler will re-set this if the CPU accesses the BO +* after it's moved. +*/ + abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; + out_cleanup: ttm_bo_mem_put(bo, &tmp_mem); return r; -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it to potentially later move to invisible VRAM if the CPU does not access it again. Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us. Signed-off-by: John Brooks --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 8 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index fa8aeca..19bd2fd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -953,6 +953,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) return 0; abo = container_of(bo, struct amdgpu_bo, tbo); + + abo->flags |= AMDGPU_BO_FLAG_CPU_ACCESS; + if (bo->mem.mem_type != TTM_PL_VRAM) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c9b131b..cc65cdd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -417,6 +417,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, struct ttm_mem_reg *new_mem) { struct amdgpu_device *adev; + struct amdgpu_bo *abo; struct ttm_mem_reg *old_mem = &bo->mem; struct ttm_mem_reg tmp_mem; struct ttm_placement placement; @@ -424,6 +425,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, int r; adev = amdgpu_ttm_adev(bo->bdev); + abo = container_of(bo, struct amdgpu_bo, tbo); tmp_mem = *new_mem; tmp_mem.mm_node = NULL; placement.num_placement = 1; @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, if (unlikely(r)) { goto out_cleanup; } + + /* The page fault handler will re-set this if the CPU accesses the BO +* after it's moved. 
+*/ + abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; + out_cleanup: ttm_bo_mem_put(bo, &tmp_mem); return r; -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH RFC 1/2] drm/amdgpu: Add AMDGPU_BO_FLAG_CPU_ACCESS
For userspace BO allocations, replace AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED with a new AMDGPU_BO_FLAG_CPU_ACCESS flag. This flag will be used to indicate that a BO should currently be CPU accessible. Unlike the CPU_ACCESS_REQUIRED flag, it is meant to be an ephemeral rather than a permanent constraint. Currently, however, it is treated no differently. Signed-off-by: John Brooks --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 9 - include/uapi/drm/amdgpu_drm.h | 1 + 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 12d61ed..a724e4f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -411,6 +411,9 @@ struct amdgpu_bo_va { #define AMDGPU_GEM_DOMAIN_MAX 0x3 +/* BO internal flags */ +#define AMDGPU_BO_FLAG_CPU_ACCESS (AMDGPU_GEM_CREATE_MAX << 1) + struct amdgpu_bo { /* Protected by tbo.reserved */ u32 prefered_domains; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 8ee6965..fa8aeca 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -128,7 +128,8 @@ static void amdgpu_ttm_placement_init(struct amdgpu_device *adev, places[c].flags = TTM_PL_FLAG_WC | TTM_PL_FLAG_UNCACHED | TTM_PL_FLAG_VRAM; - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) + if (flags & (AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | +AMDGPU_BO_FLAG_CPU_ACCESS)) places[c].lpfn = visible_pfn; else places[c].flags |= TTM_PL_FLAG_TOPDOWN; @@ -361,6 +362,12 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev, if (!kernel && bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM) bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT; + if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { + flags |= AMDGPU_BO_FLAG_CPU_ACCESS; + /* Treat CPU_ACCESS_REQUIRED only as a hint if given by UMD */ + if (!kernel) + flags &= 
~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + } bo->flags = flags; #ifdef CONFIG_X86_32 diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index d9aa4a3..473076f 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -87,6 +87,7 @@ extern "C" { #define AMDGPU_GEM_CREATE_SHADOW (1 << 4) /* Flag that allocating the BO should use linear VRAM */ #define AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS (1 << 5) +#define AMDGPU_GEM_CREATE_MAX (1 << 5) struct drm_amdgpu_gem_create_in { /** the requested memory size */ -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH RFC 0/2] Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
> On 30/06/17 03:59 PM, Christian König wrote: >> Am 30.06.2017 um 08:51 schrieb Michel Dänzer: >>> We can deal with that internally in the kernel, while fixing the >>> existing flag for userspace. >> And as I said, NAK to that approach. I'm not going to add a >> CPU_ACCESS_REALLY_REQUIRED flag in the kernel just because mesa has >> messed up it's use case. >> >> We could agree on filtering that flag from userspace when BOs are >> created and/or map it to a CREATE_CPU_ACCESS_HINT flag. > Then I propose the following: > > One patch: > > Convert AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to a kernel internal flag > AMDGPU_GEM_CPU_ACCESS_HINT in amdgpu_gem_create_ioctl, which is > initially treated the same way as AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED. > > Another patch: > > Change the treatment of AMDGPU_GEM_CPU_ACCESS_HINT according to John's > patch 4 in the latest series, or a variation of that as discussed on IRC. > > > If any regressions are reported, we will be able to differentiate > whether they are due to the addition of the new flag itself or due to > the change in its handling. > How about this? Note: I haven't tested this beyond compiling. See replies for: [PATCH RFC 1/2] drm/amdgpu: Add AMDGPU_BO_FLAG_CPU_ACCESS [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault John > > -- > Earthling Michel Dänzer | http://www.amd.com > Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH xf86-video-ati] Use pRADEONEnt->fd exclusively for the DRM file descriptor
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Michel Dänzer > Sent: Friday, June 30, 2017 5:20 AM > To: amd-gfx@lists.freedesktop.org > Subject: [PATCH xf86-video-ati] Use pRADEONEnt->fd exclusively for the > DRM file descriptor > > From: Michel Dänzer > > This brings us closer to amdgpu. > > Signed-off-by: Michel Dänzer Reviewed-by: Alex Deucher > --- > src/drmmode_display.c | 151 + > > src/drmmode_display.h | 1 - > src/radeon.h | 14 +++-- > src/radeon_accel.c | 4 +- > src/radeon_bo_helper.c | 8 ++- > src/radeon_dri2.c | 54 ++ > src/radeon_dri2.h | 1 - > src/radeon_dri3.c | 3 +- > src/radeon_exa.c | 4 +- > src/radeon_glamor.c| 3 +- > src/radeon_kms.c | 70 --- > src/radeon_present.c | 13 +++-- > 12 files changed, 172 insertions(+), 154 deletions(-) > > diff --git a/src/drmmode_display.c b/src/drmmode_display.c > index dd394ec1d..4b964b7b9 100644 > --- a/src/drmmode_display.c > +++ b/src/drmmode_display.c > @@ -272,7 +272,7 @@ int drmmode_get_current_ust(int drm_fd, CARD64 > *ust) > int drmmode_crtc_get_ust_msc(xf86CrtcPtr crtc, CARD64 *ust, CARD64 > *msc) > { > ScrnInfoPtr scrn = crtc->scrn; > -RADEONInfoPtr info = RADEONPTR(scrn); > +RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn); > drmVBlank vbl; > int ret; > > @@ -280,7 +280,7 @@ int drmmode_crtc_get_ust_msc(xf86CrtcPtr crtc, > CARD64 *ust, CARD64 *msc) > vbl.request.type |= radeon_populate_vbl_request_type(crtc); > vbl.request.sequence = 0; > > -ret = drmWaitVBlank(info->dri2.drm_fd, &vbl); > +ret = drmWaitVBlank(pRADEONEnt->fd, &vbl); > if (ret) { > xf86DrvMsg(scrn->scrnIndex, X_WARNING, > "get vblank counter failed: %s\n", strerror(errno)); > @@ -298,7 +298,7 @@ drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) > { > drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private; > ScrnInfoPtr scrn = crtc->scrn; > - RADEONInfoPtr info = RADEONPTR(scrn); > + RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn); > CARD64 ust; > int ret; > > @@ -318,7 +318,7 @@ 
drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) > vbl.request.type = DRM_VBLANK_RELATIVE; > vbl.request.type |= > radeon_populate_vbl_request_type(crtc); > vbl.request.sequence = 0; > - ret = drmWaitVBlank(info->dri2.drm_fd, &vbl); > + ret = drmWaitVBlank(pRADEONEnt->fd, &vbl); > if (ret) > xf86DrvMsg(scrn->scrnIndex, X_ERROR, > "%s cannot get last vblank counter\n", > @@ -345,7 +345,7 @@ drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) >* Off->On transition: calculate and accumulate the >* number of interpolated vblanks while we were in Off state >*/ > - ret = drmmode_get_current_ust(info->dri2.drm_fd, &ust); > + ret = drmmode_get_current_ust(pRADEONEnt->fd, &ust); > if (ret) > xf86DrvMsg(scrn->scrnIndex, X_ERROR, > "%s cannot get current time\n", __func__); > @@ -365,7 +365,7 @@ static void > drmmode_crtc_dpms(xf86CrtcPtr crtc, int mode) > { > drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private; > - drmmode_ptr drmmode = drmmode_crtc->drmmode; > + RADEONEntPtr pRADEONEnt = RADEONEntPriv(crtc->scrn); > > /* Disable unused CRTCs */ > if (!crtc->enabled || mode != DPMSModeOn) { > @@ -373,9 +373,9 @@ drmmode_crtc_dpms(xf86CrtcPtr crtc, int mode) > if (drmmode_crtc->flip_pending) > return; > > - drmModeSetCrtc(drmmode->fd, drmmode_crtc- > >mode_crtc->crtc_id, > + drmModeSetCrtc(pRADEONEnt->fd, drmmode_crtc- > >mode_crtc->crtc_id, > 0, 0, 0, NULL, 0, NULL); > - drmmode_fb_reference(drmmode->fd, &drmmode_crtc- > >fb, NULL); > + drmmode_fb_reference(pRADEONEnt->fd, > &drmmode_crtc->fb, NULL); > } else if (drmmode_crtc->dpms_mode != DPMSModeOn) > crtc->funcs->set_mode_major(crtc, &crtc->mode, crtc- > >rotation, > crtc->x, crtc->y); > @@ -385,6 +385,7 @@ static PixmapPtr > create_pixmap_for_fbcon(drmmode_ptr drmmode, > ScrnInfoPtr pScrn, int fbcon_id) > { > + RADEONEntPtr pRADEONEnt = RADEONEntPriv(pScrn); > RADEONInfoPtr info = RADEONPTR(pScrn); > PixmapPtr pixmap = info->fbcon_pixmap; > struct radeon_bo *bo; > @@ -394,7 +395,7 @@ 
create_pixmap_for_fbcon(drmmode_ptr drmmode, > if (pixmap) > return pixmap; > > - fbcon = drmModeGetFB(drmmode->fd, fbcon_id); > + fbcon = drmModeGetFB(pRADEONEnt->fd, fbcon_id); > if (fbcon ==
RE: [PATCH] drm/amd/amdgpu: move get memory type function from early init to sw init
> -Original Message- > From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf > Of Jim Qu > Sent: Friday, June 30, 2017 1:32 AM > To: amd-gfx@lists.freedesktop.org > Cc: Qu, Jim > Subject: [PATCH] /drm/amd/amdgpu: move get memory type function from > early init to sw init > > On PX system, it will get memory type before gpu post , and get unkown > type. > > Change-Id: I79e3760dd789c21a5f552bc4e5754f7a2defdaae > Signed-off-by: Jim Qu Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > index 5cc3f39..5ed6788f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c > @@ -765,14 +765,6 @@ static int gmc_v6_0_early_init(void *handle) > gmc_v6_0_set_gart_funcs(adev); > gmc_v6_0_set_irq_funcs(adev); > > - if (adev->flags & AMD_IS_APU) { > - adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; > - } else { > - u32 tmp = RREG32(mmMC_SEQ_MISC0); > - tmp &= MC_SEQ_MISC0__MT__MASK; > - adev->mc.vram_type = > gmc_v6_0_convert_vram_type(tmp); > - } > - > return 0; > } > > @@ -792,6 +784,14 @@ static int gmc_v6_0_sw_init(void *handle) > int dma_bits; > struct amdgpu_device *adev = (struct amdgpu_device *)handle; > > + if (adev->flags & AMD_IS_APU) { > + adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; > + } else { > + u32 tmp = RREG32(mmMC_SEQ_MISC0); > + tmp &= MC_SEQ_MISC0__MT__MASK; > + adev->mc.vram_type = > gmc_v6_0_convert_vram_type(tmp); > + } > + > r = amdgpu_irq_add_id(adev, AMDGPU_IH_CLIENTID_LEGACY, 146, > &adev->mc.vm_fault); > if (r) > return r; > -- > 1.9.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] amdgpu: Set cik/si_support to 1 by default if radeon isn't built
Hi Michel, MODULE_PARM_DESC is "used to document arguments that the module can take. It takes two parameters: a variable name and a free form string describing that variable" I think we should avoid changing document when config change. How about changing it to something like the following example? MODULE_PARM_DESC(pos_buf_per_se, "the size of Position Buffer per Shader Engine (default depending on gfx)"); Thanks, Alex Bin On 2017-06-30 04:36 AM, Michel Dänzer wrote: From: Michel Dänzer It was required to explicitly set these parameters to 1, even if the radeon driver isn't built at all, which is not intuitive. Reported-by: Shawn Starr Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 27599db7d630..58770fc40520 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -234,14 +234,28 @@ MODULE_PARM_DESC(param_buf_per_se, "the size of Off-Chip Pramater Cache per Shad module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444); #ifdef CONFIG_DRM_AMDGPU_SI + +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) int amdgpu_si_support = 0; MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled (default))"); +#else +int amdgpu_si_support = 1; +MODULE_PARM_DESC(si_support, "SI support (1 = enabled (default), 0 = disabled)"); +#endif + module_param_named(si_support, amdgpu_si_support, int, 0444); #endif #ifdef CONFIG_DRM_AMDGPU_CIK + +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) int amdgpu_cik_support = 0; MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))"); +#else +int amdgpu_cik_support = 1; +MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)"); +#endif + module_param_named(cik_support, amdgpu_cik_support, int, 0444); #endif ___ amd-gfx 
mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] amdgpu: Set cik/si_support to 1 by default if radeon isn't built
On Fri, Jun 30, 2017 at 4:36 AM, Michel Dänzer wrote: > From: Michel Dänzer > > It was required to explicitly set these parameters to 1, even if the > radeon driver isn't built at all, which is not intuitive. > > Reported-by: Shawn Starr > Signed-off-by: Michel Dänzer Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > index 27599db7d630..58770fc40520 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > @@ -234,14 +234,28 @@ MODULE_PARM_DESC(param_buf_per_se, "the size of > Off-Chip Pramater Cache per Shad > module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444); > > #ifdef CONFIG_DRM_AMDGPU_SI > + > +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) > int amdgpu_si_support = 0; > MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled > (default))"); > +#else > +int amdgpu_si_support = 1; > +MODULE_PARM_DESC(si_support, "SI support (1 = enabled (default), 0 = > disabled)"); > +#endif > + > module_param_named(si_support, amdgpu_si_support, int, 0444); > #endif > > #ifdef CONFIG_DRM_AMDGPU_CIK > + > +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) > int amdgpu_cik_support = 0; > MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled > (default))"); > +#else > +int amdgpu_cik_support = 1; > +MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = > disabled)"); > +#endif > + > module_param_named(cik_support, amdgpu_cik_support, int, 0444); > #endif > > -- > 2.13.1 > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu/cgs: always set reference clock in mode_info
Am 30.06.2017 um 16:05 schrieb Alex Deucher: It's relevent regardless of whether there are displays enabled. Fixes garbage values for ref clock in powerplay leading to incorrect fan speed reporting when displays are disabled. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101653 Signed-off-by: Alex Deucher Cc: sta...@vger.kernel.org Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c index 8b8eda7..c0a8062 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c @@ -838,9 +838,12 @@ static int amdgpu_cgs_get_active_displays_info(struct cgs_device *cgs_device, return -EINVAL; mode_info = info->mode_info; - if (mode_info) + if (mode_info) { /* if the displays are off, vblank time is max */ mode_info->vblank_time_us = 0x; + /* always set the reference clock */ + mode_info->ref_clock = adev->clock.spll.reference_freq; + } if (adev->mode_info.num_crtc && adev->mode_info.mode_config_initialized) { list_for_each_entry(crtc, ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH] drm/amdgpu/cgs: always set reference clock in mode_info
It's relevent regardless of whether there are displays enabled. Fixes garbage values for ref clock in powerplay leading to incorrect fan speed reporting when displays are disabled. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101653 Signed-off-by: Alex Deucher Cc: sta...@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c index 8b8eda7..c0a8062 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c @@ -838,9 +838,12 @@ static int amdgpu_cgs_get_active_displays_info(struct cgs_device *cgs_device, return -EINVAL; mode_info = info->mode_info; - if (mode_info) + if (mode_info) { /* if the displays are off, vblank time is max */ mode_info->vblank_time_us = 0x; + /* always set the reference clock */ + mode_info->ref_clock = adev->clock.spll.reference_freq; + } if (adev->mode_info.num_crtc && adev->mode_info.mode_config_initialized) { list_for_each_entry(crtc, -- 2.5.5 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 4/8] ASoC: AMD: added condition checks for CZ specific code
On Thu, Jun 29, 2017 at 12:58:02PM +0000, Mukunda, Vijendar wrote:

> -----Original Message-----
> From: Mark Brown [mailto:broo...@kernel.org]
> Sent: Wednesday, June 28, 2017 11:36 PM
> To: Alex Deucher

Please fix your mail client to quote mails in a more normal fashion, this looks pretty broken...

> > These defines are being added in the middle of a file but CHIP_STONEY is
> > also used in another file in the previous patch (and apparently extensively
> > throughout the DRM driver already). This is obviously not good, we
> > shouldn't have multiple copies of the definition.

...especially in that it's reflowing the message it's replying to to cause 80 column problems and has serious problems in that regard itself.

> We will modify code to use a single definition for CHIP_STONEY and
> CHIP_CARRIZO. There are only two chip sets based on the ACP 2.x design
> (Carrizo and Stoney). Our future chip sets are going to use a different
> design based on the next ACP IP version.

Write the code well, that way we don't have bad patterns in the codebase and if plans change with regard to new variants you're covered.

> In the current patch, condition checks were added for Carrizo for setting
> the SRAM bank state. Memory gating is disabled in Stoney, i.e. SRAM banks
> won't be turned off. The default state for SRAM banks is ON. As memory
> gating is disabled, there is no need to add condition checks for Stoney
> to set the SRAM bank state.

Some documentation of this in the code would be good.
Re: Resizeable PCI BAR support V5
Hi Dieter,

thanks a lot for testing that.

> But I think my poor little FUJITSU PRIMERGY TX150 S7, Xeon X3470
> (Nehalem), PCIe 2.0, 24 GB is too old for this stuff...

Well, actually you only need to figure out how to enable a PCIe window above the 4GB limit.

Could be that the BIOS supports this with the ACPI tables (totally unlikely) or you could try to dig up the Northbridge documentation for this CPU from Intel and use my patch for the AMD CPUs as a blueprint for how to do this on an Intel CPU as well.

Fact is your GFX hardware is perfectly capable of doing this, it's just that the BIOS/motherboard didn't enable a PCIe window per default to avoid problems with 32bit OSes.

Regards,
Christian.

Am 30.06.2017 um 01:51 schrieb Dieter Nützel:
> Hello Christian,
>
> I've been running this since you sent it, on top of amd-staging-4.11.
> But I think my poor little FUJITSU PRIMERGY TX150 S7, Xeon X3470
> (Nehalem), PCIe 2.0, 24 GB is too old for this stuff...
>
> [1.066475] pci 0000:05:00.0: VF(n) BAR0 space: [mem 0x-0x0003 64bit] (contains BAR0 for 16 VFs)
> [1.066489] pci 0000:05:00.0: VF(n) BAR2 space: [mem 0x-0x003f 64bit] (contains BAR2 for 16 VFs)
> [1.121656] pci 0000:00:1c.0: BAR 15: assigned [mem 0x8000-0x801f 64bit pref]
> [1.121659] pci 0000:00:1c.6: BAR 15: assigned [mem 0x8020-0x803f 64bit pref]
> [1.121662] pci 0000:01:00.0: BAR 6: assigned [mem 0xb012-0xb013 pref]
> [1.121681] pci 0000:05:00.0: BAR 6: assigned [mem 0xb028-0xb02f pref]
> [1.121683] pci 0000:05:00.0: BAR 9: no space for [mem size 0x0040 64bit]
> [1.121684] pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x0040 64bit]
> [1.121685] pci 0000:05:00.0: BAR 7: no space for [mem size 0x0004 64bit]
> [1.121687] pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x0004 64bit]
> [3.874180] amdgpu 0000:01:00.0: BAR 0: releasing [mem 0xc000-0xcfff 64bit pref]
> [3.874182] amdgpu 0000:01:00.0: BAR 2: releasing [mem 0xb040-0xb05f 64bit pref]
> [3.874198] pcieport 0000:00:03.0: BAR 15: releasing [mem 0xb040-0xcfff 64bit pref]
> [3.874215] pcieport 0000:00:03.0: BAR 15: no space for [mem size 0x3 64bit pref]
> [3.874217] pcieport 0000:00:03.0: BAR 15: failed to assign [mem size 0x3 64bit pref]
> [3.874221] amdgpu 0000:01:00.0: BAR 0: no space for [mem size 0x2 64bit pref]
> [3.874223] amdgpu 0000:01:00.0: BAR 0: failed to assign [mem size 0x2 64bit pref]
> [3.874226] amdgpu 0000:01:00.0: BAR 2: no space for [mem size 0x0020 64bit pref]
> [3.874227] amdgpu 0000:01:00.0: BAR 2: failed to assign [mem size 0x0020 64bit pref]
> [3.874258] [drm] Not enough PCI address space for a large BAR.
> [3.874261] amdgpu 0000:01:00.0: BAR 0: assigned [mem 0xc000-0xcfff 64bit pref]
> [3.874269] amdgpu 0000:01:00.0: BAR 2: assigned [mem 0xb040-0xb05f 64bit pref]
> [3.874288] [drm] Detected VRAM RAM=8192M, BAR=256M
>
> Anyway rebase for current amd-staging-4.11 needed.
>
> Find attached dmesg-amd-staging-4.11-1.g7262353-default+.log.xz
>
> Greetings,
> Dieter
>
> Am 09.06.2017 10:59, schrieb Christian König:
>> Hi everyone,
>>
>> This is the fifth incarnation of this set of patches. It enables device
>> drivers to resize and most likely also relocate the PCI BAR of devices
>> they manage, to allow the CPU to access all of the device local memory
>> at once.
>>
>> This is very useful for GFX device drivers where the default PCI BAR is
>> only about 256MB in size for compatibility reasons, but the device can
>> easily have multiple gigabytes of local memory.
>>
>> Some changes since V4:
>> 1. Rebased on 4.11.
>> 2. Added the rb from Andy Shevchenko to patches which look complete now.
>> 3. Moved releasing the BAR and reallocating it on error to the driver side.
>> 4. Added amdgpu support for the GMC V6 hardware generation as well.
>>
>> Please review and/or comment,
>> Christian.
Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
Am 30.06.2017 um 14:39 schrieb Daniel Vetter:
On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian König wrote:
Am 30.06.2017 um 04:24 schrieb Michel Dänzer:
On 29/06/17 07:05 PM, Daniel Vetter wrote:
On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote:
On 29/06/17 05:23 PM, Christian König wrote:
Am 29.06.2017 um 04:35 schrieb Michel Dänzer:
On 29/06/17 08:26 AM, John Brooks wrote:
On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote:

Instead of the flag being set in stone at BO creation, set the flag when a page fault occurs so that it goes somewhere CPU-visible, and clear it when the BO is requested by the GPU.

However, clearing the CPU_ACCESS_REQUIRED flag may move BOs in GTT to invisible VRAM, where they may promptly generate another page fault. When BOs are constantly moved back and forth like this, it is highly detrimental to performance. Only clear the flag on CS if:

- The BO wasn't page faulted for a certain amount of time (currently 10 seconds), and
- its last page fault didn't occur too soon (currently 500ms) after its last CS request, or vice versa.

Setting the flag in amdgpu_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us.

I'm fine with the general approach, but I'm still absolutely not keen about clearing the flag when userspace has originally specified it.

Is there any specific concern you have about that?

Yeah, quite a bunch actually. We want to use this flag for P2P buffer sharing in the future as well and I don't intend to add another one like CPU_ACCESS_REALLY_REQUIRED or something like this.

Won't a BO need to be pinned while it's being shared with another device?

That's an artifact of the current kernel implementation, I think we could do better (but for current use-cases where we share a bunch of scanouts and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never changing.

Surely there will need to be some kind of transaction though to let the driver know when a BO starts/stops being shared with another device? Either via the existing dma-buf callbacks, or something similar. We can't rely on userspace setting a "CPU access" flag to make sure a BO can be shared with other devices?

Well, I just jumped into the middle of this; it's not entirely out of the question as an idea, but yeah, we'd need to rework the dma-buf stuff with probably a callback to evict mappings/stall for outstanding rendering or something like that.

Well, the flag was never intended to be used by userspace.

See, the history was more like: we need something in the kernel to place the BO in CPU accessible VRAM. Then the closed source UMD came along and said hey, we have the concept of two different heaps for visible and invisible VRAM, how does that map to amdgpu? I unfortunately was too tired to push back hard enough on this.

Ehrm, are you saying you have uapi for the closed source stack only?

No, Mesa is using that flag as well. What I'm saying is that we have a flag which became uapi because I was too lazy to distinguish between uapi and kernel internal flags.

I can help with the push back on this with a revert, no problem :-)

That would break Mesa and is not an option (unfortunately :).

Christian.
Re: [PATCH 4/5] drm/amdgpu: Set/clear CPU_ACCESS_REQUIRED flag on page fault and CS
On Fri, Jun 30, 2017 at 08:47:27AM +0200, Christian König wrote: > Am 30.06.2017 um 04:24 schrieb Michel Dänzer: > > On 29/06/17 07:05 PM, Daniel Vetter wrote: > > > On Thu, Jun 29, 2017 at 06:58:05PM +0900, Michel Dänzer wrote: > > > > On 29/06/17 05:23 PM, Christian König wrote: > > > > > Am 29.06.2017 um 04:35 schrieb Michel Dänzer: > > > > > > On 29/06/17 08:26 AM, John Brooks wrote: > > > > > > > On Wed, Jun 28, 2017 at 03:05:32PM +0200, Christian König wrote: > > > > > > > > > Instead of the flag being set in stone at BO creation, set > > > > > > > > > the flag > > > > > > > > > when a > > > > > > > > > page fault occurs so that it goes somewhere CPU-visible, and > > > > > > > > > clear > > > > > > > > > it when > > > > > > > > > the BO is requested by the GPU. > > > > > > > > > > > > > > > > > > However, clearing the CPU_ACCESS_REQUIRED flag may move BOs > > > > > > > > > in GTT to > > > > > > > > > invisible VRAM, where they may promptly generate another page > > > > > > > > > fault. When > > > > > > > > > BOs are constantly moved back and forth like this, it is > > > > > > > > > highly > > > > > > > > > detrimental > > > > > > > > > to performance. Only clear the flag on CS if: > > > > > > > > > > > > > > > > > > - The BO wasn't page faulted for a certain amount of time > > > > > > > > > (currently 10 > > > > > > > > > seconds), and > > > > > > > > > - its last page fault didn't occur too soon (currently 500ms) > > > > > > > > > after > > > > > > > > > its > > > > > > > > > last CS request, or vice versa. > > > > > > > > > > > > > > > > > > Setting the flag in amdgpu_fault_reserve_notify() also means > > > > > > > > > that > > > > > > > > > we can > > > > > > > > > remove the loop to restrict lpfn to the end of visible VRAM, > > > > > > > > > because > > > > > > > > > amdgpu_ttm_placement_init() will do it for us. 
> > > > > > > > I'm fine with the general approach, but I'm still absolutely not > > > > > > > > keen about > > > > > > > > clearing the flag when userspace has originally specified it. > > > > > > Is there any specific concern you have about that? > > > > > Yeah, quite a bunch actually. We want to use this flag for P2P buffer > > > > > sharing in the future as well and I don't intend to add another one > > > > > like > > > > > CPU_ACCESS_REALLY_REQUIRED or something like this. > > > > Won't a BO need to be pinned while it's being shared with another > > > > device? > > > That's an artifact of the current kernel implementation, I think we could > > > do better (but for current use-cases where we share a bunch of scanouts > > > and maybe a few pixmaps it's pointless). I wouldn't bet uapi on this never > > > changing. > > Surely there will need to be some kind of transaction though to let the > > driver know when a BO starts/stops being shared with another device? > > Either via the existing dma-buf callbacks, or something similar. We > > can't rely on userspace setting a "CPU access" flag to make sure a BO > > can be shared with other devices? Well I just jumped into the middle of this that it's not entirely out of the question as an idea, but yeah we'd need to rework the dma-buf stuff with probably a callback to evict mappings/stall for outstanding rendering or something like that. > Well, the flag was never intended to be used by userspace. > > See the history was more like we need something in the kernel to place the > BO in CPU accessible VRAM. > > Then the closed source UMD came along and said hey we have the concept of > two different heaps for visible and invisible VRAM, how does that map to > amdgpu? > > I unfortunately was too tired to push back hard enough on this Ehrm, are you saying you have uapi for the closed source stack only? 
I can help with the push back on this with a revert, no problem :-) -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
On Fri, Jun 30, 2017 at 12:34 PM, Christian König wrote:
> Am 30.06.2017 um 09:14 schrieb Michel Dänzer:
>> On 30/06/17 03:59 PM, Christian König wrote:
>>> Am 30.06.2017 um 08:51 schrieb Michel Dänzer:
>>>> We can deal with that internally in the kernel, while fixing the
>>>> existing flag for userspace.
>>>
>>> And as I said, NAK to that approach. I'm not going to add a
>>> CPU_ACCESS_REALLY_REQUIRED flag in the kernel just because mesa has
>>> messed up its use case.
>>>
>>> We could agree on filtering that flag from userspace when BOs are
>>> created and/or map it to a CREATE_CPU_ACCESS_HINT flag.
>>
>> Then I propose the following:
>>
>> One patch:
>>
>> Convert AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to a kernel internal flag
>> AMDGPU_GEM_CPU_ACCESS_HINT in amdgpu_gem_create_ioctl, which is
>> initially treated the same way as AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED.
>>
>> Another patch:
>>
>> Change the treatment of AMDGPU_GEM_CPU_ACCESS_HINT according to John's
>> patch 4 in the latest series, or a variation of that as discussed on IRC.
>>
>> If any regressions are reported, we will be able to differentiate
>> whether they are due to the addition of the new flag itself or due to
>> the change in its handling.
>
> It just occurred to me that there is a simpler way of handling this: We just
> never clear the flag on kernel allocations.
>
> See my main concern are the in kernel users of the flag which use it as
> guarantee that the BO is CPU accessible.
>
> If we handle those specially there shouldn't be a problem clearing the flag
> for the UMD BOs.

Hi,

I don't know what is being talked about here anymore, but I wouldn't like to use CPU_ACCESS_REQUIRED or CPU_ACCESS_REALLY_REQUIRED in userspace. The reason is that userspace doesn't and can't know whether CPU access will be required, and the frequency at which it will be required.

3 heaps {no CPU access, no flag, CPU access required} are too many. Userspace mostly doesn't use the "no flag" heap for VRAM. 
It uses "CPU access required" for almost everything except tiled textures, which use "no CPU access". I've been trying to trim down the number of heaps. So far, I have:

- VRAM_NO_CPU_ACCESS (implies WC)
- VRAM (implies WC)
- VRAM_GTT (combined, implies WC)
- GTT_WC
- GTT

See, you can't forbid CPU access for the combined VRAM_GTT heap. It's one of the compromises there. The more heaps we have, the more memory can be wasted by suballocators. It's silly to have more than 3 suballocators just for VRAM.

Marek
[PATCH 08/12] drm/amdgpu: add amdgpu_gart_map function
From: Christian König This allows us to write the mapped PTEs into an IB instead of the table directly. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 64 2 files changed, 52 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 810796a..4a2b33d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -572,6 +572,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev); void amdgpu_gart_fini(struct amdgpu_device *adev); int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset, int pages); +int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset, + int pages, dma_addr_t *dma_addr, uint64_t flags, + void *dst); int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset, int pages, struct page **pagelist, dma_addr_t *dma_addr, uint64_t flags); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c index 8877015..d99b2b2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c @@ -280,6 +280,43 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset, } /** + * amdgpu_gart_map - map dma_addresses into GART entries + * + * @adev: amdgpu_device pointer + * @offset: offset into the GPU's gart aperture + * @pages: number of pages to bind + * @dma_addr: DMA addresses of pages + * + * Map the dma_addresses into GART entries (all asics). + * Returns 0 for success, -EINVAL for failure. 
+ */ +int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset, + int pages, dma_addr_t *dma_addr, uint64_t flags, + void *dst) +{ + uint64_t page_base; + unsigned t, p; + int i, j; + + if (!adev->gart.ready) { + WARN(1, "trying to bind memory to uninitialized GART !\n"); + return -EINVAL; + } + + t = offset / AMDGPU_GPU_PAGE_SIZE; + p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); + + for (i = 0; i < pages; i++, p++) { + page_base = dma_addr[i]; + for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) { + amdgpu_gart_set_pte_pde(adev, dst, t, page_base, flags); + page_base += AMDGPU_GPU_PAGE_SIZE; + } + } + return 0; +} + +/** * amdgpu_gart_bind - bind pages into the gart page table * * @adev: amdgpu_device pointer @@ -296,31 +333,28 @@ int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset, int pages, struct page **pagelist, dma_addr_t *dma_addr, uint64_t flags) { - unsigned t; - unsigned p; - uint64_t page_base; - int i, j; +#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS + unsigned i; +#endif + int r; if (!adev->gart.ready) { WARN(1, "trying to bind memory to uninitialized GART !\n"); return -EINVAL; } - t = offset / AMDGPU_GPU_PAGE_SIZE; - p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); - - for (i = 0; i < pages; i++, p++) { #ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS + for (i = 0; i < pages; i++, p++) adev->gart.pages[p] = pagelist[i]; #endif - if (adev->gart.ptr) { - page_base = dma_addr[i]; - for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) { - amdgpu_gart_set_pte_pde(adev, adev->gart.ptr, t, page_base, flags); - page_base += AMDGPU_GPU_PAGE_SIZE; - } - } + + if (adev->gart.ptr) { + r = amdgpu_gart_map(adev, offset, pages, dma_addr, flags, + adev->gart.ptr); + if (r) + return r; } + mb(); amdgpu_gart_flush_gpu_tlb(adev, 0); return 0; -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 07/12] drm/amdgpu: reserve the first 2x2MB of GART
From: Christian König

We want to use them as remap address space.

Signed-off-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 5 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h     | 3 +++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 6fdf83a..a0976dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -43,12 +43,15 @@ static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man,
 			       unsigned long p_size)
 {
 	struct amdgpu_gtt_mgr *mgr;
+	uint64_t start, size;
 
 	mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);
 	if (!mgr)
 		return -ENOMEM;
 
-	drm_mm_init(&mgr->mm, 0, p_size);
+	start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
+	size = p_size - start;
+	drm_mm_init(&mgr->mm, start, size);
 	spin_lock_init(&mgr->lock);
 	mgr->available = p_size;
 	man->priv = mgr;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 2ade5c5..9c4da0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -34,6 +34,9 @@
 #define AMDGPU_PL_FLAG_GWS	(TTM_PL_FLAG_PRIV << 1)
 #define AMDGPU_PL_FLAG_OA	(TTM_PL_FLAG_PRIV << 2)
 
+#define AMDGPU_GTT_MAX_TRANSFER_SIZE	512
+#define AMDGPU_GTT_NUM_TRANSFER_WINDOWS	2
+
 struct amdgpu_mman {
 	struct ttm_bo_global_ref	bo_global_ref;
 	struct drm_global_reference	mem_global_ref;
--
2.7.4
[PATCH 04/12] drm/amdgpu: add vm_needs_flush parameter to amdgpu_copy_buffer
From: Christian König This allows us to flush the system VM here. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 12 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 9 - 5 files changed, 15 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c index 1beae5b..2fb299a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c @@ -40,7 +40,7 @@ static int amdgpu_benchmark_do_move(struct amdgpu_device *adev, unsigned size, for (i = 0; i < n; i++) { struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring; r = amdgpu_copy_buffer(ring, saddr, daddr, size, NULL, &fence, - false); + false, false); if (r) goto exit_do_move; r = dma_fence_wait(fence, false); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 8ee6965..c34cf2c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -535,7 +535,7 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev, r = amdgpu_copy_buffer(ring, bo_addr, shadow_addr, amdgpu_bo_size(bo), resv, fence, - direct); + direct, false); if (!r) amdgpu_bo_fence(bo, *fence, true); @@ -588,7 +588,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev, r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr, amdgpu_bo_size(bo), resv, fence, - direct); + direct, false); if (!r) amdgpu_bo_fence(bo, *fence, true); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c index 15510da..d02e611 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c @@ -111,7 +111,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev) 
amdgpu_bo_kunmap(gtt_obj[i]); r = amdgpu_copy_buffer(ring, gtt_addr, vram_addr, - size, NULL, &fence, false); + size, NULL, &fence, false, false); if (r) { DRM_ERROR("Failed GTT->VRAM copy %d\n", i); @@ -156,7 +156,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev) amdgpu_bo_kunmap(vram_obj); r = amdgpu_copy_buffer(ring, vram_addr, gtt_addr, - size, NULL, &fence, false); + size, NULL, &fence, false, false); if (r) { DRM_ERROR("Failed VRAM->GTT copy %d\n", i); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index e4860ac..bbe1639 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -318,7 +318,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, r = amdgpu_copy_buffer(ring, old_start, new_start, cur_pages * PAGE_SIZE, - bo->resv, &next, false); + bo->resv, &next, false, false); if (r) goto error; @@ -1256,12 +1256,11 @@ int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma) return ttm_bo_mmap(filp, vma, &adev->mman.bdev); } -int amdgpu_copy_buffer(struct amdgpu_ring *ring, - uint64_t src_offset, - uint64_t dst_offset, - uint32_t byte_count, +int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset, + uint64_t dst_offset, uint32_t byte_count, struct reservation_object *resv, - struct dma_fence **fence, bool direct_submit) + struct dma_fence **fence, bool direct_submit, + bool vm_needs_flush) { struct amdgpu_device *adev = ring->adev; struct amdgpu_job *job; @@ -1283,6 +1282,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring, if (r) return r; + job->vm_needs_flush = vm_needs_flush; if (resv) { r = amdgpu_sync_resv(adev, &job->sync, resv, AMDGPU_FENCE_OWNER_UNDEFINED); diff --git a/drivers/gpu/drm/amd/amdgpu/amdg
[PATCH 05/12] drm/amdgpu: bind BOs to TTM only once
From: Christian König No need to do this on every round. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 70 ++--- 1 file changed, 29 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index bbe1639..5bfe7f6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -252,29 +252,15 @@ static void amdgpu_move_null(struct ttm_buffer_object *bo, new_mem->mm_node = NULL; } -static int amdgpu_mm_node_addr(struct ttm_buffer_object *bo, - struct drm_mm_node *mm_node, - struct ttm_mem_reg *mem, - uint64_t *addr) +static uint64_t amdgpu_mm_node_addr(struct ttm_buffer_object *bo, + struct drm_mm_node *mm_node, + struct ttm_mem_reg *mem) { - int r; - - switch (mem->mem_type) { - case TTM_PL_TT: - r = amdgpu_ttm_bind(bo, mem); - if (r) - return r; - - case TTM_PL_VRAM: - *addr = mm_node->start << PAGE_SHIFT; - *addr += bo->bdev->man[mem->mem_type].gpu_offset; - break; - default: - DRM_ERROR("Unknown placement %d\n", mem->mem_type); - return -EINVAL; - } + uint64_t addr; - return 0; + addr = mm_node->start << PAGE_SHIFT; + addr += bo->bdev->man[mem->mem_type].gpu_offset; + return addr; } static int amdgpu_move_blit(struct ttm_buffer_object *bo, @@ -298,18 +284,25 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, return -EINVAL; } + if (old_mem->mem_type == TTM_PL_TT) { + r = amdgpu_ttm_bind(bo, old_mem); + if (r) + return r; + } + old_mm = old_mem->mm_node; - r = amdgpu_mm_node_addr(bo, old_mm, old_mem, &old_start); - if (r) - return r; old_size = old_mm->size; + old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem); + if (new_mem->mem_type == TTM_PL_TT) { + r = amdgpu_ttm_bind(bo, new_mem); + if (r) + return r; + } new_mm = new_mem->mm_node; - r = amdgpu_mm_node_addr(bo, new_mm, new_mem, &new_start); - if (r) - return r; new_size = new_mm->size; + new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem); num_pages = 
new_mem->num_pages; while (num_pages) { @@ -331,10 +324,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, old_size -= cur_pages; if (!old_size) { - r = amdgpu_mm_node_addr(bo, ++old_mm, old_mem, - &old_start); - if (r) - goto error; + old_start = amdgpu_mm_node_addr(bo, ++old_mm, old_mem); old_size = old_mm->size; } else { old_start += cur_pages * PAGE_SIZE; @@ -342,11 +332,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, new_size -= cur_pages; if (!new_size) { - r = amdgpu_mm_node_addr(bo, ++new_mm, new_mem, - &new_start); - if (r) - goto error; - + new_start = amdgpu_mm_node_addr(bo, ++new_mm, new_mem); new_size = new_mm->size; } else { new_start += cur_pages * PAGE_SIZE; @@ -1347,6 +1333,12 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, return -EINVAL; } + if (bo->tbo.mem.mem_type == TTM_PL_TT) { + r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem); + if (r) + return r; + } + num_pages = bo->tbo.num_pages; mm_node = bo->tbo.mem.mm_node; num_loops = 0; @@ -1382,11 +1374,7 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo, uint32_t byte_count = mm_node->size << PAGE_SHIFT; uint64_t dst_addr; - r = amdgpu_mm_node_addr(&bo->tbo, mm_node, - &bo->tbo.mem, &dst_addr); - if (r) - return r; - + dst_addr = amdgpu_mm_node_addr(&bo->tbo, mm_node, &bo->tbo.mem); while (byte_count) { uint32_t cur_size_in_bytes = min(byte_count, max_bytes); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 11/12] drm/amdgpu: remove maximum BO size limitation.
From: Christian König We can finally remove this now. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 --- 1 file changed, 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 96c4493..2382785 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -58,17 +58,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size, alignment = PAGE_SIZE; } - if (!(initial_domain & (AMDGPU_GEM_DOMAIN_GDS | AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA))) { - /* Maximum bo size is the unpinned gtt size since we use the gtt to -* handle vram to system pool migrations. -*/ - max_size = adev->mc.gtt_size - adev->gart_pin_size; - if (size > max_size) { - DRM_DEBUG("Allocation size %ldMb bigger than %ldMb limit\n", - size >> 20, max_size >> 20); - return -ENOMEM; - } - } retry: r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain, flags, NULL, NULL, &robj); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 06/12] drm/amdgpu: bind BOs with GTT space allocated directly
From: Christian König This avoids binding them later on. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 16 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 49 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 + 3 files changed, 46 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index f7d22c4..6fdf83a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -81,6 +81,20 @@ static int amdgpu_gtt_mgr_fini(struct ttm_mem_type_manager *man) } /** + * amdgpu_gtt_mgr_is_allocated - Check if mem has address space + * + * @mem: the mem object to check + * + * Check if a mem object has already address space allocated. + */ +bool amdgpu_gtt_mgr_is_alloced(struct ttm_mem_reg *mem) +{ + struct drm_mm_node *node = mem->mm_node; + + return (node->start != AMDGPU_BO_INVALID_OFFSET); +} + +/** * amdgpu_gtt_mgr_alloc - allocate new ranges * * @man: TTM memory type manager @@ -101,7 +115,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, unsigned long fpfn, lpfn; int r; - if (node->start != AMDGPU_BO_INVALID_OFFSET) + if (amdgpu_gtt_mgr_is_alloced(mem)) return 0; if (place) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 5bfe7f6..eb0d7d7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -681,6 +681,31 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm) sg_free_table(ttm->sg); } +static int amdgpu_ttm_do_bind(struct ttm_tt *ttm, struct ttm_mem_reg *mem) +{ + struct amdgpu_ttm_tt *gtt = (void *)ttm; + uint64_t flags; + int r; + + spin_lock(&gtt->adev->gtt_list_lock); + flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, mem); + gtt->offset = (u64)mem->start << PAGE_SHIFT; + r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages, + ttm->pages, gtt->ttm.dma_address, flags); + + if (r) { + 
DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", + ttm->num_pages, gtt->offset); + goto error_gart_bind; + } + + list_add_tail(&gtt->list, &gtt->adev->gtt_list); +error_gart_bind: + spin_unlock(&gtt->adev->gtt_list_lock); + return r; + +} + static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem) { @@ -704,7 +729,10 @@ static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm, bo_mem->mem_type == AMDGPU_PL_OA) return -EINVAL; - return 0; + if (amdgpu_gtt_mgr_is_alloced(bo_mem)) + r = amdgpu_ttm_do_bind(ttm, bo_mem); + + return r; } bool amdgpu_ttm_is_bound(struct ttm_tt *ttm) @@ -717,8 +745,6 @@ bool amdgpu_ttm_is_bound(struct ttm_tt *ttm) int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem) { struct ttm_tt *ttm = bo->ttm; - struct amdgpu_ttm_tt *gtt = (void *)bo->ttm; - uint64_t flags; int r; if (!ttm || amdgpu_ttm_is_bound(ttm)) @@ -731,22 +757,7 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem) return r; } - spin_lock(&gtt->adev->gtt_list_lock); - flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, bo_mem); - gtt->offset = (u64)bo_mem->start << PAGE_SHIFT; - r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages, - ttm->pages, gtt->ttm.dma_address, flags); - - if (r) { - DRM_ERROR("failed to bind %lu pages at 0x%08llX\n", - ttm->num_pages, gtt->offset); - goto error_gart_bind; - } - - list_add_tail(&gtt->list, &gtt->adev->gtt_list); -error_gart_bind: - spin_unlock(&gtt->adev->gtt_list_lock); - return r; + return amdgpu_ttm_do_bind(ttm, bo_mem); } int amdgpu_ttm_recover_gart(struct amdgpu_device *adev) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index cd5bbfa..2ade5c5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -56,6 +56,7 @@ struct amdgpu_mman { extern const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func; extern const struct ttm_mem_type_manager_func amdgpu_vram_mgr_func; +bool 
amdgpu_gtt_mgr_is_alloced(struct ttm_mem_reg *mem); int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man, struct ttm_buffer_object *tbo, const struct ttm_place *place, -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 12/12] drm/amdgpu: add gtt_sys_limit
From: Christian König Limit the size of the GART table for the system domain. This saves us a bunch of visible VRAM, but also limits the maximum BO size we can swap out. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c| 8 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 -- 5 files changed, 22 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 4a2b33d..ef8e6b9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -76,6 +76,7 @@ extern int amdgpu_modeset; extern int amdgpu_vram_limit; extern int amdgpu_gart_size; +extern unsigned amdgpu_gart_sys_limit; extern int amdgpu_moverate; extern int amdgpu_benchmarking; extern int amdgpu_testing; @@ -605,6 +606,7 @@ struct amdgpu_mc { u64 mc_vram_size; u64 visible_vram_size; u64 gtt_size; + u64 gtt_sys_limit; u64 gtt_start; u64 gtt_end; u64 vram_start; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 5b1220f..7e3f8cb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -1122,6 +1122,12 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev) } } + if (amdgpu_gart_sys_limit < 32) { + dev_warn(adev->dev, "gart sys limit (%d) too small\n", +amdgpu_gart_sys_limit); + amdgpu_gart_sys_limit = 32; + } + amdgpu_check_vm_size(adev); amdgpu_check_block_size(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 5a1d794..907ae5e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -75,6 +75,7 @@ int amdgpu_vram_limit = 0; int amdgpu_gart_size = -1; /* auto */ +unsigned amdgpu_gart_sys_limit = 256; int amdgpu_moverate = -1; /* auto */ int 
amdgpu_benchmarking = 0; int amdgpu_testing = 0; @@ -124,6 +125,9 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 0600); MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc., -1 = auto)"); module_param_named(gartsize, amdgpu_gart_size, int, 0600); +MODULE_PARM_DESC(gartlimit, "GART limit for the system domain in megabytes (default 256)"); +module_param_named(gartlimit, amdgpu_gart_sys_limit, int, 0600); + MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, etc., -1=auto, 0=1=disabled)"); module_param_named(moverate, amdgpu_moverate, int, 0600); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c index d99b2b2..f82eeaa 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c @@ -70,6 +70,9 @@ void amdgpu_gart_set_defaults(struct amdgpu_device *adev) adev->mc.mc_vram_size); else adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20; + + adev->mc.gtt_sys_limit = min((uint64_t)amdgpu_gart_sys_limit << 20, +adev->mc.gtt_size); } /** @@ -384,8 +387,9 @@ int amdgpu_gart_init(struct amdgpu_device *adev) if (r) return r; /* Compute table size */ - adev->gart.num_cpu_pages = adev->mc.gtt_size / PAGE_SIZE; - adev->gart.num_gpu_pages = adev->mc.gtt_size / AMDGPU_GPU_PAGE_SIZE; + adev->gart.num_cpu_pages = adev->mc.gtt_sys_limit / PAGE_SIZE; + adev->gart.num_gpu_pages = adev->mc.gtt_sys_limit / + AMDGPU_GPU_PAGE_SIZE; DRM_INFO("GART: num cpu pages %u, num gpu pages %u\n", adev->gart.num_cpu_pages, adev->gart.num_gpu_pages); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index a0976dc..9b516c5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -42,6 +42,7 @@ struct amdgpu_gtt_mgr { static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man, unsigned long p_size) { + struct amdgpu_device *adev = 
amdgpu_ttm_adev(man->bdev); struct amdgpu_gtt_mgr *mgr; uint64_t start, size; @@ -50,7 +51,7 @@ static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man, return -ENOMEM; start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS; - size = p_size - start; + size = (adev->mc.gtt_sys_limit >> PAGE_SHIF
[PATCH 10/12] drm/amdgpu: stop mapping BOs to GTT
From: Christian König No need to map BOs to GTT on eviction and intermediate transfers any more. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 19 ++- 1 file changed, 2 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 247ce21..e1ebcba 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -200,7 +200,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo, .lpfn = 0, .flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM }; - unsigned i; if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) { placement->placement = &placements; @@ -218,20 +217,6 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo, amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU); } else { amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT); - for (i = 0; i < abo->placement.num_placement; ++i) { - if (!(abo->placements[i].flags & - TTM_PL_FLAG_TT)) - continue; - - if (abo->placements[i].lpfn) - continue; - - /* set an upper limit to force directly -* allocating address space for the BO. 
-*/ - abo->placements[i].lpfn = - adev->mc.gtt_size >> PAGE_SHIFT; - } } break; case TTM_PL_TT: @@ -391,7 +376,7 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo, placement.num_busy_placement = 1; placement.busy_placement = &placements; placements.fpfn = 0; - placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT; + placements.lpfn = 0; placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT; r = ttm_bo_mem_space(bo, &placement, &tmp_mem, interruptible, no_wait_gpu); @@ -438,7 +423,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, placement.num_busy_placement = 1; placement.busy_placement = &placements; placements.fpfn = 0; - placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT; + placements.lpfn = 0; placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT; r = ttm_bo_mem_space(bo, &placement, &tmp_mem, interruptible, no_wait_gpu); -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 03/12] drm/amdgpu: allow flushing VMID0 before IB execution as well
From: Christian König This allows us to queue IBs which need an up-to-date system domain as well. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c index f774b3f..1b30d2a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c @@ -172,7 +172,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, if (ring->funcs->insert_start) ring->funcs->insert_start(ring); - if (vm) { + if (job) { r = amdgpu_vm_flush(ring, job); if (r) { amdgpu_ring_undo(ring); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 3d641e1..4510627 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -81,6 +81,8 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size, r = amdgpu_ib_get(adev, NULL, size, &(*job)->ibs[0]); if (r) kfree(*job); + else + (*job)->vm_pd_addr = adev->gart.table_addr; return r; } -- 2.7.4
[PATCH 09/12] drm/amdgpu: use the GTT windows for BO moves
From: Christian König This way we don't need to map the full BO at a time any more. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 127 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 3 + 2 files changed, 111 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index eb0d7d7..247ce21 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -47,10 +47,15 @@ #define DRM_FILE_PAGE_OFFSET (0x1ULL >> PAGE_SHIFT) +static int amdgpu_map_buffer(struct ttm_buffer_object *bo, +struct ttm_mem_reg *mem, +unsigned num_pages, uint64_t offset, +struct amdgpu_ring *ring, +uint64_t *addr); + static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev); static void amdgpu_ttm_debugfs_fini(struct amdgpu_device *adev); - /* * Global memory. */ @@ -97,6 +102,9 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev) goto error_bo; } + mutex_init(&adev->mman.gtt_window_lock); + adev->mman.gtt_index = 0; + ring = adev->mman.buffer_funcs_ring; rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_KERNEL]; r = amd_sched_entity_init(&ring->sched, &adev->mman.entity, @@ -123,6 +131,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev) if (adev->mman.mem_global_referenced) { amd_sched_entity_fini(adev->mman.entity.sched, &adev->mman.entity); + mutex_destroy(&adev->mman.gtt_window_lock); drm_global_item_unref(&adev->mman.bo_global_ref.ref); drm_global_item_unref(&adev->mman.mem_global_ref); adev->mman.mem_global_referenced = false; @@ -256,10 +265,12 @@ static uint64_t amdgpu_mm_node_addr(struct ttm_buffer_object *bo, struct drm_mm_node *mm_node, struct ttm_mem_reg *mem) { - uint64_t addr; + uint64_t addr = 0; - addr = mm_node->start << PAGE_SHIFT; - addr += bo->bdev->man[mem->mem_type].gpu_offset; + if (mm_node->start != AMDGPU_BO_INVALID_OFFSET) { + addr = mm_node->start << PAGE_SHIFT; + addr += 
bo->bdev->man[mem->mem_type].gpu_offset; + } return addr; } @@ -284,34 +295,41 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, return -EINVAL; } - if (old_mem->mem_type == TTM_PL_TT) { - r = amdgpu_ttm_bind(bo, old_mem); - if (r) - return r; - } - old_mm = old_mem->mm_node; old_size = old_mm->size; old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem); - if (new_mem->mem_type == TTM_PL_TT) { - r = amdgpu_ttm_bind(bo, new_mem); - if (r) - return r; - } - new_mm = new_mem->mm_node; new_size = new_mm->size; new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem); num_pages = new_mem->num_pages; + mutex_lock(&adev->mman.gtt_window_lock); while (num_pages) { - unsigned long cur_pages = min(old_size, new_size); + unsigned long cur_pages = min(min(old_size, new_size), + (u64)AMDGPU_GTT_MAX_TRANSFER_SIZE); + uint64_t from = old_start, to = new_start; struct dma_fence *next; - r = amdgpu_copy_buffer(ring, old_start, new_start, + if (old_mem->mem_type == TTM_PL_TT && + !amdgpu_gtt_mgr_is_alloced(old_mem)) { + r = amdgpu_map_buffer(bo, old_mem, cur_pages, + old_start, ring, &from); + if (r) + goto error; + } + + if (new_mem->mem_type == TTM_PL_TT && + !amdgpu_gtt_mgr_is_alloced(new_mem)) { + r = amdgpu_map_buffer(bo, new_mem, cur_pages, + new_start, ring, &to); + if (r) + goto error; + } + + r = amdgpu_copy_buffer(ring, from, to, cur_pages * PAGE_SIZE, - bo->resv, &next, false, false); + bo->resv, &next, false, true); if (r) goto error; @@ -338,12 +356,15 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, new_start += cur_pages * PAGE_SIZE; } } + mutex_unlock(&adev->mman.gtt_window_lock);
[PATCH 02/12] drm/amdgpu: fix amdgpu_ring_write_multiple
From: Christian König Overwriting ring content that is still in use has a low probability of causing problems; not writing at all is guaranteed to cause problems. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 04cbc3a..322d2529 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -227,10 +227,8 @@ static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, unsigned occupied, chunk1, chunk2; void *dst; - if (unlikely(ring->count_dw < count_dw)) { + if (unlikely(ring->count_dw < count_dw)) DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n"); - return; - } occupied = ring->wptr & ring->buf_mask; dst = (void *)&ring->ring[occupied]; -- 2.7.4
[PATCH 01/12] drm/amdgpu: move ring helpers to amdgpu_ring.h
From: Christian König Keep them where they belong. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 44 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 42 ++ 2 files changed, 42 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index ab1dad2..810796a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1801,50 +1801,6 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device *adev); #define RBIOS16(i) (RBIOS8(i) | (RBIOS8((i)+1) << 8)) #define RBIOS32(i) ((RBIOS16(i)) | (RBIOS16((i)+2) << 16)) -/* - * RING helpers. - */ -static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v) -{ - if (ring->count_dw <= 0) - DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n"); - ring->ring[ring->wptr++ & ring->buf_mask] = v; - ring->wptr &= ring->ptr_mask; - ring->count_dw--; -} - -static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *src, int count_dw) -{ - unsigned occupied, chunk1, chunk2; - void *dst; - - if (unlikely(ring->count_dw < count_dw)) { - DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n"); - return; - } - - occupied = ring->wptr & ring->buf_mask; - dst = (void *)&ring->ring[occupied]; - chunk1 = ring->buf_mask + 1 - occupied; - chunk1 = (chunk1 >= count_dw) ? 
count_dw: chunk1; - chunk2 = count_dw - chunk1; - chunk1 <<= 2; - chunk2 <<= 2; - - if (chunk1) - memcpy(dst, src, chunk1); - - if (chunk2) { - src += chunk1; - dst = (void *)ring->ring; - memcpy(dst, src, chunk2); - } - - ring->wptr += count_dw; - ring->wptr &= ring->ptr_mask; - ring->count_dw -= count_dw; -} - static inline struct amdgpu_sdma_instance * amdgpu_get_sdma_instance(struct amdgpu_ring *ring) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index bc8dec9..04cbc3a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -212,4 +212,46 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring) } +static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v) +{ + if (ring->count_dw <= 0) + DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n"); + ring->ring[ring->wptr++ & ring->buf_mask] = v; + ring->wptr &= ring->ptr_mask; + ring->count_dw--; +} + +static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, + void *src, int count_dw) +{ + unsigned occupied, chunk1, chunk2; + void *dst; + + if (unlikely(ring->count_dw < count_dw)) { + DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n"); + return; + } + + occupied = ring->wptr & ring->buf_mask; + dst = (void *)&ring->ring[occupied]; + chunk1 = ring->buf_mask + 1 - occupied; + chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1; + chunk2 = count_dw - chunk1; + chunk1 <<= 2; + chunk2 <<= 2; + + if (chunk1) + memcpy(dst, src, chunk1); + + if (chunk2) { + src += chunk1; + dst = (void *)ring->ring; + memcpy(dst, src, chunk2); + } + + ring->wptr += count_dw; + ring->wptr &= ring->ptr_mask; + ring->count_dw -= count_dw; +} + #endif -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
Am 30.06.2017 um 09:14 schrieb Michel Dänzer: On 30/06/17 03:59 PM, Christian König wrote: Am 30.06.2017 um 08:51 schrieb Michel Dänzer: We can deal with that internally in the kernel, while fixing the existing flag for userspace. And as I said, NAK to that approach. I'm not going to add a CPU_ACCESS_REALLY_REQUIRED flag in the kernel just because mesa has messed up its use case. We could agree on filtering that flag from userspace when BOs are created and/or map it to a CREATE_CPU_ACCESS_HINT flag. Then I propose the following: One patch: Convert AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to a kernel internal flag AMDGPU_GEM_CPU_ACCESS_HINT in amdgpu_gem_create_ioctl, which is initially treated the same way as AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED. Another patch: Change the treatment of AMDGPU_GEM_CPU_ACCESS_HINT according to John's patch 4 in the latest series, or a variation of that as discussed on IRC. If any regressions are reported, we will be able to differentiate whether they are due to the addition of the new flag itself or due to the change in its handling. It just occurred to me that there is a simpler way of handling this: We just never clear the flag on kernel allocations. See, my main concern is the in-kernel users of the flag, which use it as a guarantee that the BO is CPU accessible. If we handle those specially there shouldn't be a problem clearing the flag for the UMD BOs. Christian.
[PATCH xf86-video-ati] Use pRADEONEnt->fd exclusively for the DRM file descriptor
From: Michel Dänzer This brings us closer to amdgpu. Signed-off-by: Michel Dänzer --- src/drmmode_display.c | 151 + src/drmmode_display.h | 1 - src/radeon.h | 14 +++-- src/radeon_accel.c | 4 +- src/radeon_bo_helper.c | 8 ++- src/radeon_dri2.c | 54 ++ src/radeon_dri2.h | 1 - src/radeon_dri3.c | 3 +- src/radeon_exa.c | 4 +- src/radeon_glamor.c| 3 +- src/radeon_kms.c | 70 --- src/radeon_present.c | 13 +++-- 12 files changed, 172 insertions(+), 154 deletions(-) diff --git a/src/drmmode_display.c b/src/drmmode_display.c index dd394ec1d..4b964b7b9 100644 --- a/src/drmmode_display.c +++ b/src/drmmode_display.c @@ -272,7 +272,7 @@ int drmmode_get_current_ust(int drm_fd, CARD64 *ust) int drmmode_crtc_get_ust_msc(xf86CrtcPtr crtc, CARD64 *ust, CARD64 *msc) { ScrnInfoPtr scrn = crtc->scrn; -RADEONInfoPtr info = RADEONPTR(scrn); +RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn); drmVBlank vbl; int ret; @@ -280,7 +280,7 @@ int drmmode_crtc_get_ust_msc(xf86CrtcPtr crtc, CARD64 *ust, CARD64 *msc) vbl.request.type |= radeon_populate_vbl_request_type(crtc); vbl.request.sequence = 0; -ret = drmWaitVBlank(info->dri2.drm_fd, &vbl); +ret = drmWaitVBlank(pRADEONEnt->fd, &vbl); if (ret) { xf86DrvMsg(scrn->scrnIndex, X_WARNING, "get vblank counter failed: %s\n", strerror(errno)); @@ -298,7 +298,7 @@ drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) { drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private; ScrnInfoPtr scrn = crtc->scrn; - RADEONInfoPtr info = RADEONPTR(scrn); + RADEONEntPtr pRADEONEnt = RADEONEntPriv(scrn); CARD64 ust; int ret; @@ -318,7 +318,7 @@ drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) vbl.request.type = DRM_VBLANK_RELATIVE; vbl.request.type |= radeon_populate_vbl_request_type(crtc); vbl.request.sequence = 0; - ret = drmWaitVBlank(info->dri2.drm_fd, &vbl); + ret = drmWaitVBlank(pRADEONEnt->fd, &vbl); if (ret) xf86DrvMsg(scrn->scrnIndex, X_ERROR, "%s cannot get last vblank counter\n", @@ -345,7 +345,7 @@ drmmode_do_crtc_dpms(xf86CrtcPtr crtc, int mode) * Off->On 
transition: calculate and accumulate the * number of interpolated vblanks while we were in Off state */ - ret = drmmode_get_current_ust(info->dri2.drm_fd, &ust); + ret = drmmode_get_current_ust(pRADEONEnt->fd, &ust); if (ret) xf86DrvMsg(scrn->scrnIndex, X_ERROR, "%s cannot get current time\n", __func__); @@ -365,7 +365,7 @@ static void drmmode_crtc_dpms(xf86CrtcPtr crtc, int mode) { drmmode_crtc_private_ptr drmmode_crtc = crtc->driver_private; - drmmode_ptr drmmode = drmmode_crtc->drmmode; + RADEONEntPtr pRADEONEnt = RADEONEntPriv(crtc->scrn); /* Disable unused CRTCs */ if (!crtc->enabled || mode != DPMSModeOn) { @@ -373,9 +373,9 @@ drmmode_crtc_dpms(xf86CrtcPtr crtc, int mode) if (drmmode_crtc->flip_pending) return; - drmModeSetCrtc(drmmode->fd, drmmode_crtc->mode_crtc->crtc_id, + drmModeSetCrtc(pRADEONEnt->fd, drmmode_crtc->mode_crtc->crtc_id, 0, 0, 0, NULL, 0, NULL); - drmmode_fb_reference(drmmode->fd, &drmmode_crtc->fb, NULL); + drmmode_fb_reference(pRADEONEnt->fd, &drmmode_crtc->fb, NULL); } else if (drmmode_crtc->dpms_mode != DPMSModeOn) crtc->funcs->set_mode_major(crtc, &crtc->mode, crtc->rotation, crtc->x, crtc->y); @@ -385,6 +385,7 @@ static PixmapPtr create_pixmap_for_fbcon(drmmode_ptr drmmode, ScrnInfoPtr pScrn, int fbcon_id) { + RADEONEntPtr pRADEONEnt = RADEONEntPriv(pScrn); RADEONInfoPtr info = RADEONPTR(pScrn); PixmapPtr pixmap = info->fbcon_pixmap; struct radeon_bo *bo; @@ -394,7 +395,7 @@ create_pixmap_for_fbcon(drmmode_ptr drmmode, if (pixmap) return pixmap; - fbcon = drmModeGetFB(drmmode->fd, fbcon_id); + fbcon = drmModeGetFB(pRADEONEnt->fd, fbcon_id); if (fbcon == NULL) return NULL; @@ -404,7 +405,7 @@ create_pixmap_for_fbcon(drmmode_ptr drmmode, goto out_free_fb; flink.handle = fbcon->handle; - if (ioctl(drmmode->fd, DRM_IOCTL_GEM_FLINK, &flink) < 0) { + if (ioctl(pRADEONEnt->fd, DRM_IOCTL_GEM_FLINK, &flink) < 0) { xf86DrvMsg(pScrn->scrnIndex, X_ERROR, "Couldn't flink fbcon handle\n"); goto out
[PATCH] amdgpu: Set cik/si_support to 1 by default if radeon isn't built
From: Michel Dänzer It was required to explicitly set these parameters to 1, even if the radeon driver isn't built at all, which is not intuitive. Reported-by: Shawn Starr Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 27599db7d630..58770fc40520 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -234,14 +234,28 @@ MODULE_PARM_DESC(param_buf_per_se, "the size of Off-Chip Pramater Cache per Shad module_param_named(param_buf_per_se, amdgpu_param_buf_per_se, int, 0444); #ifdef CONFIG_DRM_AMDGPU_SI + +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) int amdgpu_si_support = 0; MODULE_PARM_DESC(si_support, "SI support (1 = enabled, 0 = disabled (default))"); +#else +int amdgpu_si_support = 1; +MODULE_PARM_DESC(si_support, "SI support (1 = enabled (default), 0 = disabled)"); +#endif + module_param_named(si_support, amdgpu_si_support, int, 0444); #endif #ifdef CONFIG_DRM_AMDGPU_CIK + +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) int amdgpu_cik_support = 0; MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled, 0 = disabled (default))"); +#else +int amdgpu_cik_support = 1; +MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = disabled)"); +#endif + module_param_named(cik_support, amdgpu_cik_support, int, 0444); #endif -- 2.13.1 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Make KIQ read/write register routine be atomic
Am 30.06.2017 um 03:21 schrieb Michel Dänzer: On 30/06/17 06:08 AM, Shaoyun Liu wrote: 1. Use spin lock instead of mutex in KIQ 2. Directly write to KIQ fence address instead of using fence_emit() 3. Disable the interrupt for KIQ read/write and use CPU polling This list indicates that this patch should be split up into at least three patches. :) Yeah, apart from that it is not a good idea to mess with the fence internals directly in the KIQ code, please add a helper in the fence code for this. Regards, Christian.
RE: [PATCH] drm/amd/amdgpu: move get memory type function from early init to sw init
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Jim Qu Sent: Friday, June 30, 2017 14:57 To: amd-gfx@lists.freedesktop.org Cc: Qu, Jim Subject: [PATCH] drm/amd/amdgpu: move get memory type function from early init to sw init On PX system, it will get memory type before gpu post, and get unknown type. Change-Id: I79e3760dd789c21a5f552bc4e5754f7a2defdaae Signed-off-by: Jim Qu --- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c index 5cc3f39..5ed6788f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c @@ -765,14 +765,6 @@ static int gmc_v6_0_early_init(void *handle) gmc_v6_0_set_gart_funcs(adev); gmc_v6_0_set_irq_funcs(adev); - if (adev->flags & AMD_IS_APU) { - adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; - } else { - u32 tmp = RREG32(mmMC_SEQ_MISC0); - tmp &= MC_SEQ_MISC0__MT__MASK; - adev->mc.vram_type = gmc_v6_0_convert_vram_type(tmp); - } - return 0; } @@ -792,6 +784,14 @@ static int gmc_v6_0_sw_init(void *handle) int dma_bits; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + if (adev->flags & AMD_IS_APU) { + adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; + } else { + u32 tmp = RREG32(mmMC_SEQ_MISC0); + tmp &= MC_SEQ_MISC0__MT__MASK; + adev->mc.vram_type = gmc_v6_0_convert_vram_type(tmp); + } + r = amdgpu_irq_add_id(adev, AMDGPU_IH_CLIENTID_LEGACY, 146, &adev->mc.vm_fault); if (r) return r; -- 1.9.1
[PATCH] drm/amd/amdgpu: move get memory type function from early init to sw init
On PX system, it will get memory type before gpu post, and get unknown type. Change-Id: I79e3760dd789c21a5f552bc4e5754f7a2defdaae Signed-off-by: Jim Qu --- drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c index 5cc3f39..5ed6788f 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c @@ -765,14 +765,6 @@ static int gmc_v6_0_early_init(void *handle) gmc_v6_0_set_gart_funcs(adev); gmc_v6_0_set_irq_funcs(adev); - if (adev->flags & AMD_IS_APU) { - adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; - } else { - u32 tmp = RREG32(mmMC_SEQ_MISC0); - tmp &= MC_SEQ_MISC0__MT__MASK; - adev->mc.vram_type = gmc_v6_0_convert_vram_type(tmp); - } - return 0; } @@ -792,6 +784,14 @@ static int gmc_v6_0_sw_init(void *handle) int dma_bits; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + if (adev->flags & AMD_IS_APU) { + adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN; + } else { + u32 tmp = RREG32(mmMC_SEQ_MISC0); + tmp &= MC_SEQ_MISC0__MT__MASK; + adev->mc.vram_type = gmc_v6_0_convert_vram_type(tmp); + } + r = amdgpu_irq_add_id(adev, AMDGPU_IH_CLIENTID_LEGACY, 146, &adev->mc.vm_fault); if (r) return r; -- 1.9.1
Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
On 30/06/17 03:59 PM, Christian König wrote: > Am 30.06.2017 um 08:51 schrieb Michel Dänzer: >> We can deal with that internally in the kernel, while fixing the >> existing flag for userspace. > > And as I said, NAK to that approach. I'm not going to add a > CPU_ACCESS_REALLY_REQUIRED flag in the kernel just because mesa has > messed up its use case. > > We could agree on filtering that flag from userspace when BOs are > created and/or map it to a CREATE_CPU_ACCESS_HINT flag. Then I propose the following: One patch: Convert AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED to a kernel internal flag AMDGPU_GEM_CPU_ACCESS_HINT in amdgpu_gem_create_ioctl, which is initially treated the same way as AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED. Another patch: Change the treatment of AMDGPU_GEM_CPU_ACCESS_HINT according to John's patch 4 in the latest series, or a variation of that as discussed on IRC. If any regressions are reported, we will be able to differentiate whether they are due to the addition of the new flag itself or due to the change in its handling. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: Deprecation of AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
On 30/06/17 01:00 PM, Mao, David wrote: > Sounds good! > One thing to confirm, If the original location is already in the > invisible, will the notifier callback to move the bo from invisible to > visible? Yes. > if it is true and the logic is already available in the kernel, can we > use NO_CPU_ACCESS flag by default to accomplish the similar purpose for > now? You mean set the NO_CPU_ACCESS flag for BOs in the "CPU invisible heap"? Yes, that's a good idea. However, we can also improve the kernel driver's handling of the CPU_ACCESS_REQUIRED flag so that userspace code can continue using it the way it has been. > It also reminds me of another related topic, can we always take visible > heap as priority against to the remote in this case? > So far, kernel don’t have the heap priority. > IIRC, if the LFB bo moved to GTT, it will never be moved back since GTT > is also its preferred heap. That can happen if userspace specifies both VRAM and GTT as preferred domains. It's one reason why that isn't recommended. > (Kernel seems to add the GTT even if the UMD only ask for LFB). I can only see robj->allowed_domains = robj->prefered_domains; if (robj->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM) robj->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT; which adds GTT as an *allowed* domain for BOs which only have VRAM as the preferred domain. Since VRAM is the only preferred domain, the driver will attempt to move the BO from GTT to VRAM on userspace command stream submissions (subject to throttling). -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx