[PATCH] drm/amdgpu: check ring being ready before using

2017-01-18 Thread Pixel Ding
From: Ding Pixel Return success when the ring is properly initialized, otherwise return failure. Tonga SRIOV VF doesn't have UVD and VCE engines, so the initialization of these IPs is bypassed. The system crashes if an application submits an IB to their rings, which are not ready to use. It could be a comm

[PATCH 1/2] drm/amdgpu: don't clean the framebuffer for VF

2017-02-05 Thread Pixel Ding
The SRIOV host driver cleans the framebuffer for each VF, so the guest driver doesn't need to. The action costs a lot of time on some virtualization platforms and can cause initialization to time out. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 4 +++- 1 file changed, 3 inser

[PATCH 2/2] drm/amdgpu: increase mailbox timeout to 5000ms

2017-02-05 Thread Pixel Ding
When multiple VFs try to enter exclusive mode at the same time, the looping mechanism doesn't ensure that each one gets it, because it only loops over active VFs, so the last one has to wait for a long interval. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h | 2 +- 1

[PATCH 1/2] drm/amdgpu/virt: refine handshake between guest and host by mailbox

2017-02-05 Thread Pixel Ding
s sure the host driver has already received the ACK message and handle it like: A: send MSG-> clear VALID-> B: send ACK-> check VALID Signed-off-by: Ken Xue Acked-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 26 +- 1 fil

[PATCH 2/2] drm/amdgpu/virt: schedule work to handle vm fault for VF

2017-02-05 Thread Pixel Ding
VF uses KIQ to access registers, which invokes fence_wait to wait for the access to complete. When a VM fault occurs, the driver can't sleep in interrupt context. For some test cases, a VM fault is 'legal' and shouldn't cause a driver soft lockup. Signed-off-by: Pixel Ding --- driver

[PATCH] drm/amdgpu/virt: skip VM fault handler for VF

2017-02-06 Thread Pixel Ding
VF uses KIQ to access registers. When VM fault occurs, the driver can't get back the fence of KIQ submission and runs into CPU soft lockup. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/a

[PATCH] drm/amdgpu/virt: skip VM fault handler for VF (v2)

2017-02-06 Thread Pixel Ding
VF uses KIQ to access registers. When VM fault occurs, the driver can't get back the fence of KIQ submission and runs into CPU soft lockup. v2: print IV entry info Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 7 +++ 1 file changed, 7 insertions(+) diff --

[PATCH] drm/amdgpu: clear framebuffer with GPU

2017-02-07 Thread Pixel Ding
The CPU is not efficient at this job. One failure it causes is a handshake timeout on the SRIOV virtual function. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/dr

[PATCH] drm/amdgpu/virt: don't check VALID bit for FLR completion message

2017-02-22 Thread Pixel Ding
hout VALID bit for FLR completion, driver should handle it without checking. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c ind

[PATCH] drm/amdgpu/virt: don't check VALID bit for FLR completion message

2017-02-22 Thread Pixel Ding
hout VALID bit for FLR completion, driver should handle it without checking. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_

[PATCH] drm/amdgpu: always use polling mem to update wptr

2017-12-11 Thread Pixel Ding
Both doorbell and polling mem work on Tonga VF. The SDMA issue happens because the SDMA engine accepts doorbell writes even when it's inactive, which introduces a conflict when the world switch routine updates wptr through polling memory. Use polling mem in the driver too. Signed-off-by: Pixel Ding --- dr

[PATCH] drm/amdgpu: always use polling mem to update wptr v2

2017-12-11 Thread Pixel Ding
path Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 + drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 26 ++ 2 files changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/a

[PATCH 2/2] drm/scheduler: move last_sched fence updating prior to job popping

2018-04-18 Thread Pixel Ding
Make sure the main thread won't update the last_sched fence while the entity is being cleaned up. Fix a race caused by putting the last_sched fence twice. Running vulkaninfo in a tight loop can reproduce this issue, seen as a wild fence pointer. Signed-off-by: Pixel Ding --- drivers/gpu/drm/sche

[PATCH 1/2] drm/scheduler: always put last_sched fence in entity_fini

2018-04-18 Thread Pixel Ding
Fix the potential memleak since scheduler main thread always hold one last_sched fence. Signed-off-by: Pixel Ding --- drivers/gpu/drm/scheduler/gpu_scheduler.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm

[PATCH] drm/scheduler: don't update last scheduled fence in TDR

2018-04-25 Thread Pixel Ding
jobs in mirror list, so we should not update the last sched fences in TDR. Signed-off-by: Pixel Ding --- drivers/gpu/drm/scheduler/gpu_scheduler.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c index

[PATCH] drm/amdgpu: right shift 2 bits for SDMA_GFX_RB_WPTR_POLL_ADDR_LO v2

2017-09-24 Thread Pixel Ding
Both Tonga and Vega register SPECs indicate that this register only uses bits 31:2 of the DW. The SRIOV test case fails immediately without this shift. v2: write to ADDR field Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 9 + drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c

[PATCH] drm/amdgpu: right shift 2 bits for SDMA_GFX_RB_WPTR_POLL_ADDR_LO

2017-09-25 Thread Pixel Ding
Both Tonga and Vega register SPECs indicate that this register only uses bits 31:2 of the DW. The SRIOV test case fails immediately without this shift. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 +- 2 files changed, 2
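
The 31:2 layout mentioned in the two entries above can be illustrated with a small standalone sketch. It is not the driver code: the helper names and the shift macro are hypothetical, and the only fact taken from the patch description is that the register keeps address bits 31:2.

```c
#include <stdint.h>

/* Hypothetical name: the address field starts at bit 2 of the register. */
#define POLL_ADDR_SHIFT 2u

/* Low 32 bits of a 64-bit GPU address. */
static inline uint32_t lower_32(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}

/* Value of the address field: the low dword shifted right by 2,
 * because the register only holds bits 31:2 of the address. */
static inline uint32_t poll_addr_lo_field(uint64_t gpu_addr)
{
    return lower_32(gpu_addr) >> POLL_ADDR_SHIFT;
}

/* Full register value when the field is placed back at bits 31:2.
 * The two low bits are always zero, so the address is dword aligned. */
static inline uint32_t poll_addr_lo_reg(uint64_t gpu_addr)
{
    return poll_addr_lo_field(gpu_addr) << POLL_ADDR_SHIFT;
}
```

Under these assumptions, an address whose low dword is 0x9ABCDEF0 gives a field value of 0x26AF37BC, and placing that field back at bits 31:2 restores 0x9ABCDEF0.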

[PATCH 3/4] drm/amdgpu: report preemption fence via amdgpu_fence_info

2017-10-13 Thread Pixel Ding
From: pding Only report fence for GFX ring. This can help checking MCBP feature. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c

Make staging driver stable for SRIOV VF (1)

2017-10-13 Thread Pixel Ding
This is the first patch series to make latest staging driver stable for SRIOV VF on both Tonga and Vega. Patches are merged from SRIOV branches or reimplemented, including bug fixes and small features requested by SRIOV users. Please help reviewing, Thanks. [PATCH 1/4] drm/amdgpu: always consid

[PATCH 4/4] drm/amdgpu: busywait KIQ register accessing

2017-10-13 Thread Pixel Ding
From: pding Register accessing is performed when IRQ is disabled. Never sleep in this function. Known issue: dead sleep in many use cases of index/data registers. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++---

[PATCH 1/4] drm/amdgpu: always consider virualised device for checking post

2017-10-13 Thread Pixel Ding
From: pding The post checking on scratch registers isn't reliable for virtual function. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/

[PATCH 2/4] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr

2017-10-13 Thread Pixel Ding
From: pding The polling memory was standalone in VRAM before, so the HDP flush introduced latency that hides a VM fault issue. Now polling memory leverages the WB in system memory and HDP flush is not required, the VM fault at same page happens. Add delay back to workaround until the root cause

[PATCH 2/3] drm/amdgpu: report more amdgpu_fence_info v2

2017-10-16 Thread Pixel Ding
From: pding Only for GFX ring. This can help checking MCBP feature. v2: report more fence offs. The fence at the end of the frame will indicate the completion status. If the frame completed normally, the fence is written to the address given in the EVENT_WRITE_EOP packet. If preemption occurred

Make staging driver stable for SRIOV VF (1 v2)

2017-10-16 Thread Pixel Ding
This is the first patch series to make latest staging driver stable for SRIOV VF on both Tonga and Vega. Patches are merged from SRIOV branches or reimplemented, including bug fixes and small features requested by SRIOV users. v2: "drm/amdgpu: workaround for VM fault caused by SDMA" is dropped.

[PATCH 1/3] drm/amdgpu: workaround for VM fault caused by SDMA set_wptr

2017-10-16 Thread Pixel Ding
From: pding The polling memory was standalone in VRAM before, so the HDP flush introduced latency that hides a VM fault issue. Now polling memory leverages the WB in system memory and HDP flush is not required, the VM fault at same page happens. Add delay back to workaround until the root cause

[PATCH 3/3] drm/amdgpu: busywait KIQ register accessing v2

2017-10-16 Thread Pixel Ding
From: pding Register accessing is performed when IRQ is disabled. Never sleep in this function. Known issue: dead sleep in many use cases of index/data registers. v2: wrap polling fence functions. don't trigger IRQ for polling in case of wrongly fence signal. Signed-off-by: pding --- drivers

[PATCH 1/3] drm/amdgpu: always consider virualised device for checking post

2017-10-16 Thread Pixel Ding
From: pding The post checking on scratch registers isn't reliable for virtual function. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/

[PATCH 3/3] drm/amdgpu: busywait KIQ register accessing v3

2017-10-17 Thread Pixel Ding
From: pding Register accessing is performed when IRQ is disabled. Never sleep in this function. Known issue: dead sleep in many use cases of index/data registers. v2: wrap polling fence functions. don't trigger IRQ for polling in case of wrongly fence signal. v3: handle wrap round gracefully.
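
The v3 note about handling wrap-around can be sketched as a bounded busy-wait on a 32-bit sequence number. This is a userspace illustration under stated assumptions, not the patch itself: `seq_reached` and `poll_fence` are hypothetical names, and the spin limit stands in for whatever timeout the driver actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "current >= target" for 32-bit sequence numbers.
 * The unsigned subtraction wraps modulo 2^32; reading the result as
 * signed (two's complement assumed) is correct while the two values
 * are less than 2^31 apart. */
static inline bool seq_reached(uint32_t current, uint32_t target)
{
    return (int32_t)(current - target) >= 0;
}

/* Poll a fence location without sleeping. Returns 0 once the fence
 * has reached target, or -1 after max_spins unsuccessful reads. */
static int poll_fence(const volatile uint32_t *fence, uint32_t target,
                      unsigned int max_spins)
{
    for (unsigned int i = 0; i < max_spins; i++) {
        if (seq_reached(*fence, target))
            return 0;
    }
    return -1;
}
```

A plain `*fence >= target` comparison would fail right after the counter wraps from 0xFFFFFFFF to 0, which is the case the signed-difference form handles.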

[PATCH] drm/amdgpu: drm/amdgpu: always consider virualised device for checking post (v2)

2017-10-18 Thread Pixel Ding
From: pding The post checking on scratch registers isn't reliable for virtual function. v2: only change in IGP reading bios. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

[PATCH] drm/amdgpu: always consider virualised device for checking post (v3)

2017-10-18 Thread Pixel Ding
From: pding v2: - only change in IGP reading bios. v3: - merge functions and apply on all bios checking. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 41 +- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/am

Init timing changes for SRIOV VF

2017-10-23 Thread Pixel Ding
This is the second patch series merged or reimplemented from SRIOV branch. It changes how much time init consumes. Exclusive mode means that a VF occupies hardware and other VFs need to wait until this VF releases exclusive mode. The timing of exclusive mode is limited to avoid starvation causing unav

[PATCH 2/7] drm/amdgpu: add init_log param to control logs in exclusive mode

2017-10-23 Thread Pixel Ding
From: pding When this VF stays in exclusive mode for long, other VFs will be impacted. The redundant messages cause an exclusive mode timeout when they're redirected. Redirecting the guest log to a virtual serial port is a normal use case for cloud services. Introduce init_log param to control logs

[PATCH 1/7] drm/amdgpu: release VF exclusive accessing after hw_init

2017-10-23 Thread Pixel Ding
The subsequent operations don't need exclusive accessing hardware. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 3 --- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/a

[PATCH 4/7] drm/amdgpu/virt: add function to check MMIO accessing

2017-10-23 Thread Pixel Ding
From: pding MMIO space can be blocked on a virtualised device. Add this function to check if MMIO is blocked or not. Todo: need a reliable method such as communication with the hypervisor. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu

[PATCH 6/7] drm/amdgpu/virt: add wait_reset virt ops

2017-10-23 Thread Pixel Ding
From: pding Driver can use this interface to check if there's a function level reset done in hypervisor. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 2 ++ drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c| 1 + d

[PATCH 5/7] drm/amdgpu: don't disable MSI for GPU virtual function

2017-10-23 Thread Pixel Ding
From: pding After calling pci_disable_msi() and pci_enable_msi(), VF can't receive interrupt anymore. This may introduce problems in module reloading or retrying init. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) d

[PATCH 3/7] drm/amdgpu: avoid soft lockup when waiting for RLC serdes

2017-10-23 Thread Pixel Ding
From: pding Normally all waiting get timeout if there's one. Release the lock and return immediately when timeout happens. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++ drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 6 ++ 2 files changed, 12 insertions(+) diff --git

[PATCH 7/7] drm/amdgpu: retry init if it fails due to exclusive mode timeout

2017-10-23 Thread Pixel Ding
From: pding In practice, exclusive mode has a real-time limit, such as having to finish within 300ms. This is easily observed when running many VFs/VMs on a single host with a heavy CPU workload. If init fails due to an exclusive mode timeout, try it again. Signed-off-by: pding --- drivers/gpu/drm/

[PATCH 1/7] drm/amdgpu: change redundant init logs to debug level

2017-10-24 Thread Pixel Ding
From: pding This is v2 of the init log change. The init_log param and the SRIOV-specific macro are removed, so I renamed the patch. Exclusive mode consumes 230ms with this patch and log redirection, which is acceptable. Please review. --- When this VF stays in exclusive mode for long, other VFs will be impa

[PATCH 2/7] drm/amdgpu: avoid soft lockup when waiting for RLC serdes (v2)

2017-10-24 Thread Pixel Ding
From: pding Normally all waiting get timeout if there's one. Release the lock and return immediately when timeout happens. v2: - set the se_sh to broadcast before return Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 8 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 8 +++

[PATCH 5/7] drm/amdgpu/virt: add wait_reset virt ops

2017-10-24 Thread Pixel Ding
From: pding Hi Alex, Split the wait_reset patch to 2. Part 1. please review. --- Driver can use this interface to check if there's a function level reset done in hypervisor. It's helpful when IRQ handler for reset is not ready, or special handling is required. Signed-off-by: pding --- drive

[PATCH 6/7] drm/amdgpu/virt: implement wait_reset callbacks for vi/ai

2017-10-24 Thread Pixel Ding
From: pding Hi Alex, Split the wait_reset patch to 2. Part 2. please review. --- Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 1 + drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 6 ++ 2 files changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/dr

[PATCH 7/7] drm/amdgpu: retry init if it fails due to exclusive mode timeout (v2)

2017-10-24 Thread Pixel Ding
From: pding In practice, exclusive mode has a real-time limit, such as having to finish within 300ms. This is easily observed when running many VFs/VMs on a single host with a heavy CPU workload. If init fails due to an exclusive mode timeout, try it again. v2: - rewrite the condition for readable v

[PATCH] drm/amdgpu: retry init if it fails due to exclusive mode timeout (v3)

2017-10-25 Thread Pixel Ding
From: pding In practice, exclusive mode has a real-time limit, such as having to finish within 300ms. This is easily observed when running many VFs/VMs on a single host with a heavy CPU workload. If init fails due to an exclusive mode timeout, try it again. v2: - rewrite the condition for readable v
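
The retry idea in this entry can be sketched as a bounded loop that repeats init only for one specific, retryable error. Everything below is an assumption made for illustration: the use of `-EAGAIN` as the "exclusive mode timed out" code, the `init_with_retry` helper, and the `flaky_init` stand-in are not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Init callback type; returns 0 on success or a negative errno. */
typedef int (*init_fn)(void *ctx);

/* Call init up to max_attempts times. Only -EAGAIN (assumed here to
 * mean "exclusive mode timed out") triggers another attempt; any other
 * result, success or failure, is returned immediately. */
static int init_with_retry(init_fn init, void *ctx, int max_attempts)
{
    int r = -EINVAL; /* returned if max_attempts <= 0 */

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        r = init(ctx);
        if (r != -EAGAIN)
            break;
    }
    return r;
}

/* Demo stand-in for a device init: ctx points to an int holding the
 * number of timeouts to simulate before succeeding. */
static int flaky_init(void *ctx)
{
    int *remaining_timeouts = ctx;

    if (*remaining_timeouts > 0) {
        (*remaining_timeouts)--;
        return -EAGAIN;
    }
    return 0;
}
```

Bounding the attempts keeps a persistently starved VF from looping forever, and retrying only on the dedicated error code avoids masking unrelated init failures.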

[PATCH] drm/amdgpu: release exclusive mode after hw_init if no kfd

2017-10-30 Thread Pixel Ding
From: pding Move kfd probe prior to device init. Release exclusive mode after hw_init if kfd is not enabled. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 5 +++-- 2 files changed, 6 insertions(+), 2 deletions(-) diff

[PATCH] drm/amdgpu: release exclusive mode after hw_init if no kfd (v2)

2017-10-30 Thread Pixel Ding
From: pding Move kfd probe prior to device init. Release exclusive mode after hw_init if kfd is not enabled. v2: - pass pdev param Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 ++- drivers/gpu/drm/amd/amdgpu/amd

[PATCH 1/2] drm/amdgpu: return error when sriov access requests get timeout

2017-10-31 Thread Pixel Ding
From: pding Reported-by: Sun Gary Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c index 818ec0f..f291fb2 100644 --- a/drivers/gpu/dr

[PATCH 2/2] drm/amdgpu: retry init if exclusive mode request is failed

2017-10-31 Thread Pixel Ding
From: pding This is caused by the hypervisor failing to handle the request; one known issue is an MMIO unblocking timeout. In theory we can retry init here. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/g

[PATCH 1/2] drm/amdgpu: return error when sriov access requests get timeout (v2)

2017-10-31 Thread Pixel Ding
From: pding v2: - readable Reported-by: Sun Gary Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c index 818ec0f..2b435c0 1006

release exclusive mode after hw init if no kfd

2017-10-31 Thread Pixel Ding
Hi Oded, Please review. [PATCH 1/2] drm/amdkfd: initialise kgd field inside kfd device_init As you suggested, move kgd assignment to device_init [PATCH 2/2] drm/amdgpu: release exclusive mode after hw_init if no We still need this change because pdev is passed in. ___

[PATCH 2/2] drm/amdgpu: release exclusive mode after hw_init if no kfd (v2)

2017-10-31 Thread Pixel Ding
From: pding Move kfd probe prior to device init. Release exclusive mode after hw_init if kfd is not enabled. v2: - pass pdev param Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 3 ++- drivers/gpu/drm/amd/amdgpu/amd

[PATCH 1/2] drm/amdkfd: initialise kgd field inside kfd device_init

2017-10-31 Thread Pixel Ding
From: pding kgd field is dependent on kgd device_init. Move the assignment to kfd device_init. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6

release ex mode after hw_init if no KIQ

2017-11-01 Thread Pixel Ding
Hi Oded, There are 3 patches for releasing exclusive mode after hw_init if KIQ is not enabled. [PATCH 1/3] drm/amdgpu: wrap allocation for amdgpu_device Allocation of amdgpu_device and its base fields is wrapped and moved ahead. [PATCH 2/3] drm/amdgpu: release exclusive mode after hw_init if no [PATCH

[PATCH 3/3] drm/amdkfd: pass kgd to kfd inside device_init

2017-11-01 Thread Pixel Ding
From: pding KGD is possibly not fully initialised in the probe phase, so it's not safe to pass it in if kfd code tries to refer to KGD here. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 drivers/gpu/d

[PATCH 1/3] drm/amdgpu: wrap allocation for amdgpu_device

2017-11-01 Thread Pixel Ding
From: pding Add amdgpu_device_alloc() which was part of previous amdgpu_device_init(). Then it's flexible to handle init sequence since kfd has dependency to amdgpu_device base fields. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 4 +-- drivers/gpu/drm/amd/amdgpu/amdg

[PATCH 2/3] drm/amdgpu: release exclusive mode after hw_init if no kfd

2017-11-01 Thread Pixel Ding
From: pding KFD device init requires exclusive mode. Driver can release exclusive mode after hw_init if KFD is not enabled. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 5 +++-- 2 files changed, 6 insertions(+), 2 del

[PATCH 1/2] drm/amdkfd: initialise kfd inside amdgpu_device_init

2017-11-06 Thread Pixel Ding
From: pding Also finalize kfd inside amdgpu_device_fini. kfd device_init needs SRIOV exclusive accessing. Try to gather exclusive accessing to reduce time consuming. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 5 ---

release ex mode after hw_init

2017-11-06 Thread Pixel Ding
Hi Felix, Please review. [PATCH 1/2] drm/amdkfd: initialise kfd inside amdgpu_device_init As you suggested, move kfd init/fini inside amdgpu_device_init. Other changes for KFD interfaces are dropped. [PATCH 2/2] drm/amdgpu: release exclusive mode after hw_init __

[PATCH 2/2] drm/amdgpu: release exclusive mode after hw_init

2017-11-06 Thread Pixel Ding
From: pding Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 3 --- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c i

[PATCH] drm/amdgpu: bypass FB resizing for SRIOV VF

2017-11-06 Thread Pixel Ding
From: pding It introduces 900ms latency in exclusive mode, which causes driver loading to fail. The host can resize the BAR before the guest starts, so the resizing is not necessary here. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 1 file changed, 4 insertions(+)

[PATCH 2/2] drm/amdgpu: use irq-safe lock for adev->ring_lru_list_lock

2017-11-06 Thread Pixel Ding
From: pding This lock is used during register accessing in SRIOV guest since KIQ uses general ring submission (amdgpu_ring_commit). The register accessing could happen both in irq enabled and irq disabled cases. Always use irq-safe lock. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdg

[PATCH 1/2] drm/amdgpu: use irq-safe lock for kiq->ring_lock

2017-11-06 Thread Pixel Ding
From: pding This lock is used during register accessing in SRIOV guest. The register accessing could happen both in irq enabled and irq disabled cases. Always use irq-safe lock. Signed-off-by: pding --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 10 ++ 1 file changed, 6 insertions(+),

[PATCH] drm/amdgpu: bypass lru touch for KIQ ring submission

2017-11-07 Thread Pixel Ding
KIQ ring submission is used for register accessing on SRIOV VF, which could happen in both irq enabled and irq disabled cases. Lock inversion could happen on adev->ring_lru_list_lock, while this operation is useless and just adds overhead in this use case. Signed-off-by: Pixel Ding --- driv

[PATCH] drm/amdgpu: revise retry init to fully cleanup driver

2017-11-07 Thread Pixel Ding
Retry at drm_dev_register instead of amdgpu_device_init. Signed-off-by: Pixel Ding --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 11 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 15 ++- 3 files changed, 13 insertions