RE: [PATCH] drm/amdgpu: added xgmi ras error reset sequence

2020-03-24 Thread Zhang, Hawking
[AMD Official Use Only - Internal Distribution Only]

+ case CHIP_VEGA20:
+ default:

I'd suggest doing nothing for the default case. Other than that the patch is

Reviewed-by: Hawking Zhang 

Regards,
Hawking
From: Clements, John 
Sent: Wednesday, March 25, 2020 14:50
To: amd-gfx@lists.freedesktop.org; Zhang, Hawking 
Subject: [PATCH] drm/amdgpu: added xgmi ras error reset sequence


[AMD Official Use Only - Internal Distribution Only]

Submitting patch to clear xgmi ras error counters in between ras error queries
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: disable ras query and injection during gpu reset

2020-03-24 Thread Clements, John
[AMD Official Use Only - Internal Distribution Only]

Submitting patch to disable ras debugfs features during the entire GPU reset cycle


0001-drm-amdgpu-disable-ras-query-during-gpu-reset.patch
Description: 0001-drm-amdgpu-disable-ras-query-during-gpu-reset.patch


[PATCH] drm/amdgpu: added xgmi ras error reset sequence

2020-03-24 Thread Clements, John
[AMD Official Use Only - Internal Distribution Only]

Submitting patch to clear xgmi ras error counters in between ras error queries


0001-drm-amdgpu-added-xgmi-ras-error-reset-sequence.patch
Description: 0001-drm-amdgpu-added-xgmi-ras-error-reset-sequence.patch


[PATCH] drm/amdgpu: Check entity rq

2020-03-24 Thread xinhui pan
Hit a panic during GPU recovery test. drm_sched_entity_select_rq might
set rq to NULL, so add a check like drm_sched_job_init does.

Cc: Christian König 
Cc: Alex Deucher 
Cc: Felix Kuehling 
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
index cf96c335b258..d30d103e48a2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c
@@ -95,6 +95,8 @@ static int amdgpu_vm_sdma_commit(struct 
amdgpu_vm_update_params *p,
int r;
 
entity = p->direct ? &p->vm->direct : &p->vm->delayed;
+   if (!entity->rq)
+   return -ENOENT;
ring = container_of(entity->rq->sched, struct amdgpu_ring, sched);
 
WARN_ON(ib->length_dw == 0);
-- 
2.17.1



[PATCH 5/7] drm/amdgpu: equip new req_init_data handshake

2020-03-24 Thread Monk Liu
With this new handshake the host side can prepare the vbios/ip-discovery
and pf&vf exchange data upon receiving this request, without
stopping the world switch.

This way the world switch is less impacted by the VF's exclusive mode
request.

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++
 drivers/gpu/drm/amd/amdgpu/nv.c| 15 +--
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ca609b6..273706b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1800,6 +1800,21 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
amdgpu_amdkfd_device_probe(adev);
 
if (amdgpu_sriov_vf(adev)) {
+   /* handle vbios stuff prior to full access mode for the new handshake */
+   if (adev->virt.req_init_data_ver == 1) {
+   if (!amdgpu_get_bios(adev)) {
+   DRM_ERROR("failed to get vbios\n");
+   return -EINVAL;
+   }
+
+   r = amdgpu_atombios_init(adev);
+   if (r) {
+   dev_err(adev->dev, "amdgpu_atombios_init 
failed\n");
+   amdgpu_vf_error_put(adev, 
AMDGIM_ERROR_VF_ATOMBIOS_INIT_FAIL, 0, 0);
+   return r;
+   }
+   }
+
r = amdgpu_virt_request_full_gpu(adev, true);
if (r)
return -EAGAIN;
@@ -1832,6 +1847,10 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
}
/* get the vbios after the asic_funcs are set up */
if (adev->ip_blocks[i].version->type == 
AMD_IP_BLOCK_TYPE_COMMON) {
+   /* skip vbios handling for new handshake */
+   if (amdgpu_sriov_vf(adev) && 
adev->virt.req_init_data_ver == 1)
+   continue;
+
/* Read BIOS */
if (!amdgpu_get_bios(adev))
return -EINVAL;
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index a67d78d..7768880 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -457,16 +457,19 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
 {
int r;
 
-   /* Set IP register base before any HW register access */
-   r = nv_reg_base_init(adev);
-   if (r)
-   return r;
-
adev->nbio.funcs = &nbio_v2_3_funcs;
adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
 
-   if (amdgpu_sriov_vf(adev))
+   if (amdgpu_sriov_vf(adev)) {
adev->virt.ops = &xgpu_nv_virt_ops;
+   /* try send GPU_INIT_DATA request to host */
+   amdgpu_virt_request_init_data(adev);
+   }
+
+   /* Set IP register base before any HW register access */
+   r = nv_reg_base_init(adev);
+   if (r)
+   return r;
 
switch (adev->asic_type) {
case CHIP_NAVI10:
-- 
2.7.4



[PATCH 1/7] drm/amdgpu: cleanup idh event/req for NV headers

2020-03-24 Thread Monk Liu
1) drop the AI headers from mxgpu_nv.c; it should refer to mxgpu_nv.h

2) IDH_EVENT_MAX is not used and not aligned with the host side,
   so drop it
3) IDH_TEXT_MESSAGE was provided by the host but not defined in the guest

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h |  3 ++-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c |  1 -
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 22 ++
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h |  3 ++-
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
index 37dbe0f..52a6975 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
@@ -46,7 +46,8 @@ enum idh_event {
IDH_SUCCESS,
IDH_FAIL,
IDH_QUERY_ALIVE,
-   IDH_EVENT_MAX
+
+   IDH_TEXT_MESSAGE = 255,
 };
 
 extern const struct amdgpu_virt_ops xgpu_ai_virt_ops;
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 237fa5e..d9ce12c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -30,7 +30,6 @@
 #include "navi10_ih.h"
 #include "soc15_common.h"
 #include "mxgpu_nv.h"
-#include "mxgpu_ai.h"
 
 static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
index 99b15f6..c80bbc7 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
@@ -28,6 +28,28 @@
 #define NV_MAILBOX_POLL_MSG_TIMEDOUT   12000
 #define NV_MAILBOX_POLL_FLR_TIMEDOUT   500
 
+enum idh_request {
+   IDH_REQ_GPU_INIT_ACCESS = 1,
+   IDH_REL_GPU_INIT_ACCESS,
+   IDH_REQ_GPU_FINI_ACCESS,
+   IDH_REL_GPU_FINI_ACCESS,
+   IDH_REQ_GPU_RESET_ACCESS,
+
+   IDH_LOG_VF_ERROR   = 200,
+};
+
+enum idh_event {
+   IDH_CLR_MSG_BUF = 0,
+   IDH_READY_TO_ACCESS_GPU,
+   IDH_FLR_NOTIFICATION,
+   IDH_FLR_NOTIFICATION_CMPL,
+   IDH_SUCCESS,
+   IDH_FAIL,
+   IDH_QUERY_ALIVE,
+
+   IDH_TEXT_MESSAGE = 255,
+};
+
 extern const struct amdgpu_virt_ops xgpu_nv_virt_ops;
 
 void xgpu_nv_mailbox_set_irq_funcs(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
index f13dc6c..713ee66 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
@@ -43,7 +43,8 @@ enum idh_event {
IDH_READY_TO_ACCESS_GPU,
IDH_FLR_NOTIFICATION,
IDH_FLR_NOTIFICATION_CMPL,
-   IDH_EVENT_MAX
+
+   IDH_TEXT_MESSAGE = 255
 };
 
 extern const struct amdgpu_virt_ops xgpu_vi_virt_ops;
-- 
2.7.4



[PATCH 6/7] drm/amdgpu: adjust sequence of ip_discovery init and timeout_setting

2020-03-24 Thread Monk Liu
what:
1) move the timeout setting before ip_early_init to reduce the exclusive
mode cost for SRIOV

2) move ip_discovery_init() inside amdgpu_discovery_reg_base_init();
this is a preparation for the later upcoming patches.

why:
in later upcoming patches we will use a new mailbox event,
"req_gpu_init_data", which is a callback hooked into adev->virt.ops.
This callback sends a new "REQ_GPU_INIT_DATA" event to the host to
notify it to do some preparation, like "IP discovery/vbios on the VF FB",
and this callback must be:

A) invoked after set_ip_blocks() because virt.ops is configured during
set_ip_blocks()

B) invoked before ip_discovery_init() because ip_discovery_init()
needs the host side to prepare everything in the VF FB first.

the current place of ip_discovery_init() is before we can invoke any
callback of adev->virt.ops, thus we must move ip_discovery_init() to a
place after adev->virt.ops is all settled, and the perfect place is in
amdgpu_discovery_reg_base_init()

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 16 
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 10 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h |  1 -
 3 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 273706b..724ad84 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3079,12 +3079,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* detect hw virtualization here */
amdgpu_detect_virtualization(adev);
 
-   if (amdgpu_discovery && adev->asic_type >= CHIP_NAVI10) {
-   r = amdgpu_discovery_init(adev);
-   if (r) {
-   dev_err(adev->dev, "amdgpu_discovery_init failed\n");
-   return r;
-   }
+   r = amdgpu_device_get_job_timeout_settings(adev);
+   if (r) {
+   dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
+   return r;
}
 
/* early init functions */
@@ -3092,12 +3090,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (r)
return r;
 
-   r = amdgpu_device_get_job_timeout_settings(adev);
-   if (r) {
-   dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
-   return r;
-   }
-
/* doorbell bar mapping and doorbell index init*/
amdgpu_device_doorbell_init(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 37e1fcf..43bb22a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -156,7 +156,7 @@ static inline bool amdgpu_discovery_verify_checksum(uint8_t 
*data, uint32_t size
return !!(amdgpu_discovery_calculate_checksum(data, size) == expected);
 }
 
-int amdgpu_discovery_init(struct amdgpu_device *adev)
+static int amdgpu_discovery_init(struct amdgpu_device *adev)
 {
struct table_info *info;
struct binary_header *bhdr;
@@ -255,10 +255,12 @@ int amdgpu_discovery_reg_base_init(struct amdgpu_device 
*adev)
uint8_t num_base_address;
int hw_ip;
int i, j, k;
+   int r;
 
-   if (!adev->discovery) {
-   DRM_ERROR("ip discovery uninitialized\n");
-   return -EINVAL;
+   r = amdgpu_discovery_init(adev);
+   if (r) {
+   DRM_ERROR("amdgpu_discovery_init failed\n");
+   return r;
}
 
bhdr = (struct binary_header *)adev->discovery;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
index ba78e15..d50d597 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h
@@ -26,7 +26,6 @@
 
 #define DISCOVERY_TMR_SIZE  (64 << 10)
 
-int amdgpu_discovery_init(struct amdgpu_device *adev);
 void amdgpu_discovery_fini(struct amdgpu_device *adev);
 int amdgpu_discovery_reg_base_init(struct amdgpu_device *adev);
 int amdgpu_discovery_get_ip_version(struct amdgpu_device *adev, int hw_id,
-- 
2.7.4



[PATCH 2/7] drm/amdgpu: introduce new idh_request/event enum

2020-03-24 Thread Monk Liu
new idh_request and idh_event entries to prepare for the
new handshake protocol implementation later

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
index c80bbc7..598ed2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
@@ -34,6 +34,7 @@ enum idh_request {
IDH_REQ_GPU_FINI_ACCESS,
IDH_REL_GPU_FINI_ACCESS,
IDH_REQ_GPU_RESET_ACCESS,
+   IDH_REQ_GPU_INIT_DATA,
 
IDH_LOG_VF_ERROR   = 200,
 };
@@ -46,6 +47,7 @@ enum idh_event {
IDH_SUCCESS,
IDH_FAIL,
IDH_QUERY_ALIVE,
+   IDH_REQ_GPU_INIT_DATA_READY,
 
IDH_TEXT_MESSAGE = 255,
 };
-- 
2.7.4



[PATCH 3/7] drm/amdgpu: introduce new request and its function

2020-03-24 Thread Monk Liu
1) modify xgpu_nv_send_access_requests to support the
new idh request

2) introduce a new function, req_gpu_init_data(), which
is used to notify the host to prepare the vbios/ip-discovery/pf&vf exchange

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 13 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  3 ++
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c| 48 ++--
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h|  2 +-
 4 files changed, 57 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 43a1ee3..135a16c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -152,6 +152,19 @@ int amdgpu_virt_reset_gpu(struct amdgpu_device *adev)
return 0;
 }
 
+void amdgpu_virt_request_init_data(struct amdgpu_device *adev)
+{
+   struct amdgpu_virt *virt = &adev->virt;
+
+   if (virt->ops && virt->ops->req_init_data)
+   virt->ops->req_init_data(adev);
+
+   if (adev->virt.req_init_data_ver > 0)
+   DRM_INFO("host supports REQ_INIT_DATA handshake\n");
+   else
+   DRM_WARN("host doesn't support REQ_INIT_DATA handshake\n");
+}
+
 /**
  * amdgpu_virt_wait_reset() - wait for reset gpu completed
  * @amdgpu:amdgpu device.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 74f9843..f6ae3c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -59,6 +59,7 @@ struct amdgpu_vf_error_buffer {
 struct amdgpu_virt_ops {
int (*req_full_gpu)(struct amdgpu_device *adev, bool init);
int (*rel_full_gpu)(struct amdgpu_device *adev, bool init);
+   int (*req_init_data)(struct amdgpu_device *adev);
int (*reset_gpu)(struct amdgpu_device *adev);
int (*wait_reset)(struct amdgpu_device *adev);
void (*trans_msg)(struct amdgpu_device *adev, u32 req, u32 data1, u32 
data2, u32 data3);
@@ -263,6 +264,7 @@ struct amdgpu_virt {
struct amdgpu_virt_fw_reserve   fw_reserve;
uint32_t gim_feature;
uint32_t reg_access_mode;
+   int req_init_data_ver;
 };
 
 #define amdgpu_sriov_enabled(adev) \
@@ -303,6 +305,7 @@ void amdgpu_virt_kiq_reg_write_reg_wait(struct 
amdgpu_device *adev,
 int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init);
 int amdgpu_virt_release_full_gpu(struct amdgpu_device *adev, bool init);
 int amdgpu_virt_reset_gpu(struct amdgpu_device *adev);
+void amdgpu_virt_request_init_data(struct amdgpu_device *adev);
 int amdgpu_virt_wait_reset(struct amdgpu_device *adev);
 int amdgpu_virt_alloc_mm_table(struct amdgpu_device *adev);
 void amdgpu_virt_free_mm_table(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index d9ce12c..6b9e390 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -109,7 +109,6 @@ static int xgpu_nv_poll_msg(struct amdgpu_device *adev, 
enum idh_event event)
timeout -= 10;
} while (timeout > 1);
 
-   pr_err("Doesn't get msg:%d from pf, error=%d\n", event, r);
 
return -ETIME;
 }
@@ -163,18 +162,45 @@ static int xgpu_nv_send_access_requests(struct 
amdgpu_device *adev,
enum idh_request req)
 {
int r;
+   enum idh_event event = -1;
 
xgpu_nv_mailbox_trans_msg(adev, req, 0, 0, 0);
 
-   /* start to check msg if request is idh_req_gpu_init_access */
-   if (req == IDH_REQ_GPU_INIT_ACCESS ||
-   req == IDH_REQ_GPU_FINI_ACCESS ||
-   req == IDH_REQ_GPU_RESET_ACCESS) {
-   r = xgpu_nv_poll_msg(adev, IDH_READY_TO_ACCESS_GPU);
+   switch (req) {
+   case IDH_REQ_GPU_INIT_ACCESS:
+   case IDH_REQ_GPU_FINI_ACCESS:
+   case IDH_REQ_GPU_RESET_ACCESS:
+   event = IDH_READY_TO_ACCESS_GPU;
+   break;
+   case IDH_REQ_GPU_INIT_DATA:
+   event = IDH_REQ_GPU_INIT_DATA_READY;
+   break;
+   default:
+   break;
+   }
+
+   if (event != -1) {
+   r = xgpu_nv_poll_msg(adev, event);
if (r) {
-   pr_err("Doesn't get READY_TO_ACCESS_GPU from pf, give 
up\n");
-   return r;
+   if (req != IDH_REQ_GPU_INIT_DATA) {
+   pr_err("Doesn't get msg:%d from pf, 
error=%d\n", event, r);
+   return r;
+   }
+   else /* host doesn't support REQ_GPU_INIT_DATA 
handshake */
+   adev->virt.req_init_data_ver = 0;
+   } else {
+   if (req == IDH_REQ_GPU_INIT_DATA)
+   {
+   adev->virt.req_init_data_ver =
+  

[PATCH 7/7] drm/amdgpu: postpone entering fullaccess mode

2020-03-24 Thread Monk Liu
if the host supports the new handshake we only need to enter
full-access mode in the ip_init() part; otherwise we need
to do it before reading the vbios (because under the legacy
handshake the host prepares the vbios for the VF only after
receiving the REQ_GPU_INIT event)

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 724ad84..b61161a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1814,10 +1814,14 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
return r;
}
}
+   }
 
+   /* we need to send REQ_GPU here for the legacy handshake, otherwise
+    * the vbios will not be prepared by the host for this VF */
+   if (amdgpu_sriov_vf(adev) && adev->virt.req_init_data_ver < 1) {
r = amdgpu_virt_request_full_gpu(adev, true);
if (r)
-   return -EAGAIN;
+   return r;
}
 
adev->pm.pp_feature = amdgpu_pp_feature_mask;
@@ -1977,6 +1981,12 @@ static int amdgpu_device_ip_init(struct amdgpu_device 
*adev)
if (r)
return r;
 
+   if (amdgpu_sriov_vf(adev) && adev->virt.req_init_data_ver > 0) {
+   r = amdgpu_virt_request_full_gpu(adev, true);
+   if (r)
+   return -EAGAIN;
+   }
+
for (i = 0; i < adev->num_ip_blocks; i++) {
if (!adev->ip_blocks[i].status.valid)
continue;
-- 
2.7.4



[PATCH 4/7] drm/amdgpu: use static mmio offset for NV mailbox

2020-03-24 Thread Monk Liu
what:
with the new "req_init_data" handshake we need to use the mailbox
before doing IP discovery, so in the mxgpu_nv.c file the original
SOC15_REG method won't work because it depends on IP discovery
completing first.

how:
the solution is to always use static MMIO offsets for the NV+ mailbox
registers.
The HW team confirmed to us that all MAILBOX registers will be at the same
offset for all ASICs, so no IP discovery is needed for those registers.

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 52 +++
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 18 ++--
 2 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 6b9e390..ce2bf1f 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -52,8 +52,7 @@ static void xgpu_nv_mailbox_set_valid(struct amdgpu_device 
*adev, bool val)
  */
 static enum idh_event xgpu_nv_mailbox_peek_msg(struct amdgpu_device *adev)
 {
-   return RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-   mmBIF_BX_PF_MAILBOX_MSGBUF_RCV_DW0));
+   return RREG32_NO_KIQ(mmMAILBOX_MSGBUF_RCV_DW0);
 }
 
 
@@ -62,8 +61,7 @@ static int xgpu_nv_mailbox_rcv_msg(struct amdgpu_device *adev,
 {
u32 reg;
 
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-
mmBIF_BX_PF_MAILBOX_MSGBUF_RCV_DW0));
+   reg = RREG32_NO_KIQ(mmMAILBOX_MSGBUF_RCV_DW0);
if (reg != event)
return -ENOENT;
 
@@ -116,7 +114,6 @@ static int xgpu_nv_poll_msg(struct amdgpu_device *adev, 
enum idh_event event)
 static void xgpu_nv_mailbox_trans_msg (struct amdgpu_device *adev,
  enum idh_request req, u32 data1, u32 data2, u32 data3)
 {
-   u32 reg;
int r;
uint8_t trn;
 
@@ -135,19 +132,10 @@ static void xgpu_nv_mailbox_trans_msg (struct 
amdgpu_device *adev,
}
} while (trn);
 
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-
mmBIF_BX_PF_MAILBOX_MSGBUF_TRN_DW0));
-   reg = REG_SET_FIELD(reg, BIF_BX_PF_MAILBOX_MSGBUF_TRN_DW0,
-   MSGBUF_DATA, req);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF_MAILBOX_MSGBUF_TRN_DW0),
- reg);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF_MAILBOX_MSGBUF_TRN_DW1),
-   data1);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF_MAILBOX_MSGBUF_TRN_DW2),
-   data2);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF_MAILBOX_MSGBUF_TRN_DW3),
-   data3);
-
+   WREG32_NO_KIQ(mmMAILBOX_MSGBUF_TRN_DW0, req);
+   WREG32_NO_KIQ(mmMAILBOX_MSGBUF_TRN_DW1, data1);
+   WREG32_NO_KIQ(mmMAILBOX_MSGBUF_TRN_DW2, data2);
+   WREG32_NO_KIQ(mmMAILBOX_MSGBUF_TRN_DW3, data3);
xgpu_nv_mailbox_set_valid(adev, true);
 
/* start to poll ack */
@@ -192,8 +180,7 @@ static int xgpu_nv_send_access_requests(struct 
amdgpu_device *adev,
if (req == IDH_REQ_GPU_INIT_DATA)
{
adev->virt.req_init_data_ver =
-   RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-   
mmBIF_BX_PF_MAILBOX_MSGBUF_RCV_DW1));
+   RREG32_NO_KIQ(mmMAILBOX_MSGBUF_RCV_DW1);
 
/* assume V1 in case host doesn't set version 
number */
if (adev->virt.req_init_data_ver < 1)
@@ -204,8 +191,7 @@ static int xgpu_nv_send_access_requests(struct 
amdgpu_device *adev,
/* Retrieve checksum from mailbox2 */
if (req == IDH_REQ_GPU_INIT_ACCESS || req == 
IDH_REQ_GPU_RESET_ACCESS) {
adev->virt.fw_reserve.checksum_key =
-   RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-   mmBIF_BX_PF_MAILBOX_MSGBUF_RCV_DW2));
+   RREG32_NO_KIQ(mmMAILBOX_MSGBUF_RCV_DW2);
}
}
 
@@ -256,11 +242,14 @@ static int xgpu_nv_set_mailbox_ack_irq(struct 
amdgpu_device *adev,
unsigned type,
enum amdgpu_interrupt_state state)
 {
-   u32 tmp = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF_MAILBOX_INT_CNTL));
+   u32 tmp = RREG32_NO_KIQ(mmMAILBOX_INT_CNTL);
 
-   tmp = REG_SET_FIELD(tmp, BIF_BX_PF_MAILBOX_INT_CNTL, ACK_INT_EN,
-   (state == AMDGPU_IRQ_STATE_ENABLE) ? 1 : 0);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF_MAILBOX_INT_CNTL), 
tmp);
+   if (state == AMDGPU_IRQ_STATE_ENABLE)
+   tmp |= 2;
+   else
+   tmp &= ~2;
+
+   WREG32_NO_KIQ(mmMAILBOX_INT_CNTL,

RE: [PATCH 4/4] drm/amdgpu: cleanup all virtualization detection routine

2020-03-24 Thread Deng, Emily
[AMD Official Use Only - Internal Distribution Only]

Series Reviewed-by: Emily Deng 

>-Original Message-
>From: amd-gfx  On Behalf Of Monk Liu
>Sent: Tuesday, March 24, 2020 6:59 PM
>To: amd-gfx@lists.freedesktop.org
>Cc: Liu, Monk 
>Subject: [PATCH 4/4] drm/amdgpu: cleanup all virtualization detection routine
>
>we need to move virt detection much earlier because:
>1) the HW team confirmed to us that RCC_IOV_FUNC_IDENTIFIER will always be at
>the DE5 (dw) mmio offset from vega10 onward, so there is no need to implement
>a detect_hw_virt() routine in each nbio/chip file.
>For VI SRIOV chips (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at
>0x1503
>
>2) we need to acknowledge that we are an SRIOV VF before we do IP discovery,
>because the IP discovery content will be updated by the host every time it
>receives a new "REQ_GPU_INIT_DATA" request from the guest (there will be
>patches for this new handshake soon).
>
>Signed-off-by: Monk Liu 
>---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h   |  1 -
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33
>++
> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  6 
> drivers/gpu/drm/amd/amdgpu/cik.c   |  8 --
> drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 18 
> drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 18 
> drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c |  7 -
> drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 18 
> drivers/gpu/drm/amd/amdgpu/nv.c|  2 --
> drivers/gpu/drm/amd/amdgpu/si.c|  8 --
> drivers/gpu/drm/amd/amdgpu/soc15.c |  1 -
> drivers/gpu/drm/amd/amdgpu/vi.c| 24 
> .../amd/include/asic_reg/nbif/nbif_6_1_offset.h|  2 ++
> .../amd/include/asic_reg/nbio/nbio_7_0_offset.h|  2 ++
> .../amd/include/asic_reg/nbio/nbio_7_4_offset.h|  2 ++
> 16 files changed, 48 insertions(+), 105 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>index e55dbcd..ca609b6 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>@@ -3057,6 +3057,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   if (amdgpu_mes && adev->asic_type >= CHIP_NAVI10)
>   adev->enable_mes = true;
>
>+  /* detect hw virtualization here */
>+  amdgpu_detect_virtualization(adev);
>+
>   if (amdgpu_discovery && adev->asic_type >= CHIP_NAVI10) {
>   r = amdgpu_discovery_init(adev);
>   if (r) {
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
>index 919bd56..edaac24 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
>@@ -77,7 +77,6 @@ struct amdgpu_nbio_funcs {
> u32 *flags);
>   void (*ih_control)(struct amdgpu_device *adev);
>   void (*init_registers)(struct amdgpu_device *adev);
>-  void (*detect_hw_virt)(struct amdgpu_device *adev);
>   void (*remap_hdp_registers)(struct amdgpu_device *adev);
>   void (*handle_ras_controller_intr_no_bifring)(struct amdgpu_device
>*adev);
>   void (*handle_ras_err_event_athub_intr_no_bifring)(struct
>amdgpu_device *adev); diff --git
>a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>index adc813c..43a1ee3 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>@@ -287,3 +287,36 @@ void amdgpu_virt_init_data_exchange(struct
>amdgpu_device *adev)
>   }
>   }
> }
>+
>+void amdgpu_detect_virtualization(struct amdgpu_device *adev) {
>+  uint32_t reg;
>+
>+  switch (adev->asic_type) {
>+  case CHIP_TONGA:
>+  case CHIP_FIJI:
>+  reg = RREG32(mmBIF_IOV_FUNC_IDENTIFIER);
>+  break;
>+  case CHIP_VEGA10:
>+  case CHIP_VEGA20:
>+  case CHIP_NAVI10:
>+  case CHIP_NAVI12:
>+  case CHIP_ARCTURUS:
>+  reg = RREG32(mmRCC_IOV_FUNC_IDENTIFIER);
>+  break;
>+  default: /* other chip doesn't support SRIOV */
>+  reg = 0;
>+  break;
>+  }
>+
>+  if (reg & 1)
>+  adev->virt.caps |= AMDGPU_SRIOV_CAPS_IS_VF;
>+
>+  if (reg & 0x8000)
>+  adev->virt.caps |= AMDGPU_SRIOV_CAPS_ENABLE_IOV;
>+
>+  if (!reg) {
>+  if (is_virtual_machine())   /* passthrough mode excludes sriov mode */
>+  adev->virt.caps |= AMDGPU_PASSTHROUGH_MODE;
>+  }
>+}
>diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>index 0a95b13..74f9843 100644
>--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>@@ -30,6 +30,11 @@
> #define AM

Re: [PATCH 1/4] drm/radeon: remove unneeded header include path

2020-03-24 Thread Masahiro Yamada
On Wed, Mar 25, 2020 at 4:42 AM Alex Deucher  wrote:
>
> On Tue, Mar 24, 2020 at 12:48 PM Masahiro Yamada  wrote:
> >
> > Hi,
> >
> > I think this series is a good clean-up.
> >
> > Could you take a look at this please?
>
> Can you resend?  I don't seem to have gotten it.  Must have ended up
> getting flagged a spam or something.


Can you take it from patchwork ?  (4 patches)

https://lore.kernel.org/patchwork/project/lkml/list/?series=429491


Thanks.






> Alex
>
> >
> >
> >
> > On Fri, Feb 14, 2020 at 12:40 AM Masahiro Yamada  
> > wrote:
> > >
> > > A header include path without $(srctree)/ is suspicious because it does
> > > not work with O= builds.
> > >
> > > You can build drivers/gpu/drm/radeon/ without this include path.
> > >
> > > Signed-off-by: Masahiro Yamada 
> > > ---
> > >
> > >  drivers/gpu/drm/radeon/Makefile | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/radeon/Makefile 
> > > b/drivers/gpu/drm/radeon/Makefile
> > > index c693b2ca0329..9d5d3dc1011f 100644
> > > --- a/drivers/gpu/drm/radeon/Makefile
> > > +++ b/drivers/gpu/drm/radeon/Makefile
> > > @@ -3,8 +3,6 @@
> > >  # Makefile for the drm device driver.  This driver provides support for 
> > > the
> > >  # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
> > >
> > > -ccflags-y := -Idrivers/gpu/drm/amd/include
> > > -
> > >  hostprogs := mkregtable
> > >  clean-files := rn50_reg_safe.h r100_reg_safe.h r200_reg_safe.h 
> > > rv515_reg_safe.h r300_reg_safe.h r420_reg_safe.h rs600_reg_safe.h 
> > > r600_reg_safe.h evergreen_reg_safe.h cayman_reg_safe.h
> > >
> > > --
> > > 2.17.1
> > >
> >
> >
> > --
> > Best Regards
> > Masahiro Yamada
> > ___
> > dri-devel mailing list
> > dri-de...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Best Regards
Masahiro Yamada


[PATCH 3/4] drm/amdgpu: reroute VMC and UMD to IH ring 1 for oss v5

2020-03-24 Thread Alex Sierra
[Why]
Page faults can easily overwhelm the interrupt handler,
so to make sure that we never lose valuable interrupts on the primary ring
we re-route page faults to IH ring 1.
This also facilitates the page recovery process, since it is already
running from a process context.
This is valid for Arcturus and future Navi generation GPUs.

[How]
Setting IH_CLIENT_CFG_DATA for VMC and UMD IH clients.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c 
b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 4ce42635787a..6fca5206833d 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -205,6 +205,24 @@ static uint32_t navi10_ih_doorbell_rptr(struct 
amdgpu_ih_ring *ih)
return ih_doorbell_rtpr;
 }
 
+static void navi10_ih_reroute_ih(struct amdgpu_device *adev)
+{
+   uint32_t tmp;
+
+   /* Reroute to IH ring 1 for VMC */
+   WREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_INDEX, 0x12);
+   tmp = RREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_DATA);
+   tmp = REG_SET_FIELD(tmp, IH_CLIENT_CFG_DATA, CLIENT_TYPE, 1);
+   tmp = REG_SET_FIELD(tmp, IH_CLIENT_CFG_DATA, RING_ID, 1);
+   WREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_DATA, tmp);
+
+   /* Reroute IH ring 1 for UMC */
+   WREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_INDEX, 0x1B);
+   tmp = RREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_DATA);
+   tmp = REG_SET_FIELD(tmp, IH_CLIENT_CFG_DATA, RING_ID, 1);
+   WREG32_SOC15(OSSSYS, 0, mmIH_CLIENT_CFG_DATA, tmp);
+}
+
 /**
  * navi10_ih_irq_init - init and enable the interrupt ring
  *
@@ -243,6 +261,7 @@ static int navi10_ih_irq_init(struct amdgpu_device *adev)
} else {
WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
}
+   navi10_ih_reroute_ih(adev);
 
if (unlikely(adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT)) {
if (ih->use_bus_addr) {
-- 
2.17.1



[PATCH 4/4] drm/amdgpu: replace ih ip block for vega20 and arcturus

2020-03-24 Thread Alex Sierra
[Why]
Vega20 and Arcturus asics use oss 5.0 version.

[How]
Replace ih ip block by navi10 for vega20 and arcturus.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index aaf02dbb03f7..a38b5a90cd1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -62,6 +62,7 @@
 #include "nbio_v7_0.h"
 #include "nbio_v7_4.h"
 #include "vega10_ih.h"
+#include "navi10_ih.h"
 #include "sdma_v4_0.h"
 #include "uvd_v7_0.h"
 #include "vce_v4_0.h"
@@ -732,9 +733,7 @@ int soc15_set_ip_blocks(struct amdgpu_device *adev)
else
amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
}
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
} else {
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP)) {
if (adev->asic_type == CHIP_VEGA20)
amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
@@ -742,6 +741,10 @@ int soc15_set_ip_blocks(struct amdgpu_device *adev)
amdgpu_device_ip_block_add(adev, &psp_v3_1_ip_block);
}
}
+   if (adev->asic_type == CHIP_VEGA20)
+   amdgpu_device_ip_block_add(adev, &navi10_ih_ip_block);
+   else
+   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block);
amdgpu_device_ip_block_add(adev, &sdma_v4_0_ip_block);
if (is_support_sw_smu(adev)) {
@@ -785,13 +788,11 @@ int soc15_set_ip_blocks(struct amdgpu_device *adev)
if (amdgpu_sriov_vf(adev)) {
if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP))
amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
} else {
-   amdgpu_device_ip_block_add(adev, &vega10_ih_ip_block);
if (likely(adev->firmware.load_type == AMDGPU_FW_LOAD_PSP))
amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
}
-
+   amdgpu_device_ip_block_add(adev, &navi10_ih_ip_block);
if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
amdgpu_device_ip_block_add(adev, &dce_virtual_ip_block);
amdgpu_device_ip_block_add(adev, &gfx_v9_0_ip_block);
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/4] drm/amdgpu: enable IH ring 1 and ring 2 for navi

2020-03-24 Thread Alex Sierra
Support added into IH to enable ring1 and ring2 for navi10_ih.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 205 +++--
 1 file changed, 189 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index e08245a446fc..0c0ba572d7a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -51,6 +51,22 @@ static void navi10_ih_enable_interrupts(struct amdgpu_device *adev)
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 1);
WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
adev->irq.ih.enabled = true;
+
+   if (adev->irq.ih1.ring_size) {
+   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
+  RB_ENABLE, 1);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   adev->irq.ih1.enabled = true;
+   }
+
+   if (adev->irq.ih2.ring_size) {
+   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
+  RB_ENABLE, 1);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+   adev->irq.ih2.enabled = true;
+   }
 }
 
 /**
@@ -72,6 +88,31 @@ static void navi10_ih_disable_interrupts(struct amdgpu_device *adev)
WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
adev->irq.ih.enabled = false;
adev->irq.ih.rptr = 0;
+
+   if (adev->irq.ih1.ring_size) {
+   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
+   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
+  RB_ENABLE, 0);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   /* set rptr, wptr to 0 */
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, 0);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING1, 0);
+   adev->irq.ih1.enabled = false;
+   adev->irq.ih1.rptr = 0;
+   }
+
+   if (adev->irq.ih2.ring_size) {
+   ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
+   ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
+  RB_ENABLE, 0);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+   /* set rptr, wptr to 0 */
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING2, 0);
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING2, 0);
+   adev->irq.ih2.enabled = false;
+   adev->irq.ih2.rptr = 0;
+   }
+
 }
 
static uint32_t navi10_ih_rb_cntl(struct amdgpu_ih_ring *ih, uint32_t ih_rb_cntl)
@@ -97,6 +138,25 @@ static uint32_t navi10_ih_rb_cntl(struct amdgpu_ih_ring *ih, uint32_t ih_rb_cntl
return ih_rb_cntl;
 }
 
+static uint32_t navi10_ih_doorbell_rptr(struct amdgpu_ih_ring *ih)
+{
+   u32 ih_doorbell_rtpr = 0;
+
+   if (ih->use_doorbell) {
+   ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+IH_DOORBELL_RPTR, OFFSET,
+ih->doorbell_index);
+   ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+IH_DOORBELL_RPTR,
+ENABLE, 1);
+   } else {
+   ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
+IH_DOORBELL_RPTR,
+ENABLE, 0);
+   }
+   return ih_doorbell_rtpr;
+}
+
 /**
  * navi10_ih_irq_init - init and enable the interrupt ring
  *
@@ -111,7 +171,7 @@ static uint32_t navi10_ih_rb_cntl(struct amdgpu_ih_ring *ih, uint32_t ih_rb_cntl
 static int navi10_ih_irq_init(struct amdgpu_device *adev)
 {
struct amdgpu_ih_ring *ih = &adev->irq.ih;
-   u32 ih_rb_cntl, ih_doorbell_rtpr, ih_chicken;
+   u32 ih_rb_cntl, ih_chicken;
u32 tmp;
 
/* disable irqs */
@@ -149,22 +209,52 @@ static int navi10_ih_irq_init(struct amdgpu_device *adev)
WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, 0);
WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
 
-   ih_doorbell_rtpr = RREG32_SOC15(OSSSYS, 0, mmIH_DOORBELL_RPTR);
-   if (ih->use_doorbell) {
-   ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
-IH_DOORBELL_RPTR, OFFSET,
-ih->doorbell_index);
-   ih_doorbell_rtpr = REG_SET_FIELD(ih_doorbell_rtpr,
-IH_DOORBELL_RPTR, ENABLE, 1);
-   } else {
-   ih_doorbell_rtpr = REG_SET

[PATCH 2/4] drm/amdgpu: call psp to program ih cntl in SR-IOV for Navi

2020-03-24 Thread Alex Sierra
call psp to program ih cntl in SR-IOV if supported on Navi and Arcturus.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdgpu/navi10_ih.c | 90 +++---
 1 file changed, 80 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
index 0c0ba572d7a3..4ce42635787a 100644
--- a/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/navi10_ih.c
@@ -49,14 +49,30 @@ static void navi10_ih_enable_interrupts(struct amdgpu_device *adev)
 
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_ENABLE, 1);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 1);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
+   if (amdgpu_sriov_vf(adev)) {
+   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL, ih_rb_cntl)) {
+   DRM_ERROR("PSP program IH_RB_CNTL failed!\n");
+   return;
+   }
+   } else {
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
+   }
+
adev->irq.ih.enabled = true;
 
if (adev->irq.ih1.ring_size) {
ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
   RB_ENABLE, 1);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   if (amdgpu_sriov_vf(adev)) {
+   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING1,
+   ih_rb_cntl)) {
+   DRM_ERROR("program IH_RB_CNTL_RING1 failed!\n");
+   return;
+   }
+   } else {
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   }
adev->irq.ih1.enabled = true;
}
 
@@ -64,7 +80,15 @@ static void navi10_ih_enable_interrupts(struct amdgpu_device *adev)
ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
   RB_ENABLE, 1);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+   if (amdgpu_sriov_vf(adev)) {
+   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING2,
+   ih_rb_cntl)) {
+   DRM_ERROR("program IH_RB_CNTL_RING2 failed!\n");
+   return;
+   }
+   } else {
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+   }
adev->irq.ih2.enabled = true;
}
 }
@@ -82,7 +106,15 @@ static void navi10_ih_disable_interrupts(struct amdgpu_device *adev)
 
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, RB_ENABLE, 0);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL, ENABLE_INTR, 0);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
+   if (amdgpu_sriov_vf(adev)) {
+   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL, ih_rb_cntl)) {
+   DRM_ERROR("PSP program IH_RB_CNTL failed!\n");
+   return;
+   }
+   } else {
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL, ih_rb_cntl);
+   }
+
/* set rptr, wptr to 0 */
WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR, 0);
WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR, 0);
@@ -93,7 +125,15 @@ static void navi10_ih_disable_interrupts(struct amdgpu_device *adev)
ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING1,
   RB_ENABLE, 0);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   if (amdgpu_sriov_vf(adev)) {
+   if (psp_reg_program(&adev->psp, PSP_REG_IH_RB_CNTL_RING1,
+   ih_rb_cntl)) {
+   DRM_ERROR("program IH_RB_CNTL_RING1 failed!\n");
+   return;
+   }
+   } else {
+   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING1, ih_rb_cntl);
+   }
/* set rptr, wptr to 0 */
WREG32_SOC15(OSSSYS, 0, mmIH_RB_RPTR_RING1, 0);
WREG32_SOC15(OSSSYS, 0, mmIH_RB_WPTR_RING1, 0);
@@ -105,7 +145,15 @@ static void navi10_ih_disable_interrupts(struct amdgpu_device *adev)
ih_rb_cntl = RREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2);
ih_rb_cntl = REG_SET_FIELD(ih_rb_cntl, IH_RB_CNTL_RING2,
   RB_ENABLE, 0);
-   WREG32_SOC15(OSSSYS, 0, mmIH_RB_CNTL_RING2, ih_rb_cntl);
+ 

RE: [PATCH] drm/amdgpu: add SOS FW version checking for CAP

2020-03-24 Thread Liu, Shaoyun
[AMD Official Use Only - Internal Distribution Only]

Reviewed-by: Shaoyun.liu 


-Original Message-
From: amd-gfx  On Behalf Of Zhigang Luo
Sent: Tuesday, March 24, 2020 3:48 PM
To: amd-gfx@lists.freedesktop.org
Cc: Luo, Zhigang 
Subject: [PATCH] drm/amdgpu: add SOS FW version checking for CAP

To make sure the CAP feature is supported by the SOS, add SOS FW version 
checking before loading the CAP FW.

Change-Id: I7aa1c09f9c117f67ede0db6cd5911d56c8568495
Signed-off-by: Zhigang Luo 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index dc42086a672b..c2bf2d900039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1407,6 +1407,11 @@ static int psp_np_fw_load(struct psp_context *psp)
if (!ucode->fw)
continue;
 
+   if (ucode->ucode_id == AMDGPU_UCODE_ID_CAP &&
+   (psp->sos_fw_version < 0x80F5B))
+   /* 0x80F5B is the first SOS FW version with CAP support */
+   continue;
+
if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
(psp_smu_reload_quirk(psp) ||
 psp->autoload_supported ||
--
2.17.1



Re: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-24 Thread Grodzovsky, Andrey
I see now. In that case the change seems good to me.

Andrey

From: Koenig, Christian 
Sent: 24 March 2020 13:58
To: Grodzovsky, Andrey ; Pan, Xinhui 
; Tao, Yintian ; Deucher, Alexander 

Cc: amd-gfx@lists.freedesktop.org 
Subject: Re: [PATCH] drm/amdgpu: hold the reference of finished fence

There is a misunderstanding here:

Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ?

The refcount on the finished fence doesn't become zero before it is signaled, 
it becomes zero while it is signaled.

CPU 1 calls dma_fence_signal(fence) without holding a reference to the fence. 
CPU 2 at the same time checks if the fence is signaled and frees the last 
reference because it finds the signaled flag to be set.

The problem is now that dma_fence_signal() wants to set the timestamp after 
setting the signaled flag and now races with freeing the memory.

That's a really hard to hit problem, but it indeed seems to be possible.

Christian.

On 24.03.20 at 15:52, Andrey Grodzovsky wrote:

This is only for the guilty job which was removed from the ring_mirror_list due 
to completion and hence will not be resubmitted by recovery and will not be 
freed by the usual flow in drm_sched_get_cleanup_job (see drm_sched_stop)

Andrey

On 3/24/20 10:45 AM, Pan, Xinhui wrote:

[AMD Official Use Only - Internal Distribution Only]

Does this issue occur during gpu recovery?
I just checked the code: a fence timeout will free the job and put its fence, but gpu
recovery might resubmit the job.
Correct me if I am wrong.

From: amd-gfx  on behalf of Andrey Grodzovsky 
Sent: Tuesday, March 24, 2020 11:40:06 AM
To: Tao, Yintian ; Koenig, Christian ; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org 
Subject: Re: [PATCH] drm/amdgpu: hold the reference of finished fence


On 3/23/20 10:22 AM, Yintian Tao wrote:
> There is one corner case at dma_fence_signal_locked
> which will raise the NULL pointer problem just like below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>->test_and_set_bit
> here dma_fence_release is triggered because the fence refcount drops to zero.


Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ? The finished fence is created with
refcount set to 1 in drm_sched_fence_create->dma_fence_init and then the
refcount is decremented in
drm_sched_main->amdgpu_job_free_cb->drm_sched_job_cleanup. This should
only happen after fence is already signaled (see
drm_sched_get_cleanup_job). On top of that the finished fence is
referenced from other places (e.g. entity->last_scheduled e.t.c)...


>
> ->dma_fence_put
>  ->dma_fence_release
>->drm_sched_fence_release_scheduled
>->call_rcu
> here makes the union field “cb_list” of the finished fence
> NULL, because struct rcu_head contains two pointers,
> the same layout as struct list_head cb_list
>
> Therefore, hold a reference to the finished fence in drm_sched_process_job
> to prevent the NULL pointer dereference while the finished fence is signaled
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 0008
> [  732.914815] #PF: supervisor write access in kernel mode
> [  732.915731] #PF: error_code(0x0002) - not-present page
> [  732.916621] PGD 0 P4D 0
> [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   OE 
> 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50
> [  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
> [  732.941910]  dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50
> [  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
> [  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> v2: hold the finished fence at drm_sched_process_job instead of
>  amdgpu_fence_process
> v3: resume the blank line
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a18eabf692e4..8e731ed0d9d9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -651,7 +651,9 @@ static void drm_sched_process_job(struct dma_fence *f, 
> struct dma_fence_cb *cb)
>
>trace_drm_sched_proce

[PATCH] drm/amdgpu: add SOS FW version checking for CAP

2020-03-24 Thread Zhigang Luo
To make sure the CAP feature is supported by the SOS, add SOS FW version
checking before loading the CAP FW.

Change-Id: I7aa1c09f9c117f67ede0db6cd5911d56c8568495
Signed-off-by: Zhigang Luo 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index dc42086a672b..c2bf2d900039 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1407,6 +1407,11 @@ static int psp_np_fw_load(struct psp_context *psp)
if (!ucode->fw)
continue;
 
+   if (ucode->ucode_id == AMDGPU_UCODE_ID_CAP &&
+   (psp->sos_fw_version < 0x80F5B))
+   /* 0x80F5B is the first SOS FW version with CAP support */
+   continue;
+
if (ucode->ucode_id == AMDGPU_UCODE_ID_SMC &&
(psp_smu_reload_quirk(psp) ||
 psp->autoload_supported ||
-- 
2.17.1



Re: [PATCH 1/4] drm/radeon: remove unneeded header include path

2020-03-24 Thread Alex Deucher
On Tue, Mar 24, 2020 at 12:48 PM Masahiro Yamada  wrote:
>
> Hi,
>
> I think this series is a good clean-up.
>
> Could you take a look at this please?

Can you resend?  I don't seem to have gotten it.  Must have ended up
getting flagged as spam or something.

Alex

>
>
>
> On Fri, Feb 14, 2020 at 12:40 AM Masahiro Yamada  wrote:
> >
> > A header include path without $(srctree)/ is suspicious because it does
> > not work with O= builds.
> >
> > You can build drivers/gpu/drm/radeon/ without this include path.
> >
> > Signed-off-by: Masahiro Yamada 
> > ---
> >
> >  drivers/gpu/drm/radeon/Makefile | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
> > index c693b2ca0329..9d5d3dc1011f 100644
> > --- a/drivers/gpu/drm/radeon/Makefile
> > +++ b/drivers/gpu/drm/radeon/Makefile
> > @@ -3,8 +3,6 @@
> >  # Makefile for the drm device driver.  This driver provides support for the
> >  # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
> >
> > -ccflags-y := -Idrivers/gpu/drm/amd/include
> > -
> >  hostprogs := mkregtable
> >  clean-files := rn50_reg_safe.h r100_reg_safe.h r200_reg_safe.h rv515_reg_safe.h r300_reg_safe.h r420_reg_safe.h rs600_reg_safe.h r600_reg_safe.h evergreen_reg_safe.h cayman_reg_safe.h
> >
> > --
> > 2.17.1
> >
>
>
> --
> Best Regards
> Masahiro Yamada


Re: [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT

2020-03-24 Thread Jason Gunthorpe
On Tue, Mar 24, 2020 at 08:33:39AM +0100, Christoph Hellwig wrote:
> >  
> > +/*
> > + * If the valid flag is masked off, and default_flags doesn't set valid, 
> > then
> > + * hmm_pte_need_fault() always returns 0.
> > + */
> > +static bool hmm_can_fault(struct hmm_range *range)
> > +{
> > +   return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
> > +   range->default_flags) &
> > +  range->flags[HMM_PFN_VALID];
> > +}
> 
> So my idea behind the helper was to turn this into something readable :)

Well, it does help to give the expression a name :)

> E.g.
> 
> /*
>  * We only need to fault if either the default mask requires to fault all
>  * pages, or at least the mask allows for individual pages to be faulted.
>  */
> static bool hmm_can_fault(struct hmm_range *range)
> {
>   return ((range->default_flags | range->pfn_flags_mask) &
>   range->flags[HMM_PFN_VALID]);
> }

Okay, I find this understandable and it is less cluttered. I think
the comment is good enough now.

Can we concur on this then:

 static unsigned int
 hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 const uint64_t *pfns, unsigned long npages,
 uint64_t cpu_flags)
 {
+   struct hmm_range *range = hmm_vma_walk->range;
unsigned int required_fault = 0;
unsigned long i;
 
-   if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
+   /*
+* If the default flags do not request to fault pages, and the mask does
+* not allow for individual pages to be faulted, then
+* hmm_pte_need_fault() will always return 0.
+*/
+   if (!((range->default_flags | range->pfn_flags_mask) &
+ range->flags[HMM_PFN_VALID]))
return 0;

I think everything else is sorted now, so if yes I'll send this as v3.

Thanks,
Jason


latencies and code inefficiencies in amdgpu display handling

2020-03-24 Thread Lucas Stach
Hi guys,

recently I've been tracing some IRQ latencies in a system and the
display handling in amdgpu doesn't really look that good. To be honest
it also doesn't look too bad, but I still want to share my findings
here. The trace below is from a single vblank IRQ with a pageflip.

The most interesting points from my perspective:

1. While this is a single CRTC vblank IRQ, the handler calls
amdgpu_get_vblank_counter_kms 10(!) times. This isn't really a cheap
function as it also reads the current scanout position and thus makes
multiple MMIO device reads.
This seems like low-hanging fruit for optimization, as querying the
same thing this many times in a single IRQ invocation seems like total
overkill.

2. In this particular trace one of the scanout position reads seems to
block. The trace indicates that almost 300us are spent in this single
device read. Is this a known issue?

3. There are quite a few spinlocks being locked with spin_lock_irqsave,
while this handler code is never called from non-IRQ context, so all
those calls could use the cheaper spin_lock. This is a micro-optimization, but it caught my eye when looking over the trace/code.

Regards,
Lucas


# tracer: irqsoff
#
# irqsoff latency trace v1.1.5 on 5.6.0-rc7+
# 
# latency: 417 us, #446/446, CPU#6 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:16)
#-
#| task: user-task-0 (uid:1000 nice:0 policy:0 rt_prio:0)
#-
#  => started at: interrupt_entry
#  => ended at:   swapgs_restore_regs_and_return_to_usermode
#
#
#  _--=> CPU#
# / _-=> irqs-off
#| / _=> need-resched
#|| / _---=> hardirq/softirq 
#||| / _--=> preempt-depth   
# / delay
#  cmd pid   | time  |   caller  
# \   /  |  \|   / 
  user-task-06d...0us : trace_hardirqs_off_thunk <-interrupt_entry
  user-task-06d...0us : do_IRQ <-ret_from_intr
  user-task-06d...0us : irq_enter <-do_IRQ
  user-task-06d...0us : rcu_irq_enter <-irq_enter
  user-task-06d...0us : irqtime_account_irq <-irq_enter
  user-task-06d.h.1us : handle_edge_irq <-do_IRQ
  user-task-06d.h.1us : _raw_spin_lock <-handle_edge_irq
  user-task-06d.h.1us : irq_may_run <-handle_edge_irq
  user-task-06d.h.1us : irq_chip_ack_parent <-handle_edge_irq
  user-task-06d.h.1us : apic_ack_irq <-handle_edge_irq
  user-task-06d.h.1us : handle_irq_event <-handle_edge_irq
  user-task-06d.h.1us : handle_irq_event_percpu <-handle_irq_event
  user-task-06d.h.1us : __handle_irq_event_percpu <-handle_irq_event_percpu
  user-task-06d.h.1us : amdgpu_irq_handler <-__handle_irq_event_percpu
  user-task-06d.h.2us : amdgpu_ih_process <-amdgpu_irq_handler
  user-task-06d.h.2us : tonga_ih_get_wptr <-amdgpu_ih_process
  user-task-06d.h.2us : __drm_dbg <-amdgpu_ih_process
  user-task-06d.h.2us : amdgpu_irq_dispatch <-amdgpu_ih_process
  user-task-06d.h.2us : tonga_ih_decode_iv <-amdgpu_irq_dispatch
  user-task-06d.h.3us : amdgpu_dm_irq_handler <-amdgpu_irq_dispatch
  user-task-06d.h.3us : dc_interrupt_to_irq_source <-amdgpu_dm_irq_handler
  user-task-06d.h.3us : dal_irq_service_to_irq_source <-amdgpu_dm_irq_handler
  user-task-06d.h.4us : to_dal_irq_source_dce110 <-amdgpu_dm_irq_handler
  user-task-06d.h.4us : dc_interrupt_ack <-amdgpu_dm_irq_handler
  user-task-06d.h.5us : dal_irq_service_ack <-amdgpu_dm_irq_handler
  user-task-06d.h.5us : dal_irq_service_ack_generic <-dal_irq_service_ack
  user-task-06d.h.5us : dm_read_reg_func <-dal_irq_service_ack_generic
  user-task-06d.h.5us : amdgpu_cgs_read_register <-dm_read_reg_func
  user-task-06d.h.6us : amdgpu_mm_rreg <-dm_read_reg_func
  user-task-06d.h.7us : amdgpu_cgs_write_register <-dal_irq_service_ack_generic
  user-task-06d.h.7us : amdgpu_mm_wreg <-dal_irq_service_ack_generic
  user-task-06d.h.7us : _raw_spin_lock_irqsave <-amdgpu_dm_irq_handler
  user-task-06d.h.8us : dm_crtc_high_irq <-amdgpu_dm_irq_handler
  user-task-06d.h.8us : get_crtc_by_otg_inst.isra.0 <-dm_crtc_high_irq
  user-task-06d.h.8us : __drm_dbg <-dm_crtc_high_irq
  user-task-06d.h.8us : drm_crtc_handle_vblank <-dm_crtc_high_irq
  user-task-06d.h.9us : drm_handle_vblank <-dm_crtc_high_irq
  user-task-06d.h.9us : _raw_spin_lock_irqsave <-drm_handle_vblank
  user-task-06d.h.9us : _raw_spin_lock <-drm_handle_vblank
  user-task-06d.h.9us : drm_update_vblank_count <-drm_handle_vblank
  user-task-06d.h.9us : __get_vblank_counter <-drm_update_vblank_count
  user-task-06d.h.   10us : drm_crtc_from_index <-__get_vblank_counter
  user-

Re: [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()

2020-03-24 Thread Jason Gunthorpe
On Tue, Mar 24, 2020 at 08:27:12AM +0100, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 10:14:50PM -0300, Jason Gunthorpe wrote:
> > +enum {
> > +   HMM_NEED_FAULT = 1 << 0,
> > +   HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
> > +   HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
> 
> I have to say I find the compound version of HMM_NEED_WRITE_FAULT
> way harder to understand than the logic in the previous version,
> and would refer keeping separate bits here.
> 
> Mostly beccause of statements like this:
> 
> > +   if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
> 
> which seems rather weird.

Okay, I checked it over, and there is one weird statement above but
only one place that |'s them together, so it is overall simpler to
split the enum.

I'll keep the HMM_NEED_ALL_BITS, I think that purpose is clear enough.

Thanks,
Jason


Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-03-24 Thread Kenny Ho
Hi Tejun,

Can you elaborate more on what are the missing pieces?

Regards,
Kenny

On Tue, Mar 24, 2020 at 2:46 PM Tejun Heo  wrote:
>
> On Tue, Mar 17, 2020 at 12:03:20PM -0400, Kenny Ho wrote:
> > What's your thoughts on this latest series?
>
> My overall impression is that the feedback isn't being incorporated thoroughly
> / sufficiently.
>
> Thanks.
>
> --
> tejun


Re: [PATCH v2 00/11] new cgroup controller for gpu/drm subsystem

2020-03-24 Thread Tejun Heo
On Tue, Mar 17, 2020 at 12:03:20PM -0400, Kenny Ho wrote:
> What's your thoughts on this latest series?

My overall impression is that the feedback isn't being incorporated thoroughly
/ sufficiently.

Thanks.

-- 
tejun


Re: [RFC PATCH 1/1] drm/amdgpu: rework sched_list generation

2020-03-24 Thread Christian König

On 24.03.20 at 12:40, Nirmoy Das wrote:

Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much leaner.


Well, good start; a few style nitpicks below. But in general, no time
right now to review this deeply.




Signed-off-by: Nirmoy Das 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|   4 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 148 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h|   3 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c|   3 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h|   5 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h   |   2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |  11 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  28 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h   |   2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h|   4 -
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c  |   3 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  13 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c  |   5 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  |   5 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  |  11 +-
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |  13 +-
  drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |   6 +-
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |   3 +-
  drivers/gpu/drm/amd/amdgpu/si_dma.c|   3 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c  |   3 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c  |   3 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c  |   7 +-
  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |   6 +-
  drivers/gpu/drm/amd/amdgpu/vce_v2_0.c  |   2 +-
  drivers/gpu/drm/amd/amdgpu/vce_v3_0.c  |   3 +-
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |   3 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |   6 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  |   6 +-
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  |   6 +-
  35 files changed, 131 insertions(+), 203 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7dd74253e7b6..ac2ab2933e12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -984,6 +984,10 @@ struct amdgpu_device {
char product_number[16];
char product_name[32];
char serial[16];
+
+   /* drm scheduler list */
+   struct drm_gpu_scheduler
*ctx_scheds[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][AMDGPU_MAX_COMPUTE_RINGS];
+   uint32_t
ctx_num_scheds[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
  };
  
  static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 6ed36a2c5f73..24e98d674570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -72,6 +72,15 @@ static enum gfx_pipe_priority amdgpu_ctx_sched_prio_to_compute_prio(enum drm_sch
}
  }
  
+static unsigned int amdgpu_ctx_sched_prio_to_hw_prio(enum drm_sched_priority prio,
+const int hw_ip)
+{
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE)
+   return amdgpu_ctx_sched_prio_to_compute_prio(prio);
+
+   return AMDGPU_RING_PRIO_DEFAULT;
+}
+
  static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, const u32 ring)
  {
struct amdgpu_device *adev = ctx->adev;
@@ -90,52 +99,19 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, const
entity->sequence = 1;
priority = (ctx->override_priority == DRM_SCHED_PRIORITY_UNSET) ?
ctx->init_priority : ctx->override_priority;
-   switch (hw_ip) {
-   case AMDGPU_HW_IP_GFX:
-   sched = &adev->gfx.gfx_ring[0].sched;
-   scheds = &sched;
-   num_scheds = 1;
-   break;
-   case AMDGPU_HW_IP_COMPUTE:
-   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(priority);
-   scheds = adev->gfx.compute_prio_sched[hw_prio];
-   num_scheds = adev->gfx.num_compute_sched[hw_prio];
-   break;
-   case AMDGPU_HW_IP_DMA:
-   scheds = adev->sdma.sdma_sched;
-   num_scheds = adev->sdma.num_sdma_sched;
-   break;
-   case AMDGPU_HW_IP_UVD:
-   sch

Re: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-24 Thread Christian König

There is a misunderstanding here:


Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ?


The refcount on the finished fence doesn't become zero before it is
signaled; it becomes zero while it is being signaled.


CPU 1 calls dma_fence_signal(fence) without holding a reference to the 
fence. CPU 2 at the same time checks if the fence is signaled and frees 
the last reference because it finds the signaled flag to be set.


The problem is now that dma_fence_signal() wants to set the timestamp 
after setting the signaled flag and now races with freeing the memory.


That's a really hard-to-hit problem, but it indeed seems to be possible.

Christian.

Am 24.03.20 um 15:52 schrieb Andrey Grodzovsky:


This is only for the guilty job which was removed from the 
ring_mirror_list due to completion and hence will not be resubmitted 
by recovery and will not be freed by the usual flow in 
drm_sched_get_cleanup_job (see drm_sched_stop)


Andrey

On 3/24/20 10:45 AM, Pan, Xinhui wrote:


[AMD Official Use Only - Internal Distribution Only]


Does this issue occur during GPU recovery?
I just checked the code: a fence timeout will free the job and put its
fence, but GPU recovery might resubmit the job.

Correct me if I am wrong.

*From:* amd-gfx  on behalf of 
Andrey Grodzovsky 

*Sent:* Tuesday, March 24, 2020 11:40:06 AM
*To:* Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 


*Cc:* amd-gfx@lists.freedesktop.org 
*Subject:* Re: [PATCH] drm/amdgpu: hold the reference of finished fence

On 3/23/20 10:22 AM, Yintian Tao wrote:
> There is one corner case at dma_fence_signal_locked
> which will raise the NULL pointer problem just like below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>    ->test_and_set_bit
> here trigger dma_fence_release happen due to the zero of fence 
refcount.



Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ? The finished fence is created with
refcount set to 1 in drm_sched_fence_create->dma_fence_init and then the
refcount is decremented in
drm_sched_main->amdgpu_job_free_cb->drm_sched_job_cleanup. This should
only happen after fence is already signaled (see
drm_sched_get_cleanup_job). On top of that the finished fence is
referenced from other places (e.g. entity->last_scheduled e.t.c)...


>
> ->dma_fence_put
>  ->dma_fence_release
>    ->drm_sched_fence_release_scheduled
>    ->call_rcu
> here make the union field “cb_list” at finished fence
> to NULL because struct rcu_head contains two pointer
> which is same as struct list_head cb_list
>
> Therefore, hold the reference of the finished fence at drm_sched_process_job
> to prevent the null pointer during the finished fence's dma_fence_signal
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 
0008

> [  732.914815] #PF: supervisor write access in kernel mode
> [  732.915731] #PF: error_code(0x0002) - not-present page
> [  732.916621] PGD 0 P4D 0
> [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   
OE 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 
1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014

> [  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50
> [  732.940036] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
> [  732.941910] dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50
> [  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
> [  732.944393] sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> v2: hold the finished fence at drm_sched_process_job instead of
>  amdgpu_fence_process
> v3: resume the blank line
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

> index a18eabf692e4..8e731ed0d9d9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -651,7 +651,9 @@ static void drm_sched_process_job(struct 
dma_fence *f, struct dma_fence_cb *cb)

>
>    trace_drm_sched_process_job(s_fence);
>
> + dma_fence_get(&s_fence->finished);
>    drm_sched_fence_finished(s_fence);


If the fence was already released during call to
drm_sched_fence_finished->dma_fence_signal->... why is it safe to
reference the s_fence just before that call ? Can't it already be
released by this time ?

Andrey



> + dma_fence_put(&s_fence->finished);
> wake_up_interruptible(&sched->wake_up_worker);
>   }
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 1/4] drm/radeon: remove unneeded header include path

2020-03-24 Thread Masahiro Yamada
Hi,

I think this series is a good clean-up.

Could you take a look at this please?



On Fri, Feb 14, 2020 at 12:40 AM Masahiro Yamada  wrote:
>
> A header include path without $(srctree)/ is suspicious because it does
> not work with O= builds.
>
> You can build drivers/gpu/drm/radeon/ without this include path.
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  drivers/gpu/drm/radeon/Makefile | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/Makefile b/drivers/gpu/drm/radeon/Makefile
> index c693b2ca0329..9d5d3dc1011f 100644
> --- a/drivers/gpu/drm/radeon/Makefile
> +++ b/drivers/gpu/drm/radeon/Makefile
> @@ -3,8 +3,6 @@
>  # Makefile for the drm device driver.  This driver provides support for the
>  # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
>
> -ccflags-y := -Idrivers/gpu/drm/amd/include
> -
>  hostprogs := mkregtable
>  clean-files := rn50_reg_safe.h r100_reg_safe.h r200_reg_safe.h 
> rv515_reg_safe.h r300_reg_safe.h r420_reg_safe.h rs600_reg_safe.h 
> r600_reg_safe.h evergreen_reg_safe.h cayman_reg_safe.h
>
> --
> 2.17.1
>


--
Best Regards
Masahiro Yamada
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: Provide SMI events watch

2020-03-24 Thread Amber Lin
Sorry for the messed-up link. This is the link (rocm-smi-lib) which 
makes use of the interface

https://github.com/RadeonOpenCompute/rocm_smi_lib

On 2020-03-23 2:19 p.m., Amber Lin wrote:

Somehow my reply didn't seem to reach the mailing list...

Hi Alex,

https://github.com/RadeonOpenCompute/rocm_smi_lib
will use this interface. Those functions will be added to this library:


/* Get a handle for watching events */
rsmi_status_t rsmi_event_init(rsmi_event_handle_t *handle);
/* Register events for the device using the handle from init */
rsmi_status_t rsmi_event_register(uint32_t dv_ind, uint32_t events,
    rsmi_event_handle_t *handle);
/* Wait for events. If one of the events happens, a success is returned
 * with details in data.
 */
rsmi_status_t rsmi_event_wait(rsmi_event_handle_t handle, uint32_t timeout_ms,
    rsmi_event_data_t *data);
/* Stop watching events */
rsmi_status_t rsmi_event_free(rsmi_event_handle_t handle);

I added the ioctl to /dev/kfd, with a debate as to whether it should be in
/dev/dri/card* or /dev/dri/renderD* instead. The first event to report
is VM fault in this patch. Other events like RAS errors, PCIe errors,
GPU reset, etc. will be added for the system admin to diagnose the
system health. I see this as a system feature, so I use /dev/kfd. I’d
like to hear if people think differently. Thanks.


Thanks.

Amber

On 2020-03-17 3:03 p.m., Alex Deucher wrote:

On Tue, Mar 17, 2020 at 1:57 PM Amber Lin  wrote:
When the compute is malfunctioning or performance drops, the system admin
will use an SMI (System Management Interface) tool to monitor/diagnose what
went wrong. This patch provides an event watch interface for the user
space to register the events they are interested in. After the event is
registered, the user can use the anonymous file descriptor's poll function
with a wait time specified to wait for the event to happen. Once the event
happens, the user can use read() to retrieve information related to the
event.

VM fault event is done in this patch.

Signed-off-by: Amber Lin 
Can you provide a link to the userspace tools that make use of this 
interface?


Thanks,

Alex


---
  drivers/gpu/drm/amd/amdkfd/Makefile  |   3 +-
  drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c |   2 +
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  38 ++
  drivers/gpu/drm/amd/amdkfd/kfd_device.c  |   1 +
  drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c  |   2 +
  drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  10 ++
  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c  | 143 +++
  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h  |  41 +++
  include/uapi/linux/kfd_ioctl.h   |  27 -
  9 files changed, 265 insertions(+), 2 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
  create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h

diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile 
b/drivers/gpu/drm/amd/amdkfd/Makefile

index 6147462..cc98b4a 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -53,7 +53,8 @@ AMDKFD_FILES  := $(AMDKFD_PATH)/kfd_module.o \
 $(AMDKFD_PATH)/kfd_int_process_v9.o \
 $(AMDKFD_PATH)/kfd_dbgdev.o \
 $(AMDKFD_PATH)/kfd_dbgmgr.o \
-   $(AMDKFD_PATH)/kfd_crat.o
+   $(AMDKFD_PATH)/kfd_crat.o \
+   $(AMDKFD_PATH)/kfd_smi_events.o

  ifneq ($(CONFIG_AMD_IOMMU_V2),)
  AMDKFD_FILES += $(AMDKFD_PATH)/kfd_iommu.o
diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c 
b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c

index 9f59ba9..24b4717 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
@@ -24,6 +24,7 @@
  #include "kfd_events.h"
  #include "cik_int.h"
  #include "amdgpu_amdkfd.h"
+#include "kfd_smi_events.h"

  static bool cik_event_interrupt_isr(struct kfd_dev *dev,
 const uint32_t *ih_ring_entry,
@@ -107,6 +108,7 @@ static void cik_event_interrupt_wq(struct 
kfd_dev *dev,

 ihre->source_id == CIK_INTSRC_GFX_MEM_PROT_FAULT) {
 struct kfd_vm_fault_info info;

+   kfd_smi_event_update_vmfault(dev, pasid);
 kfd_process_vm_fault(dev->dqm, pasid);

 memset(&info, 0, sizeof(info));
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c

index f8fa03a..8e92956 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -39,6 +39,7 @@
  #include "kfd_device_queue_manag

Re: [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY

2020-03-24 Thread Jason Gunthorpe
On Tue, Mar 24, 2020 at 08:37:46AM +0100, Christoph Hellwig wrote:
> On Mon, Mar 23, 2020 at 10:14:55PM -0300, Jason Gunthorpe wrote:
> > if (pte_none(pte)) {
> > required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> > if (required_fault)
> > goto fault;
> > +   *pfn = range->values[HMM_PFN_NONE];
> > return 0;
> > }
> >  
> > @@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
> > unsigned long addr,
> > }
> >  
> > required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> > -   if (!required_fault)
> > +   if (!required_fault) {
> > +   *pfn = range->values[HMM_PFN_NONE];
> > return 0;
> > +   }
> 
> Maybe throw in a goto hole to consolidate the set PFN and return
> 0 cases?

Then we have goto fault and goto none both ending in returns. I
generally prefer the goto labels to have a single return.

The pte_unmap() before faulting makes this routine twisty and I
haven't thought of a good way to untwist it.

Thanks,
Jason
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 4/4] drm/amdgpu: cleanup all virtualization detection routine

2020-03-24 Thread Alex Deucher
On Tue, Mar 24, 2020 at 6:59 AM Monk Liu  wrote:
>
> we need to move virt detection much earlier because:
> 1) the HW team confirmed to us that RCC_IOV_FUNC_IDENTIFIER will always
> be at DE5 (dw) mmio offset from vega10; this way there is no
> need to implement a detect_hw_virt() routine in each nbio/chip file.
> for VI SRIOV chips (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at
> 0x1503
>
> 2) we need to acknowledge we are an SRIOV VF before we do IP discovery because
> the IP discovery content will be updated by the host every time it receives
> a new "REQ_GPU_INIT_DATA" request from the guest (there will be patches
> for this new handshake soon).
>
> Signed-off-by: Monk Liu 

Series is:
Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h   |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 
> ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  6 
>  drivers/gpu/drm/amd/amdgpu/cik.c   |  8 --
>  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 18 
>  drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 18 
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c |  7 -
>  drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 18 
>  drivers/gpu/drm/amd/amdgpu/nv.c|  2 --
>  drivers/gpu/drm/amd/amdgpu/si.c|  8 --
>  drivers/gpu/drm/amd/amdgpu/soc15.c |  1 -
>  drivers/gpu/drm/amd/amdgpu/vi.c| 24 
>  .../amd/include/asic_reg/nbif/nbif_6_1_offset.h|  2 ++
>  .../amd/include/asic_reg/nbio/nbio_7_0_offset.h|  2 ++
>  .../amd/include/asic_reg/nbio/nbio_7_4_offset.h|  2 ++
>  16 files changed, 48 insertions(+), 105 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index e55dbcd..ca609b6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3057,6 +3057,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> if (amdgpu_mes && adev->asic_type >= CHIP_NAVI10)
> adev->enable_mes = true;
>
> +   /* detect hw virtualization here */
> +   amdgpu_detect_virtualization(adev);
> +
> if (amdgpu_discovery && adev->asic_type >= CHIP_NAVI10) {
> r = amdgpu_discovery_init(adev);
> if (r) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
> index 919bd56..edaac24 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
> @@ -77,7 +77,6 @@ struct amdgpu_nbio_funcs {
>   u32 *flags);
> void (*ih_control)(struct amdgpu_device *adev);
> void (*init_registers)(struct amdgpu_device *adev);
> -   void (*detect_hw_virt)(struct amdgpu_device *adev);
> void (*remap_hdp_registers)(struct amdgpu_device *adev);
> void (*handle_ras_controller_intr_no_bifring)(struct amdgpu_device 
> *adev);
> void (*handle_ras_err_event_athub_intr_no_bifring)(struct 
> amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> index adc813c..43a1ee3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -287,3 +287,36 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device 
> *adev)
> }
> }
>  }
> +
> +void amdgpu_detect_virtualization(struct amdgpu_device *adev)
> +{
> +   uint32_t reg;
> +
> +   switch (adev->asic_type) {
> +   case CHIP_TONGA:
> +   case CHIP_FIJI:
> +   reg = RREG32(mmBIF_IOV_FUNC_IDENTIFIER);
> +   break;
> +   case CHIP_VEGA10:
> +   case CHIP_VEGA20:
> +   case CHIP_NAVI10:
> +   case CHIP_NAVI12:
> +   case CHIP_ARCTURUS:
> +   reg = RREG32(mmRCC_IOV_FUNC_IDENTIFIER);
> +   break;
> +   default: /* other chip doesn't support SRIOV */
> +   reg = 0;
> +   break;
> +   }
> +
> +   if (reg & 1)
> +   adev->virt.caps |= AMDGPU_SRIOV_CAPS_IS_VF;
> +
> +   if (reg & 0x8000)
> +   adev->virt.caps |= AMDGPU_SRIOV_CAPS_ENABLE_IOV;
> +
> +   if (!reg) {
> +   if (is_virtual_machine())   /* passthrough mode excludes sriov mode */
> +   adev->virt.caps |= AMDGPU_PASSTHROUGH_MODE;
> +   }
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> index 0a95b13..74f9843 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> @@ -30,6 +30,11 @@
>  #define AMDGPU_PASSTHROUGH_MODE(1 << 3) /* the whole GPU is pass 
> through

Re: [PATCH] drm/amdgpu: Fix FRU data checking

2020-03-24 Thread Alex Deucher
On Tue, Mar 24, 2020 at 7:49 AM Kent Russell  wrote:
>
> Ensure that when we memcpy, we don't end up copying more data than
> the struct supports. For now, this is 16 characters for product number
> and serial number, and 32 chars for product name
>
> Signed-off-by: Kent Russell 

Reviewed-by: Alex Deucher 

> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 21 +++
>  1 file changed, 21 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> index 6f5e98fda181..bfe4259f9508 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
> @@ -116,6 +116,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device 
> *adev)
> return size;
> }
>
> +   /* Product name should only be 32 characters. Any more,
> +* and something could be wrong. Cap it at 32 to be safe
> +*/
> +   if (size > 32) {
> +   DRM_WARN("FRU Product Name is larger than 32 characters. This is likely a mistake");
> +   size = 32;
> +   }
> /* Start at 2 due to buff using fields 0 and 1 for the address */
> memcpy(adev->product_name, &buff[2], size);
> adev->product_name[size] = '\0';
> @@ -127,6 +134,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device 
> *adev)
> return size;
> }
>
> +   /* Product number should only be 16 characters. Any more,
> +* and something could be wrong. Cap it at 16 to be safe
> +*/
> +   if (size > 16) {
> +   DRM_WARN("FRU Product Number is larger than 16 characters. 
> This is likely a mistake");
> +   size = 16;
> +   }
> memcpy(adev->product_number, &buff[2], size);
> adev->product_number[size] = '\0';
>
> @@ -146,6 +160,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device 
> *adev)
> return size;
> }
>
> +   /* Serial number should only be 16 characters. Any more,
> +* and something could be wrong. Cap it at 16 to be safe
> +*/
> +   if (size > 16) {
> +   DRM_WARN("FRU Serial Number is larger than 16 characters. 
> This is likely a mistake");
> +   size = 16;
> +   }
> memcpy(adev->serial, &buff[2], size);
> adev->serial[size] = '\0';
>
> --
> 2.17.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Expose TA FW version in fw_version file

2020-03-24 Thread Alex Deucher
On Tue, Mar 24, 2020 at 7:42 AM Kent Russell  wrote:
>
> Reporting the fw_version just returns 0, the actual version is kept as
> ta_*_ucode_version. This is the same as the feature reported in
> the amdgpu_firmware_info debugfs file.
>
> Signed-off-by: Kent Russell 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> index 9ef312428231..65bb25e31d45 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
> @@ -403,8 +403,8 @@ FW_VERSION_ATTR(mec_fw_version, 0444, gfx.mec_fw_version);
>  FW_VERSION_ATTR(mec2_fw_version, 0444, gfx.mec2_fw_version);
>  FW_VERSION_ATTR(sos_fw_version, 0444, psp.sos_fw_version);
>  FW_VERSION_ATTR(asd_fw_version, 0444, psp.asd_fw_version);
> -FW_VERSION_ATTR(ta_ras_fw_version, 0444, psp.ta_fw_version);
> -FW_VERSION_ATTR(ta_xgmi_fw_version, 0444, psp.ta_fw_version);
> +FW_VERSION_ATTR(ta_ras_fw_version, 0444, psp.ta_ras_ucode_version);
> +FW_VERSION_ATTR(ta_xgmi_fw_version, 0444, psp.ta_xgmi_ucode_version);
>  FW_VERSION_ATTR(smc_fw_version, 0444, pm.fw_version);
>  FW_VERSION_ATTR(sdma_fw_version, 0444, sdma.instance[0].fw_version);
>  FW_VERSION_ATTR(sdma2_fw_version, 0444, sdma.instance[1].fw_version);
> --
> 2.17.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-24 Thread Andrey Grodzovsky
This is only for the guilty job which was removed from the 
ring_mirror_list due to completion and hence will not be resubmitted by 
recovery and will not be freed by the usual flow in 
drm_sched_get_cleanup_job (see drm_sched_stop)


Andrey

On 3/24/20 10:45 AM, Pan, Xinhui wrote:


[AMD Official Use Only - Internal Distribution Only]


Does this issue occur during GPU recovery?
I just checked the code: a fence timeout will free the job and put its
fence, but GPU recovery might resubmit the job.

Correct me if I am wrong.

*From:* amd-gfx  on behalf of 
Andrey Grodzovsky 

*Sent:* Tuesday, March 24, 2020 11:40:06 AM
*To:* Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 

*Cc:* amd-gfx@lists.freedesktop.org 
*Subject:* Re: [PATCH] drm/amdgpu: hold the reference of finished fence

On 3/23/20 10:22 AM, Yintian Tao wrote:
> There is one corner case at dma_fence_signal_locked
> which will raise the NULL pointer problem just like below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>    ->test_and_set_bit
> here trigger dma_fence_release happen due to the zero of fence refcount.


Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ? The finished fence is created with
refcount set to 1 in drm_sched_fence_create->dma_fence_init and then the
refcount is decremented in
drm_sched_main->amdgpu_job_free_cb->drm_sched_job_cleanup. This should
only happen after fence is already signaled (see
drm_sched_get_cleanup_job). On top of that the finished fence is
referenced from other places (e.g. entity->last_scheduled e.t.c)...


>
> ->dma_fence_put
>  ->dma_fence_release
>    ->drm_sched_fence_release_scheduled
>    ->call_rcu
> here make the union field “cb_list” at finished fence
> to NULL because struct rcu_head contains two pointer
> which is same as struct list_head cb_list
>
> Therefore, hold the reference of the finished fence at drm_sched_process_job
> to prevent the null pointer during the finished fence's dma_fence_signal
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 
0008

> [  732.914815] #PF: supervisor write access in kernel mode
> [  732.915731] #PF: error_code(0x0002) - not-present page
> [  732.916621] PGD 0 P4D 0
> [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   
OE 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 
1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014

> [  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50
> [  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
> [  732.941910]  dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50
> [  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
> [  732.944393] sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> v2: hold the finished fence at drm_sched_process_job instead of
>  amdgpu_fence_process
> v3: resume the blank line
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c

> index a18eabf692e4..8e731ed0d9d9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -651,7 +651,9 @@ static void drm_sched_process_job(struct 
dma_fence *f, struct dma_fence_cb *cb)

>
>    trace_drm_sched_process_job(s_fence);
>
> + dma_fence_get(&s_fence->finished);
>    drm_sched_fence_finished(s_fence);


If the fence was already released during call to
drm_sched_fence_finished->dma_fence_signal->... why is it safe to
reference the s_fence just before that call ? Can't it already be
released by this time ?

Andrey



> + dma_fence_put(&s_fence->finished);
> wake_up_interruptible(&sched->wake_up_worker);
>   }
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: hold the reference of finished fence

2020-03-24 Thread Pan, Xinhui
[AMD Official Use Only - Internal Distribution Only]

Does this issue occur during GPU recovery?
I just checked the code: a fence timeout will free the job and put its fence, but GPU
recovery might resubmit the job.
Correct me if I am wrong.

From: amd-gfx  on behalf of Andrey 
Grodzovsky 
Sent: Tuesday, March 24, 2020 11:40:06 AM
To: Tao, Yintian ; Koenig, Christian 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org 
Subject: Re: [PATCH] drm/amdgpu: hold the reference of finished fence


On 3/23/20 10:22 AM, Yintian Tao wrote:
> There is one corner case at dma_fence_signal_locked
> which will raise the NULL pointer problem just like below.
> ->dma_fence_signal
>  ->dma_fence_signal_locked
>->test_and_set_bit
> here trigger dma_fence_release happen due to the zero of fence refcount.


Did you find out why the zero refcount on the finished fence happens
before the fence was signaled ? The finished fence is created with
refcount set to 1 in drm_sched_fence_create->dma_fence_init and then the
refcount is decremented in
drm_sched_main->amdgpu_job_free_cb->drm_sched_job_cleanup. This should
only happen after fence is already signaled (see
drm_sched_get_cleanup_job). On top of that the finished fence is
referenced from other places (e.g. entity->last_scheduled e.t.c)...


>
> ->dma_fence_put
>  ->dma_fence_release
>->drm_sched_fence_release_scheduled
>->call_rcu
> here make the union field “cb_list” at finished fence
> to NULL because struct rcu_head contains two pointer
> which is same as struct list_head cb_list
>
> Therefore, hold the reference of the finished fence at drm_sched_process_job
> to prevent the null pointer during the finished fence's dma_fence_signal
>
> [  732.912867] BUG: kernel NULL pointer dereference, address: 0008
> [  732.914815] #PF: supervisor write access in kernel mode
> [  732.915731] #PF: error_code(0x0002) - not-present page
> [  732.916621] PGD 0 P4D 0
> [  732.917072] Oops: 0002 [#1] SMP PTI
> [  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G   OE 
> 5.4.0-rc7 #1
> [  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
> [  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
> [  732.938569] Call Trace:
> [  732.939003]  
> [  732.939364]  dma_fence_signal+0x29/0x50
> [  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
> [  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
> [  732.941910]  dma_fence_signal_locked+0x85/0x100
> [  732.942692]  dma_fence_signal+0x29/0x50
> [  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
> [  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
>
> v2: hold the finished fence at drm_sched_process_job instead of
>  amdgpu_fence_process
> v3: resume the blank line
>
> Signed-off-by: Yintian Tao 
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index a18eabf692e4..8e731ed0d9d9 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -651,7 +651,9 @@ static void drm_sched_process_job(struct dma_fence *f, 
> struct dma_fence_cb *cb)
>
>trace_drm_sched_process_job(s_fence);
>
> + dma_fence_get(&s_fence->finished);
>drm_sched_fence_finished(s_fence);


If the fence was already released during call to
drm_sched_fence_finished->dma_fence_signal->... why is it safe to
reference the s_fence just before that call ? Can't it already be
released by this time ?

Andrey



> + dma_fence_put(&s_fence->finished);
>wake_up_interruptible(&sched->wake_up_worker);
>   }
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: Fix FRU data checking

2020-03-24 Thread Kent Russell
Ensure that when we memcpy, we don't end up copying more data than
the struct supports. For now, this is 16 characters for product number
and serial number, and 32 chars for product name

Signed-off-by: Kent Russell 
---
 .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 21 +++
 1 file changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
index 6f5e98fda181..bfe4259f9508 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c
@@ -116,6 +116,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
return size;
}
 
+   /* Product name should only be 32 characters. Any more,
+* and something could be wrong. Cap it at 32 to be safe
+*/
+   if (size > 32) {
+   DRM_WARN("FRU Product Name is larger than 32 characters. This is likely a mistake");
+   size = 32;
+   }
/* Start at 2 due to buff using fields 0 and 1 for the address */
memcpy(adev->product_name, &buff[2], size);
adev->product_name[size] = '\0';
@@ -127,6 +134,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
return size;
}
 
+   /* Product number should only be 16 characters. Any more,
+* and something could be wrong. Cap it at 16 to be safe
+*/
+   if (size > 16) {
+   DRM_WARN("FRU Product Number is larger than 16 characters. This 
is likely a mistake");
+   size = 16;
+   }
memcpy(adev->product_number, &buff[2], size);
adev->product_number[size] = '\0';
 
@@ -146,6 +160,13 @@ int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
return size;
}
 
+   /* Serial number should only be 16 characters. Any more,
+* and something could be wrong. Cap it at 16 to be safe
+*/
+   if (size > 16) {
+   DRM_WARN("FRU Serial Number is larger than 16 characters. This 
is likely a mistake");
+   size = 16;
+   }
memcpy(adev->serial, &buff[2], size);
adev->serial[size] = '\0';
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: Expose TA FW version in fw_version file

2020-03-24 Thread Kent Russell
Reporting the fw_version just returns 0; the actual version is kept in
ta_*_ucode_version. This is the same as the feature version reported in
the amdgpu_firmware_info debugfs file.

Signed-off-by: Kent Russell 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index 9ef312428231..65bb25e31d45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -403,8 +403,8 @@ FW_VERSION_ATTR(mec_fw_version, 0444, gfx.mec_fw_version);
 FW_VERSION_ATTR(mec2_fw_version, 0444, gfx.mec2_fw_version);
 FW_VERSION_ATTR(sos_fw_version, 0444, psp.sos_fw_version);
 FW_VERSION_ATTR(asd_fw_version, 0444, psp.asd_fw_version);
-FW_VERSION_ATTR(ta_ras_fw_version, 0444, psp.ta_fw_version);
-FW_VERSION_ATTR(ta_xgmi_fw_version, 0444, psp.ta_fw_version);
+FW_VERSION_ATTR(ta_ras_fw_version, 0444, psp.ta_ras_ucode_version);
+FW_VERSION_ATTR(ta_xgmi_fw_version, 0444, psp.ta_xgmi_ucode_version);
 FW_VERSION_ATTR(smc_fw_version, 0444, pm.fw_version);
 FW_VERSION_ATTR(sdma_fw_version, 0444, sdma.instance[0].fw_version);
 FW_VERSION_ATTR(sdma2_fw_version, 0444, sdma.instance[1].fw_version);
-- 
2.17.1



[RFC PATCH 1/1] drm/amdgpu: rework sched_list generation

2020-03-24 Thread Nirmoy Das
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores the sched_list for all HW IPs in one big
array in struct amdgpu_device, which makes amdgpu_ctx_init_entity()
much leaner.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|   4 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c| 148 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h|   3 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h|   5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.h   |   2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  28 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h   |   2 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h|   4 -
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  13 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c  |   5 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  |   5 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  |  11 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |  13 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v1_0.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |   6 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |   3 +-
 drivers/gpu/drm/amd/amdgpu/si_dma.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c  |   7 +-
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/vce_v2_0.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |   3 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  |   6 +-
 35 files changed, 131 insertions(+), 203 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7dd74253e7b6..ac2ab2933e12 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -984,6 +984,10 @@ struct amdgpu_device {
 	char	product_number[16];
 	char	product_name[32];
 	char	serial[16];
+
+   /* drm scheduler list */
+   struct drm_gpu_scheduler *ctx_scheds[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][AMDGPU_MAX_COMPUTE_RINGS];
+   uint32_t ctx_num_scheds[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
 };
 
 static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 6ed36a2c5f73..24e98d674570 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -72,6 +72,15 @@ static enum gfx_pipe_priority 
amdgpu_ctx_sched_prio_to_compute_prio(enum drm_sch
}
 }
 
+static unsigned int amdgpu_ctx_sched_prio_to_hw_prio(enum drm_sched_priority prio,
+						     const int hw_ip)
+{
+   if (hw_ip == AMDGPU_HW_IP_COMPUTE)
+   return amdgpu_ctx_sched_prio_to_compute_prio(prio);
+
+   return AMDGPU_RING_PRIO_DEFAULT;
+}
+
 static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, const u32 hw_ip, const u32 ring)
 {
struct amdgpu_device *adev = ctx->adev;
@@ -90,52 +99,19 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, 
const u32 hw_ip, const
entity->sequence = 1;
priority = (ctx->override_priority == DRM_SCHED_PRIORITY_UNSET) ?
ctx->init_priority : ctx->override_priority;
-   switch (hw_ip) {
-   case AMDGPU_HW_IP_GFX:
-   sched = &adev->gfx.gfx_ring[0].sched;
-   scheds = &sched;
-   num_scheds = 1;
-   break;
-   case AMDGPU_HW_IP_COMPUTE:
-   hw_prio = amdgpu_ctx_sched_prio_to_compute_prio(priority);
-   scheds = adev->gfx.compute_prio_sched[hw_prio];
-   num_scheds = adev->gfx.num_compute_sched[hw_prio];
-   break;
-   case AMDGPU_HW_IP_DMA:
-   scheds = adev->sdma.sdma_sched;
-   num_scheds = adev->sdma.num_sdma_sched;
-   break;
-   case AMDGPU_HW_IP_UVD:
-   sched = &adev->uvd.inst[0].ring.sched;
-   scheds = &sched;
-   num_scheds = 1;
-   break;
-   case AMDGPU_HW_IP_VCE:
-   sched = &adev->vce.ring[

[PATCH 2/4] drm/amdgpu: purge ip_discovery headers

2020-03-24 Thread Monk Liu
those two headers are not needed for ip discovery

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 27d8ae1..37e1fcf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -23,9 +23,7 @@
 
 #include "amdgpu.h"
 #include "amdgpu_discovery.h"
-#include "soc15_common.h"
 #include "soc15_hw_ip.h"
-#include "nbio/nbio_2_3_offset.h"
 #include "discovery.h"
 
 #define mmRCC_CONFIG_MEMSIZE   0xde3
-- 
2.7.4



[PATCH 1/4] drm/amdgpu: don't try to reserve training bo for sriov

2020-03-24 Thread Monk Liu
1) SRIOV guest KMD doesn't care about the training buffer
2) if we reserved the training buffer it would overlap with the IP discovery
reservation, because the training buffer is at vram_size - 0x8000 and
IP discovery is at (vram_size - 0x1 => vram_size - 1)

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 665db23..54cfa3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1859,9 +1859,11 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	 * The reserved vram for memory training must be pinned to the specified
 	 * place on the VRAM, so reserve it early.
 	 */
-	r = amdgpu_ttm_training_reserve_vram_init(adev);
-	if (r)
-		return r;
+	if (!amdgpu_sriov_vf(adev)) {
+		r = amdgpu_ttm_training_reserve_vram_init(adev);
+		if (r)
+			return r;
+	}
 
/* allocate memory as required for VGA
 * This is used for VGA emulation and pre-OS scanout buffers to
-- 
2.7.4



[PATCH 4/4] drm/amdgpu: cleanup all virtualization detection routine

2020-03-24 Thread Monk Liu
we need to move virt detection much earlier because:
1) the HW team has confirmed to us that RCC_IOV_FUNC_IDENTIFIER will always
be at the DE5 (dw) mmio offset from vega10 onward; this way there is no
need to implement a detect_hw_virt() routine in each nbio/chip file.
For VI SRIOV chips (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at
0x1503

2) we need to know we are an SRIOV VF before we do IP discovery, because
the IP discovery content will be updated by the host every time it receives
a new "REQ_GPU_INIT_DATA" request from the guest (there will be patches
for this new handshake soon).

Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h   |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 33 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  6 
 drivers/gpu/drm/amd/amdgpu/cik.c   |  8 --
 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c | 18 
 drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c | 18 
 drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c |  7 -
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 18 
 drivers/gpu/drm/amd/amdgpu/nv.c|  2 --
 drivers/gpu/drm/amd/amdgpu/si.c|  8 --
 drivers/gpu/drm/amd/amdgpu/soc15.c |  1 -
 drivers/gpu/drm/amd/amdgpu/vi.c| 24 
 .../amd/include/asic_reg/nbif/nbif_6_1_offset.h|  2 ++
 .../amd/include/asic_reg/nbio/nbio_7_0_offset.h|  2 ++
 .../amd/include/asic_reg/nbio/nbio_7_4_offset.h|  2 ++
 16 files changed, 48 insertions(+), 105 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e55dbcd..ca609b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3057,6 +3057,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
if (amdgpu_mes && adev->asic_type >= CHIP_NAVI10)
adev->enable_mes = true;
 
+   /* detect hw virtualization here */
+   amdgpu_detect_virtualization(adev);
+
if (amdgpu_discovery && adev->asic_type >= CHIP_NAVI10) {
r = amdgpu_discovery_init(adev);
if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
index 919bd56..edaac24 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h
@@ -77,7 +77,6 @@ struct amdgpu_nbio_funcs {
  u32 *flags);
void (*ih_control)(struct amdgpu_device *adev);
void (*init_registers)(struct amdgpu_device *adev);
-   void (*detect_hw_virt)(struct amdgpu_device *adev);
void (*remap_hdp_registers)(struct amdgpu_device *adev);
 	void (*handle_ras_controller_intr_no_bifring)(struct amdgpu_device *adev);
 	void (*handle_ras_err_event_athub_intr_no_bifring)(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index adc813c..43a1ee3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -287,3 +287,36 @@ void amdgpu_virt_init_data_exchange(struct amdgpu_device 
*adev)
}
}
 }
+
+void amdgpu_detect_virtualization(struct amdgpu_device *adev)
+{
+   uint32_t reg;
+
+   switch (adev->asic_type) {
+   case CHIP_TONGA:
+   case CHIP_FIJI:
+   reg = RREG32(mmBIF_IOV_FUNC_IDENTIFIER);
+   break;
+   case CHIP_VEGA10:
+   case CHIP_VEGA20:
+   case CHIP_NAVI10:
+   case CHIP_NAVI12:
+   case CHIP_ARCTURUS:
+   reg = RREG32(mmRCC_IOV_FUNC_IDENTIFIER);
+   break;
+   default: /* other chips don't support SRIOV */
+   reg = 0;
+   break;
+   }
+
+   if (reg & 1)
+   adev->virt.caps |= AMDGPU_SRIOV_CAPS_IS_VF;
+
+   if (reg & 0x8000)
+   adev->virt.caps |= AMDGPU_SRIOV_CAPS_ENABLE_IOV;
+
+   if (!reg) {
+   if (is_virtual_machine())   /* passthrough mode excludes sriov mode */
+   adev->virt.caps |= AMDGPU_PASSTHROUGH_MODE;
+   }
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 0a95b13..74f9843 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -30,6 +30,11 @@
 #define AMDGPU_PASSTHROUGH_MODE (1 << 3) /* the whole GPU is passed through for VM */
 #define AMDGPU_SRIOV_CAPS_RUNTIME (1 << 4) /* is out of full access mode */
 
+/* all ASICs after AI use this offset */
+#define mmRCC_IOV_FUNC_IDENTIFIER 0xDE5
+/* tonga/fiji use this offset */
+#define mmBIF_IOV_FUNC_IDENTIFIER 0x1503
+
 struct amdgpu_mm_table {
struct amdgpu_bo*bo

[PATCH 3/4] drm/amdgpu: amends feature bits for MM bandwidth mgr

2020-03-24 Thread Monk Liu
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index f0128f7..0a95b13 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -83,6 +83,8 @@ enum AMDGIM_FEATURE_FLAG {
AMDGIM_FEATURE_GIM_LOAD_UCODES   = 0x2,
/* VRAM LOST by GIM */
AMDGIM_FEATURE_GIM_FLR_VRAMLOST = 0x4,
+   /* MM bandwidth */
+   AMDGIM_FEATURE_GIM_MM_BW_MGR = 0x8,
/* PP ONE VF MODE in GIM */
AMDGIM_FEATURE_PP_ONE_VF = (1 << 4),
 };
-- 
2.7.4



Re: [PATCH v1 0/3] drm: drm_encoder_init() => drm_encoder_init_funcs()

2020-03-24 Thread Thomas Zimmermann
Hi Sam

Am 13.03.20 um 21:17 schrieb Sam Ravnborg:
> Thomas Zimmermann had made a nice patch-set that introduced
> drm_simple_encoder_init() which is already present in drm-misc-next.
> 
> While looking at this it was suddenly obvious to me that
> this was functionality that really should be included in drm_encoder.c.
> The case where the core could handle the callback is pretty
> common and not part of the simple pipeline.

The original patchset put the new function into the core implementation
and was shot down for this. So it ended up in the simple-KMS helpers.

> 
> So after some dialog on dri-devel the conclusion was to go for
> a change like this:
> 
> drm_encoder_init_funcs() for all users that specified a
> drm_encoder_funcs to extend the functionality.
> 
> drm_encoder_init() for all users that did not
> need to extend the basic functionality with
> drm_encoder_funcs.

TBH, my take-away was to keep the core as it is and maybe rename
drm_simple_encoder_init() to some better name.

> 
> A similar approach with a _funcs() prefix is used elsewhere in drm/

IMHO, there are a few things to consider:

From grepping, I could only find drm_gem_fb_create_with_funcs(). And the
proposed change would make the encoder's function name inconsistent with
drm_connector_init(), drm_crtc_init(), and others. Finally,
drm_connector_init_with_ddc() was criticized for being mid-layerish and
could lead to many combinations of postfixes (e.g., _with_funcs(),
_with_ddc(), _with_ddc_and_funcs(), etc).

If there is consent that the drm_simple_encoder_init() should go away,
I'd propose to change drm_encoder_init(). It could use a default
implementation for funcs, if no funcs argument has been specified. We
already have such behavior for some GEM callbacks. In later patches,
drm_gem_fb_create_with_funcs() and drm_connector_init_with_ddc() could
go away as well.

Best regards
Thomas


> 
> This required a rename of the existing users, and
> a follow-up patch that moves drm_simple_encoder_init()
> to drm_encoder.c
> 
> Patch 3 in this set demonstrates the use of drm_encoder_init().
> There are many more drivers that can be converted as Thomas
> has already demonstrated.
> 
> This is all based on work done by Thomas Zimmermann,
> I just wanted to implement my suggestion so
> we could select the best way forward.
> 
> Note: Daniel Vetter has hinted the approach implemented
> here smelled like middle-layer.
> IMO this is not so, it is just a way to handle cleanup
> for the simple cases.
> 
>   Sam
> 
> 
> Sam Ravnborg (3):
>   drm: drm_encoder_init() => drm_encoder_init_funcs()
>   drm: drm_simple_encoder_init() => drm_encoder_init()
>   drm/atmel-hlcdc: Use drm_encoder_init()
> 
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 28 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 28 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 28 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 28 ++---
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c   |  4 +-
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 10 ++---
>  .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 10 ++---
>  drivers/gpu/drm/arc/arcpgu_hdmi.c  |  4 +-
>  drivers/gpu/drm/arc/arcpgu_sim.c   |  4 +-
>  drivers/gpu/drm/ast/ast_mode.c |  3 +-
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c   |  8 +---
>  drivers/gpu/drm/drm_encoder.c  | 49 
> +++---
>  drivers/gpu/drm/drm_encoder_slave.c|  2 +-
>  drivers/gpu/drm/drm_simple_kms_helper.c| 45 +---
>  drivers/gpu/drm/drm_writeback.c|  6 +--
>  drivers/gpu/drm/exynos/exynos_dp.c |  4 +-
>  drivers/gpu/drm/exynos/exynos_drm_dpi.c|  4 +-
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c|  4 +-
>  drivers/gpu/drm/exynos/exynos_drm_vidi.c   |  4 +-
>  drivers/gpu/drm/exynos/exynos_hdmi.c   |  4 +-
>  drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c  |  4 +-
>  drivers/gpu/drm/gma500/cdv_intel_crt.c |  5 ++-
>  drivers/gpu/drm/gma500/cdv_intel_dp.c  |  4 +-
>  drivers/gpu/drm/gma500/cdv_intel_hdmi.c|  4 +-
>  drivers/gpu/drm/gma500/cdv_intel_lvds.c|  6 +--
>  drivers/gpu/drm/gma500/mdfld_dsi_dpi.c |  7 ++--
>  drivers/gpu/drm/gma500/oaktrail_hdmi.c |  6 +--
>  drivers/gpu/drm/gma500/oaktrail_lvds.c |  4 +-
>  drivers/gpu/drm/gma500/psb_intel_lvds.c|  6 +--
>  drivers/gpu/drm/gma500/psb_intel_sdvo.c|  4 +-
>  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c   |  4 +-
>  drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c   |  4 +-
>  drivers/gpu/drm/i2c/tda998x_drv.c  |  5 ++-
>  drivers/gpu/drm/i915/display/icl_dsi.c |  4 +-
>  drivers/gpu/drm/i915/display/intel_crt.c  

Re: [PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:55PM -0300, Jason Gunthorpe wrote:
>   if (pte_none(pte)) {
>   required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
>   if (required_fault)
>   goto fault;
> + *pfn = range->values[HMM_PFN_NONE];
>   return 0;
>   }
>  
> @@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
> unsigned long addr,
>   }
>  
>   required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
> - if (!required_fault)
> + if (!required_fault) {
> + *pfn = range->values[HMM_PFN_NONE];
>   return 0;
> + }

Maybe throw in a goto hole to consolidate the set PFN and return
0 cases?

Otherwise looks good:

Reviewed-by: Christoph Hellwig 


[PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Now that flags are handled on a fine-grained per-page basis this global
flag is redundant and has a confusing overlap with the pfn_flags_mask and
default_flags.

Normalize the HMM_FAULT_SNAPSHOT behavior into one place. Callers needing
the SNAPSHOT behavior should set a pfn_flags_mask and default_flags that
always results in a cleared HMM_PFN_VALID. Then no pages will be faulted,
and HMM_FAULT_SNAPSHOT is not a special flow that overrides the masking
mechanism.

As this is the last flag, also remove the flags argument. If future flags
are needed they can be part of the struct hmm_range function arguments.

Signed-off-by: Jason Gunthorpe 
---
 Documentation/vm/hmm.rst| 12 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_svm.c   |  2 +-
 include/linux/hmm.h |  5 +
 mm/hmm.c| 22 ++
 5 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 95fec596836262..4e3e9362afeb10 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -161,13 +161,11 @@ device must complete the update before the driver 
callback returns.
 When the device driver wants to populate a range of virtual addresses, it can
 use::
 
-  long hmm_range_fault(struct hmm_range *range, unsigned int flags);
+  long hmm_range_fault(struct hmm_range *range);
 
-With the HMM_RANGE_SNAPSHOT flag, it will only fetch present CPU page table
-entries and will not trigger a page fault on missing or non-present entries.
-Without that flag, it does trigger a page fault on missing or read-only entries
-if write access is requested (see below). Page faults use the generic mm page
-fault code path just like a CPU page fault.
+It will trigger a page fault on missing or read-only entries if write access is
+requested (see below). Page faults use the generic mm page fault code path just
+like a CPU page fault.
 
 Both functions copy CPU page table entries into their pfns array argument. Each
 entry in that array corresponds to an address in the virtual range. HMM
@@ -197,7 +195,7 @@ The usage pattern is::
  again:
   range.notifier_seq = mmu_interval_read_begin(&interval_sub);
   down_read(&mm->mmap_sem);
-  ret = hmm_range_fault(&range, HMM_RANGE_SNAPSHOT);
+  ret = hmm_range_fault(&range);
   if (ret) {
   up_read(&mm->mmap_sem);
   if (ret == -EBUSY)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 90821ce5e6cad0..c520290709371b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -856,7 +856,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, 
struct page **pages)
range->notifier_seq = mmu_interval_read_begin(&bo->notifier);
 
down_read(&mm->mmap_sem);
-   r = hmm_range_fault(range, 0);
+   r = hmm_range_fault(range);
up_read(&mm->mmap_sem);
if (unlikely(r <= 0)) {
/*
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 39c731a99937c6..e3797b2d4d1759 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -540,7 +540,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
range.default_flags = 0;
range.pfn_flags_mask = -1UL;
down_read(&mm->mmap_sem);
-   ret = hmm_range_fault(&range, 0);
+   ret = hmm_range_fault(&range);
up_read(&mm->mmap_sem);
if (ret <= 0) {
if (ret == 0 || ret == -EBUSY)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index daee6508a3f609..7475051100c782 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -117,13 +117,10 @@ static inline struct page *hmm_device_entry_to_page(const 
struct hmm_range *rang
return pfn_to_page(entry >> range->pfn_shift);
 }
 
-/* Don't fault in missing PTEs, just snapshot the current state. */
-#define HMM_FAULT_SNAPSHOT (1 << 1)
-
 /*
  * Please see Documentation/vm/hmm.rst for how to use the range API.
  */
-long hmm_range_fault(struct hmm_range *range, unsigned int flags);
+long hmm_range_fault(struct hmm_range *range);
 
 /*
  * HMM_RANGE_DEFAULT_TIMEOUT - default timeout (ms) when waiting for a range
diff --git a/mm/hmm.c b/mm/hmm.c
index c298c936469bbb..43d107a4d9dec6 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -29,7 +29,6 @@
 struct hmm_vma_walk {
 	struct hmm_range *range;
 	unsigned long last;
-	unsigned int flags;
 };
 
 enum {
@@ -112,9 +111,6 @@ static unsigned int hmm_pte_need_fault(const struct 
hmm_vma_walk *hmm_vma_walk,
 {
struct hmm_range *range = hmm_vma_walk->range;
 
-   if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
-   return 0;
-
 

[PATCH v2 hmm 7/9] mm/hmm: do not unconditionally set pfns when returning EBUSY

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

In hmm_vma_handle_pte() and hmm_vma_walk_hugetlb_entry(), if a fault happens
then -EBUSY will be returned and the pfns input flags will have been
destroyed.

For hmm_vma_handle_pte() set HMM_PFN_NONE only on the success returns that
don't otherwise store to pfns.

For hmm_vma_walk_hugetlb_entry() all exit paths already set pfns, so
remove the redundant store.

Fixes: 2aee09d8c116 ("mm/hmm: change hmm_vma_fault() to allow write fault on page basis")
Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index e114110ad498a2..bf77b852f12d3a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -249,11 +249,11 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
unsigned long addr,
pte_t pte = *ptep;
uint64_t orig_pfn = *pfn;
 
-   *pfn = range->values[HMM_PFN_NONE];
if (pte_none(pte)) {
required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
if (required_fault)
goto fault;
+   *pfn = range->values[HMM_PFN_NONE];
return 0;
}
 
@@ -274,8 +274,10 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
unsigned long addr,
}
 
required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, 0);
-   if (!required_fault)
+   if (!required_fault) {
+   *pfn = range->values[HMM_PFN_NONE];
return 0;
+   }
 
if (!non_swap_entry(entry))
goto fault;
@@ -493,7 +495,6 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned 
long hmask,
 
i = (start - range->start) >> PAGE_SHIFT;
orig_pfn = range->pfns[i];
-   range->pfns[i] = range->values[HMM_PFN_NONE];
cpu_flags = pte_to_hmm_pfn_flags(range, entry);
required_fault = hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags);
if (required_fault) {
-- 
2.25.2



Re: [PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn()

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:54PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> swp_offset() should not be called directly, the wrappers are supposed to
> abstract away the encoding of the device_private specific information in
> the swap entry.
> 
> Reviewed-by: Ralph Campbell 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 hmm 4/9] mm/hmm: remove HMM_FAULT_SNAPSHOT

2020-03-24 Thread Christoph Hellwig
>  
> +/*
> + * If the valid flag is masked off, and default_flags doesn't set valid, then
> + * hmm_pte_need_fault() always returns 0.
> + */
> +static bool hmm_can_fault(struct hmm_range *range)
> +{
> + return ((range->flags[HMM_PFN_VALID] & range->pfn_flags_mask) |
> + range->default_flags) &
> +range->flags[HMM_PFN_VALID];
> +}

So my idea behind the helper was to turn this into something readable :)

E.g.

/*
 * We only need to fault if either the default mask requires to fault all
 * pages, or at least the mask allows for individual pages to be faulted.
 */
static bool hmm_can_fault(struct hmm_range *range)
{
return ((range->default_flags | range->pfn_flags_mask) &
range->flags[HMM_PFN_VALID]);
}

In fact, now that I managed to distill it down to this, I'm not even
sure we really need the helper, although the comment really helps.


[PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Most places that return an error code, like -EFAULT, do not set
HMM_PFN_ERROR; only two places do this.

Resolve this inconsistency by never setting the pfns on an error
exit. This doesn't seem like a worthwhile thing to do anyhow.

If for some reason it becomes important, it makes more sense to directly
return the address of the failing page rather than have the caller scan
for the HMM_PFN_ERROR.

No caller inspects the pfns output array if hmm_range_fault() fails.

Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index bf77b852f12d3a..14c33e1225866c 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -77,17 +77,14 @@ static int hmm_vma_fault(unsigned long addr, unsigned long 
end,
 unsigned int required_fault, struct mm_walk *walk)
 {
struct hmm_vma_walk *hmm_vma_walk = walk->private;
-   struct hmm_range *range = hmm_vma_walk->range;
struct vm_area_struct *vma = walk->vma;
-   uint64_t *pfns = range->pfns;
-   unsigned long i = (addr - range->start) >> PAGE_SHIFT;
unsigned int fault_flags = FAULT_FLAG_REMOTE;
 
WARN_ON_ONCE(!required_fault);
hmm_vma_walk->last = addr;
 
if (!vma)
-   goto out_error;
+   return -EFAULT;
 
if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
if (!(vma->vm_flags & VM_WRITE))
@@ -95,15 +92,10 @@ static int hmm_vma_fault(unsigned long addr, unsigned long 
end,
fault_flags |= FAULT_FLAG_WRITE;
}
 
-   for (; addr < end; addr += PAGE_SIZE, i++)
+   for (; addr < end; addr += PAGE_SIZE)
if (handle_mm_fault(vma, addr, fault_flags) & VM_FAULT_ERROR)
-   goto out_error;
-
+   return -EFAULT;
return -EBUSY;
-
-out_error:
-   pfns[i] = range->values[HMM_PFN_ERROR];
-   return -EFAULT;
 }
 
 static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
@@ -291,7 +283,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
unsigned long addr,
 
/* Report error for everything else */
pte_unmap(ptep);
-   *pfn = range->values[HMM_PFN_ERROR];
return -EFAULT;
}
 
@@ -577,9 +568,6 @@ static const struct mm_walk_ops hmm_walk_ops = {
  *
  * This is similar to get_user_pages(), except that it can read the page tables
  * without mutating them (ie causing faults).
- *
- * On error, for one virtual address in the range, the function will mark the
- * corresponding HMM pfn entry with an error flag.
  */
 long hmm_range_fault(struct hmm_range *range)
 {
-- 
2.25.2



Re: [PATCH v2 hmm 8/9] mm/hmm: do not set pfns when returning an error code

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:56PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> Most places that return an error code, like -EFAULT, do not set
> HMM_PFN_ERROR; only two places do this.
> 
> Resolve this inconsistency by never setting the pfns on an error
> exit. This doesn't seem like a worthwhile thing to do anyhow.
> 
> If for some reason it becomes important, it makes more sense to directly
> return the address of the failing page rather than have the caller scan
> for the HMM_PFN_ERROR.
> 
> No caller inspects the pfns output array if hmm_range_fault() fails.
> 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:57PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> The pagewalker does not call most ops with NULL vma, those are all routed
> to pte_hole instead.

Does ->pte_hole 

> 
> Thus hmm_vma_fault() is only called with a NULL vma from
> hmm_vma_walk_hole(), so hoist the check to there.
> 
> Now it is clear that snapshotting with no vma is a HMM_PFN_ERROR as
> without a vma we have no path to call hmm_vma_fault().
> 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 


[PATCH v2 hmm 1/9] mm/hmm: remove pgmap checking for devmap pages

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

The checking boils down to a racy check of whether the pagemap is still
available. Instead of checking this, rely entirely on the
notifiers, if a pagemap is destroyed then all pages that belong to it must
be removed from the tables and the notifiers triggered.

Reviewed-by: Ralph Campbell 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 50 ++
 1 file changed, 2 insertions(+), 48 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index a491d9aaafe45d..3a2610e0713329 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -28,7 +28,6 @@
 
 struct hmm_vma_walk {
struct hmm_range*range;
-   struct dev_pagemap  *pgmap;
unsigned long   last;
unsigned intflags;
 };
@@ -196,19 +195,8 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
return hmm_vma_fault(addr, end, fault, write_fault, walk);
 
pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-   for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
-   if (pmd_devmap(pmd)) {
-   hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
- hmm_vma_walk->pgmap);
-   if (unlikely(!hmm_vma_walk->pgmap))
-   return -EBUSY;
-   }
+   for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
pfns[i] = hmm_device_entry_from_pfn(range, pfn) | cpu_flags;
-   }
-   if (hmm_vma_walk->pgmap) {
-   put_dev_pagemap(hmm_vma_walk->pgmap);
-   hmm_vma_walk->pgmap = NULL;
-   }
hmm_vma_walk->last = end;
return 0;
 }
@@ -300,15 +288,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
if (fault || write_fault)
goto fault;
 
-   if (pte_devmap(pte)) {
-   hmm_vma_walk->pgmap = get_dev_pagemap(pte_pfn(pte),
- hmm_vma_walk->pgmap);
-   if (unlikely(!hmm_vma_walk->pgmap)) {
-   pte_unmap(ptep);
-   return -EBUSY;
-   }
-   }
-
/*
 * Since each architecture defines a struct page for the zero page, just
 * fall through and treat it like a normal page.
@@ -328,10 +307,6 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
return 0;
 
 fault:
-   if (hmm_vma_walk->pgmap) {
-   put_dev_pagemap(hmm_vma_walk->pgmap);
-   hmm_vma_walk->pgmap = NULL;
-   }
pte_unmap(ptep);
/* Fault any virtual address we were asked to fault */
return hmm_vma_fault(addr, end, fault, write_fault, walk);
@@ -418,16 +393,6 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
return r;
}
}
-   if (hmm_vma_walk->pgmap) {
-   /*
-* We do put_dev_pagemap() here and not in hmm_vma_handle_pte()
-* so that we can leverage get_dev_pagemap() optimization which
-* will not re-take a reference on a pgmap if we already have
-* one.
-*/
-   put_dev_pagemap(hmm_vma_walk->pgmap);
-   hmm_vma_walk->pgmap = NULL;
-   }
pte_unmap(ptep - 1);
 
hmm_vma_walk->last = addr;
@@ -491,20 +456,9 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
}
 
pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-   for (i = 0; i < npages; ++i, ++pfn) {
-   hmm_vma_walk->pgmap = get_dev_pagemap(pfn,
- hmm_vma_walk->pgmap);
-   if (unlikely(!hmm_vma_walk->pgmap)) {
-   ret = -EBUSY;
-   goto out_unlock;
-   }
+   for (i = 0; i < npages; ++i, ++pfn)
pfns[i] = hmm_device_entry_from_pfn(range, pfn) |
  cpu_flags;
-   }
-   if (hmm_vma_walk->pgmap) {
-   put_dev_pagemap(hmm_vma_walk->pgmap);
-   hmm_vma_walk->pgmap = NULL;
-   }
hmm_vma_walk->last = end;
goto out_unlock;
}
-- 
2.25.2



[PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Delete several functions that are never called, fix some desync between
comments and structure content, toss the now out of date top of file
header, and move one function only used by hmm.c into hmm.c

Signed-off-by: Jason Gunthorpe 
---
 include/linux/hmm.h | 104 +---
 mm/hmm.c|  24 +++---
 2 files changed, 19 insertions(+), 109 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index bb6be4428633a8..daee6508a3f609 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -3,58 +3,8 @@
  * Copyright 2013 Red Hat Inc.
  *
  * Authors: Jérôme Glisse 
- */
-/*
- * Heterogeneous Memory Management (HMM)
- *
- * See Documentation/vm/hmm.rst for reasons and overview of what HMM is and it
- * is for. Here we focus on the HMM API description, with some explanation of
- * the underlying implementation.
- *
- * Short description: HMM provides a set of helpers to share a virtual address
- * space between CPU and a device, so that the device can access any valid
- * address of the process (while still obeying memory protection). HMM also
- * provides helpers to migrate process memory to device memory, and back. Each
- * set of functionality (address space mirroring, and migration to and from
- * device memory) can be used independently of the other.
- *
- *
- * HMM address space mirroring API:
- *
- * Use HMM address space mirroring if you want to mirror a range of the CPU
- * page tables of a process into a device page table. Here, "mirror" means "keep
- * synchronized". Prerequisites: the device must provide the ability to write-
- * protect its page tables (at PAGE_SIZE granularity), and must be able to
- * recover from the resulting potential page faults.
  *
- * HMM guarantees that at any point in time, a given virtual address points to
- * either the same memory in both CPU and device page tables (that is: CPU and
- * device page tables each point to the same pages), or that one page table (CPU
- * or device) points to no entry, while the other still points to the old page
- * for the address. The latter case happens when the CPU page table update
- * happens first, and then the update is mirrored over to the device page table.
- * This does not cause any issue, because the CPU page table cannot start
- * pointing to a new page until the device page table is invalidated.
- *
- * HMM uses mmu_notifiers to monitor the CPU page tables, and forwards any
- * updates to each device driver that has registered a mirror. It also provides
- * some API calls to help with taking a snapshot of the CPU page table, and to
- * synchronize with any updates that might happen concurrently.
- *
- *
- * HMM migration to and from device memory:
- *
- * HMM provides a set of helpers to hotplug device memory as ZONE_DEVICE, with
- * a new MEMORY_DEVICE_PRIVATE type. This provides a struct page for each page
- * of the device memory, and allows the device driver to manage its memory
- * using those struct pages. Having struct pages for device memory makes
- * migration easier. Because that memory is not addressable by the CPU it must
- * never be pinned to the device; in other words, any CPU page fault can always
- * cause the device memory to be migrated (copied/moved) back to regular memory.
- *
- * A new migrate helper (migrate_vma()) has been added (see mm/migrate.c) that
- * allows use of a device DMA engine to perform the copy operation between
- * regular system memory and device memory.
+ * See Documentation/vm/hmm.rst for reasons and overview of what HMM is.
  */
 #ifndef LINUX_HMM_H
 #define LINUX_HMM_H
@@ -120,9 +70,6 @@ enum hmm_pfn_value_e {
  *
  * @notifier: a mmu_interval_notifier that includes the start/end
  * @notifier_seq: result of mmu_interval_read_begin()
- * @hmm: the core HMM structure this range is active against
- * @vma: the vm area struct for the range
- * @list: all range lock are on a list
  * @start: range virtual start address (inclusive)
  * @end: range virtual end address (exclusive)
  * @pfns: array of pfns (big enough for the range)
@@ -130,8 +77,7 @@ enum hmm_pfn_value_e {
  * @values: pfn value for some special case (none, special, error, ...)
  * @default_flags: default flags for the range (write, read, ... see hmm doc)
  * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter
- * @pfn_shifts: pfn shift value (should be <= PAGE_SHIFT)
- * @valid: pfns array did not change since it has been fill by an HMM function
+ * @pfn_shift: pfn shift value (should be <= PAGE_SHIFT)
  * @dev_private_owner: owner of device private pages
  */
 struct hmm_range {
@@ -171,52 +117,6 @@ static inline struct page *hmm_device_entry_to_page(const struct hmm_range *rang
return pfn_to_page(entry >> range->pfn_shift);
 }
 
-/*
- * hmm_device_entry_to_pfn() - return pfn value store in a device entry
- * @range: range use to decode device entry value
- * @entry: device entry to extract pfn

Re: [PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:53PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> This code can be compiled when CONFIG_TRANSPARENT_HUGEPAGE is off, so
> remove the ifdef.
> 
> The function is only ever called under
> 
>    if (pmd_devmap(pmd) || pmd_trans_huge(pmd))
> 
> Which is statically false if !CONFIG_TRANSPARENT_HUGEPAGE, so the compiler
> reliably eliminates all of this code.
> 
> Reviewed-by: Ralph Campbell 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 


[PATCH v2 hmm 9/9] mm/hmm: return error for non-vma snapshots

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

The pagewalker does not call most ops with NULL vma, those are all routed
to pte_hole instead.

Thus hmm_vma_fault() is only called with a NULL vma from
hmm_vma_walk_hole(), so hoist the check to there.

Now it is clear that snapshotting with no vma is a HMM_PFN_ERROR as
without a vma we have no path to call hmm_vma_fault().

Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 14c33e1225866c..df0574061b37d3 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -83,9 +83,6 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
WARN_ON_ONCE(!required_fault);
hmm_vma_walk->last = addr;
 
-   if (!vma)
-   return -EFAULT;
-
if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
if (!(vma->vm_flags & VM_WRITE))
return -EPERM;
@@ -175,6 +172,11 @@ static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
npages = (end - addr) >> PAGE_SHIFT;
pfns = &range->pfns[i];
required_fault = hmm_range_need_fault(hmm_vma_walk, pfns, npages, 0);
+   if (!walk->vma) {
+   if (required_fault)
+   return -EFAULT;
+   return hmm_pfns_fill(addr, end, range, HMM_PFN_ERROR);
+   }
if (required_fault)
return hmm_vma_fault(addr, end, required_fault, walk);
hmm_vma_walk->last = addr;
-- 
2.25.2



Re: [PATCH v2 hmm 3/9] mm/hmm: remove unused code and tidy comments

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:51PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe 
> 
> Delete several functions that are never called, fix some desync between
> comments and structure content, toss the now out of date top of file
> header, and move one function only used by hmm.c into hmm.c
> 
> Signed-off-by: Jason Gunthorpe 

Looks good,

Reviewed-by: Christoph Hellwig 


[PATCH v2 hmm 0/9] Small hmm_range_fault() cleanups

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

This is v2 of the first simple series, plus a few additional patches with
small adjustments.

This needs an additional patch to the hmm tester:

diff --git a/tools/testing/selftests/vm/hmm-tests.c b/tools/testing/selftests/vm/hmm-tests.c
index 033a12c7ab5b6d..da15471a2bbf9a 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -1274,7 +1274,7 @@ TEST_F(hmm2, snapshot)
/* Check what the device saw. */
m = buffer->mirror;
ASSERT_EQ(m[0], HMM_DMIRROR_PROT_ERROR);
-   ASSERT_EQ(m[1], HMM_DMIRROR_PROT_NONE);
+   ASSERT_EQ(m[1], HMM_DMIRROR_PROT_ERROR);
ASSERT_EQ(m[2], HMM_DMIRROR_PROT_ZERO | HMM_DMIRROR_PROT_READ);
ASSERT_EQ(m[3], HMM_DMIRROR_PROT_READ);
ASSERT_EQ(m[4], HMM_DMIRROR_PROT_WRITE);

v2 changes:
 - Simplify and rename the flags, rework hmm_vma_walk_test in patch 2 (CH)
 - Adjust more comments in patch 3 (CH, Ralph)
 - Put the ugly boolean logic into a function in patch 3 (CH)
 - Update commit message of patch 4 (CH)
 - Adjust formatting in patch 5 (CH)
 Patches 6, 7, 8 are new

v1: https://lore.kernel.org/r/20200320164905.21722-1-...@ziepe.ca

Jason Gunthorpe (9):
  mm/hmm: remove pgmap checking for devmap pages
  mm/hmm: return the fault type from hmm_pte_need_fault()
  mm/hmm: remove unused code and tidy comments
  mm/hmm: remove HMM_FAULT_SNAPSHOT
  mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef
  mm/hmm: use device_private_entry_to_pfn()
  mm/hmm: do not unconditionally set pfns when returning EBUSY
  mm/hmm: do not set pfns when returning an error code
  mm/hmm: return error for non-vma snapshots

 Documentation/vm/hmm.rst|  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |   2 +-
 drivers/gpu/drm/nouveau/nouveau_svm.c   |   2 +-
 include/linux/hmm.h | 109 +
 mm/hmm.c| 312 ++--
 5 files changed, 133 insertions(+), 304 deletions(-)

-- 
2.25.2



[PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

Using two bools instead of a flags return is unnecessary and leads to
bugs. Returning a value is easier for the compiler to check and easier to
pass around the code flow.

Convert the two bools into flags and push the change to all callers.

Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 183 ---
 1 file changed, 81 insertions(+), 102 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 3a2610e0713329..2a0eda1534bcda 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -32,6 +32,12 @@ struct hmm_vma_walk {
unsigned intflags;
 };
 
+enum {
+   HMM_NEED_FAULT = 1 << 0,
+   HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
+   HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
+};
+
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
struct hmm_range *range, enum hmm_pfn_value_e value)
 {
@@ -49,8 +55,7 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
  * hmm_vma_fault() - fault in a range lacking valid pmd or pte(s)
  * @addr: range virtual start address (inclusive)
  * @end: range virtual end address (exclusive)
- * @fault: should we fault or not ?
- * @write_fault: write fault ?
+ * @required_fault: HMM_NEED_* flags
  * @walk: mm_walk structure
  * Return: -EBUSY after page fault, or page fault error
  *
@@ -58,8 +63,7 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
  * or whenever there is no page directory covering the virtual address range.
  */
 static int hmm_vma_fault(unsigned long addr, unsigned long end,
- bool fault, bool write_fault,
- struct mm_walk *walk)
+unsigned int required_fault, struct mm_walk *walk)
 {
struct hmm_vma_walk *hmm_vma_walk = walk->private;
struct hmm_range *range = hmm_vma_walk->range;
@@ -68,13 +72,13 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
unsigned long i = (addr - range->start) >> PAGE_SHIFT;
unsigned int fault_flags = FAULT_FLAG_REMOTE;
 
-   WARN_ON_ONCE(!fault && !write_fault);
+   WARN_ON_ONCE(!required_fault);
hmm_vma_walk->last = addr;
 
if (!vma)
goto out_error;
 
-   if (write_fault) {
+   if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {
if (!(vma->vm_flags & VM_WRITE))
return -EPERM;
fault_flags |= FAULT_FLAG_WRITE;
@@ -91,14 +95,13 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
return -EFAULT;
 }
 
-static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
- uint64_t pfns, uint64_t cpu_flags,
- bool *fault, bool *write_fault)
+static unsigned int hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
+  uint64_t pfns, uint64_t cpu_flags)
 {
struct hmm_range *range = hmm_vma_walk->range;
 
if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
-   return;
+   return 0;
 
/*
 * So we not only consider the individual per page request we also
@@ -114,37 +117,37 @@ static inline void hmm_pte_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 
/* We aren't ask to do anything ... */
if (!(pfns & range->flags[HMM_PFN_VALID]))
-   return;
+   return 0;
 
-   /* If CPU page table is not valid then we need to fault */
-   *fault = !(cpu_flags & range->flags[HMM_PFN_VALID]);
/* Need to write fault ? */
if ((pfns & range->flags[HMM_PFN_WRITE]) &&
-   !(cpu_flags & range->flags[HMM_PFN_WRITE])) {
-   *write_fault = true;
-   *fault = true;
-   }
+   !(cpu_flags & range->flags[HMM_PFN_WRITE]))
+   return HMM_NEED_WRITE_FAULT;
+
+   /* If CPU page table is not valid then we need to fault */
+   if (!(cpu_flags & range->flags[HMM_PFN_VALID]))
+   return HMM_NEED_FAULT;
+   return 0;
 }
 
-static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
-const uint64_t *pfns, unsigned long npages,
-uint64_t cpu_flags, bool *fault,
-bool *write_fault)
+static unsigned int
+hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
+const uint64_t *pfns, unsigned long npages,
+uint64_t cpu_flags)
 {
+   unsigned int required_fault = 0;
unsigned long i;
 
-   if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT) {
-   *fault = *write_fault = false;
-   return;
-   }
+   if (hmm_vma_walk->flags & HMM_FAULT_SNAPSHOT)
+   return 0;
 
-   *fault = *write_fault = false;
for (i = 0; i < npages; ++i) {

[PATCH v2 hmm 6/9] mm/hmm: use device_private_entry_to_pfn()

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

swp_offset() should not be called directly, the wrappers are supposed to
abstract away the encoding of the device_private specific information in
the swap entry.

Reviewed-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index f59e59fb303e95..e114110ad498a2 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -266,7 +266,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 */
if (hmm_is_device_private_entry(range, entry)) {
*pfn = hmm_device_entry_from_pfn(range,
-   swp_offset(entry));
+   device_private_entry_to_pfn(entry));
*pfn |= range->flags[HMM_PFN_VALID];
if (is_write_device_private_entry(entry))
*pfn |= range->flags[HMM_PFN_WRITE];
-- 
2.25.2



Re: [PATCH v2 hmm 2/9] mm/hmm: return the fault type from hmm_pte_need_fault()

2020-03-24 Thread Christoph Hellwig
On Mon, Mar 23, 2020 at 10:14:50PM -0300, Jason Gunthorpe wrote:
> +enum {
> + HMM_NEED_FAULT = 1 << 0,
> + HMM_NEED_WRITE_FAULT = HMM_NEED_FAULT | (1 << 1),
> + HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,

I have to say I find the compound version of HMM_NEED_WRITE_FAULT
way harder to understand than the logic in the previous version,
and would prefer keeping separate bits here.

Mostly because of statements like this:

> + if ((required_fault & HMM_NEED_WRITE_FAULT) == HMM_NEED_WRITE_FAULT) {

which seems rather weird.


[PATCH v2 hmm 5/9] mm/hmm: remove the CONFIG_TRANSPARENT_HUGEPAGE #ifdef

2020-03-24 Thread Jason Gunthorpe
From: Jason Gunthorpe 

This code can be compiled when CONFIG_TRANSPARENT_HUGEPAGE is off, so
remove the ifdef.

The function is only ever called under

   if (pmd_devmap(pmd) || pmd_trans_huge(pmd))

Which is statically false if !CONFIG_TRANSPARENT_HUGEPAGE, so the compiler
reliably eliminates all of this code.

Reviewed-by: Ralph Campbell 
Signed-off-by: Jason Gunthorpe 
---
 mm/hmm.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 43d107a4d9dec6..f59e59fb303e95 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -198,7 +198,6 @@ static inline uint64_t pmd_to_hmm_pfn_flags(struct hmm_range *range, pmd_t pmd)
range->flags[HMM_PFN_VALID];
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
unsigned long end, uint64_t *pfns, pmd_t pmd)
 {
@@ -221,11 +220,6 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
hmm_vma_walk->last = end;
return 0;
 }
-#else /* CONFIG_TRANSPARENT_HUGEPAGE */
-/* stub to allow the code below to compile */
-int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
-   unsigned long end, uint64_t *pfns, pmd_t pmd);
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static inline bool hmm_is_device_private_entry(struct hmm_range *range,
swp_entry_t entry)
-- 
2.25.2
