Re: [PATCH] drm/amdgpu: remove needless usage of #ifdef

2019-09-12 Thread Kuehling, Felix
On 2019-09-12 2:44 a.m., S, Shirish wrote:
> Define sched_policy in case CONFIG_HSA_AMD is not
> enabled; with this there is no need to check for CONFIG_HSA_AMD
> elsewhere in the driver code.
>
> Suggested-by: Felix Kuehling 
> Signed-off-by: Shirish S 

Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-
>   2 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index a1516a3..6ff02bb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -171,6 +171,8 @@ extern int amdgpu_noretry;
>   extern int amdgpu_force_asic_type;
>   #ifdef CONFIG_HSA_AMD
>   extern int sched_policy;
> +#else
> +static const int sched_policy = KFD_SCHED_POLICY_HWS;
>   #endif
>   
>   #ifdef CONFIG_DRM_AMDGPU_SI
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 740638e..3b5282b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1623,11 +1623,7 @@ static int amdgpu_device_ip_early_init(struct 
> amdgpu_device *adev)
>   }
>   
>   adev->pm.pp_feature = amdgpu_pp_feature_mask;
> - if (amdgpu_sriov_vf(adev)
> - #ifdef CONFIG_HSA_AMD
> - || sched_policy == KFD_SCHED_POLICY_NO_HWS
> - #endif
> - )
> + if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
>   adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
>   
>   for (i = 0; i < adev->num_ip_blocks; i++) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD (V2)

2019-09-12 Thread Kuehling, Felix
On 2019-09-12 2:58 a.m., S, Shirish wrote:
> On 9/12/2019 3:29 AM, Kuehling, Felix wrote:
>> On 2019-09-11 2:52 a.m., S, Shirish wrote:
>>> If CONFIG_HSA_AMD is not set, build fails:
>>>
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function 
>>> `amdgpu_device_ip_early_init':
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to 
>>> `sched_policy'
>>>
>>> Use CONFIG_HSA_AMD to guard this.
>>>
>>> Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W 
>>> scheduling policy")
>>>
>>> V2: declare sched_policy in amdgpu.h and remove changes in amdgpu_device.c
>> Which branch is this for? V1 of this patch was already submitted to
>> amd-staging-drm-next. So unless you're planning to revert v1 and submit
>> v2, I was expecting to see a change that fixes up the previous patch,
>> rather than a patch that replaces it.
> Have sent a patch that fixes up previous patch as well.
>
> Apparently, I did not send the revert but my plan was to revert and only
> then submit V2.

Reverts must be reviewed too. If you're planning to submit a revert, 
please do include it in code review. That also avoids this type of 
confusion.

I'll review your other patch.

Thanks,
   Felix


>
> Anyway, both work for me as long as the kernel builds.
>
> Regards,
>
> Shirish S
>
>> Regards,
>>  Felix
>>
>>
>>> Signed-off-by: Shirish S 
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 1030cb3..6ff02bb 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -169,7 +169,11 @@ extern int amdgpu_discovery;
>>> extern int amdgpu_mes;
>>> extern int amdgpu_noretry;
>>> extern int amdgpu_force_asic_type;
>>> +#ifdef CONFIG_HSA_AMD
>>> extern int sched_policy;
>>> +#else
>>> +static const int sched_policy = KFD_SCHED_POLICY_HWS;
>>> +#endif
>>> 
>>> #ifdef CONFIG_DRM_AMDGPU_SI
>>> extern int amdgpu_si_support;

Re: [PATCH] Revert "drm/amdgpu/nbio7.4: add hw bug workaround for vega20"

2019-09-12 Thread Kuehling, Felix
On 2019-09-10 3:59 p.m., Russell, Kent wrote:
> This reverts commit e01f2d41895102d824c6b8f5e011dd5e286d5e8b.
>
> VG20 did not require this workaround, as the fix is in the VBIOS.
> Leave VG10/12 workaround as some older shipped cards do not have the
> VBIOS fix in place, and the kernel workaround is required in those
> situations.
>
> Change-Id: I2d7c394ce9d205d97be6acfa5edc4635951fdadf
> Signed-off-by: Kent Russell 

Reviewed-by: Felix Kuehling 


> ---
>   drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 6 --
>   1 file changed, 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c 
> b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> index 2d171bf07ad5..dafd9b7d31d3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> @@ -308,13 +308,7 @@ static void nbio_v7_4_detect_hw_virt(struct 
> amdgpu_device *adev)
>   
>   static void nbio_v7_4_init_registers(struct amdgpu_device *adev)
>   {
> - uint32_t def, data;
> -
> - def = data = RREG32_PCIE(smnPCIE_CI_CNTL);
> - data = REG_SET_FIELD(data, PCIE_CI_CNTL, CI_SLV_ORDERING_DIS, 1);
>   
> - if (def != data)
> - WREG32_PCIE(smnPCIE_CI_CNTL, data);
>   }
>   
>   static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct 
> amdgpu_device *adev)

Re: [PATCH] drm/amdkfd: Swap trap temporary registers in gfx10 trap handler

2019-09-12 Thread Liu, Shaoyun
Reviewed-by: shaoyun liu 

On 2019-09-12 3:13 p.m., Cornwall, Jay wrote:
> ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
> instead, aligning implementation with gfx9 trap handler.
>
> Signed-off-by: Jay Cornwall 
> ---
>   drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h |  6 +++---
>   drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm | 10 +-
>   2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
> index a8cf82d..901fe35 100644
> --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
> @@ -694,10 +694,10 @@ static const uint32_t cwsr_trap_gfx10_hex[] = {
>   0x003f8000, 0x8f6f896f,
>   0x88776f77, 0x8a6eff6e,
>   0x023f8000, 0xb9eef807,
> - 0xb970f812, 0xb971f813,
> - 0x8ff08870, 0xf4051bb8,
> + 0xb97af812, 0xb97bf813,
> + 0x8ffa887a, 0xf4051bbd,
>   0xfa00, 0xbf8cc07f,
> - 0xf4051c38, 0xfa08,
> + 0xf4051ebd, 0xfa08,
>   0xbf8cc07f, 0x87ee6e6e,
>   0xbf840001, 0xbe80206e,
>   0xb971f803, 0x8771ff71,
> diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm 
> b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
> index 3598621..cdaa523 100644
> --- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
> +++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
> @@ -187,12 +187,12 @@ L_FETCH_2ND_TRAP:
>   // Read second-level TBA/TMA from first-level TMA and jump if available.
>   // ttmp[2:5] and ttmp12 can be used (others hold SPI-initialized debug 
> data)
>   // ttmp12 holds SQ_WAVE_STATUS
> - s_getreg_b32ttmp4, hwreg(HW_REG_SHADER_TMA_LO)
> - s_getreg_b32ttmp5, hwreg(HW_REG_SHADER_TMA_HI)
> - s_lshl_b64  [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8
> - s_load_dwordx2  [ttmp2, ttmp3], [ttmp4, ttmp5], 0x0 glc:1   
> // second-level TBA
> + s_getreg_b32ttmp14, hwreg(HW_REG_SHADER_TMA_LO)
> + s_getreg_b32ttmp15, hwreg(HW_REG_SHADER_TMA_HI)
> + s_lshl_b64  [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8
> + s_load_dwordx2  [ttmp2, ttmp3], [ttmp14, ttmp15], 0x0 glc:1 
> // second-level TBA
>   s_waitcnt   lgkmcnt(0)
> - s_load_dwordx2  [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8 glc:1   
> // second-level TMA
> + s_load_dwordx2  [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8 glc:1   
> // second-level TMA
>   s_waitcnt   lgkmcnt(0)
>   s_and_b64   [ttmp2, ttmp3], [ttmp2, ttmp3], [ttmp2, ttmp3]
>   s_cbranch_scc0  L_NO_NEXT_TRAP  
> // second-level trap handler not been set

[PATCH 2/2] drm/connector: Allow max possible encoders to attach to a connector

2019-09-12 Thread José Roberto de Souza
Currently we restrict the number of encoders that can be linked to
a connector to 3; increase it to match the maximum number of encoders
that can be initialized (32).

To do that more efficiently, let's switch from an array of encoder ids
to a bitmask.

v2: Fixing missed return on amdgpu_dm_connector_to_encoder()

Suggested-by: Ville Syrjälä 
Cc: Ville Syrjälä 
Cc: Alex Deucher 
Cc: dri-de...@lists.freedesktop.org
Cc: intel-...@lists.freedesktop.org
Cc: nouv...@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Ville Syrjälä 
Signed-off-by: Dhinakaran Pandiyan 
Signed-off-by: José Roberto de Souza 
---
 .../gpu/drm/amd/amdgpu/amdgpu_connectors.c| 23 +-
 drivers/gpu/drm/amd/amdgpu/dce_virtual.c  |  5 ++-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  8 -
 drivers/gpu/drm/drm_client_modeset.c  |  3 +-
 drivers/gpu/drm/drm_connector.c   | 31 +--
 drivers/gpu/drm/drm_crtc_helper.c |  9 --
 drivers/gpu/drm/drm_probe_helper.c|  3 +-
 drivers/gpu/drm/nouveau/dispnv04/disp.c   |  2 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.c   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_connector.c   |  7 ++---
 drivers/gpu/drm/radeon/radeon_connectors.c| 27 ++--
 include/drm/drm_connector.h   | 18 +--
 12 files changed, 55 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
index ece55c8fa673..d8729285f731 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
@@ -217,11 +217,10 @@ amdgpu_connector_update_scratch_regs(struct drm_connector 
*connector,
struct drm_encoder *encoder;
const struct drm_connector_helper_funcs *connector_funcs = 
connector->helper_private;
bool connected;
-   int i;
 
best_encoder = connector_funcs->best_encoder(connector);
 
-   drm_connector_for_each_possible_encoder(connector, encoder, i) {
+   drm_connector_for_each_possible_encoder(connector, encoder) {
if ((encoder == best_encoder) && (status == 
connector_status_connected))
connected = true;
else
@@ -236,9 +235,8 @@ amdgpu_connector_find_encoder(struct drm_connector 
*connector,
   int encoder_type)
 {
struct drm_encoder *encoder;
-   int i;
 
-   drm_connector_for_each_possible_encoder(connector, encoder, i) {
+   drm_connector_for_each_possible_encoder(connector, encoder) {
if (encoder->encoder_type == encoder_type)
return encoder;
}
@@ -347,10 +345,9 @@ static struct drm_encoder *
 amdgpu_connector_best_single_encoder(struct drm_connector *connector)
 {
struct drm_encoder *encoder;
-   int i;
 
/* pick the first one */
-   drm_connector_for_each_possible_encoder(connector, encoder, i)
+   drm_connector_for_each_possible_encoder(connector, encoder)
return encoder;
 
return NULL;
@@ -1065,9 +1062,8 @@ amdgpu_connector_dvi_detect(struct drm_connector 
*connector, bool force)
/* find analog encoder */
if (amdgpu_connector->dac_load_detect) {
struct drm_encoder *encoder;
-   int i;
 
-   drm_connector_for_each_possible_encoder(connector, encoder, i) {
+   drm_connector_for_each_possible_encoder(connector, encoder) {
if (encoder->encoder_type != DRM_MODE_ENCODER_DAC &&
encoder->encoder_type != DRM_MODE_ENCODER_TVDAC)
continue;
@@ -1117,9 +1113,8 @@ amdgpu_connector_dvi_encoder(struct drm_connector 
*connector)
 {
struct amdgpu_connector *amdgpu_connector = 
to_amdgpu_connector(connector);
struct drm_encoder *encoder;
-   int i;
 
-   drm_connector_for_each_possible_encoder(connector, encoder, i) {
+   drm_connector_for_each_possible_encoder(connector, encoder) {
if (amdgpu_connector->use_digital == true) {
if (encoder->encoder_type == DRM_MODE_ENCODER_TMDS)
return encoder;
@@ -1134,7 +1129,7 @@ amdgpu_connector_dvi_encoder(struct drm_connector 
*connector)
 
/* then check use digitial */
/* pick the first one */
-   drm_connector_for_each_possible_encoder(connector, encoder, i)
+   drm_connector_for_each_possible_encoder(connector, encoder)
return encoder;
 
return NULL;
@@ -1271,9 +1266,8 @@ u16 
amdgpu_connector_encoder_get_dp_bridge_encoder_id(struct drm_connector *conn
 {
struct drm_encoder *encoder;
struct amdgpu_encoder *amdgpu_encoder;
-   int i;
 
-   drm_connector_for_each_possible_encoder(connector, encoder, i) {
+   drm_connector_for_each_possible_encoder(connector, encoder) {

[PATCH] drm/amdkfd: Swap trap temporary registers in gfx10 trap handler

2019-09-12 Thread Cornwall, Jay
ttmp[4:5] hold information useful to the debugger. Use ttmp[14:15]
instead, aligning implementation with gfx9 trap handler.

Signed-off-by: Jay Cornwall 
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h |  6 +++---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index a8cf82d..901fe35 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -694,10 +694,10 @@ static const uint32_t cwsr_trap_gfx10_hex[] = {
0x003f8000, 0x8f6f896f,
0x88776f77, 0x8a6eff6e,
0x023f8000, 0xb9eef807,
-   0xb970f812, 0xb971f813,
-   0x8ff08870, 0xf4051bb8,
+   0xb97af812, 0xb97bf813,
+   0x8ffa887a, 0xf4051bbd,
0xfa00, 0xbf8cc07f,
-   0xf4051c38, 0xfa08,
+   0xf4051ebd, 0xfa08,
0xbf8cc07f, 0x87ee6e6e,
0xbf840001, 0xbe80206e,
0xb971f803, 0x8771ff71,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm 
b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
index 3598621..cdaa523 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm
@@ -187,12 +187,12 @@ L_FETCH_2ND_TRAP:
// Read second-level TBA/TMA from first-level TMA and jump if available.
// ttmp[2:5] and ttmp12 can be used (others hold SPI-initialized debug 
data)
// ttmp12 holds SQ_WAVE_STATUS
-   s_getreg_b32ttmp4, hwreg(HW_REG_SHADER_TMA_LO)
-   s_getreg_b32ttmp5, hwreg(HW_REG_SHADER_TMA_HI)
-   s_lshl_b64  [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8
-   s_load_dwordx2  [ttmp2, ttmp3], [ttmp4, ttmp5], 0x0 glc:1   
// second-level TBA
+   s_getreg_b32ttmp14, hwreg(HW_REG_SHADER_TMA_LO)
+   s_getreg_b32ttmp15, hwreg(HW_REG_SHADER_TMA_HI)
+   s_lshl_b64  [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8
+   s_load_dwordx2  [ttmp2, ttmp3], [ttmp14, ttmp15], 0x0 glc:1 
// second-level TBA
s_waitcnt   lgkmcnt(0)
-   s_load_dwordx2  [ttmp4, ttmp5], [ttmp4, ttmp5], 0x8 glc:1   
// second-level TMA
+   s_load_dwordx2  [ttmp14, ttmp15], [ttmp14, ttmp15], 0x8 glc:1   
// second-level TMA
s_waitcnt   lgkmcnt(0)
s_and_b64   [ttmp2, ttmp3], [ttmp2, ttmp3], [ttmp2, ttmp3]
s_cbranch_scc0  L_NO_NEXT_TRAP  
// second-level trap handler not been set
-- 
2.7.4


Re: [PATCH] drm/amd/display: Create plane rotation property

2019-09-12 Thread Pierre-Loup A. Griffais

On 9/12/19 10:22 AM, Harry Wentland wrote:

On 2019-09-12 1:01 p.m., Kazlauskas, Nicholas wrote:

On 2019-09-12 12:44 p.m., Pierre-Loup A. Griffais wrote:

It's otherwise properly supported, just needs exposing to userspace.

Signed-off-by: Pierre-Loup A. Griffais 

I know IGT has some tests for plane rotation, do you happen to know what
tests pass or fail when exposing this?

I think DCN1 (Raven) should work as expected but I'd be concerned about
DCE or DCN2. I think we have had some cursor bugs in the past with
cursor rotation but they might only be exposed when used in conjunction
with overlay planes.



Windows guys had a fix (in DC) for cursor with HW rotation on DCN a few
weeks ago. That might have fixed these issues.

We should still make sure we can pass IGT tests that do rotation.

How did you test? Weston?


I've tested it with a patched kmscube to add the rotation property in 
the atomic path. 90, 180 and 270 all seem happy on Raven with that 
setup. I've not tested any other chip at this point.

Harry


I'd just like to make sure there's suitable testing at least if we're
going to expose this to userspace.

Nicholas Kazlauskas


---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
   1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 45be7a2132bb..3772763c6449 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4680,6 +4680,7 @@ static int amdgpu_dm_plane_init(struct 
amdgpu_display_manager *dm,
uint32_t formats[32];
int num_formats;
int res = -EPERM;
+   unsigned int supported_rotations;
   
   	num_formats = get_plane_formats(plane, plane_cap, formats,
					ARRAY_SIZE(formats));
@@ -4711,6 +4712,13 @@ static int amdgpu_dm_plane_init(struct 
amdgpu_display_manager *dm,
DRM_COLOR_YCBCR_BT709, DRM_COLOR_YCBCR_LIMITED_RANGE);
}
   
+	supported_rotations =
+		DRM_MODE_ROTATE_0 | DRM_MODE_ROTATE_90 |
+   DRM_MODE_ROTATE_180 | DRM_MODE_ROTATE_270;
+
+   drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
+  supported_rotations);
+
	drm_plane_helper_add(plane, &dm_plane_helper_funcs);
   
   	/* Create (reset) the plane state */


Re: [PATCH] drm/amd/display: Create plane rotation property

2019-09-12 Thread Harry Wentland
On 2019-09-12 1:01 p.m., Kazlauskas, Nicholas wrote:
> On 2019-09-12 12:44 p.m., Pierre-Loup A. Griffais wrote:
>> It's otherwise properly supported, just needs exposing to userspace.
>>
>> Signed-off-by: Pierre-Loup A. Griffais 
> I know IGT has some tests for plane rotation, do you happen to know what 
> tests pass or fail when exposing this?
> 
> I think DCN1 (Raven) should work as expected but I'd be concerned about 
> DCE or DCN2. I think we have had some cursor bugs in the past with 
> cursor rotation but they might only be exposed when used in conjunction 
> with overlay planes.
> 

Windows guys had a fix (in DC) for cursor with HW rotation on DCN a few
weeks ago. That might have fixed these issues.

We should still make sure we can pass IGT tests that do rotation.

How did you test? Weston?

Harry

> I'd just like to make sure there's suitable testing at least if we're 
> going to expose this to userspace.
> 
> Nicholas Kazlauskas
> 
>> ---
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index 45be7a2132bb..3772763c6449 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -4680,6 +4680,7 @@ static int amdgpu_dm_plane_init(struct 
>> amdgpu_display_manager *dm,
>>  uint32_t formats[32];
>>  int num_formats;
>>  int res = -EPERM;
>> +unsigned int supported_rotations;
>>   
>>  num_formats = get_plane_formats(plane, plane_cap, formats,
>>  ARRAY_SIZE(formats));
>> @@ -4711,6 +4712,13 @@ static int amdgpu_dm_plane_init(struct 
>> amdgpu_display_manager *dm,
>>  DRM_COLOR_YCBCR_BT709, DRM_COLOR_YCBCR_LIMITED_RANGE);
>>  }
>>   
>> +supported_rotations =
>> +DRM_MODE_ROTATE_0 | DRM_MODE_ROTATE_90 |
>> +DRM_MODE_ROTATE_180 | DRM_MODE_ROTATE_270;
>> +
>> +drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
>> +   supported_rotations);
>> +
>>  drm_plane_helper_add(plane, &dm_plane_helper_funcs);
>>   
>>  /* Create (reset) the plane state */
>>
> 

Re: [PATCH 1/4] mm/hmm: make full use of walk_page_range()

2019-09-12 Thread Ralph Campbell

On 9/12/19 1:26 AM, Christoph Hellwig wrote:

+static int hmm_pfns_fill(unsigned long addr,
+unsigned long end,
+struct hmm_range *range,
+enum hmm_pfn_value_e value)


Nit: can we use the space a little more efficient, e.g.:

static int hmm_pfns_fill(unsigned long addr, unsigned long end,
struct hmm_range *range, enum hmm_pfn_value_e value)


+static int hmm_vma_walk_test(unsigned long start,
+unsigned long end,
+struct mm_walk *walk)


Same here.


+   if (!(vma->vm_flags & VM_READ)) {
+   (void) hmm_pfns_fill(start, end, range, HMM_PFN_NONE);


There should be no need for the void cast here.



OK. I'll post a v2 with the these changes.
Thanks for the reviews.


Re: [PATCH 2/4] mm/hmm: allow snapshot of the special zero page

2019-09-12 Thread Ralph Campbell


On 9/12/19 1:26 AM, Christoph Hellwig wrote:

On Wed, Sep 11, 2019 at 03:28:27PM -0700, Ralph Campbell wrote:

Allow hmm_range_fault() to return success (0) when the CPU pagetable
entry points to the special shared zero page.
The caller can then handle the zero page by possibly clearing device
private memory instead of DMAing a zero page.

Signed-off-by: Ralph Campbell 
Cc: "Jérôme Glisse" 
Cc: Jason Gunthorpe 
Cc: Christoph Hellwig 
---
  mm/hmm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 06041d4399ff..7217912bef13 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -532,7 +532,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
unsigned long addr,
return -EBUSY;
} else if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pte_special(pte)) 
{
*pfn = range->values[HMM_PFN_SPECIAL];
-   return -EFAULT;
+   return is_zero_pfn(pte_pfn(pte)) ? 0 : -EFAULT;


Any chance to just use a normal if here:

if (!is_zero_pfn(pte_pfn(pte)))
return -EFAULT;
return 0;



Sure, no problem.

Re: [PATCH] drm/amd/display: Create plane rotation property

2019-09-12 Thread Kazlauskas, Nicholas
On 2019-09-12 12:44 p.m., Pierre-Loup A. Griffais wrote:
> It's otherwise properly supported, just needs exposing to userspace.
> 
> Signed-off-by: Pierre-Loup A. Griffais 
I know IGT has some tests for plane rotation, do you happen to know what 
tests pass or fail when exposing this?

I think DCN1 (Raven) should work as expected but I'd be concerned about 
DCE or DCN2. I think we have had some cursor bugs in the past with 
cursor rotation but they might only be exposed when used in conjunction 
with overlay planes.

I'd just like to make sure there's suitable testing at least if we're 
going to expose this to userspace.

Nicholas Kazlauskas

> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
>   1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 45be7a2132bb..3772763c6449 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4680,6 +4680,7 @@ static int amdgpu_dm_plane_init(struct 
> amdgpu_display_manager *dm,
>   uint32_t formats[32];
>   int num_formats;
>   int res = -EPERM;
> + unsigned int supported_rotations;
>   
>   num_formats = get_plane_formats(plane, plane_cap, formats,
>   ARRAY_SIZE(formats));
> @@ -4711,6 +4712,13 @@ static int amdgpu_dm_plane_init(struct 
> amdgpu_display_manager *dm,
>   DRM_COLOR_YCBCR_BT709, DRM_COLOR_YCBCR_LIMITED_RANGE);
>   }
>   
> + supported_rotations =
> + DRM_MODE_ROTATE_0 | DRM_MODE_ROTATE_90 |
> + DRM_MODE_ROTATE_180 | DRM_MODE_ROTATE_270;
> +
> + drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
> +supported_rotations);
> +
>   drm_plane_helper_add(plane, &dm_plane_helper_funcs);
>   
>   /* Create (reset) the plane state */
> 


[PATCH] drm/amd/display: Create plane rotation property

2019-09-12 Thread Pierre-Loup A. Griffais
It's otherwise properly supported, just needs exposing to userspace.

Signed-off-by: Pierre-Loup A. Griffais 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 45be7a2132bb..3772763c6449 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4680,6 +4680,7 @@ static int amdgpu_dm_plane_init(struct 
amdgpu_display_manager *dm,
uint32_t formats[32];
int num_formats;
int res = -EPERM;
+   unsigned int supported_rotations;
 
num_formats = get_plane_formats(plane, plane_cap, formats,
ARRAY_SIZE(formats));
@@ -4711,6 +4712,13 @@ static int amdgpu_dm_plane_init(struct 
amdgpu_display_manager *dm,
DRM_COLOR_YCBCR_BT709, DRM_COLOR_YCBCR_LIMITED_RANGE);
}
 
+   supported_rotations =
+   DRM_MODE_ROTATE_0 | DRM_MODE_ROTATE_90 |
+   DRM_MODE_ROTATE_180 | DRM_MODE_ROTATE_270;
+
+   drm_plane_create_rotation_property(plane, DRM_MODE_ROTATE_0,
+  supported_rotations);
+
	drm_plane_helper_add(plane, &dm_plane_helper_funcs);
 
/* Create (reset) the plane state */
-- 
2.17.1


Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

2019-09-12 Thread Grodzovsky, Andrey
A RAS error will be triggered if the HW identified a faulty physical
page. The error comes through an interrupt whose data payload has
information that can be translated into the bad page address. As a
recovery measure we then reset the ASIC, reserve this bad page so it
cannot be used by anyone else, and store its address to EEPROM for
future reservation after boot.

Andrey

On 9/12/19 11:15 AM, Christian König wrote:
> Well I still hope to avoid VRAM lost in most of the cases, but that is 
> really not guaranteed.
>
> What is the bad page and why do you need to reserve it?
>
> Christian.
>
> Am 12.09.19 um 16:09 schrieb Grodzovsky, Andrey:
>> I am not sure VRAM loss happens every time, but when it does I would
>> assume you would have to reserve them again as the page tables content
>> was lost. On the other hand I do remember we keep shadow system memory
>> copies of all page tables so maybe that not an issue, so yes, just try
>> to allocate the bad page after reset and if it's still reserved you will
>> fail.
>>
>> Andrey
>>
>> On 9/12/19 7:35 AM, Zhou1, Tao wrote:
>>> Hi Andrey:
>>>
>>> Are you sure of the VRAM content loss after gpu reset? I'm not very 
>>> familiar with the detail of gpu reset and I'll do experiment to 
>>> confirm the case you mentioned.
>>>
>>> Regards,
>>> Tao
>>>
 -Original Message-
 From: Grodzovsky, Andrey 
 Sent: 2019年9月12日 10:32
 To: Chen, Guchun ; Zhou1, Tao
 ; amd-gfx@lists.freedesktop.org
 Cc: Deucher, Alexander 
 Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

 That's not what I meant. Let's say you handled one bad page interrupt
 and as a result have one bad page reserved. Now an unrelated gfx ring
 timeout happens which triggers GPU reset and VRAM loss. When you come
 back from reset amdgpu_ras_reserve_bad_pages will be called, but since
 last_reserved == data_count the bad page will not be reserved again;
 maybe we should just set data->last_reserved to 0 again if VRAM was
 lost during ASIC reset...

 Andrey

 
 From: Chen, Guchun 
 Sent: 11 September 2019 21:53:03
 To: Grodzovsky, Andrey; Zhou1, Tao; amd-gfx@lists.freedesktop.org
 Cc: Deucher, Alexander
 Subject: RE: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

 Comment inline.

 Regards,
 Guchun

 -Original Message-
 From: Grodzovsky, Andrey 
 Sent: Wednesday, September 11, 2019 10:41 PM
 To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
 Cc: Chen, Guchun ; Deucher, Alexander
 
 Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

 On second thought this will break. What about reserving bad pages when
 resetting the GPU for non-RAS error reasons such as manual reset, S3 or
 ring timeout (amdgpu_ras_resume->amdgpu_ras_reset_gpu)? So I will keep
 the code as is.

 Another possible issue in existing code - looks like no reservation
 will take place in those cases even now, as in amdgpu_ras_reserve_bad_pages
 data->last_reserved will be equal to data->count, no? Looks like for
 this case you need to add a flag to FORCE reservation for all pages
 from 0 to data->count.
 [Guchun] Yes, last_reserved is not updated any more, unless we unload
 our driver. So it may always be equal to data->count, and then no new
 bad page will be reserved.
 I see we have one eeprom reset by the user; can we put this
 last_reserved clean operation into the same stack as well?

 Andrey

 On 9/11/19 10:19 AM, Andrey Grodzovsky wrote:
> I like this much more, I will relocate to
> amdgpu_umc_process_ras_data_cb an push.
>
> Andrey
>
> On 9/10/19 11:08 PM, Zhou1, Tao wrote:
>> amdgpu_ras_reserve_bad_pages is only used by umc block, so another
>> approach is to move it into amdgpu_umc_process_ras_data_cb.
>> Anyway, either way is OK and the patch is:
>>
>> Reviewed-by: Tao Zhou 
>>
>>> -Original Message-
>>> From: Andrey Grodzovsky 
>>> Sent: 2019年9月11日 3:41
>>> To: amd-gfx@lists.freedesktop.org
>>> Cc: Chen, Guchun ; Zhou1, Tao
>>> ; Deucher, Alexander
 ;
>>> Grodzovsky, Andrey 
>>> Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>>>
>>> Problem:
>>> amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
>>> because writing to EEPROM during ASIC reset was unstable.
>>> But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
>>> directly from ISR context and so locking is not allowed. Also it's
>>> irrelevant for this partilcular interrupt as this is generic RAS
>>> interrupt and not memory errors specific.
>>>
>>> Fix:
>>> Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.
>>>
>>> 

Re: [PATCH] drm/amdgpu: revert "disable bulk moves for now"

2019-09-12 Thread Zhou, David(ChunMing)
I don't know the DKMS status; anyway, we should submit this one as early as possible.

 Original Message 
Subject: Re: [PATCH] drm/amdgpu: revert "disable bulk moves for now"
From: Christian König
To: "Zhou, David(ChunMing)" ,amd-gfx@lists.freedesktop.org
Cc:

Just to double check: We have had that enabled in the DKMS package for a
while and haven't encountered any more problems with it, correct?

Thanks,
Christian.

Am 12.09.19 um 16:02 schrieb Chunming Zhou:
> RB on it to go ahead.
>
> -David
>
> 在 2019/9/12 18:15, Christian König 写道:
>> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
>>
>> The changes to fix this should have landed in 5.1.
>>
>> Signed-off-by: Christian König 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
>>1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 48349e4f0701..fd3fbaa73fa3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
>> *adev,
>>   struct ttm_bo_global *glob = adev->mman.bdev.glob;
>>   struct amdgpu_vm_bo_base *bo_base;
>>
>> -#if 0
>>   if (vm->bulk_moveable) {
>>   if (vm->bulk_moveable) {
>>   spin_lock(&glob->lru_lock);
>>   ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>>   return;
>>   }
>> -#endif
>>
>>   memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
>>


Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

2019-09-12 Thread Christian König
Well I still hope to avoid VRAM lost in most of the cases, but that is 
really not guaranteed.


What is the bad page and why do you need to reserve it?

Christian.

On 12.09.19 16:09, Grodzovsky, Andrey wrote:

I am not sure VRAM loss happens every time, but when it does I would
assume you would have to reserve them again as the page tables content
was lost. On the other hand I do remember we keep shadow system memory
copies of all page tables so maybe that's not an issue, so yes, just try
to allocate the bad page after reset and if it's still reserved you will
fail.

Andrey

On 9/12/19 7:35 AM, Zhou1, Tao wrote:

Hi Andrey:

Are you sure of the VRAM content loss after gpu reset? I'm not very familiar 
with the detail of gpu reset and I'll do experiment to confirm the case you 
mentioned.

Regards,
Tao


-Original Message-
From: Grodzovsky, Andrey 
Sent: September 12, 2019 10:32
To: Chen, Guchun ; Zhou1, Tao
; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

That's not what I meant. Let's say you handled one bad page interrupt and as
a result have one bad page reserved. Now an unrelated gfx ring timeout
happens which triggers GPU reset and VRAM loss. When you come back from
reset, amdgpu_ras_reserve_bad_pages will be called, but since last_reserved
== data->count the bad page will not be reserved again; maybe we should just
set data->last_reserved to 0 again if VRAM was lost during ASIC reset...

Andrey


From: Chen, Guchun 
Sent: 11 September 2019 21:53:03
To: Grodzovsky, Andrey; Zhou1, Tao; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: RE: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

Comment inline.

Regards,
Guchun

-Original Message-
From: Grodzovsky, Andrey 
Sent: Wednesday, September 11, 2019 10:41 PM
To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun ; Deucher, Alexander

Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

On second thought, this will break. What about reserving bad pages when
resetting the GPU for a non-RAS error reason such as manual reset, S3 or ring
timeout (amdgpu_ras_resume->amdgpu_ras_reset_gpu)? So I will keep the
code as is.

Another possible issue in existing code - looks like no reservation will take
place in those cases even now, as amdgpu_ras_reserve_bad_pages
data->last_reserved will be equal to data->count, no? Looks like for
this case you need to add a flag to FORCE reservation for all pages from
0 to data->count.
[Guchun]Yes, last_reserved is not updated any more, unless we unload our
driver. So it maybe always equal to data->count, then no new bad page will
be reserved.
I see we have one eeprom reset by user, can we put this last_reserved clean
operation to user in the same stack as well?

Andrey

On 9/11/19 10:19 AM, Andrey Grodzovsky wrote:

I like this much more, I will relocate to
amdgpu_umc_process_ras_data_cb an push.

Andrey

On 9/10/19 11:08 PM, Zhou1, Tao wrote:

amdgpu_ras_reserve_bad_pages is only used by umc block, so another
approach is to move it into amdgpu_umc_process_ras_data_cb.
Anyway, either way is OK and the patch is:

Reviewed-by: Tao Zhou 


-Original Message-
From: Andrey Grodzovsky 
> Sent: September 11, 2019 3:41
To: amd-gfx@lists.freedesktop.org
Cc: Chen, Guchun ; Zhou1, Tao
; Deucher, Alexander

;

Grodzovsky, Andrey 
Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
directly from ISR context and so locking is not allowed. Also it's
irrelevant for this particular interrupt as this is generic RAS
interrupt and not memory errors specific.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.

Signed-off-by: Andrey Grodzovsky 
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 012034d..dd5da3c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -504,7 +504,9 @@ static inline int amdgpu_ras_reset_gpu(struct
amdgpu_device *adev,
/* save bad page to eeprom before gpu reset,
 * i2c may be unstable in gpu reset
 */
-amdgpu_ras_reserve_bad_pages(adev);
+if (in_task())
+amdgpu_ras_reserve_bad_pages(adev);
+
if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
schedule_work(&ras->recovery_work);
return 0;
--
2.7.4


Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"

2019-09-12 Thread Christian König
Just to double check: we have had that enabled in the DKMS package for a
while and haven't encountered any more problems with it, correct?


Thanks,
Christian.

On 12.09.19 16:02, Chunming Zhou wrote:

RB on it to go ahead.

-David

On 2019/9/12 18:15, Christian König wrote:

This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
   1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 48349e4f0701..fd3fbaa73fa3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
*adev,
struct ttm_bo_global *glob = adev->mman.bdev.glob;
struct amdgpu_vm_bo_base *bo_base;
   
-#if 0

if (vm->bulk_moveable) {
spin_lock(&glob->lru_lock);
ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
spin_unlock(&glob->lru_lock);
return;
}
-#endif
   
   memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
   


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

2019-09-12 Thread Grodzovsky, Andrey
I am not sure VRAM loss happens every time, but when it does I would 
assume you would have to reserve them again as the page tables content 
was lost. On the other hand I do remember we keep shadow system memory 
copies of all page tables so maybe that's not an issue, so yes, just try 
to allocate the bad page after reset and if it's still reserved you will 
fail.

Andrey

On 9/12/19 7:35 AM, Zhou1, Tao wrote:
> Hi Andrey:
>
> Are you sure of the VRAM content loss after gpu reset? I'm not very familiar 
> with the detail of gpu reset and I'll do experiment to confirm the case you 
> mentioned.
>
> Regards,
> Tao
>
>> -Original Message-
>> From: Grodzovsky, Andrey 
>> Sent: September 12, 2019 10:32
>> To: Chen, Guchun ; Zhou1, Tao
>> ; amd-gfx@lists.freedesktop.org
>> Cc: Deucher, Alexander 
>> Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>>
>> That's not what I meant. Let's say you handled one bad page interrupt and as
>> a result have one bad page reserved. Now an unrelated gfx ring timeout
>> happens which triggers GPU reset and VRAM loss. When you come back from
>> reset, amdgpu_ras_reserve_bad_pages will be called, but since last_reserved
>> == data->count the bad page will not be reserved again; maybe we should just
>> set data->last_reserved to 0 again if VRAM was lost during ASIC reset...
>>
>> Andrey
>>
>> 
>> From: Chen, Guchun 
>> Sent: 11 September 2019 21:53:03
>> To: Grodzovsky, Andrey; Zhou1, Tao; amd-gfx@lists.freedesktop.org
>> Cc: Deucher, Alexander
>> Subject: RE: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>>
>> Comment inline.
>>
>> Regards,
>> Guchun
>>
>> -Original Message-
>> From: Grodzovsky, Andrey 
>> Sent: Wednesday, September 11, 2019 10:41 PM
>> To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
>> Cc: Chen, Guchun ; Deucher, Alexander
>> 
>> Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>>
>> On second thought, this will break. What about reserving bad pages when
>> resetting the GPU for a non-RAS error reason such as manual reset, S3 or ring
>> timeout (amdgpu_ras_resume->amdgpu_ras_reset_gpu)? So I will keep the
>> code as is.
>>
>> Another possible issue in existing code - looks like no reservation will take
>> place in those cases even now, as amdgpu_ras_reserve_bad_pages
>> data->last_reserved will be equal to data->count, no? Looks like for
>> this case you need to add a flag to FORCE reservation for all pages from
>> 0 to data->count.
>> [Guchun]Yes, last_reserved is not updated any more, unless we unload our
>> driver. So it maybe always equal to data->count, then no new bad page will
>> be reserved.
>> I see we have one eeprom reset by user, can we put this last_reserved clean
>> operation to user in the same stack as well?
>>
>> Andrey
>>
>> On 9/11/19 10:19 AM, Andrey Grodzovsky wrote:
>>> I like this much more, I will relocate to
>>> amdgpu_umc_process_ras_data_cb an push.
>>>
>>> Andrey
>>>
>>> On 9/10/19 11:08 PM, Zhou1, Tao wrote:
 amdgpu_ras_reserve_bad_pages is only used by umc block, so another
 approach is to move it into amdgpu_umc_process_ras_data_cb.
 Anyway, either way is OK and the patch is:

 Reviewed-by: Tao Zhou 

> -Original Message-
> From: Andrey Grodzovsky 
> Sent: September 11, 2019 3:41
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, Guchun ; Zhou1, Tao
> ; Deucher, Alexander
>> ;
> Grodzovsky, Andrey 
> Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>
> Problem:
> amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
> because writing to EEPROM during ASIC reset was unstable.
> But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
> directly from ISR context and so locking is not allowed. Also it's
> irrelevant for this particular interrupt as this is generic RAS
> interrupt and not memory errors specific.
>
> Fix:
> Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.
>
> Signed-off-by: Andrey Grodzovsky 
> ---
>drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +++-
>1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> index 012034d..dd5da3c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> @@ -504,7 +504,9 @@ static inline int amdgpu_ras_reset_gpu(struct
> amdgpu_device *adev,
>/* save bad page to eeprom before gpu reset,
> * i2c may be unstable in gpu reset
> */
> -amdgpu_ras_reserve_bad_pages(adev);
> +if (in_task())
> +amdgpu_ras_reserve_bad_pages(adev);
> +
>if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
>schedule_work(&ras->recovery_work);
>return 0;
> --
> 2.7.4

Re: [PATCH] drm/amdgpu: resvert "disable bulk moves for now"

2019-09-12 Thread Chunming Zhou
RB on it to go ahead.

-David

On 2019/9/12 18:15, Christian König wrote:
> This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.
>
> The changes to fix this should have landed in 5.1.
>
> Signed-off-by: Christian König 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
>   1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 48349e4f0701..fd3fbaa73fa3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
> *adev,
>   struct ttm_bo_global *glob = adev->mman.bdev.glob;
>   struct amdgpu_vm_bo_base *bo_base;
>   
> -#if 0
>   if (vm->bulk_moveable) {
>   spin_lock(&glob->lru_lock);
>   ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
>   spin_unlock(&glob->lru_lock);
>   return;
>   }
> -#endif
>   
>   memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
>   

Re: [PATCH 14/14] drm/amdgpu: set TMZ bits in PTEs for secure bo (v2)

2019-09-12 Thread Koenig, Christian
On 12.09.19 12:27, Huang, Ray wrote:
> On Wed, Sep 11, 2019 at 08:13:19PM +0800, Koenig, Christian wrote:
>> On 11.09.19 13:50, Huang, Ray wrote:
>>> From: Alex Deucher 
>>>
>>> If one bo is secure (created with AMDGPU_GEM_CREATE_ENCRYPTED), the TMZ
>>> bits of the PTEs that belong to that bo should be set. Then psp is able to
>>> protect the pages of this bo to avoid access from an "untrusted" domain
>>> such as the CPU.
>>>
>>> v1: design and draft the skeleton of tmz bits setting on PTEs (Alex)
>>> v2: return failure once create secure bo on no-tmz platform  (Ray)
>>>
>>> Signed-off-by: Alex Deucher 
>>> Reviewed-by: Huang Rui 
>>> Signed-off-by: Huang Rui 
>>> ---
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 12 +++-
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 10 ++
>>>drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  5 +
>>>3 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> index 22eab74..5332104 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> @@ -222,7 +222,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
>>> void *data,
>>>   AMDGPU_GEM_CREATE_CPU_GTT_USWC |
>>>   AMDGPU_GEM_CREATE_VRAM_CLEARED |
>>>   AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
>>> - AMDGPU_GEM_CREATE_EXPLICIT_SYNC))
>>> + AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
>>> + AMDGPU_GEM_CREATE_ENCRYPTED))
>>>
>>> return -EINVAL;
>>>
>>> @@ -230,6 +231,11 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
>>> void *data,
>>> if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK)
>>> return -EINVAL;
>>>
>>> +   if (!adev->tmz.enabled && (flags & AMDGPU_GEM_CREATE_ENCRYPTED)) {
>>> +   DRM_ERROR("Cannot allocate secure buffer while tmz is 
>>> disabled\n");
>>> +   return -EINVAL;
>>> +   }
>>> +
>>> /* create a gem object to contain this object in */
>>> if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS |
>>> AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
>>> @@ -251,6 +257,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
>>> void *data,
>>> resv = vm->root.base.bo->tbo.resv;
>>> }
>>>
>>> +   if (flags & AMDGPU_GEM_CREATE_ENCRYPTED) {
>>> +   /* XXX: pad out alignment to meet TMZ requirements */
>>> +   }
>>> +
>>> r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>>  (u32)(0xffffffff & args->in.domains),
>>>  flags, ttm_bo_type_device, resv, &gobj);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 5a3c177..286e2e2 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -224,6 +224,16 @@ static inline bool amdgpu_bo_explicit_sync(struct 
>>> amdgpu_bo *bo)
>>> return bo->flags & AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
>>>}
>>>
>>> +/**
>>> + * amdgpu_bo_encrypted - return whether the bo is encrypted
>>> + */
>>> +static inline bool amdgpu_bo_encrypted(struct amdgpu_bo *bo)
>>> +{
>>> +   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>> +
>>> +   return adev->tmz.enabled && (bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED);
>> Checking the adev->tmz.enabled flags should be dropped here.
>>
> That's fine. BO should be validated while it is created.
>
> But if the BO is created by vmid 0, is this checking needed?
>
>>> +}
>>> +
>>>bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
>>>void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> index 3663655..8f00bb2 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>> @@ -1434,6 +1434,8 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm)
>>>uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct ttm_mem_reg 
>>> *mem)
>>>{
>>> uint64_t flags = 0;
>>> +   struct ttm_buffer_object *tbo = ttm_mem_reg_to_bo(mem);
>>> +   struct amdgpu_bo *abo = ttm_to_amdgpu_bo(tbo);
>> That's a clear NAK. The function is not necessarily called with
>> &bo->mem, which is also the reason why this function doesn't get the BO
>> as parameter.
>>
> Do you mean we can revise the below functions to use bo as the parameter
> instead?
>
> uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct amdgpu_bo *bo)
> uint64_t amdgpu_ttm_tt_pte_flags(struct ttm_tt *ttm, struct amdgpu_bo *bo)

If that is possible then this would indeed be a solution for the problem.

Christian.

>
> Thanks,
> Ray
>
>> Christian.
>>
>>>
>>> if (mem && mem->mem_type != TTM_PL_SYSTEM)
>>> 

RE: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.

2019-09-12 Thread Zhou1, Tao
Hi Andrey:

Are you sure of the VRAM content loss after gpu reset? I'm not very familiar 
with the detail of gpu reset and I'll do experiment to confirm the case you 
mentioned.

Regards,
Tao

> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: September 12, 2019 10:32
> To: Chen, Guchun ; Zhou1, Tao
> ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
> 
> That's not what I meant. Let's say you handled one bad page interrupt and as
> a result have one bad page reserved. Now an unrelated gfx ring timeout
> happens which triggers GPU reset and VRAM loss. When you come back from
> reset, amdgpu_ras_reserve_bad_pages will be called, but since last_reserved
> == data->count the bad page will not be reserved again; maybe we should just
> set data->last_reserved to 0 again if VRAM was lost during ASIC reset...
> 
> Andrey
> 
> 
> From: Chen, Guchun 
> Sent: 11 September 2019 21:53:03
> To: Grodzovsky, Andrey; Zhou1, Tao; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: RE: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
> 
> Comment inline.
> 
> Regards,
> Guchun
> 
> -Original Message-
> From: Grodzovsky, Andrey 
> Sent: Wednesday, September 11, 2019 10:41 PM
> To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
> Cc: Chen, Guchun ; Deucher, Alexander
> 
> Subject: Re: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
> 
> On second thought, this will break. What about reserving bad pages when
> resetting the GPU for a non-RAS error reason such as manual reset, S3 or ring
> timeout (amdgpu_ras_resume->amdgpu_ras_reset_gpu)? So I will keep the
> code as is.
> 
> Another possible issue in existing code - looks like no reservation will take
> place in those cases even now, as amdgpu_ras_reserve_bad_pages
> data->last_reserved will be equal to data->count, no? Looks like for
> this case you need to add a flag to FORCE reservation for all pages from
> 0 to data->count.
> [Guchun]Yes, last_reserved is not updated any more, unless we unload our
> driver. So it maybe always equal to data->count, then no new bad page will
> be reserved.
> I see we have one eeprom reset by user, can we put this last_reserved clean
> operation to user in the same stack as well?
> 
> Andrey
> 
> On 9/11/19 10:19 AM, Andrey Grodzovsky wrote:
> > I like this much more, I will relocate to
> > amdgpu_umc_process_ras_data_cb an push.
> >
> > Andrey
> >
> > On 9/10/19 11:08 PM, Zhou1, Tao wrote:
> >> amdgpu_ras_reserve_bad_pages is only used by umc block, so another
> >> approach is to move it into amdgpu_umc_process_ras_data_cb.
> >> Anyway, either way is OK and the patch is:
> >>
> >> Reviewed-by: Tao Zhou 
> >>
> >>> -Original Message-
> >>> From: Andrey Grodzovsky 
> >>> Sent: September 11, 2019 3:41
> >>> To: amd-gfx@lists.freedesktop.org
> >>> Cc: Chen, Guchun ; Zhou1, Tao
> >>> ; Deucher, Alexander
> ;
> >>> Grodzovsky, Andrey 
> >>> Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
> >>>
> >>> Problem:
> >>> amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
> >>> because writing to EEPROM during ASIC reset was unstable.
> >>> But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
> >>> directly from ISR context and so locking is not allowed. Also it's
> >>> irrelevant for this particular interrupt as this is generic RAS
> >>> interrupt and not memory errors specific.
> >>>
> >>> Fix:
> >>> Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.
> >>>
> >>> Signed-off-by: Andrey Grodzovsky 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +++-
> >>>   1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >>> index 012034d..dd5da3c 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> >>> @@ -504,7 +504,9 @@ static inline int amdgpu_ras_reset_gpu(struct
> >>> amdgpu_device *adev,
> >>>   /* save bad page to eeprom before gpu reset,
> >>>* i2c may be unstable in gpu reset
> >>>*/
> >>> -amdgpu_ras_reserve_bad_pages(adev);
> >>> +if (in_task())
> >>> +amdgpu_ras_reserve_bad_pages(adev);
> >>> +
> >>>   if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
> >>>   schedule_work(&ras->recovery_work);
> >>>   return 0;
> >>> --
> >>> 2.7.4

[PATCH] drm/amdgpu: replace DRM_ERROR with DRM_WARN in ras_reserve_bad_pages

2019-09-12 Thread Zhou1, Tao
There are two cases of reserve errors that should be ignored:
1) a ras bad page has been allocated (used by someone);
2) a ras bad page has been reserved (duplicate error injection for one page);

DRM_ERROR is unnecessary for a bad page reservation failure

Signed-off-by: Tao Zhou 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index dc5f94e6118b..0425b74e1a8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1478,9 +1478,15 @@ int amdgpu_ras_reserve_bad_pages(struct amdgpu_device 
*adev)
for (i = data->last_reserved; i < data->count; i++) {
bp = data->bps[i].retired_page;
 
+   /*
+* There are two cases of reserve error should be ignored:
+* 1) a ras bad page has been allocated (used by someone);
+* 2) a ras bad page has been reserved (duplicate error 
injection
+* for one page);
+*/
if (amdgpu_ras_reserve_vram(adev, bp << PAGE_SHIFT,
PAGE_SIZE, &bo))
-   DRM_ERROR("RAS ERROR: reserve vram %llx fail\n", bp);
+   DRM_WARN("RAS WARN: reserve vram for retired page %llx 
fail\n", bp);
 
data->bps_bo[i] = bo;
data->last_reserved = i + 1;
-- 
2.17.1


Re: [PATCH 14/14] drm/amdgpu: set TMZ bits in PTEs for secure bo (v2)

2019-09-12 Thread Huang, Ray
On Wed, Sep 11, 2019 at 08:13:19PM +0800, Koenig, Christian wrote:
> On 11.09.19 13:50, Huang, Ray wrote:
> > From: Alex Deucher 
> >
> > If one bo is secure (created with AMDGPU_GEM_CREATE_ENCRYPTED), the TMZ
> > bits of the PTEs that belong to that bo should be set. Then psp is able to
> > protect the pages of this bo to avoid access from an "untrusted" domain
> > such as the CPU.
> >
> > v1: design and draft the skeleton of tmz bits setting on PTEs (Alex)
> > v2: return failure once create secure bo on no-tmz platform  (Ray)
> >
> > Signed-off-by: Alex Deucher 
> > Reviewed-by: Huang Rui 
> > Signed-off-by: Huang Rui 
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 12 +++-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 10 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  5 +
> >   3 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > index 22eab74..5332104 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> > @@ -222,7 +222,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
> > void *data,
> >   AMDGPU_GEM_CREATE_CPU_GTT_USWC |
> >   AMDGPU_GEM_CREATE_VRAM_CLEARED |
> >   AMDGPU_GEM_CREATE_VM_ALWAYS_VALID |
> > - AMDGPU_GEM_CREATE_EXPLICIT_SYNC))
> > + AMDGPU_GEM_CREATE_EXPLICIT_SYNC |
> > + AMDGPU_GEM_CREATE_ENCRYPTED))
> >   
> > return -EINVAL;
> >   
> > @@ -230,6 +231,11 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
> > void *data,
> > if (args->in.domains & ~AMDGPU_GEM_DOMAIN_MASK)
> > return -EINVAL;
> >   
> > +   if (!adev->tmz.enabled && (flags & AMDGPU_GEM_CREATE_ENCRYPTED)) {
> > +   DRM_ERROR("Cannot allocate secure buffer while tmz is 
> > disabled\n");
> > +   return -EINVAL;
> > +   }
> > +
> > /* create a gem object to contain this object in */
> > if (args->in.domains & (AMDGPU_GEM_DOMAIN_GDS |
> > AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA)) {
> > @@ -251,6 +257,10 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, 
> > void *data,
> > resv = vm->root.base.bo->tbo.resv;
> > }
> >   
> > +   if (flags & AMDGPU_GEM_CREATE_ENCRYPTED) {
> > +   /* XXX: pad out alignment to meet TMZ requirements */
> > +   }
> > +
> > r = amdgpu_gem_object_create(adev, size, args->in.alignment,
> >  (u32)(0xffffffff & args->in.domains),
> >  flags, ttm_bo_type_device, resv, &gobj);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > index 5a3c177..286e2e2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> > @@ -224,6 +224,16 @@ static inline bool amdgpu_bo_explicit_sync(struct 
> > amdgpu_bo *bo)
> > return bo->flags & AMDGPU_GEM_CREATE_EXPLICIT_SYNC;
> >   }
> >   
> > +/**
> > + * amdgpu_bo_encrypted - return whether the bo is encrypted
> > + */
> > +static inline bool amdgpu_bo_encrypted(struct amdgpu_bo *bo)
> > +{
> > +   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> > +
> > +   return adev->tmz.enabled && (bo->flags & AMDGPU_GEM_CREATE_ENCRYPTED);
> 
> Checking the adev->tmz.enabled flags should be dropped here.
> 

That's fine. BO should be validated while it is created.

But if the BO is created by vmid 0, is this checking needed?

> > +}
> > +
> >   bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
> >   void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
> >   
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 3663655..8f00bb2 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1434,6 +1434,8 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm)
> >   uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct ttm_mem_reg 
> > *mem)
> >   {
> > uint64_t flags = 0;
> > +   struct ttm_buffer_object *tbo = ttm_mem_reg_to_bo(mem);
> > +   struct amdgpu_bo *abo = ttm_to_amdgpu_bo(tbo);
> 
> That's a clear NAK. The function is not necessarily called with 
> &bo->mem, which is also the reason why this function doesn't get the BO 
> as parameter.
> 

Do you mean we can revise the below functions to use bo as the parameter
instead? 

uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, struct amdgpu_bo *bo)
uint64_t amdgpu_ttm_tt_pte_flags(struct ttm_tt *ttm, struct amdgpu_bo *bo)

Thanks,
Ray

> Christian.
> 
> >   
> > if (mem && mem->mem_type != TTM_PL_SYSTEM)
> > flags |= AMDGPU_PTE_VALID;
> > @@ -1444,6 +1446,9 @@ uint64_t amdgpu_ttm_tt_pde_flags(struct ttm_tt *ttm, 
> > struct ttm_mem_reg *mem)
> > if 

[PATCH] drm/amdgpu: resvert "disable bulk moves for now"

2019-09-12 Thread Christian König
This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a.

The changes to fix this should have landed in 5.1.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 48349e4f0701..fd3fbaa73fa3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -603,14 +603,12 @@ void amdgpu_vm_move_to_lru_tail(struct amdgpu_device 
*adev,
struct ttm_bo_global *glob = adev->mman.bdev.glob;
struct amdgpu_vm_bo_base *bo_base;
 
-#if 0
if (vm->bulk_moveable) {
spin_lock(&glob->lru_lock);
ttm_bo_bulk_move_lru_tail(&vm->lru_bulk_move);
spin_unlock(&glob->lru_lock);
return;
}
-#endif
 
memset(&vm->lru_bulk_move, 0, sizeof(vm->lru_bulk_move));
 
-- 
2.17.1


Re: [PATCH 2/4] mm/hmm: allow snapshot of the special zero page

2019-09-12 Thread Christoph Hellwig
On Wed, Sep 11, 2019 at 03:28:27PM -0700, Ralph Campbell wrote:
> Allow hmm_range_fault() to return success (0) when the CPU pagetable
> entry points to the special shared zero page.
> The caller can then handle the zero page by possibly clearing device
> private memory instead of DMAing a zero page.
> 
> Signed-off-by: Ralph Campbell 
> Cc: "Jérôme Glisse" 
> Cc: Jason Gunthorpe 
> Cc: Christoph Hellwig 
> ---
>  mm/hmm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 06041d4399ff..7217912bef13 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -532,7 +532,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, 
> unsigned long addr,
>   return -EBUSY;
>   } else if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pte_special(pte)) 
> {
>   *pfn = range->values[HMM_PFN_SPECIAL];
> - return -EFAULT;
> + return is_zero_pfn(pte_pfn(pte)) ? 0 : -EFAULT;

Any chance to just use a normal if here:

if (!is_zero_pfn(pte_pfn(pte)))
return -EFAULT;
return 0;


RE: [PATCH] drm/amdgpu: Navi10/12 VF doesn't support SMU

2019-09-12 Thread Deng, Emily
Reviewed-by: Emily Deng 

Best wishes
Emily Deng



>-Original Message-
>From: Zhao, Jiange 
>Sent: Thursday, September 12, 2019 11:46 AM
>To: amd-gfx@lists.freedesktop.org
>Cc: Nieto, David M ; Deng, Emily
>; Koenig, Christian ;
>Zhao, Jiange 
>Subject: [PATCH] drm/amdgpu: Navi10/12 VF doesn't support SMU
>
>From: Jiange Zhao 
>
>In the SRIOV case, SMU and powerplay are handled in the HV.
>
>VF shouldn't have control over SMU and powerplay.
>
>Signed-off-by: Jiange Zhao 
>---
> drivers/gpu/drm/amd/amdgpu/nv.c | 8 
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c
>b/drivers/gpu/drm/amd/amdgpu/nv.c
>index 4c24672be12a..fb097aa089da 100644
>--- a/drivers/gpu/drm/amd/amdgpu/nv.c
>+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
>@@ -438,7 +438,7 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
>   amdgpu_device_ip_block_add(adev, &navi10_ih_ip_block);
>   amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
>   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP &&
>-  is_support_sw_smu(adev))
>+  is_support_sw_smu(adev) && !amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
>   if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev, &dce_virtual_ip_block);
>@@ -449,7 +449,7 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
>   amdgpu_device_ip_block_add(adev, &gfx_v10_0_ip_block);
>   amdgpu_device_ip_block_add(adev, &sdma_v5_0_ip_block);
>   if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT
>&&
>-  is_support_sw_smu(adev))
>+  is_support_sw_smu(adev) && !amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
>   amdgpu_device_ip_block_add(adev, &vcn_v2_0_ip_block);
>   if (adev->enable_mes)
>@@ -461,7 +461,7 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
>   amdgpu_device_ip_block_add(adev, &navi10_ih_ip_block);
>   amdgpu_device_ip_block_add(adev, &psp_v11_0_ip_block);
>   if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP &&
>-  is_support_sw_smu(adev))
>+  is_support_sw_smu(adev) && !amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev, &smu_v11_0_ip_block);
>   if (adev->enable_virtual_display || amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev, &dce_virtual_ip_block);
>@@ -472,7 +472,7 @@ int nv_set_ip_blocks(struct amdgpu_device *adev)
>   amdgpu_device_ip_block_add(adev, &gfx_v10_0_ip_block);
>   amdgpu_device_ip_block_add(adev, &sdma_v5_0_ip_block);
>   if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT
>&&
>-  is_support_sw_smu(adev))
>+  is_support_sw_smu(adev) && !amdgpu_sriov_vf(adev))
>   amdgpu_device_ip_block_add(adev,
>_v11_0_ip_block);
>   amdgpu_device_ip_block_add(adev, _v2_0_ip_block);
>   break;
>--
>2.20.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD (V2)

2019-09-12 Thread S, Shirish

On 9/12/2019 3:29 AM, Kuehling, Felix wrote:
> On 2019-09-11 2:52 a.m., S, Shirish wrote:
>> If CONFIG_HSA_AMD is not set, build fails:
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function 
>> `amdgpu_device_ip_early_init':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to 
>> `sched_policy'
>>
>> Use CONFIG_HSA_AMD to guard this.
>>
>> Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling 
>> policy")
>>
>> V2: declare sched_policy in amdgpu.h and remove changes in amdgpu_device.c
> Which branch is this for? V1 of this patch was already submitted to
> amd-staging-drm-next. So unless you're planning to revert v1 and submit
> v2, I was expecting to see a change that fixes up the previous patch,
> rather than a patch that replaces it.

I have sent a patch that fixes up the previous patch as well.

Apparently, I did not send the revert but my plan was to revert and only 
then submit V2.

Anyway, both work for me as long as the kernel builds.

Regards,

Shirish S

> Regards,
>     Felix
>
>
>> Signed-off-by: Shirish S 
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 ++++
>>1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 1030cb3..6ff02bb 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -169,7 +169,11 @@ extern int amdgpu_discovery;
>>extern int amdgpu_mes;
>>extern int amdgpu_noretry;
>>extern int amdgpu_force_asic_type;
>> +#ifdef CONFIG_HSA_AMD
>>extern int sched_policy;
>> +#else
>> +static const int sched_policy = KFD_SCHED_POLICY_HWS;
>> +#endif
>>
>>#ifdef CONFIG_DRM_AMDGPU_SI
>>extern int amdgpu_si_support;

-- 
Regards,
Shirish S


[PATCH] drm/amdgpu: remove needless usage of #ifdef

2019-09-12 Thread S, Shirish
Define sched_policy in case CONFIG_HSA_AMD is not
enabled; with this there is no need to check for
CONFIG_HSA_AMD elsewhere in the driver code.

Suggested-by: Felix Kuehling 
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-----
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a1516a3..6ff02bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -171,6 +171,8 @@ extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
 #ifdef CONFIG_HSA_AMD
 extern int sched_policy;
+#else
+static const int sched_policy = KFD_SCHED_POLICY_HWS;
 #endif
 
 #ifdef CONFIG_DRM_AMDGPU_SI
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 740638e..3b5282b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1623,11 +1623,7 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
}
 
adev->pm.pp_feature = amdgpu_pp_feature_mask;
-   if (amdgpu_sriov_vf(adev)
-   #ifdef CONFIG_HSA_AMD
-   || sched_policy == KFD_SCHED_POLICY_NO_HWS
-   #endif
-   )
+   if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
 
for (i = 0; i < adev->num_ip_blocks; i++) {
-- 
2.7.4
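The trick in this patch — `extern` when the config option is on, a file-local `static const` fallback when it is off — keeps callers free of `#ifdef`s, because in the disabled case `sched_policy == KFD_SCHED_POLICY_NO_HWS` is a compile-time constant `false` and the compiler can fold the branch away. A minimal standalone sketch of the pattern; the enum values follow the KFD defaults, but the `PP_GFXOFF_MASK` value here is illustrative, not the real amdgpu definition:

```c
enum kfd_sched_policy {
	KFD_SCHED_POLICY_HWS = 0,
	KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION = 1,
	KFD_SCHED_POLICY_NO_HWS = 2,
};

#define PP_GFXOFF_MASK (1u << 15)	/* illustrative bit, not amdgpu's */

#ifdef CONFIG_HSA_AMD
extern int sched_policy;		/* real module parameter, defined elsewhere */
#else
/* Fallback: a constant carrying the default value, so callers can
 * reference sched_policy without any #ifdef of their own. */
static const int sched_policy = KFD_SCHED_POLICY_HWS;
#endif

/* Caller mirroring amdgpu_device_ip_early_init(): no #ifdef needed. */
unsigned int compute_pp_feature(unsigned int mask, int is_sriov_vf)
{
	if (is_sriov_vf || sched_policy == KFD_SCHED_POLICY_NO_HWS)
		mask &= ~PP_GFXOFF_MASK;
	return mask;
}
```

Note the choice of `static const` rather than `#define`: it keeps the same name and type as the real variable, so call sites compile identically whether or not CONFIG_HSA_AMD is set.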
