date:20210507

On Fri, May 7, 2021 at 4:59 PM Tejun Heo  wrote:
>
> Hello,
>
> On Fri, May 07, 2021 at 03:55:39PM -0400, Alex Deucher wrote:
> > The problem is temporal partitioning on GPUs is much harder to enforce
> > unless you have a special case like SR-IOV.  Spatial partitioning, on
> > AMD GPUs at least, is widely available and easily enforced.  What is
> > the point of implementing temporal style cgroups if no one can enforce
> > it effectively?
>
> So, if generic fine-grained partitioning can't be implemented, the right
> thing to do is stopping pushing for full-blown cgroup interface for it. The
> hardware simply isn't capable of being managed in a way which allows generic
> fine-grained hierarchical scheduling and there's no point in bloating the
> interface with half baked hardware dependent features.
>
> This isn't to say that there's no way to support them, but what have been
> being proposed is way too generic and ambitious in terms of interface while
> being poorly developed on the internal abstraction and mechanism front. If
> the hardware can't do generic, either implement the barest minimum interface
> (e.g. be a part of misc controller) or go driver-specific - the feature is
> hardware specific anyway. I've repeated this multiple times in these
> discussions now but it'd be really helpful to try to minimize the interace
> while concentrating more on internal abstractions and actual control
> mechanisms.

Maybe we are speaking past each other.  I'm not following.  We got
here because a device specific cgroup didn't make sense.  With my
Linux user hat on, that makes sense.  I don't want to write code to a
bunch of device specific interfaces if I can avoid it.  But as for
temporal vs spatial partitioning of the GPU, the argument seems to be
a sort of hand-wavy one that both spatial and temporal partitioning
make sense on CPUs, but only temporal partitioning makes sense on
GPUs.  I'm trying to understand that assertion.  There are some GPUs
that can more easily be temporally partitioned and some that can be
more easily spatially partitioned.  It doesn't seem any different than
CPUs.

Alex
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Tejun Heo

Hello,

On Fri, May 07, 2021 at 03:55:39PM -0400, Alex Deucher wrote:
> The problem is temporal partitioning on GPUs is much harder to enforce
> unless you have a special case like SR-IOV.  Spatial partitioning, on
> AMD GPUs at least, is widely available and easily enforced.  What is
> the point of implementing temporal style cgroups if no one can enforce
> it effectively?

So, if generic fine-grained partitioning can't be implemented, the right
thing to do is stopping pushing for full-blown cgroup interface for it. The
hardware simply isn't capable of being managed in a way which allows generic
fine-grained hierarchical scheduling and there's no point in bloating the
interface with half baked hardware dependent features.

This isn't to say that there's no way to support them, but what have been
being proposed is way too generic and ambitious in terms of interface while
being poorly developed on the internal abstraction and mechanism front. If
the hardware can't do generic, either implement the barest minimum interface
(e.g. be a part of misc controller) or go driver-specific - the feature is
hardware specific anyway. I've repeated this multiple times in these
discussions now but it'd be really helpful to try to minimize the interace
while concentrating more on internal abstractions and actual control
mechanisms.

Thanks.

-- 
tejun
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 1/2] drm/amdgpu/display: fix dal_allocation documentation

Ping?

Alex

On Fri, Apr 23, 2021 at 4:49 PM Alex Deucher  wrote:
>
> Add missing structure elements.
>
> Fixes: 1ace37b873c2 ("drm/amdgpu/display: Implement functions to let DC 
> allocate GPU memory")
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> index 77e338b3ab6b..d6a44b4fc472 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
> @@ -135,6 +135,10 @@ struct amdgpu_dm_backlight_caps {
>
>  /**
>   * struct dal_allocation - Tracks mapped FB memory for SMU communication
> + * @list: list of dal allocations
> + * @bo: GPU buffer object
> + * @cpu_ptr: CPU virtual address of the GPU buffer object
> + * @gpu_addr: GPU virtual address of the GPU buffer object
>   */
>  struct dal_allocation {
> struct list_head list;
> --
> 2.30.2
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/3] drm/amdgpu/display: remove an old DCN3 guard

The DCN3 guards were dropped a while ago, this one must have
snuck in in a merge or something.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index bdbc577be65c..73d41cdd98ba 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3720,10 +3720,8 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
 
/* Use Outbox interrupt */
switch (adev->asic_type) {
-#if defined(CONFIG_DRM_AMD_DC_DCN3_0)
case CHIP_SIENNA_CICHLID:
case CHIP_NAVY_FLOUNDER:
-#endif
case CHIP_RENOIR:
if (register_outbox_irq_handlers(dm->adev)) {
DRM_ERROR("DM: Failed to initialize IRQ\n");
-- 
2.30.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/3] drm/amdgpu/display: fix warning when CONFIG_DRM_AMD_DC_DCN is not defined

Fixes:
At top level:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:633:13: warning: 
‘dm_dmub_outbox1_low_irq’ defined but not used [-Wunused-function]
  633 | static void dm_dmub_outbox1_low_irq(void *interrupt_params)
  | ^~~

Fixes: 77a49c458931 ("drm/amd/display: Support for DMUB AUX")
Signed-off-by: Alex Deucher 
Cc: Jude Shih 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 73d41cdd98ba..77bde54c9515 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -620,7 +620,6 @@ static void dm_dcn_vertical_interrupt0_high_irq(void 
*interrupt_params)
amdgpu_dm_crtc_handle_crc_window_irq(&acrtc->base);
 }
 #endif
-#endif
 
 /**
  * dm_dmub_outbox1_low_irq() - Handles Outbox interrupt
@@ -673,6 +672,7 @@ static void dm_dmub_outbox1_low_irq(void *interrupt_params)
 
ASSERT(count <= DMUB_TRACE_MAX_READ);
 }
+#endif
 
 static int dm_set_clockgating_state(void *handle,
  enum amd_clockgating_state state)
-- 
2.30.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/3] drm/amdgpu/display: fix build when CONFIG_DRM_AMD_DC_DCN is not defined

Fixes:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In function 
‘amdgpu_dm_initialize_drm_device’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:3726:7: error: 
implicit declaration of function ‘register_outbox_irq_handlers’; did you mean 
‘register_hpd_handlers’? [-Werror=implicit-function-declaration]
 3726 |   if (register_outbox_irq_handlers(dm->adev)) {
  |   ^~~~
  |   register_hpd_handlers

Fixes: 77a49c458931 ("drm/amd/display: Support for DMUB AUX")
Signed-off-by: Alex Deucher 
Cc: Jude Shish 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 77bde54c9515..8ee9c03bf26c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3718,6 +3718,7 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
goto fail;
}
 
+#if defined(CONFIG_DRM_AMD_DC_DCN)
/* Use Outbox interrupt */
switch (adev->asic_type) {
case CHIP_SIENNA_CICHLID:
@@ -3731,6 +3732,7 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
default:
DRM_DEBUG_KMS("Unsupported ASIC type for outbox: 0x%X\n", 
adev->asic_type);
}
+#endif
 
/* loops over all connectors on the board */
for (i = 0; i < link_cnt; i++) {
-- 
2.30.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 7, 2021 at 3:33 PM Tejun Heo  wrote:
>
> Hello,
>
> On Fri, May 07, 2021 at 06:54:13PM +0200, Daniel Vetter wrote:
> > All I meant is that for the container/cgroups world starting out with
> > time-sharing feels like the best fit, least because your SRIOV designers
> > also seem to think that's the best first cut for cloud-y computing.
> > Whether it's virtualized or containerized is a distinction that's getting
> > ever more blurry, with virtualization become a lot more dynamic and
> > container runtimes als possibly using hw virtualization underneath.
>
> FWIW, I'm completely on the same boat. There are two fundamental issues with
> hardware-mask based control - control granularity and work conservation.
> Combined, they make it a significantly more difficult interface to use which
> requires hardware-specific tuning rather than simply being able to say "I
> wanna prioritize this job twice over that one".
>
> My knoweldge of gpus is really limited but my understanding is also that the
> gpu cores and threads aren't as homogeneous as the CPU counterparts across
> the vendors, product generations and possibly even within a single chip,
> which makes the problem even worse.
>
> Given that GPUs are time-shareable to begin with, the most universal
> solution seems pretty clear.

The problem is temporal partitioning on GPUs is much harder to enforce
unless you have a special case like SR-IOV.  Spatial partitioning, on
AMD GPUs at least, is widely available and easily enforced.  What is
the point of implementing temporal style cgroups if no one can enforce
it effectively?

Alex

>
> Thanks.
>
> --
> tejun
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/display: Expose active display color configurations to userspace

2021-05-07 Thread Werner Sembach

xrandr --prop and other userspace info tools have currently no way of
telling which color configuration is used on HDMI and DP ports.

The ongoing transsition from HDMI 1.4 to 2.0 and the different bandwidth
requirements of YCbCr 4:2:0 and RGB color format raise different
incompatibilities. Having these configuration information readily
available is a useful tool in debuging washed out colors, color artefacts
on small fonts and missing refreshrate options.

Signed-off-by: Werner Sembach 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 58 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  4 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 36 
 3 files changed, 98 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index f753e04fee99..c0404bcda31b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -986,6 +986,40 @@ static const struct drm_prop_enum_list 
amdgpu_dither_enum_list[] =
{ AMDGPU_FMT_DITHER_ENABLE, "on" },
 };
 
+static const struct drm_prop_enum_list 
amdgpu_active_pixel_encoding_enum_list[] = {
+   { PIXEL_ENCODING_UNDEFINED, "undefined" },
+   { PIXEL_ENCODING_RGB, "RGB" },
+   { PIXEL_ENCODING_YCBCR422, "YCbCr 4:2:2" },
+   { PIXEL_ENCODING_YCBCR444, "YCbCr 4:4:4" },
+   { PIXEL_ENCODING_YCBCR420, "YCbCr 4:2:0" },
+};
+
+static const struct drm_prop_enum_list 
amdgpu_active_display_color_depth_enum_list[] = {
+   { COLOR_DEPTH_UNDEFINED, "undefined" },
+   { COLOR_DEPTH_666, "6 bit" },
+   { COLOR_DEPTH_888, "8 bit" },
+   { COLOR_DEPTH_101010, "10 bit" },
+   { COLOR_DEPTH_121212, "12 bit" },
+   { COLOR_DEPTH_141414, "14 bit" },
+   { COLOR_DEPTH_161616, "16 bit" },
+   { COLOR_DEPTH_999, "9 bit" },
+   { COLOR_DEPTH_11, "11 bit" },
+};
+
+static const struct drm_prop_enum_list 
amdgpu_active_output_color_space_enum_list[] = {
+   { COLOR_SPACE_UNKNOWN, "unknown" },
+   { COLOR_SPACE_SRGB, "sRGB" },
+   { COLOR_SPACE_SRGB_LIMITED, "sRGB limited" },
+   { COLOR_SPACE_YCBCR601, "YCbCr 601" },
+   { COLOR_SPACE_YCBCR709, "YCbCr 709" },
+   { COLOR_SPACE_YCBCR601_LIMITED, "YCbCr 601 limited" },
+   { COLOR_SPACE_YCBCR709_LIMITED, "YCbCr 709 limited" },
+   { COLOR_SPACE_2020_RGB_FULLRANGE, "RGB 2020" },
+   { COLOR_SPACE_2020_RGB_LIMITEDRANGE, "RGB 2020 limited" },
+   { COLOR_SPACE_2020_YCBCR, "YCbCr 2020" },
+   { COLOR_SPACE_ADOBERGB, "Adobe RGB" },
+};
+
 int amdgpu_display_modeset_create_props(struct amdgpu_device *adev)
 {
int sz;
@@ -1038,6 +1072,30 @@ int amdgpu_display_modeset_create_props(struct 
amdgpu_device *adev)
  "abm level", 0, 4);
if (!adev->mode_info.abm_level_property)
return -ENOMEM;
+
+   sz = ARRAY_SIZE(amdgpu_active_pixel_encoding_enum_list);
+   adev->mode_info.active_pixel_encoding_property =
+   drm_property_create_enum(adev_to_drm(adev), 0,
+   "active pixel encoding",
+   amdgpu_active_pixel_encoding_enum_list, sz);
+   if (!adev->mode_info.active_pixel_encoding_property)
+   return -ENOMEM;
+
+   sz = ARRAY_SIZE(amdgpu_active_display_color_depth_enum_list);
+   adev->mode_info.active_display_color_depth_property =
+   drm_property_create_enum(adev_to_drm(adev), 0,
+   "active display color depth",
+   amdgpu_active_display_color_depth_enum_list, 
sz);
+   if (!adev->mode_info.active_display_color_depth_property)
+   return -ENOMEM;
+
+   sz = ARRAY_SIZE(amdgpu_active_output_color_space_enum_list);
+   adev->mode_info.active_output_color_space_property =
+   drm_property_create_enum(adev_to_drm(adev), 0,
+   "active output color space",
+   amdgpu_active_output_color_space_enum_list, sz);
+   if (!adev->mode_info.active_output_color_space_property)
+   return -ENOMEM;
}
 
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 319cb19e1b99..ad43af6a878d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -337,6 +337,10 @@ struct amdgpu_mode_info {
struct drm_property *dither_property;
/* Adaptive Backlight Modulation (power feature) */
struct drm_property *abm_level_property;
+   /* Color settings */
+   struct drm_property *active_pixel_encoding_property;
+   struct drm_property *active_display_color_depth_property;
+   struct drm_property *active_o

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Tejun Heo

Hello,

On Fri, May 07, 2021 at 06:54:13PM +0200, Daniel Vetter wrote:
> All I meant is that for the container/cgroups world starting out with
> time-sharing feels like the best fit, least because your SRIOV designers
> also seem to think that's the best first cut for cloud-y computing.
> Whether it's virtualized or containerized is a distinction that's getting
> ever more blurry, with virtualization become a lot more dynamic and
> container runtimes als possibly using hw virtualization underneath.

FWIW, I'm completely on the same boat. There are two fundamental issues with
hardware-mask based control - control granularity and work conservation.
Combined, they make it a significantly more difficult interface to use which
requires hardware-specific tuning rather than simply being able to say "I
wanna prioritize this job twice over that one".

My knoweldge of gpus is really limited but my understanding is also that the
gpu cores and threads aren't as homogeneous as the CPU counterparts across
the vendors, product generations and possibly even within a single chip,
which makes the problem even worse.

Given that GPUs are time-shareable to begin with, the most universal
solution seems pretty clear.

Thanks.

-- 
tejun
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amdgpu: Quit RAS initialization earlier if RAS is disabled

2021-05-07 Thread Zeng, Oak

Thank you Hawking for reviewing this. I made a mistake when I pushed this in. I 
forgot to add " Reviewed-by: Hawking Zhang ". 

Regards,
Oak 

 

On 2021-05-07, 9:05 AM, "Zhang, Hawking"  wrote:

[AMD Public Use]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Zeng, Oak  
Sent: Friday, May 7, 2021 09:15
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Lazar, Lijo 
; Clements, John ; Joshi, Mukul 
; Zeng, Oak 
Subject: [PATCH] drm/amdgpu: Quit RAS initialization earlier if RAS is 
disabled

If RAS is disabled through amdgpu_ras_enable kernel parameter, we should 
quit the RAS initialization eariler to avoid initialization of some RAS data 
structure such as sysfs etc.

Signed-off-by: Oak Zeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ebbe2c5..7e65b35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2155,7 +2155,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev)

amdgpu_ras_check_supported(adev, &con->hw_supported,
&con->supported);
-   if (!con->hw_supported || (adev->asic_type == CHIP_VEGA10)) {
+   if (!adev->ras_features || (adev->asic_type == CHIP_VEGA10)) {
/* set gfx block ras context feature for VEGA20 Gaming
 * send ras disable cmd to ras ta during ras late init.
 */
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 0/4] drm: Mark DRM's AGP code as legacy

On Fri, May 7, 2021 at 2:57 PM Thomas Zimmermann  wrote:
>
> This patch moves the DRM core's AGP code behind CONFIG_DRM_LEGACY. The
> only use besides legacy, UMS drivers is radeon, which can implement the
> required functionality by itself.
>
> This patchset has no impact on the AGP support of existing drivers.
>
> Patches 1 and 2 move some AGP code from DRM core into radeon. Radeon
> uses some of the AGP code for its internal purposes. But being a KMS
> driver, there's no reason why radeon should provide the rsp AGP ioctls.
> So duplicate the implementation in radeon and thus uncould it from
> the legacy code.
>
> Patch 3 moves some AGP-related PCI helpers behind CONFIG_DRM_LEGACY.
>
> Patch 4 moves DRM's AGP code behind CONFIG_DRM_LEGACY. The files are
> then only build when legacy drivers are active.
>
> Built-tested with different config options selected.
>
> Thomas Zimmermann (4):
>   drm/radeon: Move AGP helpers into radeon driver
>   drm/radeon: Move AGP data structures into radeon
>   drm: Mark PCI AGP helpers as legacy
>   drm: Mark AGP implementation and ioctls as legacy

Series is:
Reviewed-by: Alex Deucher 

I'm fine to have this merged through drm-misc.

Alex


>
>  drivers/gpu/drm/Makefile|   6 +-
>  drivers/gpu/drm/drm_agpsupport.c|  99 ---
>  drivers/gpu/drm/drm_bufs.c  |   1 -
>  drivers/gpu/drm/drm_drv.c   |   4 +-
>  drivers/gpu/drm/drm_internal.h  |   5 --
>  drivers/gpu/drm/drm_ioc32.c |  19 +++--
>  drivers/gpu/drm/drm_ioctl.c |  24 +++---
>  drivers/gpu/drm/drm_legacy.h|  30 +++
>  drivers/gpu/drm/drm_legacy_misc.c   |   1 -
>  drivers/gpu/drm/drm_memory.c|   1 -
>  drivers/gpu/drm/drm_pci.c   |  23 +++---
>  drivers/gpu/drm/drm_vm.c|   2 -
>  drivers/gpu/drm/i810/i810_dma.c |   3 +-
>  drivers/gpu/drm/mga/mga_dma.c   |  16 ++--
>  drivers/gpu/drm/mga/mga_drv.h   |   1 -
>  drivers/gpu/drm/r128/r128_cce.c |   2 +-
>  drivers/gpu/drm/radeon/radeon.h |  42 ++
>  drivers/gpu/drm/radeon/radeon_agp.c | 118 
>  drivers/gpu/drm/radeon/radeon_drv.c |  13 ---
>  drivers/gpu/drm/radeon/radeon_kms.c |  18 +++--
>  drivers/gpu/drm/radeon/radeon_ttm.c |   6 +-
>  drivers/gpu/drm/via/via_dma.c   |   1 -
>  include/drm/drm_agpsupport.h| 117 ---
>  include/drm/drm_device.h|   6 +-
>  include/drm/drm_legacy.h|  82 +++
>  25 files changed, 375 insertions(+), 265 deletions(-)
>  delete mode 100644 include/drm/drm_agpsupport.h
>
> --
> 2.31.1
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/2] drm/amdkfd: unregistered range accessible by all GPUs

2021-05-07 Thread Philip Yang

New range is created to recover retry vm fault, set all GPUs have access
to the range. The new range preferred_loc is default value
KFD_IOCTL_SVM_LOCATION_UNDEFINED.

Correct one typo.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index d9111fea724b..537b12e75f54 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2243,7 +2243,7 @@ svm_range *svm_range_create_unregistered_range(struct 
amdgpu_device *adev,
 
prange = svm_range_new(&p->svms, start, last);
if (!prange) {
-   pr_debug("Failed to create prange in address [0x%llx]\\n", 
addr);
+   pr_debug("Failed to create prange in address [0x%llx]\n", addr);
return NULL;
}
if (kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpuidx)) {
@@ -2251,9 +2251,8 @@ svm_range *svm_range_create_unregistered_range(struct 
amdgpu_device *adev,
svm_range_free(prange);
return NULL;
}
-   prange->preferred_loc = gpuid;
-   prange->actual_loc = 0;
-   /* Gurantee prange is migrate it */
+
+   bitmap_fill(prange->bitmap_access, MAX_GPU_INSTANCE);
svm_range_add_to_svms(prange);
svm_range_add_notifier_locked(mm, prange);
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 0/4] drm: Mark DRM's AGP code as legacy

2021-05-07 Thread Christian König


Acked-by: Christian König 

Am 07.05.21 um 20:57 schrieb Thomas Zimmermann:

This patch moves the DRM core's AGP code behind CONFIG_DRM_LEGACY. The
only use besides legacy, UMS drivers is radeon, which can implement the
required functionality by itself.

This patchset has no impact on the AGP support of existing drivers.

Patches 1 and 2 move some AGP code from DRM core into radeon. Radeon
uses some of the AGP code for its internal purposes. But being a KMS
driver, there's no reason why radeon should provide the rsp AGP ioctls.
So duplicate the implementation in radeon and thus uncould it from
the legacy code.

Patch 3 moves some AGP-related PCI helpers behind CONFIG_DRM_LEGACY.

Patch 4 moves DRM's AGP code behind CONFIG_DRM_LEGACY. The files are
then only build when legacy drivers are active.

Built-tested with different config options selected.

Thomas Zimmermann (4):
   drm/radeon: Move AGP helpers into radeon driver
   drm/radeon: Move AGP data structures into radeon
   drm: Mark PCI AGP helpers as legacy
   drm: Mark AGP implementation and ioctls as legacy

  drivers/gpu/drm/Makefile|   6 +-
  drivers/gpu/drm/drm_agpsupport.c|  99 ---
  drivers/gpu/drm/drm_bufs.c  |   1 -
  drivers/gpu/drm/drm_drv.c   |   4 +-
  drivers/gpu/drm/drm_internal.h  |   5 --
  drivers/gpu/drm/drm_ioc32.c |  19 +++--
  drivers/gpu/drm/drm_ioctl.c |  24 +++---
  drivers/gpu/drm/drm_legacy.h|  30 +++
  drivers/gpu/drm/drm_legacy_misc.c   |   1 -
  drivers/gpu/drm/drm_memory.c|   1 -
  drivers/gpu/drm/drm_pci.c   |  23 +++---
  drivers/gpu/drm/drm_vm.c|   2 -
  drivers/gpu/drm/i810/i810_dma.c |   3 +-
  drivers/gpu/drm/mga/mga_dma.c   |  16 ++--
  drivers/gpu/drm/mga/mga_drv.h   |   1 -
  drivers/gpu/drm/r128/r128_cce.c |   2 +-
  drivers/gpu/drm/radeon/radeon.h |  42 ++
  drivers/gpu/drm/radeon/radeon_agp.c | 118 
  drivers/gpu/drm/radeon/radeon_drv.c |  13 ---
  drivers/gpu/drm/radeon/radeon_kms.c |  18 +++--
  drivers/gpu/drm/radeon/radeon_ttm.c |   6 +-
  drivers/gpu/drm/via/via_dma.c   |   1 -
  include/drm/drm_agpsupport.h| 117 ---
  include/drm/drm_device.h|   6 +-
  include/drm/drm_legacy.h|  82 +++
  25 files changed, 375 insertions(+), 265 deletions(-)
  delete mode 100644 include/drm/drm_agpsupport.h

--
2.31.1



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 2/4] drm/radeon: Move AGP data structures into radeon

With the AGP code already duplicated, move over the AGP structures
from the legacy code base in to radeon. The AGP data structures that
are required by radeon are now declared within the driver. The AGP
instance is stored in struct radeon_device.agp.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/radeon/radeon.h | 38 +++-
 drivers/gpu/drm/radeon/radeon_agp.c | 70 ++---
 drivers/gpu/drm/radeon/radeon_drv.c | 13 --
 drivers/gpu/drm/radeon/radeon_kms.c | 18 +---
 drivers/gpu/drm/radeon/radeon_ttm.c |  6 +--
 5 files changed, 86 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 4f9e8dc460be..80d7637f0c27 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -60,6 +60,7 @@
  *  are considered as fatal)
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -1110,10 +,42 @@ typedef int (*radeon_packet0_check_t)(struct 
radeon_cs_parser *p,
 /*
  * AGP
  */
+
+struct radeon_agp_mode {
+   unsigned long mode; /**< AGP mode */
+};
+
+struct radeon_agp_info {
+   int agp_version_major;
+   int agp_version_minor;
+   unsigned long mode;
+   unsigned long aperture_base;/* physical address */
+   unsigned long aperture_size;/* bytes */
+   unsigned long memory_allowed;   /* bytes */
+   unsigned long memory_used;
+
+   /* PCI information */
+   unsigned short id_vendor;
+   unsigned short id_device;
+};
+
+struct radeon_agp_head {
+   struct agp_kern_info agp_info;
+   struct list_head memory;
+   unsigned long mode;
+   struct agp_bridge_data *bridge;
+   int enabled;
+   int acquired;
+   unsigned long base;
+   int agp_mtrr;
+   int cant_use_aperture;
+   unsigned long page_mask;
+};
+
 #if IS_ENABLED(CONFIG_AGP)
-struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev);
+struct radeon_agp_head *radeon_agp_head_init(struct drm_device *dev);
 #else
-static inline struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev)
+static inline struct radeon_agp_head *radeon_agp_head_init(struct drm_device 
*dev)
 {
return NULL;
 }
@@ -2310,6 +2343,7 @@ struct radeon_device {
 #ifdef __alpha__
struct pci_controller   *hose;
 #endif
+   struct radeon_agp_head  *agp;
struct rw_semaphore exclusive_lock;
/* ASIC */
union radeon_asic_configconfig;
diff --git a/drivers/gpu/drm/radeon/radeon_agp.c 
b/drivers/gpu/drm/radeon/radeon_agp.c
index 398be13c8e2b..d124600b5f58 100644
--- a/drivers/gpu/drm/radeon/radeon_agp.c
+++ b/drivers/gpu/drm/radeon/radeon_agp.c
@@ -27,7 +27,6 @@
 
 #include 
 
-#include 
 #include 
 #include 
 
@@ -128,10 +127,10 @@ static struct radeon_agpmode_quirk 
radeon_agpmode_quirk_list[] = {
{ 0, 0, 0, 0, 0, 0, 0 },
 };
 
-struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev)
+struct radeon_agp_head *radeon_agp_head_init(struct drm_device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev->dev);
-   struct drm_agp_head *head = NULL;
+   struct radeon_agp_head *head = NULL;
 
head = kzalloc(sizeof(*head), GFP_KERNEL);
if (!head)
@@ -160,49 +159,50 @@ struct drm_agp_head *radeon_agp_head_init(struct 
drm_device *dev)
return head;
 }
 
-static int radeon_agp_head_acquire(struct drm_device *dev)
+static int radeon_agp_head_acquire(struct radeon_device *rdev)
 {
+   struct drm_device *dev = rdev->ddev;
struct pci_dev *pdev = to_pci_dev(dev->dev);
 
-   if (!dev->agp)
+   if (!rdev->agp)
return -ENODEV;
-   if (dev->agp->acquired)
+   if (rdev->agp->acquired)
return -EBUSY;
-   dev->agp->bridge = agp_backend_acquire(pdev);
-   if (!dev->agp->bridge)
+   rdev->agp->bridge = agp_backend_acquire(pdev);
+   if (!rdev->agp->bridge)
return -ENODEV;
-   dev->agp->acquired = 1;
+   rdev->agp->acquired = 1;
return 0;
 }
 
-static int radeon_agp_head_release(struct drm_device *dev)
+static int radeon_agp_head_release(struct radeon_device *rdev)
 {
-   if (!dev->agp || !dev->agp->acquired)
+   if (!rdev->agp || !rdev->agp->acquired)
return -EINVAL;
-   agp_backend_release(dev->agp->bridge);
-   dev->agp->acquired = 0;
+   agp_backend_release(rdev->agp->bridge);
+   rdev->agp->acquired = 0;
return 0;
 }
 
-static int radeon_agp_head_enable(struct drm_device *dev, struct drm_agp_mode 
mode)
+static int radeon_agp_head_enable(struct radeon_device *rdev, struct 
radeon_agp_mode mode)
 {
-   if (!dev->agp || !dev->agp->acquired)
+   if (!rdev->agp || !rdev->agp->acquired)
return -EINVAL;
 
-   dev->agp->mode = mode.mode;
-   agp_enable(dev->agp->bridge, mode.mode);
-   dev->agp->enabled = 1;
+   rdev->agp->mode = mode.

[PATCH 1/4] drm/radeon: Move AGP helpers into radeon driver

Radeon calls DRMs core AGP helpers. These helpers are only required
by legacy drivers. Reimplement the code in radeon to uncouple radeon
from the legacy code.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/radeon/radeon.h |   8 +++
 drivers/gpu/drm/radeon/radeon_agp.c | 102 ++--
 drivers/gpu/drm/radeon/radeon_drv.c |   2 +-
 3 files changed, 104 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 42281fce552e..4f9e8dc460be 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1110,6 +1110,14 @@ typedef int (*radeon_packet0_check_t)(struct 
radeon_cs_parser *p,
 /*
  * AGP
  */
+#if IS_ENABLED(CONFIG_AGP)
+struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev);
+#else
+static inline struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev)
+{
+   return NULL;
+}
+#endif
 int radeon_agp_init(struct radeon_device *rdev);
 void radeon_agp_resume(struct radeon_device *rdev);
 void radeon_agp_suspend(struct radeon_device *rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_agp.c 
b/drivers/gpu/drm/radeon/radeon_agp.c
index 0aca7bdf54c7..398be13c8e2b 100644
--- a/drivers/gpu/drm/radeon/radeon_agp.c
+++ b/drivers/gpu/drm/radeon/radeon_agp.c
@@ -127,6 +127,94 @@ static struct radeon_agpmode_quirk 
radeon_agpmode_quirk_list[] = {
PCI_VENDOR_ID_SONY, 0x8175, 1},
{ 0, 0, 0, 0, 0, 0, 0 },
 };
+
+struct drm_agp_head *radeon_agp_head_init(struct drm_device *dev)
+{
+   struct pci_dev *pdev = to_pci_dev(dev->dev);
+   struct drm_agp_head *head = NULL;
+
+   head = kzalloc(sizeof(*head), GFP_KERNEL);
+   if (!head)
+   return NULL;
+   head->bridge = agp_find_bridge(pdev);
+   if (!head->bridge) {
+   head->bridge = agp_backend_acquire(pdev);
+   if (!head->bridge) {
+   kfree(head);
+   return NULL;
+   }
+   agp_copy_info(head->bridge, &head->agp_info);
+   agp_backend_release(head->bridge);
+   } else {
+   agp_copy_info(head->bridge, &head->agp_info);
+   }
+   if (head->agp_info.chipset == NOT_SUPPORTED) {
+   kfree(head);
+   return NULL;
+   }
+   INIT_LIST_HEAD(&head->memory);
+   head->cant_use_aperture = head->agp_info.cant_use_aperture;
+   head->page_mask = head->agp_info.page_mask;
+   head->base = head->agp_info.aper_base;
+
+   return head;
+}
+
+static int radeon_agp_head_acquire(struct drm_device *dev)
+{
+   struct pci_dev *pdev = to_pci_dev(dev->dev);
+
+   if (!dev->agp)
+   return -ENODEV;
+   if (dev->agp->acquired)
+   return -EBUSY;
+   dev->agp->bridge = agp_backend_acquire(pdev);
+   if (!dev->agp->bridge)
+   return -ENODEV;
+   dev->agp->acquired = 1;
+   return 0;
+}
+
+static int radeon_agp_head_release(struct drm_device *dev)
+{
+   if (!dev->agp || !dev->agp->acquired)
+   return -EINVAL;
+   agp_backend_release(dev->agp->bridge);
+   dev->agp->acquired = 0;
+   return 0;
+}
+
+static int radeon_agp_head_enable(struct drm_device *dev, struct drm_agp_mode 
mode)
+{
+   if (!dev->agp || !dev->agp->acquired)
+   return -EINVAL;
+
+   dev->agp->mode = mode.mode;
+   agp_enable(dev->agp->bridge, mode.mode);
+   dev->agp->enabled = 1;
+   return 0;
+}
+
+static int radeon_agp_head_info(struct drm_device *dev, struct drm_agp_info 
*info)
+{
+   struct agp_kern_info *kern;
+
+   if (!dev->agp || !dev->agp->acquired)
+   return -EINVAL;
+
+   kern = &dev->agp->agp_info;
+   info->agp_version_major = kern->version.major;
+   info->agp_version_minor = kern->version.minor;
+   info->mode = kern->mode;
+   info->aperture_base = kern->aper_base;
+   info->aperture_size = kern->aper_size * 1024 * 1024;
+   info->memory_allowed = kern->max_memory << PAGE_SHIFT;
+   info->memory_used = kern->current_memory << PAGE_SHIFT;
+   info->id_vendor = kern->device->vendor;
+   info->id_device = kern->device->device;
+
+   return 0;
+}
 #endif
 
 int radeon_agp_init(struct radeon_device *rdev)
@@ -141,21 +229,21 @@ int radeon_agp_init(struct radeon_device *rdev)
int ret;
 
/* Acquire AGP. */
-   ret = drm_agp_acquire(rdev->ddev);
+   ret = radeon_agp_head_acquire(rdev->ddev);
if (ret) {
DRM_ERROR("Unable to acquire AGP: %d\n", ret);
return ret;
}
 
-   ret = drm_agp_info(rdev->ddev, &info);
+   ret = radeon_agp_head_info(rdev->ddev, &info);
if (ret) {
-   drm_agp_release(rdev->ddev);
+   radeon_agp_head_release(rdev->ddev);
DRM_ERROR("Unable to get AGP info: %d\n", ret);
return ret;
}
 
if (rdev->

[PATCH 0/4] drm: Mark DRM's AGP code as legacy

This patch moves the DRM core's AGP code behind CONFIG_DRM_LEGACY. The
only use besides legacy, UMS drivers is radeon, which can implement the
required functionality by itself.

This patchset has no impact on the AGP support of existing drivers.

Patches 1 and 2 move some AGP code from DRM core into radeon. Radeon
uses some of the AGP code for its internal purposes. But being a KMS
driver, there's no reason why radeon should provide the rsp AGP ioctls.
So duplicate the implementation in radeon and thus uncould it from
the legacy code.

Patch 3 moves some AGP-related PCI helpers behind CONFIG_DRM_LEGACY.

Patch 4 moves DRM's AGP code behind CONFIG_DRM_LEGACY. The files are
then only build when legacy drivers are active.

Built-tested with different config options selected.

Thomas Zimmermann (4):
  drm/radeon: Move AGP helpers into radeon driver
  drm/radeon: Move AGP data structures into radeon
  drm: Mark PCI AGP helpers as legacy
  drm: Mark AGP implementation and ioctls as legacy

 drivers/gpu/drm/Makefile|   6 +-
 drivers/gpu/drm/drm_agpsupport.c|  99 ---
 drivers/gpu/drm/drm_bufs.c  |   1 -
 drivers/gpu/drm/drm_drv.c   |   4 +-
 drivers/gpu/drm/drm_internal.h  |   5 --
 drivers/gpu/drm/drm_ioc32.c |  19 +++--
 drivers/gpu/drm/drm_ioctl.c |  24 +++---
 drivers/gpu/drm/drm_legacy.h|  30 +++
 drivers/gpu/drm/drm_legacy_misc.c   |   1 -
 drivers/gpu/drm/drm_memory.c|   1 -
 drivers/gpu/drm/drm_pci.c   |  23 +++---
 drivers/gpu/drm/drm_vm.c|   2 -
 drivers/gpu/drm/i810/i810_dma.c |   3 +-
 drivers/gpu/drm/mga/mga_dma.c   |  16 ++--
 drivers/gpu/drm/mga/mga_drv.h   |   1 -
 drivers/gpu/drm/r128/r128_cce.c |   2 +-
 drivers/gpu/drm/radeon/radeon.h |  42 ++
 drivers/gpu/drm/radeon/radeon_agp.c | 118 
 drivers/gpu/drm/radeon/radeon_drv.c |  13 ---
 drivers/gpu/drm/radeon/radeon_kms.c |  18 +++--
 drivers/gpu/drm/radeon/radeon_ttm.c |   6 +-
 drivers/gpu/drm/via/via_dma.c   |   1 -
 include/drm/drm_agpsupport.h| 117 ---
 include/drm/drm_device.h|   6 +-
 include/drm/drm_legacy.h|  82 +++
 25 files changed, 375 insertions(+), 265 deletions(-)
 delete mode 100644 include/drm/drm_agpsupport.h

--
2.31.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 3/4] drm: Mark PCI AGP helpers as legacy

DRM's AGP helpers for PCI are only required by legacy drivers. Put them
behind CONFIG_DRM_LEGACY and add the _legacy_ infix.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/drm_drv.c  |  4 +---
 drivers/gpu/drm/drm_internal.h |  5 -
 drivers/gpu/drm/drm_legacy.h   |  6 ++
 drivers/gpu/drm/drm_pci.c  | 20 ++--
 4 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index c2f78dee9f2d..3d8d68a98b95 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -941,9 +941,7 @@ void drm_dev_unregister(struct drm_device *dev)
if (dev->driver->unload)
dev->driver->unload(dev);
 
-   if (dev->agp)
-   drm_pci_agp_destroy(dev);
-
+   drm_legacy_pci_agp_destroy(dev);
drm_legacy_rmmaps(dev);
 
remove_compat_control_link(dev);
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 1265de2b9d90..1dcb5797a3bb 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -56,7 +56,6 @@ void drm_lastclose(struct drm_device *dev);
 /* drm_pci.c */
 int drm_legacy_irq_by_busid(struct drm_device *dev, void *data,
struct drm_file *file_priv);
-void drm_pci_agp_destroy(struct drm_device *dev);
 int drm_pci_set_busid(struct drm_device *dev, struct drm_master *master);
 
 #else
@@ -67,10 +66,6 @@ static inline int drm_legacy_irq_by_busid(struct drm_device 
*dev, void *data,
return -EINVAL;
 }
 
-static inline void drm_pci_agp_destroy(struct drm_device *dev)
-{
-}
-
 static inline int drm_pci_set_busid(struct drm_device *dev,
struct drm_master *master)
 {
diff --git a/drivers/gpu/drm/drm_legacy.h b/drivers/gpu/drm/drm_legacy.h
index f71358f9eac9..ae2d7d2a31c7 100644
--- a/drivers/gpu/drm/drm_legacy.h
+++ b/drivers/gpu/drm/drm_legacy.h
@@ -211,4 +211,10 @@ void drm_master_legacy_init(struct drm_master *master);
 static inline void drm_master_legacy_init(struct drm_master *master) {}
 #endif
 
+#if IS_ENABLED(CONFIG_DRM_LEGACY) && IS_ENABLED(CONFIG_PCI)
+void drm_legacy_pci_agp_destroy(struct drm_device *dev);
+#else
+static inline void drm_legacy_pci_agp_destroy(struct drm_device *dev) {}
+#endif
+
 #endif /* __DRM_LEGACY_H__ */
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index 03bd863ff0b2..6e9af8b40419 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -119,7 +119,9 @@ int drm_legacy_irq_by_busid(struct drm_device *dev, void 
*data,
return drm_pci_irq_by_busid(dev, p);
 }
 
-void drm_pci_agp_destroy(struct drm_device *dev)
+#ifdef CONFIG_DRM_LEGACY
+
+void drm_legacy_pci_agp_destroy(struct drm_device *dev)
 {
if (dev->agp) {
arch_phys_wc_del(dev->agp->agp_mtrr);
@@ -129,9 +131,7 @@ void drm_pci_agp_destroy(struct drm_device *dev)
}
 }
 
-#ifdef CONFIG_DRM_LEGACY
-
-static void drm_pci_agp_init(struct drm_device *dev)
+static void drm_legacy_pci_agp_init(struct drm_device *dev)
 {
if (drm_core_check_feature(dev, DRIVER_USE_AGP)) {
if (pci_find_capability(to_pci_dev(dev->dev), PCI_CAP_ID_AGP))
@@ -145,9 +145,9 @@ static void drm_pci_agp_init(struct drm_device *dev)
}
 }
 
-static int drm_get_pci_dev(struct pci_dev *pdev,
-  const struct pci_device_id *ent,
-  const struct drm_driver *driver)
+static int drm_legacy_get_pci_dev(struct pci_dev *pdev,
+ const struct pci_device_id *ent,
+ const struct drm_driver *driver)
 {
struct drm_device *dev;
int ret;
@@ -169,7 +169,7 @@ static int drm_get_pci_dev(struct pci_dev *pdev,
if (drm_core_check_feature(dev, DRIVER_MODESET))
pci_set_drvdata(pdev, dev);
 
-   drm_pci_agp_init(dev);
+   drm_legacy_pci_agp_init(dev);
 
ret = drm_dev_register(dev, ent->driver_data);
if (ret)
@@ -184,7 +184,7 @@ static int drm_get_pci_dev(struct pci_dev *pdev,
return 0;
 
 err_agp:
-   drm_pci_agp_destroy(dev);
+   drm_legacy_pci_agp_destroy(dev);
pci_disable_device(pdev);
 err_free:
drm_dev_put(dev);
@@ -231,7 +231,7 @@ int drm_legacy_pci_init(const struct drm_driver *driver,
 
/* stealth mode requires a manual probe */
pci_dev_get(pdev);
-   drm_get_pci_dev(pdev, pid, driver);
+   drm_legacy_get_pci_dev(pdev, pid, driver);
}
}
return 0;
-- 
2.31.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 4/4] drm: Mark AGP implementation and ioctls as legacy

Only UMs drivers use DRM's core AGP code and ioctls. Mark the icotls
as legacy. Add the _legacy_ infix to all AGP functions. Move the
declarations to the public and internal legacy header files. The agp
field in struct drm_device is now located in the structure's legacy
section. Adapt drivers to the changes.

AGP code now depends on CONFIG_DRM_LEGACY.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/Makefile  |   6 +-
 drivers/gpu/drm/drm_agpsupport.c  |  99 +
 drivers/gpu/drm/drm_bufs.c|   1 -
 drivers/gpu/drm/drm_ioc32.c   |  19 +++--
 drivers/gpu/drm/drm_ioctl.c   |  24 +++---
 drivers/gpu/drm/drm_legacy.h  |  24 ++
 drivers/gpu/drm/drm_legacy_misc.c |   1 -
 drivers/gpu/drm/drm_memory.c  |   1 -
 drivers/gpu/drm/drm_pci.c |   3 +-
 drivers/gpu/drm/drm_vm.c  |   2 -
 drivers/gpu/drm/i810/i810_dma.c   |   3 +-
 drivers/gpu/drm/mga/mga_dma.c |  16 ++--
 drivers/gpu/drm/mga/mga_drv.h |   1 -
 drivers/gpu/drm/r128/r128_cce.c   |   2 +-
 drivers/gpu/drm/via/via_dma.c |   1 -
 include/drm/drm_agpsupport.h  | 117 --
 include/drm/drm_device.h  |   6 +-
 include/drm/drm_legacy.h  |  82 +
 18 files changed, 198 insertions(+), 210 deletions(-)
 delete mode 100644 include/drm/drm_agpsupport.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 89e747fedc00..a91cc7684904 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -20,15 +20,15 @@ drm-y   :=  drm_aperture.o drm_auth.o drm_cache.o \
drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
drm_managed.o drm_vblank_work.o
 
-drm-$(CONFIG_DRM_LEGACY) += drm_bufs.o drm_context.o drm_dma.o 
drm_legacy_misc.o drm_lock.o \
-   drm_memory.o drm_scatter.o drm_vm.o
+drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
drm_dma.o \
+   drm_legacy_misc.o drm_lock.o drm_memory.o 
drm_scatter.o \
+   drm_vm.o
 drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
 drm-$(CONFIG_COMPAT) += drm_ioc32.o
 drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o
 drm-$(CONFIG_DRM_GEM_SHMEM_HELPER) += drm_gem_shmem_helper.o
 drm-$(CONFIG_DRM_PANEL) += drm_panel.o
 drm-$(CONFIG_OF) += drm_of.o
-drm-$(CONFIG_AGP) += drm_agpsupport.o
 drm-$(CONFIG_PCI) += drm_pci.o
 drm-$(CONFIG_DEBUG_FS) += drm_debugfs.o drm_debugfs_crc.o
 drm-$(CONFIG_DRM_LOAD_EDID_FIRMWARE) += drm_edid_load.o
diff --git a/drivers/gpu/drm/drm_agpsupport.c b/drivers/gpu/drm/drm_agpsupport.c
index 5311d03d49cc..07c10443770e 100644
--- a/drivers/gpu/drm/drm_agpsupport.c
+++ b/drivers/gpu/drm/drm_agpsupport.c
@@ -37,7 +37,6 @@
 
 #include 
 
-#include 
 #include 
 #include 
 #include 
@@ -45,6 +44,8 @@
 
 #include "drm_legacy.h"
 
+#if IS_ENABLED(CONFIG_AGP)
+
 /*
  * Get AGP information.
  *
@@ -53,7 +54,7 @@
  * Verifies the AGP device has been initialized and acquired and fills in the
  * drm_agp_info structure with the information in drm_agp_head::agp_info.
  */
-int drm_agp_info(struct drm_device *dev, struct drm_agp_info *info)
+int drm_legacy_agp_info(struct drm_device *dev, struct drm_agp_info *info)
 {
struct agp_kern_info *kern;
 
@@ -73,15 +74,15 @@ int drm_agp_info(struct drm_device *dev, struct 
drm_agp_info *info)
 
return 0;
 }
-EXPORT_SYMBOL(drm_agp_info);
+EXPORT_SYMBOL(drm_legacy_agp_info);
 
-int drm_agp_info_ioctl(struct drm_device *dev, void *data,
-  struct drm_file *file_priv)
+int drm_legacy_agp_info_ioctl(struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
 {
struct drm_agp_info *info = data;
int err;
 
-   err = drm_agp_info(dev, info);
+   err = drm_legacy_agp_info(dev, info);
if (err)
return err;
 
@@ -97,7 +98,7 @@ int drm_agp_info_ioctl(struct drm_device *dev, void *data,
  * Verifies the AGP device hasn't been acquired before and calls
  * \c agp_backend_acquire.
  */
-int drm_agp_acquire(struct drm_device *dev)
+int drm_legacy_agp_acquire(struct drm_device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev->dev);
 
@@ -111,7 +112,7 @@ int drm_agp_acquire(struct drm_device *dev)
dev->agp->acquired = 1;
return 0;
 }
-EXPORT_SYMBOL(drm_agp_acquire);
+EXPORT_SYMBOL(drm_legacy_agp_acquire);
 
 /*
  * Acquire the AGP device (ioctl).
@@ -121,10 +122,10 @@ EXPORT_SYMBOL(drm_agp_acquire);
  * Verifies the AGP device hasn't been acquired before and calls
  * \c agp_backend_acquire.
  */
-int drm_agp_acquire_ioctl(struct drm_device *dev, void *data,
- struct drm_file *file_priv)
+int drm_legacy_agp_acquire_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file_priv)
 {
-   return drm_agp_acquire((struct drm_device *) file_priv->minor->dev);
+   return drm_legacy_agp_acquire((st

RE: [PATCH 00/14] DC Patches May 10, 2021

2021-05-07 Thread Wheeler, Daniel

[AMD Public Use]

Hi all,
 
This week this patchset was tested on the following systems:

HP Envy 360, with Ryzen 5 4500U, on the following display types: eDP 1080p 
60hz, 4k 60hz  (via USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI), 
1680*1050 60hz (via USB-C to DP and then DP to DVI/VGA)
 
Sapphire Pulse RX5700XT on the following display types:
4k 60hz  (via DP/HDMI), 1440p 144hz (via DP/HDMI), 1680*1050 60hz (via DP to 
DVI/VGA)
 
Reference AMD RX6800 on the following display types:
4k 60hz  (via DP/HDMI and USB-C to DP/HDMI), 1440p 144hz (via USB-C to DP/HDMI 
and USB-C to DP/HDMI), 1680*1050 60hz (via DP to DVI/VGA)
 
Included testing using a Startech DP 1.4 MST hub at 2x 4k 60hz on all systems.
 
Tested-by: Daniel Wheeler 

 
Thank you,
 
Dan Wheeler
Technologist  |  AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
Facebook |  Twitter |  amd.com  


-Original Message-
From: amd-gfx  On Behalf Of Stylon Wang
Sent: May 7, 2021 10:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Chao-kai (Stylon) ; Brol, Eryk 
; Li, Sun peng (Leo) ; Wentland, Harry 
; Zhuo, Qingqing ; Siqueira, 
Rodrigo ; Jacob, Anson ; Pillai, 
Aurabindo ; Lakha, Bhawanpreet 
; R, Bindu 
Subject: [PATCH 00/14] DC Patches May 10, 2021

This DC patchset brings improvements in multiple areas. In summary, we
highlight:

* DC v3.2.135.1
* Improvements across DP, DPP, clock management, pixel formats

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.65

Anthony Wang (1):
  drm/amd/display: Handle potential dpp_inst mismatch with pipe_idx

Aric Cyr (2):
  drm/amd/display: 3.2.135
  drm/amd/display: 3.2.135.1

Chaitanya Dhere (1):
  drm/amd/display: DETBufferSizeInKbyte variable type modifications

Dmytro Laktyushkin (1):
  drm/amd/display: fix use_max_lb flag for 420 pixel formats

Fangzhi Zuo (1):
  drm/amd/display: Add dc log for DP SST DSC enable/disable

Ilya Bakoulin (2):
  drm/amd/display: Fix clock table filling logic
  drm/amd/display: Handle pixel format test request

Jimmy Kizito (4):
  drm/amd/display: Update DPRX detection.
  drm/amd/display: Update setting of DP training parameters.
  drm/amd/display: Add fallback and abort paths for DP link training.
  drm/amd/display: Expand DP module training API.

Wenjing Liu (1):
  drm/amd/display: minor dp link training refactor

 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |   6 +-
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c |  86 ---  
drivers/gpu/drm/amd/display/dc/core/dc_link.c |  49 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |   4 +
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 211 --  
.../drm/amd/display/dc/core/dc_link_enc_cfg.c |  22 +-
 .../drm/amd/display/dc/core/dc_link_hwss.c|   3 +-
 drivers/gpu/drm/amd/display/dc/dc.h   |   2 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |   1 +
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +-
 .../drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c |   9 +-
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |  33 ++-
 .../dc/dml/dcn20/display_mode_vba_20.c|  26 +--
 .../dc/dml/dcn20/display_mode_vba_20v2.c  |  26 +--
 .../dc/dml/dcn21/display_mode_vba_21.c|  58 ++---
 .../dc/dml/dcn30/display_mode_vba_30.c|  48 ++--
 .../drm/amd/display/dc/dml/display_mode_vba.c |   2 +-
 .../drm/amd/display/dc/dml/display_mode_vba.h |  14 +-
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |  10 +-
 .../gpu/drm/amd/display/dc/inc/link_enc_cfg.h |   7 +-
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 123 +-
 .../amd/display/include/link_service_types.h  |   8 +
 22 files changed, 525 insertions(+), 230 deletions(-)

--
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Cdaniel.wheeler%40amd.com%7C9ec96aa8589e4c0fd3ed08d91168a357%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637559963391150266%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=PLlUpyhDx28WCzqtbiy3361vSUALhAKIJ0P6jQVI0HA%3D&reserved=0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v5 20/27] drm: Scope all DRM IOCTLs with drm_dev_enter/exit

2021-05-07 Thread Andrey Grodzovsky

On 2021-05-07 12:24 p.m., Daniel Vetter wrote:

On Fri, May 07, 2021 at 11:39:49AM -0400, Andrey Grodzovsky wrote:

On 2021-05-07 5:11 a.m., Daniel Vetter wrote:

On Thu, May 06, 2021 at 12:25:06PM -0400, Andrey Grodzovsky wrote:

On 2021-05-06 5:40 a.m., Daniel Vetter wrote:

On Fri, Apr 30, 2021 at 01:27:37PM -0400, Andrey Grodzovsky wrote:

On 2021-04-30 6:25 a.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 04:34:55PM -0400, Andrey Grodzovsky wrote:

On 2021-04-29 3:05 p.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 12:04:33PM -0400, Andrey Grodzovsky wrote:

On 2021-04-29 7:32 a.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 01:23:19PM +0200, Daniel Vetter wrote:

On Wed, Apr 28, 2021 at 11:12:00AM -0400, Andrey Grodzovsky wrote:

With this calling drm_dev_unplug will flush and block
all in flight IOCTLs

Also, add feature such that if device supports graceful unplug
we enclose entire IOCTL in SRCU critical section.

Signed-off-by: Andrey Grodzovsky

Nope.

The idea of drm_dev_enter/exit is to mark up hw access. Not entire ioctl.

Then I am confused why we have
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.12%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_ioctl.c%23L826&data=04%7C01%7Candrey.grodzovsky%40amd.com%7C66e4988eb341427e8b0108d91174a232%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637560014906903277%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=rdb3xesAUYYTeqU2WdoZ%2BWLzOuuRdOxBBQNTMMB%2BKB4%3D&reserved=0
currently in code ?

I forgot about this one, again. Thanks for reminding.

Especially not with an opt-in flag so that it could be shrugged of as a
driver hack. Most of these ioctls should have absolutely no problem
working after hotunplug.

Also, doing this defeats the point since it pretty much guarantees
userspace will die in assert()s and stuff. E.g. on i915 the rough contract
is that only execbuf (and even that only when userspace has indicated
support for non-recoverable hw ctx) is allowed to fail. Anything else
might crash userspace.

Given that as I pointed above we already fail any IOCTls with -ENODEV
when device is unplugged, it seems those crashes don't happen that
often ? Also, in all my testing I don't think I saw a user space crash
I could attribute to this.

I guess it should be ok.

What should be ok ?

Your approach, but not your patch. If we go with this let's just lift it
to drm_ioctl() as the default behavior. No driver opt-in flag, because
that's definitely worse than any other approach because we really need to
get rid of driver specific behaviour for generic ioctls, especially
anything a compositor will use directly.

My reasons for making this work is both less trouble for userspace (did
you test with various wayland compositors out there, not just amdgpu x86

I didn't - will give it a try.

Weston worked without crashes, run the egl tester cube there.

driver?), but also testing.

We still need a bunch of these checks in various places or you'll wait a
very long time for a pending modeset or similar to complete. Being able to
run that code easily after hotunplug has completed should help a lot with
testing.

Plus various drivers already acquired drm_dev_enter/exit and now I wonder
whether that was properly tested or not ...

I guess maybe we need a drm module option to disable this check, so that
we can exercise the code as if the ioctl has raced with hotunplug at the
worst possible moment.

Also atomic is really tricky here: I assume your testing has just done
normal synchronous commits, but anything that goes through atomic can be
done nonblocking in a separate thread. Which the ioctl catch-all here wont
capture.

Yes, async commit was on my mind and thanks for reminding me. Indeed
I forgot this but i planned to scope the entire amdgpu_dm_atomic_tail in
drm_dev_enter/exit. Note that i have a bunch of patches, all name's
starting with 'Scope' that just methodically put all the background
work items and timers the drivers schedules in drm_dev_enter/exit scope.
This was supposed to be part of the 'Scope Display code' patch.

That's too much. You still have to arrange that the flip completion event
gets sent out. So it's a bit tricky.

In other places the same problem applies, e.g. probe functions need to
make sure they report "disconnected".

I see, well, this is all part of KMS support which I defer for now
anyway. Will tackle it then.

You probably need similar (and very precisely defined) rules for amdgpu.
And those must definitely exclude any shard ioctls from randomly failing
with EIO, because that just kills the box and defeats the point of trying
to gracefully handling hotunplug and making sure userspace has a chance of
survival. E.g. for atomic everything should continue, including flip
completion, but we set all outputs to "disconnected" and send out the
uevent. Maybe crtc enabling can fail too, but that can als

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Kenny Ho

On Fri, May 7, 2021 at 12:54 PM Daniel Vetter  wrote:
>
> SRIOV is kinda by design vendor specific. You set up the VF endpoint, it
> shows up, it's all hw+fw magic. Nothing for cgroups to manage here at all.
Right, so in theory you just use the device cgroup with the VF endpoints.

> All I meant is that for the container/cgroups world starting out with
> time-sharing feels like the best fit, least because your SRIOV designers
> also seem to think that's the best first cut for cloud-y computing.
> Whether it's virtualized or containerized is a distinction that's getting
> ever more blurry, with virtualization become a lot more dynamic and
> container runtimes als possibly using hw virtualization underneath.
I disagree.  By the same logic, the existence of CU mask would imply
it being the preferred way for sub-device control per process.

Kenny
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 07, 2021 at 12:50:07PM -0400, Alex Deucher wrote:
> On Fri, May 7, 2021 at 12:31 PM Alex Deucher  wrote:
> >
> > On Fri, May 7, 2021 at 12:26 PM Daniel Vetter  wrote:
> > >
> > > On Fri, May 07, 2021 at 12:19:13PM -0400, Alex Deucher wrote:
> > > > On Fri, May 7, 2021 at 12:13 PM Daniel Vetter  wrote:
> > > > >
> > > > > On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> > > > > > On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  
> > > > > > wrote:
> > > > > > >
> > > > > > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the 
> > > > > > > easier gpu
> > > > > > > cgroups controler to get started, since it's much closer to other 
> > > > > > > cgroups
> > > > > > > that control bandwidth of some kind. Whether it's i/o bandwidth 
> > > > > > > or compute
> > > > > > > bandwidht is kinda a wash.
> > > > > > sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> > > > > > interface since each slice appears as a stand alone device.  This is
> > > > > > already in production (not using cgroup) with users.  The cgroup
> > > > > > proposal has always been parallel to that in many sense: 1) spatial
> > > > > > partitioning as an independent but equally valid use case as time
> > > > > > sharing, 2) sub-device resource control as opposed to full device
> > > > > > control motivated by the workload characterization paper.  It was
> > > > > > never about time vs space in terms of use cases but having new API 
> > > > > > for
> > > > > > users to be able to do spatial subdevice partitioning.
> > > > > >
> > > > > > > CU mask feels a lot more like an isolation/guaranteed forward 
> > > > > > > progress
> > > > > > > kind of thing, and I suspect that's always going to be a lot more 
> > > > > > > gpu hw
> > > > > > > specific than anything we can reasonably put into a general 
> > > > > > > cgroups
> > > > > > > controller.
> > > > > > The first half is correct but I disagree with the conclusion.  The
> > > > > > analogy I would use is multi-core CPU.  The capability of individual
> > > > > > CPU cores, core count and core arrangement may be hw specific but
> > > > > > there are general interfaces to support selection of these cores.  
> > > > > > CU
> > > > > > mask may be hw specific but spatial partitioning as an idea is not.
> > > > > > Most gpu vendors have the concept of sub-device compute units (EU, 
> > > > > > SE,
> > > > > > etc.); OpenCL has the concept of subdevice in the language.  I don't
> > > > > > see any obstacle for vendors to implement spatial partitioning just
> > > > > > like many CPU vendors support the idea of multi-core.
> > > > > >
> > > > > > > Also for the time slice cgroups thing, can you pls give me 
> > > > > > > pointers to
> > > > > > > these old patches that had it, and how it's done? I very 
> > > > > > > obviously missed
> > > > > > > that part.
> > > > > > I think you misunderstood what I wrote earlier.  The original 
> > > > > > proposal
> > > > > > was about spatial partitioning of subdevice resources not time 
> > > > > > sharing
> > > > > > using cgroup (since time sharing is already supported elsewhere.)
> > > > >
> > > > > Well SRIOV time-sharing is for virtualization. cgroups is for
> > > > > containerization, which is just virtualization but with less overhead 
> > > > > and
> > > > > more security bugs.
> > > > >
> > > > > More or less.
> > > > >
> > > > > So either I get things still wrong, or we'll get time-sharing for
> > > > > virtualization, and partitioning of CU for containerization. That 
> > > > > doesn't
> > > > > make that much sense to me.
> > > >
> > > > You could still potentially do SR-IOV for containerization.  You'd
> > > > just pass one of the PCI VFs (virtual functions) to the container and
> > > > you'd automatically get the time slice.  I don't see why cgroups would
> > > > be a factor there.
> > >
> > > Standard interface to manage that time-slicing. I guess for SRIOV it's all
> > > vendor sauce (intel as guilty as anyone else from what I can see), but for
> > > cgroups that feels like it's falling a bit short of what we should aim
> > > for.
> > >
> > > But dunno, maybe I'm just dreaming too much :-)
> >
> > I don't disagree, I'm just not sure how it would apply to SR-IOV.
> > Once you've created the virtual functions, you've already created the
> > partitioning (regardless of whether it's spatial or temporal) so where
> > would cgroups come into play?
> 
> For some background, the SR-IOV virtual functions show up like actual
> PCI endpoints on the bus, so SR-IOV is sort of like cgroups
> implemented in hardware.  When you enable SR-IOV, the endpoints that
> are created are the partitions.

Yeah I think we're massively agreeing right now :-)

SRIOV is kinda by design vendor specific. You set up the VF endpoint, it
shows up, it's all hw+fw magic. Nothing for cgroups to manage here at all.

All I meant is that for the container/cgroups world starting out with
time-sharing feels like the best fit, least because yo

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 7, 2021 at 12:31 PM Alex Deucher  wrote:
>
> On Fri, May 7, 2021 at 12:26 PM Daniel Vetter  wrote:
> >
> > On Fri, May 07, 2021 at 12:19:13PM -0400, Alex Deucher wrote:
> > > On Fri, May 7, 2021 at 12:13 PM Daniel Vetter  wrote:
> > > >
> > > > On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> > > > > On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
> > > > > >
> > > > > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the 
> > > > > > easier gpu
> > > > > > cgroups controler to get started, since it's much closer to other 
> > > > > > cgroups
> > > > > > that control bandwidth of some kind. Whether it's i/o bandwidth or 
> > > > > > compute
> > > > > > bandwidht is kinda a wash.
> > > > > sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> > > > > interface since each slice appears as a stand alone device.  This is
> > > > > already in production (not using cgroup) with users.  The cgroup
> > > > > proposal has always been parallel to that in many sense: 1) spatial
> > > > > partitioning as an independent but equally valid use case as time
> > > > > sharing, 2) sub-device resource control as opposed to full device
> > > > > control motivated by the workload characterization paper.  It was
> > > > > never about time vs space in terms of use cases but having new API for
> > > > > users to be able to do spatial subdevice partitioning.
> > > > >
> > > > > > CU mask feels a lot more like an isolation/guaranteed forward 
> > > > > > progress
> > > > > > kind of thing, and I suspect that's always going to be a lot more 
> > > > > > gpu hw
> > > > > > specific than anything we can reasonably put into a general cgroups
> > > > > > controller.
> > > > > The first half is correct but I disagree with the conclusion.  The
> > > > > analogy I would use is multi-core CPU.  The capability of individual
> > > > > CPU cores, core count and core arrangement may be hw specific but
> > > > > there are general interfaces to support selection of these cores.  CU
> > > > > mask may be hw specific but spatial partitioning as an idea is not.
> > > > > Most gpu vendors have the concept of sub-device compute units (EU, SE,
> > > > > etc.); OpenCL has the concept of subdevice in the language.  I don't
> > > > > see any obstacle for vendors to implement spatial partitioning just
> > > > > like many CPU vendors support the idea of multi-core.
> > > > >
> > > > > > Also for the time slice cgroups thing, can you pls give me pointers 
> > > > > > to
> > > > > > these old patches that had it, and how it's done? I very obviously 
> > > > > > missed
> > > > > > that part.
> > > > > I think you misunderstood what I wrote earlier.  The original proposal
> > > > > was about spatial partitioning of subdevice resources not time sharing
> > > > > using cgroup (since time sharing is already supported elsewhere.)
> > > >
> > > > Well SRIOV time-sharing is for virtualization. cgroups is for
> > > > containerization, which is just virtualization but with less overhead 
> > > > and
> > > > more security bugs.
> > > >
> > > > More or less.
> > > >
> > > > So either I get things still wrong, or we'll get time-sharing for
> > > > virtualization, and partitioning of CU for containerization. That 
> > > > doesn't
> > > > make that much sense to me.
> > >
> > > You could still potentially do SR-IOV for containerization.  You'd
> > > just pass one of the PCI VFs (virtual functions) to the container and
> > > you'd automatically get the time slice.  I don't see why cgroups would
> > > be a factor there.
> >
> > Standard interface to manage that time-slicing. I guess for SRIOV it's all
> > vendor sauce (intel as guilty as anyone else from what I can see), but for
> > cgroups that feels like it's falling a bit short of what we should aim
> > for.
> >
> > But dunno, maybe I'm just dreaming too much :-)
>
> I don't disagree, I'm just not sure how it would apply to SR-IOV.
> Once you've created the virtual functions, you've already created the
> partitioning (regardless of whether it's spatial or temporal) so where
> would cgroups come into play?

For some background, the SR-IOV virtual functions show up like actual
PCI endpoints on the bus, so SR-IOV is sort of like cgroups
implemented in hardware.  When you enable SR-IOV, the endpoints that
are created are the partitions.

Alex

>
> Alex
>
> > -Daniel
> >
> > > Alex
> > >
> > > >
> > > > Since time-sharing is the first thing that's done for virtualization I
> > > > think it's probably also the most reasonable to start with for 
> > > > containers.
> > > > -Daniel
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> > > > ___
> > > > amd-gfx mailing list
> > > > amd-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
__

Re: [PATCH v5 15/27] drm/scheduler: Fix hang when sched_entity released

2021-05-07 Thread Andrey Grodzovsky




On 2021-05-07 12:29 p.m., Daniel Vetter wrote:

On Fri, Apr 30, 2021 at 12:10:57PM -0400, Andrey Grodzovsky wrote:



On 2021-04-30 2:47 a.m., Christian König wrote:



Am 29.04.21 um 19:06 schrieb Andrey Grodzovsky:



On 2021-04-29 3:18 a.m., Christian König wrote:

I need to take another look at this part when I don't have a
massive headache any more.

Maybe split the patch set up into different parts, something like:
1. Adding general infrastructure.
2. Making sure all memory is unpolated.
3. Job and fence handling


I am not sure you mean this patch here, maybe another one ?
Also note you already RBed it.


No what I meant was to send out the patches before this one as #1 and #2.

That is the easier stuff which can easily go into the drm-misc-next or
amd-staging-drm-next branch.

The scheduler stuff certainly need to go into drm-misc-next.

Christian.


Got you. I am fine with it. What we have here is a working hot-unplug
code but, one with potential use after free MMIO ranges frpom the zombie
device. The followup patches after this patch are all about preventing
this and so the patch-set up until this patch including, is functional
on it's own. While it's necessary to solve the above issue, it's has
complications as can be seen from the discussion with Daniel on later
patch in this series. Still, in my opinion it's better to rollout some
initial support to hot-unplug without use after free protection then
having no support for hot-unplug at all. It will also make the merge
work easier as I need to constantly rebase the patches on top latest
kernel and solve new regressions.

Daniel - given the arguments above can you sound your opinion on this
approach ?


I'm all for incrementally landing this, because it's really hard and
tricky. We might need to go back to some of the decisions, or clarify
things more, or more headaches and pondering how to fix all the parts
that works best to make sure there's no nasty races right across hotunplug
if you're unlucky enough.

But yeah better aim for something and then readjust than bikeshed forever
out of tree.

Cheers, Daniel


Thanks, I will send out V6 limited in scope up to here and fixing
any relevant comments.

Andrey





Andrey




Andrey



Christian.

Am 28.04.21 um 17:11 schrieb Andrey Grodzovsky:

Problem: If scheduler is already stopped by the time sched_entity
is released and entity's job_queue not empty I encountred
a hang in drm_sched_entity_flush. This is because
drm_sched_entity_is_idle
never becomes false.

Fix: In drm_sched_fini detach all sched_entities from the
scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
Also wakeup all those processes stuck in sched_entity flushing
as the scheduler main thread which wakes them up is stopped by now.

v2:
Reverse order of drm_sched_rq_remove_entity and marking
s_entity as stopped to prevent reinserion back to rq due
to race.

v3:
Drop drm_sched_rq_remove_entity, only modify entity->stopped
and check for it in drm_sched_entity_is_idle

Signed-off-by: Andrey Grodzovsky 
Reviewed-by: Christian König 
---
   drivers/gpu/drm/scheduler/sched_entity.c |  3 ++-
   drivers/gpu/drm/scheduler/sched_main.c   | 24

   2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
b/drivers/gpu/drm/scheduler/sched_entity.c
index f0790e9471d1..cb58f692dad9 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -116,7 +116,8 @@ static bool
drm_sched_entity_is_idle(struct drm_sched_entity *entity)
   rmb(); /* for list_empty to work without lock */
   if (list_empty(&entity->list) ||
-    spsc_queue_count(&entity->job_queue) == 0)
+    spsc_queue_count(&entity->job_queue) == 0 ||
+    entity->stopped)
   return true;
   return false;
diff --git a/drivers/gpu/drm/scheduler/sched_main.c
b/drivers/gpu/drm/scheduler/sched_main.c
index 908b0b56032d..ba087354d0a8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -897,9 +897,33 @@ EXPORT_SYMBOL(drm_sched_init);
    */
   void drm_sched_fini(struct drm_gpu_scheduler *sched)
   {
+    struct drm_sched_entity *s_entity;
+    int i;
+
   if (sched->thread)
   kthread_stop(sched->thread);
+    for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >=
DRM_SCHED_PRIORITY_MIN; i--) {
+    struct drm_sched_rq *rq = &sched->sched_rq[i];
+
+    if (!rq)
+    continue;
+
+    spin_lock(&rq->lock);
+    list_for_each_entry(s_entity, &rq->entities, list)
+    /*
+ * Prevents reinsertion and marks job_queue as idle,
+ * it will removed from rq in drm_sched_entity_fini
+ * eventually
+ */
+    s_entity->stopped = true;
+    spin_unlock(&rq->lock);
+
+    }
+
+    /* Wakeup everyone stuck in drm_sched_entity_flush for
this scheduler */
+    wake_up_all(&sched->job_scheduled);
+
   /* Confir

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 7, 2021 at 12:26 PM Daniel Vetter  wrote:
>
> On Fri, May 07, 2021 at 12:19:13PM -0400, Alex Deucher wrote:
> > On Fri, May 7, 2021 at 12:13 PM Daniel Vetter  wrote:
> > >
> > > On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> > > > On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
> > > > >
> > > > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the 
> > > > > easier gpu
> > > > > cgroups controler to get started, since it's much closer to other 
> > > > > cgroups
> > > > > that control bandwidth of some kind. Whether it's i/o bandwidth or 
> > > > > compute
> > > > > bandwidht is kinda a wash.
> > > > sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> > > > interface since each slice appears as a stand alone device.  This is
> > > > already in production (not using cgroup) with users.  The cgroup
> > > > proposal has always been parallel to that in many sense: 1) spatial
> > > > partitioning as an independent but equally valid use case as time
> > > > sharing, 2) sub-device resource control as opposed to full device
> > > > control motivated by the workload characterization paper.  It was
> > > > never about time vs space in terms of use cases but having new API for
> > > > users to be able to do spatial subdevice partitioning.
> > > >
> > > > > CU mask feels a lot more like an isolation/guaranteed forward progress
> > > > > kind of thing, and I suspect that's always going to be a lot more gpu 
> > > > > hw
> > > > > specific than anything we can reasonably put into a general cgroups
> > > > > controller.
> > > > The first half is correct but I disagree with the conclusion.  The
> > > > analogy I would use is multi-core CPU.  The capability of individual
> > > > CPU cores, core count and core arrangement may be hw specific but
> > > > there are general interfaces to support selection of these cores.  CU
> > > > mask may be hw specific but spatial partitioning as an idea is not.
> > > > Most gpu vendors have the concept of sub-device compute units (EU, SE,
> > > > etc.); OpenCL has the concept of subdevice in the language.  I don't
> > > > see any obstacle for vendors to implement spatial partitioning just
> > > > like many CPU vendors support the idea of multi-core.
> > > >
> > > > > Also for the time slice cgroups thing, can you pls give me pointers to
> > > > > these old patches that had it, and how it's done? I very obviously 
> > > > > missed
> > > > > that part.
> > > > I think you misunderstood what I wrote earlier.  The original proposal
> > > > was about spatial partitioning of subdevice resources not time sharing
> > > > using cgroup (since time sharing is already supported elsewhere.)
> > >
> > > Well SRIOV time-sharing is for virtualization. cgroups is for
> > > containerization, which is just virtualization but with less overhead and
> > > more security bugs.
> > >
> > > More or less.
> > >
> > > So either I get things still wrong, or we'll get time-sharing for
> > > virtualization, and partitioning of CU for containerization. That doesn't
> > > make that much sense to me.
> >
> > You could still potentially do SR-IOV for containerization.  You'd
> > just pass one of the PCI VFs (virtual functions) to the container and
> > you'd automatically get the time slice.  I don't see why cgroups would
> > be a factor there.
>
> Standard interface to manage that time-slicing. I guess for SRIOV it's all
> vendor sauce (intel as guilty as anyone else from what I can see), but for
> cgroups that feels like it's falling a bit short of what we should aim
> for.
>
> But dunno, maybe I'm just dreaming too much :-)

I don't disagree, I'm just not sure how it would apply to SR-IOV.
Once you've created the virtual functions, you've already created the
partitioning (regardless of whether it's spatial or temporal) so where
would cgroups come into play?

Alex

> -Daniel
>
> > Alex
> >
> > >
> > > Since time-sharing is the first thing that's done for virtualization I
> > > think it's probably also the most reasonable to start with for containers.
> > > -Daniel
> > > --
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > http://blog.ffwll.ch
> > > ___
> > > amd-gfx mailing list
> > > amd-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v5 15/27] drm/scheduler: Fix hang when sched_entity released

On Fri, Apr 30, 2021 at 12:10:57PM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 2021-04-30 2:47 a.m., Christian König wrote:
> > 
> > 
> > Am 29.04.21 um 19:06 schrieb Andrey Grodzovsky:
> > > 
> > > 
> > > On 2021-04-29 3:18 a.m., Christian König wrote:
> > > > I need to take another look at this part when I don't have a
> > > > massive headache any more.
> > > > 
> > > > Maybe split the patch set up into different parts, something like:
> > > > 1. Adding general infrastructure.
> > > > 2. Making sure all memory is unpolated.
> > > > 3. Job and fence handling
> > > 
> > > I am not sure you mean this patch here, maybe another one ?
> > > Also note you already RBed it.
> > 
> > No what I meant was to send out the patches before this one as #1 and #2.
> > 
> > That is the easier stuff which can easily go into the drm-misc-next or
> > amd-staging-drm-next branch.
> > 
> > The scheduler stuff certainly need to go into drm-misc-next.
> > 
> > Christian.
> 
> Got you. I am fine with it. What we have here is a working hot-unplug
> code but, one with potential use after free MMIO ranges frpom the zombie
> device. The followup patches after this patch are all about preventing
> this and so the patch-set up until this patch including, is functional
> on it's own. While it's necessary to solve the above issue, it's has
> complications as can be seen from the discussion with Daniel on later
> patch in this series. Still, in my opinion it's better to rollout some
> initial support to hot-unplug without use after free protection then
> having no support for hot-unplug at all. It will also make the merge
> work easier as I need to constantly rebase the patches on top latest
> kernel and solve new regressions.
> 
> Daniel - given the arguments above can you sound your opinion on this
> approach ?

I'm all for incrementally landing this, because it's really hard and
tricky. We might need to go back to some of the decisions, or clarify
things more, or more headaches and pondering how to fix all the parts
that works best to make sure there's no nasty races right across hotunplug
if you're unlucky enough.

But yeah better aim for something and then readjust than bikeshed forever
out of tree.

Cheers, Daniel

> 
> Andrey
> > 
> > > 
> > > Andrey
> > > 
> > > > 
> > > > Christian.
> > > > 
> > > > Am 28.04.21 um 17:11 schrieb Andrey Grodzovsky:
> > > > > Problem: If scheduler is already stopped by the time sched_entity
> > > > > is released and entity's job_queue not empty I encountred
> > > > > a hang in drm_sched_entity_flush. This is because
> > > > > drm_sched_entity_is_idle
> > > > > never becomes false.
> > > > > 
> > > > > Fix: In drm_sched_fini detach all sched_entities from the
> > > > > scheduler's run queues. This will satisfy drm_sched_entity_is_idle.
> > > > > Also wakeup all those processes stuck in sched_entity flushing
> > > > > as the scheduler main thread which wakes them up is stopped by now.
> > > > > 
> > > > > v2:
> > > > > Reverse order of drm_sched_rq_remove_entity and marking
> > > > > s_entity as stopped to prevent reinserion back to rq due
> > > > > to race.
> > > > > 
> > > > > v3:
> > > > > Drop drm_sched_rq_remove_entity, only modify entity->stopped
> > > > > and check for it in drm_sched_entity_is_idle
> > > > > 
> > > > > Signed-off-by: Andrey Grodzovsky 
> > > > > Reviewed-by: Christian König 
> > > > > ---
> > > > >   drivers/gpu/drm/scheduler/sched_entity.c |  3 ++-
> > > > >   drivers/gpu/drm/scheduler/sched_main.c   | 24
> > > > > 
> > > > >   2 files changed, 26 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > index f0790e9471d1..cb58f692dad9 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > > @@ -116,7 +116,8 @@ static bool
> > > > > drm_sched_entity_is_idle(struct drm_sched_entity *entity)
> > > > >   rmb(); /* for list_empty to work without lock */
> > > > >   if (list_empty(&entity->list) ||
> > > > > -    spsc_queue_count(&entity->job_queue) == 0)
> > > > > +    spsc_queue_count(&entity->job_queue) == 0 ||
> > > > > +    entity->stopped)
> > > > >   return true;
> > > > >   return false;
> > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > index 908b0b56032d..ba087354d0a8 100644
> > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > @@ -897,9 +897,33 @@ EXPORT_SYMBOL(drm_sched_init);
> > > > >    */
> > > > >   void drm_sched_fini(struct drm_gpu_scheduler *sched)
> > > > >   {
> > > > > +    struct drm_sched_entity *s_entity;
> > > > > +    int i;
> > > > > +
> > > > >   if (sched->thread)
> > > > >   kthread_stop(sched->thread);
> > > > > +    for (i = DRM_SCHED_PRIORITY_COUNT - 1; i >=
> > > >

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 07, 2021 at 12:19:13PM -0400, Alex Deucher wrote:
> On Fri, May 7, 2021 at 12:13 PM Daniel Vetter  wrote:
> >
> > On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> > > On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
> > > >
> > > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the easier 
> > > > gpu
> > > > cgroups controler to get started, since it's much closer to other 
> > > > cgroups
> > > > that control bandwidth of some kind. Whether it's i/o bandwidth or 
> > > > compute
> > > > bandwidht is kinda a wash.
> > > sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> > > interface since each slice appears as a stand alone device.  This is
> > > already in production (not using cgroup) with users.  The cgroup
> > > proposal has always been parallel to that in many sense: 1) spatial
> > > partitioning as an independent but equally valid use case as time
> > > sharing, 2) sub-device resource control as opposed to full device
> > > control motivated by the workload characterization paper.  It was
> > > never about time vs space in terms of use cases but having new API for
> > > users to be able to do spatial subdevice partitioning.
> > >
> > > > CU mask feels a lot more like an isolation/guaranteed forward progress
> > > > kind of thing, and I suspect that's always going to be a lot more gpu hw
> > > > specific than anything we can reasonably put into a general cgroups
> > > > controller.
> > > The first half is correct but I disagree with the conclusion.  The
> > > analogy I would use is multi-core CPU.  The capability of individual
> > > CPU cores, core count and core arrangement may be hw specific but
> > > there are general interfaces to support selection of these cores.  CU
> > > mask may be hw specific but spatial partitioning as an idea is not.
> > > Most gpu vendors have the concept of sub-device compute units (EU, SE,
> > > etc.); OpenCL has the concept of subdevice in the language.  I don't
> > > see any obstacle for vendors to implement spatial partitioning just
> > > like many CPU vendors support the idea of multi-core.
> > >
> > > > Also for the time slice cgroups thing, can you pls give me pointers to
> > > > these old patches that had it, and how it's done? I very obviously 
> > > > missed
> > > > that part.
> > > I think you misunderstood what I wrote earlier.  The original proposal
> > > was about spatial partitioning of subdevice resources not time sharing
> > > using cgroup (since time sharing is already supported elsewhere.)
> >
> > Well SRIOV time-sharing is for virtualization. cgroups is for
> > containerization, which is just virtualization but with less overhead and
> > more security bugs.
> >
> > More or less.
> >
> > So either I get things still wrong, or we'll get time-sharing for
> > virtualization, and partitioning of CU for containerization. That doesn't
> > make that much sense to me.
> 
> You could still potentially do SR-IOV for containerization.  You'd
> just pass one of the PCI VFs (virtual functions) to the container and
> you'd automatically get the time slice.  I don't see why cgroups would
> be a factor there.

Standard interface to manage that time-slicing. I guess for SRIOV it's all
vendor sauce (intel as guilty as anyone else from what I can see), but for
cgroups that feels like it's falling a bit short of what we should aim
for.

But dunno, maybe I'm just dreaming too much :-)
-Daniel

> Alex
> 
> >
> > Since time-sharing is the first thing that's done for virtualization I
> > think it's probably also the most reasonable to start with for containers.
> > -Daniel
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v5 20/27] drm: Scope all DRM IOCTLs with drm_dev_enter/exit

On Fri, May 07, 2021 at 11:39:49AM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 2021-05-07 5:11 a.m., Daniel Vetter wrote:
> > On Thu, May 06, 2021 at 12:25:06PM -0400, Andrey Grodzovsky wrote:
> > > 
> > > 
> > > On 2021-05-06 5:40 a.m., Daniel Vetter wrote:
> > > > On Fri, Apr 30, 2021 at 01:27:37PM -0400, Andrey Grodzovsky wrote:
> > > > > 
> > > > > 
> > > > > On 2021-04-30 6:25 a.m., Daniel Vetter wrote:
> > > > > > On Thu, Apr 29, 2021 at 04:34:55PM -0400, Andrey Grodzovsky wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 2021-04-29 3:05 p.m., Daniel Vetter wrote:
> > > > > > > > On Thu, Apr 29, 2021 at 12:04:33PM -0400, Andrey Grodzovsky 
> > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On 2021-04-29 7:32 a.m., Daniel Vetter wrote:
> > > > > > > > > > On Thu, Apr 29, 2021 at 01:23:19PM +0200, Daniel Vetter 
> > > > > > > > > > wrote:
> > > > > > > > > > > On Wed, Apr 28, 2021 at 11:12:00AM -0400, Andrey 
> > > > > > > > > > > Grodzovsky wrote:
> > > > > > > > > > > > With this calling drm_dev_unplug will flush and block
> > > > > > > > > > > > all in flight IOCTLs
> > > > > > > > > > > > 
> > > > > > > > > > > > Also, add feature such that if device supports graceful 
> > > > > > > > > > > > unplug
> > > > > > > > > > > > we enclose entire IOCTL in SRCU critical section.
> > > > > > > > > > > > 
> > > > > > > > > > > > Signed-off-by: Andrey Grodzovsky 
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Nope.
> > > > > > > > > > > 
> > > > > > > > > > > The idea of drm_dev_enter/exit is to mark up hw access. 
> > > > > > > > > > > Not entire ioctl.
> > > > > > > > > 
> > > > > > > > > Then I am confused why we have 
> > > > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.12%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_ioctl.c%23L826&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Ce53ea46e66fa40a0e03f08d911381a05%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637559754928702763%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=zMlHiglnn8Vm%2BVxI9Rbk8X%2BTyuokq1x1INbhbRCWK4E%3D&reserved=0
> > > > > > > > > currently in code ?
> > > > > > > > 
> > > > > > > > I forgot about this one, again. Thanks for reminding.
> > > > > > > > 
> > > > > > > > > > > Especially not with an opt-in flag so that it could be 
> > > > > > > > > > > shrugged of as a
> > > > > > > > > > > driver hack. Most of these ioctls should have absolutely 
> > > > > > > > > > > no problem
> > > > > > > > > > > working after hotunplug.
> > > > > > > > > > > 
> > > > > > > > > > > Also, doing this defeats the point since it pretty much 
> > > > > > > > > > > guarantees
> > > > > > > > > > > userspace will die in assert()s and stuff. E.g. on i915 
> > > > > > > > > > > the rough contract
> > > > > > > > > > > is that only execbuf (and even that only when userspace 
> > > > > > > > > > > has indicated
> > > > > > > > > > > support for non-recoverable hw ctx) is allowed to fail. 
> > > > > > > > > > > Anything else
> > > > > > > > > > > might crash userspace.
> > > > > > > > > 
> > > > > > > > > Given that as I pointed above we already fail any IOCTls with 
> > > > > > > > > -ENODEV
> > > > > > > > > when device is unplugged, it seems those crashes don't happen 
> > > > > > > > > that
> > > > > > > > > often ? Also, in all my testing I don't think I saw a user 
> > > > > > > > > space crash
> > > > > > > > > I could attribute to this.
> > > > > > > > 
> > > > > > > > I guess it should be ok.
> > > > > > > 
> > > > > > > What should be ok ?
> > > > > > 
> > > > > > Your approach, but not your patch. If we go with this let's just 
> > > > > > lift it
> > > > > > to drm_ioctl() as the default behavior. No driver opt-in flag, 
> > > > > > because
> > > > > > that's definitely worse than any other approach because we really 
> > > > > > need to
> > > > > > get rid of driver specific behaviour for generic ioctls, especially
> > > > > > anything a compositor will use directly.
> > > > > > 
> > > > > > > > My reasons for making this work is both less trouble for 
> > > > > > > > userspace (did
> > > > > > > > you test with various wayland compositors out there, not just 
> > > > > > > > amdgpu x86
> > > > > > > 
> > > > > > > I didn't - will give it a try.
> > > > > 
> > > > > Weston worked without crashes, run the egl tester cube there.
> > > > > 
> > > > > > > 
> > > > > > > > driver?), but also testing.
> > > > > > > > 
> > > > > > > > We still need a bunch of these checks in various places or 
> > > > > > > > you'll wait a
> > > > > > > > very long time for a pending modeset or similar to complete. 
> > > > > > > > Being able to
> > > > > > > > run that code easily after hotunplug has completed should help 
> > > > > > > > a lot with
> > > > > > > > testing.
> > > > > > > > 
> > > > > > > > Plus various drivers already acquired drm_dev_enter/exit and 
> > > > > > > > now I wonder
> > > > > > > >

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 7, 2021 at 12:13 PM Daniel Vetter  wrote:
>
> On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> > On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
> > >
> > > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the easier gpu
> > > cgroups controler to get started, since it's much closer to other cgroups
> > > that control bandwidth of some kind. Whether it's i/o bandwidth or compute
> > > bandwidht is kinda a wash.
> > sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> > interface since each slice appears as a stand alone device.  This is
> > already in production (not using cgroup) with users.  The cgroup
> > proposal has always been parallel to that in many sense: 1) spatial
> > partitioning as an independent but equally valid use case as time
> > sharing, 2) sub-device resource control as opposed to full device
> > control motivated by the workload characterization paper.  It was
> > never about time vs space in terms of use cases but having new API for
> > users to be able to do spatial subdevice partitioning.
> >
> > > CU mask feels a lot more like an isolation/guaranteed forward progress
> > > kind of thing, and I suspect that's always going to be a lot more gpu hw
> > > specific than anything we can reasonably put into a general cgroups
> > > controller.
> > The first half is correct but I disagree with the conclusion.  The
> > analogy I would use is multi-core CPU.  The capability of individual
> > CPU cores, core count and core arrangement may be hw specific but
> > there are general interfaces to support selection of these cores.  CU
> > mask may be hw specific but spatial partitioning as an idea is not.
> > Most gpu vendors have the concept of sub-device compute units (EU, SE,
> > etc.); OpenCL has the concept of subdevice in the language.  I don't
> > see any obstacle for vendors to implement spatial partitioning just
> > like many CPU vendors support the idea of multi-core.
> >
> > > Also for the time slice cgroups thing, can you pls give me pointers to
> > > these old patches that had it, and how it's done? I very obviously missed
> > > that part.
> > I think you misunderstood what I wrote earlier.  The original proposal
> > was about spatial partitioning of subdevice resources not time sharing
> > using cgroup (since time sharing is already supported elsewhere.)
>
> Well SRIOV time-sharing is for virtualization. cgroups is for
> containerization, which is just virtualization but with less overhead and
> more security bugs.
>
> More or less.
>
> So either I get things still wrong, or we'll get time-sharing for
> virtualization, and partitioning of CU for containerization. That doesn't
> make that much sense to me.

You could still potentially do SR-IOV for containerization.  You'd
just pass one of the PCI VFs (virtual functions) to the container and
you'd automatically get the time slice.  I don't see why cgroups would
be a factor there.

Alex

>
> Since time-sharing is the first thing that's done for virtualization I
> think it's probably also the most reasonable to start with for containers.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

On Fri, May 07, 2021 at 11:33:46AM -0400, Kenny Ho wrote:
> On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
> >
> > Hm I missed that. I feel like time-sliced-of-a-whole gpu is the easier gpu
> > cgroups controler to get started, since it's much closer to other cgroups
> > that control bandwidth of some kind. Whether it's i/o bandwidth or compute
> > bandwidht is kinda a wash.
> sriov/time-sliced-of-a-whole gpu does not really need a cgroup
> interface since each slice appears as a stand alone device.  This is
> already in production (not using cgroup) with users.  The cgroup
> proposal has always been parallel to that in many sense: 1) spatial
> partitioning as an independent but equally valid use case as time
> sharing, 2) sub-device resource control as opposed to full device
> control motivated by the workload characterization paper.  It was
> never about time vs space in terms of use cases but having new API for
> users to be able to do spatial subdevice partitioning.
> 
> > CU mask feels a lot more like an isolation/guaranteed forward progress
> > kind of thing, and I suspect that's always going to be a lot more gpu hw
> > specific than anything we can reasonably put into a general cgroups
> > controller.
> The first half is correct but I disagree with the conclusion.  The
> analogy I would use is multi-core CPU.  The capability of individual
> CPU cores, core count and core arrangement may be hw specific but
> there are general interfaces to support selection of these cores.  CU
> mask may be hw specific but spatial partitioning as an idea is not.
> Most gpu vendors have the concept of sub-device compute units (EU, SE,
> etc.); OpenCL has the concept of subdevice in the language.  I don't
> see any obstacle for vendors to implement spatial partitioning just
> like many CPU vendors support the idea of multi-core.
> 
> > Also for the time slice cgroups thing, can you pls give me pointers to
> > these old patches that had it, and how it's done? I very obviously missed
> > that part.
> I think you misunderstood what I wrote earlier.  The original proposal
> was about spatial partitioning of subdevice resources not time sharing
> using cgroup (since time sharing is already supported elsewhere.)

Well SRIOV time-sharing is for virtualization. cgroups is for
containerization, which is just virtualization but with less overhead and
more security bugs.

More or less.

So either I get things still wrong, or we'll get time-sharing for
virtualization, and partitioning of CU for containerization. That doesn't
make that much sense to me.

Since time-sharing is the first thing that's done for virtualization I
think it's probably also the most reasonable to start with for containers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [v3, 4/5] drm/connector: Add a helper to attach the colorspace property

2021-05-07 Thread Jernej Škrabec

Hi!

Dne petek, 30. april 2021 ob 11:44:50 CEST je Maxime Ripard napisal(a):
> The intel driver uses the same logic to attach the Colorspace property
> in multiple places and we'll need it in vc4 too. Let's move that common
> code in a helper.
> 
> Signed-off-by: Maxime Ripard 
> ---
> 
> Changes from v2:
>   - Rebased on current drm-misc-next
> 
> Changes from v1:
>   - New patch
> ---

Reviewed-by: Jernej Skrabec 

Best regards,
Jernej


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [v3, 5/5] drm/vc4: hdmi: Signal the proper colorimetry info in the infoframe

2021-05-07 Thread Jernej Škrabec

Hi!

Dne petek, 30. april 2021 ob 11:44:51 CEST je Maxime Ripard napisal(a):
> Our driver while supporting HDR didn't send the proper colorimetry info
> in the AVI infoframe.
> 
> Let's add the property needed so that the userspace can let us know what
> the colorspace is supposed to be.
> 
> Signed-off-by: Maxime Ripard 
> ---
> 
> Changes from v2:
>   - Rebased on current drm-misc-next
> 
> Changes from v1:
>   - New patch

Reviewed-by: Jernej Skrabec 

Best regards,
Jernej


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v5 20/27] drm: Scope all DRM IOCTLs with drm_dev_enter/exit

2021-05-07 Thread Andrey Grodzovsky

On 2021-05-07 5:11 a.m., Daniel Vetter wrote:

On Thu, May 06, 2021 at 12:25:06PM -0400, Andrey Grodzovsky wrote:

On 2021-05-06 5:40 a.m., Daniel Vetter wrote:

On Fri, Apr 30, 2021 at 01:27:37PM -0400, Andrey Grodzovsky wrote:

On 2021-04-30 6:25 a.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 04:34:55PM -0400, Andrey Grodzovsky wrote:

On 2021-04-29 3:05 p.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 12:04:33PM -0400, Andrey Grodzovsky wrote:

On 2021-04-29 7:32 a.m., Daniel Vetter wrote:

On Thu, Apr 29, 2021 at 01:23:19PM +0200, Daniel Vetter wrote:

On Wed, Apr 28, 2021 at 11:12:00AM -0400, Andrey Grodzovsky wrote:

With this calling drm_dev_unplug will flush and block
all in flight IOCTLs

Also, add feature such that if device supports graceful unplug
we enclose entire IOCTL in SRCU critical section.

Signed-off-by: Andrey Grodzovsky

Nope.

The idea of drm_dev_enter/exit is to mark up hw access. Not entire ioctl.

Then I am confused why we have
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.12%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_ioctl.c%23L826&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Ce53ea46e66fa40a0e03f08d911381a05%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637559754928702763%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=zMlHiglnn8Vm%2BVxI9Rbk8X%2BTyuokq1x1INbhbRCWK4E%3D&reserved=0
currently in code ?

I forgot about this one, again. Thanks for reminding.

Especially not with an opt-in flag so that it could be shrugged of as a
driver hack. Most of these ioctls should have absolutely no problem
working after hotunplug.

I guess it should be ok.

What should be ok ?

My reasons for making this work is both less trouble for userspace (did
you test with various wayland compositors out there, not just amdgpu x86

I didn't - will give it a try.

Weston worked without crashes, run the egl tester cube there.

driver?), but also testing.

Plus various drivers already acquired drm_dev_enter/exit and now I wonder
whether that was properly tested or not ...

I guess maybe we need a drm module option to disable this check, so that
we can exercise the code as if the ioctl has raced with hotunplug at the
worst possible moment.

That's too much. You still have to arrange that the flip completion event
gets sent out. So it's a bit tricky.

In other places the same problem applies, e.g. probe functions need to
make sure they report "disconnected".

I see, well, this is all part of KMS support which I defer for now
anyway. Will tackle it then.

As I pointed before, beca

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL

2021-05-07 Thread Kenny Ho

On Fri, May 7, 2021 at 4:59 AM Daniel Vetter  wrote:
>
> Hm I missed that. I feel like time-sliced-of-a-whole gpu is the easier gpu
> cgroups controler to get started, since it's much closer to other cgroups
> that control bandwidth of some kind. Whether it's i/o bandwidth or compute
> bandwidht is kinda a wash.
sriov/time-sliced-of-a-whole gpu does not really need a cgroup
interface since each slice appears as a stand alone device.  This is
already in production (not using cgroup) with users.  The cgroup
proposal has always been parallel to that in many sense: 1) spatial
partitioning as an independent but equally valid use case as time
sharing, 2) sub-device resource control as opposed to full device
control motivated by the workload characterization paper.  It was
never about time vs space in terms of use cases but having new API for
users to be able to do spatial subdevice partitioning.

> CU mask feels a lot more like an isolation/guaranteed forward progress
> kind of thing, and I suspect that's always going to be a lot more gpu hw
> specific than anything we can reasonably put into a general cgroups
> controller.
The first half is correct but I disagree with the conclusion.  The
analogy I would use is multi-core CPU.  The capability of individual
CPU cores, core count and core arrangement may be hw specific but
there are general interfaces to support selection of these cores.  CU
mask may be hw specific but spatial partitioning as an idea is not.
Most gpu vendors have the concept of sub-device compute units (EU, SE,
etc.); OpenCL has the concept of subdevice in the language.  I don't
see any obstacle for vendors to implement spatial partitioning just
like many CPU vendors support the idea of multi-core.

> Also for the time slice cgroups thing, can you pls give me pointers to
> these old patches that had it, and how it's done? I very obviously missed
> that part.
I think you misunderstood what I wrote earlier.  The original proposal
was about spatial partitioning of subdevice resources not time sharing
using cgroup (since time sharing is already supported elsewhere.)

Kenny
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [5.12 regression] ttm->pages NULL dereference with radeon driver

2021-05-07 Thread Christian König


Hi Takashi,

Am 07.05.21 um 17:08 schrieb Takashi Iwai:

Hi,

we've received a regression report showing NULL dereference Oops with
radeon driver on 5.12 kernel:
   https://bugzilla.opensuse.org/show_bug.cgi?id=1185516

It turned out that the recent TTM cleanup / refactoring via commit
0575ff3d33cd ("drm/radeon: stop using pages with
drm_prime_sg_to_page_addr_arrays v2") is the culprit.  On 5.12 kernel,
ttm->pages is no longer allocated / set up, while the radeon driver
still has a few places assuming the valid ttm->pages, and for the
reporter (running the modesettig driver), radeon_gart_bind() hits the
problem.

A hackish patch below was confirmed to work, at least, but obviously
we need a proper fix.

Could you take a look at it?


If that's all then that looks trivial to me.

Going to provide a patch on Monday.

Thanks for the notice,
Christian.




thanks,

Takashi

--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -253,7 +253,7 @@ void radeon_gart_unbind(struct radeon_de
t = offset / RADEON_GPU_PAGE_SIZE;
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);
for (i = 0; i < pages; i++, p++) {
-   if (rdev->gart.pages[p]) {
+   if (1 /*rdev->gart.pages[p]*/) {
rdev->gart.pages[p] = NULL;
for (j = 0; j < (PAGE_SIZE / RADEON_GPU_PAGE_SIZE); 
j++, t++) {
rdev->gart.pages_entry[t] = 
rdev->dummy_page.entry;
@@ -301,7 +301,7 @@ int radeon_gart_bind(struct radeon_devic
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);
  
  	for (i = 0; i < pages; i++, p++) {

-   rdev->gart.pages[p] = pagelist[i];
+   /* rdev->gart.pages[p] = pagelist[i]; */
page_base = dma_addr[i];
for (j = 0; j < (PAGE_SIZE / RADEON_GPU_PAGE_SIZE); j++, t++) {
page_entry = radeon_gart_get_page_entry(page_base, 
flags);
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -360,6 +360,8 @@ static int radeon_ttm_tt_pin_userptr(str
  
  	if (current->mm != gtt->usermm)

return -EPERM;
+   if (!ttm->pages)
+   return -EPERM;
  
  	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {

/* check that we only pin down anonymous memory
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[5.12 regression] ttm->pages NULL dereference with radeon driver

2021-05-07 Thread Takashi Iwai

Hi,

we've received a regression report showing NULL dereference Oops with
radeon driver on 5.12 kernel:
  https://bugzilla.opensuse.org/show_bug.cgi?id=1185516

It turned out that the recent TTM cleanup / refactoring via commit
0575ff3d33cd ("drm/radeon: stop using pages with
drm_prime_sg_to_page_addr_arrays v2") is the culprit.  On 5.12 kernel,
ttm->pages is no longer allocated / set up, while the radeon driver
still has a few places assuming the valid ttm->pages, and for the
reporter (running the modesettig driver), radeon_gart_bind() hits the
problem.

A hackish patch below was confirmed to work, at least, but obviously
we need a proper fix.

Could you take a look at it?


thanks,

Takashi

--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -253,7 +253,7 @@ void radeon_gart_unbind(struct radeon_de
t = offset / RADEON_GPU_PAGE_SIZE;
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);
for (i = 0; i < pages; i++, p++) {
-   if (rdev->gart.pages[p]) {
+   if (1 /*rdev->gart.pages[p]*/) {
rdev->gart.pages[p] = NULL;
for (j = 0; j < (PAGE_SIZE / RADEON_GPU_PAGE_SIZE); 
j++, t++) {
rdev->gart.pages_entry[t] = 
rdev->dummy_page.entry;
@@ -301,7 +301,7 @@ int radeon_gart_bind(struct radeon_devic
p = t / (PAGE_SIZE / RADEON_GPU_PAGE_SIZE);
 
for (i = 0; i < pages; i++, p++) {
-   rdev->gart.pages[p] = pagelist[i];
+   /* rdev->gart.pages[p] = pagelist[i]; */
page_base = dma_addr[i];
for (j = 0; j < (PAGE_SIZE / RADEON_GPU_PAGE_SIZE); j++, t++) {
page_entry = radeon_gart_get_page_entry(page_base, 
flags);
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -360,6 +360,8 @@ static int radeon_ttm_tt_pin_userptr(str
 
if (current->mm != gtt->usermm)
return -EPERM;
+   if (!ttm->pages)
+   return -EPERM;
 
if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
/* check that we only pin down anonymous memory
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Move plane code from amdgpu_dm to amdgpu_dm_plane

2021-05-07 Thread Kazlauskas, Nicholas


On 2021-05-07 10:39 a.m., Rodrigo Siqueira wrote:

The amdgpu_dm file contains most of the code that works as an interface
between DRM API and Display Core. We maintain all the plane operations
inside amdgpu_dm; this commit extracts the plane code to its specific
file named amdgpu_dm_plane. This commit does not introduce any
functional change to the functions; it only changes some static
functions to global and adds some minor adjustments related to the copy
from one place to another.

Signed-off-by: Rodrigo Siqueira 
---
  .../gpu/drm/amd/display/amdgpu_dm/Makefile|9 +-
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1479 +---
  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 1496 +
  .../amd/display/amdgpu_dm/amdgpu_dm_plane.h   |   56 +
  4 files changed, 1559 insertions(+), 1481 deletions(-)
  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile 
b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
index 9a3b7bf8ab0b..6542ef0ff83e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
@@ -23,9 +23,12 @@
  # Makefile for the 'dm' sub-component of DAL.
  # It provides the control and status of dm blocks.
  
-

-
-AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o
+AMDGPUDM := \
+   amdgpu_dm.o \
+   amdgpu_dm_color.o \
+   amdgpu_dm_irq.o \
+   amdgpu_dm_mst_types.o \
+   amdgpu_dm_plane.o
  
  ifneq ($(CONFIG_DRM_AMD_DC),)

  AMDGPUDM += amdgpu_dm_services.o amdgpu_dm_helpers.o amdgpu_dm_pp_smu.o
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cc048c348a92..60ddb4d8be6c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -44,6 +44,7 @@
  #include "amdgpu_ucode.h"
  #include "atom.h"
  #include "amdgpu_dm.h"
+#include "amdgpu_dm_plane.h"
  #ifdef CONFIG_DRM_AMD_DC_HDCP
  #include "amdgpu_dm_hdcp.h"
  #include 
@@ -181,10 +182,6 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev);
  /* removes and deallocates the drm structures, created by the above function 
*/
  static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm);
  
-static int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm,

-   struct drm_plane *plane,
-   unsigned long possible_crtcs,
-   const struct dc_plane_cap *plane_cap);
  static int amdgpu_dm_crtc_init(struct amdgpu_display_manager *dm,
   struct drm_plane *plane,
   uint32_t link_index);
@@ -203,9 +200,6 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state);
  static int amdgpu_dm_atomic_check(struct drm_device *dev,
  struct drm_atomic_state *state);
  
-static void handle_cursor_update(struct drm_plane *plane,

-struct drm_plane_state *old_plane_state);
-
  static void amdgpu_dm_set_psr_caps(struct dc_link *link);
  static bool amdgpu_dm_psr_enable(struct dc_stream_state *stream);
  static bool amdgpu_dm_link_setup_psr(struct dc_stream_state *stream);
@@ -4125,925 +4119,12 @@ static const struct drm_encoder_funcs 
amdgpu_dm_encoder_funcs = {
.destroy = amdgpu_dm_encoder_destroy,
  };
  
-

-static void get_min_max_dc_plane_scaling(struct drm_device *dev,
-struct drm_framebuffer *fb,
-int *min_downscale, int *max_upscale)
-{
-   struct amdgpu_device *adev = drm_to_adev(dev);
-   struct dc *dc = adev->dm.dc;
-   /* Caps for all supported planes are the same on DCE and DCN 1 - 3 */
-   struct dc_plane_cap *plane_cap = &dc->caps.planes[0];
-
-   switch (fb->format->format) {
-   case DRM_FORMAT_P010:
-   case DRM_FORMAT_NV12:
-   case DRM_FORMAT_NV21:
-   *max_upscale = plane_cap->max_upscale_factor.nv12;
-   *min_downscale = plane_cap->max_downscale_factor.nv12;
-   break;
-
-   case DRM_FORMAT_XRGB16161616F:
-   case DRM_FORMAT_ARGB16161616F:
-   case DRM_FORMAT_XBGR16161616F:
-   case DRM_FORMAT_ABGR16161616F:
-   *max_upscale = plane_cap->max_upscale_factor.fp16;
-   *min_downscale = plane_cap->max_downscale_factor.fp16;
-   break;
-
-   default:
-   *max_upscale = plane_cap->max_upscale_factor.argb;
-   *min_downscale = plane_cap->max_downscale_factor.argb;
-   break;
-   }
-
-   /*
-* A factor of 1 in the plane_cap means to not allow scaling, ie. use a
-* scaling factor of 1.0 == 1000 units.
-

[PATCH 14/14] drm/amd/display: 3.2.135.1

From: Aric Cyr 

- adding missed FW promotion

Signed-off-by: Aric Cyr 
Reviewed-by: Aric Cyr 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 213a6cb05d11..d26153ab9d62 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -45,7 +45,7 @@
 /* forward declaration */
 struct aux_payload;
 
-#define DC_VER "3.2.135"
+#define DC_VER "3.2.135.1"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 13/14] drm/amd/display: [FW Promotion] Release 0.0.65

From: Anthony Koo 

- Implement INBOX0 messaging for HW lock

Signed-off-by: Anthony Koo 
Reviewed-by: Anthony Koo 
Acked-by: Stylon Wang 
---
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 123 +-
 1 file changed, 116 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 8df382aaeb8e..40ce15eb934c 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -47,10 +47,10 @@
 
 /* Firmware versioning. */
 #ifdef DMUB_EXPOSE_VERSION
-#define DMUB_FW_VERSION_GIT_HASH 0x9130ab830
+#define DMUB_FW_VERSION_GIT_HASH 0x2cab49dfb
 #define DMUB_FW_VERSION_MAJOR 0
 #define DMUB_FW_VERSION_MINOR 0
-#define DMUB_FW_VERSION_REVISION 64
+#define DMUB_FW_VERSION_REVISION 65
 #define DMUB_FW_VERSION_TEST 0
 #define DMUB_FW_VERSION_VBIOS 0
 #define DMUB_FW_VERSION_HOTFIX 0
@@ -164,6 +164,13 @@ extern "C" {
 #define dmub_udelay(microseconds) udelay(microseconds)
 #endif
 
+/**
+ * Number of nanoseconds per DMUB tick.
+ * DMCUB_TIMER_CURRENT increments in DMUB ticks, which are 10ns by default.
+ * If DMCUB_TIMER_WINDOW is non-zero this will no longer be true.
+ */
+#define NS_PER_DMUB_TICK 10
+
 /**
  * union dmub_addr - DMUB physical/virtual 64-bit address.
  */
@@ -455,6 +462,61 @@ enum dmub_gpint_command {
DMUB_GPINT__PSR_RESIDENCY = 9,
 };
 
+/**
+ * INBOX0 generic command definition
+ */
+union dmub_inbox0_cmd_common {
+   struct {
+   uint32_t command_code: 8; /**< INBOX0 command code */
+   uint32_t param: 24; /**< 24-bit parameter */
+   } bits;
+   uint32_t all;
+};
+
+/**
+ * INBOX0 hw_lock command definition
+ */
+union dmub_inbox0_cmd_lock_hw {
+   struct {
+   uint32_t command_code: 8;
+
+   /* NOTE: Must be have enough bits to match: enum hw_lock_client 
*/
+   uint32_t hw_lock_client: 1;
+
+   /* NOTE: Below fields must match with: struct 
dmub_hw_lock_inst_flags */
+   uint32_t otg_inst: 3;
+   uint32_t opp_inst: 3;
+   uint32_t dig_inst: 3;
+
+   /* NOTE: Below fields must match with: union dmub_hw_lock_flags 
*/
+   uint32_t lock_pipe: 1;
+   uint32_t lock_cursor: 1;
+   uint32_t lock_dig: 1;
+   uint32_t triple_buffer_lock: 1;
+
+   uint32_t lock: 1;   /**< Lock */
+   uint32_t should_release: 1; /**< Release */
+   uint32_t reserved: 8;   /**< Reserved for 
extending more clients, HW, etc. */
+   } bits;
+   uint32_t all;
+};
+
+union dmub_inbox0_data_register {
+   union dmub_inbox0_cmd_common inbox0_cmd_common;
+   union dmub_inbox0_cmd_lock_hw inbox0_cmd_lock_hw;
+};
+
+enum dmub_inbox0_command {
+   /**
+* DESC: Invalid command, ignored.
+*/
+   DMUB_INBOX0_CMD__INVALID_COMMAND = 0,
+   /**
+* DESC: Notification to acquire/release HW lock
+* ARGS:
+*/
+   DMUB_INBOX0_CMD__HW_LOCK = 1,
+};
 
//==
 
//=
 
//==
@@ -573,7 +635,8 @@ struct dmub_cmd_header {
unsigned int type : 8; /**< command type */
unsigned int sub_type : 8; /**< command sub type */
unsigned int ret_status : 1; /**< 1 if returned data, 0 otherwise */
-   unsigned int reserved0 : 7; /**< reserved bits */
+   unsigned int multi_cmd_pending : 1; /**< 1 if multiple commands chained 
together */
+   unsigned int reserved0 : 6; /**< reserved bits */
unsigned int payload_bytes : 6;  /* payload excluding header - up to 60 
bytes */
unsigned int reserved1 : 2; /**< reserved bits */
 };
@@ -1346,6 +1409,9 @@ struct dmub_rb_cmd_psr_force_static {
 
 /**
  * Set of HW components that can be locked.
+ *
+ * Note: If updating with more HW components, fields
+ * in dmub_inbox0_cmd_lock_hw must be updated to match.
  */
 union dmub_hw_lock_flags {
/**
@@ -1378,6 +1444,9 @@ union dmub_hw_lock_flags {
 
 /**
  * Instances of HW to be locked.
+ *
+ * Note: If updating with more HW components, fields
+ * in dmub_inbox0_cmd_lock_hw must be updated to match.
  */
 struct dmub_hw_lock_inst_flags {
/**
@@ -1401,16 +1470,16 @@ struct dmub_hw_lock_inst_flags {
 
 /**
  * Clients that can acquire the HW Lock Manager.
+ *
+ * Note: If updating with more clients, fields in
+ * dmub_inbox0_cmd_lock_hw must be updated to match.
  */
 enum hw_lock_client {
/**
 * Driver is the client of HW Lock Manager.
 */
HW_LOCK_CLIENT_DRIVER = 0,
-   /**
-* FW is the client of HW Lock Manager.
-*/
-   HW_LOCK_CLIENT_FW,
+   HW_LO

[PATCH 12/14] drm/amd/display: 3.2.135

From: Aric Cyr 

Signed-off-by: Aric Cyr 
Reviewed-by: Aric Cyr 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index d9e1657ba6a6..213a6cb05d11 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -45,7 +45,7 @@
 /* forward declaration */
 struct aux_payload;
 
-#define DC_VER "3.2.134"
+#define DC_VER "3.2.135"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 11/14] drm/amd/display: fix use_max_lb flag for 420 pixel formats

From: Dmytro Laktyushkin 

Right now the flag simply selects memory config 0 when flag is true
however 420 modes benefit more from memory config 3.

Signed-off-by: Dmytro Laktyushkin 
Reviewed-by: Aric Cyr 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
index efa86d5c6847..98ab4b776924 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c
@@ -496,10 +496,13 @@ static enum lb_memory_config 
dpp1_dscl_find_lb_memory_config(struct dcn10_dpp *d
int vtaps_c = scl_data->taps.v_taps_c;
int ceil_vratio = dc_fixpt_ceil(scl_data->ratios.vert);
int ceil_vratio_c = dc_fixpt_ceil(scl_data->ratios.vert_c);
-   enum lb_memory_config mem_cfg = LB_MEMORY_CONFIG_0;
 
-   if (dpp->base.ctx->dc->debug.use_max_lb)
-   return mem_cfg;
+   if (dpp->base.ctx->dc->debug.use_max_lb) {
+   if (scl_data->format == PIXEL_FORMAT_420BPP8
+   || scl_data->format == PIXEL_FORMAT_420BPP10)
+   return LB_MEMORY_CONFIG_3;
+   return LB_MEMORY_CONFIG_0;
+   }
 
dpp->base.caps->dscl_calc_lb_num_partitions(
scl_data, LB_MEMORY_CONFIG_1, &num_part_y, &num_part_c);
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 10/14] drm/amd/display: Handle potential dpp_inst mismatch with pipe_idx

From: Anthony Wang 

[Why]
In some pipe harvesting configs, we will select the incorrect
dpp_inst when programming DTO. This is because when any intermediate
pipe is fused, resource instances are no longer in 1:1
correspondence with pipe index.

[How]
When looping through pipes to program DTO, get the dpp_inst
associated with each pipe from res_pool.

Signed-off-by: Anthony Wang 
Reviewed-by: Yongqiang Sun 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
index c2d0f68dbdcc..f965914ea57c 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
@@ -106,10 +106,10 @@ static void rn_update_clocks_update_dpp_dto(struct 
clk_mgr_internal *clk_mgr,
for (i = 0; i < clk_mgr->base.ctx->dc->res_pool->pipe_count; i++) {
int dpp_inst, dppclk_khz, prev_dppclk_khz;
 
-   /* Loop index will match dpp->inst if resource exists,
-* and we want to avoid dependency on dpp object
+   /* Loop index may not match dpp->inst if some pipes disabled,
+* so select correct inst from res_pool
 */
-   dpp_inst = i;
+   dpp_inst = clk_mgr->base.ctx->dc->res_pool->dpps[i]->inst;
dppclk_khz = 
context->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz;
 
prev_dppclk_khz = clk_mgr->dccg->pipe_dppclk_khz[i];
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 09/14] drm/amd/display: Handle pixel format test request

From: Ilya Bakoulin 

[Why]
Some DSC tests fail because stream pixel encoding does not change
its value according to the type requested in the DPCD test params.

[How]
Set stream pixel encoding before updating DSC config and configuring
the test pattern.

Signed-off-by: Ilya Bakoulin 
Reviewed-by: Hanghong Ma 
Acked-by: Stylon Wang 
---
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 27c5d49a7bc1..ba4883fca616 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -2975,6 +2975,22 @@ static void dp_test_send_link_test_pattern(struct 
dc_link *link)
break;
}
 
+   switch (dpcd_test_params.bits.CLR_FORMAT) {
+   case 0:
+   pipe_ctx->stream->timing.pixel_encoding = PIXEL_ENCODING_RGB;
+   break;
+   case 1:
+   pipe_ctx->stream->timing.pixel_encoding = 
PIXEL_ENCODING_YCBCR422;
+   break;
+   case 2:
+   pipe_ctx->stream->timing.pixel_encoding = 
PIXEL_ENCODING_YCBCR444;
+   break;
+   default:
+   pipe_ctx->stream->timing.pixel_encoding = PIXEL_ENCODING_RGB;
+   break;
+   }
+
+
if (requestColorDepth != COLOR_DEPTH_UNDEFINED
&& pipe_ctx->stream->timing.display_color_depth != 
requestColorDepth) {
DC_LOG_DEBUG("%s: original bpc %d, changing to %d\n",
@@ -2982,9 +2998,10 @@ static void dp_test_send_link_test_pattern(struct 
dc_link *link)
pipe_ctx->stream->timing.display_color_depth,
requestColorDepth);
pipe_ctx->stream->timing.display_color_depth = 
requestColorDepth;
-   dp_update_dsc_config(pipe_ctx);
}
 
+   dp_update_dsc_config(pipe_ctx);
+
dc_link_dp_set_test_pattern(
link,
test_pattern,
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Make underlay rules less strict

2021-05-07 Thread Kazlauskas, Nicholas


On 2021-05-07 10:37 a.m., Rodrigo Siqueira wrote:

Currently, we reject all conditions where the underlay plane goes
outside the overlay plane limits, which is not entirely correct since we
reject some valid cases like the ones illustrated below:

   ++  ++
   |   Overlay plane|  |   Overlay plane|
   ||  |+---|--+
   | +--+   |  ||   |  |
   | |  |   |  ||   |  |
   ++  ++  |
 | Primary plane|   +--+
 |  (underlay)  |
 +--+
   +-+--+---+  ++
   |Overlay plane   |  |Overlay plane   |
+-|+   |  |   +--+
| ||   |  |   || |
| ||   |  |   || |
| ||   |  |   || |
+-|+   |  |   +--+
   ++  ++

This patch fixes this issue by only rejecting commit requests where the
underlay is entirely outside the overlay limits. After applying this
patch, a set of subtests related to kms_plane, kms_plane_alpha_blend,
and kms_plane_scaling will pass.

Signed-off-by: Rodrigo Siqueira 


What's the size of the overlay plane in your examples? If the overlay 
plane does not cover the entire screen then this patch is incorrect.


We don't want to be enabling the cursor on multiple pipes and the checks 
in DC to allow disabling cursor on bottom pipes only work if the 
underlay is entirely contained within the overlay.


In the case where the primary (underlay) plane extends beyond the screen 
boundaries it should be preclipped by userspace or earlier in the DM 
code before this check.


Feel free to follow up with clarification, but for now this patch is a 
NAK from me.


Regards,
Nicholas Kazlauskas


---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cc048c348a92..15006aafc630 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10098,10 +10098,10 @@ static int validate_overlay(struct drm_atomic_state 
*state)
return 0;
  
  	/* Perform the bounds check to ensure the overlay plane covers the primary */

-   if (primary_state->crtc_x < overlay_state->crtc_x ||
-   primary_state->crtc_y < overlay_state->crtc_y ||
-   primary_state->crtc_x + primary_state->crtc_w > overlay_state->crtc_x 
+ overlay_state->crtc_w ||
-   primary_state->crtc_y + primary_state->crtc_h > overlay_state->crtc_y 
+ overlay_state->crtc_h) {
+   if (primary_state->crtc_x + primary_state->crtc_w < 
overlay_state->crtc_x ||
+   primary_state->crtc_x > overlay_state->crtc_x + 
overlay_state->crtc_w ||
+   primary_state->crtc_y > overlay_state->crtc_y + 
overlay_state->crtc_h ||
+   primary_state->crtc_y + primary_state->crtc_h < 
overlay_state->crtc_y) {
DRM_DEBUG_ATOMIC("Overlay plane is enabled with hardware cursor but 
does not fully cover primary plane\n");
return -EINVAL;
}



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 08/14] drm/amd/display: Fix clock table filling logic

From: Ilya Bakoulin 

[Why]
Currently, the code that fills the clock table can miss filling
information about some of the higher voltage states advertised
by the SMU. This, in turn, may cause some of the higher pixel clock
modes (e.g. 8k60) to fail validation.

[How]
Fill the table with one entry per DCFCLK level instead of one entry
per FCLK level. This is needed because the maximum FCLK does not
necessarily need maximum voltage, whereas DCFCLK values from SMU
cover the full voltage range.

Signed-off-by: Ilya Bakoulin 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Stylon Wang 
---
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 80 ---
 .../drm/amd/display/dc/dcn21/dcn21_resource.c | 33 +---
 2 files changed, 74 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
index 887a54246bde..c2d0f68dbdcc 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
@@ -797,46 +797,67 @@ static struct wm_table lpddr4_wm_table_rn = {
},
}
 };
-static unsigned int find_socclk_for_voltage(struct dpm_clocks *clock_table, 
unsigned int voltage)
+
+static unsigned int find_max_fclk_for_voltage(struct dpm_clocks *clock_table,
+   unsigned int voltage)
 {
int i;
+   uint32_t max_clk = 0;
 
-   for (i = 0; i < PP_SMU_NUM_SOCCLK_DPM_LEVELS; i++) {
-   if (clock_table->SocClocks[i].Vol == voltage)
-   return clock_table->SocClocks[i].Freq;
+   for (i = 0; i < PP_SMU_NUM_FCLK_DPM_LEVELS; i++) {
+   if (clock_table->FClocks[i].Vol <= voltage) {
+   max_clk = clock_table->FClocks[i].Freq > max_clk ?
+   clock_table->FClocks[i].Freq : max_clk;
+   }
}
 
-   ASSERT(0);
-   return 0;
+   return max_clk;
 }
-static unsigned int find_dcfclk_for_voltage(struct dpm_clocks *clock_table, 
unsigned int voltage)
+
+static unsigned int find_max_memclk_for_voltage(struct dpm_clocks *clock_table,
+   unsigned int voltage)
 {
int i;
+   uint32_t max_clk = 0;
 
-   for (i = 0; i < PP_SMU_NUM_DCFCLK_DPM_LEVELS; i++) {
-   if (clock_table->DcfClocks[i].Vol == voltage)
-   return clock_table->DcfClocks[i].Freq;
+   for (i = 0; i < PP_SMU_NUM_MEMCLK_DPM_LEVELS; i++) {
+   if (clock_table->MemClocks[i].Vol <= voltage) {
+   max_clk = clock_table->MemClocks[i].Freq > max_clk ?
+   clock_table->MemClocks[i].Freq : max_clk;
+   }
}
 
-   ASSERT(0);
-   return 0;
+   return max_clk;
+}
+
+static unsigned int find_max_socclk_for_voltage(struct dpm_clocks *clock_table,
+   unsigned int voltage)
+{
+   int i;
+   uint32_t max_clk = 0;
+
+   for (i = 0; i < PP_SMU_NUM_SOCCLK_DPM_LEVELS; i++) {
+   if (clock_table->SocClocks[i].Vol <= voltage) {
+   max_clk = clock_table->SocClocks[i].Freq > max_clk ?
+   clock_table->SocClocks[i].Freq : max_clk;
+   }
+   }
+
+   return max_clk;
 }
 
 static void rn_clk_mgr_helper_populate_bw_params(struct clk_bw_params 
*bw_params, struct dpm_clocks *clock_table, struct integrated_info *bios_info)
 {
int i, j = 0;
+   unsigned int volt;
 
j = -1;
 
-   ASSERT(PP_SMU_NUM_FCLK_DPM_LEVELS <= MAX_NUM_DPM_LVL);
-
-   /* Find lowest DPM, FCLK is filled in reverse order*/
-
-   for (i = PP_SMU_NUM_FCLK_DPM_LEVELS - 1; i >= 0; i--) {
-   if (clock_table->FClocks[i].Freq != 0 && 
clock_table->FClocks[i].Vol != 0) {
+   /* Find max DPM */
+   for (i = 0; i < PP_SMU_NUM_DCFCLK_DPM_LEVELS; ++i) {
+   if (clock_table->DcfClocks[i].Freq != 0 &&
+   clock_table->DcfClocks[i].Vol != 0)
j = i;
-   break;
-   }
}
 
if (j == -1) {
@@ -847,13 +868,18 @@ static void rn_clk_mgr_helper_populate_bw_params(struct 
clk_bw_params *bw_params
 
bw_params->clk_table.num_entries = j + 1;
 
-   for (i = 0; i < bw_params->clk_table.num_entries; i++, j--) {
-   bw_params->clk_table.entries[i].fclk_mhz = 
clock_table->FClocks[j].Freq;
-   bw_params->clk_table.entries[i].memclk_mhz = 
clock_table->MemClocks[j].Freq;
-   bw_params->clk_table.entries[i].voltage = 
clock_table->FClocks[j].Vol;
-   bw_params->clk_table.entries[i].dcfclk_mhz = 
find_dcfclk_for_voltage(clock_table, clock_table->FClocks[j].Vol);
-   bw_params->clk_table.entries[i].socclk_mhz = 
find_socclk_for_voltage(clock_table,
-   
bw_params->clk_table.entries[i].vo

[PATCH 07/14] drm/amd/display: minor dp link training refactor

From: Wenjing Liu 

[how]
The change includes some dp link training refactors:
1. break down is_ch_eq_done to checking individual conditions in
its own function.
2. update dpcd_set_training_pattern to take in dc_dp_training_pattern
as input.
3. moving lttpr mode struct definition into link_service_types.h

Signed-off-by: Wenjing Liu 
Reviewed-by: George Shen 
Acked-by: Stylon Wang 
---
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 124 ++
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |   1 +
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   6 -
 .../amd/display/include/link_service_types.h  |   6 +
 4 files changed, 77 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index de75e8581078..27c5d49a7bc1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -108,10 +108,50 @@ static void wait_for_training_aux_rd_interval(
wait_in_micro_secs);
 }
 
+static enum dpcd_training_patterns
+   dc_dp_training_pattern_to_dpcd_training_pattern(
+   struct dc_link *link,
+   enum dc_dp_training_pattern pattern)
+{
+   enum dpcd_training_patterns dpcd_tr_pattern =
+   DPCD_TRAINING_PATTERN_VIDEOIDLE;
+
+   switch (pattern) {
+   case DP_TRAINING_PATTERN_SEQUENCE_1:
+   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_1;
+   break;
+   case DP_TRAINING_PATTERN_SEQUENCE_2:
+   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_2;
+   break;
+   case DP_TRAINING_PATTERN_SEQUENCE_3:
+   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_3;
+   break;
+   case DP_TRAINING_PATTERN_SEQUENCE_4:
+   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_4;
+   break;
+   case DP_TRAINING_PATTERN_VIDEOIDLE:
+   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_VIDEOIDLE;
+   break;
+   default:
+   ASSERT(0);
+   DC_LOG_HW_LINK_TRAINING("%s: Invalid HW Training pattern: %d\n",
+   __func__, pattern);
+   break;
+   }
+
+   return dpcd_tr_pattern;
+}
+
 static void dpcd_set_training_pattern(
struct dc_link *link,
-   union dpcd_training_pattern dpcd_pattern)
+   enum dc_dp_training_pattern training_pattern)
 {
+   union dpcd_training_pattern dpcd_pattern = { {0} };
+
+   dpcd_pattern.v1_4.TRAINING_PATTERN_SET =
+   dc_dp_training_pattern_to_dpcd_training_pattern(
+   link, training_pattern);
+
core_link_write_dpcd(
link,
DP_TRAINING_PATTERN_SET,
@@ -240,37 +280,6 @@ static void dpcd_set_link_settings(
}
 }
 
-static enum dpcd_training_patterns
-   dc_dp_training_pattern_to_dpcd_training_pattern(
-   struct dc_link *link,
-   enum dc_dp_training_pattern pattern)
-{
-   enum dpcd_training_patterns dpcd_tr_pattern =
-   DPCD_TRAINING_PATTERN_VIDEOIDLE;
-
-   switch (pattern) {
-   case DP_TRAINING_PATTERN_SEQUENCE_1:
-   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_1;
-   break;
-   case DP_TRAINING_PATTERN_SEQUENCE_2:
-   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_2;
-   break;
-   case DP_TRAINING_PATTERN_SEQUENCE_3:
-   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_3;
-   break;
-   case DP_TRAINING_PATTERN_SEQUENCE_4:
-   dpcd_tr_pattern = DPCD_TRAINING_PATTERN_4;
-   break;
-   default:
-   ASSERT(0);
-   DC_LOG_HW_LINK_TRAINING("%s: Invalid HW Training pattern: %d\n",
-   __func__, pattern);
-   break;
-   }
-
-   return dpcd_tr_pattern;
-}
-
 static uint8_t dc_dp_initialize_scrambling_data_symbols(
struct dc_link *link,
enum dc_dp_training_pattern pattern)
@@ -433,20 +442,30 @@ static bool is_cr_done(enum dc_lane_count ln_count,
 }
 
 static bool is_ch_eq_done(enum dc_lane_count ln_count,
-   union lane_status *dpcd_lane_status,
-   union lane_align_status_updated *lane_status_updated)
+   union lane_status *dpcd_lane_status)
 {
+   bool done = true;
uint32_t lane;
-   if (!lane_status_updated->bits.INTERLANE_ALIGN_DONE)
-   return false;
-   else {
-   for (lane = 0; lane < (uint32_t)(ln_count); lane++) {
-   if (!dpcd_lane_status[lane].bits.SYMBOL_LOCKED_0 ||
-   !dpcd_lane_status[lane].bits.CHANNEL_EQ_DONE_0)
-   return false;
-   }
-   }
-   return true;
+   for (lane = 0; lane < (uint32_t)(ln_count); lane++)
+   if (!dpcd_lane_status[lane].bits.CHANNEL_EQ_DONE_0)
+   done = false;
+   return done;
+}
+
+static bool is_symbol_locked(e

[PATCH 06/14] drm/amd/display: DETBufferSizeInKbyte variable type modifications

From: Chaitanya Dhere 

[Why]
DETBufferSizeInKByte is not expected to be sub-dividable, hence
unsigned int is a better suited data-type. Change it to an array
as well to satisfy current requirements.

[How]
Change the data-type of DETBufferSizeInKByte to an unsigned int
array. Modify the all the variables like DETBufferSizeY,
DETBufferSizeC that are involved in DETBufferSizeInKByte calculations
to unsigned int in all the display_mode_vba_xx files.

Signed-off-by: Chaitanya Dhere 
Reviewed-by: Dmytro Laktyushkin 
Acked-by: Stylon Wang 
---
 .../dc/dml/dcn20/display_mode_vba_20.c| 26 -
 .../dc/dml/dcn20/display_mode_vba_20v2.c  | 26 -
 .../dc/dml/dcn21/display_mode_vba_21.c| 58 +--
 .../dc/dml/dcn30/display_mode_vba_30.c| 48 +++
 .../drm/amd/display/dc/dml/display_mode_vba.c |  2 +-
 .../drm/amd/display/dc/dml/display_mode_vba.h | 14 ++---
 6 files changed, 87 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
index 9729cf292e84..d3b5b6fedf04 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c
@@ -2895,7 +2895,7 @@ static void dml20_DisplayPipeConfiguration(struct 
display_mode_lib *mode_lib)
RoundedUpMaxSwathSizeBytesC = 0.0;
 
if (RoundedUpMaxSwathSizeBytesY + RoundedUpMaxSwathSizeBytesC
-   <= mode_lib->vba.DETBufferSizeInKByte * 1024.0 
/ 2.0) {
+   <= mode_lib->vba.DETBufferSizeInKByte[0] * 
1024.0 / 2.0) {
mode_lib->vba.SwathHeightY[k] = MaximumSwathHeightY;
mode_lib->vba.SwathHeightC[k] = MaximumSwathHeightC;
} else {
@@ -2904,17 +2904,17 @@ static void dml20_DisplayPipeConfiguration(struct 
display_mode_lib *mode_lib)
}
 
if (mode_lib->vba.SwathHeightC[k] == 0) {
-   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte * 1024;
+   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte[0] * 1024;
mode_lib->vba.DETBufferSizeC[k] = 0;
} else if (mode_lib->vba.SwathHeightY[k] <= 
mode_lib->vba.SwathHeightC[k]) {
-   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte
+   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte[0]
* 1024.0 / 2;
-   mode_lib->vba.DETBufferSizeC[k] = 
mode_lib->vba.DETBufferSizeInKByte
+   mode_lib->vba.DETBufferSizeC[k] = 
mode_lib->vba.DETBufferSizeInKByte[0]
* 1024.0 / 2;
} else {
-   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte
+   mode_lib->vba.DETBufferSizeY[k] = 
mode_lib->vba.DETBufferSizeInKByte[0]
* 1024.0 * 2 / 3;
-   mode_lib->vba.DETBufferSizeC[k] = 
mode_lib->vba.DETBufferSizeInKByte
+   mode_lib->vba.DETBufferSizeC[k] = 
mode_lib->vba.DETBufferSizeInKByte[0]
* 1024.0 / 3;
}
}
@@ -3819,7 +3819,7 @@ void dml20_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
mode_lib->vba.MaximumSwathWidthInDETBuffer =
dml_min(

mode_lib->vba.MaximumSwathWidthSupport,
-   
mode_lib->vba.DETBufferSizeInKByte * 1024.0 / 2.0
+   
mode_lib->vba.DETBufferSizeInKByte[0] * 1024.0 / 2.0
/ 
(locals->BytePerPixelInDETY[k]

* locals->MinSwathHeightY[k]

+ locals->BytePerPixelInDETC[k]
@@ -4322,7 +4322,7 @@ void dml20_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l
locals->RoundedUpMaxSwathSizeBytesC = 0;
}
 
-   if (locals->RoundedUpMaxSwathSizeBytesY + 
locals->RoundedUpMaxSwathSizeBytesC <= locals->DETBufferSizeInKByte * 1024 / 2) 
{
+   if (locals->RoundedUpMaxSwathSizeBytesY + 
locals->RoundedUpMaxSwathSizeBytesC <= locals->DETBufferSizeInKByte[0] * 1024 / 
2) {
locals->SwathHeightYPerState[i][j][k] = 
locals->MaxSwathHeightY[k];
locals->SwathHeigh

[PATCH 05/14] drm/amd/display: Add dc log for DP SST DSC enable/disable

From: Fangzhi Zuo 

Signed-off-by: Fangzhi Zuo 
Reviewed-by: Mikita Lipski 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index 90eacdac0ea0..4646b0d02939 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -544,8 +544,10 @@ bool dm_helpers_dp_write_dsc_enable(
ret = drm_dp_dpcd_write(aconnector->dsc_aux, DP_DSC_ENABLE, 
&enable_dsc, 1);
}
 
-   if (stream->signal == SIGNAL_TYPE_DISPLAY_PORT)
-   return dm_helpers_dp_write_dpcd(ctx, stream->link, 
DP_DSC_ENABLE, &enable_dsc, 1);
+   if (stream->signal == SIGNAL_TYPE_DISPLAY_PORT) {
+   ret = dm_helpers_dp_write_dpcd(ctx, stream->link, 
DP_DSC_ENABLE, &enable_dsc, 1);
+   DC_LOG_DC("Send DSC %s to sst display\n", enable_dsc ? "enable" 
: "disable");
+   }
 
return (ret > 0);
 }
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 04/14] drm/amd/display: Expand DP module training API.

From: Jimmy Kizito 

[Why & How]
Add functionality useful for DP link training to public interface.

Signed-off-by: Jimmy Kizito 
Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 10 +-
 drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h  |  7 +++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index b6ed57ba7a48..de75e8581078 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1175,7 +1175,7 @@ static inline enum link_training_result 
perform_link_training_int(
return status;
 }
 
-static enum link_training_result check_link_loss_status(
+enum link_training_result dp_check_link_loss_status(
struct dc_link *link,
const struct link_training_settings *link_training_setting)
 {
@@ -1309,7 +1309,7 @@ static void initialize_training_settings(
lt_settings->enhanced_framing = 1;
 }
 
-static uint8_t convert_to_count(uint8_t lttpr_repeater_count)
+uint8_t dp_convert_to_count(uint8_t lttpr_repeater_count)
 {
switch (lttpr_repeater_count) {
case 0x80: // 1 lttpr repeater
@@ -1378,7 +1378,7 @@ static void configure_lttpr_mode_non_transparent(struct 
dc_link *link)
link->dpcd_caps.lttpr_caps.mode = repeater_mode;
}
 
-   repeater_cnt = 
convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
+   repeater_cnt = 
dp_convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
 
for (repeater_id = repeater_cnt; repeater_id > 0; 
repeater_id--) {
aux_interval_address = 
DP_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1 +
@@ -1605,7 +1605,7 @@ enum link_training_result 
dc_link_dp_perform_link_training(
/* 2. perform link training (set link training done
 *  to false is done as well)
 */
-   repeater_cnt = 
convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
+   repeater_cnt = 
dp_convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
 
for (repeater_id = repeater_cnt; (repeater_id > 0 && status == 
LINK_TRAINING_SUCCESS);
repeater_id--) {
@@ -1648,7 +1648,7 @@ enum link_training_result 
dc_link_dp_perform_link_training(
 */
if (link->connector_signal != SIGNAL_TYPE_EDP && status == 
LINK_TRAINING_SUCCESS) {
msleep(5);
-   status = check_link_loss_status(link, <_settings);
+   status = dp_check_link_loss_status(link, <_settings);
}
 
/* 6. print status message*/
diff --git a/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h 
b/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
index 38e6fbf1e26d..428842511c03 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/dc_link_dp.h
@@ -97,5 +97,12 @@ void dp_set_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool 
enable);
 bool dp_update_dsc_config(struct pipe_ctx *pipe_ctx);
 bool dp_set_dsc_on_rx(struct pipe_ctx *pipe_ctx, bool enable);
 
+/* Convert PHY repeater count read from DPCD uint8_t. */
+uint8_t dp_convert_to_count(uint8_t lttpr_repeater_count);
+
+/* Check DPCD training status registers to detect link loss. */
+enum link_training_result dp_check_link_loss_status(
+   struct dc_link *link,
+   const struct link_training_settings *link_training_setting);
 
 #endif /* __DC_LINK_DP_H__ */
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 03/14] drm/amd/display: Add fallback and abort paths for DP link training.

From: Jimmy Kizito 

[Why]
When enabling a DisplayPort stream:
- Optionally reducing link bandwidth between failed link training
attempts should progressively relax training requirements.
- Abandoning link training altogether if a sink is unplugged should
avoid unnecessary training attempts.

[How]
- Add fallback parameter to DP link training function and reduce link
bandwidth between failed training attempts as long as stream bandwidth
requirements are met.
- Add training status for sink unplug and abort training when this
status is reported.

Signed-off-by: Jimmy Kizito 
Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  5 ++-
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 40 +++
 .../drm/amd/display/dc/core/dc_link_hwss.c|  3 +-
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |  3 +-
 .../amd/display/include/link_service_types.h  |  2 +
 5 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index d040d235c2db..c4405eba724c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -1750,6 +1750,8 @@ static enum dc_status enable_link_dp(struct dc_state 
*state,
bool apply_seamless_boot_optimization = false;
uint32_t bl_oled_enable_delay = 50; // in ms
const uint32_t post_oui_delay = 30; // 30ms
+   /* Reduce link bandwidth between failed link training attempts. */
+   bool do_fallback = false;
 
// check for seamless boot
for (i = 0; i < state->stream_count; i++) {
@@ -1788,7 +1790,8 @@ static enum dc_status enable_link_dp(struct dc_state 
*state,
   skip_video_pattern,
   LINK_TRAINING_ATTEMPTS,
   pipe_ctx,
-  pipe_ctx->stream->signal)) {
+  pipe_ctx->stream->signal,
+  do_fallback)) {
link->cur_link_settings = link_settings;
status = DC_OK;
} else {
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 8565281e6179..b6ed57ba7a48 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -1701,18 +1701,31 @@ bool perform_link_training_with_retries(
bool skip_video_pattern,
int attempts,
struct pipe_ctx *pipe_ctx,
-   enum signal_type signal)
+   enum signal_type signal,
+   bool do_fallback)
 {
uint8_t j;
uint8_t delay_between_attempts = LINK_TRAINING_RETRY_DELAY;
struct dc_stream_state *stream = pipe_ctx->stream;
struct dc_link *link = stream->link;
enum dp_panel_mode panel_mode;
+   struct link_encoder *link_enc;
+   enum link_training_result status = LINK_TRAINING_CR_FAIL_LANE0;
+   struct dc_link_settings currnet_setting = *link_setting;
+
+   /* Dynamically assigned link encoders associated with stream rather than
+* link.
+*/
+   if (link->dc->res_pool->funcs->link_encs_assign)
+   link_enc = stream->link_enc;
+   else
+   link_enc = link->link_enc;
+   ASSERT(link_enc);
 
/* We need to do this before the link training to ensure the idle 
pattern in SST
 * mode will be sent right after the link training
 */
-   link->link_enc->funcs->connect_dig_be_to_fe(link->link_enc,
+   link_enc->funcs->connect_dig_be_to_fe(link_enc,

pipe_ctx->stream_res.stream_enc->id, true);
 
for (j = 0; j < attempts; ++j) {
@@ -1724,7 +1737,7 @@ bool perform_link_training_with_retries(
link,
signal,
pipe_ctx->clock_source->id,
-   link_setting);
+   &currnet_setting);
 
if (stream->sink_patches.dppowerup_delay > 0) {
int delay_dp_power_up_in_ms = 
stream->sink_patches.dppowerup_delay;
@@ -1739,14 +1752,12 @@ bool perform_link_training_with_retries(
 panel_mode != DP_PANEL_MODE_DEFAULT);
 
if (link->aux_access_disabled) {
-   dc_link_dp_perform_link_training_skip_aux(link, 
link_setting);
+   dc_link_dp_perform_link_training_skip_aux(link, 
&currnet_setting);
return true;
} else {
-   enum link_training_result status = 
LINK_TRAINING_CR_FAIL_LANE0;
-
status = dc_link_dp_perform_link_training(

l

[PATCH 02/14] drm/amd/display: Update setting of DP training parameters.

From: Jimmy Kizito 

[Why]
Some links are dynamically assigned link encoders on stream enablement.

[How]
Update DisplayPort training parameter determination stage that assumes
link encoder statically assigned to link.

Signed-off-by: Jimmy Kizito 
Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 17 +-
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  4 
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 18 ++-
 .../drm/amd/display/dc/core/dc_link_enc_cfg.c | 22 ++-
 .../gpu/drm/amd/display/dc/inc/link_enc_cfg.h |  7 +-
 5 files changed, 60 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index a2e7747ee387..d040d235c2db 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -48,6 +48,7 @@
 #include "dce/dmub_psr.h"
 #include "dmub/dmub_srv.h"
 #include "inc/hw/panel_cntl.h"
+#include "inc/link_enc_cfg.h"
 
 #define DC_LOGGER_INIT(logger)
 
@@ -3737,8 +3738,22 @@ void dc_link_overwrite_extended_receiver_cap(
 
 bool dc_link_is_fec_supported(const struct dc_link *link)
 {
+   struct link_encoder *link_enc = NULL;
+
+   /* Links supporting dynamically assigned link encoder will be assigned 
next
+* available encoder if one not already assigned.
+*/
+   if (link->is_dig_mapping_flexible &&
+   link->dc->res_pool->funcs->link_encs_assign) {
+   link_enc = 
link_enc_cfg_get_link_enc_used_by_link(link->dc->current_state, link);
+   if (link_enc == NULL)
+   link_enc = 
link_enc_cfg_get_next_avail_link_enc(link->dc, link->dc->current_state);
+   } else
+   link_enc = link->link_enc;
+   ASSERT(link_enc);
+
return (dc_is_dp_signal(link->connector_signal) &&
-   link->link_enc->features.fec_supported &&
+   link_enc->features.fec_supported &&
link->dpcd_caps.fec_cap.bits.FEC_CAPABLE &&
!IS_FPGA_MAXIMUS_DC(link->ctx->dce_environment));
 }
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index 3bdd54e6248a..ba6b56f20269 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -685,6 +685,10 @@ bool dc_link_aux_try_to_configure_timeout(struct 
ddc_service *ddc,
bool result = false;
struct ddc *ddc_pin = ddc->ddc_pin;
 
+   /* Do not try to access nonexistent DDC pin. */
+   if (ddc->link->ep_type != DISPLAY_ENDPOINT_PHY)
+   return true;
+
if 
(ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en]->funcs->configure_timeout)
 {

ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en]->funcs->configure_timeout(ddc,
 timeout);
result = true;
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index a22484e90e75..8565281e6179 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -14,6 +14,7 @@
 #include "dpcd_defs.h"
 #include "dc_dmub_srv.h"
 #include "dce/dmub_hw_lock_mgr.h"
+#include "inc/link_enc_cfg.h"
 
 /*Travis*/
 static const uint8_t DP_VGA_LVDS_CONVERTER_ID_2[] = "sivarT";
@@ -132,10 +133,22 @@ static enum dc_dp_training_pattern 
decide_cr_training_pattern(
 static enum dc_dp_training_pattern decide_eq_training_pattern(struct dc_link 
*link,
const struct dc_link_settings *link_settings)
 {
+   struct link_encoder *link_enc;
enum dc_dp_training_pattern highest_tp = DP_TRAINING_PATTERN_SEQUENCE_2;
-   struct encoder_feature_support *features = &link->link_enc->features;
+   struct encoder_feature_support *features;
struct dpcd_caps *dpcd_caps = &link->dpcd_caps;
 
+   /* Access link encoder capability based on whether it is statically
+* or dynamically assigned to a link.
+*/
+   if (link->is_dig_mapping_flexible &&
+   link->dc->res_pool->funcs->link_encs_assign)
+   link_enc = 
link_enc_cfg_get_link_enc_used_by_link(link->dc->current_state, link);
+   else
+   link_enc = link->link_enc;
+   ASSERT(link_enc);
+   features = &link_enc->features;
+
if (features->flags.bits.IS_TPS3_CAPABLE)
highest_tp = DP_TRAINING_PATTERN_SEQUENCE_3;
 
@@ -1366,6 +1379,7 @@ static void configure_lttpr_mode_non_transparent(struct 
dc_link *link)
}
 
repeater_cnt = 
convert_to_count(link->dpcd_caps.lttpr_caps.phy_repeater_cnt);
+
for (repeater_id = repeater_cnt; repeater_id > 0; 
repeater_id--) {
aux_interval_address = 
DP_TRAINING_AUX_RD_INTERV

[PATCH 01/14] drm/amd/display: Update DPRX detection.

From: Jimmy Kizito 

[Why]
Some extra provisions are required during DPRX detection for links which
lack physical HPD and AUX/DDC pins.

[How]
Avoid attempting to access nonexistent physical pins during DPRX
detection.

Signed-off-by: Jimmy Kizito 
Reviewed-by: Jun Lei 
Acked-by: Stylon Wang 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 27 ---
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  1 +
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 3fb0cebd6938..a2e7747ee387 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -247,6 +247,16 @@ bool dc_link_detect_sink(struct dc_link *link, enum 
dc_connection_type *type)
link->dc->hwss.edp_wait_for_hpd_ready(link, true);
}
 
+   /* Link may not have physical HPD pin. */
+   if (link->ep_type != DISPLAY_ENDPOINT_PHY) {
+   if (link->hpd_status)
+   *type = dc_connection_single;
+   else
+   *type = dc_connection_none;
+
+   return true;
+   }
+
/* todo: may need to lock gpio access */
hpd_pin = get_hpd_gpio(link->ctx->dc_bios, link->link_id,
   link->ctx->gpio_service);
@@ -432,8 +442,18 @@ bool dc_link_is_dp_sink_present(struct dc_link *link)
 static enum signal_type link_detect_sink(struct dc_link *link,
 enum dc_detect_reason reason)
 {
-   enum signal_type result = get_basic_signal_type(link->link_enc->id,
-   link->link_id);
+   enum signal_type result;
+   struct graphics_object_id enc_id;
+
+   if (link->is_dig_mapping_flexible)
+   enc_id = (struct graphics_object_id){.id = ENCODER_ID_UNKNOWN};
+   else
+   enc_id = link->link_enc->id;
+   result = get_basic_signal_type(enc_id, link->link_id);
+
+   /* Use basic signal type for link without physical connector. */
+   if (link->ep_type != DISPLAY_ENDPOINT_PHY)
+   return result;
 
/* Internal digital encoder will detect only dongles
 * that require digital signal
@@ -955,7 +975,8 @@ static bool dc_link_detect_helper(struct dc_link *link,
 
case SIGNAL_TYPE_DISPLAY_PORT: {
/* wa HPD high coming too early*/
-   if (link->link_enc->features.flags.bits.DP_IS_USB_C == 
1) {
+   if (link->ep_type == DISPLAY_ENDPOINT_PHY &&
+   link->link_enc->features.flags.bits.DP_IS_USB_C == 
1) {
/* if alt mode times out, return false */
if (!wait_for_entering_dp_alt_mode(link))
return false;
diff --git a/drivers/gpu/drm/amd/display/dc/dc_link.h 
b/drivers/gpu/drm/amd/display/dc/dc_link.h
index fc5622ffec3d..5196df1ebad1 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_link.h
+++ b/drivers/gpu/drm/amd/display/dc/dc_link.h
@@ -113,6 +113,7 @@ struct dc_link {
/* TODO: Rename. Flag an endpoint as having a programmable mapping to a
 * DIG encoder. */
bool is_dig_mapping_flexible;
+   bool hpd_status; /* HPD status of link without physical HPD pin. */
 
bool edp_sink_present;
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 00/14] DC Patches May 10, 2021

This DC patchset brings improvements in multiple areas. In summary, we
highlight:

* DC v3.2.135.1
* Improvements across DP, DPP, clock management, pixel formats

Anthony Koo (1):
  drm/amd/display: [FW Promotion] Release 0.0.65

Anthony Wang (1):
  drm/amd/display: Handle potential dpp_inst mismatch with pipe_idx

Aric Cyr (2):
  drm/amd/display: 3.2.135
  drm/amd/display: 3.2.135.1

Chaitanya Dhere (1):
  drm/amd/display: DETBufferSizeInKbyte variable type modifications

Dmytro Laktyushkin (1):
  drm/amd/display: fix use_max_lb flag for 420 pixel formats

Fangzhi Zuo (1):
  drm/amd/display: Add dc log for DP SST DSC enable/disable

Ilya Bakoulin (2):
  drm/amd/display: Fix clock table filling logic
  drm/amd/display: Handle pixel format test request

Jimmy Kizito (4):
  drm/amd/display: Update DPRX detection.
  drm/amd/display: Update setting of DP training parameters.
  drm/amd/display: Add fallback and abort paths for DP link training.
  drm/amd/display: Expand DP module training API.

Wenjing Liu (1):
  drm/amd/display: minor dp link training refactor

 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |   6 +-
 .../amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c |  86 ---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  49 +++-
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |   4 +
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 211 --
 .../drm/amd/display/dc/core/dc_link_enc_cfg.c |  22 +-
 .../drm/amd/display/dc/core/dc_link_hwss.c|   3 +-
 drivers/gpu/drm/amd/display/dc/dc.h   |   2 +-
 drivers/gpu/drm/amd/display/dc/dc_dp_types.h  |   1 +
 drivers/gpu/drm/amd/display/dc/dc_link.h  |   7 +-
 .../drm/amd/display/dc/dcn10/dcn10_dpp_dscl.c |   9 +-
 .../drm/amd/display/dc/dcn21/dcn21_resource.c |  33 ++-
 .../dc/dml/dcn20/display_mode_vba_20.c|  26 +--
 .../dc/dml/dcn20/display_mode_vba_20v2.c  |  26 +--
 .../dc/dml/dcn21/display_mode_vba_21.c|  58 ++---
 .../dc/dml/dcn30/display_mode_vba_30.c|  48 ++--
 .../drm/amd/display/dc/dml/display_mode_vba.c |   2 +-
 .../drm/amd/display/dc/dml/display_mode_vba.h |  14 +-
 .../gpu/drm/amd/display/dc/inc/dc_link_dp.h   |  10 +-
 .../gpu/drm/amd/display/dc/inc/link_enc_cfg.h |   7 +-
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   | 123 +-
 .../amd/display/include/link_service_types.h  |   8 +
 22 files changed, 525 insertions(+), 230 deletions(-)

-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 0/1] Fix the slowdown for Vega 20 server boards

2021-05-07 Thread Clements, John

[AMD Official Use Only - Internal Distribution Only]

Patch is:

Reviewed-by: John Clements 


From: Tuikov, Luben 
Sent: Thursday, May 6, 2021 8:10 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Tuikov, Luben ; Deucher, Alexander 
; Clements, John ; Zhang, 
Hawking 
Subject: [PATCH 0/1] Fix the slowdown for Vega 20 server boards

This patch fixes the interactive--to the point of
unusability--slowdown when RAS is enabled on Vega 20 server
boards--which support full RAS.

Luben Tuikov (1):
  drm/amdgpu: Poll of RAS errors asynchronously

 drivers/gpu/drm/amd/amdgpu/amdgpu.h |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 32 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  9 
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 61 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  8 ++--
 5 files changed, 85 insertions(+), 26 deletions(-)

Cc: Alexander Deucher 
Cc: John Clements 
Cc: Hawking Zhang 

--
2.31.0.97.g1424303384

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 0/4] Normalize redundant variables

2021-05-07 Thread Clements, John

[AMD Official Use Only - Internal Distribution Only]

Series is:
Reviewed-by: John Clements 


From: Tuikov, Luben 
Sent: Wednesday, May 5, 2021 5:47 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Tuikov, Luben ; Deucher, Alexander 
; Clements, John ; Zhang, 
Hawking 
Subject: [PATCH 0/4] Normalize redundant variables

Classic normalization of a redundant variable.
There is no need to have two variables representing
the same quantity. Move up to the structure which
represents the object which determines their values.
Rename to a consistent name, and export to debugfs
for debugging.

Luben Tuikov (4):
  drm/amdgpu: Remove redundant ras->supported
  drm/amdgpu: Move up ras_hw_supported
  drm/amdgpu: Rename to ras_*_enabled
  drm/amdgpu: Export ras_*_enabled to debugfs

 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c|  6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c   | 91 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h   |  5 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/soc15.c|  7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c |  6 +-
 .../drm/amd/pm/powerplay/hwmgr/vega20_baco.c  |  2 +-
 .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c|  3 +-
 11 files changed, 63 insertions(+), 66 deletions(-)

Cc: Alexander Deucher 
Cc: John Clements 
Cc: Hawking Zhang 

--
2.31.0.97.g1424303384

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH] drm/amd/display: Move plane code from amdgpu_dm to amdgpu_dm_plane

2021-05-07 Thread Rodrigo Siqueira

The amdgpu_dm file contains most of the code that works as an interface
between DRM API and Display Core. We maintain all the plane operations
inside amdgpu_dm; this commit extracts the plane code to its specific
file named amdgpu_dm_plane. This commit does not introduce any
functional change to the functions; it only changes some static
functions to global and adds some minor adjustments related to the copy
from one place to another.

Signed-off-by: Rodrigo Siqueira 
---
 .../gpu/drm/amd/display/amdgpu_dm/Makefile|9 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1479 +---
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 1496 +
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.h   |   56 +
 4 files changed, 1559 insertions(+), 1481 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
 create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.h

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile 
b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
index 9a3b7bf8ab0b..6542ef0ff83e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/Makefile
@@ -23,9 +23,12 @@
 # Makefile for the 'dm' sub-component of DAL.
 # It provides the control and status of dm blocks.
 
-
-
-AMDGPUDM = amdgpu_dm.o amdgpu_dm_irq.o amdgpu_dm_mst_types.o amdgpu_dm_color.o
+AMDGPUDM := \
+   amdgpu_dm.o \
+   amdgpu_dm_color.o \
+   amdgpu_dm_irq.o \
+   amdgpu_dm_mst_types.o \
+   amdgpu_dm_plane.o
 
 ifneq ($(CONFIG_DRM_AMD_DC),)
 AMDGPUDM += amdgpu_dm_services.o amdgpu_dm_helpers.o amdgpu_dm_pp_smu.o
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cc048c348a92..60ddb4d8be6c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -44,6 +44,7 @@
 #include "amdgpu_ucode.h"
 #include "atom.h"
 #include "amdgpu_dm.h"
+#include "amdgpu_dm_plane.h"
 #ifdef CONFIG_DRM_AMD_DC_HDCP
 #include "amdgpu_dm_hdcp.h"
 #include 
@@ -181,10 +182,6 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev);
 /* removes and deallocates the drm structures, created by the above function */
 static void amdgpu_dm_destroy_drm_device(struct amdgpu_display_manager *dm);
 
-static int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm,
-   struct drm_plane *plane,
-   unsigned long possible_crtcs,
-   const struct dc_plane_cap *plane_cap);
 static int amdgpu_dm_crtc_init(struct amdgpu_display_manager *dm,
   struct drm_plane *plane,
   uint32_t link_index);
@@ -203,9 +200,6 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state);
 static int amdgpu_dm_atomic_check(struct drm_device *dev,
  struct drm_atomic_state *state);
 
-static void handle_cursor_update(struct drm_plane *plane,
-struct drm_plane_state *old_plane_state);
-
 static void amdgpu_dm_set_psr_caps(struct dc_link *link);
 static bool amdgpu_dm_psr_enable(struct dc_stream_state *stream);
 static bool amdgpu_dm_link_setup_psr(struct dc_stream_state *stream);
@@ -4125,925 +4119,12 @@ static const struct drm_encoder_funcs 
amdgpu_dm_encoder_funcs = {
.destroy = amdgpu_dm_encoder_destroy,
 };
 
-
-static void get_min_max_dc_plane_scaling(struct drm_device *dev,
-struct drm_framebuffer *fb,
-int *min_downscale, int *max_upscale)
-{
-   struct amdgpu_device *adev = drm_to_adev(dev);
-   struct dc *dc = adev->dm.dc;
-   /* Caps for all supported planes are the same on DCE and DCN 1 - 3 */
-   struct dc_plane_cap *plane_cap = &dc->caps.planes[0];
-
-   switch (fb->format->format) {
-   case DRM_FORMAT_P010:
-   case DRM_FORMAT_NV12:
-   case DRM_FORMAT_NV21:
-   *max_upscale = plane_cap->max_upscale_factor.nv12;
-   *min_downscale = plane_cap->max_downscale_factor.nv12;
-   break;
-
-   case DRM_FORMAT_XRGB16161616F:
-   case DRM_FORMAT_ARGB16161616F:
-   case DRM_FORMAT_XBGR16161616F:
-   case DRM_FORMAT_ABGR16161616F:
-   *max_upscale = plane_cap->max_upscale_factor.fp16;
-   *min_downscale = plane_cap->max_downscale_factor.fp16;
-   break;
-
-   default:
-   *max_upscale = plane_cap->max_upscale_factor.argb;
-   *min_downscale = plane_cap->max_downscale_factor.argb;
-   break;
-   }
-
-   /*
-* A factor of 1 in the plane_cap means to not allow scaling, ie. use a
-* scaling factor of 1.0 == 1000 units.
-*/
-   if (*max_upscale == 1)
-   *max_upscale = 1000;
-
-   if (*m

[PATCH] drm/amd/display: Make underlay rules less strict

2021-05-07 Thread Rodrigo Siqueira

Currently, we reject all conditions where the underlay plane goes
outside the overlay plane limits, which is not entirely correct since we
reject some valid cases like the ones illustrated below:

  ++  ++
  |   Overlay plane|  |   Overlay plane|
  ||  |+---|--+
  | +--+   |  ||   |  |
  | |  |   |  ||   |  |
  ++  ++  |
| Primary plane|   +--+
|  (underlay)  |
+--+
  +-+--+---+  ++
  |Overlay plane   |  |Overlay plane   |
+-|+   |  |   +--+
| ||   |  |   || |
| ||   |  |   || |
| ||   |  |   || |
+-|+   |  |   +--+
  ++  ++

This patch fixes this issue by only rejecting commit requests where the
underlay is entirely outside the overlay limits. After applying this
patch, a set of subtests related to kms_plane, kms_plane_alpha_blend,
and kms_plane_scaling will pass.

Signed-off-by: Rodrigo Siqueira 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cc048c348a92..15006aafc630 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10098,10 +10098,10 @@ static int validate_overlay(struct drm_atomic_state 
*state)
return 0;
 
/* Perform the bounds check to ensure the overlay plane covers the 
primary */
-   if (primary_state->crtc_x < overlay_state->crtc_x ||
-   primary_state->crtc_y < overlay_state->crtc_y ||
-   primary_state->crtc_x + primary_state->crtc_w > 
overlay_state->crtc_x + overlay_state->crtc_w ||
-   primary_state->crtc_y + primary_state->crtc_h > 
overlay_state->crtc_y + overlay_state->crtc_h) {
+   if (primary_state->crtc_x + primary_state->crtc_w < 
overlay_state->crtc_x ||
+   primary_state->crtc_x > overlay_state->crtc_x + 
overlay_state->crtc_w ||
+   primary_state->crtc_y > overlay_state->crtc_y + 
overlay_state->crtc_h ||
+   primary_state->crtc_y + primary_state->crtc_h < 
overlay_state->crtc_y) {
DRM_DEBUG_ATOMIC("Overlay plane is enabled with hardware cursor 
but does not fully cover primary plane\n");
return -EINVAL;
}
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH libdrm] Revert "tests/amdgpu: fix bo eviction test issue"

2021-05-07 Thread Deucher, Alexander

[AMD Official Use Only - Internal Distribution Only]

For libdrm tests, please open a gitlab merge request:
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests

Alex


From: amd-gfx  on behalf of Yu, Lang 

Sent: Friday, May 7, 2021 3:10 AM
To: Chen, Guchun ; amd-gfx@lists.freedesktop.org 
; Huang, Ray ; Song, Asher 

Subject: RE: [PATCH libdrm] Revert "tests/amdgpu: fix bo eviction test issue"

[AMD Official Use Only - Internal Distribution Only]


Reviewed-by:  Lang Yu 

Regards，
Lang

-Original Message-
From: Chen, Guchun 
Sent: Thursday, May 6, 2021 5:55 PM
To: amd-gfx@lists.freedesktop.org; Yu, Lang ; Huang, Ray 
; Song, Asher 
Cc: Chen, Guchun 
Subject: [PATCH libdrm] Revert "tests/amdgpu: fix bo eviction test issue"

This reverts commit a5a400c9581c3b91598623603067556b18084c5d.

bo evict test was disabled by default per below commit. So still keep it as 
disabled.

1f6a85cc test/amdgpu: disable bo eviction test by default

Signed-off-by: Guchun Chen 
---
 tests/amdgpu/amdgpu_test.c |  3 +++
 tests/amdgpu/basic_tests.c | 13 -
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/tests/amdgpu/amdgpu_test.c b/tests/amdgpu/amdgpu_test.c index 
60f3a508..77bbfbcc 100644
--- a/tests/amdgpu/amdgpu_test.c
+++ b/tests/amdgpu/amdgpu_test.c
@@ -496,6 +496,9 @@ static void amdgpu_disable_suites()
 "gfx ring slow bad draw test (set 
amdgpu.lockup_timeout=50)", CU_FALSE))
 fprintf(stderr, "test deactivation failed - %s\n", 
CU_get_error_msg());

+   if (amdgpu_set_test_active(BASIC_TESTS_STR, "bo eviction Test", 
CU_FALSE))
+   fprintf(stderr, "test deactivation failed - %s\n",
+CU_get_error_msg());
+
 /* This test was ran on GFX8 and GFX9 only */
 if (family_id < AMDGPU_FAMILY_VI || family_id > AMDGPU_FAMILY_RV)
 if (amdgpu_set_test_active(BASIC_TESTS_STR, "Sync dependency 
Test", CU_FALSE)) diff --git a/tests/amdgpu/basic_tests.c 
b/tests/amdgpu/basic_tests.c index 8e7c4916..3a4214f5 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -928,15 +928,6 @@ static void amdgpu_bo_eviction_test(void)
0, &vram_info);
 CU_ASSERT_EQUAL(r, 0);

-   r = amdgpu_query_heap_info(device_handle, AMDGPU_GEM_DOMAIN_GTT,
-  0, >t_info);
-   CU_ASSERT_EQUAL(r, 0);
-
-   if (vram_info.max_allocation > gtt_info.heap_size/3) {
-   vram_info.max_allocation = gtt_info.heap_size/3;
-   gtt_info.max_allocation = vram_info.max_allocation;
-   }
-
 r = amdgpu_bo_alloc_wrap(device_handle, vram_info.max_allocation, 4096,
  AMDGPU_GEM_DOMAIN_VRAM, 0, &vram_max[0]);
 CU_ASSERT_EQUAL(r, 0);
@@ -944,6 +935,10 @@ static void amdgpu_bo_eviction_test(void)
  AMDGPU_GEM_DOMAIN_VRAM, 0, &vram_max[1]);
 CU_ASSERT_EQUAL(r, 0);

+   r = amdgpu_query_heap_info(device_handle, AMDGPU_GEM_DOMAIN_GTT,
+  0, >t_info);
+   CU_ASSERT_EQUAL(r, 0);
+
 r = amdgpu_bo_alloc_wrap(device_handle, gtt_info.max_allocation, 4096,
  AMDGPU_GEM_DOMAIN_GTT, 0, >t_max[0]);
 CU_ASSERT_EQUAL(r, 0);
--
2.17.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=04%7C01%7Calexander.deucher%40amd.com%7Cb3ce363db6e94aa1396308d9112727fc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637559682163619573%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=d9uch8frkIAiVkdtaOHillKHngoVp8brn%2FxJuKatmYU%3D&reserved=0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: Quit RAS initialization earlier if RAS is disabled

2021-05-07 Thread Zhang, Hawking

[AMD Public Use]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: Zeng, Oak  
Sent: Friday, May 7, 2021 09:15
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Lazar, Lijo ; 
Clements, John ; Joshi, Mukul ; 
Zeng, Oak 
Subject: [PATCH] drm/amdgpu: Quit RAS initialization earlier if RAS is disabled

If RAS is disabled through amdgpu_ras_enable kernel parameter, we should quit 
the RAS initialization eariler to avoid initialization of some RAS data 
structure such as sysfs etc.

Signed-off-by: Oak Zeng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index ebbe2c5..7e65b35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -2155,7 +2155,7 @@ int amdgpu_ras_init(struct amdgpu_device *adev)
 
amdgpu_ras_check_supported(adev, &con->hw_supported,
&con->supported);
-   if (!con->hw_supported || (adev->asic_type == CHIP_VEGA10)) {
+   if (!adev->ras_features || (adev->asic_type == CHIP_VEGA10)) {
/* set gfx block ras context feature for VEGA20 Gaming
 * send ras disable cmd to ras ta during ras late init.
 */
--
2.7.4
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [RFC] CRIU support for ROCm

On Thu, May 06, 2021 at 12:10:15PM -0400, Felix Kuehling wrote:
> Am 2021-05-04 um 9:00 a.m. schrieb Daniel Vetter:
> > On Fri, Apr 30, 2021 at 09:57:45PM -0400, Felix Kuehling wrote:
> >> We have been working on a prototype supporting CRIU (Checkpoint/Restore
> >> In Userspace) for accelerated compute applications running on AMD GPUs
> >> using ROCm (Radeon Open Compute Platform). We're happy to finally share
> >> this work publicly to solicit feedback and advice. The end-goal is to
> >> get this work included upstream in Linux and CRIU. A short whitepaper
> >> describing our design and intention can be found on Github:
> >> https://github.com/RadeonOpenCompute/criu/tree/criu-dev/test/others/ext-kfd/README.md
> >>
> >> We have RFC patch series for the kernel (based on Alex Deucher's
> >> amd-staging-drm-next branch) and for CRIU including a new plugin and a
> >> few core CRIU changes. I will send those to the respective mailing lists
> >> separately in a minute. They can also be found on Github.
> >>
> >> CRIU+plugin: https://github.com/RadeonOpenCompute/criu/commits/criu-dev
> >> Kernel (KFD):
> >> 
> >> https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/commits/fxkamd/criu-wip
> >>
> >> At this point this is very much a work in progress and not ready for
> >> upstream inclusion. There are still several missing features, known
> >> issues, and open questions that we would like to start addressing with
> >> your feedback.
> > Since the thread is a bit split I'm dumping the big thoughts here on this
> > RFC.
> >
> > We've discussed this in the past, but I'm once more (insert meme here)
> > asking whether continuing to walk down the amdgpu vs amdkfd split is
> > really the right choice. It starts to feel a bit much like sunk cost
> > fallacy ...
> 
> Hi Daniel,
> 
> Thanks for the feedback. I have some comments to your specific points
> below. This is my own opinion at this point and may not reflect AMDs
> position. I'm starting some internal discussions about unifying the KFD
> and graphics APIs in the long run. But IMO this is going to take years
> and won't be supported on our current compute GPUs, including Aldebaran
> which isn't even released yet.

Well yeah that's why I'm bringing this up early (and I think we're at
round 2 or 3 on this discussion by now). I know how many years this takes
to roll out.

> > - From the big thread we're having right now on dri-devel it's clear that
> >   3d will also move towards more and more a userspace submit model.
> 
> I'll need to start following dri-devel more closely and take a more
> active role in those discussions. If there is an opportunity for a
> unified memory management and command submission model for graphics and
> compute on future hardware, I want to be sure that our compute
> requirements are understood early on.
> 
> 
> >  But
> >   due to backwards compat issues it will be a mixed model, and in some
> >   cases we need to pick at runtime which model we're picking. A hard split
> >   between the amdgpu and the amdkfd world gets in the way here.
> 
> Backwards compatibility will force us to maintain KFD at least for GFXv9
> and older AMD GPUs. As I understand it, the new user mode command
> submission model will only be viable on GFXv10 or even newer GPUs. GFXv9
> is our architecture for current compute GPUs (including Aldebaran which
> is still in development), so this is going to be important for us for
> years to come.

Yeah this is definitely a 5 year plus plan.

> > - There's use-cases for doing compute in vulkan (that was a discussion
> >   from Feb that I kicked again in private, since I think still
> >   unresolved). So you need a vulkan stack that runs on both amdgpu and
> >   amdvlk.
> 
> By amdvlk, do you mean AMD's "pro" driver that supports our non-RADV
> Vulcan driver?

Yeah that's what I heard at least from Alex (If I recall correctly).
People seem to be using vulkan for cloud compute. This came up when we
talked about enforcing a default timeout for compute engines on the amdgpu
side (kernel can hang if you dont), and I think John Bridgman mentioned
the proper fix was to move at least opencl over to use amdkfd by default.


> > - Maybe not yet on amd's radar, but there's a lot of cloud computing. And
> >   maybe they also want CRIU for migrating their containers around. So that
> >   means CRIU for amdgpu too, not just amdkf.
> 
> Our strategy for compute support (OpenCL, HIP, OpenMP and mayby future
> C++ standards) is to use ROCm on all current and future GPUs starting
> with GFXv9. As long as ROCm is based on KFD, that means our compute
> stack runs on KFD. We don't see KFD as something mutually exclusive with
> graphics. They are different ioctl APIs, but they're sharing the
> hardware and memory and are meant to work together.
> 
> We are already planning to add CRIU support to our render node API
> because video acceleration on our compute stack still relies on Mesa and
> the render node CS API.
> 
> I admit, t

Re: [PATCH v5 20/27] drm: Scope all DRM IOCTLs with drm_dev_enter/exit

On Thu, May 06, 2021 at 12:25:06PM -0400, Andrey Grodzovsky wrote:
> 
> 
> On 2021-05-06 5:40 a.m., Daniel Vetter wrote:
> > On Fri, Apr 30, 2021 at 01:27:37PM -0400, Andrey Grodzovsky wrote:
> > > 
> > > 
> > > On 2021-04-30 6:25 a.m., Daniel Vetter wrote:
> > > > On Thu, Apr 29, 2021 at 04:34:55PM -0400, Andrey Grodzovsky wrote:
> > > > > 
> > > > > 
> > > > > On 2021-04-29 3:05 p.m., Daniel Vetter wrote:
> > > > > > On Thu, Apr 29, 2021 at 12:04:33PM -0400, Andrey Grodzovsky wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 2021-04-29 7:32 a.m., Daniel Vetter wrote:
> > > > > > > > On Thu, Apr 29, 2021 at 01:23:19PM +0200, Daniel Vetter wrote:
> > > > > > > > > On Wed, Apr 28, 2021 at 11:12:00AM -0400, Andrey Grodzovsky 
> > > > > > > > > wrote:
> > > > > > > > > > With this calling drm_dev_unplug will flush and block
> > > > > > > > > > all in flight IOCTLs
> > > > > > > > > > 
> > > > > > > > > > Also, add feature such that if device supports graceful 
> > > > > > > > > > unplug
> > > > > > > > > > we enclose entire IOCTL in SRCU critical section.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Andrey Grodzovsky 
> > > > > > > > > 
> > > > > > > > > Nope.
> > > > > > > > > 
> > > > > > > > > The idea of drm_dev_enter/exit is to mark up hw access. Not 
> > > > > > > > > entire ioctl.
> > > > > > > 
> > > > > > > Then I am confused why we have 
> > > > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.12%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fdrm_ioctl.c%23L826&data=04%7C01%7Candrey.grodzovsky%40amd.com%7Ca0ca5bdab20a4533491c08d91072fe2a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637558908355926609%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=SESZFWQEcQUHGGek8d1cNi9Iwo9XOmXqxg9MieRkxNU%3D&reserved=0
> > > > > > > currently in code ?
> > > > > > 
> > > > > > I forgot about this one, again. Thanks for reminding.
> > > > > > 
> > > > > > > > > Especially not with an opt-in flag so that it could be 
> > > > > > > > > shrugged of as a
> > > > > > > > > driver hack. Most of these ioctls should have absolutely no 
> > > > > > > > > problem
> > > > > > > > > working after hotunplug.
> > > > > > > > > 
> > > > > > > > > Also, doing this defeats the point since it pretty much 
> > > > > > > > > guarantees
> > > > > > > > > userspace will die in assert()s and stuff. E.g. on i915 the 
> > > > > > > > > rough contract
> > > > > > > > > is that only execbuf (and even that only when userspace has 
> > > > > > > > > indicated
> > > > > > > > > support for non-recoverable hw ctx) is allowed to fail. 
> > > > > > > > > Anything else
> > > > > > > > > might crash userspace.
> > > > > > > 
> > > > > > > Given that as I pointed above we already fail any IOCTls with 
> > > > > > > -ENODEV
> > > > > > > when device is unplugged, it seems those crashes don't happen that
> > > > > > > often ? Also, in all my testing I don't think I saw a user space 
> > > > > > > crash
> > > > > > > I could attribute to this.
> > > > > > 
> > > > > > I guess it should be ok.
> > > > > 
> > > > > What should be ok ?
> > > > 
> > > > Your approach, but not your patch. If we go with this let's just lift it
> > > > to drm_ioctl() as the default behavior. No driver opt-in flag, because
> > > > that's definitely worse than any other approach because we really need 
> > > > to
> > > > get rid of driver specific behaviour for generic ioctls, especially
> > > > anything a compositor will use directly.
> > > > 
> > > > > > My reasons for making this work is both less trouble for userspace 
> > > > > > (did
> > > > > > you test with various wayland compositors out there, not just 
> > > > > > amdgpu x86
> > > > > 
> > > > > I didn't - will give it a try.
> > > 
> > > Weston worked without crashes, run the egl tester cube there.
> > > 
> > > > > 
> > > > > > driver?), but also testing.
> > > > > > 
> > > > > > We still need a bunch of these checks in various places or you'll 
> > > > > > wait a
> > > > > > very long time for a pending modeset or similar to complete. Being 
> > > > > > able to
> > > > > > run that code easily after hotunplug has completed should help a 
> > > > > > lot with
> > > > > > testing.
> > > > > > 
> > > > > > Plus various drivers already acquired drm_dev_enter/exit and now I 
> > > > > > wonder
> > > > > > whether that was properly tested or not ...
> > > > > > 
> > > > > > I guess maybe we need a drm module option to disable this check, so 
> > > > > > that
> > > > > > we can exercise the code as if the ioctl has raced with hotunplug 
> > > > > > at the
> > > > > > worst possible moment.
> > > > > > 
> > > > > > Also atomic is really tricky here: I assume your testing has just 
> > > > > > done
> > > > > > normal synchronous commits, but anything that goes through atomic 
> > > > > > can be
> > > > > > done nonblocking in a separate thread. Which the ioctl catch-all 
> > > > > > here wont
> > > > > > capture.
>

Re: [RFC] Add BPF_PROG_TYPE_CGROUP_IOCTL