Re: [PATCH v3 7/9] drm/msm/dpu: drop _dpu_crtc_check_and_setup_lm_bounds from atomic_begin

2024-06-18 Thread Dmitry Baryshkov
On Wed, 19 Jun 2024 at 01:56, Abhinav Kumar  wrote:
> On 6/13/2024 4:20 PM, Abhinav Kumar wrote:
> > On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:
> >> The dpu_crtc_atomic_check() already calls the function
> >> _dpu_crtc_check_and_setup_lm_bounds().  There is no need to call it
> >> again from dpu_crtc_atomic_begin().
> >>
> >> Signed-off-by: Dmitry Baryshkov 
> >> ---
> >>   drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 --
> >>   1 file changed, 2 deletions(-)
> >>
> >
> > Reviewed-by: Abhinav Kumar 
>
>
> This change is causing a small regression on sc7280 chromebook.
>
> I have tested and concluded that this is causing the chrome boot
> animation to disappear.
>
> I have tested a couple of times and without this change it works fine.
>
> If this change was meant as an optimization, can we drop this one and
> investigate later why this is causing one? I have not spent time
> investigating why it happened. Rest of the series works well and I dont
> see any dependency as such. Let me know if that works for you. Otherwise
> I will have to spend a little more time on this patch and why chrome
> compositor does not like this for the animation screen.

Oh, my. Thank you for the test!
I think I know what's happening. The cstate->num_mixers gets set only
in dpu_encoder_virt_atomic_mode_set(). So during
dpu_crtc_atomic_check() we don't have cstate->num_mixers is stale (and
if it is 0, the check is skipped).

I guess I'll have to move cstate->mixers[] and cstate->num_mixers
assignment to the dpu_encoder_virt_atomic_check(). And maybe we should
start thinking about my old idea of moving resource allocation to the
CRTC code.

-- 
With best wishes
Dmitry


Re: [PATCH v2 08/14] drm/msm/hdmi: add runtime PM calls to DDC transfer function

2024-06-18 Thread Jessica Zhang




On 5/22/2024 3:51 AM, Dmitry Baryshkov wrote:

We must be sure that the HDMI controller is powered on, while performing
the DDC transfer. Add corresponding runtime PM calls to
msm_hdmi_i2c_xfer().

Signed-off-by: Dmitry Baryshkov 


Reviewed-by: Jessica Zhang 


---
  drivers/gpu/drm/msm/hdmi/hdmi_i2c.c | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c
index 7aa500d24240..ebefea4fb408 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c
@@ -107,11 +107,15 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c,
if (num == 0)
return num;
  
+	ret = pm_runtime_resume_and_get(&hdmi->pdev->dev);

+   if (ret)
+   return ret;
+
init_ddc(hdmi_i2c);
  
  	ret = ddc_clear_irq(hdmi_i2c);

if (ret)
-   return ret;
+   goto fail;
  
  	for (i = 0; i < num; i++) {

struct i2c_msg *p = &msgs[i];
@@ -169,7 +173,7 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c,
hdmi_read(hdmi, REG_HDMI_DDC_SW_STATUS),
hdmi_read(hdmi, REG_HDMI_DDC_HW_STATUS),
hdmi_read(hdmi, REG_HDMI_DDC_INT_CTRL));
-   return ret;
+   goto fail;
}
  
  	ddc_status = hdmi_read(hdmi, REG_HDMI_DDC_SW_STATUS);

@@ -202,7 +206,13 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c,
}
}
  
+	pm_runtime_put(&hdmi->pdev->dev);

+
return i;
+
+fail:
+   pm_runtime_put(&hdmi->pdev->dev);
+   return ret;
  }
  
  static u32 msm_hdmi_i2c_func(struct i2c_adapter *adapter)


--
2.39.2



Re: [PATCH v2 07/14] drm/msm/hdmi: switch to pm_runtime_resume_and_get()

2024-06-18 Thread Jessica Zhang




On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote:

The pm_runtime_get_sync() function is a bad choise for runtime power


[nit: s/choise/choice/]

Reviewed-by: Jessica Zhang 


management. Switch HDMI driver to pm_runtime_resume_and_get() and add
proper error handling, while we are at it.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c |  2 +-
  drivers/gpu/drm/msm/hdmi/hdmi_hpd.c| 12 ++--
  drivers/gpu/drm/msm/hdmi/hdmi_phy.c|  6 +-
  3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index fb99328107dd..d1b35328b6e8 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -19,7 +19,7 @@ static void msm_hdmi_power_on(struct drm_bridge *bridge)
const struct hdmi_platform_config *config = hdmi->config;
int ret;
  
-	pm_runtime_get_sync(&hdmi->pdev->dev);

+   pm_runtime_resume_and_get(&hdmi->pdev->dev);
  
  	ret = regulator_bulk_enable(config->pwr_reg_cnt, hdmi->pwr_regs);

if (ret)
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
index 36266aa626dc..fc21ad3b01dc 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
@@ -85,7 +85,12 @@ int msm_hdmi_hpd_enable(struct drm_bridge *bridge)
if (hdmi->hpd_gpiod)
gpiod_set_value_cansleep(hdmi->hpd_gpiod, 1);
  
-	pm_runtime_get_sync(dev);

+   ret = pm_runtime_resume_and_get(dev);
+   if (ret) {
+   DRM_DEV_ERROR(dev, "runtime resume failed: %d\n", ret);
+   goto fail;
+   }
+
ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks);
if (ret)
goto fail;
@@ -178,7 +183,10 @@ static enum drm_connector_status detect_reg(struct hdmi 
*hdmi)
uint32_t hpd_int_status = 0;
int ret;
  
-	pm_runtime_get_sync(&hdmi->pdev->dev);

+   ret = pm_runtime_resume_and_get(&hdmi->pdev->dev);
+   if (ret)
+   goto out;
+
ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks);
if (ret)
goto out;
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_phy.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_phy.c
index 88a3423b7f24..d5acae752300 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_phy.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_phy.c
@@ -58,7 +58,11 @@ int msm_hdmi_phy_resource_enable(struct hdmi_phy *phy)
struct device *dev = &phy->pdev->dev;
int i, ret = 0;
  
-	pm_runtime_get_sync(dev);

+   ret = pm_runtime_resume_and_get(dev);
+   if (ret) {
+   DRM_DEV_ERROR(dev, "runtime resume failed: %d\n", ret);
+   return ret;
+   }
  
  	ret = regulator_bulk_enable(cfg->num_regs, phy->regs);

if (ret) {

--
2.39.2



Re: [PATCH v2 06/14] drm/msm/hdmi: switch to clk_bulk API

2024-06-18 Thread Jessica Zhang




On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote:

The last platform using legacy clock names for HDMI block (APQ8064)
switched to new clock names in 5.16. It's time to stop caring about old
DT, drop hand-coded helpers and switch to clk_bulk_* API.

Signed-off-by: Dmitry Baryshkov 


Reviewed-by: Jessica Zhang 


---
  drivers/gpu/drm/msm/hdmi/hdmi.c | 15 +-
  drivers/gpu/drm/msm/hdmi/hdmi.h |  2 +-
  drivers/gpu/drm/msm/hdmi/hdmi_hpd.c | 39 +
  3 files changed, 19 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index c14e009f38b1..7ec4ca3b7597 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -469,17 +469,12 @@ static int msm_hdmi_dev_probe(struct platform_device 
*pdev)
if (!hdmi->hpd_clks)
return -ENOMEM;
  
-	for (i = 0; i < config->hpd_clk_cnt; i++) {

-   struct clk *clk;
+   for (i = 0; i < config->hpd_clk_cnt; i++)
+   hdmi->hpd_clks[i].id = config->hpd_clk_names[i];
  
-		clk = msm_clk_get(pdev, config->hpd_clk_names[i]);

-   if (IS_ERR(clk))
-   return dev_err_probe(dev, PTR_ERR(clk),
-"failed to get hpd clk: %s\n",
-config->hpd_clk_names[i]);
-
-   hdmi->hpd_clks[i] = clk;
-   }
+   ret = devm_clk_bulk_get(&pdev->dev, config->hpd_clk_cnt, 
hdmi->hpd_clks);
+   if (ret)
+   return ret;
  
  	hdmi->extp_clk = devm_clk_get_optional(&pdev->dev, "extp");

if (IS_ERR(hdmi->extp_clk))
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h
index c0d60ed23b75..eeba85ffef09 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.h
@@ -50,7 +50,7 @@ struct hdmi {
  
  	struct regulator_bulk_data *hpd_regs;

struct regulator_bulk_data *pwr_regs;
-   struct clk **hpd_clks;
+   struct clk_bulk_data *hpd_clks;
struct clk *extp_clk;
  
  	struct gpio_desc *hpd_gpiod;

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
index 7ae69b14e953..36266aa626dc 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
@@ -60,27 +60,6 @@ static void msm_hdmi_phy_reset(struct hdmi *hdmi)
}
  }
  
-static void enable_hpd_clocks(struct hdmi *hdmi, bool enable)

-{
-   const struct hdmi_platform_config *config = hdmi->config;
-   struct device *dev = &hdmi->pdev->dev;
-   int i, ret;
-
-   if (enable) {
-   for (i = 0; i < config->hpd_clk_cnt; i++) {
-   ret = clk_prepare_enable(hdmi->hpd_clks[i]);
-   if (ret) {
-   DRM_DEV_ERROR(dev,
-   "failed to enable hpd clk: %s (%d)\n",
-   config->hpd_clk_names[i], ret);
-   }
-   }
-   } else {
-   for (i = config->hpd_clk_cnt - 1; i >= 0; i--)
-   clk_disable_unprepare(hdmi->hpd_clks[i]);
-   }
-}
-
  int msm_hdmi_hpd_enable(struct drm_bridge *bridge)
  {
struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge);
@@ -107,7 +86,9 @@ int msm_hdmi_hpd_enable(struct drm_bridge *bridge)
gpiod_set_value_cansleep(hdmi->hpd_gpiod, 1);
  
  	pm_runtime_get_sync(dev);

-   enable_hpd_clocks(hdmi, true);
+   ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks);
+   if (ret)
+   goto fail;
  
  	msm_hdmi_set_mode(hdmi, false);

msm_hdmi_phy_reset(hdmi);
@@ -149,7 +130,7 @@ void msm_hdmi_hpd_disable(struct hdmi *hdmi)
  
  	msm_hdmi_set_mode(hdmi, false);
  
-	enable_hpd_clocks(hdmi, false);

+   clk_bulk_disable_unprepare(config->hpd_clk_cnt, hdmi->hpd_clks);
pm_runtime_put(dev);
  
  	ret = pinctrl_pm_select_sleep_state(dev);

@@ -193,14 +174,20 @@ void msm_hdmi_hpd_irq(struct drm_bridge *bridge)
  
  static enum drm_connector_status detect_reg(struct hdmi *hdmi)

  {
-   uint32_t hpd_int_status;
+   const struct hdmi_platform_config *config = hdmi->config;
+   uint32_t hpd_int_status = 0;
+   int ret;
  
  	pm_runtime_get_sync(&hdmi->pdev->dev);

-   enable_hpd_clocks(hdmi, true);
+   ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks);
+   if (ret)
+   goto out;
  
  	hpd_int_status = hdmi_read(hdmi, REG_HDMI_HPD_INT_STATUS);
  
-	enable_hpd_clocks(hdmi, false);

+   clk_bulk_disable_unprepare(config->hpd_clk_cnt, hdmi->hpd_clks);
+
+out:
pm_runtime_put(&hdmi->pdev->dev);
  
  	return (hpd_int_status & HDMI_HPD_INT_STATUS_CABLE_DETECTED) ?


--
2.39.2



Re: [PATCH v2 05/14] drm/msm/hdmi: drop clock frequency assignment

2024-06-18 Thread Jessica Zhang




On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote:

The only clock which has frequency being set through hpd_freqs is the
"core" aka MDSS_HDMI_CLK clock. It always has the specified frequency,
so we can drop corresponding clk_set_rate() call together with the
hpd_freq infrastructure.

Signed-off-by: Dmitry Baryshkov 


Reviewed-by: Jessica Zhang 


---
  drivers/gpu/drm/msm/hdmi/hdmi.c | 2 --
  drivers/gpu/drm/msm/hdmi/hdmi.h | 1 -
  drivers/gpu/drm/msm/hdmi/hdmi_hpd.c | 9 -
  3 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index 681265e29aa0..c14e009f38b1 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -236,12 +236,10 @@ static const struct hdmi_platform_config 
hdmi_tx_8960_config = {
  
  static const char *pwr_reg_names_8x74[] = {"core-vdda", "core-vcc"};

  static const char *hpd_clk_names_8x74[] = {"iface", "core", "mdp_core", 
"alt_iface"};
-static unsigned long hpd_clk_freq_8x74[] = {0, 1920, 0, 0};
  
  static const struct hdmi_platform_config hdmi_tx_8974_config = {

HDMI_CFG(pwr_reg, 8x74),
HDMI_CFG(hpd_clk, 8x74),
-   .hpd_freq  = hpd_clk_freq_8x74,
  };
  
  /*

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h
index abdbe4779cf9..c0d60ed23b75 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.h
@@ -96,7 +96,6 @@ struct hdmi_platform_config {
  
  	/* clks that need to be on for hpd: */

const char **hpd_clk_names;
-   const long unsigned *hpd_freq;
int hpd_clk_cnt;
  };
  
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c

index 9ce0ffa35417..7ae69b14e953 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
@@ -68,15 +68,6 @@ static void enable_hpd_clocks(struct hdmi *hdmi, bool enable)
  
  	if (enable) {

for (i = 0; i < config->hpd_clk_cnt; i++) {
-   if (config->hpd_freq && config->hpd_freq[i]) {
-   ret = clk_set_rate(hdmi->hpd_clks[i],
-  config->hpd_freq[i]);
-   if (ret)
-   dev_warn(dev,
-"failed to set clk %s (%d)\n",
-config->hpd_clk_names[i], ret);
-   }
-
ret = clk_prepare_enable(hdmi->hpd_clks[i]);
if (ret) {
DRM_DEV_ERROR(dev,

--
2.39.2



Re: [PATCH v3 9/9] drm/msm/dpu: sync mode_config limits to the FB limits in dpu_plane.c

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 4:16 PM, Abhinav Kumar wrote:



On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

Lift mode_config limits set by the DPU driver to the actual FB limits as
handled by the dpu_plane.c. Move 2*max_lm_width check where it belongs,
to the drm_crtc_helper_funcs::mode_valid() callback.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 15 +++
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c  |  9 ++---
  2 files changed, 17 insertions(+), 7 deletions(-)



Did anything change in this patch from v2 that the R-b was dropped?

Here it is again,

Reviewed-by: Abhinav Kumar 



Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 8/9] drm/msm/dpu: merge MAX_IMG_WIDTH/HEIGHT with DPU_MAX_IMG_WIDTH/HEIGHT

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

dpu_formats.c defines DPU_MAX_IMG_WIDTH and _HEIGHT, while
dpu_hw_catalog.h defines just MAX_IMG_WIDTH and _HEIGHT. Merge these
constants to remove duplication.

Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 3 ---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 ++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c  | 4 ++--
  3 files changed, 4 insertions(+), 7 deletions(-)



Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 7/9] drm/msm/dpu: drop _dpu_crtc_check_and_setup_lm_bounds from atomic_begin

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 4:20 PM, Abhinav Kumar wrote:



On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

The dpu_crtc_atomic_check() already calls the function
_dpu_crtc_check_and_setup_lm_bounds().  There is no need to call it
again from dpu_crtc_atomic_begin().

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 --
  1 file changed, 2 deletions(-)



Reviewed-by: Abhinav Kumar 



This change is causing a small regression on sc7280 chromebook.

I have tested and concluded that this is causing the chrome boot 
animation to disappear.


I have tested a couple of times and without this change it works fine.

If this change was meant as an optimization, can we drop this one and 
investigate later why this is causing one? I have not spent time 
investigating why it happened. Rest of the series works well and I dont 
see any dependency as such. Let me know if that works for you. Otherwise 
I will have to spend a little more time on this patch and why chrome 
compositor does not like this for the animation screen.


Re: [PATCH v3 6/9] drm/msm/dpu: check for the plane pitch overflow

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

Check that the plane pitch doesn't overflow the maximum pitch size
allowed by the hardware.

Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h | 2 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c   | 6 +-
  2 files changed, 7 insertions(+), 1 deletion(-)



Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 5/9] drm/msm/dpu: move dpu_format_populate_plane_sizes to atomic_check

2024-06-18 Thread Abhinav Kumar




On 6/14/2024 3:34 AM, Dmitry Baryshkov wrote:

On Thu, Jun 13, 2024 at 04:19:07PM GMT, Abhinav Kumar wrote:



On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

Move a call to dpu_format_populate_plane_sizes() to the atomic_check
step, so that any issues with the FB layout can be reported as early as
possible.

Signed-off-by: Dmitry Baryshkov 
---
   drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 12 ++--
   1 file changed, 6 insertions(+), 6 deletions(-)



Did anything change between v2 to v3 that R-b was dropped?


No, it was my failure to run b4 trailers, please excuse me.



no problem.

Tested-by: Abhinav Kumar  # sc7280



Here it is again,

Reviewed-by: Abhinav Kumar 




Re: [PATCH v3 4/9] drm/msm/dpu: split dpu_format_populate_layout

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

Split dpu_format_populate_layout() into addess-related and
pitch/format-related parts.

Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c|  8 +++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 45 --
  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h|  8 +++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c  | 12 --
  4 files changed, 46 insertions(+), 27 deletions(-)



as promised,

Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 3/9] drm/msm/dpu: drop dpu_format_populate_layout from dpu_plane_sspp_atomic_update

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

The dpu_plane_prepare_fb() already calls dpu_format_populate_layout().
Store the generated layout in the plane state and drop this call from
dpu_plane_sspp_update().

Reviewed-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 19 ---
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h |  3 +++
  2 files changed, 7 insertions(+), 15 deletions(-)



Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 2/9] drm/msm/dpu: drop dpu_format_check_modified_format

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 4:14 PM, Abhinav Kumar wrote:



On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

The msm_kms_funcs::check_modified_format() callback is not used by the
driver. Drop it completely.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 43 
-

  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 16 ---
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c |  1 -
  drivers/gpu/drm/msm/msm_kms.h   |  6 
  4 files changed, 66 deletions(-)




Reviewed-by: Abhinav Kumar 


Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v3 1/9] drm/msm/dpu: check for overflow in _dpu_crtc_setup_lm_bounds()

2024-06-18 Thread Abhinav Kumar




On 6/13/2024 4:13 PM, Abhinav Kumar wrote:



On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote:

Make _dpu_crtc_setup_lm_bounds() check that CRTC width is not
overflowing LM requirements. Rename the function accordingly.

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)



Reviewed-by: Abhinav Kumar 


As promised,


Tested-by: Abhinav Kumar  # sc7280


Re: [PATCH v4 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog

2024-06-18 Thread Konrad Dybcio




On 6/18/24 18:42, Rob Clark wrote:

From: Rob Clark 

Move the CP_PROTECT settings into the hw catalog.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
---


[...]


+static inline void __build_asserts(void)
+{
+   BUILD_BUG_ON(a630_protect.count > a630_protect.count_max);
+   BUILD_BUG_ON(a650_protect.count > a650_protect.count_max);
+   BUILD_BUG_ON(a660_protect.count > a660_protect.count_max);
+   BUILD_BUG_ON(a690_protect.count > a690_protect.count_max);
+   BUILD_BUG_ON(a730_protect.count > a730_protect.count_max);
+}
+


patch:394: new blank line at EOF

other than that:

Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info

2024-06-18 Thread Dmitry Baryshkov
On Tue, Jun 18, 2024 at 09:33:48AM GMT, Rob Clark wrote:
> On Tue, Jun 18, 2024 at 1:30 AM Dmitry Baryshkov
>  wrote:
> >
> > On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote:
> > > From: Rob Clark 
> > >
> > > Introduce a6xx_info where we can stash gen specific stuff without
> > > polluting the toplevel adreno_info struct.
> > >
> > > Signed-off-by: Rob Clark 
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +--
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  6 +--
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  9 
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  6 ++-
> > >  4 files changed, 67 insertions(+), 19 deletions(-)
> > >
> >
> > Reviewed-by: Dmitry Baryshkov 
> >
> >
> > > @@ -98,7 +100,9 @@ struct adreno_info {
> > >   struct msm_gpu *(*init)(struct drm_device *dev);
> > >   const char *zapfw;
> > >   u32 inactive_period;
> > > - const struct adreno_reglist *hwcg;
> > > + union {
> > > + const struct a6xx_info *a6xx;
> > > + };
> > >   u64 address_space_size;
> > >   /**
> > >* @speedbins: Optional table of fuse to speedbin mappings
> >
> > My preference would be towards wrapping the adreno_gpu, but that would
> > require more significant rework of the driver. Let's see if we can get
> > to that later.
> >
> 
> yeah, it was going to be more re-work, and I'm neck deep in
> gpuvm/vm_bind.. I just wanted to land this since it is a pita (and
> error prone) to rebase as more gpu's get added ;-)

Yes, I'm fine with that. My note was more like a 'later todo' item.

> 
> It isn't entirely unlike how we handle gpu gen specific options in
> mesa, where we have a somewhat bigger set of options, so I wouldn't
> say that this approach was worse than extending adreno_info.. just
> different..


-- 
With best wishes
Dmitry


[PATCH v4 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog

2024-06-18 Thread Rob Clark
From: Rob Clark 

Move the CP_PROTECT settings into the hw catalog.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 248 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 257 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   2 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  13 ++
 4 files changed, 269 insertions(+), 251 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index b81bcae59ac3..329b88b24b80 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -454,6 +454,173 @@ static const struct adreno_reglist a690_hwcg[] = {
{}
 };
 
+/* For a615, a616, a618, a619, a630, a640 and a680 */
+static const u32 a630_protect_regs[] = {
+   A6XX_PROTECT_RDONLY(0x0, 0x04ff),
+   A6XX_PROTECT_RDONLY(0x00501, 0x0005),
+   A6XX_PROTECT_RDONLY(0x0050b, 0x02f4),
+   A6XX_PROTECT_NORDWR(0x0050e, 0x),
+   A6XX_PROTECT_NORDWR(0x00510, 0x),
+   A6XX_PROTECT_NORDWR(0x00534, 0x),
+   A6XX_PROTECT_NORDWR(0x00800, 0x0082),
+   A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+   A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+   A6XX_PROTECT_RDONLY(0x008de, 0x00ae),
+   A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+   A6XX_PROTECT_NORDWR(0x0098d, 0x0272),
+   A6XX_PROTECT_NORDWR(0x00e00, 0x0001),
+   A6XX_PROTECT_NORDWR(0x00e03, 0x000c),
+   A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
+   A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
+   A6XX_PROTECT_NORDWR(0x08e00, 0x),
+   A6XX_PROTECT_NORDWR(0x08e08, 0x),
+   A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
+   A6XX_PROTECT_NORDWR(0x09624, 0x01db),
+   A6XX_PROTECT_NORDWR(0x09e70, 0x0001),
+   A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
+   A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
+   A6XX_PROTECT_NORDWR(0x0ae02, 0x),
+   A6XX_PROTECT_NORDWR(0x0ae50, 0x032f),
+   A6XX_PROTECT_NORDWR(0x0b604, 0x),
+   A6XX_PROTECT_NORDWR(0x0be02, 0x0001),
+   A6XX_PROTECT_NORDWR(0x0be20, 0x17df),
+   A6XX_PROTECT_NORDWR(0x0f000, 0x0bff),
+   A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x11c00, 0x), /* note: infinite range */
+};
+DECLARE_ADRENO_PROTECT(a630_protect, 32);
+
+/* These are for a620 and a650 */
+static const u32 a650_protect_regs[] = {
+   A6XX_PROTECT_RDONLY(0x0, 0x04ff),
+   A6XX_PROTECT_RDONLY(0x00501, 0x0005),
+   A6XX_PROTECT_RDONLY(0x0050b, 0x02f4),
+   A6XX_PROTECT_NORDWR(0x0050e, 0x),
+   A6XX_PROTECT_NORDWR(0x00510, 0x),
+   A6XX_PROTECT_NORDWR(0x00534, 0x),
+   A6XX_PROTECT_NORDWR(0x00800, 0x0082),
+   A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+   A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+   A6XX_PROTECT_RDONLY(0x008de, 0x00ae),
+   A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+   A6XX_PROTECT_NORDWR(0x0098d, 0x0272),
+   A6XX_PROTECT_NORDWR(0x00e00, 0x0001),
+   A6XX_PROTECT_NORDWR(0x00e03, 0x000c),
+   A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
+   A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
+   A6XX_PROTECT_NORDWR(0x08e00, 0x),
+   A6XX_PROTECT_NORDWR(0x08e08, 0x),
+   A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
+   A6XX_PROTECT_NORDWR(0x08e80, 0x027f),
+   A6XX_PROTECT_NORDWR(0x09624, 0x01db),
+   A6XX_PROTECT_NORDWR(0x09e60, 0x0011),
+   A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
+   A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
+   A6XX_PROTECT_NORDWR(0x0ae02, 0x),
+   A6XX_PROTECT_NORDWR(0x0ae50, 0x032f),
+   A6XX_PROTECT_NORDWR(0x0b604, 0x),
+   A6XX_PROTECT_NORDWR(0x0b608, 0x0007),
+   A6XX_PROTECT_NORDWR(0x0be02, 0x0001),
+   A6XX_PROTECT_NORDWR(0x0be20, 0x17df),
+   A6XX_PROTECT_NORDWR(0x0f000, 0x0bff),
+   A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x18400, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x1a800, 0x1fff),
+   A6XX_PROTECT_NORDWR(0x1f400, 0x0443),
+   A6XX_PROTECT_RDONLY(0x1f844, 0x007b),
+   A6XX_PROTECT_NORDWR(0x1f887, 0x001b),
+   A6XX_PROTECT_NORDWR(0x1f8c0, 0x), /* note: infinite range */
+};
+DECLARE_ADRENO_PROTECT(a650_protect, 48);
+
+/* These are for a635 and a660 */
+static const u32 a660_protect_regs[] = {
+   A6XX_PROTECT_RDONLY(0x0, 0x04ff),
+   A6XX_PROTECT_RDONLY(0x00501, 0x0005),
+   A6XX_PROTECT_RDONLY(0x0050b, 0x02f4),
+   A6XX_PROTECT_NORDWR(0x0050e, 0x),
+   A6XX_PROTECT_NORDWR(0x00510, 0x),
+   A6XX_PROTECT_NORDWR(0x00534, 0x),
+   A6XX_PROTECT_NORDWR(0x00800, 0x0082),
+   A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+   A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+   A6XX_PROTECT_RDONLY(0x008de, 0x00ae),
+   A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+   A6XX_PROTECT_NORDWR(0x0098d, 0x0272),
+   A6XX_PROTECT_

[PATCH v4 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info

2024-06-18 Thread Rob Clark
From: Rob Clark 

Introduce a6xx_info where we can stash gen specific stuff without
polluting the toplevel adreno_info struct.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  6 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  9 
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  6 ++-
 4 files changed, 67 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index bcc2f4d8cfc6..b81bcae59ac3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -7,6 +7,7 @@
  */
 
 #include "adreno_gpu.h"
+#include "a6xx_gpu.h"
 #include "a6xx.xml.h"
 #include "a6xx_gmu.xml.h"
 
@@ -465,7 +466,9 @@ static const struct adreno_info a6xx_gpus[] = {
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init = a6xx_gpu_init,
.zapfw = "a610_zap.mdt",
-   .hwcg = a612_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a612_hwcg,
+   },
/*
 * There are (at least) three SoCs implementing A610: SM6125
 * (trinket), SM6115 (bengal) and SM6225 (khaje). Trinket does
@@ -493,7 +496,9 @@ static const struct adreno_info a6xx_gpus[] = {
.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
.init = a6xx_gpu_init,
.zapfw = "a615_zap.mbn",
-   .hwcg = a615_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a615_hwcg,
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0,   0 },
{ 128, 1 },
@@ -513,6 +518,8 @@ static const struct adreno_info a6xx_gpus[] = {
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
.init = a6xx_gpu_init,
+   .a6xx = &(const struct a6xx_info) {
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0,   0 },
{ 169, 1 },
@@ -531,7 +538,9 @@ static const struct adreno_info a6xx_gpus[] = {
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init = a6xx_gpu_init,
.zapfw = "a615_zap.mdt",
-   .hwcg = a615_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a615_hwcg,
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0,   0 },
{ 138, 1 },
@@ -550,7 +559,9 @@ static const struct adreno_info a6xx_gpus[] = {
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init = a6xx_gpu_init,
.zapfw = "a615_zap.mdt",
-   .hwcg = a615_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a615_hwcg,
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0,   0 },
{ 190, 1 },
@@ -569,7 +580,9 @@ static const struct adreno_info a6xx_gpus[] = {
.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
.init = a6xx_gpu_init,
.zapfw = "a615_zap.mdt",
-   .hwcg = a615_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a615_hwcg,
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0,   0 },
{ 120, 4 },
@@ -593,7 +606,9 @@ static const struct adreno_info a6xx_gpus[] = {
.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
.init = a6xx_gpu_init,
.zapfw = "a630_zap.mdt",
-   .hwcg = a630_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a630_hwcg,
+   },
}, {
.chip_ids = ADRENO_CHIP_IDS(0x06040001),
.family = ADRENO_6XX_GEN2,
@@ -607,7 +622,9 @@ static const struct adreno_info a6xx_gpus[] = {
.quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT,
.init = a6xx_gpu_init,
.zapfw = "a640_zap.mdt",
-   .hwcg = a640_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a640_hwcg,
+   },
.speedbins = ADRENO_SPEEDBINS(
{ 0, 0 },
{ 1, 1 },
@@ -626,7 +643,9 @@ static const struct adreno_info a6xx_gpus[] = {
ADRENO_QUIRK_HAS_HW_APRIV,
.init = a6xx_gpu_init,
.zapfw = "a650_zap.mdt",
-   .hwcg = a650_hwcg,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a650_hwcg,
+   }

[PATCH v4 3/5] drm/msm/adreno: Move hwcg regs to a6xx hw catalog

2024-06-18 Thread Rob Clark
From: Rob Clark 

Move the hwcg tables into the hw catalog.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 619 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 617 -
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   3 -
 3 files changed, 619 insertions(+), 620 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 10a92eab0232..bcc2f4d8cfc6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -7,6 +7,451 @@
  */
 
 #include "adreno_gpu.h"
+#include "a6xx.xml.h"
+#include "a6xx_gmu.xml.h"
+
+static const struct adreno_reglist a612_hwcg[] = {
+   {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220},
+   {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0081},
+   {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0120},
+   {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220},
+   {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00},
+   {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011},
+   {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422},
+   {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0222},
+   {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000},
+   {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200},
+   {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
+   {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004},
+   {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002},
+   {REG_A6XX_RBBM_ISDB_CNT, 0x0182},
+   {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x},
+   {REG_A6XX_RBBM_SP_HYST_CNT, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222},
+   {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111},
+   {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555},
+   {},
+};
+
+/* For a615 family (a615, a616, a618 and a619) */
+static const struct adreno_reglist a615_hwcg[] = {
+   {REG_A6XX_RBBM_CLOCK_CNTL_SP0,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220},
+   {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0080},
+   {REG_A6XX_RBBM_CLOCK_HYST_SP0,  0xF3CF},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP0,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP1,  0x0222},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP1, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP0,  0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP1,  0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP1, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP1, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP1, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_CNTL_UCHE,  0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_UCHE, 0x},
+   {REG_A6XX_RBBM_CLOCK_C

[PATCH v4 2/5] drm/msm/adreno: Split catalog into separate files

2024-06-18 Thread Rob Clark
From: Rob Clark 

Split each gen's gpu table into it's own file.  Only code-motion, no
functional change.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/Makefile   |   5 +
 drivers/gpu/drm/msm/adreno/a2xx_catalog.c  |  52 ++
 drivers/gpu/drm/msm/adreno/a3xx_catalog.c  |  81 +++
 drivers/gpu/drm/msm/adreno/a4xx_catalog.c  |  50 ++
 drivers/gpu/drm/msm/adreno/a5xx_catalog.c  | 148 +
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c  | 338 +++
 drivers/gpu/drm/msm/adreno/adreno_device.c | 625 +
 7 files changed, 680 insertions(+), 619 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a2xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a3xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a4xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_catalog.c

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index eb788921ff4f..f5e2838c6a76 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -8,13 +8,18 @@ ccflags-$(CONFIG_DRM_MSM_DP) += -I $(src)/dp
 adreno-y := \
adreno/adreno_device.o \
adreno/adreno_gpu.o \
+   adreno/a2xx_catalog.o \
adreno/a2xx_gpu.o \
adreno/a2xx_gpummu.o \
+   adreno/a3xx_catalog.o \
adreno/a3xx_gpu.o \
+   adreno/a4xx_catalog.o \
adreno/a4xx_gpu.o \
+   adreno/a5xx_catalog.o \
adreno/a5xx_gpu.o \
adreno/a5xx_power.o \
adreno/a5xx_preempt.o \
+   adreno/a6xx_catalog.o \
adreno/a6xx_gpu.o \
adreno/a6xx_gmu.o \
adreno/a6xx_hfi.o \
diff --git a/drivers/gpu/drm/msm/adreno/a2xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a2xx_catalog.c
new file mode 100644
index ..9ddb7b31fd98
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a2xx_catalog.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2013-2014 Red Hat
+ * Author: Rob Clark 
+ *
+ * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved.
+ */
+
+#include "adreno_gpu.h"
+
+static const struct adreno_info a2xx_gpus[] = {
+   {
+   .chip_ids = ADRENO_CHIP_IDS(0x0200),
+   .family = ADRENO_2XX_GEN1,
+   .revn  = 200,
+   .fw = {
+   [ADRENO_FW_PM4] = "yamato_pm4.fw",
+   [ADRENO_FW_PFP] = "yamato_pfp.fw",
+   },
+   .gmem  = SZ_256K,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .init  = a2xx_gpu_init,
+   }, { /* a200 on i.mx51 has only 128kib gmem */
+   .chip_ids = ADRENO_CHIP_IDS(0x0201),
+   .family = ADRENO_2XX_GEN1,
+   .revn  = 201,
+   .fw = {
+   [ADRENO_FW_PM4] = "yamato_pm4.fw",
+   [ADRENO_FW_PFP] = "yamato_pfp.fw",
+   },
+   .gmem  = SZ_128K,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .init  = a2xx_gpu_init,
+   }, {
+   .chip_ids = ADRENO_CHIP_IDS(0x0202),
+   .family = ADRENO_2XX_GEN2,
+   .revn  = 220,
+   .fw = {
+   [ADRENO_FW_PM4] = "leia_pm4_470.fw",
+   [ADRENO_FW_PFP] = "leia_pfp_470.fw",
+   },
+   .gmem  = SZ_512K,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .init  = a2xx_gpu_init,
+   }
+};
+DECLARE_ADRENO_GPULIST(a2xx);
+
+MODULE_FIRMWARE("qcom/leia_pfp_470.fw");
+MODULE_FIRMWARE("qcom/leia_pm4_470.fw");
+MODULE_FIRMWARE("qcom/yamato_pfp.fw");
+MODULE_FIRMWARE("qcom/yamato_pm4.fw");
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a3xx_catalog.c
new file mode 100644
index ..0de8465b6cf0
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a3xx_catalog.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2013-2014 Red Hat
+ * Author: Rob Clark 
+ *
+ * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved.
+ */
+
+#include "adreno_gpu.h"
+
+static const struct adreno_info a3xx_gpus[] = {
+   {
+   .chip_ids = ADRENO_CHIP_IDS(0x03000512),
+   .family = ADRENO_3XX,
+   .fw = {
+   [ADRENO_FW_PM4] = "a330_pm4.fw",
+   [ADRENO_FW_PFP] = "a330_pfp.fw",
+   },
+   .gmem  = SZ_128K,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .init  = a3xx_gpu_init,
+   }, {
+   .chip_ids = ADRENO_CHIP_IDS(0x03000520),
+   .family = ADRENO_3XX,
+   .revn  = 305,
+   .fw = {
+   [ADRENO_FW_PM4] = "a300_pm4.fw",
+   [ADRENO_FW_PFP] = "a300_pfp.fw",
+   },
+   

[PATCH v4 1/5] drm/msm/adreno: Split up giant device table

2024-06-18 Thread Rob Clark
From: Rob Clark 

Split into a separate table per generation, in preparation to move each
gen's device table to it's own file.

Signed-off-by: Rob Clark 
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 67 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h| 10 
 2 files changed, 63 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index c3703a51287b..a57659eaddc2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -20,7 +20,7 @@ bool allow_vram_carveout = false;
 MODULE_PARM_DESC(allow_vram_carveout, "Allow using VRAM Carveout, in place of 
IOMMU");
 module_param_named(allow_vram_carveout, allow_vram_carveout, bool, 0600);
 
-static const struct adreno_info gpulist[] = {
+static const struct adreno_info a2xx_gpus[] = {
{
.chip_ids = ADRENO_CHIP_IDS(0x0200),
.family = ADRENO_2XX_GEN1,
@@ -54,7 +54,12 @@ static const struct adreno_info gpulist[] = {
.gmem  = SZ_512K,
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init  = a2xx_gpu_init,
-   }, {
+   }
+};
+DECLARE_ADRENO_GPULIST(a2xx);
+
+static const struct adreno_info a3xx_gpus[] = {
+   {
.chip_ids = ADRENO_CHIP_IDS(0x03000512),
.family = ADRENO_3XX,
.fw = {
@@ -116,7 +121,12 @@ static const struct adreno_info gpulist[] = {
.gmem  = SZ_1M,
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init  = a3xx_gpu_init,
-   }, {
+   }
+};
+DECLARE_ADRENO_GPULIST(a3xx);
+
+static const struct adreno_info a4xx_gpus[] = {
+   {
.chip_ids = ADRENO_CHIP_IDS(0x04000500),
.family = ADRENO_4XX,
.revn  = 405,
@@ -149,7 +159,12 @@ static const struct adreno_info gpulist[] = {
.gmem  = (SZ_1M + SZ_512K),
.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init  = a4xx_gpu_init,
-   }, {
+   }
+};
+DECLARE_ADRENO_GPULIST(a4xx);
+
+static const struct adreno_info a5xx_gpus[] = {
+   {
.chip_ids = ADRENO_CHIP_IDS(0x05000600),
.family = ADRENO_5XX,
.revn = 506,
@@ -274,7 +289,12 @@ static const struct adreno_info gpulist[] = {
.quirks = ADRENO_QUIRK_LMLOADKILL_DISABLE,
.init = a5xx_gpu_init,
.zapfw = "a540_zap.mdt",
-   }, {
+   }
+};
+DECLARE_ADRENO_GPULIST(a5xx);
+
+static const struct adreno_info a6xx_gpus[] = {
+   {
.chip_ids = ADRENO_CHIP_IDS(0x0601),
.family = ADRENO_6XX_GEN1,
.revn = 610,
@@ -520,7 +540,12 @@ static const struct adreno_info gpulist[] = {
.zapfw = "a690_zap.mdt",
.hwcg = a690_hwcg,
.address_space_size = SZ_16G,
-   }, {
+   }
+};
+DECLARE_ADRENO_GPULIST(a6xx);
+
+static const struct adreno_info a7xx_gpus[] = {
+   {
.chip_ids = ADRENO_CHIP_IDS(0x07000200),
.family = ADRENO_6XX_GEN1, /* NOT a mistake! */
.fw = {
@@ -582,7 +607,17 @@ static const struct adreno_info gpulist[] = {
.init = a6xx_gpu_init,
.zapfw = "gen70900_zap.mbn",
.address_space_size = SZ_16G,
-   },
+   }
+};
+DECLARE_ADRENO_GPULIST(a7xx);
+
+static const struct adreno_gpulist *gpulists[] = {
+   &a2xx_gpulist,
+   &a3xx_gpulist,
+   &a4xx_gpulist,
+   &a5xx_gpulist,
+   &a6xx_gpulist,
+   &a6xx_gpulist,
 };
 
 MODULE_FIRMWARE("qcom/a300_pm4.fw");
@@ -617,13 +652,17 @@ MODULE_FIRMWARE("qcom/yamato_pm4.fw");
 static const struct adreno_info *adreno_info(uint32_t chip_id)
 {
/* identify gpu: */
-   for (int i = 0; i < ARRAY_SIZE(gpulist); i++) {
-   const struct adreno_info *info = &gpulist[i];
-   if (info->machine && !of_machine_is_compatible(info->machine))
-   continue;
-   for (int j = 0; info->chip_ids[j]; j++)
-   if (info->chip_ids[j] == chip_id)
-   return info;
+   for (int i = 0; i < ARRAY_SIZE(gpulists); i++) {
+   for (int j = 0; j < gpulists[i]->gpus_count; j++) {
+   const struct adreno_info *info = &gpulists[i]->gpus[j];
+
+   if (info->machine && 
!of_machine_is_compatible(info->machine))
+   continue;
+
+   for (int k = 0; info->chip_ids[k]; k++)
+   if (info->chip_ids[k] == chip_id)
+   return info;
+   }
}
 
return NULL;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_

[PATCH v4 0/5] drm/msm/adreno: Introduce/rework device hw catalog

2024-06-18 Thread Rob Clark
From: Rob Clark 

Split the single flat gpulist table into per-gen tables that exist in
their own per-gen files, and start moving more info into the device
table.  This at least gets all the big tables of register settings out
of the heart of the a6xx_gpu code.  Probably more could be moved, to
remove at least some of the per-gen if/else ladders, but this seemed
like a reasonably good start.

v2: Drop sentinel table entries
v3: Fix typo
v4: More const, fix missing a702 protect regs

Rob Clark (5):
  drm/msm/adreno: Split up giant device table
  drm/msm/adreno: Split catalog into separate files
  drm/msm/adreno: Move hwcg regs to a6xx hw catalog
  drm/msm/adreno: Move hwcg table into a6xx specific info
  drm/msm/adreno: Move CP_PROTECT settings to hw catalog

 drivers/gpu/drm/msm/Makefile   |5 +
 drivers/gpu/drm/msm/adreno/a2xx_catalog.c  |   52 +
 drivers/gpu/drm/msm/adreno/a3xx_catalog.c  |   81 ++
 drivers/gpu/drm/msm/adreno/a4xx_catalog.c  |   50 +
 drivers/gpu/drm/msm/adreno/a5xx_catalog.c  |  148 +++
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c  | 1240 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  880 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h  |   11 +
 drivers/gpu/drm/msm/adreno/adreno_device.c |  624 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|   32 +-
 10 files changed, 1649 insertions(+), 1474 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a2xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a3xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a4xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_catalog.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_catalog.c

-- 
2.45.2



Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Akhil P Oommen
On Tue, Jun 04, 2024 at 07:35:04PM +0200, Konrad Dybcio wrote:
> 
> 
> On 5/14/24 20:38, Akhil P Oommen wrote:
> > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote:
> > > Memory barriers help ensure instruction ordering, NOT time and order
> > > of actual write arrival at other observers (e.g. memory-mapped IP).
> > > On architectures employing weak memory ordering, the latter can be a
> > > giant pain point, and it has been as part of this driver.
> > > 
> > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of
> > > readl/writel, which include r/w (respectively) barriers.
> > > 
> > > Replace the barriers with a readback that ensures the previous writes
> > > have exited the write buffer (as the CPU must flush the write to the
> > > register it's trying to read back) and subsequently remove the hack
> > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt
> > > status in hw_init").
> > > 
> > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in 
> > > hw_init")
> > > Signed-off-by: Konrad Dybcio 
> > > ---
> > >   drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  5 ++---
> > >   drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 --
> > >   2 files changed, 6 insertions(+), 13 deletions(-)
> > 
> > I prefer this version compared to the v2. A helper routine is
> > unnecessary here because:
> > 1. there are very few scenarios where we have to read back the same
> > register.
> > 2. we may accidently readback a write only register.
> 
> Which would still trigger an address dependency on the CPU, no?

Yes, but it is not a good idea to read a write-only register. We can't be
sure about its effect on the endpoint.

> 
> > 
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > index 0e3dfd4c2bc8..4135a53b55a7 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
> > >   int ret;
> > >   u32 val;
> > > - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1);
> > > - /* Wait for the register to finish posting */
> > > - wmb();
> > > + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1));
> > > + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ);
> > 
> > This is unnecessary because we are polling on a register on the same port 
> > below. But I think we
> > can replace "wmb()" above with "mb()" to avoid reordering between read
> > and write IO instructions.
> 
> Ok on the dropping readback part
> 
> + AFAIU from Will's response, we can drop the barrier as well

Lets wait a bit on Will's response on compiler reordering.

> 
> > 
> > >   ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val,
> > >   val & (1 << 1), 100, 1);
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index 973872ad0474..0acbc38b8e70 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu)
> > >   }
> > >   /* Clear GBIF halt in case GX domain was not collapsed */
> > > + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > 
> > We need a full barrier here to avoid reordering. Also, lets add a
> > comment about why we are doing this odd looking sequence.
> > 
> > > + gpu_read(gpu, REG_A6XX_GBIF_HALT);
> > >   if (adreno_is_a619_holi(adreno_gpu)) {
> > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > >   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
> > > - /* Let's make extra sure that the GPU can access the memory.. */
> > > - mb();
> > 
> > We need a full barrier here.
> > 
> > > + gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL);
> > >   } else if (a6xx_has_gbif(adreno_gpu)) {
> > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
> > >   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
> > > - /* Let's make extra sure that the GPU can access the memory.. */
> > > - mb();
> > 
> > We need a full barrier here.
> 
> Not sure we do between REG_A6XX_GBIF_HALT & 
> REG_A6XX_RBBM_(GBIF_HALT/GPR0_CNTL),
> but I suppose keeping the one after REG_A6XX_RBBM_(GBIF_HALT/GPR0_CNTL) makes
> sense to avoid the possibility of configuring the GPU before it can access 
> DRAM..

Techinically, I think we don't need a barrier or the below read back.
Because the above write is ordered with the write (on CP_CNTL reg) which
finally triggers CP INIT later. GPU won't access memory before CP INIT.

> 
> > 
> > > + gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT);
> > >   }
> > > - /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */
> > > - if (adreno_is_a7xx(adreno_gpu) && a6xx_has_gbif(adreno_gpu))
> > > - spin_until(!gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK));
> > > -
> > 
> > Why i

Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info

2024-06-18 Thread Rob Clark
On Tue, Jun 18, 2024 at 1:30 AM Dmitry Baryshkov
 wrote:
>
> On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote:
> > From: Rob Clark 
> >
> > Introduce a6xx_info where we can stash gen specific stuff without
> > polluting the toplevel adreno_info struct.
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +--
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  6 +--
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  9 
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  6 ++-
> >  4 files changed, 67 insertions(+), 19 deletions(-)
> >
>
> Reviewed-by: Dmitry Baryshkov 
>
>
> > @@ -98,7 +100,9 @@ struct adreno_info {
> >   struct msm_gpu *(*init)(struct drm_device *dev);
> >   const char *zapfw;
> >   u32 inactive_period;
> > - const struct adreno_reglist *hwcg;
> > + union {
> > + const struct a6xx_info *a6xx;
> > + };
> >   u64 address_space_size;
> >   /**
> >* @speedbins: Optional table of fuse to speedbin mappings
>
> My preference would be towards wrapping the adreno_gpu, but that would
> require more significant rework of the driver. Let's see if we can get
> to that later.
>

yeah, it was going to be more re-work, and I'm neck deep in
gpuvm/vm_bind.. I just wanted to land this since it is a pita (and
error prone) to rebase as more gpu's get added ;-)

It isn't entirely unlike how we handle gpu gen specific options in
mesa, where we have a somewhat bigger set of options, so I wouldn't
say that this approach was worse than extending adreno_info.. just
different..

BR,
-R


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Akhil P Oommen
On Tue, Jun 04, 2024 at 03:40:56PM +0100, Will Deacon wrote:
> On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> > On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> > > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
> > > > If I understand correctly, you don't need any memory barrier.
> > > > writel()/readl()'s are ordered to the same endpoint. That goes for all
> > > > the reordering/barrier comments mentioned below too.
> > > > 
> > > > device-io.rst:
> > > > 
> > > > The read and write functions are defined to be ordered. That is the
> > > > compiler is not permitted to reorder the I/O sequence. When the 
> > > > ordering
> > > > can be compiler optimised, you can use __readb() and friends to
> > > > indicate the relaxed ordering. Use this with care.
> > > > 
> > > > memory-barriers.txt:
> > > > 
> > > >  (*) readX(), writeX():
> > > > 
> > > > The readX() and writeX() MMIO accessors take a pointer to 
> > > > the
> > > > peripheral being accessed as an __iomem * parameter. For 
> > > > pointers
> > > > mapped with the default I/O attributes (e.g. those returned 
> > > > by
> > > > ioremap()), the ordering guarantees are as follows:
> > > > 
> > > > 1. All readX() and writeX() accesses to the same peripheral 
> > > > are ordered
> > > >with respect to each other. This ensures that MMIO 
> > > > register accesses
> > > >by the same CPU thread to a particular device will 
> > > > arrive in program
> > > >order.
> > > > 
> > > 
> > > In arm64, a writel followed by readl translates to roughly the following
> > > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> > > sure what is stopping compiler from reordering  __raw_writel() and 
> > > __raw_readl()
> > > above? I am assuming iomem cookie is ignored during compilation.
> > 
> > It seems to me that is due to some usage of volatile there in
> > __raw_writel() etc, but to be honest after reading about volatile and
> > some threads from gcc mailing lists, I don't have a confident answer :)
> > 
> > > 
> > > Added Will to this thread if he can throw some light on this.
> > 
> > Hopefully Will can school us.
> 
> The ordering in this case is ensured by the memory attributes used for
> ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> (as it the case for ioremap()), the "nR" part means "no reordering", so
> readX() and writeX() to that region are ordered wrt each other.

But that avoids only HW reordering, doesn't it? What about *compiler 
reordering* in the
case of a writel following by a readl which translates to:
1: dmb_wmb()
2: __raw_writel() -> roughly "asm volatile('str')
3: __raw_readl() -> roughly "asm volatile('ldr')
4: dmb_rmb()

Is the 'volatile' keyword sufficient to avoid reordering between (2) and (3)? Or
do we need a "memory" clobber to inhibit reordering?

This is still not clear to me even after going through some compiler 
documentions.

-Akhil.

> 
> Note that guarantee _doesn't_ apply to other flavours of ioremap(), so
> e.g. ioremap_wc() won't give you the ordering.
> 
> Hope that helps,
> 
> Will


Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers

2024-06-18 Thread Will Deacon
On Thu, Jun 06, 2024 at 02:03:24PM +0200, Konrad Dybcio wrote:
> On 4.06.2024 4:40 PM, Will Deacon wrote:
> > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote:
> >> On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote:
> >>> On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote:
>  If I understand correctly, you don't need any memory barrier.
>  writel()/readl()'s are ordered to the same endpoint. That goes for all
>  the reordering/barrier comments mentioned below too.
> 
>  device-io.rst:
> 
>  The read and write functions are defined to be ordered. That is the
>  compiler is not permitted to reorder the I/O sequence. When the 
>  ordering
>  can be compiler optimised, you can use __readb() and friends to
>  indicate the relaxed ordering. Use this with care.
> 
>  memory-barriers.txt:
> 
>   (*) readX(), writeX():
> 
>   The readX() and writeX() MMIO accessors take a pointer to the
>   peripheral being accessed as an __iomem * parameter. For pointers
>   mapped with the default I/O attributes (e.g. those returned by
>   ioremap()), the ordering guarantees are as follows:
> 
>   1. All readX() and writeX() accesses to the same peripheral are 
>  ordered
>  with respect to each other. This ensures that MMIO register 
>  accesses
>  by the same CPU thread to a particular device will arrive in 
>  program
>  order.
> 
> >>>
> >>> In arm64, a writel followed by readl translates to roughly the following
> >>> sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not
> >>> sure what is stopping compiler from reordering  __raw_writel() and 
> >>> __raw_readl()
> >>> above? I am assuming iomem cookie is ignored during compilation.
> >>
> >> It seems to me that is due to some usage of volatile there in
> >> __raw_writel() etc, but to be honest after reading about volatile and
> >> some threads from gcc mailing lists, I don't have a confident answer :)
> >>
> >>>
> >>> Added Will to this thread if he can throw some light on this.
> >>
> >> Hopefully Will can school us.
> > 
> > The ordering in this case is ensured by the memory attributes used for
> > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes
> > (as it the case for ioremap()), the "nR" part means "no reordering", so
> > readX() and writeX() to that region are ordered wrt each other.
> > 
> > Note that guarantee _doesn't_ apply to other flavours of ioremap(), so
> > e.g. ioremap_wc() won't give you the ordering.
> > 
> > Hope that helps,
> 
> Just to make sure I'm following, would mapping things as nGnRnE effectively
> get rid of write buffering, perhaps being a way of debugging whether that
> in particular is causing issues (at the cost of speed)?

I think the "nE" part is just a hint, so it will depend on how the
hardware has been built. On top of that, you'll still need something
like a DSB to force the CPU to wait for the write response.

Will


Re: [PATCH v3 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog

2024-06-18 Thread Konrad Dybcio




On 6/18/24 00:51, Rob Clark wrote:

From: Rob Clark 

Move the CP_PROTECT settings into the hw catalog.

Signed-off-by: Rob Clark 
---


I think a702 was skipped over

Konrad


Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info

2024-06-18 Thread Konrad Dybcio




On 6/18/24 00:51, Rob Clark wrote:

From: Rob Clark 

Introduce a6xx_info where we can stash gen specific stuff without
polluting the toplevel adreno_info struct.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +--
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  6 +--
  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  9 
  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  6 ++-
  4 files changed, 67 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index bcc2f4d8cfc6..96d93251fdd6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -7,6 +7,7 @@
   */
  
  #include "adreno_gpu.h"

+#include "a6xx_gpu.h"
  #include "a6xx.xml.h"
  #include "a6xx_gmu.xml.h"
  
@@ -465,7 +466,9 @@ static const struct adreno_info a6xx_gpus[] = {

.inactive_period = DRM_MSM_INACTIVE_PERIOD,
.init = a6xx_gpu_init,
.zapfw = "a610_zap.mdt",
-   .hwcg = a612_hwcg,
+   .a6xx = &(struct a6xx_info) {


const

Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v3 3/5] drm/msm/adreno: Move hwcg regs to a6xx hw catalog

2024-06-18 Thread Konrad Dybcio




On 6/18/24 00:51, Rob Clark wrote:

From: Rob Clark 

Move the hwcg tables into the hw catalog.

Signed-off-by: Rob Clark 
---


Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v3 2/5] drm/msm/adreno: Split catalog into separate files

2024-06-18 Thread Konrad Dybcio




On 6/18/24 00:51, Rob Clark wrote:

From: Rob Clark 

Split each gen's gpu table into it's own file.  Only code-motion, no
functional change.

Signed-off-by: Rob Clark 
---


Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v3 1/5] drm/msm/adreno: Split up giant device table

2024-06-18 Thread Konrad Dybcio




On 6/18/24 00:51, Rob Clark wrote:

From: Rob Clark 

Split into a separate table per generation, in preparation to move each
gen's device table to it's own file.

Signed-off-by: Rob Clark 
---


Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info

2024-06-18 Thread Dmitry Baryshkov
On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote:
> From: Rob Clark 
> 
> Introduce a6xx_info where we can stash gen specific stuff without
> polluting the toplevel adreno_info struct.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  6 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  9 
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  6 ++-
>  4 files changed, 67 insertions(+), 19 deletions(-)
> 

Reviewed-by: Dmitry Baryshkov 


> @@ -98,7 +100,9 @@ struct adreno_info {
>   struct msm_gpu *(*init)(struct drm_device *dev);
>   const char *zapfw;
>   u32 inactive_period;
> - const struct adreno_reglist *hwcg;
> + union {
> + const struct a6xx_info *a6xx;
> + };
>   u64 address_space_size;
>   /**
>* @speedbins: Optional table of fuse to speedbin mappings

My preference would be towards wrapping the adreno_gpu, but that would
require more significant rework of the driver. Let's see if we can get
to that later.

-- 
With best wishes
Dmitry


Re: [PATCH v3 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog

2024-06-18 Thread Dmitry Baryshkov
On Mon, Jun 17, 2024 at 03:51:15PM GMT, Rob Clark wrote:
> From: Rob Clark 
> 
> Move the CP_PROTECT settings into the hw catalog.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 247 +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 257 +-
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   2 +
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  13 ++
>  4 files changed, 268 insertions(+), 251 deletions(-)
> 

Reviewed-by: Dmitry Baryshkov 


-- 
With best wishes
Dmitry