Re: [PATCH v3 7/9] drm/msm/dpu: drop _dpu_crtc_check_and_setup_lm_bounds from atomic_begin
On Wed, 19 Jun 2024 at 01:56, Abhinav Kumar wrote: > On 6/13/2024 4:20 PM, Abhinav Kumar wrote: > > On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: > >> The dpu_crtc_atomic_check() already calls the function > >> _dpu_crtc_check_and_setup_lm_bounds(). There is no need to call it > >> again from dpu_crtc_atomic_begin(). > >> > >> Signed-off-by: Dmitry Baryshkov > >> --- > >> drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 -- > >> 1 file changed, 2 deletions(-) > >> > > > > Reviewed-by: Abhinav Kumar > > > This change is causing a small regression on sc7280 chromebook. > > I have tested and concluded that this is causing the chrome boot > animation to disappear. > > I have tested a couple of times and without this change it works fine. > > If this change was meant as an optimization, can we drop this one and > investigate later why this is causing one? I have not spent time > investigating why it happened. Rest of the series works well and I dont > see any dependency as such. Let me know if that works for you. Otherwise > I will have to spend a little more time on this patch and why chrome > compositor does not like this for the animation screen. Oh, my. Thank you for the test! I think I know what's happening. The cstate->num_mixers gets set only in dpu_encoder_virt_atomic_mode_set(). So during dpu_crtc_atomic_check() we don't have cstate->num_mixers is stale (and if it is 0, the check is skipped). I guess I'll have to move cstate->mixers[] and cstate->num_mixers assignment to the dpu_encoder_virt_atomic_check(). And maybe we should start thinking about my old idea of moving resource allocation to the CRTC code. -- With best wishes Dmitry
Re: [PATCH v2 08/14] drm/msm/hdmi: add runtime PM calls to DDC transfer function
On 5/22/2024 3:51 AM, Dmitry Baryshkov wrote: We must be sure that the HDMI controller is powered on, while performing the DDC transfer. Add corresponding runtime PM calls to msm_hdmi_i2c_xfer(). Signed-off-by: Dmitry Baryshkov Reviewed-by: Jessica Zhang --- drivers/gpu/drm/msm/hdmi/hdmi_i2c.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c b/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c index 7aa500d24240..ebefea4fb408 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_i2c.c @@ -107,11 +107,15 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c, if (num == 0) return num; + ret = pm_runtime_resume_and_get(&hdmi->pdev->dev); + if (ret) + return ret; + init_ddc(hdmi_i2c); ret = ddc_clear_irq(hdmi_i2c); if (ret) - return ret; + goto fail; for (i = 0; i < num; i++) { struct i2c_msg *p = &msgs[i]; @@ -169,7 +173,7 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c, hdmi_read(hdmi, REG_HDMI_DDC_SW_STATUS), hdmi_read(hdmi, REG_HDMI_DDC_HW_STATUS), hdmi_read(hdmi, REG_HDMI_DDC_INT_CTRL)); - return ret; + goto fail; } ddc_status = hdmi_read(hdmi, REG_HDMI_DDC_SW_STATUS); @@ -202,7 +206,13 @@ static int msm_hdmi_i2c_xfer(struct i2c_adapter *i2c, } } + pm_runtime_put(&hdmi->pdev->dev); + return i; + +fail: + pm_runtime_put(&hdmi->pdev->dev); + return ret; } static u32 msm_hdmi_i2c_func(struct i2c_adapter *adapter) -- 2.39.2
Re: [PATCH v2 07/14] drm/msm/hdmi: switch to pm_runtime_resume_and_get()
On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote: The pm_runtime_get_sync() function is a bad choise for runtime power [nit: s/choise/choice/] Reviewed-by: Jessica Zhang management. Switch HDMI driver to pm_runtime_resume_and_get() and add proper error handling, while we are at it. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/hdmi/hdmi_bridge.c | 2 +- drivers/gpu/drm/msm/hdmi/hdmi_hpd.c| 12 ++-- drivers/gpu/drm/msm/hdmi/hdmi_phy.c| 6 +- 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c index fb99328107dd..d1b35328b6e8 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c @@ -19,7 +19,7 @@ static void msm_hdmi_power_on(struct drm_bridge *bridge) const struct hdmi_platform_config *config = hdmi->config; int ret; - pm_runtime_get_sync(&hdmi->pdev->dev); + pm_runtime_resume_and_get(&hdmi->pdev->dev); ret = regulator_bulk_enable(config->pwr_reg_cnt, hdmi->pwr_regs); if (ret) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c index 36266aa626dc..fc21ad3b01dc 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c @@ -85,7 +85,12 @@ int msm_hdmi_hpd_enable(struct drm_bridge *bridge) if (hdmi->hpd_gpiod) gpiod_set_value_cansleep(hdmi->hpd_gpiod, 1); - pm_runtime_get_sync(dev); + ret = pm_runtime_resume_and_get(dev); + if (ret) { + DRM_DEV_ERROR(dev, "runtime resume failed: %d\n", ret); + goto fail; + } + ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks); if (ret) goto fail; @@ -178,7 +183,10 @@ static enum drm_connector_status detect_reg(struct hdmi *hdmi) uint32_t hpd_int_status = 0; int ret; - pm_runtime_get_sync(&hdmi->pdev->dev); + ret = pm_runtime_resume_and_get(&hdmi->pdev->dev); + if (ret) + goto out; + ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks); if (ret) goto out; diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_phy.c b/drivers/gpu/drm/msm/hdmi/hdmi_phy.c index 88a3423b7f24..d5acae752300 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_phy.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_phy.c @@ -58,7 +58,11 @@ int msm_hdmi_phy_resource_enable(struct hdmi_phy *phy) struct device *dev = &phy->pdev->dev; int i, ret = 0; - pm_runtime_get_sync(dev); + ret = pm_runtime_resume_and_get(dev); + if (ret) { + DRM_DEV_ERROR(dev, "runtime resume failed: %d\n", ret); + return ret; + } ret = regulator_bulk_enable(cfg->num_regs, phy->regs); if (ret) { -- 2.39.2
Re: [PATCH v2 06/14] drm/msm/hdmi: switch to clk_bulk API
On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote: The last platform using legacy clock names for HDMI block (APQ8064) switched to new clock names in 5.16. It's time to stop caring about old DT, drop hand-coded helpers and switch to clk_bulk_* API. Signed-off-by: Dmitry Baryshkov Reviewed-by: Jessica Zhang --- drivers/gpu/drm/msm/hdmi/hdmi.c | 15 +- drivers/gpu/drm/msm/hdmi/hdmi.h | 2 +- drivers/gpu/drm/msm/hdmi/hdmi_hpd.c | 39 + 3 files changed, 19 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c index c14e009f38b1..7ec4ca3b7597 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi.c @@ -469,17 +469,12 @@ static int msm_hdmi_dev_probe(struct platform_device *pdev) if (!hdmi->hpd_clks) return -ENOMEM; - for (i = 0; i < config->hpd_clk_cnt; i++) { - struct clk *clk; + for (i = 0; i < config->hpd_clk_cnt; i++) + hdmi->hpd_clks[i].id = config->hpd_clk_names[i]; - clk = msm_clk_get(pdev, config->hpd_clk_names[i]); - if (IS_ERR(clk)) - return dev_err_probe(dev, PTR_ERR(clk), -"failed to get hpd clk: %s\n", -config->hpd_clk_names[i]); - - hdmi->hpd_clks[i] = clk; - } + ret = devm_clk_bulk_get(&pdev->dev, config->hpd_clk_cnt, hdmi->hpd_clks); + if (ret) + return ret; hdmi->extp_clk = devm_clk_get_optional(&pdev->dev, "extp"); if (IS_ERR(hdmi->extp_clk)) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h index c0d60ed23b75..eeba85ffef09 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi.h +++ b/drivers/gpu/drm/msm/hdmi/hdmi.h @@ -50,7 +50,7 @@ struct hdmi { struct regulator_bulk_data *hpd_regs; struct regulator_bulk_data *pwr_regs; - struct clk **hpd_clks; + struct clk_bulk_data *hpd_clks; struct clk *extp_clk; struct gpio_desc *hpd_gpiod; diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c index 7ae69b14e953..36266aa626dc 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c @@ -60,27 +60,6 @@ static void msm_hdmi_phy_reset(struct hdmi *hdmi) } } -static void enable_hpd_clocks(struct hdmi *hdmi, bool enable) -{ - const struct hdmi_platform_config *config = hdmi->config; - struct device *dev = &hdmi->pdev->dev; - int i, ret; - - if (enable) { - for (i = 0; i < config->hpd_clk_cnt; i++) { - ret = clk_prepare_enable(hdmi->hpd_clks[i]); - if (ret) { - DRM_DEV_ERROR(dev, - "failed to enable hpd clk: %s (%d)\n", - config->hpd_clk_names[i], ret); - } - } - } else { - for (i = config->hpd_clk_cnt - 1; i >= 0; i--) - clk_disable_unprepare(hdmi->hpd_clks[i]); - } -} - int msm_hdmi_hpd_enable(struct drm_bridge *bridge) { struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge); @@ -107,7 +86,9 @@ int msm_hdmi_hpd_enable(struct drm_bridge *bridge) gpiod_set_value_cansleep(hdmi->hpd_gpiod, 1); pm_runtime_get_sync(dev); - enable_hpd_clocks(hdmi, true); + ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks); + if (ret) + goto fail; msm_hdmi_set_mode(hdmi, false); msm_hdmi_phy_reset(hdmi); @@ -149,7 +130,7 @@ void msm_hdmi_hpd_disable(struct hdmi *hdmi) msm_hdmi_set_mode(hdmi, false); - enable_hpd_clocks(hdmi, false); + clk_bulk_disable_unprepare(config->hpd_clk_cnt, hdmi->hpd_clks); pm_runtime_put(dev); ret = pinctrl_pm_select_sleep_state(dev); @@ -193,14 +174,20 @@ void msm_hdmi_hpd_irq(struct drm_bridge *bridge) static enum drm_connector_status detect_reg(struct hdmi *hdmi) { - uint32_t hpd_int_status; + const struct hdmi_platform_config *config = hdmi->config; + uint32_t hpd_int_status = 0; + int ret; pm_runtime_get_sync(&hdmi->pdev->dev); - enable_hpd_clocks(hdmi, true); + ret = clk_bulk_prepare_enable(config->hpd_clk_cnt, hdmi->hpd_clks); + if (ret) + goto out; hpd_int_status = hdmi_read(hdmi, REG_HDMI_HPD_INT_STATUS); - enable_hpd_clocks(hdmi, false); + clk_bulk_disable_unprepare(config->hpd_clk_cnt, hdmi->hpd_clks); + +out: pm_runtime_put(&hdmi->pdev->dev); return (hpd_int_status & HDMI_HPD_INT_STATUS_CABLE_DETECTED) ? -- 2.39.2
Re: [PATCH v2 05/14] drm/msm/hdmi: drop clock frequency assignment
On 5/22/2024 3:50 AM, Dmitry Baryshkov wrote: The only clock which has frequency being set through hpd_freqs is the "core" aka MDSS_HDMI_CLK clock. It always has the specified frequency, so we can drop corresponding clk_set_rate() call together with the hpd_freq infrastructure. Signed-off-by: Dmitry Baryshkov Reviewed-by: Jessica Zhang --- drivers/gpu/drm/msm/hdmi/hdmi.c | 2 -- drivers/gpu/drm/msm/hdmi/hdmi.h | 1 - drivers/gpu/drm/msm/hdmi/hdmi_hpd.c | 9 - 3 files changed, 12 deletions(-) diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c index 681265e29aa0..c14e009f38b1 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi.c @@ -236,12 +236,10 @@ static const struct hdmi_platform_config hdmi_tx_8960_config = { static const char *pwr_reg_names_8x74[] = {"core-vdda", "core-vcc"}; static const char *hpd_clk_names_8x74[] = {"iface", "core", "mdp_core", "alt_iface"}; -static unsigned long hpd_clk_freq_8x74[] = {0, 1920, 0, 0}; static const struct hdmi_platform_config hdmi_tx_8974_config = { HDMI_CFG(pwr_reg, 8x74), HDMI_CFG(hpd_clk, 8x74), - .hpd_freq = hpd_clk_freq_8x74, }; /* diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h index abdbe4779cf9..c0d60ed23b75 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi.h +++ b/drivers/gpu/drm/msm/hdmi/hdmi.h @@ -96,7 +96,6 @@ struct hdmi_platform_config { /* clks that need to be on for hpd: */ const char **hpd_clk_names; - const long unsigned *hpd_freq; int hpd_clk_cnt; }; diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c index 9ce0ffa35417..7ae69b14e953 100644 --- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c +++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c @@ -68,15 +68,6 @@ static void enable_hpd_clocks(struct hdmi *hdmi, bool enable) if (enable) { for (i = 0; i < config->hpd_clk_cnt; i++) { - if (config->hpd_freq && config->hpd_freq[i]) { - ret = clk_set_rate(hdmi->hpd_clks[i], - config->hpd_freq[i]); - if (ret) - dev_warn(dev, -"failed to set clk %s (%d)\n", -config->hpd_clk_names[i], ret); - } - ret = clk_prepare_enable(hdmi->hpd_clks[i]); if (ret) { DRM_DEV_ERROR(dev, -- 2.39.2
Re: [PATCH v3 9/9] drm/msm/dpu: sync mode_config limits to the FB limits in dpu_plane.c
On 6/13/2024 4:16 PM, Abhinav Kumar wrote: On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: Lift mode_config limits set by the DPU driver to the actual FB limits as handled by the dpu_plane.c. Move 2*max_lm_width check where it belongs, to the drm_crtc_helper_funcs::mode_valid() callback. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 15 +++ drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 9 ++--- 2 files changed, 17 insertions(+), 7 deletions(-) Did anything change in this patch from v2 that the R-b was dropped? Here it is again, Reviewed-by: Abhinav Kumar Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 8/9] drm/msm/dpu: merge MAX_IMG_WIDTH/HEIGHT with DPU_MAX_IMG_WIDTH/HEIGHT
On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: dpu_formats.c defines DPU_MAX_IMG_WIDTH and _HEIGHT, while dpu_hw_catalog.h defines just MAX_IMG_WIDTH and _HEIGHT. Merge these constants to remove duplication. Reviewed-by: Abhinav Kumar Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 3 --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 ++-- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 4 ++-- 3 files changed, 4 insertions(+), 7 deletions(-) Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 7/9] drm/msm/dpu: drop _dpu_crtc_check_and_setup_lm_bounds from atomic_begin
On 6/13/2024 4:20 PM, Abhinav Kumar wrote: On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: The dpu_crtc_atomic_check() already calls the function _dpu_crtc_check_and_setup_lm_bounds(). There is no need to call it again from dpu_crtc_atomic_begin(). Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 -- 1 file changed, 2 deletions(-) Reviewed-by: Abhinav Kumar This change is causing a small regression on sc7280 chromebook. I have tested and concluded that this is causing the chrome boot animation to disappear. I have tested a couple of times and without this change it works fine. If this change was meant as an optimization, can we drop this one and investigate later why this is causing one? I have not spent time investigating why it happened. Rest of the series works well and I dont see any dependency as such. Let me know if that works for you. Otherwise I will have to spend a little more time on this patch and why chrome compositor does not like this for the animation screen.
Re: [PATCH v3 6/9] drm/msm/dpu: check for the plane pitch overflow
On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: Check that the plane pitch doesn't overflow the maximum pitch size allowed by the hardware. Reviewed-by: Abhinav Kumar Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h | 2 ++ drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 6 +- 2 files changed, 7 insertions(+), 1 deletion(-) Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 5/9] drm/msm/dpu: move dpu_format_populate_plane_sizes to atomic_check
On 6/14/2024 3:34 AM, Dmitry Baryshkov wrote: On Thu, Jun 13, 2024 at 04:19:07PM GMT, Abhinav Kumar wrote: On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: Move a call to dpu_format_populate_plane_sizes() to the atomic_check step, so that any issues with the FB layout can be reported as early as possible. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) Did anything change between v2 to v3 that R-b was dropped? No, it was my failure to run b4 trailers, please excuse me. no problem. Tested-by: Abhinav Kumar # sc7280 Here it is again, Reviewed-by: Abhinav Kumar
Re: [PATCH v3 4/9] drm/msm/dpu: split dpu_format_populate_layout
On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: Split dpu_format_populate_layout() into addess-related and pitch/format-related parts. Reviewed-by: Abhinav Kumar Signed-off-by: Dmitry Baryshkov --- .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c| 8 +++- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c| 45 -- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h| 8 +++- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 12 -- 4 files changed, 46 insertions(+), 27 deletions(-) as promised, Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 3/9] drm/msm/dpu: drop dpu_format_populate_layout from dpu_plane_sspp_atomic_update
On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: The dpu_plane_prepare_fb() already calls dpu_format_populate_layout(). Store the generated layout in the plane state and drop this call from dpu_plane_sspp_update(). Reviewed-by: Abhinav Kumar Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 19 --- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 3 +++ 2 files changed, 7 insertions(+), 15 deletions(-) Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 2/9] drm/msm/dpu: drop dpu_format_check_modified_format
On 6/13/2024 4:14 PM, Abhinav Kumar wrote: On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: The msm_kms_funcs::check_modified_format() callback is not used by the driver. Drop it completely. Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 43 - drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 16 --- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 1 - drivers/gpu/drm/msm/msm_kms.h | 6 4 files changed, 66 deletions(-) Reviewed-by: Abhinav Kumar Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v3 1/9] drm/msm/dpu: check for overflow in _dpu_crtc_setup_lm_bounds()
On 6/13/2024 4:13 PM, Abhinav Kumar wrote: On 6/13/2024 3:36 PM, Dmitry Baryshkov wrote: Make _dpu_crtc_setup_lm_bounds() check that CRTC width is not overflowing LM requirements. Rename the function accordingly. Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support") Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) Reviewed-by: Abhinav Kumar As promised, Tested-by: Abhinav Kumar # sc7280
Re: [PATCH v4 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog
On 6/18/24 18:42, Rob Clark wrote: From: Rob Clark Move the CP_PROTECT settings into the hw catalog. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov --- [...] +static inline void __build_asserts(void) +{ + BUILD_BUG_ON(a630_protect.count > a630_protect.count_max); + BUILD_BUG_ON(a650_protect.count > a650_protect.count_max); + BUILD_BUG_ON(a660_protect.count > a660_protect.count_max); + BUILD_BUG_ON(a690_protect.count > a690_protect.count_max); + BUILD_BUG_ON(a730_protect.count > a730_protect.count_max); +} + patch:394: new blank line at EOF other than that: Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info
On Tue, Jun 18, 2024 at 09:33:48AM GMT, Rob Clark wrote: > On Tue, Jun 18, 2024 at 1:30 AM Dmitry Baryshkov > wrote: > > > > On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote: > > > From: Rob Clark > > > > > > Introduce a6xx_info where we can stash gen specific stuff without > > > polluting the toplevel adreno_info struct. > > > > > > Signed-off-by: Rob Clark > > > --- > > > drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +-- > > > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-- > > > drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 9 > > > drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++- > > > 4 files changed, 67 insertions(+), 19 deletions(-) > > > > > > > Reviewed-by: Dmitry Baryshkov > > > > > > > @@ -98,7 +100,9 @@ struct adreno_info { > > > struct msm_gpu *(*init)(struct drm_device *dev); > > > const char *zapfw; > > > u32 inactive_period; > > > - const struct adreno_reglist *hwcg; > > > + union { > > > + const struct a6xx_info *a6xx; > > > + }; > > > u64 address_space_size; > > > /** > > >* @speedbins: Optional table of fuse to speedbin mappings > > > > My preference would be towards wrapping the adreno_gpu, but that would > > require more significant rework of the driver. Let's see if we can get > > to that later. > > > > yeah, it was going to be more re-work, and I'm neck deep in > gpuvm/vm_bind.. I just wanted to land this since it is a pita (and > error prone) to rebase as more gpu's get added ;-) Yes, I'm fine with that. My note was more like a 'later todo' item. > > It isn't entirely unlike how we handle gpu gen specific options in > mesa, where we have a somewhat bigger set of options, so I wouldn't > say that this approach was worse than extending adreno_info.. just > different.. -- With best wishes Dmitry
[PATCH v4 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog
From: Rob Clark Move the CP_PROTECT settings into the hw catalog. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 248 + drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 257 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 + drivers/gpu/drm/msm/adreno/adreno_gpu.h | 13 ++ 4 files changed, 269 insertions(+), 251 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index b81bcae59ac3..329b88b24b80 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -454,6 +454,173 @@ static const struct adreno_reglist a690_hwcg[] = { {} }; +/* For a615, a616, a618, a619, a630, a640 and a680 */ +static const u32 a630_protect_regs[] = { + A6XX_PROTECT_RDONLY(0x0, 0x04ff), + A6XX_PROTECT_RDONLY(0x00501, 0x0005), + A6XX_PROTECT_RDONLY(0x0050b, 0x02f4), + A6XX_PROTECT_NORDWR(0x0050e, 0x), + A6XX_PROTECT_NORDWR(0x00510, 0x), + A6XX_PROTECT_NORDWR(0x00534, 0x), + A6XX_PROTECT_NORDWR(0x00800, 0x0082), + A6XX_PROTECT_NORDWR(0x008a0, 0x0008), + A6XX_PROTECT_NORDWR(0x008ab, 0x0024), + A6XX_PROTECT_RDONLY(0x008de, 0x00ae), + A6XX_PROTECT_NORDWR(0x00900, 0x004d), + A6XX_PROTECT_NORDWR(0x0098d, 0x0272), + A6XX_PROTECT_NORDWR(0x00e00, 0x0001), + A6XX_PROTECT_NORDWR(0x00e03, 0x000c), + A6XX_PROTECT_NORDWR(0x03c00, 0x00c3), + A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff), + A6XX_PROTECT_NORDWR(0x08630, 0x01cf), + A6XX_PROTECT_NORDWR(0x08e00, 0x), + A6XX_PROTECT_NORDWR(0x08e08, 0x), + A6XX_PROTECT_NORDWR(0x08e50, 0x001f), + A6XX_PROTECT_NORDWR(0x09624, 0x01db), + A6XX_PROTECT_NORDWR(0x09e70, 0x0001), + A6XX_PROTECT_NORDWR(0x09e78, 0x0187), + A6XX_PROTECT_NORDWR(0x0a630, 0x01cf), + A6XX_PROTECT_NORDWR(0x0ae02, 0x), + A6XX_PROTECT_NORDWR(0x0ae50, 0x032f), + A6XX_PROTECT_NORDWR(0x0b604, 0x), + A6XX_PROTECT_NORDWR(0x0be02, 0x0001), + A6XX_PROTECT_NORDWR(0x0be20, 0x17df), + A6XX_PROTECT_NORDWR(0x0f000, 0x0bff), + A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff), + A6XX_PROTECT_NORDWR(0x11c00, 0x), /* note: infinite range */ +}; +DECLARE_ADRENO_PROTECT(a630_protect, 32); + +/* These are for a620 and a650 */ +static const u32 a650_protect_regs[] = { + A6XX_PROTECT_RDONLY(0x0, 0x04ff), + A6XX_PROTECT_RDONLY(0x00501, 0x0005), + A6XX_PROTECT_RDONLY(0x0050b, 0x02f4), + A6XX_PROTECT_NORDWR(0x0050e, 0x), + A6XX_PROTECT_NORDWR(0x00510, 0x), + A6XX_PROTECT_NORDWR(0x00534, 0x), + A6XX_PROTECT_NORDWR(0x00800, 0x0082), + A6XX_PROTECT_NORDWR(0x008a0, 0x0008), + A6XX_PROTECT_NORDWR(0x008ab, 0x0024), + A6XX_PROTECT_RDONLY(0x008de, 0x00ae), + A6XX_PROTECT_NORDWR(0x00900, 0x004d), + A6XX_PROTECT_NORDWR(0x0098d, 0x0272), + A6XX_PROTECT_NORDWR(0x00e00, 0x0001), + A6XX_PROTECT_NORDWR(0x00e03, 0x000c), + A6XX_PROTECT_NORDWR(0x03c00, 0x00c3), + A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff), + A6XX_PROTECT_NORDWR(0x08630, 0x01cf), + A6XX_PROTECT_NORDWR(0x08e00, 0x), + A6XX_PROTECT_NORDWR(0x08e08, 0x), + A6XX_PROTECT_NORDWR(0x08e50, 0x001f), + A6XX_PROTECT_NORDWR(0x08e80, 0x027f), + A6XX_PROTECT_NORDWR(0x09624, 0x01db), + A6XX_PROTECT_NORDWR(0x09e60, 0x0011), + A6XX_PROTECT_NORDWR(0x09e78, 0x0187), + A6XX_PROTECT_NORDWR(0x0a630, 0x01cf), + A6XX_PROTECT_NORDWR(0x0ae02, 0x), + A6XX_PROTECT_NORDWR(0x0ae50, 0x032f), + A6XX_PROTECT_NORDWR(0x0b604, 0x), + A6XX_PROTECT_NORDWR(0x0b608, 0x0007), + A6XX_PROTECT_NORDWR(0x0be02, 0x0001), + A6XX_PROTECT_NORDWR(0x0be20, 0x17df), + A6XX_PROTECT_NORDWR(0x0f000, 0x0bff), + A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff), + A6XX_PROTECT_NORDWR(0x18400, 0x1fff), + A6XX_PROTECT_NORDWR(0x1a800, 0x1fff), + A6XX_PROTECT_NORDWR(0x1f400, 0x0443), + A6XX_PROTECT_RDONLY(0x1f844, 0x007b), + A6XX_PROTECT_NORDWR(0x1f887, 0x001b), + A6XX_PROTECT_NORDWR(0x1f8c0, 0x), /* note: infinite range */ +}; +DECLARE_ADRENO_PROTECT(a650_protect, 48); + +/* These are for a635 and a660 */ +static const u32 a660_protect_regs[] = { + A6XX_PROTECT_RDONLY(0x0, 0x04ff), + A6XX_PROTECT_RDONLY(0x00501, 0x0005), + A6XX_PROTECT_RDONLY(0x0050b, 0x02f4), + A6XX_PROTECT_NORDWR(0x0050e, 0x), + A6XX_PROTECT_NORDWR(0x00510, 0x), + A6XX_PROTECT_NORDWR(0x00534, 0x), + A6XX_PROTECT_NORDWR(0x00800, 0x0082), + A6XX_PROTECT_NORDWR(0x008a0, 0x0008), + A6XX_PROTECT_NORDWR(0x008ab, 0x0024), + A6XX_PROTECT_RDONLY(0x008de, 0x00ae), + A6XX_PROTECT_NORDWR(0x00900, 0x004d), + A6XX_PROTECT_NORDWR(0x0098d, 0x0272), + A6XX_PROTECT_
[PATCH v4 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info
From: Rob Clark Introduce a6xx_info where we can stash gen specific stuff without polluting the toplevel adreno_info struct. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +-- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 9 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++- 4 files changed, 67 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index bcc2f4d8cfc6..b81bcae59ac3 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -7,6 +7,7 @@ */ #include "adreno_gpu.h" +#include "a6xx_gpu.h" #include "a6xx.xml.h" #include "a6xx_gmu.xml.h" @@ -465,7 +466,9 @@ static const struct adreno_info a6xx_gpus[] = { .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a6xx_gpu_init, .zapfw = "a610_zap.mdt", - .hwcg = a612_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a612_hwcg, + }, /* * There are (at least) three SoCs implementing A610: SM6125 * (trinket), SM6115 (bengal) and SM6225 (khaje). Trinket does @@ -493,7 +496,9 @@ static const struct adreno_info a6xx_gpus[] = { .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT, .init = a6xx_gpu_init, .zapfw = "a615_zap.mbn", - .hwcg = a615_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a615_hwcg, + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 128, 1 }, @@ -513,6 +518,8 @@ static const struct adreno_info a6xx_gpus[] = { .inactive_period = DRM_MSM_INACTIVE_PERIOD, .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT, .init = a6xx_gpu_init, + .a6xx = &(const struct a6xx_info) { + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 169, 1 }, @@ -531,7 +538,9 @@ static const struct adreno_info a6xx_gpus[] = { .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a6xx_gpu_init, .zapfw = "a615_zap.mdt", - .hwcg = a615_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a615_hwcg, + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 138, 1 }, @@ -550,7 +559,9 @@ static const struct adreno_info a6xx_gpus[] = { .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a6xx_gpu_init, .zapfw = "a615_zap.mdt", - .hwcg = a615_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a615_hwcg, + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 190, 1 }, @@ -569,7 +580,9 @@ static const struct adreno_info a6xx_gpus[] = { .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT, .init = a6xx_gpu_init, .zapfw = "a615_zap.mdt", - .hwcg = a615_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a615_hwcg, + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 120, 4 }, @@ -593,7 +606,9 @@ static const struct adreno_info a6xx_gpus[] = { .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT, .init = a6xx_gpu_init, .zapfw = "a630_zap.mdt", - .hwcg = a630_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a630_hwcg, + }, }, { .chip_ids = ADRENO_CHIP_IDS(0x06040001), .family = ADRENO_6XX_GEN2, @@ -607,7 +622,9 @@ static const struct adreno_info a6xx_gpus[] = { .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT, .init = a6xx_gpu_init, .zapfw = "a640_zap.mdt", - .hwcg = a640_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a640_hwcg, + }, .speedbins = ADRENO_SPEEDBINS( { 0, 0 }, { 1, 1 }, @@ -626,7 +643,9 @@ static const struct adreno_info a6xx_gpus[] = { ADRENO_QUIRK_HAS_HW_APRIV, .init = a6xx_gpu_init, .zapfw = "a650_zap.mdt", - .hwcg = a650_hwcg, + .a6xx = &(const struct a6xx_info) { + .hwcg = a650_hwcg, + }
[PATCH v4 3/5] drm/msm/adreno: Move hwcg regs to a6xx hw catalog
From: Rob Clark Move the hwcg tables into the hw catalog. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 619 ++ drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 617 - drivers/gpu/drm/msm/adreno/adreno_gpu.h | 3 - 3 files changed, 619 insertions(+), 620 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index 10a92eab0232..bcc2f4d8cfc6 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -7,6 +7,451 @@ */ #include "adreno_gpu.h" +#include "a6xx.xml.h" +#include "a6xx_gmu.xml.h" + +static const struct adreno_reglist a612_hwcg[] = { + {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220}, + {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0081}, + {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf}, + {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002}, + {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001}, + {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007}, + {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0120}, + {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220}, + {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00}, + {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022}, + {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011}, + {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044}, + {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422}, + {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x}, + {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0222}, + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002}, + {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000}, + {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200}, + {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004}, + {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004}, + {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002}, + {REG_A6XX_RBBM_ISDB_CNT, 0x0182}, + {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x}, + {REG_A6XX_RBBM_SP_HYST_CNT, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222}, + {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111}, + {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555}, + {}, +}; + +/* For a615 family (a615, a616, a618 and a619) */ +static const struct adreno_reglist a615_hwcg[] = { + {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x0222}, + {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220}, + {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0080}, + {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xF3CF}, + {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x0222}, + {REG_A6XX_RBBM_CLOCK_CNTL_TP1, 0x0222}, + {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL2_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL3_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002}, + {REG_A6XX_RBBM_CLOCK_CNTL4_TP1, 0x0002}, + {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST2_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST3_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007}, + {REG_A6XX_RBBM_CLOCK_HYST4_TP1, 0x0007}, + {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY2_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY3_TP1, 0x}, + {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001}, + {REG_A6XX_RBBM_CLOCK_DELAY4_TP1, 0x0001}, + {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x}, + {REG_A6XX_RBBM_CLOCK_CNTL2_UCHE, 0x}, + {REG_A6XX_RBBM_CLOCK_C
[PATCH v4 2/5] drm/msm/adreno: Split catalog into separate files
From: Rob Clark Split each gen's gpu table into it's own file. Only code-motion, no functional change. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio --- drivers/gpu/drm/msm/Makefile | 5 + drivers/gpu/drm/msm/adreno/a2xx_catalog.c | 52 ++ drivers/gpu/drm/msm/adreno/a3xx_catalog.c | 81 +++ drivers/gpu/drm/msm/adreno/a4xx_catalog.c | 50 ++ drivers/gpu/drm/msm/adreno/a5xx_catalog.c | 148 + drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 338 +++ drivers/gpu/drm/msm/adreno/adreno_device.c | 625 + 7 files changed, 680 insertions(+), 619 deletions(-) create mode 100644 drivers/gpu/drm/msm/adreno/a2xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a3xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a4xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_catalog.c diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index eb788921ff4f..f5e2838c6a76 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -8,13 +8,18 @@ ccflags-$(CONFIG_DRM_MSM_DP) += -I $(src)/dp adreno-y := \ adreno/adreno_device.o \ adreno/adreno_gpu.o \ + adreno/a2xx_catalog.o \ adreno/a2xx_gpu.o \ adreno/a2xx_gpummu.o \ + adreno/a3xx_catalog.o \ adreno/a3xx_gpu.o \ + adreno/a4xx_catalog.o \ adreno/a4xx_gpu.o \ + adreno/a5xx_catalog.o \ adreno/a5xx_gpu.o \ adreno/a5xx_power.o \ adreno/a5xx_preempt.o \ + adreno/a6xx_catalog.o \ adreno/a6xx_gpu.o \ adreno/a6xx_gmu.o \ adreno/a6xx_hfi.o \ diff --git a/drivers/gpu/drm/msm/adreno/a2xx_catalog.c b/drivers/gpu/drm/msm/adreno/a2xx_catalog.c new file mode 100644 index ..9ddb7b31fd98 --- /dev/null +++ b/drivers/gpu/drm/msm/adreno/a2xx_catalog.c @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2013-2014 Red Hat + * Author: Rob Clark + * + * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved. + */ + +#include "adreno_gpu.h" + +static const struct adreno_info a2xx_gpus[] = { + { + .chip_ids = ADRENO_CHIP_IDS(0x0200), + .family = ADRENO_2XX_GEN1, + .revn = 200, + .fw = { + [ADRENO_FW_PM4] = "yamato_pm4.fw", + [ADRENO_FW_PFP] = "yamato_pfp.fw", + }, + .gmem = SZ_256K, + .inactive_period = DRM_MSM_INACTIVE_PERIOD, + .init = a2xx_gpu_init, + }, { /* a200 on i.mx51 has only 128kib gmem */ + .chip_ids = ADRENO_CHIP_IDS(0x0201), + .family = ADRENO_2XX_GEN1, + .revn = 201, + .fw = { + [ADRENO_FW_PM4] = "yamato_pm4.fw", + [ADRENO_FW_PFP] = "yamato_pfp.fw", + }, + .gmem = SZ_128K, + .inactive_period = DRM_MSM_INACTIVE_PERIOD, + .init = a2xx_gpu_init, + }, { + .chip_ids = ADRENO_CHIP_IDS(0x0202), + .family = ADRENO_2XX_GEN2, + .revn = 220, + .fw = { + [ADRENO_FW_PM4] = "leia_pm4_470.fw", + [ADRENO_FW_PFP] = "leia_pfp_470.fw", + }, + .gmem = SZ_512K, + .inactive_period = DRM_MSM_INACTIVE_PERIOD, + .init = a2xx_gpu_init, + } +}; +DECLARE_ADRENO_GPULIST(a2xx); + +MODULE_FIRMWARE("qcom/leia_pfp_470.fw"); +MODULE_FIRMWARE("qcom/leia_pm4_470.fw"); +MODULE_FIRMWARE("qcom/yamato_pfp.fw"); +MODULE_FIRMWARE("qcom/yamato_pm4.fw"); diff --git a/drivers/gpu/drm/msm/adreno/a3xx_catalog.c b/drivers/gpu/drm/msm/adreno/a3xx_catalog.c new file mode 100644 index ..0de8465b6cf0 --- /dev/null +++ b/drivers/gpu/drm/msm/adreno/a3xx_catalog.c @@ -0,0 +1,81 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2013-2014 Red Hat + * Author: Rob Clark + * + * Copyright (c) 2014,2017 The Linux Foundation. All rights reserved. + */ + +#include "adreno_gpu.h" + +static const struct adreno_info a3xx_gpus[] = { + { + .chip_ids = ADRENO_CHIP_IDS(0x03000512), + .family = ADRENO_3XX, + .fw = { + [ADRENO_FW_PM4] = "a330_pm4.fw", + [ADRENO_FW_PFP] = "a330_pfp.fw", + }, + .gmem = SZ_128K, + .inactive_period = DRM_MSM_INACTIVE_PERIOD, + .init = a3xx_gpu_init, + }, { + .chip_ids = ADRENO_CHIP_IDS(0x03000520), + .family = ADRENO_3XX, + .revn = 305, + .fw = { + [ADRENO_FW_PM4] = "a300_pm4.fw", + [ADRENO_FW_PFP] = "a300_pfp.fw", + }, +
[PATCH v4 1/5] drm/msm/adreno: Split up giant device table
From: Rob Clark Split into a separate table per generation, in preparation to move each gen's device table to it's own file. Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio --- drivers/gpu/drm/msm/adreno/adreno_device.c | 67 +- drivers/gpu/drm/msm/adreno/adreno_gpu.h| 10 2 files changed, 63 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index c3703a51287b..a57659eaddc2 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -20,7 +20,7 @@ bool allow_vram_carveout = false; MODULE_PARM_DESC(allow_vram_carveout, "Allow using VRAM Carveout, in place of IOMMU"); module_param_named(allow_vram_carveout, allow_vram_carveout, bool, 0600); -static const struct adreno_info gpulist[] = { +static const struct adreno_info a2xx_gpus[] = { { .chip_ids = ADRENO_CHIP_IDS(0x0200), .family = ADRENO_2XX_GEN1, @@ -54,7 +54,12 @@ static const struct adreno_info gpulist[] = { .gmem = SZ_512K, .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a2xx_gpu_init, - }, { + } +}; +DECLARE_ADRENO_GPULIST(a2xx); + +static const struct adreno_info a3xx_gpus[] = { + { .chip_ids = ADRENO_CHIP_IDS(0x03000512), .family = ADRENO_3XX, .fw = { @@ -116,7 +121,12 @@ static const struct adreno_info gpulist[] = { .gmem = SZ_1M, .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a3xx_gpu_init, - }, { + } +}; +DECLARE_ADRENO_GPULIST(a3xx); + +static const struct adreno_info a4xx_gpus[] = { + { .chip_ids = ADRENO_CHIP_IDS(0x04000500), .family = ADRENO_4XX, .revn = 405, @@ -149,7 +159,12 @@ static const struct adreno_info gpulist[] = { .gmem = (SZ_1M + SZ_512K), .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a4xx_gpu_init, - }, { + } +}; +DECLARE_ADRENO_GPULIST(a4xx); + +static const struct adreno_info a5xx_gpus[] = { + { .chip_ids = ADRENO_CHIP_IDS(0x05000600), .family = ADRENO_5XX, .revn = 506, @@ -274,7 +289,12 @@ static const struct adreno_info gpulist[] = { .quirks = ADRENO_QUIRK_LMLOADKILL_DISABLE, .init = a5xx_gpu_init, .zapfw = "a540_zap.mdt", - }, { + } +}; +DECLARE_ADRENO_GPULIST(a5xx); + +static const struct adreno_info a6xx_gpus[] = { + { .chip_ids = ADRENO_CHIP_IDS(0x0601), .family = ADRENO_6XX_GEN1, .revn = 610, @@ -520,7 +540,12 @@ static const struct adreno_info gpulist[] = { .zapfw = "a690_zap.mdt", .hwcg = a690_hwcg, .address_space_size = SZ_16G, - }, { + } +}; +DECLARE_ADRENO_GPULIST(a6xx); + +static const struct adreno_info a7xx_gpus[] = { + { .chip_ids = ADRENO_CHIP_IDS(0x07000200), .family = ADRENO_6XX_GEN1, /* NOT a mistake! */ .fw = { @@ -582,7 +607,17 @@ static const struct adreno_info gpulist[] = { .init = a6xx_gpu_init, .zapfw = "gen70900_zap.mbn", .address_space_size = SZ_16G, - }, + } +}; +DECLARE_ADRENO_GPULIST(a7xx); + +static const struct adreno_gpulist *gpulists[] = { + &a2xx_gpulist, + &a3xx_gpulist, + &a4xx_gpulist, + &a5xx_gpulist, + &a6xx_gpulist, + &a6xx_gpulist, }; MODULE_FIRMWARE("qcom/a300_pm4.fw"); @@ -617,13 +652,17 @@ MODULE_FIRMWARE("qcom/yamato_pm4.fw"); static const struct adreno_info *adreno_info(uint32_t chip_id) { /* identify gpu: */ - for (int i = 0; i < ARRAY_SIZE(gpulist); i++) { - const struct adreno_info *info = &gpulist[i]; - if (info->machine && !of_machine_is_compatible(info->machine)) - continue; - for (int j = 0; info->chip_ids[j]; j++) - if (info->chip_ids[j] == chip_id) - return info; + for (int i = 0; i < ARRAY_SIZE(gpulists); i++) { + for (int j = 0; j < gpulists[i]->gpus_count; j++) { + const struct adreno_info *info = &gpulists[i]->gpus[j]; + + if (info->machine && !of_machine_is_compatible(info->machine)) + continue; + + for (int k = 0; info->chip_ids[k]; k++) + if (info->chip_ids[k] == chip_id) + return info; + } } return NULL; diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_
[PATCH v4 0/5] drm/msm/adreno: Introduce/rework device hw catalog
From: Rob Clark Split the single flat gpulist table into per-gen tables that exist in their own per-gen files, and start moving more info into the device table. This at least gets all the big tables of register settings out of the heart of the a6xx_gpu code. Probably more could be moved, to remove at least some of the per-gen if/else ladders, but this seemed like a reasonably good start. v2: Drop sentinel table entries v3: Fix typo v4: More const, fix missing a702 protect regs Rob Clark (5): drm/msm/adreno: Split up giant device table drm/msm/adreno: Split catalog into separate files drm/msm/adreno: Move hwcg regs to a6xx hw catalog drm/msm/adreno: Move hwcg table into a6xx specific info drm/msm/adreno: Move CP_PROTECT settings to hw catalog drivers/gpu/drm/msm/Makefile |5 + drivers/gpu/drm/msm/adreno/a2xx_catalog.c | 52 + drivers/gpu/drm/msm/adreno/a3xx_catalog.c | 81 ++ drivers/gpu/drm/msm/adreno/a4xx_catalog.c | 50 + drivers/gpu/drm/msm/adreno/a5xx_catalog.c | 148 +++ drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 1240 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 880 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 11 + drivers/gpu/drm/msm/adreno/adreno_device.c | 624 +- drivers/gpu/drm/msm/adreno/adreno_gpu.h| 32 +- 10 files changed, 1649 insertions(+), 1474 deletions(-) create mode 100644 drivers/gpu/drm/msm/adreno/a2xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a3xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a4xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_catalog.c create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_catalog.c -- 2.45.2
Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers
On Tue, Jun 04, 2024 at 07:35:04PM +0200, Konrad Dybcio wrote: > > > On 5/14/24 20:38, Akhil P Oommen wrote: > > On Wed, May 08, 2024 at 07:46:31PM +0200, Konrad Dybcio wrote: > > > Memory barriers help ensure instruction ordering, NOT time and order > > > of actual write arrival at other observers (e.g. memory-mapped IP). > > > On architectures employing weak memory ordering, the latter can be a > > > giant pain point, and it has been as part of this driver. > > > > > > Moreover, the gpu_/gmu_ accessors already use non-relaxed versions of > > > readl/writel, which include r/w (respectively) barriers. > > > > > > Replace the barriers with a readback that ensures the previous writes > > > have exited the write buffer (as the CPU must flush the write to the > > > register it's trying to read back) and subsequently remove the hack > > > introduced in commit b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt > > > status in hw_init"). > > > > > > Fixes: b77532803d11 ("drm/msm/a6xx: Poll for GBIF unhalt status in > > > hw_init") > > > Signed-off-by: Konrad Dybcio > > > --- > > > drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 5 ++--- > > > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 -- > > > 2 files changed, 6 insertions(+), 13 deletions(-) > > > > I prefer this version compared to the v2. A helper routine is > > unnecessary here because: > > 1. there are very few scenarios where we have to read back the same > > register. > > 2. we may accidently readback a write only register. > > Which would still trigger an address dependency on the CPU, no? Yes, but it is not a good idea to read a write-only register. We can't be sure about its effect on the endpoint. > > > > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > > > b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > > > index 0e3dfd4c2bc8..4135a53b55a7 100644 > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c > > > @@ -466,9 +466,8 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu) > > > int ret; > > > u32 val; > > > - gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 1 << 1); > > > - /* Wait for the register to finish posting */ > > > - wmb(); > > > + gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, BIT(1)); > > > + gmu_read(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ); > > > > This is unnecessary because we are polling on a register on the same port > > below. But I think we > > can replace "wmb()" above with "mb()" to avoid reordering between read > > and write IO instructions. > > Ok on the dropping readback part > > + AFAIU from Will's response, we can drop the barrier as well Lets wait a bit on Will's response on compiler reordering. > > > > > > ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_RSCC_CONTROL_ACK, val, > > > val & (1 << 1), 100, 1); > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > > > b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > > > index 973872ad0474..0acbc38b8e70 100644 > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c > > > @@ -1713,22 +1713,16 @@ static int hw_init(struct msm_gpu *gpu) > > > } > > > /* Clear GBIF halt in case GX domain was not collapsed */ > > > + gpu_write(gpu, REG_A6XX_GBIF_HALT, 0); > > > > We need a full barrier here to avoid reordering. Also, lets add a > > comment about why we are doing this odd looking sequence. > > > > > + gpu_read(gpu, REG_A6XX_GBIF_HALT); > > > if (adreno_is_a619_holi(adreno_gpu)) { > > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0); > > > gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0); > > > - /* Let's make extra sure that the GPU can access the memory.. */ > > > - mb(); > > > > We need a full barrier here. > > > > > + gpu_read(gpu, REG_A6XX_RBBM_GPR0_CNTL); > > > } else if (a6xx_has_gbif(adreno_gpu)) { > > > - gpu_write(gpu, REG_A6XX_GBIF_HALT, 0); > > > gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0); > > > - /* Let's make extra sure that the GPU can access the memory.. */ > > > - mb(); > > > > We need a full barrier here. > > Not sure we do between REG_A6XX_GBIF_HALT & > REG_A6XX_RBBM_(GBIF_HALT/GPR0_CNTL), > but I suppose keeping the one after REG_A6XX_RBBM_(GBIF_HALT/GPR0_CNTL) makes > sense to avoid the possibility of configuring the GPU before it can access > DRAM.. Techinically, I think we don't need a barrier or the below read back. Because the above write is ordered with the write (on CP_CNTL reg) which finally triggers CP INIT later. GPU won't access memory before CP INIT. > > > > > > + gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT); > > > } > > > - /* Some GPUs are stubborn and take their sweet time to unhalt GBIF! */ > > > - if (adreno_is_a7xx(adreno_gpu) && a6xx_has_gbif(adreno_gpu)) > > > - spin_until(!gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK)); > > > - > > > > Why i
Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info
On Tue, Jun 18, 2024 at 1:30 AM Dmitry Baryshkov wrote: > > On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote: > > From: Rob Clark > > > > Introduce a6xx_info where we can stash gen specific stuff without > > polluting the toplevel adreno_info struct. > > > > Signed-off-by: Rob Clark > > --- > > drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +-- > > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-- > > drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 9 > > drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++- > > 4 files changed, 67 insertions(+), 19 deletions(-) > > > > Reviewed-by: Dmitry Baryshkov > > > > @@ -98,7 +100,9 @@ struct adreno_info { > > struct msm_gpu *(*init)(struct drm_device *dev); > > const char *zapfw; > > u32 inactive_period; > > - const struct adreno_reglist *hwcg; > > + union { > > + const struct a6xx_info *a6xx; > > + }; > > u64 address_space_size; > > /** > >* @speedbins: Optional table of fuse to speedbin mappings > > My preference would be towards wrapping the adreno_gpu, but that would > require more significant rework of the driver. Let's see if we can get > to that later. > yeah, it was going to be more re-work, and I'm neck deep in gpuvm/vm_bind.. I just wanted to land this since it is a pita (and error prone) to rebase as more gpu's get added ;-) It isn't entirely unlike how we handle gpu gen specific options in mesa, where we have a somewhat bigger set of options, so I wouldn't say that this approach was worse than extending adreno_info.. just different.. BR, -R
Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers
On Tue, Jun 04, 2024 at 03:40:56PM +0100, Will Deacon wrote: > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote: > > On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote: > > > On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote: > > > > If I understand correctly, you don't need any memory barrier. > > > > writel()/readl()'s are ordered to the same endpoint. That goes for all > > > > the reordering/barrier comments mentioned below too. > > > > > > > > device-io.rst: > > > > > > > > The read and write functions are defined to be ordered. That is the > > > > compiler is not permitted to reorder the I/O sequence. When the > > > > ordering > > > > can be compiler optimised, you can use __readb() and friends to > > > > indicate the relaxed ordering. Use this with care. > > > > > > > > memory-barriers.txt: > > > > > > > > (*) readX(), writeX(): > > > > > > > > The readX() and writeX() MMIO accessors take a pointer to > > > > the > > > > peripheral being accessed as an __iomem * parameter. For > > > > pointers > > > > mapped with the default I/O attributes (e.g. those returned > > > > by > > > > ioremap()), the ordering guarantees are as follows: > > > > > > > > 1. All readX() and writeX() accesses to the same peripheral > > > > are ordered > > > >with respect to each other. This ensures that MMIO > > > > register accesses > > > >by the same CPU thread to a particular device will > > > > arrive in program > > > >order. > > > > > > > > > > In arm64, a writel followed by readl translates to roughly the following > > > sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not > > > sure what is stopping compiler from reordering __raw_writel() and > > > __raw_readl() > > > above? I am assuming iomem cookie is ignored during compilation. > > > > It seems to me that is due to some usage of volatile there in > > __raw_writel() etc, but to be honest after reading about volatile and > > some threads from gcc mailing lists, I don't have a confident answer :) > > > > > > > > Added Will to this thread if he can throw some light on this. > > > > Hopefully Will can school us. > > The ordering in this case is ensured by the memory attributes used for > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes > (as it the case for ioremap()), the "nR" part means "no reordering", so > readX() and writeX() to that region are ordered wrt each other. But that avoids only HW reordering, doesn't it? What about *compiler reordering* in the case of a writel following by a readl which translates to: 1: dmb_wmb() 2: __raw_writel() -> roughly "asm volatile('str') 3: __raw_readl() -> roughly "asm volatile('ldr') 4: dmb_rmb() Is the 'volatile' keyword sufficient to avoid reordering between (2) and (3)? Or do we need a "memory" clobber to inhibit reordering? This is still not clear to me even after going through some compiler documentions. -Akhil. > > Note that guarantee _doesn't_ apply to other flavours of ioremap(), so > e.g. ioremap_wc() won't give you the ordering. > > Hope that helps, > > Will
Re: [PATCH] drm/msm/adreno: De-spaghettify the use of memory barriers
On Thu, Jun 06, 2024 at 02:03:24PM +0200, Konrad Dybcio wrote: > On 4.06.2024 4:40 PM, Will Deacon wrote: > > On Thu, May 16, 2024 at 01:55:26PM -0500, Andrew Halaney wrote: > >> On Thu, May 16, 2024 at 08:20:05PM GMT, Akhil P Oommen wrote: > >>> On Thu, May 16, 2024 at 08:15:34AM -0500, Andrew Halaney wrote: > If I understand correctly, you don't need any memory barrier. > writel()/readl()'s are ordered to the same endpoint. That goes for all > the reordering/barrier comments mentioned below too. > > device-io.rst: > > The read and write functions are defined to be ordered. That is the > compiler is not permitted to reorder the I/O sequence. When the > ordering > can be compiler optimised, you can use __readb() and friends to > indicate the relaxed ordering. Use this with care. > > memory-barriers.txt: > > (*) readX(), writeX(): > > The readX() and writeX() MMIO accessors take a pointer to the > peripheral being accessed as an __iomem * parameter. For pointers > mapped with the default I/O attributes (e.g. those returned by > ioremap()), the ordering guarantees are as follows: > > 1. All readX() and writeX() accesses to the same peripheral are > ordered > with respect to each other. This ensures that MMIO register > accesses > by the same CPU thread to a particular device will arrive in > program > order. > > >>> > >>> In arm64, a writel followed by readl translates to roughly the following > >>> sequence: dmb_wmb(), __raw_writel(), __raw_readl(), dmb_rmb(). I am not > >>> sure what is stopping compiler from reordering __raw_writel() and > >>> __raw_readl() > >>> above? I am assuming iomem cookie is ignored during compilation. > >> > >> It seems to me that is due to some usage of volatile there in > >> __raw_writel() etc, but to be honest after reading about volatile and > >> some threads from gcc mailing lists, I don't have a confident answer :) > >> > >>> > >>> Added Will to this thread if he can throw some light on this. > >> > >> Hopefully Will can school us. > > > > The ordering in this case is ensured by the memory attributes used for > > ioremap(). When an MMIO region is mapped using Device-nGnRE attributes > > (as it the case for ioremap()), the "nR" part means "no reordering", so > > readX() and writeX() to that region are ordered wrt each other. > > > > Note that guarantee _doesn't_ apply to other flavours of ioremap(), so > > e.g. ioremap_wc() won't give you the ordering. > > > > Hope that helps, > > Just to make sure I'm following, would mapping things as nGnRnE effectively > get rid of write buffering, perhaps being a way of debugging whether that > in particular is causing issues (at the cost of speed)? I think the "nE" part is just a hint, so it will depend on how the hardware has been built. On top of that, you'll still need something like a DSB to force the CPU to wait for the write response. Will
Re: [PATCH v3 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog
On 6/18/24 00:51, Rob Clark wrote: From: Rob Clark Move the CP_PROTECT settings into the hw catalog. Signed-off-by: Rob Clark --- I think a702 was skipped over Konrad
Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info
On 6/18/24 00:51, Rob Clark wrote: From: Rob Clark Introduce a6xx_info where we can stash gen specific stuff without polluting the toplevel adreno_info struct. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +-- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-- drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 9 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++- 4 files changed, 67 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c index bcc2f4d8cfc6..96d93251fdd6 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c @@ -7,6 +7,7 @@ */ #include "adreno_gpu.h" +#include "a6xx_gpu.h" #include "a6xx.xml.h" #include "a6xx_gmu.xml.h" @@ -465,7 +466,9 @@ static const struct adreno_info a6xx_gpus[] = { .inactive_period = DRM_MSM_INACTIVE_PERIOD, .init = a6xx_gpu_init, .zapfw = "a610_zap.mdt", - .hwcg = a612_hwcg, + .a6xx = &(struct a6xx_info) { const Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v3 3/5] drm/msm/adreno: Move hwcg regs to a6xx hw catalog
On 6/18/24 00:51, Rob Clark wrote: From: Rob Clark Move the hwcg tables into the hw catalog. Signed-off-by: Rob Clark --- Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v3 2/5] drm/msm/adreno: Split catalog into separate files
On 6/18/24 00:51, Rob Clark wrote: From: Rob Clark Split each gen's gpu table into it's own file. Only code-motion, no functional change. Signed-off-by: Rob Clark --- Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v3 1/5] drm/msm/adreno: Split up giant device table
On 6/18/24 00:51, Rob Clark wrote: From: Rob Clark Split into a separate table per generation, in preparation to move each gen's device table to it's own file. Signed-off-by: Rob Clark --- Reviewed-by: Konrad Dybcio Konrad
Re: [PATCH v3 4/5] drm/msm/adreno: Move hwcg table into a6xx specific info
On Mon, Jun 17, 2024 at 03:51:14PM GMT, Rob Clark wrote: > From: Rob Clark > > Introduce a6xx_info where we can stash gen specific stuff without > polluting the toplevel adreno_info struct. > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 65 +-- > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-- > drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 9 > drivers/gpu/drm/msm/adreno/adreno_gpu.h | 6 ++- > 4 files changed, 67 insertions(+), 19 deletions(-) > Reviewed-by: Dmitry Baryshkov > @@ -98,7 +100,9 @@ struct adreno_info { > struct msm_gpu *(*init)(struct drm_device *dev); > const char *zapfw; > u32 inactive_period; > - const struct adreno_reglist *hwcg; > + union { > + const struct a6xx_info *a6xx; > + }; > u64 address_space_size; > /** >* @speedbins: Optional table of fuse to speedbin mappings My preference would be towards wrapping the adreno_gpu, but that would require more significant rework of the driver. Let's see if we can get to that later. -- With best wishes Dmitry
Re: [PATCH v3 5/5] drm/msm/adreno: Move CP_PROTECT settings to hw catalog
On Mon, Jun 17, 2024 at 03:51:15PM GMT, Rob Clark wrote: > From: Rob Clark > > Move the CP_PROTECT settings into the hw catalog. > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/adreno/a6xx_catalog.c | 247 + > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 257 +- > drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 2 + > drivers/gpu/drm/msm/adreno/adreno_gpu.h | 13 ++ > 4 files changed, 268 insertions(+), 251 deletions(-) > Reviewed-by: Dmitry Baryshkov -- With best wishes Dmitry