Re: [Freedreno] [PATCH v7 01/15] dma-buf/dma-fence: Add deadline awareness

2023-02-28 Thread Bagas Sanjaya
On Mon, Feb 27, 2023 at 11:35:07AM -0800, Rob Clark wrote:
> diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
> index 622b8156d212..183e480d8cea 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -164,6 +164,12 @@ DMA Fence Signalling Annotations
>  .. kernel-doc:: drivers/dma-buf/dma-fence.c
>     :doc: fence signalling annotation
>  
> +DMA Fence Deadline Hints
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> +   :doc: deadline hints
> +
>  DMA Fences Functions Reference
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0de0482cd36e..e103e821d993 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -912,6 +912,65 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
>  }
>  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  
> +/**
> + * DOC: deadline hints
> + *
> + * In an ideal world, it would be possible to pipeline a workload sufficiently
> + * that a utilization based device frequency governor could arrive at a minimum
> + * frequency that meets the requirements of the use-case, in order to minimize
> + * power consumption.  But in the real world there are many workloads which
> + * defy this ideal.  For example, but not limited to:
> + *
> + * * Workloads that ping-pong between device and CPU, with alternating periods
> + *   of CPU waiting for device, and device waiting on CPU.  This can result in
> + *   devfreq and cpufreq seeing idle time in their respective domains and, as
> + *   a result, reducing frequency.
> + *
> + * * Workloads that interact with a periodic time based deadline, such as double
> + *   buffered GPU rendering vs vblank sync'd page flipping.  In this scenario,
> + *   missing a vblank deadline results in an *increase* in idle time on the GPU
> + *   (since it has to wait an additional vblank period), sending a signal to
> + *   the GPU's devfreq to reduce frequency, when in fact the opposite is what is
> + *   needed.
> + *
> + * To this end, deadline hint(s) can be set on a &dma_fence via
> + * &dma_fence_set_deadline.  The deadline hint provides a way for the waiting
> + * driver, or userspace, to convey an appropriate sense of urgency to the
> + * signaling driver.
> + *
> + * A deadline hint is given in absolute ktime (CLOCK_MONOTONIC for userspace
> + * facing APIs).  The time could either be some point in the future (such as
> + * the vblank based deadline for page-flipping, or the start of a compositor's
> + * composition cycle), or the current time to indicate an immediate deadline
> + * hint (i.e. forward progress cannot be made until this fence is signaled).
> + *
> + * Multiple deadlines may be set on a given fence, even in parallel.  See the
> + * documentation for &dma_fence_ops.set_deadline.
> + *
> + * The deadline hint is just that, a hint.  The driver that created the fence
> + * may react by increasing frequency, making different scheduling choices, etc.
> + * Or doing nothing at all.
> + */
> +
> +/**
> + * dma_fence_set_deadline - set desired fence-wait deadline hint
> + * @fence:    the fence that is to be waited on
> + * @deadline: the time by which the waiter hopes for the fence to be
> + *            signaled
> + *
> + * Give the fence signaler a hint about an upcoming deadline, such as
> + * vblank, by which point the waiter would prefer the fence to be
> + * signaled.  This is intended to give feedback to the fence signaler
> + * to aid in power management decisions, such as boosting GPU frequency
> + * if a periodic vblank deadline is approaching but the fence is not
> + * yet signaled.
> + */
> +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> +{
> +	if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
> +		fence->ops->set_deadline(fence, deadline);
> +}
> +EXPORT_SYMBOL(dma_fence_set_deadline);
> +
>  /**
>   * dma_fence_describe - Dump fence describtion into seq_file
>   * @fence: the 6fence to describe
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 775cdc0b4f24..87c0d846dbb4 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -257,6 +257,24 @@ struct dma_fence_ops {
>*/
>   void (*timeline_value_str)(struct dma_fence *fence,
>  char *str, int size);
> +
> + /**
> +  * @set_deadline:
> +  *
> +  * Callback to allow a fence waiter to inform the fence signaler of
> +  * an upcoming deadline, such as vblank, by which point the waiter
> +  * would prefer the fence to be signaled.  This is intended to
> +  * give feedback to the fence signaler to aid in power management
> +  * decisions, such as boosting GPU frequency.
> +  *
> +  * This is called without &dma_fence.lock held, it can be called
>
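
For readers following the quoted kernel-doc, here is a minimal userspace sketch of the behaviour it describes: the hint is forwarded only if the signaler implements the callback and the fence has not signaled yet, and several hints may be set on one fence. The `model_*` names, the nanosecond integer standing in for ktime_t, and the "keep the earliest hint" policy are assumptions made for illustration; they are not taken from the patch, and a real driver is free to react differently or not at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_t: absolute time in nanoseconds. */
typedef int64_t model_ktime_t;

struct model_fence;

struct model_fence_ops {
	/* Optional: a signaler that ignores hints leaves this NULL. */
	void (*set_deadline)(struct model_fence *fence, model_ktime_t deadline);
};

/* Hypothetical stand-in for struct dma_fence. */
struct model_fence {
	const struct model_fence_ops *ops;
	bool signaled;
	model_ktime_t earliest_deadline; /* bookkeeping for this sketch only */
	unsigned int hints_seen;
};

/* Same shape as dma_fence_set_deadline() in the quoted patch. */
static void model_fence_set_deadline(struct model_fence *fence,
				     model_ktime_t deadline)
{
	assert(fence != NULL && fence->ops != NULL);

	if (fence->ops->set_deadline && !fence->signaled)
		fence->ops->set_deadline(fence, deadline);
}

/* Example signaler callback (assumed policy): remember the most urgent hint. */
static void model_track_earliest(struct model_fence *fence,
				 model_ktime_t deadline)
{
	if (fence->hints_seen == 0 || deadline < fence->earliest_deadline)
		fence->earliest_deadline = deadline;
	fence->hints_seen++;
}

static const struct model_fence_ops model_ops_with_hint = {
	.set_deadline = model_track_earliest,
};

static const struct model_fence_ops model_ops_without_hint = {
	.set_deadline = NULL,
};
```

A waiter with an immediate deadline would pass the current time, as the quoted text suggests; a fence whose ops leave the callback NULL simply ignores the hint.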

Re: [Freedreno] [PATCH v4 3/4] drm/msm/dpu: Remove empty prepare_commit() function

2023-02-28 Thread Dmitry Baryshkov

On 21/02/2023 20:42, Jessica Zhang wrote:

Now that the TE setup has been moved to prepare_for_kickoff(), we have
no prepare_commit() callbacks left. This makes dpu_encoder_prepare_commit()
do nothing. Remove prepare_commit() from the DPU driver.

Changes in V3:
- Reworded commit message to be more clear
- Corrected spelling mistake in commit message

Changes in V4:
- Reworded commit message for clarity

Signed-off-by: Jessica Zhang 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 19 ---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |  7 ---
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 21 -
  3 files changed, 47 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry



Re: [Freedreno] [PATCH v4 1/4] drm/msm/dpu: Move TE setup to prepare_for_kickoff()

2023-02-28 Thread Dmitry Baryshkov

On 21/02/2023 20:42, Jessica Zhang wrote:

Currently, DPU will enable TE during prepare_commit(). However, this
will cause a crash and reboot to sahara when trying to read/write a
register in get_autorefresh_config(), because the core clock rates
aren't set at that time.

This used to work because phys_enc->hw_pp is only initialized in mode
set [1], so the first prepare_commit() will return before any register
read/write as hw_pp would be NULL.

However, when we try to implement support for INTF TE, we will run into
the clock issue described above as hw_intf will *not* be NULL on the
first prepare_commit(). This is because the initialization of
dpu_enc->hw_intf has been moved to dpu_encoder_setup() [2].

To avoid this issue, let's enable TE during prepare_for_kickoff()
instead as the core clock rates are guaranteed to be set then.

Depends on: "Implement tearcheck support on INTF block" [3]

Changes in V3:
- Added function prototypes
- Reordered function definitions to make change more legible
- Removed prepare_commit() function from dpu_encoder_phys_cmd

Changes in V4:
- Reworded commit message to be more specific
- Removed dpu_encoder_phys_cmd_is_ongoing_pptx() prototype

[1] https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c#L1109
[2] https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c#L2339
[3] https://patchwork.freedesktop.org/series/112332/

Signed-off-by: Jessica Zhang 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry



Re: [Freedreno] [PATCH v7 00/15] dma-fence: Deadline awareness

2023-02-28 Thread Bagas Sanjaya
On 2/28/23 22:44, Rob Clark wrote:
> You can find my branch here:
> 
> https://gitlab.freedesktop.org/robclark/msm/-/commits/dma-fence/deadline
> 

Pulled, thanks!

-- 
An old man doll... just what I always wanted! - Clara



Re: [Freedreno] [PATCH v3] drm/msm/dp: check core_initialized flag at both host_init() and host_deinit()

2023-02-28 Thread Dmitry Baryshkov
On Wed, 1 Mar 2023 at 02:17, Kuogee Hsieh  wrote:
>
> There is a reboot/suspend test case where system suspend is forced
> during system boot. Since dp_display_host_init() of the external DP
> is executed in hpd thread context, this test case may create a
> scenario in which dp_display_host_deinit() from pm_suspend() runs
> before dp_display_host_init(), if the hpd thread has had no chance to
> run during boot by the time the suspend request is issued. In this
> scenario the system will crash on the aux register access in
> dp_display_host_deinit(), since the aux clock has not yet been enabled
> by dp_display_host_init(). Therefore we have to ensure the aux clock
> is enabled by checking the core_initialized flag before accessing aux
> registers at pm_suspend.

Can a call to dp_display_host_init() be moved from
dp_display_config_hpd() to dp_display_bind()?

Related question: what is the primary reason for having
EV_HPD_INIT_SETUP and calling dp_display_config_hpd() via the event
thread? Does DP driver really depend on DPU irqs being installed? As
far as I understand, DP device uses MDSS interrupts and those IRQs are
available and working at the time of dp_display_probe() /
dp_display_bind().

>
> Changes in v2:
> -- at commit text, dp_display_host_init() instead of host_init()
> -- at commit text, dp_display_host_deinit() instead of host_deinit()
>
> Changes in v3:
> -- re arrange to avoid commit text line over 75 chars
>
> Fixes: 989ebe7bc446 ("drm/msm/dp: do not initialize phy until plugin interrupt received")
> Signed-off-by: Kuogee Hsieh 
> Reviewed-by: Stephen Boyd 
> ---
>  drivers/gpu/drm/msm/dp/dp_display.c | 20 
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
> index bde1a7c..1850738 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -460,10 +460,12 @@ static void dp_display_host_init(struct dp_display_private *dp)
> dp->dp_display.connector_type, dp->core_initialized,
> dp->phy_initialized);
>
> -   dp_power_init(dp->power, false);
> -   dp_ctrl_reset_irq_ctrl(dp->ctrl, true);
> -   dp_aux_init(dp->aux);
> -   dp->core_initialized = true;
> +   if (!dp->core_initialized) {
> +   dp_power_init(dp->power, false);
> +   dp_ctrl_reset_irq_ctrl(dp->ctrl, true);
> +   dp_aux_init(dp->aux);
> +   dp->core_initialized = true;
> +   }
>  }
>
>  static void dp_display_host_deinit(struct dp_display_private *dp)
> @@ -472,10 +474,12 @@ static void dp_display_host_deinit(struct dp_display_private *dp)
> dp->dp_display.connector_type, dp->core_initialized,
> dp->phy_initialized);
>
> -   dp_ctrl_reset_irq_ctrl(dp->ctrl, false);
> -   dp_aux_deinit(dp->aux);
> -   dp_power_deinit(dp->power);
> -   dp->core_initialized = false;
> +   if (dp->core_initialized) {
> +   dp_ctrl_reset_irq_ctrl(dp->ctrl, false);
> +   dp_aux_deinit(dp->aux);
> +   dp_power_deinit(dp->power);
> +   dp->core_initialized = false;
> +   }
>  }
>
>  static int dp_display_usbpd_configure_cb(struct device *dev)
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


-- 
With best wishes
Dmitry


[Freedreno] [PATCH v3] drm/msm/dp: check core_initialized flag at both host_init() and host_deinit()

2023-02-28 Thread Kuogee Hsieh
There is a reboot/suspend test case where system suspend is forced
during system boot. Since dp_display_host_init() of the external DP
is executed in hpd thread context, this test case may create a
scenario in which dp_display_host_deinit() from pm_suspend() runs
before dp_display_host_init(), if the hpd thread has had no chance to
run during boot by the time the suspend request is issued. In this
scenario the system will crash on the aux register access in
dp_display_host_deinit(), since the aux clock has not yet been enabled
by dp_display_host_init(). Therefore we have to ensure the aux clock
is enabled by checking the core_initialized flag before accessing aux
registers at pm_suspend.

Changes in v2:
-- at commit text, dp_display_host_init() instead of host_init()
-- at commit text, dp_display_host_deinit() instead of host_deinit()

Changes in v3:
-- re arrange to avoid commit text line over 75 chars

Fixes: 989ebe7bc446 ("drm/msm/dp: do not initialize phy until plugin interrupt received")
Signed-off-by: Kuogee Hsieh 
Reviewed-by: Stephen Boyd 
---
 drivers/gpu/drm/msm/dp/dp_display.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index bde1a7c..1850738 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -460,10 +460,12 @@ static void dp_display_host_init(struct dp_display_private *dp)
dp->dp_display.connector_type, dp->core_initialized,
dp->phy_initialized);
 
-   dp_power_init(dp->power, false);
-   dp_ctrl_reset_irq_ctrl(dp->ctrl, true);
-   dp_aux_init(dp->aux);
-   dp->core_initialized = true;
+   if (!dp->core_initialized) {
+   dp_power_init(dp->power, false);
+   dp_ctrl_reset_irq_ctrl(dp->ctrl, true);
+   dp_aux_init(dp->aux);
+   dp->core_initialized = true;
+   }
 }
 
 static void dp_display_host_deinit(struct dp_display_private *dp)
@@ -472,10 +474,12 @@ static void dp_display_host_deinit(struct dp_display_private *dp)
dp->dp_display.connector_type, dp->core_initialized,
dp->phy_initialized);
 
-   dp_ctrl_reset_irq_ctrl(dp->ctrl, false);
-   dp_aux_deinit(dp->aux);
-   dp_power_deinit(dp->power);
-   dp->core_initialized = false;
+   if (dp->core_initialized) {
+   dp_ctrl_reset_irq_ctrl(dp->ctrl, false);
+   dp_aux_deinit(dp->aux);
+   dp_power_deinit(dp->power);
+   dp->core_initialized = false;
+   }
 }
 
 static int dp_display_usbpd_configure_cb(struct device *dev)
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
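
The guard this patch adds can be sketched in plain userspace C. This is only an illustrative model under stated assumptions: `struct model_dp`, its counters, and the boolean return value of the deinit helper are hypothetical and stand in for the real power, irq and aux calls in dp_display.c. It shows that a deinit arriving before any init becomes a no-op, and that a repeated init does not enable the clock twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of dp_display_private used here. */
struct model_dp {
	bool core_initialized;
	int clock_enables;     /* how many times the "aux clock" is enabled */
	int register_accesses; /* accesses made while the clock is enabled */
};

static void model_host_init(struct model_dp *dp)
{
	assert(dp != NULL);

	if (dp->core_initialized)
		return; /* already up: do not enable twice */

	dp->clock_enables++;
	dp->register_accesses++;
	dp->core_initialized = true;
}

/* Returns false when it skipped because the core was never initialized. */
static bool model_host_deinit(struct model_dp *dp)
{
	assert(dp != NULL);

	if (!dp->core_initialized)
		return false; /* clock is off: registers must not be touched */

	dp->register_accesses++;
	dp->clock_enables--;
	dp->core_initialized = false;
	return true;
}
```

In the suspend-during-boot case described in the commit text, the first call is the deinit; in this model it returns false without touching any register.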



Re: [Freedreno] [PATCH v2 11/50] drm/msm/dpu: drop DPU_DIM_LAYER from MIXER_MSM8998_MASK

2023-02-28 Thread Abhinav Kumar




On 2/11/2023 3:12 PM, Dmitry Baryshkov wrote:

The msm8998 doesn't seem to support DIM_LAYER, so drop it from
the supported features mask.

Fixes: 2d8a4edb672d ("drm/msm/dpu: use feature bit for LM combined alpha check")
Fixes: 94391a14fc27 ("drm/msm/dpu1: Add MSM8998 to hw catalog")
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Abhinav Kumar 


[Freedreno] [PATCH v8 16/16] drm/i915: Add deadline based boost support

2023-02-28 Thread Rob Clark
From: Rob Clark 

v2: rebase

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/i915/i915_request.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7503dcb9043b..44491e7e214c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
return i915_request_enable_breadcrumb(to_request(fence));
 }
 
+static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   struct i915_request *rq = to_request(fence);
+
+   if (i915_request_completed(rq))
+   return;
+
+   if (i915_request_started(rq))
+   return;
+
+   /*
+* TODO something more clever for deadlines that are in the
+* future.  I think probably track the nearest deadline in
+* rq->timeline and set timer to trigger boost accordingly?
+*/
+
+   intel_rps_boost(rq);
+}
+
 static signed long i915_fence_wait(struct dma_fence *fence,
   bool interruptible,
   signed long timeout)
@@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
.signaled = i915_fence_signaled,
.wait = i915_fence_wait,
.release = i915_fence_release,
+   .set_deadline = i915_fence_set_deadline,
 };
 
 static void irq_execute_cb(struct irq_work *wrk)
-- 
2.39.1



Re: [Freedreno] [PATCH v7 10/15] drm/vblank: Add helper to get next vblank time

2023-02-28 Thread Mario Kleiner
LGTM. This one is

Reviewed-by: Mario Kleiner 

-mario


On Mon, Feb 27, 2023 at 8:36 PM Rob Clark  wrote:

> From: Rob Clark 
>
> Will be used in the next commit to set a deadline on fences that an
> atomic update is waiting on.
>
> v2: Calculate time at *start* of vblank period, not end
> v3: Fix kbuild complaints
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/drm_vblank.c | 53 ++--
>  include/drm/drm_vblank.h |  1 +
>  2 files changed, 45 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index 2ff31717a3de..299fa2a19a90 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -844,10 +844,9 @@ bool drm_crtc_vblank_helper_get_vblank_timestamp(struct drm_crtc *crtc,
>  EXPORT_SYMBOL(drm_crtc_vblank_helper_get_vblank_timestamp);
>
>  /**
> - * drm_get_last_vbltimestamp - retrieve raw timestamp for the most recent
> - * vblank interval
> - * @dev: DRM device
> - * @pipe: index of CRTC whose vblank timestamp to retrieve
> + * drm_crtc_get_last_vbltimestamp - retrieve raw timestamp for the most
> + *  recent vblank interval
> + * @crtc: CRTC whose vblank timestamp to retrieve
>   * @tvblank: Pointer to target time which should receive the timestamp
>   * @in_vblank_irq:
>   * True when called from drm_crtc_handle_vblank().  Some drivers
> @@ -865,10 +864,9 @@ EXPORT_SYMBOL(drm_crtc_vblank_helper_get_vblank_timestamp);
>   * True if timestamp is considered to be very precise, false otherwise.
>   */
>  static bool
> -drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> - ktime_t *tvblank, bool in_vblank_irq)
> +drm_crtc_get_last_vbltimestamp(struct drm_crtc *crtc, ktime_t *tvblank,
> +  bool in_vblank_irq)
>  {
> -   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
> bool ret = false;
>
> /* Define requested maximum error on timestamps (nanoseconds). */
> @@ -876,8 +874,6 @@ drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
>
> /* Query driver if possible and precision timestamping enabled. */
> if (crtc && crtc->funcs->get_vblank_timestamp && max_error > 0) {
> -   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
> -
> ret = crtc->funcs->get_vblank_timestamp(crtc, &max_error,
> tvblank,
> in_vblank_irq);
> }
> @@ -891,6 +887,15 @@ drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> return ret;
>  }
>
> +static bool
> +drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
> + ktime_t *tvblank, bool in_vblank_irq)
> +{
> +   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
> +
> +   return drm_crtc_get_last_vbltimestamp(crtc, tvblank, in_vblank_irq);
> +}
> +
>  /**
>   * drm_crtc_vblank_count - retrieve "cooked" vblank counter value
>   * @crtc: which counter to retrieve
> @@ -980,6 +985,36 @@ u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
>  }
>  EXPORT_SYMBOL(drm_crtc_vblank_count_and_time);
>
> +/**
> + * drm_crtc_next_vblank_start - calculate the time of the next vblank
> + * @crtc: the crtc for which to calculate next vblank time
> + * @vblanktime: pointer to time to receive the next vblank timestamp.
> + *
> + * Calculate the expected time of the start of the next vblank period,
> + * based on time of previous vblank and frame duration
> + */
> +int drm_crtc_next_vblank_start(struct drm_crtc *crtc, ktime_t *vblanktime)
> +{
> +   unsigned int pipe = drm_crtc_index(crtc);
> +   struct drm_vblank_crtc *vblank = &crtc->dev->vblank[pipe];
> +   struct drm_display_mode *mode = &vblank->hwmode;
> +   u64 vblank_start;
> +
> +   if (!vblank->framedur_ns || !vblank->linedur_ns)
> +   return -EINVAL;
> +
> +   if (!drm_crtc_get_last_vbltimestamp(crtc, vblanktime, false))
> +   return -EINVAL;
> +
> +   vblank_start = DIV_ROUND_DOWN_ULL(
> +   (u64)vblank->framedur_ns * mode->crtc_vblank_start,
> +   mode->crtc_vtotal);
> +   *vblanktime  = ktime_add(*vblanktime, ns_to_ktime(vblank_start));
> +
> +   return 0;
> +}
> +EXPORT_SYMBOL(drm_crtc_next_vblank_start);
> +
>  static void send_vblank_event(struct drm_device *dev,
> struct drm_pending_vblank_event *e,
> u64 seq, ktime_t now)
> diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
> index 733a3e2d1d10..7f3957943dd1 100644
> --- a/include/drm/drm_vblank.h
> +++ b/include/drm/drm_vblank.h
> @@ -230,6 +230,7 @@ bool drm_dev_has_vblank(const struct drm_device *dev);
>  u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
>  u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
>   
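
The arithmetic in the quoted drm_crtc_next_vblank_start() can be checked with a small userspace sketch. It scales the frame duration by crtc_vblank_start / crtc_vtotal, rounds down, and adds the result to the last vblank timestamp. The `model_*` names, the use of plain nanosecond integers instead of ktime_t, and the -1 error value are assumptions for illustration; the kernel function returns -EINVAL and also checks linedur_ns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Offset added to the last vblank timestamp, mirroring the quoted code:
 * framedur_ns * crtc_vblank_start / crtc_vtotal, rounded down.
 */
static uint64_t model_vblank_start_offset_ns(uint64_t framedur_ns,
					     uint32_t crtc_vblank_start,
					     uint32_t crtc_vtotal)
{
	assert(crtc_vtotal != 0);
	return (framedur_ns * crtc_vblank_start) / crtc_vtotal;
}

/* Returns 0 and stores the estimate, or -1 when timing data is missing. */
static int model_next_vblank_start(int64_t last_vblank_ns,
				   uint64_t framedur_ns,
				   uint32_t crtc_vblank_start,
				   uint32_t crtc_vtotal,
				   int64_t *next_ns)
{
	assert(next_ns != NULL);

	if (framedur_ns == 0 || crtc_vtotal == 0)
		return -1;

	*next_ns = last_vblank_ns +
		(int64_t)model_vblank_start_offset_ns(framedur_ns,
						      crtc_vblank_start,
						      crtc_vtotal);
	return 0;
}
```

For a hypothetical mode with a 16666667 ns frame, crtc_vblank_start = 1080 and crtc_vtotal = 1125, the offset is 16666667 * 1080 / 1125 = 16000000 ns after rounding down.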

[Freedreno] [PATCH v8 15/16] drm/msm/atomic: Switch to vblank_start helper

2023-02-28 Thread Rob Clark
From: Rob Clark 

Drop our custom thing and switch to drm_crtc_next_vblank_start() for
calculating the time of the start of the next vblank period.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 15 ---
 drivers/gpu/drm/msm/msm_atomic.c|  8 +---
 drivers/gpu/drm/msm/msm_kms.h   |  8 
 3 files changed, 5 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index a683bd9b5a04..43996aecaf8c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -411,20 +411,6 @@ static void dpu_kms_disable_commit(struct msm_kms *kms)
pm_runtime_put_sync(&dpu_kms->pdev->dev);
 }
 
-static ktime_t dpu_kms_vsync_time(struct msm_kms *kms, struct drm_crtc *crtc)
-{
-   struct drm_encoder *encoder;
-
-   drm_for_each_encoder_mask(encoder, crtc->dev, crtc->state->encoder_mask) {
-   ktime_t vsync_time;
-
-   if (dpu_encoder_vsync_time(encoder, &vsync_time) == 0)
-   return vsync_time;
-   }
-
-   return ktime_get();
-}
-
 static void dpu_kms_prepare_commit(struct msm_kms *kms,
struct drm_atomic_state *state)
 {
@@ -953,7 +939,6 @@ static const struct msm_kms_funcs kms_funcs = {
.irq = dpu_core_irq,
.enable_commit   = dpu_kms_enable_commit,
.disable_commit  = dpu_kms_disable_commit,
-   .vsync_time  = dpu_kms_vsync_time,
.prepare_commit  = dpu_kms_prepare_commit,
.flush_commit= dpu_kms_flush_commit,
.wait_flush  = dpu_kms_wait_flush,
diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c
index 1686fbb611fd..c5e71c05f038 100644
--- a/drivers/gpu/drm/msm/msm_atomic.c
+++ b/drivers/gpu/drm/msm/msm_atomic.c
@@ -186,8 +186,7 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
struct msm_kms *kms = priv->kms;
struct drm_crtc *async_crtc = NULL;
unsigned crtc_mask = get_crtc_mask(state);
-   bool async = kms->funcs->vsync_time &&
-   can_do_async(state, &async_crtc);
+   bool async = can_do_async(state, &async_crtc);
 
trace_msm_atomic_commit_tail_start(async, crtc_mask);
 
@@ -231,7 +230,9 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
 
kms->pending_crtc_mask |= crtc_mask;
 
-   vsync_time = kms->funcs->vsync_time(kms, async_crtc);
+   if (drm_crtc_next_vblank_start(async_crtc, &vsync_time))
+   goto fallback;
+
wakeup_time = ktime_sub(vsync_time, ms_to_ktime(1));
 
msm_hrtimer_queue_work(&timer->work, wakeup_time,
@@ -253,6 +254,7 @@ void msm_atomic_commit_tail(struct drm_atomic_state *state)
return;
}
 
+fallback:
/*
 * If there is any async flush pending on updated crtcs, fold
 * them into the current flush.
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index f8ed7588928c..086a3f1ff956 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -59,14 +59,6 @@ struct msm_kms_funcs {
void (*enable_commit)(struct msm_kms *kms);
void (*disable_commit)(struct msm_kms *kms);
 
-   /**
-* If the kms backend supports async commit, it should implement
-* this method to return the time of the next vsync.  This is
-* used to determine a time slightly before vsync, for the async
-* commit timer to run and complete an async commit.
-*/
-   ktime_t (*vsync_time)(struct msm_kms *kms, struct drm_crtc *crtc);
-
/**
 * Prepare for atomic commit.  This is called after any previous
 * (async or otherwise) commit has completed.
-- 
2.39.1
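
A minimal sketch of the timing decision in msm_atomic_commit_tail() after this change, assuming nanosecond integers in place of ktime_t. The helper name and its boolean "vblank time available" input are hypothetical; the flag models drm_crtc_next_vblank_start() returning 0, and a false result corresponds to taking the fallback label.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NSEC_PER_MSEC 1000000LL

/*
 * Decide when the async-commit timer should fire.  Returns false when no
 * vblank time is available, in which case the caller is expected to take
 * the non-async path and *wakeup_time_ns is left untouched.
 */
static bool model_async_wakeup_time(bool have_vblank_time,
				    int64_t vsync_time_ns,
				    int64_t *wakeup_time_ns)
{
	assert(wakeup_time_ns != NULL);

	if (!have_vblank_time)
		return false;

	/* Wake up 1 ms before the start of vblank, as in the patch. */
	*wakeup_time_ns = vsync_time_ns - MODEL_NSEC_PER_MSEC;
	return true;
}
```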



[Freedreno] [PATCH v8 14/16] drm/msm: Add wait-boost support

2023-02-28 Thread Rob Clark
From: Rob Clark 

Add a way for various userspace waits to signal urgency.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.c | 12 
 drivers/gpu/drm/msm/msm_gem.c |  5 +
 include/uapi/drm/msm_drm.h| 14 --
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index aca48c868c14..f6764a86b2da 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -46,6 +46,7 @@
  * - 1.8.0 - Add MSM_BO_CACHED_COHERENT for supported GPUs (a6xx)
  * - 1.9.0 - Add MSM_SUBMIT_FENCE_SN_IN
  * - 1.10.0 - Add MSM_SUBMIT_BO_NO_IMPLICIT
+ * - 1.11.0 - Add wait boost (MSM_WAIT_FENCE_BOOST, MSM_PREP_BOOST)
  */
 #define MSM_VERSION_MAJOR  1
 #define MSM_VERSION_MINOR  10
@@ -899,7 +900,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 }
 
 static int wait_fence(struct msm_gpu_submitqueue *queue, uint32_t fence_id,
- ktime_t timeout)
+ ktime_t timeout, uint32_t flags)
 {
struct dma_fence *fence;
int ret;
@@ -929,6 +930,9 @@ static int wait_fence(struct msm_gpu_submitqueue *queue, uint32_t fence_id,
if (!fence)
return 0;
 
+   if (flags & MSM_WAIT_FENCE_BOOST)
+   dma_fence_set_deadline(fence, ktime_get());
+
ret = dma_fence_wait_timeout(fence, true, timeout_to_jiffies(&timeout));
if (ret == 0) {
ret = -ETIMEDOUT;
@@ -949,8 +953,8 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, void *data,
struct msm_gpu_submitqueue *queue;
int ret;
 
-   if (args->pad) {
-   DRM_ERROR("invalid pad: %08x\n", args->pad);
+   if (args->flags & ~MSM_WAIT_FENCE_FLAGS) {
+   DRM_ERROR("invalid flags: %08x\n", args->flags);
return -EINVAL;
}
 
@@ -961,7 +965,7 @@ static int msm_ioctl_wait_fence(struct drm_device *dev, void *data,
if (!queue)
return -ENOENT;
 
-   ret = wait_fence(queue, args->fence, to_ktime(args->timeout));
+   ret = wait_fence(queue, args->fence, to_ktime(args->timeout), args->flags);
 
msm_submitqueue_put(queue);
 
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 1dee0d18abbb..dd4a0d773f6e 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -846,6 +846,11 @@ int msm_gem_cpu_prep(struct drm_gem_object *obj, uint32_t op, ktime_t *timeout)
op & MSM_PREP_NOSYNC ? 0 : timeout_to_jiffies(timeout);
long ret;
 
+   if (op & MSM_PREP_BOOST) {
+   dma_resv_set_deadline(obj->resv, dma_resv_usage_rw(write),
+ ktime_get());
+   }
+
ret = dma_resv_wait_timeout(obj->resv, dma_resv_usage_rw(write),
true,  remain);
if (ret == 0)
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 329100016e7c..dbf0d6f43fa9 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -151,8 +151,13 @@ struct drm_msm_gem_info {
 #define MSM_PREP_READ0x01
 #define MSM_PREP_WRITE   0x02
 #define MSM_PREP_NOSYNC  0x04
+#define MSM_PREP_BOOST   0x08
 
-#define MSM_PREP_FLAGS   (MSM_PREP_READ | MSM_PREP_WRITE | MSM_PREP_NOSYNC)
+#define MSM_PREP_FLAGS   (MSM_PREP_READ | \
+ MSM_PREP_WRITE | \
+ MSM_PREP_NOSYNC | \
+ MSM_PREP_BOOST | \
+ 0)
 
 struct drm_msm_gem_cpu_prep {
__u32 handle; /* in */
@@ -286,6 +291,11 @@ struct drm_msm_gem_submit {
 
 };
 
+#define MSM_WAIT_FENCE_BOOST   0x0001
+#define MSM_WAIT_FENCE_FLAGS   ( \
+   MSM_WAIT_FENCE_BOOST | \
+   0)
+
 /* The normal way to synchronize with the GPU is just to CPU_PREP on
  * a buffer if you need to access it from the CPU (other cmdstream
  * submission from same or other contexts, PAGE_FLIP ioctl, etc, all
@@ -295,7 +305,7 @@ struct drm_msm_gem_submit {
  */
 struct drm_msm_wait_fence {
__u32 fence;  /* in */
-   __u32 pad;
+   __u32 flags;  /* in, bitmask of MSM_WAIT_FENCE_x */
struct drm_msm_timespec timeout;   /* in */
__u32 queueid; /* in, submitqueue id */
 };
-- 
2.39.1
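
The flag check added to msm_ioctl_wait_fence() follows a common uapi pattern: reject any bit outside the known mask, then act on the bits that are set. A minimal userspace sketch, with the two macro values copied from the uapi hunk above and a hypothetical helper name and return convention:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Values copied from the uapi hunk in the patch. */
#define MSM_WAIT_FENCE_BOOST 0x0001
#define MSM_WAIT_FENCE_FLAGS ( \
		MSM_WAIT_FENCE_BOOST | \
		0)

static_assert((MSM_WAIT_FENCE_FLAGS & MSM_WAIT_FENCE_BOOST) != 0,
	      "every defined flag must be part of the valid mask");

/*
 * Hypothetical helper: returns -EINVAL for unknown bits, otherwise 1 if the
 * caller asked for a boost and 0 if it did not.
 */
static int model_check_wait_flags(uint32_t flags)
{
	if (flags & ~(uint32_t)MSM_WAIT_FENCE_FLAGS)
		return -EINVAL;

	return (flags & MSM_WAIT_FENCE_BOOST) ? 1 : 0;
}
```

Because the field was previously `pad` and any non-zero value was rejected, existing userspace that passes 0 still passes this check and gets the old behaviour.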



[Freedreno] [PATCH v8 13/16] drm/msm: Add deadline based boost support

2023-02-28 Thread Rob Clark
From: Rob Clark 

Track the nearest deadline on a fence timeline and set a timer to expire
shortly before it, to trigger a boost if the fence has not yet been signaled.

v2: rebase

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_fence.c | 74 +
 drivers/gpu/drm/msm/msm_fence.h | 20 +
 2 files changed, 94 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 56641408ea74..51b461f32103 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -8,6 +8,35 @@
 
 #include "msm_drv.h"
 #include "msm_fence.h"
+#include "msm_gpu.h"
+
+static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx)
+{
+   struct msm_drm_private *priv = fctx->dev->dev_private;
+   return priv->gpu;
+}
+
+static enum hrtimer_restart deadline_timer(struct hrtimer *t)
+{
+   struct msm_fence_context *fctx = container_of(t,
+   struct msm_fence_context, deadline_timer);
+
+   kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work);
+
+   return HRTIMER_NORESTART;
+}
+
+static void deadline_work(struct kthread_work *work)
+{
+   struct msm_fence_context *fctx = container_of(work,
+   struct msm_fence_context, deadline_work);
+
+   /* If deadline fence has already passed, nothing to do: */
+   if (msm_fence_completed(fctx, fctx->next_deadline_fence))
+   return;
+
+   msm_devfreq_boost(fctx2gpu(fctx), 2);
+}
 
 
 struct msm_fence_context *
@@ -36,6 +65,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile uint32_t *fenceptr,
fctx->completed_fence = fctx->last_fence;
*fctx->fenceptr = fctx->last_fence;
 
+   hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
+   fctx->deadline_timer.function = deadline_timer;
+
+   kthread_init_work(&fctx->deadline_work, deadline_work);
+
+   fctx->next_deadline = ktime_get();
+
return fctx;
 }
 
@@ -62,6 +98,8 @@ void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence)
spin_lock_irqsave(&fctx->spinlock, flags);
if (fence_after(fence, fctx->completed_fence))
fctx->completed_fence = fence;
+   if (msm_fence_completed(fctx, fctx->next_deadline_fence))
+   hrtimer_cancel(&fctx->deadline_timer);
spin_unlock_irqrestore(&fctx->spinlock, flags);
 }
 
@@ -92,10 +130,46 @@ static bool msm_fence_signaled(struct dma_fence *fence)
return msm_fence_completed(f->fctx, f->base.seqno);
 }
 
+static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   struct msm_fence *f = to_msm_fence(fence);
+   struct msm_fence_context *fctx = f->fctx;
+   unsigned long flags;
+   ktime_t now;
+
+   spin_lock_irqsave(&fctx->spinlock, flags);
+   now = ktime_get();
+
+   if (ktime_after(now, fctx->next_deadline) ||
+   ktime_before(deadline, fctx->next_deadline)) {
+   fctx->next_deadline = deadline;
+   fctx->next_deadline_fence =
+   max(fctx->next_deadline_fence, (uint32_t)fence->seqno);
+
+   /*
+* Set timer to trigger boost 3ms before deadline, or
+* if we are already less than 3ms before the deadline
+* schedule boost work immediately.
+*/
+   deadline = ktime_sub(deadline, ms_to_ktime(3));
+
+   if (ktime_after(now, deadline)) {
+   kthread_queue_work(fctx2gpu(fctx)->worker,
+   &fctx->deadline_work);
+   } else {
+   hrtimer_start(&fctx->deadline_timer, deadline,
+   HRTIMER_MODE_ABS);
+   }
+   }
+
+   spin_unlock_irqrestore(&fctx->spinlock, flags);
+}
+
 static const struct dma_fence_ops msm_fence_ops = {
.get_driver_name = msm_fence_get_driver_name,
.get_timeline_name = msm_fence_get_timeline_name,
.signaled = msm_fence_signaled,
+   .set_deadline = msm_fence_set_deadline,
 };
 
 struct dma_fence *
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 7f1798c54cd1..cdaebfb94f5c 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -52,6 +52,26 @@ struct msm_fence_context {
volatile uint32_t *fenceptr;
 
spinlock_t spinlock;
+
+   /*
+* TODO this doesn't really deal with multiple deadlines, like
+* if userspace got multiple frames ahead.. OTOH atomic updates
+* don't queue, so maybe that is ok
+*/
+
+   /** next_deadline: Time of next deadline */
+   ktime_t next_deadline;
+
+   /**
+* next_deadline_fence:
+*
+* Fence value for next pending deadline.  The deadline timer is
+* canceled when this fence is signaled.
+*/
+   uint

[Freedreno] [PATCH v8 12/16] drm/atomic-helper: Set fence deadline for vblank

2023-02-28 Thread Rob Clark
From: Rob Clark 

For an atomic commit updating a single CRTC (ie. a pageflip) calculate
the next vblank time, and inform the fence(s) of that deadline.

v2: Comment typo fix (danvet)

Signed-off-by: Rob Clark 
Reviewed-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/drm_atomic_helper.c | 36 +
 1 file changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index d579fd8f7cb8..d8ee98ce2fc5 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1511,6 +1511,40 @@ void drm_atomic_helper_commit_modeset_enables(struct 
drm_device *dev,
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit_modeset_enables);
 
+/*
+ * For atomic updates which touch just a single CRTC, calculate the time of the
+ * next vblank, and inform all the fences of the deadline.
+ */
+static void set_fence_deadline(struct drm_device *dev,
+  struct drm_atomic_state *state)
+{
+   struct drm_crtc *crtc, *wait_crtc = NULL;
+   struct drm_crtc_state *new_crtc_state;
+   struct drm_plane *plane;
+   struct drm_plane_state *new_plane_state;
+   ktime_t vbltime;
+   int i;
+
+   for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
+   if (wait_crtc)
+   return;
+   wait_crtc = crtc;
+   }
+
+   /* If no CRTCs updated, then nothing to do: */
+   if (!wait_crtc)
+   return;
+
+   if (drm_crtc_next_vblank_start(wait_crtc, &vbltime))
+   return;
+
+   for_each_new_plane_in_state (state, plane, new_plane_state, i) {
+   if (!new_plane_state->fence)
+   continue;
+   dma_fence_set_deadline(new_plane_state->fence, vbltime);
+   }
+}
+
 /**
  * drm_atomic_helper_wait_for_fences - wait for fences stashed in plane state
  * @dev: DRM device
@@ -1540,6 +1574,8 @@ int drm_atomic_helper_wait_for_fences(struct drm_device 
*dev,
struct drm_plane_state *new_plane_state;
int i, ret;
 
+   set_fence_deadline(dev, state);
+
for_each_new_plane_in_state(state, plane, new_plane_state, i) {
if (!new_plane_state->fence)
continue;
-- 
2.39.1



[Freedreno] [PATCH v8 11/16] drm/vblank: Add helper to get next vblank time

2023-02-28 Thread Rob Clark
From: Rob Clark 

Will be used in the next commit to set a deadline on fences that an
atomic update is waiting on.

v2: Calculate time at *start* of vblank period, not end
v3: Fix kbuild complaints

Signed-off-by: Rob Clark 
Reviewed-by: Mario Kleiner 
---
 drivers/gpu/drm/drm_vblank.c | 53 ++--
 include/drm/drm_vblank.h |  1 +
 2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 2ff31717a3de..299fa2a19a90 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -844,10 +844,9 @@ bool drm_crtc_vblank_helper_get_vblank_timestamp(struct 
drm_crtc *crtc,
 EXPORT_SYMBOL(drm_crtc_vblank_helper_get_vblank_timestamp);
 
 /**
- * drm_get_last_vbltimestamp - retrieve raw timestamp for the most recent
- * vblank interval
- * @dev: DRM device
- * @pipe: index of CRTC whose vblank timestamp to retrieve
+ * drm_crtc_get_last_vbltimestamp - retrieve raw timestamp for the most
+ *  recent vblank interval
+ * @crtc: CRTC whose vblank timestamp to retrieve
  * @tvblank: Pointer to target time which should receive the timestamp
  * @in_vblank_irq:
  * True when called from drm_crtc_handle_vblank().  Some drivers
@@ -865,10 +864,9 @@ EXPORT_SYMBOL(drm_crtc_vblank_helper_get_vblank_timestamp);
  * True if timestamp is considered to be very precise, false otherwise.
  */
 static bool
-drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
- ktime_t *tvblank, bool in_vblank_irq)
+drm_crtc_get_last_vbltimestamp(struct drm_crtc *crtc, ktime_t *tvblank,
+  bool in_vblank_irq)
 {
-   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
bool ret = false;
 
/* Define requested maximum error on timestamps (nanoseconds). */
@@ -876,8 +874,6 @@ drm_get_last_vbltimestamp(struct drm_device *dev, unsigned 
int pipe,
 
/* Query driver if possible and precision timestamping enabled. */
if (crtc && crtc->funcs->get_vblank_timestamp && max_error > 0) {
-   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
-
ret = crtc->funcs->get_vblank_timestamp(crtc, &max_error,
tvblank, in_vblank_irq);
}
@@ -891,6 +887,15 @@ drm_get_last_vbltimestamp(struct drm_device *dev, unsigned 
int pipe,
return ret;
 }
 
+static bool
+drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
+ ktime_t *tvblank, bool in_vblank_irq)
+{
+   struct drm_crtc *crtc = drm_crtc_from_index(dev, pipe);
+
+   return drm_crtc_get_last_vbltimestamp(crtc, tvblank, in_vblank_irq);
+}
+
 /**
  * drm_crtc_vblank_count - retrieve "cooked" vblank counter value
  * @crtc: which counter to retrieve
@@ -980,6 +985,36 @@ u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
 }
 EXPORT_SYMBOL(drm_crtc_vblank_count_and_time);
 
+/**
+ * drm_crtc_next_vblank_start - calculate the time of the next vblank
+ * @crtc: the crtc for which to calculate next vblank time
+ * @vblanktime: pointer to time to receive the next vblank timestamp.
+ *
+ * Calculate the expected time of the start of the next vblank period,
+ * based on time of previous vblank and frame duration
+ */
+int drm_crtc_next_vblank_start(struct drm_crtc *crtc, ktime_t *vblanktime)
+{
+   unsigned int pipe = drm_crtc_index(crtc);
+   struct drm_vblank_crtc *vblank = &crtc->dev->vblank[pipe];
+   struct drm_display_mode *mode = &vblank->hwmode;
+   u64 vblank_start;
+
+   if (!vblank->framedur_ns || !vblank->linedur_ns)
+   return -EINVAL;
+
+   if (!drm_crtc_get_last_vbltimestamp(crtc, vblanktime, false))
+   return -EINVAL;
+
+   vblank_start = DIV_ROUND_DOWN_ULL(
+   (u64)vblank->framedur_ns * mode->crtc_vblank_start,
+   mode->crtc_vtotal);
+   *vblanktime  = ktime_add(*vblanktime, ns_to_ktime(vblank_start));
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_crtc_next_vblank_start);
+
 static void send_vblank_event(struct drm_device *dev,
struct drm_pending_vblank_event *e,
u64 seq, ktime_t now)
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 733a3e2d1d10..7f3957943dd1 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -230,6 +230,7 @@ bool drm_dev_has_vblank(const struct drm_device *dev);
 u64 drm_crtc_vblank_count(struct drm_crtc *crtc);
 u64 drm_crtc_vblank_count_and_time(struct drm_crtc *crtc,
   ktime_t *vblanktime);
+int drm_crtc_next_vblank_start(struct drm_crtc *crtc, ktime_t *vblanktime);
 void drm_crtc_send_vblank_event(struct drm_crtc *crtc,
   struct drm_pending_vblank_event *e);
 void drm_crtc_arm_vblank_event(struct drm_crtc *crtc,
-- 
2.39.1
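
For anyone following the arithmetic in drm_crtc_next_vblank_start(), here
is a minimal userspace sketch of the same calculation.  It is not kernel
code: next_vblank_start_ns() and its plain integer parameters are
hypothetical stand-ins for the drm_vblank_crtc and drm_display_mode fields
used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the calculation in the patch:
 *
 *   next vblank start = last vblank timestamp +
 *       framedur_ns * crtc_vblank_start / crtc_vtotal   (rounded down)
 */
static int64_t next_vblank_start_ns(int64_t last_vblank_ns,
				    uint32_t framedur_ns,
				    uint32_t crtc_vblank_start,
				    uint32_t crtc_vtotal)
{
	uint64_t offset;

	/* The patch bails out early when timing info is missing. */
	assert(crtc_vtotal != 0);

	/* Widen to 64 bits before multiplying, as the patch does. */
	offset = (uint64_t)framedur_ns * crtc_vblank_start / crtc_vtotal;

	return last_vblank_ns + (int64_t)offset;
}
```

As a worked case, a 16666667 ns frame with crtc_vblank_start = 1080 and
crtc_vtotal = 1125 gives an offset of 16000000 ns past the previous vblank
timestamp.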



[Freedreno] [PATCH v8 08/16] dma-buf/sw_sync: Add fence deadline support

2023-02-28 Thread Rob Clark
From: Rob Clark 

This consists of simply storing the most recent deadline, and adding an
ioctl to retrieve the deadline.  This can be used in conjunction with
the SET_DEADLINE ioctl on a fence fd for testing.  Ie. create various
sw_sync fences, merge them into a fence-array, set deadline on the
fence-array and confirm that it is propagated properly to each fence.

v2: Switch UABI to express deadline as u64
v3: More verbose UAPI docs, show how to convert from timespec
v4: Better comments, track the soonest deadline, as a normal fence
implementation would, return an error if no deadline set.

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/dma-buf/sw_sync.c| 81 
 drivers/dma-buf/sync_debug.h |  2 +
 2 files changed, 83 insertions(+)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..f53071bca3af 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -52,12 +52,33 @@ struct sw_sync_create_fence_data {
__s32   fence; /* fd of new fence */
 };
 
+/**
+ * struct sw_sync_get_deadline - get the deadline hint of a sw_sync fence
+ * @deadline_ns: absolute time of the deadline
+ * @pad:   must be zero
+ * @fence_fd:  the sw_sync fence fd (in)
+ *
+ * Return the earliest deadline set on the fence.  The timebase for the
+ * deadline is CLOCK_MONOTONIC (same as vblank).  If there is no deadline
+ * set on the fence, this ioctl will return -ENOENT.
+ */
+struct sw_sync_get_deadline {
+   __u64   deadline_ns;
+   __u32   pad;
+   __s32   fence_fd;
+};
+
 #define SW_SYNC_IOC_MAGIC  'W'
 
 #define SW_SYNC_IOC_CREATE_FENCE   _IOWR(SW_SYNC_IOC_MAGIC, 0,\
struct sw_sync_create_fence_data)
 
 #define SW_SYNC_IOC_INC_IOW(SW_SYNC_IOC_MAGIC, 1, 
__u32)
+#define SW_SYNC_GET_DEADLINE   _IOWR(SW_SYNC_IOC_MAGIC, 2, \
+   struct sw_sync_get_deadline)
+
+
+#define SW_SYNC_HAS_DEADLINE_BIT   DMA_FENCE_FLAG_USER_BITS
 
 static const struct dma_fence_ops timeline_fence_ops;
 
@@ -171,6 +192,22 @@ static void timeline_fence_timeline_value_str(struct 
dma_fence *fence,
snprintf(str, size, "%d", parent->value);
 }
 
+static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t 
deadline)
+{
+   struct sync_pt *pt = dma_fence_to_sync_pt(fence);
+   unsigned long flags;
+
+   spin_lock_irqsave(fence->lock, flags);
+   if (test_bit(SW_SYNC_HAS_DEADLINE_BIT, &fence->flags)) {
+   if (ktime_before(deadline, pt->deadline))
+   pt->deadline = deadline;
+   } else {
+   pt->deadline = deadline;
+   set_bit(SW_SYNC_HAS_DEADLINE_BIT, &fence->flags);
+   }
+   spin_unlock_irqrestore(fence->lock, flags);
+}
+
 static const struct dma_fence_ops timeline_fence_ops = {
.get_driver_name = timeline_fence_get_driver_name,
.get_timeline_name = timeline_fence_get_timeline_name,
@@ -179,6 +216,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
.release = timeline_fence_release,
.fence_value_str = timeline_fence_value_str,
.timeline_value_str = timeline_fence_timeline_value_str,
+   .set_deadline = timeline_fence_set_deadline,
 };
 
 /**
@@ -387,6 +425,46 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, 
unsigned long arg)
return 0;
 }
 
+static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned long 
arg)
+{
+   struct sw_sync_get_deadline data;
+   struct dma_fence *fence;
+   struct sync_pt *pt;
+   int ret = 0;
+
+   if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
+   return -EFAULT;
+
+   if (data.deadline_ns || data.pad)
+   return -EINVAL;
+
+   fence = sync_file_get_fence(data.fence_fd);
+   if (!fence)
+   return -EINVAL;
+
+   pt = dma_fence_to_sync_pt(fence);
+   if (!pt)
+   return -EINVAL;
+
+   spin_lock(fence->lock);
+   if (test_bit(SW_SYNC_HAS_DEADLINE_BIT, &fence->flags)) {
+   data.deadline_ns = ktime_to_ns(pt->deadline);
+   } else {
+   ret = -ENOENT;
+   }
+   spin_unlock(fence->lock);
+
+   dma_fence_put(fence);
+
+   if (ret)
+   return ret;
+
+   if (copy_to_user((void __user *)arg, &data, sizeof(data)))
+   return -EFAULT;
+
+   return 0;
+}
+
 static long sw_sync_ioctl(struct file *file, unsigned int cmd,
  unsigned long arg)
 {
@@ -399,6 +477,9 @@ static long sw_sync_ioctl(struct file *file, unsigned int 
cmd,
case SW_SYNC_IOC_INC:
return sw_sync_ioctl_inc(obj, arg);
 
+   case SW_SYNC_GET_DEADLINE:
+   return sw_sync_ioctl_get_deadline(obj, arg);
+
default:
return -ENOTTY;
}
diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
index 617

[Freedreno] [PATCH v8 10/16] drm/syncobj: Add deadline support for syncobj waits

2023-02-28 Thread Rob Clark
From: Rob Clark 

Add a new flag to let userspace provide a deadline as a hint for syncobj
and timeline waits.  This gives a hint to the driver signaling the
backing fences about how soon userspace needs it to complete work, so it
can adjust GPU frequency accordingly.  An immediate deadline can be
given to provide something equivalent to i915 "wait boost".

v2: Use absolute u64 ns value for deadline hint, drop cap and driver
feature flag in favor of allowing count_handles==0 as a way for
userspace to probe kernel for support of new flag
v3: More verbose comments about UAPI

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/drm_syncobj.c | 64 ---
 include/uapi/drm/drm.h| 17 ++
 2 files changed, 68 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 0c2be8360525..a85e9464f07b 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -126,6 +126,11 @@
  * synchronize between the two.
  * This requirement is inherited from the Vulkan fence API.
  *
+ * If &DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE is set, the ioctl will also set
+ * a fence deadline hint on the backing fences before waiting, to provide the
+ * fence signaler with an appropriate sense of urgency.  The deadline is
+ * specified as an absolute &CLOCK_MONOTONIC value in units of ns.
+ *
  * Similarly, &DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT takes an array of syncobj
  * handles as well as an array of u64 points and does a host-side wait on all
  * of syncobj fences at the given points simultaneously.
@@ -973,7 +978,8 @@ static signed long drm_syncobj_array_wait_timeout(struct 
drm_syncobj **syncobjs,
  uint32_t count,
  uint32_t flags,
  signed long timeout,
- uint32_t *idx)
+ uint32_t *idx,
+ ktime_t *deadline)
 {
struct syncobj_wait_entry *entries;
struct dma_fence *fence;
@@ -1053,6 +1059,15 @@ static signed long drm_syncobj_array_wait_timeout(struct 
drm_syncobj **syncobjs,
drm_syncobj_fence_add_wait(syncobjs[i], &entries[i]);
}
 
+   if (deadline) {
+   for (i = 0; i < count; ++i) {
+   fence = entries[i].fence;
+   if (!fence)
+   continue;
+   dma_fence_set_deadline(fence, *deadline);
+   }
+   }
+
do {
set_current_state(TASK_INTERRUPTIBLE);
 
@@ -1151,7 +1166,8 @@ static int drm_syncobj_array_wait(struct drm_device *dev,
  struct drm_file *file_private,
  struct drm_syncobj_wait *wait,
  struct drm_syncobj_timeline_wait 
*timeline_wait,
- struct drm_syncobj **syncobjs, bool timeline)
+ struct drm_syncobj **syncobjs, bool timeline,
+ ktime_t *deadline)
 {
signed long timeout = 0;
uint32_t first = ~0;
@@ -1162,7 +1178,8 @@ static int drm_syncobj_array_wait(struct drm_device *dev,
 NULL,
 wait->count_handles,
 wait->flags,
-timeout, &first);
+timeout, &first,
+deadline);
if (timeout < 0)
return timeout;
wait->first_signaled = first;
@@ -1172,7 +1189,8 @@ static int drm_syncobj_array_wait(struct drm_device *dev,
 
u64_to_user_ptr(timeline_wait->points),
 
timeline_wait->count_handles,
 timeline_wait->flags,
-timeout, &first);
+timeout, &first,
+deadline);
if (timeout < 0)
return timeout;
timeline_wait->first_signaled = first;
@@ -1243,17 +1261,22 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void 
*data,
 {
struct drm_syncobj_wait *args = data;
struct drm_syncobj **syncobjs;
+   unsigned possible_flags;
+   ktime_t t, *tp = NULL;
int ret = 0;
 
if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
return -EOPNOTSUPP;
 
- 

[Freedreno] [PATCH v8 06/16] dma-buf/sync_file: Add SET_DEADLINE ioctl

2023-02-28 Thread Rob Clark
From: Rob Clark 

The initial purpose is for igt tests, but this would also be useful for
compositors that wait until close to vblank deadline to make decisions
about which frame to show.

The igt tests can be found at:

https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline

v2: Clarify the timebase, add link to igt tests
v3: Use u64 value in ns to express deadline.
v4: More doc

Signed-off-by: Rob Clark 
Acked-by: Pekka Paalanen 
---
 drivers/dma-buf/dma-fence.c|  3 ++-
 drivers/dma-buf/sync_file.c| 19 +++
 include/uapi/linux/sync_file.h | 22 ++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index f177c56269bb..74e36f6d05b0 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -933,7 +933,8 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  *   the GPU's devfreq to reduce frequency, when in fact the opposite is what 
is
  *   needed.
  *
- * To this end, deadline hint(s) can be set on a &dma_fence via 
&dma_fence_set_deadline.
+ * To this end, deadline hint(s) can be set on a &dma_fence via 
&dma_fence_set_deadline
+ * (or indirectly via userspace facing ioctls like &sync_set_deadline).
  * The deadline hint provides a way for the waiting driver, or userspace, to
  * convey an appropriate sense of urgency to the signaling driver.
  *
diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index af57799c86ce..418021cfb87c 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file 
*sync_file,
return ret;
 }
 
+static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
+   unsigned long arg)
+{
+   struct sync_set_deadline ts;
+
+   if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
+   return -EFAULT;
+
+   if (ts.pad)
+   return -EINVAL;
+
+   dma_fence_set_deadline(sync_file->fence, ns_to_ktime(ts.deadline_ns));
+
+   return 0;
+}
+
 static long sync_file_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
 {
@@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned int 
cmd,
case SYNC_IOC_FILE_INFO:
return sync_file_ioctl_fence_info(sync_file, arg);
 
+   case SYNC_IOC_SET_DEADLINE:
+   return sync_file_ioctl_set_deadline(sync_file, arg);
+
default:
return -ENOTTY;
}
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index eced40c204d7..594b8bba7948 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -76,6 +76,27 @@ struct sync_file_info {
__u64   sync_fence_info;
 };
 
+/**
+ * struct sync_set_deadline - SYNC_IOC_SET_DEADLINE - set a deadline hint on a 
fence
+ * @deadline_ns: absolute time of the deadline
+ * @pad:   must be zero
+ *
+ * Allows userspace to set a deadline on a fence, see &dma_fence_set_deadline
+ *
+ * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank).  For
+ * example
+ *
+ * clock_gettime(CLOCK_MONOTONIC, &t);
+ * deadline_ns = (t.tv_sec * 1000000000L) + t.tv_nsec + ns_until_deadline
+ */
+struct sync_set_deadline {
+   __u64   deadline_ns;
+   /* Not strictly needed for alignment but gives some possibility
+* for future extension:
+*/
+   __u64   pad;
+};
+
 #define SYNC_IOC_MAGIC '>'
 
 /**
@@ -87,5 +108,6 @@ struct sync_file_info {
 
 #define SYNC_IOC_MERGE _IOWR(SYNC_IOC_MAGIC, 3, struct sync_merge_data)
 #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
+#define SYNC_IOC_SET_DEADLINE  _IOW(SYNC_IOC_MAGIC, 5, struct 
sync_set_deadline)
 
 #endif /* _UAPI_LINUX_SYNC_H */
-- 
2.39.1



[Freedreno] [PATCH v8 09/16] drm/scheduler: Add fence deadline support

2023-02-28 Thread Rob Clark
As the finished fence is the one that is exposed to userspace, and
therefore the one that other operations, like atomic update, would
block on, we need to propagate the deadline from the finished
fence to the actual hw fence.

v2: Split into drm_sched_fence_set_parent() (ckoenig)
v3: Ensure a thread calling drm_sched_fence_set_deadline_finished() sees
fence->parent set before drm_sched_fence_set_parent() does this
test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT).

Signed-off-by: Rob Clark 
Acked-by: Luben Tuikov 
---
 drivers/gpu/drm/scheduler/sched_fence.c | 46 +
 drivers/gpu/drm/scheduler/sched_main.c  |  2 +-
 include/drm/gpu_scheduler.h | 17 +
 3 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
b/drivers/gpu/drm/scheduler/sched_fence.c
index 7fd869520ef2..fe9c6468e440 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -123,6 +123,37 @@ static void drm_sched_fence_release_finished(struct 
dma_fence *f)
dma_fence_put(&fence->scheduled);
 }
 
+static void drm_sched_fence_set_deadline_finished(struct dma_fence *f,
+ ktime_t deadline)
+{
+   struct drm_sched_fence *fence = to_drm_sched_fence(f);
+   struct dma_fence *parent;
+   unsigned long flags;
+
+   spin_lock_irqsave(&fence->lock, flags);
+
+   /* If we already have an earlier deadline, keep it: */
+   if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) &&
+   ktime_before(fence->deadline, deadline)) {
+   spin_unlock_irqrestore(&fence->lock, flags);
+   return;
+   }
+
+   fence->deadline = deadline;
+   set_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags);
+
+   spin_unlock_irqrestore(&fence->lock, flags);
+
+   /*
+* smp_load_acquire() to ensure that if we are racing another
+* thread calling drm_sched_fence_set_parent(), that we see
+* the parent set before it calls test_bit(HAS_DEADLINE_BIT)
+*/
+   parent = smp_load_acquire(&fence->parent);
+   if (parent)
+   dma_fence_set_deadline(parent, deadline);
+}
+
 static const struct dma_fence_ops drm_sched_fence_ops_scheduled = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
@@ -133,6 +164,7 @@ static const struct dma_fence_ops 
drm_sched_fence_ops_finished = {
.get_driver_name = drm_sched_fence_get_driver_name,
.get_timeline_name = drm_sched_fence_get_timeline_name,
.release = drm_sched_fence_release_finished,
+   .set_deadline = drm_sched_fence_set_deadline_finished,
 };
 
 struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
@@ -147,6 +179,20 @@ struct drm_sched_fence *to_drm_sched_fence(struct 
dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
 
+void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
+   struct dma_fence *fence)
+{
+   /*
+* smp_store_release() to ensure another thread racing us
+* in drm_sched_fence_set_deadline_finished() sees the
+* fence's parent set before test_bit()
+*/
+   smp_store_release(&s_fence->parent, dma_fence_get(fence));
+   if (test_bit(DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT,
+&s_fence->finished.flags))
+   dma_fence_set_deadline(fence, s_fence->deadline);
+}
+
 struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
  void *owner)
 {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 4e6ad6e122bc..007f98c48f8d 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1019,7 +1019,7 @@ static int drm_sched_main(void *param)
drm_sched_fence_scheduled(s_fence);
 
if (!IS_ERR_OR_NULL(fence)) {
-   s_fence->parent = dma_fence_get(fence);
+   drm_sched_fence_set_parent(s_fence, fence);
/* Drop for original kref_init of the fence */
dma_fence_put(fence);
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9db9e5e504ee..99584e457153 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -41,6 +41,15 @@
  */
 #define DRM_SCHED_FENCE_DONT_PIPELINE  DMA_FENCE_FLAG_USER_BITS
 
+/**
+ * DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT - A fence deadline hint has been set
+ *
+ * Because a deadline hint can be set before the backing hw
+ * fence is created, we need to keep track of whether a deadline has already
+ * been set.
+ */
+#define DRM_SCHED_FENCE_FLAG_HAS_DEADLINE_BIT  (DMA_FENCE_FLAG_USER_BITS + 1)
+
 enum dma_resv_usage;
 struct dma_resv;
 struct drm_gem_object;
@@

[Freedreno] [PATCH v8 07/16] dma-buf/sync_file: Support (E)POLLPRI

2023-02-28 Thread Rob Clark
From: Rob Clark 

Allow userspace to use the EPOLLPRI/POLLPRI flag to indicate an urgent
wait (as opposed to a "housekeeping" wait to know when to cleanup after
some work has completed).  Usermode components of GPU driver stacks
often poll() on fence fd's to know when it is safe to do things like
free or reuse a buffer, but they can also poll() on a fence fd when
waiting to read back results from the GPU.  The EPOLLPRI/POLLPRI flag
lets the kernel differentiate these two cases.

Signed-off-by: Rob Clark 
---
 drivers/dma-buf/sync_file.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index 418021cfb87c..cbe96295373b 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -192,6 +192,14 @@ static __poll_t sync_file_poll(struct file *file, 
poll_table *wait)
 {
struct sync_file *sync_file = file->private_data;
 
+   /*
+* The POLLPRI/EPOLLPRI flag can be used to signal that
+* userspace wants the fence to signal ASAP, express this
+* as an immediate deadline.
+*/
+   if (poll_requested_events(wait) & EPOLLPRI)
+   dma_fence_set_deadline(sync_file->fence, ktime_get());
+
poll_wait(file, &sync_file->wq, wait);
 
if (list_empty(&sync_file->cb.node) &&
-- 
2.39.1
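
From the userspace side, an urgent wait could look roughly like the sketch
below.  fence_wait_urgent() is a hypothetical helper, not part of any
existing library; it relies only on poll(2) and on sync_file reporting
POLLIN once the fence has signaled.

```c
#include <assert.h>
#include <poll.h>

/*
 * Wait for a fence fd to signal, flagging the wait as urgent.
 * timeout_ms follows poll(2): -1 blocks, 0 returns immediately.
 * Returns 1 if signaled, 0 on timeout, -1 on error (errno set).
 */
static int fence_wait_urgent(int fence_fd, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = fence_fd,
		/* POLLIN: fence signaled.  POLLPRI: we want it ASAP. */
		.events = POLLIN | POLLPRI,
	};
	int ret;

	assert(timeout_ms >= -1);

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;

	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A "housekeeping" wait would be the same call with POLLPRI left out of
.events, which is how the kernel can tell the two cases apart.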



[Freedreno] [PATCH v8 05/16] dma-buf/sync_file: Surface sync-file uABI

2023-02-28 Thread Rob Clark
From: Rob Clark 

We had all of the internal driver APIs, but not the all-important
userspace uABI, in the dma-buf doc.  Fix that.  And re-arrange the
comments slightly as otherwise the comments for the ioctl nr defines
would not show up.

Signed-off-by: Rob Clark 
---
 Documentation/driver-api/dma-buf.rst | 10 ++--
 include/uapi/linux/sync_file.h   | 35 +++-
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/Documentation/driver-api/dma-buf.rst 
b/Documentation/driver-api/dma-buf.rst
index 183e480d8cea..ff3f8da296af 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -203,8 +203,8 @@ DMA Fence unwrap
 .. kernel-doc:: include/linux/dma-fence-unwrap.h
:internal:
 
-DMA Fence uABI/Sync File
-~~~~~~~~~~~~~~~~~~~~~~~~
+DMA Fence Sync File
+~~~~~~~~~~~~~~~~~~~
 
 .. kernel-doc:: drivers/dma-buf/sync_file.c
:export:
@@ -212,6 +212,12 @@ DMA Fence uABI/Sync File
 .. kernel-doc:: include/linux/sync_file.h
:internal:
 
+DMA Fence Sync File uABI
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: include/uapi/linux/sync_file.h
+   :internal:
+
 Indefinite DMA Fences
 ~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
index ee2dcfb3d660..eced40c204d7 100644
--- a/include/uapi/linux/sync_file.h
+++ b/include/uapi/linux/sync_file.h
@@ -16,12 +16,16 @@
 #include 
 
 /**
- * struct sync_merge_data - data passed to merge ioctl
+ * struct sync_merge_data - SYNC_IOC_MERGE: merge two fences
  * @name:  name of new fence
  * @fd2:   file descriptor of second fence
  * @fence: returns the fd of the new fence to userspace
  * @flags: merge_data flags
  * @pad:   padding for 64-bit alignment, should always be zero
+ *
+ * Creates a new fence containing copies of the sync_pts in both
+ * the calling fd and sync_merge_data.fd2.  Returns the new fence's
+ * fd in sync_merge_data.fence
  */
 struct sync_merge_data {
charname[32];
@@ -34,8 +38,8 @@ struct sync_merge_data {
 /**
  * struct sync_fence_info - detailed fence information
  * @obj_name:  name of parent sync_timeline
-* @driver_name:name of driver implementing the parent
-* @status: status of the fence 0:active 1:signaled <0:error
+ * @driver_name:   name of driver implementing the parent
+ * @status:status of the fence 0:active 1:signaled <0:error
  * @flags: fence_info flags
  * @timestamp_ns:  timestamp of status change in nanoseconds
  */
@@ -48,14 +52,19 @@ struct sync_fence_info {
 };
 
 /**
- * struct sync_file_info - data returned from fence info ioctl
+ * struct sync_file_info - SYNC_IOC_FILE_INFO: get detailed information on a 
sync_file
  * @name:  name of fence
  * @status:status of fence. 1: signaled 0:active <0:error
  * @flags: sync_file_info flags
  * @num_fences number of fences in the sync_file
  * @pad:   padding for 64-bit alignment, should always be zero
- * @sync_fence_info: pointer to array of structs sync_fence_info with all
+ * @sync_fence_info: pointer to array of struct &sync_fence_info with all
  *  fences in the sync_file
+ *
+ * Takes a struct sync_file_info. If num_fences is 0, the field is updated
+ * with the actual number of fences. If num_fences is > 0, the system will
+ * use the pointer provided on sync_fence_info to return up to num_fences of
+ * struct sync_fence_info, with detailed fence information.
  */
 struct sync_file_info {
charname[32];
@@ -76,23 +85,7 @@ struct sync_file_info {
  * no upstream users available.
  */
 
-/**
- * DOC: SYNC_IOC_MERGE - merge two fences
- *
- * Takes a struct sync_merge_data.  Creates a new fence containing copies of
- * the sync_pts in both the calling fd and sync_merge_data.fd2.  Returns the
- * new fence's fd in sync_merge_data.fence
- */
 #define SYNC_IOC_MERGE _IOWR(SYNC_IOC_MAGIC, 3, struct sync_merge_data)
-
-/**
- * DOC: SYNC_IOC_FILE_INFO - get detailed information on a sync_file
- *
- * Takes a struct sync_file_info. If num_fences is 0, the field is updated
- * with the actual number of fences. If num_fences is > 0, the system will
- * use the pointer provided on sync_fence_info to return up to num_fences of
- * struct sync_fence_info, with detailed fence information.
- */
 #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
 
 #endif /* _UAPI_LINUX_SYNC_H */
-- 
2.39.1



[Freedreno] [PATCH v8 03/16] dma-buf/fence-chain: Add fence deadline support

2023-02-28 Thread Rob Clark
From: Rob Clark 

Propagate the deadline to all the fences in the chain.

v2: Use dma_fence_chain_contained [Tvrtko]

Signed-off-by: Rob Clark 
Reviewed-by: Christian König
---
 drivers/dma-buf/dma-fence-chain.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-chain.c 
b/drivers/dma-buf/dma-fence-chain.c
index a0d920576ba6..9663ba1bb6ac 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -206,6 +206,17 @@ static void dma_fence_chain_release(struct dma_fence 
*fence)
dma_fence_free(fence);
 }
 
+
+static void dma_fence_chain_set_deadline(struct dma_fence *fence,
+ktime_t deadline)
+{
+   dma_fence_chain_for_each(fence, fence) {
+   struct dma_fence *f = dma_fence_chain_contained(fence);
+
+   dma_fence_set_deadline(f, deadline);
+   }
+}
+
 const struct dma_fence_ops dma_fence_chain_ops = {
.use_64bit_seqno = true,
.get_driver_name = dma_fence_chain_get_driver_name,
@@ -213,6 +224,7 @@ const struct dma_fence_ops dma_fence_chain_ops = {
.enable_signaling = dma_fence_chain_enable_signaling,
.signaled = dma_fence_chain_signaled,
.release = dma_fence_chain_release,
+   .set_deadline = dma_fence_chain_set_deadline,
 };
 EXPORT_SYMBOL(dma_fence_chain_ops);
 
-- 
2.39.1



[Freedreno] [PATCH v8 02/16] dma-buf/fence-array: Add fence deadline support

2023-02-28 Thread Rob Clark
From: Rob Clark 

Propagate the deadline to all the fences in the array.

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/dma-buf/dma-fence-array.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-array.c 
b/drivers/dma-buf/dma-fence-array.c
index 5c8a7084577b..9b3ce8948351 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -123,12 +123,23 @@ static void dma_fence_array_release(struct dma_fence 
*fence)
dma_fence_free(fence);
 }
 
+static void dma_fence_array_set_deadline(struct dma_fence *fence,
+ktime_t deadline)
+{
+   struct dma_fence_array *array = to_dma_fence_array(fence);
+   unsigned i;
+
+   for (i = 0; i < array->num_fences; ++i)
+   dma_fence_set_deadline(array->fences[i], deadline);
+}
+
 const struct dma_fence_ops dma_fence_array_ops = {
.get_driver_name = dma_fence_array_get_driver_name,
.get_timeline_name = dma_fence_array_get_timeline_name,
.enable_signaling = dma_fence_array_enable_signaling,
.signaled = dma_fence_array_signaled,
.release = dma_fence_array_release,
+   .set_deadline = dma_fence_array_set_deadline,
 };
 EXPORT_SYMBOL(dma_fence_array_ops);
 
-- 
2.39.1
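
To illustrate what propagation through a container fence amounts to, here
is a toy userspace model.  None of these types exist in the kernel:
struct toy_fence and both helpers are made up for the example, and the
"keep the soonest deadline" rule is borrowed from the sw_sync patch in
this series rather than required of every driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a fence that remembers its soonest deadline hint. */
struct toy_fence {
	bool has_deadline;
	int64_t deadline_ns;
};

/* Keep the earliest deadline seen so far. */
static void toy_fence_set_deadline(struct toy_fence *f, int64_t deadline_ns)
{
	if (!f->has_deadline || deadline_ns < f->deadline_ns) {
		f->deadline_ns = deadline_ns;
		f->has_deadline = true;
	}
}

/* Container fence: forward the hint to every child fence. */
static void toy_array_set_deadline(struct toy_fence *fences, size_t count,
				   int64_t deadline_ns)
{
	size_t i;

	assert(fences != NULL || count == 0);

	for (i = 0; i < count; ++i)
		toy_fence_set_deadline(&fences[i], deadline_ns);
}
```

Setting deadlines of 100, 50 and then 200 on a three-fence array leaves
every child at 50, since later (less urgent) hints do not override an
earlier one.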



[Freedreno] [PATCH v8 04/16] dma-buf/dma-resv: Add a way to set fence deadline

2023-02-28 Thread Rob Clark
From: Rob Clark 

Add a way to set a deadline on remaining resv fences according to the
requested usage.

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
---
 drivers/dma-buf/dma-resv.c | 22 ++
 include/linux/dma-resv.h   |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 1c76aed8e262..2a594b754af1 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -684,6 +684,28 @@ long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
 }
 EXPORT_SYMBOL_GPL(dma_resv_wait_timeout);
 
+/**
+ * dma_resv_set_deadline - Set a deadline on a reservation object's fences
+ * @obj: the reservation object
+ * @usage: controls which fences to include, see enum dma_resv_usage.
+ * @deadline: the requested deadline (MONOTONIC)
+ *
+ * May be called without holding the dma_resv lock.  Sets @deadline on
+ * all fences filtered by @usage.
+ */
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+  ktime_t deadline)
+{
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+
+   dma_resv_iter_begin(&cursor, obj, usage);
+   dma_resv_for_each_fence_unlocked(&cursor, fence) {
+   dma_fence_set_deadline(fence, deadline);
+   }
+   dma_resv_iter_end(&cursor);
+}
+EXPORT_SYMBOL_GPL(dma_resv_set_deadline);
 
 /**
  * dma_resv_test_signaled - Test if a reservation object's fences have been
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 0637659a702c..8d0e34dad446 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -479,6 +479,8 @@ int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
 int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
 long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
   bool intr, unsigned long timeout);
+void dma_resv_set_deadline(struct dma_resv *obj, enum dma_resv_usage usage,
+  ktime_t deadline);
 bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
 void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
 
-- 
2.39.1



[Freedreno] [PATCH v8 01/16] dma-buf/dma-fence: Add deadline awareness

2023-02-28 Thread Rob Clark
From: Rob Clark 

Add a way to hint to the fence signaler of an upcoming deadline, such as
vblank, which the fence waiter would prefer not to miss.  This is to aid
the fence signaler in making power management decisions, like boosting
frequency as the deadline approaches and awareness of missing deadlines
so that can be factored in to the frequency scaling.

v2: Drop dma_fence::deadline and related logic to filter duplicate
deadlines, to avoid increasing dma_fence size.  The fence-context
implementation will need similar logic to track deadlines of all
the fences on the same timeline.  [ckoenig]
v3: Clarify locking wrt. set_deadline callback
v4: Clarify in docs comment that this is a hint
v5: Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
v6: More docs
v7: Fix typo, clarify past deadlines

Signed-off-by: Rob Clark 
Reviewed-by: Christian König 
Acked-by: Pekka Paalanen 
---
 Documentation/driver-api/dma-buf.rst |  6 +++
 drivers/dma-buf/dma-fence.c  | 59 
 include/linux/dma-fence.h| 22 +++
 3 files changed, 87 insertions(+)

diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 622b8156d212..183e480d8cea 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -164,6 +164,12 @@ DMA Fence Signalling Annotations
 .. kernel-doc:: drivers/dma-buf/dma-fence.c
:doc: fence signalling annotation
 
+DMA Fence Deadline Hints
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/dma-buf/dma-fence.c
+   :doc: deadline hints
+
 DMA Fences Functions Reference
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0de0482cd36e..f177c56269bb 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -912,6 +912,65 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count,
 }
 EXPORT_SYMBOL(dma_fence_wait_any_timeout);
 
+/**
+ * DOC: deadline hints
+ *
+ * In an ideal world, it would be possible to pipeline a workload sufficiently
+ * that a utilization based device frequency governor could arrive at a minimum
+ * frequency that meets the requirements of the use-case, in order to minimize
+ * power consumption.  But in the real world there are many workloads which
+ * defy this ideal.  For example, but not limited to:
+ *
+ * * Workloads that ping-pong between device and CPU, with alternating periods
+ *   of CPU waiting for device, and device waiting on CPU.  This can result in
+ *   devfreq and cpufreq seeing idle time in their respective domains and,
+ *   as a result, reducing frequency.
+ *
+ * * Workloads that interact with a periodic time based deadline, such as double
+ *   buffered GPU rendering vs vblank sync'd page flipping.  In this scenario,
+ *   missing a vblank deadline results in an *increase* in idle time on the GPU
+ *   (since it has to wait an additional vblank period), sending a signal to
+ *   the GPU's devfreq to reduce frequency, when in fact the opposite is what is
+ *   needed.
+ *
+ * To this end, deadline hint(s) can be set on a &dma_fence via &dma_fence_set_deadline.
+ * The deadline hint provides a way for the waiting driver, or userspace, to
+ * convey an appropriate sense of urgency to the signaling driver.
+ *
+ * A deadline hint is given in absolute ktime (CLOCK_MONOTONIC for userspace
+ * facing APIs).  The time could either be some point in the future (such as
+ * the vblank based deadline for page-flipping, or the start of a compositor's
+ * composition cycle), or the current time to indicate an immediate deadline
+ * hint (i.e. forward progress cannot be made until this fence is signaled).
+ *
+ * Multiple deadlines may be set on a given fence, even in parallel.  See the
+ * documentation for &dma_fence_ops.set_deadline.
+ *
+ * The deadline hint is just that, a hint.  The driver that created the fence
+ * may react by increasing frequency, making different scheduling choices, etc.
+ * Or doing nothing at all.
+ */
+
+/**
+ * dma_fence_set_deadline - set desired fence-wait deadline hint
+ * @fence:the fence that is to be waited on
+ * @deadline: the time by which the waiter hopes for the fence to be
+ *signaled
+ *
+ * Give the fence signaler a hint about an upcoming deadline, such as
+ * vblank, by which point the waiter would prefer the fence to be
+ * signaled.  This is intended to give feedback to the fence signaler
+ * to aid in power management decisions, such as boosting GPU frequency
+ * if a periodic vblank deadline is approaching but the fence is not
+ * yet signaled.
+ */
+void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+   if (fence->ops->set_deadline && !dma_fence_is_signaled(fence))
+   fence->ops->set_deadline(fence, deadline);
+}
+EXPORT_SYMBOL(dma_fence_set_deadline);
+
 /**
  * dma_fence_describe - Dump fence description into seq_file
  * @fence: the fence to describe
diff --git 

[Freedreno] [PATCH v8 00/16] dma-fence: Deadline awareness

2023-02-28 Thread Rob Clark
From: Rob Clark 

This series adds a deadline hint to fences, so realtime deadlines
such as vblank can be communicated to the fence signaller for power/
frequency management decisions.

This is partially inspired by a trick i915 does, but implemented
via dma-fence for a couple of reasons:

1) To continue to be able to use the atomic helpers
2) To support cases where display and gpu are different drivers

This iteration adds a dma-fence ioctl to set a deadline (both to
support igt-tests, and compositors which delay decisions about which
client buffer to display), and a sw_sync ioctl to read back the
deadline.  IGT tests utilizing these can be found at:

  https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline


v1: https://patchwork.freedesktop.org/series/93035/
v2: Move filtering out of later deadlines to fence implementation
to avoid increasing the size of dma_fence
v3: Add support in fence-array and fence-chain; Add some uabi to
support igt tests and userspace compositors.
v4: Rebase, address various comments, and add syncobj deadline
support, and sync_file EPOLLPRI based on experience with perf/
freq issues with clvk compute workloads on i915 (anv)
v5: Clarify that this is a hint as opposed to a more hard deadline
guarantee, switch to using u64 ns values in UABI (still absolute
CLOCK_MONOTONIC values), drop syncobj related cap and driver
feature flag in favor of allowing count_handles==0 for probing
kernel support.
v6: Re-work vblank helper to calculate time of _start_ of vblank,
and work correctly if the last vblank event was more than a
frame ago.  Add (mostly unrelated) drm/msm patch which also
uses the vblank helper.  Use dma_fence_chain_contained().  More
verbose syncobj UABI comments.  Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
v7: Fix kbuild complaints about vblank helper.  Add more docs.
v8: Add patch to surface sync_file UAPI, and more docs updates.

Rob Clark (16):
  dma-buf/dma-fence: Add deadline awareness
  dma-buf/fence-array: Add fence deadline support
  dma-buf/fence-chain: Add fence deadline support
  dma-buf/dma-resv: Add a way to set fence deadline
  dma-buf/sync_file: Surface sync-file uABI
  dma-buf/sync_file: Add SET_DEADLINE ioctl
  dma-buf/sync_file: Support (E)POLLPRI
  dma-buf/sw_sync: Add fence deadline support
  drm/scheduler: Add fence deadline support
  drm/syncobj: Add deadline support for syncobj waits
  drm/vblank: Add helper to get next vblank time
  drm/atomic-helper: Set fence deadline for vblank
  drm/msm: Add deadline based boost support
  drm/msm: Add wait-boost support
  drm/msm/atomic: Switch to vblank_start helper
  drm/i915: Add deadline based boost support

 Documentation/driver-api/dma-buf.rst| 16 -
 drivers/dma-buf/dma-fence-array.c   | 11 
 drivers/dma-buf/dma-fence-chain.c   | 12 
 drivers/dma-buf/dma-fence.c | 60 ++
 drivers/dma-buf/dma-resv.c  | 22 +++
 drivers/dma-buf/sw_sync.c   | 81 +
 drivers/dma-buf/sync_debug.h|  2 +
 drivers/dma-buf/sync_file.c | 27 +
 drivers/gpu/drm/drm_atomic_helper.c | 36 +++
 drivers/gpu/drm/drm_syncobj.c   | 64 +++
 drivers/gpu/drm/drm_vblank.c| 53 +---
 drivers/gpu/drm/i915/i915_request.c | 20 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 15 -
 drivers/gpu/drm/msm/msm_atomic.c|  8 ++-
 drivers/gpu/drm/msm/msm_drv.c   | 12 ++--
 drivers/gpu/drm/msm/msm_fence.c | 74 ++
 drivers/gpu/drm/msm/msm_fence.h | 20 ++
 drivers/gpu/drm/msm/msm_gem.c   |  5 ++
 drivers/gpu/drm/msm/msm_kms.h   |  8 ---
 drivers/gpu/drm/scheduler/sched_fence.c | 46 ++
 drivers/gpu/drm/scheduler/sched_main.c  |  2 +-
 include/drm/drm_vblank.h|  1 +
 include/drm/gpu_scheduler.h | 17 ++
 include/linux/dma-fence.h   | 22 +++
 include/linux/dma-resv.h|  2 +
 include/uapi/drm/drm.h  | 17 ++
 include/uapi/drm/msm_drm.h  | 14 -
 include/uapi/linux/sync_file.h  | 57 ++---
 28 files changed, 646 insertions(+), 78 deletions(-)

-- 
2.39.1
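
Since v5 the UABI takes the deadline as a u64 nanosecond value that is still an absolute CLOCK_MONOTONIC time. A minimal userspace sketch of producing such a value follows. The helper names (timespec_to_ns, monotonic_now_ns, deadline_after_ns) are made up for illustration; only clock_gettime() and CLOCK_MONOTONIC are real POSIX interfaces. How the value is handed to the kernel is defined by the sync_file and syncobj patches in the series and is deliberately left out here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and CLOCK_MONOTONIC */
#endif
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Flatten a timespec into a single nanosecond count. */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/* Current CLOCK_MONOTONIC time in ns; returns 0 if the clock read fails. */
static uint64_t monotonic_now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;
	return timespec_to_ns(&ts);
}

/*
 * Absolute deadline that lies delta_ns after now_ns.  Passing delta_ns == 0
 * gives "now", which the docs describe as an immediate deadline hint.
 */
static uint64_t deadline_after_ns(uint64_t now_ns, uint64_t delta_ns)
{
	return now_ns + delta_ns;
}
```

A compositor would typically derive delta_ns from its own estimate of the next vblank or of the start of its composition cycle; that estimate is outside the scope of this sketch.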



Re: [Freedreno] [PATCH v4 06/14] dma-buf/sync_file: Support (E)POLLPRI

2023-02-28 Thread Rob Clark
On Tue, Feb 28, 2023 at 6:30 AM Sebastian Wick
 wrote:
>
> On Tue, Feb 28, 2023 at 12:48 AM Rob Clark  wrote:
> >
> > On Mon, Feb 27, 2023 at 2:44 PM Sebastian Wick
> >  wrote:
> > >
> > > On Mon, Feb 27, 2023 at 11:20 PM Rob Clark  wrote:
> > > >
> > > > On Mon, Feb 27, 2023 at 1:36 PM Rodrigo Vivi  
> > > > wrote:
> > > > >
> > > > > On Fri, Feb 24, 2023 at 09:59:57AM -0800, Rob Clark wrote:
> > > > > > On Fri, Feb 24, 2023 at 7:27 AM Luben Tuikov  
> > > > > > wrote:
> > > > > > >
> > > > > > > On 2023-02-24 06:37, Tvrtko Ursulin wrote:
> > > > > > > >
> > > > > > > > On 24/02/2023 11:00, Pekka Paalanen wrote:
> > > > > > > >> On Fri, 24 Feb 2023 10:50:51 +
> > > > > > > >> Tvrtko Ursulin  wrote:
> > > > > > > >>
> > > > > > > >>> On 24/02/2023 10:24, Pekka Paalanen wrote:
> > > > > > >  On Fri, 24 Feb 2023 09:41:46 +
> > > > > > >  Tvrtko Ursulin  wrote:
> > > > > > > 
> > > > > > > > On 24/02/2023 09:26, Pekka Paalanen wrote:
> > > > > > > >> On Thu, 23 Feb 2023 10:51:48 -0800
> > > > > > > >> Rob Clark  wrote:
> > > > > > > >>
> > > > > > > >>> On Thu, Feb 23, 2023 at 1:38 AM Pekka Paalanen 
> > > > > > > >>>  wrote:
> > > > > > > 
> > > > > > >  On Wed, 22 Feb 2023 07:37:26 -0800
> > > > > > >  Rob Clark  wrote:
> > > > > > > 
> > > > > > > > On Wed, Feb 22, 2023 at 1:49 AM Pekka Paalanen 
> > > > > > > >  wrote:
> > > > > > > >>
> > > > > > > >> ...
> > > > > > > >>
> > > > > > > >> On another matter, if the application uses 
> > > > > > > >> SET_DEADLINE with one
> > > > > > > >> timestamp, and the compositor uses SET_DEADLINE on the 
> > > > > > > >> same thing with
> > > > > > > >> another timestamp, what should happen?
> > > > > > > >
> > > > > > > > The expectation is that many deadline hints can be set 
> > > > > > > > on a fence.
> > > > > > > > The fence signaller should track the soonest deadline.
> > > > > > > 
> > > > > > >  You need to document that as UAPI, since it is 
> > > > > > >  observable to userspace.
> > > > > > >  It would be bad if drivers or subsystems would differ in 
> > > > > > >  behaviour.
> > > > > > > 
> > > > > > > >>>
> > > > > > > >>> It is in the end a hint.  It is about giving the driver 
> > > > > > > >>> more
> > > > > > > >>> information so that it can make better choices.  But the 
> > > > > > > >>> driver is
> > > > > > > >>> even free to ignore it.  So maybe "expectation" is too 
> > > > > > > >>> strong of a
> > > > > > > >>> word.  Rather, any other behavior doesn't really make 
> > > > > > > >>> sense.  But it
> > > > > > > >>> could end up being dictated by how the hw and/or fw works.
> > > > > > > >>
> > > > > > > >> It will stop being a hint once it has been implemented and 
> > > > > > > >> used in the
> > > > > > > >> wild long enough. The kernel userspace regression rules 
> > > > > > > >> make sure of
> > > > > > > >> that.
> > > > > > > >
> > > > > > > > > Yeah, tricky and maybe a gray area in this case. I think we alluded
> > > > > > > > > elsewhere in the thread that renaming the thing might be an option.
> > > > > > > >
> > > > > > > > So maybe instead of deadline, which is a very strong word, 
> > > > > > > > use something
> > > > > > > > along the lines of "present time hint", or "signalled time 
> > > > > > > > hint"? Maybe
> > > > > > > > reads clumsy. Just throwing some ideas for a start.
> > > > > > > 
> > > > > > >  You can try, but I fear that if it ever changes behaviour and
> > > > > > >  someone notices that, it's labelled as a kernel regression. 
> > > > > > >  I don't
> > > > > > >  think documentation has ever been the authoritative 
> > > > > > >  definition of UABI
> > > > > > >  in Linux, it just guides drivers and userspace towards a 
> > > > > > >  common
> > > > > > >  understanding and common usage patterns.
> > > > > > > 
> > > > > > >  So even if the UABI contract is not documented (ugh), you 
> > > > > > >  need to be
> > > > > > >  prepared to set the UABI contract through kernel 
> > > > > > >  implementation.
> > > > > > > >>>
> > > > > > > >>> To be the devil's advocate it probably wouldn't be an ABI 
> > > > > > > >>> regression but
> > > > > > > > >>> just a regression. Same way as what nice(2) priorities mean 
> > > > > > > >>> hasn't
> > > > > > > >>> always been the same over the years, I don't think there is a 
> > > > > > > >>> strict
> > > > > > > >>> contract.
> > > > > > > >>>
> > > > > > > >>> Having said that, it may be different with latency sensitive 
> > > > > > > >>> stuff such
> > > > > > > >>> as UIs though since it is very observable and can be very 
> > > > > > > >>> painful to users.
> 
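
The expectation discussed above, that many hints may be set on one fence and the signaller tracks the soonest, could look roughly like this on the signaller side. This is a single-threaded userspace sketch with hypothetical names (struct deadline_tracker, deadline_tracker_update); a real driver would also need locking, since the docs allow hints to be set in parallel, and it remains free to ignore the hint entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-fence (or per-timeline) state kept by the signaller. */
struct deadline_tracker {
	bool has_deadline;
	int64_t deadline_ns;	/* earliest hint seen so far, absolute time */
};

/*
 * Record a new hint.  Returns true only when the tracked deadline moved
 * earlier, i.e. when the signaller may want to re-evaluate boosting.
 * Later or equal hints are filtered out and leave the state unchanged.
 */
static bool deadline_tracker_update(struct deadline_tracker *t,
				    int64_t deadline_ns)
{
	if (t->has_deadline && t->deadline_ns <= deadline_ns)
		return false;

	t->has_deadline = true;
	t->deadline_ns = deadline_ns;
	return true;
}
```

With this shape, an application hint and a compositor hint on the same fence resolve to whichever is earlier, regardless of the order in which they arrive.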

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Konrad Dybcio



On 28.02.2023 21:48, Akhil P Oommen wrote:
> On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
>> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>>> On 28.02.2023 21:23, Akhil P Oommen wrote:
 On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible) to other tunables.
>
> Sort the variable definition and assignment alphabetically.
 Sorting based on decreasing order of line length is more readable, isn't it?
>>> I can do that.
>>>
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> Set default values for all of the tunables to zero, as they should be.
>
> Values were validated against downstream and will be fixed up in
> separate commits so as not to make this one even more messy.
>
> A618 remains untouched (left at hw defaults) in this patch.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
> ---
>  1 file changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..bdae341e0a7c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + /* Unknown, introduced with A640/680 */
>   u32 amsbc = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR 
> bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) 
> and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. The default values (when HBB 
> is
> +  * not specified) are 0, 0.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu)) {
> + /* HBB = 14 */
> + hbb_lo = 1;
> + }
> +
> + if (adreno_is_a630(adreno_gpu)) {
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
> +
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
>   amsbc = 1;
> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> + /* HBB = 16 */
> + hbb_lo = 3;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
> + /* HBB is unset in downstream DTS, defaulting to 0 */
 This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, 
 hbb configurations were moved to the driver from DT in recent downstream 
 kernels.
>>> Right, seems to have happened with msm-5.10. Though a random kernel I
>>> grabbed seems to suggest it's 15 and not 14?
>>>
>>> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
>> We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 
>> 7c3 is only supported on LP4, we can hardcode 14 here.
Okay, I see.

>> In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect 
>> ddrtype. If we can get something like that in upstream, we should implement 
>> a similar logic here.
Yeah, I mentioned it here [1], but I doubt it'd be implemented,
given what Krzysztof pointed out.

>>
>> -Akhil.
> Also, I haven't closely reviewed the other targets' configurations you updated,
> but it is a good idea to leave the existing configurations here as is in this
> refactor patch. Any update should be a separate patch.
Sure, will do.

Konrad

[1] https://github.com/devicetree-org/devicetree-specification/issues/62
> 
> -Akhil.
>>> Konrad
 -Akhil.
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MO

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:14 AM, Akhil P Oommen wrote:
> On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>> On 28.02.2023 21:23, Akhil P Oommen wrote:
>>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
 Rename lower_bit to hbb_lo and explain what it signifies.
 Add explanations (wherever possible) to other tunables.

 Sort the variable definition and assignment alphabetically.
>>> Sorting based on decreasing order of line length is more readable, isn't it?
>> I can do that.
>>
 Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
 Set default values for all of the tunables to zero, as they should be.

 Values were validated against downstream and will be fixed up in
 separate commits so as not to make this one even more messy.

 A618 remains untouched (left at hw defaults) in this patch.

 Signed-off-by: Konrad Dybcio 
 ---
  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
 ---
  1 file changed, 45 insertions(+), 10 deletions(-)

 diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
 b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
 index c5f5d0bb3fdc..bdae341e0a7c 100644
 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
 +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
 @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
  {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 -  u32 lower_bit = 2;
 +  /* Unknown, introduced with A640/680 */
u32 amsbc = 0;
 +  /*
 +   * The Highest Bank Bit value represents the bit of the highest DDR 
 bank.
 +   * We then subtract 13 from it (13 is the minimum value allowed by hw) 
 and
 +   * write the lowest two bits of the remaining value as hbb_lo and the
 +   * one above it as hbb_hi to the hardware. The default values (when HBB 
 is
 +   * not specified) are 0, 0.
 +   */
 +  u32 hbb_hi = 0;
 +  u32 hbb_lo = 0;
 +  /* Whether the minimum access length is 64 bits */
 +  u32 min_acc_len = 0;
 +  /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
u32 rgb565_predicator = 0;
 +  /* Unknown, introduced with A650 family */
u32 uavflagprd_inv = 0;
 +  /* Entirely magic, per-GPU-gen value */
 +  u32 ubwc_mode = 0;
  
/* a618 is using the hw default values */
if (adreno_is_a618(adreno_gpu))
return;
  
 -  if (adreno_is_a640_family(adreno_gpu))
 +  if (adreno_is_a619(adreno_gpu)) {
 +  /* HBB = 14 */
 +  hbb_lo = 1;
 +  }
 +
 +  if (adreno_is_a630(adreno_gpu)) {
 +  /* HBB = 15 */
 +  hbb_lo = 2;
 +  }
 +
 +  if (adreno_is_a640_family(adreno_gpu)) {
amsbc = 1;
 +  /* HBB = 15 */
 +  hbb_lo = 2;
 +  }
  
if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
 -  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
 -  lower_bit = 3;
amsbc = 1;
 +  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
 +  /* HBB = 16 */
 +  hbb_lo = 3;
rgb565_predicator = 1;
uavflagprd_inv = 2;
}
  
if (adreno_is_7c3(adreno_gpu)) {
 -  lower_bit = 1;
amsbc = 1;
 +  /* HBB is unset in downstream DTS, defaulting to 0 */
>>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>>> configurations were moved to the driver from DT in recent downstream 
>>> kernels.
>> Right, seems to have happened with msm-5.10. Though a random kernel I
>> grabbed seems to suggest it's 15 and not 14?
>>
>> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
> We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 
> 7c3 is only supported on LP4, we can hardcode 14 here.
> In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect 
> ddrtype. If we can get something like that in upstream, we should implement a 
> similar logic here.
>
> -Akhil.
Also, I haven't closely reviewed the other targets' configurations you updated,
but it is a good idea to leave the existing configurations here as is in this
refactor patch. Any update should be a separate patch.

-Akhil.
>> Konrad
>>> -Akhil.
rgb565_predicator = 1;
uavflagprd_inv = 2;
}
  
gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
 -  rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
 -  gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
 -  gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
 -  uavflagprd_inv << 4 | lower_bit << 1);
 -  gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
 +   

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 3/1/2023 2:10 AM, Konrad Dybcio wrote:
>
> On 28.02.2023 21:23, Akhil P Oommen wrote:
>> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>>> Rename lower_bit to hbb_lo and explain what it signifies.
>>> Add explanations (wherever possible) to other tunables.
>>>
>>> Sort the variable definition and assignment alphabetically.
>> Sorting based on decreasing order of line length is more readable, isn't it?
> I can do that.
>
>>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>>> Set default values for all of the tunables to zero, as they should be.
>>>
>>> Values were validated against downstream and will be fixed up in
>>> separate commits so as not to make this one even more messy.
>>>
>>> A618 remains untouched (left at hw defaults) in this patch.
>>>
>>> Signed-off-by: Konrad Dybcio 
>>> ---
>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
>>> ---
>>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>>  {
>>> struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>> -   u32 lower_bit = 2;
>>> +   /* Unknown, introduced with A640/680 */
>>> u32 amsbc = 0;
>>> +   /*
>>> +* The Highest Bank Bit value represents the bit of the highest DDR 
>>> bank.
>>> +* We then subtract 13 from it (13 is the minimum value allowed by hw) 
>>> and
>>> +* write the lowest two bits of the remaining value as hbb_lo and the
>>> +* one above it as hbb_hi to the hardware. The default values (when HBB 
>>> is
>>> +* not specified) are 0, 0.
>>> +*/
>>> +   u32 hbb_hi = 0;
>>> +   u32 hbb_lo = 0;
>>> +   /* Whether the minimum access length is 64 bits */
>>> +   u32 min_acc_len = 0;
>>> +   /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>> u32 rgb565_predicator = 0;
>>> +   /* Unknown, introduced with A650 family */
>>> u32 uavflagprd_inv = 0;
>>> +   /* Entirely magic, per-GPU-gen value */
>>> +   u32 ubwc_mode = 0;
>>>  
>>> /* a618 is using the hw default values */
>>> if (adreno_is_a618(adreno_gpu))
>>> return;
>>>  
>>> -   if (adreno_is_a640_family(adreno_gpu))
>>> +   if (adreno_is_a619(adreno_gpu)) {
>>> +   /* HBB = 14 */
>>> +   hbb_lo = 1;
>>> +   }
>>> +
>>> +   if (adreno_is_a630(adreno_gpu)) {
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>> +
>>> +   if (adreno_is_a640_family(adreno_gpu)) {
>>> amsbc = 1;
>>> +   /* HBB = 15 */
>>> +   hbb_lo = 2;
>>> +   }
>>>  
>>> if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>> -   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> -   lower_bit = 3;
>>> amsbc = 1;
>>> +   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>>> +   /* HBB = 16 */
>>> +   hbb_lo = 3;
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>  
>>> if (adreno_is_7c3(adreno_gpu)) {
>>> -   lower_bit = 1;
>>> amsbc = 1;
>>> +   /* HBB is unset in downstream DTS, defaulting to 0 */
>> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
>> configurations were moved to the driver from DT in recent downstream kernels.
> Right, seems to have happened with msm-5.10. Though a random kernel I
> grabbed seems to suggest it's 15 and not 14?
>
> https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710
We override that with 14 in a6xx_init() for LP4 platforms dynamically. Since 
7c3 is only supported on LP4, we can hardcode 14 here.
In the downstream kernel, there is an api (of_fdt_get_ddrtype()) to detect 
ddrtype. If we can get something like that in upstream, we should implement a 
similar logic here.

-Akhil.
>
> Konrad
>> -Akhil.
>>> rgb565_predicator = 1;
>>> uavflagprd_inv = 2;
>>> }
>>>  
>>> gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>>> -   rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>>> -   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>>> -   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>>> -   uavflagprd_inv << 4 | lower_bit << 1);
>>> -   gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>>> + rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>> +
>>> +   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>>> + min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>>> +
>>> +   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |

Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Konrad Dybcio



On 28.02.2023 21:23, Akhil P Oommen wrote:
> On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
>> Rename lower_bit to hbb_lo and explain what it signifies.
>> Add explanations (wherever possible) to other tunables.
>>
>> Sort the variable definition and assignment alphabetically.
> Sorting based on decreasing order of line length is more readable, isn't it?
I can do that.

>>
>> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
>> Set default values for all of the tunables to zero, as they should be.
>>
>> Values were validated against downstream and will be fixed up in
>> separate commits so as not to make this one even more messy.
>>
>> A618 remains untouched (left at hw defaults) in this patch.
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
>> ---
>>  1 file changed, 45 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> index c5f5d0bb3fdc..bdae341e0a7c 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>  {
>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>> -u32 lower_bit = 2;
>> +/* Unknown, introduced with A640/680 */
>>  u32 amsbc = 0;
>> +/*
>> + * The Highest Bank Bit value represents the bit of the highest DDR 
>> bank.
>> + * We then subtract 13 from it (13 is the minimum value allowed by hw) 
>> and
>> + * write the lowest two bits of the remaining value as hbb_lo and the
>> + * one above it as hbb_hi to the hardware. The default values (when HBB 
>> is
>> + * not specified) are 0, 0.
>> + */
>> +u32 hbb_hi = 0;
>> +u32 hbb_lo = 0;
>> +/* Whether the minimum access length is 64 bits */
>> +u32 min_acc_len = 0;
>> +/* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>>  u32 rgb565_predicator = 0;
>> +/* Unknown, introduced with A650 family */
>>  u32 uavflagprd_inv = 0;
>> +/* Entirely magic, per-GPU-gen value */
>> +u32 ubwc_mode = 0;
>>  
>>  /* a618 is using the hw default values */
>>  if (adreno_is_a618(adreno_gpu))
>>  return;
>>  
>> -if (adreno_is_a640_family(adreno_gpu))
>> +if (adreno_is_a619(adreno_gpu)) {
>> +/* HBB = 14 */
>> +hbb_lo = 1;
>> +}
>> +
>> +if (adreno_is_a630(adreno_gpu)) {
>> +/* HBB = 15 */
>> +hbb_lo = 2;
>> +}
>> +
>> +if (adreno_is_a640_family(adreno_gpu)) {
>>  amsbc = 1;
>> +/* HBB = 15 */
>> +hbb_lo = 2;
>> +}
>>  
>>  if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>> -/* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>> -lower_bit = 3;
>>  amsbc = 1;
>> +/* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>> +/* HBB = 16 */
>> +hbb_lo = 3;
>>  rgb565_predicator = 1;
>>  uavflagprd_inv = 2;
>>  }
>>  
>>  if (adreno_is_7c3(adreno_gpu)) {
>> -lower_bit = 1;
>>  amsbc = 1;
>> +/* HBB is unset in downstream DTS, defaulting to 0 */
> This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
> configurations were moved to the driver from DT in recent downstream kernels.
Right, seems to have happened with msm-5.10. Though a random kernel I
grabbed seems to suggest it's 15 and not 14?

https://github.com/sonyxperiadev/kernel/blob/aosp/K.P.1.0.r1/drivers/gpu/msm/adreno-gpulist.h#L1710

Konrad
> 
> -Akhil.
>>  rgb565_predicator = 1;
>>  uavflagprd_inv = 2;
>>  }
>>  
>>  gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>> -rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>> -uavflagprd_inv << 4 | lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>> +  rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
>> +  min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
>> +  min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
>> +  uavflagprd_inv << 4 | min_acc_len << 3 |
>> +  hbb_lo << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 
>> 21);
>>  }
>>  
>>  static int a6xx_cp_init(struct msm_gpu *gpu)
>>
> 


Re: [Freedreno] [PATCH v3 04/15] drm/msm/a6xx: Extend and explain UBWC config

2023-02-28 Thread Akhil P Oommen
On 2/23/2023 5:36 PM, Konrad Dybcio wrote:
> Rename lower_bit to hbb_lo and explain what it signifies.
> Add explanations (wherever possible) to other tunables.
>
> Sort the variable definition and assignment alphabetically.
Sorting based on decreasing order of line length is more readable, isn't it?
>
> Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
> Set default values for all of the tunables to zero, as they should be.
>
> Values were validated against downstream and will be fixed up in
> separate commits so as not to make this one even more messy.
>
> A618 remains untouched (left at hw defaults) in this patch.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 55 
> ---
>  1 file changed, 45 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..bdae341e0a7c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,39 +786,74 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + /* Unknown, introduced with A640/680 */
>   u32 amsbc = 0;
> + /*
> +  * The Highest Bank Bit value represents the bit of the highest DDR 
> bank.
> +  * We then subtract 13 from it (13 is the minimum value allowed by hw) 
> and
> +  * write the lowest two bits of the remaining value as hbb_lo and the
> +  * one above it as hbb_hi to the hardware. The default values (when HBB 
> is
> +  * not specified) are 0, 0.
> +  */
> + u32 hbb_hi = 0;
> + u32 hbb_lo = 0;
> + /* Whether the minimum access length is 64 bits */
> + u32 min_acc_len = 0;
> + /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
>   u32 rgb565_predicator = 0;
> + /* Unknown, introduced with A650 family */
>   u32 uavflagprd_inv = 0;
> + /* Entirely magic, per-GPU-gen value */
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a619(adreno_gpu)) {
> + /* HBB = 14 */
> + hbb_lo = 1;
> + }
> +
> + if (adreno_is_a630(adreno_gpu)) {
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
> +
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + /* HBB = 15 */
> + hbb_lo = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
> - /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> - lower_bit = 3;
>   amsbc = 1;
> + /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> + /* HBB = 16 */
> + hbb_lo = 3;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
> + /* HBB is unset in downstream DTS, defaulting to 0 */
This is incorrect. For 7c3 hbb value is 14. So hbb_lo should be 1. FYI, hbb 
configurations were moved to the driver from DT in recent downstream kernels.

-Akhil.
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
> +   min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   hbb_lo << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 
> 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
>
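
The encoding described in the patch comment (subtract 13 from the Highest Bank Bit value, then split the remainder into a two-bit low part and a one-bit high part) can be sketched as below. This is only an illustration under those stated assumptions: `struct hbb_fields` and `hbb_encode()` are hypothetical names that do not exist in the driver, and the patch itself hardcodes hbb_lo per GPU rather than computing it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the msm driver. */
struct hbb_fields {
	uint32_t hi;	/* bit 2 of (HBB - 13) */
	uint32_t lo;	/* bits 1:0 of (HBB - 13) */
};

static struct hbb_fields hbb_encode(uint32_t hbb)
{
	struct hbb_fields f;
	uint32_t rem;

	/* 13 is the minimum value allowed by hw, per the patch comment. */
	assert(hbb >= 13);
	rem = hbb - 13;
	/* Only three bits (hbb_hi plus two bits of hbb_lo) are available. */
	assert(rem < 8);

	f.lo = rem & 0x3;
	f.hi = (rem >> 2) & 0x1;
	return f;
}
```

Under this sketch, HBB = 14 yields hbb_lo = 1 (the value Akhil gives for 7c3), HBB = 15 yields hbb_lo = 2 and HBB = 16 yields hbb_lo = 3, all with hbb_hi = 0.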



Re: [Freedreno] [PATCH v7 07/15] dma-buf/sw_sync: Add fence deadline support

2023-02-28 Thread Rob Clark
On Tue, Feb 28, 2023 at 1:23 AM Pekka Paalanen  wrote:
>
> On Mon, 27 Feb 2023 11:35:13 -0800
> Rob Clark  wrote:
>
> > From: Rob Clark 
> >
> > This consists of simply storing the most recent deadline, and adding an
> > ioctl to retrieve the deadline.  This can be used in conjunction with
> > the SET_DEADLINE ioctl on a fence fd for testing.  Ie. create various
> > sw_sync fences, merge them into a fence-array, set deadline on the
> > fence-array and confirm that it is propagated properly to each fence.
> >
> > v2: Switch UABI to express deadline as u64
> > v3: More verbose UAPI docs, show how to convert from timespec
> >
> > Signed-off-by: Rob Clark 
> > Reviewed-by: Christian König 
> > ---
> >  drivers/dma-buf/sw_sync.c  | 58 ++
> >  drivers/dma-buf/sync_debug.h   |  2 ++
> >  include/uapi/linux/sync_file.h |  6 +++-
> >  3 files changed, 65 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> > index 348b3a9170fa..3e2315ee955b 100644
> > --- a/drivers/dma-buf/sw_sync.c
> > +++ b/drivers/dma-buf/sw_sync.c
> > @@ -52,12 +52,28 @@ struct sw_sync_create_fence_data {
> >   __s32   fence; /* fd of new fence */
> >  };
> >
> > +/**
> > + * struct sw_sync_get_deadline - get the deadline hint of a sw_sync fence
> > + * @deadline_ns: absolute time of the deadline
> > + * @pad: must be zero
> > + * @fence_fd: the sw_sync fence fd (in)
> > + *
> > + * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
>
> Hi,
>
> the commit message explains this returns the "most recent" deadline,
> but the doc here forgets to mention that. I suppose that means the
> most recently set deadline and not the deadline furthest forward in
> time (largest value).
>
> Is "most recent" the appropriate behaviour when multiple deadlines have
> been set? Would you not want the earliest deadline set so far instead?

It's not what a "normal" implementation of ->set_deadline() would do.
But it was useful for determining that the deadline propagates
correctly through composite (array/chain) fences.

I guess I could change the test to work with a more normal
->set_deadline() implementation (which would just track the nearest
(in time) deadline).

> What if none has been set?

you'd get zero.. I suppose I could make it return an error instead..

BR,
-R
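
A more conventional ->set_deadline() of the kind mentioned above, one that only tracks the nearest deadline and can tell "never set" apart from zero, might look roughly like this. It is a userspace model, not kernel code: it uses plain int64_t nanoseconds instead of ktime_t, shows no locking, and `struct deadline_tracker` and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; a driver would use ktime_t and a lock. */
struct deadline_tracker {
	bool has_deadline;	/* false until the first hint arrives */
	int64_t deadline_ns;	/* earliest deadline seen so far */
};

/* Keep only the earliest (nearest in time) deadline. */
static void tracker_set_deadline(struct deadline_tracker *t, int64_t deadline_ns)
{
	if (!t->has_deadline || deadline_ns < t->deadline_ns) {
		t->deadline_ns = deadline_ns;
		t->has_deadline = true;
	}
}

/* Returns false when no deadline was ever set, instead of reporting zero. */
static bool tracker_get_deadline(const struct deadline_tracker *t, int64_t *out_ns)
{
	if (!t->has_deadline)
		return false;
	*out_ns = t->deadline_ns;
	return true;
}
```

With this behaviour a propagation test would have to set the same deadline on the composite fence and check it on each member, since a later, larger value would no longer overwrite an earlier one.
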

> > + */
> > +struct sw_sync_get_deadline {
> > + __u64   deadline_ns;
> > + __u32   pad;
> > + __s32   fence_fd;
> > +};
> > +
> >  #define SW_SYNC_IOC_MAGIC 'W'
> >
> >  #define SW_SYNC_IOC_CREATE_FENCE _IOWR(SW_SYNC_IOC_MAGIC, 0,\
> >   struct sw_sync_create_fence_data)
> >
> >  #define SW_SYNC_IOC_INC  _IOW(SW_SYNC_IOC_MAGIC, 1, 
> > __u32)
> > +#define SW_SYNC_GET_DEADLINE _IOWR(SW_SYNC_IOC_MAGIC, 2, \
> > + struct sw_sync_get_deadline)
> >
> >  static const struct dma_fence_ops timeline_fence_ops;
> >
> > @@ -171,6 +187,13 @@ static void timeline_fence_timeline_value_str(struct 
> > dma_fence *fence,
> >   snprintf(str, size, "%d", parent->value);
> >  }
> >
> > +static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t 
> > deadline)
> > +{
> > + struct sync_pt *pt = dma_fence_to_sync_pt(fence);
> > +
> > + pt->deadline = deadline;
> > +}
> > +
> >  static const struct dma_fence_ops timeline_fence_ops = {
> >   .get_driver_name = timeline_fence_get_driver_name,
> >   .get_timeline_name = timeline_fence_get_timeline_name,
> > @@ -179,6 +202,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
> >   .release = timeline_fence_release,
> >   .fence_value_str = timeline_fence_value_str,
> >   .timeline_value_str = timeline_fence_timeline_value_str,
> > + .set_deadline = timeline_fence_set_deadline,
> >  };
> >
> >  /**
> > @@ -387,6 +411,37 @@ static long sw_sync_ioctl_inc(struct sync_timeline 
> > *obj, unsigned long arg)
> >   return 0;
> >  }
> >
> > +static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned 
> > long arg)
> > +{
> > + struct sw_sync_get_deadline data;
> > + struct dma_fence *fence;
> > + struct sync_pt *pt;
> > +
> > + if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
> > + return -EFAULT;
> > +
> > + if (data.deadline_ns || data.pad)
> > + return -EINVAL;
> > +
> > + fence = sync_file_get_fence(data.fence_fd);
> > + if (!fence)
> > + return -EINVAL;
> > +
> > + pt = dma_fence_to_sync_pt(fence);
> > + if (!pt)
> > + return -EINVAL;
> > +
> > +
> > + data.deadline_ns = ktime_to_ns(pt->deadline);
> > +
> > + dma_fence_put(fence);
> > +
> > + if (copy_to_user((void __user *)arg, &data, sizeof(data)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> >  static long sw_sync_ioctl(struct file *file, unsigned int cmd,
> > unsigned long arg)
> >  {
> > @@ -399,6 +4

Re: [Freedreno] [PATCH v7 05/15] dma-buf/sync_file: Add SET_DEADLINE ioctl

2023-02-28 Thread Rob Clark
On Tue, Feb 28, 2023 at 1:22 AM Pekka Paalanen  wrote:
>
> On Mon, 27 Feb 2023 11:35:11 -0800
> Rob Clark  wrote:
>
> > From: Rob Clark 
> >
> > The initial purpose is for igt tests, but this would also be useful for
> > compositors that wait until close to vblank deadline to make decisions
> > about which frame to show.
> >
> > The igt tests can be found at:
> >
> > https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline
> >
> > v2: Clarify the timebase, add link to igt tests
> > v3: Use u64 value in ns to express deadline.
> > v4: More doc
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/dma-buf/dma-fence.c|  3 ++-
> >  drivers/dma-buf/sync_file.c| 19 +++
> >  include/uapi/linux/sync_file.h | 22 ++
> >  3 files changed, 43 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index e103e821d993..7761ceeae620 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -933,7 +933,8 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >   *   the GPU's devfreq to reduce frequency, when in fact the opposite is 
> > what is
> >   *   needed.
> >   *
> > - * To this end, deadline hint(s) can be set on a &dma_fence via 
> > &dma_fence_set_deadline.
> > + * To this end, deadline hint(s) can be set on a &dma_fence via 
> > &dma_fence_set_deadline
> > + * (or indirectly via userspace facing ioctls like &SYNC_IOC_SET_DEADLINE).
> >   * The deadline hint provides a way for the waiting driver, or userspace, 
> > to
> >   * convey an appropriate sense of urgency to the signaling driver.
>
> Hi,
>
> when the kernel HTML doc is generated, I assume the above becomes a
> link to "DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence", right?

Heh, kernel docs completely miss the sync_file uABI.. I'll add a patch
to correct that in order to make these links work properly.

BR,
-R

> >   *
> > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> > index af57799c86ce..418021cfb87c 100644
> > --- a/drivers/dma-buf/sync_file.c
> > +++ b/drivers/dma-buf/sync_file.c
> > @@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct 
> > sync_file *sync_file,
> >   return ret;
> >  }
> >
> > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> > + unsigned long arg)
> > +{
> > + struct sync_set_deadline ts;
> > +
> > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> > + return -EFAULT;
> > +
> > + if (ts.pad)
> > + return -EINVAL;
> > +
> > + dma_fence_set_deadline(sync_file->fence, ns_to_ktime(ts.deadline_ns));
> > +
> > + return 0;
> > +}
> > +
> >  static long sync_file_ioctl(struct file *file, unsigned int cmd,
> >   unsigned long arg)
> >  {
> > @@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned 
> > int cmd,
> >   case SYNC_IOC_FILE_INFO:
> >   return sync_file_ioctl_fence_info(sync_file, arg);
> >
> > + case SYNC_IOC_SET_DEADLINE:
> > + return sync_file_ioctl_set_deadline(sync_file, arg);
> > +
> >   default:
> >   return -ENOTTY;
> >   }
> > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> > index ee2dcfb3d660..49325cf6749b 100644
> > --- a/include/uapi/linux/sync_file.h
> > +++ b/include/uapi/linux/sync_file.h
> > @@ -67,6 +67,21 @@ struct sync_file_info {
> >   __u64   sync_fence_info;
> >  };
> >
> > +/**
> > + * struct sync_set_deadline - set a deadline hint on a fence
> > + * @deadline_ns: absolute time of the deadline
>
> Is it legal to pass zero as deadline_ns?
>
> > + * @pad: must be zero
> > + *
> > + * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)
>
> Does something here provide doc links to "DOC: SYNC_IOC_SET_DEADLINE -
> set a deadline on a fence" and to the "DOC: deadline hints"?
>
> > + */
> > +struct sync_set_deadline {
> > + __u64   deadline_ns;
> > + /* Not strictly needed for alignment but gives some possibility
> > +  * for future extension:
> > +  */
> > + __u64   pad;
> > +};
> > +
> >  #define SYNC_IOC_MAGIC   '>'
> >
> >  /**
> > @@ -95,4 +110,11 @@ struct sync_file_info {
> >   */
> >  #define SYNC_IOC_FILE_INFO   _IOWR(SYNC_IOC_MAGIC, 4, struct 
> > sync_file_info)
> >
> > +/**
> > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> > + *
> > + * Allows userspace to set a deadline on a fence, see 
> > dma_fence_set_deadline()
>
> Does something here provide doc links to struct sync_set_deadline and
> to the "DOC: deadline hints"?
>
> > + */
> > +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)
> > +
> >  #endif /* _UAPI_LINUX_SYNC_H */
>
> With all those links added:
> Acked-by: Pekka Paalanen 
>
>
> Thanks,
> pq
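
Since deadline_ns is an absolute CLOCK_MONOTONIC time in nanoseconds, userspace has to convert from struct timespec before filling struct sync_set_deadline. A minimal sketch of that conversion follows. The helper names are made up for illustration, and the ioctl call itself is left out so the example does not depend on the patched UAPI header.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a normalized, non-negative timespec to nanoseconds. */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
	return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * Compute an absolute deadline offset_ns from now on CLOCK_MONOTONIC.
 * Returns false if the clock could not be read.
 */
static bool deadline_from_now_ns(uint64_t offset_ns, uint64_t *deadline_ns)
{
	struct timespec now;

	if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
		return false;
	*deadline_ns = timespec_to_ns(&now) + offset_ns;
	return true;
}
```

The resulting value would go into deadline_ns, with pad left at zero, before issuing SYNC_IOC_SET_DEADLINE on the fence fd.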


Re: [Freedreno] [PATCH v2] drm/msm/disp/dpu: fix sc7280_pp base offset

2023-02-28 Thread Dmitry Baryshkov

On 27/02/2023 23:36, Kuogee Hsieh wrote:

On sc7280, the pingpong block is used to manage the dither effects
that reduce distortion at the panel. Currently the pingpong-0 base
offset is wrongly set to 0x59000. This mistake does not crash the
system, but it prevents dithering from working. This patch corrects
the sc7280 pingpong-0 block base offset.

Changes in v2:
-- add more detailed information about the pingpong block to the commit text

Fixes: 591e34a091d1 ("drm/msm/disp/dpu1: add support for display for SC7280 target")
Signed-off-by: Kuogee Hsieh 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry



Re: [Freedreno] [PATCH v2] drm/msm/disp/dpu: fix sc7280_pp base offset

2023-02-28 Thread Abhinav Kumar




On 2/27/2023 1:36 PM, Kuogee Hsieh wrote:

On sc7280, the pingpong block is used to manage the dither effects
that reduce distortion at the panel. Currently the pingpong-0 base
offset is wrongly set to 0x59000. This mistake does not crash the
system, but it prevents dithering from working. This patch corrects
the sc7280 pingpong-0 block base offset.

Changes in v2:
-- add more detailed information about the pingpong block to the commit text

Fixes: 591e34a091d1 ("drm/msm/disp/dpu1: add support for display for SC7280 target")
Signed-off-by: Kuogee Hsieh 


Reviewed-by: Abhinav Kumar 


---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 7deffc9f9..286437e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -1707,7 +1707,7 @@ static const struct dpu_pingpong_cfg sm8350_pp[] = {
  };
  
  static const struct dpu_pingpong_cfg sc7280_pp[] = {

-   PP_BLK("pingpong_0", PINGPONG_0, 0x59000, 0, sc7280_pp_sblk, -1, -1),
+   PP_BLK("pingpong_0", PINGPONG_0, 0x69000, 0, sc7280_pp_sblk, -1, -1),
PP_BLK("pingpong_1", PINGPONG_1, 0x6a000, 0, sc7280_pp_sblk, -1, -1),
PP_BLK("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk, -1, -1),
PP_BLK("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, -1),


Re: [Freedreno] [PATCH v7 01/15] dma-buf/dma-fence: Add deadline awareness

2023-02-28 Thread Rob Clark
On Tue, Feb 28, 2023 at 1:21 AM Pekka Paalanen  wrote:
>
> On Mon, 27 Feb 2023 11:35:07 -0800
> Rob Clark  wrote:
>
> > From: Rob Clark 
> >
> > Add a way to hint to the fence signaler of an upcoming deadline, such as
> > vblank, which the fence waiter would prefer not to miss.  This is to aid
> > the fence signaler in making power management decisions, like boosting
> > frequency as the deadline approaches and awareness of missing deadlines
> > so that can be factored in to the frequency scaling.
> >
> > v2: Drop dma_fence::deadline and related logic to filter duplicate
> > deadlines, to avoid increasing dma_fence size.  The fence-context
> > implementation will need similar logic to track deadlines of all
> > the fences on the same timeline.  [ckoenig]
> > v3: Clarify locking wrt. set_deadline callback
> > v4: Clarify in docs comment that this is a hint
> > v5: Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
> > v6: More docs
> >
> > Signed-off-by: Rob Clark 
> > Reviewed-by: Christian König 
> > ---
> >  Documentation/driver-api/dma-buf.rst |  6 +++
> >  drivers/dma-buf/dma-fence.c  | 59 
> >  include/linux/dma-fence.h| 20 ++
> >  3 files changed, 85 insertions(+)
> >
> > diff --git a/Documentation/driver-api/dma-buf.rst 
> > b/Documentation/driver-api/dma-buf.rst
> > index 622b8156d212..183e480d8cea 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -164,6 +164,12 @@ DMA Fence Signalling Annotations
> >  .. kernel-doc:: drivers/dma-buf/dma-fence.c
> > :doc: fence signalling annotation
> >
> > +DMA Fence Deadline Hints
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> > +   :doc: deadline hints
> > +
> >  DMA Fences Functions Reference
> >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> > index 0de0482cd36e..e103e821d993 100644
> > --- a/drivers/dma-buf/dma-fence.c
> > +++ b/drivers/dma-buf/dma-fence.c
> > @@ -912,6 +912,65 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
> > uint32_t count,
> >  }
> >  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
> >
> > +/**
> > + * DOC: deadline hints
> > + *
> > + * In an ideal world, it would be possible to pipeline a workload 
> > sufficiently
> > + * that a utilization based device frequency governor could arrive at a 
> > minimum
> > + * frequency that meets the requirements of the use-case, in order to 
> > minimize
> > + * power consumption.  But in the real world there are many workloads which
> > + * defy this ideal.  For example, but not limited to:
> > + *
> > + * * Workloads that ping-pong between device and CPU, with alternating 
> > periods
> > + *   of CPU waiting for device, and device waiting on CPU.  This can 
> > result in
> > + *   devfreq and cpufreq seeing idle time in their respective domains and 
> > in
> > + *   result reduce frequency.
> > + *
> > + * * Workloads that interact with a periodic time based deadline, such as 
> > double
> > + *   buffered GPU rendering vs vblank sync'd page flipping.  In this 
> > scenario,
> > + *   missing a vblank deadline results in an *increase* in idle time on 
> > the GPU
> > + *   (since it has to wait an additional vblank period), sending a single 
> > to
>
> Hi Rob,
>
> s/single/signal/ ?

oops, yes

> > + *   the GPU's devfreq to reduce frequency, when in fact the opposite is 
> > what is
> > + *   needed.
> > + *
> > + * To this end, deadline hint(s) can be set on a &dma_fence via 
> > &dma_fence_set_deadline.
> > + * The deadline hint provides a way for the waiting driver, or userspace, 
> > to
> > + * convey an appropriate sense of urgency to the signaling driver.
> > + *
> > + * A deadline hint is given in absolute ktime (CLOCK_MONOTONIC for 
> > userspace
> > + * facing APIs).  The time could either be some point in the future (such 
> > as
> > + * the vblank based deadline for page-flipping, or the start of a 
> > compositor's
> > + * composition cycle), or the current time to indicate an immediate 
> > deadline
> > + * hint (Ie. forward progress cannot be made until this fence is signaled).
>
> As "current time" is not a special value, but just an absolute timestamp
> like any other, deadlines already in the past must also be accepted?

Yes, well "current time" is already in the past after the next clock
tick, so deadlines already passed should be accepted.  I've been
trying to avoid advocating zero as a special value, but I guess
realistically we don't have a rollover problem for a couple hundred
years.  In any case, I think `deadline < now` should be allowed (ie.
what if you were preempted in the process of setting a deadline, etc)

I'll try to clarify this in the next version.

BR,
-R

> > + *
> > + * Multiple deadlines may be set on a given fence, even in parallel.  See 
> > the
> > + * documentation for &dma_fence_ops.set_deadline.
> > + *
> > + * The deadline hint is 

Re: [Freedreno] [PATCH 06/10] drm/display/dsc: split DSC 1.2 and DSC 1.1 (pre-SCR) parameters

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> The array of rc_parameters contains a mixture of parameters from DSC 1.1
> and DSC 1.2 standards. Split these two configuration arrays in
> preparation to adding more configuration data.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c  | 127 ++
>  drivers/gpu/drm/i915/display/intel_vdsc.c |  10 +-
>  include/drm/display/drm_dsc_helper.h  |   7 +-
>  3 files changed, 119 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
> b/drivers/gpu/drm/display/drm_dsc_helper.c
> index a6d11f474656..51794b40526a 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -326,11 +326,81 @@ struct rc_parameters_data {
>  
>  #define DSC_BPP(bpp) ((bpp) << 4)
>  
> +static const struct rc_parameters_data rc_parameters_pre_scr[] = {
> +{ DSC_BPP(8), 8,
> + /* 8BPP/8BPC */

I still dislike this indentation...

> + { 512, 12, 6144, 3, 12, 11, 11, {
> + { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
> + { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
> + { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
> + { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(8), 10,
> + /* 8BPP/10BPC */
> + { 512, 12, 6144, 7, 16, 15, 15, {
> + /*
> +  * DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
> +  * VESA DSC 1.1 Table E-5 sets it to 4.
> +  */
> + { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
> + { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
> + { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
> + { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(8), 12,
> + /* 8BPP/12BPC */
> + { 512, 12, 6144, 11, 20, 19, 19, {
> + { 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
> + { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
> + { 11, 17, -8 }, { 11, 18, -10 }, { 13, 19, -10 },
> + { 13, 20, -12 }, { 13, 21, -12 }, { 15, 21, -12 },
> + { 21, 23, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(12), 8,
> + /* 12BPP/8BPC */
> + { 341, 15, 2048, 3, 12, 11, 11, {
> + { 0, 2, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
> + { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
> + { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
> + { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(12), 10,
> + /* 12BPP/10BPC */
> + { 341, 15, 2048, 7, 16, 15, 15, {
> + { 0, 2, 2 }, { 2, 5, 0 }, { 3, 7, 0 }, { 4, 8, -2 },
> + { 6, 9, -4 }, { 7, 10, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
> + { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
> + { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(12), 12,
> + /* 12BPP/12BPC */
> + { 341, 15, 2048, 11, 20, 19, 19, {
> + { 0, 6, 2 }, { 4, 9, 0 }, { 7, 11, 0 }, { 8, 12, -2 },
> + { 10, 13, -4 }, { 11, 14, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
> + { 11, 17, -8 }, { 11, 18, -10 }, { 13, 19, -10 },
> + { 13, 20, -12 }, { 13, 21, -12 }, { 15, 21, -12 },
> + { 21, 23, -12 }
> + }
> + }
> +},
> +{ /* sentinel */ }
> +};
> +
>  /*
>   * Selected Rate Control Related Parameter Recommended Values
>   * from DSC_v1.11 spec & C Model release: DSC_model_20161212
>   */
> -static const struct rc_parameters_data rc_parameters[] = {
> +static const struct rc_parameters_data rc_parameters_1_2_444[] = {
>  { DSC_BPP(6), 8,
>   /* 6BPP/8BPC */
>   { 768, 15, 6144, 3, 13, 11, 11, {
> @@ -390,22 +460,18 @@ static const struct rc_parameters_data rc_parameters[] 
> = {
>   { 512, 12, 6144, 3, 12, 11, 11, {
>   { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
>   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
> - { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
> - { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + { 3, 9, -8 }, { 3, 10, -10 }, { 5, 10, -10 }, { 5, 11, -12 },
> + { 5, 11, -12 }, { 9, 12, -12 }, { 12, 13, -12 }
>   }
>   }
>  },
>  { DSC_BPP(8), 10,
>   /* 8BPP/10BPC */
>   { 512, 12, 6144, 7, 16, 15, 15, {
> - /*
> -  * DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
> -  * VESA DSC 1.1 Table E-5 sets it to 4.
> -  */
> - { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
> + { 0, 8, 2 }, { 4, 8, 0 }, 

Re: [Freedreno] [PATCH 07/10] drm/display/dsc: include the rest of pre-SCR parameters

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> DSC model contains pre-SCR RC parameters for other bpp/bpc combinations,
> include them here for completeness.

Need to run now, note to self:

Does i915 use the arrays to limit the bpp/bpc combos supported by
hardware? Do we need to add separate limiting in i915?

BR,
Jani.



>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c | 72 
>  1 file changed, 72 insertions(+)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
> b/drivers/gpu/drm/display/drm_dsc_helper.c
> index 51794b40526a..1612536014ea 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -327,6 +327,16 @@ struct rc_parameters_data {
>  #define DSC_BPP(bpp) ((bpp) << 4)
>  
>  static const struct rc_parameters_data rc_parameters_pre_scr[] = {
> +{ DSC_BPP(6), 8,
> + /* 6BPP/8BPC */
> + { 683, 15, 6144, 3, 13, 11, 11, {
> + { 0, 2, 0 }, { 1, 4, -2 }, { 3, 6, -2 }, { 4, 6, -4 },
> + { 5, 7, -6 }, { 5, 7, -6 }, { 6, 7, -6 }, { 6, 8, -8 },
> + { 7, 9, -8 }, { 8, 10, -10 }, { 9, 11, -10 }, { 10, 12, -12 },
> + { 10, 13, -12 }, { 12, 14, -12 }, { 15, 15, -12 }
> + }
> + }
> +},
>  { DSC_BPP(8), 8,
>   /* 8BPP/8BPC */
>   { 512, 12, 6144, 3, 12, 11, 11, {
> @@ -362,6 +372,37 @@ static const struct rc_parameters_data 
> rc_parameters_pre_scr[] = {
>   }
>   }
>  },
> +{ DSC_BPP(10), 8,
> + /* 10BPP/8BPC */
> + { 410, 12, 5632, 3, 12, 11, 11, {
> + { 0, 3, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 2, 6, -2 },
> + { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
> + { 3, 9, -8 }, { 3, 9, -10 }, { 5, 10, -10 }, { 5, 11, -10 },
> + { 5, 12, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(10), 10,
> + /* 10BPP/10BPC */
> + { 410, 12, 5632, 7, 16, 15, 15, {
> + { 0, 7, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 6, 10, -2 },
> + { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
> + { 7, 13, -8 }, { 7, 13, -10 }, { 9, 14, -10 }, { 9, 15, -10 },
> + { 9, 16, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(10), 12,
> + /* 10BPP/12BPC */
> + { 410, 12, 5632, 11, 20, 19, 19, {
> + { 0, 11, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 10, 14, -2 },
> + { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
> + { 11, 17, -8 }, { 11, 17, -10 }, { 13, 18, -10 },
> + { 13, 19, -10 }, { 13, 20, -12 }, { 15, 21, -12 },
> + { 21, 23, -12 }
> + }
> + }
> +},
>  { DSC_BPP(12), 8,
>   /* 12BPP/8BPC */
>   { 341, 15, 2048, 3, 12, 11, 11, {
> @@ -393,6 +434,37 @@ static const struct rc_parameters_data 
> rc_parameters_pre_scr[] = {
>   }
>   }
>  },
> +{ DSC_BPP(15), 8,
> + /* 15BPP/8BPC */
> + { 273, 15, 2048, 3, 12, 11, 11, {
> + { 0, 0, 10 }, { 0, 1, 8 }, { 0, 1, 6 }, { 0, 2, 4 },
> + { 1, 2, 2 }, { 1, 3, 0 }, { 1, 4, -2 }, { 2, 4, -4 },
> + { 3, 4, -6 }, { 3, 5, -8 }, { 4, 6, -10 }, { 5, 7, -10 },
> + { 5, 8, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(15), 10,
> + /* 15BPP/10BPC */
> + { 273, 15, 2048, 7, 16, 15, 15, {
> + { 0, 2, 10 }, { 2, 5, 8 }, { 3, 5, 6 }, { 4, 6, 4 },
> + { 5, 6, 2 }, { 5, 7, 0 }, { 5, 8, -2 }, { 6, 8, -4 },
> + { 7, 8, -6 }, { 7, 9, -8 }, { 8, 10, -10 }, { 9, 11, -10 },
> + { 9, 12, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
> + }
> + }
> +},
> +{ DSC_BPP(15), 12,
> + /* 15BPP/12BPC */
> + { 273, 15, 2048, 11, 20, 19, 19, {
> + { 0, 4, 10 }, { 2, 7, 8 }, { 4, 9, 6 }, { 6, 11, 4 },
> + { 9, 11, 2 }, { 9, 11, 0 }, { 9, 12, -2 }, { 10, 12, -4 },
> + { 11, 12, -6 }, { 11, 13, -8 }, { 12, 14, -10 },
> + { 13, 15, -10 }, { 13, 16, -12 }, { 15, 21, -12 },
> + { 21, 23, -12 }
> + }
> + }
> +},
>  { /* sentinel */ }
>  };

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Freedreno] [PATCH 05/10] drm/display/dsc: use flat array for rc_parameters lookup

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> Next commits are going to add support for additional RC parameter lookup
> tables. These tables are going to use different bpp/bpc combinations,
> thus it makes little sense to keep the 2d array for RC parameters.
> Switch to using the flat array.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c | 188 +++
>  1 file changed, 88 insertions(+), 100 deletions(-)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
> b/drivers/gpu/drm/display/drm_dsc_helper.c
> index deaa84722bd4..a6d11f474656 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -307,24 +307,6 @@ void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config 
> *vdsc_cfg)
>  }
>  EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
>  
> -enum ROW_INDEX_BPP {
> - ROW_INDEX_6BPP = 0,
> - ROW_INDEX_8BPP,
> - ROW_INDEX_10BPP,
> - ROW_INDEX_12BPP,
> - ROW_INDEX_15BPP,
> - MAX_ROW_INDEX
> -};
> -
> -enum COLUMN_INDEX_BPC {
> - COLUMN_INDEX_8BPC = 0,
> - COLUMN_INDEX_10BPC,
> - COLUMN_INDEX_12BPC,
> - COLUMN_INDEX_14BPC,
> - COLUMN_INDEX_16BPC,
> - MAX_COLUMN_INDEX
> -};
> -
>  struct rc_parameters {
>   u16 initial_xmit_delay;
>   u8 first_line_bpg_offset;
> @@ -336,12 +318,20 @@ struct rc_parameters {
>   struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
>  };
>  
> +struct rc_parameters_data {
> + u8 bpp;
> + u8 bpc;
> + struct rc_parameters params;
> +};
> +
> +#define DSC_BPP(bpp) ((bpp) << 4)
> +
>  /*
>   * Selected Rate Control Related Parameter Recommended Values
>   * from DSC_v1.11 spec & C Model release: DSC_model_20161212
>   */
> -static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
> -{
> +static const struct rc_parameters_data rc_parameters[] = {
> +{ DSC_BPP(6), 8,

I was kind of hoping for a patch that would clean up the hideous
indentation in the tables. At the very least, please let's not add more
of it with the one-space indent?

>   /* 6BPP/8BPC */

With designated initializers I think we could just toss the comments
out.

.bpp = DSC_BPP(6), .bpc = 8,

With that,

Reviewed-by: Jani Nikula 
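
For illustration, a flat table with designated initializers might look
like the sketch below. It is a cut-down userspace version, not the patch
itself: the kernel types are stubbed with stdint typedefs,
rc_range_params[] is left out, lookup_rc_params() is a hypothetical
helper name, and only three of the table rows quoted in this thread are
included.

```c
/* Userspace sketch: kernel types are stubbed so this builds standalone. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;
typedef uint16_t u16;

/* bpp is stored in 1/16 units, as in the patch */
#define DSC_BPP(bpp) ((bpp) << 4)

/* The largest bpp in the table must still fit the u8 key. */
static_assert(DSC_BPP(15) <= UINT8_MAX, "bpp key must fit in u8");

struct rc_parameters {
	u16 initial_xmit_delay;
	u8 first_line_bpg_offset;
	u16 initial_offset;
	u8 flatness_min_qp;
	u8 flatness_max_qp;
	u8 rc_quant_incr_limit0;
	u8 rc_quant_incr_limit1;
	/* rc_range_params[] omitted to keep the sketch short */
};

struct rc_parameters_data {
	u8 bpp;
	u8 bpc;
	struct rc_parameters params;
};

/* The .bpp/.bpc keys replace the old "6BPP/8BPC" style comments. */
static const struct rc_parameters_data rc_parameters[] = {
	{
		.bpp = DSC_BPP(6), .bpc = 8,
		.params = { 768, 15, 6144, 3, 13, 11, 11 },
	},
	{
		.bpp = DSC_BPP(8), .bpc = 8,
		.params = { 512, 12, 6144, 3, 12, 11, 11 },
	},
	{
		.bpp = DSC_BPP(15), .bpc = 10,
		.params = { 273, 15, 2048, 7, 16, 15, 15 },
	},
	{ .bpp = 0 } /* sentinel */
};

/* Hypothetical helper: linear scan until the zero-bpp sentinel. */
static const struct rc_parameters *lookup_rc_params(u16 bits_per_pixel, u8 bpc)
{
	size_t i;

	for (i = 0; rc_parameters[i].bpp; i++)
		if (rc_parameters[i].bpp == bits_per_pixel &&
		    rc_parameters[i].bpc == bpc)
			return &rc_parameters[i].params;

	return NULL;
}
```

A caller would pass the 1/16-unit value, e.g.
lookup_rc_params(DSC_BPP(8), 8), and treat a NULL return as an
unsupported bpp/bpc combination.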


>   { 768, 15, 6144, 3, 13, 11, 11, {
>   { 0, 4, 0 }, { 1, 6, -2 }, { 3, 8, -2 }, { 4, 8, -4 },
> @@ -349,7 +339,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 7, 11, -8 }, { 8, 12, -10 }, { 9, 12, -10 }, { 10, 12, -12 },
>   { 10, 12, -12 }, { 11, 12, -12 }, { 13, 14, -12 }
>   }
> - },
> + }
> +},
> +{ DSC_BPP(6), 10,
>   /* 6BPP/10BPC */
>   { 768, 15, 6144, 7, 17, 15, 15, {
>   { 0, 8, 0 }, { 3, 10, -2 }, { 7, 12, -2 }, { 8, 12, -4 },
> @@ -358,7 +350,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 14, 16, -12 }, { 14, 16, -12 }, { 15, 16, -12 },
>   { 17, 18, -12 }
>   }
> - },
> + }
> +},
> +{ DSC_BPP(6), 12,
>   /* 6BPP/12BPC */
>   { 768, 15, 6144, 11, 21, 19, 19, {
>   { 0, 12, 0 }, { 5, 14, -2 }, { 11, 16, -2 }, { 12, 16, -4 },
> @@ -367,7 +361,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 18, 20, -12 }, { 18, 20, -12 }, { 19, 20, -12 },
>   { 21, 22, -12 }
>   }
> - },
> + }
> +},
> +{ DSC_BPP(6), 14,
>   /* 6BPP/14BPC */
>   { 768, 15, 6144, 15, 25, 23, 23, {
>   { 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
> @@ -376,7 +372,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 22, 24, -12 }, { 22, 24, -12 }, { 23, 24, -12 },
>   { 25, 26, -12 }
>   }
> - },
> + }
> +},
> +{ DSC_BPP(6), 16,
>   /* 6BPP/16BPC */
>   { 768, 15, 6144, 19, 29, 27, 27, {
>   { 0, 20, 0 }, { 9, 22, -2 }, { 19, 24, -2 }, { 20, 24, -4 },
> @@ -385,9 +383,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 26, 28, -12 }, { 26, 28, -12 }, { 27, 28, -12 },
>   { 29, 30, -12 }
>   }
> - },
> + }
>  },
> -{
> +{ DSC_BPP(8), 8,
>   /* 8BPP/8BPC */
>   { 512, 12, 6144, 3, 12, 11, 11, {
>   { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
> @@ -395,7 +393,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
>   { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
>   }
> - },
> + }
> +},
> +{ DSC_BPP(8), 10,
>   /* 8BPP/10BPC */
>   { 512, 12, 6144, 7, 16, 15, 15, {
>   /*
> @@ -407,7 +407,9 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 

Re: [Freedreno] [PATCH 04/10] drm/i915/dsc: stop using interim structure for calculated params

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> Stop using an interim structure rc_parameters for storing calculated
> params and then setting drm_dsc_config using that structure. Instead put
> calculated params into the struct drm_dsc_config directly.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/i915/display/intel_vdsc.c | 89 +--
>  1 file changed, 20 insertions(+), 69 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c b/drivers/gpu/drm/i915/display/intel_vdsc.c
> index d5a7e9494b23..1ee8d13c9d64 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> @@ -18,17 +18,6 @@
>  #include "intel_qp_tables.h"
>  #include "intel_vdsc.h"
>  
> -struct rc_parameters {
> - u16 initial_xmit_delay;
> - u8 first_line_bpg_offset;
> - u16 initial_offset;
> - u8 flatness_min_qp;
> - u8 flatness_max_qp;
> - u8 rc_quant_incr_limit0;
> - u8 rc_quant_incr_limit1;
> - struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
> -};
> -
>  bool intel_dsc_source_support(const struct intel_crtc_state *crtc_state)
>  {
>   const struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> @@ -63,8 +52,7 @@ static bool is_pipe_dsc(struct intel_crtc *crtc, enum transcoder cpu_transcoder)
>  }
>  
>  static void
> -calculate_rc_params(struct rc_parameters *rc,
> - struct drm_dsc_config *vdsc_cfg)
> +calculate_rc_params(struct drm_dsc_config *vdsc_cfg)
>  {
>   int bpc = vdsc_cfg->bits_per_component;
>   int bpp = vdsc_cfg->bits_per_pixel >> 4;
> @@ -84,54 +72,54 @@ calculate_rc_params(struct rc_parameters *rc,
>   u32 res, buf_i, bpp_i;
>  
>   if (vdsc_cfg->slice_height >= 8)
> - rc->first_line_bpg_offset =
> + vdsc_cfg->first_line_bpg_offset =
>   12 + DIV_ROUND_UP((9 * min(34, vdsc_cfg->slice_height - 8)), 100);
>   else
> - rc->first_line_bpg_offset = 2 * (vdsc_cfg->slice_height - 1);
> + vdsc_cfg->first_line_bpg_offset = 2 * (vdsc_cfg->slice_height - 1);
>  
>   /* Our hw supports only 444 modes as of today */
>   if (bpp >= 12)
> - rc->initial_offset = 2048;
> + vdsc_cfg->initial_offset = 2048;
>   else if (bpp >= 10)
> - rc->initial_offset = 5632 - DIV_ROUND_UP(((bpp - 10) * 3584), 2);
> + vdsc_cfg->initial_offset = 5632 - DIV_ROUND_UP(((bpp - 10) * 3584), 2);
>   else if (bpp >= 8)
> - rc->initial_offset = 6144 - DIV_ROUND_UP(((bpp - 8) * 512), 2);
> + vdsc_cfg->initial_offset = 6144 - DIV_ROUND_UP(((bpp - 8) * 512), 2);
>   else
> - rc->initial_offset = 6144;
> + vdsc_cfg->initial_offset = 6144;
>  
>   /* initial_xmit_delay = rc_model_size/2/compression_bpp */
> - rc->initial_xmit_delay = DIV_ROUND_UP(DSC_RC_MODEL_SIZE_CONST, 2 * bpp);
> + vdsc_cfg->initial_xmit_delay = DIV_ROUND_UP(DSC_RC_MODEL_SIZE_CONST, 2 * bpp);
>  
> - rc->flatness_min_qp = 3 + qp_bpc_modifier;
> - rc->flatness_max_qp = 12 + qp_bpc_modifier;
> + vdsc_cfg->flatness_min_qp = 3 + qp_bpc_modifier;
> + vdsc_cfg->flatness_max_qp = 12 + qp_bpc_modifier;
>  
> - rc->rc_quant_incr_limit0 = 11 + qp_bpc_modifier;
> - rc->rc_quant_incr_limit1 = 11 + qp_bpc_modifier;
> + vdsc_cfg->rc_quant_incr_limit0 = 11 + qp_bpc_modifier;
> + vdsc_cfg->rc_quant_incr_limit1 = 11 + qp_bpc_modifier;
>  
>   bpp_i  = (2 * (bpp - 6));
>   for (buf_i = 0; buf_i < DSC_NUM_BUF_RANGES; buf_i++) {
>   /* Read range_minqp and range_max_qp from qp tables */
> - rc->rc_range_params[buf_i].range_min_qp =
> + vdsc_cfg->rc_range_params[buf_i].range_min_qp =
>   intel_lookup_range_min_qp(bpc, buf_i, bpp_i);
> - rc->rc_range_params[buf_i].range_max_qp =
> + vdsc_cfg->rc_range_params[buf_i].range_max_qp =
>   intel_lookup_range_max_qp(bpc, buf_i, bpp_i);
>  
>   /* Calculate range_bgp_offset */
>   if (bpp <= 6) {
> - rc->rc_range_params[buf_i].range_bpg_offset = ofs_und6[buf_i];
> + vdsc_cfg->rc_range_params[buf_i].range_bpg_offset = ofs_und6[buf_i];
>   } else if (bpp <= 8) {
>   res = DIV_ROUND_UP(((bpp - 6) * (ofs_und8[buf_i] - ofs_und6[buf_i])), 2);
> - rc->rc_range_params[buf_i].range_bpg_offset =
> + vdsc_cfg->rc_range_params[buf_i].range_bpg_offset =
>   ofs_und6[buf_i] + res;
>   } else if (bpp <= 12) {
> - rc->rc_range_params[buf_i].range_bpg_offset =
> + vdsc_cfg->rc_range_params[buf_i].range_bpg_offset =
>   ofs_und8[buf_

Re: [Freedreno] [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> This moves DSC RC tables to DRM DSC helper. No additional code changes
> and/or cleanups are a part of this commit, it will be cleaned up in the
> followup commits.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c  | 364 ++
>  drivers/gpu/drm/i915/display/intel_vdsc.c | 319 +--
>  include/drm/display/drm_dsc_helper.h  |   1 +
>  3 files changed, 372 insertions(+), 312 deletions(-)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c b/drivers/gpu/drm/display/drm_dsc_helper.c
> index ab8679c158b5..deaa84722bd4 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -307,6 +307,370 @@ void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
>  }
>  EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
>  
> +enum ROW_INDEX_BPP {
> + ROW_INDEX_6BPP = 0,
> + ROW_INDEX_8BPP,
> + ROW_INDEX_10BPP,
> + ROW_INDEX_12BPP,
> + ROW_INDEX_15BPP,
> + MAX_ROW_INDEX
> +};
> +
> +enum COLUMN_INDEX_BPC {
> + COLUMN_INDEX_8BPC = 0,
> + COLUMN_INDEX_10BPC,
> + COLUMN_INDEX_12BPC,
> + COLUMN_INDEX_14BPC,
> + COLUMN_INDEX_16BPC,
> + MAX_COLUMN_INDEX
> +};
> +
> +struct rc_parameters {
> + u16 initial_xmit_delay;
> + u8 first_line_bpg_offset;
> + u16 initial_offset;
> + u8 flatness_min_qp;
> + u8 flatness_max_qp;
> + u8 rc_quant_incr_limit0;
> + u8 rc_quant_incr_limit1;
> + struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
> +};
> +
> +/*
> + * Selected Rate Control Related Parameter Recommended Values
> + * from DSC_v1.11 spec & C Model release: DSC_model_20161212
> + */
> +static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
> +{
> + /* 6BPP/8BPC */
> + { 768, 15, 6144, 3, 13, 11, 11, {
> + { 0, 4, 0 }, { 1, 6, -2 }, { 3, 8, -2 }, { 4, 8, -4 },
> + { 5, 9, -6 }, { 5, 9, -6 }, { 6, 9, -6 }, { 6, 10, -8 },
> + { 7, 11, -8 }, { 8, 12, -10 }, { 9, 12, -10 }, { 10, 12, -12 },
> + { 10, 12, -12 }, { 11, 12, -12 }, { 13, 14, -12 }
> + }
> + },
> + /* 6BPP/10BPC */
> + { 768, 15, 6144, 7, 17, 15, 15, {
> + { 0, 8, 0 }, { 3, 10, -2 }, { 7, 12, -2 }, { 8, 12, -4 },
> + { 9, 13, -6 }, { 9, 13, -6 }, { 10, 13, -6 }, { 10, 14, -8 },
> + { 11, 15, -8 }, { 12, 16, -10 }, { 13, 16, -10 },
> + { 14, 16, -12 }, { 14, 16, -12 }, { 15, 16, -12 },
> + { 17, 18, -12 }
> + }
> + },
> + /* 6BPP/12BPC */
> + { 768, 15, 6144, 11, 21, 19, 19, {
> + { 0, 12, 0 }, { 5, 14, -2 }, { 11, 16, -2 }, { 12, 16, -4 },
> + { 13, 17, -6 }, { 13, 17, -6 }, { 14, 17, -6 }, { 14, 18, -8 },
> + { 15, 19, -8 }, { 16, 20, -10 }, { 17, 20, -10 },
> + { 18, 20, -12 }, { 18, 20, -12 }, { 19, 20, -12 },
> + { 21, 22, -12 }
> + }
> + },
> + /* 6BPP/14BPC */
> + { 768, 15, 6144, 15, 25, 23, 23, {
> + { 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
> + { 17, 21, -6 }, { 17, 21, -6 }, { 18, 21, -6 }, { 18, 22, -8 },
> + { 19, 23, -8 }, { 20, 24, -10 }, { 21, 24, -10 },
> + { 22, 24, -12 }, { 22, 24, -12 }, { 23, 24, -12 },
> + { 25, 26, -12 }
> + }
> + },
> + /* 6BPP/16BPC */
> + { 768, 15, 6144, 19, 29, 27, 27, {
> + { 0, 20, 0 }, { 9, 22, -2 }, { 19, 24, -2 }, { 20, 24, -4 },
> + { 21, 25, -6 }, { 21, 25, -6 }, { 22, 25, -6 }, { 22, 26, -8 },
> + { 23, 27, -8 }, { 24, 28, -10 }, { 25, 28, -10 },
> + { 26, 28, -12 }, { 26, 28, -12 }, { 27, 28, -12 },
> + { 29, 30, -12 }
> + }
> + },
> +},
> +{
> + /* 8BPP/8BPC */
> + { 512, 12, 6144, 3, 12, 11, 11, {
> + { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
> + { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
> + { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
> + { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
> + }
> + },
> + /* 8BPP/10BPC */
> + { 512, 12, 6144, 7, 16, 15, 15, {
> + /*
> +  * DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
> +  * VESA DSC 1.1 Table E-5 sets it to 4.
> +  */
> + { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
> + { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
> + { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
> + { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
> + }
> + },
> + /* 8BPP/12BPC */
> + { 512, 12, 6144, 11, 20, 19, 19, {
> + { 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
> + 

Re: [Freedreno] [PATCH 01/10] drm/i915/dsc: change DSC param tables to follow the DSC model

2023-02-28 Thread Dmitry Baryshkov

On 28/02/2023 17:56, Jani Nikula wrote:
> On Tue, 28 Feb 2023, Dmitry Baryshkov wrote:
> > After cross-checking DSC models (20150914, 20161212, 20210623) change
> > values in rc_parameters tables to follow config files present inside
> > the DSC model. Handle two places, where i915 tables diverged from the
> > model, by patching the rc values in the code.
> >
> > Note: I left one case uncorrected, 8bpp/10bpc/range_max_qp[0], because
> > the table in the VESA DSC 1.1 sets it to 4.
> >
> > Signed-off-by: Dmitry Baryshkov
> > ---
> >  drivers/gpu/drm/i915/display/intel_vdsc.c | 18 --
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c b/drivers/gpu/drm/i915/display/intel_vdsc.c
> > index 207b2a648d32..d080741fd0b3 100644
> > --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> > +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> > @@ -86,7 +86,7 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
> >   }
> >   },
> >   /* 6BPP/14BPC */
> > - { 768, 15, 6144, 15, 25, 23, 27, {
> > + { 768, 15, 6144, 15, 25, 23, 23, {
> >   { 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
> >   { 17, 21, -6 }, { 17, 21, -6 }, { 18, 21, -6 }, { 18, 22, -8 },
> >   { 19, 23, -8 }, { 20, 24, -10 }, { 21, 24, -10 },
> > @@ -115,6 +115,10 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
> >   },
> >   /* 8BPP/10BPC */
> >   { 512, 12, 6144, 7, 16, 15, 15, {
> > + /*
> > +  * DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
> > +  * VESA DSC 1.1 Table E-5 sets it to 4.
> > +  */
> >   { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
> >   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
> >   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
> > @@ -132,7 +136,7 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
> >   },
> >   /* 8BPP/14BPC */
> >   { 512, 12, 6144, 15, 24, 23, 23, {
> > - { 0, 12, 0 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
> > + { 0, 12, 2 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
> >   { 15, 19, -4 }, { 15, 19, -6 }, { 15, 19, -8 }, { 15, 20, -8 },
> >   { 15, 21, -8 }, { 15, 22, -10 }, { 17, 22, -10 },
> >   { 17, 23, -12 }, { 17, 23, -12 }, { 21, 24, -12 },
> > @@ -529,6 +533,16 @@ int intel_dsc_compute_params(struct intel_crtc_state *pipe_config)
> >   DSC_RANGE_BPG_OFFSET_MASK;
> >   }
> >
> > + if (DISPLAY_VER(dev_priv) < 13) {
> > + if (compressed_bpp == 6 &&
> > + vdsc_cfg->bits_per_component == 8)
> > + vdsc_cfg->rc_quant_incr_limit1 = 23;
> > +
> > + if (compressed_bpp == 8 &&
> > + vdsc_cfg->bits_per_component == 14)
> > + vdsc_cfg->rc_range_params[0].range_bpg_offset = 0;
> > + }
> > +
>
> I wonder if we shouldn't just use the updated values...

I also wondered about this, so I wanted to get a double check from
somebody having better knowledge of this part, if it is a typo in the
original patch or a typo in the cfg files.

E.g. the pre_scr_cfg_files_for_reference/rc_10bpc_8bpp.cfg has 8 as
RX_MAXQP[0], which (for me) looks like a typo in the CFG file itself,
rather than being a typo in the driver.

On the other hand, these two issues belong to the 'current' CFG files,
so they, most probably, received more attention from anybody working
with the standard and with the model.

I can change this patch to become a fix for the tables (dropping the if
clause), if you can confirm that these values are typos in the driver.

> Maybe add a FIXME comment above the block to consider removing it?
>
> Reviewed-by: Jani Nikula
>
> >   /*
> >    * BitsPerComponent value determines mux_word_size:
> >    * When BitsPerComponent is less than or 10bpc, muxWordSize will be equal to

--
With best wishes
Dmitry



Re: [Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> The rc_buf_thresh values are common to all DSC implementations. Move
> them to the common helper together with the code to propagate them to
> the drm_dsc_config.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
>  drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
>  include/drm/display/drm_dsc_helper.h  |  1 +
>  3 files changed, 39 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c b/drivers/gpu/drm/display/drm_dsc_helper.c
> index c869c6e51e2b..ab8679c158b5 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_payload,
>  }
>  EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
>  
> +/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
> +const u16 drm_dsc_rc_buf_thresh[] = {
> + 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> + 7744, 7872, 8000, 8064
> +};
> +EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);
> +
> +/**
> + * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
> + * in accordance with the DSC 1.2 specification.
> + *
> + * @vdsc_cfg: DSC Configuration data partially filled by driver
> + */
> +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
> +{
> + int i = 0;
> +
> + for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
> + /*
> +  * six 0s are appended to the lsb of each threshold value
> +  * internally in h/w.
> +  * Only 8 bits are allowed for programming RcBufThreshold
> +  */

Not sure how appropriate the hardware references are, maybe clean it up
a bit.

With that, and +static and -export mentioned earlier,

Reviewed-by: Jani Nikula 

> + vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
> + }
> +
> + /*
> +  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> +  * as per C Model
> +  */
> + if (vdsc_cfg->bits_per_pixel == 6 << 4) {
> + vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
> + vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
> + }
> +}
> +EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
> +
>  /**
>   * drm_dsc_compute_rc_parameters() - Write rate control
>   * parameters to the dsc configuration defined in
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c b/drivers/gpu/drm/i915/display/intel_vdsc.c
> index d080741fd0b3..b4faab4c8fb3 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> @@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
>   MAX_COLUMN_INDEX
>  };
>  
> -/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
> -static const u16 rc_buf_thresh[] = {
> - 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> - 7744, 7872, 8000, 8064
> -};
> -
>  struct rc_parameters {
>   u16 initial_xmit_delay;
>   u8 first_line_bpg_offset;
> @@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state *pipe_config)
>   vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
>   vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;
>  
> - for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
> - /*
> -  * six 0s are appended to the lsb of each threshold value
> -  * internally in h/w.
> -  * Only 8 bits are allowed for programming RcBufThreshold
> -  */
> - vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
> - }
> -
> - /*
> -  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> -  * as per C Model
> -  */
> - if (compressed_bpp == 6) {
> - vdsc_cfg->rc_buf_thresh[12] = 0x7C;
> - vdsc_cfg->rc_buf_thresh[13] = 0x7D;
> - }
> + drm_dsc_set_rc_buf_thresh(vdsc_cfg);
>  
>   /*
>* From XE_LPD onwards we supports compression bpps in steps of 1
> diff --git a/include/drm/display/drm_dsc_helper.h b/include/drm/display/drm_dsc_helper.h
> index 8b41edbbabab..706ba1d34742 100644
> --- a/include/drm/display/drm_dsc_helper.h
> +++ b/include/drm/display/drm_dsc_helper.h
> @@ -14,6 +14,7 @@ void drm_dsc_dp_pps_header_init(struct dp_sdp_header *pps_header);
>  int drm_dsc_dp_rc_buffer_size(u8 rc_buffer_block_size, u8 rc_buffer_size);
>  void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_sdp,
> const struct drm_dsc_config *dsc_cfg);
> +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg);
>  int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg);
>  
>  #endif /* _DRM_DSC_HELPER_H_ */

-- 
Jani Nikula, Intel Open Source Graphics Center
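
The shift in drm_dsc_set_rc_buf_thresh() can be checked in isolation.
The sketch below is a userspace stand-in under stated assumptions:
set_rc_buf_thresh() and its output-array parameter are hypothetical, and
DSC_NUM_BUF_RANGES is taken to be 15 (the tables in this thread carry 15
range entries and 14 thresholds). It shows that 7936 >> 6 and 8000 >> 6
reproduce the 0x7C and 0x7D constants removed from intel_vdsc.c.

```c
/* Userspace sketch of the threshold packing; not the kernel helper itself. */
#include <assert.h>
#include <stdint.h>

#define DSC_NUM_BUF_RANGES 15	/* assumed value, see the note above */

static const uint16_t rc_buf_thresh[DSC_NUM_BUF_RANGES - 1] = {
	896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
	7744, 7872, 8000, 8064
};

/* The new expressions equal the hex constants the i915 code used before. */
static_assert((7936 >> 6) == 0x7C, "threshold 12 for 6bpp");
static_assert((8000 >> 6) == 0x7D, "threshold 13 for 6bpp");

/*
 * Hypothetical stand-in for drm_dsc_set_rc_buf_thresh().
 * bits_per_pixel is in 1/16 bpp units, so 6bpp is 6 << 4.
 */
static void set_rc_buf_thresh(uint16_t bits_per_pixel,
			      uint8_t out[DSC_NUM_BUF_RANGES - 1])
{
	int i;

	/* Drop the six low bits so each threshold fits in 8 bits. */
	for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++)
		out[i] = rc_buf_thresh[i] >> 6;

	/* 6bpp uses different values for thresholds 12 and 13. */
	if (bits_per_pixel == 6 << 4) {
		out[12] = 7936 >> 6;
		out[13] = 8000 >> 6;
	}
}
```

Every table entry is a multiple of 64, so the shift loses no
information for these values.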


Re: [Freedreno] [PATCH 01/10] drm/i915/dsc: change DSC param tables to follow the DSC model

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> After cross-checking DSC models (20150914, 20161212, 20210623) change
> values in rc_parameters tables to follow config files present inside
> the DSC model. Handle two places, where i915 tables diverged from the
> model, by patching the rc values in the code.
>
> Note: I left one case uncorrected, 8bpp/10bpc/range_max_qp[0], because
> the table in the VESA DSC 1.1 sets it to 4.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/i915/display/intel_vdsc.c | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c b/drivers/gpu/drm/i915/display/intel_vdsc.c
> index 207b2a648d32..d080741fd0b3 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> @@ -86,7 +86,7 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   }
>   },
>   /* 6BPP/14BPC */
> - { 768, 15, 6144, 15, 25, 23, 27, {
> + { 768, 15, 6144, 15, 25, 23, 23, {
>   { 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
>   { 17, 21, -6 }, { 17, 21, -6 }, { 18, 21, -6 }, { 18, 22, -8 },
>   { 19, 23, -8 }, { 20, 24, -10 }, { 21, 24, -10 },
> @@ -115,6 +115,10 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   },
>   /* 8BPP/10BPC */
>   { 512, 12, 6144, 7, 16, 15, 15, {
> + /*
> +  * DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
> +  * VESA DSC 1.1 Table E-5 sets it to 4.
> +  */
>   { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
>   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
>   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
> @@ -132,7 +136,7 @@ static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
>   },
>   /* 8BPP/14BPC */
>   { 512, 12, 6144, 15, 24, 23, 23, {
> - { 0, 12, 0 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
> + { 0, 12, 2 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
>   { 15, 19, -4 }, { 15, 19, -6 }, { 15, 19, -8 }, { 15, 20, -8 },
>   { 15, 21, -8 }, { 15, 22, -10 }, { 17, 22, -10 },
>   { 17, 23, -12 }, { 17, 23, -12 }, { 21, 24, -12 },
> @@ -529,6 +533,16 @@ int intel_dsc_compute_params(struct intel_crtc_state *pipe_config)
>   DSC_RANGE_BPG_OFFSET_MASK;
>   }
>  
> + if (DISPLAY_VER(dev_priv) < 13) {
> + if (compressed_bpp == 6 &&
> + vdsc_cfg->bits_per_component == 8)
> + vdsc_cfg->rc_quant_incr_limit1 = 23;
> +
> + if (compressed_bpp == 8 &&
> + vdsc_cfg->bits_per_component == 14)
> + vdsc_cfg->rc_range_params[0].range_bpg_offset = 0;
> + }
> +

I wonder if we shouldn't just use the updated values...

Maybe add a FIXME comment above the block to consider removing it?

Reviewed-by: Jani Nikula 


>   /*
>* BitsPerComponent value determines mux_word_size:
>    * When BitsPerComponent is less than or 10bpc, muxWordSize will be equal to

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Freedreno] [PATCH v7 00/15] dma-fence: Deadline awareness

2023-02-28 Thread Rob Clark
On Tue, Feb 28, 2023 at 4:43 AM Bagas Sanjaya  wrote:
>
> On Mon, Feb 27, 2023 at 11:35:06AM -0800, Rob Clark wrote:
> > From: Rob Clark 
> >
> > This series adds a deadline hint to fences, so realtime deadlines
> > such as vblank can be communicated to the fence signaller for power/
> > frequency management decisions.
> >
> > This is partially inspired by a trick i915 does, but implemented
> > via dma-fence for a couple of reasons:
> >
> > 1) To continue to be able to use the atomic helpers
> > 2) To support cases where display and gpu are different drivers
> >
> > This iteration adds a dma-fence ioctl to set a deadline (both to
> > support igt-tests, and compositors which delay decisions about which
> > client buffer to display), and a sw_sync ioctl to read back the
> > deadline.  IGT tests utilizing these can be found at:
> >
> >   https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline
> >
> >
> > v1: https://patchwork.freedesktop.org/series/93035/
> > v2: Move filtering out of later deadlines to fence implementation
> > to avoid increasing the size of dma_fence
> > v3: Add support in fence-array and fence-chain; Add some uabi to
> > support igt tests and userspace compositors.
> > v4: Rebase, address various comments, and add syncobj deadline
> > support, and sync_file EPOLLPRI based on experience with perf/
> > freq issues with clvk compute workloads on i915 (anv)
> > v5: Clarify that this is a hint as opposed to a more hard deadline
> > guarantee, switch to using u64 ns values in UABI (still absolute
> > CLOCK_MONOTONIC values), drop syncobj related cap and driver
> > feature flag in favor of allowing count_handles==0 for probing
> > kernel support.
> > v6: Re-work vblank helper to calculate time of _start_ of vblank,
> > and work correctly if the last vblank event was more than a
> > frame ago.  Add (mostly unrelated) drm/msm patch which also
> > uses the vblank helper.  Use dma_fence_chain_contained().  More
> > verbose syncobj UABI comments.  Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
> > v7: Fix kbuild complaints about vblank helper.  Add more docs.
> >
>
> I want to apply this series for testing, but it can't be applied cleanly
> on the current drm-misc tree. What tree (and commit) is this series
> based on?

You can find my branch here:

https://gitlab.freedesktop.org/robclark/msm/-/commits/dma-fence/deadline

BR,
-R


Re: [Freedreno] [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper

2023-02-28 Thread kernel test robot
Hi Dmitry,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm/drm-next v6.2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Baryshkov/drm-i915-dsc-change-DSC-param-tables-to-follow-the-DSC-model/20230228-193505
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20230228113342.2051425-4-dmitry.baryshkov%40linaro.org
patch subject: [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper
config: x86_64-randconfig-a002-20230227 (https://download.01.org/0day-ci/archive/20230228/202302282241.qrmajdx8-...@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/ee048cb6c2ec7f7f92bea6b72e8cd3ef9921993e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Dmitry-Baryshkov/drm-i915-dsc-change-DSC-param-tables-to-follow-the-DSC-model/20230228-193505
        git checkout ee048cb6c2ec7f7f92bea6b72e8cd3ef9921993e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/display/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: https://lore.kernel.org/oe-kbuild-all/202302282241.qrmajdx8-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/display/drm_dsc_helper.c:635: warning: expecting prototype for drm_dsc_compute_rc_parameters(). Prototype was for drm_dsc_setup_rc_params() instead


vim +635 drivers/gpu/drm/display/drm_dsc_helper.c

   627  
   628  /**
   629   * drm_dsc_compute_rc_parameters() - Set parameters and limits for RC model in
   630   * accordance with the DSC 1.1 or 1.2 specification and DSC C Model
   631   *
   632   * @vdsc_cfg: DSC Configuration data partially filled by driver
   633   */
   634  int drm_dsc_setup_rc_params(struct drm_dsc_config *vdsc_cfg)
 > 635  {
   636  	const struct rc_parameters *rc_params;
   637  	int i;
   638  
   639  	/* fractional BPP is not supported */
   640  	if (vdsc_cfg->bits_per_pixel & 0xf)
   641  		return -EINVAL;
   642  
   643  	rc_params = get_rc_params(vdsc_cfg->bits_per_pixel >> 4,
   644  				  vdsc_cfg->bits_per_component);
   645  	if (!rc_params)
   646  		return -EINVAL;
   647  
   648  	vdsc_cfg->first_line_bpg_offset = rc_params->first_line_bpg_offset;
   649  	vdsc_cfg->initial_xmit_delay = rc_params->initial_xmit_delay;
   650  	vdsc_cfg->initial_offset = rc_params->initial_offset;
   651  	vdsc_cfg->flatness_min_qp = rc_params->flatness_min_qp;
   652  	vdsc_cfg->flatness_max_qp = rc_params->flatness_max_qp;
   653  	vdsc_cfg->rc_quant_incr_limit0 = rc_params->rc_quant_incr_limit0;
   654  	vdsc_cfg->rc_quant_incr_limit1 = rc_params->rc_quant_incr_limit1;
   655  
   656  	for (i = 0; i < DSC_NUM_BUF_RANGES; i++) {
   657  		vdsc_cfg->rc_range_params[i].range_min_qp =
   658  			rc_params->rc_range_params[i].range_min_qp;
   659  		vdsc_cfg->rc_range_params[i].range_max_qp =
   660  			rc_params->rc_range_params[i].range_max_qp;
   661  		/*
   662  		 * Range BPG Offset uses 2's complement and is only a 6 bits. So
   663  		 * mask it to get only 6 bits.
   664  		 */
   665  		vdsc_cfg->rc_range_params[i].range_bpg_offset =
   666  			rc_params->rc_range_params[i].range_bpg_offset &
   667  			DSC_RANGE_BPG_OFFSET_MASK;
   668  	}
   669  
   670  	return 0;
   671  }
   672  EXPORT_SYMBOL(drm_dsc_setup_rc_params);
   673  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
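
The W=1 warning comes from the kernel-doc comment still naming
drm_dsc_compute_rc_parameters() while the function below it is
drm_dsc_setup_rc_params(). One plausible fix, assuming the function name
is the intended one, is to rename the comment to match. The sketch below
shows that with a cut-down userspace body: struct drm_dsc_config here is
a hypothetical stand-in with two fields, DSC_RANGE_BPG_OFFSET_MASK is
assumed to be 0x3f (6 bits), and the offsets are the 8BPP/8BPC
range_bpg_offset values quoted earlier in the thread.

```c
/* Userspace sketch: kernel types and constants are stubbed for illustration. */
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DSC_NUM_BUF_RANGES		15
#define DSC_RANGE_BPG_OFFSET_MASK	0x3f	/* assumed: keep the low 6 bits */

/* Hypothetical cut-down stand-in for the kernel's struct drm_dsc_config. */
struct drm_dsc_config {
	uint16_t bits_per_pixel;	/* in 1/16 bpp units */
	uint8_t range_bpg_offset[DSC_NUM_BUF_RANGES];
};

/* range_bpg_offset values of the 8BPP/8BPC table entry. */
static const int8_t bpg_offset_8bpp_8bpc[DSC_NUM_BUF_RANGES] = {
	2, 0, 0, -2, -4, -6, -8, -8, -8, -10, -10, -12, -12, -12, -12
};

/* -12 in 6-bit two's complement is 0b110100, i.e. 52. */
static_assert((-12 & DSC_RANGE_BPG_OFFSET_MASK) == 52, "6-bit two's complement");

/**
 * drm_dsc_setup_rc_params() - Set parameters and limits for RC model in
 * accordance with the DSC 1.1 or 1.2 specification and DSC C Model
 *
 * @vdsc_cfg: DSC Configuration data partially filled by driver
 *
 * Return: 0 on success, -EINVAL for a fractional bits_per_pixel.
 */
int drm_dsc_setup_rc_params(struct drm_dsc_config *vdsc_cfg)
{
	int i;

	/* fractional BPP is not supported */
	if (vdsc_cfg->bits_per_pixel & 0xf)
		return -EINVAL;

	/* Negative offsets are stored as 6-bit two's complement values. */
	for (i = 0; i < DSC_NUM_BUF_RANGES; i++)
		vdsc_cfg->range_bpg_offset[i] =
			bpg_offset_8bpp_8bpc[i] & DSC_RANGE_BPG_OFFSET_MASK;

	return 0;
}
```

With the names aligned, kernel-doc should no longer report a prototype
mismatch for this comment.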


Re: [Freedreno] [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper

2023-02-28 Thread kernel test robot
Hi Dmitry,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm/drm-next linus/master v6.2 next-20230228]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Dmitry-Baryshkov/drm-i915-dsc-change-DSC-param-tables-to-follow-the-DSC-model/20230228-193505
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20230228113342.2051425-4-dmitry.baryshkov%40linaro.org
patch subject: [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper
config: ia64-allyesconfig (https://download.01.org/0day-ci/archive/20230228/202302282203.ghupsryf-...@intel.com/config)
compiler: ia64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/ee048cb6c2ec7f7f92bea6b72e8cd3ef9921993e
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Dmitry-Baryshkov/drm-i915-dsc-change-DSC-param-tables-to-follow-the-DSC-model/20230228-193505
        git checkout ee048cb6c2ec7f7f92bea6b72e8cd3ef9921993e
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash drivers/gpu/drm/display/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: https://lore.kernel.org/oe-kbuild-all/202302282203.ghupsryf-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/display/drm_dsc_helper.c:635: warning: expecting prototype for drm_dsc_compute_rc_parameters(). Prototype was for drm_dsc_setup_rc_params() instead


vim +635 drivers/gpu/drm/display/drm_dsc_helper.c

   627  
   628  /**
   629   * drm_dsc_compute_rc_parameters() - Set parameters and limits for RC model in
   630   * accordance with the DSC 1.1 or 1.2 specification and DSC C Model
   631   *
   632   * @vdsc_cfg: DSC Configuration data partially filled by driver
   633   */
   634  int drm_dsc_setup_rc_params(struct drm_dsc_config *vdsc_cfg)
 > 635  {
   636          const struct rc_parameters *rc_params;
   637          int i;
   638  
   639          /* fractional BPP is not supported */
   640          if (vdsc_cfg->bits_per_pixel & 0xf)
   641                  return -EINVAL;
   642  
   643          rc_params = get_rc_params(vdsc_cfg->bits_per_pixel >> 4,
   644                                    vdsc_cfg->bits_per_component);
   645          if (!rc_params)
   646                  return -EINVAL;
   647  
   648          vdsc_cfg->first_line_bpg_offset = rc_params->first_line_bpg_offset;
   649          vdsc_cfg->initial_xmit_delay = rc_params->initial_xmit_delay;
   650          vdsc_cfg->initial_offset = rc_params->initial_offset;
   651          vdsc_cfg->flatness_min_qp = rc_params->flatness_min_qp;
   652          vdsc_cfg->flatness_max_qp = rc_params->flatness_max_qp;
   653          vdsc_cfg->rc_quant_incr_limit0 = rc_params->rc_quant_incr_limit0;
   654          vdsc_cfg->rc_quant_incr_limit1 = rc_params->rc_quant_incr_limit1;
   655  
   656          for (i = 0; i < DSC_NUM_BUF_RANGES; i++) {
   657                  vdsc_cfg->rc_range_params[i].range_min_qp =
   658                          rc_params->rc_range_params[i].range_min_qp;
   659                  vdsc_cfg->rc_range_params[i].range_max_qp =
   660                          rc_params->rc_range_params[i].range_max_qp;
   661                  /*
   662                   * Range BPG Offset uses 2's complement and is only a 6 bits. So
   663                   * mask it to get only 6 bits.
   664                   */
   665                  vdsc_cfg->rc_range_params[i].range_bpg_offset =
   666                          rc_params->rc_range_params[i].range_bpg_offset &
   667                          DSC_RANGE_BPG_OFFSET_MASK;
   668          }
   669  
   670          return 0;
   671  }
   672  EXPORT_SYMBOL(drm_dsc_setup_rc_params);
   673  
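
The warning above is raised because the kernel-doc comment at line 629 still
names drm_dsc_compute_rc_parameters() while the function that follows it is
drm_dsc_setup_rc_params(). A minimal sketch of the likely fix, which simply
renames the comment to match the prototype, could look like this. The struct
is a one-field stand-in so the fragment compiles on its own, and the table
lookup is elided; neither is the helper's real code.

```c
#include <errno.h>

/* Stand-in for the kernel's struct drm_dsc_config; only one field is kept. */
struct drm_dsc_config {
	unsigned short bits_per_pixel; /* bpp with 4 fractional bits */
};

/**
 * drm_dsc_setup_rc_params() - Set parameters and limits for RC model in
 * accordance with the DSC 1.1 or 1.2 specification and DSC C Model
 *
 * @vdsc_cfg: DSC Configuration data partially filled by driver
 */
int drm_dsc_setup_rc_params(struct drm_dsc_config *vdsc_cfg)
{
	/* fractional BPP is not supported */
	if (vdsc_cfg->bits_per_pixel & 0xf)
		return -EINVAL;

	/* RC table lookup elided in this sketch. */
	return 0;
}
```

With the comment and the prototype agreeing on the name, a W=1 build should
no longer report the mismatch for this function.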

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
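
The comment at lines 661-664 of the listing says the range BPG offset is a
6-bit two's-complement field. As an illustration only, the masking and the
matching sign extension can be sketched as follows. The mask value 0x3f is
assumed to match DSC_RANGE_BPG_OFFSET_MASK, and both function names are
hypothetical, not part of the DRM helper.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's DSC_RANGE_BPG_OFFSET_MASK (low 6 bits). */
#define RANGE_BPG_OFFSET_MASK 0x3f

/* Keep only the low 6 bits of a signed table entry (hypothetical helper). */
static uint8_t pack_bpg_offset(int8_t offset)
{
	return (uint8_t)offset & RANGE_BPG_OFFSET_MASK;
}

/* Sign-extend a 6-bit field back to a signed value (hypothetical helper). */
static int unpack_bpg_offset(uint8_t field)
{
	field &= RANGE_BPG_OFFSET_MASK;
	/* Bit 5 is the sign bit of a 6-bit two's-complement number. */
	return (field & 0x20) ? (int)field - 64 : (int)field;
}
```

For example, a table entry of -12 is stored as 0x34, and unpacking 0x34 gives
-12 again.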


Re: [Freedreno] [PATCH v4 06/14] dma-buf/sync_file: Support (E)POLLPRI

2023-02-28 Thread Sebastian Wick
On Tue, Feb 28, 2023 at 12:48 AM Rob Clark  wrote:
>
> On Mon, Feb 27, 2023 at 2:44 PM Sebastian Wick
>  wrote:
> >
> > On Mon, Feb 27, 2023 at 11:20 PM Rob Clark  wrote:
> > >
> > > On Mon, Feb 27, 2023 at 1:36 PM Rodrigo Vivi  
> > > wrote:
> > > >
> > > > On Fri, Feb 24, 2023 at 09:59:57AM -0800, Rob Clark wrote:
> > > > > On Fri, Feb 24, 2023 at 7:27 AM Luben Tuikov  
> > > > > wrote:
> > > > > >
> > > > > > On 2023-02-24 06:37, Tvrtko Ursulin wrote:
> > > > > > >
> > > > > > > On 24/02/2023 11:00, Pekka Paalanen wrote:
> > > > > > >> On Fri, 24 Feb 2023 10:50:51 +
> > > > > > >> Tvrtko Ursulin  wrote:
> > > > > > >>
> > > > > > >>> On 24/02/2023 10:24, Pekka Paalanen wrote:
> > > > > >  On Fri, 24 Feb 2023 09:41:46 +
> > > > > >  Tvrtko Ursulin  wrote:
> > > > > > 
> > > > > > > On 24/02/2023 09:26, Pekka Paalanen wrote:
> > > > > > >> On Thu, 23 Feb 2023 10:51:48 -0800
> > > > > > >> Rob Clark  wrote:
> > > > > > >>
> > > > > > >>> On Thu, Feb 23, 2023 at 1:38 AM Pekka Paalanen 
> > > > > > >>>  wrote:
> > > > > > 
> > > > > >  On Wed, 22 Feb 2023 07:37:26 -0800
> > > > > >  Rob Clark  wrote:
> > > > > > 
> > > > > > > On Wed, Feb 22, 2023 at 1:49 AM Pekka Paalanen 
> > > > > > >  wrote:
> > > > > > >>
> > > > > > >> ...
> > > > > > >>
> > > > > > >> On another matter, if the application uses SET_DEADLINE 
> > > > > > >> with one
> > > > > > >> timestamp, and the compositor uses SET_DEADLINE on the 
> > > > > > >> same thing with
> > > > > > >> another timestamp, what should happen?
> > > > > > >
> > > > > > > The expectation is that many deadline hints can be set on 
> > > > > > > a fence.
> > > > > > > The fence signaller should track the soonest deadline.
> > > > > > 
> > > > > >  You need to document that as UAPI, since it is observable 
> > > > > >  to userspace.
> > > > > >  It would be bad if drivers or subsystems would differ in 
> > > > > >  behaviour.
> > > > > > 
> > > > > > >>>
> > > > > > >>> It is in the end a hint.  It is about giving the driver more
> > > > > > >>> information so that it can make better choices.  But the 
> > > > > > >>> driver is
> > > > > > >>> even free to ignore it.  So maybe "expectation" is too 
> > > > > > >>> strong of a
> > > > > > >>> word.  Rather, any other behavior doesn't really make 
> > > > > > >>> sense.  But it
> > > > > > >>> could end up being dictated by how the hw and/or fw works.
> > > > > > >>
> > > > > > >> It will stop being a hint once it has been implemented and 
> > > > > > >> used in the
> > > > > > >> wild long enough. The kernel userspace regression rules make 
> > > > > > >> sure of
> > > > > > >> that.
> > > > > > >
> > > > > > > > Yeah, tricky and maybe a gray area in this case. I think we 
> > > > > > > > alluded
> > > > > > > > elsewhere in the thread that renaming the thing might be an 
> > > > > > > > option.
> > > > > > >
> > > > > > > So maybe instead of deadline, which is a very strong word, 
> > > > > > > use something
> > > > > > > along the lines of "present time hint", or "signalled time 
> > > > > > > hint"? Maybe
> > > > > > > reads clumsy. Just throwing some ideas for a start.
> > > > > > 
> > > > > >  You can try, but I fear that if it ever changes behaviour and
> > > > > >  someone notices that, it's labelled as a kernel regression. I 
> > > > > >  don't
> > > > > >  think documentation has ever been the authoritative definition 
> > > > > >  of UABI
> > > > > >  in Linux, it just guides drivers and userspace towards a common
> > > > > >  understanding and common usage patterns.
> > > > > > 
> > > > > >  So even if the UABI contract is not documented (ugh), you need 
> > > > > >  to be
> > > > > >  prepared to set the UABI contract through kernel 
> > > > > >  implementation.
> > > > > > >>>
> > > > > > > >>> To be the devil's advocate it probably wouldn't be an ABI 
> > > > > > > >>> regression but
> > > > > > > >>> just a regression. Same way as what nice(2) priorities mean 
> > > > > > > >>> hasn't
> > > > > > > >>> always been the same over the years, I don't think there is a 
> > > > > > > >>> strict
> > > > > > > >>> contract.
> > > > > > >>>
> > > > > > >>> Having said that, it may be different with latency sensitive 
> > > > > > >>> stuff such
> > > > > > >>> as UIs though since it is very observable and can be very 
> > > > > > >>> painful to users.
> > > > > > >>>
> > > > > >  If you do not document the UABI contract, then different 
> > > > > >  drivers are
> > > > > >  likely to implement it differently, leading to differing 
> > > > > >  behaviour.
> > > > > >  Also userspace will invent wild ways to abuse the UABI if 
> > > > > > >>>

Re: [Freedreno] [PATCH v7 00/15] dma-fence: Deadline awareness

2023-02-28 Thread Bagas Sanjaya
On Mon, Feb 27, 2023 at 11:35:06AM -0800, Rob Clark wrote:
> From: Rob Clark 
> 
> This series adds a deadline hint to fences, so realtime deadlines
> such as vblank can be communicated to the fence signaller for power/
> frequency management decisions.
> 
> This is partially inspired by a trick i915 does, but implemented
> via dma-fence for a couple of reasons:
> 
> 1) To continue to be able to use the atomic helpers
> 2) To support cases where display and gpu are different drivers
> 
> This iteration adds a dma-fence ioctl to set a deadline (both to
> support igt-tests, and compositors which delay decisions about which
> client buffer to display), and a sw_sync ioctl to read back the
> deadline.  IGT tests utilizing these can be found at:
> 
>   
> https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline
> 
> 
> v1: https://patchwork.freedesktop.org/series/93035/
> v2: Move filtering out of later deadlines to fence implementation
> to avoid increasing the size of dma_fence
> v3: Add support in fence-array and fence-chain; Add some uabi to
> support igt tests and userspace compositors.
> v4: Rebase, address various comments, and add syncobj deadline
> support, and sync_file EPOLLPRI based on experience with perf/
> freq issues with clvk compute workloads on i915 (anv)
> v5: Clarify that this is a hint as opposed to a more hard deadline
> guarantee, switch to using u64 ns values in UABI (still absolute
> CLOCK_MONOTONIC values), drop syncobj related cap and driver
> feature flag in favor of allowing count_handles==0 for probing
> kernel support.
> v6: Re-work vblank helper to calculate time of _start_ of vblank,
> and work correctly if the last vblank event was more than a
> frame ago.  Add (mostly unrelated) drm/msm patch which also
> uses the vblank helper.  Use dma_fence_chain_contained().  More
> verbose syncobj UABI comments.  Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
> v7: Fix kbuild complaints about vblank helper.  Add more docs.
> 

I want to apply this series for testing, but it can't be applied cleanly
on the current drm-misc tree. Which tree (and commit) is this series based
on?

-- 
An old man doll... just what I always wanted! - Clara




Re: [Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Dmitry Baryshkov

On 28/02/2023 14:49, Jani Nikula wrote:

On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:

On Tue, 28 Feb 2023 at 14:25, Jani Nikula  wrote:


On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:

The rc_buf_thresh values are common to all DSC implementations. Move
them to the common helper together with the code to propagate them to
the drm_dsc_config.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
  drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
  include/drm/display/drm_dsc_helper.h  |  1 +
  3 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index c869c6e51e2b..ab8679c158b5 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct 
drm_dsc_picture_parameter_set *pps_payload,
  }
  EXPORT_SYMBOL(drm_dsc_pps_payload_pack);

+/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
+const u16 drm_dsc_rc_buf_thresh[] = {
+ 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
+ 7744, 7872, 8000, 8064
+};
+EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);


This needs to be static, without exports.


Exported this to let other drivers use it, while skipping the
drm_dsc_set_rc_buf_thresh(). For example amdgpu driver sets buffer
thresholds on the interim structure, so the helper is not directly
applicable. See _do_calc_rc_params().


Regardless, I'm still saying don't do that.

Data is not an interface.

If you make it easy to just use the data, nobody will ever fix their
drivers to use proper interfaces, and you'll lock yourself to a
particular representation of the data even though it's supposed to be a
hidden implementation detail.


Yes, I usually do not export data, exactly for these reasons. I could 
have argued here that the data is constant here, etc. etc.

However let's stop caring about other drivers. I'll drop the export for v2.

BR,
Jani.

+
+/**
+ * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
+ * in accordance with the DSC 1.2 specification.
+ *
+ * @vdsc_cfg: DSC Configuration data partially filled by driver
+ */
+void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
+{
+ int i = 0;


Unnecessary initialization.


My bad.


+
+ for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {


Please use ARRAY_SIZE(). Maybe add BUILD_BUG_ON() for DSC_NUM_BUF_RANGES
vs. ARRAY_SIZE(). (Yes, we should've used ARRAY_SIZE() in i915.)


Ack

+ /*
+  * six 0s are appended to the lsb of each threshold value
+  * internally in h/w.
+  * Only 8 bits are allowed for programming RcBufThreshold
+  */
+ vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
+ }
+
+ /*
+  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
+  * as per C Model
+  */
+ if (vdsc_cfg->bits_per_pixel == 6 << 4) {
+ vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
+ vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
+ }
+}
+EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
+
  /**
   * drm_dsc_compute_rc_parameters() - Write rate control
   * parameters to the dsc configuration defined in
diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
b/drivers/gpu/drm/i915/display/intel_vdsc.c
index d080741fd0b3..b4faab4c8fb3 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
@@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
   MAX_COLUMN_INDEX
  };

-/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
-static const u16 rc_buf_thresh[] = {
- 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
- 7744, 7872, 8000, 8064
-};
-
  struct rc_parameters {
   u16 initial_xmit_delay;
   u8 first_line_bpg_offset;
@@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state 
*pipe_config)
   vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
   vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;

- for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
- /*
-  * six 0s are appended to the lsb of each threshold value
-  * internally in h/w.
-  * Only 8 bits are allowed for programming RcBufThreshold
-  */
- vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
- }
-
- /*
-  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
-  * as per C Model
-  */
- if (compressed_bpp == 6) {
- vdsc_cfg->rc_buf_thresh[12] = 0x7C;
- vdsc_cfg->rc_buf_thresh[13] = 0x7D;
- }
+ drm_dsc_set_rc_buf_thresh(vdsc_cfg);

   /*
* From XE_LPD onwards we supports compression bpps in steps of 1
diff --git a/include/drm/display/drm_dsc_helper.h 
b/include/drm/display/drm_dsc_helper

Re: [Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> On Tue, 28 Feb 2023 at 14:25, Jani Nikula  wrote:
>>
>> On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
>> > The rc_buf_thresh values are common to all DSC implementations. Move
>> > them to the common helper together with the code to propagate them to
>> > the drm_dsc_config.
>> >
>> > Signed-off-by: Dmitry Baryshkov 
>> > ---
>> >  drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
>> >  drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
>> >  include/drm/display/drm_dsc_helper.h  |  1 +
>> >  3 files changed, 39 insertions(+), 23 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
>> > b/drivers/gpu/drm/display/drm_dsc_helper.c
>> > index c869c6e51e2b..ab8679c158b5 100644
>> > --- a/drivers/gpu/drm/display/drm_dsc_helper.c
>> > +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
>> > @@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct 
>> > drm_dsc_picture_parameter_set *pps_payload,
>> >  }
>> >  EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
>> >
>> > +/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically 
>> > constant */
>> > +const u16 drm_dsc_rc_buf_thresh[] = {
>> > + 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
>> > + 7744, 7872, 8000, 8064
>> > +};
>> > +EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);
>>
>> This needs to be static, without exports.
>
> Exported this to let other drivers use it, while skipping the
> drm_dsc_set_rc_buf_thresh(). For example amdgpu driver sets buffer
> thresholds on the interim structure, so the helper is not directly
> applicable. See _do_calc_rc_params().

Regardless, I'm still saying don't do that.

Data is not an interface.

If you make it easy to just use the data, nobody will ever fix their
drivers to use proper interfaces, and you'll lock yourself to a
particular representation of the data even though it's supposed to be a
hidden implementation detail.


BR,
Jani.


>
>>
>> > +
>> > +/**
>> > + * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
>> > + * in accordance with the DSC 1.2 specification.
>> > + *
>> > + * @vdsc_cfg: DSC Configuration data partially filled by driver
>> > + */
>> > +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
>> > +{
>> > + int i = 0;
>>
>> Unnecessary initialization.
>
> My bad.
>
>>
>> > +
>> > + for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
>>
>> Please use ARRAY_SIZE(). Maybe add BUILD_BUG_ON() for DSC_NUM_BUF_RANGES
>> vs. ARRAY_SIZE(). (Yes, we should've used ARRAY_SIZE() in i915.)
>
> Ack
>
>>
>> > + /*
>> > +  * six 0s are appended to the lsb of each threshold value
>> > +  * internally in h/w.
>> > +  * Only 8 bits are allowed for programming RcBufThreshold
>> > +  */
>> > + vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
>> > + }
>> > +
>> > + /*
>> > +  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
>> > +  * as per C Model
>> > +  */
>> > + if (vdsc_cfg->bits_per_pixel == 6 << 4) {
>> > + vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
>> > + vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
>> > + }
>> > +}
>> > +EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
>> > +
>> >  /**
>> >   * drm_dsc_compute_rc_parameters() - Write rate control
>> >   * parameters to the dsc configuration defined in
>> > diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
>> > b/drivers/gpu/drm/i915/display/intel_vdsc.c
>> > index d080741fd0b3..b4faab4c8fb3 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
>> > @@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
>> >   MAX_COLUMN_INDEX
>> >  };
>> >
>> > -/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically 
>> > constant */
>> > -static const u16 rc_buf_thresh[] = {
>> > - 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
>> > - 7744, 7872, 8000, 8064
>> > -};
>> > -
>> >  struct rc_parameters {
>> >   u16 initial_xmit_delay;
>> >   u8 first_line_bpg_offset;
>> > @@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state 
>> > *pipe_config)
>> >   vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
>> >   vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;
>> >
>> > - for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
>> > - /*
>> > -  * six 0s are appended to the lsb of each threshold value
>> > -  * internally in h/w.
>> > -  * Only 8 bits are allowed for programming RcBufThreshold
>> > -  */
>> > - vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
>> > - }
>> > -
>> > - /*
>> > -  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
>> > -  * as per C Model
>> > -  */
>> > - if (compressed_bpp == 6) {
>> > - vdsc_cfg->rc_buf_thresh[12] = 0x7C;
>> > - 

Re: [Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Dmitry Baryshkov
On Tue, 28 Feb 2023 at 14:25, Jani Nikula  wrote:
>
> On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> > The rc_buf_thresh values are common to all DSC implementations. Move
> > them to the common helper together with the code to propagate them to
> > the drm_dsc_config.
> >
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >  drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
> >  drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
> >  include/drm/display/drm_dsc_helper.h  |  1 +
> >  3 files changed, 39 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
> > b/drivers/gpu/drm/display/drm_dsc_helper.c
> > index c869c6e51e2b..ab8679c158b5 100644
> > --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> > +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> > @@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct 
> > drm_dsc_picture_parameter_set *pps_payload,
> >  }
> >  EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
> >
> > +/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant 
> > */
> > +const u16 drm_dsc_rc_buf_thresh[] = {
> > + 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> > + 7744, 7872, 8000, 8064
> > +};
> > +EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);
>
> This needs to be static, without exports.

Exported this to let other drivers use it, while skipping the
drm_dsc_set_rc_buf_thresh(). For example amdgpu driver sets buffer
thresholds on the interim structure, so the helper is not directly
applicable. See _do_calc_rc_params().

>
> > +
> > +/**
> > + * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
> > + * in accordance with the DSC 1.2 specification.
> > + *
> > + * @vdsc_cfg: DSC Configuration data partially filled by driver
> > + */
> > +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
> > +{
> > + int i = 0;
>
> Unnecessary initialization.

My bad.

>
> > +
> > + for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
>
> Please use ARRAY_SIZE(). Maybe add BUILD_BUG_ON() for DSC_NUM_BUF_RANGES
> vs. ARRAY_SIZE(). (Yes, we should've used ARRAY_SIZE() in i915.)

Ack

>
> > + /*
> > +  * six 0s are appended to the lsb of each threshold value
> > +  * internally in h/w.
> > +  * Only 8 bits are allowed for programming RcBufThreshold
> > +  */
> > + vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
> > + }
> > +
> > + /*
> > +  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> > +  * as per C Model
> > +  */
> > + if (vdsc_cfg->bits_per_pixel == 6 << 4) {
> > + vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
> > + vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
> > + }
> > +}
> > +EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
> > +
> >  /**
> >   * drm_dsc_compute_rc_parameters() - Write rate control
> >   * parameters to the dsc configuration defined in
> > diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
> > b/drivers/gpu/drm/i915/display/intel_vdsc.c
> > index d080741fd0b3..b4faab4c8fb3 100644
> > --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> > +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> > @@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
> >   MAX_COLUMN_INDEX
> >  };
> >
> > -/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant 
> > */
> > -static const u16 rc_buf_thresh[] = {
> > - 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> > - 7744, 7872, 8000, 8064
> > -};
> > -
> >  struct rc_parameters {
> >   u16 initial_xmit_delay;
> >   u8 first_line_bpg_offset;
> > @@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state 
> > *pipe_config)
> >   vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
> >   vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;
> >
> > - for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
> > - /*
> > -  * six 0s are appended to the lsb of each threshold value
> > -  * internally in h/w.
> > -  * Only 8 bits are allowed for programming RcBufThreshold
> > -  */
> > - vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
> > - }
> > -
> > - /*
> > -  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> > -  * as per C Model
> > -  */
> > - if (compressed_bpp == 6) {
> > - vdsc_cfg->rc_buf_thresh[12] = 0x7C;
> > - vdsc_cfg->rc_buf_thresh[13] = 0x7D;
> > - }
> > + drm_dsc_set_rc_buf_thresh(vdsc_cfg);
> >
> >   /*
> >* From XE_LPD onwards we supports compression bpps in steps of 1
> > diff --git a/include/drm/display/drm_dsc_helper.h 
> > b/include/drm/display/drm_dsc_helper.h
> > index 8b41edbbabab..706ba1d34742 100644
> > --- a/include/drm/display/drm_dsc_helper.h
> > +++ b/include/drm/display/drm_dsc_helper.h
> > @@ -14,6 +14,7 @@ void drm_dsc_dp_pps_header_init(struct dp_sd

Re: [Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Jani Nikula
On Tue, 28 Feb 2023, Dmitry Baryshkov  wrote:
> The rc_buf_thresh values are common to all DSC implementations. Move
> them to the common helper together with the code to propagate them to
> the drm_dsc_config.
>
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
>  drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
>  include/drm/display/drm_dsc_helper.h  |  1 +
>  3 files changed, 39 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
> b/drivers/gpu/drm/display/drm_dsc_helper.c
> index c869c6e51e2b..ab8679c158b5 100644
> --- a/drivers/gpu/drm/display/drm_dsc_helper.c
> +++ b/drivers/gpu/drm/display/drm_dsc_helper.c
> @@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct 
> drm_dsc_picture_parameter_set *pps_payload,
>  }
>  EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
>  
> +/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
> +const u16 drm_dsc_rc_buf_thresh[] = {
> + 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> + 7744, 7872, 8000, 8064
> +};
> +EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);

This needs to be static, without exports.

> +
> +/**
> + * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
> + * in accordance with the DSC 1.2 specification.
> + *
> + * @vdsc_cfg: DSC Configuration data partially filled by driver
> + */
> +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
> +{
> + int i = 0;

Unnecessary initialization.

> +
> + for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {

Please use ARRAY_SIZE(). Maybe add BUILD_BUG_ON() for DSC_NUM_BUF_RANGES
vs. ARRAY_SIZE(). (Yes, we should've used ARRAY_SIZE() in i915.)

> + /*
> +  * six 0s are appended to the lsb of each threshold value
> +  * internally in h/w.
> +  * Only 8 bits are allowed for programming RcBufThreshold
> +  */
> + vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
> + }
> +
> + /*
> +  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> +  * as per C Model
> +  */
> + if (vdsc_cfg->bits_per_pixel == 6 << 4) {
> + vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
> + vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
> + }
> +}
> +EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
> +
>  /**
>   * drm_dsc_compute_rc_parameters() - Write rate control
>   * parameters to the dsc configuration defined in
> diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
> b/drivers/gpu/drm/i915/display/intel_vdsc.c
> index d080741fd0b3..b4faab4c8fb3 100644
> --- a/drivers/gpu/drm/i915/display/intel_vdsc.c
> +++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
> @@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
>   MAX_COLUMN_INDEX
>  };
>  
> -/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
> -static const u16 rc_buf_thresh[] = {
> - 896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
> - 7744, 7872, 8000, 8064
> -};
> -
>  struct rc_parameters {
>   u16 initial_xmit_delay;
>   u8 first_line_bpg_offset;
> @@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state 
> *pipe_config)
>   vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
>   vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;
>  
> - for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
> - /*
> -  * six 0s are appended to the lsb of each threshold value
> -  * internally in h/w.
> -  * Only 8 bits are allowed for programming RcBufThreshold
> -  */
> - vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
> - }
> -
> - /*
> -  * For 6bpp, RC Buffer threshold 12 and 13 need a different value
> -  * as per C Model
> -  */
> - if (compressed_bpp == 6) {
> - vdsc_cfg->rc_buf_thresh[12] = 0x7C;
> - vdsc_cfg->rc_buf_thresh[13] = 0x7D;
> - }
> + drm_dsc_set_rc_buf_thresh(vdsc_cfg);
>  
>   /*
>* From XE_LPD onwards we supports compression bpps in steps of 1
> diff --git a/include/drm/display/drm_dsc_helper.h 
> b/include/drm/display/drm_dsc_helper.h
> index 8b41edbbabab..706ba1d34742 100644
> --- a/include/drm/display/drm_dsc_helper.h
> +++ b/include/drm/display/drm_dsc_helper.h
> @@ -14,6 +14,7 @@ void drm_dsc_dp_pps_header_init(struct dp_sdp_header 
> *pps_header);
>  int drm_dsc_dp_rc_buffer_size(u8 rc_buffer_block_size, u8 rc_buffer_size);
>  void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_sdp,
> const struct drm_dsc_config *dsc_cfg);
> +void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg);
>  int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg);
>  
>  #endif /* _DRM_DSC_HELPER_H_ */

-- 
Jani Nikula, Intel Open Source Graphics Center


[Freedreno] [PATCH 10/10] drm/msm/dsi: use new helpers for DSC setup

2023-02-28 Thread Dmitry Baryshkov
Use new DRM DSC helpers to set up DSI DSC configuration. The
initial_scale_value needs to be adjusted according to the standard, but
this is a separate change.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/dsi/dsi_host.c | 61 --
 1 file changed, 8 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
b/drivers/gpu/drm/msm/dsi/dsi_host.c
index 18fa30e1e858..dda989727921 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -1735,28 +1735,9 @@ static int dsi_host_parse_lane_data(struct msm_dsi_host 
*msm_host,
return -EINVAL;
 }
 
-static u32 dsi_dsc_rc_buf_thresh[DSC_NUM_BUF_RANGES - 1] = {
-   0x0e, 0x1c, 0x2a, 0x38, 0x46, 0x54, 0x62,
-   0x69, 0x70, 0x77, 0x79, 0x7b, 0x7d, 0x7e
-};
-
-/* only 8bpc, 8bpp added */
-static char min_qp[DSC_NUM_BUF_RANGES] = {
-   0, 0, 1, 1, 3, 3, 3, 3, 3, 3, 5, 5, 5, 7, 13
-};
-
-static char max_qp[DSC_NUM_BUF_RANGES] = {
-   4, 4, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12, 13, 13, 15
-};
-
-static char bpg_offset[DSC_NUM_BUF_RANGES] = {
-   2, 0, 0, -2, -4, -6, -8, -8, -8, -10, -10, -12, -12, -12, -12
-};
-
 static int dsi_populate_dsc_params(struct msm_dsi_host *msm_host, struct 
drm_dsc_config *dsc)
 {
-   int i;
-   u16 bpp = dsc->bits_per_pixel >> 4;
+   int ret;
 
if (dsc->bits_per_pixel & 0xf) {
DRM_DEV_ERROR(&msm_host->pdev->dev, "DSI does not support 
fractional bits_per_pixel\n");
@@ -1768,49 +1749,23 @@ static int dsi_populate_dsc_params(struct msm_dsi_host 
*msm_host, struct drm_dsc
return -EOPNOTSUPP;
}
 
-   dsc->rc_model_size = 8192;
-   dsc->first_line_bpg_offset = 12;
-   dsc->rc_edge_factor = 6;
-   dsc->rc_tgt_offset_high = 3;
-   dsc->rc_tgt_offset_low = 3;
dsc->simple_422 = 0;
dsc->convert_rgb = 1;
dsc->vbr_enable = 0;
 
-   /* handle only bpp = bpc = 8 */
-   for (i = 0; i < DSC_NUM_BUF_RANGES - 1 ; i++)
-   dsc->rc_buf_thresh[i] = dsi_dsc_rc_buf_thresh[i];
+   drm_dsc_set_const_params(dsc);
+   drm_dsc_set_rc_buf_thresh(dsc);
 
-   for (i = 0; i < DSC_NUM_BUF_RANGES; i++) {
-   dsc->rc_range_params[i].range_min_qp = min_qp[i];
-   dsc->rc_range_params[i].range_max_qp = max_qp[i];
-   /*
-* Range BPG Offset contains two's-complement signed values 
that fill
-* 8 bits, yet the registers and DCS PPS field are only 6 bits 
wide.
-*/
-   dsc->rc_range_params[i].range_bpg_offset = bpg_offset[i] & 
DSC_RANGE_BPG_OFFSET_MASK;
+   /* handle only bpp = bpc = 8, pre-SCR panels */
+   ret = drm_dsc_setup_rc_params(dsc, DRM_DSC_1_1_PRE_SCR);
+   if (ret) {
+   DRM_DEV_ERROR(&msm_host->pdev->dev, "could not find DSC RC 
parameters\n");
+   return ret;
}
 
-   dsc->initial_offset = 6144; /* Not bpp 12 */
-   if (bpp != 8)
-   dsc->initial_offset = 2048; /* bpp = 12 */
-
-   if (dsc->bits_per_component <= 10)
-   dsc->mux_word_size = DSC_MUX_WORD_SIZE_8_10_BPC;
-   else
-   dsc->mux_word_size = DSC_MUX_WORD_SIZE_12_BPC;
-
-   dsc->initial_xmit_delay = 512;
dsc->initial_scale_value = 32;
-   dsc->first_line_bpg_offset = 12;
dsc->line_buf_depth = dsc->bits_per_component + 1;
 
-   /* bpc 8 */
-   dsc->flatness_min_qp = 3;
-   dsc->flatness_max_qp = 12;
-   dsc->rc_quant_incr_limit0 = 11;
-   dsc->rc_quant_incr_limit1 = 11;
-
return drm_dsc_compute_rc_parameters(dsc);
 }
 
-- 
2.39.2



[Freedreno] [PATCH 08/10] drm/display/dsc: add YCbCr 4:2:2 and 4:2:0 RC parameters

2023-02-28 Thread Dmitry Baryshkov
Include RC parameters for YCbCr 4:2:2 and 4:2:0 configurations.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c | 438 +++
 include/drm/display/drm_dsc_helper.h |   2 +
 2 files changed, 440 insertions(+)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index 1612536014ea..d11ee8f1efa7 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -742,6 +742,438 @@ static const struct rc_parameters_data 
rc_parameters_1_2_444[] = {
 { /* sentinel */ }
 };
 
+static const struct rc_parameters_data rc_parameters_1_2_422[] = {
+{ DSC_BPP(6), 8,
+   /* 12BPP/8BPC */
+   { 512, 15, 6144, 3, 12, 11, 11, {
+   { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 10, -10 }, { 5, 11, -12 },
+   { 5, 11, -12 }, { 9, 12, -12 }, { 12, 13, -12 }
+   }
+   }
+},
+{ DSC_BPP(6), 10,
+   /* 12BPP/10BPC */
+   { 512, 15, 6144, 7, 16, 15, 15, {
+   { 0, 8, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
+   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 14, -10 }, { 9, 15, -12 },
+   { 9, 15, -12 }, { 13, 16, -12 }, { 16, 17, -12 }
+   }
+   }
+},
+{ DSC_BPP(6), 12,
+   /* 12BPP/12BPC */
+   { 512, 15, 6144, 11, 20, 19, 19, {
+   { 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
+   { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 18, -10 }, { 13, 18, -10 },
+   { 13, 19, -12 }, { 13, 19, -12 }, { 17, 20, -12 },
+   { 20, 21, -12 }
+   }
+   }
+},
+{ DSC_BPP(6), 14,
+   /* 12BPP/14BPC */
+   { 512, 15, 6144, 15, 24, 23, 23, {
+   { 0, 12, 2 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
+   { 15, 19, -4 }, { 15, 19, -6 }, { 15, 19, -8 }, { 15, 20, -8 },
+   { 15, 21, -8 }, { 15, 22, -10 }, { 17, 22, -10 },
+   { 17, 23, -12 }, { 17, 23, -12 }, { 21, 24, -12 },
+   { 24, 25, -12 }
+   }
+   }
+},
+{ DSC_BPP(6), 16,
+   /* 12BPP/16BPC */
+   { 512, 15, 6144, 19, 28, 27, 27, {
+   { 0, 12, 2 }, { 6, 14, 0 }, { 13, 17, 0 }, { 15, 20, -2 },
+   { 19, 23, -4 }, { 19, 23, -6 }, { 19, 23, -8 }, { 19, 24, -8 },
+   { 19, 25, -8 }, { 19, 26, -10 }, { 21, 26, -10 },
+   { 21, 27, -12 }, { 21, 27, -12 }, { 25, 28, -12 },
+   { 28, 29, -12 }
+   }
+   }
+},
+{ DSC_BPP(7), 8,
+   /* 14BPP/8BPC */
+   { 410, 15, 5632, 3, 12, 11, 11, {
+   { 0, 3, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 2, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 9, -10 }, { 5, 10, -10 }, { 5, 10, -10 },
+   { 5, 11, -12 }, { 7, 11, -12 }, { 11, 12, -12 }
+   }
+   }
+},
+{ DSC_BPP(7), 10,
+   /* 14BPP/10BPC */
+   { 410, 15, 5632, 7, 16, 15, 15, {
+   { 0, 7, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 6, 10, -2 },
+   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 13, -10 }, { 9, 14, -10 }, { 9, 14, -10 },
+   { 9, 15, -12 }, { 11, 15, -12 }, { 15, 16, -12 }
+   }
+   }
+},
+{ DSC_BPP(7), 12,
+   /* 14BPP/12BPC */
+   { 410, 15, 5632, 11, 20, 19, 19, {
+   { 0, 11, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 10, 14, -2 },
+   { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 17, -10 }, { 13, 18, -10 },
+   { 13, 18, -10 }, { 13, 19, -12 }, { 15, 19, -12 },
+   { 19, 20, -12 }
+   }
+   }
+},
+{ DSC_BPP(7), 14,
+   /* 14BPP/14BPC */
+   { 410, 15, 5632, 15, 24, 23, 23, {
+   { 0, 11, 2 }, { 5, 13, 0 }, { 11, 15, 0 }, { 13, 18, -2 },
+   { 15, 19, -4 }, { 15, 19, -6 }, { 15, 19, -8 }, { 15, 20, -8 },
+   { 15, 21, -8 }, { 15, 21, -10 }, { 17, 22, -10 },
+   { 17, 22, -10 }, { 17, 23, -12 }, { 19, 23, -12 },
+   { 23, 24, -12 }
+   }
+   }
+},
+{ DSC_BPP(7), 16,
+   /* 14BPP/16BPC */
+   { 410, 15, 5632, 19, 28, 27, 27, {
+   { 0, 11, 2 }, { 6, 14, 0 }, { 13, 17, 0 }, { 16, 20, -2 },
+   { 19, 23, -4 }, { 19, 23, -6 }, { 19, 23, -8 }, { 19, 24, -8 },
+   { 19, 25, -8 }, { 19, 25, -10 }, { 21, 26, -10 },
+   { 21, 26, -10 }, { 21, 27, -12 }, { 23, 27, -12 },
+   { 27, 28, -12 }
+   }
+   }
+},
+{ DSC_BPP(8), 8,
+   /* 16BPP/8BPC */
+   { 341, 15, 2048, 

[Freedreno] [PATCH 05/10] drm/display/dsc: use flat array for rc_parameters lookup

2023-02-28 Thread Dmitry Baryshkov
Subsequent commits will add support for additional RC parameter lookup
tables. Because these tables use different bpp/bpc combinations, it
makes little sense to keep the 2D array for RC parameters. Switch to a
flat array instead.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c | 188 +++
 1 file changed, 88 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index deaa84722bd4..a6d11f474656 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -307,24 +307,6 @@ void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config 
*vdsc_cfg)
 }
 EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
 
-enum ROW_INDEX_BPP {
-   ROW_INDEX_6BPP = 0,
-   ROW_INDEX_8BPP,
-   ROW_INDEX_10BPP,
-   ROW_INDEX_12BPP,
-   ROW_INDEX_15BPP,
-   MAX_ROW_INDEX
-};
-
-enum COLUMN_INDEX_BPC {
-   COLUMN_INDEX_8BPC = 0,
-   COLUMN_INDEX_10BPC,
-   COLUMN_INDEX_12BPC,
-   COLUMN_INDEX_14BPC,
-   COLUMN_INDEX_16BPC,
-   MAX_COLUMN_INDEX
-};
-
 struct rc_parameters {
u16 initial_xmit_delay;
u8 first_line_bpg_offset;
@@ -336,12 +318,20 @@ struct rc_parameters {
struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
 };
 
+struct rc_parameters_data {
+   u8 bpp;
+   u8 bpc;
+   struct rc_parameters params;
+};
+
+#define DSC_BPP(bpp)   ((bpp) << 4)
+
 /*
  * Selected Rate Control Related Parameter Recommended Values
  * from DSC_v1.11 spec & C Model release: DSC_model_20161212
  */
-static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
-{
+static const struct rc_parameters_data rc_parameters[] = {
+{ DSC_BPP(6), 8,
/* 6BPP/8BPC */
{ 768, 15, 6144, 3, 13, 11, 11, {
{ 0, 4, 0 }, { 1, 6, -2 }, { 3, 8, -2 }, { 4, 8, -4 },
@@ -349,7 +339,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 7, 11, -8 }, { 8, 12, -10 }, { 9, 12, -10 }, { 10, 12, -12 },
{ 10, 12, -12 }, { 11, 12, -12 }, { 13, 14, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(6), 10,
/* 6BPP/10BPC */
{ 768, 15, 6144, 7, 17, 15, 15, {
{ 0, 8, 0 }, { 3, 10, -2 }, { 7, 12, -2 }, { 8, 12, -4 },
@@ -358,7 +350,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 14, 16, -12 }, { 14, 16, -12 }, { 15, 16, -12 },
{ 17, 18, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(6), 12,
/* 6BPP/12BPC */
{ 768, 15, 6144, 11, 21, 19, 19, {
{ 0, 12, 0 }, { 5, 14, -2 }, { 11, 16, -2 }, { 12, 16, -4 },
@@ -367,7 +361,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 18, 20, -12 }, { 18, 20, -12 }, { 19, 20, -12 },
{ 21, 22, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(6), 14,
/* 6BPP/14BPC */
{ 768, 15, 6144, 15, 25, 23, 23, {
{ 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
@@ -376,7 +372,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 22, 24, -12 }, { 22, 24, -12 }, { 23, 24, -12 },
{ 25, 26, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(6), 16,
/* 6BPP/16BPC */
{ 768, 15, 6144, 19, 29, 27, 27, {
{ 0, 20, 0 }, { 9, 22, -2 }, { 19, 24, -2 }, { 20, 24, -4 },
@@ -385,9 +383,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 26, 28, -12 }, { 26, 28, -12 }, { 27, 28, -12 },
{ 29, 30, -12 }
}
-   },
+   }
 },
-{
+{ DSC_BPP(8), 8,
/* 8BPP/8BPC */
{ 512, 12, 6144, 3, 12, 11, 11, {
{ 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
@@ -395,7 +393,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
{ 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(8), 10,
/* 8BPP/10BPC */
{ 512, 12, 6144, 7, 16, 15, 15, {
/*
@@ -407,7 +407,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
{ 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
}
-   },
+   }
+},
+{ DSC_BPP(8), 12,
/* 8BPP/12BPC */
{ 512, 12, 6144, 11, 20, 19, 19, {
{ 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
@@ -416,7 +418,9 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
{ 13, 20, -12 }, { 13, 21, -12 }, { 15, 21, -12 },
{ 21, 23, -12 }
}

[Freedreno] [PATCH 04/10] drm/i915/dsc: stop using interim structure for calculated params

2023-02-28 Thread Dmitry Baryshkov
Stop using an interim structure rc_parameters for storing calculated
params and then setting drm_dsc_config using that structure. Instead put
calculated params into the struct drm_dsc_config directly.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/i915/display/intel_vdsc.c | 89 +--
 1 file changed, 20 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
b/drivers/gpu/drm/i915/display/intel_vdsc.c
index d5a7e9494b23..1ee8d13c9d64 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
@@ -18,17 +18,6 @@
 #include "intel_qp_tables.h"
 #include "intel_vdsc.h"
 
-struct rc_parameters {
-   u16 initial_xmit_delay;
-   u8 first_line_bpg_offset;
-   u16 initial_offset;
-   u8 flatness_min_qp;
-   u8 flatness_max_qp;
-   u8 rc_quant_incr_limit0;
-   u8 rc_quant_incr_limit1;
-   struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
-};
-
 bool intel_dsc_source_support(const struct intel_crtc_state *crtc_state)
 {
const struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
@@ -63,8 +52,7 @@ static bool is_pipe_dsc(struct intel_crtc *crtc, enum 
transcoder cpu_transcoder)
 }
 
 static void
-calculate_rc_params(struct rc_parameters *rc,
-   struct drm_dsc_config *vdsc_cfg)
+calculate_rc_params(struct drm_dsc_config *vdsc_cfg)
 {
int bpc = vdsc_cfg->bits_per_component;
int bpp = vdsc_cfg->bits_per_pixel >> 4;
@@ -84,54 +72,54 @@ calculate_rc_params(struct rc_parameters *rc,
u32 res, buf_i, bpp_i;
 
if (vdsc_cfg->slice_height >= 8)
-   rc->first_line_bpg_offset =
+   vdsc_cfg->first_line_bpg_offset =
12 + DIV_ROUND_UP((9 * min(34, vdsc_cfg->slice_height - 
8)), 100);
else
-   rc->first_line_bpg_offset = 2 * (vdsc_cfg->slice_height - 1);
+   vdsc_cfg->first_line_bpg_offset = 2 * (vdsc_cfg->slice_height - 
1);
 
/* Our hw supports only 444 modes as of today */
if (bpp >= 12)
-   rc->initial_offset = 2048;
+   vdsc_cfg->initial_offset = 2048;
else if (bpp >= 10)
-   rc->initial_offset = 5632 - DIV_ROUND_UP(((bpp - 10) * 3584), 
2);
+   vdsc_cfg->initial_offset = 5632 - DIV_ROUND_UP(((bpp - 10) * 
3584), 2);
else if (bpp >= 8)
-   rc->initial_offset = 6144 - DIV_ROUND_UP(((bpp - 8) * 512), 2);
+   vdsc_cfg->initial_offset = 6144 - DIV_ROUND_UP(((bpp - 8) * 
512), 2);
else
-   rc->initial_offset = 6144;
+   vdsc_cfg->initial_offset = 6144;
 
/* initial_xmit_delay = rc_model_size/2/compression_bpp */
-   rc->initial_xmit_delay = DIV_ROUND_UP(DSC_RC_MODEL_SIZE_CONST, 2 * bpp);
+   vdsc_cfg->initial_xmit_delay = DIV_ROUND_UP(DSC_RC_MODEL_SIZE_CONST, 2 
* bpp);
 
-   rc->flatness_min_qp = 3 + qp_bpc_modifier;
-   rc->flatness_max_qp = 12 + qp_bpc_modifier;
+   vdsc_cfg->flatness_min_qp = 3 + qp_bpc_modifier;
+   vdsc_cfg->flatness_max_qp = 12 + qp_bpc_modifier;
 
-   rc->rc_quant_incr_limit0 = 11 + qp_bpc_modifier;
-   rc->rc_quant_incr_limit1 = 11 + qp_bpc_modifier;
+   vdsc_cfg->rc_quant_incr_limit0 = 11 + qp_bpc_modifier;
+   vdsc_cfg->rc_quant_incr_limit1 = 11 + qp_bpc_modifier;
 
bpp_i  = (2 * (bpp - 6));
for (buf_i = 0; buf_i < DSC_NUM_BUF_RANGES; buf_i++) {
/* Read range_minqp and range_max_qp from qp tables */
-   rc->rc_range_params[buf_i].range_min_qp =
+   vdsc_cfg->rc_range_params[buf_i].range_min_qp =
intel_lookup_range_min_qp(bpc, buf_i, bpp_i);
-   rc->rc_range_params[buf_i].range_max_qp =
+   vdsc_cfg->rc_range_params[buf_i].range_max_qp =
intel_lookup_range_max_qp(bpc, buf_i, bpp_i);
 
/* Calculate range_bgp_offset */
if (bpp <= 6) {
-   rc->rc_range_params[buf_i].range_bpg_offset = 
ofs_und6[buf_i];
+   vdsc_cfg->rc_range_params[buf_i].range_bpg_offset = 
ofs_und6[buf_i];
} else if (bpp <= 8) {
res = DIV_ROUND_UP(((bpp - 6) * (ofs_und8[buf_i] - 
ofs_und6[buf_i])), 2);
-   rc->rc_range_params[buf_i].range_bpg_offset =
+   vdsc_cfg->rc_range_params[buf_i].range_bpg_offset =
ofs_und6[buf_i] 
+ res;
} else if (bpp <= 12) {
-   rc->rc_range_params[buf_i].range_bpg_offset =
+   vdsc_cfg->rc_range_params[buf_i].range_bpg_offset =
ofs_und8[buf_i];
} else if (bpp <= 15) {
res = DIV_ROUND_UP(((bpp - 12) * (ofs_und15[buf_i] - 
ofs_und12[buf_i])), 

[Freedreno] [PATCH 07/10] drm/display/dsc: include the rest of pre-SCR parameters

2023-02-28 Thread Dmitry Baryshkov
The DSC model contains pre-SCR RC parameters for other bpp/bpc
combinations; include them here for completeness.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c | 72 
 1 file changed, 72 insertions(+)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index 51794b40526a..1612536014ea 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -327,6 +327,16 @@ struct rc_parameters_data {
 #define DSC_BPP(bpp)   ((bpp) << 4)
 
 static const struct rc_parameters_data rc_parameters_pre_scr[] = {
+{ DSC_BPP(6), 8,
+   /* 6BPP/8BPC */
+   { 683, 15, 6144, 3, 13, 11, 11, {
+   { 0, 2, 0 }, { 1, 4, -2 }, { 3, 6, -2 }, { 4, 6, -4 },
+   { 5, 7, -6 }, { 5, 7, -6 }, { 6, 7, -6 }, { 6, 8, -8 },
+   { 7, 9, -8 }, { 8, 10, -10 }, { 9, 11, -10 }, { 10, 12, -12 },
+   { 10, 13, -12 }, { 12, 14, -12 }, { 15, 15, -12 }
+   }
+   }
+},
 { DSC_BPP(8), 8,
/* 8BPP/8BPC */
{ 512, 12, 6144, 3, 12, 11, 11, {
@@ -362,6 +372,37 @@ static const struct rc_parameters_data 
rc_parameters_pre_scr[] = {
}
}
 },
+{ DSC_BPP(10), 8,
+   /* 10BPP/8BPC */
+   { 410, 12, 5632, 3, 12, 11, 11, {
+   { 0, 3, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 2, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 9, -10 }, { 5, 10, -10 }, { 5, 11, -10 },
+   { 5, 12, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   }
+   }
+},
+{ DSC_BPP(10), 10,
+   /* 10BPP/10BPC */
+   { 410, 12, 5632, 7, 16, 15, 15, {
+   { 0, 7, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 6, 10, -2 },
+   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 13, -10 }, { 9, 14, -10 }, { 9, 15, -10 },
+   { 9, 16, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
+   }
+   }
+},
+{ DSC_BPP(10), 12,
+   /* 10BPP/12BPC */
+   { 410, 12, 5632, 11, 20, 19, 19, {
+   { 0, 11, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 10, 14, -2 },
+   { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 17, -10 }, { 13, 18, -10 },
+   { 13, 19, -10 }, { 13, 20, -12 }, { 15, 21, -12 },
+   { 21, 23, -12 }
+   }
+   }
+},
 { DSC_BPP(12), 8,
/* 12BPP/8BPC */
{ 341, 15, 2048, 3, 12, 11, 11, {
@@ -393,6 +434,37 @@ static const struct rc_parameters_data 
rc_parameters_pre_scr[] = {
}
}
 },
+{ DSC_BPP(15), 8,
+   /* 15BPP/8BPC */
+   { 273, 15, 2048, 3, 12, 11, 11, {
+   { 0, 0, 10 }, { 0, 1, 8 }, { 0, 1, 6 }, { 0, 2, 4 },
+   { 1, 2, 2 }, { 1, 3, 0 }, { 1, 4, -2 }, { 2, 4, -4 },
+   { 3, 4, -6 }, { 3, 5, -8 }, { 4, 6, -10 }, { 5, 7, -10 },
+   { 5, 8, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   }
+   }
+},
+{ DSC_BPP(15), 10,
+   /* 15BPP/10BPC */
+   { 273, 15, 2048, 7, 16, 15, 15, {
+   { 0, 2, 10 }, { 2, 5, 8 }, { 3, 5, 6 }, { 4, 6, 4 },
+   { 5, 6, 2 }, { 5, 7, 0 }, { 5, 8, -2 }, { 6, 8, -4 },
+   { 7, 8, -6 }, { 7, 9, -8 }, { 8, 10, -10 }, { 9, 11, -10 },
+   { 9, 12, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
+   }
+   }
+},
+{ DSC_BPP(15), 12,
+   /* 15BPP/12BPC */
+   { 273, 15, 2048, 11, 20, 19, 19, {
+   { 0, 4, 10 }, { 2, 7, 8 }, { 4, 9, 6 }, { 6, 11, 4 },
+   { 9, 11, 2 }, { 9, 11, 0 }, { 9, 12, -2 }, { 10, 12, -4 },
+   { 11, 12, -6 }, { 11, 13, -8 }, { 12, 14, -10 },
+   { 13, 15, -10 }, { 13, 16, -12 }, { 15, 21, -12 },
+   { 21, 23, -12 }
+   }
+   }
+},
 { /* sentinel */ }
 };
 
-- 
2.39.2
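
The lookup code that consumes these sentinel-terminated tables is not
part of the hunks above, so the following is only a minimal userspace
sketch of how a flat { bpp, bpc, params } array could be searched. The
names rc_entry, rc_table and rc_lookup are hypothetical, the entry
struct is trimmed to two parameters, and the values are copied from the
pre-SCR rows in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bits_per_pixel carries 4 fractional bits, as in the patch. */
#define DSC_BPP(bpp) ((bpp) << 4)

/* Hypothetical, trimmed-down stand-in for struct rc_parameters_data. */
struct rc_entry {
	uint8_t bpp;                 /* DSC_BPP()-encoded; 0 marks the sentinel */
	uint8_t bpc;
	uint16_t initial_xmit_delay;
	uint16_t initial_offset;
};

/* A few rows taken from the pre-SCR table, ending with a sentinel. */
static const struct rc_entry rc_table[] = {
	{ DSC_BPP(6),  8,  683, 6144 },
	{ DSC_BPP(8),  8,  512, 6144 },
	{ DSC_BPP(8),  10, 512, 6144 },
	{ DSC_BPP(10), 8,  410, 5632 },
	{ DSC_BPP(12), 8,  341, 2048 },
	{ DSC_BPP(15), 8,  273, 2048 },
	{ 0 } /* sentinel */
};

/* Linear search; returns NULL when no bpp/bpc combination matches. */
static const struct rc_entry *rc_lookup(const struct rc_entry *table,
					uint16_t bits_per_pixel,
					uint8_t bits_per_component)
{
	assert(table != NULL);

	for (const struct rc_entry *e = table; e->bpp != 0; e++) {
		if (e->bpp == bits_per_pixel && e->bpc == bits_per_component)
			return e;
	}

	return NULL;
}
```

Because every row names its own bpp/bpc pair, tables with different
sets of combinations can share one lookup routine, and a fractional
bits_per_pixel simply fails to match any row.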



[Freedreno] [PATCH 09/10] drm/display/dsc: add helper to set semi-const parameters

2023-02-28 Thread Dmitry Baryshkov
Add a helper that sets the config values which are typically constant
across operating modes (Table E-4 of the standard) and mux_word_size
(which is a constant according to 3.5.2).

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c | 21 +
 include/drm/display/drm_dsc_helper.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index d11ee8f1efa7..7de1d84f5bc7 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -270,6 +270,27 @@ void drm_dsc_pps_payload_pack(struct 
drm_dsc_picture_parameter_set *pps_payload,
 }
 EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
 
+/**
+ * drm_dsc_set_const_params() - Set DSC parameters considered typically
+ * constant across operation modes
+ *
+ * @vdsc_cfg:
+ * DSC Configuration data partially filled by driver
+ */
+void drm_dsc_set_const_params(struct drm_dsc_config *vdsc_cfg)
+{
+   vdsc_cfg->rc_model_size = DSC_RC_MODEL_SIZE_CONST;
+   vdsc_cfg->rc_edge_factor = DSC_RC_EDGE_FACTOR_CONST;
+   vdsc_cfg->rc_tgt_offset_high = DSC_RC_TGT_OFFSET_HI_CONST;
+   vdsc_cfg->rc_tgt_offset_low = DSC_RC_TGT_OFFSET_LO_CONST;
+
+   if (vdsc_cfg->bits_per_component <= 10)
+   vdsc_cfg->mux_word_size = DSC_MUX_WORD_SIZE_8_10_BPC;
+   else
+   vdsc_cfg->mux_word_size = DSC_MUX_WORD_SIZE_12_BPC;
+}
+EXPORT_SYMBOL(drm_dsc_set_const_params);
+
 /* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
 const u16 drm_dsc_rc_buf_thresh[] = {
896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
diff --git a/include/drm/display/drm_dsc_helper.h 
b/include/drm/display/drm_dsc_helper.h
index 0bb0c3afd740..4448c482b092 100644
--- a/include/drm/display/drm_dsc_helper.h
+++ b/include/drm/display/drm_dsc_helper.h
@@ -21,6 +21,7 @@ void drm_dsc_dp_pps_header_init(struct dp_sdp_header 
*pps_header);
 int drm_dsc_dp_rc_buffer_size(u8 rc_buffer_block_size, u8 rc_buffer_size);
 void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_sdp,
  const struct drm_dsc_config *dsc_cfg);
+void drm_dsc_set_const_params(struct drm_dsc_config *vdsc_cfg);
 void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg);
 int drm_dsc_setup_rc_params(struct drm_dsc_config *vdsc_cfg, enum 
drm_dsc_params_kind kind);
 int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg);
-- 
2.39.2
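
As a rough userspace illustration of what the helper above does, the
sketch below mirrors its logic on a stand-in struct. Everything
prefixed sketch_/SKETCH_ is hypothetical. The RC constants reuse the
values the msm driver hard-coded before this series (8192, 6, 3, 3);
the mux word sizes (48 and 64) are assumed to match the kernel's
DSC_MUX_WORD_SIZE_* definitions, which are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values the msm driver used to hard-code before switching to the helper. */
#define SKETCH_RC_MODEL_SIZE          8192
#define SKETCH_RC_EDGE_FACTOR         6
#define SKETCH_RC_TGT_OFFSET_HI       3
#define SKETCH_RC_TGT_OFFSET_LO       3
/* Assumed values for DSC_MUX_WORD_SIZE_8_10_BPC / DSC_MUX_WORD_SIZE_12_BPC. */
#define SKETCH_MUX_WORD_SIZE_8_10_BPC 48
#define SKETCH_MUX_WORD_SIZE_12_BPC   64

/* Hypothetical stand-in holding only the fields the helper touches. */
struct sketch_dsc_config {
	uint8_t bits_per_component;
	uint16_t rc_model_size;
	uint8_t rc_edge_factor;
	uint8_t rc_tgt_offset_high;
	uint8_t rc_tgt_offset_low;
	uint8_t mux_word_size;
};

static void sketch_set_const_params(struct sketch_dsc_config *cfg)
{
	assert(cfg != NULL);

	cfg->rc_model_size = SKETCH_RC_MODEL_SIZE;
	cfg->rc_edge_factor = SKETCH_RC_EDGE_FACTOR;
	cfg->rc_tgt_offset_high = SKETCH_RC_TGT_OFFSET_HI;
	cfg->rc_tgt_offset_low = SKETCH_RC_TGT_OFFSET_LO;

	/* mux_word_size depends only on the component depth. */
	if (cfg->bits_per_component <= 10)
		cfg->mux_word_size = SKETCH_MUX_WORD_SIZE_8_10_BPC;
	else
		cfg->mux_word_size = SKETCH_MUX_WORD_SIZE_12_BPC;
}
```

One consequence worth noting: since the helper reads
bits_per_component, a caller has to fill that field in before calling
it, otherwise mux_word_size is derived from a stale value.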



[Freedreno] [PATCH 06/10] drm/display/dsc: split DSC 1.2 and DSC 1.1 (pre-SCR) parameters

2023-02-28 Thread Dmitry Baryshkov
The array of rc_parameters contains a mixture of parameters from DSC 1.1
and DSC 1.2 standards. Split these two configuration arrays in
preparation for adding more configuration data.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c  | 127 ++
 drivers/gpu/drm/i915/display/intel_vdsc.c |  10 +-
 include/drm/display/drm_dsc_helper.h  |   7 +-
 3 files changed, 119 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index a6d11f474656..51794b40526a 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -326,11 +326,81 @@ struct rc_parameters_data {
 
 #define DSC_BPP(bpp)   ((bpp) << 4)
 
+static const struct rc_parameters_data rc_parameters_pre_scr[] = {
+{ DSC_BPP(8), 8,
+   /* 8BPP/8BPC */
+   { 512, 12, 6144, 3, 12, 11, 11, {
+   { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
+   { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   }
+   }
+},
+{ DSC_BPP(8), 10,
+   /* 8BPP/10BPC */
+   { 512, 12, 6144, 7, 16, 15, 15, {
+   /*
+* DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
+* VESA DSC 1.1 Table E-5 sets it to 4.
+*/
+   { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
+   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
+   { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
+   }
+   }
+},
+{ DSC_BPP(8), 12,
+   /* 8BPP/12BPC */
+   { 512, 12, 6144, 11, 20, 19, 19, {
+   { 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
+   { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 18, -10 }, { 13, 19, -10 },
+   { 13, 20, -12 }, { 13, 21, -12 }, { 15, 21, -12 },
+   { 21, 23, -12 }
+   }
+   }
+},
+{ DSC_BPP(12), 8,
+   /* 12BPP/8BPC */
+   { 341, 15, 2048, 3, 12, 11, 11, {
+   { 0, 2, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
+   { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   }
+   }
+},
+{ DSC_BPP(12), 10,
+   /* 12BPP/10BPC */
+   { 341, 15, 2048, 7, 16, 15, 15, {
+   { 0, 2, 2 }, { 2, 5, 0 }, { 3, 7, 0 }, { 4, 8, -2 },
+   { 6, 9, -4 }, { 7, 10, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
+   { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
+   }
+   }
+},
+{ DSC_BPP(12), 12,
+   /* 12BPP/12BPC */
+   { 341, 15, 2048, 11, 20, 19, 19, {
+   { 0, 6, 2 }, { 4, 9, 0 }, { 7, 11, 0 }, { 8, 12, -2 },
+   { 10, 13, -4 }, { 11, 14, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 18, -10 }, { 13, 19, -10 },
+   { 13, 20, -12 }, { 13, 21, -12 }, { 15, 21, -12 },
+   { 21, 23, -12 }
+   }
+   }
+},
+{ /* sentinel */ }
+};
+
 /*
  * Selected Rate Control Related Parameter Recommended Values
  * from DSC_v1.11 spec & C Model release: DSC_model_20161212
  */
-static const struct rc_parameters_data rc_parameters[] = {
+static const struct rc_parameters_data rc_parameters_1_2_444[] = {
 { DSC_BPP(6), 8,
/* 6BPP/8BPC */
{ 768, 15, 6144, 3, 13, 11, 11, {
@@ -390,22 +460,18 @@ static const struct rc_parameters_data rc_parameters[] = {
{ 512, 12, 6144, 3, 12, 11, 11, {
{ 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
{ 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
-   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
-   { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 10, -10 }, { 5, 11, -12 },
+   { 5, 11, -12 }, { 9, 12, -12 }, { 12, 13, -12 }
}
}
 },
 { DSC_BPP(8), 10,
/* 8BPP/10BPC */
{ 512, 12, 6144, 7, 16, 15, 15, {
-   /*
-* DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
-* VESA DSC 1.1 Table E-5 sets it to 4.
-*/
-   { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
+   { 0, 8, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
{ 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
-   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16,

[Freedreno] [PATCH 02/10] drm/i915/dsc: move rc_buf_thresh values to common helper

2023-02-28 Thread Dmitry Baryshkov
The rc_buf_thresh values are common to all DSC implementations. Move
them to the common helper together with the code to propagate them to
the drm_dsc_config.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c  | 37 +++
 drivers/gpu/drm/i915/display/intel_vdsc.c | 24 +--
 include/drm/display/drm_dsc_helper.h  |  1 +
 3 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index c869c6e51e2b..ab8679c158b5 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -270,6 +270,43 @@ void drm_dsc_pps_payload_pack(struct 
drm_dsc_picture_parameter_set *pps_payload,
 }
 EXPORT_SYMBOL(drm_dsc_pps_payload_pack);
 
+/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
+const u16 drm_dsc_rc_buf_thresh[] = {
+   896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
+   7744, 7872, 8000, 8064
+};
+EXPORT_SYMBOL(drm_dsc_rc_buf_thresh);
+
+/**
+ * drm_dsc_set_rc_buf_thresh() - Set thresholds for the RC model
+ * in accordance with the DSC 1.2 specification.
+ *
+ * @vdsc_cfg: DSC Configuration data partially filled by driver
+ */
+void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg)
+{
+   int i = 0;
+
+   for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
+   /*
+* six 0s are appended to the lsb of each threshold value
+* internally in h/w.
+* Only 8 bits are allowed for programming RcBufThreshold
+*/
+   vdsc_cfg->rc_buf_thresh[i] = drm_dsc_rc_buf_thresh[i] >> 6;
+   }
+
+   /*
+* For 6bpp, RC Buffer threshold 12 and 13 need a different value
+* as per C Model
+*/
+   if (vdsc_cfg->bits_per_pixel == 6 << 4) {
+   vdsc_cfg->rc_buf_thresh[12] = 7936 >> 6;
+   vdsc_cfg->rc_buf_thresh[13] = 8000 >> 6;
+   }
+}
+EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
+
 /**
  * drm_dsc_compute_rc_parameters() - Write rate control
  * parameters to the dsc configuration defined in
diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
b/drivers/gpu/drm/i915/display/intel_vdsc.c
index d080741fd0b3..b4faab4c8fb3 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
@@ -36,12 +36,6 @@ enum COLUMN_INDEX_BPC {
MAX_COLUMN_INDEX
 };
 
-/* From DSC_v1.11 spec, rc_parameter_Set syntax element typically constant */
-static const u16 rc_buf_thresh[] = {
-   896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
-   7744, 7872, 8000, 8064
-};
-
 struct rc_parameters {
u16 initial_xmit_delay;
u8 first_line_bpg_offset;
@@ -474,23 +468,7 @@ int intel_dsc_compute_params(struct intel_crtc_state 
*pipe_config)
vdsc_cfg->bits_per_pixel = compressed_bpp << 4;
vdsc_cfg->bits_per_component = pipe_config->pipe_bpp / 3;
 
-   for (i = 0; i < DSC_NUM_BUF_RANGES - 1; i++) {
-   /*
-* six 0s are appended to the lsb of each threshold value
-* internally in h/w.
-* Only 8 bits are allowed for programming RcBufThreshold
-*/
-   vdsc_cfg->rc_buf_thresh[i] = rc_buf_thresh[i] >> 6;
-   }
-
-   /*
-* For 6bpp, RC Buffer threshold 12 and 13 need a different value
-* as per C Model
-*/
-   if (compressed_bpp == 6) {
-   vdsc_cfg->rc_buf_thresh[12] = 0x7C;
-   vdsc_cfg->rc_buf_thresh[13] = 0x7D;
-   }
+   drm_dsc_set_rc_buf_thresh(vdsc_cfg);
 
/*
 * From XE_LPD onwards we supports compression bpps in steps of 1
diff --git a/include/drm/display/drm_dsc_helper.h 
b/include/drm/display/drm_dsc_helper.h
index 8b41edbbabab..706ba1d34742 100644
--- a/include/drm/display/drm_dsc_helper.h
+++ b/include/drm/display/drm_dsc_helper.h
@@ -14,6 +14,7 @@ void drm_dsc_dp_pps_header_init(struct dp_sdp_header 
*pps_header);
 int drm_dsc_dp_rc_buffer_size(u8 rc_buffer_block_size, u8 rc_buffer_size);
 void drm_dsc_pps_payload_pack(struct drm_dsc_picture_parameter_set *pps_sdp,
  const struct drm_dsc_config *dsc_cfg);
+void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config *vdsc_cfg);
 int drm_dsc_compute_rc_parameters(struct drm_dsc_config *vdsc_cfg);
 
 #endif /* _DRM_DSC_HELPER_H_ */
-- 
2.39.2
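
The helper replaces i915's literal 0x7C/0x7D overrides with
7936 >> 6 and 8000 >> 6. A small userspace sketch, assuming nothing
beyond the threshold table and shift shown in the patch, can confirm
that the two spellings agree. NUM_BUF_RANGES and
sketch_set_rc_buf_thresh are hypothetical names; the value 15 is
inferred from the 15 range entries per table row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BUF_RANGES 15 /* stands in for DSC_NUM_BUF_RANGES */

/* Threshold table copied from the patch. */
static const uint16_t rc_buf_thresh[NUM_BUF_RANGES - 1] = {
	896, 1792, 2688, 3584, 4480, 5376, 6272, 6720, 7168, 7616,
	7744, 7872, 8000, 8064
};

/* bits_per_pixel uses the same 4-fractional-bit encoding as the patch. */
static void sketch_set_rc_buf_thresh(uint8_t out[NUM_BUF_RANGES - 1],
				     uint16_t bits_per_pixel)
{
	assert(out != NULL);

	/* Hardware appends six zero bits, so only the top 8 bits are stored. */
	for (int i = 0; i < NUM_BUF_RANGES - 1; i++)
		out[i] = (uint8_t)(rc_buf_thresh[i] >> 6);

	/* 6bpp uses different values for thresholds 12 and 13. */
	if (bits_per_pixel == (6 << 4)) {
		out[12] = 7936 >> 6; /* 124, i.e. the old 0x7C */
		out[13] = 8000 >> 6; /* 125, i.e. the old 0x7D */
	}
}
```

Writing the overrides as shifted model values keeps them in the same
units as the table, which makes them easier to compare against the C
model than the pre-shifted hex constants.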



[Freedreno] [PATCH 03/10] drm/i915/dsc: move DSC tables to DRM DSC helper

2023-02-28 Thread Dmitry Baryshkov
Move the DSC RC tables to the DRM DSC helper. No additional code changes
or cleanups are part of this commit; the moved code will be cleaned up
in the followup commits.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/display/drm_dsc_helper.c  | 364 ++
 drivers/gpu/drm/i915/display/intel_vdsc.c | 319 +--
 include/drm/display/drm_dsc_helper.h  |   1 +
 3 files changed, 372 insertions(+), 312 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dsc_helper.c 
b/drivers/gpu/drm/display/drm_dsc_helper.c
index ab8679c158b5..deaa84722bd4 100644
--- a/drivers/gpu/drm/display/drm_dsc_helper.c
+++ b/drivers/gpu/drm/display/drm_dsc_helper.c
@@ -307,6 +307,370 @@ void drm_dsc_set_rc_buf_thresh(struct drm_dsc_config 
*vdsc_cfg)
 }
 EXPORT_SYMBOL(drm_dsc_set_rc_buf_thresh);
 
+enum ROW_INDEX_BPP {
+   ROW_INDEX_6BPP = 0,
+   ROW_INDEX_8BPP,
+   ROW_INDEX_10BPP,
+   ROW_INDEX_12BPP,
+   ROW_INDEX_15BPP,
+   MAX_ROW_INDEX
+};
+
+enum COLUMN_INDEX_BPC {
+   COLUMN_INDEX_8BPC = 0,
+   COLUMN_INDEX_10BPC,
+   COLUMN_INDEX_12BPC,
+   COLUMN_INDEX_14BPC,
+   COLUMN_INDEX_16BPC,
+   MAX_COLUMN_INDEX
+};
+
+struct rc_parameters {
+   u16 initial_xmit_delay;
+   u8 first_line_bpg_offset;
+   u16 initial_offset;
+   u8 flatness_min_qp;
+   u8 flatness_max_qp;
+   u8 rc_quant_incr_limit0;
+   u8 rc_quant_incr_limit1;
+   struct drm_dsc_rc_range_parameters rc_range_params[DSC_NUM_BUF_RANGES];
+};
+
+/*
+ * Selected Rate Control Related Parameter Recommended Values
+ * from DSC_v1.11 spec & C Model release: DSC_model_20161212
+ */
+static const struct rc_parameters rc_parameters[][MAX_COLUMN_INDEX] = {
+{
+   /* 6BPP/8BPC */
+   { 768, 15, 6144, 3, 13, 11, 11, {
+   { 0, 4, 0 }, { 1, 6, -2 }, { 3, 8, -2 }, { 4, 8, -4 },
+   { 5, 9, -6 }, { 5, 9, -6 }, { 6, 9, -6 }, { 6, 10, -8 },
+   { 7, 11, -8 }, { 8, 12, -10 }, { 9, 12, -10 }, { 10, 12, -12 },
+   { 10, 12, -12 }, { 11, 12, -12 }, { 13, 14, -12 }
+   }
+   },
+   /* 6BPP/10BPC */
+   { 768, 15, 6144, 7, 17, 15, 15, {
+   { 0, 8, 0 }, { 3, 10, -2 }, { 7, 12, -2 }, { 8, 12, -4 },
+   { 9, 13, -6 }, { 9, 13, -6 }, { 10, 13, -6 }, { 10, 14, -8 },
+   { 11, 15, -8 }, { 12, 16, -10 }, { 13, 16, -10 },
+   { 14, 16, -12 }, { 14, 16, -12 }, { 15, 16, -12 },
+   { 17, 18, -12 }
+   }
+   },
+   /* 6BPP/12BPC */
+   { 768, 15, 6144, 11, 21, 19, 19, {
+   { 0, 12, 0 }, { 5, 14, -2 }, { 11, 16, -2 }, { 12, 16, -4 },
+   { 13, 17, -6 }, { 13, 17, -6 }, { 14, 17, -6 }, { 14, 18, -8 },
+   { 15, 19, -8 }, { 16, 20, -10 }, { 17, 20, -10 },
+   { 18, 20, -12 }, { 18, 20, -12 }, { 19, 20, -12 },
+   { 21, 22, -12 }
+   }
+   },
+   /* 6BPP/14BPC */
+   { 768, 15, 6144, 15, 25, 23, 23, {
+   { 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
+   { 17, 21, -6 }, { 17, 21, -6 }, { 18, 21, -6 }, { 18, 22, -8 },
+   { 19, 23, -8 }, { 20, 24, -10 }, { 21, 24, -10 },
+   { 22, 24, -12 }, { 22, 24, -12 }, { 23, 24, -12 },
+   { 25, 26, -12 }
+   }
+   },
+   /* 6BPP/16BPC */
+   { 768, 15, 6144, 19, 29, 27, 27, {
+   { 0, 20, 0 }, { 9, 22, -2 }, { 19, 24, -2 }, { 20, 24, -4 },
+   { 21, 25, -6 }, { 21, 25, -6 }, { 22, 25, -6 }, { 22, 26, -8 },
+   { 23, 27, -8 }, { 24, 28, -10 }, { 25, 28, -10 },
+   { 26, 28, -12 }, { 26, 28, -12 }, { 27, 28, -12 },
+   { 29, 30, -12 }
+   }
+   },
+},
+{
+   /* 8BPP/8BPC */
+   { 512, 12, 6144, 3, 12, 11, 11, {
+   { 0, 4, 2 }, { 0, 4, 0 }, { 1, 5, 0 }, { 1, 6, -2 },
+   { 3, 7, -4 }, { 3, 7, -6 }, { 3, 7, -8 }, { 3, 8, -8 },
+   { 3, 9, -8 }, { 3, 10, -10 }, { 5, 11, -10 }, { 5, 12, -12 },
+   { 5, 13, -12 }, { 7, 13, -12 }, { 13, 15, -12 }
+   }
+   },
+   /* 8BPP/10BPC */
+   { 512, 12, 6144, 7, 16, 15, 15, {
+   /*
+* DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
+* VESA DSC 1.1 Table E-5 sets it to 4.
+*/
+   { 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
+   { 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
+   { 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
+   { 9, 17, -12 }, { 11, 17, -12 }, { 17, 19, -12 }
+   }
+   },
+   /* 8BPP/12BPC */
+   { 512, 12, 6144, 11, 20, 19, 19, {
+   { 0, 12, 2 }, { 4, 12, 0 }, { 9, 13, 0 }, { 9, 14, -2 },
+   { 11, 15, -4 }, { 11, 15, -6 }, { 11, 15, -8 }, { 11, 16, -8 },
+   { 11, 17, -8 }, { 11, 18, -10 }, { 13, 1

[Freedreno] [PATCH 01/10] drm/i915/dsc: change DSC param tables to follow the DSC model

2023-02-28 Thread Dmitry Baryshkov
After cross-checking the DSC models (20150914, 20161212, 20210623),
change the values in the rc_parameters tables to follow the config files
present inside the DSC model. Handle the two places where the i915
tables diverged from the model by patching the RC values in the code.

Note: I left one case uncorrected, 8bpp/10bpc/range_max_qp[0], because
Table E-5 of VESA DSC 1.1 sets it to 4.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/i915/display/intel_vdsc.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_vdsc.c 
b/drivers/gpu/drm/i915/display/intel_vdsc.c
index 207b2a648d32..d080741fd0b3 100644
--- a/drivers/gpu/drm/i915/display/intel_vdsc.c
+++ b/drivers/gpu/drm/i915/display/intel_vdsc.c
@@ -86,7 +86,7 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
}
},
/* 6BPP/14BPC */
-   { 768, 15, 6144, 15, 25, 23, 27, {
+   { 768, 15, 6144, 15, 25, 23, 23, {
{ 0, 16, 0 }, { 7, 18, -2 }, { 15, 20, -2 }, { 16, 20, -4 },
{ 17, 21, -6 }, { 17, 21, -6 }, { 18, 21, -6 }, { 18, 22, -8 },
{ 19, 23, -8 }, { 20, 24, -10 }, { 21, 24, -10 },
@@ -115,6 +115,10 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
},
/* 8BPP/10BPC */
{ 512, 12, 6144, 7, 16, 15, 15, {
+   /*
+* DSC model/pre-SCR-cfg has 8 for range_max_qp[0], however
+* VESA DSC 1.1 Table E-5 sets it to 4.
+*/
{ 0, 4, 2 }, { 4, 8, 0 }, { 5, 9, 0 }, { 5, 10, -2 },
{ 7, 11, -4 }, { 7, 11, -6 }, { 7, 11, -8 }, { 7, 12, -8 },
{ 7, 13, -8 }, { 7, 14, -10 }, { 9, 15, -10 }, { 9, 16, -12 },
@@ -132,7 +136,7 @@ static const struct rc_parameters 
rc_parameters[][MAX_COLUMN_INDEX] = {
},
/* 8BPP/14BPC */
{ 512, 12, 6144, 15, 24, 23, 23, {
-   { 0, 12, 0 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
+   { 0, 12, 2 }, { 5, 13, 0 }, { 11, 15, 0 }, { 12, 17, -2 },
{ 15, 19, -4 }, { 15, 19, -6 }, { 15, 19, -8 }, { 15, 20, -8 },
{ 15, 21, -8 }, { 15, 22, -10 }, { 17, 22, -10 },
{ 17, 23, -12 }, { 17, 23, -12 }, { 21, 24, -12 },
@@ -529,6 +533,16 @@ int intel_dsc_compute_params(struct intel_crtc_state 
*pipe_config)
DSC_RANGE_BPG_OFFSET_MASK;
}
 
+   if (DISPLAY_VER(dev_priv) < 13) {
+   if (compressed_bpp == 6 &&
+   vdsc_cfg->bits_per_component == 8)
+   vdsc_cfg->rc_quant_incr_limit1 = 23;
+
+   if (compressed_bpp == 8 &&
+   vdsc_cfg->bits_per_component == 14)
+   vdsc_cfg->rc_range_params[0].range_bpg_offset = 0;
+   }
+
/*
 * BitsPerComponent value determines mux_word_size:
 * When BitsPerComponent is less than or 10bpc, muxWordSize will be 
equal to
-- 
2.39.2



[Freedreno] [PATCH 00/10] drm/i915: move DSC RC tables to drm_dsc_helper.c

2023-02-28 Thread Dmitry Baryshkov
Other platforms (msm) will benefit from sharing the DSC config setup
functions. This series moves parts of static DSC config data from the
i915 driver to the common helpers to be used by other drivers.

Note: the RC parameters were cross-checked against config files found in
DSC model 20210623, 20161212 (and 20150914). The first patch modifies
tables according to those config files, while preserving parameter
values using the code. I have not changed one of the values in the
pre-SCR config file as it clearly looks like a typo in the config file,
considering the table E in DSC 1.1 and in the DSC 1.1 SCR.

Dmitry Baryshkov (10):
  drm/i915/dsc: change DSC param tables to follow the DSC model
  drm/i915/dsc: move rc_buf_thresh values to common helper
  drm/i915/dsc: move DSC tables to DRM DSC helper
  drm/i915/dsc: stop using interim structure for calculated params
  drm/display/dsc: use flat array for rc_parameters lookup
  drm/display/dsc: split DSC 1.2 and DSC 1.1 (pre-SCR) parameters
  drm/display/dsc: include the rest of pre-SCR parameters
  drm/display/dsc: add YCbCr 4:2:2 and 4:2:0 RC parameters
  drm/display/dsc: add helper to set semi-const parameters
  drm/msm/dsi: use new helpers for DSC setup

 drivers/gpu/drm/display/drm_dsc_helper.c  | 1001 +
 drivers/gpu/drm/i915/display/intel_vdsc.c |  432 +
 drivers/gpu/drm/msm/dsi/dsi_host.c|   61 +-
 include/drm/display/drm_dsc_helper.h  |   10 +
 4 files changed, 1058 insertions(+), 446 deletions(-)

-- 
2.39.2



Re: [Freedreno] [PATCH v7 07/15] dma-buf/sw_sync: Add fence deadline support

2023-02-28 Thread Pekka Paalanen
On Mon, 27 Feb 2023 11:35:13 -0800
Rob Clark  wrote:

> From: Rob Clark 
> 
> This consists of simply storing the most recent deadline, and adding an
> ioctl to retrieve the deadline.  This can be used in conjunction with
> the SET_DEADLINE ioctl on a fence fd for testing.  Ie. create various
> sw_sync fences, merge them into a fence-array, set deadline on the
> fence-array and confirm that it is propagated properly to each fence.
> 
> v2: Switch UABI to express deadline as u64
> v3: More verbose UAPI docs, show how to convert from timespec
> 
> Signed-off-by: Rob Clark 
> Reviewed-by: Christian König 
> ---
>  drivers/dma-buf/sw_sync.c  | 58 ++
>  drivers/dma-buf/sync_debug.h   |  2 ++
>  include/uapi/linux/sync_file.h |  6 +++-
>  3 files changed, 65 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> index 348b3a9170fa..3e2315ee955b 100644
> --- a/drivers/dma-buf/sw_sync.c
> +++ b/drivers/dma-buf/sw_sync.c
> @@ -52,12 +52,28 @@ struct sw_sync_create_fence_data {
>   __s32   fence; /* fd of new fence */
>  };
>  
> +/**
> + * struct sw_sync_get_deadline - get the deadline hint of a sw_sync fence
> + * @deadline_ns: absolute time of the deadline
> + * @pad: must be zero
> + * @fence_fd:the sw_sync fence fd (in)
> + *
> + * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)

Hi,

the commit message explains this returns the "most recent" deadline,
but the doc here forgets to mention that. I suppose that means the
most recently set deadline and not the deadline furthest forward in
time (largest value).

Is "most recent" the appropriate behaviour when multiple deadlines have
been set? Would you not want the earliest deadline set so far instead?

What if none has been set?
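
To make the alternative concrete, here is a minimal userspace sketch of
"keep the earliest deadline, and report when none was set". It is only an
illustration of the behaviour asked about above, not the sw_sync code from
the patch; struct deadline_tracker and the tracker_*() helpers are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; not the sw_sync implementation. */
struct deadline_tracker {
	bool has_deadline;    /* false until the first hint arrives */
	uint64_t deadline_ns; /* earliest hint seen so far, in ns */
};

static void tracker_init(struct deadline_tracker *t)
{
	assert(t);
	t->has_deadline = false;
	t->deadline_ns = 0;
}

/* Record a hint, keeping only the earliest one seen so far. */
static void tracker_set_deadline(struct deadline_tracker *t,
				 uint64_t deadline_ns)
{
	assert(t);
	if (!t->has_deadline || deadline_ns < t->deadline_ns) {
		t->deadline_ns = deadline_ns;
		t->has_deadline = true;
	}
}

/* Returns false when no hint was ever set, so 0 stays a valid timestamp. */
static bool tracker_get_deadline(const struct deadline_tracker *t,
				 uint64_t *out_ns)
{
	assert(t && out_ns);
	if (!t->has_deadline)
		return false;
	*out_ns = t->deadline_ns;
	return true;
}
```

With an explicit "has a deadline" flag, a getter can distinguish "no
deadline set" from a deadline of zero instead of overloading the value.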

> + */
> +struct sw_sync_get_deadline {
> + __u64   deadline_ns;
> + __u32   pad;
> + __s32   fence_fd;
> +};
> +
>  #define SW_SYNC_IOC_MAGIC 'W'
>  
>  #define SW_SYNC_IOC_CREATE_FENCE _IOWR(SW_SYNC_IOC_MAGIC, 0,\
>   struct sw_sync_create_fence_data)
>  
>  #define SW_SYNC_IOC_INC  _IOW(SW_SYNC_IOC_MAGIC, 1, 
> __u32)
> +#define SW_SYNC_GET_DEADLINE _IOWR(SW_SYNC_IOC_MAGIC, 2, \
> + struct sw_sync_get_deadline)
>  
>  static const struct dma_fence_ops timeline_fence_ops;
>  
> @@ -171,6 +187,13 @@ static void timeline_fence_timeline_value_str(struct 
> dma_fence *fence,
>   snprintf(str, size, "%d", parent->value);
>  }
>  
> +static void timeline_fence_set_deadline(struct dma_fence *fence, ktime_t 
> deadline)
> +{
> + struct sync_pt *pt = dma_fence_to_sync_pt(fence);
> +
> + pt->deadline = deadline;
> +}
> +
>  static const struct dma_fence_ops timeline_fence_ops = {
>   .get_driver_name = timeline_fence_get_driver_name,
>   .get_timeline_name = timeline_fence_get_timeline_name,
> @@ -179,6 +202,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
>   .release = timeline_fence_release,
>   .fence_value_str = timeline_fence_value_str,
>   .timeline_value_str = timeline_fence_timeline_value_str,
> + .set_deadline = timeline_fence_set_deadline,
>  };
>  
>  /**
> @@ -387,6 +411,37 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, 
> unsigned long arg)
>   return 0;
>  }
>  
> +static int sw_sync_ioctl_get_deadline(struct sync_timeline *obj, unsigned 
> long arg)
> +{
> + struct sw_sync_get_deadline data;
> + struct dma_fence *fence;
> + struct sync_pt *pt;
> +
> + if (copy_from_user(&data, (void __user *)arg, sizeof(data)))
> + return -EFAULT;
> +
> + if (data.deadline_ns || data.pad)
> + return -EINVAL;
> +
> + fence = sync_file_get_fence(data.fence_fd);
> + if (!fence)
> + return -EINVAL;
> +
> + pt = dma_fence_to_sync_pt(fence);
> + if (!pt)
> + return -EINVAL;
> +
> +
> + data.deadline_ns = ktime_to_ns(pt->deadline);
> +
> + dma_fence_put(fence);
> +
> + if (copy_to_user((void __user *)arg, &data, sizeof(data)))
> + return -EFAULT;
> +
> + return 0;
> +}
> +
>  static long sw_sync_ioctl(struct file *file, unsigned int cmd,
> unsigned long arg)
>  {
> @@ -399,6 +454,9 @@ static long sw_sync_ioctl(struct file *file, unsigned int 
> cmd,
>   case SW_SYNC_IOC_INC:
>   return sw_sync_ioctl_inc(obj, arg);
>  
> + case SW_SYNC_GET_DEADLINE:
> + return sw_sync_ioctl_get_deadline(obj, arg);
> +
>   default:
>   return -ENOTTY;
>   }
> diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
> index 6176e52ba2d7..2e0146d0bdbb 100644
> --- a/drivers/dma-buf/sync_debug.h
> +++ b/drivers/dma-buf/sync_debug.h
> @@ -55,11 +55,13 @@ static inline struct sync_timeline 
> *dma_fence_parent(struct dma_fence *fence)
>   * @base: base fence object
>   * @link: link on the sync timeline's list
>   * @node: node 

Re: [Freedreno] [PATCH v7 06/15] dma-buf/sync_file: Support (E)POLLPRI

2023-02-28 Thread Pekka Paalanen
On Mon, 27 Feb 2023 11:35:12 -0800
Rob Clark  wrote:

> From: Rob Clark 
> 
> Allow userspace to use the EPOLLPRI/POLLPRI flag to indicate an urgent
> wait (as opposed to a "housekeeping" wait to know when to cleanup after
> some work has completed).  Usermode components of GPU driver stacks
> often poll() on fence fd's to know when it is safe to do things like
> free or reuse a buffer, but they can also poll() on a fence fd when
> waiting to read back results from the GPU.  The EPOLLPRI/POLLPRI flag
> lets the kernel differentiate these two cases.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/dma-buf/sync_file.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> index 418021cfb87c..cbe96295373b 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -192,6 +192,14 @@ static __poll_t sync_file_poll(struct file *file, 
> poll_table *wait)
>  {
>   struct sync_file *sync_file = file->private_data;
>  
> + /*
> +  * The POLLPRI/EPOLLPRI flag can be used to signal that
> +  * userspace wants the fence to signal ASAP, express this
> +  * as an immediate deadline.
> +  */
> + if (poll_requested_events(wait) & EPOLLPRI)
> + dma_fence_set_deadline(sync_file->fence, ktime_get());

Hi,

I don't think this doc will appear anywhere where it could be found,
maybe not in kernel HTML doc at all?
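
For reference, the userspace side that this comment would need to document
looks roughly like the sketch below. It assumes the behaviour proposed in
this patch (POLLPRI in the requested events is treated as an immediate
deadline hint) and that a signaled fence fd reports POLLIN; fence_wait() is
a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Wait on a fence fd. Returns 1 if the fd became readable (fence
 * signaled), 0 on timeout, -1 on error.
 *
 * Assumption: with this patch applied, adding POLLPRI to the requested
 * events tells the kernel the wait is urgent.
 */
static int fence_wait(int fence_fd, bool urgent, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = fence_fd,
		.events = POLLIN | (urgent ? POLLPRI : 0),
	};
	int ret;

	assert(fence_fd >= 0);

	/* Retry if a signal interrupts the wait. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret <= 0)
		return ret;

	/* Anything other than POLLIN (POLLERR, POLLNVAL) is an error here. */
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A housekeeping wait would call fence_wait(fd, false, timeout), while a
readback wait would pass urgent = true.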

I also think this is not a good idea, but not my call.


Thanks,
pq


> +
>   poll_wait(file, &sync_file->wq, wait);
>  
>   if (list_empty(&sync_file->cb.node) &&





Re: [Freedreno] [PATCH v7 05/15] dma-buf/sync_file: Add SET_DEADLINE ioctl

2023-02-28 Thread Pekka Paalanen
On Mon, 27 Feb 2023 11:35:11 -0800
Rob Clark  wrote:

> From: Rob Clark 
> 
> The initial purpose is for igt tests, but this would also be useful for
> compositors that wait until close to vblank deadline to make decisions
> about which frame to show.
> 
> The igt tests can be found at:
> 
> https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/fence-deadline
> 
> v2: Clarify the timebase, add link to igt tests
> v3: Use u64 value in ns to express deadline.
> v4: More doc
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/dma-buf/dma-fence.c|  3 ++-
>  drivers/dma-buf/sync_file.c| 19 +++
>  include/uapi/linux/sync_file.h | 22 ++
>  3 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index e103e821d993..7761ceeae620 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -933,7 +933,8 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>   *   the GPU's devfreq to reduce frequency, when in fact the opposite is 
> what is
>   *   needed.
>   *
> - * To this end, deadline hint(s) can be set on a &dma_fence via 
> &dma_fence_set_deadline.
> + * To this end, deadline hint(s) can be set on a &dma_fence via 
> &dma_fence_set_deadline
> + * (or indirectly via userspace facing ioctls like &SYNC_IOC_SET_DEADLINE).
>   * The deadline hint provides a way for the waiting driver, or userspace, to
>   * convey an appropriate sense of urgency to the signaling driver.

Hi,

when the kernel HTML doc is generated, I assume the above becomes a
link to "DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence", right?

>   *
> diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
> index af57799c86ce..418021cfb87c 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -350,6 +350,22 @@ static long sync_file_ioctl_fence_info(struct sync_file 
> *sync_file,
>   return ret;
>  }
>  
> +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file,
> + unsigned long arg)
> +{
> + struct sync_set_deadline ts;
> +
> + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts)))
> + return -EFAULT;
> +
> + if (ts.pad)
> + return -EINVAL;
> +
> + dma_fence_set_deadline(sync_file->fence, ns_to_ktime(ts.deadline_ns));
> +
> + return 0;
> +}
> +
>  static long sync_file_ioctl(struct file *file, unsigned int cmd,
>   unsigned long arg)
>  {
> @@ -362,6 +378,9 @@ static long sync_file_ioctl(struct file *file, unsigned 
> int cmd,
>   case SYNC_IOC_FILE_INFO:
>   return sync_file_ioctl_fence_info(sync_file, arg);
>  
> + case SYNC_IOC_SET_DEADLINE:
> + return sync_file_ioctl_set_deadline(sync_file, arg);
> +
>   default:
>   return -ENOTTY;
>   }
> diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h
> index ee2dcfb3d660..49325cf6749b 100644
> --- a/include/uapi/linux/sync_file.h
> +++ b/include/uapi/linux/sync_file.h
> @@ -67,6 +67,21 @@ struct sync_file_info {
>   __u64   sync_fence_info;
>  };
>  
> +/**
> + * struct sync_set_deadline - set a deadline hint on a fence
> + * @deadline_ns: absolute time of the deadline

Is it legal to pass zero as deadline_ns?

> + * @pad: must be zero
> + *
> + * The timebase for the deadline is CLOCK_MONOTONIC (same as vblank)

Does something here provide doc links to "DOC: SYNC_IOC_SET_DEADLINE -
set a deadline on a fence" and to the "DOC: deadline hints"?
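
As an example of what such docs could show, below is a minimal sketch of
computing deadline_ns on the CLOCK_MONOTONIC timebase from userspace.
timespec_to_ns() and deadline_from_now() are hypothetical helper names and
are not part of the proposed UAPI.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Convert a non-negative timespec to nanoseconds. */
static uint64_t timespec_to_ns(const struct timespec *ts)
{
	assert(ts && ts->tv_sec >= 0);
	return (uint64_t)ts->tv_sec * NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/*
 * Compute an absolute deadline delta_ns from now on CLOCK_MONOTONIC.
 * Returns 0 on success, -1 if the clock could not be read.
 */
static int deadline_from_now(uint64_t delta_ns, uint64_t *deadline_ns)
{
	struct timespec now;

	assert(deadline_ns);
	if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
		return -1;

	*deadline_ns = timespec_to_ns(&now) + delta_ns;
	return 0;
}
```

The result would then go into sync_set_deadline.deadline_ns, with pad set
to zero, before calling the SYNC_IOC_SET_DEADLINE ioctl as proposed in this
patch.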

> + */
> +struct sync_set_deadline {
> + __u64   deadline_ns;
> + /* Not strictly needed for alignment but gives some possibility
> +  * for future extension:
> +  */
> + __u64   pad;
> +};
> +
>  #define SYNC_IOC_MAGIC   '>'
>  
>  /**
> @@ -95,4 +110,11 @@ struct sync_file_info {
>   */
>  #define SYNC_IOC_FILE_INFO   _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info)
>  
> +/**
> + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence
> + *
> + * Allows userspace to set a deadline on a fence, see 
> dma_fence_set_deadline()

Does something here provide doc links to struct sync_set_deadline and
to the "DOC: deadline hints"?

> + */
> +#define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct 
> sync_set_deadline)
> +
>  #endif /* _UAPI_LINUX_SYNC_H */

With all those links added:
Acked-by: Pekka Paalanen 


Thanks,
pq




Re: [Freedreno] [PATCH v7 01/15] dma-buf/dma-fence: Add deadline awareness

2023-02-28 Thread Pekka Paalanen
On Mon, 27 Feb 2023 11:35:07 -0800
Rob Clark  wrote:

> From: Rob Clark 
> 
> Add a way to hint to the fence signaler of an upcoming deadline, such as
> vblank, which the fence waiter would prefer not to miss.  This is to aid
> the fence signaler in making power management decisions, like boosting
> frequency as the deadline approaches and awareness of missing deadlines
> so that can be factored in to the frequency scaling.
> 
> v2: Drop dma_fence::deadline and related logic to filter duplicate
> deadlines, to avoid increasing dma_fence size.  The fence-context
> implementation will need similar logic to track deadlines of all
> the fences on the same timeline.  [ckoenig]
> v3: Clarify locking wrt. set_deadline callback
> v4: Clarify in docs comment that this is a hint
> v5: Drop DMA_FENCE_FLAG_HAS_DEADLINE_BIT.
> v6: More docs
> 
> Signed-off-by: Rob Clark 
> Reviewed-by: Christian König 
> ---
>  Documentation/driver-api/dma-buf.rst |  6 +++
>  drivers/dma-buf/dma-fence.c  | 59 
>  include/linux/dma-fence.h| 20 ++
>  3 files changed, 85 insertions(+)
> 
> diff --git a/Documentation/driver-api/dma-buf.rst 
> b/Documentation/driver-api/dma-buf.rst
> index 622b8156d212..183e480d8cea 100644
> --- a/Documentation/driver-api/dma-buf.rst
> +++ b/Documentation/driver-api/dma-buf.rst
> @@ -164,6 +164,12 @@ DMA Fence Signalling Annotations
>  .. kernel-doc:: drivers/dma-buf/dma-fence.c
> :doc: fence signalling annotation
>  
> +DMA Fence Deadline Hints
> +
> +
> +.. kernel-doc:: drivers/dma-buf/dma-fence.c
> +   :doc: deadline hints
> +
>  DMA Fences Functions Reference
>  ~~
>  
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0de0482cd36e..e103e821d993 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -912,6 +912,65 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
> uint32_t count,
>  }
>  EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>  
> +/**
> + * DOC: deadline hints
> + *
> + * In an ideal world, it would be possible to pipeline a workload 
> sufficiently
> + * that a utilization based device frequency governor could arrive at a 
> minimum
> + * frequency that meets the requirements of the use-case, in order to 
> minimize
> + * power consumption.  But in the real world there are many workloads which
> + * defy this ideal.  For example, but not limited to:
> + *
> + * * Workloads that ping-pong between device and CPU, with alternating 
> periods
> + *   of CPU waiting for device, and device waiting on CPU.  This can result 
> in
> + *   devfreq and cpufreq seeing idle time in their respective domains and in
> + *   result reduce frequency.
> + *
> + * * Workloads that interact with a periodic time based deadline, such as 
> double
> + *   buffered GPU rendering vs vblank sync'd page flipping.  In this 
> scenario,
> + *   missing a vblank deadline results in an *increase* in idle time on the 
> GPU
> + *   (since it has to wait an additional vblank period), sending a single to

Hi Rob,

s/single/signal/ ?

> + *   the GPU's devfreq to reduce frequency, when in fact the opposite is 
> what is
> + *   needed.
> + *
> + * To this end, deadline hint(s) can be set on a &dma_fence via 
> &dma_fence_set_deadline.
> + * The deadline hint provides a way for the waiting driver, or userspace, to
> + * convey an appropriate sense of urgency to the signaling driver.
> + *
> + * A deadline hint is given in absolute ktime (CLOCK_MONOTONIC for userspace
> + * facing APIs).  The time could either be some point in the future (such as
> + * the vblank based deadline for page-flipping, or the start of a 
> compositor's
> + * composition cycle), or the current time to indicate an immediate deadline
> + * hint (Ie. forward progress cannot be made until this fence is signaled).

As "current time" is not a special value, but just an absolute timestamp
like any other, deadlines already in the past must also be accepted?

> + *
> + * Multiple deadlines may be set on a given fence, even in parallel.  See the
> + * documentation for &dma_fence_ops.set_deadline.
> + *
> + * The deadline hint is just that, a hint.  The driver that created the fence
> + * may react by increasing frequency, making different scheduling choices, 
> etc.
> + * Or doing nothing at all.
> + */

Yes! Thank you for writing this! Well explained.

> +
> +/**
> + * dma_fence_set_deadline - set desired fence-wait deadline hint
> + * @fence:the fence that is to be waited on
> + * @deadline: the time by which the waiter hopes for the fence to be
> + *signaled
> + *
> + * Give the fence signaler a hint about an upcoming deadline, such as
> + * vblank, by which point the waiter would prefer the fence to be
> + * signaled by.  This is intended to give feedback to the fence signaler
> + * to aid in power management decisions, such as boosting GPU frequency
> + * if a