date:20220126

Re: [PATCH 01/19] dma-buf-map: Add read/write helpers

2022-01-26 Thread Christian König


Am 27.01.22 um 08:36 schrieb Matthew Brost:

[SNIP]

   /**
* dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
* @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
map->vaddr += incr;
   }
+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to read
+ * @field__:   Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({				\
+	type__ *t__;									\
+	typeof(t__->field__) val__;							\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__),	\
+				       sizeof(t__->field__));				\
+	val__;										\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to write
+ * @field__:   Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({			\
+	type__ *t__;									\
+	typeof(t__->field__) val = val__;						\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),			\
+				     &val, sizeof(t__->field__));			\
+})
+

Uff well that absolutely looks like overkill to me.


Hold on...


That's a rather special use case as far as I can see and I think we should
only have this in the common framework if more than one driver is using it.


I disagree, this is rather elegant.

The i915 can't be the *only* driver that defines a struct which
describes the layout of a dma_buf object.


That's not the problem; amdgpu and nouveau are doing that as well. The
problem is that DMA-buf is a buffer sharing framework between drivers.


In other words, which importer is supposed to use this with a DMA-buf
exported by another device?



IMO this base macro allows *all* other drivers to build on this and write
directly to fields in structures those drivers have defined.


Exactly that's the point. This is something drivers should absolutely 
*NOT* do.


Those are driver internals and it is extremely questionable to move this
into the common framework.


Regards,
Christian.


  Patches
later in this series do this for the GuC ads.

Matt
  

Regards,
Christian.


   #endif /* __DMA_BUF_MAP_H__ */
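
For readers following the thread, here is a minimal user-space sketch of
what the two field macros quoted above do. It is not kernel code: struct
dma_buf_map is re-declared locally, plain memcpy() stands in for
memcpy_fromio()/memcpy_toio(), and the map_* names, demo_roundtrip() and
struct demo_layout are made up for illustration. Like the kernel macros, it
relies on GNU C statement expressions and typeof.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the kernel's struct dma_buf_map (assumption). */
struct dma_buf_map {
	union {
		void *vaddr_iomem;	/* "void __iomem *" in the kernel */
		void *vaddr;
	};
	bool is_iomem;
};

/* memcpy() stands in for memcpy_fromio(); the kernel helper branches on is_iomem. */
static inline void map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
					  size_t offset, size_t len)
{
	memcpy(dst, (const char *)src->vaddr + offset, len);
}

/* memcpy() stands in for memcpy_toio() here as well. */
static inline void map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
					const void *src, size_t len)
{
	memcpy((char *)dst->vaddr + offset, src, len);
}

/* Copy one member out of the mapping; memcpy() copes with any alignment. */
#define map_read_field(map__, type__, field__) ({				\
	__typeof__(((type__ *)NULL)->field__) val__;				\
	map_memcpy_from_offset(&val__, map__, offsetof(type__, field__),	\
			       sizeof(val__));					\
	val__;									\
})

/* Copy one value into the member's slot inside the mapping. */
#define map_write_field(map__, type__, field__, value__) ({			\
	__typeof__(((type__ *)NULL)->field__) tmp__ = (value__);		\
	map_memcpy_to_offset(map__, offsetof(type__, field__),			\
			     &tmp__, sizeof(tmp__));				\
})

/* Hypothetical packed layout, so that "count" sits at an unaligned offset. */
struct demo_layout {
	uint8_t  flags;
	uint32_t count;
	uint16_t id;
} __attribute__((packed));

/* Write "count" through the map, then read it back the same way. */
static inline uint32_t demo_roundtrip(void *backing, uint32_t value)
{
	struct dma_buf_map map = { .vaddr = backing, .is_iomem = false };

	map_write_field(&map, struct demo_layout, count, value);
	return map_read_field(&map, struct demo_layout, count);
}
```

The offset and size both come from the struct declaration, so the caller
never spells out a byte offset; that is the convenience being debated in
this thread.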

Re: [PATCH 02/19] dma-buf-map: Add helper to initialize second map

2022-01-26 Thread Lucas De Marchi


On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:

Am 26.01.22 um 21:36 schrieb Lucas De Marchi:

When dma_buf_map struct is passed around, it's useful to be able to
initialize a second map that takes care of reading/writing to an offset
of the original map.

Add a helper that copies the struct and adds the offset to the proper
address.


Well what you propose here can lead to all kinds of problems and is
rather bad design as far as I can see.


The struct dma_buf_map is only to be filled in by the exporter and 
should not be modified in this way by the importer.


Hmm... not sure if I was clear. There is no importer and exporter here.
There is a role delegation on filling out and reading a buffer when
that buffer represents a struct layout.

struct bla {
	int a;
	int b;
	int c;
	struct foo foo;
	struct bar bar;
	int d;
};


This implementation allows you to have:

fill_foo(struct dma_buf_map *bla_map) { ... }
fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is make sure the map they are handed is
relative to the struct they are supposed to write/read. Otherwise you're
suggesting that everything be relative to struct bla, or that we do the
same thing I'm doing, but in a way that is IMO more error-prone:

struct dma_buf_map map = *bla_map;
dma_buf_map_incr(&map, offsetof(...));

IMO this construct is worse because, for a moment inside the function,
the map points to something other than what the function is supposed to
read/write.

It's also useful when the function has double duty, updating a global
part of the struct and a table inside it (see example in patch 6)

thanks
Lucas De Marchi
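
The sub-map pattern described in this message can be sketched in user space
roughly as follows. This is only an illustration under assumptions: struct
dma_buf_map is re-declared locally, MAP_INIT_OFFSET() mirrors the proposed
DMA_BUF_MAP_INIT_OFFSET() macro, and the members of struct foo and struct
bar are invented, since the message only names the types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel's struct dma_buf_map (assumption). */
struct dma_buf_map {
	union {
		void *vaddr_iomem;
		void *vaddr;
	};
	bool is_iomem;
};

/* Mirrors the DMA_BUF_MAP_INIT_OFFSET() macro proposed in patch 02/19. */
#define MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map)	\
	{							\
		.vaddr = (char *)(map_)->vaddr + (offset_),	\
		.is_iomem = (map_)->is_iomem,			\
	}

/* Hypothetical member types; only their names appear in the thread. */
struct foo { int x; int y; };
struct bar { int z; };

struct bla {
	int a;
	int b;
	int c;
	struct foo foo;
	struct bar bar;
	int d;
};

static inline void fill_foo(struct dma_buf_map *bla_map)
{
	/* From its first line on, foo_map points at bla.foo, never at bla. */
	struct dma_buf_map foo_map =
		MAP_INIT_OFFSET(bla_map, offsetof(struct bla, foo));
	struct foo val = { .x = 1, .y = 2 };

	/* Kernel code would go through dma_buf_map_memcpy_to() instead. */
	memcpy(foo_map.vaddr, &val, sizeof(val));
}
```

The caller's map is left untouched, and inside fill_foo() there is no point
at which the local map refers to the enclosing struct bla.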

Re: [PATCH 6/8] drm/i915/dp: add 128b/132b support to link status checks

2022-01-26 Thread Ville Syrjälä

On Tue, Jan 25, 2022 at 07:03:44PM +0200, Jani Nikula wrote:
> Abstract link status check to a function that takes 128b/132b and 8b/10b
> into account, and use it. Also dump link status on failures.
> 
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/display/intel_dp.c   | 39 ++-
>  .../drm/i915/display/intel_dp_link_training.c |  2 +-
>  .../drm/i915/display/intel_dp_link_training.h |  4 ++
>  3 files changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index 4d4579a301f6..80fedd0e6212 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -3628,6 +3628,32 @@ static void intel_dp_handle_test_request(struct 
> intel_dp *intel_dp)
>   "Could not write test response to sink\n");
>  }
>  
> +static bool intel_dp_link_ok(struct intel_dp *intel_dp,
> +  u8 link_status[DP_LINK_STATUS_SIZE])
> +{
> + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
> + struct drm_i915_private *i915 = to_i915(encoder->base.dev);
> + bool uhbr = intel_dp->link_rate >= 100;
> + bool ok;
> +
> + if (uhbr)
> + ok = drm_dp_128b132b_lane_channel_eq_done(link_status,
> +   intel_dp->lane_count);

That will only check the eq done bits. I think we want to keep the
symbol locked checks as well.

> + else
> + ok = drm_dp_channel_eq_ok(link_status, intel_dp->lane_count);
> +
> + if (ok)
> + return true;
> +
> + intel_dp_dump_link_status(intel_dp, DP_PHY_DPRX, link_status);
> + drm_dbg_kms(&i915->drm,
> + "[ENCODER:%d:%s] %s link not ok, retraining\n",
> + encoder->base.base.id, encoder->base.name,
> + uhbr ? "128b/132b" : "8b/10b");
> +
> + return false;
> +}
> +
>  static void
>  intel_dp_mst_hpd_irq(struct intel_dp *intel_dp, u8 *esi, u8 *ack)
>  {
> @@ -3658,14 +3684,7 @@ static bool intel_dp_mst_link_status(struct intel_dp 
> *intel_dp)
>   return false;
>   }
>  
> - if (!drm_dp_channel_eq_ok(link_status, intel_dp->lane_count)) {
> - drm_dbg_kms(&i915->drm,
> - "[ENCODER:%d:%s] channel EQ not ok, retraining\n",
> - encoder->base.base.id, encoder->base.name);
> - return false;
> - }
> -
> - return true;
> + return intel_dp_link_ok(intel_dp, link_status);
>  }
>  
>  /**
> @@ -3779,8 +3798,8 @@ intel_dp_needs_link_retrain(struct intel_dp *intel_dp)
>   intel_dp->lane_count))
>   return false;
>  
> - /* Retrain if Channel EQ or CR not ok */
> - return !drm_dp_channel_eq_ok(link_status, intel_dp->lane_count);
> + /* Retrain if link not ok */
> + return !intel_dp_link_ok(intel_dp, link_status);
>  }
>  
>  static bool intel_dp_has_connector(struct intel_dp *intel_dp,
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.c 
> b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> index 8bb6a296f421..1e41a560204a 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.c
> @@ -712,7 +712,7 @@ static bool intel_dp_adjust_request_changed(const struct 
> intel_crtc_state *crtc_
>   return false;
>  }
>  
> -static void
> +void
>  intel_dp_dump_link_status(struct intel_dp *intel_dp, enum drm_dp_phy dp_phy,
> const u8 link_status[DP_LINK_STATUS_SIZE])
>  {
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_link_training.h 
> b/drivers/gpu/drm/i915/display/intel_dp_link_training.h
> index dbfb15705aaa..dc1556b46b85 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_link_training.h
> +++ b/drivers/gpu/drm/i915/display/intel_dp_link_training.h
> @@ -29,6 +29,10 @@ void intel_dp_start_link_train(struct intel_dp *intel_dp,
>  void intel_dp_stop_link_train(struct intel_dp *intel_dp,
> const struct intel_crtc_state *crtc_state);
>  
> +void
> +intel_dp_dump_link_status(struct intel_dp *intel_dp, enum drm_dp_phy dp_phy,
> +   const u8 link_status[DP_LINK_STATUS_SIZE]);
> +
>  /* Get the TPSx symbol type of the value programmed to 
> DP_TRAINING_PATTERN_SET */
>  static inline u8 intel_dp_training_pattern_symbol(u8 pattern)
>  {
> -- 
> 2.30.2

-- 
Ville Syrjälä
Intel
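
To make the review comment above concrete, here is a simplified user-space
model of a check that requires both "channel EQ done" and "symbol locked" on
every active lane. It assumes the per-lane bit layout of the DPCD lane
status bytes (two lanes per byte, four bits per lane) and uses a two-byte
array for brevity; the function names are hypothetical and this is not the
DRM helper code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-lane status bits within each 4-bit lane field (assumed layout). */
#define LANE_CR_DONE		(1 << 0)
#define LANE_CHANNEL_EQ_DONE	(1 << 1)
#define LANE_SYMBOL_LOCKED	(1 << 2)

/* Extract the 4 status bits of one lane; two lanes share each status byte. */
static inline uint8_t lane_status(const uint8_t link_status[2], int lane)
{
	return (link_status[lane >> 1] >> ((lane & 1) * 4)) & 0xf;
}

/* True only if every active lane reports EQ done *and* symbol lock. */
static inline bool lanes_eq_done_and_locked(const uint8_t link_status[2],
					    int lane_count)
{
	const uint8_t want = LANE_CHANNEL_EQ_DONE | LANE_SYMBOL_LOCKED;
	int lane;

	for (lane = 0; lane < lane_count; lane++) {
		if ((lane_status(link_status, lane) & want) != want)
			return false;
	}

	return true;
}
```

A lane that has EQ done set but has lost symbol lock fails this check,
which is the case an EQ-only test would miss.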

Re: [PATCH 5/8] drm/i915/dp: rewrite DP 2.0 128b/132b link training based on errata

2022-01-26 Thread Ville Syrjälä

On Tue, Jan 25, 2022 at 07:03:43PM +0200, Jani Nikula wrote:

> +static bool
> +intel_dp_128b132b_lane_cds(struct intel_dp *intel_dp,
> +const struct intel_crtc_state *crtc_state,
> +int lttpr_count)
> +{
> + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
> + struct drm_i915_private *i915 = to_i915(encoder->base.dev);
> + u8 link_status[DP_LINK_STATUS_SIZE];
> + unsigned long deadline;
> +
> + if (drm_dp_dpcd_writeb(&intel_dp->aux, DP_TRAINING_PATTERN_SET,
> +DP_TRAINING_PATTERN_2_CDS) != 1) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to start 128b/132b TPS2 CDS\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + deadline = jiffies + msecs_to_jiffies((lttpr_count + 1) * 20);
> + for (;;) {
> + usleep_range(2000, 3000);
> +
> + if (drm_dp_dpcd_read_link_status(&intel_dp->aux, link_status) < 
> 0) {
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Failed to read link status\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + if (drm_dp_128b132b_cds_interlane_align_done(link_status) &&
> + drm_dp_128b132b_lane_symbol_locked(link_status, 
> crtc_state->lane_count)) {

I'm thinking we want to check for both eq done and symbol locked here,
just like we do with 8b10b.

> + drm_dbg_kms(&i915->drm,
> + "[ENCODER:%d:%s] CDS interlane align 
> done\n",
> + encoder->base.base.id, encoder->base.name);
> + break;
> + }
> +
> + if (drm_dp_128b132b_link_training_failed(link_status)) {
> + intel_dp_dump_link_status(intel_dp, DP_PHY_DPRX, 
> link_status);
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] Downstream link training 
> failure\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> +
> + if (time_after(jiffies, deadline)) {
> + intel_dp_dump_link_status(intel_dp, DP_PHY_DPRX, 
> link_status);
> + drm_err(&i915->drm,
> + "[ENCODER:%d:%s] CDS timeout\n",
> + encoder->base.base.id, encoder->base.name);
> + return false;
> + }
> + }
> +
> + /* FIXME: Should DP_TRAINING_PATTERN_DISABLE be written first? */
> + if (intel_dp->set_idle_link_train)
> + intel_dp->set_idle_link_train(intel_dp, crtc_state);
> +
> + return true;
> +}

-- 
Ville Syrjälä
Intel

Re: [PATCH 01/19] dma-buf-map: Add read/write helpers

2022-01-26 Thread Matthew Brost

On Thu, Jan 27, 2022 at 08:24:04AM +0100, Christian König wrote:
> Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
> > In certain situations it's useful to be able to read or write to an
> > offset that is calculated by having the memory layout given by a struct
> > declaration. Usually we are going to read/write a u8, u16, u32 or u64.
> > 
> > Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field()
> > to calculate the offset of a struct member and memcpy the data from/to
> > the dma_buf_map. We could use readb, readw, readl, readq and the write*
> > counterparts, however due to alignment issues this may not work on all
> > architectures. If alignment needs to be checked to call the right
> > function, it's not possible to decide at compile-time which function to
> > call: so just leave the decision to the memcpy function that will do
> > exactly that on IO memory or dereference the pointer.
> > 
> > Cc: Sumit Semwal 
> > Cc: Christian König 
> > Cc: linux-me...@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: linaro-mm-...@lists.linaro.org
> > Cc: linux-ker...@vger.kernel.org
> > Signed-off-by: Lucas De Marchi 
> > ---
> >   include/linux/dma-buf-map.h | 81 +
> >   1 file changed, 81 insertions(+)
> > 
> > diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
> > index 19fa0b5ae5ec..65e927d9ce33 100644
> > --- a/include/linux/dma-buf-map.h
> > +++ b/include/linux/dma-buf-map.h
> > @@ -6,6 +6,7 @@
> >   #ifndef __DMA_BUF_MAP_H__
> >   #define __DMA_BUF_MAP_H__
> > +#include 
> >   #include 
> >   #include 
> > @@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct 
> > dma_buf_map *map)
> > }
> >   }
> > +/**
> > + * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
> > + * @dst:   The dma-buf mapping structure
> > + * @offset:The offset from which to copy
> > + * @src:   The source buffer
> > + * @len:   The number of byte in src
> > + *
> > + * Copies data into a dma-buf mapping with an offset. The source buffer is 
> > in
> > + * system memory. Depending on the buffer's location, the helper picks the
> > + * correct method of accessing the memory.
> > + */
> > +static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, 
> > size_t offset,
> > +   const void *src, size_t len)
> > +{
> > +   if (dst->is_iomem)
> > +   memcpy_toio(dst->vaddr_iomem + offset, src, len);
> > +   else
> > +   memcpy(dst->vaddr + offset, src, len);
> > +}
> > +
> > +/**
> > + * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping 
> > into system memory
> > + * @dst:   Destination in system memory
> > + * @src:   The dma-buf mapping structure
> > + * @offset:   The offset from which to copy
> > + * @len:   The number of byte in src
> > + *
> > + * Copies data from a dma-buf mapping with an offset. The dest buffer is in
> > + * system memory. Depending on the mapping location, the helper picks the
> > + * correct method of accessing the memory.
> > + */
> > +static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct 
> > dma_buf_map *src,
> > + size_t offset, size_t len)
> > +{
> > +   if (src->is_iomem)
> > +   memcpy_fromio(dst, src->vaddr_iomem + offset, len);
> > +   else
> > +   memcpy(dst, src->vaddr + offset, len);
> > +}
> > +
> 
> Well that's certainly a valid use case, but I suggest changing the
> implementation of the existing functions to call the new ones with offset=0.
> 
> This way we only have one implementation.
> 
Trivial - but agree with Christian that is a good cleanup.

> >   /**
> >* dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
> >* @dst:  The dma-buf mapping structure
> > @@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map 
> > *map, size_t incr)
> > map->vaddr += incr;
> >   }
> > +/**
> > + * dma_buf_map_read_field - Read struct member from dma-buf mapping with
> > + * arbitrary size and handling un-aligned accesses
> > + *
> > + * @map__: The dma-buf mapping structure
> > + * @type__:The struct to be used containing the field to read
> > + * @field__:   Member from struct we want to read
> > + *
> > + * Read a value from dma-buf mapping calculating the offset and size: this 
> > assumes
> > + * the dma-buf mapping is aligned with a a struct type__. A single u8, 
> > u16, u32
> > + * or u64 can be read, based on the offset and size of type__.field__.
> > + */
> > +#define dma_buf_map_read_field(map__, type__, field__) ({  
> > \
> > +   type__ *t__;
> > \
> > +   typeof(t__->field__) val__; 
> > \
> > +   dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, 
> > field__),\
> > +

Re: [PATCH 3/8] drm/dp: add some new DPCD macros from DP 2.0 E11

2022-01-26 Thread Ville Syrjälä

On Tue, Jan 25, 2022 at 07:03:41PM +0200, Jani Nikula wrote:
> Add some of the new additions from DP 2.0 E11.
> 
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Signed-off-by: Jani Nikula 

Reviewed-by: Ville Syrjälä 

> ---
>  include/drm/dp/drm_dp_helper.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/drm/dp/drm_dp_helper.h b/include/drm/dp/drm_dp_helper.h
> index c499d735b992..69487bd8ed56 100644
> --- a/include/drm/dp/drm_dp_helper.h
> +++ b/include/drm/dp/drm_dp_helper.h
> @@ -560,6 +560,7 @@ struct drm_panel;
>  # define DP_TRAINING_PATTERN_DISABLE 0
>  # define DP_TRAINING_PATTERN_1   1
>  # define DP_TRAINING_PATTERN_2   2
> +# define DP_TRAINING_PATTERN_2_CDS   3   /* 2.0 E11 */
>  # define DP_TRAINING_PATTERN_3   3   /* 1.2 */
>  # define DP_TRAINING_PATTERN_4  7   /* 1.4 */
>  # define DP_TRAINING_PATTERN_MASK0x3
> @@ -1350,6 +1351,7 @@ struct drm_panel;
>  # define DP_PHY_REPEATER_128B132B_SUPPORTED  (1 << 0)
>  /* See DP_128B132B_SUPPORTED_LINK_RATES for values */
>  #define DP_PHY_REPEATER_128B132B_RATES   0xf0007 /* 
> 2.0 */
> +#define DP_PHY_REPEATER_EQ_DONE 0xf0008 /* 2.0 
> E11 */

Wonder if we should look at that at some point? The spec doesn't really
say so. Or maybe we should just dump it out if the link training failed?

>  
>  enum drm_dp_phy {
>   DP_PHY_DPRX,
> -- 
> 2.30.2

-- 
Ville Syrjälä
Intel

[PATCH v1] drm/panel: simple: Tune timing for ET057090DHU

2022-01-26 Thread Francesco Dolcini

From: Oleksandr Suvorov 

VESA Display Monitor Timing v1.13 has recommendations for the historical
VGA mode 640x480 60Hz. These parameters are compatible with EDT
ET057090DHU recommended timings.

Use VESA DMT timing parameters for EDT ET057090DHU panel.

Signed-off-by: Oleksandr Suvorov 
Cc: Oleksandr Suvorov 
Signed-off-by: Francesco Dolcini 
---
 drivers/gpu/drm/panel/panel-simple.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-simple.c 
b/drivers/gpu/drm/panel/panel-simple.c
index 9e46db5e359c..c11427f94ac5 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -1598,12 +1598,13 @@ static const struct drm_display_mode 
edt_et057090dhu_mode = {
.clock = 25175,
.hdisplay = 640,
.hsync_start = 640 + 16,
-   .hsync_end = 640 + 16 + 30,
-   .htotal = 640 + 16 + 30 + 114,
+   .hsync_end = 640 + 16 + 48,
+   .htotal = 640 + 16 + 48 + 96,
.vdisplay = 480,
.vsync_start = 480 + 10,
-   .vsync_end = 480 + 10 + 3,
-   .vtotal = 480 + 10 + 3 + 32,
+   .vsync_end = 480 + 10 + 2,
+   .vtotal = 480 + 10 + 2 + 33,
+   .vrefresh = 60,
.flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
 };
 
-- 
2.25.1
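
As a quick sanity check of the numbers in this patch, the refresh rate can
be derived from the pixel clock and the total frame size. The sketch below
is plain user-space C; struct mode_sketch and refresh_mhz() are hypothetical
names, and only the clock and porch values are taken from the diff above.

```c
#include <stdint.h>

/* Only the fields needed for the refresh calculation (hypothetical struct). */
struct mode_sketch {
	uint32_t clock_khz;
	uint32_t htotal;
	uint32_t vtotal;
};

/* refresh [mHz] = pixel clock [Hz] * 1000 / (htotal * vtotal) */
static inline uint32_t refresh_mhz(const struct mode_sketch *m)
{
	uint64_t num = (uint64_t)m->clock_khz * 1000 * 1000;

	return (uint32_t)(num / ((uint64_t)m->htotal * m->vtotal));
}

/* Timings before the patch. */
static const struct mode_sketch et057090dhu_old = {
	.clock_khz = 25175,
	.htotal = 640 + 16 + 30 + 114,
	.vtotal = 480 + 10 + 3 + 32,
};

/* Timings after the patch. */
static const struct mode_sketch et057090dhu_new = {
	.clock_khz = 25175,
	.htotal = 640 + 16 + 48 + 96,
	.vtotal = 480 + 10 + 2 + 33,
};
```

Both sets of values add up to 800 x 525 total pixels, so with the 25175 kHz
clock the refresh rate comes out at about 59.94 Hz either way; the patch
redistributes the blanking interval rather than changing the rate.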

Re: [PATCH 09/19] dma-buf-map: Add wrapper over memset

2022-01-26 Thread Christian König


Am 26.01.22 um 21:36 schrieb Lucas De Marchi:

Just like memcpy_toio(), there is also a need to write a value directly to
a memory block. Add dma_buf_map_memset() to abstract memset() vs. memset_io().

Cc: Matt Roper 
Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
  include/linux/dma-buf-map.h | 17 +
  1 file changed, 17 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 3514a859f628..c9fb04264cd0 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct 
dma_buf_map *dst, const void *sr
memcpy(dst->vaddr, src, len);
  }
  
+/**

+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst:   The dma-buf mapping structure
+ * @value: The value to set
+ * @len:   The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, 
size_t len)
+{
+   if (dst->is_iomem)
+   memset_io(dst->vaddr_iomem, value, len);
+   else
+   memset(dst->vaddr, value, len);
+}
+


Yeah, that's certainly a valid use case. But maybe directly add a
dma_buf_map_memset_with_offset() variant as well if that helps to
avoid patch #2.


Regards,
Christian.
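
One possible shape for the offset variant suggested above, sketched in user
space. The assumptions: struct dma_buf_map is re-declared locally, memset()
stands in for memset_io(), and the map_memset*() names and signatures are
illustrative rather than anything taken from the series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel's struct dma_buf_map (assumption). */
struct dma_buf_map {
	union {
		void *vaddr_iomem;
		void *vaddr;
	};
	bool is_iomem;
};

/* Set len bytes starting at offset; the kernel would pick memset_io() for iomem. */
static inline void map_memset_with_offset(struct dma_buf_map *dst, size_t offset,
					  int value, size_t len)
{
	memset((char *)dst->vaddr + offset, value, len);
}

/* The plain variant becomes a thin wrapper with offset 0. */
static inline void map_memset(struct dma_buf_map *dst, int value, size_t len)
{
	map_memset_with_offset(dst, 0, value, len);
}
```

With an offset parameter the caller does not need a second, shifted copy of
the map just to clear a sub-range.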


  /**
   * dma_buf_map_incr - Increments the address stored in a dma-buf mapping
   * @map:  The dma-buf mapping structure

Re: [PATCH 1/8] drm/dp: add drm_dp_128b132b_read_aux_rd_interval()

2022-01-26 Thread Ville Syrjälä

On Tue, Jan 25, 2022 at 07:03:39PM +0200, Jani Nikula wrote:
> The DP 2.0 errata changes DP_128B132B_TRAINING_AUX_RD_INTERVAL (DPCD
> 0x2216) completely. Add a new function to read that. Follow-up will need
> to clean up existing functions.
> 
> v2: fix reversed interpretation of bit 7 meaning (Uma)
> 
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Signed-off-by: Jani Nikula 

Reviewed-by: Ville Syrjälä 

> ---
>  drivers/gpu/drm/dp/drm_dp.c| 20 
>  include/drm/dp/drm_dp_helper.h |  3 +++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/drivers/gpu/drm/dp/drm_dp.c b/drivers/gpu/drm/dp/drm_dp.c
> index 6d43325acca5..52c6da510142 100644
> --- a/drivers/gpu/drm/dp/drm_dp.c
> +++ b/drivers/gpu/drm/dp/drm_dp.c
> @@ -281,6 +281,26 @@ int drm_dp_read_channel_eq_delay(struct drm_dp_aux *aux, 
> const u8 dpcd[DP_RECEIV
>  }
>  EXPORT_SYMBOL(drm_dp_read_channel_eq_delay);
>  
> +/* Per DP 2.0 Errata */
> +int drm_dp_128b132b_read_aux_rd_interval(struct drm_dp_aux *aux)
> +{
> + int unit;
> + u8 val;
> +
> + if (drm_dp_dpcd_readb(aux, DP_128B132B_TRAINING_AUX_RD_INTERVAL, &val) 
> != 1) {
> + drm_err(aux->drm_dev, "%s: failed rd interval read\n",
> + aux->name);
> + /* default to max */
> + val = DP_128B132B_TRAINING_AUX_RD_INTERVAL_MASK;
> + }
> +
> + unit = (val & DP_128B132B_TRAINING_AUX_RD_INTERVAL_1MS_UNIT) ? 1 : 2;
> + val &= DP_128B132B_TRAINING_AUX_RD_INTERVAL_MASK;
> +
> + return (val + 1) * unit * 1000;
> +}
> +EXPORT_SYMBOL(drm_dp_128b132b_read_aux_rd_interval);
> +
>  void drm_dp_link_train_clock_recovery_delay(const struct drm_dp_aux *aux,
>   const u8 dpcd[DP_RECEIVER_CAP_SIZE])
>  {
> diff --git a/include/drm/dp/drm_dp_helper.h b/include/drm/dp/drm_dp_helper.h
> index 98d020835b49..aa73dfc817ff 100644
> --- a/include/drm/dp/drm_dp_helper.h
> +++ b/include/drm/dp/drm_dp_helper.h
> @@ -1112,6 +1112,7 @@ struct drm_panel;
>  # define DP_UHBR13_5   (1 << 2)
>  
>  #define DP_128B132B_TRAINING_AUX_RD_INTERVAL0x2216 /* 
> 2.0 */
> +# define DP_128B132B_TRAINING_AUX_RD_INTERVAL_1MS_UNIT  (1 << 7)
>  # define DP_128B132B_TRAINING_AUX_RD_INTERVAL_MASK  0x7f
>  # define DP_128B132B_TRAINING_AUX_RD_INTERVAL_400_US0x00
>  # define DP_128B132B_TRAINING_AUX_RD_INTERVAL_4_MS  0x01
> @@ -1549,6 +1550,8 @@ void drm_dp_link_train_channel_eq_delay(const struct 
> drm_dp_aux *aux,
>  void drm_dp_lttpr_link_train_channel_eq_delay(const struct drm_dp_aux *aux,
> const u8 
> caps[DP_LTTPR_PHY_CAP_SIZE]);
>  
> +int drm_dp_128b132b_read_aux_rd_interval(struct drm_dp_aux *aux);
> +
>  u8 drm_dp_link_rate_to_bw_code(int link_rate);
>  int drm_dp_bw_code_to_link_rate(u8 link_bw);
>  
> -- 
> 2.30.2

-- 
Ville Syrjälä
Intel
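
The decoding done by drm_dp_128b132b_read_aux_rd_interval() in the patch
above can be tried out in isolation. The sketch below repeats only the
arithmetic on the register value, without the AUX read; the function name
and the shortened macro names are made up for this example.

```c
#include <stdint.h>

#define RD_INTERVAL_1MS_UNIT	(1 << 7)	/* bit 7 set: 1 ms units, else 2 ms */
#define RD_INTERVAL_MASK	0x7f		/* bits 6:0: interval count minus one */

/* Returns the AUX read interval in microseconds for a raw register value. */
static inline int aux_rd_interval_us(uint8_t val)
{
	int unit = (val & RD_INTERVAL_1MS_UNIT) ? 1 : 2;

	return ((val & RD_INTERVAL_MASK) + 1) * unit * 1000;
}
```

For instance, 0x00 decodes to 2000 us and 0x80 to 1000 us. The fallback
value used in the patch when the read fails (the mask, 0x7f) decodes to
128 * 2 ms = 256000 us.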

Re: [PATCH 02/19] dma-buf-map: Add helper to initialize second map

2022-01-26 Thread Christian König


Am 26.01.22 um 21:36 schrieb Lucas De Marchi:

When dma_buf_map struct is passed around, it's useful to be able to
initialize a second map that takes care of reading/writing to an offset
of the original map.

Add a helper that copies the struct and adds the offset to the proper
address.


Well what you propose here can lead to all kinds of problems and is
rather bad design as far as I can see.


The struct dma_buf_map is only to be filled in by the exporter and 
should not be modified in this way by the importer.


If you need to copy only a certain subset of the mapping use the 
functions you added in patch #1.


Regards,
Christian.



Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
  include/linux/dma-buf-map.h | 29 +
  1 file changed, 29 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 65e927d9ce33..3514a859f628 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -131,6 +131,35 @@ struct dma_buf_map {
.is_iomem = false, \
}
  
+/**

+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_:  The dma-buf mapping structure to copy from
+ * @offset_:   Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_map based on another. This is the equivalent of doing:
+ *
+ * .. code-block:: c
+ *
+ * struct dma_buf_map map = *other_map;
+ * dma_buf_map_incr(&map, offset);
+ *
+ * Example usage:
+ *
+ * .. code-block:: c
+ *
+ * void foo(struct device *dev, struct dma_buf_map *base_map)
+ * {
+ * ...
+ * struct dma_buf_map map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ * ...
+ * }
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map)\
+   {   \
+   .vaddr = (map_)->vaddr + (offset_),  \
+   .is_iomem = (map_)->is_iomem,\
+   }
+
  /**
   * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in 
system memory
   * @map:  The dma-buf mapping structure

Re: [PATCH 01/19] dma-buf-map: Add read/write helpers

2022-01-26 Thread Christian König


Am 26.01.22 um 21:36 schrieb Lucas De Marchi:

In certain situations it's useful to be able to read or write to an
offset that is calculated by having the memory layout given by a struct
declaration. Usually we are going to read/write a u8, u16, u32 or u64.

Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field()
to calculate the offset of a struct member and memcpy the data from/to
the dma_buf_map. We could use readb, readw, readl, readq and the write*
counterparts, however due to alignment issues this may not work on all
architectures. If alignment needs to be checked to call the right
function, it's not possible to decide at compile-time which function to
call: so just leave the decision to the memcpy function that will do
exactly that on IO memory or dereference the pointer.

Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
  include/linux/dma-buf-map.h | 81 +
  1 file changed, 81 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
  #ifndef __DMA_BUF_MAP_H__
  #define __DMA_BUF_MAP_H__
  
+#include 

  #include 
  #include 
  
@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)

}
  }
  
+/**

+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst:   The dma-buf mapping structure
+ * @offset:The offset from which to copy
+ * @src:   The source buffer
+ * @len:   The number of byte in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, 
size_t offset,
+   const void *src, size_t len)
+{
+   if (dst->is_iomem)
+   memcpy_toio(dst->vaddr_iomem + offset, src, len);
+   else
+   memcpy(dst->vaddr + offset, src, len);
+}
+
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into 
system memory
+ * @dst:   Destination in system memory
+ * @src:   The dma-buf mapping structure
+ * @offset:   The offset from which to copy
+ * @len:   The number of byte in src
+ *
+ * Copies data from a dma-buf mapping with an offset. The dest buffer is in
+ * system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct 
dma_buf_map *src,
+ size_t offset, size_t len)
+{
+   if (src->is_iomem)
+   memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+   else
+   memcpy(dst, src->vaddr + offset, len);
+}
+


Well that's certainly a valid use case, but I suggest changing the
implementation of the existing functions to call the new ones with offset=0.


This way we only have one implementation.
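
A minimal user-space sketch of that suggestion, under the usual
assumptions: struct dma_buf_map is re-declared locally, memcpy() stands in
for memcpy_toio(), and the map_* names are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel's struct dma_buf_map (assumption). */
struct dma_buf_map {
	union {
		void *vaddr_iomem;
		void *vaddr;
	};
	bool is_iomem;
};

/* The single real implementation; the kernel would branch on is_iomem here. */
static inline void map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
					const void *src, size_t len)
{
	memcpy((char *)dst->vaddr + offset, src, len);
}

/* The existing helper reduces to a call with offset 0. */
static inline void map_memcpy_to(struct dma_buf_map *dst, const void *src,
				 size_t len)
{
	map_memcpy_to_offset(dst, 0, src, len);
}
```

The same wrapping would apply to the "from" direction.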


  /**
   * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
   * @dst:  The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map 
*map, size_t incr)
map->vaddr += incr;
  }
  
+/**

+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to read
+ * @field__:   Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({				\
+	type__ *t__;									\
+	typeof(t__->field__) val__;							\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__),	\
+				       sizeof(t__->field__));				\
+	val__;										\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to write
+ * @field__:   Member from struct we want to write
+ * @val__:

Re: [PATCH] drivers: Fix typo in comment

2022-01-26 Thread Greg KH

On Thu, Jan 27, 2022 at 02:51:56PM +0800, tangmeng wrote:
> Replace disbale with disable and replace unavaibale with unavailable.
> 
> Signed-off-by: tangmeng 
> ---
>  drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 2 +-
>  drivers/gpu/drm/tilcdc/tilcdc_crtc.c  | 2 +-
>  drivers/pcmcia/rsrc_nonstatic.c   | 2 +-
>  drivers/usb/chipidea/udc.c| 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)

This needs to be broken up per-subsystem, thanks.

greg k-h

Re: mmotm 2022-01-26-21-04 uploaded (gpu/drm/i915/i915_gem_evict.h)

2022-01-26 Thread Randy Dunlap




On 1/26/22 21:04, a...@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2022-01-26-21-04 has been uploaded to
> 
>https://www.ozlabs.org/~akpm/mmotm/
> 
> mmotm-readme.txt says
> 
> README for mm-of-the-moment:
> 
> https://www.ozlabs.org/~akpm/mmotm/
> 
> This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
> more than once a week.
> 
> You will need quilt to apply these patches to the latest Linus release (5.x
> or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
> https://ozlabs.org/~akpm/mmotm/series
> 
> The file broken-out.tar.gz contains two datestamp files: .DATE and
> .DATE--mm-dd-hh-mm-ss.  Both contain the string -mm-dd-hh-mm-ss,
> followed by the base kernel version against which this patch series is to
> be applied.

on x86_64:
(from linux-next.patch)


  HDRTEST drivers/gpu/drm/i915/i915_gem_evict.h
In file included from :0:0:
./../drivers/gpu/drm/i915/i915_gem_evict.h:15:15: error: ‘struct i915_gem_ww_ctx’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
        struct i915_gem_ww_ctx *ww,
               ^~~
./../drivers/gpu/drm/i915/i915_gem_evict.h:21:14: error: ‘struct i915_gem_ww_ctx’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
       struct i915_gem_ww_ctx *ww,
              ^~~
./../drivers/gpu/drm/i915/i915_gem_evict.h:25:16: error: ‘struct i915_gem_ww_ctx’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
         struct i915_gem_ww_ctx *ww);
                ^~~
cc1: all warnings being treated as errors


-- 
~Randy
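
A note on the failure above: GCC emits this diagnostic when a prototype
mentions `struct X *` before any declaration of `struct X` is visible at file
scope, so the tag ends up scoped to the parameter list only. The usual remedy
is a forward declaration (or including the header that defines the struct)
ahead of the prototypes. The sketch below is a minimal stand-alone
illustration under that assumption; `struct ww_ctx` and `evict_something()`
are hypothetical stand-ins, not the actual i915 declarations and not
necessarily the fix that was applied.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration: makes the struct tag visible at file scope, so the
 * prototype below refers to the same type as every later use. Without this
 * line, GCC warns that the struct is "declared inside parameter list".
 */
struct ww_ctx;

/* Prototype as it might appear in a header (hypothetical name). */
int evict_something(struct ww_ctx *ww, int pages);

/* The full definition can live elsewhere, e.g. in another header. */
struct ww_ctx {
	int locked;
};

int evict_something(struct ww_ctx *ww, int pages)
{
	/* Trivial body, only here so the example links and can be exercised. */
	return ww ? pages + ww->locked : pages;
}
```

A forward declaration is enough here because the prototypes only use a
pointer to the struct; the complete type is needed only where members are
accessed.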

Re: [PATCH v1 0/4] fbtft: Unorphan the driver for maintenance

2022-01-26 Thread Dan Carpenter

On Wed, Jan 26, 2022 at 11:31:02PM +0100, Daniel Vetter wrote:
> On Wed, Jan 26, 2022 at 3:46 PM Dan Carpenter wrote:
> >
> > The other advantage of staging is that I don't think syzbot enables it.
> > I guess it's easier to persuade Dmitry to ignore STAGING than it was to
> > get him to disable FBDEV.  :P
> >
> > The memory corruption in fbdev was a real headache for everyone because
> > the stack traces ended up all over the kernel.
> 
> Uh Dmitry disabled all of FBDEV?

No that's the opposite of what I meant.  STAGING is disabled in syzbot
and FBDEV is enabled.

regards,
dan carpenter

Re: [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset

2022-01-26 Thread kernel test robot

Hi Lucas,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-tip/drm-tip]
[also build test ERROR on next-20220125]
[cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next 
drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next 
v5.17-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: i386-allyesconfig 
(https://download.01.org/0day-ci/archive/20220127/202201271208.kelpe3mn-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from include/drm/drm_mm.h:51,
from drivers/gpu/drm/i915/i915_vma.h:31,
from drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h:13,
from drivers/gpu/drm/i915/gt/uc/intel_guc.h:20,
from drivers/gpu/drm/i915/gt/uc/intel_uc.h:9,
from drivers/gpu/drm/i915/gt/intel_gt_types.h:18,
from drivers/gpu/drm/i915/gt/intel_gt.h:10,
from drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:9:
   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c: In function 
'guc_mmio_reg_state_create':
>> drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:38: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Werror=format=]
     369 |  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
         |                                      ^~~~
     370 |   (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
         |   ~~
         |   |
         |   u32 {aka unsigned int}
   include/drm/drm_print.h:461:56: note: in definition of macro 'drm_dbg'
     461 |  drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, ##__VA_ARGS__)
         |                                                        ^~~
   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:46: note: format string is defined here
     369 |  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
         |                                            ~~^
         |                                              |
         |                                              long unsigned int
         |                                            %u
   cc1: all warnings being treated as errors


vim +369 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c

   348  
   349  static long guc_mmio_reg_state_create(struct intel_guc *guc)
   350  {
   351  struct intel_gt *gt = guc_to_gt(guc);
   352  struct intel_engine_cs *engine;
   353  enum intel_engine_id id;
   354  struct temp_regset temp_set = {};
   355  long total = 0;
   356  
   357  for_each_engine(engine, gt, id) {
   358  u32 used = temp_set.storage_used;
   359  
   360  if (guc_mmio_regset_init(&temp_set, engine) < 0)
   361  return -1;
   362  
   363  guc->ads_regset_count[id] = temp_set.storage_used - 
used;
   364  total += guc->ads_regset_count[id];
   365  }
   366  
   367  guc->ads_regset = temp_set.storage;
   368  
 > 369  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
   370  (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
   371  
   372  return total * sizeof(struct guc_mmio_reg);
   373  }
   374  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
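
A note on the report above: the flagged argument is the result of multiplying
by `sizeof(...)`, so its type is `size_t`, which is `unsigned int` on i386 and
`unsigned long` on 64-bit builds. A width-independent way to print such a
value is the `%zu` conversion, which both standard C and the kernel's printk
accept. The sketch below is a user-space illustration under that assumption;
`struct mmio_reg` and `format_regset_kb()` are hypothetical stand-ins, and the
actual fix chosen for the patch may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct guc_mmio_reg: three 32-bit words. */
struct mmio_reg {
	uint32_t offset;
	uint32_t value;
	uint32_t flags;
};

/*
 * Format the storage size in KB. The product involves sizeof(), so the
 * expression has type size_t; %zu matches it on both 32-bit and 64-bit
 * targets, unlike %lu (wrong on i386) or %u (wrong on x86_64).
 */
static int format_regset_kb(char *buf, size_t len, uint32_t storage_max)
{
	size_t kb = (storage_max * sizeof(struct mmio_reg)) >> 10;

	return snprintf(buf, len, "Used %zu KB for temporary ADS regset", kb);
}
```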

[pull] amdgpu drm-fixes-5.17

2022-01-26 Thread Alex Deucher

Hi Dave, Daniel,

Fixes for 5.17.

The following changes since commit e783362eb54cd99b2cac8b3a9aeac942e6f6ac07:

  Linux 5.17-rc1 (2022-01-23 10:12:53 +0200)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.17-2022-01-26

for you to fetch changes up to 2a807341ed1074ab83638f2fab08dffaa373f6b8:

  drm/amdgpu/display: Remove t_srx_delay_us. (2022-01-25 17:54:23 -0500)


amd-drm-fixes-5.17-2022-01-26:

amdgpu:
- Proper fix for otg synchronization logic regression
- DCN3.01 fixes
- Filter out secondary radeon PCI IDs
- udelay fixes
- Fix a memory leak in an error path


Alex Deucher (3):
  drm/amdgpu: filter out radeon secondary ids as well
  drm/amdgpu/display: adjust msleep limit in 
dp_wait_for_training_aux_rd_interval
  drm/amdgpu/display: use msleep rather than udelay for long delays

Bas Nieuwenhuizen (3):
  drm/amd/display: Fix FP start/end for dcn30_internal_validate_bw.
  drm/amd/display: Wrap dcn301_calculate_wm_and_dlg for FPU.
  drm/amdgpu/display: Remove t_srx_delay_us.

Meenakshikumar Somasundaram (1):
  drm/amd/display: Fix for otg synchronization logic

Zhan Liu (2):
  drm/amd/display: Correct MPC split policy for DCN301
  drm/amd/display: change FIFO reset condition to embedded display only

Zhou Qingyang (1):
  drm/amd/display/dc/calcs/dce_calcs: Fix a memleak in calculate_bandwidth()

 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 81 ++
 drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c   |  4 +-
 drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c   |  1 -
 drivers/gpu/drm/amd/display/dc/core/dc.c   | 40 +++
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c   |  6 +-
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  | 54 +++
 drivers/gpu/drm/amd/display/dc/dc.h|  1 +
 .../amd/display/dc/dce110/dce110_hw_sequencer.c| 10 ++-
 .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c  |  4 +-
 .../drm/amd/display/dc/dcn301/dcn301_resource.c| 13 +++-
 .../gpu/drm/amd/display/dc/dcn31/dcn31_resource.c  |  3 +
 .../display/dc/dml/dcn20/display_rq_dlg_calc_20.c  |  2 -
 .../dc/dml/dcn20/display_rq_dlg_calc_20v2.c|  2 -
 .../display/dc/dml/dcn21/display_rq_dlg_calc_21.c  |  2 -
 .../display/dc/dml/dcn30/display_rq_dlg_calc_30.c  |  2 -
 .../gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.c |  2 +-
 .../gpu/drm/amd/display/dc/dml/dcn301/dcn301_fpu.h |  2 +-
 .../drm/amd/display/dc/dml/display_mode_structs.h  |  1 -
 .../amd/display/dc/dml/display_rq_dlg_helpers.c|  3 -
 .../amd/display/dc/dml/dml1_display_rq_dlg_calc.c  |  4 --
 drivers/gpu/drm/amd/display/dc/inc/core_types.h|  1 +
 drivers/gpu/drm/amd/display/dc/inc/resource.h  | 11 +++
 22 files changed, 208 insertions(+), 41 deletions(-)

Re: [PATCH v5 4/4] dt-bindings: drm/bridge: anx7625: Add aux-bus node

2022-01-26 Thread Hsin-Yi Wang

On Tue, Jan 25, 2022 at 2:25 PM Hsin-Yi Wang  wrote:
>
> On Wed, Jan 19, 2022 at 11:36 PM Robert Foss  wrote:
> >
> > Hey Hsin-Yi,
> >
> > While I can review this patch, I don't have the authority to merge it
> > since it is outside the scope of my maintainership. Rob Herring,
> > Daniel Vetter or David Airlie would have to Ack this patch.

hi Rob, Daniel, and David,

Can you help ack this patch?

Thanks
> >
> > On Wed, 19 Jan 2022 at 16:18, Hsin-Yi Wang  wrote:
> > >
> > > List panel under aux-bus node if it's connected to anx7625's aux bus.
> > >
> > > Signed-off-by: Hsin-Yi Wang 
> > > Reviewed-by: Xin Ji 
> > > ---
> > >  .../display/bridge/analogix,anx7625.yaml| 17 +
> > >  1 file changed, 17 insertions(+)
> > >
> > > diff --git 
> > > a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml 
> > > b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > index 1d3e88daca041a..0d38d6fe39830f 100644
> > > --- 
> > > a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > +++ 
> > > b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
> > > @@ -83,6 +83,9 @@ properties:
> > >  type: boolean
> > >  description: let the driver enable audio HDMI codec function or not.
> > >
> > > +  aux-bus:
> > > +$ref: /schemas/display/dp-aux-bus.yaml#
> > > +
> > >ports:
> > >  $ref: /schemas/graph.yaml#/properties/ports
> > >
> > > @@ -167,5 +170,19 @@ examples:
> > >  };
> > >  };
> > >  };
> > > +
> > > +aux-bus {
> > > +panel {
> > > +compatible = "innolux,n125hce-gn1";
> > > +power-supply = <&pp3300_disp_x>;
> > > +backlight = <&backlight_lcd0>;
> > > +
> > > +port {
> > > +panel_in: endpoint {
> > > +remote-endpoint = <&anx7625_out>;
> > > +};
> > > +};
> > > +};
> > > +};
> > >  };
> > >  };
> > > --
> > > 2.34.1.703.g22d0c6ccf7-goog
> > >

[RFC PATCH v5 3/3] drm: remove allow_fb_modifiers

2022-01-26 Thread Tomohito Esaki

The allow_fb_modifiers flag is unnecessary since it has been replaced
with the fb_modifiers_not_supported flag.

Signed-off-by: Tomohito Esaki 
---
 drivers/gpu/drm/selftests/test-drm_framebuffer.c |  1 -
 include/drm/drm_mode_config.h| 16 
 2 files changed, 17 deletions(-)

diff --git a/drivers/gpu/drm/selftests/test-drm_framebuffer.c 
b/drivers/gpu/drm/selftests/test-drm_framebuffer.c
index 61b44d3a6a61..f6d66285c5fc 100644
--- a/drivers/gpu/drm/selftests/test-drm_framebuffer.c
+++ b/drivers/gpu/drm/selftests/test-drm_framebuffer.c
@@ -323,7 +323,6 @@ static struct drm_device mock_drm_device = {
.max_width = MAX_WIDTH,
.min_height = MIN_HEIGHT,
.max_height = MAX_HEIGHT,
-   .allow_fb_modifiers = true,
.funcs = &mock_config_funcs,
},
 };
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 4a93dac91cf9..6b5e01295348 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -917,22 +917,6 @@ struct drm_mode_config {
 */
bool async_page_flip;
 
-   /**
-* @allow_fb_modifiers:
-*
-* Whether the driver supports fb modifiers in the ADDFB2.1 ioctl call.
-* Note that drivers should not set this directly, it is automatically
-* set in drm_universal_plane_init().
-*
-* IMPORTANT:
-*
-* If this is set the driver must fill out the full implicit modifier
-* information in their &drm_mode_config_funcs.fb_create hook for legacy
-* userspace which does not set modifiers. Otherwise the GETFB2 ioctl is
-* broken for modifier aware userspace.
-*/
-   bool allow_fb_modifiers;
-
/**
 * @fb_modifiers_not_supported:
 *
-- 
2.25.1

[RFC PATCH v5 2/3] drm: add support modifiers for drivers whose planes only support linear layout

2022-01-26 Thread Tomohito Esaki

The LINEAR modifier is advertised by default if a driver doesn't specify
modifiers.

Signed-off-by: Tomohito Esaki 
---
 drivers/gpu/drm/drm_plane.c | 23 +--
 include/drm/drm_plane.h |  3 +++
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index deeec60a3315..bf0daa8d9bbd 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -237,6 +237,9 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
  const char *name, va_list ap)
 {
struct drm_mode_config *config = &dev->mode_config;
+   static const uint64_t default_modifiers[] = {
+   DRM_FORMAT_MOD_LINEAR,
+   };
unsigned int format_modifier_count = 0;
int ret;
 
@@ -277,16 +280,16 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
 
while (*temp_modifiers++ != DRM_FORMAT_MOD_INVALID)
format_modifier_count++;
+   } else {
+   if (!dev->mode_config.fb_modifiers_not_supported) {
+   format_modifiers = default_modifiers;
+   format_modifier_count = ARRAY_SIZE(default_modifiers);
+   }
}
 
/* autoset the cap and check for consistency across all planes */
-   if (format_modifier_count) {
-   drm_WARN_ON(dev, !config->allow_fb_modifiers &&
-   !list_empty(&config->plane_list));
-   config->allow_fb_modifiers = true;
-   } else {
-   drm_WARN_ON(dev, config->allow_fb_modifiers);
-   }
+   drm_WARN_ON(dev, config->fb_modifiers_not_supported &&
+   format_modifier_count);
 
plane->modifier_count = format_modifier_count;
plane->modifiers = kmalloc_array(format_modifier_count,
@@ -341,7 +344,7 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
drm_object_attach_property(&plane->base, config->prop_src_h, 0);
}
 
-   if (config->allow_fb_modifiers)
+   if (format_modifier_count)
create_in_format_blob(dev, plane);
 
return 0;
@@ -368,8 +371,8 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
  * drm_universal_plane_init() to let the DRM managed resource infrastructure
  * take care of cleanup and deallocation.
  *
- * Drivers supporting modifiers must set @format_modifiers on all their planes,
- * even those that only support DRM_FORMAT_MOD_LINEAR.
+ * Drivers that only support the DRM_FORMAT_MOD_LINEAR modifier support may set
+ * @format_modifiers to NULL. The plane will advertise the linear modifier.
  *
  * Returns:
  * Zero on success, error code on failure.
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index 0c1102dc4d88..a0390b6ad3b4 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -803,6 +803,9 @@ void *__drmm_universal_plane_alloc(struct drm_device *dev,
  *
  * The @drm_plane_funcs.destroy hook must be NULL.
  *
+ * Drivers that only support the DRM_FORMAT_MOD_LINEAR modifier support may set
+ * @format_modifiers to NULL. The plane will advertise the linear modifier.
+ *
  * Returns:
  * Pointer to new plane, or ERR_PTR on failure.
  */
-- 
2.25.1

[RFC PATCH v5 1/3] drm: introduce fb_modifiers_not_supported flag in mode_config

2022-01-26 Thread Tomohito Esaki

Many drivers support only the linear layout. For these drivers, the DRM
core should advertise the linear modifier itself rather than have every
driver open-code it. However, there are legacy drivers such as
radeon that do not support modifiers but infer the actual layout of the
underlying buffer. Therefore, a new flag, fb_modifiers_not_supported, is
introduced for these legacy drivers, and allow_fb_modifiers is replaced
with this new flag.

Signed-off-by: Tomohito Esaki 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  6 +++---
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |  1 +
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |  2 ++
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  3 +++
 drivers/gpu/drm/drm_framebuffer.c |  6 +++---
 drivers/gpu/drm/drm_ioctl.c   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_display.c |  6 --
 drivers/gpu/drm/radeon/radeon_display.c   |  2 ++
 include/drm/drm_mode_config.h | 10 ++
 11 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 82011e75ed85..edbb30d47b8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -954,7 +954,7 @@ static int amdgpu_display_verify_sizes(struct 
amdgpu_framebuffer *rfb)
int ret;
unsigned int i, block_width, block_height, block_size_log2;
 
-   if (!rfb->base.dev->mode_config.allow_fb_modifiers)
+   if (rfb->base.dev->mode_config.fb_modifiers_not_supported)
return 0;
 
for (i = 0; i < format_info->num_planes; ++i) {
@@ -1141,7 +1141,7 @@ int amdgpu_display_framebuffer_init(struct drm_device 
*dev,
if (ret)
return ret;
 
-   if (!dev->mode_config.allow_fb_modifiers) {
+   if (dev->mode_config.fb_modifiers_not_supported) {
drm_WARN_ONCE(dev, adev->family >= AMDGPU_FAMILY_AI,
  "GFX9+ requires FB check based on format 
modifier\n");
ret = check_tiling_flags_gfx6(rfb);
@@ -1149,7 +1149,7 @@ int amdgpu_display_framebuffer_init(struct drm_device 
*dev,
return ret;
}
 
-   if (dev->mode_config.allow_fb_modifiers &&
+   if (!dev->mode_config.fb_modifiers_not_supported &&
!(rfb->base.flags & DRM_MODE_FB_MODIFIERS)) {
ret = convert_tiling_flags_to_modifier(rfb);
if (ret) {
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
index d1570a462a51..fb61c0814115 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@@ -2798,6 +2798,8 @@ static int dce_v10_0_sw_init(void *handle)
adev_to_drm(adev)->mode_config.preferred_depth = 24;
adev_to_drm(adev)->mode_config.prefer_shadow = 1;
 
+   adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+
adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
 
r = amdgpu_display_modeset_create_props(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
index 18a7b3bd633b..17942a11366d 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@@ -2916,6 +2916,8 @@ static int dce_v11_0_sw_init(void *handle)
adev_to_drm(adev)->mode_config.preferred_depth = 24;
adev_to_drm(adev)->mode_config.prefer_shadow = 1;
 
+   adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
+
adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
 
r = amdgpu_display_modeset_create_props(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index c7803dc2b2d5..2ec99ec8e1a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -2674,6 +2674,7 @@ static int dce_v6_0_sw_init(void *handle)
adev_to_drm(adev)->mode_config.max_height = 16384;
adev_to_drm(adev)->mode_config.preferred_depth = 24;
adev_to_drm(adev)->mode_config.prefer_shadow = 1;
+   adev_to_drm(adev)->mode_config.fb_modifiers_not_supported = true;
adev_to_drm(adev)->mode_config.fb_base = adev->gmc.aper_base;
 
r = amdgpu_display_modeset_create_props(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
index 8318ee8339f1..de11fbe5aba2 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
@@ -2695,6 +2695,8 @@ static int dce_v8_0_sw_init(void *handle)
adev_to_drm(adev)->mode_config.preferred_depth = 24;
adev_to_drm(adev)->mode_config.prefer_shadow = 1;
 
+   adev_to_drm(adev)->mode

[RFC PATCH v5 0/3] Add support modifiers for drivers whose planes only support linear layout

2022-01-26 Thread Tomohito Esaki

Some drivers whose planes only support the linear layout do not support format
modifiers.
These drivers should support modifiers; however, the DRM core should handle this
rather than each driver open-coding it.

In this patch series, these drivers expose format modifiers based on the
following suggestion[1].

On Thu, Nov 18, 2021 at 01:02:11PM +, Daniel Stone wrote:
> I think the best way forward here is:
>   - add a new mode_config.cannot_support_modifiers flag, and enable
> this in radeon (plus any other drivers in the same boat)
>   - change drm_universal_plane_init() to advertise the LINEAR modifier
> when NULL is passed as the modifier list (including installing a
> default .format_mod_supported hook)
>   - remove the mode_config.allow_fb_modifiers hook and always
> advertise modifier support, unless
> mode_config.cannot_support_modifiers is set


[1] 
https://patchwork.kernel.org/project/linux-renesas-soc/patch/20190509054518.10781-1-e...@igel.co.jp/#24602575
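
The selection rule described above can be sketched in plain C as follows.
This is a user-space illustration, not the kernel code: `pick_modifiers()`,
`struct modifier_list` and the `MOD_*` constants are hypothetical names, and
the constant values are stand-ins chosen for the example. It assumes, as in
the series, that a driver-supplied list is terminated by an "invalid"
sentinel, while the built-in default list is sized with an ARRAY_SIZE-style
computation and has no terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for DRM_FORMAT_MOD_LINEAR / DRM_FORMAT_MOD_INVALID. */
#define MOD_LINEAR  UINT64_C(0)
#define MOD_INVALID UINT64_C(0x00ffffffffffffff)

/* Default list: static, no terminator, sized at compile time. */
static const uint64_t default_modifiers[] = { MOD_LINEAR };

struct modifier_list {
	const uint64_t *mods;
	size_t count;
};

/*
 * driver_mods:   sentinel-terminated list from the driver, or NULL.
 * not_supported: models mode_config.fb_modifiers_not_supported.
 */
static struct modifier_list pick_modifiers(const uint64_t *driver_mods,
					   bool not_supported)
{
	struct modifier_list out = { NULL, 0 };

	if (driver_mods) {
		/* Count entries up to (not including) the sentinel. */
		out.mods = driver_mods;
		while (driver_mods[out.count] != MOD_INVALID)
			out.count++;
	} else if (!not_supported) {
		/* No list given: advertise LINEAR by default. */
		out.mods = default_modifiers;
		out.count = sizeof(default_modifiers) /
			    sizeof(default_modifiers[0]);
	}
	/* Otherwise (legacy driver): advertise nothing. */
	return out;
}
```

With this shape, only legacy drivers that set the "not supported" flag end up
with an empty list; every other driver advertises at least LINEAR.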

v5:
* rebase to the latest master branch (5.17-rc1+)
+ "drm/plane: Make format_mod_supported truly optional" patch [2]
  [2] https://patchwork.freedesktop.org/patch/467940/?series=98255&rev=3

* change default_modifiers array from non-static to static
* remove terminator in default_modifiers array
* use ARRAY_SIZE to get the format_modifier_count
* keep a sanity check in plane init func
* modify several kerneldocs

v4: https://www.spinics.net/lists/dri-devel/msg329508.html
* modify documentation for fb_modifiers_not_supported flag in kerneldoc

v3: https://www.spinics.net/lists/dri-devel/msg329102.html
* change the order as follows:
   1. add fb_modifiers_not_supported flag
   2. add default modifiers
   3. remove allow_fb_modifiers flag
* add a conditional disable in amdgpu_dm_plane_init()

v2: https://www.spinics.net/lists/dri-devel/msg328939.html
* rebase to the latest master branch (5.16.0+)
  + "drm/plane: Make format_mod_supported truly optional" patch [2]

v1: https://www.spinics.net/lists/dri-devel/msg327352.html
* The initial patch set

Tomohito Esaki (3):
  drm: introduce fb_modifiers_not_supported flag in mode_config
  drm: add support modifiers for drivers whose planes only support
linear layout
  drm: remove allow_fb_modifiers

 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  6 ++---
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c |  1 +
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c |  2 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  3 +++
 drivers/gpu/drm/drm_framebuffer.c |  6 ++---
 drivers/gpu/drm/drm_ioctl.c   |  2 +-
 drivers/gpu/drm/drm_plane.c   | 23 +++
 drivers/gpu/drm/nouveau/nouveau_display.c |  6 +++--
 drivers/gpu/drm/radeon/radeon_display.c   |  2 ++
 .../gpu/drm/selftests/test-drm_framebuffer.c  |  1 -
 include/drm/drm_mode_config.h | 18 +--
 include/drm/drm_plane.h   |  3 +++
 14 files changed, 45 insertions(+), 32 deletions(-)

-- 
2.25.1

Re: [PATCH v3 09/10] tools: update hmm-test to support device coherent type

2022-01-26 Thread Sierra Guiza, Alejandro (Alex)




On 1/20/2022 12:14 AM, Alistair Popple wrote:

On Tuesday, 11 January 2022 9:32:00 AM AEDT Alex Sierra wrote:

Test cases such as migrate_fault and migrate_multiple were modified to
explicitly migrate from device to sys memory without the need for page
faults when using the device coherent type.

Snapshot test case updated to read the memory device type first and, based
on that, get the proper returned results. migrate_ping_pong test case

Where is the migrate_ping_pong test? Did you perhaps forget to add it? :-)


Migration from device coherent to system is tested with migrate_multiple
too. Therefore, I've removed the migrate_ping_pong test. BTW, I just added the
"number of pages migrated" checker after migrating from coherent to system in
the v4 series.

Regards,
Alejandro Sierra




added to test explicit migration from device to sys memory for both
private and coherent zone types.

Helpers to migrate from device to sys memory and vice versa
were also added.

Signed-off-by: Alex Sierra 
---
v2:
Set FIXTURE_VARIANT to add multiple device types to the FIXTURE. This
will run all the tests for each device type (private and coherent) if
both exist when the hmm-test driver is probed.
---
  tools/testing/selftests/vm/hmm-tests.c | 122 -
  1 file changed, 101 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 864f126ffd78..8eb81dfba4b3 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -44,6 +44,14 @@ struct hmm_buffer {
int fd;
uint64_tcpages;
uint64_tfaults;
+   int zone_device_type;
+};
+
+enum {
+   HMM_PRIVATE_DEVICE_ONE,
+   HMM_PRIVATE_DEVICE_TWO,
+   HMM_COHERENCE_DEVICE_ONE,
+   HMM_COHERENCE_DEVICE_TWO,
  };
  
  #define TWOMEG		(1 << 21)

@@ -60,6 +68,21 @@ FIXTURE(hmm)
unsigned intpage_shift;
  };
  
+FIXTURE_VARIANT(hmm)

+{
+   int device_number;
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_private)
+{
+   .device_number = HMM_PRIVATE_DEVICE_ONE,
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_coherent)
+{
+   .device_number = HMM_COHERENCE_DEVICE_ONE,
+};
+
  FIXTURE(hmm2)
  {
int fd0;
@@ -68,6 +91,24 @@ FIXTURE(hmm2)
unsigned intpage_shift;
  };
  
+FIXTURE_VARIANT(hmm2)

+{
+   int device_number0;
+   int device_number1;
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_private)
+{
+   .device_number0 = HMM_PRIVATE_DEVICE_ONE,
+   .device_number1 = HMM_PRIVATE_DEVICE_TWO,
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_coherent)
+{
+   .device_number0 = HMM_COHERENCE_DEVICE_ONE,
+   .device_number1 = HMM_COHERENCE_DEVICE_TWO,
+};
+
  static int hmm_open(int unit)
  {
char pathname[HMM_PATH_MAX];
@@ -81,12 +122,19 @@ static int hmm_open(int unit)
return fd;
  }
  
+static bool hmm_is_coherent_type(int dev_num)

+{
+   return (dev_num >= HMM_COHERENCE_DEVICE_ONE);
+}
+
  FIXTURE_SETUP(hmm)
  {
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
  
-	self->fd = hmm_open(0);

+   self->fd = hmm_open(variant->device_number);
+   if (self->fd < 0 && hmm_is_coherent_type(variant->device_number))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd, 0);
  }
  
@@ -95,9 +143,11 @@ FIXTURE_SETUP(hmm2)

self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
  
-	self->fd0 = hmm_open(0);

+   self->fd0 = hmm_open(variant->device_number0);
+   if (self->fd0 < 0 && hmm_is_coherent_type(variant->device_number0))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd0, 0);
-   self->fd1 = hmm_open(1);
+   self->fd1 = hmm_open(variant->device_number1);
ASSERT_GE(self->fd1, 0);
  }
  
@@ -144,6 +194,7 @@ static int hmm_dmirror_cmd(int fd,

}
buffer->cpages = cmd.cpages;
buffer->faults = cmd.faults;
+   buffer->zone_device_type = cmd.zone_device_type;
  
  	return 0;

  }
@@ -211,6 +262,20 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(&t, NULL);
  }
  
+static int hmm_migrate_sys_to_dev(int fd,

+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_SYS, buffer, npages);
+}
+
  /*
   * Simple NULL test of device open/close.
   */
@@ -875,7 +940,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
  
  	/* Migrate memory to device. */

-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRR

[PATCH v4 10/10] tools: update test_hmm script to support SP config

2022-01-26 Thread Alex Sierra

Add two more parameters to set the spm_addr_dev0 & spm_addr_dev1
addresses. These two parameters configure the start SPM
addresses for each device in the test_hmm driver.
Consequently, this configures the zone device type as coherent.

Signed-off-by: Alex Sierra 
Reviewed-by: Alistair Popple 
---
v2:
Add more mknods for the device coherent type. These are represented under
/dev/hmm_dmirror2 and /dev/hmm_dmirror3, only if they were created
when the hmm-test driver was probed.
---
 tools/testing/selftests/vm/test_hmm.sh | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/test_hmm.sh 
b/tools/testing/selftests/vm/test_hmm.sh
index 0647b525a625..539c9371e592 100755
--- a/tools/testing/selftests/vm/test_hmm.sh
+++ b/tools/testing/selftests/vm/test_hmm.sh
@@ -40,11 +40,26 @@ check_test_requirements()
 
 load_driver()
 {
-   modprobe $DRIVER > /dev/null 2>&1
+   if [ $# -eq 0 ]; then
+   modprobe $DRIVER > /dev/null 2>&1
+   else
+   if [ $# -eq 2 ]; then
+   modprobe $DRIVER spm_addr_dev0=$1 spm_addr_dev1=$2
+   > /dev/null 2>&1
+   else
+   echo "Missing module parameters. Make sure pass"\
+   "spm_addr_dev0 and spm_addr_dev1"
+   usage
+   fi
+   fi
if [ $? == 0 ]; then
major=$(awk "\$2==\"HMM_DMIRROR\" {print \$1}" /proc/devices)
mknod /dev/hmm_dmirror0 c $major 0
mknod /dev/hmm_dmirror1 c $major 1
+   if [ $# -eq 2 ]; then
+   mknod /dev/hmm_dmirror2 c $major 2
+   mknod /dev/hmm_dmirror3 c $major 3
+   fi
fi
 }
 
@@ -58,7 +73,7 @@ run_smoke()
 {
echo "Running smoke test. Note, this test provides basic coverage."
 
-   load_driver
+   load_driver $1 $2
$(dirname "${BASH_SOURCE[0]}")/hmm-tests
unload_driver
 }
@@ -75,6 +90,9 @@ usage()
echo "# Smoke testing"
echo "./${TEST_NAME}.sh smoke"
echo
+   echo "# Smoke testing with SPM enabled"
+   echo "./${TEST_NAME}.sh smoke  "
+   echo
exit 0
 }
 
@@ -84,7 +102,7 @@ function run_test()
usage
else
if [ "$1" = "smoke" ]; then
-   run_smoke
+   run_smoke $2 $3
else
usage
fi
-- 
2.32.0

[PATCH v4 03/10] mm/gup: fail get_user_pages for LONGTERM dev coherent type

2022-01-26 Thread Alex Sierra

Avoid long-term pinning of coherent device type pages, since this could
interfere with their own device memory manager. For now, we just return
an error for PIN_LONGTERM on coherent device type pages. Eventually,
pages of this type will be migrated to system memory, once device page
migration support is added.

Signed-off-by: Alex Sierra 
---
 mm/gup.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index 886d6148d3d0..5291d7221826 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1720,6 +1720,12 @@ static long check_and_migrate_movable_pages(unsigned 
long nr_pages,
 * If we get a movable page, since we are going to be pinning
 * these entries, try to move them out if possible.
 */
+   if (is_dev_private_or_coherent_page(head)) {
+   WARN_ON_ONCE(is_device_private_page(head));
+   ret = -EFAULT;
+   goto unpin_pages;
+   }
+
if (!is_pinnable_page(head)) {
if (PageHuge(head)) {
if (!isolate_huge_page(head, 
&movable_page_list))
@@ -1750,6 +1756,7 @@ static long check_and_migrate_movable_pages(unsigned long 
nr_pages,
if (list_empty(&movable_page_list) && !isolation_error_count)
return nr_pages;
 
+unpin_pages:
if (gup_flags & FOLL_PIN) {
unpin_user_pages(pages, nr_pages);
} else {
-- 
2.32.0
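
The check added by this patch can be modelled outside the kernel roughly as
below. This is a simplified sketch, not the mm/gup.c code: `enum page_kind`
and `check_longterm_pin()` are hypothetical names, and the model only
captures the rule stated in the commit message, namely that a long-term pin
request fails with -EFAULT as soon as a device private or device coherent
page is found.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a page for this illustration. */
enum page_kind {
	PAGE_SYSTEM,
	PAGE_DEVICE_PRIVATE,
	PAGE_DEVICE_COHERENT,
};

/*
 * Returns nr_pages when every page may be pinned long term, or -EFAULT
 * when any page is device private or device coherent memory. In the
 * patch, hitting a device private page here additionally triggers a
 * one-time warning, since such pages are not expected at this point.
 */
static long check_longterm_pin(const enum page_kind *pages, long nr_pages)
{
	for (long i = 0; i < nr_pages; i++) {
		if (pages[i] == PAGE_DEVICE_PRIVATE ||
		    pages[i] == PAGE_DEVICE_COHERENT)
			return -EFAULT;
	}
	return nr_pages;
}
```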

[PATCH v4 05/10] drm/amdkfd: coherent type as sys mem on migration to ram

2022-01-26 Thread Alex Sierra

On VRAM to RAM migration, coherent device type memory is accessed from
the CPU much like system RAM. The migrate flags select the source of the
migration, which in the coherent type case should be set to
MIGRATE_VMA_SELECT_DEVICE_COHERENT.

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 9e36fe8aea0f..3e405f078ade 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -661,9 +661,12 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct 
svm_range *prange,
migrate.vma = vma;
migrate.start = start;
migrate.end = end;
-   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
 
+   if (adev->gmc.xgmi.connected_to_cpu)
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+   else
+   migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
size *= npages;
buf = kvmalloc(size, GFP_KERNEL | __GFP_ZERO);
-- 
2.32.0

[PATCH v4 07/10] lib: test_hmm add module param for zone device type

2022-01-26 Thread Alex Sierra

In order to configure the device coherent type in test_hmm, two module
parameters should be passed: spm_addr_dev0 & spm_addr_dev1, which correspond
to the SPM start address of each of the two devices. If no parameters are
passed, the private device type is configured.

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 73 -
 lib/test_hmm_uapi.h |  1 +
 2 files changed, 53 insertions(+), 21 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index fb1fa7c6fa98..6f068f7c4ee3 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -34,6 +34,16 @@
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+static unsigned long spm_addr_dev0;
+module_param(spm_addr_dev0, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev0,
+   "Specify start address for SPM (special purpose memory) used 
for device 0. By setting this Coherent device type will be used. Make sure 
spm_addr_dev1 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
+static unsigned long spm_addr_dev1;
+module_param(spm_addr_dev1, long, 0644);
+MODULE_PARM_DESC(spm_addr_dev1,
+   "Specify start address for SPM (special purpose memory) used 
for device 1. By setting this Coherent device type will be used. Make sure 
spm_addr_dev0 is set too. Minimum SPM size should be DEVMEM_CHUNK_SIZE.");
+
 static const struct dev_pagemap_ops dmirror_devmem_ops;
 static const struct mmu_interval_notifier_ops dmirror_min_ops;
 static dev_t dmirror_dev;
@@ -452,28 +462,44 @@ static int dmirror_write(struct dmirror *dmirror, struct 
hmm_dmirror_cmd *cmd)
return ret;
 }
 
-static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
+static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
   struct page **ppage)
 {
struct dmirror_chunk *devmem;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long pfn;
unsigned long pfn_first;
unsigned long pfn_last;
void *ptr;
+   int ret = -ENOMEM;
 
devmem = kzalloc(sizeof(*devmem), GFP_KERNEL);
if (!devmem)
-   return false;
+   return ret;
 
-   res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE,
- "hmm_dmirror");
-   if (IS_ERR(res))
+   switch (mdevice->zone_device_type) {
+   case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE:
+   res = request_free_mem_region(&iomem_resource, 
DEVMEM_CHUNK_SIZE,
+ "hmm_dmirror");
+   if (IS_ERR_OR_NULL(res))
+   goto err_devmem;
+   devmem->pagemap.range.start = res->start;
+   devmem->pagemap.range.end = res->end;
+   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
+   break;
+   case HMM_DMIRROR_MEMORY_DEVICE_COHERENT:
+   devmem->pagemap.range.start = (MINOR(mdevice->cdevice.dev) - 2) 
?
+   spm_addr_dev0 :
+   spm_addr_dev1;
+   devmem->pagemap.range.end = devmem->pagemap.range.start +
+   DEVMEM_CHUNK_SIZE - 1;
+   devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+   break;
+   default:
+   ret = -EINVAL;
goto err_devmem;
+   }
 
-   devmem->pagemap.type = MEMORY_DEVICE_PRIVATE;
-   devmem->pagemap.range.start = res->start;
-   devmem->pagemap.range.end = res->end;
devmem->pagemap.nr_range = 1;
devmem->pagemap.ops = &dmirror_devmem_ops;
devmem->pagemap.owner = mdevice;
@@ -494,10 +520,14 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
mdevice->devmem_capacity = new_capacity;
mdevice->devmem_chunks = new_chunks;
}
-
ptr = memremap_pages(&devmem->pagemap, numa_node_id());
-   if (IS_ERR(ptr))
+   if (IS_ERR_OR_NULL(ptr)) {
+   if (ptr)
+   ret = PTR_ERR(ptr);
+   else
+   ret = -EFAULT;
goto err_release;
+   }
 
devmem->mdevice = mdevice;
pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT;
@@ -526,15 +556,17 @@ static bool dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
}
spin_unlock(&mdevice->lock);
 
-   return true;
+   return 0;
 
 err_release:
mutex_unlock(&mdevice->devmem_lock);
-   release_mem_region(devmem->pagemap.range.start, 
range_len(&devmem->pagemap.range));
+   if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
+   release_mem_region(devmem->pagemap.range.start,
+  range_len(&devmem->pagemap.range));
 err_devmem:
kfree(devmem);
 
-   return false;
+   return ret;
 }
 
 static

[PATCH v4 09/10] tools: update hmm-test to support device coherent type

2022-01-26 Thread Alex Sierra

Test cases such as migrate_fault and migrate_multiple were modified to
explicitly migrate from device to sys memory without the need for page
faults when using the device coherent type.

The snapshot test case was updated to read the memory device type first
and, based on that, get the proper returned results. The
migrate_ping_pong test case was added to test explicit migration from
device to sys memory for both private and coherent zone types.

Helpers to migrate from device to sys memory and vice versa
were also added.

Signed-off-by: Alex Sierra 
---
v2:
Set FIXTURE_VARIANT to add multiple device types to the FIXTURE. This
will run all the tests for each device type (private and coherent) in
case both exist when the hmm-test driver is probed.
v4:
Check for the number of pages successfully migrated from coherent
device to system at migrate_multiple test.
---
 tools/testing/selftests/vm/hmm-tests.c | 123 -
 1 file changed, 102 insertions(+), 21 deletions(-)

diff --git a/tools/testing/selftests/vm/hmm-tests.c 
b/tools/testing/selftests/vm/hmm-tests.c
index 864f126ffd78..99de5e38dbbc 100644
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -44,6 +44,14 @@ struct hmm_buffer {
int fd;
uint64_tcpages;
uint64_tfaults;
+   int zone_device_type;
+};
+
+enum {
+   HMM_PRIVATE_DEVICE_ONE,
+   HMM_PRIVATE_DEVICE_TWO,
+   HMM_COHERENCE_DEVICE_ONE,
+   HMM_COHERENCE_DEVICE_TWO,
 };
 
 #define TWOMEG (1 << 21)
@@ -60,6 +68,21 @@ FIXTURE(hmm)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm)
+{
+   int device_number;
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_private)
+{
+   .device_number = HMM_PRIVATE_DEVICE_ONE,
+};
+
+FIXTURE_VARIANT_ADD(hmm, hmm_device_coherent)
+{
+   .device_number = HMM_COHERENCE_DEVICE_ONE,
+};
+
 FIXTURE(hmm2)
 {
int fd0;
@@ -68,6 +91,24 @@ FIXTURE(hmm2)
unsigned intpage_shift;
 };
 
+FIXTURE_VARIANT(hmm2)
+{
+   int device_number0;
+   int device_number1;
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_private)
+{
+   .device_number0 = HMM_PRIVATE_DEVICE_ONE,
+   .device_number1 = HMM_PRIVATE_DEVICE_TWO,
+};
+
+FIXTURE_VARIANT_ADD(hmm2, hmm2_device_coherent)
+{
+   .device_number0 = HMM_COHERENCE_DEVICE_ONE,
+   .device_number1 = HMM_COHERENCE_DEVICE_TWO,
+};
+
 static int hmm_open(int unit)
 {
char pathname[HMM_PATH_MAX];
@@ -81,12 +122,19 @@ static int hmm_open(int unit)
return fd;
 }
 
+static bool hmm_is_coherent_type(int dev_num)
+{
+   return (dev_num >= HMM_COHERENCE_DEVICE_ONE);
+}
+
 FIXTURE_SETUP(hmm)
 {
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd = hmm_open(0);
+   self->fd = hmm_open(variant->device_number);
+   if (self->fd < 0 && hmm_is_coherent_type(variant->device_number))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd, 0);
 }
 
@@ -95,9 +143,11 @@ FIXTURE_SETUP(hmm2)
self->page_size = sysconf(_SC_PAGE_SIZE);
self->page_shift = ffs(self->page_size) - 1;
 
-   self->fd0 = hmm_open(0);
+   self->fd0 = hmm_open(variant->device_number0);
+   if (self->fd0 < 0 && hmm_is_coherent_type(variant->device_number0))
+   SKIP(exit(0), "DEVICE_COHERENT not available");
ASSERT_GE(self->fd0, 0);
-   self->fd1 = hmm_open(1);
+   self->fd1 = hmm_open(variant->device_number1);
ASSERT_GE(self->fd1, 0);
 }
 
@@ -144,6 +194,7 @@ static int hmm_dmirror_cmd(int fd,
}
buffer->cpages = cmd.cpages;
buffer->faults = cmd.faults;
+   buffer->zone_device_type = cmd.zone_device_type;
 
return 0;
 }
@@ -211,6 +262,20 @@ static void hmm_nanosleep(unsigned int n)
nanosleep(&t, NULL);
 }
 
+static int hmm_migrate_sys_to_dev(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
+}
+
+static int hmm_migrate_dev_to_sys(int fd,
+  struct hmm_buffer *buffer,
+  unsigned long npages)
+{
+   return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_SYS, buffer, npages);
+}
+
 /*
  * Simple NULL test of device open/close.
  */
@@ -875,7 +940,7 @@ TEST_F(hmm, migrate)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+   ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages);
ASSERT_EQ(ret, 0);
ASSERT_EQ(buffer->cpages, npages);
 
@@ -923,7 +988,7 @@ TEST_F(hmm, migrate_fault)
ptr[i] = i;
 
/* Migrate memory to device. */
-   ret = hmm_dmirror_cmd(self->fd, HMM_DMIRROR_MIGRATE, buffer, npages);
+

[PATCH v4 04/10] drm/amdkfd: add SPM support for SVM

2022-01-26 Thread Alex Sierra

When the CPU is connected through XGMI, it has coherent
access to the VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which could
be DEVICE_PRIVATE or DEVICE_COHERENT, to create the device
page map region.
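
The range selection described above can be sketched in user space as
follows. This is only a model under stated assumptions: struct
pgmap_sketch and pick_pgmap_range are hypothetical names, and the
res_start/res_end arguments stand in for the region that
devm_request_free_mem_region would return in the private case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the two types relevant here; values follow enum memory_type. */
enum memory_type {
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
};

/* Hypothetical stand-in for the fields set on struct dev_pagemap. */
struct pgmap_sketch {
	uint64_t start;
	uint64_t end;	/* inclusive */
	enum memory_type type;
};

struct pgmap_sketch pick_pgmap_range(bool connected_to_cpu,
				     uint64_t aper_base, uint64_t aper_size,
				     uint64_t res_start, uint64_t res_end)
{
	struct pgmap_sketch p;

	if (connected_to_cpu) {
		/* Coherent: use the CPU-visible aperture directly. */
		p.start = aper_base;
		p.end = aper_base + aper_size - 1;
		p.type = MEMORY_DEVICE_COHERENT;
	} else {
		/* Private: use the separately reserved free region. */
		p.start = res_start;
		p.end = res_end;
		p.type = MEMORY_DEVICE_PRIVATE;
	}
	return p;
}
```

Note that only the private path owns a requested region, so only that
path has something to release on failure.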

Signed-off-by: Alex Sierra 
Reviewed-by: Felix Kuehling 
---
v7:
Remove lookup_resource call, so export symbol for this function
is no longer required. Patch dropped "kernel: resource:
lookup_resource as exported symbol"
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 29 +++-
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index aeade32ec298..9e36fe8aea0f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -935,7 +935,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
 {
struct kfd_dev *kfddev = adev->kfd.dev;
struct dev_pagemap *pgmap;
-   struct resource *res;
+   struct resource *res = NULL;
unsigned long size;
void *r;
 
@@ -950,28 +950,34 @@ int svm_migrate_init(struct amdgpu_device *adev)
 * should remove reserved size
 */
size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-   res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
-   if (IS_ERR(res))
-   return -ENOMEM;
+   if (adev->gmc.xgmi.connected_to_cpu) {
+   pgmap->range.start = adev->gmc.aper_base;
+   pgmap->range.end = adev->gmc.aper_base + adev->gmc.aper_size - 
1;
+   pgmap->type = MEMORY_DEVICE_COHERENT;
+   } else {
+   res = devm_request_free_mem_region(adev->dev, &iomem_resource, 
size);
+   if (IS_ERR(res))
+   return -ENOMEM;
+   pgmap->range.start = res->start;
+   pgmap->range.end = res->end;
+   pgmap->type = MEMORY_DEVICE_PRIVATE;
+   }
 
-   pgmap->type = MEMORY_DEVICE_PRIVATE;
pgmap->nr_range = 1;
-   pgmap->range.start = res->start;
-   pgmap->range.end = res->end;
pgmap->ops = &svm_migrate_pgmap_ops;
pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-   pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
-
+   pgmap->flags = 0;
/* Device manager releases device-specific resources, memory region and
 * pgmap when driver disconnects from device.
 */
r = devm_memremap_pages(adev->dev, pgmap);
if (IS_ERR(r)) {
pr_err("failed to register HMM device memory\n");
-
/* Disable SVM support capability */
pgmap->type = 0;
-   devm_release_mem_region(adev->dev, res->start, 
resource_size(res));
+   if (pgmap->type == MEMORY_DEVICE_PRIVATE)
+   devm_release_mem_region(adev->dev, res->start,
+   res->end - res->start + 1);
return PTR_ERR(r);
}
 
@@ -984,3 +990,4 @@ int svm_migrate_init(struct amdgpu_device *adev)
 
return 0;
 }
+
-- 
2.32.0

[PATCH v4 08/10] lib: add support for device coherent type in test_hmm

2022-01-26 Thread Alex Sierra

Device Coherent type uses device memory that is coherently accessible by
the CPU. This could be shown as SP (special purpose) memory range
at the BIOS-e820 memory enumeration. If no SP memory is supported in
the system, this could be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges of at least
256MB size. This could be specified in the kernel parameter variable
efi_fake_mem. For example, two SP ranges of 1GB starting at 0x1 &
0x14000 physical address:
efi_fake_mem=1G@0x1:0x4,1G@0x14000:0x4

Private and coherent device mirror instances can be created in the same
probe. This is done by passing the module parameters spm_addr_dev0 &
spm_addr_dev1. In this case, it will create four instances of
device_mirror. The first two correspond to private device type, the
last two to coherent type. Then, they can be easily accessed from user
space through /dev/hmm_mirror. Usually num_device 0 and 1
are for private, and 2 and 3 for coherent types. If no module
parameters are passed, only two instances of private type device_mirror
will be created.
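
The instance layout described above can be modeled with two small
functions. This is a sketch, not driver code: both function names are
hypothetical, the numeric value of HMM_DMIRROR_MEMORY_DEVICE_COHERENT is
an assumption, and treating "both parameters non-zero" as the trigger for
the coherent instances is one reading of the text.

```c
#include <stdbool.h>

/*
 * PRIVATE = 1 comes from test_hmm_uapi.h in this series; the value of
 * COHERENT is an assumption made for this sketch.
 */
enum {
	HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
	HMM_DMIRROR_MEMORY_DEVICE_COHERENT,
};

/* Two private instances always; two coherent ones if both SPM addresses are set. */
int dmirror_instance_count(unsigned long spm_addr_dev0,
			   unsigned long spm_addr_dev1)
{
	bool coherent = spm_addr_dev0 != 0 && spm_addr_dev1 != 0;

	return coherent ? 4 : 2;
}

/* Device numbers 0 and 1 are private, 2 and 3 are coherent. */
int dmirror_type_for_minor(int minor)
{
	return minor < 2 ? HMM_DMIRROR_MEMORY_DEVICE_PRIVATE
			 : HMM_DMIRROR_MEMORY_DEVICE_COHERENT;
}
```

A test that wants the coherent variant would therefore open device
number 2 or 3 and skip if the open fails, which matches how the selftest
fixture in patch 09/10 treats a missing coherent device.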

Signed-off-by: Alex Sierra 
---
v4:
Return number of coherent device pages successfully migrated to system.
This is returned at cmd->cpages.
---
 lib/test_hmm.c  | 260 +---
 lib/test_hmm_uapi.h |  15 ++-
 2 files changed, 205 insertions(+), 70 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 6f068f7c4ee3..850d5331e370 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -29,11 +29,22 @@
 
 #include "test_hmm_uapi.h"
 
-#define DMIRROR_NDEVICES   2
+#define DMIRROR_NDEVICES   4
 #define DMIRROR_RANGE_FAULT_TIMEOUT1000
 #define DEVMEM_CHUNK_SIZE  (256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE  16
 
+/*
+ * For device_private pages, dpage is just a dummy struct page
+ * representing a piece of device memory. dmirror_devmem_alloc_page
+ * allocates a real system memory page as backing storage to fake a
+ * real device. zone_device_data points to that backing page. But
+ * for device_coherent memory, the struct page represents real
+ * physical CPU-accessible memory that we can use directly.
+ */
+#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
+  (page)->zone_device_data : (page))
+
 static unsigned long spm_addr_dev0;
 module_param(spm_addr_dev0, long, 0644);
 MODULE_PARM_DESC(spm_addr_dev0,
@@ -122,6 +133,21 @@ static int dmirror_bounce_init(struct dmirror_bounce 
*bounce,
return 0;
 }
 
+static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
+{
+   return (mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
+}
+
+static enum migrate_vma_direction
+   dmirror_select_device(struct dmirror *dmirror)
+{
+   return (dmirror->mdevice->zone_device_type ==
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
+   MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+}
+
 static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
 {
vfree(bounce->ptr);
@@ -572,16 +598,19 @@ static int dmirror_allocate_chunk(struct dmirror_device 
*mdevice,
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
 {
struct page *dpage = NULL;
-   struct page *rpage;
+   struct page *rpage = NULL;
 
/*
-* This is a fake device so we alloc real system memory to store
-* our device memory.
+* For ZONE_DEVICE private type, this is a fake device so we alloc real
+* system memory to store our device memory.
+* For ZONE_DEVICE coherent type we use the actual dpage to store the 
data
+* and ignore rpage.
 */
-   rpage = alloc_page(GFP_HIGHUSER);
-   if (!rpage)
-   return NULL;
-
+   if (dmirror_is_private_zone(mdevice)) {
+   rpage = alloc_page(GFP_HIGHUSER);
+   if (!rpage)
+   return NULL;
+   }
spin_lock(&mdevice->lock);
 
if (mdevice->free_pages) {
@@ -601,7 +630,8 @@ static struct page *dmirror_devmem_alloc_page(struct 
dmirror_device *mdevice)
return dpage;
 
 error:
-   __free_page(rpage);
+   if (rpage)
+   __free_page(rpage);
return NULL;
 }
 
@@ -627,12 +657,16 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
 * unallocated pte_none() or read-only zero page.
 */
spage = migrate_pfn_to_page(*src);
+   if (WARN(spage && is_zone_device_page(spage),
+"page already in device spage pfn: 0x%lx\n",
+page_to_pfn(spage)))
+   continue;
 
dpage = dmirror_devmem_alloc_page(mdevice);
if (!dpage)
continue;
 
-   rpage = dpage->zone_de

[PATCH v4 06/10] lib: test_hmm add ioctl to get zone device type

2022-01-26 Thread Alex Sierra

Add a new ioctl cmd to query the zone device type. This will be
used once test_hmm adds the zone device coherent type.

Signed-off-by: Alex Sierra 
---
 lib/test_hmm.c  | 23 +--
 lib/test_hmm_uapi.h |  8 
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index c259842f6d44..fb1fa7c6fa98 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -84,6 +84,7 @@ struct dmirror_chunk {
 struct dmirror_device {
struct cdev cdevice;
struct hmm_devmem   *devmem;
+   unsigned intzone_device_type;
 
unsigned intdevmem_capacity;
unsigned intdevmem_count;
@@ -1025,6 +1026,15 @@ static int dmirror_snapshot(struct dmirror *dmirror,
return ret;
 }
 
+static int dmirror_get_device_type(struct dmirror *dmirror,
+   struct hmm_dmirror_cmd *cmd)
+{
+   mutex_lock(&dmirror->mutex);
+   cmd->zone_device_type = dmirror->mdevice->zone_device_type;
+   mutex_unlock(&dmirror->mutex);
+
+   return 0;
+}
 static long dmirror_fops_unlocked_ioctl(struct file *filp,
unsigned int command,
unsigned long arg)
@@ -1075,6 +1085,9 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
ret = dmirror_snapshot(dmirror, &cmd);
break;
 
+   case HMM_DMIRROR_GET_MEM_DEV_TYPE:
+   ret = dmirror_get_device_type(dmirror, &cmd);
+   break;
default:
return -EINVAL;
}
@@ -1235,14 +1248,20 @@ static void dmirror_device_remove(struct dmirror_device 
*mdevice)
 static int __init hmm_dmirror_init(void)
 {
int ret;
-   int id;
+   int id = 0;
+   int ndevices = 0;
 
ret = alloc_chrdev_region(&dmirror_dev, 0, DMIRROR_NDEVICES,
  "HMM_DMIRROR");
if (ret)
goto err_unreg;
 
-   for (id = 0; id < DMIRROR_NDEVICES; id++) {
+   memset(dmirror_devices, 0, DMIRROR_NDEVICES * 
sizeof(dmirror_devices[0]));
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   dmirror_devices[ndevices++].zone_device_type =
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+   for (id = 0; id < ndevices; id++) {
ret = dmirror_device_init(dmirror_devices + id, id);
if (ret)
goto err_chrdev;
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index f14dea5dcd06..17f842f1aa02 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -19,6 +19,7 @@
  * @npages: (in) number of pages to read/write
  * @cpages: (out) number of pages copied
  * @faults: (out) number of device page faults seen
+ * @zone_device_type: (out) zone device memory type
  */
 struct hmm_dmirror_cmd {
__u64   addr;
@@ -26,6 +27,7 @@ struct hmm_dmirror_cmd {
__u64   npages;
__u64   cpages;
__u64   faults;
+   __u64   zone_device_type;
 };
 
 /* Expose the address space of the calling process through hmm device file */
@@ -35,6 +37,7 @@ struct hmm_dmirror_cmd {
 #define HMM_DMIRROR_SNAPSHOT   _IOWR('H', 0x03, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_EXCLUSIVE  _IOWR('H', 0x04, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_CHECK_EXCLUSIVE_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_GET_MEM_DEV_TYPE   _IOWR('H', 0x06, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -62,4 +65,9 @@ enum {
HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE = 0x30,
 };
 
+enum {
+   /* 0 is reserved to catch uninitialized type fields */
+   HMM_DMIRROR_MEMORY_DEVICE_PRIVATE = 1,
+};
+
 #endif /* _LIB_TEST_HMM_UAPI_H */
-- 
2.32.0

[PATCH v4 02/10] mm: add device coherent vma selection for memory migration

2022-01-26 Thread Alex Sierra

This case is used to migrate pages from device memory, back to system
memory. Device coherent type memory is cache coherent from device and CPU
point of view.
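
The filter this patch adds to migrate_vma_collect_pmd for present PTEs
can be approximated by the pure function below. It is a simplified
model: the page kind is passed in explicitly, pgmap owners are
represented as opaque integer ids rather than pointers, and the
zero-pfn and NULL-page paths of the real function are not modeled. The
names page_kind and should_collect are hypothetical.

```c
#include <stdbool.h>

enum migrate_vma_direction {
	MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
	MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
	MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
};

/* Hypothetical classification of a page behind a present PTE. */
enum page_kind {
	PAGE_KIND_SYSTEM,
	PAGE_KIND_DEVICE_COHERENT,
};

/* Return true if the page should be collected for migration. */
bool should_collect(enum page_kind kind, unsigned int flags,
		    int page_owner, int pgmap_owner)
{
	if (kind == PAGE_KIND_SYSTEM)
		return (flags & MIGRATE_VMA_SELECT_SYSTEM) != 0;

	/* Coherent pages need the coherent selector and a matching owner. */
	return (flags & MIGRATE_VMA_SELECT_DEVICE_COHERENT) != 0 &&
	       page_owner == pgmap_owner;
}
```

The owner comparison keeps one driver from collecting coherent pages
that belong to another driver's pgmap.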

Signed-off-by: Alex Sierra 
---
v2:
Added a condition for migrations from device coherent pages.
---
 include/linux/migrate.h | 1 +
 mm/migrate.c| 9 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index c8077e936691..e74bb0978f6f 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -138,6 +138,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 enum migrate_vma_direction {
MIGRATE_VMA_SELECT_SYSTEM = 1 << 0,
MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1,
+   MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
 };
 
 struct migrate_vma {
diff --git a/mm/migrate.c b/mm/migrate.c
index 277562cd4cf5..2b3375e165b1 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (is_writable_device_private_entry(entry))
mpfn |= MIGRATE_PFN_WRITE;
} else {
-   if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
-   goto next;
pfn = pte_pfn(pte);
if (is_zero_pfn(pfn)) {
mpfn = MIGRATE_PFN_MIGRATE;
@@ -2349,6 +2347,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
goto next;
}
page = vm_normal_page(migrate->vma, addr, pte);
+   if (page && !is_zone_device_page(page) &&
+   !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM))
+   goto next;
+   if (page && is_device_coherent_page(page) &&
+   (!(migrate->flags & 
MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
+page->pgmap->owner != migrate->pgmap_owner))
+   goto next;
mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
}
-- 
2.32.0

[PATCH v4 00/10] Add MEMORY_DEVICE_COHERENT for coherent device memory mapping

2022-01-26 Thread Alex Sierra

This patch series introduces MEMORY_DEVICE_COHERENT, a type of memory
owned by a device that can be mapped into CPU page tables like
MEMORY_DEVICE_GENERIC and can also be migrated like
MEMORY_DEVICE_PRIVATE.

Christoph, the suggestion to incorporate Ralph Campbell’s refcount
cleanup patch into our hardware page migration patchset originally came
from you, but it proved impractical to do things in that order because
the refcount cleanup introduced a bug with wide ranging structural
implications. Instead, we amended Ralph’s patch so that it could be
applied after merging the migration work. As we saw from the recent
discussion, merging the refcount work is going to take some time and
cooperation between multiple development groups, while the migration
work is ready now and is needed now. So we propose to merge this
patchset first and continue to work with Ralph and others to merge the
refcount cleanup separately, when it is ready.

This patch series is mostly self-contained except for a few places where
it needs to update other subsystems to handle the new memory type.

System stability and performance are not affected according to our
ongoing testing, including xfstests.

How it works: The system BIOS advertises the GPU device memory
(aka VRAM) as SPM (special purpose memory) in the UEFI system address
map.

The amdgpu driver registers the memory with devmap as
MEMORY_DEVICE_COHERENT using devm_memremap_pages. The initial user for
this hardware page migration capability is the Frontier supercomputer
project. This functionality is not AMD-specific. We expect other GPU
vendors to find this functionality useful, and possibly other hardware
types in the future.

Our test nodes in the lab are similar to the Frontier configuration,
with .5 TB of system memory plus 256 GB of device memory split across
4 GPUs, all in a single coherent address space. Page migration is
expected to improve application efficiency significantly. We will
report empirical results as they become available.

We extended hmm_test to cover migration of MEMORY_DEVICE_COHERENT. This
patch set builds on HMM and our SVM memory manager already merged in
5.15.

v2:
- test_hmm is now able to create private and coherent device mirror
instances in the same driver probe. This adds more usability to the hmm
test by not having to remove the kernel module for each device type
test (private/coherent type). This is done by passing the module
parameters spm_addr_dev0 & spm_addr_dev1. In this case, it will create
four instances of device_mirror. The first two correspond to private
device type, the last two to coherent type. Then, they can be easily
accessed from user space through /dev/hmm_mirror. Usually
num_device 0 and 1 are for private, and 2 and 3 for coherent types.

- Coherent device type pages at gup are now migrated back to system
memory if they have been long term pinned (FOLL_LONGTERM). The reason
is these pages could eventually interfere with their own device memory
manager. A new hmm_gup_test has been added to the hmm-test to test this
functionality. It makes use of the gup_test module to long term pin
user pages that have been migrated to device memory first.

- Other patch corrections made by Felix, Alistair and Christoph.

v3:
- Based on last v2 feedback we got from Alistair, we've decided to
remove migration logic for FOLL_LONGTERM coherent device type pages at
gup for now. Ideally, this should be done through the kernel mm,
instead of calling the device driver to do it. Currently, there's no
support for migrating device pages based on pfn, mainly because
migrate_pages() relies on pages being LRU pages. Alistair mentioned that
he has started to work on adding this device page migration logic. For now,
we fail on get_user_pages call with FOLL_LONGTERM for DEVICE_COHERENT
pages.

- Also, hmm_gup_test has been removed from hmm-test. We plan to include
it again after this migration work is ready.

- Addressed Liam Howlett's feedback changes.

v4:
- Addressed Alistair Popple's last v3 feedback.

- Use the same system entry path for coherent device pages at
migrate_vma_insert_page.

- Add coherent device type support for try_to_migrate /
try_to_migrate_one.

- Include number of coherent device pages successfully migrated back to
system at test_hmm. Made the proper changes to hmm-test to read/check
this number.

Alex Sierra (10):
  mm: add zone device coherent type memory support
  mm: add device coherent vma selection for memory migration
  mm/gup: fail get_user_pages for LONGTERM dev coherent type
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: coherent type as sys mem on migration to ram
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device coherent type in test_hmm
  tools: update hmm-test to support device coherent type
  tools: update test_hmm script to support SP config

 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  34 ++-
 include/linux/memremap.h |

[PATCH v4 01/10] mm: add zone device coherent type memory support

2022-01-26 Thread Alex Sierra

Add a type for device memory that is cache coherent from device and CPU
point of view.
This is used on platforms that have an advanced system bus (like CAPI
or CXL). Any page of a process can be migrated to such memory. However,
no one should be allowed to pin such memory so that it can always be
evicted.
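
To show where the new type sits among the existing ones, here is a
user-space sketch of the type checks this patch touches. The enum order
follows the memremap.h hunk below; the two predicate names are
hypothetical and operate on the type alone, whereas the kernel helpers
take a struct page and also check is_zone_device_page.

```c
#include <stdbool.h>

/* Order as in include/linux/memremap.h after this patch. */
enum memory_type {
	/* 0 is reserved to catch uninitialized type fields */
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_COHERENT,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

/* Type-only analogue of is_dev_private_or_coherent_page(). */
bool type_is_private_or_coherent(enum memory_type type)
{
	return type == MEMORY_DEVICE_PRIVATE ||
	       type == MEMORY_DEVICE_COHERENT;
}

/* Type-only analogue of the switch in page_is_devmap_managed(). */
bool type_is_devmap_managed(enum memory_type type)
{
	switch (type) {
	case MEMORY_DEVICE_PRIVATE:
	case MEMORY_DEVICE_COHERENT:
	case MEMORY_DEVICE_FS_DAX:
		return true;
	default:
		return false;
	}
}
```

Inserting MEMORY_DEVICE_COHERENT after MEMORY_DEVICE_PRIVATE shifts the
numeric values of the later enumerators, so code should compare against
the names rather than raw numbers.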

Signed-off-by: Alex Sierra 
---
v4:
- use the same system entry path for coherent device pages at
migrate_vma_insert_page.

- Add coherent device type support for try_to_migrate /
try_to_migrate_one.
---
 include/linux/memremap.h |  8 +++
 include/linux/mm.h   | 16 ++
 mm/memcontrol.c  |  6 ++---
 mm/memory-failure.c  |  8 +--
 mm/memremap.c| 14 +++-
 mm/migrate.c | 47 
 mm/rmap.c| 20 -
 7 files changed, 83 insertions(+), 36 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index c0e9d35889e8..ff4d398edf35 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -39,6 +39,13 @@ struct vmem_altmap {
  * A more complete discussion of unaddressable memory may be found in
  * include/linux/hmm.h and Documentation/vm/hmm.rst.
  *
+ * MEMORY_DEVICE_COHERENT:
+ * Device memory that is cache coherent from device and CPU point of view. This
+ * is used on platforms that have an advanced system bus (like CAPI or CXL). A
+ * driver can hotplug the device memory using ZONE_DEVICE and with that memory
+ * type. Any page of a process can be migrated to such memory. However no one
+ * should be allowed to pin such memory so that it can always be evicted.
+ *
  * MEMORY_DEVICE_FS_DAX:
  * Host memory that has similar access semantics as System RAM i.e. DMA
  * coherent and supports page pinning. In support of coordinating page
@@ -59,6 +66,7 @@ struct vmem_altmap {
 enum memory_type {
/* 0 is reserved to catch uninitialized type fields */
MEMORY_DEVICE_PRIVATE = 1,
+   MEMORY_DEVICE_COHERENT,
MEMORY_DEVICE_FS_DAX,
MEMORY_DEVICE_GENERIC,
MEMORY_DEVICE_PCI_P2PDMA,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73a52aba448f..9c0bf1441da3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1162,6 +1162,7 @@ static inline bool page_is_devmap_managed(struct page 
*page)
return false;
switch (page->pgmap->type) {
case MEMORY_DEVICE_PRIVATE:
+   case MEMORY_DEVICE_COHERENT:
case MEMORY_DEVICE_FS_DAX:
return true;
default:
@@ -1191,6 +1192,21 @@ static inline bool is_device_private_page(const struct 
page *page)
page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_coherent_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   page->pgmap->type == MEMORY_DEVICE_COHERENT;
+}
+
+static inline bool is_dev_private_or_coherent_page(const struct page *page)
+{
+   return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+   is_zone_device_page(page) &&
+   (page->pgmap->type == MEMORY_DEVICE_PRIVATE ||
+   page->pgmap->type == MEMORY_DEVICE_COHERENT);
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6da5020a8656..b06262c3cdf9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5695,8 +5695,8 @@ static int mem_cgroup_move_account(struct page *page,
  *   2(MC_TARGET_SWAP): if the swap entry corresponding to this pte is a
  * target for charge migration. if @target is not NULL, the entry is stored
  * in target->ent.
- *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is 
MEMORY_DEVICE_PRIVATE
- * (so ZONE_DEVICE page and thus not on the lru).
+ *   3(MC_TARGET_DEVICE): like MC_TARGET_PAGE  but page is device memory and
+ *   thus not on the lru.
  * For now we such page is charge like a regular page would be as for all
  * intent and purposes it is just special memory taking the place of a
  * regular page.
@@ -5730,7 +5730,7 @@ static enum mc_target_type get_mctgt_type(struct 
vm_area_struct *vma,
 */
if (page_memcg(page) == mc.from) {
ret = MC_TARGET_PAGE;
-   if (is_device_private_page(page))
+   if (is_dev_private_or_coherent_page(page))
ret = MC_TARGET_DEVICE;
if (target)
target->page = page;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3e6449f2102a..4cf212e5f432 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1554,12 +1554,16 @@ static int memory_failure_dev_pagemap(unsigned long 
pfn, int flags,
goto unlock;
}
 
-   if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
+   switch (pg

[PATCH v1, 8/8] media: mtk-vcodec: Add to support H264 inner racing mode

2022-01-26 Thread Yunfei Dong

In order to reduce decoder latency, enable H264 inner racing mode.

Send the lat trans buffer information to the core when the lat is
triggered to work, without waiting until lat decoding is done.
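
The load/record helpers in this patch follow a "first user restores,
last user saves" pattern around an active counter. The user-space sketch
below models that pattern only: a plain array stands in for the MMIO
registers, a plain int stands in for the atomic counter, no locking is
shown, and all names (racing_dev, racing_enable, racing_disable,
racing_roundtrip) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define RACING_INFO_WORDS 132

struct racing_dev {
	int active_cnt;				/* stands in for dec_active_cnt */
	uint32_t saved[RACING_INFO_WORDS];	/* stands in for vdec_racing_info */
	uint32_t regs[RACING_INFO_WORDS];	/* stands in for the MMIO window */
};

/* First active user restores the saved racing info into the registers. */
void racing_enable(struct racing_dev *d)
{
	if (++d->active_cnt == 1)
		memcpy(d->regs, d->saved, sizeof(d->regs));
}

/* Last active user snapshots the registers before the hardware goes down. */
void racing_disable(struct racing_dev *d)
{
	if (--d->active_cnt == 0)
		memcpy(d->saved, d->regs, sizeof(d->saved));
}

/* Walk through two overlapping users and return the final snapshot word. */
uint32_t racing_roundtrip(uint32_t value)
{
	struct racing_dev d = { 0 };
	uint32_t early;

	racing_enable(&d);	/* first user: registers loaded from snapshot */
	racing_enable(&d);	/* second user: no reload */
	d.regs[0] = value;	/* hardware state changes while active */
	racing_disable(&d);	/* one user left: nothing saved yet */
	early = d.saved[0];
	racing_disable(&d);	/* last user: snapshot taken */

	return early == 0 ? d.saved[0] : UINT32_MAX;
}
```

In the driver the counter update and the register copy sit under
dec_racing_info_mutex, which is what keeps a concurrent enable from
observing a half-written snapshot.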

Signed-off-by: Yunfei Dong 
---
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  4 +++
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 34 +++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h  | 10 ++
 .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  | 23 ++---
 4 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 938bf14e4e8c..099dc28b7445 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -390,6 +390,10 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
}
}
 
+   atomic_set(&dev->dec_active_cnt, 0);
+   memset(dev->vdec_racing_info, 0 , sizeof(dev->vdec_racing_info));
+   mutex_init(&dev->dec_racing_info_mutex);
+
if (dev->vdec_pdata->uses_stateless_api) {
dev->mdev_dec.dev = &pdev->dev;
strscpy(dev->mdev_dec.model, MTK_VCODEC_DEC_NAME,
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 76e1442fc6f9..065d14a3d11f 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -173,6 +173,34 @@ static void mtk_vcodec_dec_disable_irq(struct 
mtk_vcodec_dev *vdec_dev, int hw_i
}
 }
 
+static void mtk_vcodec_load_racing_info(struct mtk_vcodec_ctx *ctx)
+{
+   void __iomem *vdec_racing_addr;
+   int j;
+
+   mutex_lock(&ctx->dev->dec_racing_info_mutex);
+   if (atomic_inc_return(&ctx->dev->dec_active_cnt) == 1) {
+   vdec_racing_addr = ctx->dev->reg_base[VDEC_MISC] + 0x100;
+   for (j = 0; j < 132; j++)
+   writel(ctx->dev->vdec_racing_info[j], vdec_racing_addr 
+ j * 4);
+   }
+   mutex_unlock(&ctx->dev->dec_racing_info_mutex);
+}
+
+static void mtk_vcodec_record_racing_info(struct mtk_vcodec_ctx *ctx)
+{
+   void __iomem *vdec_racing_addr;
+   int j;
+
+   mutex_lock(&ctx->dev->dec_racing_info_mutex);
+   if (atomic_dec_and_test(&ctx->dev->dec_active_cnt)) {
+   vdec_racing_addr = ctx->dev->reg_base[VDEC_MISC] + 0x100;
+   for (j = 0; j < 132; j++)
+   ctx->dev->vdec_racing_info[j] = readl(vdec_racing_addr 
+ j * 4);
+   }
+   mutex_unlock(&ctx->dev->dec_racing_info_mutex);
+}
+
 static struct mtk_vcodec_pm *mtk_vcodec_dec_get_pm(struct mtk_vcodec_dev 
*vdec_dev,
   int hw_idx)
 {
@@ -243,11 +271,17 @@ void mtk_vcodec_dec_enable_hardware(struct mtk_vcodec_ctx 
*ctx, int hw_idx)
mtk_vcodec_dec_child_dev_on(ctx->dev, hw_idx);
 
mtk_vcodec_dec_enable_irq(ctx->dev, hw_idx);
+
+   if (IS_VDEC_INNER_RACING(ctx->dev->dec_capability))
+   mtk_vcodec_load_racing_info(ctx);
 }
 EXPORT_SYMBOL_GPL(mtk_vcodec_dec_enable_hardware);
 
 void mtk_vcodec_dec_disable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx)
 {
+   if (IS_VDEC_INNER_RACING(ctx->dev->dec_capability))
+   mtk_vcodec_record_racing_info(ctx);
+
mtk_vcodec_dec_disable_irq(ctx->dev, hw_idx);
 
mtk_vcodec_dec_child_dev_off(ctx->dev, hw_idx);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index 363b999dd709..4d6ace869b5a 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -28,6 +28,7 @@
 #define MTK_V4L2_BENCHMARK 0
 #define WAIT_INTR_TIMEOUT_MS   1000
 #define IS_VDEC_LAT_ARCH(hw_arch) ((hw_arch) >= MTK_VDEC_LAT_SINGLE_CORE)
+#define IS_VDEC_INNER_RACING(capability) (capability & MTK_VCODEC_INNER_RACING)
 
 /*
  * enum mtk_hw_reg_idx - MTK hw register base index
@@ -360,6 +361,7 @@ enum mtk_vdec_format_types {
MTK_VDEC_FORMAT_H264_SLICE = 0x100,
MTK_VDEC_FORMAT_VP8_FRAME = 0x200,
MTK_VDEC_FORMAT_VP9_FRAME = 0x400,
+   MTK_VCODEC_INNER_RACING = 0x2,
 };
 
 /**
@@ -480,6 +482,10 @@ struct mtk_vcodec_enc_pdata {
  * @subdev_dev: subdev hardware device
  * @subdev_prob_done: check whether all used hw device is prob done
  * @subdev_bitmap: used to record hardware is ready or not
+ *
+ * @dec_active_cnt: used to mark whether need to record register value
+ * @vdec_racing_info: record register value
+ * @dec_racing_info_mutex: mutex lock used for inner racing mode
  */
 struct mtk_vcodec_dev {
struct v4l2_device v4l2_dev;
@@ -525,6 +531,10 @@ struct mtk_vcodec_dev {
void *subdev_dev[MTK_VDEC_HW_MAX];
int (*subdev_prob_done)(struct mtk_vcodec_dev *vdec_dev);
DECLARE_BITMAP(subde
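
The load/record pairing in the patch above can be pictured with a small
user-space model. This is only a sketch under assumptions: `struct fake_dev`,
`hw_regs` and `saved` are hypothetical stand-ins for the MMIO window and
vdec_racing_info, and a plain int replaces the atomic counter and mutex. The
count of 132 words is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RACING_WORDS 132        /* register count taken from the patch */

/* Hypothetical stand-ins for the driver state; not the real structures. */
struct fake_dev {
	uint32_t hw_regs[RACING_WORDS];     /* models the MMIO window */
	uint32_t saved[RACING_WORDS];       /* models vdec_racing_info */
	int active_cnt;                     /* models dec_active_cnt */
};

/* First user in: restore the saved values into the "hardware". */
static void load_racing_info(struct fake_dev *dev)
{
	if (++dev->active_cnt == 1)
		memcpy(dev->hw_regs, dev->saved, sizeof(dev->hw_regs));
}

/* Last user out: snapshot the "hardware" before it is switched off. */
static void record_racing_info(struct fake_dev *dev)
{
	assert(dev->active_cnt > 0);        /* enable/disable must be paired */
	if (--dev->active_cnt == 0)
		memcpy(dev->saved, dev->hw_regs, sizeof(dev->saved));
}
```

Only the 0 -> 1 and 1 -> 0 transitions touch the registers, so nested
enable/disable calls from several contexts leave the saved copy alone.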

[PATCH v1, 7/8] media: uapi: Init VP9 stateless decode params

2022-01-26 Thread Yunfei Dong

Initialize some of the VP9 frame decode parameters to default values.

Signed-off-by: Yunfei Dong 
---
 drivers/media/v4l2-core/v4l2-ctrls-core.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c 
b/drivers/media/v4l2-core/v4l2-ctrls-core.c
index 54abe5245dcc..b25c77b8a445 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
@@ -112,6 +112,7 @@ static void std_init_compound(const struct v4l2_ctrl *ctrl, 
u32 idx,
struct v4l2_ctrl_mpeg2_picture *p_mpeg2_picture;
struct v4l2_ctrl_mpeg2_quantisation *p_mpeg2_quant;
struct v4l2_ctrl_vp8_frame *p_vp8_frame;
+   struct v4l2_ctrl_vp9_frame *p_vp9_frame;
struct v4l2_ctrl_fwht_params *p_fwht_params;
void *p = ptr.p + idx * ctrl->elem_size;
 
@@ -152,6 +153,13 @@ static void std_init_compound(const struct v4l2_ctrl 
*ctrl, u32 idx,
p_vp8_frame = p;
p_vp8_frame->num_dct_parts = 1;
break;
+   case V4L2_CTRL_TYPE_VP9_FRAME:
+   p_vp9_frame = p;
+   p_vp9_frame->profile = 0;
+   p_vp9_frame->bit_depth = 8;
+   p_vp9_frame->flags |= V4L2_VP9_FRAME_FLAG_X_SUBSAMPLING |
+   V4L2_VP9_FRAME_FLAG_Y_SUBSAMPLING;
+   break;
case V4L2_CTRL_TYPE_FWHT_PARAMS:
p_fwht_params = p;
p_fwht_params->version = V4L2_FWHT_VERSION;
-- 
2.25.1

[PATCH v1, 6/8] media: mtk-vcodec: prevent kernel crash when scp ipi timeout

2022-01-26 Thread Yunfei Dong

From: Tinghan Shen 

When the SCP times out during video playback, the kernel crashes with the
following message. The crash is caused by a NULL pointer access in
vpu_dec_ipi_handler. This patch does not fix the root cause of the NULL
pointer; it merely prevents the kernel from crashing when the NULL pointer
is encountered.

With this patch applied, the kernel stays alive and only the video player
turns to a green screen.

[67242.065474] pc : vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec]
[67242.065485] [MTK_V4L2] level=0 fops_vcodec_open(),334:
1800.vcodec_dec decoder [135]
[67242.065523] lr : scp_ipi_handler+0x11c/0x244 [mtk_scp]
[67242.065540] sp : ffbb4207fb10
[67242.065557] x29: ffbb4207fb30 x28: ffd00a1d5000
[67242.065592] x27: 1ffa0143aa24 x26: 
[67242.065625] x25: dfd0 x24: ffd0168bfdb0
[67242.065659] x23: 1ff76840ff74 x22: ffbb41fa8a88
[67242.065692] x21: ffbb4207fb9c x20: ffbb4207fba0
[67242.065725] x19: ffbb4207fb98 x18: 
[67242.065758] x17:  x16: ffd042022094
[67242.065791] x15: 1ff77ed4b71a x14: 1ff77ed4b719
[67242.065824] x13:  x12: 
[67242.065857] x11:  x10: dfd1
[67242.065890] x9 :  x8 : 0002
[67242.065923] x7 :  x6 : 003f
[67242.065956] x5 : 0040 x4 : ffe0
[67242.065989] x3 : ffd043b841b8 x2 : 
[67242.066021] x1 : 0010 x0 : 0010
[67242.066055] Call trace:
[67242.066092]  vpu_dec_ipi_handler+0xa0/0xb20 [mtk_vcodec_dec
12220d230d83a7426fc38c56b3e7bc6066955bae]
[67242.066119]  scp_ipi_handler+0x11c/0x244 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066145]  mt8192_scp_irq_handler+0x70/0x128 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066172]  scp_irq_handler+0xa0/0x114 [mtk_scp
8fb69c2ef141dd3192518b952b65aba35627b8bf]
[67242.066200]  irq_thread_fn+0x84/0xf8
[67242.066220]  irq_thread+0x170/0x1ec
[67242.066242]  kthread+0x2f8/0x3b8
[67242.066264]  ret_from_fork+0x10/0x30
[67242.066292] Code: 38f96908 35003628 91004340 d343fc08 (38f96908)

Signed-off-by: Tinghan Shen 
Signed-off-by: Yunfei Dong 
---
 drivers/media/platform/mtk-vcodec/vdec_vpu_if.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c 
b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
index 35f4d5583084..1041dd663e76 100644
--- a/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
+++ b/drivers/media/platform/mtk-vcodec/vdec_vpu_if.c
@@ -91,6 +91,11 @@ static void vpu_dec_ipi_handler(void *data, unsigned int 
len, void *priv)
struct vdec_vpu_inst *vpu = (struct vdec_vpu_inst *)
(unsigned long)msg->ap_inst_addr;
 
+   if (!vpu) {
+   mtk_v4l2_err("ap_inst_addr is NULL");
+   return;
+   }
+
mtk_vcodec_debug(vpu, "+ id=%X", msg->msg_id);
 
vpu->failure = msg->status;
-- 
2.25.1
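
The guard added by this patch follows a common pattern: validate a pointer
that was round-tripped through firmware before dereferencing it. A minimal
user-space sketch, assuming hypothetical `fake_msg` and `fake_inst` types
(the real message and instance structures have more fields):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified versions of the message and instance types. */
struct fake_msg {
	uint32_t msg_id;
	int32_t status;
	uint64_t ap_inst_addr;  /* instance pointer echoed back by the firmware */
};

struct fake_inst {
	int32_t failure;
	int handled;
};

/* Returns 0 when the message was handled, -1 when it was dropped. */
static int ipi_handler(const struct fake_msg *msg)
{
	struct fake_inst *inst;

	assert(msg != NULL);
	inst = (struct fake_inst *)(uintptr_t)msg->ap_inst_addr;
	if (!inst)              /* no instance in the message: drop it */
		return -1;

	inst->failure = msg->status;
	inst->handled = 1;
	return 0;
}
```

As the commit message says, this only avoids the crash; why the address
arrives as zero is a separate question.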

[PATCH v1, 5/8] media: mtk-vcodec: Different codec using different capture format

2022-01-26 Thread Yunfei Dong

On mt8195, vp8 needs to use MM21, while vp9 and h264 need to use HyFbc mode.
On mt8192, vp8/vp9/h264 all use the same MM21 format.

Signed-off-by: Yunfei Dong 
---
 .../platform/mtk-vcodec/mtk_vcodec_dec.c  | 41 +++
 1 file changed, 41 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
index 6ad17e69e32d..f2ced0147534 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec.c
@@ -35,6 +35,44 @@ mtk_vdec_find_format(struct v4l2_format *f,
return NULL;
 }
 
+static bool mtk_vdec_get_cap_fmt(struct mtk_vcodec_ctx *ctx, int format_index)
+{
+   const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
+   const struct mtk_video_fmt *fmt;
+   struct mtk_q_data *q_data;
+   int num_frame_count = 0, i;
+   bool ret = true;
+
+   for (i = 0; i < *dec_pdata->num_formats; i++) {
+   if (dec_pdata->vdec_formats[i].type != MTK_FMT_FRAME)
+   continue;
+
+   num_frame_count++;
+   }
+
+   if (num_frame_count == 1)
+   return true;
+
+   fmt = &dec_pdata->vdec_formats[format_index];
+   q_data = &ctx->q_data[MTK_Q_DATA_SRC];
+   switch (q_data->fmt->fourcc) {
+   case V4L2_PIX_FMT_VP8_FRAME:
+   if (fmt->fourcc == V4L2_PIX_FMT_MM21)
+   ret = true;
+   break;
+   case V4L2_PIX_FMT_H264_SLICE:
+   case V4L2_PIX_FMT_VP9_FRAME:
+   if (fmt->fourcc == V4L2_PIX_FMT_MM21)
+   ret = false;
+   break;
+   default:
+   ret = true;
+   break;
+   };
+
+   return ret;
+}
+
 static struct mtk_q_data *mtk_vdec_get_q_data(struct mtk_vcodec_ctx *ctx,
  enum v4l2_buf_type type)
 {
@@ -578,6 +616,9 @@ static int vidioc_enum_fmt(struct v4l2_fmtdesc *f, void 
*priv,
dec_pdata->vdec_formats[i].type != MTK_FMT_FRAME)
continue;
 
+   if (!output_queue && !mtk_vdec_get_cap_fmt(ctx, i))
+   continue;
+
if (j == f->index)
break;
++j;
-- 
2.25.1
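
The filtering decision made by mtk_vdec_get_cap_fmt() can be restated as a
small pure function. This is a sketch, not the driver code: the enums are
hypothetical stand-ins for the V4L2 fourcc values, and `num_frame_fmts` models
the count of MTK_FMT_FRAME entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enums standing in for the V4L2 fourcc values. */
enum codec { CODEC_VP8, CODEC_VP9, CODEC_H264, CODEC_OTHER };
enum cap_fmt { FMT_MM21, FMT_COMPRESSED };

/*
 * Mirrors the decision in the patch: with a single frame format nothing is
 * filtered; otherwise H264 and VP9 hide MM21, everything else is allowed.
 */
static bool cap_fmt_allowed(enum codec codec, enum cap_fmt fmt,
			    int num_frame_fmts)
{
	assert(num_frame_fmts >= 0);
	if (num_frame_fmts == 1)
		return true;

	switch (codec) {
	case CODEC_H264:
	case CODEC_VP9:
		return fmt != FMT_MM21;
	default:
		return true;
	}
}
```

Written this way it becomes visible that, as posted, the check only hides
MM21 from H264 and VP9; it does not hide the other capture format from VP8.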

[PATCH v1, 4/8] media: mtk-vcodec: Adds compatible for mt8195

2022-01-26 Thread Yunfei Dong

Add the compatible for the mt8195 platform.

Signed-off-by: Yunfei Dong 
---
 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
index 2d21d0010c9c..938bf14e4e8c 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_drv.c
@@ -468,6 +468,10 @@ static const struct of_device_id mtk_vcodec_match[] = {
.compatible = "mediatek,mt8186-vcodec-dec",
.data = &mtk_vdec_single_core_pdata,
},
+   {
+   .compatible = "mediatek,mt8195-vcodec-dec",
+   .data = &mtk_lat_sig_core_pdata,
+   },
{},
 };
 
-- 
2.25.1

[PATCH v1, 3/8] dt-bindings: media: mtk-vcodec: Adds decoder dt-bindings for mt8195

2022-01-26 Thread Yunfei Dong

Add decoder dt-bindings for mt8195.

Signed-off-by: Yunfei Dong 
---
 .../bindings/media/mediatek,vcodec-subdev-decoder.yaml   | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
index a3c892338ac0..a2f2db29daed 100644
--- 
a/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
+++ 
b/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
@@ -50,6 +50,7 @@ properties:
 enum:
   - mediatek,mt8192-vcodec-dec
   - mediatek,mt8186-vcodec-dec
+  - mediatek,mt8195-vcodec-dec
 
   reg:
 maxItems: 1
-- 
2.25.1

[PATCH v1, 2/8] media: mtk-vcodec: Add to support lat soc hardware

2022-01-26 Thread Yunfei Dong

Add the lat soc compatible and support the lat soc power/clk helpers.

Signed-off-by: Yunfei Dong 
---
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.c  | 12 +---
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.h  |  2 ++
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c  | 16 
 .../media/platform/mtk-vcodec/mtk_vcodec_drv.h   |  1 +
 4 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.c
index 7b5da3e4cac2..7374d5a5c156 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.c
@@ -28,6 +28,10 @@ static const struct of_device_id mtk_vdec_hw_match[] = {
.compatible = "mediatek,mtk-vcodec-core",
.data = (void *)MTK_VDEC_CORE,
},
+   {
+   .compatible = "mediatek,mtk-vcodec-lat-soc",
+   .data = (void *)MTK_VDEC_LAT_SOC,
+   },
{},
 };
 MODULE_DEVICE_TABLE(of, mtk_vdec_hw_match);
@@ -166,9 +170,11 @@ static int mtk_vdec_hw_probe(struct platform_device *pdev)
subdev_dev->reg_base[VDEC_HW_SYS] = main_dev->reg_base[VDEC_HW_SYS];
set_bit(subdev_dev->hw_idx, main_dev->subdev_bitmap);
 
-   ret = mtk_vdec_hw_init_irq(subdev_dev);
-   if (ret)
-   goto err;
+   if (IS_SUPPORT_VDEC_HW_IRQ(hw_idx)) {
+   ret = mtk_vdec_hw_init_irq(subdev_dev);
+   if (ret)
+   goto err;
+   }
 
subdev_dev->reg_base[VDEC_HW_MISC] =
devm_platform_ioremap_resource(pdev, 0);
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.h 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.h
index a63e4b1b81c3..b8938c6c3e72 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.h
@@ -17,6 +17,8 @@
 #define VDEC_IRQ_CLR 0x10
 #define VDEC_IRQ_CFG_REG 0xa4
 
+#define IS_SUPPORT_VDEC_HW_IRQ(hw_idx) (hw_idx != MTK_VDEC_LAT_SOC)
+
 /**
  * enum mtk_vdec_hw_reg_idx - subdev hardware register base index
  * @VDEC_HW_SYS : vdec soc register index
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 1581a1277473..76e1442fc6f9 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -203,6 +203,14 @@ static void mtk_vcodec_dec_child_dev_on(struct 
mtk_vcodec_dev *vdec_dev,
mtk_vcodec_dec_pw_on(pm);
mtk_vcodec_dec_clock_on(pm);
}
+
+   if (hw_idx == MTK_VDEC_LAT0) {
+   pm = mtk_vcodec_dec_get_pm(vdec_dev, MTK_VDEC_LAT_SOC);
+   if (pm) {
+   mtk_vcodec_dec_pw_on(pm);
+   mtk_vcodec_dec_clock_on(pm);
+   }
+   }
 }
 
 static void mtk_vcodec_dec_child_dev_off(struct mtk_vcodec_dev *vdec_dev,
@@ -215,6 +223,14 @@ static void mtk_vcodec_dec_child_dev_off(struct 
mtk_vcodec_dev *vdec_dev,
mtk_vcodec_dec_clock_off(pm);
mtk_vcodec_dec_pw_off(pm);
}
+
+   if (hw_idx == MTK_VDEC_LAT0) {
+   pm = mtk_vcodec_dec_get_pm(vdec_dev, MTK_VDEC_LAT_SOC);
+   if (pm) {
+   mtk_vcodec_dec_clock_off(pm);
+   mtk_vcodec_dec_pw_off(pm);
+   }
+   }
 }
 
 void mtk_vcodec_dec_enable_hardware(struct mtk_vcodec_ctx *ctx, int hw_idx)
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index cd2939b47790..363b999dd709 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -104,6 +104,7 @@ enum mtk_vdec_hw_id {
MTK_VDEC_CORE,
MTK_VDEC_LAT0,
MTK_VDEC_LAT1,
+   MTK_VDEC_LAT_SOC,
MTK_VDEC_HW_MAX,
 };
 
-- 
2.25.1
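
The power sequencing added here (LAT0 brings the LAT SOC block up and down
with it) can be modelled in user space. A sketch under assumptions:
`struct fake_pm` and its `present` flag are hypothetical and only imitate the
"pm may be NULL" check in the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum hw_id { HW_CORE, HW_LAT0, HW_LAT1, HW_LAT_SOC, HW_MAX };

/* Hypothetical per-block state; a NULL pm is modelled by present == false. */
struct fake_pm {
	bool present;
	bool powered;
	bool clocked;
};

static void block_on(struct fake_pm *pm)
{
	if (!pm->present)
		return;
	pm->powered = true;     /* power first ... */
	pm->clocked = true;     /* ... then clocks */
}

static void block_off(struct fake_pm *pm)
{
	if (!pm->present)
		return;
	pm->clocked = false;    /* clocks first ... */
	pm->powered = false;    /* ... then power */
}

static void child_dev_on(struct fake_pm pm[HW_MAX], enum hw_id id)
{
	assert(id < HW_MAX);
	block_on(&pm[id]);
	if (id == HW_LAT0)      /* LAT0 takes the LAT SOC block along */
		block_on(&pm[HW_LAT_SOC]);
}

static void child_dev_off(struct fake_pm pm[HW_MAX], enum hw_id id)
{
	assert(id < HW_MAX);
	block_off(&pm[id]);
	if (id == HW_LAT0)
		block_off(&pm[HW_LAT_SOC]);
}
```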

[PATCH v1, 1/8] dt-bindings: media: mtk-vcodec: Adds decoder dt-bindings for lat soc

2022-01-26 Thread Yunfei Dong

Add decoder dt-bindings for compatible "mediatek,mtk-vcodec-lat-soc".

Signed-off-by: Yunfei Dong 
---
 .../media/mediatek,vcodec-subdev-decoder.yaml | 49 +++
 1 file changed, 49 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
index 6415c9f29130..a3c892338ac0 100644
--- 
a/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
+++ 
b/Documentation/devicetree/bindings/media/mediatek,vcodec-subdev-decoder.yaml
@@ -189,6 +189,55 @@ patternProperties:
 
 additionalProperties: false
 
+  '^vcodec-lat-soc@[0-9a-f]+$':
+type: object
+
+properties:
+  compatible:
+const: mediatek,mtk-vcodec-lat-soc
+
+  reg:
+maxItems: 1
+
+  iommus:
+minItems: 1
+maxItems: 32
+description: |
+  List of the hardware port in respective IOMMU block for current Socs.
+  Refer to bindings/iommu/mediatek,iommu.yaml.
+
+  clocks:
+maxItems: 5
+
+  clock-names:
+items:
+  - const: sel
+  - const: soc-vdec
+  - const: soc-lat
+  - const: vdec
+  - const: top
+
+  assigned-clocks:
+maxItems: 1
+
+  assigned-clock-parents:
+maxItems: 1
+
+  power-domains:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - iommus
+  - clocks
+  - clock-names
+  - assigned-clocks
+  - assigned-clock-parents
+  - power-domains
+
+additionalProperties: false
+
 required:
   - compatible
   - reg
-- 
2.25.1

[PATCH v1, 0/8] support mt8195 decoder

2022-01-26 Thread Yunfei Dong

First, add the mt8195 soc lat hardware and compatible, then add the documents.
vp8 only supports MM21 mode while H264/vp9 support MT21C, so the capture
formats need to be separated per codec. Next, initialize the vp9 stateless
decoder parameters. Lastly, enable H264 inner racing mode to reduce hardware
latency.

Patches 1~4 add the mt8195 soc lat hardware and compatible, then add the documents.
Patch 5 uses different formats for different codecs.
Patch 6 prevents a kernel crash when the scp reboots.
Patch 7 initializes the vp9 stateless decoder parameters.
Patch 8 enables H264 inner racing mode to reduce hardware latency.
---
This patch depends on "support mt8186 decoder"[1]

[1]  
https://patchwork.kernel.org/project/linux-mediatek/cover/20220122075606.19373-1-yunfei.d...@mediatek.com
---
Tinghan Shen (1):
  media: mtk-vcodec: prevent kernel crash when scp ipi timeout

Yunfei Dong (7):
  dt-bindings: media: mtk-vcodec: Adds decoder dt-bindings for lat soc
  media: mtk-vcodec: Add to support lat soc hardware
  dt-bindings: media: mtk-vcodec: Adds decoder dt-bindings for mt8195
  media: mtk-vcodec: Adds compatible for mt8195
  media: mtk-vcodec: Different codec using different capture format
  media: uapi: Init VP9 stateless decode params
  media: mtk-vcodec: Add to support H264 inner racing mode

 .../media/mediatek,vcodec-subdev-decoder.yaml | 50 +++
 .../platform/mtk-vcodec/mtk_vcodec_dec.c  | 41 +++
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  |  8 +++
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.c   | 12 +++--
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.h   |  2 +
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 50 +++
 .../platform/mtk-vcodec/mtk_vcodec_drv.h  | 11 
 .../mtk-vcodec/vdec/vdec_h264_req_multi_if.c  | 23 +++--
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   |  5 ++
 drivers/media/v4l2-core/v4l2-ctrls-core.c |  8 +++
 10 files changed, 202 insertions(+), 8 deletions(-)

-- 
2.25.1

Re: [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset

2022-01-26 Thread kernel test robot

Hi Lucas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[also build test WARNING on next-20220125]
[cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next 
drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next 
v5.17-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: i386-randconfig-a011 
(https://download.01.org/0day-ci/archive/20220127/202201270902.hcre2frp-...@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
2a1b7aa016c0f4b5598806205bdfbab1ea2d92c4)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:370:3: warning: format specifies 
>> type 'unsigned long' but the argument has type 'unsigned int' [-Wformat]
   (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
   ^~
   include/drm/drm_print.h:461:63: note: expanded from macro 'drm_dbg'
   drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, 
##__VA_ARGS__)
 ~~~
^~~
   1 warning generated.


vim +370 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c

   348  
   349  static long guc_mmio_reg_state_create(struct intel_guc *guc)
   350  {
   351  struct intel_gt *gt = guc_to_gt(guc);
   352  struct intel_engine_cs *engine;
   353  enum intel_engine_id id;
   354  struct temp_regset temp_set = {};
   355  long total = 0;
   356  
   357  for_each_engine(engine, gt, id) {
   358  u32 used = temp_set.storage_used;
   359  
   360  if (guc_mmio_regset_init(&temp_set, engine) < 0)
   361  return -1;
   362  
   363  guc->ads_regset_count[id] = temp_set.storage_used - 
used;
   364  total += guc->ads_regset_count[id];
   365  }
   366  
   367  guc->ads_regset = temp_set.storage;
   368  
   369  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary 
ADS regset\n",
 > 370  (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 
 > 10);
   371  
   372  return total * sizeof(struct guc_mmio_reg);
   373  }
   374  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
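
For context on the warning above: the argument is a u32 multiplied by a
sizeof(), so its type is size_t, which is 'unsigned int' on i386 and
'unsigned long' on 64-bit builds. One common way to silence this kind of
warning is the %zu specifier; whether the series ends up doing that is not
known here. A user-space sketch, with `struct fake_mmio_reg` as a hypothetical
12-byte stand-in for struct guc_mmio_reg:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 12-byte record standing in for struct guc_mmio_reg. */
struct fake_mmio_reg {
	uint32_t offset;
	uint32_t value;
	uint32_t flags;
};

/*
 * uint32_t * sizeof(...) has type size_t wherever size_t is at least as
 * wide as uint32_t, so %zu matches on both 32-bit and 64-bit builds.
 */
static int format_regset_kb(char *buf, size_t len, uint32_t storage_max)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "Used %zu KB for temporary ADS regset",
			(storage_max * sizeof(struct fake_mmio_reg)) >> 10);
}
```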

Re: [PATCH v2] drm/bridge: Add missing pm_runtime_put_sync

2022-01-26 Thread Laurent Pinchart

Hi Yongzhi,

Thank you for the patch.

On Sun, Jan 23, 2022 at 11:20:35PM -0800, Yongzhi Liu wrote:
> pm_runtime_get_sync() will increase the runtime PM counter
> even when it returns an error. Thus a pairing decrement is needed
> to prevent refcount leak. Fix this by replacing this API with
> pm_runtime_resume_and_get(), which will not change the runtime
> PM counter on error. Besides, a matching decrement is needed
> on the error handling path to keep the counter balanced.
> 
> Signed-off-by: Yongzhi Liu 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/gpu/drm/bridge/nwl-dsi.c | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/nwl-dsi.c 
> b/drivers/gpu/drm/bridge/nwl-dsi.c
> index 9282e61..30aacd9 100644
> --- a/drivers/gpu/drm/bridge/nwl-dsi.c
> +++ b/drivers/gpu/drm/bridge/nwl-dsi.c
> @@ -862,18 +862,19 @@ nwl_dsi_bridge_mode_set(struct drm_bridge *bridge,
>   memcpy(&dsi->mode, adjusted_mode, sizeof(dsi->mode));
>   drm_mode_debug_printmodeline(adjusted_mode);
>  
> - pm_runtime_get_sync(dev);
> + if (pm_runtime_resume_and_get(dev) < 0)
> + return;
>  
>   if (clk_prepare_enable(dsi->lcdif_clk) < 0)
> - return;
> + goto runtime_put;
>   if (clk_prepare_enable(dsi->core_clk) < 0)
> - return;
> + goto runtime_put;
>  
>   /* Step 1 from DSI reset-out instructions */
>   ret = reset_control_deassert(dsi->rst_pclk);
>   if (ret < 0) {
>   DRM_DEV_ERROR(dev, "Failed to deassert PCLK: %d\n", ret);
> - return;
> + goto runtime_put;
>   }
>  
>   /* Step 2 from DSI reset-out instructions */
> @@ -883,13 +884,18 @@ nwl_dsi_bridge_mode_set(struct drm_bridge *bridge,
>   ret = reset_control_deassert(dsi->rst_esc);
>   if (ret < 0) {
>   DRM_DEV_ERROR(dev, "Failed to deassert ESC: %d\n", ret);
> - return;
> + goto runtime_put;
>   }
>   ret = reset_control_deassert(dsi->rst_byte);
>   if (ret < 0) {
>   DRM_DEV_ERROR(dev, "Failed to deassert BYTE: %d\n", ret);
> - return;
> + goto runtime_put;
>   }
> +
> + return;
> +
> +runtime_put:
> + pm_runtime_put_sync(dev);
>  }
>  
>  static void

-- 
Regards,

Laurent Pinchart
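
The counter behaviour described in the commit message can be illustrated with
a user-space model. This is a sketch only: the `fake_*` functions are
hypothetical and merely imitate the documented difference between
pm_runtime_get_sync() and pm_runtime_resume_and_get(); they are not the
kernel API.

```c
#include <assert.h>

/* Hypothetical model of a runtime-PM usage counter; not the kernel API. */
struct fake_dev {
	int usage;          /* models the runtime PM usage counter */
	int resume_fails;   /* when non-zero, resuming the device fails */
};

/* Models pm_runtime_get_sync(): the counter is bumped even on failure. */
static int fake_get_sync(struct fake_dev *dev)
{
	dev->usage++;
	return dev->resume_fails ? -1 : 0;
}

/* Models pm_runtime_resume_and_get(): the counter is restored on failure. */
static int fake_resume_and_get(struct fake_dev *dev)
{
	int ret = fake_get_sync(dev);

	if (ret < 0)
		dev->usage--;
	return ret;
}

static void fake_put(struct fake_dev *dev)
{
	assert(dev->usage > 0);
	dev->usage--;
}

/* Shape of the fixed mode_set(): every failure after the get drops the ref. */
static void fake_mode_set(struct fake_dev *dev, int clk_fails)
{
	if (fake_resume_and_get(dev) < 0)
		return;
	if (clk_fails)
		goto runtime_put;
	return;             /* success: the reference stays held for a later put */

runtime_put:
	fake_put(dev);
}
```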

Re: [PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset

2022-01-26 Thread kernel test robot

Hi Lucas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[also build test WARNING on next-20220125]
[cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next 
drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next 
v5.17-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: i386-randconfig-m021-20220124 
(https://download.01.org/0day-ci/archive/20220127/202201270827.clihfdpe-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912
git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from include/drm/drm_mm.h:51,
from drivers/gpu/drm/i915/i915_vma.h:31,
from drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h:13,
from drivers/gpu/drm/i915/gt/uc/intel_guc.h:20,
from drivers/gpu/drm/i915/gt/uc/intel_uc.h:9,
from drivers/gpu/drm/i915/gt/intel_gt_types.h:18,
from drivers/gpu/drm/i915/gt/intel_gt.h:10,
from drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:9:
   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c: In function 
'guc_mmio_reg_state_create':
>> drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:38: warning: format '%lu' 
>> expects argument of type 'long unsigned int', but argument 4 has type 'u32' 
>> {aka 'unsigned int'} [-Wformat=]
 369 |  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS 
regset\n",
 |  
^~~~
 370 |   (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
 |   ~~
 ||
 |u32 {aka 
unsigned int}
   include/drm/drm_print.h:461:56: note: in definition of macro 'drm_dbg'
 461 |  drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, 
##__VA_ARGS__)
 |^~~
   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:46: note: format string is 
defined here
 369 |  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS 
regset\n",
 |~~^
 |  |
 |  long unsigned int
 |%u


vim +369 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c

   348  
   349  static long guc_mmio_reg_state_create(struct intel_guc *guc)
   350  {
   351  struct intel_gt *gt = guc_to_gt(guc);
   352  struct intel_engine_cs *engine;
   353  enum intel_engine_id id;
   354  struct temp_regset temp_set = {};
   355  long total = 0;
   356  
   357  for_each_engine(engine, gt, id) {
   358  u32 used = temp_set.storage_used;
   359  
   360  if (guc_mmio_regset_init(&temp_set, engine) < 0)
   361  return -1;
   362  
   363  guc->ads_regset_count[id] = temp_set.storage_used - 
used;
   364  total += guc->ads_regset_count[id];
   365  }
   366  
   367  guc->ads_regset = temp_set.storage;
   368  
 > 369  drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary 
 > ADS regset\n",
   370  (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 
10);
   371  
   372  return total * sizeof(struct guc_mmio_reg);
   373  }
   374  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

[PATCH v2] drm/msm/dp: add connector type to enhance debug messages

2022-01-26 Thread Kuogee Hsieh

The DP driver is a generic driver which supports both eDP and DP.
For debugging purposes, it must be possible to tell whether a message
was generated by eDP or DP.
This patch:
1) adds the connector type to debug messages within dp_display.c
2) revises debug messages related to the DP phy within dp_ctrl.c
3) replaces the DRM_DEBUG_DP macro with drm_dbg_dp

Changes in V2:
-- replace DRM_DEBUG_DP macro with drm_dbg_dp

Signed-off-by: Kuogee Hsieh 
---
 drivers/gpu/drm/msm/dp/dp_audio.c   |  49 +--
 drivers/gpu/drm/msm/dp/dp_catalog.c |  34 ++-
 drivers/gpu/drm/msm/dp/dp_ctrl.c| 116 +++-
 drivers/gpu/drm/msm/dp/dp_display.c | 103 ++--
 drivers/gpu/drm/msm/dp/dp_drm.c |   4 +-
 drivers/gpu/drm/msm/dp/dp_link.c|  99 +-
 drivers/gpu/drm/msm/dp/dp_panel.c   |  43 +++--
 drivers/gpu/drm/msm/dp/dp_parser.c  |   2 +-
 drivers/gpu/drm/msm/dp/dp_power.c   |  20 ---
 9 files changed, 283 insertions(+), 187 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_audio.c 
b/drivers/gpu/drm/msm/dp/dp_audio.c
index d7e4a39..4fbbe0a 100644
--- a/drivers/gpu/drm/msm/dp/dp_audio.c
+++ b/drivers/gpu/drm/msm/dp/dp_audio.c
@@ -136,7 +136,8 @@ static void dp_audio_stream_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_1_BIT)
| (parity_byte << PARITY_BYTE_1_BIT));
-   DRM_DEBUG_DP("Header Byte 1: value = 0x%x, parity_byte = 0x%x\n",
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 1: value = 0x%x, parity_byte = 0x%x\n",
value, parity_byte);
dp_audio_set_header(catalog, value,
DP_AUDIO_SDP_STREAM, DP_AUDIO_SDP_HEADER_1);
@@ -148,7 +149,8 @@ static void dp_audio_stream_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_2_BIT)
| (parity_byte << PARITY_BYTE_2_BIT));
-   DRM_DEBUG_DP("Header Byte 2: value = 0x%x, parity_byte = 0x%x\n",
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 2: value = 0x%x, parity_byte = 0x%x\n",
value, parity_byte);
 
dp_audio_set_header(catalog, value,
@@ -162,7 +164,8 @@ static void dp_audio_stream_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_3_BIT)
| (parity_byte << PARITY_BYTE_3_BIT));
-   DRM_DEBUG_DP("Header Byte 3: value = 0x%x, parity_byte = 0x%x\n",
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 3: value = 0x%x, parity_byte = 0x%x\n",
value, parity_byte);
 
dp_audio_set_header(catalog, value,
@@ -183,8 +186,9 @@ static void dp_audio_timestamp_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_1_BIT)
| (parity_byte << PARITY_BYTE_1_BIT));
-   DRM_DEBUG_DP("Header Byte 1: value = 0x%x, parity_byte = 0x%x\n",
-   value, parity_byte);
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 1: value = 0x%x, parity_byte = 0x%x\n",
+   value, parity_byte);
dp_audio_set_header(catalog, value,
DP_AUDIO_SDP_TIMESTAMP, DP_AUDIO_SDP_HEADER_1);
 
@@ -196,7 +200,8 @@ static void dp_audio_timestamp_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_2_BIT)
| (parity_byte << PARITY_BYTE_2_BIT));
-   DRM_DEBUG_DP("Header Byte 2: value = 0x%x, parity_byte = 0x%x\n",
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 2: value = 0x%x, parity_byte = 0x%x\n",
value, parity_byte);
dp_audio_set_header(catalog, value,
DP_AUDIO_SDP_TIMESTAMP, DP_AUDIO_SDP_HEADER_2);
@@ -209,7 +214,8 @@ static void dp_audio_timestamp_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADER_BYTE_3_BIT)
| (parity_byte << PARITY_BYTE_3_BIT));
-   DRM_DEBUG_DP("Header Byte 3: value = 0x%x, parity_byte = 0x%x\n",
+   drm_dbg_dp((struct drm_device *)NULL,
+   "Header Byte 3: value = 0x%x, parity_byte = 0x%x\n",
value, parity_byte);
dp_audio_set_header(catalog, value,
DP_AUDIO_SDP_TIMESTAMP, DP_AUDIO_SDP_HEADER_3);
@@ -229,7 +235,8 @@ static void dp_audio_infoframe_sdp(struct dp_audio_private 
*audio)
parity_byte = dp_audio_calculate_parity(new_value);
value |= ((new_value << HEADE

Re: [PATCH 4/8] drm/i915: Use preempt_disable/enable_rt() where recommended

2022-01-26 Thread Mario Kleiner

On Tue, Dec 14, 2021 at 3:03 PM Sebastian Andrzej Siewior <
bige...@linutronix.de> wrote:

> From: Mike Galbraith 
>
> Mario Kleiner suggested in commit
>   ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into
> kms driver.")
>
> spots where preemption should be disabled on PREEMPT_RT. The
> difference is that on PREEMPT_RT the intel_uncore::lock disables neither
> preemption nor interrupts and so the region remains preemptible.
>
>
Hi, first thank you for implementing these preempt disables according to
the markers I left long ago. And sorry for the rather late reply.

I had a look at the code, as of Linux 5.16, and also did a little test run
(of a standard kernel, not with PREEMPT_RT, only
CONFIG_PREEMPT_VOLUNTARY=y) on my Intel Kabylake GT2, so some thoughts:

> The area covers only register reads and writes. The part that worries me
> is:
> - __intel_get_crtc_scanline() the worst case is 100us if no match is
>   found.
>

This one can indeed be a problem on (maybe all?) modern Intel GPUs since
Haswell, i.e. the last ~10 years. I was able to reproduce it on my Kabylake
Intel GPU.

Most of the time that for-loop with up to 100 repetitions (~ 100
udelay(1) + one mmio register read) (cf.
https://elixir.bootlin.com/linux/v5.17-rc1/source/drivers/gpu/drm/i915/i915_irq.c#L856)
will not execute, because most of the time that function gets called from
the vblank irq handler and then that trigger condition (if
(HAS_DDI(dev_priv) && !position)) is not true. However, it also gets called
as part of power-saving on behalf of userspace context, whenever the
desktop graphics goes idle for two video refresh cycles. If the desktop
shows graphics activity again, and vblank interrupts need to get reenabled,
the probability of hitting that case is then ~1-4% depending on video mode.
How many loops it runs also varies.

On my little Intel(R) Core(TM) i5-8250U CPU machine with a mostly idle
desktop, I observed about one hit every couple of seconds of regular use,
and each hit took between 125 usecs and almost 250 usecs. I guess udelay(1)
can take a bit longer than 1 usec?

So that's too much for preempt-rt. What one could do is the following:

1. In the for-loop in __intel_get_crtc_scanline(), add a preempt_enable()
before the udelay(1); and a preempt_disable() again after it. Or
potentially around the whole for-loop if the overhead of
preempt_en/disable() is significant?

2. In intel_get_crtc_scanline() also wrap the call to
__intel_get_crtc_scanline() into a preempt_disable() and preempt_enable(),
so we can be sure that __intel_get_crtc_scanline() always gets called with
preemption disabled.
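To make the two steps above concrete, here is a rough userspace model of
the control flow, not actual i915 code. Everything in it is an assumption
for illustration: preempt_disable()/preempt_enable() are stand-ins that
only count nesting depth, udelay_stub() and read_scanline_stub() are
made-up helpers, and the loop is simplified to "poll until the value
becomes non-zero".

```c
#include <assert.h>

/* Userspace stand-ins, NOT the kernel API: they only track nesting depth. */
static int preempt_depth;
static int polls_done;

static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { assert(preempt_depth > 0); preempt_depth--; }

/* Stand-in for udelay(1): checks that we are preemptible while waiting. */
static void udelay_stub(void)
{
	assert(preempt_depth == 0);
	polls_done++;
}

/* Hypothetical hardware read: returns non-zero after `ready_after` polls. */
static int ready_after;
static int read_scanline_stub(void) { return polls_done >= ready_after ? 1 : 0; }

/* Model of __intel_get_crtc_scanline(): entered with preemption disabled. */
static int model_get_scanline_locked(void)
{
	int position = read_scanline_stub();
	int i;

	assert(preempt_depth == 1);
	for (i = 0; i < 100 && position == 0; i++) {
		preempt_enable();	/* step 1: be preemptible across the delay */
		udelay_stub();
		preempt_disable();
		position = read_scanline_stub();
	}
	return position;
}

/* Model of intel_get_crtc_scanline(): step 2, wrap the call. */
static int model_get_scanline(void)
{
	int position;

	preempt_disable();
	position = model_get_scanline_locked();
	preempt_enable();
	return position;
}
```

The only point of the model is that the disable/enable pairs stay balanced
on every path and that the delay itself runs with preemption enabled.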

Why should this work ok'ish? The point of the original preempt disable
inside i915_get_crtc_scanoutpos() is that those two *stime = ktime_get()
and *etime = ktime_get() clock
queries happen as close to the scanout position query as possible to get a
small confidence interval for when exactly the scanoutpos was
read/determined from the display hardware. error = (etime - stime) is the
error margin. If that margin becomes greater than 20 usecs, then the
higher-level code will consider the measurement invalid and repeat the
whole procedure up to 3 times before giving up.
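A minimal userspace sketch of that accept/retry rule, assuming the 20 usecs
limit and the 3 attempts described above. The struct, the callback type and
the fake_query() test double are hypothetical names for illustration, not
the DRM core's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERROR_NS	20000	/* 20 usecs tolerance, as described above */
#define MAX_ATTEMPTS	3	/* retry limit, as described above */

/* Hypothetical sample: one scanout query bracketed by two clock reads. */
struct scanout_sample {
	int64_t stime_ns;	/* clock read just before the hardware query */
	int64_t etime_ns;	/* clock read just after the hardware query */
	int position;
};

/* Hypothetical callback standing in for the driver's query. */
typedef struct scanout_sample (*query_fn)(void *ctx);

/* Returns true and fills *out once a sample is within the error margin. */
static bool query_with_retry(query_fn query, void *ctx,
			     struct scanout_sample *out, int *attempts)
{
	int i;

	for (i = 1; i <= MAX_ATTEMPTS; i++) {
		struct scanout_sample s = query(ctx);
		int64_t error_ns = s.etime_ns - s.stime_ns;

		*attempts = i;
		if (error_ns <= MAX_ERROR_NS) {
			*out = s;
			return true;
		}
	}
	return false;
}

/* Test double: replays a scripted list of query durations. */
struct fake_hw {
	const int64_t *durations_ns;
	int next;
};

static struct scanout_sample fake_query(void *ctx)
{
	struct fake_hw *hw = ctx;
	int64_t d = hw->durations_ns[hw->next++];
	struct scanout_sample s = {
		.stime_ns = 1000, .etime_ns = 1000 + d, .position = 42,
	};

	return s;
}
```

With scripted durations of 125 usecs, 3 usecs, 1 usec the second attempt is
accepted. With three durations above the limit the query gives up, which is
the failure case discussed below.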

Normally, in my experience with different graphics chips, one would observe
error < 3 usecs, so the measurement almost always succeeds at first try,
only very rarely takes two attempts. The preempt disable is meant to make
sure that this stays the case on a PREEMPT_RT kernel.

The problem is the relatively rare cases where we hit that up to 100
iterations for-loop. Here even on a regular kernel, due to hardware quirks,
we already exceed the 20 usecs tolerance by a huge amount of more than 100
usecs, leading to a retry of the measurement. And my tests showed that
often the two succeeding retries also fail, because hardware quirks can
apparently create a blackout situation approaching 1 msec, so we lose
anyway, regardless of whether we get preempted on a RT kernel or not. That's why
enabling preemption on RT again during that for-loop should not make the
situation worse and at least keep RT as real-time as intended.

I would also expect that this failure case is the one least
likely to impair userspace applications greatly in practice. The cases that
mostly matter are the ones executed during vblank hardware irq, where the
for-loop never executes and error margin and preempt off time is only about
1 usec. My own software which depends on very precise timestamps from the
mechanism never reported >> 20 usecs errors during startup tests or runtime
tests.

> - intel_crtc_scanlines_since_frame_timestamp() not sure how long this
>   may take in the worst case.
>
>
intel_crtc_scanlines_since_frame_timestamp() should be harmless. That
do-while loop just tries to make sure that two register reads that should
happen within the same video refresh cycle are happening in the same
re

Re: [PATCH v1 0/4] fbtft: Unorphan the driver for maintenance

2022-01-26 Thread Daniel Vetter

On Wed, Jan 26, 2022 at 3:24 PM Greg Kroah-Hartman
 wrote:
> On Wed, Jan 26, 2022 at 03:18:14PM +0100, Javier Martinez Canillas wrote:
> > On 1/26/22 15:11, Andy Shevchenko wrote:
> > > On Wed, Jan 26, 2022 at 02:47:33PM +0100, Javier Martinez Canillas wrote:
> > >> On 1/26/22 14:27, Andy Shevchenko wrote:
> > >>> On Wed, Jan 26, 2022 at 12:18:30PM +0100, Javier Martinez Canillas 
> > >>> wrote:
> >  On 1/26/22 11:59, Helge Deller wrote:
> > > On 1/26/22 11:02, Andy Shevchenko wrote:
> > >
> > > ...
> > >
> > >> P.S. For the record, I will personally NAK any attempts to remove 
> > >> that
> > >> driver from the kernel. And this is another point why it's better not
> > >> to be under the staging.
> > >
> > > I agree. Same as for me to NAK the disabling of fbcon's acceleration
> > > features or even attempting to remove fbdev altogether (unless all
> > > relevant drivers are ported to DRM).
> > 
> >  But that will never happen if we keep moving the goal post.
> > 
> >  At some point new fbdev drivers should not be added anymore, otherwise
> >  the number of existing drivers that need conversion will keep growing.
> > >>>
> > >>> This thread is not about adding a new driver.
> > >>
> > >> It was about adding a new drivers to drivers/video/ (taken from staging).
> > >
> > > Does it mean gates are open to take any new fbdev drivers to the staging?
> > > If not, I do not see a point here.
> > >
> >
> > Good question. I don't know really.
> >
> > But staging has always been more flexible in what's accepted there and
> > that's why some distros avoid to enable CONFIG_STAGING=y in the kernel.
>
> And that's why if you load a staging driver, it enables TAINT_CRAP in
> your runtime flags :)

fwiw I'm fine with adding new fbdev drivers to staging, that really
doesn't hurt anyone. Adding drm drivers to staging tends to be a pain,
not least because if we need to do any changes to helpers there's
usually a cross-tree coordination problem, and the benefit of staging
hasn't in the past really outweighed that. Plus I try for us to land
new drivers when they're good enough directly into drivers/gpu, and
not aim for perfect.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v1 0/4] fbtft: Unorphan the driver for maintenance

2022-01-26 Thread Daniel Vetter

On Wed, Jan 26, 2022 at 3:46 PM Dan Carpenter  wrote:
>
> The other advantage of staging is that I don't think syzbot enables it.
> I guess it's easier to persuade Dmitry to ignore STAGING than it was to
> get him to disable FBDEV.  :P
>
> The memory corruption in fbdev was a real headache for everyone because
> the stack traces ended up all over the kernel.

Uh Dmitry disabled all of FBDEV? That's a bit too much, since there's
still a lot of distros shipping things. I don't recommend enabling
either fbdev or fbcon and some hardening checks look for these
(forgot which one). But if syzbot stops checking fbcon and fbdev stuff
on top of drm drivers (where most of the problems should be gone
because you can't change the resolution through the current fbdev
emulation) then that essentially means fbdev really needs to be
disabled in distros asap.

Disabling the entire pile of hw drivers makes sense, because that's
pretty hopeless imo.

Adding Dmitry to confirm.

-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH 17/27] dt-bindings: display: rockchip: Add binding for VOP2

2022-01-26 Thread Rob Herring

On Wed, 26 Jan 2022 15:55:39 +0100, Sascha Hauer wrote:
> The VOP2 is found on newer Rockchip SoCs like the rk3568 or the rk3566.
> The binding differs slightly from the existing VOP binding, so add a new
> binding file for it.
> 
> Changes since v3:
> - drop redundant _vop suffix from clock names
> 
> Signed-off-by: Sascha Hauer 
> ---
>  .../display/rockchip/rockchip-vop2.yaml   | 146 ++
>  1 file changed, 146 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.yaml
> 

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:

dtschema/dtc warnings/errors:
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.example.dt.yaml:
 vop@fe04: clock-names:0: 'aclk' was expected
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.yaml
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.example.dt.yaml:
 vop@fe04: clock-names:1: 'hclk' was expected
From schema: 
/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/display/rockchip/rockchip-vop2.yaml

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/patch/1584511

This check can fail if there are any dependencies. The base for a patch
series is generally the most recent rc1.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit.

[PATCH v10 4/5] drm/amdgpu: move vram inline functions into a header

2022-01-26 Thread Arunpravin

Move shared vram inline functions and structs
into a header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 51 
 1 file changed, 51 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..59983464cce5
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,51 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+static inline u64 amdgpu_node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline u64 amdgpu_node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+#endif
-- 
2.25.1

[PATCH v10 5/5] drm/amdgpu: add drm buddy support to amdgpu

2022-01-26 Thread Arunpravin

- Remove drm_mm references and replace with drm buddy functionalities
- Add res cursor support for drm buddy

v2(Matthew Auld):
  - replace spinlock with mutex as we call kmem_cache_zalloc
(..., GFP_KERNEL) in drm_buddy_alloc() function

  - lock drm_buddy_block_trim() function as it calls
mark_free/mark_split, which are all globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

v4:
  - fix warnings reported by kernel test robot 

v5:
  - fix merge conflict issue

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Kconfig   |   1 +
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 259 ++
 4 files changed, 231 insertions(+), 133 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index dfdd3ec5f793..eb5a57ae3c5c 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -279,6 +279,7 @@ config DRM_AMDGPU
select HWMON
select BACKLIGHT_CLASS_DEVICE
select INTERVAL_TREE
+   select DRM_BUDDY
help
  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..da12b4ff2e45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= amdgpu_node_size(block)) {
+   start -= amdgpu_node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = amdgpu_node_start(block) + start;
+   cur->size = min(amdgpu_node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+

[PATCH v10 2/5] drm: implement top-down allocation method

2022-01-26 Thread Arunpravin

Implement a function which walks through the order list,
compares the offsets and returns the block with the maximum offset.
This method is unpredictable in obtaining the high range
address blocks, since the result depends on the allocation and
deallocation history. For instance, if the driver requests an
address in a specific low range, the allocator traverses from the
root block and splits the larger blocks until it reaches the
specific block. In the process of splitting, the lower orders in
the freelist are populated with low range address blocks, so a
subsequent TOPDOWN memory request may return low range
blocks. To overcome this issue, we may go with the
below approach.

The other approach is to sort the entries of each order list in
ascending order, compare the last entry of each
order list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and splits up larger blocks for a single page
request.
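For readers skimming the thread, the first method (scan one order's free
list for the highest offset, starting at the requested order) can be
modelled in userspace roughly like this. struct free_block, MAX_ORDER and
both function names are simplified, hypothetical stand-ins, not the
drm_buddy types or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ORDER 4	/* hypothetical, just for this sketch */

/* Hypothetical, simplified free block: not struct drm_buddy_block. */
struct free_block {
	uint64_t offset;
	struct free_block *next;	/* next block on the same order's free list */
};

/* Return the block with the highest offset on one free list, or NULL. */
static struct free_block *max_offset_block(struct free_block *head)
{
	struct free_block *max = head;
	struct free_block *b;

	for (b = head; b; b = b->next)
		if (b->offset > max->offset)
			max = b;
	return max;
}

/* Walk orders upwards and take the max-offset block of the first non-empty list. */
static struct free_block *pick_topdown(struct free_block *free_list[MAX_ORDER + 1],
				       unsigned int order)
{
	unsigned int i;

	for (i = order; i <= MAX_ORDER; i++) {
		struct free_block *b = max_offset_block(free_list[i]);

		if (b)
			return b;
	}
	return NULL;
}
```

Note that in this model the first non-empty order wins, even if a higher
order holds a block at a higher offset, so the result depends on how the
free lists happen to be populated.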

v2:
  - Fix alignment issues(Matthew Auld)
  - Remove unnecessary list_empty check(Matthew Auld)
  - merged the below patch to see the feature in action
 - add top-down alloc support to i915 driver

Signed-off-by: Arunpravin 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/drm_buddy.c   | 36 ---
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  3 ++
 include/drm/drm_buddy.h   |  1 +
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index cfc160a1ef1a..30cad939a112 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -369,6 +369,26 @@ alloc_range_bias(struct drm_buddy *mm,
return ERR_PTR(err);
 }
 
+static struct drm_buddy_block *
+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
 static struct drm_buddy_block *
 alloc_from_freelist(struct drm_buddy *mm,
unsigned int order,
@@ -379,11 +399,17 @@ alloc_from_freelist(struct drm_buddy *mm,
int err;
 
for (i = order; i <= mm->max_order; ++i) {
-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   block = get_maxblock(&mm->free_list[i]);
+   if (block)
+   break;
+   } else {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+   if (block)
+   break;
+   }
}
 
if (!block)
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index b9b420cabc14..45b091626278 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
INIT_LIST_HEAD(&bman_res->blocks);
bman_res->mm = mm;
 
+   if (place->flags & TTM_PL_FLAG_TOPDOWN)
+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
 
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h
index 54f25a372f27..f0378fb48d06 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -23,6 +23,7 @@
 })
 
 #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)
+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
 
 struct drm_buddy_block {
 #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
-- 
2.25.1

[PATCH v10 3/5] drm: implement a method to free unused pages

2022-01-26 Thread Arunpravin

On contiguous allocation, we round up the size
to the *next* power of 2. Implement a function
to free the unused pages after the newly allocated block.
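As a quick worked example of the arithmetic (a userspace sketch only, the
helper names are made up and are not the kernel's roundup_pow_of_two()):
a 5 page contiguous request is rounded up to 8 pages, so a trim could
hand back 3 pages.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next power of two. Userspace helper, not the kernel macro. */
static uint64_t round_up_pow2(uint64_t v)
{
	uint64_t p = 1;

	/* v must be non-zero and small enough that the shift cannot overflow */
	assert(v > 0 && v <= ((uint64_t)1 << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Pages left unused after rounding a contiguous request up. */
static uint64_t trimmable_pages(uint64_t requested_pages)
{
	return round_up_pow2(requested_pages) - requested_pages;
}
```

A request that is already a power of two leaves nothing to trim, which
matches the early return for new_size == block size in the patch.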

v2(Matthew Auld):
  - replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
  - replace input argument name 'actual_size' with 'new_size'
  - add more validation checks for input arguments
  - add overlaps check to avoid needless searching and splitting
  - merged the below patch to see the feature in action
 - add free unused pages support to i915 driver
  - lock drm_buddy_block_trim() function as it calls mark_free/mark_split,
which are all globally visible

v3(Matthew Auld):
  - remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function

v4:
  - in case of trim, at __alloc_range() split_block failure path
marks the block as free and removes it from the original list,
potentially also freeing it, to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes(Matthew Auld)

  - modify the drm_buddy_block_trim() function return type

v5(Matthew Auld):
  - revert drm_buddy_block_trim() function return type changes in v4
  - modify drm_buddy_block_trim() passing argument n_pages to original_size
as n_pages has already been rounded up to the next power-of-two and
passing n_pages results in a noop

v6:
  - fix warnings reported by kernel test robot 

v7:
  - modify drm_buddy_block_trim() function doc description
  - at drm_buddy_block_trim() handle non-allocated block as
a serious programmer error
  - fix a typo

Signed-off-by: Arunpravin 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/drm_buddy.c   | 69 +++
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++
 include/drm/drm_buddy.h   |  4 ++
 3 files changed, 83 insertions(+)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 30cad939a112..4845ef784b5e 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -542,6 +542,75 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm,
return __alloc_range(mm, &dfs, start, size, blocks);
 }
 
+/**
+ * drm_buddy_block_trim - free unused pages
+ *
+ * @mm: DRM buddy manager
+ * @new_size: original size requested
+ * @blocks: Input and output list of allocated blocks.
+ * MUST contain single block as input to be trimmed.
+ * On success will contain the newly allocated blocks
+ * making up the @new_size. Blocks always appear in
+ * ascending order
+ *
+ * For contiguous allocation, we round up the size to the nearest
+ * power of two value, drivers consume *actual* size, so remaining
+ * portions are unused and can be optionally freed with this function
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_block_trim(struct drm_buddy *mm,
+u64 new_size,
+struct list_head *blocks)
+{
+   struct drm_buddy_block *parent;
+   struct drm_buddy_block *block;
+   LIST_HEAD(dfs);
+   u64 new_start;
+   int err;
+
+   if (!list_is_singular(blocks))
+   return -EINVAL;
+
+   block = list_first_entry(blocks,
+struct drm_buddy_block,
+link);
+
+   if (WARN_ON(!drm_buddy_block_is_allocated(block)))
+   return -EINVAL;
+
+   if (new_size > drm_buddy_block_size(mm, block))
+   return -EINVAL;
+
+   if (!new_size || !IS_ALIGNED(new_size, mm->chunk_size))
+   return -EINVAL;
+
+   if (new_size == drm_buddy_block_size(mm, block))
+   return 0;
+
+   list_del(&block->link);
+   mark_free(mm, block);
+   mm->avail += drm_buddy_block_size(mm, block);
+
+   /* Prevent recursively freeing this node */
+   parent = block->parent;
+   block->parent = NULL;
+
+   new_start = drm_buddy_block_offset(block);
+   list_add(&block->tmp_link, &dfs);
+   err =  __alloc_range(mm, &dfs, new_start, new_size, blocks);
+   if (err) {
+   mark_allocated(block);
+   mm->avail -= drm_buddy_block_size(mm, block);
+   list_add(&block->link, blocks);
+   }
+
+   block->parent = parent;
+   return err;
+}
+EXPORT_SYMBOL(drm_buddy_block_trim);
+
 /**
  * drm_buddy_alloc_blocks - allocate power-of-two blocks
  *
diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 45b091626278..b52684552523 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -97,6 +97,16 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
if (unlikely(err))
goto err_free_blocks;
 
+   if (place->flags & TTM_PL_FLAG_CO

[PATCH v10 1/5] drm: improve drm_buddy_alloc function

2022-01-26 Thread Arunpravin

- Make drm_buddy_alloc a single function to handle
  range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
  the requested power-of-two block in compliance with range limitations

- Moved order computation and memory alignment logic from
  i915 driver to drm buddy
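The order computation mentioned in the last item can be pictured with the
following userspace sketch. It assumes that "order computation" means
picking the largest power-of-two block that still fits the remaining page
count, capped at a maximum order. The function names and the cap parameter
are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the highest set bit (n must be non-zero), like fls(n) - 1. */
static unsigned int highest_order(uint64_t n)
{
	unsigned int order = 0;

	while (n >>= 1)
		order++;
	return order;
}

/*
 * Split n_pages into descending power-of-two orders.
 * Writes at most `cap` entries and returns how many were written.
 */
static unsigned int split_into_orders(uint64_t n_pages, unsigned int max_order,
				      unsigned int *orders, unsigned int cap)
{
	unsigned int count = 0;

	while (n_pages && count < cap) {
		unsigned int order = highest_order(n_pages);

		if (order > max_order)
			order = max_order;
		orders[count++] = order;
		n_pages -= (uint64_t)1 << order;
	}
	return count;
}
```

For example, 13 pages with a large enough max order become blocks of order
3, 2 and 0 (8 + 4 + 1 pages).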

v2:
  merged below changes to keep the build unbroken
   - drm_buddy_alloc_range() becomes obsolete and may be removed
   - enable ttm range allocation (fpfn / lpfn) support in i915 driver
   - apply enhanced drm_buddy_alloc() function to i915 driver

v3(Matthew Auld):
  - Fix alignment issues and remove unnecessary list_empty check
  - add more validation checks for input arguments
  - make alloc_range() block allocations as bottom-up
  - optimize order computation logic
  - replace uint64_t with u64, which is preferred in the kernel

v4(Matthew Auld):
  - keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
  - keep alloc_range() implementation for end bias allocations

v5(Matthew Auld):
  - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915

v6(Matthew Auld):
  - fixup potential uaf - If we are unlucky and can't allocate
enough memory when splitting blocks, where we temporarily
end up with the given block and its buddy on the respective
free list, then we need to ensure we delete both blocks,
and not just the buddy, before potentially freeing them

  - fix warnings reported by kernel test robot 

v7(Matthew Auld):
  - revert fixup potential uaf
  - keep __alloc_range() add node to the list logic same as
drm_buddy_alloc_blocks() by having a temporary list variable
  - at drm_buddy_alloc_blocks() keep i915 range_overflows macro
and add a new check for end variable

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/drm_buddy.c   | 315 +-
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
 include/drm/drm_buddy.h   |  13 +-
 4 files changed, 280 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index d60878bc9c20..cfc160a1ef1a 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -282,23 +282,97 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
list_head *objects)
 }
 EXPORT_SYMBOL(drm_buddy_free_list);
 
-/**
- * drm_buddy_alloc_blocks - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the &drm_buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range_bias(struct drm_buddy *mm,
+u64 start, u64 end,
+unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(&mm->roots[i]->tmp_link, &dfs);
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(&dfs,
+struct drm_buddy_block,
+tmp_link);
+   if (!block)
+   break;
+
+   list_del(&block->tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end) &&
+   order == drm_buddy_block_order(block)) {
+   /*
+* Find the free block within the range.
+*/
+   if (drm_buddy_block_is_free(block))
+   return block;
+
+   continue;
+   }
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+   if (unlikely(err))
+   goto err_undo;
+   }
+
+

Re: [PATCH v9 4/6] drm: implement a method to free unused pages

2022-01-26 Thread Arunpravin





> -Original Message-
> From: amd-gfx  On Behalf Of Matthew 
> Auld
> Sent: Thursday, January 20, 2022 11:05 PM
> To: Paneer Selvam, Arunpravin ; 
> dri-devel@lists.freedesktop.org; intel-...@lists.freedesktop.org; 
> amd-...@lists.freedesktop.org
> Cc: Deucher, Alexander ; tzimmerm...@suse.de; 
> jani.nik...@linux.intel.com; Koenig, Christian ; 
> dan...@ffwll.ch
> Subject: Re: [PATCH v9 4/6] drm: implement a method to free unused pages
> 
> On 19/01/2022 11:37, Arunpravin wrote:
>> On contiguous allocation, we round up the size to the *next* power of 
>> 2, implement a function to free the unused pages after the newly 
>> allocate block.
>>
>> v2(Matthew Auld):
>>- replace function name 'drm_buddy_free_unused_pages' with
>>  drm_buddy_block_trim
>>- replace input argument name 'actual_size' with 'new_size'
>>- add more validation checks for input arguments
>>- add overlaps check to avoid needless searching and splitting
>>- merged the below patch to see the feature in action
>>   - add free unused pages support to i915 driver
>>- lock drm_buddy_block_trim() function as it calls mark_free/mark_split
>>  are all globally visible
>>
>> v3(Matthew Auld):
>>- remove trim method error handling as we address the failure case
>>  at drm_buddy_block_trim() function
>>
>> v4:
>>- in case of trim, at __alloc_range() split_block failure path
>>  marks the block as free and removes it from the original list,
>>  potentially also freeing it, to overcome this problem, we turn
>>  the drm_buddy_block_trim() input node into a temporary node to
>>  prevent recursively freeing itself, but still retain the
>>  un-splitting/freeing of the other nodes(Matthew Auld)
>>
>>- modify the drm_buddy_block_trim() function return type
>>
>> v5(Matthew Auld):
>>- revert drm_buddy_block_trim() function return type changes in v4
>>- modify drm_buddy_block_trim() passing argument n_pages to original_size
>>  as n_pages has already been rounded up to the next power-of-two and
>>  passing n_pages results noop
>>
>> v6:
>>- fix warnings reported by kernel test robot 
>>
>> Signed-off-by: Arunpravin 
>> ---
>>   drivers/gpu/drm/drm_buddy.c   | 65 +++
>>   drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +++
>>   include/drm/drm_buddy.h   |  4 ++
>>   3 files changed, 79 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c 
>> index 6aa5c1ce25bf..c5902a81b8c5 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/drivers/gpu/drm/drm_buddy.c
>> @@ -546,6 +546,71 @@ static int __drm_buddy_alloc_range(struct drm_buddy *mm,
>>  return __alloc_range(mm, &dfs, start, size, blocks);
>>   }
>>   
>> +/**
>> + * drm_buddy_block_trim - free unused pages
>> + *
>> + * @mm: DRM buddy manager
>> + * @new_size: original size requested
>> + * @blocks: output list head to add allocated blocks
> 
> @blocks: Input and output list of allocated blocks. MUST contain single block 
> as input to be trimmed. On success will contain the newly allocated blocks 
> making up the @new_size. Blocks always appear in ascending order.
> 
> ?
> 
>> + *
>> + * For contiguous allocation, we round up the size to the nearest
>> + * power of two value, drivers consume *actual* size, so remaining
>> + * portions are unused and it can be freed.
> 
> so remaining portions are unused and can be optionally freed with this 
> function.
> 
> ?
> 
>> + *
>> + * Returns:
>> + * 0 on success, error code on failure.
>> + */
>> +int drm_buddy_block_trim(struct drm_buddy *mm,
>> + u64 new_size,
>> + struct list_head *blocks)
>> +{
>> +struct drm_buddy_block *parent;
>> +struct drm_buddy_block *block;
>> +LIST_HEAD(dfs);
>> +u64 new_start;
>> +int err;
>> +
>> +if (!list_is_singular(blocks))
>> +return -EINVAL;
>> +
>> +block = list_first_entry(blocks,
>> + struct drm_buddy_block,
>> + link);
>> +
>> +if (!drm_buddy_block_is_allocated(block))
> 
> Maybe:
> 
> if (WARN_ON(!drm_buddy_block_is_allocated()))
> 
> AFAIK it should be normally impossible to be handed such non-allocated block, 
> and so should be treated as a serious programmer error.
> 
> ?
> 
>> +return -EINVAL;
>> +
>> +if (new_size > drm_buddy_block_size(mm, block))
>> +return -EINVAL;
>> +
>> +if (!new_size && !IS_ALIGNED(new_size, mm->chunk_size))
>> +return -EINVAL;
> 
> I assume that's a typo:
> 
> if (!new_size || ...)
> 
> Otherwise I think looks good. Some unit tests for this would be nice, but not 
> a blocker. And this does at least pass the igt_mock_contiguous selftest, and 
> I didn't see anything nasty when running on DG1, which does make use of 
> TTM_PL_FLAG_CONTIGUOUS,

Good to hear it's running on DG1. All changes are added to v10. working
on m
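
For readers following the trim discussion, the rounding that drm_buddy_block_trim() is meant to undo can be sketched in plain userspace C. This is only an illustration under stated assumptions: the helper names (round_up_pow2, trim_size_ok, trim_tail) and the 4 KiB chunk size used in the usage note are made up here and are not part of drm_buddy. The size check uses the `||` form suggested in the review above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: round v up to the next power of two (v > 0). */
static uint64_t round_up_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v > 0);
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the argument checks discussed above:
 * new_size must be non-zero, a multiple of chunk_size, and must not
 * exceed the block being trimmed. chunk_size is assumed to be a
 * power of two.
 */
static bool trim_size_ok(uint64_t new_size, uint64_t block_size,
			 uint64_t chunk_size)
{
	if (new_size > block_size)
		return false;
	if (!new_size || (new_size & (chunk_size - 1)))
		return false;
	return true;
}

/* Bytes that become free again once the block is trimmed to new_size. */
static uint64_t trim_tail(uint64_t new_size, uint64_t block_size)
{
	return block_size - new_size;
}
```

With a 4 KiB chunk, a contiguous request for 5 chunks is rounded up to 8 chunks, so trimming back to the original size would hand 3 chunks back to the allocator.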

Re: [PATCH v2] drm/bridge/tc358775: Fix for dual-link LVDS

2022-01-26 Thread Jiří Vaněk

AUO P215HVN01.0 /   AUO G215HVN01.0

On Thu, Jan 6, 2022 at 20:22, Vinay Simha B N
wrote:

> Reviewed-by: Vinay Simha BN 
>
> Jiri Vanek,
> Could you please share the part number or datasheet of the dual-link LVDS
> display/panel used.
>
>
> On Fri, Jan 7, 2022 at 12:30 AM Jiri Vanek  wrote:
>
>> Fixed wrong register shift for single/dual link LVDS output.
>>
>> Tested-by: Jiri Vanek 
>> Signed-off-by: Jiri Vanek 
>>
>> ---
>> v1:
>> * Initial version
>>
>> v2:
>> * Tested-by tag added
>>
>> ---
>>  drivers/gpu/drm/bridge/tc358775.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/bridge/tc358775.c
>> b/drivers/gpu/drm/bridge/tc358775.c
>> index 2272adcc5b4a..1d6ec1baeff2 100644
>> --- a/drivers/gpu/drm/bridge/tc358775.c
>> +++ b/drivers/gpu/drm/bridge/tc358775.c
>> @@ -241,7 +241,7 @@ static inline u32 TC358775_LVCFG_PCLKDIV(uint32_t val)
>>  }
>>
>>  #define TC358775_LVCFG_LVDLINK__MASK 0x0002
>> -#define TC358775_LVCFG_LVDLINK__SHIFT0
>> +#define TC358775_LVCFG_LVDLINK__SHIFT1
>>  static inline u32 TC358775_LVCFG_LVDLINK(uint32_t val)
>>  {
>> return ((val) << TC358775_LVCFG_LVDLINK__SHIFT) &
>> --
>> 2.30.2
>>
>>
>
> --
> regards,
> vinaysimha
>
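
The effect of the one-line fix can be checked outside the driver. The sketch below is an illustration, not driver code: it copies the mask value from the patch, and the names LVDLINK_MASK, LVDLINK_SHIFT_OLD, LVDLINK_SHIFT_NEW and field_prep are made up for this example. It applies the same shift-then-mask expression as TC358775_LVCFG_LVDLINK().

```c
#include <assert.h>
#include <stdint.h>

#define LVDLINK_MASK		0x0002u	/* bit 1, value taken from the patch */
#define LVDLINK_SHIFT_OLD	0	/* shift before the fix */
#define LVDLINK_SHIFT_NEW	1	/* shift after the fix */

/* The shift has to point at the lowest bit covered by the mask. */
static_assert(((LVDLINK_MASK >> LVDLINK_SHIFT_NEW) & 1u) == 1u,
	      "shift must match the lowest mask bit");

/* Place a field value into a register word: shift first, then mask. */
static uint32_t field_prep(uint32_t val, unsigned int shift, uint32_t mask)
{
	return (val << shift) & mask;
}
```

With the old shift, a request for dual link (val = 1) lands on bit 0 and is then masked away, so the register field is never set. With the new shift the same request sets bit 1.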

Re: [PATCH v9 2/6] drm: improve drm_buddy_alloc function

2022-01-26 Thread Arunpravin




On 21/01/22 5:30 pm, Matthew Auld wrote:
> On 19/01/2022 11:37, Arunpravin wrote:
>> - Make drm_buddy_alloc a single function to handle
>>range allocation and non-range allocation demands
>>
>> - Implemented a new function alloc_range() which allocates
>>the requested power-of-two block in compliance with range limitations
>>
>> - Moved order computation and memory alignment logic from
>>i915 driver to drm buddy
>>
>> v2:
>>merged below changes to keep the build unbroken
>> - drm_buddy_alloc_range() becomes obsolete and may be removed
>> - enable ttm range allocation (fpfn / lpfn) support in i915 driver
>> - apply enhanced drm_buddy_alloc() function to i915 driver
>>
>> v3(Matthew Auld):
>>- Fix alignment issues and remove unnecessary list_empty check
>>- add more validation checks for input arguments
>>- make alloc_range() block allocations as bottom-up
>>- optimize order computation logic
>>- replace uint64_t with u64, which is preferred in the kernel
>>
>> v4(Matthew Auld):
>>- keep drm_buddy_alloc_range() function implementation for generic
>>  actual range allocations
>>- keep alloc_range() implementation for end bias allocations
>>
>> v5(Matthew Auld):
>>- modify drm_buddy_alloc() passing argument place->lpfn to lpfn
>>  as place->lpfn will currently always be zero for i915
>>
>> v6(Matthew Auld):
>>- fixup potential uaf - If we are unlucky and can't allocate
>>  enough memory when splitting blocks, where we temporarily
>>  end up with the given block and its buddy on the respective
>>  free list, then we need to ensure we delete both blocks,
>>  and not just the buddy, before potentially freeing them
> 
> Hmm, not sure we really want to squash existing bug fixes into this 
> patch. Perhaps bring in [1] to the start of your series? i915_buddy is 
> gone now. Alternatively I can resend such that it applies on top 
> drm_buddy. Your choice.
> 
> [1] 
> https://patchwork.freedesktop.org/patch/469806/?series=98953&rev=1
> 

I will revert this fix in v10; please resend on top of drm_buddy.
>>
>>- fix warnings reported by kernel test robot 
>>
>> Signed-off-by: Arunpravin 
>> ---
>>   drivers/gpu/drm/drm_buddy.c   | 326 +-
>>   drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
>>   drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
>>   include/drm/drm_buddy.h   |  22 +-
>>   4 files changed, 293 insertions(+), 124 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
>> index d60878bc9c20..954e31962c74 100644
>> --- a/drivers/gpu/drm/drm_buddy.c
>> +++ b/drivers/gpu/drm/drm_buddy.c
>> @@ -282,23 +282,99 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
>> list_head *objects)
>>   }
>>   EXPORT_SYMBOL(drm_buddy_free_list);
>>   
>> -/**
>> - * drm_buddy_alloc_blocks - allocate power-of-two blocks
>> - *
>> - * @mm: DRM buddy manager to allocate from
>> - * @order: size of the allocation
>> - *
>> - * The order value here translates to:
>> - *
>> - * 0 = 2^0 * mm->chunk_size
>> - * 1 = 2^1 * mm->chunk_size
>> - * 2 = 2^2 * mm->chunk_size
>> - *
>> - * Returns:
>> - * allocated ptr to the &drm_buddy_block on success
>> - */
>> -struct drm_buddy_block *
>> -drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
>> +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
>> +{
>> +return s1 <= e2 && e1 >= s2;
>> +}
>> +
>> +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
>> +{
>> +return s1 <= s2 && e1 >= e2;
>> +}
>> +
>> +static struct drm_buddy_block *
>> +alloc_range_bias(struct drm_buddy *mm,
>> + u64 start, u64 end,
>> + unsigned int order)
>> +{
>> +struct drm_buddy_block *block;
>> +struct drm_buddy_block *buddy;
>> +LIST_HEAD(dfs);
>> +int err;
>> +int i;
>> +
>> +end = end - 1;
>> +
>> +for (i = 0; i < mm->n_roots; ++i)
>> +list_add_tail(&mm->roots[i]->tmp_link, &dfs);
>> +
>> +do {
>> +u64 block_start;
>> +u64 block_end;
>> +
>> +block = list_first_entry_or_null(&dfs,
>> + struct drm_buddy_block,
>> + tmp_link);
>> +if (!block)
>> +break;
>> +
>> +list_del(&block->tmp_link);
>> +
>> +if (drm_buddy_block_order(block) < order)
>> +continue;
>> +
>> +block_start = drm_buddy_block_offset(block);
>> +block_end = block_star
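
The two range helpers in the quoted hunk work on inclusive ranges, which is why the patch does `end = end - 1` first. The quoted diff is cut off before the helpers are used, so the classification below is only an assumption about how such helpers are typically combined in a range walk. The names classify_block and block_verdict are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both ranges are inclusive: [s1, e1] and [s2, e2]. */
static bool overlaps(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= e2 && e1 >= s2;
}

/* True if [s1, e1] fully covers [s2, e2]. */
static bool contains(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= s2 && e1 >= e2;
}

enum block_verdict { BLOCK_OUTSIDE, BLOCK_INSIDE, BLOCK_PARTIAL };

/*
 * Hypothetical classification of one block against a requested range.
 * The range is given as [start, end) and converted to an inclusive
 * range, as the patch does.
 */
static enum block_verdict classify_block(uint64_t start, uint64_t end,
					 uint64_t block_start,
					 uint64_t block_size)
{
	uint64_t last, block_last;

	assert(end > start && block_size > 0);
	last = end - 1;
	block_last = block_start + block_size - 1;

	if (!overlaps(start, last, block_start, block_last))
		return BLOCK_OUTSIDE;	/* skip this block */
	if (contains(start, last, block_start, block_last))
		return BLOCK_INSIDE;	/* candidate for allocation */
	return BLOCK_PARTIAL;		/* would need splitting */
}
```

A block that starts exactly at the exclusive end of the range is reported as outside, which is the case the inclusive conversion exists to get right.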

[PATCH v2] drm/bridge: synopsys/dw-hdmi: set cec clock rate

2022-01-26 Thread Peter Geis

The hdmi-cec clock must run at 32 kHz (32768 Hz) for CEC to work correctly.
Set the clock rate before enabling the clock so that the hardware works
as expected.
Fixes hdmi-cec support on Rockchip devices.

Fixes: ebe32c3e282a ("drm/bridge: synopsys/dw-hdmi: Enable cec clock")

Signed-off-by: Peter Geis 
---
Changelog:
v2:
- Set the clock rate before enabling the clock
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..65c16455b76a 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -48,6 +48,9 @@
 
 #define HDMI14_MAX_TMDSCLK 34000
 
+/* HDMI CEC needs a clock rate of 32khz */
+#define HDMI_CEC_CLK_RATE  32768
+
 enum hdmi_datamap {
RGB444_8B = 0x01,
RGB444_10B = 0x03,
@@ -3341,6 +3344,10 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device 
*pdev,
hdmi->cec_clk = NULL;
goto err_iahb;
} else {
+   ret = clk_set_rate(hdmi->cec_clk, HDMI_CEC_CLK_RATE);
+   if (ret)
+   dev_warn(hdmi->dev, "Cannot set HDMI cec clock rate: 
%d\n", ret);
+
ret = clk_prepare_enable(hdmi->cec_clk);
if (ret) {
dev_err(hdmi->dev, "Cannot enable HDMI cec clock: %d\n",
-- 
2.25.1

Re: [PATCH] drm/v3d: Add missing unlock

2022-01-26 Thread Melissa Wen

On 01/22, Yongzhi Liu wrote:
> [why]
> Unlock is needed on the error handling path to prevent deadlock.
> 
> [how]
> Fix this by adding drm_gem_unlock_reservations on the error handling path.
> 
> Signed-off-by: Yongzhi Liu 
> ---
>  drivers/gpu/drm/v3d/v3d_gem.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index c7ed2e1..0c989dc 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -798,6 +798,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
>  
>   if (!render->base.perfmon) {
>   ret = -ENOENT;
> + drm_gem_unlock_reservations(last_job->bo,
> + last_job->bo_count, &acquire_ctx);
>   goto fail;
Hi,

Nice catch!

As the unlock is handled in fail_unreserve, I would suggest keeping the
failure handling around there. In that case, the goto will target a new
label placed between `fail_unreserve:` and `fail:`, i.e. one that calls
drm_gem_unlock_reservations (and the following cleanup) but doesn't call
mutex_unlock.

Thanks,

Melissa

>   }
>   }
> @@ -1027,6 +1029,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
>args->perfmon_id);
>   if (!job->base.perfmon) {
>   ret = -ENOENT;
> + drm_gem_unlock_reservations(clean_job->bo, 
> clean_job->bo_count,
> + &acquire_ctx);
>   goto fail;
>   }
>   }
> -- 
> 2.7.4
> 
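
The label placement suggested in the review can be sketched with a userspace mock. Everything here is a stand-in: the fake_state counters replace the real reservations and mutex, and the label name fail_perfmon is made up for the example. Only the ordering of the unwind labels follows the description in the reply.

```c
#include <assert.h>
#include <errno.h>

/* Where the simulated submit path should fail. */
enum fail_at { FAIL_NONE, FAIL_LOCK, FAIL_PERFMON, FAIL_PUSH };

/* Stand-ins for the real resources, so that balance can be checked. */
struct fake_state {
	int job_alive;
	int reservations_held;
	int mutex_held;
	int bad_unlocks;	/* unlocks of a mutex that was not held */
};

static void fake_mutex_unlock(struct fake_state *st)
{
	if (!st->mutex_held)
		st->bad_unlocks++;
	st->mutex_held = 0;
}

static int fake_submit(struct fake_state *st, enum fail_at where)
{
	int ret = 0;

	assert(st);
	st->job_alive = 1;

	if (where == FAIL_LOCK) {
		ret = -EDEADLK;
		goto fail;		/* nothing locked yet */
	}
	st->reservations_held = 1;

	if (where == FAIL_PERFMON) {
		ret = -ENOENT;
		goto fail_perfmon;	/* reservations held, mutex not taken */
	}

	st->mutex_held = 1;
	if (where == FAIL_PUSH) {
		ret = -ENOMEM;
		goto fail_unreserve;	/* both held */
	}

	fake_mutex_unlock(st);
	st->reservations_held = 0;
	return 0;

fail_unreserve:
	fake_mutex_unlock(st);
fail_perfmon:
	st->reservations_held = 0;
fail:
	st->job_alive = 0;
	return ret;
}
```

Jumping to the middle label releases the reservations without touching a mutex that was never taken, and it keeps all unlock calls in one place instead of repeating them at each error site.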



Re: [PATCH] drm: rcar-du: Drop LVDS device tree backward compatibility

2022-01-26 Thread Kieran Bingham

Quoting Laurent Pinchart (2022-01-26 20:29:56)
> The rcar-du driver goes to great lengths to preserve device tree
> backward compatibility for the LVDS encoders by patching old device
> trees at runtime.
> 
> The last R-Car Gen2 platform was converted to the new bindings in commit
> edb0c3affe5214a2 ("ARM: dts: r8a7793: Convert to new LVDS DT bindings"),
> in v4.17, and the last RZ/G1 platform was converted in commit
> 6a6a797625b5fe85 ("ARM: dts: r8a7743: Convert to new LVDS DT bindings"),
> in v5.0. Both are older than commit 58256143cff7c2e0 ("clk: renesas:
> Remove R-Car Gen2 legacy DT clock support"), in v5.5, which removes
> support for legacy bindings for clocks. The LBDS compatibility code is

s/LBDS/LVDS/

> thus not needed anymore. Drop it.

Oh, I'm almost sad to see such exotic code go...

But code gone is less code to worry about so:

Reviewed-by: Kieran Bingham 

> Signed-off-by: Laurent Pinchart 
> ---
>  drivers/gpu/drm/rcar-du/Makefile  |   6 -
>  drivers/gpu/drm/rcar-du/rcar_du_drv.c |  15 +-
>  drivers/gpu/drm/rcar-du/rcar_du_of.c  | 323 --
>  drivers/gpu/drm/rcar-du/rcar_du_of.h  |  20 --
>  .../drm/rcar-du/rcar_du_of_lvds_r8a7790.dts   |  69 
>  .../drm/rcar-du/rcar_du_of_lvds_r8a7791.dts   |  43 ---
>  .../drm/rcar-du/rcar_du_of_lvds_r8a7793.dts   |  43 ---
>  .../drm/rcar-du/rcar_du_of_lvds_r8a7795.dts   |  43 ---
>  .../drm/rcar-du/rcar_du_of_lvds_r8a7796.dts   |  43 ---
>  9 files changed, 1 insertion(+), 604 deletions(-)
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of.c
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of.h
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts
>  delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts
> 
> diff --git a/drivers/gpu/drm/rcar-du/Makefile 
> b/drivers/gpu/drm/rcar-du/Makefile
> index 286bc81b3e7c..e7275b5e7ec8 100644
> --- a/drivers/gpu/drm/rcar-du/Makefile
> +++ b/drivers/gpu/drm/rcar-du/Makefile
> @@ -6,12 +6,6 @@ rcar-du-drm-y := rcar_du_crtc.o \
>  rcar_du_kms.o \
>  rcar_du_plane.o \
>  
> -rcar-du-drm-$(CONFIG_DRM_RCAR_LVDS)+= rcar_du_of.o \
> -  rcar_du_of_lvds_r8a7790.dtb.o \
> -  rcar_du_of_lvds_r8a7791.dtb.o \
> -  rcar_du_of_lvds_r8a7793.dtb.o \
> -  rcar_du_of_lvds_r8a7795.dtb.o \
> -  rcar_du_of_lvds_r8a7796.dtb.o
>  rcar-du-drm-$(CONFIG_DRM_RCAR_VSP) += rcar_du_vsp.o
>  rcar-du-drm-$(CONFIG_DRM_RCAR_WRITEBACK) += rcar_du_writeback.o
>  
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> index 5a8131ef81d5..71a9df5a4834 100644
> --- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> +++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
> @@ -28,7 +28,6 @@
>  
>  #include "rcar_du_drv.h"
>  #include "rcar_du_kms.h"
> -#include "rcar_du_of.h"
>  #include "rcar_du_regs.h"
>  
>  /* 
> -
> @@ -699,19 +698,7 @@ static struct platform_driver rcar_du_platform_driver = {
> },
>  };
>  
> -static int __init rcar_du_init(void)
> -{
> -   rcar_du_of_init(rcar_du_of_table);
> -
> -   return platform_driver_register(&rcar_du_platform_driver);
> -}
> -module_init(rcar_du_init);
> -
> -static void __exit rcar_du_exit(void)
> -{
> -   platform_driver_unregister(&rcar_du_platform_driver);
> -}
> -module_exit(rcar_du_exit);
> +module_platform_driver(rcar_du_platform_driver);
>  
>  MODULE_AUTHOR("Laurent Pinchart ");
>  MODULE_DESCRIPTION("Renesas R-Car Display Unit DRM Driver");
> diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of.c 
> b/drivers/gpu/drm/rcar-du/rcar_du_of.c
> deleted file mode 100644
> index afef69669bb4..
> --- a/drivers/gpu/drm/rcar-du/rcar_du_of.c
> +++ /dev/null
> @@ -1,323 +0,0 @@
> -// SPDX-License-Identifier: GPL-2.0
> -/*
> - * rcar_du_of.c - Legacy DT bindings compatibility
> - *
> - * Copyright (C) 2018 Laurent Pinchart 
> - *
> - * Based on work from Jyri Sarha 
> - * Copyright (C) 2015 Texas Instruments
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#include "rcar_du_crtc.h"
> -#include "rcar_du_drv.h"
> -#include "rcar_du_of.h"
> -
> -/* 
> -
> - * Generic Overlay Handling
> - */
> -
> -struct rcar_du_of_overlay {
> -   const char *compatible;
> -   void *begin;
> -   void *end;
> -};
> -
> -#define RCAR_DU_OF_DTB(type, soc)  \
> -   extern char

[PATCH 09/19] dma-buf-map: Add wrapper over memset

2022-01-26 Thread Lucas De Marchi

Just like memcpy_toio(), there is also a need to write a direct value to a
memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io().

Cc: Matt Roper 
Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 include/linux/dma-buf-map.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 3514a859f628..c9fb04264cd0 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct 
dma_buf_map *dst, const void *sr
memcpy(dst->vaddr, src, len);
 }
 
+/**
+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst:   The dma-buf mapping structure
+ * @value: The value to set
+ * @len:   The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, 
size_t len)
+{
+   if (dst->is_iomem)
+   memset_io(dst->vaddr_iomem, value, len);
+   else
+   memset(dst->vaddr, value, len);
+}
+
 /**
  * dma_buf_map_incr - Increments the address stored in a dma-buf mapping
  * @map:   The dma-buf mapping structure
-- 
2.35.0
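
The dispatch this helper performs can be tried in userspace with a mock. This is a sketch under loud assumptions: struct buf_map only imitates the shape of struct dma_buf_map, and fake_memset_io() is a byte-wise volatile loop standing in for the kernel's memset_io(), which has no userspace equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Imitation of the mapping: one address, tagged by memory type. */
struct buf_map {
	union {
		volatile unsigned char *vaddr_iomem;	/* stands in for void __iomem * */
		void *vaddr;
	};
	bool is_iomem;
};

/* Stand-in for memset_io(): byte-wise stores through a volatile pointer. */
static void fake_memset_io(volatile unsigned char *dst, int value, size_t len)
{
	while (len--)
		*dst++ = (unsigned char)value;
}

/* Pick the access method based on where the buffer lives. */
static void buf_map_memset(struct buf_map *dst, int value, size_t len)
{
	assert(dst);
	if (dst->is_iomem)
		fake_memset_io(dst->vaddr_iomem, value, len);
	else
		memset(dst->vaddr, value, len);
}
```

Callers only ever see buf_map_memset(), so code written against the mapping does not need to know which kind of memory backs it.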

[PATCH 17/19] drm/i915/guc: Convert guc_mmio_reg_state_init to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Now that the regset list is prepared, convert guc_mmio_reg_state_init()
to use dma_buf_map to copy the array to the final location and
initialize additional fields in ads.reg_state_list.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 30 +-
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 390101ee3661..cb0f543b0e86 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -372,40 +372,46 @@ static long guc_mmio_reg_state_create(struct intel_guc 
*guc)
return total * sizeof(struct guc_mmio_reg);
 }
 
-static void guc_mmio_reg_state_init(struct intel_guc *guc,
-   struct __guc_ads_blob *blob)
+static void guc_mmio_reg_state_init(struct intel_guc *guc)
 {
+   struct dma_buf_map ads_regset_map;
struct intel_gt *gt = guc_to_gt(guc);
struct intel_engine_cs *engine;
-   struct guc_mmio_reg *ads_registers;
enum intel_engine_id id;
u32 addr_ggtt, offset;
 
offset = guc_ads_regset_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
+   ads_regset_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);
 
-   memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size);
+   dma_buf_map_memcpy_to(&ads_regset_map, guc->ads_regset,
+ guc->ads_regset_size);
 
for_each_engine(engine, gt, id) {
u32 count = guc->ads_regset_count[id];
-   struct guc_mmio_reg_set *ads_reg_set;
u8 guc_class;
 
/* Class index is checked in class converter */
GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
 
guc_class = engine_class_to_guc_class(engine->class);
-   ads_reg_set = 
&blob->ads.reg_state_list[guc_class][engine->instance];
 
if (!count) {
-   ads_reg_set->address = 0;
-   ads_reg_set->count = 0;
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].address,
+  0);
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].count,
+  0);
continue;
}
 
-   ads_reg_set->address = addr_ggtt;
-   ads_reg_set->count = count;
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].address,
+  addr_ggtt);
+   ads_blob_write(guc,
+  
ads.reg_state_list[guc_class][engine->instance].count,
+  count);
 
addr_ggtt += count * sizeof(struct guc_mmio_reg);
}
@@ -635,7 +641,7 @@ static void __guc_ads_init(struct intel_guc *guc)
blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
 
/* MMIO save/restore list */
-   guc_mmio_reg_state_init(guc, blob);
+   guc_mmio_reg_state_init(guc);
 
/* Private Data */
blob->ads.private_data = base + guc_ads_private_data_offset(guc);
-- 
2.35.0
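
The address bookkeeping in the loop above can be isolated from the driver. The sketch below is an illustration with hypothetical types: struct mmio_reg and struct reg_set only mimic the roles of guc_mmio_reg and guc_mmio_reg_set, and fill_reg_sets() is not a driver function. It shows how each engine gets the running address of its slice in the copied array, and how engines without registers get a zero address and count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only. */
struct mmio_reg {
	uint32_t offset;
	uint32_t value;
	uint32_t flags;
};

struct reg_set {
	uint32_t address;	/* device address of this engine's slice */
	uint32_t count;		/* number of registers in the slice */
};

/*
 * The registers of all engines sit back to back starting at base.
 * Walk the per-engine counts and record where each slice starts.
 */
static void fill_reg_sets(struct reg_set *sets, const uint32_t *counts,
			  size_t n, uint32_t base)
{
	uint32_t addr = base;
	size_t i;

	assert(sets && counts);
	for (i = 0; i < n; i++) {
		if (!counts[i]) {
			sets[i].address = 0;
			sets[i].count = 0;
			continue;
		}

		sets[i].address = addr;
		sets[i].count = counts[i];
		addr += counts[i] * (uint32_t)sizeof(struct mmio_reg);
	}
}
```

Because the array was already laid out in engine order by the earlier pass, this loop only has to advance one address, which keeps the number of writes into the mapped buffer small.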

[PATCH 19/19] drm/i915/guc: Remove plain ads_blob pointer

2022-01-26 Thread Lucas De Marchi

Now we have access to the contents of the GuC ADS either through the
dma_buf_map API or through a temporary buffer. Remove guc->ads_blob, as
there shouldn't be updates using the bare pointer anymore.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h | 3 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4c852eee3ad8..7349483d0e35 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -147,8 +147,7 @@ struct intel_guc {
 
/** @ads_vma: object allocated to hold the GuC ADS */
struct i915_vma *ads_vma;
-   /** @ads_blob: contents of the GuC ADS */
-   struct __guc_ads_blob *ads_blob;
+   /** @ads_map: contents of the GuC ADS */
struct dma_buf_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 30edac93afbf..b87269081650 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -661,6 +661,7 @@ static void __guc_ads_init(struct intel_guc *guc)
  */
 int intel_guc_ads_create(struct intel_guc *guc)
 {
+   void *ads_blob;
u32 size;
int ret;
 
@@ -685,14 +686,14 @@ int intel_guc_ads_create(struct intel_guc *guc)
size = guc_ads_blob_size(guc);
 
ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma,
-(void **)&guc->ads_blob);
+&ads_blob);
if (ret)
return ret;
 
if (i915_gem_object_is_lmem(guc->ads_vma->obj))
-   dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)guc->ads_blob);
+   dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)ads_blob);
else
-   dma_buf_map_set_vaddr(&guc->ads_map, guc->ads_blob);
+   dma_buf_map_set_vaddr(&guc->ads_map, ads_blob);
 
__guc_ads_init(guc);
 
@@ -714,7 +715,6 @@ void intel_guc_ads_init_late(struct intel_guc *guc)
 void intel_guc_ads_destroy(struct intel_guc *guc)
 {
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
-   guc->ads_blob = NULL;
dma_buf_map_clear(&guc->ads_map);
kfree(guc->ads_regset);
 }
-- 
2.35.0

[PATCH 18/19] drm/i915/guc: Convert __guc_ads_init to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Now that all the functions called from __guc_ads_init() are converted to
use ads_map, stop using ads_blob in __guc_ads_init().

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 --
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index cb0f543b0e86..30edac93afbf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -602,7 +602,6 @@ static void __guc_ads_init(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = gt->i915;
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct dma_buf_map info_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map,
offsetof(struct __guc_ads_blob, system_info));
u32 base;
@@ -613,17 +612,18 @@ static void __guc_ads_init(struct intel_guc *guc)
/* System info */
fill_engine_enable_masks(gt, &info_map);
 
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] =
-   hweight8(gt->info.sseu.slice_mask);
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK]
 =
-   gt->info.vdbox_sfc_access;
+   ads_blob_write(guc, 
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED],
+  hweight8(gt->info.sseu.slice_mask));
+   ads_blob_write(guc, 
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK],
+  gt->info.vdbox_sfc_access);
 
if (GRAPHICS_VER(i915) >= 12 && !IS_DGFX(i915)) {
u32 distdbreg = intel_uncore_read(gt->uncore,
  GEN12_DIST_DBS_POPULATED);
-   
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI]
 =
-   ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) &
-GEN12_DOORBELLS_PER_SQIDI) + 1;
+   ads_blob_write(guc,
+  
system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI],
+  ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT)
+   & GEN12_DOORBELLS_PER_SQIDI) + 1);
}
 
/* Golden contexts for re-initialising after a watchdog reset */
@@ -637,14 +637,17 @@ static void __guc_ads_init(struct intel_guc *guc)
guc_capture_list_init(guc);
 
/* ADS */
-   blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
-   blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
+   ads_blob_write(guc, ads.scheduler_policies, base +
+  offsetof(struct __guc_ads_blob, policies));
+   ads_blob_write(guc, ads.gt_system_info, base +
+  offsetof(struct __guc_ads_blob, system_info));
 
/* MMIO save/restore list */
guc_mmio_reg_state_init(guc);
 
/* Private Data */
-   blob->ads.private_data = base + guc_ads_private_data_offset(guc);
+   ads_blob_write(guc, ads.private_data, base +
+  guc_ads_private_data_offset(guc));
 
i915_gem_object_flush_map(guc->ads_vma->obj);
 }
-- 
2.35.0

[PATCH 16/19] drm/i915/guc: Use a single pass to calculate regset

2022-01-26 Thread Lucas De Marchi

The ADS initialization was using 2 passes to calculate the regset sent
to GuC to initialize each engine: the first pass to just have the final
object size and the second to set each register in place in the final
gem object.

However in order to maintain an ordered set of registers to pass to guc,
each register needs to be added and moved in the final array. The second
phase may actually happen in IO memory rather than system memory and
accessing IO memory by simply dereferencing the pointer doesn't work on
all architectures. Other places of the ADS initialization were
converted to use the dma_buf_map API, but here there may be a lot more
accesses to IO memory. So, instead of following that same approach,
convert the regset initialization to calculate the final array in 1
pass and in the second pass that array is just copied to its final
location, updating the pointers for each engine written to the ADS blob.

One important thing is that struct temp_regset now has
different semantics: `registers` continues to track the registers of a
single engine, however the other fields are updated together, according
to the newly added `storage`, which tracks the memory allocated for
all the registers. So rename some of these fields and add a
__mmio_reg_add(): this function (possibly) allocates memory and operates
on the storage pointer while guc_mmio_reg_add() continues to manage the
registers pointer.

On a Tiger Lake system using enable_guc=3, the following log message is
now seen:

[  187.334310] i915 :00:02.0: [drm:intel_guc_ads_create [i915]] 
Used 4 KB for temporary ADS regset

This change has also been tested on an ARM64 host with DG2 and other
discrete graphics cards.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h |   7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 117 +
 2 files changed, 79 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index e2e0df1c3d91..4c852eee3ad8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -152,6 +152,13 @@ struct intel_guc {
struct dma_buf_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
+   /**
+* @ads_regset_count: number of save/restore registers in the ADS for
+* each engine
+*/
+   u32 ads_regset_count[I915_NUM_ENGINES];
+   /** @ads_regset: save/restore regsets in the ADS */
+   struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 73ca34de44f7..390101ee3661 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -226,14 +226,13 @@ static void guc_mapping_table_init(struct intel_gt *gt,
 
 /*
  * The save/restore register list must be pre-calculated to a temporary
- * buffer of driver defined size before it can be generated in place
- * inside the ADS.
+ * buffer before it can be copied inside the ADS.
  */
-#define MAX_MMIO_REGS  128 /* Arbitrary size, increase as needed */
 struct temp_regset {
struct guc_mmio_reg *registers;
-   u32 used;
-   u32 size;
+   struct guc_mmio_reg *storage;
+   u32 storage_used;
+   u32 storage_max;
 };
 
 static int guc_mmio_reg_cmp(const void *a, const void *b)
@@ -244,18 +243,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b)
return (int)ra->offset - (int)rb->offset;
 }
 
+static struct guc_mmio_reg * __must_check
+__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg)
+{
+   u32 pos = regset->storage_used;
+   struct guc_mmio_reg *slot;
+
+   if (pos >= regset->storage_max) {
+   size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE);
+   struct guc_mmio_reg *r = krealloc(regset->storage,
+ size, GFP_KERNEL);
+   if (!r) {
+   WARN_ONCE(1, "Incomplete regset list: can't add 
register (%d)\n",
+ -ENOMEM);
+   return ERR_PTR(-ENOMEM);
+   }
+
+   regset->registers = r + (regset->registers - regset->storage);
+   regset->storage = r;
+   regset->storage_max = size / sizeof(*slot);
+   }
+
+   slot = &regset->storage[pos];
+   regset->storage_used++;
+   *slot = *reg;
+
+   return slot;
+}
+
 static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
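
The storage/registers split described in the commit message can be sketched in userspace with realloc(). This is an approximation, not the driver code: struct reg, regset_add(), regset_begin_engine() and the 4096-byte growth step are assumptions made for the example (the patch uses krealloc() and aligns to PAGE_SIZE). Unlike the patch, the sketch records the slice start as an index before reallocating, which avoids doing pointer arithmetic on the old block afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct reg {
	uint32_t offset;
	uint32_t value;
};

struct regset {
	struct reg *registers;	/* start of the current engine's slice */
	struct reg *storage;	/* one allocation shared by all engines */
	size_t storage_used;
	size_t storage_max;
};

#define GROW_BYTES 4096u	/* assumed growth step */

/* Start a new per-engine slice at the current end of the storage. */
static void regset_begin_engine(struct regset *rs)
{
	rs->registers = rs->storage ? rs->storage + rs->storage_used : NULL;
}

/* Append one register; returns its slot, or NULL if out of memory. */
static struct reg *regset_add(struct regset *rs, const struct reg *r)
{
	size_t pos;

	assert(rs && r);
	pos = rs->storage_used;

	if (pos >= rs->storage_max) {
		/* Remember the slice start as an index: the block may move. */
		size_t slice = rs->storage ?
			(size_t)(rs->registers - rs->storage) : 0;
		size_t bytes = ((pos + 1) * sizeof(*r) + GROW_BYTES - 1) /
			GROW_BYTES * GROW_BYTES;
		struct reg *n = realloc(rs->storage, bytes);

		if (!n)
			return NULL;	/* old storage is still valid */

		rs->storage = n;
		rs->registers = n + slice;
		rs->storage_max = bytes / sizeof(*r);
	}

	rs->storage[pos] = *r;
	rs->storage_used++;
	return &rs->storage[pos];
}
```

The point of the rebase is that `registers` must keep pointing at the same logical slot even when the allocation moves, so per-engine operations such as sorting can keep working on the slice.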

[PATCH 08/19] drm/i915/guc: Convert engine record to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use dma_buf_map to read fields from the dma_blob so access to IO and
system memory is abstracted away.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 14 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|  3 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 +++
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 2ffe5836f95e..fe1e71adfca1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc)
 
 u32 intel_guc_engine_usage_offset(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
-   u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma);
-   u32 offset = base + ptr_offset(blob, engine_usage);
-
-   return offset;
+   return intel_guc_ggtt_offset(guc, guc->ads_vma) +
+   offsetof(struct __guc_ads_blob, engine_usage);
 }
 
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs 
*engine)
+struct dma_buf_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine)
 {
struct intel_guc *guc = &engine->gt->uc.guc;
-   struct __guc_ads_blob *blob = guc->ads_blob;
u8 guc_class = engine_class_to_guc_class(engine->class);
+   size_t offset = offsetof(struct __guc_ads_blob,
+
engine_usage.engines[guc_class][ilog2(engine->logical_mask)]);
 
-   return 
&blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)];
+   return DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);
 }
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
index e74c110facff..27f5b1f9ddac 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h
@@ -7,6 +7,7 @@
 #define _INTEL_GUC_ADS_H_
 
 #include 
+#include 
 
 struct intel_guc;
 struct drm_printer;
@@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc);
 void intel_guc_ads_reset(struct intel_guc *guc);
 void intel_guc_ads_print_policy_info(struct intel_guc *guc,
 struct drm_printer *p);
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs 
*engine);
+struct dma_buf_map intel_guc_engine_usage_record_map(struct intel_engine_cs 
*engine);
 u32 intel_guc_engine_usage_offset(struct intel_guc *guc);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index db9615dcb0ec..57bfb4ad0ab8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1125,14 +1125,17 @@ __extend_last_switch(struct intel_guc *guc, u64 
*prev_start, u32 new_start)
*prev_start = ((u64)gt_stamp_hi << 32) | new_start;
 }
 
+#define record_read(map_, field_) \
+   dma_buf_map_read_field(map_, struct guc_engine_usage_record, field_)
+
 static void guc_update_engine_gt_clks(struct intel_engine_cs *engine)
 {
-   struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine);
+   struct dma_buf_map rec_map = intel_guc_engine_usage_record_map(engine);
struct intel_engine_guc_stats *stats = &engine->stats.guc;
struct intel_guc *guc = &engine->gt->uc.guc;
-   u32 last_switch = rec->last_switch_in_stamp;
-   u32 ctx_id = rec->current_context_index;
-   u32 total = rec->total_runtime;
+   u32 last_switch = record_read(&rec_map, last_switch_in_stamp);
+   u32 ctx_id = record_read(&rec_map, current_context_index);
+   u32 total = record_read(&rec_map, total_runtime);
 
lockdep_assert_held(&guc->timestamp.lock);
 
-- 
2.35.0

[PATCH 15/19] drm/i915/guc: Prepare for error propagation

2022-01-26 Thread Lucas De Marchi

Currently guc_mmio_reg_add() relies on having enough memory available in
the array to add a new slot. It uses
`GEM_BUG_ON(count >= regset->size);` to protect against going above the
threshold.

In order to allow guc_mmio_reg_add() to handle the memory allocation by
itself, it must return an error in case of failure. Adjust the return
code so this error can be propagated to the callers of guc_mmio_reg_add()
and guc_mmio_regset_init().

No intended change in behavior.
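
The propagation pattern described above can be sketched in plain userspace
C. This is only an illustration under stated assumptions: demo_regset,
demo_reg_add() and demo_regset_init() are hypothetical names, and the
-ENOMEM return models a failure that guc_mmio_reg_add() does not yet
produce in this patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct temp_regset with a fixed-size array. */
struct demo_regset {
	unsigned int used;
	unsigned int size;
	unsigned int regs[4];
};

/* Returns 0 on success or a negative errno when the array is full. */
static int demo_reg_add(struct demo_regset *rs, unsigned int offset)
{
	if (rs->used >= rs->size)
		return -ENOMEM;

	rs->regs[rs->used++] = offset;
	return 0;
}

/*
 * Accumulates the result of every add with "ret |= ...". OR-ing negative
 * errno values does not give a meaningful errno, so the accumulated value
 * is collapsed to a generic -1, as the patch does.
 */
static int demo_regset_init(struct demo_regset *rs,
			    const unsigned int *offsets, size_t n)
{
	int ret = 0;
	size_t i;

	rs->used = 0;

	for (i = 0; i < n; i++)
		ret |= demo_reg_add(rs, offsets[i]);

	return ret ? -1 : 0;
}
```

Note that with this pattern the loop keeps going after the first failure;
callers only learn that at least one add failed, not which one.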

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 31 +-
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index cad1e325656e..73ca34de44f7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -244,8 +244,8 @@ static int guc_mmio_reg_cmp(const void *a, const void *b)
return (int)ra->offset - (int)rb->offset;
 }
 
-static void guc_mmio_reg_add(struct temp_regset *regset,
-u32 offset, u32 flags)
+static long __must_check guc_mmio_reg_add(struct temp_regset *regset,
+ u32 offset, u32 flags)
 {
u32 count = regset->used;
struct guc_mmio_reg reg = {
@@ -264,7 +264,7 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 */
if (bsearch(&reg, regset->registers, count,
sizeof(reg), guc_mmio_reg_cmp))
-   return;
+   return 0;
 
slot = &regset->registers[count];
regset->used++;
@@ -277,6 +277,8 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 
swap(slot[1], slot[0]);
}
+
+   return 0;
 }
 
 #define GUC_MMIO_REG_ADD(regset, reg, masked) \
@@ -284,32 +286,35 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
 i915_mmio_reg_offset((reg)), \
 (masked) ? GUC_REGSET_MASKED : 0)
 
-static void guc_mmio_regset_init(struct temp_regset *regset,
-struct intel_engine_cs *engine)
+static int guc_mmio_regset_init(struct temp_regset *regset,
+   struct intel_engine_cs *engine)
 {
const u32 base = engine->mmio_base;
struct i915_wa_list *wal = &engine->wa_list;
struct i915_wa *wa;
unsigned int i;
+   int ret = 0;
 
regset->used = 0;
 
-   GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
-   GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
-   GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
+   ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
 
for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
-   GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
+   ret |= GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
 
/* Be extra paranoid and include all whitelist registers. */
for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++)
-   GUC_MMIO_REG_ADD(regset,
-RING_FORCE_TO_NONPRIV(base, i),
-false);
+   ret |= GUC_MMIO_REG_ADD(regset,
+   RING_FORCE_TO_NONPRIV(base, i),
+   false);
 
/* add in local MOCS registers */
for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++)
-   GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
+   ret |= GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false);
+
+   return ret ? -1 : 0;
 }
 
 static int guc_mmio_reg_state_query(struct intel_guc *guc)
-- 
2.35.0

[PATCH 02/19] dma-buf-map: Add helper to initialize second map

2022-01-26 Thread Lucas De Marchi

When a struct dma_buf_map is passed around, it's useful to be able to
initialize a second map that takes care of reading/writing to an offset
of the original map.

Add a helper that copies the struct and adds the offset to the proper
address.
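
A userspace sketch of the idea, assuming a simplified two-member stand-in
for struct dma_buf_map; offset_map, OFFSET_MAP_INIT_OFFSET, offset_blob
and offset_payload_map() are hypothetical names used only for
illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct dma_buf_map (no iomem union member). */
struct offset_map {
	void *vaddr;
	bool is_iomem;
};

/* Builds a new map from an existing one; the source map is not modified. */
#define OFFSET_MAP_INIT_OFFSET(map_, offset_) (struct offset_map) \
	{ \
		.vaddr = (char *)(map_)->vaddr + (offset_), \
		.is_iomem = (map_)->is_iomem, \
	}

/* Hypothetical blob layout: a header followed by a payload area. */
struct offset_blob {
	uint32_t header[4];
	uint32_t payload[8];
};

/* Returns a second map that points directly at the payload member. */
static struct offset_map offset_payload_map(struct offset_map *base)
{
	return OFFSET_MAP_INIT_OFFSET(base, offsetof(struct offset_blob, payload));
}
```

Because the result is a compound literal, it can be returned by value or
used as an initializer without touching the base map.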

Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 include/linux/dma-buf-map.h | 29 +
 1 file changed, 29 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 65e927d9ce33..3514a859f628 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -131,6 +131,35 @@ struct dma_buf_map {
.is_iomem = false, \
}
 
+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_:  The dma-buf mapping structure to copy from
+ * @offset_:   Offset to add to the other mapping
+ *
+ * Initializes a new struct dma_buf_map based on another. This is the equivalent of doing:
+ *
+ * .. code-block:: c
+ *
+ * struct dma_buf_map map = other_map;
+ * dma_buf_map_incr(&map, offset);
+ *
+ * Example usage:
+ *
+ * .. code-block:: c
+ *
+ * void foo(struct device *dev, struct dma_buf_map *base_map)
+ * {
+ * ...
+ * struct dma_buf_map map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ * ...
+ * }
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map)\
+   {   \
+   .vaddr = (map_)->vaddr + (offset_), \
+   .is_iomem = (map_)->is_iomem,   \
+   }
+
 /**
  * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in 
system memory
  * @map:   The dma-buf mapping structure
-- 
2.35.0

[PATCH 12/19] drm/i915/guc: Replace check for golden context size

2022-01-26 Thread Lucas De Marchi

In the other places in this function, guc->ads_map is protected from
access when it's not yet set. However, the last check is actually about
guc->ads_golden_ctxt_size having been set before. These checks should
always match, as the size is initialized on the first call to
guc_prep_golden_context(), but it's clearer if we have a single return
and check for guc->ads_golden_ctxt_size.

This is just a readability improvement, no change in behavior.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index dd9ec47eed16..8e4768289792 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -461,10 +461,10 @@ static int guc_prep_golden_context(struct intel_guc *guc)
addr_ggtt += alloc_size;
}
 
-   if (dma_buf_map_is_null(&guc->ads_map))
-   return total_size;
+   /* Make sure current size matches what we calculated previously */
+   if (guc->ads_golden_ctxt_size)
+   GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
 
-   GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
return total_size;
 }
 
-- 
2.35.0

[PATCH 11/19] drm/i915/guc: Convert golden context prep to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use the saved ads_map to prepare the golden context. One difference from
the init context is that this function can be called before there is a
gem object (and thus the guc->ads_map) to calculate the size of the
golden context that should be allocated for that object.

So in this case the function needs to be prepared for not having the
system_info with enabled engines filled out. To accomplish that an
info_map is prepared on the side to point either to the gem object
or the local variable on the stack. This allows making
fill_engine_enable_masks() operate always with a dma_buf_map
argument.
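
The "map on the side" approach can be sketched in userspace C as follows.
All names here (side_map, side_info, side_blob, side_count_enabled() and
the default masks) are hypothetical; a NULL vaddr models the case where
the gem object does not exist yet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIDE_CLASSES 4

struct side_map {
	void *vaddr;
	bool is_iomem;
};

struct side_info {
	uint32_t enabled_masks[SIDE_CLASSES];
};

struct side_blob {
	uint32_t policies[2];
	struct side_info system_info;
};

/* Reads one mask through the map, wherever the map happens to point. */
static uint32_t side_read_mask(const struct side_map *info_map, unsigned int class)
{
	size_t off = offsetof(struct side_info, enabled_masks) +
		     class * sizeof(uint32_t);
	uint32_t val;

	memcpy(&val, (const char *)info_map->vaddr + off, sizeof(val));
	return val;
}

/* Fills made-up default masks through the map. */
static void side_fill_masks(struct side_map *info_map)
{
	const uint32_t defaults[SIDE_CLASSES] = { 1, 1, 0, 3 };

	memcpy((char *)info_map->vaddr + offsetof(struct side_info, enabled_masks),
	       defaults, sizeof(defaults));
}

/* Counts enabled classes from the blob if present, else from a local copy. */
static unsigned int side_count_enabled(struct side_map *ads_map)
{
	struct side_info local_info;
	struct side_map info_map;
	unsigned int class, count = 0;

	if (ads_map->vaddr) {
		info_map.vaddr = (char *)ads_map->vaddr +
				 offsetof(struct side_blob, system_info);
		info_map.is_iomem = ads_map->is_iomem;
	} else {
		memset(&local_info, 0, sizeof(local_info));
		info_map.vaddr = &local_info;
		info_map.is_iomem = false;
		side_fill_masks(&info_map);
	}

	for (class = 0; class < SIDE_CLASSES; class++)
		if (side_read_mask(&info_map, class))
			count++;

	return count;
}
```

The point of the indirection is that the loop body only ever sees a map,
so it does not care whether the data lives in the blob or on the stack.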

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 52 +-
 1 file changed, 32 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 15990c229b54..dd9ec47eed16 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -67,6 +67,12 @@ struct __guc_ads_blob {
dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\
field_, val_)
 
+#define info_map_write(map_, field_, val_) \
+   dma_buf_map_write_field(map_, struct guc_gt_system_info, field_, val_)
+
+#define info_map_read(map_, field_) \
+   dma_buf_map_read_field(map_, struct guc_gt_system_info, field_)
+
 static u32 guc_ads_regset_size(struct intel_guc *guc)
 {
GEM_BUG_ON(!guc->ads_regset_size);
@@ -378,24 +384,24 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc,
 }
 
 static void fill_engine_enable_masks(struct intel_gt *gt,
-struct guc_gt_system_info *info)
+struct dma_buf_map *info_map)
 {
-   info->engine_enabled_masks[GUC_RENDER_CLASS] = 1;
-   info->engine_enabled_masks[GUC_BLITTER_CLASS] = 1;
-   info->engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt);
-   info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
+   info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
+   info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], 
VDBOX_MASK(gt));
+   info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], 
VEBOX_MASK(gt));
 }
 
 #define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
 #define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
-static int guc_prep_golden_context(struct intel_guc *guc,
-  struct __guc_ads_blob *blob)
+static int guc_prep_golden_context(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
u32 addr_ggtt, offset;
u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
-   struct guc_gt_system_info *info, local_info;
+   struct guc_gt_system_info local_info;
+   struct dma_buf_map info_map;
 
/*
 * Reserve the memory for the golden contexts and point GuC at it but
@@ -409,14 +415,15 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 * GuC will also validate that the LRC base + size fall within the
 * allowed GGTT range.
 */
-   if (blob) {
+   if (!dma_buf_map_is_null(&guc->ads_map)) {
offset = guc_ads_golden_ctxt_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   info = &blob->system_info;
+   info_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map,
+  offsetof(struct 
__guc_ads_blob, system_info));
} else {
memset(&local_info, 0, sizeof(local_info));
-   info = &local_info;
-   fill_engine_enable_masks(gt, info);
+   dma_buf_map_set_vaddr(&info_map, &local_info);
+   fill_engine_enable_masks(gt, &info_map);
}
 
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
++engine_class) {
@@ -425,14 +432,14 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 
guc_class = engine_class_to_guc_class(engine_class);
 
-   if (!info->engine_enabled_masks[guc_class])
+   if (!info_map_read(&info_map, engine_enabled_masks[guc_class]))
continue;
 
real_size = intel_engine_context_size(gt, engine_class);
alloc_size = PAGE_ALIGN(real_size);
total_size += alloc_size;
 
-   if (!blob)
+   if (dma_buf_map_is_null(&guc->ads_map))
continue;
 
/*
@@ -446,12 +453,15 @@ static int guc_prep_golden_context(struct intel_guc *guc,
 * what comes before it in the context image (which

[PATCH 04/19] drm/i915/guc: Keep dma_buf_map of ads_blob around

2022-01-26 Thread Lucas De Marchi

Convert intel_guc_ads_create() and initialization to use dma_buf_map
rather than plain pointer and save it in the guc struct. This will help
with additional updates to the ads_blob after the
creation/initialization by abstracting the IO vs system memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h | 4 +++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 697d9d66acef..e2e0df1c3d91 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -6,8 +6,9 @@
 #ifndef _INTEL_GUC_H_
 #define _INTEL_GUC_H_
 
-#include 
 #include 
+#include 
+#include 
 
 #include "intel_uncore.h"
 #include "intel_guc_fw.h"
@@ -148,6 +149,7 @@ struct intel_guc {
struct i915_vma *ads_vma;
/** @ads_blob: contents of the GuC ADS */
struct __guc_ads_blob *ads_blob;
+   struct dma_buf_map ads_map;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 668bf4ac9b0c..c012858376f0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -623,6 +623,11 @@ int intel_guc_ads_create(struct intel_guc *guc)
if (ret)
return ret;
 
+   if (i915_gem_object_is_lmem(guc->ads_vma->obj))
+   dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem 
*)guc->ads_blob);
+   else
+   dma_buf_map_set_vaddr(&guc->ads_map, guc->ads_blob);
+
__guc_ads_init(guc);
 
return 0;
@@ -644,6 +649,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
 {
i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP);
guc->ads_blob = NULL;
+   dma_buf_map_clear(&guc->ads_map);
 }
 
 static void guc_ads_private_data_reset(struct intel_guc *guc)
-- 
2.35.0

[PATCH 07/19] drm/i915/guc: Convert policies update to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use dma_buf_map to write the policies update so access to IO and system
memory is abstracted away.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 41 --
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index bcf52ac4fe35..2ffe5836f95e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -130,33 +130,37 @@ static u32 guc_ads_blob_size(struct intel_guc *guc)
   guc_ads_private_data_size(guc);
 }
 
-static void guc_policies_init(struct intel_guc *guc, struct guc_policies 
*policies)
+static void guc_policies_init(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
struct drm_i915_private *i915 = gt->i915;
+   u32 global_flags = 0;
 
-   policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
-   policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI;
+   ads_blob_write(guc, policies.dpc_promote_time,
+  GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US);
+   ads_blob_write(guc, policies.max_num_work_items,
+  GLOBAL_POLICY_MAX_NUM_WI);
 
-   policies->global_flags = 0;
if (i915->params.reset < 2)
-   policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
+   global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
 
-   policies->is_valid = 1;
+   ads_blob_write(guc, policies.global_flags, global_flags);
+   ads_blob_write(guc, policies.is_valid, 1);
 }
 
 void intel_guc_ads_print_policy_info(struct intel_guc *guc,
 struct drm_printer *dp)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
-
-   if (unlikely(!blob))
+   if (unlikely(dma_buf_map_is_null(&guc->ads_map)))
return;
 
drm_printf(dp, "Global scheduling policies:\n");
-   drm_printf(dp, "  DPC promote time   = %u\n", 
blob->policies.dpc_promote_time);
-   drm_printf(dp, "  Max num work items = %u\n", 
blob->policies.max_num_work_items);
-   drm_printf(dp, "  Flags  = %u\n", 
blob->policies.global_flags);
+   drm_printf(dp, "  DPC promote time   = %u\n",
+  ads_blob_read(guc, policies.dpc_promote_time));
+   drm_printf(dp, "  Max num work items = %u\n",
+  ads_blob_read(guc, policies.max_num_work_items));
+   drm_printf(dp, "  Flags  = %u\n",
+  ads_blob_read(guc, policies.global_flags));
 }
 
 static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset)
@@ -171,23 +175,24 @@ static int guc_action_policies_update(struct intel_guc 
*guc, u32 policy_offset)
 
 int intel_guc_global_policies_update(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct intel_gt *gt = guc_to_gt(guc);
+   u32 scheduler_policies;
intel_wakeref_t wakeref;
int ret;
 
-   if (!blob)
+   if (dma_buf_map_is_null(&guc->ads_map))
return -EOPNOTSUPP;
 
-   GEM_BUG_ON(!blob->ads.scheduler_policies);
+   scheduler_policies = ads_blob_read(guc, ads.scheduler_policies);
+   GEM_BUG_ON(!scheduler_policies);
 
-   guc_policies_init(guc, &blob->policies);
+   guc_policies_init(guc);
 
if (!intel_guc_is_ready(guc))
return 0;
 
with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
-   ret = guc_action_policies_update(guc, 
blob->ads.scheduler_policies);
+   ret = guc_action_policies_update(guc, scheduler_policies);
 
return ret;
 }
@@ -557,7 +562,7 @@ static void __guc_ads_init(struct intel_guc *guc)
u32 base;
 
/* GuC scheduling policies */
-   guc_policies_init(guc, &blob->policies);
+   guc_policies_init(guc);
 
/* System info */
fill_engine_enable_masks(gt, &blob->system_info);
-- 
2.35.0

[PATCH 13/19] drm/i915/guc: Convert mapping table to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use dma_buf_map to write the fields system_info.mapping_table[][].
Since we already have the info_map around where needed, just use it
instead of going through guc->ads_map.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 8e4768289792..dca7c3db9cdd 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -204,7 +204,7 @@ int intel_guc_global_policies_update(struct intel_guc *guc)
 }
 
 static void guc_mapping_table_init(struct intel_gt *gt,
-  struct guc_gt_system_info *system_info)
+  struct dma_buf_map *info_map)
 {
unsigned int i, j;
struct intel_engine_cs *engine;
@@ -213,14 +213,14 @@ static void guc_mapping_table_init(struct intel_gt *gt,
/* Table must be set to invalid values for entries not used */
for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i)
for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j)
-   system_info->mapping_table[i][j] =
-   GUC_MAX_INSTANCES_PER_CLASS;
+   info_map_write(info_map, mapping_table[i][j],
+  GUC_MAX_INSTANCES_PER_CLASS);
 
for_each_engine(engine, gt, id) {
u8 guc_class = engine_class_to_guc_class(engine->class);
 
-   
system_info->mapping_table[guc_class][ilog2(engine->logical_mask)] =
-   engine->instance;
+   info_map_write(info_map, 
mapping_table[guc_class][ilog2(engine->logical_mask)],
+  engine->instance);
}
 }
 
@@ -595,7 +595,7 @@ static void __guc_ads_init(struct intel_guc *guc)
/* Golden contexts for re-initialising after a watchdog reset */
guc_prep_golden_context(guc);
 
-   guc_mapping_table_init(guc_to_gt(guc), &blob->system_info);
+   guc_mapping_table_init(guc_to_gt(guc), &info_map);
 
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
-- 
2.35.0

[PATCH 06/19] drm/i915/guc: Convert golden context init to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Now the map is saved during creation, so use it to initialize the
golden context, reading from shmem and writing to either system or IO
memory.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++---
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 01d2c1ead680..bcf52ac4fe35 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct 
intel_gt *gt, u8 engine_
 
 static void guc_init_golden_context(struct intel_guc *guc)
 {
-   struct __guc_ads_blob *blob = guc->ads_blob;
struct intel_engine_cs *engine;
struct intel_gt *gt = guc_to_gt(guc);
+   struct dma_buf_map golden_context_map;
u32 addr_ggtt, offset;
u32 total_size = 0, alloc_size, real_size;
u8 engine_class, guc_class;
-   u8 *ptr;
 
if (!intel_uc_uses_guc_submission(&gt->uc))
return;
 
-   GEM_BUG_ON(!blob);
+   GEM_BUG_ON(dma_buf_map_is_null(&guc->ads_map));
 
/*
 * Go back and fill in the golden context data now that it is
@@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc)
 */
offset = guc_ads_golden_ctxt_offset(guc);
addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
-   ptr = ((u8 *)blob) + offset;
+
+   golden_context_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);
 
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; 
++engine_class) {
if (engine_class == OTHER_CLASS)
continue;
 
guc_class = engine_class_to_guc_class(engine_class);
-
-   if (!blob->system_info.engine_enabled_masks[guc_class])
+   if (!ads_blob_read(guc, 
system_info.engine_enabled_masks[guc_class]))
continue;
 
real_size = intel_engine_context_size(gt, engine_class);
@@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc)
if (!engine) {
drm_err(&gt->i915->drm, "No engine state recorded for 
class %d!\n",
engine_class);
-   blob->ads.eng_state_size[guc_class] = 0;
-   blob->ads.golden_context_lrca[guc_class] = 0;
+   ads_blob_write(guc, ads.eng_state_size[guc_class], 0);
+   ads_blob_write(guc, ads.golden_context_lrca[guc_class], 
0);
continue;
}
 
-   GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
+   GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) !=
   real_size - LRC_SKIP_SIZE);
-   GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != 
addr_ggtt);
+   GEM_BUG_ON(ads_blob_read(guc, 
ads.golden_context_lrca[guc_class]) != addr_ggtt);
+
addr_ggtt += alloc_size;
 
-   shmem_read(engine->default_state, 0, ptr, real_size);
-   ptr += alloc_size;
+   shmem_read_to_dma_buf_map(engine->default_state, 0,
+ &golden_context_map, real_size);
+   dma_buf_map_incr(&golden_context_map, alloc_size);
}
 
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
-- 
2.35.0

[PATCH 14/19] drm/i915/guc: Convert capture list to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use dma_buf_map to write the fields ads.capture_*.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index dca7c3db9cdd..cad1e325656e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -544,7 +544,7 @@ static void guc_init_golden_context(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
 }
 
-static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob 
*blob)
+static void guc_capture_list_init(struct intel_guc *guc)
 {
int i, j;
u32 addr_ggtt, offset;
@@ -556,11 +556,11 @@ static void guc_capture_list_init(struct intel_guc *guc, 
struct __guc_ads_blob *
 
for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) {
for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) {
-   blob->ads.capture_instance[i][j] = addr_ggtt;
-   blob->ads.capture_class[i][j] = addr_ggtt;
+   ads_blob_write(guc, ads.capture_instance[i][j], 
addr_ggtt);
+   ads_blob_write(guc, ads.capture_class[i][j], addr_ggtt);
}
 
-   blob->ads.capture_global[i] = addr_ggtt;
+   ads_blob_write(guc, ads.capture_global[i], addr_ggtt);
}
 }
 
@@ -600,7 +600,7 @@ static void __guc_ads_init(struct intel_guc *guc)
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
 
/* Capture list for hang debug */
-   guc_capture_list_init(guc, blob);
+   guc_capture_list_init(guc);
 
/* ADS */
blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
-- 
2.35.0

[PATCH 10/19] drm/i915/guc: Convert guc_ads_private_data_reset to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Use dma_buf_map_memset() to zero the private data as ADS may be either
on system or IO memory.
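
As a rough userspace illustration of why a map-aware memset is needed,
the sketch below dispatches on is_iomem. It is not the kernel
implementation: io_map and its helpers are hypothetical, and the
byte-wise volatile loop merely stands in for an IO-safe memset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for struct dma_buf_map: one address, tagged by memory type. */
struct io_map {
	union {
		void *vaddr;
		volatile unsigned char *vaddr_iomem;
	};
	bool is_iomem;
};

static void io_map_set_vaddr(struct io_map *map, void *vaddr)
{
	map->vaddr = vaddr;
	map->is_iomem = false;
}

static void io_map_set_vaddr_iomem(struct io_map *map, volatile unsigned char *p)
{
	map->vaddr_iomem = p;
	map->is_iomem = true;
}

static bool io_map_is_null(const struct io_map *map)
{
	return map->is_iomem ? !map->vaddr_iomem : !map->vaddr;
}

/* Picks the access method from the tag so callers need not care. */
static void io_map_memset(struct io_map *map, int value, size_t len)
{
	size_t i;

	if (map->is_iomem) {
		/* Assumption: byte-wise volatile stores model IO-safe writes. */
		for (i = 0; i < len; i++)
			map->vaddr_iomem[i] = (unsigned char)value;
	} else {
		memset(map->vaddr, value, len);
	}
}
```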

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index fe1e71adfca1..15990c229b54 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -668,14 +668,15 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
 
 static void guc_ads_private_data_reset(struct intel_guc *guc)
 {
+   struct dma_buf_map map =
+   DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, 
guc_ads_private_data_offset(guc));
u32 size;
 
size = guc_ads_private_data_size(guc);
if (!size)
return;
 
-   memset((void *)guc->ads_blob + guc_ads_private_data_offset(guc), 0,
-  size);
+   dma_buf_map_memset(&map, 0, size);
 }
 
 /**
-- 
2.35.0

[PATCH 05/19] drm/i915/guc: Add read/write helpers for ADS blob

2022-01-26 Thread Lucas De Marchi

Add helpers on top of dma_buf_map_read_field() /
dma_buf_map_write_field() functions so they always use the right
arguments and make code easier to read.

Cc: Matt Roper 
Cc: Thomas Hellström 
Cc: Daniel Vetter 
Cc: John Harrison 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index c012858376f0..01d2c1ead680 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -59,6 +59,14 @@ struct __guc_ads_blob {
struct guc_mmio_reg regset[0];
 } __packed;
 
+#define ads_blob_read(guc_, field_)\
+   dma_buf_map_read_field(&(guc_)->ads_map, struct __guc_ads_blob, \
+  field_)
+
+#define ads_blob_write(guc_, field_, val_) \
+   dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\
+   field_, val_)
+
 static u32 guc_ads_regset_size(struct intel_guc *guc)
 {
GEM_BUG_ON(!guc->ads_regset_size);
-- 
2.35.0

[PATCH 01/19] dma-buf-map: Add read/write helpers

2022-01-26 Thread Lucas De Marchi

In certain situations it's useful to be able to read or write to an
offset that is calculated by having the memory layout given by a struct
declaration. Usually we are going to read/write a u8, u16, u32 or u64.

Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field()
to calculate the offset of a struct member and memcpy the data from/to
the dma_buf_map. We could use readb, readw, readl, readq and the write*
counterparts, however due to alignment issues this may not work on all
architectures. If alignment needs to be checked to call the right
function, it's not possible to decide at compile-time which function to
call: so just leave the decision to the memcpy function that will do
exactly that on IO memory or dereference the pointer.
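
The offsetof-plus-memcpy approach can be tried out in userspace with the
sketch below. It assumes GCC or Clang (statement expressions and
__typeof__) and models only the system-memory path; field_map,
field_record and the field_* helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for struct dma_buf_map; is_iomem is always false here. */
struct field_map {
	void *vaddr;
	bool is_iomem;
};

static inline void field_map_memcpy_from_offset(void *dst, const struct field_map *src,
						size_t offset, size_t len)
{
	memcpy(dst, (const char *)src->vaddr + offset, len);
}

static inline void field_map_memcpy_to_offset(struct field_map *dst, size_t offset,
					      const void *src, size_t len)
{
	memcpy((char *)dst->vaddr + offset, src, len);
}

/* t__ is never dereferenced; it only feeds sizeof() and __typeof__(). */
#define field_map_read(map__, type__, field__) ({ \
	type__ *t__; \
	__typeof__(t__->field__) val__; \
	field_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
				     sizeof(t__->field__)); \
	val__; \
})

#define field_map_write(map__, type__, field__, value__) ({ \
	type__ *t__; \
	__typeof__(t__->field__) tmp__ = (value__); \
	field_map_memcpy_to_offset(map__, offsetof(type__, field__), \
				   &tmp__, sizeof(t__->field__)); \
})

/* Hypothetical packed layout with a deliberately misaligned 32-bit member. */
struct field_record {
	uint8_t tag;
	uint32_t stamp;
} __attribute__((packed));

/* Writes a value to the misaligned member and reads it back. */
static uint32_t field_roundtrip(uint32_t v)
{
	unsigned char backing[sizeof(struct field_record)] = { 0 };
	struct field_map map = { .vaddr = backing, .is_iomem = false };

	field_map_write(&map, struct field_record, stamp, v);
	return field_map_read(&map, struct field_record, stamp);
}
```

Since the copy goes through memcpy with the member's size, the misaligned
offset of stamp needs no special handling at the call site.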

Cc: Sumit Semwal 
Cc: Christian König 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Lucas De Marchi 
---
 include/linux/dma-buf-map.h | 81 +
 1 file changed, 81 insertions(+)

diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__
 
+#include 
 #include 
 #include 
 
@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map 
*map)
}
 }
 
+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst:   The dma-buf mapping structure
+ * @offset:The offset from which to copy
+ * @src:   The source buffer
+ * @len:   The number of byte in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, 
size_t offset,
+   const void *src, size_t len)
+{
+   if (dst->is_iomem)
+   memcpy_toio(dst->vaddr_iomem + offset, src, len);
+   else
+   memcpy(dst->vaddr + offset, src, len);
+}
+
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into 
system memory
+ * @dst:   Destination in system memory
+ * @src:   The dma-buf mapping structure
+ * @offset:The offset from which to copy
+ * @len:   The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The dest buffer is in
+ * system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct 
dma_buf_map *src,
+ size_t offset, size_t len)
+{
+   if (src->is_iomem)
+   memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+   else
+   memcpy(dst, src->vaddr + offset, len);
+}
+
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst:   The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map 
*map, size_t incr)
map->vaddr += incr;
 }
 
+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to read
+ * @field__:   Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({                     \
+   type__ *t__;                                                                \
+   typeof(t__->field__) val__;                                                 \
+   dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__),    \
+                                  sizeof(t__->field__));                       \
+   val__;                                                                      \
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__:The struct to be used containing the field to write
+ * @field__:   Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field_

[PATCH 03/19] drm/i915/gt: Add helper for shmem copy to dma_buf_map

2022-01-26 Thread Lucas De Marchi

Add a variant of shmem_read() that takes a dma_buf_map pointer rather
than a plain pointer as argument. It's mostly a copy of __shmem_rw(), but
adapting the API and removing the write support, since there is currently
only a need to use dma_buf_map as the destination.

Reworking __shmem_rw() to share the implementation was tempting, but
finding a good balance between reuse and clarity pushed towards a little
code duplication. Since the function is small, just add the similar
function with a copy/paste/adapt approach.

Cc: Matt Roper 
Cc: Joonas Lahtinen 
Cc: Tvrtko Ursulin 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/shmem_utils.c | 32 +++
 drivers/gpu/drm/i915/gt/shmem_utils.h |  3 +++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 0683b27a3890..d7968e68ccfb 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -3,6 +3,7 @@
  * Copyright © 2020 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -123,6 +124,37 @@ static int __shmem_rw(struct file *file, loff_t off,
return 0;
 }
 
+int shmem_read_to_dma_buf_map(struct file *file, loff_t off,
+			      struct dma_buf_map *map, size_t len)
+{
+	struct dma_buf_map map_iter = *map;
+	unsigned long pfn;
+
+	for (pfn = off >> PAGE_SHIFT; len; pfn++) {
+		unsigned int this =
+			min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
+		struct page *page;
+		void *vaddr;
+
+		page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
+						   GFP_KERNEL);
+		if (IS_ERR(page))
+			return PTR_ERR(page);
+
+		vaddr = kmap(page);
+		dma_buf_map_memcpy_to(&map_iter, vaddr + offset_in_page(off), this);
+		mark_page_accessed(page);
+		kunmap(page);
+		put_page(page);
+
+		len -= this;
+		dma_buf_map_incr(&map_iter, this);
+		off = 0;
+	}
+
+	return 0;
+}
+
 int shmem_read(struct file *file, loff_t off, void *dst, size_t len)
 {
return __shmem_rw(file, off, dst, len, false);
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h b/drivers/gpu/drm/i915/gt/shmem_utils.h
index c1669170c351..a3d4ce966f74 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.h
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.h
@@ -8,6 +8,7 @@
 
 #include 
 
+struct dma_buf_map;
 struct drm_i915_gem_object;
 struct file;
 
@@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj);
 void *shmem_pin_map(struct file *file);
 void shmem_unpin_map(struct file *file, void *ptr);
 
+int shmem_read_to_dma_buf_map(struct file *file, loff_t off,
+ struct dma_buf_map *map, size_t len);
 int shmem_read(struct file *file, loff_t off, void *dst, size_t len);
 int shmem_write(struct file *file, loff_t off, void *src, size_t len);
 
-- 
2.35.0
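
Editorial sketch, not part of the patch: the loop in
shmem_read_to_dma_buf_map() splits the copy into chunks that never cross a
page boundary. The user-space model below shows only that arithmetic. All
demo_* names are hypothetical, and a fixed 4096-byte page size is assumed in
place of the kernel's PAGE_SIZE; it does not touch shmem or dma_buf_map.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4 KiB pages, standing in for the kernel's PAGE_SIZE. */
#define DEMO_PAGE_SIZE 4096u

/* Byte offset of off within its page, in the spirit of offset_in_page(). */
static size_t demo_offset_in_page(size_t off)
{
	return off & (DEMO_PAGE_SIZE - 1);
}

/*
 * Copy len bytes starting at byte offset off of src into dst, one
 * page-bounded chunk at a time. Returns the number of chunks used.
 */
static unsigned int demo_copy_by_page(unsigned char *dst,
				      const unsigned char *src,
				      size_t off, size_t len)
{
	unsigned int chunks = 0;
	size_t pfn;

	for (pfn = off / DEMO_PAGE_SIZE; len; pfn++) {
		size_t in_page = demo_offset_in_page(off);
		size_t chunk = DEMO_PAGE_SIZE - in_page;

		if (chunk > len)
			chunk = len;

		memcpy(dst, src + pfn * DEMO_PAGE_SIZE + in_page, chunk);

		dst += chunk;
		len -= chunk;
		off = 0;	/* later chunks start at the top of a page */
		chunks++;
	}

	return chunks;
}

/* Copy 5000 bytes starting 4000 bytes into a three-page source. */
static unsigned int demo_copy_selfcheck(void)
{
	static unsigned char src[3 * DEMO_PAGE_SIZE];
	static unsigned char dst[5000];
	unsigned int chunks;
	size_t i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i % 251);

	chunks = demo_copy_by_page(dst, src, 4000, sizeof(dst));

	/* The destination must match the source window byte for byte. */
	assert(memcmp(dst, src + 4000, sizeof(dst)) == 0);

	return chunks;
}
```

With these numbers the copy is split into three chunks of 96, 4096 and 808
bytes, which is why only the first chunk needs the in-page offset.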

[PATCH 00/19] drm/i915/guc: Refactor ADS access to use dma_buf_map

2022-01-26 Thread Lucas De Marchi

While porting i915 to arm64 we noticed some issues accessing lmem.
Some writes were getting corrupted and the final state of the buffer
didn't have exactly what we wrote. This became evident when enabling
GuC submission: depending on the number of engines the ADS struct was
being corrupted and GuC would reject it, refusing to initialize.

From Documentation/core-api/bus-virt-phys-mapping.rst:

This memory is called "PCI memory" or "shared memory" or "IO memory" or
whatever, and there is only one way to access it: the readb/writeb and
related functions. You should never take the address of such memory,
because there is really nothing you can do with such an address: it's not
conceptually in the same memory space as "real memory" at all, so you
cannot just dereference a pointer. (Sadly, on x86 it **is** in the same
memory space, so on x86 it actually works to just dereference a pointer, but
it's not portable).

When reading or writing words directly to IO memory, in order to be portable
the Linux kernel provides the abstraction detailed in section "Differences
between I/O access functions" of Documentation/driver-api/device-io.rst.

This limits our ability to simply overlay our structs on top of a buffer
and directly access it since that buffer may come from IO memory rather than
system memory. Hence the approach taken in intel_guc_ads.c needs to be
refactored. This is not the only place in i915 that needs to be changed, but
the one causing the most problems, with a real reproducer. This first set of
patches focuses on fixing the gem object to pass the ADS.

After the addition of a few helpers in the dma_buf_map API, most of
intel_guc_ads.c can be converted to use it. The exception is the regset
initialization: we'd incur a lot of extra indirection when
reading/writing each register. So the regset is converted to use a
temporary buffer allocated on probe, which is then copied to its
final location when finishing the initialization or on gt reset.

Testing on some discrete cards, after this change we can correctly pass the
ADS struct to GuC and have it initialized correctly.

thanks
Lucas De Marchi

Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: linux-ker...@vger.kernel.org
Cc: Christian König 
Cc: Daniel Vetter 
Cc: Daniele Ceraolo Spurio 
Cc: David Airlie 
Cc: John Harrison 
Cc: Joonas Lahtinen 
Cc: Maarten Lankhorst 
Cc: Matt Roper 
Cc: Matthew Auld 
Cc: Matthew Brost 
Cc: Sumit Semwal 
Cc: Thomas Hellström 
Cc: Tvrtko Ursulin 

Lucas De Marchi (19):
  dma-buf-map: Add read/write helpers
  dma-buf-map: Add helper to initialize second map
  drm/i915/gt: Add helper for shmem copy to dma_buf_map
  drm/i915/guc: Keep dma_buf_map of ads_blob around
  drm/i915/guc: Add read/write helpers for ADS blob
  drm/i915/guc: Convert golden context init to dma_buf_map
  drm/i915/guc: Convert policies update to dma_buf_map
  drm/i915/guc: Convert engine record to dma_buf_map
  dma-buf-map: Add wrapper over memset
  drm/i915/guc: Convert guc_ads_private_data_reset to dma_buf_map
  drm/i915/guc: Convert golden context prep to dma_buf_map
  drm/i915/guc: Replace check for golden context size
  drm/i915/guc: Convert mapping table to dma_buf_map
  drm/i915/guc: Convert capture list to dma_buf_map
  drm/i915/guc: Prepare for error propagation
  drm/i915/guc: Use a single pass to calculate regset
  drm/i915/guc: Convert guc_mmio_reg_state_init to dma_buf_map
  drm/i915/guc: Convert __guc_ads_init to dma_buf_map
  drm/i915/guc: Remove plain ads_blob pointer

 drivers/gpu/drm/i915/gt/shmem_utils.c |  32 ++
 drivers/gpu/drm/i915/gt/shmem_utils.h |   3 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  14 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 374 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   3 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  11 +-
 include/linux/dma-buf-map.h   | 127 ++
 7 files changed, 405 insertions(+), 159 deletions(-)

-- 
2.35.0
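
Editorial sketch, not taken from the series: the cover letter's point is that
a struct cannot simply be overlaid on a buffer that may live in IO memory, so
each field access goes through a helper that picks the right access method.
The user-space model below illustrates that idea under loud assumptions: all
demo_* names are hypothetical, the byte-wise volatile loop merely stands in
for memcpy_toio()/memcpy_fromio(), and the macros take an explicit pointer
instead of using the GNU statement expressions seen in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct dma_buf_map. */
struct demo_map {
	union {
		volatile uint8_t *vaddr_iomem;	/* models an __iomem pointer */
		void *vaddr;			/* plain system memory */
	};
	bool is_iomem;
};

/* Copy len bytes from src into the mapping at the given offset. */
static void demo_map_memcpy_to_offset(struct demo_map *dst, size_t offset,
				      const void *src, size_t len)
{
	const uint8_t *s = src;
	size_t i;

	if (dst->is_iomem) {
		for (i = 0; i < len; i++)
			dst->vaddr_iomem[offset + i] = s[i];
	} else {
		memcpy((uint8_t *)dst->vaddr + offset, src, len);
	}
}

/* Copy len bytes from the mapping at the given offset into dst. */
static void demo_map_memcpy_from_offset(void *dst, const struct demo_map *src,
					size_t offset, size_t len)
{
	uint8_t *d = dst;
	size_t i;

	if (src->is_iomem) {
		for (i = 0; i < len; i++)
			d[i] = src->vaddr_iomem[offset + i];
	} else {
		memcpy(dst, (const uint8_t *)src->vaddr + offset, len);
	}
}

#define DEMO_FIELD_SIZE(type, field) sizeof(((type *)0)->field)

/* Read type.field from the mapping into *out. */
#define demo_map_read_field(map, type, field, out)			\
	demo_map_memcpy_from_offset((out), (map), offsetof(type, field), \
				    DEMO_FIELD_SIZE(type, field))

/* Write *in to type.field in the mapping. */
#define demo_map_write_field(map, type, field, in)			\
	demo_map_memcpy_to_offset((map), offsetof(type, field), (in),	\
				  DEMO_FIELD_SIZE(type, field))

/* Made-up layout for illustration; not the real ADS blob. */
struct demo_ads {
	uint32_t flags;
	uint16_t engine_count;
	uint64_t base;
};

/* Write one field and read it back through the helpers. */
static uint16_t demo_roundtrip(bool is_iomem)
{
	static uint8_t backing[sizeof(struct demo_ads)];
	struct demo_map map = { .is_iomem = is_iomem };
	uint16_t in = 5, out = 0;

	assert(offsetof(struct demo_ads, engine_count) + sizeof(in) <=
	       sizeof(backing));

	if (is_iomem)
		map.vaddr_iomem = backing;
	else
		map.vaddr = backing;

	demo_map_write_field(&map, struct demo_ads, engine_count, &in);
	demo_map_read_field(&map, struct demo_ads, engine_count, &out);

	return out;
}
```

The caller never dereferences a struct pointer into the buffer; it only names
the type and the field, and the helper derives the offset and size.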

Re: [PATCH] drm/bridge: synopsys/dw-hdmi: set cec clock rate

2022-01-26 Thread Fabio Estevam

On Wed, Jan 26, 2022 at 5:25 PM Peter Geis  wrote:

> +
> +		ret = clk_set_rate(hdmi->cec_clk, HDMI_CEC_CLK_RATE);
> +		if (ret)
> +			dev_warn(hdmi->dev, "Cannot set HDMI cec clock rate: %d\n", ret);

You are setting the cec clock rate after it has been enabled, which
can be glitchy.

Better to set the rate prior to enabling the clock.
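
Editorial sketch of the ordering concern raised here, as a toy user-space
model. Everything named demo_* is hypothetical and is not the kernel clk API;
the model only counts rate changes that happen while the clock is already
running, which is the situation the review comment warns about. The 24 MHz
starting rate is the boot-time value mentioned elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy clock; not the kernel's struct clk. */
struct demo_clk {
	unsigned long rate;
	bool enabled;
	unsigned int live_rate_changes;	/* rate changes seen while running */
};

static void demo_clk_set_rate(struct demo_clk *clk, unsigned long rate)
{
	if (clk->enabled && clk->rate != rate)
		clk->live_rate_changes++;
	clk->rate = rate;
}

static void demo_clk_enable(struct demo_clk *clk)
{
	clk->enabled = true;
}

/* Suggested order: program the rate first, then let the clock run. */
static unsigned int demo_rate_then_enable(void)
{
	struct demo_clk cec = { .rate = 24000000 };

	demo_clk_set_rate(&cec, 32768);
	demo_clk_enable(&cec);
	assert(cec.enabled && cec.rate == 32768);

	return cec.live_rate_changes;
}

/* Order in the posted patch: the consumer may briefly see 24 MHz. */
static unsigned int demo_enable_then_rate(void)
{
	struct demo_clk cec = { .rate = 24000000 };

	demo_clk_enable(&cec);
	demo_clk_set_rate(&cec, 32768);
	assert(cec.enabled && cec.rate == 32768);

	return cec.live_rate_changes;
}
```

Both orders end at the same rate; only the second one changes the rate under
a running consumer.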

[PATCH] drm: rcar-du: Drop LVDS device tree backward compatibility

2022-01-26 Thread Laurent Pinchart

The rcar-du driver goes to great lengths to preserve device tree
backward compatibility for the LVDS encoders by patching old device
trees at runtime.

The last R-Car Gen2 platform was converted to the new bindings in commit
edb0c3affe5214a2 ("ARM: dts: r8a7793: Convert to new LVDS DT bindings"),
in v4.17, and the last RZ/G1 platform was converted in commit
6a6a797625b5fe85 ("ARM: dts: r8a7743: Convert to new LVDS DT bindings"),
in v5.0. Both are older than commit 58256143cff7c2e0 ("clk: renesas:
Remove R-Car Gen2 legacy DT clock support"), in v5.5, which removes
support for legacy bindings for clocks. The LVDS compatibility code is
thus not needed anymore. Drop it.

Signed-off-by: Laurent Pinchart 
---
 drivers/gpu/drm/rcar-du/Makefile  |   6 -
 drivers/gpu/drm/rcar-du/rcar_du_drv.c |  15 +-
 drivers/gpu/drm/rcar-du/rcar_du_of.c  | 323 --
 drivers/gpu/drm/rcar-du/rcar_du_of.h  |  20 --
 .../drm/rcar-du/rcar_du_of_lvds_r8a7790.dts   |  69 
 .../drm/rcar-du/rcar_du_of_lvds_r8a7791.dts   |  43 ---
 .../drm/rcar-du/rcar_du_of_lvds_r8a7793.dts   |  43 ---
 .../drm/rcar-du/rcar_du_of_lvds_r8a7795.dts   |  43 ---
 .../drm/rcar-du/rcar_du_of_lvds_r8a7796.dts   |  43 ---
 9 files changed, 1 insertion(+), 604 deletions(-)
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of.c
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of.h
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7790.dts
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7791.dts
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7793.dts
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7795.dts
 delete mode 100644 drivers/gpu/drm/rcar-du/rcar_du_of_lvds_r8a7796.dts

diff --git a/drivers/gpu/drm/rcar-du/Makefile b/drivers/gpu/drm/rcar-du/Makefile
index 286bc81b3e7c..e7275b5e7ec8 100644
--- a/drivers/gpu/drm/rcar-du/Makefile
+++ b/drivers/gpu/drm/rcar-du/Makefile
@@ -6,12 +6,6 @@ rcar-du-drm-y := rcar_du_crtc.o \
 rcar_du_kms.o \
 rcar_du_plane.o \
 
-rcar-du-drm-$(CONFIG_DRM_RCAR_LVDS)+= rcar_du_of.o \
-  rcar_du_of_lvds_r8a7790.dtb.o \
-  rcar_du_of_lvds_r8a7791.dtb.o \
-  rcar_du_of_lvds_r8a7793.dtb.o \
-  rcar_du_of_lvds_r8a7795.dtb.o \
-  rcar_du_of_lvds_r8a7796.dtb.o
 rcar-du-drm-$(CONFIG_DRM_RCAR_VSP) += rcar_du_vsp.o
 rcar-du-drm-$(CONFIG_DRM_RCAR_WRITEBACK) += rcar_du_writeback.o
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index 5a8131ef81d5..71a9df5a4834 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -28,7 +28,6 @@
 
 #include "rcar_du_drv.h"
 #include "rcar_du_kms.h"
-#include "rcar_du_of.h"
 #include "rcar_du_regs.h"
 
 /* 
-
@@ -699,19 +698,7 @@ static struct platform_driver rcar_du_platform_driver = {
},
 };
 
-static int __init rcar_du_init(void)
-{
-   rcar_du_of_init(rcar_du_of_table);
-
-   return platform_driver_register(&rcar_du_platform_driver);
-}
-module_init(rcar_du_init);
-
-static void __exit rcar_du_exit(void)
-{
-   platform_driver_unregister(&rcar_du_platform_driver);
-}
-module_exit(rcar_du_exit);
+module_platform_driver(rcar_du_platform_driver);
 
 MODULE_AUTHOR("Laurent Pinchart ");
 MODULE_DESCRIPTION("Renesas R-Car Display Unit DRM Driver");
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_of.c b/drivers/gpu/drm/rcar-du/rcar_du_of.c
deleted file mode 100644
index afef69669bb4..
--- a/drivers/gpu/drm/rcar-du/rcar_du_of.c
+++ /dev/null
@@ -1,323 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * rcar_du_of.c - Legacy DT bindings compatibility
- *
- * Copyright (C) 2018 Laurent Pinchart 
- *
- * Based on work from Jyri Sarha 
- * Copyright (C) 2015 Texas Instruments
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "rcar_du_crtc.h"
-#include "rcar_du_drv.h"
-#include "rcar_du_of.h"
-
-/* 
-
- * Generic Overlay Handling
- */
-
-struct rcar_du_of_overlay {
-   const char *compatible;
-   void *begin;
-   void *end;
-};
-
-#define RCAR_DU_OF_DTB(type, soc)  \
-   extern char __dtb_rcar_du_of_##type##_##soc##_begin[];  \
-   extern char __dtb_rcar_du_of_##type##_##soc##_end[]
-
-#define RCAR_DU_OF_OVERLAY(type, soc)  \
-   {   \
-   .compatible = "renesas,du-" #soc,   \
-   .begin = __dtb_rcar_du_of_##type##_##soc##_begin,   \
-   .end = __dtb_rc

[PATCH] drm/bridge: synopsys/dw-hdmi: set cec clock rate

2022-01-26 Thread Peter Geis

The hdmi-cec clock must run at 32 kHz for CEC to work correctly.
After enabling the clock, set its rate so that the hardware works as
expected.
Warn on failure, in case this is a static clock that is slightly off.
Fixes hdmi-cec support on Rockchip devices.

Fixes: ebe32c3e282a ("drm/bridge: synopsys/dw-hdmi: Enable cec clock")

Signed-off-by: Peter Geis 
---
 drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..1a96da60e357 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -48,6 +48,9 @@
 
 #define HDMI14_MAX_TMDSCLK 34000
 
+/* HDMI CEC needs a clock rate of 32khz */
+#define HDMI_CEC_CLK_RATE  32768
+
 enum hdmi_datamap {
RGB444_8B = 0x01,
RGB444_10B = 0x03,
@@ -3347,6 +3350,10 @@ struct dw_hdmi *dw_hdmi_probe(struct platform_device *pdev,
ret);
goto err_iahb;
}
+
+		ret = clk_set_rate(hdmi->cec_clk, HDMI_CEC_CLK_RATE);
+		if (ret)
+			dev_warn(hdmi->dev, "Cannot set HDMI cec clock rate: %d\n", ret);
}
 
/* Product and revision IDs */
-- 
2.25.1

Re: [PATCH 2/4] drm/i915/guc: Cancel requests immediately

2022-01-26 Thread Matthew Brost

On Wed, Jan 26, 2022 at 10:58:46AM -0800, John Harrison wrote:
> On 1/24/2022 07:01, Matthew Brost wrote:
> > Change the preemption timeout to the smallest possible value (1 us) when
> > disabling scheduling to cancel a request and restore it after
> > cancellation. This not only cancels the request as fast as possible, it
> > fixes a bug where the preemption timeout is 0 which results in the
> > schedule disable hanging forever.
> Shouldn't there be an 'if' in the above statement? The pre-emption timeout
> is not normally zero.
>

Yes. Will reword.
 
> > 
> > Reported-by: Jani Saarinen 
> > Fixes: 62eaf0ae217d4 ("drm/i915/guc: Support request cancellation")
> > Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4960
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context_types.h |  5 ++
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 46 +++
> >   2 files changed, 31 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index 30cd81ad8911a..730998823dbea 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -198,6 +198,11 @@ struct intel_context {
> >  * each priority bucket
> >  */
> > u32 prio_count[GUC_CLIENT_PRIORITY_NUM];
> > +   /**
> > +* @preemption_timeout: preemption timeout of the context, used
> > +* to restore this value after request cancellation
> > +*/
> > +   u32 preemption_timeout;
> > } guc_state;
> > struct {
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 3918f1be114fa..966947c450253 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -2147,7 +2147,8 @@ static inline u32 get_children_join_value(struct 
> > intel_context *ce,
> > return __get_parent_scratch(ce)->join[child_index].semaphore;
> >   }
> > -static void guc_context_policy_init(struct intel_engine_cs *engine,
> > +static void guc_context_policy_init(struct intel_context *ce,
> > +   struct intel_engine_cs *engine,
> > struct guc_lrc_desc *desc)
> Shouldn't engine be before ce? The more general structure usually goes
> first.
> 

Sure. Will fix this in the next rev.

Matt

> John.
> 
> >   {
> > desc->policy_flags = 0;
> > @@ -2157,7 +2158,8 @@ static void guc_context_policy_init(struct 
> > intel_engine_cs *engine,
> > /* NB: For both of these, zero means disabled. */
> > desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
> > -   desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
> > +   ce->guc_state.preemption_timeout = engine->props.preempt_timeout_ms * 
> > 1000;
> > +   desc->preemption_timeout = ce->guc_state.preemption_timeout;
> >   }
> >   static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
> > @@ -2193,7 +2195,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
> > bool loop)
> > desc->hw_context_desc = ce->lrc.lrca;
> > desc->priority = ce->guc_state.prio;
> > desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
> > -   guc_context_policy_init(engine, desc);
> > +   guc_context_policy_init(ce, engine, desc);
> > /*
> >  * If context is a parent, we need to register a process descriptor
> > @@ -2226,7 +2228,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
> > bool loop)
> > desc->hw_context_desc = child->lrc.lrca;
> > desc->priority = ce->guc_state.prio;
> > desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
> > -   guc_context_policy_init(engine, desc);
> > +   guc_context_policy_init(child, engine, desc);
> > }
> > clear_children_join_go_memory(ce);
> > @@ -2409,6 +2411,19 @@ static u16 prep_context_pending_disable(struct 
> > intel_context *ce)
> > return ce->guc_id.id;
> >   }
> > +static void __guc_context_set_preemption_timeout(struct intel_guc *guc,
> > +u16 guc_id,
> > +u32 preemption_timeout)
> > +{
> > +   u32 action[] = {
> > +   INTEL_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT,
> > +   guc_id,
> > +   preemption_timeout
> > +   };
> > +
> > +   intel_guc_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, true);
> > +}
> > +
> >   static struct i915_sw_fence *guc_context_block(struct intel_context *ce)
> >   {
> > struct intel_guc *guc = ce_to_guc(ce);
> > @@ -2442,8 +2457,10 @@ static struct i915_sw_fence 
> > *guc_context_block(struct intel_context *ce)
> > spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > -   with_

Re: [PATCH 3/4] drm/i915/execlists: Fix execlists request cancellation corner case

2022-01-26 Thread Matthew Brost

On Wed, Jan 26, 2022 at 11:03:24AM -0800, John Harrison wrote:
> On 1/24/2022 07:01, Matthew Brost wrote:
> > More than 1 request can be submitted to a single ELSP at a time if
> > multiple requests are ready to run on the same context. When a request
> > is canceled it is marked bad, an idle pulse is triggered to the engine
> > (high priority kernel request), the execlists scheduler sees that
> > running request is bad and sets preemption timeout to minimum value (1
> > ms). This fails to work if multiple requests are combined on the ELSP as
> > only the most recent request is stored in the execlists scheduler (the
> > request stored in the ELSP isn't marked bad, thus preemption timeout
> > isn't set to the minimum value). If the preempt timeout is configured to
> > zero, the engine is permanently hung. This is shown by an upcoming
> > selftest.
> > 
> > To work around this, mark the idle pulse with a flag to force a preempt
> > with the minimum value.
> > 
> > Fixes: 38b237eab2bc7 ("drm/i915: Individual request cancellation")
> > Signed-off-by: Matthew Brost 
> > ---
> >   .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 23 +++
> >   .../gpu/drm/i915/gt/intel_engine_heartbeat.h  |  1 +
> >   .../drm/i915/gt/intel_execlists_submission.c  | 18 ++-
> >   drivers/gpu/drm/i915/i915_request.h   |  6 +
> >   4 files changed, 38 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > index a3698f611f457..efd1c719b4072 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
> > @@ -243,7 +243,8 @@ void intel_engine_init_heartbeat(struct intel_engine_cs 
> > *engine)
> > INIT_DELAYED_WORK(&engine->heartbeat.work, heartbeat);
> >   }
> > -static int __intel_engine_pulse(struct intel_engine_cs *engine)
> > +static int __intel_engine_pulse(struct intel_engine_cs *engine,
> > +   bool force_preempt)
> >   {
> > struct i915_sched_attr attr = { .priority = I915_PRIORITY_BARRIER };
> > struct intel_context *ce = engine->kernel_context;
> > @@ -258,6 +259,8 @@ static int __intel_engine_pulse(struct intel_engine_cs 
> > *engine)
> > return PTR_ERR(rq);
> > __set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
> > +   if (force_preempt)
> > +   __set_bit(I915_FENCE_FLAG_FORCE_PREEMPT, &rq->fence.flags);
> > heartbeat_commit(rq, &attr);
> > GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
> > @@ -299,7 +302,7 @@ int intel_engine_set_heartbeat(struct intel_engine_cs 
> > *engine,
> > /* recheck current execution */
> > if (intel_engine_has_preemption(engine)) {
> > -   err = __intel_engine_pulse(engine);
> > +   err = __intel_engine_pulse(engine, false);
> > if (err)
> > set_heartbeat(engine, saved);
> > }
> > @@ -312,7 +315,8 @@ int intel_engine_set_heartbeat(struct intel_engine_cs 
> > *engine,
> > return err;
> >   }
> > -int intel_engine_pulse(struct intel_engine_cs *engine)
> > +static int _intel_engine_pulse(struct intel_engine_cs *engine,
> > +  bool force_preempt)
> >   {
> > struct intel_context *ce = engine->kernel_context;
> > int err;
> > @@ -325,7 +329,7 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
> > err = -EINTR;
> > if (!mutex_lock_interruptible(&ce->timeline->mutex)) {
> > -   err = __intel_engine_pulse(engine);
> > +   err = __intel_engine_pulse(engine, force_preempt);
> > mutex_unlock(&ce->timeline->mutex);
> > }
> > @@ -334,6 +338,17 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
> > return err;
> >   }
> > +int intel_engine_pulse(struct intel_engine_cs *engine)
> > +{
> > +   return _intel_engine_pulse(engine, false);
> > +}
> > +
> > +
> > +int intel_engine_pulse_force_preempt(struct intel_engine_cs *engine)
> > +{
> > +   return _intel_engine_pulse(engine, true);
> > +}
> > +
> >   int intel_engine_flush_barriers(struct intel_engine_cs *engine)
> >   {
> > struct i915_sched_attr attr = { .priority = I915_PRIORITY_MIN };
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h 
> > b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
> > index 5da6d809a87a2..d9c8386754cb3 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
> > @@ -21,6 +21,7 @@ void intel_gt_park_heartbeats(struct intel_gt *gt);
> >   void intel_gt_unpark_heartbeats(struct intel_gt *gt);
> >   int intel_engine_pulse(struct intel_engine_cs *engine);
> > +int intel_engine_pulse_force_preempt(struct intel_engine_cs *engine);
> >   int intel_engine_flush_barriers(struct intel_engine_cs *engine);
> >   #endif /* INTEL_ENGINE_HEARTBEAT_H */
> > diff --git

Re: [Intel-gfx] [PATCH 1/3] drm: Stop spamming log with drm_cache message

2022-01-26 Thread Lucas De Marchi


On Wed, Jan 26, 2022 at 08:24:54PM +0200, Jani Nikula wrote:

On Tue, 25 Jan 2022, Lucas De Marchi  wrote:

Only x86 and in some cases PPC have support added in drm_cache.c for the
clflush class of functions. However warning once is sufficient to taint
the log instead of spamming it with "Architecture has no drm_cache.c
support" every few milliseconds.

Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/drm_cache.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index f19d9acbe959..2d5a4c463a4f 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -112,7 +112,6 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
kunmap_atomic(page_virtual);
}
 #else
-   pr_err("Architecture has no drm_cache.c support\n");
WARN_ON_ONCE(1);


An alternative would be to replace the two lines with:

WARN_ONCE(1, "Architecture has no drm_cache.c support\n");

But I'm not insisting.


I actually like that suggestion. I will change that in the next version.

Thanks
Lucas De Marchi
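
Editorial sketch of the warn-once idea discussed in this message, as a
user-space model. DEMO_WARN_ONCE and the other demo_* names are hypothetical;
unlike the kernel's WARN_ONCE() this macro is a plain statement, does not
return the condition, and prints no backtrace. It only shows how a static
flag at the call site keeps the message from repeating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of messages actually emitted, kept only for the self-check. */
static unsigned int demo_warn_count;

/* Print msg the first time cond is true at this call site, then stay quiet. */
#define DEMO_WARN_ONCE(cond, msg)				\
	do {							\
		static bool demo_warned;			\
		if ((cond) && !demo_warned) {			\
			demo_warned = true;			\
			demo_warn_count++;			\
			fputs(msg, stderr);			\
		}						\
	} while (0)

/* Stand-in for the unsupported-architecture branch. */
static void demo_clflush_unsupported(void)
{
	DEMO_WARN_ONCE(1, "Architecture has no drm_cache.c support\n");
}

/* Hit the unsupported path many times and report how often it warned. */
static unsigned int demo_warn_selfcheck(void)
{
	int i;

	for (i = 0; i < 1000; i++) {
		demo_clflush_unsupported();
		assert(demo_warn_count <= 1);
	}

	return demo_warn_count;
}
```

A thousand calls produce a single line on stderr, which is the behaviour the
suggestion is after: one message carrying the text, rather than a bare
warning plus a repeated pr_err().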

Re: [Intel-gfx] [PATCH 18/20] drm/i915/uapi: forbid ALLOC_TOPDOWN for error capture

2022-01-26 Thread kernel test robot

Hi Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next drm/drm-next v5.17-rc1 next-20220125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Matthew-Auld/Initial-support-for-small-BAR-recovery/20220126-232640
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: x86_64-randconfig-a013-20220124 
(https://download.01.org/0day-ci/archive/20220127/202201270346.fzrpmvzl-...@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
2a1b7aa016c0f4b5598806205bdfbab1ea2d92c4)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/33b0a9f1f9810bd16cef89ce1e5787751583661e
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Matthew-Auld/Initial-support-for-small-BAR-recovery/20220126-232640
git checkout 33b0a9f1f9810bd16cef89ce1e5787751583661e
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3426:6: error: assigning to 
>> 'int' from incompatible type 'void'
   err = eb_capture_stage(&eb);
   ^ ~
   1 error generated.
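
Editorial sketch, with a loudly stated assumption: the diagnostic only says
that eb_capture_stage() is declared as returning void at this call site. One
common way to end up there, which the report does not confirm, is a
configuration-dependent stub whose return type differs from the real
function. The user-space model below uses hypothetical demo_* names and a
made-up DEMO_CAPTURE_ERROR switch to show both variants sharing one
signature, so the caller compiles either way.

```c
#include <assert.h>

/* Hypothetical stand-in for the execbuffer state. */
struct demo_eb {
	int capture_requested;
};

#ifdef DEMO_CAPTURE_ERROR
/* "Real" variant: may fail, so it returns an error code. */
static int demo_capture_stage(struct demo_eb *eb)
{
	return eb->capture_requested < 0 ? -22 : 0;	/* -22 models -EINVAL */
}
#else
/* Stub variant: nothing to do, but the return type still matches. */
static inline int demo_capture_stage(struct demo_eb *eb)
{
	(void)eb;
	return 0;
}
#endif

/* Caller that checks the result regardless of which variant is built. */
static int demo_do_execbuffer(struct demo_eb *eb)
{
	int err;

	assert(eb);

	err = demo_capture_stage(eb);
	if (err)
		return err;

	return 0;
}
```

If the stub were declared void instead, the assignment in the caller would
fail to compile with the same class of error as reported above.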


vim +3426 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c

  3381  
  3382  if (args->flags & I915_EXEC_FENCE_OUT) {
  3383  out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
  3384  if (out_fence_fd < 0) {
  3385  err = out_fence_fd;
  3386  goto err_in_fence;
  3387  }
  3388  }
  3389  
  3390  err = eb_create(&eb);
  3391  if (err)
  3392  goto err_out_fence;
  3393  
  3394  GEM_BUG_ON(!eb.lut_size);
  3395  
  3396  err = eb_select_context(&eb);
  3397  if (unlikely(err))
  3398  goto err_destroy;
  3399  
  3400  err = eb_select_engine(&eb);
  3401  if (unlikely(err))
  3402  goto err_context;
  3403  
  3404  err = eb_lookup_vmas(&eb);
  3405  if (err) {
  3406  eb_release_vmas(&eb, true);
  3407  goto err_engine;
  3408  }
  3409  
  3410  i915_gem_ww_ctx_init(&eb.ww, true);
  3411  
  3412  err = eb_relocate_parse(&eb);
  3413  if (err) {
  3414  /*
  3415   * If the user expects the execobject.offset and
  3416   * reloc.presumed_offset to be an exact match,
  3417   * as for using NO_RELOC, then we cannot update
  3418   * the execobject.offset until we have completed
  3419   * relocation.
  3420   */
  3421  args->flags &= ~__EXEC_HAS_RELOC;
  3422  goto err_vma;
  3423  }
  3424  
  3425  ww_acquire_done(&eb.ww.ctx);
> 3426  err = eb_capture_stage(&eb);
  3427  if (err)
  3428  goto err_vma;
  3429  
  3430  out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
  3431  if (IS_ERR(out_fence)) {
  3432  err = PTR_ERR(out_fence);
  3433  out_fence = NULL;
  3434  if (eb.requests[0])
  3435  goto err_request;
  3436  else
  3437  goto err_vma;
  3438  }
  3439  
  3440  err = eb_submit(&eb);
  3441  
  3442  err_request:
  3443  eb_requests_get(&eb);
  3444  err = eb_requests_add(&eb, err);
  3445  
  3446  if (eb.fences)
  3447  signal_fence_array(&eb, eb.composite_fence ?
  3448 eb.composite_fence :
  3449 &eb.requests[0]->fence);
  3450  
  3451  if (out_fence) {
  3452  if (err == 0) {
  3453  fd_install(out_fence_fd, out_fence->file);
  3454  args->rsvd2 &= GENMASK_ULL(31, 0); /* keep 
in-fence */
  3455  args->rsvd2 |= (u64)out_fence_fd << 32;
  3456

Re: [PATCH 21/27] arm64: dts: rockchip: rk356x: Add HDMI nodes

2022-01-26 Thread Peter Geis

On Wed, Jan 26, 2022 at 2:25 PM Robin Murphy  wrote:
>
> On 2022-01-26 18:44, Peter Geis wrote:
> > On Wed, Jan 26, 2022 at 12:56 PM Robin Murphy  wrote:
> >>
> >> On 2022-01-26 16:04, Peter Geis wrote:
> >>> On Wed, Jan 26, 2022 at 9:58 AM Sascha Hauer  
> >>> wrote:
> 
>  Add support for the HDMI port found on RK3568.
> 
>  Signed-off-by: Sascha Hauer 
>  ---
> arch/arm64/boot/dts/rockchip/rk356x.dtsi | 37 +++-
> 1 file changed, 36 insertions(+), 1 deletion(-)
> 
>  diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi 
>  b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
>  index 4008bd666d01..e38fb223e9b8 100644
>  --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
>  +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
>  @@ -10,7 +10,6 @@
> #include 
> #include 
> #include 
>  -#include 
> #include 
> 
> / {
>  @@ -502,6 +501,42 @@ vop_mmu: iommu@fe043e00 {
>    status = "disabled";
>    };
> 
>  +   hdmi: hdmi@fe0a {
>  +   compatible = "rockchip,rk3568-dw-hdmi";
>  +   reg = <0x0 0xfe0a 0x0 0x2>;
>  +   interrupts = ;
>  +   clocks = <&cru PCLK_HDMI_HOST>,
>  +<&cru CLK_HDMI_SFR>,
>  +<&cru CLK_HDMI_CEC>,
>  +<&pmucru CLK_HDMI_REF>,
>  +<&cru HCLK_VOP>;
>  +   clock-names = "iahb", "isfr", "cec", "ref", "hclk";
>  +   pinctrl-names = "default";
>  +   pinctrl-0 = <&hdmitx_scl &hdmitx_sda &hdmitxm0_cec>;
> >>>
> >>> I looked into CEC support here, and it seems that it does work with one 
> >>> change.
> >>> Please add the two following lines to your patch:
> >>> assigned-clocks = <&cru CLK_HDMI_CEC>;
> >>> assigned-clock-rates = <32768>;
> >>>
> >>> The issue is the clk_rtc32k_frac clock that feeds clk_rtc_32k which
> >>> feeds clk_hdmi_cec is 24mhz at boot, which is too high for CEC to
> >>> function.
> >>
> >> Wouldn't it make far more sense to just stick a suitable clk_set_rate()
> >> call in the driver? AFAICS it's already explicitly aware of the CEC clock.
> >
> > This is handled purely in the
> > drivers/gpu/drm/bridge/synopsys/dw-hdmi.c driver, so I'm hesitant to
> > touch it there as it would affect all users, not just Rockchip.
>
> I'd have a strong hunch that it's a standard thing for the DesignWare IP
> and not affected by platform integration. I don't have the magical
> Synopsys databook, but between the trusty old i.MX6 manual and most of
> the other in-tree DTs getting their dw-hdmi "cec" clock from
> suspiciously-obviously-named sources, I'd be somewhat surprised if it
> was ever anything other than 32KHz.

My main concern was similar to the other HDMI clock issues, mainly
setting the clock can propagate up and affect other users of the
upstream clock.
I'll spin up a quick patch for this method.

Thanks,
Peter

>
> Robin.
>
> > Could someone familiar with the dw-hdmi IP weigh in on the minimum and
> > maximum clock rate the CEC block can handle?
> >
> >>
> >> Robin.
> >>
>  +   power-domains = <&power RK3568_PD_VO>;
>  +   reg-io-width = <4>;
>  +   rockchip,grf = <&grf>;
>  +   #sound-dai-cells = <0>;
>  +   status = "disabled";
>  +
>  +   ports {
>  +   #address-cells = <1>;
>  +   #size-cells = <0>;
>  +
>  +   hdmi_in: port@0 {
>  +   reg = <0>;
>  +   #address-cells = <1>;
>  +   #size-cells = <0>;
>  +   };
>  +
>  +   hdmi_out: port@1 {
>  +   reg = <1>;
>  +   #address-cells = <1>;
>  +   #size-cells = <0>;
>  +   };
>  +   };
>  +   };
>  +
>    qos_gpu: qos@fe128000 {
>    compatible = "rockchip,rk3568-qos", "syscon";
>    reg = <0x0 0xfe128000 0x0 0x20>;
>  --
>  2.30.2
> 
> >>>
> >>> ___
> >>> Linux-rockchip mailing list
> >>> linux-rockc...@lists.infradead.org
> >>> http://lists.infradead.org/mailman/listinfo/linux-rockchip

Re: [Intel-gfx] [PATCH 18/20] drm/i915/uapi: forbid ALLOC_TOPDOWN for error capture

2022-01-26 Thread kernel test robot

Hi Matthew,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next drm/drm-next v5.17-rc1 next-20220125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Matthew-Auld/Initial-support-for-small-BAR-recovery/20220126-232640
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: i386-randconfig-a002-20220124 (https://download.01.org/0day-ci/archive/20220127/202201270314.twkiundm-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/33b0a9f1f9810bd16cef89ce1e5787751583661e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Matthew-Auld/Initial-support-for-small-BAR-recovery/20220126-232640
        git checkout 33b0a9f1f9810bd16cef89ce1e5787751583661e
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c: In function 'i915_gem_do_execbuffer':
>> drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3426:6: error: void value not ignored as it ought to be
3426 |  err = eb_capture_stage(&eb);
 |  ^
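
gcc emits "void value not ignored as it ought to be" when the result of a call to a function declared as returning void is used, here by assigning it to err. The report does not show which declaration of eb_capture_stage() this randconfig picked up, so the following is only a sketch of the general shape of a fix, using hypothetical names (`struct toy_eb`, `toy_capture_stage()`, `toy_do_execbuffer()`): every variant of the function that the caller can see has to return int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the execbuffer state. */
struct toy_eb {
	int capture_ok;  /* pretend outcome of the capture staging step */
};

/*
 * Returning int (rather than void) is what makes "err = ..." legal.
 * A stub variant for a disabled feature would simply return 0.
 */
static int toy_capture_stage(const struct toy_eb *eb)
{
	assert(eb != NULL);
	return eb->capture_ok ? 0 : -ENOMEM;
}

/* Caller shaped like the flagged line: assign, test, bail out on error. */
static int toy_do_execbuffer(const struct toy_eb *eb)
{
	int err;

	err = toy_capture_stage(eb);
	if (err)
		return err;

	return 0;
}
```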


vim +3426 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c

  3381  
  3382  if (args->flags & I915_EXEC_FENCE_OUT) {
  3383  out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
  3384  if (out_fence_fd < 0) {
  3385  err = out_fence_fd;
  3386  goto err_in_fence;
  3387  }
  3388  }
  3389  
  3390  err = eb_create(&eb);
  3391  if (err)
  3392  goto err_out_fence;
  3393  
  3394  GEM_BUG_ON(!eb.lut_size);
  3395  
  3396  err = eb_select_context(&eb);
  3397  if (unlikely(err))
  3398  goto err_destroy;
  3399  
  3400  err = eb_select_engine(&eb);
  3401  if (unlikely(err))
  3402  goto err_context;
  3403  
  3404  err = eb_lookup_vmas(&eb);
  3405  if (err) {
  3406  eb_release_vmas(&eb, true);
  3407  goto err_engine;
  3408  }
  3409  
  3410  i915_gem_ww_ctx_init(&eb.ww, true);
  3411  
  3412  err = eb_relocate_parse(&eb);
  3413  if (err) {
  3414  /*
  3415   * If the user expects the execobject.offset and
  3416   * reloc.presumed_offset to be an exact match,
  3417   * as for using NO_RELOC, then we cannot update
  3418   * the execobject.offset until we have completed
  3419   * relocation.
  3420   */
  3421  args->flags &= ~__EXEC_HAS_RELOC;
  3422  goto err_vma;
  3423  }
  3424  
  3425  ww_acquire_done(&eb.ww.ctx);
> 3426  err = eb_capture_stage(&eb);
  3427  if (err)
  3428  goto err_vma;
  3429  
  3430  out_fence = eb_requests_create(&eb, in_fence, out_fence_fd);
  3431  if (IS_ERR(out_fence)) {
  3432  err = PTR_ERR(out_fence);
  3433  out_fence = NULL;
  3434  if (eb.requests[0])
  3435  goto err_request;
  3436  else
  3437  goto err_vma;
  3438  }
  3439  
  3440  err = eb_submit(&eb);
  3441  
  3442  err_request:
  3443  eb_requests_get(&eb);
  3444  err = eb_requests_add(&eb, err);
  3445  
  3446  if (eb.fences)
  3447  signal_fence_array(&eb, eb.composite_fence ?
  3448 eb.composite_fence :
  3449 &eb.requests[0]->fence);
  3450  
  3451  if (out_fence) {
  3452  if (err == 0) {
  3453  fd_install(out_fence_fd, out_fence->file);
  3454  args->rsvd2 &= GENMASK_ULL(31, 0); /* keep in-fence */
  3455  args->rsvd2 |= (u64)out_fence_fd << 32;
  3456  out_fence_fd = -1;
  3457  } else {
  3458  fput(out_fence->file);
  3459  }
  3460  }
  3461  
  3462  if (unlikely(eb.gem_context->syncobj)) {

Re: [PATCH 21/27] arm64: dts: rockchip: rk356x: Add HDMI nodes

2022-01-26 Thread Robin Murphy


On 2022-01-26 18:44, Peter Geis wrote:

On Wed, Jan 26, 2022 at 12:56 PM Robin Murphy  wrote:


On 2022-01-26 16:04, Peter Geis wrote:

On Wed, Jan 26, 2022 at 9:58 AM Sascha Hauer  wrote:


Add support for the HDMI port found on RK3568.

Signed-off-by: Sascha Hauer 
---
   arch/arm64/boot/dts/rockchip/rk356x.dtsi | 37 +++-
   1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi 
b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
index 4008bd666d01..e38fb223e9b8 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
@@ -10,7 +10,6 @@
   #include 
   #include 
   #include 
-#include 
   #include 

   / {
@@ -502,6 +501,42 @@ vop_mmu: iommu@fe043e00 {
  status = "disabled";
  };

+   hdmi: hdmi@fe0a {
+   compatible = "rockchip,rk3568-dw-hdmi";
+   reg = <0x0 0xfe0a 0x0 0x2>;
+   interrupts = ;
+   clocks = <&cru PCLK_HDMI_HOST>,
+<&cru CLK_HDMI_SFR>,
+<&cru CLK_HDMI_CEC>,
+<&pmucru CLK_HDMI_REF>,
+<&cru HCLK_VOP>;
+   clock-names = "iahb", "isfr", "cec", "ref", "hclk";
+   pinctrl-names = "default";
+   pinctrl-0 = <&hdmitx_scl &hdmitx_sda &hdmitxm0_cec>;


I looked into CEC support here, and it seems that it does work with one change.
Please add the two following lines to your patch:
assigned-clocks = <&cru CLK_HDMI_CEC>;
assigned-clock-rates = <32768>;

The issue is that the clk_rtc32k_frac clock, which feeds clk_rtc_32k,
which in turn feeds clk_hdmi_cec, is 24 MHz at boot, which is too high
for CEC to function.


Wouldn't it make far more sense to just stick a suitable clk_set_rate()
call in the driver? AFAICS it's already explicitly aware of the CEC clock.


This is handled purely in the
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c driver, so I'm hesitant to
touch it there as it would affect all users, not just Rockchip.


I'd have a strong hunch that it's a standard thing for the DesignWare IP 
and not affected by platform integration. I don't have the magical 
Synopsys databook, but between the trusty old i.MX6 manual and most of 
the other in-tree DTs getting their dw-hdmi "cec" clock from 
suspiciously-obviously-named sources, I'd be somewhat surprised if it 
was ever anything other than 32KHz.


Robin.


Could someone familiar with the dw-hdmi IP weigh in on the minimum and
maximum clock rate the CEC block can handle?



Robin.


+   power-domains = <&power RK3568_PD_VO>;
+   reg-io-width = <4>;
+   rockchip,grf = <&grf>;
+   #sound-dai-cells = <0>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   hdmi_in: port@0 {
+   reg = <0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+
+   hdmi_out: port@1 {
+   reg = <1>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+   };
+   };
+
  qos_gpu: qos@fe128000 {
  compatible = "rockchip,rk3568-qos", "syscon";
  reg = <0x0 0xfe128000 0x0 0x20>;
--
2.30.2




Re: [PATCH 3/4] drm/i915/execlists: Fix execlists request cancellation corner case

2022-01-26 Thread John Harrison


On 1/24/2022 07:01, Matthew Brost wrote:

More than 1 request can be submitted to a single ELSP at a time if
multiple requests are ready to run on the same context. When a request
is canceled it is marked bad, an idle pulse is triggered to the engine
(high priority kernel request), the execlists scheduler sees that the
running request is bad and sets the preemption timeout to the minimum
value (1 ms). This fails to work if multiple requests are combined on the
ELSP, as only the most recent request is stored in the execlists
scheduler (the request stored in the ELSP isn't marked bad, thus the
preemption timeout isn't set to the minimum value). If the preempt
timeout is configured to zero, the engine is permanently hung. This is
shown by an upcoming selftest.

To work around this, mark the idle pulse with a flag to force a preempt
with the minimum value.

Fixes: 38b237eab2bc7 ("drm/i915: Individual request cancellation")
Signed-off-by: Matthew Brost 
---
  .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 23 +++
  .../gpu/drm/i915/gt/intel_engine_heartbeat.h  |  1 +
  .../drm/i915/gt/intel_execlists_submission.c  | 18 ++-
  drivers/gpu/drm/i915/i915_request.h   |  6 +
  4 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index a3698f611f457..efd1c719b4072 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -243,7 +243,8 @@ void intel_engine_init_heartbeat(struct intel_engine_cs 
*engine)
INIT_DELAYED_WORK(&engine->heartbeat.work, heartbeat);
  }
  
-static int __intel_engine_pulse(struct intel_engine_cs *engine)

+static int __intel_engine_pulse(struct intel_engine_cs *engine,
+   bool force_preempt)
  {
struct i915_sched_attr attr = { .priority = I915_PRIORITY_BARRIER };
struct intel_context *ce = engine->kernel_context;
@@ -258,6 +259,8 @@ static int __intel_engine_pulse(struct intel_engine_cs 
*engine)
return PTR_ERR(rq);
  
  	__set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);

+   if (force_preempt)
+   __set_bit(I915_FENCE_FLAG_FORCE_PREEMPT, &rq->fence.flags);
  
  	heartbeat_commit(rq, &attr);

GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
@@ -299,7 +302,7 @@ int intel_engine_set_heartbeat(struct intel_engine_cs 
*engine,
  
  		/* recheck current execution */

if (intel_engine_has_preemption(engine)) {
-   err = __intel_engine_pulse(engine);
+   err = __intel_engine_pulse(engine, false);
if (err)
set_heartbeat(engine, saved);
}
@@ -312,7 +315,8 @@ int intel_engine_set_heartbeat(struct intel_engine_cs 
*engine,
return err;
  }
  
-int intel_engine_pulse(struct intel_engine_cs *engine)

+static int _intel_engine_pulse(struct intel_engine_cs *engine,
+  bool force_preempt)
  {
struct intel_context *ce = engine->kernel_context;
int err;
@@ -325,7 +329,7 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
  
  	err = -EINTR;

if (!mutex_lock_interruptible(&ce->timeline->mutex)) {
-   err = __intel_engine_pulse(engine);
+   err = __intel_engine_pulse(engine, force_preempt);
mutex_unlock(&ce->timeline->mutex);
}
  
@@ -334,6 +338,17 @@ int intel_engine_pulse(struct intel_engine_cs *engine)

return err;
  }
  
+int intel_engine_pulse(struct intel_engine_cs *engine)

+{
+   return _intel_engine_pulse(engine, false);
+}
+
+
+int intel_engine_pulse_force_preempt(struct intel_engine_cs *engine)
+{
+   return _intel_engine_pulse(engine, true);
+}
+
  int intel_engine_flush_barriers(struct intel_engine_cs *engine)
  {
struct i915_sched_attr attr = { .priority = I915_PRIORITY_MIN };
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
index 5da6d809a87a2..d9c8386754cb3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
@@ -21,6 +21,7 @@ void intel_gt_park_heartbeats(struct intel_gt *gt);
  void intel_gt_unpark_heartbeats(struct intel_gt *gt);
  
  int intel_engine_pulse(struct intel_engine_cs *engine);

+int intel_engine_pulse_force_preempt(struct intel_engine_cs *engine);
  int intel_engine_flush_barriers(struct intel_engine_cs *engine);
  
  #endif /* INTEL_ENGINE_HEARTBEAT_H */

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 960a9aaf4f3a3..f0c2024058731 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1222,26 +1222,29 @@ static void record_preemption(struct 
i

Re: [PATCH 2/4] drm/i915/guc: Cancel requests immediately

2022-01-26 Thread John Harrison


On 1/24/2022 07:01, Matthew Brost wrote:

Change the preemption timeout to the smallest possible value (1 us) when
disabling scheduling to cancel a request and restore it after
cancellation. This not only cancels the request as fast as possible, it
fixes a bug where the preemption timeout is 0 which results in the
schedule disable hanging forever.

Shouldn't there be an 'if' in the above statement? The pre-emption
timeout is not normally zero.




Reported-by: Jani Saarinen 
Fixes: 62eaf0ae217d4 ("drm/i915/guc: Support request cancellation")
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/4960
Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_context_types.h |  5 ++
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 46 +++
  2 files changed, 31 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 30cd81ad8911a..730998823dbea 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -198,6 +198,11 @@ struct intel_context {
 * each priority bucket
 */
u32 prio_count[GUC_CLIENT_PRIORITY_NUM];
+   /**
+* @preemption_timeout: preemption timeout of the context, used
+* to restore this value after request cancellation
+*/
+   u32 preemption_timeout;
} guc_state;
  
  	struct {

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 3918f1be114fa..966947c450253 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2147,7 +2147,8 @@ static inline u32 get_children_join_value(struct 
intel_context *ce,
return __get_parent_scratch(ce)->join[child_index].semaphore;
  }
  
-static void guc_context_policy_init(struct intel_engine_cs *engine,

+static void guc_context_policy_init(struct intel_context *ce,
+   struct intel_engine_cs *engine,
struct guc_lrc_desc *desc)

Shouldn't engine be before ce? The more general structure usually goes
first.


John.


  {
desc->policy_flags = 0;
@@ -2157,7 +2158,8 @@ static void guc_context_policy_init(struct 
intel_engine_cs *engine,
  
  	/* NB: For both of these, zero means disabled. */

desc->execution_quantum = engine->props.timeslice_duration_ms * 1000;
-   desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
+   ce->guc_state.preemption_timeout = engine->props.preempt_timeout_ms * 
1000;
+   desc->preemption_timeout = ce->guc_state.preemption_timeout;
  }
  
  static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)

@@ -2193,7 +2195,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
desc->hw_context_desc = ce->lrc.lrca;
desc->priority = ce->guc_state.prio;
desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
-   guc_context_policy_init(engine, desc);
+   guc_context_policy_init(ce, engine, desc);
  
  	/*

 * If context is a parent, we need to register a process descriptor
@@ -2226,7 +2228,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
desc->hw_context_desc = child->lrc.lrca;
desc->priority = ce->guc_state.prio;
desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
-   guc_context_policy_init(engine, desc);
+   guc_context_policy_init(child, engine, desc);
}
  
  		clear_children_join_go_memory(ce);

@@ -2409,6 +2411,19 @@ static u16 prep_context_pending_disable(struct 
intel_context *ce)
return ce->guc_id.id;
  }
  
+static void __guc_context_set_preemption_timeout(struct intel_guc *guc,

+u16 guc_id,
+u32 preemption_timeout)
+{
+   u32 action[] = {
+   INTEL_GUC_ACTION_SET_CONTEXT_PREEMPTION_TIMEOUT,
+   guc_id,
+   preemption_timeout
+   };
+
+   intel_guc_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, true);
+}
+
  static struct i915_sw_fence *guc_context_block(struct intel_context *ce)
  {
struct intel_guc *guc = ce_to_guc(ce);
@@ -2442,8 +2457,10 @@ static struct i915_sw_fence *guc_context_block(struct 
intel_context *ce)
  
  	spin_unlock_irqrestore(&ce->guc_state.lock, flags);
  
-	with_intel_runtime_pm(runtime_pm, wakeref)

+   with_intel_runtime_pm(runtime_pm, wakeref) {
+   __guc_context_set_preemption_timeout(guc, guc_id, 1);
__guc_context_sched_disable(guc, ce, guc_id);
+   }
  
  	return &ce->guc_state.blocked;

  }
@@ -2492,8 +2509,10 @@ static void guc_context_unblock(struct intel_context *
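
The save/shrink/restore idea described in the commit message can be sketched in userspace. This is a toy model under stated assumptions, not the i915 code: `struct toy_context` and the `toy_context_*()` helpers are hypothetical, and the "hardware" timeout is just a struct field. As in the quoted patch, the timeout is kept in microseconds and zero means "disabled".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy context state; field names are illustrative only. */
struct toy_context {
	uint32_t saved_timeout_us;  /* value to restore after cancellation */
	uint32_t hw_timeout_us;     /* value currently programmed; 0 = disabled */
	int sched_enabled;
};

/* Record the configured timeout (ms -> us) and program it. */
static void toy_context_init(struct toy_context *ce, uint32_t preempt_timeout_ms)
{
	assert(ce != NULL);
	ce->saved_timeout_us = preempt_timeout_ms * 1000;
	ce->hw_timeout_us = ce->saved_timeout_us;
	ce->sched_enabled = 1;
}

/* Cancel path: force the smallest non-zero timeout, then disable scheduling. */
static void toy_context_block(struct toy_context *ce)
{
	assert(ce != NULL);
	ce->hw_timeout_us = 1;
	ce->sched_enabled = 0;
}

/* After cancellation: put the original timeout back and re-enable. */
static void toy_context_unblock(struct toy_context *ce)
{
	assert(ce != NULL);
	ce->hw_timeout_us = ce->saved_timeout_us;
	ce->sched_enabled = 1;
}
```

The case raised in review is the one where the configured timeout is zero: only then would the schedule disable wait forever without the temporary 1 us override.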

Re: [PATCH 21/27] arm64: dts: rockchip: rk356x: Add HDMI nodes

2022-01-26 Thread Peter Geis

On Wed, Jan 26, 2022 at 12:56 PM Robin Murphy  wrote:
>
> On 2022-01-26 16:04, Peter Geis wrote:
> > On Wed, Jan 26, 2022 at 9:58 AM Sascha Hauer  wrote:
> >>
> >> Add support for the HDMI port found on RK3568.
> >>
> >> Signed-off-by: Sascha Hauer 
> >> ---
> >>   arch/arm64/boot/dts/rockchip/rk356x.dtsi | 37 +++-
> >>   1 file changed, 36 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi 
> >> b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> >> index 4008bd666d01..e38fb223e9b8 100644
> >> --- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> >> +++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
> >> @@ -10,7 +10,6 @@
> >>   #include 
> >>   #include 
> >>   #include 
> >> -#include 
> >>   #include 
> >>
> >>   / {
> >> @@ -502,6 +501,42 @@ vop_mmu: iommu@fe043e00 {
> >>  status = "disabled";
> >>  };
> >>
> >> +   hdmi: hdmi@fe0a {
> >> +   compatible = "rockchip,rk3568-dw-hdmi";
> >> +   reg = <0x0 0xfe0a 0x0 0x2>;
> >> +   interrupts = ;
> >> +   clocks = <&cru PCLK_HDMI_HOST>,
> >> +<&cru CLK_HDMI_SFR>,
> >> +<&cru CLK_HDMI_CEC>,
> >> +<&pmucru CLK_HDMI_REF>,
> >> +<&cru HCLK_VOP>;
> >> +   clock-names = "iahb", "isfr", "cec", "ref", "hclk";
> >> +   pinctrl-names = "default";
> >> +   pinctrl-0 = <&hdmitx_scl &hdmitx_sda &hdmitxm0_cec>;
> >
> > I looked into CEC support here, and it seems that it does work with one change.
> > Please add the two following lines to your patch:
> > assigned-clocks = <&cru CLK_HDMI_CEC>;
> > assigned-clock-rates = <32768>;
> >
> > The issue is that the clk_rtc32k_frac clock, which feeds clk_rtc_32k,
> > which in turn feeds clk_hdmi_cec, is 24 MHz at boot, which is too high
> > for CEC to function.
>
> Wouldn't it make far more sense to just stick a suitable clk_set_rate()
> call in the driver? AFAICS it's already explicitly aware of the CEC clock.

This is handled purely in the
drivers/gpu/drm/bridge/synopsys/dw-hdmi.c driver, so I'm hesitant to
touch it there as it would affect all users, not just Rockchip.

Could someone familiar with the dw-hdmi IP weigh in on the minimum and
maximum clock rate the CEC block can handle?

>
> Robin.
>
> >> +   power-domains = <&power RK3568_PD_VO>;
> >> +   reg-io-width = <4>;
> >> +   rockchip,grf = <&grf>;
> >> +   #sound-dai-cells = <0>;
> >> +   status = "disabled";
> >> +
> >> +   ports {
> >> +   #address-cells = <1>;
> >> +   #size-cells = <0>;
> >> +
> >> +   hdmi_in: port@0 {
> >> +   reg = <0>;
> >> +   #address-cells = <1>;
> >> +   #size-cells = <0>;
> >> +   };
> >> +
> >> +   hdmi_out: port@1 {
> >> +   reg = <1>;
> >> +   #address-cells = <1>;
> >> +   #size-cells = <0>;
> >> +   };
> >> +   };
> >> +   };
> >> +
> >>  qos_gpu: qos@fe128000 {
> >>  compatible = "rockchip,rk3568-qos", "syscon";
> >>  reg = <0x0 0xfe128000 0x0 0x20>;
> >> --
> >> 2.30.2
> >>
> >

Re: [Intel-gfx] [PATCH 02/20] drm: implement top-down allocation method

2022-01-26 Thread Robert Beckett





On 26/01/2022 15:21, Matthew Auld wrote:

From: Arunpravin 

Implement a function which walks through the order list,
compares the offsets and returns the block with the maximum
offset. This method is unpredictable in obtaining high-range
address blocks, since the result depends on the allocation and
deallocation history. For instance, if the driver requests an
address in a specific low range, the allocator traverses from
the root block and splits the larger blocks until it reaches
the specific block. In the process of splitting, the lower
orders in the freelist become occupied with low-range address
blocks, so a subsequent TOPDOWN memory request may return
low-range blocks. To overcome this issue, we may go with the
approach below.

The other approach is to sort the entries of each order list in
ascending order, compare the last entry of each order list in
the freelist and return the max block. This creates sorting
overhead on every drm_buddy_free() request and splits up
larger blocks for a single-page request.


Out of curiosity, why did you choose to implement this as an alloc flag?
It seems to me like it would be a good candidate for a new memory region.
That way the allocation algorithms wouldn't need extra logic, and ttm can
already handle migrations.




v2:
   - Fix alignment issues(Matthew Auld)
   - Remove unnecessary list_empty check(Matthew Auld)
   - merged the below patch to see the feature in action
  - add top-down alloc support to i915 driver

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/drm_buddy.c   | 36 ---
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  3 ++
  include/drm/drm_buddy.h   |  1 +
  3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 954e31962c74..6aa5c1ce25bf 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -371,6 +371,26 @@ alloc_range_bias(struct drm_buddy *mm,
return ERR_PTR(err);
  }
  
+static struct drm_buddy_block *

+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }
+
+   return max_block;
+}
+
  static struct drm_buddy_block *
  alloc_from_freelist(struct drm_buddy *mm,
unsigned int order,
@@ -381,11 +401,17 @@ alloc_from_freelist(struct drm_buddy *mm,
int err;
  
  	for (i = order; i <= mm->max_order; ++i) {

-   block = list_first_entry_or_null(&mm->free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   block = get_maxblock(&mm->free_list[i]);
+   if (block)
+   break;
+   } else {
+   block = list_first_entry_or_null(&mm->free_list[i],
+struct drm_buddy_block,
+link);
+   if (block)
+   break;
+   }
}
  
  	if (!block)

diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 1411f4cf1f21..3662434b64bb 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
INIT_LIST_HEAD(&bman_res->blocks);
bman_res->mm = mm;
  
+	if (place->flags & TTM_PL_FLAG_TOPDOWN)

+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
  
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h

index 865664b90a8a..424fc443115e 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
  })
  
  #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)

+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
  
  struct drm_buddy_block {

  #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
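
The selection step in the quoted get_maxblock() hunk can be sketched in userspace as follows. This is a simplified model, not the drm_buddy code: a plain array stands in for the kernel's linked free list, and `struct toy_block`, `toy_get_maxblock()` and `toy_pick_block()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy free block: only the offset matters for this illustration. */
struct toy_block {
	uint64_t offset;
};

/* Linear scan for the block with the highest offset; NULL if empty. */
static const struct toy_block *
toy_get_maxblock(const struct toy_block *blocks, size_t n)
{
	const struct toy_block *max = NULL;
	size_t i;

	assert(blocks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!max || blocks[i].offset > max->offset)
			max = &blocks[i];
	}
	return max;
}

/* Top-down picks the highest offset; otherwise take the first entry. */
static const struct toy_block *
toy_pick_block(const struct toy_block *blocks, size_t n, int topdown)
{
	if (n == 0)
		return NULL;
	return topdown ? toy_get_maxblock(blocks, n) : &blocks[0];
}
```

The scan is linear in the length of each free list, which is the cost being weighed against the sorted-list alternative in the commit message.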

Re: [Intel-gfx] [PATCH 1/3] drm: Stop spamming log with drm_cache message

2022-01-26 Thread Jani Nikula

On Tue, 25 Jan 2022, Lucas De Marchi  wrote:
> Only x86 and in some cases PPC have support added in drm_cache.c for the
> clflush class of functions. However warning once is sufficient to taint
> the log instead of spamming it with "Architecture has no drm_cache.c
> support" every few milliseconds.
>
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Signed-off-by: Lucas De Marchi 
> ---
>  drivers/gpu/drm/drm_cache.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> index f19d9acbe959..2d5a4c463a4f 100644
> --- a/drivers/gpu/drm/drm_cache.c
> +++ b/drivers/gpu/drm/drm_cache.c
> @@ -112,7 +112,6 @@ drm_clflush_pages(struct page *pages[], unsigned long 
> num_pages)
>   kunmap_atomic(page_virtual);
>   }
>  #else
> - pr_err("Architecture has no drm_cache.c support\n");
>   WARN_ON_ONCE(1);

An alternative would be to replace the two lines with:

WARN_ONCE(1, "Architecture has no drm_cache.c support\n");

But I'm not insisting.

BR,
Jani.


>  #endif
>  }
> @@ -143,7 +142,6 @@ drm_clflush_sg(struct sg_table *st)
>   if (wbinvd_on_all_cpus())
>   pr_err("Timed out waiting for cache flush\n");
>  #else
> - pr_err("Architecture has no drm_cache.c support\n");
>   WARN_ON_ONCE(1);
>  #endif
>  }
> @@ -177,7 +175,6 @@ drm_clflush_virt_range(void *addr, unsigned long length)
>   if (wbinvd_on_all_cpus())
>   pr_err("Timed out waiting for cache flush\n");
>  #else
> - pr_err("Architecture has no drm_cache.c support\n");
>   WARN_ON_ONCE(1);
>  #endif
>  }

-- 
Jani Nikula, Intel Open Source Graphics Center
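
The warn-once behaviour discussed above (emit the message the first time a call site is hit, stay silent afterwards) can be modelled in userspace. This is only a sketch of the semantics, not the kernel's WARN_ONCE() macro: `toy_warn_once()`, `toy_messages_printed` and `toy_clflush_unsupported()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how many times a warning was actually emitted. */
static int toy_messages_printed;

/*
 * Print msg only the first time cond is true for a given "warned" flag.
 * Returns cond so callers can still branch on it.
 */
static int toy_warn_once(int *warned, int cond, const char *msg)
{
	assert(warned != NULL && msg != NULL);
	if (cond && !*warned) {
		*warned = 1;
		fprintf(stderr, "WARNING: %s", msg);
		toy_messages_printed++;
	}
	return cond;
}

/* Stand-in for the unsupported-architecture branch of a flush helper. */
static void toy_clflush_unsupported(void)
{
	static int warned;  /* one flag per call site */

	toy_warn_once(&warned, 1, "Architecture has no drm_cache.c support\n");
}
```

Calling the helper repeatedly prints the message once, which keeps the text that the original pr_err() carried without flooding the log.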

Re: [Intel-gfx] [PATCH 01/20] drm: improve drm_buddy_alloc function

2022-01-26 Thread Jani Nikula

On Wed, 26 Jan 2022, Matthew Auld  wrote:
> From: Arunpravin 
>
> - Make drm_buddy_alloc a single function to handle
>   range allocation and non-range allocation demands
>
> - Implemented a new function alloc_range() which allocates
>   the requested power-of-two block comply with range limitations
>
> - Moved order computation and memory alignment logic from
>   i915 driver to drm buddy
>
> v2:
>   merged below changes to keep the build unbroken
>- drm_buddy_alloc_range() becomes obsolete and may be removed
>- enable ttm range allocation (fpfn / lpfn) support in i915 driver
>- apply enhanced drm_buddy_alloc() function to i915 driver
>
> v3(Matthew Auld):
>   - Fix alignment issues and remove unnecessary list_empty check
>   - add more validation checks for input arguments
>   - make alloc_range() block allocations as bottom-up
>   - optimize order computation logic
>   - replace uint64_t with u64, which is preferred in the kernel
>
> v4(Matthew Auld):
>   - keep drm_buddy_alloc_range() function implementation for generic
> actual range allocations
>   - keep alloc_range() implementation for end bias allocations
>
> v5(Matthew Auld):
>   - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
> as place->lpfn will currently always be zero for i915
>
> v6(Matthew Auld):
>   - fixup potential uaf - If we are unlucky and can't allocate
> enough memory when splitting blocks, where we temporarily
> end up with the given block and its buddy on the respective
> free list, then we need to ensure we delete both blocks,
> and not just the buddy, before potentially freeing them
>
>   - fix warnings reported by kernel test robot 
>
> Signed-off-by: Arunpravin 
> ---
>  drivers/gpu/drm/drm_buddy.c   | 326 +-
>  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
>  drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
>  include/drm/drm_buddy.h   |  22 +-
>  4 files changed, 293 insertions(+), 124 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index d60878bc9c20..954e31962c74 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -282,23 +282,99 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
> list_head *objects)
>  }
>  EXPORT_SYMBOL(drm_buddy_free_list);
>  
> -/**
> - * drm_buddy_alloc_blocks - allocate power-of-two blocks
> - *
> - * @mm: DRM buddy manager to allocate from
> - * @order: size of the allocation
> - *
> - * The order value here translates to:
> - *
> - * 0 = 2^0 * mm->chunk_size
> - * 1 = 2^1 * mm->chunk_size
> - * 2 = 2^2 * mm->chunk_size
> - *
> - * Returns:
> - * allocated ptr to the &drm_buddy_block on success
> - */
> -struct drm_buddy_block *
> -drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
> +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
> +{
> + return s1 <= e2 && e1 >= s2;
> +}
> +
> +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
> +{
> + return s1 <= s2 && e1 >= e2;
> +}
> +
> +static struct drm_buddy_block *
> +alloc_range_bias(struct drm_buddy *mm,
> +  u64 start, u64 end,
> +  unsigned int order)
> +{
> + struct drm_buddy_block *block;
> + struct drm_buddy_block *buddy;
> + LIST_HEAD(dfs);
> + int err;
> + int i;
> +
> + end = end - 1;
> +
> + for (i = 0; i < mm->n_roots; ++i)
> + list_add_tail(&mm->roots[i]->tmp_link, &dfs);
> +
> + do {
> + u64 block_start;
> + u64 block_end;
> +
> + block = list_first_entry_or_null(&dfs,
> +  struct drm_buddy_block,
> +  tmp_link);
> + if (!block)
> + break;
> +
> + list_del(&block->tmp_link);
> +
> + if (drm_buddy_block_order(block) < order)
> + continue;
> +
> + block_start = drm_buddy_block_offset(block);
> + block_end = block_start + drm_buddy_block_size(mm, block) - 1;
> +
> + if (!overlaps(start, end, block_start, block_end))
> + continue;
> +
> + if (drm_buddy_block_is_allocated(block))
> + continue;
> +
> + if (contains(start, end, block_start, block_end) &&
> + order == drm_buddy_block_order(block)) {
> + /*
> +  * Find the free block within the range.
> +  */
> + if (drm_buddy_block_is_free(block))
> + return block;
> +
> + continue;
> + }
> +
> + if (!drm_buddy_block_is_split(block)) {
> + err = split_block(mm, block);
> + if (unlikely(err))
> + goto err_undo;
> + }
> +
> + list_add(&block->
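
The two range helpers in the quoted hunk operate on inclusive bounds, which is why alloc_range_bias() does `end = end - 1` and computes block_end as start plus size minus one. A small userspace sketch, with u64 replaced by uint64_t and a hypothetical helper `toy_inclusive_end()` added for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if inclusive ranges [s1, e1] and [s2, e2] share at least one byte. */
static inline bool overlaps(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= e2 && e1 >= s2;
}

/* True if inclusive range [s2, e2] lies entirely inside [s1, e1]. */
static inline bool contains(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
	return s1 <= s2 && e1 >= e2;
}

/* Last byte of a range given its start and size (size must be non-zero). */
static inline uint64_t toy_inclusive_end(uint64_t start, uint64_t size)
{
	assert(size > 0);
	return start + size - 1;
}
```

With inclusive ends, two adjacent 4 KiB blocks such as [0x0, 0xfff] and [0x1000, 0x1fff] do not overlap, whereas an exclusive end fed to the same comparison would report a false overlap at the boundary.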

Re: [PATCH 21/27] arm64: dts: rockchip: rk356x: Add HDMI nodes

2022-01-26 Thread Robin Murphy


On 2022-01-26 16:04, Peter Geis wrote:

On Wed, Jan 26, 2022 at 9:58 AM Sascha Hauer  wrote:


Add support for the HDMI port found on RK3568.

Signed-off-by: Sascha Hauer 
---
  arch/arm64/boot/dts/rockchip/rk356x.dtsi | 37 +++-
  1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/rockchip/rk356x.dtsi 
b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
index 4008bd666d01..e38fb223e9b8 100644
--- a/arch/arm64/boot/dts/rockchip/rk356x.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk356x.dtsi
@@ -10,7 +10,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 

  / {
@@ -502,6 +501,42 @@ vop_mmu: iommu@fe043e00 {
 status = "disabled";
 };

+   hdmi: hdmi@fe0a {
+   compatible = "rockchip,rk3568-dw-hdmi";
+   reg = <0x0 0xfe0a 0x0 0x2>;
+   interrupts = ;
+   clocks = <&cru PCLK_HDMI_HOST>,
+<&cru CLK_HDMI_SFR>,
+<&cru CLK_HDMI_CEC>,
+<&pmucru CLK_HDMI_REF>,
+<&cru HCLK_VOP>;
+   clock-names = "iahb", "isfr", "cec", "ref", "hclk";
+   pinctrl-names = "default";
+   pinctrl-0 = <&hdmitx_scl &hdmitx_sda &hdmitxm0_cec>;


I looked into CEC support here, and it seems that it does work with one change.
Please add the two following lines to your patch:
assigned-clocks = <&cru CLK_HDMI_CEC>;
assigned-clock-rates = <32768>;

The issue is that the clk_rtc32k_frac clock, which feeds clk_rtc_32k,
which in turn feeds clk_hdmi_cec, is 24 MHz at boot, which is too high
for CEC to function.


Wouldn't it make far more sense to just stick a suitable clk_set_rate() 
call in the driver? AFAICS it's already explicitly aware of the CEC clock.


Robin.


+   power-domains = <&power RK3568_PD_VO>;
+   reg-io-width = <4>;
+   rockchip,grf = <&grf>;
+   #sound-dai-cells = <0>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   hdmi_in: port@0 {
+   reg = <0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+
+   hdmi_out: port@1 {
+   reg = <1>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   };
+   };
+   };
+
 qos_gpu: qos@fe128000 {
 compatible = "rockchip,rk3568-qos", "syscon";
 reg = <0x0 0xfe128000 0x0 0x20>;
--
2.30.2




Re: [PATCH v1 1/4] fbtft: Unorphan the driver

2022-01-26 Thread Jani Nikula

On Wed, 26 Jan 2022, Andy Shevchenko  wrote:
> And basically create a MIPI based driver for I2C.

What does that even mean?

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [Intel-gfx] [PATCH v5 4/5] drm/i915: add gtt misalignment test

2022-01-26 Thread Robert Beckett





On 26/01/2022 14:05, Thomas Hellström (Intel) wrote:


On 1/25/22 20:35, Robert Beckett wrote:

add test to check handling of misaligned offsets and sizes

v4:
* remove spurious blank lines
* explicitly cast intel_region_id to intel_memory_type in misaligned_pin

Reported-by: kernel test robot 

Signed-off-by: Robert Beckett 
---
  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 128 ++
  1 file changed, 128 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index b80788a2b7f9..f082b5ff3b5e 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -22,10 +22,12 @@
   *
   */
+#include "gt/intel_gtt.h"
  #include 
  #include 
  #include "gem/i915_gem_context.h"
+#include "gem/i915_gem_region.h"
  #include "gem/selftests/mock_context.h"
  #include "gt/intel_context.h"
  #include "gt/intel_gpu_commands.h"
@@ -1067,6 +1069,120 @@ static int shrink_boom(struct i915_address_space *vm,

  return err;
  }
+static int misaligned_case(struct i915_address_space *vm, struct intel_memory_region *mr,
+   u64 addr, u64 size, unsigned long flags)
+{
+    struct drm_i915_gem_object *obj;
+    struct i915_vma *vma;
+    int err = 0;
+    u64 expected_vma_size, expected_node_size;
+
+    obj = i915_gem_object_create_region(mr, size, 0, 0);
+    if (IS_ERR(obj))
+    return PTR_ERR(obj);
+
+    vma = i915_vma_instance(obj, vm, NULL);
+    if (IS_ERR(vma)) {
+    err = PTR_ERR(vma);
+    goto err_put;
+    }
+
+    err = i915_vma_pin(vma, 0, 0, addr | flags);
+    if (err)
+    goto err_put;
+    i915_vma_unpin(vma);
+
+    if (!drm_mm_node_allocated(&vma->node)) {
+    err = -EINVAL;
+    goto err_put;
+    }
+
+    if (i915_vma_misplaced(vma, 0, 0, addr | flags)) {
+    err = -EINVAL;
+    goto err_put;
+    }
+
+    expected_vma_size = round_up(size, 1 << (ffs(vma->resource->page_sizes_gtt) - 1));
+    expected_node_size = expected_vma_size;
+
+    if (IS_DG2(vm->i915) && i915_gem_object_is_lmem(obj)) {
+    /* dg2 should expand lmem node to 2MB */


Should this test be NEEDS_COMPACT_PT()?

Otherwise LGTM. Reviewed-by: Thomas Hellström 


Thanks. Good catch, forgot to retrofit the new macro here.
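
For anyone following the size check in the quoted hunk, the expected-size arithmetic can be tried out in userspace. This is only a minimal sketch, assuming page_sizes_gtt is a non-zero bitmask of page sizes; round_up_pow2() and expected_vma_size() are hypothetical stand-ins written for this illustration, not i915 functions, and the sketch does not cover the DG2 lmem expansion to 2MB.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/*
 * Userspace stand-in for the kernel's round_up().
 * Assumption: align is a power of two.
 */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper mirroring the expression in the quoted patch.
 * ffs() returns the 1-based index of the lowest set bit, so
 * 1 << (ffs(mask) - 1) is the smallest page size present in the mask.
 */
static uint64_t expected_vma_size(uint64_t size, unsigned int page_sizes_gtt)
{
	uint64_t min_page;

	/* Assumption: at least one page size bit is set. */
	assert(page_sizes_gtt != 0);

	min_page = UINT64_C(1) << (ffs((int)page_sizes_gtt) - 1);
	return round_up_pow2(size, min_page);
}
```

With both the 4K and 64K bits set in the mask, a 5000 byte object rounds up to 8192, since the smallest page size in the mask decides the granularity.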
