Re: [PATCH v5 15/16] drm/mediatek: modify mediatek-drm for mt8195 multi mmsys support

2021-10-03 Thread Nancy . Lin
Hi Chun-Kuang,

Thanks for the review.

On Wed, 2021-09-29 at 08:03 +0800, Chun-Kuang Hu wrote:
> 

> Hi, Nancy:
> 
> Nancy.Lin  於 2021年9月6日 週一 下午3:15寫道:
> > 
> > MT8195 have two mmsys. Modify drm for MT8195 multi-mmsys support.
> > The two mmsys (vdosys0 and vdosys1) will bring up two drm drivers,
> > only one drm driver register as the drm device.
> > Each drm driver binds its own component. The first bind drm driver
> > will allocate the drm device, and the last bind drm driver
> > registers
> > the drm device to drm core. Each crtc path is created with the
> > corresponding drm driver data.
> > 
> > Signed-off-by: Nancy.Lin 
> > ---
> >  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  25 +-
> >  drivers/gpu/drm/mediatek/mtk_drm_crtc.h |   3 +-
> >  drivers/gpu/drm/mediatek/mtk_drm_drv.c  | 360 +++-
> > 
> >  drivers/gpu/drm/mediatek/mtk_drm_drv.h  |   8 +-
> >  4 files changed, 320 insertions(+), 76 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > index 474efb844249..68cb15c38c3f 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
> > @@ -737,21 +737,28 @@ static int
> > mtk_drm_crtc_init_comp_planes(struct drm_device *drm_dev,
> >  }
> > 
> >  int mtk_drm_crtc_create(struct drm_device *drm_dev,
> > -   const enum mtk_ddp_comp_id *path, unsigned
> > int path_len)
> > +   const enum mtk_ddp_comp_id *path, unsigned
> > int path_len,
> > +   int priv_data_index)
> >  {
> > struct mtk_drm_private *priv = drm_dev->dev_private;
> > struct device *dev = drm_dev->dev;
> > struct mtk_drm_crtc *mtk_crtc;
> > unsigned int num_comp_planes = 0;
> > -   int pipe = priv->num_pipes;
> > int ret;
> > int i;
> > bool has_ctm = false;
> > uint gamma_lut_size = 0;
> > +   struct drm_crtc *tmp;
> > +   int crtc_i = 0;
> > 
> > if (!path)
> > return 0;
> > 
> > +   priv = priv->all_drm_private[priv_data_index];
> > +
> > +   drm_for_each_crtc(tmp, drm_dev)
> > +   crtc_i++;
> > +
> > for (i = 0; i < path_len; i++) {
> > enum mtk_ddp_comp_id comp_id = path[i];
> > struct device_node *node;
> > @@ -760,7 +767,7 @@ int mtk_drm_crtc_create(struct drm_device
> > *drm_dev,
> > if (!node) {
> > dev_info(dev,
> >  "Not creating crtc %d because
> > component %d is disabled or missing\n",
> > -pipe, comp_id);
> > +crtc_i, comp_id);
> > return 0;
> > }
> > }
> > @@ -816,25 +823,25 @@ int mtk_drm_crtc_create(struct drm_device
> > *drm_dev,
> > 
> > for (i = 0; i < mtk_crtc->ddp_comp_nr; i++) {
> > ret = mtk_drm_crtc_init_comp_planes(drm_dev,
> > mtk_crtc, i,
> > -   pipe);
> > +   crtc_i);
> > if (ret)
> > return ret;
> > }
> > 
> > -   ret = mtk_drm_crtc_init(drm_dev, mtk_crtc, pipe);
> > +   ret = mtk_drm_crtc_init(drm_dev, mtk_crtc, crtc_i);
> > if (ret < 0)
> > return ret;
> > 
> > if (gamma_lut_size)
> > drm_mode_crtc_set_gamma_size(_crtc->base,
> > gamma_lut_size);
> > drm_crtc_enable_color_mgmt(_crtc->base, 0, has_ctm,
> > gamma_lut_size);
> > -   priv->num_pipes++;
> > mutex_init(_crtc->hw_lock);
> > 
> >  #if IS_REACHABLE(CONFIG_MTK_CMDQ)
> > +   i = (priv->data->mmsys_dev_num > 1) ? 0 :
> > drm_crtc_index(_crtc->base);
> > +
> > mtk_crtc->cmdq_client =
> > -   cmdq_mbox_create(mtk_crtc->mmsys_dev,
> > -drm_crtc_index(_crtc-
> > >base));
> > +   cmdq_mbox_create(mtk_crtc->mmsys_dev, i);
> > if (IS_ERR(mtk_crtc->cmdq_client)) {
> > dev_dbg(dev, "mtk_crtc %d failed to create mailbox
> > client, writing register by CPU now\n",
> > drm_crtc_index(_crtc->base));
> > @@ -844,7 +851,7 @@ int mtk_drm_crtc_create(struct drm_device
> > *drm_dev,
> > if (mtk_crtc->cmdq_client) {
> > ret = of_property_read_u32_index(priv->mutex_node,
> >  "mediatek,gce-
> > events",
> > -drm_crtc_index(
> > k_crtc->base),
> > +i,
> >  _crtc-
> > >cmdq_event);
> > if (ret) {
> > dev_dbg(dev, "mtk_crtc %d failed to get
> > mediatek,gce-events property\n",
> > diff --git 

Re: Questions over DSI within DRM.

2021-10-03 Thread Andrzej Hajda

Hi,

Thanks Laurent for reviving the thread, I have missed it entirely.


On 03.10.2021 16:16, Laurent Pinchart wrote:

Hello,

Reviving a bit of an old thread.

On Thu, Jul 15, 2021 at 11:50:22AM +0200, Maxime Ripard wrote:

On Tue, Jul 06, 2021 at 05:44:58PM +0100, Dave Stevenson wrote:

On Tue, 6 Jul 2021 at 16:13, Maxime Ripard wrote:

On a similar theme, some devices want the clock lane in HS mode early
so they can use it in place of an external oscillator, but the data
lanes still in LP-11. There appears to be no way for the
display/bridge to signal this requirement or it be achieved.


You're right. A lng time ago, the omapdrm driver had an internal
infrastructure that didn't use drm_bridge or drm_panel and instead
required omapdrm-specific drivers for those components. It used to model
the display pipeline in a different way than drm_bridge, with the sync
explicitly setting the source state. A DSI sink could thus control its
enable sequence, interleaving programming of the sink with control of
the source.

Migrating omapdrm to the drm_bridge model took a really large effort,
which makes me believe that transitioning the whole subsystem to
sink-controlled sources would be close to impossible. We could add
DSI-specific operations, or add another enable bridge operation
(post_pre_enable ? :-D). Neither would scale, but it may be enough.


I haven't thought it through for all generic cases, but I suspect it's
more a pre_pre_enable that is needed to initialise the PHY etc,
probably from source to sink.


I believe it could be implemented as a pre-pre-enable indeed. It feels
like a bit of a hack, as the next time we need more fine-grained control
over the startup sequence, we'll have to add a pre-pre-pre-enable. Given
that the startup sequence requirements come from the sink device, it
would make sense to let it explicitly control the initialization,
instead of driving it from the source. I don't think we'll be able to
rework the bridge API in that direction though, so I'm fine with a hack.


As I remember I have suggested in similar discussion [1] adding to 
mipi_dsi_host_ops requested operations:

power_on - power on DSI bus (do we really need it?)
init - enter LP11 (or HS-stop state if I remember correctly)
and then call them from the right place in DSI device, probably 
pre_enable callback.


This way we could avoid polluting drm_bridge_ops, with DSI specific stuff.

[1]: 
https://lore.kernel.org/dri-devel/6700c90f-d0e0-5cbf-1616-0c1d15844...@samsung.com/#t



Sorry for addressing only this issue, but I need to read whole thread, 
to re-read whole thread, and today it is too late for me :)


Regards
Andrzej




572994bf18ff prevents system boot

2021-10-03 Thread Chuck Lever III
Hi-

After updating one of my test systems to v5.15-rc, I found that it
becomes unresponsive during the later part of the boot process. A
power-on reset is necessary to recover.

I bisected to this commit:

572994bf18ff ("drm/ast: Zero is missing in detect function")

Checking out v5.15-rc3 and reverting this commit enables the system
to boot again.

0b:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics 
Family (rev 30) (prog-if 00 [VGA controller])
DeviceName:  ASPEED Video AST2400
Subsystem: Super Micro Computer Inc X10SRL-F
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR- 

Re: Questions over DSI within DRM.

2021-10-03 Thread Laurent Pinchart
Hello,

Reviving a bit of an old thread.

On Thu, Jul 15, 2021 at 11:50:22AM +0200, Maxime Ripard wrote:
> On Tue, Jul 06, 2021 at 05:44:58PM +0100, Dave Stevenson wrote:
> > On Tue, 6 Jul 2021 at 16:13, Maxime Ripard wrote:
> > > > > > On a similar theme, some devices want the clock lane in HS mode 
> > > > > > early
> > > > > > so they can use it in place of an external oscillator, but the data
> > > > > > lanes still in LP-11. There appears to be no way for the
> > > > > > display/bridge to signal this requirement or it be achieved.
> > > > >
> > > > > You're right. A lng time ago, the omapdrm driver had an internal
> > > > > infrastructure that didn't use drm_bridge or drm_panel and instead
> > > > > required omapdrm-specific drivers for those components. It used to 
> > > > > model
> > > > > the display pipeline in a different way than drm_bridge, with the sync
> > > > > explicitly setting the source state. A DSI sink could thus control its
> > > > > enable sequence, interleaving programming of the sink with control of
> > > > > the source.
> > > > >
> > > > > Migrating omapdrm to the drm_bridge model took a really large effort,
> > > > > which makes me believe that transitioning the whole subsystem to
> > > > > sink-controlled sources would be close to impossible. We could add
> > > > > DSI-specific operations, or add another enable bridge operation
> > > > > (post_pre_enable ? :-D). Neither would scale, but it may be enough.
> > > >
> > > > I haven't thought it through for all generic cases, but I suspect it's
> > > > more a pre_pre_enable that is needed to initialise the PHY etc,
> > > > probably from source to sink.

I believe it could be implemented as a pre-pre-enable indeed. It feels
like a bit of a hack, as the next time we need more fine-grained control
over the startup sequence, we'll have to add a pre-pre-pre-enable. Given
that the startup sequence requirements come from the sink device, it
would make sense to let it explicitly control the initialization,
instead of driving it from the source. I don't think we'll be able to
rework the bridge API in that direction though, so I'm fine with a hack.

> > > > If the panel/bridge can set a flag that can be checked at this point
> > > > for whether an early clock is required or not, I think that allows us
> > > > to comply with the requirements for a large number of panels/bridges
> > > > (LP-11 vs HS config for clock and or data lanes before pre_enable is
> > > > called).
> > > >
> > > > pre_enable retains the current behaviour (initialise the chain from
> > > > sink to source).
> > > > enable then actually starts sending video and enabling outputs.
> > >
> > > Flags indeed seem like a more contained option. Another one could be to
> > > have a mipi_dsi_host to (for example) power up the clock lane that would
> > > be called by default before the bridge's enable, or at the downstream
> > > bridge driver discretion before that.
> > 
> > Which driver will that call?
> 
> The parent DSI Host
> 
> > An extreme example perhaps, but Toshiba make the TC358860 eDP to DSI
> > bridge chip[1]. So the encoder will be eDP, but the DSI config needs
> > to go to that bridge. Does that happen automatically within the
> > framework? I guess so as the bridge will have called
> > mipi_dsi_host_register for the DSI sink to attach to.
> 
> In that case, whatever sink would be connected to the bridge would call
> the bridge clock enable hook if it needs it in its pre_enable, or it
> would be called automatically before enable if it doesn't
> 
> Would that help?

Sounds good to me, in theory at least (let's see what issues we'll run
into in practice :-)).

Has anyone given it a try, or is planning to ?

> > Perhaps a new mipi_dsi_host function to configure the PHY is the
> > easier solution. If it can allow the sink to request whatever
> > combination of states from clock and data lanes that it fancies, then
> > it can be as explicit as required for the initialisation sequence, and
> > the host driver does its best to comply with the requests.
> 
> I don't know, I'm not really fond in general of solutions that try to
> cover any single case if we don't need it and / or have today an issue
> with this. I'd rather have something that works for the particular
> bridge we were discussing, see if it applies to other bridges and modify
> it if it doesn't until it works for all our cases. Trying to reason in
> all possible cases tend to lead to solutions that are difficult to
> maintain and usually over-engineered.

A DSI host clock enable operation or a DSI host PHY configuration
operation both fit in the same place in the grand scheme of things, so I
don't mind either. We should be able to easily move from a more specific
operation to a more generic one if the need arises.

> > I'd have a slight query over when and how the host would drop to ULPS
> > or power off. It probably shouldn't be in post_disable as the sink
> > hasn't had a chance to finalise everything in its 

Re: [PATCH] drm: rcar-du: Don't create encoder for unconnected LVDS outputs

2021-10-03 Thread Laurent Pinchart
Hi Geert,

On Tue, Sep 28, 2021 at 10:55:57AM +0200, Geert Uytterhoeven wrote:
> On Mon, Aug 23, 2021 at 4:54 PM Laurent Pinchart wrote:
> > On Mon, Aug 23, 2021 at 02:25:32PM +0200, Geert Uytterhoeven wrote:
> > > On Sun, Aug 22, 2021 at 2:36 AM Laurent Pinchart wrote:
> > > > On R-Car D3 and E3, the LVDS encoders provide the pixel clock to the DU,
> > > > even when LVDS outputs are not used. For this reason, the rcar-lvds
> > > > driver probes successfully on those platforms even if no further bridge
> > > > or panel is connected to the LVDS output, in order to provide the
> > > > rcar_lvds_clk_enable() and rcar_lvds_clk_disable() functions to the DU
> > > > driver.
> > > >
> > > > If an LVDS output isn't connected, trying to create a DRM connector for
> > > > the output will fail. Fix this by skipping connector creation in that
> > > > case, and also skip creation of the DRM encoder as there's no point in
> > > > an encoder without a connector.
> > > >
> > > > Fixes: e9e056949c92 ("drm: rcar-du: lvds: Convert to DRM panel bridge 
> > > > helper")
> > > > Reported-by: Geert Uytterhoeven 
> > >
> > > Can you please change that to
> > > Reported-by: Geert Uytterhoeven 
> > > ?
> >
> > Sure thing.
> 
> Thanks!
> 
> > > > Signed-off-by: Laurent Pinchart 
> > > > 
> > >
> > > Thanks, the scary warning on Ebisu-4D is gone, so
> > > Tested-by: Geert Uytterhoeven 
> > >
> > > Disclaimer: there are no displays connected to my Ebisu-4D.
> >
> > That's the best way to ensure the absence of display issues. It works
> > great for camera testing too, if you also remove networking and storage
> > :-)
> 
> Any chance this fix can make it upstream?
> The fix was created before the issue entered upstream in v5.15-rc1.

Pull request sent.

-- 
Regards,

Laurent Pinchart


[GIT FIXES FOR v5.15] R-Car DU fix

2021-10-03 Thread Laurent Pinchart
Hi Dave and Daniel,

The following changes since commit 78ea81417944fe03f48648eddeb8e8a8e513c4ad:

  Merge tag 'exynos-drm-fixes-for-v5.15-rc4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes 
(2021-10-01 18:14:39 +0200)

are available in the Git repository at:

  git://linuxtv.org/pinchartl/media.git tags/du-fixes-20211003

for you to fetch changes up to 734d0b3b6afd44510b4752ea730685f89e107767:

  drm: rcar-du: Don't create encoder for unconnected LVDS outputs (2021-10-03 
16:05:36 +0300)


- R-Car DU fix for LVDS outputs on D3 and E3 SoCs


Laurent Pinchart (1):
  drm: rcar-du: Don't create encoder for unconnected LVDS outputs

 drivers/gpu/drm/rcar-du/rcar_du_encoder.c | 16 
 drivers/gpu/drm/rcar-du/rcar_lvds.c   | 11 +++
 drivers/gpu/drm/rcar-du/rcar_lvds.h   |  5 +
 3 files changed, 28 insertions(+), 4 deletions(-)

-- 
Regards,

Laurent Pinchart


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-10-03 Thread Laurent Pinchart
Hi Maxime,

On Thu, Sep 30, 2021 at 12:56:02AM +0300, Laurent Pinchart wrote:
> On Fri, Sep 10, 2021 at 12:11:54PM +0200, Maxime Ripard wrote:
> > Hi,
> > 
> > We've encountered an issue with the RaspberryPi DSI panel that prevented the
> > whole display driver from probing.
> > 
> > The issue is described in detail in the commit 7213246a803f ("drm/vc4: dsi:
> > Only register our component once a DSI device is attached"), but the basic 
> > idea
> > is that since the panel is probed through i2c, there's no synchronization
> > between its probe and the registration of the MIPI-DSI host it's attached 
> > to.
> > 
> > We initially moved the component framework registration to the MIPI-DSI Host
> > attach hook to make sure we register our component only when we have a DSI
> > device attached to our MIPI-DSI host, and then use lookup our DSI device in 
> > our
> > bind hook.
> > 
> > However, all the DSI bridges controlled through i2c are only registering 
> > their
> > associated DSI device in their bridge attach hook, meaning with our change
> > above, we never got that far, and therefore ended up in the same situation 
> > than
> > the one we were trying to fix for panels.
> > 
> > The best practice to avoid those issues is to register its functions only 
> > after
> > all its dependencies are live. We also shouldn't wait any longer than we 
> > should
> > to play nice with the other components that are waiting for us, so in our 
> > case
> > that would mean moving the DSI device registration to the bridge probe.
> > 
> > I also had a look at all the DSI hosts, and it seems that exynos, kirin and 
> > msm
> > would be affected by this and wouldn't probe anymore after those changes.
> > Exynos and kirin seems to be simple enough for a mechanical change (that 
> > still
> > requires to be tested), but the changes in msm seemed to be far more 
> > important
> > and I wasn't confortable doing them.
> > 
> > Let me know what you think,
> 
> I've tested this series on my RPi CM4-based board, and there's a clear
> improvement: the sn65dsi83 now probes successfully !
> 
> The downside is that I can now look at a panel that desperately refuses
> to display anything. That's a separate issue, but it prevents me from
> telling whether this series introduces regressions :-S I'll try to debug
> that separately.

I managed to (partly) fix that issue with a few backports from the RPi
kernel, making me confident enough to say

Tested-by: Laurent Pinchart 

for

drivers/gpu/drm/bridge/ti-sn65dsi83.c
drivers/gpu/drm/drm_bridge.c
drivers/gpu/drm/drm_mipi_dsi.c
include/drm/drm_mipi_dsi.h

> Also, Kieran, would you be able to test this with the SN65DSI86 ?
> 
> > ---
> > 
> > Changes from v3:
> >   - Converted exynos and kirin
> >   - Converted all the affected bridge drivers
> >   - Reworded the documentation a bit
> > 
> > Changes from v2:
> >   - Changed the approach as suggested by Andrzej, and aligned the bridge on 
> > the
> > panel this time.
> >   - Fixed some typos
> > 
> > Changes from v1:
> >   - Change the name of drm_of_get_next function to drm_of_get_bridge
> >   - Mention the revert of 87154ff86bf6 and squash the two patches that were
> > reverting that commit
> >   - Add some documentation
> >   - Make drm_panel_attach and _detach succeed when no callback is there
> > 
> > Maxime Ripard (24):
> >   drm/bridge: Add documentation sections
> >   drm/bridge: Document the probe issue with MIPI-DSI bridges
> >   drm/mipi-dsi: Create devm device registration
> >   drm/mipi-dsi: Create devm device attachment
> >   drm/bridge: adv7533: Switch to devm MIPI-DSI helpers
> >   drm/bridge: adv7511: Register and attach our DSI device at probe
> >   drm/bridge: anx7625: Switch to devm MIPI-DSI helpers
> >   drm/bridge: anx7625: Register and attach our DSI device at probe
> >   drm/bridge: lt8912b: Switch to devm MIPI-DSI helpers
> >   drm/bridge: lt8912b: Register and attach our DSI device at probe
> >   drm/bridge: lt9611: Switch to devm MIPI-DSI helpers
> >   drm/bridge: lt9611: Register and attach our DSI device at probe
> >   drm/bridge: lt9611uxc: Switch to devm MIPI-DSI helpers
> >   drm/bridge: lt9611uxc: Register and attach our DSI device at probe
> >   drm/bridge: ps8640: Switch to devm MIPI-DSI helpers
> >   drm/bridge: ps8640: Register and attach our DSI device at probe
> >   drm/bridge: sn65dsi83: Switch to devm MIPI-DSI helpers
> >   drm/bridge: sn65dsi83: Register and attach our DSI device at probe
> >   drm/bridge: sn65dsi86: Switch to devm MIPI-DSI helpers
> >   drm/bridge: sn65dsi86: Register and attach our DSI device at probe
> >   drm/bridge: tc358775: Switch to devm MIPI-DSI helpers
> >   drm/bridge: tc358775: Register and attach our DSI device at probe
> >   drm/kirin: dsi: Adjust probe order
> >   drm/exynos: dsi: Adjust probe order
> > 
> >  Documentation/gpu/drm-kms-helpers.rst|  12 +++
> >  drivers/gpu/drm/bridge/adv7511/adv7511.h |   1 -
> >  

[PATCH] drm/i915: Prefer struct_size over open coded arithmetic

2021-10-03 Thread Len Baker
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

In this case these are not actually dynamic sizes: all the operands
involved in the calculation are constant values. However it is better to
refactor them anyway, just to keep the open-coded math idiom out of
code.

So, add at the end of the struct i915_syncmap a union with two flexible
array members (these arrays share the same memory layout). This is
possible using the new DECLARE_FLEX_ARRAY macro. And then, use the
struct_size() helper to do the arithmetic instead of the argument
"size + count * size" in the kmalloc and kzalloc() functions.

Also, take the opportunity to refactor the __sync_seqno and __sync_child
making them more readable.

This code was detected with the help of Coccinelle and audited and fixed
manually.

[1] 
https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Signed-off-by: Len Baker 
---
 drivers/gpu/drm/i915/i915_syncmap.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_syncmap.c 
b/drivers/gpu/drm/i915/i915_syncmap.c
index 60404dbb2e9f..a8d35491d05a 100644
--- a/drivers/gpu/drm/i915/i915_syncmap.c
+++ b/drivers/gpu/drm/i915/i915_syncmap.c
@@ -82,6 +82,10 @@ struct i915_syncmap {
 *  struct i915_syncmap *child[KSYNCMAP];
 * };
 */
+   union {
+   DECLARE_FLEX_ARRAY(u32, seqno);
+   DECLARE_FLEX_ARRAY(struct i915_syncmap *, child);
+   };
 };

 /**
@@ -99,13 +103,13 @@ void i915_syncmap_init(struct i915_syncmap **root)
 static inline u32 *__sync_seqno(struct i915_syncmap *p)
 {
GEM_BUG_ON(p->height);
-   return (u32 *)(p + 1);
+   return p->seqno;
 }

 static inline struct i915_syncmap **__sync_child(struct i915_syncmap *p)
 {
GEM_BUG_ON(!p->height);
-   return (struct i915_syncmap **)(p + 1);
+   return p->child;
 }

 static inline unsigned int
@@ -200,7 +204,7 @@ __sync_alloc_leaf(struct i915_syncmap *parent, u64 id)
 {
struct i915_syncmap *p;

-   p = kmalloc(sizeof(*p) + KSYNCMAP * sizeof(u32), GFP_KERNEL);
+   p = kmalloc(struct_size(p, seqno, KSYNCMAP), GFP_KERNEL);
if (unlikely(!p))
return NULL;

@@ -282,7 +286,7 @@ static noinline int __sync_set(struct i915_syncmap **root, 
u64 id, u32 seqno)
unsigned int above;

/* Insert a join above the current layer */
-   next = kzalloc(sizeof(*next) + KSYNCMAP * sizeof(next),
+   next = kzalloc(struct_size(next, child, KSYNCMAP),
   GFP_KERNEL);
if (unlikely(!next))
return -ENOMEM;
--
2.25.1



[PATCH v7 2/2] habanalabs: add support for dma-buf exporter

2021-10-03 Thread Oded Gabbay
From: Tomer Tayar 

Implement the calls to the dma-buf kernel api to create a dma-buf
object backed by FD.

We block the option to mmap the DMA-BUF object because we don't support
DIRECT_IO and implicit P2P. We only implement support for explicit P2P
through importing the FD of the DMA-BUF.

In the export phase, we provide to the DMA-BUF object an array of pages
that represent the device's memory area. During the map callback,
we convert the array of pages into an SGT. We split/merge the pages
according to the dma max segment size of the importer.

To get the DMA address of the PCI bar, we use the dma_map_resources()
kernel API, because our device memory is not backed by page struct
and this API doesn't need page struct to map the physical address to
a DMA address.

We set the orig_nents member of the SGT to be 0, to indicate to other
drivers that we don't support CPU mappings.

Note that in Habanalabs's ASICs, the device memory is pinned and
immutable. Therefore, there is no need for dynamic mappings and pinning
callbacks.

Also note that in GAUDI we don't have an MMU towards the device memory
and the user works on physical addresses. Therefore, the user doesn't
pass through the kernel driver to allocate memory there. As a result,
only for GAUDI we receive from the user a device memory physical address
(instead of a handle) and a size.

We check the p2p distance using pci_p2pdma_distance_many() and refusing
to map dmabuf in case the distance doesn't allow p2p.

Signed-off-by: Tomer Tayar 
Reviewed-by: Oded Gabbay 
Reviewed-by: Gal Pressman 
Reviewed-by: Greg Kroah-Hartman 
Acked-by: Daniel Vetter 
Signed-off-by: Oded Gabbay 
---
Changes in v7:
 - rename structure hl_dmabuf_wrapper to hl_dmabuf_priv
 - remove change that wasn't relevant to this patch
 - set orig_nents to 0 at end of alloc_sgt_from_device_pages()
   to avoid undoing that set in case of error
 - remove silent ignore of upper 32 bits of handle received
   from user in export_dmabuf_from_handle
 - replace all dev_err that triggered by user actions to dev_dbg
 - change alloc_sgt_from_device_pages() to return pointer to
   struct sg_table or ERR_PTR
 - add comment why we use DMA_ATTR_SKIP_CPU_SYNC in dmabuf unmap

 drivers/misc/habanalabs/Kconfig |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  22 +
 drivers/misc/habanalabs/common/memory.c | 514 +++-
 drivers/misc/habanalabs/gaudi/gaudi.c   |   1 +
 drivers/misc/habanalabs/goya/goya.c |   1 +
 5 files changed, 536 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig
index 293d79811372..c82d2e7b2035 100644
--- a/drivers/misc/habanalabs/Kconfig
+++ b/drivers/misc/habanalabs/Kconfig
@@ -8,6 +8,7 @@ config HABANA_AI
depends on PCI && HAS_IOMEM
select GENERIC_ALLOCATOR
select HWMON
+   select DMA_SHARED_BUFFER
help
  Enables PCIe card driver for Habana's AI Processors (AIP) that are
  designed to accelerate Deep Learning inference and training workloads.
diff --git a/drivers/misc/habanalabs/common/habanalabs.h 
b/drivers/misc/habanalabs/common/habanalabs.h
index f8e23ca18a57..ba5a424a4f3e 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define HL_NAME"habanalabs"
 
@@ -1352,6 +1353,23 @@ struct hl_cs_counters_atomic {
atomic64_t validation_drop_cnt;
 };
 
+/**
+ * struct hl_dmabuf_priv - a dma-buf private object.
+ * @dmabuf: pointer to dma-buf object.
+ * @ctx: pointer to the dma-buf owner's context.
+ * @phys_pg_pack: pointer to physical page pack if the dma-buf was exported for
+ *memory allocation handle.
+ * @device_address: physical address of the device's memory. Relevant only
+ *  if phys_pg_pack is NULL (dma-buf was exported from 
address).
+ *  The total size can be taken from the dmabuf object.
+ */
+struct hl_dmabuf_priv {
+   struct dma_buf  *dmabuf;
+   struct hl_ctx   *ctx;
+   struct hl_vm_phys_pg_pack   *phys_pg_pack;
+   uint64_tdevice_address;
+};
+
 /**
  * struct hl_ctx - user/kernel context.
  * @mem_hash: holds mapping from virtual address to virtual memory area
@@ -1662,6 +1680,7 @@ struct hl_vm_hw_block_list_node {
  * @npages: num physical pages in the pack.
  * @total_size: total size of all the pages in this list.
  * @mapping_cnt: number of shared mappings.
+ * @exporting_cnt: number of dma-buf exporting.
  * @asid: the context related to this list.
  * @page_size: size of each page in the pack.
  * @flags: HL_MEM_* flags related to this list.
@@ -1676,6 +1695,7 @@ struct hl_vm_phys_pg_pack {
u64 npages;
u64 total_size;
atomic_tmapping_cnt;
+   u32  

[PATCH v7 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-10-03 Thread Oded Gabbay
User process might want to share the device memory with another
driver/device, and to allow it to access it over PCIe (P2P).

To enable this, we utilize the dma-buf mechanism and add a dma-buf
exporter support, so the other driver can import the device memory and
access it.

The device memory is allocated using our existing allocation uAPI,
where the user will get a handle that represents the allocation.

The user will then need to call the new
uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.

The driver will return a FD that represents the DMA-BUF object that
was created to match that allocation.

Signed-off-by: Oded Gabbay 
Reviewed-by: Tomer Tayar 
Reviewed-by: Greg Kroah-Hartman 
Acked-by: Daniel Vetter 
---
Changes in v7:
 - Change the type of the fd variable returned from IOCTL to be __s32

 include/uapi/misc/habanalabs.h | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
index ccfcb4d188fc..f980fbfb1a0e 100644
--- a/include/uapi/misc/habanalabs.h
+++ b/include/uapi/misc/habanalabs.h
@@ -959,6 +959,10 @@ union hl_wait_cs_args {
 #define HL_MEM_OP_UNMAP3
 /* Opcode to map a hw block */
 #define HL_MEM_OP_MAP_BLOCK4
+/* Opcode to create DMA-BUF object for an existing device memory allocation
+ * and to export an FD of that DMA-BUF back to the caller
+ */
+#define HL_MEM_OP_EXPORT_DMABUF_FD 5
 
 /* Memory flags */
 #define HL_MEM_CONTIGUOUS  0x1
@@ -1030,11 +1034,26 @@ struct hl_mem_in {
/* Virtual address returned from HL_MEM_OP_MAP */
__u64 device_virt_addr;
} unmap;
+
+   /* HL_MEM_OP_EXPORT_DMABUF_FD */
+   struct {
+   /* Handle returned from HL_MEM_OP_ALLOC. In Gaudi,
+* where we don't have MMU for the device memory, the
+* driver expects a physical address (instead of
+* a handle) in the device memory space.
+*/
+   __u64 handle;
+   /* Size of memory allocation. Relevant only for GAUDI */
+   __u64 mem_size;
+   } export_dmabuf_fd;
};
 
/* HL_MEM_OP_* */
__u32 op;
-   /* HL_MEM_* flags */
+   /* HL_MEM_* flags.
+* For the HL_MEM_OP_EXPORT_DMABUF_FD opcode, this field holds the
+* DMA-BUF file/FD flags.
+*/
__u32 flags;
/* Context ID - Currently not in use */
__u32 ctx_id;
@@ -1071,6 +1090,13 @@ struct hl_mem_out {
 
__u32 pad;
};
+
+   /* Returned in HL_MEM_OP_EXPORT_DMABUF_FD. Represents the
+* DMA-BUF object that was created to describe a memory
+* allocation on the device's memory space. The FD should be
+* passed to the importer driver
+*/
+   __s32 fd;
};
 };
 
-- 
2.17.1



[PATCH v7 0/2] Add p2p via dmabuf to habanalabs

2021-10-03 Thread Oded Gabbay
Hi,
I'm sending v7 after the latest review from Jason.
All the changes are detailed in the commit messages.

Dave, I'll appreciate if you can also a-b this patchset.

Thanks,
Oded

Oded Gabbay (1):
  habanalabs: define uAPI to export FD for DMA-BUF

Tomer Tayar (1):
  habanalabs: add support for dma-buf exporter

 drivers/misc/habanalabs/Kconfig |   1 +
 drivers/misc/habanalabs/common/habanalabs.h |  22 +
 drivers/misc/habanalabs/common/memory.c | 514 +++-
 drivers/misc/habanalabs/gaudi/gaudi.c   |   1 +
 drivers/misc/habanalabs/goya/goya.c |   1 +
 include/uapi/misc/habanalabs.h  |  28 +-
 6 files changed, 563 insertions(+), 4 deletions(-)

-- 
2.17.1