Re: [PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Pan, Xinhui


> On Sep 2, 2020, at 11:46, Tuikov, Luben wrote:
> 
> On 2020-09-01 21:42, Pan, Xinhui wrote:
>> If you take a look at the below function, you should not use driver's 
>> release to free adev. As dev is embedded in adev.
> 
> Do you mean "look at the function below", using "below" as an adverb?
> "below" is not an adjective.
> 
> I know dev is embedded in adev--I did that patchset.
> 
>> 
>> 809 static void drm_dev_release(struct kref *ref)
>> 810 {
>> 811 struct drm_device *dev = container_of(ref, struct drm_device, 
>> ref);
>> 812
>> 813 if (dev->driver->release)
>> 814 dev->driver->release(dev);
>> 815 
>> 816 drm_managed_release(dev);
>> 817 
>> 818 kfree(dev->managed.final_kfree);
>> 819 }
> 
> That's simple--this comes from change c6603c740e0e3
> and it should be reverted. Simple as that.
> 
> The version before this change was absolutely correct:
> 
> static void drm_dev_release(struct kref *ref)
> {
>   if (dev->driver->release)
>   dev->driver->release(dev);
>   else
>   drm_dev_fini(dev);
> }
> 
> Meaning, "the kref is now 0"--> if the driver
> has a release, call it, else use our own.
> But note that nothing can be assumed after this point,
> about the existence of "dev".
> 
> It is exactly because struct drm_device is statically
> embedded into a container, struct amdgpu_device,
> that this change above should be reverted.
> 
> This is very similar to how fops has open/release
> but no close. That is, the "release" is called
> only when the last kref is released, i.e. when
> kref goes from non-zero to zero.
> 
> This uses the kref infrastructure which has been
> around for about 20 years in the Linux kernel.
> 
> I suggest reading the comments
> in drm_drv.c mostly, "DOC: driver instance overview"
> starting at line 240 onwards. This is right above
> drm_put_dev(). There is actually an example of a driver
> in the comment. Also the comment to drm_dev_init().
> 
> Now, take a look at this:
> 
> /**
> * drm_dev_put - Drop reference of a DRM device
> * @dev: device to drop reference of or NULL
> *
> * This decreases the ref-count of @dev by one. The device is destroyed if the
> * ref-count drops to zero.
> */
> void drm_dev_put(struct drm_device *dev)
> {
>    if (dev)
>            kref_put(&dev->ref, drm_dev_release);
> }
> EXPORT_SYMBOL(drm_dev_put);
> 
> Two things:
> 
> 1. It is us, who kzalloc the amdgpu device, which contains
> the drm_device (you'll see this discussed in the reading
> material I pointed to above). We do this because we're
> probing the PCI device whether we'll work with it or not.
> 
> 

That is true.
My understanding of the drm core code is like something below:

struct B {
	struct A a;
};

We initialize A first and B last, but we destroy B first and A last.
But yes, in practice it is more complex: if B has nothing to destroy, we can
destroy A directly; otherwise we destroy B first.

In this case, we can do something like the below in our release():

	/* some cleanup work of B */
	drm_dev_fini(dev);	/* destroy A */
	kfree(adev);
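
A minimal kernel-style sketch of that init/teardown ordering, with hypothetical
structs A and B standing in for drm_device and amdgpu_device:

#include <linux/slab.h>

struct A { int a_res; };
struct B { struct A a; int b_res; };	/* B embeds A, as amdgpu_device embeds drm_device */

static void a_init(struct A *a) { a->a_res = 1; }
static void a_fini(struct A *a) { a->a_res = 0; }

static struct B *b_create(void)
{
	struct B *b = kzalloc(sizeof(*b), GFP_KERNEL);

	if (!b)
		return NULL;
	a_init(&b->a);		/* initialize the embedded A first */
	b->b_res = 1;		/* then B's own state */
	return b;
}

static void b_release(struct B *b)
{
	b->b_res = 0;		/* tear down B's own resources first */
	a_fini(&b->a);		/* then the embedded A */
	kfree(b);		/* and only then free the container */
}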

> 2. Using the kref infrastructure, when the ref goes to 0,
> drm_dev_release is called. And here's the KEY:
> Because WE allocated the container, we should free it--after the release
> method is called, DRM cannot assume anything about the drm
> device or the container. The "release" method is final.
> 
> We allocate, we free. And we free only when the ref goes to 0.
> 
> DRM can, in due time, "free" itself of the DRM device and stop
> having knowledge of it--that's fine, but as long as the ref
> is not 0, the amdgpu device and thus the contained DRM device,
> cannot be freed.
> 
>> 
>> You have to make another change something like
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index 13068fdf4331..2aabd2b4c63b 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
>> 
>>drm_managed_release(dev);
>> 
>> -   kfree(dev->managed.final_kfree);
>> +   if (dev->driver->final_release)
>> +   dev->driver->final_release(dev);
>> }
> 
> No. What's this?
> There is no such thing as "final" release, nor is there a "partial" release.
> When the kref goes to 0, the device disappears. Simple.
> If someone is using it, they should kref-get it, and when they're
> done with it, they should kref-put it.

I just took an example here: add another release at the very end, after which
no one can touch us. IOW, final_release.

A destroys B via a callback, then A destroys itself. That assumes B only frees
its own resources, but it causes trouble if some resource of A is allocated by
B, because B must take care of the common resources shared between A and B.

Yes, that logic is more complex. So I think we can revert drm_dev_release to
its previous version.

> 
> The whole point is that this is done implicitly, via the kref infrastructure.
> 

Re: [PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Luben Tuikov
On 2020-09-01 21:42, Pan, Xinhui wrote:
> If you take a look at the below function, you should not use driver's release 
> to free adev. As dev is embedded in adev.

Do you mean "look at the function below", using "below" as an adverb?
"below" is not an adjective.

I know dev is embedded in adev--I did that patchset.

> 
>  809 static void drm_dev_release(struct kref *ref)
>  810 {
>  811 struct drm_device *dev = container_of(ref, struct drm_device, 
> ref);
>  812
>  813 if (dev->driver->release)
>  814 dev->driver->release(dev);
>  815 
>  816 drm_managed_release(dev);
>  817 
>  818 kfree(dev->managed.final_kfree);
>  819 }

That's simple--this comes from change c6603c740e0e3
and it should be reverted. Simple as that.

The version before this change was absolutely correct:

static void drm_dev_release(struct kref *ref)
{
if (dev->driver->release)
dev->driver->release(dev);
else
drm_dev_fini(dev);
}

Meaning, "the kref is now 0"--> if the driver
has a release, call it, else use our own.
But note that nothing can be assumed after this point,
about the existence of "dev".

It is exactly because struct drm_device is statically
embedded into a container, struct amdgpu_device,
that this change above should be reverted.

This is very similar to how fops has open/release
but no close. That is, the "release" is called
only when the last kref is released, i.e. when
kref goes from non-zero to zero.
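
As a purely illustrative sketch of that fops parallel (a hypothetical chardev,
not DRM code): open takes a reference, release drops the last one, and the
object is freed only from the kref release callback:

#include <linux/cdev.h>
#include <linux/fs.h>
#include <linux/kref.h>
#include <linux/module.h>
#include <linux/slab.h>

struct mydev {				/* hypothetical device object */
	struct cdev cdev;
	struct kref ref;
};

static void mydev_free(struct kref *ref)
{
	struct mydev *dev = container_of(ref, struct mydev, ref);

	kfree(dev);			/* freed here, and only here */
}

static int mydev_open(struct inode *inode, struct file *file)
{
	struct mydev *dev = container_of(inode->i_cdev, struct mydev, cdev);

	kref_get(&dev->ref);		/* each open takes a reference */
	file->private_data = dev;
	return 0;
}

static int mydev_release(struct inode *inode, struct file *file)
{
	struct mydev *dev = file->private_data;

	kref_put(&dev->ref, mydev_free);	/* last put frees the object */
	return 0;
}

static const struct file_operations mydev_fops = {
	.owner	 = THIS_MODULE,
	.open	 = mydev_open,
	.release = mydev_release,	/* no ".close": release fires on the last reference */
};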

This uses the kref infrastructure which has been
around for about 20 years in the Linux kernel.

I suggest reading the comments
in drm_drv.c mostly, "DOC: driver instance overview"
starting at line 240 onwards. This is right above
drm_put_dev(). There is actually an example of a driver
in the comment. Also the comment to drm_dev_init().

Now, take a look at this:

/**
 * drm_dev_put - Drop reference of a DRM device
 * @dev: device to drop reference of or NULL
 *
 * This decreases the ref-count of @dev by one. The device is destroyed if the
 * ref-count drops to zero.
 */
void drm_dev_put(struct drm_device *dev)
{
	if (dev)
		kref_put(&dev->ref, drm_dev_release);
}
EXPORT_SYMBOL(drm_dev_put);

Two things:

1. It is us, who kzalloc the amdgpu device, which contains
the drm_device (you'll see this discussed in the reading
material I pointed to above). We do this because we're
probing the PCI device whether we'll work with it or not.

2. Using the kref infrastructure, when the ref goes to 0,
drm_dev_release is called. And here's the KEY:
Because WE allocated the container, we should free it--after the release
method is called, DRM cannot assume anything about the drm
device or the container. The "release" method is final.

We allocate, we free. And we free only when the ref goes to 0.

DRM can, in due time, "free" itself of the DRM device and stop
having knowledge of it--that's fine, but as long as the ref
is not 0, the amdgpu device and thus the contained DRM device,
cannot be freed.
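
A minimal sketch of that implicit-kref pattern, with made-up names standing in
for drm_device (inner) and amdgpu_device (outer):

#include <linux/kref.h>
#include <linux/slab.h>

struct inner { struct kref ref; };		/* think: struct drm_device    */
struct outer { struct inner base; };		/* think: struct amdgpu_device */

/* Called exactly once, when the refcount goes from non-zero to zero. */
static void outer_release(struct kref *ref)
{
	struct inner *base = container_of(ref, struct inner, ref);
	struct outer *o    = container_of(base, struct outer, base);

	/* driver cleanup would go here ... */
	kfree(o);		/* we allocated the container, so we free it */
}

static struct outer *outer_create(void)
{
	struct outer *o = kzalloc(sizeof(*o), GFP_KERNEL);

	if (!o)
		return NULL;
	kref_init(&o->base.ref);	/* refcount starts at 1 */
	return o;
}

/* Users only take and drop references; the free happens implicitly. */
static void outer_get(struct outer *o) { kref_get(&o->base.ref); }
static void outer_put(struct outer *o) { kref_put(&o->base.ref, outer_release); }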

> 
> You have to make another change something like
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 13068fdf4331..2aabd2b4c63b 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
>  
> drm_managed_release(dev);
>  
> -   kfree(dev->managed.final_kfree);
> +   if (dev->driver->final_release)
> +   dev->driver->final_release(dev);
>  }

No. What's this?
There is no such thing as "final" release, nor is there a "partial" release.
When the kref goes to 0, the device disappears. Simple.
If someone is using it, they should kref-get it, and when they're
done with it, they should kref-put it.

The whole point is that this is done implicitly, via the kref infrastructure.
drm_dev_init() which we call in our PCI probe function, sets the kref to 1--all
as per the documentation I pointed you to above.

Another point is that we can do some other stuff in the release
function, notify someone, write some registers, free memory we use
for that PCI device, etc.

If the "managed resources" infrastructure wants to stay, it should hook
itself into drm_dev_fini() and into drm_dev_init() or drm_dev_register().
It shouldn't have to be so out-of-place like in patch 2/3 of this series,
where the drmm_add_final_kfree() is smack-dab in the middle of our PCI
discovery function, surrounded on top and bottom by drm_dev_init()
and drm_dev_register(). The "managed resources" infra should be non-invasive
and drivers shouldn't have to change to use it--it should be invisible to them.
Then our kref would just work.
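
For illustration, a sketch of the probe/release shape being argued for here --
hypothetical driver names, error unwinding trimmed, not the actual amdgpu code:

#include <drm/drm_drv.h>
#include <linux/pci.h>
#include <linux/slab.h>

struct my_device {			/* hypothetical container, like amdgpu_device */
	struct drm_device ddev;
	/* ... driver state ... */
};

static void my_drm_release(struct drm_device *ddev)
{
	struct my_device *mdev = container_of(ddev, struct my_device, ddev);

	/* driver-side teardown, then free the container we allocated */
	kfree(mdev);
}

static struct drm_driver my_driver = {
	/* ... driver_features, fops, ioctls ... */
	.release = my_drm_release,
};

static int my_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	struct my_device *mdev;
	int ret;

	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
	if (!mdev)
		return -ENOMEM;

	/* Sets the embedded drm_device's kref to 1. */
	ret = drm_dev_init(&mdev->ddev, &my_driver, &pdev->dev);
	if (ret) {
		kfree(mdev);
		return ret;
	}

	/* No drmm_add_final_kfree() here: the final free happens in
	 * my_drm_release() when the last drm_dev_put() drops the kref to 0. */
	ret = drm_dev_register(&mdev->ddev, 0);
	if (ret)
		drm_dev_put(&mdev->ddev);	/* ends up calling my_drm_release() */
	return ret;
}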

> 
> And in the final_release callback we free the dev. But that is a little 
> complex now. so I prefer still using final_kfree.
> Of course we can do some cleanup work in the driver's release callback. BUT 
> no kfree.

No! No final_kfree. It's a hack.

Read the documentation in drm_drv.c I noted above.

Re: [PATCH v2 2/4] drm/vc4: hdmi: Add pixel bvb clock control

2020-09-01 Thread Hoegeun Kwon
Hi Chanwoo,

On 9/1/20 1:27 PM, Chanwoo Choi wrote:
> Hi Hoegeun,
>
> It looks good to me. But, just one comment.
>
> On 9/1/20 1:07 PM, Hoegeun Kwon wrote:
>> There is a problem that the output does not work at a resolution
>> exceeding FHD. To solve this, we need to adjust the bvb clock at a
>> resolution exceeding FHD.
>>
>> Signed-off-by: Hoegeun Kwon 
>> ---
>>   drivers/gpu/drm/vc4/vc4_hdmi.c | 25 +
>>   drivers/gpu/drm/vc4/vc4_hdmi.h |  1 +
>>   2 files changed, 26 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
>> index 95ec5eedea39..eb3192d1fd86 100644
>> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
>> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
>> @@ -80,6 +80,7 @@
>>   # define VC4_HD_M_ENABLE   BIT(0)
>>   
>>   #define CEC_CLOCK_FREQ 4
>> +#define VC4_HSM_MID_CLOCK 149985000
>>   
>>   static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
>>   {
>> @@ -380,6 +381,7 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct 
>> drm_encoder *encoder)
>>  HDMI_WRITE(HDMI_VID_CTL,
>> HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE);
>>   
>> +clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock);
>>  clk_disable_unprepare(vc4_hdmi->hsm_clock);
>>  clk_disable_unprepare(vc4_hdmi->pixel_clock);
>>   
>> @@ -638,6 +640,23 @@ static void vc4_hdmi_encoder_pre_crtc_configure(struct 
>> drm_encoder *encoder)
>>  return;
>>  }
>>   
>> +ret = clk_set_rate(vc4_hdmi->pixel_bvb_clock,
>> +(hsm_rate > VC4_HSM_MID_CLOCK ? 15000 : 7500));
>> +if (ret) {
>> +DRM_ERROR("Failed to set pixel bvb clock rate: %d\n", ret);
>> +clk_disable_unprepare(vc4_hdmi->hsm_clock);
>> +clk_disable_unprepare(vc4_hdmi->pixel_clock);
>> +return;
>> +}
>> +
>> +ret = clk_prepare_enable(vc4_hdmi->pixel_bvb_clock);
>> +if (ret) {
>> +DRM_ERROR("Failed to turn on pixel bvb clock: %d\n", ret);
>> +clk_disable_unprepare(vc4_hdmi->hsm_clock);
>> +clk_disable_unprepare(vc4_hdmi->pixel_clock);
>> +return;
>> +}
> Generally, enable the clock before using clk and then change the clock rate.
> I think you had better change the order of clk_prepare_enable() and
> clk_set_rate().

Thank you for your comment.


As Maxime answered in another patch [1], there is no clear rule of order 
here.

[1] https://lkml.org/lkml/2020/9/1/327


Best regards,

Hoegeun




Re: linux-next: manual merge of the drm-misc tree with Linus' tree

2020-09-01 Thread Stephen Rothwell
Hi all,

On Wed, 26 Aug 2020 10:01:13 +1000 Stephen Rothwell  
wrote:
>
> Hi all,
> 
> Today's linux-next merge of the drm-misc tree got conflicts in:
> 
>   drivers/video/fbdev/arcfb.c
>   drivers/video/fbdev/atmel_lcdfb.c
>   drivers/video/fbdev/savage/savagefb_driver.c
> 
> between commit:
> 
>   df561f6688fe ("treewide: Use fallthrough pseudo-keyword")
> 
> from Linus' tree and commit:
> 
>   ad04fae0de07 ("fbdev: Use fallthrough pseudo-keyword")
> 
> from the drm-misc tree.
> 
> I fixed it up (they are much the same, I just used the version from Linus'
> tree) and can carry the fix as necessary. This is now fixed as far as
> linux-next is concerned, but any non trivial conflicts should be mentioned
> to your upstream maintainer when your tree is submitted for merging.
> You may also want to consider cooperating with the maintainer of the
> conflicting tree to minimise any particularly complex conflicts.

These conflicts now appear in the merge between the drm tree and Linus'
tree.

-- 
Cheers,
Stephen Rothwell




Re: linux-next: build failure after merge of the drm-misc tree

2020-09-01 Thread Stephen Rothwell
Hi all,

On Wed, 26 Aug 2020 10:55:47 +1000 Stephen Rothwell  
wrote:
>
> After merging the drm-misc tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> drivers/gpu/drm/qxl/qxl_display.c: In function 
> 'qxl_display_read_client_monitors_config':
> include/drm/drm_modeset_lock.h:167:7: error: implicit declaration of function 
> 'drm_drv_uses_atomic_modeset' [-Werror=implicit-function-declaration]
>   167 |  if (!drm_drv_uses_atomic_modeset(dev))\
>   |   ^~~
> drivers/gpu/drm/qxl/qxl_display.c:187:2: note: in expansion of macro 
> 'DRM_MODESET_LOCK_ALL_BEGIN'
>   187 |  DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, 
> DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
>   |  ^~
> drivers/gpu/drm/qxl/qxl_display.c:189:35: error: macro 
> "DRM_MODESET_LOCK_ALL_END" requires 3 arguments, but only 2 given
>   189 |  DRM_MODESET_LOCK_ALL_END(ctx, ret);
>   |   ^
> In file included from include/drm/drm_crtc.h:36,
>  from include/drm/drm_atomic.h:31,
>  from drivers/gpu/drm/qxl/qxl_display.c:29:
> include/drm/drm_modeset_lock.h:194: note: macro "DRM_MODESET_LOCK_ALL_END" 
> defined here
>   194 | #define DRM_MODESET_LOCK_ALL_END(dev, ctx, ret)\
>   | 
> drivers/gpu/drm/qxl/qxl_display.c:189:2: error: 'DRM_MODESET_LOCK_ALL_END' 
> undeclared (first use in this function)
>   189 |  DRM_MODESET_LOCK_ALL_END(ctx, ret);
>   |  ^~~~
> drivers/gpu/drm/qxl/qxl_display.c:189:2: note: each undeclared identifier is 
> reported only once for each function it appears in
> drivers/gpu/drm/qxl/qxl_display.c:187:2: error: label 'modeset_lock_fail' 
> used but not defined
>   187 |  DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, 
> DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
>   |  ^~
> In file included from include/drm/drm_crtc.h:36,
>  from include/drm/drm_atomic.h:31,
>  from drivers/gpu/drm/qxl/qxl_display.c:29:
> include/drm/drm_modeset_lock.h:170:1: warning: label 'modeset_lock_retry' 
> defined but not used [-Wunused-label]
>   170 | modeset_lock_retry:   \
>   | ^~
> drivers/gpu/drm/qxl/qxl_display.c:187:2: note: in expansion of macro 
> 'DRM_MODESET_LOCK_ALL_BEGIN'
>   187 |  DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, 
> DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
>   |  ^~
> drivers/gpu/drm/qxl/qxl_display.c: In function 
> 'qxl_framebuffer_surface_dirty':
> drivers/gpu/drm/qxl/qxl_display.c:434:35: error: macro 
> "DRM_MODESET_LOCK_ALL_END" requires 3 arguments, but only 2 given
>   434 |  DRM_MODESET_LOCK_ALL_END(ctx, ret);
>   |   ^
> In file included from include/drm/drm_crtc.h:36,
>  from include/drm/drm_atomic.h:31,
>  from drivers/gpu/drm/qxl/qxl_display.c:29:
> include/drm/drm_modeset_lock.h:194: note: macro "DRM_MODESET_LOCK_ALL_END" 
> defined here
>   194 | #define DRM_MODESET_LOCK_ALL_END(dev, ctx, ret)\
>   | 
> drivers/gpu/drm/qxl/qxl_display.c:434:2: error: 'DRM_MODESET_LOCK_ALL_END' 
> undeclared (first use in this function)
>   434 |  DRM_MODESET_LOCK_ALL_END(ctx, ret);
>   |  ^~~~
> drivers/gpu/drm/qxl/qxl_display.c:411:2: error: label 'modeset_lock_fail' 
> used but not defined
>   411 |  DRM_MODESET_LOCK_ALL_BEGIN(fb->dev, ctx, 
> DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
>   |  ^~
> In file included from include/drm/drm_crtc.h:36,
>  from include/drm/drm_atomic.h:31,
>  from drivers/gpu/drm/qxl/qxl_display.c:29:
> include/drm/drm_modeset_lock.h:170:1: warning: label 'modeset_lock_retry' 
> defined but not used [-Wunused-label]
>   170 | modeset_lock_retry:   \
>   | ^~
> drivers/gpu/drm/qxl/qxl_display.c:411:2: note: in expansion of macro 
> 'DRM_MODESET_LOCK_ALL_BEGIN'
>   411 |  DRM_MODESET_LOCK_ALL_BEGIN(fb->dev, ctx, 
> DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
>   |  ^~
> 
> Caused by commit
> 
>   bbaac1354cc9 ("drm/qxl: Replace deprecated function in qxl_display")
> 
> interacting with commit
> 
>   77ef38574beb ("drm/modeset-lock: Take the modeset BKL for legacy drivers")
> 
> from the drm-misc-fixes tree.
> 
> drivers/gpu/drm/qxl/qxl_display.c manages to include
> drm/drm_modeset_lock.h by some indirect route, but fails to have
> drm/drm_drv.h similarly included.  In fact, drm/drm_modeset_lock.h should
> have included drm/drm_drv.h since it uses things declared there, and
> drivers/gpu/drm/qxl/qxl_display.c should include drm/drm_modeset_lock.h
> similarly.
> 
> I have added the following hack patch for today.
> 
> From: Stephen Rothwell 
> Date: Wed, 26 Aug 2020 10:40:18 +1000
> Subject: [PATCH] fix interaction with drm-misc-fix commit
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  drivers/gpu/drm/qxl/qxl_display.c | 5 +++--
>  

Re: [PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Pan, Xinhui
If you take a look at the below function, you should not use driver's release 
to free adev. As dev is embedded in adev.

 809 static void drm_dev_release(struct kref *ref)
 810 {
 811 struct drm_device *dev = container_of(ref, struct drm_device, ref);
 812
 813 if (dev->driver->release)
 814 dev->driver->release(dev);
 815 
 816 drm_managed_release(dev);
 817 
 818 kfree(dev->managed.final_kfree);
 819 }

You have to make another change something like
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 13068fdf4331..2aabd2b4c63b 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -815,7 +815,8 @@ static void drm_dev_release(struct kref *ref)
 
drm_managed_release(dev);
 
-   kfree(dev->managed.final_kfree);
+   if (dev->driver->final_release)
+   dev->driver->final_release(dev);
 }

And in the final_release callback we free the dev. But that is a little complex 
now. so I prefer still using final_kfree.
Of course we can do some cleanup work in the driver's release callback. BUT no 
kfree.

-----Original Message-----
From: "Tuikov, Luben"
Date: Wednesday, September 2, 2020, 09:07
To: "amd-...@lists.freedesktop.org", "dri-devel@lists.freedesktop.org"
Cc: "Deucher, Alexander", Daniel Vetter, "Pan, Xinhui", "Tuikov, Luben"

Subject: [PATCH 0/3] Use implicit kref infra

Use the implicit kref infrastructure to free the container
struct amdgpu_device, container of struct drm_device.

First, in drm_dev_register(), do not indiscriminately warn
when a DRM driver hasn't opted for managed.final_kfree,
but instead check if the driver has provided its own
"release" function callback in the DRM driver structure.
If that is the case, no warning.

Remove drmm_add_final_kfree(). We take care of that, in the
kref "release" callback when all refs are down to 0, via
drm_dev_put(), i.e. the free is implicit.

Remove superfluous NULL check, since the DRM device to be
suspended always exists, so long as the underlying PCI and
DRM devices exist.

Luben Tuikov (3):
  drm: No warn for drivers who provide release
  drm/amdgpu: Remove drmm final free
  drm/amdgpu: Remove superfluous NULL check

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2 --
 drivers/gpu/drm/drm_drv.c  | 3 ++-
 3 files changed, 2 insertions(+), 6 deletions(-)

-- 
2.28.0.394.ge197136389





[PATCH 3/3] drm/amdgpu: Remove superfluous NULL check

2020-09-01 Thread Luben Tuikov
The DRM device is a static member of
the amdgpu device structure and as such
always exists, so long as the PCI and
thus the amdgpu device exist.

Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4900471beb0..6dcc256b9ebc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3471,9 +3471,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
struct drm_connector_list_iter iter;
int r;
 
-   if (!dev)
-   return -ENODEV;
-
adev = drm_to_adev(dev);
 
if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
-- 
2.28.0.394.ge197136389



[PATCH 0/3] Use implicit kref infra

2020-09-01 Thread Luben Tuikov
Use the implicit kref infrastructure to free the container
struct amdgpu_device, container of struct drm_device.

First, in drm_dev_register(), do not indiscriminately warn
when a DRM driver hasn't opted for managed.final_kfree,
but instead check if the driver has provided its own
"release" function callback in the DRM driver structure.
If that is the case, no warning.

Remove drmm_add_final_kfree(). We take care of that, in the
kref "release" callback when all refs are down to 0, via
drm_dev_put(), i.e. the free is implicit.

Remove superfluous NULL check, since the DRM device to be
suspended always exists, so long as the underlying PCI and
DRM devices exist.

Luben Tuikov (3):
  drm: No warn for drivers who provide release
  drm/amdgpu: Remove drmm final free
  drm/amdgpu: Remove superfluous NULL check

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 2 --
 drivers/gpu/drm/drm_drv.c  | 3 ++-
 3 files changed, 2 insertions(+), 6 deletions(-)

-- 
2.28.0.394.ge197136389



[PATCH 2/3] drm/amdgpu: Remove drmm final free

2020-09-01 Thread Luben Tuikov
The amdgpu driver implements its own DRM driver
release function which naturally frees
the container struct amdgpu_device of
the DRM device, on a "final" kref-put,
i.e. when the kref transitions from non-zero
to 0.

Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 459cf13e76fe..17d49f1d86e7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1153,8 +1153,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
if (ret)
goto err_free;
 
-   drmm_add_final_kfree(ddev, ddev);
-
if (!supports_atomic)
ddev->driver_features &= ~DRIVER_ATOMIC;
 
-- 
2.28.0.394.ge197136389



[PATCH 1/3] drm: No warn for drivers who provide release

2020-09-01 Thread Luben Tuikov
Drivers usually allocate their container
struct at PCI probe time, then call drm_dev_init(),
which initializes the contained DRM dev kref to 1.

A DRM driver may provide their own kref
release method, which frees the container
object, the container of the DRM device,
on the last "put" which usually comes
after the PCI device has been freed
with PCI and with DRM.

If a driver has provided their own "release"
method in the drm_driver structure, then
do not check "managed.final_kfree", and thus
do not splat a WARN_ON in the kernel log
when a driver which implements "release"
is loaded.

This patch adds this one-line check.

Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/drm_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 13068fdf4331..952455dedb8c 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -935,7 +935,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long 
flags)
if (!driver->load)
drm_mode_config_validate(dev);
 
-   WARN_ON(!dev->managed.final_kfree);
+   if (!driver->release)
+           WARN_ON(!dev->managed.final_kfree);
 
if (drm_dev_needs_global_mutex(dev))
		mutex_lock(&drm_global_mutex);
-- 
2.28.0.394.ge197136389



Re: [PATCH v2 1/4] drm/of: Change the prototype of drm_of_lvds_get_dual_link_pixel_order

2020-09-01 Thread Laurent Pinchart
Hi Maxime,

On Tue, Sep 01, 2020 at 03:23:40PM +0200, Maxime Ripard wrote:
> On Mon, Aug 31, 2020 at 11:28:52PM +0300, Laurent Pinchart wrote:
> > On Thu, Jul 30, 2020 at 11:35:01AM +0200, Maxime Ripard wrote:
> > > The drm_of_lvds_get_dual_link_pixel_order() function took so far the
> > > device_node of the two ports used together to make up a dual-link LVDS
> > > output.
> > > 
> > > This assumes that a binding would use an entire port for the LVDS output.
> > > However, some bindings have used endpoints instead and thus we need to
> > > operate at the endpoint level. Change slightly the arguments to allow 
> > > that.
> > 
> > Is this still needed ? Unless I'm mistaken, the Allwinner platform now
> > uses two TCON instances for the two links, so there are two ports.
> 
> Yes, and no.
> 
> The two TCONs indeed have each a port of their own, so we do have two
> ports indeed. However, what we don't have is a port entirely dedicated
> to the LVDS output.
> 
> Our binding uses a single port for all its output (RGB, LVDS or TV/HDMI
> controllers) with different endpoints.

Good point. Then let's keep this patch :-) We can't fix existing
bindings, but for the future, let's model separate display outputs as
ports, not endpoints.

-- 
Regards,

Laurent Pinchart


Re: [PATCH v9 12/32] drm: msm: fix common struct sg_table related issues

2020-09-01 Thread Rob Clark
On Tue, Sep 1, 2020 at 12:14 PM Robin Murphy  wrote:
>
> On 2020-08-26 07:32, Marek Szyprowski wrote:
> > The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
> > returns the number of the created entries in the DMA address space.
> > However, the subsequent calls to dma_sync_sg_for_{device,cpu}() and
> > dma_unmap_sg() must be made with the original number of entries
> > passed to dma_map_sg().
> >
> > struct sg_table is a common structure used for describing a non-contiguous
> > memory buffer, used commonly in the DRM and graphics subsystems. It
> > consists of a scatterlist with memory pages and DMA addresses (sgl entry),
> > as well as the number of scatterlist entries: CPU pages (orig_nents entry)
> > and DMA mapped pages (nents entry).
> >
> > It turned out that it was a common mistake to misuse nents and orig_nents
> > entries, calling DMA-mapping functions with a wrong number of entries or
> > ignoring the number of mapped entries returned by the dma_map_sg()
> > function.
> >
> > To avoid such issues, lets use a common dma-mapping wrappers operating
> > directly on the struct sg_table objects and use scatterlist page
> > iterators where possible. This, almost always, hides references to the
> > nents and orig_nents entries, making the code robust, easier to follow
> > and copy/paste safe.
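
For reference, a minimal sketch of the nents/orig_nents pitfall described above,
in hypothetical driver code (not from this series), next to the sg_table
wrappers that hide it:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Error-prone open-coded form: map returns the number of DMA entries
 * (which may be fewer than orig_nents), while unmap/sync must still be
 * called with the original entry count. */
static int map_buf_open_coded(struct device *dev, struct sg_table *sgt)
{
	int nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);

	if (nents == 0)
		return -ENOMEM;
	sgt->nents = nents;		/* easy to forget or misuse later */
	return 0;
}

static void unmap_buf_open_coded(struct device *dev, struct sg_table *sgt)
{
	/* must be orig_nents, not the value dma_map_sg() returned */
	dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
}

/* The sg_table wrappers keep both counts inside the struct, so callers
 * can no longer pass the wrong one. */
static int map_buf_wrapped(struct device *dev, struct sg_table *sgt)
{
	return dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
}

static void unmap_buf_wrapped(struct device *dev, struct sg_table *sgt)
{
	dma_unmap_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
}
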
> >
> > Signed-off-by: Marek Szyprowski 
> > Acked-by: Rob Clark 
> > ---
> >   drivers/gpu/drm/msm/msm_gem.c| 13 +
> >   drivers/gpu/drm/msm/msm_gpummu.c | 14 ++
> >   drivers/gpu/drm/msm/msm_iommu.c  |  2 +-
> >   3 files changed, 12 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > index b2f49152b4d4..8c7ae812b813 100644
> > --- a/drivers/gpu/drm/msm/msm_gem.c
> > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > @@ -53,11 +53,10 @@ static void sync_for_device(struct msm_gem_object 
> > *msm_obj)
> >   struct device *dev = msm_obj->base.dev->dev;
> >
> >   if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
> > - dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> > + dma_sync_sgtable_for_device(dev, msm_obj->sgt,
> > + DMA_BIDIRECTIONAL);
> >   } else {
> > - dma_map_sg(dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> > + dma_map_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
> >   }
> >   }
> >
> > @@ -66,11 +65,9 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
> >   struct device *dev = msm_obj->base.dev->dev;
> >
> >   if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
> > - dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> > + dma_sync_sgtable_for_cpu(dev, msm_obj->sgt, 
> > DMA_BIDIRECTIONAL);
> >   } else {
> > - dma_unmap_sg(dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> > + dma_unmap_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
> >   }
> >   }
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpummu.c 
> > b/drivers/gpu/drm/msm/msm_gpummu.c
> > index 310a31b05faa..319f06c28235 100644
> > --- a/drivers/gpu/drm/msm/msm_gpummu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpummu.c
> > @@ -30,21 +30,19 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
> > iova,
> >   {
> >   struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> >   unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
> > - struct scatterlist *sg;
> > + struct sg_dma_page_iter dma_iter;
> >   unsigned prot_bits = 0;
> > - unsigned i, j;
> >
> >   if (prot & IOMMU_WRITE)
> >   prot_bits |= 1;
> >   if (prot & IOMMU_READ)
> >   prot_bits |= 2;
> >
> > - for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> > - dma_addr_t addr = sg->dma_address;
> > - for (j = 0; j < sg->length / GPUMMU_PAGE_SIZE; j++, idx++) {
> > - gpummu->table[idx] = addr | prot_bits;
> > - addr += GPUMMU_PAGE_SIZE;
> > - }
> > + for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
> > + dma_addr_t addr = sg_page_iter_dma_address(&dma_iter);
> > +
> > + BUILD_BUG_ON(GPUMMU_PAGE_SIZE != PAGE_SIZE);
> > + gpummu->table[idx++] = addr | prot_bits;
>
> Given that the BUILD_BUG_ON might prevent valid arm64 configs from
> building, how about a simple tweak like:
>
> for (i = 0; i < PAGE_SIZE; i += GPUMMU_PAGE_SIZE)
> gpummu->table[idx++] = i + addr | prot_bits;
> ?
>
> Or alternatively perhaps some more aggressive #ifdefs or makefile tweaks
> to prevent the GPUMMU code building for arm64 at all if it's only
> relevant to 32-bit platforms (which I believe might be the case).


[PATCH v4] drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps

2020-09-01 Thread Lyude Paul
Not entirely sure why this never came up when I originally tested this
(maybe some BIOSes already have this setup?) but the ->caps_init vfunc
appears to cause the display engine to throw an exception on driver
init, at least on my ThinkPad P72:

nouveau :01:00.0: disp: chid 0 mthd 008c data  508c 102b

This is magic nvidia speak for "You need to have the DMA notifier offset
programmed before you can call NV507D_GET_CAPABILITIES." So, let's fix
this by doing that, and also perform an update afterwards to prevent
racing with the GPU when reading capabilities.

v2:
* Don't just program the DMA notifier offset, make sure to actually
  perform an update
v3:
* Don't call UPDATE()
* Actually read the correct notifier fields, as apparently the
  CAPABILITIES_DONE field lives in a different location than the main
  NV_DISP_CORE_NOTIFIER_1 field. As well, 907d+ use a different
  CAPABILITIES_DONE field then pre-907d cards.
v4:
* Don't forget to check the return value of core507d_read_caps()

Signed-off-by: Lyude Paul 
Fixes: 4a2cb4181b07 ("drm/nouveau/kms/nv50-: Probe SOR and PIOR caps for DP 
interlacing support")
Cc:  # v5.8+
---
 drivers/gpu/drm/nouveau/dispnv50/core.h   |  2 +
 drivers/gpu/drm/nouveau/dispnv50/core507d.c   | 37 ++-
 drivers/gpu/drm/nouveau/dispnv50/core907d.c   | 36 +-
 drivers/gpu/drm/nouveau/dispnv50/core917d.c   |  2 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h   |  2 +
 .../drm/nouveau/include/nvhw/class/cl507d.h   |  5 ++-
 .../drm/nouveau/include/nvhw/class/cl907d.h   |  4 ++
 7 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.h 
b/drivers/gpu/drm/nouveau/dispnv50/core.h
index 498622c0c670d..b789139e5fff6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/core.h
@@ -44,6 +44,7 @@ int core507d_new_(const struct nv50_core_func *, struct 
nouveau_drm *, s32,
  struct nv50_core **);
 int core507d_init(struct nv50_core *);
 void core507d_ntfy_init(struct nouveau_bo *, u32);
+int core507d_read_caps(struct nv50_disp *disp, u32 offset);
 int core507d_caps_init(struct nouveau_drm *, struct nv50_disp *);
 int core507d_ntfy_wait_done(struct nouveau_bo *, u32, struct nvif_device *);
 int core507d_update(struct nv50_core *, u32 *, bool);
@@ -55,6 +56,7 @@ extern const struct nv50_outp_func pior507d;
 int core827d_new(struct nouveau_drm *, s32, struct nv50_core **);
 
 int core907d_new(struct nouveau_drm *, s32, struct nv50_core **);
+int core907d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp);
 extern const struct nv50_outp_func dac907d;
 extern const struct nv50_outp_func sor907d;
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core507d.c 
b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
index ad1f09a143aa4..d0f2b80a32103 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core507d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
@@ -75,18 +75,51 @@ core507d_ntfy_init(struct nouveau_bo *bo, u32 offset)
 }
 
 int
-core507d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp)
+core507d_read_caps(struct nv50_disp *disp, u32 offset)
 {
struct nvif_push *push = disp->core->chan.push;
int ret;
 
-   if ((ret = PUSH_WAIT(push, 2)))
+   ret = PUSH_WAIT(push, 4);
+   if (ret)
return ret;
 
+   PUSH_MTHD(push, NV507D, SET_NOTIFIER_CONTROL,
+ NVDEF(NV507D, SET_NOTIFIER_CONTROL, MODE, WRITE) |
+ NVVAL(NV507D, SET_NOTIFIER_CONTROL, OFFSET, offset >> 2) |
+ NVDEF(NV507D, SET_NOTIFIER_CONTROL, NOTIFY, ENABLE));
PUSH_MTHD(push, NV507D, GET_CAPABILITIES, 0x);
+
return PUSH_KICK(push);
 }
 
+int
+core507d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp)
+{
+   struct nv50_core *core = disp->core;
+   struct nouveau_bo *bo = disp->sync;
+   s64 time;
+   int ret;
+
+   NVBO_WR32(bo, NV50_DISP_CAPS_NTFY1, NV_DISP_CORE_NOTIFIER_1, 
CAPABILITIES_1,
+ NVDEF(NV_DISP_CORE_NOTIFIER_1, 
CAPABILITIES_1, DONE, FALSE));
+
+   ret = core507d_read_caps(disp, NV50_DISP_CAPS_NTFY1);
+   if (ret < 0)
+   return ret;
+
+   time = nvif_msec(core->chan.base.device, 2000ULL,
+if (NVBO_TD32(bo, NV50_DISP_CAPS_NTFY1,
+  NV_DISP_CORE_NOTIFIER_1, CAPABILITIES_1, 
DONE, ==, TRUE))
+break;
+usleep_range(1, 2);
+);
+   if (time < 0)
+   NV_ERROR(drm, "core caps notifier timeout\n");
+
+   return 0;
+}
+
 int
 core507d_init(struct nv50_core *core)
 {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core907d.c 
b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
index b17c03529c784..45505a18aca17 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
@@ -22,11 +22,45 

[PATCH v3] drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps

2020-09-01 Thread Lyude Paul
Not entirely sure why this never came up when I originally tested this
(maybe some BIOSes already have this setup?) but the ->caps_init vfunc
appears to cause the display engine to throw an exception on driver
init, at least on my ThinkPad P72:

nouveau :01:00.0: disp: chid 0 mthd 008c data  508c 102b

This is magic nvidia speak for "You need to have the DMA notifier offset
programmed before you can call NV507D_GET_CAPABILITIES." So, let's fix
this by doing that, and also perform an update afterwards to prevent
racing with the GPU when reading capabilities.

v2:
* Don't just program the DMA notifier offset, make sure to actually
  perform an update
v3:
* Don't call UPDATE()
* Actually read the correct notifier fields, as apparently the
  CAPABILITIES_DONE field lives in a different location than the main
  NV_DISP_CORE_NOTIFIER_1 field. As well, 907d+ use a different
  CAPABILITIES_DONE field then pre-907d cards.

Signed-off-by: Lyude Paul 
Fixes: 4a2cb4181b07 ("drm/nouveau/kms/nv50-: Probe SOR and PIOR caps for DP 
interlacing support")
Cc:  # v5.8+
---
 drivers/gpu/drm/nouveau/dispnv50/core.h   |  2 ++
 drivers/gpu/drm/nouveau/dispnv50/core507d.c   | 34 +--
 drivers/gpu/drm/nouveau/dispnv50/core907d.c   | 33 +-
 drivers/gpu/drm/nouveau/dispnv50/core917d.c   |  2 +-
 drivers/gpu/drm/nouveau/dispnv50/disp.h   |  2 ++
 .../drm/nouveau/include/nvhw/class/cl507d.h   |  5 ++-
 .../drm/nouveau/include/nvhw/class/cl907d.h   |  4 +++
 7 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/core.h 
b/drivers/gpu/drm/nouveau/dispnv50/core.h
index 498622c0c670d..b789139e5fff6 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core.h
+++ b/drivers/gpu/drm/nouveau/dispnv50/core.h
@@ -44,6 +44,7 @@ int core507d_new_(const struct nv50_core_func *, struct 
nouveau_drm *, s32,
  struct nv50_core **);
 int core507d_init(struct nv50_core *);
 void core507d_ntfy_init(struct nouveau_bo *, u32);
+int core507d_read_caps(struct nv50_disp *disp, u32 offset);
 int core507d_caps_init(struct nouveau_drm *, struct nv50_disp *);
 int core507d_ntfy_wait_done(struct nouveau_bo *, u32, struct nvif_device *);
 int core507d_update(struct nv50_core *, u32 *, bool);
@@ -55,6 +56,7 @@ extern const struct nv50_outp_func pior507d;
 int core827d_new(struct nouveau_drm *, s32, struct nv50_core **);
 
 int core907d_new(struct nouveau_drm *, s32, struct nv50_core **);
+int core907d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp);
 extern const struct nv50_outp_func dac907d;
 extern const struct nv50_outp_func sor907d;
 
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core507d.c 
b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
index ad1f09a143aa4..3ec4c3a238c41 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core507d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
@@ -75,18 +75,48 @@ core507d_ntfy_init(struct nouveau_bo *bo, u32 offset)
 }
 
 int
-core507d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp)
+core507d_read_caps(struct nv50_disp *disp, u32 offset)
 {
struct nvif_push *push = disp->core->chan.push;
int ret;
 
-   if ((ret = PUSH_WAIT(push, 2)))
+   ret = PUSH_WAIT(push, 4);
+   if (ret)
return ret;
 
+   PUSH_MTHD(push, NV507D, SET_NOTIFIER_CONTROL,
+ NVDEF(NV507D, SET_NOTIFIER_CONTROL, MODE, WRITE) |
+ NVVAL(NV507D, SET_NOTIFIER_CONTROL, OFFSET, offset >> 2) |
+ NVDEF(NV507D, SET_NOTIFIER_CONTROL, NOTIFY, ENABLE));
PUSH_MTHD(push, NV507D, GET_CAPABILITIES, 0x);
+
return PUSH_KICK(push);
 }
 
+int
+core507d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp)
+{
+   struct nv50_core *core = disp->core;
+   struct nouveau_bo *bo = disp->sync;
+   s64 time;
+
+   NVBO_WR32(bo, NV50_DISP_CAPS_NTFY1, NV_DISP_CORE_NOTIFIER_1, 
CAPABILITIES_1,
+ NVDEF(NV_DISP_CORE_NOTIFIER_1, 
CAPABILITIES_1, DONE, FALSE));
+
+   core507d_read_caps(disp, NV50_DISP_CAPS_NTFY1);
+
+   time = nvif_msec(core->chan.base.device, 2000ULL,
+if (NVBO_TD32(bo, NV50_DISP_CAPS_NTFY1,
+  NV_DISP_CORE_NOTIFIER_1, CAPABILITIES_1, 
DONE, ==, TRUE))
+break;
+usleep_range(1, 2);
+);
+   if (time < 0)
+   NV_ERROR(drm, "core caps notifier timeout\n");
+
+   return 0;
+}
+
 int
 core507d_init(struct nv50_core *core)
 {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/core907d.c 
b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
index b17c03529c784..8a2005adb0e2f 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/core907d.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/core907d.c
@@ -22,11 +22,42 @@
 #include "core.h"
 #include "head.h"
 
+#include 
+#include 
+
+#include 
+
+#include "nouveau_bo.h"
+
+int

Re: [PATCH v9 11/32] drm: mediatek: use common helper for extracting pages array

2020-09-01 Thread Chun-Kuang Hu
On Wed, Sep 2, 2020 at 2:55 AM, Robin Murphy wrote:
>
> On 2020-08-26 07:32, Marek Szyprowski wrote:
> > Use common helper for converting a sg_table object into struct
> > page pointer array.
>
> Reviewed-by: Robin Murphy 
>
> Side note: is mtk_drm_gem_prime_vmap() missing a call to
> sg_free_table(sgt) before its kfree(sgt)?

Yes, we need another patch to fix that bug, But for this patch,

Acked-by: Chun-Kuang Hu 

>
> > Signed-off-by: Marek Szyprowski 
> > ---
> >   drivers/gpu/drm/mediatek/mtk_drm_gem.c | 9 ++---
> >   1 file changed, 2 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
> > b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> > index 3654ec732029..0583e557ad37 100644
> > --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> > +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> > @@ -233,9 +233,7 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
> >   {
> >   struct mtk_drm_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
> >   struct sg_table *sgt;
> > - struct sg_page_iter iter;
> >   unsigned int npages;
> > - unsigned int i = 0;
> >
> >   if (mtk_gem->kvaddr)
> >   return mtk_gem->kvaddr;
> > @@ -249,11 +247,8 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object 
> > *obj)
> >   if (!mtk_gem->pages)
> >   goto out;
> >
> > - for_each_sg_page(sgt->sgl, &iter, sgt->orig_nents, 0) {
> > - mtk_gem->pages[i++] = sg_page_iter_page(&iter);
> > - if (i > npages)
> > - break;
> > - }
> > + drm_prime_sg_to_page_addr_arrays(sgt, mtk_gem->pages, NULL, npages);
> > +
> >   mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
> >  pgprot_writecombine(PAGE_KERNEL));
> >
> >


Re: [PATCH v9 10/32] drm: mediatek: use common helper for a scatterlist contiguity check

2020-09-01 Thread Chun-Kuang Hu
Hi, Marek:

On Wed, Aug 26, 2020 at 2:35 PM, Marek Szyprowski wrote:
>
> Use common helper for checking the contiguity of the imported dma-buf and
> do this check before allocating resources, so the error path is simpler.
>

Acked-by: Chun-Kuang Hu 

> Signed-off-by: Marek Szyprowski 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_gem.c | 28 ++
>  1 file changed, 6 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> index 6190cc3b7b0d..3654ec732029 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> @@ -212,37 +212,21 @@ struct drm_gem_object 
> *mtk_gem_prime_import_sg_table(struct drm_device *dev,
> struct dma_buf_attachment *attach, struct sg_table 
> *sg)
>  {
> struct mtk_drm_gem_obj *mtk_gem;
> -   int ret;
> -   struct scatterlist *s;
> -   unsigned int i;
> -   dma_addr_t expected;
>
> -   mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);
> +   /* check if the entries in the sg_table are contiguous */
> +   if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
> +   DRM_ERROR("sg_table is not contiguous");
> +   return ERR_PTR(-EINVAL);
> +   }
>
> +   mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);
> if (IS_ERR(mtk_gem))
> return ERR_CAST(mtk_gem);
>
> -   expected = sg_dma_address(sg->sgl);
> -   for_each_sg(sg->sgl, s, sg->nents, i) {
> -   if (!sg_dma_len(s))
> -   break;
> -
> -   if (sg_dma_address(s) != expected) {
> -   DRM_ERROR("sg_table is not contiguous");
> -   ret = -EINVAL;
> -   goto err_gem_free;
> -   }
> -   expected = sg_dma_address(s) + sg_dma_len(s);
> -   }
> -
> mtk_gem->dma_addr = sg_dma_address(sg->sgl);
> mtk_gem->sg = sg;
>
> return &mtk_gem->base;
> -
> -err_gem_free:
> -   kfree(mtk_gem);
> -   return ERR_PTR(ret);
>  }
>
>  void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
> --
> 2.17.1
>


Re: [PATCH 2/2] drm/msm: Drop debug print in _dpu_crtc_setup_lm_bounds()

2020-09-01 Thread abhinavk

On 2020-09-01 14:59, Stephen Boyd wrote:
This function is called quite often if you have a blinking cursor on the
screen, hello page flip. Let's drop this debug print here because it
means enabling the print via the module parameter starts to spam the
debug console.

Cc: Abhinav Kumar 
Cc: Jeykumar Sankaran 
Cc: Jordan Crouse 
Cc: Sean Paul 
Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Stephen Boyd 

Reviewed-by: Abhinav Kumar 

---
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 74294b5ed93f..2966e488bfd0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -421,8 +421,6 @@ static void _dpu_crtc_setup_lm_bounds(struct 
drm_crtc *crtc,


trace_dpu_crtc_setup_lm_bounds(DRMID(crtc), i, r);
}
-
-   drm_mode_debug_printmodeline(adj_mode);
 }

 static void _dpu_crtc_get_pcc_coeff(struct drm_crtc_state *state,



Re: [PATCH 1/2] drm/msm: Avoid div-by-zero in dpu_crtc_atomic_check()

2020-09-01 Thread abhinavk

On 2020-09-01 14:59, Stephen Boyd wrote:

The cstate->num_mixers member is only set to a non-zero value once
dpu_encoder_virt_mode_set() is called, but the atomic check function can
be called by userspace before that. Let's avoid the div-by-zero here and
inside _dpu_crtc_setup_lm_bounds() by skipping this part of the atomic
check if dpu_encoder_virt_mode_set() hasn't been called yet. This fixes
an UBSAN warning:

 UBSAN: Undefined behaviour in drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c:860:31
 division by zero
 CPU: 7 PID: 409 Comm: frecon Tainted: G S5.4.31 #128
 Hardware name: Google Trogdor (rev0) (DT)
 Call trace:
  dump_backtrace+0x0/0x14c
  show_stack+0x20/0x2c
  dump_stack+0xa0/0xd8
  __ubsan_handle_divrem_overflow+0xec/0x110
  dpu_crtc_atomic_check+0x97c/0x9d4
  drm_atomic_helper_check_planes+0x160/0x1c8
  drm_atomic_helper_check+0x54/0xbc
  drm_atomic_check_only+0x6a8/0x880
  drm_atomic_commit+0x20/0x5c
  drm_atomic_helper_set_config+0x98/0xa0
  drm_mode_setcrtc+0x308/0x5dc
  drm_ioctl_kernel+0x9c/0x114
  drm_ioctl+0x2ac/0x4b0
  drm_compat_ioctl+0xe8/0x13c
  __arm64_compat_sys_ioctl+0x184/0x324
  el0_svc_common+0xa4/0x154
  el0_svc_compat_handler+0x

Cc: Abhinav Kumar 
Cc: Jeykumar Sankaran 
Cc: Jordan Crouse 
Cc: Sean Paul 
Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Stephen Boyd 

Reviewed-by: Abhinav Kumar 

---
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index f272a8d0f95b..74294b5ed93f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -881,7 +881,7 @@ static int dpu_crtc_atomic_check(struct drm_crtc 
*crtc,

struct drm_plane *plane;
struct drm_display_mode *mode;

-   int cnt = 0, rc = 0, mixer_width, i, z_pos;
+   int cnt = 0, rc = 0, mixer_width = 0, i, z_pos;

struct dpu_multirect_plane_states multirect_plane[DPU_STAGE_MAX * 2];
int multirect_count = 0;
@@ -914,9 +914,11 @@ static int dpu_crtc_atomic_check(struct drm_crtc 
*crtc,


memset(pipe_staged, 0, sizeof(pipe_staged));

-   mixer_width = mode->hdisplay / cstate->num_mixers;
+   if (cstate->num_mixers) {
+   mixer_width = mode->hdisplay / cstate->num_mixers;

-   _dpu_crtc_setup_lm_bounds(crtc, state);
+   _dpu_crtc_setup_lm_bounds(crtc, state);
+   }

crtc_rect.x2 = mode->hdisplay;
crtc_rect.y2 = mode->vdisplay;



Re: [Nouveau] [PATCH 1/2] drm/nouveau/kms/nv50-: Program notifier offset before requesting disp caps

2020-09-01 Thread Lyude Paul
On Mon, 2020-08-31 at 14:26 +1000, Ben Skeggs wrote:
> On Wed, 26 Aug 2020 at 02:52, Lyude Paul  wrote:
> > On Tue, 2020-08-25 at 08:28 +1000, Ben Skeggs wrote:
> > > On Tue, 25 Aug 2020 at 04:33, Lyude Paul  wrote:
> > > > Not entirely sure why this never came up when I originally tested this
> > > > (maybe some BIOSes already have this setup?) but the ->caps_init vfunc
> > > > appears to cause the display engine to throw an exception on driver
> > > > init, at least on my ThinkPad P72:
> > > > 
> > > > nouveau :01:00.0: disp: chid 0 mthd 008c data  508c
> > > > 102b
> > > > 
> > > > This is magic nvidia speak for "You need to have the DMA notifier offset
> > > > programmed before you can call NV507D_GET_CAPABILITIES." So, let's fix
> > > > this by doing that, and also perform an update afterwards to prevent
> > > > racing with the GPU when reading capabilities.
> > > > 
> > > > Changes since v1:
> > > > * Don't just program the DMA notifier offset, make sure to actually
> > > >   perform an update
> > > I'm not sure there's a need to send an Update() method here, I believe
> > > GetCapabilities() is an action method on its own right?
> > > 
> > 
> > I'm not entirely sure about this part tbh. I do know that we need to call
> > GetCapabilities() _after_ the DMA notifier offset is programmed. But, my
> > assumption was that if GetCapabilities() requires a DMA notifier offset to
> > store
> > its results in, we'd probably want to fire an update or something to make
> > sure
> > that we're not reading before it finishes writing capabilities?
> We definitely want to *wait* on GetCapabilities() finishing, I believe
> it should also update the notifier the same (or similar) way Update()
> does.  But I don't think we want to send an Update() here, it'll
> actually trigger a modeset (which, on earlier HW, will tear down the
> boot mode.  Not sure about current HW, it might preserve state), and
> we may not want that to happen there.

I'm not so sure about that, as it seems like the notifier times out without the
update:

[5.142033] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] : 
00040088 mthd 0x0088 size 1 - core507d_init
[5.142037] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] 0004: 
f000-> NV507D_SET_CONTEXT_DMA_NOTIFIER
[5.142041] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] 0008: 
00040084 mthd 0x0084 size 1 - core507d_caps_init
[5.142044] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] 000c: 
8000-> NV507D_SET_NOTIFIER_CONTROL
[5.142047] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] 0010: 
0004008c mthd 0x008c size 1 - core507d_caps_init
[5.142050] nouveau :1f:00.0: DRM: [DRM/:kmsChanPush] 0014: 
-> NV507D_GET_CAPABILITIES
[7.142026] nouveau :1f:00.0: DRM: core notifier timeout
[7.142700] nouveau :1f:00.0: DRM: sor-0002-0fc1 caps: dp_interlace=0
[7.142708] nouveau :1f:00.0: DRM: sor-0002-0fc4 caps: dp_interlace=0
[7.142715] nouveau :1f:00.0: DRM: sor-0002-0f42 caps: dp_interlace=0
[7.142829] nouveau :1f:00.0: DRM: sor-0006-0f82 caps: dp_interlace=0
[7.142842] nouveau :1f:00.0: DRM: sor-0002-0f82 caps: dp_interlace=0
[7.142849] nouveau :1f:00.0: DRM: failed to create encoder 1/8/0: -19
[7.142851] nouveau :1f:00.0: DRM: Virtual-1 has no encoders, removing

Any other alternatives to UPDATE we might want to try?

> 
> Ben.
> 
> > > Ben.
> > > 
> > > > Signed-off-by: Lyude Paul 
> > > > Fixes: 4a2cb4181b07 ("drm/nouveau/kms/nv50-: Probe SOR and PIOR caps for
> > > > DP
> > > > interlacing support")
> > > > Cc:  # v5.8+
> > > > ---
> > > >  drivers/gpu/drm/nouveau/dispnv50/core507d.c | 25 -
> > > >  1 file changed, 19 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/core507d.c
> > > > b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
> > > > index e341f572c2696..5e86feec3b720 100644
> > > > --- a/drivers/gpu/drm/nouveau/dispnv50/core507d.c
> > > > +++ b/drivers/gpu/drm/nouveau/dispnv50/core507d.c
> > > > @@ -65,13 +65,26 @@ core507d_ntfy_init(struct nouveau_bo *bo, u32
> > > > offset)
> > > >  int
> > > >  core507d_caps_init(struct nouveau_drm *drm, struct nv50_disp *disp)
> > > >  {
> > > > -   u32 *push = evo_wait(&disp->core->chan, 2);
> > > > +   struct nv50_core *core = disp->core;
> > > > +   u32 interlock[NV50_DISP_INTERLOCK__SIZE] = {0};
> > > > +   u32 *push;
> > > > 
> > > > -   if (push) {
> > > > -   evo_mthd(push, 0x008c, 1);
> > > > -   evo_data(push, 0x0);
> > > > -   evo_kick(push, &disp->core->chan);
> > > > -   }
> > > > +   core->func->ntfy_init(disp->sync, NV50_DISP_CORE_NTFY);
> > > > +
> > > > +   push = evo_wait(&core->chan, 4);
> > > > +   if (!push)
> > > > +   return 0;
> > > > +
> > > > +   evo_mthd(push, 0x0084, 1);
> > > > +   evo_data(push, 0x8000 | 

Re: [PATCH 3/3] drm/amdgpu: Embed drm_device into amdgpu_device (v2)

2020-09-01 Thread Luben Tuikov
On 2020-09-01 9:49 a.m., Alex Deucher wrote:
> On Tue, Sep 1, 2020 at 3:44 AM Daniel Vetter  wrote:
>>
>> On Wed, Aug 19, 2020 at 01:00:42AM -0400, Luben Tuikov wrote:
>>> a) Embed struct drm_device into struct amdgpu_device.
>>> b) Modify the inline-f drm_to_adev() accordingly.
>>> c) Modify the inline-f adev_to_drm() accordingly.
>>> d) Eliminate the use of drm_device.dev_private,
>>>in amdgpu.
>>> e) Switch from using drm_dev_alloc() to
>>>drm_dev_init().
>>> f) Add a DRM driver release function, which frees
>>>the container amdgpu_device after all krefs on
>>>the contained drm_device have been released.
>>>
>>> v2: Split out adding adev_to_drm() into its own
>>> patch (previous commit), making this patch
>>> more succinct and clear. More detailed commit
>>> description.
>>>
>>> Signed-off-by: Luben Tuikov 
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h| 10 ++---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++-
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 43 ++
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 20 +++---
>>>  4 files changed, 43 insertions(+), 45 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 735480cc7dcf..107a6ec920f7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -724,8 +724,8 @@ struct amd_powerplay {
>>>  #define AMDGPU_MAX_DF_PERFMONS 4
>>>  struct amdgpu_device {
>>>   struct device   *dev;
>>> - struct drm_device   *ddev;
>>>   struct pci_dev  *pdev;
>>> + struct drm_device   ddev;
>>>
>>>  #ifdef CONFIG_DRM_AMD_ACP
>>>   struct amdgpu_acp   acp;
>>> @@ -990,12 +990,12 @@ struct amdgpu_device {
>>>
>>>  static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>>>  {
>>> - return ddev->dev_private;
>>> + return container_of(ddev, struct amdgpu_device, ddev);
>>>  }
>>>
>>>  static inline struct drm_device *adev_to_drm(struct amdgpu_device *adev)
>>>  {
>>> - return adev->ddev;
>>> + return &adev->ddev;
>>>  }
>>>
>>>  static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device 
>>> *bdev)
>>> @@ -1004,8 +1004,6 @@ static inline struct amdgpu_device 
>>> *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
>>>  }
>>>
>>>  int amdgpu_device_init(struct amdgpu_device *adev,
>>> -struct drm_device *ddev,
>>> -struct pci_dev *pdev,
>>>  uint32_t flags);
>>>  void amdgpu_device_fini(struct amdgpu_device *adev);
>>>  int amdgpu_gpu_wait_for_idle(struct amdgpu_device *adev);
>>> @@ -1195,7 +1193,7 @@ static inline void *amdgpu_atpx_get_dhandle(void) { 
>>> return NULL; }
>>>  extern const struct drm_ioctl_desc amdgpu_ioctls_kms[];
>>>  extern const int amdgpu_max_kms_ioctl;
>>>
>>> -int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags);
>>> +int amdgpu_driver_load_kms(struct amdgpu_device *adev, unsigned long 
>>> flags);
>>>  void amdgpu_driver_unload_kms(struct drm_device *dev);
>>>  void amdgpu_driver_lastclose_kms(struct drm_device *dev);
>>>  int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file 
>>> *file_priv);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index 07012d71eeea..6e529548e708 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -1216,7 +1216,8 @@ static int amdgpu_device_check_arguments(struct 
>>> amdgpu_device *adev)
>>>   * Callback for the switcheroo driver.  Suspends or resumes the
>>>   * the asics before or after it is powered up using ACPI methods.
>>>   */
>>> -static void amdgpu_switcheroo_set_state(struct pci_dev *pdev, enum 
>>> vga_switcheroo_state state)
>>> +static void amdgpu_switcheroo_set_state(struct pci_dev *pdev,
>>> + enum vga_switcheroo_state state)
>>>  {
>>>   struct drm_device *dev = pci_get_drvdata(pdev);
>>>   int r;
>>> @@ -2977,8 +2978,6 @@ static const struct attribute 
>>> *amdgpu_dev_attributes[] = {
>>>   * amdgpu_device_init - initialize the driver
>>>   *
>>>   * @adev: amdgpu_device pointer
>>> - * @ddev: drm dev pointer
>>> - * @pdev: pci dev pointer
>>>   * @flags: driver flags
>>>   *
>>>   * Initializes the driver info and hw (all asics).
>>> @@ -2986,18 +2985,15 @@ static const struct attribute 
>>> *amdgpu_dev_attributes[] = {
>>>   * Called at driver startup.
>>>   */
>>>  int amdgpu_device_init(struct amdgpu_device *adev,
>>> -struct drm_device *ddev,
>>> -struct pci_dev *pdev,
>>>  uint32_t flags)
>>>  {
>>> + struct drm_device *ddev = adev_to_drm(adev);
>>> + struct pci_dev *pdev = adev->pdev;
>>>   int r, i;
>>>   bool boco = false;
>>>   u32 

[PATCH v5 3/3] xen: add helpers to allocate unpopulated memory

2020-09-01 Thread Roger Pau Monne
To be used in order to create foreign mappings. This is based on the
ZONE_DEVICE facility which is used by persistent memory devices in
order to create struct pages and kernel virtual mappings for the IOMEM
areas of such devices. Note that on kernels without support for
ZONE_DEVICE Xen will fall back to using ballooned pages in order to
create foreign mappings.

The newly added helpers use the same parameters as the existing
{alloc/free}_xenballooned_pages functions, which allows for in-place
replacement of the callers. Once a memory region has been added to be
used as scratch mapping space it will no longer be released, and pages
returned are kept in a linked list. This allows keeping a buffer of
pages and prevents resorting to frequent additions and removals of
regions.

If enabled (because ZONE_DEVICE is supported) the usage of the new
functionality untangles Xen balloon and RAM hotplug from the usage of
unpopulated physical memory ranges to map foreign pages, which is the
correct thing to do in order to avoid making mappings of foreign pages depend
on memory hotplug.

Note the driver is currently not enabled on Arm platforms because it
would interfere with the identity mapping required on some platforms.
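
For illustration only (this snippet is not part of the patch), a caller that
today uses ballooned pages for scratch space could be switched over directly,
since the prototypes added to include/xen/xen.h take the same parameters as
the balloon ones:

	struct page *pages[8];
	int ret;

	/* take a batch of unpopulated pages to back a foreign mapping */
	ret = xen_alloc_unpopulated_pages(ARRAY_SIZE(pages), pages);
	if (ret < 0)
		return ret;

	/* ... map foreign frames over the returned pages ... */

	/* hand the pages back to the cached list; the region itself is kept */
	xen_free_unpopulated_pages(ARRAY_SIZE(pages), pages);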

Signed-off-by: Roger Pau Monné 
---
Cc: Oleksandr Andrushchenko 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Dan Carpenter 
Cc: Roger Pau Monne 
Cc: Wei Liu 
Cc: Yan Yankovskyi 
Cc: dri-devel@lists.freedesktop.org
Cc: xen-de...@lists.xenproject.org
Cc: linux...@kvack.org
Cc: David Hildenbrand 
Cc: Michal Hocko 
Cc: Dan Williams 
---
Changes since v4:
 - Introduce a description for the option.
 - Force selection of ZONE_DEVICE on X86 and select
   XEN_UNPOPULATED_ALLOC if running on dom0 mode or having any
   backends.

Changes since v3:
 - Introduce a Kconfig option that gates the addition of the
   unpopulated alloc driver. This allows it to be easily disabled on Arm
   platforms.
 - Dropped Juergen's RB due to the addition of the Kconfig option.
 - Switched from MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC.

Changes since v2:
 - Drop BUILD_BUG_ON regarding PVMMU page sizes.
 - Use a SPDX license identifier.
 - Call fill with only the minimum required number of pages.
 - Include xen.h header in xen_drm_front_gem.c.
 - Use less generic function names.
 - Exit early from the init function if not a PV guest.
 - Don't use all caps for region name.
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c |   9 +-
 drivers/xen/Kconfig |  11 ++
 drivers/xen/Makefile|   1 +
 drivers/xen/balloon.c   |   4 +-
 drivers/xen/grant-table.c   |   4 +-
 drivers/xen/privcmd.c   |   4 +-
 drivers/xen/unpopulated-alloc.c | 185 
 drivers/xen/xenbus/xenbus_client.c  |   6 +-
 drivers/xen/xlate_mmu.c |   4 +-
 include/xen/xen.h   |   9 ++
 10 files changed, 222 insertions(+), 15 deletions(-)
 create mode 100644 drivers/xen/unpopulated-alloc.c

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c 
b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 39ff95b75357..534daf37c97e 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include 
 
 #include "xen_drm_front.h"
 #include "xen_drm_front_gem.h"
@@ -99,8 +100,8 @@ static struct xen_gem_object *gem_create(struct drm_device 
*dev, size_t size)
 * allocate ballooned pages which will be used to map
 * grant references provided by the backend
 */
-   ret = alloc_xenballooned_pages(xen_obj->num_pages,
-  xen_obj->pages);
+   ret = xen_alloc_unpopulated_pages(xen_obj->num_pages,
+ xen_obj->pages);
if (ret < 0) {
DRM_ERROR("Cannot allocate %zu ballooned pages: %d\n",
  xen_obj->num_pages, ret);
@@ -152,8 +153,8 @@ void xen_drm_front_gem_free_object_unlocked(struct 
drm_gem_object *gem_obj)
} else {
if (xen_obj->pages) {
if (xen_obj->be_alloc) {
-   free_xenballooned_pages(xen_obj->num_pages,
-   xen_obj->pages);
+   xen_free_unpopulated_pages(xen_obj->num_pages,
+  xen_obj->pages);
gem_free_pages_array(xen_obj);
} else {
drm_gem_put_pages(&xen_obj->base,
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index ea6c1e7e3e42..e38c33558d0d 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -325,4 +325,15 @@ config XEN_HAVE_VPMU
 config XEN_FRONT_PGDIR_SHBUF
tristate
 

Re: [PATCH v5 3/3] xen: add helpers to allocate unpopulated memory

2020-09-01 Thread Roger Pau Monné
On Tue, Sep 01, 2020 at 10:33:26AM +0200, Roger Pau Monne wrote:
> +static int fill_list(unsigned int nr_pages)
> +{
> + struct dev_pagemap *pgmap;
> + void *vaddr;
> + unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
> + int nid, ret;
> +
> + pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
> + if (!pgmap)
> + return -ENOMEM;
> +
> + pgmap->type = MEMORY_DEVICE_GENERIC;
> + pgmap->res.name = "Xen scratch";
> + pgmap->res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> +
> + ret = allocate_resource(&iomem_resource, &pgmap->res,
> + alloc_pages * PAGE_SIZE, 0, -1,
> + PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
> + if (ret < 0) {
> + pr_err("Cannot allocate new IOMEM resource\n");
> + kfree(pgmap);
> + return ret;
> + }
> +
> + nid = memory_add_physaddr_to_nid(pgmap->res.start);

I think this is not needed ...

> +
> +#ifdef CONFIG_XEN_HAVE_PVMMU
> +/*
> + * memremap will build page tables for the new memory so
> + * the p2m must contain invalid entries so the correct
> + * non-present PTEs will be written.
> + *
> + * If a failure occurs, the original (identity) p2m entries
> + * are not restored since this region is now known not to
> + * conflict with any devices.
> + */
> + if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> + xen_pfn_t pfn = PFN_DOWN(pgmap->res.start);
> +
> + for (i = 0; i < alloc_pages; i++) {
> + if (!set_phys_to_machine(pfn + i, INVALID_P2M_ENTRY)) {
> + pr_warn("set_phys_to_machine() failed, no 
> memory added\n");
> + release_resource(&pgmap->res);
> + kfree(pgmap);
> + return -ENOMEM;
> + }
> +}
> + }
> +#endif
> +
> + vaddr = memremap_pages(pgmap, nid);

... and NUMA_NO_NODE should be used here instead, as this memory is just
fictitious space to map foreign memory, and shouldn't be related to
any NUMA node.

The following chunk should be folded in, or I can resend.

Thanks, Roger.
---8<---
diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
index 1b5d157c6977..3b98dc921426 100644
--- a/drivers/xen/unpopulated-alloc.c
+++ b/drivers/xen/unpopulated-alloc.c
@@ -20,7 +20,7 @@ static int fill_list(unsigned int nr_pages)
struct dev_pagemap *pgmap;
void *vaddr;
unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
-   int nid, ret;
+   int ret;
 
pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
if (!pgmap)
@@ -39,8 +39,6 @@ static int fill_list(unsigned int nr_pages)
return ret;
}
 
-   nid = memory_add_physaddr_to_nid(pgmap->res.start);
-
 #ifdef CONFIG_XEN_HAVE_PVMMU
 /*
  * memremap will build page tables for the new memory so
@@ -65,7 +63,7 @@ static int fill_list(unsigned int nr_pages)
}
 #endif
 
-   vaddr = memremap_pages(pgmap, nid);
+   vaddr = memremap_pages(pgmap, NUMA_NO_NODE);
if (IS_ERR(vaddr)) {
pr_err("Cannot remap memory range\n");
release_resource(&pgmap->res);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Ruhl, Michael J
>-Original Message-
>From: Robin Murphy 
>Sent: Tuesday, September 1, 2020 3:54 PM
>To: Ruhl, Michael J ; Marek Szyprowski
>; dri-devel@lists.freedesktop.org;
>io...@lists.linux-foundation.org; linaro-mm-...@lists.linaro.org; linux-
>ker...@vger.kernel.org
>Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>; intel-...@lists.freedesktop.org; Christoph Hellwig
>; linux-arm-ker...@lists.infradead.org
>Subject: Re: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct
>sg_table related issues
>
>On 2020-09-01 20:38, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Intel-gfx  On Behalf Of
>>> Marek Szyprowski
>>> Sent: Wednesday, August 26, 2020 2:33 AM
>>> To: dri-devel@lists.freedesktop.org; io...@lists.linux-foundation.org;
>>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>>> Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>>> ; intel-...@lists.freedesktop.org; Robin Murphy
>>> ; Christoph Hellwig ; linux-arm-
>>> ker...@lists.infradead.org; Marek Szyprowski
>>> 
>>> Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table
>>> related issues
>>>
>>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
>>> function
>>> returns the number of the created entries in the DMA address space.
>>> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>>> dma_unmap_sg must be called with the original number of the entries
>>> passed to the dma_map_sg().
>>>
>>> struct sg_table is a common structure used for describing a non-contiguous
>>> memory buffer, used commonly in the DRM and graphics subsystems. It
>>> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>>> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>>> and DMA mapped pages (nents entry).
>>>
>>> It turned out that it was a common mistake to misuse nents and orig_nents
>>> entries, calling DMA-mapping functions with a wrong number of entries or
>>> ignoring the number of mapped entries returned by the dma_map_sg()
>>> function.
>>>
>>> This driver creatively uses sg_table->orig_nents to store the size of the
>>> allocated scatterlist and ignores the number of the entries returned by
>>> dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
>>> free the (over)allocated scatterlist.
>>>
>>> This patch only introduces the common DMA-mapping wrappers operating
>>> directly on the struct sg_table objects to the dmabuf related functions,
>>> so the other drivers, which might share buffers with i915 could rely on
>>> the properly set nents and orig_nents values.
>>>
>>> Signed-off-by: Marek Szyprowski 
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
>>> drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
>>> 2 files changed, 6 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index 2679380159fc..8a988592715b 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -48,12 +48,9 @@ static struct sg_table
>*i915_gem_map_dma_buf(struct
>>> dma_buf_attachment *attachme
>>> src = sg_next(src);
>>> }
>>>
>>> -   if (!dma_map_sg_attrs(attachment->dev,
>>> - st->sgl, st->nents, dir,
>>> - DMA_ATTR_SKIP_CPU_SYNC)) {
>>> -   ret = -ENOMEM;
>>
>> You have dropped this error value.
>>
>> Do you know if this is a benign loss?
>
>True, dma_map_sgtable() will return -EINVAL rather than -ENOMEM for
>failure. A quick look through other .map_dma_buf callbacks suggests
>they're returning a motley mix of error values and NULL for failure
>cases, so I'd imagine that importers shouldn't be too sensitive to the
>exact value.

I followed some of our code through to see if anyone is checking for -ENOMEM...

I have found in some test paths... However, it is not clear to me if we can get
to those paths from here.

Anyways,

Reviewed-by: Michael J. Ruhl 

Mike

>Robin.
>
>>
>> M
>>
>>> +   ret = dma_map_sgtable(attachment->dev, st, dir,
>>> DMA_ATTR_SKIP_CPU_SYNC);
>>> +   if (ret)
>>> goto err_free_sg;
>>> -   }
>>>
>>> return st;
>>>
>>> @@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct
>>> dma_buf_attachment *attachment,
>>> {
>>> struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
>>>
>>> -   dma_unmap_sg_attrs(attachment->dev,
>>> -  sg->sgl, sg->nents, dir,
>>> -  DMA_ATTR_SKIP_CPU_SYNC);
>>> +   dma_unmap_sgtable(attachment->dev, sg, dir,
>>> DMA_ATTR_SKIP_CPU_SYNC);
>>> sg_free_table(sg);
>>> kfree(sg);
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>>> index debaf7b18ab5..be30b27e2926 100644
>>> --- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>>> +++ 

Re: [PATCH v9 31/32] media: pci: fix common ALSA DMA-mapping related codes

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.
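
As a purely illustrative sketch of that rule (not code from this patch), the
open-coded sg_table pattern has to look like:

	/* map: pass the CPU entry count, keep the mapped count in nents */
	sgt->nents = dma_map_sg(dev, sgt->sgl, sgt->orig_nents, dir);
	if (!sgt->nents)
		return -EIO;

	/* sync and unmap: again with the original CPU entry count */
	dma_sync_sg_for_device(dev, sgt->sgl, sgt->orig_nents, dir);
	dma_unmap_sg(dev, sgt->sgl, sgt->orig_nents, dir);

Hence the change below switches dma_unmap_sg() from buf->sglen (the mapped
count) to buf->nr_pages (the original count).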

Signed-off-by: Marek Szyprowski 
---
  drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
  drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
  drivers/media/pci/cx88/cx88-alsa.c   | 2 +-
  drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
  4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/pci/cx23885/cx23885-alsa.c 
b/drivers/media/pci/cx23885/cx23885-alsa.c
index df44ed7393a0..3f366e4e4685 100644
--- a/drivers/media/pci/cx23885/cx23885-alsa.c
+++ b/drivers/media/pci/cx23885/cx23885-alsa.c
@@ -129,7 +129,7 @@ static int cx23885_alsa_dma_unmap(struct cx23885_audio_dev 
*dev)
if (!buf->sglen)
return 0;
  
-	dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, PCI_DMA_FROMDEVICE);

+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);


If we're touching these lines anyway, we should update them to use the 
modern DMA_FROM_DEVICE definitions too.
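
i.e. a combined change would presumably end up reading (sketch only, not a
hunk from this series):

	dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, DMA_FROM_DEVICE);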


Robin.


buf->sglen = 0;
return 0;
  }
diff --git a/drivers/media/pci/cx25821/cx25821-alsa.c 
b/drivers/media/pci/cx25821/cx25821-alsa.c
index 301616426d8a..c40304d33776 100644
--- a/drivers/media/pci/cx25821/cx25821-alsa.c
+++ b/drivers/media/pci/cx25821/cx25821-alsa.c
@@ -193,7 +193,7 @@ static int cx25821_alsa_dma_unmap(struct cx25821_audio_dev 
*dev)
if (!buf->sglen)
return 0;
  
-	dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen, PCI_DMA_FROMDEVICE);

+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
  }
diff --git a/drivers/media/pci/cx88/cx88-alsa.c 
b/drivers/media/pci/cx88/cx88-alsa.c
index 7d7aceecc985..3c6fe6ceb0b7 100644
--- a/drivers/media/pci/cx88/cx88-alsa.c
+++ b/drivers/media/pci/cx88/cx88-alsa.c
@@ -332,7 +332,7 @@ static int cx88_alsa_dma_unmap(struct cx88_audio_dev *dev)
if (!buf->sglen)
return 0;
  
-	dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->sglen,

+   dma_unmap_sg(&dev->pci->dev, buf->sglist, buf->nr_pages,
 PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
diff --git a/drivers/media/pci/saa7134/saa7134-alsa.c 
b/drivers/media/pci/saa7134/saa7134-alsa.c
index 544ca57eee75..398c47ff473d 100644
--- a/drivers/media/pci/saa7134/saa7134-alsa.c
+++ b/drivers/media/pci/saa7134/saa7134-alsa.c
@@ -313,7 +313,7 @@ static int saa7134_alsa_dma_unmap(struct saa7134_dev *dev)
if (!dma->sglen)
return 0;
  
-	dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->sglen, PCI_DMA_FROMDEVICE);

+   dma_unmap_sg(&dev->pci->dev, dma->sglist, dma->nr_pages, 
PCI_DMA_FROMDEVICE);
dma->sglen = 0;
return 0;
  }


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 32/32] videobuf2: use sgtable-based scatterlist wrappers

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

Use recently introduced common wrappers operating directly on the struct
sg_table objects and scatterlist page iterators to make the code a bit
more compact, robust, easier to follow and copy/paste safe.

No functional change, because the code already properly did all the
scaterlist related calls.


^^ typo

Otherwise,

Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  .../common/videobuf2/videobuf2-dma-contig.c   | 34 ---
  .../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++--
  .../common/videobuf2/videobuf2-vmalloc.c  | 12 +++
  3 files changed, 31 insertions(+), 47 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index ec3446cc45b8..1b242d844dde 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -58,10 +58,10 @@ static unsigned long vb2_dc_get_contiguous_size(struct 
sg_table *sgt)
unsigned int i;
unsigned long size = 0;
  
-	for_each_sg(sgt->sgl, s, sgt->nents, i) {

+   for_each_sgtable_dma_sg(sgt, s, i) {
if (sg_dma_address(s) != expected)
break;
-   expected = sg_dma_address(s) + sg_dma_len(s);
+   expected += sg_dma_len(s);
size += sg_dma_len(s);
}
return size;
@@ -103,8 +103,7 @@ static void vb2_dc_prepare(void *buf_priv)
if (!sgt)
return;
  
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,

-  buf->dma_dir);
+   dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
  }
  
  static void vb2_dc_finish(void *buf_priv)

@@ -115,7 +114,7 @@ static void vb2_dc_finish(void *buf_priv)
if (!sgt)
return;
  
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);

+   dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
  }
  
  /*/

@@ -275,8 +274,8 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf,
 * memory locations do not require any explicit cache
 * maintenance prior or after being used by the device.
 */
-   dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
-  attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
kfree(attach);
db_attach->priv = NULL;
@@ -301,8 +300,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
  
  	/* release any previous cache */

if (attach->dma_dir != DMA_NONE) {
-   dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
-  attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
attach->dma_dir = DMA_NONE;
}
  
@@ -310,9 +309,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(

 * mapping to the client with new direction, no cache sync
 * required see comment in vb2_dc_dmabuf_ops_detach()
 */
-   sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
- dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (!sgt->nents) {
+   if (dma_map_sgtable(db_attach->dev, sgt, dma_dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {
pr_err("failed to map scatterlist\n");
mutex_unlock(lock);
return ERR_PTR(-EIO);
@@ -455,8 +453,8 @@ static void vb2_dc_put_userptr(void *buf_priv)
 * No need to sync to CPU, it's already synced to the CPU
 * since the finish() memop will have been called before this.
 */
-   dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(buf->dev, sgt, buf->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
pages = frame_vector_pages(buf->vec);
/* sgt should exist only if vector contains pages... */
BUG_ON(IS_ERR(pages));
@@ -553,9 +551,8 @@ static void *vb2_dc_get_userptr(struct device *dev, 
unsigned long vaddr,
 * No need to sync to the device, this will happen later when the
 * prepare() memop is called.
 */
-   sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
- buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (sgt->nents <= 0) {
+   if (dma_map_sgtable(buf->dev, sgt, buf->dma_dir,
+

Re: [PATCH v9 30/32] samples: vfio-mdev/mbochs: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

While touching this code, also add missing call to dma_unmap_sgtable.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  samples/vfio-mdev/mbochs.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 3cc5e5921682..e03068917273 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -846,7 +846,7 @@ static struct sg_table *mbochs_map_dmabuf(struct 
dma_buf_attachment *at,
if (sg_alloc_table_from_pages(sg, dmabuf->pages, dmabuf->pagecount,
  0, dmabuf->mode.size, GFP_KERNEL) < 0)
goto err2;
-   if (!dma_map_sg(at->dev, sg->sgl, sg->nents, direction))
+   if (dma_map_sgtable(at->dev, sg, direction, 0))
goto err3;
  
  	return sg;

@@ -868,6 +868,7 @@ static void mbochs_unmap_dmabuf(struct dma_buf_attachment 
*at,
  
  	dev_dbg(dev, "%s: %d\n", __func__, dmabuf->id);
  
+	dma_unmap_sgtable(at->dev, sg, direction, 0);

sg_free_table(sg);
kfree(sg);
  }


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 29/32] rapidio: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/rapidio/devices/rio_mport_cdev.c | 11 ---
  1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/rapidio/devices/rio_mport_cdev.c 
b/drivers/rapidio/devices/rio_mport_cdev.c
index a30342942e26..89eb3d212652 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -573,8 +573,7 @@ static void dma_req_free(struct kref *ref)
refcount);
struct mport_cdev_priv *priv = req->priv;
  
-	dma_unmap_sg(req->dmach->device->dev,

-req->sgt.sgl, req->sgt.nents, req->dir);
+   dma_unmap_sgtable(req->dmach->device->dev, &req->sgt, req->dir, 0);
sg_free_table(>sgt);
if (req->page_list) {
unpin_user_pages(req->page_list, req->nr_pages);
@@ -814,7 +813,6 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
struct mport_dev *md = priv->md;
struct dma_chan *chan;
int ret;
-   int nents;
  
  	if (xfer->length == 0)

return -EINVAL;
@@ -930,15 +928,14 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
xfer->offset, xfer->length);
}
  
-	nents = dma_map_sg(chan->device->dev,

-  req->sgt.sgl, req->sgt.nents, dir);
-   if (nents == 0) {
+   ret = dma_map_sgtable(chan->device->dev, &req->sgt, dir, 0);
+   if (ret) {
rmcd_error("Failed to map SG list");
ret = -EFAULT;
goto err_pg;
}
  
-	ret = do_dma_request(req, xfer, sync, nents);

+   ret = do_dma_request(req, xfer, sync, req->sgt.nents);
  
  	if (ret >= 0) {

if (sync == RIO_TRANSFER_ASYNC)


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 28/32] misc: fastrpc: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/misc/fastrpc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 7939c55daceb..9d6867749316 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -518,7 +518,7 @@ fastrpc_map_dma_buf(struct dma_buf_attachment *attachment,
  
	table = &a->sgt;
  
-	if (!dma_map_sg(attachment->dev, table->sgl, table->nents, dir))

+   if (!dma_map_sgtable(attachment->dev, table, dir, 0))
return ERR_PTR(-ENOMEM);
  
  	return table;

@@ -528,7 +528,7 @@ static void fastrpc_unmap_dma_buf(struct dma_buf_attachment 
*attach,
  struct sg_table *table,
  enum dma_data_direction dir)
  {
-   dma_unmap_sg(attach->dev, table->sgl, table->nents, dir);
+   dma_unmap_sgtable(attach->dev, table, dir, 0);
  }
  
  static void fastrpc_release(struct dma_buf *dmabuf)



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 24/32] drm: host1x: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/host1x/job.c | 22 --
  1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index 89b6c14b7392..82d0a60ba3f7 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -170,11 +170,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
goto unpin;
}
  
-			err = dma_map_sg(dev, sgt->sgl, sgt->nents, dir);

-   if (!err) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(dev, sgt, dir, 0);
+   if (err)
goto unpin;
-   }
  
  			job->unpins[job->num_unpins].dev = dev;

job->unpins[job->num_unpins].dir = dir;
@@ -228,7 +226,7 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
}
  
  		if (host->domain) {

-   for_each_sg(sgt->sgl, sg, sgt->nents, j)
+   for_each_sgtable_sg(sgt, sg, j)
gather_size += sg->length;
gather_size = iova_align(&host->iova, gather_size);
  
@@ -240,9 +238,9 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job)

goto put;
}
  
-			err = iommu_map_sg(host->domain,

+   err = iommu_map_sgtable(host->domain,
iova_dma_addr(&host->iova, alloc),
-   sgt->sgl, sgt->nents, IOMMU_READ);
+   sgt, IOMMU_READ);
if (err == 0) {
__free_iova(&host->iova, alloc);
err = -EINVAL;
@@ -252,12 +250,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
job->unpins[job->num_unpins].size = gather_size;
phys_addr = iova_dma_addr(&host->iova, alloc);
} else if (sgt) {
-   err = dma_map_sg(host->dev, sgt->sgl, sgt->nents,
-DMA_TO_DEVICE);
-   if (!err) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(host->dev, sgt, DMA_TO_DEVICE, 0);
+   if (err)
goto put;
-   }
  
  			job->unpins[job->num_unpins].dir = DMA_TO_DEVICE;

job->unpins[job->num_unpins].dev = host->dev;
@@ -660,8 +655,7 @@ void host1x_job_unpin(struct host1x_job *job)
}
  
  		if (unpin->dev && sgt)

-   dma_unmap_sg(unpin->dev, sgt->sgl, sgt->nents,
-unpin->dir);
+   dma_unmap_sgtable(unpin->dev, sgt, unpin->dir, 0);
  
  		host1x_bo_unpin(dev, unpin->bo, sgt);

host1x_bo_put(unpin->bo);


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 18/32] drm: tegra: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/tegra/gem.c   | 27 ++-
  drivers/gpu/drm/tegra/plane.c | 15 +--
  2 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 723df142a981..01d94befab11 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -98,8 +98,8 @@ static struct sg_table *tegra_bo_pin(struct device *dev, 
struct host1x_bo *bo,
 * the SG table needs to be copied to avoid overwriting any
 * other potential users of the original SG table.
 */
-   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl, 
obj->sgt->nents,
-GFP_KERNEL);
+   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl,
+obj->sgt->orig_nents, GFP_KERNEL);
if (err < 0)
goto free;
} else {
@@ -196,8 +196,7 @@ static int tegra_bo_iommu_map(struct tegra_drm *tegra, 
struct tegra_bo *bo)
  
  	bo->iova = bo->mm->start;
  
-	bo->size = iommu_map_sg(tegra->domain, bo->iova, bo->sgt->sgl,

-   bo->sgt->nents, prot);
+   bo->size = iommu_map_sgtable(tegra->domain, bo->iova, bo->sgt, prot);
if (!bo->size) {
dev_err(tegra->drm->dev, "failed to map buffer\n");
err = -ENOMEM;
@@ -264,8 +263,7 @@ static struct tegra_bo *tegra_bo_alloc_object(struct 
drm_device *drm,
  static void tegra_bo_free(struct drm_device *drm, struct tegra_bo *bo)
  {
if (bo->pages) {
-   dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-DMA_FROM_DEVICE);
+   dma_unmap_sgtable(drm->dev, bo->sgt, DMA_FROM_DEVICE, 0);
drm_gem_put_pages(&bo->gem, bo->pages, true, true);
sg_free_table(bo->sgt);
kfree(bo->sgt);
@@ -290,12 +288,9 @@ static int tegra_bo_get_pages(struct drm_device *drm, 
struct tegra_bo *bo)
goto put_pages;
}
  
-	err = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,

-DMA_FROM_DEVICE);
-   if (err == 0) {
-   err = -EFAULT;
+   err = dma_map_sgtable(drm->dev, bo->sgt, DMA_FROM_DEVICE, 0);
+   if (err)
goto free_sgt;
-   }
  
  	return 0;
  
@@ -571,7 +566,7 @@ tegra_gem_prime_map_dma_buf(struct dma_buf_attachment *attach,

goto free;
}
  
-	if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)

+   if (dma_map_sgtable(attach->dev, sgt, dir, 0))
goto free;
  
  	return sgt;

@@ -590,7 +585,7 @@ static void tegra_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
struct tegra_bo *bo = to_tegra_bo(gem);
  
  	if (bo->pages)

-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sgtable(attach->dev, sgt, dir, 0);
  
  	sg_free_table(sgt);

kfree(sgt);
@@ -609,8 +604,7 @@ static int tegra_gem_prime_begin_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
  
  	if (bo->pages)

-   dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-   DMA_FROM_DEVICE);
+   dma_sync_sgtable_for_cpu(drm->dev, bo->sgt, DMA_FROM_DEVICE);
  
  	return 0;

  }
@@ -623,8 +617,7 @@ static int tegra_gem_prime_end_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
  
  	if (bo->pages)

-   dma_sync_sg_for_device(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-  DMA_TO_DEVICE);
+ 

Re: [PATCH v9 17/32] drm: rockchip: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common DMA-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.


Reviewed-by: Robin Murphy 

(Until now I hadn't noticed the crimes against the API that 
rockchip_gem_get_pages() is committing, but it's not this patch's 
fault... I'll have to take a closer look at that)
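
For anyone following along, the construct in question appears to be the one
visible in the hunk below, where the driver fills in the DMA addresses by
hand before syncing:

	for_each_sgtable_sg(rk_obj->sgt, s, i)
		sg_dma_address(s) = sg_phys(s);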



Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 23 +
  1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index 2970e534e2bb..cb50f2ba2e46 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -36,8 +36,8 @@ static int rockchip_gem_iommu_map(struct rockchip_gem_object 
*rk_obj)
  
  	rk_obj->dma_addr = rk_obj->mm.start;
  
-	ret = iommu_map_sg(private->domain, rk_obj->dma_addr, rk_obj->sgt->sgl,

-  rk_obj->sgt->nents, prot);
+   ret = iommu_map_sgtable(private->domain, rk_obj->dma_addr, rk_obj->sgt,
+   prot);
if (ret < rk_obj->base.size) {
DRM_ERROR("failed to map buffer: size=%zd request_size=%zd\n",
  ret, rk_obj->base.size);
@@ -98,11 +98,10 @@ static int rockchip_gem_get_pages(struct 
rockchip_gem_object *rk_obj)
 * TODO: Replace this by drm_clflush_sg() once it can be implemented
 * without relying on symbols that are not exported.
 */
-   for_each_sg(rk_obj->sgt->sgl, s, rk_obj->sgt->nents, i)
+   for_each_sgtable_sg(rk_obj->sgt, s, i)
sg_dma_address(s) = sg_phys(s);
  
-	dma_sync_sg_for_device(drm->dev, rk_obj->sgt->sgl, rk_obj->sgt->nents,

-  DMA_TO_DEVICE);
+   dma_sync_sgtable_for_device(drm->dev, rk_obj->sgt, DMA_TO_DEVICE);
  
  	return 0;
  
@@ -350,8 +349,8 @@ void rockchip_gem_free_object(struct drm_gem_object *obj)

if (private->domain) {
rockchip_gem_iommu_unmap(rk_obj);
} else {
-   dma_unmap_sg(drm->dev, rk_obj->sgt->sgl,
-rk_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(drm->dev, rk_obj->sgt,
+ DMA_BIDIRECTIONAL, 0);
}
drm_prime_gem_destroy(obj, rk_obj->sgt);
} else {
@@ -476,15 +475,13 @@ rockchip_gem_dma_map_sg(struct drm_device *drm,
struct sg_table *sg,
struct rockchip_gem_object *rk_obj)
  {
-   int count = dma_map_sg(drm->dev, sg->sgl, sg->nents,
-  DMA_BIDIRECTIONAL);
-   if (!count)
-   return -EINVAL;
+   int err = dma_map_sgtable(drm->dev, sg, DMA_BIDIRECTIONAL, 0);
+   if (err)
+   return err;
  
  	if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {

DRM_ERROR("failed to map sg_table to contiguous linear 
address.\n");
-   dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
-DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(drm->dev, sg, DMA_BIDIRECTIONAL, 0);
return -EINVAL;
}
  


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-09-01 20:38, Ruhl, Michael J wrote:

-Original Message-
From: Intel-gfx  On Behalf Of
Marek Szyprowski
Sent: Wednesday, August 26, 2020 2:33 AM
To: dri-devel@lists.freedesktop.org; io...@lists.linux-foundation.org;
linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
Cc: Bartlomiej Zolnierkiewicz ; David Airlie
; intel-...@lists.freedesktop.org; Robin Murphy
; Christoph Hellwig ; linux-arm-
ker...@lists.infradead.org; Marek Szyprowski

Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table
related issues

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

This driver creatively uses sg_table->orig_nents to store the size of the
allocated scatterlist and ignores the number of the entries returned by
dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
free the (over)allocated scatterlist.

This patch only introduces the common DMA-mapping wrappers operating
directly on the struct sg_table objects to the dmabuf related functions,
so the other drivers, which might share buffers with i915 could rely on
the properly set nents and orig_nents values.

Signed-off-by: Marek Szyprowski 
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 2679380159fc..8a988592715b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct
dma_buf_attachment *attachme
src = sg_next(src);
}

-   if (!dma_map_sg_attrs(attachment->dev,
- st->sgl, st->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
-   ret = -ENOMEM;


You have dropped this error value.

Do you know if this is a benign loss?


True, dma_map_sgtable() will return -EINVAL rather than -ENOMEM for 
failure. A quick look through other .map_dma_buf callbacks suggests 
they're returning a motley mix of error values and NULL for failure 
cases, so I'd imagine that importers shouldn't be too sensitive to the 
exact value.
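
To put that concretely, a hypothetical importer (not taken from any driver in
this series) typically just propagates whatever error pointer it gets back:

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt))
		return PTR_ERR(sgt);	/* -EINVAL vs -ENOMEM makes no difference here */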


Robin.



M


+   ret = dma_map_sgtable(attachment->dev, st, dir,
DMA_ATTR_SKIP_CPU_SYNC);
+   if (ret)
goto err_free_sg;
-   }

return st;

@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct
dma_buf_attachment *attachment,
{
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);


-   dma_unmap_sg_attrs(attachment->dev,
-  sg->sgl, sg->nents, dir,
-  DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(attachment->dev, sg, dir,
DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);

diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
index debaf7b18ab5..be30b27e2926 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct
dma_buf_attachment *attachment,
sg = sg_next(sg);
}

-   if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(attachment->dev, st, dir, 0);
+   if (err)
goto err_st;
-   }

return st;

@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct
dma_buf_attachment *attachment,
   struct sg_table *st,
   enum dma_data_direction dir)
{
-   dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
+   dma_unmap_sgtable(attachment->dev, st, dir, 0);
sg_free_table(st);
kfree(st);
}
--
2.17.1

___
Intel-gfx mailing list
intel-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

___
dri-devel mailing list

RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Ruhl, Michael J
>-Original Message-
>From: Intel-gfx  On Behalf Of
>Marek Szyprowski
>Sent: Wednesday, August 26, 2020 2:33 AM
>To: dri-devel@lists.freedesktop.org; io...@lists.linux-foundation.org;
>linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>; intel-...@lists.freedesktop.org; Robin Murphy
>; Christoph Hellwig ; linux-arm-
>ker...@lists.infradead.org; Marek Szyprowski
>
>Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table
>related issues
>
>The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
>function
>returns the number of the created entries in the DMA address space.
>However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>dma_unmap_sg must be called with the original number of the entries
>passed to the dma_map_sg().
>
>struct sg_table is a common structure used for describing a non-contiguous
>memory buffer, used commonly in the DRM and graphics subsystems. It
>consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>and DMA mapped pages (nents entry).
>
>It turned out that it was a common mistake to misuse nents and orig_nents
>entries, calling DMA-mapping functions with a wrong number of entries or
>ignoring the number of mapped entries returned by the dma_map_sg()
>function.
>
>This driver creatively uses sg_table->orig_nents to store the size of the
>allocated scatterlist and ignores the number of the entries returned by
>dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
>free the (over)allocated scatterlist.
>
>This patch only introduces the common DMA-mapping wrappers operating
>directly on the struct sg_table objects to the dmabuf related functions,
>so the other drivers, which might share buffers with i915 could rely on
>the properly set nents and orig_nents values.
>
>Signed-off-by: Marek Szyprowski 
>---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
> drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
> 2 files changed, 6 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>index 2679380159fc..8a988592715b 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct
>dma_buf_attachment *attachme
>   src = sg_next(src);
>   }
>
>-  if (!dma_map_sg_attrs(attachment->dev,
>-st->sgl, st->nents, dir,
>-DMA_ATTR_SKIP_CPU_SYNC)) {
>-  ret = -ENOMEM;

You have dropped this error value.

Do you know if this is a benign loss?

M

>+  ret = dma_map_sgtable(attachment->dev, st, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>+  if (ret)
>   goto err_free_sg;
>-  }
>
>   return st;
>
>@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
> {
>   struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
>
>-  dma_unmap_sg_attrs(attachment->dev,
>- sg->sgl, sg->nents, dir,
>- DMA_ATTR_SKIP_CPU_SYNC);
>+  dma_unmap_sgtable(attachment->dev, sg, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>   sg_free_table(sg);
>   kfree(sg);
>
>diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>index debaf7b18ab5..be30b27e2926 100644
>--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct
>dma_buf_attachment *attachment,
>   sg = sg_next(sg);
>   }
>
>-  if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
>-  err = -ENOMEM;
>+  err = dma_map_sgtable(attachment->dev, st, dir, 0);
>+  if (err)
>   goto err_st;
>-  }
>
>   return st;
>
>@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
>  struct sg_table *st,
>  enum dma_data_direction dir)
> {
>-  dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
>+  dma_unmap_sgtable(attachment->dev, st, dir, 0);
>   sg_free_table(st);
>   kfree(st);
> }
>--
>2.17.1
>
>___
>Intel-gfx mailing list
>intel-...@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 16/32] drm: rockchip: use common helper for a scatterlist contiguity check

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:33, Marek Szyprowski wrote:

Use common helper for checking the contiguity of the imported dma-buf.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 19 +--
  1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index b9275ba7c5a5..2970e534e2bb 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -460,23 +460,6 @@ struct sg_table *rockchip_gem_prime_get_sg_table(struct 
drm_gem_object *obj)
return sgt;
  }
  
-static unsigned long rockchip_sg_get_contiguous_size(struct sg_table *sgt,

-int count)
-{
-   struct scatterlist *s;
-   dma_addr_t expected = sg_dma_address(sgt->sgl);
-   unsigned int i;
-   unsigned long size = 0;
-
-   for_each_sg(sgt->sgl, s, count, i) {
-   if (sg_dma_address(s) != expected)
-   break;
-   expected = sg_dma_address(s) + sg_dma_len(s);
-   size += sg_dma_len(s);
-   }
-   return size;
-}
-
  static int
  rockchip_gem_iommu_map_sg(struct drm_device *drm,
  struct dma_buf_attachment *attach,
@@ -498,7 +481,7 @@ rockchip_gem_dma_map_sg(struct drm_device *drm,
if (!count)
return -EINVAL;
  
-	if (rockchip_sg_get_contiguous_size(sg, count) < attach->dmabuf->size) {

+   if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
DRM_ERROR("failed to map sg_table to contiguous linear 
address.\n");
dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
 DMA_BIDIRECTIONAL);


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v9 14/32] drm: omapdrm: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

Fix the code to refer to the proper nents or orig_nents entries. This driver
checks for buffer contiguity in the DMA address space, so it should test
the sg_table->nents entry.

Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/omapdrm/omap_gem.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c 
b/drivers/gpu/drm/omapdrm/omap_gem.c
index ff0c4b0c3fd0..a7a9a0afe2b6 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -48,7 +48,7 @@ struct omap_gem_object {
 *   OMAP_BO_MEM_DMA_API flag set)
 *
 * - buffers imported from dmabuf (with the OMAP_BO_MEM_DMABUF flag set)
-*   if they are physically contiguous (when sgt->orig_nents == 1)
+*   if they are physically contiguous (when sgt->nents == 1)


Hmm, if this really does mean *physically* contiguous - i.e. if buffers 
might be shared between DMA-translatable and non-DMA-translatable 
devices - then these changes might not be appropriate. If not and it 
only actually means DMA-contiguous, then it would be good to clarify the 
comments to that effect.


Can anyone familiar with omapdrm clarify what exactly the case is here? 
I know that IOMMUs might be involved to some degree, and I've skimmed 
the interconnect chapters of enough OMAP TRMs to be scared by the 
reference to the tiler aperture in the context below :)


Robin.


 *
 * - buffers mapped through the TILER when dma_addr_cnt is not zero, in
 *   which case the DMA address points to the TILER aperture
@@ -1279,7 +1279,7 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
union omap_gem_size gsize;
  
  	/* Without a DMM only physically contiguous buffers can be supported. */

-   if (sgt->orig_nents != 1 && !priv->has_dmm)
+   if (sgt->nents != 1 && !priv->has_dmm)
return ERR_PTR(-EINVAL);
  
  	gsize.bytes = PAGE_ALIGN(size);

@@ -1293,7 +1293,7 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
  
  	omap_obj->sgt = sgt;
  
-	if (sgt->orig_nents == 1) {

+   if (sgt->nents == 1) {
omap_obj->dma_addr = sg_dma_address(sgt->sgl);
} else {
/* Create pages list from sgt */


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 3/3] drm/ttm: remove io_reserve_lru handling v2

2020-09-01 Thread kernel test robot
Hi "Christian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20200828]
[cannot apply to linus/master v5.9-rc3 v5.9-rc2 v5.9-rc1 v5.9-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Christian-K-nig/drm-ttm-make-sure-that-we-always-zero-init-mem-bus-v2/20200901-230736
base:b36c969764ab12faebb74711c942fa3e6eaf1e96
config: x86_64-randconfig-a006-20200901 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
# save the attached .config to linux build tree
make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_resource_iounmap':
>> drivers/gpu/drm/ttm/ttm_bo_util.c:157:31: warning: variable 'man' set but 
>> not used [-Wunused-but-set-variable]
 157 |  struct ttm_resource_manager *man;
 |   ^~~
   In file included from include/linux/energy_model.h:10,
from include/linux/device.h:16,
from include/drm/drm_print.h:32,
from include/drm/drm_mm.h:49,
from include/drm/ttm/ttm_bo_driver.h:33,
from drivers/gpu/drm/ttm/ttm_bo_util.c:32:
   At top level:
   include/linux/sched/topology.h:40:3: warning: 'sd_flag_debug' defined but 
not used [-Wunused-const-variable=]
  40 | } sd_flag_debug[] = {
 |   ^
   In file included from include/linux/energy_model.h:10,
from include/linux/device.h:16,
from include/drm/drm_print.h:32,
from include/drm/drm_mm.h:49,
from include/drm/ttm/ttm_bo_driver.h:33,
from drivers/gpu/drm/ttm/ttm_bo_util.c:32:
   include/linux/sched/topology.h:30:27: warning: 'SD_DEGENERATE_GROUPS_MASK' 
defined but not used [-Wunused-const-variable=]
  30 | static const unsigned int SD_DEGENERATE_GROUPS_MASK =
 |   ^

# 
https://github.com/0day-ci/linux/commit/640f5da8a063c527c64720caa2c7b8b29aee5bb3
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Christian-K-nig/drm-ttm-make-sure-that-we-always-zero-init-mem-bus-v2/20200901-230736
git checkout 640f5da8a063c527c64720caa2c7b8b29aee5bb3
vim +/man +157 drivers/gpu/drm/ttm/ttm_bo_util.c

ba4e7d973dd09b Thomas Hellstrom 2009-06-10  152  
2966141ad2dda2 Dave Airlie      2020-08-04  153  static void ttm_resource_iounmap(struct ttm_bo_device *bdev,
2966141ad2dda2 Dave Airlie      2020-08-04  154  				      struct ttm_resource *mem,
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  155  				      void *virtual)
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  156  {
9de59bc201496f Dave Airlie      2020-08-04 @157  	struct ttm_resource_manager *man;
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  158  
9eca33f4a13919 Dave Airlie      2020-08-04  159  	man = ttm_manager_type(bdev, mem->mem_type);
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  160  
0c321c79627189 Jerome Glisse    2010-04-07  161  	if (virtual && mem->bus.addr == NULL)
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  162  		iounmap(virtual);
82c5da6bf8b55a Jerome Glisse    2010-04-09  163  	ttm_mem_io_free(bdev, mem);
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  164  }
ba4e7d973dd09b Thomas Hellstrom 2009-06-10  165  
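The warning above simply reflects that the patch removes the last user of
"man" while keeping the local variable. A likely cleanup, sketched here purely
for illustration (not taken from the series), would be to drop the variable
and the now-dead assignment:

	static void ttm_resource_iounmap(struct ttm_bo_device *bdev,
					 struct ttm_resource *mem,
					 void *virtual)
	{
		if (virtual && mem->bus.addr == NULL)
			iounmap(virtual);
		ttm_mem_io_free(bdev, mem);
	}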

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH v9 13/32] drm: omapdrm: use common helper for extracting pages array

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

Use common helper for converting a sg_table object into struct
page pointer array.

Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/omapdrm/omap_gem.c | 14 --
  1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c 
b/drivers/gpu/drm/omapdrm/omap_gem.c
index d0d12d5dd76c..ff0c4b0c3fd0 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -1297,10 +1297,9 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
omap_obj->dma_addr = sg_dma_address(sgt->sgl);
} else {
/* Create pages list from sgt */
-   struct sg_page_iter iter;
struct page **pages;
unsigned int npages;
-   unsigned int i = 0;
+   unsigned int ret;
  
  		npages = DIV_ROUND_UP(size, PAGE_SIZE);

pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
@@ -1311,14 +1310,9 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
}
  
  		omap_obj->pages = pages;

-
-   for_each_sg_page(sgt->sgl, &iter, sgt->orig_nents, 0) {
-   pages[i++] = sg_page_iter_page(&iter);
-   if (i > npages)
-   break;
-   }
-
-   if (WARN_ON(i != npages)) {
+   ret = drm_prime_sg_to_page_addr_arrays(sgt, pages, NULL,
+  npages);
+   if (WARN_ON(ret)) {


Again, I'm inclined to think the WARN_ON should remain in 
drm_prime_sg_to_page_addr_arrays() itself such that it could be removed 
here, but either way,


Reviewed-by: Robin Murphy 


omap_gem_free_object(obj);
obj = ERR_PTR(-ENOMEM);
goto done;




Re: [PATCH v9 12/32] drm: msm: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.
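
A minimal usage sketch of the wrappers the series switches to, for readers new
to them (illustrative only, not part of the patch; "dev" and "sgt" stand for
whatever device and scatter-gather table a driver already has):

	#include <linux/dma-mapping.h>
	#include <linux/scatterlist.h>

	static int example_map(struct device *dev, struct sg_table *sgt)
	{
		struct scatterlist *sg;
		unsigned int total = 0;
		int i, ret;

		/* Replaces the open-coded pattern where the count returned by
		 * dma_map_sg() (the mapped nents) must be used when walking the
		 * DMA chunks, while orig_nents must be used for unmap/sync. */
		ret = dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
		if (ret)
			return ret;	/* a real errno instead of a zero count */

		/* Walk only the DMA-mapped entries; no ->nents bookkeeping. */
		for_each_sgtable_dma_sg(sgt, sg, i)
			total += sg_dma_len(sg);

		dma_unmap_sgtable(dev, sgt, DMA_BIDIRECTIONAL, 0);
		return total ? 0 : -EINVAL;
	}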

Signed-off-by: Marek Szyprowski 
Acked-by: Rob Clark 
---
  drivers/gpu/drm/msm/msm_gem.c| 13 +
  drivers/gpu/drm/msm/msm_gpummu.c | 14 ++
  drivers/gpu/drm/msm/msm_iommu.c  |  2 +-
  3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index b2f49152b4d4..8c7ae812b813 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -53,11 +53,10 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
struct device *dev = msm_obj->base.dev->dev;
  
  	if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {

-   dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_sync_sgtable_for_device(dev, msm_obj->sgt,
+   DMA_BIDIRECTIONAL);
} else {
-   dma_map_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_map_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
}
  }
  
@@ -66,11 +65,9 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)

struct device *dev = msm_obj->base.dev->dev;
  
  	if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {

-   dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_sync_sgtable_for_cpu(dev, msm_obj->sgt, DMA_BIDIRECTIONAL);
} else {
-   dma_unmap_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
}
  }
  
diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c

index 310a31b05faa..319f06c28235 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -30,21 +30,19 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
iova,
  {
struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
-   struct scatterlist *sg;
+   struct sg_dma_page_iter dma_iter;
unsigned prot_bits = 0;
-   unsigned i, j;
  
  	if (prot & IOMMU_WRITE)

prot_bits |= 1;
if (prot & IOMMU_READ)
prot_bits |= 2;
  
-	for_each_sg(sgt->sgl, sg, sgt->nents, i) {

-   dma_addr_t addr = sg->dma_address;
-   for (j = 0; j < sg->length / GPUMMU_PAGE_SIZE; j++, idx++) {
-   gpummu->table[idx] = addr | prot_bits;
-   addr += GPUMMU_PAGE_SIZE;
-   }
+   for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
+   dma_addr_t addr = sg_page_iter_dma_address(&dma_iter);
+
+   BUILD_BUG_ON(GPUMMU_PAGE_SIZE != PAGE_SIZE);
+   gpummu->table[idx++] = addr | prot_bits;


Given that the BUILD_BUG_ON might prevent valid arm64 configs from 
building, how about a simple tweak like:


for (i = 0; i < PAGE_SIZE; i += GPUMMU_PAGE_SIZE)
gpummu->table[idx++] = i + addr | prot_bits;
?

Or alternatively perhaps some more aggressive #ifdefs or makefile tweaks 
to prevent the GPUMMU code building for arm64 at all if it's only 
relevant to 32-bit platforms (which I believe might be the case).


Robin.
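
Concretely, that suggestion slotted into the hunk above might look roughly
like this (a sketch only, not an actual patch; sgt, dma_iter, idx, prot_bits
and gpummu are the names already used there):

	for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
		dma_addr_t addr = sg_page_iter_dma_address(&dma_iter);
		unsigned int off;

		/* Split each CPU-page-sized DMA chunk into GPUMMU-sized
		 * entries, so the BUILD_BUG_ON() is no longer required. */
		for (off = 0; off < PAGE_SIZE; off += GPUMMU_PAGE_SIZE)
			gpummu->table[idx++] = (addr + off) | prot_bits;
	}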


}
  
  	/* we can improve by deferring flush for multiple map() */

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..6c31e65834c6 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,7 +36,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,

[Bug 203905] amdgpu:actual_brightness has unreal/wrong value

2020-09-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203905

--- Comment #14 from Xia Mu (mu.xia...@gmail.com) ---
The bug should be fixed in 5.9.0-0.rc3. I tested it on my laptop with AMD Ryzen
7 PRO 4750U CPU with Renoir GPU.



Re: [PATCH v9 11/32] drm: mediatek: use common helper for extracting pages array

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

Use common helper for converting a sg_table object into struct
page pointer array.


Reviewed-by: Robin Murphy 

Side note: is mtk_drm_gem_prime_vmap() missing a call to 
sg_free_table(sgt) before its kfree(sgt)?
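
If it is, the fix being hinted at would presumably be just the usual pairing
(illustrative, not part of this patch):

	sg_free_table(sgt);	/* release the scatterlist backing the table */
	kfree(sgt);		/* then free the sg_table wrapper itself */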



Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/mediatek/mtk_drm_gem.c | 9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 3654ec732029..0583e557ad37 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -233,9 +233,7 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
  {
struct mtk_drm_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
struct sg_table *sgt;
-   struct sg_page_iter iter;
unsigned int npages;
-   unsigned int i = 0;
  
  	if (mtk_gem->kvaddr)

return mtk_gem->kvaddr;
@@ -249,11 +247,8 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
if (!mtk_gem->pages)
goto out;
  
-	for_each_sg_page(sgt->sgl, &iter, sgt->orig_nents, 0) {

-   mtk_gem->pages[i++] = sg_page_iter_page(&iter);
-   if (i > npages)
-   break;
-   }
+   drm_prime_sg_to_page_addr_arrays(sgt, mtk_gem->pages, NULL, npages);
+
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
   pgprot_writecombine(PAGE_KERNEL));
  




Re: [PATCH v9 10/32] drm: mediatek: use common helper for a scatterlist contiguity check

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

Use common helper for checking the contiguity of the imported dma-buf and
do this check before allocating resources, so the error path is simpler.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/mediatek/mtk_drm_gem.c | 28 ++
  1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 6190cc3b7b0d..3654ec732029 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -212,37 +212,21 @@ struct drm_gem_object 
*mtk_gem_prime_import_sg_table(struct drm_device *dev,
struct dma_buf_attachment *attach, struct sg_table *sg)
  {
struct mtk_drm_gem_obj *mtk_gem;
-   int ret;
-   struct scatterlist *s;
-   unsigned int i;
-   dma_addr_t expected;
  
-	mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);

+   /* check if the entries in the sg_table are contiguous */
+   if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
+   DRM_ERROR("sg_table is not contiguous");
+   return ERR_PTR(-EINVAL);
+   }
  
+	mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);

if (IS_ERR(mtk_gem))
return ERR_CAST(mtk_gem);
  
-	expected = sg_dma_address(sg->sgl);

-   for_each_sg(sg->sgl, s, sg->nents, i) {
-   if (!sg_dma_len(s))
-   break;
-
-   if (sg_dma_address(s) != expected) {
-   DRM_ERROR("sg_table is not contiguous");
-   ret = -EINVAL;
-   goto err_gem_free;
-   }
-   expected = sg_dma_address(s) + sg_dma_len(s);
-   }
-
mtk_gem->dma_addr = sg_dma_address(sg->sgl);
mtk_gem->sg = sg;
  
  	return &mtk_gem->base;

-
-err_gem_free:
-   kfree(mtk_gem);
-   return ERR_PTR(ret);
  }
  
  void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)





Re: [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

This driver creatively uses sg_table->orig_nents to store the size of the
allocated scatterlist and ignores the number of the entries returned by
dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
free the (over)allocated scatterlist.

This patch only introduces the common DMA-mapping wrappers operating
directly on the struct sg_table objects to the dmabuf related functions,
so the other drivers, which might share buffers with i915 could rely on
the properly set nents and orig_nents values.


This one looks mechanical enough :)

Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
  drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
  2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 2679380159fc..8a988592715b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
src = sg_next(src);
}
  
-	if (!dma_map_sg_attrs(attachment->dev,

- st->sgl, st->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
-   ret = -ENOMEM;
+   ret = dma_map_sgtable(attachment->dev, st, dir, DMA_ATTR_SKIP_CPU_SYNC);
+   if (ret)
goto err_free_sg;
-   }
  
  	return st;
  
@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,

  {
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
  
-	dma_unmap_sg_attrs(attachment->dev,

-  sg->sgl, sg->nents, dir,
-  DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);
  
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c

index debaf7b18ab5..be30b27e2926 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct 
dma_buf_attachment *attachment,
sg = sg_next(sg);
}
  
-	if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {

-   err = -ENOMEM;
+   err = dma_map_sgtable(attachment->dev, st, dir, 0);
+   if (err)
goto err_st;
-   }
  
  	return st;
  
@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct dma_buf_attachment *attachment,

   struct sg_table *st,
   enum dma_data_direction dir)
  {
-   dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
+   dma_unmap_sgtable(attachment->dev, st, dir, 0);
sg_free_table(st);
kfree(st);
  }




Re: [PATCH v9 05/32] drm: etnaviv: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 +---
  drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 13 +++--
  2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index f06e19e7be04..eaf1949bc2e4 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -27,7 +27,7 @@ static void etnaviv_gem_scatter_map(struct etnaviv_gem_object 
*etnaviv_obj)
 * because display controller, GPU, etc. are not coherent.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_map_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   dma_map_sgtable(dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
  }
  
  static void etnaviv_gem_scatterlist_unmap(struct etnaviv_gem_object *etnaviv_obj)

@@ -51,7 +51,7 @@ static void etnaviv_gem_scatterlist_unmap(struct 
etnaviv_gem_object *etnaviv_obj
 * discard those writes.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_unmap_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
  }
  
  /* called with etnaviv_obj->lock held */

@@ -404,9 +404,8 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
}
  
  	if (etnaviv_obj->flags & ETNA_BO_CACHED) {

-   dma_sync_sg_for_cpu(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
-   etnaviv_op_to_dma_dir(op));
+   dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt,
+etnaviv_op_to_dma_dir(op));
etnaviv_obj->last_cpu_prep_op = op;
}
  
@@ -421,8 +420,7 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)

if (etnaviv_obj->flags & ETNA_BO_CACHED) {
/* fini without a prep is almost certainly a userspace error */
WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
-   dma_sync_sg_for_device(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
+   dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
etnaviv_op_to_dma_dir(etnaviv_obj->last_cpu_prep_op));
etnaviv_obj->last_cpu_prep_op = 0;
}
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c 
b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
index 3607d348c298..13b100553a0b 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
@@ -79,7 +79,7 @@ static int etnaviv_iommu_map(struct etnaviv_iommu_context 
*context, u32 iova,
if (!context || !sgt)
return -EINVAL;
  
-	for_each_sg(sgt->sgl, sg, sgt->nents, i) {

+   for_each_sgtable_dma_sg(sgt, sg, i) {
u32 pa = sg_dma_address(sg) - sg->offset;
size_t bytes = sg_dma_len(sg) + sg->offset;
  
@@ -95,14 +95,7 @@ static int etnaviv_iommu_map(struct etnaviv_iommu_context *context, u32 iova,

return 0;
  
  fail:

-   da = iova;
-
-   for_each_sg(sgt->sgl, sg, i, j) {
-   size_t bytes = sg_dma_len(sg) + sg->offset;
-
-   etnaviv_context_unmap(context, da, bytes);
-   da += bytes;
-   }
+   etnaviv_context_unmap(context, iova, da - iova);


I had to take a closer look to figure this out, but AFAICS it does 
indeed work out as a simpler way of achieving the exact same result, and 
in fact neatly mirrors how etnaviv_context_map() itself cleans up.


Reviewed-by: Robin Murphy 


return ret;
  }
  
@@ -113,7 +106,7 @@ static void etnaviv_iommu_unmap(struct etnaviv_iommu_context 

Re: [PATCH v9 04/32] drm: armada: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
  drivers/gpu/drm/armada/armada_gem.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index 8005614d2e6b..bedd8937d8a1 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -395,7 +395,7 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
  
  		mapping = dobj->obj.filp->f_mapping;
  
-		for_each_sg(sgt->sgl, sg, count, i) {

+   for_each_sgtable_sg(sgt, sg, i) {
struct page *page;
  
  			page = shmem_read_mapping_page(mapping, i);

@@ -407,8 +407,8 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
sg_set_page(sg, page, PAGE_SIZE, 0);
}
  
-		if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0) {

-   num = sgt->nents;
+   if (dma_map_sgtable(attach->dev, sgt, dir, 0)) {
+   num = count;


I think it might be even nicer to get rid of "num" entirely and convert 
the cleanup path to for_each_sgtable_sg() for completeness - AFAICS it 
should only need an extra "if (sg_page(sg))..." check in that loop. Then 
"count" could possibly be squashed into its one remaining use as well, 
but maybe it's worth keeping for readability.


Robin.
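
As a rough sketch of that idea (illustrative only, not the actual patch), the
release path could walk the whole table and skip entries whose page was never
populated, which is what makes "num" unnecessary:

	for_each_sgtable_sg(sgt, sg, i) {
		if (sg_page(sg))
			put_page(sg_page(sg));
	}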


goto release;
}
} else if (dobj->page) {
@@ -418,7 +418,7 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
  
  		sg_set_page(sgt->sgl, dobj->page, dobj->obj.size, 0);
  
-		if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)

+   if (dma_map_sgtable(attach->dev, sgt, dir, 0))
goto free_table;
} else if (dobj->linear) {
/* Single contiguous physical region - no struct page */
@@ -449,11 +449,11 @@ static void armada_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
int i;
  
  	if (!dobj->linear)

-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sgtable(attach->dev, sgt, dir, 0);
  
  	if (dobj->obj.filp) {

struct scatterlist *sg;
-   for_each_sg(sgt->sgl, sg, sgt->nents, i)
+   for_each_sgtable_sg(sgt, sg, i)
put_page(sg_page(sg));
}
  




Re: [PATCH 4/5] drm_dp_cec: add plumbing in preparation for MST support

2020-09-01 Thread Lyude Paul
Super minor nitpicks:

On Tue, 2020-09-01 at 16:22 +1000, Sam McNally wrote:
> From: Hans Verkuil 
> 
> Signed-off-by: Hans Verkuil 
> [sa...@chromium.org:
>  - rebased
>  - removed polling-related changes
>  - moved the calls to drm_dp_cec_(un)set_edid() into the next patch
> ]
> Signed-off-by: Sam McNally 
> ---
> 
>  .../display/amdgpu_dm/amdgpu_dm_mst_types.c   |  2 +-
>  drivers/gpu/drm/drm_dp_cec.c  | 22 ++-
>  drivers/gpu/drm/i915/display/intel_dp.c   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_connector.c   |  2 +-
>  include/drm/drm_dp_helper.h   |  6 +++--
>  5 files changed, 19 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 461fa4da0a34..6e7075893ec9 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -419,7 +419,7 @@ void amdgpu_dm_initialize_dp_connector(struct
> amdgpu_display_manager *dm,
>  
>   drm_dp_aux_init(&aconnector->dm_dp_aux.aux);
>   drm_dp_cec_register_connector(&aconnector->dm_dp_aux.aux,
> -   &aconnector->base);
> +   &aconnector->base, false);
>  
>   if (aconnector->base.connector_type == DRM_MODE_CONNECTOR_eDP)
>   return;
> diff --git a/drivers/gpu/drm/drm_dp_cec.c b/drivers/gpu/drm/drm_dp_cec.c
> index 3ab2609f9ec7..04ab7b88055c 100644
> --- a/drivers/gpu/drm/drm_dp_cec.c
> +++ b/drivers/gpu/drm/drm_dp_cec.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /*
>   * Unfortunately it turns out that we have a chicken-and-egg situation
> @@ -338,8 +339,6 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const
> struct edid *edid)
>   if (aux->cec.adap) {
>   if (aux->cec.adap->capabilities == cec_caps &&
>   aux->cec.adap->available_log_addrs == num_las) {
> - /* Unchanged, so just set the phys addr */
> - cec_s_phys_addr_from_edid(aux->cec.adap, edid);
>   goto unlock;
>   }

May as well drop the braces here

>   /*
> @@ -364,15 +363,16 @@ void drm_dp_cec_set_edid(struct drm_dp_aux *aux, const
> struct edid *edid)
>   if (cec_register_adapter(aux->cec.adap, connector->dev->dev)) {
>   cec_delete_adapter(aux->cec.adap);
>   aux->cec.adap = NULL;
> - } else {
> - /*
> -  * Update the phys addr for the new CEC adapter. When called
> -  * from drm_dp_cec_register_connector() edid == NULL, so in
> -  * that case the phys addr is just invalidated.
> -  */
> - cec_s_phys_addr_from_edid(aux->cec.adap, edid);
>   }
>  unlock:
> + /*
> +  * Update the phys addr for the new CEC adapter. When called
> +  * from drm_dp_cec_register_connector() edid == NULL, so in
> +  * that case the phys addr is just invalidated.
> +  */
> + if (aux->cec.adap && edid) {
> + cec_s_phys_addr_from_edid(aux->cec.adap, edid);
> + }

And here

>   mutex_unlock(&aux->cec.lock);
>  }
>  EXPORT_SYMBOL(drm_dp_cec_set_edid);
> @@ -418,6 +418,7 @@ EXPORT_SYMBOL(drm_dp_cec_unset_edid);
>   * drm_dp_cec_register_connector() - register a new connector
>   * @aux: DisplayPort AUX channel
>   * @connector: drm connector
> + * @is_mst: set to true if this is an MST branch
>   *
>   * A new connector was registered with associated CEC adapter name and
>   * CEC adapter parent device. After registering the name and parent
> @@ -425,12 +426,13 @@ EXPORT_SYMBOL(drm_dp_cec_unset_edid);
>   * CEC and to register a CEC adapter if that is the case.
>   */
>  void drm_dp_cec_register_connector(struct drm_dp_aux *aux,
> -struct drm_connector *connector)
> +struct drm_connector *connector, bool is_mst)
>  {
>   WARN_ON(aux->cec.adap);
>   if (WARN_ON(!aux->transfer))
>   return;
>   aux->cec.connector = connector;
> + aux->cec.is_mst = is_mst;

Also JFYI, you can also check aux->is_remote, but maybe you've got another
reason for copying this here

Either way:

Reviewed-by: Lyude Paul 

...Also, maybe this is just a coincidence - but do I know your name from
somewhere? Perhaps an IRC community from long ago?

>   INIT_DELAYED_WORK(&aux->cec.unregister_work,
> drm_dp_cec_unregister_work);
>  }
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index 82b9de274f65..744cb55572f9 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> @@ -6261,7 +6261,7 @@ intel_dp_connector_register(struct drm_connector
> *connector)
>   intel_dp->aux.dev = connector->kdev;
>   ret = drm_dp_aux_register(&intel_dp->aux);
>   if 

Re: [PATCH] drm/tve200: Stabilize enable/disable

2020-09-01 Thread Daniel Vetter
On Tue, Sep 1, 2020 at 7:52 PM Linus Walleij  wrote:
>
> On Thu, Aug 20, 2020 at 10:32 PM Linus Walleij  
> wrote:
>
> > The TVE200 will occasionally print a bunch of lost interrupts
> > and similar dmesg messages, sometimes during boot and sometimes
> > after disabling and coming back to enablement. This is probably
> > because the hardware is left in an unknown state by the boot
> > loader that displays a logo.
> >
> > This can be fixed by bringing the controller into a known state
> > by resetting the controller while enabling it. We retry reset 5
> > times like the vendor driver does. We also put the controller
> > into reset before de-clocking it and clear all interrupts before
> > enabling the vblank IRQ.
> >
> > This makes the video enable/disable/enable cycle rock solid
> > on the D-Link DIR-685. Tested extensively.
> >
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Linus Walleij 
>
> Would someone have mercy on this patch and review or
> at least ACK it so I can merge it?

Does what it says on the label, looks symmetric, and "do this five
times for luck" is a classic.

Acked-by: Daniel Vetter 

The irq reset looks a bit like maybe separate patch, but *shrug*,
since your description says you're missing interrupts, not that you
have too many. But can't hurt (and maybe if we have spurious ones it
then looks like the next vblank went missing, so makes some sense).

Cheers, Daniel

> I offer any reviews in return, on stuff I understand, such
> as panel drivers.


>
> Best regards,
> Linus Walleij



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/dp: start using more of the extended receiver caps

2020-09-01 Thread Jani Nikula
On Tue, 01 Sep 2020, Lyude Paul  wrote:
> On Tue, 2020-09-01 at 15:32 +0300, Jani Nikula wrote:
>> In the future, we'll be needing more of the extended receiver capability
>> field starting at DPCD address 0x2200. (Specifically, we'll need main
>> link channel coding cap for DP 2.0.) Start using it now to not miss out
>> later on.
>> 
>> Cc: Lyude Paul 
>> Signed-off-by: Jani Nikula 
>> 
>> ---
>> 
>> I guess this can be merged after the topic branch to drm-misc-next or
>> so, but I'd prefer to have this fairly early on to catch any potential
>> issues.
>> ---
>>  drivers/gpu/drm/drm_dp_helper.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/drivers/gpu/drm/drm_dp_helper.c 
>> b/drivers/gpu/drm/drm_dp_helper.c
>> index 1e7c638873c8..3a3c238452df 100644
>> --- a/drivers/gpu/drm/drm_dp_helper.c
>> +++ b/drivers/gpu/drm/drm_dp_helper.c
>> @@ -436,7 +436,7 @@ static u8 drm_dp_downstream_port_count(const u8
>> dpcd[DP_RECEIVER_CAP_SIZE])
>>  static int drm_dp_read_extended_dpcd_caps(struct drm_dp_aux *aux,
>>u8 dpcd[DP_RECEIVER_CAP_SIZE])
>>  {
>> -u8 dpcd_ext[6];
>> +u8 dpcd_ext[DP_RECEIVER_CAP_SIZE];
>
> Not 100% sure this is right? It's not clear at first glance of the 2.0 spec, 
> but
> my assumption would be that on < DP2.0 devices that everything but those 
> first 6
> bytes are zeroed out in the extended DPRX field. Since we memcpy() dpcd_ext
> using sizeof(dpcd_ext), we'd potentially end up zeroing out all of the DPCD 
> caps
> that comes after those 6 bytes.

Re-reading stuff... AFAICT everything in 0x2200..0x220F should be
valid. They should match what's in 0x0000..0x000F except for 0x0000,
0x0001, and 0x0005, for backwards compatibility.

Apparently there are no such backwards compatibility concerns with the
other receiver cap fields then.

But it gives me an uneasy feeling that many places in the spec refer to
0x2200+ even though they should per spec be the same in 0x0000+.

I guess we can try without the change, and fix later if we hit issues.


BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH 0/4] drm/panel: s6e63m0: Add DSI transport

2020-09-01 Thread Linus Walleij
On Thu, Aug 27, 2020 at 11:04 AM Linus Walleij  wrote:
> On Tue, Aug 18, 2020 at 7:10 PM Sam Ravnborg  wrote:
>
> > How does this patchset relate to the patchset posted by Paul?
> > https://lore.kernel.org/dri-devel/20200727164613.19744-1-p...@crapouillou.net/
>
> Not much. S6E63M0 uses "spi" as it is right now and is not using
> the existing DBI code.
>
> So it would require it to start using the DBI core to begin with.
> If it can. Which is kind of an orthogonal task.
>
> What would be the defining character for it to
> be "DBI"? I do see that the driver sends MIPI standard commands
> over SPI. I suspect this is another standard without public specs...
>
> > Seems that two different approcahes are used for the same type of
> > problem.
>
> This approach is based on the approach from IIO, se e.g.:
> drivers/iio/accel/bmc150-accel-core.c
> drivers/iio/accel/bmc150-accel.h
> drivers/iio/accel/bmc150-accel-i2c.c
> drivers/iio/accel/bmc150-accel-spi.c
>
> > Is it possible to find a common solution?
>
> I'm happy to rework it any direction. If the other patch set is going to
> take time to finalize (as in: will not merge it the coming week, need to
> hack and stuff) then I'd prefer to apply this so I know my display works
> in v5.10. I can certainly rework it into Paul's framework when that
> arrives.

Is it OK to merge this as-is? I'm fishing for an ACK here...

I will certainly adapt to the DBI framework when/if it arrives,
and I think my track record makes that claim believable.

Yours,
Linus Walleij


Re: [PATCH] drm/tve200: Stabilize enable/disable

2020-09-01 Thread Linus Walleij
On Thu, Aug 20, 2020 at 10:32 PM Linus Walleij  wrote:

> The TVE200 will occasionally print a bunch of lost interrupts
> and similar dmesg messages, sometimes during boot and sometimes
> after disabling and coming back to enablement. This is probably
> because the hardware is left in an unknown state by the boot
> loader that displays a logo.
>
> This can be fixed by bringing the controller into a known state
> by resetting the controller while enabling it. We retry reset 5
> times like the vendor driver does. We also put the controller
> into reset before de-clocking it and clear all interrupts before
> enabling the vblank IRQ.
>
> This makes the video enable/disable/enable cycle rock solid
> on the D-Link DIR-685. Tested extensively.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Linus Walleij 

Would someone have mercy on this patch and review or
at least ACK it so I can merge it?

I offer any reviews in return, on stuff I understand, such
as panel drivers.

Best regards,
Linus Walleij


Re: [PATCH] drm/dp: start using more of the extended receiver caps

2020-09-01 Thread Lyude Paul
On Tue, 2020-09-01 at 15:32 +0300, Jani Nikula wrote:
> In the future, we'll be needing more of the extended receiver capability
> field starting at DPCD address 0x2200. (Specifically, we'll need main
> link channel coding cap for DP 2.0.) Start using it now to not miss out
> later on.
> 
> Cc: Lyude Paul 
> Signed-off-by: Jani Nikula 
> 
> ---
> 
> I guess this can be merged after the topic branch to drm-misc-next or
> so, but I'd prefer to have this fairly early on to catch any potential
> issues.
> ---
>  drivers/gpu/drm/drm_dp_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index 1e7c638873c8..3a3c238452df 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -436,7 +436,7 @@ static u8 drm_dp_downstream_port_count(const u8
> dpcd[DP_RECEIVER_CAP_SIZE])
>  static int drm_dp_read_extended_dpcd_caps(struct drm_dp_aux *aux,
> u8 dpcd[DP_RECEIVER_CAP_SIZE])
>  {
> - u8 dpcd_ext[6];
> + u8 dpcd_ext[DP_RECEIVER_CAP_SIZE];

Not 100% sure this is right? It's not clear at first glance from the 2.0 spec, but
my assumption would be that on < DP2.0 devices everything but those first 6
bytes is zeroed out in the extended DPRX field. Since we memcpy() dpcd_ext
using sizeof(dpcd_ext), we'd potentially end up zeroing out all of the DPCD caps
that come after those 6 bytes.
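
To spell the concern out with a toy illustration (hypothetical buffers, not
kernel code):

	u8 dpcd[DP_RECEIVER_CAP_SIZE];		/* base caps read from 0x0000     */
	u8 dpcd_ext[DP_RECEIVER_CAP_SIZE];	/* extended caps read from 0x2200 */

	/* If a pre-DP 2.0 sink were to mirror only the first 6 bytes at 0x2200
	 * and leave the rest zeroed, this copy would wipe dpcd[6..15]: */
	memcpy(dpcd, dpcd_ext, sizeof(dpcd_ext));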

>   int ret;
>  
>   /*
-- 
Sincerely,
  Lyude Paul (she/her)
  Software Engineer at Red Hat



Re: [PATCH v9 03/32] drm: core: fix common struct sg_table related issues

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, let's use the common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Andrzej Hajda 
---
  drivers/gpu/drm/drm_cache.c|  2 +-
  drivers/gpu/drm/drm_gem_shmem_helper.c | 14 +-
  drivers/gpu/drm/drm_prime.c| 11 ++-
  3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 03e01b000f7a..0fe3c496002a 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -127,7 +127,7 @@ drm_clflush_sg(struct sg_table *st)
struct sg_page_iter sg_iter;
  
  		mb(); /*CLFLUSH is ordered only by using memory barriers*/

-   for_each_sg_page(st->sgl, &sg_iter, st->nents, 0)
+   for_each_sgtable_page(st, &sg_iter, 0)
drm_clflush_page(sg_page_iter_page(&sg_iter));
mb(); /*Make sure that all cache line entry is flushed*/
  
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c

index 4b7cfbac4daa..47d8211221f2 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -126,8 +126,8 @@ void drm_gem_shmem_free_object(struct drm_gem_object *obj)
drm_prime_gem_destroy(obj, shmem->sgt);
} else {
if (shmem->sgt) {
-   dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
+ DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
@@ -424,8 +424,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_object *obj)
  
  	WARN_ON(!drm_gem_shmem_is_purgeable(shmem));
  
-	dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,

-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(obj->dev->dev, shmem->sgt, DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
shmem->sgt = NULL;
@@ -697,12 +696,17 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct 
drm_gem_object *obj)
goto err_put_pages;
}
/* Map the pages for use by the h/w. */
-   dma_map_sg(obj->dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   ret = dma_map_sgtable(obj->dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
+   if (ret)
+   goto err_free_sgt;
  
  	shmem->sgt = sgt;
  
  	return sgt;
  
+err_free_sgt:

+   sg_free_table(sgt);
+   kfree(sgt);


Should this be a separate patch to add the missing error handling to the 
existing code first?


Otherwise the rest of the mechanical conversion looks straightforward 
enough, and I'm not the separation-of-concerns police (for this 
subsystem, at least), so either way,


Reviewed-by: Robin Murphy 


  err_put_pages:
drm_gem_shmem_put_pages(shmem);
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 5d181bf60a44..c45b0cc6e31d 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -617,6 +617,7 @@ struct sg_table *drm_gem_map_dma_buf(struct 
dma_buf_attachment *attach,
  {
struct drm_gem_object *obj = attach->dmabuf->priv;
struct sg_table *sgt;
+   int ret;
  
  	if (WARN_ON(dir == DMA_NONE))

return ERR_PTR(-EINVAL);
@@ -626,11 +627,12 @@ struct sg_table *drm_gem_map_dma_buf(struct 
dma_buf_attachment *attach,
else
sgt = obj->dev->driver->gem_prime_get_sg_table(obj);
  
-	if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,

- DMA_ATTR_SKIP_CPU_SYNC)) {
+   ret = 

Re: [PATCH v9 01/32] drm: prime: add common helper to check scatterlist contiguity

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

It is a common operation done by DRM drivers to check the contiguity
of the DMA-mapped buffer described by a scatterlist in the
sg_table object. Let's add a common helper for this operation.


I still think this could be hoisted even further out to the common 
sgtable API level, but let's get the individual subsystems straightened 
out first then worry about consolidation later.


Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
Reviewed-by: Andrzej Hajda 
---
  drivers/gpu/drm/drm_gem_cma_helper.c | 23 +++--
  drivers/gpu/drm/drm_prime.c  | 31 
  include/drm/drm_prime.h  |  2 ++
  3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c 
b/drivers/gpu/drm/drm_gem_cma_helper.c
index 822edeadbab3..59b9ca207b42 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -471,26 +471,9 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev,
  {
struct drm_gem_cma_object *cma_obj;
  
-	if (sgt->nents != 1) {

-   /* check if the entries in the sg_table are contiguous */
-   dma_addr_t next_addr = sg_dma_address(sgt->sgl);
-   struct scatterlist *s;
-   unsigned int i;
-
-   for_each_sg(sgt->sgl, s, sgt->nents, i) {
-   /*
-* sg_dma_address(s) is only valid for entries
-* that have sg_dma_len(s) != 0
-*/
-   if (!sg_dma_len(s))
-   continue;
-
-   if (sg_dma_address(s) != next_addr)
-   return ERR_PTR(-EINVAL);
-
-   next_addr = sg_dma_address(s) + sg_dma_len(s);
-   }
-   }
+   /* check if the entries in the sg_table are contiguous */
+   if (drm_prime_get_contiguous_size(sgt) < attach->dmabuf->size)
+   return ERR_PTR(-EINVAL);
  
  	/* Create a CMA GEM buffer. */

cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 1693aa7c14b5..4ed5ed1f078c 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -825,6 +825,37 @@ struct sg_table *drm_prime_pages_to_sg(struct page 
**pages, unsigned int nr_page
  }
  EXPORT_SYMBOL(drm_prime_pages_to_sg);
  
+/**

+ * drm_prime_get_contiguous_size - returns the contiguous size of the buffer
+ * @sgt: sg_table describing the buffer to check
+ *
+ * This helper calculates the contiguous size in the DMA address space
+ * of the the buffer described by the provided sg_table.
+ *
+ * This is useful for implementing
+ * _gem_object_funcs.gem_prime_import_sg_table.
+ */
+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt)
+{
+   dma_addr_t expected = sg_dma_address(sgt->sgl);
+   struct scatterlist *sg;
+   unsigned long size = 0;
+   int i;
+
+   for_each_sgtable_dma_sg(sgt, sg, i) {
+   unsigned int len = sg_dma_len(sg);
+
+   if (!len)
+   break;
+   if (sg_dma_address(sg) != expected)
+   break;
+   expected += len;
+   size += len;
+   }
+   return size;
+}
+EXPORT_SYMBOL(drm_prime_get_contiguous_size);
+
  /**
   * drm_gem_prime_export - helper library implementation of the export callback
   * @obj: GEM object to export
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 9af7422b44cf..47ef11614627 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -92,6 +92,8 @@ struct sg_table *drm_prime_pages_to_sg(struct page **pages, 
unsigned int nr_page
  struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
 int flags);
  
+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt);

+
  /* helper functions for importing */
  struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
struct dma_buf *dma_buf,




Re: [PATCH v9 02/32] drm: prime: use sgtable iterators in drm_prime_sg_to_page_addr_arrays()

2020-09-01 Thread Robin Murphy

On 2020-08-26 07:32, Marek Szyprowski wrote:

Replace the current hand-crafted code for extracting pages and DMA
addresses from the given scatterlist by the much more robust
code based on the generic scatterlist iterators and recently
introduced sg_table-based wrappers. The resulting code is simple and
easy to understand, so the comment describing the old code is no
longer needed.


Is removing the WARN_ON()s intentional? It certainly seems like it would 
be a genuine driver bug if the caller asked for addresses but didn't 
allocate appropriately-sized arrays. Might be worth noting either way. 
I'm also assuming this isn't called in performance-critical paths with 
massive lists such that the two separate iterations might have a 
noticeable impact.
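
If the WARN_ON()s were kept, the bounds check in the new page loop would
presumably just become (illustrative only):

	if (WARN_ON(p - pages >= max_entries))
		return -1;

and likewise with "a - addrs" in the DMA-address loop.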


Nits aside,

Reviewed-by: Robin Murphy 


Signed-off-by: Marek Szyprowski 
Reviewed-by: Andrzej Hajda 
---
  drivers/gpu/drm/drm_prime.c | 49 -
  1 file changed, 15 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 4ed5ed1f078c..5d181bf60a44 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -990,45 +990,26 @@ EXPORT_SYMBOL(drm_gem_prime_import);
  int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page 
**pages,
 dma_addr_t *addrs, int max_entries)
  {
-   unsigned count;
-   struct scatterlist *sg;
-   struct page *page;
-   u32 page_len, page_index;
-   dma_addr_t addr;
-   u32 dma_len, dma_index;
-
-   /*
-* Scatterlist elements contains both pages and DMA addresses, but
-* one shoud not assume 1:1 relation between them. The sg->length is
-* the size of the physical memory chunk described by the sg->page,
-* while sg_dma_len(sg) is the size of the DMA (IO virtual) chunk
-* described by the sg_dma_address(sg).
-*/
-   page_index = 0;
-   dma_index = 0;
-   for_each_sg(sgt->sgl, sg, sgt->nents, count) {
-   page_len = sg->length;
-   page = sg_page(sg);
-   dma_len = sg_dma_len(sg);
-   addr = sg_dma_address(sg);
-
-   while (pages && page_len > 0) {
-   if (WARN_ON(page_index >= max_entries))
+   struct sg_dma_page_iter dma_iter;
+   struct sg_page_iter page_iter;
+   struct page **p = pages;
+   dma_addr_t *a = addrs;
+
+   if (pages) {
+   for_each_sgtable_page(sgt, &page_iter, 0) {
+   if (p - pages >= max_entries)
return -1;
-   pages[page_index] = page;
-   page++;
-   page_len -= PAGE_SIZE;
-   page_index++;
+   *p++ = sg_page_iter_page(&page_iter);
}
-   while (addrs && dma_len > 0) {
-   if (WARN_ON(dma_index >= max_entries))
+   }
+   if (addrs) {
+   for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
+   if (a - addrs >= max_entries)
return -1;
-   addrs[dma_index] = addr;
-   addr += PAGE_SIZE;
-   dma_len -= PAGE_SIZE;
-   dma_index++;
+   *a++ = sg_page_iter_dma_address(&dma_iter);
}
}
+
return 0;
  }
  EXPORT_SYMBOL(drm_prime_sg_to_page_addr_arrays);




[PATCH v16 20/20] arm: dts: qcom: sc7180: Set the compatible string for the GPU SMMU

2020-09-01 Thread Rob Clark
From: Rob Clark 

Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables and per-instance pagetables for drm/msm.

Signed-off-by: Rob Clark 
---
 arch/arm64/boot/dts/qcom/sc7180.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc7180.dtsi 
b/arch/arm64/boot/dts/qcom/sc7180.dtsi
index d46b3833e52f..f3bef1cad889 100644
--- a/arch/arm64/boot/dts/qcom/sc7180.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7180.dtsi
@@ -1937,7 +1937,7 @@ opp-18000 {
};
 
adreno_smmu: iommu@504 {
-   compatible = "qcom,sc7180-smmu-v2", "qcom,smmu-v2";
+   compatible = "qcom,sc7180-smmu-v2", "qcom,adreno-smmu", 
"qcom,smmu-v2";
reg = <0 0x0504 0 0x1>;
#iommu-cells = <1>;
#global-interrupts = <2>;
-- 
2.26.2



[PATCH v16 11/20] drm/msm: Show process names in gem_describe

2020-09-01 Thread Rob Clark
From: Rob Clark 

In $debugfs/gem we already show any vma(s) associated with an object.
Also show process names if the vma's address space is a per-process
address space.

Signed-off-by: Rob Clark 
Reviewed-by: Jordan Crouse 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/msm_drv.c |  2 +-
 drivers/gpu/drm/msm/msm_gem.c | 25 +
 drivers/gpu/drm/msm/msm_gem.h |  5 +
 drivers/gpu/drm/msm/msm_gem_vma.c |  1 +
 drivers/gpu/drm/msm/msm_gpu.c |  8 +---
 drivers/gpu/drm/msm/msm_gpu.h |  2 +-
 6 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 7e963f707852..7143756b7e83 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -597,7 +597,7 @@ static int context_init(struct drm_device *dev, struct 
drm_file *file)
kref_init(&ctx->ref);
msm_submitqueue_init(dev, ctx);
 
-   ctx->aspace = msm_gpu_create_private_address_space(priv->gpu);
+   ctx->aspace = msm_gpu_create_private_address_space(priv->gpu, current);
file->driver_priv = ctx;
 
return 0;
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 3cb7aeb93fd3..76a6c5271e57 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -842,11 +842,28 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
seq_file *m)
 
seq_puts(m, "  vmas:");
 
-   list_for_each_entry(vma, &msm_obj->vmas, list)
-   seq_printf(m, " [%s: %08llx,%s,inuse=%d]",
-   vma->aspace != NULL ? vma->aspace->name : NULL,
-   vma->iova, vma->mapped ? "mapped" : "unmapped",
+   list_for_each_entry(vma, &msm_obj->vmas, list) {
+   const char *name, *comm;
+   if (vma->aspace) {
+   struct msm_gem_address_space *aspace = 
vma->aspace;
+   struct task_struct *task =
+   get_pid_task(aspace->pid, PIDTYPE_PID);
+   if (task) {
+   comm = kstrdup(task->comm, GFP_KERNEL);
+   } else {
+   comm = NULL;
+   }
+   name = aspace->name;
+   } else {
+   name = comm = NULL;
+   }
+   seq_printf(m, " [%s%s%s: aspace=%p, 
%08llx,%s,inuse=%d]",
+   name, comm ? ":" : "", comm ? comm : "",
+   vma->aspace, vma->iova,
+   vma->mapped ? "mapped" : "unmapped",
vma->inuse);
+   kfree(comm);
+   }
 
seq_puts(m, "\n");
}
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 9c573c4269cb..7b1c7a5f8eef 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -24,6 +24,11 @@ struct msm_gem_address_space {
spinlock_t lock; /* Protects drm_mm node allocation/removal */
struct msm_mmu *mmu;
struct kref kref;
+
+   /* For address spaces associated with a specific process, this
+* will be non-NULL:
+*/
+   struct pid *pid;
 };
 
 struct msm_gem_vma {
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c 
b/drivers/gpu/drm/msm/msm_gem_vma.c
index 29cc1305cf37..80a8a266d68f 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -17,6 +17,7 @@ msm_gem_address_space_destroy(struct kref *kref)
drm_mm_takedown(&aspace->mm);
if (aspace->mmu)
aspace->mmu->funcs->destroy(aspace->mmu);
+   put_pid(aspace->pid);
kfree(aspace);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 951850804d77..ac8961187a73 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -825,10 +825,9 @@ static int get_clocks(struct platform_device *pdev, struct 
msm_gpu *gpu)
 
 /* Return a new address space for a msm_drm_private instance */
 struct msm_gem_address_space *
-msm_gpu_create_private_address_space(struct msm_gpu *gpu)
+msm_gpu_create_private_address_space(struct msm_gpu *gpu, struct task_struct 
*task)
 {
struct msm_gem_address_space *aspace = NULL;
-
if (!gpu)
return NULL;
 
@@ -836,8 +835,11 @@ msm_gpu_create_private_address_space(struct msm_gpu *gpu)
 * If the target doesn't support private address spaces then return
 * the global one
 */
-   if (gpu->funcs->create_private_address_space)
+   if (gpu->funcs->create_private_address_space) {
aspace = gpu->funcs->create_private_address_space(gpu);
+   if 

[PATCH v16 19/20] arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Set the qcom,adreno-smmu compatible string for the GPU SMMU to enable
split pagetables and per-instance pagetables for drm/msm.

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
---
 arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 9 +
 arch/arm64/boot/dts/qcom/sdm845.dtsi   | 2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
index 64fc1bfd66fa..39f23cdcbd02 100644
--- a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
@@ -633,6 +633,15 @@ _mdp {
status = "okay";
 };
 
+/*
+ * Cheza fw does not properly program the GPU aperture to allow the
+ * GPU to update the SMMU pagetables for context switches.  Work
+ * around this by dropping the "qcom,adreno-smmu" compat string.
+ */
+&adreno_smmu {
+   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+};
+
 &mss_pil {
iommus = <&apps_smmu 0x781 0x0>,
 <&apps_smmu 0x724 0x3>;
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi 
b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 2884577dcb77..76a8a34640ae 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -4058,7 +4058,7 @@ opp-25700 {
};
 
adreno_smmu: iommu@504 {
-   compatible = "qcom,sdm845-smmu-v2", "qcom,smmu-v2";
+   compatible = "qcom,sdm845-smmu-v2", "qcom,adreno-smmu", 
"qcom,smmu-v2";
reg = <0 0x504 0 0x1>;
#iommu-cells = <1>;
#global-interrupts = <2>;
-- 
2.26.2



[PATCH v16 13/20] iommu/arm-smmu: Add support for split pagetables

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Enable TTBR1 for a context bank if IO_PGTABLE_QUIRK_ARM_TTBR1 is selected
by the io-pgtable configuration.

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 19 +++
 drivers/iommu/arm/arm-smmu/arm-smmu.h | 25 +++--
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 37d8d49299b4..8e884e58f208 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -552,11 +552,15 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
cb->ttbr[1] = 0;
} else {
-   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
-   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID,
- cfg->asid);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
+cfg->asid);
cb->ttbr[1] = FIELD_PREP(ARM_SMMU_TTBRn_ASID,
 cfg->asid);
+
+   if (pgtbl_cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
+   cb->ttbr[1] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   else
+   cb->ttbr[0] |= pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
}
} else {
cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -822,7 +826,14 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
 
/* Update the domain's page sizes to reflect the page table format */
domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
-   domain->geometry.aperture_end = (1UL << ias) - 1;
+
+   if (pgtbl_cfg.quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   domain->geometry.aperture_start = ~0UL << ias;
+   domain->geometry.aperture_end = ~0UL;
+   } else {
+   domain->geometry.aperture_end = (1UL << ias) - 1;
+   }
+
domain->geometry.force_aperture = true;
 
/* Initialise the context bank with our page table cfg */
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 83294516ac08..f3e456893f28 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -169,10 +169,12 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_TCR0x30
 #define ARM_SMMU_TCR_EAE   BIT(31)
 #define ARM_SMMU_TCR_EPD1  BIT(23)
+#define ARM_SMMU_TCR_A1BIT(22)
 #define ARM_SMMU_TCR_TG0   GENMASK(15, 14)
 #define ARM_SMMU_TCR_SH0   GENMASK(13, 12)
 #define ARM_SMMU_TCR_ORGN0 GENMASK(11, 10)
 #define ARM_SMMU_TCR_IRGN0 GENMASK(9, 8)
+#define ARM_SMMU_TCR_EPD0  BIT(7)
 #define ARM_SMMU_TCR_T0SZ  GENMASK(5, 0)
 
 #define ARM_SMMU_VTCR_RES1 BIT(31)
@@ -350,12 +352,23 @@ struct arm_smmu_domain {
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
 {
-   return ARM_SMMU_TCR_EPD1 |
-  FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
-  FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
-  FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
-  FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
-  FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+   u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
+   FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
+   FIELD_PREP(ARM_SMMU_TCR_ORGN0, cfg->arm_lpae_s1_cfg.tcr.orgn) |
+   FIELD_PREP(ARM_SMMU_TCR_IRGN0, cfg->arm_lpae_s1_cfg.tcr.irgn) |
+   FIELD_PREP(ARM_SMMU_TCR_T0SZ, cfg->arm_lpae_s1_cfg.tcr.tsz);
+
+   /*
+   * When TTBR1 is selected shift the TCR fields by 16 bits and disable
+   * translation in TTBR0
+   */
+   if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+   tcr = (tcr << 16) & ~ARM_SMMU_TCR_A1;
+   tcr |= ARM_SMMU_TCR_EPD0;
+   } else
+   tcr |= ARM_SMMU_TCR_EPD1;
+
+   return tcr;
 }
 
 static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
-- 
2.26.2


[PATCH v16 14/20] iommu/arm-smmu: Prepare for the adreno-smmu implementation

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Do a bit of prep work to add the upcoming adreno-smmu implementation.

Add a hook to allow the implementation to choose which context banks
to allocate.

Move some of the common structs to arm-smmu.h in anticipation of them
being used by the implementations and update some of the existing hooks
to pass more information that the implementation will need.

These modifications will be used by the upcoming Adreno SMMU
implementation to identify the GPU device and properly configure it
for pagetable switching.
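
As a rough illustration of the new hook (a sketch only, not taken from the
patch): an implementation can steer one device onto a fixed context bank
and fall back to the generic bitmap allocator for everything else. The
predicate my_impl_is_special_dev() is made up; the hook signature follows
the hunks below.

static int my_impl_alloc_context_bank(struct arm_smmu_domain *smmu_domain,
                                      struct device *dev, int start, int count)
{
        struct arm_smmu_device *smmu = smmu_domain->smmu;

        /* Hypothetical check: pin the matching device to context bank 0 */
        if (my_impl_is_special_dev(dev)) {
                start = 0;
                count = 1;
        }

        return __arm_smmu_alloc_bitmap(smmu->context_map, start, start + count);
}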

Co-developed-by: Rob Clark 
Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  2 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c  | 69 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.h  | 51 +++-
 3 files changed, 68 insertions(+), 54 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index a9861dcd0884..88f17cc33023 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -69,7 +69,7 @@ static int cavium_cfg_probe(struct arm_smmu_device *smmu)
 }
 
 static int cavium_init_context(struct arm_smmu_domain *smmu_domain,
-   struct io_pgtable_cfg *pgtbl_cfg)
+   struct io_pgtable_cfg *pgtbl_cfg, struct device *dev)
 {
struct cavium_smmu *cs = container_of(smmu_domain->smmu,
  struct cavium_smmu, smmu);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 8e884e58f208..68b7b9e6140e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -65,41 +65,10 @@ module_param(disable_bypass, bool, S_IRUGO);
 MODULE_PARM_DESC(disable_bypass,
"Disable bypass streams such that incoming transactions from devices 
that are not attached to an iommu domain will report an abort back to the 
device and will not be allowed to pass through the SMMU.");
 
-struct arm_smmu_s2cr {
-   struct iommu_group  *group;
-   int count;
-   enum arm_smmu_s2cr_type type;
-   enum arm_smmu_s2cr_privcfg  privcfg;
-   u8  cbndx;
-};
-
 #define s2cr_init_val (struct arm_smmu_s2cr){  \
.type = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS,\
 }
 
-struct arm_smmu_smr {
-   u16 mask;
-   u16 id;
-   boolvalid;
-};
-
-struct arm_smmu_cb {
-   u64 ttbr[2];
-   u32 tcr[2];
-   u32 mair[2];
-   struct arm_smmu_cfg *cfg;
-};
-
-struct arm_smmu_master_cfg {
-   struct arm_smmu_device  *smmu;
-   s16 smendx[];
-};
-#define INVALID_SMENDX -1
-#define cfg_smendx(cfg, fw, i) \
-   (i >= fw->num_ids ? INVALID_SMENDX : cfg->smendx[i])
-#define for_each_cfg_sme(cfg, fw, i, idx) \
-   for (i = 0; idx = cfg_smendx(cfg, fw, i), i < fw->num_ids; ++i)
-
 static bool using_legacy_binding, using_generic_binding;
 
 static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
@@ -234,19 +203,6 @@ static int arm_smmu_register_legacy_master(struct device 
*dev,
 }
 #endif /* CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS */
 
-static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
-{
-   int idx;
-
-   do {
-   idx = find_next_zero_bit(map, end, start);
-   if (idx == end)
-   return -ENOSPC;
-   } while (test_and_set_bit(idx, map));
-
-   return idx;
-}
-
 static void __arm_smmu_free_bitmap(unsigned long *map, int idx)
 {
clear_bit(idx, map);
@@ -578,7 +534,7 @@ static void arm_smmu_init_context_bank(struct 
arm_smmu_domain *smmu_domain,
}
 }
 
-static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
+void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 {
u32 reg;
bool stage1;
@@ -665,7 +621,8 @@ static void arm_smmu_write_context_bank(struct 
arm_smmu_device *smmu, int idx)
 }
 
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
-   struct arm_smmu_device *smmu)
+   struct arm_smmu_device *smmu,
+   struct device *dev)
 {
int irq, start, ret = 0;
unsigned long ias, oas;
@@ -780,10 +737,20 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
ret = -EINVAL;
goto out_unlock;
}
-   ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
+
+   smmu_domain->smmu = smmu;
+
+   if (smmu->impl && smmu->impl->alloc_context_bank)
+ 

[PATCH v16 10/20] drm/msm/a6xx: Add support for per-instance pagetables

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Add support for using per-instance pagetables if all the dependencies are
available.
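
One note to help read the hunk below (illustrative only, not part of the
patch): the debug write to the ring's ttbr0 memstore packs the ASID and the
upper TTBR0 bits into the second 32-bit word, so a post-mortem dump can
recover both values with something like:

/* word0 = ttbr[31:0], word1 = (asid << 16) | ttbr[47:32] */
static void example_decode_ttbr0_memstore(u32 word0, u32 word1,
                                          u64 *ttbr, u32 *asid)
{
        *ttbr = (u64)word0 | ((u64)(word1 & 0xffff) << 32);
        *asid = word1 >> 16;
}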

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Akhil P Oommen 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
 3 files changed, 65 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 5eabb0109577..d7ad6c78d787 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -81,6 +81,49 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
OUT_RING(ring, upper_32_bits(iova));
 }
 
+static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
+   struct msm_ringbuffer *ring, struct msm_file_private *ctx)
+{
+   phys_addr_t ttbr;
+   u32 asid;
+   u64 memptr = rbmemptr(ring, ttbr0);
+
+   if (ctx == a6xx_gpu->cur_ctx)
+   return;
+
+   if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
+   return;
+
+   /* Execute the table update */
+   OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr)));
+
+   OUT_RING(ring,
+   CP_SMMU_TABLE_UPDATE_1_TTBR0_HI(upper_32_bits(ttbr)) |
+   CP_SMMU_TABLE_UPDATE_1_ASID(asid));
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_2_CONTEXTIDR(0));
+   OUT_RING(ring, CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(0));
+
+   /*
+* Write the new TTBR0 to the memstore. This is good for debugging.
+*/
+   OUT_PKT7(ring, CP_MEM_WRITE, 4);
+   OUT_RING(ring, CP_MEM_WRITE_0_ADDR_LO(lower_32_bits(memptr)));
+   OUT_RING(ring, CP_MEM_WRITE_1_ADDR_HI(upper_32_bits(memptr)));
+   OUT_RING(ring, lower_32_bits(ttbr));
+   OUT_RING(ring, (asid << 16) | upper_32_bits(ttbr));
+
+   /*
+* And finally, trigger a uche flush to be sure there isn't anything
+* lingering in that part of the GPU
+*/
+
+   OUT_PKT7(ring, CP_EVENT_WRITE, 1);
+   OUT_RING(ring, 0x31);
+
+   a6xx_gpu->cur_ctx = ctx;
+}
+
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT;
@@ -90,6 +133,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
struct msm_ringbuffer *ring = submit->ring;
unsigned int i;
 
+   a6xx_set_pagetable(a6xx_gpu, ring, submit->queue->ctx);
+
get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -696,6 +741,8 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
/* Always come up on rb 0 */
a6xx_gpu->cur_ring = gpu->rb[0];
 
+   a6xx_gpu->cur_ctx = NULL;
+
/* Enable the SQE_to start the CP engine */
gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 1);
 
@@ -1008,6 +1055,21 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *
+a6xx_create_private_address_space(struct msm_gpu *gpu)
+{
+   struct msm_gem_address_space *aspace = NULL;
+   struct msm_mmu *mmu;
+
+   mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
+
+   if (!IS_ERR(mmu))
+   aspace = msm_gem_address_space_create(mmu,
+   "gpu", 0x1ULL, 0x1ULL);
+
+   return aspace;
+}
+
 static const struct adreno_gpu_funcs funcs = {
.base = {
.get_param = adreno_get_param,
@@ -1031,6 +1093,7 @@ static const struct adreno_gpu_funcs funcs = {
.gpu_state_put = a6xx_gpu_state_put,
 #endif
.create_address_space = adreno_iommu_create_address_space,
+   .create_private_address_space = 
a6xx_create_private_address_space,
},
.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 03ba60d5b07f..da22d7549d9b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -19,6 +19,7 @@ struct a6xx_gpu {
uint64_t sqe_iova;
 
struct msm_ringbuffer *cur_ring;
+   struct msm_file_private *cur_ctx;
 
struct a6xx_gmu gmu;
 };
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h 
b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 7764373d0ed2..0987d6bf848c 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -31,6 +31,7 @@ struct msm_rbmemptrs {
volatile uint32_t fence;
 
volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
+   volatile u64 ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.26.2


[PATCH v16 15/20] iommu/arm-smmu: Constify some helpers

2020-09-01 Thread Rob Clark
From: Rob Clark 

Sprinkle a few `const`s where helpers don't need write access.

Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 59ff3fc5c6c8..27c8fc50 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -377,7 +377,7 @@ struct arm_smmu_master_cfg {
s16 smendx[];
 };
 
-static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
+static inline u32 arm_smmu_lpae_tcr(const struct io_pgtable_cfg *cfg)
 {
u32 tcr = FIELD_PREP(ARM_SMMU_TCR_TG0, cfg->arm_lpae_s1_cfg.tcr.tg) |
FIELD_PREP(ARM_SMMU_TCR_SH0, cfg->arm_lpae_s1_cfg.tcr.sh) |
@@ -398,13 +398,13 @@ static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg 
*cfg)
return tcr;
 }
 
-static inline u32 arm_smmu_lpae_tcr2(struct io_pgtable_cfg *cfg)
+static inline u32 arm_smmu_lpae_tcr2(const struct io_pgtable_cfg *cfg)
 {
return FIELD_PREP(ARM_SMMU_TCR2_PASIZE, cfg->arm_lpae_s1_cfg.tcr.ips) |
   FIELD_PREP(ARM_SMMU_TCR2_SEP, ARM_SMMU_TCR2_SEP_UPSTREAM);
 }
 
-static inline u32 arm_smmu_lpae_vtcr(struct io_pgtable_cfg *cfg)
+static inline u32 arm_smmu_lpae_vtcr(const struct io_pgtable_cfg *cfg)
 {
return ARM_SMMU_VTCR_RES1 |
   FIELD_PREP(ARM_SMMU_VTCR_PS, cfg->arm_lpae_s2_cfg.vtcr.ps) |
-- 
2.26.2


[PATCH v16 18/20] dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Every Qcom Adreno GPU has an embedded SMMU for its own use. These
devices depend on unique features such as split pagetables,
different stall/halt requirements and other settings. Give them a
dedicated compatible string so that they can be matched in the
arm-smmu implementation-specific code.

Signed-off-by: Jordan Crouse 
Reviewed-by: Rob Herring 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 503160a7b9a0..3b63f2ae24db 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -28,8 +28,6 @@ properties:
   - enum:
   - qcom,msm8996-smmu-v2
   - qcom,msm8998-smmu-v2
-  - qcom,sc7180-smmu-v2
-  - qcom,sdm845-smmu-v2
   - const: qcom,smmu-v2
 
   - description: Qcom SoCs implementing "arm,mmu-500"
@@ -40,6 +38,13 @@ properties:
   - qcom,sm8150-smmu-500
   - qcom,sm8250-smmu-500
   - const: arm,mmu-500
+  - description: Qcom Adreno GPUs implementing "arm,smmu-v2"
+items:
+  - enum:
+  - qcom,sc7180-smmu-v2
+  - qcom,sdm845-smmu-v2
+  - const: qcom,adreno-smmu
+  - const: qcom,smmu-v2
   - description: Marvell SoCs implementing "arm,mmu-500"
 items:
   - const: marvell,ap806-smmu-500
-- 
2.26.2


[PATCH v16 16/20] iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Add a special implementation for the SMMU attached to most Adreno GPU
targets, triggered by the qcom,adreno-smmu compatible string.

The new Adreno SMMU implementation will enable split pagetables
(TTBR1) for the domain attached to the GPU device (SID 0) and
hard-code it to context bank 0 so the GPU hardware can implement
per-instance pagetables.
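
Since the qcom_adreno_smmu_init_context() hunk is cut off in this archive,
here is a rough sketch of the wiring described above; the shape follows the
visible hunks and the private interface from patch 02/20, but treat the
function as illustrative rather than the literal patch.

static int example_adreno_init_context(struct arm_smmu_domain *smmu_domain,
                                       struct io_pgtable_cfg *pgtbl_cfg,
                                       struct device *dev)
{
        struct adreno_smmu_priv *priv = dev_get_drvdata(dev);

        /* Only the GPU (SID 0) gets the TTBR1 split-pagetable treatment */
        if (qcom_adreno_smmu_is_gpu_device(dev) &&
            smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64)
                pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;

        /* Hand the private interface to the GPU driver via its drvdata */
        priv->cookie = smmu_domain;
        priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
        priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;

        return 0;
}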

Co-developed-by: Rob Clark 
Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |   3 +
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 149 -
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |   1 +
 3 files changed, 151 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 88f17cc33023..d199b4bff15d 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -223,6 +223,9 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
arm_smmu_device *smmu)
of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
return qcom_smmu_impl_init(smmu);
 
+   if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
+   return qcom_adreno_smmu_impl_init(smmu);
+
if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
smmu->impl = &mmu500_impl;
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index be4318044f96..5640d9960610 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2019, The Linux Foundation. All rights reserved.
  */
 
+#include 
 #include 
 #include 
 
@@ -12,6 +13,132 @@ struct qcom_smmu {
struct arm_smmu_device smmu;
 };
 
+#define QCOM_ADRENO_SMMU_GPU_SID 0
+
+static bool qcom_adreno_smmu_is_gpu_device(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   int i;
+
+   /*
+* The GPU will always use SID 0 so that is a handy way to uniquely
+* identify it and configure it for per-instance pagetables
+*/
+   for (i = 0; i < fwspec->num_ids; i++) {
+   u16 sid = FIELD_GET(ARM_SMMU_SMR_ID, fwspec->ids[i]);
+
+   if (sid == QCOM_ADRENO_SMMU_GPU_SID)
+   return true;
+   }
+
+   return false;
+}
+
+static const struct io_pgtable_cfg *qcom_adreno_smmu_get_ttbr1_cfg(
+   const void *cookie)
+{
+   struct arm_smmu_domain *smmu_domain = (void *)cookie;
+   struct io_pgtable *pgtable =
+   io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+   return &pgtable->cfg;
+}
+
+/*
+ * Local implementation to configure TTBR0 with the specified pagetable config.
+ * The GPU driver will call this to enable TTBR0 when per-instance pagetables
+ * are active
+ */
+
+static int qcom_adreno_smmu_set_ttbr0_cfg(const void *cookie,
+   const struct io_pgtable_cfg *pgtbl_cfg)
+{
+   struct arm_smmu_domain *smmu_domain = (void *)cookie;
+   struct io_pgtable *pgtable = 
io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+   /* The domain must have split pagetables already enabled */
+   if (cb->tcr[0] & ARM_SMMU_TCR_EPD1)
+   return -EINVAL;
+
+   /* If the pagetable config is NULL, disable TTBR0 */
+   if (!pgtbl_cfg) {
+   /* Do nothing if it is already disabled */
+   if ((cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   /* Set TCR to the original configuration */
+   cb->tcr[0] = arm_smmu_lpae_tcr(&pgtable->cfg);
+   cb->ttbr[0] = FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   } else {
+   u32 tcr = cb->tcr[0];
+
+   /* Don't call this again if TTBR0 is already enabled */
+   if (!(cb->tcr[0] & ARM_SMMU_TCR_EPD0))
+   return -EINVAL;
+
+   tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+   tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+   cb->tcr[0] = tcr;
+   cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+   cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+   }
+
+   arm_smmu_write_context_bank(smmu_domain->smmu, cb->cfg->cbndx);
+
+   return 0;
+}
+
+static int qcom_adreno_smmu_alloc_context_bank(struct arm_smmu_domain 
*smmu_domain,
+   struct device *dev, int start, int count)
+{
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   /*
+* Assign context bank 0 to the GPU device so the GPU hardware can
+* switch pagetables
+*/
+   if (qcom_adreno_smmu_is_gpu_device(dev)) {
+   start = 0;
+   count = 1;
+   

[PATCH v16 17/20] iommu/arm-smmu: Add a way for implementations to influence SCTLR

2020-09-01 Thread Rob Clark
From: Rob Clark 

For the Adreno GPU's SMMU, we want SCTLR.HUPCF set to ensure that
pending translations are not terminated on iova fault.  Otherwise
a terminated CP read could hang the GPU by returning invalid
command-stream data.

Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.c  | 3 +++
 drivers/iommu/arm/arm-smmu/arm-smmu.h  | 3 +++
 3 files changed, 12 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 5640d9960610..2aa6249050ff 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -127,6 +127,12 @@ static int qcom_adreno_smmu_init_context(struct 
arm_smmu_domain *smmu_domain,
(smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64))
pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;
 
+   /*
+* On the GPU device we want to process subsequent transactions after a
+* fault to keep the GPU from hanging
+*/
+   smmu_domain->cfg.sctlr_set |= ARM_SMMU_SCTLR_HUPCF;
+
/*
 * Initialize private interface with GPU:
 */
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 68b7b9e6140e..1773f54a7464 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -617,6 +617,9 @@ void arm_smmu_write_context_bank(struct arm_smmu_device 
*smmu, int idx)
if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
reg |= ARM_SMMU_SCTLR_E;
 
+   reg |= cfg->sctlr_set;
+   reg &= ~cfg->sctlr_clr;
+
arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index cd75a33967bb..2df3a70a8a41 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -144,6 +144,7 @@ enum arm_smmu_cbar_type {
 #define ARM_SMMU_CB_SCTLR  0x0
 #define ARM_SMMU_SCTLR_S1_ASIDPNE  BIT(12)
 #define ARM_SMMU_SCTLR_CFCFG   BIT(7)
+#define ARM_SMMU_SCTLR_HUPCF   BIT(8)
 #define ARM_SMMU_SCTLR_CFIEBIT(6)
 #define ARM_SMMU_SCTLR_CFREBIT(5)
 #define ARM_SMMU_SCTLR_E   BIT(4)
@@ -341,6 +342,8 @@ struct arm_smmu_cfg {
u16 asid;
u16 vmid;
};
+   u32 sctlr_set;/* extra bits to set in 
SCTLR */
+   u32 sctlr_clr;/* bits to mask in SCTLR 
*/
enum arm_smmu_cbar_type cbar;
enum arm_smmu_context_fmt   fmt;
 };
-- 
2.26.2


[PATCH v16 12/20] iommu/arm-smmu: Pass io-pgtable config to implementation specific function

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Construct the io-pgtable config before calling the implementation-specific
init_context function, and pass it in so that the implementation-specific
function gets a chance to change it before the io-pgtable is created.
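
A tiny sketch of the kind of hook this enables (illustrative only): because
the config is now fully constructed before the hook runs, an implementation
can adjust it, for example to request split pagetables the way the adreno
implementation does later in this series.

static int example_init_context(struct arm_smmu_domain *smmu_domain,
                                struct io_pgtable_cfg *pgtbl_cfg)
{
        if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64)
                pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;

        return 0;
}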

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c |  3 ++-
 drivers/iommu/arm/arm-smmu/arm-smmu.c  | 11 ++-
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  3 ++-
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index f4ff124a1967..a9861dcd0884 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -68,7 +68,8 @@ static int cavium_cfg_probe(struct arm_smmu_device *smmu)
return 0;
 }
 
-static int cavium_init_context(struct arm_smmu_domain *smmu_domain)
+static int cavium_init_context(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *pgtbl_cfg)
 {
struct cavium_smmu *cs = container_of(smmu_domain->smmu,
  struct cavium_smmu, smmu);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 09c42af9f31e..37d8d49299b4 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -795,11 +795,6 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
cfg->asid = cfg->cbndx;
 
smmu_domain->smmu = smmu;
-   if (smmu->impl && smmu->impl->init_context) {
-   ret = smmu->impl->init_context(smmu_domain);
-   if (ret)
-   goto out_unlock;
-   }
 
pgtbl_cfg = (struct io_pgtable_cfg) {
.pgsize_bitmap  = smmu->pgsize_bitmap,
@@ -810,6 +805,12 @@ static int arm_smmu_init_domain_context(struct 
iommu_domain *domain,
.iommu_dev  = smmu->dev,
};
 
+   if (smmu->impl && smmu->impl->init_context) {
+   ret = smmu->impl->init_context(smmu_domain, &pgtbl_cfg);
+   if (ret)
+   goto out_clear_smmu;
+   }
+
if (smmu_domain->non_strict)
pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index d890a4a968e8..83294516ac08 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -386,7 +386,8 @@ struct arm_smmu_impl {
u64 val);
int (*cfg_probe)(struct arm_smmu_device *smmu);
int (*reset)(struct arm_smmu_device *smmu);
-   int (*init_context)(struct arm_smmu_domain *smmu_domain);
+   int (*init_context)(struct arm_smmu_domain *smmu_domain,
+   struct io_pgtable_cfg *cfg);
void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
 int status);
int (*def_domain_type)(struct device *dev);
-- 
2.26.2


[PATCH v16 08/20] drm/msm: Add support to create a local pagetable

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Add support to create an io-pgtable for use by targets that support
per-instance pagetables. In order to support per-instance pagetables the
GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
split pagetables enabled.
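
Because the tail of this diff is truncated in the archive, here is a hedged
usage sketch from a target driver's point of view; the wrapper function is
hypothetical, while the two msm_iommu_* calls are the ones added below.

static struct msm_mmu *example_create_private_pt(struct msm_gpu *gpu,
                                                 phys_addr_t *ttbr, int *asid)
{
        /* Child pagetable hanging off the GPU's global IOMMU */
        struct msm_mmu *pt = msm_iommu_pagetable_create(gpu->aspace->mmu);

        if (IS_ERR(pt))
                return pt;

        /* TTBR/ASID that the GPU must program via CP_SMMU_TABLE_UPDATE */
        if (msm_iommu_pagetable_params(pt, ttbr, asid)) {
                pt->funcs->destroy(pt);
                return ERR_PTR(-EINVAL);
        }

        return pt;
}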

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/Kconfig  |   1 +
 drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c  | 199 ++-
 drivers/gpu/drm/msm/msm_mmu.h|  16 ++-
 4 files changed, 215 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 6deaa7d01654..5102a58830b9 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -8,6 +8,7 @@ config DRM_MSM
depends on MMU
depends on INTERCONNECT || !INTERCONNECT
depends on QCOM_OCMEM || QCOM_OCMEM=n
+   select IOMMU_IO_PGTABLE
select QCOM_MDT_LOADER if ARCH_QCOM
select REGULATOR
select DRM_KMS_HELPER
diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
index 310a31b05faa..aab121f4beb7 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct 
msm_gpu *gpu)
}
 
gpummu->gpu = gpu;
-   msm_mmu_init(&gpummu->base, dev, &funcs);
+   msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
 
 	return &gpummu->base;
 }
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 1b6635504069..697cc0a059d6 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -4,15 +4,210 @@
  * Author: Rob Clark 
  */
 
+#include 
+#include 
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
 struct msm_iommu {
struct msm_mmu base;
struct iommu_domain *domain;
+   atomic_t pagetables;
 };
+
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
+struct msm_iommu_pagetable {
+   struct msm_mmu base;
+   struct msm_mmu *parent;
+   struct io_pgtable_ops *pgtbl_ops;
+   phys_addr_t ttbr;
+   u32 asid;
+};
+static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
+{
+   return container_of(mmu, struct msm_iommu_pagetable, base);
+}
+
+static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
+   size_t size)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   size_t unmapped = 0;
+
+   /* Unmap the block one page at a time */
+   while (size) {
+   unmapped += ops->unmap(ops, iova, 4096, NULL);
+   iova += 4096;
+   size -= 4096;
+   }
+
+   iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
+
+   return (unmapped == size) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
+   struct sg_table *sgt, size_t len, int prot)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+   struct scatterlist *sg;
+   size_t mapped = 0;
+   u64 addr = iova;
+   unsigned int i;
+
+   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   size_t size = sg->length;
+   phys_addr_t phys = sg_phys(sg);
+
+   /* Map the block one page at a time */
+   while (size) {
+   if (ops->map(ops, addr, phys, 4096, prot, GFP_KERNEL)) {
+   msm_iommu_pagetable_unmap(mmu, iova, mapped);
+   return -EINVAL;
+   }
+
+   phys += 4096;
+   addr += 4096;
+   size -= 4096;
+   mapped += 4096;
+   }
+   }
+
+   return 0;
+}
+
+static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
+{
+   struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+   struct msm_iommu *iommu = to_msm_iommu(pagetable->parent);
+   struct adreno_smmu_priv *adreno_smmu =
+   dev_get_drvdata(pagetable->parent->dev);
+
+   /*
+* If this is the last attached pagetable for the parent,
+* disable TTBR0 in the arm-smmu driver
+*/
+   if (atomic_dec_return(&iommu->pagetables) == 0)
+   adreno_smmu->set_ttbr0_cfg(adreno_smmu->cookie, NULL);
+
+   free_io_pgtable_ops(pagetable->pgtbl_ops);
+   kfree(pagetable);
+}
+
+int msm_iommu_pagetable_params(struct msm_mmu *mmu,
+   phys_addr_t *ttbr, int *asid)
+{
+   struct msm_iommu_pagetable *pagetable;
+
+   if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
+   return -EINVAL;
+
+   pagetable = to_pagetable(mmu);
+
+   if (ttbr)
+   *ttbr = pagetable->ttbr;
+
+   if (asid)
+   *asid = 

[PATCH v16 09/20] drm/msm: Add support for private address space instances

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Add support for allocating private address space instances. Targets that
support per-context pagetables should implement their own function to
allocate private address spaces.

The default will return a pointer to the global address space.

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/msm_drv.c | 13 +++--
 drivers/gpu/drm/msm/msm_drv.h |  5 +
 drivers/gpu/drm/msm/msm_gem_vma.c |  9 +
 drivers/gpu/drm/msm/msm_gpu.c | 22 ++
 drivers/gpu/drm/msm/msm_gpu.h |  5 +
 5 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 75cd7639f560..7e963f707852 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -597,7 +597,7 @@ static int context_init(struct drm_device *dev, struct 
drm_file *file)
kref_init(>ref);
msm_submitqueue_init(dev, ctx);
 
-   ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
+   ctx->aspace = msm_gpu_create_private_address_space(priv->gpu);
file->driver_priv = ctx;
 
return 0;
@@ -780,18 +780,19 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, 
void *data,
 }
 
 static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-   struct drm_gem_object *obj, uint64_t *iova)
+   struct drm_file *file, struct drm_gem_object *obj,
+   uint64_t *iova)
 {
-   struct msm_drm_private *priv = dev->dev_private;
+   struct msm_file_private *ctx = file->driver_priv;
 
-   if (!priv->gpu)
+   if (!ctx->aspace)
return -EINVAL;
 
/*
 * Don't pin the memory here - just get an address so that userspace can
 * be productive
 */
-   return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
+   return msm_gem_get_iova(obj, ctx->aspace, iova);
 }
 
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
@@ -830,7 +831,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
args->value = msm_gem_mmap_offset(obj);
break;
case MSM_INFO_GET_IOVA:
-   ret = msm_ioctl_gem_info_iova(dev, obj, &args->value);
+   ret = msm_ioctl_gem_info_iova(dev, file, obj, &args->value);
break;
case MSM_INFO_SET_NAME:
/* length check should leave room for terminating null: */
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 4561bfb5e745..2ca9c3c03845 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -249,6 +249,10 @@ int msm_gem_map_vma(struct msm_gem_address_space *aspace,
 void msm_gem_close_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma);
 
+
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace);
+
 void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 
 struct msm_gem_address_space *
@@ -434,6 +438,7 @@ static inline void __msm_file_private_destroy(struct kref 
*kref)
struct msm_file_private *ctx = container_of(kref,
struct msm_file_private, ref);
 
+   msm_gem_address_space_put(ctx->aspace);
kfree(ctx);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c 
b/drivers/gpu/drm/msm/msm_gem_vma.c
index 5f6a11211b64..29cc1305cf37 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -27,6 +27,15 @@ void msm_gem_address_space_put(struct msm_gem_address_space 
*aspace)
kref_put(&aspace->kref, msm_gem_address_space_destroy);
 }
 
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace)
+{
+   if (!IS_ERR_OR_NULL(aspace))
+   kref_get(&aspace->kref);
+
+   return aspace;
+}
+
 /* Actually unmap memory for the vma */
 void msm_gem_purge_vma(struct msm_gem_address_space *aspace,
struct msm_gem_vma *vma)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index e1a3cbe25a0c..951850804d77 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -823,6 +823,28 @@ static int get_clocks(struct platform_device *pdev, struct 
msm_gpu *gpu)
return 0;
 }
 
+/* Return a new address space for a msm_drm_private instance */
+struct msm_gem_address_space *
+msm_gpu_create_private_address_space(struct msm_gpu *gpu)
+{
+   struct msm_gem_address_space *aspace = NULL;
+
+   if (!gpu)
+   return NULL;
+
+   /*
+* If the target doesn't support private address spaces then return
+* the global one
+*/
+   if (gpu->funcs->create_private_address_space)
+   aspace = gpu->funcs->create_private_address_space(gpu);
+
+   if (IS_ERR_OR_NULL(aspace))
+   aspace = msm_gem_address_space_get(gpu->aspace);
+
+   return aspace;

[PATCH v16 07/20] drm/msm: Set the global virtual address range from the IOMMU domain

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Use the aperture settings from the IOMMU domain to set up the virtual
address range for the GPU. This allows us to transparently deal with
IOMMU-side features (like split pagetables).
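
A worked example of the sign-extension rule in the msm_iommu hunk below,
assuming a 48-bit aperture: any IOVA with bit 48 set must also have bits
63:49 set before it is handed to the arm-smmu driver.

/* e.g. 0x0001000000000000 -> 0xffff000000000000 */
static u64 example_sign_extend_iova(u64 iova)
{
        if (iova & BIT_ULL(48))
                iova |= GENMASK_ULL(63, 49);
        return iova;
}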

Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 13 +++--
 drivers/gpu/drm/msm/msm_iommu.c |  7 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 533a34b4cce2..34e6242c1767 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -192,9 +192,18 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
struct msm_gem_address_space *aspace;
+   u64 start, size;
 
-   aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
-   0x - SZ_16M);
+   /*
+* Use the aperture start or SZ_16M, whichever is greater. This will
+* ensure that we align with the allocated pagetable range while still
+* allowing room in the lower 32 bits for GMEM and whatnot
+*/
+   start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+   size = iommu->geometry.aperture_end - start + 1;
+
+   aspace = msm_gem_address_space_create(mmu, "gpu",
+   start & GENMASK(48, 0), size);
 
if (IS_ERR(aspace) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..1b6635504069 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,6 +36,10 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
 
+   /* The arm-smmu driver expects the addresses to be sign extended */
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
WARN_ON(!ret);
 
@@ -46,6 +50,9 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t 
iova, size_t len)
 {
struct msm_iommu *iommu = to_msm_iommu(mmu);
 
+   if (iova & BIT_ULL(48))
+   iova |= GENMASK_ULL(63, 49);
+
iommu_unmap(iommu->domain, iova, len);
 
return 0;
-- 
2.26.2


[PATCH v16 03/20] drm/msm/gpu: Add dev_to_gpu() helper

2020-09-01 Thread Rob Clark
From: Rob Clark 

In a later patch, the drvdata will not directly be 'struct msm_gpu *',
so add a helper to reduce the churn.

Signed-off-by: Rob Clark 
Reviewed-by: Jordan Crouse 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 10 --
 drivers/gpu/drm/msm/msm_gpu.c  |  6 +++---
 drivers/gpu/drm/msm/msm_gpu.h  |  5 +
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 9eeb46bf2a5d..26664e1b30c0 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -282,7 +282,7 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
int ret;
 
if (pdev)
-   gpu = platform_get_drvdata(pdev);
+   gpu = dev_to_gpu(&pdev->dev);
 
if (!gpu) {
dev_err_once(dev->dev, "no GPU device was found\n");
@@ -425,7 +425,7 @@ static int adreno_bind(struct device *dev, struct device 
*master, void *data)
 static void adreno_unbind(struct device *dev, struct device *master,
void *data)
 {
-   struct msm_gpu *gpu = dev_get_drvdata(dev);
+   struct msm_gpu *gpu = dev_to_gpu(dev);
 
pm_runtime_force_suspend(dev);
gpu->funcs->destroy(gpu);
@@ -490,16 +490,14 @@ static const struct of_device_id dt_match[] = {
 #ifdef CONFIG_PM
 static int adreno_resume(struct device *dev)
 {
-   struct platform_device *pdev = to_platform_device(dev);
-   struct msm_gpu *gpu = platform_get_drvdata(pdev);
+   struct msm_gpu *gpu = dev_to_gpu(dev);
 
return gpu->funcs->pm_resume(gpu);
 }
 
 static int adreno_suspend(struct device *dev)
 {
-   struct platform_device *pdev = to_platform_device(dev);
-   struct msm_gpu *gpu = platform_get_drvdata(pdev);
+   struct msm_gpu *gpu = dev_to_gpu(dev);
 
return gpu->funcs->pm_suspend(gpu);
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index d5645472b25d..6aa9e04e52e7 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -24,7 +24,7 @@
 static int msm_devfreq_target(struct device *dev, unsigned long *freq,
u32 flags)
 {
-   struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
+   struct msm_gpu *gpu = dev_to_gpu(dev);
struct dev_pm_opp *opp;
 
opp = devfreq_recommended_opp(dev, freq, flags);
@@ -45,7 +45,7 @@ static int msm_devfreq_target(struct device *dev, unsigned 
long *freq,
 static int msm_devfreq_get_dev_status(struct device *dev,
struct devfreq_dev_status *status)
 {
-   struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
+   struct msm_gpu *gpu = dev_to_gpu(dev);
ktime_t time;
 
if (gpu->funcs->gpu_get_freq)
@@ -64,7 +64,7 @@ static int msm_devfreq_get_dev_status(struct device *dev,
 
 static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
 {
-   struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
+   struct msm_gpu *gpu = dev_to_gpu(dev);
 
if (gpu->funcs->gpu_get_freq)
*freq = gpu->funcs->gpu_get_freq(gpu);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 0db117a7339b..8bda7beaed4b 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -141,6 +141,11 @@ struct msm_gpu {
struct msm_gpu_state *crashstate;
 };
 
+static inline struct msm_gpu *dev_to_gpu(struct device *dev)
+{
+   return dev_get_drvdata(dev);
+}
+
 /* It turns out that all targets use the same ringbuffer size */
 #define MSM_GPU_RINGBUFFER_SZ SZ_32K
 #define MSM_GPU_RINGBUFFER_BLKSIZE 32
-- 
2.26.2


[PATCH v16 01/20] drm/msm: Remove dangling submitqueue references

2020-09-01 Thread Rob Clark
From: Rob Clark 

Currently it doesn't matter, since we free the ctx immediately.  But
when we start refcnt'ing the ctx, we don't want old dangling list
entries to hang around.

Signed-off-by: Rob Clark 
Reviewed-by: Jordan Crouse 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/msm_submitqueue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c 
b/drivers/gpu/drm/msm/msm_submitqueue.c
index a1d94be7883a..90c9d84e6155 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -49,8 +49,10 @@ void msm_submitqueue_close(struct msm_file_private *ctx)
 * No lock needed in close and there won't
 * be any more user ioctls coming our way
 */
-   list_for_each_entry_safe(entry, tmp, >submitqueues, node)
+   list_for_each_entry_safe(entry, tmp, >submitqueues, node) {
+   list_del(>node);
msm_submitqueue_put(entry);
+   }
 }
 
 int msm_submitqueue_create(struct drm_device *drm, struct msm_file_private 
*ctx,
-- 
2.26.2


[PATCH v16 06/20] drm/msm: Drop context arg to gpu->submit()

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Now that we can get the ctx from the submitqueue, the extra arg is
redundant.

Signed-off-by: Jordan Crouse 
[split out of previous patch to reduce churny noise]
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   | 12 +---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  5 ++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  5 ++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  3 +--
 drivers/gpu/drm/msm/msm_gem_submit.c|  2 +-
 drivers/gpu/drm/msm/msm_gpu.c   |  9 -
 drivers/gpu/drm/msm/msm_gpu.h   |  6 ++
 7 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 9e63a190642c..eff2439ea57b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -43,8 +43,7 @@ static void a5xx_flush(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
gpu_write(gpu, REG_A5XX_CP_RB_WPTR, wptr);
 }
 
-static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit 
*submit,
-   struct msm_file_private *ctx)
+static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit 
*submit)
 {
struct msm_drm_private *priv = gpu->dev->dev_private;
struct msm_ringbuffer *ring = submit->ring;
@@ -57,7 +56,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
msm_gem_submit *submit
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
@@ -103,8 +102,7 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct 
msm_gem_submit *submit
msm_gpu_retire(gpu);
 }
 
-static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx)
+static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
@@ -114,7 +112,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
 
if (IS_ENABLED(CONFIG_DRM_MSM_GPU_SUDO) && submit->in_rb) {
priv->lastctx = NULL;
-   a5xx_submit_in_rb(gpu, submit, ctx);
+   a5xx_submit_in_rb(gpu, submit);
return;
}
 
@@ -148,7 +146,7 @@ static void a5xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c5a3e4d4c007..5eabb0109577 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -81,8 +81,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, 
u32 counter,
OUT_RING(ring, upper_32_bits(iova));
 }
 
-static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx)
+static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
unsigned int index = submit->seqno % MSM_GPU_SUBMIT_STATS_COUNT;
struct msm_drm_private *priv = gpu->dev->dev_private;
@@ -115,7 +114,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
case MSM_SUBMIT_CMD_IB_TARGET_BUF:
break;
case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
-   if (priv->lastctx == ctx)
+   if (priv->lastctx == submit->queue->ctx)
break;
/* fall-thru */
case MSM_SUBMIT_CMD_BUF:
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index d2dbb6968cba..533a34b4cce2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -457,8 +457,7 @@ void adreno_recover(struct msm_gpu *gpu)
}
 }
 
-void adreno_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-   struct msm_file_private *ctx)
+void adreno_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct msm_drm_private *priv = gpu->dev->dev_private;
@@ -472,7 +471,7 @@ void adreno_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit,
break;
 

[PATCH v16 04/20] drm/msm: Set adreno_smmu as gpu's drvdata

2020-09-01 Thread Rob Clark
From: Rob Clark 

This will be populated by adreno-smmu, to provide a way for coordinating
enabling/disabling TTBR0 translation.
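
A brief consumer-side sketch (assuming gpu_dev is the GPU platform device
whose drvdata is set below): drvdata now points at the embedded
adreno_smmu_priv rather than the msm_gpu itself, so the SMMU side can reach
the shared interface directly, while dev_to_gpu() recovers the msm_gpu with
container_of().

static const struct io_pgtable_cfg *example_peek_ttbr1_cfg(struct device *gpu_dev)
{
        struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(gpu_dev);

        return adreno_smmu->get_ttbr1_cfg(adreno_smmu->cookie);
}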

Signed-off-by: Rob Clark 
Reviewed-by: Jordan Crouse 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 2 --
 drivers/gpu/drm/msm/msm_gpu.c  | 2 +-
 drivers/gpu/drm/msm/msm_gpu.h  | 6 +-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 26664e1b30c0..58e03b20e1c7 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -417,8 +417,6 @@ static int adreno_bind(struct device *dev, struct device 
*master, void *data)
return PTR_ERR(gpu);
}
 
-   dev_set_drvdata(dev, gpu);
-
return 0;
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 6aa9e04e52e7..806eb0957280 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -892,7 +892,7 @@ int msm_gpu_init(struct drm_device *drm, struct 
platform_device *pdev,
gpu->gpu_cx = NULL;
 
gpu->pdev = pdev;
-   platform_set_drvdata(pdev, gpu);
+   platform_set_drvdata(pdev, &gpu->adreno_smmu);
 
msm_devfreq_init(gpu);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 8bda7beaed4b..f91b141add75 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -7,6 +7,7 @@
 #ifndef __MSM_GPU_H__
 #define __MSM_GPU_H__
 
+#include 
 #include 
 #include 
 #include 
@@ -73,6 +74,8 @@ struct msm_gpu {
struct platform_device *pdev;
const struct msm_gpu_funcs *funcs;
 
+   struct adreno_smmu_priv adreno_smmu;
+
/* performance counters (hw & sw): */
spinlock_t perf_lock;
bool perfcntr_active;
@@ -143,7 +146,8 @@ struct msm_gpu {
 
 static inline struct msm_gpu *dev_to_gpu(struct device *dev)
 {
-   return dev_get_drvdata(dev);
+   struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(dev);
+   return container_of(adreno_smmu, struct msm_gpu, adreno_smmu);
 }
 
 /* It turns out that all targets use the same ringbuffer size */
-- 
2.26.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v16 00/20] iommu/arm-smmu + drm/msm: per-process GPU pgtables

2020-09-01 Thread Rob Clark
From: Rob Clark 

NOTE: I have re-ordered the series, and propose that we could merge this
  series in the following order:

   1) 01-11 - merge via drm / msm-next
   2) 12-15 - merge via iommu, no dependency on msm-next pull req
   3) 16-18 - patch 16 has a dependency on 02 and 04, so it would
  need to come post -rc1 or on following cycle, but I
  think it would be unlikely to conflict with other
  arm-smmu patches (other than Bjorn's smmu handover
  series?)
   4) 19-20 - dt bits should be safe to land in any order without
  breaking anything



This series adds an Adreno SMMU implementation to arm-smmu to allow GPU hardware
pagetable switching.

The Adreno GPU has built-in capabilities to switch the TTBR0 pagetable at
runtime to allow each individual instance or application to have its own
pagetable.  In order to take advantage of these HW capabilities, there are
certain requirements placed on the SMMU hardware.

This series adds support for an Adreno specific arm-smmu implementation. The new
implementation 1) ensures that the GPU domain is always assigned context bank 0,
2) enables split pagetable support (TTBR1) so that the instance specific
pagetable can be swapped while the global memory remains in place and 3) shares
the current pagetable configuration with the GPU driver to allow it to create
its own io-pgtable instances.

The series then adds the drm/msm code to enable these features. For targets that
support it allocate new pagetables using the io-pgtable configuration shared by
the arm-smmu driver and swap them in during runtime.

This version of the series merges the previous patchset(s) [1] and [2]
with the following improvements:

v16: (Respin by Rob)
  - Fix indentation
  - Re-order series to split drm and iommu parts
v15: (Respin by Rob)
  - Adjust dt bindings to keep SoC specific compatible (Doug)
  - Add dts workaround for cheza fw limitation
  - Add missing 'select IOMMU_IO_PGTABLE' (Guenter)
v14: (Respin by Rob)
  - Minor update to 16/20 (only force ASID to zero in one place)
  - Addition of sc7180 dtsi patch.
v13: (Respin by Rob)
  - Switch to a private interface between adreno-smmu and GPU driver,
dropping the custom domain attr (Will Deacon)
  - Rework the SCTLR.HUPCF patch to add new fields in smmu_domain->cfg
rather than adding new impl hook (Will Deacon)
  - Drop for_each_cfg_sme() in favor of plain for() loop (Will Deacon)
  - Fix context refcnt'ing issue which was causing problems with GPU
crash recover stress testing.
  - Spiff up $debugfs/gem to show process information associated with
VMAs
v12:
  - Nitpick cleanups in gpu/drm/msm/msm_iommu.c (Rob Clark)
  - Reorg in gpu/drm/msm/msm_gpu.c (Rob Clark)
  - Use the default asid for the context bank so that iommu_tlb_flush_all works
  - Flush the UCHE after a page switch
  - Add the SCTLR.HUPCF patch at the end of the series
v11:
  - Add implementation specific get_attr/set_attr functions (per Rob Clark)
  - Fix context bank allocation (per Bjorn Andersson)
v10:
  - arm-smmu: add implementation hook to allocate context banks
  - arm-smmu: Match the GPU domain by stream ID instead of compatible string
  - arm-smmu: Make DOMAIN_ATTR_PGTABLE_CFG bi-directional. The leaf driver
queries the configuration to create a pagetable and then sends the newly
created configuration back to the smmu-driver to enable TTBR0
  - drm/msm: Add context reference counting for submissions
  - drm/msm: Use dummy functions to skip TLB operations on per-instance
pagetables

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045653.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2020-June/045659.html

Jordan Crouse (12):
  drm/msm: Add a context pointer to the submitqueue
  drm/msm: Drop context arg to gpu->submit()
  drm/msm: Set the global virtual address range from the IOMMU domain
  drm/msm: Add support to create a local pagetable
  drm/msm: Add support for private address space instances
  drm/msm/a6xx: Add support for per-instance pagetables
  iommu/arm-smmu: Pass io-pgtable config to implementation specific
function
  iommu/arm-smmu: Add support for split pagetables
  iommu/arm-smmu: Prepare for the adreno-smmu implementation
  iommu/arm-smmu-qcom: Add implementation for the adreno GPU SMMU
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

Rob Clark (8):
  drm/msm: Remove dangling submitqueue references
  drm/msm: Add private interface for adreno-smmu
  drm/msm/gpu: Add dev_to_gpu() helper
  drm/msm: Set adreno_smmu as gpu's drvdata
  drm/msm: Show process names in gem_describe
  iommu/arm-smmu: Constify some helpers
  iommu/arm-smmu: Add a way for implementations to influence SCTLR
  arm: dts: qcom: sc7180: Set the compatible string for the GPU SMMU

 .../devicetree/bindings/iommu/arm,smmu.yaml   |   9 +-

[PATCH v16 02/20] drm/msm: Add private interface for adreno-smmu

2020-09-01 Thread Rob Clark
From: Rob Clark 

This interface will be used for drm/msm to coordinate with the
qcom_adreno_smmu_impl to enable/disable TTBR0 translation.

Once TTBR0 translation is enabled, the GPU's CP (Command Processor)
will directly switch TTBR0 pgtables (and do the necessary TLB inv)
synchronized to the GPU's operation.  But help from the SMMU driver
is needed to initially bootstrap TTBR0 translation, which cannot be
done from the GPU.

Since this is a very special case, a private interface is used to
avoid adding highly driver specific things to the public iommu
interface.
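
As a hedged usage sketch from the GPU-driver side (the function below is
illustrative; the flow mirrors what later patches in the series do): clone
the TTBR1 config for the GPU's context bank, point it at a privately built
pagetable, enable TTBR0 with it, and disable it again by passing NULL.

static int example_toggle_ttbr0(struct adreno_smmu_priv *adreno_smmu)
{
        struct io_pgtable_cfg cfg;
        int ret;

        /* Start from the GPU context bank's TTBR1 configuration */
        cfg = *adreno_smmu->get_ttbr1_cfg(adreno_smmu->cookie);

        /* ... point cfg.arm_lpae_s1_cfg.ttbr at a privately built pagetable ... */

        ret = adreno_smmu->set_ttbr0_cfg(adreno_smmu->cookie, &cfg);
        if (ret)
                return ret;

        /* Passing NULL later turns TTBR0 translation back off */
        return adreno_smmu->set_ttbr0_cfg(adreno_smmu->cookie, NULL);
}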

Signed-off-by: Rob Clark 
Reviewed-by: Jordan Crouse 
Reviewed-by: Bjorn Andersson 
---
 include/linux/adreno-smmu-priv.h | 36 
 1 file changed, 36 insertions(+)
 create mode 100644 include/linux/adreno-smmu-priv.h

diff --git a/include/linux/adreno-smmu-priv.h b/include/linux/adreno-smmu-priv.h
new file mode 100644
index ..a889f28afb42
--- /dev/null
+++ b/include/linux/adreno-smmu-priv.h
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 Google, Inc
+ */
+
+#ifndef __ADRENO_SMMU_PRIV_H
+#define __ADRENO_SMMU_PRIV_H
+
+#include 
+
+/**
+ * struct adreno_smmu_priv - private interface between adreno-smmu and GPU
+ *
+ * @cookie:An opaque token provided by adreno-smmu and passed
+ * back into the callbacks
+ * @get_ttbr1_cfg: Get the TTBR1 config for the GPUs context-bank
+ * @set_ttbr0_cfg: Set the TTBR0 config for the GPUs context bank.  A
+ * NULL config disables TTBR0 translation, otherwise
+ * TTBR0 translation is enabled with the specified cfg
+ *
+ * The GPU driver (drm/msm) and adreno-smmu work together for controlling
+ * the GPU's SMMU instance.  This is by necessity, as the GPU is directly
+ * updating the SMMU for context switches, while on the other hand we do
+ * not want to duplicate all of the initial setup logic from arm-smmu.
+ *
+ * This private interface is used for the two drivers to coordinate.  The
+ * cookie and callback functions are populated when the GPU driver attaches
+ * its domain.
+ */
+struct adreno_smmu_priv {
+const void *cookie;
+const struct io_pgtable_cfg *(*get_ttbr1_cfg)(const void *cookie);
+int (*set_ttbr0_cfg)(const void *cookie, const struct io_pgtable_cfg *cfg);
+};
+
+#endif /* __ADRENO_SMMU_PRIV_H */
\ No newline at end of file
-- 
2.26.2


[PATCH v16 05/20] drm/msm: Add a context pointer to the submitqueue

2020-09-01 Thread Rob Clark
From: Jordan Crouse 

Each submitqueue is attached to a context. Add a pointer to the
context to the submitqueue at create time and refcount it so
that it stays around through the life of the queue.

Co-developed-by: Rob Clark 
Signed-off-by: Jordan Crouse 
Signed-off-by: Rob Clark 
Reviewed-by: Bjorn Andersson 
---
 drivers/gpu/drm/msm/msm_drv.c |  3 ++-
 drivers/gpu/drm/msm/msm_drv.h | 20 
 drivers/gpu/drm/msm/msm_gem.h |  1 +
 drivers/gpu/drm/msm/msm_gem_submit.c  |  6 +++---
 drivers/gpu/drm/msm/msm_gpu.h |  1 +
 drivers/gpu/drm/msm/msm_submitqueue.c |  3 +++
 6 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 79333842f70a..75cd7639f560 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -594,6 +594,7 @@ static int context_init(struct drm_device *dev, struct 
drm_file *file)
if (!ctx)
return -ENOMEM;
 
+   kref_init(&ctx->ref);
msm_submitqueue_init(dev, ctx);
 
ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
@@ -615,7 +616,7 @@ static int msm_open(struct drm_device *dev, struct drm_file 
*file)
 static void context_close(struct msm_file_private *ctx)
 {
msm_submitqueue_close(ctx);
-   kfree(ctx);
+   msm_file_private_put(ctx);
 }
 
 static void msm_postclose(struct drm_device *dev, struct drm_file *file)
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index af259b0573ea..4561bfb5e745 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -57,6 +57,7 @@ struct msm_file_private {
struct list_head submitqueues;
int queueid;
struct msm_gem_address_space *aspace;
+   struct kref ref;
 };
 
 enum msm_mdp_plane_property {
@@ -428,6 +429,25 @@ void msm_submitqueue_close(struct msm_file_private *ctx);
 
 void msm_submitqueue_destroy(struct kref *kref);
 
+static inline void __msm_file_private_destroy(struct kref *kref)
+{
+   struct msm_file_private *ctx = container_of(kref,
+   struct msm_file_private, ref);
+
+   kfree(ctx);
+}
+
+static inline void msm_file_private_put(struct msm_file_private *ctx)
+{
+   kref_put(&ctx->ref, __msm_file_private_destroy);
+}
+
+static inline struct msm_file_private *msm_file_private_get(
+   struct msm_file_private *ctx)
+{
+   kref_get(&ctx->ref);
+   return ctx;
+}
 
 #define DBG(fmt, ...) DRM_DEBUG_DRIVER(fmt"\n", ##__VA_ARGS__)
 #define VERB(fmt, ...) if (0) DRM_DEBUG_DRIVER(fmt"\n", ##__VA_ARGS__)
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 972490b14ba5..9c573c4269cb 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -142,6 +142,7 @@ struct msm_gem_submit {
bool valid; /* true if no cmdstream patching needed */
bool in_rb; /* "sudo" mode, copy cmds into RB */
struct msm_ringbuffer *ring;
+   struct msm_file_private *ctx;
unsigned int nr_cmds;
unsigned int nr_bos;
u32 ident; /* A "identifier" for the submit for logging */
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 8cb9aa15ff90..1464b04d25d3 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -27,7 +27,7 @@
 #define BO_PINNED   0x2000
 
 static struct msm_gem_submit *submit_create(struct drm_device *dev,
-   struct msm_gpu *gpu, struct msm_gem_address_space *aspace,
+   struct msm_gpu *gpu,
struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
uint32_t nr_cmds)
 {
@@ -43,7 +43,7 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return NULL;
 
submit->dev = dev;
-   submit->aspace = aspace;
+   submit->aspace = queue->ctx->aspace;
submit->gpu = gpu;
submit->fence = NULL;
submit->cmd = (void *)>bos[nr_bos];
@@ -677,7 +677,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
}
}
 
-   submit = submit_create(dev, gpu, ctx->aspace, queue, args->nr_bos,
+   submit = submit_create(dev, gpu, queue, args->nr_bos,
args->nr_cmds);
if (!submit) {
ret = -ENOMEM;
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index f91b141add75..97c527e98391 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -190,6 +190,7 @@ struct msm_gpu_submitqueue {
u32 flags;
u32 prio;
int faults;
+   struct msm_file_private *ctx;
struct list_head node;
struct kref ref;
 };
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c 
b/drivers/gpu/drm/msm/msm_submitqueue.c
index 90c9d84e6155..c3d206105d28 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
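
The msm_submitqueue.c hunk is cut off in this archive; conceptually, the pairing it needs to establish looks like the sketch below (illustrative only -- the example_* names are made up, while the types and helpers come from the hunks above, assuming the usual msm_drv.h/msm_gpu.h/slab.h includes):

/* Take a reference on the context when a queue is created for it... */
static void example_submitqueue_create(struct msm_file_private *ctx,
				       struct msm_gpu_submitqueue *queue)
{
	queue->ctx = msm_file_private_get(ctx);
}

/* ...and drop it when the last reference to the queue goes away. */
static void example_submitqueue_destroy(struct msm_gpu_submitqueue *queue)
{
	msm_file_private_put(queue->ctx);
	kfree(queue);
}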

Re: [PATCH] drm: Parse Colorimetry data block from EDID

2020-09-01 Thread crj

Hi,

On 2020/9/1 3:53, Ville Syrjälä wrote:

On Fri, Aug 28, 2020 at 09:07:13AM +0800, crj wrote:

Hi Ville Syrjälä,

On 2020/8/27 18:57, Ville Syrjälä wrote:

On Wed, Aug 26, 2020 at 10:23:28PM +0800, Algea Cao wrote:

CEA 861.3 spec adds colorimetry data block for HDMI.
Parsing the block to get the colorimetry data from
panel.

And what exactly do you want to do with that data?


We can get the colorimetry data block from the EDID and then support
HDMI colorimetry such as BT2020.

But what do you want to do with it? The patch does nothing
functional.


If we want to output BT2020 from the HDMI driver, we can check whether the TV
supports BT2020 via connector->display_info.hdmi.colorimetry. If the TV doesn't
support BT2020, the HDMI driver shouldn't output BT2020.
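
A driver-side check along those lines could look like this (a hypothetical sketch using only the field and bit definitions added by the patch below; the function name is made up):

#include <drm/drm_connector.h>
#include <drm/drm_edid.h>

/* Only output BT.2020 if the sink advertises support for it. */
static bool example_sink_supports_bt2020(const struct drm_connector *connector)
{
	u16 colorimetry = connector->display_info.hdmi.colorimetry;

	return colorimetry & (DRM_EDID_CLRMETRY_BT2020_RGB |
			      DRM_EDID_CLRMETRY_BT2020_YCC |
			      DRM_EDID_CLRMETRY_BT2020_CYCC);
}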


Signed-off-by: Algea Cao 
---

   drivers/gpu/drm/drm_edid.c  | 45 +
   include/drm/drm_connector.h |  3 +++
   include/drm/drm_edid.h  | 14 
   3 files changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 31496b6cfc56..67e607c04492 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3223,6 +3223,7 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
   #define VIDEO_BLOCK 0x02
   #define VENDOR_BLOCK0x03
   #define SPEAKER_BLOCK0x04
+#define COLORIMETRY_DATA_BLOCK 0x5
   #define HDR_STATIC_METADATA_BLOCK0x6
   #define USE_EXTENDED_TAG 0x07
   #define EXT_VIDEO_CAPABILITY_BLOCK 0x00
@@ -4309,6 +4310,48 @@ static void fixup_detailed_cea_mode_clock(struct 
drm_display_mode *mode)
mode->clock = clock;
   }
   
+static bool cea_db_is_hdmi_colorimetry_data_block(const u8 *db)
+{
+   if (cea_db_tag(db) != USE_EXTENDED_TAG)
+   return false;
+
+   if (db[1] != COLORIMETRY_DATA_BLOCK)
+   return false;
+
+   if (cea_db_payload_len(db) < 2)
+   return false;
+
+   return true;
+}
+
+static void
+drm_parse_colorimetry_data_block(struct drm_connector *connector, const u8 *db)
+{
+   struct drm_hdmi_info *info = &connector->display_info.hdmi;
+
+   if (db[2] & DRM_EDID_CLRMETRY_xvYCC_601)
+   info->colorimetry |= DRM_EDID_CLRMETRY_xvYCC_601;
+   if (db[2] & DRM_EDID_CLRMETRY_xvYCC_709)
+   info->colorimetry |= DRM_EDID_CLRMETRY_xvYCC_709;
+   if (db[2] & DRM_EDID_CLRMETRY_sYCC_601)
+   info->colorimetry |= DRM_EDID_CLRMETRY_sYCC_601;
+   if (db[2] & DRM_EDID_CLRMETRY_ADBYCC_601)
+   info->colorimetry |= DRM_EDID_CLRMETRY_ADBYCC_601;
+   if (db[2] & DRM_EDID_CLRMETRY_ADB_RGB)
+   info->colorimetry |= DRM_EDID_CLRMETRY_ADB_RGB;
+   if (db[2] & DRM_EDID_CLRMETRY_BT2020_CYCC)
+   info->colorimetry |= DRM_EDID_CLRMETRY_BT2020_CYCC;
+   if (db[2] & DRM_EDID_CLRMETRY_BT2020_YCC)
+   info->colorimetry |= DRM_EDID_CLRMETRY_BT2020_YCC;
+   if (db[2] & DRM_EDID_CLRMETRY_BT2020_RGB)
+   info->colorimetry |= DRM_EDID_CLRMETRY_BT2020_RGB;
+   /* Byte 4 Bit 7: DCI-P3 */
+   if (db[3] & BIT(7))
+   info->colorimetry |= DRM_EDID_CLRMETRY_DCI_P3;
+
+   DRM_DEBUG_KMS("Supported Colorimetry 0x%x\n", info->colorimetry);
+}
+
   static bool cea_db_is_hdmi_hdr_metadata_block(const u8 *db)
   {
if (cea_db_tag(db) != USE_EXTENDED_TAG)
@@ -4994,6 +5037,8 @@ static void drm_parse_cea_ext(struct drm_connector 
*connector,
drm_parse_vcdb(connector, db);
if (cea_db_is_hdmi_hdr_metadata_block(db))
drm_parse_hdr_metadata_block(connector, db);
+   if (cea_db_is_hdmi_colorimetry_data_block(db))
+   drm_parse_colorimetry_data_block(connector, db);
}
   }
   
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h

index af145608b5ed..d599c3b9e881 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -207,6 +207,9 @@ struct drm_hdmi_info {
   
   	/** @y420_dc_modes: bitmap of deep color support index */

u8 y420_dc_modes;
+
+   /* @colorimetry: bitmap of supported colorimetry modes */
+   u16 colorimetry;
   };
   
   /**

diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index cfa4f5af49af..98fa78c2f82d 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -229,6 +229,20 @@ struct detailed_timing {
DRM_EDID_YCBCR420_DC_36 | \
DRM_EDID_YCBCR420_DC_30)
   
+/*
+ * Supported Colorimetry from colorimetry data block
+ * as per CEA 861-G spec
+ */
+#define DRM_EDID_CLRMETRY_xvYCC_601   (1 << 0)
+#define DRM_EDID_CLRMETRY_xvYCC_709   (1 << 1)
+#define DRM_EDID_CLRMETRY_sYCC_601(1 << 2)
+#define DRM_EDID_CLRMETRY_ADBYCC_601  (1 << 3)
+#define DRM_EDID_CLRMETRY_ADB_RGB (1 << 4)
+#define DRM_EDID_CLRMETRY_BT2020_CYCC (1 << 5)
+#define DRM_EDID_CLRMETRY_BT2020_YCC  (1 << 6)
+#define 

Re: [PATCH] drm/radeon: Reset ASIC if suspend is not managed by platform firmware

2020-09-01 Thread Kai-Heng Feng



> On Sep 1, 2020, at 22:19, Alex Deucher  wrote:
> 
> On Tue, Sep 1, 2020 at 3:32 AM Kai-Heng Feng
>  wrote:
>> 
>> Suspend with s2idle or by the following steps cause screen frozen:
>> # echo devices > /sys/power/pm_test
>> # echo freeze > /sys/power/mem
>> 
>> [  289.625461] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait 
>> timed out.
>> [  289.625494] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed 
>> testing IB on ring 5 (-110).
>> 
>> The issue doesn't happen on traditional S3, probably because firmware or
>> hardware provides extra power management.
>> 
>> Inspired by Daniel Drake's patch [1] on amdgpu, using a similar approach
>> can fix the issue.
> 
> It doesn't actually fix the issue.  The device is never powered down
> so you are using more power than you would if you did not suspend in
> the first place.  The reset just works around the fact that the device
> is never powered down.

So how do we properly suspend/resume the device without help from platform 
firmware?

Kai-Heng

> 
> Alex
> 
>> 
>> [1] https://patchwork.freedesktop.org/patch/335839/
>> 
>> Signed-off-by: Kai-Heng Feng 
>> ---
>> drivers/gpu/drm/radeon/radeon_device.c | 3 +++
>> 1 file changed, 3 insertions(+)
>> 
>> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
>> b/drivers/gpu/drm/radeon/radeon_device.c
>> index 266e3cbbd09b..df823b9ad79f 100644
>> --- a/drivers/gpu/drm/radeon/radeon_device.c
>> +++ b/drivers/gpu/drm/radeon/radeon_device.c
>> @@ -33,6 +33,7 @@
>> #include 
>> #include 
>> #include 
>> +#include 
>> 
>> #include 
>> #include 
>> @@ -1643,6 +1644,8 @@ int radeon_suspend_kms(struct drm_device *dev, bool 
>> suspend,
>>rdev->asic->asic_reset(rdev, true);
>>pci_restore_state(dev->pdev);
>>} else if (suspend) {
>> +   if (pm_suspend_no_platform())
>> +   rdev->asic->asic_reset(rdev, true);
>>/* Shut down the device */
>>pci_disable_device(dev->pdev);
>>pci_set_power_state(dev->pdev, PCI_D3hot);
>> --
>> 2.17.1
>> 
>> ___
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel



Re: [PATCH v4 62/78] drm/vc4: hdmi: Adjust HSM clock rate depending on pixel rate

2020-09-01 Thread Maxime Ripard
Hi Chanwoo,

On Tue, Sep 01, 2020 at 01:36:17PM +0900, Chanwoo Choi wrote:
> On 7/9/20 2:42 AM, Maxime Ripard wrote:
> > The HSM clock needs to be setup at around 101% of the pixel rate. This
> > was done previously by setting the clock rate to 163.7MHz at probe time and
> > only check in mode_valid whether the mode pixel clock was under the pixel
> > clock +1% or not.
> > 
> > However, with 4k we need to change that frequency to a higher frequency
> > than 163.7MHz, and yet want to have the lowest clock as possible to have a
> > decent power saving.
> > 
> > Let's change that logic a bit by setting the clock rate of the HSM clock
> > to the pixel rate at encoder_enable time. This would work for the
> > BCM2711 that support 4k resolutions and has a clock that can provide it,
> > but we still have to take care of a 4k panel plugged on a BCM283x SoCs
> > that wouldn't be able to use those modes, so let's define the limit in
> > the variant.
> > 
> > Signed-off-by: Maxime Ripard 
> > ---
> >  drivers/gpu/drm/vc4/vc4_hdmi.c | 79 ---
> >  drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +-
> >  2 files changed, 41 insertions(+), 41 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > index 17797b14cde4..9f30fab744f2 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > @@ -53,7 +53,6 @@
> >  #include "vc4_hdmi_regs.h"
> >  #include "vc4_regs.h"
> >  
> > -#define HSM_CLOCK_FREQ 163682864
> >  #define CEC_CLOCK_FREQ 4
> >  
> >  static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
> > @@ -326,6 +325,7 @@ static void vc4_hdmi_encoder_disable(struct drm_encoder 
> > *encoder)
> > HDMI_WRITE(HDMI_VID_CTL,
> >HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE);
> >  
> > +   clk_disable_unprepare(vc4_hdmi->hsm_clock);
> > clk_disable_unprepare(vc4_hdmi->pixel_clock);
> >  
> > ret = pm_runtime_put(_hdmi->pdev->dev);
> > @@ -423,6 +423,7 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
> > *encoder)
> > struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
> > struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
> > bool debug_dump_regs = false;
> > +   unsigned long pixel_rate, hsm_rate;
> > int ret;
> >  
> > ret = pm_runtime_get_sync(_hdmi->pdev->dev);
> > @@ -431,9 +432,8 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
> > *encoder)
> > return;
> > }
> >  
> > -   ret = clk_set_rate(vc4_hdmi->pixel_clock,
> > -  mode->clock * 1000 *
> > -  ((mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1));
> > +   pixel_rate = mode->clock * 1000 * ((mode->flags & DRM_MODE_FLAG_DBLCLK) 
> > ? 2 : 1);
> > +   ret = clk_set_rate(vc4_hdmi->pixel_clock, pixel_rate);
> > if (ret) {
> > DRM_ERROR("Failed to set pixel clock rate: %d\n", ret);
> > return;
> > @@ -445,6 +445,36 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
> > *encoder)
> > return;
> > }
> >  
> > +   /*
> > +* As stated in RPi's vc4 firmware "HDMI state machine (HSM) clock must
> > +* be faster than pixel clock, infinitesimally faster, tested in
> > +* simulation. Otherwise, exact value is unimportant for HDMI
> > +* operation." This conflicts with bcm2835's vc4 documentation, which
> > +* states HSM's clock has to be at least 108% of the pixel clock.
> > +*
> > +* Real life tests reveal that vc4's firmware statement holds up, and
> > +* users are able to use pixel clocks closer to HSM's, namely for
> > +* 1920x1200@60Hz. So it was decided to have leave a 1% margin between
> > +* both clocks. Which, for RPi0-3 implies a maximum pixel clock of
> > +* 162MHz.
> > +*
> > +* Additionally, the AXI clock needs to be at least 25% of
> > +* pixel clock, but HSM ends up being the limiting factor.
> > +*/
> > +   hsm_rate = max_t(unsigned long, 12000, (pixel_rate / 100) * 101);
> > +   ret = clk_set_rate(vc4_hdmi->hsm_clock, hsm_rate);
> > +   if (ret) {
> > +   DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
> > +   return;
> > +   }
> > +
> > +   ret = clk_prepare_enable(vc4_hdmi->hsm_clock);
> > +   if (ret) {
> > +   DRM_ERROR("Failed to turn on HSM clock: %d\n", ret);
> > +   clk_disable_unprepare(vc4_hdmi->pixel_clock);
> > +   return;
> > +   }
> 
> About vc4_hdmi->hsm_clock instance, usually, we need to enable the clock
> with clk_prepare_enable() and then touch the clock like clk_set_rate().
> I think that need to enable the clock before calling clk_set_rate().
> 
> When I tested this patchset, it is well working because I think that
> vc4_hdmi->hsm_clock was already enabled on other side.

There's no clear rule here on the ordering (at least not one enforced by the
framework). There are clocks that need to be disabled to change their rate

Re: [PATCH v8 06/17] pwm: lpss: Use pwm_lpss_restore() when restoring state on resume

2020-09-01 Thread Andy Shevchenko
On Mon, Aug 31, 2020 at 07:57:30PM +0200, Hans de Goede wrote:
> On 8/31/20 3:15 PM, Thierry Reding wrote:
> > On Mon, Aug 31, 2020 at 01:46:28PM +0200, Hans de Goede wrote:
> > > On 8/31/20 1:10 PM, Thierry Reding wrote:
> > > > On Sun, Aug 30, 2020 at 02:57:42PM +0200, Hans de Goede wrote:
> > > > > Before this commit a suspend + resume of the LPSS PWM controller
> > > > > would result in the controller being reset to its defaults of
> > > > > output-freq = clock/256, duty-cycle=100%, until someone changes
> > > > > to the output-freq and/or duty-cycle are made.
> > > > > 
> > > > > This problem has been masked so far because the main consumer
> > > > > (the i915 driver) was always making duty-cycle changes on resume.
> > > > > With the conversion of the i915 driver to the atomic PWM API the
> > > > > driver now only disables/enables the PWM on suspend/resume leaving
> > > > > the output-freq and duty as is, triggering this problem.
> > > > 
> > > > Doesn't this imply that there's another bug at play here? At the PWM API
> > > > level you're applying a state and it's up to the driver to ensure that
> > > > the hardware state after ->apply() is what the software has requested.
> > > > 
> > > > If you only switch the enable state and that doesn't cause period and
> > > > duty cycle to be updated it means that your driver isn't writing those
> > > > registers when it should be.
> > > 
> > > Right, the driver was not committing those as it should *on resume*,
> > > that and it skips setting the update bit on the subsequent enable,
> > > which is an optimization which gets removed in 7/17.
> > > 
> > > Before switching the i915 driver over to atomic, when the LPSS-PWM
> > > was used for the backlight we got the following order on suspend/resume
> > > 
> > > 1. Set duty-cycle to 0%
> > > 2. Set enabled to 0
> > > 3. Save ctrl reg
> > > 4. Power-off PWM controller, it now looses all its state
> > > 5. Power-on PWM ctrl
> > > 6. Restore ctrl reg (as a single reg write)
> > > 7. Set enabled to 1, at this point one would expect the
> > > duty/freq from the restored ctrl-reg to apply, but:
> > > a) The resume code never sets the update bit (which this commit fixes); 
> > > and
> > > b) On applying the pwm_state with enabled=1 the code applying the
> > > state does this (before setting the enabled bit in the ctrl reg):
> > > 
> > >   if (orig_ctrl != ctrl) {
> > >   pwm_lpss_write(pwm, ctrl);
> > >   pwm_lpss_write(pwm, ctrl | PWM_SW_UPDATE);
> > >   }
> > > and since the restore of the ctrl reg set the old duty/freq the
> > > writes are skipped, so the update bit never gets set.
> > > 
> > > 8. Set duty-cycle to the pre-suspend value (which is not 0)
> > > this does cause a change in the ctrl-reg, so now the update flag
> > > does get set.
> > > 
> > > Note that 1-2 and 7-8 are both done by the non atomic i915 code,
> > > when moving the i915 code to atomic I decided that having these
> > > 2 separate steps here is non-sense, so the new i915 code just
> > > toggles the enable bit. So in essence the new atomic PWM
> > > i915 code drops step 1 and 8.
> > > 
> > > Dropping steps 8 means that the update bit never gets set and we
> > > end up with the PWM running at its power-on-reset duty cycle.
> > > 
> > > You are correct in your remark to patch 7/17 that since that removes
> > > the if (orig_ctrl != ctrl) for the writes that now step 7 will be
> > > sufficient to get the PWM to work again. But that only takes the i915
> > > usage into account.
> > > 
> > > What if the PWM is used through the sysfs userspace API?
> > > Then only steps 3-6 will happen on suspend-resume and without
> > > fixing step 6 to properly restore the PWM controller in its
> > > pre-resume state (this patch) it will once again be running at
> > > its power-on-reset defaults instead of the values from the
> > > restored control register.
> > 
> > Actually PWM's sysfs code has suspend/resume callbacks that basically
> > make sysfs just a regular consumer of PWMs. So they do end up doing a
> > pwm_apply_state() on the PWM as well on suspend and restore the state
> > from before suspend on resume.
> > 
> > This was done very specifically because the suspend/resume order can be
> > unexpected under some circumstances, so for PWM we really want for the
> > consumer to always have ultimate control over when precisely the PWM is
> > restored on resume.
> > 
> > The reason why we did this was because people observed weird glitches on
> > suspend/resume with different severity. In some cases a backlight would
> > be resumed before the display controller had had a chance to start
> > sending frames, causing on-screen corruption in some cases (such as
> > smart displays) and in other cases a PWM-controller regulator would be
> > resumed too late or too early, which I think was causing some issue with
> > the CPUs not working properly on resume.
> > 
> > So I'd prefer not to have any PWM driver save and restore its own
> > context on 

Re: [PATCH v2 1/4] drm/of: Change the prototype of drm_of_lvds_get_dual_link_pixel_order

2020-09-01 Thread Maxime Ripard
Hi Laurent,

On Mon, Aug 31, 2020 at 11:28:52PM +0300, Laurent Pinchart wrote:
> Hi Maxime,
> 
> Thank you for the patch.
> 
> On Thu, Jul 30, 2020 at 11:35:01AM +0200, Maxime Ripard wrote:
> > The drm_of_lvds_get_dual_link_pixel_order() function took so far the
> > device_node of the two ports used together to make up a dual-link LVDS
> > output.
> > 
> > This assumes that a binding would use an entire port for the LVDS output.
> > However, some bindings have used endpoints instead and thus we need to
> > operate at the endpoint level. Change slightly the arguments to allow that.
> 
> Is this still needed ? Unless I'm mistaken, the Allwinner platform now
> uses two TCON instances for the two links, so there are two ports.

Yes, and no.

The two TCONs do each have a port of their own, so we do have two
ports. However, what we don't have is a port entirely dedicated
to the LVDS output.

Our binding uses a single port for all its outputs (RGB, LVDS or TV/HDMI
controllers) with different endpoints.

Maxime




Re: [PATCH] Fix use after free in get_capset_info callback

2020-09-01 Thread Markus Elfring
> If a response to virtio_gpu_cmd_get_capset_info takes longer than
> five seconds to return, the callback will access freed kernel memory
> in vg->capsets.

* Could a different imperative wording help the change description?

* What do you think about mentioning the proposed addition of a spin lock
  and a null pointer check?

* Would you like to add a “Fixes” tag to the commit message?

Regards,
Markus
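
The kind of guard being alluded to might look roughly like this (a hypothetical sketch -- the structure and names below are illustrative and are not taken from the virtio-gpu driver or from the patch under review):

#include <linux/spinlock.h>

struct example_vgdev {
	spinlock_t capset_lock;
	void *capsets;		/* freed on teardown, then set to NULL */
};

/* Completion callback that may fire after the capsets were freed. */
static void example_capset_info_cb(struct example_vgdev *vg)
{
	spin_lock(&vg->capset_lock);
	if (!vg->capsets) {
		/* Response arrived after teardown: nothing left to update. */
		spin_unlock(&vg->capset_lock);
		return;
	}
	/* ... safely update the capset entry here ... */
	spin_unlock(&vg->capset_lock);
}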


Re: [PATCH v4 62/78] drm/vc4: hdmi: Adjust HSM clock rate depending on pixel rate

2020-09-01 Thread Chanwoo Choi
Hi Maxime,

On 9/1/20 6:45 PM, Maxime Ripard wrote:
> Hi Chanwoo,
> 
> On Tue, Sep 01, 2020 at 01:36:17PM +0900, Chanwoo Choi wrote:
>> On 7/9/20 2:42 AM, Maxime Ripard wrote:
>>> The HSM clock needs to be setup at around 101% of the pixel rate. This
>>> was done previously by setting the clock rate to 163.7MHz at probe time and
>>> only check in mode_valid whether the mode pixel clock was under the pixel
>>> clock +1% or not.
>>>
>>> However, with 4k we need to change that frequency to a higher frequency
>>> than 163.7MHz, and yet want to have the lowest clock as possible to have a
>>> decent power saving.
>>>
>>> Let's change that logic a bit by setting the clock rate of the HSM clock
>>> to the pixel rate at encoder_enable time. This would work for the
>>> BCM2711 that support 4k resolutions and has a clock that can provide it,
>>> but we still have to take care of a 4k panel plugged on a BCM283x SoCs
>>> that wouldn't be able to use those modes, so let's define the limit in
>>> the variant.
>>>
>>> Signed-off-by: Maxime Ripard 
>>> ---
>>>  drivers/gpu/drm/vc4/vc4_hdmi.c | 79 ---
>>>  drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +-
>>>  2 files changed, 41 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
>>> index 17797b14cde4..9f30fab744f2 100644
>>> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
>>> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
>>> @@ -53,7 +53,6 @@
>>>  #include "vc4_hdmi_regs.h"
>>>  #include "vc4_regs.h"
>>>  
>>> -#define HSM_CLOCK_FREQ 163682864
>>>  #define CEC_CLOCK_FREQ 4
>>>  
>>>  static int vc4_hdmi_debugfs_regs(struct seq_file *m, void *unused)
>>> @@ -326,6 +325,7 @@ static void vc4_hdmi_encoder_disable(struct drm_encoder 
>>> *encoder)
>>> HDMI_WRITE(HDMI_VID_CTL,
>>>HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE);
>>>  
>>> +   clk_disable_unprepare(vc4_hdmi->hsm_clock);
>>> clk_disable_unprepare(vc4_hdmi->pixel_clock);
>>>  
>>> ret = pm_runtime_put(_hdmi->pdev->dev);
>>> @@ -423,6 +423,7 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
>>> *encoder)
>>> struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder);
>>> struct vc4_hdmi_encoder *vc4_encoder = to_vc4_hdmi_encoder(encoder);
>>> bool debug_dump_regs = false;
>>> +   unsigned long pixel_rate, hsm_rate;
>>> int ret;
>>>  
>>> ret = pm_runtime_get_sync(_hdmi->pdev->dev);
>>> @@ -431,9 +432,8 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
>>> *encoder)
>>> return;
>>> }
>>>  
>>> -   ret = clk_set_rate(vc4_hdmi->pixel_clock,
>>> -  mode->clock * 1000 *
>>> -  ((mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1));
>>> +   pixel_rate = mode->clock * 1000 * ((mode->flags & DRM_MODE_FLAG_DBLCLK) 
>>> ? 2 : 1);
>>> +   ret = clk_set_rate(vc4_hdmi->pixel_clock, pixel_rate);
>>> if (ret) {
>>> DRM_ERROR("Failed to set pixel clock rate: %d\n", ret);
>>> return;
>>> @@ -445,6 +445,36 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder 
>>> *encoder)
>>> return;
>>> }
>>>  
>>> +   /*
>>> +* As stated in RPi's vc4 firmware "HDMI state machine (HSM) clock must
>>> +* be faster than pixel clock, infinitesimally faster, tested in
>>> +* simulation. Otherwise, exact value is unimportant for HDMI
>>> +* operation." This conflicts with bcm2835's vc4 documentation, which
>>> +* states HSM's clock has to be at least 108% of the pixel clock.
>>> +*
>>> +* Real life tests reveal that vc4's firmware statement holds up, and
>>> +* users are able to use pixel clocks closer to HSM's, namely for
>>> +* 1920x1200@60Hz. So it was decided to have leave a 1% margin between
>>> +* both clocks. Which, for RPi0-3 implies a maximum pixel clock of
>>> +* 162MHz.
>>> +*
>>> +* Additionally, the AXI clock needs to be at least 25% of
>>> +* pixel clock, but HSM ends up being the limiting factor.
>>> +*/
>>> +   hsm_rate = max_t(unsigned long, 12000, (pixel_rate / 100) * 101);
>>> +   ret = clk_set_rate(vc4_hdmi->hsm_clock, hsm_rate);
>>> +   if (ret) {
>>> +   DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
>>> +   return;
>>> +   }
>>> +
>>> +   ret = clk_prepare_enable(vc4_hdmi->hsm_clock);
>>> +   if (ret) {
>>> +   DRM_ERROR("Failed to turn on HSM clock: %d\n", ret);
>>> +   clk_disable_unprepare(vc4_hdmi->pixel_clock);
>>> +   return;
>>> +   }
>>
>> About vc4_hdmi->hsm_clock instance, usually, we need to enable the clock
>> with clk_prepare_enable() and then touch the clock like clk_set_rate().
>> I think that need to enable the clock before calling clk_set_rate().
>>
>> When I tested this patchset, it is well working because I think that
>> vc4_hdmi->hsm_clock was already enabled on other side.
> 
> There's no clear rule here on the ordering (at least enforced by the
> framework). There's 

Re: [PATCH V2 3/8] drm/msm: Unconditionally call dev_pm_opp_of_remove_table()

2020-09-01 Thread Viresh Kumar
On 01-09-20, 15:15, Rajendra Nayak wrote:
> 
> On 9/1/2020 2:08 PM, Viresh Kumar wrote:
> > On 01-09-20, 13:01, Rajendra Nayak wrote:
> > > So FWIU, dpu_unbind() gets called even when dpu_bind() fails for some 
> > > reason.
> > 
> > Ahh, I see.
> > 
> > > I tried to address that earlier [1] which I realized did not land.
> > 
> > I don't think that patch was required, as you can call
> > dev_pm_opp_put_clkname() multiple times and it will return without any
> > errors/crash.
> 
> We did see a crash (Sai had reported it), perhaps with dsi [1] and not this
> driver. But it was the same scenario that was possible here as well, which is
> dev_pm_opp_put_clkname() getting called without dev_pm_opp_set_clkname()
> being done. I think we ended up passing a NULL as opp_table in that case
> and the function tries de-referencing it.

Heh, yeah I did miss that stupid thing :(

> > 
> > > But with these changes
> > > it will be even more broken unless we identify if we failed dpu_bind() 
> > > before
> > > adding the OPP table, while adding it, or all went well with opps and 
> > > handle things
> > > accordingly in dpu_unbind.
> > 
> > Maybe not as dev_pm_opp_of_remove_table() can be called multiple times
> > as well without any errors or crash.
> 
> Can it be called without the driver ever doing a dev_pm_opp_of_add_table()?

Yes, as we will fail to find the OPP device in that case with -ENODEV
and so won't even print a warning.

Also, if the OPP table was previously added in response to
dev_pm_opp_set_clkname(), then we won't free it either. So yes, it
should work just fine.

-- 
viresh
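
A minimal sketch of the "call it unconditionally" pattern being argued for, assuming only what is stated above (that dev_pm_opp_of_remove_table() quietly returns when no table was ever added); the function name is made up:

#include <linux/device.h>
#include <linux/pm_opp.h>

static void example_unbind(struct device *dev)
{
	/*
	 * Safe even if dev_pm_opp_of_add_table() was never called or bind
	 * failed partway through: the internal lookup fails with -ENODEV
	 * and nothing is freed or printed.
	 */
	dev_pm_opp_of_remove_table(dev);
}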


Re: [PATCH v4 03/78] drm/vc4: hvs: Boost the core clock during modeset

2020-09-01 Thread Chanwoo Choi
On 9/1/20 8:21 PM, Chanwoo Choi wrote:
> Hi Maxime,
> 
> On 7/9/20 2:41 AM, Maxime Ripard wrote:
>> In order to prevent timeouts and stalls in the pipeline, the core clock
>> needs to be maxed at 500MHz during a modeset on the BCM2711.
>>
>> Reviewed-by: Eric Anholt 
>> Signed-off-by: Maxime Ripard 
>> ---
>>  drivers/gpu/drm/vc4/vc4_drv.h |  2 ++
>>  drivers/gpu/drm/vc4/vc4_hvs.c |  9 +
>>  drivers/gpu/drm/vc4/vc4_kms.c |  9 +
>>  3 files changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
>> index e4cde1f9224b..6358f6ca8d56 100644
>> --- a/drivers/gpu/drm/vc4/vc4_drv.h
>> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
>> @@ -320,6 +320,8 @@ struct vc4_hvs {
>>  void __iomem *regs;
>>  u32 __iomem *dlist;
>>  
>> +struct clk *core_clk;
>> +
>>  /* Memory manager for CRTCs to allocate space in the display
>>   * list.  Units are dwords.
>>   */
>> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
>> index 836d8799d79e..091fdf4908aa 100644
>> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
>> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
>> @@ -19,6 +19,7 @@
>>   * each CRTC.
>>   */
>>  
>> +#include 
>>  #include 
>>  #include 
>>  
>> @@ -540,6 +541,14 @@ static int vc4_hvs_bind(struct device *dev, struct 
>> device *master, void *data)
>>  hvs->regset.regs = hvs_regs;
>>  hvs->regset.nregs = ARRAY_SIZE(hvs_regs);
>>  
>> +if (hvs->hvs5) {
>> +hvs->core_clk = devm_clk_get(>dev, NULL);
>> +if (IS_ERR(hvs->core_clk)) {
>> +dev_err(>dev, "Couldn't get core clock\n");
>> +return PTR_ERR(hvs->core_clk);
>> +}
>> +}
>> +
>>  if (!hvs->hvs5)
>>  hvs->dlist = hvs->regs + SCALER_DLIST_START;
>>  else
>> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
>> index 08318e69061b..210cc2408087 100644
>> --- a/drivers/gpu/drm/vc4/vc4_kms.c
>> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
>> @@ -11,6 +11,8 @@
>>   * crtc, HDMI encoder).
>>   */
>>  
>> +#include 
>> +
>>  #include 
>>  #include 
>>  #include 
>> @@ -149,6 +151,7 @@ vc4_atomic_complete_commit(struct drm_atomic_state 
>> *state)
>>  {
>>  struct drm_device *dev = state->dev;
>>  struct vc4_dev *vc4 = to_vc4_dev(dev);
>> +struct vc4_hvs *hvs = vc4->hvs;
>>  struct vc4_crtc *vc4_crtc;
>>  int i;
>>  
>> @@ -160,6 +163,9 @@ vc4_atomic_complete_commit(struct drm_atomic_state 
>> *state)
>>  vc4_hvs_mask_underrun(dev, vc4_crtc->channel);
>>  }
>>  
>> +if (vc4->hvs->hvs5)
>> +clk_set_min_rate(hvs->core_clk, 500000000);
>> +
>>  drm_atomic_helper_wait_for_fences(dev, state, false);
>>  
>>  drm_atomic_helper_wait_for_dependencies(state);
>> @@ -182,6 +188,9 @@ vc4_atomic_complete_commit(struct drm_atomic_state 
>> *state)
>>  
>>  drm_atomic_helper_commit_cleanup_done(state);
>>  
>> +if (vc4->hvs->hvs5)
>> +clk_set_min_rate(hvs->core_clk, 0);
>> +
>>  drm_atomic_state_put(state);
>>  
>>  up(>async_modeset);
>>
> 
> This patch doesn't control the enable/disable of core_clk.
> So, I think that it need to handle the clock as following:
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 4ef88c0b51ab..355d67fd8beb 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -588,6 +588,12 @@ static int vc4_hvs_bind(struct device *dev, struct 
> device *master, void *data)
> dev_err(>dev, "Couldn't get core clock\n");
> return PTR_ERR(hvs->core_clk);
> }
> +
> +   ret = clk_prepare_enable(hvs->core_clk);
> +   if (ret) {
> +   dev_err(>dev, "Couldn't enable core clock\n");
> +   return ret;
> +   }
> }
>  
> if (!hvs->hvs5)
> @@ -681,6 +687,8 @@ static void vc4_hvs_unbind(struct device *dev, struct 
> device *master,
> drm_mm_takedown(>hvs->dlist_mm);
> drm_mm_takedown(>hvs->lbm_mm);
>  
> +   clk_prepare_enable(vc4->hvs->core_clk);

I'm sorry. Change to clk_disable_unprepare(vc4->hvs->core_clk);

> +
> vc4->hvs = NULL;
>  }
> 
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH v4 77/78] drm/vc4: drv: Support BCM2711

2020-09-01 Thread Maxime Ripard
Hi Dave,

On Tue, Jul 28, 2020 at 04:30:16PM +0100, Dave Stevenson wrote:
> > @@ -681,10 +684,14 @@ int vc4_kms_load(struct drm_device *dev)
> > struct vc4_load_tracker_state *load_state;
> > int ret;
> >
> > -   /* Start with the load tracker enabled. Can be disabled through the
> > -* debugfs load_tracker file.
> > -*/
> > -   vc4->load_tracker_enabled = true;
> > +   if (!of_device_is_compatible(dev->dev->of_node, 
> > "brcm,bcm2711-vc5")) {
> 
> Is it better to look up the compatible string, or pass something via
> the .data element of the of_device_id table? Probably down to personal
> preference?

It's pretty much equivalent, so I'm not sure one is arguably better than
the other. However, checking for the compatible can be pretty cumbersome
when you have to do it repeatedly (like we do in the HDMI controller),
and when you don't do it a lot, having a structure associated with the
compatible is also fairly cumbersome.

> > +   vc4->load_tracker_available = true;
> > +
> > +   /* Start with the load tracker enabled. Can be
> > +* disabled through the debugfs load_tracker file.
> > +*/
> > +   vc4->load_tracker_enabled = true;
> > +   }
> >
> > sema_init(>async_modeset, 1);
> >
> > @@ -698,8 +705,14 @@ int vc4_kms_load(struct drm_device *dev)
> > return ret;
> > }
> >
> > -   dev->mode_config.max_width = 2048;
> > -   dev->mode_config.max_height = 2048;
> > +   if (of_device_is_compatible(dev->dev->of_node, "brcm,bcm2711-vc5")) 
> > {
> 
> We're making the same of_device_is_compatible call twice within
> vc4_kms_load. Set a flag based on it and check that instead?

Good idea, thanks!
Maxime




Re: [PATCH] dma-buf: fix kernel-doc warning in dma-fence.c

2020-09-01 Thread Randy Dunlap
On 9/1/20 6:37 AM, Christian König wrote:
> Am 01.09.20 um 15:32 schrieb Daniel Vetter:
>> On Mon, Aug 31, 2020 at 12:02:03PM +0200, Christian König wrote:
>>> Am 31.08.20 um 06:17 schrieb Randy Dunlap:
 Add @cookie to dma_fence_end_signalling() to prevent kernel-doc
 warning in drivers/dma-buf/dma-fence.c:

 ../drivers/dma-buf/dma-fence.c:291: warning: Function parameter or member 
 'cookie' not described in 'dma_fence_end_signalling'

 Signed-off-by: Randy Dunlap 
 Cc: Sumit Semwal 
 Cc: Gustavo Padovan 
 Cc: Christian König 
 Cc: linux-me...@vger.kernel.org
 Cc: dri-devel@lists.freedesktop.org
>>> Acked-by: Christian König 
>> Will you merge these two to drm-misc-fixes or should someone else?
> 
> I was wondering the same thing and just waiting for Randy to reply with 
> please pick them up or I'm going to push them because I have commit access.

I didn't realize that was needed, but anyway, Christian, please apply these 2
dma-buf kernel-doc patches.

thanks.

> Regards,
> Christian.
> 
>>
>> Always a bit confusing when maintainers reply with acks/r-b but not what
>> they'll do with the patch :-)

Agreed.

>> Cheers, Daniel
>>
 ---
    drivers/dma-buf/dma-fence.c |    1 +
    1 file changed, 1 insertion(+)

 --- lnx-59-rc3.orig/drivers/dma-buf/dma-fence.c
 +++ lnx-59-rc3/drivers/dma-buf/dma-fence.c
 @@ -283,6 +283,7 @@ EXPORT_SYMBOL(dma_fence_begin_signalling
    /**
     * dma_fence_end_signalling - end a critical DMA fence signalling 
 section
 + * @cookie: opaque cookie from dma_fence_begin_signalling()
     *
     * Closes a critical section annotation opened by 
 dma_fence_begin_signalling().
     */
>>> ___
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org


-- 
~Randy
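
For reference, the annotation being documented is used roughly like this (a minimal sketch; the surrounding driver context is assumed):

#include <linux/dma-fence.h>

static void example_signalling_path(struct dma_fence *fence)
{
	/* Opaque cookie returned by the begin annotation... */
	bool cookie = dma_fence_begin_signalling();

	/* ... critical section that must not block on fence signalling ... */
	dma_fence_signal(fence);

	/* ...and passed back when closing the section. */
	dma_fence_end_signalling(cookie);
}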



Re: [PATCH v4 75/78] dt-bindings: display: vc4: hdmi: Add BCM2711 HDMI controllers bindings

2020-09-01 Thread Maxime Ripard
Hi,

On Tue, Sep 01, 2020 at 01:45:07PM +0900, Chanwoo Choi wrote:
> Hi Maxime,
> 
> On 7/9/20 2:42 AM, Maxime Ripard wrote:
> > The HDMI controllers found in the BCM2711 SoC need some adjustments to the
> > bindings, especially since the registers have been shuffled around in more
> > register ranges.
> > 
> > Reviewed-by: Rob Herring 
> > Signed-off-by: Maxime Ripard 
> > ---
> >  Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml | 109 
> > -
> >  1 file changed, 109 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml 
> > b/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml
> > new file mode 100644
> > index ..6091fe3d315b
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/brcm,bcm2711-hdmi.yaml
> > @@ -0,0 +1,109 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/display/brcm,bcm2711-hdmi.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Broadcom BCM2711 HDMI Controller Device Tree Bindings
> > +
> > +maintainers:
> > +  - Eric Anholt 
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - brcm,bcm2711-hdmi0
> > +  - brcm,bcm2711-hdmi1
> > +
> > +  reg:
> > +items:
> > +  - description: HDMI controller register range
> > +  - description: DVP register range
> > +  - description: HDMI PHY register range
> > +  - description: Rate Manager register range
> > +  - description: Packet RAM register range
> > +  - description: Metadata RAM register range
> > +  - description: CSC register range
> > +  - description: CEC register range
> > +  - description: HD register range
> > +
> > +  reg-names:
> > +items:
> > +  - const: hdmi
> > +  - const: dvp
> > +  - const: phy
> > +  - const: rm
> > +  - const: packet
> > +  - const: metadata
> > +  - const: csc
> > +  - const: cec
> > +  - const: hd
> > +
> > +  clocks:
> > +description: The HDMI state machine clock
> 
> I'm not sure the following description is correct.
> But, this description doesn't contain the information of audio clock.
> 
>   description: The HDMI state machine and audio clock
> 
> > +
> > +  clock-names:
> > +const: hdmi
> 
> This patch is missing the following clock information for audio clock.
> 
>   const: clk-108M
> 
> > +
> > +  ddc:
> > +allOf:
> > +  - $ref: /schemas/types.yaml#/definitions/phandle
> > +description: >
> > +  Phandle of the I2C controller used for DDC EDID probing
> > +
> > +  hpd-gpios:
> > +description: >
> > +  The GPIO pin for the HDMI hotplug detect (if it doesn't appear
> > +  as an interrupt/status bit in the HDMI controller itself)
> > +
> > +  dmas:
> > +maxItems: 1
> > +description: >
> > +  Should contain one entry pointing to the DMA channel used to
> > +  transfer audio data.
> > +
> > +  dma-names:
> > +const: audio-rx
> > +
> > +  resets:
> > +maxItems: 1
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - reg-names
> > +  - clocks
> > +  - resets
> > +  - ddc
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +  - |
> > +hdmi0: hdmi@7ef00700 {
> > +compatible = "brcm,bcm2711-hdmi0";
> > +reg = <0x7ef00700 0x300>,
> > +  <0x7ef00300 0x200>,
> > +  <0x7ef00f00 0x80>,
> > +  <0x7ef00f80 0x80>,
> > +  <0x7ef01b00 0x200>,
> > +  <0x7ef01f00 0x400>,
> > +  <0x7ef00200 0x80>,
> > +  <0x7ef04300 0x100>,
> > +  <0x7ef2 0x100>;
> > +reg-names = "hdmi",
> > +"dvp",
> > +"phy",
> > +"rm",
> > +"packet",
> > +"metadata",
> > +"csc",
> > +"cec",
> > +"hd";
> > +clocks = <_clocks 13>;
> > +clock-names = "hdmi";
> 
> Also, this example doesn't include the instance of audio clock.
> Need to edit them as following:
> 
>   clock-names = "hdmi", "clk-108M";
>   clocks = <_clocks 13>, < 0>;

Indeed, thanks for pointing it out

Maxime




Re: [PATCH V2 3/8] drm/msm: Unconditionally call dev_pm_opp_of_remove_table()

2020-09-01 Thread Viresh Kumar
On 01-09-20, 13:01, Rajendra Nayak wrote:
> So FWIU, dpu_unbind() gets called even when dpu_bind() fails for some reason.

Ahh, I see.

> I tried to address that earlier [1] which I realized did not land.

I don't think that patch was required, as you can call
dev_pm_opp_put_clkname() multiple times and it will return without any
errors/crash.

> But with these changes
> it will be even more broken unless we identify if we failed dpu_bind() before
> adding the OPP table, while adding it, or all went well with opps and handle 
> things
> accordingly in dpu_unbind.

Maybe not as dev_pm_opp_of_remove_table() can be called multiple times
as well without any errors or crash.

> [1] https://lore.kernel.org/patchwork/patch/1275632/

-- 
viresh


Re: [PATCH v4 29/78] drm/vc4: crtc: Add a delay after disabling the PixelValve output

2020-09-01 Thread Maxime Ripard
Hi Stefan

On Tue, Aug 25, 2020 at 11:30:58PM +0200, Stefan Wahren wrote:
> Am 25.08.20 um 17:06 schrieb Maxime Ripard:
> > Hi Stefan,
> >
> > On Wed, Jul 29, 2020 at 05:50:31PM +0200, Stefan Wahren wrote:
> >> Am 29.07.20 um 16:42 schrieb Maxime Ripard:
> >>> Hi,
> >>>
> >>> On Wed, Jul 29, 2020 at 03:09:21PM +0100, Dave Stevenson wrote:
>  On Wed, 8 Jul 2020 at 18:43, Maxime Ripard  wrote:
> > In order to avoid pixels getting stuck in the (unflushable) FIFO between
> > the HVS and the PV, we need to add some delay after disabling the PV 
> > output
> > and before disabling the HDMI controller. 20ms seems to be good enough 
> > so
> > let's use that.
> >
> > Signed-off-by: Maxime Ripard 
> > ---
> >  drivers/gpu/drm/vc4/vc4_crtc.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c 
> > b/drivers/gpu/drm/vc4/vc4_crtc.c
> > index d0b326e1df0a..7b178d67187f 100644
> > --- a/drivers/gpu/drm/vc4/vc4_crtc.c
> > +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
> > @@ -403,6 +403,8 @@ static void vc4_crtc_atomic_disable(struct drm_crtc 
> > *crtc,
> > ret = wait_for(!(CRTC_READ(PV_V_CONTROL) & PV_VCONTROL_VIDEN), 
> > 1);
> > WARN_ONCE(ret, "Timeout waiting for !PV_VCONTROL_VIDEN\n");
> >
> > +   mdelay(20);
>  mdelay for 20ms seems a touch unfriendly as it's a busy wait. Can we
>  not msleep instead?
> >>> Since the timing was fairly critical, sleeping didn't seem like a good
> >>> solution since there's definitely some chance you overshoot and end up
> >>> with a higher time than the one you targeted.
> >> usleep_range(min, max) isn't a solution?
> > My understanding of usleep_range was that you can still overshoot, even
> > though it's backed by an HR timer so the resolution is not a jiffy. Are
> > we certain that we're going to be in that range?
> 
> you are right there is no guarantee about the upper wake up time.
> 
> And it's not worth the effort to poll the FIFO state until its empty
> (using 20 ms as timeout)?

I know this isn't really a great argument there, but getting this to
work has been quite painful, and the timing is very sensitive. If we
fail to wait for enough time, there's going to be a pixel shift that we
can't get rid of unless we reboot, which is pretty bad (and would fail
any CI test that checks for the output integrity).

I know busy-looping for 20ms isn't ideal, but it's not really in a
hot-path (it's only done when changing a mode), with the sync time of
the display likely to be much more than that, and if it can avoid having
to look into it ever again or avoid random failures, I'd say it's worth
it.

Maxime
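
For completeness, the "poll the FIFO with a 20 ms timeout" alternative Stefan floats would look roughly like the sketch below (hypothetical: the status register and bit are made up, since the thread doesn't identify a FIFO state bit the driver could key off):

#include <linux/bits.h>
#include <linux/io.h>
#include <linux/iopoll.h>

#define EXAMPLE_FIFO_EMPTY	BIT(0)	/* made-up status bit */

static int example_wait_fifo_empty(void __iomem *fifo_status_reg)
{
	u32 val;

	/* Poll every 100 us, give up after 20 ms. */
	return readl_poll_timeout(fifo_status_reg, val,
				  val & EXAMPLE_FIFO_EMPTY, 100, 20000);
}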




Re: [PATCH v4 03/78] drm/vc4: hvs: Boost the core clock during modeset

2020-09-01 Thread Chanwoo Choi
Hi Maxime,

On 7/9/20 2:41 AM, Maxime Ripard wrote:
> In order to prevent timeouts and stalls in the pipeline, the core clock
> needs to be maxed at 500MHz during a modeset on the BCM2711.
> 
> Reviewed-by: Eric Anholt 
> Signed-off-by: Maxime Ripard 
> ---
>  drivers/gpu/drm/vc4/vc4_drv.h |  2 ++
>  drivers/gpu/drm/vc4/vc4_hvs.c |  9 +
>  drivers/gpu/drm/vc4/vc4_kms.c |  9 +
>  3 files changed, 20 insertions(+)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
> index e4cde1f9224b..6358f6ca8d56 100644
> --- a/drivers/gpu/drm/vc4/vc4_drv.h
> +++ b/drivers/gpu/drm/vc4/vc4_drv.h
> @@ -320,6 +320,8 @@ struct vc4_hvs {
>   void __iomem *regs;
>   u32 __iomem *dlist;
>  
> + struct clk *core_clk;
> +
>   /* Memory manager for CRTCs to allocate space in the display
>* list.  Units are dwords.
>*/
> diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
> index 836d8799d79e..091fdf4908aa 100644
> --- a/drivers/gpu/drm/vc4/vc4_hvs.c
> +++ b/drivers/gpu/drm/vc4/vc4_hvs.c
> @@ -19,6 +19,7 @@
>   * each CRTC.
>   */
>  
> +#include 
>  #include 
>  #include 
>  
> @@ -540,6 +541,14 @@ static int vc4_hvs_bind(struct device *dev, struct 
> device *master, void *data)
>   hvs->regset.regs = hvs_regs;
>   hvs->regset.nregs = ARRAY_SIZE(hvs_regs);
>  
> + if (hvs->hvs5) {
> + hvs->core_clk = devm_clk_get(>dev, NULL);
> + if (IS_ERR(hvs->core_clk)) {
> + dev_err(>dev, "Couldn't get core clock\n");
> + return PTR_ERR(hvs->core_clk);
> + }
> + }
> +
>   if (!hvs->hvs5)
>   hvs->dlist = hvs->regs + SCALER_DLIST_START;
>   else
> diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
> index 08318e69061b..210cc2408087 100644
> --- a/drivers/gpu/drm/vc4/vc4_kms.c
> +++ b/drivers/gpu/drm/vc4/vc4_kms.c
> @@ -11,6 +11,8 @@
>   * crtc, HDMI encoder).
>   */
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -149,6 +151,7 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  {
>   struct drm_device *dev = state->dev;
>   struct vc4_dev *vc4 = to_vc4_dev(dev);
> + struct vc4_hvs *hvs = vc4->hvs;
>   struct vc4_crtc *vc4_crtc;
>   int i;
>  
> @@ -160,6 +163,9 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>   vc4_hvs_mask_underrun(dev, vc4_crtc->channel);
>   }
>  
> + if (vc4->hvs->hvs5)
> + clk_set_min_rate(hvs->core_clk, 500000000);
> +
>   drm_atomic_helper_wait_for_fences(dev, state, false);
>  
>   drm_atomic_helper_wait_for_dependencies(state);
> @@ -182,6 +188,9 @@ vc4_atomic_complete_commit(struct drm_atomic_state *state)
>  
>   drm_atomic_helper_commit_cleanup_done(state);
>  
> + if (vc4->hvs->hvs5)
> + clk_set_min_rate(hvs->core_clk, 0);
> +
>   drm_atomic_state_put(state);
>  
>   up(>async_modeset);
> 

This patch doesn't control the enable/disable of core_clk, so I think it
needs to handle the clock as follows:

diff --git a/drivers/gpu/drm/vc4/vc4_hvs.c b/drivers/gpu/drm/vc4/vc4_hvs.c
index 4ef88c0b51ab..355d67fd8beb 100644
--- a/drivers/gpu/drm/vc4/vc4_hvs.c
+++ b/drivers/gpu/drm/vc4/vc4_hvs.c
@@ -588,6 +588,12 @@ static int vc4_hvs_bind(struct device *dev, struct device 
*master, void *data)
dev_err(>dev, "Couldn't get core clock\n");
return PTR_ERR(hvs->core_clk);
}
+
+   ret = clk_prepare_enable(hvs->core_clk);
+   if (ret) {
+   dev_err(>dev, "Couldn't enable core clock\n");
+   return ret;
+   }
}
 
if (!hvs->hvs5)
@@ -681,6 +687,8 @@ static void vc4_hvs_unbind(struct device *dev, struct 
device *master,
drm_mm_takedown(>hvs->dlist_mm);
drm_mm_takedown(>hvs->lbm_mm);
 
+   clk_prepare_enable(vc4->hvs->core_clk);
+
vc4->hvs = NULL;
 }



-- 
Best Regards,
Chanwoo Choi
Samsung Electronics
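
Putting the original patch, this review, and the follow-up correction together, the intended pattern is roughly the following sketch (names taken from the patch, error handling trimmed, illustrative only):

#include <linux/clk.h>

static int example_hvs_bind(struct vc4_hvs *hvs)
{
	/* Keep the core clock prepared/enabled for the device's lifetime. */
	return clk_prepare_enable(hvs->core_clk);
}

static void example_commit(struct vc4_hvs *hvs)
{
	/* Raise the floor to 500 MHz only for the duration of the commit. */
	clk_set_min_rate(hvs->core_clk, 500000000);
	/* ... perform the atomic commit ... */
	clk_set_min_rate(hvs->core_clk, 0);
}

static void example_hvs_unbind(struct vc4_hvs *hvs)
{
	clk_disable_unprepare(hvs->core_clk);
}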


Re: [PATCH v4 29/78] drm/vc4: crtc: Add a delay after disabling the PixelValve output

2020-09-01 Thread Stefan Wahren
Hi Maxime,

Am 01.09.20 um 11:58 schrieb Maxime Ripard:
> Hi Stefan
>
> On Tue, Aug 25, 2020 at 11:30:58PM +0200, Stefan Wahren wrote:
>> Am 25.08.20 um 17:06 schrieb Maxime Ripard:
>>> Hi Stefan,
>>>
>>> On Wed, Jul 29, 2020 at 05:50:31PM +0200, Stefan Wahren wrote:
 Am 29.07.20 um 16:42 schrieb Maxime Ripard:
> Hi,
>
> On Wed, Jul 29, 2020 at 03:09:21PM +0100, Dave Stevenson wrote:
>> On Wed, 8 Jul 2020 at 18:43, Maxime Ripard  wrote:
>>> In order to avoid pixels getting stuck in the (unflushable) FIFO between
>>> the HVS and the PV, we need to add some delay after disabling the PV 
>>> output
>>> and before disabling the HDMI controller. 20ms seems to be good enough 
>>> so
>>> let's use that.
>>>
>>> Signed-off-by: Maxime Ripard 
>>> ---
>>>  drivers/gpu/drm/vc4/vc4_crtc.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c 
>>> b/drivers/gpu/drm/vc4/vc4_crtc.c
>>> index d0b326e1df0a..7b178d67187f 100644
>>> --- a/drivers/gpu/drm/vc4/vc4_crtc.c
>>> +++ b/drivers/gpu/drm/vc4/vc4_crtc.c
>>> @@ -403,6 +403,8 @@ static void vc4_crtc_atomic_disable(struct drm_crtc 
>>> *crtc,
>>> ret = wait_for(!(CRTC_READ(PV_V_CONTROL) & PV_VCONTROL_VIDEN), 
>>> 1);
>>> WARN_ONCE(ret, "Timeout waiting for !PV_VCONTROL_VIDEN\n");
>>>
>>> +   mdelay(20);
>> mdelay for 20ms seems a touch unfriendly as it's a busy wait. Can we
>> not msleep instead?
> Since the timing was fairly critical, sleeping didn't seem like a good
> solution since there's definitely some chance you overshoot and end up
> with a higher time than the one you targeted.
 usleep_range(min, max) isn't a solution?
>>> My understanding of usleep_range was that you can still overshoot, even
>>> though it's backed by an HR timer so the resolution is not a jiffy. Are
>>> we certain that we're going to be in that range?
>> you are right there is no guarantee about the upper wake up time.
>>
>> And it's not worth the effort to poll the FIFO state until its empty
>> (using 20 ms as timeout)?
> I know this isn't really a great argument there, but getting this to
> work has been quite painful, and the timing is very sensitive. If we
> fail to wait for enough time, there's going to be a pixel shift that we
> can't get rid of unless we reboot, which is pretty bad (and would fail
> any CI test that checks for the output integrity).
>
> I know busy-looping for 20ms isn't ideal, but it's not really in a
> hot-path (it's only done when changing a mode), with the sync time of
> the display likely to be much more than that, and if it can avoid having
> to look into it ever again or avoid random failures, I'd say it's worth
> it.

I don't want to delay this series.

Could you please add a small comment to the delay to clarify the timing
is very sensitive?

Thanks

>
> Maxime



Re: [PATCH] drm/radeon: Reset ASIC if suspend is not managed by platform firmware

2020-09-01 Thread Alex Deucher
On Tue, Sep 1, 2020 at 12:21 PM Kai-Heng Feng
 wrote:
>
>
>
> > On Sep 1, 2020, at 22:19, Alex Deucher  wrote:
> >
> > On Tue, Sep 1, 2020 at 3:32 AM Kai-Heng Feng
> >  wrote:
> >>
> >> Suspend with s2idle or by the following steps cause screen frozen:
> >> # echo devices > /sys/power/pm_test
> >> # echo freeze > /sys/power/mem
> >>
> >> [  289.625461] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait 
> >> timed out.
> >> [  289.625494] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed 
> >> testing IB on ring 5 (-110).
> >>
> >> The issue doesn't happen on traditional S3, probably because firmware or
> >> hardware provides extra power management.
> >>
> >> Inspired by Daniel Drake's patch [1] on amdgpu, using a similar approach
> >> can fix the issue.
> >
> > It doesn't actually fix the issue.  The device is never powered down
> > so you are using more power than you would if you did not suspend in
> > the first place.  The reset just works around the fact that the device
> > is never powered down.
>
> So how do we properly suspend/resume the device without help from platform 
> firmware?

I guess you don't?

Alex


>
> Kai-Heng
>
> >
> > Alex
> >
> >>
> >> [1] https://patchwork.freedesktop.org/patch/335839/
> >>
> >> Signed-off-by: Kai-Heng Feng 
> >> ---
> >> drivers/gpu/drm/radeon/radeon_device.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
> >> b/drivers/gpu/drm/radeon/radeon_device.c
> >> index 266e3cbbd09b..df823b9ad79f 100644
> >> --- a/drivers/gpu/drm/radeon/radeon_device.c
> >> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> >> @@ -33,6 +33,7 @@
> >> #include 
> >> #include 
> >> #include 
> >> +#include 
> >>
> >> #include 
> >> #include 
> >> @@ -1643,6 +1644,8 @@ int radeon_suspend_kms(struct drm_device *dev, bool 
> >> suspend,
> >>rdev->asic->asic_reset(rdev, true);
> >>pci_restore_state(dev->pdev);
> >>} else if (suspend) {
> >> +   if (pm_suspend_no_platform())
> >> +   rdev->asic->asic_reset(rdev, true);
> >>/* Shut down the device */
> >>pci_disable_device(dev->pdev);
> >>pci_set_power_state(dev->pdev, PCI_D3hot);
> >> --
> >> 2.17.1
> >>
> >> ___
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>


Re: [PATCH 06/19] drm/msm/gpu: add dev_to_gpu() helper

2020-09-01 Thread Rob Clark
On Mon, Aug 31, 2020 at 9:32 PM Bjorn Andersson
 wrote:
>
> On Thu 13 Aug 21:41 CDT 2020, Rob Clark wrote:
>
> > From: Rob Clark 
> >
> > In a later patch, the drvdata will not directly be 'struct msm_gpu *',
> > so add a helper to reduce the churn.
> >
> > Signed-off-by: Rob Clark 
> > ---
> >  drivers/gpu/drm/msm/adreno/adreno_device.c | 10 --
> >  drivers/gpu/drm/msm/msm_gpu.c  |  6 +++---
> >  drivers/gpu/drm/msm/msm_gpu.h  |  5 +
> >  3 files changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 9eeb46bf2a5d..26664e1b30c0 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -282,7 +282,7 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
> >   int ret;
> >
> >   if (pdev)
> > - gpu = platform_get_drvdata(pdev);
> > + gpu = dev_to_gpu(>dev);
> >
> >   if (!gpu) {
> >   dev_err_once(dev->dev, "no GPU device was found\n");
> > @@ -425,7 +425,7 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> >  static void adreno_unbind(struct device *dev, struct device *master,
> >   void *data)
> >  {
> > - struct msm_gpu *gpu = dev_get_drvdata(dev);
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >
> >   pm_runtime_force_suspend(dev);
> >   gpu->funcs->destroy(gpu);
> > @@ -490,16 +490,14 @@ static const struct of_device_id dt_match[] = {
> >  #ifdef CONFIG_PM
> >  static int adreno_resume(struct device *dev)
> >  {
> > - struct platform_device *pdev = to_platform_device(dev);
> > - struct msm_gpu *gpu = platform_get_drvdata(pdev);
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >
> >   return gpu->funcs->pm_resume(gpu);
> >  }
> >
> >  static int adreno_suspend(struct device *dev)
> >  {
> > - struct platform_device *pdev = to_platform_device(dev);
> > - struct msm_gpu *gpu = platform_get_drvdata(pdev);
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >
> >   return gpu->funcs->pm_suspend(gpu);
> >  }
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index d5645472b25d..6aa9e04e52e7 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -24,7 +24,7 @@
> >  static int msm_devfreq_target(struct device *dev, unsigned long *freq,
> >   u32 flags)
> >  {
> > - struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >   struct dev_pm_opp *opp;
> >
> >   opp = devfreq_recommended_opp(dev, freq, flags);
> > @@ -45,7 +45,7 @@ static int msm_devfreq_target(struct device *dev, 
> > unsigned long *freq,
> >  static int msm_devfreq_get_dev_status(struct device *dev,
> >   struct devfreq_dev_status *status)
> >  {
> > - struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >   ktime_t time;
> >
> >   if (gpu->funcs->gpu_get_freq)
> > @@ -64,7 +64,7 @@ static int msm_devfreq_get_dev_status(struct device *dev,
> >
> >  static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long 
> > *freq)
> >  {
> > - struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
> > + struct msm_gpu *gpu = dev_to_gpu(dev);
> >
> >   if (gpu->funcs->gpu_get_freq)
> >   *freq = gpu->funcs->gpu_get_freq(gpu);
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index 0db117a7339b..8bda7beaed4b 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -141,6 +141,11 @@ struct msm_gpu {
> >   struct msm_gpu_state *crashstate;
> >  };
> >
> > +static inline struct msm_gpu *dev_to_gpu(struct device *dev)
>
> That's a fairly generic name for a driver-global helper :)

tbf, it is only global to the GPU part of the driver.

thanks for the review

BR,
-R

> Reviewed-by: Bjorn Andersson 
>
> Regards,
> Bjorn
>
> > +{
> > + return dev_get_drvdata(dev);
> > +}
> > +
> >  /* It turns out that all targets use the same ringbuffer size */
> >  #define MSM_GPU_RINGBUFFER_SZ SZ_32K
> >  #define MSM_GPU_RINGBUFFER_BLKSIZE 32
> > --
> > 2.26.2
> >
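
The churn reduction comes from funneling every drvdata lookup through the
single dev_to_gpu() helper: if a later patch changes what the drvdata points
at, only the helper body has to follow. A rough sketch of that payoff -- the
wrapper struct below is hypothetical and not part of this series, it only
illustrates the idea:

	struct msm_gpu_holder {			/* hypothetical wrapper */
		struct msm_gpu *gpu;
	};

	static inline struct msm_gpu *dev_to_gpu(struct device *dev)
	{
		struct msm_gpu_holder *holder = dev_get_drvdata(dev);

		return holder ? holder->gpu : NULL;
	}

Callers written against dev_to_gpu(dev), like the PM and devfreq hooks above,
keep working unchanged.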


[PATCH 5.8 213/255] drm/modeset-lock: Take the modeset BKL for legacy drivers

2020-09-01 Thread Greg Kroah-Hartman
From: Daniel Vetter 

commit 77ef38574beb3e0b414db48e9c0f04633df68ba6 upstream.

This fell off in the conversion in

commit 9bcaa3fe58ab7559e71df798bcff6e0795158695
Author: Michal Orzel 
Date:   Tue Apr 28 19:10:04 2020 +0200

drm: Replace drm_modeset_lock/unlock_all with DRM_MODESET_LOCK_ALL_* helpers

but it's caught by the drm_warn_on_modeset_not_all_locked() that the
legacy modeset code uses. Since this is the BKL and it's unclear what
it all protects, play it safe and grab it again for legacy drivers.

Unfortunately this means we need to sprinkle a few more #includes
around.

Also we need to add the drm_device as a parameter to the _END macro.

Finally remove the mutex_lock() from setcrtc, since that's now done by
the macro.
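
With the dev parameter added, every converted call site follows roughly this
pattern (a sketch, not a call site copied from the patch):

	struct drm_modeset_acquire_ctx ctx;
	int ret;

	DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx,
				   DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
	/* legacy modeset state is touched under all CRTC/plane locks, and
	 * now also under dev->mode_config.mutex for non-atomic drivers
	 */
	DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);

	return ret;

DRM_MODESET_LOCK_ALL_END() needs dev so it can drop the mutex that the BEGIN
half now takes for legacy drivers.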

Cc: Alex Deucher 
Fixes: 9bcaa3fe58ab ("drm: Replace drm_modeset_lock/unlock_all with 
DRM_MODESET_LOCK_ALL_* helpers")
Cc: Michal Orzel 
Cc: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.8+
Signed-off-by: Daniel Vetter 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20200814093842.3048472-1-daniel.vet...@ffwll.ch
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/gpu/drm/drm_atomic_helper.c |7 ---
 drivers/gpu/drm/drm_color_mgmt.c|2 +-
 drivers/gpu/drm/drm_crtc.c  |4 +---
 drivers/gpu/drm/drm_mode_object.c   |4 ++--
 drivers/gpu/drm/drm_plane.c |2 +-
 include/drm/drm_modeset_lock.h  |9 +++--
 6 files changed, 16 insertions(+), 12 deletions(-)

--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -3105,7 +3106,7 @@ void drm_atomic_helper_shutdown(struct d
if (ret)
DRM_ERROR("Disabling all crtc's during unload failed with 
%i\n", ret);
 
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
 }
 EXPORT_SYMBOL(drm_atomic_helper_shutdown);
 
@@ -3245,7 +3246,7 @@ struct drm_atomic_state *drm_atomic_help
}
 
 unlock:
-   DRM_MODESET_LOCK_ALL_END(ctx, err);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, err);
if (err)
return ERR_PTR(err);
 
@@ -3326,7 +3327,7 @@ int drm_atomic_helper_resume(struct drm_
 
 err = drm_atomic_helper_commit_duplicated_state(state, &ctx);
 
-   DRM_MODESET_LOCK_ALL_END(ctx, err);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, err);
drm_atomic_state_put(state);
 
return err;
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -294,7 +294,7 @@ int drm_mode_gamma_set_ioctl(struct drm_
 crtc->gamma_size, &ctx);
 
 out:
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
return ret;
 
 }
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -561,7 +561,6 @@ int drm_mode_setcrtc(struct drm_device *
if (crtc_req->mode_valid && !drm_lease_held(file_priv, plane->base.id))
return -EACCES;
 
-   mutex_lock(&crtc->dev->mode_config.mutex);
DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx,
   DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
 
@@ -728,8 +727,7 @@ out:
fb = NULL;
mode = NULL;
 
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
-   mutex_unlock(&crtc->dev->mode_config.mutex);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
 
return ret;
 }
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -428,7 +428,7 @@ int drm_mode_obj_get_properties_ioctl(st
 out_unref:
drm_mode_object_put(obj);
 out:
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
return ret;
 }
 
@@ -470,7 +470,7 @@ static int set_property_legacy(struct dr
break;
}
drm_property_change_valid_put(prop, ref);
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
+   DRM_MODESET_LOCK_ALL_END(dev, ctx, ret);
 
return ret;
 }
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -791,7 +791,7 @@ static int setplane_internal(struct drm_
  crtc_x, crtc_y, crtc_w, crtc_h,
  src_x, src_y, src_w, src_h, &ctx);
 
-   DRM_MODESET_LOCK_ALL_END(ctx, ret);
+   DRM_MODESET_LOCK_ALL_END(plane->dev, ctx, ret);
 
return ret;
 }
--- a/include/drm/drm_modeset_lock.h
+++ b/include/drm/drm_modeset_lock.h
@@ -164,6 +164,8 @@ int drm_modeset_lock_all_ctx(struct drm_
  * is 0, so no error checking is necessary
  */
 #define DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, flags, ret)   \
+   if (!drm_drv_uses_atomic_modeset(dev))  \
+   mutex_lock(&dev->mode_config.mutex);\

[PATCH 3/3] drm/msm/gpu: Add suspend/resume tracepoints

2020-09-01 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
I'm not sure if there is a better way to do no-arg tracepoints?  The
trace framework seems to go out of its way to make this difficult.
Or maybe there is a more obvious thing that I'm not seeing.
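
One option (just a sketch, not something this patch does) would be to share an
event class between the two dummy-argument events; it doesn't remove the dummy
argument, but it avoids duplicating the TP_STRUCT__entry/TP_fast_assign
boilerplate:

	DECLARE_EVENT_CLASS(msm_gpu_pm,
			TP_PROTO(int dummy),
			TP_ARGS(dummy),
			TP_STRUCT__entry(
				__field(u32, dummy)
				),
			TP_fast_assign(
				__entry->dummy = dummy;
				),
			TP_printk("%u", __entry->dummy)
	);

	DEFINE_EVENT(msm_gpu_pm, msm_gpu_suspend,
			TP_PROTO(int dummy), TP_ARGS(dummy));

	DEFINE_EVENT(msm_gpu_pm, msm_gpu_resume,
			TP_PROTO(int dummy), TP_ARGS(dummy));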

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  4 
 drivers/gpu/drm/msm/msm_gpu.c |  2 ++
 drivers/gpu/drm/msm/msm_gpu_trace.h   | 26 ++
 3 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index c5a3e4d4c007..2de280e45077 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -923,6 +923,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
 
gpu->needs_hw_init = true;
 
+   trace_msm_gpu_resume(0);
+
ret = a6xx_gmu_resume(a6xx_gpu);
if (ret)
return ret;
@@ -937,6 +939,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 
+   trace_msm_gpu_suspend(0);
+
devfreq_suspend_device(gpu->devfreq.devfreq);
 
return a6xx_gmu_stop(a6xx_gpu);
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index b02866527386..5ceb2a966a87 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -202,6 +202,7 @@ int msm_gpu_pm_resume(struct msm_gpu *gpu)
int ret;
 
DBG("%s", gpu->name);
+   trace_msm_gpu_resume(0);
 
ret = enable_pwrrail(gpu);
if (ret)
@@ -227,6 +228,7 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)
int ret;
 
DBG("%s", gpu->name);
+   trace_msm_gpu_suspend(0);
 
devfreq_suspend_device(gpu->devfreq.devfreq);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h 
b/drivers/gpu/drm/msm/msm_gpu_trace.h
index 1079fe551279..03e0c2536b94 100644
--- a/drivers/gpu/drm/msm/msm_gpu_trace.h
+++ b/drivers/gpu/drm/msm/msm_gpu_trace.h
@@ -140,6 +140,32 @@ TRACE_EVENT(msm_gem_purge_vmaps,
TP_printk("Purging %u vmaps", __entry->unmapped)
 );
 
+
+TRACE_EVENT(msm_gpu_suspend,
+   TP_PROTO(int dummy),
+   TP_ARGS(dummy),
+   TP_STRUCT__entry(
+   __field(u32, dummy)
+   ),
+   TP_fast_assign(
+   __entry->dummy = dummy;
+   ),
+   TP_printk("%u", __entry->dummy)
+);
+
+
+TRACE_EVENT(msm_gpu_resume,
+   TP_PROTO(int dummy),
+   TP_ARGS(dummy),
+   TP_STRUCT__entry(
+   __field(u32, dummy)
+   ),
+   TP_fast_assign(
+   __entry->dummy = dummy;
+   ),
+   TP_printk("%u", __entry->dummy)
+);
+
 #endif
 
 #undef TRACE_INCLUDE_PATH
-- 
2.26.2



[PATCH 2/3] drm/msm: Convert shrinker msgs to tracepoints

2020-09-01 Thread Rob Clark
From: Rob Clark 

This reduces the spam in dmesg when we start hitting the shrinker, and
replaces it with something we can put on a timeline while profiling or
debugging system issues.
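
Unlike the ratelimited printk, a disabled tracepoint compiles down to a
patched-out static branch, so the remaining cost on the shrinker path is
essentially the freed > 0 test. If even building the argument were worth
skipping, the generated trace_<event>_enabled() helpers could gate it -- a
sketch, not part of this patch:

	if (freed > 0 && trace_msm_gem_purge_enabled())
		trace_msm_gem_purge(freed << PAGE_SHIFT);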

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_shrinker.c |  5 +++--
 drivers/gpu/drm/msm/msm_gpu_trace.h| 26 ++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c 
b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 722d61668a97..482576d7a39a 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -6,6 +6,7 @@
 
 #include "msm_drv.h"
 #include "msm_gem.h"
+#include "msm_gpu_trace.h"
 
 static bool msm_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
 {
@@ -87,7 +88,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct 
shrink_control *sc)
 mutex_unlock(&dev->struct_mutex);
 
if (freed > 0)
-   pr_info_ratelimited("Purging %lu bytes\n", freed << PAGE_SHIFT);
+   trace_msm_gem_purge(freed << PAGE_SHIFT);
 
return freed;
 }
@@ -123,7 +124,7 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned 
long event, void *ptr)
*(unsigned long *)ptr += unmapped;
 
if (unmapped > 0)
-   pr_info_ratelimited("Purging %u vmaps\n", unmapped);
+   trace_msm_gem_purge_vmaps(unmapped);
 
return NOTIFY_DONE;
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h 
b/drivers/gpu/drm/msm/msm_gpu_trace.h
index 07572ab179fa..1079fe551279 100644
--- a/drivers/gpu/drm/msm/msm_gpu_trace.h
+++ b/drivers/gpu/drm/msm/msm_gpu_trace.h
@@ -114,6 +114,32 @@ TRACE_EVENT(msm_gmu_freq_change,
TP_printk("freq=%u, perf_index=%u", __entry->freq, 
__entry->perf_index)
 );
 
+
+TRACE_EVENT(msm_gem_purge,
+   TP_PROTO(u32 bytes),
+   TP_ARGS(bytes),
+   TP_STRUCT__entry(
+   __field(u32, bytes)
+   ),
+   TP_fast_assign(
+   __entry->bytes = bytes;
+   ),
+   TP_printk("Purging %u bytes", __entry->bytes)
+);
+
+
+TRACE_EVENT(msm_gem_purge_vmaps,
+   TP_PROTO(u32 unmapped),
+   TP_ARGS(unmapped),
+   TP_STRUCT__entry(
+   __field(u32, unmapped)
+   ),
+   TP_fast_assign(
+   __entry->unmapped = unmapped;
+   ),
+   TP_printk("Purging %u vmaps", __entry->unmapped)
+);
+
 #endif
 
 #undef TRACE_INCLUDE_PATH
-- 
2.26.2



[PATCH 0/3] drm/msm: More GPU tracepoints

2020-09-01 Thread Rob Clark
From: Rob Clark 

Various extra tracepoints that I've been collecting.

Rob Clark (3):
  drm/msm/gpu: Add GPU freq_change traces
  drm/msm: Convert shrinker msgs to tracepoints
  drm/msm/gpu: Add suspend/resume tracepoints

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  |  3 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  4 ++
 drivers/gpu/drm/msm/msm_gem_shrinker.c |  5 +-
 drivers/gpu/drm/msm/msm_gpu.c  |  4 ++
 drivers/gpu/drm/msm/msm_gpu_trace.h| 83 ++
 5 files changed, 97 insertions(+), 2 deletions(-)

-- 
2.26.2



[PATCH 1/3] drm/msm/gpu: Add GPU freq_change traces

2020-09-01 Thread Rob Clark
From: Rob Clark 

Technically the GMU-specific one is a bit redundant, but it was useful
to track down a bug.
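
For reference, the frequency handed to the devfreq hook is in Hz;
msm_gpu_freq_change records it in MHz to match intel_gpu_freq_change, so e.g.
a 585000000 Hz OPP shows up as new_freq=585. The GMU variant logs the raw
value together with the perf index.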

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +++
 drivers/gpu/drm/msm/msm_gpu.c |  2 ++
 drivers/gpu/drm/msm/msm_gpu_trace.h   | 31 +++
 3 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 46a29e383bfd..ab1e9eb619e0 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -11,6 +11,7 @@
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"
 #include "msm_gem.h"
+#include "msm_gpu_trace.h"
 #include "msm_mmu.h"
 
 static void a6xx_gmu_fault(struct a6xx_gmu *gmu)
@@ -124,6 +125,8 @@ void a6xx_gmu_set_freq(struct msm_gpu *gpu, struct 
dev_pm_opp *opp)
gmu->current_perf_index = perf_index;
gmu->freq = gmu->gpu_freqs[perf_index];
 
+   trace_msm_gmu_freq_change(gmu->freq, perf_index);
+
/*
 * This can get called from devfreq while the hardware is idle. Don't
 * bring up the power if it isn't already active
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index d5645472b25d..b02866527386 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -32,6 +32,8 @@ static int msm_devfreq_target(struct device *dev, unsigned 
long *freq,
if (IS_ERR(opp))
return PTR_ERR(opp);
 
+   trace_msm_gpu_freq_change(dev_pm_opp_get_freq(opp));
+
if (gpu->funcs->gpu_set_freq)
gpu->funcs->gpu_set_freq(gpu, opp);
else
diff --git a/drivers/gpu/drm/msm/msm_gpu_trace.h 
b/drivers/gpu/drm/msm/msm_gpu_trace.h
index 122b84789238..07572ab179fa 100644
--- a/drivers/gpu/drm/msm/msm_gpu_trace.h
+++ b/drivers/gpu/drm/msm/msm_gpu_trace.h
@@ -83,6 +83,37 @@ TRACE_EVENT(msm_gpu_submit_retired,
__entry->start_ticks, __entry->end_ticks)
 );
 
+
+TRACE_EVENT(msm_gpu_freq_change,
+   TP_PROTO(u32 freq),
+   TP_ARGS(freq),
+   TP_STRUCT__entry(
+   __field(u32, freq)
+   ),
+   TP_fast_assign(
+   /* trace freq in MHz to match intel_gpu_freq_change, to 
make life easier
+* for userspace
+*/
+   __entry->freq = DIV_ROUND_UP(freq, 1000000);
+   ),
+   TP_printk("new_freq=%u", __entry->freq)
+);
+
+
+TRACE_EVENT(msm_gmu_freq_change,
+   TP_PROTO(u32 freq, u32 perf_index),
+   TP_ARGS(freq, perf_index),
+   TP_STRUCT__entry(
+   __field(u32, freq)
+   __field(u32, perf_index)
+   ),
+   TP_fast_assign(
+   __entry->freq = freq;
+   __entry->perf_index = perf_index;
+   ),
+   TP_printk("freq=%u, perf_index=%u", __entry->freq, 
__entry->perf_index)
+);
+
 #endif
 
 #undef TRACE_INCLUDE_PATH
-- 
2.26.2



[PATCH 3/3] drm/ttm: remove io_reserve_lru handling v2

2020-09-01 Thread Christian König
From: Christian König 

That is not used any more.

v2: keep the NULL checks in TTM.

Signed-off-by: Christian König 
Acked-by: Daniel Vetter 
---
 drivers/gpu/drm/ttm/ttm_bo.c   |  34 +
 drivers/gpu/drm/ttm/ttm_bo_util.c  | 113 +++--
 drivers/gpu/drm/ttm/ttm_bo_vm.c|  39 +++---
 drivers/gpu/drm/ttm/ttm_resource.c |   3 -
 include/drm/ttm/ttm_bo_api.h   |   1 -
 include/drm/ttm/ttm_bo_driver.h|   5 --
 include/drm/ttm/ttm_resource.h |  16 
 7 files changed, 24 insertions(+), 187 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 772c640a6046..89d8ab6edd40 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -263,11 +263,7 @@ static int ttm_bo_handle_move_mem(struct ttm_buffer_object 
*bo,
struct ttm_resource_manager *new_man = ttm_manager_type(bdev, 
mem->mem_type);
int ret;
 
-   ret = ttm_mem_io_lock(old_man, true);
-   if (unlikely(ret != 0))
-   goto out_err;
-   ttm_bo_unmap_virtual_locked(bo);
-   ttm_mem_io_unlock(old_man);
+   ttm_bo_unmap_virtual(bo);
 
/*
 * Create and bind a ttm if required.
@@ -538,7 +534,6 @@ static void ttm_bo_release(struct kref *kref)
struct ttm_buffer_object *bo =
container_of(kref, struct ttm_buffer_object, kref);
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_resource_manager *man = ttm_manager_type(bdev, 
bo->mem.mem_type);
size_t acc_size = bo->acc_size;
int ret;
 
@@ -556,9 +551,7 @@ static void ttm_bo_release(struct kref *kref)
bo->bdev->driver->release_notify(bo);
 
 drm_vma_offset_remove(bdev->vma_manager, &bo->base.vma_node);
-   ttm_mem_io_lock(man, false);
-   ttm_mem_io_free_vm(bo);
-   ttm_mem_io_unlock(man);
+   ttm_mem_io_free(bdev, &bo->mem);
}
 
if (!dma_resv_test_signaled_rcu(bo->base.resv, true) ||
@@ -648,8 +641,6 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
 
evict_mem = bo->mem;
evict_mem.mm_node = NULL;
-   evict_mem.bus.io_reserved_vm = false;
-   evict_mem.bus.io_reserved_count = 0;
evict_mem.bus.base = 0;
evict_mem.bus.offset = 0;
evict_mem.bus.addr = NULL;
@@ -1085,8 +1076,6 @@ static int ttm_bo_move_buffer(struct ttm_buffer_object 
*bo,
mem.num_pages = bo->num_pages;
mem.size = mem.num_pages << PAGE_SHIFT;
mem.page_alignment = bo->mem.page_alignment;
-   mem.bus.io_reserved_vm = false;
-   mem.bus.io_reserved_count = 0;
mem.bus.base = 0;
mem.bus.offset = 0;
mem.bus.addr = NULL;
@@ -1238,7 +1227,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
 INIT_LIST_HEAD(&bo->lru);
 INIT_LIST_HEAD(&bo->ddestroy);
 INIT_LIST_HEAD(&bo->swap);
-   INIT_LIST_HEAD(&bo->io_reserve_lru);
bo->bdev = bdev;
bo->type = type;
bo->num_pages = num_pages;
@@ -1247,8 +1235,6 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
bo->mem.num_pages = bo->num_pages;
bo->mem.mm_node = NULL;
bo->mem.page_alignment = page_alignment;
-   bo->mem.bus.io_reserved_vm = false;
-   bo->mem.bus.io_reserved_count = 0;
bo->mem.bus.base = 0;
bo->mem.bus.offset = 0;
bo->mem.bus.addr = NULL;
@@ -1554,25 +1540,13 @@ EXPORT_SYMBOL(ttm_bo_device_init);
  * buffer object vm functions.
  */
 
-void ttm_bo_unmap_virtual_locked(struct ttm_buffer_object *bo)
-{
-   struct ttm_bo_device *bdev = bo->bdev;
-
-   drm_vma_node_unmap(&bo->base.vma_node, bdev->dev_mapping);
-   ttm_mem_io_free_vm(bo);
-}
-
 void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo)
 {
struct ttm_bo_device *bdev = bo->bdev;
-   struct ttm_resource_manager *man = ttm_manager_type(bdev, 
bo->mem.mem_type);
 
-   ttm_mem_io_lock(man, false);
-   ttm_bo_unmap_virtual_locked(bo);
-   ttm_mem_io_unlock(man);
+   drm_vma_node_unmap(&bo->base.vma_node, bdev->dev_mapping);
+   ttm_mem_io_free(bdev, &bo->mem);
 }
-
-
 EXPORT_SYMBOL(ttm_bo_unmap_virtual);
 
 int ttm_bo_wait(struct ttm_buffer_object *bo,
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ee04716b2603..40ded10055d2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -91,122 +91,42 @@ int ttm_bo_move_ttm(struct ttm_buffer_object *bo,
 }
 EXPORT_SYMBOL(ttm_bo_move_ttm);
 
-int ttm_mem_io_lock(struct ttm_resource_manager *man, bool interruptible)
-{
-   if (likely(!man->use_io_reserve_lru))
-   return 0;
-
-   if (interruptible)
-   return mutex_lock_interruptible(&man->io_reserve_mutex);
-
-   mutex_lock(&man->io_reserve_mutex);
-   return 0;
-}
-
-void ttm_mem_io_unlock(struct ttm_resource_manager *man)
-{
-   if (likely(!man->use_io_reserve_lru))
-   return;
-
-   

[PATCH 2/3] drm/nouveau: move io_reserve_lru handling into the driver v5

2020-09-01 Thread Christian König
While working on TTM cleanups I've found that the io_reserve_lru used by
Nouveau is actually not working at all.

In general we should remove driver-specific handling from the memory
management, so this patch moves the io_reserve_lru handling into Nouveau
instead.

v2: don't call ttm_bo_unmap_virtual in nouveau_ttm_io_mem_reserve
v3: rebased and use both base and offset in the check
v4: fix small typos and test the patch
v5: rebased and keep the mem.bus init in TTM.
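
The driver-side shape of the handling is roughly the following (a simplified
sketch built on the list and mutex this patch adds; nouveau_reserve_bar() is a
hypothetical stand-in for whatever actually maps the BO):

	mutex_lock(&drm->ttm.io_reserve_mutex);
retry:
	ret = nouveau_reserve_bar(drm, reg);	/* hypothetical placeholder */
	if (ret == -ENOSPC) {
		struct nouveau_bo *nvbo;

		nvbo = list_first_entry_or_null(&drm->ttm.io_reserve_lru,
						struct nouveau_bo,
						io_reserve_lru);
		if (nvbo) {
			list_del_init(&nvbo->io_reserve_lru);
			drm_vma_node_unmap(&nvbo->bo.base.vma_node,
					   bdev->dev_mapping);
			nouveau_ttm_io_mem_free_locked(drm, &nvbo->bo.mem);
			goto retry;
		}
	}
	mutex_unlock(&drm->ttm.io_reserve_mutex);

I.e. when reserving BAR space runs out of room, evict the least recently used
io mapping tracked on the driver's own LRU and try again.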

Signed-off-by: Christian König 
Acked-by: Daniel Vetter 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c  | 101 --
 drivers/gpu/drm/nouveau/nouveau_bo.h  |   3 +
 drivers/gpu/drm/nouveau/nouveau_drv.h |   2 +
 drivers/gpu/drm/nouveau/nouveau_ttm.c |  44 ++-
 4 files changed, 127 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 9140387f30dc..f74988771ed8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -137,6 +137,7 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
struct nouveau_bo *nvbo = nouveau_bo(bo);
 
WARN_ON(nvbo->pin_refcnt > 0);
+   nouveau_bo_del_io_reserve_lru(bo);
nv10_bo_put_tile_region(dev, nvbo->tile, NULL);
 
/*
@@ -304,6 +305,7 @@ nouveau_bo_init(struct nouveau_bo *nvbo, u64 size, int 
align, u32 flags,
 
nvbo->bo.mem.num_pages = size >> PAGE_SHIFT;
nouveau_bo_placement_set(nvbo, flags, 0);
+   INIT_LIST_HEAD(&nvbo->io_reserve_lru);
 
 ret = ttm_bo_init(nvbo->bo.bdev, &nvbo->bo, size, type,
   &nvbo->placement, align >> PAGE_SHIFT, false,
@@ -574,6 +576,26 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
PAGE_SIZE, DMA_FROM_DEVICE);
 }
 
+void nouveau_bo_add_io_reserve_lru(struct ttm_buffer_object *bo)
+{
+   struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
+   struct nouveau_bo *nvbo = nouveau_bo(bo);
+
+   mutex_lock(&drm->ttm.io_reserve_mutex);
+   list_move_tail(&nvbo->io_reserve_lru, &drm->ttm.io_reserve_lru);
+   mutex_unlock(&drm->ttm.io_reserve_mutex);
+}
+
+void nouveau_bo_del_io_reserve_lru(struct ttm_buffer_object *bo)
+{
+   struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
+   struct nouveau_bo *nvbo = nouveau_bo(bo);
+
+   mutex_lock(&drm->ttm.io_reserve_mutex);
+   list_del_init(&nvbo->io_reserve_lru);
+   mutex_unlock(&drm->ttm.io_reserve_mutex);
+}
+
 int
 nouveau_bo_validate(struct nouveau_bo *nvbo, bool interruptible,
bool no_wait_gpu)
@@ -888,6 +910,8 @@ nouveau_bo_move_ntfy(struct ttm_buffer_object *bo, bool 
evict,
if (bo->destroy != nouveau_bo_del_ttm)
return;
 
+   nouveau_bo_del_io_reserve_lru(bo);
+
if (mem && new_reg->mem_type != TTM_PL_SYSTEM &&
mem->mem.page == nvbo->page) {
 list_for_each_entry(vma, &nvbo->vma_list, head) {
@@ -1018,17 +1042,42 @@ nouveau_bo_verify_access(struct ttm_buffer_object *bo, 
struct file *filp)
  filp->private_data);
 }
 
+static void
+nouveau_ttm_io_mem_free_locked(struct nouveau_drm *drm,
+  struct ttm_resource *reg)
+{
+   struct nouveau_mem *mem = nouveau_mem(reg);
+
+   if (drm->client.mem->oclass >= NVIF_CLASS_MEM_NV50) {
+   switch (reg->mem_type) {
+   case TTM_PL_TT:
+   if (mem->kind)
+   nvif_object_unmap_handle(&mem->mem.object);
+   break;
+   case TTM_PL_VRAM:
+   nvif_object_unmap_handle(&mem->mem.object);
+   break;
+   default:
+   break;
+   }
+   }
+}
+
 static int
 nouveau_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_resource 
*reg)
 {
struct nouveau_drm *drm = nouveau_bdev(bdev);
 struct nvkm_device *device = nvxx_device(&drm->client.device);
struct nouveau_mem *mem = nouveau_mem(reg);
+   int ret;
 
+   mutex_lock(&drm->ttm.io_reserve_mutex);
+retry:
switch (reg->mem_type) {
case TTM_PL_SYSTEM:
/* System memory */
-   return 0;
+   ret = 0;
+   goto out;
case TTM_PL_TT:
 #if IS_ENABLED(CONFIG_AGP)
if (drm->agp.bridge) {
@@ -1037,9 +1086,12 @@ nouveau_ttm_io_mem_reserve(struct ttm_bo_device *bdev, 
struct ttm_resource *reg)
reg->bus.is_iomem = !drm->agp.cma;
}
 #endif
-   if (drm->client.mem->oclass < NVIF_CLASS_MEM_NV50 || !mem->kind)
+   if (drm->client.mem->oclass < NVIF_CLASS_MEM_NV50 ||
+   !mem->kind) {
/* untiled */
+   ret = 0;
break;
+   }
fallthrough;/* tiled memory */
case TTM_PL_VRAM:
reg->bus.offset = reg->start << 

[PATCH 1/3] drm/ttm: make sure that we always zero init mem.bus v2

2020-09-01 Thread Christian König
We are trying to remove the io_lru handling and depend
on zero init base, offset and addr here.

v2: init addr as well
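
The explicit stores amount to zero-initializing the embedded bus descriptor;
a near-equivalent (hypothetical) spelling at each of the three sites would be

	memset(&mem.bus, 0, sizeof(mem.bus));	/* ttm_bo_move_buffer() case */

which would additionally clear the remaining bus fields such as is_iomem. The
individual assignments are kept here, matching the surrounding style.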

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index e3931e515906..772c640a6046 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -650,6 +650,9 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
evict_mem.mm_node = NULL;
evict_mem.bus.io_reserved_vm = false;
evict_mem.bus.io_reserved_count = 0;
+   evict_mem.bus.base = 0;
+   evict_mem.bus.offset = 0;
+   evict_mem.bus.addr = NULL;
 
 ret = ttm_bo_mem_space(bo, &placement, &evict_mem, ctx);
if (ret) {
@@ -1084,6 +1087,9 @@ static int ttm_bo_move_buffer(struct ttm_buffer_object 
*bo,
mem.page_alignment = bo->mem.page_alignment;
mem.bus.io_reserved_vm = false;
mem.bus.io_reserved_count = 0;
+   mem.bus.base = 0;
+   mem.bus.offset = 0;
+   mem.bus.addr = NULL;
mem.mm_node = NULL;
 
/*
@@ -1243,6 +1249,9 @@ int ttm_bo_init_reserved(struct ttm_bo_device *bdev,
bo->mem.page_alignment = page_alignment;
bo->mem.bus.io_reserved_vm = false;
bo->mem.bus.io_reserved_count = 0;
+   bo->mem.bus.base = 0;
+   bo->mem.bus.offset = 0;
+   bo->mem.bus.addr = NULL;
bo->moving = NULL;
bo->mem.placement = (TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_CACHED);
bo->acc_size = acc_size;
-- 
2.17.1
