Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-05-02 Thread Michel Dänzer
On 2018-04-29 01:56 AM, Ilia Mirkin wrote:
> On Sat, Apr 28, 2018 at 7:02 PM, Michel Dänzer  wrote:
>>
>> Unfortunately, there was an swiotlb regression (not directly related to
>> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
>> fixed in 4.17-rc1 and will be backported to 4.16.y.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.16.5=2c9dacf5bfe1e45d96dfe97cb71d2b717786a7b9
> 
> This guy? Didn't help. I'm running 4.16.4 right now.

Try https://patchwork.freedesktop.org/patch/219765/ .


>>> We now have *two* broken releases, v4.15 and v4.16 (anything that
>>> spews error messages and stack traces ad-infinitum in dmesg is, by
>>> definition, broken).
>>
>> I haven't seen any evidence that there's still an issue in 4.15, is
>> there any?
> 
> Well, I did have a late 4.15 rc kernel in addition to the 'suppress
> warning' patch. Now I'm questioning my memory of whether the issue was
> resolved there or not. I'm pretty sure that 'not', but no longer 100%.

It seems pretty clear it is in fact resolved in 4.15. Even if it indeed
wasn't for you, did you expect us to find out by monitoring you on IRC 24/7?


> As a user who is not in a position to debug this directly due to lack
> of time and knowledge, [...]

I have plenty of other things to do other than looking into this and
coming up with the fix above as well, and I'm no more knowledgeable
about the SWIOTLB code than you.

Anyway, nobody can track down every problem they run into. But let me
kindly ask you to more carefully look at the information available
before deciding which tree to bark up in the future.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-05-02 Thread Michel Dänzer
On 2018-04-29 01:56 AM, Ilia Mirkin wrote:
> On Sat, Apr 28, 2018 at 7:02 PM, Michel Dänzer  wrote:
>>
>> Unfortunately, there was an swiotlb regression (not directly related to
>> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
>> fixed in 4.17-rc1 and will be backported to 4.16.y.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.16.5=2c9dacf5bfe1e45d96dfe97cb71d2b717786a7b9
> 
> This guy? Didn't help. I'm running 4.16.4 right now.

Try https://patchwork.freedesktop.org/patch/219765/ .


>>> We now have *two* broken releases, v4.15 and v4.16 (anything that
>>> spews error messages and stack traces ad-infinitum in dmesg is, by
>>> definition, broken).
>>
>> I haven't seen any evidence that there's still an issue in 4.15, is
>> there any?
> 
> Well, I did have a late 4.15 rc kernel in addition to the 'suppress
> warning' patch. Now I'm questioning my memory of whether the issue was
> resolved there or not. I'm pretty sure that 'not', but no longer 100%.

It seems pretty clear it is in fact resolved in 4.15. Even if it indeed
wasn't for you, did you expect us to find out by monitoring you on IRC 24/7?


> As a user who is not in a position to debug this directly due to lack
> of time and knowledge, [...]

I have plenty of other things to do other than looking into this and
coming up with the fix above as well, and I'm no more knowledgeable
about the SWIOTLB code than you.

Anyway, nobody can track down every problem they run into. But let me
kindly ask you to more carefully look at the information available
before deciding which tree to bark up in the future.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-05-01 Thread Michel Dänzer
On 2018-05-01 01:15 AM, Dave Airlie wrote:
>>
>>
>> Yes, I fixed the original false positive messages myself with the swiotlb
>> maintainer and I was CCed in fixing the recent fallout from Chris changes as
>> well.
> 
> So do we have a good summary of where this at now?
> 
> I'm getting reports on 4.16.4 still displaying these, what hammer do I
> need to hit things with to get 4.16.x+1 to not do this?
> 
> Is there still outstanding issues upstream.

There are, https://patchwork.freedesktop.org/patch/219765/ should
hopefully fix the last of it.


> [...] I've no idea if the swiotlb things people report are the false
> positive, or some new thing.

The issues I've seen reported with 4.16 are false positives from TTM's
perspective, which uses DMA_ATTR_NO_WARN to suppress these warnings, due
to multiple regressions introduced by commit
0176adb004065d6815a8e67946752df4cd947c5b "swiotlb: refactor
 coherent buffer allocation" in 4.16-rc1.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-05-01 Thread Michel Dänzer
On 2018-05-01 01:15 AM, Dave Airlie wrote:
>>
>>
>> Yes, I fixed the original false positive messages myself with the swiotlb
>> maintainer and I was CCed in fixing the recent fallout from Chris changes as
>> well.
> 
> So do we have a good summary of where this at now?
> 
> I'm getting reports on 4.16.4 still displaying these, what hammer do I
> need to hit things with to get 4.16.x+1 to not do this?
> 
> Is there still outstanding issues upstream.

There are, https://patchwork.freedesktop.org/patch/219765/ should
hopefully fix the last of it.


> [...] I've no idea if the swiotlb things people report are the false
> positive, or some new thing.

The issues I've seen reported with 4.16 are false positives from TTM's
perspective, which uses DMA_ATTR_NO_WARN to suppress these warnings, due
to multiple regressions introduced by commit
0176adb004065d6815a8e67946752df4cd947c5b "swiotlb: refactor
 coherent buffer allocation" in 4.16-rc1.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Dave Airlie
>
>
> Yes, I fixed the original false positive messages myself with the swiotlb
> maintainer and I was CCed in fixing the recent fallout from Chris changes as
> well.

So do we have a good summary of where this at now?

I'm getting reports on 4.16.4 still displaying these, what hammer do I
need to hit things with to get 4.16.x+1 to not do this?

Is there still outstanding issues upstream.

For future reference I think how this should have gone down is

a) AMD implement a public CI with igt tests for all of this
b) we get these patches pushed and debugged.

Though to be a bit more constructive, I think you should have said at
-rc6 or 7 this isn't working for this kernel cycle,
push a minimal patch to back it off, even if the bug is in swiotlb, we
don't just add stuff broken like this, even if it's not
your bug we should have backed off for a kernel or two until we had
the underlying infrastructure fixed. Otherwise we
get what we have now, which is bit of a crappile, because now I've no
idea if the swiotlb things people report are
the false positive, or some new thing.

Dave.


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Dave Airlie
>
>
> Yes, I fixed the original false positive messages myself with the swiotlb
> maintainer and I was CCed in fixing the recent fallout from Chris changes as
> well.

So do we have a good summary of where this at now?

I'm getting reports on 4.16.4 still displaying these, what hammer do I
need to hit things with to get 4.16.x+1 to not do this?

Is there still outstanding issues upstream.

For future reference I think how this should have gone down is

a) AMD implement a public CI with igt tests for all of this
b) we get these patches pushed and debugged.

Though to be a bit more constructive, I think you should have said at
-rc6 or 7 this isn't working for this kernel cycle,
push a minimal patch to back it off, even if the bug is in swiotlb, we
don't just add stuff broken like this, even if it's not
your bug we should have backed off for a kernel or two until we had
the underlying infrastructure fixed. Otherwise we
get what we have now, which is bit of a crappile, because now I've no
idea if the swiotlb things people report are
the false positive, or some new thing.

Dave.


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Christian König

Am 30.04.2018 um 18:33 schrieb Michel Dänzer:

On 2018-04-29 09:02 AM, Christian König wrote:

Am 29.04.2018 um 01:02 schrieb Michel Dänzer:

On 2018-04-28 06:30 PM, Ilia Mirkin wrote:

On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer 
wrote:

From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.



Admittedly I'm on nouveau, but others have reported issues with
radeon/amdgpu as well. It's been going on since the feature was merged
in v4.15, with what seems like little investigation from the authors
introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.

And that's exactly the reason why I intentionally kept this enabled for
all users of the TTM DMA page pool and not put it behind a flag.

This change has surfaced quite a number of bugs in the swiotlb code
which could have caused issues before. It's just that those code path
where never exercised massively before.

Additional to that using huge pages is beneficial for the MM and CPU TLB
(not implemented yet) even when the GPU driver can't make much use of it.

Do I understand correctly that you're against this patch?


Not completely, I've considered adding that multiple times myself.

I'm just torn apart between keeping it enabled so that the underlying 
bugs gets fixed and disabling it for a better end user experience.


But in general I would opt out for a pool configuration option, not a 
per driver flag.



AFAIU the only benefit of huge pages with a driver which doesn't take
advantage of them directly is "for the MM". Can you describe a bit more
what that benefit is exactly?


When transparent huge pages are in effect we should have more huge pages 
than small pages in the system allocator.


So by enforcing allocation of small pages we fragment the huge pages 
once more and give khugepaged quite a bunch of work todo to gather them 
together into huge pages again.



Is it expected to outweigh the cost of allocating / freeing huge pages?


Yes, and actually quite well (at least in theory).


It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.

Well my patches surfaced the problems, but the underlying issues where
present even before those changes and I'm very well involved in fixing
the underlying issues.

I even considered to just revert the huge page path for the DMA pool
allocator, but it's just that the TTM patches seem to work exactly as
they are intended. So that doesn't feel like doing the right thing here.

I agree. Has anyone reported this to the DMA/SWIOTLB developers?


Yes, I fixed the original false positive messages myself with the 
swiotlb maintainer and I was CCed in fixing the recent fallout from 
Chris changes as well.


Regards,
Christian.


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Christian König

Am 30.04.2018 um 18:33 schrieb Michel Dänzer:

On 2018-04-29 09:02 AM, Christian König wrote:

Am 29.04.2018 um 01:02 schrieb Michel Dänzer:

On 2018-04-28 06:30 PM, Ilia Mirkin wrote:

On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer 
wrote:

From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.



Admittedly I'm on nouveau, but others have reported issues with
radeon/amdgpu as well. It's been going on since the feature was merged
in v4.15, with what seems like little investigation from the authors
introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.

And that's exactly the reason why I intentionally kept this enabled for
all users of the TTM DMA page pool and not put it behind a flag.

This change has surfaced quite a number of bugs in the swiotlb code
which could have caused issues before. It's just that those code path
where never exercised massively before.

Additional to that using huge pages is beneficial for the MM and CPU TLB
(not implemented yet) even when the GPU driver can't make much use of it.

Do I understand correctly that you're against this patch?


Not completely, I've considered adding that multiple times myself.

I'm just torn apart between keeping it enabled so that the underlying 
bugs gets fixed and disabling it for a better end user experience.


But in general I would opt out for a pool configuration option, not a 
per driver flag.



AFAIU the only benefit of huge pages with a driver which doesn't take
advantage of them directly is "for the MM". Can you describe a bit more
what that benefit is exactly?


When transparent huge pages are in effect we should have more huge pages 
than small pages in the system allocator.


So by enforcing allocation of small pages we fragment the huge pages 
once more and give khugepaged quite a bunch of work todo to gather them 
together into huge pages again.



Is it expected to outweigh the cost of allocating / freeing huge pages?


Yes, and actually quite well (at least in theory).


It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.

Well my patches surfaced the problems, but the underlying issues where
present even before those changes and I'm very well involved in fixing
the underlying issues.

I even considered to just revert the huge page path for the DMA pool
allocator, but it's just that the TTM patches seem to work exactly as
they are intended. So that doesn't feel like doing the right thing here.

I agree. Has anyone reported this to the DMA/SWIOTLB developers?


Yes, I fixed the original false positive messages myself with the 
swiotlb maintainer and I was CCed in fixing the recent fallout from 
Chris changes as well.


Regards,
Christian.


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Michel Dänzer
On 2018-04-29 09:02 AM, Christian König wrote:
> Am 29.04.2018 um 01:02 schrieb Michel Dänzer:
>> On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
>>> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer 
>>> wrote:
 From: Michel Dänzer 

 Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
 try to allocate huge pages. However, not all drivers can take advantage
 of huge pages, but they would incur the overhead for allocating and
 freeing them anyway.

 Now, drivers which can take advantage of huge pages need to set the new
 flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
 no longer incur any overhead for allocating or freeing huge pages.

 v2:
 * Also guard swapping of consecutive pages in ttm_get_pages
 * Reword commit log, hopefully clearer now

 Cc: sta...@vger.kernel.org
 Signed-off-by: Michel Dänzer 
>>> Both I and lots of other people, based on reports, are still seeing
>>> plenty of issues with this as late as 4.16.4.
>> "lots of other people", "plenty of issues" sounds a bit exaggerated from
>> what I've seen. FWIW, while I did see the original messages myself, I
>> haven't seen any since Christian's original fix (see below), neither
>> with amdgpu nor radeon, even before this patch you followed up to.
>>
>>
>>> Admittedly I'm on nouveau, but others have reported issues with
>>> radeon/amdgpu as well. It's been going on since the feature was merged
>>> in v4.15, with what seems like little investigation from the authors
>>> introducing the feature.
>> That's not a fair assessment. See
>> https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
>> comments.
>>
>> Christian fixed the original issue in
>> d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
>> __GFP_NOWARN is set". Christian did his best to try and get the fix in
>> before 4.15 final, but for reasons beyond his control, it was delayed
>> until 4.16-rc1 and then backported to 4.15.5.
>>
>> Unfortunately, there was an swiotlb regression (not directly related to
>> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
>> fixed in 4.17-rc1 and will be backported to 4.16.y.
> 
> And that's exactly the reason why I intentionally kept this enabled for
> all users of the TTM DMA page pool and not put it behind a flag.
> 
> This change has surfaced quite a number of bugs in the swiotlb code
> which could have caused issues before. It's just that those code path
> where never exercised massively before.
> 
> Additional to that using huge pages is beneficial for the MM and CPU TLB
> (not implemented yet) even when the GPU driver can't make much use of it.

Do I understand correctly that you're against this patch?

AFAIU the only benefit of huge pages with a driver which doesn't take
advantage of them directly is "for the MM". Can you describe a bit more
what that benefit is exactly? Is it expected to outweigh the cost of
allocating / freeing huge pages?


>> It looks like there's at least one more bug left, but it's not clear yet
>> when that was introduced, whether it's directly related to Christian's
>> work, or indeed what the impact is. Let's not get ahead of ourselves.
> 
> Well my patches surfaced the problems, but the underlying issues where
> present even before those changes and I'm very well involved in fixing
> the underlying issues.
> 
> I even considered to just revert the huge page path for the DMA pool
> allocator, but it's just that the TTM patches seem to work exactly as
> they are intended. So that doesn't feel like doing the right thing here.

I agree. Has anyone reported this to the DMA/SWIOTLB developers?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-30 Thread Michel Dänzer
On 2018-04-29 09:02 AM, Christian König wrote:
> Am 29.04.2018 um 01:02 schrieb Michel Dänzer:
>> On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
>>> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer 
>>> wrote:
 From: Michel Dänzer 

 Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
 try to allocate huge pages. However, not all drivers can take advantage
 of huge pages, but they would incur the overhead for allocating and
 freeing them anyway.

 Now, drivers which can take advantage of huge pages need to set the new
 flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
 no longer incur any overhead for allocating or freeing huge pages.

 v2:
 * Also guard swapping of consecutive pages in ttm_get_pages
 * Reword commit log, hopefully clearer now

 Cc: sta...@vger.kernel.org
 Signed-off-by: Michel Dänzer 
>>> Both I and lots of other people, based on reports, are still seeing
>>> plenty of issues with this as late as 4.16.4.
>> "lots of other people", "plenty of issues" sounds a bit exaggerated from
>> what I've seen. FWIW, while I did see the original messages myself, I
>> haven't seen any since Christian's original fix (see below), neither
>> with amdgpu nor radeon, even before this patch you followed up to.
>>
>>
>>> Admittedly I'm on nouveau, but others have reported issues with
>>> radeon/amdgpu as well. It's been going on since the feature was merged
>>> in v4.15, with what seems like little investigation from the authors
>>> introducing the feature.
>> That's not a fair assessment. See
>> https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
>> comments.
>>
>> Christian fixed the original issue in
>> d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
>> __GFP_NOWARN is set". Christian did his best to try and get the fix in
>> before 4.15 final, but for reasons beyond his control, it was delayed
>> until 4.16-rc1 and then backported to 4.15.5.
>>
>> Unfortunately, there was an swiotlb regression (not directly related to
>> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
>> fixed in 4.17-rc1 and will be backported to 4.16.y.
> 
> And that's exactly the reason why I intentionally kept this enabled for
> all users of the TTM DMA page pool and not put it behind a flag.
> 
> This change has surfaced quite a number of bugs in the swiotlb code
> which could have caused issues before. It's just that those code path
> where never exercised massively before.
> 
> Additional to that using huge pages is beneficial for the MM and CPU TLB
> (not implemented yet) even when the GPU driver can't make much use of it.

Do I understand correctly that you're against this patch?

AFAIU the only benefit of huge pages with a driver which doesn't take
advantage of them directly is "for the MM". Can you describe a bit more
what that benefit is exactly? Is it expected to outweigh the cost of
allocating / freeing huge pages?


>> It looks like there's at least one more bug left, but it's not clear yet
>> when that was introduced, whether it's directly related to Christian's
>> work, or indeed what the impact is. Let's not get ahead of ourselves.
> 
> Well my patches surfaced the problems, but the underlying issues where
> present even before those changes and I'm very well involved in fixing
> the underlying issues.
> 
> I even considered to just revert the huge page path for the DMA pool
> allocator, but it's just that the TTM patches seem to work exactly as
> they are intended. So that doesn't feel like doing the right thing here.

I agree. Has anyone reported this to the DMA/SWIOTLB developers?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-29 Thread Christian König

Am 29.04.2018 um 01:02 schrieb Michel Dänzer:

On 2018-04-28 06:30 PM, Ilia Mirkin wrote:

On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:

From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.



Admittedly I'm on nouveau, but others have reported issues with
radeon/amdgpu as well. It's been going on since the feature was merged
in v4.15, with what seems like little investigation from the authors
introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.


And that's exactly the reason why I intentionally kept this enabled for 
all users of the TTM DMA page pool and not put it behind a flag.


This change has surfaced quite a number of bugs in the swiotlb code 
which could have caused issues before. It's just that those code path 
where never exercised massively before.


Additional to that using huge pages is beneficial for the MM and CPU TLB 
(not implemented yet) even when the GPU driver can't make much use of it.



It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.


Well my patches surfaced the problems, but the underlying issues where 
present even before those changes and I'm very well involved in fixing 
the underlying issues.


I even considered to just revert the huge page path for the DMA pool 
allocator, but it's just that the TTM patches seem to work exactly as 
they are intended. So that doesn't feel like doing the right thing here.



We now have *two* broken releases, v4.15 and v4.16 (anything that
spews error messages and stack traces ad-infinitum in dmesg is, by
definition, broken).

I haven't seen any evidence that there's still an issue in 4.15, is
there any?


Not that I know of, the fix was backported as far as I know.


You're putting this behind a flag now (finally),

I wrote this patch because I realized due to some remark I happened to
see you make this week on IRC that the huge page support in TTM was
enabled for all drivers. Instead of making that kind of remark on IRC,
it would have been more constructive, and more conducive to quick
implementation, to suggest making the feature not active for drivers
which don't need it in a mailing list post.


I have to admit that I'm lacking behind taking care of the amdgpu/radeon 
user space issues just because of more important stuff to do, but the 
issues affecting other drivers should be fixed by now.


BTW: The user space problems for amdgpu/radeon seems to come from either 
the DDX or Glamour.


For example try playing a video user firefox with Glamour enabled and 
take a look at how much memory we free/allocate.


It's multiple gigabytes for just a few seconds playback, that strongly 
indicates that we allocate/free a texture for each displayed frame which 
is quite far from optimal.


Regards,
Christian.




At least, please do more research before making this kind of negative
post.

P.S. You might also want to look into whether nouveau really should be
hitting swiotlb in these cases.





Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-29 Thread Christian König

Am 29.04.2018 um 01:02 schrieb Michel Dänzer:

On 2018-04-28 06:30 PM, Ilia Mirkin wrote:

On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:

From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.



Admittedly I'm on nouveau, but others have reported issues with
radeon/amdgpu as well. It's been going on since the feature was merged
in v4.15, with what seems like little investigation from the authors
introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.


And that's exactly the reason why I intentionally kept this enabled for 
all users of the TTM DMA page pool and not put it behind a flag.


This change has surfaced quite a number of bugs in the swiotlb code 
which could have caused issues before. It's just that those code path 
where never exercised massively before.


Additional to that using huge pages is beneficial for the MM and CPU TLB 
(not implemented yet) even when the GPU driver can't make much use of it.



It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.


Well my patches surfaced the problems, but the underlying issues where 
present even before those changes and I'm very well involved in fixing 
the underlying issues.


I even considered to just revert the huge page path for the DMA pool 
allocator, but it's just that the TTM patches seem to work exactly as 
they are intended. So that doesn't feel like doing the right thing here.



We now have *two* broken releases, v4.15 and v4.16 (anything that
spews error messages and stack traces ad-infinitum in dmesg is, by
definition, broken).

I haven't seen any evidence that there's still an issue in 4.15, is
there any?


Not that I know of, the fix was backported as far as I know.


You're putting this behind a flag now (finally),

I wrote this patch because I realized due to some remark I happened to
see you make this week on IRC that the huge page support in TTM was
enabled for all drivers. Instead of making that kind of remark on IRC,
it would have been more constructive, and more conducive to quick
implementation, to suggest making the feature not active for drivers
which don't need it in a mailing list post.


I have to admit that I'm lacking behind taking care of the amdgpu/radeon 
user space issues just because of more important stuff to do, but the 
issues affecting other drivers should be fixed by now.


BTW: The user space problems for amdgpu/radeon seems to come from either 
the DDX or Glamour.


For example try playing a video user firefox with Glamour enabled and 
take a look at how much memory we free/allocate.


It's multiple gigabytes for just a few seconds playback, that strongly 
indicates that we allocate/free a texture for each displayed frame which 
is quite far from optimal.


Regards,
Christian.




At least, please do more research before making this kind of negative
post.

P.S. You might also want to look into whether nouveau really should be
hitting swiotlb in these cases.





Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Ilia Mirkin
On Sat, Apr 28, 2018 at 7:02 PM, Michel Dänzer  wrote:
> On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
>> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
>>> From: Michel Dänzer 
>>>
>>> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
>>> try to allocate huge pages. However, not all drivers can take advantage
>>> of huge pages, but they would incur the overhead for allocating and
>>> freeing them anyway.
>>>
>>> Now, drivers which can take advantage of huge pages need to set the new
>>> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
>>> no longer incur any overhead for allocating or freeing huge pages.
>>>
>>> v2:
>>> * Also guard swapping of consecutive pages in ttm_get_pages
>>> * Reword commit log, hopefully clearer now
>>>
>>> Cc: sta...@vger.kernel.org
>>> Signed-off-by: Michel Dänzer 
>>
>> Both I and lots of other people, based on reports, are still seeing
>> plenty of issues with this as late as 4.16.4.
>
> "lots of other people", "plenty of issues" sounds a bit exaggerated from
> what I've seen. FWIW, while I did see the original messages myself, I
> haven't seen any since Christian's original fix (see below), neither
> with amdgpu nor radeon, even before this patch you followed up to.

Probably a half-dozen reports of it with nouveau, in addition to
another bunch of people talking about it on the bug you mention below,
along with email threads on dri-devel.

I figured I didn't have to raise my own since it was identical to the
others, and, I assumed, was being handled.

>> Admittedly I'm on nouveau, but others have reported issues with
>> radeon/amdgpu as well. It's been going on since the feature was merged
>> in v4.15, with what seems like little investigation from the authors
>> introducing the feature.
>
> That's not a fair assessment. See
> https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
> comments.
>
> Christian fixed the original issue in
> d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
> __GFP_NOWARN is set". Christian did his best to try and get the fix in
> before 4.15 final, but for reasons beyond his control, it was delayed
> until 4.16-rc1 and then backported to 4.15.5.

In case it's unclear, let me state this explicitly -- I totally get
that despite best intentions, bugs get introduced. I do it myself.
What I'm having trouble with is the handling once the issue is
discovered.

>
> Unfortunately, there was an swiotlb regression (not directly related to
> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
> fixed in 4.17-rc1 and will be backported to 4.16.y.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.16.5=2c9dacf5bfe1e45d96dfe97cb71d2b717786a7b9

This guy? Didn't help. I'm running 4.16.4 right now.

> It looks like there's at least one more bug left, but it's not clear yet
> when that was introduced, whether it's directly related to Christian's
> work, or indeed what the impact is. Let's not get ahead of ourselves.

Whether it is directly related to that work or not, the issue
persists. There are two options:

 - When declaring things fixed, no serious attempt was actually made
at reproducing the underlying issues.
 - The authors truly can't reproduce the underlying issues users are
seeing and are taking stabs in the dark.

Given that a number of people are reporting problems, in either
scenario, the reasonable thing is to disable the feature, and figure
out what is going on. Maybe condition it on !CONFIG_SWIOTLB.

>> We now have *two* broken releases, v4.15 and v4.16 (anything that
>> spews error messages and stack traces ad-infinitum in dmesg is, by
>> definition, broken).
>
> I haven't seen any evidence that there's still an issue in 4.15, is
> there any?

Well, I did have a late 4.15 rc kernel in addition to the 'suppress
warning' patch. Now I'm questioning my memory of whether the issue was
resolved there or not. I'm pretty sure that 'not', but no longer 100%.
Either way, I think we all agree v4.15 was broken and more importantly
was *known* to be broken well in advance of the release. A reasonable
option would have been to disable the feature until the other bits
fell into place.

>> You're putting this behind a flag now (finally),
>
> I wrote this patch because I realized due to some remark I happened to
> see you make this week on IRC that the huge page support in TTM was
> enabled for all drivers. Instead of making that kind of remark on IRC,
> it would have been more constructive, and more conducive to quick
> implementation, to suggest making the feature not active for drivers
> which don't need it in a mailing list post.

I see IRC as a much faster and direct way of reaching the authors
and/or people who need to know an issue. As there was already a bug
filed about it and the issue was known about, I didn't really see a

Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Ilia Mirkin
On Sat, Apr 28, 2018 at 7:02 PM, Michel Dänzer  wrote:
> On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
>> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
>>> From: Michel Dänzer 
>>>
>>> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
>>> try to allocate huge pages. However, not all drivers can take advantage
>>> of huge pages, but they would incur the overhead for allocating and
>>> freeing them anyway.
>>>
>>> Now, drivers which can take advantage of huge pages need to set the new
>>> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
>>> no longer incur any overhead for allocating or freeing huge pages.
>>>
>>> v2:
>>> * Also guard swapping of consecutive pages in ttm_get_pages
>>> * Reword commit log, hopefully clearer now
>>>
>>> Cc: sta...@vger.kernel.org
>>> Signed-off-by: Michel Dänzer 
>>
>> Both I and lots of other people, based on reports, are still seeing
>> plenty of issues with this as late as 4.16.4.
>
> "lots of other people", "plenty of issues" sounds a bit exaggerated from
> what I've seen. FWIW, while I did see the original messages myself, I
> haven't seen any since Christian's original fix (see below), neither
> with amdgpu nor radeon, even before this patch you followed up to.

Probably a half-dozen reports of it with nouveau, in addition to
another bunch of people talking about it on the bug you mention below,
along with email threads on dri-devel.

I figured I didn't have to raise my own since it was identical to the
others, and, I assumed, was being handled.

>> Admittedly I'm on nouveau, but others have reported issues with
>> radeon/amdgpu as well. It's been going on since the feature was merged
>> in v4.15, with what seems like little investigation from the authors
>> introducing the feature.
>
> That's not a fair assessment. See
> https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
> comments.
>
> Christian fixed the original issue in
> d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
> __GFP_NOWARN is set". Christian did his best to try and get the fix in
> before 4.15 final, but for reasons beyond his control, it was delayed
> until 4.16-rc1 and then backported to 4.15.5.

In case it's unclear, let me state this explicitly -- I totally get
that despite best intentions, bugs get introduced. I do it myself.
What I'm having trouble with is the handling once the issue is
discovered.

>
> Unfortunately, there was an swiotlb regression (not directly related to
> Christian's work) shortly after this fix, also in 4.16-rc1, which is now
> fixed in 4.17-rc1 and will be backported to 4.16.y.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.16.5=2c9dacf5bfe1e45d96dfe97cb71d2b717786a7b9

This guy? Didn't help. I'm running 4.16.4 right now.

> It looks like there's at least one more bug left, but it's not clear yet
> when that was introduced, whether it's directly related to Christian's
> work, or indeed what the impact is. Let's not get ahead of ourselves.

Whether it is directly related to that work or not, the issue
persists. There are two options:

 - When declaring things fixed, no serious attempt was actually made
at reproducing the underlying issues.
 - The authors truly can't reproduce the underlying issues users are
seeing and are taking stabs in the dark.

Given that a number of people are reporting problems, in either
scenario, the reasonable thing is to disable the feature, and figure
out what is going on. Maybe condition it on !CONFIG_SWIOTLB.

>> We now have *two* broken releases, v4.15 and v4.16 (anything that
>> spews error messages and stack traces ad-infinitum in dmesg is, by
>> definition, broken).
>
> I haven't seen any evidence that there's still an issue in 4.15, is
> there any?

Well, I did have a late 4.15 rc kernel in addition to the 'suppress
warning' patch. Now I'm questioning my memory of whether the issue was
resolved there or not. I'm pretty sure that 'not', but no longer 100%.
Either way, I think we all agree v4.15 was broken and more importantly
was *known* to be broken well in advance of the release. A reasonable
option would have been to disable the feature until the other bits
fell into place.

>> You're putting this behind a flag now (finally),
>
> I wrote this patch because I realized due to some remark I happened to
> see you make this week on IRC that the huge page support in TTM was
> enabled for all drivers. Instead of making that kind of remark on IRC,
> it would have been more constructive, and more conducive to quick
> implementation, to suggest making the feature not active for drivers
> which don't need it in a mailing list post.

I see IRC as a much faster and direct way of reaching the authors
and/or people who need to know an issue. As there was already a bug
filed about it and the issue was known about, I didn't really see a
reason to pile on (until now).

> At least, please do more research before making this 

Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Michel Dänzer
On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
>> From: Michel Dänzer 
>>
>> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
>> try to allocate huge pages. However, not all drivers can take advantage
>> of huge pages, but they would incur the overhead for allocating and
>> freeing them anyway.
>>
>> Now, drivers which can take advantage of huge pages need to set the new
>> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
>> no longer incur any overhead for allocating or freeing huge pages.
>>
>> v2:
>> * Also guard swapping of consecutive pages in ttm_get_pages
>> * Reword commit log, hopefully clearer now
>>
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Michel Dänzer 
> 
> Both I and lots of other people, based on reports, are still seeing
> plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.


> Admittedly I'm on nouveau, but others have reported issues with
> radeon/amdgpu as well. It's been going on since the feature was merged
> in v4.15, with what seems like little investigation from the authors
> introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.

It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.


> We now have *two* broken releases, v4.15 and v4.16 (anything that
> spews error messages and stack traces ad-infinitum in dmesg is, by
> definition, broken).

I haven't seen any evidence that there's still an issue in 4.15, is
there any?


> You're putting this behind a flag now (finally),

I wrote this patch because I realized due to some remark I happened to
see you make this week on IRC that the huge page support in TTM was
enabled for all drivers. Instead of making that kind of remark on IRC,
it would have been more constructive, and more conducive to quick
implementation, to suggest making the feature not active for drivers
which don't need it in a mailing list post.


At least, please do more research before making this kind of negative
post.

P.S. You might also want to look into whether nouveau really should be
hitting swiotlb in these cases.

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Michel Dänzer
On 2018-04-28 06:30 PM, Ilia Mirkin wrote:
> On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
>> From: Michel Dänzer 
>>
>> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
>> try to allocate huge pages. However, not all drivers can take advantage
>> of huge pages, but they would incur the overhead for allocating and
>> freeing them anyway.
>>
>> Now, drivers which can take advantage of huge pages need to set the new
>> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
>> no longer incur any overhead for allocating or freeing huge pages.
>>
>> v2:
>> * Also guard swapping of consecutive pages in ttm_get_pages
>> * Reword commit log, hopefully clearer now
>>
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Michel Dänzer 
> 
> Both I and lots of other people, based on reports, are still seeing
> plenty of issues with this as late as 4.16.4.

"lots of other people", "plenty of issues" sounds a bit exaggerated from
what I've seen. FWIW, while I did see the original messages myself, I
haven't seen any since Christian's original fix (see below), neither
with amdgpu nor radeon, even before this patch you followed up to.


> Admittedly I'm on nouveau, but others have reported issues with
> radeon/amdgpu as well. It's been going on since the feature was merged
> in v4.15, with what seems like little investigation from the authors
> introducing the feature.

That's not a fair assessment. See
https://bugs.freedesktop.org/show_bug.cgi?id=104082#c40 and following
comments.

Christian fixed the original issue in
d0bc0c2a31c95002d37c3cc511ffdcab851b3256 "swiotlb: suppress warning when
__GFP_NOWARN is set". Christian did his best to try and get the fix in
before 4.15 final, but for reasons beyond his control, it was delayed
until 4.16-rc1 and then backported to 4.15.5.

Unfortunately, there was an swiotlb regression (not directly related to
Christian's work) shortly after this fix, also in 4.16-rc1, which is now
fixed in 4.17-rc1 and will be backported to 4.16.y.

It looks like there's at least one more bug left, but it's not clear yet
when that was introduced, whether it's directly related to Christian's
work, or indeed what the impact is. Let's not get ahead of ourselves.


> We now have *two* broken releases, v4.15 and v4.16 (anything that
> spews error messages and stack traces ad-infinitum in dmesg is, by
> definition, broken).

I haven't seen any evidence that there's still an issue in 4.15, is
there any?


> You're putting this behind a flag now (finally),

I wrote this patch because I realized due to some remark I happened to
see you make this week on IRC that the huge page support in TTM was
enabled for all drivers. Instead of making that kind of remark on IRC,
it would have been more constructive, and more conducive to quick
implementation, to suggest making the feature not active for drivers
which don't need it in a mailing list post.


At least, please do more research before making this kind of negative
post.

P.S. You might also want to look into whether nouveau really should be
hitting swiotlb in these cases.

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Ilia Mirkin
On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
> try to allocate huge pages. However, not all drivers can take advantage
> of huge pages, but they would incur the overhead for allocating and
> freeing them anyway.
>
> Now, drivers which can take advantage of huge pages need to set the new
> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
> no longer incur any overhead for allocating or freeing huge pages.
>
> v2:
> * Also guard swapping of consecutive pages in ttm_get_pages
> * Reword commit log, hopefully clearer now
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4. Admittedly I'm on
nouveau, but others have reported issues with radeon/amdgpu as well.
It's been going on since the feature was merged in v4.15, with what
seems like little investigation from the authors introducing the
feature.

We now have *two* broken releases, v4.15 and v4.16 (anything that
spews error messages and stack traces ad-infinitum in dmesg is, by
definition, broken). You're putting this behind a flag now (finally),
but should it be enabled anywhere? Why is it being flipped on for
amdgpu by default, despite the still-existing problems?

Reverting this feature without just resetting back to the code in
v4.14 is painful, but why make Joe User suffer by enabling it while
you're still working out the kinks?

  -ilia


Re: [PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-28 Thread Ilia Mirkin
On Fri, Apr 27, 2018 at 9:08 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
> try to allocate huge pages. However, not all drivers can take advantage
> of huge pages, but they would incur the overhead for allocating and
> freeing them anyway.
>
> Now, drivers which can take advantage of huge pages need to set the new
> flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
> no longer incur any overhead for allocating or freeing huge pages.
>
> v2:
> * Also guard swapping of consecutive pages in ttm_get_pages
> * Reword commit log, hopefully clearer now
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Michel Dänzer 

Both I and lots of other people, based on reports, are still seeing
plenty of issues with this as late as 4.16.4. Admittedly I'm on
nouveau, but others have reported issues with radeon/amdgpu as well.
It's been going on since the feature was merged in v4.15, with what
seems like little investigation from the authors introducing the
feature.

We now have *two* broken releases, v4.15 and v4.16 (anything that
spews error messages and stack traces ad-infinitum in dmesg is, by
definition, broken). You're putting this behind a flag now (finally),
but should it be enabled anywhere? Why is it being flipped on for
amdgpu by default, despite the still-existing problems?

Reverting this feature without just resetting back to the code in
v4.14 is painful, but why make Joe User suffer by enabling it while
you're still working out the kinks?

  -ilia


[PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-27 Thread Michel Dänzer
From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 35 +---
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |  8 --
 include/drm/ttm/ttm_tt.h |  1 +
 4 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dfd22db13fb1..e03e9e361e2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -988,7 +988,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct 
ttm_buffer_object *bo,
return NULL;
}
gtt->ttm.ttm.func = _backend_func;
-   if (ttm_sg_tt_init(>ttm, bo, page_flags)) {
+   if (ttm_sg_tt_init(>ttm, bo, page_flags | 
TTM_PAGE_FLAG_TRANSHUGE)) {
kfree(gtt);
return NULL;
}
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index f0481b7b60c5..476d668e1cbd 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -760,7 +760,7 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
 {
struct ttm_page_pool *pool = ttm_get_pool(flags, false, cstate);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   struct ttm_page_pool *huge = ttm_get_pool(flags, true, cstate);
+   struct ttm_page_pool *huge = NULL;
 #endif
unsigned long irq_flags;
unsigned i;
@@ -780,7 +780,8 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   if (!(flags & TTM_PAGE_FLAG_DMA32)) {
+   if ((flags & (TTM_PAGE_FLAG_DMA32 | 
TTM_PAGE_FLAG_TRANSHUGE)) ==
+   TTM_PAGE_FLAG_TRANSHUGE) {
for (j = 0; j < HPAGE_PMD_NR; ++j)
if (p++ != pages[i + j])
break;
@@ -805,6 +806,8 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
 
i = 0;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE)
+   huge = ttm_get_pool(flags, true, cstate);
if (huge) {
unsigned max_size, n2free;
 
@@ -877,7 +880,7 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
 {
struct ttm_page_pool *pool = ttm_get_pool(flags, false, cstate);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   struct ttm_page_pool *huge = ttm_get_pool(flags, true, cstate);
+   struct ttm_page_pool *huge = NULL;
 #endif
struct list_head plist;
struct page *p = NULL;
@@ -906,7 +909,8 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
 
i = 0;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   if (!(gfp_flags & GFP_DMA32)) {
+   if ((flags & (TTM_PAGE_FLAG_DMA32 | TTM_PAGE_FLAG_TRANSHUGE)) ==
+   TTM_PAGE_FLAG_TRANSHUGE) {
while (npages >= HPAGE_PMD_NR) {
gfp_t huge_flags = gfp_flags;
 
@@ -933,9 +937,13 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
return -ENOMEM;
}
 
-   /* Swap the pages if we detect consecutive order */
-   if (i > first && pages[i - 1] == p - 1)
-   swap(p, pages[i - 1]);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE) {
+   /* Swap the pages if we detect consecutive 
order */
+   if (i > first && pages[i - 1] == p - 1)
+   swap(p, pages[i - 1]);
+   }
+#endif
 
pages[i++] = p;
--npages;
@@ -946,6 +954,8 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
count = 0;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE)
+   huge = ttm_get_pool(flags, true, cstate);
if (huge && npages >= HPAGE_PMD_NR) {

[PATCH v2 1/2] drm/ttm: Only allocate huge pages with new flag TTM_PAGE_FLAG_TRANSHUGE

2018-04-27 Thread Michel Dänzer
From: Michel Dänzer 

Previously, TTM would always (with CONFIG_TRANSPARENT_HUGEPAGE enabled)
try to allocate huge pages. However, not all drivers can take advantage
of huge pages, but they would incur the overhead for allocating and
freeing them anyway.

Now, drivers which can take advantage of huge pages need to set the new
flag TTM_PAGE_FLAG_TRANSHUGE to get them. Drivers not setting this flag
no longer incur any overhead for allocating or freeing huge pages.

v2:
* Also guard swapping of consecutive pages in ttm_get_pages
* Reword commit log, hopefully clearer now

Cc: sta...@vger.kernel.org
Signed-off-by: Michel Dänzer 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_page_alloc.c | 35 +---
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c |  8 --
 include/drm/ttm/ttm_tt.h |  1 +
 4 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dfd22db13fb1..e03e9e361e2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -988,7 +988,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct 
ttm_buffer_object *bo,
return NULL;
}
gtt->ttm.ttm.func = _backend_func;
-   if (ttm_sg_tt_init(>ttm, bo, page_flags)) {
+   if (ttm_sg_tt_init(>ttm, bo, page_flags | 
TTM_PAGE_FLAG_TRANSHUGE)) {
kfree(gtt);
return NULL;
}
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index f0481b7b60c5..476d668e1cbd 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -760,7 +760,7 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
 {
struct ttm_page_pool *pool = ttm_get_pool(flags, false, cstate);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   struct ttm_page_pool *huge = ttm_get_pool(flags, true, cstate);
+   struct ttm_page_pool *huge = NULL;
 #endif
unsigned long irq_flags;
unsigned i;
@@ -780,7 +780,8 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   if (!(flags & TTM_PAGE_FLAG_DMA32)) {
+   if ((flags & (TTM_PAGE_FLAG_DMA32 | 
TTM_PAGE_FLAG_TRANSHUGE)) ==
+   TTM_PAGE_FLAG_TRANSHUGE) {
for (j = 0; j < HPAGE_PMD_NR; ++j)
if (p++ != pages[i + j])
break;
@@ -805,6 +806,8 @@ static void ttm_put_pages(struct page **pages, unsigned 
npages, int flags,
 
i = 0;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE)
+   huge = ttm_get_pool(flags, true, cstate);
if (huge) {
unsigned max_size, n2free;
 
@@ -877,7 +880,7 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
 {
struct ttm_page_pool *pool = ttm_get_pool(flags, false, cstate);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   struct ttm_page_pool *huge = ttm_get_pool(flags, true, cstate);
+   struct ttm_page_pool *huge = NULL;
 #endif
struct list_head plist;
struct page *p = NULL;
@@ -906,7 +909,8 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
 
i = 0;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-   if (!(gfp_flags & GFP_DMA32)) {
+   if ((flags & (TTM_PAGE_FLAG_DMA32 | TTM_PAGE_FLAG_TRANSHUGE)) ==
+   TTM_PAGE_FLAG_TRANSHUGE) {
while (npages >= HPAGE_PMD_NR) {
gfp_t huge_flags = gfp_flags;
 
@@ -933,9 +937,13 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
return -ENOMEM;
}
 
-   /* Swap the pages if we detect consecutive order */
-   if (i > first && pages[i - 1] == p - 1)
-   swap(p, pages[i - 1]);
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE) {
+   /* Swap the pages if we detect consecutive 
order */
+   if (i > first && pages[i - 1] == p - 1)
+   swap(p, pages[i - 1]);
+   }
+#endif
 
pages[i++] = p;
--npages;
@@ -946,6 +954,8 @@ static int ttm_get_pages(struct page **pages, unsigned 
npages, int flags,
count = 0;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+   if (flags & TTM_PAGE_FLAG_TRANSHUGE)
+   huge = ttm_get_pool(flags, true, cstate);
if (huge && npages >= HPAGE_PMD_NR) {
INIT_LIST_HEAD();