[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2015-01-09 Thread Thierry Reding
On Thu, Jan 08, 2015 at 04:15:42PM +0200, Oded Gabbay wrote:
> Hi Thierry,
> Generally I agree with the issues you describe in the current design.
> One task in our 2015 workplan is to change the whole method amdkfd is
> loaded, so it can independently load at any time, regardless of the order of
> loading between it and radeon and amd_iommu_v2. To reach that goal, I assume
> we will have to use some form of deferred probing.

Sounds good.

> However, for the moment, this is the best band-aid I could think of, and
> choosing between this band-aid or no band-aid at all, I prefer the former
> any day.

Looking at the patches, the dependency is documented in the Makefile. I
guess that's fine as a temporary band-aid.

Thierry



[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2015-01-08 Thread Oded Gabbay
Hi Thierry,
Generally I agree with the issues you describe in the current design.
One task in our 2015 workplan is to change the whole method amdkfd is loaded,
so it can independently load at any time, regardless of the order of loading
between it and radeon and amd_iommu_v2. To reach that goal, I assume we will
have to use some form of deferred probing.

However, for the moment, this is the best band-aid I could think of, and 
choosing between this band-aid or no band-aid at all, I prefer the former any 
day.

Oded


On 01/05/2015 05:46 PM, Thierry Reding wrote:
> On Mon, Dec 29, 2014 at 10:34:32AM +0100, Christian König wrote:
>> On 29.12.2014 at 09:16, Laurent Pinchart wrote:
>>> Hi Oded,
>>>
>>> On Sunday 28 December 2014 13:36:50 Oded Gabbay wrote:
>>>> On 12/26/2014 11:19 AM, Laurent Pinchart wrote:
>>>>> On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
>>>>>> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
>>>>>>> This small patch-set was created to solve the bug described at
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
>>>>>>> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
>>>>>>> called [PATCH 0/3] Use workqueue for device init in amdkfd
>>>>>>> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
>>>>>>>
>>>>>>> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
>>>>>>> inside the kernel (not as modules). In that case, the correct loading
>>>>>>> order, as determined by the exported symbol used by each driver, is
>>>>>>> not enforced anymore and the kernel loads them based on who was linked
>>>>>>> first. That makes radeon load first, amdkfd second and amd_iommu_v2
>>>>>>> third.
>>>>>>>
>>>>>>> Because the initialization of a device in amdkfd is initiated by radeon,
>>>>>>> and can only be completed if amdkfd and amd_iommu_v2 were loaded and
>>>>>>> initialized, then in the case mentioned above, this initialization fails
>>>>>>> and there is a kernel panic as some pointers are not initialized but
>>>>>>> used nonetheless.
>>>>>>>
>>>>>>> To solve this bug, this patch-set moves iommu/ before gpu/ in
>>>>>>> drivers/Makefile and also moves amdkfd/ before radeon/ in
>>>>>>> drivers/gpu/drm/Makefile.
>>>>>>>
>>>>>>> The rationale is that in general, AMD GPU devices are dependent on AMD
>>>>>>> IOMMU controller functionality to allow the GPU to access a process's
>>>>>>> virtual memory address space, without the need for pinning the memory.
>>>>>>> That's why it makes sense to initialize the iommu/ subsystem ahead of
>>>>>>> the gpu/ subsystem.
>>>>>> I strongly object to this patch set. This makes assumptions about how
>>>>>> the build system influences probe order. That's bad because seemingly
>>>>>> unrelated changes could easily break this in the future.
>>>>>>
>>>>>> We already have ways to solve this kind of dependency (driver probe
>>>>>> deferral), and I think you should be using it to solve this particular
>>>>>> problem rather than some linking order hack.
>>>>> While I agree with you that probe deferral is the way to go, I believe
>>>>> linkage ordering can still be used as an optimization to avoid deferring
>>>>> probe in the most common cases. I'm thus not opposed to moving iommu/
>>>>> earlier in link order (provided we can properly test for side effects, as
>>>>> the jump is pretty large), but not as a replacement for probe deferral.
>>>> My thoughts exactly. If this was some extreme use case, then it would be
>>>> justified to solve it with probe deferral. But I think that for most common
>>>> cases, GPUs are dependent on IOMMU and *not* vice-versa.
>>
>> Fixing this through deferred probing sounds like the correct long term
>> solution to me as well.
>>
>> But what Thierry is referring to here is probably the approach of returning
>> -EAGAIN from the probe method (at least that was the last status when I
>> looked into this).
>
> -EPROBE_DEFER would be the one that I was referring to.
>
>> The problem with this approach is the interface design between radeon and
>> amdkfd. amdkfd simply doesn't have a probe method which gets called when the
>> hardware is detected and can return -EAGAIN. Instead amdkfd is called by
>> radeon after hardware initialization when it is way too late for such a
>> thing.
>
> That sounds like a pretty brittle design. It sounds like nowhere in the
> code you've encoded this dependency and you rely on symbols only to
> ensure probe ordering.
>
> Couldn't you simply make radeon check for availability of the IOMMU
> early and defer probing if it's not there yet? Or if radeon depends on
> the IOMMU via amdkfd, then perhaps calling into amdkfd to check for the
> availability of the IOMMU would be a more correct representation of the
> dependency.
>
>>>> BTW, my first try at solving this was to use probe deferral (using
>>>> workqueue), but the feedback I got from Christian and Dave was that moving
>>>> iommu/ linkage bef

[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2015-01-05 Thread Thierry Reding
On Mon, Dec 29, 2014 at 10:34:32AM +0100, Christian König wrote:
> On 29.12.2014 at 09:16, Laurent Pinchart wrote:
> >Hi Oded,
> >
> >On Sunday 28 December 2014 13:36:50 Oded Gabbay wrote:
> >>On 12/26/2014 11:19 AM, Laurent Pinchart wrote:
> >>>On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
> >>>>On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
> >>>>>This small patch-set was created to solve the bug described at
> >>>>>https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> >>>>>trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
> >>>>>called [PATCH 0/3] Use workqueue for device init in amdkfd
> >>>>>(http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
> >>>>>
> >>>>>That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
> >>>>>inside the kernel (not as modules). In that case, the correct loading
> >>>>>order, as determined by the exported symbol used by each driver, is
> >>>>>not enforced anymore and the kernel loads them based on who was linked
> >>>>>first. That makes radeon load first, amdkfd second and amd_iommu_v2
> >>>>>third.
> >>>>>
> >>>>>Because the initialization of a device in amdkfd is initiated by radeon,
> >>>>>and can only be completed if amdkfd and amd_iommu_v2 were loaded and
> >>>>>initialized, then in the case mentioned above, this initialization fails
> >>>>>and there is a kernel panic as some pointers are not initialized but
> >>>>>used nonetheless.
> >>>>>
> >>>>>To solve this bug, this patch-set moves iommu/ before gpu/ in
> >>>>>drivers/Makefile and also moves amdkfd/ before radeon/ in
> >>>>>drivers/gpu/drm/Makefile.
> >>>>>
> >>>>>The rationale is that in general, AMD GPU devices are dependent on AMD
> >>>>>IOMMU controller functionality to allow the GPU to access a process's
> >>>>>virtual memory address space, without the need for pinning the memory.
> >>>>>That's why it makes sense to initialize the iommu/ subsystem ahead of
> >>>>>the gpu/ subsystem.
> >>>>I strongly object to this patch set. This makes assumptions about how
> >>>>the build system influences probe order. That's bad because seemingly
> >>>>unrelated changes could easily break this in the future.
> >>>>
> >>>>We already have ways to solve this kind of dependency (driver probe
> >>>>deferral), and I think you should be using it to solve this particular
> >>>>problem rather than some linking order hack.
> >>>While I agree with you that probe deferral is the way to go, I believe
> >>>linkage ordering can still be used as an optimization to avoid deferring
> >>>probe in the most common cases. I'm thus not opposed to moving iommu/
> >>>earlier in link order (provided we can properly test for side effects, as
> >>>the jump is pretty large), but not as a replacement for probe deferral.
> >>My thoughts exactly. If this was some extreme use case, then it would be
> >>justified to solve it with probe deferral. But I think that for most common
> >>cases, GPUs are dependent on IOMMU and *not* vice-versa.
> 
> Fixing this through deferred probing sounds like the correct long term
> solution to me as well.
> 
> But what Thierry is referring to here is probably the approach of returning
> -EAGAIN from the probe method (at least that was the last status when I
> looked into this).

-EPROBE_DEFER would be the one that I was referring to.

> The problem with this approach is the interface design between radeon and
> amdkfd. amdkfd simply doesn't have a probe method which gets called when the
> hardware is detected and can return -EAGAIN. Instead amdkfd is called by
> radeon after hardware initialization when it is way too late for such a
> thing.

That sounds like a pretty brittle design. It sounds like nowhere in the
code you've encoded this dependency and you rely on symbols only to
ensure probe ordering.

Couldn't you simply make radeon check for availability of the IOMMU
early and defer probing if it's not there yet? Or if radeon depends on
the IOMMU via amdkfd, then perhaps calling into amdkfd to check for the
availability of the IOMMU would be a more correct representation of the
dependency.

> >>BTW, my first try at solving this was to use probe deferral (using
> >>workqueue), but the feedback I got from Christian and Dave was that moving
> >>iommu/ linkage before gpu/ was a much simpler solution.
> >To clarify my position, I believe changing the link order can be a worthwhile
> >optimization, but I'm uncertain about the long term viability of that change
> >as a fix. Probe deferral has been introduced because not all probe ordering
> >issues can be fixed through link ordering, so we should fix the problem
> >properly.
> >
> >This being said, if modifying the link order can help for now without
> >introducing negative side effects, it would only postpone the real fix, so
> >I'm not opposed to it.
> 
> Yeah, that sounds like the right approach to me as well.

I don't think that's a good approach at all. I

[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-29 Thread Christian König
On 29.12.2014 at 09:16, Laurent Pinchart wrote:
> Hi Oded,
>
> On Sunday 28 December 2014 13:36:50 Oded Gabbay wrote:
>> On 12/26/2014 11:19 AM, Laurent Pinchart wrote:
>>> On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
>>>> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
>>>>> This small patch-set was created to solve the bug described at
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
>>>>> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
>>>>> called [PATCH 0/3] Use workqueue for device init in amdkfd
>>>>> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
>>>>>
>>>>> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
>>>>> inside the kernel (not as modules). In that case, the correct loading
>>>>> order, as determined by the exported symbol used by each driver, is
>>>>> not enforced anymore and the kernel loads them based on who was linked
>>>>> first. That makes radeon load first, amdkfd second and amd_iommu_v2
>>>>> third.
>>>>>
>>>>> Because the initialization of a device in amdkfd is initiated by radeon,
>>>>> and can only be completed if amdkfd and amd_iommu_v2 were loaded and
>>>>> initialized, then in the case mentioned above, this initialization fails
>>>>> and there is a kernel panic as some pointers are not initialized but
>>>>> used nonetheless.
>>>>>
>>>>> To solve this bug, this patch-set moves iommu/ before gpu/ in
>>>>> drivers/Makefile and also moves amdkfd/ before radeon/ in
>>>>> drivers/gpu/drm/Makefile.
>>>>>
>>>>> The rationale is that in general, AMD GPU devices are dependent on AMD
>>>>> IOMMU controller functionality to allow the GPU to access a process's
>>>>> virtual memory address space, without the need for pinning the memory.
>>>>> That's why it makes sense to initialize the iommu/ subsystem ahead of
>>>>> the gpu/ subsystem.
>>>> I strongly object to this patch set. This makes assumptions about how
>>>> the build system influences probe order. That's bad because seemingly
>>>> unrelated changes could easily break this in the future.
>>>>
>>>> We already have ways to solve this kind of dependency (driver probe
>>>> deferral), and I think you should be using it to solve this particular
>>>> problem rather than some linking order hack.
>>> While I agree with you that probe deferral is the way to go, I believe
>>> linkage ordering can still be used as an optimization to avoid deferring
>>> probe in the most common cases. I'm thus not opposed to moving iommu/
>>> earlier in link order (provided we can properly test for side effects, as
>>> the jump is pretty large), but not as a replacement for probe deferral.
>> My thoughts exactly. If this was some extreme use case, then it would be
>> justified to solve it with probe deferral. But I think that for most common
>> cases, GPUs are dependent on IOMMU and *not* vice-versa.

Fixing this through deferred probing sounds like the correct long term 
solution to me as well.

But what Thierry is referring to here is probably the approach of 
returning -EAGAIN from the probe method (at least that was the last 
status when I looked into this).

The problem with this approach is the interface design between radeon 
and amdkfd. amdkfd simply doesn't have a probe method which gets called 
when the hardware is detected and can return -EAGAIN. Instead amdkfd is 
called by radeon after hardware initialization when it is way too late
for such a thing.

>>
>> BTW, my first try at solving this was to use probe deferral (using
>> workqueue), but the feedback I got from Christian and Dave was that moving
>> iommu/ linkage before gpu/ was a much simpler solution.
> To clarify my position, I believe changing the link order can be a worthwhile
> optimization, but I'm uncertain about the long term viability of that change
> as a fix. Probe deferral has been introduced because not all probe ordering
> issues can be fixed through link ordering, so we should fix the problem
> properly.
>
> This being said, if modifying the link order can help for now without
> introducing negative side effects, it would only postpone the real fix, so I'm
> not opposed to it.

Yeah, that sounds like the right approach to me as well. In general I 
would prefer that modules compiled into the kernel load by the order of 
their symbol dependency just like standalone modules do.

That's what Rusty proposed more than 10 years ago when he reworked the 
module system and I'm actually not sure why it was never done this way. 
I can only find the initial patch to do so in the mail history, but not 
why it was rejected.

Regards,
Christian.
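
For readers of the archive, the idea of ordering built-in code by symbol
dependency can be illustrated with a small userspace sketch. It is not
kernel code and not the patch Rusty proposed; the function name, the
MAX_MODS limit and the dependency-matrix layout are assumptions made up
for this example. It repeatedly places every module whose providers have
already been placed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MODS 8 /* hypothetical upper bound for this sketch */

/*
 * dep[i][j] is true when module i uses a symbol exported by module j.
 * Writes an initialization order into out[] and returns how many modules
 * were placed: n on success, fewer if a dependency cycle blocks progress.
 */
size_t order_by_symbol_deps(size_t n, bool dep[][MAX_MODS], size_t *out)
{
    bool placed[MAX_MODS] = { false };
    size_t count = 0;

    assert(n <= MAX_MODS);

    /* At most n rounds are needed: each useful round places >= 1 module. */
    for (size_t round = 0; round < n; round++) {
        for (size_t i = 0; i < n; i++) {
            bool providers_ready = true;

            if (placed[i])
                continue;
            for (size_t j = 0; j < n; j++)
                if (dep[i][j] && !placed[j])
                    providers_ready = false;
            if (providers_ready) {
                placed[i] = true;
                out[count++] = i;
            }
        }
    }
    return count;
}
```

With index 0 standing for radeon, 1 for amdkfd and 2 for amd_iommu_v2, and
radeon depending on amdkfd, which in turn depends on amd_iommu_v2, the
resulting order is amd_iommu_v2, amdkfd, radeon regardless of how the
entries are listed.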

>
>> In addition, Linus said he doesn't object to this "band-aid". See:
>> https://lkml.org/lkml/2014/12/25/152
>>
>>  Oded
>>
>>>> Coincidentally there's a separate thread currently going on that deals
>>>> with IOMMUs and probe order. The solution being worked on is currently
>>>> somewhat ARM-specific, so add

[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-29 Thread Laurent Pinchart
Hi Oded,

On Sunday 28 December 2014 13:36:50 Oded Gabbay wrote:
> On 12/26/2014 11:19 AM, Laurent Pinchart wrote:
> > On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
> >> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
> >>> This small patch-set was created to solve the bug described at
> >>> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> >>> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
> >>> called [PATCH 0/3] Use workqueue for device init in amdkfd
> >>> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
> >>> 
> >>> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
> >>> inside the kernel (not as modules). In that case, the correct loading
> >>> order, as determined by the exported symbol used by each driver, is
> >>> not enforced anymore and the kernel loads them based on who was linked
> >>> first. That makes radeon load first, amdkfd second and amd_iommu_v2
> >>> third.
> >>> 
> >>> Because the initialization of a device in amdkfd is initiated by radeon,
> >>> and can only be completed if amdkfd and amd_iommu_v2 were loaded and
> >>> initialized, then in the case mentioned above, this initialization fails
> >>> and there is a kernel panic as some pointers are not initialized but
> >>> used nonetheless.
> >>> 
> >>> To solve this bug, this patch-set moves iommu/ before gpu/ in
> >>> drivers/Makefile and also moves amdkfd/ before radeon/ in
> >>> drivers/gpu/drm/Makefile.
> >>> 
> >>> The rationale is that in general, AMD GPU devices are dependent on AMD
> >>> IOMMU controller functionality to allow the GPU to access a process's
> >>> virtual memory address space, without the need for pinning the memory.
> >>> That's why it makes sense to initialize the iommu/ subsystem ahead of
> >>> the gpu/ subsystem.
> >> 
> >> I strongly object to this patch set. This makes assumptions about how
> >> the build system influences probe order. That's bad because seemingly
> >> unrelated changes could easily break this in the future.
> >> 
> >> We already have ways to solve this kind of dependency (driver probe
> >> deferral), and I think you should be using it to solve this particular
> >> problem rather than some linking order hack.
> > 
> > While I agree with you that probe deferral is the way to go, I believe
> > linkage ordering can still be used as an optimization to avoid deferring
> > probe in the most common cases. I'm thus not opposed to moving iommu/
> > earlier in link order (provided we can properly test for side effects, as
> > the jump is pretty large), but not as a replacement for probe deferral.
> 
> My thoughts exactly. If this was some extreme use case, then it would be
> justified to solve it with probe deferral. But I think that for most common
> cases, GPUs are dependent on IOMMU and *not* vice-versa.
> 
> BTW, my first try at solving this was to use probe deferral (using
> workqueue), but the feedback I got from Christian and Dave was that moving
> iommu/ linkage before gpu/ was a much simpler solution.

To clarify my position, I believe changing the link order can be a worthwhile 
optimization, but I'm uncertain about the long term viability of that change 
as a fix. Probe deferral has been introduced because not all probe ordering 
issues can be fixed through link ordering, so we should fix the problem 
properly.

This being said, if modifying the link order can help for now without 
introducing negative side effects, it would only postpone the real fix, so I'm 
not opposed to it.

> In addition, Linus said he doesn't object to this "band-aid". See:
> https://lkml.org/lkml/2014/12/25/152
> 
>   Oded
> 
> >> Coincidentally there's a separate thread currently going on that deals
> >> with IOMMUs and probe order. The solution being worked on is currently
> >> somewhat ARM-specific, so adding a couple of folks for visibility. It
> >> looks like we're going to need something more generic since this is a
> >> problem that even the "big" architectures need to solve.

-- 
Regards,

Laurent Pinchart



[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-28 Thread Oded Gabbay


On 12/26/2014 11:19 AM, Laurent Pinchart wrote:
> Hi Thierry,
>
> On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
>> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
>>> This small patch-set was created to solve the bug described at
>>> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
>>> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
>>> called [PATCH 0/3] Use workqueue for device init in amdkfd
>>> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
>>>
>>> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
>>> inside the kernel (not as modules). In that case, the correct loading
>>> order, as determined by the exported symbol used by each driver, is
>>> not enforced anymore and the kernel loads them based on who was linked
>>> first. That makes radeon load first, amdkfd second and amd_iommu_v2
>>> third.
>>>
>>> Because the initialization of a device in amdkfd is initiated by radeon,
>>> and can only be completed if amdkfd and amd_iommu_v2 were loaded and
>>> initialized, then in the case mentioned above, this initialization fails
>>> and there is a kernel panic as some pointers are not initialized but
>>> used nonetheless.
>>>
>>> To solve this bug, this patch-set moves iommu/ before gpu/ in
>>> drivers/Makefile and also moves amdkfd/ before radeon/ in
>>> drivers/gpu/drm/Makefile.
>>>
>>> The rationale is that in general, AMD GPU devices are dependent on AMD
>>> IOMMU controller functionality to allow the GPU to access a process's
>>> virtual memory address space, without the need for pinning the memory.
>>> That's why it makes sense to initialize the iommu/ subsystem ahead of the
>>> gpu/ subsystem.
>>
>> I strongly object to this patch set. This makes assumptions about how
>> the build system influences probe order. That's bad because seemingly
>> unrelated changes could easily break this in the future.
>>
>> We already have ways to solve this kind of dependency (driver probe
>> deferral), and I think you should be using it to solve this particular
>> problem rather than some linking order hack.
>
> While I agree with you that probe deferral is the way to go, I believe linkage
> ordering can still be used as an optimization to avoid deferring probe in the
> most common cases. I'm thus not opposed to moving iommu/ earlier in link order
> (provided we can properly test for side effects, as the jump is pretty large),
> but not as a replacement for probe deferral.

My thoughts exactly. If this was some extreme use case, then it would be
justified to solve it with probe deferral. But I think that for most common
cases, GPUs are dependent on IOMMU and *not* vice-versa.

BTW, my first try at solving this was to use probe deferral (using workqueue),
but the feedback I got from Christian and Dave was that moving iommu/ linkage
before gpu/ was a much simpler solution.

In addition, Linus said he doesn't object to this "band-aid". See: 
https://lkml.org/lkml/2014/12/25/152

Oded
>
>> Coincidentally there's a separate thread currently going on that deals
>> with IOMMUs and probe order. The solution being worked on is currently
>> somewhat ARM-specific, so adding a couple of folks for visibility. It
>> looks like we're going to need something more generic since this is a
>> problem that even the "big" architectures need to solve.
>


[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-26 Thread Laurent Pinchart
Hi Thierry,

On Thursday 25 December 2014 14:20:59 Thierry Reding wrote:
> On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
> > This small patch-set was created to solve the bug described at
> > https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> > trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
> > called [PATCH 0/3] Use workqueue for device init in amdkfd
> > (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
> > 
> > That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
> > inside the kernel (not as modules). In that case, the correct loading
> > order, as determined by the exported symbol used by each driver, is
> > not enforced anymore and the kernel loads them based on who was linked
> > first. That makes radeon load first, amdkfd second and amd_iommu_v2
> > third.
> > 
> > Because the initialization of a device in amdkfd is initiated by radeon,
> > and can only be completed if amdkfd and amd_iommu_v2 were loaded and
> > initialized, then in the case mentioned above, this initialization fails
> > and there is a kernel panic as some pointers are not initialized but
> > used nonetheless.
> > 
> > To solve this bug, this patch-set moves iommu/ before gpu/ in
> > drivers/Makefile and also moves amdkfd/ before radeon/ in
> > drivers/gpu/drm/Makefile.
> > 
> > The rationale is that in general, AMD GPU devices are dependent on AMD
> > IOMMU controller functionality to allow the GPU to access a process's
> > virtual memory address space, without the need for pinning the memory.
> > That's why it makes sense to initialize the iommu/ subsystem ahead of the
> > gpu/ subsystem.
>
> I strongly object to this patch set. This makes assumptions about how
> the build system influences probe order. That's bad because seemingly
> unrelated changes could easily break this in the future.
> 
> We already have ways to solve this kind of dependency (driver probe
> deferral), and I think you should be using it to solve this particular
> problem rather than some linking order hack.

While I agree with you that probe deferral is the way to go, I believe linkage 
ordering can still be used as an optimization to avoid deferring probe in the 
most common cases. I'm thus not opposed to moving iommu/ earlier in link order 
(provided we can properly test for side effects, as the jump is pretty large), 
but not as a replacement for probe deferral.

> Coincidentally there's a separate thread currently going on that deals
> with IOMMUs and probe order. The solution being worked on is currently
> somewhat ARM-specific, so adding a couple of folks for visibility. It
> looks like we're going to need something more generic since this is a
> problem that even the "big" architectures need to solve.

-- 
Regards,

Laurent Pinchart



[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-25 Thread Thierry Reding
On Mon, Dec 22, 2014 at 01:07:13PM +0200, Oded Gabbay wrote:
> This small patch-set was created to solve the bug described at
> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set
> called [PATCH 0/3] Use workqueue for device init in amdkfd
> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
> 
> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled 
> inside the kernel (not as modules). In that case, the correct loading 
> order, as determined by the exported symbol used by each driver, is 
> not enforced anymore and the kernel loads them based on who was linked 
> first. That makes radeon load first, amdkfd second and amd_iommu_v2 
> third.
> 
> Because the initialization of a device in amdkfd is initiated by radeon, 
> and can only be completed if amdkfd and amd_iommu_v2 were loaded and 
> initialized, then in the case mentioned above, this initialization fails
> and there is a kernel panic as some pointers are not initialized but
> used nonetheless.
> 
> To solve this bug, this patch-set moves iommu/ before gpu/ in
> drivers/Makefile and also moves amdkfd/ before radeon/ in
> drivers/gpu/drm/Makefile.
> 
> The rationale is that in general, AMD GPU devices are dependent on AMD
> IOMMU controller functionality to allow the GPU to access a process's
> virtual memory address space, without the need for pinning the memory.
> That's why it makes sense to initialize the iommu/ subsystem ahead of
> the gpu/ subsystem.

I strongly object to this patch set. This makes assumptions about how
the build system influences probe order. That's bad because seemingly
unrelated changes could easily break this in the future.

We already have ways to solve this kind of dependency (driver probe
deferral), and I think you should be using it to solve this particular
problem rather than some linking order hack.

Coincidentally there's a separate thread currently going on that deals
with IOMMUs and probe order. The solution being worked on is currently
somewhat ARM-specific, so adding a couple of folks for visibility. It
looks like we're going to need something more generic since this is a
problem that even the "big" architectures need to solve.

Thierry
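
As an illustration of the probe deferral mentioned above, here is a
userspace sketch, not kernel code. It assumes a toy model in which each
device has at most one dependency; struct sim_dev, sim_probe() and
sim_probe_all() are hypothetical names invented for this example. Only the
numeric value of EPROBE_DEFER is taken from the kernel. A probe that finds
its dependency unbound returns -EPROBE_DEFER, and the loop retries
deferred devices until a full pass makes no progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517 /* same numeric value the kernel uses */

/* Toy device: at most one dependency (an assumption of this sketch). */
struct sim_dev {
    const char *name;
    const struct sim_dev *needs; /* NULL when there is no dependency */
    bool bound;
};

/* Defer when the dependency has not been bound yet, otherwise bind. */
static int sim_probe(struct sim_dev *d)
{
    if (d->needs && !d->needs->bound)
        return -EPROBE_DEFER;
    d->bound = true;
    return 0;
}

/*
 * Probe devices in array order and keep retrying the deferred ones until
 * a pass binds nothing new. Returns the number of bound devices.
 */
size_t sim_probe_all(struct sim_dev *devs, size_t n)
{
    bool progress = true;
    size_t bound = 0;

    assert(devs != NULL);

    while (progress) {
        progress = false;
        for (size_t i = 0; i < n; i++) {
            if (devs[i].bound)
                continue;
            if (sim_probe(&devs[i]) == 0) {
                progress = true;
                bound++;
            }
        }
    }
    return bound;
}
```

In this model the array order plays the role of link order: listing
radeon first, amdkfd second and amd_iommu_v2 third still ends with all
three bound, only after extra passes. That matches the view expressed
later in the thread that link order can reduce the number of deferrals
but is not what makes the result correct.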



[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-22 Thread Oded Gabbay
This small patch-set was created to solve the bug described at
https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
trying to use amdkfd driver on Kaveri). It replaces the previous patch-set called
[PATCH 0/3] Use workqueue for device init in amdkfd
(http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)

That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled 
inside the kernel (not as modules). In that case, the correct loading 
order, as determined by the exported symbol used by each driver, is 
not enforced anymore and the kernel loads them based on who was linked 
first. That makes radeon load first, amdkfd second and amd_iommu_v2 
third.

Because the initialization of a device in amdkfd is initiated by radeon, 
and can only be completed if amdkfd and amd_iommu_v2 were loaded and 
initialized, then in the case mentioned above, this initialization fails
and there is a kernel panic as some pointers are not initialized but
used nonetheless.

To solve this bug, this patch-set moves iommu/ before gpu/ in drivers/Makefile 
and also moves amdkfd/ before radeon/ in drivers/gpu/drm/Makefile.

The rationale is that in general, AMD GPU devices are dependent on AMD IOMMU 
controller functionality to allow the GPU to access a process's virtual memory 
address space, without the need for pinning the memory. That's why it makes 
sense to initialize the iommu/ subsystem ahead of the gpu/ subsystem.

Oded

Oded Gabbay (2):
  drivers: Move iommu/ before gpu/ in Makefile
  drm: Put amdkfd before radeon in drm Makefile

 drivers/Makefile         | 6 ++++--
 drivers/gpu/drm/Makefile | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

-- 
1.9.1
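
The ordering problem described in this cover letter can be modelled in a
few lines of userspace C. This is only a sketch under simplifying
assumptions, not kernel code: the enum, struct boot_state and
run_in_link_order() are hypothetical names, and the model reports a
failed init where the real kernel goes on to use uninitialized pointers
and panics. The radeon entry needs both amdkfd and amd_iommu_v2 to be
initialized first, and the inits run strictly in the order given, the way
built-in init calls follow link order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three built-in drivers. */
enum drv { DRV_IOMMU_V2, DRV_AMDKFD, DRV_RADEON, DRV_COUNT };

struct boot_state {
    bool ready[DRV_COUNT];
};

/* An init succeeds only if everything it calls into is already up. */
static int init_one(struct boot_state *s, enum drv d)
{
    switch (d) {
    case DRV_IOMMU_V2:
    case DRV_AMDKFD:
        break; /* no prerequisites in this model */
    case DRV_RADEON:
        /* radeon starts amdkfd device init, which needs the IOMMU */
        if (!s->ready[DRV_AMDKFD] || !s->ready[DRV_IOMMU_V2])
            return -1;
        break;
    default:
        return -1;
    }
    s->ready[d] = true;
    return 0;
}

/*
 * Run the inits in the given (link) order. Returns the index of the
 * first init that fails, or -1 when every init succeeds.
 */
int run_in_link_order(const enum drv *order, size_t n)
{
    struct boot_state s = { { false } };

    assert(order != NULL);

    for (size_t i = 0; i < n; i++)
        if (init_one(&s, order[i]) != 0)
            return (int)i;
    return -1;
}
```

Passing the order radeon, amdkfd, amd_iommu_v2 fails at the first entry,
while amd_iommu_v2, amdkfd, radeon runs to completion, which is the effect
the two Makefile changes aim for.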



[PATCH 0/2] Change order of linkage in kernel makefiles for amdkfd

2014-12-22 Thread Christian König
For this series: Reviewed-by: Christian König 

On 22.12.2014 at 12:07, Oded Gabbay wrote:
> This small patch-set was created to solve the bug described at
> https://bugzilla.kernel.org/show_bug.cgi?id=89661 (Kernel panic when
> trying to use amdkfd driver on Kaveri). It replaces the previous patch-set called
> [PATCH 0/3] Use workqueue for device init in amdkfd
> (http://lists.freedesktop.org/archives/dri-devel/2014-December/074401.html)
>
> That bug appears only when radeon, amdkfd and amd_iommu_v2 are compiled
> inside the kernel (not as modules). In that case, the correct loading
> order, as determined by the exported symbol used by each driver, is
> not enforced anymore and the kernel loads them based on who was linked
> first. That makes radeon load first, amdkfd second and amd_iommu_v2
> third.
>
> Because the initialization of a device in amdkfd is initiated by radeon,
> and can only be completed if amdkfd and amd_iommu_v2 were loaded and
> initialized, then in the case mentioned above, this initialization fails
> and there is a kernel panic as some pointers are not initialized but
> used nonetheless.
>
> To solve this bug, this patch-set moves iommu/ before gpu/ in drivers/Makefile
> and also moves amdkfd/ before radeon/ in drivers/gpu/drm/Makefile.
>
> The rationale is that in general, AMD GPU devices are dependent on AMD IOMMU
> controller functionality to allow the GPU to access a process's virtual memory
> address space, without the need for pinning the memory. That's why it makes
> sense to initialize the iommu/ subsystem ahead of the gpu/ subsystem.
>
>   Oded
>
> Oded Gabbay (2):
>   drivers: Move iommu/ before gpu/ in Makefile
>   drm: Put amdkfd before radeon in drm Makefile
>
>   drivers/Makefile         | 6 ++++--
>   drivers/gpu/drm/Makefile | 2 +-
>   2 files changed, 5 insertions(+), 3 deletions(-)
>