Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-22 Thread Michael Ellerman
Shawn Anastasio  writes:
> On 7/22/19 7:16 AM, Michael Ellerman wrote:
>> Arnd Bergmann  writes:
>>> On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig  wrote:
 On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
> On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
>>> Other than m68k, mips, and arm64, everybody else that doesn't have
>>> ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
>>> I assume this behavior is acceptable on those architectures.
>>
>> It might be acceptable, but there's no reason to use pgport_noncached
>> if the platform supports cache-coherent DMA.
>>
>> Christoph (+cc) made the change so maybe he saw something we're missing.
>
> I always found the forcing of noncached access even for coherent
> devices a little odd, but this was inherited from the previous
> implementation, which surprised me a bit as the different attributes
> are usually problematic even on x86.  Let me dig into the history a
> bit more, but I suspect the righ fix is to default to cached mappings
> for coherent devices.

 Ok, some history:

 The generic dma mmap implementation, which we are effectively still
 using today was added by:

 commit 64ccc9c033c6089b2d426dad3c56477ab066c999
 Author: Marek Szyprowski 
 Date:   Thu Jun 14 13:03:04 2012 +0200

  common: dma-mapping: add support for generic dma_mmap_* calls

 and unconditionally uses pgprot_noncached in dma_common_mmap, which is
 then used as the fallback by dma_mmap_attrs if no ->mmap method is
 present.  At that point we already had the powerpc implementation
 that only uses pgprot_noncached for non-coherent mappings, and
 the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
 is set and otherwise pgprot_dmacoherent, which seems to be uncached.
 Arm did support coherent platforms at that time, but they might have
 been an afterthought and not handled properly.
>>>
>>> Cache-coherent devices are still very rare on 32-bit ARM.
>>>
>>> Among the callers of dma_mmap_coherent(), almost all are in platform
>>> specific device drivers that only ever run on noncoherent ARM SoCs,
>>> which explains why nobody would have noticed problems.
>>>
>>> There is also a difference in behavior between ARM and PowerPC
>>> when dealing with mismatched cacheability attributes: If the same
>>> page is mapped as both cached and uncached to, this may
>>> cause silent undefined behavior on ARM, while PowerPC should
>>> enter a checkstop as soon as it notices.
>> 
>> On newer Power CPUs it's actually more like the ARM behaviour.
>> 
>> I don't know for sure that it will *never* checkstop but there are at
>> least cases where it won't. There's some (not much) detail in the
>> Power8/9 user manuals.
>
> The issue was discovered due to sporadic checkstops on P9, so it
> seems like it will happen at least sometimes.

Yeah true. I wasn't sure if that checkstop was actually caused by a
cached/uncached mismatch or something else, but looks like it was, from
the hostboot issue (https://github.com/open-power/hostboot/issues/180):

  12.47015|  Signature Description: pu.ex:k0:n0:s0:p00:c0 (L2FIR[16]) Cache 
line inhibited hit cacheable space


So I'm not really sure how to square that with the documentation in the
user manual:

  If a caching-inhibited load instruction hits in the L1 data cache, the
  load data is serviced from the L1 data cache and no request is sent to
  the NCU.
  If a caching-inhibited store instruction hits in the L1 data cache,
  the store data is written to the L1 data cache and sent to the NCU.
  Note that the L1 data cache and L2 cache are no longer coherent.

I guess I'm either misinterpreting that section or there's some *other*
documentation somewhere I haven't found that says that it will also
checkstop.

cheers
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-22 Thread Shawn Anastasio via iommu




On 7/22/19 7:16 AM, Michael Ellerman wrote:

Arnd Bergmann  writes:

On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig  wrote:

On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:

On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:

Other than m68k, mips, and arm64, everybody else that doesn't have
ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
I assume this behavior is acceptable on those architectures.


It might be acceptable, but there's no reason to use pgport_noncached
if the platform supports cache-coherent DMA.

Christoph (+cc) made the change so maybe he saw something we're missing.


I always found the forcing of noncached access even for coherent
devices a little odd, but this was inherited from the previous
implementation, which surprised me a bit as the different attributes
are usually problematic even on x86.  Let me dig into the history a
bit more, but I suspect the righ fix is to default to cached mappings
for coherent devices.


Ok, some history:

The generic dma mmap implementation, which we are effectively still
using today was added by:

commit 64ccc9c033c6089b2d426dad3c56477ab066c999
Author: Marek Szyprowski 
Date:   Thu Jun 14 13:03:04 2012 +0200

 common: dma-mapping: add support for generic dma_mmap_* calls

and unconditionally uses pgprot_noncached in dma_common_mmap, which is
then used as the fallback by dma_mmap_attrs if no ->mmap method is
present.  At that point we already had the powerpc implementation
that only uses pgprot_noncached for non-coherent mappings, and
the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
is set and otherwise pgprot_dmacoherent, which seems to be uncached.
Arm did support coherent platforms at that time, but they might have
been an afterthought and not handled properly.


Cache-coherent devices are still very rare on 32-bit ARM.

Among the callers of dma_mmap_coherent(), almost all are in platform
specific device drivers that only ever run on noncoherent ARM SoCs,
which explains why nobody would have noticed problems.

There is also a difference in behavior between ARM and PowerPC
when dealing with mismatched cacheability attributes: If the same
page is mapped as both cached and uncached to, this may
cause silent undefined behavior on ARM, while PowerPC should
enter a checkstop as soon as it notices.


On newer Power CPUs it's actually more like the ARM behaviour.

I don't know for sure that it will *never* checkstop but there are at
least cases where it won't. There's some (not much) detail in the
Power8/9 user manuals.


The issue was discovered due to sporadic checkstops on P9, so it
seems like it will happen at least sometimes.


cheers

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-22 Thread Michael Ellerman
Arnd Bergmann  writes:
> On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig  wrote:
>> On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
>> > On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
>> > > > Other than m68k, mips, and arm64, everybody else that doesn't have
>> > > > ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
>> > > > I assume this behavior is acceptable on those architectures.
>> > >
>> > > It might be acceptable, but there's no reason to use pgport_noncached
>> > > if the platform supports cache-coherent DMA.
>> > >
>> > > Christoph (+cc) made the change so maybe he saw something we're missing.
>> >
>> > I always found the forcing of noncached access even for coherent
>> > devices a little odd, but this was inherited from the previous
>> > implementation, which surprised me a bit as the different attributes
>> > are usually problematic even on x86.  Let me dig into the history a
>> > bit more, but I suspect the righ fix is to default to cached mappings
>> > for coherent devices.
>>
>> Ok, some history:
>>
>> The generic dma mmap implementation, which we are effectively still
>> using today was added by:
>>
>> commit 64ccc9c033c6089b2d426dad3c56477ab066c999
>> Author: Marek Szyprowski 
>> Date:   Thu Jun 14 13:03:04 2012 +0200
>>
>> common: dma-mapping: add support for generic dma_mmap_* calls
>>
>> and unconditionally uses pgprot_noncached in dma_common_mmap, which is
>> then used as the fallback by dma_mmap_attrs if no ->mmap method is
>> present.  At that point we already had the powerpc implementation
>> that only uses pgprot_noncached for non-coherent mappings, and
>> the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
>> is set and otherwise pgprot_dmacoherent, which seems to be uncached.
>> Arm did support coherent platforms at that time, but they might have
>> been an afterthought and not handled properly.
>
> Cache-coherent devices are still very rare on 32-bit ARM.
>
> Among the callers of dma_mmap_coherent(), almost all are in platform
> specific device drivers that only ever run on noncoherent ARM SoCs,
> which explains why nobody would have noticed problems.
>
> There is also a difference in behavior between ARM and PowerPC
> when dealing with mismatched cacheability attributes: If the same
> page is mapped as both cached and uncached to, this may
> cause silent undefined behavior on ARM, while PowerPC should
> enter a checkstop as soon as it notices.

On newer Power CPUs it's actually more like the ARM behaviour.

I don't know for sure that it will *never* checkstop but there are at
least cases where it won't. There's some (not much) detail in the
Power8/9 user manuals.

cheers
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Arnd Bergmann
On Thu, Jul 18, 2019 at 11:52 AM Christoph Hellwig  wrote:
>
> On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
> > On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
> > > > Other than m68k, mips, and arm64, everybody else that doesn't have
> > > > ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
> > > > I assume this behavior is acceptable on those architectures.
> > >
> > > It might be acceptable, but there's no reason to use pgport_noncached
> > > if the platform supports cache-coherent DMA.
> > >
> > > Christoph (+cc) made the change so maybe he saw something we're missing.
> >
> > I always found the forcing of noncached access even for coherent
> > devices a little odd, but this was inherited from the previous
> > implementation, which surprised me a bit as the different attributes
> > are usually problematic even on x86.  Let me dig into the history a
> > bit more, but I suspect the righ fix is to default to cached mappings
> > for coherent devices.
>
> Ok, some history:
>
> The generic dma mmap implementation, which we are effectively still
> using today was added by:
>
> commit 64ccc9c033c6089b2d426dad3c56477ab066c999
> Author: Marek Szyprowski 
> Date:   Thu Jun 14 13:03:04 2012 +0200
>
> common: dma-mapping: add support for generic dma_mmap_* calls
>
> and unconditionally uses pgprot_noncached in dma_common_mmap, which is
> then used as the fallback by dma_mmap_attrs if no ->mmap method is
> present.  At that point we already had the powerpc implementation
> that only uses pgprot_noncached for non-coherent mappings, and
> the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
> is set and otherwise pgprot_dmacoherent, which seems to be uncached.
> Arm did support coherent platforms at that time, but they might have
> been an afterthought and not handled properly.

Cache-coherent devices are still very rare on 32-bit ARM.

Among the callers of dma_mmap_coherent(), almost all are in platform
specific device drivers that only ever run on noncoherent ARM SoCs,
which explains why nobody would have noticed problems.

There is also a difference in behavior between ARM and PowerPC
when dealing with mismatched cacheability attributes: If the same
page is mapped as both cached and uncached to, this may
cause silent undefined behavior on ARM, while PowerPC should
enter a checkstop as soon as it notices.

  Arnd
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Shawn Anastasio via iommu

On 7/19/19 2:06 AM, Christoph Hellwig wrote:
> What is inherently architecture specific here over the fact that
> the pgprot_* expand to architecture specific bits?

What I meant is that different architectures seem to have different
criteria for setting the different pgprot_ bits. i.e. ppc checks
for cache coherency, arm64 checks for cache coherency and
writecombine, mips just checks for writecombine, etc.

That being said, I'm no expert here and there is probably some behavior
here that would make for a much more sane default.

> I'd rather not create boilerplate code where we don't have to it. Note
> that arch_dma_mmap_pgprot is a little misnamed now, as we also use it
> for the in-kernel remapping in kernel/dma/remap.c, which I'm slowly
> switching a lot of architectures to.  powerpc will follow soon once
> I get the ppc44x that was given to me to actually boot with a recent
> kernel (not that I've tried much so far).

Fair enough. I didn't realize that most of the other architectures
don't use the common code anyways as you mention below.

> Every arch except for arm32 now uses dma direct for the direct mapping,
> and thus the common code without the indeed odd default.  I also have
> a series to remove the default fallback, which is inherently a bad idea:
>
> 
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-no-defaults


Awesome. This is great to hear.

> I think your patch that started this thread is fine for 5.3 and -stable:
>
> Reviewed-by: Christoph Hellwig 

Thanks!

> But going forward I'd rather have a sane default.

Agreed.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-19 Thread Christoph Hellwig
On Thu, Jul 18, 2019 at 02:46:00PM -0500, Shawn Anastasio wrote:
> Personally, I'm not a huge fan of an implicit default for something
> inherently architecture-dependent like this at all.

What is inherently architecture specific here over the fact that
the pgprot_* expand to architecture specific bits?

> What I'd like to
> see is a mechanism that forces architecture code to explicitly
> opt in to the default pgprot settings if they don't provide an
> implementation of arch_dma_mmap_pgprot. This could perhaps be done
> by reversing ARCH_HAS_DMA_MMAP_PGPROT to something like
> ARCH_USE_DEFAULT_DMA_MMAP_PGPROT.

I'd rather not create boilerplate code where we don't have to it.  Note
that arch_dma_mmap_pgprot is a little misnamed now, as we also use it
for the in-kernel remapping in kernel/dma/remap.c, which I'm slowly
switching a lot of architectures to.  powerpc will follow soon once
I get the ppc44x that was given to me to actually boot with a recent
kernel (not that I've tried much so far).

>
> This way as more systems are moved to use the common mmap code instead
> of their ops->mmap, the people doing the refactoring have to make an
> explicit decision about the pgprot settings to use. Such a configuration
> would have likely prevented this situation with powerpc from happening.

Every arch except for arm32 now uses dma direct for the direct mapping,
and thus the common code without the indeed odd default.  I also have
a series to remove the default fallback, which is inherently a bad idea:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/dma-no-defaults

> That being said, if the default behavior doesn't make sense in the
> general case it should probably be fixed as well.
>
> Curious to hear some thoughts on this.

I think your patch that started this thread is fine for 5.3 and -stable:

Reviewed-by: Christoph Hellwig 

But going forward I'd rather have a sane default.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-18 Thread Shawn Anastasio via iommu

On 7/18/19 4:52 AM, Christoph Hellwig wrote:

On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:

On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:

Other than m68k, mips, and arm64, everybody else that doesn't have
ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
I assume this behavior is acceptable on those architectures.


It might be acceptable, but there's no reason to use pgport_noncached
if the platform supports cache-coherent DMA.

Christoph (+cc) made the change so maybe he saw something we're missing.


I always found the forcing of noncached access even for coherent
devices a little odd, but this was inherited from the previous
implementation, which surprised me a bit as the different attributes
are usually problematic even on x86.  Let me dig into the history a
bit more, but I suspect the righ fix is to default to cached mappings
for coherent devices.


Ok, some history:

The generic dma mmap implementation, which we are effectively still
using today was added by:

commit 64ccc9c033c6089b2d426dad3c56477ab066c999
Author: Marek Szyprowski 
Date:   Thu Jun 14 13:03:04 2012 +0200

 common: dma-mapping: add support for generic dma_mmap_* calls

and unconditionally uses pgprot_noncached in dma_common_mmap, which is
then used as the fallback by dma_mmap_attrs if no ->mmap method is
present.  At that point we already had the powerpc implementation
that only uses pgprot_noncached for non-coherent mappings, and
the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
is set and otherwise pgprot_dmacoherent, which seems to be uncached.
Arm did support coherent platforms at that time, but they might have
been an afterthought and not handled properly.

So it migt have been that we were all wrong for that time and might
have to fix it up.


Personally, I'm not a huge fan of an implicit default for something
inherently architecture-dependent like this at all. What I'd like to
see is a mechanism that forces architecture code to explicitly
opt in to the default pgprot settings if they don't provide an
implementation of arch_dma_mmap_pgprot. This could perhaps be done
by reversing ARCH_HAS_DMA_MMAP_PGPROT to something like
ARCH_USE_DEFAULT_DMA_MMAP_PGPROT.

This way as more systems are moved to use the common mmap code instead
of their ops->mmap, the people doing the refactoring have to make an
explicit decision about the pgprot settings to use. Such a configuration
would have likely prevented this situation with powerpc from happening.

That being said, if the default behavior doesn't make sense in the
general case it should probably be fixed as well.

Curious to hear some thoughts on this.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] powerpc/dma: Fix invalid DMA mmap behavior

2019-07-18 Thread Christoph Hellwig
On Thu, Jul 18, 2019 at 10:49:34AM +0200, Christoph Hellwig wrote:
> On Thu, Jul 18, 2019 at 01:45:16PM +1000, Oliver O'Halloran wrote:
> > > Other than m68k, mips, and arm64, everybody else that doesn't have
> > > ARCH_NO_COHERENT_DMA_MMAP set uses this default implementation, so
> > > I assume this behavior is acceptable on those architectures.
> > 
> > It might be acceptable, but there's no reason to use pgport_noncached
> > if the platform supports cache-coherent DMA.
> > 
> > Christoph (+cc) made the change so maybe he saw something we're missing.
> 
> I always found the forcing of noncached access even for coherent
> devices a little odd, but this was inherited from the previous
> implementation, which surprised me a bit as the different attributes
> are usually problematic even on x86.  Let me dig into the history a
> bit more, but I suspect the righ fix is to default to cached mappings
> for coherent devices.

Ok, some history:

The generic dma mmap implementation, which we are effectively still
using today was added by:

commit 64ccc9c033c6089b2d426dad3c56477ab066c999
Author: Marek Szyprowski 
Date:   Thu Jun 14 13:03:04 2012 +0200

common: dma-mapping: add support for generic dma_mmap_* calls

and unconditionally uses pgprot_noncached in dma_common_mmap, which is
then used as the fallback by dma_mmap_attrs if no ->mmap method is
present.  At that point we already had the powerpc implementation
that only uses pgprot_noncached for non-coherent mappings, and
the arm one, which uses pgprot_writecombine if DMA_ATTR_WRITE_COMBINE
is set and otherwise pgprot_dmacoherent, which seems to be uncached.
Arm did support coherent platforms at that time, but they might have
been an afterthought and not handled properly.

So it migt have been that we were all wrong for that time and might
have to fix it up.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu