Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-05 Thread Jason Harmening
On Thu, Nov 5, 2015 at 9:21 AM, Ian Lepore  wrote:

> Just as an FYI for you guys, since we're all splashing around in busdma
> code lately...  The things on my near-term (next several weeks) to-do
> list for busdma are:
>
> Add "small bounce" support to help reduce the bounce preallocation
> overhead for handling bounces due to cacheline alignment.  Almost all
> arm and mips bounces are triggered by cache alignment restrictions and
> involve buffers smaller than a page (usually 256 bytes or less), and we
> currently allocate full pages for bouncing.  Instead we can use the
> same small-buffer uma pools that are used for bus_dmamem_alloc().
>

I like that.  Along similar (but definitely not the same) lines, I had a
suggestion in https://reviews.freebsd.org/D888 to coalesce bounce buffers,
similar to how addseg already coalesces adjacent segments.  Would something
like that also help here, in addition to small-bounce, to cut down on
bounce-buffer overhead when the total buffer size is still a page or larger?
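
A rough sketch of what that kind of coalescing check might look like (this is
not the D888 patch; the map->bpages field and helper are assumed from the
existing implementation, and the per-chunk copy-back tracking a real change
would need is glossed over):

/*
 * Illustrative only: before grabbing a fresh bounce page, try to pack
 * the chunk behind the previous bounce page's data if it still fits.
 * A real change would also have to track each client region separately
 * so the copy-back at sync time still works.
 */
static bus_addr_t
add_or_coalesce_bounce_page(bus_dma_tag_t dmat, bus_dmamap_t map,
    vm_offset_t vaddr, bus_addr_t addr, bus_size_t size)
{
	struct bounce_page *bpage;

	bpage = STAILQ_LAST(&map->bpages, bounce_page, links);
	if (bpage != NULL && bpage->datacount + size <= PAGE_SIZE) {
		bus_addr_t busaddr;

		busaddr = bpage->busaddr + bpage->datacount;
		bpage->datacount += size;
		return (busaddr);	/* share the existing bounce page */
	}
	return (add_bounce_page(dmat, map, vaddr, addr, size));
}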


>
> The armv6 busdma implementation can be shared by armv4, armv6, and
> mips.  The only thing blocking this right now is the small-bounce
> stuff; armv4 and mips boards often have only 32-128MB of ram, and they
> can't afford the bounce-page allocation overhead of the current armv6
> implementation.  Once the small-bounce support is in place, we can use
> this implementation for all these platforms (and basically for any
> platform that comes along that has a similar requirement for software
> -assisted cache coherency for DMA).
>

Do you plan to merge the armv6 and mips busdma implementations soonish?  If
so, then https://reviews.freebsd.org/D3986 can go away.  That will also fix
the problems mips busdma has with the cacheline copies in sync_buf() not
being thread-safe.

Code duplication in busdma is a huge pain right now.  It seems like we have
three broad classes of busdma support: bounce/coherent, bounce/noncoherent,
and iommu.

bounce/coherent could probably share all the same code.  You might need a
light dusting of platform-specific code (e.g. powerpc issues a sync
instruction at the end of bus_dmamap_sync), but otherwise it should be the
same.
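
The platform-specific part could be as small as a hook macro along these
lines (the macro name is hypothetical; the powerpc behavior is the one
mentioned above):

/* Hypothetical per-platform hook invoked at the end of a shared
 * bus_dmamap_sync(); empty where no extra barrier is needed. */
#ifdef __powerpc__
#define	BUSDMA_MD_POST_SYNC()	__asm __volatile("sync" ::: "memory")
#else
#define	BUSDMA_MD_POST_SYNC()	do { } while (0)
#endif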

bounce/noncoherent could share the same code, as long as you define a set
of macros or inline funcs that abstract the cache maintenance.  That'll
already be needed to cover the cache differences between armv4, armv6, and
mips anyway.
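
For example, the shared code could call neutral names and let each port map
them onto its own primitives; the macro names here are hypothetical and the
armv6 mappings are only an illustration:

/* Hypothetical cache hooks for a shared noncoherent implementation. */
#if defined(__arm__) && __ARM_ARCH >= 6
#define	dma_cache_wb(va, pa, size)	dcache_wb_poc((va), (pa), (size))
#define	dma_cache_inv(va, pa, size)	dcache_inv_poc((va), (pa), (size))
#define	dma_cache_wbinv(va, pa, size)	dcache_wbinv_poc((va), (pa), (size))
#endif
/* armv4 and mips ports would supply cpufunc-based definitions instead. */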

iommu is always going to be hardware-specific and need its own
implementation for each kind of iommu.

Then there's some logic that could probably be shared everywhere:  a lot of
the bus_dmamem_alloc()/bufalloc code strikes me as falling into that
category.  You might need MD wrappers that call MI helper functions, so for
example the MD code can decide whether BUS_DMA_COHERENT means
VM_MEMATTR_UNCACHEABLE, or whether allocated buffers really need to be
physically contiguous (not needed for iommu).
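
One hypothetical shape for that split, with an MI helper taking the MD policy
decisions as parameters (the policy struct and helper name are made up for
illustration):

/* Hypothetical MI/MD split for bus_dmamem_alloc(). */
struct busdma_alloc_policy {
	vm_memattr_t	coherent_memattr;	/* what BUS_DMA_COHERENT means */
	bool		need_contig;		/* false when an IOMMU remaps */
};

int	bus_dmamem_alloc_common(bus_dma_tag_t dmat, void **vaddr, int flags,
	    bus_dmamap_t *mapp, const struct busdma_alloc_policy *pol);

/* MD wrapper: the port only supplies policy. */
int
bus_dmamem_alloc(bus_dma_tag_t dmat, void **vaddr, int flags,
    bus_dmamap_t *mapp)
{
	static const struct busdma_alloc_policy pol = {
		.coherent_memattr = VM_MEMATTR_UNCACHEABLE,
		.need_contig = true,		/* no IOMMU on this port */
	};

	return (bus_dmamem_alloc_common(dmat, vaddr, flags, mapp, &pol));
}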

On top of all that, it makes sense to generalize the vtable approach
already used for x86 and arm64.   In a lot of cases the vtable would just
point to the common bounce/coherent or bounce/noncoherent functions, or in
some cases lightweight MD wrappers around those.   Then for example it
would be easy to add bounce/noncoherent support for arm64 if we end up
needing it, and somewhat easier to add arm/mips/ppc iommu support.
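
A trimmed-down sketch of such a method table (loosely modeled on the
x86/arm64 approach; the struct and function names here are hypothetical):

/* Hypothetical busdma method table; each tag would carry a pointer to one. */
struct busdma_methods {
	int	(*load_buffer)(bus_dma_tag_t dmat, bus_dmamap_t map,
		    void *buf, bus_size_t buflen, pmap_t pmap, int flags,
		    bus_dma_segment_t *segs, int *segp);
	void	(*map_sync)(bus_dma_tag_t dmat, bus_dmamap_t map,
		    bus_dmasync_op_t op);
	void	(*map_unload)(bus_dma_tag_t dmat, bus_dmamap_t map);
};

extern const struct busdma_methods bounce_coherent_methods;
extern const struct busdma_methods bounce_noncoherent_methods;
/* An arm/mips/ppc IOMMU port would provide its own table. */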

I'm definitely not suggesting you do all that stuff right now, just
throwing ideas out there :)


>
> -- Ian
>
> On Thu, 2015-11-05 at 08:08 -0800, Jason Harmening wrote:
> > Userspace buffers in load_buffer() also need temporary mappings, so
> > it
> > might be nice to keep the panic/KASSERT there for completeness.  I
> > don't
> > see how  anything coming from userspace would be outside
> > vm_page_array
> > though, so that is up to you and Michal.
> >
> > Since Michal's already made the initial patch, I think he should make
> > the
> > commit.  That seems only right, plus I'm still busy trying to get
> > everything set up at my new place.  I'd like to be on the phabricator
> > review, though.
> >
> > On Thu, Nov 5, 2015 at 3:58 AM, Svatopluk Kraus 
> > wrote:
> >
> > > On Thu, Nov 5, 2015 at 1:11 AM, Jason Harmening
> > >  wrote:
> > > >
> > > > On Wed, Nov 4, 2015 at 2:17 PM, Svatopluk Kraus  > > > >
> > > wrote:
> > > > >
> > > > > On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
> > > > >  wrote:
> > > > > >
> > > > > > On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore 
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > It's almost certainly not related to sysinit ordering.
> > > > > > >  This
> > > exception
> > > > > > > is happening during mmc probing after interrupts are
> > > > > > > enabled.
> > > > > > >
> > > > > > > It appears that the problem is the faulting code is running
> > > > > > > on one of
> > > > > > > the very early pre-allocated kernel stacks (perhaps in an
> > > > > > > interrupt
> > > > > > > handler on an idle thread stack), and these stacks are not
> > > > > > > in memory
> > > > > > > represented in the vm page arrays.

Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-05 Thread Ian Lepore
Just as an FYI for you guys, since we're all splashing around in busdma
code lately...  The things on my near-term (next several weeks) to-do
list for busdma are:

Add "small bounce" support to help reduce the bounce preallocation
overhead for handling bounces due to cacheline alignment.  Almost all
arm and mips bounces are triggered by cache alignment restrictions and
involve buffers smaller than a page (usually 256 bytes or less), and we
currently allocate full pages for bouncing.  Instead we can use the
same small-buffer uma pools that are used for bus_dmamem_alloc().
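
A minimal sketch of that direction, using a dedicated UMA zone as a stand-in
for the busdma_bufalloc small-buffer pools; the zone name, size, and SYSINIT
placement are assumptions, and a real version would still have to honor each
tag's address limits:

#include <sys/param.h>
#include <sys/kernel.h>
#include <vm/uma.h>

static uma_zone_t small_bounce_zone;

static void
small_bounce_init(void *arg __unused)
{
	/* 256 bytes covers the typical cacheline-alignment bounce. */
	small_bounce_zone = uma_zcreate("small busdma bounce", 256,
	    NULL, NULL, NULL, NULL, UMA_ALIGN_CACHE, 0);
}
SYSINIT(small_bounce, SI_SUB_KMEM, SI_ORDER_ANY, small_bounce_init, NULL);

/* add_bounce_page() would then fall back to a full page only when a
 * transfer does not fit one of these small buffers. */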

The armv6 busdma implementation can be shared by armv4, armv6, and
mips.  The only thing blocking this right now is the small-bounce
stuff; armv4 and mips boards often have only 32-128MB of ram, and they
can't afford the bounce-page allocation overhead of the current armv6
implementation.  Once the small-bounce support is in place, we can use
this implementation for all these platforms (and basically for any
platform that comes along that has a similar requirement for software
-assisted cache coherency for DMA).

-- Ian

On Thu, 2015-11-05 at 08:08 -0800, Jason Harmening wrote:
> Userspace buffers in load_buffer() also need temporary mappings, so
> it
> might be nice to keep the panic/KASSERT there for completeness.  I
> don't
> see how  anything coming from userspace would be outside
> vm_page_array
> though, so that is up to you and Michal.
> 
> Since Michal's already made the initial patch, I think he should make
> the
> commit.  That seems only right, plus I'm still busy trying to get
> everything set up at my new place.  I'd like to be on the phabricator
> review, though.
> 
> On Thu, Nov 5, 2015 at 3:58 AM, Svatopluk Kraus 
> wrote:
> 
> > On Thu, Nov 5, 2015 at 1:11 AM, Jason Harmening
> >  wrote:
> > > 
> > > On Wed, Nov 4, 2015 at 2:17 PM, Svatopluk Kraus  > > >
> > wrote:
> > > > 
> > > > On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
> > > >  wrote:
> > > > > 
> > > > > On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore 
> > > > > wrote:
> > > > > > 
> > > > > > 
> > > > > > It's almost certainly not related to sysinit ordering. 
> > > > > >  This
> > exception
> > > > > > is happening during mmc probing after interrupts are
> > > > > > enabled.
> > > > > > 
> > > > > > It appears that the problem is the faulting code is running
> > > > > > on one of
> > > > > > the very early pre-allocated kernel stacks (perhaps in an
> > > > > > interrupt
> > > > > > handler on an idle thread stack), and these stacks are not
> > > > > > in memory
> > > > > > represented in the vm page arrays.  The mmc code is using a
> > > > > > 64-byte
> > > > > > buffer on the stack and mapping it for DMA.  Normally that
> > > > > > causes a
> > > > > > bounce for cacheline alignment, but unluckily in this case
> > > > > > that
> > buffer
> > > > > > on the stack just happened to be aligned to a cacheline
> > > > > > boundary and
> > a
> > > > > > multiple of the cacheline size, so no bounce.  That causes
> > > > > > the new
> > sync
> > > > > > logic that is based on keeping vm_page_t pointers and
> > > > > > offsets to get
> > a
> > > > > > NULL pointer back from PHYS_TO_VM_PAGE when mapping, then
> > > > > > it dies at
> > > > > > sync time trying to dereference that.  It used to work
> > > > > > because the
> > sync
> > > > > > logic used to use the vaddr, not a page pointer.
> > > > > > 
> > > > > > Michal was working on a patch yesterday.
> > > > > > 
> > > > > 
> > > > > Ah, thanks for pointing that out Ian.  I was left scratching
> > > > > my head
> > > > > (admittedly on the road and w/o easy access to the code)
> > > > > wondering
> > what
> > > > > on
> > > > > earth would be trying to do DMA during SI_SUB_CPU.
> > > > > 
> > > > 
> > > > 
> > > > Using of fictitious pages is not so easy here as in case
> > > > pointed by
> > > > kib@ where they are allocated and freed inside one function.
> > > > For sync
> > > > list sake, they must be allocated when a buffer is loaded and
> > > > freed
> > > > when is unloaded.
> > > > 
> > > > Michal uses pmap_kextract() in case when KVA of buffer is not
> > > > zero in
> > > > his proof-of-concept patch:
> > > > https://gist.github.com/strejda/d5ca3ed202427a2e0557
> > > > When KVA of buffer is not zero, temporary mapping is not used
> > > > at all,
> > > > so vm_page_t array is not needed too.
> > > > 
> > > > IMO, buffer's physical address can be saved in sync list to be
> > > > used
> > > > when its KVA is not zero. Thus pmap_kextract() won't be called
> > > > in
> > > > dma_dcache_sync().
> > > 
> > > 
> > > I like the idea of storing off the physical address.  If you want
> > > to save
> > > space in the sync list, I think you can place busaddr and pages
> > > in a
> > union,
> > > using vaddr == 0 to select which field to use.  Some people frown
> > > upon
> > use
> > > of unions, though.
> > > 
> > > Any reason the panics on PHYS_TO_VM_PAGE in load_buffer() and
> > > load_phys()
> > > shouldn't be KASSERTs instead?

Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-05 Thread Jason Harmening
Userspace buffers in load_buffer() also need temporary mappings, so it
might be nice to keep the panic/KASSERT there for completeness.  I don't
see how anything coming from userspace would be outside vm_page_array
though, so that is up to you and Michal.

Since Michal's already made the initial patch, I think he should make the
commit.  That seems only right, plus I'm still busy trying to get
everything set up at my new place.  I'd like to be on the phabricator
review, though.

On Thu, Nov 5, 2015 at 3:58 AM, Svatopluk Kraus  wrote:

> On Thu, Nov 5, 2015 at 1:11 AM, Jason Harmening
>  wrote:
> >
> > On Wed, Nov 4, 2015 at 2:17 PM, Svatopluk Kraus 
> wrote:
> >>
> >> On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
> >>  wrote:
> >> >
> >> > On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore  wrote:
> >> >>
> >> >>
> >> >> It's almost certainly not related to sysinit ordering.  This
> exception
> >> >> is happening during mmc probing after interrupts are enabled.
> >> >>
> >> >> It appears that the problem is the faulting code is running on one of
> >> >> the very early pre-allocated kernel stacks (perhaps in an interrupt
> >> >> handler on an idle thread stack), and these stacks are not in memory
> >> >> represented in the vm page arrays.  The mmc code is using a 64-byte
> >> >> buffer on the stack and mapping it for DMA.  Normally that causes a
> >> >> bounce for cacheline alignment, but unluckily in this case that
> buffer
> >> >> on the stack just happened to be aligned to a cacheline boundary and
> a
> >> >> multiple of the cacheline size, so no bounce.  That causes the new
> sync
> >> >> logic that is based on keeping vm_page_t pointers and offsets to get
> a
> >> >> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
> >> >> sync time trying to dereference that.  It used to work because the
> sync
> >> >> logic used to use the vaddr, not a page pointer.
> >> >>
> >> >> Michal was working on a patch yesterday.
> >> >>
> >> >
> >> > Ah, thanks for pointing that out Ian.  I was left scratching my head
> >> > (admittedly on the road and w/o easy access to the code) wondering
> what
> >> > on
> >> > earth would be trying to do DMA during SI_SUB_CPU.
> >> >
> >>
> >>
> >> Using of fictitious pages is not so easy here as in case pointed by
> >> kib@ where they are allocated and freed inside one function. For sync
> >> list sake, they must be allocated when a buffer is loaded and freed
> >> when is unloaded.
> >>
> >> Michal uses pmap_kextract() in case when KVA of buffer is not zero in
> >> his proof-of-concept patch:
> >> https://gist.github.com/strejda/d5ca3ed202427a2e0557
> >> When KVA of buffer is not zero, temporary mapping is not used at all,
> >> so vm_page_t array is not needed too.
> >>
> >> IMO, buffer's physical address can be saved in sync list to be used
> >> when its KVA is not zero. Thus pmap_kextract() won't be called in
> >> dma_dcache_sync().
> >
> >
> > I like the idea of storing off the physical address.  If you want to save
> > space in the sync list, I think you can place busaddr and pages in a
> union,
> > using vaddr == 0 to select which field to use.  Some people frown upon
> use
> > of unions, though.
> >
> > Any reason the panics on PHYS_TO_VM_PAGE in load_buffer() and load_phys()
> > shouldn't be KASSERTs instead?
>
>
> KASSERTs are fine. I have looked at Michal's patch again and only
> KASSERT should be in load_phys() as such buffers are unmapped,
> temporary mapping must be used and so, they must be in vm_page_array.
> And maybe some inline function for getting PA from sync list would be
> nice too. I would like to review final patch if you are going to make
> it. And it would be nice if Ganbold will test it. Phabricator?
>


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-05 Thread Svatopluk Kraus
On Thu, Nov 5, 2015 at 1:11 AM, Jason Harmening
 wrote:
>
> On Wed, Nov 4, 2015 at 2:17 PM, Svatopluk Kraus  wrote:
>>
>> On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
>>  wrote:
>> >
>> > On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore  wrote:
>> >>
>> >>
>> >> It's almost certainly not related to sysinit ordering.  This exception
>> >> is happening during mmc probing after interrupts are enabled.
>> >>
>> >> It appears that the problem is the faulting code is running on one of
>> >> the very early pre-allocated kernel stacks (perhaps in an interrupt
>> >> handler on an idle thread stack), and these stacks are not in memory
>> >> represented in the vm page arrays.  The mmc code is using a 64-byte
>> >> buffer on the stack and mapping it for DMA.  Normally that causes a
>> >> bounce for cacheline alignment, but unluckily in this case that buffer
>> >> on the stack just happened to be aligned to a cacheline boundary and a
>> >> multiple of the cacheline size, so no bounce.  That causes the new sync
>> >> logic that is based on keeping vm_page_t pointers and offsets to get a
>> >> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
>> >> sync time trying to dereference that.  It used to work because the sync
>> >> logic used to use the vaddr, not a page pointer.
>> >>
>> >> Michal was working on a patch yesterday.
>> >>
>> >
>> > Ah, thanks for pointing that out Ian.  I was left scratching my head
>> > (admittedly on the road and w/o easy access to the code) wondering what
>> > on
>> > earth would be trying to do DMA during SI_SUB_CPU.
>> >
>>
>>
>> Using of fictitious pages is not so easy here as in case pointed by
>> kib@ where they are allocated and freed inside one function. For sync
>> list sake, they must be allocated when a buffer is loaded and freed
>> when is unloaded.
>>
>> Michal uses pmap_kextract() in case when KVA of buffer is not zero in
>> his proof-of-concept patch:
>> https://gist.github.com/strejda/d5ca3ed202427a2e0557
>> When KVA of buffer is not zero, temporary mapping is not used at all,
>> so vm_page_t array is not needed too.
>>
>> IMO, buffer's physical address can be saved in sync list to be used
>> when its KVA is not zero. Thus pmap_kextract() won't be called in
>> dma_dcache_sync().
>
>
> I like the idea of storing off the physical address.  If you want to save
> space in the sync list, I think you can place busaddr and pages in a union,
> using vaddr == 0 to select which field to use.  Some people frown upon use
> of unions, though.
>
> Any reason the panics on PHYS_TO_VM_PAGE in load_buffer() and load_phys()
> shouldn't be KASSERTs instead?


KASSERTs are fine. I have looked at Michal's patch again, and only
load_phys() should have the KASSERT: such buffers are unmapped, so a
temporary mapping must be used and they must be in vm_page_array.
Maybe an inline function for getting the PA from a sync list entry
would be nice too. I would like to review the final patch if you are
going to make it, and it would be nice if Ganbold could test it.
Phabricator?
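
For instance, assuming the busaddr/pages union suggested in the quoted
message above, such a helper might look like this (a sketch only, with the
field names assumed):

/* Sketch: fetch the client PA for a sync list entry without
 * pmap_kextract(), assuming busaddr/pages live in a union selected
 * by vaddr as suggested above. */
static inline vm_paddr_t
sl_paddr(const struct sync_list *sl)
{

	return (sl->vaddr != 0 ? sl->busaddr :
	    VM_PAGE_TO_PHYS(sl->pages) + sl->dataoffs);
}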


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-04 Thread Jason Harmening
On Wed, Nov 4, 2015 at 2:17 PM, Svatopluk Kraus  wrote:

> On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
>  wrote:
> >
> > On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore  wrote:
> >>
> >>
> >> It's almost certainly not related to sysinit ordering.  This exception
> >> is happening during mmc probing after interrupts are enabled.
> >>
> >> It appears that the problem is the faulting code is running on one of
> >> the very early pre-allocated kernel stacks (perhaps in an interrupt
> >> handler on an idle thread stack), and these stacks are not in memory
> >> represented in the vm page arrays.  The mmc code is using a 64-byte
> >> buffer on the stack and mapping it for DMA.  Normally that causes a
> >> bounce for cacheline alignment, but unluckily in this case that buffer
> >> on the stack just happened to be aligned to a cacheline boundary and a
> >> multiple of the cacheline size, so no bounce.  That causes the new sync
> >> logic that is based on keeping vm_page_t pointers and offsets to get a
> >> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
> >> sync time trying to dereference that.  It used to work because the sync
> >> logic used to use the vaddr, not a page pointer.
> >>
> >> Michal was working on a patch yesterday.
> >>
> >
> > Ah, thanks for pointing that out Ian.  I was left scratching my head
> > (admittedly on the road and w/o easy access to the code) wondering what
> on
> > earth would be trying to do DMA during SI_SUB_CPU.
> >
>
>
> Using of fictitious pages is not so easy here as in case pointed by
> kib@ where they are allocated and freed inside one function. For sync
> list sake, they must be allocated when a buffer is loaded and freed
> when is unloaded.
>
> Michal uses pmap_kextract() in case when KVA of buffer is not zero in
> his proof-of-concept patch:
> https://gist.github.com/strejda/d5ca3ed202427a2e0557
> When KVA of buffer is not zero, temporary mapping is not used at all,
> so vm_page_t array is not needed too.
>
> IMO, buffer's physical address can be saved in sync list to be used
> when its KVA is not zero. Thus pmap_kextract() won't be called in
> dma_dcache_sync().
>

I like the idea of storing off the physical address.  If you want to save
space in the sync list, I think you can place busaddr and pages in a union,
using vaddr == 0 to select which field to use.  Some people frown upon use
of unions, though.
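
Concretely, the suggestion amounts to something like this (a sketch, not the
committed layout):

struct sync_list {
	vm_offset_t	vaddr;		/* kva of client data, 0 if unmapped */
	union {				/* selected by vaddr == 0 */
		bus_addr_t	busaddr;	/* client PA, valid when vaddr != 0 */
		vm_page_t	pages;		/* starting page, valid when vaddr == 0 */
	};
	vm_offset_t	dataoffs;	/* page offset of client data */
	bus_size_t	datacount;	/* client data count */
};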

Any reason the panics on PHYS_TO_VM_PAGE in load_buffer() and load_phys()
shouldn't be KASSERTs instead?


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-04 Thread Svatopluk Kraus
On Tue, Nov 3, 2015 at 8:45 AM, Jason Harmening
 wrote:
>
> On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore  wrote:
>>
>>
>> It's almost certainly not related to sysinit ordering.  This exception
>> is happening during mmc probing after interrupts are enabled.
>>
>> It appears that the problem is the faulting code is running on one of
>> the very early pre-allocated kernel stacks (perhaps in an interrupt
>> handler on an idle thread stack), and these stacks are not in memory
>> represented in the vm page arrays.  The mmc code is using a 64-byte
>> buffer on the stack and mapping it for DMA.  Normally that causes a
>> bounce for cacheline alignment, but unluckily in this case that buffer
>> on the stack just happened to be aligned to a cacheline boundary and a
>> multiple of the cacheline size, so no bounce.  That causes the new sync
>> logic that is based on keeping vm_page_t pointers and offsets to get a
>> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
>> sync time trying to dereference that.  It used to work because the sync
>> logic used to use the vaddr, not a page pointer.
>>
>> Michal was working on a patch yesterday.
>>
>
> Ah, thanks for pointing that out Ian.  I was left scratching my head
> (admittedly on the road and w/o easy access to the code) wondering what on
> earth would be trying to do DMA during SI_SUB_CPU.
>


Using fictitious pages is not as easy here as in the case kib@ points
out, where they are allocated and freed inside one function. For the
sync list's sake, they would have to be allocated when a buffer is
loaded and freed when it is unloaded.

Michal uses pmap_kextract() when the buffer's KVA is not zero in his
proof-of-concept patch:
https://gist.github.com/strejda/d5ca3ed202427a2e0557
When the buffer's KVA is not zero, no temporary mapping is used at
all, so the vm_page_t array is not needed either.

IMO, the buffer's physical address can be saved in the sync list and
used when its KVA is not zero. That way pmap_kextract() won't be
called in dma_dcache_sync().


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-02 Thread Jason Harmening
On Sun, Nov 1, 2015 at 8:11 AM, Ian Lepore  wrote:

>
> It's almost certainly not related to sysinit ordering.  This exception
> is happening during mmc probing after interrupts are enabled.
>
> It appears that the problem is the faulting code is running on one of
> the very early pre-allocated kernel stacks (perhaps in an interrupt
> handler on an idle thread stack), and these stacks are not in memory
> represented in the vm page arrays.  The mmc code is using a 64-byte
> buffer on the stack and mapping it for DMA.  Normally that causes a
> bounce for cacheline alignment, but unluckily in this case that buffer
> on the stack just happened to be aligned to a cacheline boundary and a
> multiple of the cacheline size, so no bounce.  That causes the new sync
> logic that is based on keeping vm_page_t pointers and offsets to get a
> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
> sync time trying to dereference that.  It used to work because the sync
> logic used to use the vaddr, not a page pointer.
>
> Michal was working on a patch yesterday.
>
>
Ah, thanks for pointing that out Ian.  I was left scratching my head
(admittedly on the road and w/o easy access to the code) wondering what on
earth would be trying to do DMA during SI_SUB_CPU.


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-01 Thread Konstantin Belousov
On Sun, Nov 01, 2015 at 09:11:51AM -0700, Ian Lepore wrote:
> It's almost certainly not related to sysinit ordering.  This exception
> is happening during mmc probing after interrupts are enabled.
> 
> It appears that the problem is the faulting code is running on one of
> the very early pre-allocated kernel stacks (perhaps in an interrupt
> handler on an idle thread stack), and these stacks are not in memory
> represented in the vm page arrays.  The mmc code is using a 64-byte
> buffer on the stack and mapping it for DMA.  Normally that causes a
> bounce for cacheline alignment, but unluckily in this case that buffer
> on the stack just happened to be aligned to a cacheline boundary and a
> multiple of the cacheline size, so no bounce.  That causes the new sync
> logic that is based on keeping vm_page_t pointers and offsets to get a
> NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
> sync time trying to dereference that.  It used to work because the sync
> logic used to use the vaddr, not a page pointer.
> 
> Michal was working on a patch yesterday.

This actually is the same as the issue I had with the DMAR busdma on
x86.  You could see the trick used in the
sys/x86/iommu/busdma_dmar.c:dmar_bus_dmamap_load_buffer(),
the dumping case.  I just allocated fictitious pages to cope with
kernel core dump needing to DMA from memory which is not described
by vm_page_array.
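
The same trick in miniature (simplified; as noted earlier in the thread, the
DMAR code can allocate and free the fake page inside one function, which is
what makes it easy there):

#include <sys/param.h>
#include <vm/vm.h>
#include <vm/vm_page.h>

/*
 * Sketch of the fictitious-page fallback: when a PA has no entry in
 * vm_page_array (e.g. the dump-time buffers mentioned above), fabricate
 * a vm_page_t for the duration of the operation and drop it afterwards.
 */
static vm_page_t
page_for_paddr(vm_paddr_t pa, bool *fictitious)
{
	vm_page_t m;

	m = PHYS_TO_VM_PAGE(pa);
	*fictitious = (m == NULL);
	if (*fictitious)
		m = vm_page_getfake(pa, VM_MEMATTR_DEFAULT);
	return (m);	/* caller calls vm_page_putfake() when *fictitious */
}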


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-01 Thread Ian Lepore
On Sun, 2015-11-01 at 07:01 -0600, Jason Harmening wrote:
> On Sat, Oct 31, 2015 at 4:55 AM, Jason Harmening <
> jason.harmen...@gmail.com>
> wrote:
> 
> > 
> > 
> > On 10/31/15 03:21, Ganbold Tsagaankhuu wrote:
> > > On Fri, Oct 23, 2015 at 12:38 AM, Jason A. Harmening <
> > > j...@freebsd.org>
> > > wrote:
> > > 
> > > > Author: jah
> > > > Date: Thu Oct 22 16:38:01 2015
> > > > New Revision: 289759
> > > > URL: https://svnweb.freebsd.org/changeset/base/289759
> > > > 
> > > > Log:
> > > >   Use pmap_quick* functions in armv6 busdma, for bounce buffers
> > > > and
> > cache
> > > > maintenance.  This makes it safe to sync buffers that have no
> > > > VA mapping
> > > > associated with the busdma map, but may have other mappings,
> > > > possibly on
> > > > different CPUs.  This also makes it safe to sync unmapped
> > > > bounce
> > buffers in
> > > > non-sleepable thread contexts.
> > > > 
> > > >   Similar to r286787 for x86, this treats userspace buffers the
> > > > same as
> > > > unmapped buffers and no longer borrows the UVA for sync
> > > > operations.
> > > > 
> > > >   Submitted by: Svatopluk Kraus 
> > > > (earlier
> > > > revision)
> > > >   Tested by:Svatopluk Kraus
> > > >   Differential Revision:
> > > > https://reviews.freebsd.org/D3869
> > > 
> > > 
> > > 
> > > It seems I can't boot Odroid C1 with this change.
> > > 
> > > http://pastebin.ca/3227678
> > > 
> > > r289758 works for me.
> > > 
> > > Am I missing something?
> > > 
> > > thanks,
> > > 
> > > Ganbold
> > > 
> > > 
> > 
> > Hmmm, the fault address of 0x20 and the fact that this is happening
> > during mi_startup() make me wonder if the per-cpu pageframes used
> > by
> > pmap_quick* haven't been initialized yet.
> > 
> > I wonder if changing the order of qpages_init in
> > sys/arm/arm/pmap-v6[-new].c to something other than SI_ORDER_ANY
> > would
> > help?
> > 
> > It seems like we'd want SI_ORDER_FOURTH or SI_ORDER_MIDDLE, since
> > mp_start() is SI_ORDER_THIRD.
> > 
> > It would be nice to know what's calling bus_dmamap_sync() in this
> > case.
> >  I can't figure that out, but maybe that's because I haven't had
> > coffee
> > yet.
> > 
> > 
> Can you build the kernel with 'options VERBOSE_SYSINIT' ?
> That will add some spew to the boot log, but it should confirm
> whether this
> is a sysinit ordering issue.

It's almost certainly not related to sysinit ordering.  This exception
is happening during mmc probing after interrupts are enabled.

It appears that the problem is the faulting code is running on one of
the very early pre-allocated kernel stacks (perhaps in an interrupt
handler on an idle thread stack), and these stacks are not in memory
represented in the vm page arrays.  The mmc code is using a 64-byte
buffer on the stack and mapping it for DMA.  Normally that causes a
bounce for cacheline alignment, but unluckily in this case that buffer
on the stack just happened to be aligned to a cacheline boundary and a
multiple of the cacheline size, so no bounce.  That causes the new sync
logic that is based on keeping vm_page_t pointers and offsets to get a
NULL pointer back from PHYS_TO_VM_PAGE when mapping, then it dies at
sync time trying to dereference that.  It used to work because the sync
logic used to use the vaddr, not a page pointer.

Michal was working on a patch yesterday.

-- Ian


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-11-01 Thread Jason Harmening
On Sat, Oct 31, 2015 at 4:55 AM, Jason Harmening 
wrote:

>
>
> On 10/31/15 03:21, Ganbold Tsagaankhuu wrote:
> > On Fri, Oct 23, 2015 at 12:38 AM, Jason A. Harmening 
> > wrote:
> >
> >> Author: jah
> >> Date: Thu Oct 22 16:38:01 2015
> >> New Revision: 289759
> >> URL: https://svnweb.freebsd.org/changeset/base/289759
> >>
> >> Log:
> >>   Use pmap_quick* functions in armv6 busdma, for bounce buffers and
> cache
> >> maintenance.  This makes it safe to sync buffers that have no VA mapping
> >> associated with the busdma map, but may have other mappings, possibly on
> >> different CPUs.  This also makes it safe to sync unmapped bounce
> buffers in
> >> non-sleepable thread contexts.
> >>
> >>   Similar to r286787 for x86, this treats userspace buffers the same as
> >> unmapped buffers and no longer borrows the UVA for sync operations.
> >>
> >>   Submitted by: Svatopluk Kraus  (earlier
> >> revision)
> >>   Tested by:Svatopluk Kraus
> >>   Differential Revision:https://reviews.freebsd.org/D3869
> >
> >
> >
> > It seems I can't boot Odroid C1 with this change.
> >
> > http://pastebin.ca/3227678
> >
> > r289758 works for me.
> >
> > Am I missing something?
> >
> > thanks,
> >
> > Ganbold
> >
> >
>
> Hmmm, the fault address of 0x20 and the fact that this is happening
> during mi_startup() make me wonder if the per-cpu pageframes used by
> pmap_quick* haven't been initialized yet.
>
> I wonder if changing the order of qpages_init in
> sys/arm/arm/pmap-v6[-new].c to something other than SI_ORDER_ANY would
> help?
>
> It seems like we'd want SI_ORDER_FOURTH or SI_ORDER_MIDDLE, since
> mp_start() is SI_ORDER_THIRD.
>
> It would be nice to know what's calling bus_dmamap_sync() in this case.
>  I can't figure that out, but maybe that's because I haven't had coffee
> yet.
>
>
Can you build the kernel with 'options VERBOSE_SYSINIT' ?
That will add some spew to the boot log, but it should confirm whether this
is a sysinit ordering issue.


Re: svn commit: r289759 - in head/sys/arm: arm include

2015-10-31 Thread Jason Harmening


On 10/31/15 03:21, Ganbold Tsagaankhuu wrote:
> On Fri, Oct 23, 2015 at 12:38 AM, Jason A. Harmening 
> wrote:
> 
>> Author: jah
>> Date: Thu Oct 22 16:38:01 2015
>> New Revision: 289759
>> URL: https://svnweb.freebsd.org/changeset/base/289759
>>
>> Log:
>>   Use pmap_quick* functions in armv6 busdma, for bounce buffers and cache
>> maintenance.  This makes it safe to sync buffers that have no VA mapping
>> associated with the busdma map, but may have other mappings, possibly on
>> different CPUs.  This also makes it safe to sync unmapped bounce buffers in
>> non-sleepable thread contexts.
>>
>>   Similar to r286787 for x86, this treats userspace buffers the same as
>> unmapped buffers and no longer borrows the UVA for sync operations.
>>
>>   Submitted by: Svatopluk Kraus  (earlier
>> revision)
>>   Tested by:Svatopluk Kraus
>>   Differential Revision:https://reviews.freebsd.org/D3869
> 
> 
> 
> It seems I can't boot Odroid C1 with this change.
> 
> http://pastebin.ca/3227678
> 
> r289758 works for me.
> 
> Am I missing something?
> 
> thanks,
> 
> Ganbold
> 
>

Hmmm, the fault address of 0x20 and the fact that this is happening
during mi_startup() make me wonder if the per-cpu pageframes used by
pmap_quick* haven't been initialized yet.

I wonder if changing the order of qpages_init in
sys/arm/arm/pmap-v6[-new].c to something other than SI_ORDER_ANY would help?

It seems like we'd want SI_ORDER_FOURTH or SI_ORDER_MIDDLE, since
mp_start() is SI_ORDER_THIRD.
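
That reordering would be a one-line change along these lines (the handler
name and subsystem are assumptions based on the message above):

/*
 * Currently (per the message above, in sys/arm/arm/pmap-v6[-new].c):
 *   SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, pmap_init_qpages, NULL);
 * Hypothetical reordering so it runs right after mp_start() at
 * SI_ORDER_THIRD:
 */
SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_FOURTH, pmap_init_qpages, NULL);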

It would be nice to know what's calling bus_dmamap_sync() in this case.
 I can't figure that out, but maybe that's because I haven't had coffee yet.

Unfortunately I'm going to have limited time to help debug this over the
next week, as I'm moving across the country starting (hopefully) today.

--Jason





Re: svn commit: r289759 - in head/sys/arm: arm include

2015-10-31 Thread Ganbold Tsagaankhuu
On Fri, Oct 23, 2015 at 12:38 AM, Jason A. Harmening 
wrote:

> Author: jah
> Date: Thu Oct 22 16:38:01 2015
> New Revision: 289759
> URL: https://svnweb.freebsd.org/changeset/base/289759
>
> Log:
>   Use pmap_quick* functions in armv6 busdma, for bounce buffers and cache
> maintenance.  This makes it safe to sync buffers that have no VA mapping
> associated with the busdma map, but may have other mappings, possibly on
> different CPUs.  This also makes it safe to sync unmapped bounce buffers in
> non-sleepable thread contexts.
>
>   Similar to r286787 for x86, this treats userspace buffers the same as
> unmapped buffers and no longer borrows the UVA for sync operations.
>
>   Submitted by: Svatopluk Kraus  (earlier
> revision)
>   Tested by:Svatopluk Kraus
>   Differential Revision:https://reviews.freebsd.org/D3869



It seems I can't boot Odroid C1 with this change.

http://pastebin.ca/3227678

r289758 works for me.

Am I missing something?

thanks,

Ganbold



>
>
> Modified:
>   head/sys/arm/arm/busdma_machdep-v6.c
>   head/sys/arm/include/cpu-v6.h
>
> Modified: head/sys/arm/arm/busdma_machdep-v6.c
>
> ==
> --- head/sys/arm/arm/busdma_machdep-v6.c   Thu Oct 22 15:42:53 2015 (r289758)
> +++ head/sys/arm/arm/busdma_machdep-v6.c   Thu Oct 22 16:38:01 2015 (r289759)
> @@ -61,7 +61,7 @@ __FBSDID("$FreeBSD$");
>
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>
>  #define MAX_BPAGES 64
> @@ -104,14 +104,16 @@ struct bounce_page {
> vm_offset_t vaddr;  /* kva of bounce buffer */
> bus_addr_t  busaddr;/* Physical address */
> vm_offset_t datavaddr;  /* kva of client data */
> -   bus_addr_t  dataaddr;   /* client physical address */
> +   vm_page_t   datapage;   /* physical page of client data */
> +   vm_offset_t dataoffs;   /* page offset of client data */
> bus_size_t  datacount;  /* client data count */
> STAILQ_ENTRY(bounce_page) links;
>  };
>
>  struct sync_list {
> vm_offset_t vaddr;  /* kva of client data */
> -   bus_addr_t  busaddr;/* client physical address */
> +   vm_page_t   pages;  /* starting page of client data */
> +   vm_offset_t dataoffs;   /* page offset of client data */
> bus_size_t  datacount;  /* client data count */
>  };
>
> @@ -181,7 +183,6 @@ struct bus_dmamap {
> intpagesreserved;
> bus_dma_tag_t  dmat;
> struct memdesc mem;
> -   pmap_t pmap;
> bus_dmamap_callback_t *callback;
> void  *callback_arg;
> int   flags;
> @@ -206,12 +207,14 @@ static bus_addr_t add_bounce_page(bus_dm
>   vm_offset_t vaddr, bus_addr_t addr,
>   bus_size_t size);
>  static void free_bounce_page(bus_dma_tag_t dmat, struct bounce_page
> *bpage);
> -static void _bus_dmamap_count_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
> -void *buf, bus_size_t buflen, int flags);
> +static void _bus_dmamap_count_pages(bus_dma_tag_t dmat, pmap_t pmap,
> +bus_dmamap_t map, void *buf, bus_size_t buflen, int flags);
>  static void _bus_dmamap_count_phys(bus_dma_tag_t dmat, bus_dmamap_t map,
>  vm_paddr_t buf, bus_size_t buflen, int flags);
>  static int _bus_dmamap_reserve_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
>  int flags);
> +static void dma_preread_safe(vm_offset_t va, vm_paddr_t pa, vm_size_t
> size);
> +static void dma_dcache_sync(struct sync_list *sl, bus_dmasync_op_t op);
>
>  static busdma_bufalloc_t coherent_allocator;   /* Cache of coherent
> buffers */
>  static busdma_bufalloc_t standard_allocator;   /* Cache of standard
> buffers */
> @@ -896,7 +899,8 @@ _bus_dmamap_count_phys(bus_dma_tag_t dma
> while (buflen != 0) {
> sgsize = MIN(buflen, dmat->maxsegsz);
> if (must_bounce(dmat, map, curaddr, sgsize) != 0) {
> -   sgsize = MIN(sgsize, PAGE_SIZE);
> +   sgsize = MIN(sgsize,
> +   PAGE_SIZE - (curaddr & PAGE_MASK));
> map->pagesneeded++;
> }
> curaddr += sgsize;
> @@ -907,7 +911,7 @@ _bus_dmamap_count_phys(bus_dma_tag_t dma
>  }
>
>  static void
> -_bus_dmamap_count_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
> +_bus_dmamap_count_pages(bus_dma_tag_t dmat, pmap_t pmap, bus_dmamap_t map,
>  void *buf, bus_size_t buflen, int flags)
>  {
> vm_offset_t vaddr;
> @@ -927,10 +931,10 @@ _bus_dmamap_count_pages(bus_dma_tag_t dm
> vendaddr = (vm_offset_t)buf + buflen;
>
> while (vaddr < vendaddr) {
> -   

svn commit: r289759 - in head/sys/arm: arm include

2015-10-22 Thread Jason A. Harmening
Author: jah
Date: Thu Oct 22 16:38:01 2015
New Revision: 289759
URL: https://svnweb.freebsd.org/changeset/base/289759

Log:
  Use pmap_quick* functions in armv6 busdma, for bounce buffers and cache 
maintenance.  This makes it safe to sync buffers that have no VA mapping 
associated with the busdma map, but may have other mappings, possibly on 
different CPUs.  This also makes it safe to sync unmapped bounce buffers in 
non-sleepable thread contexts.
  
  Similar to r286787 for x86, this treats userspace buffers the same as 
unmapped buffers and no longer borrows the UVA for sync operations.
  
  Submitted by: Svatopluk Kraus  (earlier revision)
  Tested by:    Svatopluk Kraus
  Differential Revision:    https://reviews.freebsd.org/D3869
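
The core pattern behind that log message, reduced to a few lines (a sketch,
not the dma_preread_safe()/dma_dcache_sync() bodies added in the diff below;
dcache_wb_poc() stands in for whichever maintenance op the sync op calls for):

static void
sync_one_unmapped_page(vm_page_t m, vm_offset_t off, vm_size_t size)
{
	vm_offset_t va;

	/* Short-lived per-CPU mapping; works without any KVA for the
	 * buffer and is safe in non-sleepable contexts. */
	va = pmap_quick_enter_page(m);
	dcache_wb_poc(va + off, VM_PAGE_TO_PHYS(m) + off, size);
	pmap_quick_remove_page(va);
}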

Modified:
  head/sys/arm/arm/busdma_machdep-v6.c
  head/sys/arm/include/cpu-v6.h

Modified: head/sys/arm/arm/busdma_machdep-v6.c
==
--- head/sys/arm/arm/busdma_machdep-v6.c    Thu Oct 22 15:42:53 2015 (r289758)
+++ head/sys/arm/arm/busdma_machdep-v6.c    Thu Oct 22 16:38:01 2015 (r289759)
@@ -61,7 +61,7 @@ __FBSDID("$FreeBSD$");
 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #define MAX_BPAGES 64
@@ -104,14 +104,16 @@ struct bounce_page {
vm_offset_t vaddr;  /* kva of bounce buffer */
bus_addr_t  busaddr;/* Physical address */
vm_offset_t datavaddr;  /* kva of client data */
-   bus_addr_t  dataaddr;   /* client physical address */
+   vm_page_t   datapage;   /* physical page of client data */
+   vm_offset_t dataoffs;   /* page offset of client data */
bus_size_t  datacount;  /* client data count */
STAILQ_ENTRY(bounce_page) links;
 };
 
 struct sync_list {
vm_offset_t vaddr;  /* kva of client data */
-   bus_addr_t  busaddr;/* client physical address */
+   vm_page_t   pages;  /* starting page of client data */
+   vm_offset_t dataoffs;   /* page offset of client data */
bus_size_t  datacount;  /* client data count */
 };
 
@@ -181,7 +183,6 @@ struct bus_dmamap {
intpagesreserved;
bus_dma_tag_t  dmat;
struct memdesc mem;
-   pmap_t pmap;
bus_dmamap_callback_t *callback;
void  *callback_arg;
int   flags;
@@ -206,12 +207,14 @@ static bus_addr_t add_bounce_page(bus_dm
  vm_offset_t vaddr, bus_addr_t addr,
  bus_size_t size);
 static void free_bounce_page(bus_dma_tag_t dmat, struct bounce_page *bpage);
-static void _bus_dmamap_count_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
-void *buf, bus_size_t buflen, int flags);
+static void _bus_dmamap_count_pages(bus_dma_tag_t dmat, pmap_t pmap,
+bus_dmamap_t map, void *buf, bus_size_t buflen, int flags);
 static void _bus_dmamap_count_phys(bus_dma_tag_t dmat, bus_dmamap_t map,
 vm_paddr_t buf, bus_size_t buflen, int flags);
 static int _bus_dmamap_reserve_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
 int flags);
+static void dma_preread_safe(vm_offset_t va, vm_paddr_t pa, vm_size_t size);
+static void dma_dcache_sync(struct sync_list *sl, bus_dmasync_op_t op);
 
 static busdma_bufalloc_t coherent_allocator;   /* Cache of coherent buffers */
 static busdma_bufalloc_t standard_allocator;   /* Cache of standard buffers */
@@ -896,7 +899,8 @@ _bus_dmamap_count_phys(bus_dma_tag_t dma
while (buflen != 0) {
sgsize = MIN(buflen, dmat->maxsegsz);
if (must_bounce(dmat, map, curaddr, sgsize) != 0) {
-   sgsize = MIN(sgsize, PAGE_SIZE);
+   sgsize = MIN(sgsize,
+   PAGE_SIZE - (curaddr & PAGE_MASK));
map->pagesneeded++;
}
curaddr += sgsize;
@@ -907,7 +911,7 @@ _bus_dmamap_count_phys(bus_dma_tag_t dma
 }
 
 static void
-_bus_dmamap_count_pages(bus_dma_tag_t dmat, bus_dmamap_t map,
+_bus_dmamap_count_pages(bus_dma_tag_t dmat, pmap_t pmap, bus_dmamap_t map,
 void *buf, bus_size_t buflen, int flags)
 {
vm_offset_t vaddr;
@@ -927,10 +931,10 @@ _bus_dmamap_count_pages(bus_dma_tag_t dm
vendaddr = (vm_offset_t)buf + buflen;
 
while (vaddr < vendaddr) {
-   if (__predict_true(map->pmap == kernel_pmap))
+   if (__predict_true(pmap == kernel_pmap))
paddr = pmap_kextract(vaddr);
else
-   paddr = pmap_extract(map->pmap, vaddr);
+   paddr = pmap_extract(pmap, vaddr);
if (must_bounce(dmat,