Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Wed, Nov 28, 2018 at 11:03:37PM -0800, Liam Mark wrote: > On Wed, 28 Nov 2018, Brian Starkey wrote: > > On Tue, Nov 27, 2018 at 10:46:07PM -0800, Liam Mark wrote: > > > On Tue, 27 Nov 2018, Brian Starkey wrote: > > > > On Mon, Nov 26, 2018 at 08:59:44PM -0800, Liam Mark wrote: > > > > > On Tue, 20 Nov 2018, Brian Starkey wrote: [snip] > > > > > > > > > > Sounds like you are suggesting using carveouts to support uncached? > > > > > > > No, I'm just saying that ion can't give out uncached _CPU_ mappings > > for pages which are already mapped on the CPU as cached. Probably this should have been: s/can't/shouldn't/ > > > > Okay then I guess I am not clear on where you would get this memory > which doesn't have a cached kernel mapping. > It sounded like you wanted to define sections of memory in the DT as not > mapped in the kernel and then hand this memory to > dma_declare_coherent_memory (so that it can be managed) and then use an > ION heap as the allocator. If the memory was defined this way it sounded > a lot like a carveout. But I guess you have some thoughts on how this > memory which doesn't have a kernel mapping can be made available for general > use (for example available in buddy)? > > Perhaps you were thinking of dynamically removing the kernel mappings > before handing it out as uncached, but this would have a general system > performance impact as this memory could come from anywhere so we would > quickly lose our 1GB block mappings (and likely many of our 2MB block > mappings as well). > All I'm saying, with respect to non-cached memory and mappings, is this: I don't think ion should create non-cached CPU mappings of memory which is mapped in the kernel as cached. By extension, that means that in my opinion, the only way userspace should be able to get a non-cached mapping, is by allocating from a carveout. 
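Brian's rule above (non-cached CPU mappings only for memory that has no cached kernel alias, i.e. only from a carveout) reduces to a one-line policy check. The following is a minimal userspace C model of that rule; the enum values and function name are invented for illustration and are not ion's actual API:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of where a heap's memory comes from.
 * Only memory with no cached kernel alias (e.g. a /reserved-memory
 * no-map carveout, or a dma_declare_coherent_memory() region) can
 * safely be handed to userspace with non-cached attributes. */
enum heap_kernel_mapping {
	HEAP_KERNEL_CACHED,	/* system heap: pages live in the cached linear map */
	HEAP_NO_KERNEL_MAP,	/* no-map carveout: no cached alias exists */
};

/* The policy under discussion: refuse an uncached userspace mapping
 * whenever a cached kernel alias of the same pages exists. */
static bool allow_uncached_mmap(enum heap_kernel_mapping m)
{
	return m == HEAP_NO_KERNEL_MAP;
}
```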
However, I don't think this should be what we do in our complicated media-heavy systems - carveouts are clearly impractical, as is removing memory from the kernel map. What I think we should do is always do CPU access via cached mappings for memory which is mapped in the kernel as cached. [snip] > > > > I am not sure I am properly understanding as this is what my V2 patch > does, then when it gets an opportunity it allows the memory to be > re-mapped as uncached. It's the remapping as uncached part which I'm not so keen on. It just seems rather fragile to have mappings for the same memory with different attributes around. > > Or are you perhaps suggesting that if the memory is allocated from a > cached region then it always remains as cached, so only provide uncached > if it was allocated from an uncached region? If so I view all memory > available to the ION system heap for uncached allocations as having a > cached mapping (since it is all part of the kernel logical mappings), so I > can't see how this would ever be able to support uncached allocations. Yeah, that's exactly what I'm saying. The system heap should not allow uncached allocations, and memory allocated from the system heap should always be mapped as cached for CPU accesses. Non-cached allocations would only be allowed from carveouts (but as discussed, I don't think carveouts are a practical solution for the general case). The summary of my proposal is that instead of focussing on getting non-cached allocations, we should make cached allocations work better, so that non-cached aliases of cached memory aren't required. [snip] > > Unfortunately if we are only using cached mappings it isn't only the first > time you dma map the buffer you need to do cache maintenance, you need to > almost always do it because you don't know what CPU access happened (or > will happen) without a device.
I think you can always know if the CPU _has_ accessed the buffer - in begin_cpu_access, ion can set a flag, which it checks in map_dma_buf. If that flag says it's been touched, then a cache clean is needed. Of course you can't predict the future - there's no way to know if the CPU _will_ access the buffer - which I think is what you're getting at below. > Explained more below. > > > > But with this cached memory you get poor performance because you are > > > frequently doing cache maintenance unnecessarily because there *could* be > > > CPU access. > > > > > > The reason we want to use uncached allocations, with uncached mappings, > > > is > > > to avoid all this unnecessary cache maintenance. > > > > > > > OK I think this is the key - you don't actually care whether the > > mappings are non-cached, you just don't want to pay a sync penalty if > > the CPU never touched the buffer. > > > > In that case, then to me the right thing to do is make ion use > > dma_map_sg_attrs(..., DMA_ATTR_SKIP_CPU_SYNC) in ion_map_dma_buf(), if > > it knows that the CPU hasn't touched the buffer (which it does - from > > {begin,end}_cpu_access). > > > > Unfortunately that isn't the case we are trying to optimize for
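The flag scheme Brian describes - mark the buffer CPU-touched in begin_cpu_access, and only pay for a cache clean in map_dma_buf when that mark is set - can be sketched as a small userspace C model. The struct and function names here are illustrative, not ion's real implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the proposed optimization: begin_cpu_access marks the
 * buffer as CPU-touched; map_dma_buf performs cache maintenance only
 * when that mark is set, and otherwise maps with the equivalent of
 * DMA_ATTR_SKIP_CPU_SYNC. */
struct tracked_buffer {
	bool cpu_touched;	/* set on begin_cpu_access */
	int  cache_cleans;	/* counts cache maintenance operations */
};

static void track_begin_cpu_access(struct tracked_buffer *buf)
{
	buf->cpu_touched = true;
}

static void track_map_dma_buf(struct tracked_buffer *buf)
{
	if (buf->cpu_touched) {
		buf->cache_cleans++;		/* would be a real cache clean */
		buf->cpu_touched = false;	/* clean until the CPU touches it again */
	}
	/* else: map without maintenance (DMA_ATTR_SKIP_CPU_SYNC) */
}
```

With this bookkeeping, repeated dma mappings of a buffer the CPU never touched cost no cache maintenance at all, which is the case the discussion is trying to optimize.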
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Wed, 28 Nov 2018, Brian Starkey wrote: > Hi Liam, > > On Tue, Nov 27, 2018 at 10:46:07PM -0800, Liam Mark wrote: > > On Tue, 27 Nov 2018, Brian Starkey wrote: > > > > > Hi Liam, > > > > > > On Mon, Nov 26, 2018 at 08:59:44PM -0800, Liam Mark wrote: > > > > On Tue, 20 Nov 2018, Brian Starkey wrote: > > > > > > > > > Hi Liam, > > > > > > > > > > I'm missing a bit of context here, but I did read the v1 thread. > > > > > Please accept my apologies if I'm re-treading trodden ground. > > > > > > > > > > I do know we're chasing nebulous ion "problems" on our end, which > > > > > certainly seem to be related to what you're trying to fix here. > > > > > > > > > > On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: > > > > > >Based on the suggestions from Laura I created a first draft for a > > > > > >change > > > > > >which will attempt to ensure that uncached mappings are only applied > > > > > >to > > > > > >ION memory who's cache lines have been cleaned. > > > > > >It does this by providing cached mappings (for uncached ION > > > > > >allocations) > > > > > >until the ION buffer is dma mapped and successfully cleaned, then it > > > > > drops > > > > > >the userspace mappings and when pages are accessed they are faulted > > > > > >back > > > > > >in and uncached mappings are created. > > > > > > > > > > If I understand right, there's no way to portably clean the cache of > > > > > the kernel mapping before we map the pages into userspace. Is that > > > > > right? > > > > > > > > > > > > > Yes, it isn't always possible to clean the caches for an uncached > > > > mapping > > > > because a device is required by the DMA APIs to do cache maintenance > > > > and > > > > there isn't necessarily a device available (dma_buf_attach may not yet > > > > have been called). > > > > > > > > > Alternatively, can we just make ion refuse to give userspace a > > > > > non-cached mapping for pages which are mapped in the kernel as cached? 
> > > > > > > > These pages will all be mapped as cached in the kernel for 64 bit > > > > (kernel > > > > logical addresses) so you would always be refusing to create a > > > > non-cached mapping. > > > > > > And that might be the sane thing to do, no? > > > > > > AFAIK there are still pages which aren't ever mapped as cached (e.g. > > > dma_declare_coherent_memory(), anything under /reserved-memory marked > > > as no-map). If those are exposed as an ion heap, then non-cached > > > mappings would be fine, and permitted. > > > > > > > Sounds like you are suggesting using carveouts to support uncached? > > > > No, I'm just saying that ion can't give out uncached _CPU_ mappings > for pages which are already mapped on the CPU as cached. > Okay then I guess I am not clear on where you would get this memory which doesn't have a cached kernel mapping. It sounded like you wanted to define sections of memory in the DT as not mapped in the kernel and then hand this memory to dma_declare_coherent_memory (so that it can be managed) and then use an ION heap as the allocator. If the memory was defined this way it sounded a lot like a carveout. But I guess you have some thoughts on how this memory which doesn't have a kernel mapping can be made available for general use (for example available in buddy)? Perhaps you were thinking of dynamically removing the kernel mappings before handing it out as uncached, but this would have a general system performance impact as this memory could come from anywhere so we would quickly lose our 1GB block mappings (and likely many of our 2MB block mappings as well). > > We have many multimedia use cases which use very large amounts of uncached > > memory, uncached memory is used as a performance optimization because CPU > > access won't happen so it allows us to skip cache maintenance for all the > > dma map and dma unmap calls. 
To create carveouts large enough to support > > to support the worst case scenarios could result in very large carveouts. > > > > Large carveouts like this would likely result in poor memory utilizations > > (since they are tuned for worst case) which would likely have significant > > performance impacts (more limited memory causes more frequent memory > > reclaim ect...). > > > > Also estimating for worst case could be difficult since the amount of > > uncached memory could be app dependent. > > Unfortunately I don't think this would make for a very scalable solution. > > > > Sure, I understand the desire not to use carveouts. I'm not suggesting > carveouts are a viable alternative. > > > > > > > > > > Would userspace using the dma-buf sync ioctl around its accesses do > > > > > the "right thing" in that case? > > > > > > > > > > > > > I don't think so, the dma-buf sync ioctl require a device to peform > > > > cache > > > > maintenance, but as mentioned above a device may not be available. > > > > > > > > > > If a device didn't attach yet, then no cache maintenance is > > > necessary. T
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
Hi Liam, On Tue, Nov 27, 2018 at 10:46:07PM -0800, Liam Mark wrote: > On Tue, 27 Nov 2018, Brian Starkey wrote: > > > Hi Liam, > > > > On Mon, Nov 26, 2018 at 08:59:44PM -0800, Liam Mark wrote: > > > On Tue, 20 Nov 2018, Brian Starkey wrote: > > > > > > > Hi Liam, > > > > > > > > I'm missing a bit of context here, but I did read the v1 thread. > > > > Please accept my apologies if I'm re-treading trodden ground. > > > > > > > > I do know we're chasing nebulous ion "problems" on our end, which > > > > certainly seem to be related to what you're trying to fix here. > > > > > > > > On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: > > > > >Based on the suggestions from Laura I created a first draft for a > > > > >change > > > > >which will attempt to ensure that uncached mappings are only applied to > > > > >ION memory who's cache lines have been cleaned. > > > > >It does this by providing cached mappings (for uncached ION > > > > >allocations) > > > > >until the ION buffer is dma mapped and successfully cleaned, then it > > > > drops > > > > >the userspace mappings and when pages are accessed they are faulted > > > > >back > > > > >in and uncached mappings are created. > > > > > > > > If I understand right, there's no way to portably clean the cache of > > > > the kernel mapping before we map the pages into userspace. Is that > > > > right? > > > > > > > > > > Yes, it isn't always possible to clean the caches for an uncached mapping > > > because a device is required by the DMA APIs to do cache maintenance and > > > there isn't necessarily a device available (dma_buf_attach may not yet > > > have been called). > > > > > > > Alternatively, can we just make ion refuse to give userspace a > > > > non-cached mapping for pages which are mapped in the kernel as cached? > > > > > > These pages will all be mapped as cached in the kernel for 64 bit (kernel > > > logical addresses) so you would always be refusing to create a non-cached > > > mapping. 
> > > > And that might be the sane thing to do, no? > > > > AFAIK there are still pages which aren't ever mapped as cached (e.g. > > dma_declare_coherent_memory(), anything under /reserved-memory marked > > as no-map). If those are exposed as an ion heap, then non-cached > > mappings would be fine, and permitted. > > > > Sounds like you are suggesting using carveouts to support uncached? > No, I'm just saying that ion can't give out uncached _CPU_ mappings for pages which are already mapped on the CPU as cached. > We have many multimedia use cases which use very large amounts of uncached > memory, uncached memory is used as a performance optimization because CPU > access won't happen so it allows us to skip cache maintenance for all the > dma map and dma unmap calls. To create carveouts large enough to support > to support the worst case scenarios could result in very large carveouts. > > Large carveouts like this would likely result in poor memory utilizations > (since they are tuned for worst case) which would likely have significant > performance impacts (more limited memory causes more frequent memory > reclaim ect...). > > Also estimating for worst case could be difficult since the amount of > uncached memory could be app dependent. > Unfortunately I don't think this would make for a very scalable solution. > Sure, I understand the desire not to use carveouts. I'm not suggesting carveouts are a viable alternative. > > > > > > > Would userspace using the dma-buf sync ioctl around its accesses do > > > > the "right thing" in that case? > > > > > > > > > > I don't think so, the dma-buf sync ioctl require a device to peform cache > > > maintenance, but as mentioned above a device may not be available. > > > > > > > If a device didn't attach yet, then no cache maintenance is > > necessary. The only thing accessing the memory is the CPU, via a > > cached mapping, which should work just fine. So far so good. > > > > Unfortunately not. 
> Scenario: > - Client allocates uncached memory. > - Client calls the DMA_BUF_IOCTL_SYNC IOCTL with flags > DMA_BUF_SYNC_START (but this doesn't do any cache maintenance since there > isn't any device) > - Client mmaps the memory (ION creates uncached mapping) > - Client reads from that uncached mapping I think I maybe wasn't clear with my proposal. The sequence should be like this: - Client allocates memory - If this is from a region which the CPU has mapped as cached, then that's not "uncached" memory - it's cached memory - and you have to treat it as such. - Client calls the DMA_BUF_IOCTL_SYNC IOCTL with flags DMA_BUF_SYNC_START (but this doesn't do any cache maintenance since there isn't any device) - Client mmaps the memory - ion creates a _cached_ mapping into the userspace process. ion *must not* create an uncached mapping. - Client reads from that cached mapping - It sees zeroes, as expected. This proposal ensures that everyone will *always* see
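From userspace, the DMA_BUF_IOCTL_SYNC bracketing in the sequence above looks roughly like this. This is a sketch assuming the dma-buf UAPI header (linux/dma-buf.h) is available; buf_fd is assumed to be a dma-buf file descriptor obtained elsewhere (e.g. from an ion allocation):

```c
#include <linux/dma-buf.h>
#include <sys/ioctl.h>

/* Bracket CPU access to a dma-buf with the sync ioctl. Returns the
 * ioctl result (0 on success, -1 on error). */
static int cpu_access_begin(int buf_fd)
{
	struct dma_buf_sync sync = {
		.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW,
	};
	return ioctl(buf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}

static int cpu_access_end(int buf_fd)
{
	struct dma_buf_sync sync = {
		.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW,
	};
	return ioctl(buf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

A client would call cpu_access_begin() before touching its mmap of the buffer and cpu_access_end() afterwards; whether any cache maintenance actually happens underneath is the kernel's decision, which is the point of the proposal.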
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Tue, 27 Nov 2018, Brian Starkey wrote: > Hi Liam, > > On Mon, Nov 26, 2018 at 08:59:44PM -0800, Liam Mark wrote: > > On Tue, 20 Nov 2018, Brian Starkey wrote: > > > > > Hi Liam, > > > > > > I'm missing a bit of context here, but I did read the v1 thread. > > > Please accept my apologies if I'm re-treading trodden ground. > > > > > > I do know we're chasing nebulous ion "problems" on our end, which > > > certainly seem to be related to what you're trying to fix here. > > > > > > On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: > > > >Based on the suggestions from Laura I created a first draft for a change > > > >which will attempt to ensure that uncached mappings are only applied to > > > >ION memory who's cache lines have been cleaned. > > > >It does this by providing cached mappings (for uncached ION allocations) > > > >until the ION buffer is dma mapped and successfully cleaned, then it > > > drops > > > >the userspace mappings and when pages are accessed they are faulted back > > > >in and uncached mappings are created. > > > > > > If I understand right, there's no way to portably clean the cache of > > > the kernel mapping before we map the pages into userspace. Is that > > > right? > > > > > > > Yes, it isn't always possible to clean the caches for an uncached mapping > > because a device is required by the DMA APIs to do cache maintenance and > > there isn't necessarily a device available (dma_buf_attach may not yet > > have been called). > > > > > Alternatively, can we just make ion refuse to give userspace a > > > non-cached mapping for pages which are mapped in the kernel as cached? > > > > These pages will all be mapped as cached in the kernel for 64 bit (kernel > > logical addresses) so you would always be refusing to create a non-cached > > mapping. > > And that might be the sane thing to do, no? > > AFAIK there are still pages which aren't ever mapped as cached (e.g. 
> dma_declare_coherent_memory(), anything under /reserved-memory marked > as no-map). If those are exposed as an ion heap, then non-cached > mappings would be fine, and permitted. > Sounds like you are suggesting using carveouts to support uncached? We have many multimedia use cases which use very large amounts of uncached memory; uncached memory is used as a performance optimization because CPU access won't happen so it allows us to skip cache maintenance for all the dma map and dma unmap calls. To create carveouts large enough to support the worst case scenarios could result in very large carveouts. Large carveouts like this would likely result in poor memory utilization (since they are tuned for worst case) which would likely have significant performance impacts (more limited memory causes more frequent memory reclaim etc.). Also estimating for worst case could be difficult since the amount of uncached memory could be app dependent. Unfortunately I don't think this would make for a very scalable solution. > > > > > Would userspace using the dma-buf sync ioctl around its accesses do > > > the "right thing" in that case? > > > > > > > I don't think so, the dma-buf sync ioctl requires a device to perform cache > > maintenance, but as mentioned above a device may not be available. > > > > If a device didn't attach yet, then no cache maintenance is > necessary. The only thing accessing the memory is the CPU, via a > cached mapping, which should work just fine. So far so good. > Unfortunately not. Scenario: - Client allocates uncached memory. 
- Client calls the DMA_BUF_IOCTL_SYNC IOCTL with flags DMA_BUF_SYNC_START (but this doesn't do any cache maintenance since there isn't any device) - Client mmaps the memory (ION creates uncached mapping) - Client reads from that uncached mapping Because memory has not been cleaned (we haven't had a device yet) the zeros that were written to this memory could still be in the cache (since they were written with a cached mapping); this means that the unprivileged userspace client is now potentially reading sensitive kernel data. > If there are already attachments, then ion_dma_buf_begin_cpu_access() > will sync for CPU access against all of the attached devices, and > again the CPU should see the right thing. > > In the other direction, ion_dma_buf_end_cpu_access() will sync for > device access for all currently attached devices. If there's no > attached devices yet, then there's nothing to do until there is (only > thing accessing is CPU via a CPU-cached mapping). > > When the first (or another) device attaches, then when it maps the > buffer, the map_dma_buf callback should do whatever sync-ing is needed > for that device. > > I might be way off with my understanding of the various DMA APIs, but > this is how I think they're meant to work. > > > > Given that as you pointed out, the kernel does still have a cached > > > mapping to these pages, trying to give the CPU a non-cached mapping of > > > those same pages while preserving
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
Hi Liam, On Mon, Nov 26, 2018 at 08:59:44PM -0800, Liam Mark wrote: > On Tue, 20 Nov 2018, Brian Starkey wrote: > > > Hi Liam, > > > > I'm missing a bit of context here, but I did read the v1 thread. > > Please accept my apologies if I'm re-treading trodden ground. > > > > I do know we're chasing nebulous ion "problems" on our end, which > > certainly seem to be related to what you're trying to fix here. > > > > On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: > > >Based on the suggestions from Laura I created a first draft for a change > > >which will attempt to ensure that uncached mappings are only applied to > > >ION memory who's cache lines have been cleaned. > > >It does this by providing cached mappings (for uncached ION allocations) > > >until the ION buffer is dma mapped and successfully cleaned, then it > > drops > > >the userspace mappings and when pages are accessed they are faulted back > > >in and uncached mappings are created. > > > > If I understand right, there's no way to portably clean the cache of > > the kernel mapping before we map the pages into userspace. Is that > > right? > > > > Yes, it isn't always possible to clean the caches for an uncached mapping > because a device is required by the DMA APIs to do cache maintenance and > there isn't necessarily a device available (dma_buf_attach may not yet > have been called). > > > Alternatively, can we just make ion refuse to give userspace a > > non-cached mapping for pages which are mapped in the kernel as cached? > > These pages will all be mapped as cached in the kernel for 64 bit (kernel > logical addresses) so you would always be refusing to create a non-cached > mapping. And that might be the sane thing to do, no? AFAIK there are still pages which aren't ever mapped as cached (e.g. dma_declare_coherent_memory(), anything under /reserved-memory marked as no-map). If those are exposed as an ion heap, then non-cached mappings would be fine, and permitted. 
> > > Would userspace using the dma-buf sync ioctl around its accesses do > > the "right thing" in that case? > > > > I don't think so, the dma-buf sync ioctl require a device to peform cache > maintenance, but as mentioned above a device may not be available. > If a device didn't attach yet, then no cache maintenance is necessary. The only thing accessing the memory is the CPU, via a cached mapping, which should work just fine. So far so good. If there are already attachments, then ion_dma_buf_begin_cpu_access() will sync for CPU access against all of the attached devices, and again the CPU should see the right thing. In the other direction, ion_dma_buf_end_cpu_access() will sync for device access for all currently attached devices. If there's no attached devices yet, then there's nothing to do until there is (only thing accessing is CPU via a CPU-cached mapping). When the first (or another) device attaches, then when it maps the buffer, the map_dma_buf callback should do whatever sync-ing is needed for that device. I might be way off with my understanding of the various DMA APIs, but this is how I think they're meant to work. > > Given that as you pointed out, the kernel does still have a cached > > mapping to these pages, trying to give the CPU a non-cached mapping of > > those same pages while preserving consistency seems fraught. Wouldn't > > it be better to make sure all CPU mappings are cached, and have CPU > > clients use the dma_buf_{begin,end}_cpu_access() hooks to get > > consistency where needed? > > > > It is fraught, but unfortunately you can't rely on > dma_buf_{begin,end}_cpu_access() to do cache maintenance as these calls > require a device, and a device is not always available. As above, if there's really no device, then no syncing is needed because only the CPU is accessing the buffer, and only ever via cached mappings. 
> > > > > > >This change has the following potential disadvantages: > > >- It assumes that userpace clients won't attempt to access the buffer > > >while it is being mapped as we are removing the userpspace mappings at > > >this point (though it is okay for them to have it mapped) > > >- It assumes that kernel clients won't hold a kernel mapping to the > > buffer > > >(ie dma_buf_kmap) while it is being dma-mapped. What should we do if > > there > > >is a kernel mapping at the time of dma mapping, fail the mapping, warn? > > >- There may be a performance penalty as a result of having to fault in > > the > > >pages after removing the userspace mappings. > > > > I wonder if the dma-buf sync ioctl might provide a way for userspace > > to opt-in to when the zap/fault happens. Zap on (DMA_BUF_SYNC_WRITE | > > DMA_BUF_SYNC_WRITE_END) and fault on (DMA_BUF_SYNC_READ | > > DMA_BUF_SYNC_START) > > > > Not sure I understand, can you elaborate. > Are you also adding a requirment that ION pages can't be mmaped during a > call to dma_buf_map_attachment? I was only suggesting that zapping the mappings "at random
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Tue, 20 Nov 2018, Brian Starkey wrote: > Hi Liam, > > I'm missing a bit of context here, but I did read the v1 thread. > Please accept my apologies if I'm re-treading trodden ground. > > I do know we're chasing nebulous ion "problems" on our end, which > certainly seem to be related to what you're trying to fix here. > > On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: > >Based on the suggestions from Laura I created a first draft for a change > >which will attempt to ensure that uncached mappings are only applied to > >ION memory whose cache lines have been cleaned. > >It does this by providing cached mappings (for uncached ION allocations) > >until the ION buffer is dma mapped and successfully cleaned, then it > drops > >the userspace mappings and when pages are accessed they are faulted back > >in and uncached mappings are created. > > If I understand right, there's no way to portably clean the cache of > the kernel mapping before we map the pages into userspace. Is that > right? > Yes, it isn't always possible to clean the caches for an uncached mapping because a device is required by the DMA APIs to do cache maintenance and there isn't necessarily a device available (dma_buf_attach may not yet have been called). > Alternatively, can we just make ion refuse to give userspace a > non-cached mapping for pages which are mapped in the kernel as cached? These pages will all be mapped as cached in the kernel for 64 bit (kernel logical addresses) so you would always be refusing to create a non-cached mapping. > Would userspace using the dma-buf sync ioctl around its accesses do > the "right thing" in that case? > I don't think so, the dma-buf sync ioctl requires a device to perform cache maintenance, but as mentioned above a device may not be available. > Given that as you pointed out, the kernel does still have a cached > mapping to these pages, trying to give the CPU a non-cached mapping of > those same pages while preserving consistency seems fraught. 
Wouldn't > it be better to make sure all CPU mappings are cached, and have CPU > clients use the dma_buf_{begin,end}_cpu_access() hooks to get > consistency where needed? > It is fraught, but unfortunately you can't rely on dma_buf_{begin,end}_cpu_access() to do cache maintenance as these calls require a device, and a device is not always available. > > > >This change has the following potential disadvantages: > >- It assumes that userspace clients won't attempt to access the buffer > >while it is being mapped as we are removing the userspace mappings at > >this point (though it is okay for them to have it mapped) > >- It assumes that kernel clients won't hold a kernel mapping to the > buffer > >(i.e. dma_buf_kmap) while it is being dma-mapped. What should we do if > there > >is a kernel mapping at the time of dma mapping, fail the mapping, warn? > >- There may be a performance penalty as a result of having to fault in > the > >pages after removing the userspace mappings. > > I wonder if the dma-buf sync ioctl might provide a way for userspace > to opt-in to when the zap/fault happens. Zap on (DMA_BUF_SYNC_WRITE | > DMA_BUF_SYNC_WRITE_END) and fault on (DMA_BUF_SYNC_READ | > DMA_BUF_SYNC_START) > Not sure I understand, can you elaborate? Are you also adding a requirement that ION pages can't be mmaped during a call to dma_buf_map_attachment? > > > >It passes basic testing involving reading, writing and reading from > >uncached system heap allocations before and after dma mapping. > > > >Please let me know if this is heading in the right direction and if there > >are any concerns. 
> > > >Signed-off-by: Liam Mark > >--- > > drivers/staging/android/ion/ion.c | 146 > +- > > drivers/staging/android/ion/ion.h | 9 +++ > > 2 files changed, 152 insertions(+), 3 deletions(-) > > > >diff --git a/drivers/staging/android/ion/ion.c > b/drivers/staging/android/ion/ion.c > >index 99073325b0c0..3dc0f5a265bf 100644 > >--- a/drivers/staging/android/ion/ion.c > >+++ b/drivers/staging/android/ion/ion.c > >@@ -96,6 +96,7 @@ static struct ion_buffer *ion_buffer_create(struct > ion_heap *heap, > > } > > > > INIT_LIST_HEAD(&buffer->attachments); > >+INIT_LIST_HEAD(&buffer->vmas); > > mutex_init(&buffer->lock); > > mutex_lock(&dev->buffer_lock); > > ion_buffer_add(dev, buffer); > >@@ -117,6 +118,7 @@ void ion_buffer_destroy(struct ion_buffer *buffer) > > buffer->heap->ops->unmap_kernel(buffer->heap, buffer); > > } > > buffer->heap->ops->free(buffer); > >+vfree(buffer->pages); > > kfree(buffer); > > } > > > >@@ -245,11 +247,29 @@ static void ion_dma_buf_detatch(struct dma_buf > *dmabuf, > > kfree(a); > > } > > > >+static bool ion_buffer_uncached_clean(struct ion_buffer *buffer) > >+{ > >+return buffer->uncached_clean; > >+} > > nit: The function name sounds like a verb to me - as in "calling this > will clean the buffer". I feel ion_buffer_is_uncached_
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
Hi Liam, I'm missing a bit of context here, but I did read the v1 thread. Please accept my apologies if I'm re-treading trodden ground. I do know we're chasing nebulous ion "problems" on our end, which certainly seem to be related to what you're trying to fix here. On Thu, Nov 01, 2018 at 03:15:06PM -0700, Liam Mark wrote: >Based on the suggestions from Laura I created a first draft for a change >which will attempt to ensure that uncached mappings are only applied to >ION memory whose cache lines have been cleaned. >It does this by providing cached mappings (for uncached ION allocations) >until the ION buffer is dma mapped and successfully cleaned, then it drops >the userspace mappings and when pages are accessed they are faulted back >in and uncached mappings are created. If I understand right, there's no way to portably clean the cache of the kernel mapping before we map the pages into userspace. Is that right? Alternatively, can we just make ion refuse to give userspace a non-cached mapping for pages which are mapped in the kernel as cached? Would userspace using the dma-buf sync ioctl around its accesses do the "right thing" in that case? Given that as you pointed out, the kernel does still have a cached mapping to these pages, trying to give the CPU a non-cached mapping of those same pages while preserving consistency seems fraught. Wouldn't it be better to make sure all CPU mappings are cached, and have CPU clients use the dma_buf_{begin,end}_cpu_access() hooks to get consistency where needed? > >This change has the following potential disadvantages: >- It assumes that userspace clients won't attempt to access the buffer >while it is being mapped as we are removing the userspace mappings at >this point (though it is okay for them to have it mapped) >- It assumes that kernel clients won't hold a kernel mapping to the buffer >(i.e. dma_buf_kmap) while it is being dma-mapped. 
> What should we do if there
>   is a kernel mapping at the time of dma mapping - fail the mapping, warn?
> - There may be a performance penalty as a result of having to fault in the
>   pages after removing the userspace mappings.

I wonder if the dma-buf sync ioctl might provide a way for userspace to opt-in to when the zap/fault happens. Zap on (DMA_BUF_SYNC_WRITE | DMA_BUF_SYNC_WRITE_END) and fault on (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_START).

[snip]

> +static bool ion_buffer_uncached_clean(struct ion_buffer *buffer)
> +{
> +	return buffer->uncached_clean;
> +}

nit: The function name sounds like a verb to me - as in "calling this will clean the buffer". I feel ion_buffer_is_uncached_clean() would read better.
Thanks,
-Brian

[snip]
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Fri, 2 Nov 2018, John Stultz wrote:

> On Thu, Nov 1, 2018 at 3:15 PM, Liam Mark wrote:
> > Based on the suggestions from Laura I created a first draft for a change
> > which will attempt to ensure that uncached mappings are only applied to
> > ION memory whose cache lines have been cleaned.
[snip]
> > Please let me know if this is heading in the right direction and if there
> > are any concerns.
> >
> > Signed-off-by: Liam Mark
>
> Thanks for sending this out! I gave this a whirl on my HiKey960. Seems
> to work ok, but I'm not sure if the board's usage benefits much from
> your changes.

Thanks for testing this. I didn't expect this patch to improve performance, but I was worried it might hurt performance. I don't know how many uncached ION allocations HiKey960 makes, or how it uses uncached allocations.
It is possible that HiKey960 doesn't make much use of uncached buffers, or if it does it may not attempt to mmap them before dma mapping them, so it is possible this change isn't getting exercised very much in the test you ran. I will need to look into how best to exercise this patch on HiKey960.

Liam

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
On Thu, Nov 1, 2018 at 3:15 PM, Liam Mark wrote:
> Based on the suggestions from Laura I created a first draft for a change
> which will attempt to ensure that uncached mappings are only applied to
> ION memory whose cache lines have been cleaned.
[snip]
> Please let me know if this is heading in the right direction and if there
> are any concerns.
>
> Signed-off-by: Liam Mark

Thanks for sending this out! I gave this a whirl on my HiKey960. Seems to work ok, but I'm not sure if the board's usage benefits much from your changes.

First, ignore how crazy overall these frame values are right off; we have some cpuidle/cpufreq issues w/ 4.14 that we're still sorting out.
Without your patch:
default-jankview_list_view,jankbench,1,mean,0,iter_10,List View Fling,48.1333678017,
default-jankview_list_view,jankbench,2,mean,0,iter_10,List View Fling,55.8407417387,
default-jankview_list_view,jankbench,3,mean,0,iter_10,List View Fling,43.88160374,
default-jankview_list_view,jankbench,4,mean,0,iter_10,List View Fling,42.2606222784,
default-jankview_list_view,jankbench,5,mean,0,iter_10,List View Fling,44.1791721797,
default-jankview_list_view,jankbench,6,mean,0,iter_10,List View Fling,39.7692731775,
default-jankview_list_view,jankbench,7,mean,0,iter_10,List View Fling,48.5462154074,
default-jankview_list_view,jankbench,8,mean,0,iter_10,List View Fling,40.1321166548,
default-jankview_list_view,jankbench,9,mean,0,iter_10,List View Fling,48.0163174397,
default-jankview_list_view,jankbench,10,mean,0,iter_10,List View Fling,51.1971686844,

With your patch:
default-jankview_list_view,jankbench,1,mean,0,iter_10,List View Fling,43.3983274772,
default-jankview_list_view,jankbench,2,mean,0,iter_10,List View Fling,45.8456678409,
default-jankview_list_view,jankbench,3,mean,0,iter_10,List View Fling,42.9609507211,
default-jankview_list_view,jankbench,4,mean,0,iter_10,List View Fling,48.602186248,
default-jankview_list_view,jankbench,5,mean,0,iter_10,List View Fling,47.9257658765,
default-jankview_list_view,jankbench,6,mean,0,iter_10,List View Fling,47.7405384035,
default-jankview_list_view,jankbench,7,mean,0,iter_10,List View Fling,52.0017667611,
default-jankview_list_view,jankbench,8,mean,0,iter_10,List View Fling,43.7480812349,
default-jankview_list_view,jankbench,9,mean,0,iter_10,List View Fling,44.8138758796,
default-jankview_list_view,jankbench,10,mean,0,iter_10,List View Fling,46.4941804068,

Just for reference, compared to my earlier patch:
default-jankview_list_view,jankbench,1,mean,0,iter_10,List View Fling,33.8638094852,
default-jankview_list_view,jankbench,2,mean,0,iter_10,List View Fling,34.0859500474,
default-jankview_list_view,jankbench,3,mean,0,iter_10,List View Fling,35.6278973379,
default-jankview_list_view,jankbench,4,mean,0,iter_10,List View Fling,31.4999822195,
default-jankview_list_view,jankbench,5,mean,0,iter_10,List View Fling,40.0634874771,
default-jankview_list_view,jankbench,6,mean,0,iter_10,List View Fling,28.0633472181,
default-jankview_list_view,jankbench,7,mean,0,iter_10,List View Fling,36.0400585616,
default-jankview_list_view,jankbench,8,mean,0,iter_10,List View Fling,38.1871234374,
default-jankview_list_view,jankbench,9,mean,0,iter_10,List View Fling,37.4103602014,
default-jankview_list_view,jankbench,10,mean,0,iter_10,List View Fling,40.7147881231,

Though I'll spend some more time looking at it more closely.

thanks
-john
[RFC PATCH v2] android: ion: How to properly clean caches for uncached allocations
Based on the suggestions from Laura I created a first draft for a change which will attempt to ensure that uncached mappings are only applied to ION memory whose cache lines have been cleaned. It does this by providing cached mappings (for uncached ION allocations) until the ION buffer is dma mapped and successfully cleaned; then it drops the userspace mappings, and when pages are accessed they are faulted back in and uncached mappings are created.

This change has the following potential disadvantages:
- It assumes that userspace clients won't attempt to access the buffer while it is being mapped, as we are removing the userspace mappings at this point (though it is okay for them to have it mapped)
- It assumes that kernel clients won't hold a kernel mapping to the buffer (i.e. dma_buf_kmap) while it is being dma-mapped. What should we do if there is a kernel mapping at the time of dma mapping - fail the mapping, warn?
- There may be a performance penalty as a result of having to fault in the pages after removing the userspace mappings.

It passes basic testing involving reading, writing, and reading from uncached system heap allocations before and after dma mapping.

Please let me know if this is heading in the right direction and if there are any concerns.
Signed-off-by: Liam Mark
---
 drivers/staging/android/ion/ion.c | 146 +-
 drivers/staging/android/ion/ion.h |   9 +++
 2 files changed, 152 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 99073325b0c0..3dc0f5a265bf 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -96,6 +96,7 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap,
 	}
 
 	INIT_LIST_HEAD(&buffer->attachments);
+	INIT_LIST_HEAD(&buffer->vmas);
 	mutex_init(&buffer->lock);
 	mutex_lock(&dev->buffer_lock);
 	ion_buffer_add(dev, buffer);
@@ -117,6 +118,7 @@ void ion_buffer_destroy(struct ion_buffer *buffer)
 		buffer->heap->ops->unmap_kernel(buffer->heap, buffer);
 	}
 	buffer->heap->ops->free(buffer);
+	vfree(buffer->pages);
 	kfree(buffer);
 }
 
@@ -245,11 +247,29 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
 	kfree(a);
 }
 
+static bool ion_buffer_uncached_clean(struct ion_buffer *buffer)
+{
+	return buffer->uncached_clean;
+}
+
+/* expect buffer->lock to be already taken */
+static void ion_buffer_zap_mappings(struct ion_buffer *buffer)
+{
+	struct ion_vma_list *vma_list;
+
+	list_for_each_entry(vma_list, &buffer->vmas, list) {
+		struct vm_area_struct *vma = vma_list->vma;
+
+		zap_page_range(vma, vma->vm_start, vma->vm_end - vma->vm_start);
+	}
+}
+
 static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
 					enum dma_data_direction direction)
 {
 	struct ion_dma_buf_attachment *a = attachment->priv;
 	struct sg_table *table;
+	struct ion_buffer *buffer = attachment->dmabuf->priv;
 
 	table = a->table;
 
@@ -257,6 +277,19 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
 			direction))
 		return ERR_PTR(-ENOMEM);
 
+	if (!ion_buffer_cached(buffer)) {
+		mutex_lock(&buffer->lock);
+		if (!ion_buffer_uncached_clean(buffer)) {
+			ion_buffer_zap_mappings(buffer);
+			if (buffer->kmap_cnt > 0) {
+				pr_warn_once("%s: buffer still mapped in the kernel\n",
+					     __func__);
+			}
+			buffer->uncached_clean = true;
+		}
+		mutex_unlock(&buffer->lock);
+	}
+
 	return table;
 }
 
@@ -267,6 +300,94 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
 	dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
 }
 
+static void __ion_vm_open(struct vm_area_struct *vma, bool lock)
+{
+	struct ion_buffer *buffer = vma->vm_private_data;
+	struct ion_vma_list *vma_list;
+
+	vma_list = kmalloc(sizeof(*vma_list), GFP_KERNEL);
+	if (!vma_list)
+		return;
+	vma_list->vma = vma;
+
+	if (lock)
+		mutex_lock(&buffer->lock);
+	list_add(&vma_list->list, &buffer->vmas);
+	if (lock)
+		mutex_unlock(&buffer->lock);
+}
+
+static void ion_vm_open(struct vm_area_struct *vma)
+{
+	__ion_vm_open(vma, true);
+}
+
+static void ion_vm_close(struct vm_area_struct *vma)
+{
+	struct ion_buffer *buffer = vma->vm_private_data;
+	struct ion_vma_list *vma_list, *tmp;
+
+	mutex_lock(&buffer->lock);
+	list_for_each_entry_safe(vma_list, tmp, &buffer->vmas, list) {
+		if (vma_list->vma != vma)
+