On Fri, Mar 2, 2012 at 6:23 PM, Sakari Ailus <sakari.ailus at iki.fi> wrote:
> Hi Daniel,
>
> Thanks for the patch.
>
> On Thu, Mar 01, 2012 at 04:36:01PM +0100, Daniel Vetter wrote:
>> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>> ---
>> ?Documentation/dma-buf-sharing.txt | ?102 
>> +++++++++++++++++++++++++++++++++++-
>> ?1 files changed, 99 insertions(+), 3 deletions(-)
>>
>> diff --git a/Documentation/dma-buf-sharing.txt 
>> b/Documentation/dma-buf-sharing.txt
>> index 225f96d..f12542b 100644
>> --- a/Documentation/dma-buf-sharing.txt
>> +++ b/Documentation/dma-buf-sharing.txt
>> @@ -32,8 +32,12 @@ The buffer-user
>> ?*IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details]
>> ?For this first version, A buffer shared using the dma_buf sharing API:
>> ?- *may* be exported to user space using "mmap" *ONLY* by exporter, outside 
>> of
>> - ? this framework.
>> -- may be used *ONLY* by importers that do not need CPU access to the buffer.
>> + ?this framework.
>> +- with this new iteration of the dma-buf api cpu access from the kernel has 
>> been
>> + ?enable, see below for the details.
>> +
>> +dma-buf operations for device dma only
>> +--------------------------------------
>>
>> ?The dma_buf buffer sharing API usage contains the following steps:
>>
>> @@ -219,7 +223,99 @@ NOTES:
>> ? ? If the exporter chooses not to allow an attach() operation once a
>> ? ? map_dma_buf() API has been called, it simply returns an error.
>>
>> -Miscellaneous notes:
>> +Kernel cpu access to a dma-buf buffer object
>> +--------------------------------------------
>> +
>> +The motivation to allow cpu access from the kernel to a dma-buf object from 
>> the
>> +importers side are:
>> +- fallback operations, e.g. if the devices is connected to a usb bus and the
>> + ?kernel needs to shuffle the data around first before sending it away.
>> +- full transperancy for existing users on the importer side, i.e. userspace
>> + ?should not notice the difference between a normal object from that 
>> subsystem
>> + ?and an imported one backed by a dma-buf. This is really important for drm
>> + ?opengl drivers that expect to still use all the existing upload/download
>> + ?paths.
>> +
>> +Access to a dma_buf from the kernel context involves three steps:
>> +
>> +1. Prepare access, which invalidate any necessary caches and make the object
>> + ? available for cpu access.
>> +2. Access the object page-by-page with the dma_buf map apis
>> +3. Finish access, which will flush any necessary cpu caches and free 
>> reserved
>> + ? resources.
>
> Where it should be decided which operations are being done to the buffer
> when it is passed to user space and back to kernel space?
>
> How about spliting these operations to those done on the first time the
> buffer is passed to the user space (mapping to kernel address space, for
> example) and those required every time buffer is passed from kernel to user
> and back (cache flusing)?
>
> I'm asking since any unnecessary time-consuming operations, especially as
> heavy as mapping the buffer, should be avoidable in subsystems dealing
> with streaming video, cameras etc., i.e. non-GPU users.


Well, this is really something for the buffer exporter to deal with..
since there is no way for an importer to create a userspace mmap'ing
of the buffer.  A lot of these expensive operations go away if you
don't even create a userspace virtual mapping in the first place ;-)

BR,
-R

>
>> +1. Prepare acces
>> +
>> + ? Before an importer can acces a dma_buf object with the cpu from the 
>> kernel
>> + ? context, it needs to notice the exporter of the access that is about to
>> + ? happen.
>> +
>> + ? Interface:
>> + ? ? ?int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?size_t start, size_t len,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?enum dma_data_direction direction)
>> +
>> + ? This allows the exporter to ensure that the memory is actually available 
>> for
>> + ? cpu access - the exporter might need to allocate or swap-in and pin the
>> + ? backing storage. The exporter also needs to ensure that cpu access is
>> + ? coherent for the given range and access direction. The range and access
>> + ? direction can be used by the exporter to optimize the cache flushing, 
>> i.e.
>> + ? access outside of the range or with a different direction (read instead 
>> of
>> + ? write) might return stale or even bogus data (e.g. when the exporter 
>> needs to
>> + ? copy the data to temporaray storage).
>> +
>> + ? This step might fail, e.g. in oom conditions.
>> +
>> +2. Accessing the buffer
>> +
>> + ? To support dma_buf objects residing in highmem cpu access is page-based 
>> using
>> + ? an api similar to kmap. Accessing a dma_buf is done in aligned chunks of
>> + ? PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which 
>> returns
>> + ? a pointer in kernel virtual address space. Afterwards the chunk needs to 
>> be
>> + ? unmapped again. There is no limit on how often a given chunk can be 
>> mapped
>> + ? and unmmapped, i.e. the importer does not need to call begin_cpu_access 
>> again
>> + ? before mapping the same chunk again.
>> +
>> + ? Interfaces:
>> + ? ? ?void *dma_buf_kmap(struct dma_buf *, unsigned long);
>> + ? ? ?void dma_buf_kunmap(struct dma_buf *, unsigned long, void *);
>> +
>> + ? There are also atomic variants of these interfaces. Like for kmap they
>> + ? facilitate non-blocking fast-paths. Neither the importer nor the 
>> exporter (in
>> + ? the callback) is allowed to block when using these.
>> +
>> + ? Interfaces:
>> + ? ? ?void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long);
>> + ? ? ?void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *);
>> +
>> + ? For importers all the restrictions of using kmap apply, like the limited
>> + ? supply of kmap_atomic slots. Hence an importer shall only hold onto at 
>> most 2
>> + ? atomic dma_buf kmaps at the same time (in any given process context).
>> +
>> + ? dma_buf kmap calls outside of the range specified in begin_cpu_access are
>> + ? undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed 
>> on
>> + ? the partial chunks at the beginning and end but may return stale or bogus
>> + ? data outside of the range (in these partial chunks).
>> +
>> + ? Note that these calls need to always succeed. The exporter needs to 
>> complete
>> + ? any preparations that might fail in begin_cpu_access.
>> +
>> +3. Finish access
>> +
>> + ? When the importer is done accessing the range specified in 
>> begin_cpu_acces,
>> + ? it needs to announce this to the exporter (to facilitate cache flushing 
>> and
>> + ? unpinning of any pinned resources). The result of of any dma_buf kmap 
>> calls
>> + ? after end_cpu_access is undefined.
>> +
>> + ? Interface:
>> + ? ? ?void dma_buf_end_cpu_access(struct dma_buf *dma_buf,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? size_t start, size_t len,
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? enum dma_data_direction dir);
>> +
>> +
>> +Miscellaneous notes
>> +-------------------
>> +
>> ?- Any exporters or users of the dma-buf buffer sharing framework must have
>> ? ?a 'select DMA_SHARED_BUFFER' in their respective Kconfigs.
>
> Kind regards,
>
> --
> Sakari Ailus
> e-mail: sakari.ailus at iki.fi ? ? jabber/XMPP/Gmail: sailus at retiisi.org.uk
>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig at lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig

Reply via email to