Re: [PATCH for v3.18 00/18] Backport CVE-2017-13166 fixes to Kernel 3.18

2018-03-29 Thread Inki Dae


On 2018-03-29 16:00, Greg KH wrote:
> On Thu, Mar 29, 2018 at 03:39:54PM +0900, Inki Dae wrote:
>> On 2018-03-29 13:25, Greg KH wrote:
>>> On Thu, Mar 29, 2018 at 08:22:08AM +0900, Inki Dae wrote:
>>>> Thanks a lot for doing this. :) There are still many users who use
>>>> Linux-3.18 in their products.
>>>
>>> For new products?  They really should not be.  The kernel is officially
>>
>> Not really. Old products are still using the Linux-3.18 kernel without
>> a kernel upgrade. For new products, most SoC vendors, including us,
>> will use Linux-4.x.
>> Actually, we are preparing a kernel upgrade (to Linux-4.14-LTS) for
>> some devices, even some old ones, and it is almost done.
> 
> That is great to hear.
> 
>>> What is keeping you on 3.18.y and not allowing you to move to a newer
>>> kernel version?
>>
>> We also want to move to the latest kernel version. However, there are
>> cases where we cannot upgrade the kernel.
>> When the SoC vendor never shares firmware and the relevant data
>> sheets, we cannot upgrade. Even so, we still have to resolve security
>> issues for the users of such devices.
> 
> It sounds like you need to be getting those security updates from those
> SoC vendors, as they are the ones you are paying for support for that

That's true, but open source developers like me, who use a vendor kernel
without the vendor's support, will never get security updates from the vendor.
So if you merge CVE patches even though this kernel is already EOL, many open
source developers would be glad. :)

Thanks,
Inki Dae

> kernel version that they are forcing you to stay on.
> 
> good luck!
> 
> greg k-h
> 
> 
> 


Re: [PATCH for v3.18 00/18] Backport CVE-2017-13166 fixes to Kernel 3.18

2018-03-28 Thread Inki Dae


On 2018-03-29 13:25, Greg KH wrote:
> On Thu, Mar 29, 2018 at 08:22:08AM +0900, Inki Dae wrote:
>> Thanks a lot for doing this. :) There are still many users who use
>> Linux-3.18 in their products.
> 
> For new products?  They really should not be.  The kernel is officially

Not really. Old products are still using the Linux-3.18 kernel without a kernel
upgrade. For new products, most SoC vendors, including us, will use Linux-4.x.
Actually, we are preparing a kernel upgrade (to Linux-4.14-LTS) for some
devices, even some old ones, and it is almost done.

> end-of-life, but I'm keeping it alive for a short while longer just
> because too many people seem to still be using it.  However, they are
> not actually updating the kernel in their devices, so I don't think I
> will be doing many more new 3.18.y releases.
> 
> It's a problem when people ask for support, and then don't use the
> releases given to them :(
> 
> What is keeping you on 3.18.y and not allowing you to move to a newer
> kernel version?

We also want to move to the latest kernel version. However, there are cases
where we cannot upgrade the kernel.
When the SoC vendor never shares firmware and the relevant data sheets, we
cannot upgrade. Even so, we still have to resolve security issues for the users
of such devices.

Thanks,
Inki Dae

> 
> thanks,
> 
> greg k-h
> 
> 
> 


Re: [PATCH for v3.18 00/18] Backport CVE-2017-13166 fixes to Kernel 3.18

2018-03-28 Thread Inki Dae
Hi Mauro,

On 2018-03-29 03:12, Mauro Carvalho Chehab wrote:
> Hi Greg,
> 
> Those are the backports meant to solve CVE-2017-13166 on Kernel 3.18.
> 
> It contains two v4l2-ctrls fixes that are required to avoid crashes
> at the test application.
> 
> I wrote two patches myself for Kernel 3.18 in order to solve some
> issues specific to Kernel 3.18 which aren't needed upstream.
> One is actually a one-line change backport. The other one makes
> sure that both the 32-bit and 64-bit versions of some ioctl calls
> will return the same value for a reserved field.
> 
> I noticed an extra bug while testing it, but the bug also hits upstream,
> and should be backported all the way down to all stable/LTS versions.
> So, I'll send it the usual way, after merging upstream.

Thanks a lot for doing this. :) There are still many users who use Linux-3.18
in their products.

Thanks,
Inki Dae
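
[Editor's note, for context on the class of bugs these compat-ioctl32 patches
address: on 32-bit userspace a u64 is only 4-byte aligned, so a struct crossing
the compat boundary is laid out differently than a 64-bit kernel expects by
default; that is why fields such as the video standard need compat_u64. A
minimal sketch of the layout divergence, with hypothetical struct names:]

```c
#include <stddef.h>
#include <stdint.h>

/* compat_u64 in the kernel is a u64 whose alignment is forced down to
 * 4 bytes, matching how 32-bit userspace lays out its structs. */
typedef uint64_t u64_native;                             /* 8-byte aligned on 64-bit */
typedef uint64_t __attribute__((aligned(4))) u64_compat; /* mimics compat_u64 */

/* Hypothetical ioctl argument: an index followed by a video standard. */
struct sketch_native {   /* layout a 64-bit kernel sees by default */
	uint32_t index;
	u64_native std;  /* padding inserted: typically offset 8 */
};

struct sketch_compat {   /* layout 32-bit userspace actually passed */
	uint32_t index;
	u64_compat std;  /* no padding: offset 4 */
};

/* On common 64-bit ABIs, offsetof(struct sketch_native, std) == 8 while
 * offsetof(struct sketch_compat, std) == 4 - copying field by field
 * through the compat types is what v4l2-compat-ioctl32.c exists to do. */
```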

> 
> Regards,
> Mauro
> 
> 
> Daniel Mentz (2):
>   media: v4l2-compat-ioctl32: Copy v4l2_window->global_alpha
>   media: v4l2-compat-ioctl32.c: refactor compat ioctl32 logic
> 
> Hans Verkuil (12):
>   media: v4l2-ioctl.c: don't copy back the result for -ENOTTY
>   media: v4l2-compat-ioctl32.c: add missing VIDIOC_PREPARE_BUF
>   media: v4l2-compat-ioctl32.c: fix the indentation
>   media: v4l2-compat-ioctl32.c: move 'helper' functions to
> __get/put_v4l2_format32
>   media: v4l2-compat-ioctl32.c: avoid sizeof(type)
>   media: v4l2-compat-ioctl32.c: copy m.userptr in put_v4l2_plane32
>   media: v4l2-compat-ioctl32.c: fix ctrl_is_pointer
>   media: v4l2-compat-ioctl32.c: make ctrl_is_pointer work for subdevs
>   media: v4l2-compat-ioctl32.c: copy clip list in put_v4l2_window32
>   media: v4l2-compat-ioctl32.c: drop pr_info for unknown buffer type
>   media: v4l2-compat-ioctl32.c: don't copy back the result for certain
> errors
>   media: v4l2-ctrls: fix sparse warning
> 
> Mauro Carvalho Chehab (2):
>   media: v4l2-compat-ioctl32: use compat_u64 for video standard
>   media: v4l2-compat-ioctl32: initialize a reserved field
> 
> Ricardo Ribalda (2):
>   vb2: V4L2_BUF_FLAG_DONE is set after DQBUF
>   media: media/v4l2-ctrls: volatiles should not generate CH_VALUE
> 
>  drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 1020 +++--
>  drivers/media/v4l2-core/v4l2-ctrls.c  |   96 ++-
>  drivers/media/v4l2-core/v4l2-ioctl.c  |5 +-
>  drivers/media/v4l2-core/videobuf2-core.c  |5 +
>  4 files changed, 691 insertions(+), 435 deletions(-)
> 


Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-05-10 Thread Inki Dae


On 2017-05-10 16:55, Daniel Vetter wrote:
> On Wed, May 10, 2017 at 03:27:02PM +0900, Inki Dae wrote:
>> Hi Tomasz,
>>
>> On 2017-05-10 14:38, Tomasz Figa wrote:
>>> Hi Everyone,
>>>
>>> On Wed, May 10, 2017 at 9:24 AM, Inki Dae  wrote:
>>>>
>>>>
>>>> On 2017-04-26 07:21, Sakari Ailus wrote:
>>>>> Hi Marek,
>>>>>
>>>>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>>>>> Hi Laurent,
>>>>>>
>>>>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
>>>>>>> Hi Marek,
>>>>>>>
>>>>>>> (CC'ing Sakari Ailus)
>>>>>>>
>>>>>>> Thank you for the patches.
>>>>>>>
>>>>>>> On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> This is an updated proposal for extending EXYNOS DRM API with generic
>>>>>>>> support for hardware modules, which can be used for processing image 
>>>>>>>> data
>>>>>>>> from one memory buffer to another. Typical memory-to-memory
>>>>>>>> operations
>>>>>>>> are: rotation, scaling, colour space conversion or mix of them. This 
>>>>>>>> is a
>>>>>>>> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
>>>>>>>> processors", which has been rejected as "not really needed in the DRM
>>>>>>>> core":
>>>>>>>> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>>>>>>>>
>>>>>>>> In this proposal I moved all the code to Exynos DRM driver, so now this
>>>>>>>> will be specific only to Exynos DRM. I've also changed the name from
>>>>>>>> framebuffer processor (fbproc) to picture processor (pp) to avoid 
>>>>>>>> confusion
>>>>>>>> with fbdev API.
>>>>>>>>
>>>>>>>> Here is a bit more information about what picture processors are:
>>>>>>>>
>>>>>>>> Embedded SoCs are known to have a number of hardware blocks, which 
>>>>>>>> perform
>>>>>>>> such operations. They can be used in parallel to the main GPU module to
>>>>>>>> offload the CPU from processing graphics or video data. One example use
>>>>>>>> of
>>>>>>>> such modules is implementing video overlay, which usually requires 
>>>>>>>> color
>>>>>>>> space conversion from NV12 (or similar) to RGB32 color space and 
>>>>>>>> scaling to
>>>>>>>> target window size.
>>>>>>>>
>>>>>>>> The proposed API is heavily inspired by atomic KMS approach - it is 
>>>>>>>> also
>>>>>>>> based on DRM objects and their properties. A new DRM object is 
>>>>>>>> introduced:
>>>>>>>> picture processor (called pp for convenience). Such objects have a set 
>>>>>>>> of
>>>>>>>> standard DRM properties, which describes the operation to be performed 
>>>>>>>> by
>>>>>>>> respective hardware module. In typical case those properties are a 
>>>>>>>> source
>>>>>>>> fb id and rectangle (x, y, width, height) and destination fb id and
>>>>>>>> rectangle. Optionally a rotation property can be also specified if
>>>>>>>> supported by the given hardware. To perform an operation on image data,
>>>>>>>> userspace provides a set of properties and their values for given 
>>>>>>>> fbproc
>>>>>>>> object in a similar way as object and properties are provided for
>>>>>>>> performing atomic page flip / mode setting.
>>>>>>>>
>>>>>>>> The proposed API consists of the 3 new ioctls:
>>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>>>>>>>>   processors,
>>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities 

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-05-10 Thread Inki Dae


On 2017-05-10 15:38, Tomasz Figa wrote:
> On Wed, May 10, 2017 at 2:27 PM, Inki Dae  wrote:
>> Hi Tomasz,
>>
>> On 2017-05-10 14:38, Tomasz Figa wrote:
>>> Hi Everyone,
>>>
>>> On Wed, May 10, 2017 at 9:24 AM, Inki Dae  wrote:
>>>>
>>>>
>>>> On 2017-04-26 07:21, Sakari Ailus wrote:
>>>>> Hi Marek,
>>>>>
>>>>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>>>>> Hi Laurent,
>>>>>>
>>>>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
>>>>>>> Hi Marek,
>>>>>>>
>>>>>>> (CC'ing Sakari Ailus)
>>>>>>>
>>>>>>> Thank you for the patches.
>>>>>>>
>>>>>>> On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> This is an updated proposal for extending EXYNOS DRM API with generic
>>>>>>>> support for hardware modules, which can be used for processing image 
>>>>>>>> data
>>>>>>>> from one memory buffer to another. Typical memory-to-memory
>>>>>>>> operations
>>>>>>>> are: rotation, scaling, colour space conversion or mix of them. This 
>>>>>>>> is a
>>>>>>>> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
>>>>>>>> processors", which has been rejected as "not really needed in the DRM
>>>>>>>> core":
>>>>>>>> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>>>>>>>>
>>>>>>>> In this proposal I moved all the code to Exynos DRM driver, so now this
>>>>>>>> will be specific only to Exynos DRM. I've also changed the name from
>>>>>>>> framebuffer processor (fbproc) to picture processor (pp) to avoid 
>>>>>>>> confusion
>>>>>>>> with fbdev API.
>>>>>>>>
>>>>>>>> Here is a bit more information about what picture processors are:
>>>>>>>>
>>>>>>>> Embedded SoCs are known to have a number of hardware blocks, which 
>>>>>>>> perform
>>>>>>>> such operations. They can be used in parallel to the main GPU module to
>>>>>>>> offload the CPU from processing graphics or video data. One example use
>>>>>>>> of
>>>>>>>> such modules is implementing video overlay, which usually requires 
>>>>>>>> color
>>>>>>>> space conversion from NV12 (or similar) to RGB32 color space and 
>>>>>>>> scaling to
>>>>>>>> target window size.
>>>>>>>>
>>>>>>>> The proposed API is heavily inspired by atomic KMS approach - it is 
>>>>>>>> also
>>>>>>>> based on DRM objects and their properties. A new DRM object is 
>>>>>>>> introduced:
>>>>>>>> picture processor (called pp for convenience). Such objects have a set 
>>>>>>>> of
>>>>>>>> standard DRM properties, which describes the operation to be performed 
>>>>>>>> by
>>>>>>>> respective hardware module. In typical case those properties are a 
>>>>>>>> source
>>>>>>>> fb id and rectangle (x, y, width, height) and destination fb id and
>>>>>>>> rectangle. Optionally a rotation property can be also specified if
>>>>>>>> supported by the given hardware. To perform an operation on image data,
>>>>>>>> userspace provides a set of properties and their values for given 
>>>>>>>> fbproc
>>>>>>>> object in a similar way as object and properties are provided for
>>>>>>>> performing atomic page flip / mode setting.
>>>>>>>>
>>>>>>>> The proposed API consists of the 3 new ioctls:
>>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>>>>>>>>   processors,
>>>>>>>> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities of given pictur

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-05-09 Thread Inki Dae
Hi Tomasz,

On 2017-05-10 14:38, Tomasz Figa wrote:
> Hi Everyone,
> 
> On Wed, May 10, 2017 at 9:24 AM, Inki Dae  wrote:
>>
>>
>> On 2017-04-26 07:21, Sakari Ailus wrote:
>>> Hi Marek,
>>>
>>> On Thu, Apr 20, 2017 at 01:23:09PM +0200, Marek Szyprowski wrote:
>>>> Hi Laurent,
>>>>
>>>> On 2017-04-20 12:25, Laurent Pinchart wrote:
>>>>> Hi Marek,
>>>>>
>>>>> (CC'ing Sakari Ailus)
>>>>>
>>>>> Thank you for the patches.
>>>>>
>>>>> On Thursday 20 Apr 2017 11:13:36 Marek Szyprowski wrote:
>>>>>> Dear all,
>>>>>>
>>>>>> This is an updated proposal for extending EXYNOS DRM API with generic
>>>>>> support for hardware modules, which can be used for processing image data
>>>>>> from one memory buffer to another. Typical memory-to-memory
>>>>>> operations
>>>>>> are: rotation, scaling, colour space conversion or mix of them. This is a
>>>>>> follow-up of my previous proposal "[RFC 0/2] New feature: Framebuffer
>>>>>> processors", which has been rejected as "not really needed in the DRM
>>>>>> core":
>>>>>> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
>>>>>>
>>>>>> In this proposal I moved all the code to Exynos DRM driver, so now this
>>>>>> will be specific only to Exynos DRM. I've also changed the name from
>>>>>> framebuffer processor (fbproc) to picture processor (pp) to avoid 
>>>>>> confusion
>>>>>> with fbdev API.
>>>>>>
>>>>>> Here is a bit more information about what picture processors are:
>>>>>>
>>>>>> Embedded SoCs are known to have a number of hardware blocks, which 
>>>>>> perform
>>>>>> such operations. They can be used in parallel to the main GPU module to
>>>>>> offload the CPU from processing graphics or video data. One example use of
>>>>>> such modules is implementing video overlay, which usually requires color
>>>>>> space conversion from NV12 (or similar) to RGB32 color space and scaling 
>>>>>> to
>>>>>> target window size.
>>>>>>
>>>>>> The proposed API is heavily inspired by atomic KMS approach - it is also
>>>>>> based on DRM objects and their properties. A new DRM object is 
>>>>>> introduced:
>>>>>> picture processor (called pp for convenience). Such objects have a set of
>>>>>> standard DRM properties, which describes the operation to be performed by
>>>>>> respective hardware module. In typical case those properties are a source
>>>>>> fb id and rectangle (x, y, width, height) and destination fb id and
>>>>>> rectangle. Optionally a rotation property can be also specified if
>>>>>> supported by the given hardware. To perform an operation on image data,
>>>>>> userspace provides a set of properties and their values for given fbproc
>>>>>> object in a similar way as object and properties are provided for
>>>>>> performing atomic page flip / mode setting.
>>>>>>
>>>>>> The proposed API consists of the 3 new ioctls:
>>>>>> - DRM_IOCTL_EXYNOS_PP_GET_RESOURCES: to enumerate all available picture
>>>>>>   processors,
>>>>>> - DRM_IOCTL_EXYNOS_PP_GET: to query capabilities of given picture
>>>>>>   processor,
>>>>>> - DRM_IOCTL_EXYNOS_PP_COMMIT: to perform operation described by given
>>>>>>   property set.
>>>>>>
>>>>>> The proposed API is extensible. Drivers can attach their own, custom
>>>>>> properties to add support for more advanced picture processing (for 
>>>>>> example
>>>>>> blending).
>>>>>>
>>>>>> This proposal aims to replace Exynos DRM IPP (Image Post Processing)
>>>>>> subsystem. IPP API is over-engineered in general, but not really 
>>>>>> extensible
>>>>>> on the other side. It is also buggy, with significant design flaws - the
>>>>>> biggest issue is the fact that the API covers memory-2-memory picture
>>>>>> operations together with CRTC w

Re: [RFC 0/4] Exynos DRM: add Picture Processor extension

2017-05-09 Thread Inki Dae
>> operation:
>>  - typically it will be used by compositing window manager, this means that
>>some parameters of the processing might change on each vblank (like
>>destination rectangle for example). This api allows such change on each
>>operation without any additional cost. V4L2 requires reinitializing the
>>queues with a new configuration on such a change, which means that a bunch
>>of ioctls has to be called.
> 
> What do you mean by re-initialising the queue? Format, buffers or something
> else?
> 
> If you need a larger buffer than what you have already allocated, you'll
> need to re-allocate, V4L2 or not.
> 
> We also do lack a way to destroy individual buffers in V4L2. It'd be up to
> implementing that and some work in videobuf2.
> 
> Another thing is that V4L2 is very stream oriented. For most devices that's
> fine as a lot of the parameters are not changeable during streaming,
> especially if the pipeline is handled by multiple drivers. That said, for
> devices that process data from memory to memory performing changes in the
> media bus formats and pipeline configuration is not very efficient
> currently, largely for the same reason.
> 
> The request API that people have been working for a bit different use cases
> isn't in mainline yet. It would allow more efficient per-request
> configuration than what is currently possible, but it has turned out to be
> far from trivial to implement.
> 
>>  - validating processing parameters in the V4L2 API is really complicated,
>>because the parameters (format, src&dest rectangles, rotation) are being
>>set incrementally, so we have to either allow some impossible,
>> transitional
>>configurations or complicate the configuration steps even more (like
>>calling some ioctls multiple times for both input and output). In the end
>>all parameters have to be again validated just before performing the
>>operation.
> 
> You have to validate the parameters in any case. In a MC pipeline this takes
> place when the stream is started.
> 
>>
>> 3. generic approach (to add it to DRM core) has been rejected:
>> http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg146286.html
> 
> For GPUs I generally understand the reasoning: there's a very limited number
> of users of this API --- primarily because it's not an application
> interface.
> 
> If you have a device that however falls under the scope of V4L2 (at least
> API-wise), does this continue to be the case? Will there be only one or two
> (or so) users for this API? Is it the case here?
> 
> Using a device specific interface definitely has some benefits: there's no
> need to think how would you generalise the interface for other similar
> devices. There's no need to consider backwards compatibility as it's not a
> requirement. The drawback is that the applications that need to support
> similar devices will bear the burden of having to support different APIs.
> 
> I don't mean to say that you should ram whatever under V4L2 / MC
> independently of how unworkable that might be, but there are also clear
> advantages in using a standardised interface such as V4L2.
> 
> V4L2 has a long history behind it and if it was designed today, I bet it
> would look quite different from what it is now.

It's true. There is definitely a benefit with V4L2 because V4L2 provides a
standard Linux ABI - DRM, as of now, does not.

However, I think that is the only benefit we could get through V4L2. Using V4L2
complicates the platform's software stack: to display an image on the screen
while scaling it or converting its color space, we have to open both a video
device node and a card device node, and we also need to export a DMA buffer
from one side and import it on the other side using DMABUF.

It may not be related to this, but V4L2 also has a performance problem - every
QBUF/DQBUF request performs mapping/unmapping of the DMA buffer, as you already
know. :)

In addition, display subsystems on recent ARM SoCs tend to include pre/post
processing hardware in the display controller - OMAP, Exynos8895 and MSM, as
far as I know.


Thanks,
Inki Dae

> 
>>
>> 4. this api can be considered as extended 'blit' operation, other DRM
>> drivers
>>(MGA, R128, VIA) already have ioctls for such operation, so there is also
>>place in DRM for it
> 
> Added LMML to cc.
> 


Re: [PATCH v2] dma-buf: Wait on the reservation object when sync'ing before CPU access

2016-12-18 Thread Inki Dae


On 2016-08-16 01:02, Daniel Vetter wrote:
> On Mon, Aug 15, 2016 at 04:42:18PM +0100, Chris Wilson wrote:
>> Rendering operations to the dma-buf are tracked implicitly via the
>> reservation_object (dmabuf->resv). This is used to allow poll() to
>> wait upon outstanding rendering (or just query the current status of
>> rendering). The dma-buf sync ioctl allows userspace to prepare the
>> dma-buf for CPU access, which should include waiting upon rendering.
>> (Some drivers may need to do more work to ensure that the dma-buf mmap
>> is coherent as well as complete.)
>>
>> v2: Always wait upon the reservation object implicitly. We choose to do
>> it after the native handler in case it can do so more efficiently.
>>
>> Testcase: igt/prime_vgem
>> Testcase: igt/gem_concurrent_blit # *vgem*
>> Signed-off-by: Chris Wilson 
>> Cc: Sumit Semwal 
>> Cc: Daniel Vetter 
>> Cc: Eric Anholt 
>> Cc: linux-media@vger.kernel.org
>> Cc: dri-de...@lists.freedesktop.org
>> Cc: linaro-mm-...@lists.linaro.org
>> Cc: linux-ker...@vger.kernel.org
>> ---
>>  drivers/dma-buf/dma-buf.c | 23 +++
>>  1 file changed, 23 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index ddaee60ae52a..cf04d249a6a4 100644
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -586,6 +586,22 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment 
>> *attach,
>>  }
>>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>>  
>> +static int __dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>> +  enum dma_data_direction direction)
>> +{
>> +bool write = (direction == DMA_BIDIRECTIONAL ||
>> +  direction == DMA_TO_DEVICE);
>> +struct reservation_object *resv = dmabuf->resv;
>> +long ret;
>> +
>> +/* Wait on any implicit rendering fences */
>> +ret = reservation_object_wait_timeout_rcu(resv, write, true,
>> +  MAX_SCHEDULE_TIMEOUT);
>> +if (ret < 0)
>> +return ret;
>> +
>> +return 0;
>> +}
>>  
>>  /**
>>   * dma_buf_begin_cpu_access - Must be called before accessing a dma_buf 
>> from the
>> @@ -608,6 +624,13 @@ int dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>>  if (dmabuf->ops->begin_cpu_access)
>>  ret = dmabuf->ops->begin_cpu_access(dmabuf, direction);
>>  
>> +/* Ensure that all fences are waited upon - but we first allow
>> + * the native handler the chance to do so more efficiently if it
>> + * chooses. A double invocation here will be reasonably cheap no-op.
>> + */
>> +if (ret == 0)
>> +ret = __dma_buf_begin_cpu_access(dmabuf, direction);
> 
> Not sure we should wait first and the flush or the other way round. But I
> don't think it'll matter for any current dma-buf exporter, so meh.
> 

Sorry for the late comment. I wonder whether there is a problem if the GPU or
another DMA device tries to access this dma-buf after the
dma_buf_begin_cpu_access call. I think in that case they - GPU or DMA devices -
would make a mess of the buffer while the CPU is accessing it.

This patch is already in mainline, so if this is a real problem then I think we
should choose one of:
1. revert this patch from mainline, or
2. make sure other DMA devices are prevented from accessing the buffer while
the CPU is accessing it.

Thanks.
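
[Editor's note, for context: the userspace contract around
dma_buf_begin_cpu_access() is the DMA_BUF_IOCTL_SYNC bracket - START waits for
the implicit fences discussed in this patch, END lets the exporter flush. A
minimal sketch, assuming `dmabuf_fd` is a valid dma-buf fd; error handling is
trimmed.]

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

static int cpu_access_begin(int dmabuf_fd, int write)
{
	struct dma_buf_sync sync;

	memset(&sync, 0, sizeof(sync));
	sync.flags = DMA_BUF_SYNC_START |
		     (write ? DMA_BUF_SYNC_RW : DMA_BUF_SYNC_READ);
	/* Blocks until the reservation object's fences have signalled. */
	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}

static int cpu_access_end(int dmabuf_fd, int write)
{
	struct dma_buf_sync sync;

	memset(&sync, 0, sizeof(sync));
	sync.flags = DMA_BUF_SYNC_END |
		     (write ? DMA_BUF_SYNC_RW : DMA_BUF_SYNC_READ);
	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

Note this only waits for fences already attached to the buffer; nothing in this
path stops a device from queueing new rendering afterwards, which is essentially
the concern raised in this reply.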

> Reviewed-by: Daniel Vetter 
> 
> Sumits, can you pls pick this one up and put into drm-misc?
> -Daniel
> 
>> +
>>  return ret;
>>  }
>>  EXPORT_SYMBOL_GPL(dma_buf_begin_cpu_access);
>> -- 
>> 2.8.1
>>
> 


Re: [PATCH 9/9] drm/exynos: Convert g2d_userptr_get_dma_addr() to use get_vaddr_frames()

2015-07-17 Thread Inki Dae
On 2015-07-17 19:31, Hans Verkuil wrote:
> On 07/17/2015 12:29 PM, Inki Dae wrote:
>> On 2015-07-17 19:20, Hans Verkuil wrote:
>>> On 07/13/2015 04:55 PM, Jan Kara wrote:
>>>> From: Jan Kara 
>>>>
>>>> Convert g2d_userptr_get_dma_addr() to pin pages using get_vaddr_frames().
>>>> This removes the knowledge about vmas and mmap_sem locking from exynos
>>>> driver. Also it fixes a problem that the function has been mapping user
>>>> provided address without holding mmap_sem.
>>>
>>> I'd like to see an Ack from one of the exynos drm driver maintainers before
>>> I merge this.
>>>
>>> Inki, Marek?
>>
>> I already gave my Ack, but it seems Jan missed it while updating.
>>
>> Anyway,
>> Acked-by: Inki Dae 
> 
> Thanks!

Oops, sorry. This patch would incur a build warning. Below is my comment.

> 
> BTW, I didn't see your earlier Ack either. Was it posted to the linux-media 
> list as well?
> It didn't turn up there.

I thought I had posted it, but I couldn't find the email in my mailbox, so I
may be mistaken.

> 
> Regards,
> 
>   Hans
> 
>>
>> Thanks,
>> Inki Dae
>>
>>>
>>> Regards,
>>>
>>> Hans
>>>
>>>>
>>>> Signed-off-by: Jan Kara 
>>>> ---
>>>>  drivers/gpu/drm/exynos/Kconfig  |  1 +
>>>>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 91 ++-
>>>>  drivers/gpu/drm/exynos/exynos_drm_gem.c | 97 -
>>>>  3 files changed, 30 insertions(+), 159 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/exynos/Kconfig 
>>>> b/drivers/gpu/drm/exynos/Kconfig
>>>> index 43003c4ad80b..b364562dc6c1 100644
>>>> --- a/drivers/gpu/drm/exynos/Kconfig
>>>> +++ b/drivers/gpu/drm/exynos/Kconfig
>>>> @@ -77,6 +77,7 @@ config DRM_EXYNOS_VIDI
>>>>  config DRM_EXYNOS_G2D
>>>>bool "Exynos DRM G2D"
>>>>depends on DRM_EXYNOS && !VIDEO_SAMSUNG_S5P_G2D
>>>> +  select FRAME_VECTOR
>>>>help
>>>>  Choose this option if you want to use Exynos G2D for DRM.
>>>>  
>>>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
>>>> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>>>> index 81a250830808..1d8d9a508373 100644
>>>> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>>>> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>>>> @@ -190,10 +190,8 @@ struct g2d_cmdlist_userptr {
>>>>dma_addr_t  dma_addr;
>>>>unsigned long   userptr;
>>>>unsigned long   size;
>>>> -  struct page **pages;
>>>> -  unsigned intnpages;
>>>> +  struct frame_vector *vec;
>>>>struct sg_table *sgt;
>>>> -  struct vm_area_struct   *vma;
>>>>atomic_trefcount;
>>>>boolin_pool;
>>>>boolout_of_list;
>>>> @@ -363,6 +361,7 @@ static void g2d_userptr_put_dma_addr(struct drm_device 
>>>> *drm_dev,
>>>>  {
>>>>struct g2d_cmdlist_userptr *g2d_userptr =
>>>>(struct g2d_cmdlist_userptr *)obj;
>>>> +  struct page **pages;
>>>>  
>>>>if (!obj)
>>>>return;
>>>> @@ -382,19 +381,21 @@ out:
>>>>exynos_gem_unmap_sgt_from_dma(drm_dev, g2d_userptr->sgt,
>>>>DMA_BIDIRECTIONAL);
>>>>  
>>>> -  exynos_gem_put_pages_to_userptr(g2d_userptr->pages,
>>>> -  g2d_userptr->npages,
>>>> -  g2d_userptr->vma);
>>>> +  pages = frame_vector_pages(g2d_userptr->vec);
>>>> +  if (!IS_ERR(pages)) {
>>>> +  int i;
>>>>  
>>>> -  exynos_gem_put_vma(g2d_userptr->vma);
>>>> +  for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
>>>> +  set_page_dirty_lock(pages[i]);
>>>> +  }
>>>> +  put_vaddr_frames(g2d_userptr->vec);
>>>> +  frame_vector_destroy(g2d_userptr->vec);
>>>>  
>>>>if (!g2d_userptr->out_of_list)
>>>>list_del_init

Re: [PATCH 9/9] drm/exynos: Convert g2d_userptr_get_dma_addr() to use get_vaddr_frames()

2015-07-17 Thread Inki Dae
On 2015-07-17 19:20, Hans Verkuil wrote:
> On 07/13/2015 04:55 PM, Jan Kara wrote:
>> From: Jan Kara 
>>
>> Convert g2d_userptr_get_dma_addr() to pin pages using get_vaddr_frames().
>> This removes the knowledge about vmas and mmap_sem locking from exynos
>> driver. Also it fixes a problem that the function has been mapping user
>> provided address without holding mmap_sem.
> 
> I'd like to see an Ack from one of the exynos drm driver maintainers before
> I merge this.
> 
> Inki, Marek?

I already gave my Ack, but it seems Jan missed it while updating.

Anyway,
Acked-by: Inki Dae 

Thanks,
Inki Dae

> 
> Regards,
> 
>   Hans
> 
>>
>> Signed-off-by: Jan Kara 
>> ---
>>  drivers/gpu/drm/exynos/Kconfig  |  1 +
>>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 91 ++-
>>  drivers/gpu/drm/exynos/exynos_drm_gem.c | 97 -
>>  3 files changed, 30 insertions(+), 159 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
>> index 43003c4ad80b..b364562dc6c1 100644
>> --- a/drivers/gpu/drm/exynos/Kconfig
>> +++ b/drivers/gpu/drm/exynos/Kconfig
>> @@ -77,6 +77,7 @@ config DRM_EXYNOS_VIDI
>>  config DRM_EXYNOS_G2D
>>  bool "Exynos DRM G2D"
>>  depends on DRM_EXYNOS && !VIDEO_SAMSUNG_S5P_G2D
>> +select FRAME_VECTOR
>>  help
>>Choose this option if you want to use Exynos G2D for DRM.
>>  
>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
>> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>> index 81a250830808..1d8d9a508373 100644
>> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
>> @@ -190,10 +190,8 @@ struct g2d_cmdlist_userptr {
>>  dma_addr_t  dma_addr;
>>  unsigned long   userptr;
>>  unsigned long   size;
>> -struct page **pages;
>> -unsigned intnpages;
>> +struct frame_vector *vec;
>>  struct sg_table *sgt;
>> -struct vm_area_struct   *vma;
>>  atomic_trefcount;
>>  boolin_pool;
>>  boolout_of_list;
>> @@ -363,6 +361,7 @@ static void g2d_userptr_put_dma_addr(struct drm_device 
>> *drm_dev,
>>  {
>>  struct g2d_cmdlist_userptr *g2d_userptr =
>>  (struct g2d_cmdlist_userptr *)obj;
>> +struct page **pages;
>>  
>>  if (!obj)
>>  return;
>> @@ -382,19 +381,21 @@ out:
>>  exynos_gem_unmap_sgt_from_dma(drm_dev, g2d_userptr->sgt,
>>  DMA_BIDIRECTIONAL);
>>  
>> -exynos_gem_put_pages_to_userptr(g2d_userptr->pages,
>> -g2d_userptr->npages,
>> -g2d_userptr->vma);
>> +pages = frame_vector_pages(g2d_userptr->vec);
>> +if (!IS_ERR(pages)) {
>> +int i;
>>  
>> -exynos_gem_put_vma(g2d_userptr->vma);
>> +for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
>> +set_page_dirty_lock(pages[i]);
>> +}
>> +put_vaddr_frames(g2d_userptr->vec);
>> +frame_vector_destroy(g2d_userptr->vec);
>>  
>>  if (!g2d_userptr->out_of_list)
>>  list_del_init(&g2d_userptr->list);
>>  
>>  sg_free_table(g2d_userptr->sgt);
>>  kfree(g2d_userptr->sgt);
>> -
>> -drm_free_large(g2d_userptr->pages);
>>  kfree(g2d_userptr);
>>  }
>>  
>> @@ -408,9 +409,7 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
>> drm_device *drm_dev,
>>  struct exynos_drm_g2d_private *g2d_priv = file_priv->g2d_priv;
>>  struct g2d_cmdlist_userptr *g2d_userptr;
>>  struct g2d_data *g2d;
>> -struct page **pages;
>>  struct sg_table *sgt;
>> -struct vm_area_struct *vma;
>>  unsigned long start, end;
>>  unsigned int npages, offset;
>>  int ret;
>> @@ -456,65 +455,38 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
>> drm_device *drm_dev,
>>  return ERR_PTR(-ENOMEM);
>>  
>>  atomic_set(&g2d_userptr->refcount, 1);
>> +g2d_userptr->size = size;
>>  
>>  start = userptr & PAGE_MASK;
>>  offset = userptr & ~PAGE_MASK;
>>  end = PAGE_ALIGN(userptr + size);
>> 

Re: [PATCH 9/9] drm/exynos: Convert g2d_userptr_get_dma_addr() to use get_vaddr_frames()

2015-05-14 Thread Inki Dae
Hi,

On 2015-05-13 22:08, Jan Kara wrote:
> Convert g2d_userptr_get_dma_addr() to pin pages using get_vaddr_frames().
> This removes the knowledge about vmas and mmap_sem locking from exynos
> driver. Also it fixes a problem that the function has been mapping user
> provided address without holding mmap_sem.
> 
> Signed-off-by: Jan Kara 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c | 89 ++
>  drivers/gpu/drm/exynos/exynos_drm_gem.c | 97 
> -
>  2 files changed, 29 insertions(+), 157 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 81a250830808..265519c0fe2d 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -190,10 +190,8 @@ struct g2d_cmdlist_userptr {
>   dma_addr_t  dma_addr;
>   unsigned long   userptr;
>   unsigned long   size;
> - struct page **pages;
> - unsigned intnpages;
> + struct frame_vector *vec;
>   struct sg_table *sgt;
> - struct vm_area_struct   *vma;
>   atomic_trefcount;
>   boolin_pool;
>   boolout_of_list;
> @@ -363,6 +361,7 @@ static void g2d_userptr_put_dma_addr(struct drm_device 
> *drm_dev,
>  {
>   struct g2d_cmdlist_userptr *g2d_userptr =
>   (struct g2d_cmdlist_userptr *)obj;
> + struct page **pages;
>  
>   if (!obj)
>   return;
> @@ -382,19 +381,21 @@ out:
>   exynos_gem_unmap_sgt_from_dma(drm_dev, g2d_userptr->sgt,
>   DMA_BIDIRECTIONAL);
>  
> - exynos_gem_put_pages_to_userptr(g2d_userptr->pages,
> - g2d_userptr->npages,
> - g2d_userptr->vma);
> + pages = frame_vector_pages(g2d_userptr->vec);
> + if (!IS_ERR(pages)) {
> + int i;
>  
> - exynos_gem_put_vma(g2d_userptr->vma);
> + for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
> + set_page_dirty_lock(pages[i]);
> + }
> + put_vaddr_frames(g2d_userptr->vec);
> + frame_vector_destroy(g2d_userptr->vec);
>  
>   if (!g2d_userptr->out_of_list)
>   list_del_init(&g2d_userptr->list);
>  
>   sg_free_table(g2d_userptr->sgt);
>   kfree(g2d_userptr->sgt);
> -
> - drm_free_large(g2d_userptr->pages);
>   kfree(g2d_userptr);
>  }
>  
> @@ -413,6 +414,7 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
> drm_device *drm_dev,
>   struct vm_area_struct *vma;
>   unsigned long start, end;
>   unsigned int npages, offset;
> + struct frame_vector *vec;
>   int ret;
>  
>   if (!size) {
> @@ -456,65 +458,37 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
> drm_device *drm_dev,
>   return ERR_PTR(-ENOMEM);
>  
>   atomic_set(&g2d_userptr->refcount, 1);
> + g2d_userptr->size = size;
>  
>   start = userptr & PAGE_MASK;
>   offset = userptr & ~PAGE_MASK;
>   end = PAGE_ALIGN(userptr + size);
>   npages = (end - start) >> PAGE_SHIFT;
> - g2d_userptr->npages = npages;
> -
> - pages = drm_calloc_large(npages, sizeof(struct page *));

The declaration of pages isn't needed anymore because you removed all of its uses.

> - if (!pages) {
> - DRM_ERROR("failed to allocate pages.\n");
> - ret = -ENOMEM;
> + vec = g2d_userptr->vec = frame_vector_create(npages);

I think you can use g2d_userptr->vec directly, so the local variable vec isn't needed.

> + if (!vec)
>   goto err_free;
> - }
>  
> - down_read(&current->mm->mmap_sem);
> - vma = find_vma(current->mm, userptr);

For vma, ditto.

Thanks,
Inki Dae

[--SNIP--]
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: exynos4 / g2d

2014-02-10 Thread Inki Dae
2014-02-10 17:44 GMT+09:00 Sachin Kamat :
> +cc Joonyoung Shim
>
> Hi,
>
> On 10 February 2014 13:58, Tobias Jakobi  
> wrote:
>> Hello!
>>
>>
>> Sachin Kamat wrote:
>>> +cc linux-media list and some related maintainers
>>>
>>> Hi,
>>>
>>> On 10 February 2014 00:22, Tobias Jakobi  
>>> wrote:
>>>> Hello!
>>>>
>>>> I noticed while here
>>>> (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/exynos4x12.dtsi?id=3a0d48f6f81459c874165ffb14b310c0b5bb0c58)
>>>> the necessary entry for the dts was made, on the drm driver side
>>>> (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/exynos/exynos_drm_g2d.c)
>>>> this was never added.
>>>>
>>>> Shouldn't "samsung,exynos4212-g2d" go into exynos_g2d_match as well?
>>> The DRM version of G2D driver does not support Exynos4 based G2D IP
>>> yet. The support for this IP
>>> is available only in the V4L2 version of the driver. Please see the file:
>>> drivers/media/platform/s5p-g2d/g2d.c
>>>
>> That doesn't make sense to me. From the initial commit message of the
>> DRM code:
>> "The G2D is a 2D graphic accelerator that supports Bit Block Transfer.
>> This G2D driver is exynos drm specific and supports only G2D(version
>> 4.1) of later Exynos series from Exynos4X12 because supporting DMA."
>> (https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/drivers/gpu/drm/exynos/exynos_drm_g2d.c?id=d7f1642c90ab5eb2d7c48af0581c993094f97e1a)
>>
>> In fact, this doesn't even mention the Exynos5?!
>
> It does say "later Exynos series from Exynos4X12" which technically
> includes Exynos5 and

Right, supported.

> does not include previous Exynos series SoCs like 4210, etc.
> Anyway, I haven't tested this driver on Exynos4 based platforms and
> hence cannot confirm if it
> supports 4x12 in the current form. I leave it to the original author
> and Inki to comment about it.
>

Just add "samsung,exynos4212-g2d" to exynos_g2d_match if you want to
use the G2D driver on an Exynos4212 SoC. We have already tested this
driver on Exynos4x12 SoCs as well; we just never posted the DT support
patch for the Exynos4x12 series.
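For reference, the change being discussed is a single compatible entry in the
driver's OF match table. A sketch of what it could look like (based on the
3.x-era exynos_drm_g2d.c; the surrounding entry is illustrative, not a tested
patch):

```c
/* drivers/gpu/drm/exynos/exynos_drm_g2d.c -- illustrative sketch only */
static const struct of_device_id exynos_g2d_match[] = {
	{ .compatible = "samsung,exynos5250-g2d" },
	{ .compatible = "samsung,exynos4212-g2d" },	/* proposed addition */
	{},
};
MODULE_DEVICE_TABLE(of, exynos_g2d_match);
```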

Thanks,
Inki Dae

> --
> With warm regards,
> Sachin


[PATCH v9 1/2] dmabuf-sync: Add a buffer synchronization framework

2013-09-17 Thread Inki Dae
The below is example code:
struct dmabuf_sync *sync;

sync = dmabuf_sync_init(...);
...

dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_R);
...

And the below can be used as access types:
DMA_BUF_ACCESS_R - CPU will access a buffer for read.
DMA_BUF_ACCESS_W - CPU will access a buffer for read or write.
DMA_BUF_ACCESS_DMA_R - DMA will access a buffer for read
DMA_BUF_ACCESS_DMA_W - DMA will access a buffer for read or
write.

2. Mandatory resource releasing - a task cannot hold a lock indefinitely.
A task might never unlock a buffer after taking a lock on it. In that
case, a timer handler for the corresponding sync object fires after five
(default) seconds, and the timed-out buffer is then unlocked by a
workqueue handler to avoid lockups and to reclaim the buffer's resources.

The below is how to use interfaces for device driver:
1. Allocate and Initialize a sync object:
static void xxx_dmabuf_sync_free(void *priv)
{
struct xxx_context *ctx = priv;

if (!ctx)
return;

ctx->sync = NULL;
}
...

static struct dmabuf_sync_priv_ops driver_specific_ops = {
.free = xxx_dmabuf_sync_free,
};
...

struct dmabuf_sync *sync;

sync = dmabuf_sync_init("test sync", &driver_specific_ops, ctx);
...

2. Add a dmabuf to the sync object when setting up dma buffer relevant
   registers:
dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
...

3. Lock all dmabufs of the sync object before DMA or CPU accesses
   the dmabufs:
dmabuf_sync_lock(sync);
...

4. Now CPU or DMA can access all dmabufs locked in step 3.

5. Unlock all dmabufs added in a sync object after DMA or CPU access
   to these dmabufs is completed:
dmabuf_sync_unlock(sync);

   And call the following functions to release all resources,
dmabuf_sync_put_all(sync);
dmabuf_sync_fini(sync);

You can refer to actual example codes:
"drm/exynos: add dmabuf sync support for g2d driver" and
"drm/exynos: add dmabuf sync support for kms framework" from
https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/log/?h=dmabuf-sync

And this framework exposes the fcntl[3] and select system calls as
interfaces to user space. As you know, user space sees a buffer object as
a dma-buf file descriptor. An fcntl() call on the file descriptor locks
some buffer region managed by the dma-buf object, and a select() call on
it polls for the completion of CPU or DMA access to the dma-buf.

The below is how to use interfaces for user application:

fcntl system call:

struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. For more
detail, please refer to [3].

select system call:

fd_set wdfs or rdfs;

FD_ZERO(&wdfs or &rdfs);
FD_SET(fd, &wdfs or &rdfs);

select(fd + 1, &rdfs, NULL, NULL, NULL);
or
select(fd + 1, NULL, &wdfs, NULL, NULL);

Every time the select system call is made, the caller will wait for
the completion of DMA or CPU access to the shared buffer if someone
is currently accessing it. If no one is, the select call returns
at once.

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |5 +
 drivers/base/dmabuf-sync.c |  951 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  257 +++
 7 f

[PATCH v2 2/2] dma-buf: Add user interfaces for dmabuf sync support

2013-09-17 Thread Inki Dae
This patch adds lock and poll callbacks to dma buf file operations,
and these callbacks will be called by fcntl and select system calls.

fcntl and select system calls can be used to wait for the completion
of DMA or CPU access to a shared dmabuf. The difference of them is
fcntl system call takes a lock after the completion but select system
call doesn't. So in case of fcntl system call, it's useful when a task
wants to access a shared dmabuf without any broken. On the other hand,
it's useful when a task wants to just wait for the completion.

Changelog v2:
- Add select system call support.
  . The purpose of this feature is to wait for the completion of DMA or
CPU access to a dmabuf without the caller having to lock the dmabuf
again after the completion.
That is useful when the caller wants to be aware of the completion of
DMA access to the dmabuf but doesn't use the interfaces of the DMA
device driver.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c |   81 
 1 file changed, 81 insertions(+)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 3985751..73234ba 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline int is_dma_buf_file(struct file *);
@@ -106,10 +107,90 @@ static loff_t dma_buf_llseek(struct file *file, loff_t 
offset, int whence)
return base + offset;
 }
 
+static unsigned int dma_buf_poll(struct file *filp,
+   struct poll_table_struct *poll)
+{
+   struct dma_buf *dmabuf;
+   struct dmabuf_sync_reservation *robj;
+   int ret = 0;
+
+   if (!is_dma_buf_file(filp))
+   return POLLERR;
+
+   dmabuf = filp->private_data;
+   if (!dmabuf || !dmabuf->sync)
+   return POLLERR;
+
+   robj = dmabuf->sync;
+
+   mutex_lock(&robj->lock);
+
+   robj->polled = true;
+
+   /*
+* CPU or DMA access to this buffer has been completed, and
+* the blocked task has been woken up. Return poll event
+* so that the task can get out of select().
+*/
+   if (robj->poll_event) {
+   robj->poll_event = false;
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   /*
+* No one is accessing this buffer, so just return.
+*/
+   if (!robj->locked) {
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   poll_wait(filp, &robj->poll_wait, poll);
+
+   mutex_unlock(&robj->lock);
+
+   return ret;
+}
+
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
.llseek = dma_buf_llseek,
+   .poll   = dma_buf_poll,
+   .lock   = dma_buf_lock,
 };
 
 /*
-- 
1.7.9.5



[PATCH v9 0/2] Introduce buffer synchronization framework

2013-09-17 Thread Inki Dae
n.

A Web app based on HTML5 also has the same issue: the Web browser and the
Web app are different processes. The Web app can draw something in its own
buffer using the CPU, and then the Web browser can compose that buffer
with its own back buffer.

Thus, in such cases, a shared buffer could be corrupted when one process
draws into it using the CPU while another process composes it with its
own buffer using the GPU, without any locking mechanism. That is why we
need a user-land locking interface, the fcntl system call.

And the last one is a deferred page flip issue: a rendered window buffer
can take about 32ms to reach the screen in the worst case, assuming that
GPU rendering completes within 16ms.
That can happen when compositing a pixmap buffer with a window buffer
using the GPU just as vsync starts. At that point, Xorg waits for
a vblank event to get a window buffer, so 3D rendering will be delayed
by up to about 16ms. As a result, the window buffer would be displayed
after about two vsyncs (about 32ms), which in turn shows up as slow
responsiveness.

For this, we could enhance responsiveness with a locking mechanism by
skipping one vblank wait. I guess Android, Chrome OS, and other platforms
use their own locking mechanisms for a similar reason: the Android sync
driver, KDS, and DMA fences.

The below shows the deferred page flip issue in worst case:

   | <- vsync signal
   |<-- DRI2GetBuffers
   |
   |
   |
   | <- vsync signal
   |<-- Request gpu rendering
  time |
   |
   |<-- Request page flip (deferred)
   | <- vsync signal
   |<-- Displayed on screen
   |
   |
   |
       | <- vsync signal

Thanks,
Inki Dae

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl
[4] https://www.tizen.org/

Inki Dae (2):
  dmabuf-sync: Add a buffer synchronization framework
  dma-buf: Add user interfaces for dmabuf sync support

 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   86 
 drivers/base/dmabuf-sync.c |  951 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  257 +++
 7 files changed, 1604 insertions(+)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

-- 
1.7.9.5



[PATCH v8 0/2] Introduce buffer synchronization framework

2013-08-29 Thread Inki Dae
ng in its own buffer using
CPU, and then the Web Browser can compose the buffer with its own back buffer.

Thus, in such cases, a shared buffer could be corrupted when one process
draws into it using the CPU while another process composes it with its
own buffer using the GPU, without any locking mechanism. That is why we
need a user-land locking interface, the fcntl system call.

And the last one is a deferred page flip issue: a rendered window buffer
can take about 32ms to reach the screen in the worst case, assuming that
GPU rendering completes within 16ms.
That can happen when compositing a pixmap buffer with a window buffer
using the GPU just as vsync starts. At that point, Xorg waits for
a vblank event to get a window buffer, so 3D rendering will be delayed
by up to about 16ms. As a result, the window buffer would be displayed
after about two vsyncs (about 32ms), which in turn shows up as slow
responsiveness.

For this, we could enhance responsiveness with a locking mechanism by
skipping one vblank wait. I guess Android, Chrome OS, and other platforms
use their own locking mechanisms for a similar reason: the Android sync
driver, KDS, and DMA fences.

The below shows the deferred page flip issue in worst case:

   | <- vsync signal
   |<-- DRI2GetBuffers
   |
   |
   |
   | <- vsync signal
   |<-- Request gpu rendering
  time |
   |
   |<-- Request page flip (deferred)
   | <- vsync signal
   |<-- Displayed on screen
   |
   |
   |
       | <- vsync signal

Thanks,
Inki Dae

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl
[4] https://www.tizen.org/

Inki Dae (2):
  dmabuf-sync: Add a buffer synchronization framework
  dma-buf: Add user interfaces for dmabuf sync support

 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   85 
 drivers/base/dmabuf-sync.c |  943 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  257 +++
 7 files changed, 1595 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

-- 
1.7.5.4



[PATCH v2 2/2] dma-buf: Add user interfaces for dmabuf sync support

2013-08-29 Thread Inki Dae
This patch adds lock and poll callbacks to dma buf file operations,
and these callbacks will be called by fcntl and select system calls.

fcntl and select system calls can be used to wait for the completion
of DMA or CPU access to a shared dmabuf. The difference of them is
fcntl system call takes a lock after the completion but select system
call doesn't. So in case of fcntl system call, it's useful when a task
wants to access a shared dmabuf without any broken. On the other hand,
it's useful when a task wants to just wait for the completion.

Changelog v2:
- Add select system call support.
  . The purpose of this feature is to wait for the completion of DMA or
CPU access to a dmabuf without the caller having to lock the dmabuf
again after the completion.
That is useful when the caller wants to be aware of the completion of
DMA access to the dmabuf but doesn't use the interfaces of the DMA
device driver.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c |   81 
 1 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index cc42a38..f961907 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline int is_dma_buf_file(struct file *);
@@ -81,9 +82,89 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
return dmabuf->ops->mmap(dmabuf, vma);
 }
 
+static unsigned int dma_buf_poll(struct file *filp,
+   struct poll_table_struct *poll)
+{
+   struct dma_buf *dmabuf;
+   struct dmabuf_sync_reservation *robj;
+   int ret = 0;
+
+   if (!is_dma_buf_file(filp))
+   return POLLERR;
+
+   dmabuf = filp->private_data;
+   if (!dmabuf || !dmabuf->sync)
+   return POLLERR;
+
+   robj = dmabuf->sync;
+
+   mutex_lock(&robj->lock);
+
+   robj->polled = true;
+
+   /*
+* CPU or DMA access to this buffer has been completed, and
+* the blocked task has been woken up. Return poll event
+* so that the task can get out of select().
+*/
+   if (robj->poll_event) {
+   robj->poll_event = false;
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   /*
+* No one is accessing this buffer, so just return.
+*/
+   if (!robj->locked) {
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   poll_wait(filp, &robj->poll_wait, poll);
+
+   mutex_unlock(&robj->lock);
+
+   return ret;
+}
+
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
+   .poll   = dma_buf_poll,
+   .lock   = dma_buf_lock,
 };
 
 /*
-- 
1.7.5.4



[PATCH v8 1/2] dmabuf-sync: Add a buffer synchronization framework

2013-08-29 Thread Inki Dae
dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_R);
...

And the below can be used as access types:
DMA_BUF_ACCESS_R - CPU will access a buffer for read.
DMA_BUF_ACCESS_W - CPU will access a buffer for read or write.
DMA_BUF_ACCESS_DMA_R - DMA will access a buffer for read
DMA_BUF_ACCESS_DMA_W - DMA will access a buffer for read or
write.

2. Mandatory resource releasing - a task cannot hold a lock indefinitely.
A task might never unlock a buffer after taking a lock on it. In that
case, a timer handler for the corresponding sync object fires after five
(default) seconds, and the timed-out buffer is then unlocked by a
workqueue handler to avoid lockups and to reclaim the buffer's resources.

The below is how to use interfaces for device driver:
1. Allocate and Initialize a sync object:
static void xxx_dmabuf_sync_free(void *priv)
{
struct xxx_context *ctx = priv;

if (!ctx)
return;

ctx->sync = NULL;
}
...

static struct dmabuf_sync_priv_ops driver_specific_ops = {
.free = xxx_dmabuf_sync_free,
};
...

struct dmabuf_sync *sync;

sync = dmabuf_sync_init("test sync", &driver_specific_ops, ctx);
...

2. Add a dmabuf to the sync object when setting up dma buffer relevant
   registers:
dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
...

3. Lock all dmabufs of the sync object before DMA or CPU accesses
   the dmabufs:
dmabuf_sync_lock(sync);
...

4. Now CPU or DMA can access all dmabufs locked in step 3.

5. Unlock all dmabufs added in a sync object after DMA or CPU access
   to these dmabufs is completed:
dmabuf_sync_unlock(sync);

   And call the following functions to release all resources,
dmabuf_sync_put_all(sync);
dmabuf_sync_fini(sync);

You can refer to actual example codes:
"drm/exynos: add dmabuf sync support for g2d driver" and
"drm/exynos: add dmabuf sync support for kms framework" from
https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/log/?h=dmabuf-sync

And this framework exposes the fcntl[3] and select system calls as
interfaces to user space. As you know, user space sees a buffer object as
a dma-buf file descriptor. An fcntl() call on the file descriptor locks
some buffer region managed by the dma-buf object, and a select() call on
it polls for the completion of CPU or DMA access to the dma-buf.

The below is how to use interfaces for user application:

fcntl system call:

struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. For more
detail, please refer to [3].

select system call:

fd_set wdfs or rdfs;

FD_ZERO(&wdfs or &rdfs);
FD_SET(fd, &wdfs or &rdfs);

select(fd + 1, &rdfs, NULL, NULL, NULL);
or
select(fd + 1, NULL, &wdfs, NULL, NULL);

Every time the select system call is made, the caller will wait for
the completion of DMA or CPU access to the shared buffer if someone
is currently accessing it. If no one is, the select call returns
at once.

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |4 +
 drivers/base/dmabuf-sync.c |  945 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  257 +++
 7 files changed, 1516 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/ba

RE: [PATCH v2 2/2] dma-buf: Add user interfaces for dmabuf sync support

2013-08-22 Thread Inki Dae
Thanks for your comments,
Inki Dae

> -Original Message-
> From: David Herrmann [mailto:dh.herrm...@gmail.com]
> Sent: Wednesday, August 21, 2013 10:17 PM
> To: Inki Dae
> Cc: dri-de...@lists.freedesktop.org; linux-fb...@vger.kernel.org; linux-
> arm-ker...@lists.infradead.org; linux-media@vger.kernel.org; linaro-
> ker...@lists.linaro.org; Maarten Lankhorst; Sumit Semwal;
> kyungmin.p...@samsung.com; myungjoo@samsung.com
> Subject: Re: [PATCH v2 2/2] dma-buf: Add user interfaces for dmabuf sync
> support
> 
> Hi
> 
> On Wed, Aug 21, 2013 at 12:33 PM, Inki Dae  wrote:
> > This patch adds lock and poll callbacks to dma buf file operations,
> > and these callbacks will be called by fcntl and select system calls.
> >
> > fcntl and select system calls can be used to wait for the completion
> > of DMA or CPU access to a shared dmabuf. The difference of them is
> > fcntl system call takes a lock after the completion but select system
> > call doesn't. So in case of fcntl system call, it's useful when a task
> > wants to access a shared dmabuf without any broken. On the other hand,
> > it's useful when a task wants to just wait for the completion.
> 
> 1)
> So how is that supposed to work in user-space? I don't want to block
> on a buffer, but get notified once I can lock it. So I do:
>   select(..dmabuf..)
> Once it is finished, I want to use it:
>   flock(..dmabuf..)
> However, how can I guarantee the flock will not block? Some other
> process might have locked it in between. So I do a non-blocking
> flock() and if it fails I wait again?

s/flock/fcntl

Yes, it does if you want to do a non-blocking fcntl. The fcntl() call will
return -EAGAIN if some other process has taken the lock first, so the user
process can retry the lock or do other work. Since the user process called
fcntl() in non-blocking mode, I think the user should consider two
outcomes: the fcntl() call could fail, or it could take the lock
successfully. Doesn't fcntl() on any other fd, not just a dmabuf, behave
the same way?

>Looks ugly and un-predictable.
> 

So I think this is reasonable. However, I'm not sure the select system
call is needed yet, so I can remove it if it is unnecessary.

> 2)
> What do I do if some user-space program holds a lock and dead-locks?
> 

I think an fcntl call on any other fd could lead to the same situation.
And the lock will be released once the user-space program is killed,
because when a process is killed all of its file descriptors are closed.

> 3)
> How do we do modesetting in atomic-context in the kernel? There is no
> way to lock the object. But this is required for panic-handlers and
> more importantly the kdb debugging hooks.
> Ok, I can live with that being racy, but would still be nice to be
> considered.

Yes, the lock must not be taken in atomic context. I will add enough
comments to make this clear.

> 
> 4)
> Why do we need locks? Aren't fences enough? That is, in which
> situation is a lock really needed?
> If we assume we have two writers A and B (DMA, CPU, GPU, whatever) and
> they have no synchronization on their own. What do we win by
> synchronizing their writes? Ok, yeah, we end up with either A or B and
> not a mixture of both. But if we cannot predict whether we get A or B,
> I don't know why we care at all? It's random, so a mixture would be
> fine, too, wouldn't it?

I don't think so. There are cases where the mixture wouldn't be fine;
I will describe one below.

> 
> So if user-space doesn't have any synchronization on its own, I don't
> see why we need an implicit sync on a dma-buf. Could you describe a
> more elaborate use-case?

Ok, first, I think I described that in enough detail in [PATCH 0/2]. You
can also refer to the link below,
http://lwn.net/Articles/564208/ 

Anyway, there are cases where a user-space process needs synchronization
on its own. On the Tizen platform[1], one is between an X Client and the
X Server; actually, the Composite Manager. Another is between a Web app
based on HTML5 and a Web Browser.

Please assume that an X Client draws something in a window buffer using
the CPU, and then requests SWAP from the X Server. The X Server then
notifies the Composite Manager with a damage event, and the Composite
Manager composes the window buffer with its own back buffer using the
GPU. In this case, the Composite Manager calls eglSwapBuffers, which
internally flushes GL commands instead of finishing them, for better
performance.

As you may know, the flushing doesn't wait for the completion event from
the GPU driver. And at the same time, the X Client could do other work,
and also draw something into the same buffer again. At this time, the
buffer could b

[PATCH v7 1/2] dmabuf-sync: Add a buffer synchronization framework

2013-08-21 Thread Inki Dae
The below is how to use interfaces for device driver:
1. Allocate and Initialize a sync object:
static void xxx_dmabuf_sync_free(void *priv)
{
struct xxx_context *ctx = priv;

if (!ctx)
return;

ctx->sync = NULL;
}
...

static struct dmabuf_sync_priv_ops driver_specific_ops = {
.free = xxx_dmabuf_sync_free,
};
...

struct dmabuf_sync *sync;

sync = dmabuf_sync_init("test sync", &driver_specific_ops, ctx);
...

2. Add a dmabuf to the sync object when setting up dma buffer relevant
   registers:
dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
...

3. Lock all dmabufs of the sync object before DMA or CPU accesses
   the dmabufs:
dmabuf_sync_lock(sync);
...

4. Now CPU or DMA can access all dmabufs locked in step 3.

5. Unlock all dmabufs added in a sync object after DMA or CPU access
   to these dmabufs is completed:
dmabuf_sync_unlock(sync);

   And call the following functions to release all resources,
dmabuf_sync_put_all(sync);
dmabuf_sync_fini(sync);

You can refer to actual example codes:
"drm/exynos: add dmabuf sync support for g2d driver" and
"drm/exynos: add dmabuf sync support for kms framework" from
https://git.kernel.org/cgit/linux/kernel/git/daeinki/
drm-exynos.git/log/?h=dmabuf-sync

And this framework includes fcntl[3] and select system call as interfaces
exported to user. As you know, user sees a buffer object as a dma-buf file
descriptor. fcntl() call with the file descriptor means to lock some buffer
region being managed by the dma-buf object. And select() call with the file
descriptor means to poll the completion event of CPU or DMA access to
the dma-buf.

The below is how to use interfaces for user application:

fcntl system call:

struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. And for more
detail, please refer to [3]

select system call:

fd_set wdfs or rdfs;

FD_ZERO(&wdfs or &rdfs);
FD_SET(fd, &wdfs or &rdfs);

select(fd + 1, &rdfs, NULL, NULL, NULL);
or
select(fd + 1, NULL, &wdfs, NULL, NULL);

Every time the select system call is invoked, the caller will wait for
the completion of DMA or CPU access to the shared buffer if someone is
accessing it. If no one is accessing it, the select system call returns
at once.

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |4 +
 drivers/base/dmabuf-sync.c |  706 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  236 ++
 7 files changed, 1256 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..5945c8a
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,286 @@
+DMA Buffer Synchronization Framework
+====================================
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. It also describes how to use the API to synchronize
+buffers between DMA and DMA, CPU and DMA, and CPU and CPU.
+
+The DMA Buffer synchronization API provides a buffer synchronization mechanism;
+i.e., buffer access control to CPU and DMA, and 

[PATCH v2 2/2] dma-buf: Add user interfaces for dmabuf sync support

2013-08-21 Thread Inki Dae
This patch adds lock and poll callbacks to dma buf file operations,
and these callbacks will be called by fcntl and select system calls.

fcntl and select system calls can be used to wait for the completion
of DMA or CPU access to a shared dmabuf. The difference between them is
that the fcntl system call takes a lock after the completion while the
select system call doesn't. So the fcntl system call is useful when a task
wants to access a shared dmabuf without corruption. On the other hand, the
select system call is useful when a task just wants to wait for the
completion.

Changelog v2:
- Add select system call support.
  . The purpose of this feature is to wait for the completion of DMA or
CPU access to a dmabuf without the caller having to lock the dmabuf
again after the completion.
That is useful when the caller wants to be aware of the completion of
DMA access to the dmabuf but doesn't use the interfaces of the DMA
device driver.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c |   81 
 1 files changed, 81 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 4aca57a..f16a396 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline int is_dma_buf_file(struct file *);
@@ -80,9 +81,89 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
return dmabuf->ops->mmap(dmabuf, vma);
 }
 
+static unsigned int dma_buf_poll(struct file *filp,
+   struct poll_table_struct *poll)
+{
+   struct dma_buf *dmabuf;
+   struct dmabuf_sync_reservation *robj;
+   int ret = 0;
+
+   if (!is_dma_buf_file(filp))
+   return POLLERR;
+
+   dmabuf = filp->private_data;
+   if (!dmabuf || !dmabuf->sync)
+   return POLLERR;
+
+   robj = dmabuf->sync;
+
+   mutex_lock(&robj->lock);
+
+   robj->polled = true;
+
+   /*
+* CPU or DMA access to this buffer has been completed, and
+* the blocked task has been woken up. Return a poll event
+* so that the task can get out of select().
+*/
+   if (robj->poll_event) {
+   robj->poll_event = false;
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   /*
+* No one is accessing this buffer, so just return.
+*/
+   if (!robj->locked) {
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   poll_wait(filp, &robj->poll_wait, poll);
+
+   mutex_unlock(&robj->lock);
+
+   return ret;
+}
+
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
+   .poll   = dma_buf_poll,
+   .lock   = dma_buf_lock,
 };
 
 /*
-- 
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 0/2] Introduce buffer synchronization framework

2013-08-21 Thread Inki Dae
to get a window buffer so 3d rendering will be delayed
up to about 16ms. As a result, the window buffer would be displayed in
about two vsyncs (about 32ms) and in turn, that would show slow
responsiveness.

For this, we could enhance the responsiveness with a locking
mechanism: skipping one vblank wait. I guess for similar reasons,
Android, Chrome OS, and other platforms are using their own locking
mechanisms; Android sync driver, KDS, and DMA fence.

The below shows the deferred page flip issue in worst case,

   | <- vsync signal
   |<-- DRI2GetBuffers
   |
   |
   |
   | <- vsync signal
   |<-- Request gpu rendering
  time |
   |
   |<-- Request page flip (deferred)
   | <- vsync signal
   |<-- Displayed on screen
   |
   |
   |
   | <- vsync signal

Thanks,
Inki Dae

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Inki Dae (2):
  dmabuf-sync: Add a buffer synchronization framework
  dma-buf: Add user interfaces for dmabuf sync support

 Documentation/dma-buf-sync.txt |  286 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   85 +
 drivers/base/dmabuf-sync.c |  706 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  236 ++
 7 files changed, 1337 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

-- 
1.7.5.4



RE: [PATCH 1/2] [RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework

2013-08-21 Thread Inki Dae

Thanks for the review,
Inki Dae

> -Original Message-
> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> ow...@vger.kernel.org] On Behalf Of Konrad Rzeszutek Wilk
> Sent: Wednesday, August 21, 2013 4:22 AM
> To: Inki Dae
> Cc: dri-de...@lists.freedesktop.org; linux-fb...@vger.kernel.org; linux-
> arm-ker...@lists.infradead.org; linux-media@vger.kernel.org; linaro-
> ker...@lists.linaro.org; kyungmin.p...@samsung.com;
> myungjoo@samsung.com
> Subject: Re: [PATCH 1/2] [RFC PATCH v6] dmabuf-sync: Add a buffer
> synchronization framework
> 
> On Tue, Aug 13, 2013 at 06:19:35PM +0900, Inki Dae wrote:
> > This patch adds a buffer synchronization framework based on DMA BUF[1]
> > and and based on ww-mutexes[2] for lock mechanism.
> >
> > The purpose of this framework is to provide not only buffer access
> control
> > to CPU and DMA but also easy-to-use interfaces for device drivers and
> > user application. This framework can be used for all dma devices using
> > system memory as dma buffer, especially for most ARM based SoCs.
> >
> > Changelog v6:
> > - Fix sync lock to multiple reads.
> > - Add select system call support.
> >   . Wake up poll_wait when a dmabuf is unlocked.
> > - Remove unnecessary the use of mutex lock.
> > - Add private backend ops callbacks.
> >   . This ops has one callback for device drivers to clean up their
> > sync object resource when the sync object is freed. For this,
> > device drivers should implement the free callback properly.
> > - Update document file.
> >
> > Changelog v5:
> > - Rmove a dependence on reservation_object: the reservation_object is
> used
> >   to hook up to ttm and dma-buf for easy sharing of reservations across
> >   devices. However, the dmabuf sync can be used for all dma devices;
> v4l2
> >   and drm based drivers, so doesn't need the reservation_object anymore.
> >   With regared to this, it adds 'void *sync' to dma_buf structure.
> > - All patches are rebased on mainline, Linux v3.10.
> >
> > Changelog v4:
> > - Add user side interface for buffer synchronization mechanism and
> update
> >   descriptions related to the user side interface.
> >
> > Changelog v3:
> > - remove cache operation relevant codes and update document file.
> >
> > Changelog v2:
> > - use atomic_add_unless to avoid potential bug.
> > - add a macro for checking valid access type.
> > - code clean.
> >
> > The mechanism of this framework has the following steps,
> > 1. Register dmabufs to a sync object - A task gets a new sync object
> and
> > can add one or more dmabufs that the task wants to access.
> > This registering should be performed when a device context or an
> event
> > context such as a page flip event is created or before CPU accesses
a
> shared
> > buffer.
> >
> > dma_buf_sync_get(a sync object, a dmabuf);
> >
> > 2. Lock a sync object - A task tries to lock all dmabufs added in
its
> own
> > sync object. Basically, the lock mechanism uses ww-mutex[1] to avoid
> dead
> > lock issue and for race condition between CPU and CPU, CPU and DMA,
> and DMA
> > and DMA. Taking a lock means that others cannot access all locked
> dmabufs
> > until the task that locked the corresponding dmabufs, unlocks all
the
> locked
> > dmabufs.
> > This locking should be performed before DMA or CPU accesses these
> dmabufs.
> >
> > dma_buf_sync_lock(a sync object);
> >
> > 3. Unlock a sync object - The task unlocks all dmabufs added in its
> own sync
> > object. The unlock means that the DMA or CPU accesses to the dmabufs
> have
> > been completed so that others may access them.
> > This unlocking should be performed after DMA or CPU has completed
> accesses
> > to the dmabufs.
> >
> > dma_buf_sync_unlock(a sync object);
> >
> > 4. Unregister one or all dmabufs from a sync object - A task
> unregisters
> > the given dmabufs from the sync object. This means that the task
> dosen't
> > want to lock the dmabufs.
> > The unregistering should be performed after DMA or CPU has completed
> > accesses to the dmabufs or when dma_buf_sync_lock() is failed.
> >
> > dma_buf_sync_put(a sync object, a dmabuf);
> > dma_buf_sync_put_all(a sync object);
> >
> > The described steps may be summarized as:
> > get -> lock -> CPU or DMA access to a buffer/s -> unlock ->

RE: [PATCH v2 1/5] [media] exynos-mscl: Add new driver for M-Scaler

2013-08-20 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Shaik Ameer Basha
> Sent: Tuesday, August 20, 2013 5:07 PM
> To: Inki Dae
> Cc: Shaik Ameer Basha; LMML; linux-samsung-...@vger.kernel.org;
> c...@samsung.com; Sylwester Nawrocki; posc...@google.com; Arun Kumar K
> Subject: Re: [PATCH v2 1/5] [media] exynos-mscl: Add new driver for M-
> Scaler
> 
> Hi Inki Dae,
> 
> Thanks for the review.
> 
> 
> On Mon, Aug 19, 2013 at 6:18 PM, Inki Dae  wrote:
> > Just quick review.
> >
> >> -Original Message-
> >> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> >> ow...@vger.kernel.org] On Behalf Of Shaik Ameer Basha
> >> Sent: Monday, August 19, 2013 7:59 PM
> >> To: linux-media@vger.kernel.org; linux-samsung-...@vger.kernel.org
> >> Cc: s.nawro...@samsung.com; posc...@google.com; arun...@samsung.com;
> >> shaik.am...@samsung.com
> >> Subject: [PATCH v2 1/5] [media] exynos-mscl: Add new driver for M-
> Scaler
> >>
> >> This patch adds support for M-Scaler (M2M Scaler) device which is a
> >> new device for scaling, blending, color fill  and color space
> >> conversion on EXYNOS5 SoCs.
> >>
> >> This device supports the followings as key feature.
> >> input image format
> >> - YCbCr420 2P(UV/VU), 3P
> >> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> >> - YCbCr444 2P(UV,VU), 3P
> >> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> >> - Pre-multiplexed ARGB, L8A8 and L8
> >> output image format
> >> - YCbCr420 2P(UV/VU), 3P
> >> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> >> - YCbCr444 2P(UV,VU), 3P
> >> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> >> - Pre-multiplexed ARGB
> >> input rotation
> >> - 0/90/180/270 degree, X/Y/XY Flip
> >> scale ratio
> >> - 1/4 scale down to 16 scale up
> >> color space conversion
> >> - RGB to YUV / YUV to RGB
> >> Size
> >> - Input : 16x16 to 8192x8192
> >> - Output:   4x4 to 8192x8192
> >> alpha blending, color fill
> >>
> >> Signed-off-by: Shaik Ameer Basha 
> >> ---
> >>  drivers/media/platform/exynos-mscl/mscl-regs.c |  318
> >> 
> >>  drivers/media/platform/exynos-mscl/mscl-regs.h |  282
> >> +
> >>  2 files changed, 600 insertions(+)
> >>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.c
> >>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.h
> >>
> >> diff --git a/drivers/media/platform/exynos-mscl/mscl-regs.c
> >> b/drivers/media/platform/exynos-mscl/mscl-regs.c
> >> new file mode 100644
> >> index 000..9354afc
> >> --- /dev/null
> >> +++ b/drivers/media/platform/exynos-mscl/mscl-regs.c
> >> @@ -0,0 +1,318 @@
> >> +/*
> >> + * Copyright (c) 2013 - 2014 Samsung Electronics Co., Ltd.
> >> + *   http://www.samsung.com
> >> + *
> >> + * Samsung EXYNOS5 SoC series M-Scaler driver
> >> + *
> >> + * This program is free software; you can redistribute it and/or
> modify
> >> + * it under the terms of the GNU General Public License as published
> >> + * by the Free Software Foundation, either version 2 of the License,
> >> + * or (at your option) any later version.
> >> + */
> >> +
> >> +#include 
> >> +#include 
> >> +
> >> +#include "mscl-core.h"
> >> +
> >> +void mscl_hw_set_sw_reset(struct mscl_dev *dev)
> >> +{
> >> + u32 cfg;
> >> +
> >> + cfg = readl(dev->regs + MSCL_CFG);
> >> + cfg |= MSCL_CFG_SOFT_RESET;
> >> +
> >> + writel(cfg, dev->regs + MSCL_CFG);
> >> +}
> >> +
> >> +int mscl_wait_reset(struct mscl_dev *dev)
> >> +{
> >> + unsigned long end = jiffies + msecs_to_jiffies(50);
> >
> > What does 50 mean?
> >
> >> + u32 cfg, reset_done = 0;
> >> +
> >
> > Please describe why the below codes are needed.
> 
> 
> As per the Documentation,
> 
> " SOFT RESET: Writing "1" to this bit generates software reset. To
> check the completion of the reset, wait until this
> field becomes zero, then w

RE: [PATCH v2 4/5] [media] exynos-mscl: Add DT bindings for M-Scaler driver

2013-08-19 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Shaik Ameer Basha
> Sent: Monday, August 19, 2013 7:59 PM
> To: linux-media@vger.kernel.org; linux-samsung-...@vger.kernel.org
> Cc: s.nawro...@samsung.com; posc...@google.com; arun...@samsung.com;
> shaik.am...@samsung.com
> Subject: [PATCH v2 4/5] [media] exynos-mscl: Add DT bindings for M-Scaler
> driver
> 
> This patch adds the DT binding documentation for the exynos5
> based M-Scaler device driver.
> 
> Signed-off-by: Shaik Ameer Basha 
> ---
>  .../devicetree/bindings/media/exynos5-mscl.txt |   34
> 
>  1 file changed, 34 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/exynos5-
> mscl.txt
> 
> diff --git a/Documentation/devicetree/bindings/media/exynos5-mscl.txt
> b/Documentation/devicetree/bindings/media/exynos5-mscl.txt
> new file mode 100644
> index 000..5c9d1b1
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/media/exynos5-mscl.txt
> @@ -0,0 +1,34 @@
> +* Samsung Exynos5 M-Scaler device
> +
> +M-Scaler is used for scaling, blending, color fill and color space
> +conversion on EXYNOS5 SoCs.
> +
> +Required properties:
> +- compatible: should be "samsung,exynos5-mscl"

If Exynos5410/5420 have the same IP,
"samsung,exynos5410-mscl" for M Scaler IP in Exynos5410/5420"

Else,
Compatible: should be one of the following:
(a) "samsung,exynos5410-mscl" for M Scaler IP in Exynos5410"
(b) "samsung,exynos5420-mscl" for M Scaler IP in Exynos5420"

> +- reg: should contain M-Scaler physical address location and length.
> +- interrupts: should contain M-Scaler interrupt number
> +- clocks: should contain the clock number according to CCF
> +- clock-names: should be "mscl"
> +
> +Example:
> +
> + mscl_0: mscl@0x1280 {
> + compatible = "samsung,exynos5-mscl";

"samsung,exynos5410-mscl";

> + reg = <0x1280 0x1000>;
> + interrupts = <0 220 0>;
> + clocks = <&clock 381>;
> + clock-names = "mscl";
> + };
> +
> +Aliases:
> +Each M-Scaler node should have a numbered alias in the aliases node,
> +in the form of msclN, N = 0...2. M-Scaler driver uses these aliases
> +to retrieve the device IDs using "of_alias_get_id()" call.
> +
> +Example:
> +
> +aliases {
> + mscl0 =&mscl_0;
> + mscl1 =&mscl_1;
> + mscl2 =&mscl_2;
> +};
> --
> 1.7.9.5
> 



RE: [PATCH v2 1/5] [media] exynos-mscl: Add new driver for M-Scaler

2013-08-19 Thread Inki Dae
Just quick review.

> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Shaik Ameer Basha
> Sent: Monday, August 19, 2013 7:59 PM
> To: linux-media@vger.kernel.org; linux-samsung-...@vger.kernel.org
> Cc: s.nawro...@samsung.com; posc...@google.com; arun...@samsung.com;
> shaik.am...@samsung.com
> Subject: [PATCH v2 1/5] [media] exynos-mscl: Add new driver for M-Scaler
> 
> This patch adds support for M-Scaler (M2M Scaler) device which is a
> new device for scaling, blending, color fill  and color space
> conversion on EXYNOS5 SoCs.
> 
> This device supports the followings as key feature.
> input image format
> - YCbCr420 2P(UV/VU), 3P
> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> - YCbCr444 2P(UV,VU), 3P
> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> - Pre-multiplexed ARGB, L8A8 and L8
> output image format
> - YCbCr420 2P(UV/VU), 3P
> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> - YCbCr444 2P(UV,VU), 3P
> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> - Pre-multiplexed ARGB
> input rotation
> - 0/90/180/270 degree, X/Y/XY Flip
> scale ratio
> - 1/4 scale down to 16 scale up
> color space conversion
> - RGB to YUV / YUV to RGB
> Size
> - Input : 16x16 to 8192x8192
> - Output:   4x4 to 8192x8192
> alpha blending, color fill
> 
> Signed-off-by: Shaik Ameer Basha 
> ---
>  drivers/media/platform/exynos-mscl/mscl-regs.c |  318
> 
>  drivers/media/platform/exynos-mscl/mscl-regs.h |  282
> +
>  2 files changed, 600 insertions(+)
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.c
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.h
> 
> diff --git a/drivers/media/platform/exynos-mscl/mscl-regs.c
> b/drivers/media/platform/exynos-mscl/mscl-regs.c
> new file mode 100644
> index 000..9354afc
> --- /dev/null
> +++ b/drivers/media/platform/exynos-mscl/mscl-regs.c
> @@ -0,0 +1,318 @@
> +/*
> + * Copyright (c) 2013 - 2014 Samsung Electronics Co., Ltd.
> + *   http://www.samsung.com
> + *
> + * Samsung EXYNOS5 SoC series M-Scaler driver
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published
> + * by the Free Software Foundation, either version 2 of the License,
> + * or (at your option) any later version.
> + */
> +
> +#include 
> +#include 
> +
> +#include "mscl-core.h"
> +
> +void mscl_hw_set_sw_reset(struct mscl_dev *dev)
> +{
> + u32 cfg;
> +
> + cfg = readl(dev->regs + MSCL_CFG);
> + cfg |= MSCL_CFG_SOFT_RESET;
> +
> + writel(cfg, dev->regs + MSCL_CFG);
> +}
> +
> +int mscl_wait_reset(struct mscl_dev *dev)
> +{
> + unsigned long end = jiffies + msecs_to_jiffies(50);

What does 50 mean?

> + u32 cfg, reset_done = 0;
> +

Please describe why the below codes are needed.

> + while (time_before(jiffies, end)) {
> + cfg = readl(dev->regs + MSCL_CFG);
> + if (!(cfg & MSCL_CFG_SOFT_RESET)) {
> + reset_done = 1;
> + break;
> + }
> + usleep_range(10, 20);
> + }
> +
> + /* write any value to r/w reg and read it back */
> + while (reset_done) {
> +
> + /* [TBD] need to define number of tries before returning
> +  * -EBUSY to the caller
> +  */
> +
> + writel(MSCL_CFG_SOFT_RESET_CHECK_VAL,
> + dev->regs + MSCL_CFG_SOFT_RESET_CHECK_REG);
> + if (MSCL_CFG_SOFT_RESET_CHECK_VAL ==
> + readl(dev->regs + MSCL_CFG_SOFT_RESET_CHECK_REG))
> + return 0;
> + }
> +
> + return -EBUSY;
> +}
> +
> +void mscl_hw_set_irq_mask(struct mscl_dev *dev, int interrupt, bool mask)
> +{
> + u32 cfg;
> +
> + switch (interrupt) {
> + case MSCL_INT_TIMEOUT:
> + case MSCL_INT_ILLEGAL_BLEND:
> + case MSCL_INT_ILLEGAL_RATIO:
> + case MSCL_INT_ILLEGAL_DST_HEIGHT:
> + case MSCL_INT_ILLEGAL_DST_WIDTH:
> + case MSCL_INT_ILLEGAL_DST_V_POS:
> + case MSCL_INT_ILLEGAL_DST_H_POS:
> + case MSCL_INT_ILLEGAL_DST_C_SPAN:
> + case MSCL_INT_ILLEGAL_DST_Y_SPAN:
> + case MSCL_INT_ILLEGAL_DST_CR_BASE:
> + case MSCL_INT_ILLEGAL_DST_CB_BASE:
> + case MSCL_INT_ILLEGAL_DST_Y_BASE:
> + case MSCL_INT_ILLEGAL_DST_COLOR:
> + case MSCL_INT_ILLEGAL_SRC_HEIGHT:
> + case MSCL_INT_ILLEGAL_SRC_WIDTH:
> + case MSCL_INT_ILLEGAL_SRC_CV_POS:
> + case MSCL_INT_ILLEGAL_SRC_CH_POS:
> + case MSCL_INT_ILLEGAL_SRC_YV_POS:
> + case MSCL_INT_ILLEGAL_SRC_YH_POS:
> + case MSCL_INT_ILLEGAL_SRC_C_SPAN:
> + case MSCL_INT_ILLEGAL_SRC_Y_SPAN:
> + case MSCL_INT_ILLEGAL_SRC_CR_BASE:
> + case MSCL_INT_ILLEGAL_SRC_CB_BASE:
> + 

RE: [PATCH v2 0/5] Exynos5 M-Scaler Driver

2013-08-19 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Shaik Ameer Basha
> Sent: Monday, August 19, 2013 7:59 PM
> To: linux-media@vger.kernel.org; linux-samsung-...@vger.kernel.org
> Cc: s.nawro...@samsung.com; posc...@google.com; arun...@samsung.com;
> shaik.am...@samsung.com
> Subject: [PATCH v2 0/5] Exynos5 M-Scaler Driver
> 
> This patch adds support for M-Scaler (M2M Scaler) device which is a
> new device for scaling, blending, color fill  and color space
> conversion on EXYNOS5 SoCs.

Do all Exynos5 SoCs really have this IP? It seems that only Exynos5420 and
maybe Exynos5410 have this IP, NOT Exynos5250. Please check it again and
describe it accurately across the whole patch series.

Thanks,
Inki Dae

> 
> This device supports the following as key features.
> input image format
> - YCbCr420 2P(UV/VU), 3P
> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> - YCbCr444 2P(UV,VU), 3P
> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> - Pre-multiplexed ARGB, L8A8 and L8
> output image format
> - YCbCr420 2P(UV/VU), 3P
> - YCbCr422 1P(YUYV/UYVY/YVYU), 2P(UV,VU), 3P
> - YCbCr444 2P(UV,VU), 3P
> - RGB565, ARGB1555, ARGB, ARGB, RGBA
> - Pre-multiplexed ARGB
> input rotation
> - 0/90/180/270 degree, X/Y/XY Flip
> scale ratio
> - 1/4 scale down to 16 scale up
> color space conversion
> - RGB to YUV / YUV to RGB
> Size
> - Input : 16x16 to 8192x8192
> - Output:   4x4 to 8192x8192
> alpha blending, color fill
> 
> Rebased on:
> ---
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git:master
> 
> Changes from v1:
> ---
> 1] Split the previous single patch into multiple patches.
> 2] Added DT binding documentation.
> 3] Removed the unnecessary header file inclusions.
> 4] Fix the condition check in mscl_prepare_address for swapping cb/cr
> addresses.
> 
> Shaik Ameer Basha (5):
>   [media] exynos-mscl: Add new driver for M-Scaler
>   [media] exynos-mscl: Add core functionality for the M-Scaler driver
>   [media] exynos-mscl: Add m2m functionality for the M-Scaler driver
>   [media] exynos-mscl: Add DT bindings for M-Scaler driver
>   [media] exynos-mscl: Add Makefile for M-Scaler driver
> 
>  .../devicetree/bindings/media/exynos5-mscl.txt |   34 +
>  drivers/media/platform/Kconfig |8 +
>  drivers/media/platform/Makefile|1 +
>  drivers/media/platform/exynos-mscl/Makefile|3 +
>  drivers/media/platform/exynos-mscl/mscl-core.c | 1312
> 
>  drivers/media/platform/exynos-mscl/mscl-core.h |  549 
>  drivers/media/platform/exynos-mscl/mscl-m2m.c  |  763 
>  drivers/media/platform/exynos-mscl/mscl-regs.c |  318 +
>  drivers/media/platform/exynos-mscl/mscl-regs.h |  282 +
>  9 files changed, 3270 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/exynos5-
> mscl.txt
>  create mode 100644 drivers/media/platform/exynos-mscl/Makefile
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-core.c
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-core.h
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-m2m.c
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.c
>  create mode 100644 drivers/media/platform/exynos-mscl/mscl-regs.h
> 
> --
> 1.7.9.5
> 



[Resend][RFC PATCH v6 0/2] Introduce buffer synchronization framework

2013-08-13 Thread Inki Dae
Just adding more detailed descriptions.

Hi all,

   This patch set introduces a buffer synchronization framework based
   on DMA BUF[1] and on ww-mutexes[2] for the lock mechanism, and this
   may be the final RFC.

   The purpose of this framework is to provide not only buffer access
   control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
   easy-to-use interfaces for device drivers and user applications.
   In addition, this patch set suggests a way of enhancing performance.

   For generic user mode interface, we have used fcntl and select system
   call[3]. As you know, user application sees a buffer object as a dma-buf
   file descriptor. So fcntl() call with the file descriptor means to lock
   some buffer region being managed by the dma-buf object. And select() call
   means to wait for the completion of CPU or DMA access to the dma-buf
   without locking. For more detail, you can refer to the dma-buf-sync.txt
   in Documentation/


   There are some cases where we should use this buffer synchronization
   framework. One of them is primarily to enhance GPU rendering performance on
   the Tizen platform, in the case of a 3d app in compositing mode that draws
   something into an off-screen buffer, and in the case of a Web app.

   In case of 3d app with compositing mode which is not a full screen mode,
   the app calls glFlush to submit 3d commands to GPU driver instead of
   glFinish for more performance. The reason we call glFlush is that glFinish
   blocks the caller's task until the execution of the 3d commands is completed.
   Thus, that makes GPU and CPU more idle. As a result, 3d rendering performance
   with glFinish is quite lower than glFlush. However, the use of glFlush has
   one issue: a buffer shared with the GPU could be broken when the CPU
   accesses the buffer right after glFlush, because the CPU cannot be aware of
   the completion of GPU access to the buffer. Of course, the app can be aware
   of that completion using eglWaitGL, but this function is valid only within
   the same process.

   The below summarizes how app's window is displayed on Tizen platform:
   1. X client requests a window buffer to Xorg.
   2. X client draws something in the window buffer using CPU.
   3. X client requests SWAP to Xorg.
   4. Xorg notifies a damage event to Composite Manager.
   5. Composite Manager gets the window buffer (front buffer) through
  DRI2GetBuffers.
   6. Composite Manager composes the window buffer and its own back buffer
  using GPU. At this time, eglSwapBuffers is called: internally, 3d
  commands are flushed to gpu driver.
   7. Composite Manager requests SWAP to Xorg.
   8. Xorg performs drm page flip. At this time, the window buffer is
  displayed on screen.

   Web app based on HTML5 also has a similar procedure. The Web browser and its
   web app are different processes. The Web app draws something in its own buffer,
   and then the web browser gets a window buffer from Xorg, and then composes
   those two buffers using GPU.

   Thus, in such cases, a shared buffer could be broken when one process draws
   something in a shared buffer using CPU while Composite manager is composing
   two buffers - the X client's front buffer and the Composite Manager's back
   buffer, or the web app's front buffer and the web browser's back buffer -
   using GPU without any locking mechanism. That is why we need a user-land
   locking interface: the fcntl system call.

   And the last one is a deferred page flip issue: a rendered window buffer
   can be displayed on screen in about 32ms in the worst case, assuming
   that the GPU rendering is completed within 16ms.
   That can be incurred when compositing a pixmap buffer with a window buffer
   using GPU and when vsync is just started. At this time, Xorg waits for
   a vblank event to get a window buffer so 3d rendering will be delayed
   up to about 16ms. As a result, the window buffer would be displayed in
   about two vsyncs (about 32ms) and in turn, that would show slow
   responsiveness.

   For this, we could enhance the responsiveness with a locking
   mechanism: skipping one vblank wait. I guess for similar reasons,
   Android, Chrome OS, and other platforms are using their own locking
   mechanisms; Android sync driver, KDS, and DMA fence.

   The below shows the deferred page flip issue in worst case,

   | <- vsync signal
   |<-- DRI2GetBuffers
   |
   |
   |
   | <- vsync signal
   |<-- Request gpu rendering
  time |
   |
   |<-- Request page flip (deferred)
   | <- vsync signal
   |<-- Displayed on screen
   |
   |
   |
       | <- vsync signal


Thanks,
Inki Dae


References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

[RFC PATCH v6 0/2] Introduce buffer synchronization framework

2013-08-13 Thread Inki Dae
Hi all,

   This patch set introduces a buffer synchronization framework based
   on DMA BUF[1] and based on ww-mutexes[2] for lock mechanism, and
   may be final RFC.

   The purpose of this framework is to provide not only buffer access
   control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
   easy-to-use interfaces for device drivers and user applications.
   In addition, this patch set suggests a way of enhancing performance.

   For the generic user mode interface, we have used the fcntl and select
   system calls[3]. As you know, a user application sees a buffer object as
   a dma-buf file descriptor. So an fcntl() call with the file descriptor
   means locking some buffer region being managed by the dma-buf object, and
   a select() call means waiting for the completion of CPU or DMA access to
   the dma-buf without locking. For more detail, you can refer to
   dma-buf-sync.txt in Documentation/.


   There are some cases where we should use this buffer synchronization
   framework. One of them is primarily to enhance GPU rendering performance
   on the Tizen platform: a 3d app with compositing mode that draws something
   into an off-screen buffer, and a web app.

   In case of a 3d app with compositing mode, which is not a full screen
   mode, the app calls glFlush to submit 3d commands to the GPU driver
   instead of glFinish for more performance. The reason we call glFlush is
   that glFinish blocks the caller's task until the execution of the 3d
   commands is completed. Thus, that makes GPU and CPU more idle. As a
   result, 3d rendering performance with glFinish is quite lower than with
   glFlush. However, the use of glFlush has one issue: a buffer shared with
   the GPU could be broken when the CPU accesses the buffer right after
   glFlush, because the CPU cannot be aware of the completion of GPU access
   to the buffer. Of course, the app can be aware of that time using
   eglWaitGL, but this function is valid only within the same process.

   In case of Tizen, there are some applications where one process draws
   something into its own off-screen buffer (pixmap buffer) using the CPU,
   and another process gets an off-screen buffer (window buffer) from Xorg
   using DRI2GetBuffers, composites the pixmap buffer with the window buffer
   using the GPU, and finally page flips.

   A web app based on HTML5 also has the same issue. The web browser and its
   web app are different processes. The web app draws something in its own
   pixmap buffer, then the web browser gets a window buffer from Xorg,
   composites the pixmap buffer with the window buffer, and finally page
   flips.

   Thus, in such cases, a shared buffer could be broken when one process
   draws something into the pixmap buffer using the CPU while the other
   process composites the pixmap buffer with the window buffer using the GPU
   without any locking mechanism. That is why we need a user-land locking
   interface: the fcntl system call.

   And the last one is a deferred page flip issue. The issue is that a
   rendered window buffer can take about 32ms to be displayed on screen in
   the worst case, assuming that the GPU rendering is completed within 16ms.
   That can occur when compositing a pixmap buffer with a window buffer
   using the GPU just as vsync starts. At this time, Xorg waits for
   a vblank event to get a window buffer, so 3d rendering will be delayed
   by up to about 16ms. As a result, the window buffer would be displayed
   after about two vsyncs (about 32ms) and, in turn, that would show slow
   responsiveness.

   For this, we could enhance the responsiveness with a locking
   mechanism: skipping one vblank wait. I guess that for a similar reason
   Android, Chrome OS, and other platforms use their own locking
   mechanisms: the Android sync driver, KDS, and DMA fence.

   The below shows the deferred page flip issue in worst case,

   | <- vsync signal
   |<-- DRI2GetBuffers
   |
   |
   |
   | <- vsync signal
   |<-- Request gpu rendering
  time |
   |
   |<-- Request page flip (deferred)
   | <- vsync signal
   |<-- Displayed on screen
   |
   |
   |
   | <- vsync signal


Thanks,
Inki Dae


References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl


Inki Dae (2):
  [RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework
  [RFC PATCH v2] dma-buf: Add user interfaces for dmabuf sync support.

 Documentation/dma-buf-sync.txt |  285 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   85 +
 drivers/base/dmabuf-sync.c |  678 
 include/linux/dma-buf.h|   16 +
 include/

[PATCH 2/2] [RFC PATCH v2] dma-buf: Add user interfaces for dmabuf sync support.

2013-08-13 Thread Inki Dae
This patch adds lock and poll callbacks to dma buf file operations,
and these callbacks will be called by fcntl and select system calls.

fcntl and select system calls can be used to wait for the completion
of DMA or CPU access to a shared dmabuf. The difference between them is
that the fcntl system call takes a lock after the completion but the
select system call doesn't. So the fcntl system call is useful when a
task wants to access a shared dmabuf without the buffer being broken,
while the select system call is useful when a task just wants to wait
for the completion.

Changelog v2:
- Add select system call support.
  . The purpose of this feature is to wait for the completion of DMA or
CPU access to a dmabuf without the caller having to lock the dmabuf
again after the completion.
That is useful when the caller wants to be aware of the completion of
DMA access to the dmabuf but doesn't use the interfaces of the DMA
device driver.
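For illustration, the select() calling pattern described above can be wrapped in a small user-space helper. This is a hedged sketch: `wait_buffer_idle` is a hypothetical name, and on a kernel without the proposed dma-buf poll hook select() simply reports ordinary I/O readiness (a pipe fd can be used to try the call shape), but the usage against a dma-buf fd would be identical.

```c
#include <sys/select.h>
#include <unistd.h>
#include <assert.h>

/* Wait, without taking a lock, until fd is "ready". With the proposed
 * dma-buf poll hook, readiness means no one is accessing the buffer;
 * on an ordinary fd it just reports normal I/O readiness.
 * Returns 0 on success, -1 on error. */
static int wait_buffer_idle(int fd, int for_write)
{
    fd_set fds;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);

    if (for_write)
        return select(fd + 1, NULL, &fds, NULL, NULL) == 1 ? 0 : -1;

    return select(fd + 1, &fds, NULL, NULL, NULL) == 1 ? 0 : -1;
}
```

Note that select() here does not take the lock afterwards; a task that needs exclusive access would still follow up with the fcntl() lock described in the cover letter.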

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c  |   81 +++
 include/linux/dmabuf-sync.h |1 +
 2 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 4aca57a..f16a396 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static inline int is_dma_buf_file(struct file *);
@@ -80,9 +81,89 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
return dmabuf->ops->mmap(dmabuf, vma);
 }
 
+static unsigned int dma_buf_poll(struct file *filp,
+   struct poll_table_struct *poll)
+{
+   struct dma_buf *dmabuf;
+   struct dmabuf_sync_reservation *robj;
+   int ret = 0;
+
+   if (!is_dma_buf_file(filp))
+   return POLLERR;
+
+   dmabuf = filp->private_data;
+   if (!dmabuf || !dmabuf->sync)
+   return POLLERR;
+
+   robj = dmabuf->sync;
+
+   mutex_lock(&robj->lock);
+
+   robj->polled = true;
+
+   /*
+* CPU or DMA access to this buffer has been completed, and
+* the blocked task has been woken up. Return a poll event
+* so that the task can get out of select().
+*/
+   if (robj->poll_event) {
+   robj->poll_event = false;
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   /*
+* No one is accessing this buffer, so just return.
+*/
+   if (!robj->locked) {
+   mutex_unlock(&robj->lock);
+   return POLLIN | POLLOUT;
+   }
+
+   poll_wait(filp, &robj->poll_wait, poll);
+
+   mutex_unlock(&robj->lock);
+
+   return ret;
+}
+
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
+   .poll   = dma_buf_poll,
+   .lock   = dma_buf_lock,
 };
 
 /*
diff --git a/include/linux/dmabuf-sync.h b/include/linux/dmabuf-sync.h
index 9a3afc4..0316f68 100644
--- a/include/linux/dmabuf-sync.h
+++ b/include/linux/dmabuf-sync.h
@@ -11,6 +11,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
-- 
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] [RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework

2013-08-13 Thread Inki Dae
  return;

ctx->sync = NULL;
}
...

static struct dmabuf_sync_priv_ops driver_specific_ops = {
.free = xxx_dmabuf_sync_free,
};
...

struct dmabuf_sync *sync;

sync = dmabuf_sync_init("test sync", &driver_specific_ops, ctx);
...

2. Add a dmabuf to the sync object when setting up dma buffer relevant
   registers:
dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
...

3. Lock all dmabufs of the sync object before DMA or CPU accesses
   the dmabufs:
dmabuf_sync_lock(sync);
...

4. Now CPU or DMA can access all dmabufs locked in step 3.

5. Unlock all dmabufs added in a sync object after DMA or CPU access
   to these dmabufs is completed:
dmabuf_sync_unlock(sync);

   And call the following functions to release all resources,
dmabuf_sync_put_all(sync);
dmabuf_sync_fini(sync);

You can refer to actual example codes:
"drm/exynos: add dmabuf sync support for g2d driver" and
"drm/exynos: add dmabuf sync support for kms framework" from
https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/log/?h=dmabuf-sync

And this framework includes fcntl system call[3] as interfaces exported
to user. As you know, user sees a buffer object as a dma-buf file descriptor.
So fcntl() call with the file descriptor means to lock some buffer region being
managed by the dma-buf object.

The below is how to use the interfaces in a user application:

fcntl system call:

struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. And for more
detail, please refer to [3].

select system call:

fd_set wdfs or rdfs;

FD_ZERO(&wdfs or &rdfs);
FD_SET(fd, &wdfs or &rdfs);

select(fd + 1, &rdfs, NULL, NULL, NULL);
or
select(fd + 1, NULL, &wdfs, NULL, NULL);

Every time the select system call is called, the caller will wait for
the completion of DMA or CPU access to the shared buffer if someone
is accessing (has locked) the shared buffer. However, if no one is,
the select system call returns at once.
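The fcntl usage above can be condensed into two small helpers. This is a minimal sketch assuming only the standard fcntl(2) record-lock interface; `dmabuf_user_lock` and `dmabuf_user_unlock` are hypothetical names, and on a regular file fd the same calls exercise ordinary POSIX record locks, which is how the snippet can be tried without the dmabuf-sync patches.

```c
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <assert.h>

/* Lock the entire buffer behind fd. type is F_WRLCK or F_RDLCK;
 * F_SETLKW blocks until the lock is granted (use F_SETLK to fail
 * immediately instead). Returns 0 on success, -1 on error. */
static int dmabuf_user_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_CUR;
    fl.l_start = 0;
    fl.l_len = 0;   /* 0 means the entire region */

    return fcntl(fd, F_SETLKW, &fl);
}

/* Drop the lock taken by dmabuf_user_lock(). */
static int dmabuf_user_unlock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_CUR;

    return fcntl(fd, F_SETLKW, &fl);
}
```

As noted above, close(dmabuf fd) would also drop the lock.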

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  285 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |4 +
 drivers/base/dmabuf-sync.c |  678 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  190 +++
 7 files changed, 1181 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..8023d06
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,285 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization mechanism between DMA and DMA, CPU and DMA, and
+CPU and CPU.
+
+The DMA Buffer synchronization API provides a buffer synchronization mechanism,
+i.e., buffer access control for CPU and DMA, and easy-to-use interfaces for
+device drivers and user applications. And this API can be used for all dma
+devices using system memory as dma buffer, especially for most ARM based SoCs.
+
+
+Motivation
+--
+
+Buffer synchronization issue between DMA and DMA:
+   Sharing a buffer, a device cannot be aware of when the other device
+   will access the shared

About buffer sychronization mechanism and cache operation

2013-08-12 Thread Inki Dae
 may need Linux generic buffer synchronization mechanism that 
uses only Linux standard interfaces (dmabuf) including user land interfaces 
(fcntl and select system calls), and the dmabuf sync framework could meet it.


Thanks,
Inki Dae



[RFC PATCH 0/2 v5] Introduce buffer synchronization framework

2013-07-11 Thread Inki Dae
Hi all,

This patch set introduces a buffer synchronization framework based
on DMA BUF[1] and based on ww-mutexes[2] for lock mechanism.

The purpose of this framework is to provide not only buffer access
control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
easy-to-use interfaces for device drivers and user applications.
In addition, this patch set suggests a way of enhancing performance.

For the generic user mode interface, we have used the fcntl system call[3].
As you know, a user application sees a buffer object as a dma-buf file
descriptor. So an fcntl() call with the file descriptor means locking
some buffer region being managed by the dma-buf object.
For more detail, you can refer to dma-buf-sync.txt in Documentation/.

Moreover, we had tried to find out how we could utilize limited hardware
resources more using a buffer synchronization mechanism. And finally,
we have realized that it could enhance performance using multiple threads
with this buffer synchronization mechanism: DMA and CPU work individually,
so the CPU could perform other work while DMA is performing some work,
and vice versa.

However, in the conventional way, that is not easy to do, because the DMA
operation depends on the CPU operation, and vice versa.

Conventional way:
User Kernel
-
CPU writes something to src
send the src to driver->
 update DMA register
request DMA start(1)--->
 DMA start
<-completion signal(2)--
CPU accesses dst

(1) Request DMA start after the CPU access to src buffer is completed.
(2) Access dst buffer after DMA access to the dst buffer is completed.

On the other hand, what if there were something to control buffer access
between CPU and DMA? The below shows that:

User(thread a)  User(thread b)Kernel
-
send a src to driver-->
  update DMA register
lock the src
request DMA start(1)-->
CPU access to src
unlock the srclock src and dst
  DMA start
<-completion signal(2)-
lock dst  DMA completion
CPU access to dst unlock src and dst
unlock DST

(1) Try to start DMA operation while CPU is accessing the src buffer.
(2) Try CPU access to dst buffer while DMA is accessing the dst buffer.

This means that CPU or DMA could do more work.

In the same way, we could reduce handshaking overhead between
two processes when those processes need to share a buffer.
There may be other cases where we could reduce overhead as well.
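The pipelining in the second diagram can be modeled in user space with one lock per buffer. The toy sketch below uses pthreads rather than the (kernel-side) dmabuf-sync API, so a "CPU" thread fills one buffer while a "DMA" thread consumes the other; all names are illustrative, not part of any proposed interface.

```c
#include <pthread.h>

#define NBUF  2      /* double buffering, as in the diagram */
#define ITERS 8

/* Each buffer carries its own lock, mimicking per-dmabuf locking. */
struct pipe_buf {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int data;
    int full;        /* 1 after the "CPU" writes, 0 after the "DMA" consumes */
};

static struct pipe_buf bufs[NBUF];
static long consumer_sum;

/* "CPU" side: lock a buffer, fill it, mark it full, move to the next. */
static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        struct pipe_buf *b = &bufs[i % NBUF];
        pthread_mutex_lock(&b->lock);
        while (b->full)
            pthread_cond_wait(&b->cond, &b->lock);
        b->data = i;
        b->full = 1;
        pthread_cond_signal(&b->cond);
        pthread_mutex_unlock(&b->lock);
    }
    return NULL;
}

/* "DMA" side: wait for a full buffer, consume it, release it. */
static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        struct pipe_buf *b = &bufs[i % NBUF];
        pthread_mutex_lock(&b->lock);
        while (!b->full)
            pthread_cond_wait(&b->cond, &b->lock);
        consumer_sum += b->data;
        b->full = 0;
        pthread_cond_signal(&b->cond);
        pthread_mutex_unlock(&b->lock);
    }
    return NULL;
}

/* Run both sides concurrently; returns 0 + 1 + ... + (ITERS - 1). */
static long run_pipeline(void)
{
    pthread_t p, c;

    consumer_sum = 0;
    for (int i = 0; i < NBUF; i++) {
        pthread_mutex_init(&bufs[i].lock, NULL);
        pthread_cond_init(&bufs[i].cond, NULL);
        bufs[i].full = 0;
    }
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return consumer_sum;
}
```

Because each buffer has its own lock, the producer can fill buffer i+1 while the consumer drains buffer i, which is exactly the overlap the cover letter argues for.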


References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl


Inki Dae (2):
  [RFC PATCH v5 1/2] dmabuf-sync: Introduce buffer synchronization framework
  [RFC PATCH v1 2/2] dma-buf: add lock callback for fcntl system call

 Documentation/dma-buf-sync.txt |  290 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   37 +++
 drivers/base/dmabuf-sync.c |  674 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  178 +++
 7 files changed, 1203 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

-- 
1.7.5.4



[RFC PATCH v5 1/2] dmabuf-sync: Introduce buffer synchronization framework

2013-07-11 Thread Inki Dae
 CPU access
   to these dmabufs is completed:
dmabuf_sync_unlock(sync);

   And call the following functions to release all resources,
dmabuf_sync_put_all(sync);
dmabuf_sync_fini(sync);

You can refer to actual example codes:
"drm/exynos: add dmabuf sync support for g2d driver" and
"drm/exynos: add dmabuf sync support for kms framework" from
https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/log/?h=dmabuf-sync

And this framework includes fcntl system call[3] as interfaces exported
to user. As you know, user sees a buffer object as a dma-buf file descriptor.
So fcntl() call with the file descriptor means to lock some buffer region being
managed by the dma-buf object.

The below is how to use the interfaces in a user application:
struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. And for more
detail, please refer to [3].

References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  290 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |4 +
 drivers/base/dmabuf-sync.c |  674 
 include/linux/dma-buf.h|   16 +
 include/linux/dmabuf-sync.h|  178 +++
 7 files changed, 1170 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..4427759
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,290 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization mechanism between DMA and DMA, CPU and DMA, and
+CPU and CPU.
+
+The DMA Buffer synchronization API provides a buffer synchronization mechanism,
+i.e., buffer access control for CPU and DMA, and easy-to-use interfaces for
+device drivers and user applications. And this API can be used for all dma
+devices using system memory as dma buffer, especially for most ARM based SoCs.
+
+
+Motivation
+--
+
+Buffer synchronization issue between DMA and DMA:
+   Sharing a buffer, a device cannot be aware of when the other device
+   will access the shared buffer: a device may access a buffer containing
+   wrong data if the device accesses the shared buffer while another
+   device is still accessing the shared buffer.
+   Therefore, a user process should wait for the completion of DMA
+   access by another device before its device tries to access the shared
+   buffer.
+
+Buffer synchronization issue between CPU and DMA:
+   A user process has to consider buffer synchronization when it sends
+   a buffer, filled by CPU, to a device driver for the device driver to
+   access the buffer as an input buffer while CPU and DMA are sharing
+   the buffer.
+   This means that the user process needs to understand how the device
+   driver works. Hence, the conventional mechanism not only makes the
+   user application complicated but also incurs performance overhead.
+
+Buffer synchronization issue between CPU and CPU:
+   In case that two processes share one buffer (shared with DMA as well),
+   they may need some mechanism to allow process B to access the shared
+   buffer after the completion of CPU access by process A.
+   Therefore, process B should wait, using the mechanism, for the
+   completion of CPU access by process A before trying to access the
+   shared buffer.
+
+What is the best way to solve these buffer synchronization issues?
+   We may need a common object that a device driver and a user process
+   notify the common object of when they try to access a

[RFC PATCH v1 2/2] dma-buf: add lock callback for fcntl system call

2013-07-11 Thread Inki Dae
This patch adds lock callback to dma buf file operations,
and this callback will be called by fcntl system call.

With this patch, fcntl system call can be used for buffer
synchronization between CPU and CPU, and CPU and DMA in user mode.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c |   33 +
 1 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 9a26981..e1b8583 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -80,9 +80,42 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
return dmabuf->ops->mmap(dmabuf, vma);
 }
 
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
+   .lock   = dma_buf_lock,
 };
 
 /*
-- 
1.7.5.4



[RFC PATCH v1 2/2] dma-buf: add lock callback for fcntl system call.

2013-07-10 Thread Inki Dae
This patch adds lock callback to dma buf file operations,
and this callback will be called by fcntl system call.

With this patch, fcntl system call can be used for buffer
synchronization between CPU and CPU, and CPU and DMA in user mode.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/base/dma-buf.c |   34 ++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index fe39120..cd71447 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static inline int is_dma_buf_file(struct file *);
 
@@ -82,9 +83,42 @@ static int dma_buf_mmap_internal(struct file *file, struct 
vm_area_struct *vma)
return dmabuf->ops->mmap(dmabuf, vma);
 }
 
+static int dma_buf_lock(struct file *file, int cmd, struct file_lock *fl)
+{
+   struct dma_buf *dmabuf;
+   unsigned int type;
+   bool wait = false;
+
+   if (!is_dma_buf_file(file))
+   return -EINVAL;
+
+   dmabuf = file->private_data;
+
+   if ((fl->fl_type & F_UNLCK) == F_UNLCK) {
+   dmabuf_sync_single_unlock(dmabuf);
+   return 0;
+   }
+
+   /* convert flock type to dmabuf sync type. */
+   if ((fl->fl_type & F_WRLCK) == F_WRLCK)
+   type = DMA_BUF_ACCESS_W;
+   else if ((fl->fl_type & F_RDLCK) == F_RDLCK)
+   type = DMA_BUF_ACCESS_R;
+   else
+   return -EINVAL;
+
+   if (fl->fl_flags & FL_SLEEP)
+   wait = true;
+
+   /* TODO. the locking to certain region should also be considered. */
+
+   return dmabuf_sync_single_lock(dmabuf, type, wait);
+}
+
 static const struct file_operations dma_buf_fops = {
.release= dma_buf_release,
.mmap   = dma_buf_mmap_internal,
+   .lock   = dma_buf_lock,
 };
 
 /*
-- 
1.7.5.4



[RFC PATCH v4 1/2] dmabuf-sync: Introduce buffer synchronization framework

2013-07-10 Thread Inki Dae
commit/?h=dmabuf-sync&id=4030bdee9bab5841ad32faade528d04cc0c5fc94


https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/

commit/?h=dmabuf-sync&id=6ca548e9ea9e865592719ef6b1cde58366af9f5c

And this framework includes fcntl system call[4] as interfaces exported
to user. As you know, user sees a buffer object as a dma-buf file descriptor.
So fcntl() call with the file descriptor means to lock some buffer region being
managed by the dma-buf object.

The below is how to use the interfaces in a user application:
struct flock filelock;

1. Lock a dma buf:
filelock.l_type = F_WRLCK or F_RDLCK;

/* lock entire region to the dma buf. */
filelock.l_whence = SEEK_CUR;
filelock.l_start = 0;
filelock.l_len = 0;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);
...
CPU access to the dma buf

2. Unlock a dma buf:
filelock.l_type = F_UNLCK;

fcntl(dmabuf fd, F_SETLKW or F_SETLK, &filelock);

close(dmabuf fd) call would also unlock the dma buf. And for more
detail, please refer to [4].


References:
[1] http://lwn.net/Articles/470339/
[2] http://lwn.net/Articles/532616/
[3] https://patchwork.kernel.org/patch/2625361/
[4] http://linux.die.net/man/2/fcntl

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  283 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dmabuf-sync.c |  661 
 include/linux/dma-buf.h|   14 +
 include/linux/dmabuf-sync.h|  132 
 include/linux/reservation.h|9 +
 7 files changed, 1107 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..9d12d00
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,283 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization mechanism between DMA and DMA, CPU and DMA, and
+CPU and CPU.
+
+The DMA Buffer synchronization API provides a buffer synchronization mechanism,
+i.e., buffer access control for CPU and DMA, and easy-to-use interfaces for
+device drivers and user applications. And this API can be used for all dma
+devices using system memory as dma buffer, especially for most ARM based SoCs.
+
+
+Motivation
+--
+
+Buffer synchronization issue between DMA and DMA:
+   Sharing a buffer, a device cannot be aware of when the other device
+   will access the shared buffer: a device may access a buffer containing
+   wrong data if the device accesses the shared buffer while another
+   device is still accessing the shared buffer.
+   Therefore, a user process should wait for the completion of DMA
+   access by another device before its device tries to access the shared
+   buffer.
+
+Buffer synchronization issue between CPU and DMA:
+   A user process has to consider buffer synchronization when it sends
+   a buffer, filled by CPU, to a device driver for the device driver to
+   access the buffer as an input buffer while CPU and DMA are sharing
+   the buffer.
+   This means that the user process needs to understand how the device
+   driver works. Hence, the conventional mechanism not only makes the
+   user application complicated but also incurs performance overhead.
+
+Buffer synchronization issue between CPU and CPU:
+   In case that two processes share one buffer (shared with DMA as well),
+   they may need some mechanism to allow process B to access the shared
+   buffer after the completion of CPU access by process A.
+   Therefore, process B should wait, using the mechanism, for the
+   completion of CPU access by process A before trying to access the
+   shared buffer.
+
+What is the best way to solve these buffer synchronization issues?
+   We may need a common object that a device driver and a user process
+   notify the common object of when they try to access a shared buffer.
+   That way we could decide when we have to allow or not to allow for CPU
+   or DMA to access the shared buffer through the common object.
+   If so, what could become the common object? Right, that's a dma-buf[1].
+   Now we have already been using the dma-buf to share one buffer with
+   o

[RFC PATCH v4 0/2] Introduce buffer synchronization framework

2013-07-10 Thread Inki Dae
Hi all,

This patch set introduces a buffer synchronization framework based
on DMA BUF[1] and reservation[2] to use dma-buf resource, and based
on ww-mutexes[3] for lock mechanism.

The purpose of this framework is to provide not only buffer access
control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
easy-to-use interfaces for device drivers and user applications.
In addition, this patch set suggests a way of enhancing performance.

To implement a generic user mode interface, we have used the fcntl system
call[4]. As you know, a user application sees a buffer object as
a dma-buf file descriptor. So an fcntl() call with the file descriptor
means locking some buffer region being managed by the dma-buf object.
For more detail, you can refer to dma-buf-sync.txt in Documentation/.

Moreover, we had tried to find out how we could utilize limited hardware
resources more using a buffer synchronization mechanism. And finally,
we have realized that it could enhance performance using multiple threads
with this buffer synchronization mechanism: DMA and CPU work individually,
so the CPU could perform other work while DMA is performing some work,
and vice versa.

However, in the conventional way, that is not easy to do, because the DMA
operation depends on the CPU operation, and vice versa.

Conventional way:
User Kernel
-
CPU writes something to src
send the src to driver->
 update DMA register
request DMA start(1)--->
 DMA start
<-completion signal(2)--
CPU accesses dst

(1) Request DMA start after the CPU access to src buffer is completed.
(2) Access dst buffer after DMA access to the dst buffer is completed.

On the other hand, what if there were something to control buffer access
between CPU and DMA? The below shows that:

User(thread a)  User(thread b)Kernel
-
send a src to driver-->
  update DMA register
lock the src
request DMA start(1)-->
CPU access to src
unlock the srclock src and dst
  DMA start
<-completion signal(2)-
lock dst  DMA completion
CPU access to dst unlock src and dst
unlock DST

(1) Try to start DMA operation while CPU is accessing the src buffer.
(2) Try CPU access to dst buffer while DMA is accessing the dst buffer.

This means that the CPU or DMA could get more work done.

In the same way, we could reduce the handshaking overhead between
two processes when those processes need to share a buffer.
There may be other cases where we could reduce overhead as well.


References:
[1] http://lwn.net/Articles/470339/
[2] http://lwn.net/Articles/532616/
[3] https://patchwork.kernel.org/patch/2625361/
[4] http://linux.die.net/man/2/fcntl

Inki Dae (2):
  dmabuf-sync: Introduce buffer synchronization framework
  dma-buf: add lock callback for fcntl system call.

 Documentation/dma-buf-sync.txt |  283 +
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dma-buf.c |   34 ++
 drivers/base/dmabuf-sync.c |  661 
 include/linux/dma-buf.h|   14 +
 include/linux/dmabuf-sync.h|  132 
 include/linux/reservation.h|9 +
 8 files changed, 1141 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

-- 
1.7.5.4

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework

2013-06-26 Thread Inki Dae
2013/6/25 Jerome Glisse :
> On Tue, Jun 25, 2013 at 10:17 AM, Inki Dae  wrote:
>> 2013/6/25 Rob Clark :
>>> On Tue, Jun 25, 2013 at 5:09 AM, Inki Dae  wrote:
>>>>> that
>>>>> should be the role of kernel memory management which of course needs
>>>>> synchronization btw A and B. But in no case this should be done using
>>>>> dma-buf. dma-buf is for sharing content btw different devices not
>>>>> sharing resources.
>>>>>
>>>>
>>>> hmm, is that true? And are you sure? Then how do you think about
>>>> reservation? the reservation also uses dma-buf with same reason as long
>>>> as I
>>>> know: actually, we use reservation to use dma-buf. As you may know, a
>>>> reservation object is allocated and initialized when a buffer object is
>>>> exported to a dma buf.
>>>
>>> no, this is why the reservation object can be passed in when you
>>> construction the dmabuf.
>>
>> Right, that way, we could use dma buf for buffer synchronization. I
>> just wanted to ask for why Jerome said that "dma-buf is for sharing
>> content btw different devices not sharing resources".
>
> From memory, the motivation of dma-buf was to done for few use case,
> among them webcam capturing frame into a buffer and having gpu using
> it directly without memcpy, or one big gpu rendering a scene into a
> buffer that is then use by low power gpu for display ie it was done to
> allow different device to operate on same data using same backing
> memory.
>
> AFAICT you seem to want to use dma-buf to create scratch buffer, ie a
> process needs to use X amount of memory for an operation, it can
> release|free this memory once its done
> and a process B can the use
> this X memory for its own operation discarding content of process A.
> presume that next frame would have the sequence repeat, process A do
> something, then process B does its thing.
> So to me it sounds like you
> want to implement global scratch buffer using the dmabuf API and that
> sounds bad to me.
>
> I know most closed driver have several pool of memory, long lived
> object, short lived object and scratch space, then user space allocate
> from one of this pool and there is synchronization done by driver
> using driver specific API to reclaim memory.
> Of course this work
> nicely if you only talking about one logic block or at very least hw
> that have one memory controller.
>
> Now if you are thinking of doing scratch buffer for several different
> device and share the memory among then you need to be aware of
> security implication, most obvious being that you don't want process B
> being able to read process A scratch memory.
> I know the argument about
> it being graphic but one day this might become gpu code and it might
> be able to insert jump to malicious gpu code.
>

If you think so, there is *definitely* a misunderstanding. My approach
is similar to dma fence: it guarantees that one DMA cannot access a
buffer while another DMA is accessing that buffer. I guess some gpu
drivers in mainline have been using device-specific mechanisms for
this. And as for the portion you commented on, please note that I just
introduced a user-side mechanism for buffer synchronization between CPU
and CPU, and CPU and DMA, in addition; not implemented yet, just planned.

Thanks,
Inki Dae

> Cheers,
> Jerome


Re: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework

2013-06-25 Thread Inki Dae
2013/6/25 Rob Clark :
> On Tue, Jun 25, 2013 at 5:09 AM, Inki Dae  wrote:
>>> that
>>> should be the role of kernel memory management which of course needs
>>> synchronization btw A and B. But in no case this should be done using
>>> dma-buf. dma-buf is for sharing content btw different devices not
>>> sharing resources.
>>>
>>
>> hmm, is that true? And are you sure? Then how do you think about
>> reservation? the reservation also uses dma-buf with same reason as long as I
>> know: actually, we use reservation to use dma-buf. As you may know, a
>> reservation object is allocated and initialized when a buffer object is
>> exported to a dma buf.
>
> no, this is why the reservation object can be passed in when you
> construction the dmabuf.

Right, that way, we could use dma buf for buffer synchronization. I
just wanted to ask for why Jerome said that "dma-buf is for sharing
content btw different devices not sharing resources".

> The fallback is for dmabuf to create it's
> own, for compatibility and to make life easier for simple devices with
> few buffers... but I think pretty much all drm drivers would embed the
> reservation object in the gem buffer and pass it in when the dmabuf is
> created.
>
> It is pretty much imperative that synchronization works independently
> of dmabuf, you really don't want to have two different cases to deal
> with in your driver, one for synchronizing non-exported objects, and
> one for synchronizing dmabuf objects.
>

Now my approach concentrates on the most basic implementation: a buffer
synchronization mechanism between CPU and CPU, CPU and DMA, and DMA and
DMA. But I think reservation could be used for other purposes, such as
pipeline synchronization, independently of dmabuf as you said. Actually,
I had already implemented a pipeline synchronization mechanism using the
reservation: in the case of the MALI-400 DDK, there was a pipeline issue
between gp and pp jobs, and we solved the issue using a pipeline
synchronization mechanism based on the reservation. So we could add more
features anytime, including those two different cases, dmabuf objects
and non-exported objects, if needed, because we are using the
reservation object.

Thanks,
Inki Dae

> BR,
> -R


Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-21 Thread Inki Dae
2013/6/21 Lucas Stach :
> Hi Inki,
>
> please refrain from sending HTML Mails, it makes proper quoting without
> messing up the layout everywhere pretty hard.
>

Sorry about that. I should have used text mode.

> Am Freitag, den 21.06.2013, 20:01 +0900 schrieb Inki Dae:
> [...]
>
>> Yeah, you'll some knowledge and understanding about the API
>> you are
>> working with to get things right. But I think it's not an
>> unreasonable
>> thing to expect the programmer working directly with kernel
>> interfaces
>> to read up on how things work.
>>
>> Second thing: I'll rather have *one* consistent API for every
>> subsystem,
>> even if they differ from each other than having to implement
>> this
>> syncpoint thing in every subsystem. Remember: a single execbuf
>> in DRM
>> might reference both GEM objects backed by dma-buf as well
>> native SHM or
>> CMA backed objects. The dma-buf-mgr proposal already allows
>> you to
>> handle dma-bufs much the same way during validation than
>> native GEM
>> objects.
>>
>> Actually, at first I had implemented a fence helper framework based on
>> reservation and dma fence to provide easy-use-interface for device
>> drivers. However, that was wrong implemention: I had not only
>> customized the dma fence but also not considered dead lock issue.
>> After that, I have reimplemented it as dmabuf sync to solve dead
>> issue, and at that time, I realized that we first need to concentrate
>> on the most basic thing: the fact CPU and CPU, CPU and DMA, or DMA and
>> DMA can access a same buffer, And the fact simple is the best, and the
>> fact we need not only kernel side but also user side interfaces. After
>> that, I collected what is the common part for all subsystems, and I
>> have devised this dmabuf sync framework for it. I'm not really
>> specialist in Desktop world. So question. isn't the execbuf used only
>> for the GPU? the gpu has dedicated video memory(VRAM) so it needs
>> migration mechanism between system memory and the dedicated video
>> memory, and also to consider ordering issue while be migrated.
>>
>
> Yeah, execbuf is pretty GPU specific, but I don't see how this matters
> for this discussion. Also I don't see a big difference between embedded
> and desktop GPUs. Buffer migration is more of a detail here. Both take
> command stream that potentially reference other buffers, which might be
> native GEM or dma-buf backed objects. Both have to make sure the buffers
> are in the right domain (caches cleaned and address mappings set up) and
> are available for the desired operation, meaning you have to sync with
> other DMA engines and maybe also with CPU.

Yeah, right. Then, in the case of a desktop gpu, doesn't it need
something additional to be done when a buffer is migrated from the
system memory domain to the video memory domain, or the other way
around? I guess the below members do something similar, and all other
DMA devices would not need them:
struct fence {
  ...
  unsigned int context, seqno;
  ...
};

And,
   struct seqno_fence {
 ...
 uint32_t seqno_ofs;
 ...
   };

>
> The only case where sync isn't clearly defined right now by the current
> API entrypoints is when you access memory through the dma-buf fallback
> mmap support, which might happen with some software processing element
> in a video pipeline or something. I agree that we will need a userspace
> interface here, but I think this shouldn't be yet another sync object,
> but rather more a prepare/fini_cpu_access ioctl on the dma-buf which
> hooks into the existing dma-fence and reservation stuff.

I think we don't need additional ioctl commands for that; I am thinking
of reusing existing resources as much as possible. My idea is also
similar to yours in using the reservation stuff, because my approach
should use the dma-buf resource as well. However, my idea is that a user
process that wants buffer synchronization with another sees a sync
object as a file descriptor, like dma-buf does. The below shows my idea
in brief:
ioctl(dmabuf_fd, DMA_BUF_IOC_OPEN_SYNC, &sync);

flock(sync->fd, LOCK_SH); <- LOCK_SH means a shared lock.
CPU access for read
flock(sync->fd, LOCK_UN);

Or

flock(sync->fd, LOCK_EX); <- LOCK_EX means an exclusive lock
CPU access for write
flock(sync->fd, LOCK_UN);

close(sync->fd);

As you know, that's similar to the dmabuf export feature.

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-20 Thread Inki Dae


> -Original Message-
> From: Lucas Stach [mailto:l.st...@pengutronix.de]
> Sent: Thursday, June 20, 2013 7:11 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
> 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> Am Donnerstag, den 20.06.2013, 17:24 +0900 schrieb Inki Dae:
> [...]
> > > > In addition, please see the below more detail examples.
> > > >
> > > > The conventional way (without dmabuf-sync) is:
> > > > Task A
> > > > 
> > > >  1. CPU accesses buf
> > > >  2. Send the buf to Task B
> > > >  3. Wait for the buf from Task B
> > > >  4. go to 1
> > > >
> > > > Task B
> > > > ---
> > > > 1. Wait for the buf from Task A
> > > > 2. qbuf the buf
> > > > 2.1 insert the buf to incoming queue
> > > > 3. stream on
> > > > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > > > 3.2 get the buf from ready queue, and dma start.
> > > > 4. dqbuf
> > > > 4.1 dma_unmap_sg after dma operation completion
> > > > 4.2 move the buf to outgoing queue
> > > > 5. back the buf to Task A
> > > > 6. go to 1
> > > >
> > > > In case that two tasks share buffers, and data flow goes from Task A
> to
> > > Task
> > > > B, we would need IPC operation to send and receive buffers properly
> > > between
> > > > those two tasks every time CPU or DMA access to buffers is started
> or
> > > > completed.
> > > >
> > > >
> > > > With dmabuf-sync:
> > > >
> > > > Task A
> > > > 
> > > >  1. dma_buf_sync_lock <- synpoint (call by user side)
> > > >  2. CPU accesses buf
> > > >  3. dma_buf_sync_unlock <- syncpoint (call by user side)
> > > >  4. Send the buf to Task B (just one time)
> > > >  5. go to 1
> > > >
> > > >
> > > > Task B
> > > > ---
> > > > 1. Wait for the buf from Task A (just one time)
> > > > 2. qbuf the buf
> > > > 1.1 insert the buf to incoming queue
> > > > 3. stream on
> > > > 3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
> > > > 3.2 dma_map_sg if ready, and move the buf to ready queue
> > > > 3.3 get the buf from ready queue, and dma start.
> > > > 4. dqbuf
> > > > 4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
> > > > 4.2 dma_unmap_sg after dma operation completion
> > > > 4.3 move the buf to outgoing queue
> > > > 5. go to 1
> > > >
> > > > On the other hand, in case of using dmabuf-sync, as you can see the
> > > above
> > > > example, we would need IPC operation just one time. That way, I
> think we
> > > > could not only reduce performance overhead but also make user
> > > application
> > > > simplified. Of course, this approach can be used for all DMA device
> > > drivers
> > > > such as DRM. I'm not a specialist in v4l2 world so there may be
> missing
> > > > point.
> > > >
> > >
> > > You already need some kind of IPC between the two tasks, as I suspect
> > > even in your example it wouldn't make much sense to queue the buffer
> > > over and over again in task B without task A writing anything to it.
> So
> > > task A has to signal task B there is new data in the buffer to be
> > > processed.
> > >
> > > There is no need to share the buffer over and over again just to get
> the
> > > two processes to work together on the same thing. Just share the fd
> > > between both and then do out-of-band completion signaling, as you need
> > > this anyway. Without this you'll end up with unpredictable behavior.
> > > Just because sync allows you to access the buffer doesn't mean it's
> > > valid for your use-case. Without completion signaling you could easily
> > > end up overwriting your data from task A multiple times before task B
> > > even tries

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-20 Thread Inki Dae


> -Original Message-
> From: Lucas Stach [mailto:l.st...@pengutronix.de]
> Sent: Thursday, June 20, 2013 4:47 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'Inki Dae'; 'linux-fbdev'; 'YoungJun Cho';
> 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> Am Donnerstag, den 20.06.2013, 15:43 +0900 schrieb Inki Dae:
> >
> > > -Original Message-
> > > From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
> > > [mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org]
> On
> > > Behalf Of Russell King - ARM Linux
> > > Sent: Thursday, June 20, 2013 3:29 AM
> > > To: Inki Dae
> > > Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
> YoungJun
> > > Cho; linux-media@vger.kernel.org; linux-arm-ker...@lists.infradead.org
> > > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer
> synchronization
> > > framework
> > >
> > > On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > > > On the other hand, the below shows how we could enhance the
> conventional
> > > > way with my approach (just example):
> > > >
> > > > CPU -> DMA,
> > > > ioctl(qbuf command)  ioctl(streamon)
> > > >   |   |
> > > >   |   |
> > > > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> > > >
> > > > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> > > And
> > > > the syncpoint is performed by calling dma_buf_sync_lock(), and then
> DMA
> > > > accesses the sync buffer.
> > > >
> > > > And DMA -> CPU,
> > > > ioctl(dqbuf command)
> > > >   |
> > > >   |
> > > > dqbuf <- nothing to do
> > > >
> > > > Actual syncpoint is when DMA operation is completed (in interrupt
> > > handler):
> > > > the syncpoint is performed by calling dma_buf_sync_unlock().
> > > > Hence,  my approach is to move the syncpoints into just before dma
> > > access
> > > > as long as possible.
> > >
> > > What you've just described does *not* work on architectures such as
> > > ARMv7 which do speculative cache fetches from memory at any time that
> > > that memory is mapped with a cacheable status, and will lead to data
> > > corruption.
> >
> > I didn't explain that enough. Sorry about that. 'nothing to do' means
> that a
> > dmabuf sync interface isn't called but existing functions are called. So
> > this may be explained again:
> > ioctl(dqbuf command)
> > |
> > |
> > dqbuf <- 1. dma_unmap_sg
> > 2. dma_buf_sync_unlock (syncpoint)
> >
> > The syncpoint I mentioned means lock mechanism; not doing cache
> operation.
> >
> > In addition, please see the below more detail examples.
> >
> > The conventional way (without dmabuf-sync) is:
> > Task A
> > 
> >  1. CPU accesses buf
> >  2. Send the buf to Task B
> >  3. Wait for the buf from Task B
> >  4. go to 1
> >
> > Task B
> > ---
> > 1. Wait for the buf from Task A
> > 2. qbuf the buf
> > 2.1 insert the buf to incoming queue
> > 3. stream on
> > 3.1 dma_map_sg if ready, and move the buf to ready queue
> > 3.2 get the buf from ready queue, and dma start.
> > 4. dqbuf
> > 4.1 dma_unmap_sg after dma operation completion
> > 4.2 move the buf to outgoing queue
> > 5. back the buf to Task A
> > 6. go to 1
> >
> > In case that two tasks share buffers, and data flow goes from Task A to
> Task
> > B, we would need IPC operation to send and receive buffers properly
> between
> > those two tasks every time CPU or DMA access to buffers is started or
> > completed.
> >
> >
> > With dmabuf-sync:
> >
> > Task A
> > 
> >  1. dma_buf_sync_lock <- synpoint (call by user side)
> >  2. CPU accesses buf
> >  3. dma_buf_sync_unlock <- syncpoint (call by user side)

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-19 Thread Inki Dae


> -Original Message-
> From: dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org
> [mailto:dri-devel-bounces+inki.dae=samsung@lists.freedesktop.org] On
> Behalf Of Russell King - ARM Linux
> Sent: Thursday, June 20, 2013 3:29 AM
> To: Inki Dae
> Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> Cho; linux-media@vger.kernel.org; linux-arm-ker...@lists.infradead.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Thu, Jun 20, 2013 at 12:10:04AM +0900, Inki Dae wrote:
> > On the other hand, the below shows how we could enhance the conventional
> > way with my approach (just example):
> >
> > CPU -> DMA,
> > ioctl(qbuf command)  ioctl(streamon)
> >   |   |
> >   |   |
> > qbuf  <- dma_buf_sync_get   start streaming <- syncpoint
> >
> > dma_buf_sync_get just registers a sync buffer(dmabuf) to sync object.
> And
> > the syncpoint is performed by calling dma_buf_sync_lock(), and then DMA
> > accesses the sync buffer.
> >
> > And DMA -> CPU,
> > ioctl(dqbuf command)
> >   |
> >   |
> > dqbuf <- nothing to do
> >
> > Actual syncpoint is when DMA operation is completed (in interrupt
> handler):
> > the syncpoint is performed by calling dma_buf_sync_unlock().
> > Hence,  my approach is to move the syncpoints into just before dma
> access
> > as long as possible.
> 
> What you've just described does *not* work on architectures such as
> ARMv7 which do speculative cache fetches from memory at any time that
> that memory is mapped with a cacheable status, and will lead to data
> corruption.

I didn't explain that enough; sorry about that. 'Nothing to do' means that a
dmabuf sync interface isn't called, but the existing functions are called. So
this may be explained again:
ioctl(dqbuf command)
|
|
dqbuf <- 1. dma_unmap_sg
2. dma_buf_sync_unlock (syncpoint)

The syncpoint I mentioned means lock mechanism; not doing cache operation.

In addition, please see the below more detail examples.

The conventional way (without dmabuf-sync) is:
Task A 

 1. CPU accesses buf  
 2. Send the buf to Task B  
 3. Wait for the buf from Task B
 4. go to 1

Task B
---
1. Wait for the buf from Task A
2. qbuf the buf 
2.1 insert the buf to incoming queue
3. stream on
3.1 dma_map_sg if ready, and move the buf to ready queue
3.2 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_unmap_sg after dma operation completion
4.2 move the buf to outgoing queue
5. back the buf to Task A
6. go to 1

In case that two tasks share buffers, and data flow goes from Task A to Task
B, we would need IPC operation to send and receive buffers properly between
those two tasks every time CPU or DMA access to buffers is started or
completed.


With dmabuf-sync:

Task A 

 1. dma_buf_sync_lock <- syncpoint (call by user side)
 2. CPU accesses buf  
 3. dma_buf_sync_unlock <- syncpoint (call by user side)
 4. Send the buf to Task B (just one time)
 5. go to 1


Task B
---
1. Wait for the buf from Task A (just one time)
2. qbuf the buf 
1.1 insert the buf to incoming queue
3. stream on
3.1 dma_buf_sync_lock <- syncpoint (call by kernel side)
3.2 dma_map_sg if ready, and move the buf to ready queue
3.3 get the buf from ready queue, and dma start.
4. dqbuf
4.1 dma_buf_sync_unlock <- syncpoint (call by kernel side)
4.2 dma_unmap_sg after dma operation completion
4.3 move the buf to outgoing queue
5. go to 1

On the other hand, in the case of using dmabuf-sync, as you can see in the
above example, we would need the IPC operation just one time. That way, I
think we could not only reduce performance overhead but also simplify the
user application. Of course, this approach can be used by all DMA device
drivers such as DRM. I'm not a specialist in the v4l2 world, so there may be
missing points.

Thanks,
Inki Dae

> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel



RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-19 Thread Inki Dae


> -Original Message-
> From: Lucas Stach [mailto:l.st...@pengutronix.de]
> Sent: Wednesday, June 19, 2013 7:22 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'linux-fbdev'; 'Kyungmin Park'; 'DRI
> mailing list'; 'myungjoo.ham'; 'YoungJun Cho'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> Am Mittwoch, den 19.06.2013, 14:45 +0900 schrieb Inki Dae:
> >
> > > -Original Message-
> > > From: Lucas Stach [mailto:l.st...@pengutronix.de]
> > > Sent: Tuesday, June 18, 2013 6:47 PM
> > > To: Inki Dae
> > > Cc: 'Russell King - ARM Linux'; 'linux-fbdev'; 'Kyungmin Park'; 'DRI
> > > mailing list'; 'myungjoo.ham'; 'YoungJun Cho'; linux-arm-
> > > ker...@lists.infradead.org; linux-media@vger.kernel.org
> > > Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer
> synchronization
> > > framework
> > >
> > > Am Dienstag, den 18.06.2013, 18:04 +0900 schrieb Inki Dae:
> > > [...]
> > > >
> > > > > a display device driver.  It shouldn't be used within a single
> driver
> > > > > as a means of passing buffers between userspace and kernel space.
> > > >
> > > > What I try to do is not really such ugly thing. What I try to do is
> to
> > > > notify that, when CPU tries to access a buffer , to kernel side
> through
> > > > dmabuf interface. So it's not really to send the buffer to kernel.
> > > >
> > > > Thanks,
> > > > Inki Dae
> > > >
> > > The most basic question about why you are trying to implement this
> sort
> > > of thing in the dma_buf framework still stands.
> > >
> > > Once you imported a dma_buf into your DRM driver it's a GEM object and
> > > you can and should use the native DRM ioctls to prepare/end a CPU
> access
> > > to this BO. Then internally to your driver you can use the dma_buf
> > > reservation/fence stuff to provide the necessary cross-device sync.
> > >
> >
> > I don't really want that is used only for DRM drivers. We really need
> > it for all other DMA devices; i.e., v4l2 based drivers. That is what I
> > try to do. And my approach uses reservation to use dma-buf resources
> > but not dma fence stuff anymore. However, I'm looking into Radeon DRM
> > driver for why we need dma fence stuff, and how we can use it if
> > needed.
> >
> 
> Still I don't see the point why you need syncpoints above dma-buf. In
> both the DRM and the V4L2 world we have defined points in the API where
> a buffer is allowed to change domain from device to CPU and vice versa.
> 
> In DRM if you want to access a buffer with the CPU you do a cpu_prepare.
> The buffer changes back to GPU domain once you do the execbuf
> validation, queue a pageflip to the buffer or similar things.
> 
> In V4L2 the syncpoints for cache operations are the queue/dequeue API
> entry points. Those are also the exact points to synchronize with other
> hardware thus using dma-buf reserve/fence.


If so, what if we want to access a buffer with the CPU _in V4L2_? Should we
open a drm device node, and then do a cpu_prepare?

Thanks,
Inki Dae

> 
> In all this I can't see any need for a new syncpoint primitive slapped
> on top of dma-buf.
> 
> Regards,
> Lucas
> --
> Pengutronix e.K.   | Lucas Stach |
> Industrial Linux Solutions | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
> Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |



[RFC PATCH v3] dmabuf-sync: Introduce buffer synchronization framework

2013-06-19 Thread Inki Dae
nel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/

commit/?h=dmabuf-sync&id=6ca548e9ea9e865592719ef6b1cde58366af9f5c

[1] http://lwn.net/Articles/470339/
[2] http://lwn.net/Articles/532616/
[3] https://patchwork.kernel.org/patch/2625361/

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  199 
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dmabuf-sync.c |  501 
 include/linux/dma-buf.h|   14 ++
 include/linux/dmabuf-sync.h|  115 +
 include/linux/reservation.h|7 +
 7 files changed, 844 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..134de7b
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,199 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization mechanism between DMA and DMA, CPU and DMA, and
+CPU and CPU.
+
+The DMA Buffer synchronization API provides buffer synchronization mechanism;
+i.e., buffer access control to CPU and DMA, and easy-to-use interfaces for
+device drivers and potentially user application (not implemented for user
+applications, yet). And this API can be used for all dma devices using system
+memory as dma buffer, especially for most ARM based SoCs.
+
+
+Motivation
+--
+
+Buffer synchronization issue between DMA and DMA:
+   Sharing a buffer, a device cannot be aware of when the other device
+   will access the shared buffer: a device may access a buffer containing
+   wrong data if the device accesses the shared buffer while another
+   device is still accessing the shared buffer.
+   Therefore, a user process should have waited for the completion of DMA
+   access by another device before a device tries to access the shared
+   buffer.
+
+Buffer synchronization issue between CPU and DMA:
+   A user process should consider that when having to send a buffer, filled
+   by CPU, to a device driver for the device driver to access the buffer as
+   a input buffer while CPU and DMA are sharing the buffer.
+   This means that the user process needs to understand how the device
+   driver is worked. Hence, the conventional mechanism not only makes
+   user application complicated but also incurs performance overhead.
+
+Buffer synchronization issue between CPU and CPU:
+   In case that two processes share one buffer; shared with DMA also,
+   they may need some mechanism to allow process B to access the shared
+   buffer after the completion of CPU access by process A.
+   Therefore, process B should have waited for the completion of CPU access
+   by process A using the mechanism before trying to access the shared
+   buffer.
+
+What is the best way to solve these buffer synchronization issues?
+   We may need a common object that a device driver and a user process
+   notify the common object of when they try to access a shared buffer.
+   That way we could decide when we have to allow or not to allow for CPU
+   or DMA to access the shared buffer through the common object.
+   If so, what could become the common object? Right, that's a dma-buf[1].
+   Now we have already been using the dma-buf to share one buffer with
+   other drivers.
+
+
+Basic concept
+-
+
+The mechanism of this framework has the following steps,
+1. Register dmabufs to a sync object - A task gets a new sync object and
+can add one or more dmabufs that the task wants to access.
+This registering should be performed when a device context or an event
+context such as a page flip event is created or before CPU accesses a shared
+buffer.
+
+   dma_buf_sync_get(a sync object, a dmabuf);
+
+2. Lock a sync object - A task tries to lock all dmabufs added in its own
+sync object. Basically, the lock mechanism uses ww-mutexes[2] to avoid dead
+lock issue and for race condition between CPU and CPU, CPU and DMA, and DMA
+and DMA. Taking a lock means that others cannot access all locked dmabufs
+until the task that locked the corresponding dmabufs, unlocks all the locked
+dmabufs.
+This locking should be performed before DMA or CPU accesses these dmabufs.
+
+   dma_buf_sync_lock(a sync object);
+
+3. Unlock a sync object - The task unlocks all dmabufs added in its own

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-18 Thread Inki Dae


> -Original Message-
> From: Lucas Stach [mailto:l.st...@pengutronix.de]
> Sent: Tuesday, June 18, 2013 6:47 PM
> To: Inki Dae
> Cc: 'Russell King - ARM Linux'; 'linux-fbdev'; 'Kyungmin Park'; 'DRI
> mailing list'; 'myungjoo.ham'; 'YoungJun Cho'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Tuesday, 18.06.2013 at 18:04 +0900, Inki Dae wrote:
> [...]
> >
> > > a display device driver.  It shouldn't be used within a single driver
> > > as a means of passing buffers between userspace and kernel space.
> >
> > What I'm trying to do is not really such an ugly thing. What I'm trying
> > to do is to notify the kernel side, through the dmabuf interface, when
> > the CPU tries to access a buffer. So it's not really about sending the
> > buffer to the kernel.
> >
> > Thanks,
> > Inki Dae
> >
> The most basic question about why you are trying to implement this sort
> of thing in the dma_buf framework still stands.
> 
> Once you imported a dma_buf into your DRM driver it's a GEM object and
> you can and should use the native DRM ioctls to prepare/end a CPU access
> to this BO. Then internally to your driver you can use the dma_buf
> reservation/fence stuff to provide the necessary cross-device sync.
> 

I really don't want this to be used only by DRM drivers. We need it for all
other DMA devices too; i.e., v4l2 based drivers. That is what I am trying to
do. And my approach uses reservation to use dma-buf resources, but no longer
the dma fence stuff. However, I'm looking into the Radeon DRM driver to see
why we need the dma fence stuff, and how we can use it if needed.

Thanks,
Inki Dae

> Regards,
> Lucas
> --
> Pengutronix e.K.   | Lucas Stach |
> Industrial Linux Solutions | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
> Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-18 Thread Inki Dae


> -Original Message-
> From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk]
> Sent: Tuesday, June 18, 2013 5:43 PM
> To: Inki Dae
> Cc: 'Maarten Lankhorst'; 'linux-fbdev'; 'Kyungmin Park'; 'DRI mailing
> list'; 'Rob Clark'; 'myungjoo.ham'; 'YoungJun Cho'; 'Daniel Vetter';
> linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Tue, Jun 18, 2013 at 02:27:40PM +0900, Inki Dae wrote:
> > So I'd like to ask the other DRM maintainers: what do you think about
> > it? It seems that Intel DRM (maintained by Daniel), OMAP DRM (maintained
> > by Rob) and the GEM CMA helper also have the same issue Russell pointed
> > out. I think not only the above approach but also performance is very
> > important.
> 
> CMA uses coherent memory to back their buffers, though that might not be
> true of memory obtained from other drivers via dma_buf.  Plus, there is
> no support in the CMA helper for exporting or importng these buffers.
> 

That's no longer the case. Please see Dave's drm-next; dmabuf support for the
CMA helper has recently been merged there.

> I guess Intel i915 is only used on x86, which is a coherent platform and
> requires no cache maintanence for DMA.
> 
> OMAP DRM does not support importing non-DRM buffers buffers back into

Correct. Still a TODO.

> DRM.  Moreover, it will suffer from the problems I described if any
> attempt is made to write to the buffer after it has been re-imported.
> 
> Lastly, I should point out that the dma_buf stuff is really only useful
> when you need to export a dma buffer from one driver and import it into
> another driver - for example to pass data from a camera device driver to

Most people know that.

> a display device driver.  It shouldn't be used within a single driver
> as a means of passing buffers between userspace and kernel space.

What I'm trying to do is not really such an ugly thing. What I'm trying to do
is to notify the kernel side, through the dmabuf interface, when the CPU
tries to access a buffer. So it's not really about sending the buffer to the
kernel.

Thanks,
Inki Dae



RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-17 Thread Inki Dae


> -Original Message-
> From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk]
> Sent: Tuesday, June 18, 2013 3:21 AM
> To: Inki Dae
> Cc: Maarten Lankhorst; linux-fbdev; Kyungmin Park; DRI mailing list; Rob
> Clark; myungjoo.ham; YoungJun Cho; Daniel Vetter; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Tue, Jun 18, 2013 at 02:19:04AM +0900, Inki Dae wrote:
> > It seems like that all pages of the scatterlist should be mapped with
> > DMA every time DMA operation  is started (or drm_xxx_set_src_dma_buffer
> > function call), and the pages should be unmapped from DMA again every
> > time DMA operation is completed: internally, including each cache
> > operation.
> 
> Correct.
> 
> > Isn't that big overhead?
> 
> Yes, if you _have_ to do a cache operation to ensure that the DMA agent
> can see the data the CPU has written.
> 
> > And If there is no problem, we should accept such overhead?
> 
> If there is no problem then why are we discussing this and why do we need
> this API extension? :)

Ok, that's another issue, independent of dmabuf-sync. It sounds reasonable to
me even though the overhead is big. Besides, it seems that most DRM drivers
have the same issue. Therefore, we may need to solve this issue as below:
- do not map a dmabuf with DMA; just create/update the buffer object
of the importer.
- map the buffer with DMA every time DMA starts or an iommu page fault
occurs.
- unmap the buffer from DMA every time a DMA operation is completed.

With the above approach, the cache operation portion of my approach,
dmabuf-sync, can be removed. However, I'm not sure that we really have to use
the above approach given its big overhead. Of course, if we don't use the
above approach then user processes would need to perform each cache operation
before a DMA operation is started and also after it is completed in some
cases; e.g., when user space is mapped to physical memory as cacheable, and
CPU and DMA share the same buffer.

So I'd like to ask the other DRM maintainers: what do you think about it? It
seems that Intel DRM (maintained by Daniel), OMAP DRM (maintained by Rob) and
the GEM CMA helper also have the same issue Russell pointed out. I think not
only the above approach but also performance is very important.

Thanks,
Inki Dae

> 
> > Actually, drm_gem_fd_to_handle() includes mapping a given dmabuf with the
> > iommu table (just logical data) of the DMA. And then, the device address
> > (or iova) already mapped will be set to the buffer register of the DMA
> > with a drm_xxx_set_src/dst_dma_buffer(handle1, ...) call.
> 
> Consider this with a PIPT cache:
> 
>   dma_map_sg()- at this point, the kernel addresses of these
>   buffers are cleaned and invalidated for the DMA
> 
>   userspace writes to the buffer, the data sits in the CPU cache
>   Because the cache is PIPT, this data becomes visible to the
>   kernel as well.
> 
>   DMA is started, and it writes to the buffer
> 
> Now, at this point, which is the correct data?  The data physically in the
> RAM which the DMA has written, or the data in the CPU cache.  It may be
> the answer is - they both are, and the loss of either can be a potential
> data corruption issue - there is no way to tell which data should be
> kept but the system is in an inconsistent state and _one_ of them will
> have to be discarded.
> 
>   dma_unmap_sg()  - at this point, the kernel addresses of the
>   buffers are _invalidated_ and any data in those
>   cache lines is discarded
> 
> Which also means that the data in userspace will also be discarded with
> PIPT caches.
> 
> This is precisely why we have buffer rules associated with the DMA API,
> which are these:
> 
>   dma_map_sg()
>   - the buffer transfers ownership from the CPU to the DMA agent.
>   - the CPU may not alter the buffer in any way.
>   while (cpu_wants_access) {
>   dma_sync_sg_for_cpu()
>   - the buffer transfers ownership from the DMA to the CPU.
>   - the DMA may not alter the buffer in any way.
>   dma_sync_sg_for_device()
>   - the buffer transfers ownership from the CPU to the DMA
>   - the CPU may not alter the buffer in any way.
>   }
>   dma_unmap_sg()
>   - the buffer transfers ownership from the DMA to the CPU.
>   - the DMA may not alter the buffer in any way.
> 
> Any violation of that is likely to lead to data corruption.  Now, some
> may say that the DMA API is only about the kernel mapping.  Yes it is,
> because it

RE: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-17 Thread Inki Dae


> -Original Message-
> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
> Sent: Monday, June 17, 2013 8:35 PM
> To: Inki Dae
> Cc: dri-de...@lists.freedesktop.org; linux-fb...@vger.kernel.org; linux-
> arm-ker...@lists.infradead.org; linux-media@vger.kernel.org;
> dan...@ffwll.ch; robdcl...@gmail.com; kyungmin.p...@samsung.com;
> myungjoo@samsung.com; yj44@samsung.com
> Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On 17-06-13 13:15, Inki Dae wrote:
> > This patch adds a buffer synchronization framework based on DMA BUF[1]
> > and reservation[2] to use dma-buf resource, and based on ww-mutexes[3]
> > for lock mechanism.
> >
> > The purpose of this framework is not only to couple cache operations,
> > and buffer access control to CPU and DMA but also to provide easy-to-use
> > interfaces for device drivers and potentially user application
> > (not implemented for user applications, yet). And this framework can be
> > used for all dma devices using system memory as dma buffer, especially
> > for most ARM based SoCs.
> >
> > Changelog v2:
> > - use atomic_add_unless to avoid potential bug.
> > - add a macro for checking valid access type.
> > - code clean.
> >
> > The mechanism of this framework has the following steps,
> > 1. Register dmabufs to a sync object - A task gets a new sync object and
> > can add one or more dmabufs that the task wants to access.
> > This registering should be performed when a device context or an event
> > context such as a page flip event is created or before CPU accesses a
> > shared buffer.
> >
> > dma_buf_sync_get(a sync object, a dmabuf);
> >
> > 2. Lock a sync object - A task tries to lock all dmabufs added in its
> > own sync object. Basically, the lock mechanism uses ww-mutex[1] to avoid
> > the deadlock issue and to handle race conditions between CPU and CPU, CPU
> > and DMA, and DMA and DMA. Taking a lock means that others cannot access
> > all locked dmabufs until the task that locked the corresponding dmabufs
> > unlocks all the locked dmabufs.
> > This locking should be performed before DMA or CPU accesses these dmabufs.
> >
> > dma_buf_sync_lock(a sync object);
> >
> > 3. Unlock a sync object - The task unlocks all dmabufs added in its own
> > sync object. The unlock means that the DMA or CPU accesses to the dmabufs
> > have been completed so that others may access them.
> > This unlocking should be performed after DMA or CPU has completed
> > accesses to the dmabufs.
> >
> > dma_buf_sync_unlock(a sync object);
> >
> > 4. Unregister one or all dmabufs from a sync object - A task unregisters
> > the given dmabufs from the sync object. This means that the task doesn't
> > want to lock the dmabufs.
> > The unregistering should be performed after DMA or CPU has completed
> > accesses to the dmabufs or when dma_buf_sync_lock() has failed.
> >
> > dma_buf_sync_put(a sync object, a dmabuf);
> > dma_buf_sync_put_all(a sync object);
> >
> > The described steps may be summarized as:
> > get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put
> >
> > This framework includes the following two features.
> > 1. read (shared) and write (exclusive) locks - A task is required to
> > declare the access type when the task tries to register a dmabuf;
> > READ, WRITE, READ DMA, or WRITE DMA.
> >
> > The below is example code,
> > struct dmabuf_sync *sync;
> >
> > sync = dmabuf_sync_init(NULL, "test sync");
> >
> > dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> > ...
> >
> > And the below can be used as access types:
> > DMA_BUF_ACCESS_READ,
> > - CPU will access a buffer for read.
> > DMA_BUF_ACCESS_WRITE,
> > - CPU will access a buffer for read or write.
> > DMA_BUF_ACCESS_READ | DMA_BUF_ACCESS_DMA,
> > - DMA will access a buffer for read.
> > DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA,
> > - DMA will access a buffer for read or write.
> >
> > 2. Mandatory resource releasing - a task cannot hold a lock indefinitely.
> > A task may never try to unlock a buffer after taking a lock to the buffer.
> > In this case, a timer

[RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

2013-06-17 Thread Inki Dae
   
https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dmabuf-sync&id=6ca548e9ea9e865592719ef6b1cde58366af9f5c

The framework performs cache operation based on the previous and current access
types to the dmabufs after the locks to all dmabufs are taken:
Call dma_buf_begin_cpu_access() to invalidate cache if,
previous access type is DMA_BUF_ACCESS_WRITE | DMA and
current access type is DMA_BUF_ACCESS_READ

Call dma_buf_end_cpu_access() to clean cache if,
previous access type is DMA_BUF_ACCESS_WRITE and
current access type is DMA_BUF_ACCESS_READ | DMA

Such cache operations are invoked via dma-buf interfaces so the dma buf exporter
should implement dmabuf->ops->begin_cpu_access/end_cpu_access callbacks.

[1] http://lwn.net/Articles/470339/
[2] http://lwn.net/Articles/532616/
[3] https://patchwork-mail1.kernel.org/patch/2625321/

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 Documentation/dma-buf-sync.txt |  246 ++
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dmabuf-sync.c |  545 
 include/linux/dma-buf.h|   14 +
 include/linux/dmabuf-sync.h|  115 +
 include/linux/reservation.h|7 +
 7 files changed, 935 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..e71b6f4
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,246 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization between CPU and CPU, CPU and DMA, and DMA and DMA.
+
+The DMA Buffer synchronization API provides a buffer synchronization
+mechanism; i.e., buffer access control for CPU and DMA, cache operations, and
+easy-to-use interfaces for device drivers and potentially user applications
+(not implemented for user applications, yet). And this API can be used for
+all DMA devices using system memory as a DMA buffer, especially for most ARM
+based SoCs.
+
+
+Motivation
+----------
+
+When sharing a buffer, a device cannot be aware of when another device will
+access the shared buffer: a device may access a buffer containing wrong data
+if it accesses the shared buffer while another device is still accessing it.
+Therefore, a user process should wait for the completion of DMA access by
+another device before a device tries to access the shared buffer.
+
+Besides, there is the same issue when CPU and DMA are sharing a buffer; i.e.,
+a user process should consider this when it has to send a buffer to a device
+driver for the device driver to access the buffer as input.
+This means that a user process needs to understand how the device driver
+works. Hence, the conventional mechanism not only makes user applications
+complicated but also incurs performance overhead, because the conventional
+mechanism cannot control devices precisely without additional and complex
+implementations.
+
+In addition, in the case of ARM based SoCs, most devices have no hardware
+cache consistency mechanism between CPU and DMA devices because they do not
+use the ACP (Accelerator Coherency Port). The ACP can be connected to a DMA
+engine or similar devices in order to keep cache coherency between the CPU
+cache and the DMA device. Thus, we need additional cache operations to have
+the devices operate properly; i.e., user applications should request cache
+operations from the kernel before DMA accesses the buffer and after the
+completion of buffer access by CPU, or vice versa.
+
+   buffer access by CPU -> cache clean -> buffer access by DMA
+
+Or,
+   buffer access by DMA -> cache invalidate -> buffer access by CPU
+
+The below shows why cache operations should be requested by a user
+process.
+(Presume that CPU and DMA share a buffer and the buffer is mapped
+ into user space as cacheable.)
+
+   handle = drm_gem_alloc(size);
+   ...
+   va1 = drm_gem_mmap(handle);
+   va2 = malloc(size);
+   ...
+
+   while(conditions) {
+   memcpy(va1, some data, size);
+   ...
+   drm_xxx_set_dma_buffer(handle, ...);
+   ...
+
+   /* user needs to request a cache clean here. */
+
+   /* blocked until dma operation is completed. */
+   drm_xxx_start_dma(...);
+   ...
+
+

RE: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework

2013-06-13 Thread Inki Dae
Hi Russell,

> -Original Message-
> From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk]
> Sent: Friday, June 14, 2013 2:26 AM
> To: Inki Dae
> Cc: maarten.lankho...@canonical.com; dan...@ffwll.ch; robdcl...@gmail.com;
> linux-fb...@vger.kernel.org; dri-de...@lists.freedesktop.org;
> kyungmin.p...@samsung.com; myungjoo@samsung.com; yj44@samsung.com;
> linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization
> framework
> 
> On Thu, Jun 13, 2013 at 05:28:08PM +0900, Inki Dae wrote:
> > This patch adds a buffer synchronization framework based on DMA BUF[1]
> > and reservation[2] to use dma-buf resource, and based on ww-mutexes[3]
> > for lock mechanism.
> >
> > The purpose of this framework is not only to couple cache operations,
> > and buffer access control to CPU and DMA but also to provide easy-to-use
> > interfaces for device drivers and potentially user application
> > (not implemented for user applications, yet). And this framework can be
> > used for all dma devices using system memory as dma buffer, especially
> > for most ARM based SoCs.
> >
> > The mechanism of this framework has the following steps,
> > 1. Register dmabufs to a sync object - A task gets a new sync object and
> > can add one or more dmabufs that the task wants to access.
> > This registering should be performed when a device context or an event
> > context such as a page flip event is created or before CPU accesses a
> > shared buffer.
> >
> > dma_buf_sync_get(a sync object, a dmabuf);
> >
> > 2. Lock a sync object - A task tries to lock all dmabufs added in its
> > own sync object. Basically, the lock mechanism uses ww-mutex[1] to avoid
> > the deadlock issue and to handle race conditions between CPU and CPU, CPU
> > and DMA, and DMA and DMA. Taking a lock means that others cannot access
> > all locked dmabufs until the task that locked the corresponding dmabufs
> > unlocks all the locked dmabufs.
> > This locking should be performed before DMA or CPU accesses these dmabufs.
> >
> > dma_buf_sync_lock(a sync object);
> >
> > 3. Unlock a sync object - The task unlocks all dmabufs added in its own
> > sync object. The unlock means that the DMA or CPU accesses to the dmabufs
> > have been completed so that others may access them.
> > This unlocking should be performed after DMA or CPU has completed
> > accesses to the dmabufs.
> >
> > dma_buf_sync_unlock(a sync object);
> >
> > 4. Unregister one or all dmabufs from a sync object - A task unregisters
> > the given dmabufs from the sync object. This means that the task doesn't
> > want to lock the dmabufs.
> > The unregistering should be performed after DMA or CPU has completed
> > accesses to the dmabufs or when dma_buf_sync_lock() has failed.
> >
> > dma_buf_sync_put(a sync object, a dmabuf);
> > dma_buf_sync_put_all(a sync object);
> >
> > The described steps may be summarized as:
> > get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put
> >
> > This framework includes the following two features.
> > 1. read (shared) and write (exclusive) locks - A task is required to
> > declare the access type when the task tries to register a dmabuf;
> > READ, WRITE, READ DMA, or WRITE DMA.
> >
> > The below is example code,
> > struct dmabuf_sync *sync;
> >
> > sync = dmabuf_sync_init(NULL, "test sync");
> >
> > dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
> > ...
> >
> > And the below can be used as access types:
> > DMA_BUF_ACCESS_READ,
> > - CPU will access a buffer for read.
> > DMA_BUF_ACCESS_WRITE,
> > - CPU will access a buffer for read or write.
> > DMA_BUF_ACCESS_READ | DMA_BUF_ACCESS_DMA,
> > - DMA will access a buffer for read.
> > DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA,
> > - DMA will access a buffer for read or write.
> >
> > 2. Mandatory resource releasing - a task cannot hold a lock indefinitely.
> > A task may never try to unlock a buffer after taking a lock to the buffer.
> > In this case, a timer handler to the corresponding sync object is called
> > in five (default) seconds and then the timed-o

RE: [RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework

2013-06-13 Thread Inki Dae

> +static void dmabuf_sync_timeout_worker(struct work_struct *work)
> +{
> + struct dmabuf_sync *sync = container_of(work, struct dmabuf_sync, work);
> + struct dmabuf_sync_object *sobj;
> +
> + mutex_lock(&sync->lock);
> +
> + list_for_each_entry(sobj, &sync->syncs, head) {
> + if (WARN_ON(!sobj->robj))
> + continue;
> +
> + printk(KERN_WARNING "%s: timeout = 0x%x [type = %d, " \
> + "refcnt = %d, locked = %d]\n",
> + sync->name, (u32)sobj->dmabuf,
> + sobj->access_type,
> + atomic_read(&sobj->robj->shared_cnt),
> + sobj->robj->locked);
> +
> + /* unlock only valid sync object. */
> + if (!sobj->robj->locked)
> + continue;
> +
> + if (sobj->robj->shared &&
> + atomic_read(&sobj->robj->shared_cnt) > 1) {
> + atomic_dec(&sobj->robj->shared_cnt);
> + continue;
> + }
> +
> + ww_mutex_unlock(&sobj->robj->lock);
> +
> + if (sobj->access_type & DMA_BUF_ACCESS_READ)
> + printk(KERN_WARNING "%s: r-unlocked = 0x%x\n",
> + sync->name, (u32)sobj->dmabuf);
> + else
> + printk(KERN_WARNING "%s: w-unlocked = 0x%x\n",
> + sync->name, (u32)sobj->dmabuf);
> +
> +#if defined(CONFIG_DEBUG_FS)
> + sync_debugfs_timeout_cnt++;
> +#endif

Oops, unnecessary code. Will remove it.



[RFC PATCH] dmabuf-sync: Introduce buffer synchronization framework

2013-06-13 Thread Inki Dae
6ca548e9ea9e865592719ef6b1cde58366af9f5c

The framework performs cache operation based on the previous and current access
types to the dmabufs after the locks to all dmabufs are taken:
Call dma_buf_begin_cpu_access() to invalidate cache if,
previous access type is DMA_BUF_ACCESS_WRITE | DMA and
current access type is DMA_BUF_ACCESS_READ

Call dma_buf_end_cpu_access() to clean cache if,
previous access type is DMA_BUF_ACCESS_WRITE and
current access type is DMA_BUF_ACCESS_READ | DMA

Such cache operations are invoked via dma-buf interfaces so the dma buf exporter
should implement dmabuf->ops->begin_cpu_access/end_cpu_access callbacks.

[1] http://lwn.net/Articles/470339/
[2] http://lwn.net/Articles/532616/
[3] https://patchwork-mail1.kernel.org/patch/2625321/

Signed-off-by: Inki Dae 
---
 Documentation/dma-buf-sync.txt |  246 ++
 drivers/base/Kconfig   |7 +
 drivers/base/Makefile  |1 +
 drivers/base/dmabuf-sync.c |  555 
 include/linux/dma-buf.h|5 +
 include/linux/dmabuf-sync.h|  115 +
 include/linux/reservation.h|7 +
 7 files changed, 936 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/dma-buf-sync.txt
 create mode 100644 drivers/base/dmabuf-sync.c
 create mode 100644 include/linux/dmabuf-sync.h

diff --git a/Documentation/dma-buf-sync.txt b/Documentation/dma-buf-sync.txt
new file mode 100644
index 000..e71b6f4
--- /dev/null
+++ b/Documentation/dma-buf-sync.txt
@@ -0,0 +1,246 @@
+DMA Buffer Synchronization Framework
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+  Inki Dae
+  
+  
+
+This document is a guide for device-driver writers describing the DMA buffer
+synchronization API. This document also describes how to use the API to
+use buffer synchronization between CPU and CPU, CPU and DMA, and DMA and DMA.
+
+The DMA Buffer synchronization API provides a buffer synchronization
+mechanism; i.e., buffer access control for CPU and DMA, cache operations, and
+easy-to-use interfaces for device drivers and potentially user applications
+(not implemented for user applications, yet). And this API can be used for
+all DMA devices using system memory as a DMA buffer, especially for most ARM
+based SoCs.
+
+
+Motivation
+----------
+
+When sharing a buffer, a device cannot be aware of when another device will
+access the shared buffer: a device may access a buffer containing wrong data
+if it accesses the shared buffer while another device is still accessing it.
+Therefore, a user process should wait for the completion of DMA access by
+another device before a device tries to access the shared buffer.
+
+Besides, there is the same issue when CPU and DMA are sharing a buffer; i.e.,
+a user process should consider this when it has to send a buffer to a device
+driver for the device driver to access the buffer as input.
+This means that a user process needs to understand how the device driver
+works. Hence, the conventional mechanism not only makes user applications
+complicated but also incurs performance overhead, because the conventional
+mechanism cannot control devices precisely without additional and complex
+implementations.
+
+In addition, in the case of ARM based SoCs, most devices have no hardware
+cache consistency mechanism between CPU and DMA devices because they do not
+use the ACP (Accelerator Coherency Port). The ACP can be connected to a DMA
+engine or similar devices in order to keep cache coherency between the CPU
+cache and the DMA device. Thus, we need additional cache operations to have
+the devices operate properly; i.e., user applications should request cache
+operations from the kernel before DMA accesses the buffer and after the
+completion of buffer access by CPU, or vice versa.
+
+   buffer access by CPU -> cache clean -> buffer access by DMA
+
+Or,
+   buffer access by DMA -> cache invalidate -> buffer access by CPU
+
+The below shows why cache operations should be requested by a user
+process.
+(Presume that CPU and DMA share a buffer and the buffer is mapped
+ into user space as cacheable.)
+
+   handle = drm_gem_alloc(size);
+   ...
+   va1 = drm_gem_mmap(handle);
+   va2 = malloc(size);
+   ...
+
+   while(conditions) {
+   memcpy(va1, some data, size);
+   ...
+   drm_xxx_set_dma_buffer(handle, ...);
+   ...
+
+   /* user needs to request a cache clean here. */
+
+   /* blocked until dma operation is completed. */
+   drm_xxx_start_dma(...);
+   ...
+
+   /* user needs to request a cache invalidate here. */
+
+   memcpy(va2, va1, size);
+   }
+
+The issue arises: user pr

Introduce a dmabuf sync framework for buffer synchronization

2013-06-07 Thread Inki Dae
tries to take a lock to dmabuf
C again.

And below are my concerns and opinions,
A dma-buf has a reservation object when a buffer is exported as a dma-buf
- I'm not sure, but it seems that the reservation object is used for x86 GPUs
(having VRAM and a different domain) or similar devices. In the case of
embedded systems, most DMA devices and the CPU share system memory, so I
think the reservation object should be considered for them as well because,
basically, the buffer synchronization mechanism should work based on dma-buf.
For this, I have added four members to reservation_object: shared_cnt and
shared for the read lock, accessed_type for cache operations, and locked for
the timeout case. However, some devices might need something specific to
themselves. So how about keeping only the common part in the
reservation_object structure? It seems that fence_excl, fence_shared, and so
on are not common parts.

Wound/wait mutexes currently do not consider read and write locks - In the
case of using ww-mutexes for buffer synchronization, it seems that we need
read and write locks for better performance; a read access followed by
another read access doesn't need to be locked. For this, I have added two
members, shared_cnt and shared, to reservation_object, and this is just to
show you how we can use a read lock. However, I'm sure that there is a
better/more generic way.

All of the above is just a quick implementation for buffer synchronization,
so it should be cleaned up further, and there might be points I am missing.
Please give me your advice and opinions.

Thanks,
Inki Dae



RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae


> -Original Message-
> From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of
> Daniel Vetter
> Sent: Wednesday, May 29, 2013 1:50 AM
> To: Inki Dae
> Cc: Rob Clark; Maarten Lankhorst; linux-fbdev; YoungJun Cho; Kyungmin Park;
> myungjoo.ham; DRI mailing list; linux-arm-ker...@lists.infradead.org;
> linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 28, 2013 at 4:50 PM, Inki Dae  wrote:
> > I think I already used the reservation stuff in that way, except for the
> > ww-mutex. And I'm not sure that embedded systems really need ww-mutex. If
> > there is any case, could you tell me the case? I really need more advice
> > and understanding :)
> 
> If you have only one driver, you can get away without ww_mutex.
> drm/i915 does it, all buffer state is protected by dev->struct_mutex.
> But as soon as you have multiple drivers sharing buffers with dma_buf
> things will blow up.
> 
> Yep, current prime is broken and can lead to deadlocks.
> 
> In practice it doesn't (yet) matter since only the X server does the
> sharing dance, and that one's single-threaded. Now you can claim that
> since you have all buffers pinned in embedded gfx anyway, you don't
> care. But both in desktop gfx and embedded gfx the real fun starts
> once you put fences into the mix and link them up with buffers, then
> every command submission risks that deadlock. Furthermore you can get
> unlucky and construct a circle of fences waiting on each another (only
> though if the fence singalling fires off the next batchbuffer
> asynchronously).

In our case, we haven't experienced a deadlock yet, but it is still possible
to face a deadlock in case a process is sharing two buffers with another
process, like below:
Process A commits buffer A and waits for buffer B;
Process B commits buffer B and waits for buffer A.

That is a deadlock, and it seems you are saying we can resolve the deadlock
issue with ww-mutexes. And it seems that we can replace our block-wakeup
mechanism with mutex locks for better performance.

> 
> To prevent such deadlocks you _absolutely_ need to lock _all_ buffers
> that take part in a command submission at once. To do that you either
> need a global lock (ugh) or ww_mutexes.
> 
> So ww_mutexes are the fundamental ingredient of all this, if you don't
> see why you need them then everything piled on top is broken. I think
> until you've understood why exactly we need ww_mutexes there's not
> much point in discussing the finer issues of fences, reservation
> objects and how to integrate it with dma_bufs exactly.
> 
> I'll try to clarify the motivating example in the ww_mutex
> documentation a bit, but I dunno how else I could explain this ...
> 

I really don't want you to waste your time on me. I will try to apply
ww-mutexes (v4) to the proposed framework to understand them better.

Thanks for your advice. :)
Inki Dae

> Yours, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae


> -Original Message-
> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> ow...@vger.kernel.org] On Behalf Of Rob Clark
> Sent: Tuesday, May 28, 2013 10:49 PM
> To: Inki Dae
> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> Park; myungjoo.ham; DRI mailing list;
linux-arm-ker...@lists.infradead.org;
> linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 11:56 PM, Inki Dae  wrote:
> >
> >
> >> -Original Message-
> >> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> >> ow...@vger.kernel.org] On Behalf Of Rob Clark
> >> Sent: Tuesday, May 28, 2013 12:48 AM
> >> To: Inki Dae
> >> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin
> >> Park; myungjoo.ham; DRI mailing list;
> > linux-arm-ker...@lists.infradead.org;
> >> linux-media@vger.kernel.org
> >> Subject: Re: Introduce a new helper framework for buffer
> synchronization
> >>
> >> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> >> > Hi all,
> >> >
> >> > I have been removed previous branch and added new one with more
> cleanup.
> >> > This time, the fence helper doesn't include user side interfaces and
> >> cache
> >> > operation relevant codes anymore because not only we are not sure
> that
> >> > coupling those two things, synchronizing caches and buffer access
> >> between
> >> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side
> is
> >> a
> >> > good idea yet but also existing codes for user side have problems
> with
> >> badly
> >> > behaved or crashing userspace. So this could be more discussed later.
> >> >
> >> > The below is a new branch,
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> >> exynos.git/?h=dma-f
> >> > ence-helper
> >> >
> >> > And fence helper codes,
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> >> exynos.git/commit/?
> >> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >> >
> >> > And example codes for device driver,
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> >> exynos.git/commit/?
> >> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >> >
> >> > I think the time is not yet ripe for RFC posting: maybe existing dma
> >> fence
> >> > and reservation need more review and addition work. So I'd glad for
> >> somebody
> >> > giving other opinions and advices in advance before RFC posting.
> >>
> >> thoughts from a *really* quick, pre-coffee, first look:
> >> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> >> probably wouldn't want to bake in assumption that seqno_fence is used.
> >> * I guess g2d is probably not actually a simple use case, since I
> >> expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One and more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices; multimedia, gpu and display controller. And this approach shows
> > more performance; reduced power consumption against traditional way. And
> g2d
> > example is just to show you how to apply my approach to device driver.
> 
> no, you need the ww-mutex / reservation stuff any time you have
> multiple independent devices (or rings/contexts for hw that can
> support multiple contexts) which can do operations with multiple
> buffers.

I think I have already been using the reservation stuff in that way,
except for ww-mutex. And I'm not sure an embedded system really needs
ww-mutex. If there is such a case,
could you tell me about it? I really need more advice and understanding :)

Thanks,
Inki Dae

> So you could conceivably hit this w/ gpu + g2d if multiple
> buffers where shared between the two.  vram migration and such
> 'desktop stuff' might make the problem worse, but just because you
> don't have vram doesn't mean you don't have a problem with multiple
> buffers.
> 
> >> * otherwise, you probably don't want to depend on dmabuf, which is why
> >> reservation/fence is split out the way it is..  you want to be able to
> >> use a single reserva

RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae

Hi Daniel,

Thank you so much; that was very useful. :) Sorry, but could you give me
more comments on my comments below? There are still things confusing
me. :(


> -Original Message-
> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, May 28, 2013 7:33 PM
> To: Inki Dae
> Cc: 'Rob Clark'; 'Maarten Lankhorst'; 'Daniel Vetter'; 'linux-fbdev';
> 'YoungJun Cho'; 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list';
> linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 28, 2013 at 12:56:57PM +0900, Inki Dae wrote:
> >
> >
> > > -Original Message-
> > > From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> > > ow...@vger.kernel.org] On Behalf Of Rob Clark
> > > Sent: Tuesday, May 28, 2013 12:48 AM
> > > To: Inki Dae
> > > Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin
> > > Park; myungjoo.ham; DRI mailing list;
> > linux-arm-ker...@lists.infradead.org;
> > > linux-media@vger.kernel.org
> > > Subject: Re: Introduce a new helper framework for buffer
> synchronization
> > >
> > > On Mon, May 27, 2013 at 6:38 AM, Inki Dae 
wrote:
> > > > Hi all,
> > > >
> > > > I have been removed previous branch and added new one with more
> cleanup.
> > > > This time, the fence helper doesn't include user side interfaces and
> > > cache
> > > > operation relevant codes anymore because not only we are not sure
> that
> > > > coupling those two things, synchronizing caches and buffer access
> > > between
> > > > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel
> side is
> > > a
> > > > good idea yet but also existing codes for user side have problems
> with
> > > badly
> > > > behaved or crashing userspace. So this could be more discussed
later.
> > > >
> > > > The below is a new branch,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/?h=dma-f
> > > > ence-helper
> > > >
> > > > And fence helper codes,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> > > >
> > > > And example codes for device driver,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> > > >
> > > > I think the time is not yet ripe for RFC posting: maybe existing dma
> > > fence
> > > > and reservation need more review and addition work. So I'd glad for
> > > somebody
> > > > giving other opinions and advices in advance before RFC posting.
> > >
> > > thoughts from a *really* quick, pre-coffee, first look:
> > > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > > probably wouldn't want to bake in assumption that seqno_fence is used.
> > > * I guess g2d is probably not actually a simple use case, since I
> > > expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One and more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices; multimedia, gpu and display controller. And this approach shows
> > more performance; reduced power consumption against traditional way. And
> g2d
> > example is just to show you how to apply my approach to device driver.
> 
> Note that seqno_fence is an implementation pattern for a certain type of
> direct hw->hw synchronization which uses a shared dma_buf to exchange the
> sync cookie.

I'm afraid I don't understand hw->hw synchronization. Does hw->hw
synchronization mean that the device has a hardware feature which supports
buffer synchronization internally? And what is the sync cookie?

> The dma_buf attached to the seqno_fence has _nothing_ to do
> with the dma_buf the fence actually coordinates access to.
> 
> I think that confusing is a large reason for why Maarten&I don't
> understand what you want to achieve with your fence helpers. Currently
> they're us

RE: Introduce a new helper framework for buffer synchronization

2013-05-27 Thread Inki Dae


> -Original Message-
> From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of
> Daniel Vetter
> Sent: Tuesday, May 28, 2013 1:02 AM
> To: Rob Clark
> Cc: Inki Dae; Maarten Lankhorst; linux-fbdev; YoungJun Cho; Kyungmin Park;
> myungjoo.ham; DRI mailing list; linux-arm-ker...@lists.infradead.org;
> linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 5:47 PM, Rob Clark  wrote:
> > On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> >> Hi all,
> >>
> >> I have been removed previous branch and added new one with more
cleanup.
> >> This time, the fence helper doesn't include user side interfaces and
> cache
> >> operation relevant codes anymore because not only we are not sure that
> >> coupling those two things, synchronizing caches and buffer access
> between
> >> CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side
> is a
> >> good idea yet but also existing codes for user side have problems with
> badly
> >> behaved or crashing userspace. So this could be more discussed later.
> >>
> >> The below is a new branch,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> >> ence-helper
> >>
> >> And fence helper codes,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> >> h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >>
> >> And example codes for device driver,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> >> h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >>
> >> I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> >> and reservation need more review and addition work. So I'd glad for
> somebody
> >> giving other opinions and advices in advance before RFC posting.
> >
> > thoughts from a *really* quick, pre-coffee, first look:
> > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > probably wouldn't want to bake in assumption that seqno_fence is used.
> 
> Yeah, which is why Maarten&I discussed ideas already for what needs to
> be improved in the current dma-buf interface code to make this Just
> Work. At least as long as a driver doesn't want to add new fences,
> which would be especially useful for all kinds of gpu access.
> 
> > * I guess g2d is probably not actually a simple use case, since I
> > expect you can submit blits involving multiple buffers :-P
> 
> Yeah, on a quick read the current fence helper code seems to be a bit
> limited in scope.
> 
> > * otherwise, you probably don't want to depend on dmabuf, which is why
> > reservation/fence is split out the way it is..  you want to be able to
> > use a single reservation/fence mechanism within your driver without
> > having to care about which buffers are exported to dmabuf's and which
> > are not.  Creating a dmabuf for every GEM bo is too heavyweight.
> 
> That's pretty much the reason that reservations are free-standing from
> dma_bufs. The idea is to embed them into the gem/ttm/v4l buffer
> object. Maarten also has some helpers to keep track of multi-buffer
> ww_mutex locking and fence attaching in his reservation helpers, but I
> think we should wait with those until we have drivers using them.
> 
> For now I think the priority should be to get the basic stuff in and
> ttm as the first user established. Then we can go nuts later on.
> 
> > I'm not entirely sure if reservation/fence could/should be made any
> > simpler for multi-buffer users.  Probably the best thing to do is just
> > get reservation/fence rolled out in a few drivers and see if some
> > common patterns emerge.
> 
> I think we can make the 1 buffer per dma op (i.e. 1:1
> dma_buf->reservation : fence mapping) work fairly simple in dma_buf
> with maybe a dma_buf_attachment_start_dma/end_dma helpers. But there's
> also still the open that currently there's no way to flush cpu caches
> for dma access without unmapping the attachement (or resorting to


That was why I tried adding user interfaces to dmabuf: to couple cache
synchronization and buffer access between CPU and CPU, CPU and DMA, and
DMA and DMA with fences in the kernel. We need something we can do between
mapping and unmapping the attachment.

> trick which might not work), so we have a few gaping holes in the
> interface already anyway.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



RE: Introduce a new helper framework for buffer synchronization

2013-05-27 Thread Inki Dae


> -Original Message-
> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> ow...@vger.kernel.org] On Behalf Of Rob Clark
> Sent: Tuesday, May 28, 2013 12:48 AM
> To: Inki Dae
> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> Park; myungjoo.ham; DRI mailing list;
linux-arm-ker...@lists.infradead.org;
> linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> > Hi all,
> >
> > I have been removed previous branch and added new one with more cleanup.
> > This time, the fence helper doesn't include user side interfaces and
> cache
> > operation relevant codes anymore because not only we are not sure that
> > coupling those two things, synchronizing caches and buffer access
> between
> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
> a
> > good idea yet but also existing codes for user side have problems with
> badly
> > behaved or crashing userspace. So this could be more discussed later.
> >
> > The below is a new branch,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> > ence-helper
> >
> > And fence helper codes,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >
> > And example codes for device driver,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >
> > I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> > and reservation need more review and addition work. So I'd glad for
> somebody
> > giving other opinions and advices in advance before RFC posting.
> 
> thoughts from a *really* quick, pre-coffee, first look:
> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> probably wouldn't want to bake in assumption that seqno_fence is used.
> * I guess g2d is probably not actually a simple use case, since I
> expect you can submit blits involving multiple buffers :-P

I don't think so. One or more buffers can be used: seqno_fence also has
only one buffer. Actually, we have already applied this approach to most
devices: multimedia, gpu and display controller. And this approach shows
better performance and reduced power consumption compared to the
traditional way. The g2d example is just to show you how to apply my
approach to a device driver.

> * otherwise, you probably don't want to depend on dmabuf, which is why
> reservation/fence is split out the way it is..  you want to be able to
> use a single reservation/fence mechanism within your driver without
> having to care about which buffers are exported to dmabuf's and which
> are not.  Creating a dmabuf for every GEM bo is too heavyweight.

Right. But I think we should deal with this separately. Actually, we are
trying to use reservation for gpu pipeline synchronization, such as the sgx
sync object, and that approach works without dmabuf. In other words, a
device can use only reservation for such pipeline synchronization and, at
the same time, a fence helper or similar thing with dmabuf for buffer
synchronization.

> 
> I'm not entirely sure if reservation/fence could/should be made any
> simpler for multi-buffer users.  Probably the best thing to do is just
> get reservation/fence rolled out in a few drivers and see if some
> common patterns emerge.
> 
> BR,
> -R
> 
> >
> > Thanks,
> > Inki Dae
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



RE: Introduce a new helper framework for buffer synchronization

2013-05-27 Thread Inki Dae


> -Original Message-
> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
> Sent: Tuesday, May 28, 2013 12:23 AM
> To: Inki Dae
> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'YoungJun Cho'; 'Kyungmin
> Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> Hey,
> 
> Op 27-05-13 12:38, Inki Dae schreef:
> > Hi all,
> >
> > I have been removed previous branch and added new one with more cleanup.
> > This time, the fence helper doesn't include user side interfaces and
> cache
> > operation relevant codes anymore because not only we are not sure that
> > coupling those two things, synchronizing caches and buffer access
> between
> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
> a
> > good idea yet but also existing codes for user side have problems with
> badly
> > behaved or crashing userspace. So this could be more discussed later.
> >
> > The below is a new branch,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> > ence-helper
> >
> > And fence helper codes,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >
> > And example codes for device driver,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >
> > I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> > and reservation need more review and addition work. So I'd glad for
> somebody
> > giving other opinions and advices in advance before RFC posting.
> >
> NAK.
> 
> For examples for how to handle locking properly, see Documentation/ww-
> mutex-design.txt in my recent tree.
> I could list what I believe is wrong with your implementation, but real
> problem is that the approach you're taking is wrong.

I just removed the ticket stubs to show you guys my approach as simply as
possible, and I wanted to show that we could use the buffer
synchronization mechanism without ticket stubs.

Question: could WW-Mutexes be used for all devices? I guess this has a
dependence on x86 gpus: a gpu has VRAM, which means a different memory
domain. And could you tell me why the shared fence should have only eight
objects? I think we could need more than eight objects for read access.
Anyway, I don't fully understand this yet, so I might be missing
something.

Thanks,
Inki Dae

> 

> ~Maarten



RE: Introduce a new helper framework for buffer synchronization

2013-05-27 Thread Inki Dae
Hi all,

I have removed the previous branch and added a new one with more cleanup.
This time, the fence helper doesn't include user-side interfaces and cache
operation relevant code anymore, because we are not yet sure that coupling
those two things (synchronizing caches and buffer access between CPU and
CPU, CPU and DMA, and DMA and DMA) with fences in the kernel is a good
idea, and the existing user-side code also has problems with badly behaved
or crashing userspace. So this could be discussed more later.

The below is a new branch,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-f
ence-helper

And fence helper codes,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?
h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005

And example codes for device driver,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?
h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae

I think the time is not yet ripe for an RFC posting: maybe the existing
dma fence and reservation code needs more review and additional work. So
I'd be glad for somebody to give other opinions and advice in advance,
before the RFC posting.

Thanks,
Inki Dae



RE: Introduce a new helper framework for buffer synchronization

2013-05-23 Thread Inki Dae
> -Original Message-
> From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of
> Daniel Vetter
> Sent: Thursday, May 23, 2013 8:56 PM
> To: Inki Dae
> Cc: Rob Clark; linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
> YoungJun Cho; linux-arm-ker...@lists.infradead.org; linux-
> me...@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 21, 2013 at 11:22 AM, Inki Dae  wrote:
> >> -Original Message-
> >> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel
> >> Vetter
> >> Sent: Tuesday, May 21, 2013 4:45 PM
> >> To: Inki Dae
> >> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'DRI mailing list';
> >> 'Kyungmin Park'; 'myungjoo.ham'; 'YoungJun Cho'; linux-arm-
> >> ker...@lists.infradead.org; linux-media@vger.kernel.org
> >> Subject: Re: Introduce a new helper framework for buffer
> synchronization
> >>
> >> On Tue, May 21, 2013 at 04:03:06PM +0900, Inki Dae wrote:
> >> > > - Integration of fence syncing into dma_buf. Imo we should have a
> >> > >   per-attachment mode which decides whether map/unmap (and the new
> >> sync)
> >> > >   should wait for fences or whether the driver takes care of
> syncing
> >> > >   through the new fence framework itself (for direct hw sync).
> >> >
> >> > I think it's a good idea to have per-attachment mode for buffer sync.
> >> But
> >> > I'd like to say we make sure what is the purpose of map
> >> > (dma_buf_map_attachment)first. This interface is used to get a sgt;
> >> > containing pages to physical memory region, and map that region with
> >> > device's iommu table. The important thing is that this interface is
> >> called
> >> > just one time when user wants to share an allocated buffer with dma.
> But
> >> cpu
> >> > will try to access the buffer every time as long as it wants.
> Therefore,
> >> we
> >> > need cache control every time cpu and dma access the shared buffer:
> >> cache
> >> > clean when cpu goes to dma and cache invalidate when dma goes to cpu.
> >> That
> >> > is why I added new interfaces, DMA_BUF_GET_FENCE and
> DMA_BUF_PUT_FENCE,
> >> to
> >> > dma buf framework. Of course, Those are ugly so we need a better way:
> I
> >> just
> >> > wanted to show you that in such way, we can synchronize the shared
> >> buffer
> >> > between cpu and dma. By any chance, is there any way that kernel can
> be
> >> > aware of when cpu accesses the shared buffer or is there any point I
> >> didn't
> >> > catch up?
> >>
> >> Well dma_buf_map/unmap_attachment should also flush/invalidate any
> caches,
> >> and with the current dma_buf spec those two functions are the _only_
> means
> >
> > I know that dma buf exporter should make sure that cache
> clean/invalidate
> > are done when dma_buf_map/unmap_attachement is called. For this, already
> we
> > do so. However, this function is called when dma buf import is requested
> by
> > user to map a dmabuf fd with dma. This means that
> dma_buf_map_attachement()
> > is called ONCE when user wants to share the dmabuf fd with dma.
Actually,
> in
> > case of exynos drm, dma_map_sg_attrs(), performing cache operation
> > internally, is called when dmabuf import is requested by user.
> >
> >> you have to do so. Which strictly means that if you interleave device
> dma
> >> and cpu acccess you need to unmap/map every time.
> >>
> >> Which is obviously horribly inefficient, but a known gap in the dma_buf
> >
> > Right, and also this has big overhead.
> >
> >> interface. Hence why I proposed to add dma_buf_sync_attachment similar
> to
> >> dma_sync_single_for_device:
> >>
> >> /**
> >>  * dma_buf_sync_sg_attachment - sync caches for dma access
> >>  * @attach: dma-buf attachment to sync
> >>  * @sgt: the sg table to sync (returned by dma_buf_map_attachement)
> >>  * @direction: dma direction to sync for
> >>  *
> >>  * This function synchronizes caches for device dma through the given
> >>  * dma-buf attachment when interleaving dma from different devices and
> the
> >>  * cpu. Other device dma needs to be synced also by calls to this
> >>  * function (or 

RE: Introduce a new helper framework for buffer synchronization

2013-05-21 Thread Inki Dae


> -Original Message-
> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, May 21, 2013 4:45 PM
> To: Inki Dae
> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'DRI mailing list';
> 'Kyungmin Park'; 'myungjoo.ham'; 'YoungJun Cho'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 21, 2013 at 04:03:06PM +0900, Inki Dae wrote:
> > > - Integration of fence syncing into dma_buf. Imo we should have a
> > >   per-attachment mode which decides whether map/unmap (and the new
> sync)
> > >   should wait for fences or whether the driver takes care of syncing
> > >   through the new fence framework itself (for direct hw sync).
> >
> > I think it's a good idea to have per-attachment mode for buffer sync.
> But
> > I'd like to say we make sure what is the purpose of map
> > (dma_buf_map_attachment)first. This interface is used to get a sgt;
> > containing pages to physical memory region, and map that region with
> > device's iommu table. The important thing is that this interface is
> called
> > just one time when user wants to share an allocated buffer with dma. But
> cpu
> > will try to access the buffer every time as long as it wants. Therefore,
> we
> > need cache control every time cpu and dma access the shared buffer:
> cache
> > clean when cpu goes to dma and cache invalidate when dma goes to cpu.
> That
> > is why I added new interfaces, DMA_BUF_GET_FENCE and DMA_BUF_PUT_FENCE,
> to
> > dma buf framework. Of course, Those are ugly so we need a better way: I
> just
> > wanted to show you that in such way, we can synchronize the shared
> buffer
> > between cpu and dma. By any chance, is there any way that kernel can be
> > aware of when cpu accesses the shared buffer or is there any point I
> didn't
> > catch up?
> 
> Well dma_buf_map/unmap_attachment should also flush/invalidate any caches,
> and with the current dma_buf spec those two functions are the _only_ means

I know that a dma buf exporter should make sure that cache
clean/invalidate are done when dma_buf_map/unmap_attachment is called, and
we already do so. However, this function is called when a dma buf import
is requested by the user to map a dmabuf fd with dma. This means that
dma_buf_map_attachment() is called ONCE, when the user wants to share the
dmabuf fd with dma. Actually, in the case of exynos drm,
dma_map_sg_attrs(), which performs the cache operation internally, is
called when a dmabuf import is requested by the user.

> you have to do so. Which strictly means that if you interleave device dma
> and cpu acccess you need to unmap/map every time.
> 
> Which is obviously horribly inefficient, but a known gap in the dma_buf

Right, and this also has a big overhead.

> interface. Hence why I proposed to add dma_buf_sync_attachment similar to
> dma_sync_single_for_device:
> 
> /**
>  * dma_buf_sync_sg_attachment - sync caches for dma access
>  * @attach: dma-buf attachment to sync
>  * @sgt: the sg table to sync (returned by dma_buf_map_attachement)
>  * @direction: dma direction to sync for
>  *
>  * This function synchronizes caches for device dma through the given
>  * dma-buf attachment when interleaving dma from different devices and the
>  * cpu. Other device dma needs to be synced also by calls to this
>  * function (or a pair of dma_buf_map/unmap_attachment calls), cpu access
>  * needs to be synced with dma_buf_begin/end_cpu_access.
>  */
> void dma_buf_sync_sg_attachment(struct dma_buf_attachment *attach,
>   struct sg_table *sgt,
>   enum dma_data_direction direction)
> 
> Note that "sync" here only means to synchronize caches, not wait for any
> outstanding fences. This is simply to be consistent with the established
> lingo of the dma api. How the dma-buf fences fit into this is imo a
> different topic, but my idea is that all the cache coherency barriers
> (i.e. dma_buf_map/unmap_attachment, dma_buf_sync_sg_attachment and
> dma_buf_begin/end_cpu_access) would automatically block for any attached
> fences (i.e. block for write fences when doing read-only access, block for
> all fences otherwise).

As I mentioned already, the kernel can't be aware of when the cpu accesses
a shared buffer: the user can access a shared buffer anytime after mmap,
and the shared buffer should be synchronized between cpu and dma.
Therefore, the above cache coherency barriers should be called every time
the cpu and dma try to access a shared buffer, checking before and after
of c

RE: Introduce a new helper framework for buffer synchronization

2013-05-21 Thread Inki Dae


> -Original Message-
> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, May 21, 2013 6:31 AM
> To: Inki Dae
> Cc: Rob Clark; linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
> YoungJun Cho; linux-arm-ker...@lists.infradead.org; linux-
> me...@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 20, 2013 at 11:13:04PM +0200, Daniel Vetter wrote:
> > On Sat, May 18, 2013 at 03:47:43PM +0900, Inki Dae wrote:
> > > 2013/5/15 Rob Clark 
> > >
> > > > On Wed, May 15, 2013 at 1:19 AM, Inki Dae 
> wrote:
> > > > >
> > > > >
> > > > >> -Original Message-
> > > > >> From: Rob Clark [mailto:robdcl...@gmail.com]
> > > > >> Sent: Tuesday, May 14, 2013 10:39 PM
> > > > >> To: Inki Dae
> > > > >> Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham;
> YoungJun
> > > > >> Cho; linux-arm-ker...@lists.infradead.org; linux-
> me...@vger.kernel.org
> > > > >> Subject: Re: Introduce a new helper framework for buffer
> synchronization
> > > > >>
> > > > >> On Mon, May 13, 2013 at 10:52 PM, Inki Dae 
> > > > wrote:
> > > > >> >> well, for cache management, I think it is a better idea.. I
> didn't
> > > > >> >> really catch that this was the motivation from the initial
> patch, but
> > > > >> >> maybe I read it too quickly.  But cache can be decoupled from
> > > > >> >> synchronization, because CPU access is not asynchronous.  For
> > > > >> >> userspace/CPU access to buffer, you should:
> > > > >> >>
> > > > >> >>   1) wait for buffer
> > > > >> >>   2) prepare-access
> > > > >> >>   3)  ... do whatever cpu access to buffer ...
> > > > >> >>   4) finish-access
> > > > >> >>   5) submit buffer for new dma-operation
> > > > >> >>
> > > > >> >
> > > > >> >
> > > > >> > For data flow from CPU to DMA device,
> > > > >> > 1) wait for buffer
> > > > >> > 2) prepare-access (dma_buf_begin_cpu_access)
> > > > >> > 3) cpu access to buffer
> > > > >> >
> > > > >> >
> > > > >> > For data flow from DMA device to CPU
> > > > >> > 1) wait for buffer
> > > > >>
> > > > >> Right, but CPU access isn't asynchronous (from the point of view
> of
> > > > >> the CPU), so there isn't really any wait step at this point.  And
> if
> > > > >> you do want the CPU to be able to signal a fence from userspace
> for
> > > > >> some reason, you probably what something file/fd based so the
> > > > >> refcnting/cleanup when process dies doesn't leave some pending
> DMA
> > > > >> action wedged.  But I don't really see the point of that
> complexity
> > > > >> when the CPU access isn't asynchronous in the first place.
> > > > >>
> > > > >
> > > > > There was my missing comments, please see the below sequence.
> > > > >
> > > > > For data flow from CPU to DMA device and then from DMA device to
> CPU,
> > > > > 1) wait for buffer <- at user side - ioctl(fd,
> DMA_BUF_GET_FENCE, ...)
> > > > > - including prepare-access (dma_buf_begin_cpu_access)
> > > > > 2) cpu access to buffer
> > > > > 3) wait for buffer <- at device driver
> > > > > - but CPU is already accessing the buffer so blocked.
> > > > > 4) signal <- at user side - ioctl(fd, DMA_BUF_PUT_FENCE, ...)
> > > > > 5) the thread, blocked at 3), is waked up by 4).
> > > > > - and then finish-access (dma_buf_end_cpu_access)
> > > >
> > > > right, I understand you can have background threads, etc, in
> > > > userspace.  But there are already plenty of synchronization
> primitives
> > > > that can be used for cpu->cpu synchronization, either within the
> same
> > > > process or between multiple processes.  For cpu access, even if it
> is
> > > > handled by background threads/processes, I think it i

RE: Introduce a new helper framework for buffer synchronization

2013-05-14 Thread Inki Dae


> -Original Message-
> From: Rob Clark [mailto:robdcl...@gmail.com]
> Sent: Tuesday, May 14, 2013 10:39 PM
> To: Inki Dae
> Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> Cho; linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 13, 2013 at 10:52 PM, Inki Dae  wrote:
> >> well, for cache management, I think it is a better idea.. I didn't
> >> really catch that this was the motivation from the initial patch, but
> >> maybe I read it too quickly.  But cache can be decoupled from
> >> synchronization, because CPU access is not asynchronous.  For
> >> userspace/CPU access to buffer, you should:
> >>
> >>   1) wait for buffer
> >>   2) prepare-access
> >>   3)  ... do whatever cpu access to buffer ...
> >>   4) finish-access
> >>   5) submit buffer for new dma-operation
> >>
> >
> >
> > For data flow from CPU to DMA device,
> > 1) wait for buffer
> > 2) prepare-access (dma_buf_begin_cpu_access)
> > 3) cpu access to buffer
> >
> >
> > For data flow from DMA device to CPU
> > 1) wait for buffer
> 
> Right, but CPU access isn't asynchronous (from the point of view of
> the CPU), so there isn't really any wait step at this point.  And if
> you do want the CPU to be able to signal a fence from userspace for
> some reason, you probably what something file/fd based so the
> refcnting/cleanup when process dies doesn't leave some pending DMA
> action wedged.  But I don't really see the point of that complexity
> when the CPU access isn't asynchronous in the first place.
>

There was my missing comments, please see the below sequence.

For data flow from CPU to DMA device and then from DMA device to CPU:
1) wait for buffer <- at user side - ioctl(fd, DMA_BUF_GET_FENCE, ...)
- including prepare-access (dma_buf_begin_cpu_access)
2) cpu access to buffer
3) wait for buffer <- at device driver
- but CPU is already accessing the buffer, so this blocks.
4) signal <- at user side - ioctl(fd, DMA_BUF_PUT_FENCE, ...)
5) the thread blocked at 3) is woken up by 4)
- and then finish-access (dma_buf_end_cpu_access)
6) dma access to buffer
7) wait for buffer <- at user side - ioctl(fd, DMA_BUF_GET_FENCE, ...)
- but DMA is already accessing the buffer, so this blocks.
8) signal <- at device driver
9) the thread blocked at 7) is woken up by 8)
- and then prepare-access (dma_buf_begin_cpu_access)
10) cpu access to buffer

Basically, 'wait for buffer' includes buffer synchronization, commit
processing, and cache operations. Buffer synchronization means that the
current thread waits for other threads accessing a shared buffer until their
access completes. Commit processing means that the current thread takes
ownership of the shared buffer, so any attempt by another thread to access
the shared buffer makes that thread block. However, as I already mentioned,
these user interfaces still seem rather ugly, so we need a better way.

Give me more comments if I am missing something. :)
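The commit semantics sketched above (the current owner blocks conflicting access; concurrent reads can share) can be modeled as a toy, single-process try-acquire. All names are illustrative; the real proposal would sleep on a kernel wait queue instead of returning a failure code:

```c
/* Toy model of the 'commit' step for a shared buffer.  Readers may
 * share the buffer; a writer is exclusive.  Illustrative only. */
enum buf_access { BUF_IDLE, BUF_READ, BUF_WRITE };

struct shared_buf {
	enum buf_access state;
	int readers;
};

/* Returns 1 on success, 0 when the caller would have to block. */
int buf_try_acquire(struct shared_buf *b, enum buf_access want)
{
	if (b->state == BUF_IDLE) {
		b->state = want;
		b->readers = (want == BUF_READ);
		return 1;
	}
	if (b->state == BUF_READ && want == BUF_READ) {
		b->readers++;	/* concurrent readers are fine */
		return 1;
	}
	return 0;	/* a writer holds it, or a writer wants a busy buffer */
}

void buf_release(struct shared_buf *b)
{
	if (b->state == BUF_READ && --b->readers > 0)
		return;		/* other readers still active */
	b->state = BUF_IDLE;
	b->readers = 0;
}
```

In the sequence above, step 3) corresponds to a failing try-acquire (the driver blocks), and step 4)'s DMA_BUF_PUT_FENCE corresponds to the release that unblocks it.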

Thanks,
Inki Dae

> BR,
> -R
> 
> 
> > 2) finish-access (dma_buf_end _cpu_access)
> > 3) dma access to buffer
> >
> > 1) and 2) are coupled with one function: we have implemented
> > fence_helper_commit_reserve() for it.
> >
> > Cache control(cache clean or cache invalidate) is performed properly
> > checking previous access type and current access type.
> > And the below is actual codes for it,

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Introduce a new helper framework for buffer synchronization

2013-05-13 Thread Inki Dae


> -Original Message-
> From: Rob Clark [mailto:robdcl...@gmail.com]
> Sent: Tuesday, May 14, 2013 2:58 AM
> To: Inki Dae
> Cc: linux-fbdev; DRI mailing list; Kyungmin Park; myungjoo.ham; YoungJun
> Cho; linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 13, 2013 at 1:18 PM, Inki Dae  wrote:
> >
> >
> > 2013/5/13 Rob Clark 
> >>
> >> On Mon, May 13, 2013 at 8:21 AM, Inki Dae  wrote:
> >> >
> >> >> In that case you still wouldn't give userspace control over the
> fences.
> >> >> I
> >> >> don't see any way that can end well.
> >> >> What if userspace never signals? What if userspace gets killed by
> oom
> >> >> killer. Who keeps track of that?
> >> >>
> >> >
> >> > In all cases, all kernel resources to user fence will be released by
> >> > kernel
> >> > once the fence is timed out: never signaling and process killing by
> oom
> >> > killer makes the fence timed out. And if we use mmap mechanism you
> >> > mentioned
> >> > before, I think user resource could also be freed properly.
> >>
> >>
> >> I tend to agree w/ Maarten here.. there is no good reason for
> >> userspace to be *signaling* fences.  The exception might be some blob
> >> gpu drivers which don't have enough knowledge in the kernel to figure
> >> out what to do.  (In which case you can add driver private ioctls for
> >> that.. still not the right thing to do but at least you don't make a
> >> public API out of it.)
> >>
> >
> > Please do not care whether those are generic or not. Let's see the
> following
> > three things. First, it's cache operation. As you know, ARM SoC has ACP
> > (Accelerator Coherency Port) and can be connected to DMA engine or
> similar
> > devices. And this port is used for cache coherency between CPU cache and
> DMA
> > device. However, most devices on ARM based embedded systems don't use
> the
> > ACP port. So they need proper cache operation before and after of DMA or
> CPU
> > access in case of using cachable mapping. Actually, I see many Linux
> based
> > platforms call cache control interfaces directly for that. I think the
> > reason, they do so, is that kernel isn't aware of when and how CPU
> accessed
> > memory.
> 
> I think we had kicked around the idea of giving dmabuf's a
> prepare/finish ioctl quite some time back.  This is probably something
> that should be at least a bit decoupled from fences.  (Possibly
> 'prepare' waits for dma access to complete, but not the other way
> around.)
> 
> And I did implement in omapdrm support for simulating coherency via
> page fault-in / shoot-down..  It is one option that makes it
> completely transparent to userspace, although there is some
> performance const, so I suppose it depends a bit on your use-case.
> 
> > And second, user process has to do so many things in case of using
> shared
> > memory with DMA device. User process should understand how DMA device is
> > operated and when interfaces for controling the DMA device are called.
> Such
> > things would make user application so complicated.
> >
> > And third, it's performance optimization to multimedia and graphics
> devices.
> > As I mentioned already, we should consider sequential processing for
> buffer
> > sharing between CPU and DMA device. This means that CPU should stay with
> > idle until DMA device is completed and vise versa.
> >
> > That is why I proposed such user interfaces. Of course, these interfaces
> > might be so ugly yet: for this, Maarten pointed already out and I agree
> with
> > him. But there must be another better way. Aren't you think we need
> similar
> > thing? With such interfaces, cache control and buffer synchronization
> can be
> > performed in kernel level. Moreover, user applization doesn't need to
> > consider DMA device controlling anymore. Therefore, one thread can
> access a
> > shared buffer and the other can control DMA device with the shared
> buffer in
> > parallel. We can really make the best use of CPU and DMA idle time. In
> other
> > words, we can really make the best use of multi tasking OS, Linux.
> >
> > So could you please tell me about that there is any reason we don't use
> > public API for it? I think we can add and use public API if NECESSARY.
> 
> we

RE: Introduce a new helper framework for buffer synchronization

2013-05-13 Thread Inki Dae


> -Original Message-
> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> ow...@vger.kernel.org] On Behalf Of Maarten Lankhorst
> Sent: Monday, May 13, 2013 8:41 PM
> To: Inki Dae
> Cc: 'Rob Clark'; 'Daniel Vetter'; 'DRI mailing list'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org; 'linux-fbdev';
> 'Kyungmin Park'; 'myungjoo.ham'; 'YoungJun Cho'
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> Op 13-05-13 13:24, Inki Dae schreef:
> >> and can be solved with userspace locking primitives. No need for the
> >> kernel to get involved.
> >>
> > Yes, that is how we have synchronized buffer between CPU and DMA device
> > until now without buffer synchronization mechanism. I thought that it's
> best
> > to make user not considering anything: user can access a buffer
> regardless
> > of any DMA device controlling and the buffer synchronization is
> performed in
> > kernel level. Moreover, I think we could optimize graphics and
> multimedia
> > hardware performance because hardware can do more works: one thread
> accesses
> > a shared buffer and the other controls DMA device with the shared buffer
> in
> > parallel. Thus, we could avoid sequential processing and that is my
> > intention. Aren't you think about that we could improve hardware
> utilization
> > with such way or other? of course, there could be better way.
> >
> If you don't want to block the hardware the only option is to allocate a
> temp bo and blit to/from it using the hardware.
> OpenGL already does that when you want to read back, because otherwise the
> entire pipeline can get stalled.
> The overhead of command submission for that shouldn't be high, if it is
> you should really try to optimize that first
> before coming up with this crazy scheme.
> 

I have considered all devices that share buffers with the CPU: multimedia,
display controller, and graphics devices (including the GPU). And we could
provide easy-to-use user interfaces for buffer synchronization. Of course,
the proposed user interfaces may still be ugly, but at least I think we need
something for this.

> In that case you still wouldn't give userspace control over the fences. I
> don't see any way that can end well.
> What if userspace never signals? What if userspace gets killed by oom
> killer. Who keeps track of that?
> 

In all cases, all kernel resources attached to a user fence will be released
by the kernel once the fence times out: never signaling, or the process being
killed by the OOM killer, makes the fence time out. And if we use the mmap
mechanism you mentioned before, I think user resources could also be freed
properly.
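The timeout-based cleanup argued for here can be sketched as a toy model: the kernel periodically reaps a fence once it is signaled or its deadline passes, so a process that never signals (or is OOM-killed) cannot pin kernel resources forever. Field and function names are illustrative assumptions:

```c
/* Toy model of kernel-side fence cleanup on timeout.  Illustrative only. */
struct user_fence {
	int signaled;	/* set by DMA_BUF_PUT_FENCE in the proposal */
	long deadline;	/* absolute time, arbitrary units */
	int released;	/* kernel resources freed */
};

/* Called periodically by the kernel side with the current time. */
void fence_reap(struct user_fence *f, long now)
{
	if (f->released)
		return;
	if (f->signaled || now >= f->deadline)
		f->released = 1;	/* free kernel resources */
}
```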

Thanks,
Inki Dae

> ~Maarten
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



RE: Introduce a new helper framework for buffer synchronization

2013-05-13 Thread Inki Dae


> -Original Message-
> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
> Sent: Monday, May 13, 2013 6:52 PM
> To: Inki Dae
> Cc: 'Rob Clark'; 'Daniel Vetter'; 'DRI mailing list'; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org; 'linux-fbdev';
> 'Kyungmin Park'; 'myungjoo.ham'; 'YoungJun Cho'
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> Op 13-05-13 11:21, Inki Dae schreef:
> >
> >> -Original Message-
> >> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
> >> Sent: Monday, May 13, 2013 5:01 PM
> >> To: Inki Dae
> >> Cc: Rob Clark; Daniel Vetter; DRI mailing list; linux-arm-
> >> ker...@lists.infradead.org; linux-media@vger.kernel.org; linux-fbdev;
> >> Kyungmin Park; myungjoo.ham; YoungJun Cho
> >> Subject: Re: Introduce a new helper framework for buffer
> synchronization
> >>
> >> Op 09-05-13 09:33, Inki Dae schreef:
> >>> Hi all,
> >>>
> >>> This post introduces a new helper framework based on dma fence. And
> the
> >>> purpose of this post is to collect other opinions and advices before
> RFC
> >>> posting.
> >>>
> >>> First of all, this helper framework, called fence helper, is in
> progress
> >>> yet so might not have enough comments in codes and also might need to
> be
> >>> more cleaned up. Moreover, we might be missing some parts of the dma
> >> fence.
> >>> However, I'd like to say that all things mentioned below has been
> tested
> >>> with Linux platform and worked well.
> >>> 
> >>>
> >>> And tutorial for user process.
> >>> just before cpu access
> >>> struct dma_buf_fence *df;
> >>>
> >>> df->type = DMA_BUF_ACCESS_READ or DMA_BUF_ACCESS_WRITE;
> >>> ioctl(fd, DMA_BUF_GET_FENCE, &df);
> >>>
> >>> after memset or memcpy
> >>> ioctl(fd, DMA_BUF_PUT_FENCE, &df);
> >> NAK.
> >>
> >> Userspace doesn't need to trigger fences. It can do a buffer idle wait,
> >> and postpone submitting new commands until after it's done using the
> >> buffer.
> > Hi Maarten,
> >
> > It seems that you say user should wait for a buffer like KDS does: KDS
> uses
> > select() to postpone submitting new commands. But I think this way
> assumes
> > that every data flows a DMA device to a CPU. For example, a CPU should
> keep
> > polling for the completion of a buffer access by a DMA device. This
> means
> > that the this way isn't considered for data flow to opposite case; CPU
> to
> > DMA device.
> Not really. You do both things the same way. You first wait for the bo to
> be idle, this could be implemented by adding poll support to the dma-buf
> fd.
> Then you either do your read or write. Since userspace is supposed to be
> the one controlling the bo it should stay idle at that point. If you have
> another thread queueing
> the buffer again before your thread is done, that's a bug in the
> application,
> and can be solved with userspace locking primitives. No need for the
> kernel to get involved.
> 

Yes, that is how we have synchronized buffers between the CPU and DMA devices
until now, without a buffer synchronization mechanism. I thought it best that
the user not have to consider anything: the user can access a buffer
regardless of any DMA device activity, and the buffer synchronization is
performed at the kernel level. Moreover, I think we could improve graphics
and multimedia hardware performance because the hardware can do more work:
one thread accesses a shared buffer while another controls a DMA device with
the same buffer in parallel. Thus, we could avoid sequential processing, and
that is my intention. Don't you think we could improve hardware utilization
this way, or some other way? Of course, there could be a better way.

> >> Kernel space doesn't need the root hole you created by giving a
> >> dereferencing a pointer passed from userspace.
> >> Your next exercise should be to write a security exploit from the api
> you
> >> created here. It's the only way to learn how to write safe code. Hint:
> >> df.ctx = mmap(..);
> >>
> > Also I'm not clear to use our way yet and that is why I posted. As you
> > mentioned, it seems like that using mmap() is more safe. But there is
> one
> > issue

RE: Introduce a new helper framework for buffer synchronization

2013-05-13 Thread Inki Dae


> -Original Message-
> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
> Sent: Monday, May 13, 2013 5:01 PM
> To: Inki Dae
> Cc: Rob Clark; Daniel Vetter; DRI mailing list; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org; linux-fbdev;
> Kyungmin Park; myungjoo.ham; YoungJun Cho
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> Op 09-05-13 09:33, Inki Dae schreef:
> > Hi all,
> >
> > This post introduces a new helper framework based on dma fence. And the
> > purpose of this post is to collect other opinions and advices before RFC
> > posting.
> >
> > First of all, this helper framework, called fence helper, is in progress
> > yet so might not have enough comments in codes and also might need to be
> > more cleaned up. Moreover, we might be missing some parts of the dma
> fence.
> > However, I'd like to say that all things mentioned below has been tested
> > with Linux platform and worked well.
> 
> > 
> >
> > And tutorial for user process.
> > just before cpu access
> > struct dma_buf_fence *df;
> >
> > df->type = DMA_BUF_ACCESS_READ or DMA_BUF_ACCESS_WRITE;
> > ioctl(fd, DMA_BUF_GET_FENCE, &df);
> >
> > after memset or memcpy
> > ioctl(fd, DMA_BUF_PUT_FENCE, &df);
> NAK.
> 
> Userspace doesn't need to trigger fences. It can do a buffer idle wait,
> and postpone submitting new commands until after it's done using the
> buffer.

Hi Maarten,

It seems you are saying the user should wait for a buffer like KDS does: KDS
uses select() to postpone submitting new commands. But I think this approach
assumes that all data flows from a DMA device to the CPU. For example, the
CPU has to keep polling for the completion of a buffer access by a DMA
device. This means the approach doesn't cover data flow in the opposite
direction, from the CPU to a DMA device.

> Kernel space doesn't need the root hole you created by giving a
> dereferencing a pointer passed from userspace.
> Your next exercise should be to write a security exploit from the api you
> created here. It's the only way to learn how to write safe code. Hint:
> df.ctx = mmap(..);
> 

Also, I'm not yet sure about our approach, and that is why I posted. As you
mentioned, using mmap() seems safer. But there is one issue that confuses me.
Regarding your hint, df.ctx = mmap(..): the issue is that dmabuf mmap can be
used to map a dmabuf into user space, and the dmabuf is a physical memory
region allocated by some allocator such as DRM GEM or ION.

There might be a point I am missing, so could you please give me more
comments?

Thanks,
Inki Dae



> ~Maarten



RE: [PATCH v3] drm/exynos: enable FIMD clocks

2013-04-21 Thread Inki Dae
Hi, Mr. Vikas

Please fix the typos below that Viresh pointed out, and also address my comments.

> -Original Message-
> From: Viresh Kumar [mailto:viresh.ku...@linaro.org]
> Sent: Monday, April 01, 2013 5:51 PM
> To: Vikas Sajjan
> Cc: dri-de...@lists.freedesktop.org; linux-samsung-...@vger.kernel.org;
> jy0922.s...@samsung.com; inki@samsung.com; kgene@samsung.com;
> linaro-ker...@lists.linaro.org; linux-media@vger.kernel.org
> Subject: Re: [PATCH v3] drm/exynos: enable FIMD clocks
> 
> On 1 April 2013 14:13, Vikas Sajjan  wrote:
> > While migrating to common clock framework (CCF), found that the FIMD
> clocks
> 
> s/found/we found/
> 
> > were pulled down by the CCF.
> > If CCF finds any clock(s) which has NOT been claimed by any of the
> > drivers, then such clock(s) are PULLed low by CCF.
> >
> > By calling clk_prepare_enable() for FIMD clocks fixes the issue.
> 
> s/By calling/Calling/
> 
> and
> 
> s/the/this
> 
> > this patch also replaces clk_disable() with clk_disable_unprepare()
> 
> s/this/This
> 
> > during exit.
> 
> Sorry but your log doesn't say what you are doing. You are just adding
> relevant calls to clk_prepare/unprepare() before calling
> clk_enable/disable.
> 
> > Signed-off-by: Vikas Sajjan 
> > ---
> > Changes since v2:
> > - moved clk_prepare_enable() and clk_disable_unprepare() from
> > fimd_probe() to fimd_clock() as suggested by Inki Dae
> 
> > Changes since v1:
> > - added error checking for clk_prepare_enable() and also
replaced
> > clk_disable() with clk_disable_unprepare() during exit.
> > ---
> >  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   14 +++---
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> > index 9537761..f2400c8 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> > @@ -799,18 +799,18 @@ static int fimd_clock(struct fimd_context *ctx,
> bool enable)
> > if (enable) {
> > int ret;
> >
> > -   ret = clk_enable(ctx->bus_clk);
> > +   ret = clk_prepare_enable(ctx->bus_clk);
> > if (ret < 0)
> > return ret;
> >
> > -   ret = clk_enable(ctx->lcd_clk);
> > +   ret = clk_prepare_enable(ctx->lcd_clk);
> > if  (ret < 0) {
> > -   clk_disable(ctx->bus_clk);
> > +   clk_disable_unprepare(ctx->bus_clk);
> > return ret;
> > }
> > } else {
> > -   clk_disable(ctx->lcd_clk);
> > -   clk_disable(ctx->bus_clk);
> > +   clk_disable_unprepare(ctx->lcd_clk);
> > +   clk_disable_unprepare(ctx->bus_clk);
> > }
> >
> > return 0;
> > @@ -981,8 +981,8 @@ static int fimd_remove(struct platform_device *pdev)
> >         if (ctx->suspended)
> > goto out;
> >
> > -   clk_disable(ctx->lcd_clk);
> > -   clk_disable(ctx->bus_clk);
> > +   clk_disable_unprepare(ctx->lcd_clk);
> > +   clk_disable_unprepare(ctx->bus_clk);

Just remove the above code. It seems clk_disable() (and likewise
clk_disable_unprepare()) isn't needed here because it will be done by
pm_runtime_put_sync(). Please re-post it (probably as patch v5?).
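For context, the reason clk_enable() alone is not enough under the common clock framework is that clk_enable() is only valid on a clock that has already been prepared, which is why the patch pairs clk_prepare_enable() with clk_disable_unprepare(). A toy refcounting model of that rule (illustrative, not the CCF implementation):

```c
/* Toy model of CCF clock refcounting.  Real clocks also have locking,
 * parent propagation, and hardware callbacks; names here end in
 * _model to make clear these are not the kernel APIs. */
struct clk_model {
	int prepare_count;
	int enable_count;
};

int clk_prepare_enable_model(struct clk_model *c)
{
	c->prepare_count++;	/* clk_prepare(): may sleep */
	c->enable_count++;	/* clk_enable(): ok in atomic context,
				 * but only legal once prepared */
	return 0;
}

int clk_disable_unprepare_model(struct clk_model *c)
{
	if (c->enable_count <= 0 || c->prepare_count <= 0)
		return -1;	/* unbalanced call */
	c->enable_count--;	/* clk_disable() */
	c->prepare_count--;	/* clk_unprepare() */
	return 0;
}
```

Keeping both counters balanced is the point of the review comments in this thread: an unpaired clk_disable() after clk_prepare_enable() would leave a stale prepare count.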

Thanks,
Inki Dae

> 
> You are doing things at the right place but i have a suggestion. Are you
> doing
> anything in your clk_prepare() atleast for this device? Probably not.
> 
> If not, then its better to call clk_prepare/unprepare only once at
> probe/remove
> and keep clk_enable/disable calls as is.
> 
> --
> viresh



Re: [PATCH v2] drm/exynos: enable FIMD clocks

2013-03-27 Thread Inki Dae
2013/3/20 Vikas Sajjan :
> While migrating to common clock framework (CCF), found that the FIMD clocks
> were pulled down by the CCF.
> If CCF finds any clock(s) which has NOT been claimed by any of the
> drivers, then such clock(s) are PULLed low by CCF.
>
> By calling clk_prepare_enable() for FIMD clocks fixes the issue.
>
> this patch also replaces clk_disable() with clk_disable_unprepare()
> during exit.
>
> Signed-off-by: Vikas Sajjan 
> ---
> Changes since v1:
> - added error checking for clk_prepare_enable() and also replaced
> clk_disable() with clk_disable_unprepare() during exit.
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c 
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index 9537761..014d750 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -934,6 +934,19 @@ static int fimd_probe(struct platform_device *pdev)
> return ret;
> }
>
> +   ret = clk_prepare_enable(ctx->lcd_clk);
> +   if (ret) {
> +   dev_err(dev, "failed to enable 'sclk_fimd' clock\n");
> +   return ret;
> +   }
> +
> +   ret = clk_prepare_enable(ctx->bus_clk);
> +   if (ret) {
> +   clk_disable_unprepare(ctx->lcd_clk);
> +   dev_err(dev, "failed to enable 'fimd' clock\n");
> +   return ret;
> +   }
> +

Please remove the above two clk_prepare_enable() calls and use them in
fimd_clock() instead of clk_enable()/clk_disable(). When the device is
probed, the FIMD clock will be enabled via runtime PM.

Thanks,
Inki Dae

> ctx->vidcon0 = pdata->vidcon0;
> ctx->vidcon1 = pdata->vidcon1;
> ctx->default_win = pdata->default_win;
> @@ -981,8 +994,8 @@ static int fimd_remove(struct platform_device *pdev)
> if (ctx->suspended)
> goto out;
>
> -   clk_disable(ctx->lcd_clk);
> -   clk_disable(ctx->bus_clk);
> +   clk_disable_unprepare(ctx->lcd_clk);
> +   clk_disable_unprepare(ctx->bus_clk);
>
> pm_runtime_set_suspended(dev);
> pm_runtime_put_sync(dev);
> --
> 1.7.9.5
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH v12 2/2] drm/exynos: enable OF_VIDEOMODE and FB_MODE_HELPERS for exynos drm fimd

2013-03-07 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Vikas Sajjan
> Sent: Thursday, March 07, 2013 4:40 PM
> To: dri-de...@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org; kgene@samsung.com;
> inki@samsung.com; l.kris...@samsung.com; jo...@samsung.com; linaro-
> ker...@lists.linaro.org
> Subject: [PATCH v12 2/2] drm/exynos: enable OF_VIDEOMODE and
> FB_MODE_HELPERS for exynos drm fimd
> 
> patch adds "select OF_VIDEOMODE" and "select FB_MODE_HELPERS" when
> EXYNOS_DRM_FIMD config is selected.
> 
> Signed-off-by: Vikas Sajjan 
> ---
>  drivers/gpu/drm/exynos/Kconfig |2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/exynos/Kconfig
> b/drivers/gpu/drm/exynos/Kconfig
> index 046bcda..bb25130 100644
> --- a/drivers/gpu/drm/exynos/Kconfig
> +++ b/drivers/gpu/drm/exynos/Kconfig
> @@ -25,6 +25,8 @@ config DRM_EXYNOS_DMABUF
>  config DRM_EXYNOS_FIMD
>   bool "Exynos DRM FIMD"
>   depends on DRM_EXYNOS && !FB_S3C && !ARCH_MULTIPLATFORM

Again, you missed the 'OF' dependency. At the very least, please build-test
before posting. :)

Thanks,
Inki Dae

> + select OF_VIDEOMODE
> + select FB_MODE_HELPERS
>   help
> Choose this option if you want to use Exynos FIMD for DRM.
> 
> --
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-media" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



RE: [PATCH] drm/exynos: modify the compatible string for exynos fimd

2013-03-06 Thread Inki Dae
Already merged. :)

> -Original Message-
> From: Vikas Sajjan [mailto:vikas.saj...@linaro.org]
> Sent: Thursday, March 07, 2013 4:09 PM
> To: InKi Dae
> Cc: dri-de...@lists.freedesktop.org; linux-media@vger.kernel.org;
> kgene@samsung.com; Joonyoung Shim; sunil joshi
> Subject: Re: [PATCH] drm/exynos: modify the compatible string for exynos
> fimd
> 
> Hi Mr Inki Dae,
> 
> On 28 February 2013 08:12, Joonyoung Shim  wrote:
> > On 02/27/2013 07:32 PM, Vikas Sajjan wrote:
> >>
> >> modified compatible string for exynos4 fimd as "exynos4210-fimd" and
> >> exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible
> >> value should be named after first specific SoC model in which this
> >> particular IP version was included as discussed at
> >> https://patchwork.kernel.org/patch/2144861/
> >>
> >> Signed-off-by: Vikas Sajjan 
> >> ---
> >>   drivers/gpu/drm/exynos/exynos_drm_fimd.c |4 ++--
> >>   1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> >> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> >> index 9537761..433ed35 100644
> >> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> >> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> >> @@ -109,9 +109,9 @@ struct fimd_context {
> >> #ifdef CONFIG_OF
> >>   static const struct of_device_id fimd_driver_dt_match[] = {
> >> -   { .compatible = "samsung,exynos4-fimd",
> >> +   { .compatible = "samsung,exynos4210-fimd",
> >>   .data = &exynos4_fimd_driver_data },
> >> -   { .compatible = "samsung,exynos5-fimd",
> >> +   { .compatible = "samsung,exynos5250-fimd",
> >>   .data = &exynos5_fimd_driver_data },
> >> {},
> >>   };
> >
> >
> > Acked-by: Joonyoung Shim 
> 
> Can you please apply this patch.
> 
> >
> > Thanks.
> 
> 
> 
> --
> Thanks and Regards
>  Vikas Sajjan



Re: [PATCH v10 1/2] video: drm: exynos: Add display-timing node parsing using video helper function

2013-03-06 Thread Inki Dae
2013/3/1 Vikas Sajjan :
> Add support for parsing the display-timing node using video helper
> function.
>
> The DT node parsing is done only if 'dev.of_node'
> exists and the NON-DT logic is still maintained under the 'else' part.
>
> Signed-off-by: Leela Krishna Amudala 
> Signed-off-by: Vikas Sajjan 
> Acked-by: Joonyoung Shim 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   24 
>  1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c 
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index 9537761..e323cf9 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>
> +#include 
>  #include 
>  #include 
>
> @@ -883,10 +884,25 @@ static int fimd_probe(struct platform_device *pdev)
>
> DRM_DEBUG_KMS("%s\n", __FILE__);
>
> -   pdata = pdev->dev.platform_data;
> -   if (!pdata) {
> -   dev_err(dev, "no platform data specified\n");
> -   return -EINVAL;
> +   if (pdev->dev.of_node) {
> +   pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
> +   if (!pdata) {
> +   DRM_ERROR("memory allocation for pdata failed\n");
> +   return -ENOMEM;
> +   }
> +
> +   ret = of_get_fb_videomode(dev->of_node, &pdata->panel.timing,
> +   OF_USE_NATIVE_MODE);

Add "select OF_VIDEOMODE" and "select FB_MODE_HELPERS" to
drivers/gpu/drm/exynos/Kconfig. When EXYNOS_DRM_FIMD config is
selected, these two configs should also be selected.

Thanks,
Inki Dae

> +   if (ret) {
> +   DRM_ERROR("failed: of_get_fb_videomode() : %d\n", 
> ret);
> +   return ret;
> +   }
> +   } else {
> +   pdata = pdev->dev.platform_data;
> +   if (!pdata) {
> +   DRM_ERROR("no platform data specified\n");
> +   return -EINVAL;
> +   }
> }
>
> panel = &pdata->panel;
> --
> 1.7.9.5
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH v6 1/1] video: drm: exynos: Add display-timing node parsing using video helper function

2013-02-20 Thread Inki Dae


> -Original Message-
> From: Vikas Sajjan [mailto:vikas.saj...@linaro.org]
> Sent: Friday, February 15, 2013 3:43 PM
> To: dri-de...@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org; kgene@samsung.com;
> inki@samsung.com; l.kris...@samsung.com; patc...@linaro.org
> Subject: [PATCH v6 1/1] video: drm: exynos: Add display-timing node
> parsing using video helper function
> 
> Add support for parsing the display-timing node using video helper
> function.
> 
> The DT node parsing and pinctrl selection is done only if 'dev.of_node'
> exists and the NON-DT logic is still maintained under the 'else' part.
> 
> Signed-off-by: Leela Krishna Amudala 
> Signed-off-by: Vikas Sajjan 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   37
> ++
>  1 file changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index 9537761..8b2c0ff 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -19,7 +19,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
> +#include 
>  #include 
>  #include 
> 
> @@ -877,16 +879,43 @@ static int fimd_probe(struct platform_device *pdev)
>   struct exynos_drm_subdrv *subdrv;
>   struct exynos_drm_fimd_pdata *pdata;
>   struct exynos_drm_panel_info *panel;
> + struct fb_videomode *fbmode;
> + struct pinctrl *pctrl;
>   struct resource *res;
>   int win;
>   int ret = -EINVAL;
> 
>   DRM_DEBUG_KMS("%s\n", __FILE__);
> 
> - pdata = pdev->dev.platform_data;
> - if (!pdata) {
> - dev_err(dev, "no platform data specified\n");
> - return -EINVAL;
> + if (pdev->dev.of_node) {
> + pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
> + if (!pdata) {
> + DRM_ERROR("memory allocation for pdata failed\n");
> + return -ENOMEM;
> + }
> +
> + fbmode = &pdata->panel.timing;
> + ret = of_get_fb_videomode(dev->of_node, fbmode,
> + OF_USE_NATIVE_MODE);
> + if (ret) {
> + DRM_ERROR("failed: of_get_fb_videomode()\n"
> +         "with return value: %d\n", ret);
> + return ret;
> + }
> +
> + pctrl = devm_pinctrl_get_select_default(dev);

Why does it need pinctrl? And even if it is needed, I think this should be
separated into another patch.

Thanks,
Inki Dae

> + if (IS_ERR_OR_NULL(pctrl)) {
> + DRM_ERROR("failed:
> devm_pinctrl_get_select_default()\n"
> + "with return value: %d\n", PTR_RET(pctrl));
> + return PTR_RET(pctrl);
> + }
> +
> + } else {
> + pdata = pdev->dev.platform_data;
> + if (!pdata) {
> + DRM_ERROR("no platform data specified\n");
> + return -EINVAL;
> + }
>   }
> 
>   panel = &pdata->panel;
> --
> 1.7.9.5



Re: [PATCH v5 1/1] video: drm: exynos: Add display-timing node parsing using video helper function

2013-02-14 Thread Inki Dae
2013/2/6 Vikas Sajjan :
> Add support for parsing the display-timing node using video helper
> function.
>
> The DT node parsing and pinctrl selection is done only if 'dev.of_node'
> exists and the NON-DT logic is still maintained under the 'else' part.
>
> Signed-off-by: Leela Krishna Amudala 
> Signed-off-by: Vikas Sajjan 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_fimd.c |   41 
> +++---
>  1 file changed, 37 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c 
> b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index bf0d9ba..978e866 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -905,16 +906,48 @@ static int __devinit fimd_probe(struct platform_device 
> *pdev)
> struct exynos_drm_subdrv *subdrv;
> struct exynos_drm_fimd_pdata *pdata;
> struct exynos_drm_panel_info *panel;
> +   struct fb_videomode *fbmode;
> +   struct pinctrl *pctrl;
> struct resource *res;
> int win;
> int ret = -EINVAL;
>
> DRM_DEBUG_KMS("%s\n", __FILE__);
>
> -   pdata = pdev->dev.platform_data;
> -   if (!pdata) {
> -   dev_err(dev, "no platform data specified\n");
> -   return -EINVAL;
> +   if (pdev->dev.of_node) {
> +   pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
> +   if (!pdata) {
> +   DRM_ERROR("memory allocation for pdata failed\n");
> +   return -ENOMEM;
> +   }
> +
> +   fbmode = devm_kzalloc(dev, sizeof(*fbmode), GFP_KERNEL);
> +   if (!fbmode) {
> +   DRM_ERROR("memory allocation for fbmode failed\n");
> +   return -ENOMEM;
> +   }

There is no need to allocate fbmode here.

> +
> +   ret = of_get_fb_videomode(dev->of_node, fbmode, -1);

What is -1? Use OF_USE_NATIVE_MODE instead (which requires including
"of_display_timing.h") and just change the above code like below:

   fbmode = &pdata->panel.timing;
   ret = of_get_fb_videomode(dev->of_node, fbmode,
OF_USE_NATIVE_MODE);
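
For reference, OF_USE_NATIVE_MODE makes of_get_fb_videomode() pick the timing
referenced by the native-mode property of the panel's display-timings node,
roughly like this (all timing values purely illustrative):

```dts
display-timings {
	native-mode = <&timing0>;	/* the mode OF_USE_NATIVE_MODE selects */
	timing0: timing@0 {
		clock-frequency = <50000000>;
		hactive = <1024>;
		vactive = <600>;
		hfront-porch = <48>;
		hback-porch = <80>;
		hsync-len = <32>;
		vfront-porch = <3>;
		vback-porch = <14>;
		vsync-len = <5>;
	};
};
```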

> +   if (ret) {
> +   DRM_ERROR("failed: of_get_fb_videomode()\n"
> +   "with return value: %d\n", ret);
> +   return ret;
> +   }
> +   pdata->panel.timing = (struct fb_videomode) *fbmode;

remove the above line.

> +
> +   pctrl = devm_pinctrl_get_select_default(dev);
> +   if (IS_ERR_OR_NULL(pctrl)) {
> +   DRM_ERROR("failed: 
> devm_pinctrl_get_select_default()\n"
> +   "with return value: %d\n", PTR_RET(pctrl));
> +   return PTR_RET(pctrl);
> +   }
> +
> +   } else {
> +   pdata = pdev->dev.platform_data;
> +   if (!pdata) {
> +   DRM_ERROR("no platform data specified\n");
> +   return -EINVAL;
> +   }
> }
>
> panel = &pdata->panel;
> --
> 1.7.9.5
>


Re: [PATCH v2 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-12 Thread Inki Dae
2013/2/12 Sylwester Nawrocki :
> On 02/12/2013 02:17 PM, Inki Dae wrote:
>> Applied and will go to -next.
>> And please post the document(in
>> Documentation/devicetree/bindings/gpu/) for it later.
>
> There is already some old patch applied in the devicetree/next tree:
>
> http://git.secretlab.ca/?p=linux.git;a=commitdiff;h=09495dda6a62c74b13412a63528093910ef80edd
>
> I guess there is now an incremental patch needed for this.
>

I think that this patch should be reverted because the compatible
string in this document isn't generic, and the document file should also
be moved to the proper place (.../bindings/gpu/).

So Mr. Grant, could you please revert the below patch?
"of/exynos_g2d: Add Bindings for exynos G2D driver"
commit: 09495dda6a62c74b13412a63528093910ef80edd

This document should be modified and re-posted. For this, we have
already reached an agreement with the other Exynos maintainers.

Thanks,
Inki Dae

>
> Regards,
> Sylwester


Re: [PATCH v2 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-12 Thread Inki Dae
Applied and will go to -next.
And please post the document(in
Documentation/devicetree/bindings/gpu/) for it later.

Thanks,
Inki Dae

2013/2/6 Sachin Kamat :
> From: Ajay Kumar 
>
> This patch adds device tree match table for Exynos G2D controller.
>
> Signed-off-by: Ajay Kumar 
> Signed-off-by: Sachin Kamat 
> ---
> Patch based on exynos-drm-fixes branch of Inki Dae's tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git
>
> Changes since v1:
> Modified the compatible string as per the discussions at [1].
> [1] https://patchwork1.kernel.org/patch/2045821/
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c |   10 ++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index ddcfb5d..0fcfbe4 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -1240,6 +1241,14 @@ static int g2d_resume(struct device *dev)
>
>  static SIMPLE_DEV_PM_OPS(g2d_pm_ops, g2d_suspend, g2d_resume);
>
> +#ifdef CONFIG_OF
> +static const struct of_device_id exynos_g2d_match[] = {
> +   { .compatible = "samsung,exynos5250-g2d" },
> +   {},
> +};
> +MODULE_DEVICE_TABLE(of, exynos_g2d_match);
> +#endif
> +
>  struct platform_driver g2d_driver = {
> .probe  = g2d_probe,
> .remove = g2d_remove,
> @@ -1247,5 +1256,6 @@ struct platform_driver g2d_driver = {
> .name   = "s5p-g2d",
> .owner  = THIS_MODULE,
> .pm = &g2d_pm_ops,
> +   .of_match_table = of_match_ptr(exynos_g2d_match),
> },
>  };
> --
> 1.7.4.1
>


RE: [PATCH v2 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-06 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Sylwester Nawrocki
> Sent: Wednesday, February 06, 2013 8:24 PM
> To: Inki Dae
> Cc: 'Sachin Kamat'; linux-media@vger.kernel.org; dri-
> de...@lists.freedesktop.org; devicetree-disc...@lists.ozlabs.org;
> k.deb...@samsung.com; kgene@samsung.com; patc...@linaro.org; 'Ajay
> Kumar'; kyungmin.p...@samsung.com; sw0312@samsung.com;
> jy0922.s...@samsung.com
> Subject: Re: [PATCH v2 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> On 02/06/2013 09:51 AM, Inki Dae wrote:
> [...]
> > I think that it's better to go to gpu than media and we can divide Exynos
> > IPs into the below categories,
> >
> > Media : mfc
> > GPU : g2d, g3d, fimc, gsc
> 
> Heh, nice try! :) GPU and FIMC ? FIMC is a camera subsystem (hence 'C'
> in the acronym), so what it has really to do with GPU ? All right, this IP
> has really two functions: camera capture and video post-processing
> (colorspace conversion, scaling), but the main feature is camera capture
> (fimc-lite is a camera capture interface IP only).
> 
> Also, Exynos5 GScaler is used as a DMA engine for camera capture data
> pipelines, so it will be used by a camera capture driver as well. It
> really belongs to "Media" and "GPU", as this is a multifunctional
> device (similarly to FIMC).
> 
> So I propose following classification, which seems less inaccurate:
> 
> GPU:   g2d, g3d
> Media: mfc, fimc, fimc-lite, fimc-is, mipi-csis, gsc
> Video: fimd, hdmi, eDP, mipi-dsim
> 

Ok, it seems that your proposal is better. :)

To Sachin,
Please add g2d document to .../bindings/gpu

To Rahul,
Could you please move .../drm/exynos/* to .../bindings/video? Probably you
need to rename the files there to exynos*.txt

If there are no other opinions, let's start. :)

Thanks,
Inki Dae

> I have already a DT bindings description prepared for fimc [1].
> (probably it needs to be rephrased a bit not to refer to the linux
> device model). I put it in Documentation/devicetree/bindings/media/soc,
> but likely there is no need for the 'soc' subdirectory...
> 
> > Video : fimd, hdmi, eDP, MIPI-DSI
> >
> > And I think that the device-tree describes hardware so possibly, all
> > documents in .../bindings/drm/exynos/* should be moved to proper place
> also.
> > Please give  me any opinions.
> 
> Yes, I agree. If possible, it would be nice to have some Linux API
> agnostic locations.
> 
> [1] goo.gl/eTGOl
> 
> --
> 
> Thanks,
> Sylwester


RE: [PATCH v2 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-06 Thread Inki Dae


> -Original Message-
> From: Sachin Kamat [mailto:sachin.ka...@linaro.org]
> Sent: Wednesday, February 06, 2013 5:03 PM
> To: Inki Dae
> Cc: linux-media@vger.kernel.org; dri-de...@lists.freedesktop.org;
> devicetree-disc...@lists.ozlabs.org; k.deb...@samsung.com;
> s.nawro...@samsung.com; kgene@samsung.com; patc...@linaro.org; Ajay
> Kumar
> Subject: Re: [PATCH v2 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> On 6 February 2013 13:02, Inki Dae  wrote:
> >
> > Looks good to me but please add document for it.
> 
> Yes. I will. I was planning to send the bindings document patch along
> with the dt patches (adding node entries to dts files).
> Sylwester had suggested adding this to
> Documentation/devicetree/bindings/media/ which contains other media
> IPs.

I think that it's better to go to gpu than media and we can divide Exynos
IPs into the below categories,

Media : mfc
GPU : g2d, g3d, fimc, gsc
Video : fimd, hdmi, eDP, MIPI-DSI

And since the device tree describes hardware, all the documents in
.../bindings/drm/exynos/* should possibly be moved to a proper place as well.
Please give me your opinions.

Thanks,
Inki Dae

> 
> >
> > To other guys,
> > And is there anyone who know where this document should be added to?
> > I'm not sure that the g2d document should be placed in
> > Documentation/devicetree/bindings/gpu, media, drm/exynos or arm/exynos.
> At
> > least, this document should be shared with the g2d hw relevant drivers
> such
> > as v4l2 and drm. So is ".../bindings/gpu" proper place?
> >
> 
> 
> --
> With warm regards,
> Sachin



RE: [PATCH v2 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-05 Thread Inki Dae
> -Original Message-
> From: Sachin Kamat [mailto:sachin.ka...@linaro.org]
> Sent: Wednesday, February 06, 2013 2:30 PM
> To: linux-media@vger.kernel.org; dri-de...@lists.freedesktop.org;
> devicetree-disc...@lists.ozlabs.org
> Cc: k.deb...@samsung.com; sachin.ka...@linaro.org; inki@samsung.com;
> s.nawro...@samsung.com; kgene@samsung.com; patc...@linaro.org; Ajay
> Kumar
> Subject: [PATCH v2 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> From: Ajay Kumar 
> 
> This patch adds device tree match table for Exynos G2D controller.
> 
> Signed-off-by: Ajay Kumar 
> Signed-off-by: Sachin Kamat 
> ---
> Patch based on exynos-drm-fixes branch of Inki Dae's tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git
> 
> Changes since v1:
> Modified the compatible string as per the discussions at [1].
> [1] https://patchwork1.kernel.org/patch/2045821/
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c |   10 ++
>  1 files changed, 10 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index ddcfb5d..0fcfbe4 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include 
>  #include 
> @@ -1240,6 +1241,14 @@ static int g2d_resume(struct device *dev)
> 
>  static SIMPLE_DEV_PM_OPS(g2d_pm_ops, g2d_suspend, g2d_resume);
> 
> +#ifdef CONFIG_OF
> +static const struct of_device_id exynos_g2d_match[] = {
> + { .compatible = "samsung,exynos5250-g2d" },

Looks good to me, but please add a document for it.

To other guys,
Is there anyone who knows where this document should be added?
I'm not sure whether the g2d document should be placed in
Documentation/devicetree/bindings/gpu, media, drm/exynos, or arm/exynos. At
least, this document should be shared with the g2d hw relevant drivers such
as v4l2 and drm. So is ".../bindings/gpu" the proper place?

Thanks,
Inki Dae

> + {},
> +};
> +MODULE_DEVICE_TABLE(of, exynos_g2d_match);
> +#endif
> +
>  struct platform_driver g2d_driver = {
>   .probe  = g2d_probe,
>   .remove = g2d_remove,
> @@ -1247,5 +1256,6 @@ struct platform_driver g2d_driver = {
>   .name   = "s5p-g2d",
>   .owner  = THIS_MODULE,
>   .pm = &g2d_pm_ops,
> + .of_match_table = of_match_ptr(exynos_g2d_match),
>   },
>  };
> --
> 1.7.4.1



Re: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-04 Thread Inki Dae
2013/2/4 Sachin Kamat :
> On 1 February 2013 18:28, Inki Dae  wrote:
>>
>>
>>
>>
>> 2013. 2. 1. 8:52 PM, Inki Dae  wrote:
>>
>>>
>>>
>>>> -Original Message-
>>>> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
>>>> ow...@vger.kernel.org] On Behalf Of Sachin Kamat
>>>> Sent: Friday, February 01, 2013 8:40 PM
>>>> To: Inki Dae
>>>> Cc: Sylwester Nawrocki; Kukjin Kim; Sylwester Nawrocki; linux-
>>>> me...@vger.kernel.org; dri-de...@lists.freedesktop.org; devicetree-
>>>> disc...@lists.ozlabs.org; patc...@linaro.org
>>>> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
>>>> support for G2D
>>>>
>>>> On 1 February 2013 17:02, Inki Dae  wrote:
>>>>>
>>>>> How about using like below?
>>>>>Compatible = ""samsung,exynos4x12-fimg-2d" /* for Exynos4212,
>>>>> Exynos4412  */
>>>>> It looks odd to use "samsung,exynos4212-fimg-2d" saying that this ip is
>>>> for
>>>>> exynos4212 and exynos4412.
>>>>
>>>> AFAIK, compatible strings are not supposed to have any wildcard
>>> characters.
>>>> Compatible string should suggest the first SoC that contained this IP.
>>>> Hence IMO 4212 is OK.
>>>>
>>
>> Oops, one more thing. AFAIK Exynos4210 also has fimg-2d ip. In this case, we 
>> should use "samsung,exynos4210-fimg-2d" as comparible string and add it to 
>> exynos4210.dtsi?
>
> Exynos4210 has same g2d IP (v3.0) as C110 or V210; so the same
> comptible string will be used for this one too.
>
>> And please check if exynos4212 and 4412 SoCs have same fimg-2d ip. If it's 
>> different, we might need to add ip version property or compatible string to 
>> each dtsi file to identify the ip version.
>
> AFAIK, they both have the same IP (v4.1).
>

Ok, let's use the below,

For exynos4210 SoC,
compatible = "samsung,exynos4210-g2d"

For exynos4x12 SoCs,
compatible = "samsung,exynos4212-g2d"

For exynos5250 and 5410 (in case of Exynos5440, I'm not sure that the SoC
has the same IP),
compatible = "samsung,exynos5250-g2d"
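
Just to illustrate how a single driver could then tell the versions apart,
here is a plain C model of the lookup that the match table's .data field
performs (illustrative only, not the kernel of_device_id API; the 5250 entry
assumes the v4.1 of the original "samsung,g2d-v41" match):

```c
#include <string.h>

/* Hypothetical mapping from the agreed compatible strings to G2D IP versions. */
struct g2d_variant { const char *compatible; int major, minor; };

static const struct g2d_variant g2d_variants[] = {
	{ "samsung,exynos4210-g2d", 3, 0 },	/* Exynos4210: G2D v3.0 */
	{ "samsung,exynos4212-g2d", 4, 1 },	/* Exynos4212/4412: G2D v4.1 */
	{ "samsung,exynos5250-g2d", 4, 1 },	/* Exynos5250 */
};

/* Return the variant matching a compatible string, or NULL if unknown. */
static const struct g2d_variant *g2d_lookup(const char *compat)
{
	size_t i;

	for (i = 0; i < sizeof(g2d_variants) / sizeof(g2d_variants[0]); i++)
		if (strcmp(g2d_variants[i].compatible, compat) == 0)
			return &g2d_variants[i];
	return NULL;
}
```

In the real driver this lookup is what of_match_node() does, with the version
data hung off the .data pointer of each of_device_id entry.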

To other guys,
The device tree is used not only by the v4l2 side but also by the drm side,
so we should reach an agreement. Please give me your ack if you agree with
my opinion; otherwise, please give me your opinions.

Thanks,
Inki Dae


>>
>> Sorry but give me your opinions.
>>
>> Thanks,
>> Inki Dae
>>
>>
>>>
>>> Got it. Please post it again.
>>>
>>>>
>>>> --
>>>> With warm regards,
>>>> Sachin
>
>
>
> --
> With warm regards,
> Sachin


Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support

2013-02-03 Thread Inki Dae
> +/**
> + * ticket_commit - commit a reservation with a new fence
> + * @ticket:[in]the reservation_ticket returned by
> + * ticket_reserve
> + * @entries:   [in]a linked list of struct reservation_entry
> + * @fence: [in]the fence that indicates completion
> + *
> + * This function will call reservation_ticket_fini, no need
> + * to do it manually.
> + *
> + * This function should be called after a hardware command submission is
> + * completed succesfully. The fence is used to indicate completion of
> + * those commands.
> + */
> +void
> +ticket_commit(struct reservation_ticket *ticket,
> + struct list_head *entries, struct fence *fence)
> +{
> +   struct list_head *cur;
> +
> +   if (list_empty(entries))
> +   return;
> +
> +   if (WARN_ON(!fence)) {
> +   ticket_backoff(ticket, entries);
> +   return;
> +   }
> +
> +   list_for_each(cur, entries) {
> +   struct reservation_object *bo;
> +   bool shared;
> +
> +   reservation_entry_get(cur, &bo, &shared);
> +
> +   if (!shared) {
> +   int i;
> +   for (i = 0; i < bo->fence_shared_count; ++i) {
> +   fence_put(bo->fence_shared[i]);
> +   bo->fence_shared[i] = NULL;
> +   }
> +   bo->fence_shared_count = 0;
> +   if (bo->fence_excl)
> +   fence_put(bo->fence_excl);
> +
> +   bo->fence_excl = fence;
> +   } else {
> +   if (WARN_ON(bo->fence_shared_count >=
> +   ARRAY_SIZE(bo->fence_shared))) {
> +   mutex_unreserve_unlock(&bo->lock);
> +   continue;
> +   }
> +
> +   bo->fence_shared[bo->fence_shared_count++] = fence;
> +   }

Hi,

I have some questions about fence_excl and fence_shared. In the above
code, if bo->fence_excl is not NULL, it is put and a new fence is set
in its place. This looks like whoever committed the new fence wants to
access the given dmabuf exclusively, even if someone else is already
accessing it.

On the other hand, in the case of fence_shared, someone wants to access
that dmabuf non-exclusively, so the given dmabuf could be accessed by
two or more devices. So I guess that fence_excl could be used for write
access (which may need buffer sync, i.e. blocking) and fence_shared for
read access (which may not need buffer sync). I'm not sure that I
understand these two things correctly, so could you please give me more
comments on them?
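
If I model the semantics I'm guessing at as a tiny standalone sketch
(illustrative names only, not the real reservation API): readers accumulate
in the shared slots, while a writer supersedes the previous writer and all
readers:

```c
/* Hypothetical, simplified model of a reservation object: one exclusive
 * fence (last writer) plus an array of shared fences (concurrent readers).
 * Refcounting and locking are omitted for clarity. */
#define MAX_SHARED 4

struct fence { int id; };

struct resv_model {
	struct fence *excl;			/* last writer, if any */
	struct fence *shared[MAX_SHARED];	/* concurrent readers */
	int shared_count;
};

/* A new writer must wait for the old writer and every reader, then its
 * fence replaces them all. */
static void commit_excl(struct resv_model *r, struct fence *f)
{
	r->shared_count = 0;	/* readers are superseded */
	r->excl = f;		/* old writer dropped, new writer installed */
}

/* A new reader only waits for the writer; reader fences accumulate. */
static int commit_shared(struct resv_model *r, struct fence *f)
{
	if (r->shared_count >= MAX_SHARED)
		return -1;	/* no free shared slot */
	r->shared[r->shared_count++] = f;
	return 0;
}
```

So under that reading, a write (exclusive) commit blocks on everything and
resets the shared list, while reads (shared) can proceed in parallel.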

Thanks,
Inki Dae

> +   fence_get(fence);
> +
> +   mutex_unreserve_unlock(&bo->lock);
> +   }
> +   reservation_ticket_fini(ticket);
> +}
> +EXPORT_SYMBOL(ticket_commit);


Re: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-01 Thread Inki Dae




2013. 2. 1. 8:52 PM, Inki Dae  wrote:

> 
> 
>> -Original Message-
>> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
>> ow...@vger.kernel.org] On Behalf Of Sachin Kamat
>> Sent: Friday, February 01, 2013 8:40 PM
>> To: Inki Dae
>> Cc: Sylwester Nawrocki; Kukjin Kim; Sylwester Nawrocki; linux-
>> me...@vger.kernel.org; dri-de...@lists.freedesktop.org; devicetree-
>> disc...@lists.ozlabs.org; patc...@linaro.org
>> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
>> support for G2D
>> 
>> On 1 February 2013 17:02, Inki Dae  wrote:
>>> 
>>> How about using like below?
>>>Compatible = ""samsung,exynos4x12-fimg-2d" /* for Exynos4212,
>>> Exynos4412  */
>>> It looks odd to use "samsung,exynos4212-fimg-2d" saying that this ip is
>> for
>>> exynos4212 and exynos4412.
>> 
>> AFAIK, compatible strings are not supposed to have any wildcard
> characters.
>> Compatible string should suggest the first SoC that contained this IP.
>> Hence IMO 4212 is OK.
>> 

Oops, one more thing. AFAIK Exynos4210 also has the fimg-2d IP. In this case, 
should we use "samsung,exynos4210-fimg-2d" as the compatible string and add it to 
exynos4210.dtsi?
And please check whether the exynos4212 and 4412 SoCs have the same fimg-2d IP. If it's 
different, we might need to add an IP version property or a compatible string to 
each dtsi file to identify the IP version.

Sorry but give me your opinions.

Thanks,
Inki Dae


> 
> Got it. Please post it again.
> 
>> 
>> --
>> With warm regards,
>> Sachin


RE: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-01 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Sachin Kamat
> Sent: Friday, February 01, 2013 8:40 PM
> To: Inki Dae
> Cc: Sylwester Nawrocki; Kukjin Kim; Sylwester Nawrocki; linux-
> me...@vger.kernel.org; dri-de...@lists.freedesktop.org; devicetree-
> disc...@lists.ozlabs.org; patc...@linaro.org
> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> On 1 February 2013 17:02, Inki Dae  wrote:
> >
> > How about using like below?
> > Compatible = ""samsung,exynos4x12-fimg-2d" /* for Exynos4212,
> > Exynos4412  */
> > It looks odd to use "samsung,exynos4212-fimg-2d" saying that this ip is
> for
> > exynos4212 and exynos4412.
> 
> AFAIK, compatible strings are not supposed to have any wildcard
characters.
> Compatible string should suggest the first SoC that contained this IP.
> Hence IMO 4212 is OK.
> 

Got it. Please post it again.

> 
> --
> With warm regards,
> Sachin


RE: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-02-01 Thread Inki Dae


> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Sachin Kamat
> Sent: Friday, February 01, 2013 8:13 PM
> To: Sylwester Nawrocki
> Cc: Inki Dae; Kukjin Kim; Sylwester Nawrocki; linux-media@vger.kernel.org;
> dri-de...@lists.freedesktop.org; devicetree-disc...@lists.ozlabs.org;
> patc...@linaro.org
> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> >> In any case please let me know the final preferred one so that I can
> >> update the code send the revised patches.
> >
> > The version with SoC name embedded in it seems most reliable and correct
> > to me.
> >
> > compatible = "samsung,exynos3110-fimg-2d" /* for Exynos3110 (S5PC110,
> S5PV210),
> >  Exynos4210 */
> > compatible = "samsung,exynos4212-fimg-2d" /* for Exynos4212, Exynos4412
> */
> >
> Looks good to me.
> 
> Inki, Kukjin, please let us know your opinion so that we can freeze
> this. Also please suggest the SoC name for Exynos5 (5250?).
> 

How about using something like below?
	compatible = "samsung,exynos4x12-fimg-2d" /* for Exynos4212,
Exynos4412 */

It looks odd to use "samsung,exynos4212-fimg-2d" while saying that this IP is
for both exynos4212 and exynos4412.


> --
> With warm regards,
> Sachin


RE: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-01-31 Thread Inki Dae
Hi Kukjin,

> -Original Message-
> From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
> ow...@vger.kernel.org] On Behalf Of Kukjin Kim
> Sent: Friday, February 01, 2013 9:15 AM
> To: 'Sylwester Nawrocki'; 'Inki Dae'
> Cc: 'Sachin Kamat'; linux-media@vger.kernel.org; dri-
> de...@lists.freedesktop.org; devicetree-disc...@lists.ozlabs.org;
> patc...@linaro.org; s.nawro...@samsung.com
> Subject: RE: [PATCH 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> Sylwester Nawrocki wrote:
> >
> > Hi Inki,
> >
> Hi Sylwester and Inki,
> 
> > On 01/31/2013 02:30 AM, Inki Dae wrote:
> > >> -Original Message-
> > >> From: Sylwester Nawrocki [mailto:sylvester.nawro...@gmail.com]
> > >> Sent: Thursday, January 31, 2013 5:51 AM
> > >> To: Inki Dae
> > >> Cc: Sachin Kamat; linux-media@vger.kernel.org; dri-
> > >> de...@lists.freedesktop.org; devicetree-disc...@lists.ozlabs.org;
> > >> patc...@linaro.org; s.nawro...@samsung.com
> > >> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
> > >> support for G2D
> > >>
> > >> On 01/30/2013 09:50 AM, Inki Dae wrote:
> > >>>> +static const struct of_device_id exynos_g2d_match[] = {
> > >>>> +   { .compatible = "samsung,g2d-v41" },
> > >>>
> > >>> not only Exynos5 and also Exyno4 has the g2d gpu and drm-based g2d
> > >>> driver shoud support for all Exynos SoCs. How about using
> > >>> "samsung,exynos5-g2d" instead and adding a new property 'version' to
> > >>> identify ip version more surely? With this, we could know which SoC
> > >>> and its g2d ip version. The version property could have '0x14' or
> > >>> others. And please add descriptions to dt document.
> > >>
> > >> Err no. Are you suggesting using "samsung,exynos5-g2d" compatible
> > string
> > >> for Exynos4 specific IPs ? This would not be correct, and you still
> can
> > >
> > > I assumed the version 'v41' is the ip for Exynos5 SoC. So if this
> version
> > > means Exynos4 SoC then it should be "samsung,exynos4-g2d".
> >
> > Yes, v3.0 is implemented in the S5PC110 (Exynos3110) SoCs and
Exynos4210,
> > V4.1 can be found in Exynos4212 and Exynos4412, if I'm not mistaken.
> >
> > So we could have:
> >
> > compatible = "samsung,exynos-g2d-3.0" /* for Exynos3110, Exynos4210 */
> > compatible = "samsung,exynos-g2d-4.1" /* for Exynos4212, Exynos4412 */
> >
> In my opinion, this is better than later. Because as I said, when we can
> use
> IP version to identify, it is more clear and can be used
> 
> One more, how about following?
> 
> compatible = "samsung,g2d-3.0"
> compatible = "samsung,g2d-4.1"
> 

I think the compatible string should be considered case by case.

For example, if compatible = "samsung,g2d-3.0" is added to exynos4210.dtsi,
that would be reasonable. But what if that compatible string is added to
exynos4.dtsi? That case doesn't account for the Exynos4412 SoC, which has v4.1.

So shouldn't the compatible string at least include the SoC name, so that it
can be added to the proper dtsi file? And I'm not sure how the IP version
should be dealt with as of now :( Is it really enough to know the IP version
implicitly (i.e. the exynos4412 string implies that its G2D IP version is
v4.1, so the device driver looks up the necessary data through
of_device_id's data)?
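The of_device_id data lookup mentioned above can be sketched in plain
userspace C. Everything here (g2d_variant, of_id_mock, g2d_lookup) is an
illustrative mock under the assumption of SoC-specific compatible strings;
it is not the actual exynos_drm_g2d.c code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-IP driver data keyed by SoC-specific compatible
 * strings, mirroring how of_device_id's .data field could carry the
 * G2D IP version. */
struct g2d_variant {
	unsigned int ip_major;
	unsigned int ip_minor;
};

static const struct g2d_variant g2d_v30 = { 3, 0 }; /* Exynos3110/4210 */
static const struct g2d_variant g2d_v41 = { 4, 1 }; /* Exynos4212/4412 */

/* Minimal stand-in for struct of_device_id: compatible + data. */
struct of_id_mock {
	const char *compatible;
	const struct g2d_variant *data;
};

static const struct of_id_mock exynos_g2d_match[] = {
	{ "samsung,exynos3110-g2d", &g2d_v30 },
	{ "samsung,exynos4212-g2d", &g2d_v41 },
	{ NULL, NULL }, /* sentinel */
};

/* Userspace mock of an of_match_node()-style lookup. */
static const struct g2d_variant *g2d_lookup(const char *compatible)
{
	const struct of_id_mock *id;

	for (id = exynos_g2d_match; id->compatible; id++)
		if (strcmp(id->compatible, compatible) == 0)
			return id->data;
	return NULL;
}
```

With SoC-specific compatible strings placed in the proper dtsi files, the
driver would recover the IP version from the matched entry's data rather
than parsing it out of the string or a separate version property.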


> I think, just g2d is enough. For example, we are using it for mfc like
> following: compatible = "samsung.mfc-v6"
> 
> > or alternatively
> >
> > compatible = "samsung,exynos3110-g2d" /* for Exynos3110, Exynos4210 */
> > compatible = "samsung,exynos4212-g2d" /* for Exynos4212, Exynos4412 */
> >

So, IMO, I think this is better than the first one.

Thanks,
Inki Dae

> Thanks.
> 
> - Kukjin
> 



Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)

2013-01-31 Thread Inki Dae
2013/1/31 Daniel Vetter :
> On Thu, Jan 31, 2013 at 06:32:15PM +0900, Inki Dae wrote:
>> Hi,
>>
>> below is my opinion.
>>
>> > +struct fence;
>> > +struct fence_ops;
>> > +struct fence_cb;
>> > +
>> > +/**
>> > + * struct fence - software synchronization primitive
>> > + * @refcount: refcount for this fence
>> > + * @ops: fence_ops associated with this fence
>> > + * @cb_list: list of all callbacks to call
>> > + * @lock: spin_lock_irqsave used for locking
>> > + * @priv: fence specific private data
>> > + * @flags: A mask of FENCE_FLAG_* defined below
>> > + *
>> > + * the flags member must be manipulated and read using the appropriate
>> > + * atomic ops (bit_*), so taking the spinlock will not be needed most
>> > + * of the time.
>> > + *
>> > + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
>> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
>> > + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
>> > + * implementer of the fence for its own purposes. Can be used in different
>> > + * ways by different fence implementers, so do not rely on this.
>> > + *
>> > + * *) Since atomic bitops are used, this is not guaranteed to be the case.
>> > + * Particularly, if the bit was set, but fence_signal was called right
>> > + * before this bit was set, it would have been able to set the
>> > + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
>> > + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
>> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
>> > + * after fence_signal was called, any enable_signaling call will have 
>> > either
>> > + * been completed, or never called at all.
>> > + */
>> > +struct fence {
>> > +   struct kref refcount;
>> > +   const struct fence_ops *ops;
>> > +   struct list_head cb_list;
>> > +   spinlock_t *lock;
>> > +   unsigned context, seqno;
>> > +   unsigned long flags;
>> > +};
>> > +
>> > +enum fence_flag_bits {
>> > +   FENCE_FLAG_SIGNALED_BIT,
>> > +   FENCE_FLAG_ENABLE_SIGNAL_BIT,
>> > +   FENCE_FLAG_USER_BITS, /* must always be last member */
>> > +};
>> > +
>>
>> It seems like that this fence framework need to add read/write flags.
>> In case of two read operations, one might wait for another one. But
>> the another is just read operation so we doesn't need to wait for it.
>> Shouldn't fence-wait-request be ignored? In this case, I think it's
>> enough to consider just only write operation.
>>
>> For this, you could add the following,
>>
>> enum fence_flag_bits {
>> ...
>> FENCE_FLAG_ACCESS_READ_BIT,
>> FENCE_FLAG_ACCESS_WRITE_BIT,
>> ...
>> };
>>
>> And the producer could call fence_init() like below,
>> __fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);
>>
>> With this, fence->flags has FENCE_FLAG_ACCESS_WRITE_BIT as write
>> operation and then other sides(read or write operation) would wait for
>> the write operation completion.
>> And also consumer calls that function with FENCE_FLAG_ACCESS_READ_BIT
>> so that other consumers could ignore the fence-wait to any read
>> operations.
>
> Fences here match more to the sync-points concept from the android stuff.
> The idea is that they only signal when a hw operation completes.
>
> Synchronization integration happens at the dma_buf level, where you can
> specify whether the new operation you're doing is exclusive (which means
> that you need to wait for all previous operations to complete), i.e. a
> write. Or whether the operation is non-excluses (i.e. just reading) in
> which case you only need to wait for any still outstanding exclusive
> fences attached to the dma_buf. But you _can_ attach more than one
> non-exclusive fence to a dma_buf at the same time, and so e.g. read a
> buffer objects from different engines concurrently.
>
> There's been some talk whether we also need a non-exclusive write
> attachment (i.e. allow multiple concurrent writers), but I don't yet fully
> understand the use-case.
>
> In short the proposed patches can do what you want to do, it's just that
> read/write access isn't part of the fences, but how you attach fences to
> dma_bufs.
>

Thanks for comments, Maarten and Daniel.

I think I understand as your

Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)

2013-01-31 Thread Inki Dae
Hi,

below is my opinion.

> +struct fence;
> +struct fence_ops;
> +struct fence_cb;
> +
> +/**
> + * struct fence - software synchronization primitive
> + * @refcount: refcount for this fence
> + * @ops: fence_ops associated with this fence
> + * @cb_list: list of all callbacks to call
> + * @lock: spin_lock_irqsave used for locking
> + * @priv: fence specific private data
> + * @flags: A mask of FENCE_FLAG_* defined below
> + *
> + * the flags member must be manipulated and read using the appropriate
> + * atomic ops (bit_*), so taking the spinlock will not be needed most
> + * of the time.
> + *
> + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
> + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
> + * implementer of the fence for its own purposes. Can be used in different
> + * ways by different fence implementers, so do not rely on this.
> + *
> + * *) Since atomic bitops are used, this is not guaranteed to be the case.
> + * Particularly, if the bit was set, but fence_signal was called right
> + * before this bit was set, it would have been able to set the
> + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
> + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
> + * after fence_signal was called, any enable_signaling call will have either
> + * been completed, or never called at all.
> + */
> +struct fence {
> +   struct kref refcount;
> +   const struct fence_ops *ops;
> +   struct list_head cb_list;
> +   spinlock_t *lock;
> +   unsigned context, seqno;
> +   unsigned long flags;
> +};
> +
> +enum fence_flag_bits {
> +   FENCE_FLAG_SIGNALED_BIT,
> +   FENCE_FLAG_ENABLE_SIGNAL_BIT,
> +   FENCE_FLAG_USER_BITS, /* must always be last member */
> +};
> +

It seems this fence framework needs read/write flags.
With two read operations, one might wait for the other, but since the
other is just a read there is no need to wait for it. Shouldn't the
fence-wait request be ignored in that case? I think it's enough to
consider only write operations.

For this, you could add the following,

enum fence_flag_bits {
...
FENCE_FLAG_ACCESS_READ_BIT,
FENCE_FLAG_ACCESS_WRITE_BIT,
...
};

And the producer could call fence_init() like below,
__fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);

With this, fence->flags would have FENCE_FLAG_ACCESS_WRITE_BIT set for a
write operation, and then the other sides (read or write operations) would
wait for the write operation to complete.
Likewise, a consumer would call that function with
FENCE_FLAG_ACCESS_READ_BIT so that other consumers could skip the
fence-wait for read-only operations.
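The waiting rule being proposed here — readers need not wait on other
readers, while a write must wait on everything — can be modeled in a few
lines of userspace C. The names (pending_op, must_wait) are hypothetical
and not part of the fence patches:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each pending operation on a buffer is either a
 * write (exclusive) or a read (non-exclusive). */
enum op_kind { OP_READ, OP_WRITE };

struct pending_op {
	enum op_kind kind;
	bool signaled; /* already completed? */
};

/* Would a new operation of 'kind' have to wait, given the pending ops? */
static bool must_wait(const struct pending_op *ops, size_t n,
		      enum op_kind kind)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ops[i].signaled)
			continue;
		/* A writer waits on everything; a reader only on writers. */
		if (kind == OP_WRITE || ops[i].kind == OP_WRITE)
			return true;
	}
	return false;
}
```

In the patch series as posted, this distinction ends up living at the
dma_buf level (exclusive vs. non-exclusive fence attachment) rather than
as flag bits inside the fence itself.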

Thanks,
Inki Dae


RE: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-01-30 Thread Inki Dae


> -Original Message-
> From: Sylwester Nawrocki [mailto:sylvester.nawro...@gmail.com]
> Sent: Thursday, January 31, 2013 5:51 AM
> To: Inki Dae
> Cc: Sachin Kamat; linux-media@vger.kernel.org; dri-
> de...@lists.freedesktop.org; devicetree-disc...@lists.ozlabs.org;
> patc...@linaro.org; s.nawro...@samsung.com
> Subject: Re: [PATCH 2/2] drm/exynos: Add device tree based discovery
> support for G2D
> 
> On 01/30/2013 09:50 AM, Inki Dae wrote:
> >> +static const struct of_device_id exynos_g2d_match[] = {
> >> +   { .compatible = "samsung,g2d-v41" },
> >
> > not only Exynos5 and also Exyno4 has the g2d gpu and drm-based g2d
> > driver shoud support for all Exynos SoCs. How about using
> > "samsung,exynos5-g2d" instead and adding a new property 'version' to
> > identify ip version more surely? With this, we could know which SoC
> > and its g2d ip version. The version property could have '0x14' or
> > others. And please add descriptions to dt document.
> 
> Err no. Are you suggesting using "samsung,exynos5-g2d" compatible string
> for Exynos4 specific IPs ? This would not be correct, and you still can

I assumed version 'v41' is the IP for the Exynos5 SoC. So if this version
means an Exynos4 SoC, then it should be "samsung,exynos4-g2d".

> match the driver with multiple different revisions of the IP and associate
> any required driver's private data with each corresponding compatible
> property.
> 

Right, and as for why I prefer a version property to an embedded version
string, you can refer to the comment I already made in the
"drm/exynos: Get HDMI version from device tree" email thread.

> Perhaps it would make more sense to include the SoCs name in the
> compatible
> string, e.g. "samsung,exynos-g2d-v41", but appending revision of the IP
> seems acceptable to me. The revisions appear to be well documented and
> it's
> more or less clear which one corresponds to which SoC.
> 
> --
> 
> Thanks,
> Sylwester



Re: [PATCH 2/2] drm/exynos: Add device tree based discovery support for G2D

2013-01-30 Thread Inki Dae
2013/1/25 Sachin Kamat :
> From: Ajay Kumar 
>
> This patch adds device tree match table for Exynos G2D controller.
>
> Signed-off-by: Ajay Kumar 
> Signed-off-by: Sachin Kamat 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_g2d.c |   10 ++
>  1 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
> b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index ddcfb5d..d24b170 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -1240,6 +1241,14 @@ static int g2d_resume(struct device *dev)
>
>  static SIMPLE_DEV_PM_OPS(g2d_pm_ops, g2d_suspend, g2d_resume);
>
> +#ifdef CONFIG_OF
> +static const struct of_device_id exynos_g2d_match[] = {
> +   { .compatible = "samsung,g2d-v41" },

Not only Exynos5 but also Exynos4 has the G2D GPU, and the DRM-based G2D
driver should support all Exynos SoCs. How about using
"samsung,exynos5-g2d" instead and adding a new 'version' property to
identify the IP version more reliably? With this, we could know which SoC
we are on and its G2D IP version. The version property could hold '0x14' or
other values. And please add descriptions to the DT document.

> +   {},
> +};
> +MODULE_DEVICE_TABLE(of, exynos_g2d_match);
> +#endif
> +
>  struct platform_driver g2d_driver = {
> .probe  = g2d_probe,
> .remove = g2d_remove,
> @@ -1247,5 +1256,6 @@ struct platform_driver g2d_driver = {
> .name   = "s5p-g2d",
> .owner  = THIS_MODULE,
> .pm = &g2d_pm_ops,
> +   .of_match_table = of_match_ptr(exynos_g2d_match),
> },
>  };
> --
> 1.7.4.1
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)

2013-01-24 Thread Inki Dae
2013/1/16 Maarten Lankhorst :
> Op 16-01-13 07:28, Inki Dae schreef:
>> 2013/1/15 Maarten Lankhorst :
>>> This type of fence can be used with hardware synchronization for simple
>>> hardware that can block execution until the condition
>>> (dma_buf[offset] - value) >= 0 has been met.
>>>
>>> A software fallback still has to be provided in case the fence is used
>>> with a device that doesn't support this mechanism. It is useful to expose
>>> this for graphics cards that have an op to support this.
>>>
>>> Some cards like i915 can export those, but don't have an option to wait,
>>> so they need the software fallback.
>>>
>>> I extended the original patch by Rob Clark.
>>>
>>> v1: Original
>>> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>>> not much was left of the file. Lots of documentation added.
>>> v3: Use fence_ops instead of custom callbacks. Moved to own file
>>> to avoid circular dependency between dma-buf.h and fence.h
>>> v4: Add spinlock pointer to seqno_fence_init
>>>
>>> Signed-off-by: Maarten Lankhorst 
>>> ---
>>>  Documentation/DocBook/device-drivers.tmpl |   1 +
>>>  drivers/base/fence.c  |  38 +++
>>>  include/linux/seqno-fence.h   | 105 
>>> ++
>>>  3 files changed, 144 insertions(+)
>>>  create mode 100644 include/linux/seqno-fence.h
>>>
>>> diff --git a/Documentation/DocBook/device-drivers.tmpl 
>>> b/Documentation/DocBook/device-drivers.tmpl
>>> index 6f53fc0..ad14396 100644
>>> --- a/Documentation/DocBook/device-drivers.tmpl
>>> +++ b/Documentation/DocBook/device-drivers.tmpl
>>> @@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
>>>  !Edrivers/base/dma-buf.c
>>>  !Edrivers/base/fence.c
>>>  !Iinclude/linux/fence.h
>>> +!Iinclude/linux/seqno-fence.h
>>>  !Edrivers/base/dma-coherent.c
>>>  !Edrivers/base/dma-mapping.c
>>>   
>>> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
>>> index 28e5ffd..1d3f29c 100644
>>> --- a/drivers/base/fence.c
>>> +++ b/drivers/base/fence.c
>>> @@ -24,6 +24,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  atomic_t fence_context_counter = ATOMIC_INIT(0);
>>>  EXPORT_SYMBOL(fence_context_counter);
>>> @@ -284,3 +285,40 @@ out:
>>> return ret;
>>>  }
>>>  EXPORT_SYMBOL(fence_default_wait);
>>> +
>>> +static bool seqno_enable_signaling(struct fence *fence)
>>> +{
>>> +   struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +   return seqno_fence->ops->enable_signaling(fence);
>>> +}
>>> +
>>> +static bool seqno_signaled(struct fence *fence)
>>> +{
>>> +   struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +   return seqno_fence->ops->signaled && 
>>> seqno_fence->ops->signaled(fence);
>>> +}
>>> +
>>> +static void seqno_release(struct fence *fence)
>>> +{
>>> +   struct seqno_fence *f = to_seqno_fence(fence);
>>> +
>>> +   dma_buf_put(f->sync_buf);
>>> +   if (f->ops->release)
>>> +   f->ops->release(fence);
>>> +   else
>>> +   kfree(f);
>>> +}
>>> +
>>> +static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
>>> +{
>>> +   struct seqno_fence *f = to_seqno_fence(fence);
>>> +   return f->ops->wait(fence, intr, timeout);
>>> +}
>>> +
>>> +const struct fence_ops seqno_fence_ops = {
>>> +   .enable_signaling = seqno_enable_signaling,
>>> +   .signaled = seqno_signaled,
>>> +   .wait = seqno_wait,
>>> +   .release = seqno_release
>>> +};
>>> +EXPORT_SYMBOL_GPL(seqno_fence_ops);
>>> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
>>> new file mode 100644
>>> index 000..603adc0
>>> --- /dev/null
>>> +++ b/include/linux/seqno-fence.h
>>> @@ -0,0 +1,105 @@
>>> +/*
>>> + * seqno-fence, using a dma-buf to synchronize fencing
>>> + *
>>> + * Copyright (C) 2012 Texas Instruments
>>> + * Copyright (C) 2012 Canonical Ltd
>>> + * Authors:
>>> + *   Rob Clark 
>

Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)

2013-01-15 Thread Inki Dae
;
> +   struct dma_buf *sync_buf;
> +   uint32_t seqno_ofs;
> +};

Hi Maarten,

I'm applying dma-fence v11 and seqno-fence v4 to Exynos DRM and have
some proposals.

The above seqno_fence structure has only one dmabuf. Shouldn't it have
multiple dmabufs? For example, in the case of a DRM driver, when a
pageflip is requested, one framebuffer could have more than one GEM
buffer (e.g. for the NV12M format). This means that more than one
exported dmabuf should be synchronized with other devices. Below is a
simple structure for it,

struct seqno_fence_dmabuf {
        struct list_head        list;
        int                     id;
        struct dma_buf          *sync_buf;
        uint32_t                seqno_ofs;
        uint32_t                seqno;
};

The id member could be used to identify which device sync_buf is going to
be accessed by. In the case of a DRM driver, one framebuffer could be
accessed by more than one device: one the display controller, another the
HDMI controller. So id would hold the CRTC number.

And seqno_fence structure could be defined like below,

struct seqno_fence {
        struct list_head        sync_buf_list;
        struct fence            base;
        const struct fence_ops  *ops;
};

In addition, I have implemented a fence-helper framework for SW sync as a
WIP; below are the interfaces for it,

struct fence_helper {
        struct list_head                entries;
        struct reservation_ticket       ticket;
        struct seqno_fence              *sf;
        spinlock_t                      lock;
        void                            *priv;
};

int fence_helper_init(struct fence_helper *fh, void *priv, void
(*release)(struct fence *fence));
- This function is called at driver open, so each process-unique context
gets a new seqno_fence instance. It just calls seqno_fence_init(),
initializes the entries list, and sets a device-specific fence release
callback.

bool fence_helper_check_sync_buf(struct fence_helper *fh, struct
dma_buf *sync_buf, int id);
- This function is called before DMA is started and checks whether the
same sync_bufs have already been committed to the reservation_object,
bo->fence_shared[n]. And id could be used to identify which device
sync_buf is going to be accessed by.

int fence_helper_add_sync_buf(struct fence_helper *fh, struct dma_buf
*sync_buf, int id);
- This function is called if fence_helper_check_sync_buf() returns true,
and adds sync_buf to the seqno_fence's sync_buf_list, wrapped in a
seqno_fence_dmabuf structure. With this call, one seqno_fence instance
can hold multiple sync_bufs. At this point, a reference to the sync_buf
is taken.

void fence_helper_del_sync_buf(struct fence_helper *fh, int id);
- This function is called to release the relevant resources if some
operation fails after the fence_helper_add_sync_buf() call.

int fence_helper_init_reservation_entry(struct fence_helper *fh,
struct dma_buf *dmabuf, bool shared, int id);
- This function is called after the fence_helper_add_sync_buf() call and
calls reservation_entry_init() to set sync_buf's reservation object on a
new reservation_entry object. The new reservation_entry is then added to
fh->entries to track all sync_bufs this device is going to access.

void fence_helper_fini_reservation_entry(struct fence_helper *fh, int id);
- This function is called to release the relevant resources if some
operation fails after the fence_helper_init_reservation_entry() call.

int fence_helper_ticket_reserve(struct fence_helper *fh, int id);
- This function is called after the fence_helper_init_reservation_entry()
call and calls ticket_reserve() to reserve a ticket (locking each
reservation entry in fh->entries).

void fence_helper_ticket_commit(struct fence_helper *fh, int id);
- This function is called after fence_helper_ticket_reserve() to commit
this device's fence to the reservation_object of each sync_buf. After
that, once other devices try to access these buffers they will be
blocked; this call also unlocks each reservation entry in fh->entries.

int fence_helper_wait(struct fence_helper *fh, struct dma_buf *dmabuf,
bool intr);
- This function is called before fence_helper_add_sync_buf() to wait for
a signal from other devices.

int fence_helper_signal(struct fence_helper *fh, int id);
- This function is called by the device's interrupt handler (or wherever
appropriate) when DMA access to the buffer has completed. It calls
fence_signal() on each fence registered with each reservation object in
fh->entries to notify other devices of DMA completion. At that point,
blocked devices are woken up and can proceed to their next step.

In more detail, this function additionally does the following:
- deletes each reservation entry in fh->entries.
- releases each seqno_fence_dmabuf object in the seqno_fence's
sync_buf_list and calls dma_buf_put() to drop the reference to the
dmabuf.
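The reference-counting contract described above — fence_helper_add_sync_buf()
takes a dma_buf reference, and fence_helper_signal() drops it for every
tracked buffer — can be sanity-checked with a toy userspace model. All types
and names below are mocks of this proposed, unmerged API:

```c
#include <stddef.h>

/* Toy stand-ins for the proposed helper objects; only the refcount
 * behaviour described above is modeled. */
struct mock_dmabuf {
	int refcount;
};

#define MAX_BUFS 8

struct mock_fence_helper {
	struct mock_dmabuf *bufs[MAX_BUFS];
	size_t nbufs;
};

/* Mirrors fence_helper_add_sync_buf(): track the buffer and take a
 * reference to it. */
static void mock_add_sync_buf(struct mock_fence_helper *fh,
			      struct mock_dmabuf *buf)
{
	if (fh->nbufs < MAX_BUFS) {
		buf->refcount++;
		fh->bufs[fh->nbufs++] = buf;
	}
}

/* Mirrors fence_helper_signal(): on DMA completion, drop each reference
 * (dma_buf_put() in the proposal) and forget the tracked buffers. */
static void mock_signal(struct mock_fence_helper *fh)
{
	size_t i;

	for (i = 0; i < fh->nbufs; i++)
		fh->bufs[i]->refcount--;
	fh->nbufs = 0;
}
```

After a signal, every buffer's refcount is back to what it was before the
add calls, which is the invariant the del/fini error paths also have to
preserve.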


Now the fence-helper framework is 

Re: [PATCH 2/2] [RFC] video: display: Adding frame related ops to MIPI DSI video source struct

2013-01-09 Thread Inki Dae
2013/1/10 Laurent Pinchart :
> Hi Vikas,
>
> Thank you for the patch.
>
> On Friday 04 January 2013 10:24:04 Vikas Sajjan wrote:
>> On 3 January 2013 16:29, Tomasz Figa  wrote:
>> > On Wednesday 02 of January 2013 18:47:22 Vikas C Sajjan wrote:
>> >> From: Vikas Sajjan 
>> >>
>> >> Signed-off-by: Vikas Sajjan 
>> >> ---
>> >>
>> >>  include/video/display.h |6 ++
>> >>  1 file changed, 6 insertions(+)
>> >>
>> >> diff --git a/include/video/display.h b/include/video/display.h
>> >> index b639fd0..fb2f437 100644
>> >> --- a/include/video/display.h
>> >> +++ b/include/video/display.h
>> >> @@ -117,6 +117,12 @@ struct dsi_video_source_ops {
>> >>
>> >>   void (*enable_hs)(struct video_source *src, bool enable);
>> >>
>> >> + /* frame related */
>> >> + int (*get_frame_done)(struct video_source *src);
>> >> + int (*clear_frame_done)(struct video_source *src);
>> >> + int (*set_early_blank_mode)(struct video_source *src, int power);
>> >> + int (*set_blank_mode)(struct video_source *src, int power);
>> >> +
>> >
>> > I'm not sure if all those extra ops are needed in any way.
>> >
>> > Looking and Exynos MIPI DSIM driver, set_blank_mode is handling only
>> > FB_BLANK_UNBLANK status, which basically equals to the already existing
>> > enable operation, while set_early_blank mode handles only
>> > FB_BLANK_POWERDOWN, being equal to disable callback.
>>
>> Right, exynos_mipi_dsi_blank_mode() only supports FB_BLANK_UNBLANK as
>> of now, but FB_BLANK_NORMAL will be supported in future.
>> If not for Exynos, i think it will be need for other SoCs which
>> support FB_BLANK_UNBLANK and FB_BLANK_NORMAL.
>
> Could you please explain in a bit more details what the set_early_blank_mode
> and set_blank_mode operations do ?
>
>> > Both get_frame_done and clear_frame_done do not look at anything used at
>> > the moment and if frame done status monitoring will be ever needed, I
>> > think a better way should be implemented.
>>
>> You are right, as of now Exynos MIPI DSI Panels are NOT using these
>> callbacks, but as you mentioned we will need frame done status monitoring
>> anyways, so i included these callbacks here. Will check, if we can implement
>> any better method.
>
> Do you expect the entity drivers (and in particular the panel drivers) to
> require frame done notification ? If so, could you explain your use case(s) ?
>

Hi Laurent,

As you know, there are two types of MIPI-DSI based LCD panels: RGB mode
and CPU mode. A CPU mode LCD panel has its own framebuffer internally,
and the image in that framebuffer is refreshed on the panel at 60Hz by
the panel itself. But there is something we should consider here: in CPU
mode, the display controller doesn't transfer image data to the MIPI-DSI
controller by itself. So we should set the display controller's trigger
bit to 1 to do it, and also check whether transmission of the data in the
framebuffer to the LCD panel is done, to avoid a tearing issue and a
conflict (A) between read and write operations, like below:

lcd_panel_frame_done_interrupt_handler()
{
...
if (mipi-dsi frame done)
trigger display controller;
...
}

A: the issue that the LCD panel can access its own framebuffer while new
data from the MIPI-DSI controller is being written into that framebuffer.

But I think there might be a better way to avoid this.
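The gating described above — the display controller may only be
(re)triggered once the previous MIPI-DSI frame transfer has completed,
otherwise the panel could read its internal framebuffer while new data is
still being written (issue A) — can be sketched as a tiny state model.
All names here are illustrative, not real driver code:

```c
#include <stdbool.h>

/* Toy model of a CPU-mode panel: the frame-done interrupt sets
 * dsi_frame_done, and triggering is only allowed while it is set. */
struct cpu_mode_panel {
	bool dsi_frame_done; /* set by the frame-done interrupt */
	int triggered;       /* display controller trigger count */
};

/* Returns true when the trigger was actually issued. */
static bool try_trigger(struct cpu_mode_panel *p)
{
	if (!p->dsi_frame_done)
		return false;      /* still transferring: would tear */
	p->dsi_frame_done = false; /* a new transfer starts now */
	p->triggered++;
	return true;
}
```

A second trigger attempt before the next frame-done interrupt is simply
refused, which is exactly the check the interrupt handler pseudocode above
performs.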

Thanks,
Inki Dae

> --
> Regards,
>
> Laurent Pinchart
>


Re: [RFC 0/5] Generic panel framework

2012-10-20 Thread Inki Dae
Correcting some typos. Sorry for this.

2012/10/20 Inki Dae :
> Hi Laurent. sorry for being late.
>
> 2012/8/17 Laurent Pinchart :
>> Hi everybody,
>>
>> While working on DT bindings for the Renesas Mobile SoC display controller
>> (a.k.a. LCDC) I quickly realized that display panel implementation based on
>> board code callbacks would need to be replaced by a driver-based panel
>> framework.
>>
>> Several driver-based panel support solution already exist in the kernel.
>>
>> - The LCD device class is implemented in drivers/video/backlight/lcd.c and
>>   exposes a kernel API in include/linux/lcd.h. That API is tied to the FBDEV
>>   API for historical reason, uses board code callback for reset and power
>>   management, and doesn't include support for standard features available in
>>   today's "smart panels".
>>
>> - OMAP2+ based systems use custom panel drivers available in
>>   drivers/video/omap2/displays. Those drivers are based on OMAP DSS (display
>>   controller) specific APIs.
>>
>> - Similarly, Exynos based systems use custom panel drivers available in
>>   drivers/video/exynos. Only a single driver (s6e8ax0) is currently 
>> available.
>>   That driver is based on Exynos display controller specific APIs and on the
>>   LCD device class API.
>>
>> I've brought up the issue with Tomi Valkeinen (OMAP DSS maintainer) and 
>> Marcus
>> Lorentzon (working on panel support for ST/Linaro), and we agreed that a
>> generic panel framework for display devices is needed. These patches 
>> implement
>> a first proof of concept.
>>
>> One of the main reasons for creating a new panel framework instead of adding
>> missing features to the LCD framework is to avoid being tied to the FBDEV
>> framework. Panels will be used by DRM drivers as well, and their API should
>> thus be subsystem-agnostic. Note that the panel framework used the
>> fb_videomode structure in its API, this will be replaced by a common video
>> mode structure shared across subsystems (there's only so many hours per day).
>>
>> Panels, as used in these patches, are defined as physical devices combining a
>> matrix of pixels and a controller capable of driving that matrix.
>>
>> Panel physical devices are registered as children of the control bus the 
>> panel
>> controller is connected to (depending on the panel type, we can find platform
>> devices for dummy panels with no control bus, or I2C, SPI, DBI, DSI, ...
>> devices). The generic panel framework matches registered panel devices with
>> panel drivers and call the panel drivers probe method, as done by other 
>> device
>> classes in the kernel. The driver probe() method is responsible for
>> instantiating a struct panel instance and registering it with the generic
>> panel framework.
>>
>> Display drivers are panel consumers. They register a panel notifier with the
>> framework, which then calls the notifier when a matching panel is registered.
>> The reason for this asynchronous mode of operation, compared to how drivers
>> acquire regulator or clock resources, is that the panel can use resources
>> provided by the display driver. For instance a panel can be a child of the 
>> DBI
>> or DSI bus controlled by the display device, or use a clock provided by that
>> device. We can't defer the display device probe until the panel is registered
>> and also defer the panel device probe until the display is registered. As
>> most display drivers need to handle output devices hotplug (HDMI monitors for
>> instance), handling panel through a notification system seemed to be the
>> easiest solution.
>>
>> Note that this brings a different issue after registration, as display and
>> panel drivers would take a reference to each other. Those circular references
>> would make driver unloading impossible. I haven't found a good solution for
>> that problem yet (hence the RFC state of those patches), and I would
>> appreciate your input here. This might also be a hint that the framework
>> design is wrong to start with. I guess I can't get everything right on the
>> first try ;-)
>>
>> Getting hold of the panel is the most complex part. Once done, display 
>> drivers
>> can call abstract operations provided by panel drivers to control the panel
>> operation. These patches implement three of those operations (enable, start
>> transfer and get modes). More operations will be needed, and those three
>> operations will likely get modified during review. Most of the panels on
>> d

Re: [RFC 0/5] Generic panel framework

2012-10-20 Thread Inki Dae
Hi Tomi,

2012/8/17 Tomi Valkeinen :
> Hi,
>
> On Fri, 2012-08-17 at 02:49 +0200, Laurent Pinchart wrote:
>
>> I will appreciate all reviews, comments, criticisms, ideas, remarks, ... If
>
> Oookay, where to start... ;)
>
> A few cosmetic/general comments first.
>
> I find the file naming a bit strange. You have panel.c, which is the
> core framework, panel-dbi.c, which is the DBI bus, panel-r61517.c, which
> is driver for r61517 panel...
>
> Perhaps something in this direction (in order): panel-core.c,
> mipi-dbi-bus.c, panel-r61517.c? And we probably end up with quite a lot
> of panel drivers, perhaps we should already divide these into separate
> directories, and then we wouldn't need to prefix each panel with
> "panel-" at all.
>
> ---
>
> Should we aim for DT only solution from the start? DT is the direction
> we are going, and I feel the older platform data stuff would be
> deprecated soon.
>
> ---
>
> Something missing from the intro is how this whole thing should be used.
> It doesn't help if we know how to turn on the panel, we also need to
> display something on it =). So I think some kind of diagram/example of
> how, say, drm would use this thing, and also how the SoC specific DBI
> bus driver would be done, would clarify things.
>
> ---
>
> We have discussed face to face about the different hardware setups and
> scenarios that we should support, but I'll list some of them here for
> others:
>
> 1) We need to support chains of external display chips and panels. A
> simple example is a chip that takes DSI in, and outputs DPI. In that
> case we'd have a chain of SoC -> DSI2DPI -> DPI panel.
>
> In final products I think two external devices is the maximum (at least
> I've never seen three devices in a row), but in theory and in
> development environments the chain can be arbitrarily long. Also the
> connections are not necessarily 1-to-1, but a device can take one input
> while it has two outputs, or a device can take two inputs.
>
> Now, I think two external devices is a must requirement. I'm not sure if
> supporting more is an important requirement. However, if we support two
> devices, it could be that it's trivial to change the framework to
> support n devices.
>
> 2) Panels and display chips are all but standard. They very often have
> their own sequences how to do things, have bugs, or implement some
> feature in slightly different way than some other panel. This is why the
> panel driver should be able to control or define the way things happen.
>
> As an example, Sharp LQ043T1DG01 panel
> (www.sharpsme.com/download/LQ043T1DG01-SP-072106pdf). It is enabled with
> the following sequence:
>
> - Enable VCC and AVDD regulators
> - Wait min 50ms
> - Enable full video stream (pck, syncs, pixels) from SoC
> - Wait min 0.5ms
> - Set DISP GPIO, which turns on the display panel
>
> Here we could split the enabling of panel to two parts, prepare (in this
> case starts regulators and waits 50ms) and finish (wait 0.5ms and set
> DISP GPIO), and the upper layer would start the video stream in between.
>
> I realize this could be done with the PANEL_ENABLE_* levels in your RFC,
> but I don't think the concepts quite match:
>
> - PANEL_ENABLE_BLANK level is needed for "smart panels", as we need to
> configure them and send the initial frame at that operating level. With

Does "smart panel" mean the command mode way (same as CPU mode)? Such a
panel includes a framebuffer internally and needs a trigger from the
display controller to update a new frame on that internal framebuffer. I
think we also need this trigger interface.

Thanks,
Inki Dae


> dummy panels there's really no such level, there's just one enable
> sequence that is always done right away.
>
> - I find waiting at the beginning of a function very ugly (what are we
> waiting for?) and we'd need that when changing the panel to
> PANEL_ENABLE_ON level.
>
> - It's still limited if the panel is a stranger one (see following
> example).
>
> Consider the following theoretical panel enable example, taken to absurd
> level just to show the general problem:
>
> - Enable regulators
> - Enable video stream
> - Wait 50ms
> - Disable video stream
> - Set enable GPIO
> - Enable video stream
>
> This one would be rather impossible with the upper layer handling the
> enabling of the video stream. Thus I see that the panel driver needs to
> control the sequences, and the Sharp panel driver's enable would look
> something like:
>
> regulator_enable(...);
> sleep();
> dpi_enable_video();
> sleep();
> gpio_set(..);
>
> Note that even with this model w

Re: [RFC 0/5] Generic panel framework

2012-10-20 Thread Inki Dae
> with no control bus, and Renesas R61505- and R61517-based panels, both using
> DBI as their control bus). Only the dummy panel driver has been tested as I
> lack hardware for the two other drivers.
>
> I will appreciate all reviews, comments, criticisms, ideas, remarks, ... If
> you can find a clever way to solve the cyclic references issue described above
> I'll buy you a beer at the next conference we will both attend. If you think
> the proposed solution is too complex, or too simple, I'm all ears. I
> personally already feel that we might need something even more generic to
> support other kinds of external devices connected to display controllers, such
> as external DSI to HDMI converters for instance. Some kind of video entity
> exposing abstract operations like the panels do would make sense, in which
> case panels would "inherit" from that video entity.
>
> Speaking of conferences, I will attend the KS/LPC in San Diego in a bit more
> than a week, and would be happy to discuss this topic face to face there.
>
> Laurent Pinchart (5):
>   video: Add generic display panel core
>   video: panel: Add dummy panel support
>   video: panel: Add MIPI DBI bus support
>   video: panel: Add R61505 panel support
>   video: panel: Add R61517 panel support

how about using a 'buses' directory instead of 'panel' and adding
'panel' under that, like below?
video/buses: display bus frameworks such as MIPI-DBI/DSI and eDP are placed.
video/buses/panel: panel drivers based on the display bus drivers are placed.

I think MIPI-DBI (Display Bus Interface)/DSI (Display Serial Interface)
and eDP are the bus interfaces for display controllers such as
DISPC (OMAP SoC) and FIMD (Exynos SoC).

Thanks,
Inki Dae

>
>  drivers/video/Kconfig  |1 +
>  drivers/video/Makefile |1 +
>  drivers/video/panel/Kconfig|   37 +++
>  drivers/video/panel/Makefile   |5 +
>  drivers/video/panel/panel-dbi.c|  217 +++
>  drivers/video/panel/panel-dummy.c  |  103 +++
>  drivers/video/panel/panel-r61505.c |  520 
> 
>  drivers/video/panel/panel-r61517.c |  408 
>  drivers/video/panel/panel.c|  269 +++
>  include/video/panel-dbi.h  |   92 +++
>  include/video/panel-dummy.h|   25 ++
>  include/video/panel-r61505.h   |   27 ++
>  include/video/panel-r61517.h   |   28 ++
>  include/video/panel.h  |  111 
>  14 files changed, 1844 insertions(+), 0 deletions(-)
>  create mode 100644 drivers/video/panel/Kconfig
>  create mode 100644 drivers/video/panel/Makefile
>  create mode 100644 drivers/video/panel/panel-dbi.c
>  create mode 100644 drivers/video/panel/panel-dummy.c
>  create mode 100644 drivers/video/panel/panel-r61505.c
>  create mode 100644 drivers/video/panel/panel-r61517.c
>  create mode 100644 drivers/video/panel/panel.c
>  create mode 100644 include/video/panel-dbi.h
>  create mode 100644 include/video/panel-dummy.h
>  create mode 100644 include/video/panel-r61505.h
>  create mode 100644 include/video/panel-r61517.h
>  create mode 100644 include/video/panel.h
>
> --
> Regards,
>
> Laurent Pinchart
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v1 01/14] media: s5p-hdmi: add HPD GPIO to platform data

2012-10-04 Thread Inki Dae
Hello Media guys,

This depends on the exynos drm patch set being merged to mainline, so if
there is no problem, please give me an ack so that I can merge this patch
together with the exynos drm patch set.

Thanks,
Inki Dae

> -Original Message-
> From: RAHUL SHARMA [mailto:rahul.sha...@samsung.com]
> Sent: Thursday, October 04, 2012 4:40 PM
> To: Tomasz Stanislawski; Kyungmin Park; linux-arm-
> ker...@lists.infradead.org; linux-media@vger.kernel.org
> Cc: In-Ki Dae; SUNIL JOSHI; r.sh.o...@gmail.com
> Subject: Re: [PATCH v1 01/14] media: s5p-hdmi: add HPD GPIO to platform
> data
> 
> Hi Mr. Tomasz, Mr. Park, list,
> 
> First patch in the following set belongs to s5p-media, rest to exynos-drm.
> Please review the media patch so that It can be merged for mainline.
> 
> regards,
> Rahul Sharma
> 
> On Thu, Oct 4, 2012 at 9:12 PM, Rahul Sharma 
> wrote:
> > From: Tomasz Stanislawski 
> >
> > This patch extends s5p-hdmi platform data by a GPIO identifier for
> > Hot-Plug-Detection pin.
> >
> > Signed-off-by: Tomasz Stanislawski 
> > Signed-off-by: Kyungmin Park 
> > ---
> >  include/media/s5p_hdmi.h |2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/media/s5p_hdmi.h b/include/media/s5p_hdmi.h
> > index 361a751..181642b 100644
> > --- a/include/media/s5p_hdmi.h
> > +++ b/include/media/s5p_hdmi.h
> > @@ -20,6 +20,7 @@ struct i2c_board_info;
> >   * @hdmiphy_info: template for HDMIPHY I2C device
> >   * @mhl_bus: controller id for MHL control bus
> >   * @mhl_info: template for MHL I2C device
> > + * @hpd_gpio: GPIO for Hot-Plug-Detect pin
> >   *
> >   * NULL pointer for *_info fields indicates that
> >   * the corresponding chip is not present
> > @@ -29,6 +30,7 @@ struct s5p_hdmi_platform_data {
> > struct i2c_board_info *hdmiphy_info;
> > int mhl_bus;
> > struct i2c_board_info *mhl_info;
> > +   int hpd_gpio;
> >  };
> >
> >  #endif /* S5P_HDMI_H */
> > --
> > 1.7.0.4
> >
> >
> > ___
> > linux-arm-kernel mailing list
> > linux-arm-ker...@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


