Re: v4l: Buffer pools

2011-04-08 Thread Laurent Pinchart
Hi Guennadi,

On Thursday 07 April 2011 13:51:12 Guennadi Liakhovetski wrote:
 On Fri, 1 Apr 2011, Laurent Pinchart wrote:
 
 [snip]
 
  - Cache management (ISP and DSS)
  
  Cache needs to be synchronized between userspace applications, kernel
  space and hardware. Synchronizing the cache is an expensive operation
  and should be avoided when possible. Userspace applications don't need
  to select memory mapping cache attributes, but should be able to either
  handle cache synchronization explicitly, or override the drivers'
  default behaviour.
 
 So, what cache attributes are currently used by the driver? Presumably, it
 is some cacheable variant?

When using MMAP, memory is allocated by the driver from system memory and is 
mapped as normal cacheable memory. When using USERPTR, mapping cache 
attributes are not touched by the driver.
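
For illustration, here is a minimal sketch of how an application selects
between the two memory types through the standard VIDIOC_REQBUFS ioctl
(error handling and the surrounding device setup are omitted):

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* MMAP: the driver allocates the memory (mapped cacheable). */
static int request_mmap_buffers(int fd, unsigned int count)
{
	struct v4l2_requestbuffers req;

	memset(&req, 0, sizeof(req));
	req.count = count;
	req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	req.memory = V4L2_MEMORY_MMAP;

	return ioctl(fd, VIDIOC_REQBUFS, &req);
}

/* USERPTR: the application provides the memory; the driver leaves the
 * mapping cache attributes untouched. */
static int request_userptr_buffers(int fd, unsigned int count)
{
	struct v4l2_requestbuffers req;

	memset(&req, 0, sizeof(req));
	req.count = count;
	req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	req.memory = V4L2_MEMORY_USERPTR;

	return ioctl(fd, VIDIOC_REQBUFS, &req);
}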

 And which way should the application be able to override the driver's
 behaviour? One of these overrides would probably be skip cache invalidate
 (input) / flush (output), right? Anything else?

Those are the operations I was thinking about.
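
As a sketch of what such an override could look like, per-buffer flags set
at QBUF time would let an application skip synchronization it knows to be
unnecessary. The two flags below are hypothetical, not part of the V4L2 API
being discussed here:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* Hypothetical per-buffer cache-override flags, for illustration only. */
#define BUF_FLAG_NO_CACHE_INVALIDATE	0x0800	/* capture: skip invalidate */
#define BUF_FLAG_NO_CACHE_CLEAN		0x1000	/* output: skip flush */

static int queue_buffer_no_invalidate(int fd, unsigned int index)
{
	struct v4l2_buffer buf;

	memset(&buf, 0, sizeof(buf));
	buf.index = index;
	buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
	buf.memory = V4L2_MEMORY_MMAP;
	/* The application will not read this buffer through a cached
	 * mapping, so the driver may skip the cache invalidation. */
	buf.flags = BUF_FLAG_NO_CACHE_INVALIDATE;

	return ioctl(fd, VIDIOC_QBUF, &buf);
}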

-- 
Regards,

Laurent Pinchart


Re: v4l: Buffer pools

2011-04-07 Thread Guennadi Liakhovetski
Hi Laurent

On Fri, 1 Apr 2011, Laurent Pinchart wrote:

[snip]

 - Cache management (ISP and DSS)
 
 Cache needs to be synchronized between userspace applications, kernel space 
 and hardware. Synchronizing the cache is an expensive operation and should be 
 avoided when possible. Userspace applications don't need to select memory 
 mapping cache attributes, but should be able to either handle cache 
 synchronization explicitly, or override the drivers' default behaviour.

So, what cache attributes are currently used by the driver? Presumably, it 
is some cacheable variant? And which way should the application be able to 
override the driver's behaviour? One of these overrides would probably be 
skip cache invalidate (input) / flush (output), right? Anything else?

Thanks
Guennadi
---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/


RE: v4l: Buffer pools

2011-03-31 Thread Marek Szyprowski
Hello,

On Tuesday, March 29, 2011 4:02 PM Willy POISSON wrote:

   Following the Warsaw mini-summit action point, I would like to open
 the thread to gather buffer pool & memory manager requirements.
 The list of requirements for the buffer pool may contain:
 - Support physically contiguous and virtual memory
 - Support IPC, import/export handles (between
 processes/drivers/userland/etc.)
 - Security (access rights, in order to ensure that no one unauthorized
 is allowed to access buffers)
 - Cache flush management (by using setdomain, and optimizing when
 flushing is needed)
 - Pin/unpin in order to get the actual address, to be able to do
 defragmentation
 - Support pinning in user land, in order to allow defragmentation while
 a buffer is mmapped but not pinned.
 - Both a user API and a kernel API are needed for this module. (Kernel
 drivers need to be able to resolve buffer handles from the memory
 manager module as well, and to pin/unpin.)
 - Be able to support any platform-specific allocator (separate memory
 allocation from management, as the allocator is platform dependent)
 - Support multiple region domains (allow allocating from several memory
 domains, e.g. DDR1, DDR2, embedded SRAM, for example for bandwidth load
 balancing...)

The above list looks fine.

Memory/buffer pools are a large topic that covers at least 3 subsystems:
1. user space API
2. in-kernel buffer manager
3. in-kernel memory allocator

Most of the requirements in the above list can be assigned to one of these
subsystems.

I would like to focus first on the user space API. This API should provide a
generic way to allocate memory buffers. User space should not be aware of the
allocator-specific parameters of the buffer. User space should not decide
whether a physically contiguous buffer is needed or not. The only information
that user space should provide is a set or list of devices that the
application wants to use with the allocated buffer. User space might also
provide some additional hints about the buffers - like the preferred memory
region.

Our S5PC110 and EXYNOS4 chips are very similar in terms of integrated
multimedia modules, however there is one important difference. The latter has
an IOMMU, so its multimedia blocks don't require physically contiguous
buffers. In userspace, however, we would like to support both with the same
API.

We also have a very specific requirement for video codec buffers (chroma
buffers and luma buffers must be allocated from different memory banks). The
memory bank should be specified at allocation time.
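
To make this concrete, here is one hypothetical shape such an allocation
request could take. Every name below is made up for illustration; none of
this is an existing kernel interface:

#include <stdint.h>

/* Hypothetical buffer-pool allocation request; all names are
 * illustrative, not an existing kernel interface. */

#define MEMPOOL_MAX_DEVICES	8

enum mempool_region_hint {
	MEMPOOL_REGION_ANY,
	MEMPOOL_REGION_BANK0,	/* e.g. luma buffers */
	MEMPOOL_REGION_BANK1,	/* e.g. chroma buffers */
};

struct mempool_alloc_request {
	uint32_t size;		/* buffer size in bytes */
	/* Devices that must be able to use the buffer; the kernel
	 * derives the actual constraints (contiguous or not, IOMMU
	 * or not) from this list. */
	uint32_t num_devices;
	int32_t devices[MEMPOOL_MAX_DEVICES];	/* e.g. open file descriptors */
	enum mempool_region_hint region;	/* optional placement hint */
};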

The only problem is to define a way for the user space API to provide a list
of devices that must be able to operate on the allocated buffer. Without some
kind of enumeration of all entities that work with the buffer pool it might
be a bit hard. I would like to avoid the need to hardcode device names in
user space applications.

The in-kernel memory allocator is mainly targeted at systems that require
physically contiguous buffers. Currently the CMA framework fits perfectly
here. A new version will be posted very soon.

 Another idea, but not so linked to memory management (more to the usage of
 buffers), would be to have a common data container (a structure to access
 data) shared by several media components (imaging, video/still codecs,
 graphics, display...) to ease usage of the data. This container could embed
 the data type (video frame, access unit), frame format, pixel format,
 width, height, pixel aspect ratio, region of interest, CTS (composition
 time stamp), colorspace, transparency (opaque, alpha, color key...),
 pointers to buffer handle(s)...

I'm not sure such an idea can ever be implemented in the mainline kernel...
IMHO it is too complicated.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




Re: v4l: Buffer pools

2011-03-31 Thread Laurent Pinchart
Hi Marek,

On Thursday 31 March 2011 12:14:55 Marek Szyprowski wrote:
 On Tuesday, March 29, 2011 4:02 PM Willy POISSON wrote:
  Following the Warsaw mini-summit action point, I would like to open
  the thread to gather buffer pool & memory manager requirements.

  The list of requirements for the buffer pool may contain:
  - Support physically contiguous and virtual memory
  - Support IPC, import/export handles (between
  processes/drivers/userland/etc.)
  - Security (access rights, in order to ensure that no one unauthorized
  is allowed to access buffers)
  - Cache flush management (by using setdomain, and optimizing when
  flushing is needed)
  - Pin/unpin in order to get the actual address, to be able to do
  defragmentation
  - Support pinning in user land, in order to allow defragmentation
  while a buffer is mmapped but not pinned.
  - Both a user API and a kernel API are needed for this module. (Kernel
  drivers need to be able to resolve buffer handles from the memory
  manager module as well, and to pin/unpin.)
  - Be able to support any platform-specific allocator (separate memory
  allocation from management, as the allocator is platform dependent)
  - Support multiple region domains (allow allocating from several
  memory domains, e.g. DDR1, DDR2, embedded SRAM, for example for
  bandwidth load balancing...)
 
 The above list looks fine.

 Memory/buffer pools are a large topic that covers at least 3 subsystems:
 1. user space API
 2. in-kernel buffer manager
 3. in-kernel memory allocator

 Most of the requirements in the above list can be assigned to one of these
 subsystems.

 I would like to focus first on the user space API. This API should provide
 a generic way to allocate memory buffers. User space should not be aware
 of the allocator-specific parameters of the buffer. User space should not
 decide whether a physically contiguous buffer is needed or not. The only
 information that user space should provide is a set or list of devices
 that the application wants to use with the allocated buffer. User space
 might also provide some additional hints about the buffers - like the
 preferred memory region.
 
 Our S5PC110 and EXYNOS4 chips are very similar in terms of integrated
 multimedia modules, however there is one important difference. The latter
 has an IOMMU, so its multimedia blocks don't require physically
 contiguous buffers. In userspace, however, we would like to support both
 with the same API.

 We also have a very specific requirement for video codec buffers (chroma
 buffers and luma buffers must be allocated from different memory banks).
 The memory bank should be specified at allocation time.
 
 The only problem is to define a way for the user space API to provide a
 list of devices that must be able to operate on the allocated buffer.
 Without some kind of enumeration of all entities that work with the
 buffer pool it might be a bit hard. I would like to avoid the need to
 hardcode device names in user space applications.

I've been thinking about that, and I'm not sure how feasible it would be.
Besides the difficulty of passing a list of devices from applications to the
kernel, drivers would need to transform that list into memory requirements
compatible with all devices in the list. That will likely become very
complex.

Wouldn't it be better to let applications specify memory requirements
explicitly, and have individual drivers check whether the buffers match
their requirements when the buffers are used?
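
As a sketch of this alternative, the application would fill in something
like the structure below; all names are hypothetical, for illustration only:

#include <stdint.h>

/* Hypothetical explicit buffer requirements, stated by the application
 * instead of a device list. All names are illustrative. */
struct mempool_buffer_requirements {
	uint32_t size;			/* buffer size in bytes */
	uint32_t alignment;		/* e.g. 256 for the OMAP3 ISP */
	uint32_t stride;		/* line stride in bytes, 0 = don't care */
	uint8_t physically_contiguous;	/* non-zero if required */
};

Each driver would then verify these properties against its own constraints
when a buffer is queued, and reject buffers that don't match.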

 The in-kernel memory allocator is mainly targeted at systems that require
 physically contiguous buffers. Currently the CMA framework fits perfectly
 here. A new version will be posted very soon.
 
  Another idea, but not so linked to memory management (more to the usage
  of buffers), would be to have a common data container (a structure to
  access data) shared by several media components (imaging, video/still
  codecs, graphics, display...) to ease usage of the data. This container
  could embed the data type (video frame, access unit), frame format,
  pixel format, width, height, pixel aspect ratio, region of interest,
  CTS (composition time stamp), colorspace, transparency (opaque, alpha,
  color key...), pointers to buffer handle(s)...

 I'm not sure such an idea can ever be implemented in the mainline
 kernel... IMHO it is too complicated.

-- 
Regards,

Laurent Pinchart


Re: v4l: Buffer pools

2011-03-31 Thread Laurent Pinchart
Hi everybody,

Here are the requirements of the OMAP3 ISP (Image Signal Processor)
regarding a global buffer pool. I've tried to expand the list of
requirements to take the OMAP3 DSP and DSS (Display SubSystem) into account,
but I have less experience with those subsystems. The list might thus not be
exhaustive.

Sakari, could you please check if you see something missing from the list?

- Memory allocation (ISP and DSP)

The ISP and DSP both use an IOMMU to access system memory. The MMUs can access 
the whole 32-bit physical address space without any restriction through 4kB or 
64kB pages, 1MB sections and 16MB supersections.

No hardware requirement needs to be enforced by the allocator, but better
performance can be achieved if the memory is not fragmented down to page
level.

Memory is allocated and freed at runtime. Allocation is an expensive
operation and needs to be performed in advance, before the memory gets used
by the ISP and DSP drivers or devices.

- Memory allocation (DSS)

The DSS needs physically contiguous memory. The memory base address needs to 
be aligned to a pixel boundary.

Memory for framebuffer devices is allocated when the device is probed and kept 
until the device is removed or the driver unloaded. Memory for V4L2 video 
output devices is allocated and freed at runtime.

- Alignment and padding (ISP)

ISP buffers must be aligned on a 32- or 64-byte boundary, depending on the
ISP module that reads from or writes to memory. A 256-byte alignment is
preferable to achieve better performance.

Line stride (the number of bytes between the first pixel of two consecutive 
lines) must be a multiple of 32 or 64 bytes and can be larger than the line 
length. This results in padding at the end of each line (optional if the line 
length is already a multiple of 32 or 64 bytes). Padding bytes are not written 
to by the ISP, and their values are ignored when the ISP reads from memory.

To achieve interoperability with the ISP, other hardware modules need to take 
the ISP line stride requirements into account. This is likely out of scope of 
the buffer pool though.
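
As a worked example of the stride rule above, the line length is rounded up
to the next multiple of the alignment constraint; a minimal, self-contained
sketch:

#include <stddef.h>
#include <stdio.h>

/* Round a line length up to the next multiple of align (a power of two). */
static size_t isp_stride(size_t width, size_t bytes_per_pixel, size_t align)
{
	size_t length = width * bytes_per_pixel;

	return (length + align - 1) & ~(align - 1);
}

int main(void)
{
	/* 1021 pixels of 16-bit Bayer data with a 32-byte constraint:
	 * 2042 bytes round up to a 2048-byte stride, i.e. 6 padding
	 * bytes at the end of each line. */
	printf("stride: %zu\n", isp_stride(1021, 2, 32));
	return 0;
}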

- Cache management (ISP and DSS)

Cache needs to be synchronized between userspace applications, kernel space 
and hardware. Synchronizing the cache is an expensive operation and should be 
avoided when possible. Userspace applications don't need to select memory 
mapping cache attributes, but should be able to either handle cache 
synchronization explicitly, or override the drivers' default behaviour.

To avoid cache coherency issues caused by speculative prefetching as much as
possible, no unneeded memory mappings should be present, in either kernel
space or user space.

Cache management operations can be handled either by the buffer pool or by
the ISP and DSP drivers. In the latter case, the drivers need a way to query
buffer properties, to avoid cache synchronization if no cacheable mapping
exists for a buffer.
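
In that latter case, a driver-side check could look like the sketch below.
The buffer_has_cacheable_mapping() query is hypothetical (it stands in for
whatever the buffer pool would expose); the DMA API calls are the standard
kernel ones:

#include <linux/dma-mapping.h>

struct pool_buffer;

/* Hypothetical buffer-pool query: does any cacheable CPU mapping exist? */
extern bool buffer_has_cacheable_mapping(struct pool_buffer *buf);

static void isp_prepare_buffer_for_dma(struct device *dev,
				       struct pool_buffer *buf,
				       dma_addr_t addr, size_t size)
{
	/* Skip the expensive cache operation when no cacheable mapping
	 * exists for the buffer. */
	if (!buffer_has_cacheable_mapping(buf))
		return;

	/* Clean the CPU caches so the device sees the CPU's writes. */
	dma_sync_single_for_device(dev, addr, size, DMA_TO_DEVICE);
}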

- IOMMU mappings (ISP and DSP)

Mapping buffers to the ISP and DSP address spaces through the corresponding 
IOMMUs is a time consuming operation. Drivers need to map the buffers in 
advance before using the memory.

- Multiple use of the same buffer

If images need to be captured directly to the frame buffer, applications might 
need to queue a single buffer multiple times to the ISP capture device to 
avoid buffer queue underruns.

Whether this use case is needed is not known yet (the OMAP Xv implementation 
currently copies images to the frame buffer using a CPU memcpy).

-- 
Regards,

Laurent Pinchart


RE: v4l: Buffer pools

2011-03-30 Thread Jonghun Han

Hi all,

The following are the Samsung S.LSI requirements for the memory provider.

1. User space API
1.1. New memory management (MM) features should include the following
 for user space: UMP
A. user space API for memory allocation from system memory: UMP
   Any user process can allocate memory from kernel space through the
   new MM model.
B. user space API for cache operations: flush, clean, invalidate
   Any user process can perform cache operations on the allocated memory.
C. user space API for mapping memory with a cacheable attribute
   When system memory is mapped into user space,
   the user process can set its attribute to cacheable.
D. user space API for mapping memory with a non-cacheable attribute
   When system memory is mapped into user space,
   the user process can set its attribute to non-cacheable.
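
As a sketch, A-D could look like this from user space. The function names
below are placeholders for illustration; they are not the actual UMP entry
points:

#include <stddef.h>
#include <stdbool.h>

/* Illustrative placeholder API, not the real UMP interface. */
typedef int mm_handle;

mm_handle mm_alloc(size_t size);		/* A: allocate from kernel */
void *mm_map(mm_handle h, bool cacheable);	/* C/D: choose the attribute */
int mm_cache_flush(mm_handle h);		/* B: clean and invalidate */

/* Typical CPU-write-then-hardware-read sequence. */
static void fill_and_hand_off(mm_handle h, size_t size)
{
	unsigned char *p = mm_map(h, true);	/* cacheable mapping (C) */

	for (size_t i = 0; i < size; i++)
		p[i] = 0;

	mm_cache_flush(h);	/* make the CPU writes visible to the device */
}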

1.2. Inter-process memory sharing: UMP
The new MM features should provide memory sharing between user processes.

A. Memory allocated by user space can be shared between user processes.
B. Memory allocated by kernel space can be shared between user processes.

2. Kernel space API
The new MM features should include the following for kernel space: CMA, VCMM

2-1. Physical memory allocator
A. kernel space API for contiguous memory allocation: CMA (*)
B. kernel space API for non-contiguous memory allocation: VCMM (*)
C. start address alignment: CMA, VCMM
D. selectable allocation region: CMA
(*) refer to the extensions at the bottom.
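
For the contiguous case (A), the kernel interface that CMA is designed to
back transparently is the existing DMA coherent API; a minimal driver-side
sketch:

#include <linux/dma-mapping.h>

/* Allocate a physically contiguous, device-visible buffer. With CMA
 * configured, large allocations can be satisfied at runtime instead of
 * from a boot-time carveout. */
static void *alloc_frame(struct device *dev, size_t size, dma_addr_t *dma)
{
	return dma_alloc_coherent(dev, size, dma, GFP_KERNEL);
}

static void free_frame(struct device *dev, size_t size, void *cpu,
		       dma_addr_t dma)
{
	dma_free_coherent(dev, size, cpu, dma);
}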

2-2. Device virtual address management: VCMM
The new MM features should provide a way of managing device virtual
addresses, as follows:

A. IOMMU (System MMU) support
   An IOMMU is a kind of MMU, but one dedicated to a particular device.
B. device virtual address mapping for each device
C. virtual memory allocation
D. mapping / remapping between physical and device virtual addresses
E. a dedicated device virtual address space for each device
F. address translation between address spaces

        U.V
       /   \
    K.V --- P.A
       \   /
        D.V

U.V: User space address
K.V: Kernel space address
P.A: Physical address
D.V: Device virtual address
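
A minimal sketch of the P.A -> D.V mapping step (D), using the generic
kernel IOMMU API; exact signatures vary between kernel versions, so treat
this as an approximation:

#include <linux/iommu.h>

/* Map a physically contiguous range into a device's virtual address
 * space, so the device sees 'paddr' at device virtual address 'iova'. */
static int map_for_device(struct device *dev, unsigned long iova,
			  phys_addr_t paddr, size_t size)
{
	struct iommu_domain *domain;
	int ret;

	domain = iommu_domain_alloc(dev->bus);
	if (!domain)
		return -ENOMEM;

	ret = iommu_attach_device(domain, dev);
	if (ret) {
		iommu_domain_free(domain);
		return ret;
	}

	return iommu_map(domain, iova, paddr, size,
			 IOMMU_READ | IOMMU_WRITE);
}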

3. Extensions
A. extension for custom physical memory allocator
B. extension for custom MMU controller

-
You can find the implementation in the following git repository.
http://git.kernel.org/?p=linux/kernel/git/kki_ap/linux-2.6-samsung.git;a=tree;hb=refs/heads/2.6.36-samsung

1. UMP (Unified Memory Provider)
- The UMP is an auxiliary component which enables memory to be shared
  across different applications, drivers and hardware components.
- http://blogs.arm.com/multimedia/249-making-the-mali-gpu-device-driver-open-source/page__cid__133__show__newcomment/
- Suggested by ARM, not submitted yet.
- implementation
  drivers/media/video/samsung/ump/*

2. VCMM (Virtual Contiguous Memory Manager)
- The VCMM is a framework to deal with multiple IOMMUs in a system
  with intuitive and abstract objects
- Submitted by Michal Nazarewicz @Samsung-SPRC
- Also submitted by KyongHo Cho @Samsung-SYS.LSI
- http://article.gmane.org/gmane.linux.kernel.mm/56912/match=vcm
- implementation
  include/linux/vcm.h
  include/linux/vcm-drv.h
  mm/vcm.c
  arch/arm/plat-s5p/s5p-vcm.c
  arch/arm/plat-s5p/include/plat/s5p-vcm.h

3. CMA (Contiguous Memory Allocator)
- The Contiguous Memory Allocator (CMA) is a framework, which allows
  setting up a machine-specific configuration for physically-contiguous
  memory management. Memory for devices is then allocated according
  to that configuration.
- http://lwn.net/Articles/396702/
- http://www.spinics.net/lists/linux-media/msg26486.html
- Submitted by Michal Nazarewicz @Samsung-SPRC
- implementation
  mm/cma.c
  include/linux/cma.h

4. SYS.MMU
- The System MMU supports address translation from VA to PA.
- http://thread.gmane.org/gmane.linux.kernel.samsung-soc/3909
- Submitted by Sangbeom Kim
- Merged by Kukjin Kim, ARM/S5P ARM ARCHITECTURES maintainer
- implementation
  arch/arm/plat-s5p/sysmmu.c
  arch/arm/plat-s5p/include/plat/sysmmu.h

Best regards,
Jonghun Han

 -Original Message-
 From: linux-media-ow...@vger.kernel.org [mailto:linux-media-
 ow...@vger.kernel.org] On Behalf Of Willy POISSON
 Sent: Tuesday, March 29, 2011 11:02 PM
 To: linux-media@vger.kernel.org
 Subject: v4l: Buffer pools
 
 Hi all,
   Following the Warsaw mini-summit action point, I would like to open
 the thread to gather buffer pool & memory manager requirements.
 The list of requirements for the buffer pool may contain:
 - Support physically contiguous and virtual memory
 - Support IPC, import/export handles (between
 processes/drivers/userland/etc.)
 - Security (access rights, in order to ensure that no one unauthorized
 is allowed to access buffers)
 - Cache flush management (by using setdomain, and optimizing when
 flushing is needed)
 - Pin/unpin in order to get the actual address, to be able to do
 defragmentation
v4l: Buffer pools

2011-03-29 Thread Willy POISSON
Hi all,
Following the Warsaw mini-summit action point, I would like to open
the thread to gather buffer pool & memory manager requirements.
The list of requirements for the buffer pool may contain:
-	Support physically contiguous and virtual memory
-	Support IPC, import/export handles (between
processes/drivers/userland/etc.)
-	Security (access rights, in order to ensure that no one unauthorized
is allowed to access buffers)
-	Cache flush management (by using setdomain, and optimizing when
flushing is needed)
-	Pin/unpin in order to get the actual address, to be able to do
defragmentation
-	Support pinning in user land, in order to allow defragmentation while
a buffer is mmapped but not pinned.
-	Both a user API and a kernel API are needed for this module. (Kernel
drivers need to be able to resolve buffer handles from the memory
manager module as well, and to pin/unpin.)
-	Be able to support any platform-specific allocator (separate memory
allocation from management, as the allocator is platform dependent)
-	Support multiple region domains (allow allocating from several memory
domains, e.g. DDR1, DDR2, embedded SRAM, for example for bandwidth load
balancing...)
Another idea, but not so linked to memory management (more to the usage of
buffers), would be to have a common data container (a structure to access
data) shared by several media components (imaging, video/still codecs,
graphics, display...) to ease usage of the data. This container could embed
the data type (video frame, access unit), frame format, pixel format, width,
height, pixel aspect ratio, region of interest, CTS (composition time
stamp), colorspace, transparency (opaque, alpha, color key...), pointers to
buffer handle(s)...
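
As a sketch, such a container might carry the fields listed above alongside
the buffer-pool handles. Every name here is illustrative, not an existing
interface:

#include <stdint.h>

#define MEDIA_MAX_PLANES 3

/* Illustrative common data container, not an existing interface. */
struct media_data_container {
	uint32_t data_type;	/* video frame, access unit, ... */
	uint32_t pixel_format;	/* e.g. a fourcc code */
	uint32_t width, height;
	uint32_t pixel_aspect_num, pixel_aspect_den;
	struct {
		uint32_t x, y, w, h;
	} roi;			/* region of interest */
	int64_t cts;		/* composition time stamp */
	uint32_t colorspace;
	uint32_t transparency;	/* opaque, alpha, color key, ... */
	uint32_t num_planes;
	int32_t handles[MEDIA_MAX_PLANES];	/* buffer-pool handles */
};
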
Regards,
Willy.
=
Willy Poisson
ST-Ericsson


Re: v4l: Buffer pools

2011-03-29 Thread Hans Verkuil
On Tuesday, March 29, 2011 16:01:33 Willy POISSON wrote:
 Hi all,
   Following the Warsaw mini-summit action point, I would like to open
 the thread to gather buffer pool & memory manager requirements.
 The list of requirements for the buffer pool may contain:
 - Support physically contiguous and virtual memory
 - Support IPC, import/export handles (between
 processes/drivers/userland/etc.)
 - Security (access rights, in order to ensure that no one unauthorized
 is allowed to access buffers)
 - Cache flush management (by using setdomain, and optimizing when
 flushing is needed)
 - Pin/unpin in order to get the actual address, to be able to do
 defragmentation
 - Support pinning in user land, in order to allow defragmentation while
 a buffer is mmapped but not pinned.
 - Both a user API and a kernel API are needed for this module. (Kernel
 drivers need to be able to resolve buffer handles from the memory
 manager module as well, and to pin/unpin.)
 - Be able to support any platform-specific allocator (separate memory
 allocation from management, as the allocator is platform dependent)
 - Support multiple region domains (allow allocating from several memory
 domains, e.g. DDR1, DDR2, embedded SRAM, for example for bandwidth load
 balancing...)

Thanks for your input, Willy!

I have one question: do you know which of the points mentioned above are
implemented in actual existing code that ST-Ericsson uses? Ideally with links
to such code as well if available :-)

That will help as a reference.

 Another idea, but not so linked to memory management (more to the usage of
 buffers), would be to have a common data container (a structure to access
 data) shared by several media components (imaging, video/still codecs,
 graphics, display...) to ease usage of the data. This container could
 embed the data type (video frame, access unit), frame format, pixel
 format, width, height, pixel aspect ratio, region of interest, CTS
 (composition time stamp), colorspace, transparency (opaque, alpha, color
 key...), pointers to buffer handle(s)...

Regards,

Hans


Re: v4l: Buffer pools

2011-03-29 Thread Robert Fekete
Hi Hans!

On 29 March 2011 16:50, Hans Verkuil hverk...@xs4all.nl wrote:
 On Tuesday, March 29, 2011 16:01:33 Willy POISSON wrote:
 Hi all,
        Following the Warsaw mini-summit action point, I would like to open
 the thread to gather buffer pool & memory manager requirements.
 The list of requirements for the buffer pool may contain:
 -     Support physically contiguous and virtual memory

Only a physically contiguous allocator is supported in hwmem as of today.

 -     Support IPC, import/export handles (between
 processes/drivers/userland/etc.)

Supported in hwmem

 -     Security (access rights, in order to ensure that no one unauthorized
 is allowed to access buffers)

Supported in hwmem

 -     Cache flush management (by using setdomain, and optimizing when
 flushing is needed)

Supported in hwmem

 -     Pin/unpin in order to get the actual address to be able to do 
 defragmentation

Pin/unpin is implemented, but defragmentation is not done... yet. It's more
of a memory allocator feature, and I know CMA took influence from this and
added pin/unpin in order to be able to do defragmentation.

 -     Support pinning in user land, in order to allow defragmentation
 while a buffer is mmapped but not pinned.
 -     Both a user API and a kernel API are needed for this module. (Kernel
 drivers need to be able to resolve buffer handles from the memory
 manager module as well, and to pin/unpin.)

Supported in hwmem

 -     Be able to support any platform-specific allocator (separate memory
 allocation from management, as the allocator is platform dependent)

As Laurent pointed out, we have a tightly coupled physically contiguous
memory allocator in hwmem today. The plan is to separate this more clearly
from the hwmem APIs and also add a virtual memory allocator as well (this is
not supported today). As a quick prototype, we have also tested using CMA as
the allocator.


 -     Support multiple region domains (allow allocating from several
 memory domains, e.g. DDR1, DDR2, embedded SRAM, for example for bandwidth
 load balancing...)

Not supported in hwmem today. There is no problem adding separate regions
(mapped to several DDR banks or whatever), although load balancing and the
like will likely need some fancy hardware in order to spread addresses
across the regions efficiently.


 Thanks for your input, Willy!

 I have one question: do you know which of the points mentioned above are
 implemented in actual existing code that ST-Ericsson uses? Ideally with links
 to such code as well if available :-)

 That will help as a reference.


You can find the details of the current hwmem on the linux-mm list:
http://marc.info/?l=linux-mm&w=2&r=1&s=hwmem&q=b

BR
/Robert Fekete