[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-11 Thread Ilyes Gouta
Hi,

Just a side question (for information),

On Fri, Jul 11, 2014 at 6:07 PM, Bridgman, John wrote:

>
> Right. The SET_RESOURCES packet (kfd_pm4_headers.h, added in patch 49)
> allocates a range of HW queues, VMIDs and GDS to the HW scheduler, then the
> scheduler uses the allocated VMIDs to support a potentially larger number
> of user processes by dynamically mapping PASIDs to VMIDs and memory queue
> descriptors (MQDs) to HW queues.
>
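
To make the mapping idea in the quoted paragraph concrete, here is a minimal sketch of binding a larger set of PASIDs onto a small pool of VMIDs. It is only an illustration: NUM_VMIDS, struct vmid_table, vmid_lookup_or_bind() and the round-robin eviction are assumptions made up for this example, not the actual HW scheduler or kfd code, and a real implementation would have to drain the evicted process's queues before reusing a VMID.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VMIDS 8   /* hypothetical size of the VMID range given to the scheduler */
#define NO_PASID  0   /* hypothetical marker for a free slot */

struct vmid_table {
	uint32_t pasid[NUM_VMIDS]; /* PASID currently bound to each VMID slot */
	unsigned next_victim;      /* round-robin eviction cursor */
};

/*
 * Return the slot already bound to this PASID, or bind it to a free slot,
 * or (when the pool is full) evict the next round-robin victim.
 */
static int vmid_lookup_or_bind(struct vmid_table *t, uint32_t pasid)
{
	int free_slot = -1;

	assert(pasid != NO_PASID);

	for (int i = 0; i < NUM_VMIDS; i++) {
		if (t->pasid[i] == pasid)
			return i;
		if (t->pasid[i] == NO_PASID && free_slot < 0)
			free_slot = i;
	}

	if (free_slot < 0) {
		/* Pool exhausted: reuse a slot in round-robin order. */
		free_slot = (int)t->next_victim;
		t->next_victim = (t->next_victim + 1) % NUM_VMIDS;
	}

	t->pasid[free_slot] = pasid;
	return free_slot;
}
```

With a zero-initialised table, the first NUM_VMIDS distinct PASIDs each get their own slot; a further PASID then takes over slot 0, which is how more processes than VMIDs can be served over time.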

Is there any documentation or specification online describing these
mechanisms?

Thanks,



Re: A simple alternative to GEMr

2013-10-03 Thread Ilyes Gouta
Hi,

DirectFB is a good example of doing it all in userspace. It works but at
the cost of ending up with pretty custom interfaces and non-standard ways
of handling things such as buffer addresses (physical) w.r.t. h/w
acceleration, IPC/RPC, buffer sharing for multi-process support, etc.
Memory management (and dma) has to be the kernel's duty.

Ilyes


On Thu, Oct 3, 2013 at 6:00 PM, Rob Clark robdcl...@gmail.com wrote:

 On Thu, Oct 3, 2013 at 7:48 AM, dm.leontiev7 dm.leonti...@gmail.com
 wrote:
  Hello
 
  In my opinion, graphics stack will benefit from moving memory management
 to userspace because there are tons of features not available in kernel,
 like simd or c++.

 both of which bring no benefit to memory management code

  Also, bugs in buffer management code will bite only one process, not the
 whole system.

 As soon as you need to pin pages (which you need to do, except for the
 hw that Jerome is targeting with his proposal where the GPU can
 really support virtual memory), memory management becomes a whole
 system issue..  pinning pages can only be done from the kernel and it
 is pretty frowned upon to have a driver that lets userspace pin
 arbitrary pages without being able to keep track of those pages and
 clean up.

 Anyways, it is much better to trust the kernel than userspace.  In
 system design, you must assume userspace is untrusted.  If you have
 enough tracking for random pages that userspace asks the kernel to pin
 for the gpu in order to cleanup when userspace process dies, then you
 have *more* complexity than what you have in GEM.  Trust me, it is far
 easier for the kernel to deal with buffer handles than having to go
 figure out the pages backing a random vma (get_user_pages()) and
 keeping track of things on a per-page basis.

 
  However, tile-based page flipping can be implemented without major
 changes in graphics stack and it may improve double-buffered 2D rendering
 performance by reducing amount of blitted pixels by reusing unchanged
 pages. If GPU's ROP units can take pixels from one location(front buffer)
 and put results to another one(back buffer), blitting may be completely
 avoided if a small area of double buffered window is updated.
 

 Taking pixels from one location to another sounds like blitting to me.
  But anyways, client GL app blitting (or otherwise) directly into
 front buffer is basically defeating the purpose of dri2

 And tile-based page flipping is an orthogonal topic to userspace vs
 kernel memory management.

  As for security, there are thousands of ways to perform a DoS attack. In
 Windows, one can eat so much RAM that the user will be unable to kill an app
 because the task manager will not start. To avoid this, some memory must be
 reserved for emergency situations, enough to perform 2D rendering by a single
 client. Multiple clients will be able to render their GUI without caching
 of window contents even under stress conditions. Also, the kernel DRI module
 must be able to warn a client if it must return memory to the system and reset
 its context on task manager request.
 

 With the current GEM design, buffers can be swapped out under memory
 pressure, or the appropriate cleanup done if OOM killer kills a
 userspace process.

 Doing the memory management in userspace, there are just so many ways
 that things can go wrong.  And once you've fixed those, you end up
 with something more complex.   Sorry, it is just a really bad idea.

 BR,
 -R

  Regards, Dmitry.
 
 
 
  Rob Clark robdcl...@gmail.com wrote:
 
 right, but by the time you do that, you've implemented enough memory
 tracking/management in the kernel, so you don't really win on
 complexity.  Otherwise those pinned pages will remain pinned, and you
 are still out of memory.
 
 BR,
 -R
 
 
 On Fri, Sep 27, 2013 at 7:53 PM, dm.leontiev7 dm.leonti...@gmail.com
 wrote:
  DoS from a client app is certainly a problem if we can't interrupt a
 program. But we can.
 
  The program ate all GPU RAM, OK. Let the WM cast the OOM killer on the GPU RAM
 eater.
 
  Rob Clark robdcl...@gmail.com wrote:
 
 sure, but userspace memory management is not a good idea for gpu's
 which cannot support page fault  resume, as it requires pinning
 pages.  In the best case (ignoring other issues), it allows any
 userspace that can use the GPU to easily construct a DoS attack by pinning
 all available memory.
 
 BR,
 -R
 
 On Fri, Sep 27, 2013 at 6:54 PM, dm.leontiev7 dm.leonti...@gmail.com
 wrote:
  My idea targets not only new gpus. it targets any GPU with MMU.
 
 
  I  just want the idea to be not patentable.
 
  Rob Clark robdcl...@gmail.com wrote:
 
 new gpu's can support coherency.. this is the HSA stuff (latest
 generation of radeon can support, and I think latest nv stuff as
 well.. probably not any current intel hw, though).  What Jerome was
 talking about is a bit different from what you are trying to do.
 
 On Fri, Sep 27, 2013 at 6:41 PM, dm.leontiev7 
 dm.leonti...@gmail.com wrote:
  Passing 

[PATCH 000/165] radeon drm-next patches

2013-06-29 Thread Ilyes Gouta
Hi,

> Yes, this works on my rv790: games get high, and using vdpau/gl for video
> stays low (which is nice as the fan on my card is too noisy on high).
>
> One thing which I guess 99.999% of people won't notice is that doing
> anything with plain X + fluxbox (so no compositing) very briefly ramps up
> the speed.
>
> As my fan is just audible on low but very quick to respond, I can hear
> every time the screen gets updated, e.g. switching desktops,
> browsing+scrolling or switching tabs; even typing dmesg in an xterm, which
> time shows as taking < 0.1 sec, results in a fan change.

Probably a low-pass filter is needed/has to be configured in front of the
freq. scaling engine?
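
To illustrate what such a low-pass filter could look like, here is a minimal sketch of an exponential moving average applied to GPU load samples before a power level is picked. Everything in it is hypothetical: struct load_filter, the 50% threshold and the two-level enum are assumptions for the example and are not taken from the radeon power management code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical power levels, for illustration only. */
enum power_level { LEVEL_LOW = 0, LEVEL_HIGH = 1 };

struct load_filter {
	uint32_t avg;   /* smoothed load in percent, 8.8 fixed point */
	uint32_t shift; /* smoothing strength: alpha = 1 / 2^shift */
};

static void load_filter_init(struct load_filter *f, uint32_t shift)
{
	assert(shift < 16);
	f->avg = 0;
	f->shift = shift;
}

/* Feed one load sample (0..100 percent), return the smoothed load. */
static uint32_t load_filter_update(struct load_filter *f, uint32_t load_pct)
{
	int32_t delta;

	assert(load_pct <= 100);

	/* avg += (sample - avg) / 2^shift */
	delta = (int32_t)(load_pct << 8) - (int32_t)f->avg;
	f->avg = (uint32_t)((int32_t)f->avg + delta / (1 << f->shift));
	return f->avg >> 8;
}

/* Pick a level from the smoothed load; the threshold is an assumption. */
static enum power_level pick_level(uint32_t smoothed_pct)
{
	return smoothed_pct >= 50 ? LEVEL_HIGH : LEVEL_LOW;
}
```

With shift = 3, a single 100% sample (such as a brief redraw) only lifts the smoothed load to about 12%, so the level stays low, while a sustained load converges towards 100% and does trigger the high level.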

Ilyes

>
>
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel



Re: [PATCH 2/2] drm: Redefine pixel formats

2011-11-16 Thread Ilyes Gouta
Hi Ville,

Regarding 3 plane YCbCr, DRM_FORMAT_yuv444 (non sub-sampled YCbCr)
would also be useful.
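
As a rough illustration of why the non-sub-sampled variant differs from the other three-plane formats, here is a sketch that computes the total buffer size for 8-bit, tightly packed three-plane YCbCr at different subsampling ratios. struct yuv_layout and yuv_planar_size() are hypothetical helpers for this example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chroma subsampling factors; 1 means the plane is not sub-sampled. */
struct yuv_layout {
	uint32_t hsub; /* horizontal factor */
	uint32_t vsub; /* vertical factor */
};

static const struct yuv_layout YUV444 = { 1, 1 };
static const struct yuv_layout YUV422 = { 2, 1 };
static const struct yuv_layout YUV420 = { 2, 2 };

/*
 * Total bytes for one Y plane plus two chroma planes, assuming 8 bits per
 * sample and no row padding. Chroma dimensions are rounded up.
 */
static size_t yuv_planar_size(struct yuv_layout l, uint32_t width,
			      uint32_t height)
{
	size_t luma, cw, ch;

	assert(l.hsub > 0 && l.vsub > 0);

	luma = (size_t)width * height;
	cw = (width + l.hsub - 1) / l.hsub;
	ch = (height + l.vsub - 1) / l.vsub;
	return luma + 2 * cw * ch;
}
```

For a 640x480 frame this gives 921600 bytes for 4:4:4, 614400 for 4:2:2 and 460800 for 4:2:0.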

-Ilyes

On Wed, Nov 16, 2011 at 7:42 PM,  ville.syrj...@linux.intel.com wrote:
 From: Ville Syrjälä ville.syrj...@linux.intel.com

 Name the formats as DRM_FORMAT_X instead of DRM_FOURCC_X. Use consistent
 names, especially for the RGB formats. Component order and byte order are
 now strictly specified for each format.

 The RGB format naming follows a convention where the components names
 and sizes are listed from left to right, matching the order within a
 single pixel from most significant bit to least significant bit. Lower
 case letters are used when listing the components to improve
 readability. I believe this convention matches the one used by pixman.
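
 To show how the naming convention reads in practice, here is a small sketch that packs components into a 16-bit pixel value following the "most significant bit first" order implied by the names r5g6b5 and x1r5g5b5. The helper names pack_r5g6b5() and pack_x1r5g5b5() are hypothetical and not part of the patch, and the sketch says nothing about how the 16-bit value is laid out in memory.

```c
#include <assert.h>
#include <stdint.h>

/* r5g6b5: red in bits 15..11, green in bits 10..5, blue in bits 4..0. */
static uint16_t pack_r5g6b5(uint8_t r, uint8_t g, uint8_t b)
{
	assert(r < 32 && g < 64 && b < 32);
	return (uint16_t)((r << 11) | (g << 5) | b);
}

/* x1r5g5b5: unused bit 15, red in 14..10, green in 9..5, blue in 4..0. */
static uint16_t pack_x1r5g5b5(uint8_t r, uint8_t g, uint8_t b)
{
	assert(r < 32 && g < 32 && b < 32);
	return (uint16_t)((r << 10) | (g << 5) | b);
}
```

 Reading the format name left to right gives the shift amounts directly: each component is shifted left by the total width of the components named after it.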

 The YUV format names vary more. For the 4:2:2 packed formats and 2
 plane formats use the fourcc. For the three plane formats the
 name includes the plane order and subsampling information using the
 standard subsampling notation. Some of those also happen to match
 the official fourcc definition.

 The fourccs for all the RGB formats and some of the YUV formats
 I invented myself. The idea was that looking at just the fourcc you
 get some idea what the format is about without having to decode it
 using some external reference.

 Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
 ---
  drivers/gpu/drm/drm_crtc.c           |   18 +++---
  drivers/gpu/drm/drm_crtc_helper.c    |   39 --
  drivers/gpu/drm/i915/intel_display.c |   18 ---
  include/drm/drm_fourcc.h             |   96 
 --
  4 files changed, 121 insertions(+), 50 deletions(-)

 diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
 index 30a70a4..761f265 100644
 --- a/drivers/gpu/drm/drm_crtc.c
 +++ b/drivers/gpu/drm/drm_crtc.c
 @@ -1918,28 +1918,28 @@ uint32_t drm_mode_legacy_fb_format(uint32_t bpp, 
 uint32_t depth)

        switch (bpp) {
        case 8:
 -               fmt = DRM_FOURCC_RGB332;
 +               fmt = DRM_FORMAT_r3g3b2;
                break;
        case 16:
                if (depth == 15)
 -                       fmt = DRM_FOURCC_RGB555;
 +                       fmt = DRM_FORMAT_x1r5g5b5;
                else
 -                       fmt = DRM_FOURCC_RGB565;
 +                       fmt = DRM_FORMAT_r5g6b5;
                break;
        case 24:
 -               fmt = DRM_FOURCC_RGB24;
 +               fmt = DRM_FORMAT_r8g8b8;
                break;
        case 32:
                if (depth == 24)
 -                       fmt = DRM_FOURCC_RGB24;
 +                       fmt = DRM_FORMAT_x8r8g8b8;
                else if (depth == 30)
 -                       fmt = DRM_INTEL_RGB30;
 +                       fmt = DRM_FORMAT_x2r10g10b10;
                else
 -                       fmt = DRM_FOURCC_RGB32;
 +                       fmt = DRM_FORMAT_a8r8g8b8;
                break;
        default:
 -               DRM_ERROR("bad bpp, assuming RGB24 pixel format\n");
 -               fmt = DRM_FOURCC_RGB24;
 +               DRM_ERROR("bad bpp, assuming x8r8g8b8 pixel format\n");
 +               fmt = DRM_FORMAT_x8r8g8b8;
                break;
        }

 diff --git a/drivers/gpu/drm/drm_crtc_helper.c 
 b/drivers/gpu/drm/drm_crtc_helper.c
 index 3e0645c..4ef19d37 100644
 --- a/drivers/gpu/drm/drm_crtc_helper.c
 +++ b/drivers/gpu/drm/drm_crtc_helper.c
 @@ -816,27 +816,54 @@ void drm_helper_get_fb_bpp_depth(uint32_t format, 
 unsigned int *depth,
                                 int *bpp)
  {
        switch (format) {
 -       case DRM_FOURCC_RGB332:
 +       case DRM_FORMAT_r3g3b2:
 +       case DRM_FORMAT_b2g3r3:
                *depth = 8;
                *bpp = 8;
                break;
 -       case DRM_FOURCC_RGB555:
 +       case DRM_FORMAT_x1r5g5b5:
 +       case DRM_FORMAT_x1b5g5r5:
 +       case DRM_FORMAT_r5g5b5x1:
 +       case DRM_FORMAT_b5g5r5x1:
 +       case DRM_FORMAT_a1r5g5b5:
 +       case DRM_FORMAT_a1b5g5r5:
 +       case DRM_FORMAT_r5g5b5a1:
 +       case DRM_FORMAT_b5g5r5a1:
                *depth = 15;
                *bpp = 16;
                break;
 -       case DRM_FOURCC_RGB565:
 +       case DRM_FORMAT_r5g6b5:
 +       case DRM_FORMAT_b5g6r5:
                *depth = 16;
                *bpp = 16;
                break;
 -       case DRM_FOURCC_RGB24:
 +       case DRM_FORMAT_r8g8b8:
 +       case DRM_FORMAT_b8g8r8:
 +               *depth = 24;
 +               *bpp = 24;
 +               break;
 +       case DRM_FORMAT_x8r8g8b8:
 +       case DRM_FORMAT_x8b8g8r8:
 +       case DRM_FORMAT_r8g8b8x8:
 +       case DRM_FORMAT_b8g8r8x8:
                *depth = 24;
                *bpp = 32;
                break;
 -       case DRM_INTEL_RGB30:
 +       case DRM_FORMAT_x2r10g10b10:
 +       case DRM_FORMAT_x2b10g10r10:
 +       case DRM_FORMAT_r10g10b10x2:
 +       case DRM_FORMAT_b10g10r10x2:
 +       case DRM_FORMAT_a2r10g10b10:
 +       case DRM_FORMAT_a2b10g10r10:
 +