date:20141010

[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #12 from Iago Toral ito...@igalia.com ---
(In reply to Jason Ekstrand from comment #11)
 (In reply to Iago Toral from comment #10)
  (In reply to Iago Toral from comment #9)
   Jason, piglit tests hit cases where they attempt to convert GL format and
   data type combinations that do not match any of the existing mesa formats.
   
   For example GL_RGB + GL_UNSIGNED_BYTE_2_3_3_REV (BBGG GRRR). The only mesa
   format of this kind is MESA_FORMAT_B2G3R3_UNORM (RRRG GGBB).
   
   This means that we don't have pack and unpack functions for these types,
   which we need in order to use a master conversion function. I think the
   natural thing to do would be to add new mesa_format types for these,
   together with their format_pack.c and format_unpack.c functions (which
   should be auto-generated too). I suppose it is okay to add new mesa_format
   enums, right?
  
  BTW, as an added bonus, with this approach we will speed up conversion for
  some of these types too. For example, the way Mesa currently handles
  GL_UNSIGNED_BYTE_2_3_3_REV to GL_RGBA UBYTE involves two conversions
  (2_3_3_REV -> RGBA FLOAT -> RGBA_UBYTE), while we would be able to do that
  in one go via the auto-generated unpack function.
 
 How many formats like this are there?  If it's only a few, then it probably
 makes sense to add the few mesa_formats that we need.

I don't know yet. To find out I would have to enable the master conversion
function for all code paths, run all the piglit tests, and then go through the
cases that hit my assertion one by one, removing duplicates, so it would take
some time.

In any case, even if there turn out to be a significant number of them: do we
have a good alternative? If we don't create mesa_formats for these types we
would have to handle them as exceptions to the process (which kind of defeats
the purpose of a master function). We would have to handle conversions from and
to these types through different paths and write the conversion functions we
need by hand...
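
As an illustration of the single-step path mentioned above, here is a minimal C sketch that unpacks one GL_UNSIGNED_BYTE_2_3_3_REV texel (bit layout BBGG GRRR) directly to 8-bit RGBA, with no float intermediate. The function names are hypothetical and this is not the auto-generated Mesa code; it only assumes the bit layout quoted in the message and fills alpha with 255.

```c
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): expand a `bits`-wide
 * unsigned normalized value to 8 bits, rounding to nearest. */
static uint8_t expand_unorm_to_8(unsigned value, unsigned bits)
{
    unsigned max = (1u << bits) - 1u;
    return (uint8_t)((value * 255u + max / 2u) / max);
}

/* Unpack one texel laid out as BBGG GRRR (red in the low three bits,
 * green in the next three, blue in the top two) to RGBA8. */
static void unpack_ubyte_2_3_3_rev(uint8_t texel, uint8_t rgba[4])
{
    rgba[0] = expand_unorm_to_8(texel & 0x7u, 3);
    rgba[1] = expand_unorm_to_8((texel >> 3) & 0x7u, 3);
    rgba[2] = expand_unorm_to_8((texel >> 6) & 0x3u, 2);
    rgba[3] = 255; /* GL_RGB carries no alpha; assume fully opaque */
}
```

Doing the expansion in integer arithmetic is what removes the intermediate RGBA float step.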

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #13 from Jason Ekstrand ja...@jlekstrand.net ---
(In reply to Iago Toral from comment #12)
 
 I don't know yet. To find out I would have to enable the master conversion
 function for all code paths, run all the piglit tests, and then go through
 the cases that hit my assertion one by one, removing duplicates, so it would
 take some time.
 
 In any case, even if there turn out to be a significant number of them: do we
 have a good alternative? If we don't create mesa_formats for these types we
 would have to handle them as exceptions to the process (which kind of defeats
 the purpose of a master function). We would have to handle conversions from
 and to these types through different paths and write the conversion functions
 we need by hand...

You should know once you write a gl_format_and_type_to_mesa_format function.  I
don't think there will be many.  I think OpenGL only specifies about 8 packed
formats (plus swizzling) and we should already have most of them.


Re: [Mesa-dev] [PATCH 1/3] winsys/radeon: Use separate caching buffer manager for each set of flags

2014-10-10 Thread Marek Olšák

I wonder if it wouldn't be nicer if the cache manager understood that
there are buffers with different flags, so that we don't have to have
so many of them.
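
To make the suggestion concrete, here is a toy C sketch of a single cache that records the flags each buffer was created with and only reuses a buffer when the flags match. All names here are hypothetical; this is not the pb_cache_manager API, only an illustration of the idea under the assumption that flags must match exactly.

```c
#include <stddef.h>

#define TOY_CACHE_SLOTS 8

struct toy_buffer {
    size_t   size;
    unsigned flags;   /* creation flags remembered per buffer */
    int      valid;   /* slot holds a buffer */
    int      in_use;  /* buffer is currently handed out */
};

struct toy_cache {
    struct toy_buffer slots[TOY_CACHE_SLOTS];
};

/* Park an idle buffer in the cache; returns 1 on success, 0 if full. */
static int toy_cache_put(struct toy_cache *c, size_t size, unsigned flags)
{
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        struct toy_buffer *b = &c->slots[i];
        if (!b->valid) {
            b->size = size;
            b->flags = flags;
            b->valid = 1;
            b->in_use = 0;
            return 1;
        }
    }
    return 0;
}

/* Reuse an idle buffer only if it is big enough AND was created with
 * exactly the same flags; otherwise report a miss with NULL. */
static struct toy_buffer *toy_cache_get(struct toy_cache *c, size_t size,
                                        unsigned flags)
{
    for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
        struct toy_buffer *b = &c->slots[i];
        if (b->valid && !b->in_use && b->size >= size && b->flags == flags) {
            b->in_use = 1;
            return b;
        }
    }
    return NULL;
}
```

With the flag comparison inside the lookup, one cache instance could serve all flag combinations instead of one manager per combination.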

Marek

On Thu, Oct 9, 2014 at 11:42 AM, Michel Dänzer mic...@daenzer.net wrote:
 From: Michel Dänzer michel.daen...@amd.com

 Otherwise the caching buffer manager may return a buffer which was created
 with a different set of flags, which can cause trouble.

 Cc: mesa-sta...@lists.freedesktop.org
 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 15 +++
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 50 
 +++
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.h |  8 ++--
  3 files changed, 32 insertions(+), 41 deletions(-)

 diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
 b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
 index e61e9fd..9518e53 100644
 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
 +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
 @@ -822,17 +822,12 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
  desc.flags = flags;

  /* Assign a buffer manager. */
 +    assert(flags < RADEON_NUM_CACHE_MANAGERS);
      if (use_reusable_pool) {
 -        if (domain == RADEON_DOMAIN_VRAM) {
 -            if (flags & RADEON_FLAG_GTT_WC)
 -                provider = ws->cman_vram_gtt_wc;
 -            else
 -                provider = ws->cman_vram;
 -        } else if (flags & RADEON_FLAG_GTT_WC) {
 -            provider = ws->cman_gtt_wc;
 -        } else {
 -            provider = ws->cman_gtt;
 -        }
 +        if (domain == RADEON_DOMAIN_VRAM)
 +            provider = ws->cman_vram[flags];
 +        else
 +            provider = ws->cman_gtt[flags];
      } else {
          provider = ws->kman;
      }
 diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
 b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
 index 3b695f9..c67549e 100644
 --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
 +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
 @@ -441,6 +441,7 @@ static boolean do_winsys_init(struct radeon_drm_winsys 
 *ws)
  static void radeon_winsys_destroy(struct radeon_winsys *rws)
  {
  struct radeon_drm_winsys *ws = (struct radeon_drm_winsys*)rws;
 +int i;

      if (ws->thread) {
          ws->kill_thread = 1;
 @@ -453,10 +454,10 @@ static void radeon_winsys_destroy(struct radeon_winsys *rws)
      pipe_mutex_destroy(ws->cmask_owner_mutex);
      pipe_mutex_destroy(ws->cs_stack_lock);

 -    ws->cman_vram->destroy(ws->cman_vram);
 -    ws->cman_vram_gtt_wc->destroy(ws->cman_vram_gtt_wc);
 -    ws->cman_gtt->destroy(ws->cman_gtt);
 -    ws->cman_gtt_wc->destroy(ws->cman_gtt_wc);
 +    for (i = 0; i < RADEON_NUM_CACHE_MANAGERS; i++) {
 +        ws->cman_gtt[i]->destroy(ws->cman_gtt[i]);
 +        ws->cman_vram[i]->destroy(ws->cman_vram[i]);
 +    }
      ws->kman->destroy(ws->kman);
      if (ws->gen >= DRV_R600) {
          radeon_surface_manager_free(ws->surf_man);
 @@ -643,6 +644,7 @@ PUBLIC struct radeon_winsys *
  radeon_drm_winsys_create(int fd, radeon_screen_create_t screen_create)
  {
  struct radeon_drm_winsys *ws;
 +int i;

  pipe_mutex_lock(fd_tab_mutex);
  if (!fd_tab) {
 @@ -671,22 +673,18 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
 screen_create)
      ws->kman = radeon_bomgr_create(ws);
      if (!ws->kman)
          goto fail;
 -    ws->cman_vram = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
 -                                            ws->info.vram_size / 8);
 -    if (!ws->cman_vram)
 -        goto fail;
 -    ws->cman_vram_gtt_wc = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
 +
 +    for (i = 0; i < RADEON_NUM_CACHE_MANAGERS; i++) {
 +        ws->cman_vram[i] = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
                                                     ws->info.vram_size / 8);
 -    if (!ws->cman_vram_gtt_wc)
 -        goto fail;
 -    ws->cman_gtt = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
 -                                           ws->info.gart_size / 8);
 -    if (!ws->cman_gtt)
 -        goto fail;
 -    ws->cman_gtt_wc = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
 -                                              ws->info.gart_size / 8);
 -    if (!ws->cman_gtt_wc)
 -        goto fail;
 +        if (!ws->cman_vram[i])
 +            goto fail;
 +
 +        ws->cman_gtt[i] = pb_cache_manager_create(ws->kman, 1000000, 2.0f, 0,
 +                                                  ws->info.gart_size / 8);
 +        if (!ws->cman_gtt[i])
 +            goto fail;
 +    }

      if (ws->gen >= DRV_R600) {
          ws->surf_man = radeon_surface_manager_new(fd);
 @@ -741,14 +739,12 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
 screen_create)

  fail:
  pipe_mutex_unlock(fd_tab_mutex);
 -    if (ws->cman_gtt)
 -        ws->cman_gtt->destroy(ws->cman_gtt);
 -    if (ws->cman_gtt_wc)
 -        ws->cman_gtt_wc->destroy(ws->cman_gtt_wc);
 -if

Re: [Mesa-dev] [PATCH 3/3] r600g, radeonsi: Only set use_staging_texture = TRUE once

2014-10-10 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Thu, Oct 9, 2014 at 11:42 AM, Michel Dänzer mic...@daenzer.net wrote:
 From: Michel Dänzer michel.daen...@amd.com

 No need to check for setting the flag after we set it already.

 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/gallium/drivers/radeon/r600_texture.c | 13 +
  1 file changed, 5 insertions(+), 8 deletions(-)

 diff --git a/src/gallium/drivers/radeon/r600_texture.c 
 b/src/gallium/drivers/radeon/r600_texture.c
 index 13df495..1d4e966 100644
 --- a/src/gallium/drivers/radeon/r600_texture.c
 +++ b/src/gallium/drivers/radeon/r600_texture.c
 @@ -924,19 +924,16 @@ static void *r600_texture_transfer_map(struct pipe_context *ctx,
       * the CPU is much happier reading out of cached system memory
       * than uncached VRAM.
       */
 -   if (rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D)
 +   if (rtex->surface.level[0].mode >= RADEON_SURF_MODE_1D) {
         use_staging_texture = TRUE;
 -
 -   /* Untiled buffers in VRAM, which is slow for CPU reads */
 -   if ((usage & PIPE_TRANSFER_READ) && !(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
 +   } else if ((usage & PIPE_TRANSFER_READ) && !(usage & PIPE_TRANSFER_MAP_DIRECTLY) &&
         (rtex->resource.domains == RADEON_DOMAIN_VRAM)) {
 +       /* Untiled buffers in VRAM, which is slow for CPU reads */
         use_staging_texture = TRUE;
 -   }
 -
 -   /* Use a staging texture for uploads if the underlying BO is busy. */
 -   if (!(usage & PIPE_TRANSFER_READ) &&
 +   } else if (!(usage & PIPE_TRANSFER_READ) &&
         (r600_rings_is_buffer_referenced(rctx, rtex->resource.cs_buf, RADEON_USAGE_READWRITE) ||
          rctx->ws->buffer_is_busy(rtex->resource.buf, RADEON_USAGE_READWRITE))) {
 +       /* Use a staging texture for uploads if the underlying BO is busy. */
         use_staging_texture = TRUE;
     }

 --
 2.1.1


[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #14 from Iago Toral ito...@igalia.com ---
(In reply to Jason Ekstrand from comment #13)
 
 You should know once you write a gl_format_and_type_to_mesa_format function.

I have that already, but it is not enough. At the moment, if I have detected
that a format is not an array format, I do something like this to decide if it
has a matching mesa format:

   for (int f = 1; f < MESA_FORMAT_COUNT; f++)
      if (_mesa_format_matches_format_and_type(f, format, type, swap_bytes))
         return f;

So the cases that don't match simply continue and hit an assertion.

 I don't think there will be many.  I think OpenGL only specifies about 8
 packed formats (plus swizzling) and we should already have most of them.

Let's assume they are not that many then.

On a different note, I have just noticed that the driver can select a different
texture format than the internal format specified by the client (glTexImage*)
when the specified format is not supported. This creates a requirement for
swizzle transformations where we need to do src->baseinternal->rgba->dst, but
the master function, as it is right now, does not know about the internal
format (it only knows src and dst, so it does src->rgba->dst), so it fails for
some of these cases.

For example, in one case I see that the client specifies MESA_FORMAT_I_SINT8
(swizzle xxxx) as the internal format for the texture, but the driver does not
support that and uses MESA_FORMAT_RGBA_SINT8 (swizzle 0123) instead. A master
function that only knows about MESA_FORMAT_RGBA_SINT8 and does not know that
the format requested by the client was MESA_FORMAT_I_SINT8 will not produce
correct results, since it would not be able to compute the right swizzle
transform for _mesa_swizzle_and_convert.

So my proposal is to pass the baseinternalformat to the master converter. If
there are cases where we do not care about an internalformat we can just pass
_mesa_get_format_base_format(dstFormat) and then have the master converter
compute a different swizzle when the provided internal format is different from
_mesa_get_format_base_format(dstFormat), which is what various parts of
texstore are doing now.

Sounds reasonable?
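
A minimal C sketch of the extra swizzle step follows, under the assumption that each stage can be expressed as a four-entry swizzle and that the stages are simply composed. The enum and function names are hypothetical stand-ins, not the Mesa definitions, and this is not how _mesa_swizzle_and_convert itself is implemented.

```c
#include <stdint.h>

/* Hypothetical swizzle selectors: four source channels plus constants. */
enum { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Compose two swizzles. Applying `first` gives tmp[i] = src[first[i]];
 * applying `second` on top gives dst[i] = tmp[second[i]].
 * The combined swizzle is therefore out[i] = first[second[i]], with
 * constant selectors (ZERO/ONE) in `second` passed through unchanged. */
static void compose_swizzles(const uint8_t first[4], const uint8_t second[4],
                             uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint8_t s = second[i];
        out[i] = (s <= SWZ_W) ? first[s] : s;
    }
}
```

For instance, composing a BGRA-to-RGBA swizzle (Z, Y, X, W) with an intensity-style base swizzle that replicates the first channel (X, X, X, X) yields (Z, Z, Z, Z): every destination channel reads the source's red byte, which a converter that only sees the RGBA destination format could not work out.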


Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()

2014-10-10 Thread Timothy Arceri

On Mon, 2014-10-06 at 17:03 +0200, Erik Faye-Lund wrote:
 On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
  Our current atan()-approximation is pretty inaccurate at 1.0, so
  let's try to improve the situation by doing a direct approximation
  without going through atan.
 
  This new implementation uses an 11th degree polynomial to approximate
  atan in the [-1..1] range, and the following identity to reduce the
  entire range to [-1..1]:
 
  atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)
 
  This range-reduction idea is taken from the paper "Fast computation
  of Arctangent Functions for Embedded Applications: A Comparative
  Analysis" (Ukil et al. 2011).
 
  The polynomial that approximates atan(x) is:
 
  x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
  x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
  x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
 
  This polynomial was found with the following GNU Octave script:
 
  x = linspace(0, 1);
  y = atan(x);
  n = [1, 3, 5, 7, 9, 11];
  format long;
  polyfitc(x, y, n)
 
  The polyfitc function is not built-in, but too long to include here.
  It can be downloaded from the following URL:
 
  http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
 
  This fixes the following piglit test:
  shaders/glsl-const-folding-01
 
  Signed-off-by: Erik Faye-Lund kusmab...@gmail.com
  Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 
 Ping?

Are you just looking for someone to commit this?
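
For readers following the thread, the scheme in the quoted commit message can be sketched in plain C as below. This is only an illustration, not the GLSL IR the patch generates. The leading coefficient is taken to be 0.9999793128310355, which is an assumption, chosen because it makes the coefficients sum to roughly pi/4 at x = 1. The function names are hypothetical.

```c
/* Odd 11th-degree polynomial for atan(x), intended for |x| <= 1.
 * Evaluated with Horner's scheme in x^2. */
static double atan_poly(double x)
{
    double x2 = x * x;
    double p = -0.0121323213173444;
    p = p * x2 + 0.0536813784310406;
    p = p * x2 - 0.1173503194786851;
    p = p * x2 + 0.1938924977115610;
    p = p * x2 - 0.3326756418091246;
    p = p * x2 + 0.9999793128310355; /* assumed leading coefficient */
    return p * x;
}

/* Range reduction: for |x| > 1 use
 * atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x). */
static double atan_approx(double x)
{
    const double half_pi = 1.57079632679489661923;

    if (x > 1.0)
        return half_pi - atan_poly(1.0 / x);
    if (x < -1.0)
        return -half_pi - atan_poly(1.0 / x);
    return atan_poly(x);
}
```

Because 1/x falls inside [-1..1] whenever |x| > 1, the polynomial is only ever evaluated on the interval it was fitted for.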


Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()

2014-10-10 Thread Erik Faye-Lund

On Fri, Oct 10, 2014 at 12:22 PM, Timothy Arceri t_arc...@yahoo.com.au wrote:
 On Mon, 2014-10-06 at 17:03 +0200, Erik Faye-Lund wrote:

 Ping?

 Are you just looking for someone to commit this?

Either that, or a reason for it to not be applied ;)

Re: [Mesa-dev] [PATCH] glsl: implement switch flow control using a loop

2014-10-10 Thread Francisco Jerez

Tapani Pälli tapani.pa...@intel.com writes:

 Hi;

 Any comments on this approach? I have also a branch that implements a
 'switch specific dead code elimination pass' but it is only enough to
 fix non-conditional breaks (fs-exec-after-break.shader_test). If I
 understand correctly fixing conditional breaks would need adding switch
 breaks as part of IR or wrapping switch as a loop like in the patch here.

 Thanks;


I like this solution because it has the advantage that it doesn't
increase the complexity of the IR that different back-ends will then
have to handle, as defining a new ir_switch instruction would. -- No
need for back-ends to re-implement the same logic to lower it to a chain
of if statements themselves.

Sure, it might be more optimal to implement the switch statement as a
jump table on some architectures in the rare cases where it's faster
than a chain or a binary tree of if conditionals.  But a majority of the
hardware we care about won't be able to do that anyway because the
argument of the switch statement can be an arbitrary non-uniform
expression, and for the minority that can handle it I'll be surprised if
it makes any significant difference.
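
The lowering under discussion can be pictured in plain C. The sketch below is an assumption-laden analogue, not the GLSL IR the patch emits: the switch body becomes a loop that runs at most once, `break` inside the switch becomes a loop break, and a `continue` aimed at the enclosing loop sets a flag (named `continue_inside` after the patch) before breaking out. The function names and the example cases are hypothetical.

```c
#include <stdbool.h>

/* Reference: an ordinary switch inside a loop, with break, continue
 * and a fall-through from case 2 into default. */
static int run_native(const int *vals, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        switch (vals[i]) {
        case 0:  acc += 1; break;
        case 1:  continue;          /* skips the += 100 below */
        case 2:  acc += 10;         /* falls through */
        default: acc += 20; break;
        }
        acc += 100;
    }
    return acc;
}

/* Same logic with the switch wrapped in a single-iteration loop and
 * expressed as a chain of if statements. */
static int run_lowered(const int *vals, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        bool continue_inside = false;
        int v = vals[i];

        for (;;) {                          /* the "switch loop" */
            bool is_fallthru = (v == 0);
            if (is_fallthru) { acc += 1; break; }

            is_fallthru = is_fallthru || (v == 1);
            if (is_fallthru) { continue_inside = true; break; }

            is_fallthru = is_fallthru || (v == 2);
            if (is_fallthru) { acc += 10; } /* no break: fall through */

            /* default: reached by fall-through or when nothing matched */
            acc += 20;
            break;
        }

        if (continue_inside)
            continue;                       /* continue the outer loop */
        acc += 100;
    }
    return acc;
}
```

Since only ordinary loops, breaks, and conditionals remain, existing loop-based passes such as dead code elimination need no switch-specific knowledge.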

Aside from the minor nit-pick below, this patch is:

Reviewed-by: Francisco Jerez curroje...@riseup.net

 // Tapani

 On 08/06/2014 02:21 PM, Tapani Pälli wrote:
 Patch removes old variable based logic for handling a break inside
 switch. Switch is put inside a loop so that existing infrastructure
 for loop flow control can be used for the switch, now also dead code
 elimination works properly.

 Possible 'continue' call inside a switch needs now special handling
 which is taken care of by detecting continue, breaking out and calling
 continue for the outside loop.

 Fixes following Piglit tests:

fs-exec-after-break.shader_test
fs-conditional-break.shader_test

 No Piglit or es3conform regressions.

 Signed-off-by: Tapani Pälli tapani.pa...@intel.com
 ---
  src/glsl/ast_to_hir.cpp   | 101 
 +++---
  src/glsl/glsl_parser_extras.h |   4 +-
  2 files changed, 68 insertions(+), 37 deletions(-)

 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index 30b02d0..4e3c48c 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -4366,7 +4366,7 @@ ast_jump_statement::hir(exec_list *instructions,
* loop.
*/
   if (state->loop_nesting_ast != NULL &&
 - mode == ast_continue) {
 + mode == ast_continue && !state->switch_state.is_switch_innermost) {
  if (state->loop_nesting_ast->rest_expression) {
 state->loop_nesting_ast->rest_expression->hir(instructions,
   state);
 @@ -4378,19 +4378,27 @@ ast_jump_statement::hir(exec_list *instructions,
   }
  
   if (state->switch_state.is_switch_innermost &&
 + mode == ast_continue) {
 +/* Set 'continue_inside' to true. */
 +ir_rvalue *const true_val = new (ctx) ir_constant(true);
 +ir_dereference_variable *deref_continue_inside_var =
 +   new(ctx) ir_dereference_variable(state->switch_state.continue_inside);
 +instructions->push_tail(new(ctx) ir_assignment(deref_continue_inside_var,
 +   true_val));
 +
 +/* Break out from the switch, continue for the loop will
 + * be called right after switch. */
 +ir_loop_jump *const jump =
 +   new(ctx) ir_loop_jump(ir_loop_jump::jump_break);
 +instructions->push_tail(jump);
 +
 + } else if (state->switch_state.is_switch_innermost &&
   mode == ast_break) {
 -/* Force break out of switch by setting is_break switch state.
 - */
 -ir_variable *const is_break_var = state->switch_state.is_break_var;
 -ir_dereference_variable *const deref_is_break_var =
 -   new(ctx) ir_dereference_variable(is_break_var);
 -ir_constant *const true_val = new(ctx) ir_constant(true);
 -ir_assignment *const set_break_var =
 -   new(ctx) ir_assignment(deref_is_break_var, true_val);
 -
 -instructions->push_tail(set_break_var);
 - }
 - else {
 +/* Force break out of switch by inserting a break. */
 +ir_loop_jump *const jump =
 +   new(ctx) ir_loop_jump(ir_loop_jump::jump_break);
 +instructions->push_tail(jump);
 + } else {
  ir_loop_jump *const jump =
 new(ctx) ir_loop_jump((mode == ast_break)
? ir_loop_jump::jump_break
 @@ -4502,19 +4510,19 @@ ast_switch_statement::hir(exec_list *instructions,
 instructions->push_tail(new(ctx) ir_assignment(deref_is_fallthru_var,
is_fallthru_val));
  
 -   /* Initalize

[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #15 from Jason Ekstrand ja...@jlekstrand.net ---
(In reply to Iago Toral from comment #14)
 (...)
   I don't know yet. To find out I would have to enable the master conversion
   function for all code paths, run all the piglit tests, and then go through
   the cases that hit my assertion one by one, removing duplicates, so it
   would take some time.
   
   In any case, even if there turn out to be a significant number of them: do
   we have a good alternative? If we don't create mesa_formats for these types
   we would have to handle them as exceptions to the process (which kind of
   defeats the purpose of a master function). We would have to handle
   conversions from and to these types through different paths and write the
   conversion functions we need by hand...
  
  You should know once you write a gl_format_and_type_to_mesa_format function.
 
 I have that already, but it is not enough. At the moment, if I have detected
 that a format is not an array format, I do something like this to decide if
 it has a matching mesa format:
 
    for (int f = 1; f < MESA_FORMAT_COUNT; f++)
       if (_mesa_format_matches_format_and_type(f, format, type, swap_bytes))
          return f;
 
 So the cases that don't match simply continue and hit an assertion.

You should be able to do this with a simple switch statement.  There aren't
that many of them.  According to the GL 1.2 docs for TexImage, there are:

GL_UNSIGNED_BYTE_3_3_2
GL_UNSIGNED_BYTE_2_3_3_REV
GL_UNSIGNED_SHORT_5_6_5
GL_UNSIGNED_SHORT_5_6_5_REV
GL_UNSIGNED_SHORT_4_4_4_4
GL_UNSIGNED_SHORT_4_4_4_4_REV
GL_UNSIGNED_SHORT_5_5_5_1
GL_UNSIGNED_SHORT_1_5_5_5_REV
GL_UNSIGNED_INT_8_8_8_8
GL_UNSIGNED_INT_8_8_8_8_REV
GL_UNSIGNED_INT_10_10_10_2
GL_UNSIGNED_INT_2_10_10_10_REV

I think they added 1 or 2 more in extensions, but that should be it.  Also, you
have to watch out for the GL_RGB vs. GL_BGR component ordering.
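
A switch over the packed types listed above might look like the following C sketch. It uses a local stand-in enum rather than the real GL enum values, and the struct and function names are hypothetical. It assumes the usual OpenGL convention that non-REV types place the first component in the most significant bits and _REV types place it in the least significant bits. The GL_RGB vs. GL_BGR ordering would still have to be handled separately.

```c
#include <stdbool.h>

/* Stand-ins for the GL packed type enums (values are NOT the GL ones). */
enum packed_type {
    PT_UBYTE_3_3_2, PT_UBYTE_2_3_3_REV,
    PT_USHORT_5_6_5, PT_USHORT_5_6_5_REV,
    PT_USHORT_4_4_4_4, PT_USHORT_4_4_4_4_REV,
    PT_USHORT_5_5_5_1, PT_USHORT_1_5_5_5_REV,
    PT_UINT_8_8_8_8, PT_UINT_8_8_8_8_REV,
    PT_UINT_10_10_10_2, PT_UINT_2_10_10_10_REV
};

struct packed_layout {
    int count;     /* number of components */
    int bits[4];   /* width of component 0..count-1 */
    int shift[4];  /* bit position of each component's lowest bit */
};

static void set_layout(struct packed_layout *l, bool rev,
                       int count, int b0, int b1, int b2, int b3)
{
    int total = b0 + b1 + b2 + b3, used = 0;
    l->count = count;
    l->bits[0] = b0; l->bits[1] = b1; l->bits[2] = b2; l->bits[3] = b3;
    l->shift[0] = l->shift[1] = l->shift[2] = l->shift[3] = 0;
    for (int i = 0; i < count; i++) {
        /* REV: first component sits in the low bits; otherwise in the high bits. */
        l->shift[i] = rev ? used : total - used - l->bits[i];
        used += l->bits[i];
    }
}

/* Component widths are given in component order (first component first). */
static bool describe_packed_type(enum packed_type t, struct packed_layout *l)
{
    switch (t) {
    case PT_UBYTE_3_3_2:         set_layout(l, false, 3, 3, 3, 2, 0);    return true;
    case PT_UBYTE_2_3_3_REV:     set_layout(l, true,  3, 3, 3, 2, 0);    return true;
    case PT_USHORT_5_6_5:        set_layout(l, false, 3, 5, 6, 5, 0);    return true;
    case PT_USHORT_5_6_5_REV:    set_layout(l, true,  3, 5, 6, 5, 0);    return true;
    case PT_USHORT_4_4_4_4:      set_layout(l, false, 4, 4, 4, 4, 4);    return true;
    case PT_USHORT_4_4_4_4_REV:  set_layout(l, true,  4, 4, 4, 4, 4);    return true;
    case PT_USHORT_5_5_5_1:      set_layout(l, false, 4, 5, 5, 5, 1);    return true;
    case PT_USHORT_1_5_5_5_REV:  set_layout(l, true,  4, 5, 5, 5, 1);    return true;
    case PT_UINT_8_8_8_8:        set_layout(l, false, 4, 8, 8, 8, 8);    return true;
    case PT_UINT_8_8_8_8_REV:    set_layout(l, true,  4, 8, 8, 8, 8);    return true;
    case PT_UINT_10_10_10_2:     set_layout(l, false, 4, 10, 10, 10, 2); return true;
    case PT_UINT_2_10_10_10_REV: set_layout(l, true,  4, 10, 10, 10, 2); return true;
    default:                     return false; /* not a known packed type */
    }
}
```

A real gl_format_and_type_to_mesa_format would return a mesa_format instead of a layout, but the shape of the switch would be similar.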

  I don't think there will be many.  I think OpenGL only specifies about 8
  packed formats (plus swizzling) and we should already have most of them.
 
 Let's assume they are not that many then.
 
 On a different note, I have just noticed that the driver can select a
 different texture format than the internal format specified by the client
 (glTexImage*), when the specified format is not supported. This creates a
 requirement for swizzle transformations where we need to do
 src->baseinternal->rgba->dst, but the master function, as it is right now,
 does not know about the internal format (only knows src and dst, so it does
 src->rgba->dst), so it fails for some of these cases.
 
 For example, in one case I see that the client specifies MESA_FORMAT_I_SINT8
 (swizzle xxxx) as the internal format for the texture, but the driver does
 not support that and uses MESA_FORMAT_RGBA_SINT8 (swizzle 0123) instead. A
 master function that only knows about MESA_FORMAT_RGBA_SINT8 and does not
 know that the format requested by the client was MESA_FORMAT_I_SINT8 will
 not produce correct results since it would not be able to compute the right
 swizzle transform for _mesa_swizzle_and_convert.
 
 So my proposal is to pass the baseinternalformat to the master converter. If
 there are cases where we do not care about an internalformat we can just
 pass _mesa_get_format_base_format(dstFormat) and then have the master
 converter compute a different swizzle when the provided internal format is
 different from _mesa_get_format_base_format(dstFormat), which is what
 various parts of texstore are doing now.
 
 Sounds reasonable?

Yes, we do need something for that.  Using the GL_RGB/RGBA enum would work
fine.  Another option would be to have an array of 4 bools that gives a channel
mask.


Re: [Mesa-dev] [PATCH 3/3] clover: add clCompile

2014-10-10 Thread Tom Stellard

On Thu, Oct 09, 2014 at 09:22:46PM +0200, EdB wrote:
 On Thursday, October 09, 2014 06:29:40 AM Tom Stellard wrote:
  On Sun, Sep 28, 2014 at 12:57:22PM +0200, EdB wrote:
   ---
   
src/gallium/state_trackers/clover/api/dispatch.cpp |  2 +-
src/gallium/state_trackers/clover/api/program.cpp  | 39
+++--- .../state_trackers/clover/core/compiler.hpp  
 | 12 ---
src/gallium/state_trackers/clover/core/error.hpp   |  2 +-
src/gallium/state_trackers/clover/core/program.cpp | 14 ++--
src/gallium/state_trackers/clover/core/program.hpp |  5 ++-
.../state_trackers/clover/llvm/invocation.cpp  | 39
++ 7 files changed, 93 insertions(+), 20
deletions(-)
   
   diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp
   b/src/gallium/state_trackers/clover/api/dispatch.cpp index
   35d150d..b5a4094 100644
   --- a/src/gallium/state_trackers/clover/api/dispatch.cpp
   +++ b/src/gallium/state_trackers/clover/api/dispatch.cpp
   @@ -122,7 +122,7 @@ namespace clover {
   
  clReleaseDevice,
  clCreateImage,
  clCreateProgramWithBuiltInKernels,
   
   -  NULL, // clCompileProgram
   +  clCompileProgram,
   
  NULL, // clLinkProgram
  clUnloadPlatformCompiler,
  NULL, // clGetKernelArgInfo
   
   diff --git a/src/gallium/state_trackers/clover/api/program.cpp
   b/src/gallium/state_trackers/clover/api/program.cpp index
   6771735..33df0cd 100644
   --- a/src/gallium/state_trackers/clover/api/program.cpp
   +++ b/src/gallium/state_trackers/clover/api/program.cpp
   @@ -152,14 +152,34 @@ CLOVER_API cl_int
   
clBuildProgram(cl_program d_prog, cl_uint num_devs,

   const cl_device_id *d_devs, const char *p_opts,
   void (*pfn_notify)(cl_program, void *),
   
   -   void *user_data) try {
   +   void *user_data) {
   +   cl_int error = clCompileProgram(d_prog, num_devs, d_devs, p_opts,
   +   0, 0, 0,
   +   pfn_notify, user_data);
   +   return error == CL_COMPILE_PROGRAM_FAILURE ?
   + CL_BUILD_PROGRAM_FAILURE : error;
   +}
   +
   +CLOVER_API cl_int
   +clCompileProgram(cl_program d_prog, cl_uint num_devs,
   + const cl_device_id *d_devs, const char *p_opts,
   + cl_uint num_headers, const cl_program *d_header_progs,
   + const char **header_names,
   + void (*pfn_notify)(cl_program, void *),
   + void *user_data) try {
   
   auto &prog = obj(d_prog);
   
   auto devs = (d_devs ? objs(d_devs, num_devs) :
ref_vector<device>(prog.context().devices()));
   
   auto opts = (p_opts ? p_opts : "");
   
   -   if (bool(num_devs) != bool(d_devs) ||
   -   (!pfn_notify && user_data))
   +   if (bool(num_devs) != bool(d_devs))
   +  throw error(CL_INVALID_VALUE);
   +
   +   if (!pfn_notify && user_data)
   +  throw error(CL_INVALID_VALUE);
   +
   +   if (bool(num_headers) != bool(header_names) ||
   +   bool(num_headers) != bool(d_header_progs))
   
  throw error(CL_INVALID_VALUE);
   
   if (any_of([&](const device &dev) {
   
   @@ -170,7 +190,18 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
   
   if (prog.kernel_ref_count())
   
  throw error(CL_INVALID_OPERATION);
   
   -   prog.build(devs, opts);
   +   std::map<const std::string, const std::string> headers;
   +   for (cl_uint i = 0; i < num_headers; ++i) {
   +  auto h_name = std::string(header_names[i]);
   +  auto h_prog = obj(d_header_progs[i]);
   +
   +  if (!h_prog.has_source)
   + throw error(CL_INVALID_OPERATION);
   +
   +  headers.insert(make_pair(h_name, h_prog.source()));
   +   }
   +
   +   prog.build(devs, opts, headers);
   
   return CL_SUCCESS;

} catch (error e) {
   
   diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp
   b/src/gallium/state_trackers/clover/core/compiler.hpp index
   6ef84d1..c2c4063 100644
   --- a/src/gallium/state_trackers/clover/core/compiler.hpp
   +++ b/src/gallium/state_trackers/clover/core/compiler.hpp
   @@ -29,11 +29,15 @@
   
#include "pipe/p_defines.h"

namespace clover {
   
   +   typedef compat::pair<compat::vector_ref<const char>,
   +compat::vector_ref<const char> > vector_ref_pair;
   +
   
   module compile_program_llvm(const compat::string source,
   
   -   pipe_shader_ir ir,
   -   const compat::string target,
   -   const compat::string opts,
   -   compat::string r_log);
   +   const compat::vector<vector_ref_pair> headers,
   +   pipe_shader_ir ir,
   +   const compat::string target,
   +   const compat::string opts,
   +

[Mesa-dev] [Bug 84570] Borderlands 2: Constant frame rate drops while playing; really bad with additionl lighting

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84570

--- Comment #15 from Kai k...@dev.carbon-project.org ---
(In reply to Michel Dänzer from comment #14)
 People reported that Mesa commit 7b4276d7acf2e0f77044cb50caa6ad936fa78786
 ('r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers')
 helped for Borderlands 2.

THAT is a LOT better! Even with DynamicLights on I only get occasional FPS
drops. Usually directly after entering a new area. Sometimes, when there's a
lot to draw, that moves, you can get the drops again as well. I played for
about an hour (with DynamicLights=true), and it didn't get worse with time.
It's just dependent on whether there is lots of new stuff to draw or there are
lots of moving parts with lighting eg. from fires or effects.

TL;DR: Almost there. There are still drops, but not as bad as before and almost
no complete 1 second freezes any longer.


My current stack is (Debian testing as a base):
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Mesa: Git:master/ac557b4c12 + attachment 107542 and attachment 107543
libdrm: Git:master/00847fa48b
LLVM: SVN:trunk/r219409 (3.6 devel)
X.Org: 2:1.16.1-1
Linux: Git:~agd5f/linux:drm-next-3.18:369283bfbd + attachment 107451 and
attachment 107544 (identifies itself as 3.17.0-rc5)
Firmware: http://people.freedesktop.org/~agd5f/radeon_ucode/
 9e05820da42549ce9c89d147cf1f8e19  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_ce.bin
 c8bab593090fc54f239c8d7596c8d846  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_mc.bin
 3618dbb955d8a84970e262bb2e6d2a16  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_me.bin
 c000b0fc9ff6582145f66504b0ec9597  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_mec.bin
 0643ad24b3beff2214cce533e094c1b7  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_pfp.bin
 ba6054b7d78184a74602fd81607e1386  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_rlc.bin
 11288f635737331b69de9ee82fe04898  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_sdma.bin
 284429675a5560e0fad42aa982965fc2  /lib/firmware/updates/3.17.0-citadel/radeon/hawaii_smc.bin
libclc: Git:master/7f6f5bff1f
DDX: Git:master/xf86-video-ati-7.5.0

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] clover: add clCompile

2014-10-10 Thread EdB

On Friday 10 October 2014 10:16:08 Tom Stellard wrote:
 On Thu, Oct 09, 2014 at 09:22:46PM +0200, EdB wrote:
  On Thursday, October 09, 2014 06:29:40 AM Tom Stellard wrote:
   On Sun, Sep 28, 2014 at 12:57:22PM +0200, EdB wrote:
---

 src/gallium/state_trackers/clover/api/dispatch.cpp |  2 +-
 src/gallium/state_trackers/clover/api/program.cpp  | 39 +++---
 .../state_trackers/clover/core/compiler.hpp        | 12 ---
 src/gallium/state_trackers/clover/core/error.hpp   |  2 +-
 src/gallium/state_trackers/clover/core/program.cpp | 14 ++--
 src/gallium/state_trackers/clover/core/program.hpp |  5 ++-
 .../state_trackers/clover/llvm/invocation.cpp      | 39 ++
 7 files changed, 93 insertions(+), 20 deletions(-)

diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp b/src/gallium/state_trackers/clover/api/dispatch.cpp
index 35d150d..b5a4094 100644
--- a/src/gallium/state_trackers/clover/api/dispatch.cpp
+++ b/src/gallium/state_trackers/clover/api/dispatch.cpp
@@ -122,7 +122,7 @@ namespace clover {

   clReleaseDevice,
   clCreateImage,
   clCreateProgramWithBuiltInKernels,

-  NULL, // clCompileProgram
+  clCompileProgram,

   NULL, // clLinkProgram
   clUnloadPlatformCompiler,
   NULL, // clGetKernelArgInfo

diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
index 6771735..33df0cd 100644
--- a/src/gallium/state_trackers/clover/api/program.cpp
+++ b/src/gallium/state_trackers/clover/api/program.cpp
@@ -152,14 +152,34 @@ CLOVER_API cl_int

 clBuildProgram(cl_program d_prog, cl_uint num_devs,
 
const cl_device_id *d_devs, const char *p_opts,
void (*pfn_notify)(cl_program, void *),

-   void *user_data) try {
+   void *user_data) {
+   cl_int error = clCompileProgram(d_prog, num_devs, d_devs, p_opts,
+   0, 0, 0,
+   pfn_notify, user_data);
+   return error == CL_COMPILE_PROGRAM_FAILURE ?
+ CL_BUILD_PROGRAM_FAILURE : error;
+}
+
+CLOVER_API cl_int
+clCompileProgram(cl_program d_prog, cl_uint num_devs,
+ const cl_device_id *d_devs, const char *p_opts,
+ cl_uint num_headers, const cl_program *d_header_progs,
+ const char **header_names,
+ void (*pfn_notify)(cl_program, void *),
+ void *user_data) try {

auto prog = obj(d_prog);

auto devs = (d_devs ? objs(d_devs, num_devs) :
 ref_vector<device>(prog.context().devices()));

auto opts = (p_opts ? p_opts : "");

-   if (bool(num_devs) != bool(d_devs) ||
-   (!pfn_notify && user_data))
+   if (bool(num_devs) != bool(d_devs))
+  throw error(CL_INVALID_VALUE);
+
+   if (!pfn_notify && user_data)
+  throw error(CL_INVALID_VALUE);
+
+   if (bool(num_headers) != bool(header_names) ||
+   bool(num_headers) != bool(d_header_progs))

   throw error(CL_INVALID_VALUE);

if (any_of([](const device dev) {

@@ -170,7 +190,18 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,

if (prog.kernel_ref_count())

   throw error(CL_INVALID_OPERATION);

-   prog.build(devs, opts);
+   std::map<const std::string, const std::string> headers;
+   for (cl_uint i = 0; i < num_headers; ++i) {
+  auto h_name = std::string(header_names[i]);
+  auto h_prog = obj(d_header_progs[i]);
+
+  if (!h_prog.has_source)
+ throw error(CL_INVALID_OPERATION);
+
+  headers.insert(make_pair(h_name, h_prog.source()));
+   }
+
+   prog.build(devs, opts, headers);

return CL_SUCCESS;
 
 } catch (error e) {

diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp
index 6ef84d1..c2c4063 100644
--- a/src/gallium/state_trackers/clover/core/compiler.hpp
+++ b/src/gallium/state_trackers/clover/core/compiler.hpp
@@ -29,11 +29,15 @@

 #include "pipe/p_defines.h"
 
 namespace clover {

+   typedef compat::pair<compat::vector_ref<const char>,
+compat::vector_ref<const char> > vector_ref_pair;
+

module compile_program_llvm(const compat::string source,

-   pipe_shader_ir ir,
-   const compat::string target,
-   const compat::string opts,
-   compat::string r_log);
+

Re: [Mesa-dev] [PATCH] glsl: improve accuracy of atan()

2014-10-10 Thread Olivier Galibert

Applied.

 OG.


On Fri, Sep 26, 2014 at 6:11 PM, Erik Faye-Lund kusmab...@gmail.com wrote:
 Our current atan()-approximation is pretty inaccurate at 1.0, so
 let's try to improve the situation by doing a direct approximation
 without going through asin.

 This new implementation uses an 11th degree polynomial to approximate
 atan in the [-1..1] range, and the following identity to reduce the
 entire range to [-1..1]:

 atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x)

 This range-reduction idea is taken from the paper "Fast computation
 of Arctangent Functions for Embedded Applications: A Comparative
 Analysis" (Ukil et al. 2011).

 The polynomial that approximates atan(x) is:

 x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
 x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
 x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444

 This polynomial was found with the following GNU Octave script:

 x = linspace(0, 1);
 y = atan(x);
 n = [1, 3, 5, 7, 9, 11];
 format long;
 polyfitc(x, y, n)

 The polyfitc function is not built-in, but too long to include here.
 It can be downloaded from the following URL:

 http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m
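
For illustration only, the range reduction and polynomial from the commit message can be sketched in plain C++ rather than Mesa IR. The function names `atan_poly` and `approx_atan` are made up for this sketch, and it assumes the leading coefficient is 0.9999793128310355 (the value for which the polynomial evaluates to roughly pi/4 at x = 1). It is not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Odd polynomial for atan(x) on [0, 1], evaluated in Horner form on t = x^2.
// The leading coefficient 0.9999793128310355 is an assumption of this sketch.
static double atan_poly(double x)
{
    const double t = x * x;
    double p = -0.0121323213173444;
    p = p * t + 0.0536813784310406;
    p = p * t - 0.1173503194786851;
    p = p * t + 0.1938924977115610;
    p = p * t - 0.3326756418091246;
    p = p * t + 0.9999793128310355;
    return p * x;
}

// Full-range atan: reduce |v| to [0, 1], evaluate, then undo the reduction.
static double approx_atan(double v)
{
    assert(!std::isnan(v)); // NaN handling is outside the scope of this sketch
    const double half_pi = 1.5707963267948966;
    const double a = std::fabs(v);
    // min(a, 1) / max(a, 1) equals a when a <= 1 and 1 / a otherwise.
    const double x = std::min(a, 1.0) / std::max(a, 1.0);
    double r = atan_poly(x);
    // For |v| > 1 apply atan(a) = pi/2 - atan(1/a).
    if (a > 1.0)
        r = half_pi - r;
    // Restore the sign of the input.
    return std::copysign(r, v);
}
```

Comparing `approx_atan(v)` against `std::atan(v)` for a sweep of inputs is a quick way to see how the error behaves around |v| = 1, where the old approximation was weakest.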

 This fixes the following piglit test:
 shaders/glsl-const-folding-01

 Signed-off-by: Erik Faye-Lund kusmab...@gmail.com
 Reviewed-by: Ian Romanick ian.d.roman...@intel.com
 ---
  src/glsl/builtin_functions.cpp | 65 +++---
  1 file changed, 55 insertions(+), 10 deletions(-)

 diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
 index 9be7f6d..c126b60 100644
 --- a/src/glsl/builtin_functions.cpp
 +++ b/src/glsl/builtin_functions.cpp
 @@ -442,6 +442,7 @@ private:
 ir_swizzle *matrix_elt(ir_variable *var, int col, int row);

 ir_expression *asin_expr(ir_variable *x);
 +   void do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x);

 /**
  * Call function \param f with parameters specified as the linked
 @@ -2684,11 +2685,7 @@ builtin_builder::_atan2(const glsl_type *type)
ir_factory outer_then(outer_if->then_instructions, mem_ctx);

/* Then...call atan(y/x) */
 -  ir_variable *y_over_x = outer_then.make_temp(glsl_type::float_type, "y_over_x");
 -  outer_then.emit(assign(y_over_x, div(y, x)));
 -  outer_then.emit(assign(r, mul(y_over_x, rsq(add(mul(y_over_x, y_over_x),
 -  imm(1.0f))))));
 -  outer_then.emit(assign(r, asin_expr(r)));
 +  do_atan(body, glsl_type::float_type, r, div(y, x));

/* ...and fix it up: */
ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
 @@ -2711,17 +2708,65 @@ builtin_builder::_atan2(const glsl_type *type)
 return sig;
  }

 +void
 +builtin_builder::do_atan(ir_factory body, const glsl_type *type, ir_variable *res, operand y_over_x)
 +{
 +   /*
 +    * range-reduction, first step:
 +    *
 +    *      / y_over_x         if |y_over_x| <= 1.0;
 +    * x = <
 +    *      \ 1.0 / y_over_x   otherwise
 +    */
 +   ir_variable *x = body.make_temp(type, "atan_x");
 +   body.emit(assign(x, div(min2(abs(y_over_x),
 +                                imm(1.0f)),
 +                           max2(abs(y_over_x),
 +                                imm(1.0f)))));
 +
 +   /*
 +    * approximate atan by evaluating polynomial:
 +    *
 +    * x   * 0.9999793128310355 - x^3  * 0.3326756418091246 +
 +    * x^5 * 0.1938924977115610 - x^7  * 0.1173503194786851 +
 +    * x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444
 +    */
 +   ir_variable *tmp = body.make_temp(type, "atan_tmp");
 +   body.emit(assign(tmp, mul(x, x)));
 +   body.emit(assign(tmp, mul(add(mul(sub(mul(add(mul(sub(mul(add(mul(imm(-0.0121323213173444f),
 +                                                                     tmp),
 +                                                                 imm(0.0536813784310406f)),
 +                                                             tmp),
 +                                                         imm(0.1173503194786851f)),
 +                                                     tmp),
 +                                                 imm(0.1938924977115610f)),
 +                                             tmp),
 +                                         imm(0.3326756418091246f)),
 +                                     tmp),
 +                                 imm(0.9999793128310355f)),
 +                             x)));
 +
 +   /* range-reduction fixup */
 +   body.emit(assign(tmp, add(tmp,
 +                             mul(b2f(greater(abs(y_over_x),
 +                                             imm(1.0f, type->components()))),
 +                                 add(mul(tmp,
 +                                         imm(-2.0f)),
 +                                     imm(M_PI_2f))))));
 +
 +   /* sign fixup */
 +   body.emit(assign(res,

Re: [Mesa-dev] [PATCH 3/3] clover: add clCompile

2014-10-10 Thread Tom Stellard

On Fri, Oct 10, 2014 at 07:51:40PM +0200, EdB wrote:
 On Friday 10 October 2014 10:16:08 Tom Stellard wrote:
  On Thu, Oct 09, 2014 at 09:22:46PM +0200, EdB wrote:
   On Thursday, October 09, 2014 06:29:40 AM Tom Stellard wrote:
On Sun, Sep 28, 2014 at 12:57:22PM +0200, EdB wrote:
 ---
 
  src/gallium/state_trackers/clover/api/dispatch.cpp |  2 +-
  src/gallium/state_trackers/clover/api/program.cpp  | 39 +++---
  .../state_trackers/clover/core/compiler.hpp        | 12 ---
  src/gallium/state_trackers/clover/core/error.hpp   |  2 +-
  src/gallium/state_trackers/clover/core/program.cpp | 14 ++--
  src/gallium/state_trackers/clover/core/program.hpp |  5 ++-
  .../state_trackers/clover/llvm/invocation.cpp      | 39 ++
  7 files changed, 93 insertions(+), 20 deletions(-)
 
 diff --git a/src/gallium/state_trackers/clover/api/dispatch.cpp b/src/gallium/state_trackers/clover/api/dispatch.cpp
 index 35d150d..b5a4094 100644
 --- a/src/gallium/state_trackers/clover/api/dispatch.cpp
 +++ b/src/gallium/state_trackers/clover/api/dispatch.cpp
 @@ -122,7 +122,7 @@ namespace clover {
 
clReleaseDevice,
clCreateImage,
clCreateProgramWithBuiltInKernels,
 
 -  NULL, // clCompileProgram
 +  clCompileProgram,
 
NULL, // clLinkProgram
clUnloadPlatformCompiler,
NULL, // clGetKernelArgInfo
 
 diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
 index 6771735..33df0cd 100644
 --- a/src/gallium/state_trackers/clover/api/program.cpp
 +++ b/src/gallium/state_trackers/clover/api/program.cpp
 @@ -152,14 +152,34 @@ CLOVER_API cl_int
 
  clBuildProgram(cl_program d_prog, cl_uint num_devs,
  
 const cl_device_id *d_devs, const char *p_opts,
 void (*pfn_notify)(cl_program, void *),
 
 -   void *user_data) try {
 +   void *user_data) {
 +   cl_int error = clCompileProgram(d_prog, num_devs, d_devs, p_opts,
 +   0, 0, 0,
 +   pfn_notify, user_data);
 +   return error == CL_COMPILE_PROGRAM_FAILURE ?
 + CL_BUILD_PROGRAM_FAILURE : error;
 +}
 +
 +CLOVER_API cl_int
 +clCompileProgram(cl_program d_prog, cl_uint num_devs,
 + const cl_device_id *d_devs, const char *p_opts,
 + cl_uint num_headers, const cl_program *d_header_progs,
 + const char **header_names,
 + void (*pfn_notify)(cl_program, void *),
 + void *user_data) try {
 
 auto prog = obj(d_prog);
 
 auto devs = (d_devs ? objs(d_devs, num_devs) :
  ref_vector<device>(prog.context().devices()));
 
 auto opts = (p_opts ? p_opts : "");
 
 -   if (bool(num_devs) != bool(d_devs) ||
 -   (!pfn_notify && user_data))
 +   if (bool(num_devs) != bool(d_devs))
 +  throw error(CL_INVALID_VALUE);
 +
 +   if (!pfn_notify && user_data)
 +  throw error(CL_INVALID_VALUE);
 +
 +   if (bool(num_headers) != bool(header_names) ||
 +   bool(num_headers) != bool(d_header_progs))
 
throw error(CL_INVALID_VALUE);
 
 if (any_of([](const device dev) {
 
 @@ -170,7 +190,18 @@ clBuildProgram(cl_program d_prog, cl_uint num_devs,
 
 if (prog.kernel_ref_count())
 
throw error(CL_INVALID_OPERATION);
 
 -   prog.build(devs, opts);
 +   std::map<const std::string, const std::string> headers;
 +   for (cl_uint i = 0; i < num_headers; ++i) {
 +  auto h_name = std::string(header_names[i]);
 +  auto h_prog = obj(d_header_progs[i]);
 +
 +  if (!h_prog.has_source)
 + throw error(CL_INVALID_OPERATION);
 +
 +  headers.insert(make_pair(h_name, h_prog.source()));
 +   }
 +
 +   prog.build(devs, opts, headers);
 
 return CL_SUCCESS;
  
  } catch (error e) {
 
 diff --git a/src/gallium/state_trackers/clover/core/compiler.hpp b/src/gallium/state_trackers/clover/core/compiler.hpp
 index 6ef84d1..c2c4063 100644
 --- a/src/gallium/state_trackers/clover/core/compiler.hpp
 +++ b/src/gallium/state_trackers/clover/core/compiler.hpp
 @@ -29,11 +29,15 @@
 
  #include "pipe/p_defines.h"
  
  namespace clover {
 
 +   typedef compat::pair<compat::vector_ref<const char>,
 +compat::vector_ref<const char> > vector_ref_pair;
 +
 
 module compile_program_llvm(const compat::string source,
 
 -   pipe_shader_ir ir,
 -

Re: [Mesa-dev] [PATCH 0/7] Tidying up of ubo/texbo state flagging

2014-10-10 Thread Anuj Phogat

On Wed, Oct 1, 2014 at 2:02 AM, Chris Forbes chr...@ijw.co.nz wrote:
 This series fixes some problems with UBO and TexBO state flagging:

 - glTexBuffer() and glTexBufferRange() never actually dirtied anything,
   and so didn't work unless something else happened to dirty the correct
   state (binding a UBO or a non-buffer texture, forcing a batch flush, ...)

 - i965 would reemit all the texture state when a UBO changed, even though
   the atom didn't actually depend on UBO state.

 - Reallocating the backing for any buffer object would cause all the
   texture state and UBO state to be reemitted, even if the buffer object had
   never been used as a buffer texture or a UBO.

 I noticed these issues while writing some simple test programs to explore
 how Mesa+i965 behaved when the app tries to minimize driver overhead.
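
A rough sketch of the kind of dirty-flag bookkeeping the cover letter describes, in standalone C++. All names here (`DIRTY_TEXTURE`, `StateAtom`, `usage_history`, and so on) are hypothetical and chosen for illustration; this is not the Mesa or i965 code. The idea shown is that each state atom is re-emitted only when a flag it depends on is dirty, and that reallocating a buffer flags only the binding points the buffer has actually been used with.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical dirty bits; a real driver has its own set of flags.
enum : uint32_t {
    DIRTY_TEXTURE        = 1u << 0,
    DIRTY_UNIFORM_BUFFER = 1u << 1,
    DIRTY_TEXTURE_BUFFER = 1u << 2,
};

struct StateAtom {
    const char *name;
    uint32_t depends_on; // dirty bits that force this atom to be re-emitted
    int emit_count;      // how many times the atom has been emitted
};

struct Context {
    uint32_t dirty = 0;
    std::vector<StateAtom> atoms;

    void flag(uint32_t bits) { dirty |= bits; }

    // Re-emit only the atoms whose dependencies intersect the dirty set.
    void upload()
    {
        for (auto &atom : atoms)
            if (atom.depends_on & dirty)
                ++atom.emit_count;
        dirty = 0;
    }
};

struct BufferObject {
    uint32_t usage_history = 0; // binding points this buffer has been used with
};

// Binding records how the buffer is used and dirties just that state.
static void bind_buffer(Context &ctx, BufferObject &bo, uint32_t binding_bit)
{
    assert(binding_bit != 0);
    bo.usage_history |= binding_bit;
    ctx.flag(binding_bit);
}

// Reallocating the backing storage dirties only state that can reference it.
static void reallocate(Context &ctx, BufferObject &bo)
{
    ctx.flag(bo.usage_history);
}
```

With this layout, reallocating a buffer that was never bound as a UBO or buffer texture leaves `dirty` at zero, so nothing is re-emitted on the next upload.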

Leaving aside the issue of how other drivers can make use of this new state flag,
this series is Reviewed-by: Anuj Phogat anuj.pho...@gmail.com

Re: [Mesa-dev] [PATCH 0/3] cl workdim v2

2014-10-10 Thread Jan Vesely

On Wed, 2014-10-08 at 18:02 +0300, Francisco Jerez wrote:
 Jan Vesely jan.ves...@rutgers.edu writes:
 
  [SNIP]
   
I also don't like that this way there is no difference between
explicit and implicit kernel arguments. On the other hand it's simple,
and does not need additional per driver code.
   
   Yeah...  We definitely want to hide these from the user, as e.g. the
   CL_KERNEL_NUM_ARGS param is required by the spec to return the number of
   arguments provided by the user, and we don't want the user to set
   implicit args, so it gets a bit messy.  I think I like better your
   original idea of passing them as launch_grid() arguments, even though
   the grid offset and dimension parameters are somewhat artificial from
   the hardware's point of view.
  
  sorry to bug you some more with this. I tried one more thing before
  going back to the launch_grid parameters. this time it implements a
  parallel infrastructure for implicit arguments by creating artificial
  module arguments for uint and size_t (I don't think we need more for
  implicit arguments).
  
  I only added the work dimension argument but adding more should be easy.
  If you think that the launch_grid way is better, I'll stop experimenting
  as I ran out of ideas I wanted to try.
 
  ping
  should I just resend using git instead of attachments?
 
 Hi Jan, I'm sorry, I finally had a while to have a look into this.  I've
 taken your series and tried to fix the couple of issues I wasn't very
 comfortable with, see the attached series.  Does it look OK to you?
 Note that it's completely untested, maybe you could give it a run on
 your system?

Hi,

It took me a while to get back to this too.

the first patch is kind of unrelated and imo can go in independently
(you can add my R-b).

I'll need to spend some more time (hopefully this weekend) to fully
understand the rest and give it a R-b (if you need/want it).
but it works (with the same changes to llvm and libclc as my patches
need), with the attached fix.
so with that change you can add my acked/tested by.
I ran a full piglit with no changes compared to my version

regards,
Jan


 
 Thanks.
 
 
  
  thanks,
  jan
 
  [SNIP]
 
  -- 
  Jan Vesely jan.ves...@rutgers.edu
 

-- 
Jan Vesely jan.ves...@rutgers.edu
diff --git a/src/gallium/state_trackers/clover/core/module.hpp b/src/gallium/state_trackers/clover/core/module.hpp
index 268e3ba..ee6caf9 100644
--- a/src/gallium/state_trackers/clover/core/module.hpp
+++ b/src/gallium/state_trackers/clover/core/module.hpp
@@ -80,7 +80,7 @@ namespace clover {
   enum semantic semantic = general) :
 type(type), size(size),
 target_size(target_size), target_align(target_align),
-ext_type(ext_type), semantic(general) { }
+ext_type(ext_type), semantic(semantic) { }
 
  argument(enum type type, size_t size) :
 type(type), size(size),
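
To spell out what the one-line fix above changes, here is a minimal standalone C++ sketch. The types `arg_semantic`, `argument_before` and `argument_after`, the enumerator `work_dim`, and the helper `count_user_args` are hypothetical names for illustration, not clover's. The first constructor hard-codes `general` in its initializer list and so ignores its parameter; the second stores what the caller passed. The helper shows why the distinction matters for the discussion quoted earlier: only explicit (general) arguments should be counted as user-visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the argument semantic; `work_dim` is an
// illustrative name for an implicit, driver-provided argument.
enum class arg_semantic { general, work_dim };

// Before the fix: the initializer hard-codes `general`, so the
// constructor parameter is silently ignored.
struct argument_before {
    explicit argument_before(std::size_t size,
                             arg_semantic semantic = arg_semantic::general)
        : size(size), semantic(arg_semantic::general) { assert(size > 0); }

    std::size_t size;
    arg_semantic semantic;
};

// After the fix: the member is initialized from the parameter of the
// same name (the parameter shadows the member inside the parentheses).
struct argument_after {
    explicit argument_after(std::size_t size,
                            arg_semantic semantic = arg_semantic::general)
        : size(size), semantic(semantic) { assert(size > 0); }

    std::size_t size;
    arg_semantic semantic;
};

// Count only the arguments the user is expected to set explicitly.
static std::size_t count_user_args(const std::vector<argument_after> &args)
{
    std::size_t n = 0;
    for (const auto &arg : args)
        if (arg.semantic == arg_semantic::general)
            ++n;
    return n;
}
```

With the buggy constructor every argument ends up tagged as general, so an implicit argument would be indistinguishable from a user-provided one.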



[Mesa-dev] [Bug 84894] New: Mesa 10.3 Breaks Arch Linux Multiseat

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=84894

            Bug ID: 84894
           Summary: Mesa 10.3 Breaks Arch Linux Multiseat
           Product: Mesa
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: shillshoc...@gmail.com

As reported here:
https://bbs.archlinux.org/viewtopic.php?pid=1464726

Updating to Mesa 10.3 causes multiseat to stop working with Arch Linux and
lightdm. The second seat shows a black screen on boot. It is not specific to a
card manufacturer: the problem exists with Nvidia, AMD and Intel configurations.


[Mesa-dev] [Bug 79706] [TRACKER] Mesa regression tracker

2014-10-10 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=79706

Vinson Lee v...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |83463


[Mesa-dev] [PATCH] auxilary/os: Add DragonFly BSD support in os_get_total_physical_memory.

2014-10-10 Thread Vinson Lee

This patch fixes this build error on DragonFly BSD.

  CC   os/os_misc.lo
os/os_misc.c: In function 'os_get_total_physical_memory':
os/os_misc.c:132:2: error: #error Unsupported *BSD

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/gallium/auxiliary/os/os_misc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/auxiliary/os/os_misc.c b/src/gallium/auxiliary/os/os_misc.c
index 4c5a22d..ebf033c 100644
--- a/src/gallium/auxiliary/os/os_misc.c
+++ b/src/gallium/auxiliary/os/os_misc.c
@@ -128,6 +128,8 @@ os_get_total_physical_memory(uint64_t *size)
mib[1] = HW_PHYSMEM64;
 #elif defined(PIPE_OS_FREEBSD)
mib[1] = HW_REALMEM;
+#elif defined(PIPE_OS_DRAGONFLY)
+   mib[1] = HW_PHYSMEM;
 #else
 #error Unsupported *BSD
 #endif
-- 
1.9.3
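
As a hedged, standalone illustration of the per-OS dispatch this patch extends, the sketch below queries total physical memory in C++. The function name `total_physical_memory` is made up, the compiler macros `__linux__`, `__FreeBSD__` and `__DragonFly__` stand in for Mesa's `PIPE_OS_*` defines, and the BSD branch (including the assumption that the sysctl value is an `unsigned long`) is untested. Unlike the patched code, unknown platforms return false here instead of hitting `#error`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__linux__)
#include <unistd.h>
#elif defined(__FreeBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Returns true and stores the amount of physical memory in bytes on success.
bool total_physical_memory(uint64_t *size)
{
    assert(size != nullptr);
#if defined(__linux__)
    const long pages = sysconf(_SC_PHYS_PAGES);
    const long page_size = sysconf(_SC_PAGE_SIZE);
    if (pages <= 0 || page_size <= 0)
        return false;
    *size = static_cast<uint64_t>(pages) * static_cast<uint64_t>(page_size);
    return true;
#elif defined(__FreeBSD__) || defined(__DragonFly__)
    int mib[2];
    mib[0] = CTL_HW;
#if defined(__FreeBSD__)
    mib[1] = HW_REALMEM;
#else
    mib[1] = HW_PHYSMEM; // the name the patch selects for DragonFly BSD
#endif
    unsigned long value = 0; // assumed value type; not verified on a BSD system
    size_t len = sizeof(value);
    if (sysctl(mib, 2, &value, &len, nullptr, 0) != 0)
        return false;
    *size = value;
    return true;
#else
    // The patched code uses #error for unknown BSDs; this sketch just fails.
    return false;
#endif
}
```

Returning false on unknown platforms keeps the sketch buildable everywhere, at the cost of hiding the missing case that the `#error` in the original deliberately surfaces at build time.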

