Re: [Mesa-dev] RFC about anv change that should be applicable to radv

2018-06-28 Thread Tapani Pälli



On 06/27/2018 06:25 PM, Mauro Rossi wrote:

Hi,

On Wed, 27 Jun 2018 at 10:41, Tapani Pälli
<tapani.pa...@intel.com> wrote:


Hi;

On 06/13/2018 09:32 AM, Mauro Rossi wrote:
 > +Samuel Pitoiset
 >
 >
 > 2018-06-11 22:31 GMT+02:00 Mauro Rossi <issor.or...@gmail.com>:
 >
 >     Hi Bas,
 >
 >     commit [1] removed a check on 'supported' attribute in
 >     src/intel/vulkan/anv_entrypoints_gen.py
 >
 >     Should the check on 'supported' attribute be removed also in
 >     src/amd/vulkan/radv_entrypoints_gen.py ?

Yes


In fact, with that change the vulkan.radv module works with amdgpu (dc=1)
on an HD7790.

I'm going to submit the patch to mesa-dev.


 >     Thanks for your feedback
 >     Mauro
 >
 >     [1]
 >

https://cgit.freedesktop.org/mesa/mesa/commit/?id=63525ba730e3d8a466d7f6382a2b91f4c75dd171
 >   
  

 >
 >
 >
 > Hi Sam, Bas,
 >
 > there is an important matter regarding Android builds (android-x86)
 >
 > I have a series of patches for Android makefiles to build radv
for that
 > platform,
 > they are building but they require the change in
 > src/amd/vulkan/radv_entrypoints_gen.py
 > to have the necessary entrypoints and vulkan apps starting to work.
 >
 > What I forgot to say is that I have the Android building rules
ready to
 > submit to mesa-dev,
 > but they require to build libLLVM with a different name e.g
libLLVM60 or
 > libLLVM_mesa and set the correct HAVE_LLVM properly
 >
 > The patches themselves would break the Android build for Intel,
because
 > amd tree is built unconditionally,
 > but the libLLVM"for mesa" shared library is not in place in AOSP,
Intel
 > builds and not even in android-x86 oreo,

This *should* not be a problem for us, since it is the dependencies set in
product.mk that define which libraries will be built. Android-IA
(nowadays "Celadon") should then simply not list the library built
for radv.


If vulkan.radv is not added to PRODUCT_PACKAGES list then it should not 
be a problem for Android-IA


The problem I'm referring to is the libLLVM shared dependency in itself

AOSP builds libLLVM from external/llvm for its own purposes; version 3.9
is bundled in the Oreo branches.
Recent Mesa has dropped support for that version. We cannot avoid
building the bundled libLLVM, so for Mesa we need to build libLLVM50
(a different shared library module name),
because the Android build system does not allow duplicate shared
library module names.


If this is not a problem I will submit the patches with Android building 
rules assuming that llvm shared library can be the current i.e. libLLVM,
but in reality to build vulkan.radv for Android, libLLVM must be version 
5.0 or later and requires "the trick" to side-build another 
external/llvm50 project.


Hmm that sucks but I guess there is no other way around it :/

This setup does not have many users, but upstreaming the Android radv
patches could make sense if it's not disruptive for other users.

If someone knows a different way or has a different view, please let me know.



One issue is that anv currently builds as
'vulkan.$(TARGET_BOARD_PLATFORM)', which is very generic. We should
probably have something like vulkan.anv.$(TARGET_BOARD_PLATFORM) instead
and then use ro.hardware.vulkan=vulkan.anv.$(TARGET_BOARD_PLATFORM) in
device.mk so that Android finds it (?)


vulkan.anv should be ok, as the vulkan HAL module name should not need 
to depend on the $(TARGET_BOARD_PLATFORM) label


If I recall correctly for a 'vulkan.anv' named module the property is 
set as:


setprop ro.hardware.vulkan anv

if set as ro.hardware.vulkan vulkan.anv the module may not be found.

In android-x86 we plan to set  the property in init.sh, according to the 
loaded drmfb module, for available vulkan hals:


for inteldrmfb setprop ro.hardware.vulkan anv
for amdgpudrmfb setprop ro.hardware.vulkan radv
[in the future, as an example] for virtiodrmfb setprop ro.hardware.vulkan virgl
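
As a rough illustration of why the property should hold only the suffix, here is a minimal C++ sketch. It assumes the loader builds the file name as "vulkan." + property value + ".so" under /vendor/lib/hw and /vendor/lib64/hw; the helper names hal_suffix_for_fb and vulkan_hal_paths are hypothetical, and the real Android loader may check more locations than these two.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: map the loaded DRM framebuffer driver name to the
// suffix stored in ro.hardware.vulkan (mapping taken from this thread).
std::string hal_suffix_for_fb(const std::string &fb)
{
    if (fb == "inteldrmfb")
        return "anv";
    if (fb == "amdgpudrmfb")
        return "radv";
    return "";  // no Vulkan HAL known for this framebuffer
}

// Hypothetical helper: candidate module paths for a property value,
// assuming the file name is "vulkan." + value + ".so".
std::vector<std::string> vulkan_hal_paths(const std::string &prop)
{
    assert(!prop.empty());
    std::vector<std::string> paths;
    for (const char *dir : {"/vendor/lib/hw/", "/vendor/lib64/hw/"})
        paths.push_back(std::string(dir) + "vulkan." + prop + ".so");
    return paths;
}
```

Under that assumption, a property value of "anv" yields /vendor/lib/hw/vulkan.anv.so, while "vulkan.anv" yields /vendor/lib/hw/vulkan.vulkan.anv.so, which matches no installed module.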


I chose to include product platform there because "preferred paths" for 
Vulkan HAL module [1] are:


/vendor/lib/hw/vulkan..so
/vendor/lib64/hw/vulkan..so

Celadon sets name in device.mk like this:
PRODUCT_PROPERTY_OVERRIDES += ro.hardware.vulkan=project-celadon

I don't think product name is explicitly required but maybe it would be 
good to be there if it is "preferred".


[1] https://source.android.com/devices/graphics/implement-vulkan



Mauro



 > so I wanted to start discussing how to integrate radv for Android
 > in a way that does not break other drivers' builds,
 > if there are other interested 

Re: [Mesa-dev] [PATCH 18/18] radeonsi: enable OpenGL 4.4 compat profile

2018-06-28 Thread Marek Olšák
For patches 15-18:

Reviewed-by: Marek Olšák 

Marek

On Thu, Jun 28, 2018 at 2:46 AM, Timothy Arceri  wrote:
> ---
>  src/gallium/drivers/radeonsi/si_get.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_get.c 
> b/src/gallium/drivers/radeonsi/si_get.c
> index 0e8617d0fee..96ff2a9e46b 100644
> --- a/src/gallium/drivers/radeonsi/si_get.c
> +++ b/src/gallium/drivers/radeonsi/si_get.c
> @@ -210,13 +210,12 @@ static int si_get_param(struct pipe_screen *pscreen, 
> enum pipe_cap param)
> return 4;
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
> +   case PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY:
> if (sscreen->info.has_indirect_compute_dispatch)
> -   return 450;
> +   return param == PIPE_CAP_GLSL_FEATURE_LEVEL ?
> +   450 : 440;
> return 420;
>
> -   case PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY:
> -   return 330;
> -
> case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
> return MIN2(sscreen->info.max_alloc_size, INT_MAX);
>
> --
> 2.17.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

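The effect of the patched switch arm can be sketched in isolation. This is a simplified C++ illustration, not the driver code: GlslCap and glsl_feature_level are hypothetical stand-ins for the two pipe_cap values and for si_get_param().

```cpp
// Hypothetical stand-ins for the two pipe_cap values touched by the patch.
enum class GlslCap { FeatureLevel, FeatureLevelCompatibility };

// Both caps now share one arm: with indirect compute dispatch the core
// profile reports 450 and the compat profile 440; otherwise both report 420.
int glsl_feature_level(GlslCap cap, bool has_indirect_compute_dispatch)
{
    if (has_indirect_compute_dispatch)
        return cap == GlslCap::FeatureLevel ? 450 : 440;
    return 420;
}
```

Before the patch the compatibility cap returned 330 unconditionally, so the visible change is 330 -> 440 (or 420 on hardware without indirect compute dispatch).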

Re: [Mesa-dev] [PATCH 14/18] mesa: add compat profile support for ARB_multi_draw_indirect

2018-06-28 Thread Marek Olšák
Same feedback as on patch 12.

Marek

On Thu, Jun 28, 2018 at 2:46 AM, Timothy Arceri  wrote:
> ---
>  src/mesa/main/extensions_table.h |  2 +-
>  src/mesa/vbo/vbo_exec_array.c| 75 +++-
>  2 files changed, 74 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index 1446a4bd421..12b796777df 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -88,7 +88,7 @@ EXT(ARB_invalidate_subdata  , dummy_true
>  EXT(ARB_map_buffer_alignment, dummy_true 
> , GLL, GLC,  x ,  x , 2011)
>  EXT(ARB_map_buffer_range, ARB_map_buffer_range   
> , GLL, GLC,  x ,  x , 2008)
>  EXT(ARB_multi_bind  , dummy_true 
> , GLL, GLC,  x ,  x , 2013)
> -EXT(ARB_multi_draw_indirect , ARB_draw_indirect  
> ,  x , GLC,  x ,  x , 2012)
> +EXT(ARB_multi_draw_indirect , ARB_draw_indirect  
> , GLL, GLC,  x ,  x , 2012)
>  EXT(ARB_multisample , dummy_true 
> , GLL,  x ,  x ,  x , 1994)
>  EXT(ARB_multitexture, dummy_true 
> , GLL,  x ,  x ,  x , 1998)
>  EXT(ARB_occlusion_query , ARB_occlusion_query
> , GLL,  x ,  x ,  x , 2001)
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 0d92de2e3ad..4e24cdcf263 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -1744,7 +1744,36 @@ vbo_exec_MultiDrawArraysIndirect(GLenum mode, const 
> GLvoid *indirect,
>
> /* If <stride> is zero, the array elements are treated as tightly packed. */
> if (stride == 0)
> -  stride = 4 * sizeof(GLuint);  /* sizeof(DrawArraysIndirectCommand) 
> */
> +  stride = sizeof(DrawArraysIndirectCommand);
> +
> +   /* From the ARB_draw_indirect spec:
> +*
> +*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
> +*compatibility profile, this indicates that DrawArraysIndirect and
> +*DrawElementsIndirect are to source their arguments directly from the
> +*pointer passed as their <indirect> parameters."
> +*/
> +   if (ctx->API == API_OPENGL_COMPAT &&
> +   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
> +
> +  if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
> +   "glMultiDrawArraysIndirect"))
> + return;
> +
> +  const ubyte *ptr = (const ubyte *) indirect;
> +  for (unsigned i = 0; i < primcount; i++) {
> + DrawArraysIndirectCommand *cmd = (DrawArraysIndirectCommand *) ptr;
> + _mesa_DrawArraysInstanced(mode, cmd->first, cmd->count, 
> cmd->primCount);
> +
> + if (stride == 0) {
> +ptr += sizeof(DrawArraysIndirectCommand);
> + } else {
> +ptr += stride;
> + }
> +  }
> +
> +  return;
> +   }
>
> FLUSH_FOR_DRAW(ctx);
>
> @@ -1783,7 +1812,49 @@ vbo_exec_MultiDrawElementsIndirect(GLenum mode, GLenum 
> type,
>
> /* If <stride> is zero, the array elements are treated as tightly packed. */
> if (stride == 0)
> -  stride = 5 * sizeof(GLuint);  /* 
> sizeof(DrawElementsIndirectCommand) */
> +  stride = sizeof(DrawElementsIndirectCommand);
> +
> +
> +   /* From the ARB_draw_indirect spec:
> +*
> +*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
> +*compatibility profile, this indicates that DrawArraysIndirect and
> +*DrawElementsIndirect are to source their arguments directly from the
> +*pointer passed as their <indirect> parameters."
> +*/
> +   if (ctx->API == API_OPENGL_COMPAT &&
> +   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
> +  /*
> +   * Unlike regular DrawElementsInstancedBaseVertex commands, the indices
> +   * may not come from a client array and must come from an index buffer.
> +   * If no element array buffer is bound, an INVALID_OPERATION error is
> +   * generated.
> +   */
> +  if (!_mesa_is_bufferobj(ctx->Array.VAO->IndexBufferObj)) {
> + _mesa_error(ctx, GL_INVALID_OPERATION,
> + "glMultiDrawElementsIndirect(no buffer bound "
> + "to GL_ELEMENT_ARRAY_BUFFER)");
> +
> + return;
> +  }
> +
> +  if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
> +   "glMultiDrawArraysIndirect"))
> + return;
> +
> +  const ubyte *ptr = (const ubyte *) indirect;
> +  for (unsigned i = 0; i < primcount; i++) {
> + vbo_exec_DrawElementsIndirect(mode, type, ptr);
> +
> + if (stride == 0) {
> +ptr += sizeof(DrawElementsIndirectCommand);
> + } else {
> +ptr += 

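The client-memory path in this patch walks primcount command records spaced stride bytes apart. A minimal C++ sketch of that walk follows; it is not the Mesa code. read_indirect_commands is a hypothetical helper, std::uint32_t stands in for GLuint, and std::memcpy is used so the sketch does not depend on the alignment of the caller's pointer. Because a zero stride is replaced by the packed size before the loop, the loop itself needs no second stride == 0 check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DrawArraysIndirectCommand {
    std::uint32_t count;
    std::uint32_t primCount;
    std::uint32_t first;
    std::uint32_t reservedMustBeZero;
};

// Hypothetical helper: copy `drawcount` records out of client memory.
std::vector<DrawArraysIndirectCommand>
read_indirect_commands(const void *indirect, std::size_t drawcount,
                       std::size_t stride)
{
    // A zero stride means the records are tightly packed.
    if (stride == 0)
        stride = sizeof(DrawArraysIndirectCommand);
    // ARB_multi_draw_indirect requires stride to be zero or a multiple of 4.
    assert(stride % 4 == 0);

    std::vector<DrawArraysIndirectCommand> cmds(drawcount);
    const unsigned char *base = static_cast<const unsigned char *>(indirect);
    for (std::size_t i = 0; i < drawcount; i++) {
        // Record i starts i * stride bytes into the client buffer.
        std::memcpy(&cmds[i], base + i * stride, sizeof(cmds[i]));
    }
    return cmds;
}
```

Each returned record would then be forwarded to the matching instanced draw call, as the patch does with _mesa_DrawArraysInstanced().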
Re: [Mesa-dev] [PATCH 13/18] mesa: make valid_draw_indirect_multi() accessible externally

2018-06-28 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Thu, Jun 28, 2018 at 2:46 AM, Timothy Arceri  wrote:
> We will use this to add compat support to ARB_multi_draw_indirect
> in the following patch.
> ---
>  src/mesa/main/draw_validate.c | 24 
>  src/mesa/main/draw_validate.h |  3 +++
>  2 files changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/src/mesa/main/draw_validate.c b/src/mesa/main/draw_validate.c
> index 352263c5c78..c0a234a2bc2 100644
> --- a/src/mesa/main/draw_validate.c
> +++ b/src/mesa/main/draw_validate.c
> @@ -1192,10 +1192,10 @@ valid_draw_indirect_elements(struct gl_context *ctx,
> return valid_draw_indirect(ctx, mode, indirect, size, name);
>  }
>
> -static inline GLboolean
> -valid_draw_indirect_multi(struct gl_context *ctx,
> -  GLsizei primcount, GLsizei stride,
> -  const char *name)
> +GLboolean
> +_mesa_valid_draw_indirect_multi(struct gl_context *ctx,
> +GLsizei primcount, GLsizei stride,
> +const char *name)
>  {
>
> /* From the ARB_multi_draw_indirect specification:
> @@ -1259,8 +1259,8 @@ _mesa_validate_MultiDrawArraysIndirect(struct 
> gl_context *ctx,
> /* caller has converted stride==0 to drawArraysNumParams * sizeof(GLuint) 
> */
> assert(stride != 0);
>
> -   if (!valid_draw_indirect_multi(ctx, primcount, stride,
> -  "glMultiDrawArraysIndirect"))
> +   if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
> +"glMultiDrawArraysIndirect"))
>return GL_FALSE;
>
> /* number of bytes of the indirect buffer which will be read */
> @@ -1287,8 +1287,8 @@ _mesa_validate_MultiDrawElementsIndirect(struct 
> gl_context *ctx,
> /* caller has converted stride==0 to drawElementsNumParams * 
> sizeof(GLuint) */
> assert(stride != 0);
>
> -   if (!valid_draw_indirect_multi(ctx, primcount, stride,
> -  "glMultiDrawElementsIndirect"))
> +   if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
> +"glMultiDrawElementsIndirect"))
>return GL_FALSE;
>
> /* number of bytes of the indirect buffer which will be read */
> @@ -1366,8 +1366,8 @@ _mesa_validate_MultiDrawArraysIndirectCount(struct 
> gl_context *ctx,
> /* caller has converted stride==0 to drawArraysNumParams * sizeof(GLuint) 
> */
> assert(stride != 0);
>
> -   if (!valid_draw_indirect_multi(ctx, maxdrawcount, stride,
> -  "glMultiDrawArraysIndirectCountARB"))
> +   if (!_mesa_valid_draw_indirect_multi(ctx, maxdrawcount, stride,
> +"glMultiDrawArraysIndirectCountARB"))
>return GL_FALSE;
>
> /* number of bytes of the indirect buffer which will be read */
> @@ -1397,8 +1397,8 @@ _mesa_validate_MultiDrawElementsIndirectCount(struct 
> gl_context *ctx,
> /* caller has converted stride==0 to drawElementsNumParams * 
> sizeof(GLuint) */
> assert(stride != 0);
>
> -   if (!valid_draw_indirect_multi(ctx, maxdrawcount, stride,
> -  "glMultiDrawElementsIndirectCountARB"))
> +   if (!_mesa_valid_draw_indirect_multi(ctx, maxdrawcount, stride,
> +
> "glMultiDrawElementsIndirectCountARB"))
>return GL_FALSE;
>
> /* number of bytes of the indirect buffer which will be read */
> diff --git a/src/mesa/main/draw_validate.h b/src/mesa/main/draw_validate.h
> index 7a181153fb7..d015c7e830e 100644
> --- a/src/mesa/main/draw_validate.h
> +++ b/src/mesa/main/draw_validate.h
> @@ -44,6 +44,9 @@ _mesa_is_valid_prim_mode(const struct gl_context *ctx, 
> GLenum mode);
>  extern GLboolean
>  _mesa_valid_prim_mode(struct gl_context *ctx, GLenum mode, const char *name);
>
> +extern GLboolean
> +_mesa_valid_draw_indirect_multi(struct gl_context *ctx, GLsizei primcount,
> +GLsizei stride, const char *name);
>
>  extern GLboolean
>  _mesa_validate_DrawArrays(struct gl_context *ctx, GLenum mode, GLsizei 
> count);
> --
> 2.17.1
>


Re: [Mesa-dev] [PATCH 12/18] mesa: add ARB_draw_indirect support to compat profile

2018-06-28 Thread Marek Olšák
On Thu, Jun 28, 2018 at 2:46 AM, Timothy Arceri  wrote:
> ---
>  src/mesa/main/bufferobj.c|  3 +-
>  src/mesa/main/extensions_table.h |  2 +-
>  src/mesa/vbo/vbo_exec_array.c| 66 +++-
>  3 files changed, 67 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index 67f9cd0a902..1d1e51bc015 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -129,8 +129,7 @@ get_buffer_target(struct gl_context *ctx, GLenum target)
>   return &ctx->QueryBuffer;
>break;
> case GL_DRAW_INDIRECT_BUFFER:
> -  if ((ctx->API == API_OPENGL_CORE &&
> -   ctx->Extensions.ARB_draw_indirect) ||
> +  if ((_mesa_is_desktop_gl(ctx) && ctx->Extensions.ARB_draw_indirect) ||
> _mesa_is_gles31(ctx)) {
>   return &ctx->DrawIndirectBuffer;
>}
> diff --git a/src/mesa/main/extensions_table.h 
> b/src/mesa/main/extensions_table.h
> index f79a52cee8c..1446a4bd421 100644
> --- a/src/mesa/main/extensions_table.h
> +++ b/src/mesa/main/extensions_table.h
> @@ -58,7 +58,7 @@ EXT(ARB_direct_state_access , dummy_true
>  EXT(ARB_draw_buffers, dummy_true 
> , GLL, GLC,  x ,  x , 2002)
>  EXT(ARB_draw_buffers_blend  , ARB_draw_buffers_blend 
> , GLL, GLC,  x ,  x , 2009)
>  EXT(ARB_draw_elements_base_vertex   , ARB_draw_elements_base_vertex  
> , GLL, GLC,  x ,  x , 2009)
> -EXT(ARB_draw_indirect   , ARB_draw_indirect  
> ,  x , GLC,  x ,  x , 2010)
> +EXT(ARB_draw_indirect   , ARB_draw_indirect  
> , GLL, GLC,  x ,  x , 2010)
>  EXT(ARB_draw_instanced  , ARB_draw_instanced 
> , GLL, GLC,  x ,  x , 2008)
>  EXT(ARB_enhanced_layouts, ARB_enhanced_layouts   
> , GLL, GLC,  x ,  x , 2013)
>  EXT(ARB_explicit_attrib_location, ARB_explicit_attrib_location   
> , GLL, GLC,  x ,  x , 2009)
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 792907ac044..0d92de2e3ad 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -39,6 +39,21 @@
>  #include "main/macros.h"
>  #include "main/transformfeedback.h"
>
> +typedef struct {
> +   GLuint count;
> +   GLuint primCount;
> +   GLuint first;
> +   GLuint reservedMustBeZero;
> +} DrawArraysIndirectCommand;
> +
> +typedef struct {
> +   GLuint count;
> +   GLuint primCount;
> +   GLuint firstIndex;
> +   GLint  baseVertex;
> +   GLuint reservedMustBeZero;
> +} DrawElementsIndirectCommand;
> +

reservedMustBeZero is redefined by ARB_base_instance. I'm sure you'll
find out what needs to be changed here. :)

Marek
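
For context on the remark above: ARB_base_instance renames the last field of both command structures to baseInstance, so a client-memory path would have to forward it rather than ignore it. The C++ sketch below only illustrates the layout; std::uint32_t and std::int32_t stand in for GLuint and GLint, and ArraysDraw and to_draw are hypothetical names, not Mesa API.

```cpp
#include <cstddef>
#include <cstdint>

struct DrawArraysIndirectCommand {
    std::uint32_t count;
    std::uint32_t primCount;
    std::uint32_t first;
    std::uint32_t baseInstance;   // was reservedMustBeZero
};

struct DrawElementsIndirectCommand {
    std::uint32_t count;
    std::uint32_t primCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t baseInstance;   // was reservedMustBeZero
};

// The packed strides used when the caller passes stride == 0.
static_assert(sizeof(DrawArraysIndirectCommand) == 4 * sizeof(std::uint32_t),
              "arrays command must be tightly packed");
static_assert(sizeof(DrawElementsIndirectCommand) == 5 * sizeof(std::uint32_t),
              "elements command must be tightly packed");

// Hypothetical argument bundle for a DrawArraysInstancedBaseInstance-style
// call: (first, count, instance count, base instance).
struct ArraysDraw {
    std::int32_t  first;
    std::int32_t  count;
    std::int32_t  instances;
    std::uint32_t baseInstance;
};

ArraysDraw to_draw(const DrawArraysIndirectCommand &cmd)
{
    return ArraysDraw{static_cast<std::int32_t>(cmd.first),
                      static_cast<std::int32_t>(cmd.count),
                      static_cast<std::int32_t>(cmd.primCount),
                      cmd.baseInstance};
}
```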

>
>  /**
>   * Check that element 'j' of the array has reasonable data.
> @@ -1616,6 +1631,20 @@ vbo_exec_DrawArraysIndirect(GLenum mode, const GLvoid 
> *indirect)
>_mesa_debug(ctx, "glDrawArraysIndirect(%s, %p)\n",
>_mesa_enum_to_string(mode), indirect);
>
> +   /* From the ARB_draw_indirect spec:
> +*
> +*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
> +*compatibility profile, this indicates that DrawArraysIndirect and
> +*DrawElementsIndirect are to source their arguments directly from the
> +*pointer passed as their <indirect> parameters."
> +*/
> +   if (ctx->API == API_OPENGL_COMPAT &&
> +   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
> +  DrawArraysIndirectCommand *cmd = (DrawArraysIndirectCommand *) 
> indirect;
> +  _mesa_DrawArraysInstanced(mode, cmd->first, cmd->count, 
> cmd->primCount);
> +  return;
> +   }
> +
> FLUSH_FOR_DRAW(ctx);
>
> if (_mesa_is_no_error_enabled(ctx)) {
> @@ -1647,6 +1676,41 @@ vbo_exec_DrawElementsIndirect(GLenum mode, GLenum 
> type, const GLvoid *indirect)
>_mesa_enum_to_string(mode),
>_mesa_enum_to_string(type), indirect);
>
> +   /* From the ARB_draw_indirect spec:
> +*
> +*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
> +*compatibility profile, this indicates that DrawArraysIndirect and
> +*DrawElementsIndirect are to source their arguments directly from the
> +*pointer passed as their <indirect> parameters."
> +*/
> +   if (ctx->API == API_OPENGL_COMPAT &&
> +   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
> +  /*
> +   * Unlike regular DrawElementsInstancedBaseVertex commands, the indices
> +   * may not come from a client array and must come from an index buffer.
> +   * If no element array buffer is bound, an INVALID_OPERATION error is
> +   * generated.
> +   */
> +  if (!_mesa_is_bufferobj(ctx->Array.VAO->IndexBufferObj)) {
> + _mesa_error(ctx, GL_INVALID_OPERATION,
> + "glDrawElementsIndirect(no buffer bound "

Re: [Mesa-dev] [PATCH 11/18] mesa: generate GL_INVALID_OPERATION using draw indirect in dlist

2018-06-28 Thread Marek Olšák
For patches 1-11:

Reviewed-by: Marek Olšák 

Marek

On Thu, Jun 28, 2018 at 2:46 AM, Timothy Arceri  wrote:
> The spec doesn't explicitly say to generate an error but since
> DrawArraysInstanced* and DrawElementsInstanced* do, it makes
> sense to do it for these functions also.
> ---
>  src/mesa/main/dlist.c | 47 +++
>  1 file changed, 47 insertions(+)
>
> diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
> index e2ab2eb8aa1..5ff0a23018c 100644
> --- a/src/mesa/main/dlist.c
> +++ b/src/mesa/main/dlist.c
> @@ -1913,6 +1913,47 @@ 
> save_DrawElementsInstancedBaseVertexBaseInstance(UNUSED GLenum mode,
> "glDrawElementsInstancedBaseVertexBaseInstance() during 
> display list compile");
>  }
>
> +static void APIENTRY
> +save_DrawArraysIndirect(UNUSED GLenum mode,
> +UNUSED const void *indirect)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   _mesa_error(ctx, GL_INVALID_OPERATION,
> +   "glDrawArraysIndirect() during display list compile");
> +}
> +
> +static void APIENTRY
> +save_DrawElementsIndirect(UNUSED GLenum mode,
> +  UNUSED GLenum type,
> +  UNUSED const void *indirect)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   _mesa_error(ctx, GL_INVALID_OPERATION,
> +   "glDrawElementsIndirect() during display list compile");
> +}
> +
> +static void APIENTRY
> +save_MultiDrawArraysIndirect(UNUSED GLenum mode,
> + UNUSED const void *indirect,
> + UNUSED GLsizei primcount,
> + UNUSED GLsizei stride)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   _mesa_error(ctx, GL_INVALID_OPERATION,
> +   "glMultiDrawArraysIndirect() during display list compile");
> +}
> +
> +static void APIENTRY
> +save_MultiDrawElementsIndirect(UNUSED GLenum mode,
> +   UNUSED GLenum type,
> +   UNUSED const void *indirect,
> +   UNUSED GLsizei primcount,
> +   UNUSED GLsizei stride)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +   _mesa_error(ctx, GL_INVALID_OPERATION,
> +   "glMultiDrawElementsIndirect() during display list compile");
> +}
>
>  /**
>   * While building a display list we cache some OpenGL state.
> @@ -11410,6 +11451,12 @@ _mesa_initialize_save_table(const struct gl_context 
> *ctx)
> SET_DrawElementsInstancedBaseInstance(table, 
> save_DrawElementsInstancedBaseInstance);
> SET_DrawElementsInstancedBaseVertexBaseInstance(table, 
> save_DrawElementsInstancedBaseVertexBaseInstance);
>
> +   /* GL_ARB_draw_indirect / GL_ARB_multi_draw_indirect */
> +   SET_DrawArraysIndirect(table, save_DrawArraysIndirect);
> +   SET_DrawElementsIndirect(table, save_DrawElementsIndirect);
> +   SET_MultiDrawArraysIndirect(table, save_MultiDrawArraysIndirect);
> +   SET_MultiDrawElementsIndirect(table, save_MultiDrawElementsIndirect);
> +
> /* OpenGL 4.2 / GL_ARB_separate_shader_objects */
> SET_UseProgramStages(table, save_UseProgramStages);
> SET_ProgramUniform1f(table, save_ProgramUniform1f);
> --
> 2.17.1
>

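The pattern in this patch is a per-mode dispatch table: while a display list is being compiled, the indirect draw entry points are replaced by stubs that only record GL_INVALID_OPERATION. A minimal C++ sketch of the idea follows, using a hypothetical Context and DispatchTable rather than Mesa's real gl_context and dispatch table.

```cpp
// Hypothetical, simplified error state and context.
enum class Error { None, InvalidOperation };

struct Context {
    Error last_error = Error::None;
    int draws = 0;  // number of draws actually executed
};

using DrawArraysIndirectFn = void (*)(Context &, const void *indirect);

// Immediate-mode entry point: performs the draw.
void exec_DrawArraysIndirect(Context &ctx, const void *)
{
    ++ctx.draws;
}

// Display-list-compile entry point: draws nothing, records the error.
void save_DrawArraysIndirect(Context &ctx, const void *)
{
    ctx.last_error = Error::InvalidOperation;
}

struct DispatchTable {
    DrawArraysIndirectFn DrawArraysIndirect;
};

DispatchTable make_exec_table() { return DispatchTable{exec_DrawArraysIndirect}; }
DispatchTable make_save_table() { return DispatchTable{save_DrawArraysIndirect}; }
```

Swapping the active table when list compilation starts is what makes the same API call either draw or fail, without any mode check inside the draw function itself.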

Re: [Mesa-dev] [PATCH] glsl: enable EXT_texture_array by default in 1.30

2018-06-28 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Jun 29, 2018 at 12:42 AM, Timothy Arceri  wrote:
> This extension was made core in OpenGL 3.0.
>
> This fixes rendering issues in No Man's Sky.
> ---
>  src/compiler/glsl/builtin_functions.cpp | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/glsl/builtin_functions.cpp 
> b/src/compiler/glsl/builtin_functions.cpp
> index 7119903795f..787a72b49c5 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -336,20 +336,22 @@ static bool
>  texture_array_lod(const _mesa_glsl_parse_state *state)
>  {
> return lod_exists_in_stage(state) &&
> -  state->EXT_texture_array_enable;
> +  (state->is_version(130, 0) ||
> +   state->EXT_texture_array_enable);
>  }
>
>  static bool
>  fs_texture_array(const _mesa_glsl_parse_state *state)
>  {
> return state->stage == MESA_SHADER_FRAGMENT &&
> -  state->EXT_texture_array_enable;
> +  (state->is_version(130, 0) ||
> +   state->EXT_texture_array_enable);
>  }
>
>  static bool
>  texture_array(const _mesa_glsl_parse_state *state)
>  {
> -   return state->EXT_texture_array_enable;
> +   return state->is_version(130, 0) || state->EXT_texture_array_enable;
>  }
>
>  static bool
> --
> 2.17.1
>


[Mesa-dev] [PATCH] glsl: enable EXT_texture_array by default in 1.30

2018-06-28 Thread Timothy Arceri
This extension was made core in OpenGL 3.0.

This fixes rendering issues in No Man's Sky.
---
 src/compiler/glsl/builtin_functions.cpp | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp 
b/src/compiler/glsl/builtin_functions.cpp
index 7119903795f..787a72b49c5 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -336,20 +336,22 @@ static bool
 texture_array_lod(const _mesa_glsl_parse_state *state)
 {
return lod_exists_in_stage(state) &&
-  state->EXT_texture_array_enable;
+  (state->is_version(130, 0) ||
+   state->EXT_texture_array_enable);
 }
 
 static bool
 fs_texture_array(const _mesa_glsl_parse_state *state)
 {
return state->stage == MESA_SHADER_FRAGMENT &&
-  state->EXT_texture_array_enable;
+  (state->is_version(130, 0) ||
+   state->EXT_texture_array_enable);
 }
 
 static bool
 texture_array(const _mesa_glsl_parse_state *state)
 {
-   return state->EXT_texture_array_enable;
+   return state->is_version(130, 0) || state->EXT_texture_array_enable;
 }
 
 static bool
-- 
2.17.1


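The availability predicate after this patch can be sketched as below. ParseState is a simplified, hypothetical stand-in for _mesa_glsl_parse_state; the sketch assumes that is_version(130, 0) means "desktop GLSL 1.30 or later, never GLSL ES", with 0 meaning "not available for that API".

```cpp
// Hypothetical, reduced version of the GLSL parse state.
struct ParseState {
    unsigned language_version;      // e.g. 120, 130, 300
    bool es_shader;                 // true for GLSL ES shaders
    bool EXT_texture_array_enable;  // set by "#extension ... : enable"

    // Assumed semantics: pick the requirement for the current API and
    // treat a requirement of 0 as "never available".
    bool is_version(unsigned required_glsl, unsigned required_glsl_es) const
    {
        const unsigned required = es_shader ? required_glsl_es : required_glsl;
        return required != 0 && language_version >= required;
    }
};

// Texture array built-ins are available either because the shader is
// GLSL 1.30+ (where they are core) or because the extension is enabled.
bool texture_array(const ParseState &state)
{
    return state.is_version(130, 0) || state.EXT_texture_array_enable;
}
```

Under these assumptions a "#version 130" shader gets the built-ins without an #extension line, which matches the case the commit message describes.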

[Mesa-dev] [PATCH] r600/sb: cleanup if_conversion iterator to be legal C++

2018-06-28 Thread Dave Airlie
From: Dave Airlie 

The current code causes:
/usr/include/c++/8/debug/safe_iterator.h:207:
Error: attempt to copy from a singular iterator.

This is due to the iterators getting invalidated, fix the
reverse iterator to use the return value from erase, and
cast it properly.

Cc: 
---
 src/gallium/drivers/r600/sb/sb_if_conversion.cpp | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp 
b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
index 3f6431b80f5..5556531f145 100644
--- a/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
+++ b/src/gallium/drivers/r600/sb/sb_if_conversion.cpp
@@ -42,16 +42,13 @@ int if_conversion::run() {
regions_vec &rv = sh.get_regions();
 
unsigned converted = 0;
-
-   for (regions_vec::reverse_iterator N, I = rv.rbegin(), E = rv.rend();
-   I != E; I = N) {
-   N = I; ++N;
-
+   for (regions_vec::reverse_iterator I = rv.rbegin(); I != rv.rend(); ) {
region_node *r = *I;
if (run_on(r)) {
-   rv.erase(I.base() - 1);
+   I = decltype(I){rv.erase(std::next(I).base())};
++converted;
-   }
+   } else
+   ++I;
}
return 0;
 }
-- 
2.17.1


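The erase-while-reverse-iterating idiom used by this patch can be tried on a plain std::vector. The sketch below is a generic C++ illustration with a hypothetical helper name, erase_reverse_if, not code from r600/sb. The key point is that std::next(it).base() is the forward iterator to the element *it refers to, and that wrapping the iterator returned by erase() in a reverse_iterator yields the next element in reverse order.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Erase every element matching `pred` while walking the vector from back
// to front. Returns the number of erased elements.
template <typename T, typename Pred>
std::size_t erase_reverse_if(std::vector<T> &v, Pred pred)
{
    const std::size_t before = v.size();
    std::size_t erased = 0;

    for (auto it = v.rbegin(); it != v.rend(); ) {
        if (pred(*it)) {
            // Forward iterator to the element `it` designates.
            auto fwd = v.erase(std::next(it).base());
            // erase() returns the element after the erased one; as a
            // reverse iterator that designates the element before it.
            it = decltype(it)(fwd);
            ++erased;
        } else {
            ++it;
        }
    }

    assert(v.size() + erased == before);
    return erased;
}
```

Unlike the old loop, no iterator computed before the erase() call is used afterwards, which is what the debug-mode libstdc++ check complained about.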

Re: [Mesa-dev] Radeonsi OpenGL 4.4 compat profile support

2018-06-28 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

on RX 580
with Plasma 5, UH, UV, Blender 2.79b, Krita 4.1 and glmark2.

Dieter

Am 28.06.2018 08:46, schrieb Timothy Arceri:

Sorry to keep spamming the list with this stuff, but Dave helped
out with ARB_vertex_attrib_64bit support and the spec bug I
submitted for indirect compute dispatch was resolved so it
seemed like a good idea to send it all out again together with
these updates.

Pretty much everything has corresponding piglit tests, but I've
also been testing with a few games and I'm now seeing games such
as Doom and Wolfenstein working on wine where previously the version
overrides were not enough to get them to work.

There has also been a report that proper compat support fixes
some issues with Dying Light.

Please review.




Re: [Mesa-dev] [PATCH 1/3] radeonsi: move VS_STATE_SGPR before draw SGPRs

2018-06-28 Thread Dieter Nützel

For the series:

Tested-by: Dieter Nützel 

on RX 580
with Plasma 5, UH, UV, Blender 2.79b, Krita 4.1 and glmark2.

Dieter

Am 27.06.2018 22:10, schrieb Marek Olšák:

From: Marek Olšák 

for vertex color clamping.
---
 src/gallium/drivers/radeonsi/si_shader.c | 14 +++---
 src/gallium/drivers/radeonsi/si_shader.h |  9 ++---
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c
b/src/gallium/drivers/radeonsi/si_shader.c
index c6b91ba5cf3..9bee8440027 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3442,21 +3442,21 @@ static void
si_set_ls_return_value_for_tcs(struct si_shader_context *ctx)
ret = si_insert_input_ptr(ctx, ret, ctx->param_rw_buffers,
  8 + SI_SGPR_RW_BUFFERS);
ret = si_insert_input_ptr(ctx, ret,
  ctx->param_bindless_samplers_and_images,
  8 + SI_SGPR_BINDLESS_SAMPLERS_AND_IMAGES);

ret = si_insert_input_ret(ctx, ret, ctx->param_vs_state_bits,
  8 + SI_SGPR_VS_STATE_BITS);

 #if !HAVE_32BIT_POINTERS
-   ret = si_insert_input_ptr(ctx, ret, ctx->param_vs_state_bits + 1,
+   ret = si_insert_input_ptr(ctx, ret, ctx->param_vs_state_bits + 4,
  8 + GFX9_SGPR_2ND_SAMPLERS_AND_IMAGES);
 #endif

ret = si_insert_input_ret(ctx, ret, ctx->param_tcs_offchip_layout,
  8 + GFX9_SGPR_TCS_OFFCHIP_LAYOUT);
ret = si_insert_input_ret(ctx, ret, ctx->param_tcs_out_lds_offsets,
  8 + GFX9_SGPR_TCS_OUT_OFFSETS);
ret = si_insert_input_ret(ctx, ret, ctx->param_tcs_out_lds_layout,
  8 + GFX9_SGPR_TCS_OUT_LAYOUT);

@@ -3482,21 +3482,21 @@ static void
si_set_es_return_value_for_gs(struct si_shader_context *ctx)
ret = si_insert_input_ret(ctx, ret, ctx->param_merged_wave_info, 3);
 	ret = si_insert_input_ret(ctx, ret, ctx->param_merged_scratch_offset, 
5);


ret = si_insert_input_ptr(ctx, ret, ctx->param_rw_buffers,
  8 + SI_SGPR_RW_BUFFERS);
ret = si_insert_input_ptr(ctx, ret,
  ctx->param_bindless_samplers_and_images,
  8 + SI_SGPR_BINDLESS_SAMPLERS_AND_IMAGES);

 #if !HAVE_32BIT_POINTERS
-   ret = si_insert_input_ptr(ctx, ret, ctx->param_vs_state_bits + 1,
+   ret = si_insert_input_ptr(ctx, ret, ctx->param_vs_state_bits + 4,
  8 + GFX9_SGPR_2ND_SAMPLERS_AND_IMAGES);
 #endif

unsigned vgpr;
if (ctx->type == PIPE_SHADER_VERTEX)
vgpr = 8 + GFX9_VSGS_NUM_USER_SGPR;
else
vgpr = 8 + GFX9_TESGS_NUM_USER_SGPR;

for (unsigned i = 0; i < 5; i++) {
@@ -4628,24 +4628,24 @@ static void
declare_global_desc_pointers(struct si_shader_context *ctx,
 {
ctx->param_rw_buffers = add_arg(fninfo, ARG_SGPR,
ac_array_in_const32_addr_space(ctx->v4i32));
ctx->param_bindless_samplers_and_images = add_arg(fninfo, ARG_SGPR,
ac_array_in_const32_addr_space(ctx->v8i32));
 }

 static void declare_vs_specific_input_sgprs(struct si_shader_context 
*ctx,

struct si_function_info *fninfo)
 {
+   ctx->param_vs_state_bits = add_arg(fninfo, ARG_SGPR, ctx->i32);
add_arg_assign(fninfo, ARG_SGPR, ctx->i32, &ctx->abi.base_vertex);
add_arg_assign(fninfo, ARG_SGPR, ctx->i32, &ctx->abi.start_instance);
add_arg_assign(fninfo, ARG_SGPR, ctx->i32, &ctx->abi.draw_id);
-   ctx->param_vs_state_bits = add_arg(fninfo, ARG_SGPR, ctx->i32);
 }

 static void declare_vs_input_vgprs(struct si_shader_context *ctx,
   struct si_function_info *fninfo,
   unsigned *num_prolog_vgprs)
 {
struct si_shader *shader = ctx->shader;

add_arg_assign(fninfo, ARG_VGPR, ctx->i32, &ctx->abi.vertex_id);
if (shader->key.as_ls) {
@@ -4859,27 +4859,26 @@ static void create_function(struct
si_shader_context *ctx)
add_arg(&fninfo, ARG_SGPR, ctx->i32); /* unused (SPI_SHADER_PGM_LO/HI_GS << 8) */
add_arg(&fninfo, ARG_SGPR, ctx->i32); /* unused (SPI_SHADER_PGM_LO/HI_GS >> 24) */

declare_global_desc_pointers(ctx, &fninfo);
declare_per_stage_desc_pointers(ctx, &fninfo,
(ctx->type == PIPE_SHADER_VERTEX ||
 ctx->type == PIPE_SHADER_TESS_EVAL));
if (ctx->type == PIPE_SHADER_VERTEX) {
declare_vs_specific_input_sgprs(ctx, &fninfo);
} else {
+   ctx->param_vs_state_bits = add_arg(&fninfo, ARG_SGPR, ctx->i32);
ctx->param_tcs_offchip_layout = add_arg(&fninfo, ARG_SGPR, ctx->i32);

 

Re: [Mesa-dev] [PATCH] radv: optimize vkCmd{Set, Reset}Event() a little bit

2018-06-28 Thread Dieter Nützel

Tested-by: Dieter Nützel 

on RX 580 with F1 2017.

Dieter

Am 28.06.2018 12:21, schrieb Samuel Pitoiset:

Always emitting a bottom-of-pipe event is quite dumb. Instead,
start to optimize these functions by syncing PFP for the
top-of-pipe and syncing ME for the post-index-fetch event.

This can still be improved by emitting EOS events for
syncing PS and CS stages.
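
The selection logic described above can be sketched on its own. This is a simplified C++ illustration, not the radv code: EventSync and pick_event_sync are hypothetical names, and the stage constants carry the numeric values of the corresponding VkPipelineStageFlagBits so the sketch compiles without the Vulkan headers.

```cpp
#include <cstdint>

// Numeric values of the matching VkPipelineStageFlagBits.
constexpr std::uint32_t STAGE_TOP_OF_PIPE     = 0x00000001;
constexpr std::uint32_t STAGE_DRAW_INDIRECT   = 0x00000002;
constexpr std::uint32_t STAGE_VERTEX_INPUT    = 0x00000004;
constexpr std::uint32_t STAGE_FRAGMENT_SHADER = 0x00000080;

// Hypothetical label for which mechanism writes the event value.
enum class EventSync { Pfp, Me, BottomOfPipe };

EventSync pick_event_sync(std::uint32_t stage_mask)
{
    // Stages that only need a top-of-pipe write.
    const std::uint32_t top_of_pipe = STAGE_TOP_OF_PIPE;
    // Stages that are done once indices and indirect data are fetched.
    const std::uint32_t post_index_fetch =
        top_of_pipe | STAGE_DRAW_INDIRECT | STAGE_VERTEX_INPUT;

    if (!(stage_mask & ~top_of_pipe))
        return EventSync::Pfp;           // WRITE_DATA from the PFP engine
    if (!(stage_mask & ~post_index_fetch))
        return EventSync::Me;            // WRITE_DATA from the ME engine
    return EventSync::BottomOfPipe;      // fall back to an EOP event
}
```

Any stage bit outside the two small sets, such as a fragment shader stage, still takes the bottom-of-pipe path, which is the part the commit message says could later be refined with EOS events.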

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 46 ++--
 1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
b/src/amd/vulkan/radv_cmd_buffer.c

index 074e9c4c7f..17385aace1 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -4275,14 +4275,44 @@ static void write_event(struct radv_cmd_buffer
*cmd_buffer,

MAYBE_UNUSED unsigned cdw_max =
radeon_check_space(cmd_buffer->device->ws, cs, 18);

-   /* TODO: this is overkill. Probably should figure something out from
-* the stage mask. */
-
-   si_cs_emit_write_event_eop(cs,
-  
cmd_buffer->device->physical_device->rad_info.chip_class,
-  radv_cmd_buffer_uses_mec(cmd_buffer),
-  V_028A90_BOTTOM_OF_PIPE_TS, 0,
-  EOP_DATA_SEL_VALUE_32BIT, va, 2, value);
+   /* Flags that only require a top-of-pipe event. */
+   static const VkPipelineStageFlags top_of_pipe_flags =
+   VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
+
+   /* Flags that only require a post-index-fetch event. */
+   static const VkPipelineStageFlags post_index_fetch_flags =
+   top_of_pipe_flags |
+   VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
+   VK_PIPELINE_STAGE_VERTEX_INPUT_BIT;
+
+   /* TODO: Emit EOS events for syncing PS/CS stages. */
+
+   if (!(stageMask & ~top_of_pipe_flags)) {
+   /* Just need to sync the PFP engine. */
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_PFP));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, value);
+   } else if (!(stageMask & ~post_index_fetch_flags)) {
+   /* Sync ME because PFP reads index and indirect buffers. */
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_ME));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, value);
+   } else {
+   /* Otherwise, sync all prior GPU work using an EOP event. */
+   si_cs_emit_write_event_eop(cs,
+  
cmd_buffer->device->physical_device->rad_info.chip_class,
+  radv_cmd_buffer_uses_mec(cmd_buffer),
+  V_028A90_BOTTOM_OF_PIPE_TS, 0,
+  EOP_DATA_SEL_VALUE_32BIT, va, 2, 
value);
+   }

assert(cmd_buffer->cs->cdw <= cdw_max);
 }
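
The stage-mask checks in the patch above rely on a subset test: `!(mask & ~allowed)` is true only when every bit set in `mask` is also set in `allowed`. The following is a minimal standalone sketch of that classification, not the radv code itself. The `STAGE_*` constants, `enum event_sync` and `classify_event_sync()` are placeholder names invented for illustration; they are not the Vulkan or radv definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VK_PIPELINE_STAGE_* bits. */
enum {
   STAGE_TOP_OF_PIPE   = 1u << 0,
   STAGE_DRAW_INDIRECT = 1u << 1,
   STAGE_VERTEX_INPUT  = 1u << 2,
   STAGE_FRAGMENT      = 1u << 3,
   STAGE_COMPUTE       = 1u << 4
};

/* Which kind of synchronization the event write would need. */
enum event_sync { SYNC_PFP, SYNC_ME, SYNC_EOP };

/* True when every bit of 'mask' is also present in 'allowed'. */
static bool is_subset(uint32_t mask, uint32_t allowed)
{
   return (mask & ~allowed) == 0;
}

static enum event_sync classify_event_sync(uint32_t stage_mask)
{
   const uint32_t top_of_pipe = STAGE_TOP_OF_PIPE;
   const uint32_t post_index_fetch =
      top_of_pipe | STAGE_DRAW_INDIRECT | STAGE_VERTEX_INPUT;

   /* The order of the checks relies on the cheaper set being contained
    * in the more expensive one. */
   assert(is_subset(top_of_pipe, post_index_fetch));

   if (is_subset(stage_mask, top_of_pipe))
      return SYNC_PFP;   /* only top-of-pipe requested */
   if (is_subset(stage_mask, post_index_fetch))
      return SYNC_ME;    /* nothing later than index/indirect fetch */
   return SYNC_EOP;      /* fall back to a bottom-of-pipe event */
}
```

Note that with this test an empty mask falls into the first (cheapest) case, just as `!(stageMask & ~top_of_pipe_flags)` does in the patch.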

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Plumb invariant output attrib thru TGSI

2018-06-28 Thread Dave Airlie
On 29 June 2018 at 11:03, Robert Tarasov  wrote:
> ping for push. it's not pushed yet.

Oops I thought I'd done it.

will push it now.

Dave.


Re: [Mesa-dev] [PATCH] Plumb invariant output attrib thru TGSI

2018-06-28 Thread Robert Tarasov
ping for push. it's not pushed yet.

On Wed, Jun 20, 2018 at 5:55 PM, Robert Tarasov 
wrote:

> From: "Joe M. Kniss" 
>
> Add support for glsl 'invariant' modifier for output data declarations.
> Gallium drivers that use TGSI serialization currently lose invariant
> modifiers in glsl shaders.
>
> v2: use boolean for invariant instead of unsigned.
>
> Change-Id: Ieac8639116def45233513b6867a847cf7fda2f55
> Tested: chromiumos on qemu with virglrenderer.
> ---
>  src/gallium/auxiliary/tgsi/tgsi_strings.c  |  2 ++
>  src/gallium/auxiliary/tgsi/tgsi_strings.h  |  2 ++
>  src/gallium/auxiliary/tgsi/tgsi_text.c | 18 ++
>  src/gallium/auxiliary/tgsi/tgsi_ureg.c | 28 +++---
>  src/gallium/auxiliary/tgsi/tgsi_ureg.h |  4 +++-
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  8 +--
>  6 files changed, 46 insertions(+), 16 deletions(-)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c
> b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> index 4f28b49ce8a..434871273f2 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> @@ -185,6 +185,8 @@ const char 
> *tgsi_interpolate_locations[TGSI_INTERPOLATE_LOC_COUNT]
> =
> "SAMPLE",
>  };
>
> +const char *tgsi_invariant_name = "INVARIANT";
> +
>  const char *tgsi_primitive_names[PIPE_PRIM_MAX] =
>  {
> "POINTS",
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.h
> b/src/gallium/auxiliary/tgsi/tgsi_strings.h
> index bb2d3458dde..20e3f7127f6 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.h
> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.h
> @@ -52,6 +52,8 @@ extern const char *tgsi_interpolate_names[TGSI_
> INTERPOLATE_COUNT];
>
>  extern const char *tgsi_interpolate_locations[
> TGSI_INTERPOLATE_LOC_COUNT];
>
> +extern const char *tgsi_invariant_name;
> +
>  extern const char *tgsi_primitive_names[PIPE_PRIM_MAX];
>
>  extern const char *tgsi_fs_coord_origin_names[2];
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c
> b/src/gallium/auxiliary/tgsi/tgsi_text.c
> index 02241a66bfe..815b1ee65db 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_text.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
> @@ -1586,10 +1586,6 @@ static boolean parse_declaration( struct
> translate_ctx *ctx )
>  break;
>   }
>}
> -  if (i == TGSI_INTERPOLATE_COUNT) {
> - report_error( ctx, "Expected semantic or interpolate attribute"
> );
> - return FALSE;
> -  }
> }
>
> cur = ctx->cur;
> @@ -1609,6 +1605,20 @@ static boolean parse_declaration( struct
> translate_ctx *ctx )
>}
> }
>
> +   cur = ctx->cur;
> +   eat_opt_white( &cur );
> +   if (*cur == ',' && !is_vs_input) {
> +  cur++;
> +  eat_opt_white( &cur );
> +  if (str_match_nocase_whole( &cur, tgsi_invariant_name )) {
> + decl.Declaration.Invariant = 1;
> + ctx->cur = cur;
> +  } else {
> + report_error( ctx, "Expected semantic, interpolate attribute, or
> invariant ");
> + return FALSE;
> +  }
> +   }
> +
> advance = tgsi_build_full_declaration(
>,
>ctx->tokens_cur,
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> index 7d2b9af140d..f1bebe1e155 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
> @@ -140,6 +140,7 @@ struct ureg_program
>unsigned first;
>unsigned last;
>unsigned array_id;
> +  boolean invariant;
> } output[UREG_MAX_OUTPUT];
> unsigned nr_outputs, nr_output_regs;
>
> @@ -427,7 +428,8 @@ ureg_DECL_output_layout(struct ureg_program *ureg,
>  unsigned index,
>  unsigned usage_mask,
>  unsigned array_id,
> -unsigned array_size)
> +unsigned array_size,
> +boolean invariant)
>  {
> unsigned i;
>
> @@ -455,6 +457,7 @@ ureg_DECL_output_layout(struct ureg_program *ureg,
>ureg->output[i].first = index;
>ureg->output[i].last = index + array_size - 1;
>ureg->output[i].array_id = array_id;
> +  ureg->output[i].invariant = invariant;
>ureg->nr_output_regs = MAX2(ureg->nr_output_regs, index +
> array_size);
>ureg->nr_outputs++;
> }
> @@ -480,7 +483,8 @@ ureg_DECL_output_masked(struct ureg_program *ureg,
>  unsigned array_size)
>  {
> return ureg_DECL_output_layout(ureg, name, index, 0,
> -  ureg->nr_output_regs, usage_mask,
> array_id, array_size);
> +  ureg->nr_output_regs, usage_mask,
> array_id,
> +  array_size, FALSE);
>  }
>
>
> @@ -1512,7 +1516,8 @@ emit_decl_semantic(struct ureg_program *ureg,
> unsigned semantic_index,
> unsigned streams,
> unsigned usage_mask,
> -   

Re: [Mesa-dev] [PATCH 03/10] glsl: don't let an 'if' then-branch kill copy propagation for else-branch

2018-06-28 Thread Caio Marcelo de Oliveira Filho
Hi,

> > The hurt instruction count is caused because the extra propagation
> > causes an input variable to be read from two branches of an
> > if (load_input intrinsic in NIR). Depending on the complexity of each
> > branch this might be a win or not in terms of cycles.
> 
> I just sent out a patch (nir/opt_peephole_select: Don't try to remove
> flow control around indirect loads) that deals with a similar sort of
> thing.  Were the cases you observed also indirect loads?

Thanks for this comment. The cases in question look like:

  (declare (location=... shader_in ) vec4 )

  ...

  (declare () vec4 )
  (assign  (xyzw) (var_ref )  (var_ref ) ) 

Later assignments using '' get '' instead.


> Maybe we
> should add those to the class of things that don't get copy propagated

I thought of that, but my conclusion was that the propagation can
actually give us wins if the uses are both inside nested branches in
the two toplevel branches. I might be wrong, but the resulting NIR
also looks "resolvable", so the outcome here might be to find the right
pass in NIR to improve.


Thanks,
Caio


Re: [Mesa-dev] [PATCH 11/12] nir: Add partial redundancy elimination for compares

2018-06-28 Thread Caio Marcelo de Oliveira Filho
Hi,

On Wed, Jun 27, 2018 at 09:46:24PM -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> This pass attempts to detect code sequences like
> 
> if (x < y) {
> z = y - z;

Typo "z = x - y".

 
> Currently only floating point compares and adds are supported.  Adding
> support for integer will be a challenge due to integer overflow.  There
> are a couple possible solutions, but they may not apply to all
> architectures.

Optional: consider mentioning this in the initial comment block.
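
To make the float-versus-integer point concrete, here is a small sketch of the identity the pass relies on (`A cmp B <=> 0 cmp B + -A`) and of why it does not carry over to wrapping integers. It assumes IEEE 754 floats without flush-to-zero, and it only illustrates the arithmetic; it is not the NIR pass. All function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original form: a < b. */
static bool flt_direct(float a, float b)
{
   return a < b;
}

/* Rewritten form with zero on the left: 0 < b + (-a). */
static bool flt_zero_on_left(float a, float b)
{
   return 0.0f < b + (-a);
}

/* Rewritten form with zero on the right: a + (-b) < 0. */
static bool flt_zero_on_right(float a, float b)
{
   return a + (-b) < 0.0f;
}

/* The same comparison on unsigned integers, where subtraction wraps. */
static bool ult_direct(uint32_t a, uint32_t b)
{
   return a < b;
}

static bool ult_zero_on_left(uint32_t a, uint32_t b)
{
   return 0u < b - a;
}

static void check_examples(void)
{
   /* For these float inputs both rewritten forms agree with a < b. */
   assert(flt_direct(1.0f, 2.0f) == flt_zero_on_left(1.0f, 2.0f));
   assert(flt_direct(2.0f, 1.0f) == flt_zero_on_right(2.0f, 1.0f));

   /* 5 < 3 is false, but 3 - 5 wraps to a large positive value, so the
    * rewritten integer compare gives the opposite answer. */
   assert(ult_direct(5u, 3u) != ult_zero_on_left(5u, 3u));
}
```

The last assertion is the kind of overflow case that the commit message refers to when it says integer support will be a challenge.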



> diff --git a/src/compiler/nir/nir_instr_set.c 
> b/src/compiler/nir/nir_instr_set.c
> index 1a491f46ff4..e9371af230a 100644
> --- a/src/compiler/nir/nir_instr_set.c
> +++ b/src/compiler/nir/nir_instr_set.c
> @@ -239,7 +239,8 @@ get_neg_instr(const nir_src *s)
>  {
> const struct nir_alu_instr *const alu = nir_src_as_alu_instr_const(s);
>  
> -   return alu->op == nir_op_fneg || alu->op == nir_op_ineg ? alu : NULL;
> +   return alu != NULL && (alu->op == nir_op_fneg || alu->op == nir_op_ineg)
> +  ? alu : NULL;
>  }

Squash this chunk into the previous patch that adds this function.


> +static struct block_instructions *
> +push_block(struct block_queue *bq)
> +{
> +   struct block_instructions *bi =
> +  (struct block_instructions *) exec_list_pop_head(&bq->reusable_blocks);
> +
> +   if (bi == NULL) {
> +  bi = calloc(1, sizeof(struct block_instructions));
> +
> +  if (bi == NULL)
> + return NULL;

Callsites use bi->instructions without checking, so is this check here
useful?


> +static void
> +rewrite_compare_instruction(nir_builder *bld, nir_alu_instr *orig_cmp,
> +nir_alu_instr *orig_add, bool zero_on_left)
> +{
> +   void *const mem_ctx = ralloc_parent(orig_cmp);
> +
> +   bld->cursor = nir_before_instr(&orig_cmp->instr);
> +
> +   /* This is somewhat tricky.  The compare instruction may be something like
> +* (fcmp, a, b) while the add instruction is something like (fadd, 
> fneg(a),
> +* b).  This is problematic because the SSA value for the fneg(a) may not
> +* exist yet at the compare instruction.
> +*
> +* We fabricate the operands of the new add.  This is done using
> +* information provided by zero_on_left.  If zero_on_left is true, we know
> +* the resulting compare instruction is (fcmp, 0.0, (fadd, x, y)).  If the
> +* original compare instruction was (fcmp, a, b), x = b and y = -a.  If
> +* zero_on_left is false, the resulting compare instruction is (fcmp,
> +* (fadd, x, y), 0.0) and x = a and y = -b.
> +*/
> +   nir_ssa_def *const a = nir_ssa_for_alu_src(bld, orig_cmp, 0);
> +   nir_ssa_def *const b = nir_ssa_for_alu_src(bld, orig_cmp, 1);
> +
> +   nir_ssa_def *const fadd = zero_on_left
> +  ? nir_fadd(bld, b, nir_fneg(bld, a))
> +  : nir_fadd(bld, a, nir_fneg(bld, b));
> +
> +   nir_ssa_def *const zero =
> +  nir_imm_floatN_t(bld, 0.0, orig_add->dest.dest.ssa.bit_size);
> +
> +   nir_ssa_def *const cmp = zero_on_left
> +  ? nir_build_alu(bld, orig_cmp->op, zero, fadd, NULL, NULL)
> +  : nir_build_alu(bld, orig_cmp->op, fadd, zero, NULL, NULL);
> +
> +   /* Generating extra moves of the results is the easy way to make sure the
> +* writemasks match the original instructions.  Later optimization passes
> +* will clean these up.
> +*/

Why isn't it sufficient to set the write_mask for the instructions
that were just created?



> +   /* The operands of both instructions are, with some liberty,
> +* commutative.  Check all three permutations.  The third
> +* permutaiton is a negation of the original operation, so it

Typo "permutation".


> +   } else if (nir_alu_srcs_equal(cmp, alu, 1, 0) &&
> +  nir_alu_srcs_negative_equal(cmp, alu, 0, 1)) {
> +  /* This is the case where (A cmp B) matches (B + -A).
> +   *
> +   *A cmp B <=> 0 cmp B + -A
> +   */

Shouldn't (-A + B) be handled here too? (If so, we have four valid
permutations).


> +  rewrite_compare_instruction(bld, cmp, alu, true);
> +
> +  *a = NULL;
> +  rewrote_compare = true;
> +  break;
> +   }
> +}
> +
> +/* Bail after a compare in the most dominating block is found.
> + * This is necessary because 'alu' has been remove from the

Typo "removed".



Thanks,
Caio


Re: [Mesa-dev] [PATCH] nv50/ir: Improve Maintainability of Target*::initOpInfo()

2018-06-28 Thread Karol Herbst
On Sat, Jun 16, 2018 at 12:26 PM, Rhys Perry  wrote:
> This is mainly useful for when one needs to add new opcodes in a painless
> and reliable way.
>
> Signed-off-by: Rhys Perry 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_target_nv50.cpp | 21 
> -
>  .../drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 20 +++-
>  2 files changed, 23 insertions(+), 18 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> index 83b4102b0a..c4073000aa 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> @@ -111,16 +111,15 @@ void TargetNV50::initOpInfo()
>  {
> unsigned int i, j;
>
> -   static const uint32_t commutative[(OP_LAST + 31) / 32] =
> +   static const uint32_t commutativeList[] =

please change the type to operation

> {
> -  // ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN, SET_AND, SET_OR, 
> SET_XOR,
> -  // SET, SELP, SLCT
> -  0x0ce0ca00, 0x007e, 0x, 0x
> +  OP_ADD, OP_MUL, OP_MAD, OP_FMA, OP_AND, OP_OR, OP_XOR, OP_MAX, OP_MIN,
> +  OP_SET_AND, OP_SET_OR, OP_SET_XOR, OP_SET, OP_SELP, OP_SLCT
> };
> -   static const uint32_t shortForm[(OP_LAST + 31) / 32] =
> +   static const uint32_t shortFormList[] =

please change the type to operation

> {
> -  // MOV, ADD, SUB, MUL, MAD, SAD, RCP, L/PINTERP, TEX, TXF
> -  0x00014e40, 0x0080, 0x1260, 0x
> +  OP_MOV, OP_ADD, OP_SUB, OP_MUL, OP_MAD, OP_SAD, OP_RCP, OP_LINTERP,
> +  OP_PINTERP, OP_TEX, OP_TXF
> };
> static const operation noDestList[] =
> {
> @@ -157,12 +156,16 @@ void TargetNV50::initOpInfo()
>
>opInfo[i].hasDest = 1;
>opInfo[i].vector = (i >= OP_TEX && i <= OP_TEXCSAA);
> -  opInfo[i].commutative = (commutative[i / 32] >> (i % 32)) & 1;
> +  opInfo[i].commutative = false;
>opInfo[i].pseudo = (i < OP_MOV);
>opInfo[i].predicate = !opInfo[i].pseudo;
>opInfo[i].flow = (i >= OP_BRA && i <= OP_JOIN);
> -  opInfo[i].minEncSize = (shortForm[i / 32] & (1 << (i % 32))) ? 4 : 8;
> +  opInfo[i].minEncSize = 8;
> }
> +   for (i = 0; i < sizeof(commutativeList) / sizeof(commutativeList[0]); ++i)
> +  opInfo[commutativeList[i]].commutative = true;
> +   for (i = 0; i < sizeof(shortFormList) / sizeof(shortFormList[0]); ++i)
> +  opInfo[shortFormList[i]].minEncSize = 4;

I think we should use range-based for loops:

for (const operation &op : commutativeList)
   opInfo[op].commutative = true;
for (const operation &op : shortFormList)
   opInfo[op].minEncSize = 4;

It depends on C++11, but I think it is time to move forward :) (it
might require some changes to build system files to require C++11 in
Nouveau.)

Anyway it would make code like this so much cleaner and easier to read :)

The code is fine as it is, but I would prefer to use C++11 in such
cases, because it makes sense.
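
For readers who have not seen the pattern, the list-based initialization under discussion can be sketched in plain C as below. This is only an illustration under stated assumptions: the tiny `enum operation`, `struct op_info`, `op_info[]` and `init_op_info()` are hypothetical stand-ins, not the nv50_ir definitions, and the real code is C++ (where the range-based for loops suggested above would replace the index loops).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode enum; the real nv50_ir list is much longer. */
enum operation { OP_NOP, OP_MOV, OP_ADD, OP_SUB, OP_MUL, OP_AND, OP_LAST };

struct op_info {
   bool commutative;
   unsigned min_enc_size;
};

static struct op_info op_info[OP_LAST];

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static void init_op_info(void)
{
   /* Listing opcodes by name stays correct when the enum is reordered or
    * extended, unlike a hand-computed bitmask indexed by opcode value. */
   static const enum operation commutative_list[] = {
      OP_ADD, OP_MUL, OP_AND
   };
   static const enum operation short_form_list[] = {
      OP_MOV, OP_ADD, OP_SUB, OP_MUL
   };
   size_t i;

   /* Defaults for every opcode. */
   for (i = 0; i < OP_LAST; ++i) {
      op_info[i].commutative = false;
      op_info[i].min_enc_size = 8;
   }

   /* Overrides for the listed opcodes. */
   for (i = 0; i < ARRAY_SIZE(commutative_list); ++i) {
      assert(commutative_list[i] < OP_LAST);
      op_info[commutative_list[i]].commutative = true;
   }
   for (i = 0; i < ARRAY_SIZE(short_form_list); ++i) {
      assert(short_form_list[i] < OP_LAST);
      op_info[short_form_list[i]].min_enc_size = 4;
   }
}
```

Declaring the lists with the enum type rather than `uint32_t` lets the compiler catch a stray non-opcode value, which is the point of the "change the type to operation" comments above.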

> for (i = 0; i < sizeof(noDestList) / sizeof(noDestList[0]); ++i)
>opInfo[noDestList[i]].hasDest = 0;
> for (i = 0; i < sizeof(noPredList) / sizeof(noPredList[0]); ++i)
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index 954aec0a2f..cc1efb4e71 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -191,17 +191,15 @@ void TargetNVC0::initOpInfo()
>  {
> unsigned int i, j;
>
> -   static const uint32_t commutative[(OP_LAST + 31) / 32] =
> +   static const operation commutative[] =
> {
> -  // ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN, SET_AND, SET_OR, 
> SET_XOR,
> -  // SET, SELP, SLCT
> -  0x0ce0ca00, 0x007e, 0x, 0x
> +  OP_ADD, OP_MUL, OP_MAD, OP_FMA, OP_AND, OP_OR, OP_XOR, OP_MAX, OP_MIN,
> +  OP_SET_AND, OP_SET_OR, OP_SET_XOR, OP_SET, OP_SELP, OP_SLCT
> };
>
> -   static const uint32_t shortForm[(OP_LAST + 31) / 32] =
> +   static const operation shortForm[] =
> {
> -  // ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN
> -  0x0ce0ca00, 0x, 0x, 0x
> +  OP_ADD, OP_MUL, OP_MAD, OP_FMA, OP_AND, OP_OR, OP_XOR, OP_MAX, OP_MIN
> };
>
> static const operation noDest[] =
> @@ -240,12 +238,16 @@ void TargetNVC0::initOpInfo()
>
>opInfo[i].hasDest = 1;
>opInfo[i].vector = (i >= OP_TEX && i <= OP_TEXCSAA);
> -  opInfo[i].commutative = (commutative[i / 32] >> (i % 32)) & 1;
> +  opInfo[i].commutative = false;
>opInfo[i].pseudo = (i < OP_MOV);
>opInfo[i].predicate = !opInfo[i].pseudo;
>opInfo[i].flow = (i >= OP_BRA && i <= OP_JOIN);
> -  opInfo[i].minEncSize = (shortForm[i / 32] & (1 << (i % 32))) ? 4 : 8;
> +  opInfo[i].minEncSize = 8;
> }
> +   for (i = 0; i < 

Re: [Mesa-dev] [PATCH 4/8] radv: add all dependencies from external to the first subpass

2018-06-28 Thread Fredrik Höglund
On Wednesday 27 June 2018, Samuel Pitoiset wrote:
> 
> On 06/27/2018 02:12 AM, Bas Nieuwenhuizen wrote:
> > Reviewed-by: Bas Nieuwenhuizen 
> > 
> > for patch 3-4. Not sure they should go to stable though, since they
> > are optimizations?
> 
> Isn't the whole series for optimization purposes?

No, patches 2 and 7 are bugfixes. Patch 7 will cause additional flushes
in the meta code paths without patch 6, so that patch is also marked
for stable. Patch 8 is an optimization of patch 7 that I decided to keep
separate for bisectability.

That being said, these patches don't fix any issues in any real applications
as far as I'm aware.

Patches 1, 3 and 4 are just optimizations, so those should indeed not go
to stable.

> > 
> > On Tue, Jun 26, 2018 at 11:49 PM, Fredrik Höglund  wrote:
> >> This is to avoid repeating dependencies when more than one subpass
> >> has a dependency from external.
> >>
> >> Cc: 
> >> Signed-off-by: Fredrik Höglund 
> >> ---
> >>   src/amd/vulkan/radv_pass.c | 4 
> >>   1 file changed, 4 insertions(+)
> >>
> >> diff --git a/src/amd/vulkan/radv_pass.c b/src/amd/vulkan/radv_pass.c
> >> index 2827f5f1a8d..7e6fd84af55 100644
> >> --- a/src/amd/vulkan/radv_pass.c
> >> +++ b/src/amd/vulkan/radv_pass.c
> >> @@ -179,6 +179,10 @@ VkResult radv_CreateRenderPass(
> >>  if (src == dst)
> >>  continue;
> >>
> >> +   if (src == VK_SUBPASS_EXTERNAL) {
> >> +   /* Add all dependencies from external to the first 
> >> subpass */
> >> +   dst = 0;
> >> +   }
> >>  if (dst == VK_SUBPASS_EXTERNAL) {
> >>  if (pCreateInfo->pDependencies[i].dstStageMask != 
> >> VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
> >>  pass->end_barrier.src_stage_mask |= 
> >> pCreateInfo->pDependencies[i].srcStageMask;
> >> --
> >> 2.17.0
> >>


Re: [Mesa-dev] [PATCH 07/12] intel/compiler: More peephole select

2018-06-28 Thread Caio Marcelo de Oliveira Filho
Hi,

> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index 67c062d91f5..6a0d4090fa7 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -557,7 +557,22 @@ brw_nir_optimize(nir_shader *nir, const struct 
> brw_compiler *compiler,
>OPT(nir_copy_prop);
>OPT(nir_opt_dce);
>OPT(nir_opt_cse);
> +
> +  /* Passing 0 to the peephole select pass causes it to convert
> +   * if-statements that contain only move instructions in the branches
> +   * regardless of the count.
> +   *
> +   * Passing 0 to the peephole select pass causes it to convert

Typo "Passing 1".


> +   * if-statements that contain at most a single ALU instruction (total)
> +   * in both branches.  Before Gen6, some math instructions were
> +   * prohibitively expensive and the results of compare operations need 
> an
> +   * extra resolve step.  For these reasons, this pass is more harmful
> +   * than good on those platforms.
> +   */
>OPT(nir_opt_peephole_select, 0);
> +  if (compiler->devinfo->gen >= 6)
> + OPT(nir_opt_peephole_select, 1);

It is not clear to me why the pass is run twice (with 0 and then 1)
instead of using gen >= 6 to select either 0 or 1, or running both
passes with 1 if gen >= 6 (since 1 covers 0).

I do understand that the second execution can optimize more cases, since
blocks get simplified in the first execution, but I was expecting it to
be sufficient to wait for the next iteration of the main
brw_nir_optimize loop.


Thanks,
Caio


Re: [Mesa-dev] [PATCH 1/8] radv: don't flush src stages when dstStageMask == BOTTOM_OF_PIPE

2018-06-28 Thread Fredrik Höglund
On Wednesday 27 June 2018, Bas Nieuwenhuizen wrote:
> Don't we still need this when having layout transitions?

I think the answer is probably yes, though I haven't noticed any
regressions with this patch.

I'll post an updated version that takes transitions into account.
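
As a rough sketch of what such a check could look like (this is an assumption about the shape of the follow-up, not the actual updated patch): skip the source-stage flush only when the destination mask is exactly BOTTOM_OF_PIPE and no image layout transition is involved. The `STAGE_*` values and `needs_src_stage_flush()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VK_PIPELINE_STAGE_* bits. */
#define STAGE_TOP_OF_PIPE    (1u << 0)
#define STAGE_FRAGMENT       (1u << 1)
#define STAGE_BOTTOM_OF_PIPE (1u << 2)

/* Assumption: a layout transition does work of its own, so the source
 * stages still have to be flushed before it, whatever the destination
 * stage mask says. */
static bool needs_src_stage_flush(uint32_t dst_stage_mask,
                                  bool has_layout_transition)
{
   /* Sketch-only precondition: callers pass a non-empty mask. */
   assert(dst_stage_mask != 0);

   if (has_layout_transition)
      return true;

   /* A dependency whose destination is only BOTTOM_OF_PIPE does not
    * delay subsequent commands, so nothing needs to wait. */
   return dst_stage_mask != STAGE_BOTTOM_OF_PIPE;
}
```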

> On Tue, Jun 26, 2018 at 11:49 PM, Fredrik Höglund  wrote:
> > The Vulkan specification says:
> >
> >"An execution dependency with only VK_PIPELINE_STAGE_BOTTOM_OF_-
> > PIPE_BIT in the destination stage mask [...] does not delay
> > processing of subsequent commands."
> >
> > Cc: 
> > Signed-off-by: Fredrik Höglund 
> > ---
> >  src/amd/vulkan/radv_cmd_buffer.c | 3 ++-
> >  src/amd/vulkan/radv_pass.c   | 6 --
> >  2 files changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/amd/vulkan/radv_cmd_buffer.c 
> > b/src/amd/vulkan/radv_cmd_buffer.c
> > index 110a9a960a9..5bfcba28d83 100644
> > --- a/src/amd/vulkan/radv_cmd_buffer.c
> > +++ b/src/amd/vulkan/radv_cmd_buffer.c
> > @@ -4197,7 +4197,8 @@ void radv_CmdPipelineBarrier(
> > image);
> > }
> >
> > -   radv_stage_flush(cmd_buffer, srcStageMask);
> > +   if (destStageMask != VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
> > +   radv_stage_flush(cmd_buffer, srcStageMask);
> > cmd_buffer->state.flush_bits |= src_flush_bits;
> >
> > for (uint32_t i = 0; i < imageMemoryBarrierCount; i++) {
> > diff --git a/src/amd/vulkan/radv_pass.c b/src/amd/vulkan/radv_pass.c
> > index 15fee444cdc..7a0dca09496 100644
> > --- a/src/amd/vulkan/radv_pass.c
> > +++ b/src/amd/vulkan/radv_pass.c
> > @@ -174,11 +174,13 @@ VkResult radv_CreateRenderPass(
> > for (unsigned i = 0; i < pCreateInfo->dependencyCount; ++i) {
> > uint32_t dst = pCreateInfo->pDependencies[i].dstSubpass;
> > if (dst == VK_SUBPASS_EXTERNAL) {
> > -   pass->end_barrier.src_stage_mask = 
> > pCreateInfo->pDependencies[i].srcStageMask;
> > +   if (pCreateInfo->pDependencies[i].dstStageMask != 
> > VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
> > +   pass->end_barrier.src_stage_mask = 
> > pCreateInfo->pDependencies[i].srcStageMask;
> > pass->end_barrier.src_access_mask = 
> > pCreateInfo->pDependencies[i].srcAccessMask;
> > pass->end_barrier.dst_access_mask = 
> > pCreateInfo->pDependencies[i].dstAccessMask;
> > } else {
> > -   pass->subpasses[dst].start_barrier.src_stage_mask = 
> > pCreateInfo->pDependencies[i].srcStageMask;
> > +   if (pCreateInfo->pDependencies[i].dstStageMask != 
> > VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT)
> > +   
> > pass->subpasses[dst].start_barrier.src_stage_mask = 
> > pCreateInfo->pDependencies[i].srcStageMask;
> > pass->subpasses[dst].start_barrier.src_access_mask 
> > = pCreateInfo->pDependencies[i].srcAccessMask;
> > pass->subpasses[dst].start_barrier.dst_access_mask 
> > = pCreateInfo->pDependencies[i].dstAccessMask;
> > }
> > --
> > 2.17.0
> >


[Mesa-dev] [PATCH] src/egl/Makefile: fix build race

2018-06-28 Thread Ross Burton
There is a parallel make build issue in src/egl/drivers/dri2/
for wayland builds. Can be reproduced with:

$ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
$ make -C src/egl/ drivers/dri2/platform_wayland.lo
../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal 
error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory

This patch adds the missing dependency.
---
 src/egl/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index be3547d968..1a2273b8c3 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -80,6 +80,7 @@ drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h: 
$(WL_DMABUF_XML)
 if HAVE_PLATFORM_WAYLAND
 drivers/dri2/linux-dmabuf-unstable-v1-protocol.lo: 
drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
 drivers/dri2/egl_dri2.lo: 
drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
+drivers/dri2/platform_wayland.lo: 
drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h
 
 AM_CFLAGS += $(WAYLAND_CLIENT_CFLAGS)
 libEGL_common_la_LIBADD += $(WAYLAND_CLIENT_LIBS)
-- 
2.11.0



[Mesa-dev] [PATCH 1/2] mesa: MESA_framebuffer_flip_y extension [v2]

2018-06-28 Thread Fritz Koenig
Adds an extension to glFramebufferParameteri
that will specify if the framebuffer is vertically
flipped. Historically system framebuffers are
vertically flipped and user framebuffers are not.
Checking to see the state was done by looking at
the name field.  This adds an explicit field.

v2:
* updated spec language [for chadv]
* correctly specifying ES 3.1 [for chadv]
* refactor access to rb->Name [for jason]
* handle GetFramebufferParameteriv [for chadv]
---
 docs/specs/MESA_framebuffer_flip_y.spec| 84 ++
 include/GLES2/gl2ext.h |  5 ++
 src/mapi/glapi/registry/gl.xml |  6 ++
 src/mesa/drivers/dri/i915/intel_fbo.c  |  7 +-
 src/mesa/drivers/dri/i965/intel_fbo.c  |  7 +-
 src/mesa/drivers/dri/nouveau/nouveau_fbo.c |  7 +-
 src/mesa/drivers/dri/radeon/radeon_fbo.c   |  7 +-
 src/mesa/drivers/dri/radeon/radeon_span.c  |  9 ++-
 src/mesa/drivers/dri/swrast/swrast.c   |  7 +-
 src/mesa/drivers/osmesa/osmesa.c   |  5 +-
 src/mesa/drivers/x11/xm_buffer.c   |  3 +-
 src/mesa/drivers/x11/xmesaP.h  |  3 +-
 src/mesa/main/accum.c  | 17 +++--
 src/mesa/main/dd.h |  3 +-
 src/mesa/main/extensions_table.h   |  1 +
 src/mesa/main/fbobject.c   | 18 -
 src/mesa/main/framebuffer.c|  1 +
 src/mesa/main/glheader.h   |  3 +
 src/mesa/main/mtypes.h |  3 +
 src/mesa/main/readpix.c| 20 +++---
 src/mesa/state_tracker/st_cb_fbo.c |  7 +-
 src/mesa/swrast/s_blit.c   | 17 +++--
 src/mesa/swrast/s_clear.c  |  3 +-
 src/mesa/swrast/s_copypix.c| 11 +--
 src/mesa/swrast/s_depth.c  |  6 +-
 src/mesa/swrast/s_drawpix.c| 26 ---
 src/mesa/swrast/s_renderbuffer.c   |  6 +-
 src/mesa/swrast/s_renderbuffer.h   |  3 +-
 src/mesa/swrast/s_stencil.c|  3 +-
 29 files changed, 241 insertions(+), 57 deletions(-)
 create mode 100644 docs/specs/MESA_framebuffer_flip_y.spec

diff --git a/docs/specs/MESA_framebuffer_flip_y.spec 
b/docs/specs/MESA_framebuffer_flip_y.spec
new file mode 100644
index 00..dca77a9541
--- /dev/null
+++ b/docs/specs/MESA_framebuffer_flip_y.spec
@@ -0,0 +1,84 @@
+Name
+
+MESA_framebuffer_flip_y
+
+Name Strings
+
+GL_MESA_framebuffer_flip_y
+
+Contact
+
+Fritz Koenig 
+
+Contributors
+
+Fritz Koenig, Google
+Kristian Høgsberg, Google
+Chad Versace, Google
+
+Status
+
+Proposal
+
+Version
+
+Version 1, June 7, 2018
+
+Number
+
+TBD
+
+Dependencies
+
+OpenGL ES 3.1 is required, for FramebufferParameteri.
+
+Overview
+
+Rendered buffers are normally returned right side up, as accessed
+top to bottom.  This extension allows those buffers to be upside down
+when accessed top to bottom.
+
+This extension defines a new framebuffer parameter,
+GL_FRAMEBUFFER_FLIP_Y_MESA, that changes the behavior of the reads and
+writes to the framebuffer attachment points. When 
GL_FRAMEBUFFER_FLIP_Y_MESA
+is GL_TRUE, render commands and pixel transfer operations access the
+backing store of each attachment point with an y-inverted coordinate
+system. This y-inversion is relative to the coordinate system set when
+GL_FRAMEBUFFER_FLIP_Y_MESA is GL_FALSE.
+
+Access through TexSubImage2D and similar calls will notice the effect of
+the flip when they are not attached to framebuffer objects because
+GL_FRAMEBUFFER_FLIP_Y_MESA is associated with the framebuffer object and
+not the attachment points.
+
+IP Status
+
+None
+
+Issues
+
+None
+
+New Procedures and Functions
+
+None
+
+New Types
+
+None
+
+New Tokens
+
+Accepted by the <pname> argument of FramebufferParameteri and
+GetFramebufferParameteriv:
+
+GL_FRAMEBUFFER_FLIP_Y_MESA  0x8BBB
+
+Errors
+GL_INVALID_OPERATION is returned from  GetFramebufferParameteriv if this
+is called on a winsys framebuffer.
+
+Revision History
+
+Version 1, June, 2018
+Initial draft (Fritz Koenig)
diff --git a/include/GLES2/gl2ext.h b/include/GLES2/gl2ext.h
index a7d19a1fc8..0a93bfb865 100644
--- a/include/GLES2/gl2ext.h
+++ b/include/GLES2/gl2ext.h
@@ -2334,6 +2334,11 @@ GL_APICALL void GL_APIENTRY glGetPerfQueryInfoINTEL 
(GLuint queryId, GLuint quer
 #endif
 #endif /* GL_INTEL_performance_query */
 
+#ifndef GL_MESA_framebuffer_flip_y
+#define GL_MESA_framebuffer_flip_y 1
+#define GL_FRAMEBUFFER_FLIP_Y_MESA0x8BBB
+#endif /* GL_MESA_framebuffer_flip_y */
+
 #ifndef GL_MESA_program_binary_formats
 #define GL_MESA_program_binary_formats 1
 #define GL_PROGRAM_BINARY_FORMAT_MESA 0x875F
diff --git a/src/mapi/glapi/registry/gl.xml b/src/mapi/glapi/registry/gl.xml
index 833478aa51..13882eff7b 100644
--- a/src/mapi/glapi/registry/gl.xml
+++ b/src/mapi/glapi/registry/gl.xml
@@ -6568,6 +6568,7 @@ 

[Mesa-dev] [PATCH 2/2] i965: implement MESA_framebuffer_flip_yv [v2]

2018-06-28 Thread Fritz Koenig
Instead of using _mesa_is_winsys_fbo or
_mesa_is_user_fbo to infer if an fbo is
flipped use the InvertedY flag.
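
The coordinate handling that this flag drives can be sketched as follows. This is a simplified standalone illustration modelled on the `brw_meta_mirror_clip_and_scissor` hunk in this patch; `flip_y_range()` and `mirror_for_fb()` are made-up names, and the plain `bool inverted_y` parameter stands in for the framebuffer's InvertedY field.

```c
#include <assert.h>
#include <stdbool.h>

/* Reflect the span [*y0, *y1] around the middle of a surface of the
 * given height; the two endpoints swap roles so that y0 <= y1 holds
 * after the flip if it held before. */
static void flip_y_range(int height, int *y0, int *y1)
{
   int tmp;

   assert(height >= 0);
   tmp = height - *y0;
   *y0 = height - *y1;
   *y1 = tmp;
}

/* Apply the flip only when the framebuffer is marked as y-inverted, and
 * record that the blit direction has been mirrored. */
static void mirror_for_fb(bool inverted_y, int height,
                          int *y0, int *y1, bool *mirror_y)
{
   if (inverted_y) {
      flip_y_range(height, y0, y1);
      *mirror_y = !*mirror_y;
   }
}
```

Applying the flip twice restores the original coordinates and the original `mirror_y` value, which is why flipping both the read and the draw framebuffer cancels out.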
---
 src/mesa/drivers/dri/i965/brw_blorp.c |  2 +-
 src/mesa/drivers/dri/i965/brw_meta_util.c |  4 +-
 src/mesa/drivers/dri/i965/brw_sf.c|  6 +--
 src/mesa/drivers/dri/i965/genX_state_upload.c | 50 +--
 src/mesa/drivers/dri/i965/intel_extensions.c  |  1 +
 src/mesa/drivers/dri/i965/intel_fbo.c | 12 ++---
 .../drivers/dri/i965/intel_pixel_bitmap.c |  8 +--
 src/mesa/drivers/dri/i965/intel_pixel_copy.c  |  4 +-
 src/mesa/drivers/dri/i965/intel_pixel_draw.c  |  2 +-
 9 files changed, 43 insertions(+), 46 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c 
b/src/mesa/drivers/dri/i965/brw_blorp.c
index 5f99e51bc2..9fe3873291 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -685,7 +685,7 @@ brw_blorp_copytexsubimage(struct brw_context *brw,
/* Account for the fact that in the system framebuffer, the origin is at
 * the lower left.
 */
-   bool mirror_y = _mesa_is_winsys_fbo(ctx->ReadBuffer);
+   bool mirror_y = ctx->ReadBuffer->InvertedY;
if (mirror_y)
   apply_y_flip(, , src_rb->Height);
 
diff --git a/src/mesa/drivers/dri/i965/brw_meta_util.c 
b/src/mesa/drivers/dri/i965/brw_meta_util.c
index d292f5a8e2..ad671f600d 100644
--- a/src/mesa/drivers/dri/i965/brw_meta_util.c
+++ b/src/mesa/drivers/dri/i965/brw_meta_util.c
@@ -250,13 +250,13 @@ brw_meta_mirror_clip_and_scissor(const struct gl_context 
*ctx,
/* Account for the fact that in the system framebuffer, the origin is at
 * the lower left.
 */
-   if (_mesa_is_winsys_fbo(read_fb)) {
+   if (read_fb->InvertedY) {
   GLint tmp = read_fb->Height - *srcY0;
   *srcY0 = read_fb->Height - *srcY1;
   *srcY1 = tmp;
   *mirror_y = !*mirror_y;
}
-   if (_mesa_is_winsys_fbo(draw_fb)) {
+   if (draw_fb->InvertedY) {
   GLint tmp = draw_fb->Height - *dstY0;
   *dstY0 = draw_fb->Height - *dstY1;
   *dstY1 = tmp;
diff --git a/src/mesa/drivers/dri/i965/brw_sf.c 
b/src/mesa/drivers/dri/i965/brw_sf.c
index 37ce999dc0..25fe9b3dfe 100644
--- a/src/mesa/drivers/dri/i965/brw_sf.c
+++ b/src/mesa/drivers/dri/i965/brw_sf.c
@@ -90,7 +90,7 @@ brw_upload_sf_prog(struct brw_context *brw)
   return;
 
/* _NEW_BUFFERS */
-   bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   bool inverted_y = ctx->DrawBuffer->InvertedY;
 
memset(&key, 0, sizeof(key));
 
@@ -137,7 +137,7 @@ brw_upload_sf_prog(struct brw_context *brw)
 * Window coordinates in a FBO are inverted, which means point
 * sprite origin must be inverted, too.
 */
-   if ((ctx->Point.SpriteOrigin == GL_LOWER_LEFT) != render_to_fbo)
+   if ((ctx->Point.SpriteOrigin == GL_LOWER_LEFT) == inverted_y)
   key.sprite_origin_lower_left = true;
 
/* BRW_NEW_FS_PROG_DATA */
@@ -161,7 +161,7 @@ brw_upload_sf_prog(struct brw_context *brw)
* face orientation, just as we invert the viewport in
* sf_unit_create_from_key().
*/
-  key.frontface_ccw = brw->polygon_front_bit == render_to_fbo;
+  key.frontface_ccw = brw->polygon_front_bit != inverted_y;
}
 
if (!brw_search_cache(>cache, BRW_CACHE_SF_PROG,
diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 88fde9d12f..b5f3b6c92a 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -217,7 +217,7 @@ genX(upload_polygon_stipple)(struct brw_context *brw)
* to a FBO (i.e. any named frame buffer object), we *don't*
* need to invert - we already match the layout.
*/
-  if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
+  if (ctx->DrawBuffer->InvertedY) {
  for (unsigned i = 0; i < 32; i++)
 poly.PatternRow[i] = ctx->PolygonStipple[31 - i]; /* invert */
   } else {
@@ -257,7 +257,7 @@ genX(upload_polygon_stipple_offset)(struct brw_context *brw)
* to a user-created FBO then our native pixel coordinate system
* works just fine, and there's no window system to worry about.
*/
-  if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
+  if (ctx->DrawBuffer->InvertedY) {
  poly.PolygonStippleYOffset =
 (32 - (_mesa_geometric_height(ctx->DrawBuffer) & 31)) & 31;
   }
@@ -1468,7 +1468,7 @@ genX(upload_clip_state)(struct brw_context *brw)
 #endif
 
 #if GEN_GEN == 7
-  clip.FrontWinding = brw->polygon_front_bit == _mesa_is_user_fbo(fb);
+  clip.FrontWinding = brw->polygon_front_bit != fb->InvertedY;
 
   if (ctx->Polygon.CullFlag) {
  switch (ctx->Polygon.CullFaceMode) {
@@ -1583,7 +1583,7 @@ genX(upload_sf)(struct brw_context *brw)
 
 #if GEN_GEN <= 7
/* _NEW_BUFFERS */
-   bool render_to_fbo = _mesa_is_user_fbo(ctx->DrawBuffer);
+   bool inverted_y = ctx->DrawBuffer->InvertedY;
UNUSED const bool multisampled_fbo 
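
For readers skimming the diff above: the recurring pattern is that a Y range is mirrored only when the framebuffer itself reports that it is flipped. Below is a minimal, self-contained C sketch of that pattern, loosely modelled on the brw_meta_util.c hunk. The struct fb_sketch type and flip_y_if_inverted() are made-up names for illustration and are not Mesa API; only the idea of consulting an InvertedY-style flag comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the framebuffer state the patch consults. */
struct fb_sketch {
   int  height;      /* framebuffer height in pixels */
   bool inverted_y;  /* plays the role of the InvertedY flag */
};

/* Mirror the [*y0, *y1) range inside the framebuffer and toggle the
 * caller's mirror flag, but only if the framebuffer is marked inverted.
 */
static void
flip_y_if_inverted(const struct fb_sketch *fb, int *y0, int *y1,
                   bool *mirror_y)
{
   assert(fb->height >= 0);

   if (!fb->inverted_y)
      return;

   int tmp = fb->height - *y0;
   *y0 = fb->height - *y1;
   *y1 = tmp;
   *mirror_y = !*mirror_y;
}
```

The point of the design is that callers no longer need to know why a framebuffer is flipped (window system or user FBO); they only ask the framebuffer.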

Re: [Mesa-dev] [PATCH 11/11] ac/radv: using tls to store llvm related info and speed up compiles (v3)

2018-06-28 Thread Dave Airlie
On 28 June 2018 at 18:23, Alex Smith  wrote:
> Hi Dave,
>
> I did a quick test with this on Rise of the Tomb Raider. It reduced the time
> taken to create all pipelines for the whole game over 8 threads (with
> RADV_DEBUG=nocache) from 12m24s to 11m35s. Nice improvement :)

Oh good to have some real world numbers.

Thanks for testing,
Dave.

>
> Also didn't see any issues, so:
>
> Tested-by: Alex Smith 
>
> Thanks,
> Alex
>
> On 27 June 2018 at 04:58, Dave Airlie  wrote:
>>
>> From: Dave Airlie 
>>
>> I'd like to encourage people to test this to see if it helps (like
>> does it make app startup better or less hitching in dxvk).
>>
>> The basic idea is to store a bunch of LLVM related data structs
>> in thread local storage so we can avoid reiniting them every time
>> we compile a shader. Since we know llvm objects aren't thread safe
>> it has to be stored using TLS to avoid any collisions.
>>
>> This should remove all the fixed setup overhead of creating
>> the pass manager each time.
>>
>> This cuts the time a demo app takes to compile the radv meta shaders
>> with nocache and exit from 1.7s to 1s.
>>
>> TODO: this doesn't work for radeonsi yet, but I'm not sure how TLS
>> works if you have radeonsi and radv loaded at the same time, if
>> they'll magically try and use the same tls stuff, in which case
>> this might explode all over the place.
>>
>> v2: fix llvm6 build, inline emit function, handle multiple targets
>> in one thread
>> v3: rebase and port onto new structure
>> ---
>>  src/amd/common/ac_llvm_helper.cpp | 120 --
>>  src/amd/common/ac_llvm_util.c |  10 +--
>>  src/amd/common/ac_llvm_util.h |   9 +++
>>  src/amd/vulkan/radv_debug.h   |   1 +
>>  src/amd/vulkan/radv_device.c  |   1 +
>>  src/amd/vulkan/radv_shader.c  |   2 +
>>  6 files changed, 132 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/amd/common/ac_llvm_helper.cpp
>> b/src/amd/common/ac_llvm_helper.cpp
>> index 27403dbe085..f1f1399b3fb 100644
>> --- a/src/amd/common/ac_llvm_helper.cpp
>> +++ b/src/amd/common/ac_llvm_helper.cpp
>> @@ -31,12 +31,21 @@
>>
>>  #include "ac_llvm_util.h"
>>  #include 
>> -#include 
>> -#include 
>> -#include 
>> -#include 
>> +#include 
>>  #include 
>>  #include 
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#if HAVE_LLVM >= 0x0700
>> +#include 
>> +#endif
>> +
>> +#if HAVE_LLVM < 0x0700
>> +#include "llvm/Support/raw_ostream.h"
>> +#endif
>> +#include 
>>
>>  void ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
>>  {
>> @@ -101,11 +110,110 @@
>> ac_dispose_target_library_info(LLVMTargetLibraryInfoRef library_info)
>> delete reinterpret_cast> *>(library_info);
>>  }
>>
>> +class ac_llvm_per_thread_info {
>> +public:
>> +   ac_llvm_per_thread_info(enum radeon_family arg_family,
>> +   enum ac_target_machine_options
>> arg_tm_options)
>> +   : family(arg_family), tm_options(arg_tm_options),
>> + OStream(CodeString) {}
>> +   ~ac_llvm_per_thread_info() {
>> +   ac_llvm_compiler_dispose_internal(_info);
>> +   }
>> +
>> +   struct ac_llvm_compiler_info llvm_info;
>> +   enum radeon_family family;
>> +   enum ac_target_machine_options tm_options;
>> +   llvm::SmallString<0> CodeString;
>> +   llvm::raw_svector_ostream OStream;
>> +   llvm::legacy::PassManager pass;
>> +};
>> +
>> +/* we have to store a linked list per thread due to the possibility of
>> multiple gpus being required */
>> +static thread_local std::list
>> ac_llvm_per_thread_list;
>> +
>>  bool ac_compile_to_memory_buffer(struct ac_llvm_compiler_info *info,
>>  LLVMModuleRef M,
>>  char **ErrorMessage,
>>  LLVMMemoryBufferRef *OutMemBuf)
>>  {
>> -   return LLVMTargetMachineEmitToMemoryBuffer(info->tm, M,
>> LLVMObjectFile,
>> -  ErrorMessage,
>> OutMemBuf);
>> +   ac_llvm_per_thread_info *thread_info = nullptr;
>> +   if (info->thread_stored) {
>> +   for (auto  : ac_llvm_per_thread_list) {
>> +   if (I.llvm_info.tm == info->tm) {
>> +   thread_info = 
>> +   break;
>> +   }
>> +   }
>> +
>> +   if (!thread_info) {
>> +   assert(0);
>> +   return false;
>> +   }
>> +   } else {
>> +   return LLVMTargetMachineEmitToMemoryBuffer(info->tm, M,
>> LLVMObjectFile,
>> +  ErrorMessage,
>> OutMemBuf);
>> +   }
>> +
>> +   llvm::TargetMachine *TM =
>> reinterpret_cast(thread_info->llvm_info.tm);
>> +   llvm::Module *Mod = llvm::unwrap(M);
>> +   llvm::StringRef Data;
>> +
>> +   Mod->setDataLayout(TM->createDataLayout());
>> +
>> +   
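
To illustrate the idea from the commit message (keep expensive compiler state in thread-local storage and look it up per target instead of rebuilding it for every shader), here is a small C11 sketch. It is not the patch's code: the patch uses C++ thread_local with a std::list, while this uses _Thread_local and a fixed-size array, and compiler_state_sketch, get_thread_compiler() and tls_setup_runs are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHED_TARGETS 4 /* arbitrary bound, for the sketch only */

/* Hypothetical per-target state; stands in for the pass manager,
 * output stream and so on.
 */
struct compiler_state_sketch {
   int target_id; /* stands in for the target machine / GPU family */
};

/* One table per thread, so no locking is needed: no other thread can
 * see these objects.
 */
static _Thread_local struct compiler_state_sketch tls_cache[MAX_CACHED_TARGETS];
static _Thread_local size_t tls_cache_len;
static _Thread_local unsigned tls_setup_runs; /* counts expensive setups */

static struct compiler_state_sketch *
get_thread_compiler(int target_id)
{
   assert(target_id >= 0);

   /* Reuse an existing entry for this target if the thread has one. */
   for (size_t i = 0; i < tls_cache_len; i++) {
      if (tls_cache[i].target_id == target_id)
         return &tls_cache[i];
   }

   /* The sketch has no eviction; a real cache would need a policy. */
   if (tls_cache_len == MAX_CACHED_TARGETS)
      return NULL;

   struct compiler_state_sketch *st = &tls_cache[tls_cache_len++];
   st->target_id = target_id;
   tls_setup_runs++; /* the expensive one-time setup would happen here */
   return st;
}
```

Looking up by target mirrors the "multiple targets in one thread" note in the v2 changelog: a thread may compile for more than one GPU, so a single cached object per thread would not be enough.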

Re: [Mesa-dev] [PATCH] anv/cmd_buffer: emit binding tables always if push constants are dirty

2018-06-28 Thread Jason Ekstrand
On Thu, Jun 28, 2018 at 3:58 AM, Iago Toral  wrote:

> On Thu, 2018-06-28 at 08:47 +0200, Iago Toral wrote:
>
> On Wed, 2018-06-27 at 09:13 -0700, Jason Ekstrand wrote:
>
> On Wed, Jun 27, 2018 at 2:25 AM, Iago Toral  wrote:
>
> On Tue, 2018-06-26 at 10:59 -0700, Jason Ekstrand wrote:
>
> On Tue, Jun 26, 2018 at 4:08 AM, Iago Toral Quiroga 
> wrote:
>
> Storage images require patching push constant state to work, which happens
> during binding table emission. In the scenario where our pipeline and
> descriptors are not dirty, we don't re-emit the binding table; however, if
> our push constant state is dirty, we will re-emit the push constant state,
> trashing the storage image setup.
>
> While that scenario is probably not very likely to happen in practice,
> there are some CTS tests that trigger this by clearing storage images and
> buffers and dispatching a compute shader in a loop. The clearing of the
> images and buffers will trigger a blorp execution which will dirty our push
> constant state; however, because we don't alter the descriptors or the
> compute dispatch at all in the loop (we are basically executing the same
> program in a loop), our pipeline and descriptor state is not dirty. If the
> shader uses a storage image, then any iteration after the first will re-emit
> push constant state without re-emitting binding tables and the storage image
> will not be properly set up any more.
>
>
> I don't see why that is a problem.  The only thing flush_descriptor_sets
> does is fill out the binding and sampler tables and fill in the push
> constant data for storage images/buffers.  The actual HW packets are filled
> out by flush_push_constants and emit_descriptor_pointers.  Yes, blorp
> trashes our descriptor pointers but the descriptor sets should be fine.
> For push constants, it does emit 3DSTATE_CONSTANT_* but it doesn't actually
> modify anv_cmd_state::push_constants.
>
> Are secondary command buffers involved?  I could see something funny going
> on with those.
>
>
> No, no secondaries are involved. I did some more investigation and I think
> my explanation of the problem was not good, this is what is really
> happening:
>
> First, I found the problem in the compute pipeline and I only extended the
> fix to the graphics pipeline because it looked like the same rationale
> would apply, so I'll explain what happens in compute and then we can
> discuss whether the same problem applies to graphics.
>
> The test does something like this:
>
> for (...) {
> clear ssbos / storage images
> dispatch compute
> }
>
> The first iteration of this loop will find that the compute pipeline and
> descriptors are dirty and proceed to emit binding tables. We have storage
> images, so during that process the push constant buffer is amended to
> include storage images. Specifically, we call 
> anv_cmd_buffer_ensure_push_constants_size()
> for the images field. This gives us a size of 624.
>
> We move on to the second iteration of the loop. When we clear images and
> ssbos via blorp, we again mark the push constant buffer as dirty. Now we
> execute the compute dispatch and the first thing we do there is
> anv_cmd_buffer_push_base_group_id() which calls
> anv_cmd_buffer_ensure_push_constants_size() for the base group id, which
> gives us a size of 144. This is smaller than what we computed in the
> previous iteration, because we haven't called the same function for the
> images field yet. Unfortunately, we will never call that again, because we
> only do that during binding table emission and we only do that if the
> compute pipeline is dirty (it is not) or our descriptors are dirty (they
> are not). So we don't re-emit binding table and we don't ensure push
> constant space for the image data, but because we come from a blorp
> execution our push constant dirty flag is true, so we re-emit push constant
> data, only that this time we won't emit the push constant data we need for
> the storage images, which leads to the problem.
>
>
> The intention has always been that anv_cmd_buffer_ensure_push_constants_size
> would only ever grow the push constants and never shrink them.  The most
> obvious bug is in anv_cmd_buffer_ensure_push_constants_size.
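
The grow-only invariant described above can be sketched as follows. This only illustrates the intended behaviour with made-up names (push_constants_sketch, ensure_size_grow_only); it is not the anv function, and as noted further down in the thread a change along these lines was reported to cause hangs in some runs, so it should not be read as the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the command buffer's push constant storage. */
struct push_constants_sketch {
   unsigned char *data;
   size_t         size; /* number of valid bytes */
};

/* Make sure at least `wanted` bytes are available. The size only ever
 * grows; a smaller request leaves the existing size untouched.
 * Returns 0 on success, -1 on allocation failure.
 */
static int
ensure_size_grow_only(struct push_constants_sketch *pc, size_t wanted)
{
   assert(pc != NULL);

   if (wanted <= pc->size)
      return 0; /* never shrink */

   unsigned char *p = realloc(pc->data, wanted);
   if (p == NULL)
      return -1;

   /* Zero the newly added tail so stale bytes are never uploaded. */
   memset(p + pc->size, 0, wanted - pc->size);
   pc->data = p;
   pc->size = wanted;
   return 0;
}
```

With the sizes quoted in the thread, asking for 624 bytes and then for 144 bytes leaves the size at 624, which is the behaviour the reply says was always intended.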
>
>
> I thought that maybe making anv_cmd_buffer_ensure_push_constants_size()
> only update the size if we alloc or realloc would fix this, but that can
> cause GPU hangs in some cases when I run multiple tests in parallel, so I
> guess it isn't that simple.
>
>
> Ugh...  that makes things more interesting.  That does look like the right
> fix and now I'm wondering why it leads to a hang.
>
> In the compute case, flush_compute_descriptor_sets emits
> MEDIA_INTERFACE_DESCRIPTOR_LOAD.  My feeling is that not emitting that
> packet is the real bug.  In GL, we just re-emit all 4 compute packets all
> the time and don't try to track dirty bits.  I had patches to do that in
> Vulkan somewhere.  I rebased them and pushed them here:
>
> 

Re: [Mesa-dev] [PATCH 1/2] radv: enable/disable prediction for the DCC decompression pass

2018-06-28 Thread Dave Airlie
Seems sane,

Reviewed-by: Dave Airlie 

On 29 June 2018 at 01:14, Samuel Pitoiset  wrote:
> ping?
>
> On 04/18/2018 02:34 PM, Samuel Pitoiset wrote:
>>
>> Performing a DCC decompression pass is currently pretty rare,
>> but using prediction allows the GPU to skip unnecessary passes.
>>
>> Signed-off-by: Samuel Pitoiset 
>> ---
>>   src/amd/vulkan/radv_meta_fast_clear.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/amd/vulkan/radv_meta_fast_clear.c
>> b/src/amd/vulkan/radv_meta_fast_clear.c
>> index d5af7a1b0c..e702dc80a5 100644
>> --- a/src/amd/vulkan/radv_meta_fast_clear.c
>> +++ b/src/amd/vulkan/radv_meta_fast_clear.c
>> @@ -601,7 +601,7 @@ radv_emit_color_decompress(struct radv_cmd_buffer
>> *cmd_buffer,
>>  pipeline =
>> cmd_buffer->device->meta_state.fast_clear_flush.cmask_eliminate_pipeline;
>> }
>>   - if (!decompress_dcc && radv_image_has_dcc(image)) {
>> +   if (radv_image_has_dcc(image)) {
>> radv_emit_set_predication_state_from_image(cmd_buffer,
>> image, true);
>> cmd_buffer->state.predicating = true;
>> }
>> @@ -667,7 +667,7 @@ radv_emit_color_decompress(struct radv_cmd_buffer
>> *cmd_buffer,
>> _buffer->pool->alloc);
>> }
>> -   if (!decompress_dcc && radv_image_has_dcc(image)) {
>> +   if (radv_image_has_dcc(image)) {
>> cmd_buffer->state.predicating = false;
>> radv_emit_set_predication_state_from_image(cmd_buffer,
>> image, false);
>> }
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/1] swr/rast: last swr formatting changes for a while v2

2018-06-28 Thread Alok Hota
Sorry for the churn on these patches. They had to be split into three
because of a code change that landed in the middle of a mass formatting
change. Anyway, this patch contains only formatting changes for the files
that were updated in the last patch.

v2: I added the clang-format file we used.

Alok Hota (1):
  swr/rast: Updating code style based on current clang-format rules

 .../drivers/swr/rasterizer/_clang-format  | 114 +++
 .../swr/rasterizer/jitter/JitManager.cpp  | 133 ++--
 .../swr/rasterizer/jitter/builder_gfx_mem.cpp |  90 +
 .../swr/rasterizer/jitter/builder_gfx_mem.h   | 101 +-
 .../jitter/functionpasses/lower_x86.cpp   | 189 +-
 5 files changed, 374 insertions(+), 253 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/_clang-format

-- 
2.17.1



[Mesa-dev] [PATCH 1/1] swr/rast: Updating code style based on current clang-format rules v2

2018-06-28 Thread Alok Hota
added clang format file
---
 .../drivers/swr/rasterizer/_clang-format  | 114 +++
 .../swr/rasterizer/jitter/JitManager.cpp  | 133 ++--
 .../swr/rasterizer/jitter/builder_gfx_mem.cpp |  90 +
 .../swr/rasterizer/jitter/builder_gfx_mem.h   | 101 +-
 .../jitter/functionpasses/lower_x86.cpp   | 189 +-
 5 files changed, 374 insertions(+), 253 deletions(-)
 create mode 100644 src/gallium/drivers/swr/rasterizer/_clang-format

diff --git a/src/gallium/drivers/swr/rasterizer/_clang-format 
b/src/gallium/drivers/swr/rasterizer/_clang-format
new file mode 100644
index 00..ed4b9b409d
--- /dev/null
+++ b/src/gallium/drivers/swr/rasterizer/_clang-format
@@ -0,0 +1,114 @@
+---
+Language:Cpp
+# BasedOnStyle:  LLVM
+AccessModifierOffset: -4
+AlignAfterOpenBracket: Align
+AlignConsecutiveAssignments: true
+AlignConsecutiveDeclarations: true
+AlignEscapedNewlines: Left
+AlignOperands:   true
+AlignTrailingComments: true
+AllowAllParametersOfDeclarationOnNextLine: true
+AllowShortBlocksOnASingleLine: false
+AllowShortCaseLabelsOnASingleLine: false
+AllowShortFunctionsOnASingleLine: Inline
+AllowShortIfStatementsOnASingleLine: false
+AllowShortLoopsOnASingleLine: false
+AlwaysBreakAfterDefinitionReturnType: None
+AlwaysBreakAfterReturnType: None
+AlwaysBreakBeforeMultilineStrings: false
+AlwaysBreakTemplateDeclarations: true
+BinPackArguments: false
+BinPackParameters: false
+BraceWrapping:   
+  AfterClass:  true
+  AfterControlStatement: true
+  AfterEnum:   true
+  AfterFunction:   true
+  AfterNamespace:  true
+  AfterObjCDeclaration: true
+  AfterStruct: true
+  AfterUnion:  true
+  #AfterExternBlock: false
+  BeforeCatch: true
+  BeforeElse:  true
+  IndentBraces:false
+  SplitEmptyFunction: true
+  SplitEmptyRecord: true
+  SplitEmptyNamespace: true
+BreakBeforeBinaryOperators: None
+BreakBeforeBraces: Custom
+BreakBeforeInheritanceComma: false
+BreakBeforeTernaryOperators: true
+BreakConstructorInitializersBeforeComma: false
+BreakConstructorInitializers: AfterColon
+BreakAfterJavaFieldAnnotations: false
+BreakStringLiterals: true
+ColumnLimit: 100
+CommentPragmas:  '^ IWYU pragma:'
+CompactNamespaces: false
+ConstructorInitializerAllOnOneLineOrOnePerLine: false
+ConstructorInitializerIndentWidth: 4
+ContinuationIndentWidth: 4
+Cpp11BracedListStyle: true
+DerivePointerAlignment: false
+DisableFormat:   false
+ExperimentalAutoDetectBinPacking: false
+FixNamespaceComments: true
+ForEachMacros:   
+  - foreach
+  - Q_FOREACH
+  - BOOST_FOREACH
+#IncludeBlocks:   Preserve
+IncludeCategories: 
+  - Regex:   '^"(llvm|llvm-c|clang|clang-c)/'
+Priority:2
+  - Regex:   '^(<|"(gtest|gmock|isl|json)/)'
+Priority:3
+  - Regex:   '.*'
+Priority:1
+IncludeIsMainRegex: '(Test)?$'
+IndentCaseLabels: false
+#IndentPPDirectives: AfterHash
+IndentWidth: 4
+IndentWrappedFunctionNames: false
+JavaScriptQuotes: Leave
+JavaScriptWrapImports: true
+KeepEmptyLinesAtTheStartOfBlocks: false
+MacroBlockBegin: ''
+MacroBlockEnd:   ''
+MaxEmptyLinesToKeep: 1
+NamespaceIndentation: All
+ObjCBlockIndentWidth: 4
+ObjCSpaceAfterProperty: false
+ObjCSpaceBeforeProtocolList: true
+PenaltyBreakAssignment: 2
+PenaltyBreakBeforeFirstCallParameter: 19
+PenaltyBreakComment: 300
+PenaltyBreakFirstLessLess: 120
+PenaltyBreakString: 1000
+PenaltyExcessCharacter: 100
+PenaltyReturnTypeOnItsOwnLine: 60
+PointerAlignment: Left
+#RawStringFormats: 
+#  - Delimiter:   pb
+#Language:TextProto
+#BasedOnStyle:google
+ReflowComments:  true
+SortIncludes:false
+SortUsingDeclarations: true
+SpaceAfterCStyleCast: false
+SpaceAfterTemplateKeyword: true
+SpaceBeforeAssignmentOperators: true
+SpaceBeforeParens: ControlStatements
+SpaceInEmptyParentheses: false
+SpacesBeforeTrailingComments: 1
+SpacesInAngles:  false
+SpacesInContainerLiterals: true
+SpacesInCStyleCastParentheses: false
+SpacesInParentheses: false
+SpacesInSquareBrackets: false
+Standard:Cpp11
+TabWidth:4
+UseTab:  Never
+...
diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
index 5bacf55126..0312fc47fb 100644
--- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
+++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
@@ -59,7 +59,7 @@ using namespace SwrJit;
 //
 /// @brief Contructor for JitManager.
 /// @param simdWidth - SIMD width to be used in generated program.
-JitManager::JitManager(uint32_t simdWidth, const char *arch, const char *core) 
:
+JitManager::JitManager(uint32_t simdWidth, const char* arch, const char* core) 
:
 mContext(), mBuilder(mContext), mIsModuleFinalized(true), mJitNumber(0), 
mVWidth(simdWidth),
 mArch(arch)
 {
@@ -153,7 +153,7 @@ JitManager::JitManager(uint32_t simdWidth, const char 

Re: [Mesa-dev] [PATCH] anv: finish the binding_table_pool on destroyDevice when use_softpin

2018-06-28 Thread Jason Ekstrand

Rb

On June 28, 2018 08:50:21 Jose Maria Casanova Crespo 
 wrote:



Running VK-CTS in batch execution mode was raising the
VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the
same failing tests were run isolated they always passed.

createDevice and destroyDevice were called before and after every
test. Because the binding_table_pool was never closed, we reached the
maximum number of open file descriptors (ulimit -n) and when that
happened every call to createDevice implied a
VK_ERROR_INITIALIZATION_FAILED error.

Fixes: c7db0ed4e94dce563d722e1b098684fbd7315d51
 ("anv: Use a separate pool for binding tables when soft pinning")


Cc: Scott D Phillips 
Cc: Jason Ekstrand 

---
src/intel/vulkan/anv_device.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index ea24a0ad03d..5266b269244 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1782,6 +1782,7 @@ void anv_DestroyDevice(
const VkAllocationCallbacks*pAllocator)
{
   ANV_FROM_HANDLE(anv_device, device, _device);
+   struct anv_physical_device *physical_device = 
&device->instance->physicalDevice;


   if (!device)
  return;
@@ -1808,6 +1809,8 @@ void anv_DestroyDevice(
   if (device->info.gen >= 10)
  anv_gem_close(device, device->hiz_clear_bo.gem_handle);

+   if (physical_device->use_softpin)
+  anv_state_pool_finish(&device->binding_table_pool);
   anv_state_pool_finish(&device->surface_state_pool);
   anv_state_pool_finish(&device->instruction_state_pool);
   anv_state_pool_finish(&device->dynamic_state_pool);
--
2.18.0






Re: [Mesa-dev] [PATCH 02/12] i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible

2018-06-28 Thread Caio Marcelo de Oliveira Filho
Patch looks good to me; consider the suggestion at the end of the email.

Reviewed-by: Caio Marcelo de Oliveira Filho 


A related question: for the case "if (inst->opcode == BRW_OPCODE_CMP
&& !inst->src[1].is_zero())" don't we need to break if we find an ADD
that is not a match but writes to the register we use in comparison?
In the general case this is covered by the regions_overlap.

E.g.

0: add        dest  src0  src1
1: add        src0  src2  src3
2: cmp.ge.f0  null  src0  -src1

When scanning with inst=2 and scan_inst=1, the current code does "goto
not_match" and will keep scanning to scan_inst=0. Should it "break"
instead?
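
A toy version of the scan in question, to make the concern concrete. It assumes registers are plain integers and ignores source negation, and it shows the conservative behaviour the question proposes (stop as soon as something overwrites a register the CMP reads); whether the real pass needs this is exactly what is being asked. All names here (inst_sketch, find_add_for_cmp) are made up for the example.

```c
#include <assert.h>

enum op_sketch { OP_ADD, OP_CMP, OP_OTHER };

/* Hypothetical, heavily simplified instruction: registers are ints and
 * -1 means the null register.
 */
struct inst_sketch {
   enum op_sketch opcode;
   int dst;
   int src[2];
};

/* Walk backwards from the CMP at cmp_idx looking for an ADD with the
 * same two sources. Return its index, or -1 if none is found before an
 * instruction overwrites one of the registers the CMP reads.
 */
static int
find_add_for_cmp(const struct inst_sketch *prog, int cmp_idx)
{
   const struct inst_sketch *cmp = &prog[cmp_idx];
   assert(cmp->opcode == OP_CMP);

   for (int i = cmp_idx - 1; i >= 0; i--) {
      const struct inst_sketch *scan = &prog[i];

      if (scan->opcode == OP_ADD &&
          scan->src[0] == cmp->src[0] &&
          scan->src[1] == cmp->src[1])
         return i;

      /* Not a match. If it clobbers a CMP source, any earlier ADD saw a
       * different value, so give up instead of scanning further.
       */
      if (scan->dst == cmp->src[0] || scan->dst == cmp->src[1])
         return -1;
   }

   return -1;
}
```

For the three-instruction example above, the scan stops at instruction 1 because it writes src0, so instruction 0 is never considered.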


> diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp 
> b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
> index 9560cc3b6f7..0602e25ffc2 100644
> --- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
> +++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
> @@ -82,6 +82,14 @@ opt_cmod_propagation_local(bblock_t *block, vec4_visitor 
> *v)
>  if (scan_inst->opcode != BRW_OPCODE_ADD)
> goto not_match;
>  
> +if ((scan_inst->dst.writemask != WRITEMASK_X &&
> + scan_inst->dst.writemask != WRITEMASK_XYZW) ||
> +(scan_inst->dst.writemask == WRITEMASK_XYZW &&
> + inst->src[0].swizzle != BRW_SWIZZLE_XYZW) ||
> +(inst->dst.writemask & ~scan_inst->dst.writemask) != 0) {
> +   goto not_match;
> +}
> +

It could be worth capturing these conditions (and maybe the exec_size
/ group matching) in a helper function (or a local boolean earlier
in the code) to be used by this block and the next one, which handles
the case when regions overlap.
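
As a rough idea of what such a helper could look like, here is a standalone C sketch that packages the three conditions from the hunk above into one predicate. The function name and the SKETCH_* constants are placeholders chosen for this example (the constants are meant to mirror the usual writemask and identity-swizzle encodings, but treat the values as assumptions); the real code would take the instructions themselves and use the existing BRW definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder encodings, assumed for the sketch. */
#define SKETCH_WRITEMASK_X    0x1u
#define SKETCH_WRITEMASK_XYZW 0xfu
#define SKETCH_SWIZZLE_XYZW   0xe4u /* x,y,z,w packed two bits each */

/* True if the flag result of an ADD writing scan_writemask can stand in
 * for a CMP that writes inst_writemask and reads src0 with inst_swizzle.
 */
static bool
cmod_writemasks_compatible(unsigned scan_writemask,
                           unsigned inst_writemask,
                           unsigned inst_swizzle)
{
   assert(scan_writemask <= SKETCH_WRITEMASK_XYZW);

   /* Only the scalar .x and the full .xyzw cases are handled. */
   if (scan_writemask != SKETCH_WRITEMASK_X &&
       scan_writemask != SKETCH_WRITEMASK_XYZW)
      return false;

   /* A full write only lines up if the CMP reads channels in order. */
   if (scan_writemask == SKETCH_WRITEMASK_XYZW &&
       inst_swizzle != SKETCH_SWIZZLE_XYZW)
      return false;

   /* The CMP must not need channels the ADD does not write. */
   if ((inst_writemask & ~scan_writemask) != 0)
      return false;

   return true;
}
```

Both blocks could then reduce to a single "if (!helper(...)) goto not_match;" style check, which keeps the two code paths from drifting apart.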


Thanks,
Caio


[Mesa-dev] [PATCH] gallium/u_vbuf: drop min/max-scanning for empty indirect draws

2018-06-28 Thread Erik Faye-Lund
When building with asserts enabled, we'll end up triggering an assert
in pipe_buffer_map_range down this code-path, due to trying to map
an empty range. Even if we avoid that, we'll trigger another assert
a bit later, because u_vbuf_get_minmax_index returns a min-index of
-1 here, which gets promoted to an unsigned value, and gives us an
out-of-bounds buffer-mapping offset.

Since we can't really have a well-defined min/max range here when
the range is empty anyway, we should just drop this dance in the
first place. After all, no rendering is going to be produced.

This fixes a crash in dEQP-GLES31.functional.draw_indirect.random.0
on VirGL for me.

Signed-off-by: Erik Faye-Lund 
---

This is a resend of a mail that didn't reach the mailing-list yet.
Sorry if it appears twice for someone!

 src/gallium/auxiliary/util/u_vbuf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
b/src/gallium/auxiliary/util/u_vbuf.c
index 42f37c7574..76a1d143d9 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -1183,6 +1183,9 @@ void u_vbuf_draw_vbo(struct u_vbuf *mgr, const struct 
pipe_draw_info *info)
   new_info.start = data[2];
   pipe_buffer_unmap(pipe, transfer);
   new_info.indirect = NULL;
+
+  if (!new_info.count)
+ return;
}
 
if (new_info.index_size) {
-- 
2.18.0.rc2
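
For context on why the early return in the patch above is enough, here is a small standalone C sketch of the failure mode from the commit message. It assumes a min/max scan that starts its minimum at the largest unsigned value (which is what -1 looks like once treated as unsigned); scan_minmax() and prepare_indexed_draw() are made-up names, not the u_vbuf functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Scan an index list for its smallest and largest value. With count == 0
 * the loop never runs, so the "minimum" stays at UINT_MAX.
 */
static void
scan_minmax(const unsigned *indices, unsigned count,
            unsigned *out_min, unsigned *out_max)
{
   unsigned min_i = UINT_MAX;
   unsigned max_i = 0;

   for (unsigned i = 0; i < count; i++) {
      if (indices[i] < min_i)
         min_i = indices[i];
      if (indices[i] > max_i)
         max_i = indices[i];
   }

   *out_min = min_i;
   *out_max = max_i;
}

/* Returns false when there is nothing to draw, so the caller never uses
 * the meaningless range an empty scan would produce.
 */
static bool
prepare_indexed_draw(const unsigned *indices, unsigned count,
                     unsigned *out_min, unsigned *out_max)
{
   if (count == 0)
      return false;

   scan_minmax(indices, count, out_min, out_max);
   assert(*out_min <= *out_max);
   return true;
}
```

Using the scanned minimum of an empty range as a buffer offset is what leads to the out-of-bounds mapping described in the commit message; bailing out before the scan avoids both asserts.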



Re: [Mesa-dev] [PATCH 50/53] intel/anv, blorp, i965: Implement the SKL 16x MSAA SIMD32 workaround

2018-06-28 Thread Kenneth Graunke
On Thursday, May 24, 2018 2:56:32 PM PDT Jason Ekstrand wrote:
> ---
>  src/intel/blorp/blorp_genX_exec.h | 14 ++
>  src/intel/vulkan/genX_pipeline.c  | 20 ++--
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 16 
>  3 files changed, 48 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/blorp/blorp_genX_exec.h 
> b/src/intel/blorp/blorp_genX_exec.h
> index 9947ad3..93f1204 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -763,6 +763,20 @@ blorp_emit_ps_config(struct blorp_batch *batch,
>   ps._16PixelDispatchEnable = prog_data->dispatch_16;
>   ps._32PixelDispatchEnable = prog_data->dispatch_32;
>  
> + /* From the Sky Lake PRM 3DSTATE_PS::32 Pixel Dispatch Enable:
> +  *
> +  *"When NUM_MULTISAMPLES = 16 or FORCE_SAMPLE_COUNT = 16, SIMD32
> +  *Dispatch must not be enabled for PER_PIXEL dispatch mode."
> +  *
> +  * Since 16x MSAA is first introduced on SKL, we don't need to apply
> +  * the workaround on any older hardware.
> +  */
> + if (GEN_GEN >= 9 && !prog_data->persample_dispatch &&
> + params->num_samples == 16) {
> +assert(ps._8PixelDispatchEnable || ps._16PixelDispatchEnable);
> +ps._32PixelDispatchEnable = false;
> + }
> +
>   ps.DispatchGRFStartRegisterForConstantSetupData0 =
>  brw_wm_prog_data_dispatch_grf_start_reg(prog_data, ps, 0);
>   ps.DispatchGRFStartRegisterForConstantSetupData1 =
> diff --git a/src/intel/vulkan/genX_pipeline.c 
> b/src/intel/vulkan/genX_pipeline.c
> index 6bdda5d..97ccc08 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1445,7 +1445,8 @@ is_dual_src_blend_factor(VkBlendFactor factor)
>  
>  static void
>  emit_3dstate_ps(struct anv_pipeline *pipeline,
> -const VkPipelineColorBlendStateCreateInfo *blend)
> +const VkPipelineColorBlendStateCreateInfo *blend,
> +const VkPipelineMultisampleStateCreateInfo *multisample)
>  {
> MAYBE_UNUSED const struct gen_device_info *devinfo = 
> >device->info;
> const struct anv_shader_bin *fs_bin =
> @@ -1492,6 +1493,20 @@ emit_3dstate_ps(struct anv_pipeline *pipeline,
>ps._16PixelDispatchEnable = wm_prog_data->dispatch_16;
>ps._32PixelDispatchEnable = wm_prog_data->dispatch_32;
>  
> +  /* From the Sky Lake PRM 3DSTATE_PS::32 Pixel Dispatch Enable:
> +   *
> +   *"When NUM_MULTISAMPLES = 16 or FORCE_SAMPLE_COUNT = 16, SIMD32
> +   *Dispatch must not be enabled for PER_PIXEL dispatch mode."
> +   *
> +   * Since 16x MSAA is first introduced on SKL, we don't need to apply
> +   * the workaround on any older hardware.
> +   */
> +  if (GEN_GEN >= 9 && !wm_prog_data->persample_dispatch &&
> +  multisample && multisample->rasterizationSamples == 16) {
> + assert(ps._8PixelDispatchEnable || ps._16PixelDispatchEnable);
> + ps._32PixelDispatchEnable = false;
> +  }
> +
>ps.KernelStartPointer0 = fs_bin->kernel.offset +
> brw_wm_prog_data_prog_offset(wm_prog_data, 
> ps, 0);
>ps.KernelStartPointer1 = fs_bin->kernel.offset +
> @@ -1732,7 +1747,8 @@ genX(graphics_pipeline_create)(
> emit_3dstate_sbe(pipeline);
> emit_3dstate_wm(pipeline, subpass, pCreateInfo->pColorBlendState,
> pCreateInfo->pMultisampleState);
> -   emit_3dstate_ps(pipeline, pCreateInfo->pColorBlendState);
> +   emit_3dstate_ps(pipeline, pCreateInfo->pColorBlendState,
> +   pCreateInfo->pMultisampleState);
>  #if GEN_GEN >= 8
> emit_3dstate_ps_extra(pipeline, subpass, pCreateInfo->pColorBlendState);
> emit_3dstate_vf_topology(pipeline);
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> index 0e56f92..df2259d 100644
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> @@ -4032,6 +4032,22 @@ genX(upload_ps)(struct brw_context *brw)
>ps._16PixelDispatchEnable = prog_data->dispatch_16;
>ps._32PixelDispatchEnable = prog_data->dispatch_32;
>  
> +  /* From the Sky Lake PRM 3DSTATE_PS::32 Pixel Dispatch Enable:
> +   *
> +   *"When NUM_MULTISAMPLES = 16 or FORCE_SAMPLE_COUNT = 16, SIMD32
> +   *Dispatch must not be enabled for PER_PIXEL dispatch mode."
> +   *
> +   * Since 16x MSAA is first introduced on SKL, we don't need to apply
> +   * the workaround on any older hardware.
> +   *
> +   * _NEW_MULTISAMPLE

I don't think this is _NEW_MULTISAMPLE - that's for multisampling
related state knobs.  ctx->DrawBuffer is _NEW_BUFFERS, which isn't
flagged here, so this seems slightly off.
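
Independent of which dirty bit ends up being flagged, the workaround itself is the same small rule in all three hunks quoted above. A minimal C sketch of it, with a made-up ps_dispatch_sketch struct and function name standing in for the generated 3DSTATE_PS fields:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three dispatch-enable bits. */
struct ps_dispatch_sketch {
   bool simd8;
   bool simd16;
   bool simd32;
};

/* Per the PRM text quoted in the patch: with 16 samples and per-pixel
 * dispatch, SIMD32 must not be enabled. Only gen9+ has 16x MSAA, so
 * older hardware is left alone.
 */
static void
apply_16x_msaa_simd32_workaround(struct ps_dispatch_sketch *ps, int gen,
                                 bool persample_dispatch,
                                 unsigned num_samples)
{
   if (gen >= 9 && !persample_dispatch && num_samples == 16) {
      /* Some narrower dispatch mode must remain available. */
      assert(ps->simd8 || ps->simd16);
      ps->simd32 = false;
   }
}
```

The only per-driver difference is where num_samples comes from (blorp params, the Vulkan multisample create info, or the GL draw buffer), which is why the source of that value matters for state tracking.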

I think what you want instead is to use 

Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Gert Wollny
Am Donnerstag, den 28.06.2018, 18:09 +0200 schrieb Erik Faye-Lund:
> It still seems kinda strange (and fragile) to me to try to enumerate
> all possible sample locations up-front instead of querying a given
> texture for its sample-locations.
With virgl, querying a texture for host-side information is quite
costly, so if we can get away with one lookup when starting qemu, then
IMHO this is the preferred way to go.

> For instance, I don't think there's anything in the spec backing the
> correctness of the strategy of picking the first 13 positions from the
> 16-sample mode. That just happens to work on these three drivers
> you've checked right now. It might not for future hardware, nor other
> drivers. And I wouldn't be too surprised if nasty details like
> framebuffer transforms (y-flipping in stuff like backbuffer vs FBO,
> renderbuffers, pbuffers, scanout rotations, the recently proposed
> y-flip extensions, you name it) could in fact "secretly" modify what
> the correct values are.
I think from the performance point of view it is a reasonable approach
to query these values once and re-use them. If this turns out to be a
problem then we can still think of another solution, but preferably one
where one doesn't have to go through the whole stack for each coordinate
pair.

best, 
Gert



Re: [Mesa-dev] [PATCH 06/12] nir/opt_peephole_select: Don't try to remove flow control around indirect loads

2018-06-28 Thread Marek Olšák
On our hardware, indirect accesses can go through memory with no
bounds checking.

Marek
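
To make the concern concrete, here is a minimal sketch in plain C rather than NIR. It assumes a hypothetical four-entry table and a helper that counts loads; none of these names come from the patch. The guarded version never touches memory for a bad index, while the flattened version (what removing the flow control would amount to) always issues the load and only selects the result afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical uniform array; the size is an assumption for this sketch. */
#define TABLE_SIZE 4
static const float table[TABLE_SIZE] = { 1.0f, 2.0f, 3.0f, 4.0f };

/* Counts how many times the array is actually read. */
static unsigned load_count;

static float load_table(size_t idx)
{
   load_count++;
   return table[idx];
}

/* Original logic: the branch keeps a bad index away from the load. */
static float guarded(size_t idx)
{
   float result = 0.0f;
   if (idx < TABLE_SIZE)
      result = load_table(idx);
   return result;
}

/* Flattened logic: the load is hoisted out of the branch and a select
 * picks the result. A real flattened shader would read table[idx]
 * unconditionally; this sketch only records that fact in *oob and clamps
 * the index so the C code itself stays well defined. */
static float flattened(size_t idx, bool *oob)
{
   *oob = idx >= TABLE_SIZE;
   float loaded = load_table(*oob ? 0 : idx);
   return idx < TABLE_SIZE ? loaded : 0.0f;
}
```

Both functions return the same value for every index, which is why the transformation looks safe; the difference is only in whether the load is issued at all.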

On Thu, Jun 28, 2018 at 12:46 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> That flow control may be trying to avoid invalid loads.  On at least
> some platforms, those loads can also be expensive.
>
> No shader-db changes on any Intel platform (even with the later patch
> "intel/compiler: More peephole select").
>
> NOTE: I've tried to CC everyone whose driver might be affected by this
> change.
>
> Signed-off-by: Ian Romanick 
> Cc: Eric Anholt 
> Cc: Rob Clark 
> Cc: Marek Olšák 
> ---
>  src/compiler/nir/nir_opt_peephole_select.c | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_peephole_select.c 
> b/src/compiler/nir/nir_opt_peephole_select.c
> index 4ca4f80d788..920ced2137c 100644
> --- a/src/compiler/nir/nir_opt_peephole_select.c
> +++ b/src/compiler/nir/nir_opt_peephole_select.c
> @@ -58,7 +58,8 @@
>   */
>
>  static bool
> -block_check_for_allowed_instrs(nir_block *block, unsigned *count, bool 
> alu_ok)
> +block_check_for_allowed_instrs(nir_block *block, unsigned *count,
> +   gl_shader_stage stage, bool alu_ok)
>  {
> nir_foreach_instr(instr, block) {
>switch (instr->type) {
> @@ -70,6 +71,13 @@ block_check_for_allowed_instrs(nir_block *block, unsigned 
> *count, bool alu_ok)
>  switch (intrin->variables[0]->var->data.mode) {
>  case nir_var_shader_in:
>  case nir_var_uniform:
> +   /* Don't try to remove flow control around an indirect load
> +* because that flow control may be trying to avoid invalid
> +* loads.
> +*/
> +   if (nir_deref_has_indirect(stage, intrin->variables[0]))
> +  return false;
> +
> break;
>
>  default:
> @@ -168,8 +176,10 @@ nir_opt_peephole_select_block(nir_block *block, 
> nir_shader *shader,
>
> /* ... and those blocks must only contain "allowed" instructions. */
> unsigned count = 0;
> -   if (!block_check_for_allowed_instrs(then_block, &count, limit != 0) ||
> -   !block_check_for_allowed_instrs(else_block, &count, limit != 0))
> +   if (!block_check_for_allowed_instrs(then_block, &count, shader->info.stage,
> +   limit != 0) ||
> +   !block_check_for_allowed_instrs(else_block, &count, shader->info.stage,
> +   limit != 0))
>return false;
>
> if (count > limit)
> --
> 2.14.4
>


Re: [Mesa-dev] [PATCH 01/12] intel/compiler: Silence unused parameter warnings brw_nir.c

2018-06-28 Thread Caio Marcelo de Oliveira Filho
Reviewed-by: Caio Marcelo de Oliveira Filho 
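
For readers skimming the patch below, the two ways it silences -Wunused-parameter can be sketched as follows. This is only an illustration: the function names are hypothetical, and the UNUSED macro here is assumed to expand to the GCC/Clang attribute.

```c
/* Assumption: GCC/Clang attribute spelling for an "unused" marker. */
#define UNUSED __attribute__((unused))

/* Option 1: the parameter carries no information, so drop it from the
 * signature (and from every caller). */
static int lower_outputs(int location)
{
   return location;
}

/* Option 2: the signature is fixed by a callback type, so the parameter
 * has to stay and is marked as intentionally unused instead. */
typedef unsigned (*bit_size_cb)(unsigned bit_size, void *data);

static unsigned lower_16_to_32(unsigned bit_size, UNUSED void *data)
{
   return bit_size == 16 ? 32 : 0;
}
```

The first option changes the prototype and the call sites; the second leaves the interface alone, which is why it suits callbacks.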


On Wed, Jun 27, 2018 at 09:46:14PM -0700, Ian Romanick wrote:
> From: Ian Romanick 
> 
> src/intel/compiler/brw_nir.c: In function ‘brw_nir_lower_vue_outputs’:
> src/intel/compiler/brw_nir.c:464:32: warning: unused parameter ‘is_scalar’ 
> [-Wunused-parameter]
>bool is_scalar)
> ^
> src/intel/compiler/brw_nir.c: In function ‘lower_bit_size_callback’:
> src/intel/compiler/brw_nir.c:610:57: warning: unused parameter ‘data’ 
> [-Wunused-parameter]
>  lower_bit_size_callback(const nir_alu_instr *alu, void *data)
>  ^~~~
> 
> Signed-off-by: Ian Romanick 
> ---
>  src/intel/compiler/brw_nir.c   | 5 ++---
>  src/intel/compiler/brw_nir.h   | 2 +-
>  src/intel/compiler/brw_shader.cpp  | 2 +-
>  src/intel/compiler/brw_vec4.cpp| 2 +-
>  src/intel/compiler/brw_vec4_gs_visitor.cpp | 2 +-
>  5 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> index dfeea73b06a..67c062d91f5 100644
> --- a/src/intel/compiler/brw_nir.c
> +++ b/src/intel/compiler/brw_nir.c
> @@ -460,8 +460,7 @@ brw_nir_lower_fs_inputs(nir_shader *nir,
>  }
>  
>  void
> -brw_nir_lower_vue_outputs(nir_shader *nir,
> -  bool is_scalar)
> +brw_nir_lower_vue_outputs(nir_shader *nir)
>  {
> nir_foreach_variable(var, &nir->outputs) {
>var->data.driver_location = var->data.location;
> @@ -593,7 +592,7 @@ brw_nir_optimize(nir_shader *nir, const struct 
> brw_compiler *compiler,
>  }
>  
>  static unsigned
> -lower_bit_size_callback(const nir_alu_instr *alu, void *data)
> +lower_bit_size_callback(const nir_alu_instr *alu, UNUSED void *data)
>  {
> assert(alu->dest.dest.is_ssa);
> if (alu->dest.dest.ssa.bit_size != 16)
> diff --git a/src/intel/compiler/brw_nir.h b/src/intel/compiler/brw_nir.h
> index 03f52da08e5..19442b47eae 100644
> --- a/src/intel/compiler/brw_nir.h
> +++ b/src/intel/compiler/brw_nir.h
> @@ -109,7 +109,7 @@ void brw_nir_lower_tes_inputs(nir_shader *nir, const 
> struct brw_vue_map *vue);
>  void brw_nir_lower_fs_inputs(nir_shader *nir,
>   const struct gen_device_info *devinfo,
>   const struct brw_wm_prog_key *key);
> -void brw_nir_lower_vue_outputs(nir_shader *nir, bool is_scalar);
> +void brw_nir_lower_vue_outputs(nir_shader *nir);
>  void brw_nir_lower_tcs_outputs(nir_shader *nir, const struct brw_vue_map 
> *vue,
> GLenum tes_primitive_mode);
>  void brw_nir_lower_fs_outputs(nir_shader *nir);
> diff --git a/src/intel/compiler/brw_shader.cpp 
> b/src/intel/compiler/brw_shader.cpp
> index b7fb06ddbd9..812a19aed29 100644
> --- a/src/intel/compiler/brw_shader.cpp
> +++ b/src/intel/compiler/brw_shader.cpp
> @@ -1194,7 +1194,7 @@ brw_compile_tes(const struct brw_compiler *compiler,
>  
> nir = brw_nir_apply_sampler_key(nir, compiler, &key->tex, is_scalar);
> brw_nir_lower_tes_inputs(nir, input_vue_map);
> -   brw_nir_lower_vue_outputs(nir, is_scalar);
> +   brw_nir_lower_vue_outputs(nir);
> nir = brw_postprocess_nir(nir, compiler, is_scalar);
>  
> brw_compute_vue_map(devinfo, &prog_data->base.vue_map,
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 0e40ddd0b3d..fa1f188ca0f 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -2824,7 +2824,7 @@ brw_compile_vs(const struct brw_compiler *compiler, 
> void *log_data,
> prog_data->double_inputs_read = shader->info.vs.double_inputs;
>  
> brw_nir_lower_vs_inputs(shader, key->gl_attrib_wa_flags);
> -   brw_nir_lower_vue_outputs(shader, is_scalar);
> +   brw_nir_lower_vue_outputs(shader);
> shader = brw_postprocess_nir(shader, compiler, is_scalar);
>  
> prog_data->base.clip_distance_mask =
> diff --git a/src/intel/compiler/brw_vec4_gs_visitor.cpp 
> b/src/intel/compiler/brw_vec4_gs_visitor.cpp
> index fb4c1259948..e03e75f91f3 100644
> --- a/src/intel/compiler/brw_vec4_gs_visitor.cpp
> +++ b/src/intel/compiler/brw_vec4_gs_visitor.cpp
> @@ -642,7 +642,7 @@ brw_compile_gs(const struct brw_compiler *compiler, void 
> *log_data,
>  
> shader = brw_nir_apply_sampler_key(shader, compiler, &key->tex,
> is_scalar);
> brw_nir_lower_vue_inputs(shader, &c.input_vue_map);
> -   brw_nir_lower_vue_outputs(shader, is_scalar);
> +   brw_nir_lower_vue_outputs(shader);
> shader = brw_postprocess_nir(shader, compiler, is_scalar);
>  
> prog_data->base.clip_distance_mask =
> -- 
> 2.14.4
> 


Re: [Mesa-dev] [PATCH v2 2/2] i965: Implement ARB_compute_variable_group_size.

2018-06-28 Thread Manolova, Plamena
Hi Karol,
Thank you for reviewing! I'll go ahead and push the changes you need from
nir_lower_system_values.c to master.

Thank you,
Pam

On Thu, Jun 28, 2018 at 5:50 AM, Karol Herbst  wrote:

> Hi,
>
> if the changes inside "src/compiler/nir/nir_lower_system_values.c" are
> extracted into a separate patch, this patch with the equal changes
> would be
>
> Reviewed-by: Karol Herbst 
>
> I would need that for a nir to codegen pass for Nouveau and maybe it
> will help other drivers implementing this extension as well. I don't
> think it would hurt to extract those, right?
>
> Thanks!
>
> On Thu, Jun 7, 2018 at 5:34 PM, Plamena Manolova
>  wrote:
> > This patch adds the implementation of ARB_compute_variable_group_size
> > for i965. We do this by storing the group size in a buffer surface,
> > similarly to the work group number.
> >
> > v2: Fix some indentation inconsistencies (Jordan, Ilia)
> > Do DIV_ROUND_UP correctly in brw_nir_lower_cs_intrinsics.c (Jordan)
> > Use alphabetical order in features.txt (Matt)
> > Set the extension constants properly in brw_context.c
> >
> > Signed-off-by: Plamena Manolova 
> > ---
> >  docs/features.txt|  2 +-
> >  docs/relnotes/18.2.0.html|  1 +
> >  src/compiler/nir/nir_lower_system_values.c   | 13 
> >  src/intel/compiler/brw_compiler.h|  2 +
> >  src/intel/compiler/brw_fs.cpp| 45 
> >  src/intel/compiler/brw_fs_nir.cpp| 20 ++
> >  src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 88
> +---
> >  src/mesa/drivers/dri/i965/brw_compute.c  | 25 ++-
> >  src/mesa/drivers/dri/i965/brw_context.c  |  6 ++
> >  src/mesa/drivers/dri/i965/brw_context.h  |  1 +
> >  src/mesa/drivers/dri/i965/brw_cs.c   |  4 ++
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 27 +++-
> >  src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
> >  13 files changed, 193 insertions(+), 42 deletions(-)
> >
> > diff --git a/docs/features.txt b/docs/features.txt
> > index ed4050cf98..81b6663288 100644
> > --- a/docs/features.txt
> > +++ b/docs/features.txt
> > @@ -298,7 +298,7 @@ Khronos, ARB, and OES extensions that are not part
> of any OpenGL or OpenGL ES ve
> >
> >GL_ARB_bindless_texture   DONE (nvc0,
> radeonsi)
> >GL_ARB_cl_event   not started
> > -  GL_ARB_compute_variable_group_sizeDONE (nvc0,
> radeonsi)
> > +  GL_ARB_compute_variable_group_sizeDONE (i965,
> nvc0, radeonsi)
> >GL_ARB_ES3_2_compatibilityDONE
> (i965/gen8+)
> >GL_ARB_fragment_shader_interlock  DONE (i965)
> >GL_ARB_gpu_shader_int64   DONE
> (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
> > diff --git a/docs/relnotes/18.2.0.html b/docs/relnotes/18.2.0.html
> > index 0db37b620d..7475a56633 100644
> > --- a/docs/relnotes/18.2.0.html
> > +++ b/docs/relnotes/18.2.0.html
> > @@ -52,6 +52,7 @@ Note: some of the new features are only available with
> certain drivers.
> >
> >  
> >  GL_ARB_fragment_shader_interlock on i965
> > +GL_ARB_compute_variable_group_size on i965
> >  
> >
> >  Bug fixes
> > diff --git a/src/compiler/nir/nir_lower_system_values.c
> b/src/compiler/nir/nir_lower_system_values.c
> > index 487da04262..7ab005b000 100644
> > --- a/src/compiler/nir/nir_lower_system_values.c
> > +++ b/src/compiler/nir/nir_lower_system_values.c
> > @@ -57,6 +57,14 @@ convert_block(nir_block *block, nir_builder *b)
> >*gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID"
> >*/
> >
> > + /*
> > +  * If the local work group size is variable we can't lower the
> global
> > +  * invocation id here.
> > +  */
> > + if (b->shader->info.cs.local_size_variable) {
> > +break;
> > + }
> > +
> >   nir_const_value local_size;
> >   memset(&local_size, 0, sizeof(local_size));
> >   local_size.u32[0] = b->shader->info.cs.local_size[0];
> > @@ -102,6 +110,11 @@ convert_block(nir_block *block, nir_builder *b)
> >}
> >
> >case SYSTEM_VALUE_LOCAL_GROUP_SIZE: {
> > + /* If the local work group size is variable we can't lower it
> here */
> > + if (b->shader->info.cs.local_size_variable) {
> > +break;
> > + }
> > +
> >   nir_const_value local_size;
> >   memset(&local_size, 0, sizeof(local_size));
> >   local_size.u32[0] = b->shader->info.cs.local_size[0];
> > diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_
> compiler.h
> > index 8b4e6fe2e2..f54952c28f 100644
> > --- a/src/intel/compiler/brw_compiler.h
> > +++ b/src/intel/compiler/brw_compiler.h
> > @@ -759,6 +759,7 @@ struct brw_cs_prog_data {
> > unsigned threads;
> > bool uses_barrier;
> 

Re: [Mesa-dev] [PATCH] virgl: Support ARB_framebuffer_no_attachments

2018-06-28 Thread Emil Velikov
On 28 June 2018 at 17:16, Drew Davenport  wrote:
> On Wed, Jun 27, 2018 at 1:57 AM Emil Velikov  wrote:
>>
>> On 27 June 2018 at 09:55, Emil Velikov  wrote:
>> > Hi Drew,
>> >
>> > Just some food for thought. The patch in itself looks correct albeit 
>> > partial.
>> >
>> > On 27 June 2018 at 00:00, Drew Davenport  wrote:
>> >> This change lets the following test pass on virgl:
>> >> dEQP-GLES31.functional.state_query.framebuffer_default.framebuffer_default_samples_get_framebuffer_parameteriv
>> >> ---
>> >>  src/gallium/drivers/virgl/virgl_screen.c | 4 
>> >>  1 file changed, 4 insertions(+)
>> >>
>> >> diff --git a/src/gallium/drivers/virgl/virgl_screen.c 
>> >> b/src/gallium/drivers/virgl/virgl_screen.c
>> >> index 1eefbd6519f..3035d4b5e20 100644
>> >> --- a/src/gallium/drivers/virgl/virgl_screen.c
>> >> +++ b/src/gallium/drivers/virgl/virgl_screen.c
>> >> @@ -495,6 +495,10 @@ virgl_is_format_supported( struct pipe_screen 
>> >> *screen,
>> >> }
>> >>
>> >> if (bind & PIPE_BIND_RENDER_TARGET) {
>> >> +  /* For ARB_framebuffer_no_attachments. */
>> >> +  if (format == PIPE_FORMAT_NONE)
>> >> + return TRUE;
>> >> +
>> >
>> > For ARB_framebuffer_no_attachments to be advertised, one should return
>> > 1 for the PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT query.
>> > Current master says "not supported" (returns 0) and I couldn't spot
>> > any patch that toggles it.
> This patch was on top of the gl-4.3 branch from
> https://gitlab.freedesktop.org/airlied/mesa.git, which does advertise
> support for that capability.
>
>> >
>> > Is this a test which requires the functionality, without checking for
>> > the extension presence?
>> > Or perhaps the test is part of a larger series, which flips the switch?
> Perhaps I jumped the gun sending this patch now, since it depends on a
> bunch of other work that hasn't been merged yet. If it makes more
> sense I don't mind holding onto this patch for now and trying to get
> it merged later.
>
Normally when sending patch/series that depend on other work it's good
to provide a reference.
It's normally a small note after the --- line with the name of the
series (on mesa-dev ML), patchwork link or git repo/branch.

Without it some strange questions are bound to show up ;-)

I don't think you need to wait, but it's your call.

HTH
Emil


Re: [Mesa-dev] [PATCH] virgl: Support ARB_framebuffer_no_attachments

2018-06-28 Thread Drew Davenport
On Wed, Jun 27, 2018 at 6:47 AM Ilia Mirkin  wrote:
>
> On Tue, Jun 26, 2018 at 7:00 PM, Drew Davenport  
> wrote:
> > This change lets the following test pass on virgl:
> > dEQP-GLES31.functional.state_query.framebuffer_default.framebuffer_default_samples_get_framebuffer_parameteriv
> > ---
> >  src/gallium/drivers/virgl/virgl_screen.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/src/gallium/drivers/virgl/virgl_screen.c 
> > b/src/gallium/drivers/virgl/virgl_screen.c
> > index 1eefbd6519f..3035d4b5e20 100644
> > --- a/src/gallium/drivers/virgl/virgl_screen.c
> > +++ b/src/gallium/drivers/virgl/virgl_screen.c
> > @@ -495,6 +495,10 @@ virgl_is_format_supported( struct pipe_screen *screen,
> > }
> >
> > if (bind & PIPE_BIND_RENDER_TARGET) {
> > +  /* For ARB_framebuffer_no_attachments. */
> > +  if (format == PIPE_FORMAT_NONE)
> > + return TRUE;
> > +
>
> This has to be gated on the host driver supporting this functionality,
> right? I.e. ES 3.1 or having this ext.
>
> Also dEQP has a lot of no-fb tests (unfortunately I don't remember
> what they're called off-hand... maybe *no_framebuffer* or
> *no_attach*), you should ensure they all pass (or fail for
> known/understood reasons).
Yes, there are a bunch of tests
(dEQP-GLES31.functional.fbo.no_attachments.*) that are currently
failing. If you'd prefer to have a series of patches that address
these all together, I can hold off on this patch for now. Thanks for
your feedback.
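
For reference, the gating Ilia asks about could look roughly like the sketch below. It is not virgl code: the enum, the bind flag and the caps field are hypothetical stand-ins, and the real driver would take the answer from whatever the host reports.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the pipe format enum and bind flags. */
enum fmt { FMT_NONE, FMT_RGBA8 };
#define BIND_RENDER_TARGET 0x1u

struct host_caps {
   bool fb_no_attachments;   /* hypothetical: host exposes the feature */
};

static bool is_format_supported(const struct host_caps *caps,
                                enum fmt format, unsigned bind)
{
   if (bind & BIND_RENDER_TARGET) {
      /* An attachment-less framebuffer shows up as "no format".
       * Accept it only when the host can actually do it. */
      if (format == FMT_NONE)
         return caps->fb_no_attachments;
   }

   /* Everything else: treat any real format as supported in this sketch. */
   return format != FMT_NONE;
}
```

With this shape the unconditional "return TRUE" from the patch becomes a function of the host capabilities, so a host without the extension keeps rejecting the format.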

>
> Cheers,
>
>   -ilia


Re: [Mesa-dev] [PATCH] virgl: Support ARB_framebuffer_no_attachments

2018-06-28 Thread Drew Davenport
On Wed, Jun 27, 2018 at 1:57 AM Emil Velikov  wrote:
>
> On 27 June 2018 at 09:55, Emil Velikov  wrote:
> > Hi Drew,
> >
> > Just some food for thought. The patch in itself looks correct albeit 
> > partial.
> >
> > On 27 June 2018 at 00:00, Drew Davenport  wrote:
> >> This change lets the following test pass on virgl:
> >> dEQP-GLES31.functional.state_query.framebuffer_default.framebuffer_default_samples_get_framebuffer_parameteriv
> >> ---
> >>  src/gallium/drivers/virgl/virgl_screen.c | 4 
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/src/gallium/drivers/virgl/virgl_screen.c 
> >> b/src/gallium/drivers/virgl/virgl_screen.c
> >> index 1eefbd6519f..3035d4b5e20 100644
> >> --- a/src/gallium/drivers/virgl/virgl_screen.c
> >> +++ b/src/gallium/drivers/virgl/virgl_screen.c
> >> @@ -495,6 +495,10 @@ virgl_is_format_supported( struct pipe_screen *screen,
> >> }
> >>
> >> if (bind & PIPE_BIND_RENDER_TARGET) {
> >> +  /* For ARB_framebuffer_no_attachments. */
> >> +  if (format == PIPE_FORMAT_NONE)
> >> + return TRUE;
> >> +
> >
> > For ARB_framebuffer_no_attachments to be advertised, one should return
> > 1 for the PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT query.
> > Current master says "not supported" (returns 0) and I couldn't spot
> > any patch that toggles it.
This patch was on top of the gl-4.3 branch from
https://gitlab.freedesktop.org/airlied/mesa.git, which does advertise
support for that capability.

> >
> > Is this a test which requires the functionality, without checking for
> > the extension presence?
> > Or perhaps the test is part of a larger series, which flips the switch?
Perhaps I jumped the gun sending this patch now, since it depends on a
bunch of other work that hasn't been merged yet. If it makes more
sense I don't mind holding onto this patch for now and trying to get
it merged later.

> >
> Silly typo, above should read
> "Or perhaps the patch is part ..."
>
> -Emil


Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Erik Faye-Lund
On Thu, Jun 28, 2018 at 5:54 PM Gert Wollny  wrote:
>
> Am Donnerstag, den 28.06.2018, 17:40 +0200 schrieb Erik Faye-Lund:
> > On Thu, Jun 28, 2018 at 5:31 PM Gert Wollny wrote:
> > >
> > > There are two aspects:
> > >
> > > For each number of samples there is indeed a fixed set of sample
> > > positions that only depends on the hardware. The set corresponding
> > > to the requested number of samples is used when the surface is
> > > created with GL_TRUE for the "fixedsamplelocations" parameter.
> >
> > What I'm trying to say is that the concept of a global,
> > hardware-dependent set of sample-positions isn't a thing.
> I'm not sure what you are getting at ...

See below...

> >
> > These are not the same, and querying the positions of the highest
> > sample-count mode isn't going to apply to any other mode. There
> > simply isn't a concept of "the hardware's sample locations". They
> > depend on the MSAA mode (effectively sample-count). This is exactly
> > why you need to create a dummy FBO to query this; you need to know
> > the sample count of the mode to answer this.
> And this is exactly what I'm doing on the host (see the patch against
> virglrenderer I linked before) to get the values into
> vs->caps.caps.v2.msaa_sample_positions: I go through the supported
> sample counts 2, 4, 8, up to 16 (if supported), store the positions,
> and pass them to the guest. I looked at the radeonsi, r600, and the
> intel driver, and they all define sample position sets for these sample
> numbers (only up to 8 for r600). If one requests a number that is not
> amongst these, then the sample positions for the next higher sample
> count are returned (e.g. for 13 samples one gets the first 13 positions
> for 16 samples).

Right, thanks for clarifying. I misread the code, and thought you
discarded the old set of samples after finding a bigger set of
samples, thus assuming any mode was a subset of the biggest mode.

It still seems kinda strange (and fragile) to me to try to enumerate
all possible sample locations up-front instead of querying a given
texture for its sample locations.

For instance, I don't think there's anything in the spec backing the
correctness of the strategy of picking the first 13 positions from the
16-sample mode. That just happens to work on these three drivers you've
checked right now. It might not for future hardware, nor other
drivers. And I wouldn't be too surprised if nasty details like
framebuffer transforms (y-flipping in stuff like backbuffer vs FBO,
renderbuffers, pbuffers, scanout rotations, the recently proposed
y-flip extensions, you name it) could in fact "secretly" modify what
the correct values are.

This is of course just my two cents.


Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Gert Wollny
Am Donnerstag, den 28.06.2018, 17:40 +0200 schrieb Erik Faye-Lund:
> On Thu, Jun 28, 2018 at 5:31 PM Gert Wollny wrote:
> > 
> > There are two aspects:
> > 
> > For each number of samples there is indeed a fixed set of sample
> > positions that only depends on the hardware. The set corresponding
> > to the requested number of samples is used when the surface is
> > created with GL_TRUE for the "fixedsamplelocations" parameter.
> 
> What I'm trying to say is that the concept of a global,
> hardware-dependent set of sample-positions isn't a thing.
I'm not sure what you are getting at ...
> 
> These are not the same, and querying the positions of the highest
> sample-count mode isn't going to apply to any other mode. There
> simply isn't a concept of "the hardware's sample locations". They
> depend on the MSAA mode (effectively sample-count). This is exactly
> why you need to create a dummy FBO to query this; you need to know
> the sample count of the mode to answer this.
And this is exactly what I'm doing on the host (see the patch against
virglrenderer I linked before) to get the values into 
vs->caps.caps.v2.msaa_sample_positions: I go through the supported
sample counts 2, 4, 8, up to 16 (if supported), store the positions,
and pass them to the guest. I looked at the radeonsi, r600, and the
intel driver, and they all define sample position sets for these sample
numbers (only up to 8 for r600). If one requests a number that is not
amongst these, then the sample positions for the next higher sample
count are returned (e.g. for 13 samples one gets the first 13 positions
for 16 samples).

Best, 
Gert 
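
The "next higher sample count" rule described above can be sketched as an index computation into one flat array. This is a simplified illustration, not the code from the patch: it assumes the sets for 2, 4, 8 and 16 samples are stored back to back, and the function names are made up for the example. The single-sample case (pixel center) is left out.

```c
#include <stdbool.h>

/* Assumed layout: position sets for 2, 4, 8 and 16 samples stored back
 * to back, so the sets start at entries 0, 2, 6 and 14. */
#define MAX_SAMPLES 16
#define TOTAL_POSITIONS (2 + 4 + 8 + 16)

/* Smallest stored set that can serve sample_count (2, 4, 8 or 16). */
static unsigned set_size_for(unsigned sample_count)
{
   unsigned size = 2;
   while (size < sample_count)
      size *= 2;
   return size;
}

/* Computes the flat-array entry for sample `index` of a surface with
 * `sample_count` samples. Returns false for requests outside the table. */
static bool flat_index(unsigned sample_count, unsigned index, unsigned *out)
{
   if (sample_count < 2 || sample_count > MAX_SAMPLES ||
       index >= sample_count)
      return false;

   /* All smaller sets precede this one: 2 + 4 + ... = set size - 2. */
   *out = set_size_for(sample_count) - 2 + index;
   return true;
}
```

A request for 13 samples therefore lands in the 16-sample set, and sample 12 of a 13-sample surface maps to the same entry as sample 12 of a 16-sample surface.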

> 
> The fixedsamplelocations=false complication is entirely orthogonal to
> this.
> 
> > If this parameter is set to false, then all bets are off and the
> > sample
> > positions can even vary over the surface area. For this case
> > glGetMultisample is kind of useless, for the other case one can
> > query
> > all sets once on the host and then re-use these values (see this
> > patch
> > for the host side: https://patchwork.freedesktop.org/patch/233354/)
> > 
> > best,
> > Gert
> > 
> > > 
> > > On Thu, Jun 28, 2018 at 3:45 PM Gert Wollny wrote:
> > > > 
> > > > Use caps to obtain the multisample sample positions for up to
> > > > 16
> > > > positions and implement the according Gallium interface.
> > > > 
> > > > Fixes (when run on GL host):
> > > > dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
> > > > dEQP-GLES31.functional.texture.multisample.samples_16.sample_position
> > > > 
> > > > v2: remove unrelated chunk (thanks Ilia Mirkin)
> > > > v3: - also return positions for intermediate sample counts
> > > > - fix unused variable warning
> > > > - update description
> > > > 
> > > > Signed-off-by: Gert Wollny 
> > > > ---
> > > > I left the debug_printf in there, because this patch (together
> > > > with the related virglrenderer patch) is not sufficient to fix the
> > > > above tests on a GLES host.
> > > > 
> > > >  src/gallium/drivers/virgl/virgl_context.c | 38
> > > > +++
> > > >  src/gallium/drivers/virgl/virgl_hw.h  |  1 +
> > > >  2 files changed, 39 insertions(+)
> > > > 
> > > > diff --git a/src/gallium/drivers/virgl/virgl_context.c
> > > > b/src/gallium/drivers/virgl/virgl_context.c
> > > > index e6f8dc8525..43c141e42d 100644
> > > > --- a/src/gallium/drivers/virgl/virgl_context.c
> > > > +++ b/src/gallium/drivers/virgl/virgl_context.c
> > > > @@ -920,6 +920,42 @@ virgl_context_destroy( struct pipe_context
> > > > *ctx )
> > > > FREE(vctx);
> > > >  }
> > > > 
> > > > +static void virgl_get_sample_position(struct pipe_context
> > > > *ctx,
> > > > +  unsigned sample_count,
> > > > +  unsigned index,
> > > > +  float *out_value)
> > > > +{
> > > > +   struct virgl_context *vctx = virgl_context(ctx);
> > > > +   struct virgl_screen *vs = virgl_screen(vctx->base.screen);
> > > > +
> > > > +   if (sample_count > vs->caps.caps.v1.max_samples) {
> > > > +  debug_printf("VIRGL: requested %d MSAA samples, but only
> > > > %d
> > > > supported\n",
> > > > +   sample_count, vs-
> > > > 

[Mesa-dev] [PATCH] anv: finish the binding_table_pool on destroyDevice when use_softpin

2018-06-28 Thread Jose Maria Casanova Crespo
Running VK-CTS in batch execution mode was raising the
VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the
same failing tests were run isolated they always passed.

createDevice and destroyDevice were called before and after every
test. Because the binding_table_pool was never closed, we reached the
maximum number of open file descriptors (ulimit -n), and once that
happened every call to createDevice failed with a
VK_ERROR_INITIALIZATION_FAILED error.
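
The failure mode can be reproduced in miniature with the sketch below. It is a simulation with hypothetical names, not anv code: each pool holds one handle, MAX_HANDLES plays the role of ulimit -n, and destroy_device can optionally skip the binding table pool the way the code did before this patch.

```c
#include <stdbool.h>

/* Hypothetical process-wide handle limit, standing in for `ulimit -n`. */
#define MAX_HANDLES 8
static int open_handles;

struct pool { bool open; };
struct device { struct pool binding_table_pool, surface_state_pool; };

static bool pool_init(struct pool *p)
{
   if (open_handles == MAX_HANDLES)
      return false;
   open_handles++;
   p->open = true;
   return true;
}

static void pool_finish(struct pool *p)
{
   if (p->open) {
      open_handles--;
      p->open = false;
   }
}

static bool create_device(struct device *d)
{
   d->binding_table_pool.open = false;
   d->surface_state_pool.open = false;
   if (!pool_init(&d->binding_table_pool))
      return false;
   if (!pool_init(&d->surface_state_pool)) {
      pool_finish(&d->binding_table_pool);
      return false;
   }
   return true;
}

/* finish_binding_table == false models the behavior before the fix. */
static void destroy_device(struct device *d, bool finish_binding_table)
{
   if (finish_binding_table)
      pool_finish(&d->binding_table_pool);
   pool_finish(&d->surface_state_pool);
}

/* Returns how many create/destroy cycles succeed out of `cycles`. */
static int run_cycles(int cycles, bool finish_binding_table)
{
   int ok = 0;
   for (int i = 0; i < cycles; i++) {
      struct device d;
      if (!create_device(&d))
         break;
      destroy_device(&d, finish_binding_table);
      ok++;
   }
   return ok;
}
```

A single create/destroy pair works either way, which matches the observation that the tests pass in isolation; only a long batch exhausts the limit.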

Fixes: c7db0ed4e94dce563d722e1b098684fbd7315d51
  ("anv: Use a separate pool for binding tables when soft pinning")


Cc: Scott D Phillips 
Cc: Jason Ekstrand 

---
 src/intel/vulkan/anv_device.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index ea24a0ad03d..5266b269244 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -1782,6 +1782,7 @@ void anv_DestroyDevice(
 const VkAllocationCallbacks*pAllocator)
 {
ANV_FROM_HANDLE(anv_device, device, _device);
+   struct anv_physical_device *physical_device =
+      &device->instance->physicalDevice;
 
if (!device)
   return;
@@ -1808,6 +1809,8 @@ void anv_DestroyDevice(
if (device->info.gen >= 10)
   anv_gem_close(device, device->hiz_clear_bo.gem_handle);
 
+   if (physical_device->use_softpin)
+  anv_state_pool_finish(&device->binding_table_pool);
anv_state_pool_finish(&device->surface_state_pool);
anv_state_pool_finish(&device->instruction_state_pool);
anv_state_pool_finish(&device->dynamic_state_pool);
-- 
2.18.0



Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Erik Faye-Lund
On Thu, Jun 28, 2018 at 5:31 PM Gert Wollny  wrote:
>
> Am Donnerstag, den 28.06.2018, 17:09 +0200 schrieb Erik Faye-Lund:
> > Unless I'm misunderstanding, this seems to indicate that the hardware
> > has a fixed set of sample positions, which I don't think is true for
> > most hardware. Instead, the sample locations are a function of the
> > multisampling mode for a given surface...
> There are two aspects:
>
> For each number of samples there is indeed a fixed set of sample
> positions that only depends on the hardware. The set corresponding to
> the requested number of samples is used when the surface is created
> with GL_TRUE for the "fixedsamplelocations" parameter.

What I'm trying to say is that the concept of a global,
hardware-dependent set of sample-positions isn't a thing.

For instance, a single-sample multisample mode usually has its first
(and only) sample in the center of the pixel, whereas a 4xMSAA pattern
usually has four samples in a rotated grid, none of them in the center
of the pixel.

These are not the same, and querying the positions of the highest
sample-count mode isn't going to apply to any other mode. There simply
isn't a concept of "the hardware's sample locations". They depend on
the MSAA mode (effectively sample-count). This is exactly why you need
to create a dummy FBO to query this; you need to know the sample count
of the mode to answer this.
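
As a small illustration of the point, here are two patterns side by side. The coordinates are illustrative only (a centered single sample and a commonly used 4x rotated grid, in pixel units); actual hardware and drivers may use different values, and the type and function names are invented for the sketch.

```c
#include <stddef.h>

struct pos { float x, y; };

/* Illustrative values only; real hardware may differ. */
static const struct pos pattern_1x[1] = { { 0.5f, 0.5f } };
static const struct pos pattern_4x[4] = {
   { 0.375f, 0.125f }, { 0.875f, 0.375f },
   { 0.125f, 0.625f }, { 0.625f, 0.875f },
};

/* Returns the pattern for a sample count, or NULL if this sketch does
 * not model that mode. The answer depends on the mode, not on a single
 * global table. */
static const struct pos *pattern_for(unsigned sample_count)
{
   switch (sample_count) {
   case 1: return pattern_1x;
   case 4: return pattern_4x;
   default: return NULL;
   }
}
```

Sample 0 of the 4x mode is not the pixel center, so taking the first entry of a larger pattern does not reproduce the single-sample mode.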

The fixedsamplelocations=false complication is entirely orthogonal to this.

> If this parameter is set to false, then all bets are off and the sample
> positions can even vary over the surface area. For this case
> glGetMultisample is kind of useless, for the other case one can query
> all sets once on the host and then re-use these values (see this patch
> for the host side: https://patchwork.freedesktop.org/patch/233354/)
>
> best,
> Gert
>
> >
> > On Thu, Jun 28, 2018 at 3:45 PM Gert Wollny wrote:
> > >
> > > Use caps to obtain the multisample sample positions for up to 16
> > > positions and implement the according Gallium interface.
> > >
> > > Fixes (when run on GL host):
> > > dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
> > > dEQP-GLES31.functional.texture.multisample.samples_16.sample_position
> > >
> > > v2: remove unrelated chunk (thanks Ilia Mirkin)
> > > v3: - also return positions for intermediate sample counts
> > > - fix unused variable warning
> > > - update description
> > >
> > > Signed-off-by: Gert Wollny 
> > > ---
> > > I left the debug_printf in there, because this patch (together with
> > > the related virglrenderer patch) is not sufficient to fix the above
> > > tests on a GLES host.
> > >
> > >  src/gallium/drivers/virgl/virgl_context.c | 38
> > > +++
> > >  src/gallium/drivers/virgl/virgl_hw.h  |  1 +
> > >  2 files changed, 39 insertions(+)
> > >
> > > diff --git a/src/gallium/drivers/virgl/virgl_context.c
> > > b/src/gallium/drivers/virgl/virgl_context.c
> > > index e6f8dc8525..43c141e42d 100644
> > > --- a/src/gallium/drivers/virgl/virgl_context.c
> > > +++ b/src/gallium/drivers/virgl/virgl_context.c
> > > @@ -920,6 +920,42 @@ virgl_context_destroy( struct pipe_context
> > > *ctx )
> > > FREE(vctx);
> > >  }
> > >
> > > +static void virgl_get_sample_position(struct pipe_context *ctx,
> > > +  unsigned sample_count,
> > > +  unsigned index,
> > > +  float *out_value)
> > > +{
> > > +   struct virgl_context *vctx = virgl_context(ctx);
> > > +   struct virgl_screen *vs = virgl_screen(vctx->base.screen);
> > > +
> > > +   if (sample_count > vs->caps.caps.v1.max_samples) {
> > > +  debug_printf("VIRGL: requested %d MSAA samples, but only %d
> > > supported\n",
> > > +   sample_count, vs->caps.caps.v1.max_samples);
> > > +  return;
> > > +   }
> > > +
> > > +   /* The following is basically copied from
> > > dri/i965gen6_get_sample_position
> > > +* The only addition is that we hold the msaa positions for all
> > > sample
> > > +* counts in a flat array. */
> > > +   uint32_t bits = 0;
> > > +   if (sample_count == 1) {
> > > +  out_value[0] = out_value[1] = 0.5f;
> > > +  return;
> > > +   } else if (sample_count == 2) {
> > > +  bits = 

Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Gert Wollny
Am Donnerstag, den 28.06.2018, 17:09 +0200 schrieb Erik Faye-Lund:
> Unless I'm misunderstanding, this seems to indicate that the hardware
> has a fixed set of sample positions, which I don't think is true for
> most hardware. Instead, the sample locations are a function of the
> multisampling mode for a given surface...
There are two aspects: 

For each number of samples there is indeed a fixed set of sample
positions that depends only on the hardware. The set corresponding to
the requested number of samples is used when the surface is created
with GL_TRUE for the "fixedsamplelocations" parameter.

If this parameter is set to false, then all bets are off and the sample
positions can even vary over the surface area. For this case
glGetMultisample is kind of useless; for the other case one can query
all sets once on the host and then re-use these values (see this patch
for the host side: https://patchwork.freedesktop.org/patch/233354/).
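
To make the fixed-location case concrete, here is a minimal sketch (not
the patch itself) of how one packed set per sample count could be decoded
on the guest. The helper name decode_sample_position, the eight-word
layout, and the "<= 4", "<= 8", "<= 16" thresholds for intermediate
sample counts are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: decode the position of sample
 * `index` for a surface with `sample_count` samples from eight packed words.
 * Each byte holds one sample: x in the high nibble, y in the low nibble,
 * both in 1/16 pixel units.
 * Assumed layout: word 0 = 2x, word 1 = 4x, words 2..3 = 8x, words 4..7 = 16x.
 * Returns 0 on success, -1 if the request is out of range. */
static int decode_sample_position(const uint32_t packed[8],
                                  unsigned sample_count,
                                  unsigned index,
                                  float out[2])
{
   uint32_t bits;

   assert(packed != NULL && out != NULL);

   if (sample_count == 0 || sample_count > 16 || index >= sample_count)
      return -1;

   if (sample_count == 1) {
      /* A single sample sits at the pixel centre. */
      out[0] = out[1] = 0.5f;
      return 0;
   } else if (sample_count == 2) {
      bits = packed[0] >> (8 * index);
   } else if (sample_count <= 4) {
      /* Intermediate counts (here: 3) reuse the next larger table. */
      bits = packed[1] >> (8 * index);
   } else if (sample_count <= 8) {
      bits = packed[2 + (index >> 2)] >> (8 * (index & 3));
   } else {
      bits = packed[4 + (index >> 2)] >> (8 * (index & 3));
   }

   out[0] = ((bits >> 4) & 0xf) / 16.0f;
   out[1] = (bits & 0xf) / 16.0f;
   return 0;
}
```

Because the tables are queried once and cached, a lookup like this needs
no round trip to the host; it is only meaningful for surfaces created with
fixed sample locations.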

best, 
Gert

> 
> On Thu, Jun 28, 2018 at 3:45 PM Gert Wollny wrote:
> > 
> > Use caps to obtain the multisample sample positions for up to 16
> > positions and implement the according Gallium interface.
> > 
> > Fixes (when run on GL host):
> > dEQP-
> > GLES31.functional.texture.multisample.samples_1.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_2.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_3.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_4.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_8.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_10.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_12.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_13.sample_position
> > dEQP-
> > GLES31.functional.texture.multisample.samples_16.sample_position
> > 
> > v2: remove unrelated chunk (thanks Ilia Mirkin)
> > v3: - also return positions for intermediate sample counts
> > - fix unused variable warning
> > - update description
> > 
> > Signed-off-by: Gert Wollny 
> > ---
> > I left the debug_printf in there, because this patch (together with
> > the
> > related virglrenderer patch) is not sufficient to fix above tests
> > on a GLES
> > host.
> > 
> >  src/gallium/drivers/virgl/virgl_context.c | 38
> > +++
> >  src/gallium/drivers/virgl/virgl_hw.h  |  1 +
> >  2 files changed, 39 insertions(+)
> > 
> > diff --git a/src/gallium/drivers/virgl/virgl_context.c
> > b/src/gallium/drivers/virgl/virgl_context.c
> > index e6f8dc8525..43c141e42d 100644
> > --- a/src/gallium/drivers/virgl/virgl_context.c
> > +++ b/src/gallium/drivers/virgl/virgl_context.c
> > @@ -920,6 +920,42 @@ virgl_context_destroy( struct pipe_context
> > *ctx )
> > FREE(vctx);
> >  }
> > 
> > +static void virgl_get_sample_position(struct pipe_context *ctx,
> > +  unsigned sample_count,
> > +  unsigned index,
> > +  float *out_value)
> > +{
> > +   struct virgl_context *vctx = virgl_context(ctx);
> > +   struct virgl_screen *vs = virgl_screen(vctx->base.screen);
> > +
> > +   if (sample_count > vs->caps.caps.v1.max_samples) {
> > +  debug_printf("VIRGL: requested %d MSAA samples, but only %d
> > supported\n",
> > +   sample_count, vs->caps.caps.v1.max_samples);
> > +  return;
> > +   }
> > +
> > +   /* The following is basically copied from
> > dri/i965gen6_get_sample_position
> > +* The only addition is that we hold the msaa positions for all
> > sample
> > +* counts in a flat array. */
> > +   uint32_t bits = 0;
> > +   if (sample_count == 1) {
> > +  out_value[0] = out_value[1] = 0.5f;
> > +  return;
> > +   } else if (sample_count == 2) {
> > +  bits = vs->caps.caps.v2.msaa_sample_positions[0] >> (8 * index);
> > +   } else if (sample_count <= 4) {
> > +  bits = vs->caps.caps.v2.msaa_sample_positions[1] >> (8 * index);
> > +   } else if (sample_count <= 8) {
> > +  bits = vs->caps.caps.v2.msaa_sample_positions[2 + (index >> 2)] >> (8 * (index & 3));
> > +   } else if (sample_count <= 16) {
> > +  bits = vs->caps.caps.v2.msaa_sample_positions[4 + (index >> 2)] >> (8 * (index & 3));
> > +   }
> > +   out_value[0] = ((bits >> 4) & 0xf) / 16.0f;
> > +   out_value[1] = (bits & 0xf) / 16.0f;
> > +   debug_printf("VIRGL: sample position [%2d/%2d] = (%f, %f)\n",
> > +index, sample_count, out_value[0], out_value[1]);
> > +}
> > +
> >  struct pipe_context *virgl_context_create(struct pipe_screen
> > *pscreen,
> >void *priv,
> >unsigned flags)
> > @@ -994,6 +1030,8 @@ struct pipe_context
> > *virgl_context_create(struct pipe_screen *pscreen,
> > 
> > vctx->base.set_blend_color = 

Re: [Mesa-dev] [PATCH 1/2] radv: enable/disable prediction for the DCC decompression pass

2018-06-28 Thread Samuel Pitoiset

ping?

On 04/18/2018 02:34 PM, Samuel Pitoiset wrote:

Performing a DCC decompression pass is currently pretty rare,
but using predication allows the GPU to skip unnecessary passes.

Signed-off-by: Samuel Pitoiset 
---
  src/amd/vulkan/radv_meta_fast_clear.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_meta_fast_clear.c 
b/src/amd/vulkan/radv_meta_fast_clear.c
index d5af7a1b0c..e702dc80a5 100644
--- a/src/amd/vulkan/radv_meta_fast_clear.c
+++ b/src/amd/vulkan/radv_meta_fast_clear.c
@@ -601,7 +601,7 @@ radv_emit_color_decompress(struct radv_cmd_buffer 
*cmd_buffer,
 pipeline = 
cmd_buffer->device->meta_state.fast_clear_flush.cmask_eliminate_pipeline;
}
  
-	if (!decompress_dcc && radv_image_has_dcc(image)) {

+   if (radv_image_has_dcc(image)) {
radv_emit_set_predication_state_from_image(cmd_buffer, image, 
true);
cmd_buffer->state.predicating = true;
}
@@ -667,7 +667,7 @@ radv_emit_color_decompress(struct radv_cmd_buffer 
*cmd_buffer,
&cmd_buffer->pool->alloc);
  
  	}

-   if (!decompress_dcc && radv_image_has_dcc(image)) {
+   if (radv_image_has_dcc(image)) {
cmd_buffer->state.predicating = false;
radv_emit_set_predication_state_from_image(cmd_buffer, image, 
false);
}


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Erik Faye-Lund
Unless I'm misunderstanding, this seems to indicate that the hardware
has a fixed set of sample positions, which I don't think is true for
most hardware. Instead, the sample locations are a function of the
multisampling mode for a given surface...

On Thu, Jun 28, 2018 at 3:45 PM Gert Wollny  wrote:
>
> Use caps to obtain the multisample sample positions for up to 16
> positions and implement the according Gallium interface.
>
> Fixes (when run on GL host):
> dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
> dEQP-GLES31.functional.texture.multisample.samples_16.sample_position
>
> v2: remove unrelated chunk (thanks Ilia Mirkin)
> v3: - also return positions for intermediate sample counts
> - fix unused variable warning
> - update description
>
> Signed-off-by: Gert Wollny 
> ---
> I left the debug_printf in there, because this patch (together with the
> related virglrenderer patch) is not sufficient to fix above tests on a GLES
> host.
>
>  src/gallium/drivers/virgl/virgl_context.c | 38 
> +++
>  src/gallium/drivers/virgl/virgl_hw.h  |  1 +
>  2 files changed, 39 insertions(+)
>
> diff --git a/src/gallium/drivers/virgl/virgl_context.c 
> b/src/gallium/drivers/virgl/virgl_context.c
> index e6f8dc8525..43c141e42d 100644
> --- a/src/gallium/drivers/virgl/virgl_context.c
> +++ b/src/gallium/drivers/virgl/virgl_context.c
> @@ -920,6 +920,42 @@ virgl_context_destroy( struct pipe_context *ctx )
> FREE(vctx);
>  }
>
> +static void virgl_get_sample_position(struct pipe_context *ctx,
> +  unsigned sample_count,
> +  unsigned index,
> +  float *out_value)
> +{
> +   struct virgl_context *vctx = virgl_context(ctx);
> +   struct virgl_screen *vs = virgl_screen(vctx->base.screen);
> +
> +   if (sample_count > vs->caps.caps.v1.max_samples) {
> +  debug_printf("VIRGL: requested %d MSAA samples, but only %d 
> supported\n",
> +   sample_count, vs->caps.caps.v1.max_samples);
> +  return;
> +   }
> +
> +   /* The following is basically copied from dri/i965gen6_get_sample_position
> +* The only addition is that we hold the msaa positions for all sample
> +* counts in a flat array. */
> +   uint32_t bits = 0;
> +   if (sample_count == 1) {
> +  out_value[0] = out_value[1] = 0.5f;
> +  return;
> +   } else if (sample_count == 2) {
> +  bits = vs->caps.caps.v2.msaa_sample_positions[0] >> (8 * index);
> +   } else if (sample_count <= 4) {
> +  bits = vs->caps.caps.v2.msaa_sample_positions[1] >> (8 * index);
> +   } else if (sample_count <= 8) {
> +  bits = vs->caps.caps.v2.msaa_sample_positions[2 + (index >> 2)] >> (8 * (index & 3));
> +   } else if (sample_count <= 16) {
> +  bits = vs->caps.caps.v2.msaa_sample_positions[4 + (index >> 2)] >> (8 * (index & 3));
> +   }
> +   out_value[0] = ((bits >> 4) & 0xf) / 16.0f;
> +   out_value[1] = (bits & 0xf) / 16.0f;
> +   debug_printf("VIRGL: sample position [%2d/%2d] = (%f, %f)\n",
> +index, sample_count, out_value[0], out_value[1]);
> +}
> +
>  struct pipe_context *virgl_context_create(struct pipe_screen *pscreen,
>void *priv,
>unsigned flags)
> @@ -994,6 +1030,8 @@ struct pipe_context *virgl_context_create(struct 
> pipe_screen *pscreen,
>
> vctx->base.set_blend_color = virgl_set_blend_color;
>
> +   vctx->base.get_sample_position = virgl_get_sample_position;
> +
> vctx->base.resource_copy_region = virgl_resource_copy_region;
> vctx->base.flush_resource = virgl_flush_resource;
> vctx->base.blit =  virgl_blit;
> diff --git a/src/gallium/drivers/virgl/virgl_hw.h 
> b/src/gallium/drivers/virgl/virgl_hw.h
> index ee58520f9b..82cbb8aed1 100644
> --- a/src/gallium/drivers/virgl/virgl_hw.h
> +++ b/src/gallium/drivers/virgl/virgl_hw.h
> @@ -298,6 +298,7 @@ struct virgl_caps_v2 {
>  uint32_t uniform_buffer_offset_alignment;
>  uint32_t shader_buffer_offset_alignment;
>  uint32_t capability_bits;
> +uint32_t msaa_sample_positions[8];
>  };
>
>  union virgl_caps {
> --
> 2.16.4
>

[Mesa-dev] [Bug 102227] Commit 26aee6f4d5 causes crash-loop on android-x86 (surfaceflinger to exit with status 1)

2018-06-28 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102227

--- Comment #4 from Dainius Masiliūnas  ---
I'm experiencing this as well, on an AMD Brazos machine.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [PATCH 2/3] meson: do not use the deprecated wayland-scanner "code"

2018-06-28 Thread Emil Velikov
From: Emil Velikov 

With version v1.15 the "code" option was deprecated in favour of
"private-code" or "public-code".

Previously, the generated interface symbol was exported (a bad idea,
since it's an internal implementation detail), so others could misuse it.

That was the case with libva approx. 1 year ago. libva has since been
fixed, so we can finally hide the symbol by using "private-code".

Inspired by similar xserver patch by Adam Jackson.

Cc: Dylan Baker 
Cc: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
Any suggestions to the commit message for this and the autoconf commit
are highly appreciated ;-)
---
 meson.build | 6 ++
 src/egl/wayland/wayland-drm/meson.build | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 79bac89e7d9..429e24f411f 100644
--- a/meson.build
+++ b/meson.build
@@ -1271,6 +1271,11 @@ endif
 if with_platform_wayland
   dep_wl_scanner = dependency('wayland-scanner', native: true)
   prog_wl_scanner = find_program(dep_wl_scanner.get_pkgconfig_variable('wayland_scanner'))
+  if dep_wl_scanner.version().version_compare('>= 1.15')
+wl_scanner_arg = 'private-code'
+  else
+wl_scanner_arg = 'code'
+  endif
   dep_wl_protocols = dependency('wayland-protocols', version : '>= 1.8')
   dep_wayland_client = dependency('wayland-client', version : '>=1.11')
   dep_wayland_server = dependency('wayland-server', version : '>=1.11')
@@ -1286,6 +1291,7 @@ if with_platform_wayland
   pre_args += ['-DHAVE_WAYLAND_PLATFORM', '-DWL_HIDE_DEPRECATED']
 else
   prog_wl_scanner = []
+  wl_scanner_arg = ''
   dep_wl_protocols = null_dep
   dep_wayland_client = null_dep
   dep_wayland_server = null_dep
diff --git a/src/egl/wayland/wayland-drm/meson.build 
b/src/egl/wayland/wayland-drm/meson.build
index c627deaa1c3..983bf55fac8 100644
--- a/src/egl/wayland/wayland-drm/meson.build
+++ b/src/egl/wayland/wayland-drm/meson.build
@@ -24,7 +24,7 @@ wayland_drm_protocol_c = custom_target(
   'wayland-drm-protocol.c',
   input : 'wayland-drm.xml',
   output : 'wayland-drm-protocol.c',
-  command : [prog_wl_scanner, 'code', '@INPUT@', '@OUTPUT@'],
+  command : [prog_wl_scanner, wl_scanner_arg, '@INPUT@', '@OUTPUT@'],
 )
 
 wayland_drm_client_protocol_h = custom_target(
@@ -61,7 +61,7 @@ linux_dmabuf_unstable_v1_protocol_c = custom_target(
   'linux-dmabuf-unstable-v1-protocol.c',
   input : wayland_dmabuf_xml,
   output : 'linux-dmabuf-unstable-v1-protocol.c',
-  command : [prog_wl_scanner, 'code', '@INPUT@', '@OUTPUT@'],
+  command : [prog_wl_scanner, wl_scanner_arg, '@INPUT@', '@OUTPUT@'],
 )
 
 linux_dmabuf_unstable_v1_client_protocol_h = custom_target(
-- 
2.18.0
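
For readers less familiar with meson's version_compare(), the same gate
can be sketched in C. This is only an illustration of the version check
described in the commit message; scanner_code_arg is a hypothetical
helper and plays no part in the build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: pick the wayland-scanner mode for a version string
 * of the form "major.minor[.patch]". Versions >= 1.15 get "private-code",
 * anything older (or unparsable) falls back to the deprecated "code". */
static const char *scanner_code_arg(const char *version)
{
   unsigned major = 0, minor = 0;

   assert(version != NULL);

   /* At least the major number must parse; a missing minor counts as 0. */
   if (sscanf(version, "%u.%u", &major, &minor) < 1)
      return "code";

   if (major > 1 || (major == 1 && minor >= 15))
      return "private-code";

   return "code";
}
```

Falling back to "code" keeps older scanners working, which matches the
else branch in the meson hunk above.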



[Mesa-dev] [PATCH 1/3] meson: use dependency()+find_program() for wayland-scanner

2018-06-28 Thread Emil Velikov
From: Emil Velikov 

Helps when the native wayland-scanner is located outside of PATH.
Inspired by the xserver code ;-)

Cc: Dylan Baker 
Cc: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
 meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index b2722c71e5b..79bac89e7d9 100644
--- a/meson.build
+++ b/meson.build
@@ -1269,7 +1269,8 @@ endif
 # TODO: symbol mangling
 
 if with_platform_wayland
-  prog_wl_scanner = find_program('wayland-scanner')
+  dep_wl_scanner = dependency('wayland-scanner', native: true)
+  prog_wl_scanner = find_program(dep_wl_scanner.get_pkgconfig_variable('wayland_scanner'))
   dep_wl_protocols = dependency('wayland-protocols', version : '>= 1.8')
   dep_wayland_client = dependency('wayland-client', version : '>=1.11')
   dep_wayland_server = dependency('wayland-server', version : '>=1.11')
-- 
2.18.0



[Mesa-dev] [PATCH 3/3] autoconf: do not use the deprecated wayland-scanner "code"

2018-06-28 Thread Emil Velikov
From: Emil Velikov 

With version v1.15 the "code" option was deprecated in favour of
"private-code" or "public-code".

Previously, the generated interface symbol was exported (a bad idea,
since it's an internal implementation detail), so others could misuse it.

That was the case with libva approx. 1 year ago. libva has since been
fixed, so we can finally hide the symbol by using "private-code".

Inspired by similar xserver patch by Adam Jackson.

Cc: Dylan Baker 
Cc: Eric Engestrom 
Signed-off-by: Emil Velikov 
---
 configure.ac| 4 
 src/egl/Makefile.am | 2 +-
 src/egl/wayland/wayland-drm/Makefile.am | 2 +-
 src/vulkan/Makefile.am  | 4 ++--
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 502b1787c62..755c06bf867 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1809,6 +1809,10 @@ for plat in $platforms; do
 PKG_CHECK_MODULES([WAYLAND_SCANNER], [wayland-scanner],
   WAYLAND_SCANNER=`$PKG_CONFIG 
--variable=wayland_scanner wayland-scanner`,
   WAYLAND_SCANNER='')
+PKG_CHECK_EXISTS([wayland-scanner >= 1.15],
+  AC_SUBST(SCANNER_ARG, 'private-code'),
+  AC_SUBST(SCANNER_ARG, 'code'))
+
 if test "x$WAYLAND_SCANNER" = x; then
 AC_PATH_PROG([WAYLAND_SCANNER], [wayland-scanner], [:])
 fi
diff --git a/src/egl/Makefile.am b/src/egl/Makefile.am
index be3547d968f..5d7a208825c 100644
--- a/src/egl/Makefile.am
+++ b/src/egl/Makefile.am
@@ -71,7 +71,7 @@ WL_DMABUF_XML = 
$(WAYLAND_PROTOCOLS_DATADIR)/unstable/linux-dmabuf/linux-dmabuf-
 
 drivers/dri2/linux-dmabuf-unstable-v1-protocol.c: $(WL_DMABUF_XML)
$(MKDIR_GEN)
-   $(AM_V_GEN)$(WAYLAND_SCANNER) code $< $@
+   $(AM_V_GEN)$(WAYLAND_SCANNER) $(SCANNER_ARG) $< $@
 
 drivers/dri2/linux-dmabuf-unstable-v1-client-protocol.h: $(WL_DMABUF_XML)
$(MKDIR_GEN)
diff --git a/src/egl/wayland/wayland-drm/Makefile.am 
b/src/egl/wayland/wayland-drm/Makefile.am
index 0404c79e7fa..40174c6acdd 100644
--- a/src/egl/wayland/wayland-drm/Makefile.am
+++ b/src/egl/wayland/wayland-drm/Makefile.am
@@ -28,7 +28,7 @@ CLEANFILES = \
wayland-drm-server-protocol.h
 
 %-protocol.c : %.xml
-   $(AM_V_GEN)$(WAYLAND_SCANNER) code $< $@
+   $(AM_V_GEN)$(WAYLAND_SCANNER) $(SCANNER_ARG) $< $@
 
 %-server-protocol.h : %.xml
$(AM_V_GEN)$(WAYLAND_SCANNER) server-header $< $@
diff --git a/src/vulkan/Makefile.am b/src/vulkan/Makefile.am
index ce1a79d0c48..db3831229e9 100644
--- a/src/vulkan/Makefile.am
+++ b/src/vulkan/Makefile.am
@@ -76,7 +76,7 @@ WL_DRM_XML = 
$(top_srcdir)/src/egl/wayland/wayland-drm/wayland-drm.xml
 
 wsi/wayland-drm-protocol.c : $(WL_DRM_XML)
$(MKDIR_GEN)
-   $(AM_V_GEN)$(WAYLAND_SCANNER) code $< $@
+   $(AM_V_GEN)$(WAYLAND_SCANNER) $(SCANNER_ARG) $< $@
 
 wsi/wayland-drm-client-protocol.h : $(WL_DRM_XML)
$(MKDIR_GEN)
@@ -86,7 +86,7 @@ WL_DMABUF_XML = 
$(WAYLAND_PROTOCOLS_DATADIR)/unstable/linux-dmabuf/linux-dmabuf-
 
 wsi/linux-dmabuf-unstable-v1-protocol.c : $(WL_DMABUF_XML)
$(MKDIR_GEN)
-   $(AM_V_GEN)$(WAYLAND_SCANNER) code $< $@
+   $(AM_V_GEN)$(WAYLAND_SCANNER) $(SCANNER_ARG) $< $@
 
 wsi/linux-dmabuf-unstable-v1-client-protocol.h : $(WL_DMABUF_XML)
$(MKDIR_GEN)
-- 
2.18.0



Re: [Mesa-dev] [PATCH 0/4] fixes primarily for LLVM trunk support v2

2018-06-28 Thread Cherniak, Bruce
v2 Reviewed-by: Bruce Cherniak  

> On Jun 25, 2018, at 9:52 AM, Alok Hota  wrote:
> 
> These code changes were made in between some of the formatting changes.
> Unfortunately we do have another formatting patch coming in after this,
> but I will keep that separate.
> 
> These patches are primarily focused on enhancing the BuilderGfxMem in
> our internal rasterizer and to support changes in the LLVM trunk going
> into version 7.0.0
> 
> v2 : accidentally included the wrong commits into the patch. Previous
> version included the formatting commit that was supposed to come after
> this patch, and did not include the first commit prior to adding
> SCATTERPS functionality
> 
> Alok Hota (3):
>  swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack
>  swr/rast: Adding SCATTERPS functionality to BuilderGfxMem
>  swr/rast: Handling removed LLVM intrinsics in trunk
> 
> Vinson Lee (1):
>  swr/rast: Fix addPassesToEmitFile usage with llvm-7.0.
> 
> .../swr/rasterizer/jitter/JitManager.cpp  |  4 ++
> .../swr/rasterizer/jitter/builder_gfx_mem.cpp | 31 -
> .../swr/rasterizer/jitter/builder_gfx_mem.h   | 43 ---
> .../jitter/functionpasses/lower_x86.cpp   | 40 +
> 4 files changed, 91 insertions(+), 27 deletions(-)
> 
> -- 
> 2.17.1
> 


[Mesa-dev] [PATCH v3] virgl: Add support for glGetMultisample

2018-06-28 Thread Gert Wollny
Use caps to obtain the multisample sample positions for up to 16
positions and implement the according Gallium interface.

Fixes (when run on GL host):
dEQP-GLES31.functional.texture.multisample.samples_1.sample_position
dEQP-GLES31.functional.texture.multisample.samples_2.sample_position
dEQP-GLES31.functional.texture.multisample.samples_3.sample_position
dEQP-GLES31.functional.texture.multisample.samples_4.sample_position
dEQP-GLES31.functional.texture.multisample.samples_8.sample_position
dEQP-GLES31.functional.texture.multisample.samples_10.sample_position
dEQP-GLES31.functional.texture.multisample.samples_12.sample_position
dEQP-GLES31.functional.texture.multisample.samples_13.sample_position
dEQP-GLES31.functional.texture.multisample.samples_16.sample_position

v2: remove unrelated chunk (thanks Ilia Mirkin)
v3: - also return positions for intermediate sample counts
- fix unused variable warning
- update description 

Signed-off-by: Gert Wollny 
---
I left the debug_printf in there, because this patch (together with the
related virglrenderer patch) is not sufficient to fix above tests on a GLES
host.

 src/gallium/drivers/virgl/virgl_context.c | 38 +++
 src/gallium/drivers/virgl/virgl_hw.h  |  1 +
 2 files changed, 39 insertions(+)

diff --git a/src/gallium/drivers/virgl/virgl_context.c 
b/src/gallium/drivers/virgl/virgl_context.c
index e6f8dc8525..43c141e42d 100644
--- a/src/gallium/drivers/virgl/virgl_context.c
+++ b/src/gallium/drivers/virgl/virgl_context.c
@@ -920,6 +920,42 @@ virgl_context_destroy( struct pipe_context *ctx )
FREE(vctx);
 }
 
+static void virgl_get_sample_position(struct pipe_context *ctx,
+  unsigned sample_count,
+  unsigned index,
+  float *out_value)
+{
+   struct virgl_context *vctx = virgl_context(ctx);
+   struct virgl_screen *vs = virgl_screen(vctx->base.screen);
+
+   if (sample_count > vs->caps.caps.v1.max_samples) {
+  debug_printf("VIRGL: requested %d MSAA samples, but only %d supported\n",
+   sample_count, vs->caps.caps.v1.max_samples);
+  return;
+   }
+
+   /* The following is basically copied from dri/i965gen6_get_sample_position
+* The only addition is that we hold the msaa positions for all sample
+* counts in a flat array. */
+   uint32_t bits = 0;
+   if (sample_count == 1) {
+  out_value[0] = out_value[1] = 0.5f;
+  return;
+   } else if (sample_count == 2) {
+  bits = vs->caps.caps.v2.msaa_sample_positions[0] >> (8 * index);
+   } else if (sample_count <= 4) {
+  bits = vs->caps.caps.v2.msaa_sample_positions[1] >> (8 * index);
+   } else if (sample_count <= 8) {
+  bits = vs->caps.caps.v2.msaa_sample_positions[2 + (index >> 2)] >> (8 * (index & 3));
+   } else if (sample_count <= 16) {
+  bits = vs->caps.caps.v2.msaa_sample_positions[4 + (index >> 2)] >> (8 * (index & 3));
+   }
+   out_value[0] = ((bits >> 4) & 0xf) / 16.0f;
+   out_value[1] = (bits & 0xf) / 16.0f;
+   debug_printf("VIRGL: sample position [%2d/%2d] = (%f, %f)\n",
+index, sample_count, out_value[0], out_value[1]);
+}
+
 struct pipe_context *virgl_context_create(struct pipe_screen *pscreen,
   void *priv,
   unsigned flags)
@@ -994,6 +1030,8 @@ struct pipe_context *virgl_context_create(struct 
pipe_screen *pscreen,
 
vctx->base.set_blend_color = virgl_set_blend_color;
 
+   vctx->base.get_sample_position = virgl_get_sample_position;
+
vctx->base.resource_copy_region = virgl_resource_copy_region;
vctx->base.flush_resource = virgl_flush_resource;
vctx->base.blit =  virgl_blit;
diff --git a/src/gallium/drivers/virgl/virgl_hw.h 
b/src/gallium/drivers/virgl/virgl_hw.h
index ee58520f9b..82cbb8aed1 100644
--- a/src/gallium/drivers/virgl/virgl_hw.h
+++ b/src/gallium/drivers/virgl/virgl_hw.h
@@ -298,6 +298,7 @@ struct virgl_caps_v2 {
 uint32_t uniform_buffer_offset_alignment;
 uint32_t shader_buffer_offset_alignment;
 uint32_t capability_bits;
+uint32_t msaa_sample_positions[8];
 };
 
 union virgl_caps {
-- 
2.16.4



Re: [Mesa-dev] [PATCH 2/4] swr/rast: Adding SCATTERPS functionality to BuilderGfxMem

2018-06-28 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak 

> On Jun 25, 2018, at 9:52 AM, Alok Hota  wrote:
> 
> ---
> .../swr/rasterizer/jitter/builder_gfx_mem.cpp   | 13 +
> .../drivers/swr/rasterizer/jitter/builder_gfx_mem.h |  6 ++
> 2 files changed, 19 insertions(+)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
> index 8706bfa66b..df11914db1 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.cpp
> @@ -108,6 +108,19 @@ namespace SwrJit
> return vGather;
> }
> 
> +void BuilderGfxMem::SCATTERPS(
> +Value* pDst, Value* vSrc, Value* vOffsets, Value* vMask, 
> JIT_MEM_CLIENT usage)
> +{
> +
> +// address may be coming in as 64bit int now so get the pointer
> +if (pDst->getType() == mInt64Ty)
> +{
> +pDst = INT_TO_PTR(pDst, PointerType::get(mInt8Ty, 0));
> +}
> +
> +Builder::SCATTERPS(pDst, vSrc, vOffsets, vMask, usage);
> +}
> +
> 
> Value *BuilderGfxMem::OFFSET_TO_NEXT_COMPONENT(Value *base, Constant 
> *offset)
> {
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h 
> b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
> index a552ff9b26..dd20c06afe 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/builder_gfx_mem.h
> @@ -88,6 +88,12 @@ namespace SwrJit
> uint8_tscale = 1,
> JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
> 
> +virtual void SCATTERPS(Value* pDst,
> +   Value* vSrc,
> +   Value* vOffsets,
> +   Value* vMask,
> +   JIT_MEM_CLIENT usage = MEM_CLIENT_INTERNAL);
> +
> 
> Value *TranslateGfxAddressForRead(Value *xpGfxAddress,
>   Type * PtrTy = nullptr,
> -- 
> 2.17.1
> 


Re: [Mesa-dev] [PATCH 1/1] swr/rast: Updating code style based on current clang-format rules

2018-06-28 Thread Cherniak, Bruce
Reviewed-by: Bruce Cherniak  

> On Jun 22, 2018, at 9:11 AM, Alok Hota  wrote:
> 
> ---
> .../swr/rasterizer/jitter/JitManager.cpp  | 133 ++--
> .../swr/rasterizer/jitter/builder_gfx_mem.cpp |  90 +
> .../swr/rasterizer/jitter/builder_gfx_mem.h   | 101 +-
> .../jitter/functionpasses/lower_x86.cpp   | 189 +-
> 4 files changed, 260 insertions(+), 253 deletions(-)
> 
> diff --git a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp 
> b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> index 5bacf55126..0312fc47fb 100644
> --- a/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> +++ b/src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp
> @@ -59,7 +59,7 @@ using namespace SwrJit;
> //
> /// @brief Contructor for JitManager.
> /// @param simdWidth - SIMD width to be used in generated program.
> -JitManager::JitManager(uint32_t simdWidth, const char *arch, const char 
> *core) :
> +JitManager::JitManager(uint32_t simdWidth, const char* arch, const char* 
> core) :
> mContext(), mBuilder(mContext), mIsModuleFinalized(true), mJitNumber(0), 
> mVWidth(simdWidth),
> mArch(arch)
> {
> @@ -153,7 +153,7 @@ JitManager::JitManager(uint32_t simdWidth, const char 
> *arch, const char *core) :
> }
> 
> #if LLVM_USE_INTEL_JITEVENTS
> -JITEventListener *vTune = 
> JITEventListener::createIntelJITEventListener();
> +JITEventListener* vTune = 
> JITEventListener::createIntelJITEventListener();
> mpExec->RegisterJITEventListener(vTune);
> #endif
> 
> @@ -163,7 +163,7 @@ JitManager::JitManager(uint32_t simdWidth, const char 
> *arch, const char *core) :
> #else
> // typedef void(__cdecl *PFN_FETCH_FUNC)(SWR_FETCH_CONTEXT& fetchInfo, 
> simdvertex& out);
> #endif
> -std::vector<Type *> fsArgs;
> +std::vector<Type*> fsArgs;
> 
> // llvm5 is picky and does not take a void * type
> fsArgs.push_back(PointerType::get(Gen_SWR_FETCH_CONTEXT(this), 0));
> @@ -212,21 +212,21 @@ void JitManager::SetupNewModule()
> }
> 
> 
> -DIType *
> -JitManager::CreateDebugStructType(StructType *   
>   pType,
> -  const std::string &
>   name,
> -  DIFile *   
>   pFile,
> +DIType*
> +JitManager::CreateDebugStructType(StructType*
>   pType,
> +  const std::string& 
>   name,
> +  DIFile*
>   pFile,
>   uint32_t
>  lineNum,
> -  const std::vector<std::pair<std::string, uint32_t>> &members)
> +  const std::vector<std::pair<std::string, uint32_t>>& members)
> {
> -DIBuilder  builder(*mpCurrentModule);
> -SmallVector ElemTypes;
> -DataLayout DL= DataLayout(mpCurrentModule);
> -uint32_t   size  = DL.getTypeAllocSizeInBits(pType);
> -uint32_t   alignment = DL.getABITypeAlignment(pType);
> -DINode::DIFlagsflags = DINode::DIFlags::FlagPublic;
> -
> -DICompositeType *pDIStructTy = builder.createStructType(pFile,
> +DIBuilder builder(*mpCurrentModule);
> +SmallVector ElemTypes;
> +DataLayoutDL= DataLayout(mpCurrentModule);
> +uint32_t  size  = DL.getTypeAllocSizeInBits(pType);
> +uint32_t  alignment = DL.getABITypeAlignment(pType);
> +DINode::DIFlags   flags = DINode::DIFlags::FlagPublic;
> +
> +DICompositeType* pDIStructTy = builder.createStructType(pFile,
> name,
> pFile,
> lineNum,
> @@ -240,14 +240,14 @@ JitManager::CreateDebugStructType(StructType *
> mDebugStructMap[pType] = pDIStructTy;
> 
> uint32_t idx = 0;
> -for (auto &elem : pType->elements())
> +for (auto& elem : pType->elements())
> {
> std::string name   = members[idx].first;
> uint32_tlineNum= members[idx].second;
> size   = DL.getTypeAllocSizeInBits(elem);
> alignment  = DL.getABITypeAlignment(elem);
> uint32_t  offset   = 
> DL.getStructLayout(pType)->getElementOffsetInBits(idx);
> -llvm::DIType *pDebugTy = GetDebugType(elem);
> +llvm::DIType* pDebugTy = GetDebugType(elem);
> ElemTypes.push_back(builder.createMemberType(
> pDIStructTy, name, pFile, lineNum, size, alignment, offset, 
> flags, pDebugTy));
> 
> @@ -258,22 +258,22 @@ 

Re: [Mesa-dev] [ANNOUNCE] mesa 18.1.3

2018-06-28 Thread Emil Velikov
Hi Dylan,

On 27 June 2018 at 21:07, Dylan Baker  wrote:
> Hi list,
>
> Mesa 18.1.3 is planned for Friday June 29th, at 10AM PDT.
>
> Statistics for this release:
>  - 37 queued
>  - 0 nominated
>  - 2 rejected
>
Please have some distinction (in the title) wrt the actual release announcement.
I've used "release candidate" in the past, since it's obvious and easy
to filter from the other mesa related emails.

Thanks
Emil


Re: [Mesa-dev] [virglrenderer-devel] [PATCH] virgl/vtest: add support to vtest for new cap getting.

2018-06-28 Thread Jakob Bornecrantz

On 2018-06-27 23:26, Dave Airlie wrote:

On 28 June 2018 at 03:25, Jakob Bornecrantz  wrote:

On 2018-06-08 07:22, Dave Airlie wrote:


From: Dave Airlie 

The vtest protocol is pretty simple but also pretty dumb, and
the v1 caps query was fixed size, with no nice way to expand it,
however the server also ignores any command it doesn't understand.

So we can query v2 caps by sending a v2 query followed by a v1 query. If
the v2 query is ignored we know it's an old vtest server, and if we get
a v2 answer then we can just read the v1 answer and discard it.
---
   .../winsys/virgl/vtest/virgl_vtest_socket.c| 30
+++---
   src/gallium/winsys/virgl/vtest/vtest_protocol.h|  2 ++
   2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c
b/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c
index adec26b66b8..d25f9a3bd9e 100644
--- a/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c
+++ b/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c
@@ -129,12 +129,14 @@ int virgl_vtest_connect(struct virgl_vtest_winsys
*vws)
   int virgl_vtest_send_get_caps(struct virgl_vtest_winsys *vws,
 struct virgl_drm_caps *caps)
   {
-   uint32_t get_caps_buf[VTEST_HDR_SIZE];
+   uint32_t get_caps_buf[VTEST_HDR_SIZE * 2];
  uint32_t resp_buf[VTEST_HDR_SIZE];
-
+   uint32_t caps_size = sizeof(struct virgl_caps_v2);
  int ret;
  get_caps_buf[VTEST_CMD_LEN] = 0;
-   get_caps_buf[VTEST_CMD_ID] = VCMD_GET_CAPS;
+   get_caps_buf[VTEST_CMD_ID] = VCMD_GET_CAPS2;
+   get_caps_buf[VTEST_CMD_LEN + 2] = 0;
+   get_caps_buf[VTEST_CMD_ID + 2] = VCMD_GET_CAPS;
virgl_block_write(vws->sock_fd, &get_caps_buf,
sizeof(get_caps_buf));
   @@ -142,7 +144,27 @@ int virgl_vtest_send_get_caps(struct
virgl_vtest_winsys *vws,
  if (ret <= 0)
 return 0;
   -   ret = virgl_block_read(vws->sock_fd, &caps->caps, sizeof(struct
virgl_caps_v1));
+   if (resp_buf[1] == 2) {
+   struct virgl_caps_v1 dummy;
+   uint32_t resp_size = resp_buf[0] - 1;
+   uint32_t dummy_size = 0;
+   if (resp_size > caps_size) {
+  dummy_size = resp_size - caps_size;
+  resp_size = caps_size;
+   }
+
+   ret = virgl_block_read(vws->sock_fd, &caps->caps, resp_size);
+
+   if (dummy_size)
+  ret = virgl_block_read(vws->sock_fd, &dummy, dummy_size);



Isn't there a risk that "dummy_size" is larger than the struct
virgl_caps_v1 dummy, causing the read to write over the stack? (I am assuming
you are using the dummy here as a place to put the extra caps the host is
exposing but the driver isn't supporting).


We've pretty much fixed caps_v1 size for ever, the protocol won't return
anything other than the 308 byte struct we have now.


I may be wrong here, but isn't the "if (dummy_size)" read dealing with 
the case when the "struct virgl_caps_v2" has grown on the host but not 
on the guest side? And it has nothing to do with caps_v1 other than that is 
what dummy happens to be.


So in the case the host has extended the v2 caps struct by more than 
308 bytes and the driver hasn't been updated, won't that cause 
dummy_size to be larger than sizeof(struct virgl_caps_v1) and cause it 
to smash the stack? I mean it is doubtful to ever happen but it seems a 
bit of a repeat of what happened with the v1 to v2 switch.






Wouldn't it be better if we had a virgl_block_skip function?


We don't really know what a block is, it's just a unix socket. If we find
ourselves doing this more, then maybe a dummy refactor might be needed,
but hopefully this is a one-off fix for bad protocol design. We may want a
new, more robust protocol here anyway.


Okay sounds good.

Cheers, Jakob.
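
The stack-overflow concern raised above goes away if the surplus bytes are
drained through a fixed-size scratch buffer rather than read into
"struct virgl_caps_v1 dummy". A minimal sketch, assuming a plain blocking file
descriptor; block_skip() is a hypothetical name and not an existing vtest
function:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read and discard exactly `size` bytes from `fd`.
 * Returns 0 on success, -1 on error or premature end of stream. */
static int block_skip(int fd, size_t size)
{
   char scratch[256];

   while (size > 0) {
      /* Never ask for more than the scratch buffer can hold. */
      size_t chunk = size < sizeof(scratch) ? size : sizeof(scratch);
      ssize_t got = read(fd, scratch, chunk);

      if (got < 0) {
         if (errno == EINTR)
            continue;   /* interrupted by a signal, retry */
         return -1;
      }
      if (got == 0)
         return -1;     /* peer closed before all bytes arrived */

      size -= (size_t)got;
   }
   return 0;
}
```

With something like this, the amount of data the server sends no longer has
any relation to the size of a stack object on the client.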


Re: [Mesa-dev] [PATCH 06/12] nir/opt_peephole_select: Don't try to remove flow control around indirect loads

2018-06-28 Thread Rob Clark
On Thu, Jun 28, 2018 at 12:46 AM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> That flow control may be trying to avoid invalid loads.  On at least
> some platforms, those loads can also be expensive.

So for adreno, indirect uniform loads aren't really that expensive (it
takes a few cycles to get the value from an alu instruction into the
address register for indirect register addressing but that latency can
usually be hidden).. and invalid access won't cause a fault or
anything.  I think it just wraps around, so it will be some undefined
value but won't cause a fault.

This is completely different from UBOs, where we could cause a fault,
and loads are real memory accesses, so more expensive.

So as long as this is just about uniforms (I think i965 calls them
'push constants'?), and not UBOs, then maybe we want a
nir_shader_compiler_options flag?

BR,
-R
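
A rough sketch of what such a per-driver opt-in could look like. Everything
here is a stand-in: the struct, the field name and the enum are hypothetical
and are not existing NIR API; the point is only that indirect uniform loads
would be gated by a driver flag while UBO loads keep their flow control.

```c
#include <stdbool.h>

/* Stand-in for a per-driver compiler options struct. The field name is
 * hypothetical; no such flag is confirmed to exist in NIR. */
struct compiler_options_sketch {
   bool indirect_uniform_load_is_safe;
};

enum load_kind_sketch {
   LOAD_UNIFORM_DIRECT,
   LOAD_UNIFORM_INDIRECT,
   LOAD_UBO_INDIRECT,
};

/* Returns true if the load may be hoisted out of its if/else block. */
static bool
load_can_be_flattened(const struct compiler_options_sketch *opts,
                      enum load_kind_sketch kind)
{
   switch (kind) {
   case LOAD_UNIFORM_DIRECT:
      return true;
   case LOAD_UNIFORM_INDIRECT:
      /* Only drivers that opt in allow removing the branch. */
      return opts->indirect_uniform_load_is_safe;
   case LOAD_UBO_INDIRECT:
      /* Real memory access that could fault: always keep the branch. */
      return false;
   }
   return false;
}
```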

>
> No shader-db changes on any Intel platform (even with the later patch
> "intel/compiler: More peephole select").
>
> NOTE: I've tried to CC everyone whose driver might be affected by this
> change.
>
> Signed-off-by: Ian Romanick 
> Cc: Eric Anholt 
> Cc: Rob Clark 
> Cc: Marek Olšák 
> ---
>  src/compiler/nir/nir_opt_peephole_select.c | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/src/compiler/nir/nir_opt_peephole_select.c 
> b/src/compiler/nir/nir_opt_peephole_select.c
> index 4ca4f80d788..920ced2137c 100644
> --- a/src/compiler/nir/nir_opt_peephole_select.c
> +++ b/src/compiler/nir/nir_opt_peephole_select.c
> @@ -58,7 +58,8 @@
>   */
>
>  static bool
> -block_check_for_allowed_instrs(nir_block *block, unsigned *count, bool 
> alu_ok)
> +block_check_for_allowed_instrs(nir_block *block, unsigned *count,
> +   gl_shader_stage stage, bool alu_ok)
>  {
> nir_foreach_instr(instr, block) {
>switch (instr->type) {
> @@ -70,6 +71,13 @@ block_check_for_allowed_instrs(nir_block *block, unsigned 
> *count, bool alu_ok)
>  switch (intrin->variables[0]->var->data.mode) {
>  case nir_var_shader_in:
>  case nir_var_uniform:
> +   /* Don't try to remove flow control around an indirect load
> +* because that flow control may be trying to avoid invalid
> +* loads.
> +*/
> +   if (nir_deref_has_indirect(stage, intrin->variables[0]))
> +  return false;
> +
> break;
>
>  default:
> @@ -168,8 +176,10 @@ nir_opt_peephole_select_block(nir_block *block, 
> nir_shader *shader,
>
> /* ... and those blocks must only contain "allowed" instructions. */
> unsigned count = 0;
> -   if (!block_check_for_allowed_instrs(then_block, &count, limit != 0) ||
> -   !block_check_for_allowed_instrs(else_block, &count, limit != 0))
> +   if (!block_check_for_allowed_instrs(then_block, &count, 
> shader->info.stage,
> +   limit != 0) ||
> +   !block_check_for_allowed_instrs(else_block, &count, 
> shader->info.stage,
> +   limit != 0))
>return false;
>
> if (count > limit)
> --
> 2.14.4
>


Re: [Mesa-dev] [PATCH] anv/cmd_buffer: emit binding tables always if push constants are dirty

2018-06-28 Thread Iago Toral
On Thu, 2018-06-28 at 08:47 +0200, Iago Toral wrote:
> On Wed, 2018-06-27 at 09:13 -0700, Jason Ekstrand wrote:
> > On Wed, Jun 27, 2018 at 2:25 AM, Iago Toral 
> > wrote:
> > > On Tue, 2018-06-26 at 10:59 -0700, Jason Ekstrand wrote:
> > > > On Tue, Jun 26, 2018 at 4:08 AM, Iago Toral Quiroga  > > > lia.com> wrote:
> > > > > Storage images require patching push constant state to work,
> > > > > which happens during
> > > > > 
> > > > > binding table emission. In the scenario where our pipeline and
> > > > > descriptors are
> > > > > 
> > > > > not dirty, we don't re-emit the binding table, however, if
> > > > > our push constant
> > > > > 
> > > > > state is dirty, we will re-emit the push constant state,
> > > > > trashing storage
> > > > > 
> > > > > image setup.
> > > > > 
> > > > > 
> > > > > 
> > > > > While that scenario is probably not very likely to happen in
> > > > > practice, there
> > > > > 
> > > > > are some CTS tests that trigger this by clearing storage
> > > > > images and buffers
> > > > > 
> > > > > and dispatching a compute shader in a loop. The clearing of
> > > > > the images
> > > > > 
> > > > > and buffers will trigger a blorp execution which will dirty
> > > > > our push constant
> > > > > 
> > > > > state, however, because  we don't alter the descriptors or
> > > > > the compute dispatch
> > > > > 
> > > > > at all in the loop (we are basically executing the same
> > > > > program in a loop),
> > > > > 
> > > > > our pipeline and descriptor state is not dirty. If the shader
> > > > > uses a storage
> > > > > 
> > > > > image, then any iteration after the first will re-emit push
> > > > > constant state
> > > > > 
> > > > > without re-emitting binding tables and the storage image will
> > > > > not be properly
> > > > > 
> > > > > setup any more.
> > > > 
> > > > I don't see why that is a problem.  The only thing
> > > > flush_descriptor_sets does is fill out the binding and sampler
> > > > tables and fill in the push constant data for storage
> > > > images/buffers.  The actual HW packets are filled out by
> > > > flush_push_constants and emit_descriptor_pointers.  Yes, blorp
> > > > trashes our descriptor pointers but the descriptor sets should
> > > > be fine.  For push constants, it does emit 3DSTATE_CONSTANT_*
> > > > but it doesn't actually modify anv_cmd_state::push_constants.
> > > > 
> > > > Are secondary command buffers involved?  I could see something
> > > > funny going on with those.
> > > 
> > > No, no secondaries are involved. I did some more investigation
> > > and I think my explanation of the problem was not good, this is
> > > what is really happening:
> > > 
> > > First, I found the problem in the compute pipeline and I only
> > > extended the fix to the graphics pipeline because it looked like
> > > the same rationale would apply, so I'll explain what happens in
> > > compute and then we can discuss whether the same problem applies
> > > to graphics.
> > > 
> > > The test does something like this:
> > > 
> > > for (...) {
> > >clear ssbos / storage images
> > >dispatch compute
> > > }
> > > 
> > > The first iteration of this loop will find that the compute
> > > pipeline and descriptors are dirty and proceed to emit binding
> > > tables. We have storage images, so during that process the push
> > > constant buffer is amended to include storage images.
> > > Specifically, we call anv_cmd_buffer_ensure_push_constants_size()
> > > for the images field. This gives us a size of 624.
> > > 
> > > We move on to the second iteration of the loop. When we clear
> > > images and ssbos via blorp, we again mark the push constant
> > > buffer as dirty. Now we execute the compute dispatch and the
> > > first thing we do there is anv_cmd_buffer_push_base_group_id()
> > > which calls anv_cmd_buffer_ensure_push_constants_size() for the
> > > base group id, which gives us a size of 144. This is smaller than
> > > what we computed in the previous iteration, because we haven't
> > > called the same function for the images field yet. Unfortunately,
> > > we will never call that again, because we only do that during
> > > binding table emission and we only do that if the compute
> > > pipeline is dirty (it is not) or our descriptors are  dirty (they
> > > are not). So we don't re-emit binding table and we don't ensure
> > > push constant space for the image data, but because we come from
> > > a blorp execution our push constant dirty flag is true, so we re-
> > > emit push constant data, only that this time we won't emit the
> > > push constant data we need for the storage images, which leads to
> > > the problem.
> > 
> > The intention has always been that
> > anv_cmd_buffer_ensure_push_constants_size would only ever grow the
> > push constants and never shrink them.  The most obvious bug is in
> > anv_cmd_buffer_ensure_push_constants_size.
> >  
> > > I thought that maybe making
> > > anv_cmd_buffer_ensure_push_constants_size() only update the size
> > > if we alloc or 

Re: [Mesa-dev] Question about EGL_KHR_partial_update implementation

2018-06-28 Thread Frank Binns
Hi Qiang,

Qiang Yu  writes:

> Hi Harish,
>
> I want to implement EGL_KHR_partial_update for the lima mesa driver and found
> that you worked on Android/Wayland support for it:
> https://patchwork.freedesktop.org/patch/160944/
> https://patchwork.freedesktop.org/patch/188695/
>
> So I have some questions about it:
> your implementation seems to depend on platform (Android, Wayland) support,
> for example calling native_window_set_surface_damage() in the Android
> implementation. What is the purpose of that: to tell the Android
> SurfaceFlinger to redraw the damaged region? And is this damage region the
> "surface damage" or the "buffer damage" mentioned in EGL_KHR_partial_update?
> https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_partial_update.txt
>
> To my understanding, this extension should only depend on driver support
> instead of platform support, while EGL_KHR_swap_buffers_with_damage is the
> opposite:
> https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_swap_buffers_with_damage.txt
>

I came across this too and I agree with you. I think the current
implementation is incorrect and it should actually be passing the damage
rectangles to the driver.

Thanks
Frank

> For the lima implementation, I want to use the damage region (buffer damage)
> provided by EGL_KHR_partial_update to skip rendering of the undamaged region
> on eglSwapBuffersXXX. Telling the compositor about the damage region (surface
> damage) should be left to eglSwapBuffersWithDamageKHR, provided by
> EGL_KHR_swap_buffers_with_damage.
>
> Regards,
> Qiang
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



[Mesa-dev] [PATCH] radv: optimize vkCmd{Set, Reset}Event() a little bit

2018-06-28 Thread Samuel Pitoiset
Always emitting a bottom-of-pipe event is quite dumb. Instead,
start to optimize these functions by syncing PFP for the
top-of-pipe and syncing ME for the post-index-fetch event.

This can still be improved by emitting EOS events for
syncing PS and CS stages.

Signed-off-by: Samuel Pitoiset 
---
 src/amd/vulkan/radv_cmd_buffer.c | 46 ++--
 1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 074e9c4c7f..17385aace1 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -4275,14 +4275,44 @@ static void write_event(struct radv_cmd_buffer 
*cmd_buffer,
 
MAYBE_UNUSED unsigned cdw_max = 
radeon_check_space(cmd_buffer->device->ws, cs, 18);
 
-   /* TODO: this is overkill. Probably should figure something out from
-* the stage mask. */
-
-   si_cs_emit_write_event_eop(cs,
-  
cmd_buffer->device->physical_device->rad_info.chip_class,
-  radv_cmd_buffer_uses_mec(cmd_buffer),
-  V_028A90_BOTTOM_OF_PIPE_TS, 0,
-  EOP_DATA_SEL_VALUE_32BIT, va, 2, value);
+   /* Flags that only require a top-of-pipe event. */
+   static const VkPipelineStageFlags top_of_pipe_flags =
+   VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
+
+   /* Flags that only require a post-index-fetch event. */
+   static const VkPipelineStageFlags post_index_fetch_flags =
+   top_of_pipe_flags |
+   VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT |
+   VK_PIPELINE_STAGE_VERTEX_INPUT_BIT;
+
+   /* TODO: Emit EOS events for syncing PS/CS stages. */
+
+   if (!(stageMask & ~top_of_pipe_flags)) {
+   /* Just need to sync the PFP engine. */
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_PFP));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, value);
+   } else if (!(stageMask & ~post_index_fetch_flags)) {
+   /* Sync ME because PFP reads index and indirect buffers. */
+   radeon_emit(cs, PKT3(PKT3_WRITE_DATA, 3, 0));
+   radeon_emit(cs, S_370_DST_SEL(V_370_MEM_ASYNC) |
+   S_370_WR_CONFIRM(1) |
+   S_370_ENGINE_SEL(V_370_ME));
+   radeon_emit(cs, va);
+   radeon_emit(cs, va >> 32);
+   radeon_emit(cs, value);
+   } else {
+   /* Otherwise, sync all prior GPU work using an EOP event. */
+   si_cs_emit_write_event_eop(cs,
+  
cmd_buffer->device->physical_device->rad_info.chip_class,
+  radv_cmd_buffer_uses_mec(cmd_buffer),
+  V_028A90_BOTTOM_OF_PIPE_TS, 0,
+  EOP_DATA_SEL_VALUE_32BIT, va, 2, 
value);
+   }
 
assert(cmd_buffer->cs->cdw <= cdw_max);
 }
-- 
2.18.0
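
The selection logic in the patch above boils down to a subset test on the
stage mask: "!(mask & ~allowed)" is true only when every bit set in mask is
also in allowed. A standalone sketch of just that classification, with local
copies of the relevant Vulkan stage bit values so it builds without the
Vulkan headers; the enum and function names are illustrative only:

```c
#include <stdint.h>

/* Local copies of the relevant VkPipelineStageFlagBits values so the
 * sketch builds without the Vulkan headers. */
#define STAGE_TOP_OF_PIPE    0x00000001u
#define STAGE_DRAW_INDIRECT  0x00000002u
#define STAGE_VERTEX_INPUT   0x00000004u
#define STAGE_COMPUTE_SHADER 0x00000800u

/* Illustrative names for the three code paths in write_event(). */
enum event_sync_sketch { SYNC_PFP, SYNC_ME, SYNC_EOP };

static enum event_sync_sketch classify_stage_mask(uint32_t stage_mask)
{
   const uint32_t top = STAGE_TOP_OF_PIPE;
   const uint32_t post_index_fetch =
      top | STAGE_DRAW_INDIRECT | STAGE_VERTEX_INPUT;

   /* No bits outside the top-of-pipe set: the cheapest write suffices. */
   if (!(stage_mask & ~top))
      return SYNC_PFP;

   /* No bits outside the post-index-fetch set. */
   if (!(stage_mask & ~post_index_fetch))
      return SYNC_ME;

   /* Anything later in the pipeline falls back to the EOP event. */
   return SYNC_EOP;
}
```

A single later-stage bit (for example the compute shader stage) is enough to
force the bottom-of-pipe path, which is what the TODO about EOS events would
refine.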



[Mesa-dev] Question about EGL_KHR_partial_update implementation

2018-06-28 Thread Qiang Yu
Hi Harish,

I want to implement EGL_KHR_partial_update for the lima mesa driver and found
that you worked on Android/Wayland support for it:
https://patchwork.freedesktop.org/patch/160944/
https://patchwork.freedesktop.org/patch/188695/

So I have some questions about it:
your implementation seems to depend on platform (Android, Wayland) support,
for example calling native_window_set_surface_damage() in the Android
implementation. What is the purpose of that: to tell the Android
SurfaceFlinger to redraw the damaged region? And is this damage region the
"surface damage" or the "buffer damage" mentioned in EGL_KHR_partial_update?
https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_partial_update.txt

To my understanding, this extension should only depend on driver support
instead of platform support, while EGL_KHR_swap_buffers_with_damage is the
opposite:
https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_swap_buffers_with_damage.txt

For the lima implementation, I want to use the damage region (buffer damage)
provided by EGL_KHR_partial_update to skip rendering of the undamaged region
on eglSwapBuffersXXX. Telling the compositor about the damage region (surface
damage) should be left to eglSwapBuffersWithDamageKHR, provided by
EGL_KHR_swap_buffers_with_damage.

Regards,
Qiang
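
To make the surface damage / buffer damage distinction concrete, here is a
small sketch of how a client could derive buffer damage from per-frame
surface damage. It assumes the buffer-age model from EGL_EXT_buffer_age (a
back buffer of age N holds the contents from N frames ago, age 0 means
undefined contents) and approximates regions by a single bounding rectangle.
All names are made up for illustration; none of this is EGL or Mesa API.

```c
#include <stddef.h>

/* Axis-aligned rectangle, half-open: covers [x0, x1) x [y0, y1). */
struct rect_sketch { int x0, y0, x1, y1; };

static struct rect_sketch
rect_union(struct rect_sketch a, struct rect_sketch b)
{
   struct rect_sketch r;
   r.x0 = a.x0 < b.x0 ? a.x0 : b.x0;
   r.y0 = a.y0 < b.y0 ? a.y0 : b.y0;
   r.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
   r.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
   return r;
}

/* current: surface damage of the frame being drawn.
 * history: surface damage of earlier frames, most recent first.
 * age:     age of the back buffer about to be rendered to.
 * full:    the whole surface, used when nothing can be assumed. */
static struct rect_sketch
buffer_damage(struct rect_sketch current, const struct rect_sketch *history,
              size_t history_len, unsigned age, struct rect_sketch full)
{
   /* Undefined contents, or not enough history kept: redraw everything. */
   if (age == 0 || (size_t)(age - 1) > history_len)
      return full;

   /* An age-N buffer is missing this frame's changes plus the changes of
    * the N-1 frames drawn since it was last used. */
   struct rect_sketch r = current;
   for (size_t i = 0; i + 1 < age; i++)
      r = rect_union(r, history[i]);
   return r;
}
```

In this model the buffer damage is what the driver needs in order to skip
rendering, while only the current frame's surface damage is of interest to
the compositor.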


Re: [Mesa-dev] [PATCH 11/11] ac/radv: using tls to store llvm related info and speed up compiles (v3)

2018-06-28 Thread Alex Smith
Hi Dave,

I did a quick test with this on Rise of the Tomb Raider. It reduced the
time taken to create all pipelines for the whole game over 8 threads (with
RADV_DEBUG=nocache) from 12m24s to 11m35s. Nice improvement :)

Also didn't see any issues, so:

Tested-by: Alex Smith 

Thanks,
Alex
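
For readers skimming the thread, the idea in the quoted patch below is to pay
the compiler setup cost once per thread instead of once per shader. A
stripped-down sketch of that pattern in plain C11, using a single
_Thread_local slot rather than the per-thread std::list the patch uses; the
struct and function names are placeholders and no LLVM is involved:

```c
#include <stdbool.h>

/* Stand-in for the expensive per-thread compiler state. */
struct compiler_state_sketch {
   bool initialized;
   unsigned init_count;   /* how many times setup ran on this thread */
   unsigned compiles;     /* how many shaders this thread compiled */
};

/* One instance per thread, zero-initialized, never shared. */
static _Thread_local struct compiler_state_sketch tls_state;

static struct compiler_state_sketch *get_thread_compiler_state(void)
{
   if (!tls_state.initialized) {
      /* Expensive one-time setup would go here, e.g. building the
       * pass pipeline. */
      tls_state.initialized = true;
      tls_state.init_count++;
   }
   return &tls_state;
}

static void compile_shader_sketch(void)
{
   struct compiler_state_sketch *s = get_thread_compiler_state();

   /* Reuses whatever this thread already set up. */
   s->compiles++;
}
```

Because each thread only ever touches its own copy, no locking is needed,
which matches the reasoning in the commit message about LLVM objects not
being thread safe.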

On 27 June 2018 at 04:58, Dave Airlie  wrote:

> From: Dave Airlie 
>
> I'd like to encourage people to test this to see if it helps (like
> does it make app startup better or less hitching in dxvk).
>
> The basic idea is to store a bunch of LLVM related data structs
> in thread local storage so we can avoid reiniting them every time
> we compile a shader. Since we know llvm objects aren't thread safe
> it has to be stored using TLS to avoid any collisions.
>
> This should remove all the fixed overheads setup costs of creating
> the pass manager each time.
>
> This takes a demo app time to compile the radv meta shaders on nocache
> and exit from 1.7s to 1s.
>
> TODO: this doesn't work for radeonsi yet, but I'm not sure how TLS
> works if you have radeonsi and radv loaded at the same time, if
> they'll magically try and use the same tls stuff, in which case
> this might explode all over the place.
>
> v2: fix llvm6 build, inline emit function, handle multiple targets
> in one thread
> v3: rebase and port onto new structure
> ---
>  src/amd/common/ac_llvm_helper.cpp | 120 --
>  src/amd/common/ac_llvm_util.c |  10 +--
>  src/amd/common/ac_llvm_util.h |   9 +++
>  src/amd/vulkan/radv_debug.h   |   1 +
>  src/amd/vulkan/radv_device.c  |   1 +
>  src/amd/vulkan/radv_shader.c  |   2 +
>  6 files changed, 132 insertions(+), 11 deletions(-)
>
> diff --git a/src/amd/common/ac_llvm_helper.cpp b/src/amd/common/ac_llvm_
> helper.cpp
> index 27403dbe085..f1f1399b3fb 100644
> --- a/src/amd/common/ac_llvm_helper.cpp
> +++ b/src/amd/common/ac_llvm_helper.cpp
> @@ -31,12 +31,21 @@
>
>  #include "ac_llvm_util.h"
>  #include 
> -#include 
> -#include 
> -#include 
> -#include 
> +#include 
>  #include 
>  #include 
> +#include 
> +
> +#include 
> +#include 
> +#if HAVE_LLVM >= 0x0700
> +#include 
> +#endif
> +
> +#if HAVE_LLVM < 0x0700
> +#include "llvm/Support/raw_ostream.h"
> +#endif
> +#include 
>
>  void ac_add_attr_dereferenceable(LLVMValueRef val, uint64_t bytes)
>  {
> @@ -101,11 +110,110 @@ ac_dispose_target_library_info(LLVMTargetLibraryInfoRef
> library_info)
> delete reinterpret_cast<llvm::TargetLibraryInfoImpl *>(library_info);
>  }
>
> +class ac_llvm_per_thread_info {
> +public:
> +   ac_llvm_per_thread_info(enum radeon_family arg_family,
> +   enum ac_target_machine_options
> arg_tm_options)
> +   : family(arg_family), tm_options(arg_tm_options),
> + OStream(CodeString) {}
> +   ~ac_llvm_per_thread_info() {
> +   ac_llvm_compiler_dispose_internal(&llvm_info);
> +   }
> +
> +   struct ac_llvm_compiler_info llvm_info;
> +   enum radeon_family family;
> +   enum ac_target_machine_options tm_options;
> +   llvm::SmallString<0> CodeString;
> +   llvm::raw_svector_ostream OStream;
> +   llvm::legacy::PassManager pass;
> +};
> +
> +/* we have to store a linked list per thread due to the possibility of
> multiple gpus being required */
> +static thread_local std::list<ac_llvm_per_thread_info>
> ac_llvm_per_thread_list;
> +
>  bool ac_compile_to_memory_buffer(struct ac_llvm_compiler_info *info,
>  LLVMModuleRef M,
>  char **ErrorMessage,
>  LLVMMemoryBufferRef *OutMemBuf)
>  {
> -   return LLVMTargetMachineEmitToMemoryBuffer(info->tm, M,
> LLVMObjectFile,
> -  ErrorMessage,
> OutMemBuf);
> +   ac_llvm_per_thread_info *thread_info = nullptr;
> +   if (info->thread_stored) {
> +   for (auto &I : ac_llvm_per_thread_list) {
> +   if (I.llvm_info.tm == info->tm) {
> +   thread_info = &I;
> +   break;
> +   }
> +   }
> +
> +   if (!thread_info) {
> +   assert(0);
> +   return false;
> +   }
> +   } else {
> +   return LLVMTargetMachineEmitToMemoryBuffer(info->tm, M,
> LLVMObjectFile,
> +  ErrorMessage,
> OutMemBuf);
> +   }
> +
> +   llvm::TargetMachine *TM = reinterpret_cast<llvm::TargetMachine*>(thread_info->llvm_info.tm);
> +   llvm::Module *Mod = llvm::unwrap(M);
> +   llvm::StringRef Data;
> +
> +   Mod->setDataLayout(TM->createDataLayout());
> +
> +   thread_info->pass.run(*Mod);
> +
> +   Data = thread_info->OStream.str();
> +   *OutMemBuf = LLVMCreateMemoryBufferWithMemoryRangeCopy(Data.data(),
> Data.size(), "");
> +   thread_info->CodeString = "";
> +   return false;
> +}
> +
> +bool 

Re: [Mesa-dev] Radeonsi OpenGL 4.4 compat profile support

2018-06-28 Thread Mike Lothian
Hi

I can confirm Dying Light is still working great with this series, it
doesn't require any overrides and changes to the graphical settings no
longer cause crashes

Tropico 5 also launches and plays with the allow_higher_compat_version option
removed from /etc/drirc (I don't own the other 2 games that have this
option set)

Tested-by: Mike Lothian 

Thanks again for this

On Thu, 28 Jun 2018 at 07:47 Timothy Arceri  wrote:

> Sorry to keep spamming the list with this stuff, but Dave helped
> out with ARB_vertex_attrib_64bit support and the spec bug I
> submitted for indirect compute dispatch was resolved so it
> seemed like a good idea to send it all out again together with
> these updates.
>
> Pretty much everything has corresponding piglit tests, but I've
> also been testing with a few games and I'm now seeing games such as
> Doom and Wolfenstein working on wine where previously the version
> overrides were not enough to get them to work.
>
> There has also been a report that proper compat support fixes
> some issues with Dying Light.
>
> Please review.
>
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


[Mesa-dev] [PATCH 10/18] mesa: add missing display list support for ARB_compute_shader

2018-06-28 Thread Timothy Arceri
The extension is enabled for compat profile but there is currently
no display list support.

I filed a spec bug and it has been agreed that
glDispatchComputeIndirect should generate an INVALID_OPERATION
error when called during display list compilation.
---
 src/mesa/main/dlist.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index b2b1f723a17..e2ab2eb8aa1 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -510,6 +510,9 @@ typedef enum
OPCODE_SAMPLER_PARAMETERIIV,
OPCODE_SAMPLER_PARAMETERUIV,
 
+   /* ARB_compute_shader */
+   OPCODE_DISPATCH_COMPUTE,
+
/* GL_ARB_sync */
OPCODE_WAIT_SYNC,
 
@@ -6570,6 +6573,33 @@ save_DrawTransformFeedbackStreamInstanced(GLenum mode, 
GLuint name,
}
 }
 
+static void GLAPIENTRY
+save_DispatchCompute(GLuint num_groups_x, GLuint num_groups_y,
+ GLuint num_groups_z)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_DISPATCH_COMPUTE, 3);
+   if (n) {
+  n[1].ui = num_groups_x;
+  n[2].ui = num_groups_y;
+  n[3].ui = num_groups_z;
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_DispatchCompute(ctx->Exec, (num_groups_x, num_groups_y,
+   num_groups_z));
+   }
+}
+
+static void GLAPIENTRY
+save_DispatchComputeIndirect(GLintptr indirect)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glDispatchComputeIndirect() during display list compile");
+}
+
 static void GLAPIENTRY
 save_UseProgram(GLuint program)
 {
@@ -10429,6 +10459,11 @@ execute_list(struct gl_context *ctx, GLuint list)
 }
 break;
 
+ /* ARB_compute_shader */
+ case OPCODE_DISPATCH_COMPUTE:
+CALL_DispatchCompute(ctx->Exec, (n[1].ui, n[2].ui, n[3].ui));
+break;
+
  /* GL_ARB_sync */
  case OPCODE_WAIT_SYNC:
 {
@@ -11138,6 +11173,10 @@ _mesa_initialize_save_table(const struct gl_context 
*ctx)
SET_DepthRangeArrayv(table, save_DepthRangeArrayv);
SET_DepthRangeIndexed(table, save_DepthRangeIndexed);
 
+   /* 122. ARB_compute_shader */
+   SET_DispatchCompute(table, save_DispatchCompute);
+   SET_DispatchComputeIndirect(table, save_DispatchComputeIndirect);
+
/* 173. GL_EXT_blend_func_separate */
SET_BlendFuncSeparate(table, save_BlendFuncSeparateEXT);
 
-- 
2.17.1



[Mesa-dev] [PATCH 18/18] radeonsi: enable OpenGL 4.4 compat profile

2018-06-28 Thread Timothy Arceri
---
 src/gallium/drivers/radeonsi/si_get.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_get.c 
b/src/gallium/drivers/radeonsi/si_get.c
index 0e8617d0fee..96ff2a9e46b 100644
--- a/src/gallium/drivers/radeonsi/si_get.c
+++ b/src/gallium/drivers/radeonsi/si_get.c
@@ -210,13 +210,12 @@ static int si_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
return 4;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
+   case PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY:
if (sscreen->info.has_indirect_compute_dispatch)
-   return 450;
+   return param == PIPE_CAP_GLSL_FEATURE_LEVEL ?
+   450 : 440;
return 420;
 
-   case PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY:
-   return 330;
-
case PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE:
return MIN2(sscreen->info.max_alloc_size, INT_MAX);
 
-- 
2.17.1



Re: [Mesa-dev] [PATCH] anv/cmd_buffer: emit binding tables always if push constants are dirty

2018-06-28 Thread Iago Toral
On Wed, 2018-06-27 at 09:13 -0700, Jason Ekstrand wrote:
> On Wed, Jun 27, 2018 at 2:25 AM, Iago Toral 
> wrote:
> > On Tue, 2018-06-26 at 10:59 -0700, Jason Ekstrand wrote:
> > > On Tue, Jun 26, 2018 at 4:08 AM, Iago Toral Quiroga  > > a.com> wrote:
> > > > Storage images require patching push constant state to work,
> > > > which happens during
> > > > 
> > > > binding table emission. In the scenario where our pipeline and
> > > > descriptors are
> > > > 
> > > > not dirty, we don't re-emit the binding table, however, if our
> > > > push constant
> > > > 
> > > > state is dirty, we will re-emit the push constant state,
> > > > trashing storage
> > > > 
> > > > image setup.
> > > > 
> > > > 
> > > > 
> > > > While that scenario is probably not very likely to happen in
> > > > practice, there
> > > > 
> > > > are some CTS tests that trigger this by clearing storage images
> > > > and buffers
> > > > 
> > > > and dispatching a compute shader in a loop. The clearing of the
> > > > images
> > > > 
> > > > and buffers will trigger a blorp execution which will dirty our
> > > > push constant
> > > > 
> > > > state, however, because  we don't alter the descriptors or the
> > > > compute dispatch
> > > > 
> > > > at all in the loop (we are basically executing the same program
> > > > in a loop),
> > > > 
> > > > our pipeline and descriptor state is not dirty. If the shader
> > > > uses a storage
> > > > 
> > > > image, then any iteration after the first will re-emit push
> > > > constant state
> > > > 
> > > > without re-emitting binding tables and the storage image will
> > > > not be properly
> > > > 
> > > > setup any more.
> > > 
> > > I don't see why that is a problem.  The only thing
> > > flush_descriptor_sets does is fill out the binding and sampler
> > > tables and fill in the push constant data for storage
> > > images/buffers.  The actual HW packets are filled out by
> > > flush_push_constants and emit_descriptor_pointers.  Yes, blorp
> > > trashes our descriptor pointers but the descriptor sets should be
> > > fine.  For push constants, it does emit 3DSTATE_CONSTANT_* but it
> > > doesn't actually modify anv_cmd_state::push_constants.
> > > 
> > > Are secondary command buffers involved?  I could see something
> > > funny going on with those.
> > 
> > No, no secondaries are involved. I did some more investigation and
> > I think my explanation of the problem was not good, this is what is
> > really happening:
> > 
> > First, I found the problem in the compute pipeline and I only
> > extended the fix to the graphics pipeline because it looked like
> > the same rationale would apply, so I'll explain what happens in
> > compute and then we can discuss whether the same problem applies to
> > graphics.
> > 
> > The test does something like this:
> > 
> > for (...) {
> >clear ssbos / storage images
> >dispatch compute
> > }
> > 
> > The first iteration of this loop will find that the compute
> > pipeline and descriptors are dirty and proceed to emit binding
> > tables. We have storage images, so during that process the push
> > constant buffer is amended to include storage images. Specifically,
> > we call anv_cmd_buffer_ensure_push_constants_size() for the images
> > field. This gives us a size of 624.
> > 
> > We move on to the second iteration of the loop. When we clear
> > images and ssbos via blorp, we again mark the push constant buffer
> > as dirty. Now we execute the compute dispatch and the first thing
> > we do there is anv_cmd_buffer_push_base_group_id() which calls
> > anv_cmd_buffer_ensure_push_constants_size() for the base group id,
> > which gives us a size of 144. This is smaller than what we computed
> > in the previous iteration, because we haven't called the same
> > function for the images field yet. Unfortunately, we will never
> > call that again, because we only do that during binding table
> > emission and we only do that if the compute pipeline is dirty (it
> > is not) or our descriptors are  dirty (they are not). So we don't
> > re-emit binding table and we don't ensure push constant space for
> > the image data, but because we come from a blorp execution our push
> > constant dirty flag is true, so we re-emit push constant data, only
> > that this time we won't emit the push constant data we need for the
> > storage images, which leads to the problem.
> 
> The intention has always been that
> anv_cmd_buffer_ensure_push_constants_size would only ever grow the
> push constants and never shrink them.  The most obvious bug is in
> anv_cmd_buffer_ensure_push_constants_size.
>  
> > I thought that maybe making
> > anv_cmd_buffer_ensure_push_constants_size() only update the size if
> > we alloc or realloc would fix this, but that can cause GPU hangs in
> > some cases when I run multiple tests in parallel, so I guess it
> > isn't that simple.
> > 
> 
> Ugh...  that makes things more interesting.  That does look like the
> right fix and now I'm wondering why it leads to a hang.
> 
> 
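The "only ever grow, never shrink" intent described above can be sketched in isolation. This is a minimal illustration, not the driver's code: `struct push_constants` and `ensure_push_constants_size()` are hypothetical stand-ins, and plain `realloc` is assumed in place of the driver's allocator. Note that the thread reports the equivalent change in anv led to GPU hangs in some parallel runs, so this shows the intended invariant rather than a complete fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the command buffer's push constant storage. */
struct push_constants {
   unsigned char *data; /* backing storage, zero-filled when it grows */
   size_t size;         /* high-water mark in bytes */
};

/* Ensure at least `size` bytes are available. The recorded size only grows.
 * Returns 0 on success, -1 if the allocation fails. */
static int
ensure_push_constants_size(struct push_constants *pc, size_t size)
{
   assert(pc != NULL);

   if (size <= pc->size)
      return 0; /* smaller request: keep the larger size already in use */

   unsigned char *grown = realloc(pc->data, size);
   if (grown == NULL)
      return -1;

   /* Zero only the newly added tail so earlier contents are preserved. */
   memset(grown + pc->size, 0, size - pc->size);
   pc->data = grown;
   pc->size = size;
   return 0;
}
```

With the sizes from the report, a request for 624 bytes followed by a request for 144 bytes leaves the recorded size at 624, so a later re-emit would still cover the storage image data.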

[Mesa-dev] [PATCH 17/18] mesa: enable ARB_vertex_attrib_64bit in compat profile

2018-06-28 Thread Timothy Arceri
---
 src/mapi/glapi/gen/apiexec.py   | 20 ++--
 src/mesa/main/extensions_table.h|  2 +-
 src/mesa/main/tests/dispatch_sanity.cpp | 23 ---
 src/mesa/main/vtxfmt.c  |  2 +-
 4 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index 1a91785d375..44552f43f29 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -113,16 +113,16 @@ functions = {
 # For Mesa this effectively means OpenGL 3.2 is required.  It seems
 # unlikely that Mesa will ever get support for any of the NV extensions
 # that add "equivalent functionality."
-"VertexAttribL1d": exec_info(core=32),
-"VertexAttribL2d": exec_info(core=32),
-"VertexAttribL3d": exec_info(core=32),
-"VertexAttribL4d": exec_info(core=32),
-"VertexAttribL1dv": exec_info(core=32),
-"VertexAttribL2dv": exec_info(core=32),
-"VertexAttribL3dv": exec_info(core=32),
-"VertexAttribL4dv": exec_info(core=32),
-"VertexAttribLPointer": exec_info(core=32),
-"GetVertexAttribLdv": exec_info(core=32),
+"VertexAttribL1d": exec_info(compatibility=32, core=32),
+"VertexAttribL2d": exec_info(compatibility=32, core=32),
+"VertexAttribL3d": exec_info(compatibility=32, core=32),
+"VertexAttribL4d": exec_info(compatibility=32, core=32),
+"VertexAttribL1dv": exec_info(compatibility=32, core=32),
+"VertexAttribL2dv": exec_info(compatibility=32, core=32),
+"VertexAttribL3dv": exec_info(compatibility=32, core=32),
+"VertexAttribL4dv": exec_info(compatibility=32, core=32),
+"VertexAttribLPointer": exec_info(compatibility=32, core=32),
+"GetVertexAttribLdv": exec_info(compatibility=32, core=32),
 
 # OpenGL 4.1 / GL_ARB_viewport_array.  The extension spec says:
 #
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 12b796777df..8ed1308182e 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -175,7 +175,7 @@ EXT(ARB_transpose_matrix, dummy_true
 EXT(ARB_uniform_buffer_object   , ARB_uniform_buffer_object
  , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_vertex_array_bgra   , EXT_vertex_array_bgra
  , GLL, GLC,  x ,  x , 2008)
 EXT(ARB_vertex_array_object , dummy_true   
  , GLL, GLC,  x ,  x , 2006)
-EXT(ARB_vertex_attrib_64bit , ARB_vertex_attrib_64bit  
  ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_vertex_attrib_64bit , ARB_vertex_attrib_64bit  
  ,  32, GLC,  x ,  x , 2010)
 EXT(ARB_vertex_attrib_binding   , dummy_true   
  , GLL, GLC,  x ,  x , 2012)
 EXT(ARB_vertex_buffer_object, dummy_true   
  , GLL,  x ,  x ,  x , 2003)
 EXT(ARB_vertex_program  , ARB_vertex_program   
  , GLL,  x ,  x ,  x , 2002)
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 085b1f7dd5f..542dbbdee0f 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -603,6 +603,18 @@ const struct function common_desktop_functions_possible[] 
= {
{ "glUniformMatrix4x3dv", 40, -1 },
{ "glGetUniformdv", 43, -1 },
 
+   /* GL 4.1 */
+   { "glVertexAttribL1d", 41, -1 },
+   { "glVertexAttribL2d", 41, -1 },
+   { "glVertexAttribL3d", 41, -1 },
+   { "glVertexAttribL4d", 41, -1 },
+   { "glVertexAttribL1dv", 41, -1 },
+   { "glVertexAttribL2dv", 41, -1 },
+   { "glVertexAttribL3dv", 41, -1 },
+   { "glVertexAttribL4dv", 41, -1 },
+   { "glVertexAttribLPointer", 41, -1 },
+   { "glGetVertexAttribLdv", 41, -1 },
+
/* GL 4.3 */
{ "glIsRenderbuffer", 43, -1 },
{ "glBindRenderbuffer", 43, -1 },
@@ -1765,17 +1777,6 @@ const struct function gl_core_functions_possible[] = {
{ "glValidateProgramPipeline", 43, -1 },
{ "glGetProgramPipelineInfoLog", 43, -1 },
 
-   { "glVertexAttribL1d", 41, -1 },
-   { "glVertexAttribL2d", 41, -1 },
-   { "glVertexAttribL3d", 41, -1 },
-   { "glVertexAttribL4d", 41, -1 },
-   { "glVertexAttribL1dv", 41, -1 },
-   { "glVertexAttribL2dv", 41, -1 },
-   { "glVertexAttribL3dv", 41, -1 },
-   { "glVertexAttribL4dv", 41, -1 },
-   { "glVertexAttribLPointer", 41, -1 },
-   { "glGetVertexAttribLdv", 41, -1 },
-
 // { "glCreateSyncFromCLeventARB", 43, -1 },// XXX: Add to xml
 
{ "glDrawArraysInstancedBaseInstance", 43, -1 },
diff --git a/src/mesa/main/vtxfmt.c b/src/mesa/main/vtxfmt.c
index 61629a40fda..3e96c7d2fba 100644
--- a/src/mesa/main/vtxfmt.c
+++ b/src/mesa/main/vtxfmt.c
@@ -211,7 +211,7 @@ install_vtxfmt(struct gl_context *ctx, struct _glapi_table 
*tab,
   SET_VertexAttribL1ui64vARB(tab, vfmt->VertexAttribL1ui64vARB);
}
 
-   if (ctx->API == API_OPENGL_CORE) {
+   if 

[Mesa-dev] [PATCH 15/18] vbo_save: add support for doubles to display list code

2018-06-28 Thread Timothy Arceri
From: Dave Airlie 

Required for ARB_vertex_attrib_64bit compat profile support.
---
 src/mesa/main/mtypes.h   |  2 +-
 src/mesa/vbo/vbo_private.h   |  3 +++
 src/mesa/vbo/vbo_save_api.c  | 18 --
 src/mesa/vbo/vbo_save_draw.c | 18 +-
 4 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 0dfff313966..7ef7a3f1106 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4432,7 +4432,7 @@ struct gl_dlist_state
GLvertexformat ListVtxfmt;
 
GLubyte ActiveAttribSize[VERT_ATTRIB_MAX];
-   GLfloat CurrentAttrib[VERT_ATTRIB_MAX][4];
+   GLfloat CurrentAttrib[VERT_ATTRIB_MAX][8];
 
GLubyte ActiveMaterialSize[MAT_ATTRIB_MAX];
GLfloat CurrentMaterial[MAT_ATTRIB_MAX][4];
diff --git a/src/mesa/vbo/vbo_private.h b/src/mesa/vbo/vbo_private.h
index 3f7d0dc6082..86f6b41b793 100644
--- a/src/mesa/vbo/vbo_private.h
+++ b/src/mesa/vbo/vbo_private.h
@@ -214,6 +214,9 @@ _vbo_set_attrib_format(struct gl_context *ctx,
 {
const GLboolean integer = vbo_attrtype_to_integer_flag(type);
const GLboolean doubles = vbo_attrtype_to_double_flag(type);
+
+   if (doubles)
+  size /= 2;
_mesa_update_array_format(ctx, vao, attr, size, type, GL_RGBA,
  GL_FALSE, integer, doubles, offset);
/* Ptr for userspace arrays.
diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 945a0c8bff5..d5b43d06845 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -791,9 +791,12 @@ copy_to_current(struct gl_context *ctx)
   const int i = u_bit_scan64(&enabled);
   assert(save->attrsz[i]);
 
-  save->currentsz[i][0] = save->attrsz[i];
-  COPY_CLEAN_4V_TYPE_AS_UNION(save->current[i], save->attrsz[i],
-  save->attrptr[i], save->attrtype[i]);
+  if (save->attrtype[i] == GL_DOUBLE ||
+  save->attrtype[i] == GL_UNSIGNED_INT64_ARB)
+ memcpy(save->current[i], save->attrptr[i], save->attrsz[i] * 
sizeof(GLfloat));
+  else
+ COPY_CLEAN_4V_TYPE_AS_UNION(save->current[i], save->attrsz[i],
+ save->attrptr[i], save->attrtype[i]);
}
 }
 
@@ -935,11 +938,13 @@ upgrade_vertex(struct gl_context *ctx, GLuint attr, 
GLuint newsz)
  * get a glTexCoord4f() or glTexCoord1f() call.
  */
 static void
-fixup_vertex(struct gl_context *ctx, GLuint attr, GLuint sz)
+fixup_vertex(struct gl_context *ctx, GLuint attr,
+ GLuint sz, GLenum newType)
 {
struct vbo_save_context *save = &vbo_context(ctx)->save;
 
-   if (sz > save->attrsz[attr]) {
+   if (sz > save->attrsz[attr] ||
+   newType != save->attrtype[attr]) {
   /* New size is larger.  Need to flush existing vertices and get
* an enlarged vertex format.
*/
@@ -994,9 +999,10 @@ reset_vertex(struct gl_context *ctx)
 #define ATTR_UNION(A, N, T, C, V0, V1, V2, V3) \
 do {   \
struct vbo_save_context *save = &vbo_context(ctx)->save;\
+   int sz = (sizeof(C) / sizeof(GLfloat)); \
\
if (save->active_sz[A] != N)\
-  fixup_vertex(ctx, A, N); \
+  fixup_vertex(ctx, A, N * sz, T); \
\
{   \
   C *dest = (C *)save->attrptr[A];  \
diff --git a/src/mesa/vbo/vbo_save_draw.c b/src/mesa/vbo/vbo_save_draw.c
index 71620e9a3cd..409a353b520 100644
--- a/src/mesa/vbo/vbo_save_draw.c
+++ b/src/mesa/vbo/vbo_save_draw.c
@@ -54,16 +54,24 @@ copy_vao(struct gl_context *ctx, const struct 
gl_vertex_array_object *vao,
   struct gl_array_attributes *currval = &vbo->current[shift + i];
   const GLubyte size = attrib->Size;
   const GLenum16 type = attrib->Type;
-  fi_type tmp[4];
+  fi_type tmp[8];
+  int dmul = 1;
 
-  COPY_CLEAN_4V_TYPE_AS_UNION(tmp, size, *data, type);
+  if (type == GL_DOUBLE ||
+  type == GL_UNSIGNED_INT64_ARB)
+ dmul = 2;
+
+  if (dmul == 2)
+ memcpy(tmp, *data, size * dmul * sizeof(GLfloat));
+  else
+ COPY_CLEAN_4V_TYPE_AS_UNION(tmp, size, *data, type);
 
   if (type != currval->Type ||
-  memcmp(currval->Ptr, tmp, 4 * sizeof(GLfloat)) != 0) {
- memcpy((fi_type*)currval->Ptr, tmp, 4 * sizeof(GLfloat));
+  memcmp(currval->Ptr, tmp, 4 * sizeof(GLfloat) * dmul) != 0) {
+ memcpy((fi_type*)currval->Ptr, tmp, 4 * sizeof(GLfloat) * dmul);
 
  currval->Size = size;
- currval->_ElementSize = size * sizeof(GLfloat);
+ currval->_ElementSize = size * sizeof(GLfloat) * dmul;
  currval->Type = type;
  currval->Integer = 

[Mesa-dev] [PATCH 12/18] mesa: add ARB_draw_indirect support to compat profile

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/bufferobj.c|  3 +-
 src/mesa/main/extensions_table.h |  2 +-
 src/mesa/vbo/vbo_exec_array.c| 66 +++-
 3 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 67f9cd0a902..1d1e51bc015 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -129,8 +129,7 @@ get_buffer_target(struct gl_context *ctx, GLenum target)
  return &ctx->QueryBuffer;
   break;
case GL_DRAW_INDIRECT_BUFFER:
-  if ((ctx->API == API_OPENGL_CORE &&
-   ctx->Extensions.ARB_draw_indirect) ||
+  if ((_mesa_is_desktop_gl(ctx) && ctx->Extensions.ARB_draw_indirect) ||
_mesa_is_gles31(ctx)) {
  return &ctx->DrawIndirectBuffer;
   }
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index f79a52cee8c..1446a4bd421 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -58,7 +58,7 @@ EXT(ARB_direct_state_access , dummy_true
 EXT(ARB_draw_buffers, dummy_true   
  , GLL, GLC,  x ,  x , 2002)
 EXT(ARB_draw_buffers_blend  , ARB_draw_buffers_blend   
  , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_draw_elements_base_vertex   , ARB_draw_elements_base_vertex
  , GLL, GLC,  x ,  x , 2009)
-EXT(ARB_draw_indirect   , ARB_draw_indirect
  ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_draw_indirect   , ARB_draw_indirect
  , GLL, GLC,  x ,  x , 2010)
 EXT(ARB_draw_instanced  , ARB_draw_instanced   
  , GLL, GLC,  x ,  x , 2008)
 EXT(ARB_enhanced_layouts, ARB_enhanced_layouts 
  , GLL, GLC,  x ,  x , 2013)
 EXT(ARB_explicit_attrib_location, ARB_explicit_attrib_location 
  , GLL, GLC,  x ,  x , 2009)
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index 792907ac044..0d92de2e3ad 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -39,6 +39,21 @@
 #include "main/macros.h"
 #include "main/transformfeedback.h"
 
+typedef struct {
+   GLuint count;
+   GLuint primCount;
+   GLuint first;
+   GLuint reservedMustBeZero;
+} DrawArraysIndirectCommand;
+
+typedef struct {
+   GLuint count;
+   GLuint primCount;
+   GLuint firstIndex;
+   GLint  baseVertex;
+   GLuint reservedMustBeZero;
+} DrawElementsIndirectCommand;
+
 
 /**
  * Check that element 'j' of the array has reasonable data.
@@ -1616,6 +1631,20 @@ vbo_exec_DrawArraysIndirect(GLenum mode, const GLvoid 
*indirect)
   _mesa_debug(ctx, "glDrawArraysIndirect(%s, %p)\n",
   _mesa_enum_to_string(mode), indirect);
 
+   /* From the ARB_draw_indirect spec:
+*
+*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
+*compatibility profile, this indicates that DrawArraysIndirect and
+*DrawElementsIndirect are to source their arguments directly from the
+*pointer passed as their <indirect> parameters."
+*/
+   if (ctx->API == API_OPENGL_COMPAT &&
+   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
+  DrawArraysIndirectCommand *cmd = (DrawArraysIndirectCommand *) indirect;
+  _mesa_DrawArraysInstanced(mode, cmd->first, cmd->count, cmd->primCount);
+  return;
+   }
+
FLUSH_FOR_DRAW(ctx);
 
if (_mesa_is_no_error_enabled(ctx)) {
@@ -1647,6 +1676,41 @@ vbo_exec_DrawElementsIndirect(GLenum mode, GLenum type, 
const GLvoid *indirect)
   _mesa_enum_to_string(mode),
   _mesa_enum_to_string(type), indirect);
 
+   /* From the ARB_draw_indirect spec:
+*
+*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
+*compatibility profile, this indicates that DrawArraysIndirect and
+*DrawElementsIndirect are to source their arguments directly from the
+*pointer passed as their <indirect> parameters."
+*/
+   if (ctx->API == API_OPENGL_COMPAT &&
+   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
+  /*
+   * Unlike regular DrawElementsInstancedBaseVertex commands, the indices
+   * may not come from a client array and must come from an index buffer.
+   * If no element array buffer is bound, an INVALID_OPERATION error is
+   * generated.
+   */
+  if (!_mesa_is_bufferobj(ctx->Array.VAO->IndexBufferObj)) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glDrawElementsIndirect(no buffer bound "
+ "to GL_ELEMENT_ARRAY_BUFFER)");
+  } else {
+ DrawElementsIndirectCommand *cmd =
+(DrawElementsIndirectCommand *) indirect;
+
+ /* Convert offset to pointer */
+ void *offset = (void *)
+((cmd->firstIndex * _mesa_sizeof_type(type)) & 0xffffffffUL);
+
+ vbo_exec_DrawElementsInstancedBaseVertex(mode, cmd->count, type,

[Mesa-dev] [PATCH 16/18] mesa: add outstanding ARB_vertex_attrib_64bit dlist support

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/dlist.c | 178 ++
 1 file changed, 178 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 5ff0a23018c..ae23d292837 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -471,6 +471,10 @@ typedef enum
OPCODE_ATTR_2F_ARB,
OPCODE_ATTR_3F_ARB,
OPCODE_ATTR_4F_ARB,
+   OPCODE_ATTR_1D,
+   OPCODE_ATTR_2D,
+   OPCODE_ATTR_3D,
+   OPCODE_ATTR_4D,
OPCODE_MATERIAL,
OPCODE_BEGIN,
OPCODE_END,
@@ -6416,6 +6420,152 @@ save_VertexAttrib4fvARB(GLuint index, const GLfloat * v)
   index_error();
 }
 
+static void GLAPIENTRY
+save_VertexAttribL1d(GLuint index, GLdouble x)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS) {
+  Node *n;
+  SAVE_FLUSH_VERTICES(ctx);
+  n = alloc_instruction(ctx, OPCODE_ATTR_1D, 3);
+  if (n) {
+ n[1].ui = index;
+ ASSIGN_DOUBLE_TO_NODES(n, 2, x);
+  }
+
+  ctx->ListState.ActiveAttribSize[index] = 1;
+  memcpy(ctx->ListState.CurrentAttrib[index], &n[2], sizeof(GLdouble));
+
+  if (ctx->ExecuteFlag) {
+ CALL_VertexAttribL1d(ctx->Exec, (index, x));
+  }
+   } else {
+  index_error();
+   }
+}
+
+static void GLAPIENTRY
+save_VertexAttribL1dv(GLuint index, const GLdouble *v)
+{
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS)
+  save_VertexAttribL1d(index, v[0]);
+   else
+  index_error();
+}
+
+static void GLAPIENTRY
+save_VertexAttribL2d(GLuint index, GLdouble x, GLdouble y)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS) {
+  Node *n;
+  SAVE_FLUSH_VERTICES(ctx);
+  n = alloc_instruction(ctx, OPCODE_ATTR_2D, 5);
+  if (n) {
+ n[1].ui = index;
+ ASSIGN_DOUBLE_TO_NODES(n, 2, x);
+ ASSIGN_DOUBLE_TO_NODES(n, 4, y);
+  }
+
+  ctx->ListState.ActiveAttribSize[index] = 2;
+  memcpy(ctx->ListState.CurrentAttrib[index], &n[2],
+ 2 * sizeof(GLdouble));
+
+  if (ctx->ExecuteFlag) {
+ CALL_VertexAttribL2d(ctx->Exec, (index, x, y));
+  }
+   } else {
+  index_error();
+   }
+}
+
+static void GLAPIENTRY
+save_VertexAttribL2dv(GLuint index, const GLdouble *v)
+{
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS)
+  save_VertexAttribL2d(index, v[0], v[1]);
+   else
+  index_error();
+}
+
+static void GLAPIENTRY
+save_VertexAttribL3d(GLuint index, GLdouble x, GLdouble y, GLdouble z)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS) {
+  Node *n;
+  SAVE_FLUSH_VERTICES(ctx);
+  n = alloc_instruction(ctx, OPCODE_ATTR_3D, 7);
+  if (n) {
+ n[1].ui = index;
+ ASSIGN_DOUBLE_TO_NODES(n, 2, x);
+ ASSIGN_DOUBLE_TO_NODES(n, 4, y);
+ ASSIGN_DOUBLE_TO_NODES(n, 6, z);
+  }
+
+  ctx->ListState.ActiveAttribSize[index] = 3;
+  memcpy(ctx->ListState.CurrentAttrib[index], &n[2],
+ 3 * sizeof(GLdouble));
+
+  if (ctx->ExecuteFlag) {
+ CALL_VertexAttribL3d(ctx->Exec, (index, x, y, z));
+  }
+   } else {
+  index_error();
+   }
+}
+
+static void GLAPIENTRY
+save_VertexAttribL3dv(GLuint index, const GLdouble *v)
+{
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS)
+  save_VertexAttribL3d(index, v[0], v[1], v[2]);
+   else
+  index_error();
+}
+
+static void GLAPIENTRY
+save_VertexAttribL4d(GLuint index, GLdouble x, GLdouble y, GLdouble z,
+   GLdouble w)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS) {
+  Node *n;
+  SAVE_FLUSH_VERTICES(ctx);
+  n = alloc_instruction(ctx, OPCODE_ATTR_4D, 9);
+  if (n) {
+ n[1].ui = index;
+ ASSIGN_DOUBLE_TO_NODES(n, 2, x);
+ ASSIGN_DOUBLE_TO_NODES(n, 4, y);
+ ASSIGN_DOUBLE_TO_NODES(n, 6, z);
+ ASSIGN_DOUBLE_TO_NODES(n, 8, w);
+  }
+
+  ctx->ListState.ActiveAttribSize[index] = 4;
+  memcpy(ctx->ListState.CurrentAttrib[index], &n[2],
+ 4 * sizeof(GLdouble));
+
+  if (ctx->ExecuteFlag) {
+ CALL_VertexAttribL4d(ctx->Exec, (index, x, y, z, w));
+  }
+   } else {
+  index_error();
+   }
+}
+
+static void GLAPIENTRY
+save_VertexAttribL4dv(GLuint index, const GLdouble *v)
+{
+   if (index < MAX_VERTEX_GENERIC_ATTRIBS)
+  save_VertexAttribL4d(index, v[0], v[1], v[2], v[3]);
+   else
+  index_error();
+}
+
 static void GLAPIENTRY
 save_PrimitiveRestartNV(void)
 {
@@ -10360,6 +10510,26 @@ execute_list(struct gl_context *ctx, GLuint list)
  case OPCODE_ATTR_4F_ARB:
 CALL_VertexAttrib4fvARB(ctx->Exec, (n[1].e, &n[2].f));
 break;
+ case OPCODE_ATTR_1D: {
+GLdouble *d = (GLdouble *) &n[2];
+CALL_VertexAttribL1d(ctx->Exec, (n[1].ui, *d));
+break;
+ }
+ case OPCODE_ATTR_2D: {
+GLdouble *d = (GLdouble *) &n[2];
+CALL_VertexAttribL2dv(ctx->Exec, (n[1].ui, d));
+break;
+ }
+ case 

[Mesa-dev] [PATCH 05/18] mesa: add glUniformSubroutinesuiv() display list support

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/dlist.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index d49eebae00d..2425cf24f1b 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -523,6 +523,9 @@ typedef enum
/* ARB_uniform_buffer_object */
OPCODE_UNIFORM_BLOCK_BINDING,
 
+   /* ARB_shader_subroutines */
+   OPCODE_UNIFORM_SUBROUTINES,
+
/* EXT_polygon_offset_clamp */
OPCODE_POLYGON_OFFSET_CLAMP,
 
@@ -1161,6 +1164,7 @@ _mesa_delete_list(struct gl_context *ctx, struct 
gl_display_list *dlist)
  case OPCODE_PIXEL_MAP:
 free(get_pointer(&n[3]));
 break;
+ case OPCODE_UNIFORM_SUBROUTINES:
  case OPCODE_WINDOW_RECTANGLES:
 free(get_pointer(&n[3]));
 break;
@@ -8712,6 +8716,28 @@ save_UniformBlockBinding(GLuint prog, GLuint index, 
GLuint binding)
}
 }
 
+static void GLAPIENTRY
+save_UniformSubroutinesuiv(GLenum shadertype, GLsizei count,
+   const GLuint *indices)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_UNIFORM_SUBROUTINES, 2 + POINTER_DWORDS);
+   if (n) {
+  GLint *indices_copy = NULL;
+
+  if (count > 0)
+ indices_copy = memdup(indices, sizeof(GLuint) * count);
+  n[1].e = shadertype;
+  n[2].si = count;
+  save_pointer(&n[3], indices_copy);
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_UniformSubroutinesuiv(ctx->Exec, (shadertype, count, indices));
+   }
+}
+
 /** GL_EXT_window_rectangles */
 static void GLAPIENTRY
 save_WindowRectanglesEXT(GLenum mode, GLsizei count, const GLint *box)
@@ -10225,6 +10251,11 @@ execute_list(struct gl_context *ctx, GLuint list)
 CALL_UniformBlockBinding(ctx->Exec, (n[1].ui, n[2].ui, n[3].ui));
 break;
 
+ case OPCODE_UNIFORM_SUBROUTINES:
+CALL_UniformSubroutinesuiv(ctx->Exec, (n[1].e, n[2].si,
+   get_pointer(&n[3])));
+break;
+
  /* GL_EXT_window_rectangles */
  case OPCODE_WINDOW_RECTANGLES:
 CALL_WindowRectanglesEXT(
@@ -4,6 +11145,9 @@ _mesa_initialize_save_table(const struct gl_context 
*ctx)
/* GL_ARB_uniform_buffer_object */
SET_UniformBlockBinding(table, save_UniformBlockBinding);
 
+   /* GL_ARB_shader_subroutines */
+   SET_UniformSubroutinesuiv(table, save_UniformSubroutinesuiv);
+
/* GL_ARB_draw_instanced */
SET_DrawArraysInstancedARB(table, save_DrawArraysInstancedARB);
SET_DrawElementsInstancedARB(table, save_DrawElementsInstancedARB);
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
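
A display list has to own a private copy of the index array, because the caller may modify or free its buffer before the list is executed. The following is a minimal sketch of that copy step, assuming plain `malloc`/`memcpy` rather than Mesa's `memdup` helper; `copy_indices()` is a hypothetical name used only for illustration. The copy covers exactly `count` elements of the source array.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of `count` 32-bit indices, or NULL when count <= 0
 * or the allocation fails. The caller owns the result and must free() it. */
static uint32_t *
copy_indices(const uint32_t *indices, int count)
{
   if (count <= 0 || indices == NULL)
      return NULL;

   size_t bytes = (size_t)count * sizeof(uint32_t);
   uint32_t *copy = malloc(bytes);
   if (copy != NULL)
      memcpy(copy, indices, bytes); /* reads exactly `count` elements */
   return copy;
}
```

The stored pointer would then be released when the list is deleted, which is what the `_mesa_delete_list()` hunk in the patch arranges for the new opcode.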


[Mesa-dev] [PATCH 11/18] mesa: generate GL_INVALID_OPERATION using draw indirect in dlist

2018-06-28 Thread Timothy Arceri
The spec doesn't explicitly say to generate an error but since
DrawArraysInstanced* and DrawElementsInstanced* do, it makes
sense to do it for these functions also.
---
 src/mesa/main/dlist.c | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index e2ab2eb8aa1..5ff0a23018c 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -1913,6 +1913,47 @@ save_DrawElementsInstancedBaseVertexBaseInstance(UNUSED 
GLenum mode,
"glDrawElementsInstancedBaseVertexBaseInstance() during display 
list compile");
 }
 
+static void APIENTRY
+save_DrawArraysIndirect(UNUSED GLenum mode,
+UNUSED const void *indirect)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glDrawArraysIndirect() during display list compile");
+}
+
+static void APIENTRY
+save_DrawElementsIndirect(UNUSED GLenum mode,
+  UNUSED GLenum type,
+  UNUSED const void *indirect)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glDrawElementsIndirect() during display list compile");
+}
+
+static void APIENTRY
+save_MultiDrawArraysIndirect(UNUSED GLenum mode,
+ UNUSED const void *indirect,
+ UNUSED GLsizei primcount,
+ UNUSED GLsizei stride)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glMultiDrawArraysIndirect() during display list compile");
+}
+
+static void APIENTRY
+save_MultiDrawElementsIndirect(UNUSED GLenum mode,
+   UNUSED GLenum type,
+   UNUSED const void *indirect,
+   UNUSED GLsizei primcount,
+   UNUSED GLsizei stride)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   _mesa_error(ctx, GL_INVALID_OPERATION,
+   "glMultiDrawElementsIndirect() during display list compile");
+}
 
 /**
  * While building a display list we cache some OpenGL state.
@@ -11410,6 +11451,12 @@ _mesa_initialize_save_table(const struct gl_context 
*ctx)
SET_DrawElementsInstancedBaseInstance(table, 
save_DrawElementsInstancedBaseInstance);
SET_DrawElementsInstancedBaseVertexBaseInstance(table, 
save_DrawElementsInstancedBaseVertexBaseInstance);
 
+   /* GL_ARB_draw_indirect / GL_ARB_multi_draw_indirect */
+   SET_DrawArraysIndirect(table, save_DrawArraysIndirect);
+   SET_DrawElementsIndirect(table, save_DrawElementsIndirect);
+   SET_MultiDrawArraysIndirect(table, save_MultiDrawArraysIndirect);
+   SET_MultiDrawElementsIndirect(table, save_MultiDrawElementsIndirect);
+
/* OpenGL 4.2 / GL_ARB_separate_shader_objects */
SET_UseProgramStages(table, save_UseProgramStages);
SET_ProgramUniform1f(table, save_ProgramUniform1f);
-- 
2.17.1
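
The pattern in this patch, installing "save" entry points that only raise an error while a display list is being compiled, can be sketched without any GL headers. Everything below is a hypothetical stand-in (`sketch_context`, `sketch_dispatch`, `init_save_table()`); only the numeric value 0x0502 for GL_INVALID_OPERATION is taken from the GL headers.

```c
#include <stddef.h>

#define SKETCH_NO_ERROR          0u
#define SKETCH_INVALID_OPERATION 0x0502u /* value of GL_INVALID_OPERATION */

/* Hypothetical context: remembers the first error, like glGetError(). */
struct sketch_context {
   unsigned error;
   const char *where;
};

typedef void (*draw_arrays_indirect_fn)(struct sketch_context *ctx,
                                        unsigned mode, const void *indirect);

/* Hypothetical one-entry dispatch table. */
struct sketch_dispatch {
   draw_arrays_indirect_fn DrawArraysIndirect;
};

static void
record_error(struct sketch_context *ctx, unsigned error, const char *where)
{
   if (ctx->error == SKETCH_NO_ERROR) { /* keep only the first error */
      ctx->error = error;
      ctx->where = where;
   }
}

/* Save-time entry point: never draws, only flags the call as invalid. */
static void
save_draw_arrays_indirect(struct sketch_context *ctx,
                          unsigned mode, const void *indirect)
{
   (void)mode;
   (void)indirect;
   record_error(ctx, SKETCH_INVALID_OPERATION,
                "DrawArraysIndirect during display list compile");
}

static void
init_save_table(struct sketch_dispatch *table)
{
   table->DrawArraysIndirect = save_draw_arrays_indirect;
}
```

Routing the call through the table means the execute path needs no extra check for "am I inside a list"; the choice of table decides the behaviour.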



[Mesa-dev] [PATCH 14/18] mesa: add compat profile support for ARB_multi_draw_indirect

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/extensions_table.h |  2 +-
 src/mesa/vbo/vbo_exec_array.c| 75 +++-
 2 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 1446a4bd421..12b796777df 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -88,7 +88,7 @@ EXT(ARB_invalidate_subdata  , dummy_true
 EXT(ARB_map_buffer_alignment, dummy_true   
  , GLL, GLC,  x ,  x , 2011)
 EXT(ARB_map_buffer_range, ARB_map_buffer_range 
  , GLL, GLC,  x ,  x , 2008)
 EXT(ARB_multi_bind  , dummy_true   
  , GLL, GLC,  x ,  x , 2013)
-EXT(ARB_multi_draw_indirect , ARB_draw_indirect
  ,  x , GLC,  x ,  x , 2012)
+EXT(ARB_multi_draw_indirect , ARB_draw_indirect
  , GLL, GLC,  x ,  x , 2012)
 EXT(ARB_multisample , dummy_true   
  , GLL,  x ,  x ,  x , 1994)
 EXT(ARB_multitexture, dummy_true   
  , GLL,  x ,  x ,  x , 1998)
 EXT(ARB_occlusion_query , ARB_occlusion_query  
  , GLL,  x ,  x ,  x , 2001)
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index 0d92de2e3ad..4e24cdcf263 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1744,7 +1744,36 @@ vbo_exec_MultiDrawArraysIndirect(GLenum mode, const 
GLvoid *indirect,
 
/* If <stride> is zero, the array elements are treated as tightly packed. */
if (stride == 0)
-  stride = 4 * sizeof(GLuint);  /* sizeof(DrawArraysIndirectCommand) */
+  stride = sizeof(DrawArraysIndirectCommand);
+
+   /* From the ARB_draw_indirect spec:
+*
+*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
+*compatibility profile, this indicates that DrawArraysIndirect and
+*DrawElementsIndirect are to source their arguments directly from the
+*pointer passed as their <indirect> parameters."
+*/
+   if (ctx->API == API_OPENGL_COMPAT &&
+   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
+
+  if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
+   "glMultiDrawArraysIndirect"))
+ return;
+
+  const ubyte *ptr = (const ubyte *) indirect;
+  for (unsigned i = 0; i < primcount; i++) {
+ DrawArraysIndirectCommand *cmd = (DrawArraysIndirectCommand *) ptr;
+ _mesa_DrawArraysInstanced(mode, cmd->first, cmd->count, 
cmd->primCount);
+
+ if (stride == 0) {
+ptr += sizeof(DrawArraysIndirectCommand);
+ } else {
+ptr += stride;
+ }
+  }
+
+  return;
+   }
 
FLUSH_FOR_DRAW(ctx);
 
@@ -1783,7 +1812,49 @@ vbo_exec_MultiDrawElementsIndirect(GLenum mode, GLenum 
type,
 
/* If <stride> is zero, the array elements are treated as tightly packed. */
if (stride == 0)
-  stride = 5 * sizeof(GLuint);  /* sizeof(DrawElementsIndirectCommand) 
*/
+  stride = sizeof(DrawElementsIndirectCommand);
+
+
+   /* From the ARB_draw_indirect spec:
+*
+*"Initially zero is bound to DRAW_INDIRECT_BUFFER. In the
+*compatibility profile, this indicates that DrawArraysIndirect and
+*DrawElementsIndirect are to source their arguments directly from the
+*pointer passed as their <indirect> parameters."
+*/
+   if (ctx->API == API_OPENGL_COMPAT &&
+   !_mesa_is_bufferobj(ctx->DrawIndirectBuffer)) {
+  /*
+   * Unlike regular DrawElementsInstancedBaseVertex commands, the indices
+   * may not come from a client array and must come from an index buffer.
+   * If no element array buffer is bound, an INVALID_OPERATION error is
+   * generated.
+   */
+  if (!_mesa_is_bufferobj(ctx->Array.VAO->IndexBufferObj)) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glMultiDrawElementsIndirect(no buffer bound "
+ "to GL_ELEMENT_ARRAY_BUFFER)");
+
+ return;
+  }
+
+  if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
+   "glMultiDrawElementsIndirect"))
+ return;
+
+  const ubyte *ptr = (const ubyte *) indirect;
+  for (unsigned i = 0; i < primcount; i++) {
+ vbo_exec_DrawElementsIndirect(mode, type, ptr);
+
+ if (stride == 0) {
+ptr += sizeof(DrawElementsIndirectCommand);
+ } else {
+ptr += stride;
+ }
+  }
+
+  return;
+   }
 
FLUSH_FOR_DRAW(ctx);
 
-- 
2.17.1
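
The compat-profile path above reads the draw commands straight from client memory and steps through them by `stride`. A minimal sketch of that walk follows, under stated assumptions: the callback type `draw_fn`, the helper `multi_draw_arrays_from_client()` and the accumulator `sum_vertices()` are hypothetical, and the callback stands in for the real instanced draw call. The command layout matches the `DrawArraysIndirectCommand` struct added in patch 12.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
   uint32_t count;
   uint32_t primCount;
   uint32_t first;
   uint32_t reservedMustBeZero;
} DrawArraysIndirectCommand;

/* Hypothetical callback standing in for the instanced draw call. */
typedef void (*draw_fn)(uint32_t first, uint32_t count,
                        uint32_t prim_count, void *user);

/* Walk `primcount` commands starting at `indirect`. A stride of zero
 * means the commands are tightly packed. */
static void
multi_draw_arrays_from_client(const void *indirect, size_t primcount,
                              size_t stride, draw_fn draw, void *user)
{
   if (stride == 0)
      stride = sizeof(DrawArraysIndirectCommand);

   const unsigned char *ptr = indirect;
   for (size_t i = 0; i < primcount; i++) {
      DrawArraysIndirectCommand cmd;
      memcpy(&cmd, ptr, sizeof(cmd)); /* avoids unaligned access */
      draw(cmd.first, cmd.count, cmd.primCount, user);
      ptr += stride;
   }
}

/* Example callback: accumulate the number of vertices submitted. */
static void
sum_vertices(uint32_t first, uint32_t count, uint32_t prim_count, void *user)
{
   (void)first;
   *(uint64_t *)user += (uint64_t)count * prim_count;
}
```

Resolving the zero stride once before the loop keeps the loop body free of special cases, since every iteration can then advance by the same amount.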



[Mesa-dev] [PATCH 08/18] mesa: enable ARB_viewport_array in compat profile

2018-06-28 Thread Timothy Arceri
---
 src/mapi/glapi/gen/apiexec.py   | 16 
 src/mesa/main/extensions_table.h|  2 +-
 src/mesa/main/tests/dispatch_sanity.cpp | 17 +
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index e69c6b4df16..1a91785d375 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -131,14 +131,14 @@ functions = {
 #
 # Mesa does not support either of the geometry shader extensions, so
 # OpenGL 3.2 is required.
-"ViewportArrayv": exec_info(core=32, es2=31),
-"ViewportIndexedf": exec_info(core=32, es2=31),
-"ViewportIndexedfv": exec_info(core=32, es2=31),
-"ScissorArrayv": exec_info(core=32, es2=31),
-"ScissorIndexed": exec_info(core=32, es2=31),
-"ScissorIndexedv": exec_info(core=32, es2=31),
-"DepthRangeArrayv": exec_info(core=32),
-"DepthRangeIndexed": exec_info(core=32),
+"ViewportArrayv": exec_info(compatibility=32, core=32, es2=31),
+"ViewportIndexedf": exec_info(compatibility=32, core=32, es2=31),
+"ViewportIndexedfv": exec_info(compatibility=32, core=32, es2=31),
+"ScissorArrayv": exec_info(compatibility=32, core=32, es2=31),
+"ScissorIndexed": exec_info(compatibility=32, core=32, es2=31),
+"ScissorIndexedv": exec_info(compatibility=32, core=32, es2=31),
+"DepthRangeArrayv": exec_info(compatibility=32, core=32),
+"DepthRangeIndexed": exec_info(compatibility=32, core=32),
 # GetFloati_v also GL_ARB_shader_atomic_counters
 # GetDoublei_v also GL_ARB_shader_atomic_counters
 
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 11345febe2e..f04fea9e3bc 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -182,7 +182,7 @@ EXT(ARB_vertex_program  , 
ARB_vertex_program
 EXT(ARB_vertex_shader   , ARB_vertex_shader
  , GLL, GLC,  x ,  x , 2002)
 EXT(ARB_vertex_type_10f_11f_11f_rev , ARB_vertex_type_10f_11f_11f_rev  
  , GLL, GLC,  x ,  x , 2013)
 EXT(ARB_vertex_type_2_10_10_10_rev  , ARB_vertex_type_2_10_10_10_rev   
  , GLL, GLC,  x ,  x , 2009)
-EXT(ARB_viewport_array  , ARB_viewport_array   
  ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_viewport_array  , ARB_viewport_array   
  , GLL, GLC,  x ,  x , 2010)
 EXT(ARB_window_pos  , dummy_true   
  , GLL,  x ,  x ,  x , 2001)
 
 EXT(ATI_blend_equation_separate , EXT_blend_equation_separate  
  , GLL, GLC,  x ,  x , 2003)
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index ed99f1a1957..085b1f7dd5f 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -884,6 +884,15 @@ const struct function common_desktop_functions_possible[] 
= {
 // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 
+   { "glViewportArrayv", 43, -1 },
+   { "glViewportIndexedf", 43, -1 },
+   { "glViewportIndexedfv", 43, -1 },
+   { "glScissorArrayv", 43, -1 },
+   { "glScissorIndexed", 43, -1 },
+   { "glScissorIndexedv", 43, -1 },
+   { "glDepthRangeArrayv", 43, -1 },
+   { "glDepthRangeIndexed", 43, -1 },
+
 /* GL 4.5 */
/* aliased versions checked above */
//{ "glGetGraphicsResetStatus", 45, -1 },
@@ -1766,14 +1775,6 @@ const struct function gl_core_functions_possible[] = {
{ "glVertexAttribL4dv", 41, -1 },
{ "glVertexAttribLPointer", 41, -1 },
{ "glGetVertexAttribLdv", 41, -1 },
-   { "glViewportArrayv", 43, -1 },
-   { "glViewportIndexedf", 43, -1 },
-   { "glViewportIndexedfv", 43, -1 },
-   { "glScissorArrayv", 43, -1 },
-   { "glScissorIndexed", 43, -1 },
-   { "glScissorIndexedv", 43, -1 },
-   { "glDepthRangeArrayv", 43, -1 },
-   { "glDepthRangeIndexed", 43, -1 },
 
 // { "glCreateSyncFromCLeventARB", 43, -1 },// XXX: Add to xml
 
-- 
2.17.1



[Mesa-dev] [PATCH 13/18] mesa: make valid_draw_indirect_multi() accessible externally

2018-06-28 Thread Timothy Arceri
We will use this to add compat support to ARB_multi_draw_indirect
in the following patch.
---
 src/mesa/main/draw_validate.c | 24 
 src/mesa/main/draw_validate.h |  3 +++
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/mesa/main/draw_validate.c b/src/mesa/main/draw_validate.c
index 352263c5c78..c0a234a2bc2 100644
--- a/src/mesa/main/draw_validate.c
+++ b/src/mesa/main/draw_validate.c
@@ -1192,10 +1192,10 @@ valid_draw_indirect_elements(struct gl_context *ctx,
return valid_draw_indirect(ctx, mode, indirect, size, name);
 }
 
-static inline GLboolean
-valid_draw_indirect_multi(struct gl_context *ctx,
-  GLsizei primcount, GLsizei stride,
-  const char *name)
+GLboolean
+_mesa_valid_draw_indirect_multi(struct gl_context *ctx,
+GLsizei primcount, GLsizei stride,
+const char *name)
 {
 
/* From the ARB_multi_draw_indirect specification:
@@ -1259,8 +1259,8 @@ _mesa_validate_MultiDrawArraysIndirect(struct gl_context *ctx,
/* caller has converted stride==0 to drawArraysNumParams * sizeof(GLuint) */
assert(stride != 0);
 
-   if (!valid_draw_indirect_multi(ctx, primcount, stride,
-  "glMultiDrawArraysIndirect"))
+   if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
+"glMultiDrawArraysIndirect"))
   return GL_FALSE;
 
/* number of bytes of the indirect buffer which will be read */
@@ -1287,8 +1287,8 @@ _mesa_validate_MultiDrawElementsIndirect(struct gl_context *ctx,
/* caller has converted stride==0 to drawElementsNumParams * sizeof(GLuint) */
assert(stride != 0);
 
-   if (!valid_draw_indirect_multi(ctx, primcount, stride,
-  "glMultiDrawElementsIndirect"))
+   if (!_mesa_valid_draw_indirect_multi(ctx, primcount, stride,
+"glMultiDrawElementsIndirect"))
   return GL_FALSE;
 
/* number of bytes of the indirect buffer which will be read */
@@ -1366,8 +1366,8 @@ _mesa_validate_MultiDrawArraysIndirectCount(struct gl_context *ctx,
/* caller has converted stride==0 to drawArraysNumParams * sizeof(GLuint) */
assert(stride != 0);
 
-   if (!valid_draw_indirect_multi(ctx, maxdrawcount, stride,
-  "glMultiDrawArraysIndirectCountARB"))
+   if (!_mesa_valid_draw_indirect_multi(ctx, maxdrawcount, stride,
+"glMultiDrawArraysIndirectCountARB"))
   return GL_FALSE;
 
/* number of bytes of the indirect buffer which will be read */
@@ -1397,8 +1397,8 @@ _mesa_validate_MultiDrawElementsIndirectCount(struct gl_context *ctx,
/* caller has converted stride==0 to drawElementsNumParams * sizeof(GLuint) */
assert(stride != 0);
 
-   if (!valid_draw_indirect_multi(ctx, maxdrawcount, stride,
-  "glMultiDrawElementsIndirectCountARB"))
+   if (!_mesa_valid_draw_indirect_multi(ctx, maxdrawcount, stride,
+"glMultiDrawElementsIndirectCountARB"))
   return GL_FALSE;
 
/* number of bytes of the indirect buffer which will be read */
diff --git a/src/mesa/main/draw_validate.h b/src/mesa/main/draw_validate.h
index 7a181153fb7..d015c7e830e 100644
--- a/src/mesa/main/draw_validate.h
+++ b/src/mesa/main/draw_validate.h
@@ -44,6 +44,9 @@ _mesa_is_valid_prim_mode(const struct gl_context *ctx, GLenum mode);
 extern GLboolean
 _mesa_valid_prim_mode(struct gl_context *ctx, GLenum mode, const char *name);
 
+extern GLboolean
+_mesa_valid_draw_indirect_multi(struct gl_context *ctx, GLsizei primcount,
+GLsizei stride, const char *name);
 
 extern GLboolean
 _mesa_validate_DrawArrays(struct gl_context *ctx, GLenum mode, GLsizei count);
-- 
2.17.1



[Mesa-dev] [PATCH 06/18] mesa: enable ARB_shader_subroutine in compat profile

2018-06-28 Thread Timothy Arceri
---
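Reader's note: the dispatch_sanity.cpp part of this patch moves rows from
the core-only table into the table shared by both desktop profiles. Below
is a minimal sketch of that idea, assuming each row is { name, version,
offset } and that a compat context is checked against the shared table
only. The struct, field and function names are hypothetical; the row
values are copied from the hunks in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of one table row; field names are assumptions. */
struct sketch_function {
   const char *name;
   unsigned version;   /* e.g. 40 for GL 4.0 */
   int offset;         /* -1 in every row shown in the patch */
};

/* Rows expected in both desktop profiles after the patch. */
static const struct sketch_function sketch_common_desktop[] = {
   { "glBlendFunci", 40, -1 },
   { "glGetSubroutineIndex", 40, -1 },
   { "glUniformSubroutinesuiv", 40, -1 },
   { NULL, 0, -1 }
};

/* Rows that stay core-only. */
static const struct sketch_function sketch_core_only[] = {
   { "glFramebufferTexture", 32, -1 },
   { NULL, 0, -1 }
};

static bool
sketch_listed(const struct sketch_function *table, const char *name)
{
   assert(table != NULL && name != NULL);
   for (; table->name != NULL; table++) {
      if (strcmp(table->name, name) == 0)
         return true;
   }
   return false;
}

/* A core context sees both tables, a compat context only the shared one. */
static bool
sketch_expected(bool core_profile, const char *name)
{
   if (sketch_listed(sketch_common_desktop, name))
      return true;
   return core_profile && sketch_listed(sketch_core_only, name);
}
```

With the rows moved, sketch_expected(false, "glGetSubroutineIndex") is
true, which is the effect the patch is after for compat contexts.
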
 src/mapi/glapi/gen/apiexec.py   | 16 
 src/mesa/main/extensions_table.h|  2 +-
 src/mesa/main/tests/dispatch_sanity.cpp | 19 +--
 3 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index 00c80171274..e69c6b4df16 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -73,14 +73,14 @@ functions = {
 
 # OpenGL 4.0 / GL_ARB_shader_subroutines. Mesa only exposes this
 # extension with core profile.
-"GetSubroutineUniformLocation": exec_info(core=31),
-"GetSubroutineIndex": exec_info(core=31),
-"GetActiveSubroutineUniformiv": exec_info(core=31),
-"GetActiveSubroutineUniformName": exec_info(core=31),
-"GetActiveSubroutineName": exec_info(core=31),
-"UniformSubroutinesuiv": exec_info(core=31),
-"GetUniformSubroutineuiv": exec_info(core=31),
-"GetProgramStageiv": exec_info(core=31),
+"GetSubroutineUniformLocation": exec_info(compatibility=31, core=31),
+"GetSubroutineIndex": exec_info(compatibility=31, core=31),
+"GetActiveSubroutineUniformiv": exec_info(compatibility=31, core=31),
+"GetActiveSubroutineUniformName": exec_info(compatibility=31, core=31),
+"GetActiveSubroutineName": exec_info(compatibility=31, core=31),
+"UniformSubroutinesuiv": exec_info(compatibility=31, core=31),
+"GetUniformSubroutineuiv": exec_info(compatibility=31, core=31),
+"GetProgramStageiv": exec_info(compatibility=31, core=31),
 
 # OpenGL 4.0 / GL_ARB_gpu_shader_fp64.  The extension spec says:
 #
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 5fe2e88fe98..11345febe2e 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -123,7 +123,7 @@ EXT(ARB_shader_objects  , dummy_true
 EXT(ARB_shader_precision, ARB_shader_precision , GLL, GLC,  x ,  x , 2010)
 EXT(ARB_shader_stencil_export   , ARB_shader_stencil_export , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_shader_storage_buffer_object, ARB_shader_storage_buffer_object , GLL, GLC,  x ,  x , 2012)
-EXT(ARB_shader_subroutine   , dummy_true ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_shader_subroutine   , dummy_true ,  31, GLC,  x ,  x , 2010)
 EXT(ARB_shader_texture_image_samples, ARB_shader_texture_image_samples , GLL, GLC,  x ,  x , 2014)
 EXT(ARB_shader_texture_lod  , ARB_shader_texture_lod , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_shader_viewport_layer_array , ARB_shader_viewport_layer_array ,  x , GLC,  x ,  x , 2015)
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp
index 6b319d8b030..ed99f1a1957 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -575,6 +575,15 @@ const struct function common_desktop_functions_possible[] = {
{ "glBlendFunci", 40, -1 },
{ "glBlendFuncSeparatei", 40, -1 },
 
+   { "glGetSubroutineUniformLocation", 40, -1 },
+   { "glGetSubroutineIndex", 40, -1 },
+   { "glGetActiveSubroutineUniformiv", 40, -1 },
+   { "glGetActiveSubroutineUniformName", 40, -1 },
+   { "glGetActiveSubroutineName", 40, -1 },
+   { "glUniformSubroutinesuiv", 40, -1 },
+   { "glGetUniformSubroutineuiv", 40, -1 },
+   { "glGetProgramStageiv", 40, -1 },
+
{ "glUniform1d", 40, -1 },
{ "glUniform2d", 40, -1 },
{ "glUniform3d", 40, -1 },
@@ -1547,16 +1556,6 @@ const struct function gl_core_functions_possible[] = {
/* GL 3.2 */
{ "glFramebufferTexture", 32, -1 },
 
-   /* GL 4.0 */
-   { "glGetSubroutineUniformLocation", 40, -1 },
-   { "glGetSubroutineIndex", 40, -1 },
-   { "glGetActiveSubroutineUniformiv", 40, -1 },
-   { "glGetActiveSubroutineUniformName", 40, -1 },
-   { "glGetActiveSubroutineName", 40, -1 },
-   { "glUniformSubroutinesuiv", 40, -1 },
-   { "glGetUniformSubroutineuiv", 40, -1 },
-   { "glGetProgramStageiv", 40, -1 },
-
/* GL 4.3 */
{ "glIsRenderbuffer", 43, -1 },
{ "glBindRenderbuffer", 43, -1 },
-- 
2.17.1



[Mesa-dev] [PATCH 02/18] mesa: add ProgramUniform*d display list support

2018-06-28 Thread Timothy Arceri
This is required for fp64 to be enabled in compat profile.
---
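Reader's note: the save_ProgramUniform{1,2,3,4}d functions below pass 4,
6, 8 and 10 to alloc_instruction() and store their doubles at node indices
3, 5, 7 and 9. A minimal sketch of that arithmetic, assuming two 32-bit
slots for program and location followed by two slots per double. The
function names are hypothetical.

```c
#include <assert.h>

/* Parameter slots for a ProgramUniform<N>d record: one for the program,
 * one for the location, then two 32-bit slots per double. */
static unsigned
sketch_num_params(unsigned num_doubles)
{
   assert(num_doubles >= 1 && num_doubles <= 4);
   return 2 + 2 * num_doubles;
}

/* Node index of the k-th double (k counted from 0). Index 0 is assumed to
 * hold the opcode, 1 the program and 2 the location. */
static unsigned
sketch_double_slot(unsigned k)
{
   return 3 + 2 * k;
}
```

The last double of an N-component call ends at index
sketch_double_slot(N - 1) + 1, which equals sketch_num_params(N), so the
counts used in the patch leave no unused slot and do not overrun.
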
 src/mesa/main/dlist.c | 514 ++
 1 file changed, 514 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index aec373b7ab1..d49eebae00d 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -399,6 +399,14 @@ typedef enum
OPCODE_PROGRAM_UNIFORM_2FV,
OPCODE_PROGRAM_UNIFORM_3FV,
OPCODE_PROGRAM_UNIFORM_4FV,
+   OPCODE_PROGRAM_UNIFORM_1D,
+   OPCODE_PROGRAM_UNIFORM_2D,
+   OPCODE_PROGRAM_UNIFORM_3D,
+   OPCODE_PROGRAM_UNIFORM_4D,
+   OPCODE_PROGRAM_UNIFORM_1DV,
+   OPCODE_PROGRAM_UNIFORM_2DV,
+   OPCODE_PROGRAM_UNIFORM_3DV,
+   OPCODE_PROGRAM_UNIFORM_4DV,
OPCODE_PROGRAM_UNIFORM_1I,
OPCODE_PROGRAM_UNIFORM_2I,
OPCODE_PROGRAM_UNIFORM_3I,
@@ -424,6 +432,15 @@ typedef enum
OPCODE_PROGRAM_UNIFORM_MATRIX42F,
OPCODE_PROGRAM_UNIFORM_MATRIX34F,
OPCODE_PROGRAM_UNIFORM_MATRIX43F,
+   OPCODE_PROGRAM_UNIFORM_MATRIX22D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX33D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX44D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX23D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX32D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX24D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX42D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX34D,
+   OPCODE_PROGRAM_UNIFORM_MATRIX43D,
 
/* GL_ARB_clip_control */
OPCODE_CLIP_CONTROL,
@@ -1107,6 +1124,10 @@ _mesa_delete_list(struct gl_context *ctx, struct gl_display_list *dlist)
  case OPCODE_PROGRAM_UNIFORM_2FV:
  case OPCODE_PROGRAM_UNIFORM_3FV:
  case OPCODE_PROGRAM_UNIFORM_4FV:
+ case OPCODE_PROGRAM_UNIFORM_1DV:
+ case OPCODE_PROGRAM_UNIFORM_2DV:
+ case OPCODE_PROGRAM_UNIFORM_3DV:
+ case OPCODE_PROGRAM_UNIFORM_4DV:
  case OPCODE_PROGRAM_UNIFORM_1IV:
  case OPCODE_PROGRAM_UNIFORM_2IV:
  case OPCODE_PROGRAM_UNIFORM_3IV:
@@ -1126,6 +1147,15 @@ _mesa_delete_list(struct gl_context *ctx, struct gl_display_list *dlist)
  case OPCODE_PROGRAM_UNIFORM_MATRIX32F:
  case OPCODE_PROGRAM_UNIFORM_MATRIX34F:
  case OPCODE_PROGRAM_UNIFORM_MATRIX43F:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX22D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX33D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX44D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX24D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX42D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX23D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX32D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX34D:
+ case OPCODE_PROGRAM_UNIFORM_MATRIX43D:
 free(get_pointer(&n[5]));
 break;
  case OPCODE_PIXEL_MAP:
@@ -7486,6 +7516,158 @@ save_ProgramUniform4fv(GLuint program, GLint location, GLsizei count,
}
 }
 
+static void GLAPIENTRY
+save_ProgramUniform1d(GLuint program, GLint location, GLdouble x)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_PROGRAM_UNIFORM_1D, 4);
+   if (n) {
+  n[1].ui = program;
+  n[2].i = location;
+  ASSIGN_DOUBLE_TO_NODES(n, 3, x);
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ProgramUniform1d(ctx->Exec, (program, location, x));
+   }
+}
+
+static void GLAPIENTRY
+save_ProgramUniform2d(GLuint program, GLint location, GLdouble x, GLdouble y)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_PROGRAM_UNIFORM_2D, 6);
+   if (n) {
+  n[1].ui = program;
+  n[2].i = location;
+  ASSIGN_DOUBLE_TO_NODES(n, 3, x);
+  ASSIGN_DOUBLE_TO_NODES(n, 5, y);
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ProgramUniform2d(ctx->Exec, (program, location, x, y));
+   }
+}
+
+static void GLAPIENTRY
+save_ProgramUniform3d(GLuint program, GLint location,
+  GLdouble x, GLdouble y, GLdouble z)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_PROGRAM_UNIFORM_3D, 8);
+   if (n) {
+  n[1].ui = program;
+  n[2].i = location;
+  ASSIGN_DOUBLE_TO_NODES(n, 3, x);
+  ASSIGN_DOUBLE_TO_NODES(n, 5, y);
+  ASSIGN_DOUBLE_TO_NODES(n, 7, z);
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ProgramUniform3d(ctx->Exec, (program, location, x, y, z));
+   }
+}
+
+static void GLAPIENTRY
+save_ProgramUniform4d(GLuint program, GLint location,
+  GLdouble x, GLdouble y, GLdouble z, GLdouble w)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_PROGRAM_UNIFORM_4D, 10);
+   if (n) {
+  n[1].ui = program;
+  n[2].i = location;
+  ASSIGN_DOUBLE_TO_NODES(n, 3, x);
+  ASSIGN_DOUBLE_TO_NODES(n, 5, y);
+  ASSIGN_DOUBLE_TO_NODES(n, 7, z);
+  ASSIGN_DOUBLE_TO_NODES(n, 9, w);
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ProgramUniform4d(ctx->Exec, (program, location, x, y, z, w));
+   }
+}
+
+static 

[Mesa-dev] [PATCH 09/18] mesa: expose some ARB_viewport_array dependent extensions in compat

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/extensions_table.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index f04fea9e3bc..f79a52cee8c 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -16,7 +16,7 @@ EXT(AMD_seamless_cubemap_per_texture, AMD_seamless_cubemap_per_texture
 EXT(AMD_shader_stencil_export   , ARB_shader_stencil_export , GLL, GLC,  x ,  x , 2009)
 EXT(AMD_shader_trinary_minmax   , dummy_true , GLL, GLC,  x ,  x , 2012)
 EXT(AMD_vertex_shader_layer , AMD_vertex_shader_layer , GLL, GLC,  x ,  x , 2012)
-EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index ,  x , GLC,  x ,  x , 2012)
+EXT(AMD_vertex_shader_viewport_index, AMD_vertex_shader_viewport_index , GLL, GLC,  x ,  x , 2012)
 
 EXT(ANDROID_extension_pack_es31a, ANDROID_extension_pack_es31a ,  x ,  x ,  x ,  31, 2014)
 
@@ -64,7 +64,7 @@ EXT(ARB_enhanced_layouts, ARB_enhanced_layouts
 EXT(ARB_explicit_attrib_location, ARB_explicit_attrib_location , GLL, GLC,  x ,  x , 2009)
 EXT(ARB_explicit_uniform_location   , ARB_explicit_uniform_location , GLL, GLC,  x ,  x , 2012)
 EXT(ARB_fragment_coord_conventions  , ARB_fragment_coord_conventions , GLL, GLC,  x ,  x , 2009)
-EXT(ARB_fragment_layer_viewport , ARB_fragment_layer_viewport ,  x , GLC,  x ,  x , 2012)
+EXT(ARB_fragment_layer_viewport , ARB_fragment_layer_viewport , GLL, GLC,  x ,  x , 2012)
 EXT(ARB_fragment_program, ARB_fragment_program , GLL,  x ,  x ,  x , 2002)
 EXT(ARB_fragment_program_shadow , ARB_fragment_program_shadow , GLL,  x ,  x ,  x , 2003)
 EXT(ARB_fragment_shader , ARB_fragment_shader , GLL, GLC,  x ,  x , 2002)
-- 
2.17.1



[Mesa-dev] [PATCH 07/18] mesa: add ARB_viewport_array display list support

2018-06-28 Thread Timothy Arceri
---
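Reader's note: the *Arrayv save functions below copy the caller's array
with memdup() when the call is recorded, so that replaying the list later
does not depend on memory the application may have changed or freed. A
minimal sketch of that ownership pattern follows, assuming four floats per
viewport as in save_ViewportArrayv. sketch_memdup, the record struct and
the other names are hypothetical stand-ins, not Mesa's helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for memdup(): returns a heap copy of size bytes,
 * or NULL when size is zero or the allocation fails. */
static void *
sketch_memdup(const void *src, size_t size)
{
   void *copy;

   if (size == 0)
      return NULL;
   copy = malloc(size);
   if (copy != NULL)
      memcpy(copy, src, size);
   return copy;
}

/* Hypothetical record of one ViewportArrayv call. */
struct sketch_viewport_record {
   unsigned first;
   int count;
   float *values;   /* owned copy, count * 4 floats */
};

/* Record the call; returns false only if a needed copy could not be made. */
static bool
sketch_record_viewports(struct sketch_viewport_record *rec,
                        unsigned first, int count, const float *v)
{
   assert(rec != NULL && count >= 0);
   rec->first = first;
   rec->count = count;
   rec->values = sketch_memdup(v, (size_t)count * 4 * sizeof(float));
   return count == 0 || rec->values != NULL;
}

/* Counterpart of the free(get_pointer(...)) cases in _mesa_delete_list. */
static void
sketch_free_record(struct sketch_viewport_record *rec)
{
   free(rec->values);
   rec->values = NULL;
}
```

Because the record owns its copy, the delete path has to free it, which is
why the patch also adds the new *_ARRAY_V opcodes to _mesa_delete_list.
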
 src/mesa/main/dlist.c | 211 ++
 1 file changed, 211 insertions(+)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 2425cf24f1b..b2b1f723a17 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -290,6 +290,15 @@ typedef enum
OPCODE_TRANSLATE,
OPCODE_VIEWPORT,
OPCODE_WINDOW_POS,
+   /* ARB_viewport_array */
+   OPCODE_VIEWPORT_ARRAY_V,
+   OPCODE_VIEWPORT_INDEXED_F,
+   OPCODE_VIEWPORT_INDEXED_FV,
+   OPCODE_SCISSOR_ARRAY_V,
+   OPCODE_SCISSOR_INDEXED,
+   OPCODE_SCISSOR_INDEXED_V,
+   OPCODE_DEPTH_ARRAY_V,
+   OPCODE_DEPTH_INDEXED,
/* GL_ARB_multitexture */
OPCODE_ACTIVE_TEXTURE,
/* GL_ARB_texture_compression */
@@ -1164,6 +1173,9 @@ _mesa_delete_list(struct gl_context *ctx, struct gl_display_list *dlist)
  case OPCODE_PIXEL_MAP:
 free(get_pointer(&n[3]));
 break;
+ case OPCODE_VIEWPORT_ARRAY_V:
+ case OPCODE_SCISSOR_ARRAY_V:
+ case OPCODE_DEPTH_ARRAY_V:
  case OPCODE_UNIFORM_SUBROUTINES:
  case OPCODE_WINDOW_RECTANGLES:
 free(get_pointer(&n[3]));
@@ -4612,6 +4624,154 @@ save_Viewport(GLint x, GLint y, GLsizei width, GLsizei height)
}
 }
 
+static void GLAPIENTRY
+save_ViewportIndexedf(GLuint index, GLfloat x, GLfloat y, GLfloat width,
+  GLfloat height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_VIEWPORT_INDEXED_F, 5);
+   if (n) {
+  n[1].ui = index;
+  n[2].f = x;
+  n[3].f = y;
+  n[4].f = width;
+  n[5].f = height;
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ViewportIndexedf(ctx->Exec, (index, x, y, width, height));
+   }
+}
+
+static void GLAPIENTRY
+save_ViewportIndexedfv(GLuint index, const GLfloat *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_VIEWPORT_INDEXED_FV, 5);
+   if (n) {
+  n[1].ui = index;
+  n[2].f = v[0];
+  n[3].f = v[1];
+  n[4].f = v[2];
+  n[5].f = v[3];
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ViewportIndexedfv(ctx->Exec, (index, v));
+   }
+}
+
+static void GLAPIENTRY
+save_ViewportArrayv(GLuint first, GLsizei count, const GLfloat *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_VIEWPORT_ARRAY_V, 2 + POINTER_DWORDS);
+   if (n) {
+  n[1].ui = first;
+  n[2].si = count;
+  save_pointer(&n[3], memdup(v, count * 4 * sizeof(GLfloat)));
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ViewportArrayv(ctx->Exec, (first, count, v));
+   }
+}
+
+static void GLAPIENTRY
+save_ScissorIndexed(GLuint index, GLint left, GLint bottom, GLsizei width,
+GLsizei height)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_SCISSOR_INDEXED, 5);
+   if (n) {
+  n[1].ui = index;
+  n[2].i = left;
+  n[3].i = bottom;
+  n[4].si = width;
+  n[5].si = height;
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ScissorIndexed(ctx->Exec, (index, left, bottom, width, height));
+   }
+}
+
+static void GLAPIENTRY
+save_ScissorIndexedv(GLuint index, const GLint *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_SCISSOR_INDEXED_V, 5);
+   if (n) {
+  n[1].ui = index;
+  n[2].i = v[0];
+  n[3].i = v[1];
+  n[4].si = v[2];
+  n[5].si = v[3];
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ScissorIndexedv(ctx->Exec, (index, v));
+   }
+}
+
+static void GLAPIENTRY
+save_ScissorArrayv(GLuint first, GLsizei count, const GLint *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, OPCODE_SCISSOR_ARRAY_V, 2 + POINTER_DWORDS);
+   if (n) {
+  n[1].ui = first;
+  n[2].si = count;
+  save_pointer(&n[3], memdup(v, count * 4 * sizeof(GLint)));
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_ScissorArrayv(ctx->Exec, (first, count, v));
+   }
+}
+
+static void GLAPIENTRY
+save_DepthRangeIndexed(GLuint index, GLclampd n, GLclampd f)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *node;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   node = alloc_instruction(ctx, OPCODE_DEPTH_INDEXED, 3);
+   if (node) {
+  node[1].ui = index;
+  /* Mesa stores these as floats internally so we deliberately convert
+   * them to a float here.
+   */
+  node[2].f = n;
+  node[3].f = f;
+   }
+   if (ctx->ExecuteFlag) {
+  CALL_DepthRangeIndexed(ctx->Exec, (index, n, f));
+   }
+}
+
+static void GLAPIENTRY
+save_DepthRangeArrayv(GLuint first, GLsizei count, const GLclampd *v)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   Node *n;
+   ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
+   n = alloc_instruction(ctx, 

[Mesa-dev] Radeonsi OpenGL 4.4 compat profile support

2018-06-28 Thread Timothy Arceri
Sorry to keep spamming the list with this stuff, but Dave helped
out with ARB_vertex_attrib_64bit support and the spec bug I
submitted for indirect compute dispatch was resolved, so it
seemed like a good idea to send it all out again together with
these updates.

Pretty much everything has corresponding piglit tests, but I've
also been testing with a few games, and I'm now seeing games such
as Doom and Wolfenstein working on Wine where previously the
version overrides were not enough to get them to work.

There has also been a report that proper compat support fixes
some issues with Dying Light.

Please review.




[Mesa-dev] [PATCH 01/18] mesa: add Uniform*d support to display lists

2018-06-28 Thread Timothy Arceri
This is required so we can enable fp64 support in compat profile.
---
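Reader's note: the ASSIGN_DOUBLE_TO_NODES macro added by this patch splits
a 64-bit double across two 32-bit display list nodes through a union. The
following is a minimal sketch of that round trip. The sketch_ names are
hypothetical, the node type is reduced to a single member, and the reading
side is an assumed inverse, since the replay code is cut off in this
archived copy of the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit node; Mesa's real Node union has more members. */
typedef union {
   uint32_t ui;
} sketch_node;

/* Same shape as the float64_pair union in the patch. */
union sketch_float64_pair {
   double d;
   uint32_t uint32[2];
};

/* Store one double in nodes idx and idx + 1. */
#define SKETCH_ASSIGN_DOUBLE_TO_NODES(n, idx, value) \
   do {                                              \
      union sketch_float64_pair tmp;                 \
      tmp.d = (value);                               \
      (n)[(idx)].ui = tmp.uint32[0];                 \
      (n)[(idx) + 1].ui = tmp.uint32[1];             \
   } while (0)

/* Assumed inverse: rebuild the double from the same two nodes. */
static double
sketch_nodes_to_double(const sketch_node *n, unsigned idx)
{
   union sketch_float64_pair tmp;

   assert(sizeof(double) == 2 * sizeof(uint32_t));
   tmp.uint32[0] = n[idx].ui;
   tmp.uint32[1] = n[idx + 1].ui;
   return tmp.d;
}
```

Since both halves are written and read on the same machine in the same
order, the round trip is bit-exact regardless of endianness. The sketch
parenthesizes the macro arguments, which the macro in the patch does not.
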
 src/mapi/glapi/gen/apiexec.py |  36 +--
 src/mesa/main/dlist.c | 493 ++
 2 files changed, 511 insertions(+), 18 deletions(-)

diff --git a/src/mapi/glapi/gen/apiexec.py b/src/mapi/glapi/gen/apiexec.py
index 20d6239ba14..00c80171274 100644
--- a/src/mapi/glapi/gen/apiexec.py
+++ b/src/mapi/glapi/gen/apiexec.py
@@ -85,24 +85,24 @@ functions = {
 # OpenGL 4.0 / GL_ARB_gpu_shader_fp64.  The extension spec says:
 #
 # "OpenGL 3.2 and GLSL 1.50 are required."
-"Uniform1d": exec_info(core=32),
-"Uniform2d": exec_info(core=32),
-"Uniform3d": exec_info(core=32),
-"Uniform4d": exec_info(core=32),
-"Uniform1dv": exec_info(core=32),
-"Uniform2dv": exec_info(core=32),
-"Uniform3dv": exec_info(core=32),
-"Uniform4dv": exec_info(core=32),
-"UniformMatrix2dv": exec_info(core=32),
-"UniformMatrix3dv": exec_info(core=32),
-"UniformMatrix4dv": exec_info(core=32),
-"UniformMatrix2x3dv": exec_info(core=32),
-"UniformMatrix2x4dv": exec_info(core=32),
-"UniformMatrix3x2dv": exec_info(core=32),
-"UniformMatrix3x4dv": exec_info(core=32),
-"UniformMatrix4x2dv": exec_info(core=32),
-"UniformMatrix4x3dv": exec_info(core=32),
-"GetUniformdv": exec_info(core=32),
+"Uniform1d": exec_info(compatibility=32, core=32),
+"Uniform2d": exec_info(compatibility=32, core=32),
+"Uniform3d": exec_info(compatibility=32, core=32),
+"Uniform4d": exec_info(compatibility=32, core=32),
+"Uniform1dv": exec_info(compatibility=32, core=32),
+"Uniform2dv": exec_info(compatibility=32, core=32),
+"Uniform3dv": exec_info(compatibility=32, core=32),
+"Uniform4dv": exec_info(compatibility=32, core=32),
+"UniformMatrix2dv": exec_info(compatibility=32, core=32),
+"UniformMatrix3dv": exec_info(compatibility=32, core=32),
+"UniformMatrix4dv": exec_info(compatibility=32, core=32),
+"UniformMatrix2x3dv": exec_info(compatibility=32,core=32),
+"UniformMatrix2x4dv": exec_info(compatibility=32, core=32),
+"UniformMatrix3x2dv": exec_info(compatibility=32, core=32),
+"UniformMatrix3x4dv": exec_info(compatibility=32, core=32),
+"UniformMatrix4x2dv": exec_info(compatibility=32, core=32),
+"UniformMatrix4x3dv": exec_info(compatibility=32, core=32),
+"GetUniformdv": exec_info(compatibility=32, core=32),
 
 # OpenGL 4.1 / GL_ARB_vertex_attrib_64bit.  The extension spec says:
 #
diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 4fc451000b5..aec373b7ab1 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -365,6 +365,25 @@ typedef enum
OPCODE_UNIFORM_3UIV,
OPCODE_UNIFORM_4UIV,
 
+   /* GL_ARB_gpu_shader_fp64 */
+   OPCODE_UNIFORM_1D,
+   OPCODE_UNIFORM_2D,
+   OPCODE_UNIFORM_3D,
+   OPCODE_UNIFORM_4D,
+   OPCODE_UNIFORM_1DV,
+   OPCODE_UNIFORM_2DV,
+   OPCODE_UNIFORM_3DV,
+   OPCODE_UNIFORM_4DV,
+   OPCODE_UNIFORM_MATRIX22D,
+   OPCODE_UNIFORM_MATRIX33D,
+   OPCODE_UNIFORM_MATRIX44D,
+   OPCODE_UNIFORM_MATRIX23D,
+   OPCODE_UNIFORM_MATRIX32D,
+   OPCODE_UNIFORM_MATRIX24D,
+   OPCODE_UNIFORM_MATRIX42D,
+   OPCODE_UNIFORM_MATRIX34D,
+   OPCODE_UNIFORM_MATRIX43D,
+
/* OpenGL 4.0 / GL_ARB_tessellation_shader */
OPCODE_PATCH_PARAMETER_I,
OPCODE_PATCH_PARAMETER_FV_INNER,
@@ -606,6 +625,22 @@ union uint64_pair
 };
 
 
+union float64_pair
+{
+   GLdouble d;
+   GLuint uint32[2];
+};
+
+
+#define ASSIGN_DOUBLE_TO_NODES(n, idx, value)  \
+   do {\
+  union float64_pair tmp;  \
+  tmp.d = value;   \
+  n[idx].ui = tmp.uint32[0];   \
+  n[idx+1].ui = tmp.uint32[1]; \
+   } while (0)
+
+
 /**
  * How many nodes to allocate at a time.  Note that bulk vertex data
  * from glBegin/glVertex/glEnd primitives will typically wind up in
@@ -1034,6 +1069,10 @@ _mesa_delete_list(struct gl_context *ctx, struct gl_display_list *dlist)
  case OPCODE_UNIFORM_2FV:
  case OPCODE_UNIFORM_3FV:
  case OPCODE_UNIFORM_4FV:
+ case OPCODE_UNIFORM_1DV:
+ case OPCODE_UNIFORM_2DV:
+ case OPCODE_UNIFORM_3DV:
+ case OPCODE_UNIFORM_4DV:
  case OPCODE_UNIFORM_1IV:
  case OPCODE_UNIFORM_2IV:
  case OPCODE_UNIFORM_3IV:
@@ -1053,6 +1092,15 @@ _mesa_delete_list(struct gl_context *ctx, struct gl_display_list *dlist)
  case OPCODE_UNIFORM_MATRIX32:
  case OPCODE_UNIFORM_MATRIX34:
  case OPCODE_UNIFORM_MATRIX43:
+ case OPCODE_UNIFORM_MATRIX22D:
+ case OPCODE_UNIFORM_MATRIX33D:
+ case OPCODE_UNIFORM_MATRIX44D:
+ case OPCODE_UNIFORM_MATRIX24D:
+ case OPCODE_UNIFORM_MATRIX42D:
+ 

[Mesa-dev] [PATCH 03/18] mesa: enable ARB_gpu_shader_fp64 in compat profile

2018-06-28 Thread Timothy Arceri
---
 src/mesa/main/extensions_table.h|  2 +-
 src/mesa/main/tests/dispatch_sanity.cpp | 38 -
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 7af48a4ad91..5fe2e88fe98 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -76,7 +76,7 @@ EXT(ARB_get_program_binary  , dummy_true
 EXT(ARB_get_texture_sub_image   , dummy_true , GLL, GLC,  x ,  x , 2014)
 EXT(ARB_gl_spirv, ARB_gl_spirv ,  x,  GLC,  x ,  x , 2016)
 EXT(ARB_gpu_shader5 , ARB_gpu_shader5 , GLL, GLC,  x ,  x , 2010)
-EXT(ARB_gpu_shader_fp64 , ARB_gpu_shader_fp64 ,  x , GLC,  x ,  x , 2010)
+EXT(ARB_gpu_shader_fp64 , ARB_gpu_shader_fp64 ,  32, GLC,  x ,  x , 2010)
 EXT(ARB_gpu_shader_int64, ARB_gpu_shader_int64 ,  x , GLC,  x ,  x , 2015)
 EXT(ARB_half_float_pixel, dummy_true , GLL, GLC,  x ,  x , 2003)
 EXT(ARB_half_float_vertex   , ARB_half_float_vertex , GLL, GLC,  x ,  x , 2008)
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp
index 96ace539204..6b319d8b030 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -575,6 +575,25 @@ const struct function common_desktop_functions_possible[] = {
{ "glBlendFunci", 40, -1 },
{ "glBlendFuncSeparatei", 40, -1 },
 
+   { "glUniform1d", 40, -1 },
+   { "glUniform2d", 40, -1 },
+   { "glUniform3d", 40, -1 },
+   { "glUniform4d", 40, -1 },
+   { "glUniform1dv", 40, -1 },
+   { "glUniform2dv", 40, -1 },
+   { "glUniform3dv", 40, -1 },
+   { "glUniform4dv", 40, -1 },
+   { "glUniformMatrix2dv", 40, -1 },
+   { "glUniformMatrix3dv", 40, -1 },
+   { "glUniformMatrix4dv", 40, -1 },
+   { "glUniformMatrix2x3dv", 40, -1 },
+   { "glUniformMatrix2x4dv", 40, -1 },
+   { "glUniformMatrix3x2dv", 40, -1 },
+   { "glUniformMatrix3x4dv", 40, -1 },
+   { "glUniformMatrix4x2dv", 40, -1 },
+   { "glUniformMatrix4x3dv", 40, -1 },
+   { "glGetUniformdv", 43, -1 },
+
/* GL 4.3 */
{ "glIsRenderbuffer", 43, -1 },
{ "glBindRenderbuffer", 43, -1 },
@@ -1658,25 +1677,6 @@ const struct function gl_core_functions_possible[] = {
{ "glDrawArraysIndirect", 43, -1 },
{ "glDrawElementsIndirect", 43, -1 },
 
-   { "glUniform1d", 40, -1 },
-   { "glUniform2d", 40, -1 },
-   { "glUniform3d", 40, -1 },
-   { "glUniform4d", 40, -1 },
-   { "glUniform1dv", 40, -1 },
-   { "glUniform2dv", 40, -1 },
-   { "glUniform3dv", 40, -1 },
-   { "glUniform4dv", 40, -1 },
-   { "glUniformMatrix2dv", 40, -1 },
-   { "glUniformMatrix3dv", 40, -1 },
-   { "glUniformMatrix4dv", 40, -1 },
-   { "glUniformMatrix2x3dv", 40, -1 },
-   { "glUniformMatrix2x4dv", 40, -1 },
-   { "glUniformMatrix3x2dv", 40, -1 },
-   { "glUniformMatrix3x4dv", 40, -1 },
-   { "glUniformMatrix4x2dv", 40, -1 },
-   { "glUniformMatrix4x3dv", 40, -1 },
-   { "glGetUniformdv", 43, -1 },
-
{ "glBindTransformFeedback", 43, -1 },
{ "glDeleteTransformFeedbacks", 43, -1 },
{ "glGenTransformFeedbacks", 43, -1 },
-- 
2.17.1



[Mesa-dev] [PATCH 04/18] mesa: stop hiding remaining query parameters from OpenGL compat

2018-06-28 Thread Timothy Arceri
I managed to miss these two in my last pass at this.
---
 src/mesa/main/get_hash_params.py | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 83136108e95..618e306e509 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -975,17 +975,13 @@ descriptor=[
 
 # GL_ARB_sparse_buffer
   [ "SPARSE_BUFFER_PAGE_SIZE_ARB", "CONTEXT_INT(Const.SparseBufferPageSize), extra_ARB_sparse_buffer" ],
-]},
 
-# Enums restricted to OpenGL Core profile
-{ "apis": ["GL_CORE"], "params": [
 # GL_ARB_shader_subroutine
   [ "MAX_SUBROUTINES", "CONST(MAX_SUBROUTINES), NO_EXTRA" ],
   [ "MAX_SUBROUTINE_UNIFORM_LOCATIONS", "CONST(MAX_SUBROUTINE_UNIFORM_LOCATIONS), NO_EXTRA" ],
 
 # GL_ARB_indirect_parameters
   [ "PARAMETER_BUFFER_BINDING_ARB", "LOC_CUSTOM, TYPE_INT, 0, extra_ARB_indirect_parameters" ],
-
-]}
+]},
 
 ]
-- 
2.17.1
