date:20140809

Re: [Mesa-dev] [PATCH] clover: support CL_PROGRAM_BINARY_TYPE query

2014-08-09 Thread EdB

On Saturday, August 09, 2014 01:18:57 AM Ilia Mirkin wrote:
> On Fri, Aug 8, 2014 at 10:10 PM, EdB <edb+m...@sigluy.net> wrote:
> > ---
> >
> >  src/gallium/state_trackers/clover/api/program.cpp  | 3 +++
> >  src/gallium/state_trackers/clover/core/program.cpp | 8 ++++++++
> >  src/gallium/state_trackers/clover/core/program.hpp | 1 +
> >  3 files changed, 12 insertions(+)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/program.cpp b/src/gallium/state_trackers/clover/api/program.cpp
> > index b81ce69..0e9e3c9 100644
> > --- a/src/gallium/state_trackers/clover/api/program.cpp
> > +++ b/src/gallium/state_trackers/clover/api/program.cpp
> > @@ -266,6 +266,9 @@ clGetProgramBuildInfo(cl_program d_prog, cl_device_id d_dev,
> >        buf.as_string() = prog.build_log(dev);
> >        break;
> >
> > +   case CL_PROGRAM_BINARY_TYPE:
> > +      buf.as_scalar<cl_program_binary_type>() = prog.binary_type(dev);
>
> break?

Thanks

> > +
> >     default:
> >        throw error(CL_INVALID_VALUE);
> >     }
> >
> > diff --git a/src/gallium/state_trackers/clover/core/program.cpp b/src/gallium/state_trackers/clover/core/program.cpp
> > index e09c3aa..482df7e 100644
> > --- a/src/gallium/state_trackers/clover/core/program.cpp
> > +++ b/src/gallium/state_trackers/clover/core/program.cpp
> > @@ -103,6 +103,14 @@ program::build_log(const device &dev) const {
> >     return _logs.count(&dev) ? _logs.find(&dev)->second : "";
> >  }
> >
> > +cl_program_binary_type
> > +program::binary_type(const device &dev) const {
> > +   if (!_binaries.count(&dev))
> > +      return CL_PROGRAM_BINARY_TYPE_NONE;
> > +   else
> > +      return CL_PROGRAM_BINARY_TYPE_EXECUTABLE;
> > +}
> > +
> >  const compat::vector<module::symbol> &
> >  program::symbols() const {
> >     if (_binaries.empty())
> >
> > diff --git a/src/gallium/state_trackers/clover/core/program.hpp b/src/gallium/state_trackers/clover/core/program.hpp
> > index 1081454..b932b95 100644
> > --- a/src/gallium/state_trackers/clover/core/program.hpp
> > +++ b/src/gallium/state_trackers/clover/core/program.hpp
> > @@ -57,6 +57,7 @@ namespace clover {
> >        cl_build_status build_status(const device &dev) const;
> >        std::string build_opts(const device &dev) const;
> >        std::string build_log(const device &dev) const;
> > +      cl_program_binary_type binary_type(const device &dev) const;
> >
> >        const compat::vector<module::symbol> &symbols() const;
> >
> > --
> > 2.0.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] clover: support CL_PROGRAM_BINARY_TYPE query

2014-08-09 Thread EdB

---
 src/gallium/state_trackers/clover/api/program.cpp  | 4 ++++
 src/gallium/state_trackers/clover/core/program.cpp | 8 ++++++++
 src/gallium/state_trackers/clover/core/program.hpp | 1 +
 3 files changed, 13 insertions(+)

diff --git a/src/gallium/state_trackers/clover/api/program.cpp 
b/src/gallium/state_trackers/clover/api/program.cpp
index b81ce69..c3fe129 100644
--- a/src/gallium/state_trackers/clover/api/program.cpp
+++ b/src/gallium/state_trackers/clover/api/program.cpp
@@ -266,6 +266,10 @@ clGetProgramBuildInfo(cl_program d_prog, cl_device_id d_dev,
       buf.as_string() = prog.build_log(dev);
       break;
 
+   case CL_PROGRAM_BINARY_TYPE:
+      buf.as_scalar<cl_program_binary_type>() = prog.binary_type(dev);
+      break;
+
    default:
       throw error(CL_INVALID_VALUE);
    }
diff --git a/src/gallium/state_trackers/clover/core/program.cpp 
b/src/gallium/state_trackers/clover/core/program.cpp
index e09c3aa..482df7e 100644
--- a/src/gallium/state_trackers/clover/core/program.cpp
+++ b/src/gallium/state_trackers/clover/core/program.cpp
@@ -103,6 +103,14 @@ program::build_log(const device &dev) const {
    return _logs.count(&dev) ? _logs.find(&dev)->second : "";
 }
 
+cl_program_binary_type
+program::binary_type(const device &dev) const {
+   if (!_binaries.count(&dev))
+      return CL_PROGRAM_BINARY_TYPE_NONE;
+   else
+      return CL_PROGRAM_BINARY_TYPE_EXECUTABLE;
+}
+
 const compat::vector<module::symbol> &
 program::symbols() const {
    if (_binaries.empty())
diff --git a/src/gallium/state_trackers/clover/core/program.hpp 
b/src/gallium/state_trackers/clover/core/program.hpp
index 1081454..b932b95 100644
--- a/src/gallium/state_trackers/clover/core/program.hpp
+++ b/src/gallium/state_trackers/clover/core/program.hpp
@@ -57,6 +57,7 @@ namespace clover {
       cl_build_status build_status(const device &dev) const;
       std::string build_opts(const device &dev) const;
       std::string build_log(const device &dev) const;
+      cl_program_binary_type binary_type(const device &dev) const;
 
       const compat::vector<module::symbol> &symbols() const;
 
-- 
2.0.4
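
For readers following the patch, the per-device lookup it adds can be sketched outside of clover. This is only a minimal sketch under stated assumptions, not clover's code: `device`, `module`, `program_sketch` and the `binary_kind` enum are hypothetical stand-ins (the patch itself returns the `cl_program_binary_type` constants from the OpenCL headers), and it assumes binaries are kept in a map keyed by device pointer.

```cpp
#include <map>

// Hypothetical stand-ins for clover's types; not the real definitions.
struct device {};
struct module {};

// Stand-in for CL_PROGRAM_BINARY_TYPE_NONE / CL_PROGRAM_BINARY_TYPE_EXECUTABLE.
enum class binary_kind { none, executable };

class program_sketch {
public:
   // Record a built binary for a device (what a successful build would do).
   void add_binary(const device &dev, const module &m) {
      _binaries[&dev] = m;
   }

   // Same decision as the patch: no binary stored for this device means
   // "none", anything stored is reported as "executable".
   binary_kind binary_type(const device &dev) const {
      return _binaries.count(&dev) ? binary_kind::executable
                                   : binary_kind::none;
   }

private:
   std::map<const device *, module> _binaries;
};
```

OpenCL 1.2 also defines compiled-object and library binary types; the patch as posted reports only the two states sketched here.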


Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Niels Ole Salscheider

On Tuesday 04 March 2014, 02:08:58, Marek Olšák wrote:
 Could you please do this without changing u_upload_mgr? You can still
 use u_upload_alloc to allocate buffer memory in the driver and the map
 buffer read/write flags are not important with persistent coherent
 buffer mappings anyway.

Since 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 we allocate CPU -> GPU
streaming buffers (i.e. those with PIPE_USAGE_STREAM) in VRAM.
We should therefore set buffer.usage to PIPE_USAGE_STAGING in
u_upload_alloc_buffer when we use u_upload_mgr for downloads; otherwise we
won't get any performance improvement.
Would it now be OK to change u_upload_mgr, or do you have a better proposal?

Ole

 Marek
 
 On Mon, Mar 3, 2014 at 9:29 PM, Niels Ole Salscheider
 
 niels_...@salscheider-online.de wrote:
  Using the DMA engine for buffer downloads vastly improves performance.
  This is because reads from VRAM by the CPU are slow because of the high
  latency of the PCIe bus.
  
  The first patch allows u_upload_mgr to be used for downloads, too. The
  second patch then uses u_upload_mgr in the radeon driver for downloads.
  I considered renaming u_upload_mgr to u_transfer_mgr since it might be
  confusing that an upload manager can be used for downloads. But then
  again we also have transfers, so u_transfer_mgr might also be
  confusing. Thus, I decided not to rename it for now.
  
  Without these patches, the buffer_bandwidth benchmark from uCLbench gives
  me:
  
  ./buffer_bandwidth --size=2000 --iterations=100
  # device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
  1/1 direct 2000 Bytes   759.29 MB/s(HD) 17.13 MB/s(DD) 14.61 MB/s(DH)
  
  With these patches, the read performance is much better:
  
  ./buffer_bandwidth --size=2000 --iterations=100
  # device 0: AMD BARTS // type gpu (192 MB global memory, 64 KB constant memory, 32 KB local memory)
  1/1 direct 2000 Bytes   759.90 MB/s(HD) 613.49 MB/s(DD) 1841.07 MB/s(DH)
  
  Judging by these numbers, it might even make sense to use the DMA engine
  for larger buffer downloads...
  
  Niels Ole Salscheider (2):
    util/u_upload_mgr: Allow to also use it for downloads
    radeon: Use transfer manager for buffer downloads
   
   src/gallium/auxiliary/hud/hud_context.c         |  3 +-
   src/gallium/auxiliary/util/u_blitter.c          |  3 +-
   src/gallium/auxiliary/util/u_upload_mgr.c       | 49 +++-
   src/gallium/auxiliary/util/u_upload_mgr.h       | 13 -
   src/gallium/auxiliary/util/u_vbuf.c             |  3 +-
   src/gallium/auxiliary/vl/vl_compositor.c        |  3 +-
   src/gallium/drivers/ilo/ilo_context.c           |  3 +-
   src/gallium/drivers/r300/r300_context.c         |  3 +-
   src/gallium/drivers/radeon/r600_buffer_common.c | 78 +++--
   src/gallium/drivers/radeon/r600_pipe_common.c   | 14 -
   src/gallium/drivers/radeon/r600_pipe_common.h   |  1 +
   src/mesa/state_tracker/st_context.c             |  9 ++-
   12 files changed, 136 insertions(+), 46 deletions(-)
  
  --
  1.9.0
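
To put the benchmark figures quoted in the cover letter in proportion, the before/after ratios can be worked out directly. The sketch below is an illustration only: the function names are hypothetical, and reading HD, DD and DH as host-to-device, device-to-device and device-to-host is an assumption about the uCLbench output, not something the thread states.

```cpp
#include <cstdio>

// Ratio of the bandwidth measured with the patches to the bandwidth
// measured without them (both in MB/s).
double speedup(double before_mb_s, double after_mb_s) {
   return after_mb_s / before_mb_s;
}

// Print the three ratios using the numbers quoted in the cover letter.
void print_speedups() {
   std::printf("HD: %.2fx\n", speedup(759.29, 759.90));
   std::printf("DD: %.2fx\n", speedup(17.13, 613.49));
   std::printf("DH: %.2fx\n", speedup(14.61, 1841.07));
}
```

With these inputs the HD figure is essentially unchanged, while the DD and DH figures improve by well over an order of magnitude, which matches the cover letter's claim that reads benefit most.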
  

Re: [Mesa-dev] [PATCH] mesa/st: add support for dynamic sampler offsets

2014-08-09 Thread Marek Olšák

Acked-by: Marek Olšák <marek.ol...@amd.com>

Marek

On Sat, Aug 9, 2014 at 7:52 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 So... can I get a review on this? It's the last bit needed for ARB_gs5
 (well, except for the actual setting of the extension bit to 1)

 The only additional change I have in my local version is that instead of

 +  inst->sampler_array_size =
 + ir->sampler->as_dereference_array()
 +->array->variable_referenced()->type->length;

 I use ir->sampler->as_dereference_array()->array->type->array_size()

 I sent out a few piglits and they pass with the nvc0 driver on both
 fermi and kepler hw (which do textures a little differently from one
 another).

   -ilia


 On Wed, Aug 6, 2014 at 12:02 PM, Marek Olšák mar...@gmail.com wrote:
 On Wed, Aug 6, 2014 at 5:53 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
   pc->MaxAddressRegs = pc->MaxNativeAddressRegs =
  _min(screen->get_shader_param(screen, sh,
 PIPE_SHADER_CAP_MAX_ADDRS),
   MAX_PROGRAM_ADDRESS_REGS);

 Not really sure what that's referring to... ARB_vp/fp or something?

 Yes, ARB_vp needs 1, ARB_fp doesn't support indirect addressing (expects 0).


 Anyways, this is definitely a bit of a violation of that. OTOH, so is
 the indirect UBO indexing and indirect GS input access (assuming
 that's allowed), since those would use ADDR[1] and every driver
 (except nv30) returns 1, and sometimes 0 -- including
 nv50/nvc0/r600/radeonsi.

 So... dunno what the proper way to proceed is. Fix drivers to claim
 higher numbers? Continue the tradition of ignoring it and relying on
 the fact that GPUs that don't support it also won't support the
 features that cause it to get used?

 You don't have to worry about that for now. We can clean it up later.

 Marek

Re: [Mesa-dev] [PATCH 0/2] radeon: Use the DMA engine for buffer downloads

2014-08-09 Thread Marek Olšák

You can try to do the allocation of the staging buffer with
pipe_buffer_create instead of u_upload_mgr. You can also use
u_suballocator, which is like a stripped-down version of u_upload_mgr.
You would need another instance of u_upload_mgr anyway, because we'd
like to continue using PIPE_USAGE_STREAM for uploads.

Marek

On Sat, Aug 9, 2014 at 2:35 PM, Niels Ole Salscheider
niels_...@salscheider-online.de wrote:
 On Tuesday 04 March 2014, 02:08:58, Marek Olšák wrote:
 Could you please do this without changing u_upload_mgr? You can still
 use u_upload_alloc to allocate buffer memory in the driver and the map
 buffer read/write flags are not important with persistent coherent
 buffer mappings anyway.

 Since 150ac07b855b5c5f879bf6ce9ca421ccd1a6c938 we allocate CPU -> GPU
 streaming buffers (i.e. those with PIPE_USAGE_STREAM) in VRAM.
 We should therefore set buffer.usage to PIPE_USAGE_STAGING in
 u_upload_alloc_buffer when we use u_upload_mgr for downloads; otherwise we
 won't get any performance improvement.
 Would it now be OK to change u_upload_mgr, or do you have a better proposal?

 Ole


Re: [Mesa-dev] [PATCH] mesa/st: add support for dynamic sampler offsets

2014-08-09 Thread Roland Scheidegger

On closer look, it looks to me like it wouldn't be all that difficult to
make proper sampler array dcls (as you apparently get them quite easily
from glsl) which is really my only problem with it but it's not really
all that important so
Reviewed-by: Roland Scheidegger <srol...@vmware.com>


Re: [Mesa-dev] [PATCH] mesa/st: add support for dynamic sampler offsets

2014-08-09 Thread Ilia Mirkin

On Sat, Aug 9, 2014 at 10:14 AM, Roland Scheidegger <srol...@vmware.com> wrote:
> On closer look, it looks to me like it wouldn't be all that difficult to
> make proper sampler array dcls (as you apparently get them quite easily

If you can briefly outline how you think that should be done, I'd be
happy to try to do it. Right now the samplers are declared in
st_translate_program, which only has the program->samplers_used
bitfield (where program is glsl_to_tgsi_visitor).

One quick thing that occurs to me is to keep track of samplers in a
map of index -> array size (0 if it's not accessed indirectly or not
an array), and then use the map to create the declarations. Does
that seem reasonable?

Either way, this seems like a nice-to-have rather than required, so
I'm going to push this patch even if the other one isn't ready.

> from glsl) which is really my only problem with it but it's not really
> all that important so
> Reviewed-by: Roland Scheidegger <srol...@vmware.com>

Thanks!


Re: [Mesa-dev] [PATCH] mesa/st: add support for dynamic sampler offsets

2014-08-09 Thread Roland Scheidegger

On 09.08.2014 16:33, Ilia Mirkin wrote:
> On Sat, Aug 9, 2014 at 10:14 AM, Roland Scheidegger <srol...@vmware.com> wrote:
>> On closer look, it looks to me like it wouldn't be all that difficult to
>> make proper sampler array dcls (as you apparently get them quite easily

Well I didn't think about it that much but nothing is stopping you from
putting whatever you need into the glsl_to_tgsi_visitor to keep that
array information instead of just filling out samplers_used.

> Either way, this seems like a nice-to-have rather than required, so
> I'm going to push this patch even if the other one isn't ready.

Agreed. The address thingy is also something which should go away as
discussed, but that is really a separate issue too.
btw you get the actual sampler type from the constant in the sampler reg
right? So if you have
4: TEX TEMP[1], TEMP[1], SAMP[ADDR[2].x+1], 2D
that +1 in there says the 2nd declared sampler determines what kind of
sampler this is (2d, 3d, shadow...).

Roland

>> from glsl) which is really my only problem with it but it's not really
>> all that important so
>> Reviewed-by: Roland Scheidegger <srol...@vmware.com>
>
> Thanks!


Re: [Mesa-dev] [PATCH] mesa/st: add support for dynamic sampler offsets

2014-08-09 Thread Ilia Mirkin

On Sat, Aug 9, 2014 at 11:12 AM, Roland Scheidegger <srol...@vmware.com> wrote:
> On 09.08.2014 16:33, Ilia Mirkin wrote:
>> On Sat, Aug 9, 2014 at 10:14 AM, Roland Scheidegger <srol...@vmware.com> wrote:
>>> On closer look, it looks to me like it wouldn't be all that difficult to
>>> make proper sampler array dcls (as you apparently get them quite easily
>>
>> If you can briefly outline how you think that should be done, I'd be
>> happy to try to do it. Right now the samplers are declared in
>> st_translate_program, which only has the program->samplers_used
>> bitfield (where program is glsl_to_tgsi_visitor).
>>
>> One quick thing that occurs to me is to keep track of samplers in a
>> map of index -> array size (0 if it's not accessed indirectly or not
>> an array), and then use the map to create the declarations. Does
>> that seem reasonable?
>
> Well I didn't think about it that much but nothing is stopping you from
> putting whatever you need into the glsl_to_tgsi_visitor to keep that
> array information instead of just filling out samplers_used.
>
>> Either way, this seems like a nice-to-have rather than required, so
>> I'm going to push this patch even if the other one isn't ready.
> Agreed. The address thingy is also something which should go away as
> discussed, but that is really a separate issue too.
> btw you get the actual sampler type from the constant in the sampler reg
> right? So if you have
> 4: TEX TEMP[1], TEMP[1], SAMP[ADDR[2].x+1], 2D
> that +1 in there says the 2nd declared sampler determines what kind of
> sampler this is (2d, 3d, shadow...).

From what I can tell, samplers (in TGSI) have no attached semantics...
they're just an index into an array that you're promised was set up
correctly beforehand. For emitting the instructions, we look at the 2D
from the tex instruction.

BTW, I have a follow-up patch that I'll send shortly which generates

FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SAMP[1..3]
DCL CONST[4]
DCL TEMP[0..1], LOCAL
DCL ADDR[0..2]
IMM[0] FLT32 {0., 0., 0., 0.}
  0: MOV TEMP[0].xy, IN[0].xyyy
  1: TEX TEMP[0], TEMP[0], SAMP[0], 2D
  2: MOV TEMP[1].xy, IN[0].xyyy
  3: UARL ADDR[2].x, CONST[4].
  4: TEX TEMP[1], TEMP[1], SAMP[ADDR[2].x+1], 2D
  5: MAD TEMP[0], TEMP[0], IMM[0]., TEMP[1]
  6: MOV OUT[0], TEMP[0]
  7: END

Is that what you had in mind?
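
The "map of index -> array size" idea floated earlier in this thread, combined with the `DCL SAMP[0]` / `DCL SAMP[1..3]` lines in the dump above, can be sketched as follows. This is a standalone illustration, not code from the state tracker: `sampler_decls` is a hypothetical helper, and treating a size of 0 or 1 as a single sampler is an assumption made for the sketch.

```cpp
#include <map>
#include <string>
#include <vector>

// Build TGSI-style sampler declarations from a map of
// first sampler index -> array size. A size of 0 (not an indirectly
// accessed array) or 1 is emitted as a single sampler.
std::vector<std::string>
sampler_decls(const std::map<unsigned, unsigned> &samplers) {
   std::vector<std::string> decls;
   for (const auto &entry : samplers) {
      const unsigned first = entry.first;
      const unsigned size = entry.second;
      if (size <= 1) {
         decls.push_back("DCL SAMP[" + std::to_string(first) + "]");
      } else {
         const unsigned last = first + size - 1;
         decls.push_back("DCL SAMP[" + std::to_string(first) + ".." +
                         std::to_string(last) + "]");
      }
   }
   return decls;
}
```

Feeding it `{0 -> 0, 1 -> 3}` yields the two sampler declarations that appear in the dump; `std::map` keeps the entries ordered by index, so declarations come out in ascending order.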

[Mesa-dev] [PATCH] mesa: Fix glGetActiveAttribute for gl_VertexID when lowered.

2014-08-09 Thread Kenneth Graunke

The lower_vertex_id pass converts uses of the gl_VertexID system value
to the gl_BaseVertex and gl_VertexIDMESA system values.  Since
gl_VertexID is no longer accessed, it would not be considered active.

Of course, it should be, since the shader uses gl_VertexID.

v2: Move the var->name dereference past the var != NULL check.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/mesa/main/shader_query.cpp | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

[15:31] <anujp> idr: gles3conform-v5-wip is not doing well. getactiveattribute_index_more_than_num_attribs test segfaults in src/mesa/main/shader_query.cpp:132. You can reproduce it on IVB too.
[15:43] <idr> aw crap.
[15:43] <idr> anujp: I guess Kayden's series needs a bit more work. :(

This fixes that test.  I haven't run other tests on v2 of this, but
I don't really think I need to.  I literally just moved the var_name
declaration down a bit.

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 4871d09..766ad29 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -93,6 +93,7 @@ is_active_attrib(const ir_variable *var)
        * and gl_InstanceID.
        */
       return var->data.location == SYSTEM_VALUE_VERTEX_ID ||
+             var->data.location == SYSTEM_VALUE_VERTEX_ID_ZERO_BASE ||
              var->data.location == SYSTEM_VALUE_INSTANCE_ID;
 
    default:
@@ -133,7 +134,18 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint desired_index,
          continue;
 
       if (current_index == desired_index) {
-         _mesa_copy_string(name, maxLength, length, var->name);
+         const char *var_name = var->name;
+
+         /* Since gl_VertexID may be lowered to gl_VertexIDMESA, we need to
+          * consider gl_VertexIDMESA as gl_VertexID for purposes of checking
+          * active attributes.
+          */
+         if (var->data.mode == ir_var_system_value &&
+             var->data.location == SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) {
+            var_name = "gl_VertexID";
+         }
+
+         _mesa_copy_string(name, maxLength, length, var_name);
 
          if (size)
             *size = (var->type->is_array()) ? var->type->length : 1;
-- 
2.0.2
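
The renaming step in the second hunk can be looked at in isolation. The following is a minimal sketch with hypothetical stand-in types (`variable_sketch`, `var_mode`, `system_value` and `reported_name` are not Mesa names, and the enum values are arbitrary); it only illustrates the rule that a lowered zero-based vertex ID is reported under the name the shader author wrote.

```cpp
#include <string>

// Hypothetical stand-ins for Mesa's enums; the values are arbitrary.
enum var_mode { var_shader_in, var_system_value };
enum system_value { sv_vertex_id, sv_vertex_id_zero_base, sv_instance_id };

struct variable_sketch {
   std::string name;
   var_mode mode;
   int location;
};

// Name to hand back to the application: a lowered gl_VertexIDMESA is
// presented as gl_VertexID, every other variable keeps its own name.
std::string reported_name(const variable_sketch &var) {
   if (var.mode == var_system_value &&
       var.location == sv_vertex_id_zero_base)
      return "gl_VertexID";
   return var.name;
}
```

Checking both the mode and the location, as the patch does, keeps an ordinary input that happens to share the numeric location from being renamed.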


[Mesa-dev] [PATCH] gbm: Fix gallium build when X11 is in a non-system directory

2014-08-09 Thread Eric Anholt

pipe-loader.h will include Xlib.h when HAVE_PIPE_LOADER_XLIB is set in the
build.
---
 src/gallium/state_trackers/gbm/Makefile.am | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/state_trackers/gbm/Makefile.am 
b/src/gallium/state_trackers/gbm/Makefile.am
index 4d9f3fe..50995b3 100644
--- a/src/gallium/state_trackers/gbm/Makefile.am
+++ b/src/gallium/state_trackers/gbm/Makefile.am
@@ -25,6 +25,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
 
 AM_CFLAGS = \
$(GALLIUM_CFLAGS) \
+   $(X11_INCLUDES) \
$(VISIBILITY_CFLAGS)
 
 AM_CPPFLAGS = \
-- 
2.0.1


Re: [Mesa-dev] [PATCH] mesa: Fix glGetActiveAttribute for gl_VertexID when lowered.

2014-08-09 Thread Ian Romanick

On 08/09/2014 10:54 AM, Kenneth Graunke wrote:
> The lower_vertex_id pass converts uses of the gl_VertexID system value
> to the gl_BaseVertex and gl_VertexIDMESA system values.  Since
> gl_VertexID is no longer accessed, it would not be considered active.
>
> Of course, it should be, since the shader uses gl_VertexID.
>
> v2: Move the var->name dereference past the var != NULL check.

Right... which I didn't notice before because the var != NULL check is
hidden in is_active_attrib.  D'oh.

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>


Re: [Mesa-dev] [PATCH 0/4] BDW viewport extents + misc

2014-08-09 Thread Ben Widawsky

I realize it hasn't even been a week yet, but my remaining 2 weeks until
my sabbatical have just filled up, so if anyone needs me to rework this,
the sooner you let me know the better.

On Mon, Aug 04, 2014 at 12:24:00PM -0700, Ben Widawsky wrote:
> The patch commit messages and comments within the diffs explain the
> intricacies of viewport extents and clipping. So rather, here is the data
> for these patches. All of the following is for a Broadwell system (which
> introduced viewport extents).
>
> EGYPT PERF
> ==
> No change
>
> WARSOW PERF
> ===
> No change
>

Add xonotic and trex to this list of no change.

> piglit
> ==
> viewport extents only:
> spec/ARB_viewport_array/render-scissor/Render multi-viewport scissor test: fail pass
> spec/glsl-1.30/execution/built-in-functions/vs-max-ivec4-int: fail pass
> spec/ARB_viewport_array/render-scissor/Render multi-scissor rectangles: fail pass
> spec/glsl-1.50/execution/geometry/max-input-components: fail pass
>
> viewport extents + gb clipping:
> spec/ARB_viewport_array/render-scissor/Render multi-viewport scissor test: fail pass
> spec/glsl-1.30/execution/built-in-functions/vs-max-ivec4-int: fail pass
> spec/ARB_viewport_array/render-scissor/Render multi-scissor rectangles: fail pass
>
> all:
> spec/ARB_viewport_array/render-scissor/Render multi-viewport scissor test: fail pass
> spec/glsl-1.30/execution/built-in-functions/vs-max-ivec4-int: fail pass
> spec/ARB_viewport_array/render-scissor/Render multi-scissor rectangles: fail pass
>
> As you can observe, there are no wins found here other than conformance. Given
> our understanding of the hardware, we expect these patches to produce
> performance improvements for certain applications (specifically those which
> define viewports smaller than the drawing rectangle, but some other caveats
> apply on top of that).
>
> Ben Widawsky (4):
>   i965/guardband: Improve comments for guardband clipping
>   i965: Viewport extents on GEN8
>   i965/guardband: Enable for all viewport dimensions (GEN8+)
>   i965/clip: Removing scissor atom
>
>  src/mesa/drivers/dri/i965/gen6_clip_state.c     | 29 +-
>  src/mesa/drivers/dri/i965/gen8_viewport_state.c | 52 ++---
>  2 files changed, 57 insertions(+), 24 deletions(-)
>
> --
> 2.0.3

-- 
Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] i965/guardband: Improve comments for guardband clipping

2014-08-09 Thread Kenneth Graunke

On Monday, August 04, 2014 12:24:01 PM Ben Widawsky wrote:
 While working in this part of the code I had a great deal of trouble
 understanding what it was trying to do, and matching it with the spec.
 (mostly due to bad wording in the PRM). To help future people, I've cleaned
 up the wording and provided some ascii art.
 ---
  src/mesa/drivers/dri/i965/gen8_viewport_state.c | 22 ++
  1 file changed, 18 insertions(+), 4 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c 
 b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 index b366246..b5171e0 100644
 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 @@ -71,10 +71,24 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
      * maximum screen space coordinates of a small object may larger, but we
      * have no way to enforce the object size other than through clipping.
      *
 -    * If you're surprised that we set clip to -gbx to +gbx and it seems like
 -    * we'll end up with 16384 wide, note that for a 8192-wide render target,
 -    * we'll end up with a normal (-1, 1) clip volume that just covers the
 -    * drawable.
 +    * The goal is to create the maximum sized guardband (8K x 8K) with the
 +    * viewport rectangle in the center of the guardband. This looks weird
 +    * because the hardware wants coordinates that are scaled to the viewport
 +    * in NDC. In other words, an 8K x 8K viewport would have [-1,1] for X and Y.
 +    * A 4K viewport would be [-2,2], 2K := [-4,4] etc.
 +    *
 +    * --------------------------------
 +    * |Guardband                     |
 +    * |                              |
 +    * |        ------------          |
 +    * |        |viewport  |          |
 +    * |        |          |          |
 +    * |        |          |          |
 +    * |        |__________|          |
 +    * |                              |
 +    * |                              |
 +    * |______________________________|
 +    *
      */
     const float maximum_guardband_extent = 8192;
     float gbx = maximum_guardband_extent / ctx->ViewportArray[i].Width;
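
As a rough illustration of the scaling described in the comment above, the guardband extent is the 8K maximum divided by the viewport size, so smaller viewports get proportionally larger NDC ranges. This is only a sketch: `guardband_extent` and `MAX_GUARDBAND_EXTENT` are hypothetical names, not part of the driver.

```c
/* Maximum guardband size in pixels, per the patch comment (8K x 8K). */
#define MAX_GUARDBAND_EXTENT 8192.0f

/* Hypothetical helper: returns gb such that the guardband spans [-gb, +gb]
 * in viewport-relative NDC for a viewport `viewport_size` pixels wide
 * (or tall). Assumes viewport_size is non-zero. */
static float guardband_extent(float viewport_size)
{
   return MAX_GUARDBAND_EXTENT / viewport_size;
}
```

With these assumptions an 8192-pixel viewport gives 1.0, a 4096-pixel viewport gives 2.0, and a 2048-pixel viewport gives 4.0, which matches the [-1,1], [-2,2], [-4,4] progression in the comment.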
 

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

and you can probably add a:
Reviewed-by: Chris Forbes chr...@ijw.co.nz
or at least Acked-by

based on #intel-gfx chatter:

[Friday, August 01, 2014] [07:24:53 PM] <bwidawsk> Kayden, chrisf: useful patch? http://sprunge.us/SPeA
[Friday, August 01, 2014] [07:25:48 PM] <Kayden> looks reasonable to me
[Friday, August 01, 2014] [07:26:48 PM] <chrisf> same here


Re: [Mesa-dev] [PATCH 4/4] i965/clip: Removing scissor atom

2014-08-09 Thread Kenneth Graunke

On Monday, August 04, 2014 12:24:04 PM Ben Widawsky wrote:
 On GEN8, a change in scissor state does not affect anything for the
 clipper/sf hardware state. The hardware will always do the right thing
 once the viewport extents are programmed. We can therefore remove the
 unnecessary state emission.
 
 Ken originally spotted this.

Scissoring state affects the value of ctx->DrawBuffer->_Xmin, _Xmax, _Ymin, and
_Ymax.  So, _NEW_SCISSOR was actually necessary prior to your patch #2.

Perhaps reword the commit message to be something like this:

"Now that we no longer use ctx->DrawBuffer->_Xmin and related fields to program
the screen-space viewport extents, we don't depend on any scissoring state.  So
we can drop the _NEW_SCISSOR dependency."
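
To make the dependency argument concrete, here is a minimal sketch of how a dirty-bit mask decides whether a state atom is re-emitted. The flag values and the `needs_emit` helper are hypothetical stand-ins, not the real Mesa definitions; only the idea of masking changed state against an atom's dirty bits is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real _NEW_* bits are defined by Mesa. */
enum {
   EX_NEW_BUFFERS  = 1u << 0,
   EX_NEW_SCISSOR  = 1u << 1,
   EX_NEW_VIEWPORT = 1u << 2,
};

/* Simplified stand-in for a tracked state atom. */
struct tracked_state {
   uint32_t dirty;   /* state changes that require re-emitting this atom */
};

/* Returns true when any changed state bit is one the atom depends on. */
static bool needs_emit(const struct tracked_state *atom, uint32_t changed)
{
   return (atom->dirty & changed) != 0;
}
```

Under these assumptions, an atom whose mask omits the scissor bit is not re-emitted on a scissor-only change, which is only correct if its emit function reads no scissor-derived fields.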

 ---
  src/mesa/drivers/dri/i965/gen8_viewport_state.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c 
 b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 index 04a4530..d7e06c4 100644
 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 @@ -100,13 +100,17 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
        vp[10] = -gby; /* y-min */
        vp[11] =  gby; /* y-max */
  
 -      /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport
 +      /* _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport
         * The hardware will take the intersection of the drawing rectangle,
         * scissor rectangle, and the viewport extents. We don't need to be
         * smart, and can therefore just program the viewport extents.
         */
        float viewport_Xmax = ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width;
        float viewport_Ymax = ctx->ViewportArray[i].Y + ctx->ViewportArray[i].Height;
 +      if (viewport_Ymax < ctx->DrawBuffer->_Ymax ||
 +          viewport_Xmax < ctx->DrawBuffer->_Xmax) {
 +         perf_debug("Using viewport extents for savings\n");
 +      }

I suspect you didn't mean to add this :)  Please drop it, as using
ctx->DrawBuffer->_Xmax actually introduces a dependency on _NEW_SCISSOR again.
:)

With those fixed, this would get a:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

       if (render_to_fbo) {
          vp[12] = ctx->ViewportArray[i].X;
          vp[13] = viewport_Xmax - 1;
 @@ -130,7 +134,7 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
  
  const struct brw_tracked_state gen8_sf_clip_viewport = {
 .dirty = {
 -  .mesa = _NEW_BUFFERS | _NEW_SCISSOR | _NEW_VIEWPORT,
 +  .mesa = _NEW_BUFFERS | _NEW_VIEWPORT,
.brw = BRW_NEW_BATCH,
.cache = 0,
 },
 



Re: [Mesa-dev] [PATCH 3/4] i965/guardband: Enable for all viewport dimensions (GEN8+)

2014-08-09 Thread Kenneth Graunke

On Monday, August 04, 2014 12:24:03 PM Ben Widawsky wrote:
 The goal of guardband clipping is to try to avoid 3d clipping because it
 is an expensive operation. When guardband clipping is disabled, all
 geometry that intersects the viewport is to the FF 3d clipper. Objects

 ^^^ is sent to?

 which are entirely enclosed within the viewport are said to be
 trivially accepted while those entirely outside of the viewport are,
 trivially rejected.
 
 When guardband clipping is turned on the above behavior is change such

   is changed or changes ^^^

 that if the geometry is within the guardband, and intersects the
 viewport, it skips the 3d clipper. Prior to GEN8, this was problematic
 if the viewport was smaller than the screen as it could allow for
 rendering to occur outside of the viewport. That could be mitigated if
 the programmer specified a scissor region which was less than or equal
 to the viewport - but this is not required for correctness in OpenGL. In
 theory you could be clever with the guardband so as not to invoke this
 problem. We do not do this, and have no data that suggests we should
 bother (nor the converse data).
 
 With viewport extents in place on GEN8, it should be safe to turn on
 guardband clipping for all cases
 
 While here, add a comment to the code which confused me thoroughly.
 ---
  src/mesa/drivers/dri/i965/gen6_clip_state.c | 29 
 +++--
  1 file changed, 19 insertions(+), 10 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen6_clip_state.c 
 b/src/mesa/drivers/dri/i965/gen6_clip_state.c
 index 52027e0..42e78a1 100644
 --- a/src/mesa/drivers/dri/i965/gen6_clip_state.c
 +++ b/src/mesa/drivers/dri/i965/gen6_clip_state.c
 @@ -97,17 +97,26 @@ upload_clip_state(struct brw_context *brw)
                   GEN6_USER_CLIP_CLIP_DISTANCES_SHIFT);
  
     dw2 |= GEN6_CLIP_GB_TEST;
 -   for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
 -      if (ctx->ViewportArray[i].X != 0 ||
 -          ctx->ViewportArray[i].Y != 0 ||
 -          ctx->ViewportArray[i].Width != (float) fb->Width ||
 -          ctx->ViewportArray[i].Height != (float) fb->Height) {
 -         dw2 &= ~GEN6_CLIP_GB_TEST;
 -         if (brw->gen >= 8) {
 -            perf_debug("Disabling GB clipping due to lack of Gen8 viewport "
 -                       "clipping setup code.  This should be fixed.\n");
 +   /* If the viewport dimensions < screen dimensions we have to disable

"screen dimensions" sounds like 1920x1080 (etc) to me.  How about "drawable
dimensions"?

 +    * guardband clipping. This is because as the code works today, the
 +    * guardband rectangle is set to a fixed size. If that size is larger than
 +    * the viewport (likely) Guardband clipping can potentially make geometry
 +    * bypass the 3d clipping phase (geometry which intersects the viewport).
 +    * This will possibly cause rendering to occur outside of the viewport, but
 +    * inside the screen.
 +    *
 +    * On GEN8 we get a useful viewport extents test which allows us to
 +    * ignore this entirely.
 +    */

The comment seems a bit choppy to me.  How about something like:

/* If the viewport dimensions are smaller than the drawable dimensions,
 * we have to disable guardband clipping prior to Gen8.  We always program
 * the guardband to a fixed size, which is almost always larger than the
 * viewport.  Any geometry which intersects the viewport but lies within
 * the guardband would bypass the 3D clipping stage, so it wouldn't be
 * clipped to the viewport.  Rendering would happen beyond the viewport,
 * but still inside the drawable.
 *
 * Gen8+ introduces a viewport extents test which restricts rendering to
 * the viewport, so we can ignore this restriction.
 */

Regardless of whether you take my suggestion, this is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 +   if (brw->gen < 8) {
 +      for (unsigned i = 0; i < ctx->Const.MaxViewports; i++) {
 +         if (ctx->ViewportArray[i].X != 0 ||
 +             ctx->ViewportArray[i].Y != 0 ||
 +             ctx->ViewportArray[i].Width != (float) fb->Width ||
 +             ctx->ViewportArray[i].Height != (float) fb->Height) {
 +            dw2 &= ~GEN6_CLIP_GB_TEST;
 +            break;
           }
 -         break;
        }
     }
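
The decision made by this hunk can be sketched in isolation as follows. It is a simplified illustration under stated assumptions: `struct viewport` and `guardband_test_allowed` are hypothetical names, and the real code clears a bit in `dw2` rather than returning a boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of ctx->ViewportArray. */
struct viewport { float x, y, width, height; };

/* Returns true when the guardband test may stay enabled.
 * Gen8+ relies on the viewport extents test, so it is always allowed.
 * Earlier generations require every viewport to cover the whole drawable. */
static bool
guardband_test_allowed(int gen, const struct viewport *vps, size_t count,
                       float fb_width, float fb_height)
{
   if (gen >= 8)
      return true;

   for (size_t i = 0; i < count; i++) {
      if (vps[i].x != 0 || vps[i].y != 0 ||
          vps[i].width != fb_width || vps[i].height != fb_height)
         return false;   /* one partial viewport is enough to disable it */
   }
   return true;
}
```

For example, with a 640x480 drawable, a 320x240 viewport would disable the guardband test on a pre-Gen8 part but leave it enabled on Gen8.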
  
 



Re: [Mesa-dev] [PATCH 2/4] i965: Viewport extents on GEN8

2014-08-09 Thread Kenneth Graunke

On Monday, August 04, 2014 12:24:02 PM Ben Widawsky wrote:
 Viewport extents are a 3rd rectangle that defines which pixels get
 discarded as part of the rasterization process. This can potentially
 improve performance by reducing cache usage, and freeing up PS cycles.

I'm not sure about cache effects...pretty sure it doesn't save PS cycles.

 It also permits the use of guardband clipping in all cases (see later
 patch). The actual pixels drawn to the screen are an intersection of the
 drawing rectangle, the viewport extents, and the scissor rectangle.
 
 Scissor rectangle is not super important this discussion as it should
 ^^^ to/for?
 always help do the right thing provided the programmer uses it.
 
 switch (viewport dimensions, drawrect dimension) {
    case viewport > drawing rectangle: no effects; break;
    case viewport == drawing rectangle: no effects; break;
    case viewport < drawing rectangle:
       Pixels (after the viewport transformation but before expensive
       rasterizing and shading operations) which are outside of the
       viewport are discarded.

As we discussed, the 3D clipper normally gets involved and trims off any 
geometry outside of the viewport, but within the drawing rectangle.  So, 
expensive pixel shading operations would not happen regardless.

I think the main point of this patch is your earlier comment:
"The actual pixels drawn to the screen are an intersection of the
drawing rectangle, the viewport extents, and the scissor rectangle."

The previous code programmed the viewport extents to the intersection of the 
viewport, drawing rectangle, and scissor rectangle.  This is unnecessary, 
because the hardware does that intersection for us.  So we should simply 
program it to the viewport.
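
A minimal sketch of "program it to the viewport", following the arithmetic in the patch below: the extents are inclusive, and the non-FBO path flips Y against the drawbuffer height. The struct and function names here are hypothetical and only mirror the fields the patch reads.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the viewport state and the four extent values
 * the patch stores in vp[12]..vp[15]; illustration only. */
struct viewport { float x, y, width, height; };
struct extents { float xmin, xmax, ymin, ymax; };

static struct extents
screen_space_extents(struct viewport vp, float drawbuffer_height,
                     bool render_to_fbo)
{
   float xmax = vp.x + vp.width;
   float ymax = vp.y + vp.height;
   struct extents e;

   e.xmin = vp.x;
   e.xmax = xmax - 1;              /* extents are inclusive */
   if (render_to_fbo) {
      e.ymin = vp.y;
      e.ymax = ymax - 1;
   } else {
      /* Non-FBO path: Y is flipped against the drawbuffer height. */
      e.ymin = drawbuffer_height - ymax;
      e.ymax = drawbuffer_height - vp.y - 1;
   }
   return e;
}
```

Note that nothing here reads scissor or drawing-rectangle state; per the review comment, the hardware performs that intersection itself.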

Also, please do change the title to include some sort of verb.  For example,

   i965: Simplify Gen8+ viewport extents programming.

Commit messages aside!  Your code looks good, and is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Thank you for doing this!

 }
 
 I am unable to find a test case where this improves performance, but in
 all my testing it doesn't hurt performance, and intuitively, it should
 not ever hurt performance. It also permits us to use the guardband more
 freely (see upcoming patch).
 
 v2: Updating commit message.
 ---
  src/mesa/drivers/dri/i965/gen8_viewport_state.c | 24 +++-
  1 file changed, 15 insertions(+), 9 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/gen8_viewport_state.c 
 b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 index b5171e0..04a4530 100644
 --- a/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_viewport_state.c
 @@ -100,17 +100,23 @@ gen8_upload_sf_clip_viewport(struct brw_context *brw)
        vp[10] = -gby; /* y-min */
        vp[11] =  gby; /* y-max */
  
 -      /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport */
 +      /* _NEW_SCISSOR | _NEW_VIEWPORT | _NEW_BUFFERS: Screen Space Viewport
 +       * The hardware will take the intersection of the drawing rectangle,
 +       * scissor rectangle, and the viewport extents. We don't need to be
 +       * smart, and can therefore just program the viewport extents.
 +       */
 +      float viewport_Xmax = ctx->ViewportArray[i].X + ctx->ViewportArray[i].Width;
 +      float viewport_Ymax = ctx->ViewportArray[i].Y + ctx->ViewportArray[i].Height;
        if (render_to_fbo) {
 -         vp[12] = ctx->DrawBuffer->_Xmin;
 -         vp[13] = ctx->DrawBuffer->_Xmax - 1;
 -         vp[14] = ctx->DrawBuffer->_Ymin;
 -         vp[15] = ctx->DrawBuffer->_Ymax - 1;
 +         vp[12] = ctx->ViewportArray[i].X;
 +         vp[13] = viewport_Xmax - 1;
 +         vp[14] = ctx->ViewportArray[i].Y;
 +         vp[15] = viewport_Ymax - 1;
        } else {
 -         vp[12] = ctx->DrawBuffer->_Xmin;
 -         vp[13] = ctx->DrawBuffer->_Xmax - 1;
 -         vp[14] = ctx->DrawBuffer->Height - ctx->DrawBuffer->_Ymax;
 -         vp[15] = ctx->DrawBuffer->Height - ctx->DrawBuffer->_Ymin - 1;
 +         vp[12] = ctx->ViewportArray[i].X;
 +         vp[13] = viewport_Xmax - 1;
 +         vp[14] = ctx->DrawBuffer->Height - viewport_Ymax;
 +         vp[15] = ctx->DrawBuffer->Height - ctx->ViewportArray[i].Y - 1;
        }
  
        vp += 16;
 



Re: [Mesa-dev] [PATCH 1/6] r600g: remove useless r600_resource_va calls

2014-08-09 Thread Marek Olšák

On Thu, Aug 7, 2014 at 4:06 PM, Alex Deucher alexdeuc...@gmail.com wrote:
 On Wed, Aug 6, 2014 at 5:49 PM, Marek Olšák mar...@gmail.com wrote:
 From: Marek Olšák marek.ol...@amd.com

 R600-R700 don't support virtual memory.

 For consistency, it might be nice to use gpu_address here as well, but
 just set it to 0 for 6xx/7xx.  Either way, series is:

Well, r600_resource_va isn't used elsewhere in the file. These are
only a few cases that seem to have been copied from the same evergreen
code. I assumed somebody had forgotten to clean them up.

Marek

 Reviewed-by: Alex Deucher alexander.deuc...@amd.com

 ---
  src/gallium/drivers/r600/r600_state.c | 27 +--
  1 file changed, 9 insertions(+), 18 deletions(-)

 diff --git a/src/gallium/drivers/r600/r600_state.c 
 b/src/gallium/drivers/r600/r600_state.c
 index 258ffd1..607b199 100644
 --- a/src/gallium/drivers/r600/r600_state.c
 +++ b/src/gallium/drivers/r600/r600_state.c
 @@ -595,25 +595,22 @@ texture_buffer_sampler_view(struct r600_pipe_sampler_view *view,
                             unsigned width0, unsigned height0)

  {
 -       struct pipe_context *ctx = view->base.context;
         struct r600_texture *tmp = (struct r600_texture*)view->base.texture;
 -       uint64_t va;
         int stride = util_format_get_blocksize(view->base.format);
         unsigned format, num_format, format_comp, endian;
 -       unsigned offset = view->base.u.buf.first_element * stride;
 +       uint64_t offset = view->base.u.buf.first_element * stride;
         unsigned size = (view->base.u.buf.last_element - view->base.u.buf.first_element + 1) * stride;

         r600_vertex_data_type(view->base.format,
                               &format, &num_format, &format_comp,
                               &endian);

 -       va = r600_resource_va(ctx->screen, view->base.texture) + offset;
         view->tex_resource = &tmp->resource;
 -
         view->skip_mip_address_reloc = true;
 -       view->tex_resource_words[0] = va;
 +
 +       view->tex_resource_words[0] = offset;
         view->tex_resource_words[1] = size - 1;
 -       view->tex_resource_words[2] = S_038008_BASE_ADDRESS_HI(va >> 32UL) |
 +       view->tex_resource_words[2] = S_038008_BASE_ADDRESS_HI(offset >> 32UL) |
                 S_038008_STRIDE(stride) |
                 S_038008_DATA_FORMAT(format) |
                 S_038008_NUM_FORMAT_ALL(num_format) |
 @@ -1105,8 +1102,7 @@ static void r600_init_depth_surface(struct r600_context *rctx,

         /* use htile only for first level */
         if (rtex->htile_buffer && !level) {
 -               uint64_t va = r600_resource_va(&rctx->screen->b.b, &rtex->htile_buffer->b.b);
 -               surf->db_htile_data_base = va >> 8;
 +               surf->db_htile_data_base = 0;
                 surf->db_htile_surface = S_028D24_HTILE_WIDTH(1) |
                         S_028D24_HTILE_HEIGHT(1) |
                         S_028D24_FULL_CACHE(1) |
 @@ -1944,7 +1940,6 @@ static void r600_emit_shader_stages(struct r600_context *rctx, struct r600_atom

  static void r600_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
  {
 -       struct pipe_screen *screen = rctx->b.b.screen;
         struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs;
         struct r600_gs_rings_state *state = (struct r600_gs_rings_state*)a;
         struct r600_resource *rbuffer;
 @@ -1955,8 +1950,7 @@ static void r600_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)

         if (state->enable) {
                 rbuffer =(struct r600_resource*)state->esgs_ring.buffer;
 -               r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE,
 -                               (r600_resource_va(screen, &rbuffer->b.b)) >> 8);
 +               r600_write_config_reg(cs, R_008C40_SQ_ESGS_RING_BASE, 0);
                 radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
                 radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer,
                                                       RADEON_USAGE_READWRITE,
 @@ -1965,8 +1959,7 @@ static void r600_emit_gs_rings(struct r600_context *rctx, struct r600_atom *a)
                                 state->esgs_ring.buffer_size >> 8);

                 rbuffer =(struct r600_resource*)state->gsvs_ring.buffer;
 -               r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE,
 -                               (r600_resource_va(screen, &rbuffer->b.b)) >> 8);
 +               r600_write_config_reg(cs, R_008C48_SQ_GSVS_RING_BASE, 0);
                 radeon_emit(cs, PKT3(PKT3_NOP, 0, 0));
                 radeon_emit(cs, r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, rbuffer,
                                                       RADEON_USAGE_READWRITE,
 @@ -2644,8 +2637,7 @@ void r600_update_gs_state(struct pipe_context *ctx, struct r600_pipe_shader *sha
         r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_GS,
                                S_02887C_NUM_GPRS(rshader->bc.ngpr) |

Re: [Mesa-dev] [PATCH] r600g/compute: Fix Warnings

2014-08-09 Thread Marek Olšák

Hi Bruno,

Sorry, I fixed the warnings by myself before I saw your patch.

Marek

On Thu, Aug 7, 2014 at 12:07 PM, Bruno Jiménez brunoji...@gmail.com wrote:
 I have followed the following convention:
 - Positions in the pool are now 'int' (start_in_dw and related)
 - Sizes are 'unsigned' (size_in_dw and related)
 - IDs are 'unsigned'

 The pool and item's status are left as uint32_t
 The shadow has been also left as a pointer to an uint32_t
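
 A small sketch of why the convention matters for the debug output: each printf-style conversion has to match the argument type, so `int` positions use %i and `unsigned` sizes and IDs use %u. The `struct item` and `describe_item` names are hypothetical; only the field naming follows the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical item following the convention above: the position is a
 * signed int (so it can hold -1, the failure value documented for
 * compute_memory_prealloc_chunk), sizes and IDs are unsigned. */
struct item {
   int start_in_dw;
   unsigned size_in_dw;
   unsigned id;
};

/* Formats an item with conversions that match the field types:
 * %i for the int position, %u for the unsigned id and sizes. */
static int describe_item(char *buf, size_t len, const struct item *it)
{
   return snprintf(buf, len, "offset = %i id = %u size = %u (%u bytes)",
                   it->start_in_dw, it->id, it->size_in_dw,
                   it->size_in_dw * 4);
}
```

 Passing an `int64_t` to %i or an `unsigned` to %ld is the kind of mismatch that compilers flag with format warnings.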
 ---
  src/gallium/drivers/r600/compute_memory_pool.c | 56 
 +-
  src/gallium/drivers/r600/compute_memory_pool.h | 26 ++--
  2 files changed, 41 insertions(+), 41 deletions(-)

 diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
 b/src/gallium/drivers/r600/compute_memory_pool.c
 index 0ee8ceb..8bcab00 100644
 --- a/src/gallium/drivers/r600/compute_memory_pool.c
 +++ b/src/gallium/drivers/r600/compute_memory_pool.c
 @@ -76,7 +76,7 @@ static void compute_memory_pool_init(struct compute_memory_pool * pool,
         unsigned initial_size_in_dw)
  {

 -       COMPUTE_DBG(pool->screen, "* compute_memory_pool_init() initial_size_in_dw = %ld\n",
 +       COMPUTE_DBG(pool->screen, "* compute_memory_pool_init() initial_size_in_dw = %u\n",
                 initial_size_in_dw);

         pool->size_in_dw = initial_size_in_dw;
 @@ -104,9 +104,9 @@ void compute_memory_pool_delete(struct compute_memory_pool* pool)
   * \param size_in_dw   The size of the space we are looking for.
   * \return -1 on failure
   */
 -int64_t compute_memory_prealloc_chunk(
 +int compute_memory_prealloc_chunk(
         struct compute_memory_pool* pool,
 -       int64_t size_in_dw)
 +       unsigned size_in_dw)
  {
         struct compute_memory_item *item;

 @@ -114,7 +114,7 @@ int64_t compute_memory_prealloc_chunk(

         assert(size_in_dw <= pool->size_in_dw);

 -       COMPUTE_DBG(pool->screen, "* compute_memory_prealloc_chunk() size_in_dw = %ld\n",
 +       COMPUTE_DBG(pool->screen, "* compute_memory_prealloc_chunk() size_in_dw = %u\n",
                 size_in_dw);

         LIST_FOR_EACH_ENTRY(item, pool->item_list, link) {
 @@ -139,13 +139,13 @@ int64_t compute_memory_prealloc_chunk(
   */
  struct list_head *compute_memory_postalloc_chunk(
         struct compute_memory_pool* pool,
 -       int64_t start_in_dw)
 +       int start_in_dw)
  {
         struct compute_memory_item *item;
         struct compute_memory_item *next;
         struct list_head *next_link;

 -       COMPUTE_DBG(pool->screen, "* compute_memory_postalloc_chunck() start_in_dw = %ld\n",
 +       COMPUTE_DBG(pool->screen, "* compute_memory_postalloc_chunck() start_in_dw = %i\n",
                 start_in_dw);

         /* Check if we can insert it in the front of the list */
 @@ -181,12 +181,12 @@ struct list_head *compute_memory_postalloc_chunk(
   * \see compute_memory_finalize_pending
   */
  int compute_memory_grow_defrag_pool(struct compute_memory_pool *pool,
 -       struct pipe_context *pipe, int new_size_in_dw)
 +       struct pipe_context *pipe, unsigned new_size_in_dw)
  {
         new_size_in_dw = align(new_size_in_dw, ITEM_ALIGNMENT);

         COMPUTE_DBG(pool->screen, "* compute_memory_grow_defrag_pool() "
 -               "new_size_in_dw = %d (%d bytes)\n",
 +               "new_size_in_dw = %u (%u bytes)\n",
                 new_size_in_dw, new_size_in_dw * 4);

         assert(new_size_in_dw >= pool->size_in_dw);
 @@ -274,17 +274,17 @@ int compute_memory_finalize_pending(struct compute_memory_pool* pool,
  {
         struct compute_memory_item *item, *next;

 -       int64_t allocated = 0;
 -       int64_t unallocated = 0;
 -       int64_t last_pos;
 +       unsigned allocated = 0;
 +       unsigned unallocated = 0;
 +       int last_pos;

         int err = 0;

         COMPUTE_DBG(pool->screen, "* compute_memory_finalize_pending()\n");

         LIST_FOR_EACH_ENTRY(item, pool->item_list, link) {
 -               COMPUTE_DBG(pool->screen, "  + list: offset = %i id = %i size = %i "
 -                       "(%i bytes)\n",item->start_in_dw, item->id,
 +               COMPUTE_DBG(pool->screen, "  + list: offset = %i id = %u size = %u "
 +                       "(%u bytes)\n", item->start_in_dw, item->id,
                         item->size_in_dw, item->size_in_dw * 4);
         }

 @@ -347,7 +347,7 @@ void compute_memory_defrag(struct compute_memory_pool *pool,
         struct pipe_context *pipe)
  {
         struct compute_memory_item *item;
 -       int64_t last_pos;
 +       int last_pos;

         COMPUTE_DBG(pool->screen, "* compute_memory_defrag()\n");

 @@ -374,7 +374,7 @@ void compute_memory_defrag(struct compute_memory_pool *pool,
   */
  int compute_memory_promote_item(struct compute_memory_pool *pool,
         struct compute_memory_item *item, struct pipe_context *pipe,
 -       int64_t start_in_dw)
 +       int start_in_dw)
  {
         struct pipe_screen *screen = (struct pipe_screen *)pool->screen;
         struct r600_context *rctx = (struct r600_context

[Mesa-dev] [PATCH] radeonsi: simplify constant buffer upload for big endian

2014-08-09 Thread Marek Olšák

From: Marek Olšák marek.ol...@amd.com

Point util_memcpy_cpu_to_le32 to a buffer storage directly.
---
 src/gallium/drivers/radeonsi/si_descriptors.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 81ad14b..bfd2b76 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -649,20 +649,11 @@ void si_upload_const_buffer(struct si_context *sctx, struct r600_resource **rbuf
                            const uint8_t *ptr, unsigned size, uint32_t *const_offset)
 {
        if (SI_BIG_ENDIAN) {
-               uint32_t *tmpPtr;
-               unsigned i;
-
-               if (!(tmpPtr = malloc(size))) {
-                       R600_ERR("Failed to allocate BE swap buffer.\n");
-                       return;
-               }
+               void *tmpPtr;
 
+               u_upload_alloc(sctx->b.uploader, 0, size, const_offset,
+                              (struct pipe_resource**)rbuffer, &tmpPtr);
                util_memcpy_cpu_to_le32(tmpPtr, ptr, size);
-
-               u_upload_data(sctx->b.uploader, 0, size, tmpPtr, const_offset,
-                               (struct pipe_resource**)rbuffer);
-
-               free(tmpPtr);
        } else {
                u_upload_data(sctx->b.uploader, 0, size, ptr, const_offset,
                                (struct pipe_resource**)rbuffer);
-- 
1.9.1
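
The idea behind the patch, converting while writing straight into the destination storage instead of going through a temporary malloc'd buffer, can be sketched as below. `copy_words_to_le32` is a hypothetical helper written for illustration; it is not Mesa's util_memcpy_cpu_to_le32 and makes no claim about how that function is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes `count` 32-bit words from `src` into `dst`
 * in little-endian byte order, whatever the host's endianness. Because it
 * writes byte by byte, `dst` can be the final upload storage directly. */
static void copy_words_to_le32(uint8_t *dst, const uint32_t *src, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      uint32_t w = src[i];
      dst[4 * i + 0] = (uint8_t)(w & 0xff);          /* least significant */
      dst[4 * i + 1] = (uint8_t)((w >> 8) & 0xff);
      dst[4 * i + 2] = (uint8_t)((w >> 16) & 0xff);
      dst[4 * i + 3] = (uint8_t)((w >> 24) & 0xff);  /* most significant */
   }
}
```

In this sketch the caller would pass the pointer obtained from the allocator as `dst`, so no intermediate copy or free is needed.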


Re: [Mesa-dev] [PATCH 0/4] BDW viewport extents + misc

2014-08-09 Thread Ben Widawsky

On Sat, Aug 09, 2014 at 12:07:58PM -0700, Ben Widawsky wrote:
 I realize it hasn't even been a week yet, but my remaining 2 weeks until
 my sabbatical have just filled up, so if anyone needs me to rework this,
 the sooner you let me know the better.

Hi Ken. Thanks a lot for reviewing it. I meant to incorporate all the
changes you requested. If you don't see one there, please let me know.

I've pushed a rebased branch here:
http://cgit.freedesktop.org/~bwidawsk/mesa/log/?h=bdw-extents

Do you have any idea or comments about the fixed piglit tests? I haven't
actually run the rebased branch yet to re-confirm the results. I was
just curious...

 

-- 
Ben Widawsky, Intel Open Source Technology Center