date:20170126

Re: [Mesa-dev] [PATCH] gbm: add support for loading third-party backend

2017-01-26 Thread Michel Dänzer

On 25/01/17 07:34 PM, Yu, Qiang wrote:
>> From: Michel Dänzer 
>> On 24/01/17 12:36 PM, Qiang Yu wrote:
>>> Third-party can put their backend to a directory configured with
>>> '--with-gbm-backenddir' and create a /etc/gbm.conf.d/*.conf file
>>> which contains the backend .so file name to override the default
>>> builtin DRI backend.
>>>
>>> The /etc/gbm.conf.d/*.conf will be sorted and the backends added
>>> will be tried one-by-one until one can successfully create a gbm
>>> device. The default DRI backend is tried last.
>> 
>> If I understand correctly, any third-party backends will always take
>> priority over Mesa-internal backends. Is everyone okay with that? I'm a
>> little worried it might cause problems, but I can't come up with a
>> specific scenario now, and maybe it can be addressed if and when it
>> causes a problem in practice.
> 
> Right, the order can only be overridden with the GBM_BACKEND environment
> variable.
> 
> If we make the DRI backend also configurable by the conf file or give
> a "priority" property to a backend in conf and assign a default
> priority to DRI backend, your worry can be addressed.

Yeah, something like that is probably needed. Otherwise, I suspect e.g.
glamor will break if the amdgpu-pro GBM backend is installed, but the GL
libraries use Mesa, e.g. via GLVND.

Ideally, the GLVND and GBM backend selection mechanisms would be
integrated somehow to prevent that kind of inconsistency (at least by
default, unless the user goes out of their way to shoot themselves in
the foot :).
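
The "priority" idea discussed above could look roughly like the following. This is a minimal sketch, not the proposed patch: struct backend_entry, select_backend() and the probe callbacks are hypothetical names invented for illustration, and the real loader would dlopen() the backends instead of calling function pointers directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend record; none of these names exist in Mesa's GBM. */
struct backend_entry {
   const char *name;
   int priority;            /* higher values are tried first */
   int (*probe)(int fd);    /* returns 1 if the backend can drive fd */
};

/* Example probes used to exercise the selection logic. */
static int probe_always(int fd) { (void)fd; return 1; }
static int probe_never(int fd)  { (void)fd; return 0; }

static int
compare_priority_desc(const void *a, const void *b)
{
   const struct backend_entry *x = a;
   const struct backend_entry *y = b;

   /* Descending by priority; ties are broken by name for stable results. */
   if (x->priority != y->priority)
      return (x->priority < y->priority) - (x->priority > y->priority);
   return strcmp(x->name, y->name);
}

/* Returns the name of the first backend whose probe succeeds, or NULL. */
static const char *
select_backend(struct backend_entry *entries, size_t count, int fd)
{
   assert(entries != NULL || count == 0);

   qsort(entries, count, sizeof(entries[0]), compare_priority_desc);
   for (size_t i = 0; i < count; i++) {
      if (entries[i].probe(fd))
         return entries[i].name;
   }
   return NULL;
}
```

With a default priority assigned to the DRI backend, a third-party backend would only win when its conf file asks for a higher priority and its probe succeeds.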


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radv: drop support for VK_AMD_NEGATIVE_VIEWPORT_HEIGHT

2017-01-26 Thread Andres Rodriguez

This extension was not correctly supported, and it conflicts with the
VK_KHR_MAINTENANCE1 spec.
---
 src/amd/vulkan/radv_device.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 7f68cdc..9f05dd6 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -119,10 +119,6 @@ static const VkExtensionProperties 
common_device_extensions[] = {
.extensionName = VK_AMD_DRAW_INDIRECT_COUNT_EXTENSION_NAME,
.specVersion = 1,
},
-   {
-   .extensionName = VK_AMD_NEGATIVE_VIEWPORT_HEIGHT_EXTENSION_NAME,
-   .specVersion = 1,
-   },
 };
 
 static VkResult
-- 
2.9.3


[Mesa-dev] [PATCH 3/7] radv: use new error codes for AllocateDescriptorSets

2017-01-26 Thread Andres Rodriguez

There is a new error code in Maintenance1 that is more specific to the
situation: VK_ERROR_OUT_OF_POOL_MEMORY_KHR

Fixes CTS test case:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
---
 src/amd/vulkan/radv_descriptor_set.c | 2 +-
 src/amd/vulkan/radv_util.c   | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_descriptor_set.c 
b/src/amd/vulkan/radv_descriptor_set.c
index eb8b5d6..6d89d60 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -298,7 +298,7 @@ radv_descriptor_set_create(struct radv_device *device,
 
if (entry < 0) {
vk_free2(&device->alloc, NULL, set);
-   return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
+   return vk_error(VK_ERROR_OUT_OF_POOL_MEMORY_KHR);
}
offset = pool->free_nodes[entry].offset;
pool->free_nodes[entry].next = pool->full_list;
diff --git a/src/amd/vulkan/radv_util.c b/src/amd/vulkan/radv_util.c
index c642bb7..9da442d 100644
--- a/src/amd/vulkan/radv_util.c
+++ b/src/amd/vulkan/radv_util.c
@@ -79,6 +79,7 @@ __vk_errorf(VkResult error, const char *file, int line, const 
char *format, ...)
/* Core errors */
ERROR_CASE(VK_ERROR_OUT_OF_HOST_MEMORY)
ERROR_CASE(VK_ERROR_OUT_OF_DEVICE_MEMORY)
+   ERROR_CASE(VK_ERROR_OUT_OF_POOL_MEMORY_KHR)
ERROR_CASE(VK_ERROR_INITIALIZATION_FAILED)
ERROR_CASE(VK_ERROR_DEVICE_LOST)
ERROR_CASE(VK_ERROR_MEMORY_MAP_FAILED)
-- 
2.9.3
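
The distinction the patch makes can be sketched outside of radv as follows. This is an illustration under stated assumptions: the result codes and the bump allocator below are stand-ins (the real codes come from vulkan.h, and radv tracks free nodes rather than a single offset). The point is only that running out of space inside a pool is reported differently from running out of device memory.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in result codes; the real values live in vulkan.h. */
enum sketch_result {
   SKETCH_SUCCESS,
   SKETCH_ERROR_OUT_OF_DEVICE_MEMORY,   /* the device itself is exhausted */
   SKETCH_ERROR_OUT_OF_POOL_MEMORY,     /* only this pool is exhausted */
};

/* Hypothetical descriptor pool reduced to a bump allocator. */
struct sketch_pool {
   uint32_t size;   /* bytes the pool was created with */
   uint32_t used;   /* bytes already handed out */
};

static enum sketch_result
sketch_pool_alloc(struct sketch_pool *pool, uint32_t bytes, uint32_t *offset)
{
   assert(pool->used <= pool->size);

   /* The pool is full: the application can recover by using another pool. */
   if (bytes > pool->size - pool->used)
      return SKETCH_ERROR_OUT_OF_POOL_MEMORY;

   *offset = pool->used;
   pool->used += bytes;
   return SKETCH_SUCCESS;
}
```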


[Mesa-dev] [PATCH 1/7] radv: add trim command pool stub

2017-01-26 Thread Andres Rodriguez

---
 src/amd/vulkan/radv_cmd_buffer.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index c62d275..0b090b7 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1895,6 +1895,13 @@ VkResult radv_ResetCommandPool(
return VK_SUCCESS;
 }
 
+void radv_TrimCommandPoolKHR(
+VkDevicedevice,
+VkCommandPool   commandPool,
+VkCommandPoolTrimFlagsKHR   flags)
+{
+}
+
 void radv_CmdBeginRenderPass(
VkCommandBuffer commandBuffer,
const VkRenderPassBeginInfo*pRenderPassBegin,
-- 
2.9.3


[Mesa-dev] [PATCH 6/7] radv: Fix vkCmdCopyImage for 2d slices into 3d Images

2017-01-26 Thread Andres Rodriguez

Previously the z offset of the destination image was being ignored. It
should be taken into account when copying into a 3d target.

Also, img_extent_el.depth was being incorrectly clamped to 1 due to the
source image being VK_IMAGE_TYPE_2D. This would result in the blit
failing to iterate over all the 3d slices. Instead we clamp to the
destination image type.

Fixes failures in CTS tests:
dEQP-VK.api.copy_and_blit.image_to_image.3d_images.*
---
 src/amd/vulkan/radv_meta_copy.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_meta_copy.c b/src/amd/vulkan/radv_meta_copy.c
index 64e0ea8..7258e0c 100644
--- a/src/amd/vulkan/radv_meta_copy.c
+++ b/src/amd/vulkan/radv_meta_copy.c
@@ -369,7 +369,7 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
const VkOffset3D src_offset_el =
meta_region_offset_el(src_image, &pRegions[r].srcOffset);
const VkExtent3D img_extent_el =
-   meta_region_extent_el(src_image, &pRegions[r].extent);
+   meta_region_extent_el(dest_image, &pRegions[r].extent);
 
/* Start creating blit rect */
struct radv_meta_blit2d_rect rect = {
@@ -377,6 +377,9 @@ meta_copy_image(struct radv_cmd_buffer *cmd_buffer,
.height = img_extent_el.height,
};
 
+   if (dest_image->type == VK_IMAGE_TYPE_3D)
+   b_dst.layer = dst_offset_el.z;
+
/* Loop through each 3D or array slice */
unsigned num_slices_3d = img_extent_el.depth;
unsigned num_slices_array = pRegions[r].dstSubresource.layerCount;
-- 
2.9.3


[Mesa-dev] [PATCH 4/7] radv: Don't allow any operations on non-supported depth/stencil formats.

2017-01-26 Thread Andres Rodriguez

From: Bas Nieuwenhuizen 

We really use the depth block for the blits.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_formats.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index e276432..f56f67c 100644
--- a/src/amd/vulkan/radv_formats.c
+++ b/src/amd/vulkan/radv_formats.c
@@ -565,11 +565,12 @@ radv_physical_device_get_format_properties(struct 
radv_physical_device *physical
}
 
if (vk_format_is_depth_or_stencil(format)) {
-   if (radv_is_zs_format_supported(format))
+   if (radv_is_zs_format_supported(format)) {
tiled |= VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT;
-   tiled |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
-   tiled |= VK_FORMAT_FEATURE_BLIT_SRC_BIT |
-   VK_FORMAT_FEATURE_BLIT_DST_BIT;
+   tiled |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
+   tiled |= VK_FORMAT_FEATURE_BLIT_SRC_BIT |
+VK_FORMAT_FEATURE_BLIT_DST_BIT;
+   }
} else {
bool linear_sampling;
if (radv_is_sampler_format_supported(format, &linear_sampling)) {
-- 
2.9.3


[Mesa-dev] [PATCH 7/7] radv: Expose VK_KHR_maintenance1

2017-01-26 Thread Andres Rodriguez

---
 src/amd/vulkan/radv_device.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 4aa6af2..7f68cdc 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -104,6 +104,10 @@ static const VkExtensionProperties instance_extensions[] = 
{
 
 static const VkExtensionProperties common_device_extensions[] = {
{
+   .extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME,
+   .specVersion = 1,
+   },
+   {
.extensionName = 
VK_KHR_SAMPLER_MIRROR_CLAMP_TO_EDGE_EXTENSION_NAME,
.specVersion = 1,
},
-- 
2.9.3


[Mesa-dev] radv: Implement VK_KHR_maintenance1

2017-01-26 Thread Andres Rodriguez

This series implements the VK_KHR_maintenance1 extension. It is loosely
based on jekstrand's series for anv.

This series soft-depends on patches from Bas that are not yet in master.
I'm not certain of the protocol for this kind of situation. I've left them included
in this series, but if I should do something differently in the future let me
know.

Tested against the tests that reference Maintenance1 from cts branch 
vulkan-cts-1.0.2:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.descriptor_set
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
dEQP-VK.synchronization.op.multi_queue.*.write_fill_buffer*
dEQP-VK.pipeline.render_to_image.3d.*
dEQP-VK.geometry.layered.3d.*
dEQP-VK.api.command_buffers.trim_command_pool
dEQP-VK.api.command_buffers.trim_command_pool_secondary
dEQP-VK.api.descriptor_pool.out_of_pool_memory
dEQP-VK.api.info.image_format_properties.*
dEQP-VK.draw.negative_viewport_height.*
dEQP-VK.api.copy_and_blit.image_to_image.3d_images.*

Test run totals:
  Passed:1200/2347 (51.1%)
  Failed:0/2347 (0.0%)
  Not supported: 1147/2347 (48.9%)
  Warnings:  0/2347 (0.0%)

The 'Not supported' tests are due to geometry shaders, which should be landing 
soon(TM).


[Mesa-dev] [PATCH 5/7] radv: Expose transfer format features.

2017-01-26 Thread Andres Rodriguez

From: Bas Nieuwenhuizen 

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_formats.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index f56f67c..c968cef 100644
--- a/src/amd/vulkan/radv_formats.c
+++ b/src/amd/vulkan/radv_formats.c
@@ -570,6 +570,8 @@ radv_physical_device_get_format_properties(struct 
radv_physical_device *physical
tiled |= VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT;
tiled |= VK_FORMAT_FEATURE_BLIT_SRC_BIT |
 VK_FORMAT_FEATURE_BLIT_DST_BIT;
+   tiled |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
+VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
}
} else {
bool linear_sampling;
@@ -591,6 +593,15 @@ radv_physical_device_get_format_properties(struct 
radv_physical_device *physical
tiled |= 
VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BLEND_BIT;
}
}
+   if (util_is_power_of_two(vk_format_get_blocksize(format))) {
+   tiled |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
+VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
+   }
+   }
+
+   if (util_is_power_of_two(vk_format_get_blocksize(format))) {
+   linear |= VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR |
+ VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR;
}
 
if (format == VK_FORMAT_R32_UINT || format == VK_FORMAT_R32_SINT) {
-- 
2.9.3
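
The condition used by the patch can be illustrated on its own. This is a sketch under stated assumptions: the flag values and function names are placeholders rather than the Vulkan definitions, and the helper below rejects zero, which Mesa's util_is_power_of_two may not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder feature bits; the real ones are VkFormatFeatureFlagBits. */
#define SKETCH_TRANSFER_SRC (1u << 0)
#define SKETCH_TRANSFER_DST (1u << 1)

/* True for 1, 2, 4, 8, ...; zero is treated as "not a power of two" here. */
static bool
sketch_is_power_of_two(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Formats whose block size is a power of two get both transfer bits. */
static uint32_t
sketch_transfer_features(uint32_t block_size_bytes)
{
   if (sketch_is_power_of_two(block_size_bytes))
      return SKETCH_TRANSFER_SRC | SKETCH_TRANSFER_DST;
   return 0;
}
```

A 12-byte block (for example a three-channel 32-bit format) would therefore not advertise the transfer bits in this sketch, while 4, 8 and 16 byte blocks would.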


[Mesa-dev] [PATCH 2/7] radv: vkAllocateCommandBuffers should NULL all output handles

2017-01-26 Thread Andres Rodriguez

This is part of the spec and fixes CTS tests:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_*
---
 src/amd/vulkan/radv_cmd_buffer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 0b090b7..a549a8e 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -1344,6 +1344,9 @@ VkResult radv_AllocateCommandBuffers(
VkResult result = VK_SUCCESS;
uint32_t i;
 
+   memset(pCommandBuffers, 0,
+          sizeof(*pCommandBuffers)*pAllocateInfo->commandBufferCount);
+
for (i = 0; i < pAllocateInfo->commandBufferCount; i++) {
result = radv_create_cmd_buffer(device, pool, pAllocateInfo->level,
&pCommandBuffers[i]);
-- 
2.9.3
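
The behaviour the patch is after can be sketched with plain malloc(). Everything below is hypothetical (struct sketch_cmd_buffer, sketch_allocate() and the fail_at knob that simulates an allocation failure); it only shows the pattern of clearing every output handle first so that a partial failure never leaves stale pointers behind. It assumes, as the patch does, that an all-zero-bits pointer is NULL.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a command buffer object. */
struct sketch_cmd_buffer {
   int id;
};

/*
 * Allocate `count` buffers into `out`. Index `fail_at` simulates an
 * out-of-memory failure; pass UINT32_MAX to never fail artificially.
 * Returns 0 on success, -1 on failure with every entry of `out` NULL.
 */
static int
sketch_allocate(struct sketch_cmd_buffer **out, uint32_t count, uint32_t fail_at)
{
   uint32_t i;

   /* Clear all output handles up front, as the patch does with memset(). */
   memset(out, 0, sizeof(*out) * count);

   for (i = 0; i < count; i++) {
      if (i == fail_at)
         break;
      out[i] = malloc(sizeof(*out[i]));
      if (!out[i])
         break;
      out[i]->id = (int)i;
   }

   if (i < count) {
      /* Unwind the buffers created so far and reset their handles. */
      while (i-- > 0) {
         free(out[i]);
         out[i] = NULL;
      }
      return -1;
   }
   return 0;
}
```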


Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: add VRAM-vis-usage HUD query

2017-01-26 Thread Dieter Nützel


My question is more about the first patch:
[Mesa-dev] [PATCH 1/2] gallium/radeon: query the CPU accessible size of VRAM


Is vram_vis_size > vram_size valid?

After some time I get this on r600/Turks XT/6670/2 GB:

/opt/mesa> setenv R600_DEBUG info
/opt/mesa> glxgears
pci_id = 0x6758
family = 45 (AMD TURKS)
chip_class = 6
gart_size = 1022 MB
vram_size = 2048 MB
vram_vis_size = 2085 MB
max_alloc_size = 1434 MB
has_virtual_memory = 0
gfx_ib_pad_with_type2 = 1
has_sdma = 1
has_uvd = 1
me_fw_version = 0
pfp_fw_version = 0
ce_fw_version = 0
vce_fw_version = 0
vce_harvest_config = 0
clock_crystal_freq = 27000
drm = 2.48.0
has_userptr = 1
r600_max_quad_pipes = 4
max_shader_clock = 800
num_good_compute_units = 6
max_se = 1
max_sh_per_se = 0
r600_gb_backend_map = 0
r600_gb_backend_map_valid = 1
r600_num_banks = 8
num_render_backends = 2
num_tile_pipes = 4
pipe_interleave_bytes = 256
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.

Does this look good?

Thanks,
  Dieter

On 26.01.2017 19:31, Marek Olšák wrote:

For the series:

Reviewed-by: Marek Olšák 

On Jan 25, 2017 5:50 PM, "Nicolai Hähnle"  wrote:


Reviewed-by: Nicolai Hähnle 

On 25.01.2017 16:56, Samuel Pitoiset wrote:


This new query returns the current visible usage of VRAM accessed
by the CPU. It will return 0 on radeon because it's unimplemented.

Signed-off-by: Samuel Pitoiset 
---
src/gallium/drivers/radeon/r600_query.c   | 7 +++
src/gallium/drivers/radeon/r600_query.h   | 1 +
src/gallium/drivers/radeon/radeon_winsys.h| 1 +
src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
5 files changed, 14 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_query.c
b/src/gallium/drivers/radeon/r600_query.c
index 96157cd40e..d4e41306a4 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -71,6 +71,7 @@ static enum radeon_value_id
winsys_id_from_type(unsigned type)
case R600_QUERY_NUM_BYTES_MOVED: return
RADEON_NUM_BYTES_MOVED;
case R600_QUERY_NUM_EVICTIONS: return
RADEON_NUM_EVICTIONS;
case R600_QUERY_VRAM_USAGE: return RADEON_VRAM_USAGE;
+   case R600_QUERY_VRAM_VIS_USAGE: return
RADEON_VRAM_VIS_USAGE;
case R600_QUERY_GTT_USAGE: return RADEON_GTT_USAGE;
case R600_QUERY_GPU_TEMPERATURE: return
RADEON_GPU_TEMPERATURE;
case R600_QUERY_CURRENT_GPU_SCLK: return
RADEON_CURRENT_SCLK;
@@ -129,6 +130,7 @@ static bool r600_query_sw_begin(struct
r600_common_context *rctx,
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
+   case R600_QUERY_VRAM_VIS_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
@@ -238,6 +240,7 @@ static bool r600_query_sw_end(struct
r600_common_context *rctx,
case R600_QUERY_MAPPED_VRAM:
case R600_QUERY_MAPPED_GTT:
case R600_QUERY_VRAM_USAGE:
+   case R600_QUERY_VRAM_VIS_USAGE:
case R600_QUERY_GTT_USAGE:
case R600_QUERY_GPU_TEMPERATURE:
case R600_QUERY_CURRENT_GPU_SCLK:
@@ -1731,6 +1734,7 @@ static struct pipe_driver_query_info
r600_driver_query_list[] = {
X("num-bytes-moved",NUM_BYTES_MOVED,
BYTES, CUMULATIVE),
X("num-evictions",  NUM_EVICTIONS,
UINT64, CUMULATIVE),
X("VRAM-usage", VRAM_USAGE,
BYTES, AVERAGE),
+   X("VRAM-vis-usage", VRAM_VIS_USAGE,
BYTES, AVERAGE),
X("GTT-usage",  GTT_USAGE,
BYTES, AVERAGE),
X("back-buffer-ps-draw-ratio",  BACK_BUFFER_PS_DRAW_RATIO,
UINT64, AVERAGE),

@@ -1814,6 +1818,9 @@ static int r600_get_driver_query_info(struct
pipe_screen *screen,
case R600_QUERY_GPU_TEMPERATURE:
info->max_value.u64 = 125;
break;
+   case R600_QUERY_VRAM_VIS_USAGE:
+   info->max_value.u64 = rscreen->info.vram_vis_size;
+   break;
}

if (info->group_id != ~(unsigned)0 &&
rscreen->perfcounters)
diff --git a/src/gallium/drivers/radeon/r600_query.h
b/src/gallium/drivers/radeon/r600_query.h
index 20856a5b2e..f2af9240d2 100644
--- a/src/gallium/drivers/radeon/r600_query.h
+++ b/src/gallium/drivers/radeon/r600_query.h
@@ -66,6 +66,7 @@ enum {
R600_QUERY_NUM_BYTES_MOVED,
R600_QUERY_NUM_EVICTIONS,
R600_QUERY_VRAM_USAGE,
+   R600_QUERY_VRAM_VIS_USAGE,
R600_QUERY_GTT_USAGE,
R600_QUERY_GPU_TEMPERATURE,
R600_QUERY_CURRENT_GPU_SCLK,
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h
b/src/gallium/drivers/radeon/radeon_winsys.h
index e373e2f0a1..881bd5f2e4 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -88,6 +88,7 @@ enum radeon_value_id {
RADEON_NUM_BYTES_MOVED,
RADEON_NUM_EVICTIONS,
RADEON_VRAM_USAGE,
+RADEON_VRAM_VIS_USAGE,
RADEON_GTT_USAGE,
RADEON_GPU_TEMPERATURE, /* DRM 2.42.0 */
RADEON_CURRENT_SCLK,
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c

[Mesa-dev] [PATCH] radv: use proper maximum slice for layered view

2017-01-26 Thread Dave Airlie

From: Dave Airlie 

This fixes deferred shadows with geometry shaders enabled,
but I think this fix is fine by itself.

Signed-off-by: Dave Airlie 
---
 src/amd/vulkan/radv_device.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index b1819a5..110a51b 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -1666,8 +1666,9 @@ radv_initialise_color_surface(struct radv_device *device,
va += iview->image->dcc_offset;
cb->cb_dcc_base = va >> 8;
 
+   uint32_t max_slice = iview->type == VK_IMAGE_VIEW_TYPE_3D ? iview->extent.depth : iview->layer_count;
cb->cb_color_view = S_028C6C_SLICE_START(iview->base_layer) |
-   S_028C6C_SLICE_MAX(iview->base_layer + iview->extent.depth - 1);
+   S_028C6C_SLICE_MAX(iview->base_layer + max_slice - 1);
 
cb->micro_tile_mode = iview->image->surface.micro_tile_mode;
pitch_tile_max = level_info->nblk_x / 8 - 1;
@@ -1819,8 +1820,9 @@ radv_initialise_ds_surface(struct radv_device *device,
z_offs += iview->image->surface.level[level].offset;
s_offs += iview->image->surface.stencil_level[level].offset;
 
+   uint32_t max_slice = iview->type == VK_IMAGE_VIEW_TYPE_3D ? iview->extent.depth : iview->layer_count;
ds->db_depth_view = S_028008_SLICE_START(iview->base_layer) |
-   S_028008_SLICE_MAX(iview->base_layer + iview->extent.depth - 1);
+   S_028008_SLICE_MAX(iview->base_layer + max_slice - 1);
ds->db_depth_info = S_02803C_ADDR5_SWIZZLE_MASK(1);
ds->db_z_info = S_028040_FORMAT(format) | S_028040_ZRANGE_PRECISION(1);
 
-- 
2.9.3


[Mesa-dev] [PATCH V2 06/37] util: add a disk_cache_remove() function

2017-01-26 Thread Timothy Arceri

From: Timothy Arceri 

This will be used to remove cache items created with old versions
of Mesa or other invalid cache items from the cache.

V2: rename stub function (cache_* functions were renamed disk_cache_*
in master).
---
 src/util/disk_cache.c | 22 ++
 src/util/disk_cache.h | 12 
 2 files changed, 34 insertions(+)

diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c
index 551ceeb..7451b08 100644
--- a/src/util/disk_cache.c
+++ b/src/util/disk_cache.c
@@ -538,6 +538,28 @@ evict_random_item(struct disk_cache *cache)
 }
 
 void
+disk_cache_remove(struct disk_cache *cache, cache_key key)
+{
+   struct stat sb;
+
+   char *filename = get_cache_file(cache, key);
+   if (filename == NULL) {
+  return;
+   }
+
+   if (stat(filename, &sb) == -1) {
+  ralloc_free(filename);
+  return;
+   }
+
+   unlink(filename);
+   ralloc_free(filename);
+
+   if (sb.st_size)
+  p_atomic_add(cache->size, - sb.st_size);
+}
+
+void
 disk_cache_put(struct disk_cache *cache,
   cache_key key,
   const void *data,
diff --git a/src/util/disk_cache.h b/src/util/disk_cache.h
index 7e9cb80..1f2bf3d 100644
--- a/src/util/disk_cache.h
+++ b/src/util/disk_cache.h
@@ -78,6 +78,12 @@ void
 disk_cache_destroy(struct disk_cache *cache);
 
 /**
+ * Remove the item in the cache under the name \key.
+ */
+void
+disk_cache_remove(struct disk_cache *cache, cache_key key);
+
+/**
  * Store an item in the cache under the name \key.
  *
  * The item can be retrieved later with disk_cache_get(), (unless the item has
@@ -151,6 +157,12 @@ disk_cache_put(struct disk_cache *cache, cache_key key,
return;
 }
 
+static inline void
+disk_cache_remove(struct disk_cache *cache, cache_key key)
+{
+   return;
+}
+
 static inline uint8_t *
 disk_cache_get(struct disk_cache *cache, cache_key key, size_t *size)
 {
-- 
2.9.3
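
The removal logic can be tried out standalone with a sketch like the one below. It is not the Mesa implementation: sketch_cache_remove() is a hypothetical helper that takes a file path and a plain counter instead of a struct disk_cache with an atomically updated size, and it deviates from the patch by only adjusting the size when unlink() succeeds.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Remove the file at `path` and subtract its size from `*total_size`.
 * Hypothetical helper; the patch itself derives the file name from the
 * cache key and updates cache->size with p_atomic_add().
 */
static void
sketch_cache_remove(const char *path, uint64_t *total_size)
{
   struct stat sb;

   assert(path != NULL && total_size != NULL);

   /* A missing or unreadable entry leaves the accounting untouched. */
   if (stat(path, &sb) == -1)
      return;

   /* Unlike the patch, skip the size update if the unlink fails. */
   if (unlink(path) == -1)
      return;

   if (sb.st_size > 0)
      *total_size -= (uint64_t)sb.st_size;
}
```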


Re: [Mesa-dev] [PATCH 0/7] i965: Implement EGL_ANDROID_native_fence_sync

2017-01-26 Thread Ben Widawsky


On 17-01-23 15:32:32, Chad Versace wrote:

On Fri 20 Jan 2017, Rafael Antognolli wrote:

I have tested this series with the branches that you mentioned, and with
piglit with the patches from my own branch:

https://github.com/rantogno/piglit/tree/review/fences-v02

Everything seems to work fine. You can add:

Tested-by: Rafael Antognolli 

I also have gone through these patches several times while they were
under development, and they look good to me. So if you disregard my
shallow knowledge of Mesa, you could add:

Reviewed-by: Rafael Antognolli 


Thanks. I've added the rb's and tb's locally.


So IMO, Rafael is a great person to review this. He's underselling his knowledge
to his detriment. Is there something preventing pushing these patches?

[Mesa-dev] [PATCH V2 08/37] glsl: add initial implementation of shader cache

2017-01-26 Thread Timothy Arceri

From: Timothy Arceri 

This uses disk_cache.c to write out a serialization of various
state that's required in order to successfully load and use a
binary written out by a driver's backend. This state is referred to as
"metadata" throughout the implementation.

This initial version is intended to work with all stages besides
compute.

This patch is based on the initial work done by Carl.

V2: extend the file's doxygen comment to cover some of the
design decisions.
---
 src/compiler/Makefile.glsl.am  |   3 +-
 src/compiler/Makefile.sources  |   4 +
 src/compiler/glsl/shader_cache.cpp | 579 +
 src/compiler/glsl/shader_cache.h   |  38 +++
 4 files changed, 623 insertions(+), 1 deletion(-)
 create mode 100644 src/compiler/glsl/shader_cache.cpp
 create mode 100644 src/compiler/glsl/shader_cache.h

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index f673196..41edb3c 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -131,7 +131,8 @@ glsl_libglsl_la_LIBADD = \
 
 glsl_libglsl_la_SOURCES =  \
$(LIBGLSL_GENERATED_FILES)  \
-   $(LIBGLSL_FILES)
+   $(LIBGLSL_FILES)\
+   $(LIBGLSL_SHADER_CACHE_FILES)
 
 glsl_libstandalone_la_SOURCES = \
$(GLSL_COMPILER_CXX_FILES)
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index a8bb4d3..1e8edc0 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -142,6 +142,10 @@ LIBGLSL_FILES = \
glsl/s_expression.cpp \
glsl/s_expression.h
 
+LIBGLSL_SHADER_CACHE_FILES = \
+   glsl/shader_cache.cpp \
+   glsl/shader_cache.h
+
 # glsl_compiler
 
 GLSL_COMPILER_CXX_FILES = \
diff --git a/src/compiler/glsl/shader_cache.cpp 
b/src/compiler/glsl/shader_cache.cpp
new file mode 100644
index 0000000..87f38a4
--- /dev/null
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -0,0 +1,579 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file shader_cache.cpp
+ *
+ * GLSL shader cache implementation
+ *
+ * This uses disk_cache.c to write out a serialization of various
+ * state that's required in order to successfully load and use a
+ * binary written out by a driver's backend. This state is referred to as
+ * "metadata" throughout the implementation.
+ *
+ * The hash key for glsl metadata is a hash of the hashes of each GLSL
+ * source string as well as some API settings that change the final shader
+ * such as SSO, attribute binding, frag data bindings, etc.
+ *
+ * In order to avoid caching any actual IR we use the put_key/get_key support
+ * in the disk_cache to put the SHA-1 hash for each successfully compiled
+ * shader into the cache, and optimistically return early from glCompileShader
+ * (if the identical shader had been successfully compiled in the past),
+ * in the hope that the final linked shader will be found in the cache.
+ * If anything goes wrong (shader variant not found, backend cache item is
+ * corrupt, etc) we will use a fallback path to compile and link the IR.
+ */
+
+#include "blob.h"
+#include "compiler/shader_info.h"
+#include "glsl_symbol_table.h"
+#include "glsl_parser_extras.h"
+#include "ir.h"
+#include "ir_optimization.h"
+#include "ir_rvalue_visitor.h"
+#include "ir_uniform.h"
+#include "linker.h"
+#include "link_varyings.h"
+#include "main/core.h"
+#include "nir.h"
+#include "program.h"
+#include "util/disk_cache.h"
+#include "util/mesa-sha1.h"
+#include "util/string_to_uint_map.h"
+
+extern "C" {
+#include "main/enums.h"
+#include "main/shaderobj.h"
+#include "program/program.h"
+}
+
+static void
+compile_shaders(struct gl_context *ctx, struct gl_shader_program *prog) {
+   for (unsigned i = 0; i < prog->NumShaders; i++) {
+
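
The "hash of the hashes" key described in the file comment of this patch can be sketched as follows. This is an illustration only: FNV-1a is used as a small stand-in for the SHA-1 the patch relies on, the API settings are collapsed into one integer, and all function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HASH_INIT 0xcbf29ce484222325ULL   /* FNV-1a 64-bit offset basis */

/* FNV-1a over a byte range; a stand-in for SHA-1, not a replacement for it. */
static uint64_t
sketch_hash_bytes(uint64_t h, const void *data, size_t len)
{
   const unsigned char *p = data;

   for (size_t i = 0; i < len; i++) {
      h ^= p[i];
      h *= 0x100000001b3ULL;   /* FNV-1a 64-bit prime */
   }
   return h;
}

/* Feed a 64-bit value byte by byte so the result does not depend on endianness. */
static uint64_t
sketch_hash_u64(uint64_t h, uint64_t v)
{
   unsigned char bytes[8];

   for (int i = 0; i < 8; i++)
      bytes[i] = (unsigned char)(v >> (8 * i));
   return sketch_hash_bytes(h, bytes, sizeof(bytes));
}

/* Key = hash over each shader's source hash, followed by the API settings. */
static uint64_t
sketch_program_key(const char *const *sources, size_t count, uint64_t api_settings)
{
   uint64_t key = SKETCH_HASH_INIT;

   for (size_t i = 0; i < count; i++) {
      uint64_t shader_hash =
         sketch_hash_bytes(SKETCH_HASH_INIT, sources[i], strlen(sources[i]));
      key = sketch_hash_u64(key, shader_hash);
   }
   return sketch_hash_u64(key, api_settings);
}
```

Because the settings are mixed into the key, the same set of sources linked with a different setting maps to a different cache entry.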

Re: [Mesa-dev] [PATCH] gallium/radeon: add a new HUD query for the number of mapped buffers

2017-01-26 Thread Michel Dänzer

On 26/01/17 08:07 PM, Christian König wrote:
> On 26.01.2017 at 12:01, Samuel Pitoiset wrote:
>> On 01/26/2017 03:45 AM, Michel Dänzer wrote:
>>> On 25/01/17 11:19 PM, Samuel Pitoiset wrote:
>>>
>>>> I would like to approach the problem by reducing the amount of VRAM
>>>> needed by userspace in order to prevent TTM from moving a lot of data...
>>>
>>> One thing that might help there is not trying to put any buffers in VRAM
>>> which will (likely) be accessed by the CPU and which are larger than say
>>> 1/4 the size of CPU visible VRAM. And maybe also keeping track of the
>>> total size of such buffers we're trying to put in VRAM, and stop when it
>>> exceeds say 3/4.
>>
>> That could be a solution yes. But maybe, we should also try to reduce
>> the number of mapped VRAM (for buffers mapped only once).
> 
> For buffers mapped only once I suggest just using a bounce buffer in
> GART.
> 
> BTW: What kind of allocations are we talking about here? From the
> application or driver internal allocations (e.g. shader code)?

That's a good point about shader code, actually — we do upload the
shader machine code by writing with the CPU directly to VRAM. This could
account for at least some of the large number of mappings Sam is seeing.
It should be relatively easy to unmap these buffers immediately after
the code is written (in si_shader_binary_upload).

Longer term, maybe we should consider doing this without writing to VRAM
with the CPU.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Peter Frühberger

Hi Christian,

2017-01-26 12:00 GMT+01:00 Christian König :

> Hi Peter,
>
> On 25.01.2017 at 19:45, Peter Frühberger wrote:
>
>>
>>
>> Peter, Rainer any idea what I'm missing here? Do you guys use some
>> modified ffmpeg for Kodi or how does that work for you?
>>
>>
>> do you set the format correctly, e.g.: https://github.com/FernetMenta
>> /kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/
>> Video/VAAPI.cpp#L2697 to create the surfaces?
>>
>
> Well the problem here is that the VA-API interface is not consistent and
> I'm not sure how to implement it correctly.
>
> See your code for example:
>
>> VASurfaceAttrib attribs[1], *attrib;
>>
>> attrib = attribs;
>>
>> attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;
>>
>> attrib->type = VASurfaceAttribPixelFormat;
>>
>> attrib->value.type = VAGenericValueTypeInteger;
>>
>> attrib->value.value.i = VA_FOURCC_NV12;
>>
>>
>>
> First, Kodi specifies that NV12 should be used, which implies that this is an
> 8-bit surface.
>
> // create surfaces
>>
>> VASurfaceID surfaces[32];
>>
>> unsigned int format = VA_RT_FORMAT_YUV420;
>>
>> if (m_config.profile == VAProfileHEVCMain10)
>>
>> format = VA_RT_FORMAT_YUV420_10BPP;
>>
> But then Kodi requests a 10bit surface. Now what is the correct thing to
> do here?
>
> I can either create an NV12 surface, which would be 8bit but would result
> in either an error message or only 8bit dithering during decode.
>
> Or I can promote the surface to 10bit, which would result in a P010 or
> rather P016 format.
>
> Or, and that is actually what I think would be best, the VA-API driver
> should throw an error indicating that the application requested something
> impossible.


Yes, you are right. It looks like driver-specific behavior:
https://cgit.freedesktop.org/vaapi/intel-driver/tree/src/i965_drv_video.c#n1338

It seems they use it as a hint for the subsampling (SUBSAMPLE_YUV420) and then
later compare it with the format again to choose.

From a code point of view, we should set the attribute to VA_FOURCC_P010, right?

Regards
Peter
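
The "throw an error" option from the quoted discussion could be sketched as a consistency check between the requested render-target format and the pixel-format attribute. The enums below are local stand-ins, not the VA-API constants, and sketch_surface_request_valid() is a hypothetical helper; it only encodes the rule that an 8-bit request must come with an 8-bit fourcc and a 10-bit request with a fourcc that has room for 10 bits.

```c
#include <stdbool.h>

/* Local stand-ins for VA_RT_FORMAT_YUV420 and VA_RT_FORMAT_YUV420_10BPP. */
enum sketch_rt_format { SKETCH_RT_YUV420_8BPP, SKETCH_RT_YUV420_10BPP };

/* Local stand-ins for the NV12, P010 and P016 fourcc codes. */
enum sketch_fourcc { SKETCH_FOURCC_NV12, SKETCH_FOURCC_P010, SKETCH_FOURCC_P016 };

/* Significant bits per sample for each stand-in fourcc. */
static int
sketch_fourcc_bits(enum sketch_fourcc fourcc)
{
   switch (fourcc) {
   case SKETCH_FOURCC_NV12: return 8;
   case SKETCH_FOURCC_P010: return 10;
   case SKETCH_FOURCC_P016: return 16;
   }
   return 0;
}

/* Reject requests whose fourcc cannot hold the requested bit depth. */
static bool
sketch_surface_request_valid(enum sketch_rt_format rt, enum sketch_fourcc fourcc)
{
   int bits = sketch_fourcc_bits(fourcc);

   if (rt == SKETCH_RT_YUV420_10BPP)
      return bits >= 10;
   return bits == 8;
}
```

Under this rule the combination discussed in this thread (a 10-bit render target with an NV12 attribute) would be rejected, while 10-bit with P010 would be accepted.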




>
>
>> afterwards we just do drm / egl interop, via:
>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/
>> cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374
>>
>
> I'm not sure if that will ever work correctly. The problem is that VA-API
> leaks to the application what the data layout in the surface is. As soon as
> we turn on tiling that will only work with rather crude hacks.
>
> I will try to get it working, but probably need help from you guys as well.
>
> Regards,
> Christian.
>
> You need ffmpeg 3.2.
>>
>> If you use vaPutSurface it will end up as RGBA32 or something, which is
>> why we use the above way.
>>
>> Best regards
>> Peter
>>
>>
>> Cheers,
>> Christian.
>>
>>
>>
>>
>>
>> --
>>Key-ID: 0x1A995A9B
>>keyserver: pgp.mit.edu 
>> ==
>> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>>
>
>
>


-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Peter Frühberger

2017-01-26 16:36 GMT+01:00 Christian König :

> Am 26.01.2017 um 12:16 schrieb Peter Frühberger:
>
> Hi Christian,
>
> 2017-01-26 12:00 GMT+01:00 Christian König :
>
>> Hi Peter,
>>
>> Am 25.01.2017 um 19:45 schrieb Peter Frühberger:
>>
>>>
>>>
>>> Peter, Rainer any idea what I'm missing here? Do you guys use some
>>> modified ffmpeg for Kodi or how does that work for you?
>>>
>>>
>>> do you set the format correctly, e.g.:
>>> 
>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L2697
>>> to create the surfaces?
>>>
>>
>> Well the problem here is that the VA-API interface is not consistent and
>> I'm not sure how to implement it correctly.
>>
>> See your code for example:
>>
>>> VASurfaceAttrib attribs[1], *attrib;
>>>
>>> attrib = attribs;
>>>
>>> attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;
>>>
>>> attrib->type = VASurfaceAttribPixelFormat;
>>>
>>> attrib->value.type = VAGenericValueTypeInteger;
>>>
>>> attrib->value.value.i = VA_FOURCC_NV12;
>>>
>>>
>>>
>> First Kodi specifies that NV12 should be used which implies that this is
>> a 8bit surface.
>>
>> // create surfaces
>>>
>>> VASurfaceID surfaces[32];
>>>
>>> unsigned int format = VA_RT_FORMAT_YUV420;
>>>
>>> if (m_config.profile == VAProfileHEVCMain10)
>>>
>>> format = VA_RT_FORMAT_YUV420_10BPP;
>>>
>> But then Kodi requests a 10bit surface. Now what is the correct thing to
>> do here?
>>
>> I can either create an NV12 surface, which would be 8bit but would result
>> in either an error message or only 8bit dithering during decode.
>>
>> Or I can promote the surface to 10bit, which would result in a P010 or
>> rather P016 format.
>>
>> Or, and that is actually what I think would be best, the VA-API driver
>> should throw an error indicating that the application requested something
>> impossible.
>
>
> Yes, you are right. This looks driver-specific:
>
> https://cgit.freedesktop.org/vaapi/intel-driver/tree/src/i965_drv_video.c#n1338
>
> It seems they use it as a hint for the subsampling (SUBSAMPLE_YUV420) and
> then later compare it with the format again to choose.
>
> From a code point of view, we should set the attribute to VA_FOURCC_P010, right?
>
>
> Yes, I think so.
>
> Christian.
>

Fixed via:
https://github.com/FernetMenta/kodi-agile/commit/bb73b5535e2f4b65772451c23f75503d04de69ef
Thanks for the heads-up.

Peter
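
For readers following the thread, here is a minimal sketch of the rule agreed on above: derive both the render-target format and the pixel-format attribute from the profile, so the two can never disagree. All `sketch_*` / `SKETCH_*` names are hypothetical stand-ins, not the libva definitions; real code would use the constants from the libva headers. Only the FOURCC character codes (NV12, P010) come from the discussion.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the libva profile and RT-format constants. */
enum sketch_profile { SKETCH_PROFILE_HEVC_MAIN, SKETCH_PROFILE_HEVC_MAIN10 };
enum sketch_rt_format { SKETCH_RT_YUV420, SKETCH_RT_YUV420_10BPP };

/* Pack four characters into a little-endian FOURCC code. */
#define SKETCH_FOURCC(a, b, c, d)                                  \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | \
    ((uint32_t)(d) << 24))

#define SKETCH_FOURCC_NV12 SKETCH_FOURCC('N', 'V', '1', '2')
#define SKETCH_FOURCC_P010 SKETCH_FOURCC('P', '0', '1', '0')

struct sketch_surface_request {
   enum sketch_rt_format rt_format; /* what the surface creation call gets */
   uint32_t fourcc;                 /* what the pixel-format attribute gets */
};

/* Pick a consistent (RT format, FOURCC) pair from the decode profile. */
static struct sketch_surface_request
sketch_pick_surface(enum sketch_profile profile)
{
   struct sketch_surface_request req;

   if (profile == SKETCH_PROFILE_HEVC_MAIN10) {
      req.rt_format = SKETCH_RT_YUV420_10BPP;
      req.fourcc = SKETCH_FOURCC_P010;
   } else {
      req.rt_format = SKETCH_RT_YUV420;
      req.fourcc = SKETCH_FOURCC_NV12;
   }
   return req;
}
```

Keeping both choices in one function is just one way to avoid the NV12 plus 10-bit mismatch; the actual Kodi fix is the commit linked above.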


>
>
>
> Regards
> Peter
>
>
>
>
>>
>>
>>> afterwards we just do drm / egl interop, via:
>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374
>>>
>>
>> I'm not sure if that will ever work correctly. The problem is that VA-API
>> leaks to the application what the data layout in the surface is. As soon as
>> we turn on tiling that will only work with rather crude hacks.
>>
>> I will try to get it working, but probably need help from you guys as
>> well.
>>
>> Regards,
>> Christian.
>>
>> You need ffmpeg 3.2.
>>>
>>> If you use vaPutSurface it will end up as RGBA32 or something, which is
>>> why we use the above way.
>>>
>>> Best regards
>>> Peter
>>>
>>>
>>> Cheers,
>>> Christian.
>>>
>>>
>>>
>>>
>>>
>>> --
>>>Key-ID: 0x1A995A9B
>>>keyserver: pgp.mit.edu < 
>>> http://pgp.mit.edu>
>>> ==
>>> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>>>
>>
>>
>>
>
>
> --
>Key-ID: 0x1A995A9B
>keyserver: pgp.mit.edu
> ==
> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>
>
>


-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Peter Frühberger

Hi Christian,

2017-01-26 19:34 GMT+01:00 Christian König :

> Am 26.01.2017 um 16:59 schrieb Peter Frühberger:
>
>
>
> 2017-01-26 16:36 GMT+01:00 Christian König :
>
>> Am 26.01.2017 um 12:16 schrieb Peter Frühberger:
>>
>> Hi Christian,
>>
>> 2017-01-26 12:00 GMT+01:00 Christian König < 
>> deathsim...@vodafone.de>:
>>
>>> Hi Peter,
>>>
>>> Am 25.01.2017 um 19:45 schrieb Peter Frühberger:
>>>


>>>> Peter, Rainer any idea what I'm missing here? Do you guys use some
>>>> modified ffmpeg for Kodi or how does that work for you?
>>>>
>>>>
>>>> do you set the format correctly, e.g.:
>>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L2697
>>>> to create the surfaces?

>>>
>>> Well the problem here is that the VA-API interface is not consistent and
>>> I'm not sure how to implement it correctly.
>>>
>>> See your code for example:
>>>
>>>> VASurfaceAttrib attribs[1], *attrib;
>>>>
>>>> attrib = attribs;
>>>>
>>>> attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;
>>>>
>>>> attrib->type = VASurfaceAttribPixelFormat;
>>>>
>>>> attrib->value.type = VAGenericValueTypeInteger;
>>>>
>>>> attrib->value.value.i = VA_FOURCC_NV12;



>>> First Kodi specifies that NV12 should be used which implies that this is
>>> a 8bit surface.
>>>
>>> // create surfaces

>>>> VASurfaceID surfaces[32];
>>>>
>>>> unsigned int format = VA_RT_FORMAT_YUV420;
>>>>
>>>> if (m_config.profile == VAProfileHEVCMain10)
>>>>
>>>> format = VA_RT_FORMAT_YUV420_10BPP;

>>> But then Kodi requests a 10bit surface. Now what is the correct thing to
>>> do here?
>>>
>>> I can either create an NV12 surface, which would be 8bit but would
>>> result in either an error message or only 8bit dithering during decode.
>>>
>>> Or I can promote the surface to 10bit, which would result in a P010 or
>>> rather P016 format.
>>>
>>> Or, and that is actually what I think would be best, the VA-API driver
>>> should throw an error indicating that the application requested something
>>> impossible.
>>
>>
>> Yes, you are right. This looks driver-specific:
>> https://cgit.freedesktop.org/vaapi/intel-driver/tree/src/i965_drv_video.c#n1338
>>
>> It seems they use it as a hint for the subsampling (SUBSAMPLE_YUV420) and
>> then later compare it with the format again to choose.
>>
>> From a code point of view, we should set the attribute to VA_FOURCC_P010, right?
>>
>>
>> Yes, I think so.
>>
>> Christian.
>>
>
> Fixed via:
> https://github.com/FernetMenta/kodi-agile/commit/bb73b5535e2f4b65772451c23f75503d04de69ef
> thanks for the heads up
>
>
> Great! Thanks a lot for cleaning that up so quickly.
>
> In that case I will just respond to such nonsense combinations with an
> error result.
>
> Christian.
>

Just tell us when we can remove:
https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L534

:-)

Best regards
Peter
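
As an illustration of the driver-side behaviour Christian describes (answering an inconsistent request with an error), here is a minimal sketch. It assumes a simplified model in which each FOURCC has a single bit depth; the `drv_*` names and status codes are hypothetical and are not the VA-API or Mesa state tracker interfaces.

```c
#include <stdint.h>

/* Pack four characters into a little-endian FOURCC code. */
#define DRV_FOURCC(a, b, c, d)                                     \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | \
    ((uint32_t)(d) << 24))

/* Hypothetical stand-ins for the render-target formats and status codes. */
enum drv_rt_format { DRV_RT_YUV420_8BIT, DRV_RT_YUV420_10BIT };
enum drv_status { DRV_STATUS_OK, DRV_STATUS_UNSUPPORTED_COMBINATION };

/* Bit depth implied by a pixel format, or 0 if the format is unknown. */
static int drv_fourcc_bit_depth(uint32_t fourcc)
{
   if (fourcc == DRV_FOURCC('N', 'V', '1', '2'))
      return 8;
   if (fourcc == DRV_FOURCC('P', '0', '1', '0'))
      return 10;
   return 0;
}

/* Reject requests whose pixel format contradicts the render-target format,
 * instead of silently picking one of the two. */
static enum drv_status
drv_check_surface_request(enum drv_rt_format rt, uint32_t fourcc)
{
   const int wanted = (rt == DRV_RT_YUV420_10BIT) ? 10 : 8;

   return drv_fourcc_bit_depth(fourcc) == wanted
             ? DRV_STATUS_OK
             : DRV_STATUS_UNSUPPORTED_COMBINATION;
}
```

With this check, the original Kodi request (NV12 attribute with a 10-bit RT format) would be refused, while the fixed request (P010 with 10-bit) would pass.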


>
>
> Peter
>
>
>>
>>
>>
>> Regards
>> Peter
>>
>>
>>
>>
>>>
>>>
>>>> afterwards we just do drm / egl interop, via:
>>>> https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374

>>>
>>> I'm not sure if that will ever work correctly. The problem is that
>>> VA-API leaks to the application what the data layout in the surface is. As
>>> soon as we turn on tiling that will only work with rather crude hacks.
>>>
>>> I will try to get it working, but probably need help from you guys as
>>> well.
>>>
>>> Regards,
>>> Christian.
>>>
>>> You need ffmpeg 3.2.

>>>> If you use vaPutSurface it will end up as RGBA32 or something, which is
>>>> why we use the above way.
>>>>
>>>> Best regards
>>>> Peter
>>>>
>>>>
>>>> Cheers,
>>>> Christian.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>Key-ID: 0x1A995A9B
>>>>keyserver: pgp.mit.edu
>>>> ==
>>>> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B

>>>
>>>
>>>
>>
>>
>> --
>>Key-ID: 0x1A995A9B
>>keyserver: pgp.mit.edu
>> ==
>> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>>
>>
>>
>
>
> --
>Key-ID: 0x1A995A9B
>keyserver: pgp.mit.edu
> ==
> Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B
>
>
>


-- 
   Key-ID: 0x1A995A9B
   keyserver: pgp.mit.edu
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157  C81B DA07 CF63 1A99 5A9B

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Andy Furniss


Andy Furniss wrote:
> Andy Furniss wrote:
>> Christian König wrote:
>>> Hi guys,
>>>
>>> ok this is completely work in progress and untested except for a
>>> compile run.
>>>
>>> Most of the stuff necessary should be there for VDPAU, but I'm
>>> honestly not sure how to approach VAAPI.
>>
>> These regress R9 285 8bit h264 vaapi decode with mpv,
>>
>> [vd] Pixel formats supported by decoder: vdpau vaapi_vld yuv420p
>> [vd] Codec profile: High (0x64)
>> [vaapi] Using profile 'VAProfileH264High'.
>> [vaapi] vaCreateConfig(): the requested RT Format is not supported
>> [ffmpeg/video] h264: Reinit context to 3840x2160, pix_fmt: yuv420p
>> [vd] Falling back to software decoding.
>
> Seems this may be an mpv issue caused by patch 7
>
> ffmpeg is not regressed by this.


Current mpv git is working OK; I was testing with a build that was a few days old.



Re: [Mesa-dev] [Intel-gfx] [PATCH] i965: Share the workaround bo between all contexts

2017-01-26 Thread Chad Versace

On Thu 26 Jan 2017, Chad Versace wrote:
> On Thu 26 Jan 2017, Chris Wilson wrote:
> > Since the workaround bo is used strictly as a write-only buffer, we need
> > only allocate one per screen and use the same one from all contexts.
> > 
> > (The caveat here is during extension initialisation, where we write into
> > and read back register values from the buffer, but that is performed only
> > once for the first context - and barring synchronisation issues should not
> > be a problem. Safer would be to move that also to the screen.)
> > 
> > v2: Give the workaround bo its own init function and don't piggy back
> > intel_bufmgr_init() since it is not that related.
> > 
> > v3: Drop the reference count of the workaround bo for the context since
> > the context itself is owned by the screen (and so we can rely on the bo
> > existing for the lifetime of the context).
> 
> I like this idea, but I have questions and comments about the details.
> More questions than comments, really.
> 
> Today, with only Mesa changes, could we effectively do the same as
>   drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo);
> by hacking Mesa to set no read/write domain when emitting relocs for the
> workaround_bo? (I admit I don't fully understand the kernel's domain
> tracking). If that does work, then it just would require a small hack to
> brw_emit_pipe_control_write().
> 
> > Signed-off-by: Chris Wilson 
> > Cc: Kenneth Graunke 
> > Cc: Martin Peres 
> > Cc: Chad Versace 
> > Cc: Daniel Vetter 

> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
> > b/src/mesa/drivers/dri/i965/intel_screen.c

> > +   /* We want to use this bo from any and all contexts, without undue
> > +* writing ordering between them. To prevent the kernel enforcing
> > +* the order due to writes from different contexts, we disable
> > +* the use of (the kernel's) implicit sync on this bo.
> > +*/
> > +   drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo);

> > +#ifndef HAVE_DRM_INTEL_GEM_BO_DISABLE_IMPLICIT_SYNC
> > +#define drm_intel_gem_bo_disable_implicit_sync(BO) do { } while (0)
> > +#endif

Until Mesa can actually disable the implicit sync, I think this patch
should be postponed. If it landed now, it could cause additional
unnecessary stalls between contexts. Chrome OS uses many contexts in
the same process, so if problems exist, they'll show up on CrOS. Perhaps
the extra stalls will be imperceptible, but I don't want to take the
risk.
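
To make the concern concrete, here is a minimal sketch of a fallback that reports whether implicit sync was really disabled, so the caller can decide whether sharing one bo is safe. This is not what the quoted patch does (its fallback macro is a silent no-op); `sketch_bo`, `SKETCH_HAVE_DISABLE_IMPLICIT_SYNC` and both functions are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object; only the flag relevant here is modelled. */
struct sketch_bo {
   bool implicit_sync;
};

#ifdef SKETCH_HAVE_DISABLE_IMPLICIT_SYNC
/* The library supports the call: turn implicit sync off and say so. */
static bool sketch_bo_disable_implicit_sync(struct sketch_bo *bo)
{
   bo->implicit_sync = false;
   return true;
}
#else
/* Fallback: leave the bo untouched, but report that nothing happened
 * rather than silently compiling to a no-op. */
static bool sketch_bo_disable_implicit_sync(struct sketch_bo *bo)
{
   (void)bo;
   return false;
}
#endif

/* Share the workaround bo across contexts only if the kernel will not
 * serialise the contexts' writes to it. */
static bool sketch_can_share_workaround_bo(struct sketch_bo *bo)
{
   return bo != NULL && sketch_bo_disable_implicit_sync(bo);
}
```

When the feature macro is undefined, `sketch_can_share_workaround_bo()` returns false and a driver could keep per-context buffers, which avoids the extra stalls described above.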

[Mesa-dev] [PATCH] winsys/amdgpu: add a fast exit path into amdgpu_cs_add_buffer

2017-01-26 Thread Marek Olšák

From: Marek Olšák 

The time spent in the function dropped by 37% for torcs.
---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 16 
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.h |  5 +
 2 files changed, 21 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
index 0bc4ce9..2a1b932 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
@@ -437,40 +437,54 @@ static unsigned amdgpu_cs_add_buffer(struct radeon_winsys_cs *rcs,
 {
/* Don't use the "domains" parameter. Amdgpu doesn't support changing
 * the buffer placement during command submission.
 */
struct amdgpu_cs *acs = amdgpu_cs(rcs);
struct amdgpu_cs_context *cs = acs->csc;
struct amdgpu_winsys_bo *bo = (struct amdgpu_winsys_bo*)buf;
struct amdgpu_cs_buffer *buffer;
int index;
 
+   /* Fast exit for no-op calls.
+* This is very effective with suballocators and linear uploaders that
+* are outside of the winsys.
+*/
+   if (bo == cs->last_added_bo &&
+   (usage & cs->last_added_bo_usage) == usage &&
+   (1ull << priority) & cs->last_added_bo_priority_usage)
+  return cs->last_added_bo_index;
+
if (!bo->bo) {
   index = amdgpu_lookup_or_add_slab_buffer(acs, bo);
   if (index < 0)
  return 0;
 
   buffer = &cs->slab_buffers[index];
   buffer->usage |= usage;
 
   usage &= ~RADEON_USAGE_SYNCHRONIZED;
   index = buffer->u.slab.real_idx;
} else {
   index = amdgpu_lookup_or_add_real_buffer(acs, bo);
   if (index < 0)
  return 0;
}
 
buffer = &cs->real_buffers[index];
buffer->u.real.priority_usage |= 1llu << priority;
buffer->usage |= usage;
cs->flags[index] = MAX2(cs->flags[index], priority / 4);
+
+   cs->last_added_bo = bo;
+   cs->last_added_bo_index = index;
+   cs->last_added_bo_usage = buffer->usage;
+   cs->last_added_bo_priority_usage = buffer->u.real.priority_usage;
return index;
 }
 
 static bool amdgpu_ib_new_buffer(struct amdgpu_winsys *ws, struct amdgpu_ib *ib)
 {
struct pb_buffer *pb;
uint8_t *mapped;
unsigned buffer_size;
 
/* Always create a buffer that is at least as large as the maximum seen IB
@@ -638,20 +652,21 @@ static bool amdgpu_init_cs_context(struct amdgpu_cs_context *cs,
 
default:
case RING_GFX:
   cs->request.ip_type = AMDGPU_HW_IP_GFX;
   break;
}
 
for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
   cs->buffer_indices_hashlist[i] = -1;
}
+   cs->last_added_bo = NULL;
 
cs->request.number_of_ibs = 1;
cs->request.ibs = &cs->ib[IB_MAIN];
 
cs->ib[IB_CONST].flags = AMDGPU_IB_FLAG_CE;
cs->ib[IB_CONST_PREAMBLE].flags = AMDGPU_IB_FLAG_CE |
  AMDGPU_IB_FLAG_PREAMBLE;
 
return true;
 }
@@ -669,20 +684,21 @@ static void amdgpu_cs_context_cleanup(struct amdgpu_cs_context *cs)
   amdgpu_winsys_bo_reference(&cs->slab_buffers[i].bo, NULL);
}
 
cs->num_real_buffers = 0;
cs->num_slab_buffers = 0;
amdgpu_fence_reference(&cs->fence, NULL);
 
for (i = 0; i < ARRAY_SIZE(cs->buffer_indices_hashlist); i++) {
   cs->buffer_indices_hashlist[i] = -1;
}
+   cs->last_added_bo = NULL;
 }
 
 static void amdgpu_destroy_cs_context(struct amdgpu_cs_context *cs)
 {
amdgpu_cs_context_cleanup(cs);
FREE(cs->flags);
FREE(cs->real_buffers);
FREE(cs->handles);
FREE(cs->slab_buffers);
FREE(cs->request.dependencies);
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.h b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.h
index 90b9e83..495d55b 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.h
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_cs.h
@@ -87,20 +87,25 @@ struct amdgpu_cs_context {
amdgpu_bo_handle *handles;
uint8_t *flags;
struct amdgpu_cs_buffer *real_buffers;
 
unsigned num_slab_buffers;
unsigned max_slab_buffers;
struct amdgpu_cs_buffer *slab_buffers;
 
int buffer_indices_hashlist[4096];
 
+   struct amdgpu_winsys_bo *last_added_bo;
+   unsigned last_added_bo_index;
+   unsigned last_added_bo_usage;
+   uint64_t last_added_bo_priority_usage;
+
unsigned max_dependencies;
 
struct pipe_fence_handle*fence;
 
/* the error returned from cs_flush for non-async submissions */
int error_code;
 };
 
 struct amdgpu_cs {
struct amdgpu_ib main; /* must be first because this is inherited */
-- 
2.7.4
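
For readers who want to try the idea outside the winsys, here is a minimal, self-contained sketch of the "last added buffer" fast path from the patch above. The `sketch_*` types, the fixed-size array and the linear search are assumptions standing in for the real buffer lists and hash lookup; `slow_lookups` exists only so the effect can be observed. It assumes `priority < 64` and non-NULL buffer pointers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BUFFERS 16

struct sketch_buffer {
   const void *bo;
   unsigned usage;
   uint64_t priority_usage;
};

struct sketch_cs {
   struct sketch_buffer buffers[SKETCH_MAX_BUFFERS];
   unsigned num_buffers;
   unsigned slow_lookups; /* how often the slow path ran */

   /* Cache describing the most recent add call. */
   const void *last_bo;
   unsigned last_index;
   unsigned last_usage;
   uint64_t last_priority_usage;
};

/* Slow path stand-in: linear search, append if missing, -1 if full. */
static int sketch_lookup_or_add(struct sketch_cs *cs, const void *bo)
{
   for (unsigned i = 0; i < cs->num_buffers; i++) {
      if (cs->buffers[i].bo == bo)
         return (int)i;
   }
   if (cs->num_buffers == SKETCH_MAX_BUFFERS)
      return -1;

   cs->buffers[cs->num_buffers].bo = bo;
   cs->buffers[cs->num_buffers].usage = 0;
   cs->buffers[cs->num_buffers].priority_usage = 0;
   return (int)cs->num_buffers++;
}

static unsigned sketch_add_buffer(struct sketch_cs *cs, const void *bo,
                                  unsigned usage, unsigned priority)
{
   const uint64_t priority_bit = UINT64_C(1) << priority;

   /* Fast exit: same buffer as last time, and the requested usage and
    * priority are already recorded, so the call would change nothing. */
   if (bo == cs->last_bo &&
       (usage & cs->last_usage) == usage &&
       (priority_bit & cs->last_priority_usage))
      return cs->last_index;

   cs->slow_lookups++;
   int index = sketch_lookup_or_add(cs, bo);
   if (index < 0)
      return 0;

   struct sketch_buffer *buffer = &cs->buffers[index];
   buffer->usage |= usage;
   buffer->priority_usage |= priority_bit;

   /* Remember this call so an identical follow-up becomes a no-op. */
   cs->last_bo = bo;
   cs->last_index = (unsigned)index;
   cs->last_usage = buffer->usage;
   cs->last_priority_usage = buffer->priority_usage;
   return (unsigned)index;
}
```

As in the patch, the cache must be reset (here: `last_bo = NULL`) whenever the buffer list is cleared, otherwise a stale index could be returned.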


[Mesa-dev] [RFC PATCH] gallium: add a common uploader to pipe_context

2017-01-26 Thread Marek Olšák

From: Marek Olšák 

This lowers memory usage and makes updates of the buffer residency list
more efficient (e.g. if drivers keep seeing the same buffer for many
consecutive "add" calls, the calls can trivially be turned into no-ops).
---
 src/gallium/include/pipe/p_context.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
index 45098c9..5876968 100644
--- a/src/gallium/include/pipe/p_context.h
+++ b/src/gallium/include/pipe/p_context.h
@@ -69,33 +69,40 @@ struct pipe_stream_output_target;
 struct pipe_surface;
 struct pipe_transfer;
 struct pipe_vertex_buffer;
 struct pipe_vertex_element;
 struct pipe_video_buffer;
 struct pipe_video_codec;
 struct pipe_viewport_state;
 struct pipe_compute_state;
 union pipe_color_union;
 union pipe_query_result;
+struct u_upload_mgr;
 
 /**
  * Gallium rendering context.  Basically:
  *  - state setting functions
  *  - VBO drawing functions
  *  - surface functions
  */
 struct pipe_context {
struct pipe_screen *screen;
 
void *priv;  /**< context private data (for DRI for example) */
void *draw;  /**< private, for draw module (temporary?) */
 
+   /**
+* Stream uploader created by the driver. All drivers, state trackers, and
+* modules should use it.
+*/
+   struct u_upload_mgr *stream_uploader;
+
void (*destroy)( struct pipe_context * );
 
/**
 * VBO drawing
 */
/*@{*/
void (*draw_vbo)( struct pipe_context *pipe,
  const struct pipe_draw_info *info );
/*@}*/
 
-- 
2.7.4
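
A minimal sketch of why a shared stream uploader helps, assuming a simple bump allocator: when every module allocates from the same uploader, consecutive uploads land in the same backing buffer, so the driver sees the same buffer again and again. The `sketch_*` names are hypothetical and this is not the `u_upload_mgr` interface. The sketch assumes `alignment > 0` and `size <= buffer_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared uploader: one backing buffer at a time. */
struct sketch_uploader {
   size_t buffer_size; /* size of each backing buffer */
   size_t offset;      /* first free byte in the current buffer */
   unsigned buffer_id; /* bumped whenever a fresh buffer is started */
};

struct sketch_alloc {
   unsigned buffer_id; /* which backing buffer the data lives in */
   size_t offset;      /* aligned offset inside that buffer */
};

/* Reserve `size` bytes at the requested alignment, starting a new
 * backing buffer only when the current one is exhausted. */
static struct sketch_alloc
sketch_upload_alloc(struct sketch_uploader *up, size_t size, size_t alignment)
{
   assert(alignment > 0 && size <= up->buffer_size);

   size_t start = (up->offset + alignment - 1) / alignment * alignment;

   if (start + size > up->buffer_size) {
      up->buffer_id++;
      start = 0;
   }
   up->offset = start + size;

   return (struct sketch_alloc){ up->buffer_id, start };
}
```

In this model, three small uploads from different modules all return the same `buffer_id`; with one uploader per module they would each return a different buffer, and the residency list would have to track all of them.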


Re: [Mesa-dev] [Intel-gfx] [PATCH] i965: Share the workaround bo between all contexts

2017-01-26 Thread Chad Versace

On Thu 26 Jan 2017, Chris Wilson wrote:
> On Thu, Jan 26, 2017 at 09:39:51AM -0800, Chad Versace wrote:
> > On Thu 26 Jan 2017, Chris Wilson wrote:
> > > Since the workaround bo is used strictly as a write-only buffer, we need
> > > only allocate one per screen and use the same one from all contexts.
> > > 
> > > (The caveat here is during extension initialisation, where we write into
> > > and read back register values from the buffer, but that is performed only
> > > once for the first context - and barring synchronisation issues should not
> > > be a problem. Safer would be to move that also to the screen.)
> > > 
> > > v2: Give the workaround bo its own init function and don't piggy back
> > > intel_bufmgr_init() since it is not that related.
> > > 
> > > v3: Drop the reference count of the workaround bo for the context since
> > > the context itself is owned by the screen (and so we can rely on the bo
> > > existing for the lifetime of the context).
> > 
> > I like this idea, but I have questions and comments about the details.
> > More questions than comments, really.
> > 
> > Today, with only Mesa changes, could we effectively do the same as
> >   drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo);
> > by hacking Mesa to set no read/write domain when emitting relocs for the
> > workaround_bo? (I admit I don't fully understand the kernel's domain
> > tracking). If that does work, then it just would require a small hack to
> > brw_emit_pipe_control_write().
> 
> Yes, for anything that is totally scratch just not setting the write
> hazard is the same. For something like the seqno page, where we have
> multiple engines and do want the contents to be preserved, not setting the
> write hazard has the consequence that the page could be lost under memory
> pressure or across resume. (As usual there are some details that this part
> of the ABI had to be relaxed because userspace didn't have this flag.)
> But that doesn't sell many bananas.

Good. That's how I thought it worked.

Re: [Mesa-dev] [PATCH 1/3] [v3] drm: Add new DRM_IOCTL_MODE_GETPLANE2

2017-01-26 Thread Ben Widawsky


Sorry, ignore this. I sent it to the wrong list.

On 17-01-26 14:16:23, Ben Widawsky wrote:

Originally based off of a patch by Kristian.

This new ioctl extends DRM_IOCTL_MODE_GETPLANE, by returning information
about the modifiers that will work with each format.

It's modified from Kristian's patch in that the modifiers and formats
are set up by the driver, and then a callback is used to create the
format list. The LOC difference was large enough that I don't think it made
sense to leave his authorship, but the new UABI was primarily his idea.

Additionally, I hit a couple of drivers which Kristian missed updating.

It also contains a change requested by Daniel to make the modifiers
array a sentinel based structure instead of a sized one. Upon discussion
on IRC, it was determined that having an invalid modifier might make
sense in general as well.

v2:
 - Make formats uint32_t, and use an offset, see the comment in the
 patch. Add a WARN_ON and early bail for when there are more than 32
 formats. (Rob)
 - Remove DRM_DEBUG_KMS (Ville)
 - make flags come before count in struct (Ville)

v3:
 - Make formats 64b again to defer the pain, and add a pad
 - Make init fail if > 64 instead of at get_plane. This could be made
 more optimal by doing it in get_plane because 0 masked modifiers don't
 need to be reported back to userspace. As a result, the first driver
 to go beyond 64 formats has to deal with this.
 - Fix the comment to be more clear.
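
To illustrate two points from the changelog, here is a minimal sketch of walking a sentinel-terminated modifier array and of rejecting more than 64 formats at init time. All `sketch_*` / `SKETCH_*` names and the sentinel value are assumptions for illustration; they are not taken from the patch, whose uapi definitions are not shown in this excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "invalid modifier" value used as the array terminator. */
#define SKETCH_MOD_INVALID UINT64_C(0x00ffffffffffffff)

/* One bit per format in a 64-bit mask, so at most 64 formats fit. */
#define SKETCH_MAX_FORMATS 64

/* Count modifiers up to (not including) the sentinel. A NULL array,
 * which is what unconverted drivers pass, means "no modifiers". */
static size_t sketch_count_modifiers(const uint64_t *modifiers)
{
   size_t n = 0;

   if (modifiers == NULL)
      return 0;
   while (modifiers[n] != SKETCH_MOD_INVALID)
      n++;
   return n;
}

/* Fail early at plane init if the formats cannot be described by a
 * single 64-bit mask; returns 0 on success and -1 on failure. */
static int sketch_plane_init(size_t format_count, const uint64_t *modifiers,
                             size_t *modifier_count_out)
{
   if (format_count > SKETCH_MAX_FORMATS)
      return -1;

   *modifier_count_out = sketch_count_modifiers(modifiers);
   return 0;
}
```

A sentinel keeps the init signature short (no separate count argument), at the cost of reserving one modifier value that can never be advertised.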

Cc: Rob Clark 
Cc: Ville Syrjälä 
Cc: Daniel Stone 
Cc: "Kristian H. Kristensen" 
References: https://patchwork.kernel.org/patch/9482393/
Signed-off-by: Ben Widawsky 
---
drivers/gpu/drm/arc/arcpgu_crtc.c   |  1 +
drivers/gpu/drm/arm/hdlcd_crtc.c|  1 +
drivers/gpu/drm/arm/malidp_planes.c |  2 +-
drivers/gpu/drm/armada/armada_crtc.c|  1 +
drivers/gpu/drm/armada/armada_overlay.c |  1 +
drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c |  4 +-
drivers/gpu/drm/drm_ioctl.c |  2 +-
drivers/gpu/drm/drm_modeset_helper.c|  1 +
drivers/gpu/drm/drm_plane.c | 67 -
drivers/gpu/drm/drm_simple_kms_helper.c |  3 ++
drivers/gpu/drm/exynos/exynos_drm_plane.c   |  2 +-
drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c |  2 +-
drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c  |  1 +
drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c |  2 +-
drivers/gpu/drm/i915/intel_display.c|  7 ++-
drivers/gpu/drm/i915/intel_sprite.c |  4 +-
drivers/gpu/drm/imx/ipuv3-plane.c   |  4 +-
drivers/gpu/drm/mediatek/mtk_drm_plane.c|  2 +-
drivers/gpu/drm/meson/meson_plane.c |  1 +
drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c   |  2 +-
drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c   |  2 +-
drivers/gpu/drm/mxsfb/mxsfb_drv.c   |  2 +-
drivers/gpu/drm/nouveau/nv50_display.c  |  5 +-
drivers/gpu/drm/omapdrm/omap_plane.c|  3 +-
drivers/gpu/drm/rcar-du/rcar_du_plane.c |  4 +-
drivers/gpu/drm/rcar-du/rcar_du_vsp.c   |  5 +-
drivers/gpu/drm/rockchip/rockchip_drm_vop.c |  4 +-
drivers/gpu/drm/sti/sti_cursor.c|  1 +
drivers/gpu/drm/sti/sti_gdp.c   |  2 +-
drivers/gpu/drm/sti/sti_hqvdp.c |  2 +-
drivers/gpu/drm/sun4i/sun4i_layer.c |  1 +
drivers/gpu/drm/tegra/dc.c  | 12 ++---
drivers/gpu/drm/vc4/vc4_plane.c |  2 +-
drivers/gpu/drm/virtio/virtgpu_plane.c  |  2 +-
drivers/gpu/drm/zte/zx_plane.c  |  2 +-
include/drm/drm_plane.h | 21 +++-
include/drm/drm_simple_kms_helper.h |  1 +
include/uapi/drm/drm.h  |  1 +
include/uapi/drm/drm_fourcc.h   | 11 
include/uapi/drm/drm_mode.h | 44 
40 files changed, 199 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/arc/arcpgu_crtc.c b/drivers/gpu/drm/arc/arcpgu_crtc.c
index ad9a95916f1f..cd8a24c7c67d 100644
--- a/drivers/gpu/drm/arc/arcpgu_crtc.c
+++ b/drivers/gpu/drm/arc/arcpgu_crtc.c
@@ -218,6 +218,7 @@ static struct drm_plane *arc_pgu_plane_init(struct drm_device *drm)

ret = drm_universal_plane_init(drm, plane, 0xff, &arc_pgu_plane_funcs,
   formats, ARRAY_SIZE(formats),
+  NULL,
   DRM_PLANE_TYPE_PRIMARY, NULL);
if (ret)
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
index 20ebfb4fbdfa..89fded880807 100644
--- a/drivers/gpu/drm/arm/hdlcd_crtc.c
+++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
@@ -283,6 +283,7 @@ static struct drm_plane *hdlcd_plane_init(struct drm_device *drm)

ret =

[Mesa-dev] [PATCHv2 6/8] nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.

2017-01-26 Thread Francisco Jerez

See "glsl: Rewrite atan2 implementation to fix accuracy and handling
of zero/infinity." for the rationale, but note that the instruction
count benefit discussed there is somewhat less important for the SPIRV
implementation, because the current code already emitted no control
flow instructions -- Still this saves us one hardware instruction per
scalar component on Intel SKL hardware.

Fixes the following Vulkan CTS tests on Intel hardware:

dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4
dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2
dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4

Note that most of the test-cases above expect IEEE-compliant handling
of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so
except for the last two the test-cases above weren't expected to pass
yet.  The reason they do is that the i965 back-end implementation of
the NIR fmin and fmax instructions is not quite GLSL-compliant (it
complies with IEEE 754 recommendations though), because fmin/fmax of a
NaN and a non-NaN argument currently always return the non-NaN
argument, which causes atan() to flush NaN to one and return the
expected value.  The front-end should probably not be relying on this
behavior for correctness though because other back-ends are likely to
behave differently -- A follow-up patch will handle the atan2(±∞, ±∞)
corner cases explicitly.
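
The NaN behaviour described above can be reproduced in plain C. This is a minimal sketch, not the i965 or NIR implementation: `sketch_fmin_ieee` returns the non-NaN operand when exactly one input is NaN, while `sketch_fmin_naive` is a plain compare-and-select whose result depends on operand order. Both function names are hypothetical.

```c
/* NaN is the only value that compares unequal to itself. */
static int sketch_is_nan(float v)
{
   return v != v;
}

/* Minimum that prefers the non-NaN operand, in the spirit of the
 * behaviour the commit message attributes to the i965 back-end. */
static float sketch_fmin_ieee(float a, float b)
{
   if (sketch_is_nan(a))
      return b;
   if (sketch_is_nan(b))
      return a;
   return a < b ? a : b;
}

/* Plain compare-and-select: any comparison with NaN is false, so the
 * second operand wins whenever either input is NaN. */
static float sketch_fmin_naive(float a, float b)
{
   return a < b ? a : b;
}
```

A front-end that only works with the first variant is relying on back-end behaviour, which is the reason the message announces explicit handling of the infinite corner cases in a follow-up.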

v2: Fix up argument scaling to take into account the range and
precision of exotic FP24 hardware.  Flip coordinate system for
arguments along the vertical line as if they were on the left
half-plane in order to avoid division by zero which may give
unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
some more comments.
---
 src/compiler/spirv/vtn_glsl450.c | 77 
 1 file changed, 55 insertions(+), 22 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 0d32fdd..8509f64 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -302,28 +302,61 @@ build_atan(nir_builder *b, nir_ssa_def *y_over_x)
 static nir_ssa_def *
 build_atan2(nir_builder *b, nir_ssa_def *y, nir_ssa_def *x)
 {
-   nir_ssa_def *zero = nir_imm_float(b, 0.0f);
-
-   /* If |x| >= 1.0e-8 * |y|: */
-   nir_ssa_def *condition =
-  nir_fge(b, nir_fabs(b, x),
-  nir_fmul(b, nir_imm_float(b, 1.0e-8f), nir_fabs(b, y)));
-
-   /* Then...call atan(y/x) and fix it up: */
-   nir_ssa_def *atan1 = build_atan(b, nir_fdiv(b, y, x));
-   nir_ssa_def *r_then =
-  nir_bcsel(b, nir_flt(b, x, zero),
-   nir_fadd(b, atan1,
-   nir_bcsel(b, nir_fge(b, y, zero),
-nir_imm_float(b, M_PIf),
-nir_imm_float(b, -M_PIf))),
-   atan1);
-
-   /* Else... */
-   nir_ssa_def *r_else =
-  nir_fmul(b, nir_fsign(b, y), nir_imm_float(b, M_PI_2f));
-
-   return nir_bcsel(b, condition, r_then, r_else);
+   nir_ssa_def *zero = nir_imm_float(b, 0);
+   nir_ssa_def *one = nir_imm_float(b, 1);
+
+   /* If we're on the left half-plane rotate the coordinates π/2 clock-wise
+* for the y=0 discontinuity to end up aligned with the vertical
+* discontinuity of atan(s/t) along t=0.  This also makes sure that we
+* don't attempt to divide by zero along the vertical line, which may give
+* unspecified results on non-GLSL 4.1-capable hardware.
+*/
+   nir_ssa_def *flip = nir_fge(b, zero, x);
+   nir_ssa_def *s = nir_bcsel(b, flip, nir_fabs(b, x), y);
+   nir_ssa_def *t = nir_bcsel(b, flip, y, nir_fabs(b, x));
+
+   /* If the magnitude of the denominator exceeds some huge value, scale down
+* the arguments in order to prevent the reciprocal operation from flushing
+* its result to zero, which would cause precision problems, and for s
+* infinite would cause us to return a NaN instead of the correct finite
+* value.
+*
+* If fmin and fmax are respectively the smallest and largest positive
+* normalized floating point values representable by the implementation,
+* the constants below should be in agreement with:
+*
+*huge <= 1 / fmin
+*scale <= 1 / fmin / fmax (for |t| >= huge)
+*
+* In addition scale should be a negative power of two in order to avoid
+* loss of precision.  The values chosen below should work for most usual
+* floating point representations with at least the dynamic range of ATI's
+* 24-bit representation.
+*/
+   nir_ssa_def *huge = nir_imm_float(b, 1e18f);
+   nir_ssa_def *scale = nir_bcsel(b, nir_fge(b, nir_fabs(b, t), huge),
+  nir_imm_float(b, 0.25), one);
+   nir_ssa_def

[Mesa-dev] [PATCHv2 5/8] glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.

2017-01-26 Thread Francisco Jerez

This addresses several issues of the current atan2 implementation:

 - Negative zero (and negative denorms which end up getting flushed to
   zero) isn't handled correctly by the current implementation.  The
   reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
   on which side of the branch cut the argument is, which causes us to
   return incorrect results (off by up to 2π) for very small negative
   values.

 - There is a serious precision problem for x values of large enough
   magnitude introduced by the floating point division operation being
   implemented as a mul+rcp sequence.  This can lead to the quotient
   getting flushed to zero in some cases introducing an error of over
   8e6 ULP in the result -- Or in the most catastrophic case will
   cause us to return NaN instead of the correct value ±π/2 for y=±∞
   and x very large.  We can fix this easily by scaling down both
   arguments when the absolute value of the denominator goes above a
   certain threshold.  The error of this atan2 implementation remains
   below 25 ULP in most of its domain except for a neighborhood of y=0
   where it reaches a maximum error of about 180 ULP.

 - It emits a bunch of instructions including no less than three
   if-else branches per scalar component that don't seem to get
   optimized out later on.  This implementation uses about 13% fewer
   instructions on Intel SKL hardware and doesn't emit any control
   flow instructions.
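
The first two points can be demonstrated on the CPU with a small sketch. It assumes a reciprocal that flushes denormal results to zero (modelled by `sketch_rcp_ftz`, a hypothetical helper) and reuses the 1e18 threshold and 0.25 scale that appear in the NIR version of this change. The reciprocal-sign test for negative zero is one possible approach shown for illustration; it is not claimed to be the exact expression used in the patch.

```c
#include <float.h>

static float sketch_abs(float v)
{
   return v < 0.0f ? -v : v;
}

/* Model of a hardware reciprocal that flushes denormal results to zero. */
static float sketch_rcp_ftz(float t)
{
   const float r = 1.0f / t;
   return sketch_abs(r) < FLT_MIN ? 0.0f : r;
}

/* Division as mul+rcp: the quotient collapses to zero for huge |t|. */
static float sketch_div_naive(float s, float t)
{
   return s * sketch_rcp_ftz(t);
}

/* Scale both arguments down by a power of two when |t| is huge, so the
 * reciprocal stays in the normalized range and the quotient survives. */
static float sketch_div_scaled(float s, float t)
{
   const float huge = 1e18f;
   const float scale = sketch_abs(t) >= huge ? 0.25f : 1.0f;

   return (s * scale) * sketch_rcp_ftz(t * scale);
}

/* "y >= 0" is true for negative zero, so it cannot pick the right side
 * of the branch cut for y = -0. */
static int sketch_below_branch_cut_naive(float y)
{
   return !(y >= 0.0f);
}

/* The sign of 1/y distinguishes -0 (-infinity) from +0 (+infinity). */
static int sketch_below_branch_cut_rcp(float y)
{
   return y < 0.0f || 1.0f / y < 0.0f;
}
```

For s = t = 2e38 the naive quotient is flushed to zero while the scaled one stays close to 1, which is the difference between returning atan(0) and atan(1) further down the line.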

v2: Fix up argument scaling to take into account the range and
precision of exotic FP24 hardware.  Flip coordinate system for
arguments along the vertical line as if they were on the left
half-plane in order to avoid division by zero which may give
unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
some more comments.
---
 src/compiler/glsl/builtin_functions.cpp | 96 -
 1 file changed, 60 insertions(+), 36 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index 4a6c5af..432df65 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3560,44 +3560,68 @@ builtin_builder::_acos(const glsl_type *type)
 ir_function_signature *
 builtin_builder::_atan2(const glsl_type *type)
 {
-   ir_variable *vec_y = in_var(type, "vec_y");
-   ir_variable *vec_x = in_var(type, "vec_x");
-   MAKE_SIG(type, always_available, 2, vec_y, vec_x);
-
-   ir_variable *vec_result = body.make_temp(type, "vec_result");
-   ir_variable *r = body.make_temp(glsl_type::float_type, "r");
-   for (int i = 0; i < type->vector_elements; i++) {
-  ir_variable *y = body.make_temp(glsl_type::float_type, "y");
-  ir_variable *x = body.make_temp(glsl_type::float_type, "x");
-  body.emit(assign(y, swizzle(vec_y, i, 1)));
-  body.emit(assign(x, swizzle(vec_x, i, 1)));
-
-  /* If |x| >= 1.0e-8 * |y|: */
-  ir_if *outer_if =
- new(mem_ctx) ir_if(greater(abs(x), mul(imm(1.0e-8f), abs(y))));
-
-  ir_factory outer_then(&outer_if->then_instructions, mem_ctx);
-
-  /* Then...call atan(y/x) */
-  do_atan(outer_then, glsl_type::float_type, r, div(y, x));
-
-  /* ...and fix it up: */
-  ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
-  inner_if->then_instructions.push_tail(
- if_tree(gequal(y, imm(0.0f)),
- assign(r, add(r, imm(M_PIf))),
- assign(r, sub(r, imm(M_PIf)))));
-  outer_then.emit(inner_if);
-
-  /* Else... */
-  outer_if->else_instructions.push_tail(
- assign(r, mul(sign(y), imm(M_PI_2f))));
+   const unsigned n = type->vector_elements;
+   ir_variable *y = in_var(type, "y");
+   ir_variable *x = in_var(type, "x");
+   MAKE_SIG(type, always_available, 2, y, x);
 
-  body.emit(outer_if);
+   /* If we're on the left half-plane rotate the coordinates π/2 clock-wise
+* for the y=0 discontinuity to end up aligned with the vertical
+* discontinuity of atan(s/t) along t=0.  This also makes sure that we
+* don't attempt to divide by zero along the vertical line, which may give
+* unspecified results on non-GLSL 4.1-capable hardware.
+*/
+   ir_variable *flip = body.make_temp(glsl_type::bvec(n), "flip");
+   body.emit(assign(flip, gequal(imm(0.0f, n), x)));
+   ir_variable *s = body.make_temp(type, "s");
+   body.emit(assign(s, csel(flip, abs(x), y)));
+   ir_variable *t = body.make_temp(type, "t");
+   body.emit(assign(t, csel(flip, y, abs(x))));
 
-  body.emit(assign(vec_result, r, 1 << i));
-   }
-   body.emit(ret(vec_result));
+   /* If the magnitude of the denominator exceeds some huge value, scale down
+* the arguments in order to prevent the reciprocal operation from flushing
+* its result to zero, which would cause precision problems, and for s
+* infinite would cause us to return a NaN instead of the correct finite
+* value.
+*
+* If fmin and fmax are respectively the

Re: [Mesa-dev] Build failure of Gallium OSMesa without LLVM

2017-01-26 Thread Tobias Droste

Hey Matt,

yeah sorry about this, it is a known problem:
https://bugs.freedesktop.org/show_bug.cgi?id=99010

A fix was committed, but it broke the scons build, so Jose reverted it. He also 
doesn't accept the general approach of that fix.

See discussion here:
https://lists.freedesktop.org/archives/mesa-dev/2017-January/141263.html

Haven't had time to come up with a new solution, but Emil may come up with one 
faster than me.

If you need a build you can use the workaround from the bug:
https://bugs.freedesktop.org/show_bug.cgi?id=99010#c11

Tobias

On Thursday, 26 January 2017, 13:46:06 CET, Matt Turner wrote:
> Reported against 17.0.0-rc2 [1], but occurs on master as well
> 
> Reproduce with
> 
> ./configure --with-dri-drivers= --with-gallium-drivers=swrast
> --disable-gallium-llvm --enable-gallium-osmesa
> 
> Fails linking Gallium OSMesa with undefined references to draw_llvm_destroy.
> 
> ../../../../src/gallium/auxiliary/.libs/libgallium.a(draw_context.o):
> In function `draw_destroy':
> /home/mattst88/projects/mesa/src/gallium/auxiliary/draw/draw_context.c:226:
> undefined reference to `draw_llvm_destroy'
> 
> [1] https://bugs.gentoo.org/show_bug.cgi?id=607320

Re: [Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

2017-01-26 Thread Francisco Jerez

Ian Romanick  writes:

> On 01/26/2017 12:20 PM, Francisco Jerez wrote:
>> Ian Romanick  writes:
>> 
>>> On 01/25/2017 10:53 AM, Francisco Jerez wrote:
>>>> Hi Ian, and thank you for your comments,
>>>>
>>>> Ian Romanick  writes:
>>>>
>>>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>>>>> Will avoid a regression in a future commit that introduces some
>>>>>> additional rcp operations.
>>>>>
>>>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>>>>> keep all the expressions the same.  rcp and div had these weird guards.
>>>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>>>>> don't.  If the atan2 implementation depends on NaN being generated by
>>>>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>>>>> what they generate, but it's not NaN and it's probably not 0.0.
>>>>
>>>> The atan2 implementation from patch 5 doesn't rely on NaNs being
>>>> generated, but it does rely on the reciprocal operation handling zero
>>>> and infinity correctly as specified by GLSL for the division operation.
>>>
>>> Okay.  That is the problem on older GPUs that I was referring to.
>>> Specifically, all versions of the GLSL prior to 4.40 (!) say:
>>>
>>> Similarly, treatment of conditions such as divide by 0 may lead to
>>> an unspecified result, but in no case should such a condition lead
>>> to the interruption or termination of processing.
>> 
>> I cannot find this paragraph in any of the GLSL 4.1+ specs.
>
> Somehow I botched my search.  GLSL 4.10 was the first version to drop
> this in favor of, "Dividing by 0 results in the appropriately signed
> IEEE Inf."
>
>>> I believe that DX11 requires the GLSL 4.40+ behavior, so all even
>>> somewhat modern devices should just work.  It's all the pre-DX11
>>> hardware that might be a problem.
>> 
>> Actually I don't think PATCH 5 necessarily cares about infinities not
>> getting generated -- The only requirement is for rcp(0) to return a
>> fairly large value in absolute value (the larger the more accurate the
>> result will be), even the sign of the result is pretty much irrelevant.
>
> I expect that's what non-Inf GPUs do / did... generate float MAXVAL.
> Now that I'm understanding (and remembering) all the issues better, I'm
> a bit less worried.  As I mentioned in the previous message, I'd like to
> see someone test this on r300... I think I still have one somewhere, but
> I won't be able to test it for at least two weeks.
>
>> That said there's a relatively straightforward change we could apply to
>> PATCH 5 and 6 in order to make the calculation more robust against
>> division by zero (it will probably make this patch unnecessary to avoid
>> piglit regressions but I think we want it anyway because of GLSL 4.1+):
>
> Yes... I believe we do need this patch for GLSL 4.10 correctness.
>
>> We could leverage the coordinate rotation to turn (y, 0) into (0, y),
>> avoiding division by zero along the whole vertical line -- The only
>> remaining case where we could potentially divide by zero is along the
>> left y=0 half-line, but the function jumps from -π to π along that line
>> so returning an undefined value within [-π, π] for y=0 and x < 0 is very
>> unlikely to hurt, because AFAICT all GLSL versions lacking well-defined
>> divide by zero didn't require the implementation to represent the sign
>> of zero consistently either, so the shader is unlikely to be able to
>> generate a signed zero value accurately enough to notice the problem...
>> 
>>> Now... talking to Jason just now, he reminded me that the spec also says
>>> the following about built-in functions:
>>>
>>> Function parameters specified as angle are assumed to be in units
>>> of radians. In no case will any of these functions result in a
>>> divide by zero error. If the divisor of a ratio is 0, then results
>>> will be undefined.
>>>
>> Heh... That could be interpreted as if atan2(y, 0) is undefined which
>> would make this discussion moot -- I'll send a v2 of PATCH 5 and 6
>> anyway since it should be easy enough to get right.
>
> I think it allows atan2(y, 0) to have an undefined result, but atan2(0,
> -abs(x)) should still produce 0.

Nope, atan2(±0, -abs(x)) is right along the discontinuity, and the
expected value would jump from -π to π depending on the sign of zero.

>  Based on my reading of patch 5, those inputs would lead to rcp(0).
>

Yeah, but then again hardware unable to give IEEE-compliant results for
division by zero is unlikely to be able to generate signed zero
accurately enough, so the result is going to have a maximum absolute
error of 2π anyway.

>>> We may be fine even on old, clunky hardware.  Looking at the code in
>>> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
>>> undefined results.  It looks like
>>> tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
>>> have r300 or r400

[Mesa-dev] [PATCH 1/3] [v3] drm: Add new DRM_IOCTL_MODE_GETPLANE2

2017-01-26 Thread Ben Widawsky

Originally based off of a patch by Kristian.

This new ioctl extends DRM_IOCTL_MODE_GETPLANE, by returning information
about the modifiers that will work with each format.

It's modified from Kristian's patch in that the modifiers and formats
are set up by the driver, and then a callback is used to create the
format list. The LOC difference was large enough that I don't think it
made sense to keep his authorship, but the new UABI was primarily his idea.

Additionally, I hit a couple of drivers which Kristian missed updating.

It also contains a change requested by Daniel to make the modifiers
array a sentinel based structure instead of a sized one. Upon discussion
on IRC, it was determined that having an invalid modifier might make
sense in general as well.

v2:
  - Make formats uint32_t, and use an offset, see the comment in the
  patch. Add a WARN_ON and early bail for when there are more than 32
  formats. (Rob)
  - Remove DRM_DEBUG_KMS (Ville)
  - make flags come before count in struct (Ville)

v3:
  - Make formats 64b again to defer the pain, and add a pad
  - Make init fail if > 64 instead of at get_plane. This could be made
  more optimal by doing it in get_plane because 0 masked modifiers don't
  need to be reported back to userspace. As a result, the first driver
  to go beyond 64 formats has to deal with this.
  - Fix the comment to be more clear.
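
As a rough illustration of the two design points above (a
sentinel-terminated modifier array and a 64-bit format mask per
modifier), here is a small userspace-style sketch.  Everything in it is
hypothetical: the SKETCH_* constants, the callback type and the helper
names are invented for the example and are not the kernel interfaces
added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel value marking the end of a modifier array. */
#define SKETCH_MOD_INVALID UINT64_MAX
#define SKETCH_MAX_FORMATS 64

/* Hypothetical driver callback: does this format work with this modifier? */
typedef int (*sketch_supported_fn)(uint32_t format, uint64_t modifier);

/* Count the entries of a sentinel-terminated modifier array. */
static size_t sketch_count_modifiers(const uint64_t *mods)
{
    size_t n = 0;

    if (!mods)
        return 0;
    while (mods[n] != SKETCH_MOD_INVALID)
        n++;
    return n;
}

/* Build the mask for one modifier: bit i is set when formats[i] is
 * usable with it.  Fails up front if the formats cannot fit in 64 bits. */
static int sketch_format_mask(const uint32_t *formats, size_t n_formats,
                              uint64_t modifier, sketch_supported_fn supported,
                              uint64_t *mask_out)
{
    uint64_t mask = 0;
    size_t i;

    if (n_formats > SKETCH_MAX_FORMATS)
        return -1;
    for (i = 0; i < n_formats; i++) {
        if (supported(formats[i], modifier))
            mask |= UINT64_C(1) << i;
    }
    *mask_out = mask;
    return 0;
}

/* Example callback: modifier 0 works everywhere, anything else only
 * with even-numbered format codes. */
static int sketch_example_supported(uint32_t format, uint64_t modifier)
{
    return modifier == 0 || (format % 2) == 0;
}
```

With a sentinel the driver only passes one pointer and no separate
count, at the cost of reserving one modifier value as invalid.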

Cc: Rob Clark 
Cc: Ville Syrjälä 
Cc: Daniel Stone 
Cc: "Kristian H. Kristensen" 
References: https://patchwork.kernel.org/patch/9482393/
Signed-off-by: Ben Widawsky 
---
 drivers/gpu/drm/arc/arcpgu_crtc.c   |  1 +
 drivers/gpu/drm/arm/hdlcd_crtc.c|  1 +
 drivers/gpu/drm/arm/malidp_planes.c |  2 +-
 drivers/gpu/drm/armada/armada_crtc.c|  1 +
 drivers/gpu/drm/armada/armada_overlay.c |  1 +
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c |  4 +-
 drivers/gpu/drm/drm_ioctl.c |  2 +-
 drivers/gpu/drm/drm_modeset_helper.c|  1 +
 drivers/gpu/drm/drm_plane.c | 67 -
 drivers/gpu/drm/drm_simple_kms_helper.c |  3 ++
 drivers/gpu/drm/exynos/exynos_drm_plane.c   |  2 +-
 drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c |  2 +-
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c  |  1 +
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c |  2 +-
 drivers/gpu/drm/i915/intel_display.c|  7 ++-
 drivers/gpu/drm/i915/intel_sprite.c |  4 +-
 drivers/gpu/drm/imx/ipuv3-plane.c   |  4 +-
 drivers/gpu/drm/mediatek/mtk_drm_plane.c|  2 +-
 drivers/gpu/drm/meson/meson_plane.c |  1 +
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c   |  2 +-
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c   |  2 +-
 drivers/gpu/drm/mxsfb/mxsfb_drv.c   |  2 +-
 drivers/gpu/drm/nouveau/nv50_display.c  |  5 +-
 drivers/gpu/drm/omapdrm/omap_plane.c|  3 +-
 drivers/gpu/drm/rcar-du/rcar_du_plane.c |  4 +-
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c   |  5 +-
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c |  4 +-
 drivers/gpu/drm/sti/sti_cursor.c|  1 +
 drivers/gpu/drm/sti/sti_gdp.c   |  2 +-
 drivers/gpu/drm/sti/sti_hqvdp.c |  2 +-
 drivers/gpu/drm/sun4i/sun4i_layer.c |  1 +
 drivers/gpu/drm/tegra/dc.c  | 12 ++---
 drivers/gpu/drm/vc4/vc4_plane.c |  2 +-
 drivers/gpu/drm/virtio/virtgpu_plane.c  |  2 +-
 drivers/gpu/drm/zte/zx_plane.c  |  2 +-
 include/drm/drm_plane.h | 21 +++-
 include/drm/drm_simple_kms_helper.h |  1 +
 include/uapi/drm/drm.h  |  1 +
 include/uapi/drm/drm_fourcc.h   | 11 
 include/uapi/drm/drm_mode.h | 44 
 40 files changed, 199 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/arc/arcpgu_crtc.c 
b/drivers/gpu/drm/arc/arcpgu_crtc.c
index ad9a95916f1f..cd8a24c7c67d 100644
--- a/drivers/gpu/drm/arc/arcpgu_crtc.c
+++ b/drivers/gpu/drm/arc/arcpgu_crtc.c
@@ -218,6 +218,7 @@ static struct drm_plane *arc_pgu_plane_init(struct 
drm_device *drm)
 
ret = drm_universal_plane_init(drm, plane, 0xff, &arc_pgu_plane_funcs,
   formats, ARRAY_SIZE(formats),
+  NULL,
   DRM_PLANE_TYPE_PRIMARY, NULL);
if (ret)
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
index 20ebfb4fbdfa..89fded880807 100644
--- a/drivers/gpu/drm/arm/hdlcd_crtc.c
+++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
@@ -283,6 +283,7 @@ static struct drm_plane *hdlcd_plane_init(struct drm_device 
*drm)
 
ret = drm_universal_plane_init(drm, plane, 0xff, &hdlcd_plane_funcs,

Re: [Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

2017-01-26 Thread Ian Romanick

On 01/26/2017 12:20 PM, Francisco Jerez wrote:
> Ian Romanick  writes:
> 
>> On 01/25/2017 10:53 AM, Francisco Jerez wrote:
>>> Hi Ian, and thank you for your comments,
>>>
>>> Ian Romanick  writes:
>>>
>>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>>>> Will avoid a regression in a future commit that introduces some
>>>>> additional rcp operations.
>>>>
>>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>>>> keep all the expressions the same.  rcp and div had these weird guards.
>>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>>>> don't.  If the atan2 implementation depends on NaN being generated by
>>>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>>>> what they generate, but it's not NaN and it's probably not 0.0.
>>>
>>> The atan2 implementation from patch 5 doesn't rely on NaNs being
>>> generated, but it does rely on the reciprocal operation handling zero
>>> and infinity correctly as specified by GLSL for the division operation.
>>
>> Okay.  That is the problem on older GPUs that I was referring to.
>> Specifically, all versions of the GLSL prior to 4.40 (!) say:
>>
>> Similarly, treatment of conditions such as divide by 0 may lead to
>> an unspecified result, but in no case should such a condition lead
>> to the interruption or termination of processing.
> 
> I cannot find this paragraph in any of the GLSL 4.1+ specs.

Somehow I botched my search.  GLSL 4.10 was the first version to drop
this in favor of, "Dividing by 0 results in the appropriately signed
IEEE Inf."

>> I believe that DX11 requires the GLSL 4.40+ behavior, so all even
>> somewhat modern devices should just work.  It's all the pre-DX11
>> hardware that might be a problem.
> 
> Actually I don't think PATCH 5 necessarily cares about infinities not
> getting generated -- The only requirement is for rcp(0) to return a
> fairly large value in absolute value (the larger the more accurate the
> result will be), even the sign of the result is pretty much irrelevant.

I expect that's what non-Inf GPUs do / did... generate float MAXVAL.
Now that I'm understanding (and remembering) all the issues better, I'm
a bit less worried.  As I mentioned in the previous message, I'd like to
see someone test this on r300... I think I still have one somewhere, but
I won't be able to test it for at least two weeks.

> That said there's a relatively straightforward change we could apply to
> PATCH 5 and 6 in order to make the calculation more robust against
> division by zero (it will probably make this patch unnecessary to avoid
> piglit regressions but I think we want it anyway because of GLSL 4.1+):

Yes... I believe we do need this patch for GLSL 4.10 correctness.

> We could leverage the coordinate rotation to turn (y, 0) into (0, y),
> avoiding division by zero along the whole vertical line -- The only
> remaining case where we could potentially divide by zero is along the
> left y=0 half-line, but the function jumps from -π to π along that line
> so returning an undefined value within [-π, π] for y=0 and x < 0 is very
> unlikely to hurt, because AFAICT all GLSL versions lacking well-defined
> divide by zero didn't require the implementation to represent the sign
> of zero consistently either, so the shader is unlikely to be able to
> generate a signed zero value accurately enough to notice the problem...
> 
>> Now... talking to Jason just now, he reminded me that the spec also says
>> the following about built-in functions:
>>
>> Function parameters specified as angle are assumed to be in units
>> of radians. In no case will any of these functions result in a
>> divide by zero error. If the divisor of a ratio is 0, then results
>> will be undefined.
>>
> Heh... That could be interpreted as if atan2(y, 0) is undefined which
> would make this discussion moot -- I'll send a v2 of PATCH 5 and 6
> anyway since it should be easy enough to get right.

I think it allows atan2(y, 0) to have an undefined result, but atan2(0,
-abs(x)) should still produce 0.  Based on my reading of patch 5, those
inputs would lead to rcp(0).

>> We may be fine even on old, clunky hardware.  Looking at the code in
>> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
>> undefined results.  It looks like
>> tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
>> have r300 or r400 hardware to test that?
>>
>> This patch doesn't affect that, and, even with the "unspecified result"
>> rule, it's clearly correct.  This patch is
>>
>> Reviewed-by: Ian Romanick 
>>
>>>> That said, this matches NIR, and it's probably fine.





Re: [Mesa-dev] [PATCH] i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.

2017-01-26 Thread Kenneth Graunke

On Thursday, January 26, 2017 12:00:14 PM PST Eric Anholt wrote:
> Kenneth Graunke  writes:
> 
> > Applications may delete a shader program, create a new one, and bind it
> > before the next draw.  With terrible luck, malloc may randomly return a
> > chunk of memory for the new gl_program that happened to be the exact
> > same pointer as our previously bound gl_program.  In this case, our
> > logic to detect new programs in brw_upload_pipeline_state() would break:
> >
> >   if (brw->vertex_program != ctx->VertexProgram._Current) {
> >  brw->vertex_program = ctx->VertexProgram._Current;
> >  brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
> >   }
> >
> > Because the pointer is the same, we'd think it was the same program.
> > But it could be wildly different - a different stage altogether,
> > different sets of resources, and so on.  This causes utter chaos.
> 
> Any reason you're not just using _mesa_reference_program()?

That might be a better plan.

Conceptually, I thought of this more as a "weak reference" - we want to
know if the current program is the same as the last draw...but we don't
want to hold on to the program and prevent its deletion.

Plus, it would add a bit of reference counting overhead in the draw path
(though this is probably negligible - Mesa does a lot of it already).

I suppose using real references would mean that a deleted program would
remain alive until the next glDraw*() that used a different shader.
Which could theoretically be forever, but is likely not that long, so
it's probably acceptable.
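
To make the trade-off concrete, here is a minimal model of the "weak
reference" idea: compare pointers at draw time, but clear the cached
pointer when the program is deleted so that a recycled address cannot
be mistaken for the old program.  The sketch_* types and functions are
hypothetical stand-ins, not the i965 code itself.

```c
#include <stddef.h>

struct sketch_program {
    int stage;
};

struct sketch_context {
    const struct sketch_program *bound; /* program seen at the last draw */
    unsigned dirty;                      /* nonzero: state must be re-uploaded */
};

/* Draw-time check, using pointer comparison as in the quoted snippet. */
static void sketch_upload(struct sketch_context *ctx,
                          const struct sketch_program *current)
{
    if (ctx->bound != current) {
        ctx->bound = current;
        ctx->dirty = 1;
    }
}

/* Called when a program is deleted: forget the cached pointer so the
 * next program allocated at the same address is seen as new. */
static void sketch_delete(struct sketch_context *ctx,
                          const struct sketch_program *prog)
{
    if (ctx->bound == prog)
        ctx->bound = NULL;
}
```

A real reference would avoid the unbind hook entirely, at the price of
keeping the deleted program alive until the next draw with a different
shader, as described above.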

I don't know.  I'll probably go with this patch for now since it works
and I've done a lot of testing.  If people prefer the other way, I can
write a patch to do that...

--Ken


Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread David Airlie



- Original Message -
> From: "Marek Olšák" 
> To: "Nicolai Hähnle" 
> Cc: "Ilia Mirkin" , mesa-dev@lists.freedesktop.org, 
> "Dave Airlie" 
> Sent: Friday, 27 January, 2017 6:18:32 AM
> Subject: Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability
> 
> On Thu, Jan 26, 2017 at 9:04 PM, Nicolai Hähnle  wrote:
> > On 26.01.2017 20:23, Ilia Mirkin wrote:
> >>
> >> I have no serious preference, but for doubles, we use a shader cap.
> >
> >
> > I think this:
> >
> >> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle 
> >> wrote:
> >>>
> >>> v1.1: move to using a normal CAP. (Marek)
> >
> >
> > ... suggests that this has already gone back and forth :)
> >
> > Pragmatically, is there hardware that can do 64 bit ints only in some of
> > the
> > shader stages?
> 
> I'll give you a different answer: I think the double cap and other
> shader caps can be changed to normal caps if there is no reason for
> them to be shader caps.
> 

I think the doubles cap should be non-shader now, but I'm not motivated enough 
to write a patch.

Dave.

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2017-01-26 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

matt...@familycampground.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matt...@familycampground.org

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] Build failure of Gallium OSMesa without LLVM

2017-01-26 Thread Matt Turner

Reported against 17.0.0-rc2 [1], but occurs on master as well

Reproduce with

./configure --with-dri-drivers= --with-gallium-drivers=swrast
--disable-gallium-llvm --enable-gallium-osmesa

Fails linking Gallium OSMesa with undefined references to draw_llvm_destroy.

../../../../src/gallium/auxiliary/.libs/libgallium.a(draw_context.o):
In function `draw_destroy':
/home/mattst88/projects/mesa/src/gallium/auxiliary/draw/draw_context.c:226:
undefined reference to `draw_llvm_destroy'

[1] https://bugs.gentoo.org/show_bug.cgi?id=607320

[Mesa-dev] [Bug 95460] Please add more drivers (freedreno, virgl) to features.txt status document

2017-01-26 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=95460

--- Comment #4 from Shmerl  ---
FYI: Freedreno is visible in the MesaMatrix: https://mesamatrix.net

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH 7/8] r300: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Marek Olšák

My preference would be not to use static variables in pipe_screen
functions, because all functions need to be thread-safe, but anyway:

Reviewed-by: Marek Olšák 

Marek
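
One possible way to avoid the static buffer, sketched here with
hypothetical types rather than the real r300_screen, is to format the
name once when the screen is created and store it in the screen itself.
The getter then only reads memory that is never written again.

```c
#include <stdio.h>

/* Hypothetical screen object; the real driver struct has many more fields. */
struct sketch_screen {
    char name[128];
};

/* Called once at screen creation, before the screen is shared. */
static void sketch_screen_init_name(struct sketch_screen *screen,
                                    const char *family, int have_llvm)
{
    snprintf(screen->name, sizeof(screen->name), "%s%s",
             family, have_llvm ? " LLVM" : "");
}

/* Read-only afterwards, so concurrent callers do not race. */
static const char *sketch_get_name(const struct sketch_screen *screen)
{
    return screen->name;
}
```

This also gives each screen its own string instead of one buffer shared
by every screen in the process.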

On Thu, Jan 26, 2017 at 7:27 PM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Provides quick and direct feedback to the user/developer.
>
> Cc: Marek Olšák 
> Signed-off-by: Emil Velikov 
> ---
>  src/gallium/drivers/r300/r300_screen.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r300/r300_screen.c 
> b/src/gallium/drivers/r300/r300_screen.c
> index e5e7535358..12b94723cb 100644
> --- a/src/gallium/drivers/r300/r300_screen.c
> +++ b/src/gallium/drivers/r300/r300_screen.c
> @@ -82,8 +82,16 @@ static const char* chip_families[] = {
>  static const char* r300_get_name(struct pipe_screen* pscreen)
>  {
>  struct r300_screen* r300screen = r300_screen(pscreen);
> +static char buffer[128];
> +const char *llvm = "";
>
> -return chip_families[r300screen->caps.family];
> +#ifdef HAVE_LLVM
> +llvm = " LLVM";
> +#endif
> +
> +util_snprintf(buffer, sizeof(buffer), "%s%s",
> +  chip_families[r300screen->caps.family], llvm);
> +return buffer;
>  }
>
>  static int r300_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
> --
> 2.11.0
>

Re: [Mesa-dev] [PATCH 4/4] gallium: enable int64 on radeonsi, llvmpipe, softpipe

2017-01-26 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

Marek

On Thu, Jan 26, 2017 at 8:09 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> All of these have had support for the TGSI opcodes since before most of
> the glsl compiler work landed.
>
> Also update the docs accordingly, including the missing note about i965.
> ---
>  docs/features.txt|  2 +-
>  docs/relnotes/17.1.0.html| 61 
> 
>  src/gallium/drivers/llvmpipe/lp_screen.c |  2 +-
>  src/gallium/drivers/radeonsi/si_pipe.c   |  4 +--
>  src/gallium/drivers/softpipe/sp_screen.c |  2 +-
>  5 files changed, 66 insertions(+), 5 deletions(-)
>  create mode 100644 docs/relnotes/17.1.0.html
>
> diff --git a/docs/features.txt b/docs/features.txt
> index aff0016..55b1fbb 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -276,21 +276,21 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
>GL_OES_texture_storage_multisample_2d_array   DONE (all drivers 
> that support GL_ARB_texture_multisample)
>
>  Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL 
> ES version:
>
>GL_ARB_bindless_texture   started (airlied)
>GL_ARB_cl_event   not started
>GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi)
>GL_ARB_ES3_2_compatibilityDONE (i965/gen8+)
>GL_ARB_fragment_shader_interlock  not started
>GL_ARB_gl_spirv   not started
> -  GL_ARB_gpu_shader_int64   started (airlied for 
> core and Gallium, idr for i965)
> +  GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
> radeonsi, softpipe, llvmpipe)
>GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
>GL_ARB_parallel_shader_compilenot started, but 
> Chia-I Wu did some related work in 2014
>GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
> radeonsi, softpipe, swr)
>GL_ARB_post_depth_coverageDONE (i965)
>GL_ARB_robustness_isolation   not started
>GL_ARB_sample_locations   not started
>GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
> radeonsi, r600, softpipe, swr)
>GL_ARB_shader_atomic_counter_ops  DONE (nvc0, 
> radeonsi, softpipe)
>GL_ARB_shader_ballot  not started
>GL_ARB_shader_clock   DONE (i965/gen7+)
> diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
> new file mode 100644
> index 000..1b5535b
> --- /dev/null
> +++ b/docs/relnotes/17.1.0.html
> @@ -0,0 +1,61 @@
> + "http://www.w3.org/TR/html4/loose.dtd;>
> +
> +
> +  
> +  Mesa Release Notes
> +  
> +
> +
> +
> +
> +  The Mesa 3D Graphics Library
> +
> +
> +
> +
> +
> +Mesa 17.1.0 Release Notes / TBD
> +
> +
> +Mesa 17.1.0 is a new development release.
> +People who are concerned with stability and reliability should stick
> +with a previous release or wait for Mesa 17.1.1.
> +
> +
> +Mesa 17.1.0 implements the OpenGL 4.5 API, but the version reported by
> +glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
> +glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
> +Some drivers don't support all the features required in OpenGL 4.5.  OpenGL
> +4.5 is only available if requested at context creation
> +because compatibility contexts are not supported.
> +
> +
> +
> +SHA256 checksums
> +
> +TBD.
> +
> +
> +
> +New features
> +
> +
> +Note: some of the new features are only available with certain drivers.
> +
> +
> +
> +GL_ARB_gpu_shader_int64 on i965/gen8+, radeonsi, softpipe, llvmpipe
> +
> +
> +Bug fixes
> +
> +
> +
> +
> +Changes
> +
> +TBD.
> +
> +
> +
> +
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 6ef22b8..0982c35 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -260,20 +260,21 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
> case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION:
>return 1;
> case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
>return 0;
> case PIPE_CAP_SAMPLER_VIEW_TARGET:
>return 1;
> case PIPE_CAP_FAKE_SW_MSAA:
>return 1;
> case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
> case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
> +   case PIPE_CAP_INT64:
>return 1;
>
> case PIPE_CAP_VENDOR_ID:
>return 0x;
> case PIPE_CAP_DEVICE_ID:
>return 0x;
> case PIPE_CAP_ACCELERATED:
>return 0;
> case PIPE_CAP_VIDEO_MEMORY: {
>/* XXX: Do we want to return

Re: [Mesa-dev] [PATCH 01/17] radeonsi: remove si_shader_context::param_tess_offchip

2017-01-26 Thread Edmondo Tommasina

For the series:
Tested-by: Edmondo Tommasina 

Tested with:
* The Witcher 2
* Talos Principle
* Shadow Tactics
* Wasteland 2
* X3: AP
* Pillars of Eternity
* Unigine Heaven & Valley

Thanks
edmondo


On Thu, Jan 26, 2017 at 5:04 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> we don't use on-chip tess.
> ---
>  src/gallium/drivers/radeonsi/si_shader.c  | 6 +++---
>  src/gallium/drivers/radeonsi/si_shader_internal.h | 5 -
>  2 files changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 5ca974e..ea1e8b3 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -5533,21 +5533,21 @@ static void declare_streamout_params(struct 
> si_shader_context *ctx,
>  LLVMTypeRef *params, LLVMTypeRef i32,
>  unsigned *num_params)
>  {
> int i;
>
> /* Streamout SGPRs. */
> if (so->num_outputs) {
> if (ctx->type != PIPE_SHADER_TESS_EVAL)
> params[ctx->param_streamout_config = (*num_params)++] 
> = i32;
> else
> -   ctx->param_streamout_config = ctx->param_tess_offchip;
> +   ctx->param_streamout_config = *num_params - 1;
>
> params[ctx->param_streamout_write_index = (*num_params)++] = 
> i32;
> }
> /* A streamout buffer offset is loaded if the stride is non-zero. */
> for (i = 0; i < 4; i++) {
> if (!so->stride[i])
> continue;
>
> params[ctx->param_streamout_offset[i] = (*num_params)++] = 
> i32;
> }
> @@ -5697,24 +5697,24 @@ static void create_function(struct si_shader_context 
> *ctx)
> for (i = 0; i < 3; i++)
> returns[num_returns++] = ctx->f32; /* VGPRs */
> break;
>
> case PIPE_SHADER_TESS_EVAL:
> params[SI_PARAM_TCS_OFFCHIP_LAYOUT] = ctx->i32;
> num_params = SI_PARAM_TCS_OFFCHIP_LAYOUT+1;
>
> if (shader->key.as_es) {
> params[ctx->param_oc_lds = num_params++] = ctx->i32;
> -   params[ctx->param_tess_offchip = num_params++] = 
> ctx->i32;
> +   params[num_params++] = ctx->i32;
> params[ctx->param_es2gs_offset = num_params++] = 
> ctx->i32;
> } else {
> -   params[ctx->param_tess_offchip = num_params++] = 
> ctx->i32;
> +   params[num_params++] = ctx->i32;
> declare_streamout_params(ctx, &shader->selector->so,
>  params, ctx->i32, 
> &num_params);
> params[ctx->param_oc_lds = num_params++] = ctx->i32;
> }
> last_sgpr = num_params - 1;
>
> /* VGPRs */
> params[ctx->param_tes_u = num_params++] = ctx->f32;
> params[ctx->param_tes_v = num_params++] = ctx->f32;
> params[ctx->param_tes_rel_patch_id = num_params++] = ctx->i32;
> diff --git a/src/gallium/drivers/radeonsi/si_shader_internal.h 
> b/src/gallium/drivers/radeonsi/si_shader_internal.h
> index d37a9e7..9055b4d 100644
> --- a/src/gallium/drivers/radeonsi/si_shader_internal.h
> +++ b/src/gallium/drivers/radeonsi/si_shader_internal.h
> @@ -114,25 +114,20 @@ struct si_shader_context {
> int param_vs_prim_id;
> int param_instance_id;
> int param_vertex_index0;
> int param_tes_u;
> int param_tes_v;
> int param_tes_rel_patch_id;
> int param_tes_patch_id;
> int param_es2gs_offset;
> int param_oc_lds;
>
> -   /* Sets a bit if the dynamic HS control word was 0x8000. The bit 
> is
> -* 0x80 for VS, 0x1 for ES.
> -*/
> -   int param_tess_offchip;
> -
> LLVMTargetMachineRef tm;
>
> unsigned invariant_load_md_kind;
> unsigned range_md_kind;
> unsigned uniform_md_kind;
> unsigned fpmath_md_kind;
> LLVMValueRef fpmath_md_2p5_ulp;
> LLVMValueRef empty_md;
>
> /* Preloaded descriptors. */
> --
> 2.7.4
>

Re: [Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

2017-01-26 Thread Francisco Jerez

Ian Romanick  writes:

> On 01/25/2017 10:53 AM, Francisco Jerez wrote:
>> Hi Ian, and thank you for your comments,
>> 
>> Ian Romanick  writes:
>> 
>>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>>> Will avoid a regression in a future commit that introduces some
>>>> additional rcp operations.
>>>
>>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>>> keep all the expressions the same.  rcp and div had these weird guards.
>>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>>> don't.  If the atan2 implementation depends on NaN being generated by
>>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>>> what they generate, but it's not NaN and it's probably not 0.0.
>> 
>> The atan2 implementation from patch 5 doesn't rely on NaNs being
>> generated, but it does rely on the reciprocal operation handling zero
>> and infinity correctly as specified by GLSL for the division operation.
>
> Okay.  That is the problem on older GPUs that I was referring to.
> Specifically, all versions of the GLSL prior to 4.40 (!) say:
>
> Similarly, treatment of conditions such as divide by 0 may lead to
> an unspecified result, but in no case should such a condition lead
> to the interruption or termination of processing.
>

I cannot find this paragraph in any of the GLSL 4.1+ specs.

> I believe that DX11 requires the GLSL 4.40+ behavior, so all even
> somewhat modern devices should just work.  It's all the pre-DX11
> hardware that might be a problem.
>

Actually I don't think PATCH 5 necessarily cares about infinities not
getting generated -- The only requirement is for rcp(0) to return a
fairly large value in absolute value (the larger the more accurate the
result will be), even the sign of the result is pretty much irrelevant.

That said there's a relatively straightforward change we could apply to
PATCH 5 and 6 in order to make the calculation more robust against
division by zero (it will probably make this patch unnecessary to avoid
piglit regressions but I think we want it anyway because of GLSL 4.1+):
We could leverage the coordinate rotation to turn (y, 0) into (0, y),
avoiding division by zero along the whole vertical line -- The only
remaining case where we could potentially divide by zero is along the
left y=0 half-line, but the function jumps from -π to π along that line
so returning an undefined value within [-π, π] for y=0 and x < 0 is very
unlikely to hurt, because AFAICT all GLSL versions lacking well-defined
divide by zero didn't require the implementation to represent the sign
of zero consistently either, so the shader is unlikely to be able to
generate a signed zero value accurately enough to notice the problem...

> Now... talking to Jason just now, he reminded me that the spec also says
> the following about built-in functions:
>
> Function parameters specified as angle are assumed to be in units
> of radians. In no case will any of these functions result in a
> divide by zero error. If the divisor of a ratio is 0, then results
> will be undefined.
>
Heh... That could be interpreted as if atan2(y, 0) is undefined which
would make this discussion moot -- I'll send a v2 of PATCH 5 and 6
anyway since it should be easy enough to get right.

> We may be fine even on old, clunky hardware.  Looking at the code in
> patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
> undefined results.  It looks like
> tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
> have r300 or r400 hardware to test that?
>
> This patch doesn't affect that, and, even with the "unspecified result"
> rule, it's clearly correct.  This patch is
>
> Reviewed-by: Ian Romanick 
>
>>> That said, this matches NIR, and it's probably fine.
>>>
>>>> ---
>>>>  src/compiler/glsl/ir_expression_operation.py | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/compiler/glsl/ir_expression_operation.py 
>>>> b/src/compiler/glsl/ir_expression_operation.py
>>>> index f91ac9b..4ac1ffb 100644
>>>> --- a/src/compiler/glsl/ir_expression_operation.py
>>>> +++ b/src/compiler/glsl/ir_expression_operation.py
>>>> @@ -422,7 +422,7 @@ ir_expression_operation = [
>>>> operation("neg", 1, source_types=numeric_types, c_expression={'u': 
>>>> "-((int) {src0})", 'default': "-{src0}"}),
>>>> operation("abs", 1, source_types=signed_numeric_types, 
>>>> c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 
>>>> 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}),
>>>> operation("sign", 1, source_types=signed_numeric_types, 
>>>> c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 
>>>> 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 
>>>> 'i64': "({src0} > 0) - ({src0} < 0)"}),
>>>> -   operation("rcp", 1, source_types=real_types, c_expression={'f':

Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Marek Olšák

On Thu, Jan 26, 2017 at 9:04 PM, Nicolai Hähnle  wrote:
> On 26.01.2017 20:23, Ilia Mirkin wrote:
>>
>> I have no serious preference, but for doubles, we use a shader cap.
>
>
> I think this:
>
>> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> v1.1: move to using a normal CAP. (Marek)
>
>
> ... suggests that this has already gone back and forth :)
>
> Pragmatically, is there hardware that can do 64 bit ints only in some of the
> shader stages?

I'll give you a different answer: I think the double cap and other
shader caps can be changed to normal caps if there is no reason for
them to be shader caps.

Marek

Re: [Mesa-dev] [PATCH 3/8] softpipe: set softpipe_screen::use_llvm when draw is build with LLVM

2017-01-26 Thread Roland Scheidegger

Am 26.01.2017 um 19:27 schrieb Emil Velikov:
> From: Emil Velikov 
> 
> Currently we can build draw without LLVM, so honouring SOFTPIPE_USE_LLVM
> is misleading, even if most of the code nicely falls back to a no-op in
> the absence of LLVM.
> 
> That does not seem to be the case in softpipe_draw_vbo(), where extra
> {prepare,cleanup}_{vertex,geometry}_sampling is present.
> 
> Haven't checked how much overhead this causes, but omitting it is the
> correct thing to do, afaict.
> 
> Note: the topic of "is it a smart idea to have softpipe build with
> LLVM-less draw" is to be checked another day.
This might not make much sense for other drivers, but for softpipe it
probably really does - it also defaults to non-llvm draw.
You are right though that we shouldn't set use_llvm if we didn't build
with llvm.

As for appending LLVM to the name, this sounds about right to me. Albeit
what we probably really want to know is if draw is actually using llvm,
not just if it was built with it (in particular with softpipe which
defaults to non-llvm). But no big deal...

For the series:
Reviewed-by: Roland Scheidegger 

> 
> Cc: Roland Scheidegger 
> Cc: Jose Fonseca 
> Signed-off-by: Emil Velikov 
> ---
> It's the better thing to do imho, but if you feel strongly against it
> feel free to drop it.
> ---
>  src/gallium/drivers/softpipe/sp_screen.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/gallium/drivers/softpipe/sp_screen.c 
> b/src/gallium/drivers/softpipe/sp_screen.c
> index 9bc8d10e8e..1a58eb9d99 100644
> --- a/src/gallium/drivers/softpipe/sp_screen.c
> +++ b/src/gallium/drivers/softpipe/sp_screen.c
> @@ -568,7 +568,9 @@ softpipe_create_screen(struct sw_winsys *winsys)
> screen->base.context_create = softpipe_create_context;
> screen->base.flush_frontbuffer = softpipe_flush_frontbuffer;
> screen->base.get_compute_param = softpipe_get_compute_param;
> +#ifdef HAVE_LLVM
> screen->use_llvm = debug_get_option_use_llvm();
> +#endif
>  
> util_format_s3tc_init();
>  
> 


Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Roland Scheidegger

Am 26.01.2017 um 21:11 schrieb Ilia Mirkin:
> On Thu, Jan 26, 2017 at 3:04 PM, Nicolai Hähnle  wrote:
>> On 26.01.2017 20:23, Ilia Mirkin wrote:
>>>
>>> I have no serious preference, but for doubles, we use a shader cap.
>>
>>
>> I think this:
>>
>>> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle 
>>> wrote:

>>>> v1.1: move to using a normal CAP. (Marek)
>>
>>
>> ... suggests that this has already gone back and forth :)
>>
>> Pragmatically, is there hardware that can do 64 bit ints only in some of the
>> shader stages?

If you let draw handle vs :-).

> 
> Highly doubtful. Just as doubtful as such hardware existing for
> doubles :) Like I said, I'm fine either way, mostly wanted to point
> out the inconsistency wrt doubles.

I agree, would be more consistent if both these flags look the same.

Roland


Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Ilia Mirkin

On Thu, Jan 26, 2017 at 3:04 PM, Nicolai Hähnle  wrote:
> On 26.01.2017 20:23, Ilia Mirkin wrote:
>>
>> I have no serious preference, but for doubles, we use a shader cap.
>
>
> I think this:
>
>> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle 
>> wrote:
>>>
>>> v1.1: move to using a normal CAP. (Marek)
>
>
> ... suggests that this has already gone back and forth :)
>
> Pragmatically, is there hardware that can do 64 bit ints only in some of the
> shader stages?

Highly doubtful. Just as doubtful as such hardware existing for
doubles :) Like I said, I'm fine either way, mostly wanted to point
out the inconsistency wrt doubles.

  -ilia

Re: [Mesa-dev] [PATCH] docs/releasing: add a note about the relnotes template

2017-01-26 Thread Nicolai Hähnle


On 26.01.2017 20:26, Emil Velikov wrote:
> From: Emil Velikov 
>
> Signed-off-by: Emil Velikov 
> ---
> I forget this a bit too often :-\

:D

Reviewed-by: Nicolai Hähnle 

> ---
>  docs/releasing.html | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/docs/releasing.html b/docs/releasing.html
> index 2ed66a13f4..020f3dec70 100644
> --- a/docs/releasing.html
> +++ b/docs/releasing.html
> @@ -161,6 +161,8 @@ To setup the branchpoint:
> git checkout master
> $EDITOR VERSION # bump the version number
> git commit -as
> +   cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template
> +   git commit -as
> git push origin X.Y-branchpoint X.Y




Re: [Mesa-dev] [PATCH 2/4] st/glsl_to_tgsi: add support for 64-bit integers

2017-01-26 Thread Nicolai Hähnle


On 26.01.2017 20:20, Ilia Mirkin wrote:
> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle  wrote:
>> +   case ir_unop_i642b:
>> +  emit_asm(ir, TGSI_OPCODE_U64SNE, result_dst, op[0], 
>> st_src_reg_for_int(0));
>> +  break;
>
> Does this work reliably? I would have imagined you'd need a
> st_src_reg_for_int64() variant...


It works because st_src_reg_for_int swizzles the constant across all 
channels. I'm pretty sure that other parts of the code rely on that as 
well...


Nicolai
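
A small C model of the point above, as a sketch only: if a 64-bit operand
is viewed as two 32-bit channels, a 32-bit immediate replicated into both
channels reads back as the 64-bit value with the same low and high words.
For the constant 0 that is exactly the 64-bit zero, which is why comparing
against it works. The helper names below are hypothetical and are not part
of st_glsl_to_tgsi.

```c
#include <stdint.h>

/* Combine two 32-bit channels into the 64-bit value they represent. */
static uint64_t pack_channels(uint32_t lo, uint32_t hi)
{
    return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Model of a 32-bit immediate swizzled across both channels. */
static uint64_t broadcast_imm32(uint32_t imm)
{
    return pack_channels(imm, imm);
}

/* Model of a 64-bit "set on not equal" against the broadcast immediate. */
static int u64_sne_broadcast(uint64_t value, uint32_t imm)
{
    return value != broadcast_imm32(imm);
}
```

Note that the same trick would not give the expected 64-bit constant for
most non-zero immediates: broadcasting 1 yields 0x0000000100000001 in this
model, not 1.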

Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Nicolai Hähnle


On 26.01.2017 20:23, Ilia Mirkin wrote:
> I have no serious preference, but for doubles, we use a shader cap.

I think this:

> On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle  wrote:
>> v1.1: move to using a normal CAP. (Marek)

... suggests that this has already gone back and forth :)

Pragmatically, is there hardware that can do 64 bit ints only in some of 
the shader stages?


Nicolai



v2: fill in the cap everywhere

Signed-off-by: Dave Airlie 
Reviewed-by: Marek Olšák  (v1)
---
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 17 insertions(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 66c530d..c045f7e 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -234,20 +234,21 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
   return 0;

/* Stream output. */
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
   return 0;

/* Geometry shader output, unsupported. */
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index a1c026c..abb8787 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -291,20 +291,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
return 0;

case PIPE_CAP_MAX_VIEWPORTS:
return 1;

case PIPE_CAP_SHAREABLE_SHADERS:
/* manage the variants for these ourself, to avoid breaking precompile: 
*/
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
if (is_ir3(screen))
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 7889873..07d1488 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -292,20 +292,21 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
   return 0;

case PIPE_CAP_MAX_VIEWPORTS:
   return 1;

case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
   return 64;

case PIPE_CAP_GLSL_FEATURE_LEVEL:
   return 120;
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index 6018cd1..960fd71 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -515,20 +515,21 @@ ilo_get_param(struct pipe_screen *screen,

Re: [Mesa-dev] [PATCH] i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.

2017-01-26 Thread Eric Anholt

Kenneth Graunke  writes:

> Applications may delete a shader program, create a new one, and bind it
> before the next draw.  With terrible luck, malloc may randomly return a
> chunk of memory for the new gl_program that happened to be the exact
> same pointer as our previously bound gl_program.  In this case, our
> logic to detect new programs in brw_upload_pipeline_state() would break:
>
>   if (brw->vertex_program != ctx->VertexProgram._Current) {
>  brw->vertex_program = ctx->VertexProgram._Current;
>  brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
>   }
>
> Because the pointer is the same, we'd think it was the same program.
> But it could be wildly different - a different stage altogether,
> different sets of resources, and so on.  This causes utter chaos.

Any reason you're not just using _mesa_reference_program()?
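
To illustrate why holding a reference sidesteps the pointer-reuse problem,
here is a minimal sketch with a toy refcounted object. It does not
reproduce _mesa_reference_program(); struct prog, prog_create() and
prog_reference() are hypothetical names. The point is only that while the
driver-side slot holds a reference, the old object stays allocated, so a
newly created object cannot come back from malloc at the same address.

```c
#include <stdlib.h>

struct prog {
    int refcount;
    int id;
};

/* Allocate a program with one reference owned by the caller. */
static struct prog *prog_create(int id)
{
    struct prog *p = malloc(sizeof(*p));
    if (p) {
        p->refcount = 1;
        p->id = id;
    }
    return p;
}

/* Make *slot point at p, taking a reference on p and dropping the
 * reference previously held through *slot (freeing it at zero). */
static void prog_reference(struct prog **slot, struct prog *p)
{
    if (*slot == p)
        return;
    if (p)
        p->refcount++;
    if (*slot && --(*slot)->refcount == 0)
        free(*slot);
    *slot = p;
}
```

With a slot managed this way, a check of the form "cached != current" stays
meaningful, because two live allocations never share an address.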



Re: [Mesa-dev] [PATCH 03/17] gallium/radeon: allow VRAM-only placements again on APUs & recent amdgpu

2017-01-26 Thread Marek Olšák

On Thu, Jan 26, 2017 at 8:43 PM, Ernst Sjöstrand  wrote:
> Should this code be able to handle drm 4.0.0?

DRM 4.0.0 doesn't exist and hopefully won't exist.

DRM 1.x.0 = intel
DRM 2.x.0 = radeon
DRM 3.x.0 = amdgpu

That's just a coincidence.

Marek

[Mesa-dev] [PATCH 1/3] gallium/radeon: rename grbm to mmio in the gpu load path

2017-01-26 Thread Samuel Pitoiset

We also want to monitor other MMIO counters like SRBM_STATUS2 in
order to know if SDMA is busy.
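
For readers unfamiliar with the GPU load path, the following is a
simplified sketch of the sampling scheme, under stated assumptions: a
status register is polled, one bit decides whether a sample counts as busy
or idle, and a load percentage is derived from two snapshots. The
SDMA_BUSY bit position is taken from this series; the struct and function
names are hypothetical, and the real code additionally uses atomics, a
sampling thread, and a fallback register read when no samples were taken.

```c
#include <stdint.h>

#define SDMA_BUSY(x) (((x) >> 5) & 0x1)

struct mmio_counter {
    unsigned busy;
    unsigned idle;
};

/* Account one register sample as either busy or idle. */
static void sample_sdma(struct mmio_counter *c, uint32_t reg)
{
    if (SDMA_BUSY(reg))
        c->busy++;
    else
        c->idle++;
}

/* Pack both counters into one 64-bit snapshot: busy low, idle high. */
static uint64_t counter_snapshot(const struct mmio_counter *c)
{
    return c->busy | ((uint64_t)c->idle << 32);
}

/* Busy percentage between two snapshots; 0 if nothing was sampled. */
static unsigned load_percent(uint64_t begin, uint64_t end)
{
    unsigned busy = (unsigned)(end & 0xffffffff) -
                    (unsigned)(begin & 0xffffffff);
    unsigned idle = (unsigned)(end >> 32) - (unsigned)(begin >> 32);

    if (!(busy + idle))
        return 0;
    return busy * 100 / (busy + idle);
}
```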

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/radeon/r600_gpu_load.c| 30 +++
 src/gallium/drivers/radeon/r600_pipe_common.h | 35 ++-
 2 files changed, 33 insertions(+), 32 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
b/src/gallium/drivers/radeon/r600_gpu_load.c
index 83b7bd7210..c84b86d76c 100644
--- a/src/gallium/drivers/radeon/r600_gpu_load.c
+++ b/src/gallium/drivers/radeon/r600_gpu_load.c
@@ -66,8 +66,8 @@
p_atomic_inc(&counters->named.field.idle);  \
} while (0)
 
-static void r600_update_grbm_counters(struct r600_common_screen *rscreen,
- union r600_grbm_counters *counters)
+static void r600_update_mmio_counters(struct r600_common_screen *rscreen,
+ union r600_mmio_counters *counters)
 {
uint32_t value = 0;
 
@@ -116,7 +116,7 @@ static PIPE_THREAD_ROUTINE(r600_gpu_load_thread, param)
last_time = cur_time;
 
/* Update the counters. */
-   r600_update_grbm_counters(rscreen, &rscreen->grbm_counters);
+   r600_update_mmio_counters(rscreen, &rscreen->mmio_counters);
}
p_atomic_dec(&rscreen->gpu_load_stop_thread);
return 0;
@@ -132,7 +132,7 @@ void r600_gpu_load_kill_thread(struct r600_common_screen 
*rscreen)
rscreen->gpu_load_thread = 0;
 }
 
-static uint64_t r600_read_grbm_counter(struct r600_common_screen *rscreen,
+static uint64_t r600_read_mmio_counter(struct r600_common_screen *rscreen,
   unsigned busy_index)
 {
/* Start the thread if needed. */
@@ -145,16 +145,16 @@ static uint64_t r600_read_grbm_counter(struct 
r600_common_screen *rscreen,
pipe_mutex_unlock(rscreen->gpu_load_mutex);
}
 
-   unsigned busy = 
p_atomic_read(&rscreen->grbm_counters.array[busy_index]);
-   unsigned idle = p_atomic_read(&rscreen->grbm_counters.array[busy_index 
+ 1]);
+   unsigned busy = 
p_atomic_read(&rscreen->mmio_counters.array[busy_index]);
+   unsigned idle = p_atomic_read(&rscreen->mmio_counters.array[busy_index 
+ 1]);
 
return busy | ((uint64_t)idle << 32);
 }
 
-static unsigned r600_end_grbm_counter(struct r600_common_screen *rscreen,
+static unsigned r600_end_mmio_counter(struct r600_common_screen *rscreen,
  uint64_t begin, unsigned busy_index)
 {
-   uint64_t end = r600_read_grbm_counter(rscreen, busy_index);
+   uint64_t end = r600_read_mmio_counter(rscreen, busy_index);
unsigned busy = (end & 0xffffffff) - (begin & 0xffffffff);
unsigned idle = (end >> 32) - (begin >> 32);
 
@@ -167,16 +167,16 @@ static unsigned r600_end_grbm_counter(struct 
r600_common_screen *rscreen,
if (idle || busy) {
return busy*100 / (busy + idle);
} else {
-   union r600_grbm_counters counters;
+   union r600_mmio_counters counters;
 
memset(&counters, 0, sizeof(counters));
-   r600_update_grbm_counters(rscreen, &counters);
+   r600_update_mmio_counters(rscreen, &counters);
return counters.array[busy_index] ? 100 : 0;
}
 }
 
-#define BUSY_INDEX(rscreen, field) (&rscreen->grbm_counters.named.field.busy - 
\
-   rscreen->grbm_counters.array)
+#define BUSY_INDEX(rscreen, field) (&rscreen->mmio_counters.named.field.busy - 
\
+   rscreen->mmio_counters.array)
 
 static unsigned busy_index_from_type(struct r600_common_screen *rscreen,
 unsigned type)
@@ -211,19 +211,19 @@ static unsigned busy_index_from_type(struct 
r600_common_screen *rscreen,
case R600_QUERY_GPU_CB_BUSY:
return BUSY_INDEX(rscreen, cb);
default:
-   unreachable("query type does not correspond to grbm id");
+   unreachable("invalid query type");
}
 }
 
 uint64_t r600_begin_counter(struct r600_common_screen *rscreen, unsigned type)
 {
unsigned busy_index = busy_index_from_type(rscreen, type);
-   return r600_read_grbm_counter(rscreen, busy_index);
+   return r600_read_mmio_counter(rscreen, busy_index);
 }
 
 unsigned r600_end_counter(struct r600_common_screen *rscreen, unsigned type,
  uint64_t begin)
 {
unsigned busy_index = busy_index_from_type(rscreen, type);
-   return r600_end_grbm_counter(rscreen, begin, busy_index);
+   return r600_end_mmio_counter(rscreen, begin, busy_index);
 }
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index afb1385f97..76fbf2af98 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -352,27 +352,28 @@ struct r600_surface {
unsigned

[Mesa-dev] [PATCH 3/3] gallium/radeon: add new HUD queries for monitoring the CP

2017-01-26 Thread Samuel Pitoiset

There are even more counters in the CP_STAT register but I think
these ones are enough for now.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/radeon/r600_gpu_load.c| 34 +++
 src/gallium/drivers/radeon/r600_pipe_common.h |  9 +++
 src/gallium/drivers/radeon/r600_query.c   | 23 +-
 src/gallium/drivers/radeon/r600_query.h   |  7 ++
 4 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
b/src/gallium/drivers/radeon/r600_gpu_load.c
index 5bea6e2643..588442bebd 100644
--- a/src/gallium/drivers/radeon/r600_gpu_load.c
+++ b/src/gallium/drivers/radeon/r600_gpu_load.c
@@ -61,6 +61,15 @@
 #define SRBM_STATUS2   0x0e4c
 #define SDMA_BUSY(x)   (((x) >> 5) & 0x1)
 
+#define CP_STAT 0x8680
+#define PFP_BUSY(x)(((x) >> 15) & 0x1)
+#define MEQ_BUSY(x)(((x) >> 16) & 0x1)
+#define ME_BUSY(x) (((x) >> 17) & 0x1)
+#define SURFACE_SYNC_BUSY(x)   (((x) >> 21) & 0x1)
+#define DMA_BUSY(x)(((x) >> 22) & 0x1)
+#define SCRATCH_RAM_BUSY(x)(((x) >> 24) & 0x1)
+#define CE_BUSY(x) (((x) >> 26) & 0x1)
+
 #define UPDATE_COUNTER(field, mask)\
do {\
if (mask(value))\
@@ -96,6 +105,17 @@ static void r600_update_mmio_counters(struct 
r600_common_screen *rscreen,
rscreen->ws->read_registers(rscreen->ws, SRBM_STATUS2, 1, &value);
 
UPDATE_COUNTER(sdma, SDMA_BUSY);
+
+   /* CP_STAT */
+   rscreen->ws->read_registers(rscreen->ws, CP_STAT, 1, &value);
+
+   UPDATE_COUNTER(pfp, PFP_BUSY);
+   UPDATE_COUNTER(meq, MEQ_BUSY);
+   UPDATE_COUNTER(me, ME_BUSY);
+   UPDATE_COUNTER(surf_sync, SURFACE_SYNC_BUSY);
+   UPDATE_COUNTER(dma, DMA_BUSY);
+   UPDATE_COUNTER(scratch_ram, SCRATCH_RAM_BUSY);
+   UPDATE_COUNTER(ce, CE_BUSY);
 }
 
 #undef UPDATE_COUNTER
@@ -221,6 +241,20 @@ static unsigned busy_index_from_type(struct 
r600_common_screen *rscreen,
return BUSY_INDEX(rscreen, cb);
case R600_QUERY_GPU_SDMA_BUSY:
return BUSY_INDEX(rscreen, sdma);
+   case R600_QUERY_GPU_PFP_BUSY:
+   return BUSY_INDEX(rscreen, pfp);
+   case R600_QUERY_GPU_MEQ_BUSY:
+   return BUSY_INDEX(rscreen, meq);
+   case R600_QUERY_GPU_ME_BUSY:
+   return BUSY_INDEX(rscreen, me);
+   case R600_QUERY_GPU_SURF_SYNC_BUSY:
+   return BUSY_INDEX(rscreen, surf_sync);
+   case R600_QUERY_GPU_DMA_BUSY:
+   return BUSY_INDEX(rscreen, dma);
+   case R600_QUERY_GPU_SCRATCH_RAM_BUSY:
+   return BUSY_INDEX(rscreen, scratch_ram);
+   case R600_QUERY_GPU_CE_BUSY:
+   return BUSY_INDEX(rscreen, ce);
default:
unreachable("invalid query type");
}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 08de238bba..a1576c49a4 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -377,6 +377,15 @@ union r600_mmio_counters {
 
/* SRBM_STATUS2 */
struct r600_mmio_counter sdma;
+
+   /* CP_STAT */
+   struct r600_mmio_counter pfp;
+   struct r600_mmio_counter meq;
+   struct r600_mmio_counter me;
+   struct r600_mmio_counter surf_sync;
+   struct r600_mmio_counter dma;
+   struct r600_mmio_counter scratch_ram;
+   struct r600_mmio_counter ce;
} named;
unsigned array[0];
 };
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index ef73323bae..83c1c60211 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -163,6 +163,13 @@ static bool r600_query_sw_begin(struct r600_common_context 
*rctx,
case R600_QUERY_GPU_CP_BUSY:
case R600_QUERY_GPU_CB_BUSY:
case R600_QUERY_GPU_SDMA_BUSY:
+   case R600_QUERY_GPU_PFP_BUSY:
+   case R600_QUERY_GPU_MEQ_BUSY:
+   case R600_QUERY_GPU_ME_BUSY:
+   case R600_QUERY_GPU_SURF_SYNC_BUSY:
+   case R600_QUERY_GPU_DMA_BUSY:
+   case R600_QUERY_GPU_SCRATCH_RAM_BUSY:
+   case R600_QUERY_GPU_CE_BUSY:
query->begin_result = r600_begin_counter(rctx->screen,
 query->b.type);
break;
@@ -271,6 +278,13 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
case R600_QUERY_GPU_CP_BUSY:
case R600_QUERY_GPU_CB_BUSY:
case R600_QUERY_GPU_SDMA_BUSY:
+   case R600_QUERY_GPU_PFP_BUSY:
+   case R600_QUERY_GPU_MEQ_BUSY:
+   case

[Mesa-dev] [PATCH 2/3] gallium/radeon: add new GPU-sdma-busy HUD query

2017-01-26 Thread Samuel Pitoiset

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/radeon/r600_gpu_load.c| 11 +++
 src/gallium/drivers/radeon/r600_pipe_common.h |  3 +++
 src/gallium/drivers/radeon/r600_query.c   |  5 -
 src/gallium/drivers/radeon/r600_query.h   |  1 +
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_gpu_load.c 
b/src/gallium/drivers/radeon/r600_gpu_load.c
index c84b86d76c..5bea6e2643 100644
--- a/src/gallium/drivers/radeon/r600_gpu_load.c
+++ b/src/gallium/drivers/radeon/r600_gpu_load.c
@@ -58,6 +58,9 @@
 #define CB_BUSY(x) (((x) >> 30) & 0x1)
 #define GUI_ACTIVE(x)  (((x) >> 31) & 0x1)
 
+#define SRBM_STATUS2   0x0e4c
+#define SDMA_BUSY(x)   (((x) >> 5) & 0x1)
+
 #define UPDATE_COUNTER(field, mask)\
do {\
if (mask(value))\
@@ -71,6 +74,7 @@ static void r600_update_mmio_counters(struct 
r600_common_screen *rscreen,
 {
uint32_t value = 0;
 
+   /* GRBM_STATUS */
rscreen->ws->read_registers(rscreen->ws, GRBM_STATUS, 1, &value);
 
UPDATE_COUNTER(ta, TA_BUSY);
@@ -87,6 +91,11 @@ static void r600_update_mmio_counters(struct 
r600_common_screen *rscreen,
UPDATE_COUNTER(cp, CP_BUSY);
UPDATE_COUNTER(cb, CB_BUSY);
UPDATE_COUNTER(gui, GUI_ACTIVE);
+
+   /* SRBM_STATUS */
+   rscreen->ws->read_registers(rscreen->ws, SRBM_STATUS2, 1, &value);
+
+   UPDATE_COUNTER(sdma, SDMA_BUSY);
 }
 
 #undef UPDATE_COUNTER
@@ -210,6 +219,8 @@ static unsigned busy_index_from_type(struct 
r600_common_screen *rscreen,
return BUSY_INDEX(rscreen, cp);
case R600_QUERY_GPU_CB_BUSY:
return BUSY_INDEX(rscreen, cb);
+   case R600_QUERY_GPU_SDMA_BUSY:
+   return BUSY_INDEX(rscreen, sdma);
default:
unreachable("invalid query type");
}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 76fbf2af98..08de238bba 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -374,6 +374,9 @@ union r600_mmio_counters {
struct r600_mmio_counter db;
struct r600_mmio_counter cp;
struct r600_mmio_counter cb;
+
+   /* SRBM_STATUS2 */
+   struct r600_mmio_counter sdma;
} named;
unsigned array[0];
 };
diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index d4e41306a4..ef73323bae 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -162,6 +162,7 @@ static bool r600_query_sw_begin(struct r600_common_context 
*rctx,
case R600_QUERY_GPU_DB_BUSY:
case R600_QUERY_GPU_CP_BUSY:
case R600_QUERY_GPU_CB_BUSY:
+   case R600_QUERY_GPU_SDMA_BUSY:
query->begin_result = r600_begin_counter(rctx->screen,
 query->b.type);
break;
@@ -269,6 +270,7 @@ static bool r600_query_sw_end(struct r600_common_context 
*rctx,
case R600_QUERY_GPU_DB_BUSY:
case R600_QUERY_GPU_CP_BUSY:
case R600_QUERY_GPU_CB_BUSY:
+   case R600_QUERY_GPU_SDMA_BUSY:
query->end_result = r600_end_counter(rctx->screen,
 query->b.type,
 query->begin_result);
@@ -1765,6 +1767,7 @@ static struct pipe_driver_query_info 
r600_driver_query_list[] = {
X("GPU-db-busy",GPU_DB_BUSY,UINT64, 
AVERAGE),
X("GPU-cp-busy",GPU_CP_BUSY,UINT64, 
AVERAGE),
X("GPU-cb-busy",GPU_CB_BUSY,UINT64, 
AVERAGE),
+   X("GPU-sdma-busy",  GPU_SDMA_BUSY,  UINT64, 
AVERAGE),
 
X("temperature",GPU_TEMPERATURE,UINT64, 
AVERAGE),
X("shader-clock",   CURRENT_GPU_SCLK,   HZ, AVERAGE),
@@ -1782,7 +1785,7 @@ static unsigned r600_get_num_queries(struct 
r600_common_screen *rscreen)
else if (rscreen->info.drm_major == 3)
return ARRAY_SIZE(r600_driver_query_list) - 3;
else
-   return ARRAY_SIZE(r600_driver_query_list) - 17;
+   return ARRAY_SIZE(r600_driver_query_list) - 18;
 }
 
 static int r600_get_driver_query_info(struct pipe_screen *screen,
diff --git a/src/gallium/drivers/radeon/r600_query.h 
b/src/gallium/drivers/radeon/r600_query.h
index f2af9240d2..0b32793c65 100644
--- a/src/gallium/drivers/radeon/r600_query.h
+++ b/src/gallium/drivers/radeon/r600_query.h
@@ -85,6 +85,7 @@ enum {
R600_QUERY_GPU_DB_BUSY,

Re: [Mesa-dev] [PATCH 8/8] nouveau: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Ilia Mirkin

Yeah, I'd much prefer this to be

if (nouveau_screen(pscreen)->class_3d < NV50_3D_CLASS)
  llvm = " LLVM";

or something along those lines. [Using dev->chipset will be annoying
since nv50 = 0x50, and nv67 (nv4x) = 0x67.]
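
A self-contained sketch of that suggestion, with assumptions spelled out:
the function below is a hypothetical stand-in for
nouveau_screen_get_name(), the built_with_llvm flag stands in for the
HAVE_LLVM preprocessor check, and the class constants are illustrative
values defined locally rather than taken from the nouveau headers. It
shows why comparing the 3D class orders the generations correctly while
comparing chipset IDs would not (an NV4x part can have chipset 0x67, which
is numerically above 0x50).

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative class values, defined here only for the sketch. */
#define SKETCH_NV40_3D_CLASS 0x4097u
#define SKETCH_NV50_3D_CLASS 0x5097u

/* Format "NVxx", appending " LLVM" only for pre-NV50 3D classes,
 * i.e. the devices assumed here to go through draw. */
static const char *format_screen_name(char *buf, size_t size,
                                      unsigned chipset, unsigned class_3d,
                                      bool built_with_llvm)
{
    const char *llvm = "";

    if (built_with_llvm && class_3d < SKETCH_NV50_3D_CLASS)
        llvm = " LLVM";

    snprintf(buf, size, "NV%02X%s", chipset, llvm);
    return buf;
}
```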

On Thu, Jan 26, 2017 at 1:27 PM, Emil Velikov  wrote:
> From: Emil Velikov 
>
> Analogous to previous two commits. Afaict only nv30 uses draw, so if
> people prefer we can restrict this print only to those devices.
>
> Signed-off-by: Emil Velikov 
> ---
> Unrelated:
> Wasn't there a kernel/libdrm helper which can give us the complete
> device name ?
> ---
>  src/gallium/drivers/nouveau/nouveau_screen.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
> b/src/gallium/drivers/nouveau/nouveau_screen.c
> index f59e101caf..24177bd7da 100644
> --- a/src/gallium/drivers/nouveau/nouveau_screen.c
> +++ b/src/gallium/drivers/nouveau/nouveau_screen.c
> @@ -33,8 +33,13 @@ nouveau_screen_get_name(struct pipe_screen *pscreen)
>  {
> struct nouveau_device *dev = nouveau_screen(pscreen)->device;
> static char buffer[128];
> +   const char *llvm = "";
>
> -   util_snprintf(buffer, sizeof(buffer), "NV%02X", dev->chipset);
> +#ifdef HAVE_LLVM
> +   llvm = " LLVM";
> +#endif
> +
> +   util_snprintf(buffer, sizeof(buffer), "NV%02X%s", dev->chipset, llvm);
> return buffer;
>  }
>
> --
> 2.11.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 97250] Mesa/Clover: openCV library bugs on CL_MEM_USE_HOST_PTR

2017-01-26 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97250

Vedran Miletić  changed:

   What|Removed |Added

 Blocks||99553


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=99553
[Bug 99553] Tracker bug for runnning OpenCL applications on Clover
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH 03/17] gallium/radeon: allow VRAM-only placements again on APUs & recent amdgpu

2017-01-26 Thread Ernst Sjöstrand

Should this code be able to handle drm 4.0.0?
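
To make the concern concrete, here is a small sketch comparing the condition as written in the quoted patch with an ordered (major, minor) comparison. The helper names are hypothetical and this is not a proposed replacement for the patch, only an illustration of how a 4.0 version would be classified.

```c
#include <stdbool.h>

/* The condition as written in the quoted patch: true means the DRM
 * version is treated as too old for VRAM-only placements. */
static bool
patch_treats_as_old(unsigned drm_major, unsigned drm_minor)
{
   return drm_major < 3 || drm_minor < 6;
}

/* Hypothetical alternative: compare (major, minor) as an ordered pair,
 * so that a bump of the major version resets the minor check. */
static bool
drm_older_than(unsigned major, unsigned minor,
               unsigned want_major, unsigned want_minor)
{
   return major < want_major ||
          (major == want_major && minor < want_minor);
}
```

For version 4.0 the first helper returns true (minor 0 is below 6), while the ordered comparison against 3.6 returns false.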

2017-01-26 17:04 GMT+01:00 Marek Olšák :

> From: Marek Olšák 
>
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index c6f4d0d..da6f020 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -163,22 +163,26 @@ void r600_init_resource_fields(struct
> r600_common_screen *rscreen,
> !rtex->surface.is_linear) {
> res->domains = RADEON_DOMAIN_VRAM;
> res->flags &= ~RADEON_FLAG_CPU_ACCESS;
> res->flags |= RADEON_FLAG_NO_CPU_ACCESS |
>  RADEON_FLAG_GTT_WC;
> }
>
> /* If VRAM is just stolen system memory, allow both VRAM and
>  * GTT, whichever has free space. If a buffer is evicted from
>  * VRAM to GTT, it will stay there.
> +*
> +* DRM 3.6.0 has good BO move throttling, so we can allow VRAM-only
> +* placements even with a low amount of stolen VRAM.
>  */
> if (!rscreen->info.has_dedicated_vram &&
> +   (rscreen->info.drm_major < 3 || rscreen->info.drm_minor < 6) &&
> res->domains == RADEON_DOMAIN_VRAM)
> res->domains = RADEON_DOMAIN_VRAM_GTT;
>
> if (rscreen->debug_flags & DBG_NO_WC)
> res->flags &= ~RADEON_FLAG_GTT_WC;
>
> /* Set expected VRAM and GART usage for the buffer. */
> res->vram_usage = 0;
> res->gart_usage = 0;
>
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Re: [Mesa-dev] [PATCH 0/8] LLVM requirement for drivers using draw

2017-01-26 Thread Marek Olšák

On Thu, Jan 26, 2017 at 7:58 PM, Emil Velikov  wrote:
> On 26 January 2017 at 18:54, Marek Olšák  wrote:
>> They still have to ship LLVM to have GCN support. Or do they simply not ship
>> radeonsi?
>>
> The latter - they omit anything that requires LLVM.
> I dare not discuss how good/bad/etc of a choice that is, but some
> people have their reasons.

OK. If distros want to shoot themselves in the foot, I'm OK with that.

>
> -Emil
> P.S. Can you toggle to plain text emails, please ?

I'm always using plain text when I'm not posting from my phone. I
don't think the Android Gmail app can switch to plain text, but I
haven't checked.

Marek

[Mesa-dev] [PATCH] docs/releasing: add a note about the relnotes template

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Signed-off-by: Emil Velikov 
---
I forget this a bit too often :-\
---
 docs/releasing.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/releasing.html b/docs/releasing.html
index 2ed66a13f4..020f3dec70 100644
--- a/docs/releasing.html
+++ b/docs/releasing.html
@@ -161,6 +161,8 @@ To setup the branchpoint:
git checkout master
$EDITOR VERSION # bump the version number
git commit -as
+   cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template
+   git commit -as
git push origin X.Y-branchpoint X.Y
 
 
-- 
2.11.0


Re: [Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Ilia Mirkin

I have no serious preference, but for doubles, we use a shader cap.
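
For readers unfamiliar with the distinction, a rough sketch of the two query styles follows: a screen-wide cap is a single answer per device, while a shader cap is answered per shader stage. All names below are hypothetical mock-ups, not the Gallium interface.

```c
#include <stdbool.h>

enum mock_shader_stage {
   MOCK_SHADER_VERTEX,
   MOCK_SHADER_FRAGMENT,
   MOCK_SHADER_STAGES
};

enum mock_cap { MOCK_CAP_INT64 };
enum mock_shader_cap { MOCK_SHADER_CAP_DOUBLES };

/* Hypothetical screen: one flag for the whole device, one flag per stage. */
struct mock_screen {
   bool int64;
   bool doubles[MOCK_SHADER_STAGES];
};

/* Screen-wide query: a single answer for the device. */
static int
mock_get_param(const struct mock_screen *s, enum mock_cap cap)
{
   switch (cap) {
   case MOCK_CAP_INT64:
      return s->int64;
   }
   return 0;
}

/* Per-stage query: the answer may differ between shader stages. */
static int
mock_get_shader_param(const struct mock_screen *s,
                      enum mock_shader_stage stage,
                      enum mock_shader_cap cap)
{
   switch (cap) {
   case MOCK_SHADER_CAP_DOUBLES:
      return s->doubles[stage];
   }
   return 0;
}
```

The per-stage form only buys something if a driver can support the feature in some stages but not others.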

On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle  wrote:
> From: Dave Airlie 
>
> v1.1: move to using a normal CAP. (Marek)
>
> v2: fill in the cap everywhere
>
> Signed-off-by: Dave Airlie 
> Reviewed-by: Marek Olšák  (v1)
> ---
>  src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c   | 1 +
>  src/gallium/drivers/ilo/ilo_screen.c | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
>  src/gallium/drivers/r300/r300_screen.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c   | 1 +
>  src/gallium/drivers/swr/swr_screen.cpp   | 1 +
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/drivers/virgl/virgl_screen.c | 1 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  17 files changed, 17 insertions(+)
>
> diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
> b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> index 66c530d..c045f7e 100644
> --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> @@ -234,20 +234,21 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
> case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
> case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
> case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
> case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
> case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> case PIPE_CAP_NATIVE_FENCE_FD:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_TGSI_FS_FBFETCH:
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> +   case PIPE_CAP_INT64:
>return 0;
>
> /* Stream output. */
> case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
> case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
> case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
> case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
>return 0;
>
> /* Geometry shader output, unsupported. */
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index a1c026c..abb8787 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -291,20 +291,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
> case PIPE_CAP_TGSI_VOTE:
> case PIPE_CAP_MAX_WINDOW_RECTANGLES:
> case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
> case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
> case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
> case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_TGSI_FS_FBFETCH:
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> +   case PIPE_CAP_INT64:
> return 0;
>
> case PIPE_CAP_MAX_VIEWPORTS:
> return 1;
>
> case PIPE_CAP_SHAREABLE_SHADERS:
> /* manage the variants for these ourself, to avoid breaking 
> precompile: */
> case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
> case PIPE_CAP_VERTEX_COLOR_CLAMPED:
> if (is_ir3(screen))
> diff --git a/src/gallium/drivers/i915/i915_screen.c 
> b/src/gallium/drivers/i915/i915_screen.c
> index 7889873..07d1488 100644
> --- a/src/gallium/drivers/i915/i915_screen.c
> +++ b/src/gallium/drivers/i915/i915_screen.c
> @@ -292,20 +292,21 @@ i915_get_param(struct pipe_screen *screen, enum 
> pipe_cap cap)
> case PIPE_CAP_MULTI_DRAW_INDIRECT:
> case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
> case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
> case PIPE_CAP_SAMPLER_VIEW_TARGET:
> case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
> case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
> case PIPE_CAP_NATIVE_FENCE_FD:
> case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
> case PIPE_CAP_TGSI_FS_FBFETCH:
> case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> +   case PIPE_CAP_INT64:
>return 0;
>
> case PIPE_CAP_MAX_VIEWPORTS:
>return 1;
>
> case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
>return 64;
>
> case PIPE_CAP_GLSL_FEATURE_LEVEL:
>return 120;
> diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
> b/src/gallium/drivers/ilo/ilo_screen.c
> index 6018cd1..960fd71 100644
> --- a/src/gallium/drivers/ilo/ilo_screen.c
> +++ b/src/gallium/drivers/ilo/ilo_screen.c
> @@ -515,20 +515,21 @@

Re: [Mesa-dev] [PATCH 2/4] st/glsl_to_tgsi: add support for 64-bit integers

2017-01-26 Thread Ilia Mirkin

On Thu, Jan 26, 2017 at 2:09 PM, Nicolai Hähnle  wrote:
> +   case ir_unop_i642b:
> +  emit_asm(ir, TGSI_OPCODE_U64SNE, result_dst, op[0], 
> st_src_reg_for_int(0));
> +  break;

Does this work reliably? I would have imagined you'd need a
st_src_reg_for_int64() variant...

[Mesa-dev] [PATCH 1/4] gallium: Add integer 64 capability

2017-01-26 Thread Nicolai Hähnle

From: Dave Airlie 

v1.1: move to using a normal CAP. (Marek)

v2: fill in the cap everywhere

Signed-off-by: Dave Airlie 
Reviewed-by: Marek Olšák  (v1)
---
 src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 +
 src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
 src/gallium/drivers/i915/i915_screen.c   | 1 +
 src/gallium/drivers/ilo/ilo_screen.c | 1 +
 src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
 src/gallium/drivers/r300/r300_screen.c   | 1 +
 src/gallium/drivers/r600/r600_pipe.c | 1 +
 src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
 src/gallium/drivers/softpipe/sp_screen.c | 1 +
 src/gallium/drivers/svga/svga_screen.c   | 1 +
 src/gallium/drivers/swr/swr_screen.cpp   | 1 +
 src/gallium/drivers/vc4/vc4_screen.c | 1 +
 src/gallium/drivers/virgl/virgl_screen.c | 1 +
 src/gallium/include/pipe/p_defines.h | 1 +
 17 files changed, 17 insertions(+)

diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c 
b/src/gallium/drivers/etnaviv/etnaviv_screen.c
index 66c530d..c045f7e 100644
--- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
+++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
@@ -234,20 +234,21 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_MIXED_COLOR_DEPTH_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
   return 0;
 
/* Stream output. */
case PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS:
case PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME:
case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS:
case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS:
   return 0;
 
/* Geometry shader output, unsupported. */
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index a1c026c..abb8787 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -291,20 +291,21 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
return 1;
 
case PIPE_CAP_SHAREABLE_SHADERS:
/* manage the variants for these ourself, to avoid breaking precompile: 
*/
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
if (is_ir3(screen))
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 7889873..07d1488 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -292,20 +292,21 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_MULTI_DRAW_INDIRECT:
case PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS:
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
case PIPE_CAP_SAMPLER_VIEW_TARGET:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY:
case PIPE_CAP_TGSI_FS_FBFETCH:
case PIPE_CAP_TGSI_MUL_ZERO_WINS:
+   case PIPE_CAP_INT64:
   return 0;
 
case PIPE_CAP_MAX_VIEWPORTS:
   return 1;
 
case PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT:
   return 64;
 
case PIPE_CAP_GLSL_FEATURE_LEVEL:
   return 120;
diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
b/src/gallium/drivers/ilo/ilo_screen.c
index 6018cd1..960fd71 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -515,20 +515,21 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
param)
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case

[Mesa-dev] [PATCH 4/4] gallium: enable int64 on radeonsi, llvmpipe, softpipe

2017-01-26 Thread Nicolai Hähnle

From: Nicolai Hähnle 

All of these have had support for the TGSI opcodes since before most of
the glsl compiler work landed.

Also update the docs accordingly, including the missing note about i965.
---
 docs/features.txt|  2 +-
 docs/relnotes/17.1.0.html| 61 
 src/gallium/drivers/llvmpipe/lp_screen.c |  2 +-
 src/gallium/drivers/radeonsi/si_pipe.c   |  4 +--
 src/gallium/drivers/softpipe/sp_screen.c |  2 +-
 5 files changed, 66 insertions(+), 5 deletions(-)
 create mode 100644 docs/relnotes/17.1.0.html

diff --git a/docs/features.txt b/docs/features.txt
index aff0016..55b1fbb 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -276,21 +276,21 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
   GL_OES_texture_storage_multisample_2d_array   DONE (all drivers that 
support GL_ARB_texture_multisample)
 
 Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES 
version:
 
   GL_ARB_bindless_texture   started (airlied)
   GL_ARB_cl_event   not started
   GL_ARB_compute_variable_group_sizeDONE (nvc0, radeonsi)
   GL_ARB_ES3_2_compatibilityDONE (i965/gen8+)
   GL_ARB_fragment_shader_interlock  not started
   GL_ARB_gl_spirv   not started
-  GL_ARB_gpu_shader_int64   started (airlied for 
core and Gallium, idr for i965)
+  GL_ARB_gpu_shader_int64   DONE (i965/gen8+, 
radeonsi, softpipe, llvmpipe)
   GL_ARB_indirect_parametersDONE (nvc0, radeonsi)
   GL_ARB_parallel_shader_compilenot started, but 
Chia-I Wu did some related work in 2014
   GL_ARB_pipeline_statistics_query  DONE (i965, nvc0, 
radeonsi, softpipe, swr)
   GL_ARB_post_depth_coverageDONE (i965)
   GL_ARB_robustness_isolation   not started
   GL_ARB_sample_locations   not started
   GL_ARB_seamless_cubemap_per_texture   DONE (i965, nvc0, 
radeonsi, r600, softpipe, swr)
   GL_ARB_shader_atomic_counter_ops  DONE (nvc0, radeonsi, 
softpipe)
   GL_ARB_shader_ballot  not started
   GL_ARB_shader_clock   DONE (i965/gen7+)
diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
new file mode 100644
index 000..1b5535b
--- /dev/null
+++ b/docs/relnotes/17.1.0.html
@@ -0,0 +1,61 @@
+http://www.w3.org/TR/html4/loose.dtd;>
+
+
+  
+  Mesa Release Notes
+  
+
+
+
+
+  The Mesa 3D Graphics Library
+
+
+
+
+
+Mesa 17.1.0 Release Notes / TBD
+
+
+Mesa 17.1.0 is a new development release.
+People who are concerned with stability and reliability should stick
+with a previous release or wait for Mesa 17.1.1.
+
+
+Mesa 17.1.0 implements the OpenGL 4.5 API, but the version reported by
+glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
+glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
+Some drivers don't support all the features required in OpenGL 4.5.  OpenGL
+4.5 is only available if requested at context creation
+because compatibility contexts are not supported.
+
+
+
+SHA256 checksums
+
+TBD.
+
+
+
+New features
+
+
+Note: some of the new features are only available with certain drivers.
+
+
+
+GL_ARB_gpu_shader_int64 on i965/gen8+, radeonsi, softpipe, llvmpipe
+
+
+Bug fixes
+
+
+
+
+Changes
+
+TBD.
+
+
+
+
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 6ef22b8..0982c35 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -260,20 +260,21 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION:
   return 1;
case PIPE_CAP_TGSI_FS_FINE_DERIVATIVE:
   return 0;
case PIPE_CAP_SAMPLER_VIEW_TARGET:
   return 1;
case PIPE_CAP_FAKE_SW_MSAA:
   return 1;
case PIPE_CAP_CONDITIONAL_RENDER_INVERTED:
case PIPE_CAP_TGSI_ARRAY_COMPONENTS:
+   case PIPE_CAP_INT64:
   return 1;
 
case PIPE_CAP_VENDOR_ID:
   return 0x;
case PIPE_CAP_DEVICE_ID:
   return 0x;
case PIPE_CAP_ACCELERATED:
   return 0;
case PIPE_CAP_VIDEO_MEMORY: {
   /* XXX: Do we want to return the full amount fo system memory ? */
@@ -336,21 +337,20 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
case PIPE_CAP_TGSI_VOTE:
case PIPE_CAP_MAX_WINDOW_RECTANGLES:
case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
case PIPE_CAP_VIEWPORT_SUBPIXEL_BITS:
case PIPE_CAP_TGSI_CAN_READ_OUTPUTS:
case PIPE_CAP_NATIVE_FENCE_FD:
case

[Mesa-dev] [PATCH 0/4] ARB_gpu_shader_int64 for radeonsi and soft/llvmpipe

2017-01-26 Thread Nicolai Hähnle

Hi all,

now that the GLSL compiler work has landed recently, and TGSI has actually
had the opcodes for several months, it's time to finally flip the switch!

The bulk of the work here is actually due to Dave, with some cleanups from
Ian, so kudos to them. I just filled in some minor bits and checked that
things actually work on radeonsi.

Cheers,
Nicolai
---
 docs/features.txt|   2 +-
 docs/relnotes/17.1.0.html|  61 +
 src/gallium/drivers/etnaviv/etnaviv_screen.c |   1 +
 .../drivers/freedreno/freedreno_screen.c |   1 +
 src/gallium/drivers/i915/i915_screen.c   |   1 +
 src/gallium/drivers/ilo/ilo_screen.c |   1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |   1 +
 .../drivers/nouveau/nv30/nv30_screen.c   |   1 +
 .../drivers/nouveau/nv50/nv50_screen.c   |   1 +
 .../drivers/nouveau/nvc0/nvc0_screen.c   |   1 +
 src/gallium/drivers/r300/r300_screen.c   |   1 +
 src/gallium/drivers/r600/r600_pipe.c |   1 +
 src/gallium/drivers/radeonsi/si_pipe.c   |   3 +-
 src/gallium/drivers/softpipe/sp_screen.c |   1 +
 src/gallium/drivers/svga/svga_screen.c   |   1 +
 src/gallium/drivers/swr/swr_screen.cpp   |   1 +
 src/gallium/drivers/vc4/vc4_screen.c |   1 +
 src/gallium/drivers/virgl/virgl_screen.c |   1 +
 src/gallium/include/pipe/p_defines.h |   1 +
 src/mesa/state_tracker/st_extensions.c   |   1 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   | 220 +++--
 21 files changed, 283 insertions(+), 20 deletions(-)


[Mesa-dev] [PATCH 2/4] st/glsl_to_tgsi: add support for 64-bit integers

2017-01-26 Thread Nicolai Hähnle

From: Dave Airlie 

v2: add conversion opcodes.

v3 (idr): Rebase on replacement of TGSI_OPCODE_I2U64 with
TGSI_OPCODE_I2I64.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v5 (nha): add clarifying comment about a subtle assumption

Signed-off-by: Dave Airlie 
Reviewed-by: Nicolai Hähnle 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 220 ++---
 1 file changed, 202 insertions(+), 18 deletions(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index a437645..224789e 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -894,27 +894,46 @@ glsl_to_tgsi_visitor::get_opcode(unsigned op,
if (op == TGSI_OPCODE_MOV)
return op;
 
assert(src0.type != GLSL_TYPE_ARRAY);
assert(src0.type != GLSL_TYPE_STRUCT);
assert(src1.type != GLSL_TYPE_ARRAY);
assert(src1.type != GLSL_TYPE_STRUCT);
 
if (is_resource_instruction(op))
   type = src1.type;
+   else if (src0.type == GLSL_TYPE_INT64 || src1.type == GLSL_TYPE_INT64)
+  type = GLSL_TYPE_INT64;
+   else if (src0.type == GLSL_TYPE_UINT64 || src1.type == GLSL_TYPE_UINT64)
+  type = GLSL_TYPE_UINT64;
else if (src0.type == GLSL_TYPE_DOUBLE || src1.type == GLSL_TYPE_DOUBLE)
   type = GLSL_TYPE_DOUBLE;
else if (src0.type == GLSL_TYPE_FLOAT || src1.type == GLSL_TYPE_FLOAT)
   type = GLSL_TYPE_FLOAT;
else if (native_integers)
   type = src0.type == GLSL_TYPE_BOOL ? GLSL_TYPE_INT : src0.type;
 
+#define case7(c, f, i, u, d, i64, ui64) \
+   case TGSI_OPCODE_##c: \
+  if (type == GLSL_TYPE_UINT64)   \
+ op = TGSI_OPCODE_##ui64; \
+  else if (type == GLSL_TYPE_INT64)   \
+ op = TGSI_OPCODE_##i64; \
+  else if (type == GLSL_TYPE_DOUBLE)   \
+ op = TGSI_OPCODE_##d; \
+  else if (type == GLSL_TYPE_INT)   \
+ op = TGSI_OPCODE_##i; \
+  else if (type == GLSL_TYPE_UINT) \
+ op = TGSI_OPCODE_##u; \
+  else \
+ op = TGSI_OPCODE_##f; \
+  break;
 #define case5(c, f, i, u, d)\
case TGSI_OPCODE_##c: \
   if (type == GLSL_TYPE_DOUBLE)   \
  op = TGSI_OPCODE_##d; \
   else if (type == GLSL_TYPE_INT)   \
  op = TGSI_OPCODE_##i; \
   else if (type == GLSL_TYPE_UINT) \
  op = TGSI_OPCODE_##u; \
   else \
  op = TGSI_OPCODE_##f; \
@@ -924,57 +943,66 @@ glsl_to_tgsi_visitor::get_opcode(unsigned op,
case TGSI_OPCODE_##c: \
   if (type == GLSL_TYPE_INT) \
  op = TGSI_OPCODE_##i; \
   else if (type == GLSL_TYPE_UINT) \
  op = TGSI_OPCODE_##u; \
   else \
  op = TGSI_OPCODE_##f; \
   break;
 
 #define case3(f, i, u)  case4(f, f, i, u)
-#define case4d(f, i, u, d)  case5(f, f, i, u, d)
+#define case6d(f, i, u, d, i64, u64)  case7(f, f, i, u, d, i64, u64)
 #define case3fid(f, i, d) case5(f, f, i, i, d)
+#define case3fid64(f, i, d, i64) case7(f, f, i, i, d, i64, i64)
 #define case2fi(f, i)   case4(f, f, i, i)
 #define case2iu(i, u)   case4(i, LAST, i, u)
 
-#define casecomp(c, f, i, u, d)   \
+#define case2iu64(i, i64)   case7(i, LAST, i, i, LAST, i64, i64)
+#define case4iu64(i, u, i64, u64)   case7(i, LAST, i, u, LAST, i64, u64)
+
+#define casecomp(c, f, i, u, d, i64, ui64)   \
case TGSI_OPCODE_##c: \
-  if (type == GLSL_TYPE_DOUBLE) \
+  if (type == GLSL_TYPE_INT64) \
+ op = TGSI_OPCODE_##i64; \
+  else if (type == GLSL_TYPE_UINT64)\
+ op = TGSI_OPCODE_##ui64; \
+  else if (type == GLSL_TYPE_DOUBLE)   \
  op = TGSI_OPCODE_##d; \
   else if (type == GLSL_TYPE_INT || type == GLSL_TYPE_SUBROUTINE)   \
  op = TGSI_OPCODE_##i; \
   else if (type == GLSL_TYPE_UINT) \
  op = TGSI_OPCODE_##u; \
   else if (native_integers) \
  op = TGSI_OPCODE_##f; \
   else \
  op = TGSI_OPCODE_##c; \
   break;
 
switch(op) {
-  case3fid(ADD, UADD, DADD);
-  case3fid(MUL, UMUL, DMUL);
+  case3fid64(ADD, UADD, DADD, U64ADD);
+  case3fid64(MUL, UMUL, DMUL, U64MUL);
   case3fid(MAD, UMAD, DMAD);
   case3fid(FMA, UMAD, DFMA);
-  case4d(DIV, IDIV, UDIV, DDIV);
-  case4d(MAX, IMAX, UMAX, DMAX);
-  case4d(MIN, IMIN, UMIN, DMIN);
-  case2iu(MOD, UMOD);
+  case6d(DIV, IDIV, UDIV, DDIV, I64DIV, U64DIV);
+  case6d(MAX, IMAX, UMAX, DMAX, I64MAX, U64MAX);
+  case6d(MIN, IMIN, UMIN, DMIN, I64MIN, U64MIN);
+  case4iu64(MOD, UMOD, I64MOD, U64MOD);
 
-  casecomp(SEQ, FSEQ, USEQ, USEQ, DSEQ);
-  casecomp(SNE, FSNE, USNE, USNE, DSNE);
-  casecomp(SGE, FSGE, ISGE, USGE, DSGE);
-  casecomp(SLT, FSLT, ISLT, USLT,
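
The macro scheme in the patch above can be hard to follow in diff form. Below is a reduced, self-contained sketch of the same idea: a generic opcode is rewritten to a type-specific one, with the 64-bit types checked first. The enum names are simplified stand-ins and not the real TGSI or GLSL identifiers; only the shape of case7 and the argument order are taken from the patch.

```c
/* Simplified stand-ins for the GLSL base types. */
enum mock_type { T_FLOAT, T_INT, T_UINT, T_DOUBLE, T_INT64, T_UINT64 };

/* Simplified stand-ins for a few opcodes. */
enum mock_op {
   OP_ADD, OP_UADD, OP_DADD, OP_U64ADD,
   OP_DIV, OP_IDIV, OP_UDIV, OP_DDIV, OP_I64DIV, OP_U64DIV
};

/* Same argument order as case7 in the patch:
 * generic opcode, then float, int, uint, double, int64, uint64 variants. */
#define case7(c, f, i, u, d, i64, ui64) \
   case OP_##c: \
      if (type == T_UINT64) \
         op = OP_##ui64; \
      else if (type == T_INT64) \
         op = OP_##i64; \
      else if (type == T_DOUBLE) \
         op = OP_##d; \
      else if (type == T_INT) \
         op = OP_##i; \
      else if (type == T_UINT) \
         op = OP_##u; \
      else \
         op = OP_##f; \
      break;

static enum mock_op
select_op(enum mock_op op, enum mock_type type)
{
   switch (op) {
      /* Addition is sign-agnostic, so int and uint share one opcode,
       * as do int64 and uint64. */
      case7(ADD, ADD, UADD, UADD, DADD, U64ADD, U64ADD);
      /* Division needs distinct signed and unsigned variants. */
      case7(DIV, DIV, IDIV, UDIV, DDIV, I64DIV, U64DIV);
   default:
      break;
   }
   return op;
}
```

In this sketch select_op(OP_ADD, T_INT64) yields OP_U64ADD, mirroring how the patch maps both 64-bit integer types to U64ADD, while select_op(OP_DIV, T_INT64) yields OP_I64DIV.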

[Mesa-dev] [PATCH 3/4] st/mesa: add support for enabling ARB_gpu_shader_int64.

2017-01-26 Thread Nicolai Hähnle

From: Dave Airlie 

Signed-off-by: Dave Airlie 
Reviewed-by: Nicolai Hähnle 
---
 src/mesa/state_tracker/st_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 5ccc5d9..4600b88 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -581,20 +581,21 @@ void st_init_extensions(struct pipe_screen *screen,
   { o(OES_copy_image),   
PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS },
   { o(ARB_cull_distance),PIPE_CAP_CULL_DISTANCE
},
   { o(ARB_depth_clamp),  PIPE_CAP_DEPTH_CLIP_DISABLE   
},
   { o(ARB_depth_texture),PIPE_CAP_TEXTURE_SHADOW_MAP   
},
   { o(ARB_derivative_control),   PIPE_CAP_TGSI_FS_FINE_DERIVATIVE  
},
   { o(ARB_draw_buffers_blend),   PIPE_CAP_INDEP_BLEND_FUNC 
},
   { o(ARB_draw_indirect),PIPE_CAP_DRAW_INDIRECT
},
   { o(ARB_draw_instanced),   PIPE_CAP_TGSI_INSTANCEID  
},
   { o(ARB_fragment_program_shadow),  PIPE_CAP_TEXTURE_SHADOW_MAP   
},
   { o(ARB_framebuffer_object),   PIPE_CAP_MIXED_FRAMEBUFFER_SIZES  
},
+  { o(ARB_gpu_shader_int64), PIPE_CAP_INT64
},
   { o(ARB_indirect_parameters),  
PIPE_CAP_MULTI_DRAW_INDIRECT_PARAMS   },
   { o(ARB_instanced_arrays), 
PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR  },
   { o(ARB_occlusion_query),  PIPE_CAP_OCCLUSION_QUERY  
},
   { o(ARB_occlusion_query2), PIPE_CAP_OCCLUSION_QUERY  
},
   { o(ARB_pipeline_statistics_query),
PIPE_CAP_QUERY_PIPELINE_STATISTICS},
   { o(ARB_point_sprite), PIPE_CAP_POINT_SPRITE 
},
   { o(ARB_query_buffer_object),  PIPE_CAP_QUERY_BUFFER_OBJECT  
},
   { o(ARB_robust_buffer_access_behavior), 
PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR   },
   { o(ARB_sample_shading),   PIPE_CAP_SAMPLE_SHADING   
},
   { o(ARB_seamless_cube_map),PIPE_CAP_SEAMLESS_CUBE_MAP
},
-- 
2.7.4
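
As background for the one-line change above, here is a sketch of how a table like this can drive extension enabling: each row pairs the offset of a boolean in an extensions struct with a capability, and a loop flips the boolean when the driver reports the capability. The types and names are hypothetical mock-ups; the offsetof-based o() macro mirrors the style of the table in the patch but this is not the state tracker's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

enum mock_cap { MOCK_CAP_INT64, MOCK_CAP_DRAW_INDIRECT, MOCK_CAP_COUNT };

/* Hypothetical extensions struct: one flag per extension. */
struct mock_extensions {
   bool ARB_draw_indirect;
   bool ARB_gpu_shader_int64;
};

struct cap_mapping {
   size_t extension_offset;  /* where the flag lives in mock_extensions */
   enum mock_cap cap;        /* capability that enables it */
};

#define o(x) offsetof(struct mock_extensions, x)

static const struct cap_mapping cap_table[] = {
   { o(ARB_draw_indirect),    MOCK_CAP_DRAW_INDIRECT },
   { o(ARB_gpu_shader_int64), MOCK_CAP_INT64         },
};

/* caps[] holds the driver's answer for each capability (non-zero = yes). */
static void
mock_init_extensions(const int caps[MOCK_CAP_COUNT],
                     struct mock_extensions *ext)
{
   for (size_t i = 0; i < sizeof(cap_table) / sizeof(cap_table[0]); i++) {
      if (caps[cap_table[i].cap])
         *(bool *)((char *)ext + cap_table[i].extension_offset) = true;
   }
}
```

Under this scheme, adding a new simple extension only needs one new table row, which is what the patch does for ARB_gpu_shader_int64.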


Re: [Mesa-dev] [PATCH 2/8] glsl: Fix constant evaluation of the rcp op.

2017-01-26 Thread Ian Romanick

On 01/25/2017 10:53 AM, Francisco Jerez wrote:
> Hi Ian, and thank you for your comments,
> 
> Ian Romanick  writes:
> 
>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>> Will avoid a regression in a future commit that introduces some
>>> additional rcp operations.
>>
>> When I converted GLSL IR to ir_expression_operation.py, I was careful to
>> keep all the expressions the same.  rcp and div had these weird guards.
>> GLSL doesn't require that NaN be generated, and quite a few old GPUs
>> don't.  If the atan2 implementation depends on NaN being generated by
>> rcp, it may have problems on i915, r300, and similar GPUs.  I don't know
>> what they generate, but it's not NaN and it's probably not 0.0.
> 
> The atan2 implementation from patch 5 doesn't rely on NaNs being
> generated, but it does rely on the reciprocal operation handling zero
> and infinity correctly as specified by GLSL for the division operation.

Okay.  That is the problem on older GPUs that I was referring to.
Specifically, all versions of the GLSL spec prior to 4.40 (!) say:

Similarly, treatment of conditions such as divide by 0 may lead to
an unspecified result, but in no case should such a condition lead
to the interruption or termination of processing.

I believe that DX11 requires the GLSL 4.40+ behavior, so all even
somewhat modern devices should just work.  It's all the pre-DX11
hardware that might be a problem.

Now... talking to Jason just now, he reminded me that the spec also says
the following about built-in functions:

Function parameters specified as angle are assumed to be in units
of radians. In no case will any of these functions result in a
divide by zero error. If the divisor of a ratio is 0, then results
will be undefined.

We may be fine even on old, clunky hardware.  Looking at the code in
patch 5, atan(0, -abs(x)) would still be a problem if rcp(0) produces
undefined results.  It looks like
tests/shaders/glsl-fs-atan-2.shader_test should hit that case.  Anyone
have r300 or r400 hardware to test that?

This patch doesn't affect that, and, even with the "unspecified result"
rule, it's clearly correct.  This patch is

Reviewed-by: Ian Romanick 
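
For reference, the two constant-evaluation expressions from the quoted diff can be compared directly in C. This is a sketch of the float variants only, assuming IEEE 754 arithmetic on the machine doing the constant folding; the function names are made up for illustration.

```c
#include <math.h>

/* Old expression: guards against a zero divisor and folds rcp(0) to 0. */
static float
rcp_guarded(float x)
{
   return x != 0.0f ? 1.0f / x : 0.0f;
}

/* New expression: plain division, so zero and infinity follow the
 * usual IEEE rules (1/0 = +inf, 1/-0 = -inf, 1/inf = 0). */
static float
rcp_plain(float x)
{
   return 1.0f / x;
}
```

The two only differ for a zero input: the guarded form returns 0, the plain form returns a signed infinity.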

>> That said, this matches NIR, and it's probably fine.
>>
>>> ---
>>>  src/compiler/glsl/ir_expression_operation.py | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/compiler/glsl/ir_expression_operation.py 
>>> b/src/compiler/glsl/ir_expression_operation.py
>>> index f91ac9b..4ac1ffb 100644
>>> --- a/src/compiler/glsl/ir_expression_operation.py
>>> +++ b/src/compiler/glsl/ir_expression_operation.py
>>> @@ -422,7 +422,7 @@ ir_expression_operation = [
>>> operation("neg", 1, source_types=numeric_types, c_expression={'u': 
>>> "-((int) {src0})", 'default': "-{src0}"}),
>>> operation("abs", 1, source_types=signed_numeric_types, 
>>> c_expression={'i': "{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 
>>> 'd': "fabs({src0})", 'i64': "{src0} < 0 ? -{src0} : {src0}"}),
>>> operation("sign", 1, source_types=signed_numeric_types, 
>>> c_expression={'i': "({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 
>>> 0.0F) - ({src0} < 0.0F))", 'd': "double(({src0} > 0.0) - ({src0} < 0.0))", 
>>> 'i64': "({src0} > 0) - ({src0} < 0)"}),
>>> -   operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} 
>>> != 0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : 
>>> 0.0"}),
>>> +   operation("rcp", 1, source_types=real_types, c_expression={'f': "1.0F / 
>>> {src0}", 'd': "1.0 / {src0}"}),
>>> operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / 
>>> sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}),
>>> operation("sqrt", 1, source_types=real_types, c_expression={'f': 
>>> "sqrtf({src0})", 'd': "sqrt({src0})"}),
>>> operation("exp", 1, source_types=(float_type,), 
>>> c_expression="expf({src0})"), # Log base e on gentype
>>>





Re: [Mesa-dev] [PATCH] configure.ac: move require_dri_shared_libs_and_glapi() before its users

2017-01-26 Thread Emil Velikov

On 26 January 2017 at 18:48, Matt Turner  wrote:
> I ran into this with mesa-17.0.0_rc2. This patch should go into the 17.0 
> branch.
Seems like I forgot to run ./bin/get-extra-pick-list.sh which flags it up.

Thanks Matt, will ensure it's in for rc3.
Emil

Re: [Mesa-dev] [PATCH 0/8] LLVM requirement for drivers using draw

2017-01-26 Thread Emil Velikov

On 26 January 2017 at 18:54, Marek Olšák  wrote:
> They still have to ship LLVM to have GCN support. Or do they simply not ship
> radeonsi?
>
The latter - they omit anything that requires LLVM.
I dare not discuss how good/bad/etc of a choice that is, but some
people have their reasons.

-Emil
P.S. Can you toggle to plain text emails, please ?

[Mesa-dev] Directly using upstream headers (Was Re: [PATCH] vulkan: Don't install vk_platform.h or vulkan.h.)

2017-01-26 Thread Emil Velikov

On 25 January 2017 at 22:10, Chad Versace  wrote:
> On Tue 24 Jan 2017, Jason Ekstrand wrote:
>> On Tue, Jan 24, 2017 at 11:25 AM, Emil Velikov 

>> > I'd rather not.  That would make sense if we all lived in the 
>> open-source
>> > world where everything is upstream all the time.  Unfortunately, not 
>> all
>> of
>> > us have that luxury and we need to be able to work on experimental
>> branches
>> > of the spec that may have more extensions than are provided by any 
>> loader
>> > version we can install.  I'd be ok with a check for a particular loader
>> > version just to force distros to update their loader but I would like 
>> to
>> be
>> > able to build with arbitrary XML branches without having to install a
>> branch
>> > of the loader.
>> What if I tell you that you wouldn't need to install the loader ;-)
>> More as we get a .pc patches in.
>>
>>
>> A lot of extensions don't require explicit loader support.  I don't want to
>> have to update my loader (or put it in some folder and point pkg-config at 
>> it)
>> just to hack on them.
>
> And, Mesa should build Vulkan against its own imported headers.
> Otherwise, the Mesa build would effectively require version lock between
> it and the installed loader headers; this version lock is made more
> difficult because the loader headers aren't really versioned.  Upstream
> unintentionally breaks things. When I bisect Mesa with a git-bisect
> script, I do not also want to hack the script to checkout and re-install
> the loader during the bisect.
>
> Evidence: 
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=c085bfcec9915879e97a33c5235cf21607c72318
>
> A Bigger Problem: You cannot force the distro to upgrade its loader
> headers. If the loader-provided headers on Android N differ from those
> on Android O and Android P due to stupid upstream API breakage, and
> those once again differ from those in Fedora 25 and Fedora 29... Despite
> that, Mesa must continue to build on all supported platforms. Satisfying
> that may be hellish if Mesa builds against the system's Vulkan headers
> instead of its own.

In general it's a matter of ensuring people do the better/more
robust (?) thing.
Sadly that means a bit of (albeit somewhat trivial) work on each side.

From the Vulkan (Mesa in general) developers' POV:
 - to point to the specific files

Distributions:
 - update regularly

Khronos:
 - ensure that both headers and XML files are updated for development
branches...
Having the script which generates vulkan.h/other publicly accessible
would be also nice ;-)
 - write tests and wire those to make check (or equivalent)

In practice neither one is easy due to the number of people it needs
to be coordinated with. And considering that people are always busy
with more important things... I don't see it happening soon :-(

Either way, I hope we don't get to a situation similar to the *GL* headers.
Emil

Re: [Mesa-dev] [PATCH 0/8] LLVM requirement for drivers using draw

2017-01-26 Thread Marek Olšák

They still have to ship LLVM to have GCN support. Or do they simply not
ship radeonsi?

Marek

On Jan 26, 2017 7:30 PM, "Emil Velikov"  wrote:

> Hi all,
>
> Here's a few small fixes/functionality improvements when dealing with
> LLVM.
>
> Most notably the series adds "LLVM" string [when applicable] to the
> .get_name()
> callback for drivers that use draw.
>
> Thus developer can respond accordingly - be that "rebuild with LLVM or
> enjoy the bad performance" or otherwise to reports.
>
> We can go a step further and make both(?) configure and the callback
> produce more nagging message along the lines of "built w/o LLVM expect bad
> performance", if people prefer.
>
> With this in mind we can drop the LLVM requirement, which some
> builders/distros explicitly patch out.
>
> Let's be nice to each other and not force it onto them.
>
> What do you guys think ?
> Emil
>
> Emil Velikov (8):
>   virgl: remove unused draw_context.h include
>   llvmpipe: use draw_get_option_use_llvm() instead of open coding it
>   softpipe: set softpipe_screen::use_llvm when draw is build with LLVM
>   softpipe: let .get_name() append LLVM if built with LLVM
>   i915g: use draw_get_option_use_llvm() instead of open coding it
>   i915g: let .get_name() append LLVM if built with LLVM
>   r300: let .get_name() append LLVM if built with LLVM
>   nouveau: let .get_name() append LLVM if built with LLVM
>
>  src/gallium/drivers/i915/i915_screen.c   |  9 +++--
>  src/gallium/drivers/llvmpipe/lp_screen.c |  4 ++--
>  src/gallium/drivers/nouveau/nouveau_screen.c |  7 ++-
>  src/gallium/drivers/r300/r300_screen.c   | 10 +-
>  src/gallium/drivers/softpipe/sp_screen.c |  6 ++
>  src/gallium/drivers/virgl/virgl_screen.c |  1 -
>  6 files changed, 30 insertions(+), 7 deletions(-)
>
> --
> 2.11.0
>
>

Re: [Mesa-dev] [PATCH] configure.ac: move require_dri_shared_libs_and_glapi() before its users

2017-01-26 Thread Matt Turner

I ran into this with mesa-17.0.0_rc2. This patch should go into the 17.0 branch.

Re: [Mesa-dev] [PATCH] i965: Share the workaround bo between all contexts

2017-01-26 Thread Chris Wilson

On Thu, Jan 26, 2017 at 09:39:51AM -0800, Chad Versace wrote:
> On Thu 26 Jan 2017, Chris Wilson wrote:
> > Since the workaround bo is used strictly as a write-only buffer, we need
> > only allocate one per screen and use the same one from all contexts.
> > 
> > (The caveat here is during extension initialisation, where we write into
> > and read back register values from the buffer, but that is performed only
> > once for the first context - and barring synchronisation issues should not
> > be a problem. Safer would be to move that also to the screen.)
> > 
> > v2: Give the workaround bo its own init function and don't piggy back
> > intel_bufmgr_init() since it is not that related.
> > 
> > v3: Drop the reference count of the workaround bo for the context since
> > the context itself is owned by the screen (and so we can rely on the bo
> > existing for the lifetime of the context).
> 
> I like this idea, but I have questions and comments about the details.
> More questions than comments, really.
> 
> Today, with only Mesa changes, could we effectively do the same as
>   drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo);
> by hacking Mesa to set no read/write domain when emitting relocs for the
> workaround_bo? (I admit I don't fully understand the kernel's domain
> tracking). If that does work, then it just would require a small hack to
> brw_emit_pipe_control_write().

However... There is a hack that requires the write hazard for gen6
pipecontrols unless you use the noreloc patches (hw limitation causing
pipecontrols to always use ggtt offsets not the ppgtt you have normally).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Mesa-dev] [PATCH 1/8] nir: silence implicit conversion to 64bit

2017-01-26 Thread Jason Ekstrand

On Thu, Jan 26, 2017 at 10:23 AM, Ian Romanick  wrote:

> I keep seeing patches like this... is it time to move BITFIELD64_* from
> mtypes.h to somewhere in util for more general use?
>

It may be time


> On 01/26/2017 05:18 AM, Emil Velikov wrote:
> > From: Emil Velikov 
> >
> > MSVC warns about implicit conversion as below. Annotate the literal
> > appropriately to silence the warning.
> >
> > nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
> > implicitly converted to 64 bits (was 64-bit shift intended?)
> >
> > Signed-off-by: Emil Velikov 
> > ---
> >  src/compiler/nir/nir_gather_info.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/nir/nir_gather_info.c
> b/src/compiler/nir/nir_gather_info.c
> > index 35a1ce4dec..0c70787252 100644
> > --- a/src/compiler/nir/nir_gather_info.c
> > +++ b/src/compiler/nir/nir_gather_info.c
> > @@ -246,7 +246,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr,
> nir_shader *shader)
> > case nir_intrinsic_load_tess_level_outer:
> > case nir_intrinsic_load_tess_level_inner:
> >shader->info->system_values_read |=
> > - (1 << nir_system_value_from_intrinsic(instr->intrinsic));
> > + (1ull << nir_system_value_from_intrinsic(instr->intrinsic));
> >break;
> >
> > case nir_intrinsic_end_primitive:
> >
>
>
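
To illustrate the C4334 warning quoted above, here is a minimal sketch in C.
The macro BIT64 and both function names are made up for this example (the
macro Mesa already has is BITFIELD64_BIT in mtypes.h, as mentioned in the
thread). The point is only that the type of the literal decides the width of
the shift: a plain 1 is an int, so the shift happens in 32 bits before the
result is widened.

```c
#include <stdint.h>

/* Hypothetical stand-in for a BITFIELD64_BIT-style helper: the literal is
 * 64 bits wide, so the shift is done in 64 bits on every platform. */
#define BIT64(b) (UINT64_C(1) << (b))

/* Returns `mask` with bit `index` (0..63) set. */
static uint64_t
set_bit64(uint64_t mask, unsigned index)
{
   return mask | BIT64(index);
}

/* Returns nonzero if bit `index` (0..63) is set in `mask`. */
static int
test_bit64(uint64_t mask, unsigned index)
{
   return (mask & BIT64(index)) != 0;
}
```

With a plain (1 << index) the same helpers would be undefined for index >= 31
on platforms where int is 32 bits, which is the situation the warning points at.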

Re: [Mesa-dev] [PATCH 15/17] gallium/radeon: remove r600_common_context::max_db

2017-01-26 Thread Marek Olšák

It should be OK.

Marek


On Jan 26, 2017 7:17 PM, "Alex Deucher"  wrote:

> On Thu, Jan 26, 2017 at 12:39 PM, Gustaw Smolarczyk
>  wrote:
> > 2017-01-26 17:04 GMT+01:00 Marek Olšák :
> >> From: Marek Olšák 
> >>
> >> this cleanup is based on the vulkan driver, which seems to do the same
> thing
> >
> > Is this also ok for r600g? If I'm right, the amdgpu-pro Vulkan driver
> > doesn't have any support for pre-GCN hardware.
>
> On r6xx and newer, the CB:DB ratio should always be 1:1.  I think the
> only time it was different was some r3xx-r5xx chips that had 2:1 CB:DB
> ratios.
>
> Alex
>

Re: [Mesa-dev] 10bit HEVC decoding for RadeonSI

2017-01-26 Thread Christian König


Am 26.01.2017 um 16:59 schrieb Peter Frühberger:



2017-01-26 16:36 GMT+01:00 Christian König:


Am 26.01.2017 um 12:16 schrieb Peter Frühberger:

Hi Christian,

2017-01-26 12:00 GMT+01:00 Christian König:

Hi Peter,

Am 25.01.2017 um 19:45 schrieb Peter Frühberger:



Peter, Rainer any idea what I'm missing here? Do you
guys use some
modified ffmpeg for Kodi or how does that work for you?


do you set the format correctly, e.g.:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L2697
to create the surfaces?


Well the problem here is that the VA-API interface is not
consistent and I'm not sure how to implement it correctly.

See your code for example:

VASurfaceAttrib attribs[1], *attrib;

attrib = attribs;

attrib->flags = VA_SURFACE_ATTRIB_SETTABLE;

attrib->type = VASurfaceAttribPixelFormat;

attrib->value.type = VAGenericValueTypeInteger;

attrib->value.value.i = VA_FOURCC_NV12;



First Kodi specifies that NV12 should be used, which implies
that this is an 8bit surface.

// create surfaces

VASurfaceID surfaces[32];

unsigned int format = VA_RT_FORMAT_YUV420;

if (m_config.profile == VAProfileHEVCMain10)

format = VA_RT_FORMAT_YUV420_10BPP;

But then Kodi requests a 10bit surface. Now what is the
correct thing to do here?

I can either create an NV12 surface, which would be 8bit but
would result in either an error message or only 8bit
dithering during decode.

Or I can promote the surface to 10bit, which would result in
a P010 or rather P016 format.

Or, and that is actually what I think would be best, the VA-API
driver should throw an error indicating that the application
requested something impossible.


Yes, you are right. Looks like it is driver specific:

https://cgit.freedesktop.org/vaapi/intel-driver/tree/src/i965_drv_video.c#n1338



It seems they use it as a hint to the subsampling: SUBSAMPLE_YUV420
and then later compare with the format again to choose.

From code pov we should set the attribute to: VA_FOURCC_P010, right?


Yes, I think so.

Christian.


Fixed via: 
https://github.com/FernetMenta/kodi-agile/commit/bb73b5535e2f4b65772451c23f75503d04de69ef

thanks for the heads up



Great! Thanks a lot for cleaning that up so quickly.

In that case I will just respond to such nonsense combinations with an 
error result.


Christian.
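
The "respond with an error" idea can be sketched roughly as below. This is
only an illustration in C, not the actual state tracker code: the enums are
hypothetical stand-ins for the VA_RT_FORMAT_* and VA_FOURCC_* values from
<va/va.h>, and the rule itself (an 8bit RT format needs an 8bit pixel format,
a 10bit RT format needs a wider one) is an assumption made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; a real driver would use the VA_RT_FORMAT_* and
 * VA_FOURCC_* definitions from <va/va.h> instead. */
enum sketch_rt_format {
   SKETCH_RT_YUV420,       /* 8 bits per component */
   SKETCH_RT_YUV420_10BPP  /* 10 bits per component */
};

enum sketch_fourcc {
   SKETCH_FOURCC_NV12,     /* 8bit 4:2:0 */
   SKETCH_FOURCC_P010,     /* 10bit 4:2:0 stored in 16bit words */
   SKETCH_FOURCC_P016      /* 16bit 4:2:0 */
};

/* True when the pixel format stores more than 8 bits per component. */
static bool
fourcc_is_high_depth(enum sketch_fourcc fourcc)
{
   return fourcc == SKETCH_FOURCC_P010 || fourcc == SKETCH_FOURCC_P016;
}

/* True when the RT format and the pixel format attribute agree on depth.
 * A caller would reject the surface request when this returns false. */
static bool
surface_request_is_consistent(enum sketch_rt_format rt,
                              enum sketch_fourcc fourcc)
{
   bool wants_high_depth = (rt == SKETCH_RT_YUV420_10BPP);

   return wants_high_depth == fourcc_is_high_depth(fourcc);
}
```

Under that assumed rule, the NV12 plus 10bit request from the Kodi snippet
is rejected, while P010 plus 10bit is accepted.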


Peter





Regards
Peter






afterwards we just do drm / egl interop, via:

https://github.com/FernetMenta/kodi-agile/blob/master/xbmc/cores/VideoPlayer/DVDCodecs/Video/VAAPI.cpp#L1374





I'm not sure if that will ever work correctly. The problem is
that VA-API leaks to the application what the data layout in
the surface is. As soon as we turn on tiling that will only
work with rather crude hacks.

I will try to get it working, but probably need help from you
guys as well.

Regards,
Christian.

You need ffmpeg 3.2.

If you use vaPutSurface it will end up as RGBA32 or
something, which is why we use the above way.

Best regards
Peter


Cheers,
Christian.





-- 
   Key-ID: 0x1A995A9B

   keyserver: pgp.mit.edu
 
==
Fingerprint: 4606 DA19 EC2E 9A0B 0157 C81B DA07 CF63 1A99
5A9B










Re: [Mesa-dev] [PATCH] configure.ac: Require LLVM for r300 only on x86 and x86_64

2017-01-26 Thread Emil Velikov

On 26 January 2017 at 11:13, Andreas Boll  wrote:
> b3119a3 introduced a strict LLVM requirement for r300 on all
> architectures and thus configure fails on architectures where LLVM is
> not available or buggy.
>
> r300 doesn't strictly require LLVM, but for performance reasons we
> highly recommend LLVM usage. So require it at least on x86 and x86_64
> architectures as we have done before b3119a3.
>
Was hoping that nobody would notice ;-)

But seriously, I've just sent a series which reworks things - covering
the concerns that have been voiced before.
I would love to get that (or similar) in and for us to drop this
workaround. Until/if that happens, the patch is spot on:

Reviewed-by: Emil Velikov 

Emil

Re: [Mesa-dev] [PATCH mesa] docs/repository: fix name of main branch

2017-01-26 Thread Matt Turner

On Thu, Jan 26, 2017 at 10:11 AM, Eric Engestrom
 wrote:
> This is git, not svn :P
>
> Signed-off-by: Eric Engestrom 

Reviewed-by: Matt Turner 

> ---
> I also noticed some difference between this file and the one on
> mesa3d.org; it might be worth making sure everything is sync'ed between
> the two (most likely just push the version on git to the webserver).

It would be nice to make that an automated process. Brian?

Re: [Mesa-dev] [PATCH 2/2] gallium/radeon: add VRAM-vis-usage HUD query

2017-01-26 Thread Marek Olšák

For the series:

Reviewed-by: Marek Olšák 

On Jan 25, 2017 5:50 PM, "Nicolai Hähnle"  wrote:

Reviewed-by: Nicolai Hähnle 


On 25.01.2017 16:56, Samuel Pitoiset wrote:

> This new query returns the current visible usage of VRAM accessed
> by the CPU. It will return 0 on radeon because it's unimplemented.
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/radeon/r600_query.c   | 7 +++
>  src/gallium/drivers/radeon/r600_query.h   | 1 +
>  src/gallium/drivers/radeon/radeon_winsys.h| 1 +
>  src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 4 
>  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 1 +
>  5 files changed, 14 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_query.c
> b/src/gallium/drivers/radeon/r600_query.c
> index 96157cd40e..d4e41306a4 100644
> --- a/src/gallium/drivers/radeon/r600_query.c
> +++ b/src/gallium/drivers/radeon/r600_query.c
> @@ -71,6 +71,7 @@ static enum radeon_value_id winsys_id_from_type(unsigned
> type)
> case R600_QUERY_NUM_BYTES_MOVED: return RADEON_NUM_BYTES_MOVED;
> case R600_QUERY_NUM_EVICTIONS: return RADEON_NUM_EVICTIONS;
> case R600_QUERY_VRAM_USAGE: return RADEON_VRAM_USAGE;
> +   case R600_QUERY_VRAM_VIS_USAGE: return RADEON_VRAM_VIS_USAGE;
> case R600_QUERY_GTT_USAGE: return RADEON_GTT_USAGE;
> case R600_QUERY_GPU_TEMPERATURE: return RADEON_GPU_TEMPERATURE;
> case R600_QUERY_CURRENT_GPU_SCLK: return RADEON_CURRENT_SCLK;
> @@ -129,6 +130,7 @@ static bool r600_query_sw_begin(struct
> r600_common_context *rctx,
> case R600_QUERY_MAPPED_VRAM:
> case R600_QUERY_MAPPED_GTT:
> case R600_QUERY_VRAM_USAGE:
> +   case R600_QUERY_VRAM_VIS_USAGE:
> case R600_QUERY_GTT_USAGE:
> case R600_QUERY_GPU_TEMPERATURE:
> case R600_QUERY_CURRENT_GPU_SCLK:
> @@ -238,6 +240,7 @@ static bool r600_query_sw_end(struct
> r600_common_context *rctx,
> case R600_QUERY_MAPPED_VRAM:
> case R600_QUERY_MAPPED_GTT:
> case R600_QUERY_VRAM_USAGE:
> +   case R600_QUERY_VRAM_VIS_USAGE:
> case R600_QUERY_GTT_USAGE:
> case R600_QUERY_GPU_TEMPERATURE:
> case R600_QUERY_CURRENT_GPU_SCLK:
> @@ -1731,6 +1734,7 @@ static struct pipe_driver_query_info
> r600_driver_query_list[] = {
> X("num-bytes-moved",NUM_BYTES_MOVED,BYTES,
> CUMULATIVE),
> X("num-evictions",  NUM_EVICTIONS,  UINT64,
> CUMULATIVE),
> X("VRAM-usage", VRAM_USAGE, BYTES,
> AVERAGE),
> +   X("VRAM-vis-usage", VRAM_VIS_USAGE, BYTES,
> AVERAGE),
> X("GTT-usage",  GTT_USAGE,  BYTES,
> AVERAGE),
> X("back-buffer-ps-draw-ratio",  BACK_BUFFER_PS_DRAW_RATIO,
> UINT64, AVERAGE),
>
> @@ -1814,6 +1818,9 @@ static int r600_get_driver_query_info(struct
> pipe_screen *screen,
> case R600_QUERY_GPU_TEMPERATURE:
> info->max_value.u64 = 125;
> break;
> +   case R600_QUERY_VRAM_VIS_USAGE:
> +   info->max_value.u64 = rscreen->info.vram_vis_size;
> +   break;
> }
>
> if (info->group_id != ~(unsigned)0 && rscreen->perfcounters)
> diff --git a/src/gallium/drivers/radeon/r600_query.h
> b/src/gallium/drivers/radeon/r600_query.h
> index 20856a5b2e..f2af9240d2 100644
> --- a/src/gallium/drivers/radeon/r600_query.h
> +++ b/src/gallium/drivers/radeon/r600_query.h
> @@ -66,6 +66,7 @@ enum {
> R600_QUERY_NUM_BYTES_MOVED,
> R600_QUERY_NUM_EVICTIONS,
> R600_QUERY_VRAM_USAGE,
> +   R600_QUERY_VRAM_VIS_USAGE,
> R600_QUERY_GTT_USAGE,
> R600_QUERY_GPU_TEMPERATURE,
> R600_QUERY_CURRENT_GPU_SCLK,
> diff --git a/src/gallium/drivers/radeon/radeon_winsys.h
> b/src/gallium/drivers/radeon/radeon_winsys.h
> index e373e2f0a1..881bd5f2e4 100644
> --- a/src/gallium/drivers/radeon/radeon_winsys.h
> +++ b/src/gallium/drivers/radeon/radeon_winsys.h
> @@ -88,6 +88,7 @@ enum radeon_value_id {
>  RADEON_NUM_BYTES_MOVED,
>  RADEON_NUM_EVICTIONS,
>  RADEON_VRAM_USAGE,
> +RADEON_VRAM_VIS_USAGE,
>  RADEON_GTT_USAGE,
>  RADEON_GPU_TEMPERATURE, /* DRM 2.42.0 */
>  RADEON_CURRENT_SCLK,
> diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> index ea4d25476f..c3dfda53f0 100644
> --- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> +++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
> @@ -451,6 +451,10 @@ static uint64_t amdgpu_query_value(struct
> radeon_winsys *rws,
> case RADEON_VRAM_USAGE:
>       amdgpu_query_heap_info(ws->dev, AMDGPU_GEM_DOMAIN_VRAM, 0, &heap);
>return heap.heap_usage;
> +   case RADEON_VRAM_VIS_USAGE:
> +  amdgpu_query_heap_info(ws->dev, AMDGPU_GEM_DOMAIN_VRAM,
> +

Re: [Mesa-dev] [PATCH 1/8] nir: silence implicit conversion to 64bit

2017-01-26 Thread Matt Turner

On Thu, Jan 26, 2017 at 10:22 AM, Jason Ekstrand  wrote:
> Ugh... windows defines long to be 32-bit on 32-bit platforms  Yeah.

long is 32-bit on 32-bit x86/Linux too...

I think you mean they define long to be 32-bits on 64-bit platforms.

[Mesa-dev] [PATCH 6/8] i915g: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Provides quick and direct feedback to the user/developer.

Cc: Stéphane Marchesin 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/i915/i915_screen.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 6f9e612348..31803eab0e 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -65,6 +65,7 @@ i915_get_name(struct pipe_screen *screen)
 {
static char buffer[128];
const char *chipset;
+   const char *llvm = "";
 
switch (i915_screen(screen)->iws->pci_id) {
case PCI_CHIP_I915_G:
@@ -102,7 +103,11 @@ i915_get_name(struct pipe_screen *screen)
   break;
}
 
-   util_snprintf(buffer, sizeof(buffer), "i915 (chipset: %s)", chipset);
+#ifdef HAVE_LLVM
+   llvm = " LLVM";
+#endif
+
+   util_snprintf(buffer, sizeof(buffer), "i915 (chipset: %s)%s", chipset, 
llvm);
return buffer;
 }
 
-- 
2.11.0


[Mesa-dev] [PATCH 1/8] virgl: remove unused draw_context.h include

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/virgl/virgl_screen.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/drivers/virgl/virgl_screen.c 
b/src/gallium/drivers/virgl/virgl_screen.c
index 4515f5e9ce..59d96f894d 100644
--- a/src/gallium/drivers/virgl/virgl_screen.c
+++ b/src/gallium/drivers/virgl/virgl_screen.c
@@ -27,7 +27,6 @@
 #include "os/os_time.h"
 #include "pipe/p_defines.h"
 #include "pipe/p_screen.h"
-#include "draw/draw_context.h"
 
 #include "tgsi/tgsi_exec.h"
 
-- 
2.11.0


[Mesa-dev] [PATCH 2/8] llvmpipe: use draw_get_option_use_llvm() instead of open coding it

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Cc: Roland Scheidegger 
Cc: Jose Fonseca 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 89a1dc868e..c913767fc0 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -368,12 +368,12 @@ llvmpipe_get_shader_param(struct pipe_screen *screen, 
unsigned shader, enum pipe
   * support vertex shader texture lookups when LLVM is enabled in
   * the draw module.
   */
- if (debug_get_bool_option("DRAW_USE_LLVM", TRUE))
+ if (draw_get_option_use_llvm())
 return PIPE_MAX_SAMPLERS;
  else
 return 0;
   case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
- if (debug_get_bool_option("DRAW_USE_LLVM", TRUE))
+ if (draw_get_option_use_llvm())
 return PIPE_MAX_SHADER_SAMPLER_VIEWS;
  else
 return 0;
-- 
2.11.0


[Mesa-dev] [PATCH 5/8] i915g: use draw_get_option_use_llvm() instead of open coding it

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Currently one can build i915g without LLVM thus the current handling is
wrong. Whether using i915g w/o LLVM is a good idea or not is a question
for another time.

Cc: Stéphane Marchesin 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/i915/i915_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index 78898736d9..6f9e612348 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -114,7 +114,7 @@ i915_get_shader_param(struct pipe_screen *screen, unsigned 
shader, enum pipe_sha
   switch (cap) {
   case PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS:
   case PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS:
- if (debug_get_bool_option("DRAW_USE_LLVM", TRUE))
+ if (draw_get_option_use_llvm())
 return PIPE_MAX_SAMPLERS;
  else
 return 0;
-- 
2.11.0
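
For readers who wonder why the open-coded check is wrong in a build without
LLVM, here is a rough sketch in C of what a single query point can look like.
It is not the draw module's code: env_bool() and sketch_draw_use_llvm() are
hypothetical names, and the accepted spellings of the variable are an
assumption. Only the DRAW_USE_LLVM variable and the HAVE_LLVM define are
taken from the patches in this series.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Reads a boolean environment variable. Returns `dfault` when the variable
 * is unset or has an unrecognised value. Hypothetical helper. */
static bool
env_bool(const char *name, bool dfault)
{
   const char *s = getenv(name);

   if (s == NULL)
      return dfault;
   if (!strcmp(s, "0") || !strcmp(s, "false") || !strcmp(s, "no"))
      return false;
   if (!strcmp(s, "1") || !strcmp(s, "true") || !strcmp(s, "yes"))
      return true;
   return dfault;
}

/* Single place that answers "does draw use LLVM?". */
static bool
sketch_draw_use_llvm(void)
{
#ifdef HAVE_LLVM
   /* Built with LLVM: on by default, the user may switch it off. */
   return env_bool("DRAW_USE_LLVM", true);
#else
   /* Built without LLVM: the variable cannot turn it on. */
   return false;
#endif
}
```

A driver that checks the environment variable directly would report sampler
support based on the default of TRUE even when LLVM was never compiled in;
going through one helper avoids that.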


[Mesa-dev] [PATCH 8/8] nouveau: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Analogous to previous two commits. Afaict only nv30 uses draw, so if
people prefer we can restrict this print only to those devices.

Signed-off-by: Emil Velikov 
---
Unrelated:
Wasn't there a kernel/libdrm helper which can give us the complete
device name ?
---
 src/gallium/drivers/nouveau/nouveau_screen.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_screen.c 
b/src/gallium/drivers/nouveau/nouveau_screen.c
index f59e101caf..24177bd7da 100644
--- a/src/gallium/drivers/nouveau/nouveau_screen.c
+++ b/src/gallium/drivers/nouveau/nouveau_screen.c
@@ -33,8 +33,13 @@ nouveau_screen_get_name(struct pipe_screen *pscreen)
 {
struct nouveau_device *dev = nouveau_screen(pscreen)->device;
static char buffer[128];
+   const char *llvm = "";
 
-   util_snprintf(buffer, sizeof(buffer), "NV%02X", dev->chipset);
+#ifdef HAVE_LLVM
+   llvm = " LLVM";
+#endif
+
+   util_snprintf(buffer, sizeof(buffer), "NV%02X%s", dev->chipset, llvm);
return buffer;
 }
 
-- 
2.11.0


[Mesa-dev] [PATCH 7/8] r300: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Provides quick and direct feedback to the user/developer.

Cc: Marek Olšák 
Signed-off-by: Emil Velikov 
---
 src/gallium/drivers/r300/r300_screen.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r300/r300_screen.c 
b/src/gallium/drivers/r300/r300_screen.c
index e5e7535358..12b94723cb 100644
--- a/src/gallium/drivers/r300/r300_screen.c
+++ b/src/gallium/drivers/r300/r300_screen.c
@@ -82,8 +82,16 @@ static const char* chip_families[] = {
 static const char* r300_get_name(struct pipe_screen* pscreen)
 {
 struct r300_screen* r300screen = r300_screen(pscreen);
+static char buffer[128];
+const char *llvm = "";
 
-return chip_families[r300screen->caps.family];
+#ifdef HAVE_LLVM
+llvm = " LLVM";
+#endif
+
+util_snprintf(buffer, sizeof(buffer), "%s%s",
+  chip_families[r300screen->caps.family], llvm);
+return buffer;
 }
 
 static int r300_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
-- 
2.11.0


[Mesa-dev] [PATCH 4/8] softpipe: let .get_name() append LLVM if built with LLVM

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Provides quick and direct feedback to the user/developer.

Cc: Roland Scheidegger 
Cc: Jose Fonseca 
Signed-off-by: Emil Velikov 
---
We can move all the ifdef to a helper, if people prefer.
---
 src/gallium/drivers/softpipe/sp_screen.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/drivers/softpipe/sp_screen.c 
b/src/gallium/drivers/softpipe/sp_screen.c
index 1a58eb9d99..305f7bff14 100644
--- a/src/gallium/drivers/softpipe/sp_screen.c
+++ b/src/gallium/drivers/softpipe/sp_screen.c
@@ -57,7 +57,11 @@ softpipe_get_vendor(struct pipe_screen *screen)
 static const char *
 softpipe_get_name(struct pipe_screen *screen)
 {
+#ifdef HAVE_LLVM
+   return "softpipe LLVM";
+#else
return "softpipe";
+#endif
 }
 
 
-- 
2.11.0
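
The "move all the ifdef to a helper" remark above could look something like
the following sketch in C. The names sketch_llvm_suffix() and
sketch_format_name() are hypothetical and nothing like them is part of this
series; plain snprintf() is used here in place of util_snprintf() so the
snippet stands on its own.

```c
#include <stdio.h>

/* Returns the text to append to a driver name: " LLVM" when the build has
 * LLVM, an empty string otherwise. Hypothetical helper. */
static const char *
sketch_llvm_suffix(void)
{
#ifdef HAVE_LLVM
   return " LLVM";
#else
   return "";
#endif
}

/* Writes "<base><suffix>" into `buf` (at most `size` bytes including the
 * terminating NUL) and returns `buf`. */
static const char *
sketch_format_name(char *buf, size_t size, const char *base)
{
   snprintf(buf, size, "%s%s", base, sketch_llvm_suffix());
   return buf;
}
```

Each .get_name() callback would then keep its static buffer and call the
helper, leaving a single #ifdef in the tree instead of one per driver.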


[Mesa-dev] [PATCH 3/8] softpipe: set softpipe_screen::use_llvm when draw is build with LLVM

2017-01-26 Thread Emil Velikov

From: Emil Velikov 

Currently we can build draw without LLVM, thus honouring SOFTPIPE_USE_LLVM
is misleading even if most of the code nicely falls back to a no-op in the
absence of LLVM.

That does not seem to be the case in softpipe_draw_vbo(), where the extra
{prepare,cleanup}_{vertex,geometry}_sampling is present.

Haven't checked how much overhead this causes, but omitting it is the
correct thing to do, afaict.

Note: the topic of "is it a smart idea to have softpipe build with
LLVM-less draw" is to be checked another day.

Cc: Roland Scheidegger 
Cc: Jose Fonseca 
Signed-off-by: Emil Velikov 
---
It's the better thing to do imho, but if you feel strongly against it
feel free to drop it.
---
 src/gallium/drivers/softpipe/sp_screen.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/softpipe/sp_screen.c 
b/src/gallium/drivers/softpipe/sp_screen.c
index 9bc8d10e8e..1a58eb9d99 100644
--- a/src/gallium/drivers/softpipe/sp_screen.c
+++ b/src/gallium/drivers/softpipe/sp_screen.c
@@ -568,7 +568,9 @@ softpipe_create_screen(struct sw_winsys *winsys)
screen->base.context_create = softpipe_create_context;
screen->base.flush_frontbuffer = softpipe_flush_frontbuffer;
screen->base.get_compute_param = softpipe_get_compute_param;
+#ifdef HAVE_LLVM
screen->use_llvm = debug_get_option_use_llvm();
+#endif
 
util_format_s3tc_init();
 
-- 
2.11.0


[Mesa-dev] [PATCH 0/8] LLVM requirement for drivers using draw

2017-01-26 Thread Emil Velikov

Hi all,

Here are a few small fixes/functionality improvements when dealing with
LLVM.

Most notably the series adds the "LLVM" string [when applicable] to the .get_name()
callback for drivers that use draw.

Thus developers can respond to reports accordingly - be that "rebuild with
LLVM or enjoy the bad performance" or otherwise.

We can go a step further and make both(?) configure and the callback
produce a more nagging message along the lines of "built w/o LLVM expect bad
performance", if people prefer.

With this in mind we can drop the LLVM requirement, which some 
builders/distros explicitly patch out.

Let's be nice to each other and not force it onto them.

What do you guys think ?
Emil
 
Emil Velikov (8):
  virgl: remove unused draw_context.h include
  llvmpipe: use draw_get_option_use_llvm() instead of open coding it
  softpipe: set softpipe_screen::use_llvm when draw is build with LLVM
  softpipe: let .get_name() append LLVM if built with LLVM
  i915g: use draw_get_option_use_llvm() instead of open coding it
  i915g: let .get_name() append LLVM if built with LLVM
  r300: let .get_name() append LLVM if built with LLVM
  nouveau: let .get_name() append LLVM if built with LLVM

 src/gallium/drivers/i915/i915_screen.c   |  9 +++--
 src/gallium/drivers/llvmpipe/lp_screen.c |  4 ++--
 src/gallium/drivers/nouveau/nouveau_screen.c |  7 ++-
 src/gallium/drivers/r300/r300_screen.c   | 10 +-
 src/gallium/drivers/softpipe/sp_screen.c |  6 ++
 src/gallium/drivers/virgl/virgl_screen.c |  1 -
 6 files changed, 30 insertions(+), 7 deletions(-)

-- 
2.11.0


[Mesa-dev] [PATCH] st/va encode handle ntsc framerate rate control

2017-01-26 Thread Andy Furniss

Tested with ffmpeg and gst-vaapi. Without this, bits per
frame is set way too low.

Signed-off-by: Andy Furniss 
---
 src/gallium/state_trackers/va/picture.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/va/picture.c 
b/src/gallium/state_trackers/va/picture.c
index 82584ea..a024437 100644
--- a/src/gallium/state_trackers/va/picture.c
+++ b/src/gallium/state_trackers/va/picture.c
@@ -119,14 +119,30 @@ getEncParamPreset(vlVaContext *context)
context->desc.h264enc.rate_ctrl.fill_data_enable = 1;
context->desc.h264enc.rate_ctrl.enforce_hrd = 1;
context->desc.h264enc.enable_vui = false;
-   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0)
-  context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
-   context->desc.h264enc.rate_ctrl.target_bits_picture =
-  context->desc.h264enc.rate_ctrl.target_bitrate / 
context->desc.h264enc.rate_ctrl.frame_rate_num;
-   context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
-  context->desc.h264enc.rate_ctrl.peak_bitrate / 
context->desc.h264enc.rate_ctrl.frame_rate_num;
-   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
+   if (context->desc.h264enc.rate_ctrl.frame_rate_num == 0 ||
+   context->desc.h264enc.rate_ctrl.frame_rate_den == 0) {
+ context->desc.h264enc.rate_ctrl.frame_rate_num = 30;
+ context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
+   }
+   if (context->desc.h264enc.rate_ctrl.frame_rate_den > 1) {
+  context->desc.h264enc.rate_ctrl.target_bits_picture =
+ context->desc.h264enc.rate_ctrl.target_bitrate /
+ (context->desc.h264enc.rate_ctrl.frame_rate_num /
+ context->desc.h264enc.rate_ctrl.frame_rate_den + 1);
+  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
+ context->desc.h264enc.rate_ctrl.peak_bitrate /
+ (context->desc.h264enc.rate_ctrl.frame_rate_num /
+ context->desc.h264enc.rate_ctrl.frame_rate_den + 1);
+   } else {
+  context->desc.h264enc.rate_ctrl.target_bits_picture =
+ context->desc.h264enc.rate_ctrl.target_bitrate /
+ context->desc.h264enc.rate_ctrl.frame_rate_num;
+  context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
+ context->desc.h264enc.rate_ctrl.peak_bitrate /
+ context->desc.h264enc.rate_ctrl.frame_rate_num;
+   }
 
+   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
context->desc.h264enc.ref_pic_mode = 0x0201;
 }
 
@@ -362,7 +378,7 @@ handleVAEncSequenceParameterBufferType(vlVaDriver *drv, 
vlVaContext *context, vl
   context->gop_coeff = VL_VA_ENC_GOP_COEFF;
context->desc.h264enc.gop_size = h264->intra_idr_period * 
context->gop_coeff;
context->desc.h264enc.rate_ctrl.frame_rate_num = h264->time_scale / 2;
-   context->desc.h264enc.rate_ctrl.frame_rate_den = 1;
+   context->desc.h264enc.rate_ctrl.frame_rate_den = h264->num_units_in_tick;
return VA_STATUS_SUCCESS;
 }
 
-- 
2.7.0
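
The arithmetic in the patch above can be sketched in isolation. This is only a minimal illustration, assuming a rational frame rate num/den such as 30000/1001; the helper name bits_per_picture and its parameters are hypothetical and do not exist in picture.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in picture.c): bits available per picture for a
 * rational frame rate num/den, mirroring the arithmetic of the patch. */
static uint32_t
bits_per_picture(uint32_t bitrate, uint32_t num, uint32_t den)
{
   uint32_t fps;

   /* Fall back to 30/1 when either part of the frame rate is missing. */
   if (num == 0 || den == 0) {
      num = 30;
      den = 1;
   }

   /* For den > 1 the patch takes the integer quotient plus one,
    * so 30000/1001 becomes 29 + 1 = 30 frames per second. */
   fps = (den > 1) ? num / den + 1 : num;
   assert(fps != 0);

   return bitrate / fps;
}
```

With a 3000000 bit/s target, 30000/1001 gives 100000 bits per picture, whereas dividing by the numerator alone (30000) would give only 100.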


Re: [Mesa-dev] [PATCH 1/8] nir: silence implicit conversion to 64bit

2017-01-26 Thread Ian Romanick

I keep seeing patches like this... is it time to move BITFIELD64_* from
mtypes.h to somewhere in util for more general use?

On 01/26/2017 05:18 AM, Emil Velikov wrote:
> From: Emil Velikov 
> 
> MSVC warns about implicit conversion as below. Annotate the literal
> appropriately to silence the warning.
> 
> nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
> implicitly converted to 64 bits (was 64-bit shift intended?)
> 
> Signed-off-by: Emil Velikov 
> ---
>  src/compiler/nir/nir_gather_info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/compiler/nir/nir_gather_info.c 
> b/src/compiler/nir/nir_gather_info.c
> index 35a1ce4dec..0c70787252 100644
> --- a/src/compiler/nir/nir_gather_info.c
> +++ b/src/compiler/nir/nir_gather_info.c
> @@ -246,7 +246,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
> nir_shader *shader)
> case nir_intrinsic_load_tess_level_outer:
> case nir_intrinsic_load_tess_level_inner:
>shader->info->system_values_read |=
> - (1 << nir_system_value_from_intrinsic(instr->intrinsic));
> + (1ull << nir_system_value_from_intrinsic(instr->intrinsic));
>break;
>  
> case nir_intrinsic_end_primitive:
> 
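
For illustration, the kind of shared helper suggested above could look roughly like the sketch below. EXAMPLE_BIT64 and mark_system_value are hypothetical names used only here; they are not the BITFIELD64_* macros from mtypes.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared 64-bit bitfield helper.  The literal is
 * 64 bits wide before the shift, so bit indices 32..63 are well defined. */
#define EXAMPLE_BIT64(n) (UINT64_C(1) << (n))

/* Set one bit in a 64-bit mask, as the patched line does with 1ull. */
static uint64_t
mark_system_value(uint64_t mask, unsigned index)
{
   assert(index < 64);
   return mask | EXAMPLE_BIT64(index);
}
```

A plain (1 << index) is evaluated as int, so shifting by 32 or more is undefined behaviour; that is the situation the MSVC warning hints at.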


Re: [Mesa-dev] [PATCH 1/8] nir: silence implicit conversion to 64bit

2017-01-26 Thread Jason Ekstrand

Ugh... Windows defines long to be 32-bit on 32-bit platforms. Yeah.

Reviewed-by: Jason Ekstrand 

On Thu, Jan 26, 2017 at 6:05 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> Reviewed-by: Lionel Landwerlin 
>
> On 26/01/17 13:18, Emil Velikov wrote:
>
>> From: Emil Velikov 
>>
>> MSVC warns about implicit conversion as below. Annotate the literal
>> appropriately to silence the warning.
>>
>> nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
>> implicitly converted to 64 bits (was 64-bit shift intended?)
>>
>> Signed-off-by: Emil Velikov 
>> ---
>>   src/compiler/nir/nir_gather_info.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/nir/nir_gather_info.c
>> b/src/compiler/nir/nir_gather_info.c
>> index 35a1ce4dec..0c70787252 100644
>> --- a/src/compiler/nir/nir_gather_info.c
>> +++ b/src/compiler/nir/nir_gather_info.c
>> @@ -246,7 +246,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr,
>> nir_shader *shader)
>>  case nir_intrinsic_load_tess_level_outer:
>>  case nir_intrinsic_load_tess_level_inner:
>> shader->info->system_values_read |=
>> - (1 << nir_system_value_from_intrinsic(instr->intrinsic));
>> + (1ull << nir_system_value_from_intrinsic(instr->intrinsic));
>> break;
>>case nir_intrinsic_end_primitive:
>>
>
>

Re: [Mesa-dev] [PATCH 2/8] nir: add extra const notation in compare_blocks()

2017-01-26 Thread Jason Ekstrand

LGTM

On Thu, Jan 26, 2017 at 6:05 AM, Lionel Landwerlin <
lionel.g.landwer...@intel.com> wrote:

> Reviewed-by: Lionel Landwerlin 
>
> On 26/01/17 13:18, Emil Velikov wrote:
>
>> From: Emil Velikov 
>>
>> MSVC warns about different const qualifiers. Add the extra const to
>> silence it.
>>
>> nir_phi_builder.c(244) : warning C4090: 'initializing' : different
>> 'const' qualifiers
>> nir_phi_builder.c(245) : warning C4090: 'initializing' : different
>> 'const' qualifiers
>>
>> Signed-off-by: Emil Velikov 
>> ---
>>   src/compiler/nir/nir_phi_builder.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir_phi_builder.c
>> b/src/compiler/nir/nir_phi_builder.c
>> index acfc771da2..883884bb7f 100644
>> --- a/src/compiler/nir/nir_phi_builder.c
>> +++ b/src/compiler/nir/nir_phi_builder.c
>> @@ -241,8 +241,8 @@ nir_phi_builder_value_get_block_def(struct
>> nir_phi_builder_value *val,
>>   static int
>>   compare_blocks(const void *_a, const void *_b)
>>   {
>> -   nir_block * const * a = _a;
>> -   nir_block * const * b = _b;
>> +   const nir_block * const * a = _a;
>> +   const nir_block * const * b = _b;
>>return (*a)->index - (*b)->index;
>>   }
>>
>
>
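
A self-contained sketch of a qsort() comparator with the fully const-qualified pointer types from the patch. struct example_block is a hypothetical stand-in for nir_block, and the comparison is written without subtraction, which is a deliberate difference from compare_blocks().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for nir_block; only the index matters here. */
struct example_block {
   unsigned index;
};

static int
compare_example_blocks(const void *_a, const void *_b)
{
   /* qsort() hands us pointers to the array elements, which are themselves
    * pointers; neither level is modified, so both are const. */
   const struct example_block * const *a = _a;
   const struct example_block * const *b = _b;

   /* Compare instead of subtracting so the result cannot wrap around. */
   return ((*a)->index > (*b)->index) - ((*a)->index < (*b)->index);
}

static void
sort_example_blocks(struct example_block **blocks, size_t count)
{
   assert(blocks != NULL || count == 0);
   if (count > 1)
      qsort(blocks, count, sizeof(blocks[0]), compare_example_blocks);
}
```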

Re: [Mesa-dev] [PATCH 5/8] glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.

2017-01-26 Thread Ian Romanick

On 01/25/2017 12:55 PM, Francisco Jerez wrote:
> Ian Romanick  writes:
> 
>> It's a real bummer that we have two implementations of this function
>> that are basically written in assembly... I'm not sure what else you'd
>> call generating IR by hand.  The code review and maintenance costs are
>> of the same magnitude for sure.
>>
>> We could move this to GLSL and let the standalone compiler generate the
>> builder code.  I don't think that is currently helpful.  However, for
>> future "soft" int64 and fp64 work the standalone compiler will need to
>> be extended to also generate NIR builder.  Once that is done, I think
>> the cost-benefit analysis changes.
> 
> Yes, I agree...  It was a real PITA to have to fix two different
> implementations of atan2, we should come up with some way to share the
> lowering of these built-ins even if they're still written in assembly --
> It would still halve the amount of pain we inflict on ourselves.
> 
>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>> This addresses several issues of the current atan2 implementation:
>>>
>>>  - Negative zero (and negative denorms which end up getting flushed to
>>>zero) isn't handled correctly by the current implementation.  The
>>>reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
>>>on which side of the branch cut the argument is, which causes us to
>>>return incorrect results (off by up to 2π) for very small negative
>>>values.
>>>
>>>  - There is a serious precision problem for x values of large enough
>>>magnitude introduced by the floating point division operation being
>>>implemented as a mul+rcp sequence.  This can lead to the quotient
>>>getting flushed to zero in some cases introducing an error of over
>>>8e6 ULP in the result -- Or in the most catastrophic case will
>>>cause us to return NaN instead of the correct value ±π/2 for y=±∞
>>>and x very large.  We can fix this easily by scaling down both
>>>arguments when the absolute value of the denominator goes above
>>>a certain threshold.  The error of this atan2 implementation remains
>>>below 25 ULP in most of its domain except for a neighborhood of y=0
>>>where it reaches a maximum error of about 180 ULP.
>>>
>>>  - It emits a bunch of instructions including no less than three
>>>if-else branches per scalar component that don't seem to get
>>>optimized out later on.  This implementation uses about 13% fewer
>>>instructions on Intel SKL hardware and doesn't emit any control
>>>flow instructions.
>>> ---
>>>  src/compiler/glsl/builtin_functions.cpp | 82 
>>> ++---
>>>  1 file changed, 46 insertions(+), 36 deletions(-)
>>>
>>> diff --git a/src/compiler/glsl/builtin_functions.cpp 
>>> b/src/compiler/glsl/builtin_functions.cpp
>>> index 4a6c5af..fd59381 100644
>>> --- a/src/compiler/glsl/builtin_functions.cpp
>>> +++ b/src/compiler/glsl/builtin_functions.cpp
>>> @@ -3560,44 +3560,54 @@ builtin_builder::_acos(const glsl_type *type)
>>>  ir_function_signature *
>>>  builtin_builder::_atan2(const glsl_type *type)
>>>  {
>>> -   ir_variable *vec_y = in_var(type, "vec_y");
>>> -   ir_variable *vec_x = in_var(type, "vec_x");
>>> -   MAKE_SIG(type, always_available, 2, vec_y, vec_x);
>>> -
>>> -   ir_variable *vec_result = body.make_temp(type, "vec_result");
>>> -   ir_variable *r = body.make_temp(glsl_type::float_type, "r");
>>> -   for (int i = 0; i < type->vector_elements; i++) {
>>> -  ir_variable *y = body.make_temp(glsl_type::float_type, "y");
>>> -  ir_variable *x = body.make_temp(glsl_type::float_type, "x");
>>> -  body.emit(assign(y, swizzle(vec_y, i, 1)));
>>> -  body.emit(assign(x, swizzle(vec_x, i, 1)));
>>> -
>>> -  /* If |x| >= 1.0e-8 * |y|: */
>>> -  ir_if *outer_if =
>>> - new(mem_ctx) ir_if(greater(abs(x), mul(imm(1.0e-8f), abs(y))));
>>> -
>>> -  ir_factory outer_then(&outer_if->then_instructions, mem_ctx);
>>> -
>>> -  /* Then...call atan(y/x) */
>>> -  do_atan(outer_then, glsl_type::float_type, r, div(y, x));
>>> -
>>> -  /* ...and fix it up: */
>>> -  ir_if *inner_if = new(mem_ctx) ir_if(less(x, imm(0.0f)));
>>> -  inner_if->then_instructions.push_tail(
>>> - if_tree(gequal(y, imm(0.0f)),
>>> - assign(r, add(r, imm(M_PIf))),
>>> - assign(r, sub(r, imm(M_PIf)))));
>>> -  outer_then.emit(inner_if);
>>> -
>>> -  /* Else... */
>>> -  outer_if->else_instructions.push_tail(
>>> - assign(r, mul(sign(y), imm(M_PI_2f))));
>>> +   const unsigned n = type->vector_elements;
>>> +   ir_variable *y = in_var(type, "y");
>>> +   ir_variable *x = in_var(type, "x");
>>> +   MAKE_SIG(type, always_available, 2, y, x);
>>>  
>>> -  body.emit(outer_if);
>>> +   /* If we're on the left half-plane rotate the coordinates π/2 clock-wise
>>> +* for the y=0 discontinuity to end up aligned with the vertical
>>> +* discontinuity
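
The negative-zero problem named in the commit message can be shown in isolation. The sketch below is not the atan2 code from the patch; it only contrasts a 'y >= 0' comparison with a sign-bit test when choosing which side of the branch cut to use. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_PI 3.14159265358979323846f

/* Returns nonzero when the IEEE 754 sign bit of a float is set. */
static int
example_sign_bit(float v)
{
   uint32_t bits;

   assert(sizeof(bits) == sizeof(v));
   memcpy(&bits, &v, sizeof(bits));
   return (bits >> 31) != 0;
}

/* Comparison-based choice: -0.0f compares equal to 0.0f, so it is treated
 * as lying on the positive side of the cut. */
static float
offset_by_comparison(float y)
{
   return (y >= 0.0f) ? EXAMPLE_PI : -EXAMPLE_PI;
}

/* Sign-bit-based choice: -0.0f is placed on the negative side. */
static float
offset_by_sign_bit(float y)
{
   return example_sign_bit(y) ? -EXAMPLE_PI : EXAMPLE_PI;
}
```

For y = -0.0f the two functions differ by 2π, which matches the size of the error described above for very small negative values.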

Re: [Mesa-dev] [PATCH 15/17] gallium/radeon: remove r600_common_context::max_db

2017-01-26 Thread Alex Deucher

On Thu, Jan 26, 2017 at 12:39 PM, Gustaw Smolarczyk
 wrote:
> 2017-01-26 17:04 GMT+01:00 Marek Olšák :
>> From: Marek Olšák 
>>
>> this cleanup is based on the vulkan driver, which seems to do the same thing
>
> Is this also ok for r600g? If I'm right, the amdgpu-pro Vulkan driver
> doesn't have any support for pre-GCN hardware.

On r6xx and newer, the CB:DB ratio should always be 1:1.  I think the
only time it was different was some r3xx-r5xx chips that had 2:1 CB:DB
ratios.

Alex

[Mesa-dev] [PATCH mesa] docs/repository: fix name of main branch

2017-01-26 Thread Eric Engestrom

This is git, not svn :P

Signed-off-by: Eric Engestrom 
---
I also noticed some difference between this file and the one on
mesa3d.org; it might be worth making sure everything is sync'ed between
the two (most likely just push the version on git to the webserver).
---
 docs/repository.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/repository.html b/docs/repository.html
index 1fb88bf717..8efc9cf23e 100644
--- a/docs/repository.html
+++ b/docs/repository.html
@@ -144,7 +144,7 @@ Developer git Access
 
 At any given time, there may be several active branches in Mesa's
 repository.
-Generally, the trunk contains the latest development (unstable)
+Generally, master contains the latest development (unstable)
 code while a branch has the latest stable code.
 
 
-- 
Cheers,
  Eric


Re: [Mesa-dev] [PATCH] clover: Fix build against clang SVN >= r293097

2017-01-26 Thread Francisco Jerez

Michel Dänzer  writes:

> From: Michel Dänzer 
>
> Signed-off-by: Michel Dänzer 

Reviewed-by: Francisco Jerez 

> ---
>
> Not sure if PropagateAttrs should be set to true or false. Setting it to
> true because clang SVN r293097 does so for CUDA, and there are no piglit
> quick_cl regressions with radeonsi on Kaveri.
>
>  src/gallium/state_trackers/clover/llvm/compat.hpp | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp 
> b/src/gallium/state_trackers/clover/llvm/compat.hpp
> index 81592ce702..906367b314 100644
> --- a/src/gallium/state_trackers/clover/llvm/compat.hpp
> +++ b/src/gallium/state_trackers/clover/llvm/compat.hpp
> @@ -83,7 +83,14 @@ namespace clover {
>   inline void
> add_link_bitcode_file(clang::CodeGenOptions &opts,
> const std::string &path) {
> -#if HAVE_LLVM >= 0x0308
> +#if HAVE_LLVM >= 0x0500
> +clang::CodeGenOptions::BitcodeFileToLink F;
> +
> +F.Filename = path;
> +F.PropagateAttrs = true;
> +F.LinkFlags = ::llvm::Linker::Flags::None;
> +opts.LinkBitcodeFiles.emplace_back(F);
> +#elif HAVE_LLVM >= 0x0308
>  opts.LinkBitcodeFiles.emplace_back(::llvm::Linker::Flags::None, 
> path);
>  #else
>  opts.LinkBitcodeFile = path;
> -- 
> 2.11.0
>

Re: [Mesa-dev] [PATCH] i965: Share the workaround bo between all contexts

2017-01-26 Thread Chris Wilson

On Thu, Jan 26, 2017 at 09:39:51AM -0800, Chad Versace wrote:
> On Thu 26 Jan 2017, Chris Wilson wrote:
> > Since the workaround bo is used strictly as a write-only buffer, we need
> > only allocate one per screen and use the same one from all contexts.
> > 
> > (The caveat here is during extension initialisation, where we write into
> > and read back register values from the buffer, but that is performed only
> > once for the first context - and barring synchronisation issues should not
> > be a problem. Safer would be to move that also to the screen.)
> > 
> > v2: Give the workaround bo its own init function and don't piggy back
> > intel_bufmgr_init() since it is not that related.
> > 
> > v3: Drop the reference count of the workaround bo for the context since
> > the context itself is owned by the screen (and so we can rely on the bo
> > existing for the lifetime of the context).
> 
> I like this idea, but I have questions and comments about the details.
> More questions than comments, really.
> 
> Today, with only Mesa changes, could we effectively do the same as
>   drm_intel_gem_bo_disable_implicit_sync(screen->workaround_bo);
> by hacking Mesa to set no read/write domain when emitting relocs for the
> workaround_bo? (I admit I don't fully understand the kernel's domain
> tracking). If that does work, then it just would require a small hack to
> brw_emit_pipe_control_write().

Yes, for anything that is totally scratch, just not setting the write
hazard is the same. For something like the seqno page, where we have
multiple engines and do want the contents to be preserved, not setting
the write hazard has the consequence that the page could be lost under
memory pressure or across resume. (As usual there are some details: this
part of the ABI had to be relaxed because userspace didn't have this flag.)
But that doesn't sell many bananas.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Mesa-dev] [PATCH 4/8] i965/fs: Fix nir_op_fsign of absolute value.

2017-01-26 Thread Ian Romanick

On 01/25/2017 01:42 PM, Francisco Jerez wrote:
> Ian Romanick  writes:
> 
>> On 01/24/2017 03:26 PM, Francisco Jerez wrote:
>>> This does point at the front-end emitting silly code that could have
>>> been optimized out, but the current fsign implementation would emit
>>> bogus IR if abs was set for the argument (because it would apply the
>>> abs modifier on an unsigned integer type), and we shouldn't rely on
>>> the upper layer's optimization passes for correctness.
>>
>> Other than the atan2 code you emit later in the series, is there a test
>> for this?
>>
> 
> PATCH 5 would cause a pile of atan tests to regress without this, but
> see the attachment for a test-case that reproduces the problem in
> isolation.
> 
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 -
>>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> index e1ab598..e0c2fa0 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>>> @@ -701,7 +701,14 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
>>> nir_alu_instr *instr)
>>>break;
>>>  
>>> case nir_op_fsign: {
>>> -  if (type_sz(op[0].type) < 8) {
>>> +  if (op[0].abs) {
>>> + /* Straightforward since the source can be assumed to be
>>> +  * non-negative.
>>> +  */
>>> + set_condmod(BRW_CONDITIONAL_NZ, bld.MOV(result, op[0]));
>>> + set_predicate(BRW_PREDICATE_NORMAL, bld.MOV(result, 
>>> brw_imm_f(1.0f)));
>>
>> Does this work for DF source?
>>
> Yup.
> 
>> If we had an optimization pass for this, it would probably map
>> fsign(abs(a)) to float(a != 0) or double(a != 0).  This is different
>> from what we would generate for that, but I don't know which is better.
>>
> 
> The main reason I did it that way was because it's able to handle double
> or single precision as-is without any special cases -- To do double(a !=
> 0) you need a 2-src CMP instruction which means you cannot use a DF
> immediate to compare against.  float(a != 0) is probably marginally
> better for single-precision though, because the second instruction would
> have a higher chance of getting optimized out -- If you like I'll make
> the change and deal with the restrictions of DF immediates.

Ah... that makes sense.  This patch is

Reviewed-by: Ian Romanick 

I believe this should also get tagged for 13.x and 17.0 stable.

>>> +
>>> +  } else if (type_sz(op[0].type) < 8) {
>>>   /* AND(val, 0x80000000) gives the sign bit.
>>>*
>>>* Predicated OR ORs 1.0 (0x3f800000) with the sign bit if val is 
>>> not
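
The two variants discussed in this thread can be sketched on the CPU with plain bit manipulation. This is an illustration only, not the i965 code: example_fsign follows the AND/OR idea from the comment above, and example_fsign_of_abs is the float(a != 0) form mentioned in the review. Both names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sign-bit approach: keep the sign bit of the input and OR in the bit
 * pattern of 1.0f (0x3f800000) when the value is not zero. */
static float
example_fsign(float v)
{
   uint32_t bits, out;

   assert(sizeof(bits) == sizeof(v));
   memcpy(&bits, &v, sizeof(bits));
   out = bits & 0x80000000u;
   if (v != 0.0f)
      out |= 0x3f800000u;
   memcpy(&v, &out, sizeof(v));
   return v;
}

/* fsign(abs(a)): the argument is never negative, so the result reduces to
 * 1.0 for any nonzero input and 0.0 otherwise. */
static float
example_fsign_of_abs(float a)
{
   return (a != 0.0f) ? 1.0f : 0.0f;
}
```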





Re: [Mesa-dev] [PATCH 7/8] st/mesa: explicitly handle/list 64bit ops in visit_expression()

2017-01-26 Thread Marek Olšák

Reviewed-by: Marek Olšák 

On Jan 26, 2017 2:22 PM, "Emil Velikov"  wrote:

> From: Emil Velikov 
>
> Follow the approach set in the file and handle all the ops, as otherwise
> the compiler throws a bunch of lovely warnings.
>
> Note that some versions of GCC have -Wswitch implied by -Wall, the latter of
> which is set in our autoconf and scons builds.
>
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 27
> +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index a437645d9e..b154fad185 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -2319,6 +2319,33 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression*
> ir, st_src_reg *op)
> case ir_binop_carry:
> case ir_binop_borrow:
> case ir_unop_ssbo_unsized_array_length:
> +   case ir_unop_bitcast_i642d:
> +   case ir_unop_bitcast_d2u64:
> +   case ir_unop_bitcast_d2i64:
> +   case ir_unop_i642i:
> +   case ir_unop_u642i:
> +   case ir_unop_i642u:
> +   case ir_unop_u642u:
> +   case ir_unop_i642b:
> +   case ir_unop_i642f:
> +   case ir_unop_u642f:
> +   case ir_unop_i642d:
> +   case ir_unop_u642d:
> +   case ir_unop_i2i64:
> +   case ir_unop_u2i64:
> +   case ir_unop_b2i64:
> +   case ir_unop_f2i64:
> +   case ir_unop_d2i64:
> +   case ir_unop_i2u64:
> +   case ir_unop_u2u64:
> +   case ir_unop_f2u64:
> +   case ir_unop_d2u64:
> +   case ir_unop_u642i64:
> +   case ir_unop_i642u64:
> +   case ir_unop_pack_int_2x32:
> +   case ir_unop_pack_uint_2x32:
> +   case ir_unop_unpack_int_2x32:
> +   case ir_unop_unpack_uint_2x32:
>/* This operation is not supported, or should have already been
> handled.
> */
>assert(!"Invalid ir opcode in glsl_to_tgsi_visitor::visit()");
> --
> 2.11.0
>
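
A reduced sketch of the pattern in the patch: every enumerator is listed in the switch, so -Wswitch has nothing to report when a new value is added and then handled explicitly. The enum and function below are hypothetical and much smaller than the real ir_expression opcode list.

```c
#include <assert.h>

/* Hypothetical opcode enum; the names are illustrative only. */
enum example_op {
   EXAMPLE_OP_ADD,
   EXAMPLE_OP_MUL,
   EXAMPLE_OP_PACK_2X32,
   EXAMPLE_OP_UNPACK_2X32
};

/* Returns 1 for ops handled here, 0 for ops expected to be lowered earlier. */
static int
example_visit(enum example_op op)
{
   switch (op) {
   case EXAMPLE_OP_ADD:
   case EXAMPLE_OP_MUL:
      return 1;

   /* Unsupported ops are listed explicitly rather than left to a default
    * label, so the compiler still flags enumerators added later. */
   case EXAMPLE_OP_PACK_2X32:
   case EXAMPLE_OP_UNPACK_2X32:
      return 0;
   }

   /* Only reachable with a value outside the enum. */
   assert(!"Invalid example_op");
   return 0;
}
```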

Re: [Mesa-dev] [PATCH 5/8] st/mesa: use correct return statement for a void function

2017-01-26 Thread Marek Olšák

Reviewed-by: Marek Olšák 

On Jan 26, 2017 2:21 PM, "Emil Velikov"  wrote:

> From: Emil Velikov 
>
> Analogous to previous commit.
>
> Signed-off-by: Emil Velikov 
> ---
>  src/mesa/state_tracker/st_atifs_to_tgsi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/state_tracker/st_atifs_to_tgsi.c
> b/src/mesa/state_tracker/st_atifs_to_tgsi.c
> index b28c55ceff..9c4218e672 100644
> --- a/src/mesa/state_tracker/st_atifs_to_tgsi.c
> +++ b/src/mesa/state_tracker/st_atifs_to_tgsi.c
> @@ -245,7 +245,8 @@ emit_arith_inst(struct st_translate *t,
>  struct ureg_dst *dst, struct ureg_src *args, unsigned
> argcount)
>  {
> if (desc->TGSI_opcode == TGSI_OPCODE_NOP) {
> -  return emit_special_inst(t, desc, dst, args, argcount);
> +  emit_special_inst(t, desc, dst, args, argcount);
> +  return;
> }
>
> ureg_insn(t->ureg, desc->TGSI_opcode, dst, 1, args, argcount);
> --
> 2.11.0
>
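
The change above can be reduced to the following sketch. ISO C does not allow a return statement with an expression in a function returning void, even when the expression itself has type void, so the call and the return are written separately. All names are hypothetical.

```c
#include <assert.h>

/* Running total, only here so the example has an observable effect. */
static int example_calls;

static void
example_special(int value)
{
   assert(value >= 0);   /* illustrative precondition */
   example_calls += value;
}

static void
example_emit(int is_special, int value)
{
   if (is_special) {
      /* Not "return example_special(value);": call first, then return. */
      example_special(value);
      return;
   }

   example_calls -= value;
}
```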

Re: [Mesa-dev] [PATCH] configure.ac: Require LLVM for r300 only on x86 and x86_64

2017-01-26 Thread Marek Olšák

Reviewed-by: Marek Olšák 

On Jan 26, 2017 12:13 PM, "Andreas Boll"  wrote:

> b3119a3 introduced a strict LLVM requirement for r300 on all
> architectures and thus configure fails on architectures where LLVM is
> not available or buggy.
>
> r300 doesn't strictly require LLVM, but for performance reasons we
> highly recommend LLVM usage. So require it at least on x86 and x86_64
> architectures as we have done before b3119a3.
>
> Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in
> gallium_require_llvm")
> Cc: 17.0 
> Signed-off-by: Andreas Boll 
> ---
>  configure.ac | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 64ace9d..b35adc8 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2213,6 +2213,19 @@ gallium_require_llvm() {
>  }
>
>  dnl
> +dnl r300 doesn't strictly require LLVM, but for performance reasons we
> +dnl highly recommend LLVM usage. So require it at least on x86 and x86_64
> +dnl architectures.
> +dnl
> +r300_require_llvm() {
> +case "$host" in *gnux32) return;; esac
> +case "$host_cpu" in
> +i*86|x86_64|amd64) gallium_require_llvm $1
> +;;
> +esac
> +}
> +
> +dnl
>  dnl DRM is needed by X, Wayland, and offscreen rendering.
>  dnl Surfaceless is an alternative for the last one.
>  dnl
> @@ -2298,7 +2311,7 @@ if test -n "$with_gallium_drivers"; then
>  HAVE_GALLIUM_R300=yes
>  PKG_CHECK_MODULES([RADEON], [libdrm_radeon >=
> $LIBDRM_RADEON_REQUIRED])
>  require_libdrm "r300"
> -gallium_require_llvm "r300"
> +r300_require_llvm "r300"
>  ;;
>  xr600)
>  HAVE_GALLIUM_R600=yes
> --
> 2.1.4
>
