Accessing global memory as uchar16/char16 fully utilizes memory bandwidth, so change CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR from 8 to 16.

Three OpenCV test cases speed up with this patch:
  OCL_ThreshFixture_Threshold, 25% improvement
  OCL_MaxFixture_Max, 105% improvement
  OCL_MinFixture_Min, 105% improvement

Signed-off-by: Chuanbo Weng <chuanbo.w...@intel.com>
---
 src/cl_gt_device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cl_gt_device.h b/src/cl_gt_device.h
index 37abfd2..ed19f10 100644
--- a/src/cl_gt_device.h
+++ b/src/cl_gt_device.h
@@ -24,7 +24,7 @@
 .max_1d_global_work_sizes = {1024 * 1024 * 256, 1, 1},
 .max_2d_global_work_sizes = {8192, 8192, 1},
 .max_3d_global_work_sizes = {8192, 8192, 2048},
-.preferred_vector_width_char = 8,
+.preferred_vector_width_char = 16,
 .preferred_vector_width_short = 8,
 .preferred_vector_width_int = 4,
 .preferred_vector_width_long = 2,
-- 
1.9.1

_______________________________________________
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet
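
For readers wondering why a device-info value can change benchmark results: an application can read the preferred width with clGetDeviceInfo(..., CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR, ...) and size its kernels around it. The sketch below is a plain C illustration of that idea, not OpenCV's actual selection logic. The helpers pick_char_vector_width and work_items_for are hypothetical names introduced for this example, and the choice of candidate widths is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: choose how many char elements each work-item
 * processes. `preferred` stands in for the value a host would read via
 * clGetDeviceInfo(CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR). Assumption:
 * the caller only wants a width that divides the element count evenly,
 * and the 3-element vector type is ignored for simplicity. */
static unsigned pick_char_vector_width(unsigned preferred, size_t n_elems)
{
    static const unsigned candidates[] = {16, 8, 4, 2, 1};
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; ++i) {
        unsigned w = candidates[i];
        if (w <= preferred && n_elems % w == 0)
            return w;
    }
    return 1; /* scalar fallback, e.g. when the device reports 0 */
}

/* Hypothetical helper: number of work-items needed when each one
 * handles `width` elements, rounding up for a partial last vector. */
static size_t work_items_for(size_t n_elems, unsigned width)
{
    assert(width > 0);
    return (n_elems + width - 1) / width;
}
```

Under these assumptions, a 1920-element row needs 240 work-items when the reported width is 8 and 120 when it is 16, with each work-item issuing one 16-byte access instead of one 8-byte access.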