Re: [Beignet] [PATCH V2] backend: add global immediate optimization

2017-06-12 Thread Wang, Rander
-----Original Message-----
From: Song, Ruiling
Sent: Tuesday, June 13, 2017 10:24 AM
To: Wang, Rander ; beig...@freedesktop.org
Cc: Wang, Rander
Subject: RE: [Beignet] [PATCH V2] backend: add global immediate optimization

> + else if

Re: [Beignet] [PATCH V2] backend: add global immediate optimization

2017-06-12 Thread Song, Ruiling
> + else if (src0.type == GEN_TYPE_D || src0.type == GEN_TYPE_UD)
> + {
> +int s0 = src0.value.d;
> +if (src0.absolute)
> + s0 = fabs(s0);

Here I think it should be abs(s0), right?

> +if (src0.negation)
> + s0 = -s0;
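To illustrate the review comment: fabs() returns a floating-point value, so assigning its result to an int converts the value to floating point and back. A minimal sketch of doing the same modifier folding purely in integer arithmetic follows. ImmModifiers and applyModifiers are hypothetical names made up for this example, not Beignet code, and widening to 64 bits is an assumption of the sketch rather than something the patch does.

```cpp
#include <cstdint>

// Hypothetical stand-in for the absolute/negation flags on src0 in the
// quoted patch; not a Beignet type.
struct ImmModifiers {
  bool absolute;
  bool negation;
};

// Fold the source modifiers into an integer immediate without going through
// floating point. Absolute value is applied first, then negation, in the same
// order as the quoted hunk. The value is widened to 64 bits so that
// |INT32_MIN| stays representable (an assumption of this sketch).
inline int64_t applyModifiers(int32_t value, ImmModifiers m) {
  int64_t s = value;
  if (m.absolute && s < 0)
    s = -s;
  if (m.negation)
    s = -s;
  return s;
}
```

With both flags set, a value of -5 first becomes 5 and is then negated back to -5, which is the behaviour the two if statements in the hunk describe.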

[Beignet] [PATCH v4 7/7] Optimize clEnqueueWriteImageByKernel and clEnqueuReadImageByKernel.

2017-06-12 Thread yan . wang
From: Yan Wang

1. Only copy the data defined by origin and region.
2. Add clFinish to guarantee that the kernel copy has finished when the write is blocking.

Signed-off-by: Yan Wang
---
 src/cl_api_mem.c | 20 ++--
 1 file changed, 14

[Beignet] [PATCH v4 3/7] Add utest to test writing data into large image (TILE_Y) by map/unmap and USE_HOST_PTR mode.

2017-06-12 Thread yan . wang
From: Yan Wang

Signed-off-by: Yan Wang
---
 utests/runtime_use_host_ptr_large_image.cpp | 115
 1 file changed, 115 insertions(+)

diff --git a/utests/runtime_use_host_ptr_large_image.cpp

[Beignet] [PATCH v4 6/7] Fix bug of clEnqueueUnmapMemObjectForKernel and clEnqueueMapImageByKernel.

2017-06-12 Thread yan . wang
From: Yan Wang

1. Support writing data in mapping/unmapping mode.
2. Add mapping record logic.
3. Add clFinish to guarantee that the kernel copy has finished.
4. Fix the error in the call to clEnqueueMapImageByKernel: blocking_map and map_flags need to be swapped.

[Beignet] [PATCH v4 5/7] Add clFinish for guarantee the kernel copying is finished when create TILE_Y large image.

2017-06-12 Thread yan . wang
From: Yan Wang

Signed-off-by: Yan Wang
---
 src/cl_mem.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/cl_mem.c b/src/cl_mem.c
index 3f41fd8..b6dce3f 100644
--- a/src/cl_mem.c
+++ b/src/cl_mem.c
@@ -817,6 +817,13 @@

[Beignet] [PATCH v4 4/7] Add cl_mem_record_map_mem_for_kernel() for record map adress for TILE_Y image by kernel copying.

2017-06-12 Thread yan . wang
From: Yan Wang

Signed-off-by: Yan Wang
---
 src/cl_mem.c | 109 +--
 src/cl_mem.h | 5 +++
 2 files changed, 88 insertions(+), 26 deletions(-)

diff --git a/src/cl_mem.c b/src/cl_mem.c

[Beignet] [PATCH v4 2/7] Add utest to test writing data into large image (TILE_Y) by map/unmap mode.

2017-06-12 Thread yan . wang
From: Yan Wang

It is used to reproduce the clCopyImage/clFillImage bug from the conformance test.

Signed-off-by: Yan Wang
---
 utests/compiler_copy_large_image.cpp | 198 +++
 1 file changed, 198 insertions(+)

[Beignet] [PATCH v4 1/7] Add utest case for filling image by small region.

2017-06-12 Thread yan . wang
From: Yan Wang

It is used to reproduce the allocation bug from the conformance test.

Signed-off-by: Yan Wang
---
 utests/compiler_fill_large_image.cpp | 50
 1 file changed, 50 insertions(+)

diff --git

[Beignet] [PATCH V3] backend: add global immediate optimization

2017-06-12 Thread rander.wang
There are some global immediates in the global variable list of LLVM. These immediates can be integrated into instructions. For the compiler_global_immediate_optimized test in utest, there are two global immediates:

L0:
 MOV(1) %42<0>:UD : 0x0:UD