Re: [Beignet] [PATCH 2/2] Utest: Add -cl-kernel-arg-info to the utest test_get_arg_info

2015-09-06 Thread Yang, Rong R
LGTM, pushed, thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > junyan...@inbox.com > Sent: Sunday, September 6, 2015 18:45 > To: beignet@lists.freedesktop.org > Cc: Junyan He > Subject: [Beignet] [PATCH 2/2] Utest: Add

Re: [Beignet] [PATCH] utests: fix test_get_arg_info fail

2015-09-06 Thread Yang, Rong R
Junyan also sent a patch for this. Because he split it into two patches, which makes it clearer, I pushed his patch. Thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Pan Xiuli > Sent: Sunday, September 6, 2015 11:30 > To:

[Beignet] [PATCH v2 2/4] Add extensions intel_accelerator and basic intel_motion_estimation.

2015-09-06 Thread Chuanbo Weng
v2: 1. Just upload the first vme_state. 2. Remove duplicated code in check_opt1_extension. 3. Check image format before cl_gpgpu_bind_image_for_vme. 4. Fix error of getting mv. Because we assume this kernel runs in SIMD16 mode, dword 0 of grf 1 should be __gen_ocl_region(8,vme_result.s0),

[Beignet] [PATCH v2 3/4] Add basic utest for block_motion_estimate_intel.

2015-09-06 Thread Chuanbo Weng
If the CL device does not support this builtin kernel, the test returns PASS. Signed-off-by: Guo Yejun --- utests/CMakeLists.txt | 1 + utests/utest_helper.hpp | 1 + 2 files changed, 2 insertions(+) diff --git a/utests/CMakeLists.txt b/utests/CMakeLists.txt index

[Beignet] [PATCH v2 4/4] Add document of video motion estimation support.

2015-09-06 Thread Chuanbo Weng
Signed-off-by: Chuanbo Weng --- docs/Beignet.mdwn | 1 + docs/howto/video-motion-estimation-howto.mdwn | 79 +++ 2 files changed, 80 insertions(+) create mode 100644 docs/howto/video-motion-estimation-howto.mdwn diff

Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-06 Thread Zhigang Gong
Is there any evidence that this optimization could bring actual improvement? I doubt it because it doesn't remove any instructions. Actually, if the %42 is not in the liveout set of the current BB, then the MOV could be removed; exactly the same optimization logic has been implemented in the GEN IR

[Beignet] [PATCH 1/3] add basic function to dump Selection IR

2015-09-06 Thread Guo Yejun
Selection IR is a representation between Gen IR and Gen ASM; it is almost a Gen instruction but *before* the register allocation. Only a basic dump is supported; it is not complete yet. Once finished, it can be refined as operator<< for the relevant classes. Signed-off-by: Guo Yejun

[Beignet] [PATCH 2/3] add basic structure for selection IR optimization

2015-09-06 Thread Guo Yejun
The idea is that many optimizations can be done at the selection IR level, which is nearly ISA-like *before* physical register allocation. The optimization here can help to reduce register use/spill. It is hard to do the optimization in the late ASM stage since the ASM instructions are encoded without

[Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-06 Thread Guo Yejun
It is done at the selection IR level, for instructions like: MOV(8) %42<2>:UB : %53<32,8,4>:UB ADD(8) %43<2>:B: %40<16,8,2>:B -%42<16,8,2>:B can be optimized as: MOV(8) %42<2>:UB : %53<32,8,4>:UB ADD(8)

[Beignet] [PATCH v2 1/4] Add built-in function __gen_ocl_vme.

2015-09-06 Thread Chuanbo Weng
__gen_ocl_vme is used for hardware accelerated video motion estimation. It gets payload values as parameters and uses MOV to pass these payload values to VME SEND Message's payload grfs. The int8 return value is used to store SEND Message writeback. v2: Remove 5 unnecessary parameters (src_grf*)

[Beignet] [PATCH v2 1/2] GBE: continue to refine interfering check.

2015-09-06 Thread Zhigang Gong
More aggressive interference check: even if both registers are in the Livein set or the Liveout set, they may still not interfere with each other. v2: The Liveout interference check needs to take care of those BBs which have only one register defined. For example: BBn: ... MOV %r1, %src ... Both

Re: [Beignet] [PATCH] add basic function to dump Selection IR

2015-09-06 Thread Guo, Yejun
Please ignore this patch, I did some minor changes and will re-send together with other patches. -Original Message- From: Guo, Yejun Sent: Monday, August 31, 2015 8:41 AM To: beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: [PATCH] add basic function to dump Selection IR Selection

Re: [Beignet] [PATCH 2/4] add extensions intel_accelerator and basic intel_motion_estimation

2015-09-06 Thread Song, Ruiling
> +if (kernel->vme) { > +fixed_local_sz[0] = 16; > +fixed_local_sz[1] = 1; Why is it 16? Does it work for all cases? > - if (global_work_size != NULL) > + if (kernel->vme) { > +fixed_global_sz[0] = (global_work_size[0]+15) / 16 * 16; > +fixed_global_sz[1] =

[Beignet] [PATCH] GBE: add check dumpASMFileName.empty()

2015-09-06 Thread Ruiling Song
Signed-off-by: Ruiling Song --- backend/src/backend/program.cpp | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp index d9e6416..e75e911 100644 ---

Re: [Beignet] [PATCH 2/4] add extensions intel_accelerator and basic intel_motion_estimation

2015-09-06 Thread Guo, Yejun
Regarding "fixed_local_sz[0] = 16", the reason is that the basic unit of VME hardware is 16*16 pixels, and our design is to handle 1*16 pixels in a work item, and use 16*1 as local size, so, each group is a basic unit of VME. For the extension concern "Is this a duplicate of code in
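The sizing described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names round_up_to_multiple, vme_fixed_sizes and VME_GROUP_WIDTH are hypothetical and not part of Beignet. Only the 16*1 local size and the round-up of global_work_size[0] to a multiple of 16 come from the quoted patch; the second global dimension is cut off in the quote, so it is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the VME basic unit is 16*16 pixels and each work
 * item handles 1*16 pixels, so 16 work items (a 16*1 group) cover one unit. */
enum { VME_GROUP_WIDTH = 16 };

/* Hypothetical helper: round n up to the next multiple of unit.
 * Equivalent to the (n + 15) / 16 * 16 expression when unit is 16. */
static size_t round_up_to_multiple(size_t n, size_t unit)
{
    assert(unit > 0);
    return (n + unit - 1) / unit * unit;
}

/* Hypothetical helper: compute the fixed local size (16*1) and the padded
 * first global dimension, so the global size divides evenly into groups. */
static void vme_fixed_sizes(size_t global_x,
                            size_t fixed_local_sz[2],
                            size_t *fixed_global_x)
{
    fixed_local_sz[0] = VME_GROUP_WIDTH;
    fixed_local_sz[1] = 1;
    *fixed_global_x = round_up_to_multiple(global_x, VME_GROUP_WIDTH);
}
```

With this rounding, a requested width that is already a multiple of 16 is left unchanged, and any other width is padded up to the next full group.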

[Beignet] [PATCH 1/2] GBE: continue to refine interfering check.

2015-09-06 Thread Zhigang Gong
More aggressive interference check: even if both registers are in the Livein set or the Liveout set, they may still not interfere with each other. Signed-off-by: Zhigang Gong --- backend/src/ir/value.cpp | 117 ++-

[Beignet] [PATCH 2/2] GBE: Fix one DAG analysis issue and enable multiple round phi copy elimination.

2015-09-06 Thread Zhigang Gong
Even if one value is killed in current BB, we still need to pass predecessor's definition into this BB. Otherwise, we will miss one definition. BB0: MOV %foo, %src0 BB1: MUL %foo, %src1, %f00 ... BR BB1 In the above case, both BB1 and BB0 are the predecessors of BB1. When pass the

[Beignet] [PATCH v2 4/5] GBE: add some dag helper routines to check registers' interfering.

2015-09-06 Thread Zhigang Gong
These helper functions will be used in further phi mov optimization. v2: remove the useless debug message code. Signed-off-by: Zhigang Gong --- backend/src/ir/value.cpp | 100 +++ backend/src/ir/value.hpp | 13 ++ 2 files

Re: [Beignet] [PATCH] utests: Added unit tests to test LLVM and ASM dump generation.

2015-09-06 Thread Song, Ruiling
LGTM Thanks! Ruiling > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Sirisha Gandikota > Sent: Wednesday, September 2, 2015 4:44 PM > To: Zou, Nanhai; beignet@lists.freedesktop.org > Cc: Gandikota, Sirisha > Subject: [Beignet] [PATCH]

[Beignet] [PATCH 2/2] Utest: Add -cl-kernel-arg-info to the utest test_get_arg_info

2015-09-06 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- utests/get_arg_info.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/utests/get_arg_info.cpp b/utests/get_arg_info.cpp index c1ea1ef..effeb64 100644 ---

[Beignet] [PATCH 1/2] Runtime: Add NULL pointer check in clGetKernelArgInfo

2015-09-06 Thread junyan . he
From: Junyan He There is no NULL pointer check for kernel->program->build_opts. This causes the utest test_get_arg_info to crash. In fact, we will add the -cl-kernel-arg-info flag for compiling every time, and so the arg info is always available. But some test case deliberately
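The kind of guard described above can be sketched in C as follows. This is a minimal illustration, not the actual patch: the helper name build_opts_contain is hypothetical, and it is an assumption that the runtime looks for the flag by searching the build options string. The point is only that a NULL options pointer is handled before any string search.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if the build options string contains the
 * given flag, 0 otherwise. A NULL build_opts (no options were supplied)
 * is treated as "flag not present" instead of being passed to strstr,
 * which would be undefined behavior. */
static int build_opts_contain(const char *build_opts, const char *flag)
{
    assert(flag != NULL);   /* the flag is a programmer-supplied constant */
    if (build_opts == NULL)
        return 0;
    return strstr(build_opts, flag) != NULL;
}
```

A caller could then write build_opts_contain(opts, "-cl-kernel-arg-info") without checking opts first.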