Re: [Beignet] [PATCH v2 1/4] Add built-in function __gen_ocl_vme.

2015-09-07 Thread Weng, Chuanbo
Please ignore this patchset; I will send out a new version soon.
-----Original Message-----
From: Weng, Chuanbo
Sent: Monday, September 07, 2015 13:01
To: beignet@lists.freedesktop.org
Cc: Weng, Chuanbo
Subject: [PATCH v2 1/4] Add built-in function __gen_ocl_vme.
__gen_ocl_vme is used for

[Beignet] [PATCH v3 2/4] Add extensions intel_accelerator and basic intel_motion_estimation.

2015-09-07 Thread Chuanbo Weng
v2:
1. Just upload the first vme_state.
2. Remove duplicated code in check_opt1_extension.
3. Check image format before cl_gpgpu_bind_image_for_vme.
4. Fix the error in getting mv. Because we assume this kernel runs in SIMD16 mode, dword 0 of grf 1 should be __gen_ocl_region(8,vme_result.s0),

[Beignet] [PATCH v3 1/4] Add built-in function __gen_ocl_vme.

2015-09-07 Thread Chuanbo Weng
__gen_ocl_vme is used for hardware-accelerated video motion estimation. It takes payload values as parameters and uses MOV to pass them to the payload grfs of the VME SEND message. The int8 return value stores the SEND message writeback.
v2: Remove 5 unnecessary parameters (src_grf*)

[Beignet] [PATCH v3 4/4] Add document of video motion estimation support.

2015-09-07 Thread Chuanbo Weng
Signed-off-by: Chuanbo Weng
---
 docs/Beignet.mdwn | 1 +
 docs/howto/video-motion-estimation-howto.mdwn | 79 +++
 2 files changed, 80 insertions(+)
 create mode 100644 docs/howto/video-motion-estimation-howto.mdwn
diff

Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-07 Thread Guo, Yejun
An improvement is expected from the optimization, since some instructions are removed. As mentioned in the commit log, this patch itself does not remove any instructions; it modifies some instructions to make the removal possible. GenWriter::removeMOVs() did the work inside

Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-07 Thread Zhigang Gong
Right, only the instructions created in the instruction selection stage can be optimized here. There is no need to iterate over all the instructions, and no need for multiple rounds of checking. One round is enough, and it only needs to check the temporary registers (those live only in this BB). If any register is in live
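
For readers following the thread, here is a minimal sketch of what a one-round local copy propagation over a single basic block might look like. It is not the patch's code: the `Insn` record, the `propagate_copies` function and the `live_out` set are hypothetical names made up for illustration, and treating "temporary" as "not in the block's live-out set" is an assumption. As in the patch description, the pass only rewrites operands and leaves the MOVs in place for a later removal step.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Insn:
    """Hypothetical three-address instruction: op dst, srcs..."""
    op: str
    dst: str
    srcs: List[str]


def propagate_copies(block: List[Insn], live_out: Set[str]) -> List[Insn]:
    """Single forward pass of copy propagation over one basic block.

    Only MOVs whose destination is a block-local temporary (assumed here
    to mean: not in live_out) are recorded as copies. No instruction is
    deleted; sources are rewritten so the MOVs may become dead.
    """
    copies: Dict[str, str] = {}  # temp dst -> original src, still valid
    out: List[Insn] = []
    for insn in block:
        # Replace each source by the register it was copied from, if any.
        srcs = [copies.get(s, s) for s in insn.srcs]
        dst = insn.dst
        # A write to dst invalidates every copy that reads or defines it.
        copies = {d: s for d, s in copies.items() if d != dst and s != dst}
        if insn.op == "MOV" and dst not in live_out and srcs[0] != dst:
            copies[dst] = srcs[0]
        out.append(Insn(insn.op, dst, srcs))
    return out


block = [
    Insn("MOV", "t1", ["a"]),
    Insn("ADD", "b", ["t1", "c"]),  # t1 can be read as a here
    Insn("MOV", "a", ["d"]),        # a is overwritten, so t1 -> a is dropped
    Insn("MUL", "e", ["t1", "b"]),  # must keep reading t1
]
result = propagate_copies(block, live_out={"b", "e"})
```

One pass is sufficient in this sketch because the copy table is updated as the block is walked in order, so each use sees the copies valid at that point.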

Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-07 Thread Guo, Yejun
Yes, there will be a penalty for the case in your example. I read several documents on local copy propagation, and none mentioned this case. :( As for the method of iterating over new instructions/registers, it requires adding a 'new' flag during the GenIR to SelectionIR period, since the current

Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation

2015-09-07 Thread Zhigang Gong
> -----Original Message-----
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> Guo, Yejun
> Sent: Monday, September 7, 2015 8:27 PM
> To: Zhigang Gong; beignet@lists.freedesktop.org
> Subject: Re: [Beignet] [PATCH 3/3] add optimization for local copy propagation
>
>