[Beignet] [PATCH v3 2/3] enable create image 2d from buffer in clCreateImage.

2015-09-08 Thread xionghu . luo
From: Luo Xionghu This patch allows creating a 2D image from a CL buffer with zero copy. v2: should use reference counting to manage the release of the buffer and image. After creation, the buffer reference count is 2 and the image reference count is 1. If the image is released first, decrease the image reference
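The reference counts described in this patch can be illustrated with a small sketch. It is not Beignet's code: buffer_t, image_t and all four functions are hypothetical stand-ins, and the release order shown (the image drops its hold on the buffer when its own count reaches zero) is an assumption, since the message above is cut off before it describes that step.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the runtime's buffer and image objects. */
typedef struct buffer { int ref_n; } buffer_t;
typedef struct image  { int ref_n; buffer_t *backing; } image_t;

/* A new buffer starts with one reference, owned by the caller. */
buffer_t *buffer_create(void) {
  buffer_t *b = malloc(sizeof *b);
  if (b) b->ref_n = 1;
  return b;
}

/* Drops one reference; frees the buffer at zero. Returns the remaining count. */
int buffer_release(buffer_t *b) {
  assert(b != NULL && b->ref_n > 0);
  int left = --b->ref_n;
  if (left == 0) free(b);
  return left;
}

/* The image shares the buffer's storage (zero copy) and takes a reference on it. */
image_t *image_create_from_buffer(buffer_t *b) {
  assert(b != NULL);
  image_t *img = malloc(sizeof *img);
  if (!img) return NULL;
  img->ref_n = 1;
  img->backing = b;
  b->ref_n++;            /* buffer count becomes 2 */
  return img;
}

/* Assumed behavior: when the image dies, it gives back its buffer reference. */
int image_release(image_t *img) {
  assert(img != NULL && img->ref_n > 0);
  int left = --img->ref_n;
  if (left == 0) {
    buffer_release(img->backing);
    free(img);
  }
  return left;
}
```

With this arrangement either object can be released first without leaving the other one pointing at freed storage.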

[Beignet] [PATCH v3 1/3] return 32 could gain 0.2% performance on opencv optical flow case.

2015-09-08 Thread xionghu . luo
From: Luo Xionghu Signed-off-by: Luo Xionghu --- src/cl_gt_device.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_gt_device.h b/src/cl_gt_device.h index bd87cc4..a51843d 100644 --- a/src/cl_gt_device.h +++ b/src/cl_gt_device.h @@ -39,7 +39,7 @@ .native_vector_width

[Beignet] [PATCH v3 3/3] add utest for creating 2d image from buffer.

2015-09-08 Thread xionghu . luo
From: Luo Xionghu v2: check cl_khr_image2d_from_buffer support first; use CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT to allocate memory. Signed-off-by: Luo Xionghu --- utests/CMakeLists.txt| 1 + utests/image_from_buffer.cpp | 83 2 files cha

Re: [Beignet] [PATCH] GBE: fix build error with LLVM 3.5 and previous version.

2015-09-08 Thread Yang, Rong R
LGTM, pushed, thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Zhigang Gong > Sent: Wednesday, September 9, 2015 09:08 > To: beignet@lists.freedesktop.org > Cc: Gong, Zhigang > Subject: [Beignet] [PATCH] GBE: fix build error with LL

Re: [Beignet] [PATCH 2/3] add bswap64 for gen7/gen75 and gen8 seperately.

2015-09-08 Thread Luo, Xionghu
A LONG type variable is not a uniform register, so there is no need to add the simd == 1 logic, and the uniform variable is already handled in it. Luo Xionghu Best Regards -Original Message- From: Yang, Rong R Sent: Tuesday, September 8, 2015 3:00 PM To: Luo, Xionghu; beignet@lists.freedesktop.o

[Beignet] [PATCH] GBE: fix build error with LLVM 3.5 and previous version.

2015-09-08 Thread Zhigang Gong
Signed-off-by: Zhigang Gong --- backend/src/backend/program.cpp | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp index 330bead..57a5037 100644 --- a/backend/src/backend/program.cpp +++ b/backend/src/backend

[Beignet] [PATCH 12/19] Backend: Add profiling registers into curbe.

2015-09-08 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- backend/src/backend/gen_context.cpp | 17 + backend/src/backend/program.h |6 ++ 2 files changed, 23 insertions(+) diff --git a/backend/src/backend/gen_context.cpp b/backend/src/backend/gen_context.cpp index 696d86a.

[Beignet] [PATCH 16/19] Backend: Avoid CALC_TIMESTAMP and STORE_PROFILING being scheduled.

2015-09-08 Thread junyan . he
From: Junyan He We do not want CALC_TIMESTAMP and STORE_PROFILING to be scheduled with other instructions, because it will get the wrong timestamps. Signed-off-by: Junyan He --- backend/src/backend/gen_insn_scheduling.cpp |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git

[Beignet] [PATCH 15/19] Backend: Fix two bugs about curbe related pointer.

2015-09-08 Thread junyan . he
From: Junyan He 1. Rename __gen_ocl_timestamp_buf to __gen_ocl_profiling_buf 2. printfbptr, printfiptr and profilingbptr should be 64 bits on BDW and later platforms, so just set them to QWORD. Signed-off-by: Junyan He --- backend/src/ir/profile.cpp|6 +++--- backend/src/llvm/llv

[Beignet] [PATCH 10/19] Backend: Add a auxiliary function to convert GenReg to uniform.

2015-09-08 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- backend/src/backend/gen_register.hpp |9 + 1 file changed, 9 insertions(+) diff --git a/backend/src/backend/gen_register.hpp b/backend/src/backend/gen_register.hpp index 4f37e30..9e9e0e4 100644 --- a/backend/src/backend/gen_register.hpp

[Beignet] [PATCH 07/19] Backend: Insert store_profiling before lowed return.

2015-09-08 Thread junyan . he
From: Junyan He After the lowering return pass, a new block which just has one RET instruction will be generated, and all RET INSTs in the middle will be replaced by BRA INST. We want our store_profiling instruction to be inserted just before that return instruction and out of any condition bloc

[Beignet] [PATCH 14/19] Runtime: Bind the profiling buffer when profiling enabled.

2015-09-08 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- src/cl_command_queue.c |8 ++ src/cl_command_queue_gen7.c | 37 +++ src/cl_driver.h | 16 src/cl_driver_defs.c|5 src/intel/intel_gpgpu.c | 58 ++

[Beignet] [PATCH 18/19] Backend; Implement emitCalcTimestampInstruction in GenContext.

2015-09-08 Thread junyan . he
From: Junyan He We will maintain a real clock to record the real execution time of the original code. We do not want to introduce overhead because of adding the profiling instructions, so every time we enter the profiling instructions block, we will calculate the real time clock value and update the

[Beignet] [PATCH 17/19] Backend: Add ADD_ and SUB_ timestamps help functions.

2015-09-08 Thread junyan . he
From: Junyan He The timestamps are calculated by Long type. Before BDW, there is no Long type support and we use i32 operations to implement them. Signed-off-by: Junyan He --- backend/src/backend/gen8_context.cpp | 24 +++ backend/src/backend/gen8_context.hpp |2 ++ backend/s
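As an illustration of the idea in this message, 64-bit addition and subtraction can be built from 32-bit operations by propagating a carry or borrow between the two halves. The sketch below is plain C, not the GenContext helper code from the patch; the names u64_pair, add64, sub64 and check_pair_ops are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves. */
typedef struct { uint32_t lo, hi; } u64_pair;

/* a + b using only 32-bit arithmetic. */
u64_pair add64(u64_pair a, u64_pair b) {
  u64_pair r;
  r.lo = a.lo + b.lo;                 /* wraps modulo 2^32 */
  r.hi = a.hi + b.hi + (r.lo < a.lo); /* carry out of the low half */
  return r;
}

/* a - b using only 32-bit arithmetic. */
u64_pair sub64(u64_pair a, u64_pair b) {
  u64_pair r;
  r.lo = a.lo - b.lo;
  r.hi = a.hi - b.hi - (a.lo < b.lo); /* borrow from the high half */
  return r;
}

/* Self-check against native 64-bit arithmetic. */
void check_pair_ops(uint64_t x, uint64_t y) {
  u64_pair a = { (uint32_t)x, (uint32_t)(x >> 32) };
  u64_pair b = { (uint32_t)y, (uint32_t)(y >> 32) };
  u64_pair s = add64(a, b);
  u64_pair d = sub64(a, b);
  assert((((uint64_t)s.hi << 32) | s.lo) == x + y);
  assert((((uint64_t)d.hi << 32) | d.lo) == x - y);
}
```

The comparison r.lo < a.lo detects the wrap-around of the low word, which is the only information the high word needs from it.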

[Beignet] [PATCH 19/19] Backend: Implement StoreProfilingInstruction in GenContext.

2015-09-08 Thread junyan . he
From: Junyan He Offset 0 of the profiling buffer contains the log number. We will use an atomic instruction to increment it every time a log is generated. We will generate one log for each HW GPU thread. The log contains the XYZ range of global work items which are executed on this thread, the EU id,

[Beignet] [PATCH 11/19] Backend: Add profilingProlog function for GenContext.

2015-09-08 Thread junyan . he
From: Junyan He The profilingProlog will collect useful information for profiling, including XYZ global range and prolog timestamp. Signed-off-by: Junyan He --- backend/src/backend/gen_context.cpp | 116 +++ backend/src/backend/gen_context.hpp |2 + 2 files

[Beignet] [PATCH 04/19] Backend: Add profiling registers to curbe.

2015-09-08 Thread junyan . he
From: Junyan He 1. Add five timestamp registers and one pointer register into curbe. The five timestamp registers will hold all the profiling timestamp information, including 20 uint timestamps for each point, 1 ulong prolog holding the start time and 1 ulong epilog holding the

[Beignet] [PATCH 05/19] Backend: Add ProfilingInfo to Unit.

2015-09-08 Thread junyan . he
From: Junyan He The Unit will hold profiling information. The profiling information may be needed throughout the whole backend processing, so it is suitable to add it to unit. Signed-off-by: Junyan He --- backend/src/ir/unit.cpp |6 +- backend/src/ir/unit.hpp | 10 ++ 2 files c

[Beignet] [PATCH 13/19] Add profiling info APIs to runtime.

2015-09-08 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- backend/src/backend/program.cpp | 26 +- backend/src/backend/program.h | 11 +++ backend/src/backend/program.hpp | 22 ++ backend/src/gbe_bin_interpreter.cpp |4 src/cl_

[Beignet] [PATCH 09/19] Backend: Add CalcTimestamp and StoreProfiling to insn selection.

2015-09-08 Thread junyan . he
From: Junyan He Signed-off-by: Junyan He --- backend/src/backend/gen_context.cpp|9 ++ backend/src/backend/gen_context.hpp|2 + .../src/backend/gen_insn_gen7_schedule_info.hxx|2 + backend/src/backend/gen_insn_selection.cpp | 140

[Beignet] [PATCH 08/19] Backend: Add IVAR OCL_PROFILING_LOG to control profiling log.

2015-09-08 Thread junyan . he
From: Junyan He We add OCL_PROFILING_LOG as an int type, because there may be different types of profiling format in the future. Signed-off-by: Junyan He --- backend/src/backend/gen_context.hpp |3 +++ backend/src/backend/gen_program.cpp |9 - backend/src/backend/gen_program

[Beignet] [PATCH 00/19 V2] Add Profiling support in beignet.

2015-09-08 Thread junyan . he
From: Junyan He The profiling support is enabled by this patch set. The profiling information is as follows: -- Log 0 -- | fix functions id: 7 simd: 16 kernel id:0 | | thread id: 0 EU id: 1 half slice id: 0 | | disp

[Beignet] [PATCH 02/19] Backend: Add StoreProfiling and CalcTimestamp instructions

2015-09-08 Thread junyan . he
From: Junyan He Add two instructions for profiling usage. CalcTimestamp will calculate the timestamps and update the timestamp in the corresponding slot. StoreProfiling will store the information to the buffer and generate logs. Signed-off-by: Junyan He --- backend/src/ir/instruction.cpp | 96 ++

[Beignet] [PATCH 06/19] Backend: Add CalcTimestamp and StoreProfiling.

2015-09-08 Thread junyan . he
From: Junyan He When in profiling, the profiling inserter function will insert calc_timestamp for each point which we are interested in. At the end of the kernel, just before return, we will insert a store_profiling function call. The function will hold a reference to the global val profiling_buf

[Beignet] [PATCH 03/19] Backend: Add ProfilingInserter and a new function pass.

2015-09-08 Thread junyan . he
From: Junyan He When the user enables the profiling feature, we need to insert extra instructions to record and store the timestamps. For now, the function pass will just insert the required instructions at the head of the first 20 blocks. Later, we will support inserting timestamps at any point in the code.

[Beignet] [PATCH 01/19] Backend: Add ProfilingInfo class to ir.

2015-09-08 Thread junyan . he
From: Junyan He ProfilingInfo will play an important role in outputting the profiling log. It will record the profiling information and generate the logs after clFinish. Signed-off-by: Junyan He --- backend/src/CMakeLists.txt |2 + backend/src/ir/profiling.cpp | 70 ++ bac

Re: [Beignet] [PATCH] Use __attribute__((destructor)), not atexit(3).

2015-09-08 Thread Yang, Rong R
It seems gcc/clang/icc support __attribute__((destructor)), but I still have two comments. Thanks for your contribution. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Koop Mast > Sent: Thursday, August 27, 2015 18:45 > To: beignet@lists.f
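For readers unfamiliar with the attribute under discussion, here is a minimal sketch of cleanup registered with __attribute__((destructor)) instead of atexit(3). It is not the patch itself: release_global_state, on_unload and cleanup_runs are hypothetical names, and the attribute is a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <stdio.h>

/* Counts how many times the cleanup body actually ran. */
int cleanup_runs = 0;

/* Idempotent cleanup: safe to call explicitly and again at unload time. */
void release_global_state(void) {
  if (cleanup_runs > 0) return;  /* already released */
  cleanup_runs++;
  /* ... free global resources here ... */
}

/* Runs after main() returns or exit() is called. In a shared library it
   also runs when the library is unloaded. No atexit() call is needed. */
__attribute__((destructor))
static void on_unload(void) {
  release_global_state();
  assert(cleanup_runs == 1);
  fputs("global state released\n", stderr);
}
```

Making the cleanup idempotent keeps it safe when a caller releases the state explicitly before the destructor runs.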

Re: [Beignet] [PATCH] GBE: add check dumpASMFileName.empty()

2015-09-08 Thread Yang, Rong R
LGTM, pushed, thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Ruiling Song > Sent: Sunday, September 6, 2015 15:12 > To: beignet@lists.freedesktop.org > Cc: Song, Ruiling > Subject: [Beignet] [PATCH] GBE: add check dumpASMFileName.

Re: [Beignet] [PATCH] GBE: Use addRemappedFile to avoid creating temporary cl source file.

2015-09-08 Thread Yang, Rong R
Ok, pushed. Thanks. > -Original Message- > From: Gong, Zhigang > Sent: Tuesday, September 8, 2015 16:28 > To: Yang, Rong R; Luo, Xionghu; beignet@lists.freedesktop.org > Subject: RE: [Beignet] [PATCH] GBE: Use addRemappedFile to avoid creating > temporary cl source file. > > > > > -

Re: [Beignet] [PATCH] utests: Added unit tests to test LLVM and ASM dump generation.

2015-09-08 Thread Yang, Rong R
Pushed, thanks. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Song, Ruiling > Sent: Sunday, September 6, 2015 15:05 > To: Gandikota, Sirisha; Zou, Nanhai; beignet@lists.freedesktop.org > Cc: Gandikota, Sirisha > Subject: Re: [Beignet] [P

Re: [Beignet] [PATCH] GBE: Use addRemappedFile to avoid creating temporary cl source file.

2015-09-08 Thread Gong, Zhigang
> -Original Message- > From: Yang, Rong R > Sent: Tuesday, September 8, 2015 4:11 PM > To: Gong, Zhigang; Luo, Xionghu; beignet@lists.freedesktop.org > Subject: RE: [Beignet] [PATCH] GBE: Use addRemappedFile to avoid creating > temporary cl source file. > > Is this remapped file virtual

Re: [Beignet] [PATCH] GBE: Use addRemappedFile to avoid creating temporary cl source file.

2015-09-08 Thread Yang, Rong R
Is this remapped file a virtual file? If it is not a virtual file, I am afraid it is not thread/process safe. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > Gong, Zhigang > Sent: Tuesday, September 8, 2015 14:33 > To: Luo, Xionghu; beignet

Re: [Beignet] [PATCH] generate MOV instruction at selection stage when do simd_shuffle with imm value.

2015-09-08 Thread Guo, Yejun
Ping for review, thanks. -Original Message- From: Guo, Yejun Sent: Friday, August 28, 2015 7:06 AM To: beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: [PATCH] generate MOV instruction at selection stage when do simd_shuffle with imm value. the earlier the instruction is generated,

Re: [Beignet] [PATCH 2/3] add bswap64 for gen7/gen75 and gen8 seperately.

2015-09-08 Thread Yang, Rong R
It seems you don't handle simd == 1 long/ulong case. > -Original Message- > From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of > xionghu@intel.com > Sent: Thursday, August 13, 2015 14:28 > To: beignet@lists.freedesktop.org > Cc: Luo, Xionghu > Subject: [Beignet]