From: Junyan He
Signed-off-by: Junyan He
---
backend/src/gbe_bin_generater.cpp | 20 +++-
src/CMakeLists.txt | 8 +++-
src/GetGenID.sh | 2 ++
utests/CMakeLists.txt |7 ++-
4 files changed, 34 insertions(+), 3 dele
From: Luo
creates an array of sub-devices that each reference a non-intersecting
set of compute units within in_device, according to a partition scheme
given by properties.
---
src/cl_api.c | 10 --
src/cl_device_id.c | 6 ++
src/cl_device_id.h | 7 +++
src/cl_gt_device.h
isScalarOrBool is a legacy function that was used when bool was
treated as a scalar register by default. Now that we use a normal
vector word register to represent bool, we no longer need to
keep this macro. Replace all of its uses with isScalarReg.
Signed-off-by: Zhigang Gong
---
backend/src/back
From: Luo
1. remove repeated user events in list.
2. missed braces in loops.
3. fix barrier event reference not increased.
---
src/cl_alloc.c | 1 +
src/cl_event.c | 111 -
src/cl_event.h | 4 +++
3 files changed, 75 insertions(+), 41 de
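Item 1 above (removing repeated user events from a list) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct demo_event` and `dedup_events` are hypothetical names, and the list is assumed to be a plain array of pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event object; not Beignet's cl_event. */
struct demo_event { int id; };

/* Remove repeated pointers from list[0..n) in place, keeping the first
 * occurrence of each entry. Returns the new length.
 * Quadratic, which is acceptable for short wait lists. */
size_t dedup_events(struct demo_event **list, size_t n)
{
    size_t kept = 0;
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t j;
        /* Look for list[i] among the entries already kept. */
        for (j = 0; j < kept; j++) {
            if (list[j] == list[i])
                break;
        }
        if (j == kept)              /* not seen before: keep it */
            list[kept++] = list[i];
    }
    return kept;
}
```

The braces around both loop bodies are deliberate, in the spirit of item 2.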
From: Luo
Separate the kernel code from the host code to make it clean; build the
kernels offline with gbe_bin_generator to improve performance.
---
src/CMakeLists.txt | 23 ++-
src/cl_context.h | 16 +-
src/cl_mem.c
Will update 2 places in updated patch.
Luo Xionghu
Best Regards
-Original Message-
From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
xionghu@intel.com
Sent: Monday, May 12, 2014 12:41 PM
To: beignet@lists.freedesktop.org
Cc: Luo, Xionghu
Subject: [Beignet] [P
From: Luo
Separate the kernel code from the host code to make it clean; build the
kernels offline with gbe_bin_generator to improve performance.
---
src/CMakeLists.txt | 23 ++-
src/cl_context.h | 24 ++-
src/cl_gt_device.h
From: Luo
1. remove repeated user events in list.
2. missed braces in loops.
3. fix barrier event reference not increased.
---
src/cl_alloc.c | 1 +
src/cl_event.c | 111 -
src/cl_event.h | 4 +++
3 files changed, 75 insertions(+), 41 de
HSW's scratch buffer alignment and the index set in the VFE state differ
from IVB's.
The per-thread stack offset is also calculated from R0.0's FFTID, and the
definition of FFTID has changed in HSW as well.
With this patch, all utests pass.
Signed-off-by: Yang Rong
---
backend/src/backend/context.cpp
Just pushed, thanks for the contribution.
On Wed, May 07, 2014 at 06:02:32PM +0800, junyan...@inbox.com wrote:
> From: Junyan He
>
> V2 for HSW enabling.
> rebase to the current master branch 42136987b9925396ad138cc2493bed8ab11cbe35
> Major Modification:
> 1. Separate the gen_context and gen_enc
The atomic msg type should be GEN75_P1_UNTYPED_ATOMIC_OP. Correct it.
Signed-off-by: Yang Rong
---
backend/src/backend/gen75_encoder.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/backend/src/backend/gen75_encoder.cpp
b/backend/src/backend/gen75_encoder.cpp
index 9e758d
Each work group has its own SLM offset. On IVB, TSG handles it
automatically when dispatching threads, but this fails on HSW.
On inspection, every work group's SLM offset is 0, even though the SLM
index in R0.0 is correct. So calculate the SLM offset from the SLM index
and add it to the SLM address.
TODO: need to
mov_df_imm should set nomask, and setHeader needs to handle the
exec_width=4 case.
Signed-off-by: Yang Rong
---
backend/src/backend/gen75_encoder.cpp | 1 +
backend/src/backend/gen75_encoder.hpp | 5 -
backend/src/backend/gen_encoder.cpp | 3 +++
backend/src/backend/gen_encoder.hpp | 4
Because LRI commands will be converted to NOOP, add the I915_EXEC_ENABLE_SLM
flag to the DRM kernel driver to enable SLM in the L3. Set the flag when the
application uses SLM. Still keep the L3 config in the batch buffer for fulsim.
Also create and use OpenCL's own context when executing, to avoid affect
Per the OCL spec, if the arg_value of clSetKernelArg is a memory object, it can
be NULL or point to NULL. The driver only handles the NULL case and crashes if
arg_value points to NULL.
Correct it.
Signed-off-by: Yang Rong
---
src/cl_kernel.c | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git
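The two cases described in this commit message can be told apart with a check along these lines. This is a minimal sketch, not the code from the patch: `demo_mem_t` is a hypothetical stand-in for `cl_mem`, and `mem_arg_is_null` is an illustrative helper name, so the example builds without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cl_mem (an opaque pointer handle). */
typedef struct demo_mem *demo_mem_t;

/* Returns 1 when a memory-object argument means "no buffer":
 * either arg_value itself is NULL, or it points to a NULL handle.
 * Checking arg_value first avoids dereferencing a NULL pointer. */
int mem_arg_is_null(const void *arg_value, size_t arg_size)
{
    assert(arg_size == sizeof(demo_mem_t));
    if (arg_value == NULL)
        return 1;
    return *(const demo_mem_t *)arg_value == NULL;
}
```

Only when both checks fail is it safe to treat the dereferenced value as a live memory object.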
The previous pipe control doesn't work because it doesn't advance the batch
buffer, so the value set in intel_gpgpu_pipe_control is only flushed later.
Fix it.
Signed-off-by: Yang Rong
---
src/intel/intel_gpgpu.c | 4 ++--
src/intel/intel_structs.h | 8 ++--
2 files changed, 8 inser
HSW: byte scattered read/write requires the buffer size to be a multiple
of 4 bytes.
So simply align all buffer sizes to 4. This passes utest
compiler_function_constant0.
Because it is a very light workaround, align without checking the device.
Signed-off-by: Yang Rong
---
src/cl_mem.c
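Rounding a buffer size up to a multiple of 4, as described above, is usually done with a mask. The helper below is a sketch of that arithmetic only; `align_up` is an illustrative name and is not claimed to be the helper used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align.
 * align must be a non-zero power of two (4 in the HSW case above). */
size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

Sizes that are already a multiple of 4 are returned unchanged, which fits the commit message's point that applying the alignment on every device is cheap.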