LGTM, pushed, thanks.
On Tue, Dec 24, 2013 at 01:40:16PM +0800, Yang Rong wrote:
It will trigger some bugs if the local size is not 1; will re-enable it after fixing
these bugs.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
src/cl_api.c | 6 +++---
1 file changed, 3 insertions(+), 3
This fix is ok for me. Just pushed.
Actually, if a backend instruction wants to use a flag register, it cannot
just set it as a source or destination register. You need to set the physicalFlag
to zero, and put the register's value in the flagIndex. Otherwise, when the
register fails to get a free
LGTM, pushed, thanks.
On Fri, Dec 27, 2013 at 05:15:36PM +0800, Yang Rong wrote:
Should return float, not long. Correct it.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/gen_convert.sh | 4 ++--
backend/src/ocl_convert.h | 4 ++--
2 files changed, 4 insertions(+), 4
Yi,
As we discussed, we need to remove these
generated kernel files with a make clean, and we need to regenerate them at the next
make. Otherwise, we need to do an extra cmake ., which is really not
convenient and does not comply with the normal build policy.
I'd like to wait for your next version.
Modified according to ruiling's comment and pushed. Thanks.
On Tue, Dec 31, 2013 at 06:04:48AM +, Song, Ruiling wrote:
One comment. The patch Tested OK.
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
This patch hit a utest regression:
compiler_long_convert_to_float:
compiler_long_convert_to_float()[FAILED]
Error: dst[i] == src[i]
at file /home/gongzg/git/fdo/beignet/utests/compiler_long_convert.cpp,
function compiler_long_convert_to_float, line 152
Could you take a look at it?
it to
other stage, I think it is meaningless.
-Original Message-
From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com]
Sent: Monday, December 23, 2013 6:32 PM
To: Yang, Rong R
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH 1/2] Fix convert long/ulong to float
On Wed, Dec 25, 2013 at 02:28:48AM +, Song, Ruiling wrote:
See my two comments
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong
Sent: Tuesday, December 24, 2013 2:29 PM
To: Yang, Rong R
Cc
LGTM, pushed, thanks.
On Tue, Dec 17, 2013 at 03:32:13PM +0800, Yang Rong wrote:
Convert the input to float, then convert the float back to the input type as c.
Compare the input with c; if the result does not match the rtz/rtp/rtn
requirement, adjust by +/- 1 ULP.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
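A minimal sketch of the round-trip idea for the rtz case, written in plain C rather than the OpenCL C of the patch. The names step_toward_zero and long_to_float_rtz are hypothetical, and the code assumes IEEE-754 single precision and the default round-to-nearest mode for the (float) cast; it is not the patch's actual code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move a nonzero finite float one ULP toward zero.
 * Assumes IEEE-754 binary32, where decrementing the bit pattern shrinks
 * the magnitude for both positive and negative values. */
static float step_toward_zero(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits -= 1u;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: convert a 64-bit integer to float, rounding toward zero. */
static float long_to_float_rtz(int64_t x)
{
    float f = (float)x;              /* default mode: round to nearest */

    /* Inputs close to INT64_MAX may round up to 2^63, which cannot be
     * converted back to int64_t, so handle that overshoot first. */
    if (f >= 0x1p63f)
        return step_toward_zero(f);

    int64_t c = (int64_t)f;          /* exact: f is integer-valued and in range */

    /* rtz requires |result| <= |x|; correct by one ULP if the cast overshot. */
    if ((x > 0 && c > x) || (x < 0 && c < x))
        f = step_toward_zero(f);
    return f;
}
```

The rtp and rtn cases would follow the same pattern, with the comparison and the step direction changed.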
allocate proxyvalue for %7, that's the root cause why
it triggers an assert when visiting the instruction %10 = add i32 %7, %6.
On Fri, Dec 20, 2013 at 09:35:27AM +0800, Zhigang Gong wrote:
We should allocate the register when we first visit an ExtractElement
instruction, as we may refer to the value before
Thanks for the contribution, just pushed this and the previous 8 patches.
On Fri, Dec 20, 2013 at 03:54:05PM +0800, Lv Meng wrote:
Signed-off-by: Lv Meng meng...@intel.com
---
backend/src/ocl_stdlib.tmpl.h | 39 ++-
1 file changed, 38 insertions(+), 1
.
Then this patch LGTM, will push it later. Thanks.
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Yang, Rong R
Sent: Tuesday, December 24, 2013 1:25 PM
To: Zhigang Gong
Cc: beignet@lists.freedesktop.org
Subject: Re
On Tue, Dec 24, 2013 at 05:27:43AM +, Yang, Rong R wrote:
One question.
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong
Sent: Friday, December 20, 2013 3:27 PM
To: beignet
, it only considers the liveOut
information. Actually, we also need to consider the liveIn information.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/backend/gen_context.hpp|4 ++
backend/src/backend/gen_reg_allocation.cpp | 10 +++-
backend/src/ir/liveness.cpp
Hi List,
Although I recently submitted a patch to set the noduplicate attribute on the
barrier function, it seems that there is still a problem.
Consider the following type of code:
Label0:
call barrier;
Label1:
.
br i1 %cmp0, label %label2, %label0
label2:
.
.
:
... (there is no definition of %7)
br label 2
label1:
%10 = add %7, %6
...
ret
label2:
%7 = ...
br label1
The value %7 is assigned after label2 but is referenced at label1.
From the control flow, the IR is valid, as the reference will
be executed after the assignment.
Signed-off-by: Zhigang Gong
Please ignore this version too. I just sent out another patch to
fix this problem better.
On Wed, Dec 18, 2013 at 02:47:59PM +0800, Zhigang Gong wrote:
Clang/llvm may generate some code similar to the following IRs:
... (there is no definition of %7)
br label 2
label1:
%10 = add %7
LGTM, pushed, thanks.
On Wed, Dec 18, 2013 at 06:43:59AM +, Song, Ruiling wrote:
Ping for review. Thanks!
-Original Message-
From: Song, Ruiling
Sent: Wednesday, December 11, 2013 2:38 PM
To: beignet@lists.freedesktop.org
Cc: Song, Ruiling
Subject: [PATCH] GBE: Fix logb
it fails to get one. And later, when emitting the assignment
instruction, it just ignores the duplicate allocation.
v2: handle the case when %7 is a proxyValue.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/llvm/llvm_gen_backend.cpp | 77 -
1 file
it fails to get one. And later, when emitting the assignment
instruction, it just ignores the duplicate allocation.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/llvm/llvm_gen_backend.cpp |9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/backend/src/llvm
, then the whole instruction seems to be a
constant as well.
If that is the case, we fail to get a valid instruction and may trigger
an assert. This patch changes the code to check another use of the local data to
avoid this assert.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/llvm
I found that the previous pass, the gvn pass, may generate new vector instructions.
We just defer the scalarize pass to make sure the gen pass will not encounter
unsupported non-scalar instructions.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/llvm/llvm_to_gen.cpp |2 +-
1
the user kernel, so we set
the pcm lib file in the LinkBitCodeFile field of the clang
instance.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/CMakeLists.txt | 29 -
backend/src/GBEConfig.h.in |1 +
backend/src/backend/program.cpp
...@lists.freedesktop.org] On Behalf Of Zhigang Gong
Sent: Wednesday, December 11, 2013 2:53 PM
To: beignet@lists.freedesktop.org
Cc: Gong, Zhigang
Subject: [Beignet] [PATCH] Accelerate utest.
For some test cases which include more than one kernel, the current
implementation always build
, will cause the size to be larger than the max alloc
size.
Enlarge the global memory size and use it to check the size when allocating.
Signed-off-by: Yang Rong rong.r.y...@intel.com
Reviewed-by: Zhigang Gong zhigang.g...@linux.intel.com
But from my point of view, this may not be a failure according
LGTM, pushed, thanks.
On Wed, Dec 11, 2013 at 11:09:27AM +0800, junyan...@inbox.com wrote:
From: Junyan He junyan...@linux.intel.com
In clang, the PCH file will be used as an AST source, so
the check is strict. The macro define is also checked,
and if anything is different, the PCH is
.
This patch reduces the utests execution time by 2/3.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
utests/compiler_abs.cpp |9 +++--
utests/compiler_abs_diff.cpp | 15 +
utests/compiler_basic_arithmetic.cpp | 40 --
utests
On Mon, Dec 02, 2013 at 05:10:27PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
src/cl_api.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/cl_api.c b/src/cl_api.c
index 54701aa..bc8ee1d 100644
--- a/src/cl_api.c
+++
.
The llvm/clang I am using is 3.3.1. What's yours?
On Wed, Dec 04, 2013 at 09:32:06AM +0100, Robert Jobbagy wrote:
I attached my output log.
2013/12/4 zhigang gong zhigang.g...@gmail.com
On Wed, Dec 4, 2013 at 3:23 AM, Robert Jobbagy
jobbagy.rob...@gmail.com wrote:
Thanks your help
LGTM, pushed, thanks.
On Mon, Dec 02, 2013 at 12:50:13PM +0800, Yang Rong wrote:
And also correct some UXX compares.
V2: Not use OCL_OPTIMIZE_IMMEDIATE for XOR and ORD compare.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/backend/gen_insn_selection.cpp | 10 --
__mesa_test_texobj_completeness(intel-ctx, obj);
After you remove/rename the mesa source code tree, you may need to clean the
beignet build
directory completely and do a clean build again.
Maybe I made some mistake?
2013/11/27 Zhigang Gong zhigang.g...@linux.intel.com
The root cause
LGTM, will push later. Thanks.
On Thu, Nov 28, 2013 at 04:37:22PM +0800, Yang Rong wrote:
Because B/UB is treated as W/UW, src1's type can't be set when there is a mismatch.
Set the correct type before getRegisterFromImmediate.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
I agree. This test case is useless now. Will push this patch later, thanks.
On Wed, Nov 27, 2013 at 04:40:30PM +0800, Yang Rong wrote:
This test only tries to allocate a buffer with a size larger than
CL_DEVICE_MAX_MEM_ALLOC_SIZE, and
asserts if the return status is not CL_INVALID_BUFFER_SIZE. But in
The root cause of this is that the cl_khr_gl_sharing extension depends
on a specific mesa version:
92e6260c1960f78692417433206c38170ec1a625
You can find your mesa source code tree. It should be in ${HOME}/mesa.
Or you can check the CMakeCache.txt in your build directory to find
the following
If the llvm version is something like 3.3.1, the previous cmake script
will generate an incorrect cflag, -DLLVM_33 1, which breaks the build.
This commit also updates the stable llvm version from 3.1 to 3.3.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
CMake/FindLLVM.cmake |2
is that it reroutes all the output
of the clang execution to an internal buffer and doesn't print to the
console directly. If the user wants to get the detailed build log,
CL_PROGRAM_BUILD_LOG can be used.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/backend/program.cpp | 53
LGTM, will push later. Thanks.
On Fri, Nov 22, 2013 at 07:51:56PM +0800, Yang Rong wrote:
Use convert instruction in ir, and ALU1 in gen selection.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/backend/gen/gen_mesa_disasm.c | 2 ++
backend/src/backend/gen_context.cpp
Good idea to implement the UXX instruction. LGTM, will push later. Thanks.
On Mon, Nov 25, 2013 at 03:08:08PM +0800, Yang Rong wrote:
And also correct some UXX compares.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/backend/gen_insn_selection.cpp | 3 +-
On Wed, Nov 27, 2013 at 06:44:04AM +, Yang, Rong R wrote:
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong
Sent: Wednesday, November 27, 2013 10:47 AM
To: beignet@lists.freedesktop.org
Cc
LGTM, will push later, thanks.
On Wed, Nov 27, 2013 at 02:06:50PM +0800, Yang Rong wrote:
When creating an image, alignment will cause the size to be larger than the max alloc
size.
Enlarge the global memory size and use it to check the size when allocating.
Signed-off-by: Yang Rong rong.r.y...@intel.com
CL_KERNEL_PRIVATE_MEM_SIZE is not implemented. This patch fixes
this issue and passes the piglit test case.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
src/cl_command_queue_gen7.c |2 +-
src/cl_device_id.c | 54 +--
src
This patch can pass piglit test case cl-api-create-buffer.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
src/cl_mem.c | 26 +++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/src/cl_mem.c b/src/cl_mem.c
index 8c9f8a8..8639502 100644
--- a/src
. It will fail
the size checking logic, then we fix up its type to sampler there.
As this workaround only takes effect when an error occurs, it will
not bring too many side effects to the normal cases. And it
passes the existing test cases.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src
Pushed, thanks.
On Fri, Nov 15, 2013 at 11:40:30AM +0800, Yang Rong wrote:
When doing the LOADI/compare - compare optimization, IMM src1 will use the LOADI type,
but LOADI doesn't care about signed or unsigned. The compare type should be used.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
LGTM, pushed, thanks.
On Tue, Nov 12, 2013 at 05:17:13PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/ocl_stdlib.tmpl.h | 103
--
1 file changed, 59 insertions(+), 44 deletions(-)
diff --git
LGTM, pushed, thanks.
On Thu, Nov 14, 2013 at 11:14:33AM +0800, Yang Rong wrote:
Add mov bool support.
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/backend/gen_insn_selection.cpp | 23 ++-
backend/src/ir/instruction.cpp | 2 ++
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/ocl_stdlib.tmpl.h | 20 +++-
kernels/test_copy_image.cl|8 +---
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/backend/src/ocl_stdlib.tmpl.h b/backend/src/ocl_stdlib.tmpl.h
index
From: Zhigang Gong zhigang.g...@linux.intel.com
Need to keep consistency between the constant data
allocation and the constant register allocation.
So we need to skip the unused constant data at the
constant data allocation stage.
To avoid possible mismatching, add a new assert in
the constant
@lists.freedesktop.org]
On Behalf Of Zhigang Gong
Sent: Wednesday, November 13, 2013 8:35 AM
To: beignet@lists.freedesktop.org
Cc: Zhigang Gong
Subject: [Beignet] [PATCH 1/2] GBE: remove all vstore macros for constant
memory space.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/ocl_stdlib.tmpl.h
When a kernel has __attribute__((reqd_work_group_size(X, Y, Z))) qualifier,
the kernel will only accept that group size.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/backend/program.cpp |9 +
backend/src/backend/program.h |4
backend/src
When a kernel has __attribute__((reqd_work_group_size(X, Y, Z))) qualifier,
the kernel will only accept that group size.
v2: add binary load/store support.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/backend/program.cpp | 17 -
backend/src/backend
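A minimal host-side sketch of the check this attribute implies, assuming the required X, Y, Z values have already been read from the compiled kernel. The function local_size_allowed and the all-zeros convention for "no attribute" are hypothetical and not Beignet's actual code; in the OpenCL API, a mismatching enqueue is reported as CL_INVALID_WORK_GROUP_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: required[] holds the X, Y, Z from
 * reqd_work_group_size, or all zeros when the kernel has no such attribute. */
static bool local_size_allowed(const size_t required[3], const size_t local[3])
{
    if (required[0] == 0 && required[1] == 0 && required[2] == 0)
        return true;                 /* no attribute: nothing to enforce here */

    for (int i = 0; i < 3; i++)
        if (local[i] != required[i])
            return false;            /* the enqueue should be rejected */
    return true;
}
```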
LGTM, thanks.
On Fri, Nov 08, 2013 at 10:51:34AM +0800, Homer Hsing wrote:
If a parameter is NaN, return the other parameter.
Signed-off-by: Homer Hsing homer.x...@intel.com
---
backend/src/ocl_stdlib.tmpl.h | 4
1 file changed, 4 insertions(+)
diff --git
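A minimal sketch of that NaN rule for a max-style function, in plain C. The snippet does not say which builtin the patch touches, so the name max_nan_aware is hypothetical and the choice of "max" is an assumption made only for illustration.

```c
#include <math.h>

/* Hypothetical max that follows the rule above: if one argument is NaN,
 * return the other one. With two NaN arguments the result is still NaN. */
static float max_nan_aware(float a, float b)
{
    if (isnan(a))
        return b;
    if (isnan(b))
        return a;
    return a > b ? a : b;
}
```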
I agree with you that using thread data is better than locking.
One comment: how about using thread local storage to simplify
this patch, as below:
struct _cl_command_queue {
...
__thread cl_gpgpu gpgpu;
...
};
Then in the initialization stage, set it to NULL;
queue->gpgpu = NULL;
In the head
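A minimal sketch of the thread local storage idea, under one stated deviation: GCC's __thread and C11's _Thread_local apply to variables with static storage duration, not to struct members, so this sketch keeps the per-thread pointer at file scope. The type gpgpu_state and the accessor names are hypothetical stand-ins, not Beignet's cl_gpgpu code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-thread GPGPU state. */
typedef struct gpgpu_state { int id; } gpgpu_state;

/* One pointer per thread; it starts out NULL in every thread. */
static _Thread_local gpgpu_state *thread_gpgpu = NULL;

static gpgpu_state *get_thread_gpgpu(void)
{
    return thread_gpgpu;
}

static void set_thread_gpgpu(gpgpu_state *g)
{
    thread_gpgpu = g;                /* visible only to the calling thread */
}

/* Example object so the accessors can be exercised. */
static gpgpu_state example_state = { 1 };
```

Because each thread sees its own copy of thread_gpgpu, no lock is needed around reads and writes of the pointer itself.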
instruction's argument, and all the subsequent instruction's
argument 1 is free to change.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
backend/src/backend/context.cpp|6 +++---
backend/src/backend/gen_insn_selection.cpp |2 +-
backend/src/llvm/llvm_gen_backend.cpp
It's better to never modify the hstride manually. Always modify it through the h2
method.
And in the h2 method, we need to check whether the register is scalar, and only
set h2 for non-scalar registers.
Otherwise, we have many chances to make mistakes with this scalar/non-scalar
stride issue.
This patch is good for
am using llvm 3.4-svn. On llvm 3.4-svn this patch works fine ...
-Original Message-
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong
Sent: Thursday, November 7, 2013 2:41 PM
To: Yang, Rong R
Cc: Xing, Homer
.
I have sent the patches, welcome any comments. Thanks!
Ruiling
-Original Message-
From: Song, Ruiling
Sent: Monday, November 04, 2013 4:24 PM
To: Zhigang Gong
Cc: beignet@lists.freedesktop.org
Subject: RE: [Beignet] [PATCH] GBE: disable MulAdd pattern in instruction
selection
From: Zhigang Gong zhigang.g...@intel.com
sizeof(str) already includes the '\0', we don't need to add
1 to it.
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
src/cl_gt_device.h |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/cl_gt_device.h b/src/cl_gt_device.h
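A small C illustration of the point about sizeof, assuming the string is a char array rather than a pointer (sizeof on a pointer would give the pointer size instead). The names example_name and name_size are hypothetical.

```c
#include <stddef.h>
#include <string.h>

static const char example_name[] = "Example Device";  /* hypothetical string */

/* Bytes needed to hand the string to a caller, terminator included. */
static size_t name_size(void)
{
    return sizeof(example_name);     /* already counts the trailing '\0' */
}
```

Here strlen(example_name) is 14 and sizeof(example_name) is 15, so adding 1 to sizeof would report one byte too many.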
From: Zhigang Gong zhigang.g...@intel.com
Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
src/cl_gen75_device.h |2 +-
src/cl_gen7_device.h |2 +-
src/cl_gt_device.h|6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/src/cl_gen75_device.h b/src
Pushed, thanks.
On Thu, Oct 31, 2013 at 11:12:54AM +0800, Homer Hsing wrote:
converting a data type to same type ...
Signed-off-by: Homer Hsing homer.x...@intel.com
---
backend/src/gen_convert.sh | 3 ---
backend/src/ocl_convert.h | 40
2 files
Now we have the scalar version of bitselect, so we
enable the vector version in the def file. Also remove
some comments.
Signed-off-by: Zhigang Gong zhigang.g...@linux.intel.com
---
backend/src/builtin_vector_proto.def |6 +-
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git
the output of fast_normalize when the
input is NaN.
Homer
-Original Message-
From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com]
Sent: Tuesday, October 29, 2013 10:50 AM
To: Xing, Homer
Cc: Lu, Guanqun; beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH] fix built
For most vector relational builtin functions, we need to
return -1 if the element result is true, and return 0 if the element
result is false. So we can simply put a - in front of the scalar
version of the function for each element.
Reported by Yang Rong.
Signed-off-by: Zhigang Gong zhigang.g
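A minimal sketch of the negation trick in plain C, using isnan as the example relational function. The struct types float4 and int4 and the helper names are hypothetical stand-ins for the OpenCL vector types, not the code in the patch.

```c
#include <math.h>

/* Hypothetical stand-ins for OpenCL's float4 and int4. */
typedef struct { float x, y, z, w; } float4;
typedef struct { int x, y, z, w; } int4;

static float4 make_float4(float x, float y, float z, float w)
{
    float4 v = { x, y, z, w };
    return v;
}

/* Scalar version: returns 1 for true and 0 for false. */
static int isnan_scalar(float v)
{
    return isnan(v) ? 1 : 0;
}

/* Vector version: negating the scalar result gives -1 for true, 0 for false. */
static int4 isnan_vec4(float4 v)
{
    int4 r = { -isnan_scalar(v.x), -isnan_scalar(v.y),
               -isnan_scalar(v.z), -isnan_scalar(v.w) };
    return r;
}
```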
LGTM, pushed the whole series, thanks.
On Mon, Oct 28, 2013 at 02:02:15PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/ocl_stdlib.tmpl.h | 8
1 file changed, 8 insertions(+)
diff --git a/backend/src/ocl_stdlib.tmpl.h
Your guess may be correct that the snapshot here is generated dynamically.
We will find a good place to host our official release image next time.
Thanks for your reminder.
On Mon, Oct 28, 2013 at 09:59:27AM +0100, Mario Kicherer wrote:
Well, it does for me:
# rm Release_v0.3.tar.gz; wget
LGTM, will push later. Thanks.
On Mon, Oct 21, 2013 at 03:47:56PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
kernels/compiler_async_copy.cl | 38 +++
utests/compiler_async_copy.cpp | 86
+-
2 files
Pushed, thanks.
On Fri, Oct 18, 2013 at 03:11:29PM +0800, Ruiling Song wrote:
Also refine Undef value support.
Signed-off-by: Ruiling Song ruiling.s...@intel.com
---
backend/src/llvm/llvm_gen_backend.cpp | 12
kernels/compiler_global_constant.cl | 15 +--
2
Pushed, thanks.
On Mon, Oct 21, 2013 at 03:26:02PM +0800, Homer Hsing wrote:
Signed-off-by: Homer Hsing homer.x...@intel.com
---
utests/builtin_convert_sat.cpp | 23 +--
1 file changed, 13 insertions(+), 10 deletions(-)
diff --git a/utests/builtin_convert_sat.cpp
Pushed, thanks.
On Mon, Oct 21, 2013 at 03:48:13PM +0800, Ruiling Song wrote:
Signed-off-by: Ruiling Song ruiling.s...@intel.com
---
src/cl_api.c |3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/cl_api.c b/src/cl_api.c
index 0410e93..0e562ed 100644
---
Hi Yi,
Do you have any update for this patch's testing? Thanks.
On Thu, Sep 26, 2013 at 04:20:04PM +0800, Zhigang Gong wrote:
Ruiling told me that llvm3.4 has not been released yet, so we may not switch
to LLVM 3.4 immediately.
But we will switch to it eventually. For now, just test
=linux.intel@lists.freedesktop.org]
On Behalf Of Sun, Yi
Sent: Monday, October 14, 2013 3:31 PM
To: Zhigang Gong; Xing, Homer
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH] support LLVM 3.4
Sorry not yet. So is it necessary?
I'm just worrying about if it can roll back successfully
Pushed, thanks.
On Fri, Oct 11, 2013 at 10:43:41AM +0800, junyan...@inbox.com wrote:
From: Junyan He junyan...@linux.intel.com
Signed-off-by: Junyan He junyan...@linux.intel.com
---
src/intel/intel_gpgpu.c |1 -
1 file changed, 1 deletion(-)
diff --git a/src/intel/intel_gpgpu.c
Pushed, thanks.
On Thu, Oct 10, 2013 at 03:13:50PM +0800, Ruiling Song wrote:
As Clang treats local variables in a similar way to global constants
(they are treated as global variables, each in its own address space),
we refine the previous constant implementation in order to
share the same code between
Pushed, thanks.
On Wed, Oct 09, 2013 at 04:14:48PM +0800, Homer Hsing wrote:
this patch passes following piglit test case
piglit/framework/../bin/cl-program-tester
generated_tests/cl/builtin/relational/builtin-float-isnan-1.0.generated.cl
Signed-off-by: Homer Hsing homer.x...@intel.com
On Wed, Oct 09, 2013 at 02:36:25PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
src/cl_api.c | 22 --
src/cl_program.c | 21 +
src/cl_program.h | 3 +++
3 files changed, 44 insertions(+), 2 deletions(-)
diff
LGTM, thanks.
On Wed, Oct 09, 2013 at 02:36:26PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
backend/src/backend/program.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/backend/src/backend/program.cpp
LGTM, thanks.
On Wed, Oct 09, 2013 at 02:36:27PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
src/cl_api.c | 25 -
1 file changed, 25 deletions(-)
diff --git a/src/cl_api.c b/src/cl_api.c
index 42948e8..71bab32 100644
---
Pushed, thanks.
On Sat, Sep 28, 2013 at 06:39:50AM +0200, Simon Richter wrote:
The ICD loader expects the first member of any dispatchable object to be
the dispatch table.
Signed-off-by: Simon Richter simon.rich...@hogyros.de
---
src/cl_mem.h |2 +-
1 file changed, 1 insertion(+), 1
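A minimal sketch of why the member order matters, assuming a loader that reads the dispatch pointer from the start of every object without knowing the object's full type. All struct and function names here are hypothetical; the real ICD dispatch table has many more entries.

```c
#include <stddef.h>

/* Hypothetical one-entry dispatch table. */
struct dispatch_table {
    int (*retain)(void *object);
};

/* Hypothetical dispatchable object. */
struct example_mem {
    const struct dispatch_table *dispatch;   /* must stay the first member */
    size_t size;
};

_Static_assert(offsetof(struct example_mem, dispatch) == 0,
               "dispatch table must be the first member");

/* What a loader effectively does: treat the object as a pointer to its
 * first member and read the dispatch table from there. */
static const struct dispatch_table *loader_dispatch_of(void *object)
{
    return *(const struct dispatch_table *const *)object;
}

static int example_retain(void *object)
{
    (void)object;
    return 0;
}

static const struct dispatch_table example_table = { example_retain };
static struct example_mem example_buffer = { &example_table, 64 };
```

If another member were placed before dispatch, loader_dispatch_of would read unrelated bytes as a table pointer, which is the kind of failure a one-line reorder fixes.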
On Sun, Oct 06, 2013 at 01:30:15AM -0400, Matt Harvey wrote:
Hi
I'm having trouble and was wondering if you guys had any insight. This
simple kernel:
__constant sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE |
CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;
Try to change the above
, September 26, 2013 1:34 PM
To: Zhigang Gong; Yang, Rong R
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH] Remove global offset need divide by local
size restriction.
I can take this bug. But currently I don't have enough information.
Zhigang, do you know more details about how
this patch.
-Original Message-
From: beignet-bounces+zhigang.gong=linux.intel@lists.freedesktop.org
[mailto:beignet-bounces+zhigang.gong=linux.intel@lists.freedesktop.org]
On Behalf Of Sun, Yi
Sent: Thursday, September 26, 2013 2:23 PM
To: Zhigang Gong; Xing, Homer
Cc: beignet
BTW, please use commit id 81a7569b... to reproduce this bug.
On Thu, Sep 26, 2013 at 04:14:59PM +0800, Zhigang Gong wrote:
You need to focus on the specific uint64 instruction in
reg_insn_selection.cpp, and analyze the src/dst register carefully,
Two possible root causes there:
1. Wrongly
this requirement. So I have to make this change.
Signed-off-by: Zhigang Gong zhigang.g...@linux.intel.com
---
backend/src/backend/context.cpp| 52 ++--
backend/src/backend/context.hpp|5 +-
backend/src/backend/gen_reg_allocation.cpp | 93
Per the OpenCL spec, using read_imagei on a float image may cause
undefined behaviour. We fix up all types to int type.
Signed-off-by: Zhigang Gong zhigang.g...@linux.intel.com
---
src/cl_mem.c | 43 +++
1 file changed, 39 insertions(+), 4 deletions(-)
diff
is allocated among the normal
registers.
2. The register interval analyzing could handle the image/sampler information
correctly.
Signed-off-by: Zhigang Gong zhigang.g...@linux.intel.com
---
backend/src/backend/context.cpp| 34 +++-
backend/src/backend
Pushed, thanks.
On Sun, Sep 22, 2013 at 06:00:00AM +, Yang, Rong R wrote:
LGTM, test pass.
-Original Message-
From: beignet-bounces+rong.r.yang=intel@lists.freedesktop.org
[mailto:beignet-bounces+rong.r.yang=intel@lists.freedesktop.org] On
Behalf Of Homer Hsing
Sent:
The patch LGTM, and Guanqun's comment makes sense to me. I will push it with
this change.
Thanks.
On Mon, Sep 23, 2013 at 07:06:02AM +, Lu, Guanqun wrote:
I would suggest adding { } to the first if clause. Because it has a
nested if statement, that would be much cleaner.
This is a workaround. And I'm afraid it may not work as you expected.
For example, a 1D task has 1024 work items, and the offset is 512.
We set the local group size to 16. As you hard coded the work offset to
0 for the GPGPU walker, the gpgpu walker will dispatch 1024/16 = 64
work
Please ignore my previous comment. I misunderstood and thought that the offset would
affect the gpgpu thread dispatching, which actually doesn't work that way.
So this patch LGTM, will push it later. Thanks.
On Mon, Sep 23, 2013 at 02:04:08PM +0800, Yang Rong wrote:
Set the global offset to 0 in the walker, and
This patchset LGTM, will push it later. Thanks.
On Mon, Sep 23, 2013 at 03:06:23PM +0800, Lu Guanqun wrote:
Signed-off-by: Lu Guanqun guanqun...@intel.com
---
src/cl_mem_gl.c |6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/src/cl_mem_gl.c b/src/cl_mem_gl.c
This patch and the following patch introduce a new API to the cl lib.
It's better to put it into include/CL/cl_intel.h, and the naming rule
is to follow the other non-standard APIs, clXXXIntel:
clMapBufferIntel(cl_mem, cl_int*);
And it's better to provide at least one unit test to show the usage
Hi guys,
I just pushed this patch, and it causes a regression as below:
compiler_abs_diff_long16:
compiler_abs_diff_long16()[FAILED]
Error: !memcmp(buf_data[2], cpu_diff, sizeof(T) * n)
at file /home/gongzg/git/fdo/beignet/utests/compiler_abs_diff.cpp, function
Nice catch. Will push it later. Thanks.
On Mon, Sep 23, 2013 at 03:19:42PM +0800, Homer Hsing wrote:
Fix 64-bit writing when the data register is scalar.
This patch makes some piglit test cases pass.
Signed-off-by: Homer Hsing homer.x...@intel.com
---
backend/src/backend/gen_context.cpp | 2 +-
Pushed, thanks.
On Wed, Sep 11, 2013 at 11:21:37AM +0800, Homer Hsing wrote:
v2:
keep highest carry bit
tested by piglit test cases:
piglit/framework/../bin/cl-program-tester
generated_tests/cl/builtin/int/builtin-ulong-rhadd-1.0.generated.cl
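A minimal sketch of what "keep highest carry bit" can mean for a 64-bit rhadd, which is defined as (x + y + 1) >> 1 computed without overflow. This is plain C with a hypothetical function name, not the Gen instruction sequence from the patch.

```c
#include <stdint.h>

/* Hypothetical 64-bit rounded halving add: (a + b + 1) >> 1, computed as if
 * the intermediate sum had 65 bits. */
static uint64_t rhadd_u64(uint64_t a, uint64_t b)
{
    uint64_t sum = a + b;                /* wraps modulo 2^64 */
    uint64_t carry = sum < a;            /* 1 if a + b needed a 65th bit */

    uint64_t sum1 = sum + 1;             /* rounding increment */
    carry |= (uint64_t)(sum1 < sum);     /* the +1 can also produce the carry */

    /* Shift right by one and put the carry back as the top bit. */
    return (sum1 >> 1) | (carry << 63);
}
```

Dropping the carry would turn rhadd_u64(UINT64_MAX, UINT64_MAX) into a value with the top bit cleared instead of UINT64_MAX.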
Pushed, thanks.
On Fri, Sep 13, 2013 at 09:41:02AM +0800, Homer Hsing wrote:
version 2:
improve algorithm to convert signed integer
fix source operand type in llvm_gen_backend
enable predicate in addWithCarry
change test case to test signed integer
Signed-off-by: Homer Hsing
Two comments:
1. It seems that you use flag 1 directly at the code generation stage. This may
bring some conflict with the current BB. It's better to do this type of thing at the
instruction selection stage and let the allocator allocate the flag register.
2. Is it possible to implement this builtin
Boqun,
LGTM, thanks for your contribution, pushed.
On Tue, Sep 17, 2013 at 11:41:50AM +0800, Boqun Feng wrote:
In some distros, python is linked to python3 not
python2, and GBE can't be built on such distros
without modification.
CMake provides a variable PYTHON_EXECUTABLE.
By default,
LGTM, pushed, thanks.
On Tue, Sep 17, 2013 at 04:10:00PM +0800, Yang Rong wrote:
Signed-off-by: Yang Rong rong.r.y...@intel.com
---
src/cl_event.c | 40 ++--
src/cl_event.h | 2 +-
2 files changed, 35 insertions(+), 7 deletions(-)
diff --git
This patchset LGTM, pushed, thanks.
On Wed, Sep 18, 2013 at 10:18:42AM +0800, Ruiling Song wrote:
struct/vector/array of vector/struct of array/array of struct.
Also fix a bug 'constant index into constant array get wrong result'
brought in by patch 'Fix non-4byte program global constant
LGTM, pushed, thanks.
On Tue, Sep 17, 2013 at 04:10:01PM +0800, Yang Rong wrote:
Add some event info to cl_command_queue.
One is non-complete user events, used to block marker event and barrier.
After these events become CL_COMPLETE, the events blocked by these events also
become CL_COMPLETE,