LGTM, pushed, thanks.
> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> junyan...@inbox.com
> Sent: Sunday, September 6, 2015 18:45
> To: beignet@lists.freedesktop.org
> Cc: Junyan He
> Subject: [Beignet] [PATCH 2/2] Utest: Add
Junyan also sent a patch for this. Because he split it into two patches, which
makes it clearer, I pushed his patch instead. Thanks.
> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> Pan Xiuli
> Sent: Sunday, September 6, 2015 11:30
> To:
v2:
1. Just upload the first vme_state.
2. Remove duplicated code in check_opt1_extension.
3. Check image format before cl_gpgpu_bind_image_for_vme.
4. Fix the error in getting the mv. Because we assume this kernel runs in SIMD16
mode, dword 0 of grf 1 should be
__gen_ocl_region(8,vme_result.s0),
If the CL device does not support this builtin kernel, the test returns
PASS.
Signed-off-by: Guo Yejun
---
utests/CMakeLists.txt | 1 +
utests/utest_helper.hpp | 1 +
2 files changed, 2 insertions(+)
diff --git a/utests/CMakeLists.txt b/utests/CMakeLists.txt
index
Signed-off-by: Chuanbo Weng
---
docs/Beignet.mdwn | 1 +
docs/howto/video-motion-estimation-howto.mdwn | 79 +++
2 files changed, 80 insertions(+)
create mode 100644 docs/howto/video-motion-estimation-howto.mdwn
diff
Is there any evidence that this optimization brings an actual improvement?
I doubt it, because it doesn't reduce the number of instructions.
Actually, if %42 is not in the liveout set of the current BB, then the MOV
could be removed;
exactly the same optimization logic has been implemented in the GEN IR
Selection IR is a representation between Gen IR and Gen ASM; it is
almost a Gen instruction, but *before* register allocation.
Only a basic dump is supported for now; it is not complete yet. Once finished,
it can be refined as operator<< for the relevant classes.
Signed-off-by: Guo Yejun
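A minimal sketch of what an operator<< based dump could look like. SelInsn and
dumpToString are hypothetical names invented for illustration; they are not
Beignet's real selection classes, and the output format is an assumption
loosely modelled on the MOV/ADD lines quoted elsewhere in this thread.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a selection-IR instruction.
// Registers are still virtual (%N) because register allocation has not run.
struct SelInsn {
  std::string opcode;     // e.g. "MOV", "ADD"
  int execWidth;          // e.g. 8 or 16
  int dst;                // virtual destination register
  std::vector<int> srcs;  // virtual source registers
};

// Stream one instruction as "OPCODE(width) %dst : %src0 %src1 ...".
std::ostream &operator<<(std::ostream &os, const SelInsn &insn) {
  os << insn.opcode << "(" << insn.execWidth << ") %" << insn.dst << " :";
  for (int src : insn.srcs) os << " %" << src;
  return os;
}

// Convenience helper: render one instruction into a string.
std::string dumpToString(const SelInsn &insn) {
  std::ostringstream os;
  os << insn;
  return os.str();
}

// Usage example.
void dumpExample() {
  SelInsn mov{"MOV", 8, 42, {53}};
  assert(dumpToString(mov) == "MOV(8) %42 : %53");
}
```

Overloading operator<< keeps the dump usable with any std::ostream, so the same
code can write to stderr, a file, or a string in a test.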
The idea is that many optimizations can be done at the selection IR level,
which is nearly ISA-like but *before* physical register allocation. The
optimization here can help to reduce register use/spill.
It is hard to do the optimization in the late ASM stage since the ASM
instructions are encoded without
It is done at the selection IR level. For instructions like:
MOV(8) %42<2>:UB : %53<32,8,4>:UB
ADD(8) %43<2>:B: %40<16,8,2>:B -%42<16,8,2>:B
can be optimized as:
MOV(8) %42<2>:UB : %53<32,8,4>:UB
ADD(8)
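The general idea discussed in this thread (forward a MOV's source into later
uses, then drop the MOV if its destination is not in the liveout set of the
block) can be sketched on a toy IR. Everything below is an assumption made for
illustration: Insn and propagateMov are hypothetical names, and the sketch
ignores regions, types, and source modifiers such as <16,8,2>:B or negation,
which the real pass has to respect.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: one destination, any number of sources.
struct Insn {
  std::string op;
  int dst;
  std::vector<int> srcs;
};

// Within one basic block, replace reads of a MOV's destination by the MOV's
// source, and erase the MOV once nothing needs its destination any more.
std::vector<Insn> propagateMov(std::vector<Insn> block,
                               const std::set<int> &liveOut) {
  std::size_t i = 0;
  while (i < block.size()) {
    const bool isCopy = block[i].op == "MOV" && block[i].srcs.size() == 1 &&
                        block[i].srcs[0] != block[i].dst;
    if (!isCopy) { ++i; continue; }
    const int copy = block[i].dst;
    const int orig = block[i].srcs[0];
    bool origIntact = true;      // orig still holds the copied value
    bool copyRead = false;       // some read of copy could not be rewritten
    bool copyRedefined = false;  // copy was overwritten later in the block
    for (std::size_t j = i + 1; j < block.size() && !copyRedefined; ++j) {
      for (int &src : block[j].srcs) {  // reads happen before the write
        if (src != copy) continue;
        if (origIntact) src = orig; else copyRead = true;
      }
      if (block[j].dst == copy) copyRedefined = true;
      else if (block[j].dst == orig) origIntact = false;
    }
    // The MOV is dead if nobody reads copy and its value cannot escape.
    const bool dead = !copyRead && (copyRedefined || liveOut.count(copy) == 0);
    if (dead) block.erase(block.begin() + static_cast<std::ptrdiff_t>(i));
    else ++i;
  }
  return block;
}

// Usage example with the register numbers from the snippet above.
void propagateExample() {
  std::vector<Insn> bb = {{"MOV", 42, {53}}, {"ADD", 43, {40, 42}}};
  std::vector<Insn> out = propagateMov(bb, {43});  // %42 is not live out
  assert(out.size() == 1 && out[0].op == "ADD");
  assert(out[0].srcs[1] == 53);
}
```

When %42 is in the liveout set, the ADD is still rewritten but the MOV stays,
which matches the reviewer's point that the instruction count only drops when
the MOV destination is not live out.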
__gen_ocl_vme is used for hardware accelerated video motion estimation.
It gets payload values as parameters and uses MOV to pass these payload
values to VME SEND Message's payload grfs. The int8 return value is used
to store SEND Message writeback.
v2:
Remove 5 unnecessary parameters (src_grf*).
More aggressive interference check: even if both registers are in the
Livein set or the Liveout set, they may still not interfere
with each other.
v2:
The Liveout interference check needs to take care of those BBs which have only one
register defined.
For example:
BBn:
...
MOV %r1, %src
...
Both
Please ignore this patch, I did some minor changes and will re-send together
with other patches.
-Original Message-
From: Guo, Yejun
Sent: Monday, August 31, 2015 8:41 AM
To: beignet@lists.freedesktop.org
Cc: Guo, Yejun
Subject: [PATCH] add basic function to dump Selection IR
Selection
> +if (kernel->vme) {
> +fixed_local_sz[0] = 16;
> +fixed_local_sz[1] = 1;
Why is it 16? Does it work for all cases?
> - if (global_work_size != NULL)
> + if (kernel->vme) {
> +fixed_global_sz[0] = (global_work_size[0]+15) / 16 * 16;
> +fixed_global_sz[1] =
Signed-off-by: Ruiling Song
---
backend/src/backend/program.cpp | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp
index d9e6416..e75e911 100644
---
Regarding "fixed_local_sz[0] = 16": the basic unit of the VME hardware is
16*16 pixels. Our design handles 1*16 pixels per work item and uses 16*1 as
the local size, so each group is one basic unit of VME.
For the extension concern "Is this a duplicate of code in
More aggressive interference check: even if both registers are in the
Livein set or the Liveout set, they may still not interfere
with each other.
Signed-off-by: Zhigang Gong
---
backend/src/ir/value.cpp | 117 ++-
Even if one value is killed in current BB, we still need to
pass predecessor's definition into this BB. Otherwise, we will
miss one definition.
BB0:
MOV %foo, %src0
BB1:
MUL %foo, %src1, %foo
...
BR BB1
In the above case, both BB1 and BB0 are the predecessors of BB1.
When pass the
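The BB0/BB1 case can be reproduced with a textbook reaching-definitions
analysis. This is a generic sketch, not the code in backend/src/ir/value.cpp;
Block, ReachingDefs, and solveReachingDefs are hypothetical names, and
definitions are numbered 0 (the MOV in BB0) and 1 (the MUL in BB1) purely for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

struct Block {
  std::vector<int> preds;  // indices of predecessor blocks
  std::set<int> gen;       // definitions made here that reach the block end
  std::set<int> kill;      // other definitions of the same registers
};

struct ReachingDefs {
  std::vector<std::set<int>> in, out;
};

// Iterate IN[b] = union of OUT[p] over predecessors p, and
// OUT[b] = GEN[b] + (IN[b] - KILL[b]), until nothing changes.
ReachingDefs solveReachingDefs(const std::vector<Block> &cfg) {
  ReachingDefs r;
  r.in.resize(cfg.size());
  r.out.resize(cfg.size());
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t b = 0; b < cfg.size(); ++b) {
      std::set<int> in;
      for (int p : cfg[b].preds) in.insert(r.out[p].begin(), r.out[p].end());
      std::set<int> out = cfg[b].gen;
      for (int d : in)
        if (cfg[b].kill.count(d) == 0) out.insert(d);
      if (in != r.in[b] || out != r.out[b]) {
        r.in[b] = in;
        r.out[b] = out;
        changed = true;
      }
    }
  }
  return r;
}

// Usage example: BB0 falls through to BB1, and BB1 branches back to itself.
void reachingDefsExample() {
  Block bb0;
  bb0.gen = {0};   // MOV %foo in BB0
  bb0.kill = {1};
  Block bb1;
  bb1.preds = {0, 1};
  bb1.gen = {1};   // MUL %foo in BB1
  bb1.kill = {0};
  ReachingDefs r = solveReachingDefs({bb0, bb1});
  assert(r.in[1].size() == 2);  // both definitions reach the top of BB1
  assert(r.out[1].size() == 1 && r.out[1].count(1) == 1);
}
```

BB1 kills definition 0 on the way out, yet definition 0 is still in IN[BB1],
because the MUL reads %foo before overwriting it.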
These helper functions will be used in a later phi mov optimization.
v2:
remove the useless debug message code.
Signed-off-by: Zhigang Gong
---
backend/src/ir/value.cpp | 100 +++
backend/src/ir/value.hpp | 13 ++
2 files
LGTM
Thanks!
Ruiling
> -Original Message-
> From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
> Sirisha Gandikota
> Sent: Wednesday, September 2, 2015 4:44 PM
> To: Zou, Nanhai; beignet@lists.freedesktop.org
> Cc: Gandikota, Sirisha
> Subject: [Beignet] [PATCH]
From: Junyan He
Signed-off-by: Junyan He
---
utests/get_arg_info.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/utests/get_arg_info.cpp b/utests/get_arg_info.cpp
index c1ea1ef..effeb64 100644
---
From: Junyan He
There is no NULL pointer check for kernel->program->build_opts.
This causes the utest test_get_arg_info to crash.
In fact, we add the -cl-kernel-arg-info flag on every compilation,
so the arg info is always available.
But some test case deliberately
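A minimal sketch of the kind of NULL guard being described. hasBuildOption is
a hypothetical helper, not a function from the Beignet source, and the plain
substring match is an assumption; the actual patch may check the options
differently.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: report whether an option string contains a flag.
// A null option string is treated as "no options" rather than dereferenced.
bool hasBuildOption(const char *buildOpts, const char *flag) {
  if (buildOpts == nullptr || flag == nullptr) return false;
  return std::strstr(buildOpts, flag) != nullptr;
}

// Usage example.
void buildOptionExample() {
  assert(!hasBuildOption(nullptr, "-cl-kernel-arg-info"));
  assert(hasBuildOption("-cl-kernel-arg-info -cl-opt-disable",
                        "-cl-kernel-arg-info"));
}
```

Checking the pointer before calling std::strstr matters because passing a null
pointer to strstr is undefined behaviour.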