-----Original Message-----
From: beignet-boun...@lists.freedesktop.org
[mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Guo, Yejun
Sent: Friday, February 14, 2014 7:09 PM
To: Zhigang Gong
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH V2] GBE: add param to switch
With this design, we can decrease build time by not generating the second pch
file; the only shortcoming is that the logic is split across different places.
As for the implementation, I plan to add a const static std::string variable
in program.cpp to hold the define-related content, and also
On Fri, Feb 14, 2014 at 08:20:47AM +, Guo, Yejun wrote:
With this design, we can decrease build time by not generating the second pch
file; the only shortcoming is that the logic is split across different places.
Actually, the logic is much simpler than using a new strict pch file. You will
I mean the logic block, not the patch size. The following code, which forms a
single logic block, is split between ocl_stdlib.tmpl.h and program.cpp. That
makes it less straightforward for newcomers who later review the code or
add/remove a math function.
#ifdef sin
#undef sin
#endif
#define sin
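For illustration only, the program.cpp side of the split discussed above could
look roughly like the sketch below. It is not the code from the patch: the
names kFastMathDefines and buildKernelSource are hypothetical, the macro target
reuses the __gen_ocl_internal_intelnative_sin name that appears later in this
thread, and prepending the block only when OCL_STRICT_CONFORMANCE is 0 is an
assumption about the intended direction.

```cpp
#include <string>

// Hypothetical: preprocessor text that redirects sin to the fast path.
// Only the target function name comes from this thread; the rest is a sketch.
static const std::string kFastMathDefines =
    "#ifdef sin\n"
    "#undef sin\n"
    "#endif\n"
    "#define sin __gen_ocl_internal_intelnative_sin\n";

// Returns the text handed to the compiler front end. In strict mode the
// kernel source is passed through unchanged; otherwise the redirect block is
// prepended so every later use of sin resolves to the fast-path function.
std::string buildKernelSource(const std::string &kernelSrc,
                              bool strictConformance) {
  if (strictConformance) return kernelSrc;
  return kFastMathDefines + kernelSrc;
}
```

The point of the sketch is the trade-off raised above: the function body lives
in the header template while the macro text lives in a C++ string, so adding or
removing one math function means touching both places.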
Add OCL_STRICT_CONFORMANCE to switch the behavior of the math functions.
If it is 1, the functions are high precision at the cost of performance.
If it is 0, a fast path with good enough precision is selected.
This change adds the code basis, with 'sin' implemented as
an example, other math functions support
BTW, what level of precision is acceptable when OCL_STRICT_CONFORMANCE is 0?
And is it necessary for QA to implement conformance test cases for the case
where OCL_STRICT_CONFORMANCE is 0?
Thanks
--Sun, Yi
-----Original Message-----
From: beignet-boun...@lists.freedesktop.org
Hi Yi,
The fast path precision (when OCL_STRICT_CONFORMANCE is 0) depends on the GPU
hardware instructions, so we have to figure out the precision one by one.
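As a rough idea of what "figuring out the precision" of one function could
involve, the sketch below measures the worst-case ULP distance between a fast
approximation and a reference over a sampled range. Everything here is an
assumption for illustration: fastSin is a truncated Taylor polynomial standing
in for a hardware instruction, and the helper names are made up; none of it is
taken from Beignet.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Maps a float's bit pattern onto an ordered integer line so that adjacent
// finite floats differ by exactly 1 (and -0.0f and +0.0f both map to 0).
static int64_t orderedBits(float f) {
  int32_t i;
  std::memcpy(&i, &f, sizeof i);
  return i < 0 ? static_cast<int64_t>(INT32_MIN) - i : static_cast<int64_t>(i);
}

// Number of representable floats between a and b (0 if they are equal).
static int64_t ulpDistance(float a, float b) {
  int64_t d = orderedBits(a) - orderedBits(b);
  return d < 0 ? -d : d;
}

// Stand-in for a fast path: degree-5 Taylor polynomial, accurate only near 0.
static float fastSin(float x) {
  float x2 = x * x;
  return x * (1.0f - x2 / 6.0f + x2 * x2 / 120.0f);
}

// Worst ULP distance between fastSin and a double-precision reference,
// sampled at `samples` evenly spaced points in [lo, hi]; samples must be >= 2.
static int64_t maxUlpError(float lo, float hi, int samples) {
  int64_t worst = 0;
  for (int k = 0; k < samples; ++k) {
    float x = lo + (hi - lo) * k / (samples - 1);
    float ref = static_cast<float>(std::sin(static_cast<double>(x)));
    int64_t d = ulpDistance(fastSin(x), ref);
    if (d > worst) worst = d;
  }
  return worst;
}
```

Running maxUlpError over a narrow range around zero and then over a wide range
shows how strongly the result depends on the input interval, which is one
reason each function and range would need to be checked separately.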
For the conformance test suite, it would be nice to add fast path tests, but
I guess some tests will be meaningless if the max precision
I still think that introducing a new pch file is too heavy for this case.
And I just thought of another way to achieve the same goal here.
1. You can put the following function into ocl_stdlib.tmpl.h by default:
INLINE_OVERLOADABLE float __gen_ocl_internal_intelnative_sin(float x) {
return