I mean the logic block, not the patch size. The following code in a single logic block is separated into ocl_stdlib.tmpl.h and program.cpp. The code is not direct to newcomers to do code review or add/remove one math function, sometime later. > #ifdef sin > #undef sin > #endif > #define sin __gen_ocl_internal_intelnative_sin > INLINE_OVERLOADABLE float __gen_ocl_internal_intelnative_sin(float x) { > ... > }
I do not object the new design, and talk about the implementation detail of the new design in the second paragraph. Thanks Yejun -----Original Message----- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Friday, February 14, 2014 4:09 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func On Fri, Feb 14, 2014 at 08:20:47AM +0000, Guo, Yejun wrote: > With this design, we can decrease build time by not generating the second pch > file, the only shortcoming is that the logic is separated in different places. Actually, the logic is much simpler than use a new strict pch file. You will see the patch will be much shorter. > > And for the implementation, I plan to add a const static std::string variable > in program.cpp to hold the define relative content, and also use the function > name like __gen_ocl_internal_fastpath_sin, do you think is it ok? Thanks. I'm not sure what do you mean here? > > Thanks > Yejun > > > -----Original Message----- > From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] > Sent: Friday, February 14, 2014 3:04 PM > To: Guo, Yejun > Cc: beignet@lists.freedesktop.org > Subject: Re: [Beignet] [PATCH V2] GBE: add param to switch the > behavior of math func > > I still think that introducing a new pch files is too heavy for this case. > And I just think of another way to achieve the same goal here. > > 1. You can put the following function to the ocl_stdlib.tmpl.h by default: > " > INLINE_OVERLOADABLE float __gen_ocl_internal_intelnative_sin(float x) { > return native_sin(x); > } > " > > 2. And you can add __gen_ocl_internal_intelnative_sin to the vectory proto > definition file builtin_vector_proto.def. > > 3. Then if the STRICT_CONFORMANCE is disabled, then at the program.cpp when > you compile the user kernel, you can just simply insert the following content > to the head of user kernel and then compile this modified kernel with the > single pch file. > > " > #ifdef sin > #undef sin > #endif > #define sin __gen_ocl_internal_intelnative_sin > " > > What do you think? > > On Thu, Feb 13, 2014 at 10:42:19AM +0800, Guo Yejun wrote: > > Add OCL_STRICT_CONFORMANCE to switch the behavior of math func, The > > funcs will be high precision with perf drops if it is 1, Fast path > > with good enough precision will be selected if it is 0. > > > > This change is to add the code basis, with 'sin' implmented as an > > example, other math functions support will be added later. > > > > Signed-off-by: Guo Yejun <yejun....@intel.com> > > --- > > backend/CMakeLists.txt | 3 ++- > > backend/src/CMakeLists.txt | 13 ++++++++++--- > > backend/src/GBEConfig.h.in | 1 + > > backend/src/backend/program.cpp | 12 +++++++++++- > > backend/src/ocl_stdlib.tmpl.h | 27 +++++++++++++++++++++++++++ > > utests/setenv.sh.in | 1 + > > 6 files changed, 52 insertions(+), 5 deletions(-) > > > > diff --git a/backend/CMakeLists.txt b/backend/CMakeLists.txt index > > dd55a4a..08122da 100644 > > --- a/backend/CMakeLists.txt > > +++ b/backend/CMakeLists.txt > > @@ -98,8 +98,9 @@ include_directories (${CMAKE_CURRENT_BINARY_DIR}) > > ############################################################## > > add_subdirectory (src) > > set(LOCAL_PCH_OBJECT_DIR ${LOCAL_PCH_OBJECT_DIR} PARENT_SCOPE) > > +set(LOCAL_PCH_STRICT_OBJECT_DIR ${LOCAL_PCH_STRICT_OBJECT_DIR} > > +PARENT_SCOPE) > > set(LOCAL_PCM_OBJECT_DIR ${LOCAL_PCM_OBJECT_DIR} PARENT_SCOPE) set > > (GBE_BIN_GENERATER > > - OCL_PCM_PATH=${LOCAL_PCM_OBJECT_DIR} > > OCL_PCH_PATH=${LOCAL_PCH_OBJECT_DIR} > > ${CMAKE_CURRENT_BINARY_DIR}/src/gbe_bin_generater > > + OCL_PCM_PATH=${LOCAL_PCM_OBJECT_DIR} > > + OCL_PCH_PATH=${LOCAL_PCH_OBJECT_DIR} > > + OCL_PCH_STRICT_PATH=${LOCAL_PCH_STRICT_OBJECT_DIR} > > + ${CMAKE_CURRENT_BINARY_DIR}/src/gbe_bin_generater > > PARENT_SCOPE) > > > > diff --git a/backend/src/CMakeLists.txt b/backend/src/CMakeLists.txt > > index 33494a0..df36069 100644 > > --- a/backend/src/CMakeLists.txt > > +++ b/backend/src/CMakeLists.txt > > @@ -43,6 +43,8 @@ add_custom_command( > > > > set (pch_object ${ocl_blob_file}.pch) set (local_pch_object > > ${ocl_blob_file}.local.pch) > > +set (pch_strict_object ${ocl_blob_file}.strict.pch) set > > +(local_pch_strict_object ${ocl_blob_file}.strict.local.pch) > > # generate pch object > > if (LLVM_VERSION_NODOT VERSION_GREATER 32) > > set (clang_cmd -cc1 -x cl -triple spir -ffp-contract=off) @@ > > -56,15 +58,17 @@ endif (LLVM_VERSION_NODOT VERSION_GREATER 32) set > > (clang_cmd ${clang_cmd} -fno-builtin > > -DGEN7_SAMPLER_CLAMP_BORDER_WORKAROUND) > > > > add_custom_command( > > - OUTPUT ${pch_object} > > - COMMAND rm -f ${pch_object} > > + OUTPUT ${pch_object} ${pch_strict_object} > > + COMMAND rm -f ${pch_object} ${pch_strict_object} > > COMMAND clang ${clang_cmd} --relocatable-pch -emit-pch -isysroot > > ${CMAKE_CURRENT_BINARY_DIR} ${ocl_blob_file} -o ${pch_object} > > COMMAND clang ${clang_cmd} -emit-pch ${ocl_blob_file} -o > > ${local_pch_object} > > + COMMAND clang ${clang_cmd} -DOCL_STRICT_CONFORMANCE=1 > > --relocatable-pch -emit-pch -isysroot ${CMAKE_CURRENT_BINARY_DIR} > > ${ocl_blob_file} -o ${pch_strict_object} > > + COMMAND clang ${clang_cmd} -DOCL_STRICT_CONFORMANCE=1 > > +-emit-pch ${ocl_blob_file} -o ${local_pch_strict_object} > > DEPENDS ${ocl_blob_file} > > ) > > > > add_custom_target(pch_object > > - DEPENDS ${pch_object}) > > + DEPENDS ${pch_object} ${pch_strict_object}) > > > > macro(ll_add_library ll_lib ll_sources) > > foreach (ll ${${ll_sources}}) > > @@ -196,13 +200,16 @@ TARGET_LINK_LIBRARIES(gbe_bin_generater gbe) > > #install (FILES backend/program.h DESTINATION include/gen) install > > (FILES ${ocl_blob_file} DESTINATION ${LIB_INSTALL_DIR}/beignet) > > install (FILES ${pch_object} DESTINATION ${LIB_INSTALL_DIR}/beignet) > > +install (FILES ${pch_strict_object} DESTINATION > > +${LIB_INSTALL_DIR}/beignet) > > install (FILES ${CMAKE_CURRENT_BINARY_DIR}/${pcm_lib} DESTINATION > > ${LIB_INSTALL_DIR}/beignet) # When build beignet itself, we need to > > export the local precompiled header file and precompiled module # file to > > libcl and utests. > > set (LOCAL_PCH_OBJECT_DIR > > "${local_pch_object}:${beignet_install_path}/ocl_stdlib.h.pch" > > PARENT_SCOPE) > > +set (LOCAL_PCH_STRICT_OBJECT_DIR > > +"${local_pch_strict_object}:${beignet_install_path}/ocl_stdlib.h.st > > +ri > > +ct.pch" PARENT_SCOPE) > > set (LOCAL_PCM_OBJECT_DIR > > "${CMAKE_CURRENT_BINARY_DIR}/${pcm_lib}:${beignet_install_path}/${pc > > m_ > > lib}" PARENT_SCOPE) > > > > set (PCH_OBJECT_DIR "${beignet_install_path}/ocl_stdlib.h.pch") > > +set (PCH_STRICT_OBJECT_DIR > > +"${beignet_install_path}/ocl_stdlib.h.strict.pch") > > set (PCM_OBJECT_DIR "${beignet_install_path}/${pcm_lib}") > > configure_file ( > > "GBEConfig.h.in" > > diff --git a/backend/src/GBEConfig.h.in b/backend/src/GBEConfig.h.in > > index 5bc09b8..c446754 100644 > > --- a/backend/src/GBEConfig.h.in > > +++ b/backend/src/GBEConfig.h.in > > @@ -2,4 +2,5 @@ > > #define LIBGBE_VERSION_MAJOR @LIBGBE_VERSION_MAJOR@ #define > > LIBGBE_VERSION_MINOR @LIBGBE_VERSION_MINOR@ #define PCH_OBJECT_DIR > > "@PCH_OBJECT_DIR@" > > +#define PCH_STRICT_OBJECT_DIR "@PCH_STRICT_OBJECT_DIR@" > > #define PCM_OBJECT_DIR "@PCM_OBJECT_DIR@" > > diff --git a/backend/src/backend/program.cpp > > b/backend/src/backend/program.cpp index 2492a8b..496a9a0 100644 > > --- a/backend/src/backend/program.cpp > > +++ b/backend/src/backend/program.cpp > > @@ -466,6 +466,7 @@ namespace gbe { > > > > BVAR(OCL_OUTPUT_BUILD_LOG, false); > > SVAR(OCL_PCH_PATH, PCH_OBJECT_DIR); > > + SVAR(OCL_PCH_STRICT_PATH, PCH_STRICT_OBJECT_DIR); > > SVAR(OCL_PCM_PATH, PCM_OBJECT_DIR); > > > > static bool buildModuleFromSource(const char* input, const char* > > output, std::string options, @@ -646,6 +647,7 @@ namespace gbe { > > extern std::string ocl_stdlib_str; > > > > BVAR(OCL_USE_PCH, true); > > + BVAR(OCL_STRICT_CONFORMANCE, true); > > static gbe_program programNewFromSource(const char *source, > > size_t stringSize, > > const char *options, @@ > > -743,6 +745,10 @@ namespace gbe { > > } > > > > std::string dirs = OCL_PCH_PATH; > > + if (OCL_STRICT_CONFORMANCE){ > > + dirs = OCL_PCH_STRICT_PATH; > > + } > > + > > std::istringstream idirs(dirs); > > std::string pchFileName; > > > > @@ -757,8 +763,12 @@ namespace gbe { > > clOpt += " -include-pch "; > > clOpt += pchFileName; > > clOpt += " "; > > - } else > > + } else { > > + if (OCL_STRICT_CONFORMANCE){ > > + clOpt += " -DOCL_STRICT_CONFORMANCE=1 "; > > + } > > fwrite(ocl_stdlib_str.c_str(), > > strlen(ocl_stdlib_str.c_str()), 1, clFile); > > + } > > > > // Write the source to the cl file > > fwrite(source, strlen(source), 1, clFile); diff --git > > a/backend/src/ocl_stdlib.tmpl.h b/backend/src/ocl_stdlib.tmpl.h > > index d191b8e..8401f0f 100755 > > --- a/backend/src/ocl_stdlib.tmpl.h > > +++ b/backend/src/ocl_stdlib.tmpl.h > > @@ -4462,6 +4462,33 @@ INLINE_OVERLOADABLE size_t > > get_image_array_size(image1d_array_t image) > > { return __gen_ocl_get_image_array_size(image); } #endif > > > > + > > + > > +/// It is required by OpenCL that built-in math functions (without > > +native_/half_) /// have high precision, but GPU hardware is > > +designed to be good enough precision, /// so most functions will be > > emulated with higph and make performance drops. > > +/// This is not an issue if the applications could choose the > > +proper functions, for /// example, use native_* functions for cases > > without highp requirement. > > +/// Due to the fact that applications always use math functions > > +without native_/half_, /// environment variable > > +OCL_STRICT_CONFORMANCE is introduced to switch the behavior /// of the > > math functions. > > +/// The math functions will be emulated with highp if > > +OCL_STRICT_CONFORMANCE is 1 (the following code block is disable), /// and > > choose fast path with good enough precision if OCL_STRICT_CONFORMANCE is 0 > > (the following code block is enabled). > > +#ifndef OCL_STRICT_CONFORMANCE > > + > > +#ifdef sin > > +#undef sin > > +#endif > > +#define sin __gen_ocl_internal_intelnative_sin > > +INLINE_OVERLOADABLE float __gen_ocl_internal_intelnative_sin(float > > +x) { > > + return native_sin(x); > > +} > > + > > +#endif //OCL_STRICT_CONFORMANCE > > + > > + > > + > > #pragma OPENCL EXTENSION cl_khr_fp64 : disable > > > > #undef DECL_IMAGE > > diff --git a/utests/setenv.sh.in b/utests/setenv.sh.in index > > ad77369..17a2e28 100644 > > --- a/utests/setenv.sh.in > > +++ b/utests/setenv.sh.in > > @@ -2,4 +2,5 @@ > > # > > export OCL_PCM_PATH=@LOCAL_PCM_OBJECT_DIR@ > > export OCL_PCH_PATH=@LOCAL_PCH_OBJECT_DIR@ > > +export OCL_PCH_STRICT_PATH=@LOCAL_PCH_STRICT_OBJECT_DIR@ > > export OCL_KERNEL_PATH=@CMAKE_CURRENT_SOURCE_DIR@/../kernels > > -- > > 1.7.9.5 > > > > _______________________________________________ > > Beignet mailing list > > Beignet@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/beignet > _______________________________________________ > Beignet mailing list > Beignet@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/beignet _______________________________________________ Beignet mailing list Beignet@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/beignet