Hi,

> Yes it does! Actually, what we would ideally do is to, by default, link
> ViennaCL to the integrated set of numerical kernels (those of
> libviennacl, which would be generated dynamically for the OpenCL
> backend), and allow one to switch backend to
> MKL/OpenBLAS/CuBLAS/FunFunFunBLAS... The only "obstacle" being that the
> set of kernels supported by ViennaCL is bigger than the standard BLAS
> interface.

There are two things to note here:
 a) ViennaCL 1.x.y needs to remain header-only by default for backward compatibility reasons.
 b) I'm not worried about the larger set of operations provided by ViennaCL. With a function pointer table for the operations we can always initialize everything to use the native ViennaCL operations and only selectively overwrite those operations which are provided by other backends if they are used.

> It was itching me to do it, but I was hesitating because it involves
> significant changes. I'll start working on it in my
> "external-blas_linking" folder.

Yes, it involves a significant amount of changes. This also means that we had better decide now whether we want to first invest in the 'external BLAS' feature or in the complete micro-scheduler integration. It doesn't make sense to work on both at the same time, since that would delay subsequent releases.

> There is still a dilemma, however, that I would like to sort out if
> possible: for OpenCL, we have a set of pre-generated kernel sources (for
> compilation time reasons). However, if they are pre-generated, then it
> means that they cannot easily be coupled to the generator, and that they
> are not optimal in performance. We've seen that even in the case of a
> simple axpy operation, the bandwidth may greatly vary if the parameters
> are not properly tuned (for CPUs and AMD GPUs, particularly). Wouldn't
> it make sense to default everything to the generator, and to allow a
> "VIENNACL_WITH_STATIC_OPENCL" flag? In the end, this would make the
> shared libviennacl an alternative to GATLAS... Plus, I'm not convinced
> that this would have a huge impact on the C++ compilation time, since
> most of the workload only appears in the first time the generator is
> instantiated.

If you remember, the initial intent for the kernel generator was (and still is) that all these pre-generated kernels are only obtained from the generator.
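A minimal sketch of the function pointer table from point b), to make the idea concrete. All names here (`operation_table`, `native_axpy`, `external_axpy`, `make_table`) are hypothetical and not part of ViennaCL; `std::vector<double>` stands in for the actual vector types, and `external_axpy` only imitates a call into an external BLAS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical signatures; ViennaCL's real operations look different.
using axpy_fn = void (*)(double alpha, const std::vector<double>& x,
                         std::vector<double>& y);
using element_prod_fn = void (*)(const std::vector<double>& x,
                                 const std::vector<double>& y,
                                 std::vector<double>& z);

// Stand-in for the native implementation of y <- alpha * x + y.
void native_axpy(double alpha, const std::vector<double>& x,
                 std::vector<double>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] += alpha * x[i];
}

// Stand-in for a native operation that has no standard BLAS counterpart.
void native_element_prod(const std::vector<double>& x,
                         const std::vector<double>& y,
                         std::vector<double>& z) {
  assert(x.size() == y.size() && y.size() == z.size());
  for (std::size_t i = 0; i < x.size(); ++i) z[i] = x[i] * y[i];
}

// Imitates an external BLAS axpy; the counter only makes the dispatch visible.
int external_axpy_calls = 0;
void external_axpy(double alpha, const std::vector<double>& x,
                   std::vector<double>& y) {
  ++external_axpy_calls;
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] += alpha * x[i];
}

// Every entry defaults to the native operation.
struct operation_table {
  axpy_fn axpy = native_axpy;
  element_prod_fn element_prod = native_element_prod;
};

// Overwrite only the entries that the external backend actually provides.
operation_table make_table(bool use_external_blas) {
  operation_table table;
  if (use_external_blas) table.axpy = external_axpy;
  return table;
}
```

With `make_table(true)`, `axpy` goes to the external routine while `element_prod` stays native, so the larger operation set is never a problem for the table.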
Of course, it will take some time to also include sparse operations, but at least for dense matrices and vector operations we can start redirecting the kernel generation to the generator. More precisely, the kernel string generation done in
  viennacl/linalg/opencl/kernels/vector.hpp
  viennacl/linalg/opencl/kernels/matrix.hpp
  viennacl/linalg/opencl/kernels/matrix_element.hpp
  viennacl/linalg/opencl/kernels/matrix_prod.hpp
can already be redirected to the generator. The easiest way to do this is to send the operation over to the generator, specify the desired kernel name, and append the returned string to the program source. This may require some minor adjustments to the order of the kernel parameters in the calling instance, but any mismatch gets caught quickly by the test suite. I don't expect it to influence the C++ compilation time, because the micro-scheduler and hence the generator are now lightweight in terms of compiler load.

Btw: I expect that we want to move the device database out of the kernel generator. The reason is that parts of it are also useful for the CUDA backend, so we also want to access it if OpenCL is disabled. This doesn't have high priority, but it's worth keeping in mind.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel
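The redirection described in the message (send the operation to the generator, specify the kernel name, append the returned string to the program source) could look roughly like the sketch below. It is an illustration only: `vector_op`, `generate_kernel`, and `build_vector_program_source` are hypothetical names, the kernel names are made up, and the generated OpenCL text is a toy that does not reflect what ViennaCL's generator emits.

```cpp
#include <string>

// Hypothetical description of the operation handed to the generator.
enum class vector_op { axpy, scale };

// Hypothetical stand-in for the generator: returns the OpenCL source of one
// kernel with the requested name. The real generator would also tune the
// kernel for the device; this toy version only varies the loop body.
std::string generate_kernel(vector_op op, const std::string& kernel_name) {
  const std::string body = (op == vector_op::axpy) ? "y[i] += alpha * x[i];"
                                                   : "y[i] = alpha * x[i];";
  return "__kernel void " + kernel_name +
         "(float alpha, __global const float* x, __global float* y, "
         "unsigned int n) {\n"
         "  for (unsigned int i = get_global_id(0); i < n; "
         "i += get_global_size(0)) " +
         body + "\n}\n";
}

// Replaces a hand-written kernel string: each kernel is requested from the
// generator under the name the calling code expects and appended to the
// program source, which is then compiled as a single OpenCL program.
std::string build_vector_program_source() {
  std::string source;
  source.append(generate_kernel(vector_op::axpy, "axpy"));
  source.append(generate_kernel(vector_op::scale, "scale"));
  return source;
}
```

Because the caller fixes the kernel names, existing kernel lookups by name would keep working; only the parameter order in the calling code might need adjusting, as noted in the message.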