Neat. What OpenCL implementation are you building against? I get errors related to the _svm_ parts of the code, e.g. "cl_device_svm_capabilities was not declared in this scope". I'm trying to build against the NVIDIA CUDA SDK (ver 8.0), just downloaded from their developer site.

On May 6, 2017 8:59 AM, "Ghost Op" <ghosto...@gmail.com> wrote:

> Hi everyone. A number of you have asked me to keep you informed of
> any major updates on the OpenCL gr-clenabled project, and the past
> couple of weeks have been pretty active. There's now a version up in
> the repo with a significant number of updates, and all blocks have
> been validated (at least in their basic modes).
>
> So here are the major updates:
>
> Validation flowgraphs - Almost all test flowgraphs have been posted in
> the examples directory. You can run them on your own hardware for
> comparison. This is important on older cards that don't support
> double precision (you can check with the included clview
> command-line tool).
>
> Signal Source Block - A discrepancy in the output was due to an
> OpenCL issue. It turns out single (float) precision wasn't producing
> accurate enough numbers. This block now uses double precision if the
> hardware supports it (most new hardware will) for an even cleaner
> signal than the native block (no secondary nodes).
>
> Quad Demod - Same single/double trig discrepancy due to precision,
> which was corrected.
>
> Filters - A lot of work this week has been spent on filter validation
> (hence the few emails about TD vs. FD from yesterday)
> - Both FIR and FFT versions are now implemented and producing
>   correct output
> - A generic tap-based block was added for more flexibility
> - A test-clfilter command-line tool was added to test performance
>   given a number of taps across OpenCL FIR, GNU Radio FIR, OpenCL FFT,
>   and GNU Radio FFT so you can pick the best performing filter given
>   your implementation.
>
> Costas Loop - A Costas Loop was added, however the performance on a
> GPU kernel is horrible. Because of the sequential calculations, it
> couldn't be SIMD parallel processed, so it was written as an OpenCL
> task-based kernel. This means it just runs single-threaded on a
> single core, which is why the performance is so bad. However, if
> anyone has an OpenCL-capable FPGA card like an Altera, I'd love to see
> the result of running the included test-clenabled timing tool and see
> how the Costas Loop performs. I just don't have access to one.
>
> Performance - Code was added to detect if the hardware supports Fused
> Multiply/Add functionality for added kernel performance. If it's
> available, it's used.
>
> OpenCL Setup Instructions - For those that may not have OpenCL set up,
> I added some installation guides in the setup_help directory for
> Ubuntu and Debian with step-by-steps on getting it up and running.
> I've run through both of those processes on several systems and been
> up and running pretty quickly. I also pulled some of the important
> points into the main page's README, since in my experience that's
> generally all I look through too.
>
> Study - Based on the filter updates, the filter section in the study
> in the docs directory was completely rewritten. The report was noted
> as updated.
>
> I think those are the biggest updates for now. As always, let me know
> if anyone runs into any issues.
>
> _______________________________________________
> Discuss-gnuradio mailing list
> Discuss-gnuradio@gnu.org
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
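
To illustrate the Costas Loop point in the quoted message, here is a minimal sketch of a generic textbook second-order BPSK Costas loop in plain Python. It is not the gr-clenabled kernel; the function name costas_loop and the gains alpha and beta are assumptions picked for illustration. The thing to notice is that each sample is derotated with the phase that the previous sample just updated, so the iterations form a chain and can't simply be handed out to independent work-items.

```python
import cmath


def costas_loop(samples, alpha=0.1, beta=0.005):
    """Generic second-order BPSK Costas loop (illustrative sketch only).

    alpha and beta are hypothetical loop gains, not values taken from
    gr-clenabled or GNU Radio.
    """
    phase = 0.0  # NCO phase estimate, carried from one sample to the next
    freq = 0.0   # NCO frequency estimate (radians/sample), also carried
    out = []
    for x in samples:
        # Derotate using the phase produced by the PREVIOUS iteration.
        y = x * cmath.exp(-1j * phase)
        # BPSK phase detector: product of the I and Q components.
        error = y.real * y.imag
        # Loop filter update. The next sample cannot be processed until
        # these two values are known, which is the sequential dependency.
        freq += beta * error
        phase += freq + alpha * error
        out.append(y)
    return out, phase, freq


if __name__ == "__main__":
    # Constant carrier with a fixed 0.5 rad phase offset and no modulation.
    samples = [cmath.exp(0.5j)] * 2000
    out, phase, freq = costas_loop(samples)
    print("phase estimate:", phase)
    print("last output sample:", out[-1])
```

With a constant 0.5 rad offset on the input, the loop should settle with the phase estimate close to 0.5 and the imaginary part of the output close to zero. Samples within one call still have to be visited in order, which matches the single-threaded task kernel described above.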