Re: [Faudiostream-users] -vec and multiple control inputs

Orlarey Yann Fri, 12 Apr 2013 13:07:19 -0700

On 12/04/13 12:30, Oli Larkin wrote:
> Thanks for looking at this and for the information Yann. I just discovered 
> the benchmarks folder, which I'll use from now on to benchmark more 
> accurately.
>
> after running make in the benchmarks folder, I ran bench.sh, but it just 
> seems to run copy1. Does bench.sh need to be modified to work on OSX?
I did my tests on a Linux machine with a modified version of the benchmarks,
not the one in the Faust distribution. Actually, I am not sure the bench works
on OSX, but if you want to try, you must use the coreaudio targets, for
example: make gcoreaudioscal


Cheers

Yann


>
> cheers,
>
> oli
>
> On 7 Apr 2013, at 17:50, Orlarey Yann wrote:
>
>> Hi Oli,
>>
>> The ~10x speedup of your version 2 is due to the fact that all the
>> parallel strings not only have the same controls, but are also applied
>> to the same two input signals. This leads to redundant computations
>> that the Faust compiler is able to discover and factorize.
>>
>> Here are my results, quite similar to yours. All tests were compiled
>> in scalar and vector modes using icc 13.1.0 and run as alsa-gtk
>> applications on an Asus Zenbook quad-core i7-3517U CPU @ 1.90GHz
>> running Linux Mint 14. The results are given as CPU usage.
>>
>>
>> 1) Results for test1.dsp (your version 1, multiple controls)
>> ------------------------------------------------------------
>> Test1.dsp is your version 1 example.
>>
>> test1, scalar mode  0.23%
>> test1, vector mode  0.21%
>>
>> Scalar and vector modes have similar results, even if vector mode is a
>> little bit faster here.
>>
>>
>> 2) Results for test2y.dsp (single controls)
>> -------------------------------------------
>> Test2y.dsp is similar to your version 2, but derived from test1.dsp by
>> simply removing all "...%1i..." from the slider labels, thus leading to
>> single controls.
>>
>> test2y, scalar mode  0.014%
>> test2y, vector mode  0.022%
>>
>> As in your experiments test2y is ~10x faster than test1 in vector
>> mode. The speedup is even better in scalar mode.
>>
>> If we look at the size of the generated code for test1.dsp and
>> test2y.dsp we have:
>>
>> faust test1.dsp  | wc  => 155917 characters
>> faust test2y.dsp | wc  =>  21229 characters
>>
>> As we can see, the C++ translation of test2y is ~7x shorter, because
>> the Faust compiler was able to discover and factorize many redundant
>> computations. Because of this, the C++ compilation time is also
>> shorter.
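
As a quick arithmetic check of the ratios quoted so far, here is a small Python sketch. The numbers are copied from the figures above; the variable names are only illustrative.

```python
# CPU usage (percent) and generated C++ size (characters), copied from
# the figures quoted above. Names are illustrative only.
cpu = {
    "test1":  {"scalar": 0.23,  "vector": 0.21},
    "test2y": {"scalar": 0.014, "vector": 0.022},
}
size = {"test1": 155917, "test2y": 21229}

# How many times less CPU test2y needs than test1, per mode.
speedup = {
    mode: cpu["test1"][mode] / cpu["test2y"][mode]
    for mode in ("scalar", "vector")
}
# How many times larger the test1 C++ code is.
size_ratio = size["test1"] / size["test2y"]

for mode, s in speedup.items():
    print(f"{mode}: test2y uses {s:.1f}x less CPU than test1")
print(f"code size: test1 is {size_ratio:.1f}x larger than test2y")
```

This gives roughly 16x in scalar mode, 9.5x in vector mode, and a 7.3x difference in code size, consistent with the "~10x" and "~7x" figures.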
>>
>> To analyze the influence of redundant computations vs fewer controls
>> we can modify test2y.dsp to have separate inputs instead of stereo
>> inputs. Because the strings will be applied to different inputs they
>> won't be factorized anymore and we should get performance close to
>> test1.
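
To illustrate why sharing both the controls and the inputs lets the strings collapse, here is a toy Python model. It is not the Faust compiler's actual algorithm, only a sketch of the general idea of sharing identical subexpressions. The names Graph, node and build are hypothetical, and a whole string is reduced to a single "string" node.

```python
class Graph:
    """Toy expression graph: identical (op, args) keys share one node."""

    def __init__(self):
        self._ids = {}

    def node(self, op, *args):
        # Return the id of an existing identical node, or create a new one.
        key = (op, *args)
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]

    def count(self, op):
        # Number of distinct nodes built for a given operator.
        return sum(1 for key in self._ids if key[0] == op)


def build(n_strings, shared_controls, shared_input):
    """Build n_strings 'string' nodes and return how many are distinct."""
    g = Graph()
    for s in range(n_strings):
        ctrl = g.node("slider", "freq" if shared_controls else f"freq{s}")
        inp = g.node("input", 0 if shared_input else s)
        g.node("string", inp, ctrl)
    return g.count("string")


print(build(8, shared_controls=False, shared_input=True))   # test1-like
print(build(8, shared_controls=True, shared_input=True))    # test2y-like
print(build(8, shared_controls=True, shared_input=False))   # test2y4-like
```

In this model only the test2y-like case (same controls and same input) collapses to a single string. As soon as either the controls or the inputs differ, all 8 strings remain distinct.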
>>
>>
>> 3) Results for test2y4.dsp
>> --------------------------
>> Test2y4.dsp is derived from test2y.dsp by changing the stringbox
>> definition from:
>>
>> stringbox(n) = _ , _ <: par(s, n, stereostring(_, _, s)) :> _ , _;
>>
>> to :
>>
>> stringbox(n) =  par(s, n, stereostring(_, _, s)) :> _ , _;
>>
>> in order to have the strings applied to different inputs and avoid
>> factorizations.
>>
>> The size of the C++ code is now larger :
>>
>> faust test2y4.dsp | wc => 118218 characters
>>
>> and the performances are similar to test1 :
>>
>> test2y4, scalar mode  0.15%
>> test2y4, vector mode  0.12%
>>
>> Test2y4 is still faster than test1 because it has fewer controls to
>> compute.
>>
>> 4) Results for test1smoothless.dsp
>> ----------------------------------
>> But if we simplify the control signals of test1 by removing all the
>> smooth, then test1smoothless.dsp outperforms test2y4 in vector mode.
>>
>> test1smoothless, scalar mode  0.15%
>> test1smoothless, vector mode  0.09%
>>
>> Obviously it is probably not a good idea to remove all the smooth, but
>> performance gains can probably be obtained by reorganizing them.
>> In particular it is better, in terms of performance, to smooth after
>> expensive computations like pow and similar, rather than before.
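
To make the point about where to put the smooth concrete, here is a toy Python model. It is not Faust code. It assumes that a slider value changes at most once per audio block, while anything downstream of a smooth has to be computed for every sample. The function run and its parameters are hypothetical; the smoother follows the one-pole form y[n] = (1 - pole) * x[n] + pole * y[n-1].

```python
def run(block_size, n_blocks, slider, pole=0.999):
    """Count pow evaluations for smoothing before vs after pow."""
    pow_calls = {"before": 0, "after": 0}
    y_before = 0.0    # smoother state when smoothing comes before pow
    y_after = 0.0     # smoother state when smoothing comes after pow
    out_before = 0.0  # last output of the smooth-then-pow chain

    for _ in range(n_blocks):
        # Smooth AFTER pow: pow sees the raw slider value, which is
        # constant over the block, so one evaluation per block suffices.
        target = 0.001 ** slider
        pow_calls["after"] += 1

        for _ in range(block_size):
            # Smooth BEFORE pow: the smoothed value changes every sample,
            # so pow has to be re-evaluated every sample.
            y_before = (1.0 - pole) * slider + pole * y_before
            out_before = 0.001 ** y_before
            pow_calls["before"] += 1

            # Smoothing the pow result only costs a multiply-add.
            y_after = (1.0 - pole) * target + pole * y_after

    return pow_calls, out_before, y_after


calls, out_before, out_after = run(block_size=64, n_blocks=400, slider=0.0025)
print(calls)
print(abs(out_before - out_after) < 1e-6)
```

With 400 blocks of 64 samples, the smooth-before variant evaluates pow 25600 times against 400 for the smooth-after variant, and both settle on the same value. The two trajectories do differ while a control is moving, so the reorganization changes the transition slightly.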
>>
>> Cheers
>>
>> Yann
>>
>>
>>
>> On 20/03/13 20:04, Oli Larkin wrote:
>>> Hi,
>>>
>>> I have been experimenting with -vec in a .dsp involving multiple parallel 
>>> string resonators. This morning I was amazed at the performance boost I 
>>> got, but when I added multiple controls to my .dsp file things slowed down 
>>> a lot. Compilation takes much longer and when it finishes the compiled .vst 
>>> is much slower than the version with single controls (roughly 10x slower I 
>>> think). I realise there is more smoothing taking place, but I'm wondering 
>>> if there is something else in play that is causing the compiler not to 
>>> vectorize the code as well as with single controls.
>>>
>>> I'm compiling on osx 10.6 with faust Version 0.9.59, like this
>>>
>>> faust2vst stringbox.dsp -vec
>>>
>>> my system is a 2010 MBP, i7
>>>
>>> below are three versions of my faust .dsp. Strangely the third version 
>>> compiles and runs quickly making me think that the nested parallel 
>>> structures are causing problems for the auto vectorization.
>>>
>>> thanks very much for any tips,
>>>
>>> oli larkin
>>>
>>> // ----------------------------------------------------------------------
>>> // stringbox.dsp VERSION 1 (multiple controls, slow)
>>>
>>> declare name "StringBox";
>>> declare description "Bank of 8 virtual strings";
>>> declare author "Oli Larkin ([email protected])";
>>> declare copyright "Oliver Larkin";
>>> declare version "0.1";
>>> declare licence "GPL";
>>>
>>> import("filter.lib");
>>> dtmax = 4096;
>>>
>>> f(i) = hslider("A_freq%1i", 100, 20, 15000, 1) : smooth(0.999);
>>> t60(i) = hslider("B_decay%1i", 4, 0, 60, 0.01) : smooth(0.999);
>>> damp(i) = hslider("C_damp%1i", 1., 0, 1, 0.01) : smooth(0.999);
>>> g(i) = hslider("D_gain%1i", 0, -70, 0., 0.1) : db2linear : smooth(0.999);
>>> fd = hslider("E_diff", 0., 0., 1., 0.0001) : smooth(0.999);
>>>
>>> stringloop(x, s, c) = (+ : fdelay1a(dtmax, dtsamples, x)) ~ (dampingfilter 
>>> * fbk) : dcblocker
>>> with {
>>>     freq = f(s) + ((c-4) * fd);
>>>     coeff = damp(s);
>>>     dtsamples = (SR/freq) - 2;
>>>     fbk = pow(0.001,1.0/( freq*t60(s)));
>>>
>>>     h0 = (1. + coeff)/2;
>>>     h1 = (1. - coeff)/4;
>>>     dampingfilter(x) = (h0 * x' + h1*(x+x''));
>>> };
>>>
>>> rissetstring(x, s) = _ <: par(c, 9, stringloop(x, s, c)) :> _*0.01*g(s);
>>> stereostring(L, R, s) = rissetstring(L, s), rissetstring(R, s);
>>> stringbox(n) = _ , _ <: par(s, n, stereostring(_, _, s)) :> _ , _;
>>> process = stringbox(8);
>>>
>>> // ----------------------------------------------------------------------
>>> // stringbox.dsp VERSION 2 (single controls, fast)
>>>
>>> declare name "StringBox";
>>> declare description "Bank of 8 virtual strings";
>>> declare author "Oli Larkin ([email protected])";
>>> declare copyright "Oliver Larkin";
>>> declare version "0.1";
>>> declare licence "GPL";
>>>
>>> import("filter.lib");
>>> dtmax = 4096;
>>>
>>> f = hslider("A_freq", 100, 20, 15000, 1) : smooth(0.999);
>>> t60 = hslider("B_decay", 4, 0, 60, 0.01) : smooth(0.999);
>>> damp = hslider("C_damp", 1., 0, 1, 0.01) : smooth(0.999);
>>> g = hslider("D_gain", 0, -70, 0., 0.1) : db2linear : smooth(0.999);
>>> fd = hslider("E_diff", 0., 0., 1., 0.0001) : smooth(0.999);
>>>
>>> stringloop(x, s, c) = (+ : fdelay1a(dtmax, dtsamples, x)) ~ (dampingfilter 
>>> * fbk) : dcblocker
>>> with {
>>>     freq = f + ((c-4) * fd);
>>>     dtsamples = (SR/freq) - 2;
>>>     fbk = pow(0.001,1.0/( freq*t60));
>>>
>>>     h0 = (1. + damp)/2;
>>>     h1 = (1. - damp)/4;
>>>     dampingfilter(x) = (h0 * x' + h1*(x+x''));
>>> };
>>>
>>> rissetstring(x, s) = _ <: par(c, 9, stringloop(x, s, c)) :> _*0.01*g;
>>> stereostring(L, R, s) = rissetstring(L, s), rissetstring(R, s);
>>> stringbox(n) = _ , _ <: par(s, n, stereostring(_, _, s)) :> _ , _;
>>> process = stringbox(8);
>>>
>>> // ----------------------------------------------------------------------
>>> // stringbox.dsp VERSION 3 (multiple controls, 1 comb filter per string
>>> // instead of 9, fast)
>>>
>>> declare name "StringBox";
>>> declare description "Bank of 8 virtual strings";
>>> declare author "Oli Larkin ([email protected])";
>>> declare copyright "Oliver Larkin";
>>> declare version "0.1";
>>> declare licence "GPL";
>>>
>>> import("filter.lib");
>>> dtmax = 4096;
>>>
>>> f(i) = hslider("A_freq%1i", 100, 20, 15000, 1) : smooth(0.999);
>>> t60(i) = hslider("B_decay%1i", 4, 0, 60, 0.01) : smooth(0.999);
>>> damp(i) = hslider("C_damp%1i", 1., 0, 1, 0.01) : smooth(0.999);
>>> g(i) = hslider("D_gain%1i", 0, -70, 0., 0.1) : db2linear : smooth(0.999);
>>>
>>> stringloop(x, s) = (+ : fdelay1a(dtmax, dtsamples, x)) ~ (dampingfilter * 
>>> fbk) : dcblocker
>>> with {
>>>     freq = f(s);
>>>     coeff = damp(s);
>>>     dtsamples = (SR/freq) - 2;
>>>     fbk = pow(0.001,1.0/( freq*t60(s)));
>>>
>>>     h0 = (1. + coeff)/2;
>>>     h1 = (1. - coeff)/4;
>>>     dampingfilter(x) = (h0 * x' + h1*(x+x''));
>>> };
>>>
>>> stereostring(L, R, s) = stringloop(L, s), stringloop(R, s);
>>> stringbox(n) = _ , _ <: par(s, n, stereostring(_, _, s)) :> _ , _;
>>> process = stringbox(8);
>>>
>>>
>>>
>>> _______________________________________________
>>> Faudiostream-users mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/faudiostream-users
>>>
>>>
>>
>
>

