Re: Gauging interest in another 3d audio project

@67
There is overhead because you need to get the bytes out of a Synthizer buffer and into the Numpy array, and then you need to get the bytes out of the output array and back into a Synthizer buffer.
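To make that concrete, here's a minimal sketch of what the round trip looks like. The function name and the assumption that a block arrives as interleaved float32 bytes are mine for illustration--the bindings don't expose anything like this today--but the two copies are the point:

    import numpy as np

    BLOCK_SIZE = 256  # samples per block, matching the numbers later in this post
    CHANNELS = 2

    def process_block(raw_bytes):
        # Copy 1: bytes into Numpy.  np.frombuffer gives a read-only view,
        # so the .copy() is what actually buys a writable array to do math on.
        samples = np.frombuffer(raw_bytes, dtype=np.float32).copy()
        samples = samples.reshape(BLOCK_SIZE, CHANNELS)

        samples *= 0.5  # the effect's math would go here

        # Copy 2: Numpy back out to bytes for Synthizer.
        return samples.tobytes()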

I have used these tools significantly. I've run Numba, I've run Theano, I've run Tensorflow. Numpy/scipy is what I reach for when I need to do math, and is what is powering the HRTF conversion scripts Synthizer itself needs.  My pipe dream plan to build a speech synthesizer relies on this stuff as well.

It might be possible to use the memoryview interfaces to do the copy without round tripping through Python, but you'd still have to hold the GIL for that, and holding the GIL is a bad idea for audio effects.  Plus the copy still happens.  There is one possibility where you write your effect in Cython with nogil and Synthizer provides the missing inter-thread synchronization piece, but even if that starts working, you can't just grab the GIL and start playing with Numpy and expect things to be amazing, because your main game threads can decide to hold onto it and not let go for a while.

But let's say that none of that is a problem. You've got it in Numpy via some magic. The GIL is gone, somehow, which the Python people have been trying and failing to do for literally 10 years now.  Even in this hypothetical, you still have two problems.

First is that Numpy does have overhead with small arrays, since every vectorized operation has to be arranged for by Python.  Numpy asymptotically approaches C's speed with large arrays, but for 256 samples or so it certainly doesn't get there.  It'll be faster than pure Python, but calling it zero overhead is far from accurate.  It's constant overhead, but if the arrays are small, that constant overhead can count for a lot.  Rule 1 of algorithmic complexity and constant factors is that you only get to sweep them under the rug if the problem size is large enough to hide them.
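If you want to see this for yourself, a quick and dirty benchmark like the following shows the fixed per-call dispatch cost.  Exact numbers are machine-dependent; the shape of the result isn't:

    import timeit

    import numpy as np

    for n in (256, 1_000_000):
        a = np.random.rand(n).astype(np.float32)
        b = np.random.rand(n).astype(np.float32)
        # Every a * b + b pays a fixed Python-side dispatch cost on top of
        # the actual math.  At n=256 the fixed cost dominates; at a million
        # elements it vanishes into the noise.
        t = timeit.timeit(lambda: a * b + b, number=1000)
        print("n=%d: %.1f us per call" % (n, t / 1000 * 1e6))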

Second is that audio effects can't be vectorized.  Numpy vectorizes two audio operations of interest: the FIR filter and the IIR filter.  These equate to a lowpass/highpass/bandpass plugin, nothing more.  You can perhaps vectorize ringmod if you're sufficiently clever.  But that's it.  Most audio effects are recursive, for example echoes, where you need to compute every output sample one at a time so you can feed it back into the beginning.  For those that aren't, say a simple delay line, the obvious vectorized way is the least efficient way to do it: what you have to do instead is compute one sample at a time and use a modulus to make indices wrap around in the buffer.  That can be vectorized a little bit if and only if the delay is always longer than the number of samples the effect needs to compute at once, but that's not usually the case.
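Here's a minimal sketch of a feedback echo to make the recursion concrete.  The names are mine and real code would be fancier, but notice that each output sample depends on a previous output sample, so no single vectorized call can compute the whole block:

    import numpy as np

    def feedback_echo(block, delay_buf, write_pos, delay_samples, feedback=0.5):
        # delay_buf is a circular buffer of past *output* samples.  Because
        # each output feeds back into the buffer, sample i+1 can't be
        # computed until sample i exists: the loop is inherent to the effect.
        n = len(delay_buf)
        for i in range(len(block)):
            read_pos = (write_pos - delay_samples) % n  # modulus index wrap
            out = block[i] + feedback * delay_buf[read_pos]
            delay_buf[write_pos] = out  # the feedback step
            block[i] = out
            write_pos = (write_pos + 1) % n
        return write_pos

Run that over a 256-sample block and you're executing per-sample Python bytecode anyway, which is exactly the situation the next paragraph describes.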

And what that means is that your nice Numpy operations just became: read one value from the Numpy array, do math in pure Python, write one value back to the Numpy array.  At that point it's as much overhead as plain lists, save that it's easier to copy into and out of Numpy arrays from the bindings.  But then why bother?  You can get zero copies if you just use the buffers directly, which is what the bindings will expose, assuming I'm right about being able to solve this in some fashion via Cython nogil plus Synthizer exposing some atomic lockfree operations.

When you use something like Numpy for machine learning, even on the CPU, there is overhead.  But 10-20 ms of overhead when the problem you're running takes 10-20 seconds per iteration is nothing.  On my machine currently, Synthizer's block size is 256 samples at a sampling rate of 44100, which is about 5.8 ms per block.  You'll lose 1-2 ms of this to the GIL and 0.5 to 1 ms to getting data into and out of Numpy.  That leaves you roughly 3 ms to play with.  If your computation takes longer than that on more than half the blocks, eventually Synthizer runs out of data to feed the sound card.  If it happens on fewer than half the blocks, Synthizer can still run out of data if you miss 4 or 5 in a row, which will probably kick an as-yet-to-be-written latency-increasing algorithm into gear, making the audio of your game progressively higher latency.
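The budget arithmetic, spelled out (the GIL and copy costs are the pessimistic estimates from above):

    SAMPLE_RATE = 44_100
    BLOCK_SIZE = 256

    block_ms = BLOCK_SIZE / SAMPLE_RATE * 1000  # ~5.8 ms of audio per block
    gil_cost_ms = 2.0    # worst-case GIL loss estimated above
    copy_cost_ms = 1.0   # worst-case copy-in/copy-out loss
    budget_ms = block_ms - gil_cost_ms - copy_cost_ms
    print("%.1f ms per block, %.1f ms left for actual effect math"
          % (block_ms, budget_ms))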

Now let's bring the GIL back in.  You've vectorized in Numpy, but every round trip to Python is going to release/acquire the GIL.  Let's say that you do that 5 times per effect.  The way Python works, a thread running pure Python bytecode only releases the GIL periodically: on Python 2 this was every so many bytecodes, and on Python 3 it's a timer (sys.getswitchinterval(), 5 ms by default).  So figure on 0.1 to 0.25 ms per GIL operation if the main game thread is trying to acquire it.  And there goes your computation window without you even doing any math.
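You can check the timer for yourself:

    import sys

    # Since Python 3.2 the GIL is handed off on a timer rather than a
    # bytecode count.  A compute-bound thread holds it for up to this long
    # before any other thread even gets a chance at it.
    print(sys.getswitchinterval())  # 0.005 (seconds) by default

Note that the 5 ms default is essentially the length of the entire audio block, so a busy game thread that's slow to yield can eat the whole window on its own.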

And we haven't even talked about how, if you accidentally allocate objects (which Python makes very easy), the allocator might call into the OS and knock more time off the already tiny window you have for your effect computations.
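The standard mitigation is to preallocate every buffer up front and route the math through Numpy's out= parameters, as in the sketch below, though that only narrows the problem--one forgotten out= or one innocent-looking temporary and you're allocating on every block again:

    import numpy as np

    BLOCK_SIZE = 256

    # Allocated once, outside the audio path.
    scratch = np.empty(BLOCK_SIZE, dtype=np.float32)

    def process(block):
        # out= writes into existing arrays, so no per-block allocation...
        np.multiply(block, 0.5, out=scratch)  # scale
        np.add(block, scratch, out=block)     # mix back in
        # ...whereas the innocuous `block * 0.5 + block` would allocate
        # two temporary arrays on every call.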

Keep in mind that most effects are literally millions of math operations; to put this in perspective, even in C++ it's hard to stay within these time windows.  Synthizer is already going nuts with lockfree data structures and the like to avoid certain things Libaudioverse did, things that forced Libaudioverse's block size and latency to be much larger than what Synthizer can offer you, purely to hide constant overheads.

This is a hard realtime system. Python is not suitable for hard realtime systems.  Hopefully the above is clear enough as to why and I won't have to keep explaining this.

Frankly, I know more Python than almost everyone on this forum, and this is my third audio library--arguably my second successful one.  Libaudioverse didn't fail; it's just much wider in scope and would take too much time to finish.  Point being, if I say something is a bad idea and it has to do with Python or audio, I probably know what I'm talking about, and I'd appreciate some credit, as opposed to the current thing where you keep trying to tell me that I'm wrong.  You probably wouldn't use this if it existed anyway, simply because 99% of projects don't need a custom effect.  And as I've already said, custom sources are a different kind of thing and you'll be able to implement them in Python just fine, and I'm already looking into which embeddable scripting languages meet the licensing and hard-realtime requirements for custom effects.
