Re: [pulseaudio-discuss] [PATCH] Remove module-equalizer-sink

Alexander E. Patrakov Mon, 10 Mar 2014 12:52:40 -0700

10.03.2014 20:14, Jason Newton wrote:

Author here.


Hello, and thanks for showing up.

Wow, I didn't realize there were so many problems with it (somewhat sarcastic here). But I have been using it every day for hours since before I submitted it and I've stayed up to date with pulseaudio (as a user) with each successive version as it's been released on openSUSE. The only real issue I've had is that tsched and flash and the equalizer don't really work together and video games don't like the latency - I never noticed a delay with mpv/mplayer and videos or anything irritating with music on mpd.

The delay is indeed not noticeable if you have nothing to compare the sound timing with. I.e. not a problem with music, but a problem with videos.

For those 2 trouble cases of flash/games I just move them to the hardware sink. Yeah, rewinding/seeks of audio cause a small audio glitch but it's not something that bothers me much as it automatically recovers within a second.

PulseAudio must be perfect :) and I think that this glitch is, in theory, fixable. After all, the past samples are stored in input_q, your module is notified about rewinds, and it can reprocess the past samples when needed.

It's worked pretty well for me. I've seen a few gentoo people using it but I know it never really reached mass usage due to the (unfortunate) timing and naming of pulseaudio equalizer (the gtk ladspa script with non-real-time tuning). I've liked the large spectrum of the filter to also have makeshift notch filters for when I'm too lazy to actually load things up in matlab or audacity.

For notches, IIR filters would be far more appropriate than anything FFT-based, due to much lower algorithmic latency and, in some cases, lower CPU usage for a similar frequency resolution. As for the unfortunate timing and the confusing naming of the gtk-based script - I agree.

I'll go through this a little bit to answer what I can at the moment (I just sort of stumbled on to this thread, I'm neck deep in other stuff these days).
On Fri, Mar 7, 2014 at 11:43 AM, Alexander E. Patrakov <patra...@gmail.com> wrote:
    1. The FFT size and the window size for overlap-add are chosen
    inconsistently. The window size is always 16000 samples. The FFT size
    depends on the sample rate. Thus, with sample rates of 16 kHz or below,
    there is a buffer overflow (window becomes larger than FFT).
This was done to allow more detail to be preserved from the signal, or at least that was the thought. FFT size would be 2^(ceil(log2(sample_rate))) so the spectrum covered all theoretical frequencies the audio signal at that sample rate could contain. Granularity of latency would be something that isn't too bad for most people and still efficient as those FFTs are still expensive. Perhaps this should have been based on the sample rate as well, but the latency wasn't bad for me and cpu usage is overall 1-2% at 44.1khz on an i7. I don't recall it being bad (2-5%?) at 96khz back on an i7 920. I didn't think one would really have a sample rate lower than 16khz.

As for "so the spectrum covered all theoretical frequencies the audio signal at that sample rate could contain": any size of the FFT can represent all frequencies between 0 Hz and half the sample rate; the only real difference is about granularity of directly representable frequencies. I.e., with short FFT sizes, "weird" frequencies (those that are not a multiple of the sample rate divided by the FFT size) are represented not directly, but by variation of the transform results in successive periods. I guess suitable granularity is what you mean here (after all, your choice of the FFT size can't represent half-integer frequencies directly, and they do exist), and that's what I mean with my point 3.

Even though one is unlikely to have a sample rate lower than 16 kHz, it still happens sometimes in real life (e.g. with Bluetooth), and a buffer overflow is always a bug. So, to avoid inconsistent choices, you should choose both the window size and the FFT size as a function of the sampling rate.
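
To make the "function of the sampling rate" idea concrete, here is a minimal Python sketch. It is not taken from module-equalizer-sink: the helper names choose_sizes() and next_pow2() are hypothetical, and the 33 ms window with an FFT at least twice as long is only an assumption borrowed from point 3 of the review. The point is that the padding then scales with the window at every rate, so the window can never outgrow the FFT.

```python
def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def choose_sizes(sample_rate, window_ms=33.0):
    """Hypothetical helper: derive window and FFT size from the sample rate.

    Assumption: the window holds about window_ms of audio and the FFT is at
    least twice as long, so the padding always scales with the window.
    """
    window = int(round(sample_rate * window_ms / 1000.0))
    fft_size = next_pow2(2 * window)
    return window, fft_size


for rate in (8000, 16000, 44100, 96000):
    window, fft_size = choose_sizes(rate)
    padding = fft_size - window   # room left for the tail of the impulse response
    bin_hz = rate / fft_size      # spacing of directly representable frequencies
    print(f"{rate} Hz: window={window}, fft={fft_size}, "
          f"padding={padding}, bin={bin_hz:.2f} Hz")
```

Rounding the FFT size up to a power of two is a convenience for this sketch, not a requirement of the method.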

    2. There is no attempt to ensure that the impulse response of the
    equalizer is short enough to fit in the difference between the FFT size
    and the window size (a requirement for the correct operation of the
    overlap-add technique). See [1] for the FFT size consideration and [2]
    for why it is important to regularize the desired frequency response.
Yes, I recall someone mentioned that theoretically there is "ringing" that occurs due to this... it's been below the noise floor for me if it happens. Has it been a problem for anyone? By the way, maybe I'm remembering wrong but I believe that tails from the "ringing" cancel out with COLA method when the filter is constant, that's why I used it and allowed an arbitrary magnitude based filter (no phase adjustments though). I'm not an audio engineer though, I mainly do spatial signal processing on images which is a whole different ball-game.

It is below the noise floor only because you have not tried a non-smooth frequency response (the one that you cannot draw in qpaeq without stretching its X11 window well beyond the screen size), and because of the excessive size of the window that allows for long tails of the impulse response. Still, as long as you don't make absolutely sure that the impulse response of the filter is shorter than the FFT size minus the window size, you are not implementing a convolution. COLA doesn't help you here, it is indeed useful only for reconstruction (i.e. for audio codecs, i.e. for unity or constant gain), not for manipulation (i.e. filters, including the equalizer).

I also believe that you are confused by two similarly-named, often discussed together, but actually different and unrelated things: (1) "overlap-add decomposition", which is related to short-time Fourier transform and which only works if the window satisfies the COLA condition, and (2) "overlap-add method", which is a means to precisely implement a convolution of a long signal with a short one, i.e. a useful trick against circular convolution.



Let me explain again what happens here.

First, let's note that we need to create a filter that is linear and time-invariant. This means it is a convolution. So we need a convolution of the input signal with something related to the equalizer settings. That something has a well-defined length (not counting any trailing zeros), but, if you don't actively enforce it by windowing, you will generally end up with it occupying the whole FFT size (N). Let that well-defined length be L.

Then, let's note that convolution with such a signal will make a given input sample affect only the corresponding output sample and L-1 more output samples in the future.

With the overlap-add method, one should cover the signal with a sequence of windows, such that COLA is satisfied (the best choice here would be rectangular windows, see below). Then pad each windowed piece of signal with zeros. Then take the FFT, multiply the result by the FFT of the "something", take an IFFT of the result. You will end up with overlapping pieces of the output signal that you just need to add.

Now let's consider the "take the FFT, multiply the result by the FFT of the something, take an IFFT of the result" step in more detail. What is prescribed here is a circular convolution. It is almost the same as the regular convolution, except that, instead of producing some samples to the right of the original signal, it will wrap them around and push them to the left, adding to the existing samples (and thus corrupting them).

Padding is a way to avoid corruption here. To avoid corruption (i.e. to make "normal" and circular convolution yield the same result), you need L-1 samples of padding. Since you also have to leave some space for the actual input signal, you need L < N, i.e. truncate the impulse response. To avoid severely corrupting the spectrum by truncation, you need to apply a window to the wanted impulse response.
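
The wrap-around can be seen on a toy example. The following Python sketch is only an illustration (the function names are made up, and the circular convolution is computed directly in the time domain, which gives the same numbers as the FFT, multiply, IFFT sequence). It convolves a block of M = 4 samples with an impulse response of L = 3 samples, first without padding and then with L-1 samples of padding.

```python
def linear_convolve(x, h):
    """Ordinary convolution: output has len(x) + len(h) - 1 samples."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h, n):
    """Circular convolution of size n: what FFT, multiply, IFFT computes."""
    y = [0.0] * n
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[(i + j) % n] += xi * hj   # samples past n wrap to the start
    return y


block = [1.0, 2.0, 3.0, 4.0]   # M = 4 input samples
h = [1.0, 1.0, 1.0]            # L = 3 samples of impulse response

expected = linear_convolve(block, h)
unpadded = circular_convolve(block, h, len(block))               # N = M
padded = circular_convolve(block, h, len(block) + len(h) - 1)    # N = M + L - 1

print("linear:  ", expected)   # [1.0, 3.0, 6.0, 9.0, 7.0, 4.0]
print("unpadded:", unpadded)   # [8.0, 7.0, 6.0, 9.0], tail wrapped onto the start
print("padded:  ", padded)     # identical to the linear result
```

Without padding, the last two output samples (7 and 4) are added onto the first two, which is exactly the corruption described above.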

Now let's deal with the (false) statement "the filter is defined in the frequency domain and only changes amplitude of frequency (not phase)" again. To see what you are convolving the input signal with, you need to take the IFFT of the filter's frequency-domain representation. If it is even (which is true if your filter is a zero-phase one), then you'll end up with something like Figure 17-1(b) from http://www.dspguide.com/ch17/1.htm . I.e. the filter will want to affect the near future and the near past equally, but the past portion will wrap around to the far future. Oops. To avoid this, you will need to shift the impulse response circularly to the right, thus making it linear-phase instead of zero-phase. And after that you will no longer be able to say that your filter does not affect phase.

And, here is what the book (http://www.dspguide.com/ch17/1.htm ) says about the "something": "it needs to be /shifted/, /truncated/, and /windowed/" - just what I explained above.
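
A rough Python sketch of the "shifted, truncated, windowed" recipe follows. It is an illustration under assumptions, not the module's code: design_kernel() and idft_real() are hypothetical names, a plain O(N^2) inverse DFT stands in for the IFFT, and the Hann-shaped window is just one possible choice (the quoted text only says "windowed").

```python
import cmath
import math


def idft_real(spectrum):
    """Naive inverse DFT, keeping the real part (input is real and symmetric)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


def design_kernel(magnitude, length):
    """Turn N symmetric gains into a causal linear-phase kernel of odd length L < N."""
    n = len(magnitude)
    assert length % 2 == 1 and length < n
    zero_phase = idft_real(magnitude)   # even: past part sits at the end of the buffer
    half = (length - 1) // 2
    # Shift right by `half` and truncate: keep samples -half..+half of the
    # zero-phase response, stored at indices 0..L-1.
    shifted = [zero_phase[(t - half) % n] for t in range(length)]
    # Window (assumed Hann shape with non-zero end points) to soften the truncation.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (t + 1) / (length + 1))
              for t in range(length)]
    return [s * w for s, w in zip(shifted, window)]


# Example: crude low-pass on a 32-point grid, gains mirrored so that
# magnitude[k] == magnitude[32 - k].
gains = [1.0 if (k <= 4 or k >= 28) else 0.0 for k in range(32)]
kernel = design_kernel(gains, 15)
print([round(v, 4) for v in kernel])
```

The resulting kernel is symmetric around its middle sample, i.e. linear-phase with a delay of (L-1)/2 samples, and it is short enough to leave room for the input block inside the FFT.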



    3. The large window size (16000 samples) is unjustified. It only leads
    to excessively high latency. A "33 ms of audio" window and "66 ms of
    audio" FFT size would be sufficient for any practical implementation
    of the equalizer and would still allow one to control the attenuation
    at 30 Hz and 60 Hz separately (as commonly found in music players), all
    with only 33 ms of latency.

OK you can tune that if you think it's a problem... I was just balancing something that worked ok for me while trying not to do too many FFTs, a respectable default for the general case.

Too many short FFTs is not a problem. A long window is a problem, because you can't get the output sample corresponding to the first input sample without processing the whole window, and this means that you can't achieve the low latency needed by games.

    4. This latency is not properly reported (fdo bug 41465).
Some of the harder to understand pulseaudio details like latency escaped me - I know the concepts very well but I tried for hours to tune it and asked Lennart countless times for help regarding this (this was in the days when he was starting the hand off). Perhaps other bugs or something in PA at the time made that harder to understand and work with. If you understand how to make that part work better, please experiment with it and push a patch because it was lost on me at the time.


There is indeed no useful documentation :(

Latency is how far ahead the last written sample is of what the user hears. It consists of three parts: latency of the parent sink, the amount buffered in input_q, and the algorithmic latency. See sink_process_msg_cb() of module-virtual-sink. You have two problems here: output_q (IMHO a totally unnecessary layer of buffering), and not counting the algorithmic latency (which is, in samples, for causal linear-phase filters, M + L/2, where M is the window size). I cannot provide a patch here, because this would need a well-defined L beforehand.
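
As plain arithmetic, the three parts add up as in the Python sketch below. This is not PulseAudio code: the function names and the example numbers are hypothetical, everything is kept in samples and microseconds for readability, and the real module would use PulseAudio's own types and conversion helpers, which are not reproduced here.

```python
def algorithmic_latency_samples(window_size, kernel_length):
    """M + L/2 for a causal linear-phase filter, as stated above."""
    return window_size + kernel_length / 2


def total_latency_usec(parent_usec, queued_samples, window_size,
                       kernel_length, sample_rate):
    """Parent sink latency + samples still queued + algorithmic latency."""
    algo = algorithmic_latency_samples(window_size, kernel_length)
    return parent_usec + (queued_samples + algo) * 1_000_000 / sample_rate


# Hypothetical numbers: 20 ms in the parent sink, nothing queued,
# M = 1000 and L = 200 at a 50 kHz sample rate.
print(total_latency_usec(20_000, 0, 1000, 200, 50_000))   # 42000.0
```

The only point of the sketch is that the M + L/2 term has to be part of the reported sum, which requires L to be a known, enforced quantity.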

    5. There are numerous issues with code style. A lot of commented-out
    code, and a "FIXME: Please clean this up" that was never fixed in 4 years.
I'll see what I can do about that, some of those were from module-virtual-sink when Lennart was updating it, most of them were completely unclear as to what should change or if changed resulted in crashes (the unref one). Again, Lennart got hard to access in those days and I couldn't figure out what to do with them as he put them there. The rewind stuff was also very confusing overall, I recall this (outside of module-equalizer-sink) coming up before and no clear answers on the horizon. You got me on the dead code one though. I argue a few of those lines should be left as they were frequently useful in developing the filter or diagnosing something - perhaps I should guard them in a debug macro instead, rather than delete or leave them commented?


There is pa_log_debug().

    6. There are memory management issues. E.g. the already-pointed-out leak
    at the end of sink_input_pop_cb(), and the fact that a buffer for the
    FFT is allocated by fftw_malloc() but freed by free().
Interesting, rather than complain I would've just made a patch but point taken. I'll try to submit a patch for that soon but my time for PA efforts is pretty small.

I would have submitted the patch if this was the only problem and if the previous person also fixed the unref instead of adding a FIXME.


    7. The use of the Hanning window is unjustified. A rectangular window
    works just as well for the purpose of overlap-add based convolution, but
    allows one to do fewer FFTs, i.e. go faster. In fact, the DSP Guide book
    does not even mention the variant of overlap-add with a non-rectangular
    window!

Here's a few links that might answer it, I'll also mention most of my work was derived from these:

http://www.dsprelated.com/dspbooks/sasp/COLA_Examples.html
http://www.dsprelated.com/dspbooks/sasp/Frequency_Domain_COLA_Constraints.html

Unfortunately, as explained above, COLA is not the only thing required. Sufficient padding is required too, if modification of the spectrum is needed.

http://www.dspguide.com/


Sorry, please link to concrete places, not the whole book.

And I have one more link for you to see:

https://ccrma.stanford.edu/~jos/sasp/Overlap_Add_Decomposition.html

The last sentence on the page explicitly recommends a rectangular window for the purpose of FIR filtering. This way, you will be able to shift the window by its whole width between the FFTs, not by the half of its width, thus needing only half of the FFTs.
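
Here is a small Python sketch of the overlap-add method with rectangular windows, i.e. plain non-overlapping blocks. It is an illustration only: the names are made up, and a time-domain circular convolution of size N stands in for the FFT, multiply, IFFT step. It checks that the block-wise result equals a direct convolution and counts how many transforms were needed.

```python
def direct_convolve(x, h):
    """Reference: ordinary convolution of the whole signal."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h, n):
    """Stand-in for FFT, multiply, IFFT with transform size n."""
    y = [0.0] * n
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[(i + j) % n] += xi * hj
    return y


def overlap_add(signal, h, block_size, fft_size):
    """Overlap-add with rectangular windows: the hop equals the block size."""
    assert block_size + len(h) - 1 <= fft_size   # enough padding, no wrap-around
    out = [0.0] * (len(signal) + len(h) - 1)
    blocks = 0
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]        # rectangular window
        piece = circular_convolve(block, h, fft_size)   # zero padding is implicit
        for i, v in enumerate(piece):
            if start + i < len(out):
                out[start + i] += v                     # tails overlap and add
        blocks += 1
    return out, blocks


signal = [float(v) for v in range(1, 11)]   # 10 samples
h = [1.0, -1.0, 0.5]                        # L = 3
result, blocks = overlap_add(signal, h, block_size=4, fft_size=8)
print(result == direct_convolve(signal, h), blocks)   # True 3
```

With a Hanning window the hop would be half the window width, so roughly twice as many blocks would have to be transformed for the same signal.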

I don't think it's as broken as you insinuate it to be, but maybe I'm wrong... it does work as desired (as a user) however.

Hanning window works indeed, it just means some extra (unneeded) work, provided that you have enough padding.

    8. The module does not use any benefits (such as a chance to handle
    rewinds properly) of being a native PulseAudio module.
Please explain how to handle it and I might add it... again there was a lot of black magic in pulseaudio in those days wrt stuff like this. I suspect any of the module-virtual-sink inheritors might benefit in exactly the same way with potentially duplicated code. But I don't know how numerous they are...

You are right here, module-virtual-sink contains some wrong advice, like resetting the filter on any rewind, even if it is possible to do better (i.e. to restore the past internal state of the filter).


Here is my point in more detail (knowingly exaggerated).

Writing a native PulseAudio module is hard: you need to implement 20 callbacks, by copy-pasting the black magic, and with poor documentation. Writing a LADSPA plugin is much easier: only 8 callbacks (without much code duplication) and much better documentation, and the ability to use the result in PulseAudio via module-ladspa-sink. So there must be some gain for the whole exercise of writing a native PulseAudio module to be worth the pain (both for you and for PulseAudio developers, as any code is a liability).


So here is the potential gain.

A. The ability to report algorithmic latency properly, via PA_SINK_MESSAGE_GET_LATENCY. In fact, there is a hack (control output named "latency") in LADSPA for that, but PulseAudio doesn't use it. It looks easy to fix, though.

B. The ability to handle rewinds. In LADSPA, there is no way to say "forget the last N samples that I wrote to you". In PulseAudio, there is such a way, and it is essential for all modules to either support that seamlessly (so that a call to pa_stream_write() with the last two parameters other than 0,0 and overwriting the previously-written samples with themselves is a no-op), or cleanly fall back to the low-latency no-rewinds mode in order for the "tsched" feature to be finally completed.

Your current module neither reports its algorithmic latency, nor fixes up its internal state (mostly overlap_accum) on rewind, so there is no actual gain, but all the pain (including the fact that I sent the "please remove" e-mail). As this currently looks like black magic that nobody explained to you, maybe it is indeed a better idea to write a LADSPA plugin instead of a native PulseAudio virtual sink module?



    9. None of the known alternative "system-wide equalizer GUI" projects
    work on top of module-equalizer-sink. Google has no external hits on the
    [EqualizedSinks dbus] search query that would find any alternative GUIs
    that use module-equalizer-sink DBus interface. Both veromix [3] and
    pulseaudio-equalizer [4] work on top of module-ladspa-sink. And they are
    usable with video players, too, even though both are unmaintained.

OK...? Is qpaeq a problem for you or did you miss it altogether? I don't see why this was listed.

The keyword here is "external". qpaeq is a part of PulseAudio, not an external project.



    10. It's more an optimization question, but I have a gut feeling that
    overlap-save [5] would be a better fit for PulseAudio buffering model
    than overlap-add. Namely, it would allow to abandon the need to store
    overlap_accum. Instead, the only buffer needed would be for looking back
    at the past portions of the input signal, and we already have that
    anyway due to the need to react to rewinds. In other words, from a
    qualified DSP engineer, I would rather expect a rewrite of the algorithm
    instead of any attempts to improve it gradually.

    In short, let's not pretend that PulseAudio has a working native
    equalizer module. It should have never been accepted in the current form
    in the first place. A better replacement already exists in the form of
    module-ladspa-sink + mbeq + veromix.

Regarding overlap-save, I evaluated this and specifically chose overlap-add instead, however, it having been in 2009 or so, I can't really remember it so well anymore. Maybe it'll come back to me, I suspect it will come back when I look over the dspguide book again.

OK, I will wait. But my point stays: with overlap-save, you will not need overlap_accum, and thus will need no internal state to restore on rewinds. This makes handling rewinds much easier.
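
For comparison, a minimal overlap-save sketch in Python. Again this is an illustration with made-up names, using a time-domain circular convolution in place of the FFT, multiply, IFFT step. The only thing carried from block to block is the last L-1 input samples; there is no accumulator of partial output.

```python
def direct_convolve(x, h):
    """Reference: ordinary convolution of the whole signal."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def circular_convolve(x, h, n):
    """Stand-in for FFT, multiply, IFFT with transform size n."""
    y = [0.0] * n
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[(i + j) % n] += xi * hj
    return y


def overlap_save(signal, h, block_size):
    """Overlap-save: prepend L-1 past input samples, discard L-1 outputs."""
    tail = len(h) - 1
    fft_size = block_size + tail
    history = [0.0] * tail            # past input only, no output accumulator
    out = []
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        block = block + [0.0] * (block_size - len(block))   # pad the last block
        segment = history + block
        piece = circular_convolve(segment, h, fft_size)
        out.extend(piece[tail:])      # the first L-1 samples are wrapped: drop them
        history = segment[len(segment) - tail:]
    return out[:len(signal)]


signal = [float(v) for v in range(1, 11)]
h = [1.0, -1.0, 0.5]
result = overlap_save(signal, h, block_size=4)
print(result == direct_convolve(signal, h)[:len(signal)])   # True
```

In this sketch the only state is a slice of past input, which is the kind of data that, per the discussion above, is already kept for rewinds.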


--
Alexander E. Patrakov

