[pulseaudio-discuss] On scaling the HRIR in module-virtual-surround-sink

Alexander E. Patrakov Sat, 08 Mar 2014 10:59:23 -0800

Hello.

[Do not blindly apply patches from this e-mail! They are mutually exclusive, and I don't have a firm opinion on which one of them is correct.]

Today I tried to improve the existing module-virtual-surround-sink (but the same issue also affects the IIR-based rewrite that is still sitting on my laptop). The problem is that the current normalization code does not do what it is designed to do: the module clips on some test cases. Let me copy-paste the problematic code for easy discussion.


    /* normalize hrir to avoid clipping */
    hrir_max = 0;
    for (i = 0; i < u->hrir_samples; i++) {
        hrir_sum = 0;
        for (j = 0; j < u->hrir_channels; j++)
            hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);

        if (hrir_sum > hrir_max)
            hrir_max = hrir_sum;
    }
    if (hrir_max > 1) {
        for (i = 0; i < u->hrir_samples; i++) {
            for (j = 0; j < u->hrir_channels; j++)
                u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
        }
    }

And here is where clipping (i.e. sum_left or sum_right exceeding 1.0 in magnitude) can happen:


        sum_right = 0;
        sum_left = 0;

        /* fold the input buffer with the impulse response */
        for (j = 0; j < u->hrir_samples; j++) {
            for (k = 0; k < u->channels; k++) {

                current_sample = u->input_buffer[((u->input_buffer_offset + j) % u->hrir_samples) * u->channels + k];

                sum_left += current_sample * u->hrir_data[j * u->hrir_channels + u->mapping_left[k]];
                sum_right += current_sample * u->hrir_data[j * u->hrir_channels + u->mapping_right[k]];

            }
        }

So, basically, here is the "normalization" logic: for each HRIR sample, take the sum of the absolute values of all HRIR channels at that time. Find the maximum of these sums. Divide the HRIR by that maximum, with an extra "fudge factor" of 1.2. Without the fudge factor, the corresponding explanation in terms of the second code fragment is: if a single-sample click is played simultaneously through all 5.1 channels, the output of the filter (sum_left or sum_right) should just reach 1.0 at its maximum. With hrir-kemar.wav from http://stuff.salscheider-online.de/hrir_kemar.tar.gz (mentioned in the original submission), this logic (including the fudge factor) divides the HRIR by 1.862.

What I don't understand is why a "simultaneous click in all input channels" is considered at all.

If the task is to avoid clipping even in the worst possible case, then let's construct this worst possible case. Here are the instructions. Take hrir-kemar.wav. Time-reverse it. Saturate all positive samples to 1.0 and all negative samples to -1.0. With this specially constructed sound, sum_left would at one point in time receive the sum of the absolute values of all samples (in all channels) in the HRIR, so that is the value to divide out, and not what is written in the first code fragment. I.e., this would be the patch (sorry if mangled, it is not meant to be applied anyway):


--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -533,7 +533,7 @@ int pa__init(pa_module*m) {

     const char *hrir_file;
     unsigned i, j, found_channel_left, found_channel_right;
-    float hrir_sum, hrir_max;
+    float hrir_sum;
     float *hrir_data;

     pa_sample_spec hrir_ss;
@@ -759,20 +759,21 @@ int pa__init(pa_module*m) {
     }

     /* normalize hrir to avoid clipping */
-    hrir_max = 0;
+    hrir_sum = 0.0;
     for (i = 0; i < u->hrir_samples; i++) {
-        hrir_sum = 0;
         for (j = 0; j < u->hrir_channels; j++)
             hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);
+    }
+

-        if (hrir_sum > hrir_max)
-            hrir_max = hrir_sum;
+    if (hrir_sum < 0.01) {
+        pa_log("hrir file is too quiet!");
+        goto fail;
     }
-    if (hrir_max > 1) {
-        for (i = 0; i < u->hrir_samples; i++) {
-            for (j = 0; j < u->hrir_channels; j++)
-                u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
-        }
+
+    for (i = 0; i < u->hrir_samples; i++) {
+        for (j = 0; j < u->hrir_channels; j++)
+            u->hrir_data[i * u->hrir_channels + j] /= hrir_sum;
     }

     /* create mapping between hrir and input */

(Here I also removed the "don't normalize quiet HRIRs" logic, as I don't understand its purpose and it had an obvious bug of not including the fudge factor.)

Result: after this patch, the HRIR is divided by 24.339 and the module indeed survives the worst possible test case, just barely without clipping. But then it becomes too quiet for normal use. That's why I don't want you to apply the above patch.
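The worst-case argument can be checked on a toy example. The following is only a sketch under stated assumptions: the 3-frame, 2-channel HRIR is made-up data (not hrir-kemar.wav), the function names are hypothetical, and the mapping is assumed to be the identity (input channel k uses HRIR channel k). Instead of time-reversing a sound and running the full filter, it lines each saturated input sample up directly with the tap it would multiply.

```c
#include <stddef.h>

/* Toy HRIR, NOT real data: 3 frames x 2 interleaved channels. */
#define TOY_FRAMES 3
#define TOY_CHANNELS 2

static const float toy_hrir[TOY_FRAMES * TOY_CHANNELS] = {
     0.5f,   -0.25f,
    -0.125f,  0.5f,
     0.25f,   0.125f,
};

/* Sum of the absolute values of every HRIR sample in every channel. */
static float hrir_abs_sum(const float *hrir, size_t frames, size_t channels) {
    float sum = 0.0f;
    size_t i;

    for (i = 0; i < frames * channels; i++)
        sum += hrir[i] < 0.0f ? -hrir[i] : hrir[i];

    return sum;
}

/* One accumulator value (like sum_left) at the moment when a full-scale
 * input, whose sign matches each HRIR tap, is fully aligned with the HRIR.
 * Assumes the identity channel mapping. */
static float worst_case_output(const float *hrir, size_t frames, size_t channels) {
    float sum = 0.0f;
    size_t j, k;

    for (j = 0; j < frames; j++) {
        for (k = 0; k < channels; k++) {
            float h = hrir[j * channels + k];
            float x = h < 0.0f ? -1.0f : 1.0f; /* saturated input sample */

            sum += x * h; /* every product is non-negative */
        }
    }

    return sum;
}
```

For this toy data both functions return 1.75, while the largest per-frame sum (what the existing code computes as hrir_max) is only 0.75.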

The original code, even though it clips heavily in the worst possible test case, indeed produces no audible clipping on most (but not all!) music DVDs that I tried, and I think that the correct problem to solve here is indeed "produce no audible clipping on typical content". So there is some grain of truth in the current code.

Still, I want to change the fudge factor, because, as of now, the module does not survive the following test cases (i.e. it clips, and the clipping is very easy to detect by ear):

* A full-amplitude sine sweep from 100 Hz to 10 kHz, same samples in all channels.
* The 5.1 soundtrack from the "Lichtmond 2: Universe of Light" music Blu-ray.

The fudge factor required to play the whole Lichtmond 2 soundtrack without clipping is 2.4, corresponding to dividing the HRIR by 3.724. So:


--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -768,11 +768,15 @@ int pa__init(pa_module*m) {
         if (hrir_sum > hrir_max)
             hrir_max = hrir_sum;
     }
-    if (hrir_max > 1) {
-        for (i = 0; i < u->hrir_samples; i++) {
-            for (j = 0; j < u->hrir_channels; j++)
-                u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
-        }
+
+    if (hrir_max < 0.01) {
+        pa_log("hrir file is too quiet!");
+        goto fail;
+    }
+
+    for (i = 0; i < u->hrir_samples; i++) {
+        for (j = 0; j < u->hrir_channels; j++)
+            u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 2.4;
     }

     /* create mapping between hrir and input */

I don't like the above (working and almost minimal) patch, as it just replaces one magic number with another, and I still don't understand why the calculated hrir_max value is relevant. I.e. the same "correct" scaling factor, 3.724, can also be approximately produced this way, from the energy contained in the impulse response:


--- a/src/modules/module-virtual-surround-sink.c
+++ b/src/modules/module-virtual-surround-sink.c
@@ -533,7 +533,7 @@ int pa__init(pa_module*m) {

     const char *hrir_file;
     unsigned i, j, found_channel_left, found_channel_right;
-    float hrir_sum, hrir_max;
+    float hrir_sample, hrir_energy;
     float *hrir_data;

     pa_sample_spec hrir_ss;
@@ -758,21 +758,25 @@ int pa__init(pa_module*m) {
         goto fail;
     }

-    /* normalize hrir to avoid clipping */
-    hrir_max = 0;
+    /* normalize hrir to avoid clipping on typical music */
+    hrir_energy = 0.0;
     for (i = 0; i < u->hrir_samples; i++) {
-        hrir_sum = 0;
-        for (j = 0; j < u->hrir_channels; j++)
-            hrir_sum += fabs(u->hrir_data[i * u->hrir_channels + j]);
+        for (j = 0; j < u->hrir_channels; j++) {
+            hrir_sample = u->hrir_data[i * u->hrir_channels + j];
+            hrir_energy += hrir_sample * hrir_sample;
+        }
+    }

-        if (hrir_sum > hrir_max)
-            hrir_max = hrir_sum;
+    if (hrir_energy < 0.0001) {
+        pa_log("hrir file is too quiet!");
+        goto fail;
     }
-    if (hrir_max > 1) {
-        for (i = 0; i < u->hrir_samples; i++) {
-            for (j = 0; j < u->hrir_channels; j++)
-                u->hrir_data[i * u->hrir_channels + j] /= hrir_max * 1.2;
-        }
+
+    hrir_sample = sqrt(hrir_energy) * 1.67;
+
+    for (i = 0; i < u->hrir_samples; i++) {
+        for (j = 0; j < u->hrir_channels; j++)
+            u->hrir_data[i * u->hrir_channels + j] /= hrir_sample;
     }

     /* create mapping between hrir and input */

Of course, the same criticism applies as for the second patch: why would the energy be relevant here?


Some final remarks.

I don't insist on the 3.724 scaling factor and can accept occasional clipping if it is not detectable by ear.

Since we are dealing with "typical content", I understand that a fudge factor is needed anyway, to deal with its typicalness.

The code has to deal with HRIRs of arbitrary origin, including those from http://stuff.salscheider-online.de/hrir_listen.tar.gz, which is why a static factor to divide out won't cut it. But I haven't tested any other HRIRs yet.

Wrong equalization of the HRIRs (i.e. expensive headphones sounding like cheap plugs with this module) is a separate bug that will be addressed in a separate e-mail when I become able to discuss it.


Thoughts?

--
Alexander E. Patrakov
