On 2014-11-13 23:49, Andrey Semashev wrote:
On Thursday 13 November 2014 11:32:09 you wrote:
On Thu, Nov 13, 2014 at 8:33 AM, David Henningsson <david.hennings...@canonical.com> wrote:
On 2014-11-11 22:39, Andrey Semashev wrote:
In short, libsoxr is almost always faster than speex and introduces much
less distortion. Its passband frequency is slightly lower than speex's,
though, and it can add a delay of up to 20 ms in some cases.

I'm interested in knowing more about the delay. What are "some cases"?

"Some cases" means some sample rate combinations. In my tests I
measured the delay of the resampler, and it was ~20 ms max. I don't
have the results accessible now, I'll add them tonight to the results
page.

Ok, here are some interesting results.

Cool, thanks for the testing!

1. The delay does not depend on the input format (int16 vs float) or content.
I tried with two different input pieces of content.

2. The delay _does_ depend on the input frame size (i.e. the amount of input
samples you pass to the resampler in one chunk). I tested for frame sizes of
20 and 100 samples per channel. There isn't a particularly obvious relation
between the frame size and the delay.

3. The delay is typically lower for low quality presets (-lq, -mq), but that's
not always the case.

4. The delay is typically lower for 2-fold sample rate conversions (i.e. 48kHz
<-> 96kHz).

5. The delay varies over a wide range between different sample rate
combinations. Different quality presets, on the other hand, do not differ as
much. There are cases of 20 ms delay on all three -mq, -hq and -vhq
presets, as well as cases of <5 ms.

6. With frame size 20 min/max delay values are:

    -mq: 1.917/20.604 ms
    -hq: 2.708/20.000 ms
    -vhq: 4.208/20.000 ms

   In the case of 44.1 kHz, 16 bit, int -> 48 kHz it is 20.604/20.000/20.000
ms for the three presets.

7. With frame size 100 min/max delay values are:

    -mq: 2.771/12.336 ms
    -hq: 7.104/16.531 ms
    -vhq: 5.250/27.256 ms

   In the case of 44.1 kHz, 16 bit, int -> 48 kHz it is 2.771/7.104/15.292 ms
for the three presets.

   Note that 27.256 for -vhq in this case is larger than I stated in my initial
announcement and the docs patch. I had not tested frame size 100 at that time.
I will update the docs patch accordingly (probably, by describing the delay
range more loosely).
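To put the numbers above next to the frame sizes, here is a small arithmetic sketch in Python. It only converts between milliseconds and sample counts. The choice to express the delay in output-rate (48 kHz) samples is an assumption, since the post does not say which side of the resampler the delay was timed on, and the helper names are made up for illustration.

```python
def ms_to_samples(delay_ms: float, rate_hz: int) -> float:
    """Convert a delay in milliseconds to a sample count at rate_hz."""
    return delay_ms * rate_hz / 1000.0


def samples_to_ms(samples: int, rate_hz: int) -> float:
    """Convert a per-channel sample count at rate_hz to milliseconds."""
    return samples * 1000.0 / rate_hz


# 44.1 kHz -> 48 kHz case quoted above. Assumption: the delay is counted
# in output-rate samples; the post does not state this.
IN_RATE, OUT_RATE = 44100, 48000

# Duration of one input chunk for the two tested frame sizes.
for frame in (20, 100):
    print(f"frame of {frame} samples = {samples_to_ms(frame, IN_RATE):.3f} ms of input")

# Frame-size-20 delays for the three presets, expressed in output samples.
for preset, delay_ms in (("-mq", 20.604), ("-hq", 20.000), ("-vhq", 20.000)):
    samples = ms_to_samples(delay_ms, OUT_RATE)
    print(f"{preset}: {delay_ms} ms = about {samples:.0f} output samples")
```

Under that assumption, a 20 ms delay corresponds to many times the duration of a single 20-sample input frame, which is at least consistent with the buffering guess.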

I do not have an explanation for such a diverse range of delay values, or for
their dependency on the frame size. It doesn't look like the filter is
"learning" from the input in some way, since the delay doesn't depend on the
content. Perhaps there is some extensive buffering in the implementation.

Well, the delay must be constant given the parameters. If the delay was varying during playback, that would probably cause very interesting sound effects, such as music being slightly out of tempo or so...

What does vary during playback, however, is how big the chunks are that we pass into the resampler on each call. Which raises the question of whether it is the first chunk that determines the delay, or...?

I can try to perform more tests with different frame sizes in an attempt to
determine the approximate maximum delay. I suspect that even after such
testing I won't be absolutely sure that the discovered upper value won't
ever be exceeded in some other case I did not cover. I can, however, test with
the frame size that is used in PulseAudio, if such a fixed or typical value
exists (does it?).
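One way such a frame-size sweep could be set up is sketched below in Python. This is not the test program used for the numbers in this thread. It feeds a unit impulse through a chunked processor and reports where the output peaks. `FixedDelay` is a hypothetical stand-in with a known delay (it is not libsoxr or speex), and `measure_delay_ms` is an illustrative name; a real resampler would be substituted for the stand-in.

```python
from typing import Callable, List

Chunk = List[float]


class FixedDelay:
    """Hypothetical stand-in for a resampler: a pure delay line.

    It is NOT libsoxr or speex; it only gives the harness a processor
    whose delay is known in advance.
    """

    def __init__(self, delay_samples: int) -> None:
        # Pre-fill with silence so the output lags the input.
        self._fifo: Chunk = [0.0] * delay_samples

    def process(self, chunk: Chunk) -> Chunk:
        # Queue the new samples, then emit as many as were received.
        self._fifo.extend(chunk)
        out = self._fifo[:len(chunk)]
        del self._fifo[:len(chunk)]
        return out


def measure_delay_ms(process: Callable[[Chunk], Chunk], frame_size: int,
                     out_rate_hz: int, total_samples: int) -> float:
    """Feed a unit impulse in frame_size chunks and locate the output peak."""
    signal = [1.0] + [0.0] * (total_samples - 1)
    output: Chunk = []
    for start in range(0, total_samples, frame_size):
        output.extend(process(signal[start:start + frame_size]))
    # The impulse enters at time 0, so the peak position is the delay.
    peak = max(range(len(output)), key=lambda i: abs(output[i]))
    return peak * 1000.0 / out_rate_hz


# Sweep the two frame sizes tested above against the stand-in.
for frame_size in (20, 100):
    delay = measure_delay_ms(FixedDelay(96).process, frame_size, 48000, 1000)
    print(f"frame size {frame_size}: {delay:.3f} ms")
```

The stand-in reports the same delay for both frame sizes, as a pure delay line should. A frame-size dependence like the one observed with soxr would show up as differing values in the same sweep.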

For now the bottom line is that the exact delay of the resampler is difficult
to predict, although it usually does not exceed 20 ms, except in some rare
cases and with -vhq. When delay is critical it is better to use another
resampler, such as speex-5, which consistently stays below 1 ms across the
board. But I think such cases are quite specialized, and soxr is still very
well suited to general use.

Well, what is "quite specialized" and "general use"? If you use your computer primarily for gaming and VOIP, then that's what you consider "general use", and perhaps "listening to music so carefully that you hear the difference between different resamplers" is what you consider "quite specialized"...

So if it was up to me, I'd say let's keep speex-float-1 as the default, as it seems to give the best balance between quality, CPU power, and low latency.

With my upstream hat on, I don't mind adding soxr as an option, and with my distro hat on, I'm always worried about adding new dependencies...

--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic
_______________________________________________
pulseaudio-discuss mailing list
pulseaudio-discuss@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pulseaudio-discuss
