07.09.2014 17:04, Laurențiu Nicola wrote:
Great, thanks!

On Sun, Sep 7, 2014, at 14:02, Alexander E. Patrakov wrote:
07.09.2014 16:58, Laurențiu Nicola wrote:
I have a question related to your tests. In my application, I need
resampling between close rates (let's say from 44200 to 44100). Do you
feel that the results would basically be the same in this kind of
situation?

Thanks,
Laurentiu Nicola

I have not tested this. I will write up instructions next week, so that
you can test any situation you want yourself.

Here are the instructions. Sorry for the delay.

1. Choose a sample rate that you will be resampling from. In your case, this is 44200 Hz. Generate a wav file with this sample rate, containing a linear sweep:

./wavegen.py --rate 44200 --length $(( 1024 * 1024 )) --amplitude 0.99 --format s16 --padding 65536 44200.wav

The length should be large enough that, for every frequency bin of the FFT used in the later steps, the file contains a piece long enough for the FFT (ideally several such pieces) with only that frequency. The amplitude is set to 0.99 to avoid accidental clipping by the resampler. The padding is unfortunately needed because the wav file recorded from the resampler in the next step contains unwanted clicks for some unknown reason.
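For readers who want to see what such a test file contains, here is a minimal sketch of a linear sweep generator. It is not the wavegen.py script itself: the function names, the mono layout, the sweep range (0 Hz up to half the sample rate) and the choice to put the padding silence on both sides are all assumptions made for illustration.

```python
import math
import struct
import wave


def linear_sweep(rate, length, amplitude=0.99, f_start=0.0, f_end=None):
    """Return `length` float samples of a sine whose frequency rises linearly."""
    if f_end is None:
        f_end = rate / 2.0  # assumption: sweep up to the Nyquist frequency
    duration = length / rate
    samples = []
    for n in range(length):
        t = n / rate
        # The phase is the integral of the instantaneous frequency
        # f(t) = f_start + (f_end - f_start) * t / duration.
        phase = 2.0 * math.pi * (
            f_start * t + (f_end - f_start) * t * t / (2.0 * duration)
        )
        samples.append(amplitude * math.sin(phase))
    return samples


def write_s16_wav(path, rate, samples, padding=0):
    """Write mono signed 16-bit samples, with `padding` silent samples on each side."""
    silence = [0.0] * padding  # assumption: padding goes before and after the sweep
    frames = silence + samples + silence
    data = struct.pack(
        "<%dh" % len(frames), *(int(round(s * 32767)) for s in frames)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(data)


if __name__ == "__main__":
    # A short file for a quick look; the command above uses a much longer sweep.
    write_s16_wav("sweep_sketch.wav", 44200, linear_sweep(44200, 65536), padding=4096)
```

Keeping the amplitude below 1.0 leaves a little headroom in the 16-bit samples, which is the same reason the command above uses 0.99.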

2. Resample the file. An easy but slow way is to use a null sink running at the rate you want to resample to. Here are the commands:

pacmd load-module module-null-sink rate=44100

parec -d null.monitor --fix-rate --file-format=wav 44200_to_44100.wav & paplay -d null 44200.wav ; killall parec

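Hopefully these commands are self-explanatory: the null sink runs at the target rate, parec records from its monitor (with --fix-rate, so that the recording uses the sink's rate), paplay plays the sweep into the sink, and killall stops the recording once playback has finished.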
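If you want to repeat this step for several rates, the same sequence can be driven from a script. The following is only a sketch: the helper names are made up, it assumes the null sink gets its default name "null" (as in the commands above), and it stops only the parec process it started instead of using killall.

```python
import subprocess


def build_commands(rate_to, src_wav, dst_wav, sink="null"):
    """Return the three command lines used above as argument lists."""
    load = ["pacmd", "load-module", "module-null-sink", "rate=%d" % rate_to]
    record = ["parec", "-d", sink + ".monitor", "--fix-rate",
              "--file-format=wav", dst_wav]
    play = ["paplay", "-d", sink, src_wav]
    return load, record, play


def resample_via_null_sink(rate_to, src_wav, dst_wav):
    """Play src_wav into a null sink at rate_to and record the monitor to dst_wav."""
    load, record, play = build_commands(rate_to, src_wav, dst_wav)
    subprocess.check_call(load)
    recorder = subprocess.Popen(record)  # runs in the background, like "&"
    try:
        subprocess.check_call(play)      # returns when playback has finished
    finally:
        recorder.terminate()             # SIGTERM, the default signal of killall
        recorder.wait()


if __name__ == "__main__":
    resample_via_null_sink(44100, "44200.wav", "44200_to_44100.wav")
```

Note that running it more than once loads another null sink each time, just as repeating the pacmd command would.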

3. Analyze the result.

./resampler_plots.py --rate-from 44200 --skip 32768 --save plop --fftsize 1024 44200_to_44100.wav

There may be warnings about dropouts. If they are near the end, that's OK. There will also be warnings about division by zero; these are caused by the masked-out frequency components and can be ignored.

The meaning of the parameters: rate-from is the sample rate of the original wav file that contained the linear sweep, and skip means "skip this number of samples from the beginning" (because there is a click). After skipping, the analysis skips further through the silent portion of the resampled file and automatically adapts to any unknown slope of the frequency change.

The "fftsize" parameter, well, sets the FFT size. Useful values start from 1024. Below that, the resolution in the low-frequency part of the spectrum is not sufficient to determine audibility reliably, because the absolute threshold of hearing changes significantly within one frequency bin. The FFT size is specified as a number of frequency bins. The required number of input samples for each signal piece is twice that, i.e. 2048 in our case.
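To get a feeling for the numbers, here is a small sketch of the arithmetic. It is not part of resampler_plots.py; the function name is made up, and it assumes the bins evenly cover the range from 0 Hz to half the sample rate of the resampled file.

```python
def fft_resolution(rate, fftsize):
    """Return (input samples per signal piece, width of one frequency bin in Hz)."""
    window = 2 * fftsize       # twice the number of frequency bins
    bin_width = rate / window  # same as (rate / 2) / fftsize
    return window, bin_width


if __name__ == "__main__":
    for fftsize in (256, 1024, 4096):
        window, bin_width = fft_resolution(44100, fftsize)
        print("fftsize=%d: %d samples per piece, %.2f Hz per bin"
              % (fftsize, window, bin_width))
```

For a 44100 Hz file and fftsize 1024, one bin is about 21.5 Hz wide; with a smaller FFT size each bin covers proportionally more of the low-frequency range.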

The "save" parameter sets a base of all output filenames. So, you'll get:

plop_envelope.png: shows the amplitude of output signal vs frequency if the input signal contains only this frequency at the full scale.

plop_response.png: a spectrogram. To read it, select an input frequency. Then cut a column out of this spectrogram according to the X axis. The amplitude of each output frequency component is then described by the color of the column at the height corresponding to the output frequency. E.g., it can be seen that, when given a 5 kHz input signal, the src-sinc-fastest resampler also produces some very weak unwanted output at 18 kHz.

plop_distortion_eq.png: the same, but with the line representing the wanted same-frequency output suppressed. I.e. only distortions, with the assumption that wanted-signal attenuation does not count as a distortion.

plop_distortion.png: the same, but with the same line replaced by the difference of wanted vs actual same-frequency output. I.e. only distortions, and the attenuation of the signal now counts as a distortion.

plop_audibility_eq.png: audibility of distortions (i.e. how much should one reduce the distortion before it becomes inaudible) given the input signal containing only this frequency at the maximum amplitude, if attenuation of high signal frequencies does not count as a distortion.

plop_audibility.png: the same, but now such attenuation counts as a distortion.

The results are valid only for an absolutely quiet room, i.e. they represent the worst case.

One of the plots from src-sinc-fastest is attached for you to compare.

--
Alexander E. Patrakov
_______________________________________________
pulseaudio-discuss mailing list
pulseaudio-discuss@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pulseaudio-discuss