On 8/25/15 7:08 PM, Ethan Duni wrote:
> if you can, with optimal coefficients designed with the tool of your choice (so i am ignoring any images between B and Nyquist-B), upsample by 512x and then do linear interpolation between adjacent samples for continuous-time interpolation, you can show that it's something like 12 dB S/N per octave of oversampling plus another 12 dB. that's 120 dB. that's how i got to 512x.

Wait, where does the extra 12dB come from? Seems like it should just be 12dB per octave of oversampling. What am I missing?

okay, this is painful. in our 2-decade-old paper, Duane and i did this theoretical approximation analysis for drop-sample interpolation, and i did it myself for linear interpolation, but we did not put the math for linear interpolation in the paper.

so, to satisfy Nyquist (or Shannon or Whittaker or the Russian guy) the sample rate Fs must exceed 2B which is twice the bandwidth. the oversampling ratio is defined to be Fs/(2B). and in octaves it is log2(Fs/(2B)). all frequencies in your baseband satisfy |f|<B and if it's highly oversampled, 2B << Fs.

now, i'm gonna assume that Fs is so much (like 512x) greater than 2B that i will assume the attenuation due to the sinc^2 for |f|<B is negligible. i will assume that the spectrum between -B and +B is uniformly flat (that's not quite worst case, but it's worser case than what music, in the bottom 5 or 6 octaves, is). so given a unit height on that uniform power spectrum, the energy will be 2B.

so, the k-th image (where k is not 0) will have a zero of the sinc^2 function going right through the heart of it. that's what's gonna kill the son-of-a-bitch. the energy of that image is:


       k*Fs+B
     integral{ (sinc(f/Fs))^4 df }
       k*Fs-B


since it's a power spectrum, it's sinc^4 for linear interpolation and sinc^2 for drop-sample interpolation.

changing the variable of integration (substituting f -> k*Fs + f)


               +B
      =  integral{ (sinc((k*Fs+f)/Fs))^4 df }
               -B



               +B
      =  integral{ (sinc(k+f/Fs))^4 df }
               -B



     sinc(k+f/Fs) =  sin(pi*(k+f/Fs))/(pi*(k+f/Fs))

                  =  (-1)^k * sin(pi*f/Fs)/(pi*(k+f/Fs))

                  =approx  (-1)^k  *  (pi*f/Fs)/(pi*k)

                  since  |f| < B << Fs

raising to the 4th power gets rid of the toggling polarity.  so now it's

                        +B
     1/(k*Fs)^4 * integral{ f^4 df }  =  (2/5)/(k*Fs)^4 * B^5
                        -B
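as a sanity check on that approximation, here's a quick numerical sketch (plain Python, composite Simpson's rule; the values Fs = 1024 and B = 1, i.e. 512x oversampling, are just illustrative) comparing the exact sinc^4 image energy to the (2/5)*B^5/(k*Fs)^4 approximation:

```python
import math

def sinc(x):
    # normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def simpson(g, a, b, n=4096):
    # composite Simpson's rule on [a, b] with n (even) subintervals
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return s * h / 3.0

Fs = 512.0 * 2.0   # oversampled rate, with B = 1 (i.e. 512x oversampling)
B = 1.0

for k in (1, 2, 3):
    # exact energy of the k-th image vs the small-angle approximation
    exact = simpson(lambda f: sinc(f / Fs) ** 4, k * Fs - B, k * Fs + B)
    approx = (2.0 / 5.0) * B ** 5 / (k * Fs) ** 4
    print(k, exact, approx)
```

the two columns agree to several digits, since the relative error of the approximation is on the order of (B/Fs)^2.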


now you have to sum up the energies of all of the bad images (we are assuming that *all* of those images, *after* they are beaten down, will somehow fall into the baseband during resampling and their energies will team up). there are both negative and positive frequency images to add up. (but we don't add up the energy from the image at the baseband, that's the "good" image.)

        +inf                                               +inf
    2 * SUM{ (2/5)/(k*Fs)^4 * B^5 }  =  B*(4/5)*(B/Fs)^4 * SUM{1/k^4}
        k=1                                                k=1


the summation on the right is (pi^4)/90
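that value is zeta(4); a quick check with a partial sum (plain Python, the cutoff of 100000 terms is arbitrary but plenty):

```python
import math

# partial sum of 1/k^4 converges quickly to zeta(4) = pi^4/90
s = sum(1.0 / k ** 4 for k in range(1, 100001))
print(s, math.pi ** 4 / 90.0)
```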

so the energy of all of the nasty images (after being beaten down due to the application of the sinc^2 that comes from linear interpolation between the "subsamples") is

   B*(4/5)*(B/Fs)^4 * (pi^4)/90

and the S/N ratio is the baseband energy, 2B, divided by that. substituting B/Fs = (1/2)*(2B/Fs) and collecting constants:

   (  (2/450) * (2B/Fs)^4 * (pi/2)^4  )^-1

in dB we use 10*log10(), which is the same as 3.01*log2(), because this is an *energy* ratio, not a voltage ratio.

   -3.01*log2( (2/450) * (2B/Fs)^4 * (pi/2)^4 )

     =  3.01*log2(225) + 12.04*log2(2/pi)  +  12.04*log2( Fs/(2B) )

     =  15.7 dB  +  (12.04 dB) * log2( Fs/(2B) )


so, it seems to come out a little more than 12 dB. i think Duane did a better empirical analysis and he got it slightly less.

but, using linear interpolation between subsamples, you should get about 12 dB of S/N for every octave of oversampling plus about 15.7 dB more.
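plugging numbers into that S/N expression (a sketch; the oversampling ratios are just illustrative), 512x is 9 octaves, so the prediction lands near 124 dB:

```python
import math

def linear_interp_snr_db(oversampling_ratio):
    # S/N in dB from the derivation above:
    # S/N = 225 * (2/pi)^4 * (Fs/(2B))^4, expressed as 10*log10()
    return 10.0 * math.log10(225.0 * (2.0 / math.pi) ** 4 * oversampling_ratio ** 4)

for octaves in (5, 6, 9):   # 32x, 64x, 512x oversampling
    r = 2 ** octaves
    print(f"{r:4d}x : {linear_interp_snr_db(r):6.1f} dB")
```

each octave of oversampling adds 10*log10(2^4) = 12.04 dB, and the constant term is 10*log10(3600/pi^4), about 15.7 dB.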



> but the difference in price is in memory only, *not* in computational burden.

Well, you don't get the full cost in computational burden since you can skip computing most of the upsamples.

exactly, and it's the same whether you upsample by 32x or 512x. but upsampling by 512x will cost 16 times the memory (compared to 32x) to store coefficients, since there is one set of FIR coefficients per fractional delay.

But the complexity still goes up with increasing oversampling factor since the interpolation filter needs to get longer and longer, no?

no. that deals with a different issue, in my opinion. the oversampling ratio determines the number of discrete (and uniformly spaced) fractional delays. there is one FIR filter for each fractional delay. the number of coefs in the FIR filter is a performance issue regarding how well you're gonna beat down them images in between baseband and the next *oversampled* image. in the analysis above, i am assuming all of those in-between images are beaten down to zero. it's a crude analysis and i just wanted to see what the linear interpolation (on the upsampled signal) does for us.

So there is some balancing of computational burden involved. I can see how frequent coefficient calculations could swamp that for high order and/or exotic interpolators, along with the increased upsampler complexity since you need to compute more of the polyphase components to drive it. But it's not obvious to me on its face exactly where the minimum lies...

Of course that all goes out the window if you already need oversampling for other system concerns anyway, or are using some very cheap hardware resampler or whatever. Or if you're happy to throw an IIR upsampler at it. And in many cases you'll already have access to some nice optimized resampling software, whereas polynomial interpolators would need to be invented from scratch, so there's a practical man-hours concern as well. Likewise, it depends on how frequently the fractional delay is going to change. Obviously there are good reasons why analyses that include both the interpolator and the resampler are somewhat rare, there are a lot of moving parts and potential trade-offs.

>some apps where you might care less about inharmonic energy from images folding over (a.k.a. "aliasing"), you might not need to go that high of whatever-x.

I think this is the point where we need to fork into whether we are doing just a fractional delay, or if there is also difference between the output and input sampling rates.

those were the two classes of apps that i mentioned. in one class (the resampling class, of which SRC and pitch shifting are apps), if you're doing linear interpolation between the fractional delays, i think it is sufficient to multiply the input spectrum by (sinc(f/Fs))^2 (where |f| < B << Fs, with Fs being the oversampled rate, so the sinc^2 might not do much to your baseband image, but it will beat down the other images real good).
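to put numbers on "might not do much to your baseband image" (a sketch; Fs = 1024 and B = 1, i.e. 512x, are assumed for illustration), the sinc^2 droop at the edge of the baseband is microscopic, while the near edge of the first image sits almost on a sinc^2 null:

```python
import math

def sinc(x):
    # normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

Fs = 512.0 * 2.0          # oversampled rate, with B = 1 (512x oversampling)
B = 1.0

# sinc^2 is the amplitude response of linear interpolation, hence 20*log10()
droop_db = 20.0 * math.log10(sinc(B / Fs) ** 2)          # edge of baseband
image_db = 20.0 * math.log10(sinc((Fs - B) / Fs) ** 2)   # near edge of 1st image
print(droop_db, image_db)
```

the baseband droop comes out around -0.00003 dB while the image edge is beaten down by more than 120 dB.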

If there is a sampling rate change, then we are worried about alias suppression and need to squash the images as you describe. But if it's just fractional delay, where we end up at the same sampling rate as the input, then the images all land back where they started and there is no signal aliasing.

agreed! so then when you are delayed by 1/2 sample, the filter is H(z)=(1/2)(1 + z^-1). and that filter goes to -inf dB at Fs/2 (the oversampled Fs). when it's a slowly changing or unchanging fractional delay, there is no issue about energy from aliasing or the foldover of images. it's just an LTI system and the issue is what LTI filter is it?
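a quick check of that half-sample-delay filter's magnitude response (a sketch; frequencies given as radian frequency relative to the oversampled Fs):

```python
import cmath
import math

def H(omega):
    # H(z) = (1/2)(1 + z^-1) evaluated on the unit circle, z = e^{j*omega}
    return 0.5 * (1.0 + cmath.exp(-1j * omega))

print(abs(H(0.0)))           # DC: gain 1
print(abs(H(math.pi)))       # Fs/2: gain 0, i.e. -inf dB
print(abs(H(math.pi / 2)))   # Fs/4: gain 1/sqrt(2), about -3 dB
```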

Instead, we only get aliasing of the polynomial interpolator's spectrum. I.e., we just end up with a linear filter that has an imperfect fractional-delay response (with the imperfection depending on fractional delay - worst at 1/2 sample - and also on frequency - worst at Nyquist).

It's not obvious to me how to create a spec on the fractional delay filter response that is a fair comparison to the 120dB (or is it 108dB as I mentioned above???)

or maybe 123 dB.

spec on aliasing suppression for the rate-change case. It's kind of apples and oranges. The analysis of how much error you get in the final response as a function of oversampling and polynomial order requires more complicated math/numerics (which I'll try to do later if I get some spare time), but for reference I would note that a half-sample delay achieved with (perfect) 512x oversampling and linear interpolation ends up with a worst-case (in-band) ripple of around 0.00005dB. That's a pretty tight filter spec. But note that if we consider that difference to be an "error signal," it turns out to be around -106dB, and not -120dB (or -108dB if that is the correct number). This is because those signal images add up coherently, so suppressing them by XdB doesn't guarantee an XdB "noise floor" in the final result. On the other hand, the response at lower frequencies is much tighter and the "noise floor" is actually much *lower* than the margin that the worst images were suppressed by (since in that region the coherent addition is working in our favor).

Again, that's not really an apples to apples comparison, but the point is that the coherent imaging in the case of fractional delay violates the assumptions of the straightforward aliasing-suppression analysis.

yes.  i have been saying that (or something consistent with that) all along.

but, with the same fractional-delay interpolator, you can accomplish either task, but the performance of the interpolator is evaluated differently.

It ends up being a question of how much oversampling is required to operate in a region of the interpolator response that is sufficiently close to the ideal filter, rather than a question of alias suppression. But I'm not sure how to systematically compare the two cases, again because it's not clear how to compare signal-to-alias ratio against an alias-free signal with an imperfect fractional-delay response. All I would add is that the general rate-change case has to contend with both aliasing suppression and imperfect fractional delay response, so I would expect a fractional-delay-only system to have looser requirements since the signal aliasing issue has been removed.

i think we're on the same page.  ain't we?

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."



_______________________________________________
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
