Re: [music-dsp] Compensate for interpolation high frequency signal loss

robert bristow-johnson Wed, 26 Aug 2015 13:03:08 -0700

On 8/25/15 7:08 PM, Ethan Duni wrote:

>if you can, with optimal coefficients designed with the tool of your choice, so i am ignoring any images between B and Nyquist-B, upsample by 512x and then do linear interpolation between adjacent samples for continuous-time interpolation, you can show that it's something like 12 dB S/N per octave of oversampling plus another 12 dB. that's 120dB. that's how i got to 512x.
Wait, where does the extra 12dB come from? Seems like it should just be 12dB per octave of oversampling. What am I missing?

okay, this is painful. in our 2-decade old paper, Duane and i did this theoretical approximation analysis for drop-sample interpolation, and i did it myself for linear, but we did not put in the math for linear interpolation in the paper.

so, to satisfy Nyquist (or Shannon or Whittaker or the Russian guy) the sample rate Fs must exceed 2B which is twice the bandwidth. the oversampling ratio is defined to be Fs/(2B). and in octaves it is log2(Fs/(2B)). all frequencies in your baseband satisfy |f|<B and if it's highly oversampled, 2B << Fs.

now, i'm gonna assume that Fs is so much (like 512x) greater than 2B that i will assume the attenuation due to the sinc^2 for |f|<B is negligible. i will assume that the spectrum between -B and +B is uniformly flat (that's not quite worst case, but it's worser case than what music, in the bottom 5 or 6 octaves, is). so given a unit height on that uniform power spectrum, the energy will be 2B.

so, the k-th image (where k is not 0) will have a zero of the sinc^2 function going right through the heart of it. that's what's gonna kill the son-of-a-bitch. the energy of that image is:



       k*Fs+B
     integral{ (sinc(f/Fs))^4 df }
       k*Fs-B

since it's power spectrum it's sinc^4 for linear and sinc^2 for drop-sample interpolation.


changing the variable of integration


           +B
     integral{ (sinc((k*Fs+f)/Fs))^4 df }
           -B



           +B
     integral{ (sinc(k+f/Fs))^4 df }
           -B



     sinc(k+f/Fs) =  sin(pi*(k+f/Fs))/(pi*(k+f/Fs))

                  =  (-1)^k * sin(pi*f/Fs)/(pi*(k+f/Fs))

                  =approx  (-1)^k  *  (pi*f/Fs)/(pi*k)

                  since  |f| < B << Fs

raising to the 4th power gets rid of the toggling polarity.  so now it's

                        +B
     1/(k*Fs)^4 * integral{ f^4 df }  =  (2/5)/(k*Fs)^4 * B^5
                        -B
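As a sanity check on the small-angle approximation above, here is a minimal Python sketch (not from the original post) that integrates sinc^4 numerically over the k-th image and compares it with (2/5)*B^5/(k*Fs)^4. The function names, the midpoint rule, and the normalization Fs = 1 with 512x oversampling are assumptions chosen for illustration.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def image_energy_numeric(k, B, Fs, steps=2000):
    """Midpoint-rule integral of sinc^4(f/Fs) over [k*Fs - B, k*Fs + B]."""
    df = 2.0 * B / steps
    total = 0.0
    for i in range(steps):
        f = k * Fs - B + (i + 0.5) * df
        total += sinc(f / Fs) ** 4
    return total * df

def image_energy_approx(k, B, Fs):
    """Closed-form approximation (2/5) * B^5 / (k*Fs)^4 from the derivation."""
    return (2.0 / 5.0) * B ** 5 / (k * Fs) ** 4

Fs = 1.0                 # assumed normalization
B = Fs / (2 * 512)       # 512x oversampling, i.e. Fs/(2B) = 512
for k in (1, 2, 3):
    numeric = image_energy_numeric(k, B, Fs)
    approx = image_energy_approx(k, B, Fs)
    print(k, numeric, approx, numeric / approx)
```

With B << Fs the ratio in the last column should sit very close to 1, which is all the approximation needs.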

now you have to sum up the energies of all of the bad images (we are assuming that *all* of those images, *after* they are beaten down, will somehow fall into the baseband during resampling and their energies will team up). there are both negative and positive frequency images to add up. (but we don't add up the energy from the image at the baseband, that's the "good" image.)


        +inf                                               +inf
    2 * SUM{ (2/5)/(k*Fs)^4 * B^5 }  =  B*(4/5)*(B/Fs)^4 * SUM{1/k^4}
        k=1                                                k=1


the summation on the right is (pi^4)/90
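The value of that sum is easy to confirm numerically. The following short Python sketch (an illustration, not part of the original analysis) compares a truncated sum of 1/k^4 with pi^4/90 and also reports what fraction of the total comes from the k = 1 term; the variable names and the truncation point are arbitrary choices.

```python
import math

# Truncated sum of 1/k^4; the neglected tail is far below double precision.
partial_sum = sum(1.0 / k ** 4 for k in range(1, 100001))
closed_form = math.pi ** 4 / 90.0

print(partial_sum, closed_form)

# Share of the summed image energy contributed by the nearest images (k = 1).
nearest_image_share = 1.0 / closed_form
print(nearest_image_share)
```

The second number printed shows that the images nearest the baseband dominate the total, so the infinite sum only adds a small correction to the k = 1 term.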

so the energy of all of the nasty images (after being beaten down due to the application of the sinc^2 that comes from linear interpolation between the "subsamples") is


   B*(4/5)*(B/Fs)^4 * (pi^4)/90

and the  S/N ratio is 2B divided by that.

   (  (2/450) * (2B/Fs)^4 * (pi/2)^4  )^-1

in dB we use 3.01*log2() because this is an *energy* ratio, not a voltage ratio.


   -3.01*log2( (2/450) * (2B/Fs)^4 * (pi/2)^4 )

     =  3.01*log2(225) + 12.04*log2(2/pi)  +  12.04*log2( Fs/(2B) )

     =  15.7 dB  +  (12.04 dB) * log2( Fs/(2B) )
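To put numbers on that closed form, here is a small Python sketch that evaluates the S/N expression directly for a few oversampling ratios. The function name and the chosen ratios are illustrative assumptions; the formula itself is the one derived above, 225 / (pi*B/Fs)^4 with Fs/(2B) equal to the oversampling ratio.

```python
import math

def linear_interp_snr_db(oversampling_ratio):
    """S/N in dB of linear interpolation between subsamples.

    Uses the closed form 225 / (pi * B/Fs)^4, where
    oversampling_ratio = Fs/(2B). The derivation assumes 2B << Fs,
    so small ratios are outside its range of validity.
    """
    b_over_fs = 1.0 / (2.0 * oversampling_ratio)
    snr = 225.0 / (math.pi * b_over_fs) ** 4
    return 10.0 * math.log10(snr)   # energy ratio, hence 10*log10

for ratio in (32, 128, 512):
    octaves = math.log2(ratio)
    print(ratio, octaves, round(linear_interp_snr_db(ratio), 2))
```

Each doubling of the ratio adds the same number of dB, and the constant offset can be read off by extrapolating the formula to a ratio of 1.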

so, it seems to come out a little more than 12 dB. i think Duane did a better empirical analysis and he got it slightly less.

but, using linear interpolation between subsamples, you should get about 12 dB of S/N for every octave of oversampling plus 15 dB more.

>but the difference in price in memory only, *not* in computational burden.
Well, you don't get the full cost in computational burden since you can skip computing most of the upsamples.

exactly and it's the same whether you upsample by 32x or 512x. but upsampling by 512x will cost 16 times the memory to store coefficients.

But the complexity still goes up with increasing oversampling factor since the interpolation filter needs to get longer and longer, no?

no. that deals with a different issue, in my opinion. the oversampling ratio determines the number of discrete (and uniformly spaced) fractional delays. there is one FIR filter for each fractional delay. the number of coefs in the FIR filter is a performance issue regarding how well you're gonna beat down them images in between baseband and the next *oversampled* image. in the analysis above, i am assuming all of those in-between images are beaten down to zero. it's a crude analysis and i just wanted to see what the linear interpolation (on the upsampled signal) does for us.
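The structure described here (one FIR per fractional delay, then linear interpolation between two adjacent delays) can be sketched in Python as follows. This is only an illustration under stated assumptions: the Hann-windowed sinc stands in for "optimal coefficients", and `build_phase_table`, `interpolate`, the 8-tap length, and the test signal are all hypothetical choices, not anything from the paper or the thread.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def build_phase_table(num_phases, taps):
    """One FIR (row) per fractional delay p/num_phases, p = 0..num_phases.

    Coefficients are a Hann-windowed sinc (an assumed stand-in for an
    optimally designed filter). taps must be even. The extra last row
    lets the linear interpolation reach the end of the table.
    """
    half = taps // 2
    table = []
    for p in range(num_phases + 1):
        frac = p / num_phases
        row = []
        for j in range(taps):
            d = frac + half - 1 - j   # time from input sample j to the output instant
            window = 0.5 * (1.0 + math.cos(math.pi * d / half))
            row.append(sinc(d) * window)
        table.append(row)
    return table

def interpolate(x, m, frac, table):
    """Estimate x at time m + frac (0 <= frac < 1).

    Blending the two nearest coefficient rows is equivalent to linear
    interpolation between the two neighbouring upsampled values.
    """
    num_phases = len(table) - 1
    taps = len(table[0])
    half = taps // 2
    pos = frac * num_phases
    p = int(pos)
    a = pos - p                       # linear-interpolation weight
    y = 0.0
    for j in range(taps):
        coef = (1.0 - a) * table[p][j] + a * table[p + 1][j]
        y += coef * x[m - half + 1 + j]
    return y

signal = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(64)]
truth = math.sin(2.0 * math.pi * 0.01 * 30.37)
for num_phases in (32, 512):
    table = build_phase_table(num_phases, taps=8)
    stored = len(table) * len(table[0])           # memory grows with num_phases
    estimate = interpolate(signal, 30, 0.37, table)
    print(num_phases, stored, abs(estimate - truth))
```

The per-output work in `interpolate` depends only on the number of taps, while the stored table grows with the number of phases, which is the memory-versus-computation split being discussed. The short 8-tap filter in this sketch limits the accuracy, so the printed error says more about the assumed filter than about the oversampling ratio.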

So there is some balancing of computational burden involved. I can see how frequent coefficient calculations could swamp that for high order and/or exotic interpolators, along with the increased upsampler complexity since you need to compute more of the polyphase components to drive it. But it's not obvious to me on its face exactly where the minimum lies...
Of course that all goes out the window if you already need oversampling for other system concerns anyway, or are using some very cheap hardware resampler or whatever. Or if you're happy to throw an IIR upsampler at it. And in many cases you'll already have access to some nice optimized resampling software, whereas polynomial interpolators would need to be invented from scratch, so there's a practical man-hours concern as well. Likewise, it depends on how frequently the fractional delay is going to change. Obviously there are good reasons why analyses that include both the interpolator and the resampler are somewhat rare, there are a lot of moving parts and potential trade-offs.
>some apps where you might care less about inharmonic energy from images folding over (a.k.a. "aliasing"), you might not need to go that high of whatever-x.
I think this is the point where we need to fork into whether we are doing just a fractional delay, or if there is also a difference between the output and input sampling rates.

those were the two classes of apps that i mentioned. in one class (the resampling class, of which SRC and pitch shifting are apps), if you're doing linear interpolation between the fractional delays, i think it is sufficient to multiply the input spectrum by (sinc(f/Fs))^2 (where |f| < B, 2B << Fs, and Fs is the oversampled Fs, so the sinc^2 might not do much to your baseband image, but it will beat down the other images real good).
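A quick way to see both effects of the sinc^2 weighting is to evaluate it at the edge of the baseband and at the edge of the first image nearest the baseband. The Python sketch below does that for 512x oversampling; the function name and the normalization of frequency to the oversampled Fs are assumptions made for illustration.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def linear_interp_gain_db(f_over_fs):
    """Amplitude gain in dB of the sinc^2 response of linear interpolation.

    Not defined exactly at the nulls (nonzero integer f/Fs), where the
    gain is zero.
    """
    return 20.0 * math.log10(sinc(f_over_fs) ** 2)

ratio = 512                       # oversampling ratio Fs/(2B)
b = 1.0 / (2.0 * ratio)           # B as a fraction of the oversampled Fs

baseband_edge_db = linear_interp_gain_db(b)        # droop at f = B
image_edge_db = linear_interp_gain_db(1.0 - b)     # first image, edge nearest baseband
print(baseband_edge_db, image_edge_db)
```

The first value is a tiny droop at the top of the baseband; the second is the attenuation at the least-suppressed edge of the first image, which gets larger in magnitude toward the null at f = Fs.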

If there is a sampling rate change, then we are worried about alias suppression and need to squash the images as you describe. But if it's just fractional delay, where we end up at the same sampling rate as the input, then the images all land back where they started and there is no signal aliasing.

agreed! so then when you are delayed by 1/2 sample, the filter is H(z)=(1/2)(1 + z^-1). and that filter goes to -inf dB at Fs/2 (the oversampled Fs). when it's a slowly changing or unchanging fractional delay, there is no issue about energy from aliasing or the foldover of images. it's just an LTI system and the issue is what LTI filter is it?
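For reference, the response of that half-sample filter can be evaluated directly on the unit circle. This Python sketch (function names are illustrative) computes H(z) = (1/2)(1 + z^-1) at z = exp(j*2*pi*f/Fs), whose magnitude is cos(pi*f/Fs) and whose phase is that of a half-sample delay.

```python
import cmath
import math

def half_sample_response(f_over_fs):
    """H(z) = (1/2)(1 + z^-1) evaluated at z = exp(j*2*pi*f/Fs)."""
    z_inv = cmath.exp(-2j * math.pi * f_over_fs)
    return 0.5 * (1.0 + z_inv)

def gain_db(f_over_fs):
    mag = abs(half_sample_response(f_over_fs))
    return 20.0 * math.log10(mag) if mag > 0.0 else float("-inf")

for f in (0.0, 0.1, 0.25, 0.4, 0.5):
    h = half_sample_response(f)
    # In floating point the null at Fs/2 shows up as a very large negative
    # number of dB rather than an exact -inf.
    print(f, abs(h), math.cos(math.pi * f), cmath.phase(h), gain_db(f))
```

The magnitude column tracks cos(pi*f/Fs), falling monotonically from unity at DC to the null at Fs/2.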

Instead, we only get aliasing of the polynomial interpolator's spectrum. I.e., we just end up with a linear filter that has an imperfect fractional-delay response (with the imperfection depending on fractional delay - worst at 1/2 sample - and also on frequency - worst at Nyquist).
It's not obvious to me how to create a spec on the fractional delay filter response that is a fair comparison to the 120dB (or is it 108dB as I mentioned above???)


or maybe 123 dB.

spec on aliasing suppression for the rate-change case. It's kind of apples and oranges. The analysis of how much error you get in the final response as a function of oversampling and polynomial order requires more complicated math/numerics (which I'll try to later do if I get some spare time), but for reference I would note that a half-sample delay achieved with (perfect) 512x oversampling and linear interpolation ends up with a worst-case (in-band) ripple of around 0.00005dB. That's a pretty tight filter spec. But note that if we consider that difference to be an "error signal," it turns out to be at around -106dB, and not -120dB (or -108dB if that is the correct number). This is because those signal images added up coherently, so suppressing them by XdB doesn't guarantee an XdB "noise floor" in the final result. On the other hand, the response at lower frequencies is much tighter and the "noise floor" is actually much *lower* than the margin that the worst images were suppressed by (since in that region the coherent addition is working in our favor).
Again, that's not really an apples to apples comparison, but the point is that the coherent imaging in the case of fractional delay violates the assumptions of the straightforward aliasing-suppression analysis.
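The ripple and error-signal figures quoted above can be reproduced, at least approximately, from the half-sample filter's magnitude cos(pi*f/Fs) evaluated at the band edge. The Python sketch below assumes ideal 512x oversampling and looks only at the magnitude deviation from unity (the half-sample delay itself is the intended response); the variable names are illustrative.

```python
import math

oversampling = 512
f_edge = 1.0 / (2.0 * oversampling)      # band edge B relative to the oversampled Fs

gain = math.cos(math.pi * f_edge)        # |H| of (1/2)(1 + z^-1) at the band edge
ripple_db = -20.0 * math.log10(gain)     # worst-case in-band droop, in dB
error_db = 20.0 * math.log10(1.0 - gain) # deviation from unity as an "error signal"

print(ripple_db, error_db)
```

Both numbers are worst case at the band edge; at lower frequencies the deviation shrinks roughly with the square of frequency, which matches the remark that the response is much tighter lower in the band.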


yes.  i have been saying that (or something consistent with that) all along.

but, with the same fractional-delay interpolator, you can accomplish either task, but the performance of the interpolator is evaluated differently.

It ends up being a question of how much oversampling is required to operate in a region of the interpolator response that is sufficiently close to the ideal filter, rather than a question of alias suppression. But I'm not sure how to systematically compare the two cases, again because it's not clear how to compare signal-to-alias ratio against an alias-free signal with an imperfect fractional-delay response. All I would add is that the general rate-change case has to contend with both aliasing suppression and imperfect fractional delay response, so I would expect a fractional-delay-only system to have looser requirements since the signal aliasing issue has been removed.


i think we're on the same page.  ain't we?

--

r b-j                  r...@audioimagination.com

"Imagination is more important than knowledge."



_______________________________________________
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
