It's been a while since I've thought about phase vocoders, and I've never
gotten really hands-on building a time stretch algorithm, so let me make
sure I understand what you're doing:

On the analysis side you've got a fixed frame size and a fixed hop size. On
the synthesizer side, you have the same frame size but a variable hop size
(depending on stretch factor). Right? Is there any processing of the FFT
coefficients, or are they simply fed directly into the IFFT on the
synthesizer side? If you are not doing any frequency-domain processing,
then what you have is really a granular synthesizer. You can simply
eliminate the FFT and IFFT, since they do nothing except introduce
numerical error. This is simply a time-domain system that slices the input
into (overlapping, windowed) chunks and then re-spaces said chunks to
achieve the effect.

Unless there is also frequency domain processing going on? Then that's a
different story.

Granular synthesis based time modification is straightforward to implement,
low-complexity, and works okay for small modifications of the time scale.
But at some point its granularity becomes pronounced - for a large enough
stretch factor, your synthesis frames will no longer overlap, and so no
amount of monkeying with the shapes of the windows is ever going to get you
to anything like a perfect reconstruction state. Which can be a cool effect
in its own right, but it *is* an effect.
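To put a number on "large enough" (the figures here are just an illustrative setup, not from your mail): the synthesis hop is stretch times the analysis hop, so adjacent synthesis frames stop overlapping once that product reaches the frame size.

```python
frame, analysis_hop = 1024, 256   # illustrative 4x-overlap configuration
# synthesis hop = stretch * analysis_hop; overlap vanishes once it
# reaches the frame size, i.e. at stretch = frame / analysis_hop
max_stretch = frame / analysis_hop
print(max_stretch)                # -> 4.0
```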

And that's okay, if what you want to achieve is a decent, cheap time
stretch algorithm for small time-scale modifications. A bit of window
manipulation or dynamic scaling will mostly address the modulation effects
you get in that range. Also the algorithm will be pretty cheap, since you
can eliminate the FFT and IFFT.

The way to do a serious phase vocoder based time stretch is to leave the
synthesis hop size fixed (and equal to the analysis hop size). That way the
whole chain is known to satisfy the perfect reconstruction condition. I.e.,
as long as you don't monkey with the FFT coefficients, you'll get out the
same signal as you put in (modulo some numerical error). To do time
stretching, then, you interpolate the frequency domain data. Specifically,
get the FFT coefficients in polar form, and then separately interpolate
their phases and magnitude (probably in the dB domain for the latter). This
will work pretty well for tonal signals, but will fail for noise-like
content. This is because the phase of noise-like content doesn't evolve
smoothly, but rather jumps around randomly. So interpolating it will impose
an artificial coherence on the noise. A high-performance phase vocoder time
stretch algorithm will employ some kind of classification stage, to work
out which FFT bins are coherent signals and which are noise-like, and then
apply appropriate interpolation to each of them separately.
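Concretely, the interpolation stage might look like this (numpy; my own sketch of the idea, not a reference implementation — a sequence of STFT frames in, a longer or shorter sequence out, with resynthesis left to the usual fixed-hop overlap-add):

```python
import numpy as np

def stretch_stft(frames, stretch):
    """Time-stretch a sequence of STFT frames, hop fixed on both sides.

    frames : complex array (n_in, n_bins); stretch > 1 lengthens.
    Magnitudes are interpolated in dB; phases advance by the locally
    measured per-frame increment, so frequency content is preserved.
    """
    n_in, n_bins = frames.shape
    mag_db = 20.0 * np.log10(np.maximum(np.abs(frames), 1e-12))
    phase = np.unwrap(np.angle(frames), axis=0)  # unwrap each bin in time
    dphi = np.diff(phase, axis=0)                # per-frame phase advance

    n_out = int(round((n_in - 1) * stretch)) + 1
    pos = np.linspace(0.0, n_in - 1, n_out)      # fractional input positions
    idx = np.minimum(pos.astype(int), n_in - 2)
    frac = (pos - idx)[:, None]

    mag = mag_db[idx] * (1 - frac) + mag_db[idx + 1] * frac
    adv = dphi[idx]                              # local advance per position
    ph = np.vstack([phase[:1],
                    phase[:1] + np.cumsum(adv[:-1], axis=0)])
    return 10.0 ** (mag / 20.0) * np.exp(1j * ph)
```

Note that the phase track is advanced by the locally measured per-frame increment rather than linearly interpolated point-for-point: straight linear interpolation of the unwrapped phase would divide the phase slope by the stretch factor and detune the output. And as above, this is the tonal-signal path only; the noise-like bins need different treatment.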

E


On Tue, Aug 19, 2014 at 1:49 PM, Daniel Varela <danielvarela...@gmail.com>
wrote:

> Thanks for your quick reply Ethan.
>
> Indeed I was missing some relevant information on my first email. I am
> using the FFT / IFFT approach instead of the filter bank one. So I made my
> FFT frame size and analysis hop size fixed, achieving time stretching by
> changing the synthesis hop size by the time stretch ratio.
>
> My implementation is very similar to the one described here
> <http://www.ece.uvic.ca/~peterd/48409/Bernardini.pdf>, but modified to
> work in real time, processing frames of fixed size.
>
>
> 2014-08-19 21:30 GMT+01:00 Ethan Duni <ethan.d...@gmail.com>:
>
> > Maybe I'm missing something obvious, but shouldn't the filter bank itself
> > be constant? I.e., no change in the overlap or windowing. The time
> > stretch/compression is obtained by extrapolating/interpolating the
> analysis
> > parameters, not by shifting around the synthesis filter bank relative to
> > the analysis filter bank.
> >
> > No?
> >
> > E
> >
> >
> > > On Tue, Aug 19, 2014 at 12:42 PM, Daniel Varela <danielvarela...@gmail.com>
> > > wrote:
> >
> > > Hello to the list!
> > >
> > > I am working on a time stretching effect using a standard phase vocoder
> > > approach.
> > >
> > > Basically I have an input parameter to control the time stretch ratio,
> > > which can change in real time from 0.5 (synthesis hop size = 1/2
> > > analysis hop size, x200% original time stretch) to 2.0
> > > (synthesis hop size = 2 analysis hop size, x50% original time), going
> > > through all intermediate values.
> > >
> > > The problem I am facing is that when the sum of the products of
> analysis
> > > and synthesis windows is not equal to unity, I get, as expected,
> > > modulation on the final reconstructed signal.
> > >
> > > I have tried with the usual windows (Hanning, Hamming, Truncated
> > Gaussian,
> > > etc) and each of them achieves perfect reconstruction for different
> > > overlapping factors, but as soon as the time stretch ratio changes the
> > > modulation appears.
> > >
> > > The paper FAST IMPLEMENTATION FOR NON-LINEAR TIME-SCALING OF STEREO
> > SIGNALS
> > > <https://files.nyu.edu/jb2843/public/Publications_files/Ravelli-DAFx-2005.pdf>
> > > discusses this problem (section 4.2), suggesting the use of asymmetric
> > > triangular windows for the synthesis.
> > >
> > > Has anyone else faced this problem?
> > >
> > > Thanks in advance.
> > >
> > > Daniel
> > > --
> > > dupswapdrop -- the music-dsp mailing list and website:
> > > subscription info, FAQ, source code archive, list archive, book
> reviews,
> > > dsp links
> > > http://music.columbia.edu/cmc/music-dsp
> > > http://music.columbia.edu/mailman/listinfo/music-dsp
> > >
> >
>
