> The classic Phase Vocoder algorithm integrates the phase difference
> between successive analysis frames to produce the phase component in
> the synthesis frames. No explicit interpolation is needed.
To do time-stretching, you then have to multiply that unwrapped phase by the
ratio of the hop sizes. Which is an explicit interpolation.

E

On Wed, Aug 20, 2014 at 2:13 AM, Joel Ross <joel.binarybr...@gmail.com> wrote:
> The classic Phase Vocoder algorithm integrates the phase difference
> between successive analysis frames to produce the phase component in
> the synthesis frames. No explicit interpolation is needed. This is not
> the same as the results achieved by time-domain methods. It seems that
> reconstruction is never perfect in this case, even for a one-to-one
> mapping of input frames to output frames, though this could be an
> error on my part - but I seem to remember that this is the case. I'd
> be interested in an explanation as to why...
>
> Regards,
> Joel
>
> On 20 August 2014 02:26, Ethan Duni <ethan.d...@gmail.com> wrote:
> > It's been a while since I've thought about phase vocoders, and I've never
> > gotten really hands-on in building a time-stretch algorithm, so let me
> > make sure I understand what you're doing:
> >
> > On the analysis side you've got a fixed frame size and a fixed hop size.
> > On the synthesizer side, you have the same frame size but a variable hop
> > size (depending on stretch factor). Right? Is there any processing of the
> > FFT coefficients, or are they simply fed directly into the IFFT on the
> > synthesizer side? If you are not doing any frequency-domain processing,
> > then what you have is really a granular synthesizer. You can simply
> > eliminate the FFT and IFFT, since they do nothing except introduce
> > numerical error. This is simply a time-domain system that slices the
> > input into (overlapping, windowed) chunks and then re-spaces said chunks
> > to achieve the effect.
> >
> > Unless there is also frequency-domain processing going on? Then that's a
> > different story.
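The hop-ratio scaling of the unwrapped phase described at the top of the thread can be sketched in a few lines of numpy. This is an illustrative sketch only; the function name and argument layout are assumptions, not code from the thread:

```python
import numpy as np

def propagate_phase(prev_analysis_phase, analysis_phase,
                    prev_synthesis_phase, n_fft, hop_a, hop_s):
    """Standard phase-vocoder phase propagation for one frame.

    The measured phase advance between successive analysis frames is
    unwrapped against the expected advance of each bin's center
    frequency, then rescaled by hop_s / hop_a - the "explicit
    interpolation" referred to above.
    """
    bin_freqs = 2.0 * np.pi * np.arange(len(analysis_phase)) / n_fft
    expected = bin_freqs * hop_a                    # expected phase advance
    deviation = analysis_phase - prev_analysis_phase - expected
    deviation = np.mod(deviation + np.pi, 2.0 * np.pi) - np.pi  # wrap to (-pi, pi]
    true_advance = expected + deviation             # unwrapped phase increment
    return prev_synthesis_phase + true_advance * (hop_s / hop_a)
```

With hop_s equal to hop_a the scale factor is 1 and the synthesis phase simply integrates the unwrapped analysis phase, which is the no-stretch case Joel describes.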
> > Granular synthesis based time modification is straightforward to
> > implement, low-complexity, and works okay for small modifications of the
> > time scale. But at some point its granularity becomes pronounced - for a
> > large enough stretch factor, your synthesis frames will no longer
> > overlap, and so no amount of monkeying with the shapes of the windows is
> > ever going to get you to anything like a perfect reconstruction state.
> > Which can be a cool effect in its own right, but it *is* an effect.
> >
> > And that's okay, if what you want to achieve is a decent, cheap time
> > stretch algorithm for small time-scale modifications. A bit of window
> > manipulation or dynamic scaling will mostly address the modulation
> > effects you get in that range. Also the algorithm will be pretty cheap,
> > since you can eliminate the FFT and IFFT.
> >
> > The way to do a serious phase vocoder based time stretch is to leave the
> > synthesis hop size fixed (and equal to the analysis hop size). That way
> > the whole chain is known to satisfy the perfect reconstruction condition.
> > I.e., as long as you don't monkey with the FFT coefficients, you'll get
> > out the same signal as you put in (modulo some numerical error). To do
> > time stretching, then, you interpolate the frequency-domain data.
> > Specifically, get the FFT coefficients in polar form, and then separately
> > interpolate their phases and magnitudes (probably in the dB domain for
> > the latter). This will work pretty well for tonal signals, but will fail
> > for noise-like content. This is because the phase of noise-like content
> > doesn't evolve smoothly, but rather jumps around randomly. So
> > interpolating it will impose an artificial coherence on the noise.
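The polar-form interpolation outlined above could look roughly like the following sketch: magnitudes blended in the dB domain, phases blended along the shorter arc. The function name and the eps guard are assumptions for illustration:

```python
import numpy as np

def interp_polar_frames(X0, X1, t, eps=1e-12):
    """Interpolate between two complex STFT frames in polar form.

    t is the fractional position in [0, 1] between frame X0 and X1.
    Magnitudes are blended in dB (as suggested above); phases are
    blended along the shortest arc between the two frame phases.
    """
    db0 = 20.0 * np.log10(np.abs(X0) + eps)
    db1 = 20.0 * np.log10(np.abs(X1) + eps)
    mag = 10.0 ** (((1.0 - t) * db0 + t * db1) / 20.0)
    dphi = np.angle(X1) - np.angle(X0)
    dphi = np.mod(dphi + np.pi, 2.0 * np.pi) - np.pi   # shortest arc
    phase = np.angle(X0) + t * dphi
    return mag * np.exp(1j * phase)
```

As the thread notes, this works for tonal bins but imposes artificial coherence on noise-like bins, whose phases do not evolve smoothly between frames.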
> > A high-performance phase vocoder time stretch algorithm will employ some
> > kind of classification stage, to work out which FFT bins are coherent
> > signals and which are noise-like, and then apply appropriate
> > interpolation to each of them separately.
> >
> > E
> >
> > On Tue, Aug 19, 2014 at 1:49 PM, Daniel Varela <danielvarela...@gmail.com>
> > wrote:
> >
> >> Thanks for your quick reply Ethan.
> >>
> >> Indeed I was missing some relevant information in my first email. I am
> >> using the FFT / IFFT approach instead of the filter bank one. So I made
> >> my FFT frame size and analysis hop size fixed, achieving time stretching
> >> by changing the synthesis hop size by the time stretch ratio.
> >>
> >> My implementation is something very similar to the one exposed here
> >> <http://www.ece.uvic.ca/~peterd/48409/Bernardini.pdf>, but modified to
> >> work in real time, processing frames of fixed size.
> >>
> >> 2014-08-19 21:30 GMT+01:00 Ethan Duni <ethan.d...@gmail.com>:
> >>
> >> > Maybe I'm missing something obvious, but shouldn't the filter bank
> >> > itself be constant? I.e., no change in the overlap or windowing. The
> >> > time stretch/compression is obtained by extrapolating/interpolating
> >> > the analysis parameters, not by shifting around the synthesis filter
> >> > bank relative to the analysis filter bank.
> >> >
> >> > No?
> >> >
> >> > E
> >> >
> >> > On Tue, Aug 19, 2014 at 12:42 PM, Daniel Varela
> >> > <danielvarela...@gmail.com> wrote:
> >> >
> >> > > Hello to the list!
> >> > >
> >> > > I am working on a time stretching effect using a standard phase
> >> > > vocoder approach.
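The tonal/noise classification stage mentioned a few paragraphs up could be approximated with a crude phase-deviation heuristic like the following. This is purely illustrative; real implementations use more robust detectors, and the threshold value here is an arbitrary assumption:

```python
import numpy as np

def classify_bins(prev_phase, phase, n_fft, hop, threshold=0.4):
    """Crude tonal/noise bin classifier (illustrative heuristic only).

    Bins whose measured phase advance stays close to the advance
    predicted from the bin center frequency are flagged tonal; bins
    with large deviation are treated as noise-like. The threshold is
    in radians and is an assumed tuning value, not from the thread.
    """
    bin_freqs = 2.0 * np.pi * np.arange(len(phase)) / n_fft
    expected = bin_freqs * hop
    dev = np.mod(phase - prev_phase - expected + np.pi, 2.0 * np.pi) - np.pi
    return np.abs(dev) < threshold   # True = tonal, False = noise-like
```

Tonal bins would then get smooth phase interpolation, while noise-like bins would keep (or be assigned) random phase to avoid the artificial coherence described above.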
> >> > > Basically I have an input parameter to control the time stretch
> >> > > ratio, which can change in real time from 0.5 (synthesis hop size =
> >> > > 1/2 analysis hop size, 200% of the original duration) to 2.0
> >> > > (synthesis hop size = 2 x analysis hop size, 50% of the original
> >> > > duration), going through all intermediate values.
> >> > >
> >> > > The problem I am facing is that when the sum of the products of
> >> > > analysis and synthesis windows is not equal to unity, I get, as
> >> > > expected, modulation on the final reconstructed signal.
> >> > >
> >> > > I have tried the usual windows (Hanning, Hamming, truncated
> >> > > Gaussian, etc.) and each of them achieves perfect reconstruction for
> >> > > different overlap factors, but as soon as the time stretch ratio
> >> > > changes, the modulation appears.
> >> > >
> >> > > The paper FAST IMPLEMENTATION FOR NON-LINEAR TIME-SCALING OF STEREO
> >> > > SIGNALS
> >> > > <https://files.nyu.edu/jb2843/public/Publications_files/Ravelli-DAFx-2005.pdf>
> >> > > discusses this problem (section 4.2), suggesting the use of
> >> > > asymmetric triangular windows for the synthesis.
> >> > >
> >> > > Has anyone else faced this problem?
> >> > >
> >> > > Thanks in advance.
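The modulation condition described in the message above, where the sum of products of overlapped analysis and synthesis windows deviates from a constant, can be checked numerically with a short sketch like this (illustrative only; the function name is made up):

```python
import numpy as np

def overlap_add_gain(window_a, window_s, hop, n_frames=64):
    """Overlap-added sum of products of analysis and synthesis windows.

    If the returned steady-state values are constant, the chain is
    modulation-free at this hop size; otherwise amplitude modulation
    at the hop rate appears, as described in the message above.
    """
    n = len(window_a)
    prod = window_a * window_s
    total = np.zeros(hop * n_frames + n)
    for m in range(n_frames):
        total[m * hop:m * hop + n] += prod
    # Return only the steady-state region, away from the ramp-up/down edges.
    return total[n:hop * n_frames]
```

A related analysis-side test exists as scipy.signal.check_COLA; the sketch above additionally folds in a separate synthesis window, which is where the hop-dependent modulation comes from.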
> >> > >
> >> > > Daniel
> >> > > --
> >> > > dupswapdrop -- the music-dsp mailing list and website:
> >> > > subscription info, FAQ, source code archive, list archive, book
> >> > > reviews, dsp links
> >> > > http://music.columbia.edu/cmc/music-dsp
> >> > > http://music.columbia.edu/mailman/listinfo/music-dsp