Do you mean as a time-scaler or as a pitch-shifter? WSOLA can and does work in real time as a pitch-shifter. But a time-scaler can't be real-time, whether it's WSOLA or a phase-vocoder, because a real-time process has to keep turning input into output indefinitely without the input and output pointers either colliding or diverging from each other without bound.
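
To put rough numbers on the pointer argument, here is a minimal Python sketch (not part of the original exchange). It assumes a 48 kHz stream handled in 240-sample I/O blocks, the figures mentioned elsewhere in this thread; the function name `input_backlog`, its parameters, and the 100 ms prefill are made up for illustration. It only counts samples, it does not process audio.

```python
def input_backlog(stretch, blocks, block_size=240, prefill=0, resample_step=1.0):
    """Input samples received but not yet consumed after `blocks` I/O blocks.

    stretch       -- time-scale factor (output length / input length)
    resample_step -- read step of a resampler placed after the time-scaler
                     (1.0 means no resampling, i.e. a plain time-scaler)
    A negative result means the algorithm needs input that has not arrived yet.
    """
    received = prefill + blocks * block_size
    # Each played block needs block_size * resample_step time-scaled samples,
    # and each time-scaled sample costs 1 / stretch input samples.
    consumed = blocks * block_size * resample_step / stretch
    return received - consumed


if __name__ == "__main__":
    blocks_per_second = 48000 // 240  # 200 blocks of 240 samples at 48 kHz
    for label, stretch, step in [
        ("time-stretch only", 1.25, 1.0),
        ("time-compress only", 0.5, 1.0),
        ("stretch + resample", 1.25, 1.25),
    ]:
        gap = input_backlog(stretch, 60 * blocks_per_second,
                            prefill=4800, resample_step=step)
        print(f"{label:20s} backlog after 60 s: {gap:.0f} samples")
```

With time-stretching alone the backlog grows without bound (the pointers diverge), with time-compression alone it goes negative once the prefill is used up (the pointers collide), and only when the resampling step equals the stretch factor does it stay constant.
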

--
r b-j                     r...@audioimagination.com
"Imagination is more important than knowledge."

-------- Original message --------
From: Alex Dashevski <alexd...@gmail.com>
Date: 5/28/2018 10:22 PM (GMT-08:00)
To: robert bristow-johnson <r...@audioimagination.com>, music-dsp@music.columbia.edu
Subject: Re: [music-dsp] WSOLA

Hi,

I mean WSOLA in real time. How can I prove to my instructor that it's not possible?
Why do I need to do resampling? Android samples and resamples at the same frequency (in my case, 48 kHz). Or do you mean doing the processing at 8 kHz (subsampled)?
I also want to achieve high performance and minimum latency.
How can I prove to my instructor that the correct way to implement this is pitch shifting and not real-time WSOLA?

Thanks,
Alex

2018-05-29 4:19 GMT+03:00 robert bristow-johnson <r...@audioimagination.com>:

---------------------------- Original Message ----------------------------
Subject: Re: [music-dsp] WSOLA
From: "Alex Dashevski" <alexd...@gmail.com>
Date: Sun, May 27, 2018 2:56 pm
To: philb...@mobileer.com, music-dsp@music.columbia.edu
--------------------------------------------------------------------------

> Hi,
>
> I don't understand your answer.
> I already have an audio echo application on Android. Buffer size and sample
> rate influence the latency.
> Could you explain to me how to implement WSOLA in real time? It is a bit more
> difficult.

yes, WSOLA is a little difficult, but less difficult than a phase-vocoder. now, when you say "WSOLA" and "Real-time" in the same breath, do you mean a pitch shifter? not a time-scaler, right? because pitch shifting can be done real-time, but time-scaling has to be done with an input buffer (with some number of samples) getting made into a longer (more samples) or shorter (fewer samples) buffer with the same sample rate. that can't be done in an operation that runs on indefinitely, even with a long throughput delay.
eventually the input and output pointers will collide.

but you can combine time-scaling and resampling (the latter is mathematically well defined) to get pitch shifting that can run on forever. one operation increases the number of samples and the other reduces the number of samples in exactly reciprocal proportion, so the number of samples coming out in every buffer of time is the same as the number going in.

now the "S" in the acronym stands for "Similarity", so you have to position the windows in the input waveform so that they are similar to the waveform in the output. the waveform in the first half of the input window should be similar to the waveform in the last half of the output window of the previous frame. normally the frame hop is exactly half of the window width, and the window shape should be complementary, like a Hann window.

i believe that the 240-sample buffer in Android is an input/output sample buffer for the media I/O. you can't really do anything with that buffer except pull in input samples and push out output samples. you will have to (using whatever programming environment one uses to make Android apps) allocate memory and create your own buffers to hold about 100 ms of sound. in that buffer, you will use a technique such as AMDF (average magnitude difference function), ASDF (average squared difference function), or autocorrelation to measure waveform similarity.

your input frame hop distance (which has both integer and fractional parts) is the output frame hop size times the reciprocal of the time-stretch factor. so, if you're time-stretching (instead of time-compressing), your input frame will advance more slowly than your output frame, and that increases the number of samples. but in that output buffer, you will resample (interpolate) with a step size equal to the time-stretch factor (do this only for output samples that have already been overlapped and added), thus reducing the final number of output samples back to the original number.
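
The hop arithmetic, the complementary window, and the similarity search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not r b-j's implementation: the helper names (`input_hop`, `hann`, `amdf`, `best_offset`), the 512-sample output hop, the 1.25 stretch factor, and the toy sine input are all made up here, and AMDF is just one of the three similarity measures mentioned.

```python
import math


def input_hop(output_hop, stretch):
    """Input hop = output hop times the reciprocal of the time-stretch factor."""
    return output_hop / stretch


def hann(length):
    """Periodic Hann window; copies offset by length // 2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def amdf(a, b):
    """Average magnitude difference between two equal-length segments."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_offset(signal, nominal_start, target, max_jitter):
    """Search +/- max_jitter samples around nominal_start for the segment of
    `signal` that is most similar (lowest AMDF) to `target`."""
    n = len(target)
    best_delta, best_score = 0, float("inf")
    for delta in range(-max_jitter, max_jitter + 1):
        start = nominal_start + delta
        if start < 0 or start + n > len(signal):
            continue  # candidate would run off the buffer
        score = amdf(signal[start:start + n], target)
        if score < best_score:
            best_delta, best_score = delta, score
    return best_delta, best_score


if __name__ == "__main__":
    # Toy input: a sine with a 100-sample period.
    signal = [math.sin(2.0 * math.pi * n / 100) for n in range(1000)]
    target = signal[300:364]  # stand-in for the tail of the previous output frame
    delta, score = best_offset(signal, 520, target, max_jitter=40)
    print("hop:", input_hop(512, 1.25), "jitter:", delta, "amdf:", score)
```

In this toy case the nominal window start sits 20 samples past a point that is in phase with the target, so the search pulls the window back by 20 samples. A real implementation would also have to handle the fractional part of the input hop and the overlap-add and resampling stages, which this sketch leaves out.
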
you will allow some jitter on the input window that is informed by the result of the waveform similarity analysis.

that's how you do WSOLA, as best as i understand it.

--
r b-j                     r...@audioimagination.com
"Imagination is more important than knowledge."

_______________________________________________
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp