Thanks for the tip, I had a brief look at this paper before.
I think the issue it addresses is not the problem I am encountering now.
But it might become interesting again at a later stage, or if I return to the time domain pitch shift.

This is how I do it now. It seems simple & correct, but I am not 100% sure:
it still sounds bad except for zero pitch shifts, though pure timestretches without pitch shift sound ok-ish. Compared to the previous version with time domain pitch shifting, it sounds much worse.

1.loop through bins -------------

    calculate input phase difference

    subtract phase step of hop size of the original bin //= phase-frequency-deviation

    multiply by frequency factor

    add phase step of hop size of the target bin

    wrap into -pi...pi range

    accumulate into the bin's phase accumulator

-------- end loop

2.loop through bins ------------
    calculate spectral envelopes/formant correction
-------- end loop

3.loop through bins ------------
    shift bins
-------- end loop

4.iFFT
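
A rough sketch of steps 1 and 3 in Python, in case it helps to compare against a concrete version. The names (`wrap_phase`, `shift_frame`, `accum`, `hop`) are made up for illustration and are not from any library. It also makes two assumptions that differ from, or go beyond, the list above: the phase deviation is wrapped into -pi...pi before it is multiplied by the frequency factor (a commonly used ordering, not necessarily the one used here), and when several source bins land on the same target bin the magnitudes are summed and the last deviation written wins.

```python
import math

def wrap_phase(x):
    """Wrap an angle in radians into the range [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi

def shift_frame(mags, phases, prev_phases, accum, fft_size, hop, factor):
    """Pitch-shift one frame by `factor`; returns (out_mags, out_phases).

    mags, phases : magnitude / phase of bins 0..fft_size/2 of this frame
    prev_phases  : input phases of the previous frame (updated in place)
    accum        : output phase accumulator per target bin (updated in place)
    """
    n_bins = len(mags)
    step = 2.0 * math.pi * hop / fft_size   # phase step per hop, per bin index
    out_mags = [0.0] * n_bins
    out_dev = [0.0] * n_bins
    for k in range(n_bins):
        # input phase difference minus the phase step of the original bin
        # = phase-frequency deviation, wrapped BEFORE scaling (assumption)
        dev = wrap_phase(phases[k] - prev_phases[k] - k * step)
        prev_phases[k] = phases[k]
        target = int(k * factor + 0.5)       # shift bins (nearest target bin)
        if target < n_bins:
            out_mags[target] += mags[k]      # magnitudes are summed
            out_dev[target] = dev * factor   # last writer wins (assumption)
    for t in range(n_bins):
        # add the phase step of the target bin and accumulate
        accum[t] = wrap_phase(accum[t] + t * step + out_dev[t])
    return out_mags, list(accum)
```

With `factor = 1.0` and zeroed accumulators the output phases reproduce the input phases (modulo 2*pi), which is a quick sanity check for the zero-shift case. The sketch uses the same hop for analysis and synthesis, so it covers the pitch shift only, not the time stretch.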


For 2., the formant correction, you have to do this before the bin shift:

loop through bins -------------
    smooth amplitude spectrum according to ERB scale or similar
----- end loop

loop through bins -------------
    shift bins of a copy of the smoothed spectrum in the opposite direction (1/freq factor)
    // (smooth again, or don't, or use MIP-mapping)
    calculate amplitudes * spectral envelope 2 / spectral envelope 1
----- end loop
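
The two loops above could look roughly like this in Python. This is only a sketch under assumptions: the smoothing is a plain moving average whose width follows the Glasberg & Moore ERB approximation, the envelope copy is read with linear interpolation, and all function names (`erb_hz`, `smooth_envelope`, `formant_gains`, `correct_formants`) are hypothetical.

```python
def erb_hz(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg & Moore approximation)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def smooth_envelope(mags, sample_rate, fft_size):
    """Envelope 1: moving average over roughly one ERB around each bin."""
    bin_hz = sample_rate / fft_size
    n = len(mags)
    env = []
    for k in range(n):
        half = max(1, int(round(0.5 * erb_hz(k * bin_hz) / bin_hz)))
        lo, hi = max(0, k - half), min(n, k + half + 1)
        env.append(sum(mags[lo:hi]) / (hi - lo))
    return env

def formant_gains(env, factor, floor=1e-9):
    """Per-bin gain: envelope 2 (copy shifted by 1/factor) / envelope 1."""
    n = len(env)
    gains = []
    for k in range(n):
        pos = k * factor                 # reading at k*factor shifts by 1/factor
        i = int(pos)
        if i >= n - 1:
            env2 = env[n - 1]            # clamp above the highest bin
        else:
            frac = pos - i
            env2 = (1.0 - frac) * env[i] + frac * env[i + 1]
        gains.append(env2 / max(env[k], floor))
    return gains

def correct_formants(mags, sample_rate, fft_size, factor):
    """Apply the gains before the bin shift, as described above."""
    env = smooth_envelope(mags, sample_rate, fft_size)
    return [m * g for m, g in zip(mags, formant_gains(env, factor))]
```

After the later bin shift by `factor`, the pre-shifted envelope ends up back at its original position, so only the fine structure of the spectrum moves. For `factor = 1.0` all gains are 1 and the amplitudes pass through unchanged.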


This seems correct (?) and does both pitch shifting and time stretching.
However, it doesn't sound good: it actually sounds kind of resonant and tinny
(noise elements seem to be converted to ringing sinusoids) and strange,
except when the shift is 0. It also sounds better for downshifts than for upshifts.

Also missing is a transient detection (in time domain?) to reset phases
to the original phases, and a noise / sinusoid detection, which might improve things.
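
One very simple way the missing transient detection might be tried in the time domain is an energy-ratio test between successive frames. This is only a sketch of one possible detector, not something already implemented here; the threshold `ratio` and the names `frame_energy`, `is_transient` and `reset_phases` are assumptions, and `reset_phases` expects the input phases to be already mapped to the target bins.

```python
def frame_energy(samples):
    """Sum of squared samples of one time domain frame."""
    return sum(s * s for s in samples)

def is_transient(prev_frame, frame, ratio=4.0, floor=1e-12):
    """True if the energy jumps by more than `ratio` from one frame to the next."""
    return frame_energy(frame) > ratio * max(frame_energy(prev_frame), floor)

def reset_phases(accum, input_phases):
    """Overwrite the phase accumulators with the original phases (in place)."""
    accum[:] = input_phases
```

The idea would be to call `reset_phases` for a frame whenever `is_transient` fires, instead of running the normal phase accumulation for that frame.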

Part of the bad sound may just be the formant correction: it sounds a little bit as if there were
two voices speaking in sync. Another part is the low number of overlaps.
Example:
https://soundcloud.com/traumlos_kalt/freq-domain-pv-shift-test-4e-2f-4c-test-2/s-6SJ93
1024 @ 22050 Hz

_______________________________________________
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp
