(Pdb) p np.fft.fft(randvec2irfft).real array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721, 0.891773 , 0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606, 0.92559664, 0.56804456, 0.52889492, 0.79172504, 0.38344152, 0.96366276, 0.891773 , 0.43758721, 0.64589411, 0.4236548 , 0.54488318, 0.60276338, 0.71518937]) (Pdb) p np.concatenate([randvec[:15-1], randvec[15-1:15], randvec[15-2:0:-1]]) array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 , 0.64589411, 0.43758721, 0.891773 , 0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606, 0.92559664, 0.56804456, 0.52889492, 0.79172504, 0.38344152, 0.96366276, 0.891773 , 0.43758721, 0.64589411, 0.4236548 , 0.54488318, 0.60276338, 0.71518937])
so the full complex frequencies are the positive reflection of the real ones. (Pdb) doublerandvec = np.concatenate([randvec[:15-1], randvec[15-1:15], randvec[15-2:0:-1]], axis=0) (Pdb) p np.fft.ifft(doublerandvec).real array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) doublerandvec is the frequencies, and i'm trying to debug the inverse fourier transform, which is easy to put into a matrix. (Pdb) freqs15f = fftfreq(len(doublerandvec), complex=True) (Pdb) p rfreqs15f array([0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, 0.5 ]) (Pdb) p np.fft.rfftfreq(len(doublerandvec)) array([0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, 0.5 ]) rfreqs15f is the real space frequencies for randvec. doublerandvec has the complex space frequencies. (Pdb) p freqs15f array([ 0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, -0.5 , -0.46428571, -0.42857143, -0.39285714, -0.35714286, -0.32142857, -0.28571429, -0.25 , -0.21428571, -0.17857143, -0.14285714, -0.10714286, -0.07142857, -0.03571429]) (Pdb) p np.fft.fftfreq(len(doublerandvec)) array([ 0. , 0.03571429, 0.07142857, 0.10714286, 0.14285714, 0.17857143, 0.21428571, 0.25 , 0.28571429, 0.32142857, 0.35714286, 0.39285714, 0.42857143, 0.46428571, -0.5 , -0.46428571, -0.42857143, -0.39285714, -0.35714286, -0.32142857, -0.28571429, -0.25 , -0.21428571, -0.17857143, -0.14285714, -0.10714286, -0.07142857, -0.03571429]) i made freqs15f to hold the full frequencies so now i can make a normal complex matrix, and compare it to the faulty real-valued ones (Pdb) ift15 = create_freq2time(freqs=freqs15f) i'm poking at it, and it looks like my analysis of rows and columns has been wrong in spots. the first dimension is the row. the second is the column. (Pdb) p ift15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429+2.62424314e-17j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429-1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j]) (Pdb) p irft15[:,6] array([ 0.06666667+0.00000000e+00j, 0.01483473+6.49951941e-02j, -0.06006459+2.89255826e-02j, -0.04156599-5.21220988e-02j, 0.04156599-5.21220988e-02j, 0.06006459+2.89255826e-02j, -0.01483473+6.49951941e-02j, -0.06666667+2.44929360e-17j, -0.01483473-6.49951941e-02j, 0.06006459-2.89255826e-02j, 0.04156599+5.21220988e-02j, -0.04156599+5.21220988e-02j, -0.06006459-2.89255826e-02j, 0.01483473-6.49951941e-02j, 0.06666667-4.89858720e-17j]) The matrices differ by a factor of approximately but not exactly 2.0 . I can now troubleshoot a problem cause to the scaling by freq_count in create_freq2time . This freq_count is wrong for real-valued data. [celebration] 0829 0842 I changed create_freq2time to keep track of freq_count and things look somewhat better. the assertion still fails. (Pdb) ift15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429+2.62424314e-17j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429-1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j]) (Pdb) irft15[:,6] array([ 0.03571429+0.00000000e+00j, 0.00794718+3.48188540e-02j, -0.03217746+1.54958478e-02j, -0.02226749-2.79225529e-02j, 0.02226749-2.79225529e-02j, 0.03217746+1.54958478e-02j, -0.00794718+3.48188540e-02j, -0.03571429+1.31212157e-17j, -0.00794718-3.48188540e-02j, 0.03217746-1.54958478e-02j, 0.02226749+2.79225529e-02j, -0.02226749+2.79225529e-02j, -0.03217746-1.54958478e-02j, 0.00794718-3.48188540e-02j, 0.03571429-2.62424314e-17j]) So, now the action of double the repeated range should make the right result. And I'm thinking I could undo that, when taking the inverse, by halving and mirroring them. 0849 Looks like the concatenation does now work: (Pdb) p randvec2irfft array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) (Pdb) p (doublerandvec @ np.concatenate([irft15[:-1], irft15[-1:].conj(), irft15[-2:0:-1].conj()], axis=0)).real array([ 6.23788233e-01, -1.10813893e-02, -2.20954659e-02, 3.42088940e-02, 3.90241536e-02, -5.46560429e-03, -4.05515245e-02, 2.58696585e-02, -2.15808106e-03, -1.95104618e-02, -6.35896998e-02, 9.52535278e-02, -4.83357438e-02, 1.69736668e-04, -3.84507295e-02, 1.69736668e-04, -4.83357438e-02, 9.52535278e-02, -6.35896998e-02, -1.95104618e-02, -2.15808106e-03, 2.58696585e-02, -4.05515245e-02, -5.46560429e-03, 3.90241536e-02, 3.42088940e-02, -2.20954659e-02, -1.10813893e-02]) so for those middle values, I can pass likely both the negative and the positive frequencies to the wavelet function and sum them. 0857 I added the summation and the assertion around the easier inverse transform is now passing. next is the forward matrix. i have some interest in, later, spending time organising some of these conditional cases; there's redundancy and special-casing. it's cool that it makes working real-valued inverse transform matrices now; they're half the size. 0900 0903 this has a clear inverse, which needs a square matrix. i was thinking i could make it square before inverting it. the matrix has only real components, rather than the imaginary components present in the full complex matrix. can i still turn it into a working square matrix? if not, the code that generates it does have access to the full frequencies.
# AGPL-3 Karl Semich 2022 import numpy as np # TODO: shift max_freq up with min_freq, when max_freq not specified def fftfreq(freq_count = None, sample_rate = None, min_freq = None, max_freq = None, dc_offset = True, complex = True, sample_time = None, repetition_time = None, repetition_samples = None, freq_sample_rate = None, freq_sample_time = None): ''' Calculates and returns the frequency bins used to convert between the time and frequency domains with a discrete Fourier transform. With no optional arguments, this function should be equivalent to numpy.fft.fftfreq . To specify frequencies in terms of repetitions / sample_count, set sample_rate to sample_count. If frequencies were in Hz (cycles/sec), this implies that sample_count should be considered to last 1 second, so the frequency becomes equal to the cycle count. Parameters: - freq_count: the number of frequency bins to generate - sample_rate: the time-domain sample rate (default: 1 sample/time_unit) - min_freq: the minimum frequency the signal contains - max_fraq: the maximum frequency the signal contains - dc_offset: whether to include a DC offset component (0 Hz) - complex: whether to generate negative frequencies - sample_time: sample_rate as the duration of a single sample - repetition_time: min_freq as a period time, possibly of a subsignal - repetition_samples: min_freq as a period size in samples - freq_sample_rate: convert to or from a different frequency-domain sample rate - freq_sample_time: freq_sample_rate as the duration of a single sample Returns a vector of sinusoid time scalings that can be used to perform or analyse a discrete Fourier transform. Multiply this vector by the time-domain sample rate to get the frequencies. ''' assert not sample_time or not sample_rate assert not freq_sample_time or not freq_sample_rate assert (min_freq, repetition_time, repetition_samples).count(None) >= 2 assert freq_count or min_freq or repetition_time or repetition_samples if sample_time is not None: sample_rate = 1 / sample_time if freq_sample_time is not None: freq_sample_rate = 1 / freq_sample_time sample_rate = sample_rate or freq_sample_rate or 1 freq_sample_rate = freq_sample_rate or sample_rate or 1 if repetition_time: min_freq = 1 / repetition_time elif repetition_samples: min_freq = freq_sample_rate / repetition_samples # here freq_count adopts the complex + DC offset value, for calculations # it might be clearer to use e.g. sample_count, and consolidate the cases with lower if freq_count is None: freq_count = int(np.ceil(freq_sample_rate / min_freq)) else: if not dc_offset: freq_count += 1 if not complex: freq_count = int((freq_count - 1) * 2) if not min_freq: min_freq = freq_sample_rate / freq_count if not max_freq: #max_freq = freq_sample_rate / 2 max_freq = freq_count * min_freq / 2 if freq_count % 2 != 0: #max_freq -= freq_sample_rate / (2 * freq_count) max_freq -= min_freq / 2 min_freq /= sample_rate max_freq /= sample_rate if freq_count % 2 == 0: if complex: neg_freqs = np.linspace(-max_freq, -min_freq, num=freq_count // 2, endpoint=True) pos_freqs = -neg_freqs[:0:-1] else: pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True) neg_freqs = pos_freqs[:0] else: pos_freqs = np.linspace(min_freq, max_freq, num=freq_count // 2, endpoint=True) neg_freqs = -pos_freqs[::-1] if complex else pos_freqs[:0] return np.concatenate([ np.array([0] if dc_offset else []), pos_freqs, neg_freqs ]) def create_freq2time(time_count = None, freqs = None, wavelet = lambda x: np.exp(2j * np.pi * x)): ''' Creates a matrix that will perform an inverse discrete Fourier transform when it post-multiplies a vector of complex frequency magnitudes. Example time_data = spectrum @ create_freq2time(len(spectrum)) Parameters: - time_count: size of the output vector, defaults to the frequency bincount - freqs: frequency bins to convert, defaults to traditional FFT frequencies Returns: - an inverse discrete Fourier matrix of shape (len(freqs), time_count) ''' assert (time_count is not None) or (freqs is not None) if freqs is None: freqs = fftfreq(time_count) elif time_count is None: if freqs[-1] < 0: # complex data freq_count = len(freqs) time_count = freq_count else: freq_count = len(freqs) time_count = (freq_count - 1) * 2 is_complex = bool(freqs[-1] < 0) has_dc_offset = bool(freqs[0] == 0) freq_count = len(freqs) if not has_dc_offset: freq_count += 1 if not is_complex: freq_count = (freq_count - 1) * 2 if time_count is None: time_count = freq_count offsets = np.arange(time_count) #mat = np.exp(2j * np.pi * np.outer(freqs, offsets)) mat = wavelet(np.outer(freqs, offsets)) if not is_complex: is_even = bool(freqs[-1] == 0.5) head_idx = 1 if has_dc_offset else 0 tail_idx = -1 if is_even else None mat[head_idx:tail_idx,:] += wavelet(np.outer(-freqs[head_idx:tail_idx], offsets)) return mat / freq_count # scaled to match numpy convention def create_time2freq(time_count = None, freqs = None, wavelet = lambda x: np.exp(2j * np.pi * x)): ''' Creates a matrix that will perform a forward discrete Fourier transform when it post-multiplies a vector of time series data. If time_count is too small or large, the minimal least squares solution over all the data passed will be produced. This function is equivalent to calling .pinv() on the return value of create_freq2time. If the return value is single-use, it is more efficient and accurate to use numpy.linalg.lstsq . Example spectrum = time_data @ create_time2freq(len(time_data)) Parameters: - time_count: size of the input vector, defaults to the frequency bincount - freqs: frequency bins to produce, defaults to traditional FFT frequencies Returns: - a discrete Fourier matrix of shape (time_count, len(freqs)) ''' forward_mat = create_freq2time(time_count, freqs, wavelet) reverse_mat = np.linalg.pinv(forward_mat) return reverse_mat def peak_pair_idcs(freq_data, dc_offset=True, sum=True): dc_offset = int(dc_offset) freq_heights = abs(freq_data) # squares and sums the components if sum: while len(freq_heights).shape > 1: freq_heights = freq_height.sum(axis=0) paired_heights = freq_heights[...,dc_offset:-1] + freq_heights[...,dc_offset+1:] peak_idx = paired_heights.argmax(axis=-1, keepdims=True) + dc_offset return np.concatenate(peak_idx, peak_idx + 1, axis=-1) def improve_peak(time_data, min_freq, max_freq): freqs = fftfreq(time_data.shape[-1], min_freq = min_freq, max_freq = max_freq) freq_data = time_data @ np.linalg.inv(create_freq2time(freqs)) left, right = peak_pair_idcs(freq_data) return freq_data[left], freq_data[right] def test(): np.random.seed(0) randvec = np.random.random(16) freqs16 = fftfreq(16) ift16 = create_freq2time(16, freqs=freqs16) ft16 = create_time2freq(16) randvec2time = randvec @ ift16 randvec2freq = randvec @ ft16 randvec2ifft = np.fft.ifft(randvec) randvec2fft = np.fft.fft(randvec) assert np.allclose(freqs16, np.fft.fftfreq(16)) assert np.allclose(randvec2ifft, randvec2time) assert np.allclose(randvec2fft, randvec2freq) assert np.allclose(randvec2ifft, np.linalg.solve(ft16.T, randvec)) assert np.allclose(randvec2fft, np.linalg.solve(ift16.T, randvec)) rfreqs15t = fftfreq(repetition_samples=15, complex=False) rfreqs15f = fftfreq(15, complex=False) irft15 = create_freq2time(freqs=rfreqs15f) rft15 = create_time2freq(15, freqs=rfreqs15t) randvec2rtime15 = randvec[:15] @ irft15 randvec2rfreq15 = randvec[:15] @ rft15 randvec2irfft = np.fft.irfft(randvec[:15]) randvec2rfft = np.fft.rfft(randvec[:15]) assert np.allclose(rfreqs15t, np.fft.rfftfreq(15)) assert np.allclose(rfreqs15f, np.fft.rfftfreq(28)) assert np.allclose(randvec2rtime15, randvec2irfft) assert np.allclose(randvec2rfreq15, randvec2rfft) # [ 0, 1, 2, 3, 4, 3, 2, 1] #irft9_16 = create_freq2time(16, fftfreq(9, complex = False)) #rft16_30 = create_time2freq(16, fftfreq(30, complex = False)) #randvec2 #assert np.allclose((randvec @ rft16) @ irft16, randvec) # sample data at a differing rate time_rate = np.random.random() * 2 freq_rate = 1.0 freqs = np.fft.fftfreq(len(randvec)) rescaling_freqs = fftfreq(len(randvec), freq_sample_rate = freq_rate, sample_rate = time_rate) rescaling_ift = create_freq2time(freqs = rescaling_freqs) rescaling_ft = create_time2freq(freqs = rescaling_freqs) rescaled_time_data = np.array([ np.mean([ randvec[freqidx] * np.exp(2j * np.pi * freqs[freqidx] * sampleidx / time_rate) for freqidx in range(len(randvec)) ]) for sampleidx in range(len(randvec)) ]) assert np.allclose(rescaled_time_data, randvec @ rescaling_ift) assert np.allclose(rescaled_time_data, np.linalg.solve(rescaling_ft.T, randvec)) unscaled_freq_data = rescaled_time_data @ rescaling_ft unscaled_time_data = unscaled_freq_data @ ift16 assert np.allclose(unscaled_freq_data, randvec) assert np.allclose(unscaled_time_data, randvec2time) assert np.allclose(np.linalg.solve(rescaling_ift.T, rescaled_time_data), randvec) # extract a repeating wave longvec = np.empty(len(randvec)*100) short_count = np.random.random() * 4 + 1 short_duration = len(longvec) / short_count for idx in range(len(longvec)): longvec[idx] = randvec[int((idx / len(longvec) * short_duration) % len(randvec))] wavelet = lambda x: (np.floor(x * 2) % 2) * 2 - 1 shortspace_freqs = fftfreq(None, complex = False, dc_offset = True, repetition_samples = len(randvec)) longspace_freqs = fftfreq(len(shortspace_freqs), complex = False, dc_offset = True, repetition_samples = short_duration) assert np.allclose(longspace_freqs, shortspace_freqs / (short_duration / len(randvec))) inserting_ift = create_freq2time(len(longvec), longspace_freqs, wavelet = wavelet) extracting_ft = create_time2freq(len(longvec), longspace_freqs, wavelet = wavelet) extracting_ift = create_freq2time(freqs = shortspace_freqs, wavelet = wavelet) unextracting_ft = create_time2freq(freqs = shortspace_freqs, wavelet = wavelet) inserting_spectrum = randvec @ unextracting_ft assert np.allclose(inserting_spectrum @ extracting_ift, randvec) assert np.allclose(longvec, inserting_spectrum @ inserting_ift) extracted_freqs = longvec @ extracting_ft extracted_randvec = extracted_freqs @ extracting_ift assert np.allclose(extracted_randvec, randvec) if __name__ == '__main__': test()