Re: [music-dsp] Computational complexity of common DSP algorithms

2020-03-19 Thread Ethan Duni
On Thu, Mar 19, 2020 at 8:11 AM Dario Sanfilippo 
wrote:

>
> I believe that the time complexity of FFT is O(nlog(n)); would you perhaps
> have a list or reference to a paper that shows the time complexity of
> common DSP systems such as a 1-pole filter?
>

The complexity depends on the topology. The cheapest topologies (direct
forms) are something like 2*M operations per sample, where M is the filter
order. Other topologies are optimized for other properties (such as noise
robustness, modulation robustness, etc.) and exhibit higher complexity -
generic state-variable topologies can scale as M^2 operations per sample,
for example.
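
To make the direct-form count concrete, here is a minimal per-sample sketch (the coefficient layout and history handling are just illustrative assumptions, not any particular filter):

    # One direct form I output sample for an order-M filter: about 2M+1
    # multiply-accumulates; the delay lines (xh, yh) are memory, not arithmetic.
    def df1_step(x_n, xh, yh, b, a):
        # b and a each hold M+1 coefficients, with a[0] assumed normalized to 1
        y_n = b[0] * x_n
        for k in range(1, len(b)):
            y_n += b[k] * xh[k - 1]   # M feedforward MACs
            y_n -= a[k] * yh[k - 1]   # M feedback MACs
        xh.insert(0, x_n); xh.pop()   # shift delay lines (memory, not math)
        yh.insert(0, y_n); yh.pop()
        return y_n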


> If simply comparing two algorithms by the number of operations needed to
> compute a sample, would you include delays in filters as an operation? I'm
> just wondering as some papers about FFT only include real multiplications
> and additions as operations.
>

Delays usually get accounted as memory requirements in this type of
analysis. That isn't to say that copying data around in a real computer
doesn't take time, but this is usually abstracted away in the generic DSP
algorithm accounting. The underlying assumption being that the DSP
throughput is essentially computation bound, and so reducing the total
number of MACs is the goal. But that's not terribly appropriate for a
software system running on a modern personal computer, for example.

Ethan



>
> Thanks for your help,
> Dario

[music-dsp] Sliding Phase Vocoder (was FIR blog post & interactive demo)

2020-03-19 Thread Ethan Duni
On Tue, Mar 10, 2020 at 1:05 PM Richard Dobson  wrote:

>
> Our ICMC paper can be found here, along with a few beguiling sound
> examples:
>
> http://dream.cs.bath.ac.uk/SDFT/


So this is pretty cool stuff. I can't say I've digested the whole idea yet,
but I had a couple of obvious questions.

In particular, the analyzer is defined by a recursive formula, and I gather
that the synthesizer effectively becomes an oscillator bank. So, are
special numerical techniques required to implement this, in order to avoid
the build-up of round-off noise over time?
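
For reference, here is a minimal sliding-DFT sketch in plain double precision numpy (window length and test signal are arbitrary assumptions) - the per-sample recursion is exactly where I'd expect round-off to accumulate:

    import numpy as np

    N = 64
    x = np.random.randn(4 * N)
    twiddle = np.exp(2j * np.pi * np.arange(N) / N)
    X = np.fft.fft(x[:N])                    # initialize on the first window
    for n in range(N, len(x)):
        X = (X + x[n] - x[n - N]) * twiddle  # slide the window by one sample
    # drift relative to a direct DFT of the final window
    print(np.max(np.abs(X - np.fft.fft(x[-N:]))))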

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-18 Thread Ethan Duni
Hi Eric

I'm sure your filterbank EQ sounds fine. Aliasing should be contained to a
very low level if appropriate windows/overlap are used and the filter
response isn't pushed to any extremes.

But, zero-phase (offline) processing is straightforward to achieve with
FIR. You just do a linear-phase design, and then compensate the output by
exactly that delay. Opinions differ as to whether linear phase is
preferable to minimum phase in audio terms, due to issues like pre-echo.
This is a moot point for mild EQ settings, which is why you most often see
linear/zero phase EQs used in mastering contexts.
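
As a minimal sketch of that recipe (scipy/numpy, with an arbitrary 101-tap lowpass standing in for the actual design, and x as a placeholder signal):

    import numpy as np
    from scipy.signal import firwin, lfilter

    x = np.random.randn(5000)                 # placeholder signal
    h = firwin(101, 0.25)                     # symmetric taps => linear phase
    d = (len(h) - 1) // 2                     # its constant group delay in samples
    # filter, then discard the first d samples to compensate => zero phase
    y = lfilter(h, [1.0], np.r_[x, np.zeros(d)])[d:]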

Zero-phase (offline) IIR can also be achieved by filtering in two
passes: one forward in time, and the other backwards in time. However this
requires the filter design to be a "square root" of the desired response,
so there is some extra work compared to the FIR flavor.

But, note that linear phase FIR requires the roots to come in reciprocal
pairs, which is an equivalent "squared response" constraint. I.e., you can
design a linear phase FIR by starting with a min-phase FIR of half the
length (and square root of the desired response), and then convolving it
with its time-reversal, just as in the IIR case.
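
Here are both "square root" constructions as a rough offline sketch (again scipy/numpy with arbitrary example designs; the half-length FIR below is not a true minimum-phase design, just a stand-in to show the convolve-with-reversal step):

    import numpy as np
    from scipy.signal import butter, lfilter, firwin

    x = np.random.randn(5000)                  # placeholder signal

    # IIR: design the square root of the target magnitude, run forward then
    # backward; the net response is |H|^2 with zero phase (offline only).
    b, a = butter(2, 0.25)
    y = lfilter(b, a, x)
    y = lfilter(b, a, y[::-1])[::-1]

    # FIR analog: a half-length kernel convolved with its time-reversal gives a
    # linear-phase filter with the squared magnitude (reciprocal root pairs).
    h_half = firwin(51, 0.25)                  # stand-in, not truly min-phase
    h_lin = np.convolve(h_half, h_half[::-1])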

Ethan

On Sun, Mar 15, 2020 at 6:06 PM Zhiguang Eric Zhang  wrote:

>
> Hi Ethan,
>
>
> It's been a few years since I've ran or heard this FFT filterbank EQ.  I
> do remember it being quite clean, indeed, I chose to work on it precisely
> because I realized that it could be designed to be zero-phase (meaning no
> phase distortion like you get from traditional FIR/IIR eqs).
>
> The 'perfect reconstruction' dogma is tricky because you have to remember
> that we are also discussing this in the context of audio coding and
> compression, where the coefficients are necessarily changed in order to get
> coding gain and also quantization noise (which is masked through
> application of the psychoacoustic model, etc).
>
> I urge you to take a look or even run the algorithm in my github with
> audio if you want to hear whether or not there is quantization noise from
> this FFT EQ or not (from changing the coefficients, etc).
>
>
> cheers,
> Eric Z
> https://www.github.com/kardashevian
>
> On Fri, Mar 13, 2020 at 6:18 PM Ethan Duni  wrote:
>
>> On Thu, Mar 12, 2020 at 9:35 PM robert bristow-johnson <
>> r...@audioimagination.com> wrote:
>>
>>>  i am not always persuaded that the analysis window is preserved in the
>>> frequency-domain modification operation.
>>
>>
>> It definitely is *not* preserved under modification, generally.
>>
>> The Perfect Reconstruction condition assumes that there is no
>> modification to the coefficients. It's just a basic guarantee that the
>> filterbank is actually able to reconstruct the signal to begin with. The
>> details of the windows/zero-padding determine exactly what happens to all
>> of the block processing artifacts when you modify things.
>>
>> if it's a phase vocoder and you do the Miller Puckette thing and apply
>>> the same phase to a entire spectral peak, then supposedly the window shape
>>> is preserved on each sinusoidal component.
>>
>>
>> Even that is only approximate IIRC, in that it assumes well-separated
>> sinusoids or similar?
>>
>> The larger point being that preserving window shape under modification is
>> an exceptional case that requires special handling.
>>
>> for analysis, there might be other properties of the window that is more
>>> important than being complementary.
>>>
>>
>> That's true enough: this isn't as crucial in analysis-only as it is for
>> synthesis. Although, I do consider Parseval to be pretty bedrock in terms
>> of DSP intuition, and would not want to introduce frame-rate modulations
>> into analysis without a clear reason (of which there are many good
>> examples, don't get me wrong).
>>
>>
>>> if, after analysis, i am modifying each Gaussian pulse and inverse DFT
>>> back to the time domain, i will have a Gaussian window effectively on the
>>> output frame.  by multiplying by a Hann window and dividing by the original
>>> Gaussian window, the result has a Hann window shape and that should be
>>> complementary in the overlap-add.
>>>
>>
>> So, a relevant distinction here is whether an STFT filterbank uses
>> matching analysis and synthesis windows. The PR condition is that their
>> product obeys COLA.
>>
>> In the vanilla case, the analysis and synthesis windows are constrained
>> to match (actually they're time-reversals of one another, but that only

Re: [music-dsp] FIR blog post & interactive demo

2020-03-13 Thread Ethan Duni
On Thu, Mar 12, 2020 at 9:35 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>  i am not always persuaded that the analysis window is preserved in the
> frequency-domain modification operation.


It definitely is *not* preserved under modification, generally.

The Perfect Reconstruction condition assumes that there is no modification
to the coefficients. It's just a basic guarantee that the filterbank is
actually able to reconstruct the signal to begin with. The details of the
windows/zero-padding determine exactly what happens to all of the block
processing artifacts when you modify things.

if it's a phase vocoder and you do the Miller Puckette thing and apply the
> same phase to a entire spectral peak, then supposedly the window shape is
> preserved on each sinusoidal component.


Even that is only approximate IIRC, in that it assumes well-separated
sinusoids or similar?

The larger point being that preserving window shape under modification is
an exceptional case that requires special handling.

for analysis, there might be other properties of the window that is more
> important than being complementary.
>

That's true enough: this isn't as crucial in analysis-only as it is for
synthesis. Although, I do consider Parseval to be pretty bedrock in terms
of DSP intuition, and would not want to introduce frame-rate modulations
into analysis without a clear reason (of which there are many good
examples, don't get me wrong).


> if, after analysis, i am modifying each Gaussian pulse and inverse DFT
> back to the time domain, i will have a Gaussian window effectively on the
> output frame.  by multiplying by a Hann window and dividing by the original
> Gaussian window, the result has a Hann window shape and that should be
> complementary in the overlap-add.
>

So, a relevant distinction here is whether an STFT filterbank uses matching
analysis and synthesis windows. The PR condition is that their product
obeys COLA.

In the vanilla case, the analysis and synthesis windows are constrained to
match (actually they're time-reversals of one another, but that only
matters for asymmetric windows). Then, the PR condition is COLA on the
square of the (common) window, and the appropriate window is of "square
root" type, such as cosine. This is a "balanced" design, in that the
analyzer and synthesizer play equal roles in the windowing.

Note that this matching constraint removes many degrees of freedom from the
window design. In general, for mismatched analysis and synthesis windows,
the PR condition is very "loose." For example, you can use literally
anything you want for the analysis window, provided the values are finite
and non-zero (negative is okay!). Then you can pick any COLA window, and
solve for the synthesis window as their ratio. In this way you can design
PR filterbanks with arbitrarily bad performance :P
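
A tiny numpy illustration of that recipe (frame length, hop, and the windows are arbitrary assumptions):

    import numpy as np

    N, hop = 512, 256
    n = np.arange(N)
    analysis = 0.2 + np.random.rand(N)                    # finite and non-zero, otherwise anything
    cola_target = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)   # periodic Hann: COLA at 50% overlap
    synthesis = cola_target / analysis                    # PR: product of windows obeys COLA

    prod = analysis * synthesis
    print(np.allclose(prod[:hop] + prod[hop:], 1.0))      # constant overlap-add check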

So for the mismatched case, we need some additional design principle(s) to
drive the window designs. Offhand, there seem to be two notable approaches
to this. One is that rectangular "windows" are desired on the synthesis
side in order to accommodate zero-padding/fast convolution type operation.
Then, the analysis window is whatever COLA window you care to use for
analysis purposes. As discussed, this is only appropriate for when the
modification is constrained to be a length-K FIR kernel.

The other is like your Gaussian example where you want to use a particular
window for analysis/modification reasons, and then need to square that with
the PR condition on the synthesis side. The downside here is that the
resulting synthesis windows are not as well behaved in terms of suppressing
block processing artifacts. They tend to become heavy-shouldered, exhibit
regions of amplification, etc. This can be worth it, but only if you gain
enough from the analysis/modification properties.


> > Rectangular windows are a perfectly valid choice here, albeit one with
> poor sidelobe suppression.
>
> but it doesn't matter with overlap-add fast convolution.  somehow, the
> sidelobe effects come out in the wash, because we can insure (to finite
> precision) the correctness of the output with a time-domain analysis.
>

Right, the rectangular windows are not being used for spectral estimation
in the fast convolution context, so their spectral properties are
irrelevant. They just represent a time-domain accounting of what the
circular convolution is doing.


> so you're oversampling in the frequency domain because you're zero-padding
> in the time domain.
>

Correct, zero-padding in the time domain is equivalent to upsampling in the
frequency domain.
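
(A quick numpy check of that statement, with an arbitrary signal:)

    import numpy as np

    x = np.random.randn(256)
    X = np.fft.fft(x)
    Xz = np.fft.fft(x, 2 * len(x))     # zero-pad by 2x in time
    print(np.allclose(Xz[::2], X))     # original bins land on every other sample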


> > Note that this equivalence happens because we are adding an additional
> time-variant stage (zero-padding/raw OLA), to explicitly correct for the
> time-variant effects of the underlying DFT operation. This is the block
> processing analog of upsampling a scalar signal by K so that we can apply
> an order-K polynomial nonlinearity without aliasing.
>
> where is

Re: [music-dsp] FIR blog post & interactive demo

2020-03-12 Thread Ethan Duni
Hi Robert

On Wed, Mar 11, 2020 at 4:19 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>
> i don't think it's too generic for "STFT processing".  step #4 is pretty
> generic.
>

I think the part that chafes my intuition is more that the windows in steps
#2 and #6 should "match" in some way, and obey an appropriate perfect
reconstruction condition. I think of STFT as intentionally wiping out any
spill-over effects between frames with synthesis windowing, to impose a
particular time-frequency tiling. Whereas fast convolution is defined by
how it explicitly accounts for spill-over between frames.

My intuition isn't definitive, but that's what comes to mind. In any case,
"STFT processing" is a very generic term.


>
> here is my attempt to quantitatively define and describe the STFT:
>
>
> https://dsp.stackexchange.com/questions/45625/is-windowed-fourier-transform-a-synonym-for-stft/45631#45631
>


Cool, that's a helpful reference for this stuff.

In terms of "what even is STFT", it seems there is more consensus on the
analysis part. Many STFT applications don't involve any synthesis or
filtering, but only frequency domain parameter estimation. For
analysis-only, probably everyone agrees that STFT consists of some Constant
OverLap Add (COLA) window scheme, followed by DFT. Rectangular windows are
a perfectly valid choice here, albeit one with poor sidelobe suppression.
Note that there are two potential layers of oversampling available: one
from overlapped windows, and another from zero-padding.

To summarize my understanding of your earlier remarks, the situation gets
fuzzier for synthesis. Broadly, there are two basic approaches. One is to
keep the COLA analysis and use raw (unwindowed) overlap-add for synthesis.
The other is to add synthesis windows, in which case the PR condition
becomes COLA on the product of the analysis and synthesis windows (I'd call
this "STFT filter bank" or maybe "FFT phase vocoder" depending on the
audience/application).

The first approach has immediate problems if the DFT values are modified,
because the COLA condition is not enforced on the output. For the special
case that the modification is multiplication by a DFT kernel that
corresponds to a length-K FIR filter, this can be accommodated by
zero-padding type oversampling, which results in the Overlap-Add flavor of
fast convolution to account for the inter frame effects. Note that this
implicitly extends the (raw) overlap-add region in synthesis accordingly -
the analysis windows obey COLA, but the synthesis "windows" have different
support and are not part of the PR condition.
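
For concreteness, a minimal overlap-add fast-convolution sketch of the special case just described (numpy; the block size and test signals are arbitrary assumptions):

    import numpy as np

    def ola_fast_fir(x, h, block=256):
        # rectangular COLA analysis "windows" of length `block`, zero-padded so
        # each length-K FIR tail spills into, and is added onto, the next frame
        nfft = block + len(h) - 1
        H = np.fft.rfft(h, nfft)
        y = np.zeros(len(x) + len(h) - 1)
        for start in range(0, len(x), block):
            frame = np.fft.irfft(np.fft.rfft(x[start:start + block], nfft) * H, nfft)
            end = min(start + nfft, len(y))
            y[start:end] += frame[:end - start]      # raw overlap-add of the tails
        return y

    x, h = np.random.randn(2000), np.random.randn(33)
    print(np.allclose(ola_fast_fir(x, h), np.convolve(x, h)))   # SISO-equivalent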

As you point out, this works for any COLA analysis window scheme, not just
rectangular, although the efficiency is correspondingly reduced with
overlap. This system is equivalent to a SISO FIR, up to finite word length
effects. Note that this equivalence happens because we are adding an
additional time-variant stage (zero-padding/raw OLA), to explicitly correct
for the time-variant effects of the underlying DFT operation. This is the
block processing analog of upsampling a scalar signal by K so that we can
apply an order-K polynomial nonlinearity without aliasing.

The synthesis window approach is more general in the types of modifications
that can be accommodated (spectral subtraction, nonlinear operations,
etc.). This is because it allows time domain aliasing to occur, but
explicitly suppresses it by attenuating the frame edges. This is also
throwing oversampling at the problem, but of the overlap type instead of
the zero-padding type.

You can also apply zero-padding on top of synthesis windows to further
increase the margin for circular aliasing. However unlike fast convolution
you would still apply the synthesis windows to remove spill-over between
frames and not use raw OLA. This is required for the filterbank PR
condition. There is no equivalent SISO system in this case. The level of
aliasing is determined by how hard you push on the response, and how much
overlap/zero-padding you can afford. I.e., it's ultimately engineered/tuned
rather than designed out explicitly as in fast convolution.

We're all on the same page on this stuff, I hope?

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-11 Thread Ethan Duni
On Tue, Mar 10, 2020 at 8:36 AM Spencer Russell  wrote:

>
> The point I'm making here is that overlap-add fast FIR is a special case
> of STFT-domain multiplication and resynthesis. I'm defining the standard
> STFT pipeline here as:
>
> 1. slice your signal into frames
> 2. pointwise-multiply an analysis window by each frame
> 3. perform `rfft` on each frame to give the STFT domain representation
> 4. modify the STFT representation
> 5. perform `irfft` on each frame
> 6. pointwise-multiply a synthesis window on each frame
> 7. overlap-add each frame to get the resulting time-domain signal
>

I don't think there is a precise definition of STFT, but IMO this is too
generic. The fundamental design parameters for an STFT system are the
window shapes and overlaps, but in fast convolution those degrees of
freedom are eliminated entirely.

The reason this distinction is important is that STFT is for cases where
you want to estimate the response in the frequency domain. If you can't
apply a useful analysis window, then there isn't much point.

If you already have your response expressed as length-K FIRs in the time
domain, then you don't need STFT. You just apply the FIRs directly (using
fast convolution if appropriate). STFT is not an attractive topology just
for implementing time-varying FIR, as such.


>
> This is just to make the point that fast FIR is a special case of STFT
> processing.


So, a useful distinction here is "oversampled filterbanks". The way that
fast convolution works is through oversampling/zero-padding. This creates
margin in the time domain to accommodate aliasing. You can apply
oversampling - in any amount - to an STFT system to reduce the aliasing at
the cost of increased overhead. Fast convolution is sort of a corner case,
where you can eliminate the aliasing entirely with finite oversampling, at
the cost of losing your analysis and synthesis windowing (and introducing
extra spill-over in the synthesis).


> > Right, but if you are using length K FFT and zero-padding by K-1, then
> > the hop size is 1 sample and there are no windows.
>
> Whoops, this was dumb on my part. I was not referring to a hop size of 1!
> Hopefully my explanation above is more clear.
>

But, the reason I made this point is you specified that the FFTs are length
K. If you are using length N+K FFTs, then the estimated response is length
N+K, and we are back to the original problem of ensuring that a DFT
response is appropriately time limited.

This can be done by brute force of course: length N+K IFFT => length K
window => length N+K FFT. But that is the same complexity as the STFT
signal path! So, we're back to smoothing/conditioning in the frequency
domain, windowing in the time domain, and accepting some engineered level
of time domain aliasing. The difference is that the oversampling (N+K vs N)
has given us additional margin to accommodate it (i.e., we can tolerate
sharper filters).

Fast convolution is for cases where filter design is done offline: then all
those filter computations can be done ahead of time,  there is no aliasing,
and everything is great! But if the filters get estimated at runtime then
you run into prohibitive costs. Since STFT is the latter case, in practice
fast convolution isn't much help. The two approaches are orthogonal in that
sense, despite their structural similarities.


>
> You could think of STFT multiplication as applying a different FIR filter
> to each frame and then cross-fading between them, which is clearly not the
> same as continually varying the FIR parameters in the time domain.


By linearity, cross-fading between two fixed FIRs is equivalent to applying
a single time-varying FIR with the coefficients cross-faded the same way
(at least for direct form). The synthesis window defines this cross-fade
behavior for STFT.
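
A quick numpy check of that linearity claim (arbitrary signals and kernels, direct-form convolution):

    import numpy as np

    x = np.random.randn(1000)
    h1, h2 = np.random.randn(8), np.random.randn(8)
    w = np.linspace(0.0, 1.0, len(x))                       # cross-fade curve

    # cross-fade the outputs of two fixed FIRs
    y_mix = w * np.convolve(x, h1)[:len(x)] + (1 - w) * np.convolve(x, h2)[:len(x)]

    # single time-varying FIR with the coefficients cross-faded the same way
    y_tv = np.zeros(len(x))
    for n in range(len(x)):
        h_n = w[n] * h1 + (1 - w[n]) * h2
        for k in range(len(h_n)):
            if n - k >= 0:
                y_tv[n] += h_n[k] * x[n - k]

    print(np.allclose(y_mix, y_tv))                         # identical outputs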

It's still not exactly equivalent, because the STFT is a time-variant MIMO
system that does not admit an exact SISO equivalent. However, the
difference is really more that it creates aliasing (which should be very
low level if properly designed), and not that you can't make an FIR with
comparable response. It's not trivial, but it is relatively straightforward
to take an STFT response, consider the window used to obtain it, and spit
out a corresponding FIR. These can then be interpolated according to the
synthesis windows to operate a comparable time-varying SISO FIR system.

Such a system might actually be preferable to the STFT synthesis chain,
since it wouldn't create circular convolution artifacts in the first place.
However, it is expensive - you're already running the STFT analysis chain,
converting the filter to the time domain is *more* expensive than the STFT
synthesis, and then you still have to run the time-varying FIR on top of
that.


> They do seem to have a tight relationship though, and when we do STFT
> modifications it seems that in some contexts we're trying to approximate
> the time-varying FIR filter.
>

Yeah, multiplying DFTs is funda

Re: [music-dsp] FIR blog post & interactive demo

2020-03-10 Thread Ethan Duni


> On Mar 10, 2020, at 3:38 AM, Richard Dobson  wrote:
> 
> You can have windows when hop size is 1 sample (as used in the sliding phase 
> vocoder (SPV) proposed by Andy Moorer exactly 20 years ago, and the focus of 
> a research project I was part of around 2007). So long as the window is based 
> on sines and cosines, it can be done bin by bin as a frequency-domain 
> convolution. The SPV has been implemented in Csound for those cases where 
> single-sample control rates were being used. For larger block sizes it 
> reverts to the standard block-based streaming phase vocoder.

That’s very interesting, I’d certainly like to read more.

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-09 Thread Ethan Duni
It is certainly possible to combine STFT with fast convolution in various ways. 
But doing so imposes significant overhead costs and constrains the overall 
design in strong ways. 

For example, this approach:

> On Mar 9, 2020, at 7:16 AM, Spencer Russell  wrote:
> 
> 
> if you have a KxN STFT (K frequency components and N frames) then 
> zero-padding each frame by K-1 should still eliminate any time-aliasing even 
> if your filter has hard edges in the frequency domain, right?

Right, but if you are using length K FFT and zero-padding by K-1, then the hop 
size is 1 sample and there are no windows. 

This is just applying the raw IDFT of the response as an FIR, which is not 
appropriate for something estimated in a windowed filterbank domain. Deriving 
an equivalent FIR from, say, an estimated noise reduction mask is not trivial.

> 
> I understand the role of time-domain windowing in STFT processing to be 
> mostly:
> 1. Reduce frequency-domain ripple (side-lobes in each band)

Right, this is the “analysis” aspect, where the window controls the spectral 
characteristics (frequency selectivity, bandwidth, leakage, etc.)

> 2. Provide a sort of cross-fade from frame-to-frame to smooth out framing 
> effects

And that is the “synthesis” aspect, where the window controls the 
characteristics of the artifacts introduced by processing. Note that “framing 
effects” are by definition time-variant: this is a form of aliasing.

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-08 Thread Ethan Duni
On Sun, Mar 8, 2020 at 8:02 PM Spencer Russell  wrote:

> In fact, the the standard STFT analysis/synthesis pipeline is the same
> thing as overlap-add "fast convolution" if you:
>
> 1. Use a rectangular window with a length equal to your hop size
> 2. zero-pad each input frame by the length of your FIR kernel minus 1
>

Indeed, the two ideas are closely related and can be combined. It's more a
difference in the larger approach.

If you can specify the desired response in terms of an FIR of some fixed
length, then you can account for the circular effects and use fast FIR.
Note that this is a time-variant MIMO system constructed to be equivalent
to a time-invariant SISO system (modulo finite word length effects, as you
say).

Alternatively, the desired response can be specified in the STFT domain.
This comes up naturally in situations where it is estimated in the
frequency domain to begin with, such as noise suppression or channel
equalization. Then, circular convolution effects are controlled through a
combination of pre/post windowing and smoothing/conditioning of the
frequency response. Unlike the fast FIR case, the time-variant effects are
only approximately suppressed: this is a time-variant MIMO system that is
*not* equivalent to any time-invariant SISO system.

So there is an extra layer of engineering needed in STFT systems to ensure
that time domain aliasing is adequately suppressed. With fast FIR, you just
calculate the correct size to zero-pad (or delete), and then there is no
aliasing to worry about.

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-08 Thread Ethan Duni

> 
> If the system is suitably designed (e.g. correct window and overlap),
> you can filter using an FFT and get identical results to a time domain
> FIR filter (up-to rounding/precision limits, of course). The
> appropriate window and overlap process will cause all circular
> convolution artefacts to cancel.

Fast FIR is a different thing than an FFT filter bank.

You can combine the two approaches but I don’t think that’s what is being done 
here?

Ethan

Re: [music-dsp] FIR blog post & interactive demo

2020-03-08 Thread Ethan Duni
No, MDCT TDAC is the same. Perfect reconstruction only obtains if the 
coefficients are not changed at all. Coding noise causes (uncancelled) time 
domain aliasing that is shaped according to the window design. Limiting this 
effect is a primary aspect of MDCT codec design.

Ethan 

> On Mar 8, 2020, at 4:45 PM, zhiguang zhang  wrote:
> 
> 
> Audio compression by definition 'alters' the transform coefficients and they 
> get perfect reconstruction with no aliasing due to the transform alone.  In 
> fact 'TDAC' or time domain aliasing cancellation is a hallmark of the MDCT or 
> DCT type IV which is ubiquitous in audio codecs.
> 
>> On Sun, Mar 8, 2020, 7:41 PM Ethan Duni  wrote:
>> FFT filterbanks are time variant due to framing effects and the circular 
>> convolution property. They exhibit “perfect reconstruction” if you design 
>> the windows correctly, but this only applies if the FFT coefficients are not 
>> altered between analysis and synthesis. If you alter the FFT coefficients 
>> (i.e., “filtering”), it causes time domain aliasing. 
>> 
>> So, the operation of such a system can’t be reduced down to an equivalent 
>> LTI frequency response. We have the more basic issue of limiting the 
>> aliasing to acceptable levels. This depends partially on the frame size, 
>> overlap, and window shape, as these determine how any aliasing is 
>> distributed in a time/frequency sense. But more fundamentally, you have to 
>> put constraints on the response curves to limit the aliasing. I.e. the 
>> steeper the transitions in the frequency response, the longer the implied 
>> impulse response, and so the time domain aliasing gets worse.
>> 
>> It is certainly possible to run any filter bank offline and compensate for 
>> its latency, in order to get a “zero phase” processing. But fundamentally 
>> they have framing delay given by the frame size, and algorithmic latency 
>> given by the overlap. These are the delays that you’d compensate when 
>> running offline.
>> 
>> Ethan
>> 
>>>> On Mar 8, 2020, at 2:04 PM, zhiguang zhang  
>>>> wrote:
>>>> 
>>> 
>>> The system is memoryless just because it is based on the DFT and nothing 
>>> else, which is also why it's time-invariant.  unless you alter certain 
>>> parameters in real-time like the window size or hop size or windowing 
>>> function, etc, any input gives you the same output at any given time, which 
>>> is the definition of time-invariance.
>>> 
>>> well, you're RBJ and I see that you used to work at Kurzweil until 2008.  
>>> that's cool and what have you been up to since then?  incidentally i was in 
>>> California until 2008.
>>> 
>>> As you might be able to tell, i don't care too much about the fact that 
>>> time domain filtering theory is brought up often because I did my master's 
>>> thesis with this frequency domain FFT filter, which I believe was rather 
>>> novel at the time of completion.  I do know that the field is steeped in 
>>> tradition, which might be why I'm writing to the mailing list and to you in 
>>> particular.  but bringing up traditional FIR/IIR filtering terminology to 
>>> describe FFT filtering doesn't make sense in my mind.  I'm not in the audio 
>>> field.  but yes, I do believe that the system is time invariant, but I 
>>> don't have time to prove myself to you on this forum at this time, nor do I 
>>> have any interest in meeting Dr Bosi at AES.
>>> 
>>> -ez
>>> 
>>> 
>>> 
>>>> On Sun, Mar 8, 2020 at 4:42 PM robert bristow-johnson 
>>>>  wrote:
>>>> 
>>>> 
>>>> > On March 8, 2020 3:07 PM zhiguang zhang  wrote:
>>>> > 
>>>> > 
>>>> > Well I believe the system is LTI just because the DFT is LTI by 
>>>> > definition.
>>>> 
>>>> TI is nowhere in the definition of the DFT.  L is a consequence of the 
>>>> definition of the DFT, but the DFT is not an LTI system.  it is an 
>>>> operation done to a finite segment of samples of a discrete-time signal.
>>>> 
>>>> > The impulse response of a rectangular window I believe is that of a sinc 
>>>> > function,
>>>> 
>>>> window functions do not have impulse responses.
>>>> 
>>>> both window functions and impulse responses can be Fourier transformed.  
>>>> the Fourier transform of the latter is what we call the "frequency response" of the system.

Re: [music-dsp] FIR blog post & interactive demo

2020-03-08 Thread Ethan Duni
FFT filterbanks are time variant due to framing effects and the circular 
convolution property. They exhibit “perfect reconstruction” if you design the 
windows correctly, but this only applies if the FFT coefficients are not 
altered between analysis and synthesis. If you alter the FFT coefficients 
(i.e., “filtering”), it causes time domain aliasing. 

So, the operation of such a system can’t be reduced down to an equivalent LTI 
frequency response. We have the more basic issue of limiting the aliasing to 
acceptable levels. This depends partially on the frame size, overlap, and 
window shape, as these determine how any aliasing is distributed in a 
time/frequency sense. But more fundamentally, you have to put constraints on 
the response curves to limit the aliasing. I.e. the steeper the transitions in 
the frequency response, the longer the implied impulse response, and so the 
time domain aliasing gets worse.

It is certainly possible to run any filter bank offline and compensate for its 
latency, in order to get a “zero phase” processing. But fundamentally they have 
framing delay given by the frame size, and algorithmic latency given by the 
overlap. These are the delays that you’d compensate when running offline.

Ethan

> On Mar 8, 2020, at 2:04 PM, zhiguang zhang  wrote:
> 
> 
> The system is memoryless just because it is based on the DFT and nothing 
> else, which is also why it's time-invariant.  unless you alter certain 
> parameters in real-time like the window size or hop size or windowing 
> function, etc, any input gives you the same output at any given time, which 
> is the definition of time-invariance.
> 
> well, you're RBJ and I see that you used to work at Kurzweil until 2008.  
> that's cool and what have you been up to since then?  incidentally i was in 
> California until 2008.
> 
> As you might be able to tell, i don't care too much about the fact that time 
> domain filtering theory is brought up often because I did my master's thesis 
> with this frequency domain FFT filter, which I believe was rather novel at 
> the time of completion.  I do know that the field is steeped in tradition, 
> which might be why I'm writing to the mailing list and to you in particular.  
> but bringing up traditional FIR/IIR filtering terminology to describe FFT 
> filtering doesn't make sense in my mind.  I'm not in the audio field.  but 
> yes, I do believe that the system is time invariant, but I don't have time to 
> prove myself to you on this forum at this time, nor do I have any interest in 
> meeting Dr Bosi at AES.
> 
> -ez
> 
> 
> 
>> On Sun, Mar 8, 2020 at 4:42 PM robert bristow-johnson 
>>  wrote:
>> 
>> 
>> > On March 8, 2020 3:07 PM zhiguang zhang  wrote:
>> > 
>> > 
>> > Well I believe the system is LTI just because the DFT is LTI by definition.
>> 
>> TI is nowhere in the definition of the DFT.  L is a consequence of the 
>> definition of the DFT, but the DFT is not an LTI system.  it is an operation 
>> done to a finite segment of samples of a discrete-time signal.
>> 
>> > The impulse response of a rectangular window I believe is that of a sinc 
>> > function,
>> 
>> window functions do not have impulse responses.
>> 
>> both window functions and impulse responses can be Fourier transformed.  the 
>> Fourier transform of the latter is what we call the "frequency response" of 
>> the system.  i am not sure what they call the fourier transform of a window 
>> function.  what is done with the frequency response (multiplication) is 
>> *not* what is done with the fourier transform of a window function 
>> (convolution).
>> 
>> > which has ripple artifacts.
>> 
>> there are no ripple artifacts in fast convolution using a rectangular 
>> window.  you need to learn what that is.
>> 
>> > Actually, the overlap-add method (sorry I don't have time to dig into the 
>> > differences between overlap-add and overlap-save right now)
>> 
>> what you need is time to learn the basics and learn the proper terminology 
>> of things so that confusion in communication is minimum.
>> 
>> > minimizes artifacts depending on the windowing function.
>> 
>> again, there are no ripple artifacts in fast convolution using a rectangular 
>> window.  none whatsoever.
>> 
>> > A sine window ...
>> 
>> i think you might mean the "Hann window" (sometimes misnamed "Hanning", but 
>> that is an old misnomer).  i have never heard of a "sine window" and i have 
>> been doing this for 45 years.  perhaps the classic Fred Harris paper on 
>> windows has a "sine window".
>> 
>> > ... actually sums to 1,
>> 
>> that's what we mean by "complementary".
>> 
>> > the proof of which can be found in audio coding theory. I suggest you 
>> > check out the book by Bosi.
>> 
>> i didn't even know Marina did a book, but i am not surprized.  i've known 
>> (or been acquainted with) Marina since she was with Digidesign back in the 
>> early 90s.  before the Dolby Lab days.  before her injury at the New York 
>> Hilton in 1993.  would you

Re: [music-dsp] FIR blog post & interactive demo

2020-03-08 Thread Ethan Duni
It is physically impossible to build a causal, zero-phase system with 
non-trivial frequency response. 

Ethan

> On Mar 7, 2020, at 7:42 PM, Zhiguang Eric Zhang  wrote:
> 
> 
> Not to threadjack from Alan Wolfe, but the FFT EQ was responsive, written in C 
> and running on a previous gen MacBook Pro from 2011.  It wouldn't have been 
> usable in a DAW even without any UI.  It was running FFTW.
> 
> As far as linear / zero-phase, I didn't think about the impulse response but 
> what you might get are ripple artifacts from the FFT windowing function.  
> Otherwise the algorithm is inherently zero-phase.
> 
>> On Sat, Mar 7, 2020, 7:11 PM robert bristow-johnson 
>>  wrote:
>> 
>> 
>> > On March 7, 2020 6:43 PM zhiguang zhang  wrote:
>> > 
>> > 
>> > Yes, essentially you do have the inherent delay involving a window of 
>> > samples in addition to the compute time.
>> 
>> yes.  but the compute time is really something to consider as a binary 
>> decision of whether or not the process can be real time.
>> 
>> assuming your computer can process these samples real time, the delay of 
>> double-buffering is *twice* the delay of a single buffer or "window" (i 
>> would not use that term, i might use the term "sample block" or "sample 
>> frame") of samples.  suppose your buffer or sample block is, say, 4096 
>> samples.  while you are computing your FFT (which is likely bigger than 4K), 
>> multiplication in the frequency domain, inverse FFT and overlap-adding or 
>> overlap-scrapping, you are inputting the 4096 samples to be processed for 
>> your *next* sample frame and you are outputting the 4096 samples of your 
>> *previous* sample frame.  so the entire delay, due to block processing, is 
>> two frame lengths, in this case, 8192 samples.
>> 
>> now if the processing time required to do the FFT, frequency-domain 
>> multiplication, iFFT, and overlap-add/scrap exceeds the time of one frame, 
>> then the process cannot be real time.  but if the time required to do all of 
>> that (including the overhead of interrupt I/O-ing the samples in/out of the 
>> blocks) is less than that of a frame, the process is or can be made into a 
>> real-time process and the throughput delay is two frames.
>> 
>> > > On Sat, Mar 7, 2020, at 6:00 AM, Zhiguang Eric Zhang wrote:
>> > > ... FFT filtering is essentially zero-phase ...
>> 
>> FFT filtering **can** be linear-phase (which is zero-phase plus a constant 
>> delay) if the FIR filter impulse response is designed to be linear-phase (or 
>> symmetrical).  it doesn't have to be linear phase.
>> 
>> --
>> 
>> r b-j  r...@audioimagination.com
>> 
>> "Imagination is more important than knowledge."
>> 
>> > On Sat, Mar 7, 2020, 5:40 PM Spencer Russell  wrote:
>> > > On Sat, Mar 7, 2020, at 6:00 AM, Zhiguang Eric Zhang wrote:
>> > > > Traditional FIR/IIR filtering is ubiquitous but actually does suffer 
>> > > > from drawbacks such as phase distortion and the inherent delay 
>> > > > involved. FFT filtering is essentially zero-phase, but instead of 
>> > > > delays due to samples, you get delays due to FFT computational 
>> > > > complexity instead.
>> > > 
>> > > I wouldn’t say the delay when using FFT processing is due to 
>> > > computational complexity fundamentally. Compute affects your max 
>> > > throughput more than your latency. In other words, if you had an 
>> > > infinitely-fast computer you would still have to deal with latency. The 
>> > > issue is just that you need at least 1 block of input before you can do 
>> > > anything. It’s the same thing as with FIR filters, they need to be 
>> > > causal so they can’t be zero-phase. In fact you could interchange the 
>> > > FFT processing with a bank of FIR band pass filters that you sample from 
>> > > whenever you want to get your DFT frame. (that’s basically just a 
>> > > restatement of what I said before about the STFT.)
>> > > 
>> > > -s

Re: [music-dsp] high & low pass correlated dither noise question

2019-06-27 Thread Ethan Duni
So as Nigel and Robert have already explained, in general you need to
separately handle the spectral shaping and pdf shaping. This dither
algorithm works by limiting to the particular case of triangular pdf with a
single pole at z=+/-1. For that case, the state of the spectral shaping
filter can be combined with the state of the pdf shaper, and so a single
process (with no multiplies!) handles both pdf shaping and spectral shaping.

For arbitrary order M, you would roll one die at each step and then sum it
with M previous rolls (possibly with some set of signs inverted). So the OP
example is M=1. You have your choice of 2^M spectral shapes, depending on
which (if any) of the previous rolls you invert. For the "highness" output,
you will want to invert every other previous roll. As M increases, the
output gets more Gaussian.
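
As a minimal sketch of the M=1 case (numpy, uniform "rolls" in [-0.5, 0.5); no multiplies are needed for the shaping itself):

    import numpy as np

    rng = np.random.default_rng(0)
    r = rng.uniform(-0.5, 0.5, size=100000)   # one fresh "die roll" per sample
    lowpass_tpdf  = r[1:] + r[:-1]            # triangular pdf, zero at z = -1 (Nyquist)
    highpass_tpdf = r[1:] - r[:-1]            # triangular pdf, zero at z = +1 (DC)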

However for higher orders this multiplier-free algorithm does not produce
attractive spectral shapes. For even orders, the highpass does not have a
zero at z=1. For odd orders, the frequency response has large notches in
the middle of the bandwidth.

For most applications, a triangular pdf with single zero at z=1 is a
perfectly good dither configuration, and there is no need to go any
further. If you are looking for a higher-order dither algorithm without
multiplies, I think the way to extend this would be to include bit shifts
in the summation. Then you can get some reasonable spectral shapes. The
simple summation approach is too constrained for orders>1.

Ethan

On Thu, Jun 27, 2019 at 7:43 AM Alan Wolfe  wrote:

> I read a pretty cool article the other day:
> https://www.digido.com/ufaqs/dither-noise-probability-density-explained/
>
> It says that if you have two dice (A and B) that you can roll both dice
> and then...
> 1) Re-roll die A and sum A and B
> 2) Re-roll die B and sum A and B
> 3) Re-roll die A and sum A and B
> 4) repeat to get a low pass filtered triangular noise distribution.
>
> It says that you can modify it for high pass filtered triangle noise by
> rolling both dice and then...
> 1) Re-roll die A and take A - B
> 2) Re-roll die B and take B - A
> 3) Re-roll die A and take A - B
> 4) repeat to get a high pass filtered triangular noise distribution.
>
> What i'm wondering is, what is the right thing to do if you want to do
> this with more than 2 dice? (going higher order)
>
> For low pass filtered noise with 3+ more dice (which would be more
> gaussian distributed than triangle), would you only re-roll one die each
> time, or would you reroll all BUT one die each time.
>
> I have the same question about the high pass filtered noise with 3+ more
> dice, but in that case I think i know what to do about the subtraction
> order...  I think the right thing to do if you have N dice is to sum them
> all up, but after each "roll" you flip the sign of every die.
>
> What do you guys think?

Re: [music-dsp] Who uses YIN or pYIN for pitch detection?

2019-03-06 Thread Ethan Duni
Looks like they use the Viterbi algorithm to get the pitch tracks. 

> On Mar 6, 2019, at 6:59 PM, Jay  wrote:
> 
> 
> Looks like there's a link to a python implementation on this topics page, 
> might provide some insights:
> https://github.com/topics/pitch-tracking
> 
> 
> 
> 
> 
> 
> 
> 
>> On Wed, Mar 6, 2019 at 6:44 PM robert bristow-johnson 
>>  wrote:
>>  
>> 
>> Hay, any peeps around here that use YIN?  or pYIN?
>> 
>> Some of you who hang around the DSP Stack Exchange might know that I am 
>> unimpressed with YIN, namely that I don't think there is anything novel 
>> about it (w.r.t. Average Squared Difference Function, ASDF) other than this 
>> so-called "Cumulative Mean Normalized Difference Function"(CMNDF) which 
>> seems to have the only purpose to prevent choosing the lag of 0 as the 
>> best-fit lag.  Big deal.  There are other ways to do that, and otherwise 
>> the CMNDF just fucks up the ASDF result, at least a little, at the lags 
>> around the period length.  And ASDF is still the measure of best fit.  Here 
>> is where I complain a little about YIN:
>> > https://dsp.stackexchange.com/questions/51823/yin-pitch-estimation-algoritm-simplified-explanation/51842#51842
>> >  
>> 
>> Here is a copy of the original YIN:
>> > [Cheveigne A, Kawahara H. - *YIN, a fundamental frequency estimator for 
>> > speech and music*](http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf )
>>  
>> and the new, improved probabilistic YIN:
>> > [Mauch M, Dixon S. - *PYIN: A fundamental frequency estimator using 
>> > probabilistic threshold 
>> > distributions*](http://matthiasmauch.de/_pdf/mauch_pyin_2014.pdf )
>> Now, while I don't want to use YIN to find pitch candidates (I think I do a 
>> better job of it with just the ASDF), I am curious about pYIN in what 
>> exactly they do with the pitch candidates.  I understand Hidden Markov 
>> Models, or at least I used to, but I do not know what Mauch and Dixon do to 
>> actually pick the final candidate.  Has anyone else slogged through this 
>> enough to understand what they are doing?  How do they connect a candidate 
>> from the previous frame to a candidate of the current frame?, and then, how 
>> does pYIN score each candidate and choose the candidate that will be output 
>> as the pitch?
>> 
>> If anyone worked on this, please lemme know.
>> 
>> 
>> --
>> 
>> r b-j r...@audioimagination.com
>> 
>> "Imagination is more important than knowledge."
>>  
>> 
>>  
>> 
>>  
>> 
>>  
>> 

Re: [music-dsp] Auto-tune sounds like vocoder

2019-01-16 Thread Ethan Duni
Aren't Auto-Tune and similar built on LPC vocoders? I had the impression
that was publicly known (recalling magazine interviews/articles from the
late 90s). The secret sauce being all the stuff required for pitch
tracking, unvoiced segments, different tunings, vibrato, corner cases, etc.

But as far as "sounds like a vocoder," the basic nature of the effect is
exactly that: you have a formant filter that tracks the input speech, which
you excite with a synthetic signal. If that synthetic signal tracks the
real excitation very closely, this sounds quite natural. If you push hard
on the effect (or just do a bad job at the synthesis part), the artificial
nature of the excitation becomes apparent and the result is essentially the
"classic" synthesizer-driven vocoder sound.

Also the factors others have mentioned: stuff like hard pitch quantization
and intermodulation artifacts make the excitation sound "robotic", and you
get a further chorus effect from mixing with unprocessed input.

Ethan

On Wed, Jan 16, 2019 at 1:16 AM Andy Farnell 
wrote:

> On Tue, Jan 15, 2019 at 08:05:11PM +0100, David Reaves wrote:
>
> > I’m wondering about why the ever-prevalent auto-tune effect in much
> > of today's (cough!) music (cough!) seems, to my ears, to have such
> > a vocoder-y sound to it. Are the two effects related?
>
> So, I would say yes, they're related. Weakly. As Sampo says,
> the method is essentially a grain-wise Fourier reconstruction.
> Upshot is it sounds like a vocoder because it is the voice
> 'vocoded' with a pulse stream at near to the original fundamental
> (but corrected). Additionally two other things enhance the
> psychoacoustic impression that it's a classic vocoder. First
> is the pitch quantisation, so when you glissando there's
> a stepped effect that makes the banding stand out more.
> And second, as Ben says, some mixing of the dry and wet usually
> produces a chorus/flanger effect on top.
>
> Disclaimer: I have never seen the Antares source code so
> could be guessing very wrongly, but that's what my ears think.
>
> best,
> Andy
>
>
>

Re: [music-dsp] 2-point DFT Matrix for subbands Re: FFT for realtime synthesis?

2018-11-09 Thread Ethan Duni
gm wrote:
>This is brining up my previous question again, how do you decimate a
spectrum
>by an integer factor properly, can you just add the bins?

To decimate by N, you just take every Nth bin.

>the original spectrum represents a longer signal so I assume folding
>of the waveform occurs?

Yeah, you will get time-domain aliasing unless your DFT is oversampled
(i.e., zero-padded in time domain) by a factor of (at least) N to begin
with. For critically sampled signals the result is severe distortion (i.e.,
SNR ~= 0dB).
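
A small numpy illustration of both points (arbitrary lengths):

    import numpy as np

    L, N = 1024, 4
    x = np.random.randn(L)
    decimated = np.fft.fft(x)[::N]                      # keep every Nth bin
    folded = x.reshape(N, L // N).sum(axis=0)           # time-domain folding of the waveform
    print(np.allclose(decimated, np.fft.fft(folded)))   # identical: bin decimation <=> time aliasing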

>but maybe this doesn't matter in practice for some applications?

The only applications I know of that tolerate time-domain aliasing in
transforms are WOLA filter banks - which are explicitly designed to cancel
these (severe!) artifacts in the surrounding time-domain processing.

Ethan D

On Fri, Nov 9, 2018 at 6:39 AM gm  wrote:

> This is brining up my previous question again, how do you decimate a
> spectrum
> by an integer factor properly, can you just add the bins?
>
> the original spectrum represents a longer signal so I assume folding
> of the waveform occurs? but maybe this doesn't matter in practice for some
> applications?
>
> The background is still that I want to use a higher resolution for
> analysis and
> a lower resolution for synthesis in a phase vocoder.
>
> Am 08.11.2018 um 21:45 schrieb Ethan Duni:
>
> Not sure can get the odd bins *easily*, but it is certainly possible.
> Conceptually, you can take the (short) IFFT of each block, then do the
> (long) FFT of the combined blocks. The even coefficients simplify out as
> you observed, the odd ones will be messier. Not sure quite how messy - I've
> only looked at the details for DCT cases.
>
> Probably the clearest way to think about it is in the frequency domain.
> Conceptually, the two consecutive short DFTs are the same as if we had
> taken two zero-padded long DFTs, and then downsampled each by half. So the
> way to combine them is to reverse that process: upsample them by 2, and
> then add them together (with appropriate compensation for the
> zero-padding/boxcar window).
>
> Ethan D
>
> On Thu, Nov 8, 2018 at 8:12 AM Ethan Fenn  wrote:
>
>> I'd really like to understand how combining consecutive DFT's can work.
>> Let's say our input is x0,x1,...x7 and the DFT we want to compute is
>> X0,X1,...X7
>>
>> We start by doing two half-size DFT's:
>>
>> Y0 = x0 + x1 + x2 + x3
>> Y1 = x0 - i*x1 - x2 + i*x3
>> Y2 = x0 - x1 + x2 - x3
>> Y3 = x0 + i*x1 - x2 - i*x3
>>
>> Z0 = x4 + x5 + x6 + x7
>> Z1 = x4 - i*x5 - x6 + i*x7
>> Z2 = x4 - x5 + x6 - x7
>> Z3 = x4 + i*x5 - x6 - i*x7
>>
>> Now I agree because of periodicity we can compute all the even-numbered
>> bins easily: X0=Y0+Z0, X2=Y1+Z1, and so on.
>>
>> But I don't see how we can get the odd bins easily from the Y's and Z's.
>> For instance we should have:
>>
>> X1 = x0 + (r - r*i)*x1 - i*x2 + (-r - r*i)*x3 - x4 + (-r + r*i)*x5 + i*x6
>> + (r + r*i)*x7
>>
>> where r=sqrt(1/2)
>>
>> Is it actually possible? It seems like the phase of the coefficients in
>> the Y's and Z's advance too quickly to be of any use.
>>
>> -Ethan
>>
>>
>>
>> On Mon, Nov 5, 2018 at 3:40 PM, Ethan Duni  wrote:
>>
>>> You can combine consecutive DFTs. Intuitively, the basis functions are
>>> periodic on the transform length. But it won't be as efficient as having
>>> done the big FFT (as you say, the decimation in time approach interleaves
>>> the inputs, so you gotta pay the piper to unwind that). Note that this is
>>> for naked transforms of successive blocks of inputs, not a WOLA filter
>>> bank.
>>>
>>> There are Dolby codecs that do similar with a suitable flavor of DCT
>>> (type II I think?) - you have your encoder going along at the usual frame
>>> rate, but if it detects a string of stationary inputs it can fold them
>>> together into one big high-res DCT and code that instead.
>>>
>>> On Mon, Nov 5, 2018 at 11:34 AM Ethan Fenn 
>>> wrote:
>>>
>>>> I don't think that's correct -- DIF involves first doing a single stage
>>>> of butterfly operations over the input, and then doing two smaller DFTs on
>>>> that preprocessed data. I don't think there is any reasonable way to take
>>>> two "consecutive" DFTs of the raw input data and combine them into a longer
>>>> DFT.
>>>>
>>>> (And I don't know anything about the historical question!)
>>>>
>>>> -Ethan

Re: [music-dsp] 2-point DFT Matrix for subbands Re: FFT for realtime synthesis?

2018-11-08 Thread Ethan Duni
Not sure can get the odd bins *easily*, but it is certainly possible.
Conceptually, you can take the (short) IFFT of each block, then do the
(long) FFT of the combined blocks. The even coefficients simplify out as
you observed, the odd ones will be messier. Not sure quite how messy - I've
only looked at the details for DCT cases.

Probably the clearest way to think about it is in the frequency domain.
Conceptually, the two consecutive short DFTs are the same as if we had
taken two zero-padded long DFTs, and then downsampled each by half. So the
way to combine them is to reverse that process: upsample them by 2, and
then add them together (with appropriate compensation for the
zero-padding/boxcar window).
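
Here's a tiny numpy sketch of both views for a length-8 example, just to make the bookkeeping concrete:

    import numpy as np

    x = np.random.randn(8)
    X8 = np.fft.fft(x)
    Y, Z = np.fft.fft(x[:4]), np.fft.fft(x[4:])

    print(np.allclose(X8[::2], Y + Z))          # even bins come for free

    # frequency-domain view: each short DFT is a downsampled zero-padded long DFT,
    # and the un-downsampled zero-padded long DFTs simply add up to X8
    A = np.fft.fft(np.r_[x[:4], np.zeros(4)])
    B = np.fft.fft(np.r_[np.zeros(4), x[4:]])
    print(np.allclose(Y, A[::2]), np.allclose(Z, B[::2]), np.allclose(A + B, X8))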

Ethan D

On Thu, Nov 8, 2018 at 8:12 AM Ethan Fenn  wrote:

> I'd really like to understand how combining consecutive DFT's can work.
> Let's say our input is x0,x1,...x7 and the DFT we want to compute is
> X0,X1,...X7
>
> We start by doing two half-size DFT's:
>
> Y0 = x0 + x1 + x2 + x3
> Y1 = x0 - i*x1 - x2 + i*x3
> Y2 = x0 - x1 + x2 - x3
> Y3 = x0 + i*x1 - x2 - i*x3
>
> Z0 = x4 + x5 + x6 + x7
> Z1 = x4 - i*x5 - x6 + i*x7
> Z2 = x4 - x5 + x6 - x7
> Z3 = x4 + i*x5 - x6 - i*x7
>
> Now I agree because of periodicity we can compute all the even-numbered
> bins easily: X0=Y0+Z0, X2=Y1+Z1, and so on.
>
> But I don't see how we can get the odd bins easily from the Y's and Z's.
> For instance we should have:
>
> X1 = x0 + (r - r*i)*x1 - i*x2 + (-r - r*i)*x3 - x4 + (-r + r*i)*x5 + i*x6
> + (r + r*i)*x7
>
> where r=sqrt(1/2)
>
> Is it actually possible? It seems like the phase of the coefficients in
> the Y's and Z's advance too quickly to be of any use.
>
> -Ethan
>
>
>
> On Mon, Nov 5, 2018 at 3:40 PM, Ethan Duni  wrote:
>
>> You can combine consecutive DFTs. Intuitively, the basis functions are
>> periodic on the transform length. But it won't be as efficient as having
>> done the big FFT (as you say, the decimation in time approach interleaves
>> the inputs, so you gotta pay the piper to unwind that). Note that this is
>> for naked transforms of successive blocks of inputs, not a WOLA filter
>> bank.
>>
>> There are Dolby codecs that do similar with a suitable flavor of DCT
>> (type II I think?) - you have your encoder going along at the usual frame
>> rate, but if it detects a string of stationary inputs it can fold them
>> together into one big high-res DCT and code that instead.
>>
>> On Mon, Nov 5, 2018 at 11:34 AM Ethan Fenn 
>> wrote:
>>
>>> I don't think that's correct -- DIF involves first doing a single stage
>>> of butterfly operations over the input, and then doing two smaller DFTs on
>>> that preprocessed data. I don't think there is any reasonable way to take
>>> two "consecutive" DFTs of the raw input data and combine them into a longer
>>> DFT.
>>>
>>> (And I don't know anything about the historical question!)
>>>
>>> -Ethan
>>>
>>>
>>>
>>> On Mon, Nov 5, 2018 at 2:18 PM, robert bristow-johnson <
>>> r...@audioimagination.com> wrote:
>>>
>>>>
>>>>
>>>> Ethan, that's just the difference between Decimation-in-Frequency FFT
>>>> and Decimation-in-Time FFT.
>>>>
>>>> i guess i am not entirely certainly of the history, but i credited both
>>>> the DIT and DIF FFT to Cooley and Tukey.  that might be an incorrect
>>>> historical impression.
>>>>
>>>>
>>>>
>>>>  Original Message
>>>> 
>>>> Subject: Re: [music-dsp] 2-point DFT Matrix for subbands Re: FFT for
>>>> realtime synthesis?
>>>> From: "Ethan Fenn" 
>>>> Date: Mon, November 5, 2018 10:17 am
>>>> To: music-dsp@music.columbia.edu
>>>>
>>>> --
>>>>
>>>> > It's not exactly Cooley-Tukey. In Cooley-Tukey you take two
>>>> _interleaved_
>>>> > DFT's (that is, the DFT of the even-numbered samples and the DFT of
>>>> the
>>>> > odd-numbered samples) and combine them into one longer DFT. But here
>>>> you're
>>>> > talking about taking two _consecutive_ DFT's. I don't think there's
>>>> any
>>>> > cheap way to combine these to exactly recover an individual bin of the longer DFT.

Re: [music-dsp] 2-point DFT Matrix for subbands Re: FFT for realtime synthesis?

2018-11-05 Thread Ethan Duni
You can combine consecutive DFTs. Intuitively, the basis functions are
periodic on the transform length. But it won't be as efficient as having
done the big FFT (as you say, the decimation in time approach interleaves
the inputs, so you gotta pay the piper to unwind that). Note that this is
for naked transforms of successive blocks of inputs, not a WOLA filter
bank.

There are Dolby codecs that do similar with a suitable flavor of DCT (type
II I think?) - you have your encoder going along at the usual frame rate,
but if it detects a string of stationary inputs it can fold them together
into one big high-res DCT and code that instead.

On Mon, Nov 5, 2018 at 11:34 AM Ethan Fenn  wrote:

> I don't think that's correct -- DIF involves first doing a single stage of
> butterfly operations over the input, and then doing two smaller DFTs on
> that preprocessed data. I don't think there is any reasonable way to take
> two "consecutive" DFTs of the raw input data and combine them into a longer
> DFT.
>
> (And I don't know anything about the historical question!)
>
> -Ethan
>
>
>
> On Mon, Nov 5, 2018 at 2:18 PM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>> Ethan, that's just the difference between Decimation-in-Frequency FFT and
>> Decimation-in-Time FFT.
>>
>> i guess i am not entirely certainly of the history, but i credited both
>> the DIT and DIF FFT to Cooley and Tukey.  that might be an incorrect
>> historical impression.
>>
>>
>>
>>  Original Message 
>> Subject: Re: [music-dsp] 2-point DFT Matrix for subbands Re: FFT for
>> realtime synthesis?
>> From: "Ethan Fenn" 
>> Date: Mon, November 5, 2018 10:17 am
>> To: music-dsp@music.columbia.edu
>> --
>>
>> > It's not exactly Cooley-Tukey. In Cooley-Tukey you take two
>> _interleaved_
>> > DFT's (that is, the DFT of the even-numbered samples and the DFT of the
>> > odd-numbered samples) and combine them into one longer DFT. But here
>> you're
>> > talking about taking two _consecutive_ DFT's. I don't think there's any
>> > cheap way to combine these to exactly recover an individual bin of the
>> > longer DFT.
>> >
>> > Of course it's possible you'll be able to come up with a clever
>> frequency
>> > estimator using this information. I'm just saying it won't be exact in
>> the
>> > way Cooley-Tukey is.
>> >
>> > -Ethan
>> >
>> >
>>
>>
>>
>> --
>>
>> r b-j r...@audioimagination.com
>>
>> "Imagination is more important than knowledge."
>>
>>
>>
>>
>>
>>
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Antialiased OSC

2018-11-01 Thread Ethan Duni
Well you definitely want a monotonic, equal-amplitude crossfade, and
probably also time symmetry. So I think raised sinc is right out.

In terms of finer design considerations it depends on the time scale. For
longer crossfades (>100ms), steady-state considerations apply, and you can
design for frequency domain characteristics. I.e., raised cosine, half of
your favorite analysis window, etc.

But for shorter crossfades (particularly 20ms and below), time domain
considerations dominate and you want to minimize the max slope of the
crossfade curve. So a linear crossfade is indicated here.

Of course linear crossfade is also the cheapest option, so you really need
a reason *not* to use it.
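
As a quick numerical illustration (numpy; the 64-sample fade length is arbitrary, and the raised cosine stands in for the "half of your favorite analysis window" family):

import numpy as np

L = 64
t = np.arange(L + 1) / L

linear_in = t                                # linear fade-in (fade-out is 1 - t)
cosine_in = 0.5 * (1.0 - np.cos(np.pi * t))  # raised-cosine fade-in

# Max per-sample slope, normalized by 1/L: linear is 1.0, the raised cosine
# peaks near pi/2 at its midpoint, i.e. ~57% steeper -- which is what hurts
# on very short (sub-20 ms) crossfades.
print(np.max(np.diff(linear_in)) * L)   # -> 1.0
print(np.max(np.diff(cosine_in)) * L)   # -> ~1.57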

Ethan (D)

On Thu, Nov 1, 2018 at 12:18 PM robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Antialiased OSC
> From: "Sampo Syreeni" 
> Date: Wed, October 31, 2018 9:35 pm
> To: philb...@mobileer.com
> "A discussion list for music-related DSP" 
> Cc: "robert bristow-johnson" 
> --
>
> > On 2018-08-06, Phil Burk wrote:
> >
> >> I crossfade between two adjacent wavetables.
> >
> > Yes. Now the question is, how to fade between them, optimally.
> >
> > I once again don't have any math to back this up, but intuition says the
> > mixing function ought to be something like a sinc function or a raised
> > cosine, at the lower rate. Because off the inherent bandlimit. And then
> > the ability of such linear phase thingies to be turned into one-off
> > interpolation thingies.
> >
> > Doing it at the lower rate, for the lower wavetable, would seem to be
> > the easiest, while holding to band limitation.
>
> interpolating between samples of a wavetable and crossfading between
> wavetables are different issues.
>
> if this wavetable synthesis is for the purpose of synthesizing a
> bandlimited saw, square, triangle, PWM, sync saw, sync square, then you
> adjacent wavetables going up and down the keyboard should be identical
> except on will have more harmonics at the top set to zero.
>
> i think a linear crossfade, mixing only the two adjacent wavetables, is
> the correct way to do it.
>
>
> --
>
> r b-j r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
>
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] pitch shifting in frequency domain Re: FFT for realtime synthesis?

2018-10-28 Thread Ethan Duni
You should have a search for papers by Jean Laroche and Mark Dolson, such
as "About This Phasiness Business" for some good information on phase
vocoder processing. They address time scale modification mostly in that
specific paper, but many of the insights apply in general, and you will
find references to other applications.

Ethan

On Sun, Oct 28, 2018 at 4:45 PM gm  wrote:

>
>
> Am 28.10.2018 um 22:28 schrieb gm:
> > I am thinking now that resetting the phase to the original when the
> > amplitude exceeds the previous value
> > is probably wrong too, because the phase should be different when
> > shifted to a different bin
> > if you want to preserve the waveshape
> > I am not sure about this, but when you think of a minimum phase
> > waveshape the phases are all different, depending on the frequency.
>
> This whole phase resetting I had is nonsens:
>
> consider for instance speech, the partials meander from bin to bin
> from one frame to the next, so you always have the case that the amplitude
> is larger than it was in the previous frame, but that is not a transient
> where you would want to reset phase.
>
> On the other hand it sounds tinny when the phases are always running freely
> and transients don't have the waveshapes they should have when you
> stretch time and shift pitch.
>
> So you would need partial tracking or something to that effect I assume.
>
> Also the formant correction I described worked pretty well in the "TDM"
> version but
> not well in the phase vocoder version, I dont know why, cause I am doing
> it the same way.
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Resampling

2018-10-06 Thread Ethan Duni
Alex, it sounds like you are confusing algorithmic latency with framing 
latency. At each frame, you take in 10ms (or whatever) of input, and then 
provide 10ms of output. This (plus processing time to generate the output) is 
the IO latency of the process. But the algorithm itself can add additional 
signal delay. 

Consider a simple delay process, wherein the algorithm maintains an internal 
delay buffer. At each 10ms frame, it reads the new input into the end of the 
buffer, and writes out 10ms of output from the front of the buffer. So the IO 
latency is 10ms, but the algorithmic latency is determined by the length of the 
delay buffer.

So if your WSOLA process requires more memory than the IO buffer, then it 
should maintain a longer internal memory. Then for each frame, you first digest 
the input into the buffer, then perform whatever processing to get 1 frame of 
output, and then save whatever state variables you need for the next frame. 
This internal buffer will add signal latency, but not IO latency.
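
Something like this toy sketch (Python; the 10 ms / 48 kHz numbers and the class name are placeholders for illustration, not anything from a real WSOLA implementation):

import numpy as np
from collections import deque

FRAME = 480              # 10 ms at 48 kHz
EXTRA_DELAY = 4 * FRAME  # 40 ms of internal (algorithmic) delay

class FrameProcessor:
    """Takes one frame in, gives one frame out; keeps longer memory inside."""
    def __init__(self, delay_samples):
        self.buf = deque([0.0] * delay_samples)  # internal state, not IO

    def process(self, frame_in):
        out = np.empty(len(frame_in))
        for i, x in enumerate(frame_in):
            self.buf.append(x)           # digest the new input
            out[i] = self.buf.popleft()  # emit delayed output
        return out

proc = FrameProcessor(EXTRA_DELAY)
for _ in range(10):                      # one call per 10 ms block of IO
    y = proc.process(np.random.randn(FRAME))
# IO latency stays one frame; the signal simply comes out 40 ms late.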

Ethan

> On Oct 6, 2018, at 10:25 AM, gm  wrote:
> 
> 
> 
>> Am 06.10.2018 um 19:07 schrieb Alex Dashevski:
>> What do you mean "replay" ? duplicate buffer ?
> 
> I mean to just read the buffer for the output.
> So in my example you play back 10 ms audio (windowed of course), then you 
> move your read pointer and play
> that audio back again, and so on, untill the next "slice" or "grain" or 
> "snippet" of audio is played back.
> 
>> I have the opposite problem. My original buffer size doesn't contain full 
>> cycle of the pitch.
> 
> then your pitch is too low or your buffer too small - there is no way around 
> this, it's physics / causality.
> You can decrease the number of samples of the buffer with a lower sample rate,
> but not the duration/latency required.
> 
>> How can I succeed to shift pitch ?
> 
> You wrote you can have a latency of < 100ms, but 100ms should be sufficient 
> for this.
> 
> 
> 
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
> 
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp



Re: [music-dsp] Antialiased OSC

2018-08-06 Thread Ethan Duni
rbj wrote:
>i, personally, would rather see a consistent method used throughout the
MIDI keyboard range

If you squint at it hard enough, you can maybe convince yourself that the
naive sawtooth generator is just a memory optimization for low-frequency
wavetable entries. I mean, it does a perfect job at DC right? :]



On Sun, Aug 5, 2018 at 4:27 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Antialiased OSC
> From: "Nigel Redmon" 
> Date: Sun, August 5, 2018 1:30 pm
> To: music-dsp@music.columbia.edu
> --
>
> > Yes, that’s a good way, not only for LFO but for that rare time you want
> to sweep down into the nether regions to show off.
>
>
> i, personally, would rather see a consistent method used throughout the
> MIDI keyboard range; high notes or low.  it's hard to gracefully transition
> from one method to a totally different method while the note sweeps.  like
> what if portamento is turned on?  the only way to clicklessly jump from
> wavetable to a "naive" sawtooth would be to crossfade.  but crossfading to
> a wavetable richer in harmonics is already built in.  and what if the
> "classic" waveform wasn't a saw but something else?  more general?
>
>
> > I think a lot of people don’t consider that the error of a “naive”
> oscillator becomes increasingly smaller for lower frequencies. Of course,
> it’s waveform specific, so that’s why I suggested bigger tables. (Side
> comment: If you get big enough tables, you could choose to skip linear
> interpolation altogether—at constant table size, the higher frequency
> octave/whatever tables, where it matters more, will be progressively more
> oversampled anyway.)
>
> well, Duane Wise and i visited this drop-sample vs. linear vs. various
> different cubic splines (Lagrange, Hermite...) a couple decades ago.  for
> really high quality audio (not the same as an electronic musical
> instrument), i had been able to show that, for 120 dB S/N, 512x
> oversampling is sufficient for linear interpolation but 512K is what is
> needed for drop sample.  even relaxing those standards, choosing to forgo
> linear interpolation for drop-sample "interpolation" might require bigger
> wavetables than you might wanna pay for.  for the general wavetable synth
> (or NCO or DDS or whatever you wanna call this LUT thing, including just
> sample playback) i would never recommend interpolation cruder than linear.
> Nigel, i remember your code didn't require big tables and you could have
> each wavetable a different size (i think you had the accumulated phase be a
> float between 0 and 1 and that was scaled to the wavetable size, right?)
> but then that might mean you have to do better interpolation than linear,
> if you want it clean.
>
>
>
> > Funny thing I found in writing the wavetable articles. One soft synth
> developer dismissed the whole idea of wavetables (in favor of minBLEPs,
> etc.). When I pointed out that wavetables allow any waveform, he said the
> other methods did too. I questioned that assertion by giving an example of
> a wavetable with a few arbitrary harmonics. He countered that it wasn’t a
> waveform. I guess some people only consider the basic synth waves as
> “waveforms”. :-D
> >
>
> i've had arguments like this with other Kurzweil people while i worked
> there a decade ago (still such a waste when you consider how good and how
> much work they put into their sample-playback, looping, and interpolation
> hardware, only a small modification was needed to make it into a decent
> wavetable synth with morphing).
>
> for me, a "waveform" is any quasi-periodic function.  A note from any
> decently harmonic instrument; piano, fiddle, a plucked guitar, oboe,
> trumpet, flute, all of those can be done with wavetable synthesis (and
> most, maybe all, of them can be limited to 127 harmonics allowing archived
> wavetables to be as small as 256).
>
> these are the two necessary ingredients to wavetable synthesis:
> quasi-periodic note (that means it can be represented as a Fourier series
> with slowly-changing Fourier coefficients) and bandlimited.  if it's
> quasi-periodic and bandlimited it can be done with wavetable synthesis.  to
> me, for someone to argue against that, means to me that they are arguing
> against Fourier and Shannon.
>
> there is a straight-forward way of pitch tracking the sampled note from
> attack to release, and from that slowly-changing period information, there
> is a straight-forward way to sample it to 256 points per cycle and
> converting each adjacent cycle into a wavetable.  that's a lotta redundant
> data and most of the wavetables (nearly all of them) can be culled with the
> assumption that the wavetables surviving the culling process will be
> linearly cross-faded from one to the next.
>
> and if several notes (say up and down the keyboard) are samp

Re: [music-dsp] Playing a Square Wave

2018-06-13 Thread Ethan Duni
>The simple question that forced itself on me often, as I"m sure some can
relate,
>after having been used to all those early signal sources including a host
of analog
>synthesizers I had in the past, and a lot of music in various analog forms
from standard
>pop to G. Duke and Rose Royce to mention a few of my favorites from an
earlier era,
>is how can it be that such a simple wave like the square wave, just two
signal levels
>with a near instantaneous jump between them, can be so hard to make
digital, if you
> listen with a HiFi system and some normal musical signal discernment ?

I think this is less of a DAC issue than the various quirks of the analog
VCO designs. Like the classic function-generator style ones that start with
a sawtooth oscillator, then use comparators to generate a pulse wave, a
full-wave rectifier to get a triangle output, and wave shaping on the triangle
to generate a sinusoidal output. Do VA people simulate these circuits
explicitly? Most of what I recall has been BLIT based stuff, or for digital
synthesis of specific waveform types the wavetable approach described by
RBJ is pretty straightforward and compelling.

For those who don't recall from undergrad EE lab:

The sawtooth oscillator is, basically, a variable current source feeding a
capacitor, which dumps when its voltage reaches a constant (say 1V). So you
get a linear rise (constant current feeding constant capacitance) and
control the frequency by altering the input current. The capacitor has to
drain through a non-zero resistor, so there is some finite discharge time
to reset, and also temperature compensation is required on the current
source, etc. You probably also put a DC-blocking cap at the output, so
there is some fixed highpass characteristic built in there (but maybe not
if this is an LFO). You can implement hard sync with another oscillator by
dumping the cap whenever the master oscillator hits some level.

To get the pulse output, you run this through a comparator circuit, with
the comparison voltage determining the pulse width (0V for square wave,
assuming a +/-1V sawtooth).

To get a triangle wave output, you full-wave rectify the sawtooth (and then
need to add another DC blocker)

To get a sinusoidal output, you use some diodes or other nonlinear
components to do an approximate instantaneous wave shaping on the triangle.

The fun bit in a modular synth is that all these synchronized outputs are
available simultaneously, and can be run through different
filter/modulation paths downstream, etc.
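
A naive (and heavily aliased) numerical sketch of that chain, just to make the signal flow concrete - the frequencies and the sine shaper are arbitrary stand-ins, not a circuit model:

import numpy as np

fs, f0 = 48000, 110.0
n = np.arange(int(0.05 * fs))

saw = 2.0 * ((f0 * n / fs) % 1.0) - 1.0    # ramp core: current into a cap, dumped at the top
square = np.where(saw >= 0.0, 1.0, -1.0)   # comparator at 0 V -> 50% duty cycle
pulse = np.where(saw >= 0.5, 1.0, -1.0)    # move the comparison voltage -> PWM
tri = 2.0 * np.abs(saw) - 1.0              # full-wave rectify (plus DC block)
sine = np.sin(0.5 * np.pi * tri)           # crude stand-in for the diode shaper

# All four outputs stay synchronized, as in the analog function generator.
# (This naive model aliases badly; the BLIT/wavetable methods mentioned above avoid that.)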

E



On Wed, Jun 13, 2018 at 9:06 AM, Theo Verelst  wrote:

> Neil Goldman wrote:
>
>>  > such a simple wave like the square wave, just two signal levels with a
>> near instantaneous
>> jump between them
>>
>> I think I disagree with this definition of a square wave
>>
>> Even assuming a magically perfect and noiseless analog square wave
>> generator, at the very
>> least your speaker cones can't teleport between two positions.
>>
>
> Sure, there's a lot of stuff not just happening with speakers, but with
> acoustics of
> any kind as well, always very audible.
>
> For me when I started building musical circuits of the kinds I described,
> in high school,
> there were perfect enough squares available: electronics "function
> generators" would offer near perfect square waves with rise times orders of
> magnitude faster than an audio circuit would normally require, and crystal
> oscillators would keep jitter pretty low for
> creating square tones.
>
> It's true the imperfections I mean are the "digitally created" ones, which
> includes the exact filtering in the DAC, the linearity of it, which filter
> components have been (and not been) used like for DC coupling and analogue
> (mean IIR automatically) anti-aliasing filtering. The converter chip I use
> often at the moment (a PCM5102A DAC with up to 50MHz very low jitter and
> ground separated clock) is analog filtered with a single high quality
> component analog filter, DC coupled with a very high quality OPA627 OpAmp,
> with no subsequent electrolytic capacitors in the signal path. On my big
> monitoring or studio headphones it should be able to be very accurate.
>
> It's hard to explain, but it's possible to take the given filtering inside
> a DAC (to begin with the standard "oversampling" digital one) and try to
> invert it to the extend that the streaming filter inversion allows you to
> control the signal at oversampling clock speed to some level of accuracy.
> This might cost headroom: some inversions might take a lot of amplitude
> which is lost in the out coming signal, but nevertheless there are a few
> ways to try to do this.
>
> So taking a square wave with a given limitation of zero harmonics above a
> certain frequency (like can be created with additive synthesis)
> approximating the "perfect reconstruction" at the output of the DAC
> oversampling filter output is a real possibility. Unfortunately inversion
> of such a FIR or IIR filter with near bit-accuracy
> at 

Re: [music-dsp] Clock drift and compensation

2018-03-09 Thread Ethan Duni
Hi ben

You don't need to evaluate the asin() - it's piecewise monotonic and
symmetrical, so you can get the same comparison directly in the signal
domain.

Specifically, notice that x(n) = sin(2*pi*(1/4)*n) = [...0,1,0,-1,...]. So
you get the same result just by checking ( abs( x[n] - x[n-1] ) == 1 )
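
In numpy terms (a toy sketch; the glitch injected at sample 25 is just for illustration):

import numpy as np

n = np.arange(40)
x = np.round(np.sin(2 * np.pi * 0.25 * n))  # ..., 0, 1, 0, -1, ... (round kills ~1e-16 noise)
x[25] = x[24]                               # simulate a dropped/repeated sample

bad = np.abs(np.diff(x)) != 1               # |x[n] - x[n-1]| should always be 1
print(np.nonzero(bad)[0])                   # -> flags the glitch around n = 24..25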

Ethan

On Fri, Mar 9, 2018 at 10:58 AM, Benny Alexandar 
wrote:

> Hi GM,
> Instead of finding Hilbert transform, I tried with just finding the angle
> between samples
> of a fixed frequency sine wave.
> I tried to create a sine wave of  frequency x[n] = sin ( 2 * pi * 1/4 *
> n), and tried calculating the angle between samples,
> it should be 90 degree. This also can be used to detect any discontinuity
> in the signal.
> Below is the octave code which I tried.
>
> One cycle of sine wave consists of 4 samples, two +ve and two -ve.
>
> % generate the sine wave of frequency 1/4
> for i = 1 : 20
>x(i) = sin( 2 * pi * ( 1 / 4) * i);
> end
>
> % find the angle between samples in degrees.
>  for i = 1:20
> ang(i)  =  asin( x(i) ) * (180 / pi);
>  end
>
> % find the absolute difference between angles
> for i = 1:19   % only 19 differences exist for 20 angles
>  diff(i) =  abs( ang( i + 1 ) - ang( i ));
> end
>
> % check for discontinuity
> for i = 1:19
> if (diff(i) != 90)
>   disp("discontinuity")
> endif
> end
>
>
> Please verify this logic is correct for discontinuity check.
>
> -ben
>
>
>
> --
> *From:* music-dsp-boun...@music.columbia.edu  columbia.edu> on behalf of gm 
> *Sent:* Monday, January 29, 2018 1:29 AM
>
> *To:* music-dsp@music.columbia.edu
> *Subject:* Re: [music-dsp] Clock drift and compensation
>
>
> diff gives you the phase step per sample,
> basically the frequency.
>
> However the phase will jump back to zero periodically when the phase
> exceeds 360°
> (when it wraps around) in this case diff will get you a wrong result.
>
> So you need to "unwrap" the phase or the phase difference, for example:
>
>
> diff = phase_new - phase_old
> if phase_old > Pi and phase_new < Pi then diff += 2Pi
>
> or similar.
>
> Am 28.01.2018 um 17:19 schrieb Benny Alexandar:
>
> Hi GM,
>
> >> HT -> Atan2 -> differenciate -> unwrap
> Could you please explain how to find the drift using HT,
>
> HT -> gives real(I) & imaginary (Q) components of real signal
> Atan2 -> the phase of an I Q signal
> diff-> gives what ?
> unwrap ?
>
> -ben
>
>
> --
> *From:* music-dsp-boun...@music.columbia.edu  columbia.edu>  on behalf of gm
>  
> *Sent:* Saturday, January 27, 2018 5:20 PM
> *To:* music-dsp@music.columbia.edu
> *Subject:* Re: [music-dsp] Clock drift and compensation
>
>
> I don't understand your project at all so not sure if this is helpful,
> probably not,
> but you can calculate the drift or instantanous frequency of a sine wave
> on a per sample basis
> using a Hilbert transform
> HT -> Atan2 -> differenciate -> unwrap
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Sampling theory "best" explanation

2017-09-11 Thread Ethan Duni
Thanks for the reference and explanation Robert. Now that you jog my memory
I realize you covered this before, some time back. If I can find a few
minutes I will try to work out a version including the images from the
resampling. The trade-off between resampling and interpolation is not
entirely clear to me in this analysis.

So linearly interpolating two adjacent FIR polyphases is equivalent to a
single FIR with interpolated coefficients. I.e., we're using an affine
approximation of the underlying resampling kernel. And at high resampling
ratios this will be a good approximation. IIRC this has been covered on the
list before as well. However, it costs twice as much CPU as running a
single phase - so isn't the fair comparison to an FIR of double the order
(and so, double the resampling ratio)?
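
For what it's worth, the first claim is trivial to sanity-check numerically (a throwaway numpy sketch, with arbitrary random stand-ins for the two adjacent phases and the fraction):

import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(256)                                # test input
h0, h1 = rng.standard_normal(16), rng.standard_normal(16)   # two adjacent polyphases
a = 0.37                                                    # fractional position

y_blend = (1 - a) * np.convolve(x, h0) + a * np.convolve(x, h1)
y_lerped_taps = np.convolve(x, (1 - a) * h0 + a * h1)
print(np.allclose(y_blend, y_lerped_taps))   # True, by linearity of convolution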

Ethan D

On Wed, Sep 6, 2017 at 9:57 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Sampling theory "best" explanation
> From: "Ethan Duni" 
> Date: Wed, September 6, 2017 4:49 pm
> To: "robert bristow-johnson" 
> "A discussion list for music-related DSP" 
> --
>
> > rbj wrote:
> >>what do you mean be "non-ideal"? that it's not an ideal brick wall LPF?
> > it's still LTI if it's some other filter **unless** you're meaning that
> > the possible aliasing.
> >
> > Yes, that is exactly what I am talking about. LTI systems cannot produce
> > aliasing.
> >
> > Without an ideal bandlimiting filter, resampling doesn't fulfill either
> > definition of time invariance. Not the classic one in terms of sample
> > shifts, and not the "common real time" one suggested for multirate cases.
> >
> > It's easy to demonstrate this by constructing a counterexample. Consider
> > downsampling by 2, and an input signal that contains only a single
> sinusoid
> > with frequency above half the (input) Nyquist rate, and at a frequency
> that
> > the non-ideal bandlimiting filter fails to completely suppress. To be
> LTI,
> > shifting the input by one sample should result in a half-sample shift in
> > the output (i.e., bandlimited interpolation). But this doesn't happen,
> due
> > to aliasing. This becomes obvious if you push the frequency of the input
> > sinusoid close to the (input) Nyquist frequency - instead of a
> half-sample
> > shift in the output, you get negation!
> >
> >>we draw the little arrows with different heights and we draw the impulses
> > scaled with samples of negative value as arrows pointing down
> >
> > But that's just a graph of the discrete time sequence.
>
> well, even if the *information* necessary is the same, a graph of x[n]
> need only be little dots, one per sample.  or discrete lines (without
> arrowheads).
>
> but the use of the symbol of an arrow for an impulse is a symbol of
> something difficult to graph for a continuous-time function (not to be
> confused with a continuous function).  if the impulse heights and
> directions (up or down) are analog to the sample value magnitude and
> polarity, those graphing object suffice to depict these *hypothetical*
> impulses in the continuous-time domain.
>
>
> >
> >>you could do SRC without linear interpolation (ZOH a.k.a. "drop-sample")
> > but you would need a much larger table
> >>(if i recall correctly, 1024 times larger, so it would be 512Kx
> > oversampling) to get the same S/N. if you use 512x
> >>oversampling and ZOH interpolation, you'll only get about 55 dB S/N for
> an
> > arbitrary conversion ratio.
> >
> > Interesting stuff, it didn't occur to me that the SNR would be that low.
> > How do you estimate SNR for a particular configuration (i.e., target
> > resampling ratio, fixed upsampling factor, etc)? Is that for ideal 512x
> > resampling, or does it include the effects of a particular filter design
> > choice?
>
> this is what Duane Wise and i ( https://www.researchgate.net/
> publication/266675823_Performance_of_Low-Order_
> Polynomial_Interpolators_in_the_Presence_of_Oversampled_Input ) were
> trying to show and Olli Niemitalo (in his pink elephant paper
> http://yehar.com/blog/wp-content/uploads/2009/08/deip.pdf ).
>
> so let's say that you're oversampling by a factor of R.  if the sample
> rate is 96 kHz and the audio is limited to 20 kHz, the oversampling ratio
> is 2.4 . but now imagine it's *highly* oversampled (which we can get from
> po

Re: [music-dsp] Sampling theory "best" explanation

2017-09-06 Thread Ethan Duni
Okay, no big deal. It's easy to come off the wrong way in complicated, fast
moving email threads.

Ethan D

On Wed, Sep 6, 2017 at 6:37 PM, Nigel Redmon  wrote:

> Ethan, I wasn't taking a swipe at you, by any stretch. In fact, I wasn't
> even addressing your ADC comment. It was actually about things like the
> idea of making DACs with impulses. As I noted, we don't because there are
> ways that are easier and accomplish the same goal, but it is feasible. I've
> had people say in the past to me it's absurd, and I've assured them that a
> reasonable and practical approximation of it would indeed produce a
> reasonable approximation of a decent DAC. That's a pretty relative
> statement because the quality depends on how hard you want try, but I
> subsequently saw that Julius Smith make the same assertion on his website.
>
> Sorry you misinterpreted it.
>
> On Sep 7, 2017, at 5:34 AM, Ethan Duni  wrote:
>
> Nigel Redmon wrote:
> >As an electrical engineer, we find great humor when people say we can't
> do impulses.
>
> I'm the electrical engineer who pointed out that impulses don't exist and
> are not found in actual ADCs. If you have some issue with anything I've
> posted, I'll thank you to address it to me directly and respectfully.
>
> Taking oblique swipes at fellow list members, impugning their standing as
> engineers, etc. is poisonous to the list community.
>
> >What constitutes an impulse depends on the context—nano seconds,
> milliseconds...
>
> If it has non-zero pulse width, it isn't an impulse in the relevant sense:
> multiplying by such a function would not model the sampling process. You
> would need to introduce additional operations to describe how this finite
> region of non-zero signal around each sample time is translated into a
> unique sample value.
>
> >For ADC, we effectively measure an instantaneous voltage and store it as
> an impulse.
>
> I don't know of any ADC design that stores voltages as "impulse" signals,
> even approximately. The measured voltage is represented through modulation
> schemes such as PDM, PWM, PCM, etc.
>
> Impulse trains are a convenient pedagogical model for understanding
> aliasing, reconstruction filters, etc., but there is a considerable gap
> between that model and what actually goes on in a real ADC.
>
> >If you can make a downsampler that has no audible aliasing (and you
> can), I think the process has to be called linear, even if you can make a
> poor quality one that isn't.
>
> I'm not sure how you got onto linearity, but the subject is
> time-invariance.
>
> I have no objection to calling resamplers "approximately time-invariant"
> or "asymptotically time-invariant" or somesuch, in the sense that you can
> get as close to time-invariant behavior as you like by throwing resources
> at the bandlimiting filter. This is qualitatively different from other
> archetypical examples of time-variant systems (modulation, envelopes, etc.)
> where explicitly time-variant behavior is the goal, even in the ideal case.
> Moreover, I agree that this distinction is important and worth
> highlighting.
>
> However, there needs to be *some* qualifier - the bare statement
> "(re)sampling is LTI" is incorrect and misleading. It obscures that fact
> that addressing the aliasing caused by the system's time-variance is the
> principle concern in the design of resamplers. The fact that a given design
> does a good job is great and all - but that only happens because the
> designer recognizes that the system is time-invariant, and dedicates
> resources to mitigating the impact of aliasing.
>
> >If you get too picky and call something non-linear, when for practical
> decision-making purposes it clearly is, it seem you've defeated the purpose.
>
> If you insist on labelling all resamplers as "time-invariant," without any
> further qualification, then it will mess up practical decision making.
> There will be no reason to consider the effects of aliasing - LTI systems
> cannot produce aliasing - when making practical system design decisions.
> You only end up with approximately-LTI behavior because you recognize at
> the outset that the system is *not* LTI, and make appropriate design
> decisions to limit the impact of aliasing. So this is putting the cart
> before the horse.
>
> The appropriate way to deal with this is not to get hung up on the label
> "LTI" (or any specialized variations thereof), but to simply quote the
> actual performance of the system (SNR, spurious-free dynamic range, etc.).
> In that way, everything is clear to the designers and clients

Re: [music-dsp] Sampling theory "best" explanation

2017-09-06 Thread Ethan Duni
because that defined what
> we're dealing with when we do "music-dsp". But as far as DAC not using
> impulses, it's only because the shortcut is trivial. Like I said, audio
> sample rates are slow, not that hard to do a good enough job for
> demonstration with "close enough" impulses.
>
> Don't anyone get mad at me, please. Just sitting on a plane at LAX at 1AM,
> waiting to fly 14 hours...on the first leg...amusing myself before going
> offline for a while
>
> ;-)
>
>
> On Sep 4, 2017, at 10:07 PM, Ethan Duni  wrote:
>
> rbj wrote:
>
> >1. resampling is LTI **if**, for the TI portion, one appropriately scales
> time.
>
> Have we established that this holds for non-ideal resampling? It doesn't
> seem like it does, in general.
>
> If not, then the phrase "resampling is LTI" - without some kind of "ideal"
> qualifier - seems misleading. If it's LTI then what are all these aliases
> doing in my outputs?
>
> >no one *really* zero-stuffs samples into the stream
>
> Nobody does it *explicitly* but it seems misleading to say we don't
> *really* do it. We employ optimizations to handle this part implicitly, but
> the starting point for that is exactly to *really* stuff zeroes into the
> stream. This is true in the same sense that the FFT *really* computes the
> DFT.
>
> Contrast that with pedagogical abstractions like the impulse train model
> of sampling. Nobody has ever *really* sampled a signal this way, because
> impulses do not exist in reality.
>
> >7. and i disagree with the statement: "The other big pedagogical problem
> with impulse train representation is that it can't be graphed in a >useful
> way."  graphing functions is an abstract representation to begin with, so
> we can use these abstract vertical arrows to represent >impulses.
>
> That is my statement, so I'll clarify: you can graph an impulse train with
> a particular period. But how do you graph the product of the impulse train
> with a continuous-time function (i.e., the sampling operation)? Draw a
> graph of a generic impulse train, with the scaling of each impulse written
> out next to it? That's not useful. That's just a generic impulse train
> graph and a print-out of the sequence values. The useful graph here is of
> the sample sequence itself.
>
> >if linear interpolation is done between the subsamples, i have found that
> upsampling by a factor of 512 followed by linear interpolation >between
> those teeny-little upsampled samples, that this will result in 120 dB S/N
>
> What is the audio use case wherein 512x upsampling is not already
> sufficient time resolution? I'm curious why you'd need additional
> interpolation at that point.
>
> Ethan D
>
> On Mon, Sep 4, 2017 at 1:49 PM, Nigel Redmon 
> wrote:
>
>> The fact that 5,17,-12,2 at sample rate 1X and
>>> 5,0,0,0,17,0,0,0,-12,0,0,0,2,0,0,0 at sample rate 4X are identical is
>>> obvious only for samples representing impulses.
>>
>>
>> I agree that the zero-stuff-then-lowpass technique is much more obvious
>> when we you consider the impulse train corresponding to the signal. But I
>> find it peculiar to assert that these two sequences are "identical." If
>> they're identical in any meaningful sense, why don't we just stop there and
>> call it a resampler? The reason is that what we actually care about in the
>> end is what the corresponding bandlimited functions look like, and
>> zero-stuffing is far from being an identity operation in this domain. We're
>> instead done constructing a resampler when we end up with an operation that
>> preserves the bandlimited function -- or preserves as much of it as
>> possible in the case of downsampling.
>>
>>
>> Well, when I say they are identical, the spectrum is identical. In other
>> words, they represent the same signal. The fact that it doesn’t make it
>> a resampler is a different thing—an additional constraint. We only have
>> changed the data rate (not the signal) when we insert zeros. Most of the
>> time, we want to also change the signal (by getting rid of the aliases,
>> that were above half the sample rate and now below). That’s why my article
>> made a big deal  (point #3) of pointing out that the digital samples
>> represent not the original analog signal, but a modulated version of it.
>>
>> Of course, we differ only in semantics, just making mine clear. When I
>> say they represent the same signal, I don’t just mean the portion of the
>> spectrum in the audio band or below half the sample rate—I mean the whole
>

Re: [music-dsp] Sampling theory "best" explanation

2017-09-06 Thread Ethan Duni
rbj wrote:
>what do you mean be "non-ideal"?  that it's not an ideal brick wall LPF?
 it's still LTI if it's some other filter **unless** you're meaning that
the possible aliasing.

Yes, that is exactly what I am talking about. LTI systems cannot produce
aliasing.

Without an ideal bandlimiting filter, resampling doesn't fulfill either
definition of time invariance. Not the classic one in terms of sample
shifts, and not the "common real time" one suggested for multirate cases.

It's easy to demonstrate this by constructing a counterexample. Consider
downsampling by 2, and an input signal that contains only a single sinusoid
with frequency above half the (input) Nyquist rate, and at a frequency that
the non-ideal bandlimiting filter fails to completely suppress. To be LTI,
shifting the input by one sample should result in a half-sample shift in
the output (i.e., bandlimited interpolation). But this doesn't happen, due
to aliasing. This becomes obvious if you push the frequency of the input
sinusoid close to the (input) Nyquist frequency - instead of a half-sample
shift in the output, you get negation!
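
Numerically (a small sketch of the extreme case where the bandlimiting filter does nothing at all; the 0.95*pi input frequency is an arbitrary choice near Nyquist):

import numpy as np

w = 0.95 * np.pi                      # input frequency, above the output Nyquist
m = np.arange(200)                    # output sample index

y_ref = np.cos(w * (2 * m))           # decimate-by-2 of cos(w*n)
y_shifted = np.cos(w * (2 * m - 1))   # decimate-by-2 of the 1-sample-delayed input

# An LTI system would give the aliased tone delayed by half an output sample:
alias = 2 * np.pi - 2 * w             # = 0.1*pi, the folded frequency
y_lti_prediction = np.cos(alias * (m - 0.5))

print(np.allclose(y_shifted, -y_lti_prediction))  # True: negated, not delayed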

>we draw the little arrows with different heights and we draw the impulses
scaled with samples of negative value as arrows pointing down

But that's just a graph of the discrete time sequence.

>you could do SRC without linear interpolation (ZOH a.k.a. "drop-sample")
but you would need a much larger table
>(if i recall correctly, 1024 times larger, so it would be 512Kx
oversampling) to get the same S/N.  if you use 512x
>oversampling and ZOH interpolation, you'll only get about 55 dB S/N for an
arbitrary conversion ratio.

Interesting stuff, it didn't occur to me that the SNR would be that low.
How do you estimate SNR for a particular configuration (i.e., target
resampling ratio, fixed upsampling factor, etc)? Is that for ideal 512x
resampling, or does it include the effects of a particular filter design
choice?

Ethan D




On Tue, Sep 5, 2017 at 9:44 AM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Sampling theory "best" explanation
> From: "Ethan Duni" 
> Date: Tue, September 5, 2017 1:07 am
> To: "A discussion list for music-related DSP" <
> music-dsp@music.columbia.edu>
> --
>
> > rbj wrote:
> >
> >>1. resampling is LTI **if**, for the TI portion, one appropriately scales
> > time.
> >
> > Have we established that this holds for non-ideal resampling? It doesn't
> > seem like it does, in general.
>
> what do you mean be "non-ideal"?  that it's not an ideal brick wall LPF?
>  it's still LTI if it's some other filter **unless** you're meaning that
> the possible aliasing.
>
>
> > If not, then the phrase "resampling is LTI" - without some kind of
> "ideal"
> > qualifier - seems misleading. If it's LTI then what are all these aliases
> > doing in my outputs?
> >
> >>no one *really* zero-stuffs samples into the stream
> >
> > Nobody does it *explicitly*
>
> people using an IIR filter for reconstruction might be putting in the
> zeros explicitly.
>
> > but it seems misleading to say we don't
> > *really* do it. We employ optimizations to handle this part implicitly,
> but
> > the starting point for that is exactly to *really* stuff zeroes into the
> > stream. This is true in the same sense that the FFT *really* computes the
> > DFT.
> >
> > Contrast that with pedagogical abstractions like the impulse train model
> of
> > sampling. Nobody has ever *really* sampled a signal this way, because
> > impulses do not exist in reality.
>
> it's the only direct way i can think of to demonstrate that we are
> discarding all of the information between samples, yet keeping the
> information at the sampling instances. it's what dirac impulses are for the
> "sampling" or "sifting" property (but the math guys are unhappy if we don't
> immediately surround that with an integral, they don't like naked dirac
> impulse functions).
>
>
> >
> >>7. and i disagree with the statement: "The other big pedagogical problem
> > with impulse train representation is that it can't be graphed in a
> >useful
> > way." graphing functions is an abstract representation to begin with, so
> > we can use these abstract vertical arrows to represent >impulses.
> >
> > That is my statement, so I'll clarify: you 

Re: [music-dsp] Sampling theory "best" explanation

2017-09-04 Thread Ethan Duni
imited functions of a real number.
>
> None of these are exactly identical, as sequences of numbers are not the
> same sort of beast as functions of a real number. But obviously there is a
> one-to-one correspondence between objects in classes 1 and 2. Less
> obviously -- but more interestingly and importantly! -- there is a
> one-to-one correspondence between objects in classes 1 and 3. So any
> operation on any of these three classes will have a corresponding operation
> in the other two.
>
> This is what the math tells us. It does not tell us that any of these
> classes are identical to each other or that thinking of one correspondence
> is more correct than the other.
>
> The fact that 5,17,-12,2 at sample rate 1X and
>> 5,0,0,0,17,0,0,0,-12,0,0,0,2,0,0,0 at sample rate 4X are identical is
>> obvious only for samples representing impulses.
>
>
> I agree that the zero-stuff-then-lowpass technique is much more obvious
> when we you consider the impulse train corresponding to the signal. But I
> find it peculiar to assert that these two sequences are "identical." If
> they're identical in any meaningful sense, why don't we just stop there and
> call it a resampler? The reason is that what we actually care about in the
> end is what the corresponding bandlimited functions look like, and
> zero-stuffing is far from being an identity operation in this domain. We're
> instead done constructing a resampler when we end up with an operation that
> preserves the bandlimited function -- or preserves as much of it as
> possible in the case of downsampling.
>
> This is why it is more natural for me to think of the discrete signal and
> the bandlimited function as being more closely identified. The impulse
> train is a related mathematical entity which is useful to pull out of the
> toolbox on some occasions.
>
> I'm not really interested in arguing that the way I think about things is
> superior -- as I've stated above I think the math is neutral on this point,
> and what mental model works best is different from person to person. It can
> be a bit like arguing what shoe size is best. But I do think it's
> counterproductive to discourage people from thinking about the discrete
> signal <-> bandlimited function correspondence. I think real insight and
> intuition in DSP is built up by comparing what basic operations look like
> in each of these different universes (as well as in their frequency domain
> equivalents).
>
> -Ethan
>
>
>
> On Mon, Sep 4, 2017 at 2:14 PM, Ethan Fenn  wrote:
>
>> Time variance is a bit subtle in the multi-rate context. For integer
>>> downsampling, as you point out, it might make more sense to replace the
>>> classic n-shift-in/n-shift-out definition of time invariance with one that
>>> works in terms of the common real time represented by the different
>>> sampling rates. So an integer shift into a 2x downsampler should be a
>>> half-sample shift in the output. In ideal terms (brickwall filters/sinc
>>> functions) this all clearly works out.
>>
>>
>> I think the thing to say about integer downsampling with respect to time
>>> variance is that it's that partitions the space of input shifts, where if
>>> you restrict yourself to shifts from a given partition you will see time
>>> invariance (in a certain sense).
>>
>>
>> So this to me is a good example of how thinking of discrete time signals
>> as representing bandlimited functions is useful. Because if we're thinking
>> of things this way, we can simply define an operation in the space of
>> discrete signals as being LTI iff the corresponding operation in the space
>> of bandlimited functions is LTI. This generalizes the usual definition, and
>> your partitioned-shift concept, in exactly the way we want, and we find
>> that ideal resamplers (of any ratio, integer/rational/irrational) are in
>> fact LTI as our intuition suggests they should be.
>>
>> -Ethan F
>>
>>
>>
>> On Mon, Sep 4, 2017 at 1:00 AM, Ethan Duni  wrote:
>>
>>> Hmm this is quite a few discussions of LTI with respect to resampling
>>> that have gone badly on the list over the years...
>>>
>>> Time variance is a bit subtle in the multi-rate context. For integer
>>> downsampling, as you point out, it might make more sense to replace the
>>> classic n-shift-in/n-shift-out definition of time invariance with one that
>>> works in terms of the common real time represented by the different
>>> sampling rates. So an integer shift into a 2x downsampler should be a
>>> half-sample shift in the o

Re: [music-dsp] Sampling theory "best" explanation

2017-09-03 Thread Ethan Duni
Hmm this is quite a few discussions of LTI with respect to resampling that
have gone badly on the list over the years...

Time variance is a bit subtle in the multi-rate context. For integer
downsampling, as you point out, it might make more sense to replace the
classic n-shift-in/n-shift-out definition of time invariance with one that
works in terms of the common real time represented by the different
sampling rates. So an integer shift into a 2x downsampler should be a
half-sample shift in the output. In ideal terms (brickwall filters/sinc
functions) this all clearly works out.

On the other hand, I hesitate to say "resampling is LTI" because that seems
to imply that resampling doesn't produce aliasing. And of course aliasing
is a central concern in the design of resamplers. So I can see how this
rubs people the wrong way.
It's not clear to me that a realizable downsampler (i.e., with non-zero
aliasing) passes the "real time" definition of LTI?

I think the thing to say about integer downsampling with respect to time
variance is that it partitions the space of input shifts, where if
you restrict yourself to shifts from a given partition you will see time
invariance (in a certain sense).
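
Concretely (a throwaway numpy sketch; the filter h is an arbitrary random FIR, which is the point - no ideal filter is needed for this part):

import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(512)
h = rng.standard_normal(31)       # any anti-alias filter, ideal or not

def decim2(sig):                  # filter, then keep every other sample
    return np.convolve(sig, h)[::2]

y = decim2(x)
y_shift2 = decim2(np.concatenate(([0.0, 0.0], x)))   # input delayed by 2 samples
print(np.allclose(y_shift2[1:], y))                  # True: output delayed by 1
# A 1-sample input shift (the other partition) has no such exact counterpart.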

More generally, resampling is kind of an edge case with respect to time
invariance, in the sense that resamplers are time-variant systems that are
trying as hard as they can to act like time invariant systems. As opposed
to, say, modulators or envelopes or such.

Ethan D


On Fri, Sep 1, 2017 at 10:09 PM, Nigel Redmon  wrote:

> Interesting comments, Ethan.
>
> Somewhat related to your points, I also had a situation on this board
> years ago where I said that sample rate conversion was LTI. It was a
> specific context, regarding downsampling, so a number of people, one by
> one, basically quoted back the reason I was wrong. That is, basically that
> for downsampling 2:1, you’d get a different result depending on which set
> of points you discard (decimation), and that alone meant it isn’t LTI. Of
> course, the fact that the sample values are different doesn’t mean what
> they represent is different—one is just a half-sample delay of the other. I
> was surprised a bit that they accepted so easily that SRC couldn’t be used
> in a system that required LTI, just because it seemed to violate the
> definition of LTI they were taught.
>
> On Sep 1, 2017, at 3:46 PM, Ethan Duni  wrote:
>
> Ethan F wrote:
> >I see your nitpick and raise you. :o) Surely there are uncountably many
> such functions,
> >as the power at any apparent frequency can be distributed arbitrarily
> among the bands.
>
> Ah, good point. Uncountable it is!
>
> Nigel R wrote:
> >But I think there are good reasons to understand the fact that samples
> represent a
> >modulated impulse train.
>
> I entirely agree, and this is exactly how sampling was introduced to me
> back in college (we used Oppenheim and Willsky's book "Signals and
> Systems"). I've always considered it the canonical EE approach to the
> subject, and am surprised to learn that anyone thinks otherwise.
>
> Nigel R wrote:
> >That sounds like a dumb observation, but I once had an argument on this
> board:
> >After I explained why we stuff zeros of integer SRC, a guy said my
> explanation was BS.
>
> I dunno, this can work the other way as well. There was a guy a while back
> who was arguing that the zero-stuffing used in integer upsampling is
> actually not a time-variant operation, on the basis that the zeros "are
> already there" in the impulse train representation (so it's a "null
> operation" basically). He could not explain how this putatively-LTI system
> was introducing aliasing into the output. Or was this the same guy?
>
> So that's one drawback to the impulse train representation - you need the
> sample rate metadata to do *any* meaningful processing on such a signal.
> Otherwise you don't know which locations are "real" zeros and which are
> just "filler." Of course knowledge of sample rate is always required to
> make final sense of a discrete-time audio signal, but in the usual sequence
> representation we don't need it just to do basic operations, only for
> converting back to analog or interpreting discrete time operations in
> analog terms (i.e., what physical frequency is the filter cut-off at,
> etc.).
>
> The other big pedagogical problem with impulse train representation is
> that it can't be graphed in a useful way.
>
> People will also complain that it is poorly defined mathematically (and
> indeed the usual treatments handwave these concerns), but my rejoinder
> would be that it can all be made rigorous by adopting non-standard
> analy

Re: [music-dsp] Sampling theory "best" explanation

2017-09-01 Thread Ethan Duni
Ethan F wrote:
>I see your nitpick and raise you. :o) Surely there are uncountably many
such functions,
>as the power at any apparent frequency can be distributed arbitrarily
among the bands.

Ah, good point. Uncountable it is!

Nigel R wrote:
>But I think there are good reasons to understand the fact that samples
represent a
>modulated impulse train.

I entirely agree, and this is exactly how sampling was introduced to me
back in college (we used Oppenheim and Willsky's book "Signals and
Systems"). I've always considered it the canonical EE approach to the
subject, and am surprised to learn that anyone thinks otherwise.

Nigel R wrote:
>That sounds like a dumb observation, but I once had an argument on this
board:
>After I explained why we stuff zeros of integer SRC, a guy said my
explanation was BS.

I dunno, this can work the other way as well. There was a guy a while back
who was arguing that the zero-stuffing used in integer upsampling is
actually not a time-variant operation, on the basis that the zeros "are
already there" in the impulse train representation (so it's a "null
operation" basically). He could not explain how this putatively-LTI system
was introducing aliasing into the output. Or was this the same guy?
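
For what it's worth, the images are easy to exhibit directly (a quick numpy check; the signal length and upsampling factor are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(16)
L = 4

up = np.zeros(L * len(x))
up[::L] = x                            # zero-stuff: x0, 0, 0, 0, x1, 0, ...

print(np.allclose(np.fft.fft(up), np.tile(np.fft.fft(x), L)))
# True: the stuffed signal's spectrum is L copies (images) of the original,
# which is exactly the aliasing a subsequent lowpass has to remove.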

So that's one drawback to the impulse train representation - you need the
sample rate metadata to do *any* meaningful processing on such a signal.
Otherwise you don't know which locations are "real" zeros and which are
just "filler." Of course knowledge of sample rate is always required to
make final sense of a discrete-time audio signal, but in the usual sequence
representation we don't need it just to do basic operations, only for
converting back to analog or interpreting discrete time operations in
analog terms (i.e., what physical frequency is the filter cut-off at,
etc.).

The other big pedagogical problem with impulse train representation is that
it can't be graphed in a useful way.

People will also complain that it is poorly defined mathematically (and
indeed the usual treatments handwave these concerns), but my rejoinder
would be that it can all be made rigorous by adopting non-standard
analysis/hyperreal numbers. So, no harm no foul, as far as "correctness" is
concerned, although it does hobble the subject as a gateway into "real
math."

Ethan D

On Fri, Sep 1, 2017 at 2:38 PM, Ethan Fenn  wrote:

> This needs an additional qualifier, something about the bandlimited
>> function with the lowest possible bandwidth, or containing DC, or
>> "baseband," or such.
>
>
> Yes, by bandlimited here I mean bandlimited to [-Nyquist, Nyquist].
>
> Otherwise, there are a countably infinite number of bandlimited functions
>> that interpolate any given set of samples. These get used in "bandpass
>> sampling," which is uncommon in audio but commonplace in radio
>> applications.
>
>
> I see your nitpick and raise you. :o) Surely there are uncountably many
> such functions, as the power at any apparent frequency can be distributed
> arbitrarily among the bands.
>
> -Ethan F
>
>
> On Fri, Sep 1, 2017 at 5:30 PM, Ethan Duni  wrote:
>
>> >I'm one of those people who prefer to think of a discrete-time signal
>> as
>> >representing the unique bandlimited function interpolating its samples.
>>
>> This needs an additional qualifier, something about the bandlimited
>> function with the lowest possible bandwidth, or containing DC, or
>> "baseband," or such.
>>
>> Otherwise, there are a countably infinite number of bandlimited functions
>> that interpolate any given set of samples. These get used in "bandpass
>> sampling," which is uncommon in audio but commonplace in radio
>> applications.
>>
>> Ethan D
>>
>> On Fri, Sep 1, 2017 at 1:31 PM, Ethan Fenn 
>> wrote:
>>
>>> Thanks for posting this! It's always interesting to get such a good
>>> glimpse at someone else's mental model.
>>>
>>> I'm one of those people who prefer to think of a discrete-time signal as
>>> representing the unique bandlimited function interpolating its samples. And
>>> I don't think this point of view has crippled my understanding of
>>> resampling or any other DSP techniques!
>>>
>>> I'm curious -- from the impulse train point of view, how do you
>>> understand fractional delays? Or taking the derivative of a signal? Do you
>>> have to pass into the frequency domain in order to understand these?
>>> Thinking of a signal as a bandlimited function, I find it pretty easy to
>>> understand both of these processes from first principles in the

Re: [music-dsp] Sampling theory "best" explanation

2017-09-01 Thread Ethan Duni
>I'm one of those people who prefer to think of a discrete-time signal as
>representing the unique bandlimited function interpolating its samples.

This needs an additional qualifier, something about the bandlimited
function with the lowest possible bandwidth, or containing DC, or
"baseband," or such.

Otherwise, there are a countably infinite number of bandlimited functions
that interpolate any given set of samples. These get used in "bandpass
sampling," which is uncommon in audio but commonplace in radio
applications.
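
A one-line version of that ambiguity (numpy sketch; 1 kHz at 48 kHz is an arbitrary choice):

import numpy as np

fs, f = 48000.0, 1000.0
n = np.arange(16)
# The baseband tone and its image one full sample rate higher hit the exact
# same sample values - only the "lowest band" convention picks one of them.
print(np.allclose(np.cos(2 * np.pi * f * n / fs),
                  np.cos(2 * np.pi * (f + fs) * n / fs)))   # True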

Ethan D

On Fri, Sep 1, 2017 at 1:31 PM, Ethan Fenn  wrote:

> Thanks for posting this! It's always interesting to get such a good
> glimpse at someone else's mental model.
>
> I'm one of those people who prefer to think of a discrete-time signal as
> representing the unique bandlimited function interpolating its samples. And
> I don't think this point of view has crippled my understanding of
> resampling or any other DSP techniques!
>
> I'm curious -- from the impulse train point of view, how do you understand
> fractional delays? Or taking the derivative of a signal? Do you have to
> pass into the frequency domain in order to understand these? Thinking of a
> signal as a bandlimited function, I find it pretty easy to understand both
> of these processes from first principles in the time domain, which is one
> reason I like to think about things this way.
>
> -Ethan
>
>
>
>
> On Mon, Aug 28, 2017 at 12:15 PM, Nigel Redmon 
> wrote:
>
>> Hi Remy,
>>
>> On Aug 28, 2017, at 2:16 AM, Remy Muller  wrote:
>>
>> I second Sampo about giving some more hints about Hilbert spaces,
>> shift-invariance, Riesz representation theorem… etc
>>
>>
>> I think you’ve hit upon precisely what my blog isn’t, and why it exists
>> at all. ;-)
>>
>> Correct me if you said it somewhere and I didn't see it, but an important
>> *implicit* assumption in your explanation is that you are talking about
>> "uniform bandlimited sampling”.
>>
>>
>> Sure, like the tag line in the upper right says, it’s a blog about
>> "practical digital audio signal processing".
>>
>> Personally, my biggest enlightening moment regarding sampling was when I
>> read these 2 articles:
>>
>>
>> Nice, thanks for sharing.
>>
>> "Sampling—50 Years After Shannon"
>> http://bigwww.epfl.ch/publications/unser0001.pdf
>>
>> and
>>
>> "Sampling Moments and Reconstructing Signals of Finite Rate of
>> Innovation: Shannon Meets Strang–Fix"
>> https://infoscience.epfl.ch/record/104246/files/DragottiVB07.pdf
>>
>> I wish I had discovered them much earlier during my signal processing
>> classes.
>>
>> Talking about generalized sampling may seem abstract and beyond what you
>> are trying to explain. However, in my personal experience, sampling seen
>> through the lens of approximation theory as 'just a projection' onto a
>> signal subspace made everything clearer by giving more perspective:
>>
>>- The choice of basis functions and norms is wide. The sinc function
>>being just one of them and not a causal realizable one (infinite temporal
>>support).
>>- Analysis and synthesis functions don't have to be the same (cf
>>wavelets bi-orthogonal filterbanks)
>>- Perfect reconstruction is possible without requiring
>>bandlimitedness!
>>- The key concept is 'consistent sampling': *one seeks a signal
>>approximation that is such that it would yield exactly the same
>>measurements if it was reinjected into the system*.
>>- All that is required is a "finite rate of innovation" (in the
>>statistical sense).
>>- Finite support kernels are easier to deal with in real-life because
>>they can be realized (FIR) (reminder: time-limited <=> non-bandlimited)
>>- Using the L2 norm is convenient because we can reason about best
>>approximations in the least-squares sense and solve the projection problem
>>using Linear Algebra using the standard L2 inner product.
>>- Shift-invariance is even nicer since it enables *efficient* signal
>>processing.
>>- Using sparser norms like the L1 norm enables sparse sampling and
>>the whole field of compressed sensing. But it comes at a price: we have to
>>use iterative projections to get there.
>>
>> All of this is beyond your original purpose, but from a pedagogical
>> viewpoint, I wish these 2 articles were systematically cited in a "Further
>> Reading" section at the end of any explanation regarding the sampling
>> theorem(s).
>>
>> At least the wikipedia page cites the first article and has a section
>> about non-uniform and sub-nyquist sampling but it's easy to miss the big
>> picture for a newcomer.
>>
>> Here's a condensed presentation by Michael Unser for those who would like
>> to have a quick historical overview:
>> http://bigwww.epfl.ch/tutorials/unser0906.pdf
>>
>>
>> On 27/08/17 08:20, Sampo Syreeni wrote:
>>
>> On 2017-08-25, Nigel Redmon wrote:
>>
>> http://www.earlevel.com/main/tag/sampling-theory-series/?order=asc
>>
>>
>> Personally I'd make

Re: [music-dsp] advice regarding USB oscilloscope

2017-03-08 Thread Ethan Duni
These PicoScopes look pretty cool :]

As it happens I am just now trying to free up some garage space to get an
electronics bench together. But it's coming up on 20 years since I last
soldered and it's a whole different world with scopes now. So thanks for
this thread!

Also if anybody knows good resources for refurbishing old receivers and
speakers please point me in that direction.

E

On Wed, Mar 8, 2017 at 8:21 PM, Andrew Simper  wrote:

> Hi Remy,
>
> I use the signal generator all the time to calibrate the pot on the
> probes when in x10 mode using the square wave output. Note that the
> scope runs off USB power so you can't generate very hot signals, it's
> +- 2V (USB is 5V), you'll need to make your own external booster
> circuit for general use. The 5000 has a proper analog signal generator
> from what I can tell, and the 5000B adds a 14-bit sample based
> arbitrary waveform generator that runs at 200MHz, so absolutely fine
> for any audio applications, but for us audio guys we have soundcards
> to play back waveforms, so it's not that much use.
>
> I wish they made this scope when I bought my first one, I bought the
> 12bit 4226 model, which still works great, but I would love this new
> one!
>
> Cheers,
>
> Andy
>
> On 9 March 2017 at 07:19, Remy Muller  wrote:
> > hi,
> >
> > AudioPrecision looks nice but it's way over my budget considering that it
> > won't be used on a daily basis.
> >
> > Looking at the specs, the QuantAsylum audio card only seems to have AC
> > coupling (down to 1.6 Hz) and their oscilloscope page is a bit short on
> > details.
> >
> > Hacking a soundcard as an oscilloscope could be very convenient since it
> > benefits from all the standard audio software and can easily get beyond
> the
> > 2/4 channels, but it's limited to AC coupling, unless there are
> soundcards
> > that have DC coupled inputs? AFAIK most only provide DC outputs.
> > Furthermore having to do homemade matched probes and attenuators is not
> very
> > 'plug and play'.
> >
> > Since bitscope seems to only provide 8-bit ADC, Picoscope is thus very
> high
> > on my list, in particular the 5000 series. I'm wondering whether their
> > Arbitrary Waveform Generator option is really worth it though.
> >
> > @Andrew I just found a python wrapper based on ctypes
> > https://github.com/colinoflynn/pico-python
> >
> > Thanks for all the feedback!
> >
> >
> > On 08/03/17 12:16, Roshan Wijetunge wrote:
> >
> > Depending on how cheap and improvised you want to go, and how handy you
> are
> > with basic electronics, you can easily adapt your soundcard to work as an
> > oscilloscope. There are a number of guides on the internet on how to do
> > this, such as:
> >
> > http://makezine.com/projects/sound-card-oscilloscope/
> >
> > I have used the following variation with good results:
> >
> > - Probe via resistor to mic input of mixer
> > - Mixer line out to line of USB soundcard
> > - Schwa Schope plugin running in any DAW host (e.g. Reaper)
> >
> > I used this setup as it utilised components I already had available, and
> it
> > has proved very useful for debugging audio hardware, being able to trace
> > signals through a circuit as well as biasing amplifier stages in
> pre-amps.
> > Using the mixer gave me control over input signal range though clearly
> you
> > have to be careful with gain staging so as not to introduce distortion to
> > the signal.
> >
> > I also improvised a signal generator using a Electro Harmonix Tube Zipper
> > guitar effects pedal. It's an auto-wah type pedal, but you can set the
> > resonance to maximum, sensitivity to zero and it generates a nice clean
> > stable sine wave.
> >
> > Best Regards
> > Roshan
> >
> >
> >
> > On 8 March 2017 at 09:57, Andrew Simper  wrote:
> >>
> >> Picoscope make the cheapest 16-bit scopes around (USD 1000), the
> >> 16-bit stuff from Tektronix is a lot more expensive (USD 31000 -
> >> that's right I didn't accidentally add an extra zero, it's x30 the
> >> price). I would recommend using the Picoscope and use Python's easy c
> >> bindings to call the Picoscope library functions to do what you want.
> >>
> >> Cheers,
> >>
> >> Andy
> >>
> >> On 7 March 2017 at 22:59, Remy Muller  wrote:
> >> > Hi,
> >> >
> >> > I'd like to invest into an USB oscilloscope.
> >> >
> >> > The main purpose is in analog data acquisition and instrumentation.
> >> > Since
> >> > the main purpose is audio, bandwidth is not really an issue, most
> models
> >> > seem to provide 20MHz or much more and I'm mostly interested in analog
> >> > inputs, not logical ones.
> >> >
> >> > Ideally I'd like to have
> >> >
> >> >  - Mac, Windows and Linux support
> >> >
> >> > - 4 channels or more
> >> >
> >> > - 16-bit ADC
> >> >
> >> > - up to 20V
> >> >
> >> > - general purpose output generator*
> >> >
> >> > - a scripting API (python preferred)
> >> >
> >> > * I have been told that most oscilloscopes have either no or limited
> >> > output,
> >> > and that I'd rather use a soundcard for generating dedi

Re: [music-dsp] ± 45° Hilbert transformer using pair of IIR APFs

2017-02-09 Thread Ethan Duni
> how do you quadrature modulate without Hilbert filters?
>

Perhaps I'm using the wrong term - the operation in question is just the
multiplication of a signal by e^jwn. Or, equivalently, multiplying the real
part by cos(wn) and the imaginary part by sin(wn) - a pair of "quadrature
oscillators."


> i think you can calculate how much energy exists at the negative
> frequencies and that comes out to be an error image signal that also gets
> modulated up or down along with the image you want.
>

Right, to the extent that the LPF fails to completely block the negative
frequencies, they remain as error images and show up in the output. However
it seems easier to track this in the Weaver case, since this error is given
directly by the suppression characteristic of the LPF, rather than via
in-band phase cancellation as in the Hartley/Hilbert case.

My thinking is that the Weaver modulator gets a direct benefit from
oversampling, since the error images get moved further and further out into
the stopband of the LPFs (easing their design constraints), whereas the
Hartley approach does not, since it is stuck trying to maintain a phase
relationship across the signal band, no matter how much it is oversampled.

> i'm looking at the diagram and i think that's how my old Heathkit HW-100
> did it where the image rejection was done with
> piezo-electric-crystal-lattice filters which are BPFs (not LPFs) with
> bad-ass selectivity on both sides.
>

This sound like a third, related method (apparently called the Bandpass
Method). Weaver is kind of the same underlying idea, but it down-modulates
the signal first so that the filter can be done at baseband (as an LPF)
instead of doing it directly on the high frequency signal (then you need a
BPF with tight bandwidth and high center frequency, which is expensive).
The downside is an extra modulator is needed.

E
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] ± 45° Hilbert transformer using pair of IIR APFs

2017-02-09 Thread Ethan Duni
On Tue, Feb 7, 2017 at 6:49 AM, Ethan Fenn  wrote:

> So I guess the general idea with these frequency shifters is something
> like:
>
> pre-filter -> generate Hilbert pair -> multiply by e^iwt -> take the real
> part
>
> Am I getting that right?
>

Exactly, this is a single sideband modulation technique, specifically
called the Hartley Modulator.

I wonder if anyone has compared to the Weaver Modulator approach? That one
works like this:

input -> quadrature modulate to center image on DC -> (complex) LPF ->
quadrature modulate back up to desired frequency -> take the real part
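
As a rough NumPy/SciPy sketch of that chain (a one-shot, offline toy: the
6 kHz bandwidth, the tap count, and the windowed-sinc LPF are arbitrary
choices, and filter delay is not compensated):

    import numpy as np
    from scipy.signal import firwin, lfilter

    def weaver_shift(x, fs, shift_hz, bw_hz=6000.0, ntaps=255):
        n = np.arange(len(x))
        f1 = bw_hz / 2.0
        # First quadrature modulator: center the band [0, bw_hz] on DC.
        baseband = x * np.exp(-2j * np.pi * f1 * n / fs)
        # "Complex" LPF = the same real lowpass applied to I and Q.
        lpf = firwin(ntaps, f1 / (fs / 2.0))
        baseband = lfilter(lpf, 1.0, baseband)
        # Second quadrature modulator: back up, offset by the desired shift.
        return np.real(baseband * np.exp(2j * np.pi * (f1 + shift_hz) * n / fs))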

Obviously this approach requires two quadrature modulators instead of one,
and two (real) LPFs instead of one Hilbert transformer. On the other hand,
LPFs are easier to deal with than Hilbert filters - we aren't trying to
stick a huge phase transition right at DC, we can use any number of common
design techniques/topologies, and we don't have the extra constraint of
needing an allpass response. So it's not obvious to me that the two LPFs in
Weaver aren't actually cheaper than the one Hilbert filter in Hartley, for
equivalent performance.

Also the performance is easier to get a handle on, since the unwanted
images are controlled directly by the LPFs, rather than relying on phase
cancellation of in-band components in the image region. Moreover, the
less-ideal parts of the filter response can be pushed up into the
transition region (where there is presumably much less signal energy - or
even effectively none if there is significant oversampling to allow
headroom for frequency shifting), compared to the Hilbert approach which
has to try to maintain the correct phase shift all the way across the
signal band.

Most of the comparisons I've found between Hartley and Weaver are from the
analog radio domain, and the concerns mostly don't apply to a DSP context
(things about DC coupling in the quadrature oscillators, building matched
LPFs with tight tolerance, etc.). Any thoughts?

E
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Can anyone figure out this simple, but apparently wrong, mixing technique?

2016-12-10 Thread Ethan Duni
Ha this article made me chuckle. All the considerations about odd 8 bit
audio formats!

This method has his desired property that if all but one input is silent,
you get the non-silent one at output without attenuation or other
degradation. But the inclusion of the cross term makes it quite non-linear
so there's going to be serious distortion when actually mixing multiple
signals.
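
For anyone who wants to see it, here's a quick NumPy check (assuming the
normalized form of the formula, z = a + b - a*b, with both inputs mapped
into [0, 1]; adjust if I've mis-remembered the article):

    import numpy as np

    fs = 48000
    t = np.arange(fs) / fs
    a = 0.5 + 0.4 * np.sin(2 * np.pi * 440 * t)   # two pure tones, "offset" style
    b = 0.5 + 0.4 * np.sin(2 * np.pi * 550 * t)
    z = a + b - a * b                             # the proposed mix

    # The a*b cross term puts energy at the difference and sum frequencies.
    spec = np.abs(np.fft.rfft((z - z.mean()) * np.hanning(len(z))))
    freqs = np.fft.rfftfreq(len(z), 1.0 / fs)
    for f0 in (110, 440, 550, 990):
        k = int(np.argmin(np.abs(freqs - f0)))
        print(f"{f0:4d} Hz: {20 * np.log10(spec[k] + 1e-12):6.1f} dB")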

Not sure why he's worried about doing audio processing in 8 bit resolution
in 2016.

E

On Sat, Dec 10, 2016 at 11:44 AM, Ethan Fenn  wrote:

> Doesn't make sense to me either. If the inputs are two pure sines, you'll
> get combination tones showing up in the output. And they won't be
> particularly quiet either.
>
> -Ethan
>
>
>
> On Sat, Dec 10, 2016 at 2:31 PM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>> it's this Victor Toth article:
>> http://www.vttoth.com/CMS/index.php/technical-notes/68
>> and it doesn't seem to make sense to me.
>>
>>
>>
>> it doesn't matter if it's 8-bit offset binary or not, there should not be
>> a multiplication of two signals in the definition.
>>
>> i cannot see what i am missing.  can anyone enlighten me?
>>
>>
>>
>> --
>>
>> r b-j  r...@audioimagination.com
>>
>> "Imagination is more important than knowledge."
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Allpass filter

2016-12-08 Thread Ethan Duni
>ap := IFFT(FFT(lp)/FFT(p))
>It's simply a complex division in frequency domain.

That won't really work - the denominator there corresponds to an IIR
filter, so you're going to get considerable aliasing doing this.

With considerable zero-padding it may work approximately, but you aren't
going to get an actual allpass filter out of this (except in the trivial
cases of flat filters and such).

The right way to get the allpass is not to use any FFTs, you just use lp
and p directly as the coefficients of an IIR filter. If you want an FIR
approximation to that, then compute the impulse response and
truncate/window it to the desired length.
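
A minimal SciPy sketch of that last step, assuming lp and p are already in
hand as coefficient arrays (numerator lp, denominator p, p[0] nonzero); the
length and the one-sided taper are arbitrary:

    import numpy as np
    from scipy.signal import lfilter

    def fir_approx_of_allpass(lp, p, n_taps=4096):
        impulse = np.zeros(n_taps)
        impulse[0] = 1.0
        h = lfilter(lp, p, impulse)              # impulse response of LP(z)/P(z)
        taper = np.hanning(2 * n_taps)[n_taps:]  # fade-out to soften the truncation
        return h * taper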

FFT domain is generally not a good place to design filters - you're only
controlling what happens at the bin centers, and all kinds of wild things
can happen in between them. And it's difficult to account for the
circular/finite length effects.

Ethan Duni

On Thu, Dec 8, 2016 at 7:11 AM, STEFFAN DIEDRICHSEN 
wrote:

>
> On 08.12.2016|KW49, at 15:32, Uli Brueggemann 
> wrote:
>
> It's simply a complex division in frequency domain.
>
>
> That’s correct. I’m not sure, if you need to zero-pad the FFTs to avoid
> time-aliasing since the spectral multiplication is a convolution. But on
> the other hand a spectral division is a deconvolution, so you should be
> fine.
>
> Best,
>
> Steffan
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Allpass filter

2016-12-07 Thread Ethan Duni
I'm not sure I quite follow what the goal is here? If you already have lp
and p, then there aren't any additional calculations needed to obtain ap -
it's an IIR filter with numerator coefficients given by lp, and denominator
coefficients given by p. The pulse response is obtained by running the
filter for a pulse input.

Is the goal to get an FIR approximation of ap? As Stefan has pointed out,
an FIR allpass is a simple delay. If you are willing to relax the allpass
criterion, you can get an FIR approximation by just truncating the pulse
response at whatever length is sufficient for your application.

Unless I'm missing something, seems like all of the difficulties are in
obtaining p and lp in the first place?

Ethan

On Wed, Dec 7, 2016 at 4:10 AM, Uli Brueggemann 
wrote:

> Hi,
>
> I'm searching for a solution to an allpass filter calculation with the
> following conditions:
>
> There is a given pulse response p with a transfer function H. It is
> possible to derive a linear phase pulse response lp from the magnitude of H.
>
> Now there is an equation
> p * ap = lp  (* = convolution, ap = allpass)
>
> Thus
> ap = lp * p^-1
>
> The magnitude of ap = 1, so ap applies only phase shifts. Its group delay
> is inverse to the group delay of p.
>
> Is there a solution to elegantly calculate the pulse response ap ? The
> calculation of p^-1 may be difficult or numerically unstable.
>
> Cheers
> Uli
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] efficient running max algorithm

2016-09-02 Thread Ethan Duni
Right aren't monotonic signals the worst case here? Or maybe not, since
they're worst for one wedge, but best for the other?
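
For reference, a minimal Python sketch of the max-wedge idea described
below (a deque of candidate indices, amortized O(1) per sample; window
length w is in samples):

    from collections import deque

    def running_max(x, w):
        out = []
        wedge = deque()                  # indices of a non-increasing "max wedge"
        for i, v in enumerate(x):
            while wedge and x[wedge[-1]] <= v:
                wedge.pop()              # a newer, larger sample shadows older ones
            wedge.append(i)
            if wedge[0] <= i - w:
                wedge.popleft()          # oldest candidate fell out of the window
            out.append(x[wedge[0]])
        return out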

Ethan D

On Fri, Sep 2, 2016 at 10:12 AM, Evan Balster  wrote:

> Just a few clarifications:
>
> - Local maxima and first difference don't really matter.  The maximum
> wedge describes global maxima for all intervals [t-x, t], where x=[R-1..0].
>
> - If two equal samples are encountered, the algorithm can forget about the
> older of them.
>
> It has been very helpful for my purposes to visualize the algorithm thus:
>  Imagine drawing lines rightward from all points on the signal.  Those
> which extend to the present time without intersecting other parts of the
> signal form the minimum and maximum wedge.  The newest sample, and only the
> newest sample, exists in both wedges.
>
> [image: illustration of the maximum/minimum wedges]
> http://interactopia.com/archive/images/lemire_algorithm.png
>
> The algorithm can safely forget anything in grey because it has been
> "shadowed" by newer maximum or minimum values.
>
> – Evan Balster
> creator of imitone 
>
> On Fri, Sep 2, 2016 at 1:50 AM, Ross Bencina 
> wrote:
>
>> On 2/09/2016 4:37 PM, Ross Bencina wrote:
>>
>>> When the first difference is positive, the history is trimmed. This is
>>> the only time any kind of O(N) or O(log2(N)) operation is performed.
>>> First difference positive implies that a new local maximum is achieved:
>>> in this case, all of the most recent history samples that are less than
>>> the new maximum are trimmed.
>>>
>>
>> Correction:
>>
>> [The new local maximum dominates all
>>> history up to the most recent *history sample* that exceeds it, which is
>>> retained.]
>>>
>>
>> I had said "most recent local maximum" but the history contains a
>> monotonic non-increasing sequence.
>>
>> Ross.
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] idealized flat impact like sound

2016-07-30 Thread Ethan Duni
So like a cascade of allpass filters then?
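
For instance, something like this quick Python sketch (Schroeder allpass
sections; the short mutually prime delays and g = 0.5 are arbitrary picks,
just to show the shape of the idea, not a tuned design):

    import numpy as np

    def schroeder_allpass(x, d, g):
        # y[n] = -g*x[n] + x[n-d] + g*y[n-d]  (unity magnitude at all frequencies)
        y = np.zeros(len(x))
        for n in range(len(x)):
            xd = x[n - d] if n >= d else 0.0
            yd = y[n - d] if n >= d else 0.0
            y[n] = -g * x[n] + xd + g * yd
        return y

    h = np.zeros(2048)
    h[0] = 1.0                       # unit impulse in
    for d in (23, 37, 53, 79):       # short, mutually prime delays
        h = schroeder_allpass(h, d, 0.5)
    # h is now a short, dense, spectrally flat "impact" response.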

Ethan D

On Fri, Jul 29, 2016 at 11:10 AM, gm  wrote:

>
> I think what I am looking for would be the perfect reverb.
>
> So that's the question reformulated: how could you construct a perfectly
> flat short reverb?
>
> It's the same problem.
>
>
>
> Am 29.07.2016 um 12:18 schrieb Tito Latini:
>
>> An idea is to create a kind of "ideal" residual: i.e. the transient is
>> a band-limited impulse and an enveloped (maybe expdec) noise is added
>> after two, three or a few samples. The parameters are:
>>
>>  - noise env
>>  - delay of the noise in samples
>>  - transient-to-noise ratio (% of transient, % of noise)
>>
>> The transient (band-limited impulse) is spectrally flat and the noise
>> adds the reverberation (you could start with a very low level).
>>
>>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Anyone using unums?

2016-04-15 Thread Ethan Duni
> okay, this PDF was more useful than the other.  once i got down to slide
> #31, i could see the essential definition of what a "unum" is.
>
> big deeel.
>
> first of all, if the word size is fixed and known (and how would you know
> how far to go to get to the extra meta-data: inexact bit, num exponent
> bits, num fractional bits, at the end of the word if the word size was not
> known in advance?), then the num fractional bits field is a waste of bits.
> the num fractional bits + num exponent bits + 1 (the sign bit) adds to the
> word width.  big fat hairy deeel.


That older pdf is describing the "V1" unums. The newer stuff in the OP is
proposing "V2" unums that don't have those issues. The new ones don't have
the fractional bits/exponential bits format, nor the size fields. It's an
alternative way of extending fixed point to handle real numbers - rather
than adding an exponent field and then dealing with all of the exceptions
and (implicit) accuracy issues, you add the concept of sets on the
(projective) real line, and explicitly include reciprocals.


Seems like the obvious application is more in these robotics/physics
simulations where you want to do brute-force constraint checking, and not
necessarily at terribly high resolution. The set-theoretic stuff allows you
to do this in a rigorous, efficient, parallel manner. Not obvious what uses
this might have for music DSP, although the availability of cheap division
would obviate the hoops that we are in the habit of jumping through to
avoid divisions. Also kind of neat that the format always tells you
explicitly whether it is an exact number or a range of possible values,
which opens up a new approach to dealing with propagation of round-off
error. I.e., in traditional designs the results are rounded to the nearest
exact number in each operation, and the associated error is implicit. So we
end up having to over-spec systems to allow for that. And if we change the
order of operations, we can't expect bit-exact results with floats. With
unums the error remains explicit, as does its propagation through series of
operations. So you can do things like re-order your operations and still
get bit-exact results, and you get to specify exactly where, when, how - or
*if* - the rounding to exact numbers occurs. You are not forced to accept
some implicit rounding every time you do an operation, and then spec in
worst-case error bounds to ensure the desired accuracy.


On the other hand, the LUT approach seems like it will not scale well to
higher resolutions, and it will probably be quite some time before we have
any hardware that implements this stuff...


Ethan D

On Fri, Apr 15, 2016 at 1:24 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Anyone using unums?
> From: "Evan Balster" 
> Date: Fri, April 15, 2016 11:46 am
> To: music-dsp@music.columbia.edu
> --
>
> > Tangentially, I read some slides
> > about the previous unum format, which seems to describe a "meta-float"
> > with very different characteristics than the ones here.
>
> okay, this PDF was more useful than the other.  once i got down to slide
> #31, i could see the essential definition of what a "unum" is.
>
> big deeel.
>
> first of all, if the word size is fixed and known (and how would you know
> how far to go to get to the extra meta-data: inexact bit, num exponent
> bits, num fractional bits, at the end of the word if the word size was not
> known in advance?), then the num fractional bits field is a waste of bits.
>  the num fractional bits + num exponent bits + 1 (the sign bit) adds to the
> word width.  big fat hairy deeel.
>
>
>
> here are a couple of real number formats i have seen in practice in real
> devices (neither had an IEEE floating-point coprocessor):
>
> 1.  essentially the same format as IEEE-754 float except instead of being
> sign-magnitude representation (a negative number looks just like its
> positive value except the MSB is 1 instead of 0) the format is 2's
> complement.  the exponent is biased like IEEE and denorms happen when the
> exponent bits (of the positive value) are all zeros.  the number of
> exponent bits might have been different and the word width might have bee
> lower (take a look at slide #20, the 16-bit float).  there are no NaNs and
> there is no negative zero, every bit combination corresponds to a unique
> real number.  since the positive values are all strictly increasing if the
> bit pattern is viewed as a regular binary integer, using the 2's complement
> instead of sign-magnitude essentially preserves the integer comparison
> result (which does not happen for IEEE if both numbers are negative).
>
> it is essentially the same format as used by the DEC PDP-10 (see
> http://www.quadibloc.c

Re: [music-dsp] High quality really broad bandwidth pinknoise (ideally more than 32 octaves)

2016-04-14 Thread Ethan Duni
Any noise other than white noise is correlated, by definition. That's what
"white noise" means - uncorrelated. Correlation in the time domain is
equivalent to non-constant shape in the frequency domain.
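
A quick NumPy check of that statement, using a 2-point average as the
(arbitrary) spectral shaping instead of a pinking filter:

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.standard_normal(1 << 16)   # white: flat spectrum, ~zero lag-1 correlation
    shaped = 0.5 * (w[1:] + w[:-1])    # any shaping (here a crude lowpass) adds correlation

    def lag1(x):
        x = x - x.mean()
        return np.dot(x[1:], x[:-1]) / np.dot(x, x)

    print(lag1(w))        # ~0.0
    print(lag1(shaped))   # ~0.5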

Ethan

On Thu, Apr 14, 2016 at 12:24 PM, Seth Nickell  wrote:

> Maybe stupid question: Is pink noise inherently correlated or is this a
> property of the algorithms currently in use?
>
> -Seth
>
> On Thu, Apr 14, 2016 at 7:11 AM Stefan Stenzel <
> stefan.sten...@waldorfmusic.de> wrote:
>
>> Dude is called Nyquist, and noise is not generally uncorrelated. White
>> noise usually is. Pink noise is not.
>>
>>
>> > On 14 Apr 2016, at 15:12 , Theo Verelst  wrote:
>> >
>> > HI,
>> >
>> > Talking about "perfect noise", you may want to consider these
>> theoretics:
>> >
>> > - what do you do near the Nyquist frequency? Or more practically: noise
>> that gets near the NF will probably cause strange effects in practical DACs
>> and when the digital signal is to be interpreted as "perfectly
>> re-constructable" there's probably a lot of trouble in the high frequency
>> range
>> >
>> > - "perfect noise" is also uncorrelated for most peoples' understanding,
>> which creates a problem when using filters: all FIR responses or digital
>> quasi poles and zeros you use show up as correlation at the output of the
>> noise generator.
>> >
>> > T.V.
>> > ___
>> > dupswapdrop: music-dsp mailing list
>> > music-dsp@music.columbia.edu
>> > https://lists.columbia.edu/mailman/listinfo/music-dsp
>> >
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] confirm a2ab2276c83b0f9c59752d823250447ab4b666

2016-03-29 Thread Ethan Duni
Supposing this is some griefer it seems reasonable to ignore them - but is
there a possibility that this is a symptom of some kind of server attack or
attempt to profile/track list members?

I've never received any unsub notices myself but it is a little
disconcerting that somebody persists at doing this. I'd think that a
griefer would give up after a while.

E

On Tue, Mar 29, 2016 at 7:13 AM, Douglas Repetto  wrote:

> I get reports about this every couple weeks. Because it's a double opt-out
> no one is actually being unsubscribed from the list unless they want to be.
> So please ignore these bogus unsub messages. It's not worth spending time
> worrying about it.
>
> douglas
>
>
> On Mon, Mar 28, 2016 at 6:37 PM, Evan Balster  wrote:
>
>> This happened to me also, but I didn't give it much thought.
>>
>>
>> On Mon, Mar 28, 2016 at 4:31 PM, robert bristow-johnson <
>> r...@audioimagination.com> wrote:
>>
>>>
>>>
>>> h.  i wonder if someone is trying to tell me something
>>>
>>>
>>>
>>>  Original Message
>>> 
>>> Subject: confirm a2ab2276c83b0f9c59752d823250447ab4b666
>>> From: music-dsp-requ...@music.columbia.edu
>>> Date: Mon, March 28, 2016 2:31 pm
>>> To: r...@audioimagination.com
>>>
>>> --
>>>
>>> > Mailing list removal confirmation notice for mailing list music-dsp
>>> >
>>> > We have received a request for the removal of your email address,
>>> > "r...@audioimagination.com" from the music-dsp@music.columbia.edu
>>> > mailing list. To confirm that you want to be removed from this
>>> > mailing list, simply reply to this message, keeping the Subject:
>>> > header intact. Or visit this web page:
>>> >
>>> >
>>> https://lists.columbia.edu/mailman/confirm/music-dsp/a2ab2276c83b0f9c59752d823250447ab4b666
>>> >
>>> >
>>> > Or include the following line -- and only the following line -- in a
>>> > message to music-dsp-requ...@music.columbia.edu:
>>> >
>>> > confirm a2ab2276c83b0f9c59752d823250447ab4b666
>>> >
>>> > Note that simply sending a `reply' to this message should work from
>>> > most mail readers, since that usually leaves the Subject: line in the
>>> > right form (additional "Re:" text in the Subject: is okay).
>>> >
>>> > If you do not wish to be removed from this list, please simply
>>> > disregard this message. If you think you are being maliciously
>>> > removed from the list, or have any other questions, send them to
>>> > music-dsp-ow...@music.columbia.edu.
>>> >
>>> >
>>>
>>> i think *someone* is being a wee bit malicious.  or at least a bit
>>> mischievous.
>>>
>>> (BTW, i changed the number enough that i doubt it will work for anyone.
>>>  but try it, if you want.)
>>>
>>>
>>>
>>>
>>>
>>>  Original Message
>>> 
>>> Subject: Re: [music-dsp] Changing Biquad filter coefficients on-the-fly,
>>> how to handle filter state?
>>> From: "vadim.zavalishin" 
>>> Date: Mon, March 28, 2016 2:20 pm
>>> To: r...@audioimagination.com
>>> music-dsp@music.columbia.edu
>>>
>>> --
>>>
> robert bristow-johnson wrote on 2016-03-28 17:57:
>>> >> using the trapezoid rule to model/approximate the integrator of an
>>> >> analog filter is no different than applying bilinear transform
>>> >> (without compensation for frequency warping) to the same integrator.
>>> >>
>>> >> s^(-1) <--- T/2 * (1 + z^(-1)) / (1 - z^(-1))
>>> >
>>> > This statement implies the LTI case, where the concept of the transfer
>>> > function exists.
>>>
>>> i didn't say that.  i said "applying ... to the same integrator."  about
>>> each individual "transfer function" that looks like "s^(-1)"
>>>
>>>
>>>
>>> > In the topic of this thread we are talking about
>>> > time-varying case, this means that the transfer function concept
>>> doesn't
>>> > apply anymore.
>>>
>>>
>>>
>>> well, there's slow time and there's fast time.  and the space between
>>> the two depends on how wildly one twists the knob.  while the filter
>>> properties are varying, we want the thing to sound like a filter (with
>>> properties that vary).  there *is* a concept of frequency response (which
>>> may vary).
>>>
>>>
>>>
>>> for each individual integrator you are replacing the
>>> continuous-time-domain equivalent of s^(-1) with the discrete-time-domain
>>> equivalent of T/2 * (1 + z^(-1)) / (1 - z^(-1)), which is the same as the
>>> trapezoid rule.
>>>
>>>
>>>
>>> > Specifically, filters with identical *formal* transfer
>>> > functions will behave differently and this is exactly the topic of the
>>> > discussion.
>>>
>>> i didn't say anything about a "transfer function", until this post.  i
>>> am saying that the trapezoidal rule for modeling integrators is replacing
>>> those integrators (which by themselves are LTI and *do* happen to have a
>>> transfer function of "s^(-1)") with whatever "T/2 * (1 + z^(-1)

Re: [music-dsp] Changing Biquad filter coefficients on-the-fly, how to handle filter state?

2016-03-03 Thread Ethan Duni
Yeah zeroing out the state is going to lead to a transient, since the
filter has to ring up.

If you want to go that route, one possibility is to use two filters in
parallel: one that keeps the old state/coeffs but gets zero input, and
another that has zero state and gets the new input/coeffs. You then add
their outputs together, combining the ring-out of the old state/coeffs with
the ring-up of the new coeffs/input. This is the zero-state/zero-input
decomposition. However this can still result in transient artifacts if the
filter has changed a lot (i.e., the old coeffs might have a short ring-out
time, but the new ones have a long ring-up time, or vice versa). And if
your coeffs aren't changing much, then probably you can get away with more
direct methods. But a worthwhile exercise for theoretical edification, I
think.
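
A rough SciPy sketch of that decomposition at a single switch point (state
as scipy's lfilter keeps it; in practice the zero-input branch would keep
running on silence for as long as its ring-out stays audible, not just one
block):

    import numpy as np
    from scipy.signal import lfilter

    def switch_block(x_block, b_old, a_old, zi_old, b_new, a_new):
        # Zero-input part: old coefficients and old state, fed with silence.
        ring_out, zi_old_next = lfilter(b_old, a_old,
                                        np.zeros(len(x_block)), zi=zi_old)
        # Zero-state part: new coefficients, zero state, fed with the new input.
        zi0 = np.zeros(max(len(a_new), len(b_new)) - 1)
        ring_up, zi_new = lfilter(b_new, a_new, x_block, zi=zi0)
        return ring_out + ring_up, zi_old_next, zi_new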

Another thing to consider is how to interpret the state variables for
different filter topologies. If you use Direct Form I then the state
variables are simply the previous inputs and outputs to/from the filter,
which still make sense if you change the coeffs. Other topologies have
state variables that correspond to history samples multiplied by the
coeffs, so when you change the coeffs they cease to make sense and you have
problems. If any of the coeffs are 0 you lose info and then can't "convert"
to the state corresponding to a different set of coeffs. Not that Direct
Form I is immune to artifacts when you change the coeffs, but they tend to
be much less severe than other topologies for a given set of coeffs/inputs.
This is because it's only a matter of the coefficient mismatch, and not the
additional factor of the state interpretation mismatch.

>Now if we're going to change to new filter coefficients, we have two
degrees of freedom
>in the state space. A reasonable goal seems to be to match the value and
derivative of
>the output after we change the coefficients.

I'm not sure quite how this would work for discrete time? Is the idea to
interpret them as continuous-time filters for the purposes of the state
update?

E

On Thu, Mar 3, 2016 at 11:34 AM, Ethan Fenn  wrote:

> As a simple fix, I would also try just leaving the state alone rather than
> zeroing it out. I've done this plenty of times before and it's always
> sounded okay for moderate/gradual changes of the coefficients.
>
> As for doing it "correctly" -- I haven't read up on this but my thinking
> would go like so... let's suppose we have a biquad in a free ringing state,
> no longer receiving any input. Its output will be a sum of two complex
> exponentials. Given the current state variables, we can compute the
> weighting of these two exponentials, giving us the current value and first
> derivative of the output.
>
> Now if we're going to change to new filter coefficients, we have two
> degrees of freedom in the state space. A reasonable goal seems to be to
> match the value and derivative of the output after we change the
> coefficients. So we can perform the same analysis with the new
> coefficients, getting new exponents in the output. We can figure out what
> new state variables will give us the desired match. I haven't done the math
> yet but I don't see any complications and this would give us a matrix
> mapping old state variables to new ones.
>
> Is this the kind of reasoning I would find in the references, or is there
> a better way to think about this?
>
> -Ethan
>
>
>
> On Wed, Mar 2, 2016 at 12:49 PM, Theo Verelst  wrote:
>
>> Paul Stoffregen wrote:
>>
>>> Does anyone have any suggestions or publications or references to best
>>> practices for what
>>> to do with the state variables of a biquad filter when changing the
>>> coefficients?
>>> ...
>>>
>>
>> I am not directly familiar with the programming of the particular biquad
>> filter variation, but some others, and like many have said, I suspect that
>> varying the external cutoff and/or resonance control, computing the effects
>> on the various filter coefficients, and essentially not making any changes
>> across some sort of singularity (if there is one), or overly rapid changes,
>> should leave the audio running through fine.
>>
>> The state variable filter theory and origins in the use of audio
>> equipment like the early analog synthesizers isn't exactly the same as the
>> digital implementations, so there are things to consider more accurately if
>> you want them wonderful sweeping resonant filter sounds on a sound source:
>> there are non-linearities for instance in a Moog ladder filter, and the
>> "state" which is remembered in an analog filter in the capacitors, gets
>> warped when the voltage controls change. A mostly linearized digital filter
>> simulation will probably not automatically exhibit the same behavior as the
>> well known analog filters. Also, all kinds signal subtleties are lost in
>> the approximation by sampling the time axis, which may or may not be a
>> problem.
>>
>> I looked at the code quickly, couldn't find the "definition"
>> 

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-26 Thread Ethan Duni
Theo wrote:
>I get there are certain statistical ideas involved. I wonder
>however where those ideas in practice lead to, because
>of a number of assumptions, like the "statistical variance"
>of a signal. I get that a self correlation of a signal in some
>normal definition gives an idea of the power, and that you
>could take it that you compute power per frequency band.
>But what does it mean when you talk about variance ?

>Of course to determine a statistical measure about a spectrum,
>either based on sampled signals or (where the analysis comes
>from and is only generally correct for signal from - to + inf) on a
>continuous signal, and based either on a Fourier integral/summation
>or a Fast Fourier analysis (with certain analysis length and frequency
>bin accuracy), you could use the general big numbers theorem and
>presume there's a mean and a variance. It would be nice to at least
> make credible why this is an ok analysis, because a lot of signals are
>far from Gaussian distributed in the sense of the frequency spectrum.

So we are employing prob/stat terms like mean and variance, and normalizing
the power spectrum so that it looks like a probability density.

However, this is only a matter of semantics, we are not required to
actually treat the signals in question as random. The whole thing works the
same way regardless of whether we apply it to the power spectral density of
a random signal, or the power spectrum of a deterministic signal (which is
what we've been doing so far here).

The goal is to find some simple features of the spectrum that capture
something about how "bright" it is - so the center of mass of the spectrum,
and maybe also its spread. Then we can compare these features to make
estimates of whether one signal is "brighter" than another, for example.
This is not required to be a complete characterization of the spectrum in
question - as you note, absent some other assumption like Gaussianity, the
first two moments will not be sufficient to completely characterize it.
It's only supposed to give us some (hopefully) meaningful indication of
certain broad properties of the spectrum. The hope would be that two
(different) spectra with the same first moment will have similar
"brightness," and so that statistic is sufficient to capture the property
in question.
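
As a plain frequency-domain reference for what's being estimated, the
feature is just this (a sketch; the window choice and power-vs-magnitude
weighting are the usual knobs):

    import numpy as np

    def spectral_centroid(x, fs):
        spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2  # power spectrum
        spec /= np.sum(spec)                                     # normalize like a density
        freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
        return np.sum(freqs * spec)                              # first moment, in Hz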

These are simply features of a power spectrum, much like familiar
quantities of bandwidth, peak level, transition width, etc. They do admit a
prob/stat interpretation, which is interesting but secondary to the primary
motivation here.

E



On Thu, Feb 25, 2016 at 11:04 AM, Theo Verelst  wrote:

> Evan Balster wrote:
>
>> ...
>>
>> To that end:  A handy, cheap algorithm for approximating the
>> power-weighted spectral
>> centroid -- a signal's "mean frequency" -- which is a good heuristic for
>> perceived sound
>> brightness .
>> In spite of
>> its simplicity, ...
>>
> Hi,
>
> Always interesting to learn a few more tricks, and thanks to Ethan's
> explanation I get there are certain statistical ideas involved. I wonder
> however where those ideas in practice lead to, because of a number of
> assumptions, like the "statistical variance" of a signal. I get that a self
> correlation of a signal in some normal definition gives an idea of the
> power, and that you could take it that you compute power per frequency
> band. But what does it mean when you talk about variance ? Mind you I know
> the general theoretics up to the quantum mechanics that worked on these
> subjects long ago fine, but I wonder what the understanding here is?
>
> Some have remarked about the analysis of a signal into ground frequency
> and harmonics that it might be hard to summarize and make an ordinal
> measure for "brightness" as a one dimensional quantity, I mean of you look
> at a number of peaks in a frequency graph, how do you sum up the frequency
> of the signal, if there is one, and the meaning of the various harmonics in
> the spectrum, if they are to be taken as a measure of the brightness? So a
> trick is fine, though I do not completely understand the meaning of a
> brightness measure for frequency analysis.
>
> Of course to determine a statistical measure about a spectrum, either
> based on sampled signals or (where the analysis comes from and is only
> generally correct for signal from - to + inf) on a continuous signal, and
> based either on a Fourier integral/summation or a Fast Fourier analysis
> (with certain analysis length and frequency bin accuracy), you could use
> the general big numbers theorem and presume there's a mean and a variance.
> It would be nice to at least make credible why this is an ok analysis,
> because a lot of signals are far from Gaussian distributed in the sense of
> the frequency spectrum.
>
> T.
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/list

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-25 Thread Ethan Duni
>Lastly, it's important to note that differentiation and
semi-differentiation
>filters are always approximate for sampled signals, and will tend to
>exhibit poor behavior for very high frequencies and (for
semi-differentiation)
>very low ones.

I'm not sure there's necessarily a problem at low frequencies for the
inverse pinking filters. A regular pinking filter definitely has to depart
from the ideal response at low frequencies, since the ideal response blows
up there. So if you obtain an inverse pinking filter by designing a pinking
filter and then taking its inverse, you will indeed end up with a departure
from the ideal at low frequencies. However, there is nothing problematic
about the ideal response of an inverse pinking filter in the low frequency
region - it simply goes through zero at DC. So it should be possible to
design an inverse pinking filter directly, without the departure in the low
frequency region.

Of course it may not make much difference in practice. Indeed, we probably
would want to stick a high-pass filter in front of the entire spectral
centroid estimator, in order to ensure that the denominator term (the
accumulated power in the unfiltered signal) doesn't blow up. In which case,
the low frequency response of the inverse pinking filter shouldn't matter
anyway.

E

On Thu, Feb 25, 2016 at 12:57 PM, Evan Balster  wrote:

> For my own benefit and that of future readers, I'm going to summarize the
> thread so far.
>
>
> The discussion here concerns metrics of "brightness" -- that is, the
> tendency of a given signal toward higher or lower signal content.  The
> method proposed for analyzing brightness involves inspecting "moments" of
> power in the frequency domain -- that is, the statistical distribution of
> power among frequencies.
>
>
> The algorithm I originally proposed uses a simple differentiator to
> approximate what I thought was a "mean frequency" -- the first moment of
> the distribution of power among frequencies in the signal.  As others have
> remarked, the revised algorithm  (as seen
> in the latest pastebin code) computes a *standard deviation* of
> frequencies in a real signal.  If you take out the square-root operation,
> it becomes the variance, or second moment.  The first moment (mean) is in
> fact *always zero* for real signals due to the symmetry of the frequency
> domain.
>
> The flaw of my algorithm is that, given a signal comprising two sinusoid
> frequencies of equal power, it will produce a quadratic mean
>  of the two frequencies
> rather than a linear mean.  If the frequencies are 100hz and 200hz, for
> instance, my algorithm will produce a centroid of about 158.1hz.  It's
> reasonable that we would prefer an algorithm that instead yields a more
> intuitive result 150hz -- the first moment of power in the *positive 
> *frequency
> domain.
>
> To achieve this linear spectral centroid then we need to use a filter
> approximating a semi-derivative -- also known as a "+3dB per octave" or
> "reverse-pinking" filter.  With one of these, we may compute a ratio of the
> power of the "un-pinked" signal to the power of the original signal,
> without a square-root operation -- and this gives us a "mean frequency"
> that will behave as we desire.
>
> Each of these techniques may be extended with further levels of
> differentiation or semi-differentiation step to compute additional moments:
>  in the case of the original technique, we can use a second-derivative
> approximation to get the fourth moment of the symmetric frequency domain.
> In the case of the "linear spectral centroid" technique, we can either
> apply the reverse-pinking filter again (or use a simple differentiator) to
> get the second moment, corresponding to the variance of frequencies in the
> signal.
>
> Lastly, it's important to note that differentiation and
> semi-differentiation filters are always approximate for sampled signals,
> and will tend to exhibit poor behavior for very high frequencies and (for
> semi-differentiation) very low ones.  The band of frequencies which will be
> handled accurately is a function of the filters used to approximate
> differentiation and semi-differentiation.
>
>
> When working with tonal signals, it has been proposed that brightness be
> normalized through division by fundamental frequency.  This produces a
> dimensionless (?) metric which is orthogonal to the tone's pitch, and does
> not typically fall below a value of one.  Whether such a metric corresponds
> more closely to brightness than the spectral centroid in hertz depends on a
> psychoacoustics question:  Do humans perceive brightness as a quality which
> is independent from pitch?
>
> – Evan Balster
> creator of imitone 
>
> On Thu, Feb 25, 2016 at 1:04 PM, Theo Verelst  wrote:
>
>> Evan Balster wrote:
>>
>>> ...
>>>
>>> To that end:  A handy, cheap algorithm for approximating the
>>> power-weighted spectr

Re: [music-dsp] Time-domain noisiness estimator

2016-02-21 Thread Ethan Duni
Not a purely time-domain approach, but you can consider comparing sparsity
in the time and Fourier domains. The idea is that periodic/tonal type
signals may be non-sparse in the time domain, but look sparse in the
frequency domain (because all of the energy is on/around harmonics).
Similarly, transient signals are quite sparse in the time domain, but quite
non-sparse in the frequency domain. Noisy signals, in comparison, aren't
sparse in either domain. So if you detect sparsity in at least one domain,
you mark that as signal. If you don't find a sparse structure in either
domain, you classify as noise. You can imagine expanding this to a larger
set of test transforms, working in the LPC residual domain, etc.

The underlying idea is that "signal" and "noise" can be distinguished in
the sense that signals have some compact structure underlying them. But
that structure may only be apparent in one domain or another. Noise,
however, doesn't have such a structure, so no matter what (fixed) transform
you throw at it, it still looks more-or-less uniform and featureless.
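
A crude sketch of that test, using sqrt(N)*L2/L1 as the sparsity score in
each domain (about 1 for flat/featureless, larger for sparser; the 3.0
threshold is completely ad hoc):

    import numpy as np

    def sparsity(v):
        v = np.abs(v)
        return np.sqrt(len(v)) * np.sqrt(np.sum(v ** 2)) / (np.sum(v) + 1e-12)

    def looks_noisy(x, threshold=3.0):
        s_time = sparsity(x)                                    # transients score high here
        s_freq = sparsity(np.fft.rfft(x * np.hanning(len(x))))  # tonal signals score high here
        return max(s_time, s_freq) < threshold                  # sparse in neither domain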

E

On Sun, Feb 21, 2016 at 3:01 PM, Dario Sanfilippo <
sanfilippo.da...@gmail.com> wrote:

> Hello.
>
> Corey: I'm honestly not so familiar with auto-correlation; I'm aware that
> it is implemented for pitch-detection but I didn't know about those other
> features; would you have a reference or link to a document I could check
> out?
>
> Evan: I get your point; in my case I was following more of a low-level and
> context-independent approach, namely that of trying to distinguish between
> periodic and non-periodic signals; indeed that's why I thought that varying
> ZCR could have worked out.
>
> James: do you mean that frequency modulated sounds and bell sounds are
> perceived as noisy although they have a non-varying ZCR? A noisy signal
> with a DC-offset would also make the algorithm faulty.
>
> Would you say that the most reliable estimation if that of the FFT-based
> flatness?
>
> Best,
> Dario
>
> On 21 February 2016 at 21:03, James McCartney  wrote:
>
>>
>> wouldn't using varying ZCR be defeated by frequency modulated or bell
>> tones?
>> One could also craft a very noisy signal with a perfectly periodic ZCR.
>> James McCartney
>>
>>
>> On Feb 19, 2016, at 04:49, Dario Sanfilippo 
>> wrote:
>>
>> Hello everybody.
>>
>> Following on a discussion about cheap/time-domain spectral centroid
>> estimators, I thought it could have been interesting to also discuss
>> time-domain noisiness estimators.
>>
>> I think that a common approach is the FFT-based spectral flatness
>> algorithm. In the time-domain, zero-crossing rate is another common
>> approach, although it seems to only work for specific cases like voiced Vs.
>> unvoiced sounds, or percussive Vs. non-percussive. A very high frequency
>> sinewave would also have a high ZCR, although it is not noisy.
>>
>> I tried implementing a rudimentary noisiness estimator based on the idea
>> that a noisy signal is characterised by a varying ZCR, rather than a high
>> ZCR. What I did was to use a differentiator on successive averaging windows
>> of ZCR, and then I averaged the absolute value of differentiator's output
>> to obtain an index.
>>
>> The algorithm seems to work fine for most cases, although some particular
>> frequencies of a sinusoidal input result in unexpected indexes. I guess
>> that a problem here is to find a good compromise in the averaging windows
>> of the ZCR. I am using 10-msec windows which seemed to work OK. I was also
>> thinking that I could make the averaging window time-variant, piloting it
>> based on a centroid estimation in order to optimes it according to the
>> spectral content of the signal.
>>
>> Does any of the above make sense for you? Are you aware of other
>> algorithms using a similar technique?
>>
>> If you're familiar with the Pure Data audio environment, you can have a
>> look at the patch here:
>> https://dl.dropboxusercontent.com/u/43961783/noisiness.jpg
>>
>> Thanks,
>> Dario
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-19 Thread Ethan Duni
>i haven't even found that in the lit.  which is
>why i was interested when Evan brought this topic up.

You can check this one out, haven't read it closely:
http://www.icmpc8.umn.edu/proceedings/ICMPC8/PDF/AUTHOR/MP040215.PDF

> the first moment is 0. due to the symmetry of what we're looking at.

> but i think that we were supposed to be integrating only positive
> values of w.  and then this centroid becomes more like a mean,
> not so much a variance.


Ah, I wasn't very explicit about this part. I fudged the integral limits
because I didn't want to do too much ascii math. But it's worth being clear
about the relationship between the full Parseval integral and the one-sided
centroid integral (using integral(a,b,x) for the definite integral of x
from a to b):

integral(-inf,inf, |w|^n |X(w)|^2 dw)
    = integral(-inf,0, |w|^n |X(w)|^2 dw) + integral(0,inf, |w|^n |X(w)|^2 dw)
    = integral(0,inf, |w|^n |X(w)|^2 dw) + integral(0,inf, |w|^n |X(w)|^2 dw)
    = 2 * integral(0,inf, |w|^n |X(w)|^2 dw)


Notice that we're integrating monomials in |w| - this is to enforce the
symmetry so that it corresponds to the one-sided integral, rather than the
moments of the two-sided power spectrum. So if you plug that into the RHS
of the second equation from my last post (n=2 for numerator, and n=0 for
denominator), the two's cancel out and we see that we're computing the
second moment of the normalized power spectrum. That's why we need to
either add a square root at the end (as Evan corrected up-thread) or
replace the differentiator with an inverse pinking filter as you suggested
(then we get the proper |w| integrand).
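
In code, the differentiate-and-square-root version of the recipe looks
something like this (a sketch using a plain first difference as the
differentiator, so it's only accurate well below Nyquist):

    import numpy as np

    def rms_frequency_hz(x, fs):
        d = np.diff(x)                                     # crude differentiator, gain ~ |w|
        w_rms = np.sqrt(np.sum(d ** 2) / np.sum(x ** 2))   # sqrt of the second-moment ratio
        return w_rms * fs / (2 * np.pi)

    fs = 48000
    t = np.arange(fs) / fs
    print(rms_frequency_hz(np.sin(2 * np.pi * 1000 * t), fs))   # ~1000 for a pure tone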


>same as the old standby: http://www.firstpr.com.au/dsp/pink-noise/

>but swap the poles and the zeros.


Of course that will work, but I was thinking that an inverse pinking filter
doesn't have the same difficulties regarding behavior around DC, So maybe
you can do a bit better with the inverse filter by taking advantage of the
extra freedom. Probably a relatively academic concern but figured I'd put
it out there...


E

On Thu, Feb 18, 2016 at 5:27 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
> From: "Ethan Duni" 
> Date: Thu, February 18, 2016 4:48 pm
> --
>
> > I've noticed
> > in my (cursory) searches that some people use amplitude spectra and
> others
> > use power spectra, but the only thing I've found in the way of comparison
> > tests was to do with whether it gets normalized by fundamental frequency
> or
> > not.
>
>
> i haven't even found that in the lit.  which is why i was interested when
> Evan brought this topic up.
>
>
>
> > Let's start in continuous time, with some real signal x(t) with FT X(w).
> > Recall the differentiation property, d/dt x(t) <=> jwX(w). Next, let's
> use
> > Parseval's theorem (ignoring the normalization constants because they'll
> > cancel out later):
> >
> > integral( |x(t)|^2 dt) = integral( |X(w)|^2 dw), and likewise integral(
> > |d/dt x(t)|^2 dt) = integral( |w|^2 |X(w)|^2 dw).
> >
> > Thus, the ratio of the time-domain integrals gives:
> >
> > integral( |d/dt x(t)|^2 dt)/integral( |x(t)|^2 dt) = integral( |w|^2
> > |X(w)|^2 dw)/integral( |X(w)|^2 dw)
> >
> > I.e., if we run a differentiator, then compute the ratio of the power in
> > that to the power in the original signal, the result is the second moment
> > of the (normalized) power spectrum.
>
>
>
> it's "second moment" because both positive and negative frequencies are
> used.
>
>
>
> > This corresponds to the system Evan
> > proposed in the OP, without the later square root modification. So that's
> > something, but presumably we want to get the *first* moment of the
> > normalized power spectrum.
>
>
>
> the first moment is 0. due to the symmetry of what we're looking at.
>
>
>
> but i think that we were supposed to be integrating only positive values
> of w.  and then this centroid becomes more like a mean, not so much a
> variance.
>
>
> > One option is to replace the differentiator with an inverse pinking
> filter,
> > as rbj suggested. Are there any good references on design of inverse
> > pinking filters?
> >
>
> same as the old standby: http://www.firstpr.com.au/dsp/pink-noise/
>
>
>
> but swap the poles and the zeros.
>
>
>
>
> > Another option is to stick some square roots on these quantities, as Evan
> > suggested in a subsequent post. But moving those through the integrals
> > means, according to Jensen's inequality, that

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Ethan Duni
when solving low-level problems.  (That said, the
> failures that result from "fumbling in the dark" can sometimes lead to
> groundbreaking discoveries.)
>
> Research into perception tells us that most phenomena are perceived
> proportional to the logarithm of their intensity.  It tells us further that
> auditory stimuli are received in a form *resembling* the frequency
> domain.  We're mathematicians, not neuroscientists, and that discipline
> comes with a powerful confirmation bias for simple, "elegant" solutions.
> But the cochlea is not cleanly modeled
> <http://www.cns.nyu.edu/~david/courses/perception/lecturenotes/pitch/pitch.html>
> by a fourier transform, and as to what happens beyond, Minsky said it best:
> the simplest explanation is that there is no simple explanation.  In
> absence of hard research, we can't reasonably expect to add logarithm
> flavoring to such a simple formula and expect it to converge with the
> result of billions of years of evolution.
>
> Anyway, that's why -- in spite of my extensive research in pitch tracking
> -- I don't touch perception modeling with a ten-foot pole.  It's a soft
> science and it's all too easy to develop the misconception that you know
> what you're doing.  Because it will be a long time before the perceptual
> properties of any brightness metric can be clearly understood, I'll stick
> to formulas whose mathematical properties are transparent -- these lend
> themselves infinitely better to being small pieces of larger systems.
>
> – Evan Balster
> creator of imitone <http://imitone.com>
>
> On Thu, Feb 18, 2016 at 11:24 AM, Ethan Duni  wrote:
>
>> >Weighting a mean with log-magnitude can quickly lead to nonsense.
>>
>> To use log magnitude you'd first have to normalize it to look like a
>> probability density (non-negative, sums to one). Meaning you add an offset
>> so that the lowest value is zero, and then normalize. Obviously that
>> puts restrictions on the class of signals it can handle - there can't be
>> any zeros on the unit circle (in practice we'd just apply a minimum
>> threshold at, say, -60dB or whatever) - and involves other complications
>> (I'm not sure there's a sensible time-domain interpretation).
>>
>> >I apply Occam's razor when making decisions about what metrics
>> correspond most closely to nature
>>
>> What is the natural phenomenon that we're trying to model here?
>>
>> > log-magnitude is rarely sensible outside of perception modeling
>>
>> But isn't the goal here to estimate the "brightness" of a signal?
>> Perceptual modelling is exactly why I bring log spectra up.
>>
>> E
>>
>>
>>
>> On Thu, Feb 18, 2016 at 7:42 AM, Evan Balster  wrote:
>>
>>> Weighting a mean with log-magnitude can quickly lead to nonsense.
>>> Trivial examples:
>>>
>>>- 0dB sine at 100hz, 6dB sine at 200hz --> log centroid is 200hz
>>>- -6dB sine at 100hz, 12dB sine at 200hz --> log centroid is 300hz
>>>(!)
>>>
>>> Sanfilippo's adaptive median finding technique is still applicable, but
>>> will produce the same result as a power or magnitude version.
>>>
>>> I apply Occam's razor when making decisions about what metrics
>>> correspond most closely to nature.  I choose the formula which is
>>> mathematically simplest while utilizing operations that make sense for the
>>> dimensionality of the operands and do not induce undue discontinuities.
>>> Power is simpler to compute than magnitude, log-magnitude is rarely
>>> sensible outside of perception modeling, and (unlike zero-crossing
>>> techniques) a small change in the signal will always produce a
>>> proportionally small change in the metrics.
>>>
>>> At next opportunity I should post up some code describing how to compute
>>> higher moments with the differential brightness estimator.
>>>
>>> – Evan Balster
>>> creator of imitone <http://imitone.com>
>>>
>>> On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni 
>>> wrote:
>>>
>>>> >normalized to fundamental frequency or not
>>>> >normalized (so that no pitch detector is needed)?
>>>>
>>>> Yeah tonal signals open up a whole other can of worms. I'd like to
>>>> understand the broadband case first, with relatively simple spectral
>>>> statistics that correspond to the clever time-domain estimators discussed
>>>> so far in

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-18 Thread Ethan Duni
>Weighting a mean with log-magnitude can quickly lead to nonsense.

To use log magnitude you'd first have to normalize it to look like a
probability density (non-negative, sums to one). Meaning you add an offset
so that the lowest value is zero, and then normalize. Obviously that puts
restrictions on the class of signals it can handle - there can't be any
zeros on the unit circle (in practice we'd just apply a minimum threshold
at, say, -60dB or whatever) - and involves other complications (I'm not
sure there's a sensible time-domain interpretation).

>I apply Occam's razor when making decisions about what metrics correspond
most closely to nature

What is the natural phenomenon that we're trying to model here?

> log-magnitude is rarely sensible outside of perception modeling

But isn't the goal here to estimate the "brightness" of a signal?
Perceptual modelling is exactly why I bring log spectra up.

E



On Thu, Feb 18, 2016 at 7:42 AM, Evan Balster  wrote:

> Weighting a mean with log-magnitude can quickly lead to nonsense.  Trivial
> examples:
>
>- 0dB sine at 100hz, 6dB sine at 200hz --> log centroid is 200hz
>- -6dB sine at 100hz, 12dB sine at 200hz --> log centroid is 300hz (!)
>
> Sanfilippo's adaptive median finding technique is still applicable, but
> will produce the same result as a power or magnitude version.
>
> I apply Occam's razor when making decisions about what metrics correspond
> most closely to nature.  I choose the formula which is mathematically
> simplest while utilizing operations that make sense for the dimensionality
> of the operands and do not induce undue discontinuities.  Power is simpler
> to compute than magnitude, log-magnitude is rarely sensible outside of
> perception modeling, and (unlike zero-crossing techniques) a small change
> in the signal will always produce a proportionally small change in the
> metrics.
>
> At next opportunity I should post up some code describing how to compute
> higher moments with the differential brightness estimator.
>
> – Evan Balster
> creator of imitone <http://imitone.com>
>
> On Thu, Feb 18, 2016 at 1:00 AM, Ethan Duni  wrote:
>
>> >normalized to fundamental frequency or not
>> >normalized (so that no pitch detector is needed)?
>>
>> Yeah tonal signals open up a whole other can of worms. I'd like to
>> understand the broadband case first, with relatively simple spectral
>> statistics that correspond to the clever time-domain estimators discussed
>> so far in the thread.
>>
>> The ideas for time-domain approaches got me thinking about what the
>> optimal time-domain approach would look like. But of course it depends on
>> what definition of spectral centroid you use. For the mean of the power
>> spectrum it seems relatively straightforward to get some tractable
>> expressions - I guess this is the inspiration for the one based on an
>> approximate differentiator. But I suspect that mean of the log power
>> spectrum is more perceptually meaningful.
>>
>> E
>>
>> On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
>> r...@audioimagination.com> wrote:
>>
>>>
>>>
>>>  Original Message
>>> 
>>> Subject: Re: [music-dsp] Cheap spectral centroid recipe
>>> From: "Ethan Duni" 
>>> Date: Wed, February 17, 2016 11:21 pm
>>> To: "A discussion list for music-related DSP" <
>>> music-dsp@music.columbia.edu>
>>>
>>> --
>>>
>>> >>It's essentially computing a frequency median,
>>> >>rather than a frequency mean as is the case
>>> >>with the derivative-power technique described
>>> >> in my original approach.
>>> >
>>> > So I'm wondering, is there any consensus on what is the best measure of
>>> > central tendency for a music signal spectrum? There's the median vs the
>>> > mean (vs trimmed means, mode, etc). But what is the right domain in the
>>> > first place: magnitude spectrum, power spectrum, log power spectrum or
>>> ???
>>>
>>> normalized to fundamental frequency or not normalized (so that no pitch
>>> detector is needed)?  should identical waveforms at higher pitches have the
>>> same centroid parameter or higher centroids?
>>>
>>> spectral "brightness" is a multi-dimensional perceptual parameter.  you
>>> can have two tones with the same spectral centroid 

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Ethan Duni
>normalized to fundamental frequency or not
>normalized (so that no pitch detector is needed)?

Yeah tonal signals open up a whole other can of worms. I'd like to
understand the broadband case first, with relatively simple spectral
statistics that correspond to the clever time-domain estimators discussed
so far in the thread.

The ideas for time-domain approaches got me thinking about what the optimal
time-domain approach would look like. But of course it depends on what
definition of spectral centroid you use. For the mean of the power spectrum
it seems relatively straightforward to get some tractable expressions - I
guess this is the inspiration for the one based on an approximate
differentiator. But I suspect that mean of the log power spectrum is more
perceptually meaningful.

E

On Wed, Feb 17, 2016 at 8:34 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Cheap spectral centroid recipe
> From: "Ethan Duni" 
> Date: Wed, February 17, 2016 11:21 pm
> To: "A discussion list for music-related DSP" <
> music-dsp@music.columbia.edu>
> --
>
> >>It's essentially computing a frequency median,
> >>rather than a frequency mean as is the case
> >>with the derivative-power technique described
> >> in my original approach.
> >
> > So I'm wondering, is there any consensus on what is the best measure of
> > central tendency for a music signal spectrum? There's the median vs the
> > mean (vs trimmed means, mode, etc). But what is the right domain in the
> > first place: magnitude spectrum, power spectrum, log power spectrum or
> ???
>
> normalized to fundamental frequency or not normalized (so that no pitch
> detector is needed)?  should identical waveforms at higher pitches have the
> same centroid parameter or higher centroids?
>
> spectral "brightness" is a multi-dimensional perceptual parameter.  you
> can have two tones with the same spectral centroid (however consistent way
> you measure it) and sound very different if the "second moment" or
> "variance" is much different.
>
>
>
> --
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Cheap spectral centroid recipe

2016-02-17 Thread Ethan Duni
>It's essentially computing a frequency median,
>rather than a frequency mean as is the case
>with the derivative-power technique described
> in my original approach.

So I'm wondering, is there any consensus on what is the best measure of
central tendency for a music signal spectrum? There's the median vs the
mean (vs trimmed means, mode, etc). But what is the right domain in the
first place: magnitude spectrum, power spectrum, log power spectrum or ???

E

On Wed, Feb 17, 2016 at 2:40 PM, Evan Balster  wrote:

> Dario's adaptive approach is interesting.  It's essentially computing a
> frequency median, rather than a frequency mean as is the case with the
> derivative-power technique described in my original approach.
>
> Dario, I would suggest experimenting with zero-phase FIR filters if you're
> doing offline music analysis.  This would allow you to iteratively refine
> your median "in-place" for different points in time.
>
> – Evan Balster
> creator of imitone 
>
> On Wed, Feb 17, 2016 at 7:52 AM, STEFFAN DIEDRICHSEN 
> wrote:
>
>> This reminds me a bit of the voiced / unvoiced detection for vocoders or
>> level independent de-essers. It works quite well.
>>
>>
>> Steffan
>>
>>
>>
>> On 17.02.2016|KW7, at 13:08, Diemo Schwarz 
>> wrote:
>>
>> 1. Apply a first-difference filter to input signal A, yielding signal B.
>> 2. Square signal A, yielding signal AA; square signal B, yielding signal BB.
>> 3. Apply a low-pass filter of your choice to AA, yielding PA, and BB,
>>    yielding PB.
>> 4. Divide PB by PA, then multiply the result by the input signal's sampling
>>    rate divided by pi.
>>
>>
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Anyone using Chebyshev polynomials to approximate trigonometric functions in FPGA DSP

2016-01-20 Thread Ethan Duni
>given the same order N for the polynomials, whether your basis set are
> the Tchebyshevs, T_n(x), or the basis is just set of x^n, if you come up
>with a min/max optimal fit to your data, how can the two polynomials be
>different?

Right, if you do that you'll end up with equivalent answers (to within
numerical precision).

The idea is that you avoid the cost of doing the iterative algorithm to get
the optimal polynomial, and instead you simply truncate the Chebyshev
expansion to the desired order to get an approximation. For well-behaved
target functions it should be quite close. The justification is that the
Chebyshev polynomials each look like solutions to the minimax problem (they
oscillate between +-1 and the Nth polynomial has N+1 extrema), and the
error from truncating a series is approximately proportional to the last
retained term, so truncating a Chebyshev expansion should resemble the
optimal polynomial.
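
As a concrete toy example (my own, using numpy's polynomial module - not
anything from Jack's text, and the degrees and node counts are arbitrary):
take sin(x) on [-1, 1], truncate its Chebyshev expansion at degree 5, and
compare the max error against the degree-5 Taylor series.

import numpy as np
from numpy.polynomial import chebyshev as C, polynomial as P

# fit a high-order Chebyshev expansion of sin on [-1, 1] at Chebyshev nodes,
# then simply truncate it to degree 5
nodes = np.cos(np.pi * (np.arange(200) + 0.5) / 200)
c_full = C.chebfit(nodes, np.sin(nodes), 30)
c_trunc = c_full[:6]

# degree-5 Taylor series of sin for comparison: x - x^3/6 + x^5/120
taylor = [0.0, 1.0, 0.0, -1.0 / 6.0, 0.0, 1.0 / 120.0]

xs = np.linspace(-1.0, 1.0, 10001)
err_cheb = np.max(np.abs(np.sin(xs) - C.chebval(xs, c_trunc)))
err_tayl = np.max(np.abs(np.sin(xs) - P.polyval(xs, taylor)))

# the truncated Chebyshev expansion has the smaller, more evenly spread error
print(err_cheb, err_tayl)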

E

On Wed, Jan 20, 2016 at 10:32 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] Anyone using Chebyshev polynomials to approximate
> trigonometric functions in FPGA DSP
> From: "Ross Bencina" 
> Date: Wed, January 20, 2016 11:00 pm
> To: "A discussion list for music-related DSP" <
> music-dsp@music.columbia.edu>
> --
>
> > On 21/01/2016 2:36 PM, robert bristow-johnson wrote:
> > > i thought i understood Tchebyshev polynomials well. including their
> > > trig definitions (for |x|<1), but if what you're trying to do is
> > > generate a sinusoid from polynomials, i don't understand where the
> > > "Tchebyshev" (with or without the "T") comes in.
> > >
> > > is it min/max error (a.k.a. Tchebyshev norm)?
> >
> > Here's the relevant passage from p. 119:
> >
> > An article about sines and cosines wouldn’t be complete without some
> > mention of the use of Chebyshev polynomials. Basically, the theory of
> > Chebyshev polynomials allows the programmer to tweak the coefficients a
> > bit for a lower error bound overall. When I truncate a polynomial, I
> > typically get very small errors when x is small, and the errors increase
> > dramatically and exponentially outside a certain range near x = 0. The
> > Chebyshev polynomials, on the other hand, oscillate about zero with peak
> > deviations that are bounded and equal. Expressing the power series in
> > terms of Chebyshev polynomials allows you to trade off the small errors
> > near zero for far less error near the extremes of the argument range.
>
> okay, i guess.  i'll have to do a bit more internet research, because
> stated as such, this isn't exactly in my "Approximation Theory" book
> (Cheney).  the "Tchebyshev norm" for error is mentioned.  but the technique
> is the "Remes" (with an "s") algorithm.
>
>
>
> so, the fundamental thing that i don't understand, is that given the same
> order N for the polynomials, whether your basis set are the Tchebyshevs,
> T_n(x), or the basis is just set of x^n, if you come up with a min/max
> optimal fit to your data, how can the two polynomials be different?  i know
> that Parks-McClellan use the Tchebyshevs to convert an optimized polynomial
> to a sum of cosines (of which a symmetrical FIR filter comes forth), but i
> am curious how the technique that Jack shows you gets a different answer
> than the Remes alg, after the sum of Tchebyshevs is expanded and converted
> to a sum of x^n.
>
>
>
> --
>
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-16 Thread Ethan Duni
>> [..] the autocorrelation is
>>
>>  = (1/3)*(1-P)^|k|
>>
>> (I checked that with a little MC code before posting.) So the power
>> spectrum is (1/3)/(1 + (1-P)z^-1)

The FT of (1/3)*(1-P)^|k| is (1/3)*(1-Q^2)/(1-2Qcos(w) + Q^2), where Q =
(1-P).

Looks like you were thinking of the expression for the transform of the
one-sided decaying signal u[k]*(1-P)^k?
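
Quick Monte Carlo sketch of my own (plain numpy, parameters arbitrary; P here
is the probability of drawing a new sample, as in the OP's code), in case
anyone wants to check the autocorrelation numerically:

import numpy as np

rng = np.random.default_rng(0)
P = 0.2                      # probability of drawing a new sample
n = 1 << 19
y = rng.uniform(-1.0, 1.0, n)
new = rng.uniform(0.0, 1.0, n) <= P

x = np.empty(n)
x[0] = y[0]
for i in range(1, n):
    x[i] = y[i] if new[i] else x[i - 1]

Q = 1.0 - P
for k in range(6):
    r_hat = np.mean(x[: n - k] * x[k:])
    print(k, r_hat, (1.0 / 3.0) * Q ** k)   # sample vs. (1/3)*(1-P)^|k|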

E



On Mon, Nov 16, 2015 at 12:35 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
> > Am 16.11.2015 20:00, schrieb Martin Vicanek:
> >> [..] the autocorrelation is
> >>
> >>  = (1/3)*(1-P)^|k|
> >>
> >> (I checked that with a little MC code before posting.) So the power
> >> spectrum is (1/3)/(1 + (1-P)z^-1), i.e flat at DC and pink at higher
> >> frequencies. For reasonably small P the corner frequency is
> >>
> >> w_c = P/sqrt(1-P).
> >
> > Erratum: The power spectrum is brown, not pink. The fall-off is 12
> > dB/octave, not 6. Sorry, next time I'll use a larger envelope. ;-)
> >
>
> well, pink is -3 dB/octave and red (a.k.a. brown) is -6 dB/octave.  a
> roll-off of -12 dB/octave would be very brown.
>
>
>
> --
>
>
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-11 Thread Ethan Duni
>there is nothing *motivating* us to define Rx[k] = E{x[n] x[n+k]} except
>that we expect that expectation value (which is an average) to be the same
>as the other definition

Sure there is. That definition gets you everything you need to work out a
whole list of major results (for example, optimal linear predictors and how
they relate the the properties of the probabilistic model) without any
reference to statistics. You get all the insights into how everything fits
together, and then you move on to the extra wrinkles that arise when
dealing in statistical estimates of the quantities in question.

To relate this back to the OP: Ross gave us a probabilistic description of
a random process, from which we can work out the autocorrelation and psd
without any reference to ergodicity, or any signal realizations to compute
sample autocorrelations on.

>otherwise, given the probabilistic definition, why would we expect the
>Fourier Transform of Rx[k] = E{x[n] x[n+k]} to be the power spectrum?

In the modern context, that is the *definition* of the power spectral
density. The question of whether any particular statistic converges to it
is a separate question, considered after setting up the underlying
probabilistic models and relationships. You sort out what the underlying
quantity is, and only then do you consider how well a particular statistic
is able to approach it.

>by definition, **iff** it's ergodic, then the statistical estimate (by
>that you mean the average over time)
>converges to the probabilistic expectation value.  if it's *not* ergodic,
>then you don't know that they are the same.

Right, that's the definition of ergodicity. This seems phrased as a
disagreement or criticism but I'm not seeing the issue?

I certainly agree that autocorrelation and power spectral density are of
limited utility in the context of non-ergodic processes. And even more so
for non-stationary processes. But they're still well-defined (well, not so
much psd for non-WSS processes, but autocorrelation is perfectly general).

>what you call the "statistical estimate" is what i call the "primary
>definition".

Right.

>well, it's not just random processes that have autocorrelations.
>deterministic signals have them too.

Deterministic signals are a subset of random processes. The probabilistic
treatment is a generalization of the deterministic case. It's overkill if
you only want to deal with deterministic signals, but in the general case
it's all you need.

>your first communications class (the book i had was A.B. Carlson) started
>out with probability and stochastic processes???

My first communications class required as a prerequisite an entire course
on random processes. Which in turn required as a prerequisite yet another
entire course on basic probability and statistics. So there were two entire
courses of prob/stat/random processes pre-reqs before you get to day 1 of
communications systems.

Not sure what Carlson looked like in your time, but the modern editions do
a kind of weird parallel-track thing in this area. He does deterministic
signals first, and uses the same definitions as you. Then halfway through
he switches to random signals, and defines autocorrelation and psd directly
in terms of expected values as I describe. So it's "one definition for
deterministic case, another for random case," and then some paragraphs
bringing up the concept of ergodicity and how it bridges the two cases. The
way that the Carlson pedagogy would approach the OP - where we were given
an explicit description of a random signal - is in probabilistic terms
using definitions of acf and psd in terms of expected value.

E

On Wed, Nov 11, 2015 at 5:02 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>
>  Original Message ----
> Subject: Re: [music-dsp] how to derive spectrum of random sample-and-hold
> noise?
> From: "Ethan Duni" 
> Date: Wed, November 11, 2015 7:36 pm
> To: "robert bristow-johnson" 
> "A discussion list for music-related DSP" 
> --
>
> >>no. we need ergodicity to take a definition of autocorrelation, which we
> > are all familiar with:
> >
> >> Rx[k] = lim_{N->inf} 1/(2N+1) sum_{n=-N}^{+N} x[n] x[n+k]
> >
> >>and turn that into a probabilistic expression
> >
> >> Rx[k] = E{ x[n] x[n-k] }
> >
> >>which we can figger out with the joint p.d.f.
> >
> >
> > That's one way to do it. And if you're working only within the class of
> > stationary signals, it's a convenient way to set everything up. But it's
> > not necessary. There's nothing stopping you from simply defining
> > autoco

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-11 Thread Ethan Duni
>no.  we need ergodicity to take a definition of autocorrelation, which we
>are all familiar with:

>  Rx[k] = lim_{N->inf} 1/(2N+1) sum_{n=-N}^{+N} x[n] x[n+k]

>and turn that into a probabilistic expression

>  Rx[k] = E{ x[n] x[n-k] }

>which we can figger out with the joint p.d.f.


That's one way to do it. And if you're working only within the class of
stationary signals, it's a convenient way to set everything up. But it's
not necessary. There's nothing stopping you from simply defining
autocorrelation as r(n,k) = E(x[n]x[n-k]) at the outset. You then need (WS)
stationarity to make that a function of only the lag, and then ergodicity
to establish that the statistical estimate of autocorrelation (the sample
autocorrelation, as it is commonly known) will converge, but you can ignore
it if you are just dealing with probabilistic quantities and not worrying
about the statistics.


>i totally disagree.  i consider this to be fundamental (and it's how i
remember doing statistical communication theory back in grad school).


That was a common approach in classical signal processing
literature/curricula, since you're typically assuming stationarity at the
outset anyway. And this approach matches the historical development of the
concepts (people were computing sample autocorrelations before they squared
away the probabilistic interpretation). But this is kind of a historical
artifact that has fallen out of favor.


In modern statistical signal processing contexts (and the wider prob/stat
world) it's typically done the other way around: you define all the random
variables/processes up front, and then define autocorrelation as r(n,k) =
E(x[n]x[n-k]). Once you have that all sorted out, you turn to the question
of whether the corresponding statistics (the sample autocorrelation
function for example) converge, which is where the ergodicity stuff comes
in. The advantage to doing it this way is that you start with the most
general stuff requiring the least assumptions, and then build up more and
more specific results as you add assumptions. Assuming ergodicity at the
outset and defining everything in terms of the statistics produces the same
results for that case, but leaves you unable to say anything about
non-stationary signals, non-ergodic signals, etc.


Leafing through my college books, I can't find a single one that does it
the old way. They all start with definitions in the probability domain, and
then tackle the statistics after that's all set up.


E

On Wed, Nov 11, 2015 at 4:04 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] how to derive spectrum of random sample-and-hold
> noise?
> From: "Ethan Duni" 
> Date: Wed, November 11, 2015 5:57 pm
> To: "robert bristow-johnson" 
> "A discussion list for music-related DSP" 
> --
>
> >>all ergodic processes are stationary. (not necessarily the other way
> > around.)
> >
> > Ah, right, there is no constant mean for a time average to converge to if
> > the process isn't stationary in the first place. Been a while since I
> > worried about the details of ergodicity, mostly I have the intuitive
> notion
> > that there is no "unreachable" state or infinite memory (ala a fully
> > connected Markov chain).
> >
> >>the reason (besides forgetting stuff i learned 4 decades ago) i left out
> > "stationary" was that i was sorta conflating the two. i just wanted to be
> > able to turn the time-averages in the whatever norm (and L^2 is as good
> as
> > any) with probabilistic averages, which is the root meaning of the
> property
> > "ergodic". but probably "stationary" is a better (stronger) assumption to
> > make.
> >
> > Err, didn't we just establish that ergodicity is the stronger condition?
> >
>
> yeah, we did.
>
>
> > Also I don't think we need to worry about ergodicity in the first place.
> > The process in the OP is ergodic (for P not equal to 0) but we don't need
> > to use that anywhere.
>
>
>
> yeah, we do.
>
>
>
> > We can compute the autocorrelation directly without
> > any reference to time averages or other statistics.
>
>
>
> well, the original definition of autocorrelation *is* in reference to a
> time average.  same thing with the "A" in AMDF and ASDF (the latter is an
> upside-down version of autocorrelation).
>
>
>
> > We only need ergodicity
> > if we also want to estimate the autocorrelation/psd from example data.
>
> no.  we 

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-11 Thread Ethan Duni
>all ergodic processes are stationary.  (not necessarily the other way
>around.)

Ah, right, there is no constant mean for a time average to converge to if
the process isn't stationary in the first place. Been a while since I
worried about the details of ergodicity, mostly I have the intuitive notion
that there is no "unreachable" state or infinite memory (ala a fully
connected Markov chain).

>the reason (besides forgetting stuff i learned 4 decades ago) i left out
>"stationary" was that i was sorta conflating the two. i just wanted to be
>able to turn the time-averages in the whatever norm (and L^2 is as good as
>any) with probabilistic averages, which is the root meaning of the property
>"ergodic".  but probably "stationary" is a better (stronger) assumption to
>make.

Err, didn't we just establish that ergodicity is the stronger condition?

Also I don't think we need to worry about ergodicity in the first place.
The process in the OP is ergodic (for P not equal to 0) but we don't need
to use that anywhere. We can compute the autocorrelation directly without
any reference to time averages or other statistics. We only need ergodicity
if we also want to estimate the autocorrelation/psd from example data.
Which is important for making plots to verify that the answer is correct,
but not needed just to derive the autocorrelation/spectrum themselves.
Unless I missed something - where did this ergodicity assumption come from?

E

On Tue, Nov 10, 2015 at 6:33 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

>
>
>  Original Message 
> Subject: Re: [music-dsp] how to derive spectrum of random sample-and-hold
> noise?
> From: "Ethan Duni" 
> Date: Tue, November 10, 2015 8:58 pm
> To: "A discussion list for music-related DSP" <
> music-dsp@music.columbia.edu>
> --
>
> >>(Semi-)stationarity, I'd say. Ergodicity is a weaker condition, true,
> >>but it doesn't then really capture how your usual L^2 correlative
> >>measures truly work.
> >
> > I think we need both conditions, no?
>
> all ergodic processes are stationary.  (not necessarily the other way
> around.)
>
>
>
> the reason (besides forgetting stuff i learned 4 decades ago) i left out
> "stationary" was that i was sorta conflating the two.  i just wanted to be
> able to turn the time-averages in the whatever norm (and L^2 is as good as
> any) with probabilistic averages, which is the root meaning of the property
> "ergodic".  but probably "stationary" is a better (stronger) assumption to
> make.
>
>
>
>
> >
> >>Something like that, yes, except that you have to factor in aliasing.
> >
> > What aliasing? Isn't this process generated directly in the discrete time
> > domain?
>
> i'm thinking the same thing.  it's a discrete-time Markov process.  just
> model it and analyze it as such. assuming stationarity, we should be able
> to derive an autocorrelation function (and i think you guys did) and from
> that (and the DTFT) you have the (periodic) power spectrum.
>
> worry about frequency aliasing when you decide to output this to a DAC.
>
>
>
> --
>
>
>
>
> r b-j   r...@audioimagination.com
>
>
>
>
> "Imagination is more important than knowledge."
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-10 Thread Ethan Duni
>(Semi-)stationarity, I'd say. Ergodicity is a weaker condition, true,
>but it doesn't then really capture how your usual L^2 correlative
>measures truly work.

I think we need both conditions, no?

>Something like that, yes, except that you have to factor in aliasing.

What aliasing? Isn't this process generated directly in the discrete time
domain?

E

On Tue, Nov 10, 2015 at 5:43 PM, Sampo Syreeni  wrote:

> On 2015-11-04, robert bristow-johnson wrote:
>
> it is the correct way to characterize the spectra of random signals. the
>> spectra (PSD) is the Fourier Transform of autocorrelation and is scaled as
>> magnitude-squared.
>>
>
> The normal way to derive the spectrum of S/H-noise goes a bit around these
> kinds of considerations. It takes as given that we have a certain sampling
> frequency, which is the same as the S/H frequency. Under that assumption,
> sample-and-hold takes any value, and holds it constant for a sampling
> period. You can model that by a convolution with a rectangular function
> which takes the value one for one sampling period, and which is zero
> everywhere else. Then the rest of the modelling has to do with normal
> aliasing analysis.
>
> That's at least how they did it before the era of delta-sigma converters.
>
> with the assumption of ergodicity, [...]
>>
>
> (Semi-)stationarity, I'd say. Ergodicity is a weaker condition, true, but
> it doesn't then really capture how your usual L^2 correlative measures
> truly work.
>
> i have a sneaky suspicion that this Markov process is gonna be something
>> like pink noise.
>>
>
> Something like that, yes, except that you have to factor in aliasing.
>
>
> r[n] = uniform_random(0, 1)
> if (r[n] <= P)
>    x[n] = uniform_random(-1, 1);
> else
>    x[n] = x[n-1];
>
>
> If P==1, that give uniform white noise. If P==0, it yields a constant. If
> P==.5, half of the time it holds the previous value.
>
> In a continuous time Markov process you'd get something like pink noise,
> yes. But in a discrete time process you have to factor in aliasing. It goes
> pretty bad, pretty fast.
>
> --
> Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front
> +358-40-3255353, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-05 Thread Ethan Duni
>since the whole signal has infinite power, the units really
>need to be power per unit frequency per unit time, which
>(confusingly) is the same thing as power.

I think you mean to say "infinite energy" and then "energy per unit
frequency per unit time," no?

E

On Thu, Nov 5, 2015 at 8:21 AM, Ethan Fenn  wrote:

> Let's see if I got this right: each bin contains the power for a frequency
>> interval of 2pi/N radians. If I multiply each bin's power by N/2pi I should
>> get power values in units of power/radian.
>>
>
> Sounds reasonable to me, but I'm not sure I've got it right so who knows!
>
> I think I was slightly off when I said that the units of psd are power per
> unit frequency -- since the whole signal has infinite power, the units
> really need to be power per unit frequency per unit time, which
> (confusingly) is the same thing as power. This could be another reason why
> some special scaling is needed as compared to a finite-length FFT.
>
> I'm not sure whether the FFT values should be fringing above the psd line
>> or not:
>
>
> The psd line is the expected value, so some FFT values should be above it
> and some below. You could try averaging the squared spectra from a bunch of
> separate FFT trials and see if that makes things coverge toward the line.
>
> -Ethan
>
>
> On Thu, Nov 5, 2015 at 3:48 PM, Ross Bencina 
> wrote:
>
>> Thanks Ethan,
>>
>> I think that I have it working. It would be great is someone could check
>> the scaling though. I'm not sure whether the FFT values should be fringing
>> above the psd line or not:
>>
>> https://www.dropbox.com/s/txc0txhxqr1t274/SH1_2.png?dl=0
>>
>> I removed the hamming window, which was causing scaling problems. The FFT
>> output is now scaled so that the the sum of power over all bins matches the
>> power of the time domain signal:
>>
>>
>> https://gist.github.com/RossBencina/a15a696adf0232c73a55/bdefe5ab0b5c218a966bd6a04d9d998a708faf99
>>
>>
>> On 6/11/2015 12:02 AM, Ethan Fenn wrote:
>>
>>> And is psd[w] in exactly the same units as the magnitude squared
>>> spectrum of x[n] (i.e. |ft(x)|^2)?
>>>
>>>
>>> More or less, with the proviso that you have to be careful whether
>>> you're talking about power per unit frequency which the psd will give
>>> you, and power per frequency bin which is often the correct
>>> interpretation of magnitude squared FFT results -- the latter depending
>>> on the FFT scaling conventions used.
>>>
>>
>> Let's see if I got this right: each bin contains the power for a
>> frequency interval of 2pi/N radians. If I multiply each bin's power by
>> N/2pi I should get power values in units of power/radian.
>>
>>
>> The psd makes no reference to any transform length, since it's based on
>>> the statistical properties of the process. So I think it would be wrong
>>> (or at least inexact) to have a scale related to N applied to it. If you
>>> want the magnitude squared results of an FFT to match the psd, it seems
>>> more correct to scale the FFT and try a few different N's to see what
>>> factor of N will give consistent results.
>>>
>>
>> That makes sense.
>>
>>
>> As to the exact scale that should be applied... I think there should be
>>> a 1/3 in the expression for psd, because E[x^2]=1/3 where x is uniform
>>> in [-1,1]. Aside from that, there might be a factor of 2pi depending on
>>> whether we're talking about power per linear or angular frequency. And
>>> there could be others I'm not thinking of maybe someone else can
>>> shed more light here.
>>>
>>
>> I multiplied the psd by 1/3 and as you can see from the graph it looks as
>> though the FFT and the psd are more-or-less aligned.
>>
>>
>> Hope that's somewhat helpful!
>>>
>>
>> Very clear thanks,
>>
>> Ross.
>>
>>
>>
>> -Ethan
>>>
>>>
>>>
>>>
>>> On Thu, Nov 5, 2015 at 11:00 AM, Ross Bencina
>>> mailto:rossb-li...@audiomulch.com>> wrote:
>>>
>>> Thanks Ethan(s),
>>>
>>> I was able to follow your derivation. A few questions:
>>>
>>> On 4/11/2015 7:07 PM, Ethan Duni wrote:
>>>
>>> It's pretty straightforward to derive the autocorrelation and
>>> psd for
>>> this one. Let me restate it with some convenient not

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-05 Thread Ethan Duni
>So for y[n] ~U(-1,1) I should multiply psd[w] by what exactly?

The variance of y[n]. For U(-1,1) this is 1/3. From your subsequent post it
sounds like you got this ironed out?

>What is the method that you used to go from ac[k] to psd[w]?
>Robert mentioned that psd was the Fourier transform of ac.
>Is this particular case a standard transform that you knew off the top of
>your head?

Yeah it's just the DTFT of the autocorrelation function. You can find that
one in a suitably complete table of transform pairs, or just evaluate it
directly by using the geometric series formula and a bit of manipulation as
Ethan F described (that's what I did, for old time's sake).
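
Spelling out the geometric series step for ac[k] = P^|k| (with w in
radians/sample):

  sum_{k=-inf}^{+inf} P^|k| e^{-jwk}
    = 1 + 2*Re{ sum_{k=1}^{inf} (P e^{-jw})^k }
    = 1 + 2*Re{ P e^{-jw} / (1 - P e^{-jw}) }
    = (1 - P^2) / (1 - 2*P*cos(w) + P^2)

and then scale by the variance of y[n] as above.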

Are you comparing this to a big FFT of a long sequence of samples from this
process (i.e., periodogram)? The basic shape should be visible there, but
with quite a lot of noise (because the frequency resolution increases at
the same rate as the number of samples, you have a constant number of
samples per frequency bin so that error never converges to zero). To really
see the effect you can use a more sophisticated spectral density estimate
like Welch's method. Basically you chop the signal into chunks (with length
determined by your frequency resolution) and then average the FFT
magnitudes of those. That way you have a constant frequency resolution and
increasing samples per bin, so the result will converge to the underlying
PSD. There are more details with windowing and overlap and scaling, but
that's the basic idea.

https://en.wikipedia.org/wiki/Spectral_density_estimation
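
If it helps, here's a rough numpy/scipy sketch of that comparison (my own toy
code, assuming scipy is available; the segment length, signal length and P
are all arbitrary choices, with P being the probability of holding, as in my
restatement):

import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)
P = 0.9                      # probability of holding the previous sample
n = 1 << 17
y = rng.uniform(-1.0, 1.0, n)
hold = rng.uniform(0.0, 1.0, n) < P

x = np.empty(n)
x[0] = y[0]
for i in range(1, n):
    x[i] = x[i - 1] if hold[i] else y[i]

# Welch estimate (one-sided, fs = 1) versus the closed form
# (1/3)*(1 - P^2)/(1 - 2*P*cos(w) + P^2), with a factor 2 for one-sidedness
f, pxx = welch(x, fs=1.0, nperseg=1024)
w = 2.0 * np.pi * f
psd = (1.0 / 3.0) * (1.0 - P ** 2) / (1.0 - 2.0 * P * np.cos(w) + P ** 2)

print(pxx[1:8] / (2.0 * psd[1:8]))   # hovers around 1.0, up to estimation noise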

E



On Thu, Nov 5, 2015 at 2:00 AM, Ross Bencina 
wrote:

> Thanks Ethan(s),
>
> I was able to follow your derivation. A few questions:
>
> On 4/11/2015 7:07 PM, Ethan Duni wrote:
>
>> It's pretty straightforward to derive the autocorrelation and psd for
>> this one. Let me restate it with some convenient notation. Let's say
>> there are a parameter P in (0,1) and 3 random processes:
>> r[n] i.i.d. ~U(0,1)
>> y[n] i.i.d. ~(some distribution with at least first and second moments
>> finite)
>> x[n] = (r[n] < P) ? x[n-1] : y[n]
>> Note that I've switched the probability of holding to P from (1-P), and
>> that the signal being sampled-and-held can have an arbitrary (if well
>> behaved) distribution. Let's also assume wlog that E[y[n]y[n]] = 1
>> (Scale the final results by the power of whatever distribution you
>> prefer).
>>
>
> So for y[n] ~U(-1,1) I should multiply psd[w] by what exactly?
>
>
> Now, the autocorrelation function is ac[k] = E[x[n]x[n-k]]. Let's work
>> through the first few values:
>> k=0:
>> ac[0] = E[x[n]x[n]] = E[y[n]y[n]] = 1
>> k=1:
>> ac[1] = E[x[n]x[n-1]] = P*E[x[n-1]x[n-1]] + (1-P)*E[x[n-1]y[n]] =
>> P*E[y[n]y[n]] = P
>>
>> The idea is that P of the time, x[n] = x[n-1] (resulting in the first
>> term) and (1-P) of the time, x[n] is a new, uncorrelated sample from
>> y[n]. So we're left with P times the power (assumed to be 1 above).
>>
>> k=2:
>> ac[2] = P*P*E[x[n-2]x[n-2]] = P^2
>>
>> Again, we decompose the expected value into the case where x[n] = x[n-2]
>> - this only happens if both of the previous samples were held
>> (probability P^2). The rest of the time - if there was at least one
>> sample event - we have uncorrelated variables and the term drops out.
>>
>> So, by induction and symmetry, we conclude:
>>
> >
>
>> ac[k] = P^abs(k)
>>
> >
>
>> And so the psd is given by:
>>
>> psd[w] = (1 - P^2)/(1 - 2Pcos(w) + P^2)
>>
>
> What is the method that you used to go from ac[k] to psd[w]? Robert
> mentioned that psd was the Fourier transform of ac. Is this particular case
> a standard transform that you knew off the top of your head?
>
> And is psd[w] in exactly the same units as the magnitude squared spectrum
> of x[n] (i.e. |ft(x)|^2)?
>
>
> Unless I've screwed up somewhere?
>>
>
> A quick simulation suggests that it might be okay:
>
> https://www.dropbox.com/home/Public?preview=SH1_1.png
>
>
> But I don't seem to have the scale factors correct. The psd has
> significantly smaller magnitude than the fft.
>
> Here's the numpy code I used (also pasted below).
>
> https://gist.github.com/RossBencina/a15a696adf0232c73a55
>
> The FFT output is scaled by (2.0/N) prior to computing the magnitude
> squared spectrum.
>
> I have also scaled the PSD by (2.0/N). That doesn't seem quite right to me
> for two reasons: (1) the scale factor is applied to the linear FFT, but to
> the mag squared PSD and (2) I don't have the 1/3 factor anywhere.
>
> Any thoughts on what I'm doing wrong?
>
>

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-04 Thread Ethan Duni
Yep that's the same approach I just posted :]

E

On Tue, Nov 3, 2015 at 11:48 PM, Ethan Fenn  wrote:

> How about this:
>
> For a lag of t, the probability that no new samples have been accepted is
> (1-P)^|t|.
>
> So the autocorrelation should be:
>
> AF(t) = E[x(n)x(n+t)] = (1-P)^|t| * E[x(n)^2] + (1 -
> (1-P)^|t|)*E[x(n)*x_new]
>
> The second term covers the case that a new sample has popped up, so x(n)
> and x(n+t) are uncorrelated. So, this term vanishes. The first term is
> (1/3)*(1-P)^|t|, so I reckon:
>
> AF(t) = (1/3)*(1-P)^|t|
>
> Does that make sense?
>
> -Ethan
>
>
>
>
>
> On Wed, Nov 4, 2015 at 8:21 AM, robert bristow-johnson <
> r...@audioimagination.com> wrote:
>
>>
>>
>>  Original Message 
>> Subject: Re: [music-dsp] how to derive spectrum of random sample-and-hold
>> noise?
>> From: "Ross Bencina" 
>> Date: Wed, November 4, 2015 12:22 am
>> To: r...@audioimagination.com
>> music-dsp@music.columbia.edu
>> --
>>
>>
>>
>> with mods
>>
>>
>> > Using ASDF instead of autocorrelation:
>> >
>> > let n be an arbitrary time index
>> > let t be the ASDF lag time of interest
>> >
>> > ASDF[t] = (x[n] - x[n-t])^2
>> >
>> > there are two cases:
>> >
>> > case 1, (holding): x[n-t] == x[n]
>>
>> this has probability of P^|t|
>>
>>
>>
>>
>> > case 2, (not holding) x[n-t] == uniform_random(-1, 1)
>>
>> this has probability of 1 - P^|t|
>>
>>
>> >
>> > In case 1, ASDF[t] = 0
>> > In case 2, ASDF[t] = (1/3)^2  (i think)
>>
>>
>>
>> so maybe it's
>>
>>
>>
>> ASDF[t] = 0 * P^|t|  +  (1/3)^2 * (1 - P^|t|)
>>
>>
>>
>> now the autocorrelation function (AF) is related to the ASDF as
>>
>>
>>
>> AF[t] =  mean{ x[n] * x[n-t] }
>>
>> AF[t] =  mean{ (x[n])^2 }  - (1/2)*mean{ (x[n] - x[n-t])^2 }
>>
>>
>> AF[t] =  mean{ (x[n])^2 }  - (1/2)*ASDF[t]
>>
>>
>>
>> AF[t]  =  (1/3)  -  (1/2) * (1/3)^2 * (1 - P^|t|)
>>
>>
>>
>> this doesn't quite look right to me.  somehow i was expecting  AF[t] to
>> go to zero as t goes to infinity.
>>
>>
>>
>>
>> > To get the limit of ASDF[t], weight the values of the two cases by the
>> > probability of each case case. (Which seems like a textbook waiting-time
>> > problem, but will require me to return to my textbook).
>> >
>> > Then I just need to convert the ASDF to PSD somehow.
>>
>>
>>
>>   ASDF[t] = 2*AF[0] - 2*AF[t]
>>
>>
>>
>> or
>>
>>
>>
>>   AF[t]  =  AF[0]  - (1/2)*ASDF[t]
>>
>>
>>
>>
>>
>> PSD = Fourier_Transform{ AF[t] }
>>
>>
>> > Does that seem like a reasonable approach?
>>
>>
>> it's the approach i am struggling with.   somehow, i don't like the AF i
>> get.
>>
>>
>>
>> --
>>
>>
>>
>>
>> r b-j   r...@audioimagination.com
>>
>>
>>
>>
>> "Imagination is more important than knowledge."
>>
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-04 Thread Ethan Duni
It's pretty straightforward to derive the autocorrelation and psd for this
one. Let me restate it with some convenient notation. Let's say there are a
parameter P in (0,1) and 3 random processes:
r[n] i.i.d. ~U(0,1)
y[n] i.i.d. ~(some distribution with at least first and second moments
finite)
x[n] = (r[n] < P) ? x[n-1] : y[n]
wrote:

> On 4/11/2015 5:26 AM, Ethan Duni wrote:
>
>> Do you mean the literal Fourier spectrum of some realization of this
>> process, or the power spectral density? I don't think you're going to
>> get a closed-form expression for the former (it has a random component).
>>
>
> I am interested in the long-term magnitude spectrum. I had assumed
> (wrongly?) that in the limit (over an infinite length series), that the
> fourier integral would converge. And modeling in that way would be
> (slightly) more familiar to me. However, If autocorrelation or psd is the
> better way to characterize the spectra of random signals then I should
> learn about that.
>
>
> For the latter what you need to do is work out an expression for the
>> autocorrelation function of the process.
>>
> >
>
>> As far as the autocorrelation function goes you can get some hints by
>> thinking about what happens for different values of P. For P=1 you get
>> an IID uniform noise process, which will have autocorrelation equal to a
>> kronecker delta, and so psd equal to 1. For P=0 you get a constant
>> signal. If that's the zero signal, then the autocorrelation and psd are
>> both zero. If it's a non-zero signal (depends on your initial condition
>> at n=-inf) then the autocorrelation is a constant and the psd is a dirac
>> delta.Those are the extreme cases. For P in the middle, you have a
>> piecewise-constant signal where the length of each segment is given by a
>> stopping time criterion on the uniform process (and P). If you grind
>> through the math, you should end up with an autocorrelation that decays
>> down to zero, with a rate of decay related to P (the larger P, the
>> longer the decay). The FFT of that will have a similar shape, but with
>> the rate of decay inversely proportional to P (ala Heisenberg
>> Uncertainty principle).
>>
>> So in broad strokes, what you should see is a lowpass spectrum
>> parameterized by P - for P very small, you approach a flat spectrum, and
>> for P close to 1 you approach a spectrum that's all DC.
>>
>> Deriving the exact expression for the autocorrelation/spectrum is left
>> as an exercise for the reader :]
>>
>
> Ok, thanks. That gives me a place to start looking.
>
> Ross.
>
>
>
> E
>>
>> On Tue, Nov 3, 2015 at 9:42 AM, Ross Bencina > <mailto:rossb-li...@audiomulch.com>> wrote:
>>
>> Hi Everyone,
>>
>> Suppose that I generate a time series x[n] as follows:
>>
>>  >>>
>> P is a constant value between 0 and 1
>>
>> At each time step n (n is an integer):
>>
>> r[n] = uniform_random(0, 1)
>> x[n] = (r[n] <= P) ? uniform_random(-1, 1) : x[n-1]
>>
>> Where "(a) ? b : c" is the C ternary operator that takes on the
>> value b if a is true, and c otherwise.
>> <<<
>>
>> What would be a good way to derive a closed-form expression for the
>> spectrum of x? (Assuming that the series is infinite.)
>>
>>
>> I'm guessing that the answer is an integral over the spectra of
>> shifted step functions, but I don't know how to deal with the random
>> magnitude of each step, or the random onsets. Please assume that I
>> barely know how to take the Fourier transform of a step function.
>>
>> Maybe the spectrum of a train of randomly spaced, random amplitude
>> pulses is easier to model (i.e. w[n] = x[n] - x[n-1]). Either way,
>> any hints would be appreciated.
>>
>> Thanks in advance,
>>
>> Ross.
>>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-03 Thread Ethan Duni
Wait, just realized I wrote that last part backwards. It should be:

So in broad strokes, what you should see is a lowpass spectrum
parameterized by P - for P very small, you approach a DC spectrum, and for
P close to 1 you approach a spectrum that's flat.

On Tue, Nov 3, 2015 at 10:26 AM, Ethan Duni  wrote:

> Do you mean the literal Fourier spectrum of some realization of this
> process, or the power spectral density? I don't think you're going to get a
> closed-form expression for the former (it has a random component). For the
> latter what you need to do is work out an expression for the
> autocorrelation function of the process.
>
> As far as the autocorrelation function goes you can get some hints by
> thinking about what happens for different values of P. For P=1 you get an
> IID uniform noise process, which will have autocorrelation equal to a
> kronecker delta, and so psd equal to 1. For P=0 you get a constant signal.
> If that's the zero signal, then the autocorrelation and psd are both zero.
> If it's a non-zero signal (depends on your initial condition at n=-inf)
> then the autocorrelation is a constant and the psd is a dirac delta. Those
> are the extreme cases. For P in the middle, you have a piecewise-constant
> signal where the length of each segment is given by a stopping time
> criterion on the uniform process (and P). If you grind through the math,
> you should end up with an autocorrelation that decays down to zero, with a
> rate of decay related to P (the larger P, the longer the decay). The FFT of
> that will have a similar shape, but with the rate of decay inversely
> proportional to P (ala Heisenberg Uncertainty principle).
>
> So in broad strokes, what you should see is a lowpass spectrum
> parameterized by P - for P very small, you approach a flat spectrum, and
> for P close to 1 you approach a spectrum that's all DC.
>
> Deriving the exact expression for the autocorrelation/spectrum is left as
> an exercise for the reader :]
>
> E
>
> On Tue, Nov 3, 2015 at 9:42 AM, Ross Bencina 
> wrote:
>
>> Hi Everyone,
>>
>> Suppose that I generate a time series x[n] as follows:
>>
>> >>>
>> P is a constant value between 0 and 1
>>
>> At each time step n (n is an integer):
>>
>> r[n] = uniform_random(0, 1)
>> x[n] = (r[n] <= P) ? uniform_random(-1, 1) : x[n-1]
>>
>> Where "(a) ? b : c" is the C ternary operator that takes on the value b
>> if a is true, and c otherwise.
>> <<<
>>
>> What would be a good way to derive a closed-form expression for the
>> spectrum of x? (Assuming that the series is infinite.)
>>
>>
>> I'm guessing that the answer is an integral over the spectra of shifted
>> step functions, but I don't know how to deal with the random magnitude of
>> each step, or the random onsets. Please assume that I barely know how to
>> take the Fourier transform of a step function.
>>
>> Maybe the spectrum of a train of randomly spaced, random amplitude pulses
>> is easier to model (i.e. w[n] = x[n] - x[n-1]). Either way, any hints would
>> be appreciated.
>>
>> Thanks in advance,
>>
>> Ross.
>> ___
>> dupswapdrop: music-dsp mailing list
>> music-dsp@music.columbia.edu
>> https://lists.columbia.edu/mailman/listinfo/music-dsp
>>
>
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] how to derive spectrum of random sample-and-hold noise?

2015-11-03 Thread Ethan Duni
Do you mean the literal Fourier spectrum of some realization of this
process, or the power spectral density? I don't think you're going to get a
closed-form expression for the former (it has a random component). For the
latter what you need to do is work out an expression for the
autocorrelation function of the process.

As far as the autocorrelation function goes you can get some hints by
thinking about what happens for different values of P. For P=1 you get an
IID uniform noise process, which will have autocorrelation equal to a
kronecker delta, and so psd equal to 1. For P=0 you get a constant signal.
If that's the zero signal, then the autocorrelation and psd are both zero.
If it's a non-zero signal (depends on your initial condition at n=-inf)
then the autocorrelation is a constant and the psd is a dirac delta. Those
are the extreme cases. For P in the middle, you have a piecewise-constant
signal where the length of each segment is given by a stopping time
criterion on the uniform process (and P). If you grind through the math,
you should end up with an autocorrelation that decays down to zero, with a
rate of decay related to P (the larger P, the longer the decay). The FFT of
that will have a similar shape, but with the rate of decay inversely
proportional to P (ala Heisenberg Uncertainty principle).

So in broad strokes, what you should see is a lowpass spectrum
parameterized by P - for P very small, you approach a flat spectrum, and
for P close to 1 you approach a spectrum that's all DC.

Deriving the exact expression for the autocorrelation/spectrum is left as
an exercise for the reader :]

E

On Tue, Nov 3, 2015 at 9:42 AM, Ross Bencina 
wrote:

> Hi Everyone,
>
> Suppose that I generate a time series x[n] as follows:
>
> >>>
> P is a constant value between 0 and 1
>
> At each time step n (n is an integer):
>
> r[n] = uniform_random(0, 1)
> x[n] = (r[n] <= P) ? uniform_random(-1, 1) : x[n-1]
>
> Where "(a) ? b : c" is the C ternary operator that takes on the value b if
> a is true, and c otherwise.
> <<<
>
> What would be a good way to derive a closed-form expression for the
> spectrum of x? (Assuming that the series is infinite.)
>
>
> I'm guessing that the answer is an integral over the spectra of shifted
> step functions, but I don't know how to deal with the random magnitude of
> each step, or the random onsets. Please assume that I barely know how to
> take the Fourier transform of a step function.
>
> Maybe the spectrum of a train of randomly spaced, random amplitude pulses
> is easier to model (i.e. w[n] = x[n] - x[n-1]). Either way, any hints would
> be appreciated.
>
> Thanks in advance,
>
> Ross.
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
dupswapdrop: music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Fourier and its negative exponent

2015-10-05 Thread Ethan Duni
>the reason why it's merely convention is that if the minus sign was
swapped
>between the forward and inverse Fourier transform in all of the literature
and
>practice, all of the theorems would work the same as they do now.

Note that in some other areas they do actually use other conventions. It's
been a while since I've looked at it but IIRC in areas like geophysics they
have the signs swapped around.

Also there are different conventions about where to put the normalization
constants (on the analysis side, or on the synthesis side, or take the
square root and include it on both). Those make a bit more difference for
some of the theorems like Parseval, but again it all works the same you
just gotta be careful to be consistent.
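
As a throwaway numerical sanity check of the "it's just convention" point (a
numpy sketch, nothing more): the round trip recovers the signal with either
sign choice, as long as the forward and inverse kernels are conjugates and
the 1/N lands somewhere.

import numpy as np

x = np.random.default_rng(1).standard_normal(8) + 0j
n = np.arange(8)
W_minus = np.exp(-2j * np.pi * np.outer(n, n) / 8)   # "usual" forward kernel
W_plus = np.conj(W_minus)                            # swapped-sign kernel

xa = (W_plus @ (W_minus @ x)) / 8   # minus in forward, plus (and 1/N) in inverse
xb = (W_minus @ (W_plus @ x)) / 8   # plus in forward, minus (and 1/N) in inverse
print(np.allclose(xa, x), np.allclose(xb, x))        # True True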

E

On Mon, Oct 5, 2015 at 2:52 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 10/5/15 5:40 PM, robert bristow-johnson wrote:
>
>>
>> about an hour ago i posted to this list and it hasn't shown up on my end.
>>
>>
> okay, something got lost in the aether. i am reposting this:
>
>
> On 10/5/15 9:28 AM, Stijn Frishert wrote:
>
>> In trying to get to grips with the discrete Fourier transform, I have a
>> question about the minus sign in the exponent of the complex sinusoids you
>> correlate with doing the transform.
>>
>> The inverse transform doesn’t contain this negation and a quick search on
>> the internet tells me Fourier analysis and synthesis work as long as one of
>> the formulas contains that minus and the other one doesn’t.
>>
>> So: why? If the bins in the resulting spectrum represent how much of a
>> sinusoid was present in the original signal (cross-correlation), I would
>> expect synthesis to use these exact same sinusoids to get back to the
>> original signal. Instead it uses their inverse! How can the resulting
>> signal not be 180 phase shifted?
>>
>> This may be text-book dsp theory, but I’ve looked and searched and
>> everywhere seems to skip over it as if it’s self-evident.
>>
>
>
> hi Stijn,
>
> so just to confuse things further, i'll add my 2 cents that i had always
> thought made it less confusing. (but people have disabused me of that
> notion.)
>
> first of all, it's a question oft asked in DSP circles, like the USENET
> comp.dsp or, more recently at Stack Exchange (not a bad thing to sign up
> and participate in):
>
>
> http://dsp.stackexchange.com/questions/19004/why-is-a-negative-exponent-present-in-fourier-and-laplace-transform
>
>
>
> in my opinion, the answer to your question is one word: "convention".
>
> the reason why it's merely convention is that if the minus sign was
> swapped between the forward and inverse Fourier transform in all of the
> literature and practice, all of the theorems would work the same as they do
> now.
>
> the reason for that is that the two imaginary numbers +j and -j are,
> qualitatively, *exactly* the same even though they are negatives of each
> other and are not zero. (the same cannot be said for +1 and -1, which are
> qualitatively different.) both +j and -j are purely imaginary and have
> equal claim to squaring to become -1.
>
> so, by convention, they chose +j in the inverse Fourier Transform and -j
> had to come out in the forward Fourier transform. they could have chosen -j
> for the inverse F.T., but then they would need +j in the forward F.T.
>
> so why did they do that? in signal processing, where we are as comfortable
> with negative frequency as we are with positive frequency it's because if
> you want to represent a single (complex) sinusoid at an angular frequency
> of omega_0 with an amplitude of 1 and phase offset of zero, it is:
>
>
> e^(j*omega_0*t)
>
> so, when we represent a periodic signal with fundamental frequency of
> omega_0>0 (that is, the period is 2*pi/omega_0), it is:
>
> +inf
> x(t) = SUM X[k] * e^(j*k*omega_0*t)
> k=-inf
>
>
> each frequency component is at frequency k*omega_0. for positive
> frequencies, k>0, for negative, k<0.
>
>
> to extract the coefficient X[m], we must multiply x(t) by
> e^(-j*m*omega_0*t) to cancel the factor e^(j*m*omega_0*t) in that term
> (when k=m) in that summation, and then we average. the m-th term is now DC
> and averaging will get X[m]. all of the other terms are AC and averaging
> will eventually make those terms go to zero. so only X[m] is left.
>
> that is conceptually the basic way in which Fourier series or Fourier
> transform works. (discrete or continuous.)
>
>
> but, we could do the same thing all over again, this time replace every
> occurrence of +j with -j and every -j with +j, and the same results will
> come out. the choice of +j in the above two expressions is one of
> convention.
>
>
>
>
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
> ___
> dupswapdrop: music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___

Re: [music-dsp] warts in JUCE

2015-09-04 Thread Ethan Duni
I don't have a dog in any JUCE fight, but excluding the sample rate from an
AudioSampleBuffer type object seems like good design to me. The reason is
that system parameters that depend on the sample rate tend to be things
like buffer sizes, and so changing them is typically not real-time thread
safe. So you want to sequester all of the dependencies on sample rate into
the init time functions (where your signal processing objects can keep a
record of it as private variables if needed), and then make a point of
excluding it from the audio buffer objects that get passed around to the
runtime functions. Runtime functions generally shouldn't need to be
reminded of the sample rate at every call, the objects in question learned
the sample rate at init time and did all the appropriate memory allocation
then.

In the special cases where it is appropriate to inform a runtime function
of the sample rate, you want that to show up in the interface explicitly so
that it is clear. If you happen to be working in one of those special cases
and find the extra arguments cumbersome, you can always just write a
convenience wrapper class.
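
To make the init-time/runtime split concrete, here's a toy sketch (plain
Python, made-up names, nothing to do with JUCE's actual classes): the sample
rate is consumed once at construction, and the per-block call never sees it.

import math

class OnePoleLowpass:
    def __init__(self, sample_rate, cutoff_hz):
        # the only place the sample rate is needed: turning Hz into a coefficient
        self.a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
        self.z = 0.0

    def process(self, block):
        # runtime path: takes only audio, like a lean buffer object would carry
        out = []
        for x in block:
            self.z = (1.0 - self.a) * x + self.a * self.z
            out.append(self.z)
        return out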

>but the change will *not* have *any* salient side-effects on *existing*
code.  that's why it's backward compatible.

The kicker is that making a change backward compatible in that way also
means that it isn't truly "forward compatible." That is, since there is
zero impact on legacy code, nobody can ever trust anybody's code to respect
this new sample rate member variable. And so nobody can really make use of
it - they have to continue to assume that it isn't supported and jump
through all of the same hoops as before, only now there's one more bit of
cruft to complicate the picture. It's only truly useful in self-contained
projects, in which case again why not just roll your own wrapper class for
convenience.

>i truly believe you and i have different "Gospels of OOP" or differing
fundamental principles of what the meaning of modular design is

Being myself more of a plain-C guy historically who has been doing more C++
lately, one thing that I have recently learned about is the basic
philosophical differences over these design issues. There's one school of
thought that says objects are supposed to model real-world entities, so for
example you should stick all of your audio data in a single audio object
and proceed from there. And of course sample rate belongs, since it's
needed to relate the object to the physical audio. Another school of
thought views them more in terms of abstract division of responsibilities,
and then works to pare objects down to only what's needed. So an object
who's job is to convey blocks of audio for realtime processing excludes the
sample rate, since that is generally not allowed to change during realtime
operation.

It's worth noting that software engineers with lots of experience dealing
with large projects tend to favor the latter view.

Along those lines:
>if AudioSampleBuffer did *not* contain the numChannels or numSamples, and
you had to pass that data around all the time along with the
AudioSampleBuffer, just so you could do something with it, i think you
would conclude that something is missing and should be included in the
object definition.  i don't grok why anyone would not come to the same
conclusion regarding sampleRate.

One distinction is that numChannels and numSamples are both required just
to make sense of the buffer contents as digital data. SampleRate is needed
to figure out how that digital data relates back to a hypothetical piece of
analog data. The other is that numSamples generally needs to vary at
runtime, but sampleRate is supposed to be init-time only.

Again, whether anyone finds that reasoning compelling depends on where they
come down on the basic philosophy of OOP.

Also, what Chris just posted while I was typing this :P

E

On Fri, Sep 4, 2015 at 12:38 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 9/4/15 12:27 PM, mdsp wrote:
>
>> I've tried to make my point but I think there's still quite a lot of
>> misunderstanding.
>>
>
> sometimes it's misunderstanding.
>
> again, i'm not a C++ expert (i said this on my old email to Jules that i
> posted here), but *am* an expert in nearly all particulars in C and also
> know what to expect as it is compiled into machine code.  i know about
> processing in real time and in non-real time.
>
> i originally offered the suggestion in the effort to serve the purposes of
> modular programming, of which the principles, as i understand it are:
>
>   https://en.wikipedia.org/wiki/Single_responsibility_principle
>
>   https://en.wikipedia.org/wiki/Separation_of_concerns
>
>
> i find it odd that, seemingly, all that is inherent to a parcel of sound
> represented in a computer are the number of samples and the number of
> channels (and both are in an AudioSampleBuffer in addition to the audio
> samples themselves) yet the sample rate is considered "ancillary".  i know
> for a fact that the

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-26 Thread Ethan Duni
>15.6 dB  +  (12.04 dB) * log2( Fs/(2B) )

Oh I see, you're actually taking the details of the sinc^2 into account.
What I had in mind was more of a worst-case analysis where we just call the
sin() component 1 and then look at the 1/n^2 decay (which is 12dB per
octave). Which we see in the second term, but of course the sine's
contribution also whacks away a certain portion of energy, hence the 15.6dB
offset.
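
For what it's worth, that closed-form estimate is easy to sanity-check by
brute force (a numpy sketch, following the sinc^4 image-energy integral from
your derivation rather than anything new):

import numpy as np

def image_rejection_db(R, n_images=4000, n_pts=512):
    f = np.linspace(-0.5 / R, 0.5 / R, n_pts)   # baseband, normalized to Fs; R = Fs/(2B)
    width = 1.0 / R
    images = 0.0
    for k in range(1, n_images + 1):
        images += 2 * np.mean(np.sinc(k + f) ** 4) * width   # +k and -k images
    inband = 1.0 * width                        # flat unit-power baseband
    return 10 * np.log10(inband / images)

for R in (8, 64, 512):
    print(R, round(image_rejection_db(R), 1), round(15.6 + 12.04 * np.log2(R), 1))
    # the two columns agree closely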

On the other hand if you're interested in something like the spurious-free
dynamic range, then the simple 12dB/octave estimate is appropriate. The
worst-case components aren't going to get attenuated at all by the sin(),
just the 1/n^2. I tend to favor that in cases where we can't be confident
that the noise floor in question is (at least approximately) flat.

>so, it seems to come out a little more than 12 dB.

I long ago adopted an informal rule that when an engineer says "6dB" he
means "20*log10(2)," and not exactly 6dB. And likewise for 3dB, 12dB, etc.
Doubly so when talking about the rolloff of linear systems, nobody ever
splits that hair... IIRC the prof in my freshman linear circuits class
instructed us to fudge it this way immediately after introducing the
concept of dB :]

>the number of coefs in the FIR filter is a performance issue regarding how
well you're gonna
>beat down them images in between baseband and the next *oversampled* image

Right I see what you mean. I had mixed up my arithmetic on the lengths of
the filters as a function of oversampling ratio.

>i think we're on the same page.  ain't we?

Yeah, I was unclear on which scenario(s) the aliasing analysis was supposed
to apply to.

E



On Wed, Aug 26, 2015 at 12:53 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/25/15 7:08 PM, Ethan Duni wrote:
>
>> >if you can, with optimal coefficients designed with the tool of your
>> choice, so i am ignoring any images between B and Nyquist-B, >upsample by
>> 512x and then do linear interpolation between adjacent samples for
>> continuous-time interpolation, you can show that it's >something like 12 dB
>> S/N per octave of oversampling plus another 12 dB.  that's 120 dB.  that's
>> how i got to 512x.
>>
>> Wait, where does the extra 12dB come from? Seems like it should just be
>> 12dB per octave of oversampling. What am I missing?
>>
>
> okay, this is painful.  in our 2-decade old paper, Duane and i did this
> theoretical approximation analysis for drop-sample interpolation, and i did
> it myself for linear, but we did not put in the math for linear
> interpolation in the paper.
>
> so, to satisfy Nyquist (or Shannon or Whittaker or the Russian guy) the
> sample rate Fs must exceed 2B which is twice the bandwidth.  the
> oversampling ratio is defined to be Fs/(2B).  and in octaves it is
> log2(Fs/(2B)).  all frequencies in your baseband satisfy |f| <= B, and when
> highly oversampled, 2B << Fs.
>
> now, i'm gonna assume that Fs is so much (like 512x) greater than 2B that
> i will assume the attenuation due to the sinc^2 for |f| <= B is negligible,
> and i will assume that the spectrum between -B and +B is uniformly flat (that's
> not quite worst case, but it's worser case than what music, in the bottom 5
> or 6 octaves, is).  so given a unit height on that uniform power spectrum,
> the energy will be 2B.
>
> so, the k-th image (where k is not 0) will have a zero of the sinc^2
> function going right through the heart of it.  that's what's gonna kill the
> son-of-a-bitch.  the energy of that image is:
>
>
>k*Fs+B
>  integral{ (sinc(f/Fs))^4 df }
>k*Fs-B
>
>
> since it's power spectrum it's sinc^4 for linear and sinc^2 for
> drop-sample interpolation.
>
> changing the variable of integration
>
>
>+B
>  integral{ (sinc((k*Fs+f)/Fs))^4 df }
>-B
>
>
>
>+B
>  integral{ (sinc(k+f/Fs))^4 df }
>-B
>
>
>
>  sinc(k+f/Fs) =  sin(pi*(k+f/Fs))/(pi*(k+f/Fs))
>
>   =  (-1)^k * sin(pi*f/Fs)/(pi*(k+f/Fs))
>
>   =approx  (-1)^k  *  (pi*f/Fs)/(pi*k)
>
>   since  |f| < B << Fs
>
> raising to the 4th power gets rid of the toggling polarity.  so now it's
>
> +B
>  1/(k*Fs)^4 * integral{ f^4 df }  =  (2/5)/(k*Fs)^4 * B^5
> -B
>
>
> now you have to sum up the energies of all of the bad images (we are
> assuming that *all* of those images, *after* they are beaten down, will
> somehow fall into the baseband during resampling and their energies will
> team up).  there are both negative and positive frequency images to add
> up.  (but we don't 

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-25 Thread Ethan Duni
ystem to have looser requirements since the signal
aliasing issue has been removed.

E

On Mon, Aug 24, 2015 at 12:41 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/24/15 11:18 AM, Sampo Syreeni wrote:
>
>> On 2015-08-19, Ethan Duni wrote:
>>
>> and it doesn't require a table of coefficients, like doing higher-order
>>>> Lagrange or Hermite would.
>>>>
>>>
>>> Robert I think this is where you lost me. Wasn't the premise that memory
>>> was cheap, so we can store a big prototype FIR for high quality 512x
>>> oversampling?
>>>
>>
> that was my premise for using linear interpolation *between* adjacent
> oversampled (by 512x) samples.  if you can, with optimal coefficients
> designed with the tool of your choice, so i am ignoring any images between
> B and Nyquist-B, upsample by 512x and then do linear interpolation between
> adjacent samples for continuous-time interpolation, you can show that it's
> something like 12 dB S/N per octave of oversampling plus another 12 dB.
> that's 120 dB.  that's how i got to 512x.  some apps where you might care
> less about inharmonic energy from images folding over (a.k.a. "aliasing"),
> you might not need to go that high of whatever-x.
>
> but the difference in price in memory only, *not* in computational
> burden.  whether it's 64x or 512x, the computational cost is separating the
> index into integer and fractional parts, using the integer part to select
> the N samples to combine and the fractional part to tell you how to combine
> them.  if it's 512x, the fractional part is broken up into the top 9 bits
> to select your N coefficients (and the neighboring set of N coefficients)
> and the rest of the bits are for the linear interpolation.  with only the
> cost of a few K of words (i remember the days when 4K was a lotta memory
> :-), you can get to arbitrarily good with the cost of 2N+1 MAC instructions.
>
> with drop-sample interpolation between fractional delays (6 dB per octave
> of oversampling), then you need another 10 octaves of oversampling, 512K*N
> words of memory, but only N MAC instructions per output sample.
>
> when it's using Hermite or Lagrange then the S/N is 24 dB per octave of
> oversampling, i don't think it's worth it that you need only 16x or 32x
> oversampling (that saves only memory and the cost of computation becomes 4
> times worse or worser).  maybe in an ASIC or an FPGA, but in DSP code or
> regular-old software, i don't see the advantage of cubic or higher-order
> interpolation unless memory is *really* tight and you gotta lotta MIPs to
> burn.
>
>
>> In my (admittedly limited) experience these sorts of tradeoffs come when
>> you need to resample generally, so not just downwards from the original
>> sample rate but upwards as well, and you're doing it all on a dedicated DSP
>> chip.
>>
>> In that case, when your interpolator approaches and goes beyond the
>> Nyquist frequency of the original sample, you need longer and longer
>> approximations of the sinc(x) response,
>>
>
> you need that to get sharper and sharper brick-wall LPFs to whack those
> 511 images in between the baseband and 512x.
>
> then the sinc^2 function in the linear interpolation blasts the hell outa
> all them images that are at multiples of 512x (except the 0th multiple of
> course).  drop-sample interpolation would have only a sinc function doing
> it whereas and Mth-order B-spline would have a sinc^(M+1) function really
> blasting the hell outa them images.
>
> with wonkier and wonkier recursion formulas for online calculation of the
>> coefficients of the interpolating polynomial. Simply because of aliasing
>> suppression, and because you'd like to calculate the coefficients on the
>> fly to save on memory bandwidth.
>>
>> However, if you suitably resample both in the output sampling frequency
>> and in the incoming one, you're left with some margin as far as the
>> interpolator goes, and it's always working downwards, so that it doesn't
>> actually have to do aliasing suppression. An arbitrary low order polynomial
>> is easier to calculate on the fly, then.
>>
>> The crucial part on dedicated DSP chips is that they can generate radix-2
>> FFT coefficients basically for free, with no table lookup
>>
>
> yeah, but you get accumulated errors as you compute the twiddle factors
> on-the-fly.  either in linear or bit-reversed order.
>
> and severely accelerated inline computation as well. That means that you
>> can implement both the input and the output side anti-aliasing/anti-imaging

Re: [music-dsp] [admin] list etiquette

2015-08-22 Thread Ethan Duni
Sounds good Douglas, I'm glad to see you taking the initiative on this
matter. The list has generally been an oasis of pleasant, respectful
behavior and informative discussions, and it's tragic that it has become so
toxic lately.

Thanks
E

On Sat, Aug 22, 2015 at 8:21 AM, Douglas Repetto  wrote:

> Hi everyone, Douglas the list admin here.
>
> I've been away and haven't really been monitoring the list recently.
> It's been full of bad feelings, unpleasant interactions, and macho
> posturing. Really not much that I find interesting. I just want to
> reiterate a few things about the list.
>
> I'm loathe to make or enforce rules. But the list has been pretty much
> useless for the majority of subscribers for the last year or so. I
> know this because many of them have written to complain. It's
> certainly not useful to me.
>
> I've also had several reports of people trying to unsubscribe other
> people and other childish behavior. Come on.
>
> So:
>
> * Please limit yourself to two well-considered posts per day. Take it
> off list if you need more than that.
> * No personal attacks. I'm just going to unsub people who are insulting.
> Sorry.
> * Please stop making macho comments about "first year EE students know
> this" and blahblahblah. This list is for anyone with an interest in
> sound and dsp. No topic is too basic, and complete beginners are
> welcome.
>
> I will happily unsubscribe people who find they can't consistently
> follow these guidelines.
>
> The current list climate is hostile and self-aggrandizing. No
> beginner, gentle coder, or friendly hobbyist is going to post to such
> a list. If you can't help make the list friendly to everyone, please
> leave. This isn't the list for you.
>
>
> douglas
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>Naturally, there's going to be some jaggedness in the spectrum because
>of the noise. So, obviously, that is not sinc^2 then.

So your whole point is that it's not *exactly* sinc^2, but a slightly noisy
version thereof? My point was that there are no effects of resampling
visible in the graphs. That has nothing to do with exactly how the graphs
were generated, nor does insisting that the graphs are slightly noisy
address the point.

Indeed, you've already conceded that the resampling effects are not visible
in the graphs several posts back. It seems like you're just casting about
for some other issue that you can tell yourself you "won," and then call me
names, to feed your fragile ego. Honestly, it's a pretty sad spectacle and
I'm embarrassed for you. It really would be better for everyone - including
you - if you could interact in a good-faith, mature manner. Please make an
effort to start doing so, or you're pretty soon going to find that nobody
here will interact with you any more.

By the way, there's no reason for any jaggedness to appear in the plots,
given the lengths of data you were talking about. You might want to look
into spectral density estimation methods to trade off frequency resolution
and bin accuracy.  It's pretty standard statistical signal processing 101
stuff. Producing a very smooth graph from a long enough segment of data is
straightforward, if you use appropriate techniques (not just one big FFT of
the whole thing, that won't ever get rid of the noisiness no matter how
much data you throw at it).
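
(A minimal sketch of what I mean, assuming scipy is available - this is
illustration, not anybody's actual analysis script:)

import numpy as np
from scipy.signal import welch

x = np.random.default_rng(0).standard_normal(1 << 18)  # stand-in for the test signal
f, pxx = welch(x, fs=44100.0, nperseg=4096)   # averages dozens of windowed periodograms
# much smoother than one giant FFT of the same data, at coarser frequency resolution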

E

On Fri, Aug 21, 2015 at 5:47 PM, Peter S 
wrote:

> On 22/08/2015, Ethan Duni  wrote:
> >
> > We've been over this repeatedly, including in the very post you are
> > responding to. The fact that there are many ways to produce a graph of
> the
> > interpolation spectrum is not in dispute, nor is it germaine to my point.
>
> Earlier you disputed that there's no upsampling involved.
> Apparently you change your mind quite often...
>
> > It's seems like you are trying to
> > avoid my point entirely, in favor of some imaginary dispute of your own
> > invention, which you think you can "win."
>
> I claimed something, and you disputed it. I proved that what I
> claimed, is true. Therefore, all your further arguments are invalid...
> (and are boring)
>
> > I have no idea what you think you are proving by scrutinizing graph
> > artifacts like that
>
> I am proving that what you see on the graph is not sinc(x) /
> sinc^2(x), but rather some noisy curve, like the spectrum of upsampled
> noise. Therefore, my original argument is correct.
>
> > It's also in extremely poor taste to use "retard" as a term of abuse.
>
> Well, if you do not see that the graph pictured on Olli's figure is
> not sinc(x), then you're retarded.
>
> > Meanwhile, it seems that you are suggesting that the spectrum of white
> > noise linearly interpolated up to a high oversampling rate is not sinc^2.
>
> Naturally, there's going to be some jaggedness in the spectrum because
> of the noise. So, obviously, that is not sinc^2 then.
>
> > Are you claiming that those wiggles in the graph represent
> > aliasing of the spectrum from resampling at 44.1kHz? If so, that is
> > unlikely.
>
> Nope, the "wiggles" in the graph are from the noise.
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>1) Olli Niemiatalo's graph *is* equivalent of the spectrum of
>upsampled white noise.

We've been over this repeatedly, including in the very post you are
responding to. The fact that there are many ways to produce a graph of the
interpolation spectrum is not in dispute, nor is it germaine to my point.
I'm not sure what you're trying to accomplish by harping on this point,
while ignoring everything I say. Certainly, it is not convincing me that
you have some worthwhile response to my points, or even that you are
understanding them in the first place. It's seems like you are trying to
avoid my point entirely, in favor of some imaginary dispute of your own
invention, which you think you can "win."

>Have you actually looked at Olli Niemitalo's graph closely?
>Here is proof that it is NOT a graph of sinc(x)/sinc^2(x):
>
>http://morpheus.spectralhead.com/img/other001-analysis.gif
>
>It is NOT sinc(x)/sinc^2(x), and you're blind as a bat if you do not see
that.

I have no idea what you think you are proving by scrutinizing graph
artifacts like that, but it's a preposterous approach to signal analysis on
its face.

It's also in extremely poor taste to use "retard" as a term of abuse.
People with mental disabilities have it hard enough already, without others
treating their status as an insult to be thrown around. I'd appreciate it
if you would compose yourself and refrain from these kinds of ugly
outbursts.

Meanwhile, it seems that you are suggesting that the spectrum of white
noise linearly interpolated up to a high oversampling rate is not sinc^2.
Is your whole point here that generating such a plot by FFTing the
interpolation of a finite segment of white noise will produce finite-data
artifacts in the resulting graph? Because that's not relevant to the
subject, and only goes to show that it's better to just graph the sinc^2
curve directly and so avoid all of the excess computation and finite-data
effects. Are you claiming that those wiggles in the graph represent
aliasing of the spectrum from resampling at 44.1kHz? If so, that is
unlikely.

You do agree that the spectrum of a continuous-time linear interpolator is
given by sinc^2, right?

E


On Fri, Aug 21, 2015 at 4:59 PM, Peter S 
wrote:

> Since you constantly derail this topic with irrelevant talk, let me
> instead prove that
>
> 1) Olli Niemiatalo's graph *is* equivalent of the spectrum of
> upsampled white noise.
> 2) Olli Niemitalo's graph does *not* depict sinc(x)/sinc^2(x).
>
> First I'll prove 1).
>
> Using palette modification, I extracted the linear interpolation curve
> from Olli's figure:
> http://morpheus.spectralhead.com/img/other001b.gif
>
> Then I sampled white noise at 500 Hz, and resampled it to 44.1 kHz
> using linear interpolation. I got this spectrum:
>
> http://morpheus.spectralhead.com/img/resampled_noise_spectrum.gif
>
> To do a proper A/B comparison between the two spectra, I tried to
> align and match them as much as possible, and created an animated GIF
> file that blinks between the two graphs at a 500 ms rate:
>
> http://morpheus.spectralhead.com/img/olli_vs_resampled_noise.gif
>
> Although the alignment is not 100% exact, to my eyes, they look like
> totally equivalent graphs.
>
> This proves that upsampled white noise has the same spectrum as the
> graph shown on Olli's graph for linear interpolation.
>
> Second, I'll prove 2).
>
> Have you actually looked at Olli Niemitalo's graph closely?
> Here is proof that it is NOT a graph of sinc(x)/sinc^2(x):
>
> http://morpheus.spectralhead.com/img/other001-analysis.gif
>
> It is NOT sinc(x)/sinc^2(x), and you're blind as a bat if you do not see
> that.
>
> Since I proved both 1) and 2), it is totally irrelevant what you say,
> because none of what you could ever say would disprove this.
>
> Sinc(x) does not have a jagged/noisy look, therefore it is 100%
> certain it is not what you see on Olli's graph. Point proven, end of
> discussion.
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>Which contains alias images of the original spectrum, which was my point.

There is no "original spectrum" pictured in that graph. Only the responses
of the interpolators. There is no reference to any input signal at all.

>No one claimed there was fractional delay involved.

Fractional delay is a primary topic of this thread, and a major motivation
for interest in polynomial interpolation in dsp in general.

>Then how do you explain that taking noise sampled at 500 Hz, and
>resampling it to 44.1 kHz gives an identical FFT graph?

We've been over this already. It's because you're resampling the signal at
such a large rate that the effects of the sampling are not visible. And
you've chosen a signal with a flat spectrum, so there are no features of
the signal spectrum visible - only the interpolator response. This goes
exactly to the point that no resampling effects are present in the graphs.
All we see are the interpolator spectra.

The fact that there are various ways to generate a graph of an interpolator
spectrum is entirely beside the point.

>> If you resample to the original rate
>> (in order to implement a fractional delay, say), then those weighted
images
>> will be folded back to the same place they came from.
>That's exactly why they're called aliases.

No, if you fold the images back to the same spots they originated, they are
not aliases. All of the frequencies are mapped back to their original
locations, none end up at other frequencies. Aliases are when signal images
end up in new locations corresponding to different frequency bands.

This distinction is crucial to understanding the operation of fractional
delay interpolators: it's why they don't produce aliasing at their output.
We just get a fractional delay filter with an imperfect spectrum. It's only
the frequency response of the interpolator that gets aliased (introducing
the zero at Nyquist for half-sample delay, for example), not the underlying
signal content. That's why it's important to graph the frequency response
of the interpolators directly, without worrying about signal spectra - to
figure out what happens in the final digital interpolator, you take that
continuous time interpolator spectrum, add a linear phase term for whatever
delay you want, and then alias it according to your new sampling rate to
get the final response of the digital interpolation filter. Signal aliasing
only results if that involves a change in sampling rate.
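
A tiny illustration of that last point (my own numpy toy, not from Ollie's
page): the half-sample linear-interpolation "delay" is just the FIR
[0.5, 0.5], and its response shows the zero at Nyquist with no signal
aliasing anywhere.

import numpy as np

w = np.linspace(0.0, np.pi, 5)          # 0 ... Nyquist
H = 0.5 + 0.5 * np.exp(-1j * w)         # DTFT of the two-tap interpolator [0.5, 0.5]
print(np.abs(H))                        # 1.0 ... 0.0, the zero at Nyquist
print(np.abs(np.cos(w / 2.0)))          # same numbers: |cos(pi*f/Fs)|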

>Which is not the case on Olli's graph.

Right, Ollie's graph shows only the intermediate stage, the spectrum of the
polynomial interpolator in continuous time. This is an analytical
convenience, we never actually produce any such signal. It's used as an
input to figure out what the final response of a digital interpolator based
on one of these polynomials will be. You can of course sample that at a
very high rate and so neglect the aliasing of the interpolator response,
but what is the point of that? You wouldn't use any of these interpolators
if what you're trying to do is upsample a 500Hz sampled signal to 44.1kHz,
the graphs show that they're crap for that.

>I spent (wasted?) a considerate amount of time creating various
>demonstrations and FFT graphs showing my point.

Your time would be better spent figuring out a point that is relevant to
what I'm saying in the first place. It is indeed a waste of your time to
invent equivalent ways to generate graphs, since that is not the point.

E



On Fri, Aug 21, 2015 at 2:56 PM, Peter S 
wrote:

> On 21/08/2015, Ethan Duni  wrote:
> > The details of how the graphs were generated don't really matter.
>
> Then why do you keep insisting that they're generated by plotting
> sinc^2(x) ?
>
> > The point
> > is that the only effect shown is the spectrum of the continuous-time
> > polynomial interpolator.
>
> Which contains alias images of the original spectrum, which was my point.
>
> > The additional spectral effects of delaying and
> > resampling that continuous-time signal (to get fractional delay, for
> > example) are not shown.
>
> No one claimed there was fractional delay involved.
>
> > There is no "resampling" to be seen in the graphs.
>
> I recreated the exact same graph via resampling a signal, proving that
> is one method of generating that graph.
>
> >>I claim that they are aliases of the original spectrum.
> >
> > What we see in the graph is simply the spectra of the continuous-time
> > interpolators.
>
> Then how do you explain that taking noise sampled at 500 Hz, and
> resampling it to 44.1 kHz gives an identical FFT graph?
>
> How do you explain that an 50 Hz sine wave, resampled to 44.1 kHz,
> contains alias frequencies at 450 Hz, 550 Hz, 950

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
The details of how the graphs were generated don't really matter. The point
is that the only effect shown is the spectrum of the continuous-time
polynomial interpolator. The additional spectral effects of delaying and
resampling that continuous-time signal (to get fractional delay, for
example) are not shown. There is no "resampling" to be seen in the graphs.

>I claim that they are aliases of the original spectrum.

What we see in the graph is simply the spectra of the continuous-time
interpolators. Since the spectra extend beyond the original nyquist rate,
there will indeed be images of the original signal weighted by the
interpolator spectrum present in the continuous-time interpolated signal.
Whether those are ultimately expressed as aliases depends on what you then
do with that continuous time signal. If you resample to the original rate
(in order to implement a fractional delay, say), then those weighted images
will be folded back to the same place they came from. In that case, there
is no aliasing, you just end up with a modified frequency response of your
fractional interpolator. This is where the zero at Nyquist comes from when
we do a half-sample delay - the linear phase term corresponding to a
half-sample delay causes the signal images to become out of phase with each
other as you approach Nyquist, so they cancel out and you get a zero.

It is only if the interpolated continuous-time signal is resampled at a
different rate, or just used directly, that those signal images end up
expressed as aliases.

The rest of your accusations are your usual misreadings and straw men. I
won't be legitimating them by responding, and I hope you will accept that
and give up on these childish tactics. It would be better for everyone if
you could make a point of engaging in good faith and trying to stick to the
subject rather than attacking the intellects of others.

E

On Fri, Aug 21, 2015 at 2:05 PM, Peter S 
wrote:

> Also, you even contradict yourself. You claim that:
>
> 1) Olli's graph was created by graphing sinc(x), sinc^2(x), and not via
> FFT.
>
> 2) The artifacts from the resampling would be barely visible, because
> the oversampling rate is quite high.
>
> So, if - according to 2) - the artifacts are not visible because the
> oversampling is high and the graph doesn't focus on that, then how do
> you know that 1) is true? You claim that the resampling artifacts
> wouldn't be visible anyways.
>
> If that's true, then how would you prove that FFT was not used for
> creating Olli's graph?
>
> Also, even you yourself acknowledge that
>
> "It shows the aliasing left by linear interpolation into the
> continuous time domain."
>
> So, we agree that the graph shows aliasing, right?
>
> I do not know where you get your idea of "additional aliasing" - it's
> the very same aliasing, except the resampling folds it back...
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>Since that image is not meant to "illustrate the effects of
>resampling", but rather, to "illustrate the effects of interpolation",
>*obviously* it doesn't focus on the aliasing from the resampling.

So you agree that the effects of resampling are not shown, and all we see
is the spectrum of the continuous time polynomial interpolators.

I'm going to accept that concession of my point and move on. If I were you,
I'd quit haranguing people over irrelevancies and straw men, and generally
trying to pretend to superiority. Nobody is buying it, and it just
highlights your insecurity.

E

On Fri, Aug 21, 2015 at 1:24 PM, Peter S 
wrote:

> On 21/08/2015, Ethan Duni  wrote:
> >>It shows *exactly* the aliasing
> >
> > It shows the aliasing left by linear interpolation into the continuous
> time
> > domain. It doesn't show the additional aliasing produced by then delaying
> > and sampling that signal. I.e., the images that would get folded back
> onto
> > the new baseband, disturbing the sinc^2 curve.
>
> This image doesn't involve any fractional delay.
>
> > Those differences would be quite small for resampling to 44.1kHz with no
> > delay, since the oversampling ratio is considerable, so you'd have to
> look
> > carefully to see them.
>
> I think they're actually on the image:
> http://morpheus.spectralhead.com/img/resampling_aliasing.png
>
> They're hard to notice, because the other aliasing masks it.
>
> > This is a big hint that they are not portrayed:
> > Ollie knows what he is doing, so if he wanted to illustrate the effects
> of
> > the resampling, he would have constructed a scenario where they are
> easily
> > visible.
>
> Since that image is not meant to "illustrate the effects of
> resampling", but rather, to "illustrate the effects of interpolation",
> *obviously* it doesn't focus on the aliasing from the resampling.
>
> Therefore, it is not a "hint" at all, and your argument is invalid.
>
> > And probably mentioned a second sample rate, explicitly shown both
> > the sinc^2 and its aliased counterpart, etc. The effect would be shown
> in a
> > visible, explicit manner, if that was what the graph was supposed to
> show.
>
> The fact that this graph is not supposed to demonstrate the aliasing
> from the resampling, does not mean that
>
> 1) it's not there on the graph (it's just barely visible)
>
> 2) the images of the continuous time interpolated signal are not
> aliasing. That's also called aliasing!!!
>
> > But all of those things depend on parameters like oversampling ratio and
> > delay, so it would be a much more complicated picture.
>
> Yes, and that's all entirely irrelevant here... Because the images in
> the continuous time signal before the resampling are also called
> aliasing!!! They're all aliases of the original spectrum, and they all
> alias back to the original spectrum when sampled at the original
> sampling rate! They're called aliasing even before you resample them!
>
> > What we're shown
> > here is just the effects of polynomial interpolation to get to the
> > continuous time domain.
>
> False. I've shown the FFT frequency spectra of actual upsampled signals.
>
> > The additional effects of delaying and then
> > sampling that signal back into the discrete time domain are not visible.
>
> There was no delaying involved at all.
>
> The effects of "sampling that signal back" are not visible, because
> there's 88x oversampling, just as I pointed out. If you want, you can
> repeat the same with less oversampling, and present us your results.
>
> > It seems that you have assumed that some resampling must be happening
> > because the graph only goes up to 22kHz. But that's just the range of the
> > graph, you don't need to do any resampling of anything to graph sinc^2
> over
> > any particular range of frequencies.
>
> I never said you need do to resampling of the continuous time signal
> to graph sinc^2.
>
> I said: the images in the frequency spectrum of the continuous time
> signal are aliases of the original spectrum, and they alias back to
> the original spectrum when the continuous time signal is sampled at
> the original rate!
>
> > But that's not quite the exact same graph.
>
> It's essentially the exact same graph.
>
> > And why are you putting a sound card in the loop?
>
> That was the most convenient way to record the signal.
>
> > This is all just digital processing in question here. You
> > don't even n

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>It shows *exactly* the aliasing

It shows the aliasing left by linear interpolation into the continuous time
domain. It doesn't show the additional aliasing produced by then delaying
and sampling that signal. I.e., the images that would get folded back onto
the new baseband, disturbing the sinc^2 curve. This is how we end up with a
zero at Nyquist when we do half-sample delay, for example. And also how we
end up with a perfectly flat response if we do the trivial resampling
(original rate, no delay).

Those differences would be quite small for resampling to 44.1kHz with no
delay, since the oversampling ratio is considerable, so you'd have to look
carefully to see them. This is a big hint that they are not portrayed:
Ollie knows what he is doing, so if he wanted to illustrate the effects of
the resampling, he would have constructed a scenario where they are easily
visible. And probably mentioned a second sample rate, explicitly shown both
the sinc^2 and its aliased counterpart, etc. The effect would be shown in a
visible, explicit manner, if that was what the graph was supposed to show.
But all of those things depend on parameters like oversampling ratio and
delay, so it would be a much more complicated picture. What we're shown
here is just the effects of polynomial interpolation to get to the
continuous time domain. The additional effects of delaying and then
sampling that signal back into the discrete time domain are not visible.

It seems that you have assumed that some resampling must be happening
because the graph only goes up to 22kHz. But that's just the range of the
graph, you don't need to do any resampling of anything to graph sinc^2 over
any particular range of frequencies.

>Oh, it's the *exact* same graph! (Minus some
>difference above 20 kHz, due to my soundcard's anti-alias filter.)
>You get the same graph if you sample that continuous time signal
>at a 44.1 kHz sampling rate (with some further aliasing from the
>sampling).

But that's not quite the exact same graph. And why are you putting a sound
card in the loop? This is all just digital processing in question here. You
don't even need to process any signals, there are analytic expressions for
all of the quantities involved. That's how Ollie generated graphs of them
without reference to any particular signals.

Again, the differences in question are small due to the high oversampling
ratio, so it's going to be quite difficult to see them in macroscopic
graphs like this. If you want to see the differences, just make a plot of
both sinc^2 and its aliased versions (for whatever oversampling ratios
and/or delays), and look at the differences. It won't be interesting for
high oversampling ratios and zero delay - which is exactly why that
scenario is a poor choice for illustrating the effects in question.

The fact that sampling a continuous time signal at a very high rate results
in a spectrum that closely resembles the continuous time spectrum (over the
sampled bandwidth) is beside the point. It just means that you're operating
in a regime where the effects are very hard to spot. It doesn't follow from
that resemblance that resampling must be occurring to get a plot of the
spectrum of the continuous time signal.

E

On Fri, Aug 21, 2015 at 10:51 AM, Peter S 
wrote:

> On 21/08/2015, Ethan Duni  wrote:
> >>Creating a 22000 Hz signal from a 250 Hz signal by interpolation, is
> >>*exactly* upsampling
> >
> > That is not what is shown in that graph. The graph simply shows the
> > continuous-time frequency response of the interpolation polynomials,
> > graphed up to 22kHz. No resampling is depicted, or the frequency
> responses
> > would show the aliasing associated with that.
>
> It shows *exactly* the aliasing
> http://morpheus.spectralhead.com/img/interpolation_aliasing.png
>
> There are about 88 alias images visible on the graph.
> The linear interpolation curve is not "smooth", so it contains aliasing.
>
> > It's just showing the sinc^2
> > response of the linear interpolator, and similar for the other
> polynomials.
>
> If the signal you interpolate is white noise, and the spectrum of the
> signal is a flat spectrum rectangle like the one displayed, then after
> resampling, you get *exactly* the spectrum you see on the graph,
> showing 88 alias images.
>
> Proof:
> I created 60 seconds of white noise sampled at 500 Hz, then resampled
> it to 44.1 kHz using linear interpolation. After the upsampling, it
> sounds like this:
>
> http://morpheus.spectralhead.com/wav/noise_resampled.wav
>
> Its spectrum looks like this:
> http://morpheus.spectralhead.com/img/noise_resampled.png
>
> Looks familiar? Oh, it's the *exact* same graph! (Minus some
> difference above 20 kHz, due to my s

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-21 Thread Ethan Duni
>Creating a 22000 Hz signal from a 250 Hz signal by interpolation, is
>*exactly* upsampling

That is not what is shown in that graph. The graph simply shows the
continuous-time frequency response of the interpolation polynomials,
graphed up to 22kHz. No resampling is depicted, or the frequency responses
would show the aliasing associated with that. It's just showing the sinc^2
response of the linear interpolator, and similar for the other polynomials.
This is what you'd get if you used those interpolation polynomials to
convert a 250Hz sampled signal into a continuous time signal, not a
discrete time signal of whatever sampling rate.

E

On Fri, Aug 21, 2015 at 2:09 AM, Peter S 
wrote:

> On 21/08/2015, Ethan Duni  wrote:
> >>In this graph, the signal frequency seems to be 250 Hz, so this graph
> >>shows the equivalent of about 22000/250 = 88x oversampling.
> >
> > That graph just shows the frequency responses of various interpolation
> > polynomials. It's not related to oversampling.
>
> Creating a 22000 Hz signal from a 250 Hz signal by interpolation, is
> *exactly* upsampling - the sampling rate changes by a factor of 88x.
> It's not bandlimited interpolation (using a windowed sinc
> interpolator), hence there is a lot of aliasing above Nyquist.
> Regardless, it's still oversampling - the resulting signal is
> sampled with a 88x higher frequency than the original. It's equivalent
> to creating a 3,880,800 Hz signal from a 44100 Hz signal.
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-20 Thread Ethan Duni
>In this graph, the signal frequency seems to be 250 Hz, so this graph
>shows the equivalent of about 22000/250 = 88x oversampling.

That graph just shows the frequency responses of various interpolation
polynomials. It's not related to oversampling.

E

On Thu, Aug 20, 2015 at 5:40 PM, Peter S 
wrote:

> In the case of variable pitch playback with interpolation, here are
> the frequency responses:
>
> http://musicdsp.org/files/other001.gif
> (graphs by Olli Niemitalo)
>
> In this case, there's no zero at the original Nyquist freq, rather
> there are zeros at the original sampling rate and its multiplies.
>
> So it's useful to specify what you mean by "high frequency signal loss
> due to interpolation", beacause that term is ambiguous and can mean
> various things.
>
> In this graph, the signal frequency seems to be 250 Hz, so this graph
> shows the equivalent of about 22000/250 = 88x oversampling. At that
> oversampling rate, gain of alias images of linear interpolation is -84
> dB. High amounts of oversampling for high SNR ratios may be
> necessitated by the slow rolloff of aliasing. (This was not mentioned
> in the question in this thread, but is relevant.)
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-20 Thread Ethan Duni
>If all you're trying to do is mitigate the rolloff of linear interp

That's one concern, and by itself it implies that you need to oversample by
at least some margin to avoid having a zero at the top of your audio band
(along with a transition band below that).

But the larger concern is the overall accuracy of the interpolator. At low
oversampling ratios, the sinc^2 rolloff of the linear interpolator response
isn't effective at squashing the signal images, so you end up with aliasing
corrupting your results. Hence the need for higher order interpolation at
lower oversampling ratios, as described in Ollie's paper. If you want to
get high SNR out of linear interpolation, you need to crank up the
oversampling considerably - far beyond what is needed just to avoid the
attenuation of high frequencies of the in-band component, in order to
sufficiently squash the images.
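
To put rough numbers on that (a quick numpy sketch, my own back-of-envelope):
a tone at the top of the audio band images to just below the oversampled
rate, where the linear interpolator's sinc^2 amplitude response is weakest,
so the worst-case image level versus oversampling ratio R looks like:

import numpy as np

for R in (2, 4, 16, 64, 512):
    worst = np.sinc(1.0 - 1.0 / (2.0 * R)) ** 2   # sinc^2 amplitude at the worst image
    print(R, round(20 * np.log10(worst), 1), "dB")
    # roughly -21, -34, -60, -84, -120 dB: about 12 dB per octave of oversampling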

E

On Thu, Aug 20, 2015 at 12:18 PM, Chris Santoro 
wrote:

> As far as the oversampling + linear interpolation approach goes, I have to
> ask... why oversample so much (512x)?
>
> Purely from a rolloff perspective, it seems you can figure out what your
> returns are going to be by calculating sinc^2 at (1/upsample_ratio) for a
> variety of oversampling ratios. Here's the python code to run the numbers...
>
> #-
> import numpy as np
>
> #normalized frequency points
> X = [1.0/512.0, 1.0/256.0, 1.0/128.0, 1.0/64.0, 1.0/32.0, 1.0/16.0,
> 1.0/8.0, 1.0/4.0]
> #find attenuation at frequency points due to linear interpolation worst
> case (halfway in between)
> S = np.sinc(X)
> S = 20*np.log10(S*S)
>
> print S
> #---
>
> and here's what it spits out for various attenuation values at what would
> be nyquist in the baseband:
>
> 2X:   -7.8 dB
> 4X:   -1.8 dB
> 8X:   -0.44 dB
> 16X: -0.11 dB
> 32X: -0.027 dB
> 64X: -0.0069 dB
> 128X:   -0.0017 dB
> 256X:   -0.00043 dB
> 512X:   -0.00010 dB
>
> If all you're trying to do is mitigate the rolloff of linear interp, it
> looks like there's diminishing returns beyond 16X or 32X, where you're
> talking about a tenth of a dB or less at nyquist, which most people can't
> even hear in that range. Your anti-aliasing properties are going to be
> determined by your choice of upsampling/windowed-sync/anti-imaging filter
> and how long you want to let that be. Or am I missing something? It just
> doesn't seem worth it go to that high.
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
>rbj
>and it doesn't require a table of coefficients, like doing higher-order
Lagrange or Hermite would.

Robert I think this is where you lost me. Wasn't the premise that memory
was cheap, so we can store a big prototype FIR for high quality 512x
oversampling? So why are we then worried about the table space for the
fractional interpolator?

I wonder if the salient design concern here is less about balancing
resources, and more about isolating and simplifying the portions of the
system needed to support arbitrary (as opposed to just very-high-but-fixed)
precision. I like the modularity of the high oversampling/linear interp
approach, since it supports arbitrary precision with a minimum of
fussy variable components or arcane coefficient calculations. It's got a
lot going for it in software engineering terms. But I'm on the fence about
whether it's the tightest use of resources (for whatever constraints).
Typically those are the arcane ones that take a ton of debugging and
optimization :P
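
In case it helps make the comparison concrete, here's a rough sketch of the
"big oversampled FIR plus linear interp between phases" reader we keep
talking about (my own toy code, not anyone's product; assumes numpy/scipy,
ignores edge handling and the prototype's group delay of roughly (N-1)/2
input samples):

import numpy as np
from scipy.signal import firwin

L = 512                                # oversampling ratio / number of polyphase branches
N = 8                                  # taps per branch
proto = firwin(L * N, 1.0 / L) * L     # prototype lowpass designed at the 512x rate
phases = proto.reshape(N, L).T         # phases[p][n] = proto[n*L + p]

def frac_read(x, idx, frac):
    """Interpolated read of x at position (idx + frac), with 0 <= frac < 1."""
    p = frac * L
    p0 = int(p)                        # top "bits": which polyphase branch
    a = p - p0                         # leftover fraction: linear interp between branches
    p1 = min(p0 + 1, L - 1)            # crude clamp at the last branch (sketch-level)
    taps = (1.0 - a) * phases[p0] + a * phases[p1]
    seg = x[idx - N + 1: idx + 1][::-1]    # x[idx], x[idx-1], ..., x[idx-N+1]
    return float(np.dot(taps, seg))

The per-output cost is the same 2N-or-so MACs whether L is 64 or 512; only
the coefficient table grows, which is the memory-vs-MIPs trade being argued
over in the rest of this thread.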

E



On Wed, Aug 19, 2015 at 1:00 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/19/15 1:43 PM, Peter S wrote:
>
>> On 19/08/2015, Ethan Duni  wrote:
>>
>>> But why would you constrain yourself to use first-order linear
>>> interpolation?
>>>
>> Because it's computationally very cheap?
>>
>
> and it doesn't require a table of coefficients, like doing higher-order
> Lagrange or Hermite would.
>
> The oversampler itself is going to be a much higher order
>>> linear interpolator. So it seems strange to pour resources into that
>>>
>> Linear interpolation needs very little computation, compared to most
>> other types of interpolation. So I do not consider the idea of using
>> linear interpolation for higher stages of oversampling strange at all.
>> The higher the oversampling, the more optimal it is to use linear in
>> the higher stages.
>>
>>
> here, again, is where Peter and i are on the same page.
>
> So heavy oversampling seems strange, unless there's some hard
>>> constraint forcing you to use a first-order interpolator.
>>>
>> The hard constraint is CPU usage, which is higher in all other types
>> of interpolators.
>>
>>
> for plugins or embedded systems with a CPU-like core, computation burden
> is more of a cost issue than memory used.  but there are other embedded DSP
> situations where we are counting every word used.  8 years ago, i was
> working with a chip that offered for each processing block 8 instructions
> (there were multiple moves, 1 multiply, and 1 addition that could be done
> in a single instruction), 1 state (or 2 states, if you count the output as
> a state) and 4 scratch registers.  that's all i had.  ain't no table of
> coefficients to look up.  in that case memory is way more important than
> wasting a few instructions recomputing numbers that you might otherwise
> just look up.
>
>
>
>
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
Ugh, I suppose this is what I get for attempting to engage with Peter S
again. Not sure what I was thinking...

E
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
>Nope. Ever heard of multistage interpolation?

I'm well aware that multistage interpolation gives cost savings relative to
single-stage interpolation, generally. That is beside the point: the costs
of interpolation all still scale with oversampling ratio and quality
requirements, just like in single stage interpolation. There's no magic  to
multi-stage interpolation that avoids that relationship.

>that's just plain wrong and stupid, and that's what all advanced multirate
books
>will also tell you.

You've been told repeatedly that this kind of abusive, condescending
behavior is not welcome here, and you need to cut it out immediately.

>Tell me, you don't have an extra half kilobyte of memory in a typical
>computer?

There are lots of dsp applications that don't run on personal computers,
but rather on very lightweight embedded targets. Memory tends to be at a
premium on those platforms.

E










On Wed, Aug 19, 2015 at 3:55 PM, Peter S 
wrote:

> On 20/08/2015, Ethan Duni  wrote:
> >
> > I don't dispute that linear fractional interpolation is the right choice
> if
> > you're going to oversample by a large ratio. The question is what is the
> > right balance overall, when considering the combined costs of
> > the oversampler and the fractional interpolator.
>
> It's hard to tell in general. It depends on various factors, including:
>
> - your desired/available CPU usage
> - your desired/available memory usage and cache size
> - the available instruction set of your CPU
> - your desired antialias filter steepness
> - your desired stopband attenuation
>
> ...and possibly other factors. Since these may vary largely, I think
> it is impossible to tell in general. What I read in multirate
> literature, and what is also my own experience, is that - when using a
> relatively large oversampling ratio - then it's more cost-effective to
> use linear interpolation at the higher stages (and that's Olli's
> conclusion as well).
>
> > You can leverage any finite interpolator to skip computations in an FIR
> > oversampler, not just linear. You get the most "skipping" in the case of
> > high oversampling ratio and linear interpolation, but the same trick
> still
> > works any time your oversampling ratio is greater than your interpolator
> > order.
>
> But to a varying degree. A FIR interpolator is still "heavy" if you
> skip samples where the coefficient is zero, compared to linear
> interpolation (but it is also higher quality).
>
> > The flipside is that the higher the oversampling ratio, the longer the
> FIR
> > oversampling filter needs to be in the first place.
>
> Nope. Ever heard of multistage interpolation? You may do a small FIR
> stage (say, 2x or 4x), and then a linear stage (or another,
> low-complexity FIR stage according to your desired specifications, or
> even further stages). Seems you still don't understand that you can
> oversample in multiple stages, and use a linear interpolator for the
> higher stages of oversampling... Which is almost always more optimal than
> using a single costly FIR filter to do the interpolation. You don't
> need to use a 512x FIR at >100 dB stopband attenuation, that's just
> plain wrong and stupid, and that's what all advanced multirate books
> will also tell you.
>
> Same for IIR case.
>
> >>Since memory is usually not an issue,
> >
> > There are lots of dsp applications where memory is very much the main
> > constraint.
>
> Tell me, you don't have an extra half kilobyte of memory in a typical
> computer? I hear, those have 8-32 GB of RAM nowadays, and CPU cache
> sizes are like 32-128 KiB.
>
> > The performance of your oversampler will be garbage if you do that. And
> so
> > there will be no point in worrying about the quality of fractional
> > interpolation after that point, since the signal you'll be interpolating
> > will be full of aliasing to begin with.
>
> Exactly. But it won't be "heavy"! So it's not the "oversampling" what
> makes the process heavy, but rather, the interpolation / anti-aliasing
> filter!!
>
> > And that means it needs lots of resources, especially as the oversampling
> > ratio gets large. It's the required quality that drives the oversampler
> > costs (and filter design choices).
>
> Which is exactly what I said. If your specification is low, you can
> have a 128x oversampler that is (relatively) "low-cost". It's not the
> oversampling ratio what matters most.
>
> > If you are willing to accept low quality in order to save on CPU (or
> maybe
> > 

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
>To quote Olli Niemitalo:
>
>"The presented optimal interpolators make it possible to do
>transparent-quality resampling for even the most demanding
>applications with only 2x or 4x oversampling before the interpolation.
>However, in most cases simple linear interpolation combined with a
>very high-ratio oversampling (perhaps 512x) is the optimal tradeoff.
>The computational costs depend on the platform and the oversampling
>implementation."

You should include the rest of that paragraph:

"Therefore, which interpolator is the best is not concluded here. You must
first decide what quality you need (for example around 90dB modified SNR
for a transparency of 16 bits) and then see what alternatives the table
given in the summary has to suggest for the oversampling ratios you can
afford."

Also, earlier in the same reference:

"It is outside the scope of this paper to make guesses of the most
profitable oversampling ratio."

I don't dispute that linear fractional interpolation is the right choice if
you're going to oversample by a large ratio. The question is what is the
right balance overall, when considering the combined costs of
the oversampler and the fractional interpolator. Olli's paper isn't trying
to address that, he's leaving the oversampler considerations out of scope
and just showing what your best options are for a given oversampling ratio.
The approach there is that you start with a decision of what oversampling
ratio you can afford, and then use his tables to figure out what
interpolator you're going to need to get the desired quality. Note also the
implication that the oversampler is itself the main thing driving the
resource considerations.

The sentence about how 512x oversampling is the optimal trade-off in most
cases is a bit out of place there, considering that there is nothing in the
paper that establishes that, and several instances in which Olli makes it
explicit that such conclusions are out of scope of the paper.

>Apparently you're missing the whole point - it's the linear
>interpolation that makes the oversampling "cheap"(er) and not (as)
>"heavy".

You can leverage any finite interpolator to skip computations in an FIR
oversampler, not just linear. You get the most "skipping" in the case of
high oversampling ratio and linear interpolation, but the same trick still
works any time your oversampling ratio is greater than your interpolator
order.

The flipside is that the higher the oversampling ratio, the longer the FIR
oversampling filter needs to be in the first place. An FIR lowpass with
cutoff at a normalized frequency of 1/512 and >100dB stop band rejection is
going to require a quite high order. Move the cutoff up to 1/4 or 1/2 and
the required filter order drops dramatically. You can use IIR instead, but
then you have to compute all of the oversamples, not just the (tiny) subset
you require to drive the interpolator - and you have the same growth in the
required filter order as the oversampling ratio increases. And you get
phase distortion, of course.
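
To put rough numbers on the order question (back-of-envelope only, using the
usual Kaiser-window length estimate N ~ (A - 8)/(2.285*dw), and assuming we
hold a fixed 2 kHz transition band at a 48 kHz input rate - those particular
numbers are just for illustration):

import math

def kaiser_fir_length(L, atten_db=100.0, fs_in=48000.0, trans_hz=2000.0):
    # transition width in rad/sample, measured at the L-times-upsampled rate
    dw = 2.0 * math.pi * trans_hz / (L * fs_in)
    return int(math.ceil((atten_db - 8.0) / (2.285 * dw)))

for L in (2, 4, 64, 512):
    N = kaiser_fir_length(L)
    print(L, N, N // L)   # ratio, total taps, taps per polyphase branch

The total tap count grows roughly linearly with the oversampling ratio - that
is the coefficient-table cost - while the taps actually touched per output
stay roughly constant if only the needed polyphase branches are evaluated.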

>Since memory is usually not an issue,

There are lots of dsp applications where memory is very much the main
constraint.

>the costs of your oversampling depends mostly on what kind of resampling
filters you
>use, which you can choose freely. If you use cheaper filters, it won't
>be that "heavy". If I use 128x oversampling with zero order hold or
>zero stuffing, that won't be "heavy", since I'm merely copying
>samples.

The performance of your oversampler will be garbage if you do that. And so
there will be no point in worrying about the quality of fractional
interpolation after that point, since the signal you'll be interpolating
will be full of aliasing to begin with. If you want high quality fractional
interpolation, then the oversampling stage needs to itself be high quality.
And that means it needs lots of resources, especially as the oversampling
ratio gets large. It's the required quality that drives the oversampler
costs (and filter design choices).

If you are willing to accept low quality in order to save on CPU (or maybe
there's nothing in the upper frequencies that you're worried about), then
there's no point in resampling at all. Just use a low order fractional
interpolator directly on the signal.

>It should also be noted that the linear interpolation can be used for
>the upsampling itself as well, reducing the cost of your oversampling,

Again, that would add up to a very low quality upsampler.

E



On Wed, Aug 19, 2015 at 2:06 PM, Peter S 
wrote:

> On 19/08/2015, Ethan Duni  wrote:
> >
> > Obviously it will depend on the details of the application, it just seems
> > kind of unbalanced on its face to use heavy oversampling and then the
> > lightest possible frac

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
>and it doesn't require a table of coefficients, like doing higher-order
Lagrange or Hermite would.

Well, you can compute those at runtime if you want - and you don't need a
terribly high order Lagrange interpolator if you're already oversampled, so
it's not necessarily a problematic overhead.
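
For concreteness, a minimal sketch of the runtime computation (3rd-order
Lagrange, plain Python; choosing D = 1 + frac puts the delay between the two
middle taps):

def lagrange_coeffs(D, order=3):
    # h[k] = prod_{m != k} (D - m) / (k - m),  k = 0..order
    h = []
    for k in range(order + 1):
        c = 1.0
        for m in range(order + 1):
            if m != k:
                c *= (D - m) / (k - m)
        h.append(c)
    return h

# usage: y[n] = sum(h[k] * x[n - k]) with D = 1 + frac
print(lagrange_coeffs(1.5))   # half-sample case: [-0.0625, 0.5625, 0.5625, -0.0625]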

Meanwhile, the oversampler itself needs a table of coefficients. Assuming
we're talking about FIR interpolation, to avoid phase distortion. But
that's a single fixed table for supporting a single oversampling ratio, so
I can see how it would add up to a memory savings compared to a bank of
tables for different fractional interpolation points, if you're looking for
really fine/arbitrary granularity. If we're talking about a fixed
fractional delay, I'm not really seeing the advantage.

Obviously it will depend on the details of the application, it just seems
kind of unbalanced on its face to use heavy oversampling and then the
lightest possible fractional interpolator. It's not clear to me that a
moderate oversampling combined with a fractional interpolator of modestly
high order wouldn't be a better use of resources.

So it doesn't make a lot of sense to me to point to the low resource costs
of the first-order linear interpolator, when you're already devoting
resources to heavy oversampling in order to use it. They need to be
considered together and balanced, no? Your point about computing only the
subset of oversamples needed to drive the final fractional interpolator is
well-taken, but I think I need to see a more detailed accounting of that to
be convinced.
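
(For concreteness, the version of the accounting I have in mind looks
something like this - a sketch in Python, not anyone's production code, where
h is an FIR lowpass designed at L times the input rate with gain L. Only the
two oversampled values that bracket the target position ever get computed, so
the per-output work is about 2*len(h)/L multiplies plus one linear
interpolation, independent of L; what grows with L is the stored table.)

def upsample_point(x, m, h, L):
    # One sample of the L-times oversampled signal, (h * zero_stuffed_x)[m].
    # Only taps j with j == m (mod L) contribute, i.e. one polyphase branch
    # of length ~len(h)/L.
    acc = 0.0
    for j in range(m % L, len(h), L):
        k = (m - j) // L
        if 0 <= k < len(x):
            acc += h[j] * x[k]
    return acc

def frac_interp(x, n, mu, h, L):
    # x evaluated at position n + mu (0 <= mu < 1): compute just the two
    # neighbouring oversampled values and linearly interpolate between them.
    p = (n + mu) * L
    i = int(p)
    f = p - i
    return (1.0 - f) * upsample_point(x, i, h, L) + f * upsample_point(x, i + 1, h, L)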

E

On Wed, Aug 19, 2015 at 1:00 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/19/15 1:43 PM, Peter S wrote:
>
>> On 19/08/2015, Ethan Duni  wrote:
>>
>>> But why would you constrain yourself to use first-order linear
>>> interpolation?
>>>
>> Because it's computationally very cheap?
>>
>
> and it doesn't require a table of coefficients, like doing higher-order
> Lagrange or Hermite would.
>
> The oversampler itself is going to be a much higher order
>>> linear interpolator. So it seems strange to pour resources into that
>>>
>> Linear interpolation needs very little computation, compared to most
>> other types of interpolation. So I do not consider the idea of using
>> linear interpolation for higher stages of oversampling strange at all.
>> The higher the oversampling, the more optimal it is to use linear in
>> the higher stages.
>>
>>
> here, again, is where Peter and i are on the same page.
>
> So heavy oversampling seems strange, unless there's some hard
>>> constraint forcing you to use a first-order interpolator.
>>>
>> The hard constraint is CPU usage, which is higher in all other types
>> of interpolators.
>>
>>
> for plugins or embedded systems with a CPU-like core, computation burden
> is more of a cost issue than memory used.  but there are other embedded DSP
> situations where we are counting every word used.  8 years ago, i was
> working with a chip that offered for each processing block 8 instructions
> (there were multiple moves, 1 multiply, and 1 addition that could be done
> in a single instruction), 1 state (or 2 states, if you count the output as
> a state) and 4 scratch registers.  that's all i had.  ain't no table of
> coefficients to look up.  in that case memory is way more important than
> wasting a few instructions recomputing numbers that you might otherwise
> just look up.
>
>
>
>
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-19 Thread Ethan Duni
>i would say way more than 2x if you're using linear in between.  if memory
is cheap, i might oversample by perhaps as much as 512x and then use
linear to get in between the subsamples (this will get you 120 dB S/N).

But why would you constrain yourself to use first-order linear
interpolation? The oversampler itself is going to be a much higher order
linear interpolator. So it seems strange to pour resources into that, just
so you can avoid putting them into the final fractional interpolator. Is
the justification that the oversampler is a fixed interpolator, whereas the
final stage is variable (so we don't want to muck around with anything too
complex there)? I've seen it claimed (by Julius Smith IIRC) that
oversampling by as little as 10% cuts the interpolation filter requirements
by over 50%. So heavy oversampling seems strange, unless there's some hard
constraint forcing you to use a first-order interpolator.

>quite familiar with it.

Yeah that was more for the list in general, to keep this discussion
(semi-)grounded.

E

On Wed, Aug 19, 2015 at 9:15 AM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/18/15 11:46 PM, Ethan Duni wrote:
>
>> > for linear interpolation, if you are a delayed by 3.5 samples and you
>> keep that delay constant, the transfer function is
>> >
>> >   H(z)  =  (1/2)*(1 + z^-1)*z^-3
>> >
>> >that filter goes to -inf dB as omega gets closer to pi.
>>
>> Note that this holds for symmetric fractional delay filter of any odd
>> order (i.e., Lagrange interpolation filter, windowed sinc, etc). It's not
>> an artifact of the simple linear approach,
>>
>
> at precisely Nyquist, you're right.  as you approach Nyquist, linear
> interpolation is worser than cubic Hermite but better than cubic B-spline
> (better in terms of less roll-off, worser in terms of killing images).
>
> it's a feature of the symmetric, finite nature of the fractional
>> interpolator. Since there are good reasons for the symmetry constraint, we
>> are left to trade off oversampling and filter order/design to get the final
>> passband as flat as we need.
>>
>> My view is that if you are serious about maintaining fidelity across the
>> full bandwidth, you need to oversample by at least 2x.
>>
>
> i would say way more than 2x if you're using linear in between.  if memory
> is cheap, i might oversample by perhaps as much as 512x and then use linear
> to get in between the subsamples (this will get you 120 dB S/N).
>
> That way you can fit the transition band of your interpolation filter
>> above the signal band. In applications where you are less concerned about
>> full bandwidth fidelity, oversampling isn't required. Some argue that 48kHz
>> sample rate is already effectively oversampled for lots of natural
>> recordings, for example. If it's already at 96kHz or higher I would not
>> bother oversampling further.
>>
>
> i might **if** i want to resample by an arbitrary ratio and i am doing
> linear interpolation between the new over-sampled samples.
>
> remember, when we oversample for the purpose of resampling, if the
> prototype LPF is FIR (you know, the polyphase thingie), then you need not
> calculate all of the new over-sampled samples.  only the two you need to
> linear interpolate between.  so oversampling by a large factor only costs
> more in terms of memory for the coefficient storage.  not in computational
> effort.
>
> Also this is recommended reading for this thread:
>>
>> https://ccrma.stanford.edu/~jos/Interpolation/ <
>> https://ccrma.stanford.edu/%7Ejos/Interpolation/>
>>
>>
> quite familiar with it.
>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
> for linear interpolation, if you are a delayed by 3.5 samples and you
keep that delay constant, the transfer function is
>
>   H(z)  =  (1/2)*(1 + z^-1)*z^-3
>
>that filter goes to -inf dB as omega gets closer to pi.

Note that this holds for symmetric fractional delay filter of any odd order
(i.e., Lagrange interpolation filter, windowed sinc, etc). It's not an
artifact of the simple linear approach, it's a feature of the symmetric,
finite nature of the fractional interpolator. Since there are good reasons
for the symmetry constraint, we are left to trade off oversampling and
filter order/design to get the final passband as flat as we need.
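
To see it in the simplest case (a quick numpy check, purely illustrative):
for 2-tap linear interpolation with fraction d, H(e^jw) = (1-d) + d*e^{-jw},
so the gain at Nyquist is |1 - 2d| - 0 dB at d = 0, about -6 dB at d = 0.25,
and -inf dB at the half-sample point d = 0.5. Higher-order symmetric designs
flatten the droop below Nyquist but keep that null at the half-sample point.

import numpy as np

w = np.linspace(0, np.pi, 9)                  # DC .. Nyquist
for d in (0.0, 0.25, 0.5):
    H = (1 - d) + d * np.exp(-1j * w)         # 2-tap fractional delay, fraction d
    db = 20 * np.log10(np.maximum(np.abs(H), 1e-12))
    print(d, np.round(db, 1))                 # last entry of each row = gain at Nyquist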

My view is that if you are serious about maintaining fidelity across the
full bandwidth, you need to oversample by at least 2x. That way you can fit
the transition band of your interpolation filter above the signal band. In
applications where you are less concerned about full bandwidth fidelity,
oversampling isn't required. Some argue that 48kHz sample rate is already
effectively oversampled for lots of natural recordings, for example. If
it's already at 96kHz or higher I would not bother oversampling further.

Also this is recommended reading for this thread:

https://ccrma.stanford.edu/~jos/Interpolation/

E

On Tue, Aug 18, 2015 at 1:45 PM, Tom Duffy  wrote:

> In order to reconstruct that sinusoid, you'll need a filter with
> an infinitely steep transition band.
> You've demonstrated that SR/2 aliases to 0Hz, i.e. DC.
> That digital stream of samples is not reconstructable.
>
> On 8/18/2015 1:28 PM, Peter S wrote:
>
> That's false. 1, -1, 1, -1, 1, -1 ... is a proper bandlimited signal,
>> and contains no aliasing. That's the maximal allowed frequency without
>> any aliasing. It is a bandlimited Nyquist frequency square wave (which
>> is equivalent to a Nyquist frequency sine wave). From that, you can
>> reconstruct a perfect alias-free sinusoid of frequency SR/2.
>>
>
>
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>Okay, I get what you mean. But that doesn't change the frequency
>response of a half-sample delay, or doesn't mean that a half-sample
>delay doesn't have a specific gain at Nyquist.

Never said that it did. In fact, I explicitly said that this issue of
sampling of Nyquist frequency sinusoids has no bearing on the frequency
response of fractional interpolators. I'd suggest dropping this whole
derail, if you are no longer hung up on this point.

E

On Tue, Aug 18, 2015 at 2:08 PM, Peter S 
wrote:

> On 18/08/2015, Ethan Duni  wrote:
> >
> > That class of signals is band limited to SR/2. The aliasing is in the
> > amplitude/phase offset, not the frequency.
>
> Okay, I get what you mean. But that doesn't change the frequency
> response of a half-sample delay, or doesn't mean that a half-sample
> delay doesn't have a specific gain at Nyquist.
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>You cannot calculate 1/x when x=0, can you? Since that's division by zero.
>Yet you'll know when x tends to zero from right towards left, then 1/x
>will tend to +infinity.

Not sure what that is supposed to have to do with the present subject.

If you want to put it in terms of simple arithmetic, the aliasing issue
works like this: I add two numbers together, and find that the answer is X.
I tell you X, and then ask you to determine what the two numbers were. Can
you do it?

E

On Tue, Aug 18, 2015 at 2:13 PM, Peter S 
wrote:

> On 18/08/2015, Ethan Duni  wrote:
> >>In order to reconstruct that sinusoid, you'll need a filter with
> >>an infinitely steep transition band.
> >
> > No, even an ideal reconstruction filter won't do it. You've got your
> > +Nyquist component sitting right on top of your -Nyquist component. Hence
> > the aliasing. The information has been lost in the sampling, there's no
> way
> > to reconstruct without some additional side information.
>
> You cannot calculate 1/x when x=0, can you? Since that's division by zero.
> Yet you'll know when x tends to zero from right towards left, then 1/x
> will tend to +infinity.
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>In order to reconstruct that sinusoid, you'll need a filter with
>an infinitely steep transition band.

No, even an ideal reconstruction filter won't do it. You've got your
+Nyquist component sitting right on top of your -Nyquist component. Hence
the aliasing. The information has been lost in the sampling, there's no way
to reconstruct without some additional side information.

E

On Tue, Aug 18, 2015 at 1:45 PM, Tom Duffy  wrote:

> In order to reconstruct that sinusoid, you'll need a filter with
> an infinitely steep transition band.
> You've demonstrated that SR/2 aliases to 0Hz, i.e. DC.
> That digital stream of samples is not reconstructable.
>
> On 8/18/2015 1:28 PM, Peter S wrote:
>
> That's false. 1, -1, 1, -1, 1, -1 ... is a proper bandlimited signal,
>> and contains no aliasing. That's the maximal allowed frequency without
>> any aliasing. It is a bandlimited Nyquist frequency square wave (which
>> is equivalent to a Nyquist frequency sine wave). From that, you can
>> reconstruct a perfect alias-free sinusoid of frequency SR/2.
>>
>
>
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>> well Peter, here again is where you overreach.  assuming, without loss
>> of generality that the sampling period is 1, the continuous-time signals
>>
>> x(t)  =  1/cos(theta) * cos(pi*t + theta)
>>
>> are all aliases for the signal described above (and incorrectly as
>> "contain[ing] no aliasing").
>
>Well, strictly speaking, that is true. But I assumed the signal to be
>bandlimited to 0..SR/2. In that case, you can perfectly reconstruct
>it, as you have no other alias between 0..SR/2.

That class of signals is band limited to SR/2. The aliasing is in the
amplitude/phase offset, not the frequency.

There are an infinite number of combinations of amplitude/phase of a
nyquist-frequency sinusoid that will all result in the same sampled
sequence. So you can't invert the sampling.

You can construct a DAC that will output some well-behaved nyquist
frequency sinusoid when presented with the input ..., 1, -1, 1, -1, 1, ...,
but you can't guarantee that it will resemble an analog sinusoid that was
sampled to produce such a digital sequence. You don't have enough info to
disambiguate the phase and amplitude.
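
A quick numerical illustration of rbj's family of aliases (numpy, nothing
more to it): (1/cos(theta))*cos(pi*n + theta) = (-1)^n for every integer n
and every theta != pi/2, so all of these distinct continuous-time sinusoids
sample to the identical ..., 1, -1, 1, -1, ... sequence.

import numpy as np

n = np.arange(8)
for theta in (0.0, 0.3, 1.0, 1.5):
    x = np.cos(np.pi * n + theta) / np.cos(theta)   # a different analog sinusoid each time
    print(theta, np.round(x, 10))                   # same samples every time: 1, -1, 1, -1, ...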

E

On Tue, Aug 18, 2015 at 1:51 PM, Peter S 
wrote:

> On 18/08/2015, robert bristow-johnson  wrote:
> > On 8/18/15 4:28 PM, Peter S wrote:
> >>
> >> 1, -1, 1, -1, 1, -1 ... is a proper bandlimited signal,
> >> and contains no aliasing. That's the maximal allowed frequency without
> >> any aliasing.
> >
> > well Peter, here again is where you overreach.  assuming, without loss
> > of generality that the sampling period is 1, the continuous-time signals
> >
> > x(t)  =  1/cos(theta) * cos(pi*t + theta)
> >
> > are all aliases for the signal described above (and incorrectly as
> > "contain[ing] no aliasing").
>
> Well, strictly speaking, that is true. But I assumed the signal to be
> bandlimited to 0..SR/2. In that case, you can perfectly reconstruct
> it, as you have no other alias between 0..SR/2.
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>What's causing you to be unable to reconstruct the waveform?

There are an infinite number of different nyquist-frequency sinusoids that,
when sampled, will all give the same ...,1, -1, 1, -1, ... sequence of
samples. The sampling is a many-to-one mapping in that case, and so cannot
be inverted.

See here:
https://en.wikipedia.org/wiki/Nyquist–Shannon_sampling_theorem#Critical_frequency

Or consider what happens if you shift a nyquist-frequency sinusoid by half
a period before sampling it. You get ..., 0, 0, 0, 0, ... - which is quite
obviously the zero signal. It is not going to reproduce a nyquist frequency
sinusoid when you run it through a DAC.

E

On Tue, Aug 18, 2015 at 1:28 PM, Peter S 
wrote:

> On 18/08/2015, Ethan Duni  wrote:
> >>Assume you have a Nyquist frequency square wave: 1, -1, 1, -1, 1, -1, 1,
> > -1...
> >
> > The sampling theorem requires that all frequencies be *below* the Nyquist
> > frequency. Sampling signals at exactly the Nyquist frequency is an edge
> > case that sort-of works in some limited special cases, but there is no
> > expectation that digital processing of such a signal is going to work
> > properly in general.
>
> Not necessarily, at least in theory.
>
> In practice, an anti-alias filter will filter out a signal exactly at
> Nyquist freq, both when sampling it (A/D conversion), and both when
> reconstructing it (D/A conversion). But that doesn't mean that a
> half-sample delay doesn't have -Inf dB gain at Nyquist frequency. It's
> another thing that the anti-alias filter of a converter will typically
> filter it out anyways when reconstructing - but we weren't talking
> about reconstruction, so that is irrelevant here.
>
> A Nyquist frequency signal (1, -1, 1, -1, ...) is a perfectly valid
> bandlimited signal.
>
> > But even given that, the interpolator outputting the zero signal in that
> > case is exactly correct. That's what you would have gotten if you'd
> sampled
> > the same sine wave (*not* square wave - that would imply frequencies
> above
> > Nyquist) with a half-sample offset from the 1, -1, 1, -1, ... case.
>
> More precisely: a bandlimited Nyquist frequency square wave *equals* a
> Nyquist frequency sine wave. Or any other harmonic waveform for that
> matter (triangle, saw, etc.) In all cases, only the fundamental
> partial is there (1, -1, 1, -1, ... = Nyquist frequency sine), all the
> other partials are filtered out from the bandlimiting.
>
> So the signal 1, -1, 1, -1, *is* a Nyquist frequency bandlimited
> square wave, and also a sine-wave as well. They're identical. It *is*
> a bandlimited square wave - that's what you get when you take a
> Nyquist frequency square wave, and bandlimit it by removing all
> partials above Nyquist freq (say, via DFT). You may call it a square,
> a sine, saw, doesn't matter - when bandlimited, they're identical.
>
> > The
> > incorrect behavior arises when you try to go in the other direction
> (i.e.,
> > apply a second half-sample delay), and you still get only DC.
>
> What would be "incorrect" about it? I'm not sure what is your
> assumption. Of course if you apply any kind of filtering to a zero DC
> signal, you'll still have a zero DC signal. -Inf + -Inf = -Inf...  Not
> sure what you're trying to achieve by "applying a second half-sample
> delay"... That also has -Inf dB gain at Nyquist, so you'll still have
> a zero DC signal after that. Since a half-sample delay has -Inf gain
> at Nyquist, you cannot "undo" it by applying another half-sample
> delay...
>
> > But, again, that doesn't really say anything about interpolation.It just
> > says that you sampled the signal improperly in the first place, and so
> > digital processing can't be relied upon to work appropriately.
>
> That's false. 1, -1, 1, -1, 1, -1 ... is a proper bandlimited signal,
> and contains no aliasing. That's the maximal allowed frequency without
> any aliasing. It is a bandlimited Nyquist frequency square wave (which
> is equivalent to a Nyquist frequency sine wave). From that, you can
> reconstruct a perfect alias-free sinusoid of frequency SR/2.
>
> What's causing you to be unable to reconstruct the waveform?
>
> -P
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>*my* point is that as the delay slowly slides from a integer number of
samples, where the transfer function is
>
>   H(z) = z^-N
>
>to the integer + 1/2 sample (with gain above), this linear but
time-variant system is going to sound like there is a LPF getting segued in.
>
>this, for me, is enough to decide never to use solely linear interpolation
for a modulateable delay widget.  if i vary delay, i want only the delay
to change.

Yeah, absolutely. The variable suppression of high frequencies when
fractional delay changes is undesirable, and indicates that better
interpolation schemes should be used there.

But the example of the weird things that can happen when you try to sample
a sine wave right at the nyquist rate and then process it is orthogonal to
that point.

E

On Tue, Aug 18, 2015 at 1:16 PM, robert bristow-johnson <
r...@audioimagination.com> wrote:

> On 8/18/15 3:44 PM, Ethan Duni wrote:
>
>> >Assume you have a Nyquist frequency square wave: 1, -1, 1, -1, 1, -1, 1,
>> -1...
>>
>> The sampling theorem requires that all frequencies be *below* the Nyquist
>> frequency. Sampling signals at exactly the Nyquist frequency is an edge
>> case that sort-of works in some limited special cases, but there is no
>> expectation that digital processing of such a signal is going to work
>> properly in general.
>>
>> But even given that, the interpolator outputting the zero signal in that
>> case is exactly correct. That's what you would have gotten if you'd sampled
>> the same sine wave (*not* square wave - that would imply frequencies above
>> Nyquist) with a half-sample offset from the 1, -1, 1, -1, ... case. The
>> incorrect behavior arises when you try to go in the other direction (i.e.,
>> apply a second half-sample delay), and you still get only DC.
>>
>> But, again, that doesn't really say anything about interpolation. It just
>> says that you sampled the signal improperly in the first place, and so
>> digital processing can't be relied upon to work appropriately.
>>
>>
> as surprising as it may first appear, i think Peter S and me are totally on
> the same page here.
>
> regarding *linear* interpolation, *if* you use linear interpolation in a
> precision delay (an LTI thingie, or at least quasi-time-invariant) and you
> delay by some integer + 1/2 sample, the filter you get has coefficients and
> transfer function
>
>H(z) =  (1/2)*(1 + z^-1)*z^-N
>
> (where N is the integer part of the delay).
>
> the gain of that filter, as you approach Nyquist, approaches -inf dB.
>
> *my* point is that as the delay slowly slides from a integer number of
> samples, where the transfer function is
>
>H(z) = z^-N
>
> to the integer + 1/2 sample (with gain above), this linear but
> time-variant system is going to sound like there is a LPF getting segued in.
>
> this, for me, is enough to decide never to use solely linear interpolation
> for a modulateable delay widget.  if i vary delay, i want only the delay to
> change.  and i would prefer if the delay was the same for all frequencies,
> which makes the APF fractional delay thingie problematic.
>
> bestest,
>
> r b-j
>
>
>> On Tue, Aug 18, 2015 at 1:40 AM, Peter S > <mailto:peter.schoffhau...@gmail.com>> wrote:
>>
>> On 18/08/2015, Nigel Redmon > <mailto:earle...@earlevel.com>> wrote:
>> >>
>> >> well, if it's linear interpolation and your fractional delay
>> slowly sweeps
>> >> from 0 to 1/2 sample, i think you may very well hear a LPF start
>> to kick
>> >> in.  something like -7.8 dB at Nyquist.  no, that's not right.
>>  it's -inf
>> >> dB at Nyquist.  pretty serious LPF to just slide into.
>> >
>> > Right the first time, -7.8 dB at the Nyquist frequency, -inf at
>> the sampling
>> > frequency. No?
>>
>> -Inf at Nyquist when you're halfway between two samples.
>>
>> Assume you have a Nyquist frequency square wave: 1, -1, 1, -1, 1,
>> -1, 1, -1...
>> After interpolating with fraction=0.5, it becomes a constant signal
>> 0,0,0,0,0,0,0...
>> (because (-1+1)/2 = 0)
>>
>>
> --
>
> r b-j  r...@audioimagination.com
>
> "Imagination is more important than knowledge."
>
>
>
>
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-18 Thread Ethan Duni
>Assume you have a Nyquist frequency square wave: 1, -1, 1, -1, 1, -1, 1,
-1...

The sampling theorem requires that all frequencies be *below* the Nyquist
frequency. Sampling signals at exactly the Nyquist frequency is an edge
case that sort-of works in some limited special cases, but there is no
expectation that digital processing of such a signal is going to work
properly in general.

But even given that, the interpolator outputting the zero signal in that
case is exactly correct. That's what you would have gotten if you'd sampled
the same sine wave (*not* square wave - that would imply frequencies above
Nyquist) with a half-sample offset from the 1, -1, 1, -1, ... case. The
incorrect behavior arises when you try to go in the other direction (i.e.,
apply a second half-sample delay), and you still get only DC.

But, again, that doesn't really say anything about interpolation. It just
says that you sampled the signal improperly in the first place, and so
digital processing can't be relied upon to work appropriately.

E

On Tue, Aug 18, 2015 at 1:40 AM, Peter S 
wrote:

> On 18/08/2015, Nigel Redmon  wrote:
> >>
> >> well, if it's linear interpolation and your fractional delay slowly
> sweeps
> >> from 0 to 1/2 sample, i think you may very well hear a LPF start to kick
> >> in.  something like -7.8 dB at Nyquist.  no, that's not right.  it's
> -inf
> >> dB at Nyquist.  pretty serious LPF to just slide into.
> >
> > Right the first time, -7.8 dB at the Nyquist frequency, -inf at the
> sampling
> > frequency. No?
>
> -Inf at Nyquist when you're halfway between two samples.
>
> Assume you have a Nyquist frequency square wave: 1, -1, 1, -1, 1, -1, 1,
> -1...
> After interpolating with fraction=0.5, it becomes a constant signal
> 0,0,0,0,0,0,0...
> (because (-1+1)/2 = 0)
> ___
> music-dsp mailing list
> music-dsp@music.columbia.edu
> https://lists.columbia.edu/mailman/listinfo/music-dsp
>
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] Compensate for interpolation high frequency signal loss

2015-08-17 Thread Ethan Duni
Yeah I am also curious. It's not obvious to me where it would make sense to
spend resources compensating for interpolation rather than just juicing up
the interpolation scheme in the first place.

E

On Mon, Aug 17, 2015 at 11:39 AM, Nigel Redmon 
wrote:

> Since compensation filtering has been mentioned by a few, can I ask if
> someone could get specific on an implementation (including a description of
> constraints under which it operates)?
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

[music-dsp] This seems relevant to the list of late

2015-08-12 Thread Ethan Duni
https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect

E
___
music-dsp mailing list
music-dsp@music.columbia.edu
https://lists.columbia.edu/mailman/listinfo/music-dsp

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
Peter S, your combative attitude is unwelcome. It seems that you are less
interested in grasping these topics than you are in hectoring myself and
other list members. Given that and the dubious topicality of this thread,
this will be my last response to you. I hope that you find a healthy way to
address the source of your hostility, and also that you gain more insight
into Information Theory.

My apologies to the list for encouraging this unfortunate tangent.

E

On Thu, Jul 16, 2015 at 8:38 PM, Peter S 
wrote:

> On 17/07/2015, Ethan Duni  wrote:
> > What are these better estimators? It seems that you have several
> estimators
> > in mind but I can't keep track of what they all are,
> > I urge you to slow down, collect your thoughts, and
> > spend a bit more time editing your posts for clarity (and length).
>
> I urge you to pay more attention and read more carefully.
> I do not want to repeat myself several times.
> (Others will think it's repetitive and boring.)
>
> [And fuck the "spend more time" part, I already spent 30+ hours editing.]
>
> > And what is "entropy per bit?" Entropy is measured in bits, in the first
> > place. Did you mean "entropy per symbol" or something?
>
> Are you implying that a bit is not a symbol?
> A bit *is* a symbol. So of course, I meant that.
>
> > Entropy is measured in bits, in the first place.
>
> According to IEC 8-13, entropy is measured in shannons:
> https://en.wikipedia.org/wiki/Shannon_%28unit%29
>
> For historical reasons, "bits" is often used synonymously with "shannons".
>
> > Maybe you could try using this "brain" to interact in a good-faith way.
>
> Faith belongs to church.
>
> > The "entropy" of a signal - as opposed to entropy rate - is not a
> > well-defined quantity, generally speaking.
>
> Its exact value is not "well-defined", yet it is *certain* to be non-zero.
> (Unless you only have only 1 particular signal with 100% probability.)
>
> > The standard quantity of interest in the signal context is entropy rate
>
> Another standard quantity of interest in the signal context is "entropy".
>
> https://en.wikipedia.org/wiki/Entropy_%28information_theory%29
>
> Quote:
> "Entropy is a measure of unpredictability of information content."
>
> > If you want to talk about "signal entropy," distinct from the entropy
> rate,
> > then you need to do some additional work to specify what you mean by
> that.
>
> Let me give you an example.
>
> You think that a constant signal has no randomness, thus no entropy (zero
> bits).
> Let's do a little thought experiement:
>
> I have a constant signal, that I want to transmit to you over some
> noiseless discrete channel. Since you think a constant signal has zero
> entropy, I send you _nothing_ (precisely zero bits).
>
> Now try to reconstruct my constant signal from the "nothing" that you
> received from me! Can you?
>
> .
> .
> .
>
> There's very high chance you can't. Let me give you a hint. My
> constant signal is 16 bit signed PCM, and first sample of it is
> sampled from uniform distribution noise.
>
> What is the 'entropy' of my constant signal?
>
> Answer: since the first sample is sampled from uniform distribution
> noise, the probability of you successfully guessing my constant signal
> is 1/65536. Hence, it has an entropy of log2(65536) = 16 bits. In
> other words, I need to send you all the 16 bits of the first sample
> for you to be able to reconstruct my constant signal with 100%
> certainty. Without receiving those 16 bits, you cannot reconstruct my
> constant signal with 100% certainty. That's the measure of its
> "uncertainty" or "unpredictability".
>
> So you (falsely) thought a "constant signal" has zero randomness and
> thus zero entropy, yet it turns out that when I sampled that constant
> signal from the output of 16-bit uniform distribution white noise,
> then my constant signal will have 16 bits of entropy. And if I want to
> transmit it to you, then I need to send you a minimum of 16 bits for
> you to be able to reconstruct, despite that it's a "constant" signal.
>
> It may have an asymptotic 'entropy rate' of zero, yet that doesn't
> mean that the total entropy is zero. So the 'entropy rate' doesn't
> tell you the entropy of the signal. The total entropy (uncertainity,
> unpredictability, randomness) in this particular constant signal is 16
> bits. Hence, its entropy is nonzero, and in this case, 16 bits. Hence,
> if I want to send it to

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
You need to specify what is the possible set
of parameters, and then specify a distribution over that set, in order to
talk about the entropy.

>If you assume the entropy _rate_ to be the average entropy per bits

What is "per bits?" You mean "per symbol" or something?

The definition of entropy rate is the limit of the conditional entropy
H(X_n|X_{n-1},X_{n-2},...) as n goes to infinity.

The average rate that some actual communication system uses to send some
particular signal is not the same thing as the entropy rate. The entropy
rate is a *lower bound* on that number. To calculate the entropy rate, you
need to figure out the conditional distribution of a given sample
conditioned on all previous samples, and then look at how the entropy of
that conditional distribution behaves asymptotically. In signal processing
terms, that corresponds to building an ideal signal predictor (with
potentially infinite memory, complexity, etc.) and then looking at the
entropy of the residual. The average rate produced by some actual coding
system is an *upper bound* on the entropy rate of the random process in
question.
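
If a concrete version helps, here is the kind of calculation I have in mind,
in sketch form (plain Python, for 1-bit signals like the ones in your
examples; the empirical conditional entropy with a length-k context is what a
predictor-based estimate converges to as k grows):

from collections import Counter
from math import log2
import random

def cond_entropy(bits, k):
    # empirical H(X_n | X_{n-1}, ..., X_{n-k}), in bits per symbol
    ctx, joint = Counter(), Counter()
    for i in range(k, len(bits)):
        c = tuple(bits[i - k:i])
        ctx[c] += 1
        joint[c + (bits[i],)] += 1
    total = sum(joint.values())
    return -sum((cnt / total) * log2(cnt / ctx[key[:-1]])
                for key, cnt in joint.items())

pattern = [1, 0, 0, 1, 1, 0, 1, 0]                      # arbitrary period-8 cycle
periodic = pattern * 500
noise = [random.getrandbits(1) for _ in range(4000)]
for k in (0, 1, 2, 4, 8):
    print(k, round(cond_entropy(periodic, k), 4), round(cond_entropy(noise, k), 4))

The periodic column falls to zero once the context covers a full period; the
noise column stays near 1 bit/sample (it dips slightly for large k, which is
the usual finite-data bias of this kind of plug-in estimate).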

Again, I encourage you to slow the pace of your replies and instead try to
write fewer, more concise posts with greater emphasis on clarity and
precision. You were doing okay earlier in this thread but seem to be
getting into muddier and muddier waters as it proceeds, and the resulting
confusion seems to be provoking some unpleasantly combative behavior from
you.

E

On Thu, Jul 16, 2015 at 12:50 PM, Peter S 
wrote:

> On 16/07/2015, Ethan Duni  wrote:
> > But, it seems that it does *not* approach zero. If you fed an arbitrarily
> > long periodic waveform into this estimator, you won't see the estimate
> > approaching zero as you increase the length.
>
> False. The better estimators give an estimate that approaches zero.
>
> % set pattern [randbits [randnum 20]]; puts pattern=$pattern; for {set i
> 1} {$i<
> =10} {incr i} {put "L=$i, "; measure [repeat $pattern 1] $i}
> pattern=1000110011011010
> L=1, Estimated entropy per bit: 1.00
> L=2, Estimated entropy per bit: 1.023263
> L=3, Estimated entropy per bit: 0.843542
> L=4, Estimated entropy per bit: 0.615876
> L=5, Estimated entropy per bit: 0.337507
> L=6, Estimated entropy per bit: 0.17
> L=7, Estimated entropy per bit: 0.071429
> L=8, Estimated entropy per bit: 0.031250
> L=9, Estimated entropy per bit: 0.013889
> L=10, Estimated entropy per bit: 0.006250
>
> It seems that the series approaches zero with increasing length.
> I can repeat it arbitrary times with an arbitrary repeated waveform:
> http://morpheus.spectralhead.com/entropy/random-pattern-tests.txt
>
> Longer patterns (up to cycle length 100):
> http://morpheus.spectralhead.com/entropy/random-pattern-tests-100.txt
>
> For comparison, same tests repeated on white noise:
> http://morpheus.spectralhead.com/entropy/white-noise-tests.txt
>
> The numbers do not lie.
>
> > Also you are only able to deal
> > with 1-bit waveforms, I don't see how you can make any claims about
> general
> > periodic waveforms with this.
>
> I have an organ called "brain" that can extrapolate from data, and
> make predictions.
>
> >>"Random" periodic shapes give somewhat higher entropy estimate than
> "pure"
> >>waves like square, so there's somewhat more entropy in it.
> >
> > No, all periodic signals have exactly zero entropy rate.
>
> Entropy != entropy rate. The shape of the waveform itself has some entropy.
>
> > The correct statement is that your estimator does an even worse job on
> > complicated periodic waveforms, than it does on ones like square waves.
> > This is because it's counting transitions, and not actually looking for
> > periodicity.
>
> The bitflip estimator, yes. The pattern matching estimator, no.
>
> > Again, periodic waveforms have exactly zero entropy rate.
>
> Again, entropy != entropy rate. I wrote "entropy" not "entropy rate".
> The shape of the waveform itself contains some information, doesn't
> it? To be able to transmit a waveform of arbitrary shape, you need to
> transmit the shape somehow at least *once*, don't you think? Otherwise
> you cannot reconstruct it... Hence, any periodic waveform has nonzero
> _entropy_. The more complicated the waveform shape is, the more
> entropy it has.
>
> > If there is no randomness, then there is no entropy.
>
> Entirely false. The entropy of the waveform shape is nonzero.
> Do not confuse "entropy" with "entropy rate".
>
> Even a constant signal has "entropy". Unless it's zero, you need to
> transmit the cons

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
>This model will be baffled as soon as you send something into it that
>is not harmonic. So it is only "ideal" in the very simple case of a
>single, periodic, harmonic waveform, which is just a small subset of
>"arbitrary signals".

I'm not suggesting using a parametric signal model as an estimator. I'm
saying that an ideal estimator would be smart enough to figure out when
it's dealing with a parametrizable signal, and exploit that. It would also
be smart enough to realize when it's dealing with a non-parametrizable
signal, and do the appropriate thing in those cases.

>Quantization, interpolation and other numerical errors
>will add a slight uncertainity to your entropy estimate; in practice,
>things are very rarely "exact". Which I consider one of the reasons
>why a practical entropy estimator will likely never give zero for a
>periodic signal.

Getting to *exactly* zero is kind of nit-picky. An estimation error on the
order of, say, 10^-5 bits/second is as good as zero, since it's saying you
only have to send about 1 bit per day. Given that you are unlikely to be
encoding any signals that last that long, the difference between that and
zero is kind of academic. This is just a matter of numerical error that can
be reduced arbitrarily by throwing computational power at the problem -
it's not a fundamental issue with the estimation approach itself.

The reason the non-zero estimates you're getting from your estimator are a
problem is that they are artifacts of the estimation strategy - not the
numerical precision - and so they do not reduce with more data, they are
correlated with signal properties like frequency and duty cycle, etc. These
are signs of flaws in the basic estimation approach, not in the numerical
implementation thereof.

E

On Thu, Jul 16, 2015 at 7:07 AM, Peter S 
wrote:

> On 15/07/2015, Ethan Duni  wrote:
> > Right, this is an artifact of the approximation you're doing. The model
> > doesn't explicitly understand periodicity, but instead only looks for
> > transitions, so the more transitions per second (higher frequency) the
> more
> > it has to do.
>
> Yes. So for a periodic waveform, the estimation error equals the
> frequency, as it should be zero. For maximal frequency (pattern
> 01010101...), it has maximal error.
>
> > The ideal estimator/transmission system for a periodic signal would
> figure
> > out that the signal is periodic and what the period/duty cycle and
> > amplitude are, and then simply transmit these 4 (finite) pieces of data.
> > Then the receiver can use those to generate an infinitely long signal.
>
> This model will be baffled as soon as you send something into it that
> is not harmonic. So it is only "ideal" in the very simple case of a
> single, periodic, harmonic waveform, which is just a small subset of
> "arbitrary signals".
>
> > Your model
> > will never get to a zero entropy rate because it doesn't "understand" the
> > concept of periodicity, and so has to keep sending more data forever
>
> Strictly speaking, that's not true. It will (correctly) give an
> entropy rate of zero in the corner case of constant signal (f=0), as
> that has no transitions. (For all other signals, it will always give
> nonzero entropy rate, and so have an estimate error.)
>
> > You only need the analysis window length to be greater than the period.
> > Then you can do a search over possible shifts of the analysis window
> > compared to the current frame, and you will find an exact match.
>
> Only if the cycle length is integer, otherwise it won't be 100% exact...
>
> > You can use fractional delay filters for this.
>
> And fractional delay filters will introduce some error to the signal.
> Example: linear interpolation reduces high frequencies, and allpass
> interpolation introduces Nyquist ringing. (Either of the two is true
> for all interpolators in varying amounts). So it is almost 100%
> certain that the original and the fractionally shifted signal will not
> be bit-by-bit identical, whatever interpolator you're using, unless
> you use some very high amount of oversampling with sinc interpolation
> with a very long kernel, requiring impractically large amounts of CPU
> power (and unless you do the oversampling and interpolation using a
> higher precision accumulator, you'll still have numerical errors
> causing some bits to be different).
>
> Ideally, I consider a match to be exact only in case of every bit
> being equal, whatever the representation (floating point or fixed). So
> if you factor in to your entropy estimate how much self-similarity is
> between the periods, then if the match is not 100% but

Re: [music-dsp] about entropy encoding

2015-07-16 Thread Ethan Duni
>This algorithm gives an entropy rate estimate approaching zero for any
>periodic waveform, irregardless of the shape (assuming the analysis
>window is large enough).

But, it seems that it does *not* approach zero. If you fed an arbitrarily
long periodic waveform into this estimator, you won't see the estimate
approaching zero as you increase the length. Also you are only able to deal
with 1-bit waveforms, I don't see how you can make any claims about general
periodic waveforms with this.

>"Random" periodic shapes give somewhat higher entropy estimate than "pure"
>waves like square, so there's somewhat more entropy in it.

No, all periodic signals have exactly zero entropy rate.

The correct statement is that your estimator does an even worse job on
complicated periodic waveforms, than it does on ones like square waves.
This is because it's counting transitions, and not actually looking for
periodicity.

>Periodic waveforms have a
>lot of self-similarity, so they have low entropy. Constant signal is
>fully self-similar, so it has zero entropy.

Again, periodic waveforms have exactly zero entropy rate. This is true of
all deterministic signals. If there is no randomness, then there is no
entropy. This is the reason that parameter estimation is coming up - if a
signal can be described by a finite set of parameters (amplitude, phase and
frequency, say) then it immediately follows that it has zero entropy rate.
The fact that your estimator doesn't do parameter estimation means that its
not going to capture those cases correctly, and instead will give non-zero
entropy rate estimates. Maybe that's an acceptable trade-off for the domain
you want to address, but it's not something that you can simply hand-wave
away.

>Another method of estimating entropy is to build a predictor that tries to
predict the
>signal from the preceding samples.

That's not so much "another method" as it is a basic requirement of
estimating the entropy rate. You have to account, in some way, for the
dependencies on previous samples, in order to get an idea of how much
"surprise" is coming in each sample. Your estimator is itself a crude
predictor: it predicts that the signal is constant, and then any deviation
from that is counted as a "surprise" and so increments the entropy
estimate. But since it is such a crude predictor it can't account for
obvious, deterministic behavior such as periodic square waves.
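
To illustrate the point (a toy version of a transition-counting estimate - my
own reading of that approach, not your exact code): score each sample by the
entropy of the "did it flip?" probability and nothing else, and the estimate
tracks the frequency of a square wave even though the true entropy rate is
zero for every period.

from math import log2

def h2(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def flip_estimate(bits):
    # first-order "surprise" estimate: entropy of the transition probability,
    # blind to any longer-range periodicity
    flips = sum(a != b for a, b in zip(bits, bits[1:]))
    return h2(flips / (len(bits) - 1))

for P in (4, 8, 32, 128):                                # square-wave period in samples
    sq = ([1] * (P // 2) + [0] * (P // 2)) * (2048 // P)
    print(P, round(flip_estimate(sq), 4))                # grows with frequency; true rate is 0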

>When compressing waveforms, audio codecs typically do that.

Depends on the audio codec. High performance ones actually don't do that,
for a couple of reasons. One is that general audio signals don't really
have a reliable time structure to exploit, so you're actually better off
exploiting psychoacoustics. You get a certain amount of decoration from the
MDCT or whatever, but there isn't really a notion of a general "signal
predictor" or an prediction error. Another is that introducing such
dependencies means that any packet loss/corruption results in error
propagation. Signal predictors are a significant part of speech codecs
(where the fact that you're dealing with human speech signals gives you
some reliable assumptions to exploit) and in low performance audio codecs
(ADPCM) where the overall system is simple enough that introducing some
adaptive prediction doesn't cause too many troubles.

E

On Wed, Jul 15, 2015 at 8:39 PM, Peter S 
wrote:

> On 16/07/2015, robert bristow-johnson  wrote:
> >
> > i've only been following this coarsely, but did you post, either in code
> > or pseudo-code the entropy estimator algorithm?  i'd be curious.
>
> It's essentially based on the ideas outlined on 2015-07-14 19:52 CET,
> see "higher-order approximations". It's a sliding-window histogram
> estimator.
>
> > what if the quantization was fully dithered (TPDF dither of 2 LSBs)?
> > what does your estimator do with that?
>
> So far it's a rudimentary proof-of-concept estimator that currently
> works only on 1-bit samples. So if you dither that, it will think it
> is all noise :) In general, adding any unpredictable noise source to a
> signal will increase its measured entropy.
>
> > how does you estimator know how to measure the necessary
> > information content of different simple deterministic waveforms?  once
> > you establish that the waveform is a triangle wave, again it doesn't
> > take more than three number to fully describe it.
>
> My estimator basically uses pattern matching. It has no notion of
> 'triangle wave', or any perodic wave for that matter, it just sees
> patterns. So it sees a 'corner' of a square wave, and remembers that
> pattern. Next time that corner comes in the next period, it says "Oh!
> This corner again! I've seen this already." So it produces a match. At
> the end of the analysis, it says "So, it was 523 matches for this very
> same corner, which is high probability, thus low entropy." And repeats
> the same for every pattern encountered, summing the estimated
> entropies. Works for 1-bit signals, higher bit-depths
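
Reading that description, a pattern-histogram estimate over short windows of a
1-bit signal might look something like the sketch below (an illustrative
reading of the idea, not the actual implementation). Note how the figure for a
periodic square wave stays well above zero and only shrinks as the window
length k grows:

import numpy as np

def pattern_histogram_entropy(bits, k=8):
    """Histogram the overlapping k-sample patterns of a 1-bit signal and
    average their -log2 probabilities, reported per sample."""
    bits = np.asarray(bits, dtype=int)
    n = len(bits) - k + 1
    weights = 2 ** np.arange(k)                # pack each window into an integer id
    patterns = np.array([bits[i:i + k] @ weights for i in range(n)])
    _, counts = np.unique(patterns, return_counts=True)
    probs = counts / n
    return -(probs * np.log2(probs)).sum() / k

square = (np.arange(4096) // 8) % 2            # period-16 square wave, true rate 0
noise = np.random.randint(0, 2, 4096)          # iid coin flips, ~1 bit/sample
print(pattern_histogram_entropy(square))       # ~0.5 = log2(period)/k, not 0
print(pattern_histogram_entropy(noise))        # ~1.0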

Re: [music-dsp] about entropy encoding

2015-07-15 Thread Ethan Duni
>I wondered a few times what a higher "entropy" estimate for a higher
>frequency would mean according to this - I think it means that a
>higher frequency signal needs a higher bandwidth channel to transmit,
>as you need a transmission rate of 2*F to transmit a periodic square
>wave of frequency F. Hence, for a higher frequency square wave, a
>higher bandwidth is needed.

Right, this is an artifact of the approximation you're doing. The model
doesn't explicitly understand periodicity, but instead only looks for
transitions, so the more transitions per second (higher frequency), the more
surprises it has to count and the higher the estimate comes out.

The ideal estimator/transmission system for a periodic signal would figure
out that the signal is periodic and what the period/duty cycle and
amplitude are, and then simply transmit these 4 (finite) pieces of data.
Then the receiver can use those to generate an infinitely long signal. So
the entropy rate is zero (bits/sample = 0). Your model
will never get to a zero entropy rate because it doesn't "understand" the
concept of periodicity, and so has to keep sending more data forever,
despite the fact that there are no more "surprises" in the signal after
you've seen one period.
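
As a toy illustration of that transmitter/receiver picture (parameter names
are arbitrary), four numbers are enough to regenerate as much of the waveform
as desired, so the cost per additional sample is zero:

import numpy as np

def generate_square(period, duty, amplitude, phase, n_samples):
    """Regenerate a periodic square wave from four finite parameters."""
    t = (np.arange(n_samples) + phase) % period
    return np.where(t < duty * period, amplitude, -amplitude)

params = dict(period=16, duty=0.5, amplitude=1.0, phase=3)
x_1k = generate_square(n_samples=1_000, **params)        # "receiver" output
x_1m = generate_square(n_samples=1_000_000, **params)     # same four numbers, more samples
print(np.array_equal(x_1m[:1_000], x_1k))                  # True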

>Using a longer analysis window (a higher-order estimate) would give a better
>estimate of the entropy of periodic waveforms.

Note that a single period of analysis memory is enough to do this, provided
you can assume periodicity.

>Windowing artifacts: unless the analysis window length is an exact
>multiple of the period length, the truncated edges of the analysis
>window will give some artifacts, making the entropy estimate nonzero
>(similar to DFT windowing artifacts).

You only need the analysis window length to be greater than the period.
Then you can do a search over possible shifts of the analysis window
compared to the current frame, and you will find an exact match.
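
A sketch of that search, assuming an integer period and an analysis frame
longer than one period: slide the frame against itself and return the smallest
shift that reproduces it exactly.

import numpy as np

def find_integer_period(frame):
    """Return the smallest shift p with frame[n + p] == frame[n] for all n,
    i.e. the frame matches itself exactly one period later (None if no
    exact match exists within the frame)."""
    frame = np.asarray(frame)
    for p in range(1, len(frame)):
        if np.array_equal(frame[p:], frame[:-p]):
            return p
    return None

square = (np.arange(200) // 8) % 2     # period-16 square wave
print(find_integer_period(square))     # 16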

>Quantization artifacts: unless the period cycle is an integer
>number, each cycle will be slightly different due to various
>quantization artifacts. Unless your algorithm models that, it will
>make the entropy estimate nonzero.

You can use fractional delay filters for this. This is standard stuff in
speech coding for working out the parameters of the periodic part of a
speech signal - every cell phone call anybody has made in the past couple
of decades has done this in real time. You may need oversampling to
preserve the bandwidth if you want a "perfect" match though.
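
A crude sketch of a fractional delay built from a Hann-windowed sinc FIR (much
simpler than what a real speech coder does - no oversampling, and edge effects
are ignored):

import numpy as np

def fractional_delay(x, delay, taps=33):
    """Delay x by a possibly fractional number of samples: windowed-sinc FIR
    for the fractional part, a circular shift for the integer part."""
    frac = delay - np.floor(delay)
    n = np.arange(taps) - (taps - 1) / 2       # centered tap indices
    h = np.sinc(n - frac) * np.hanning(taps)
    h /= h.sum()                               # unity gain at DC
    y = np.convolve(x, h, mode="same")         # fractional part of the delay
    return np.roll(y, int(np.floor(delay)))    # integer part of the delay

x = np.sin(2 * np.pi * np.arange(400) / 12.7)  # period of 12.7 samples
y = fractional_delay(x, 12.7)                  # shift by one (fractional) period
print(np.max(np.abs((x - y)[50:-50])))         # small away from the frame edges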

>Computational limit: I can increase the estimate precision for
>periodic square waves to around 0.01-0.001 (within 0.1% error),
>but then the algorithm becomes really slow. An infinitely long analysis
>window would need infinite computation.

For periodic waveforms you don't need infinite length, just more than one
period - and a suitable approach to estimation that is built around
periodicity.

What you need very long windows for is estimating the entropy rate of
non-periodic signals that still have significant dependencies across
samples. Consider a Hidden Markov Model or a dynamic random process for
example.

You can think of a general entropy rate estimator as some (possibly quite
sophisticated, nonlinear) signal predictor, where the resulting prediction
error is then assumed to be Gaussian white noise. You then get your entropy
estimate simply by estimating the power in the prediction error. If the
signal in question is well modeled by the estimator, the assumption of a
Gaussian white noise prediction error will be accurate and the resulting
entropy rate estimate will be quite good. If not, the prediction error will
have some different statistics (including dependencies across time) and
treating it as iid Gaussian noise will introduce error into the entropy
estimate. Specifically, the estimate will be too high, since iid Gaussian
noise is the max entropy signal (for a given rms power). This
overestimation corresponds directly to the failure of the signal predictor
to handle the dependencies in the signal.
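
A hedged sketch of that recipe with a plain least-squares linear predictor
(the order and test signals are arbitrary): fit the predictor, measure the
residual power, and report the differential entropy of an iid Gaussian with
that variance, 0.5*log2(2*pi*e*sigma^2) bits per sample.

import numpy as np

def gaussian_residual_entropy_rate(x, order=10):
    """Fit an order-P linear predictor by least squares, then report the
    differential entropy of iid Gaussian noise with the residual's variance.
    Overestimates the true rate whenever the predictor misses structure."""
    x = np.asarray(x, dtype=float)
    rows = [x[n - order:n][::-1] for n in range(order, len(x))]
    A = np.vstack(rows)                        # past samples (newest first)
    b = x[order:]                              # sample to be predicted
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    sigma2 = np.var(b - A @ coeffs)            # prediction-error power
    return 0.5 * np.log2(2 * np.pi * np.e * sigma2)   # bits/sample (can be < 0)

rng = np.random.default_rng(0)
white = rng.standard_normal(5000)              # truly unpredictable
sine = np.sin(2 * np.pi * np.arange(5000) / 50) + 1e-6 * rng.standard_normal(5000)
print(gaussian_residual_entropy_rate(white))   # ~2.05, i.e. 0.5*log2(2*pi*e) for unit variance
print(gaussian_residual_entropy_rate(sine))    # strongly negative: almost deterministic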

>Uncertainty: how do you know that the output of some black-box
>process is truly deterministic? Answer: you can't know. You can only
>measure 'observed' probabilities, so the 'measured' entropy never
>reaches zero in a finite amount of time.

Right, any statistical estimator needs to be considered in terms of its
asymptotic behavior, and associated convergence rate. The results for the
first few observations are likely to be quite poor.

For dealing with general signals you have to forego the goal of estimating
the "true" underlying entropy, and instead select an estimation algorithm
that has suitable practical properties (complexity, memory, latency) and
with as much generality as possible (which goal is in tension with the
practical requirements). The hope is that the algorithm is sufficiently
general to give good performance on the class of signals that you have to
deal with in practice - and also not too poor when applied to the class of
signals that it isn't able to model exactly. Note that f
