>I wondered a few times what a higher "entropy" estimate for a higher
>frequency would mean according to this - I think it means that a
>higher frequency signal needs a higher bandwidth channel to transmit,
>as you need a transmission rate of 2*F to transmit a periodic square
>wave of frequency F. Hence, for a higher frequency square wave, a
>higher bandwidth is needed.

Right, this is an artifact of the approximation you're doing. The model
doesn't explicitly understand periodicity, but instead only looks for
transitions, so the more transitions per second (higher frequency), the
more it has to transmit and the higher its estimate comes out.
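
To make that concrete, a transition-counting estimator might look roughly
like this (Python/numpy sketch; the function names and the exact form of
your metric are my guesses, not your actual code):

import numpy as np

def bitflip_rate(bits):
    # Fraction of adjacent sample pairs that differ - the "transitions
    # per sample" metric discussed in this thread (as I understand it).
    bits = np.asarray(bits)
    return np.mean(bits[1:] != bits[:-1])

def square_wave(n_samples, period, duty=0.5):
    # 1-bit periodic square wave: 1 for the first duty*period samples
    # of each cycle, 0 for the rest.
    t = np.arange(n_samples)
    return ((t % period) < duty * period).astype(int)

print(bitflip_rate(square_wave(10000, period=100)))  # ~0.02
print(bitflip_rate(square_wave(10000, period=50)))   # ~0.04

Halving the period doubles the transition count and so doubles the
estimate, even though both signals are equally predictable once you've
seen one cycle.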

The ideal estimator/transmission system for a periodic signal would figure
out that the signal is periodic and what the period, duty cycle, phase and
amplitude are, and then simply transmit these 4 (finite) pieces of data.
Then the receiver can use those to generate an infinitely long signal. So
the entropy rate is zero (<finite data>/<infinite output> = 0). Your model
will never get to a zero entropy rate because it doesn't "understand" the
concept of periodicity, and so has to keep sending more data forever,
despite the fact that there are no more "surprises" in the signal after
you've seen one period.
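
In sketch form (the parameter names and values here are mine, and I'm
ignoring noise, jitter and non-integer periods):

import numpy as np

# Transmitter side: four finite numbers describe the whole signal.
params = dict(period=100, duty=0.25, amplitude=0.8, phase=17)

# Receiver side: regenerate as many samples as you like from them.
def render_square(params, n_samples):
    t = np.arange(n_samples) + params["phase"]
    return params["amplitude"] * (
        (t % params["period"]) < params["duty"] * params["period"])

x = render_square(params, 1000000)  # or a billion samples - same cost to send

The transmitted data is fixed while the output length is unbounded, so
bits per output sample goes to zero.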

>Using a longer analysis window (a higher-order estimate) would give a
>better estimate of the entropy of periodic waveforms.

Note that a single period of analysis memory is enough to do this, provided
you can assume periodicity.

>Windowing artifacts: unless the analysis window length is an exact
>multiple of the period length, the truncated edges of the analysis
>window will give some artifacts, making the entropy estimate nonzero
>(similar to DFT windowing artifacts).

You only need the analysis window length to be greater than the period.
Then you can do a search over possible shifts of the analysis window
compared to the current frame, and you will find an exact match.
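
Roughly like this (a sketch that assumes an exactly periodic signal with
an integer period in samples and no noise):

import numpy as np

def find_period(window):
    # Smallest shift p such that the window matches a copy of itself
    # shifted by p samples. Assumes the window is longer than one period.
    window = np.asarray(window)
    for p in range(1, len(window)):
        if np.array_equal(window[p:], window[:-p]):
            return p
    return None  # no exact repeat found within this window

Once you have the period, every later frame is an exact copy of material
you've already stored, so there's nothing new left to send.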

>Quantization artifacts: unless the period cycle is an integer
>number, each cycle will be slightly different due to various
>quantization artifacts. Unless your algorithm models that, it will
>make the entropy estimate nonzero.

You can use fractional delay filters for this. This is standard stuff in
speech coding for working out the parameters of the periodic part of a
speech signal - every cell phone call anybody has made in the past couple
of decades has done this in real time. You may need oversampling to
preserve the bandwidth if you want a "perfect" match though.
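
A minimal fractional-delay sketch in that spirit - the tap count, the Hann
window and the normalization are arbitrary choices of mine, not any
particular codec standard:

import numpy as np

def fractional_delay(x, delay, n_taps=33):
    # Delay x by a non-integer number of samples with a windowed-sinc FIR.
    # Longer filters (and oversampling) give a closer match near Nyquist.
    n = np.arange(n_taps) - (n_taps - 1) / 2
    h = np.sinc(n - delay) * np.hanning(n_taps)
    h /= np.sum(h)  # keep unity gain at DC
    return np.convolve(x, h, mode="same")

# e.g. if the pitch period comes out as 62.4 samples, delay the stored
# cycle by the fractional 0.4 of a sample before comparing it to the
# current one.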

>Computational limit: I can increase the estimate precision for
>periodic square waves to around 0.000001-0.001 (within 0.1% error),
>but then the algorithm becomes really slow. An infinite long analysis
>window would need infinite computation.

For periodic waveforms you don't need infinite length, just more than one
period - and a suitable approach to estimation that is built around
periodicity.

What you need very long windows for is estimating the entropy rate of
non-periodic signals that still have significant dependencies across
samples. Consider a Hidden Markov Model or a dynamic random process for
example.

You can think of a general entropy rate estimator as some (possibly quite
sophisticated, nonlinear) signal predictor, where the resulting prediction
error is then assumed to be Gaussian white noise. You then get your entropy
estimate simply by estimating the power in the prediction error. If the
signal in question is well modeled by the estimator, the assumption of a
Gaussian white noise prediction error will be accurate and the resulting
entropy rate estimate will be quite good. If not, the prediction error will
have some different statistics (including dependencies across time) and
treating it as iid Gaussian noise will introduce error into the entropy
estimate. Specifically, the estimate will be too high, since iid Gaussian
noise is the max entropy signal (for a given rms power). This
overestimation corresponds directly to the failure of the signal predictor
to handle the dependencies in the signal.
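
A concrete (and deliberately simple) instance of that scheme: fit a linear
predictor by least squares, treat the residual as iid Gaussian, and read
off the differential entropy rate. The order and the other details are
arbitrary, and a fancier nonlinear predictor slots into the same
framework:

import numpy as np

def entropy_rate_estimate(x, order=16):
    # Predict x[n] from the previous `order` samples, then score the
    # prediction error as iid Gaussian: 0.5*log2(2*pi*e*var) bits/sample.
    # Whatever structure the predictor misses inflates the residual
    # variance, so the estimate errs on the high side.
    x = np.asarray(x, dtype=float)
    X = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)])
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coeffs
    return 0.5 * np.log2(2 * np.pi * np.e * np.var(resid))

print(entropy_rate_estimate(np.random.randn(10000)))          # ~2.05
print(entropy_rate_estimate(np.sin(0.1 * np.arange(10000))))  # large negative

White Gaussian noise comes out near its true rate of 0.5*log2(2*pi*e), a
bit over 2 bits/sample, while a sinusoid is predicted almost exactly, so
the residual variance (and hence the estimate) collapses.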

>Uncertainty: how do you know that the output of some black-box
>process is truly deterministic? Answer: you can't know. You can only
>measure 'observed' probabilities, so the 'measured' entropy never
>reaches zero in a finite amount of time.

Right, any statistical estimator needs to be considered in terms of its
asymptotic behavior, and associated convergence rate. The results for the
first few observations are likely to be quite poor.

For dealing with general signals you have to forgo the goal of estimating
the "true" underlying entropy, and instead select an estimation algorithm
that has suitable practical properties (complexity, memory, latency) and
as much generality as possible (a goal that is in tension with those
practical requirements). The hope is that the algorithm is sufficiently
general to give good performance on the class of signals that you have to
deal with in practice - and also not too poor when applied to the class of
signals that it isn't able to model exactly. Note that for any finite
estimation algorithm/signal model, you can always easily cook up a class of
random signals that it can't model well. So the engineering questions are
things like "what class of signals do we really care about" and "how robust
do we need to be to signals that don't fit in that box?"

In the present thread, we have an estimator that doesn't grok the feature
of "periodicity" and so it shows issues when applied to periodic signals.

>(Do you really need an infinite long sinc kernel to reconstruct your
>digital samples? Typically there's a filter that is "good enough" and
>computable.)

That's not a very good analogy. The error due to finite reconstruction
filters decreases steadily and predictably with longer and longer filters,
so you can always in principle choose a finite length filter for any
desired level of accuracy. But for any finite memory entropy estimator,
there are always classes of signals for which it performs arbitrarily
badly. So you can't put an upper bound on the error simply by choosing a
sufficiently long memory - instead you are growing the class of signals
that it can model adequately, without necessarily improving performance on
signals outside of that class.
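
To illustrate the reconstruction-filter half of that comparison:
interpolate a known bandlimited signal halfway between samples with a
plain truncated sinc and watch the error fall as the kernel grows (the
test frequencies and kernel lengths below are arbitrary):

import numpy as np

n = np.arange(4096)
components = [(0.01, 0.3), (0.07, 1.1), (0.19, 2.0)]  # (cycles/sample, phase)
x    = sum(np.sin(2 * np.pi * f * n + p) for f, p in components)
true = sum(np.sin(2 * np.pi * f * (n + 0.5) + p) for f, p in components)

def interp_halfway(x, half_len):
    # Estimate x at n + 0.5 using 2*half_len neighbouring samples and a
    # truncated ideal (sinc) interpolation kernel.
    k = np.arange(-half_len, half_len)
    kernel = np.sinc(0.5 - k)
    idx = np.arange(half_len, len(x) - half_len)
    est = np.array([np.dot(x[i - half_len:i + half_len], kernel) for i in idx])
    return est, idx

for half_len in (4, 16, 64, 256):
    est, idx = interp_halfway(x, half_len)
    print(half_len, np.max(np.abs(est - true[idx])))  # error shrinks steadily

No such experiment exists for a finite-memory entropy estimator: whatever
the memory length, you can construct a signal with dependencies just
beyond it.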

One thing worth mentioning, in light of that, is that the "true" entropy
rate may not be of much practical interest, in cases where there is no
practical way to build machinery that would estimate/utilize it. Similarly,
it's common in practice for some applications to use compression schemes
that are known to be quite poor (in the rate/distortion or entropy sense)
because constraints on power consumption/memory/latency are more pressing
than constraints on transmission bandwidth.

E

On Wed, Jul 15, 2015 at 12:17 AM, Peter S <peter.schoffhau...@gmail.com>
wrote:

> On 14/07/2015, Mads Dyrholm <dyrholm.m...@gmail.com> wrote:
> > Well, I was thinking about this as well. How about a 1bit square wave
> then?
> > -
> >
> > Your bitflip counter would not be sensitive to duty cycle.
> > The simpler bit counter would be.
> >
> > I don't see why "entropy" should change with duty cycle since I am just
> > turning one synthesizer knob.
>
> For 1-bit periodic square waves, this simple metric approximates the
> frequency of the signal. By changing the duty cycle, the frequency
> doesn't change, so this metric doesn't change.
>
> I wondered a few times what a higher "entropy" estimate for a higher
> frequency would mean according to this - I think it means that a
> higher frequency signal needs a higher bandwidth channel to transmit,
> as you need a transmission rate of 2*F to transmit a periodic square
> wave of frequency F. Hence, for a higher frequency square wave, a
> higher bandwidth is needed. The simple bitflip metric tells you how
> many times you need to transmit the state to transmit the signal
> fully, as it counts the 'state transitions' (which correlates with the
> frequency in case of 1-bit periodic square waves).
>
> As Ethan pointed out, square waves are highly deterministic, so both
> metrics will give a poor estimate of entropy (or entropy rate) in
> cases of periodic waveforms. In case of an 1-bit square wave, counting
> 0/1 transitions estimates the frequency, and counting ones/zeros
> estimates the duty cycle (giving 1 for 50% duty cycle, and 0 for 0% or
> 100% duty cycle).
>
> So in case of an 1-bit periodic square wave, these two metrics will
> estimate the frequency and duty cycle, respectively.
>
> A first-order estimate (counting the bits) is a very poor estimate of
> entropy, as it has no notion of correlation, hence it cannot
> distinguish between white noise and square wave with 50% duty cycle,
> one of which is fully deterministic, the other being fully random. A
> second-order estimate (analyzing bit pairs) is only slightly better,
> but it can already differentiate between the two when the frequency of
> the square wave is low enough. Hence, simplest possible estimate that
> analyzes correlation (but still a poor estimate). Using a longer
> analysis window (a higher-order estimate) would give a better estimate
> of the entropy of periodic waveforms.
>
> -P
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp
