>This model will be baffled as soon as you send something into it that
>is not harmonic. So it is only "ideal" in the very simple case of a
>single, periodic, harmonic waveform, which is just a small subset of
>"arbitrary signals".

I'm not suggesting using a parametric signal model as an estimator. I'm
saying that an ideal estimator would be smart enough to figure out when
it's dealing with a parametrizable signal, and exploit that. It would also
be smart enough to realize when it's dealing with a non-parametrizable
signal, and do the appropriate thing in those cases.
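
To make that concrete, here is a minimal sketch of what "detect a
parametrizable signal, otherwise fall back" could look like. This is
purely illustrative - the exact-repetition search, the fallback model and
all of the names are my own assumptions, not a proposal for an actual
estimator:

    import numpy as np

    def entropy_rate_estimate(x, max_period=256):
        # Hypothetical two-stage estimator (illustrative only).
        # Stage 1: look for an exact repetition period; if one exists,
        # the per-sample description cost vanishes as the signal grows.
        n = len(x)
        for p in range(1, min(max_period, n // 2) + 1):
            if np.array_equal(x[p:], x[:-p]):
                return 0.0
        # Stage 2 (fallback): crude entropy of first differences,
        # treated as i.i.d. symbols.
        _, counts = np.unique(np.diff(x), return_counts=True)
        prob = counts / counts.sum()
        return float(-(prob * np.log2(prob)).sum())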

>Quantization, interpolation and other numerical errors
>will add a slight uncertainty to your entropy estimate; in practice,
>things are very rarely "exact". Which I consider one of the reasons
>why a practical entropy estimator will likely never give zero for a
>periodic signal.

Getting to *exactly* zero is kind of nit-picky. An estimation error on the
order of, say, 10^-5 bits/second is as good as zero, since it says you only
have to send one bit roughly every day. Given that you are unlikely to be
encoding any signals that last that long, the difference between that and
zero is academic. This is just numerical error that can be reduced
arbitrarily by throwing computational power at the problem - it's not a
fundamental issue with the estimation approach itself.
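
Spelled out (just to make the order of magnitude concrete):

    # 1e-5 bits/second, expressed as time per bit:
    seconds_per_bit = 1.0 / 1e-5            # 100,000 seconds
    hours_per_bit = seconds_per_bit / 3600  # about 27.8 hours, i.e. over a day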

The reason the non-zero estimates you're getting from your estimator are a
problem is that they are artifacts of the estimation strategy - not of the
numerical precision - and so they do not shrink with more data, and they
are correlated with signal properties such as frequency and duty cycle.
These are signs of flaws in the basic estimation approach, not in the
numerical implementation thereof.
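
As a toy demonstration of that kind of artifact (a transition-counting
estimator of my own invention, not your exact scheme), the estimate below
tracks the square wave's frequency and does not shrink no matter how long
you observe the signal:

    import numpy as np

    def transition_rate(x):
        # Toy model: charge one "bit" per sample-to-sample transition.
        return np.count_nonzero(np.diff(x) != 0) / len(x)

    fs = 1000  # sample rate, Hz
    for freq in (10, 50, 250):
        for seconds in (1, 10, 100):
            t = np.arange(fs * seconds) / fs
            square = np.where(np.sin(2 * np.pi * freq * t) >= 0, 1.0, -1.0)
            print(freq, seconds, transition_rate(square))
    # The estimate scales with freq and stays put with longer observation,
    # even though the true entropy rate of a fixed square wave is zero.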

E

On Thu, Jul 16, 2015 at 7:07 AM, Peter S <peter.schoffhau...@gmail.com>
wrote:

> On 15/07/2015, Ethan Duni <ethan.d...@gmail.com> wrote:
> > Right, this is an artifact of the approximation you're doing. The model
> > doesn't explicitly understand periodicity, but instead only looks for
> > transitions, so the more transitions per second (higher frequency) the more
> > it has to do.
>
> Yes. So for a periodic waveform, the estimation error scales with the
> frequency, even though the true entropy rate is zero. For maximal
> frequency (the pattern 01010101...), it has maximal error.
>
> > The ideal estimator/transmission system for a periodic signal would figure
> > out that the signal is periodic and what the period/duty cycle and
> > amplitude are, and then simply transmit these 4 (finite) pieces of data.
> > Then the receiver can use those to generate an infinitely long signal.
>
> This model will be baffled as soon as you send something into it that
> is not harmonic. So it is only "ideal" in the very simple case of a
> single, periodic, harmonic waveform, which is just a small subset of
> "arbitrary signals".
>
> > Your model
> > will never get to a zero entropy rate because it doesn't "understand" the
> > concept of periodicity, and so has to keep sending more data forever
>
> Strictly speaking, that's not true. It will (correctly) give an
> entropy rate of zero in the corner case of a constant signal (f=0),
> as that has no transitions. (For all other signals, it will always
> give a nonzero entropy rate, and so will have an estimation error.)
>
> > You only need the analysis window length to be greater than the period.
> > Then you can do a search over possible shifts of the analysis window
> > compared to the current frame, and you will find an exact match.
>
> Only if the cycle length is an integer number of samples; otherwise
> it won't be 100% exact...
>
> > You can use fractional delay filters for this.
>
> And fractional delay filters will introduce some error to the signal.
> Example: linear interpolation attenuates high frequencies, and
> allpass interpolation introduces ringing near Nyquist. (One or the
> other is true of all interpolators, in varying amounts.) So it is
> almost 100% certain that the original and the fractionally shifted
> signal will not be bit-by-bit identical, whatever interpolator you're
> using, unless you use a very high amount of oversampling with sinc
> interpolation with a very long kernel, requiring impractically large
> amounts of CPU power (and unless you do the oversampling and
> interpolation with a higher-precision accumulator, you'll still have
> numerical errors causing some bits to differ).
>
> Ideally, I consider a match to be exact only if every bit is equal,
> whatever the representation (floating point or fixed). So if you
> factor into your entropy estimate how much self-similarity there is
> between the periods, then if the match is not 100% but only, say,
> 99.9% due to quantization, interpolation and other numerical errors,
> your entropy measure will not be 0 but rather something like 0.01.
> Two periods are 100% bit-by-bit identical only in the case of an
> integer period length. And if two periods are not 100% identical, how
> do you know for _sure_ that it's supposed to be the exact same
> period, and not a slightly different period carrying some new
> information? Quantization, interpolation and other numerical errors
> will add a slight uncertainty to your entropy estimate; in practice,
> things are very rarely "exact". Which I consider one of the reasons
> why a practical entropy estimator will likely never give zero for a
> periodic signal.
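>
> A quick numpy sketch of that point (the period and the shift here are
> arbitrary numbers, purely for illustration): linearly interpolating a
> sine with a non-integer period by a fractional amount and comparing
> bit-for-bit:
>
>     import numpy as np
>
>     period = 12.7                    # non-integer period, in samples
>     x = np.sin(2 * np.pi * np.arange(1024) / period)
>     # One full period = an integer shift of 12 plus a 0.7-sample
>     # fractional delay, done here with linear interpolation:
>     frac = 0.7
>     y = (1 - frac) * x[12:-1] + frac * x[13:]
>     print(np.array_equal(x[:len(y)], y))  # False: not bit-identical
>     print(np.abs(x[:len(y)] - y).max())   # small but nonzero error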
>
> > What you need very long windows for is estimating the entropy rate of
> > non-periodic signals that still have significant dependencies across
> > samples.
>
> Or, estimating the entropy rate of a mixture of several periodic
> signals with inharmonically related frequencies, which, being fully
> deterministic, also has an actual entropy rate of zero. If your model
> is to find a single repeating period, then - unless your analysis
> window is long enough - it won't find the common period, which is the
> least common multiple of the periods of the individual components.
> You can trivially see that in this case the model can successfully
> utilize a longer window to identify longer cycles.
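>
> A toy example of that (arbitrary numbers): two exactly periodic
> components with integer periods of 12 and 50 samples only repeat
> together every lcm(12, 50) = 300 samples, so any shorter window never
> sees one full cycle of the mixture:
>
>     from math import lcm  # Python 3.9+
>
>     p1, p2 = 12, 50        # component periods, in samples
>     print(lcm(p1, p2))     # 300: the common period of the mixture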
>
> > You can think of a general entropy rate estimator as some (possibly quite
> > sophisticated, nonlinear) signal predictor, where the resulting prediction
> > error is then assumed to be Gaussian white noise. You then get your entropy
> > estimate simply by estimating the power in the prediction error. If the
> > signal in question is well modeled by the estimator, the assumption of a
> > Gaussian white noise prediction error will be accurate and the resulting
> > entropy rate estimate will be quite good. If not, the prediction error will
> > have some different statistics (including dependencies across time) and
> > treating it as iid Gaussian noise will introduce error into the entropy
> > estimate. Specifically, the estimate will be too high, since iid Gaussian
> > noise is the max entropy signal (for a given rms power). This
> > overestimation corresponds directly to the failure of the signal predictor
> > to handle the dependencies in the signal.
>
> I think this is visible on this graph I showed:
> http://morpheus.spectralhead.com/entropy/errordist/
>
> The prediction error distribution (in this particular test) is like
> Gaussian noise, and since this is a very simple approximator, it
> typically overestimates entropy (in this test, by about 12% on
> average).
>
> > Note that for any finite
> > estimation algorithm/signal model, you can always easily cook up a class of
> > random signals that it can't model well.
>
> Very true. For example, if the model tries to find and identify a
> single cycle of a periodic waveform, it will be baffled as soon as
> you send something inharmonic into it (like something percussive), or
> a mixture of periodic waveforms with inharmonically related
> frequencies, which nevertheless has a true entropy rate of zero.
>
> > In the present thread, we have an estimator that doesn't grok the feature
> > of "periodicity" and so it shows issues when applied to periodic signals.
>
> The model you're outlining (identify a single cycle of a periodic
> waveform) doesn't grok the feature of "non-periodicity" and so it
> shows issues when applied to non-periodic signals. You base it on the
> assumption that the signal is periodic, which is not always the case
> for arbitrary signals.
>
> >>(Do you really need an infinitely long sinc kernel to reconstruct your
> >>digital samples? Typically there's a filter that is "good enough" and
> >>computable.)
> >
> > That's not a very good analogy. The error due to finite reconstruction
> > filters decreases steadily and predictably with longer and longer filters,
> > so you can always in principle choose a finite length filter for any
> > desired level of accuracy. But for any finite memory entropy estimator,
> > there are always classes of signals for which it performs arbitrarily
> > badly. So you can't put an upper bound on the error simply by choosing a
> > sufficiently long memory - instead you are growing the class of signals
> > that it can model adequately, without necessarily improving performance on
> > signals outside of that class.
>
> I tend to disagree (although it somewhat depends on the model).
> Trivial example: if your model is to find a single repeating cycle,
> and the input is a mixture of periodic waveforms with inharmonically
> related frequencies and hence a longer resulting common cycle, then
> your algorithm will be able to utilize more memory to give a better
> estimate (otherwise, the longer period won't fit into its memory).
> Sure, the improvement of the estimate doesn't grow linearly with
> memory, so if you give an algorithm 10x as much memory, the estimate
> won't be 10x better, just slightly better (and it also depends on the
> model and the signal being analyzed).
>
> Typically, compression algorithms can achieve better compression
> ratios when more memory is available, and some data compressors employ
> 1GB or more memory to achieve very high ratios. Of course 10x as much
> memory does not correlate with 10x as much compression, just maybe a
> few percent improvement in compression ratio. After some point, as the
> compression approaches the theoretical ideal, there are going to be
> diminishing returns, but that doesn't mean that more available memory
> doesn't typically give better results (it's just not a linear
> correlation). The more memory a predictor has of past events,
> typically the better it can predict what will come next.
>
> > One thing worth mentioning, in light of that, is that the "true" entropy
> > rate may not be of much practical interest, in cases where there is no
> > practical way to build machinery that would estimate/utilize it. Similarly,
> > it's common in practice for some applications to use compression schemes
> > that are known to be quite poor (in the rate/distortion or entropy sense)
> > because constraints on power consumption/memory/latency are more pressing
> > than constraints on transmission bandwidth.
>
> People typically settle for what is "good enough" for a particular
> use case. It's usually not worth spending 10x as much time, CPU power
> and memory for a 5% improvement in compression ratio. Very high ratio
> compressors that approach the theoretical entropy as closely as
> possible are only interesting if you want to win some data
> compression prize money; otherwise they're so slow and hard on
> resources that they're impractical to use.
>
> The "true entropy" or "ideal compression ratio" is uncomputable for
> general signals - if you have an estimator or compressor that you
> think is ideal, it is impossible to prove that there isn't another
> model that is better than that. And if you keep giving the predictor
> more resources (time and memory), it will typically give slightly
> better and better approximations, the results approaching the true
> entropy with an infinite amount of time and resources.
>
> So in practice, there can only be various approximations, from the
> very simplest to the sophisticated pattern matching algorithms with
> "context mixing" (that is, building several simultaneous prediction
> models in varying contexts). Some predictors employ simulated neural
> networks, or other methods, all with varying results and resource
> requirements. What works well for one type of signal may not work as
> well for another type of signal.
>
> -P
