Peter S, your combative attitude is unwelcome. It seems that you are less
interested in grasping these topics than in hectoring me and
other list members. Given that and the dubious topicality of this thread,
this will be my last response to you. I hope that you find a healthy way to
address the source of your hostility, and also that you gain more insight
into Information Theory.

My apologies to the list for encouraging this unfortunate tangent.

E

On Thu, Jul 16, 2015 at 8:38 PM, Peter S <peter.schoffhau...@gmail.com>
wrote:

> On 17/07/2015, Ethan Duni <ethan.d...@gmail.com> wrote:
> > What are these better estimators? It seems that you have several
> > estimators in mind but I can't keep track of what they all are,
> > I urge you to slow down, collect your thoughts, and
> > spend a bit more time editing your posts for clarity (and length).
>
> I urge you to pay more attention and read more carefully.
> I do not want to repeat myself several times.
> (Others will think it's repetitive and boring.)
>
> [And fuck the "spend more time" part, I already spent 30+ hours editing.]
>
> > And what is "entropy per bit?" Entropy is measured in bits, in the first
> > place. Did you mean "entropy per symbol" or something?
>
> Are you implying that a bit is not a symbol?
> A bit *is* a symbol. So of course, I meant that.
>
> > Entropy is measured in bits, in the first place.
>
> According to IEC 80000-13, entropy is measured in shannons:
> https://en.wikipedia.org/wiki/Shannon_%28unit%29
>
> For historical reasons, "bits" is often used synonymously with "shannons".
>
> > Maybe you could try using this "brain" to interact in a good-faith way.
>
> Faith belongs to church.
>
> > The "entropy" of a signal - as opposed to entropy rate - is not a
> > well-defined quantity, generally speaking.
>
> Its exact value is not "well-defined", yet it is *certain* to be non-zero.
> (Unless you have only 1 particular signal with 100% probability.)
>
> > The standard quantity of interest in the signal context is entropy rate
>
> Another standard quantity of interest in the signal context is "entropy".
>
> https://en.wikipedia.org/wiki/Entropy_%28information_theory%29
>
> Quote:
> "Entropy is a measure of unpredictability of information content."
>
> > If you want to talk about "signal entropy," distinct from the entropy
> > rate, then you need to do some additional work to specify what you
> > mean by that.
>
> Let me give you an example.
>
> You think that a constant signal has no randomness, thus no entropy
> (zero bits). Let's do a little thought experiment:
>
> I have a constant signal, that I want to transmit to you over some
> noiseless discrete channel. Since you think a constant signal has zero
> entropy, I send you _nothing_ (precisely zero bits).
>
> Now try to reconstruct my constant signal from the "nothing" that you
> received from me! Can you?
>
> .
> .
> .
>
> There's a very high chance you can't. Let me give you a hint: my
> constant signal is 16-bit signed PCM, and its first sample is
> sampled from uniform-distribution noise.
>
> What is the 'entropy' of my constant signal?
>
> Answer: since the first sample is sampled from uniform-distribution
> noise, the probability of you successfully guessing my constant signal
> is 1/65536. Hence, it has an entropy of log2(65536) = 16 bits. In
> other words, I need to send you all 16 bits of the first sample
> for you to be able to reconstruct my constant signal with 100%
> certainty. Without receiving those 16 bits, you cannot reconstruct my
> constant signal with 100% certainty. That's the measure of its
> "uncertainty" or "unpredictability".
>
> So you (falsely) thought a "constant signal" has zero randomness and
> thus zero entropy, yet it turns out that because I sampled that
> constant signal from the output of 16-bit uniform-distribution white
> noise, my constant signal has 16 bits of entropy. And if I want to
> transmit it to you, then I need to send you a minimum of 16 bits for
> you to be able to reconstruct it, even though it's a "constant" signal.
>
> It may have an asymptotic 'entropy rate' of zero, yet that doesn't
> mean that the total entropy is zero. So the 'entropy rate' doesn't
> tell you the entropy of the signal. The total entropy (uncertainty,
> unpredictability, randomness) in this particular constant signal is 16
> bits, so its entropy is nonzero. Hence, if I want to send it to you in
> a message, I need to send a minimum of 16 bits. The 'entropy rate'
> doesn't tell you the full picture.
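>
> For concreteness, here is a small Python sketch (mine, added purely for
> illustration) that evaluates H = -SUM p_i * log2(p_i) for the 65536
> equally likely 16-bit values of that first sample:
>
>     from math import log2
>
>     # 65536 equally likely 16-bit values for the first sample
>     p = [1.0 / 65536] * 65536
>     # Shannon entropy in bits (shannons): H = -sum(p_i * log2(p_i))
>     H = -sum(pi * log2(pi) for pi in p)
>     print(H)   # prints 16.0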
>
> > Switching back and forth between talking about narrow parametric
> > classes of signals (like rectangular waves) and general signals is
> > going to lead to confusion, and it is on you to keep track of that
> > and write clearly.
>
> When I say "square wave", then I mean square wave.
> When I say "arbitrary signal", then I mean arbitrary signal.
> What's confusing about that? What makes you unable to follow?
>
> > Moreover, the (marginal) entropy of (classes of) signals is generally not
> > an interesting quantity in the first place [...]
> > Hence the emphasis on entropy rate,
> > which will distinguish between signals that require a lot of bandwidth to
> > transmit and those that don't.
>
> "Entropy" also tells you how much bandwith you require to transmit a
> signal. It's _precisely_ what it tells you. It doesn't tell you "per
> symbol", but rather tells you the "total" number of bits you minimally
> need to transmit the signal fully. I don't understand why you just
> focus on "entropy rate" (= asymptotic entropy per symbol), and forget
> about the total entropy. Both measures tell you the same kind of
> information, in a slightly different context.
>
> >>But not zero _entropy_. The parameters (amplitude, phase, frequency,
> >>waveform shape) themselves have some entropy - you *do* need to
> >>transmit those parameters to be able to reconstruct the signal.
> >
> > Again, that all depends on assumptions on the distribution of the
> > parameters. You haven't specified anything like that, so these assertions
> > are not even wrong. They're simply begging the question of what is your
> > signal space and what is the distribution on it.
>
> I think you do not understand what I say. The distribution of
> parameters is entirely irrelevant. I only said, the entropy is
> *nonzero*. Unless your model can only transmit 1 single waveform with
> a particular set of parameters (= 100% probability, zero entropy), it
> will have a *nonzero* entropy. I cannot say what that value is, I just
> say that unless you have 1 single set of parameters with 100%
> probability, the entropy of the parameters will be certainly
> *nonzero*. Without knowing or caring or needing to know what the
> actual distribution is.
>
> > Again, the important distinction is between estimators that don't hit the
> > correct answer even theoretically (this indicates that the underlying
> > algorithm is inadequate)
>
> There is no practical algorithm that could "hit the correct answer
> theoretically". If that is your measure of adequacy, then all
> estimates will be 'inadequate'.
>
> > and the inevitable imperfections of an actual
> > numerical implementation (which can be made as small as one likes by
> > throwing more computational resources at them).
>
> Yet it will still be nonzero, however high precision and however much
> computational resources you throw at it (which will be impractical by
> the way).
>
> >>Let's imagine you want to transmit a square wave with amplitude=100,
> >>phase=30, frequency=1000, and duty cycle=75%.
> >>Question: what is the 'entropy' of this square wave?
> >
> > What is the distribution over the space of those parameters? The
> > entropy is a function of that distribution, so without specifying it
> > there's no answer.
>
> The actual distribution is entirely irrelevant here. What I am saying
> is that - unless the probability distribution is such that a square
> wave with these parameters has 100% probability, i.e. your model can
> *only* transmit this single particular square wave and no other square
> waves - the entropy will be *nonzero*. What it is exactly, I can't
> tell, and I don't care. I'm just telling you that - other than a
> single particular corner case - it is *certain* to be different from
> zero.
>
> > Said another way: if that's the only signal I want to transmit, then I'll
> > just build those parameters into the receiver and not bother transmitting
> > anything. The entropy will be zero. The receiver will simply output the
> > desired waveform without any help from the other end.
>
> Exactly. That's precisely what I am telling you. What I am telling
> you is that - without exception - in *all* other cases, the entropy
> will be *nonzero* (without knowing or caring what it actually is). As
> soon as you want to build a transmitter that can transmit 2 or more
> sets of possible parameters where probability p_i != 1, then the total
> entropy will be *nonzero*. If total entropy is defined as
>
>     H = -K * SUM p_i * log2(p_i)     (Ref.: [Shannon1948])
>
> then if for any i the probability p_i != 1, this sum will produce a
> result that is different from zero. Hence, *nonzero* entropy.
>
> If you only have a single p=1, then this equation will yield zero
> entropy. Since the sum of all probabilities is 1, assume some other
> set of parameters has a nonzero probability q != 0. Since p + q = 1
> and thus p = 1-q, because q != 0, it follows that p != 1.
> Hence, -log2(p) != 0, hence it follows from the above formula that the
> entropy H != 0. I do not know what H is exactly, and I do not care, I
> just know that if any p_i != 1, then H != 0.
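>
> Numerically (a throwaway Python check of mine, just to illustrate): for
> a source with just two sets of parameters, with probabilities p and
> q = 1-p,
>
>     from math import log2
>
>     H = lambda p: -p * log2(p) - (1 - p) * log2(1 - p)
>     print(H(0.5), H(0.9), H(0.99))   # 1.0, ~0.469, ~0.0808 (all nonzero)
>
> Only p = 1 (or p = 0) would make H zero.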
>
> Here is a plotted graph of -p*log2(p) as a function of p:
>
> http://morpheus.spectralhead.com/img/log2_p.png
>
> Trivially, the graph of -p*log2(p) is *nonzero* for any p != 1 (or p
> != 0). Without knowing p, I don't know what it is *exactly*, I only
> know that it is *certain* to be nonzero, which is trivially visible if
> you look at the graph.
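>
> And here is a throwaway Python snippet (mine, purely illustrative, not
> taken from that plot) evaluating the single term -p*log2(p) at a few
> points:
>
>     from math import log2
>
>     def term(p):
>         # single term of the entropy sum; 0 by convention at p = 0
>         return 0.0 if p == 0 else -p * log2(p)
>
>     for p in (0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
>         print(p, round(term(p), 4))
>     # 0.1 0.3322, 0.25 0.5, 0.5 0.5, 0.75 0.3113, 0.9 0.1368, 1.0 0.0
>
> Nonzero everywhere except at p = 1 (and the p = 0 limit).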
>
> > Again, this is begging the question. "A square wave" is not a
> > distribution, and so doesn't have "entropy." You need to specify what
> > is the possible set of parameters, and then specify a distribution
> > over that set, in order to talk about the entropy.
>
> False. I do not need to know the "exact" probabilities to know that
> the entropy will be *nonzero*. "Square wave" in this context meant
> the set of parameters representing the square wave in this model,
> which has a probability distribution. I do not need to know the
> _actual_ probability distribution to know that the entropy will be
> *nonzero*, unless it's the very special corner case of 100%
> probability of this particular square wave with this particular set of
> parameters, and 0% probability of *all* other square waves. Which is
> a very, very, very, very unlikely scenario (and your transmitter will
> be very very dumb and quite unusable, since it cannot transmit any
> square wave other than this single one).
>
> In other words, -p*log2(p) is certain to be nonzero for any p != 1
> (which is the single corner case of 100% probability of this
> particular square wave). Hence, it is certain that the entropy H = -K
> * SUM p_i * log2(p_i) is nonzero, if *any* of the terms is nonzero
> (which happens when any p_i != 1, which happens when you have any
> other nonzero probability p_j != 0). Said otherwise, if your
> transmitter is able to transmit at least *two* sets of parameters with
> nonzero probability, then the entropies of both sets of parameters are
> bound to be nonzero. I don't know what they actually are, I just
> know that they're nonzero.
>
> Think of it as this - if your receiver can distinguish only two
> different sets of parameters, then you need to send at least *one* bit
> to distinguish between them - '0' meaning square wave "A", and '1'
> meaning square wave "B". Without sending at least a *single* bit, your
> receiver cannot distinguish between square waves A and B. Same for
> more sets of parameters - I don't care or know how many bits you need
> to minimally send to uniquely distinguish between them, I just know
> that it is *certain* to be more than zero bits (at least *one* or
> more, hence nonzero amount of bits, hence nonzero entropy).
>
> >>If you assume the entropy _rate_ to be the average entropy per bits
> >
> > What is "per bits?" You mean "per symbol" or something?
>
> Of course. A bit is a symbol. If you prefer, call it "entropy per
> symbol" instead.
>
> > The definition of entropy rate is the limit of the conditional entropy
> > H(X_n|X_{n-1},X_{n-2},...) as n goes to infinity.
>
> This particular lecture says it has multiple definitions:
> http://www2.isye.gatech.edu/~yxie77/ece587/Lecture6.pdf
>
> Definition 1: average entropy per symbol
> H(X) = lim [n->inf] H(X^n)/n
>
> Reference:
> Dr. Yao Xie, ECE587, Information Theory, Duke University
>
> If you disagree, please discuss it with Dr. Xie.
>
> Definition 2 is the same as yours.
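>
> To make Definition 1 concrete for the two signals discussed earlier (an
> i.i.d. 16-bit uniform noise source versus my "random constant" signal),
> here is a tiny Python sketch of mine, using the closed-form block
> entropies rather than any estimator:
>
>     # Block entropy H(X^n) in bits, then per-symbol entropy H(X^n)/n:
>     #   i.i.d. uniform 16-bit noise:             H(X^n) = 16*n
>     #   constant signal whose first sample is
>     #   drawn from 16-bit uniform noise:         H(X^n) = 16 for every n
>     for n in (1, 10, 100, 1000):
>         print(n, (16 * n) / n, 16 / n)
>     # n=1000 gives 16.0 for the noise and 0.016 for the constant signal
>
> The per-symbol figure for the constant signal goes to zero (entropy rate
> 0), while its total entropy stays at 16 bits - which is the whole point.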
>
> > You were doing okay earlier in this thread but seem to be
> > getting into muddier and muddier waters as it proceeds, and the resulting
> > confusion seems to be provoking some unpleasantly combative behavior from
> > you.
>
> I successfully made an algorithm that gives ~1 for white noise, and a
> value approaching zero for periodic signals. And you're constantly
> telling me that I am wrong (without even understanding what I say).
> What behaviour did you expect? Try to look at the above formula, and
> understand what it means.
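>
> Just to make that concrete, here is a toy sketch (mine, added for
> illustration only - it is *not* the actual algorithm): estimate the
> conditional entropy of the current sample given the previous one from a
> coarse joint histogram, and normalize by log2 of the number of
> quantization levels. White noise lands near 1, a deterministic periodic
> signal lands near 0:
>
>     import numpy as np
>
>     def normalized_cond_entropy(x, levels=32):
>         # Coarsely quantize to `levels` values, then estimate
>         # H(X_n | X_{n-1}) = H(X_{n-1}, X_n) - H(X_{n-1}) from pair
>         # counts, normalized by log2(levels) so the result lies in [0, 1].
>         x = np.asarray(x, dtype=float)
>         q = np.clip(((x - x.min()) / (x.max() - x.min()) * levels).astype(int),
>                     0, levels - 1)
>         joint = np.zeros((levels, levels))
>         np.add.at(joint, (q[:-1], q[1:]), 1)   # counts of (prev, curr) pairs
>         pj = joint / joint.sum()
>         pm = pj.sum(axis=1)                    # marginal of the previous sample
>         h = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))
>         return (h(pj) - h(pm)) / np.log2(levels)
>
>     rng = np.random.default_rng(0)
>     noise = rng.uniform(-1.0, 1.0, 100000)
>     t = np.arange(100000)
>     square = np.where((t // 32) % 2 == 0, 1.0, -1.0)   # square wave, period 64
>
>     print(normalized_cond_entropy(noise))    # close to 1 for white noise
>     print(normalized_cond_entropy(square))   # close to 0 for the periodic signal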
>
> -P
>
> Ref.: Shannon, Claude E. (1948). "A Mathematical Theory of Communication".
--
dupswapdrop -- the music-dsp mailing list and website:
subscription info, FAQ, source code archive, list archive, book reviews, dsp links
http://music.columbia.edu/cmc/music-dsp
http://music.columbia.edu/mailman/listinfo/music-dsp
