On 12/08/2015, Sampo Syreeni <de...@iki.fi> wrote:
>
> Of course you need to: both of the words could have probability one, so
> that their occurrence pairwise would also have probability one for
> certain, and so the joint surprisal could be a flat zero.

And even in this case, the entropy (surprisal) may be nonzero. After
all, how do you know how to group the symbols into pairs? With two
symbols, there are 2 possible ways of grouping them. Is the pair
"ab", or is it "ba"? Is the signal "abababab..." or is it
"babababa..."?

These are equivalent to two 1-bit square waves with a 180-degree phase
difference (= inverted phase), hence they're two different signals.
Without your transmitter sending the phase information, how does your
receiver know whether the signal to be reconstructed is "abababab..."
or "babababa..."?

In the case where the two signals are equally probable, that's
precisely 1 shannon (= 1 bit) of entropy, even if the two symbols each
have probability 1 when grouped as pairs - since there are 2 ways of
grouping the symbols into pairs. Without sending a nonzero amount of
information, your receiver won't know how to reconstruct the signal
with 100% certainty. (And that is just the probabilistic Shannon
entropy; we're not even speaking about codebook length or algorithmic
entropy here.)
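
That 1 shannon is just the standard Shannon entropy of a fair binary
choice - a quick sanity check in Python:

import math

# Entropy of choosing between two equally likely signals,
# "abab..." vs "baba..." (i.e. the missing phase bit).
p = [0.5, 0.5]
H = -sum(pi * math.log2(pi) for pi in p)
print(H)  # 1.0 shannon (bit), exactly the phase bit the receiver lacks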

Why do you think all data compression competitions include the size of
the decompressor program in the score? Because otherwise you could
just say: "Look Mom! No entropy! I just hard-coded the data into the
decoder program!" If you do that, all that's going to happen is that
you get disqualified from the competition for cheating. Claiming zero
Shannon entropy means doing data compression by cheating - hiding the
data somewhere and pretending it's not there - which gets you
disqualified from any data compression competition.
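
A rough sketch of the scoring rule such competitions use (details vary
from contest to contest, so treat this as an assumption): the
decompressor counts toward the total, so data smuggled into the decoder
doesn't make the entropy go away.

def total_size(decompressor_bytes, compressed_bytes):
    # The score is the decompressor plus the compressed payload;
    # hard-coding the data into the decoder just moves bytes from
    # one term to the other.
    return len(decompressor_bytes) + len(compressed_bytes)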

One has to be willfully blind to actually believe that you can
reconstruct something from nothing. (The Burrows-Wheeler transform
will not help you, nor will a document you wrote many years ago.) The
probabilistic Shannon entropy is already a highly simplified view of
data compression.

-P