Shannon did his work in the context of communication channels, and chose
his terms accordingly.  Think of the distribution as coming from the
physics and practical details of the communication medium, and the value
chosen from that distribution as the message.  Then the entropy is a
measure of how much information the message can contain.
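
To make that concrete, here's a quick Python sketch (mine, not from
Shannon's paper) of the entropy formula H = -sum_i p_i log2 p_i.  A flat
distribution maximizes it; a skewed one carries fewer bits per symbol:

    import math

    def entropy(probs):
        # Shannon entropy, in bits, of a discrete distribution.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.25, 0.25, 0.25, 0.25]))  # uniform: 2.0 bits
    print(entropy([0.7, 0.1, 0.1, 0.1]))      # skewed:  ~1.36 bits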

For example, some media impose constraints on the data.  One such
constraint arises when transitions between one-bits and zero-bits are used
to synchronize the receiver to the datastream.  The receiver may lose
synchronization if there are no transitions for a while, so the data
must not contain more than N consecutive ones or more than M consecutive
zeros.  Compact discs have a constraint of this nature: their EFM channel
code requires each run of zeros between ones to be at least 2 and at most
10 bits long.

This constraint reduces the amount of information which can be transmitted
by the medium.  The extent of this reduction is a measure of how much
structure has been imposed.
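
If you want to see the reduction numerically, here's a rough sketch (my
own, with made-up run-length limits in the spirit of the example above)
that counts binary strings whose runs of ones are at most N and runs of
zeros at most M.  The ratio log2(count)/length approaches the capacity of
the constrained channel, in bits per symbol:

    import math
    from collections import defaultdict

    def count_valid(length, max_ones, max_zeros):
        # Count binary strings whose runs of 1s are <= max_ones
        # and runs of 0s are <= max_zeros.
        # State: (last bit, length of the current run).
        states = {(0, 1): 1, (1, 1): 1}
        for _ in range(length - 1):
            nxt = defaultdict(int)
            for (bit, run), n in states.items():
                limit = max_ones if bit else max_zeros
                if run < limit:               # extend the current run
                    nxt[(bit, run + 1)] += n
                nxt[(1 - bit, 1)] += n        # or start a run of the other bit
            states = nxt
        return sum(states.values())

    L = 40
    print(math.log2(count_valid(L, 2, 10)) / L)  # < 1 bit per symbol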

Dan Upper


> Dear all,
> 
> I wondered if anyone could help with a paradox at the heart of my
> understanding of entropy, information and pattern recognition.
> 
> I understand an informative signal as one which contains patterns, as
> opposed to randomly distributed numbers, e.g. noise. Therefore, I
> equate information with structure in the signal's distribution. However,
> Shannon equates information with entropy, which is maximum when each
> symbol in the signal is equally likely, i.e. a distribution
> with no `structure'. These views are contradictory.
> 
> What am I missing in my understanding?
> 
> Many thanks in advance,
> Riz
> 
> 
> Rizwan Choudrey
> Robotics Group
> Department of Engineering Science
> University of Oxford
> 07956 455380
> 
