Hi Rizwan,

At http://fuzzy.cs.uni-magdeburg.de/~borgelt/other/entropy.ps.gz you can find four pages from my book (written together with Rudolf Kruse) "Graphical Models - Methods for Data Analysis and Mining", which explain the intuitive idea underlying Shannon entropy and Shannon information gain. There, entropy is interpreted as a lower bound on the number of yes/no questions you have to ask in order to determine which of a set of possible alternatives actually obtains. In this view, coding a message simply means transmitting the answers for a fixed question scheme.

With this the two views are easily reconciled: entropy measures the (minimum amount of) information that you lack when you face a probability distribution over the possible states of a system, and it measures the (maximum amount of) information you receive when a stream of bits is sent to you through some channel. A uniform distribution has maximal entropy not because it is informative in itself, but because it is the situation in which you lack the most information: no structure helps you predict the outcome, so every answer you receive tells you the most.
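To make the questions view concrete, here is a minimal Python sketch (my own illustration, not taken from the book); the Huffman construction below merely stands in for an optimal yes/no question scheme:

    import math
    import heapq

    def entropy(probs):
        # Shannon entropy in bits: a lower bound on the average number
        # of yes/no questions needed to identify the actual outcome.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def avg_questions(probs):
        # Average number of questions asked by an optimal binary question
        # scheme (equivalently, the average Huffman code length); this is
        # always >= entropy(probs) and < entropy(probs) + 1.
        heap = [(p, i, [(i, 0)]) for i, p in enumerate(probs)]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, i1, s1 = heapq.heappop(heap)
            p2, i2, s2 = heapq.heappop(heap)
            merged = [(sym, d + 1) for sym, d in s1 + s2]
            heapq.heappush(heap, (p1 + p2, min(i1, i2), merged))
        depth = dict(heap[0][2])
        return sum(p * depth[i] for i, p in enumerate(probs))

    uniform = [0.25, 0.25, 0.25, 0.25]   # no structure: every answer is a surprise
    skewed  = [0.5, 0.25, 0.125, 0.125]  # structure: some outcomes are predictable

    for dist in (uniform, skewed):
        print("H = %.3f bits, avg questions = %.3f"
              % (entropy(dist), avg_questions(dist)))

For the uniform distribution both numbers come out as 2 bits: the entropy is maximal and no question scheme can do better. For the skewed distribution both come out as 1.75 bits: the structure lets a good question scheme get by with fewer questions on average.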
Maybe this helps.

Regards,
Chris

Rizwan Choudrey writes:
> Dear all,
>
> I wondered if anyone could help with a paradox at the heart of my
> understanding of entropy, information and pattern recognition.
>
> I understand an informative signal as one which contains patterns, as
> opposed to randomly distributed numbers, e.g. noise. Therefore, I
> equate information with structure in the signal's distribution. However,
> Shannon equates information with entropy, which is maximum when each
> symbol in the signal is as likely as the next, i.e. a distribution
> with no `structure'. These views are contradictory.
>
> What am I missing in my understanding?
>
> Many thanks in advance,
> Riz
>
>
> Rizwan Choudrey
> Robotics Group
> Department of Engineering Science
> University of Oxford
> 07956 455380
