----- Original Message -----
Sent: Thursday, February 20, 2003 2:25 PM
Subject: RE: [agi] A probabilistic/algorithmic puzzle...
OK... life lesson #567: When a mathematical explanation confuses non-math people, another mathematical explanation is not likely to help.

The basic situation can be thought of as follows.
Suppose you have a large set of people, say, all the people on Earth. Then you have a bunch of categories you're interested in, say:
Chinese
Arab
fat
skinny
smelly
female
...
Then you have some absolute probabilities, e.g.

P(Chinese) = .2
P(fat) = .15

etc., which tell you how likely a randomly chosen person is to fall into each of the categories.
Then you have some conditional probabilities, e.g.

P(fat | skinny) = 0
P(smelly | male) = .62
P(fat | American) = .4
P(slow | fat) = .7
The third one, for instance, tells you that if you know someone is American, then there's a .4 chance the person is fat (i.e. 40% of Americans are fat).
The problem at hand is: you're given some absolute and some conditional probabilities regarding the concepts at hand, and you want to infer a bunch of others.
In localized cases this is easy; for instance, using probability theory one can get evidence for P(slow | American) from the combination of P(slow | fat) and P(fat | American).
Given n concepts there are n^2 conditional probabilities to look at. The most interesting ones to find are the ones for which P(A|B) is very different from P(A), just as, for instance, P(fat | American) is very different from P(fat).
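A crude way to rank conditionals by "interestingness" is to sort them by |P(A|B) - P(A)|. In this sketch, the conditionals are the message's examples, but the absolute probabilities P(slow) and P(smelly) are made up since the message doesn't give them:

```python
# Rank conditional probabilities by how far they deviate from the
# corresponding absolute probability, |P(A|B) - P(A)|.

p_abs = {"fat": 0.15, "slow": 0.30, "smelly": 0.10}  # P(slow), P(smelly) ASSUMED
p_cond = {("fat", "American"): 0.4,      # P(A|B) keyed by (A, B)
          ("slow", "fat"): 0.7,
          ("smelly", "male"): 0.62}

interest = sorted(p_cond.items(),
                  key=lambda kv: abs(kv[1] - p_abs[kv[0][0]]),
                  reverse=True)
for (a, b), p in interest:
    print(f"P({a}|{b}) = {p:.2f}  vs  P({a}) = {p_abs[a]:.2f}")
```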
This problem is covered by elementary probability theory; solving it in principle is no issue. The tricky problem is solving it approximately, for a large number of concepts and probabilities, in a very rapid computational way.
Bayesian networks try to solve the problem by seeking a set of concepts arranged in an "independence hierarchy" (a directed acyclic graph with a concept at each node, so that each concept is independent of its non-descendants conditional on its parents -- and no, I don't feel like explaining that in nontechnical terms at the moment ;). But this can leave out a lot of information, because real conceptual networks may be grossly interdependent. Of course, one can then try to learn a whole bunch of different Bayes nets and merge the probability estimates obtained from each one....
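A tiny Bayes-net sketch, using the running concepts: a chain American -> fat -> slow, where the DAG's independence assumption says "slow" is independent of "American" given "fat". The joint then factors as P(a, f, s) = P(a) P(f|a) P(s|f), and any conditional can be read off by marginalizing. Values not stated in the message are made up:

```python
# Chain-structured Bayes net: American -> fat -> slow.
# Joint factorization licensed by the DAG: P(a, f, s) = P(a) P(f|a) P(s|f).

p_american = 0.05                   # ASSUMED
p_fat = {True: 0.4, False: 0.1}     # P(fat | american); False case ASSUMED
p_slow = {True: 0.7, False: 0.2}    # P(slow | fat); False case ASSUMED

def joint(a, f, s):
    pa = p_american if a else 1 - p_american
    pf = p_fat[a] if f else 1 - p_fat[a]
    ps = p_slow[f] if s else 1 - p_slow[f]
    return pa * pf * ps

# Infer P(slow | American) by summing the joint over "fat":
num = sum(joint(True, f, True) for f in (True, False))
den = sum(joint(True, f, s) for f in (True, False) for s in (True, False))
print(num / den)  # 0.4, matching the hand computation in the chaining sketch
```

Note that a single chain like this bakes in exactly one independence structure; a grossly interdependent conceptual network won't factor this cleanly, which is the limitation the paragraph above points at.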
One thing that complicates the problem is that, in some cases, as well as inferring probabilities one hasn't been given, one may want to make corrections to probabilities one HAS been given. For instance, sometimes one may be given inconsistent information, and one has to choose which information to accept.
For example, if you're told

P(male) = .5
P(young|male) = .4
P(young) = .1

then something's gotta give, because the first two probabilities imply P(young) >= .5 * .4 = .2.
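That consistency check is just the bound P(young) >= P(young & male) = P(young | male) * P(male), which a few lines of code can verify:

```python
# Consistency check: P(young) can't be smaller than
# P(young & male) = P(young | male) * P(male).

p_male = 0.5
p_young_given_male = 0.4
p_young_claimed = 0.1

lower_bound = p_young_given_male * p_male   # = P(young & male)
consistent = p_young_claimed >= lower_bound
print(lower_bound, consistent)  # 0.2 False -> something's gotta give
```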
Novamente's probabilistic reasoning system handles this problem pretty well, but one thing we're struggling with now is keeping this "correction of errors in the premises" under control. If you let the system revise its premises to correct errors (a necessity in an AGI context), then it can easily get carried away in cycles of revising premises based on conclusions, then revising conclusions based on the new premises, and so on in a chaotic trajectory leading to meaningless inferred probabilities.
As I said before, this is a very simple incarnation of a problem that takes a lot of other forms, more complex but posing the same essential challenge.
-- Ben G