Re: Thomas' Fuzziness and Probability

S. F. Thomas Tue, 31 Jul 2001 10:13:45 -0700
You raise a good question. As the author in question, may I
respond as follows:

There really is a dualistic relationship between fuzziness and
probability. They are distinct concepts, but I would argue that
there is a real sense in which the former *derives* from the
latter. That thought is anathema to many fuzzicists, who fear the
implication that there is then "nothing new" to fuzzy, since it
would fall ultimately under the ambit of probability theory. I don't
think so. There is plenty of new semantics in fuzzy set theory
of which probabilists have been blissfully unaware, and which in
fact helps to illuminate some problems in the foundations at
least of statistical inference theory.

The duality is precisely analogous to the duality that exists
between probability and likelihood. A probability distribution
over sample space gives rise to a likelihood function over
parameter space. The one is a set function, the other a point
function, and they pertain to different domains. In the case of
natural language semantics, it is precisely because language-use
is a chance phenomenon, even in a calibrational setting, that
there is fuzziness in the meanings of terms. More precisely,
uncertainty in the calibrational response variable, either yes or
no, to a series of calibrational propositions such as "would you
use the term 'tall' to describe the height value for which John
stands as exemplar, in the context of heights of adult males?"
gives rise to a semantic likelihood function over height space,
induced by probabilistic response uncertainty over calibrational
response (yes/no) space, it being understood that many different
height value exemplars (Jim, Peter, Paul, etc.) are similarly
presented in calibrational setting. The affirmation probability
(Bernoulli parameter) varies as a point function over height
space, as opposed to a set function over calibrational response
space. Thus the calibrational response rates traced out with
respect to the height variable are in no sense a probability
distribution, since they would in general not sum to unity; nor are
they a frequency distribution over the height values of the adult
male population, which in an obvious sense may be rendered as a
probability distribution. All you have is a characteristic
function that describes, for various height values, the rate at
which a relevant speaker population would use the term "tall" to
describe the height values in question. It is a membership
function in the obvious Zadehian sense of a point function
ranging from 0 to 1, though Zadeh may or may not approve of the
manner in which it is obtained. It could also have been called a
*semantic likelihood* function, or a *characteristic* function of
the *term* tall, as distinct from the *membership* function
characterizing the associated set of tall *men*.
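
A minimal sketch of the calibrational setup described above may make the
point concrete. Everything specific in it is an assumption made for
illustration: the logistic shape of the affirmation probability, the
midpoint of 180 cm, the scale of 4 cm, and the 1000 simulated respondents
per exemplar height. None of these values comes from the book.

```python
import math
import random

def affirmation_probability(height_cm, midpoint=180.0, scale=4.0):
    """Hypothetical Bernoulli parameter: chance a speaker says 'yes, tall'."""
    return 1.0 / (1.0 + math.exp(-(height_cm - midpoint) / scale))

def calibrate(heights, respondents=1000, seed=0):
    """Estimate the yes-rate at each exemplar height from simulated answers."""
    rng = random.Random(seed)
    rates = {}
    for h in heights:
        p = affirmation_probability(h)
        yes = sum(1 for _ in range(respondents) if rng.random() < p)
        rates[h] = yes / respondents
    return rates

# Exemplar heights in cm (stand-ins for John, Jim, Peter, Paul, etc.)
heights = [160, 170, 175, 180, 185, 190, 200]
membership = calibrate(heights)

for h, mu in membership.items():
    print(f"{h} cm: mu_tall = {mu:.2f}")

# Each value is a separate Bernoulli rate, so the values need not sum to 1.
print("sum over heights:", round(sum(membership.values()), 2))
```

Each entry of `membership` is its own yes-rate over yes/no response space,
which is why the values trace out a point function over height space rather
than a probability distribution over heights.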

I like the term semantic likelihood because it gets to the heart
of the matter in my view. In a non-calibrational setting, e.g. the
use of the term "tall" by a rape victim in court to describe the
height of her attacker, it is the calibrational response
uncertainty in terms purely of language-use, that leads to
semantic uncertainty about the precise height to which she
refers. The semantic likelihood function traces out the relative
possibility of various height-value hypotheses consistent with
her description of her attacker as "tall". In ordinary discourse
and comprehension, we don't need to have it spelt out, obviously.
But it is in some sense there.

This analogy leads to the perhaps startling conclusion that the
(absolute) likelihood function familiar from statistical
inference theory is in some sense also a membership function!  It
certainly satisfies the minimum condition that it range on the
[0,1] interval. But semantically as well, it may be construed
simply as the term corresponding to what the data "say" about
some unknown parameter of interest. The greater the quantity of
data, the more precise, or less fuzzy, is the characterization of
the unknown parameter of interest. Statistical sample data
relevant to inference concerning the value of model parameters
are therefore analogous to fuzzy natural-language statements
about things like people's height. They are also analogous to
measurement, which for continuous attributes is fuzzy in general,
since no measurement may be made literally to an infinite number
of decimal places, and at the digit where uncertainty enters,
accidental and systematic errors of measurement, exactly like
those associated with the calibrational proposition with which we
started for characterizing the term "tall", may conspire to
render the range of uncertainty in the ultimate digit of
measurement fuzzy rather than crisp.
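
The claim that more data gives a less fuzzy characterization can be
sketched numerically. The example below assumes a binomial model with 70%
observed successes and measures fuzziness as the width of the parameter
range whose likelihood is at least half its maximum. The model, the 0.5
cut level, and the function names are illustrative assumptions, not taken
from the book.

```python
from math import comb

def binomial_likelihood(theta, successes, trials):
    """Probability of the observed count given theta; always in [0, 1]."""
    return (comb(trials, successes)
            * theta**successes
            * (1 - theta)**(trials - successes))

def support_width(successes, trials, level=0.5, grid_size=1001):
    """Width of the theta range with likelihood >= level * maximum."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    values = [binomial_likelihood(t, successes, trials) for t in grid]
    peak = max(values)
    inside = [t for t, v in zip(grid, values) if v >= level * peak]
    return max(inside) - min(inside)

# Same observed proportion (70%), increasing sample size.
widths = {}
for trials in (10, 100, 1000):
    successes = (7 * trials) // 10
    widths[trials] = support_width(successes, trials)
    print(trials, round(widths[trials], 3))
```

The printed widths shrink as the number of trials grows, which is the sense
in which a larger sample yields a sharper "term" describing the unknown
parameter.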

In all of this there is an essential and unavoidable duality and
interplay between uncertainty of the probabilistic sort, and
uncertainty of the fuzzy (also likelihood) sort. To treat these
two kinds of uncertainty in this fashion is not to exalt one over
the other, but rather to recognize that they are inextricably linked,
as are the two faces of a single coin. We give precedence to
probability naturally, because it comes first in our
comprehension of types of uncertainty, and as earlier mentioned
there is an obvious sense in which likelihood, including the
present semantic variant, *derives* from probability. But it is
precisely because statistical inference theory privileges
probability over likelihood that we have on one hand the
convoluted methods of classical inference seeking to render
uncertainty of the likelihood sort in probabilistic terms (the
use of "p-values" and the like), and on the other hand the vain
Bayesian attempt to treat likelihood as though it were
probability, and therefore subject to a simple integration method
of disjunction (evaluation of composite hypotheses or set
evaluation).  Treating likelihood functions as the fuzzy characteristic
functions that they essentially are, and using the brilliant
semantics introduced by Zadeh, allows us to develop a new theory
of statistical inference. But this requires both hands to clap:
the right hand of probability with the left hand of
likelihood/fuzzy/possibility. Privileging fuzzy in denial of a
probability connection is misguided in my view, in the same way
that using probability to represent a simple likelihood reality
leads to difficult analytical contortions as in both classical
and Bayesian statistics. It is time to get past the early
tutorial exaggeration that insisted that fuzzy was different from
probability. That is true, but it is also true that the twain do
meet.
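
The contrast between the two ways of evaluating a composite hypothesis can
be sketched as follows. The example assumes a binomial likelihood with 7
successes in 10 trials, and it uses the maximum as the disjunction rule,
which is Zadeh's standard operator for fuzzy union. Whether the book
adopts exactly that rule is not stated in this message, so treat the
`possibility` function as an assumption. The `integrated` function
corresponds to posterior mass under a flat prior.

```python
def likelihood(theta, successes=7, trials=10):
    """Binomial likelihood kernel (constant factor omitted)."""
    return theta**successes * (1 - theta)**(trials - successes)

grid = [i / 1000 for i in range(1001)]
values = [likelihood(t) for t in grid]
peak = max(values)
total = sum(values)

def integrated(lo, hi):
    """Treat likelihood like a density: share of total area in [lo, hi]."""
    return sum(v for t, v in zip(grid, values) if lo <= t <= hi) / total

def possibility(lo, hi):
    """Max-rule disjunction: largest relative likelihood in [lo, hi]."""
    return max(v for t, v in zip(grid, values) if lo <= t <= hi) / peak

# Composite hypotheses expressed as intervals of theta.
for lo, hi in [(0.69, 0.71), (0.6, 0.8), (0.0, 0.5)]:
    print((lo, hi), round(integrated(lo, hi), 3), round(possibility(lo, hi), 3))
```

A narrow interval around the best-supported value receives little
integrated mass yet full possibility, so the two evaluation rules can rank
the same composite hypotheses quite differently.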

Hope this is helpful.

Regards,
S. F. Thomas

Joe Pfeiffer wrote:
> 
> I'm currently reading the book mentioned above; I'm wondering about
> something...
> 
> He attempts to define the membership function of a set by using
> what he calls ``calibrational propositions'' -- the idea is that if
> you ask 100 people if John is tall, and 70 of them say ``yes,'' then
> mu(tall) = .7.  While this seems to do a good job of capturing common
> word usage, it's not at all clear to me that it captures the fuzzy
> behavior of the ``tall'' set; it seems probabilistic rather than
> fuzzy.
> 
> So, what are other people's reactions?
> --
> Joseph J. Pfeiffer, Jr., Ph.D.       Phone -- (505) 646-1605
> Department of Computer Science       FAX   -- (505) 646-1002
> New Mexico State University          http://www.cs.nmsu.edu/~pfeiffer
> SWNMRSEF:  http://www.nmsu.edu/~scifair

