Dear Charles, Marco, David, ...

On Tuesday, Jun 24, 2003, Charles R. Twardy wrote:
> Rolf Haenni writes about degrees of support
>> }What would Bayesians do in such a case. They would start by saying
>> }p(X|A)=1 and p(A)=0.1. So what is p(X)?
>> }    p(X) = p(X|A)p(A)+p(X|NOT-A)p(NOT-A) = 0.1 + p(X|NOT-A)*0.9.
>> }Correct. But what is p(X|NOT-A)??? Bayesians tend then to assume
>> }p(X|NOT-A)=0.5 and to compute p(X)=0.55.
> I would have thought such a "max entropy" Bayesian would put a flat
> prior between 0 and 1 on p(X|NOT-A), rather than a Dirac delta around
> p=0.5.
> Isn't that all we need? Doesn't the flatness of our pdf encapsulate
> "degree of ignorance"?
> Am I missing something?

Maybe. Ok, if X is a binary variable, expressing ignorance about
X|NOT-A by a flat density function might be a possible way to go. But
how do you measure the flatness of the density function? And what
if X has more than two, let's say n, possible values? You might then
suggest representing the n-1 unknown parameters by n-1 flat density
functions. But watch out, they are not independent! This makes the
whole story much more complicated (Marco, the same argument applies to
the idea of using probability intervals!). And it gets even worse if X
has an infinite number of possible values. But look how easily the
story goes in the case of non-additive probabilities of provability
(degrees of support).
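
To illustrate the dependence between the parameters, here is a minimal
sketch, assuming the "flat" prior over n possible values is read as a
uniform distribution on the probability simplex (a Dirichlet with all
parameters equal to 1). That reading, and the helper names
sample_flat_simplex and correlation, are assumptions made for
illustration only. The sketch estimates the correlation between two of
the parameters for n=3.

```python
import random

def sample_flat_simplex(n, rng):
    """Draw one probability vector uniformly from the simplex of n values."""
    # Normalised Exp(1) draws follow a Dirichlet(1, ..., 1) distribution.
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

def correlation(xs, ys):
    """Pearson correlation of two equally long samples."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / m
    var_x = sum((x - mean_x) ** 2 for x in xs) / m
    var_y = sum((y - mean_y) ** 2 for y in ys) / m
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_flat_simplex(3, rng) for _ in range(20000)]
p1 = [s[0] for s in samples]
p2 = [s[1] for s in samples]
rho = correlation(p1, p2)
print("estimated correlation between p1 and p2:", round(rho, 2))
```

Under this assumption the theoretical correlation is -1/(n-1), so the
estimate should be clearly negative: each marginal may look flat, but
the parameters are tied together by the constraint that they sum to 1.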

For example, let's assume that X is a real number and it is known that 
(similar to the initial example)
   1) p(A)=0.1
   2) A implies X=13.7
So what can be said about the hypothesis X=13.7?

Well, since A is true in 10% of the cases, and in those cases we
necessarily have X=13.7, it follows that the degree of support of
X=13.7 is dsp(X=13.7) = p(A) = 0.1.

On the other hand, because our knowledge consisting only of 1) and 2)
does not provide any way to deduce NOT(X=13.7), we have
dsp(NOT(X=13.7))=0 and thus a level of ignorance = 1 - 0.1 - 0 = 0.9,
which means that, if possible, any important decision depending on the
question X=13.7? should be delayed (of course, as David Poole correctly
said, it "depends on the relative costs of false-positives,
false-negatives and not making a decision").
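
The numbers above can be reproduced with a small sketch. It assumes
the knowledge is encoded as two scenarios (A and NOT-A), that the
degree of support of a hypothesis is the total probability of the
scenarios that entail it, and that ignorance is whatever probability
mass supports neither the hypothesis nor its negation. The encoding
and the names scenarios, entails and degree_of_support are
hypothetical, chosen for illustration only.

```python
# Each scenario is (probability, allowed values of X).
# None means the scenario puts no constraint on X at all.
scenarios = [
    (0.1, {13.7}),  # A holds, and A implies X = 13.7
    (0.9, None),    # NOT-A holds, nothing is known about X
]

def entails(allowed, values, negated):
    """Does a scenario prove 'X in values' (or 'X not in values')?"""
    if allowed is None:
        # An unconstrained scenario proves neither the hypothesis
        # nor its negation.
        return False
    if negated:
        return allowed.isdisjoint(values)
    return allowed <= values

def degree_of_support(scenarios, values, negated=False):
    """Total probability of the scenarios that entail the hypothesis."""
    return sum(p for p, allowed in scenarios
               if entails(allowed, values, negated))

dsp_for = degree_of_support(scenarios, {13.7})
dsp_against = degree_of_support(scenarios, {13.7}, negated=True)
ignorance = 1 - dsp_for - dsp_against

print("dsp(X=13.7)      =", dsp_for)
print("dsp(NOT(X=13.7)) =", dsp_against)
print("ignorance        =", ignorance)
```

Note that the sketch never needs a value for p(X|NOT-A): the NOT-A
scenario simply contributes to neither side.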

Isn't it beautiful to see how switching from probabilities of events to
probabilities of provability of events allows us to cope easily with
such incomplete models? How would you do it using flat density
functions (Charles) or probability intervals (Marco)?

With my best wishes,

Rolf

PS: CU at ECSQARU ?

***************************************************
Dr. Rolf Haenni
University of Konstanz
Center for Junior Research Fellows
D-78457 Konstanz, Germany
Phone:
  - Office: (0049) 7531 88 4885
  - Home:   (0041) 71 670 1049
WWW: http://haenni.shorturl.com
***************************************************
