Dear Charles, Marco, David, ...

On Tuesday, Jun 24, 2003, Charles R. Twardy wrote:

> Rolf Haenni writes about degrees of support
>> }What would Bayesians do in such a case. They would start by saying
>> }p(X|A)=1 and p(A)=0.1. So what is p(X)?
>> } p(X) = p(X|A)p(A)+p(X|NOT-A)p(NOT-A) = 0.1 + p(X|NOT-A)*0.9.
>> }Correct. But what is p(X|NOT-A)??? Bayesians tend then to assume
>> }p(X|NOT-A)=0.5 and to compute p(X)=0.55.
> I would have thought such a "max entropy" Bayesian would put a flat prior
> between 0 and 1 on p(X|NOT-A), rather than a Dirac delta around p=0.5.
> Isn't that all we need? Doesn't the flatness of our pdf encapsulate
> "degree of ignorance"?
> Am I missing something?

Maybe. OK, if X is a binary variable, expressing ignorance about X|NOT-A by a flat density function might be a possible way to go. But how do you measure the flatness of the density function? And what if X has more than two, let's say n, possible values? You might then suggest representing the n-1 unknown parameters by n-1 flat density functions. But watch out, they are not independent! This makes the whole story much more complicated (Marco, the same argument applies to the idea of using probability intervals!). And it gets even worse if X has an infinite number of possible values.

But look how easily the story goes in the case of non-additive probabilities of provability (degrees of support). For example, let's assume that X is a real number and it is known that (similar to the initial example)

  1) p(A)=0.1
  2) A implies X=13.7

So what can be said about the hypothesis X=13.7? Well, since A is true in 10% of the cases, and in those cases we necessarily have X=13.7, it follows that the degree of support of X=13.7 is

  dsp(X=13.7) = p(A) = 0.1.

On the other hand, because our knowledge consisting only of 1) and 2) does not provide any way to deduce NOT(X=13.7), we have dsp(NOT(X=13.7))=0 and thus a level of ignorance of 1 - 0.1 - 0 = 0.9, which means that, if possible, any important decision depending on the question "X=13.7?" should be delayed (of course, as David Poole correctly said, it "depends on the relative costs of false-positives, false-negatives and not making a decision").

Isn't it beautiful to see how switching from probabilities of events to probabilities of provability of events allows us to cope easily with such incomplete models? How would you do it using flat density functions (Charles) or probability intervals (Marco)?

With my best wishes,

Rolf

PS: CU at ECSQARU?

***************************************************
Dr. Rolf Haenni
University of Konstanz
Center for Junior Research Fellows
D-78457 Konstanz, Germany
Phone: - Office: (0049) 7531 88 4885
       - Home:   (0041) 71 670 1049
WWW: http://haenni.shorturl.com
***************************************************
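
For readers who want to check the numbers in the message above, here is a minimal sketch of the arithmetic in Python. It is only an illustration under stated assumptions, not an implementation of any particular reasoning system: the names `PROVABLE`, `dsp`, `ignorance` and `p_x` are hypothetical, the label "H" stands for the hypothesis X=13.7, and the table of provable hypotheses is written down by hand from facts 1) and 2) rather than derived by a prover. The function `p_x` simply evaluates the formula from the quoted initial example for a chosen value of p(X|NOT-A).

```python
from fractions import Fraction

# Fact 1): p(A) = 0.1, kept as an exact fraction to avoid rounding noise.
P_A = Fraction(1, 10)

# Hand-written assumption: which hypotheses can be proved in each scenario,
# given only facts 1) and 2). "H" stands for X=13.7, "not H" for NOT(X=13.7).
PROVABLE = {
    True: {"H"},   # A holds, and fact 2) says A implies X=13.7
    False: set(),  # NOT-A allows no conclusion about X at all
}


def scenario_prob(a_is_true):
    """Probability of the scenario in which A is true (or false)."""
    return P_A if a_is_true else 1 - P_A


def dsp(hypothesis):
    """Degree of support: total probability of the scenarios proving the hypothesis."""
    return sum(
        (scenario_prob(a) for a, proved in PROVABLE.items() if hypothesis in proved),
        Fraction(0),
    )


def ignorance(hypothesis, negation):
    """Probability mass that supports neither the hypothesis nor its negation."""
    return 1 - dsp(hypothesis) - dsp(negation)


def p_x(p_x_given_not_a):
    """Formula from the initial example: p(X) = 0.1 + p(X|NOT-A) * 0.9, with p(X|A) = 1."""
    return P_A + p_x_given_not_a * (1 - P_A)


support = dsp("H")
support_neg = dsp("not H")
ign = ignorance("H", "not H")

# The additive answer depends entirely on the value chosen for p(X|NOT-A).
bayes_low = p_x(Fraction(0))
bayes_mid = p_x(Fraction(1, 2))
bayes_high = p_x(Fraction(1))

print("dsp(H) =", float(support))
print("dsp(not H) =", float(support_neg))
print("ignorance =", float(ign))
print("p(X) for p(X|NOT-A) in {0, 0.5, 1}:",
      float(bayes_low), float(bayes_mid), float(bayes_high))
```

The three values of `p_x` show why the choice of p(X|NOT-A) matters in the additive setting: the result ranges from 0.1 to 1, and the width of that range equals the level of ignorance computed from the degrees of support.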