On 9 Mar 2004 15:00:59 -0800, [EMAIL PROTECTED] (apgoodb) wrote:

> I created a model to predict if there was a match between 2 lists. 
> The model has 4 levels of matching based on different fields.  It is
> better to say no match when it should be a match than to say match
> when it is not.
> n=5867
> TT=model indicated a match and it is a match
> TF=model indicated a match and it is not a match
> FF=model indicated a non-match and it is a non-match
> FT=model indicated a non-match and it was not a match
  -- typo, I assume, FT=  ... *is*  a match --

I keep stumbling over those terms.  They are a lot more 
familiar to me when they are called, in order, 
True-Positive, False-Positive, True-Negative, and False-Negative.

You are saying, further, that the two errors are not equal,
that a false-negative is more desirable than a false-positive.

> 
> Here are the results:
> Level 1:
> TT=919
> TF=4
> FF=3546
> FT=1398

If you arrange them in 2x2  fashion (model's call by true status),
the Odds Ratio for the first table  is 919*3546  / (4*1398),
which is about 580, or roughly 600.
That is one measure of how 'good'  the fit is, where
the OR  is 1.0  for  'chance'.
For the next tables, the values were 3700 and 5000,
and then Infinite for Level 4 (division by zero, since FT=0) -
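
For anyone who wants to reproduce the Odds Ratio arithmetic, here is
a minimal sketch in Python.  The function name and the handling of a
zero denominator are my own assumptions for illustration; only the
Level 1 and Level 4 counts quoted in this message are used, since the
Level 2 and 3 tables are snipped.

```python
import math

def odds_ratio(tp, fp, tn, fn):
    """Odds ratio (TP*TN)/(FP*FN) for a 2x2 table.

    Assumption: an empty error cell (FP*FN == 0) is reported as
    infinity, or as NaN when the numerator is zero as well.
    """
    numer = tp * tn
    denom = fp * fn
    if denom == 0:
        return math.inf if numer > 0 else math.nan
    return numer / denom

# Counts from the original post: TT=TP, TF=FP, FF=TN, FT=FN.
level1 = odds_ratio(tp=919, fp=4, tn=3546, fn=1398)
level4 = odds_ratio(tp=2317, fp=42, tn=3508, fn=0)

print(f"Level 1 OR: {level1:.1f}")
print(f"Level 4 OR: {level4}")
```

The infinite value at Level 4 comes only from the empty FT cell, so it
says little about how the 42 false positives should be weighed.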

[ snip, 2 and 3]
 
> Level 4:
> TT=2317
> TF=42
> FF=3508
> FT=0
> 
> As I go from level 1 to level 4, the model matches more TT, but it is
> also getting more TF.  How do I show this?  What does this tell me? 
> Should I just use the TT and TF?  Any help getting started is
> appreciated.

In epidemiology, similar data are described in terms
of Sensitivity, together with either Specificity or the
Positive Predictive Value (PPV):
How many of the cases are detected?  Sensitivity = TP/(TP+FN) 
How many of the Positive diagnoses were real?  PPV = TP/(TP+FP)
(Strictly, Specificity is TN/(TN+FP); the second ratio here is the PPV.)
 
Model #1 
Sens: 919/2317, 40%;  PPV: 919/923, 99.6%

Model #4 
Sens: 2317/2317, 100%; PPV: 2317/2359, 98.2%.
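
The two ratios above, TP/(TP+FN) and TP/(TP+FP), can be checked with
a short sketch like the one below.  The second ratio is usually named
the positive predictive value; the sketch also computes the textbook
specificity, TN/(TN+FP), for comparison.  The function and variable
names are hypothetical, and the counts are the Level 1 and Level 4
tables from the original post.

```python
def sensitivity(tp, fn):
    """Share of true matches that the model detects: TP/(TP+FN)."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """Share of declared matches that are real: TP/(TP+FP)."""
    return tp / (tp + fp)

def specificity(tn, fp):
    """Share of true non-matches that the model rejects: TN/(TN+FP)."""
    return tn / (tn + fp)

# Counts from the original post: TT=TP, TF=FP, FF=TN, FT=FN.
tables = {
    "Level 1": {"tp": 919, "fp": 4, "tn": 3546, "fn": 1398},
    "Level 4": {"tp": 2317, "fp": 42, "tn": 3508, "fn": 0},
}

results = {}
for name, t in tables.items():
    results[name] = {
        "sens": sensitivity(t["tp"], t["fn"]),
        "ppv": positive_predictive_value(t["tp"], t["fp"]),
        "spec": specificity(t["tn"], t["fp"]),
    }
    r = results[name]
    print(f"{name}: sens={r['sens']:.1%}  ppv={r['ppv']:.1%}  spec={r['spec']:.1%}")
```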

But if you really care about the 'errors',  I think you will 
compare them directly.  One model has 4 BAD  errors, and
1400 minor ones.  The other has 42  BAD  errors, and 0 of
the other.  Is that better, or worse?  - that depends on 
what the costs/ benefits are for being right/ wrong.
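
One way to make that comparison concrete is to put a cost on each
kind of error and add them up.  The sketch below does that in Python.
The unit costs are purely hypothetical placeholders, not values from
the original post; the point is only that the preferred level flips
at a particular ratio of false-positive cost to false-negative cost.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Total error cost, assuming each error of a kind costs the same."""
    return fp * cost_fp + fn * cost_fn

# Error counts from the original post (TF = false positive, FT = false negative).
level1 = {"fp": 4, "fn": 1398}
level4 = {"fp": 42, "fn": 0}

# Level 4 trades 1398 false negatives for 38 extra false positives, so the
# two levels cost the same when cost_fp / cost_fn equals this ratio.
break_even = (level1["fn"] - level4["fn"]) / (level4["fp"] - level1["fp"])
print(f"Break-even cost ratio (FP : FN) = {break_even:.1f}")

# Hypothetical cost ratios on either side of the break-even point.
for ratio in (10, 100):
    c1 = total_cost(level1["fp"], level1["fn"], cost_fp=ratio, cost_fn=1)
    c4 = total_cost(level4["fp"], level4["fn"], cost_fp=ratio, cost_fn=1)
    better = "Level 4" if c4 < c1 else "Level 1"
    print(f"FP costs {ratio}x an FN: Level 1 = {c1}, Level 4 = {c4}, prefer {better}")
```

If a false match is less than about 37 times as bad as a missed
match, Level 4 has the lower total cost under these assumptions;
above that ratio, Level 1 does.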


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
 - I need a new job, after March 31.  Openings? -
