Re: testing if classifier accuracy differs significantly

Warren Sarle Sun, 20 Aug 2000 12:56:44 -0700

In article <[EMAIL PROTECTED]>,
 Mark Everingham <[EMAIL PROTECTED]> writes:
>Dear all,
>
>Help appreciated on this problem:
>
>I have two classifier systems which take as input an image and produce
>as output a label for each pixel in the image, for example the input
>might be of an outdoor scene, and the labels sky/road/tree etc.
>
>I have a set of images with the correct labels, so I can test how
>accurately a classifier performs by calculating for example the mean
>number of pixels correctly classified per image or the mean number of
>sky pixels correctly classified etc.
>
>The problem is this: Given *two* different classifiers, I want to test
>if the accuracy achieved by each classifier differs *significantly*. One
>way I can think of doing this is:
>
>for classifier 1,2
>       for each image
>               get % pixels correct
>       calculate mean and sd across images
>apply t-test
>
>Because the images used for each classifier are the same, I assume I can
>use a paired t-test. Assuming the distribution of % correct across
>images is approximately normal, this should work fine.
>
>However, I have two nagging objections to this:
>
> i) the accumulation of statistics across *images* rather than any other
>unit is fairly arbitrary

I don't think it is arbitrary. Pixels within an image, and their
classifications, are likely to be spatially correlated, whereas
pixels in different images are not correlated unless you are doing
something weird. The paired t-test you propose uses one value per
image, so it is not invalidated by spatial correlation within an
image. McNemar's test applied to individual pixels would not be
valid, because it treats the pixels as independent observations.
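
To make the per-image approach concrete, here is a minimal Python
sketch of the paired t statistic computed from one accuracy value per
image for each classifier. The function name paired_t_statistic and
the accuracy figures are hypothetical, made up purely for
illustration. The sketch stops at the statistic and its degrees of
freedom; the p-value comes from the t distribution with n - 1 degrees
of freedom, for example from a table or from scipy.stats.ttest_rel.

```python
import math
import statistics


def paired_t_statistic(acc_a, acc_b):
    """Return (t, df) for a paired t-test on per-image accuracies.

    acc_a[i] and acc_b[i] must be the scores of the two classifiers
    on the same image i, so that the image is the unit of analysis.
    """
    if len(acc_a) != len(acc_b):
        raise ValueError("both classifiers must be scored on the same images")
    # One difference per image: within-image correlation is absorbed here.
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two images")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 denominator
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical fraction of pixels correct per image (illustration only).
acc_1 = [0.80, 0.75, 0.90, 0.85, 0.70]
acc_2 = [0.78, 0.70, 0.88, 0.86, 0.65]

t, df = paired_t_statistic(acc_1, acc_2)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The test assumes the per-image differences are roughly normal, as
noted in the original question; with few images that assumption is
worth checking before trusting the result.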

Follow-ups set to sci.stat.consult

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
[EMAIL PROTECTED]    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.


=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
                  http://jse.stat.ncsu.edu/
=================================================================