[pymvpa] biased accuracy with nperlabel='equal'?

David V. Smith Sun, 23 Oct 2011 11:33:49 -0700

Hi,

I have 140 structural images: 78 are in class A (label 2) and 62 are in class B 
(label 1). To ensure that the training algorithm (LinearNuSVMC) doesn't build a 
biased model, I am using the nperlabel='equal' option in my splitter. I know 
this part of my code is working (see below), so I'm confused why my 
cross-validation results (leave-one-scan-out) are biased with random data 
(e.g., 55.71% accuracy). Can someone please clarify why I'm not getting 50% 
with random data? I suspect I'm just not understanding something simple...
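
As a quick arithmetic check on the numbers in this message (plain Python, no 
PyMVPA calls; the variable names are illustrative assumptions): the reported 
55.71% coincides with 78/140, which is the accuracy a constant prediction of 
the larger class would obtain when every one of the 140 scans is tested once. 
This does not by itself establish the cause; it only shows that the two 
figures match.

```python
# Class sizes taken from the dataset summary: 78 scans in class A (label 2)
# and 62 scans in class B (label 1).
n_a, n_b = 78, 62
n_total = n_a + n_b

# In leave-one-scan-out, each scan serves as the test sample exactly once, so
# the pooled test labels keep the original 78/62 split even if every training
# set is balanced. A classifier that always answered "class A" would then
# score n_a / n_total.
majority_share = 100.0 * n_a / n_total

print("%.2f" % majority_share)  # prints 55.71
```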


Thanks!
David


In [11]: print ds.summary()
Dataset / float64 140 x 20068
uniq: 140 chunks 2 labels
stats: mean=0.114425 std=0.318326 var=0.101332 min=0 max=1
No details due to large number of labels or chunks. Increase maxc and maxl if 
desired
Summary per label across chunks
  label  mean  std  min max #chunks
   1    0.443 0.497  0   1     62
   2    0.557 0.497  0   1     78


In [10]: print '\n'.join([d.summary() for d in 
list(NFoldSplitter(nperlabel='equal')(ds))[0]])

Dataset / float64 122 x 20068
uniq: 122 chunks 2 labels
stats: mean=0.107628 std=0.30991 var=0.0960441 min=0 max=1
No details due to large number of labels or chunks. Increase maxc and maxl if 
desired
Summary per label across chunks
  label mean std min max #chunks
   1     0.5 0.5  0   1     61
   2     0.5 0.5  0   1     61

Dataset / float64 1 x 20068
uniq: 1 chunks 1 labels
stats: mean=0.077935 std=0.268069 var=0.0718612 min=0 max=1

Counts of labels in each chunk:
  chunks\labels 1.0
                ---
      1.0        1

Summary per label across chunks
  label mean std min max #chunks
   1      1   0   1   1     1

Summary per chunk across labels
  chunk mean std min max #labels
   1      1   0   1   1     1
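
The split summaries above (122 training samples at 61 per label, 1 test 
sample) are what one would expect if 'equal' subsamples each label in the 
training set down to the smallest per-label count. The following is a minimal 
sketch of that idea under that assumption; it is not PyMVPA's implementation, 
and `balanced_train_indices` is a hypothetical helper written only for 
illustration.

```python
import random
from collections import Counter

def balanced_train_indices(labels, test_index, rng):
    """Hold out one sample, then subsample each label to the smallest count."""
    by_label = {}
    for i, lab in enumerate(labels):
        if i != test_index:
            by_label.setdefault(lab, []).append(i)
    # Assumed meaning of 'equal': keep as many samples per label as the
    # least-represented label has after the hold-out.
    n_keep = min(len(idx) for idx in by_label.values())
    train = []
    for lab in sorted(by_label):
        train.extend(rng.sample(by_label[lab], n_keep))
    return sorted(train)

labels = [1] * 62 + [2] * 78   # label counts from the dataset summary
rng = random.Random(0)         # fixed seed only to make the sketch repeatable
train = balanced_train_indices(labels, test_index=0, rng=rng)
counts = Counter(labels[i] for i in train)

print("%d %d %d" % (len(train), counts[1], counts[2]))  # prints 122 61 61
```

Holding out a label-1 scan leaves 61 scans of label 1 and 78 of label 2, so 
the balanced training set has 61 of each, matching the first split shown 
above. Note that this balances only the training data; the held-out test 
samples still follow the original 62/78 label split across folds.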


_______________________________________________
Pkg-ExpPsy-PyMVPA mailing list
[email protected]
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-exppsy-pymvpa
