On Tue, 11 Feb 2003, Rolf Turner wrote:

> I've been groping my way through a classification/discrimination
> problem, from a consulting client.  There are 26 observations, with 4
> possible categories and 24 (!!!) potential predictor variables.
>
> I tried using lda() on the first 7 predictor variables and got 24 of
> the 26 observations correctly classified.  (Training and testing both
> on the complete data set --- just to get started.)
>
> I then tried rpart() for comparison and was somewhat surprised when
> rpart() only managed to classify 14 of the 26 observations correctly.
> (I got the same classification using just the first 7 predictors as I
> did using all of the predictors.)
>
> I would have thought that rpart(), being unconstrained by a parametric
> model, would have a tendency to over-fit and therefore to appear to
> do better than lda() when the test data and training data are the
> same.
>
> Am I being silly, or is there something weird going on?  I can
> give more detail on what I actually did, if anyone is interested.
The first.  rpart is seriously constrained by having so few
observations, and its model is much more restricted than lda's: it can
make axis-parallel splits only.  There is a similar example, with
pictures, in MASS (on the Cushings data).

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
[EMAIL PROTECTED] mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help
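A minimal sketch of the point about axis-parallel splits, using simulated data rather than the original poster's (unavailable) data set: two classes separated by an oblique boundary.  lda() can fit the diagonal boundary directly, while rpart() must approximate it with axis-parallel cuts, and with only 26 observations (rpart's default minsplit is 20, allowing very few splits) its apparent error rate can easily be worse than lda's even when training and test data coincide.  The seed and the two-predictor setup are illustrative assumptions, not a reconstruction of the original problem.

```r
library(MASS)    # lda()
library(rpart)   # rpart()

set.seed(1)
n  <- 26                                  # same sample size as the original problem
x1 <- runif(n)
x2 <- runif(n)
cl <- factor(ifelse(x1 > x2, "A", "B"))   # classes split by a diagonal boundary
d  <- data.frame(cl, x1, x2)

fit.lda   <- lda(cl ~ x1 + x2, data = d)
fit.rpart <- rpart(cl ~ x1 + x2, data = d, method = "class")

## apparent (resubstitution) accuracy on the training data
acc.lda   <- mean(predict(fit.lda)$class == d$cl)
acc.rpart <- mean(predict(fit.rpart, type = "class") == d$cl)
acc.lda
acc.rpart
```

With a boundary like this, lda's resubstitution accuracy is typically near 1, while the tree, restricted to cuts of the form x1 < c or x2 < c and starved of observations, lags behind.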