Hi all. I would like to understand what units are used on the y-axis when plotting the one-dimensional predictions (histograms) from lda() (MASS) discriminant function objects.
While the help file suggests that a histogram is returned by default, the presumably proportion-like values for each group seem to add up to more than 1. I'm not sure how to interpret the code in ldahist(), which, I believe, defines the height of each bin as

    est1/(diff(breaks) * length(data[g == grp]))

where est1 is (as far as I can tell) the frequency within the bin, and the denominator is apparently the bin width multiplied by the total sample size for that panel.

It seems to me that a far more intuitive result would be returned for each bin if the diff(breaks) component were removed entirely, leaving est1 divided by the group's sample size, i.e. a simple proportion. While I don't think my concern affects the shape of each group's histogram, I'd much prefer to display a more intuitive y-axis.

Example:

    library(MASS)
    ld1 <- lda(Species ~ Sepal.Length + Sepal.Width, iris)
    plot(ld1, type = "histogram", dimen = 1)
    # (eyeballing it suggests that the sum of the "frequencies" reported
    # on the y-axis for each group exceeds 1)

Thanks very much.

--Bob Farmer
Dalhousie University
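
P.S. To make my reading of the formula concrete, here is a minimal sketch that applies it by hand to the LD1 scores. It does not call ldahist() itself. The bin edges (width 0.5) are my own assumption for illustration and are not necessarily the ones the plot uses, and the names scores, breaks and check are just mine.

```r
library(MASS)

ld1 <- lda(Species ~ Sepal.Length + Sepal.Width, iris)
scores <- predict(ld1)$x[, 1]   # LD1 score for each observation

# Assumed bin edges (width 0.5), chosen only for this illustration;
# ldahist() picks its own breaks.
breaks <- seq(floor(min(scores)), ceiling(max(scores)), by = 0.5)

check <- sapply(levels(iris$Species), function(grp) {
  dat    <- scores[iris$Species == grp]
  counts <- hist(dat, breaks = breaks, plot = FALSE)$counts
  height <- counts / (diff(breaks) * length(dat))  # formula as I read it
  prop   <- counts / length(dat)                   # same, without diff(breaks)
  c(sum_height = sum(height),                      # sum of bar heights
    sum_area   = sum(height * diff(breaks)),       # sum of bar areas
    sum_prop   = sum(prop))                        # sum of plain proportions
})

print(round(check, 3))
```

If I have the formula right, the bar heights by themselves need not sum to 1 (with these bins of width 0.5 they sum to 2 for each group), whereas height times bin width does, as do the plain proportions. That would be consistent with the y-axis being on a density scale rather than a proportion scale, but I would appreciate confirmation.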