On 2/12/09 19:55 PM, "Gian Maria Niccolò Benucci" <gian.benu...@gmail.com> wrote:

> ... I supposed, that If we use as many dimensions as there are variables,
> then we can perfectly reproduce the observed distance matrix. Isn't it?

Gian,

Not quite so. I think it would be useful to consult a good book, but
here is some explanation.

NMDS is not a simple "reproduction" method; it is a non-linear
regression problem. For n points and k dimensions we fit a non-linear
regression with n*k parameters to n*(n-1)/2 observations. It doesn't
require much intuition to see that this is not well defined when k
approaches n, and then the non-linear regression fails. In detail, the
non-linear regression corresponds to isoreg() in R, and the model
fitting corresponds to optim() with method = "BFGS" (Broyden, Fletcher,
Goldfarb & Shanno). None of this is very obvious, because it all
happens within a C function in the MASS package. NMDS is non-linear
precisely so that it can produce a good mapping with low values of k:
so stick with low values of k.

If you want a complete mapping of the dissimilarities, you should use
metric scaling. Then you typically ignore the later axes. However,
even here the situation is not as clear as you write. If you use
Euclidean distances, then the number of variables gives the number of
dimensions of metric scaling. With Euclidean distances, the complete
solution also exactly reproduces the observed distances.

With non-Euclidean dissimilarities (like Bray-Curtis in your case) the
situation is more complicated. Metric scaling and complete mapping are
Euclidean, and if your dissimilarities are non-Euclidean, you have a
problem (one that is usually ignored). Firstly, the number of positive
eigenvalues and the corresponding real eigenvectors is not directly
defined by the number of variables. Secondly, you cannot reproduce the
observed dissimilarities from the real eigenvectors alone, because
that reproduction is Euclidean and your measure was non-Euclidean. For
exact reproduction, you should subtract the distances in imaginary
space (negative eigenvalues) from the distances in real space (positive
eigenvalues).

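To make the metric scaling argument concrete, here is a minimal sketch
in base R. The toy data and the helper names bray(), pcoa() and
rebuild() are made up for this illustration; they are not the vegan or
MASS code. Distances are combined on the squared scale: squared
distances along the negative-eigenvalue axes are subtracted from
squared distances along the positive-eigenvalue axes.

```r
# Toy community data: 5 sites x 2 species (invented counts, illustration only)
comm <- matrix(c(1, 0,
                 0, 1,
                 1, 1,
                 3, 1,
                 0, 4), ncol = 2, byrow = TRUE)

# Bray-Curtis dissimilarity written out by hand (hypothetical helper)
bray <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n))
    d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
  d
}

# Metric scaling: eigendecomposition of the double-centred matrix,
# dropping axes whose eigenvalues are numerically zero
pcoa <- function(d) {
  n <- nrow(d)
  J <- diag(n) - matrix(1 / n, n, n)
  e <- eigen(-0.5 * J %*% d^2 %*% J, symmetric = TRUE)
  keep <- abs(e$values) > 1e-8 * max(abs(e$values))
  vals <- e$values[keep]
  pts <- sweep(e$vectors[, keep, drop = FALSE], 2, sqrt(abs(vals)), "*")
  list(values = vals, points = pts)
}

# Rebuild distances from the real axes alone, and from real minus imaginary
rebuild <- function(p) {
  sq <- function(cols) as.matrix(dist(p$points[, cols, drop = FALSE]))^2
  pos <- p$values > 0
  real <- sq(pos)
  imag <- if (any(!pos)) sq(!pos) else 0
  list(real_only = sqrt(real), full = sqrt(pmax(real - imag, 0)))
}

euc <- pcoa(as.matrix(dist(comm)))
bc  <- pcoa(bray(comm))
round(euc$values, 3)  # two positive eigenvalues: as many as there are variables
round(bc$values, 3)   # Bray-Curtis: at least one eigenvalue is negative

fit <- rebuild(bc)
max(abs(fit$full - bray(comm)))  # essentially zero: exact reproduction
max(fit$real_only - bray(comm))  # positive: real axes alone overestimate
```

The first three sites were chosen so that Bray-Curtis violates the
triangle inequality (1 between sites 1 and 2, but 1/3 from each of them
to site 3), which guarantees a negative eigenvalue.
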
We do exactly this subtraction in the betadisper() function in vegan,
and for this reason the wcmdscale() function of vegan also returns
information on complex eigenvectors and negative eigenvalues.

For your other post that came while I was writing this: a stress of
11.6% is really fine. I think that if you get stress down to 5% (0.05)
or less, then there is something fishy in your data or in your model
specification, like overfitting.

Cheers, Jari Oksanen

> But,
> of course, our goal is to reduce the observed complexity of nature, that is,
> to explain the distance matrix in terms of fewer underlying dimensions...
> So what is best at the end??
> And also wich is the function for plotting the stress values versus the
> number of dimnsions and how to read the plot?
> I hope I was clear, thank you so much!
> Yours,
>
> G.

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology