Dear Alexandre,

To build a predictive model of functional groups of species based on a set of traits, you can simply apply a classification tree to the clusters obtained from the same species x traits table. Here is an example with trait data from New Zealand vascular plant species, using the mvpart() function in the mvpart package:

library(FD)
library(mvpart)

# Gower dissimilarity matrix for mixed trait variables
gd <- gowdis(tussock$trait)

# Ward hierarchical clustering
gc <- hclust(gd, "ward.D2")
plot(gc, hang = -1)
rect.hclust(gc, 6)

# 6 clusters or plant functional types
fg <- cutree(gc, 6)

# Classification tree
tra.ct <- mvpart(as.factor(fg) ~ ., tussock$trait)

You get a decision tree with threshold values for the discriminating qualitative or quantitative traits.

Unfortunately, and for obscure reasons, mvpart has not been available from CRAN for years. However, you can install this great package from the archived copy on GitHub:

devtools::install_github("cran/mvpart", force = TRUE)

You can also use the more limited rpart::rpart function instead.

Best,
François

----- Original message -----
From: "Alexandre F. Souza" <alexsouza.cb.ufrn...@gmail.com>
To: "r-sig-ecology" <r-sig-ecology@r-project.org>
Sent: Monday, 1 November 2021 20:37:20
Subject: [R-sig-eco] On The Choice of a Classification Approach

Hello,

I am trying to find a method to cluster species based on their quantitative traits and at the same time obtain a threshold value for each node in the decision tree. My difficulty is that my dependent variable is the list of species names, each species appearing as a single line with no repetition. All explanatory variables are quantitative.

As far as I understand, classification trees need a dependent variable with repeated levels, as in the iris dataset, in which each species appears several times. All the examples employing classification trees I found use a dependent variable, but I do not have one except for the species names. MRT uses a species by location matrix as the dependent variable, and traditional hierarchical cluster analysis does cluster species but does not use quantitative data for that purpose, nor does it produce threshold values. I can run a non-hierarchical cluster analysis such as kmeans, but this does not generate threshold values.

My concern is that without threshold values any classification I produce will be restricted to the studied species and will not be applicable to other species found in the studied region, which would be a strong limitation on the use of such a classification.

Thank you very much in advance for any ideas.

Regards,
Alexandre

--
Dr. Alexandre F. Souza
Professor Associado
Chefe do Departamento de Ecologia
Universidade Federal do Rio Grande do Norte
CB, Departamento de Ecologia
Campus Universitário - Lagoa Nova
59072-970 - Natal, RN - Brasil
lattes: lattes.cnpq.br/7844758818522706
http://www.esferacientifica.com.br
https://www.youtube.com/user/alexfadigas
http://www.docente.ufrn.br/alexsouza
orcid.org/0000-0001-7468-3631

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
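
The rpart alternative mentioned in the reply, and the concern about classifying species outside the studied set, could be sketched roughly as follows. This is only an illustration under stated assumptions: the trait table is simulated (the names height, sla and seed_mass, the three groups, and the new species are all hypothetical), and cluster::daisy() is used in place of FD::gowdis() so that only packages shipped with R are needed.

```r
library(cluster)  # daisy() for Gower dissimilarity (recommended package)
library(rpart)    # rpart() classification tree (recommended package)

set.seed(42)

# Hypothetical trait table: 60 species in rows, 3 quantitative traits.
# One row per species, no repeated species names.
traits <- data.frame(
  height    = c(rnorm(20, 0.3, 0.05), rnorm(20, 1.5, 0.2), rnorm(20, 8, 1)),
  sla       = c(rnorm(20, 30, 2),     rnorm(20, 18, 2),    rnorm(20, 8, 1)),
  seed_mass = c(rnorm(20, 1, 0.2),    rnorm(20, 5, 1),     rnorm(20, 50, 8))
)
rownames(traits) <- paste0("sp", seq_len(nrow(traits)))

# Step 1: cluster the species on their traits (Gower + Ward, as in the reply)
gd <- daisy(traits, metric = "gower")
gc <- hclust(as.dist(gd), method = "ward.D2")
fg <- factor(cutree(gc, k = 3))  # 3 groups is an arbitrary choice here

# Step 2: use the cluster membership as the dependent variable of the tree
dat <- cbind(group = fg, traits)
ct <- rpart(group ~ ., data = dat, method = "class")
print(ct)  # the printed splits give the trait threshold values

# Step 3: assign a species that was not in the original table
new_sp <- data.frame(height = 1.4, sla = 17, seed_mass = 4.5)
pred <- predict(ct, newdata = new_sp, type = "class")
pred
```

The cluster labels stand in for the missing dependent variable, so each species still appears only once in the table, and predict() applies the fitted thresholds to species that were not part of the clustering.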