Dear R-sig-phylo contributors, Thanks a lot for sharing your illuminating thoughts on the list.
I agree with points raised by others, and I'd like to add one suggestion. As already mentioned by several, alpha and sigma-squared can be hard to estimate with certainty. However, in some cases, the stationary variance (sigma-squared/2 alpha) converges well while sigma-squared and alpha separately do not. What such case means is that the expected amount of variance at an OU equilibria (= stationary variances) can be estimated reliably, but how long adaptation takes to get there (= alpha, or half life, log(2)/alpha) is highly uncertain. I thus would suggest Rafael to compute stationary variances from estimated sigma-squared and alpha, and see if you can make a better sense. Good luck. Thanks again for putting an interesting question on the table and for sharing thoughts. Best, Masahito > On 5 Apr 2018, at 00:41, Marguerite Butler <mbutler...@gmail.com> wrote: > > Aloha Rafael, > > I am very glad that you are thinking about power of your analyses and > confidence in your parameter estimates. Folks have chimed in with many > useful responses, however, I believe your original question was whether > uncertainty in parameter estimates indicate bad models and limits of > interpretation? > > It’s important to note that model fit and parameter estimates are two > different aspects of assessment. It is perfectly possible to have robust > model selection (decent power) but have parameters that have a lot of > uncertainty and are hard to identify with any precision (large confidence > intervals). That was pretty well demonstrated in Cressler, Butler, King > 2015. The specific outcome is data and model dependent, but as a > generalization, model selection is more robust than parameter estimation, and > within the parameters, the position of the thetas are a little bit easier to > narrow down than the alphas and sigmas. > > This is not too surprising when you think about the information content of > the data. We are really trying to fit some relatively complex models with few > data (one point per species, with phylogenetic structure to boot), and the > intuitive interpretation is that there are many ways to the same outcome. If > the outcome is the particular pattern of phenotypes along the tree, it might > be possible to get that outcome with strong selection and little > stochasticity, or weak selection toward optima very far apart (and the > observed phenotypes are essentially “on the way” toward the optima), or very > weak selection and strong stochasticity (they were just knocked about that > way by drift), etc. While this last scenario is most different than the > first two (the first two are more OU-type, whereas the last is more BM-like > in being dominated by stochasticity), my point is that there may not > necessarily be a unique set of parameters that explain the data. > > Nevertheless, it may still be possible to say something biologically useful. > First of all, I noticed that you are doing SIMMAP which is great for > exploring phylogenetic uncertainty, but this is a bit arms-length with regard > to OU parameter and model uncertainty. A bit apples and oranges. I would > strongly recommend that you try some parametric bootstrapping on your OU > models for both model selection and parameter estimation. In OUCH, there are > built-in facilities to make this really easy, probably there are in OUwie as > well (sorry I only stick with OUCH :). The key for the model selection > bootstrap is that you can simulate the data using the best-fit model, but > then fit all of the alternative models to see how often your best model > performs. I like to point folks to my former grad student’s paper (Scales et > al 2009). If you want to be even more stastically thorough you can do the > Botteinger, Coop, and Ralph 2012 approach which basically does parametric > simulations on both the best and alternative models. The parametric > bootstraps on the parameter estimates is much easier. and you can just run > the simulations and fit only the best model and draw your CI’s from that. > Maybe you’ve done this already. > > With regard to what can be said, as Liam and Brian and others have said, many > times there are large CIs because we encounter flat or ridged likelihood > surfaces. So that may limit your conclusions somewhat to saying something > like we don’t know what alpha is, precisely, but we know it’s greater than > zero. Etc. In your specific case, what you want to know is whether theta 1 > > theta 2, or something like that, right? So without needing to know the > precise location of the theta’s I would bet that you can look through your > parametric simulations and see how often that condition is satisfied. You > may find that you don’t know what they are exactly, but one is always greater > than the other. > > I love your study question by the way. I too study colorful organisms in > bright and dark habitats. Best of luck! > > Hth, > Marguerite > > Cressler C., Butler M.A., and King A. A. (2015) Detecting adaptive evolution > in phylogenetic comparative analysis using the Ornstein-Uhlenbeck model. > Sys. Bio. 64(6):953-968. DOI: 10.1093/sysbio/syv043 > > Scales J.A., King A.A., and Butler M.A. (2009) Running for your life or > running for your dinner: What drives fiber-type evolution in lizard locomotor > muscles? Am. Nat.173:543-553. DOI: 10.1086/597613 > > Boettinger C., Coop G., and P. Ralph (2012) Is your phylogeny informative? > Measuring the power of comparative methods. Evolution 66-7: 2240-51. DOI: > 10.1111/j.1558-5646.2011.01574.x > >> On Apr 4, 2018, at 10:27 AM, Brian O'Meara <omeara.br...@gmail.com> wrote: >> >> Apparently the plot did not come through correctly. I'm attaching a PDF of >> it. >> >> Best, >> Brian >> >> _______________________________________________________________________ >> Brian O'Meara, http://www.brianomeara.info <http://www.brianomeara.info/>, >> especially Calendar <http://brianomeara.info/calendars/omeara/>, CV >> <http://brianomeara.info/cv/>, and Feedback >> <http://brianomeara.info/teaching/feedback/> >> >> Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville >> Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville >> Associate Director for Postdoctoral Activities, National Institute for >> Mathematical & Biological Synthesis <http://www.nimbios.org/> (NIMBioS) >> >> >> On Wed, Apr 4, 2018 at 4:24 PM, Brian O'Meara <omeara.br...@gmail.com >> <mailto:omeara.br...@gmail.com>> wrote: >> Hi, Rafael et al. >> >> ?OUwie has some information on the standard errors: >> >> The Hessian matrix is used as a means to estimate the approximate standard >> errors of the model parameters and to assess whether they are the maximum >> likelihood estimates. The variance-covariance matrix of the estimated values >> of alpha and sigma^2 are computed as the inverse of the Hessian matrix and >> the standard errors are the square roots of the diagonals of this matrix. >> The Hessian is a matrix of second-order derivatives and is approximated in >> the R package numDeriv. So, if changes in the value of a parameter results >> in sharp changes in the slope around the maximum of the log-likelihood >> function, the second-order derivative will be large, the standard error will >> be small, and the parameter estimate is considered stable. On the other >> hand, if the second-order derivative is nearly zero, then the change in the >> slope around the maximum is also nearly zero, indicating that the parameter >> value can be moved in any direction without greatly affecting the >> log-likelihood. In such situations, the standard error of the parameter will >> be large. >> >> For models that allow alpha and sigma^2 to vary (i.e., OUMV, OUMA, and >> OUMVA), the complexity of the model can often times be greater than the >> information that is contained within the data. As a result one or many >> parameters are poorly estimated, which can cause the function to return a >> log-likelihood that is suboptimal. This has great potential for poor model >> choice and incorrect biological interpretations. An eigendecomposition of >> the Hessian can provide an indication of whether the search returned the >> maximum likelihood estimates. If all the eigenvalues of the Hessian are >> positive, then the Hessian is positive definite, and all parameter estimates >> are considered reliable. However, if there are both positive and negative >> eigenvalues, then the objective function is at a saddlepoint and one or >> several parameters cannot be estimated adequately. One solution is to just >> fit a simpler model. Another is to actually identify the offending >> parameters. This can be done through the examination of the eigenvectors. >> The row order corresponds to the entries in index.matrix, the columns >> correspond to the order of values in eigval, and the larger the value of the >> row entry the greater the association between the corresponding parameter >> and the eigenvalue. Thus, the largest values in the columns associated with >> negative eigenvalues are the parameters that are causing the objective >> function to be at a saddlepoint. >> >> You can get better (by which I mean more accurate) estimates by parametric >> bootstrapping (simulation). >> >> Doing a quick summary of your OUM results (code at end): >> >> Trait: contr >> nonfor: 0.12 (-0.12, 0.36) >> inter: 0.12 (-0.12, 0.36) >> fores: 0.13 (-0.13, 0.39) >> Phylogenetic half life: 0.06 >> Tree height: 28.34 >> >> Trait: cshd >> nonfor: 0.24 (-0.23, 0.72) >> inter: 0.33 (-0.32, 0.98) >> fores: 0.2 (-0.19, 0.6) >> Phylogenetic half life: 6.35 >> Tree height: 28.34 >> >> Trait: dors >> nonfor: 0.04 (-0.04, 0.12) >> inter: 0.15 (-0.14, 0.44) >> fores: 0 (0, 0.01) >> Phylogenetic half life: 3.68 >> Tree height: 28.34 >> >> Trait: vent >> nonfor: 0.05 (-0.05, 0.14) >> inter: 0.32 (-0.3, 0.93) >> fores: -0.03 (0.03, -0.08) >> Phylogenetic half life: 5.43 >> Tree height: 28.34 >> >> It does seem that you don't have a lot of ability to tell optima apart. The >> alpha value is pretty low, though not tremendously (the half life is a good >> way to look at that). But the raw data also suggest it's not three very >> clear clumps (beeswarm plot, regimes on the x axis, trait 1 value on the y). >> It's not phylogenetically corrected, but still a good gut check: >> >> >> >> I agree that for your data, it may be that simpler models work better. It >> could also explain why you're getting crazy theta values for models with V >> or A allowed to vary: without strong attraction, it's hard to figure out >> what the attraction is towards, and especially for the third regime, with >> only 11 data points, there probably isn't a lot of info to estimate many >> parameters for just that regime. >> >> Best, >> Brian >> >> library(OUwie) >> library(ape) >> library(beeswarm) >> >> data <- read.csv("~/Downloads/OUM_PARAMS_ant_f.csv", row.names=1) >> load("~/Downloads/examplemodel.Rdata") >> >> traits <- colnames(data) >> >> regimes <- c("nonfor", "inter", "fores") >> >> observations <- list() >> >> for (trait.index in sequence(length(traits))) { >> cat(paste("\n\nTrait:", traits[trait.index])) >> for (regime.index in sequence(length(regimes))) { >> point.estimate <- data[paste0("theta.",regimes[regime.index]), >> traits[trait.index]] >> se <- data[paste0("theta.",regimes[regime.index]), >> traits[trait.index]] >> result.vector <- round(c(point.estimate,point.estimate-1.96*se, >> point.estimate+1.96*se),2) >> cat(paste0("\n",regimes[regime.index], ": ", result.vector[1], " (", >> result.vector[2], ", ", result.vector[3], ")")) >> if (trait.index==1) { >> observations[[regime.index]] <- >> bla$data[which(bla$data[,1]==regime.index),2] >> } >> } >> cat(paste0("\nPhylogenetic half life: ", round(log(2)/data["alpha", >> traits[trait.index]], 2))) >> cat(paste0("\nTree height: ", >> round(max(ape::branching.times(bla$phy)),2))) >> } >> >> points.to.plot <- bla$data[,1:2] >> colnames(points.to.plot) <- c("regime", "trait1") >> beeswarm(trait1~regime, data=points.to.plot, col="black", pch=20) >> >> >> >> >> _______________________________________________________________________ >> Brian O'Meara, http://www.brianomeara.info <http://www.brianomeara.info/>, >> especially Calendar <http://brianomeara.info/calendars/omeara/>, CV >> <http://brianomeara.info/cv/>, and Feedback >> <http://brianomeara.info/teaching/feedback/> >> >> Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville >> Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville >> Associate Director for Postdoctoral Activities, National Institute for >> Mathematical & Biological Synthesis <http://www.nimbios.org/> (NIMBioS) >> >> >> On Wed, Apr 4, 2018 at 4:07 PM, Jacob Berv <jakeberv.r.sig.ph...@gmail.com >> <mailto:jakeberv.r.sig.ph...@gmail.com>> wrote: >> Even if those models may fit the data better, they may not necessarily help >> Rafael determine whether or not the parameter estimates of interest are >> different across regimes (though perhaps BMS might be informative). >> >> Rafael, maybe you could try fixing the ancestral regimes to match most >> likely states for each node from your SIMMAP summary? I wonder if that might >> decrease your ‘uncertainty’ in parameter estimates. >> >> I don’t have a great answer for your main question though, which is a good >> one. >> >> Jake >> >>> On Apr 4, 2018, at 8:59 PM, William Gearty <wgea...@stanford.edu >>> <mailto:wgea...@stanford.edu>> wrote: >>> >>> Hi Rafael, >>> >>> Have you tried running the BM1, BMS, or OU1 models? If so, how do the AICc >>> values for those compare to those of the OUM model? If not, you should make >>> sure you run those. >>> If you find that the these models fit your data better, this could indicate >>> that there isn't a large different across regimes and an OUM model isn't >>> necessarily suitable. >>> >>> Best, >>> Will >>> >>> On Wed, Apr 4, 2018 at 12:30 PM, Rafael S Marcondes >>> <raf.marcon...@gmail.com <mailto:raf.marcon...@gmail.com> >>>> wrote: >>> >>>> Dear all, >>>> >>>> I'm writing (again!) to ask for help interpreting standard errors of >>>> parameter estimates in OUwie models. >>>> >>>> I'm using OUwie to examine how the evolution of bird plumage color varies >>>> across habitat types (my selective regimes) in a tree of 229 tips. I was >>>> hoping to be able to make inferences based on OUMV and OUMVA models, but I >>>> was getting nonsensical theta estimates from those. So I've basically given >>>> up on them for now. >>>> >>>> But even looking at theta estimates from OUM models, I'm getting really >>>> large standard errors, often overlapping the estimates from other selective >>>> regimes. So I was wondering what that means exactly. How are these erros >>>> calculated? How much do high errors it limit the biological inferences I >>>> can make? I'm more interested in the relative thetas across regimes than on >>>> the exact values (testing the prediction that birds in darker habitats tend >>>> to adapt to darker plumages than birds in more illuminated habitats). >>>> >>>> I have attached a table averaging parameter estimates and errors from >>>> models fitted across a posterior distribution of 100 simmaps for four >>>> traits; and one exemplar fitted model from one trait in one of those >>>> simmaps. >>>> >>>> Thanks a lot for any help, >>>> >>>> >>>> *--* >>>> *Rafael Sobral Marcondes* >>>> PhD Candidate (Systematics, Ecology and Evolution/Ornithology) >>>> >>>> Museum of Natural Science <http://sites01.lsu.edu/wp/mns/ >>>> <http://sites01.lsu.edu/wp/mns/>> >>>> Louisiana State University >>>> 119 Foster Hall >>>> Baton Rouge, LA 70803, USA >>>> >>>> Twitter: @brown_birds <https://twitter.com/brown_birds >>>> <https://twitter.com/brown_birds>> >>>> >>>> >>>> _______________________________________________ >>>> R-sig-phylo mailing list - R-sig-phylo@r-project.org >>>> <mailto:R-sig-phylo@r-project.org> >>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >>>> <https://stat.ethz.ch/mailman/listinfo/r-sig-phylo> >>>> Searchable archive at http://www.mail-archive.com/r- >>>> <http://www.mail-archive.com/r-> >>>> sig-ph...@r-project.org/ <http://sig-ph...@r-project.org/> >>>> >>>> >>> >>> >>> -- >>> William Gearty >>> PhD Candidate, Paleobiology >>> Department of Geological Sciences >>> Stanford School of Earth, Energy & Environmental Sciences >>> williamgearty.com <http://williamgearty.com/> >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> R-sig-phylo mailing list - R-sig-phylo@r-project.org >>> <mailto:R-sig-phylo@r-project.org> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >>> <https://stat.ethz.ch/mailman/listinfo/r-sig-phylo> >>> Searchable archive at >>> http://www.mail-archive.com/r-sig-phylo@r-project.org/ >>> <http://www.mail-archive.com/r-sig-phylo@r-project.org/> >> >> _______________________________________________ >> R-sig-phylo mailing list - R-sig-phylo@r-project.org >> <mailto:R-sig-phylo@r-project.org> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >> <https://stat.ethz.ch/mailman/listinfo/r-sig-phylo> >> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ >> <http://www.mail-archive.com/r-sig-phylo@r-project.org/> >> >> >> <swarm.pdf>_______________________________________________ >> R-sig-phylo mailing list - R-sig-phylo@r-project.org >> <mailto:R-sig-phylo@r-project.org> >> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo >> <https://stat.ethz.ch/mailman/listinfo/r-sig-phylo> >> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ >> <http://www.mail-archive.com/r-sig-phylo@r-project.org/> > ____________________________________________ > Marguerite A. Butler > Associate Professor > > Department of Biology > 2538 McCarthy Mall, Edmondson Hall 216 > Honolulu, HI 96822 > > Office: 808-956-4713 > Dept: 808-956-8617 > Lab: 808-956-5867 > FAX: 808-956-4745 > http://butlerlab.org > http://manoa.hawaii.edu/biology/people/marguerite-butler > http://www2.hawaii.edu/~mbutler > > > > > > > > > > > > > > > > [[alternative HTML version deleted]] > > _______________________________________________ > R-sig-phylo mailing list - R-sig-phylo@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ _______________________________________________ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/