Re: [R-sig-phylo] understanding variance-covariance matrix

Brian O'Meara Sat, 25 Aug 2018 11:35:09 -0700

Hi, Agus. The variance-covariance matrix comes from the tree and the
evolutionary model, not the data. Each entry between taxa A and B in the
VCV is how much covariance I should expect between data for taxa A and B
simulated up that tree using that model. I don't want to be *that guy*, but
O'Meara et al. (2006)
https://onlinelibrary.wiley.com/doi/10.1111/j.0014-3820.2006.tb01171.x has
a fairly accessible explanation of this (largely b/c I was just learning
about VCVs when working on that paper). Hansen and Martins (1996)
https://onlinelibrary.wiley.com/doi/10.1111/j.1558-5646.1996.tb03914.x have
a much more detailed description of how you get these covariance matrices
from microevolutionary processes.
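
To make the "tree and model, not data" point concrete, here is a minimal sketch. It assumes Brownian motion with unit rate and uses a three-taxon tree invented purely for illustration; ape's vcv() turns it into a matrix without any trait data being supplied:

```r
library(ape)

# Hypothetical tree: A and B share 2 units of history; C splits off at the root.
tree <- read.tree(text = "((A:1,B:1):2,C:3);")

# Expected variances and covariances under Brownian motion (rate = 1).
# No trait data are involved at any point.
C <- vcv(tree)
print(C)

# Off-diagonal entries are the shared branch length from the root;
# diagonal entries are each tip's total distance from the root.
C["A", "B"]  # history shared by A and B
C["A", "A"]  # root-to-tip distance for A
C["A", "C"]  # A and C share no history after the root
```

Changing the branch lengths (or the model) changes the matrix; changing the trait values does not.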


Typically, ape::vcv() is how you get a variance-covariance matrix for a
phylogeny, assuming Brownian motion and no measurement error. It basically
takes the history two taxa share (the branch length from the root to their
most recent common ancestor) to create the covariance (or variance, if the
two taxa are the same taxon). A different approach, which seems to be what
you're doing, would be to simulate up a tree many times, and then for each
pair of taxa (including the pair of a taxon with itself, the diagonal of
the VCV), calculate the covariance across simulations. These approaches
should give the same results, up to simulation noise, though the shared
history on the tree approach is faster.
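
A rough sketch of that comparison, in case it helps for teaching. The tree size, seed, and number of simulations are arbitrary choices of mine, and I am using ape::rTraitCont() with sigma = 1 as the Brownian motion simulator (phytools::fastBM() would be another option):

```r
library(ape)

set.seed(1)
tree <- rtree(5)   # random 5-taxon tree with branch lengths
C <- vcv(tree)     # shared-history VCV under Brownian motion, rate 1

# Simulate one trait up the tree many times.
# Rows are taxa, columns are independent simulations.
nsim <- 20000
sims <- replicate(nsim, rTraitCont(tree, model = "BM", sigma = 1))

# For each pair of taxa, take the covariance across simulations.
C_sim <- cov(t(sims))[tree$tip.label, tree$tip.label]

print(round(C, 2))
print(round(C_sim, 2))

# Largest disagreement between the two matrices; shrinks as nsim grows.
max(abs(C_sim - C))
```

The two matrices should agree closely but not exactly, since the simulated one carries sampling error.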

Best,
Brian


_______________________________________________________________________
Brian O'Meara, http://www.brianomeara.info, especially Calendar
<http://brianomeara.info/calendars/omeara/>, CV
<http://brianomeara.info/cv/>, and Feedback
<http://brianomeara.info/teaching/feedback/>

Associate Professor, Dept. of Ecology & Evolutionary Biology, UT Knoxville
Associate Head, Dept. of Ecology & Evolutionary Biology, UT Knoxville



On Sat, Aug 25, 2018 at 1:16 PM Agus Camacho <agus.cama...@gmail.com> wrote:

> Dear list users,
>
> I am trying to make an easy R demonstration to teach the
> variance-covariance matrix to students. However, after consulting the
> internet and books, I found myself facing three difficulties in
> understanding the math and code behind this important matrix. As this list
> is answered by several authors of books on phylogenetic comparative
> methods, I thought this might make a useful general discussion.
>
> Here we go,
>
> 1) I don't know how to generate a phyloVCV matrix in R. Liam kindly
> described some options here
> <http://blog.phytools.org/2013/12/three-different-ways-to-calculate-among.html>
> but I cannot tell for sure what X is made of. It would seem to be a data
> frame of some variables measured across species. But then, I get errors
> when I write:
>
>     tree <- pbtree(n = 10, scale = 1)
>     tree$tip.label <- sprintf("sp%s",seq(1:n))
>     x <- fastBM(tree)
>     y <- fastBM(tree)
>     X=data.frame(x,y)
>     rownames(X)=tree$tip.label
>     ## Revell (2009)
>     A<-matrix(1,nrow(X),1)%*%apply(X,2,fastAnc,tree=tree)[1,]
>     V1<-t(X-A)%*%solve(vcv(tree))%*%(X-A)/(nrow(X)-1)
>     ## Butler et al. (2000)
>     Z<-solve(t(chol(vcv(tree))))%*%(X-A)
>     V2<-t(Z)%*%Z/(nrow(X)-1)
>
>     ## pics
>     Y<-apply(X,2,pic,phy=tree)
>     V3<-t(Y)%*%Y/nrow(Y)
>
> 2) The phyloVCV matrix has n x n coordinates defined by the n species, and
> it represents covariances among observations made across the n species,
> right? Still, I do not know whether these covariances are calculated over
> a) X vs Y values for each pair of species coordinates in the matrix, across
> the n species, or b) directly over the vector of n residuals of Y, after
> correlating Y vs X, across all pairs of species coordinates. I think it may
> be a) because, by definition, variance cannot be calculated for a single
> value. I am not sure though, since it seems the whole point of PGLS is to
> control phylosignal within the residuals of a regression procedure, prior
> to actually making it.
>
> 3) If I create two perfectly correlated variables with independent
> observations and calculate a covariance or correlation matrix for them, I
> do not get a diagonal matrix, with zeros on the off-diagonals (example here
> <https://www.dropbox.com/s/y8g3tkzk509pz58/vcvexamplewithrandomvariables.xlsx?dl=0>).
> Why, then, expect a diagonal matrix in the case of independence among the
> observations?
>
> Thanks in advance and sorry if I missed anything obvious here!
> Agus
> Dr. Agustín Camacho Guerrero. Universidade de São Paulo.
> http://www.agustincamacho.com
> Laboratório de Comportamento e Fisiologia Evolutiva, Departamento de
> Fisiologia,
> Instituto de Biociências, USP.Rua do Matão, trav. 14, nº 321, Cidade
> Universitária,
> São Paulo - SP, CEP: 05508-090, Brasil.
>
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at
> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>


