Two more comments:
1/ The first insertion in the code of ace() below is not needed if the
user replaces "?" (or any coding for unknown/uncertain state) with NA --
which seems reasonable to do.
2/ This approach of taking uncertainty into account in likelihood
ancestral state reconstruction is also used for DNA sequences and
implemented in phangorn where uncertainty is (implicitly) taken from the
IUPAC ambiguity code. Because there are here four states, there is more
than one possibility of ambiguity, but the same logic applies: if the
tip "may be" in state x, then set column x to one in the matrix of
likelihoods:
1 0 0 0 # if the base observed is A
0 0 1 0 # if the base observed is G
1 0 1 0 # if the base observed is R
...
1 1 1 1 # if the base observed is N
This can be applied to any character with more than two states.
Best,
Emmanuel
Le 22/06/2018 à 11:41, Emmanuel Paradis a écrit :
Hi Théo,
It is possible to modify the code of ace() to take uncertain character
states into account. If you edit the code with fix(ace), after these
four lines:
if (method != "ML")
stop("only ML estimation is possible for discrete characters.")
if (any(phy$edge.length <= 0))
stop("some branches have length zero or negative")
you insert:
if (any(x == "?")) x[x == "?"] <- NA
Then around 30 lines below, after this line:
liks[cbind(TIPS, x)] <- 1
you insert:
if (anyNA(x)) liks[which(is.na(x)), ] <- 1
Save and close. I tested this with a simple example:
tr <- compute.brlen(stree(8, "b"), 1)
x <- rep(0:1, each = 4)
plot(tr)
tiplabels(x)
So there are two clades in state 0 and state 1, respectively. With no
uncertainty, the node likelihoods are well unbalanced except at the root:
R> ace(x, tr, "d")$lik.anc
0 1
[1,] 0.500000000 0.500000000
[2,] 0.953204905 0.046795095
[3,] 0.995642789 0.004357211
[4,] 0.995642789 0.004357211
[5,] 0.046795095 0.953204905
[6,] 0.004357211 0.995642789
[7,] 0.004357211 0.995642789
Let's say the first value is unknown:
x[1] <- "?"
Now the node likelihoods are slightly "in favour" of state 1 because of
the possibility of this state within the "state 0" clade:
R> ace(x, tr, "d")$lik.anc
0 1
[1,] 0.479307333 0.520692667
[2,] 0.904647485 0.095352515
[3,] 0.943101903 0.056898097
[4,] 0.990270355 0.009729645
[5,] 0.053105869 0.946894131
[6,] 0.005881435 0.994118565
[7,] 0.005881435 0.994118565
For comparison, here's what is returned by the current version in ape:
R> ace(x, tr, "d")$lik.anc
? 0 1
[1,] 0.093607676 0.404434328 0.50195800
[2,] 0.096246928 0.787370632 0.11638244
[3,] 0.201389158 0.750401358 0.04820948
[4,] 0.011648214 0.974955230 0.01339656
[5,] 0.017176009 0.050038984 0.93278501
[6,] 0.003422534 0.006275985 0.99030148
[7,] 0.003422534 0.006275985 0.99030148
You may try this modification with your data and give feed-back: if it
useful, I'll fix that in ape.
Cheers,
Emmanuel
Le 21/06/2018 à 23:32, Théo Léger a écrit :
Hello,
I am working on the phylogeny of a subfamily of moths and I use ace
from the ape-package to reconstruct the ancestral state of a bunch of
morphological characters.
I encountered a problem with the few unknown states I have on my
matrix (coded "?", either because material for examination was missing
or the state could not fit in any of the categories): they are treated
as other characters. Is there a way to ignore them? Is there a way to
estimate the state for the species with missing information? I know it
is possible to estimate the state at a tip with missing information in
functions like AncTresh (using bayesian reconstruction) from the
phytools package, but I am not sure if ML reconstruction can do it.
Thanks in advance to those who can enlighten me!
Cheers,
Théo Léger
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/r-sig-phylo@r-project.org/
_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/