Two more comments:

1/ The first insertion in the code of ace() below is not needed if the user replaces "?" (or any coding for unknown/uncertain state) with NA -- which seems reasonable to do.

2/ This approach of taking uncertainty into account in likelihood ancestral state reconstruction is also used for DNA sequences and implemented in phangorn where uncertainty is (implicitly) taken from the IUPAC ambiguity code. Because there are here four states, there is more than one possibility of ambiguity, but the same logic applies: if the tip "may be" in state x, then set column x to one in the matrix of likelihoods:

1 0 0 0 # if the base observed is A
0 0 1 0 # if the base observed is G
1 0 1 0 # if the base observed is R
...
1 1 1 1 # if the base observed is N

This can be applied to any character with more than two states.

Best,

Emmanuel

Le 22/06/2018 à 11:41, Emmanuel Paradis a écrit :
Hi Théo,

It is possible to modify the code of ace() to take uncertain character states into account. If you edit the code with fix(ace), after these four lines:

    if (method != "ML")
        stop("only ML estimation is possible for discrete characters.")
    if (any(phy$edge.length <= 0))
        stop("some branches have length zero or negative")

you insert:

     if (any(x == "?")) x[x == "?"] <- NA

Then around 30 lines below, after this line:

    liks[cbind(TIPS, x)] <- 1

you insert:

    if (anyNA(x)) liks[which(is.na(x)), ] <- 1

Save and close. I tested this with a simple example:

tr <- compute.brlen(stree(8, "b"), 1)
x <- rep(0:1, each = 4)
plot(tr)
tiplabels(x)

So there are two clades in state 0 and state 1, respectively. With no uncertainty, the node likelihoods are well unbalanced except at the root:

R> ace(x, tr, "d")$lik.anc
                0           1
[1,] 0.500000000 0.500000000
[2,] 0.953204905 0.046795095
[3,] 0.995642789 0.004357211
[4,] 0.995642789 0.004357211
[5,] 0.046795095 0.953204905
[6,] 0.004357211 0.995642789
[7,] 0.004357211 0.995642789

Let's say the first value is unknown:

x[1] <- "?"

Now the node likelihoods are slightly "in favour" of state 1 because of the possibility of this state within the "state 0" clade:

R> ace(x, tr, "d")$lik.anc
                0           1
[1,] 0.479307333 0.520692667
[2,] 0.904647485 0.095352515
[3,] 0.943101903 0.056898097
[4,] 0.990270355 0.009729645
[5,] 0.053105869 0.946894131
[6,] 0.005881435 0.994118565
[7,] 0.005881435 0.994118565

For comparison, here's what is returned by the current version in ape:

R> ace(x, tr, "d")$lik.anc
                ?           0          1
[1,] 0.093607676 0.404434328 0.50195800
[2,] 0.096246928 0.787370632 0.11638244
[3,] 0.201389158 0.750401358 0.04820948
[4,] 0.011648214 0.974955230 0.01339656
[5,] 0.017176009 0.050038984 0.93278501
[6,] 0.003422534 0.006275985 0.99030148
[7,] 0.003422534 0.006275985 0.99030148

You may try this modification with your data and give feed-back: if it useful, I'll fix that in ape.

Cheers,

Emmanuel

Le 21/06/2018 à 23:32, Théo Léger a écrit :
Hello,


I am working on the phylogeny of a subfamily of moths and I use ace from the ape-package to reconstruct the ancestral state of a bunch of morphological characters.

I encountered a problem with the few unknown states I have on my matrix (coded "?", either because material for examination was missing or the state could not fit in any of the categories): they are treated as other characters. Is there a way to ignore them? Is there a way to estimate the state for the species with missing information? I know it is possible to estimate the state at a tip with missing information in functions like AncTresh (using bayesian reconstruction) from the phytools package, but I am not sure if ML reconstruction can do it.

Thanks in advance to those who can enlighten me!


Cheers,

Théo Léger

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/






_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Reply via email to