Hi All- I have spent an inordinate and embarrassing amount of time tracking down an excruciatingly cryptic issue with read.dna, which I rarely use. Here are two key problems:
1) The function automatically assumes it is reading DNA sequences when it encounters a string of 10 contiguous "DNA-like" characters. This includes all characters in the set (ACGTUMRWSYKVHDBN-). This function, unlike the phylip original, does not have limits on taxon name lengths. Hence, I had, in the middle of a large alignment, a species whose name included the string "MADAGASCAR", which caused a failure. To be fair, the documentation warns of this, but I think it is extremely easy to overlook, and it seems unfortunate to have to check all your taxon names for a potential IUPAC match before trying to use the function. Presumably, most users who specify sequential spacing will be using whitespace to separate taxon names from DNA sequences, and perhaps it is better to exploit this rather than IUPAC matching.

2) The function is whitespace-sensitive. If you tab-separate the numbers on the first line (number of taxa, number of sites), you'll receive an error with the message: "the first line of the file must contain the dimensions of the data". It appears that spaces are OK, however.

Hopefully this post will be useful to someone in the future with a similar issue. Perhaps these can be addressed in a future update to ape?

-Dan Rabosky

_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
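For anyone who runs into the same two problems, here is a rough pre-check that can be run on the file before handing it to read.dna. It is only a sketch under stated assumptions, not part of ape: it assumes each taxon sits on a single line, with the name separated from the sequence by whitespace, and it matches case-insensitively to err on the side of flagging. The function names `normalize_header` and `risky_names` are made up for this illustration.

```python
import re

# The "DNA-like" character set listed in the post; a run of 10 or more
# of these inside a taxon name is what this check looks for.
IUPAC_RUN = re.compile(r"[ACGTUMRWSYKVHDBN-]{10,}", re.IGNORECASE)


def normalize_header(text):
    """Rewrite the first line so the two dimensions are separated by one space."""
    lines = text.splitlines()
    fields = lines[0].split()  # split() treats tabs and spaces alike
    if len(fields) != 2 or not all(f.isdigit() for f in fields):
        raise ValueError("first line must hold two integers: ntax nsites")
    lines[0] = " ".join(fields)
    return "\n".join(lines) + "\n"


def risky_names(text):
    """Return (line_number, name) for names containing a run of 10+ DNA-like characters."""
    hits = []
    for lineno, line in enumerate(text.splitlines()[1:], start=2):
        parts = line.split(None, 1)
        if len(parts) < 2:  # blank line or no whitespace-delimited name: skip
            continue
        name = parts[0]
        if IUPAC_RUN.search(name):
            hits.append((lineno, name))
    return hits


# Toy alignment with a tab-separated header and one problematic name.
example = (
    "3\t12\n"
    "Homo_sapiens ACGTACGTACGT\n"
    "Brookesia_MADAGASCAR_1 ACGTACGTACGA\n"
    "Mus_musculus ACGTACGTACGG\n"
)

cleaned = normalize_header(example)
print(cleaned.splitlines()[0])  # 3 12
print(risky_names(cleaned))     # [(3, 'Brookesia_MADAGASCAR_1')]
```

Any name it reports can be renamed (for example by breaking up the run with an underscore) before reading the alignment. If a sequence is wrapped over several lines, the single-line assumption no longer holds and the check would need adjusting.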