Re: [R-sig-phylo] Bootstrap values and NJ when there is no genetic distance between samples

2011-05-09 Thread Joe Felsenstein
Emmanuel wrote: Is it a problem with ties or with identical sequences? I guess you can solve the latter easily (e.g., using the haplotype function in pegas), and this will solve the vast majority of ties. Other cases of ties will certainly not result in such high bootstrap values (that's my
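A minimal sketch of the suggestion above: collapse identical sequences with haplotype() from pegas, then build and bootstrap the NJ tree on the unique sequences only. The toy alignment, the "raw" distance model, the choice of B = 100, and the variable names (seqs, x, x_unique) are assumptions made for illustration and are not from the thread.

```r
library(ape)
library(pegas)

# Toy alignment (made up): s1, s2 and s3 are identical.
seqs <- c(
  s1 = "acgtacgtacgtacgtacgt",
  s2 = "acgtacgtacgtacgtacgt",
  s3 = "acgtacgtacgtacgtacgt",
  s4 = "acgaacgaacgtacgtacgt",
  s5 = "ccgaacgaactttcgtacgt",
  s6 = "ccgaacgaactttcgaccga"
)
x <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))

# Group identical sequences; attr(h, "index") lists the members of each group.
h <- haplotype(x)
idx <- attr(h, "index")

# Keep the first member of each haplotype as its representative.
x_unique <- x[sapply(idx, function(i) i[1]), ]

# NJ tree and bootstrap on the unique sequences only.
nj_fun <- function(aln) nj(dist.dna(aln, model = "raw"))
tr <- nj_fun(x_unique)
set.seed(1)
bp <- boot.phylo(tr, x_unique, nj_fun, B = 100)
bp
```

Ties between non-identical sequences are not removed by this step, so it addresses only the identical-sequence case.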

Re: [R-sig-phylo] Bootstrap values and NJ when there is no genetic distance between samples

2011-05-07 Thread Alastair Potts
Hi Emmanuel (Klaus and Joe), The example data was meant to demonstrate that the tie-breaking in nj is affecting the bootstrap results - or rather the lack of any way to deal with tie breaking. I've noticed that a bunch of identical sequences form a 'polytomy' in my real dataset (but obviously

Re: [R-sig-phylo] Bootstrap values and NJ when there is no genetic distance between samples

2011-05-06 Thread Emmanuel Paradis
Hi Alastair, Klaus & Joe, Before doing the tree, you should do some preliminary data exploration, such as:
    d <- dist.dna(a)
    hist(d)
    summary(d)
That'd show you that any tree estimation procedure (not only NJ) has very little meaning -- just like you do plot(x, y) before doing lm(y ~ x). Best,
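The checks suggested above can be run as a short script. This is only a sketch: the woodmouse alignment that ships with ape stands in for the poster's object a, and the count of zero distances (n_zero) is an extra step added here, not part of the original message.

```r
library(ape)

# Stand-in data (assumption): replace with your own alignment.
data(woodmouse)
a <- woodmouse

d <- dist.dna(a)  # pairwise distances, default model "K80"
summary(as.vector(d))
hist(as.vector(d), main = "Pairwise distances", xlab = "distance")

# Number of sequence pairs with no measurable distance between them.
n_zero <- sum(d == 0, na.rm = TRUE)
n_zero
```

A histogram piled up at or near zero suggests the data carry little information for resolving a tree, whichever method is used.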

Re: [R-sig-phylo] Bootstrap values and NJ when there is no genetic distance between samples

2011-05-05 Thread Klaus Schliep
Hi Alastair, it is not that surprising. NJ normally does not produce polytomies, just edges of length 0. How these are broken may depend on the input order (the order of the labels in the distance matrix, as in this implementation) or they could be broken randomly. I added some code below to highlight
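This is not Klaus's code, which is cut off in the archive. As a separate, minimal sketch of the point about zero-length edges: nj() in ape returns a fully resolved tree even when some sequences are identical, and di2multi() can collapse the (near) zero-length internal edges into polytomies afterwards. The toy alignment, the "raw" model, and the tolerance are assumptions chosen for illustration.

```r
library(ape)

# Toy alignment (made up): s1, s2 and s3 are identical.
seqs <- c(
  s1 = "acgtacgtacgt",
  s2 = "acgtacgtacgt",
  s3 = "acgtacgtacgt",
  s4 = "acgtacgaacga",
  s5 = "tcgaacgaacca"
)
x <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))

d <- dist.dna(x, model = "raw")
tr <- nj(d)

# An unrooted, fully resolved tree has Ntip - 2 internal nodes.
Nnode(tr) == Ntip(tr) - 2

# Inspect the branch lengths; edges among identical sequences are (near) zero.
tr$edge.length

# Collapse internal edges shorter than the tolerance into polytomies.
tr_multi <- di2multi(tr, tol = 1e-8)
c(resolved = Nnode(tr), collapsed = Nnode(tr_multi))
```

Collapsing before comparing trees avoids counting arbitrary resolutions of a zero-length edge as real clades.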