Re: [R-sig-phylo] Problem with bootstraping microstallite NJ tree

2016-01-22 Thread Zhian Kamvar
Hi,

A much more efficient way for doing this process is to use the poppr function 
"aboot()" with the genind object. 

Best,
Zhian

Sent from my iPhone

> 
> 
> Hello,
> I have probably very simple problem, but I can't find the solution... :-( I 
> wish to make bootstraped tree of attached microsatellite diploid data. It 
> works fine with DNA sequences and I was using same code last year and it was 
> working. :-)
> 
> library(pegas)
> library(ape)
> LOCI <- read.loci("ssrs.txt", header=TRUE, loci.sep="\t", allele.sep="/", 
> col.pop=2, col.loci=3:14, row.names=1)
> # Looks OK
> GENIND <- loci2genind(LOCI)
> DIST <- dist(x=GENIND, method="euclidean", diag=TRUE, upper=TRUE)
> NJ <- nj(DIST)
> BOOT <- boot.phylo(phy=NJ, x=LOCI, FUN=function(X) nj(dist(loci2genind(X))), 
> B=1000)
>  |=   
>   
> |   1%Error in df2genind(as.matrix(x[, attr(x, "locicol")]), sep = "[/\\|]",  
> : 
>  length of factor pop differs from nrow(X)
> 
> Any idea what could went wrong?
> Thank You,
> Vojt?ch

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Problem with bootstraping microstallite NJ tree

2016-01-22 Thread Vojtěch Zeisek
Hello,
thank You very much, Emmanuel, this worked perfectly. Silly me I've overlooked 
that. Yes, for microsatellites would be probably better use poppr's bruvo dist 
or ade4's dist.genet.
Sincerely,
Vojtěch
PS: Later yesterday I have received spam answering my original post. Spammer 
is obviously subscribed and harvesting mails passing through the conference 
and then mailing users off-list. I've notified admins, but it's hard to resist 
such... ehm...

Dne Čt 21. ledna 2016 17:15:48 jste napsal(a):
> Hi Vojtěch,
> 
> The trouble comes from boot.phylo() which resamples the columns of the
> data matrix with replacement: this may result in a bootstrap sample
> without population column and loci2genind() doesn't like it. The trick
> is to delete this column from the original "loci" object, and then
> reassign it within the estimation process so that loci2genind() works.
> 
> pop <- LOCI$population
> LOCI$population <- NULL
> 
> foo <- function(X) {
>  X$population <- pop
>  nj(dist(loci2genind(X)))
> }
> 
> Then the following should work:
> 
> NJ <- foo(LOCI)
> BOOT <- boot.phylo(phy=NJ, x=LOCI, FUN=foo, B=1000)
> 
> BTW, I don't know if using dist() on a "genind" object is something
> really meaningful. Maybe you want to use another distance function from
> adegenet or another packae (eg, poppr). Anyway, the above trick will be
> useful in all cases since nj() only needs a distance matrix.
> 
> Best,
> 
> Emmanuel
> 
> Le 21/01/2016 14:30, Vojtěch Zeisek a écrit :
> > Hello,
> > I have probably very simple problem, but I can't find the solution... :-(
> > I
> > wish to make bootstraped tree of attached microsatellite diploid data. It
> > works fine with DNA sequences and I was using same code last year and it
> > was working. :-)
> > 
> > library(pegas)
> > library(ape)
> > LOCI <- read.loci("ssrs.txt", header=TRUE, loci.sep="\t", allele.sep="/",
> > col.pop=2, col.loci=3:14, row.names=1)
> > # Looks OK
> > GENIND <- loci2genind(LOCI)
> > DIST <- dist(x=GENIND, method="euclidean", diag=TRUE, upper=TRUE)
> > NJ <- nj(DIST)
> > BOOT <- boot.phylo(phy=NJ, x=LOCI, FUN=function(X)
> > nj(dist(loci2genind(X))), B=1000)
> > 
> >|=
> >|
> > |   1%Error in df2genind(as.matrix(x[, attr(x, "locicol")]), sep =
> > |   "[/\\|]",
> >
> >length of factor pop differs from nrow(X)
> > 
> > Any idea what could went wrong?
> > Thank You,
> > Vojtěch
-- 
Vojtěch Zeisek
http://trapa.cz/en/

Department of Botany, Faculty of Science
Charles University in Prague
Benátská 2, Prague, 12801, CZ
http://botany.natur.cuni.cz/en/

Institute of Botany, Academy of Science
Zámek 1, Průhonice, 25243, CZ
http://www.ibot.cas.cz/en/

Czech Republic


signature.asc
Description: This is a digitally signed message part.
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problem with bootstraping microstallite NJ tree

2016-01-22 Thread Hilmar Lapp
The irony is of course that I just received a spam response myself in response 
to the below, so it’s clearly some automated scheme :-)

I will make the mailman archives private now. (For those worried that this 
stops the archives from being publicly indexed at mail-archive.com, this should 
not be the case, because mail-archive.com is subscribed for this purpose. We’ll 
monitor this nonetheless.)

  -hilmar

> On Jan 22, 2016, at 12:05 PM, Hilmar Lapp  wrote:
> 
> 
>> On Jan 22, 2016, at 3:16 AM, Vojtěch Zeisek  wrote:
>> 
>> PS: Later yesterday I have received spam answering my original post. Spammer
>> is obviously subscribed and harvesting mails passing through the conference
>> and then mailing users off-list.
> 
> Just to clarify for everyone, the perpetrators of this kind of spam are *not* 
> subscribed, at least not with the address (or one similar to it) that they 
> are spamming from. I can say this because I’ve blocked the whole TLD (.xyz) 
> from subscribing, and no .xyz emails are subscribed.
> 
> It is of course possible that the perpetrators *are* subscribed with a 
> legit-looking email, and then spam from throw-away addresses. This would be 
> very difficult to combat (there are currently > 1000 subscribers to this 
> list, making it futile to try to scrutinize every subscribed address). It is 
> also possible (and much more likely IMO) that the perpetrators feed off of a 
> public mailing list archive. I just noticed that the public mailman archive 
> apparently does not obfuscate the sender addresses (which apparently is 
> controlled by a setting not available to list administrators, but only to 
> sysadmins at the host). I am probably going to make the list archives 
> private, which would require people perusing an indexed public 3rd party mail 
> archive (such as that at 
> http://www.mail-archive.com/r-sig-phylo@r-project.org/). Those are much more 
> usable than the mailman archive anyway, so perhaps the impact of this step in 
> terms of accessibility may be small.
> 
>  -hilmar
> --
> Hilmar Lapp -:- genome.duke.edu -:- lappland.io
> 
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

--
Hilmar Lapp -:- genome.duke.edu -:- lappland.io



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problem with bootstraping microstallite NJ tree

2016-01-22 Thread Hilmar Lapp

> On Jan 22, 2016, at 3:16 AM, Vojtěch Zeisek  wrote:
> 
> PS: Later yesterday I have received spam answering my original post. Spammer
> is obviously subscribed and harvesting mails passing through the conference
> and then mailing users off-list.

Just to clarify for everyone, the perpetrators of this kind of spam are *not* 
subscribed, at least not with the address (or one similar to it) that they are 
spamming from. I can say this because I’ve blocked the whole TLD (.xyz) from 
subscribing, and no .xyz emails are subscribed.

It is of course possible that the perpetrators *are* subscribed with a 
legit-looking email, and then spam from throw-away addresses. This would be 
very difficult to combat (there are currently > 1000 subscribers to this list, 
making it futile to try to scrutinize every subscribed address). It is also 
possible (and much more likely IMO) that the perpetrators feed off of a public 
mailing list archive. I just noticed that the public mailman archive apparently 
does not obfuscate the sender addresses (which apparently is controlled by a 
setting not available to list administrators, but only to sysadmins at the 
host). I am probably going to make the list archives private, which would 
require people perusing an indexed public 3rd party mail archive (such as that 
at http://www.mail-archive.com/r-sig-phylo@r-project.org/). Those are much more 
usable than the mailman archive anyway, so perhaps the impact of this step in 
terms of accessibility may be small.

  -hilmar
--
Hilmar Lapp -:- genome.duke.edu -:- lappland.io



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Problem with bootstraping microstallite NJ tree

2016-01-21 Thread Emmanuel Paradis

Hi Vojtěch,

The trouble comes from boot.phylo() which resamples the columns of the 
data matrix with replacement: this may result in a bootstrap sample 
without population column and loci2genind() doesn't like it. The trick 
is to delete this column from the original "loci" object, and then 
reassign it within the estimation process so that loci2genind() works.


pop <- LOCI$population
LOCI$population <- NULL

foo <- function(X) {
X$population <- pop
nj(dist(loci2genind(X)))
}

Then the following should work:

NJ <- foo(LOCI)
BOOT <- boot.phylo(phy=NJ, x=LOCI, FUN=foo, B=1000)

BTW, I don't know if using dist() on a "genind" object is something 
really meaningful. Maybe you want to use another distance function from 
adegenet or another packae (eg, poppr). Anyway, the above trick will be 
useful in all cases since nj() only needs a distance matrix.


Best,

Emmanuel

Le 21/01/2016 14:30, Vojtěch Zeisek a écrit :

Hello,
I have probably very simple problem, but I can't find the solution... :-( I
wish to make bootstraped tree of attached microsatellite diploid data. It
works fine with DNA sequences and I was using same code last year and it was
working. :-)

library(pegas)
library(ape)
LOCI <- read.loci("ssrs.txt", header=TRUE, loci.sep="\t", allele.sep="/",
col.pop=2, col.loci=3:14, row.names=1)
# Looks OK
GENIND <- loci2genind(LOCI)
DIST <- dist(x=GENIND, method="euclidean", diag=TRUE, upper=TRUE)
NJ <- nj(DIST)
BOOT <- boot.phylo(phy=NJ, x=LOCI, FUN=function(X) nj(dist(loci2genind(X))),
B=1000)
   |=
|   1%Error in df2genind(as.matrix(x[, attr(x, "locicol")]), sep = "[/\\|]",
:
   length of factor pop differs from nrow(X)

Any idea what could went wrong?
Thank You,
Vojtěch



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/