o DCCA (detrended canonical
> correspondence analysis) but the unconstrained DCA. If anyone knows the
> answer for Jonathan's question, please, share it with me, I would also be
> interested.
> Best regards,
>
> Attila
>
> Peter Solymos wrote (on 2020 Nov 5
Jonathan,
Have you checked ?vegan::decorana (it is also mentioned in the vignette on
p 2: https://cran.r-project.org/web/packages/vegan/vignettes/intro-vegan.pdf
)
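For a quick start, a minimal sketch with the built-in dune data (your own community matrix would go in its place):
library(vegan)
data(dune)
ord <- decorana(dune)   # unconstrained DCA
ord
plot(ord)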
Cheers,
Peter
On Thu, Nov 5, 2020 at 11:03 AM Jonathan Gordon
wrote:
> Hello,
>
> I’m aiming to perform a detrended canonical cor
R >= 4.0 will treat categorical variables as character
(stringsAsFactors=FALSE is the new default) when importing via e.g.
read.table(). If you want factors, you have to make it explicit either as
Torsten showed or by setting stringsAsFactors=TRUE.
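For example (file name and column name are hypothetical):
dat <- read.table("myfile.csv", header = TRUE, sep = ",", stringsAsFactors = TRUE)
## or convert a single column afterwards
dat$treatment <- factor(dat$treatment)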
Cheers,
Peter
On Tue, Jun 23, 2020 at 10:05 AM
Manuel,
There are a few ecology-focused packages besides ARIMA() and the forecast
package, for example:
- popbio (based mostly on Matrix Population Models by Caswell (2001)
and Quantitative Conservation Biology by Morris and Doak (2002))
- PVAClone, that uses JAGS and is based on Nadeem, K., L
is kind of data. I think that many
> other researcher that work with huge species dataset (this one is not very
> big but I worked with thousands of OTUs) will have this problem.
> Do you have an idea how to deal with this kind of data object?
> Thank you,
> Gian
>
>
>
>
Gian,
Once you have your samples by OTU matrix row standardized, you can use a
level of your hierarchy (a vector matching the columns) and the
groupSums(your-matrix, 2, your-groups) function in the mefa4 package to get
your relative abundances.
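A minimal sketch of that call, assuming 'y' is the row-standardized samples-by-OTU matrix and 'tax' is a vector of, say, family names matching the OTU columns:
library(mefa4)
fam <- groupSums(y, 2, tax)   # sums relative abundances within each family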
Cheers,
Peter
Gian Maria Niccolò Benucci wrote:
Hi Saifi,
Here is how you can set up your design variables to be used in the formula
interface of multipart() or adipart() in vegan. You need to adjust the
settings and make sure that the results make sense, because you know the
data.
library(vegan)
# x <- structure(...) # just copied your data f
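Continuing that sketch with the formula interface; the design column names below are hypothetical, and the levels are assumed to go from finest to coarsest:
adipart(y ~ plot + block + region, data = design, index = "richness", nsimul = 99)
multipart(y ~ plot + block + region, data = design, nsimul = 99)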
Jason,
The segments in 'mefa' just add a 3rd dimension to the object, but that
does not limit accessing the stored information. Sample attributes can have
spatial information, but it is not specifically designed to support spatial
analysis. More concrete feature requests are welcome.
There is an
Try using is.na (missing value) instead of is.nan (not a number).
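The difference in one line:
x <- c(1, NA, NaN)
is.na(x)    # TRUE for both NA and NaN
is.nan(x)   # TRUE only for NaN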
Peter
--
Péter Sólymos
780-492-8534 | soly...@ualberta.ca | http://psolymos.github.com
Alberta Biodiversity Monitoring Institute http://www.abmi.ca
Boreal Avian Modelling Project http://www.borealbirds.ca
On Sun, Mar 9, 2014 at
Jeff,
I am not sure why you need 100 random numbers for r and K, but if your goal
is to get stochastic state-space model, you need to define the error term
as a separate parameter and run the loop 100 times with the *same* fixed
parameter values. When you do this, then you need to be aware of
para
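A minimal sketch of that idea, assuming Ricker-type growth with lognormal process error (all parameter values below are made up):
r <- 0.5; K <- 100; sigma <- 0.1          # same fixed parameters in every run
nrep <- 100; nyears <- 50
N <- matrix(NA, nyears, nrep)
N[1, ] <- 10
set.seed(1)
for (j in 1:nrep) {
    for (t in 2:nyears) {
        mu <- N[t - 1, j] * exp(r * (1 - N[t - 1, j] / K))   # deterministic part
        N[t, j] <- mu * exp(rnorm(1, 0, sigma))              # process error
    }
}
matplot(N, type = "l", lty = 1, col = "grey")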
Eda,
How many permutations do you use? Have you tried setting the RNG seed via
set.seed() ? Also, if you have borderline p-values that change from run to
run it might indicate not so strong discrimination by the given clustering
of sites.
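Your original test is not shown, so just as an illustration with mrpp from vegan (assuming a dissimilarity object d and a grouping factor grp):
library(vegan)
set.seed(1234)                      # fixes the permutations, so reruns agree
mrpp(d, grp, permutations = 9999)   # more permutations give a more stable p-value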
Cheers,
Peter
--
Péter Sólymos
780-492-8534 | soly...@
Andrés,
To get statistics other than the mean (SD ~ error as you wrote) you can
stack the dist object (e.g. stack.dist in mefa pkg with dim.names = TRUE)
and then calculate statistics for subsets based on your grouping variable.
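If you prefer staying in base R instead of the mefa route, a sketch (assuming 'd' is your dist object and 'grp' is a named vector of group labels keyed by site name):
m <- as.matrix(d)
long <- data.frame(
    row  = rownames(m)[row(m)[lower.tri(m)]],
    col  = colnames(m)[col(m)[lower.tri(m)]],
    dist = m[lower.tri(m)]
)
long$group <- grp[long$row]          # label each pair by the group of its 'row' site
aggregate(dist ~ group, long, sd)    # e.g. SD of distances per group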
Cheers,
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6
Attila,
See paper and R code by Millar et al. 2011 for a solution based on 'glm':
http://www.esapubs.org/archive///ecol/E092/146/
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monito
ame denominator (the same total number
> of GPS positions for each bird), maybe the results would be almost
> identical ?
>
> Thank you so much again for your time,
>
> Marie
>
>
>
>
> On Wednesday, November 27, 2013 7:19:47 PM, Peter Solymos <
> soly...@ualber
ement.
>
>
> -Original Message-
> From: r-sig-ecology-boun...@r-project.org
> [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Peter Solymos
> Sent: Thursday, 28 November 2013 10:33 AM
> To: marieline gentes
> Cc: r-sig-ecology@r-project.org
> Subject: Re:
Marie,
Your problem and data seems to me a resource selection problem with matched
use-availability design. Estimating procedure for that design is discussed
in Lele and Keim (2006, Ecology 87:3021--3028) and implemented in the
ResourceSelection package: rspf function, see description of argument
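A sketch only (variable names are hypothetical, and the exact form of the matching argument 'm' is what the help page describes): 'status' is 1 for used and 0 for available GPS points, 'id' groups each used point with its matched available points.
library(ResourceSelection)
fit <- rspf(status ~ cov1 + cov2, data = dat, m = dat$id, B = 99)
summary(fit)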
Hello,
A parametric model (e.g. Clench) would allow both interpolation and
extrapolation. There are some caveats of course: (1) these models arose in
the temporal accumulation sense, spatial accumulation is usually calculated
for randomized data, which is an assumption that individuals in the
samp
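A sketch of the parametric route, assuming a data frame 'acc' with sampling effort and observed cumulative richness (names and starting values are made up):
fit <- nls(richness ~ a * effort / (1 + b * effort), data = acc,
           start = list(a = 10, b = 0.1))
predict(fit, newdata = data.frame(effort = 1:200))   # interpolation and extrapolation
coef(fit)["a"] / coef(fit)["b"]                      # asymptotic richness of the Clench model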
Laura,
Hacking the function is straightforward. Change this line:
hessian <- control$hessian
into
hessian <- FALSE
and then this one:
vc <- -solve(as.matrix(fit$hessian))
as
vc <- diag(1, length(fit$par), length(fit$par))
Then you take care of the unexported model_offset_2 function as
pscl
Matias,
The offset term is processed as part of parsing the formula, which results
in a vector of length of the response. Using a vector should not be a
problem.
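The two equivalent spellings, illustrated with glm and a hypothetical 'effort' exposure:
glm(y ~ x + offset(log(effort)), family = poisson, data = dat)
glm(y ~ x, offset = log(dat$effort), family = poisson, data = dat)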
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github
ens if I
> have abundance data for my species matrix but binary trait values?
> Because it seems the function has some problems with this combination.
>
> Regards,
> Thomas
>
> On 6/12/2013 5:17 PM, Peter Solymos wrote:
> > Thomas,
> >
> > 1) P
Thomas,
1) Presence absence data means that you have cell probabilities 1/S_i for
detections and 0 for missing species in a given community i. As Zoltán also
pointed out, it is meaningful to use this, as it has the interpretation of
choosing different species from a species list (and not from a p
Bruce,
Standardizing might not be the best way to go if you have low counts.
You can possibly assume that events follow a homogeneous Poisson
process and rate varies with night length (linear or quadratic) [Y|x ~
Poisson(phi); log(phi)=f(x)]. You can estimate corresponding
coefficients by glm(). I
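A sketch of that model (variable names are hypothetical):
fit <- glm(count ~ poly(night_length, 2), family = poisson, data = dat)
summary(fit)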
Kate,
To get what you want the simplest way try:
df1[,colSums(df1>0)>1,drop=FALSE]
Note the drop=FALSE argument, which makes sure that you still get a
matrix if only one species occurs in more than one site (i.e. no
surprising vector results in the end).
Cheers,
Peter
--
Péter Sólymos, Dept
Jeffrey,
Check out also the popbio package.
Cheers,
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.bore
Thiago,
Additive diversity partitions are calculated as difference between
expected diversities at subsequent levels of a given sampling
hierarchy. If levels are not nested, it is not clear how to partition
these terms. You need either 1) to come up with a defendable method of
how to additively pa
Maybe:
A$X2[A$X2>1] <- 1
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Alberta Biodiversity Monitoring Institute, http://www.abmi.ca
Boreal Avian Modelling Project, http://www.borealbirds.ca
2012/10/25 Ma
Kristen,
Try something like this:
i <- sample(1:nrow(DataSet.Sub2), 85, replace=FALSE)
DataSet.Sub2.66<- DataSet.Sub2[i,]
DataSet.Sub2.33<- DataSet.Sub2[-i,]
Peter
--
Péter Sólymos, Dept Biol Sci, Univ Alberta, T6G 2E9, Canada AB
soly...@ualberta.ca, Ph 780.492.8534, http://psolymos.github.com
Mario,
If you can assume that the waiting time between events is constant
through time, you can model your counts per unit time with Poisson glm
(constant waiting time leads to an exponential survival function).
log(Observation time) can be used as an offset:
glm(interactions ~ covariate, offset = log(observation_time), family = poisson)  # observation_time: your observation-time variable
as to how I can shuffle
> the cells a 1000 times, and then go on to make the SSD value 1000 times?
>
> Thanks again for helping out.
>
> Allan
>
>
>
> - Original Message -
> From: "Peter Solymos"
> To: "Allan Edelsparre"
> Cc: r-sig
Allan,
Simply defining the dimension might work:
dim(v) <- dim(trial2)
but it is not clear what you are trying to achieve with the rep(...,
1000) part. It won't permute the matrix 1000 times but repeat the same
values. You might want to have a look at oecosimu in vegan which
calculates the distributio
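A sketch of the oecosimu route with a user-defined statistic; the statistic and the null model name ("r2dtable", which keeps the row and column totals of a count matrix) are just placeholders for your own choices:
library(vegan)
ssd_fun <- function(x) sum((x - mean(x))^2)          # hypothetical SSD-type statistic
oecosimu(trial2, ssd_fun, method = "r2dtable", nsimul = 999)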
Manuel,
You haven't specified the general problem, but for this particular
situation this is how you can do it:
x <- data.frame(array(1:12, c(3,4), list(paste("item", 1:3),
paste("col", 1:4))))
x <- data.frame(Item=rownames(x), x)
y <- data.frame(Item=x$Item[rep(1:3, each=2)],
matrix(as.matrix(x[
Manuel,
As a starter you can fit nonlinear models using growth functions, and
calculate carrying capacity from estimated model parameters
(stats::nls, lme4::nlmer).
I am not sure if this is of big help as it is still in development,
but our PVAClone R package is almost ready for its 1st CRAN su
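A sketch of the nls route with the self-starting logistic model, where the Asym parameter plays the role of the carrying capacity (data frame and column names are hypothetical):
fit <- nls(N ~ SSlogis(time, Asym, xmid, scal), data = dat)
coef(fit)["Asym"]    # estimated carrying capacity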
Ivailo,
Some (but not all) vegan functions internally coerce the "matrix like"
input object to matrix using 'x <- as.matrix(x)'. The as.matrix()
coercion method is defined for sparse matrices in the Matrix package,
and that is why it works for some (but not all) vegan functions. In
case it does no
Thiago,
It's all about indexing:
rn <- union(rownames(matrix1), rownames(matrix2))
cn <- union(colnames(matrix1), colnames(matrix2))
x <- array(0, dim=c(length(rn), length(cn)), dimnames=list(rn, cn))
x[rownames(matrix1), colnames(matrix1)] <- matrix1
x[rownames(matrix2), colnames(matrix2)] <- matrix2
Hi Elyse,
Tom referred to the mefa package, I'd like to draw the attention to
the mefa4 package which is more efficient and can handle large data
since it uses sparse matrices through the Matrix package. The Melt()
function is the inverse operation to xtabs() or its modified version
Xtabs() in mef
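A sketch, assuming a long data frame 'df' with sample, species and count columns:
library(mefa4)
m <- Xtabs(count ~ sample + species, df)   # sparse samples x species matrix
long <- Melt(m)                            # back to long format (the inverse operation)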
Jonathan and Chris,
The mantel function in vegan package contains dist-to-matrix coercion,
so memory requirements for matrices should be setting the limit.
Cheers,
Peter
Péter Sólymos
Alberta Biodiversity Monitoring Institute
and Boreal Avian Modelling project
Department of Biological Sciences
Kay,
That is an obvious result of the regression tree algorithm which
recursively splits the data and prediction is given as e.g. mean of
observations at terminal nodes. New data will, however, contribute to
cross validation error, a measure of prediction accuracy. The tree
gives the 'global' mode
Bier,
Solutions might depend on OS and 32/64 bit build that you are using.
For general info, have a look at R FAQ:
http://cran.r-project.org/bin/windows/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021
or read help("Memory-limits")
7000*50 is usually not considered big data nowada
Kay and Alexandre,
INLA approach might be fine, but given the data you described, I would
rather think about what might cause the zero inflation (90% zeros) and
the spatial autocorrelation and pick a model accordingly. For example
if the zero inflation might be caused by low abundance related to l
Guojun,
Make lists of them with this function:
fun <- function(file) {
    x <- new.env()    # a fresh environment to hold the workspace
    load(file, x)     # load the .RData file into it
    as.list(x)        # return the objects as a named list
}
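A usage sketch (file names are hypothetical):
ws <- lapply(c("run1.RData", "run2.RData"), fun)   # one list per workspace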
HTH,
Peter
On Fri, Feb 10, 2012 at 1:52 PM, lgj200306 wrote:
> Hi, everyone!
> I have several R workspaces, with each contains many files. I want to combine
> these
Cory,
There is a profile method for fisherfit. If that's not enough, you can
have a look inside of fisherfit, and you'll see the Dev.logseries
internal function and how it is used in nlm to get estimate of alpha.
The same function can be used for manual profiling to get likelihood
values for a seq
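A sketch with the built-in BCI data (one site's counts):
library(vegan)
data(BCI)
fit <- fisherfit(BCI[1, ])
pr <- profile(fit)    # profile log-likelihood for alpha
pr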
On Sat, Aug 6, 2011 at 4:06 AM, Lene Jung wrote:
> HI,
>
> I’m having several problems trying to fit distributions to data that I have
> sorted into a data frame, so the each ID has its own step length and turn
> angle.
>
>
>
> I can fit a Weibull distribution to step lengths with following code:
                                    incidences.Genus incidences.Family
Biak-Numfoor rain forests                          2                 2
Central Range montane rain forests                19                19
Huon Peninsula montane rain forests
Chris,
Something like this should do the job:
1.
rowSums(xtabs(~ ECO_NAME + Genus, x) > 0)
2.
rowSums(xtabs(~ ECO_NAME + Genus, x))
Cheers,
Peter
On Fri, May 20, 2011 at 3:19 PM, Chris Mcowen wrote:
> Dear List,
>
> I am looking to calculate two things from my data frame and was after some
Jeff,
The zero truncated Poisson distribution can be described as a
probability function conditional on Y>0:
Pr(Y=y|Y>0) = Pr(Y=y) / (1-Pr(Y=0)), y=1,2,3,...
This leads to this function 'nll' for the negative log-likelihood:
nll <- function(theta, y, X) {
    lambda <- exp(drop(X %*% theta))
    -sum(dpois(y, lambda, log = TRUE) - log(1 - exp(-lambda)))
}
Hi Scott,
Lognormal refers to a probability distribution of a random variable
whose logarithm is normally distributed. Said so, if you log transform
your CV, you can apply Gaussian family, or simply lm().
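For example (the predictor name is hypothetical):
fit <- lm(log(CV) ~ treatment, data = dat)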
Cheers,
Peter
On Sat, Mar 26, 2011 at 9:16 AM, Scott Chamberlain
wrote:
> Dear sigecos,
>
>
Dear Diogo,
Thanks for self-correction, I reply for clarification.
The twostagechao function was in the developmental version of vegan
but got removed on 2009-12-16 07:12:59 +0100 (Wed, 16 Dec 2009) at
revision 1083 along its documentation. It never went into vegan stable
release. The function wa
Maria,
You can have number of complete pairs for each column pair combination as
(x <- matrix(c(NA,NA,NA,1,2,3), 2, 3))
(x.na <- !is.na(x))
t(x.na) %*% x.na
You can supply this as is or its lower triangle as vector to the
plotting function.
Cheers,
Peter
On Fri, Jan 21, 2011 at 8:25 AM, Mar
Hi,
for one factor, it is enough to do
dat$ageclass <- dat$ageclass[drop=TRUE]
the 'purgef' function applies the drop statement for each columns in
the data frame and eventually returns a list because that's how
'lapply' works. If you want a data frame in the end, you can either do
(as I recall
Yong Zhang,
I think this is what you are looking for:
library(mefa)
x <- matrix(c(
2, 3, 4, 2,
5, 6, 5, 2,
4, 3, 4, 5,
4, 5, 4, 1), 4, 4, byrow=TRUE)
colnames(x) <- c("site1", "site2", "site3", "site4")
Ophelia and Others,
A more general solution is:
## define a function (that is part of the mefa package)
`rep.data.frame` <- function(x, ...)
as.data.frame(lapply(x, rep, ...))
## example from ?mefa:::rep.data.frame
x <- data.frame(sample = LETTERS[c(1,1,2,2,3)],
species = letters[c(5,5,5,
s which are found in many sites within the
> region, irrespective of the total number of species found within them. Does
> the beta component of contribdiv ( ) do this, or does it also take into
> account the total number of species within the sites?
> -Burak
>
> -Original Mess
Burak,
Not exactly clear what you want, but Lu et al. 2007 describes the
differentiation coefficient D that is the sum(beta)/sum(gamma) and can
be obtained as attributes(contribdiv(x))$diff.coef . Note also that
there is a link between diversity partitions and distance indices so
you might as well
On Fri, Sep 3, 2010 at 11:32 PM, Jane Shevtsov wrote:
> Does R have a p-function for empirical distributions? In other words,
> how can I find out what fraction of the values in my data set are
> smaller than a given value?
Maybe
sum(x <= crit) / length(x)
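or, packaged as a function, base R's ecdf():
Fn <- ecdf(x)
Fn(crit)    # fraction of values <= crit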
Cheers,
Peter
Péter Sólymos
Alberta Bio
Dear All,
I had a quick look at the internal functions used by pscl::hurdle to
do the numerical optimization by optim. It clearly corresponds to the
hurdle model defined in the paper/vignette, where the zero component
is based on a right censored random variable, that is 0 if the
original count da
Manuel,
it depends on whether you are interested (1) in mean predictions only
or (2) in prediction intervals as well. In the first case, this will
give you mean predictions:
x1 <- seq(min1, max1, len=25)
x2 <- seq(min2, max2, len=25)
x3 <- seq(min3, max3, len=25)
x10 <- x20 <- x30 <- rep(0, 25)
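A sketch of the likely continuation (the fitted model object 'fit' and the response type are assumptions): vary one covariate along its range while holding the others at 0 (i.e. at their centered means).
nd1 <- data.frame(x1 = x1, x2 = x20, x3 = x30)
p1 <- predict(fit, newdata = nd1, type = "response")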
Adriano,
what you have reported is strange, however, (1) it is not clear what
the strata 'bloco' is and if it can cause the problem (i.e. too
restrictive permutation scheme, or the strata has something to do with
your independent variable 'trat'), and (2) you are not using Sorensen
index "sor" = 2*
Hi Kay,
I meant to make permutations within time points, i.e. strata=time, and
not within locations (strata=locations). adonis does F-tests based on
sequential sums of squares from permutations, thus the non-independence
of repeated measures can have an effect on p-values
associated with terms subseq
Kay,
using strata (restricting permutations within time points, and not
within locations) in adonis makes some sense, given that permutation
tests assume independence. It does not fully solve the dependence
problem, but it is a good starting point. If you have a before-after
control-treatment design,
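A sketch of restricting the permutations within time points (object and column names are hypothetical):
library(vegan)
adonis(comm ~ treatment, data = env, strata = env$time, permutations = 999)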
Kang Min,
The error comes from the function 'fisherfit' that uses 'nlm' to
minimize the negative log likelihood for the Fisher's log-series.
Numerical optimization does not tolerate missing values. This code
reproduces your error:
> library(vegan)
> data(BCI)
> BCI[1,1] <- NA
> fisher.alpha(BCI)
Dave,
The vegdist function of the vegan package produces an object of class
"dist" similarly to the dist function in stats. It can be converted
into a symmetric matrix by as.matrix(x). The help page of dist will
give you details about the structure of "dist" objects.
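For example:
library(vegan)
d <- vegdist(comm)    # comm: your sites x species matrix
m <- as.matrix(d)     # full symmetric dissimilarity matrix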
Cheers,
Peter
On Wed, Apr
Claudia,
Here is a more specific paper with hierarchical Bayesian model:
Stephen L. Rathbun and Songlin Fei
A spatial zero-inflated poisson regression model for oak regeneration
Environmental and Ecological Statistics, 2006, 13: 409-426
http://www.springerlink.com/content/r327264t016x2873
Lanna,
I don't know exactly what do you mean by co-occurrence data frame, but
if you'd like to get a species-by-species matrix, in which you count
the co-occurrences of the species (columns in the sites-by-species
community matrix) with each other, you can use the crossprod function
or the %*% ope
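A sketch, assuming 'comm' is the sites-by-species matrix:
pa <- comm > 0
co <- crossprod(pa)   # species x species counts of joint occurrences, same as t(pa) %*% pa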
> suspect that this is probably not exactly what you want to be doing (I do
> have particular opinions though).
>
> HTH,
>
> Scott
>
> Hall, D.B. Zero-Inflated Poisson and Binomial Regression with Random
> Effects: A Case Study. Biometrics 56, 1030-1039.
>
>
> Pete
Trevor,
You can use weights in the model to provide the surface area (or
sqrt(surface area) to enhance linearity) and leave the counts as they
are in the ZINB model. (In the zeroinfl function weights are used to
weight the log-likelihood and to scale the residuals.)
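A sketch only (variable names are hypothetical, and whether to use area or sqrt(area) is your call):
library(pscl)
zeroinfl(count ~ x1 + x2, data = dat, dist = "negbin", weights = sqrt(area))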
Cheers,
Peter
Péter Sólymos
n Tukey
>
> -Original Message-
> From: Gavin Simpson [mailto:gavin.simp...@ucl.ac.uk]
> Sent: Monday, 8 February 2010 11:14
> To: ONKELINX, Thierry
> CC: Peter Solymos; Nathan Lemoine
> Subject: Re: [R-sig-eco] multiple regression
>
> On Mon,
I meant "Species richness is discrete", not categorical.
Peter
On Sat, Feb 6, 2010 at 12:52 PM, Peter Solymos wrote:
> Nathan,
>
> Species richness is categorical, so if your richness values are
> usually low (say < 20), you should consider the use of Poisson GLM
Nathan,
Species richness is categorical, so if your richness values are
usually low (say < 20), you should consider the use of Poisson GLM, or
log-transform your response (and log is the canonical link function
for Poisson GLM). This usually improves the model fit. And this might
apply to abundanc
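For example (predictor names are hypothetical):
glm(richness ~ habitat + log(area), family = poisson, data = dat)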
>> >> nulls <- replicate(n = 999, nullabun(m), simplify = F)
>> >>
>> >> # how many unique
>> >>
>> >>> null matrices?
>> >>>
>> >> length(unique(nulls) ) # I found 983 out of 999
>> >>
>> >> # ho
stions.
>> >
>>
>> > commsimulator indeed respects the two contraints I'm interested in, but>
>> > only allows for binary data.
>> >
>> > swap.web is *almost* what I need, but
>> only overall matrix fill is kept
>> > constant,
ne
>> > more constraint to your null model (that of column AND row constancy
>> >
>> > of 0s) will uniquely define the matrix! Referees may not pick it up,
>> > but it
>> > may give you trivial results.
>> >
>> > Best wishes,
>> >
t of rounding
>> to nearest integer (a bit arbitrary). In a way, shuffling mat could now
>> be seen as re-allocating "units of biomass" randomly to plots. However,
>> doing so results in a matrix with large number of "individuals" to
>> reshuffle, which can slo
Dear All,
Perhaps, there is another way of approaching this problem: the
Monmonier's maximum-difference barriers algorithm.
Monmonier, M. (1973) Maximum-difference barriers: an alternative
numerical regionalization method. Geographic Analysis, 3, 245–261.
Manni, F., Guerard, E. and Heyer, E. (200
Dear All,
I admit that overdispersion can be a problem. But you can't compare
Poisson with quasi-Poisson based on logLik, because the likelihood is
not defined for quasi* models. The quasi-likelihood can be maximized
to get the dispersion parameter, but coefficients are the same, only
SE's and p-v
Dear Steve,
If the direction is important, you can use that information as a
separate matrix with signs to scale up its effect. Because distance
can't be negative, you might end up with numbers hard to interpret.
Yours,
Peter
Péter Sólymos
Alberta Biodiversity Monitoring Institute
Department of
Christine,
There is no summary method for adonis. After calling the function,
simply use print:
x <- adonis(...)
x
And you are right, you can supply raw data and use the method argument
in adonis to define dissimilarity index (which is "bray" by default).
Cheers,
Peter
Péter Sólymos
Alberta Bi
Hi Bálint,
Here are my two cents.
By using LM with transformed data (which transformation can also be
logit, loglog, cloglog, probit) you lose the binomial error
structure, because you won't follow the trial/success experiment
scheme. But percent cover is not that kind of [0,1] data where this
s
Dear Leigh,
You have 2 options:
1. build the MAC OS X package for yourself and install it, in this way
you will be able to use the help files as usual,
2. unpack the .tar.gz and source all files in the /R directory (on how
to do it at once see Example in help(source)).
Cheers,
Peter
On Wed, A
No worries!
I just added the option 'dim.names = TRUE' by which you
can get the dimnames back, but you still need the
matrix > dist > data.frame coercion chain.
as.data.frame.dist <-
function (x, row.names = NULL, optional = FALSE, dim.names = FALSE, ...)
{
if (!missing(optional))
.No
Hi,
The 'as.data.frame.dist' function requires the mefa package,
that's why I wrote the line 'library(mefa)'. This returns the
lower triangle only, but it does not return the row/col names.
The melt method, however returns the full matrix, not only
the lower triangle, see below.
Best,
Peter
--
Dear Jin-Long,
You can try this:
x <- cbind(rnorm(10), rnorm(10), rnorm(10), rnorm(10))
y <- cor(x)
y
library(mefa)
z <- as.data.frame(as.dist(y))
z
Yours,
Peter
Peter Solymos, PhD
Postdoctoral Fellow
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton
y1 <- xtabs(Counts ~ interaction(x$SITE, x$Replicate) + SPECIES, x)
## replicates cross tabulated separately
y2 <- xtabs(Counts ~ SITE + SPECIES + Replicate, x)
## same with mefa (will give you warnings
## due to some 'empty sample' misspecifications)
m <- mefa(stcs(x))
m$segm
Yours,
Peter
Pe
Dear Jacob,
Erika was right, you just have to perform a goodness of fit test. But it
is easier to inspect your residual deviance. It follows a Chi-squared
distribution, where the expected value should be close to the degrees of
freedom if the fit is good. To get a P value for an object of class
"ne
Hi Kate,
You can use time series analysis (ar, arima functions at first)
instead, because YEAR and WEEK clearly have structure (i.e.
observations are conditional on previous observations with some lag).
To control for SITE, you can use polynomials of the geographical
coordinates (or write a hierarc
Hi Manuel,
I would suggest to use the signed difference. This will be Skellam
distribution with expected value mu1-mu2 (means of the two Poisson
distr) and variance mu1+mu2. The skellam and vglm functions in the
VGAM package can be used for a likelihood ratio test for equal means
(see example(skel
Dear Manuel and Wilfried,
the ctree function in the party package for recursive part(y)itioning
can handle multivariate response. There is also a vignette:
http://cran.r-project.org/web/packages/party/vignettes/party.pdf
Best,
Péter
Péter Sólymos, PhD
Postdoctoral Fellow
Department of Mathematical
Dear Roy,
I haven't done this, but I would start with the function zsm in the
package untb by Robin Hankin. See also the function etienne.
Yours,
Péter
Péter Sólymos, PhD
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton, Alberta, T6G 2G1
Canada
On Sun, Jan 11, 2
Hi All,
maybe a more transparent solution for the zombie factor problem
(dropping unused factor levels) for data frames is (note, this applies
for all factors in the data frame x):
x[] <- lapply(x, function(x) x[drop = TRUE])
As I recall, on the help page of factor(), there is a slight warning
a
Dear Leigh,
The Ward method minimizes the within-cluster sum of squares of the
distances. So it is not easy to back-scale it to reflect original
distances. Instead you should try *linkage methods, see ?hclust. To
read about the Ward (Ward-Orloci) method see:
- Ward 1963 JASA 58: 236-244
- Orlo
like this (SAMPLES and SPECIES are the two
column in the long format, have to be the same length):
x <- mefa(stcs(data.frame(SAMPLES,SPECIES)))
cl <- hclust(dist(x$xtab))
Hope this works,
Peter
Peter Solymos, PhD
Department of Mathematical and Statistical Sciences
University of Alberta
E