Re: [R-sig-eco] vegan package: goodness.cca function? (citations needed)

2009-12-16 Thread Jari Oksanen



On 25/11/2009 16:56, L Quinn lqu...@hotmail.com wrote:

 
 Hello,
 
 I posted this R-help and Ecolog-L before I knew about R-sig-ecology.
 I hope you will forgive the cross-posting, if you subscribe to those lists
 too.
 
 I deleted a number of species
 from a canonical correspondence analysis (CCA) model after using the
 function goodness.cca in vegan.  In his explanation of the diagnostic
 tools for CCA
 (http://cc.oulu.fi/~jarioksa/softhelp/vegan/html/goodness.cca.html),
 Jari Oksanen states that "It is a common practise to use goodness
 statistics to remove species from ordination plots." However, I have
 not been able to find
 references on this common practice in the literature. I may be using
 inappropriate search terms?? However, if anybody can point me towards
 some cite-able references/rules of thumb on this, I would very much
 appreciate it.
 
Hello,

CanoDraw does so by default (or did when I last used it several years ago), and
since people commonly use CanoDraw, they commonly omit badly fitting
species. This doesn't mean that they are aware of doing so. I don't know how
well this is documented, but check the CanoDraw manual and books about CANOCO.

Cheers, Jari Oksanen

 And yes, I am also aware that Jari goes on to say that deleting species isn't
 always a good idea.
 
 Thank you!
 
 Lauren Quinn  
 
 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



Re: [R-sig-eco] dissimilarity and species turnover

2009-12-15 Thread Jari Oksanen
On 15/12/09 18:54 PM, Amanda Stanley ama...@appliedeco.org wrote:

 Robert,
 check out package ncf by Ottar Bjornstad
 http://cran.r-project.org/web/packages/ncf/index.html.
 The spline correlogram function might be what you need.  From the
 documentation:
 
 The spline (cross-)correlogram differs from the spatial correlogram (and
 Mantel correlogram) in that it estimates spatial dependence as a continuous
 function of distance (rather than binning into distance classes).
 
Howdy,

Actually, the binning is not *the* problem (and the basic Mantel test does not
use binning at all: it is only for the correlograms). The problem is
partitioning *distances* into *additive* components, as is implicitly done
when you have something like partial Mantel tests.

What you can do for starters is to go to the October archive of R-sig-eco,
which has two threads on the very same issues ("using two distance metrices
in formula" and "Mantel test with skew-symmetric matrices?"). The starting
questions were not exactly identical to this question, but the discussion
soon radiated to relevant issues. I'd recommend you check Sarah Goslee's
comments at the minimum. If you want to go deeper here (and you should if
you are serious), dig up the late 2008 issue of Ecology with the
Legendre & mates vs. Tuomisto discussion (somewhere around pages 3230 to
3256 of vol. 89).

That's for starters.

Cheers, Jari Oksanen

 I've been exploring using this approach for a similar problem. I'd be
 curious to know the opinions of this group if it appropriately deals with
 the issues surrounding Mantel tests.
 
 --Amanda Stanley
 
 ***
 Amanda G. Stanley, Ph.D.
 Project Director
 
 Institute for Applied Ecology
 P.O. Box 2855
 Corvallis, OR 97339-2855
 (Phone)541-753-3099 x133
 (Fax)541-753-3098
 
 ama...@appliedeco.org
 www.appliedeco.org
 
 From: Robert Ptacnik ptac...@icbm.de
 To: r-sig-ecology@r-project.org
 Date: Mon, 14 Dec 2009 17:26:39 +0100
 Subject: [R-sig-eco] dissimilarity and species turnover
 Hi,
 Mantel statistics have repeatedly been criticized. I wonder if there is an
 (approved) alternative for my problem:
 I aim to test whether one parameter (productivity, P) affects turnover (t)
 among ecological communities in time (T) or space (S). (ΔT and ΔS will be
 used as co-variables.)
 To avoid confusion - I do NOT aim to test whether P affects composition as
 such (which could be tested by an ordination method), but whether the degree
 of similarity among samples scales with P.
  My approach so far was to calculate a dissimilarity matrix from my
 community data, distance matrices for the relevant environmental data (ΔT,
 ΔS, ΔP) and a mean(P) matrix, giving the mean(P) for each pair of
 observations.
 Then I performed Mantel tests of whether t correlates with mean(P), taking other
 variables into account (partialling out). However, Mantel tests and especially
 partialling out are often criticized (e.g. see documentation in vegan).
 any views?
 thanks!
 Robert
 
 
 Robert Ptacnik, PhD
 
 ICBM, Univ. of Oldenburg
 Schleusenstrasse 1, DE-26382 Wilhelmshaven
 http://www.icbm.de/planktologie/en/
 
 ptac...@icbm.de
 


Re: [R-sig-eco] Fwd: Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-10 Thread Jari Oksanen
On 10/12/09 23:03 PM, gabriel singer gabriel.sin...@univie.ac.at wrote:

 A difference between two communities within a host could still exist and
 could make perfect sense, too, when you regard community as a random
 factor. Then community may introduce some extra variation (compared to
 the within-community variation), experimentally seen interesting and
 important, because the replication of communities makes sure you are not
 pseudoreplicating. I am not sure however, how to declare the correct df
 for the random factor in adonis in this case... anybody knows better than I?

There is no way.

Random factors can only be mimicked by stratified permutation, but they
cannot be defined in adonis. It is only for fixed effect models.
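
A minimal base-R sketch of what stratified (restricted) permutation means: observations are shuffled only within the levels of a grouping factor, never between them. The names `strata` and `permute_within` are hypothetical and the design is made up; this only illustrates the idea and is not vegan's own permutation code.

```r
# Hypothetical design: 3 communities (strata) with 4 samples each
strata <- factor(rep(c("A", "B", "C"), each = 4))

# Shuffle observation indices within each stratum only.
# sample.int() keeps a stratum with a single member safe.
permute_within <- function(strata) {
  idx <- seq_along(strata)
  ave(idx, strata, FUN = function(i) i[sample.int(length(i))])
}

set.seed(42)
perm <- permute_within(strata)

# Every observation is replaced by one from the same stratum
all(strata[perm] == strata)
```

In adonis the same kind of restriction is requested through its strata argument; it constrains the permutations but does not turn the factor into a random effect.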

With the latest R-Forge version of vegan you can use simulate.rda to mimic
random models in RDA, but there is no way of having the correct kind of
permutations for mixed models in any other vegan function.

Cheers, Jari Oksanen



Re: [R-sig-eco] capscale() for PCoA-CDA

2009-12-03 Thread Jari Oksanen
On 3/12/09 23:54 PM, gabriel singer gabriel.sin...@univie.ac.at wrote:

 Hi everybody,
 
 Anybody has used capscale() in package vegan to compute a PCoA-CDA as
 suggested by Anderson and Willis 2003 (Ecology 84: 511 ff) using one or
 more factors as predictors?
 
 Then I wonder about:
 
 *) How to interpret interactions of factors? Why are interactions
 (specified as ~factor1*factor2 in the function call) shown as
 continuous predictors (using arrows) in the plot function? Wouldn't
 centroids for all cells in the design be more appropriate? Aren't
 factorial interactions in a CDA setting more or less meaningless?

Internally capscale() uses contrasts of variables, and they are treated as
continuous variables and shown as arrows in plots. However, if the
contrasts correspond to simple factors, they are not drawn but their
centroids are shown. For ordered factors you get both centroids and the
arrows. The interactions of contrasts cannot be shown as simple class means
and therefore they are drawn as arrows. The simple centroids are not
appropriate; instead you should have centroids of all combinations of class
levels of the interacting factors.

If you think that factorial interactions in *** (what is CDA?) are
meaningless, why do you want to use them?

I wouldn't say they are meaningless, because that depends on your meaning.
Often they are difficult to interpret, but that's another issue.

 
 *) How to get classification statistics? And how to efficiently run a
 leave 1 out classification analysis? I thought of manually writing
 code that checks for the closest centroid. Would it be appropriate to
 use Euclidean distance as a criterion for this since it happens in PCo
 space? Probably there are more efficient functions which I do not know
 of, yet,... for example a function that allows extraction of distances
 of all objects to all centroids?

There is no such thing. Contributed code will be reviewed for inclusion into
vegan.
 
 *) Is the application of capscale on a Euclidean distance matrix
 equivalent to a classical DFA aka CDA on the original data - or am I
 completely wrong with this idea?

No, it isn't equal to DFA aka CDA. Perhaps... It depends on what DFA and
CDA are. With Euclidean distances, capscale() is equivalent to redundancy
analysis (RDA). Guessing that DFA aka CDA are discriminant analysis, RDA
is not equal to them. The major difference is that RDA uses no information
about the scatter of points with respect to the class centroids, but only
uses the class centroids. RDA tries to maximize the distances among class
centroids, but it doesn't try to maximize the separation of points of
different classes. The methods are very different although the results may
have some similarities.

This is connected to the previous question: because RDA (which is at the
heart of capscale()) does not try to optimize classification, there is no
classification statistic to be optimized. That should be estimated
independently of the analysis and after the analysis, and there are no
functions for the purpose in vegan.
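
Since there is no ready-made function, here is a hedged base-R sketch of the nearest-centroid idea from the question: distances of all objects to all group centroids in PCoA space, plus a simple leave-one-out classification. The `iris` data only stand in for a real data set, `centroids` and `dist_to_centroids` are hypothetical helper names, and the ordination is not recomputed when an object is left out (a simplification).

```r
# Illustration data: iris measurements stand in for real community data
grp <- iris$Species
pts <- cmdscale(dist(iris[, 1:4]), k = 2)   # PCoA (metric scaling) scores

# Group centroids in ordination space
centroids <- function(pts, grp) {
  s <- rowsum(pts, grp)
  s / as.vector(table(grp)[rownames(s)])
}

# Euclidean distances of all objects (rows) to all centroids (columns)
dist_to_centroids <- function(pts, cent) {
  apply(cent, 1, function(ctr) sqrt(rowSums(sweep(pts, 2, ctr)^2)))
}

all_d <- dist_to_centroids(pts, centroids(pts, grp))

# Leave-one-out: centroids are recomputed without object i,
# then object i is assigned to the closest centroid
loo_pred <- vapply(seq_len(nrow(pts)), function(i) {
  cent <- centroids(pts[-i, , drop = FALSE], grp[-i])
  d <- sqrt(rowSums(sweep(cent, 2, pts[i, ])^2))
  names(which.min(d))
}, character(1))

# Proportion of objects assigned to their own group
accuracy <- mean(loo_pred == as.character(grp))
table(observed = grp, predicted = loo_pred)
```

Euclidean distance is used here because the PCoA scores live in a Euclidean space; whether that is the right criterion for a given dissimilarity is a separate question.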
 
 *) Given only one factor as a predictor, I guess using permutest() or
 anova() on an object resulting from capscale is completely equivalent to
 a direct application of adonis()? Correct?

Have you tried this? After trying, you could tell us if this is true. I
wouldn't expect this. The results may not be completely different, but
internally the methods are pretty different, and when I tried with the same
random number seed and hence same permutations, the results were not
identical.
 
Cheers, Jari



Re: [R-sig-eco] how to calculate axis variance in metaMDS, pakage vegan?

2009-12-02 Thread Jari Oksanen

On 2/12/09 19:55 PM, Gian Maria Niccolò Benucci gian.benu...@gmail.com
wrote:

 Okey, I understood...
 I have a matrix of 40 rows (samples) and 29 columns (species). In the
 ordination graph the data divide in two clades ( as i supposed they must
 to)... and that's my best solution for reduce the Stress...
 
 metaMDS(sqrtABCD, distance = "bray", k = 23, trymax = 50,
 autotransform = FALSE) -> NMS.trial

Gian,

This looks very much like a badly degenerate solution. You shouldn't use 23
axes in NMDS, in particular with 40 x 20 source data. In Euclidean space
such data would give you a rank of 20, that is, you could find at most 20
axes in metric scaling. In the Bray-Curtis space the situation is more
complicated, but one random data set (Poisson random variates with
lambda = 3.14) gave 25 positive and 14 negative eigenvalues. Probably the 23
dimensions you specify exhaust the real part of your space even in metric
scaling, and probably (and obviously) fail miserably in nonmetric scaling.
You shouldn't get stress of that magnitude with a decent model with data
like that.

It has never occurred to me that anybody would like to have NMDS with such a
high number of dimensions. Usually we want to use two, sometimes one or two
more, but that's about the limit. Do the same and set k=2 to k=4 at maximum.
If you want to have a mapping of all of your real space (i.e., ignore the
complex space), you can use metric scaling. The standard R choice is
cmdscale(). The vegan alternative is capscale(), which can also do
unconstrained metric scaling, returns information both on the real and
imaginary components of your space, and has plot and other support
functions. The low-level alternative in vegan is wcmdscale(), which is also
used by capscale(), but does not have any support functions (it lacks even
print.wcmdscale!).

NMDS is really intended for nonlinear mapping onto a *low* number of
dimensions.

Cheers, Jari Oksanen

 NMS.trial
 
 Call:
 metaMDS(comm = sqrtABCD, distance = "bray", k = 23, trymax = 100,
 autotransform = F)
 
 Nonmetric Multidimensional Scaling using isoMDS (MASS package)
 
 Data: sqrtABCD
 Distance: bray shortest
 
 Dimensions: 23
 Stress: 0.2548688
 Two convergent solutions found after 8 tries
 Scaling: centring, PC rotation, halfchange scaling
 Species: expanded scores based on 'sqrtABCD'
 
 With more than 23 dimensions R gave me that result:
 
 metaMDS(sqrtABCD, distance = "bray", k = 30, trymax = 50,
 Using step-across dissimilarities:
 Too long or NA distances: 230 out of 780 (29.5%)
 Stepping across 780 dissimilarities...
 Error in isoMDS(dist, k = k, trace = isotrace) :
   initial configuration must be complete
 In addition: Warning messages:
 1: In cmdscale(d, k) : some of the first 30 eigenvalues are < 0
 2: In sqrt(ev) : NaNs produced
 
 
 ...Is it normal that I got a better ordination (separation of different
 samples that I know are different) with few dimensions even if the stress
 is high?
 
 ... I supposed, that If we use as many dimensions as there are variables,
 then we can perfectly reproduce the observed distance matrix. Isn't it? But,
 of course, our goal is to reduce the observed complexity of nature, that is,
 to explain the distance matrix in terms of fewer underlying dimensions...
 So what is best at the end??
 And also which is the function for plotting the stress values versus the
 number of dimensions and how to read the plot?
 I hope I was clear, thank you so much!
 Yours,
 
 G.
 


Re: [R-sig-eco] how to calculate axis variance in metaMDS, pakage vegan?

2009-12-02 Thread Jari Oksanen
On 2/12/09 19:55 PM, Gian Maria Niccolò Benucci gian.benu...@gmail.com
wrote:
 
 ... I supposed, that If we use as many dimensions as there are variables,
 then we can perfectly reproduce the observed distance matrix. Isn't it?
Gian,

Not quite so. I think it would be useful to consult a good book, but
here is some explanation.

NMDS is not a simple reproduction method, but a non-linear
regression problem. For n points and k dimensions we fit a nonlinear
regression with n*k parameters to n*(n-1)/2 observations. It doesn't
require much intuition to see that this is not well defined for k
approaching n, and then the non-linear regression fails. For details, the
non-linear regression function is isoreg() in R, and the model fitting
happens with optim() using method = "BFGS" (Broyden, Fletcher, Goldfarb &
Shanno). All this is not very obvious because it is done within a C function
in the MASS package. NMDS is nonlinear just in order to be able to
produce a good mapping with low values of k: so stick with low values of k.
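
A small worked count for the case in this thread, assuming n = 40 samples as stated in the question. It only does the arithmetic above; the 780 dissimilarities match the "Stepping across 780 dissimilarities" line in the quoted output.

```r
n <- 40                        # number of samples in the question
n_obs <- n * (n - 1) / 2       # dissimilarities available: 780
k_values <- c(2, 4, 23)        # low k versus the k used in the question
n_par <- n * k_values          # coordinates to be fitted for each k

for (i in seq_along(k_values)) {
  cat(sprintf("k = %2d: %3d coordinates fitted to %d dissimilarities\n",
              k_values[i], as.integer(n_par[i]), as.integer(n_obs)))
}
```

With k = 23 there are 920 coordinates for 780 dissimilarities, so more parameters than observations.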

If you want to have a complete mapping of dissimilarities, you should use
metric scaling. Then you typically ignore the later axes. However, even
here the situation is not as clear as you write. If you use Euclidean
distances, then the number of variables gives the number of dimensions of
metric scaling. With Euclidean distances, the complete solution also exactly
reproduces the observed distances. However, with non-Euclidean
dissimilarities (like Bray-Curtis in your case) the situation is more
complicated. Metric scaling and complete mapping are Euclidean, and if your
dissimilarities are non-Euclidean, you have a problem (that you usually
ignore). Firstly, the number of above-zero eigenvalues and corresponding
real eigenvectors is not directly defined by the number of variables.
Secondly, you cannot reproduce the observed dissimilarities from real
eigenvectors because that reproduction is Euclidean and your measure was
non-Euclidean. For exact reproduction, you should subtract the distances in
imaginary space (negative eigenvalues) from distances in the real space
(positive eigenvalues). We actually do it exactly like this in the
betadisper() function in vegan, and for this reason the wcmdscale() function
of vegan also returns information on complex eigenvectors and negative
eigenvalues.
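
The reproduction argument can be sketched in base R. This is only an illustration under stated assumptions: the community matrix is simulated, `bray`, `pcoa`, `sqdist` and `rebuild` are hypothetical helpers written for this example, and the code is not what betadisper() or wcmdscale() run internally. Squared distances along the axes with negative eigenvalues are subtracted from those along the axes with positive eigenvalues.

```r
set.seed(1)
# Hypothetical community matrix: 40 samples x 20 species, Poisson counts
comm <- matrix(rpois(40 * 20, lambda = 3.14), nrow = 40)

# Bray-Curtis dissimilarity written out in base R
bray <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
  }
  as.dist(d)
}

# Metric scaling by hand: eigen-decomposition of the double-centred matrix
pcoa <- function(d) {
  m <- as.matrix(d)
  n <- nrow(m)
  J <- diag(n) - 1 / n                      # centring matrix
  e <- eigen(-0.5 * J %*% m^2 %*% J, symmetric = TRUE)
  tol <- 1e-8 * max(abs(e$values))          # treat tiny eigenvalues as zero
  list(values = e$values, vectors = e$vectors,
       pos = e$values > tol, neg = e$values < -tol)
}

# Squared inter-point distances along a chosen set of axes
sqdist <- function(p, keep) {
  if (!any(keep)) return(0)
  y <- sweep(p$vectors[, keep, drop = FALSE], 2,
             sqrt(abs(p$values[keep])), "*")
  as.vector(dist(y))^2
}

# Rebuild dissimilarities: real part minus (optionally) imaginary part
rebuild <- function(p, use_imaginary = TRUE) {
  imag <- if (use_imaginary) sqdist(p, p$neg) else 0
  sqrt(pmax(sqdist(p, p$pos) - imag, 0))
}

d_euc  <- dist(comm)
d_bray <- bray(comm)
p_euc  <- pcoa(d_euc)
p_bray <- pcoa(d_bray)

# Numbers of positive and negative eigenvalues for both measures
rbind(positive = c(euclidean = sum(p_euc$pos), bray = sum(p_bray$pos)),
      negative = c(euclidean = sum(p_euc$neg), bray = sum(p_bray$neg)))

# Largest reconstruction errors
err_euc  <- max(abs(rebuild(p_euc) - as.vector(d_euc)))
err_real <- max(abs(rebuild(p_bray, use_imaginary = FALSE) - as.vector(d_bray)))
err_full <- max(abs(rebuild(p_bray) - as.vector(d_bray)))
c(euclidean = err_euc, bray_real_only = err_real, bray_full = err_full)
```

Using only the real axes can never give distances shorter than the observed ones, because the imaginary part is what gets subtracted.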

For your other post that came when I wrote this: stress 11.6 is really fine.
I think that if you get stress down to 5% (0.05) or less, then there is
something fishy in your data or in your model specification, like
overfitting. 

Cheers, Jari Oksanen

 But,
 of course, our goal is to reduce the observed complexity of nature, that is,
 to explain the distance matrix in terms of fewer underlying dimensions...
 So what is best at the end??
 And also which is the function for plotting the stress values versus the
 number of dimensions and how to read the plot?
 I hope I was clear, thank you so much!
 Yours,
 
 G.
 


Re: [R-sig-eco] Rotations for PCoA?

2009-10-20 Thread Jari Oksanen



On 20/10/09 13:19 PM, Etienne Laliberté etiennelalibe...@gmail.com
wrote:

 Date: Mon, 19 Oct 2009 13:44:57 +0200
 From: Jan Hanspach jan.hansp...@ufz.de
 Subject: [R-sig-eco] Rotations for PCoA?
 To: r-sig-ecology@r-project.org
 Message-ID: 4adc5139.9000...@ufz.de
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 
 
 
 Now, I want to know if  it is possible to get rotations for my traits,
 like it is calculated for a PCA? So that I can plot my traits within the
 ordinations space (or give the values in a table).
 Thanks!
 Jan
 
 Pierre Legendre recently emailed me one function which does exactly
 that. I haven't found it on his website so I assume it's not
 official yet. In any case, I'm sure he wouldn't mind me sharing it
 and it can certainly give you some good ideas on how to represent the
 original variables in a PCoA biplot. Here's the code. Enjoy.
 
 
Howdy Folks,

Actually, it is trivial to have rotation scores for continuous variables:
it is nothing but matrix multiplication. The problem here is that the
original analysis used Gower distance for mixed metrics, and we should map
the factor variables onto continuous variables in the same way as the Gower
distance does. After that it is easy to find the rotation scores (but
see below on metrics). While this can be done, this probably is not wanted
since the interpretation of rotation for factors is non-intuitive, to put
it mildly. Gower distance handles factors in a special way, and they are not
the simple factor contrasts you get in standard R functions. How they are
actually handled can be seen in the Fortran code for daisy or in Gower's
paper. An extra complication is that Gower distance uses the Manhattan
metric. Therefore it is not possible to just transform the data matrix into
a form that would give the rotation scores in Euclidean PCoA.

The standard way (that is not used in Gower distance) to transform factors
into continuous data is to use

 mm <- model.matrix(~ ., mydatawithfactors)

which gives you a model matrix where factors are broken into continuous
contrast variables. You can get rotation, biplot scores or whatever you
call them for these contrasts, but that is only the beginning of the
problems -- what are you going to do with those scores?
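
A hedged sketch of the "nothing but matrix multiplication" remark for the purely Euclidean case (so not for Gower distance). The `iris` data frame stands in for `mydatawithfactors`; its factor is expanded with model.matrix(), a Euclidean PCoA is run with cmdscale(), and the rotation is recovered by projecting the centred variables onto the axes.

```r
# Expand the factor into contrast columns and drop the intercept column
mm <- model.matrix(~ ., iris)[, -1]
xc <- scale(mm, center = TRUE, scale = FALSE)   # column-centred data

# Euclidean PCoA (metric scaling) on the expanded data
fit <- cmdscale(dist(xc), k = 2, eig = TRUE)

# Rotation by matrix multiplication: t(xc) %*% scores, rescaled by eigenvalues
rot <- crossprod(xc, fit$points) %*% diag(1 / fit$eig[1:2])

# Agrees with the PCA rotation up to the sign of each axis
pca_rot <- prcomp(mm)$rotation[, 1:2]
max(abs(abs(rot) - abs(pca_rot)))
```

The rows of `rot` for the Species contrasts are exactly the scores whose interpretation is questioned above.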

Perhaps you can try the vegan function envfit, which finds the rotation
scores for continuous variables and class centroids for factor variables.
They are not the same as the strict rotation scores for factors in the Gower
metric (but should be the same as the Legendre scores for continuous
variables), but may be more intuitive.

Cheers, jari oksanen



Re: [R-sig-eco] using two distance metrices in formula

2009-10-13 Thread Jari Oksanen


On 13/10/2009 17:51, Sarah Goslee sarah.gos...@gmail.com wrote:

 Jens,
 
 You can make your imported dissimilarities into a dist object quite easily.
 
 Say D is your imported dissimilarity data. It needs to be in lower-triangular
 format (assuming you imported it as a symmetric square matrix).
 
 D <- as.matrix(D)
 D <- D[col(D) < row(D)]
 attr(D, "Size") <- N # number of samples (# of rows and cols)
 attr(D, "Labels") <- 1:N # or names of samples, of length N
 attr(D, "Diag") <- FALSE
 attr(D, "Upper") <- FALSE
 attr(D, "method") <- "imported" # name of index, or NULL
 class(D) <- "dist"
 
There is a still easier way:

D <- as.dist(D), or
D <- as.dist(as.matrix(D))

which actually does exactly those things Sarah lists above.
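
A quick check of that equivalence on a made-up 3 x 3 matrix (the values and the sample names s1 to s3 are hypothetical):

```r
# Hypothetical imported dissimilarities as a symmetric square matrix
m <- matrix(c(0.0, 0.3, 0.5,
              0.3, 0.0, 0.8,
              0.5, 0.8, 0.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("s1", "s2", "s3"), c("s1", "s2", "s3")))
N <- nrow(m)

# Manual construction, step by step
D1 <- m[col(m) < row(m)]          # lower triangle, column by column
attr(D1, "Size") <- N
attr(D1, "Labels") <- rownames(m)
attr(D1, "Diag") <- FALSE
attr(D1, "Upper") <- FALSE
class(D1) <- "dist"

# The one-liner
D2 <- as.dist(m)

# Same values in the same order
all.equal(as.vector(D1), as.vector(D2))
```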


 On Tue, Oct 13, 2009 at 10:20 AM, Jens Oldeland oldel...@gmx.de wrote:
 Dear R-sig-ecology group,
 
 is there a way to use two self-made dissimilarity matrices for the left-hand
 side (LHS) and right hand side (RHS) in vegan functions such as capscale or
 adonis?
 I created those matrices in a GIS using 3D-information therefore I don't
 want to simply use distances between 2D-coordinates. But as I found out,
 there is only the possibility to use a dist object for LHS.
 
 Can anyone suggest a solution or had similar problems?

However, vegan functions such as capscale or adonis do not accept
dissimilarities on both sides of the formula. The RHS always must be
rectangular data. There are some functions that take dist objects on both
sides, but they are less powerful, and in general, I wouldn't recommend
their use except in special situations (although they are available in
vegan). 

It is difficult to suggest a solution when we do not know what you want to
solve. Why would you want to have dissimilarities on both sides of the
formula? You can do that with, say, Mantel tests available in several
packages in R, but what are the things you want to solve by using
dissimilarities?

Cheers, Jari Oksanen



Re: [R-sig-eco] using two distance metrices in formula

2009-10-13 Thread Jari Oksanen



On 13/10/09 18:44 PM, Jens Oldeland oldel...@gmx.de wrote:

 Hi again,
 
 our distance matrices are 1) genetic distance (Jaccard) and 2)
 3D-Euclidean Distance and the question we want to solve is if there is
 an effect called Isolation by Distance (IBD) in our data (genetic and
 real-distances of snails on the island of Crete) or not. There was a
 debate on the topic if the mantel test or the partial mantel test (isn't
 this similar to MRM?) in several papers mainly in evolution-journals:
 
 Raufaste, N. and F. Rousset. 2001. Are partial Mantel tests adequate?
 Evolution 55:1703-1705.
 Castellano, S. and E. Balletto. 2002. Is the partial Mantel test
 inadequate? Evolution 56:1871-1873.
 
 
 Geffen, E., Anderson, M.J., & Wayne, R.K. (2004). Climate and habitat
 barriers to dispersal in the highly mobile grey wolf. Molecular Ecology,
 13, 2481-2490
 explain it nicely on p. 2483 (LHS):
 
 "The problem arises due to the lack of independence of individual
 distances in a distance matrix. Although a simple Mantel test overcomes
 this issue by the use of permutations, a permutational approach does not
 necessarily solve problems introduced by several uncontrolled nuisance
 parameters in the case of more than one regressor (i.e. partial tests).
 Thus, we do not use a Mantel approach here, but rather use the
 distance-based multivariate approach of McArdle & Anderson (2001). The
 important point is that, for dbRDA, the individual distances are not
 treated as a single univariate response variable, as in the Mantel test,
 but rather the individual sites are the units of observation for analysis,
 about which we have calculated distances using an entire set of genetic
 variables. The distance matrix is therefore treated as information
 regarding multivariate response. Taking this multivariate approach avoids
 the problems associated with the partial Mantel test."

Jens,

There has been a very similar discussion in Ecology recently between my
good friends, Hanna Tuomisto & co vs. Pierre Legendre & co. However, the
point here and above was exactly that you cannot use dissimilarities on the
RHS (lack of independence), but you must use rectangular data in dbRDA. If
you use distances on the RHS you won't have dbRDA but you get Mantel family
methods (like MRM in ecodist). The problem, of course, is how to map
distances onto Euclidean space (= rectangular data) *and* still study the
effects of the distances instead of the effects of *location*. I don't know
any really good solution here, and all proposed solutions have their
problems. Pierre Legendre, Daniel Borcard and Hanna Tuomisto have all tried
to convince me of their point of view, and while all their conflicting
arguments make sense, none of them is yet an optimal solution.

Cheers, Jari Oksanen
 
 so we thought it would be a good idea not to use mantel and friends
 since the problem of IBD seems to need a different approach here.
 
 best,
 Jens
 
 
 
 
 
 Sarah Goslee schrieb:
 That doesn't make much sense to me. You'd need an entirely different method
 than capscale.
 
 Perhaps what you're looking for is more like multiple regression on distance
 matrices (implemented in MRM in ecodist)?
 
  Lichstein, J. 2007. Multiple regression on distance matrices: A
  multivariate spatial analysis tool. Plant Ecology 188: 117-131.
 
  Legendre, P.; Lapointe, F. and Casgrain, P. 1994. Modeling brain
  evolution from behavior: A permutational regression approach.
  Evolution 48: 1487-1499.
 
 Sarah
 
 On Tue, Oct 13, 2009 at 11:13 AM, Jens Oldeland oldel...@gmx.de wrote:
   
 Dear Sarah dear Jari,
 
 many thanks for your explanations. However, it wasn't what I thought about,
 sorry I definitely have to be more specific about the problem.
 
 Okay I try be more precise:
 
 the problem was that for example  capscale accepts   capscale(dist.matrix.1
 ~ N + P + K *Ag, data=varechem)
 but I need  capscale(dist.matrix.1 ~ dist.matrix.2, data=dist.matrix.2)
  so the trick was not on how to create a distance matrix but how to use a
 second on in a formula.
 
 We are trying an analysis similar to what the distlm program by Marti
 Anderson does, however we had a problem with that and wanted to try the
 analysis in R.
 
 thanks already for all your comments !
 
 best
 Jens
 
 
 
 
   
 



Re: [R-sig-eco] trouble with BiodiversityR:diversityresult function

2009-10-12 Thread Jari Oksanen
Roman,

The user interface of the vegan function 'specpool' changed in the latest
release, and BiodiversityR didn't follow along. The index name should now be
"jack1" instead of "Jack.1". This requires changes in a couple of places in
BiodiversityR, but I don't know its source code so well that I could easily
spot those changes (probably in BiodiversityRGUI, diversityresult,
diversityresult0). I have cc'ed this to Roeland Kindt, but I think his
Internet connection is not always so good, so it may take some time
before he answers. I can look into whether I can hack around this in vegan.

Cheers, Jari Oksanen


On 12/10/09 19:54 PM, Roman Luštrik roman.lust...@gmail.com wrote:

 Hello list,
 
 I have a feeling this is a generic error but nonetheless, it has been
 bugging me extensively. If anyone can chip in, I would most appreciate it.
 
 I'm trying to calculate expected species richness, following Kindt & Coe, Tree
 diversity analysis, from my dataset but I get the following error:
 
 divjack <- diversityresult(PoCom, index="Jack.1")
 Error in round(result, digits = digits) :
   Non-numeric argument to mathematical function
 
 PoCom is a community data.frame with rows as locations and columns as
 species (as in test dataset called dune in the book mentioned earlier).
 Here is a head(PoCom) and first few lines of str(PoCom):
 
 head(PoCom)
     Sal_idae Pil_mili Syl_idae Ner_sp. Ner_rava Sab_spin Neo_pseu
 HM1       12       14        5      13        1        1       28
 HM2      123        5      113      28       44       17        0
 HM3       13       55        8      40        0        2        7
 HM4      141       23       56      91       74        6        0
 HM5        7        0       20      18       11      114        0
 HM6       13       14        0       7        0       39        8
 
 str(PoCom)
 'data.frame':   10 obs. of  83 variables:
  $ Sal_idae : int  12 123 13 141 7 13 289 318 99 124
  $ Pil_mili : int  14 5 55 23 0 14 15 23 86 138
  $ Syl_idae : int  5 113 8 56 20 0 45 122 0 2
  $ Ner_sp.  : int  13 28 40 91 18 7 42 41 24 0
  $ Ner_rava : int  1 44 0 74 11 0 16 34 0 35
  $ Sab_spin : int  1 17 2 6 114 39 0 8 2 0
  $ Neo_pseu : int  28 0 7 0 0 8 3 0 12 99
 
 Debugging stops here:
 
 Browse[1]> n
 debug: if (method != "jackknife") {
     result2 <- round(result, digits = digits)
     result2 <- data.frame(result2)
     colnames(result2) <- index
 }
 Browse[1]> n
 debug: result2 <- round(result, digits = digits)
 Browse[1]> n
 Error in round(result, digits = digits) :
   Non-numeric argument to mathematical function
 


Re: [R-sig-eco] trouble with BiodiversityR:diversityresult function

2009-10-12 Thread Jari Oksanen



On 12/10/09 19:54 PM, Roman Luštrik roman.lust...@gmail.com wrote:

 Hello list,
 
 I have a feeling this is a generic error but nonetheless, it has been
 bugging me extensively. If anyone can chip in, I would most appreciate it.
 
 I'm trying to calculate expected species richness, following Kindt & Coe, Tree
 diversity analysis, from my dataset but I get the following error:
 
 divjack <- diversityresult(PoCom, index="Jack.1")
 Error in round(result, digits = digits) :
   Non-numeric argument to mathematical function

Roman,

I am not quite sure how BiodiversityR displays the results (I do not use it
regularly). However, when I look at its documentation, it looks to me like
it would only call specpool() of vegan and format the result. So if you go
to the script window in BiodiversityR and type

 specpool(PoCom)

and click submit, you should get the result.
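
For orientation, the first-order jackknife can also be computed by hand in base R. This sketch assumes the usual formula, observed richness plus f1 * (N - 1) / N, where f1 is the number of species found in exactly one of the N sites. The tiny community table is made up, and the function name `jack1` is only borrowed from the index name mentioned in this thread.

```r
# Hypothetical community table: 4 sites (rows) x 5 species (columns)
comm <- matrix(c(1, 0, 2, 0, 0,
                 3, 0, 0, 0, 1,
                 0, 0, 1, 0, 0,
                 2, 4, 0, 0, 0),
               nrow = 4, byrow = TRUE)

jack1 <- function(x) {
  freq  <- colSums(x > 0)   # number of sites where each species occurs
  n     <- nrow(x)          # number of sites
  s_obs <- sum(freq > 0)    # observed species richness
  f1    <- sum(freq == 1)   # species found in exactly one site
  s_obs + f1 * (n - 1) / n
}

jack1(comm)   # 4 observed species + 2 * 3/4 = 5.5
```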

Cheers, Jari



Re: [R-sig-eco] Mantel test with skew-symmetric matrices?

2009-10-01 Thread Jari Oksanen



On 1/10/09 20:36 PM, Sarah Goslee sarah.gos...@gmail.com wrote:

 I can only speak for the mantel() within ecodist, but I can tell you that it
 will not take full matrices - the upper triangle will be dropped. You could
 roll your own very easily, but it would be exceedingly slow, eg:
 
 # mat1: some square skew-symmetric matrix
 # mat2: some other square skew-symmetric matrix
 
 mantelr <- cor(as.vector(mat1), as.vector(mat2))
 nperm <- 100 # bigger for real problem, of course
 permresults <- numeric(nperm)
 permresults[1] <- mantelr
 
 for(thisperm in 2:nperm) {
   # permute rows and columns of mat1 together
   randsample <- sample(1:nrow(mat1))
   permresults[thisperm] <- cor(as.vector(mat1[randsample, randsample]),
                                as.vector(mat2))
 }
 and then use permresults to estimate your test statistic of interest.
 
Sarah & Steve,

This was the design I had in mind. However, I was not sure how
skew-symmetry actually was defined, and therefore I didn't know if free
permutation of rows and columns (even when done correctly like above) will
retain the skew-symmetry. The free permutation would be OK for non-symmetric
matrices, but what about skew-symmetric? (A little thinking with pen & paper
probably would give a quick answer, but I won't do that for a while).
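
A quick numerical check of that question, assuming skew-symmetry means t(A) == -A. In this sketch (simulated matrix, hypothetical variable names) permuting rows and columns together keeps the property, because elements (i, j) and (j, i) always move as a pair.

```r
set.seed(1)
n <- 6
a <- matrix(rnorm(n * n), n, n)
skew <- a - t(a)            # skew-symmetric by construction

p <- sample(n)              # one free permutation of the objects
perm <- skew[p, p]          # rows and columns permuted together

# Still skew-symmetric after permutation, with a zero diagonal
all(t(perm) == -perm)
all(diag(perm) == 0)
```

This says nothing about whether the resulting permutation test is statistically appropriate; it only shows that the permuted matrices stay in the same class.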

Cheers, Jari

 I haven't thought at all about any statistical issues raised by use of full
 non-symmetric matrices - you're on your own there. It's certainly *possible*,
 and I don't see any immediate reason why it would be wrong, but haven't
 pondered the issue.
 
 I see Jari replied as well while I was writing - as for vegan, the ecodist
 function would need to be heavily modified to do this. If I'm persuaded that
 there's enough interest, I could add it to the list.
 
 Sarah
 
 On Thu, Oct 1, 2009 at 1:20 PM, Steve Arnott arno...@dnr.sc.gov wrote:
 Hello All,
 
 1) Is it wrong to use skew-symmetric matrices - i.e. should I just forget
 about the skew data and use absolute values to make all my matrices
 symmetric? The original Mantel paper (1967, Cancer Research, 27: 209-220) does
 talk about skew-symmetric matrices, but the published applications I've come
 across only seem to use symmetric matrices.
 
  2) If it is ok to use skew-symmetric matrices, do the mantel() and
 mantel.partial() functions in 'vegan' (or related functions in other
 packages, such as 'ecodist') handle them correctly? I've found that it is
 possible to process skew data and generate results using these functions, but
 I'm uncertain from the documentation whether the results are meaningful (i.e.
 is the coding designed to handle such cases appropriately?)

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Fixing parameters in GLM and Mixed Effects Models

2009-09-24 Thread Jari Oksanen



On 24/09/09 22:34 PM, Hamish hwil...@ucsd.edu wrote:

 
 I want to force a =1 and b= -1, yielding:
 
 
 N ~ 1*Mass + (-1)*Temp + c*Precipitation + {d,e,f,g}*DIET
 
I think you want to have

N ~ offset(Mass) + offset(-Temp) + Precipitation + DIET

Most *lm functions know offset. However, you must be very careful, especially
with links other than identity, because the offset is added on the scale of
the linear predictor. Moreover, you must be sure that this really is what
you're looking for: the above expression is equivalent to offset(Mass-Temp),
and is Mass-Temp really something you want to fix?
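
A small sketch of the equivalence with simulated data. All variable values and the three diet levels are invented for illustration, and lm() with an identity link is assumed; with another link the offset would act on the link scale.

```r
set.seed(42)
n <- 100

# Simulated data: names follow the model in the question, values are made up
dat <- data.frame(
  Mass = rnorm(n),
  Temp = rnorm(n),
  Precipitation = rnorm(n),
  DIET = factor(sample(c("carnivore", "herbivore", "omnivore"), n, replace = TRUE))
)
dat$N <- with(dat, Mass - Temp + 0.5 * Precipitation + rnorm(n, sd = 0.3))

# Coefficients of Mass and Temp fixed to 1 and -1 with two offsets
fit_two <- lm(N ~ offset(Mass) + offset(-Temp) + Precipitation + DIET, data = dat)

# The same model written with a single offset
fit_one <- lm(N ~ offset(Mass - Temp) + Precipitation + DIET, data = dat)

# Only the free parameters are estimated, and both forms agree
coef(fit_two)
all.equal(coef(fit_two), coef(fit_one))
```

No coefficients are reported for Mass or Temp because they are not estimated.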
  
 
 and then see what the values of the other parameters (c-g) turn out to be,
 as well as the model fit when using the forced parameters in conjunction
 with the free parameters.
 
Cheers, Jari Oksanen

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] wascores() for metaMDS?

2009-08-19 Thread Jari Oksanen
Gabriel,


On 19/08/09 12:40 PM, gabriel singer gabriel.sin...@univie.ac.at wrote:

 Hi sig-ecology!
 
 Here comes a probably stupid question... I am looking for smart ways to
 include information about underlying variables in MDS plots. In other
 words, after having computed an ordination with isoMDS or metaMDS from a
 community table, I would like to add something like species
 coefficients/loadings as vectors to the plot of sites. As no species
 coefficients exist in this case, the best I could come up with so far is
 simply vectors calculated from correlation coefficients of the
 individual species with the site scores (on two MDS axes).
 The function metaMDS can compute species scores using the
 function wascores(). I have now pondered for 2 days how these scores
 are calculated and what their precise meaning would be. Would these
 species scores be appropriate to show as vectors in the MDS?
 Thanks for any answer...
 
I think these were documented... Please point out the unclear parts of the
documentation so that I can correct those.

The wascores are Weighted Averages Scores and they are calculated like
weighted averages, or similarly as species scores in correspondence
analysis. This means that (with some scaling) they show the centroid
(barycentre) of the species occurrence in the ordination graph. It is not
appropriate to present these as arrows which indicate a linear increase to
the direction of the arrow instead of the centre of abundance. Therefore the
species scores can (and as default in metaMDS, will) be presented as points.
If -- for any reason that is none of my business -- you want to get vectors
of species, you can fit species as vectors. This happens with metaMDS or
isoMDS like this:

library(vegan)
data(dune)
m <- metaMDS(dune)
# or, with library(MASS) loaded: m <- isoMDS(vegdist(dune))
vec <- envfit(m, dune)
plot(m, display = "sites")
# or with isoMDS: ordiplot(m)
plot(vec)

I promised that I won't comment on this, but still I must say that I cannot
find a reason to do so.

Please note that you can also use ordisurf to fit nonlinear species
responses if you think that species are not points nor arrows. Vegan
tutorial (from the Web) gives an example.
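
To make the "weighted averages" statement concrete, here is a minimal sketch that compares wascores() with a hand calculation. It is an illustration only: the site scores come from cmdscale() instead of metaMDS() so that the result does not depend on random starts, and no expansion of the scores is used.

```r
library(vegan)
data(dune)

# Any two-dimensional site configuration will do for the illustration
site_scores <- cmdscale(vegdist(dune), k = 2)

# Species scores as weighted averages of site scores (no expansion)
wa <- wascores(site_scores, dune, expand = FALSE)

# The same by hand: for each species, the abundance-weighted mean
# of the site scores, i.e. the centroid of its occurrences
manual <- t(sapply(dune, function(abund) {
  colSums(site_scores * abund) / sum(abund)
}))

all.equal(unclass(wa), manual, check.attributes = FALSE)
```

Each row is a point (the centre of abundance of one species), which is why points rather than arrows are the natural display.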

Cheers, Jari Oksanen

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Neighbor sampling is random?

2009-07-04 Thread Jari Oksanen
E. C. Pielou may be one of the first who wrote about this. Check her book 
'Mathematical ecology' (J. Wiley, 1977). 

Best wishes, Jari Oksanen
-Original Message-
From: Paulo Inácio de Knegt López de Prado
Sent:  04.07.2009, 17:35 
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] Neighbor sampling is random?

Dear r-eco-list users,

This is not an R-question, but a statistical one, but maybe somebody can help.

I had read that setting random points over an area and picking the nearest
plant does not give a random sample, but I could not find this article again.

Is that correct? Could you provide some basic reference?

Thanks a lot

Paulo

--
Paulo Inácio de Knegt López de Prado
Depto. de Ecologia - Instituto de Biociências - USP
Rua do Matão, travessa 14, nº 321
Cid. Universitária, São Paulo - SP
CEP 05508-900
11-30917599 (sala)
11-30917600 (Secretaria)

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Vegan DCCA axis length

2009-01-13 Thread Jari Oksanen


On 13/01/2009, at 15:30 PM, David Giordano Armanini wrote:


Hi everyone,
happy new year.
I am new to the list, so I apologise for any incorrect use.
I am using the vegan package version 1.16.9 and I am encountering
problems in obtaining the length of the axes of a DCCA/pCCA.
I have read previous threads in which Gavin Simpson and Jari Oksanen
state that there are no routines available in vegan to retrieve this
information.
I am interested in the length of the DCCA/pCCA axes as I would like to
perform an analysis similar to the one Lancaster et al. 1996 performed
to measure community persistence in acidified streams (Freshwater Biology
1996 36, 179–201) and to the similar one performed by Woodward
(Freshwater Biology 2002 47 1419-1435).
Thus, my question is how I can retrieve the standard deviations of
species turnover in a DCCA/pCCA. I cannot use a simple DCA as I need
to use both conditions (Site) and constraints (years).
Thanks in advance,


I have been involved in this discussion earlier, but here are some points:
1. You cannot perform DCCA in vegan, and this is not easily changed.
As far as I know, there are no other packages to perform this. This is
due to deliberate design decisions in the Fortran code for
decorana(), and a re-design would be needed.
2. You can perform pCCA in vegan, and there you have the so-called
Hill scaling that should approximate the sd scaling. See docs for
scores.cca for the scaling.
3. I don't think that the determination of the length of the axes
makes much sense.
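
A rough sketch of point 2, for illustration only. It assumes a recent vegan where scores() for cca results has a hill argument (older releases selected Hill scaling in a different way, see the scores.cca docs), it uses the dune data instead of your own, and the choice of scaling = "sites" is an assumption that you should check against the documentation.

```r
library(vegan)
data(dune)
data(dune.env)

# Partial CCA: Management as constraint, Moisture partialled out
m <- cca(dune ~ Management + Condition(Moisture), data = dune.env)

# Site scores with Hill scaling on the first two constrained axes
site_sc <- scores(m, display = "sites", choices = 1:2,
                  scaling = "sites", hill = TRUE)

# Range of site scores along each axis
axis_range <- apply(site_sc, 2, function(s) diff(range(s)))
axis_range
```

These ranges only approximate the sd units of decorana(), and point 3 above still applies.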


Best wishes, Jari Oksanen
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Intepreting a plot from a constrained NMDS

2008-12-07 Thread Jari Oksanen


On 8 Dec 2008, at 2:53, stephen sefick wrote:


A way that may work is use the axis scores as y in a regression- this
seems like it would let you interpret how the communities are
distributed in species space.  Just a thought



NEVER do this! The rotation is not determined in NMDS. If you rotate
the NMDS solution you will always have the very same solution, but the
correlations with the axis scores will change. The whole point of
having envfit/vectorfit in vegan is that you don't need to calculate
the correlations with the axes.


The axis rotation is determined in eigenvector ordination, but that
does not mean that the direction is meaningful. You can see this with
envfit/vectorfit in unconstrained eigenvector ordination, but also in
constrained ordination: very rarely are the fitted vectors or biplot
arrows parallel to the axes. Only if the arrows are parallel to the axes
(and go along the axes) would it be meaningful to look at the
relationship of an axis and something else.


And indeed, there is no constrained NMDS in vegan. It is unconstrained  
with an interpretation through vector fitting.
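
A small sketch of the rotation argument, for illustration only. It uses the varespec and varechem data from vegan, isoMDS() from MASS because it gives the same result on every run, an arbitrary rotation of 30 degrees, and two arbitrarily chosen variables (Al and pH).

```r
library(vegan)
library(MASS)
data(varespec)
data(varechem)

# Two-dimensional NMDS configuration
X <- isoMDS(vegdist(varespec), k = 2, trace = FALSE)$points

# Rotate the configuration by an arbitrary angle
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
Xrot <- X %*% R

# Distances among sites are unchanged: it is the same solution
same_config <- all.equal(c(dist(X)), c(dist(Xrot)))

# Correlation of a variable with the first axis depends on the rotation
cor_before <- cor(varechem$Al, X[, 1])
cor_after <- cor(varechem$Al, Xrot[, 1])

# The fitted vectors have the same strength in both configurations
env <- varechem[, c("Al", "pH")]
r2_before <- envfit(X, env, permutations = 0)$vectors$r
r2_after <- envfit(Xrot, env, permutations = 0)$vectors$r

same_config
c(cor_before, cor_after)
all.equal(unname(r2_before), unname(r2_after))
```

Only the direction of the fitted arrows follows the rotation; the fit itself does not depend on where the axes happen to lie.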


cheers, jari oksanen

Stephen Sefick

On Sat, Dec 6, 2008 at 11:58 AM, Manuel Spínola  
[EMAIL PROTECTED] wrote:

Dear list members,

Is there any reference or document on how to interpret a constrained
non-metrical multidimensional scaling using ecological data?  By
constrained I mean after fitting environmental covariables, using, for
example, the envfit function in the vegan package.  Is it possible to
interpret the resulting plot in the same way that a constrained ordination,
for example CCA?
Thank you very much in advance.
Best,

Manuel

--
Manuel Spínola, Ph.D.
Instituto Internacional en Conservación y Manejo de Vida Silvestre
Universidad Nacional
Apartado 1350-3000
Heredia
COSTA RICA
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Teléfono: (506) 2277-3598
Fax: (506) 2237-7036






--
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




Re: [R-sig-eco] Question on height for hclust function

2008-11-17 Thread Jari Oksanen


On 17 Nov 2008, at 21:40, Leigh Fall wrote:

I've run a cluster analysis with Jaccard distance and Ward's method.  The
clustering height (located on the left of the dendrogram) is not scaled to
the distance function because the height values range from 0 to 3.5 in
increments of 0.5.  In Oksanen's Vegan tutorial, the examples of the cluster
dendrograms show height values that appear to be the Bray distance values
(with various linkage methods) because the height values are less than 1.

I'm not sure what my height values reflect.  Is the clustering height
associated with Ward's?  Can the height values be rescaled to the Jaccard
values?

The height values at the vertical axis depend on the clustering  
method: they are the fusion levels your particular method uses. For  
single linkage these are the shortest distances between clusters, for  
complete linkage they are the maximum distances among clusters  
(cluster diameters after fusion) etc. For Ward's method they are the  
values of Ward's criterion. Now you only need to check how Ward's  
criterion is defined...


Bray and Jaccard are in similar range, and the choice between these  
indices has a negligible effect on the scales. The choice of  
clustering method has a huge impact.
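
A short sketch to see this with your own eyes. It is an illustration only: the dune data from vegan is used, and Ward's method is called "ward.D" as in current versions of R (it was plain "ward" in older ones).

```r
library(vegan)
data(dune)

d <- vegdist(dune, method = "jaccard")

h_single <- hclust(d, method = "single")
h_complete <- hclust(d, method = "complete")
h_ward <- hclust(d, method = "ward.D")

# Single linkage: every fusion level is one of the observed distances
single_in_d <- all(sapply(h_single$height,
                          function(h) any(abs(d - h) < 1e-12)))

# Complete linkage: the last fusion is at the largest observed distance
complete_top <- all.equal(max(h_complete$height), max(d))

# Ward: fusion levels are values of the clustering criterion,
# not distances, so they need not stay within the range of d
range(d)
range(h_ward$height)

single_in_d
complete_top
```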


cheers, Jari Oksanen

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] output from ordisurf

2008-10-27 Thread Jari Oksanen
On Thu, 2008-10-23 at 14:44 -0500, Christopher Chizinski wrote:
 Is there a way to output the X,Y,Z values for the contour lines from 
 ordisurf?  
 
 For example, I am using this statement:
 tmp <- with(enviro, ordisurf(fish.nmds, enviro$Temp, choices=c(1,2), add=TRUE))
 
 names(tmp)
  [1] "coefficients"      "residuals"         "fitted.values"
  [4] "family"            "linear.predictors" "deviance"
  [7] "null.deviance"     "iter"              "weights"
 [10] "prior.weights"     "df.null"           "y"
 [13] "converged"         "sig2"              "edf"
 [16] "hat"               "boundary"          "sp"
 [19] "nsdf"              "Ve"                "Vp"
 [22] "mgcv.conv"         "gcv.ubre"          "aic"
 [25] "rank"              "gcv.ubre.dev"      "method"
 [28] "smooth"            "formula"           "cmX"
 [31] "model"             "control"           "terms"
 [34] "pterms"            "assign"            "offset"
 [37] "data"              "df.residual"       "min.edf"
 [40] "call"
 
 I would like to get these XYZ values or find a way to calculate them 
 from the output for inclusion into SigmaPlot.  It is probably an easy 
 solution but I could not find what I needed.
 
Dear Christopher Chizinski,

I assume you mean the ordisurf() function in the vegan package. If you
mean something else, this message is not appropriate and you may stop
reading. 

Function ordisurf (of vegan) visibly returns an object of function gam
of the mgcv package (as documented). You can use all mgcv:::gam tools
to handle this object, including predict.gam (of mgcv) which can be used
to predict new values of the surface (z) for any values of the coordinates
(x, y). These results can be exported to SigmaPlot or anywhere. If you
look inside the function ordisurf, you can see how predict.gam is used
to find the predicted value of the surface for a grid of axis values.

I changed the ordisurf function in
http://r-forge.r-project.org/projects/vegan so that it now adds the data
used by the contour() function in a new item called "grid" in the
result. The returned data in "grid" contain vectors "x" and "y" for the
grid values on the axes, and matrix "z" for the fitted surface values
for each combination of the "x" and "y" vectors. The values of "z" outside
the convex hull of observed sites are marked as NA. You can directly
export these items or reuse them to draw the surface with the contour()
command. The "grid" item is available from vegan working version 1.16-3
(revision 532) in R-Forge, and a Windows binary is available (probably)
from tomorrow morning (Central European Standard Time).
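
A minimal sketch of exporting the grid, for illustration only. It assumes a vegan version that has the "grid" item and the plot argument of ordisurf, it uses the varespec and varechem data instead of your fish data, and the file name in the last comment is invented.

```r
library(vegan)
data(varespec)
data(varechem)

set.seed(1)
ord <- metaMDS(varespec, trace = FALSE)

# Fit the surface without drawing anything
fit <- ordisurf(ord, varechem$Al, plot = FALSE)

# Grid used for the contours: vectors x and y, matrix z
grid <- fit$grid

# Long table with one row per grid cell; rows of z follow x, columns follow y
xyz <- cbind(expand.grid(x = grid$x, y = grid$y), z = as.vector(grid$z))

# Drop cells outside the convex hull of the sites (z is NA there)
xyz <- xyz[!is.na(xyz$z), ]
head(xyz)

# To export for other software, for instance:
# write.csv(xyz, "ordisurf_surface.csv", row.names = FALSE)
```

The same x, y and z can be passed directly to contour(grid$x, grid$y, grid$z) to redraw the surface in R.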

Best wishes, Jari Oksanen

-- 
Jari Oksanen [EMAIL PROTECTED]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

